Research data infrastructures in the UK


The Open Research Data Task Force: “This report is intended to inform the work of the Open Research Data Task Force, which has been established with the aim of building on the principles set out in the Open Research Data Concordat (published in July 2016) to co-ordinate creation of a roadmap to develop the infrastructure for open research data across the UK. As an initial contribution to that work, the report provides an outline of the policy and service infrastructure in the UK as it stands in the first half of 2017, including some comparisons with other countries; and it points to some key areas and issues which require attention. It does not seek to identify possible courses of action, nor even to suggest priorities the Task Force might consider in creating its final report to be published in 2018. That will be the focus of work for the Task Force over the next few months.

Why is this important?

The digital revolution continues to bring fundamental changes to all aspects of research: how it is conducted, the findings that are produced, and how they are interrogated and transmitted not only within the research community but more widely. We are as yet still in the early stages of a transformation in which progress is patchy across the research community, but which has already posed significant challenges for research funders and institutions, as well as for researchers themselves. Research data is at the heart of those challenges: not simply the datasets that provide the core of the evidence analysed in scholarly publications, but all the data created and collected throughout the research process. Such data represents a potentially-valuable resource for people and organisations in the commercial, public and voluntary sectors, as well as for researchers. Access to such data, and more general moves towards open science, are also critically-important in ensuring that research is reproducible, and thus in sustaining public confidence in the work of the research community. But effective use of research data depends on an infrastructure – of hardware, software and services, but also of policies, organisations and individuals operating at various levels – that is as yet far from fully-formed. The exponential increases in volumes of data being generated by researchers create in themselves new demands for storage and computing power. But since the data is characterised more by heterogeneity than by uniformity, development of the infrastructure to manage it involves a complex set of requirements in preparing, collecting, selecting, analysing, processing, storing and preserving that data throughout its life cycle.

Over the past decade and more, there have been many initiatives on the part of research institutions, funders, and members of the research community at local, national and international levels to address some of these issues. Diversity is a key feature of the landscape, in terms of institutional types and locations, funding regimes, and nature and scope of partnerships, as well as differences between disciplines and subject areas. Hence decision-makers at various levels have fostered via their policies and strategies many community-organised developments, as well as their own initiatives and services. Significant progress has been achieved as a result, through the enthusiasm and commitment of key organisations and individuals. The less positive features have been a relative lack of harmonisation or consolidation, and there is an increasing awareness of patchiness in provision, with gaps, overlaps and inconsistencies. This is not surprising, since policies, strategies and services relating to research data necessarily affect all aspects of support for the diverse processes of research itself. Developing new policies and infrastructure for research data implies significant re-thinking of structures and regimes for supporting, fostering and promoting research itself. That in turn implies taking full account of widely-varying characteristics and needs of research of different kinds, while also keeping in clear view the benefits to be gained from better management of research data, and from greater openness in making data accessible for others to re-use for a wide range of different purposes….(More)”.

Using Collaboration to Harness Big Data for Social Good


Jake Porway at SSIR: “These days, it’s hard to get away from the hype around “big data.” We read articles about how Silicon Valley is using data to drive everything from website traffic to autonomous cars. We hear speakers at social sector conferences talk about how nonprofits can maximize their impact by leveraging new sources of digital information like social media data, open data, and satellite imagery.

Braving this world can be challenging, we know. Creating a data-driven organization can require big changes in culture and process. Some nonprofits, like Crisis Text Line and Watsi, started off boldly by building their own data science teams. But for the many other organizations wondering how to best use data to advance their mission, we’ve found that one ingredient works better than all the software and tech that you can throw at a problem: collaboration.

As a nonprofit dedicated to applying data science for social good, DataKind has run more than 200 projects in collaboration with other nonprofits worldwide by connecting them to teams of volunteer data scientists. What do the most successful ones have in common? Strong collaborations on three levels: with data science experts, within the organization itself, and across the nonprofit sector as a whole.

1. Collaborate with data science experts to define your project. As we often say, finding problems can be harder than finding solutions. ….

2. Collaborate across your organization to “build with, not for.” Our projects follow the principles of human-centered design and the philosophy pioneered in the civic tech world of “design with, not for.” ….

3. Collaborate across your sector to move the needle. Many organizations think about building data science solutions for unique challenges they face, such as predicting the best location for their next field office. However, most of us are fighting common causes shared by many other groups….

By focusing on building strong collaborations on these three levels—with data experts, across your organization, and across your sector—you’ll go from merely talking about big data to making big impact…(More)”.

A Road-Map To Transform The Secure And Accessible Use Of Data For High Impact Program Management, Policy Development, And Scholarship


Preface and Roadmap by Andrew Reamer and Julia Lane: “Throughout the United States, there is broadly emerging support to significantly enhance the nation’s capacity for evidence-based policymaking. This support is shared across the public and private sectors and all levels of geography. In recent years, efforts to enable evidence-based analysis have been authorized by the U.S. Congress and funded by state and local governments and philanthropic foundations.

The potential exists for substantial change. There has been dramatic growth in technological capabilities to organize, link, and analyze massive volumes of data from multiple, disparate sources. A major resource is administrative data, which offer both advantages and challenges in comparison to data gathered through the surveys that have been the basis for much policymaking to date. To date, however, capability-building efforts have been largely “artisanal” in nature. As a result, the ecosystem of evidence-based policymaking capacity-building efforts is thin and weakly connected.

Each attempt to add a node to the system faces multiple barriers that require substantial time, effort, and luck to address. Those barriers are systemic. Too much attention is paid to the interests of researchers, rather than to the engagement of data producers. Individual projects serve focused needs and operate at a relative distance from one another. A need thus exists for researchers, policymakers and funding agencies to move from these artisanal efforts to new, generalized solutions that will catalyze the creation of a robust, large-scale data infrastructure for evidence-based policymaking.

This infrastructure will have to be a “complex, adaptive ecosystem” that expands, regenerates, and replicates as needed while allowing customization and local control. To create a path for achieving this goal, the U.S. Partnership on Mobility from Poverty commissioned 12 papers and then hosted a day-long gathering (January 23, 2017) of over 60 experts to discuss findings and implications for action. Funded by the Gates Foundation, the papers and workshop panels were organized around three topics: privacy and confidentiality, data providers, and comprehensive strategies.

This issue of the Annals showcases those 12 papers, which jointly propose solutions for catalyzing the development of a data infrastructure for evidence-based policymaking.

This preface:

  • places current evidence-based policymaking efforts in historical context,
  • briefly describes the nature of multiple current efforts,
  • provides a conceptual framework for catalyzing the growth of any large institutional ecosystem,
  • identifies the major dimensions of the data infrastructure ecosystem,
  • describes key barriers to the expansion of that ecosystem, and
  • suggests a roadmap for catalyzing that expansion…(More)”.

(All 12 papers can be accessed here).

Computational Propaganda Worldwide


Executive Summary: “The Computational Propaganda Research Project at the Oxford Internet Institute, University of Oxford, has researched the use of social media for public opinion manipulation. The team involved 12 researchers across nine countries who, altogether, interviewed 65 experts, analyzed tens of millions of posts on seven different social media platforms during scores of elections, political crises, and national security incidents. Each case study analyzes qualitative, quantitative, and computational evidence collected between 2015 and 2017 from Brazil, Canada, China, Germany, Poland, Taiwan, Russia, Ukraine, and the United States.

Computational propaganda is the use of algorithms, automation, and human curation to purposefully distribute misleading information over social media networks. We find several distinct global trends in computational propaganda:

  • Social media are significant platforms for political engagement and crucial channels for disseminating news content. Social media platforms are the primary media over which young people develop their political identities.
    • In some countries this is because some companies, such as Facebook, are effectively monopoly platforms for public life.
    • In several democracies the majority of voters use social media to share political news and information, especially during elections.
    • In countries where only small proportions of the public have regular access to social media, such platforms are still fundamental infrastructure for political conversation among the journalists, civil society leaders, and political elites.
  • Social media are actively used as a tool for public opinion manipulation, though in diverse ways and on different topics.
    • In authoritarian countries, social media platforms are a primary means of social control. This is especially true during political and security crises.
    • In democracies, social media are actively used for computational propaganda either through broad efforts at opinion manipulation or targeted experiments on particular segments of the public.
  • In every country we found civil society groups trying, but struggling, to protect themselves and respond to active misinformation campaigns….(More)”.

6 Jurisdictions Tackling Homelessness with Technology


In Government Technology: “Public servants who work to reduce homelessness often have similar lists of challenges.

The most common of these are data sharing between groups involved with the homeless, the ability to track interactions between individuals and outreach providers, and a system that makes it easier to enter information about the population. Recently, we spoke with more than a half-dozen government officials who are involved with the homeless, and while obstacles and conditions varied among cities, all agreed that their work would be much easier with better tech-based solutions for the problems cited above.

These officials, however, were uniformly optimistic that such solutions were becoming more readily available — solutions with potential to solve the logistical hurdles that most often hamstring government, community and nonprofit efforts to help the homeless find jobs, residences and medical care. Some agencies, in fact, have already had success implementing tech as components in larger campaigns, while others are testing new platforms that may bolster organization and efficiency.

Below are a few brief vignettes that detail some — but far from all — ongoing governmental efforts to use tech to aid the homeless population and reduce homelessness.

1. BERGEN COUNTY, N.J.

One of the best examples of government using tech to address homelessness can be found in Bergen County, N.J., where officials recently certified their jurisdiction as first in the nation to end chronic homelessness. READ MORE

2. AURORA, COLO.

Aurora, Colo., in the Denver metropolitan area, uses the Homeless Management Information System required by the U.S. Department of Housing and Urban Development, but those involved with addressing homelessness there have also developed tech-based efforts that are specifically tailored to the area’s needs. READ MORE

4. NEW YORK CITY

New York City is rolling out an app called StreetSmart, which enables homelessness outreach workers in all five boroughs to communicate and log data seamlessly in real time while in the field. With StreetSmart, these workers will be able to enter that information into a single citywide database as they collect it. READ MORE (Full article)

Inspecting Algorithms for Bias


Matthias Spielkamp at MIT Technology Review: “It was a striking story. “Machine Bias,” the headline read, and the teaser proclaimed: “There’s software used across the country to predict future criminals. And it’s biased against blacks.”

ProPublica, a Pulitzer Prize–winning nonprofit news organization, had analyzed risk assessment software known as COMPAS. It is being used to forecast which criminals are most likely to reoffend. Guided by such forecasts, judges in courtrooms throughout the United States make decisions about the future of defendants and convicts, determining everything from bail amounts to sentences. When ProPublica compared COMPAS’s risk assessments for more than 10,000 people arrested in one Florida county with how often those people actually went on to reoffend, it discovered that the algorithm “correctly predicted recidivism for black and white defendants at roughly the same rate.”…

After ProPublica’s investigation, Northpointe, the company that developed COMPAS, disputed the story, arguing that the journalists misinterpreted the data. So did three criminal-justice researchers, including one from a justice-reform organization. Who’s right—the reporters or the researchers? Krishna Gummadi, head of the Networked Systems Research Group at the Max Planck Institute for Software Systems in Saarbrücken, Germany, offers a surprising answer: they all are.

Gummadi, who has extensively researched fairness in algorithms, says ProPublica’s and Northpointe’s results don’t contradict each other. They differ because they use different measures of fairness.

Imagine you are designing a system to predict which criminals will reoffend. One option is to optimize for “true positives,” meaning that you will identify as many people as possible who are at high risk of committing another crime. One problem with this approach is that it tends to increase the number of false positives: people who will be unjustly classified as likely reoffenders. The dial can be adjusted to deliver as few false positives as possible, but that tends to create more false negatives: likely reoffenders who slip through and get a more lenient treatment than warranted.

Raising the incidence of true positives or lowering the false positives are both ways to improve a statistical measure known as positive predictive value, or PPV. That is the percentage of all positives that are true….
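
To see how the two measures can diverge, here is a minimal Python sketch, illustrative only: the groups, predictions, and numbers below are invented and have no connection to COMPAS, ProPublica's data, or Northpointe's model.

```python
# Illustrative only: invented outcomes and risk labels for two hypothetical groups.
# 1 = reoffended (y_true) / flagged as high risk (y_pred); 0 otherwise.

def confusion_counts(y_true, y_pred):
    """Return (true positives, false positives, true negatives, false negatives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

def ppv(tp, fp):
    """Positive predictive value: the share of 'high risk' labels that were correct."""
    return tp / (tp + fp)

def false_positive_rate(fp, tn):
    """The share of people who did not reoffend but were still flagged as high risk."""
    return fp / (fp + tn)

groups = {
    "A": ([1, 1, 1, 0, 0, 0, 0, 0, 0, 0],   # actual outcomes
          [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]),  # risk labels
    "B": ([1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
          [1, 1, 1, 1, 0, 1, 1, 0, 0, 0]),
}

for name, (y_true, y_pred) in groups.items():
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    print(f"group {name}: PPV = {ppv(tp, fp):.2f}, "
          f"false positive rate = {false_positive_rate(fp, tn):.2f}")
```

In this toy example both groups end up with the same positive predictive value (about 0.67), yet group B's false positive rate is nearly three times group A's, which is the kind of divergence that lets two analyses of the same predictions reach opposite conclusions about fairness.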

But if we accept that algorithms might make life fairer if they are well designed, how can we know whether they are so designed?

Democratic societies should be working now to determine how much transparency they expect from ADM systems. Do we need new regulations of the software to ensure it can be properly inspected? Lawmakers, judges, and the public should have a say in which measures of fairness get prioritized by algorithms. But if the algorithms don’t actually reflect these value judgments, who will be held accountable?

These are the hard questions we need to answer if we expect to benefit from advances in algorithmic technology…(More)”.

Expanding Training on Data and Technology to Improve Communities


Kathryn Pettit at the National Neighborhood Indicators Partnership (NNIP): “Local government and nonprofit staff need data and technology skills to regularly monitor local conditions and design programs that achieve more effective outcomes. Tailored training is essential to help them gain the knowledge and confidence to leverage these indispensable tools. A recent survey of organizations that provide data and technology training documented current practices and how such training should be expanded. Four recommendations are provided to assist government agencies, elected leaders, nonprofit executives, and local funders in empowering workers with the necessary training to use data and technology to benefit their communities. Specifically, community stakeholders should collectively work to

  • expand the training available to government and nonprofit staff;
  • foster opportunities for sharing training materials and lessons;
  • identify allies who can enhance and support local training efforts;
  • and assess the local landscape of data and technology training.

Project Products

  • Brief: A summary of the current training landscape and key action steps for various sectors to ensure that local government and nonprofit staff have the data and technology skills needed for their civic missions.
  • Guide: A document for organizations interested in providing community data and technology training, including advice on how to assess local needs, develop training content, and fund these efforts.
  • Catalog: Example training descriptions and related materials collected from various cities for local adaptation.
  • Fact sheet: A summary of results from a survey on current training content and practices….(More)”

Information for the People: Tunisia Embraces Open Government, 2011–2016


Case study by Tristan Dreisback at Innovations for Successful Societies: “In January 2011, mass demonstrations in Tunisia ousted a regime that had tolerated little popular participation, opening the door to a new era of transparency. The protesters demanded an end to the secrecy that had protected elite privilege. Five months later, the president issued a decree that increased citizen access to government data and formed a steering committee to guide changes in information practices, building on small projects already in development. Advocates in the legislature and the public service joined with civil society leaders to support a strong access-to-information policy, to change the culture of public administration, and to secure the necessary financial and technical resources to publish large quantities of data online in user-friendly formats. Several government agencies launched their own open-data websites. External pressure, coupled with growing interest from civil society and legislators, helped keep transparency reforms on the cabinet office agenda despite frequent changes in top leadership. In 2016, Tunisia adopted one of the world’s strongest laws regarding access to information. Although members of the public did not put all of the resources to use immediately, the country moved much closer to having the data needed to improve access to services, enhance government performance, and support the evidence-based deliberation on which a healthy democracy depended…(More)”

Open Data Barometer 2016


Open Data Barometer: “Produced by the World Wide Web Foundation as a collaborative work of the Open Data for Development (OD4D) network and with the support of the Omidyar Network, the Open Data Barometer (ODB) aims to uncover the true prevalence and impact of open data initiatives around the world. It analyses global trends, and provides comparative data on countries and regions using an in-depth methodology that combines contextual data, technical assessments and secondary indicators.

Covering 115 jurisdictions in the fourth edition, the Barometer ranks governments on:

  • Readiness for open data initiatives.
  • Implementation of open data programmes.
  • Impact that open data is having on business, politics and civil society.

After three successful editions, the fourth marks another step towards becoming a global policymaking tool with a participatory and inclusive process and a strong regional focus. This year’s Barometer includes an assessment of government performance in fulfilling the Open Data Charter principles.

The Barometer is a truly global and collaborative effort, with input from more than 100 researchers and government representatives. It takes over six months and more than 10,000 hours of research work to compile. During this process, we address more than 20,000 questions and respond to more than 5,000 comments and suggestions.

The ODB global report is a summary of some of the most striking findings. The full data and methodology are available, and are intended to support secondary research and inform better decisions for the progression of open data policies and practices across the world…(More)”.
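
As a purely hypothetical sketch of how rankings on the three dimensions above (readiness, implementation, impact) could be rolled up into a single composite score, the snippet below uses invented weights and country scores; it is not the Barometer's published methodology, which combines contextual data, technical assessments and secondary indicators in its own documented way.

```python
# Hypothetical illustration of a composite index built from three sub-scores.
# Weights and country scores are invented; this is NOT the Open Data Barometer's
# real methodology or data.

WEIGHTS = {"readiness": 1 / 3, "implementation": 1 / 3, "impact": 1 / 3}

countries = {
    "Country A": {"readiness": 80, "implementation": 70, "impact": 55},
    "Country B": {"readiness": 60, "implementation": 85, "impact": 40},
    "Country C": {"readiness": 90, "implementation": 50, "impact": 60},
}

def composite(scores):
    """Weighted average of the sub-index scores, on a 0-100 scale."""
    return sum(WEIGHTS[dim] * scores[dim] for dim in WEIGHTS)

# Rank countries by composite score, highest first.
ranking = sorted(countries.items(), key=lambda kv: composite(kv[1]), reverse=True)
for rank, (name, scores) in enumerate(ranking, start=1):
    print(f"{rank}. {name}: {composite(scores):.1f}")
```

In the real Barometer each dimension is itself built from multiple indicators, so any serious reuse should start from the published data and methodology rather than a toy aggregation like this one.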

Improving public services through open government


Tim Hughes at Involve: “As citizens, we rely on public services being accessible and high quality – to give us an education, keep us healthy, make our communities a safe place to be, and ensure our basic needs are met. Public services are critical to our wellbeing and life chances, and to building stronger and more prosperous societies. Open government reforms have the potential to improve existing services, and unlock the ideas, knowledge and capacity for new solutions to societal challenges. The idea is simple – public services that are more responsive and accountable to us as citizens – and that benefit from our insights, ideas, energy and scrutiny – will work better for us.

This is why, in partnership with the Open Government Partnership, we have written a new guidance paper on how to develop robust and ambitious open public service reforms. The guidance is particularly targeted at governments and civil society developing open government commitments through the Open Government Partnership, but should be useful to anyone interested in how transparency, citizen participation and accountability can improve public services.

The paper sets out a framework of open public service reforms, as well as guidance, recommendations, resources and case studies. We will be updating the guide over time, so please do get in touch to let us know what you think…” Download the report.