Responsible Data Re-Use for COVID19

“The Governance Lab (The GovLab) at the NYU Tandon School of Engineering, with support from the Henry Luce Foundation, today released guidance to inform decision-making in the responsible re-use of data — re-purposing data for a use other than that for which it was originally intended — to address COVID-19. The findings, recommendations, and a new Responsible Data Re-Use framework stem from The Data Assembly initiative in New York City. An effort to solicit diverse, actionable public input on data re-use for crisis response in the United States, the Data Assembly brought together New York City-based stakeholders from government, the private sector, civic rights and advocacy organizations, and the general public to deliberate on innovative, though potentially risky, uses of data to inform crisis response in New York City. The findings and guidance from the initiative will inform policymaking and practice regarding data re-use in New York City, as well as free data literacy training offerings.

The Data Assembly’s Responsible Data Re-Use Framework provides clarity on a major element of the ongoing crisis. Though leaders throughout the world have relied on data to reduce uncertainty and make better decisions, expectations around the use and sharing of siloed data assets have remained unclear. This summer, along with the New York Public Library and Brooklyn Public Library, The GovLab co-hosted four months of remote deliberations with New York-based civil rights organizations, key data holders, and policymakers. Today’s release is a product of these discussions, to show how New Yorkers and their leaders think about the opportunities and risks involved in the data-driven response to COVID-19….(More)”

See: The Data Assembly Synthesis Report by Andrew Young, Stefaan G. Verhulst, Nadiya Safonova, and Andrew J. Zahuranec

Leveraging Open Data with a National Open Computing Strategy

Policy Brief by Lara Mangravite and John Wilbanks: “Open data mandates and investments in public data resources, such as the Human Genome Project or the U.S. National Oceanic and Atmospheric Administration Data Discovery Portal, have provided essential data sets at a scale not possible without government support. By responsibly sharing data for wide reuse, federal policy can spur innovation inside the academy and in citizen science communities. These approaches are enabled by private-sector advances in cloud computing services and the government has benefited from innovation in this domain. However, the use of commercial products to manage the storage of and access to public data resources poses several challenges.

First, too many cloud computing systems fail to properly secure data against breaches, improperly share copies of data with other vendors, or use data to add to their own secretive and proprietary models. As a result, the public does not trust technology companies to responsibly manage public data—particularly private data of individual citizens. These fears are exacerbated by the market power of the major cloud computing providers, which may limit the ability of individuals or institutions to negotiate appropriate terms. This impacts the willingness of U.S. citizens to have their personal information included within these databases.

Second, open data solutions are springing up across multiple sectors without coordination. The federal government is funding a series of independent programs that are working to solve the same problem, leading to a costly duplication of effort across programs.

Third and most importantly, the high costs of data storage, transfer, and analysis preclude many academics, scientists, and researchers from taking advantage of governmental open data resources. Cloud computing has radically lowered the costs of high-performance computing, but it is still not free. The cost of building the wrong model at the wrong time can quickly run into tens of thousands of dollars.

Scarce resources mean that many academic data scientists are unable or unwilling to spend their limited funds to reuse data in exploratory analyses outside their narrow projects. And citizen scientists must use personal funds, which are especially scarce in communities traditionally underrepresented in research. The vast majority of public data made available through existing open science policy is therefore left unused, either as reference material or as “foreground” for new hypotheses and discoveries….The Solution: Public Cloud Computing…(More)”.

The Potential Role Of Open Data In Mitigating The COVID-19 Pandemic: Challenges And Opportunities

Essay by Sunyoung Pyo, Luigi Reggi and Erika G. Martin: “…There is one tool for the COVID-19 response that was not as robust in past pandemics: open data. For about 15 years, a “quiet open data revolution” has led to the widespread availability of governmental data that are publicly accessible, available in multiple formats, free of charge, and with unlimited use and distribution rights. The underlying logic of open data’s value is that diverse users including researchers, practitioners, journalists, application developers, entrepreneurs, and other stakeholders will synthesize the data in novel ways to develop new insights and applications. Specific products have included providing the public with information about their providers and health care facilities, spotlighting issues such as high variation in the cost of medical procedures between facilities, and integrating food safety inspection reports into Yelp to help the public make informed decisions about where to dine. It is believed that these activities will in turn empower health care consumers and improve population health.

Here, we describe several use cases whereby open data have already been used globally in the COVID-19 response. We highlight major challenges to using these data and provide recommendations on how to foster a robust open data ecosystem to ensure that open data can be leveraged in both this pandemic and future public health emergencies…(More)” See also Repository of Open Data for Covid19 (OECD/TheGovLab)

Open data in public libraries: Gauging activities and supporting ambitions

Paper by Kaitlin Fender Throgmorton, Bree Norlander and Carole L. Palmer: “As the open data movement grows, public libraries must assess if and how to invest resources in this new service area. This paper reports on a recent survey on open data in public libraries across Washington state, conducted by the Open Data Literacy project (ODL) in collaboration with the Washington State Library. Results document interests and activity in open data across small, medium, and large libraries in relation to traditional library services and priorities. Libraries are particularly active in open data through reference services and are beginning to release their own library data to the public. While capacity and resource challenges hinder progress for some, many libraries, large and small, are making progress on new initiatives, including strategic collaborations with local government agencies. Overall, the level and range of activity suggest that Washington state public libraries of all sizes recognize the value of open data for their communities, with a groundswell of libraries moving beyond ambition to action as they develop new services through evolution and innovation….(More)”.

Third Wave of Open Data

Paper (and site) by Stefaan G. Verhulst, Andrew Young, Andrew J. Zahuranec, Susan Ariel Aaronson, Ania Calderon, and Matt Gee on “How To Accelerate the Re-Use of Data for Public Interest Purposes While Ensuring Data Rights and Community Flourishing”: “The paper begins with a description of earlier waves of open data. Emerging from freedom of information laws adopted over the last half century, the First Wave of Open Data brought about newfound transparency, albeit one only available on request to an audience largely composed of journalists, lawyers, and activists. 

The Second Wave of Open Data, seeking to go beyond access to public records and inspired by the open source movement, called upon national governments to make their data open by default. Yet, this approach too had its limitations, leaving many data silos at the subnational level and in the private sector untouched.

The Third Wave of Open Data seeks to build on earlier successes and take into account lessons learned to help open data realize its transformative potential. Incorporating insights from various data experts, the paper describes the emergence of a Third Wave driven by the following goals:

  1. Publishing with Purpose by matching the supply of data with the demand for it, providing assets that match public interests;
  2. Fostering Partnerships and Data Collaboration by forging relationships with community-based organizations, NGOs, small businesses, local governments, and others who understand how data can be translated into meaningful real-world action;
  3. Advancing Open Data at the Subnational Level by providing resources to cities, municipalities, states, and provinces to address the lack of subnational information in many regions; and
  4. Prioritizing Data Responsibility and Data Rights by understanding the risks of using (and not using) data to promote and preserve the public’s general welfare.

Riding the Wave

Achieving these goals will not be an easy task and will require investments and interventions across the data ecosystem. The paper highlights eight actions that decision and policy makers can take to foster more equitable, impactful benefits… (More) (PDF)”

Data to Go: The Value of Data Portability as a Means to Data Liquidity

Juliet McMurren and Stefaan G. Verhulst at Data & Policy: “If data is the “new oil,” why isn’t it flowing? For almost two decades, data management in fields such as government, healthcare, finance, and research has aspired to achieve a state of data liquidity, in which data can be reused where and when it is needed. For the most part, however, this aspiration remains unrealized. The majority of the world’s data continues to stagnate in silos, controlled by data holders and inaccessible to both its subjects and others who could use it to create or improve services, for research, or to solve pressing public problems.

Efforts to increase liquidity have focused on forms of voluntary institutional data sharing such as data pools or other forms of data collaboratives. Although useful, these arrangements can only advance liquidity so far. Because they vest responsibility and control over liquidity in the hands of data holders, their success depends on data holders’ willingness and ability to provide access to their data for the greater good. While that willingness exists in some fields, particularly medical research, a willingness to share data is much less likely where data holders are commercial competitors and data is the source of their competitive advantage. And even where willingness exists, the ability of data holders to share data safely, securely, and interoperably may not. Without a common set of secure, standardized, and interoperable tools and practices, the best that such bottom-up collaboration can achieve is a disconnected patchwork of initiatives, rather than the data liquidity proponents are seeking.

Data portability is one potential solution to this problem. As enacted in the EU General Data Protection Regulation (2018) and the California Consumer Privacy Act (2018), the right to data portability asserts that individuals have a right to obtain, copy, and reuse their personal data and transfer it between platforms or services. In so doing, it shifts control over data liquidity to data subjects, obliging data holders to release data whether or not it is in their commercial interests to do so. Proponents of data portability argue that, once data is unlocked and free to move between platforms, it can be combined and reused in novel ways and in contexts well beyond those in which it was originally collected, all while enabling greater individual control.

To date, however, arguments for the benefits of the right to data portability have typically failed to connect this rights-based approach with the larger goal of data liquidity and how portability might advance it. This failure to connect these principles and to demonstrate their collective benefits to data subjects, data holders, and society has real-world consequences. Without a clear view of what can be achieved, policymakers are unlikely to develop interventions and incentives to advance liquidity and portability, individuals will not exercise their rights to data portability, and industry will not experiment with use cases and develop the tools and standards needed to make portability and liquidity a reality.

Toward these ends, we have been exploring the current literature on data portability and liquidity, searching for lessons and insights into the benefits that can be unlocked when data liquidity is enabled through the right to data portability. Below we identify some of the greatest potential benefits for society, individuals, and data-holding organizations. These benefits are sometimes in conflict with one another, making the field a contentious one that demands further research on the trade-offs and empirical evidence of impact. In the final section, we also discuss some barriers and challenges to achieving greater data liquidity….(More)”.

Open data governance: civic hacking movement, topics and opinions in digital space

Paper by Mara Maretti, Vanessa Russo & Emiliano del Gobbo: “The expression ‘open data’ relates to a system of informative and freely accessible databases that public administrations make generally available online in order to develop an informative network between institutions, enterprises and citizens. On this topic, using the semantic network analysis method, the research aims to investigate the communication structure and the governance of open data in the Twitter conversational environment. In particular, the research questions are: (1) Who are the main actors in the Italian open data infrastructure? (2) What are the main conversation topics online? (3) What are the pros and cons of the development and use (reuse) of open data in Italy? To answer these questions, we went through three research phases: (1) analysing the communication network, we found who are the main influencers; (2) once we found who were the main actors, we analysed the online content in the Twittersphere to detect the semantic areas; (3) then, through an online focus group with the main open data influencers, we explored the characteristics of Italian open data governance. Through the research, it has been shown that: (1) there is an Italian open data governance strategy; (2) the Italian civic hacker community plays an important role as an influencer; but (3) there are weaknesses in governance and in practical reuse….(More)”.

Situating Open Data: Global Trends in Local Contexts

Open Access Book edited by Danny Lämmerhirt, Ana Brandusescu, Natalia Domagala & Patrick Enaholo: “Open data and its effects on society are always woven into infrastructural legacies, social relations, and the political economy. This raises questions about how our understanding and engagement with open data shifts when we focus on its situated use. 

To shed a light on these questions, Situating Open Data provides several empirical accounts of open data practices, the local implementation of global initiatives, and the development of new open data ecosystems. Drawing on case studies in different countries and contexts, the chapters demonstrate the practices and actors involved in open government data initiatives unfolding within different socio-political settings. 

The book proposes three recommendations for researchers, policy-makers and practitioners. First, beyond upskilling through ‘data literacy’ programmes, open data initiatives should be specified through the kinds of data practices and effects they generate. Second, global visions of open data implementation require more studies of the resonances and tensions created in localised initiatives. And third, research into open data ecosystems requires more attention to the histories and legacies of information infrastructures and how these shape who benefits from open data flows. 

As such, this volume departs from the framing of data as a resource to be deployed. Instead, it proposes a prism of different data practices in different contexts through which to study the social relations, capacities, infrastructural histories and power structures affecting open data initiatives. It is hoped that the contributions collected in Situating Open Data will spark critical reflection about the way open data is locally practiced and implemented. The contributions should be of interest to open data researchers, advocates, and those in or advising government administrations designing and rolling out effective open data initiatives….(More)”.

Improving data access democratizes and diversifies science

Research article by Abhishek Nagaraj, Esther Shears, and Mathijs de Vaan: “Data access is critical to empirical research, but past work on open access is largely restricted to the life sciences and has not directly analyzed the impact of data access restrictions. We analyze the impact of improved data access on the quantity, quality, and diversity of scientific research. We focus on the effects of a shift in the accessibility of satellite imagery data from Landsat, a NASA program that provides valuable remote-sensing data. Our results suggest that improved access to scientific data can lead to a large increase in the quantity and quality of scientific research. Further, better data access disproportionately enables the entry of scientists with fewer resources, and it promotes diversity of scientific research….(More)”

Smart Rural: The Open Data Gap

Paper by Johanna Walker et al: “The smart city paradigm has underpinned a great deal of the use and production of open data for the benefit of policymakers and citizens. This paper posits that this further enhances the existing urban-rural divide. It investigates the availability and use of rural open data along two parameters: pertaining to rural populations, and to key parts of the rural economy (agriculture, fisheries and forestry). It explores the relationship between key statistics of national / rural economies and rural open data; and the use and users of rural open data where it is available. It finds that although countries with more rural populations are not necessarily earlier in their Open Data Maturity journey, there is still a lack of institutionalisation of open data in rural areas; that there is an apparent gap between the importance of agriculture to a country’s GDP and the amount of agricultural data published openly; and lastly, that the smart city paradigm cannot simply be transferred to the rural setting. It suggests instead the adoption of the emerging ‘smart region’ paradigm as that most likely to support the specific data needs of rural areas….(More)”.