Ann Marimow in the Washington Post: “Leaders of the federal judiciary are working to block bipartisan legislation designed to create a national database of court records that would provide free access to case documents.
Backers of the bill, who are pressing for a House vote in the coming days, envision a streamlined, user-friendly system that would allow citizens to search for court documents and dockets without having to pay. Under the current system, users pay 10 cents per page to view the public records through the service known as PACER, an acronym for Public Access to Court Electronic Records.
“Everyone wants to have a system that is technologically first class and free,” said Rep. Hank Johnson (D-Ga.), a sponsor of the legislation with Rep. Douglas A. Collins (R-Ga.).
A modern system, he said, “is more efficient and brings more transparency into the equation and is easier on the pocketbooks of regular people.”…(More)”.
Report by Open Data Watch: “The 2020/21 Open Data Inventory (ODIN) is the fifth edition of the index compiled by Open Data Watch. ODIN 2020/21 provides an assessment of the coverage and openness of official statistics in 187 countries, an increase of 9 countries compared to ODIN 2018/19. The year 2020 was a challenging one for the world as countries grappled with the COVID-19 pandemic. Nonetheless, and despite the pandemic’s negative impact on the capacity of statistics producers, 2020 saw great progress in open data.
However, the news on data this year isn’t all good. Countries in every region still struggle to publish gender data and many of the same countries are unable to provide sex-disaggregated data on the COVID-19 pandemic. In addition, low-income countries continue to need more support with capacity building and financial resources to overcome the barriers to publishing open data.
ODIN is an evaluation of the coverage and openness of data provided on the websites maintained by national statistical offices (NSOs) and any official government website that is accessible from the NSO site. The overall ODIN score is an indicator of how complete and open an NSO’s data offerings are. It comprises both a coverage subscore and an openness subscore. Openness is measured against standards set by the Open Definition and Open Data Charter. ODIN 2020/21 includes 22 data categories, grouped under social, economic and financial, and environmental statistics. ODIN scores are represented on a range between 0 and 100, with 100 representing the best performance on open data… The full report will be released in February 2021….(More)”.
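To make the scoring model concrete, here is a minimal sketch of how a composite index of this shape could be assembled. The element values, equal weighting, and two-category example are assumptions for illustration only; ODIN's published methodology defines the actual elements and aggregation rules.

```python
# Hypothetical sketch of a composite openness index in the spirit of ODIN.
# Element values, equal weighting, and category names are illustrative
# assumptions, not ODIN's actual methodology.

from statistics import mean

def category_score(coverage_elements, openness_elements):
    """Score one data category on a 0-100 scale from element ratings in [0, 1]."""
    coverage = mean(coverage_elements) * 100
    openness = mean(openness_elements) * 100
    return {"coverage": coverage, "openness": openness,
            "overall": (coverage + openness) / 2}

def country_score(categories):
    """Aggregate category scores into country-level subscores and an overall score."""
    return {
        "coverage": round(mean(c["coverage"] for c in categories), 1),
        "openness": round(mean(c["openness"] for c in categories), 1),
        "overall": round(mean(c["overall"] for c in categories), 1),
    }

# Two of a country's data categories (ODIN itself assesses 22):
categories = [
    category_score([1, 1, 0.5], [1, 0, 1, 0.5]),  # e.g. population statistics
    category_score([0.5, 0, 0], [1, 0, 0, 0]),    # e.g. energy statistics
]
print(country_score(categories))  # subscores and overall on the 0-100 range
```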
Paper by Nikolaos Yiannakoulias, Catherine E. Slavik, Shelby L. Sturrock, and J. Connor Darlington: “Governments around the world have made data on COVID-19 testing, case numbers, hospitalizations and deaths openly available, and a breadth of researchers, media sources and data scientists have curated and used these data to inform the public about the state of the coronavirus pandemic. However, it is unclear if all data being released convey anything useful beyond the reputational benefits of governments wishing to appear open and transparent. In this analysis we use Ontario, Canada as a case study to assess the value of publicly available SARS-CoV-2 positive case numbers. Using a combination of real data and simulations, we find that daily publicly available test results probably contain considerable error about individual risk (measured as the proportion of tests that are positive, population-based incidence, and prevalence of active cases) and that short-term variations are very unlikely to provide useful information for any plausible decision making on the part of individual citizens. Open government data can increase the transparency and accountability of government; however, it is essential that all publication, use and re-use of these data highlight their weaknesses to ensure that the public is properly informed about the uncertainty associated with SARS-CoV-2 information….(More)”
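The scale of the problem is easy to see in a toy simulation. The sketch below uses invented numbers (not the paper's code or Ontario's data): even with a constant underlying positivity rate, sampling variation alone moves the daily observed value, so small short-term swings carry little information about changing risk.

```python
# Toy simulation with invented numbers: daily observed test positivity
# fluctuates through sampling noise alone, even though the true underlying
# rate never changes.

import random

random.seed(42)
TRUE_POSITIVITY = 0.03   # assumed constant underlying rate
DAILY_TESTS = 20_000     # assumed daily testing volume

for day in range(1, 8):
    positives = sum(random.random() < TRUE_POSITIVITY for _ in range(DAILY_TESTS))
    observed = positives / DAILY_TESTS
    print(f"day {day}: observed positivity {observed:.2%} (true rate 3.00%)")
```

Real reporting adds non-sampling error on top of this noise (who seeks a test, lab backlogs, reporting lags), which further weakens the signal available to individual citizens.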
” The Governance Lab (The GovLab) at the NYU Tandon School of Engineering, with support from the Henry Luce Foundation, today released guidance to inform decision-making in the responsible re-use of data — re-purposing data for a use other than that for which it was originally intended — to address COVID-19. The findings, recommendations, and a new Responsible Data Re-Use framework stem from The Data Assembly initiative in New York City. An effort to solicit diverse, actionable public input on data re-use for crisis response in the United States, the Data Assembly brought together New York City-based stakeholders from government, the private sector, civic rights and advocacy organizations, and the general public to deliberate on innovative, though potentially risky, uses of data to inform crisis response in New York City. The findings and guidance from the initiative will inform policymaking and practice regarding data re-use in New York City, as well as free data literacy training offerings.
The Data Assembly’s Responsible Data Re-Use Framework provides clarity on a major element of the ongoing crisis. Though leaders throughout the world have relied on data to reduce uncertainty and make better decisions, expectations around the use and sharing of siloed data assets have remained unclear. This summer, along with the New York Public Library and Brooklyn Public Library, The GovLab co-hosted four months of remote deliberations with New York-based civil rights organizations, key data holders, and policymakers. Today’s release is a product of these discussions, showing how New Yorkers and their leaders think about the opportunities and risks involved in the data-driven response to COVID-19….(More)”
Policy Brief by Lara Mangravite and John Wilbanks: “Open data mandates and investments in public data resources, such as the Human Genome Project or the U.S. National Oceanic and Atmospheric Administration Data Discovery Portal, have provided essential data sets at a scale not possible without government support. By responsibly sharing data for wide reuse, federal policy can spur innovation inside the academy and in citizen science communities. These approaches are enabled by private-sector advances in cloud computing services and the government has benefited from innovation in this domain. However, the use of commercial products to manage the storage of and access to public data resources poses several challenges.
First, too many cloud computing systems fail to properly secure data against breaches, improperly share copies of data with other vendors, or use data to add to their own secretive and proprietary models. As a result, the public does not trust technology companies to responsibly manage public data—particularly private data of individual citizens. These fears are exacerbated by the market power of the major cloud computing providers, which may limit the ability of individuals or institutions to negotiate appropriate terms. This impacts the willingness of U.S. citizens to have their personal information included within these databases.
Second, open data solutions are springing up across multiple sectors without coordination. The federal government is funding a series of independent programs that are working to solve the same problem, leading to a costly duplication of effort across programs.
Third, and most importantly, the high costs of data storage, transfer, and analysis preclude many academics, scientists, and researchers from taking advantage of governmental open data resources. Cloud computing has radically lowered the costs of high-performance computing, but it is still not free. The cost of building the wrong model at the wrong time can quickly run into tens of thousands of dollars.
Scarce resources mean that many academic data scientists are unable or unwilling to spend their limited funds to reuse data in exploratory analyses outside their narrow projects. And citizen scientists must use personal funds, which are especially scarce in communities traditionally underrepresented in research. The vast majority of public data made available through existing open science policy is therefore left unused, either as reference material or as “foreground” for new hypotheses and discoveries….The Solution: Public Cloud Computing…(More)”.
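How quickly such costs mount is easy to estimate. The back-of-the-envelope sketch below is illustrative only; every unit price in it is an assumption, not any provider's actual rate.

```python
# Back-of-the-envelope estimate of one exploratory analysis on a public
# dataset. All unit prices are assumptions for illustration, not quotes
# from any cloud provider.

STORAGE_PER_TB_MONTH = 20.0   # assumed $/TB-month of object storage
EGRESS_PER_TB = 90.0          # assumed $/TB transferred out of the cloud
COMPUTE_PER_NODE_HOUR = 3.0   # assumed $/hour for a large VM

def exploration_cost(dataset_tb, months_stored, tb_downloaded,
                     nodes, hours_per_run, runs):
    storage = dataset_tb * STORAGE_PER_TB_MONTH * months_stored
    egress = tb_downloaded * EGRESS_PER_TB
    compute = nodes * hours_per_run * runs * COMPUTE_PER_NODE_HOUR
    return storage + egress + compute

# A modest genomics-scale exploration: 50 TB kept for 3 months, 10 TB
# downloaded, and 15 runs on 20 nodes for 12 hours each.
print(f"${exploration_cost(50, 3, 10, 20, 12, 15):,.0f}")  # -> $14,700
```

At that scale, a few wrong turns in model-building put the bill squarely in the tens of thousands of dollars the brief describes.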
Essay by Sunyoung Pyo, Luigi Reggi and Erika G. Martin: “…There is one tool for the COVID-19 response that was not as robust in past pandemics: open data. For about 15 years, a “quiet open data revolution” has led to the widespread availability of governmental data that are publicly accessible, available in multiple formats, free of charge, and with unlimited use and distribution rights. The underlying logic of open data’s value is that diverse users including researchers, practitioners, journalists, application developers, entrepreneurs, and other stakeholders will synthesize the data in novel ways to develop new insights and applications. Specific products have included providing the public with information about their providers and health care facilities, spotlighting issues such as high variation in the cost of medical procedures between facilities, and integrating food safety inspection reports into Yelp to help the public make informed decisions about where to dine. It is believed that these activities will in turn empower health care consumers and improve population health.
Here, we describe several use cases whereby open data have already been used globally in the COVID-19 response. We highlight major challenges to using these data and provide recommendations on how to foster a robust open data ecosystem to ensure that open data can be leveraged in both this pandemic and future public health emergencies…(More)” See also Repository of Open Data for Covid19 (OECD/TheGovLab)
Paper by Kaitlin Fender Throgmorton, Bree Norlander and Carole L. Palmer: “As the open data movement grows, public libraries must assess whether and how to invest resources in this new service area. This paper reports on a recent survey on open data in public libraries across Washington state, conducted by the Open Data Literacy project (ODL) in collaboration with the Washington State Library. Results document interests and activity in open data across small, medium, and large libraries in relation to traditional library services and priorities. Libraries are particularly active in open data through reference services and are beginning to release their own library data to the public. While capacity and resource challenges hinder progress for some, many libraries, large and small, are making progress on new initiatives, including strategic collaborations with local government agencies. Overall, the level and range of activity suggest that Washington state public libraries of all sizes recognize the value of open data for their communities, with a groundswell of libraries moving beyond ambition to action as they develop new services through evolution and innovation….(More)”.
Paper (and site) by Stefaan G. Verhulst, Andrew Young, Andrew J. Zahuranec, Susan Ariel Aaronson, Ania Calderon, and Matt Gee on “How To Accelerate the Re-Use of Data for Public Interest Purposes While Ensuring Data Rights and Community Flourishing”: “The paper begins with a description of earlier waves of open data. Emerging from freedom of information laws adopted over the last half century, the First Wave of Open Data brought about newfound transparency, albeit one only available on request to an audience largely composed of journalists, lawyers, and activists.
The Second Wave of Open Data, seeking to go beyond access to public records and inspired by the open source movement, called upon national governments to make their data open by default. Yet, this approach too had its limitations, leaving many data silos at the subnational level and in the private sector untouched.
The Third Wave of Open Data seeks to build on earlier successes and take into account lessons learned to help open data realize its transformative potential. Incorporating insights from various data experts, the paper describes the emergence of a Third Wave driven by the following goals:
Publishing with Purpose by matching the supply of data with the demand for it, providing assets that match public interests;
Fostering Partnerships and Data Collaboration by forging relationships with community-based organizations, NGOs, small businesses, local governments, and others who understand how data can be translated into meaningful real-world action;
Advancing Open Data at the Subnational Level by providing resources to cities, municipalities, states, and provinces to address the lack of subnational information in many regions;
Prioritizing Data Responsibility and Data Rights by understanding the risks of using (and not using) data to promote and preserve the public’s general welfare.
Riding the Wave
Achieving these goals will not be an easy task and will require investments and interventions across the data ecosystem. The paper highlights eight actions that decision and policy makers can take to foster more equitable, impactful benefits…(More) (PDF)”
Juliet McMurren and Stefaan G. Verhulst at Data & Policy: “If data is the “new oil,” why isn’t it flowing? For almost two decades, data management in fields such as government, healthcare, finance, and research has aspired to achieve a state of data liquidity, in which data can be reused where and when it is needed. For the most part, however, this aspiration remains unrealized. The majority of the world’s data continues to stagnate in silos, controlled by data holders and inaccessible to both its subjects and others who could use it to create or improve services, for research, or to solve pressing public problems.
Efforts to increase liquidity have focused on forms of voluntary institutional data sharing such as data pools or other forms of data collaboratives. Although useful, these arrangements can only advance liquidity so far. Because they vest responsibility and control over liquidity in the hands of data holders, their success depends on data holders’ willingness and ability to provide access to their data for the greater good. While that willingness exists in some fields, particularly medical research, a willingness to share data is much less likely where data holders are commercial competitors and data is the source of their competitive advantage. And even where willingness exists, the ability of data holders to share data safely, securely, and interoperably may not. Without a common set of secure, standardized, and interoperable tools and practices, the best that such bottom-up collaboration can achieve is a disconnected patchwork of initiatives, rather than the data liquidity proponents are seeking.
Data portability is one potential solution to this problem. As enacted in the EU General Data Protection Regulation (2018) and the California Consumer Privacy Act (2018), the right to data portability asserts that individuals have a right to obtain, copy, and reuse their personal data and transfer it between platforms or services. In so doing, it shifts control over data liquidity to data subjects, obliging data holders to release data whether or not it is in their commercial interests to do so. Proponents of data portability argue that, once data is unlocked and free to move between platforms, it can be combined and reused in novel ways and in contexts well beyond those in which it was originally collected, all while enabling greater individual control.
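In practice, exercising that right means a service handing the data subject their records in a structured, machine-readable format they can carry elsewhere. The sketch below shows one hypothetical shape such an export could take; the schema, field names, and format are invented for illustration and are not prescribed by the GDPR or the CCPA.

```python
# Hypothetical data-portability export: bundle a user's records into a
# structured, machine-readable document they can take to another service.
# The schema and field names are invented for illustration.

import json
from dataclasses import dataclass, asdict
from datetime import date
from typing import List

@dataclass
class ActivityRecord:
    timestamp: str   # ISO 8601 timestamp of the activity
    kind: str        # e.g. "purchase", "search", "post"
    payload: dict    # the activity's content

def export_user_data(user_id: str, records: List[ActivityRecord]) -> str:
    """Return a portable JSON bundle of one user's records."""
    bundle = {
        "subject": user_id,
        "exported_on": date.today().isoformat(),
        "format_version": "1.0",
        "records": [asdict(r) for r in records],
    }
    return json.dumps(bundle, indent=2)

print(export_user_data("user-123", [
    ActivityRecord("2020-11-01T09:30:00Z", "purchase", {"item": "book"}),
]))
```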
To date, however, arguments for the benefits of the right to data portability have typically failed to connect this rights-based approach with the larger goal of data liquidity and how portability might advance it. This failure to connect these principles and to demonstrate their collective benefits to data subjects, data holders, and society has real-world consequences. Without a clear view of what can be achieved, policymakers are unlikely to develop interventions and incentives to advance liquidity and portability, individuals will not exercise their rights to data portability, and industry will not experiment with use cases and develop the tools and standards needed to make portability and liquidity a reality.
Toward these ends, we have been exploring the current literature on data portability and liquidity, searching for lessons and insights into the benefits that can be unlocked when data liquidity is enabled through the right to data portability. Below we identify some of the greatest potential benefits for society, individuals, and data-holding organizations. These benefits are sometimes in conflict with one another, making the field a contentious one that demands further research on the trade-offs and empirical evidence of impact. In the final section, we also discuss some barriers and challenges to achieving greater data liquidity….(More)”.
Paper by Mara Maretti, Vanessa Russo & Emiliano del Gobbo: “The expression ‘open data’ relates to a system of informative and freely accessible databases that public administrations make generally available online in order to develop an informative network between institutions, enterprises and citizens. On this topic, using the semantic network analysis method, the research aims to investigate the communication structure and the governance of open data in the Twitter conversational environment. In particular, the research questions are: (1) Who are the main actors in the Italian open data infrastructure? (2) What are the main conversation topics online? (3) What are the pros and cons of the development and use (reuse) of open data in Italy? To answer these questions, we went through three research phases: (1) analysing the communication network, we identified the main influencers; (2) having identified the main actors, we analysed the online content in the Twittersphere to detect the semantic areas; (3) finally, through an online focus group with the main open data influencers, we explored the characteristics of Italian open data governance. The research shows that: (1) there is an Italian open data governance strategy; (2) the Italian civic hacker community plays an important role as an influencer; but (3) there are weaknesses in governance and in practical reuse….(More)”.
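As a rough illustration of phase (1), a few lines of network analysis suffice to surface the most retweeted or mentioned accounts in a conversation network. The edge list below is invented for demonstration and does not reproduce the paper's data or tooling.

```python
# Sketch of phase (1): ranking candidate influencers in a directed
# conversation network. The edge list is invented for demonstration.

import networkx as nx

# (author, account they retweeted or mentioned) in hypothetical #opendata tweets
edges = [
    ("user_a", "civic_hackers"), ("user_b", "civic_hackers"),
    ("user_c", "civic_hackers"), ("user_a", "stats_office"),
    ("user_d", "stats_office"), ("user_b", "user_c"),
]
G = nx.DiGraph(edges)

# Accounts most often retweeted or mentioned act as conversation hubs,
# i.e. the candidate influencers for the follow-up content analysis.
ranking = sorted(nx.in_degree_centrality(G).items(),
                 key=lambda kv: kv[1], reverse=True)
for account, score in ranking[:3]:
    print(f"{account}: {score:.2f}")
```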