Third Wave of Open Data


Paper (and site) by Stefaan G. Verhulst, Andrew Young, Andrew J. Zahuranec, Susan Ariel Aaronson, Ania Calderon, and Matt Gee on “How To Accelerate the Re-Use of Data for Public Interest Purposes While Ensuring Data Rights and Community Flourishing”: “The paper begins with a description of earlier waves of open data. Emerging from freedom of information laws adopted over the last half century, the First Wave of Open Data brought about newfound transparency, albeit one only available on request to an audience largely composed of journalists, lawyers, and activists. 

The Second Wave of Open Data, seeking to go beyond access to public records and inspired by the open source movement, called upon national governments to make their data open by default. Yet, this approach too had its limitations, leaving many data silos at the subnational level and in the private sector untouched.

The Third Wave of Open Data seeks to build on earlier successes and take into account lessons learned to help open data realize its transformative potential. Incorporating insights from various data experts, the paper describes the emergence of a Third Wave driven by the following goals:

  1. Publishing with Purpose by matching the supply of data with the demand for it, providing assets that match public interests;
  2. Fostering Partnerships and Data Collaboration by forging relationships with community-based organizations, NGOs, small businesses, local governments, and others who understand how data can be translated into meaningful real-world action;
  3. Advancing Open Data at the Subnational Level by providing resources to cities, municipalities, states, and provinces to address the lack of subnational information in many regions; and
  4. Prioritizing Data Responsibility and Data Rights by understanding the risks of using (and not using) data to promote and preserve the public’s general welfare.

Riding the Wave

Achieving these goals will not be an easy task and will require investments and interventions across the data ecosystem. The paper highlights eight actions that decision-makers and policymakers can take to foster more equitable, impactful benefits… (More) (PDF)”

Data to Go: The Value of Data Portability as a Means to Data Liquidity


Juliet McMurren and Stefaan G. Verhulst at Data & Policy: “If data is the “new oil,” why isn’t it flowing? For almost two decades, data management in fields such as government, healthcare, finance, and research has aspired to achieve a state of data liquidity, in which data can be reused where and when it is needed. For the most part, however, this aspiration remains unrealized. The majority of the world’s data continues to stagnate in silos, controlled by data holders and inaccessible to both its subjects and others who could use it to create or improve services, for research, or to solve pressing public problems.

Efforts to increase liquidity have focused on forms of voluntary institutional data sharing such as data pools or other forms of data collaboratives. Although useful, these arrangements can only advance liquidity so far. Because they vest responsibility and control over liquidity in the hands of data holders, their success depends on data holders’ willingness and ability to provide access to their data for the greater good. While that willingness exists in some fields, particularly medical research, a willingness to share data is much less likely where data holders are commercial competitors and data is the source of their competitive advantage. And even where willingness exists, the ability of data holders to share data safely, securely, and interoperably may not. Without a common set of secure, standardized, and interoperable tools and practices, the best that such bottom-up collaboration can achieve is a disconnected patchwork of initiatives, rather than the data liquidity proponents are seeking.


Data portability is one potential solution to this problem. As enacted in the EU General Data Protection Regulation (2018) and the California Consumer Privacy Act (2018), the right to data portability asserts that individuals have a right to obtain, copy, and reuse their personal data and transfer it between platforms or services. In so doing, it shifts control over data liquidity to data subjects, obliging data holders to release data whether or not it is in their commercial interests to do so. Proponents of data portability argue that, once data is unlocked and free to move between platforms, it can be combined and reused in novel ways and in contexts well beyond those in which it was originally collected, all while enabling greater individual control.
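To make the idea concrete: GDPR Article 20 requires that portable personal data be provided in a “structured, commonly used and machine-readable format,” and JSON is one common choice. The sketch below is purely illustrative; the field names and structure are assumptions, not any platform’s actual export schema.

```python
import json

# Hypothetical personal-data export for one data subject. A portable,
# machine-readable format like this is what lets data move between services.
export = {
    "subject_id": "user-123",           # illustrative identifier
    "exported_at": "2020-11-01T00:00:00Z",
    "profile": {"display_name": "A. Example"},
    "activity": [
        {"type": "post", "created": "2020-10-30", "text": "hello"},
    ],
}

# Serialize to a structured, machine-readable payload the subject can
# download and hand to another service.
payload = json.dumps(export, indent=2)
print("activity records exported:", len(export["activity"]))
```

A receiving platform would only need to parse the JSON and map the fields into its own data model, which is why proponents see standardized exports as a path toward liquidity.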

To date, however, arguments for the benefits of the right to data portability have typically failed to connect this rights-based approach with the larger goal of data liquidity and how portability might advance it. This failure to connect these principles and to demonstrate their collective benefits to data subjects, data holders, and society has real-world consequences. Without a clear view of what can be achieved, policymakers are unlikely to develop interventions and incentives to advance liquidity and portability, individuals will not exercise their rights to data portability, and industry will not experiment with use cases and develop the tools and standards needed to make portability and liquidity a reality.

Toward these ends, we have been exploring the current literature on data portability and liquidity, searching for lessons and insights into the benefits that can be unlocked when data liquidity is enabled through the right to data portability. Below we identify some of the greatest potential benefits for society, individuals, and data-holding organizations. These benefits are sometimes in conflict with one another, making the field a contentious one that demands further research on the trade-offs and empirical evidence of impact. In the final section, we also discuss some barriers and challenges to achieving greater data liquidity….(More)”.

Open data and data sharing: An economic analysis


Paper by Alevtina Krotova, Armin Mertens, Marc Scheufen: “Data is an important business resource. It forms the basis for various digital technologies such as artificial intelligence or smart services. However, access to data is unequally distributed in the market. Hence, some business ideas fail due to a lack of data sources. Although many governments have recognised the importance of open data and already make administrative data available to the public on a large scale, many companies are still reluctant to share their data among other firms and competitors. As a result, the economic potential of data is far from being fully exploited. Against this background, we analyse current developments in the area of open data. We compare the characteristics of open governmental and open company data in order to define the necessary framework conditions for data sharing. Subsequently, we examine the status quo of data sharing among firms. We use a qualitative analysis of survey data of European companies to derive the sufficient conditions to strengthen data sharing. Our analysis shows that governmental data is a public good, while company data can be seen as a club or private good. The latter frequently forms the core of companies’ business models and is hence less suitable for data sharing. Finally, we find that promoting legal certainty and demonstrating the economic impact are important policy steps for fostering data sharing….(More)”

Airbnb’s Data ‘Portal’ Promises a Better Relationship With Cities


Article by Patrick Sisson: “When startups go public, a big part of the process is opening up their books and being more transparent about their business model. With global short-term rental giant Airbnb moving towards its own IPO, the company has introduced a new product that seeks to address recent safety concerns and answer the data-sharing requests that critics have long claimed make the company a less-than-perfect partner for local leaders. 

The Airbnb City Portal, which launched on Wednesday as a pilot program with 15 global cities and tourism agencies, aims to provide municipal staff with more efficient access to data about listings, including whether or not they’re complying with local laws. Each city, including Buffalo, San Francisco and Seattle, will have access to a new data dashboard as well as a dedicated staffer at Airbnb. Like so many of its sharing economy and Silicon Valley peers, Airbnb has had a contentious, and evolving, relationship with municipalities and local government ever since launching (an especially fraught situation in Europe, as an EU court just ruled in favor of city regulations of the site). 

At a time when so many tech platforms are wrestling, often unsuccessfully, with the need to moderate the behavior of bad actors who use the site, Airbnb’s City Portal is an attempt to “productize” how the home-sharing site works with local government, says Chris Lehane, Airbnb’s senior vice president for global policy and communications. It’s a more useful framework to access information and report violations, he says. And it delivers on the platform’s long-term goals around sharing data, paying taxes and working with cities on regulation. He frames the move as part of a balancing act around the security and safety responsibilities of local governments and a private global company.

The dashboard will also be useful for local tourism officials: It will provide visitor information, including city of origin and demographic information, that helps bureaus better target their advertising and marketing campaigns….(More)”

Announcing the New Data4COVID19 Repository


Blog by Andrew Zahuranec: “It’s been a long year. Back in March, The GovLab released a Call for Action to build the data infrastructure and ecosystem we need to tackle pandemics and other dynamic societal and environmental threats. As part of that work, we launched a Data4COVID19 repository to monitor progress and curate projects that reused data to address the pandemic. At the time, it was hard to say how long it would remain relevant. We did not know how long the pandemic would last nor how many organizations would publish dashboards, visualizations, mobile apps, user tools, and other resources directed at the crisis’s worst consequences.

Seven months later, the COVID-19 pandemic is still with us. Over one million people around the world are dead and many countries face ever-worsening social and economic costs. Though the frequency with which data reuse projects are announced has slowed since the crisis’s early days, they have not stopped. For months, The GovLab has posted dozens of additions to an increasingly unwieldy Google Doc.

Today, we are making a change. Given the pandemic’s continued urgency and relevance into 2021 and beyond, The GovLab is pleased to release the new Data4COVID19 Living Repository. The upgraded platform allows people to more easily find and understand projects related to the COVID-19 pandemic and data reuse.

The Data4COVID19 Repository

On the platform, visitors will notice a few improvements that distinguish the repository from its earlier iteration. In addition to a main page with short descriptions of each example, we’ve added improved search and filtering functionality. Visitors can sort through any of the projects by:

  • Scope: the size of the target community;
  • Region: the geographic area in which the project takes place;
  • Topic: the aspect of the crisis the project seeks to address; and
  • Pandemic Phase: the stage of pandemic response the project aims to address….(More)”.
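Faceted filtering of this kind amounts to intersecting tag values across a catalog of entries. The sketch below is an illustrative approximation, not the repository’s actual implementation; the entry fields and example records are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    """One hypothetical repository entry, tagged along the four facets."""
    title: str
    scope: str    # size of the target community, e.g. "city", "national"
    region: str   # geographic area of the project
    topic: str    # aspect of the crisis addressed
    phase: str    # stage of pandemic response

def filter_entries(entries, **facets):
    """Return entries matching every facet supplied as a keyword argument."""
    return [
        e for e in entries
        if all(getattr(e, field) == value for field, value in facets.items())
    ]

# Illustrative catalog entries (invented for this sketch).
catalog = [
    Entry("Mobility dashboard", "city", "Europe", "mobility", "response"),
    Entry("Case tracker", "national", "Asia", "epidemiology", "response"),
]

print([e.title for e in filter_entries(catalog, scope="city")])
```

Combining facets (e.g. `scope="city", topic="mobility"`) simply narrows the intersection further, which is the behavior a visitor experiences when stacking filters on the platform.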

A New Normal for Data Collection: Using the Power of Community to Tackle Gender Violence Amid COVID-19


Claudia Wells at SDG Knowledge Hub: “A shocking increase in violence against women and girls has been reported in many countries during the COVID-19 pandemic, amounting to what UN Women calls a “shadow pandemic.”

The jarring facts are:

  • Globally, 243 million women and girls have been subjected to sexual and/or physical violence by an intimate partner in the past 12 months;
  • UNFPA estimates that the pandemic will cause a one-third reduction in progress towards ending gender-based violence by 2030;
  • UNFPA predicts an additional 15 million cases of gender-based violence for every three months of lockdown; and
  • Official data captures only a fraction of the true prevalence and nature of gender-based violence.

The response to these new challenges was discussed at a meeting in July, where a community-led response delivered through local actors was highlighted as key. This means that timely, disaggregated, community-level data on the nature and prevalence of gender-based violence has never been more important. Data collected within communities can play a vital role in filling the gaps and ensuring that data-informed policies reflect the lived experiences of the most marginalized women and girls.

Community Scorecards: Example from Nepal

Collecting and using community-level data can be challenging, particularly under the restrictions of the pandemic. Working in partnerships is therefore vital if we are to respond quickly and flexibly to new and old challenges.

A great example of this is the Leave No One Behind Partnership, which responds to these challenges while delivering on crucial data and evidence at the community level. This important partnership brings together international civil society organizations with national NGOs, civic platforms and community-based organizations to monitor progress towards the SDGs….

While COVID-19 has highlighted the need for local, community-driven data, public health restrictions have also made it more challenging to collect such data. For example, the usual focus group approach to creating a community scorecard is no longer possible.

The coalition in Nepal therefore faces an increased demand for community-driven data while needing to develop a “new normal for data collection.” Partners must: make data collection more targeted; consider how data on gender-based violence are included in secondary sources; and map online resources and other forms of data collection.

Addressing these new challenges may include using more blended collection approaches such as mobile phones or web-based platforms. However, while these may help to facilitate data collection, they come with increased privacy and safeguarding risks that have to be carefully considered to ensure that participants, particularly women and girls, are not placed at increased risk of violence or have their privacy and confidentiality compromised….(More)”.

The ambitious effort to piece together America’s fragmented health data


Nicole Wetsman at The Verge: “From the early days of the COVID-19 pandemic, epidemiologist Melissa Haendel knew that the United States was going to have a data problem. There didn’t seem to be a national strategy to control the virus, and cases were springing up in sporadic hotspots around the country. With such a patchwork response, nationwide information about the people who got sick would probably be hard to come by.

Other researchers around the country were pinpointing similar problems. In Seattle, Adam Wilcox, the chief analytics officer at UW Medicine, was reaching out to colleagues. The city was the first US COVID-19 hotspot. “We had 10 times the data, in terms of just raw testing, than other areas,” he says. He wanted to share that data with other hospitals, so they would have that information on hand before COVID-19 cases started to climb in their area. Everyone wanted to get as much data as possible in the hands of as many people as possible, so they could start to understand the virus.

Haendel was in a good position to help make that happen. She’s the chair of the National Center for Data to Health (CD2H), a National Institutes of Health program that works to improve collaboration and data sharing within the medical research community. So one week in March, just after she’d started working from home and pulled her 10th grader out of school, she started trying to figure out how to use existing data-sharing projects to help fight this new disease.

The solution Haendel and CD2H landed on sounds simple: a centralized, anonymous database of health records from people who tested positive for COVID-19. Researchers could use the data to figure out why some people get very sick and others don’t, how conditions like cancer and asthma interact with the disease, and which treatments end up being effective.

But in the United States, building that type of resource isn’t easy. “The US healthcare system is very fragmented,” Haendel says. “And because we have no centralized healthcare, that makes it also the case that we have no centralized healthcare data.” Hospitals, citing privacy concerns, don’t like to give out their patients’ health data. Even if hospitals agree to share, they all use different ways of storing information. At one institution, the classification “female” could go into a record as one, and “male” could go in as two — and at the next, they’d be reversed….(More)”.
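The sex-coding mismatch described above is the kind of problem a centralized database must solve by translating each institution’s local codes into one shared vocabulary before records are pooled. The sketch below is a minimal, hypothetical illustration of that idea; the site names and codebooks are invented, and real harmonization pipelines map to far larger standard vocabularies.

```python
# Hypothetical per-site codebooks: the same numeric codes carry
# opposite meanings at the two institutions, as in the example above.
SITE_CODEBOOKS = {
    "hospital_a": {1: "female", 2: "male"},
    "hospital_b": {1: "male", 2: "female"},  # codes reversed
}

def harmonize(site, code):
    """Translate a site-local code into the shared term, or None if unknown."""
    return SITE_CODEBOOKS.get(site, {}).get(code)

# After translation, records from both sites use one vocabulary and
# can be pooled and queried together.
print(harmonize("hospital_a", 1), harmonize("hospital_b", 1))
```

Every other locally coded field (diagnoses, lab units, race/ethnicity categories) needs an analogous mapping, which is why building such a resource across a fragmented system is slow, painstaking work.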

Data Sharing 2.0: New Data Sharing, New Value Creation


MIT CISR research: “…has found that interorganizational data sharing is a top concern of companies; leaders often find data sharing costly, slow, and risky. Interorganizational data sharing, however, is requisite for new value creation in the digital economy. Digital opportunities require data sharing 2.0: cross-company sharing of complementary data assets and capabilities, which fills data gaps and allows companies, often collaboratively, to develop innovative solutions. This briefing introduces three sets of practices—curated content, designated channels, and repeatable controls—that help companies accelerate data sharing 2.0….(More)”.

Responsible group data for children


Issue Brief by Andrew Young: “Understanding how and why group data is collected and what can be done to protect children’s rights… While the data protection field largely focuses on individual data harms, that focus obscures, and can exacerbate, the risks facing groups of people, such as the residents of a particular village, rather than individuals.

Though not well-represented in the current responsible data literature and policy domains writ large, the challenges group data poses are immense. Moreover, the unique and amplified group data risks facing children are even less scrutinized and understood.

To achieve Responsible Data for Children (RD4C) and ensure effective and legitimate governance of children’s data, government policymakers, data practitioners, and institutional decision makers need to ensure children’s group data are a core consideration in all relevant policies, procedures, and practices….(More)”. (See also Responsible Data for Children).

From (Horizontal and Sectoral) Data Access Solutions Towards Data Governance Systems


Paper by Wolfgang Kerber: “Starting with the assumption that under certain conditions mandatory solutions for access to privately held data can also be necessary, this paper analyses the legal and regulatory instruments for the implementation of such data access solutions. After an analysis of the advantages and problems of horizontal versus sectoral access solutions, the main thesis of this paper is that focusing only on data access solutions is often not enough for achieving the desired positive effects on competition and innovation. An analysis of two examples, access to bank account data (PSD2: Second Payment Service Directive) and access to data of the connected car, shows that successful data access solutions might require an entire package of additional complementary regulatory solutions (e.g. regarding interoperability, standardisation, and safety and security), and therefore the analysis and regulatory design of entire data governance systems (based upon an economic market failure analysis). In the last part, important instruments that can be used within data governance systems are discussed, such as data trustee solutions….(More)”.