Public Data Commons: A public-interest framework for B2G data sharing in the Data Act


Policy Brief by Alek Tarkowski & Francesco Vogelezang: “It is by now a truism that data is a crucial resource in the digital era. Yet today, access to data and the capacity to use and benefit from it are unevenly distributed. A new understanding of data is needed, one that accounts for society-wide data sharing and value creation. This would help address power asymmetries related to data ownership and the capacity to use data, and fill the public value gap in data-driven growth and innovation.

Public institutions are also in a unique position to safeguard the rule of law, ensure democratic control and accountability, and drive the use of data to generate non-economic value.

“Data sharing for public good” narratives have been advanced for over a decade, arguing that privately held big data should be used in the public interest. The idea of the commons has attracted the attention of policymakers interested in developing institutional responses that can advance public interest goals. The concept of the data commons offers a generative model of property that is well aligned with the ambitions of the European data strategy. And by employing the idea of the data commons, the public debate can be shifted beyond the opposition between treating data as a commodity and protecting it as the object of fundamental rights.

The European Union is uniquely positioned to deliver a data governance framework that ensures Business-to-Government (B2G) data sharing in the public interest. The policy vision for such a framework has been presented in the European strategy for data, and specific recommendations for a robust B2G data sharing model have been made by the Commission’s high-level expert group.

There are three connected objectives that must be achieved through a B2G data sharing framework. Firstly, access to data and the capacity to make use of it needs to be ensured for a broader range of actors. Secondly, exclusive corporate control over data needs to be reduced. And thirdly, the information power of the state and its generative capacity should be strengthened.

Yet the current proposal for the Data Act fails to meet these goals, due to a narrow B2G data sharing mandate limited only to situations of public emergency and exceptional need.

This policy brief therefore presents a model for public-interest B2G data sharing, aimed at complementing the current proposal. This framework would also create a robust baseline for sectoral regulations, like the recently proposed Regulation on the European Health Data Space. The proposal includes the creation of the European Public Data Commons, a body that acts as a recipient and clearinghouse for the data made available…(More)”.

We can’t create shared value without data. Here’s why


Article by Kriss Deiglmeier: “In 2011, I was co-teaching a course on Corporate Social Innovation at the Stanford Graduate School of Business, when our syllabus nearly went astray. A paper appeared in Harvard Business Review (HBR), titled “Creating Shared Value,” by Michael E. Porter and Mark R. Kramer. The students’ excitement was palpable: This could transform capitalism, enabling Adam Smith’s “invisible hand” to bend the arc of history toward not just efficiency and profit, but toward social impact…

History shows that the promise of shared value hasn’t exactly been realized. In the past decade, most indexes of inequality, health, and climate change have gotten worse, not better. The wealth gap has widened – the combined worth of the top 1% in the United States increased from 29% of all wealth in 2011 to 32.3% in 2021, while the bottom 50% increased their share from 0.4% to 2.6% of overall wealth; everyone in between saw their share of wealth decline. The federal minimum wage has remained stagnant at $7.25 per hour while the US dollar has seen a cumulative price increase of 27.81%…
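Taken at face value, the quoted figures also pin down what happened to everyone in between, and what the flat minimum wage now buys (a back-of-envelope check, assuming the 27.81% cumulative inflation figure runs from 2011, as the comparison suggests):

```latex
% Share of wealth held by "everyone in between", implied by the quoted figures
\begin{align*}
\text{middle share}_{2011} &= 100\% - 29\% - 0.4\% = 70.6\%\\
\text{middle share}_{2021} &= 100\% - 32.3\% - 2.6\% = 65.1\%\\
\intertext{Real value of the \$7.25 minimum wage after 27.81\% cumulative inflation:}
\$7.25 \,/\, 1.2781 &\approx \$5.67 \text{ in 2011 dollars}
\end{align*}
```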

That said, data is by no means the only – or even the primary – obstacle to achieving shared value, but the role of data is a key aspect that needs to change. In a shared value construct, data is used primarily for profit, not for societal benefit at the speed and scale required.

Unfortunately, the technology transformation has resulted in an emerging data divide. While data strategies have benefited the commercial sector, the public sector and nonprofits lag behind in the education, tools, resources, and talent needed to use data to find and scale solutions. The result is a disparity between the expanding use of data to create commercial value and the comparatively weak use of data to solve social and environmental challenges…

Data is part of our future and is being used by corporations to drive success, as they should. Bringing data into the shared value framework is about ensuring that other entities and organizations also have the access and tools to harness data for solving social and environmental challenges….

Business has the opportunity to help close the data divide through a shared value framework by bringing talent, products and resources to bear beyond corporate boundaries to help solve our social and environmental challenges. To succeed, it is essential to re-envision the shared value framework so that data sits at its core and these challenges can be solved collectively, for everyone. This will require a strong commitment to collaboration between business, government and NGOs – and it will undoubtedly require a dedication to increasing data literacy at all levels of education….(More)”.

Facebook-owner Meta to share more political ad targeting data


Article by Elizabeth Culliford: “Facebook owner Meta Platforms Inc (FB.O) will share more data on targeting choices made by advertisers running political and social-issue ads in its public ad database, it said on Monday.

Meta said it would also include detailed targeting information for these individual ads in its “Facebook Open Research and Transparency” database used by academic researchers, in an expansion of a pilot launched last year.

“Instead of analyzing how an ad was delivered by Facebook, it’s really going and looking at an advertiser strategy for what they were trying to do,” said Jeff King, Meta’s vice president of business integrity, in a phone interview.

The social media giant has faced pressure in recent years to provide transparency around targeted advertising on its platforms, particularly around elections. In 2018, it launched a public ad library, though some researchers criticized it for glitches and a lack of detailed targeting data.

Meta said the ad library will soon show a summary of targeting information for social issue, electoral or political ads run by a page…

The company has run various programs with external researchers as part of its transparency efforts. Last year, it said a technical error meant flawed data had been provided to academics in its “Social Science One” project…(More)”.

Mobile phone data reveal the effects of violence on internal displacement in Afghanistan


Paper: “Nearly 50 million people globally have been internally displaced due to conflict, persecution and human rights violations. However, the study of internally displaced persons—and the design of policies to assist them—is complicated by the fact that these people are often underrepresented in surveys and official statistics. We develop an approach to measure the impact of violence on internal displacement using anonymized high-frequency mobile phone data. We use this approach to quantify the short- and long-term impacts of violence on internal displacement in Afghanistan, a country that has experienced decades of conflict. Our results highlight how displacement depends on the nature of violence. High-casualty events, and violence involving the Islamic State, cause the most displacement. Provincial capitals act as magnets for people fleeing violence in outlying areas. Our work illustrates the potential for non-traditional data sources to facilitate research and policymaking in conflict settings…(More)”.
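The excerpt does not reproduce the authors' pipeline, but the core technique (inferring each subscriber's 'home' location from anonymized call detail records and tracking how it changes around violent events) can be sketched. A minimal, hypothetical illustration in pandas, with invented column names rather than the paper's actual code:

```python
import pandas as pd

# Hypothetical CDR schema: one row per call, with columns
# user_id, timestamp (datetime64) and tower_id.

def weekly_home(cdr: pd.DataFrame) -> pd.DataFrame:
    """Infer each subscriber's weekly 'home' tower as the modal tower of
    their nighttime calls (a common heuristic in the CDR literature)."""
    hours = cdr["timestamp"].dt.hour
    night = cdr[(hours >= 19) | (hours < 7)].copy()
    night["week"] = night["timestamp"].dt.to_period("W")
    return (
        night.groupby(["user_id", "week"])["tower_id"]
        .agg(lambda s: s.mode().iat[0])
        .rename("home_tower")
        .reset_index()
    )

def share_displaced(homes: pd.DataFrame, event_week: pd.Period,
                    affected_towers: set) -> float:
    """Fraction of subscribers whose home tower was in the affected area the
    week before a violent event but somewhere else the week after."""
    before = homes[homes["week"] == event_week - 1].set_index("user_id")["home_tower"]
    after = homes[homes["week"] == event_week + 1].set_index("user_id")["home_tower"]
    exposed = before[before.isin(affected_towers)]
    moved = after.reindex(exposed.index).dropna()
    return float((~moved.isin(affected_towers)).mean())
```

Comparing this share across event types (for instance, high- versus low-casualty events, or events involving different armed groups) is the kind of contrast the paper draws.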

Building a Data Infrastructure for the Bioeconomy


Article by Gopal P. Sarma and Melissa Haendel: “While the development of vaccines for COVID-19 has been widely lauded, other successful components of the national response to the pandemic have not received as much attention. The National COVID Cohort Collaborative (N3C), for example, flew under the public’s radar, even though it aggregated crucial US public health data about the new disease through cross-institutional collaborations among government, private, and nonprofit health and research organizations. These data, which were made available to researchers via cutting-edge software tools, have helped in myriad ways: they led to identification of the clinical characteristics of acute COVID-19 for risk prediction, assisted in providing clinical care for immunocompromised adults, revealed how COVID infection affects children, and documented that vaccines appear to reduce the risk of developing long COVID.

N3C has created the largest national, publicly available patient-level dataset in US history. Through a unique public-private partnership, over 300 participating organizations quickly overcame privacy concerns and data silos to include 13 million patient records in the project. More than 3,000 participating scientists are now working to overcome a particular challenge the United States faces—the lack of the kind of national healthcare data infrastructure available in many other countries—to support public health and medical responses. N3C shows great promise for answering questions related to COVID, but it could easily be expanded to many areas of public health, including pandemic preparedness and monitoring disease status across the population.

As public servants dedicated to improving public health and equity, we believe that to unite the nation’s fragmented public health system, the United States should establish a standing capacity to collect, harmonize, and sustain a wide range of data types and sources. The public health data collected by N3C would ultimately be but one component of a rich landscape of interoperable data systems that can guide public policy in an era of rapid environmental change, sophisticated biological threats, and an economy enabled by biotechnology. Such an effort will require new thinking about data collection, infrastructure, and regulation, but its benefits could be enormous—enabling policymakers to make decisions in an increasingly complex world. And as the interconnections between society, industry, and government continue to intensify, decisionmaking of all types and scales will be more efficient and responsive if it can rely on significantly expanded data collection and analysis capabilities…(More)”.

Using mobile money data and call detail records to explore the risks of urban migration in Tanzania


Paper by Rosa Lavelle-Hill: “Understanding what factors predict whether an urban migrant will end up in a deprived neighbourhood or not could help prevent the exploitation of vulnerable individuals. This study leveraged pseudonymized mobile money interactions combined with cell phone data to shed light on urban migration patterns and deprivation in Tanzania. Call detail records were used to identify individuals who migrated to Dar es Salaam, Tanzania’s largest city. A street survey of the city’s subwards was used to determine which individuals moved to more deprived areas. t-tests showed that people who settled in poorer neighbourhoods had less money coming into their mobile money account after they moved, but not before. A machine learning approach was then utilized to predict which migrants will move to poorer areas of the city, making them arguably more vulnerable to poverty, unemployment and exploitation. Features indicating the strength and location of people’s social connections in Dar es Salaam before they moved (‘pull factors’) were found to be most predictive, more so than traditional ‘push factors’ such as proxies for poverty in the migrant’s source region…(More)”.
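The excerpt names the ingredients (pre-move mobile money inflows as 'push' proxies, the strength and location of social ties as 'pull' factors, and a deprived/not-deprived outcome from the street survey) but not the exact model, so the sketch below is only a plausible reconstruction in scikit-learn, with hypothetical feature names:

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical per-migrant feature table, built from data before the move:
#   inflow_pre        - mobile money received per month (a 'push' proxy)
#   ties_in_dar       - distinct contacts already living in Dar es Salaam
#   ties_in_poor_area - contacts whose towers fall in deprived subwards ('pull')
#   moved_to_deprived - outcome from the street-survey deprivation map (0/1)
migrants = pd.read_csv("migrant_features.csv")

X = migrants[["inflow_pre", "ties_in_dar", "ties_in_poor_area"]]
y = migrants["moved_to_deprived"]

clf = GradientBoostingClassifier(random_state=0)
print("cross-validated AUC:", cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean())

# Fit on all data to see which features carry the signal; the paper reports
# 'pull' features (pre-move social ties) dominating traditional 'push' proxies.
clf.fit(X, y)
print(dict(zip(X.columns, clf.feature_importances_.round(3))))
```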

A Consumer Price Index for the 21st Century


Press Release by the National Academies of Sciences, Engineering, and Medicine: “The Bureau of Labor Statistics (BLS) should undertake a new strategy to modernize the Consumer Price Index by accelerating its use of new data sources and developing price indexes based on different income levels, says a new report from the National Academies of Sciences, Engineering, and Medicine.

The Consumer Price Index is the most widely used measure of inflation in the U.S. It is used to determine cost-of-living allowances and, importantly, influences monetary policy, among many other private- and public-sector applications. The new report, Modernizing the Consumer Price Index for the 21st Century, says the index has traditionally relied on field-generated data, such as prices observed in person at grocery stores or major retailers. These data have become more challenging and expensive to collect, and the availability of vast digital sources of consumer price data presents an opportunity. BLS has begun tapping into these data and has said its objective is to switch a significant portion of its measurement to nontraditional and digital data sources by 2024.

“The enormous economic disruption of the COVID-19 pandemic presents a perfect case study for the need to rapidly employ new data sources for the Consumer Price Index,” said Daniel E. Sichel, professor of economics at Wellesley College, and chair of the committee that wrote the report. “Modernizing the Consumer Price Index can help our measurement of household costs and inflation be more accurate, timelier, and ultimately more useful for policymakers responding to rapidly changing economic conditions.”…

The report says BLS should embark on a strategy of accelerating and enhancing the use of scanner, web-scraped, and digital data directly from retailers in compiling the Consumer Price Index. Scanner data — recorded at the point of sale or by consumers in their homes — can expand the variety of products represented in the Consumer Price Index, and better detect shifts in buying patterns. Web-scraped data can more nimbly track the prices of online goods, and goods where one company dominates the market. Permanently automating web-scraping of price data should be a high priority for the Consumer Price Index program, especially for food, electronics, and apparel, the report says.
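The report excerpt does not prescribe an index formula for these new sources, but a common elementary-level choice for unweighted price data is the Jevons index, a geometric mean of price relatives; here is a toy sketch of chaining one from scraped monthly prices, with invented products and numbers:

```python
import numpy as np
import pandas as pd

# Toy web-scraped observations: one row per (month, product, price).
prices = pd.DataFrame({
    "month":   ["2022-01"] * 3 + ["2022-02"] * 3,
    "product": ["milk", "bread", "eggs"] * 2,
    "price":   [3.50, 2.20, 2.80, 3.65, 2.20, 3.10],
})

wide = prices.pivot(index="month", columns="product", values="price").sort_index()

# Jevons elementary index: geometric mean of month-over-month price relatives,
# taken over products observed in both months (matched-model approach).
relatives = wide / wide.shift(1)
jevons = np.exp(np.log(relatives).mean(axis=1))
index = 100 * jevons.fillna(1).cumprod()
print(index)  # 2022-01: 100.0, 2022-02: ~104.9
```

In practice, agencies must also handle product churn (items appearing and disappearing online), one reason the report urges careful assessment of new sources and methods before they enter the official index.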

Embracing these alternative data sources now will ensure that the accuracy and timeliness of the Consumer Price Index will not be compromised in the future, the report adds. Moreover, accelerating this process will give BLS time to carefully assess new data sources and methodologies before taking the decision to incorporate them in the official index….(More)”

European Health Union: A European Health Data Space for people and science


Press Release: “Today, the European Commission launched the European Health Data Space (EHDS), one of the central building blocks of a strong European Health Union. The EHDS will help the EU achieve a quantum leap forward in the way healthcare is provided to people across Europe. It will empower people to control and utilise their health data in their home country or in other Member States. It will foster a genuine single market for digital health services and products. And it will offer a consistent, trustworthy and efficient framework for using health data for research, innovation, policy-making and regulatory activities, while ensuring full compliance with the EU’s high data protection standards…

Putting people in control of their own health data, in their country and cross-border

  • Thanks to the EHDS, people will have immediate and easy access to their health data in electronic form, free of charge. They can easily share these data with other health professionals in and across Member States to improve healthcare delivery. Citizens will be in full control of their data and will be able to add information, rectify wrong data, restrict others’ access and obtain information on how their data are used and for what purpose.
  • Member States will ensure that patient summaries, ePrescriptions, images and image reports, laboratory results, and discharge reports are issued and accepted in a common European format.
  • Interoperability and security will become mandatory requirements. Manufacturers of electronic health record systems will need to certify compliance with these standards.
  • To ensure that citizens’ rights are safeguarded, all Member States have to appoint digital health authorities. These authorities will participate in the cross-border digital infrastructure (MyHealth@EU) that will support patients to share their data across borders.

Improving the use of health data for research, innovation and policymaking

  • The EHDS creates a strong legal framework for the use of health data for research, innovation, public health, policy-making and regulatory purposes. Under strict conditions, researchers, innovators, public institutions or industry will have access to large amounts of high-quality health data, crucial to developing life-saving treatments, vaccines or medical devices and to ensuring better access to healthcare and more resilient health systems.
  • The access to such data by researchers, companies or institutions will require a permit from a health data access body, to be set up in all Member States. Access will only be granted if the requested data is used for specific purposes in closed, secure environments and without revealing the identity of the individual. It is also strictly prohibited to use the data for decisions that are detrimental to citizens, such as designing harmful products or services or increasing an insurance premium.
  • The health data access bodies will be connected to the new decentralised EU-infrastructure for secondary use (HealthData@EU) which will be set up to support cross-border projects…(More)”

Data scientists are using the most annoying feature on your phones to save lives in Ukraine


Article by Bernhard Warner: “In late March, five weeks into Russia’s war on Ukraine, an international team of researchers, aid agency specialists, public health experts, and data nerds gathered on a Zoom call to discuss one of the tragic by-products of the war: the refugee crisis.

The numbers discussed were grim. The United Nations had just declared Ukraine was facing the biggest humanitarian crisis to hit Europe since World War II as more than 4 million Ukrainians—roughly 10% of the population—had been forced to flee their homes to evade Russian President Vladimir Putin’s deadly and indiscriminate bombing campaign. That total has since swelled to 5.5 million, the UN estimates.

What the aid specialists on the call wanted to figure out was how many Ukrainian refugees still remained in the country (a population known as “internally displaced people”) and how many had crossed borders to seek asylum in the neighboring European Union countries of Poland, Slovakia, and Hungary, or south into Moldova. 

Key to an effective humanitarian response of this magnitude is getting accurate and timely data on the flow of displaced people traveling from a Point A danger zone to a Point B safe space. And nobody on the call, which was organized by CrisisReady, an A-team of policy experts and humanitarian emergency responders, had anything close to precise numbers.

But they did have a kind of secret weapon: mobility data.

“The importance of mobility data is often overstated,” Rohini Sampoornam Swaminathan, a crisis specialist at Unicef, told her colleagues on the call. Such anonymized data—pulled from social media feeds, geolocation apps like Google Maps, cell phone towers and the like—may not give the precise picture of what’s happening on the ground in a moment of extreme crisis, “but it’s valuable” as it can fill in points on a map. ”It’s important,” she added, “to get a picture for where people are moving, especially in the first days.”
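As a concrete illustration of how such anonymized data can 'fill in points on a map', the following sketch (hypothetical schema, not any particular team's pipeline) aggregates device pings into daily region-to-region flow counts, the basic input for displacement maps of this kind:

```python
import pandas as pd

# Hypothetical anonymized pings: device_hash, timestamp, region.
pings = pd.read_csv("pings.csv", parse_dates=["timestamp"])

# Take each device's last observed region per day...
daily = (
    pings.sort_values("timestamp")
    .assign(day=lambda d: d["timestamp"].dt.date)
    .groupby(["device_hash", "day"], as_index=False)
    .last()
)

# ...then count devices whose region changed since their previous observed
# day, yielding an origin-destination flow table for mapping.
daily["prev_region"] = daily.groupby("device_hash")["region"].shift(1)
flows = (
    daily[daily["prev_region"].notna() & (daily["prev_region"] != daily["region"])]
    .groupby(["day", "prev_region", "region"])
    .size()
    .rename("devices")
    .reset_index()
)
print(flows.head())
```

Aggregating at this level is also how such projects avoid exposing any individual's trajectory while still showing where people are moving.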

Ukraine, a nation of relatively tech-savvy social media devotees and mobile phone users, is rich in mobility data, and that’s profoundly shaped the way the world sees and interprets the deadly conflict. The CrisisReady group believes the data has an even higher calling—that it can save lives.

Since the first days of Putin’s bombing campaign, various international teams have been tapping publicly available mobility data to map the refugee crisis and coordinate an effective response. They believe the data can reveal where war-torn Ukrainians are now, and even where they’re heading. In the right hands, the data can provide local authorities the intel they need to get essential aid—medical care, food, and shelter—to the right place at the right time…(More)”

Data sharing between humanitarian organisations and donors


Report by Larissa Fast: “This report investigates issues related to data sharing between humanitarian actors and donors, with a focus on two key questions:

  • What formal or informal frameworks govern the collection and sharing of disaggregated humanitarian data between humanitarian actors and donors?
  • How are these frameworks and the related requirements understood or perceived by humanitarian actors and donors?

Drawing on interviews with donors and humanitarians about data sharing practices and an examination of formal documents, the research finds that, overall and perhaps most importantly, references to ‘data’ in the context of humanitarian operations are usually generic and lack a consistent definition or even a shared terminology. Complex regulatory frameworks, variability in donor expectations (both among and within donor governments, e.g. at country or field/headquarters level), and humanitarians’ differing experiences of data sharing all complicate the nature and handling of data sharing requests. Both the lack of data literacy and differing perceptions of operational data management risks exacerbate many issues related to data sharing and create inconsistent practice (see the full summary of findings in Table 3).

More specifically, while much formal documentation about data sharing between humanitarians and donors is available in the public domain, few documents contain explicit policies or clauses on data sharing; most refer only to financial or compliance data and programme reporting requirements. Additionally, the justifications for sharing disaggregated humanitarian data are framed most often in terms of accountability, compliance, efficiency, and programme design. Most requests for data are linked to monitoring and compliance, as well as requests for data as ‘assurances’. Even so, donors indicated that although they request detailed or disaggregated data, they may not have the time or the human and technical capacity to deal with it properly. In general, donor interviewees insisted that no record-level data is shared within their governments, only data in aggregated or low- or no-sensitivity formats….(More)”.