Data Access, Consumer Interests and Public Welfare


Book edited by Bundesministerium der Justiz und für Verbraucherschutz and Max-Planck-Institut für Innovation und Wettbewerb: “Data are considered to be key for the functioning of the data economy as well as for pursuing multiple public interest concerns. Against this backdrop this book strives to devise new data access rules for future legislation. To do so, the contributions first explain the justification for such rules from an economic and more general policy perspective. Then, building on the constitutional foundations and existing access regimes, they explore the potential of various fields of the law (competition and contract law, data protection and consumer law, sector-specific regulation) as a basis for the future legal framework. The book also addresses the need to coordinate data access rules with intellectual property rights and to integrate these rules as one of multiple measures in larger data governance systems. Finally, the book discusses the enforcement of the Government’s interest in using privately held data as well as potential data access rights of the users of connected devices….(More)”.

The Use of Mobility Data for Responding to the COVID-19 Pandemic


New Report, Repository and set of Case Studies commissioned by the Open Data Institute: “…The GovLab and Cuebiq firstly assembled a repository of mobility data collaboratives related to Covid-19. They then selected five of these to analyse further, and produced case studies on each of the collaboratives (which you can find below in the ‘Key outputs’ section).

After analysing these initiatives, Cuebiq and The GovLab then developed a synthesis report, which contains sections focused on:

  • Mobility data – what it is and how it can be used
  • Current practice – insights from five case studies
  • Prescriptive analysis – recommendations for the future

Findings and recommendations

Based on this analysis, the authors of the report recommend nine actions which have the potential to enable more effective, sustainable and responsible re-use of mobility data through data collaboration to support decision making regarding pandemic prevention, monitoring, and response:

  1. Developing and clarifying a governance framework to enable the trusted, transparent, and accountable reuse of privately held data in the public interest under a clear regulatory framework
  2. Building capacity of organisations in the public and private sector to reuse and act on data through investments in training, education, and reskilling of relevant authorities; especially driving support for institutions in the Global South
  3. Establishing data stewards in organisations who can coordinate and collaborate with counterparts on using data in the public’s interest and acting on it.
  4. Establishing dedicated and sustainable CSR (Corporate Social Responsibility) programs on data in organisations to coordinate and collaborate with counterparts on using and acting upon data in the public’s interest.
  5. Building a network of data stewards to coordinate and streamline efforts while promoting greater transparency; as well as exchange best practices and lessons learned.
  6. Engaging citizens about how their data is being used so they can clearly articulate how they want their data to be responsibly used, shared, and protected.
  7. Promoting technological innovation through collaboration between funders (eg governments and foundations) and researchers (eg data scientists) to develop and deploy useful, privacy-preserving technologies.
  8. Unlocking funds from a variety of sources to ensure projects are sustainable and can operate long term.
  9. Increasing research and spurring evidence gathering by publishing easily accessible research and creating dedicated centres to develop best practices.

This research begins to demonstrate the value that a handful of new data-sharing initiatives have had in the ongoing response to Covid-19. The pandemic isn’t yet over, and we will need to continue to assess and evaluate how data has been shared – both successfully and unsuccessfully – and who has benefited or been harmed in the process. More research is needed to highlight the lessons from this emergency that can be applied to future crises….(More)”.

Unlocking Responsible Access to Data to Increase Equity and Economic Mobility


Report by the Markle Foundation and the Bill and Melinda Gates Foundation (BMGF): “Economic mobility remains elusive for far too many Americans and has been declining for several decades. A person born in 1980 is 50% less likely to earn more than their parents than a person born in 1950 is. While all children who grow up in low-opportunity neighborhoods face mobility challenges, racial, ethnic, and gender disparities add even more complexity. In 99% of neighborhoods in America, Black boys earn less, and are more likely to fall into poverty, than white boys, even when they grow up on the same block, attend the same schools, and have the same family income. In 2016, a Pew Research study found that the median wealth of white households was ten times the median wealth of Black households and eight times that of Hispanic households. The COVID-19 pandemic has further exacerbated existing disparities, as communities of color suffer higher exposure and death rates, along with greater job loss and increased food and housing insecurity.

Reversing this overall decline to address the persistent racial, ethnic, and gender gaps in economic mobility is one of the great challenges of our time. Some progress has been made in identifying the causes and potential solutions to declining mobility, yet policymakers, researchers, and the public still lack access to critical data necessary to understand which policies, programs, interventions, and investments are most effective at creating opportunity for students and workers, particularly those struggling with intergenerational poverty. Data collected across all levels of governments, nonprofit organizations, and private sector companies can help answer foundational policy and research questions on what drives economic mobility. There are promising efforts underway to improve government data infrastructure and processes at both the federal and state levels, but critical data often remains siloed, and legitimate concerns about privacy and civil liberties can make data difficult to share. Often, data on vulnerable populations most in need of services is of poor quality or is not collected at all.

To tackle this challenge, the Bill and Melinda Gates Foundation (BMGF) and the Markle Foundation (Markle) spent much of 2020 working with a diverse range of experts to identify strategic opportunities to accelerate progress towards unlocking data to improve policymaking, answer foundational research questions, and ensure that individuals can easily and responsibly access the information they need to make informed decisions in a rapidly changing environment….(More)”.

The mysterious user editing a global open-source map in China’s favor


Article by Vittoria Elliott and Nilesh Christopher: “Late last year, Nick Doiron spotted an article in The New York Times, detailing how China had built a village along the contested border with neighboring Bhutan. Doiron is a mapping aficionado and longtime contributor to OpenStreetMap (OSM), an open-source mapping platform that relies on an army of unpaid volunteers, just as Wikipedia does. Governments, universities, humanitarian groups, and companies like Amazon, Grab, Baidu, and Facebook all use data from OSM, making it an important tool that underpins ride-hailing apps and other technologies used by millions of people.

After reading the article, Doiron went to add new details about the Chinese village to OSM, which he expected would be missing. But when he zoomed in on the area, he made a peculiar discovery: Someone else had already documented the settlement before it was reported in the Times, and they had included granular details that Doiron couldn’t find anywhere else.

“They mapped the outlines of the buildings,” Doiron said, labeling one as a kindergarten, one as a police station, and another as a radio station. Even if the mysterious person had bought a satellite image from a private company, “I don’t know how they could have had that specific kind of information,” Doiron said.

That wasn’t the only thing that struck Doiron as strange. The user had also made the changes under the name NM$L, Chinese slang for the insult “Your mom is dead,” and linked to a Chinese rap music label that shares the same name. An accompanying bio hinted at their motives: “Safeguarding national sovereignty, unity and territorial integrity is the common obligation of all Chinese people, including compatriots in Hong Kong, Macao and Taiwan,” it read.

“Most people on OpenStreetMap don’t even have anything in their profile,” said Doiron. “It’s not like a social media site.”

As he looked deeper, Doiron discovered that NM$L had made several other edits, many of them along China’s border and in contested territories. The account had added changes to the Spratly Islands, an archipelago that an international tribunal ruled in 2016 was not part of China’s possible territorial claims, though it has continued to develop in the area. The account also drew along the Line of Actual Control (LAC) that separates Indian and Chinese territory in the disputed Himalayan border region, which the two countries fought a war over in 1962.

What, Doiron wondered, is going on here? 

Anyone can contribute to OSM, which makes the site democratic and open, but also leaves it vulnerable to the politics and perspectives of its individual contributors. This wasn’t the first time Doiron had heard of a user making edits in a certain country’s favor. “I know there are pro-India accounts that have added things like military checkpoints from the India perspective,” he said….(More)”.

Sustainable mobility: Policy making for data sharing


WBCSD report: “The demand for mobility will grow significantly in the coming years, but our urban transportation systems are at their limits. Increasing digitalization and data sharing in urban mobility can help governments and businesses to respond to this challenge and accelerate the transition toward sustainability. There is an urgent need for greater policy coherence in data-sharing ecosystems and governments need to adopt a more collaborative approach toward policy making.

With well-orchestrated policies, data sharing can result in shared value for public and private sectors and support the achievement of sustainability goals. Data-sharing policies should also aim to minimize risks around privacy and cybersecurity, minimize mobility biases rooted in race, gender and age, prevent the creation of runaway data monopolies and bridge the widening data divide.

This report outlines a global policy framework and practical guidance for policy making on data sharing. The report offers multiple case studies from across the globe to document emerging good practices and policy suggestions, recognizing the hyperlocal context of mobility needs and policies, the nascent state of the data-sharing market and limited evidence from regulatory practices….(More)”

An early warning approach to monitor COVID-19 activity with multiple digital traces in near real time


Paper by Nicole E. Kogan et al: “We propose that several digital data sources may provide earlier indication of epidemic spread than traditional COVID-19 metrics such as confirmed cases or deaths. Six such sources are examined here: (i) Google Trends patterns for a suite of COVID-19–related terms; (ii) COVID-19–related Twitter activity; (iii) COVID-19–related clinician searches from UpToDate; (iv) predictions by the global epidemic and mobility model (GLEAM), a state-of-the-art metapopulation mechanistic model; (v) anonymized and aggregated human mobility data from smartphones; and (vi) Kinsa smart thermometer measurements.

We first evaluate each of these “proxies” of COVID-19 activity for their lead or lag relative to traditional measures of COVID-19 activity: confirmed cases, deaths attributed, and ILI. We then propose the use of a metric combining these data sources into a multiproxy estimate of the probability of an impending COVID-19 outbreak. Last, we develop probabilistic estimates of when such a COVID-19 outbreak will occur on the basis of multiproxy variability. These outbreak-timing predictions are made for two separate time periods: the first, a “training” period, from 1 March to 31 May 2020, and the second, a “validation” period, from 1 June to 30 September 2020. Consistent predictive behavior among proxies in both of these subsequent and nonoverlapping time periods would increase the confidence that they may capture future changes in the trajectory of COVID-19 activity….(More)”.
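The lead-or-lag evaluation described above can be illustrated with a minimal sketch. This is not the authors' method: it simply shifts a proxy series against a target series and keeps the shift with the highest Pearson correlation. The function names (`pearson`, `best_lead`) and the toy series are hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def best_lead(proxy, target, max_shift):
    """Return (lead, r) where proxy[t] correlates best with target[t + lead].

    A positive lead means the proxy moves before the target.
    Both series must have the same length.
    """
    n = len(proxy)
    best = None
    for lead in range(-max_shift, max_shift + 1):
        if lead >= 0:
            p, t = proxy[: n - lead], target[lead:]
        else:
            p, t = proxy[-lead:], target[: n + lead]
        r = pearson(p, t)
        if best is None or r > best[1]:
            best = (lead, r)
    return best


# Toy data (made up): the target repeats the proxy's bump three days later.
proxy = [0, 0, 0, 1, 3, 6, 3, 1, 0, 0, 0, 0, 0, 0]
target = [0, 0, 0, 0, 0, 0, 1, 3, 6, 3, 1, 0, 0, 0]

lead, r = best_lead(proxy, target, max_shift=5)
print(lead, round(r, 3))
```

On real data, a proxy would be compared this way against confirmed cases or deaths. The paper's probabilistic multiproxy estimate goes well beyond this single-series comparison.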

Establishment of Sustainable Data Ecosystems


Report and Recommendations for the evolution of spatial data infrastructures by S. Martin, P. Gautier, S. Turki, and A. Kotsev: “The purpose of this study is to identify and analyse a set of successful data ecosystems and to address recommendations that can act as catalysts of data-driven innovation in line with the recently published European data strategy. The work presented here tries to identify to the largest extent possible actionable items.

Specifically, the study contributes with insights into the approaches that would help in the evolution of existing spatial data infrastructures (SDI), which are usually governed by the public sector and driven by data providers, to self-sustainable data ecosystems where different actors (including providers, users and intermediaries) contribute and gain social and economic value in accordance with their specific objectives and incentives.

The overall approach described in this document is based on the identification and documentation of a set of case studies of existing data ecosystems and use cases for developing applications based on data coming from two or more data ecosystems, based on existing operational or experimental applications. Following a literature review on data ecosystem thinking and modelling, a framework consisting of three parts (Annex I) was designed. An ecosystem summary is drawn, giving an overall representation of the ecosystem key aspects. Two additional parts are detailed. One is dedicated to the ecosystem value dynamic, illustrating how the ecosystem is structured through the resources exchanged between stakeholders and the associated value.

Consequently, the ecosystem data flows represent the ecosystem from a complementary and more technical perspective, representing the flows and the data cycles associated to a given scenario. These two parts provide good proxies to evaluate the health and the maturity of a data ecosystem…(More)”.

Measuring Commuting and Economic Activity Inside Cities with Cell Phone Records


Paper by Gabriel Kreindler and Yuhei Miyauchi: “We show how to use commuting flows to infer the spatial distribution of income within a city. A simple workplace choice model predicts a gravity equation for commuting flows whose destination fixed effects correspond to wages. We implement this method with cell phone transaction data from Dhaka and Colombo. Model-predicted income predicts separate income data, at the workplace and residential level, and by skill group. Unlike machine learning approaches, our method does not require training data, yet achieves comparable predictive power. We show that hartals (transportation strikes) in Dhaka reduce commuting more for high model-predicted wage and high-skill commuters….(More)”.
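As a reading aid, here is one standard textbook form of such a gravity equation. It is a sketch under stated assumptions and not necessarily the paper's exact specification. All symbols are assumptions introduced for illustration: \(\pi_{ij}\) is the share of commuters living in \(i\) who work in \(j\), \(w_j\) the wage at destination \(j\), \(d_{ij}\) the travel cost between them, and \(\varepsilon\), \(\tau\) elasticities.

```latex
\[
\pi_{ij}
  = \frac{\left(w_j\, d_{ij}^{-\tau}\right)^{\varepsilon}}
         {\sum_{s}\left(w_s\, d_{is}^{-\tau}\right)^{\varepsilon}}
\quad\Longrightarrow\quad
\log \pi_{ij}
  = \underbrace{\varepsilon \log w_j}_{\psi_j}
  \;-\; \varepsilon\tau \log d_{ij}
  \;+\; \underbrace{\left(-\log \sum_{s}\left(w_s\, d_{is}^{-\tau}\right)^{\varepsilon}\right)}_{\mu_i}
\]
```

In this form the destination fixed effect \(\psi_j\) is proportional to the log wage, so estimating \(\psi_j\) from observed commuting flows recovers relative wages across workplaces, \(\log w_j = \psi_j / \varepsilon\), given a value for \(\varepsilon\).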

The Landscape of Big Data and Gender


Report by Data2X: “This report draws out six observations about trends in big data and gender:

– The current environment: COVID-19 and the global economic recession are stimulating groundbreaking gender research.

– Where we’re progressing, where we’re lagging: Some gendered topics—especially mobility, health, and social norms—are increasingly well-studied through the combination of big data and traditional data. However, worrying gaps remain, especially around the subjects of economic opportunity, human security, and public participation.

– Capturing gender-representative samples using big data continues to be a challenge, but progress is being made.

– Large technology firms generate an immense volume of gender data critical for policymaking, and researchers are finding ways to reuse this data safely.

– Data collaboratives that bring private sector data-holders, researchers, and public policymakers together in a formal, enduring relationship can help big data make a practical difference in the lives of women and girls….(More)”

COVID vaccination studies: plan now to pool data, or be bogged down in confusion


Natalie Dean at Nature: “More and more COVID-19 vaccines are rolling out safely around the world; just last month, the United States authorized one produced by Johnson & Johnson. But there is still much to be learnt. How long does protection last? How much does it vary by age? How well do vaccines work against various circulating variants, and how well will they work against future ones? Do vaccinated people transmit less of the virus?

Answers to these questions will help regulators to set the best policies. Now is the time to make sure that those answers are as reliable as possible, and I worry that we are not laying the essential groundwork. Our current trajectory has us on course for confusion: we must plan ahead to pool data.

Many questions remain after vaccines are approved. Randomized trials generate the best evidence to answer targeted questions, such as how effective booster doses are. But for others, randomized trials will become too difficult as more and more people are vaccinated. To fill in our knowledge gaps, observational studies of the millions of vaccinated people worldwide will be essential….

Perhaps most importantly, we must coordinate now on plans to combine data. We must take measures to counter the long-standing siloed approach to research. Investigators should be discouraged from setting up single-site studies and encouraged to contribute to a larger effort. Funding agencies should favour studies with plans for collaborating or for sharing de-identified individual-level data.

Even when studies do not officially pool data, they should make their designs compatible with others. That means up-front discussions about standardization and data-quality thresholds. Ideally, this will lead to a minimum common set of variables to be collected, which the WHO has already hammered out for COVID-19 clinical outcomes. Categories include clinical severity (such as all infections, symptomatic disease or critical/fatal disease) and patient characteristics, such as comorbidities. This will help researchers to conduct meta-analyses of even narrow subgroups. Efforts are under way to develop reporting guidelines for test-negative studies, but these will be most successful when there is broad engagement.

There are many important questions that will be addressed only by observational studies, and data that can be combined are much more powerful than lone results. We need to plan these studies with as much care and intentionality as we would for randomized trials….(More)”.