Machine Learning and Mobile Phone Data Can Improve the Targeting of Humanitarian Assistance


Paper by Emily Aiken et al: “The COVID-19 pandemic has devastated many low- and middle-income countries (LMICs), causing widespread food insecurity and a sharp decline in living standards. In response to this crisis, governments and humanitarian organizations worldwide have mobilized targeted social assistance programs. Targeting is a central challenge in the administration of these programs: given available data, how does one rapidly identify the individuals and families with the greatest need? This challenge is particularly acute in the large number of LMICs that lack recent and comprehensive data on household income and wealth.

Here we show that non-traditional “big” data from satellites and mobile phone networks can improve the targeting of anti-poverty programs. Our approach uses traditional survey-based measures of consumption and wealth to train machine learning algorithms that recognize patterns of poverty in non-traditional data; the trained algorithms are then used to prioritize aid to the poorest regions and mobile subscribers. We evaluate this approach by studying Novissi, Togo’s flagship emergency cash transfer program, which used these algorithms to determine eligibility for a rural assistance program that disbursed millions of dollars in COVID-19 relief aid. Our analysis compares outcomes – including exclusion errors, total social welfare, and measures of fairness – under different targeting regimes. Relative to the geographic targeting options considered by the Government of Togo at the time, the machine learning approach reduces errors of exclusion by 4-21%. Relative to methods that require a comprehensive social registry (a hypothetical exercise; no such registry exists in Togo), the machine learning approach increases exclusion errors by 9-35%. These results highlight the potential for new data sources to contribute to humanitarian response efforts, particularly in crisis settings when traditional data are missing or out of date….(More)”.
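As a rough illustration of the exclusion-error comparison described above, the sketch below ranks households by a model's predicted consumption, targets the lowest-ranked share, and counts how many of the truly poorest households are missed. The function name `exclusion_error`, the `budget_share` parameter, and the toy numbers are assumptions made for this example; the paper's own definitions and data may differ.

```python
from typing import Sequence


def exclusion_error(true_consumption: Sequence[float],
                    predicted_consumption: Sequence[float],
                    budget_share: float) -> float:
    """Share of the truly poorest households that a targeting rule misses.

    Hypothetical helper: both "truly poor" and "targeted" are defined as the
    k households with the lowest values, where k = budget_share * n.
    """
    n = len(true_consumption)
    if n == 0 or n != len(predicted_consumption):
        raise ValueError("inputs must be non-empty and of equal length")
    k = max(1, round(n * budget_share))
    # Households that should receive aid: the k lowest by surveyed consumption.
    truly_poor = set(sorted(range(n), key=lambda i: true_consumption[i])[:k])
    # Households that would receive aid: the k lowest by model prediction.
    targeted = set(sorted(range(n), key=lambda i: predicted_consumption[i])[:k])
    return len(truly_poor - targeted) / k


# Toy data (invented for illustration): ten households, three transfers.
surveyed = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
predicted = [1.5, 4.5, 2.5, 3.5, 5.0, 6.5, 7.5, 8.5, 9.5, 10.5]

error = exclusion_error(surveyed, predicted, budget_share=0.3)
print(f"exclusion error: {error:.2f}")
```

Running the same function on the scores produced by two targeting rules (for example, a geographic rule and a phone-data model) gives the kind of side-by-side comparison the abstract reports.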

The coloniality of collaboration: sources of epistemic obedience in data-intensive astronomy in Chile


Paper by Sebastián Lehuedé: “Data collaborations have gained currency over the last decade as a means for data- and skills-poor actors to thrive as a fourth paradigm takes hold in the sciences. Against this backdrop, this article traces the emergence of a collaborative subject position that strives to establish reciprocal and technical-oriented collaborations so as to catch up with the ongoing changes in research.

Combining insights from the modernity/coloniality group, political theory and science and technology studies, the article argues that this positionality engenders epistemic obedience by bracketing off critical questions regarding with whom and for whom knowledge is generated. In particular, a dis-embedding of the data producers, the erosion of local ties, and a data conformism are identified as fresh sources of obedience impinging upon the capacity to conduct research attuned to the needs and visions of the local context. A discursive-material analysis of interviews and field notes stemming from the case of astronomy data in Chile is conducted, examining the vision of local actors aiming to gain proximity to the mega observatories producing vast volumes of data in the Atacama Desert.

Given that these observatories are predominantly under the control of organisations from the United States and Europe, the adoption of a collaborative stance is now seen as the best means to ensure skills and technology transfer to local research teams. Delving into the epistemological dimension of data colonialism, this article warns that an increased emphasis on collaboration runs the risk of reproducing planetary hierarchies in times of data-intensive research….(More)”.

Inclusive SDG Data Partnerships


Learning report by Partners for Review (P4R/GIZ), the Danish Institute for Human Rights (DIHR), and the International Civil Society Centre: “It brought together National SDG Units, National Statistics Offices, National Human Rights Institutions and civil society organisations from across six countries. The initiative’s purpose is to advance data partnerships for the SDGs and to strengthen multi-actor data ecosystems at the national level. The goal is to meet the SDG data challenge by improving the use of alternative data sources, particularly data that is produced by civil society and human rights institutions and is complementary to official statistics….(More)”.

Household Financial Transaction Data


Paper by Scott R. Baker & Lorenz Kueng: “The growth of the availability and use of detailed household financial transaction microdata has dramatically expanded the ability of researchers to understand both household decision-making as well as aggregate fluctuations across a wide range of fields. This class of transaction data is derived from a myriad of sources including financial institutions, FinTech apps, and payment intermediaries. We review how these detailed data have been utilized in finance and economics research and the benefits they enable beyond more traditional measures of income, spending, and wealth. We discuss the future potential for this flexible class of data in firm-focused research, real-time policy analysis, and macro statistics….(More)”.

Financial data unbound: The value of open data for individuals and institutions


Paper by McKinsey Global Institute: “As countries around the world look to ensure rapid recovery once the COVID-19 crisis abates, improved financial services are emerging as a key element to boost growth, raise economic efficiency, and lift productivity. Robust digital financial infrastructure proved its worth during the crisis, helping governments cushion people and businesses from the economic shock of the pandemic. The next frontier is to create an open-data ecosystem for finance.

Already, technological, regulatory, and competitive forces are moving markets toward easier and safer financial data sharing. Open-data initiatives are springing up globally, including the United Kingdom’s Open Banking Implementation Entity, the European Union’s second payment services directive, Australia’s new consumer protection laws, Brazil’s drafting of open data guidelines, and Nigeria’s new Open Technology Foundation (Open Banking Nigeria). In the United States, the Consumer Financial Protection Bureau aims to facilitate a consumer-authorized data-sharing market, while the Financial Data Exchange consortium attempts to promote common, interoperable standards for secure access to financial data. Yet, even as many countries put in place stronger digital financial infrastructure and data-sharing mechanisms, COVID-19 has exposed limitations and gaps in their reach, a theme we explored in earlier research.

This discussion paper from the McKinsey Global Institute (download full text in 36-page PDF) looks at the potential value that could be created—and the key issues that will need to be addressed—by the adoption of open data for finance. We focus on four regions: the European Union, India, the United Kingdom, and the United States.

By open data, we mean the ability to share financial data through a digital ecosystem in a manner that requires limited effort or manipulation. Advantages include more accurate credit risk evaluation and risk-based pricing, improved workforce allocation, better product delivery and customer service, and stronger fraud protection.

Our analysis suggests that the boost to the economy from broad adoption of open-data ecosystems could range from about 1 to 1.5 percent of GDP in 2030 in the European Union, the United Kingdom, and the United States, to as much as 4 to 5 percent in India. All market participants benefit, be they institutions or consumers—either individuals or micro-, small-, and medium-sized enterprises (MSMEs)—albeit to varying degrees….(More)”.

How data governance technologies can democratize data sharing for community well-being


Paper by Dan Wu, Stefaan Verhulst, Alex Pentland, Thiago Avila, Kelsey Finch, and Abhishek Gupta in Data & Policy (Cambridge University Press) focusing on “Data sharing efforts to allow underserved groups and organizations to overcome the concentration of power in our data landscape…

A few special organizations, due to their data monopolies and resources, are able to decide which problems to solve and how to solve them. But even though data sharing creates a counterbalancing democratizing force, it must nevertheless be approached cautiously. Underserved organizations and groups must navigate difficult barriers related to technological complexity and legal risk.

To examine what those common barriers are, one type of data sharing effort, data trusts, is examined, specifically the reports commenting on that effort. To address these practical issues, data governance technologies have a large role to play in democratizing data trusts safely and in a trustworthy manner. Yet technology is far from a silver bullet. It is dangerous to rely upon it. But technology that is no-code, flexible, and secure can help more responsibly operate data trusts. This type of technology helps innovators put relationships at the center of their efforts….(More)”.

Charting the ‘Data for Good’ Landscape


Report by Jake Porway at Data.org: “There is huge potential for data science and AI to play a productive role in advancing social impact. However, the field of “data for good” is not only overshadowed by the public conversations about the risks rampant data misuse can pose to civil society, it is also a fractured and disconnected space. There are a myriad of different interpretations of what it means to “use data for good” or “use AI for good”, which creates duplicate efforts, nonstrategic initiatives, and confusion about what a successfully data-driven social sector could look like. To add to that, funding is scarce for a field that requires expensive tools and skills to do well. These enduring challenges result in work being done at an activity and project level, but do not create a coherent set of building blocks to constitute a strong and healthy field that is capable of solving a new class of systems-level problems.

We are taking one tiny step forward in trying to make a more coherent Data for Good space with a landscape that makes clear what various Data for Good initiatives (and AI for Good initiatives) are trying to achieve, how they do it, and what makes them similar or different from one another. One of the major confusion points in talking about “Data for Good” is that it treats all efforts as similar by the mere fact that they use “data” and seek to do something “good”. This term is so broad as to be practically meaningless; as unhelpful as saying “Wood for Good”. We would laugh at a term as vague as “Wood for Good”, which would lump together activities as different as building houses to burning wood in cook stoves to making paper, combining architecture with carpentry, forestry with fuel. However, we are content to say “Data for Good”, and its related phrases “we need to use our data better” or “we need to be data-driven”, when data is arguably even more general than something like wood.

We are trying to bring clarity to the conversation by going beyond mapping organizations into arbitrary groups, to define the dimensions of what it means to do data for good. By creating an ontology for what Data for Good initiatives seek to achieve, in which sector, and by what means, we can gain a better understanding of the underlying fundamentals of using data for good, as well as creating a landscape of what initiatives are doing.

We hope that this landscape of initiatives will help to bring some more nuance and clarity to the field, as well as identify which initiatives are out there and what purpose they serve. Specifically, we hope this landscape will help:

  • Data for Good field practitioners align on a shared language for the outcomes, activities, and aims of the field.
  • Purpose-driven organizations who are interested in applying data and computing to their missions better understand what they might need and who they might go to to get it.
  • Funders make more strategic decisions about funding in the data/AI space based on activities that align with their interests and the amount of funding already devoted to that area.
  • Organizations with Data for Good initiatives can find one another and collaborate based on similarity of mission and activities.

Below you will find a very preliminary landscape map, along with a description of the different kinds of groups in the Data for Good ecosystem and why you might need to engage with them….(More)”.

On regulation for data trusts


Paper by Aline Blankertz and Louisa Specht: “Data trusts are a promising concept for enabling data use while maintaining data privacy. Data trusts can pursue many goals, such as increasing the participation of consumers or other data subjects, putting data protection into practice more effectively, or strengthening data sharing along the value chain. They have the potential to become an alternative model to the large platforms, which are accused of accumulating data power and using it primarily for their own purposes rather than for the benefit of their users. To fulfill these hopes, data trusts must be trustworthy so that their users understand and trust that data is being used in their interest.

It is an important step that policymakers have recognized the potential of data trusts. This should be followed by measures that address specific risks and thus promote trust in the services. Currently, the political approach is to subject all forms of data trusts to the same rules through “one size fits all” regulation. This is the case, for example, with the Data Governance Act (DGA), which gives data trusts little leeway to evolve in the marketplace.

To encourage the development of data trusts, it makes sense to broadly define them as all organizations that manage data on behalf of others while adhering to a legal framework (including competition, trade secrets, and privacy). Which additional rules are necessary to ensure trustworthiness should be decided depending on the use case. The risk of a use case should be considered as well as the need for incentives to act as a data trust.

Risk factors can be identified across sectors; in particular, centralized or decentralized data storage and voluntary or mandatory use of data trusts are among them. The business model is not a main risk factor. Although many regulatory proposals call for strict neutrality, several data trusts without strict neutrality appear trustworthy in terms of monetization or vertical integration. At the same time, it is unclear what incentives exist for developing strictly neutral data trusts. Neutrality requirements that go beyond what is necessary make it less likely that desired alternative models will develop and take hold….(More)”.

Lessons learned from telco data informing COVID-19 responses: toward an early warning system for future pandemics?


Introduction to a special issue of Data and Policy (Open Access) by Richard Benjamins, Jeanine Vos, and Stefaan Verhulst: “More than a year into the COVID-19 pandemic, the damage is still unfolding. While some countries have recently managed to gain an element of control through aggressive vaccine campaigns, much of the developing world, South and Southeast Asia in particular, remains in a state of crisis. Given what we now know about the global nature of this disease and the potential for mutant versions to develop and spread, a crisis anywhere is cause for concern everywhere. The world remains very much in the grip of this public health crisis.

From the beginning, there has been hope that data and technology could offer solutions to help inform governments’ response strategies and decision-making. Many of the expectations have been focused on mobile data analytics in particular, whereby mobile network operators create mobility insights and decision-support tools generated from anonymized and aggregated telco data. This stems both from a growing group of mobile network operators having invested significantly in systems and capabilities to develop such products and services for public and private sector customers, and from the value of those products having been demonstrated in addressing different global challenges, ranging from models to better understand the spread of Zika in Brazil to interactive dashboards to aid emergency services during earthquakes and floods in Japan. Yet despite these experiences, many governments across the world still have limited awareness, capabilities and resources to leverage these tools in their efforts to limit the spread of COVID-19 using non-pharmaceutical interventions (NPI), from both a medical and an economic point of view.

Today, we release the first batch of papers of a special collection of Data & Policy that examines both the potential of mobile data, as well as the challenges faced in delivering these tools to inform government decision-making. Consisting of 5 papers from 33 researchers and experts from academia, industry and government, the articles cover a wide range of geographies, including Europe, Argentina, Brazil, Ecuador, France, Gambia, Germany, Ghana, Austria, Belgium, and Spain. Responding to our call for case studies to illustrate the opportunities (and challenges) offered by mobile big data in the fight against COVID-19, the authors of these papers describe a number of examples of how mobile and mobile-related data have been used to address the medical, economic, socio-cultural and political aspects of the pandemic….(More)”.

Using big data for insights into the gender digital divide for girls: A discussion paper


UNICEF paper: “This discussion paper describes the findings of a study that used big data as an alternative data source to understand the gender digital divide for under-18s. It describes 6 key insights gained from analysing big data from Facebook and Instagram platforms, and discusses how big data can be further used to contribute to the body of evidence for the gender digital divide for adolescent girls….(More)”.