The 100 Questions Initiative: Sourcing 100 questions on key societal challenges that can be answered by data insights

Press Release: “The Governance Lab at the NYU Tandon School of Engineering announced the launch of the 100 Questions Initiative — an effort to identify the most important societal questions whose answers can be found in data and data science if the power of data collaboratives is harnessed.

The initiative, launched with initial support from Schmidt Futures, seeks to address challenges on numerous topics, including migration, climate change, poverty, and the future of work.

For each of these areas and more, the initiative will seek to identify questions that could help unlock the potential of data and data science with the broader goal of fostering positive social, environmental, and economic transformation. These questions will be sourced by leveraging “bilinguals” — practitioners across disciplines from all over the world who possess both domain knowledge and data science expertise.

The 100 Questions Initiative starts by identifying 10 key questions related to migration. These include questions related to the geographies of migration, migrant well-being, enforcement and security, and the vulnerabilities of displaced people. This inaugural effort involves partnerships with the International Organization for Migration (IOM) and the European Commission, both of which will provide subject-matter expertise and facilitation support within the framework of the Big Data for Migration Alliance (BD4M).

“While there have been tremendous efforts to gather and analyze data relevant to many of the world’s most pressing challenges, as a society, we have not taken the time to ensure we’re asking the right questions to unlock the true potential of data to help address these challenges,” said Stefaan Verhulst, co-founder and chief research and development officer of The GovLab. “Unlike other efforts focused on data supply or data science expertise, this project seeks to radically improve the set of questions that, if answered, could transform the way we solve 21st century problems.”

In addition to identifying key questions, the 100 Questions Initiative will also focus on creating new data collaboratives. Data collaboratives are an emerging form of public-private partnership that help unlock the public interest value of previously siloed data. The GovLab has conducted significant research in the value of data collaboration, identifying that inter-sectoral collaboration can both increase access to information (e.g., the vast stores of data held by private companies) as well as unleash the potential of that information to serve the public good….(More)”.

Data Protection and Digital Agency for Refugees

Paper by Dragana Kaurin: “For the millions of refugees fleeing conflict and persecution every year, access to information about their rights and control over their personal data are crucial for their ability to assess risk and navigate the asylum process. While asylum seekers are required to provide significant amounts of personal information on their journey to safety, they are rarely fully informed of their data rights by UN agencies or local border control and law enforcement staff tasked with obtaining and processing their personal information. Despite recent improvements in data protection mechanisms in the European Union, refugees’ informed consent for the collection and use of their personal data is rarely sought. Using examples drawn from interviews with refugees who have arrived in Europe since 2013, and an analysis of the impacts of the 2016 EU-Turkey deal on migration, this paper analyzes how the vast amount of data collected from refugees is gathered, stored and shared today, and considers the additional risks this collection process poses to an already vulnerable population navigating a perilous information-decision gap….(More)”.

Opportunities and Challenges of Emerging Technologies for the Refugee System

Research Paper by Roya Pakzad: “Efforts are being made to use information and communications technologies (ICTs) to improve accountability in providing refugee aid. However, there remains a pressing need for increased accountability and transparency when designing and deploying humanitarian technologies. This paper outlines the challenges and opportunities of emerging technologies, such as machine learning and blockchain, in the refugee system.

The paper concludes by recommending the creation of quantifiable metrics for sharing information across both public and private initiatives; the creation of the equivalent of a “Hippocratic oath” for technologists working in the humanitarian field; the development of predictive early-warning systems for human rights abuses; and greater accountability among funders and technologists to ensure the sustainability and real-world value of humanitarian apps and other digital platforms….(More)”

How Technology Could Revolutionize Refugee Resettlement

Krishnadev Calamur in The Atlantic: “… For nearly 70 years, the process of interviewing, allocating, and accepting refugees has gone largely unchanged. In 1951, 145 countries came together in Geneva, Switzerland, to sign the Refugee Convention, the pact that defines who is a refugee, what refugees’ rights are, and what legal obligations states have to protect them.

This process was born of the idealism of the postwar years—an attempt to make certain that those fleeing war or persecution could find safety so that horrific moments in history, such as the Holocaust, didn’t recur. The pact may have been far from perfect, but in successive years, it was a lifeline to Afghans, Bosnians, Kurds, and others displaced by conflict.

The world is a much different place now, though. The rise of populism has brought with it a concomitant hostility toward immigrants in general and refugees in particular. Last October, a gunman who had previously posted anti-Semitic messages online against HIAS killed 11 worshippers in a Pittsburgh synagogue. Many of the policy arguments over resettlement have shifted focus from humanitarian relief to security threats and cost. The Trump administration has drastically cut the number of refugees the United States accepts, and large parts of Europe are following suit.

If it works, Annie could change that dynamic. Developed at Worcester Polytechnic Institute in Massachusetts, Lund University in Sweden, and the University of Oxford in Britain, the software uses what’s known as a matching algorithm to allocate refugees with no ties to the United States to their new homes. (Refugees with ties to the United States are resettled in places where they have family or community support; software isn’t involved in the process.)

Annie’s algorithm is based on a machine learning model in which a computer is fed huge piles of data from past placements, so that the program can refine its future recommendations. The system examines a series of variables—physical ailments, age, levels of education and languages spoken, for example—related to each refugee case. In other words, the software uses previous outcomes and current constraints to recommend where a refugee is most likely to succeed. Every city where HIAS has an office or an affiliate is given a score for each refugee. The higher the score, the better the match.

This is a drastic departure from how refugees are typically resettled. Each week, HIAS and the eight other agencies that allocate refugees in the United States make their decisions based largely on local capacity, with limited emphasis on individual characteristics or needs….(More)”.
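The score-per-city idea described for Annie can be illustrated with a small sketch. This is not Annie's actual algorithm, which the excerpt does not specify beyond "matching algorithm"; it is a simple greedy allocation that assumes the scores have already been produced by some model and that each city has a fixed number of open slots. The case names, city names, scores, and capacities below are all hypothetical.

```python
from typing import Dict


def allocate(scores: Dict[str, Dict[str, float]],
             capacity: Dict[str, int]) -> Dict[str, str]:
    """Greedy allocation: walk (case, city) pairs from highest to lowest
    score and place a case whenever it is still unplaced and the city
    still has a free slot. Illustrative only; not the method used by Annie."""
    pairs = sorted(
        ((score, case, city)
         for case, row in scores.items()
         for city, score in row.items()),
        key=lambda t: (-t[0], t[1], t[2]),  # descending score, then stable tie-break
    )
    remaining = dict(capacity)   # copy so the caller's capacities are not mutated
    placement: Dict[str, str] = {}
    for _score, case, city in pairs:
        if case in placement or remaining.get(city, 0) <= 0:
            continue
        placement[case] = city
        remaining[city] -= 1
    return placement


# Hypothetical scores: higher means a better predicted match.
scores = {
    "case_a": {"city_x": 0.9, "city_y": 0.6},
    "case_b": {"city_x": 0.8, "city_y": 0.3},
    "case_c": {"city_x": 0.4, "city_y": 0.7},
}
capacity = {"city_x": 1, "city_y": 2}

placement = allocate(scores, capacity)
print(placement)
```

In this toy example the greedy rule places case_a in city_x and the other two cases in city_y, for a total score of 1.9, although placing case_b in city_x would reach 2.1. That gap is why capacity-constrained placement is usually treated as an optimization problem rather than solved one case at a time.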

Seven design principles for using blockchain for social impact

Stefaan Verhulst at Apolitical: “2018 will probably be remembered as the bust of the blockchain hype. Yet even as crypto currencies continue to sink in value and popular interest, the potential of using blockchain technologies to achieve social ends remains important to consider but poorly understood.

In 2019, business will continue to explore blockchain for sectors as disparate as finance, agriculture, logistics and healthcare. Policymakers and social innovators should also leverage 2019 to become more sophisticated about blockchain’s real promise, limitations and current practice.

In a recent report I prepared with Andrew Young, with the support of the Rockefeller Foundation, we looked at the potential risks and challenges of using blockchain for social change, or “Blockchange.” A number of implementations and platforms are already demonstrating potential social impact.

The technology is now being used to address issues as varied as homelessness in New York City, the Rohingya crisis in Myanmar and government corruption around the world.

In an illustration of the breadth of current experimentation, Stanford’s Center for Social Innovation recently analysed and mapped nearly 200 organisations and projects trying to create positive social change using blockchain. Likewise, the GovLab is developing a mapping of blockchange implementations across regions and topic areas; it currently contains 60 entries.

All these examples provide impressive — and hopeful — proof of concept. Yet despite the very clear potential of blockchain, there has been little systematic analysis. For what types of social impact is it best suited? Under what conditions is it most likely to lead to real social change? What challenges does blockchain face, what risks does it pose and how should these be confronted and mitigated?

These are just some of the questions our report, which builds its analysis on 10 case studies assembled through original research, seeks to address.

While the report is focused on identity management, it contains a number of lessons and insights that are applicable more generally to the subject of blockchange.

In particular, it contains seven design principles that can guide individuals or organisations considering the use of blockchain for social impact. We call these the Genesis principles, and they are outlined at the end of this article…(More)”.

NHS Pulls Out Of Data-Sharing Deal With Home Office Immigration Enforcers

Jasmin Gray at Huffington Post: “The NHS has pulled out of a controversial data-sharing arrangement with the Home Office which saw confidential patients’ details passed on to immigration enforcers.

In May, the government suspended the ‘memorandum of understanding’ agreement between the health service and the Home Office after MPs, doctors and health charities warned it was leaving seriously ill migrants too afraid to seek medical treatment. 

But on Tuesday, NHS Digital announced that it was cutting itself out of the agreement altogether. 

“NHS Digital has received a revised narrowed request from the Home Office and is discussing this request with them,” a spokesperson for the data-branch of the health service said, adding that they have “formally closed-out our participation” in the previous memorandum of understanding. 

The anxieties of “multiple stakeholder communities” to ensure the agreement made by the government was respected were taken into account in the decision, they added.

Meanwhile, the Home Office confirmed it was working to agree a new deal with NHS Digital which would only allow it to make requests for data about migrants “facing deportation action because they have committed serious crimes, or where information necessary to protect someone’s welfare”. 

The move has been welcomed by campaigners, with Migrants’ Rights Network director Rita Chadha saying that many migrants had missed out on “the right to privacy and access to healthcare” because of the data-sharing mechanism….(More)”.

DNA databases are too white. This man aims to fix that.

Interview of Carlos D. Bustamante by David Rotman: “In the 15 years since the Human Genome Project first exposed our DNA blueprint, vast amounts of genetic data have been collected from millions of people in many different parts of the world. Carlos D. Bustamante’s job is to search that genetic data for clues to everything from ancient history and human migration patterns to the reasons people with different ancestries are so varied in their response to common diseases.

Bustamante’s career has roughly spanned the period since the Human Genome Project was completed. A professor of genetics and biomedical data science at Stanford and 2010 winner of a MacArthur genius award, he has helped to tease out the complex genetic variation across different populations. These variants mean that the causes of diseases can vary greatly between groups. Part of the motivation for Bustamante, who was born in Venezuela and moved to the US when he was seven, is to use those insights to lessen the medical disparities that still plague us.

But while it’s an area ripe with potential for improving medicine, it’s also fraught with controversies over how to interpret genetic differences between human populations. In an era still obsessed with race and ethnicity—and marred by the frequent misuse of science in defining the characteristics of different groups—Bustamante remains undaunted in searching for the nuanced genetic differences that these groups display.

Perhaps his optimism is due to his personality—few sentences go by without a “fantastic” or “extraordinarily exciting.” But it is also his recognition as a population geneticist of the incredible opportunity that understanding differences in human genomes presents for improving health and fighting disease.

David Rotman, MIT Technology Review’s editor at large, discussed with Bustamante why it’s so important to include more people in genetic studies and understand the genetics of different populations.

How good are we at making sure that the genomic data we’re collecting is inclusive?

I’m optimistic, but it’s not there yet.

In our 2011 paper, the statistic we had was that more than 96% of participants in genome-wide association studies were of European descent. In the follow-up in 2016, the number went from 96% to around 80%. So that’s getting better. Unfortunately, or perhaps fortunately, a lot of that is due to the entry of China into genetics. A lot of that was due to large-scale studies in Chinese and East Asian populations. Hispanics, for example, make up less than 1% of genome-wide association studies. So we need to do better. Ultimately, we want precision medicine to benefit everybody.

Aside from a fairness issue, why is diversity in genomic data important? What do we miss without it?

First of all, it has nothing to do with political correctness. It has everything to do with human biology and the fact that human populations and the great diaspora of human migrations have left their mark on the human genome. The genetic underpinnings of health and disease have shared components across human populations and things that are unique to different populations….(More)”.

How data helped visualize the family separation crisis

Chava Gourarie at StoryBench: “Early this summer, at the height of the family separation crisis – where children were being forcibly separated from their parents at our nation’s border – a team of scholars pooled their skills to address the issue. The group of researchers – from a variety of humanities departments at multiple universities – spent a week of non-stop work mapping the immigration detention network that spans the United States. They named the project “Torn Apart/Separados” and published it online, to support the efforts of locating and reuniting the separated children with their parents.

The project utilizes the methods of the digital humanities, an emerging discipline that applies computational tools to fields within the humanities, like literature and history. It was led by members of Columbia University’s Group for Experimental Methods in the Humanities, which had previously used methods such as rapid deployment to respond to natural disasters.

The group has since expanded the project, publishing a second volume that focuses on the $5 billion immigration industry, based largely on public data about companies that contract with the Immigration and Customs Enforcement agency. The visualizations highlight the astounding growth in investment in ICE infrastructure (from $475 million in 2014 to $5.1 billion in 2018), as well as who benefits from these contracts, and how the money is spent.

Storybench spoke with Columbia University’s Alex Gil, who worked on both phases of the project, about the process of building “Torn Apart/Separados,” about the design and messaging choices that were made and the ways in which methods of the digital humanities can cross-pollinate with those of journalism…(More)”.

Information Asymmetries, Blockchain Technologies, and Social Change

Reflections by Stefaan Verhulst on “the potential (and challenges) of Distributed Ledgers for “Market for Lemons” Conditions: We live in a data age, and it has become common to extol the transformative power of data and information. It is now conventional to assume that many of our most pressing public problems—everything from climate change to terrorism to mass migration—are amenable to a “data fix.”

The truth, though, is a little more complicated. While there is no doubt that data—when analyzed and used responsibly—holds tremendous potential, many factors affect whether, and to what extent, that potential will ultimately be fulfilled.

Our ability to address complex public problems using data depends vitally on how our respective data ecosystems are designed (as well as ongoing questions of representation in, power over, and stewardship of these ecosystems).

Flaws in our data ecosystem that prevent us from addressing problems, and that may also be responsible for many societal failures and inequalities, result from the fact that:

  • some actors have better access to data than others;
  • data is of poor quality (or even “fake”); contains implicit bias; and/or is not validated and thus not trusted;
  • only easily accessible data are shared and integrated (“open washing”) while important data remain carefully hidden or without resources for relevant research and analysis; and more generally that
  • even in an era of big and open data, information too often remains stove-piped, siloed, and generally difficult to access.

Several observers have pointed to the relationship between these information asymmetries and, for example, corruption, financial exclusion, global pandemics, forced mass migration, human rights abuses, and electoral fraud.

Consider the transaction costs, power inequities and other obstacles that result from such information asymmetries, namely:

–     At the individual level: too often someone who is trying to open a bank account (or sign up for new cell phone service) is unable to provide all the requisite information, such as credit history, proof of address or other confirmatory and trusted attributes of identity. As such, information asymmetries are in effect limiting this individual’s access to financial and communications services.

–     At the corporate level, a vast body of literature in economics has shown how uncertainty over the quality and trustworthiness of data can impose transaction costs, limit the development of markets for goods and services, or shut them down altogether. This is the well-known “market for lemons” problem made famous in a 1970 paper of the same name by George Akerlof.

–     At the societal or governance level, information asymmetries don’t just affect the efficiency of markets or social inequality. They can also incentivize unwanted behaviors that cause substantial public harm. Tyrants and corrupt politicians thrive on limiting their citizens’ access to information (e.g., information related to bank accounts, investment patterns or disbursement of public funds). Likewise, criminals operate and succeed in the information-scarce corners of the underground economy.

Blockchain technologies and Information Asymmetries

This is where blockchain comes in. At their core, blockchain technologies are a new type of disclosure mechanism that has the potential to address some of the information asymmetries listed above. There are many types of blockchain technologies, and while I use the blanket term ‘blockchain’ below for simplicity’s sake, the nuances between different types of blockchain technologies can greatly impact the character and likelihood of success of a given initiative.

By leveraging a shared and verified database of ledgers stored in a distributed manner, blockchain seeks to redesign information ecosystems in a more transparent, immutable, and trusted manner. Solving information asymmetries may be the real potential of blockchain, and this—much more than the current hype over virtual currencies—is the real reason to assess its potential….(More)”.
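The "immutable" property mentioned above can be illustrated with a minimal hash-chained ledger. This sketch covers only tamper evidence on a single machine; it omits the distributed storage and consensus that real blockchain systems rely on, and it does not reflect any particular platform. The function names and the sample records are hypothetical.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, data, prev_hash):
    """SHA-256 over a canonical JSON encoding of the block's contents."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(chain, data):
    """Add a block whose hash also commits to the previous block's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV
    index = len(chain)
    chain.append({
        "index": index,
        "data": data,
        "prev_hash": prev_hash,
        "hash": block_hash(index, data, prev_hash),
    })


def is_valid(chain):
    """Recompute every hash and check that each block links to its predecessor."""
    prev = GENESIS_PREV
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev_hash"] != prev:
            return False
        if block["hash"] != block_hash(block["index"], block["data"], block["prev_hash"]):
            return False
        prev = block["hash"]
    return True


ledger = []
append(ledger, {"record": "funds disbursed", "amount": 100})
append(ledger, {"record": "funds received", "amount": 100})

ok_before = is_valid(ledger)        # chain is intact
ledger[0]["data"]["amount"] = 999   # someone edits an earlier record
ok_after = is_valid(ledger)         # the stored hash no longer matches

print(ok_before, ok_after)
```

Because each block's hash covers the previous block's hash, changing any earlier record invalidates the check unless every later block is recomputed as well. In a distributed setting, other participants holding copies of the ledger would notice such a rewrite.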

Migration Data using Social Media

European Commission JRC Technical Report: “Migration is a top political priority for the European Union (EU). Data on international migrant stocks and flows are essential for effective migration management. In this report, we estimated the number of expatriates in 17 EU countries based on the number of Facebook Network users who are classified by Facebook as “expats”. To this end, we proposed a method for correcting the over- or under-representativeness of Facebook Network users compared to countries’ actual population.

This method uses Facebook penetration rates by age group and gender in the country of previous residence and country of destination of a Facebook expat. The purpose of Facebook Network expat estimations is not to reproduce migration statistics, but rather to generate separate estimates of expatriates, since migration statistics and Facebook Network expats estimates do not measure the same quantities of interest.

Estimates of social media application users who are classified as expats can be a timely, low-cost, and almost globally available source of information for estimating stocks of international migrants. Our methodology allowed for the timely capture of the increase of Venezuelan migrants in Spain. However, there are important methodological and data integrity issues with using social media data sources for studying migration-related phenomena. For example, our methodology led us to significantly overestimate the number of expats from the Philippines in Spain and in Italy and there is no evidence that this overestimation may be valid. While research on the use of big data sources for migration is in its infancy, and the diffusion of internet technologies in less developed countries is still limited, the use of big data sources can unveil useful insights on quantitative and qualitative characteristics of migration….(More)”.
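The general idea of correcting for Facebook's uneven reach across age groups and genders can be sketched as follows. The report's exact correction formula is not reproduced in the excerpt, so this sketch assumes the simplest possible version: dividing each stratum's Facebook expat count by a single penetration rate for that stratum. How the report combines rates from the country of previous residence and the country of destination is not modeled here, and every number below is invented for illustration.

```python
def corrected_expats(fb_expats, penetration):
    """Scale each (age group, gender) count by the inverse of its assumed
    Facebook penetration rate and sum the results.

    fb_expats:   {(age_group, gender): users classified as expats}
    penetration: {(age_group, gender): share of that population on Facebook}
    """
    total = 0.0
    for stratum, count in fb_expats.items():
        rate = penetration[stratum]
        if not 0 < rate <= 1:
            raise ValueError(f"penetration rate for {stratum} must be in (0, 1]")
        total += count / rate
    return total


# Hypothetical counts and rates for one origin-destination pair.
fb_expats = {
    ("18-29", "f"): 1200,
    ("18-29", "m"): 1000,
    ("30-49", "f"): 800,
    ("30-49", "m"): 900,
}
penetration = {
    ("18-29", "f"): 0.8,
    ("18-29", "m"): 0.5,
    ("30-49", "f"): 0.4,
    ("30-49", "m"): 0.6,
}

estimate = corrected_expats(fb_expats, penetration)
print(round(estimate))
```

With these made-up inputs the 3,900 observed Facebook expats scale up to roughly 7,000. The example also shows how sensitive the estimate is to the assumed rates: a stratum with low penetration is multiplied heavily, which is one way an overestimate like the one reported for the Philippines could arise.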