How Data Can Map and Make Racial Inequality More Visible (If Done Responsibly)


Reflection Document by The GovLab: “Racism is a systemic issue that pervades every aspect of life in the United States and around the world. In recent months, its corrosive influence has been made starkly visible, especially on Black people. Many people are hurting. Their rage and suffering stem from centuries of exclusion and from being subject to repeated bias and violence. Across the country, there have been protests decrying racial injustice. Activists have called upon the government to condemn bigotry and racism, to act against injustice, to address systemic and growing inequality.

Institutions need to take meaningful action to address such demands. Though racism is not experienced in the same way by all communities of color, policymakers must respond to the anxieties and apprehensions of Black people as well as those of communities of color more generally. This work will require institutions and individuals to reflect on how they may be complicit in perpetuating structural and systematic inequalities and harm, and to ask better questions about the inequities that exist in society (laid bare in both recent acts of violence and in racial disadvantages in health outcomes during the ongoing COVID-19 crisis). This work is necessary but unlikely to be easy. As Rashida Richardson, Director of Policy Research at the AI Now Institute at NYU, notes:

“Social and political stratifications also persist and worsen because they are embedded into our social and legal systems and structures. Thus, it is difficult for most people to see and understand how bias and inequalities have been automated or operationalized over time.”

We believe progress can be made, at least in part, through responsible data access and analysis, including increased availability of (disaggregated) data through data collaboration. Of course, data is only one part of the overall picture, and we make no claims that data alone can solve such deeply entrenched problems. Nonetheless, data can have an impact by making inequalities resulting from racism more quantifiable and inaction less excusable.

…Prioritizing any of these topics will also require increased community engagement and participatory agenda setting. Likewise, we are deeply conscious that data can have a negative as well as positive impact and that technology can perpetuate racism when designed and implemented without the input and participation of minority communities and organizations. While our report here focuses on the promise of data, we need to remain aware of the potential to weaponize data against vulnerable and already disenfranchised communities. In addition, (hidden) biases in data collected and used in AI algorithms, as well as in a host of other areas across the data life cycle, will only exacerbate racial inequalities if not addressed….(More)”

ALSO: The piece is supplemented by a crowdsourced listing of Data-Driven Efforts to Address Racial Inequality.
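
A quick way to see the excerpt's central point, that disaggregated data makes inequality quantifiable, is a minimal sketch in pandas. The records below are hypothetical and purely illustrative (the field names and numbers are assumptions, not drawn from the piece); the point is only that an aggregate figure can hide a gap that disaggregation reveals.

```python
import pandas as pd

# Hypothetical, illustrative records (not real data): each row is one
# application with the applicant's group label and the outcome.
df = pd.DataFrame({
    "race":     ["Black", "Black", "Black", "Black",
                 "White", "White", "White", "White"],
    "approved": [0, 1, 0, 1, 1, 1, 0, 1],
})

# A single aggregate figure can mask a disparity...
print(f"Overall approval rate: {df['approved'].mean():.0%}")  # 62%

# ...that disaggregating by group makes visible and quantifiable.
print(df.groupby("race")["approved"].mean())  # Black 0.50, White 0.75
```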

Tribalism Comes for Pandemic Science



Yuval Levin at The New Atlantis: “The Covid-19 pandemic has tested our society in countless ways. From the health system to the school system, the economy, government, and family life, we have confronted some enormous and unfamiliar challenges. But many of these stresses are united by the need to constantly adapt to new information and evidence and accept that any knowledge we might have is only provisional. This demands a kind of humble restraint — on the part of public health experts, political leaders, and the public at large — that our society now finds very hard to muster.

The virus is novel, so our understanding of what responding to it might require of us has had to be built on the fly. But the polarized culture war that pervades so much of our national life has made this kind of learning very difficult. Views developed in response to provisional assessments of incomplete evidence quickly rigidify as they are transformed into tribal markers and then cultural weapons. Soon there are left-wing and right-wing views on whether to wear masks, whether particular drugs are effective, or how to think about social distancing.

New evidence is taken as an assault on these tribal commitments, and policy adjustments in response are seen as forms of surrender to the enemy. Every new piece of information gets filtered through partisan sieves, implicitly examined to see whose interest it serves, and then embraced or rejected on that basis. We all do this. You’re probably doing it right now — skimming quickly to the end of this piece to see if I’m criticizing you or only those other people who behave so irresponsibly….(More)”.

Race After Technology: Abolitionist Tools for the New Jim Code


Book by Ruha Benjamin: “From everyday apps to complex algorithms, Ruha Benjamin cuts through tech-industry hype to understand how emerging technologies can reinforce White supremacy and deepen social inequity.

Benjamin argues that automation, far from being a sinister story of racist programmers scheming on the dark web, has the potential to hide, speed up, and deepen discrimination while appearing neutral and even benevolent when compared to the racism of a previous era. Presenting the concept of the “New Jim Code,” she shows how a range of discriminatory designs encode inequity by explicitly amplifying racial hierarchies; by ignoring but thereby replicating social divisions; or by aiming to fix racial bias but ultimately doing quite the opposite. Moreover, she makes a compelling case for race itself as a kind of technology, designed to stratify and sanctify social injustice in the architecture of everyday life.

This illuminating guide provides conceptual tools for decoding tech promises with sociologically informed skepticism. In doing so, it challenges us to question not only the technologies we are sold but also the ones we ourselves manufacture….(More)”.

Centering Racial Equity Throughout Data Integration


Toolkit by AISP: “Societal “progress” is often marked by the construction of new infrastructure that fuels change and innovation. Just as railroads and interstate highways were the defining infrastructure projects of the 1800s and 1900s, the development of data infrastructure is a critical innovation of our century. Railroads and highways were drivers of development and prosperity for some investors and sites. Yet other individuals and communities were harmed, displaced, bypassed, ignored, and forgotten by those efforts.

At this moment in our history, we can co-create data infrastructure to promote racial equity and the public good, or we can invest in data infrastructure that disregards the historical, social, and political context—reinforcing racial inequity that continues to harm communities. Building data infrastructure without a racial equity lens and understanding of historical context will exacerbate existing inequalities along the lines of race, gender, class, and ability. Instead, we commit to contextualize our work in the historical and structural oppression that shapes it, and organize stakeholders across geography, sector, and experience to center racial equity throughout data integration….(More)”.

How Crowdsourcing Aided a Push to Preserve the Histories of Nazi Victims


Andrew Curry at the New York Times: “With people around the globe sheltering at home amid the pandemic, an archive of records documenting Nazi atrocities asked for help indexing them. Thousands joined the effort….

As the virus prompted lockdowns across Europe, the director of the Arolsen Archives — the world’s largest devoted to the victims of Nazi persecution — joined millions of others working remotely from home and spending lots more time in front of her computer.

“We thought, ‘Here’s an opportunity,’” said the director, Floriane Azoulay.

Two months later, the archive’s “Every Name Counts” project has attracted thousands of online volunteers to work as amateur archivists, indexing names from the archive’s enormous collection of papers. To date, they have added over 120,000 names, birth dates and prisoner numbers to the database.

“There’s been much more interest than we expected,” Ms. Azoulay said. “The fact that people were locked at home and so many cultural offerings have moved online has played a big role.”

It’s a big job: The Arolsen Archives are the largest collection of their kind in the world, with more than 30 million original documents. They contain information on the wartime experiences of as many as 40 million people, including Jews executed in extermination camps and forced laborers conscripted from across Nazi-occupied Europe.

The documents, which take up 16 miles of shelving, include things like train manifests, delousing records, work detail assignments and execution records…(More)”.

Digital contact tracing and surveillance during COVID-19


Report on General and Child-specific Ethical Issues by Gabrielle Berman, Karen Carter, Manuel García-Herranz and Vedran Sekara: “The last few years have seen a proliferation of means and approaches being used to collect sensitive or identifiable data on children. Technologies such as facial recognition and other biometrics, increased processing capacity for ‘big data’ analysis and data linkage, and the roll-out of mobile and internet services and access have substantially changed the nature of data collection, analysis, and use.

Real-time data are essential to support decision-makers in government, development and humanitarian agencies such as UNICEF to better understand the issues facing children, plan appropriate action, monitor progress and ensure that no one is left behind. But the collation and use of personally identifiable data may also pose significant risks to children’s rights.

UNICEF has undertaken substantial work to provide a foundation to understand and balance the potential benefits and risks to children of data collection. This work includes the Industry Toolkit on Children’s Online Privacy and Freedom of Expression and a partnership with GovLab on Responsible Data for Children (RD4C) – which promotes good practice principles and has developed practical tools to assist field offices, partners and governments to make responsible data management decisions.

Balancing the need to collect data to support good decision-making versus the need to protect children from harm created through the collection of the data has never been more challenging than in the context of the global COVID-19 pandemic. The response to the pandemic has seen an unprecedented rapid scaling up of technologies to support digital contact tracing and surveillance. The initial approach has included:

  • tracking using mobile phones and other digital devices (tablet computers, the Internet of Things, etc.)
  • surveillance to support movement restrictions, including through the use of location monitoring and facial recognition
  • a shift from in-person service provision and routine data collection to the use of remote or online platforms (including new processes for identity verification)
  • an increased focus on big data analysis and predictive modelling to fill data gaps…(More)”.
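
For readers unfamiliar with how the first of these, mobile-phone-based tracing, typically works, here is a toy sketch of the decentralized, token-based pattern adopted by several exposure-notification designs. Everything here is illustrative and simplified (no cryptographic key rotation, no radio layer); the report itself does not prescribe any particular design.

```python
import secrets

# Toy model of decentralized, token-based contact tracing: devices exchange
# rotating random tokens during encounters; identities and locations never
# leave the phone. All details are illustrative assumptions.
class Device:
    def __init__(self, owner: str):
        self.owner = owner
        self.my_tokens = []        # pseudonymous tokens this device broadcast
        self.heard_tokens = set()  # tokens observed from nearby devices

    def new_token(self) -> str:
        """Generate a fresh random token to broadcast."""
        token = secrets.token_hex(8)
        self.my_tokens.append(token)
        return token

    def near(self, other: "Device") -> None:
        """Record each other's current tokens during a proximity encounter."""
        self.heard_tokens.add(other.new_token())
        other.heard_tokens.add(self.new_token())

alice, bob, carol = Device("alice"), Device("bob"), Device("carol")
alice.near(bob)  # Alice and Bob meet; Carol meets no one

# If Bob tests positive, only his random tokens are published.
published = set(bob.my_tokens)
for device in (alice, carol):
    exposed = bool(device.heard_tokens & published)
    print(device.owner, "exposed:", exposed)  # alice True, carol False
```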

An introduction to human rights for the mobile sector


Report by the GSMA: “Human rights risks are present throughout mobile operators’ value chains. These range from the treatment and conditions of people working in the supply chain to how operators’ own employees are treated and how the human rights of customers are respected online.

This summary provides a high-level introduction to the most salient human rights issues for mobile operators. The aim is to explain why the issues are relevant for operators and share initial practical guidance for companies beginning to focus on and respond to human rights issues….(More)”.

How Humanitarian Blockchain Can Deliver Fair Labor to Global Supply Chains


Paper by Ashley Mehra and John G. Dale: “Blockchain technology in global supply chains has proven most useful as a tool for storing and keeping records of information or facilitating payments with increased efficiency. The use of blockchain to improve supply chains for humanitarian projects has mushroomed over the last five years; this increased popularity is in large part due to the potential for transparency and security that the design of the technology proposes to offer. Yet, we want to ask an important but largely unexplored question in the academic literature about the human rights of the workers who produce these “humanitarian blockchain” solutions: “How can blockchain help eliminate extensive labor exploitation issues embedded within our global supply chains?”

To begin to answer this question, we suggest that proposed humanitarian blockchain solutions must (1) re-purpose the technical affordances of blockchain to address relations of power that, sometimes unwittingly, exploit and prevent workers from collectively exercising their voice; (2) include legally or socially enforceable mechanisms that enable workers to meaningfully voice their knowledge of working conditions without fear of retaliation; and (3) re-frame our current understanding of human rights issues in the context of supply chains to include the labor exploitation within supply chains that produce and sustain the blockchain itself….(More)”.
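
As background to the record-keeping affordance the paper references, here is a minimal sketch of a hash-chained ledger, the basic pattern that gives blockchain records their tamper-evidence. It is a toy illustration under simplifying assumptions (a single party, no distributed consensus or network), and the audit-record fields are hypothetical.

```python
import hashlib
import json

def record_hash(record: dict) -> str:
    """Deterministic SHA-256 hash of a record's canonical JSON form."""
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()

def append_record(chain: list, payload: dict) -> None:
    """Append a record that commits to the hash of the previous record."""
    prev = chain[-1]["hash"] if chain else "0" * 64
    record = {"payload": payload, "prev_hash": prev}
    record["hash"] = record_hash({"payload": payload, "prev_hash": prev})
    chain.append(record)

def verify(chain: list) -> bool:
    """Any retroactive edit to an earlier record breaks every later link."""
    prev = "0" * 64
    for rec in chain:
        if rec["prev_hash"] != prev:
            return False
        if rec["hash"] != record_hash({"payload": rec["payload"], "prev_hash": prev}):
            return False
        prev = rec["hash"]
    return True

# Hypothetical labor-audit entries for a supply-chain site.
chain: list = []
append_record(chain, {"site": "Factory A", "audit": "wage records reviewed"})
append_record(chain, {"site": "Factory A", "audit": "worker interviews held"})
print(verify(chain))                                # True
chain[0]["payload"]["audit"] = "no issues found"    # tamper with history
print(verify(chain))                                # False: tampering is detectable
```

Note that tamper-evidence is a property of the records, not of the working conditions they describe: the ledger faithfully preserves whatever is entered, which is why the paper's three recommendations focus on who controls what gets recorded.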

Apparent Algorithmic Bias and Algorithmic Learning


Paper by Anja Lambrecht and Catherine E. Tucker: “It is worrying to think that algorithms might discriminate against minority groups and reinforce existing inequality. Typically, such concerns have focused on the idea that the algorithm’s code could reflect bias, or the data that feeds the algorithm might lead the algorithm to produce uneven outcomes.

In this paper, we highlight another reason why algorithms might appear biased against minority groups: the length of time algorithms need to learn. If an algorithm has access to less data for particular groups, or accesses this data at differential speeds, it will produce differential outcomes, potentially disadvantaging minority groups.

Specifically, we revisit a classic study which documents that searches on Google for black names were more likely to return ads that highlighted the need for a criminal background check than searches for white names. We show that at least a partial explanation for this finding is that if consumer demand for a piece of information is low, an algorithm accumulates information more slowly and thus takes longer to learn about consumer preferences. Since black names are less common, the algorithm learns about the quality of the underlying ad more slowly, and as a result an ad is more likely to persist for searches next to black names even if the algorithm judges the ad to be of low quality. Therefore, the algorithm may be likely to show an ad — including an undesirable ad — in the context of searches for a disadvantaged group for a longer period of time.

We replicate this result using the context of religious affiliations and present evidence that ads targeted towards searches for religious groups persist longer for groups that are less searched for. This suggests that the process of algorithmic learning can lead to differential outcomes across those whose characteristics are more common and those who are rarer in society….(More)”.
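
A small simulation makes the learning-speed mechanism concrete. The sketch below is a toy model with assumed parameters (the true click-through rate, drop threshold, and search volumes are illustrative, not from the paper): two ads of identical low quality face different search volumes, and the platform only drops an ad once it has seen enough impressions to trust its estimate.

```python
import random

random.seed(0)

TRUE_CTR = 0.01        # both ads are genuinely low quality
DROP_THRESHOLD = 0.02  # platform drops ads it believes fall below this CTR
MIN_IMPRESSIONS = 500  # impressions needed before the estimate is trusted

def days_until_dropped(searches_per_day: int, horizon_days: int = 365) -> int:
    """How long a low-quality ad survives at a given search volume."""
    impressions = clicks = 0
    for day in range(1, horizon_days + 1):
        for _ in range(searches_per_day):
            impressions += 1
            clicks += random.random() < TRUE_CTR
        if impressions >= MIN_IMPRESSIONS and clicks / impressions < DROP_THRESHOLD:
            return day
    return horizon_days

# Identical ad quality, different query volumes:
print("common names:", days_until_dropped(searches_per_day=200), "days")
print("rare names:  ", days_until_dropped(searches_per_day=10), "days")
```

In this toy setup the high-volume ad is dropped within a few days while the identical low-volume ad persists for weeks, purely because data accumulates more slowly, which is the mechanism the paper proposes.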

Responsible Data Toolkit


Andrew Young at The GovLab: “The GovLab and UNICEF, as part of the Responsible Data for Children initiative (RD4C), are pleased to share a set of user-friendly tools to support organizations and practitioners seeking to operationalize the RD4C Principles. These principles—Purpose-Driven, People-Centric, Participatory, Protective of Children’s Rights, Proportional, Professionally Accountable, and Prevention of Harms Across the Data Lifecycle—are especially important in the current moment, as actors around the world are taking a data-driven approach to the fight against COVID-19.

The initial components of the RD4C Toolkit are:

The RD4C Data Ecosystem Mapping Tool intends to help users identify the systems generating data about children and the key components of those systems. After using this tool, users will be positioned to understand the breadth of data they generate and hold about children; assess data systems’ redundancies or gaps; identify opportunities for responsible data use; and achieve other insights. (A minimal sketch of what such a map might capture follows the toolkit components below.)

The RD4C Decision Provenance Mapping methodology provides a way for actors designing or assessing data investments for children to identify key decision points and determine which internal and external parties influence those decision points. This distillation can help users to pinpoint any gaps and develop strategies for improving decision-making processes and advancing more professionally accountable data practices.

The RD4C Opportunity and Risk Diagnostic provides organizations with a way to take stock of the RD4C principles and how they might be realized as an organization reviews a data project or system. Its high-level questions and prompts are intended to help users identify areas in need of attention and to strategize next steps for ensuring more responsible handling of data for and about children across their organization.

Finally, the Data for Children Collaborative with UNICEF developed an Ethical Assessment that “forms part of [their] safe data ecosystem, alongside data management and data protection policies and practices.” The tool reflects the RD4C Principles and aims to “provide an opportunity for project teams to reflect on the material consequences of their actions, and how their work will have real impacts on children’s lives.”
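
As promised above, here is a hypothetical sketch of what one entry in a data ecosystem map might capture. The field names are illustrative assumptions, not taken from the RD4C toolkit itself; the point is that even a simple, consistent record per system supports the comparisons the mapping tool describes.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of one entry in a data ecosystem map.
# Field names are illustrative assumptions, not from the RD4C toolkit.
@dataclass
class DataSystemEntry:
    system: str                 # the system generating data about children
    steward: str                # who holds and manages the data
    data_collected: list = field(default_factory=list)
    known_uses: list = field(default_factory=list)
    gaps_or_redundancies: str = ""

ecosystem_map = [
    DataSystemEntry(
        system="School enrolment registry",
        steward="Ministry of Education",
        data_collected=["name", "date of birth", "attendance"],
        known_uses=["planning school places"],
        gaps_or_redundancies="overlaps with civil registry on identity fields",
    ),
]

# Reviewing entries side by side supports the assessments the tool describes:
# spotting redundancies, gaps, and opportunities for responsible data use.
for entry in ecosystem_map:
    print(entry.system, "->", entry.gaps_or_redundancies)
```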

RD4C launched in October 2019 with the release of the RD4C Synthesis Report, Selected Readings, and the RD4C Principles. Last month we published the RD4C Case Studies, which analyze data systems deployed in diverse country environments, with a focus on their alignment with the RD4C Principles. The case studies are: Romania’s The Aurora Project, Childline Kenya, and Afghanistan’s Nutrition Online Database.

To learn more about Responsible Data for Children, visit rd4c.org or contact rd4c [at] thegovlab.org. To join the RD4C conversation and be alerted to future releases, subscribe at this link.”