Charting the ‘Data for Good’ Landscape


Report by Jake Porway at Data.org: “There is huge potential for data science and AI to play a productive role in advancing social impact. However, the field of “data for good” is not only overshadowed by the public conversations about the risks rampant data misuse can pose to civil society, it is also a fractured and disconnected space. There are a myriad of different interpretations of what it means to “use data for good” or “use AI for good”, which creates duplicate efforts, nonstrategic initiatives, and confusion about what a successfully data-driven social sector could look like. To add to that, funding is scarce for a field that requires expensive tools and skills to do well. These enduring challenges result in work being done at an activity and project level, but do not create a coherent set of building blocks to constitute a strong and healthy field that is capable of solving a new class of systems-level problems.

We are taking one tiny step forward in trying to make a more coherent Data for Good space with a landscape that makes clear what various Data for Good initiatives (and AI for Good initiatives) are trying to achieve, how they do it, and what makes them similar or different from one another. One of the major confusion points in talking about “Data for Good” is that it treats all efforts as similar by the mere fact that they use “data” and seek to do something “good”. This term is so broad as to be practically meaningless; as unhelpful as saying “Wood for Good”. We would laugh at a term as vague as “Wood for Good”, which would lump together activities as different as building houses to burning wood in cook stoves to making paper, combining architecture with carpentry, forestry with fuel. However, we are content to say “Data for Good”, and its related phrases “we need to use our data better” or “we need to be data-driven”, when data is arguably even more general than something like wood.

We are trying to bring clarity to the conversation by going beyond mapping organizations into arbitrary groups, to define the dimensions of what it means to do data for good. By creating an ontology for what Data for Good initiatives seek to achieve, in which sector, and by what means, we can gain a better understanding of the underlying fundamentals of using data for good, as well as creating a landscape of what initiatives are doing.

We hope that this landscape of initiatives will help to bring some more nuance and clarity to the field, as well as identify which initiatives are out there and what purpose they serve. Specifically, we hope this landscape will help:

  • Data for Good field practitioners align on a shared language for the outcomes, activities, and aims of the field.
  • Purpose-driven organizations who are interested in applying data and computing to their missions better understand what they might need and who they might go to to get it.
  • Funders make more strategic decisions about funding in the data/AI space based on activities that align with their interests and the amount of funding already devoted to that area.
  • Organizations with Data for Good initiatives can find one another and collaborate based on similarity of mission and activities.

Below you will find a very preliminary landscape map, along with a description of the different kinds of groups in the Data for Good ecosystem and why you might need to engage with them….(More)”.

Understanding crowdsourcing projects: A review on the key design elements of a crowdsourcing initiative


Paper by Rea Karachiwalla and Felix Pinkow: “Crowdsourcing has gained considerable traction over the past decade and has emerged as a powerful tool in the innovation process of organizations. Given its growing significance in practice, a profound understanding of the concept is crucial. The goal of this study is to develop a comprehensive understanding of designing crowdsourcing projects for innovation by identifying and analyzing critical design elements of crowdsourcing contests. Through synthesizing the principles of the social exchange theory and absorptive capacity, this study provides a novel conceptual configuration that accounts for both the attraction of solvers and the ability of the crowdsourcer to capture value from crowdsourcing contests. Therefore, this paper adopts a morphological approach to structure the four dimensions, namely, (i) task, (ii) crowd, (iii) platform and (iv) crowdsourcer, into a conceptual framework to present an integrated overview of the various crowdsourcing design options. The morphological analysis allows the possibility of identifying relevant interdependencies between design elements, based on the goals of the problem to be crowdsourced. In doing so, the paper aims to enrich the extant literature by providing a comprehensive overview of crowdsourcing and to serve as a blueprint for practitioners to make more informed decisions when designing and executing crowdsourcing projects….(More)”.

Could Trade Agreements Help Address the Wicked Problem of Cross-Border Disinformation?


Essay by Susan Ariel Aaronson: “Whether produced domestically or internationally, disinformation is a “wicked” problem that has global impacts. Although trade agreements contain measures that address cross-border disinformation, domestically created disinformation remains out of their reach. This paper looks at how policy makers can use trade agreements to mitigate disinformation and spam while implementing financial and trade sanctions against entities and countries that engage in disseminating cross-border disinformation. Developed and developing countries will need to work together to solve this global problem….(More)”.

Research directions in policy modeling: Insights from comparative analysis of recent projects


Paper by Alexander Ronzhyn and Maria A. Wimmer: “With the increased availability of data and the capacity to make sense of these data, computational approaches to analyze, model and simulate public policy evolved toward viable instruments to deliberate, plan, and evaluate them in different areas of application. Such examples include infrastructure, mobility, monetary, or austerity policies, policies on different aspects of societies (health, pandemic, skills, inclusion, etc.). Technological advances along with the evolution of theoretical models and frameworks open valuable opportunities, while at the same time, posing new challenges. The paper investigates the current state of research in the domain and aims at identifying the most pressing areas for future research. This is done through both literature research of policy modeling and the analysis of research and innovation projects that either focus on policy modeling or involve it as a significant component of the research design. In the paper, 16 recent projects involving the keyword policy modeling were analyzed. The majority of projects concern the application of policy modeling to a specific domain or area of interest, while several projects tackled the cross-cutting topics (risk and crisis management). The detailed analysis of the projects led to topics of future research in the domain of policy modeling. Most prominent future research topics in policy modeling include stakeholder involvement approaches, applicability of research results, handling complexity of models, integration of models from different modeling and simulation paradigms and approaches, visualization of simulation results, real-time data processing, and scalability. These aspects require further research to appropriately contribute to further advance the field….(More)”.

A New Tool Shows How Google Results Vary Around the World


Article by Tom Simonite: “Google’s claim to “organize the world’s information and make it universally accessible and useful” has earned it an aura of objectivity. Its dominance in search, and the disappearance of most competitors, make its lists of links appear still more canonical. An experimental new interface for Google Search aims to remove that mantle of neutrality.

Search Atlas makes it easy to see how Google offers different responses to the same query on versions of its search engine offered in different parts of the world. The research project reveals how Google’s service can reflect or amplify cultural differences or government preferences—such as whether Beijing’s Tiananmen Square should be seen first as a sunny tourist attraction or the site of a lethal military crackdown on protesters.

Divergent results like that show how the idea of search engines as neutral is a myth, says Rodrigo Ochigame, a PhD student in science, technology, and society at MIT and cocreator of Search Atlas. “Any attempt to quantify relevance necessarily encodes moral and political priorities,” Ochigame says.

Ochigame built Search Atlas with Katherine Ye, a computer science PhD student at Carnegie Mellon University and a research fellow at the nonprofit Center for Arts, Design, and Social Research.

Just like Google’s homepage, the main feature of Search Atlas is a blank box. But instead of returning a single column of results, the site displays three lists of links, from different geographic versions of Google Search selected from the more than 100 the company offers. Search Atlas automatically translates a query to the default languages of each localized edition using Google Translate.

Ochigame and Ye say the design reveals “information borders” created by the way Google’s search technology ranks web pages, presenting different slices of reality to people in different locations or using different languages.

When they used their tool to do an image search on “Tiananmen Square,” the UK and Singaporean versions of Google returned images of tanks and soldiers quashing the 1989 student protests. When the same query was sent to a version of Google tuned for searches from China, which can be accessed by circumventing the country’s Great Firewall, the results showed recent, sunny images of the square, smattered with tourists.

Google’s search engine has been blocked in China since 2010, when the company said it would stop censoring topics the government deemed sensitive, such as the Tiananmen massacre. Search Atlas suggests that the China edition of the company’s search engine can reflect the Chinese government’s preferences all the same. That pattern could result in part from how the corpus of web pages from any language or region would reflect cultural priorities and pressures….(More)”

An experimental interface for Google Search found that it offered very different views of Beijing’s Tiananmen Square to searchers from the UK (left), Singapore (center), and China. Courtesy of Search Atlas.

Guide on Geospatial Data Integration in Official Statistics


Report by PARIS21: “National geospatial integration agencies can provide detailed, timely and relevant data about people, businesses, buildings, infrastructures, agriculture, natural resources and anthropogenic impacts on the biosphere. There is a clear benefit to integrating geospatial data into traditional national statistical systems. Together they provide a very clear picture of the social, economic and environmental issues that underpin sustainable development and allow for more informed policy making. But the question is where to start?

This new PARIS21 publication provides a practical guide, based on five principles for national statistics offices to form stronger partnerships with national geospatial integration agencies….(More)”.

The tyranny of spreadsheets


Tim Harford at the Financial Times: “Early last October my phone rang. On the line was a researcher calling from Today, the BBC’s agenda-setting morning radio programme. She told me that something strange had happened, and she hoped I might be able to explain it. Nearly 16,000 positive Covid cases had disappeared completely from the UK’s contact tracing system. These were 16,000 people who should have been warned they were infected and a danger to others, 16,000 cases contact tracers should have been running down to figure out where the infected went, who they met and who else might be at risk. None of which was happening. Why had the cases disappeared? Apparently, Microsoft Excel had run out of numbers.

It was an astonishing story that would, in time, lead me to delve into the history of accountancy, epidemiology and vaccination, discuss file formatting with Microsoft’s founder, Bill Gates, and even trace the aftershocks of the collapse of Enron. But above all, it was a story that would teach me about the way we take numbers for granted….

The origin of Excel can be traced back far further than that of Microsoft. In the late 1300s, the need for a solid system for accounts was evident in the outbursts of one man in particular, an Italian textile merchant named Francesco di Marco Datini. Poor Datini was surrounded by fools.

“You cannot see a crow in a bowlful of milk!” he berated one associate.

“You could lose your way from your nose to your mouth!” he chided another.

Iris Origo’s vivid book The Merchant of Prato describes Datini’s everyday life and explains his problem: keeping track of everything in a complicated world. By the end of the 14th century, merchants such as Datini had progressed from mere travelling salesmen able to keep track of profits by patting their purses. They were now in charge of sophisticated operations.

Datini, for example, ordered wool from the island of Mallorca two years before the sheep had even grown it, a hedge to account for the numerous subcontractors that would process it before it became beautiful rolls of dyed cloth. The supply chain between shepherd and consumer stretched across Barcelona, Pisa, Venice, Valencia, North Africa and back to Mallorca. It took four years between the initial order of wool and the final sale of cloth.

No wonder Datini insisted on absolute clarity about where his product was at any moment, not to mention his money. How did he manage? Spreadsheets…(More)”

Seek diversity to solve complexity


Katrin Prager at Nature: “As a social scientist, I know that one person cannot solve a societal problem on their own — and even a group of very intelligent people will struggle to do it. But we can boost our chances of success if we ensure not only that the team members are intelligent, but also that the team itself is highly diverse.

By ‘diverse’ I mean demographic diversity encompassing things such as race, gender identity, class, ethnicity, career stage and age, and cognitive diversity, including differences in thoughts, insights, disciplines, perspectives, frames of reference and thinking styles. And the team needs to be purposely diverse instead of arbitrarily diverse.

In my work I focus on complex world problems, such as how to sustainably manage our natural resources and landscapes, and I’ve found that it helps to deliberately assemble diverse teams. This effort requires me to be aware of the different ways in which people can be diverse, and to reflect on my own preferences and biases. Sometimes the teams might not be as diverse as I’d like. But I’ve found that making the effort not only to encourage diversity, but also to foster better understanding between team members reaps dividends….(More)”

Media Is Us: Understanding Communication and Moving beyond Blame


Book by Elizaveta Friesem: “Media is usually seen as a feature of the modern world enabled by the latest technologies. Scholars, educators, parents, and politicians often talk about media as something people should be wary of due to its potential negative impact on their lives. But do we really understand what media is?

Elizaveta Friesem argues that instead of being worried about media or blaming it for what’s going wrong in society, we should become curious about uniquely human ways we communicate with each other. Media Is Us proposes five key principles of communication that are relevant both for the modern media and for people’s age-old ways of making sense of the world.

In order to understand problems of the contemporary society revealed and amplified by the latest technologies, we will have to ask difficult questions about ourselves. Where do our truths and facts come from? How can we know who is to blame for flaws of the social system? What can we change about our own everyday actions to make the world a better place? To answer these questions we will need to rethink not only the term “media” but also the concept of power. The change of perspective proposed by the book is intended to help the reader become more self-aware and also empathic towards those who choose different truths.

Concluding with practical steps to build media literacy through the ACE model—from Awareness to Collaboration through Empathy—this timely book is essential for students and scholars, as well as anyone who would use the new understanding of media to decrease the current levels of cultural polarization….(More)”.

Manipulation As Theft


Paper by Cass Sunstein: “Should there be a right not to be manipulated? What kind of right? On Kantian grounds, manipulation, lies, and paternalistic coercion are moral wrongs, and for similar reasons; they deprive people of agency, insult their dignity, and fail to respect personal autonomy. On welfarist grounds, manipulation, lies, and paternalistic coercion share a different characteristic; they displace the choices of those whose lives are directly at stake, and who are likely to have epistemic advantages, with the choices of outsiders, who are likely to lack critical information. Kantians and welfarists should be prepared to endorse a (moral) right not to be manipulated, though on very different grounds.

The moral prohibition on manipulation, like the moral prohibition on lies, should run against officials and regulators, not only against private institutions. At the same time, the creation of a legal right not to be manipulated raises hard questions, in part because of definitional challenges; there is a serious risk of vagueness and a serious risk of overbreadth. (Lies, as such, are not against the law, and the same is true of unkindness, inconsiderateness, and even cruelty.) With welfarist considerations in mind, it is probably best to start by prohibiting particular practices, while emphasizing that they are forms of manipulation and may not count as fraud. The basic goal should be to build on the claim that in certain cases, manipulation is a form of theft; the law should forbid theft, whether it occurs through force, lies, or manipulation. Some manipulators are thieves….(More)”