Harnessing collective intelligence to find missing children


Cordis: “It is estimated that over 250 000 children go missing every year in the EU. Statistics on their recovery are scant, but based on data from the EU-wide 116 000 hotline, 14 % of runaways and 57 % of migrant minors reported missing in 2019 had not been found by the end of the year. The EU-supported ChildRescue project has developed a collective intelligence and stakeholder communication approach for missing children investigations. It consists of a collaborative platform and two mobile apps available for organisations, verified volunteers and the general public. “ChildRescue is being used by our piloting organisations and is already becoming instrumental in missing children investigations. The public response has exceeded our expectations, with over 22 000 app downloads,” says project coordinator Christos Ntanos from the Decision Support Systems Laboratory at the National Technical University of Athens. ChildRescue has also published a white paper on the need for a comprehensive legal framework on missing unaccompanied migrant minors in the EU….

To assist in these investigations, ChildRescue trained machine learning algorithms to find underlying patterns in case data. As input, they used structured information about individual cases combined with open data from multiple sources, alongside data from similar past cases. The ChildRescue community mobile app issues real-time alerts near places of interest, such as where a child was last seen. Citizens can respond with information, including photos, accessible exclusively to the organisation involved in the case. The quality, relevance and credibility of this feedback are assessed by an algorithm. The organisation can then pass information to the police and engage its own volunteers. Team members can share real-time information through a dedicated private collaboration space…(More)”.
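
The proximity alerting described above lends itself to a simple illustration. Below is a minimal sketch of how a community app might decide whether a user's last known position falls within the alert radius of a place of interest; the function names, the 5 km radius, and the coordinates are illustrative assumptions, not details of the ChildRescue implementation.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def should_alert(user_pos, place_of_interest, radius_km=5.0):
    """Alert a user whose last known position lies within the alert radius
    of a place of interest (e.g., where a child was last seen)."""
    return haversine_km(*user_pos, *place_of_interest) <= radius_km

# A user roughly 2.6 km from the place of interest receives the alert.
print(should_alert((48.8566, 2.3522), (48.8800, 2.3550)))  # True
```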

Citizen Science Is Helping Tackle Stinky Cities


Article by Lucrezia Lozza: “Marta has lived with a bad smell lingering in her hometown in central Spain, Villanueva del Pardillo, for a long time. Fed up, in 2017 she and her neighbors decided to pursue the issue. “The smell is disgusting,” Marta says, pointing a finger at a local yeast factory.

Originally, she thought of recording the “bad smell days” on a spreadsheet. When this didn’t work out, after some research she found Odour Collect, a crowdsourced map that lets users log geolocalized, timestamped reports of bad smells in their neighborhood.
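
As a rough picture of what one such geolocalized, timestamped smell report might look like as data, here is a minimal sketch; the field names and values are illustrative assumptions, not Odour Collect's actual schema.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class OdourReport:
    """One geolocalized, timestamped smell observation."""
    lat: float
    lon: float
    observed_at: datetime
    odour_type: str   # e.g., "yeast", "waste water", "chemical"
    intensity: int    # e.g., 1 (barely noticeable) to 5 (very strong)

# A hypothetical evening report from Villanueva del Pardillo.
report = OdourReport(40.493, -3.965, datetime(2017, 6, 3, 21, 15), "yeast", 4)
```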

After noise, odor nuisance is the second most common cause of environmental complaints. Odor regulations vary among countries and there’s little legislation about how to manage smells. For instance, in Spain some municipalities regulate odors, but others do not. In the United States, the Environmental Protection Agency does not regulate odor as a pollutant, so states and local jurisdictions are in charge of the issue.

Only after Marta started using Odour Collect to record the unpleasant smells in her town did she discover that the map was part of ‘D-NOSES’, a European project aimed at bringing citizens, industries and local authorities together to monitor and minimize odor nuisances. D-NOSES relies heavily on citizen science: Affected communities gather odor observations through two maps — Odour Collect and Community Maps — with the goal of implementing new policies in their area. D-NOSES launched several pilots in Europe — in Spain, Greece, Bulgaria, and Portugal — and two outside the continent, in Uganda and Chile.

“Citizen science promotes transparency between all the actors,” said Nora Salas Seoane, Social Sciences Researcher at Fundación Ibercivis, one of the partners of D-NOSES…(More)”.

Assessing the social and emotional costs of mass shootings with Twitter data


Article by Mary Blankenship and Carol Graham: “Mass shootings that result in mass casualties are almost a weekly occurrence in the United States, which—not coincidentally—also has the most guns per capita in the world. Viewed from outside the U.S., it seems that Americans are not bothered by the constant deadly gun violence and have simply adapted to it. Yet our analysis of the well-being costs of gun violence—using Twitter data to track real-time responses throughout the course of these appalling events—suggests that is not necessarily the case. We focus on the two March 2021 shootings in Atlanta and Boulder, and compare them to similar data for the “1 October” (Las Vegas) and El Paso shootings a few years prior. (Details on our methodology can be found at the end of this blog.)

A reason for the one-sided debate on guns is that, beyond the gruesome body counts, we do not have many tools for assessing the large—but unobservable—effects of this violence on family members, friends, and neighbors of the victims, as well as on society in general. By assessing how emotions in Twitter messages evolve over time, we can see real changes. Our analysis shows that society is increasingly angered by gun violence, rather than simply adapting to it.

A striking characteristic of the response to the 1 October shooting is the immediate influx of users sending their thoughts and prayers to the victims and the Las Vegas community. Figure 1 shows the top emoji usage, with “praying hands” being the most frequently used emoji. Although that is still the most used emoji in response to the other shootings, the margin between “praying hands” and other emojis has substantially decreased in the more recent responses to Atlanta and Boulder. Our focus is on the “yellow face” emojis, which map to six primary emotion categories: surprise, sadness, disgust, fear, anger, and neutral. While the majority of face emojis reflect sadness in the 1 October and El Paso shootings, new emojis like the “red angry face” show greater feelings of anger in the Atlanta and Boulder shootings, as shown in Figure 3….(More)”.

Figure 1. Top 10 emojis used in response to the 1 October shooting (Source: Authors)
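
As an illustration of the kind of emoji tally behind Figure 1, here is a minimal sketch that counts face emojis in a batch of tweets and maps them to the six emotion categories the authors name. The emoji-to-emotion mapping is an illustrative assumption, not the authors' actual lexicon.

```python
from collections import Counter

# Illustrative emoji-to-emotion lexicon covering the six categories the
# authors name; their actual mapping may differ.
EMOJI_EMOTION = {
    "😢": "sadness", "😭": "sadness",
    "😡": "anger",   "😠": "anger",
    "😱": "fear",    "😨": "fear",
    "😲": "surprise",
    "🤢": "disgust",
    "😐": "neutral",
}

def emotion_counts(tweets):
    """Tally emotion categories from face emojis found in a batch of tweets."""
    counts = Counter()
    for text in tweets:
        for char in text:
            if char in EMOJI_EMOTION:
                counts[EMOJI_EMOTION[char]] += 1
    return counts

print(emotion_counts(["So heartbroken 😭", "This has to stop 😡😡"]))
# Counter({'anger': 2, 'sadness': 1})
```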

Building on a year of open data: progress and promise


Jennifer Yokoyama at Microsoft: “…The biggest takeaway from our work this past year – and the one thing I hope any reader of this post will take away – is that data collaboration is a spectrum. From the presence (or absence) of data, to how open that data is, to the trust level of the collaboration participants, these factors may lead to different configurations and different goals, but they can all lead to more open data and innovative insights and discoveries.

Here are a few other lessons we have learned over the last year:

  1. Principles set the foundation for stakeholder collaboration: When we launched the Open Data Campaign, we adopted five principles that guide our contributions and commitments to trusted data collaborations: Open, Usable, Empowering, Secure and Private. These principles underpin our participation, but importantly, organizations can build on them to establish responsible ways to share and collaborate around their data. The London Data Commission, for example, established a set of data sharing principles for public- and private-sector organizations to ensure alignment and to guide the participating groups in how they share data.
  2. There is value in pilot projects: Traditionally, data collaborations with several stakeholders require time – often a long runway for building the collaboration, plus the time needed to execute on the project and learn from it. However, we have learned that short-term projects that experiment with and test data collaborations can provide valuable insights. The London Data Commission did exactly that with the launch of four short-term pilot projects. Due to the success of the pilots, the partners are exploring how they can be expanded.
  3. Open data doesn’t require new data: Data identified for sharing does not always have to be newly shared; sometimes data that was shared narrowly can be shared more broadly, made more accessible, or analyzed for a different purpose. Microsoft’s environmental indicator data is an example of data that was already disclosed in certain venues, but was then made available to the Linux Foundation’s OS-Climate Initiative to be consumed through analytics, thereby extending its reach and impact…

To get started, we suggest that emerging data collaborations make use of the wealth of existing resources. When embarking on data collaborations, we leveraged many of the definitions, toolkits and guides from leading organizations in this space. As examples, resources such as the Open Data Institute’s Data Ethics Canvas are extremely useful as a framework to develop ethical guidance. Additionally, The GovLab’s Open Data Policy Lab and Executive Course on Data Stewardship, both supported by Microsoft, highlight important case studies, governance considerations and frameworks when sharing data. If you want to learn more about the exciting work our partners are doing, check out the latest posts from the Open Data Institute and GovLab…(More)”. See also Open Data Policy Lab.

Control Creep: When the Data Always Travels, So Do the Harms


Essay by Sun-ha Hong: “In 2014, a Canadian firm made history. Calgary-based McLeod Law brought the first known case in which Fitbit data would be used to support a legal claim. The device’s loyalty was clear: the young woman’s personal injury claim would be supported by her own Fitbit data, which would help prove that her activity levels had dipped post-injury. Yet the case had opened up a wider horizon for data use, both for and against the owners of such devices. Leading artificial intelligence (AI) researcher Kate Crawford noted at the time that the machines we use for “self-tracking” may be opening up a “new age of quantified self incrimination.”

Subsequent cases have demonstrated some of those possibilities. In 2015, a Connecticut man reported that his wife had been murdered by a masked intruder. Based partly on the victim’s Fitbit data, and other devices such as the family house alarm, detectives charged the man — not a masked intruder — with the crime. In 2016, a Pennsylvania woman claimed she was sexually assaulted, but police argued that the woman’s own Fitbit data suggested otherwise, and charged her with false reporting. In the courts and elsewhere, data initially gathered for self-tracking is increasingly being used to contradict or overrule the self — despite academic research and even a class action lawsuit alleging high rates of error in Fitbit data.

The data always travels, creating new possibilities for judging and predicting human lives. We might call it control creep: data-driven technologies tend to be pitched for a particular context and purpose, but quickly expand into new forms of control. Although we often think about data use in terms of trade-offs or bargains, such frameworks can be deeply misleading. What does it mean to “trade” personal data for the convenience of, say, an Amazon Echo, when the other side of that trade is constantly arranging new ways to sell and use that data in ways we cannot anticipate? As technology scholars Jake Goldenfein, Ben Green and Salomé Viljoen argue, the familiar trade-off of “privacy vs. X” rarely results in full respect for both values but instead tends to normalize a further stripping of privacy….(More)”.

Socially Responsible Data Labeling


Blog by Hamed Alemohammad at Radiant Earth Foundation: “Labeling satellite imagery is the process of applying tags to scenes to provide context or confirm information. These labeled training datasets form the basis for machine learning (ML) algorithms. The labeling undertaking (in many cases) requires humans to meticulously and manually assign captions to the data, allowing the model to learn patterns and apply them to other observations.

For a wide range of Earth observation applications, training data labels can be generated by annotating satellite imagery. An entire image can be classified as a single class (e.g., water body), or specific objects within the image can be labeled. However, annotation tasks can only identify features observable in the imagery. For example, with Sentinel-2 imagery at 10-meter spatial resolution, one cannot detect more detailed features of interest, such as crop types, but one can distinguish large croplands from other land cover classes.

Human error in labeling is inevitable and results in uncertainties and errors in the final label. As a result, it’s best practice to have images examined multiple times and then assign a majority or consensus label. In general, significant human resources and financial investment are needed to annotate imagery at large scales.
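
A simple way to picture the majority-or-consensus rule described above: collect each annotator's label for an image and keep the most common one only if it clears an agreement threshold. The sketch below is a minimal illustration under that assumption, not Radiant Earth's actual pipeline.

```python
from collections import Counter

def consensus_label(annotations, min_agreement=0.5):
    """Return the majority label for one image, or None when no label
    clears the agreement threshold and the image needs another review."""
    label, votes = Counter(annotations).most_common(1)[0]
    return label if votes / len(annotations) > min_agreement else None

# Three annotators examined the same scene; two of them agree.
print(consensus_label(["water body", "water body", "cropland"]))  # water body
print(consensus_label(["water body", "cropland"]))                # None (tie)
```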

In 2018, we identified the need for a geographically diverse land cover classification training dataset that required human annotation and validation of labels. We proposed to Schmidt Futures a project to generate such a dataset to advance land cover classification globally. In this blog post, we discuss what we’ve learned developing LandCoverNet, including the keys to generating good quality labels in a socially responsible manner….(More)”.

A Resurgence of Democracy in 2040?


Blog by Steven Aftergood: “The world will be “increasingly out of balance and contested at every level” over the next twenty years due to the pressures of demographic, environmental, economic and technological change, according to Global Trends 2040, a new forecast released last week by the National Intelligence Council.

But among the mostly grim possible futures that can be plausibly anticipated — international chaos, political paralysis, resource depletion, mounting poverty — one optimistic scenario stands out: “In 2040, the world is in the midst of a resurgence of open democracies led by the United States and its allies.”

How could such a global renaissance of democracy possibly come about?

The report posits that between now and 2040 technological innovation in open societies will lead to economic growth, which will enable solutions to domestic problems, build public confidence, reduce vulnerabilities and establish an attractive model for emulation by others. Transparency is both a precondition and a consequence of this process.

“Open, democratic systems proved better able to foster scientific research and technological innovation, catalyzing an economic boom. Strong economic growth, in turn, enabled democracies to meet many domestic needs, address global challenges, and counter rivals,” the report assessed in this potential scenario.

“With greater resources and improving services, these democracies launched initiatives to crack down on corruption, increase transparency, and improve accountability worldwide, boosting public trust. These efforts helped to reverse years of social fragmentation and to restore a sense of civic nationalism.”

“The combination of rapid innovation, a stronger economy, and greater societal cohesion enabled steady progress on climate and other challenges. Democratic societies became more resilient to disinformation because of greater public awareness and education initiatives and new technologies that quickly identify and debunk erroneous information. This environment restored a culture of vigorous but civil debate over values, goals, and policies.”

“Strong differences in public preferences and beliefs remained but these were worked out democratically.”

In this hopeful future, openness provided practical advantages that left closed authoritarian societies lagging behind.

“In contrast to the culture of collaboration prevailing in open societies, Russia and China failed to cultivate the high-tech talent, investment, and environment necessary to sustain continuous innovation.”

“By the mid-2030s, the United States and its allies in Europe and Asia were the established global leaders in several technologies, including AI, robotics, the Internet of Things, biotech, energy storage, and additive manufacturing.”

The success of open societies in problem solving, along with their economic and social improvements, inspired other countries to adopt the democratic model.

“Technological success fostered a widely perceived view among emerging and developing countries that democracies were more adaptable and resilient and better able to cope with growing global challenges.”…(More)”.

Combining Racial Groups in Data Analysis Can Mask Important Differences in Communities


Blog by Jonathan Schwabish and Alice Feng: “Surveys, datasets, and published research often lump together racial and ethnic groups, which can erase the experiences of certain communities. Combining groups with different experiences can mask how specific groups and communities are faring and, in turn, affect how government funds are distributed, how services are provided, and how groups are perceived.

Large surveys that collect data on race and ethnicity are used to disburse government funds and services in a number of ways. The US Department of Housing and Urban Development, for instance, distributes millions of dollars annually to Native American tribes through the Indian Housing Block Grant. And statistics on race and ethnicity are used as evidence in employment discrimination lawsuits and to help determine whether banks are discriminating against people and communities of color.

Despite the potentially large effects these data can have, researchers don’t always disaggregate their analysis into more detailed racial groups. Many point to small sample sizes as a barrier to including more race and ethnicity categories in their analysis, but efforts to gather more specific data and disaggregate available survey results are critical to creating better policy for everyone.

To illustrate how aggregating racial groups can mask important variation, we looked at the 2019 poverty rate across 139 detailed race categories in the Census Bureau’s annual American Community Survey (ACS). The ACS provides information that helps determine how more than $675 billion in government funds is distributed each year.

The official poverty rate in the United States stood at 10.5 percent in 2019, with significant variation across racial and ethnic groups. The primary question in the ACS concerning race includes 15 separate checkboxes, with space to print additional names or races for some options (a separate question refers to Hispanic or Latino origin).

Screenshot of the American Community Survey's race question

Although the survey offers ample latitude for respondents to report their race, researchers have a tendency to aggregate racial categories. People who identify as Asian or Pacific Islander (API), for example, are often combined in economic analyses.

This aggregation can mask variation within racial or ethnic categories. As an example, one analysis that used the ACS showed 11 percent of children in the API group are in poverty, relative to 18 percent of the overall population. But that estimate could understate the poverty rate among children who identify as Pacific Islanders and could overstate the poverty rate among children who identify as Asian, which is itself a broad grouping that encompasses many different communities with varied experiences. Similar aggregating can be found across the economic literature, including on education, immigration (PDF), and wealth….(More)”.
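
A small numerical sketch makes the masking effect concrete. With hypothetical counts (not actual ACS figures), a combined “API” poverty rate sits close to the rate of the much larger Asian group, while a far higher Pacific Islander rate vanishes from the headline number:

```python
# Hypothetical counts (not actual ACS figures) of children in poverty.
groups = {
    # group: (children in poverty, total children)
    "Asian":            (100_000, 1_000_000),  # 10.0% poverty rate
    "Pacific Islander": ( 20_000,    80_000),  # 25.0% poverty rate
}

poor = sum(in_poverty for in_poverty, _ in groups.values())
total = sum(children for _, children in groups.values())

# The combined rate lands near the Asian rate because that group is much
# larger, hiding the far higher Pacific Islander rate.
print(f"Combined API poverty rate: {poor / total:.1%}")  # 11.1%
```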

10 + 1 Guidelines for EU Citizen’s Assemblies


Blog post: “Over the past years, deliberative citizens’ assemblies selected by lot have grown in popularity and impact around the world. If introduced at the European Union level and aimed at developing recommendations on EU policy issues, such first-ever transnational citizens’ assemblies would be groundbreaking in advancing EU democratic reform. The Citizens Take Over Europe coalition recognizes the political urgency and democratic potential of such innovations in EU governance. We therefore call for the introduction of European citizens’ assemblies as a regular and permanent body for popular policy deliberation. For EU-level citizens’ assemblies to work as an effective tool in further democratising EU decision-making, we have thoroughly examined preexisting exercises in deliberative democracy. The following 10 + 1 guidelines are based on best practices and lessons learned from national and local citizens’ assemblies across Europe. They have been designed in collaboration with leading experts. At present, these guidelines shall instruct the Conference on the Future of Europe on how to create the first experimental space for transnational citizens’ assemblies. But they are designed for future EU citizens’ assemblies as well.

1. Participatory prerequisites 

Strong participatory instruments are a prerequisite for a democratic citizens’ assembly. Composed as a microcosm of the EU population with people selected by lot, the assembly’s workings must be participatory and allow all members to have a say, with proper professional moderation during the deliberative rounds. The assembly must fit the EU participatory pillar and connect to the existing tools of EU participatory democracy, for instance by deliberating on successful European citizens’ initiatives.

The scope and structure of the citizens’ assembly should be designed in a participatory manner by the members of the assembly, starting with the first assembly meeting that will draft and adopt its rules of procedure and set its agenda.

Additional participatory instruments, such as the possibility to submit online proposals to the assembly on relevant topics, should be included in order to facilitate the engagement of all citizens. Information about opportunities to get involved and participate in the citizens’ assembly proceedings must be attractive and accessible to ordinary citizens….(More)”.

How We Built a Facebook Feed Viewer


Citizen Browser at The MarkUp: “Our interactive dashboard, Split Screen, gives readers a peek into the content Facebook delivered to people of different demographic backgrounds and voting preferences who participated in our Citizen Browser project. 

Using Citizen Browser, our custom Facebook inspector, we perform daily captures of Facebook data from paid panelists. These captures collect the content that was displayed on their Facebook feeds at the moment the app performed its automated capture. From Dec. 1, 2020, to March 2, 2021, 2,601 paid participants contributed their data to the project. 

To measure what Facebook’s recommendation algorithm displays to different groupings of people, we compare data captured from each grouping over a two-week period. We look at three different pairings:

  • Women vs. Men
  • Biden Voters vs. Trump Voters
  • Millennials vs. Boomers 

We labeled our panelists based on their self-disclosed political leanings, gender, and age. We describe each pairing in more detail in the Pairings section of this article. 

For each pair, we examine four types of content served by Facebook: news sources, posts with news links, hashtags, and group recommendations. We compare the percentage of each grouping that was served each piece of content to that of the other grouping in the pair.  
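
A minimal sketch of that comparison, under the assumption that each panelist's captures can be reduced to a set of content IDs; the data structures and names here are illustrative, not The Markup's actual pipeline.

```python
def percent_served(panelists, content_id):
    """Share of a grouping's panelists whose captured feeds contained an item.

    `panelists` maps a panelist ID to the set of content IDs captured from
    that panelist's feed over the comparison window.
    """
    shown = sum(content_id in feed for feed in panelists.values())
    return 100 * shown / len(panelists)

biden_voters = {"p1": {"groupA", "newsX"}, "p2": {"newsX"}, "p3": {"groupB"}}
trump_voters = {"p4": {"groupA"}, "p5": {"groupA", "groupB"}}

print(percent_served(biden_voters, "groupA"))  # 33.3... percent
print(percent_served(trump_voters, "groupA"))  # 100.0 percent
```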

For more information on the data we collect, the panel’s demographic makeup, and the extensive redaction process we undertake to preserve privacy, see our methodology How We Built a Facebook Inspector.

Our observations should not be taken as proof of Facebook’s choosing to target specific content at specific demographic groups. There are many factors that influence any given person’s feed that we do not account for, including users’ friends and social networks….(More)”.