The Future of Minds and Machines


Report by Aleksandra Berditchevskaia and Peter Baeck: “When it comes to artificial intelligence (AI), the dominant media narratives often end up taking one of two opposing stances: AI is the saviour or the villain. Whether it is presented as the technology responsible for killer robots and mass job displacement or the one curing all disease and halting the climate crisis, it seems clear that AI will be a defining feature of our future society. However, these visions leave little room for nuance and informed public debate. They also help propel the typical trajectory followed by emerging technologies; with inevitable regularity we observe the ascent of new technologies to the peak of inflated expectations they will not be able to fulfil, before dooming them to a period languishing in the trough of disillusionment.[1]

There is an alternative vision for the future of AI development. By starting with people first, we can introduce new technologies into our lives in a more deliberate and less disruptive way. Clearly defining the problems we want to address and focusing on solutions that result in the most collective benefit can lead us towards a better relationship between machine and human intelligence. By considering AI in the context of large-scale participatory projects across areas such as citizen science, crowdsourcing and participatory digital democracy, we can both amplify what it is possible to achieve through collective effort and shape the future trajectory of machine intelligence. We call this 21st-century collective intelligence (CI).

In The Future of Minds and Machines we introduce an emerging framework for thinking about how groups of people interface with AI and map out the different ways that AI can add value to collective human intelligence and vice versa. The framework has, in large part, been developed through analysis of inspiring projects and organisations that are testing out opportunities for combining AI & CI in areas ranging from farming to monitoring human rights violations. Bringing together these two fields is not easy. The design tensions identified through our research highlight the challenges of navigating this opportunity and selecting the criteria that public sector decision-makers should consider in order to make the most of solving problems with both minds and machines….(More)”.

How Philanthropy Can Help Lead on Data Justice


Louise Lief at Stanford Social Innovation Review: “Today, data governs almost every aspect of our lives, shaping the opportunities we have, how we perceive reality and understand problems, and even what we believe to be possible. Philanthropy is particularly data driven, relying on it to inform decision-making, define problems, and measure impact. But what happens when data design and collection methods are flawed, lack context, or contain critical omissions and misdirected questions? With bad data, data-driven strategies can misdiagnose problems and worsen inequities with interventions that don’t reflect what is needed.

Data justice begins by asking who controls the narrative. Who decides what data is collected and for which purpose? Who interprets what it means for a community? Who governs it? In recent years, affected communities, social justice philanthropists, and academics have all begun looking deeper into the relationship between data and social justice in our increasingly data-driven world. But philanthropy can play a game-changing role in developing practices of data justice to more accurately reflect the lived experience of communities being studied. Simply incorporating data justice principles into everyday foundation practice—and requiring it of grantees—would be transformative: It would not only revitalize research, strengthen communities, influence policy, and accelerate social change, it would also help address deficiencies in current government data sets.

When Data Is Flawed

Some of the most pioneering work on data justice has been done by Native American communities, who have suffered more than most from problems with bad data. A 2017 analysis of American Indian data challenges—funded by the W.K. Kellogg Foundation and the Morris K. Udall and Stewart L. Udall Foundation—documented how much data on Native American communities is of poor quality, inaccurate, inadequate, inconsistent, irrelevant, and/or inaccessible. The National Congress of American Indians even described Native American communities as “The Asterisk Nation,” because in many government data sets they are represented only by an asterisk denoting sampling errors instead of data points.

Where it concerns Native Americans, data is often not standardized: different government databases identify tribal members in at least seven different ways using different criteria; federal and state statistics often misclassify race and ethnicity; and some data collection methods don’t allow tribes to count tribal citizens living off the reservation. For over a decade the Department of the Interior’s Bureau of Indian Affairs has struggled to capture the data it needs for a crucial labor force report it is legally required to produce; methodology errors and reporting problems have been so extensive that at times they prevented the report from being published at all. But when the Department of the Interior changed several reporting requirements in 2014 and combined data submitted by tribes with US Census data, it only compounded the problem, making historical comparisons more difficult. Moreover, Native Americans have charged that the Census Bureau significantly undercounts both the American Indian population and key indicators like joblessness….(More)”.

New privacy-protected Facebook data for independent research on social media’s impact on democracy


Chaya Nayak at Facebook: “In 2018, Facebook began an initiative to support independent academic research on social media’s role in elections and democracy. This first-of-its-kind project seeks to provide researchers access to privacy-preserving data sets in order to support research on these important topics.

Today, we are announcing that we have substantially increased the amount of data we’re providing to 60 academic researchers across 17 labs and 30 universities around the world. This release delivers on the commitment we made in July 2018 to share a data set that enables researchers to study information and misinformation on Facebook, while also ensuring that we protect the privacy of our users.

This new data release supplants data we released in the fall of 2019. That 2019 data set consisted of links that had been shared publicly on Facebook by at least 100 unique Facebook users. It included information about share counts, ratings by Facebook’s third-party fact-checkers, and user reporting on spam, hate speech, and false news associated with those links. We have expanded the data set to now include more than 38 million unique links with new aggregated information to help academic researchers analyze how many people saw these links on Facebook and how they interacted with that content – including views, clicks, shares, likes, and other reactions. We’ve also aggregated these shares by age, gender, country, and month. And, we have expanded the time frame covered by the data from January 2017 – February 2019 to January 2017 – August 2019.
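The paragraph above implies a table keyed by URL and demographic cell. The snippet below is a hypothetical illustration of that shape only; the column names and figures are invented for the example and are not Facebook’s actual schema.

```python
import pandas as pd

# Hypothetical rows of an aggregated URL-shares table; all values are invented.
url_shares = pd.DataFrame([
    {"url": "https://example.com/story", "country": "US", "age_bucket": "25-34",
     "gender": "F", "month": "2019-03", "views": 1_250_000, "clicks": 84_000,
     "shares": 12_500, "likes": 30_100},
    {"url": "https://example.com/story", "country": "BR", "age_bucket": "18-24",
     "gender": "M", "month": "2019-03", "views": 640_000, "clicks": 21_000,
     "shares": 4_800, "likes": 9_700},
])

# Researchers work with rows aggregated well above the individual level,
# e.g. total views and clicks per country for a given link.
print(url_shares.groupby("country")[["views", "clicks"]].sum())
```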

With this data, researchers will be able to understand important aspects of how social media shapes our world. They’ll be able to make progress on the research questions they proposed, such as “how to characterize mainstream and non-mainstream online news sources in social media” and “studying polarization, misinformation, and manipulation across multiple platforms and the larger information ecosystem.”

In addition to the data set of URLs, researchers will continue to have access to CrowdTangle and Facebook’s Ad Library API to augment their analyses. Per the original plan for this project, outside of a limited review to ensure that no confidential or user data is inadvertently released, these researchers will be able to publish their findings without approval from Facebook.

We are sharing this data with researchers while continuing to prioritize the privacy of people who use our services. This new data set, like the data we released before it, is protected by a method known as differential privacy. Researchers have access to data tables from which they can learn about aggregated groups, but where they cannot identify any individual user. As Harvard University’s Privacy Tools project puts it:

“The guarantee of a differentially private algorithm is that its behavior hardly changes when a single individual joins or leaves the dataset — anything the algorithm might output on a database containing some individual’s information is almost as likely to have come from a database without that individual’s information. … This gives a formal guarantee that individual-level information about participants in the database is not leaked.” …(More)”
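To make that guarantee concrete, here is a minimal sketch of the textbook Laplace mechanism for releasing a single aggregated count. It illustrates the general idea only; the post does not describe Facebook’s actual mechanism, and the epsilon value and example figures here are assumptions.

```python
import numpy as np

def dp_count(true_count: int, epsilon: float = 0.5) -> float:
    """Release a count with Laplace noise calibrated to sensitivity 1.

    Adding or removing one person changes the true count by at most 1,
    so noise drawn from Laplace(0, 1/epsilon) gives epsilon-differential privacy.
    """
    noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

# Example (invented figure): noisy number of users in one demographic cell
# who clicked a given URL.
true_clicks = 1_204
print(dp_count(true_clicks, epsilon=0.5))
```

Because every released number carries noise of this kind, aggregate patterns remain visible while any single user’s presence or absence has almost no effect on the output.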

This emoji could mean your suicide risk is high, according to AI


Rebecca Ruiz at Mashable: “Since its founding in 2013, the free mental health support service Crisis Text Line has focused on using data and technology to better aid those who reach out for help. 

Unlike helplines that offer assistance based on the order in which users dialed, texted, or messaged, Crisis Text Line has an algorithm that determines who is in most urgent need of counseling. The nonprofit is particularly interested in learning which emoji and words texters use when their suicide risk is high, so as to quickly connect them with a counselor. Crisis Text Line just released new insights about those patterns. 

Based on its analysis of 129 million messages processed between 2013 and the end of 2019, the nonprofit found that the pill emoji, or 💊, was 4.4 times more likely to end in a life-threatening situation than the word “suicide.” 

Other words that indicate imminent danger include 800mg, acetaminophen, Excedrin, and antifreeze; those are two to three times more likely than the word “suicide” to involve an active rescue of the texter. The loudly crying face emoji, or 😭, is similarly high-risk. In general, the words that trigger the greatest alarm suggest the texter has a method or plan to attempt suicide or may be in the process of taking their own life. …(More)”.
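Crisis Text Line has not published its model, so the sketch below is only a hypothetical illustration of the triage idea the article describes: weight incoming messages by the relative risk of the terms they contain and serve the highest-scoring texters first. The 4.4 figure comes from the article; the other weights, the term list, and the function names are invented.

```python
# Hypothetical term weights expressed as risk relative to the word "suicide".
# Only the pill-emoji figure (4.4) is reported in the article; the rest are placeholders.
RELATIVE_RISK = {
    "\U0001F48A": 4.4,     # pill emoji
    "\U0001F62D": 3.0,     # loudly crying face (placeholder value)
    "800mg": 3.0,
    "acetaminophen": 2.5,
    "antifreeze": 2.5,
    "suicide": 1.0,
}

def risk_score(message: str) -> float:
    """Sum the relative-risk weights of any flagged terms found in the message."""
    text = message.lower()
    return sum(weight for term, weight in RELATIVE_RISK.items() if term in text)

def triage(queue: list[str]) -> list[str]:
    """Order waiting texters by descending risk score instead of arrival time."""
    return sorted(queue, key=risk_score, reverse=True)

print(triage(["feeling low today", "just took 800mg \U0001F48A"]))
```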

Our personal health history is too valuable to be harvested by the tech giants


Eerke Boiten at The Guardian: “…It is clear that the black box society does not only feed on internet surveillance information. Databases collected by public bodies are becoming more and more part of the dark data economy. Last month, it emerged that a data broker in receipt of the UK’s national pupil database had shared its access with gambling companies. This is likely to be the tip of the iceberg; even where initial recipients of shared data might be checked and vetted, it is much harder to oversee who the data is passed on to from there.

Health data, the rich population-wide information held within the NHS, is another such example. Pharmaceutical companies and internet giants have been eyeing the NHS’s extensive databases for commercial exploitation for many years. Google infamously claimed it could save 100,000 lives if only it had free rein with all our health data. If there really is such value hidden in NHS data, do we really want Google to extract it to sell it to us? Google still holds health data that its subsidiary DeepMind Health obtained illegally from the NHS in 2016.

Although many health data-sharing schemes, such as those in the NHS’s register of approved data releases, are said to be “anonymised”, this offers only a limited guarantee against abuse.

There is just too much information included in health data that points to other aspects of patients’ lives and existence. If recipients of anonymised health data want to use it to re-identify individuals, they will often be able to do so by combining it, for example, with publicly available information. That this would be illegal under UK data protection law is a small consolation as it would be extremely hard to detect.
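As a concrete illustration of how such re-identification can work, the toy sketch below joins an “anonymised” health extract to a public record on shared quasi-identifiers. All names, columns and values are invented; real attacks use the same join logic at far larger scale.

```python
import pandas as pd

# Toy data: an "anonymised" health extract and a public source that share
# quasi-identifiers (postcode district, birth year, sex). All values are invented.
health = pd.DataFrame({
    "postcode_district": ["LE1", "LE1", "CT2"],
    "birth_year": [1975, 1984, 1975],
    "sex": ["F", "M", "F"],
    "diagnosis": ["diabetes", "asthma", "hypertension"],
})
public = pd.DataFrame({
    "name": ["A. Smith", "B. Jones"],
    "postcode_district": ["LE1", "CT2"],
    "birth_year": [1975, 1975],
    "sex": ["F", "F"],
})

# A plain join on the shared attributes attaches names to diagnoses whenever
# the combination of quasi-identifiers is unique in both tables.
reidentified = public.merge(health, on=["postcode_district", "birth_year", "sex"])
print(reidentified[["name", "diagnosis"]])
```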

It is clear that providing access to public organisations’ data for research purposes can serve the greater good and it is unrealistic to expect bodies such as the NHS to keep this all in-house.

However, there are other methods by which to do this, beyond the sharing of anonymised databases. CeLSIUS, for example, a physical facility where researchers can interrogate data under tightly controlled conditions for specific registered purposes, holds UK census information over many years.

These arrangements prevent abuse such as deanonymisation, avoid the problem of shared data being passed on to third parties, and ensure complete transparency in the use of the data. Online analogues of such set-ups do not yet exist, but that is where the future of safe and transparent access to sensitive data lies….(More)”.

Self-interest and data protection drive the adoption and moral acceptability of big data technologies: A conjoint analysis approach


Paper by Rabia I. Kodapanakkal, Mark J. Brandt, Christoph Kogler, and Ilja van Beest: “Big data technologies have both benefits and costs which can influence their adoption and moral acceptability. Prior studies look at people’s evaluations in isolation without pitting costs and benefits against each other. We address this limitation with a conjoint experiment (N = 979), using six domains (criminal investigations, crime prevention, citizen scores, healthcare, banking, and employment), where we simultaneously test the relative influence of four factors (the status quo, outcome favorability, data sharing, and data protection) on decisions to adopt the technologies and on perceptions of their moral acceptability.

We present two key findings. (1) People adopt technologies more often when data is protected and when outcomes are favorable. They place equal or more importance on data protection in all domains except healthcare, where outcome favorability has the strongest influence. (2) Data protection is the strongest driver of moral acceptability in all domains except healthcare, where the strongest driver is outcome favorability. Additionally, sharing data lowers preference for all technologies, but has a relatively smaller influence. People do not show a status quo bias in the adoption of technologies. When evaluating moral acceptability, people show a status quo bias, but this is driven by the citizen scores domain. Differences across domains arise from differences in the magnitude of the effects, but the effects are in the same direction. Taken together, these results highlight that people are not always primarily driven by self-interest and do place importance on potential privacy violations. They also challenge the assumption that people generally prefer the status quo….(More)”.
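For readers unfamiliar with the method, a conjoint design varies several attributes at once and infers each attribute’s relative influence from the choices people make. The sketch below simulates that logic with a simple logistic regression on synthetic choices; the variable names, coefficients and data are assumptions for illustration, not the authors’ materials or analysis code.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 1000
# Binary attributes of each hypothetical technology profile shown to a respondent.
data_protection = rng.integers(0, 2, n)
favorable_outcome = rng.integers(0, 2, n)
data_shared = rng.integers(0, 2, n)
status_quo = rng.integers(0, 2, n)

# Simulated adoption choices in which protection and favorable outcomes matter most
# and the status quo has no effect (invented coefficients).
logit = -0.5 + 1.2 * data_protection + 1.0 * favorable_outcome - 0.4 * data_shared
adopt = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X = sm.add_constant(np.column_stack(
    [data_protection, favorable_outcome, data_shared, status_quo]))
model = sm.Logit(adopt, X).fit(disp=0)
print(model.params)  # coefficient size ~ relative influence of each factor
```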

Twitter might have a better read on floods than NOAA


Interview by Justine Calma: “Frustrated tweets led scientists to believe that tidal floods along the East Coast and Gulf Coast of the US are more annoying than official tide gauges suggest. Half a million geotagged tweets showed researchers that people were talking about disruptively high waters even when government gauges hadn’t recorded tide levels high enough to be considered a flood.

Capturing these reactions on social media can help authorities better understand and address the more subtle, insidious ways that climate change is playing out in peoples’ daily lives. Coastal flooding is becoming a bigger problem as sea levels rise, but a study published recently in the journal Nature Communications suggests that officials aren’t doing a great job of recording that.

The Verge spoke with Frances Moore, lead author of the new study and a professor at the University of California, Davis. This isn’t the first time that she’s turned to Twitter for her climate research. Her previous research also found that people tend to stop reacting to unusual weather after dealing with it for a while — sometimes in as little as two years. Similar data from Twitter has been used to study how people coped with earthquakes and hurricanes…(More)”.
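The underlying comparison is straightforward to sketch: count geotagged tweets that mention flooding and flag days when the tweet volume is high but the local tide gauge never crossed its official flood threshold. The snippet below is a rough, hypothetical outline of that logic; the term list, threshold and data structures are assumptions, not the study’s actual pipeline.

```python
from collections import Counter

FLOOD_TERMS = ("flood", "flooded", "high tide", "road underwater")

def flood_tweet_days(tweets):
    """tweets: iterable of (date, text) pairs already filtered to one coastal city."""
    days = Counter()
    for date, text in tweets:
        if any(term in text.lower() for term in FLOOD_TERMS):
            days[date] += 1
    return days

def unrecorded_flood_days(tweet_days, gauge_readings, threshold_m=1.5, min_tweets=20):
    """Days with many flood tweets but a gauge reading below the flood threshold.

    gauge_readings: dict mapping date -> maximum recorded water level in metres.
    """
    return [d for d, n in tweet_days.items()
            if n >= min_tweets and gauge_readings.get(d, 0.0) < threshold_m]
```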

NGOs embrace GDPR, but will it be used against them?


Report by Vera Franz et al: “When the world’s most comprehensive digital privacy law – the EU General Data Protection Regulation (GDPR) – took effect in May 2018, media and tech experts focused much of their attention on how corporations, who hold massive amounts of data, would be affected by the law.

This focus was understandable, but it left some important questions under-examined, specifically about non-profit organizations that operate in the public’s interest. How would non-governmental organizations (NGOs) be impacted? What does GDPR compliance mean in very practical terms for NGOs? What are the challenges they are facing? Could the GDPR be ‘weaponized’ against NGOs and, if so, how? What good compliance practices can be shared among non-profits?

Ben Hayes and Lucy Hannah from Data Protection Support & Management and I have examined these questions in detail and released our findings in this report.

Our key takeaway: GDPR compliance is an integral part of organisational resilience, and it requires resources and attention from NGO leaders, foundations and regulators to defend their organisations against attempts by governments and corporations to misuse the GDPR against them.

In a political climate where human rights and social justice groups are under increasing pressure, GDPR compliance needs to be given the attention it deserves by NGO leaders and funders. Lack of compliance will attract enforcement action by data protection regulators and create opportunities for retaliation by civil society adversaries.

At the same time, since the law came into force, we recognise that some NGOs have over-complied with the law, possibly diverting scarce resources and hampering operations.

For example, during our research, we discovered a small NGO that undertook an advanced and resource-intensive compliance process (a Data Protection Impact Assessment or DPIA) for all processing operations. DPIAs are only required for large-scale and high-risk processing of personal data. Yet this NGO, which holds very limited personal data and undertakes no marketing or outreach activities, engaged in this complex and time-consuming assessment because the organization was under enormous pressure from their government. They told us they “wanted to do everything possible to avoid attracting attention.”…

Our research also found that private companies, individuals and governments who oppose the work of an organisation have used GDPR to try to keep NGOs from publishing their work. To date, NGOs have successfully fought against this misuse of the law….(More)“.

The many perks of using critical consumer user data for social benefit


Sushant Kumar at LiveMint: “Business models that thrive on user data have created profitable global technology companies. For comparison, the combined market capitalization of just three tech companies, Google (Alphabet), Facebook and Amazon, is higher than the total market capitalization of all listed firms in India. Almost 98% of Facebook’s revenue and 84% of Alphabet’s come from serving targeted advertising powered by data collected from users. No doubt, these tech companies provide valuable services to consumers. It is also true that profits are concentrated with private corporations and that the societal value for contributors of data, that is, the user, can be much more significant….

In the existing economic construct, private firms are able to deploy top scientists and sophisticated analytical tools to collect data, derive value and monetize the insights.

Imagine if personalization at this scale was available for more meaningful outcomes, such as for administering personalized treatment for diabetes, recommending crop patterns, optimizing water management and providing access to credit to the unbanked. These socially beneficial applications of data can generate undisputedly massive value.

However, handling critical data with accountability to prevent misuse is a complex and expensive task. What’s more, private sector players do not have any incentives to share the data they collect. These challenges can be resolved by setting up specialized entities that can manage data—collect, analyse, provide insights, manage consent and access rights. These entities would function as trusted intermediaries with a public purpose, and may be named “data stewards”….(More)”.

See also: http://datastewards.net/ and https://datacollaboratives.org/

An Algorithm That Grants Freedom, or Takes It Away


Cade Metz and Adam Satariano at The New York Times: “…In Philadelphia, an algorithm created by a professor at the University of Pennsylvania has helped dictate the experience of probationers for at least five years.

The algorithm is one of many making decisions about people’s lives in the United States and Europe. Local authorities use so-called predictive algorithms to set police patrols, prison sentences and probation rules. In the Netherlands, an algorithm flagged welfare fraud risks. A British city rates which teenagers are most likely to become criminals.

Nearly every state in America has turned to this new sort of governance algorithm, according to the Electronic Privacy Information Center, a nonprofit dedicated to digital rights. Algorithm Watch, a watchdog in Berlin, has identified similar programs in at least 16 European countries.

As the practice spreads into new places and new parts of government, United Nations investigators, civil rights lawyers, labor unions and community organizers have been pushing back.

They are angered by a growing dependence on automated systems that are taking humans and transparency out of the process. It is often not clear how the systems are making their decisions. Is gender a factor? Age? ZIP code? It’s hard to say, since many states and countries have few rules requiring that algorithm-makers disclose their formulas.

They also worry that the biases — involving race, class and geography — of the people who create the algorithms are being baked into these systems, as ProPublica has reported. In San Jose, Calif., where an algorithm is used during arraignment hearings, an organization called Silicon Valley De-Bug interviews the family of each defendant, takes this personal information to each hearing and shares it with defenders as a kind of counterbalance to algorithms.

Two community organizers, the Media Mobilizing Project in Philadelphia and MediaJustice in Oakland, Calif., recently compiled a nationwide database of prediction algorithms. And Community Justice Exchange, a national organization that supports community organizers, is distributing a 50-page guide that advises organizers on how to confront the use of algorithms.

The algorithms are supposed to reduce the burden on understaffed agencies, cut government costs and — ideally — remove human bias. Opponents say governments haven’t shown much interest in learning what it means to take humans out of the decision making. A recent United Nations report warned that governments risked “stumbling zombie-like into a digital-welfare dystopia.”…(More)”.