Big Mind: How Collective Intelligence Can Change Our World


Book by Geoff Mulgan: “A new field of collective intelligence has emerged in the last few years, prompted by a wave of digital technologies that make it possible for organizations and societies to think at large scale. This “bigger mind”—human and machine capabilities working together—has the potential to solve the great challenges of our time. So why do smart technologies not automatically lead to smart results? Gathering insights from diverse fields, including philosophy, computer science, and biology, Big Mind reveals how collective intelligence can guide corporations, governments, universities, and societies to make the most of human brains and digital technologies.

Geoff Mulgan explores how collective intelligence has to be consciously organized and orchestrated in order to harness its powers. He looks at recent experiments mobilizing millions of people to solve problems, and at groundbreaking technology like Google Maps and Dove satellites. He also considers why organizations full of smart people and machines can make foolish mistakes—from investment banks losing billions to intelligence agencies misjudging geopolitical events—and shows how to avoid them.

Highlighting differences between environments that stimulate intelligence and those that blunt it, Mulgan shows how human and machine intelligence could solve challenges in business, climate change, democracy, and public health. But for that to happen we’ll need radically new professions, institutions, and ways of thinking.

Informed by the latest work on data, web platforms, and artificial intelligence, Big Mind shows how collective intelligence could help us survive and thrive….(More)”

The Digital Footprint of Europe’s Refugees


Pew Research Center: “Migrants leaving their homes for a new country often carry a smartphone to communicate with family that may have stayed behind and to help them find border crossings, useful information about their journey, and details about their destination. The digital footprints left by online searches can provide insight into the movement of migrants as they transit between countries and settle in new locations, according to a new Pew Research Center analysis of refugee flows between the Middle East and Europe.

Refugees from just two Middle Eastern countries — Syria and Iraq — made up a combined 38% of the record 1.3 million people who arrived and applied for asylum in the European Union, Norway and Switzerland in 2015 and a combined 37% of the 1.2 million first-time asylum applications in 2016. Most Syrian and Iraqi refugees during this period crossed from Turkey to Greece by sea, before continuing on to their final destinations in Europe.

Since many refugees from Syria and Iraq speak Arabic as their native, if not only, language, it is possible to identify key moments in their migration by examining trends in internet searches conducted in Turkey using Arabic, as opposed to the dominant Turkic languages in that country. For example, Turkey-based searches for the word “Greece” in Arabic closely mirror 2015 and 2016 fluctuations in the number of refugees crossing the Aegean Sea to Greece. The searches also provide a window into how migrants planned to move across borders — for example, the search term “Greece” was often combined with “smuggler.” In addition, an hourly analysis of searches in Turkey shows spikes in the search term “Greece” during early morning hours, a typical time for migrants making their way across the Mediterranean.

Comparing online searches with migration data

This report’s analysis compares data from internet searches with government and international agency data on refugee arrivals and asylum applications in Europe from 2015 and 2016. Internet searches were captured from Google Trends, a publicly available analytical tool that standardizes search volume by language and location over time. The analysis examines searches in Arabic, conducted in Turkey and Germany, for selected words such as “Greece” or “German” that can be linked to migration patterns. For a complete list of search terms employed, see the methodology. Google releases hourly, daily and weekly search data.

Google does not release the actual number of searches conducted but provides a metric capturing the relative change in searches over a specified time period. The metric ranges from 0 to 100 and indicates low- or high-volume search activity for the time period. Predicting or deciphering human behavior from internet searches has limitations and remains experimental. But internet search data does offer a potentially promising way to explore migration flows across international borders.
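The 0-to-100 metric described above is a relative index: the period with the highest search volume is set to 100 and every other period is scaled against it. A minimal sketch of that normalization (the function name and numbers are illustrative, not from the report or Google's code):

```python
# Sketch of Google Trends-style normalization: scale a series of raw
# search volumes so the peak period equals 100 and the rest are relative.
from typing import List

def trends_scale(volumes: List[float]) -> List[int]:
    """Return each period's volume as a 0-100 index relative to the peak."""
    peak = max(volumes)
    if peak == 0:
        return [0] * len(volumes)
    return [round(v / peak * 100) for v in volumes]

# Made-up weekly volumes for an Arabic-language search term from Turkey
weekly = [120, 340, 890, 450, 220]
print(trends_scale(weekly))  # the peak week (890) becomes 100
```

This is why the index can only be compared within one query: the same raw volume maps to different index values depending on the peak in the window requested.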

Migration data cited in this report come from two sources. The first is the United Nations High Commissioner for Refugees (UNHCR), which provides data on new arrivals into Greece on a monthly basis. The second is first-time asylum applications from Eurostat, Europe’s statistical agency. Since both Syrian and Iraqi asylum seekers have had fairly high acceptance rates in Europe, it is likely that most Syrian and Iraqi migrants entering during 2015 and 2016 were counted by UNHCR and applied for asylum with European authorities.

The unique circumstances of this Syrian and Iraqi migration — the technology used by refugees, the large and sudden movement of people, and the language groups in transit and destination countries — present a rare opportunity to integrate the analysis of online searches and migration data. The conditions that permit this type of analysis may not apply in other circumstances where migrants are moving between countries….(More)”

The final Global Open Data Index is now live


Open Knowledge International: “The updated Global Open Data Index has been published today, along with our report on the state of Open Data this year. The report includes a broad overview of the problems we found around data publication and how we can improve government open data. You can download the full report here.

Also, after the Public Dialogue phase, we have updated the Index. You can see the updated edition here.

We will also keep our forum open for discussions about open data quality and publication. You can see the conversation here.”

Inside the Algorithm That Tries to Predict Gun Violence in Chicago


Gun violence in Chicago has surged since late 2015, and much of the news media attention on how the city plans to address this problem has focused on the Strategic Subject List, or S.S.L.

The list is made by an algorithm that tries to predict who is most likely to be involved in a shooting, either as perpetrator or victim. The algorithm is not public, but the city has now placed a version of the list — without names — online through its open data portal, making it possible for the first time to see how Chicago evaluates risk.

We analyzed that information and found that the assigned risk scores — and what characteristics go into them — are sometimes at odds with the Chicago Police Department’s public statements and cut against some common perceptions.

■ Violence in the city is less concentrated at the top — among a group of about 1,400 people with the highest risk scores — than some public comments from the Chicago police have suggested.

■ Gangs are often blamed for the devastating increase in gun violence in Chicago, but gang membership had a small predictive effect and is being dropped from the most recent version of the algorithm.

■ Being a victim of a shooting or an assault is far more predictive of future gun violence than being arrested on charges of domestic violence or weapons possession.

■ The algorithm has been used in Chicago for several years, and its effectiveness is far from clear. Chicago accounted for a large share of the increase in urban murders last year….(More)”.

Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are


Book by Seth Stephens-Davidowitz: “Blending the informed analysis of The Signal and the Noise with the instructive iconoclasm of Think Like a Freak, a fascinating, illuminating, and witty look at what the vast amounts of information now instantly available to us reveal about ourselves and our world—provided we ask the right questions.

By the end of an average day in the early twenty-first century, human beings searching the internet will amass eight trillion gigabytes of data. This staggering amount of information—unprecedented in history—can tell us a great deal about who we are—the fears, desires, and behaviors that drive us, and the conscious and unconscious decisions we make. From the profound to the mundane, we can gain astonishing knowledge about the human psyche that, less than twenty years ago, seemed unfathomable.

Everybody Lies offers fascinating, surprising, and sometimes laugh-out-loud insights into everything from economics to ethics to sports to race to sex, gender and more, all drawn from the world of big data. What percentage of white voters didn’t vote for Barack Obama because he’s black? Does where you go to school affect how successful you are in life? Do parents secretly favor boy children over girls? Do violent films affect the crime rate? Can you beat the stock market? How regularly do we lie about our sex lives and who’s more self-conscious about sex, men or women?

Investigating these questions and a host of others, Seth Stephens-Davidowitz offers revelations that can help us understand ourselves and our lives better. Drawing on studies and experiments on how we really live and think, he demonstrates in fascinating and often funny ways the extent to which all the world is indeed a lab. With conclusions ranging from strange-but-true to thought-provoking to disturbing, he explores the power of this digital truth serum and its deeper potential—revealing biases deeply embedded within us, information we can use to change our culture, and the questions we’re afraid to ask that might be essential to our health—both emotional and physical. All of us are touched by big data every day, and its influence is multiplying. Everybody Lies challenges us to think differently about how we see it and the world…(More)”.

How to Track What Congress Is Doing on the Internet


Louise Matsakis at Motherboard: “There’s now a way to track what government employees, including elected officials, are doing online during working hours.

A new plugin created by a software engineer in North Carolina lets website administrators monitor when someone accesses their site from an IP address associated with the federal government. It was created in part to protest a piece of legislation the president signed earlier this year.
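The article doesn't quote the plugin's source, but the general approach is straightforward: compare each visitor's IP address against a list of network blocks assigned to federal agencies. A minimal sketch under that assumption (the CIDR blocks below are illustrative placeholders, not an authoritative list of government ranges):

```python
# Hypothetical sketch: flag site visits originating from IP ranges
# assumed to belong to federal agencies. The blocks listed here are
# examples for illustration only, not a verified government IP list.
import ipaddress

FEDERAL_RANGES = [
    ipaddress.ip_network("143.231.0.0/16"),  # example block (assumed)
    ipaddress.ip_network("137.18.0.0/16"),   # example block (assumed)
]

def is_federal(ip: str) -> bool:
    """Return True if the address falls inside any listed federal range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in FEDERAL_RANGES)

print(is_federal("143.231.12.5"))  # True: inside the first example block
print(is_federal("8.8.8.8"))       # False: outside every listed block
```

The same range-matching idea powered earlier projects like bots that tracked anonymous Wikipedia edits made from government networks.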

In April, President Trump signed a measure allowing internet service providers (ISPs) to sell sensitive information about your online habits without needing your consent, rolling back Obama-era regulations intended to stop that very thing from happening.

Corporations like Verizon and AT&T hated the regulations (and spent a boatload lobbying against them), because they made it difficult to monetize the mountain of customer data they have the ability to collect.

Consumers, on the other hand, were outraged, and wondered what could be done to get back at the lawmakers who voted in favor of the measure. One appealing suggestion was to buy those lawmakers’ browsing histories and release them to the public.

Almost immediately, a handful of GoFundMe pages dedicated to raising money for the cause popped up. While the campaigns are well-intentioned, what their creators don’t realize is that what they want to do is illegal. The Telecommunications Act prohibits sharing (or selling) customer information that is “individually identifiable,” except under special circumstances.

In other words, there’s no database where you can purchase your Congressman’s online porn habits and there likely won’t be anytime soon, even with the data-collection regulations dismantled.

But a new tool created by Matt Feld, the founder of several nonprofits including Speak Together, could help the public get a sense of what elected officials are up to online….(More)”

6 Jurisdictions Tackling Homelessness with Technology


In Government Technology: “Public servants who work to reduce homelessness often have similar lists of challenges.

The most common of these are data sharing between groups involved with the homeless, the ability to track interactions between individuals and outreach providers, and a system that makes it easier to enter information about the population. Recently, we spoke with more than a half-dozen government officials who are involved with the homeless, and while obstacles and conditions varied among cities, all agreed that their work would be much easier with better tech-based solutions for the problems cited above.

These officials, however, were uniformly optimistic that such solutions were becoming more readily available — solutions with potential to solve the logistical hurdles that most often hamstring government, community and nonprofit efforts to help the homeless find jobs, residences and medical care. Some agencies, in fact, have already had success implementing tech as components in larger campaigns, while others are testing new platforms that may bolster organization and efficiency.

Below are a few brief vignettes that detail some — but far from all — ongoing governmental efforts to use tech to aid the homeless and reduce homelessness.

1. BERGEN COUNTY, N.J.

One of the best examples of government using tech to address homelessness can be found in Bergen County, N.J., where officials recently certified their jurisdiction as first in the nation to end chronic homelessness. READ MORE

2. AURORA, COLO.

Aurora, Colo., in the Denver metropolitan area, uses the Homeless Management Information System required by the U.S. Department of Housing and Urban Development, but those involved with addressing homelessness there have also developed tech-based efforts that are specifically tailored to the area’s needs. READ MORE

4. NEW YORK CITY

New York City is rolling out an app called StreetSmart, which enables homelessness outreach workers in all five boroughs to communicate and log data seamlessly in real time while in the field. With StreetSmart, these workers will be able to enter that information into a single citywide database as they collect it. READ MORE …(Full article)

Inspecting Algorithms for Bias


Matthias Spielkamp at MIT Technology Review: “It was a striking story. “Machine Bias,” the headline read, and the teaser proclaimed: “There’s software used across the country to predict future criminals. And it’s biased against blacks.”

ProPublica, a Pulitzer Prize–winning nonprofit news organization, had analyzed risk assessment software known as COMPAS. It is being used to forecast which criminals are most likely to ­reoffend. Guided by such forecasts, judges in courtrooms throughout the United States make decisions about the future of defendants and convicts, determining everything from bail amounts to sentences. When ProPublica compared COMPAS’s risk assessments for more than 10,000 people arrested in one Florida county with how often those people actually went on to reoffend, it discovered that the algorithm “correctly predicted recidivism for black and white defendants at roughly the same rate.”…

After ProPublica’s investigation, Northpointe, the company that developed COMPAS, disputed the story, arguing that the journalists misinterpreted the data. So did three criminal-justice researchers, including one from a justice-reform organization. Who’s right—the reporters or the researchers? Krishna Gummadi, head of the Networked Systems Research Group at the Max Planck Institute for Software Systems in Saarbrücken, Germany, offers a surprising answer: they all are.

Gummadi, who has extensively researched fairness in algorithms, says ProPublica’s and Northpointe’s results don’t contradict each other. They differ because they use different measures of fairness.

Imagine you are designing a system to predict which criminals will reoffend. One option is to optimize for “true positives,” meaning that you will identify as many people as possible who are at high risk of committing another crime. One problem with this approach is that it tends to increase the number of false positives: people who will be unjustly classified as likely reoffenders. The dial can be adjusted to deliver as few false positives as possible, but that tends to create more false negatives: likely reoffenders who slip through and get a more lenient treatment than warranted.

Raising the incidence of true positives or lowering the false positives are both ways to improve a statistical measure known as positive predictive value, or PPV. That is the percentage of all positives that are true….
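The competing fairness measures in the ProPublica/Northpointe dispute all fall out of the same confusion matrix. A small illustrative sketch (not COMPAS itself; the counts are made up) showing how PPV, the false positive rate, and the false negative rate are computed from the four prediction/outcome counts:

```python
# Illustrative sketch: fairness-related rates from a confusion matrix.
# tp = correctly flagged reoffenders, fp = non-reoffenders flagged high risk,
# tn = correctly cleared, fn = reoffenders who slipped through.
def rates(tp: int, fp: int, tn: int, fn: int) -> dict:
    return {
        "ppv": tp / (tp + fp),                  # share of positives that are true
        "false_positive_rate": fp / (fp + tn),  # people unjustly flagged
        "false_negative_rate": fn / (fn + tp),  # reoffenders treated leniently
    }

# Made-up counts purely for illustration
print(rates(tp=60, fp=40, tn=80, fn=20))
```

Equalizing PPV across two groups does not force their false positive rates to match, and vice versa, which is exactly why both sides of the dispute could be right by their own chosen measure.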

But if we accept that algorithms might make life fairer if they are well designed, how can we know whether they are so designed?

Democratic societies should be working now to determine how much transparency they expect from ADM systems. Do we need new regulations of the software to ensure it can be properly inspected? Lawmakers, judges, and the public should have a say in which measures of fairness get prioritized by algorithms. But if the algorithms don’t actually reflect these value judgments, who will be held accountable?

These are the hard questions we need to answer if we expect to benefit from advances in algorithmic technology…(More)”.

Slave to the Algorithm? Why a ‘Right to Explanation’ is Probably Not the Remedy You are Looking for


Paper by Lilian Edwards and Michael Veale: “Algorithms, particularly of the machine learning (ML) variety, are increasingly consequential to individuals’ lives but have caused a range of concerns evolving mainly around unfairness, discrimination and opacity. Transparency in the form of a “right to an explanation” has emerged as a compellingly attractive remedy since it intuitively presents as a means to “open the black box”, hence allowing individual challenge and redress, as well as possibilities to foster accountability of ML systems. In the general furore over algorithmic bias and other issues laid out in section 2, any remedy in a storm has looked attractive.

However, we argue that a right to an explanation in the GDPR is unlikely to be a complete remedy to algorithmic harms, particularly in some of the core “algorithmic war stories” that have shaped recent attitudes in this domain. We present several reasons for this conclusion. First (section 3), the law is restrictive on when any explanation-related right can be triggered, and in many places is unclear, or even seems paradoxical. Second (section 4), even were some of these restrictions to be navigated, the way that explanations are conceived of legally — as “meaningful information about the logic of processing” — is unlikely to be provided by the kind of ML “explanations” computer scientists have been developing. ML explanations are restricted both by the type of explanation sought, the multi-dimensionality of the domain and the type of user seeking an explanation. However (section 5) “subject-centric” explanations (SCEs), which restrict explanations to particular regions of a model around a query, show promise for interactive exploration, as do pedagogical rather than decompositional explanations in dodging developers’ worries of IP or trade secrets disclosure.
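A subject-centric explanation restricts itself to the model's behavior near one query point. A toy sketch of the idea, fitting a simple linear surrogate to an opaque model only in the neighborhood of the query (the black-box function and all names are illustrative, not from the paper):

```python
# Toy "subject-centric" explanation: approximate a black-box model with a
# linear surrogate fitted only on perturbations around one query point,
# then read off the local feature weights.
import numpy as np

def black_box(X: np.ndarray) -> np.ndarray:
    # Stand-in for an opaque ML model: a nonlinear score
    return X[:, 0] ** 2 + 3 * X[:, 1]

def local_explanation(query: np.ndarray, radius: float = 0.1, n: int = 500):
    rng = np.random.default_rng(0)
    # Sample perturbations in a small box around the query
    X = query + rng.uniform(-radius, radius, size=(n, query.size))
    y = black_box(X)
    # Fit a local linear surrogate: y ~ w . x + b
    A = np.hstack([X, np.ones((n, 1))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1]  # local feature weights (b is dropped)

weights = local_explanation(np.array([2.0, 1.0]))
print(weights)  # near x0=2 the slope of x0**2 is about 4; x1's weight is 3
```

The surrogate says nothing about the model globally; it only explains the region the data subject actually occupies, which is the trade-off the paper highlights.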

As an interim conclusion then, while convinced that recent research in ML explanations shows promise, we fear that the search for a “right to an explanation” in the GDPR may be at best distracting, and at worst nurture a new kind of “transparency fallacy”. However, in our final section, we argue that other parts of the GDPR related (i) to other individual rights including the right to erasure (“right to be forgotten”) and the right to data portability and (ii) to privacy by design, Data Protection Impact Assessments and certification and privacy seals, may have the seeds of building a better, more respectful and more user-friendly algorithmic society….(More)”

Facebook Disaster Maps


Molly Jackman et al at Facebook: “After a natural disaster, humanitarian organizations need to know where affected people are located, what resources are needed, and who is safe. This information is extremely difficult and often impossible to capture through conventional data collection methods in a timely manner. As more people connect and share on Facebook, our data is able to provide insights in near-real time to help humanitarian organizations coordinate their work and fill crucial gaps in information during disasters. This morning we announced a Facebook disaster map initiative to help organizations address the critical gap in information they often face when responding to natural disasters.

Facebook disaster maps provide information about where populations are located, how they are moving, and where they are checking in safe during a natural disaster. All data is de-identified and aggregated to a 360 square meter tile or local administrative boundaries (e.g. census boundaries). [1]
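Aggregating to coarse tiles is a common de-identification step: individual coordinates are snapped to a grid cell and only per-cell counts are retained. A hypothetical sketch of the idea (the tile math below is an assumption for illustration, not Facebook's code):

```python
# Hypothetical sketch of spatial aggregation for de-identification:
# snap each coordinate to a ~360 m grid cell and keep only tile counts.
from collections import Counter

TILE_METERS = 360
METERS_PER_DEG_LAT = 111_320  # rough approximation; ignores longitude scaling

def tile_id(lat: float, lon: float) -> tuple:
    step = TILE_METERS / METERS_PER_DEG_LAT  # tile size in degrees (approx.)
    return (int(lat // step), int(lon // step))

# Aggregate individual check-ins into anonymous per-tile counts
checkins = [(37.7749, -122.4194), (37.7751, -122.4196), (40.7128, -74.0060)]
counts = Counter(tile_id(lat, lon) for lat, lon in checkins)
print(counts)  # the two nearby points fall into the same tile
```

Once only tile-level counts leave the pipeline, no individual's location survives in the published maps, which is the privacy property the post goes on to describe.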

This blog describes the disaster maps datasets, how insights are calculated, and the steps taken to ensure that we’re preserving privacy….(More)”.