What statistics can and can’t tell us about ourselves


Hannah Fry at The New Yorker: “Harold Eddleston, a seventy-seven-year-old from Greater Manchester, was still reeling from a cancer diagnosis he had been given that week when, on a Saturday morning in February, 1998, he received the worst possible news. He would have to face the future alone: his beloved wife had died unexpectedly, from a heart attack.

Eddleston’s daughter, concerned for his health, called their family doctor, a well-respected local man named Harold Shipman. He came to the house, sat with her father, held his hand, and spoke to him tenderly. Pushed for a prognosis as he left, Shipman replied portentously, “I wouldn’t buy him any Easter eggs.” By Wednesday, Eddleston was dead; Dr. Shipman had murdered him.

Harold Shipman was one of the most prolific serial killers in history. In a twenty-three-year career as a mild-mannered and well-liked family doctor, he injected at least two hundred and fifteen of his patients with lethal doses of opiates. He was finally arrested in September, 1998, six months after Eddleston’s death.

David Spiegelhalter, the author of an important and comprehensive new book, “The Art of Statistics” (Basic), was one of the statisticians tasked by the ensuing public inquiry to establish whether the mortality rate of Shipman’s patients should have aroused suspicion earlier. Then a biostatistician at Cambridge, Spiegelhalter found that Shipman’s excess mortality—the number of his older patients who had died in the course of his career over the number that would be expected of an average doctor’s—was a hundred and seventy-four women and forty-nine men at the time of his arrest. The total closely matched the number of victims confirmed by the inquiry….

In 1825, the French Ministry of Justice ordered the creation of a national collection of crime records. It seems to have been the first of its kind anywhere in the world—the statistics of every arrest and conviction in the country, broken down by region, assembled and ready for analysis. It’s the kind of data set we take for granted now, but at the time it was extraordinarily novel. This was an early instance of Big Data—the first time that mathematical analysis had been applied in earnest to the messy and unpredictable realm of human behavior.

Or maybe not so unpredictable. In the early eighteen-thirties, a Belgian astronomer and mathematician named Adolphe Quetelet analyzed the numbers and discovered a remarkable pattern. The crime records were startlingly consistent. Year after year, irrespective of the actions of courts and prisons, the number of murders, rapes, and robberies reached almost exactly the same total. There is a “terrifying exactitude with which crimes reproduce themselves,” Quetelet said. “We know in advance how many individuals will dirty their hands with the blood of others. How many will be forgers, how many poisoners.”

To Quetelet, the evidence suggested that there was something deeper to discover. He developed the idea of a “Social Physics,” and began to explore the possibility that human lives, like planets, had an underlying mechanistic trajectory. There’s something unsettling in the idea that, amid the vagaries of choice, chance, and circumstance, mathematics can tell us something about what it is to be human. Yet Quetelet’s overarching findings still stand: at some level, human life can be quantified and predicted. We can now forecast, with remarkable accuracy, the number of women in Germany who will choose to have a baby each year, the number of car accidents in Canada, the number of plane crashes across the Southern Hemisphere, even the number of people who will visit a New York City emergency room on a Friday evening….(More)”

Misinformation Has Created a New World Disorder


Claire Wardle at Scientific American: “…Online misinformation has been around since the mid-1990s. But in 2016 several events made it broadly clear that darker forces had emerged: automation, microtargeting and coordination were fueling information campaigns designed to manipulate public opinion at scale. Journalists in the Philippines started raising flags as Rodrigo Duterte rose to power, buoyed by intensive Facebook activity. This was followed by unexpected results in the Brexit referendum in June and then the U.S. presidential election in November—all of which sparked researchers to systematically investigate the ways in which information was being used as a weapon.

During the past three years the discussion around the causes of our polluted information ecosystem has focused almost entirely on actions taken (or not taken) by the technology companies. But this fixation is too simplistic. A complex web of societal shifts is making people more susceptible to misinformation and conspiracy. Trust in institutions is falling because of political and economic upheaval, most notably through ever widening income inequality. The effects of climate change are becoming more pronounced. Global migration trends spark concern that communities will change irrevocably. The rise of automation makes people fear for their jobs and their privacy.

Bad actors who want to deepen existing tensions understand these societal trends, designing content that they hope will so anger or excite targeted users that the audience will become the messenger. The goal is that users will use their own social capital to reinforce and give credibility to that original message.

Most of this content is designed not to persuade people in any particular direction but to cause confusion, to overwhelm and to undermine trust in democratic institutions from the electoral system to journalism. And although much is being made about preparing the U.S. electorate for the 2020 election, misleading and conspiratorial content did not begin with the 2016 presidential race, and it will not end after this one. As tools designed to manipulate and amplify content become cheaper and more accessible, it will be even easier to weaponize users as unwitting agents of disinformation….(More)”.

[Figure from the Scientific American article not reproduced here. Credit: Jen Christiansen; source: Information Disorder: Toward an Interdisciplinary Framework for Research and Policymaking, by Claire Wardle and Hossein Derakhshan, Council of Europe, October 2017.]

Investigators Use New Strategy to Combat Opioid Crisis: Data Analytics


Byron Tau and Aruna Viswanatha in the Wall Street Journal: “When federal investigators got a tip in 2015 that a health center in Houston was distributing millions of doses of opioid painkillers, they tried a new approach: look at the numbers.

State and federal prescription and medical billing data showed a pattern of overprescription, giving authorities enough ammunition to send an undercover Drug Enforcement Administration agent. She found a crowded waiting room and armed security guards. After a 91-second appointment with the sole doctor, the agent paid $270 at the cash-only clinic and walked out with 100 10mg pills of the powerful opioid hydrocodone.

The subsequent prosecution of the doctor and the clinic owner, who were sentenced last year to 35 years in prison, laid the groundwork for a new data-driven Justice Department strategy to help target one of the worst public-health crises in the country. Prosecutors expanded the pilot program from Houston to the hard-hit Appalachian region in early 2019. Within months, the effort resulted in the indictments of dozens of doctors, nurses, pharmacists and others. Two-thirds of them had been identified through analyzing the data, a Justice Department official said. A quarter of defendants were expected to plead guilty, according to the Justice Department, and additional indictments through the program are expected in the coming weeks.

“These are doctors behaving like drug dealers,” said Brian Benczkowski, head of the Justice Department’s criminal division, who oversaw the expansion.

“They’ve been operating as though nobody could see them for a long period of time. Now we have the data,” Mr. Benczkowski said.

The Justice Department’s fraud section has been using data analytics in health-care prosecutions for several years—combing through Medicare and Medicaid billing data for evidence of fraud, and deploying the strategy in cities around the country that saw outlier billings. In 2018, the health-care fraud unit charged more than 300 people with fraud totaling more than $2 billion, according to the Justice Department.

But using the data to combat the opioid crisis, which is ravaging communities across the country, is a new development for the department, which has made tackling the epidemic a key priority in the Trump administration….(More)”.
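
The article describes the approach only in broad strokes: combing prescription and billing data for outlier prescribers. One common way to flag such outliers is a robust z-score built from the median and the median absolute deviation, so the extreme values being hunted don't distort the yardstick they are measured against. The sketch below uses invented figures and a conventional threshold; it is not the Justice Department's actual analytics.

```python
# A minimal sketch of flagging outlier prescribers from billing data.
# The counts and the 3.5 threshold are illustrative assumptions.
from statistics import median

# Hypothetical opioid doses billed per provider in one region over a year.
doses_by_provider = {
    "provider_a": 11_000,
    "provider_b": 9_500,
    "provider_c": 12_300,
    "provider_d": 10_200,
    "provider_e": 96_000,   # sits far outside the peer group
}

def flag_outliers(volumes, threshold=3.5):
    """Return providers whose modified z-score (median- and MAD-based,
    so it resists distortion by the outliers themselves) exceeds the threshold."""
    values = list(volumes.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    return {
        name: round(0.6745 * (v - med) / mad, 1)
        for name, v in volumes.items()
        if mad and 0.6745 * (v - med) / mad > threshold
    }

print(flag_outliers(doses_by_provider))  # {'provider_e': 44.1}
```

A real analysis would fold in patient counts, diagnoses, geography, and co-prescribing patterns, but the underlying logic is the same: flag whoever sits far outside the peer group.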

The Internet Freedom League: How to Push Back Against the Authoritarian Assault on the Web


Essay by Richard A. Clarke And Rob Knake in Foreign Affairs: “The early days of the Internet inspired a lofty dream: authoritarian states, faced with the prospect of either connecting to a new system of global communication or being left out of it, would choose to connect. According to this line of utopian thinking, once those countries connected, the flow of new information and ideas from the outside world would inexorably pull them toward economic openness and political liberalization. In reality, something quite different has happened. Instead of spreading democratic values and liberal ideals, the Internet has become the backbone of authoritarian surveillance states all over the world. Regimes in China, Russia, and elsewhere have used the Internet’s infrastructure to build their own national networks. At the same time, they have installed technical and legal barriers to prevent their citizens from reaching the wider Internet and to limit Western companies from entering their digital markets. 

But despite handwringing in Washington and Brussels about authoritarian schemes to split the Internet, the last thing Beijing and Moscow want is to find themselves relegated to their own networks and cut off from the global Internet. After all, they need access to the Internet to steal intellectual property, spread propaganda, interfere with elections in other countries, and threaten critical infrastructure in rival countries. China and Russia would ideally like to re-create the Internet in their own images and force the world to play by their repressive rules. But they haven’t been able to do that—so instead they have ramped up their efforts to tightly control outside access to their markets, limit their citizens’ ability to reach the wider Internet, and exploit the vulnerability that comes with the digital freedom and openness enjoyed in the West.

The United States and its allies and partners should stop worrying about the risk of authoritarians splitting the Internet. Instead, they should split it themselves, by creating a digital bloc within which data, services, and products can flow freely…(More)”.

Sharing Private Data for Public Good


Stefaan G. Verhulst at Project Syndicate: “After Hurricane Katrina struck New Orleans in 2005, the direct-mail marketing company Valassis shared its database with emergency agencies and volunteers to help improve aid delivery. In Santiago, Chile, analysts from Universidad del Desarrollo, ISI Foundation, UNICEF, and the GovLab collaborated with Telefónica, the city’s largest mobile operator, to study gender-based mobility patterns in order to design a more equitable transportation policy. And as part of the Yale University Open Data Access project, health-care companies Johnson & Johnson, Medtronic, and SI-BONE give researchers access to previously walled-off data from 333 clinical trials, opening the door to possible new innovations in medicine.

These are just three examples of “data collaboratives,” an emerging form of partnership in which participants exchange data for the public good. Such tie-ups typically involve public bodies using data from corporations and other private-sector entities to benefit society. But data collaboratives can help companies, too – pharmaceutical firms share data on biomarkers to accelerate their own drug-research efforts, for example. Data-sharing initiatives also have huge potential to improve artificial intelligence (AI). But they must be designed responsibly and take data-privacy concerns into account.

Understanding the societal and business case for data collaboratives, as well as the forms they can take, is critical to gaining a deeper appreciation of the potential and limitations of such ventures. The GovLab has identified over 150 data collaboratives spanning continents and sectors; they include companies such as Air France, Zillow, and Facebook. Our research suggests that such partnerships can create value in three main ways….(More)”.

The Ethics of Hiding Your Data From the Machines


Molly Wood at Wired: “…But now, that data is being used to train artificial intelligence, and the insights those future algorithms create could quite literally save lives.

So while targeted advertising is an easy villain, data-hogging artificial intelligence is a dangerously nuanced and highly sympathetic bad guy, like Erik Killmonger in Black Panther. And it won’t be easy to hate.

I recently met with a company that wants to do a sincerely good thing. They’ve created a sensor that pregnant women can wear, and it measures their contractions. It can reliably predict when women are going into labor, which can help reduce preterm births and C-sections. It can get women into care sooner, which can reduce both maternal and infant mortality.

All of this is an unquestionable good.

And this little device is also collecting a treasure trove of information about pregnancy and labor that is feeding into clinical research that could upend maternal care as we know it. Did you know that the way most obstetricians learn to track a woman’s progress through labor is based on a single study from the 1950s, involving 500 women, all of whom were white?…

To save the lives of pregnant women and their babies, researchers and doctors, and yes, startup CEOs and even artificial intelligence algorithms need data. To cure cancer, or at least offer personalized treatments that have a much higher possibility of saving lives, those same entities will need data….

And for us consumers, well, a blanket refusal to offer up our data to the AI gods isn’t necessarily the good choice either. I don’t want to be the person who refuses to contribute my genetic data via 23andMe to a massive research study that could, and I actually believe this is possible, lead to cures and treatments for diseases like Parkinson’s and Alzheimer’s and who knows what else.

I also think I deserve a realistic assessment of the potential for harm to find its way back to me, because I didn’t think through or wasn’t told all the potential implications of that choice—like how, let’s be honest, we all felt a little stung when we realized the 23andMe research would be through a partnership with drugmaker (and reliable drug price-hiker) GlaxoSmithKline. Drug companies, like targeted ads, are easy villains—even though this partnership actually could produce a Parkinson’s drug. But do we know what GSK’s privacy policy looks like? That deal was a level of sharing we didn’t necessarily expect….(More)”.

After Technopoly


Alan Jacobs at the New Atlantis: “Technocratic solutionism is dying. To replace it, we must learn again the creation and reception of myth….

What Neil Postman called “technopoly” may be described as the universal and virtually inescapable rule of our everyday lives by those who make and deploy technology, especially, in this moment, the instruments of digital communication. It is difficult for us to grasp what it’s like to live under technopoly, or how to endure or escape or resist the regime. These questions may best be approached by drawing on a handful of concepts meant to describe a slightly earlier stage of our common culture.

First, following on my earlier essay in these pages, “Wokeness and Myth on Campus” (Summer/Fall 2017), I want to turn again to a distinction by the Polish philosopher Leszek Kołakowski between the “technological core” of culture and the “mythical core” — a distinction he believed is essential to understanding many cultural developments.

“Technology” for Kołakowski is something broader than we usually mean by it. It describes a stance toward the world in which we view things around us as objects to be manipulated, or as instruments for manipulating our environment and ourselves. This is not necessarily meant in a negative sense; some things ought to be instruments — the spoon I use to stir my soup — and some things need to be manipulated — the soup in need of stirring. Besides tools, the technological core of culture includes also the sciences and most philosophy, as those too are governed by instrumental, analytical forms of reasoning by which we seek some measure of control.

By contrast, the mythical core of culture is that aspect of experience that is not subject to manipulation, because it is prior to our instrumental reasoning about our environment. Throughout human civilization, says Kołakowski, people have participated in myth — they may call it “illumination” or “awakening” or something else — as a way of connecting with “nonempirical unconditioned reality.” It is something we enter into with our full being, and all attempts to describe the experience in terms of desire, will, understanding, or literal meaning are ways of trying to force the mythological core into the technological core by analyzing and rationalizing myth and pressing it into a logical order. This is why the two cores are always in conflict, and it helps to explain why rational argument is often a fruitless response to people acting from the mythical core….(More)”.

We Need a New Science of Progress


Patrick Collison and Tyler Cowen in The Atlantic: “In 1861, the American scientist and educator William Barton Rogers published a manifesto calling for a new kind of research institution. Recognizing the “daily increasing proofs of the happy influence of scientific culture on the industry and the civilization of the nations,” and the growing importance of what he called “Industrial Arts,” he proposed a new organization dedicated to practical knowledge. He named it the Massachusetts Institute of Technology.

Rogers was one of a number of late-19th-century reformers who saw that the United States’ ability to generate progress could be substantially improved. These reformers looked to the successes of the German university models overseas and realized that a combination of focused professorial research and teaching could be a powerful engine for advances in research. Over the course of several decades, the group—Rogers, Charles Eliot, Henry Tappan, George Hale, John D. Rockefeller, and others—founded and restructured many of what are now America’s best universities, including Harvard, MIT, Stanford, Caltech, Johns Hopkins, the University of Chicago, and more. By acting on their understanding, they engaged in a kind of conscious “progress engineering.”

Progress itself is understudied. By “progress,” we mean the combination of economic, technological, scientific, cultural, and organizational advancement that has transformed our lives and raised standards of living over the past couple of centuries. For a number of reasons, there is no broad-based intellectual movement focused on understanding the dynamics of progress, or targeting the deeper goal of speeding it up. We believe that it deserves a dedicated field of study. We suggest inaugurating the discipline of “Progress Studies.”…(More)”

The World Is Complex. Measuring Charity Has to Be Too


Joi Ito at Wired: “If you looked at how many people check books out of libraries these days, you would see failure. Circulation, an obvious measure of success for an institution established to lend books to people, is down. But if you only looked at that figure, you’d miss the fascinating transformation public libraries have undergone in recent years. They’ve taken advantage of grants to become makerspaces, classrooms, research labs for kids, and trusted public spaces in every way possible. Much of the successful funding encouraged creative librarians to experiment and scale when successful, iterating and sharing their learnings with others. If we had focused our funding to increase just the number of books people were borrowing, we would have missed the opportunity to fund and witness these positive changes.

I serve on the boards of the MacArthur Foundation and the Knight Foundation, which have made grants that helped transform our libraries. I’ve also worked over the years with dozens of philanthropists and investors—those who put money into ventures that promise environmental and public health benefits in addition to financial returns. All of us have struggled to measure the effectiveness of grants and investments that seek to benefit the community, the environment, and so forth. My own research interest in the practice of change has converged with the research of those who are trying to quantify this change, and so recently, my colleague Louis Kang and I have begun to analyse the ways in which people are currently measuring impact and perhaps find methods to better measure the impact of these investments….(More)”.

How Can We Use Administrative Data to Prevent Homelessness among Youth Leaving Care?


Article by Naomi Nichols: “In 2017, I was part of a team of people at the Canadian Observatory on Homelessness and A Way Home Canada who wrote a policy brief titled, Child Welfare and Youth Homelessness in Canada: A proposal for action. Drawing on the results of the first pan-Canadian survey on youth homelessness, Without a Home: The National Youth Homelessness Survey, the brief focused on the disproportionate number of young people who had been involved with child protection services and then later became homeless. Indeed, 57.8% of homeless youth surveyed reported some type of involvement with child protection services over their lifetime. By comparison, in the general population, only 0.3% of young people receive child welfare service. This means, youth experiencing homelessness are far more likely to report interactions with the child welfare system than young people in the general population. 

Where research reveals systematic patterns of exclusion and neglect – that is, where findings reveal that one group is experiencing disproportionately negative outcomes (relative to the general population) in a particular public sector context – this suggests the need for changes in public policy, programming and practice. Since producing this brief, I have been working with an incredibly talented and passionate McGill undergraduate student (who also happens to be the Vice President of Youth in Care Canada), Arisha Khan. Together, we have been exploring just uses of data to better serve the interests of those young people who depend on the state for their access to basic services (e.g., housing, healthcare and food) as well as their self-efficacy and status as citizens. 

One component of this work revolved around a grant application that has just been funded by the Social Sciences and Humanities Research Council of Canada (Data Justice: Fostering equitable data-led strategies to prevent, reduce and end youth homelessness). Another aspect of our work revolved around a policy brief, which we co-wrote and published with the Montreal data-for-good organization, Powered by Data. The brief outlines how a rights-based and custodial approach to administrative data could a) effectively support young people in and leaving care to participate more actively in their transition planning and engage in institutional self-advocacy; and b) enable systemic oversight of intervention implementation and outcomes for young people in and leaving the provincial care system. We produced this brief with the hope that it would be useful to government decision-makers, service providers, researchers, and advocates interested in understanding how institutional data could be used to improve outcomes for youth in and leaving care. In particular, we wanted to explore whether a different orientation to data collection and use in child protection systems could prevent young people from graduating from provincial child welfare systems into homelessness. In addition to this practical concern, we also undertook to think through the ethical and human rights implications of more recent moves towards data-driven service delivery in Canada, focusing on how we might make this move with the best interests of young people in mind. 

As data collection, management and use practices have become more popular, research is beginning to illuminate how these new monitoring, evaluative and predictive technologies are changing governance processes within and across the public sector, as well as in civil society….(More)”.
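
The disproportion cited at the top of the piece (57.8 percent of surveyed homeless youth reporting child-protection involvement, against roughly 0.3 percent of young people in the general population) can be made concrete with a back-of-the-envelope ratio. The two rates measure slightly different things, lifetime involvement versus current service receipt, so treat the result as a rough illustration rather than as part of the brief's analysis:

```python
# Back-of-the-envelope comparison of the two rates quoted above.
# Lifetime involvement (survey) vs. current service receipt (population),
# so the ratio is a rough illustration only.
homeless_youth_rate = 0.578      # 57.8% of surveyed homeless youth
general_population_rate = 0.003  # ~0.3% of young people overall

ratio = homeless_youth_rate / general_population_rate
print(f"Reported child-protection involvement is roughly {ratio:.0f}x "
      f"more common among surveyed homeless youth.")  # ~193x
```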