Privacy Is Power: How Tech Policy Can Bolster Democracy


Essay by Andrew Imbrie, Daniel Baer, Andrew Trask, Anna Puglisi, Erik Brattberg, and Helen Toner: “…History is rarely forgiving, but as we adopt the next phase of digital tools, policymakers can avoid the errors of the past. Privacy-enhancing technologies, or PETs, are a collection of technologies with applications ranging from improved medical diagnostics to secure voting systems and messaging platforms. PETs allow researchers to harness big data to solve problems affecting billions of people while also protecting privacy. …
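
To make the idea concrete, here is a minimal sketch of one family of PETs, differential privacy, in which aggregate statistics are released with calibrated noise so that no individual record can be singled out. The dataset, the epsilon values, and the function below are illustrative assumptions, not drawn from the essay:

```python
# Illustrative sketch only: differential privacy, one family of PETs, releases
# aggregate statistics with calibrated noise so no single record can be inferred.
# The records and epsilon values below are made up for illustration.
import numpy as np

rng = np.random.default_rng(seed=7)

# Hypothetical patient records: 1 = has condition, 0 = does not.
records = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 1])

def dp_count(values: np.ndarray, epsilon: float) -> float:
    """Return a differentially private count of positive records.

    A counting query has sensitivity 1 (adding or removing one person changes
    the count by at most 1), so Laplace noise with scale 1/epsilon yields
    epsilon-differential privacy.
    """
    true_count = float(values.sum())
    noise = rng.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

# Smaller epsilon = stronger privacy guarantee, noisier answer.
print(dp_count(records, epsilon=0.5))
print(dp_count(records, epsilon=5.0))
```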

PETs are ripe for coordination among democratic allies and partners, offering a way for them to jointly develop standards and practical applications that benefit the public good. At an AI summit last July, U.S. Secretary of State Antony Blinken noted the United States’ interest in “increasing access to shared public data sets for AI training and testing, while still preserving privacy,” and National Security Adviser Jake Sullivan pointed to PETs as a promising area “to overcome data privacy challenges while still delivering the value of big data.” Given China’s advantages in scale, the United States and like-minded partners should foster emerging technologies that play to their strengths in medical research and discovery, energy innovation, trade facilitation, and anti-money-laundering reform. Driving innovation and collaboration within and across democracies is important not only because it will help ensure those societies’ success but also because there will be a first-mover advantage in the adoption of PETs for governing the world’s private data-sharing networks.

Accelerating the development of PETs for the public good will require an international approach. Democratic governments will not be the trendsetters on PETs; instead, policymakers for these governments should focus on nurturing the ecosystems these technologies need to flourish. The role for policymakers is not to decide the fate of specific protocols or techniques but rather to foster a conducive environment for researchers to experiment widely and innovate responsibly.    

First, democracies should identify shared priorities and promote basic research to mature the technological foundations of PETs. The underlying technologies require greater investment in algorithmic development and hardware to optimize the chips and mitigate the costs of network overhead. To support the computational requirements for PETs, for example, the National Science Foundation could create an interface through CloudBank and provide cloud compute credits to researchers without access to these resources. The United States could also help incubate an international network of research universities collaborating on these technologies.

Second, science-funding agencies in democracies should host competitions to incentivize new PET protocols and standards—the collaboration between the United States and the United Kingdom announced in early December is a good example. The goal should be to create free, open-source protocols and avoid the fragmentation of the market and the proliferation of proprietary standards. The National Institute of Standards and Technology and other similar bodies should develop standards and measurement tools for PETs; governments and companies should form public-private partnerships to fund open-source protocols over the long term. Open-source protocols are especially important in the early days of PET development, because closed-source PET implementations by profit-seeking actors can be leveraged to build data monopolies. For example, imagine a scenario where all U.S. cancer data could be controlled by a single company because all the hospitals are running its proprietary software. And you have to become a customer to join the network…(More)”.

The Attack of Zombie Science


Article by Natalia Pasternak, Carlos Orsi, Aaron F. Mertz, & Stuart Firestein: “When we think about how science is distorted, we usually think about concepts that have ample currency in public discourse, such as pseudoscience and junk science. Practices like astrology and homeopathy come wrapped in scientific concepts and jargon that can’t meet the methodological requirements of actual sciences. During the COVID-19 pandemic, pseudoscience has had a field day. Bleach, anyone? Bear bile? Yet the pandemic has brought a newer, more subtle form of distortion to light. To the philosophy of science, we humbly submit a new concept: “zombie science.”

We think of zombie science as mindless science. It goes through the motions of scientific research without a real research question to answer; it follows all the correct methodology but doesn’t aspire to advance knowledge in the field. Practically all the information about hydroxychloroquine during the pandemic falls into that category, including not just the living dead found in preprint repositories, but also papers published in journals that ought to have been caught by a more discerning eye. Journals, after all, invest their reputation in every piece they choose to publish. And every investment in useless science is a net loss.

From a social and historical stance, it seems almost inevitable that the penchant for productivism in the academic and scientific world would end up encouraging zombie science. If those who do not publish perish, then publishing—even nonsense or irrelevancies—is a matter of life or death. The peer-review process and the criteria for editorial importance are filters, for sure, but they are limited. Not only do they get clogged and overwhelmed by excess submissions, but they also have to deal with the weaknesses of the human condition, including feelings of personal loyalty, prejudice, and vanity. Additionally, these filters fail, as the proliferation of predatory journals shows us all too well…(More)”.

Making data for good better


Article by Caroline Buckee, Satchit Balsari, and Andrew Schroeder: “…Despite the long-standing excitement about the potential for digital tools, Big Data and AI to transform our lives, these innovations–with some exceptions–have so far had little impact on the greatest public health emergency of our time.

Attempts to use digital data streams to rapidly produce public health insights that were not only relevant for local contexts in cities and countries around the world, but also available to decision makers who needed them, exposed enormous gaps across the translational pipeline. The insights from novel data streams that could help drive precise, impactful health programs and bring effective aid to communities found limited use among public health and emergency response systems. We share here our experience from the COVID-19 Mobility Data Network (CMDN), now Crisis Ready (crisisready.io), a global collaboration of researchers, mostly infectious disease epidemiologists and data scientists, who served as trusted intermediaries between technology companies willing to share vast amounts of digital data and policymakers struggling to incorporate insights from these novel data streams into their decision making. Through our experience with the Network, and using human mobility data as an illustrative example, we recognize three sets of barriers to the successful application of large digital datasets for public good.

First, in the absence of pre-established working relationships with technology companies and data brokers, the data remain primarily confined within private circuits of ownership and control. During the pandemic, data sharing agreements between large technology companies and researchers were hastily cobbled together, often without the right kind of domain expertise in the mix. Second, the lack of standardization, interoperability, and information on the uncertainty and biases associated with these data necessitated complex analytical processing by highly specialized domain experts. And finally, local public health departments, understandably unfamiliar with these novel data streams, had neither the bandwidth nor the expertise to sift noise from signal. Ultimately, most efforts did not yield consistently useful information for decision making, particularly in low-resource settings, where capacity limitations in the public sector are most acute…(More)”.

Trove of unique health data sets could help AI predict medical conditions earlier


Madhumita Murgia at the Financial Times: “…Ziad Obermeyer, a physician and machine learning scientist at the University of California, Berkeley, launched Nightingale Open Science last month — a treasure trove of unique medical data sets, each curated around an unsolved medical mystery that artificial intelligence could help to solve.

The data sets, released after the project received $2m of funding from former Google chief executive Eric Schmidt, could help to train computer algorithms to predict medical conditions earlier, triage better and save lives.

The data include 40 terabytes of medical imagery, such as X-rays, electrocardiogram waveforms and pathology specimens, from patients with a range of conditions, including high-risk breast cancer, sudden cardiac arrest, fractures and Covid-19. Each image is labelled with the patient’s medical outcomes, such as the stage of breast cancer and whether it resulted in death, or whether a Covid patient needed a ventilator.
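
As a purely hypothetical sketch of what such outcome-labelled records might look like (the field names and example values below are invented for illustration and are not Nightingale's actual schema):

```python
# Hypothetical sketch of an outcome-labelled imaging record, to make the
# labelling described above concrete. Field names and example values are
# invented; this is not Nightingale Open Science's actual data model.
from dataclasses import dataclass
from typing import Optional

@dataclass
class LabelledImageRecord:
    patient_id: str                       # de-identified patient reference
    modality: str                         # e.g. "x-ray", "ecg_waveform", "pathology_slide"
    image_path: str                       # pointer to the stored imagery
    condition: str                        # e.g. "breast_cancer", "covid-19"
    outcome: str                          # what actually happened to the patient
    outcome_detail: Optional[str] = None  # optional free-text detail

record = LabelledImageRecord(
    patient_id="anon-0001",
    modality="x-ray",
    image_path="images/anon-0001.png",
    condition="covid-19",
    outcome="ventilated",
    outcome_detail="mechanical ventilation within 14 days of admission",
)
print(record)
```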

Obermeyer has made the data sets free to use and mainly worked with hospitals in the US and Taiwan to build them over two years. He plans to expand this to Kenya and Lebanon in the coming months to reflect as much medical diversity as possible.

“Nothing exists like it,” said Obermeyer, who announced the new project in December alongside colleagues at NeurIPS, the global academic conference for artificial intelligence. “What sets this apart from anything available online is the data sets are labelled with the ‘ground truth’, which means with what really happened to a patient and not just a doctor’s opinion.”…

The Nightingale data sets were among dozens proposed this year at NeurIPS.

Other projects included a speech data set of Mandarin and eight subdialects recorded by 27,000 speakers in 34 cities in China; the largest audio data set of Covid respiratory sounds, such as breathing, coughing and voice recordings, from more than 36,000 participants to help screen for the disease; and a data set of satellite images covering the entire country of South Africa from 2006 to 2017, divided and labelled by neighbourhood, to study the social effects of spatial apartheid.

Elaine Nsoesie, a computational epidemiologist at the Boston University School of Public Health, said new types of data could also help with studying the spread of diseases in diverse locations, as people from different cultures react differently to illnesses.

She said her grandmother in Cameroon, for example, might think differently than Americans do about health. “If someone had an influenza-like illness in Cameroon, they may be looking for traditional, herbal treatments or home remedies, compared to drugs or different home remedies in the US.”

Computer scientists Serena Yeung and Joaquin Vanschoren, who proposed that research to build new data sets should be exchanged at NeurIPS, pointed out that the vast majority of the AI community still cannot find good data sets to evaluate their algorithms. This meant that AI researchers were still turning to data that were potentially “plagued with bias”, they said. “There are no good models without good data.”…(More)”.

Economists Pin More Blame on Tech for Rising Inequality


Steve Lohr at the New York Times: “Daron Acemoglu, an influential economist at the Massachusetts Institute of Technology, has been making the case against what he describes as “excessive automation.”

The economywide payoff of investing in machines and software has been stubbornly elusive. But he says the rising inequality resulting from those investments, and from the public policy that encourages them, is crystal clear.

Half or more of the increasing gap in wages among American workers over the last 40 years is attributable to the automation of tasks formerly done by human workers, especially men without college degrees, according to some of his recent research…

Mr. Acemoglu, a wide-ranging scholar whose research makes him one of the most cited economists in academic journals, is hardly the only prominent economist arguing that computerized machines and software, with a hand from policymakers, have contributed significantly to the yawning gaps in incomes in the United States. Their numbers are growing, and their voices add to the chorus of criticism surrounding the Silicon Valley giants and the unchecked advance of technology.

Paul Romer, who won a Nobel in economic science for his work on technological innovation and economic growth, has expressed alarm at the runaway market power and influence of the big tech companies. “Economists taught: ‘It’s the market. There’s nothing we can do,’” he said in an interview last year. “That’s really just so wrong.”

Anton Korinek, an economist at the University of Virginia, and Joseph Stiglitz, a Nobel economist at Columbia University, have written a paper, “Steering Technological Progress,” which recommends steps from nudges for entrepreneurs to tax changes to pursue “labor-friendly innovations.”

Erik Brynjolfsson, an economist at Stanford, is a technology optimist in general. But in an essay to be published this spring in Daedalus, the journal of the American Academy of Arts and Sciences, he warns of “the Turing trap.” …(More)”

Nudges: Four reasons to doubt popular technique to shape people’s behavior


Article by Magda Osman: “Throughout the pandemic, many governments have had to rely on people doing the right thing to reduce the spread of the coronavirus – ranging from social distancing to handwashing. Many enlisted the help of psychologists for advice on how to “nudge” the public to do what was deemed appropriate.

Nudges have been around since the 1940s and originally were referred to as behavioural engineering. They are a set of techniques developed by psychologists to promote “better” behaviour through “soft” interventions rather than “hard” ones (mandates, bans, fines). In other words, people aren’t punished if they fail to follow them. The nudges are based on psychological and behavioural economic research into human behaviour and cognition.

The nudges can involve subtle as well as obvious methods. Authorities may set a “better” choice, such as donating your organs, as a default – so people have to opt out of a register rather than opt in. Or they could make a healthy option more attractive through food labelling.

But, despite the soft approach, many people aren’t keen on being nudged. During the pandemic, for example, scientists examined people’s attitudes to nudging in social and news media in the UK, and discovered that half of the sentiments expressed in social media posts were negative…(More)”.

Technology and the Global Struggle for Democracy


Essay by Manuel Muniz: “The commemoration of the first anniversary of the January 6, 2021, attack on the US Capitol by supporters of former President Donald Trump showed that the extreme political polarization that fueled the riot also frames Americans’ interpretations of it. It would, however, be gravely mistaken to view what happened as a uniquely American phenomenon with uniquely American causes. The disruption of the peaceful transfer of power that day was part of something much bigger.

As part of the commemoration, President Joe Biden said that a battle is being fought over “the soul of America.” What is becoming increasingly clear is that this is also true of the international order: its very soul is at stake. China is rising and asserting itself. Populism is widespread in the West and major emerging economies. And chauvinistic nationalism has re-emerged in parts of Europe. All signs point to increasing illiberalism and anti-democratic sentiment around the world.

Against this backdrop, the US hosted in December a (virtual) “Summit for Democracy” that was attended by hundreds of national and civil-society leaders. The message of the gathering was clear: democracies must assert themselves firmly and proactively. To that end, the summit devoted numerous sessions to studying the digital revolution and its potentially harmful implications for our political systems.

Emerging technologies pose at least three major risks for democracies. The first concerns how they structure public debate. Social networks balkanize public discourse by segmenting users into ever smaller like-minded communities. Algorithmically driven information echo chambers make it difficult to build social consensus. Worse, social networks are not liable for the content they distribute, which means they can allow misinformation to spread on their platforms with impunity…(More)”.
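
As a toy illustration of the ranking dynamic described above (not any real platform's algorithm; the opinion spectrum, feed size, and user position are invented), a feed ranked purely by similarity to a user's existing views exposes that user to a far narrower slice of opinion than a random feed:

```python
# Toy illustration only: compare the range of viewpoints in a random feed with
# a feed ranked by similarity to what the user already believes. All numbers
# are invented; this does not model any specific platform.
import random

random.seed(0)

# Each item is an opinion on a spectrum from -1.0 (one pole) to +1.0 (the other).
items = [random.uniform(-1.0, 1.0) for _ in range(1000)]
user_view = 0.3    # hypothetical user's current position on the spectrum
feed_size = 20

random_feed = random.sample(items, feed_size)
ranked_feed = sorted(items, key=lambda item: abs(item - user_view))[:feed_size]

def viewpoint_range(feed):
    """Width of the opinion spectrum represented in a feed."""
    return max(feed) - min(feed)

print(f"random feed spans            {viewpoint_range(random_feed):.2f} of the spectrum")
print(f"similarity-ranked feed spans {viewpoint_range(ranked_feed):.2f} of the spectrum")
```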

A data ‘black hole’: Europol ordered to delete vast store of personal data


Article by Apostolis Fotiadis, Ludek Stavinoha, Giacomo Zandonini, Daniel Howden: “…The EU’s police agency, Europol, will be forced by the bloc’s data protection watchdog to delete much of a vast store of personal data that it has been found to have amassed unlawfully. The unprecedented finding from the European Data Protection Supervisor (EDPS) targets what privacy experts are calling a “big data ark” containing billions of points of information. Sensitive data in the ark has been drawn from crime reports, hacked from encrypted phone services and sampled from asylum seekers never involved in any crime.

According to internal documents seen by the Guardian, Europol’s cache contains at least 4 petabytes – equivalent to 3m CD-Roms or a fifth of the entire contents of the US Library of Congress. Data protection advocates say the volume of information held on Europol’s systems amounts to mass surveillance and is a step on its road to becoming a European counterpart to the US National Security Agency (NSA), the organisation whose clandestine online spying was revealed by whistleblower Edward Snowden….(More)”.

Are we witnessing the dawn of post-theory science?


Essay by Laura Spinney: “Does the advent of machine learning mean the classic methodology of hypothesise, predict and test has had its day?…

Isaac Newton apocryphally discovered his second law after an apple fell on his head. Much experimentation and data analysis later, he realised there was a fundamental relationship between force, mass and acceleration. He formulated a theory to describe that relationship – one that could be expressed as an equation, F=ma – and used it to predict the behaviour of objects other than apples. His predictions turned out to be right (if not always precise enough for those who came later).
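
As a reminder of how such a theory is used predictively, a quick worked example with arbitrarily chosen values:

```latex
% Worked example with arbitrary values, showing the predictive use of F = ma.
\[
F = m a, \qquad m = 2\,\mathrm{kg},\quad a = 3\,\mathrm{m/s^2}
\;\Rightarrow\; F = 2\,\mathrm{kg} \times 3\,\mathrm{m/s^2} = 6\,\mathrm{N}.
\]
```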

Contrast how science is increasingly done today. Facebook’s machine learning tools predict your preferences better than any psychologist. AlphaFold, a program built by DeepMind, has produced the most accurate predictions yet of protein structures based on the amino acids they contain. Both are completely silent on why they work: why you prefer this or that information; why this sequence generates that structure.

You can’t lift a curtain and peer into the mechanism. They offer up no explanation, no set of rules for converting this into that – no theory, in a word. They just work and do so well. We witness the social effects of Facebook’s predictions daily. AlphaFold has yet to make its impact felt, but many are convinced it will change medicine.

Somewhere between Newton and Mark Zuckerberg, theory took a back seat. In 2008, Chris Anderson, the then editor-in-chief of Wired magazine, predicted its demise. So much data had accumulated, he argued, and computers were already so much better than us at finding relationships within it, that our theories were being exposed for what they were – oversimplifications of reality. Soon, the old scientific method – hypothesise, predict, test – would be relegated to the dustbin of history. We’d stop looking for the causes of things and be satisfied with correlations.

With the benefit of hindsight, we can say that what Anderson saw is true (he wasn’t alone). The complexity that this wealth of data has revealed to us cannot be captured by theory as traditionally understood. “We have leapfrogged over our ability to even write the theories that are going to be useful for description,” says computational neuroscientist Peter Dayan, director of the Max Planck Institute for Biological Cybernetics in Tübingen, Germany. “We don’t even know what they would look like.”

But Anderson’s prediction of the end of theory looks to have been premature – or maybe his thesis was itself an oversimplification. There are several reasons why theory refuses to die, despite the successes of such theory-free prediction engines as Facebook and AlphaFold. All are illuminating, because they force us to ask: what’s the best way to acquire knowledge and where does science go from here?…(More)”

Data in Collective Impact: Focusing on What Matters


Article by Justin Piff: “One of the five conditions of collective impact, “shared measurement systems,” calls upon initiatives to identify and share key metrics of success that align partners toward a common vision. While the premise that data should guide shared decision-making is not unique to collective impact, its articulation 10 years ago as a necessary condition for collective impact catalyzed a focus on data use across the social sector. In the original article on collective impact in Stanford Social Innovation Review, the authors describe the benefits of using consistent metrics to identify patterns, make comparisons, promote learning, and hold actors accountable for success. While this vision for data collection remains relevant today, the field has developed a more nuanced understanding of how to make it a reality….

Here are four lessons from our work to help collective impact initiatives and their funders use data more effectively for social change.

1. Prioritize the Learning, Not the Data System

Those of us who are “data people” have espoused the benefits of shared data systems and common metrics too many times to recount. But a shared measurement system is only a means to an end, not an end in itself. Too often, new collective impact initiatives focus on creating the mythical, all-knowing data system—spending weeks, months, and even years researching or developing the perfect software that captures, aggregates, and computes data from multiple sectors. They let the perfect become the enemy of the good, as the pursuit of perfect data and technical precision inhibits meaningful action. And communities pay the price.

Using data to solve complex social problems requires more than a technical solution. Many communities in the US have more data than they know what to do with, yet they rarely spend time thinking about the data they actually need. Before building a data system, partners must focus on how they hope to use data in their work and identify the sources and types of data that can help them achieve their goals. Once those data are identified and collected, partners, residents, students, and others can work together to develop a shared understanding of what the data mean and move forward. In Connecticut, the Hartford Data Collaborative helps community agencies and leaders do just this. For example, it has matched programmatic data against Hartford Public Schools data and National Student Clearinghouse data to get a clear picture of postsecondary enrollment patterns across the community. The data also capture services provided to residents across multiple agencies and can be disaggregated by gender, race, and ethnicity to identify and address service gaps….(More)”.
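
As a minimal sketch of the kind of record matching and disaggregation described above (the data frames, column names, and join key are invented for illustration; this is not the Hartford Data Collaborative's actual pipeline):

```python
# Minimal sketch: match program records to enrollment records on a shared
# identifier, then disaggregate enrollment rates by race/ethnicity and gender
# to surface service gaps. All data and column names are invented.
import pandas as pd

# Hypothetical program records from community agencies.
programs = pd.DataFrame({
    "student_id": [1, 2, 3, 4, 5],
    "program": ["mentoring", "tutoring", "mentoring", "college_prep", "tutoring"],
    "race_ethnicity": ["Black", "Latino", "White", "Black", "Latino"],
    "gender": ["F", "M", "F", "M", "F"],
})

# Hypothetical postsecondary enrollment records (e.g. from a clearinghouse).
enrollment = pd.DataFrame({
    "student_id": [1, 3, 4],
    "enrolled": [True, True, True],
})

# Match the two sources on the shared identifier; students with no match
# are treated as not enrolled.
matched = programs.merge(enrollment, on="student_id", how="left")
matched["enrolled"] = matched["enrolled"].fillna(False).astype(bool)

# Disaggregate enrollment rates by race/ethnicity and gender.
rates = matched.groupby(["race_ethnicity", "gender"])["enrolled"].mean()
print(rates)
```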