Social Distancing and Social Capital: Why U.S. Counties Respond Differently to Covid-19


NBER Paper by Wenzhi Ding et al: Since social distancing is the primary strategy for slowing the spread of many diseases, understanding why U.S. counties respond differently to COVID-19 is critical for designing effective public policies. Using daily data from about 45 million mobile phones to measure social distancing, we examine how counties responded to both local COVID-19 cases and statewide shelter-in-place orders. We find that social distancing increases more in response to cases and official orders in counties where individuals historically (1) engaged less in community activities and (2) demonstrated greater willingness to incur individual costs to contribute to social objectives. Our work highlights the importance of these two features of social capital—community engagement and individual commitment to societal institutions—in formulating public health policies….(More)”

Are there laws of history?


Amanda Rees at AEON: “…If big data could enable us to turn big history into mathematics rather than narratives, would that make it easier to operationalise our past? Some scientists certainly think so.

In February 2010, Peter Turchin, an ecologist from the University of Connecticut, predicted that 2020 would see a sharp increase in political volatility for Western democracies. Turchin was responding critically to optimistic speculations about scientific progress in the journal Nature: the United States, he said, was coming to the peak of another instability spike (regularly occurring every 50 years or so), while the world economy was reaching the point of a ‘Kondratiev wave’ dip, that is, a steep downturn in a growth-driven supercycle. Along with a number of ‘seemingly disparate’ social pointers, all indications were that serious problems were looming. In the decade since that prediction, the entrenched, often vicious, social, economic and political divisions that have increasingly characterised North American and European society have made Turchin’s ‘quantitative historical analysis’ seem remarkably prophetic.

A couple of years earlier, in July 2008, Turchin had made a series of trenchant claims about the nature and future of history. Totting up in excess of ‘200 explanations’ proposed to account for the fall of the Roman empire, he was appalled that historians were unable to agree ‘which explanations are plausible and which should be rejected’. The situation, he maintained, was ‘as risible as if, in physics, phlogiston theory and thermodynamics coexisted on equal terms’. Why, Turchin wanted to know, were the efforts in medicine and environmental science to produce healthy bodies and ecologies not mirrored by interventions to create stable societies? Surely it was time ‘for history to become an analytical, and even a predictive, science’. Knowing that historians were themselves unlikely to adopt such analytical approaches to the past, he proposed a new discipline: ‘theoretical historical social science’ or ‘cliodynamics’ – the science of history.

Like C P Snow 60 years before him, Turchin wanted to challenge the boundary between the sciences and humanities – even as periodic attempts to apply the theories of natural science to human behaviour (sociobiology, for example) or to subject natural sciences to the methodological scrutiny of the social sciences (science wars, anyone?) have frequently resulted in hostile turf wars. So what are the prospects for Turchin’s efforts to create a more desirable future society by developing a science of history?…

In 2010, Cliodynamics, the flagship journal for this new discipline, appeared, with its very first article (by the American sociologist Randall Collins) focusing on modelling victory and defeat in battle in relation to material resources and organisational morale. In a move that paralleled Comte’s earlier argument regarding the successive stages of scientific complexity (from physics, through chemistry and biology, to sociology), Turchin passionately rejected the idea that complexity made human societies unsuitable for quantitative analysis, arguing that it was precisely that complexity which made mathematics essential. Weather predictions were once considered unreliable because of the sheer complexity of managing the necessary data. But improvements in technology (satellites, computers) mean that it’s now possible to describe mathematically, and therefore to model, interactions between the system’s various parts – and therefore to know when it’s wise to carry an umbrella. With equal force, Turchin insisted that the cliodynamic approach was not deterministic. It would not predict the future, but instead lay out for governments and political leaders the likely consequences of competing policy choices.

Crucially, and again on the back of abundant, cheap computing power, cliodynamics benefited from the surge in interest in the digital humanities. Existing archives were being digitised, uploaded and made searchable: every day, it seemed, more data were being presented in a format that encouraged quantification and enabled mathematical analysis – including the Old Bailey’s online database, of which Wolf had fallen foul. At the same time, cliodynamicists were repositioning themselves. Four years after its initial launch, the flagship journal’s subtitle was changed from The Journal of Theoretical and Mathematical History to The Journal of Quantitative History and Cultural Evolution. As Turchin’s editorial stated, this move was intended to position cliodynamics within a broader evolutionary analysis; paraphrasing the Russian-American geneticist Theodosius Dobzhansky, he claimed that ‘nothing in human history makes sense except in the light of cultural evolution’. Given Turchin’s ecological background, this evolutionary approach to history is unsurprising. But given the historical outcomes of making politics biological, it is potentially worrying….

Mathematical, data-driven, quantitative models of human experience that aim at detachment, objectivity and the capacity to develop and test hypotheses need to be balanced by explicitly fictional, qualitative and imaginary efforts to create and project a lived future that enable their audiences to empathically ground themselves in the hopes and fears of what might be to come. Both, after all, are unequivocally doing the same thing: using history and historical experience to anticipate the global future so that we might – should we so wish – avoid civilisation’s collapse. That said, the question of who ‘we’ are does, always, remain open….(More)”.

‘For good measure’: data gaps in a big data world


Paper by Sarah Giest & Annemarie Samuels: “Policy and data scientists have paid ample attention to the amount of data being collected and the challenge for policymakers of putting it to use. However, far less attention has been paid to the quality and coverage of this data, specifically as it pertains to minority groups. The paper makes the argument that while there is seemingly more data to draw on for policymakers, the quality of the data in combination with potential known or unknown data gaps limits government’s ability to create inclusive policies. In this context, the paper defines primary, secondary, and unknown data gaps that cover scenarios of knowingly or unknowingly missing data and how that is potentially compensated for through alternative measures.

Based on the review of the literature from various fields and a variety of examples highlighted throughout the paper, we conclude that the big data movement combined with more sophisticated methods in recent years has opened up new opportunities for government to use existing data in different ways as well as fill data gaps through innovative techniques. Focusing specifically on the representativeness of such data, however, shows that data gaps affect the economic opportunities, social mobility, and democratic participation of marginalized groups. The big data movement in policy may thus create new forms of inequality that are harder to detect and whose impact is more difficult to predict….(More)“.

How data privacy leader Apple found itself in a data ethics catastrophe


Article by Daniel Wu and Mike Loukides: “…Apple learned a critical lesson from this experience. User buy-in cannot end with compliance with rules. It requires ethics, constantly asking how to protect, fight for, and empower users, regardless of what the law says. These strategies contribute to perceptions of trust.

Trust has to be earned, is easily lost, and is difficult to regain….

In our more global, diverse, and rapidly changing world, ethics may be embodied by the “platinum rule”: Do unto others as they would want done to them. One established field of ethics—bioethics—offers four principles that are related to the platinum rule: nonmaleficence, justice, autonomy, and beneficence.

For organizations that want to be guided by ethics, regardless of what the law says, these principles serve as essential tools for a purpose-driven mission: protecting (nonmaleficence), fighting for (justice), and empowering users and employees (autonomy and beneficence).

An ethics leader protects users and workers in its operations by using governance best practices. 

Before creating the product, it understands both the qualitative and quantitative contexts of key stakeholders, especially those who will be most impacted, identifying their needs and fears. When creating the product, it uses data protection by design, working with cross-functional roles like legal and privacy engineers to embed ethical principles into the lifecycle of the product and formalize data-sharing agreements. Before launching, it audits the product thoroughly and conducts scenario planning to understand potential ethical mishaps, such as perceived or real gender bias or human rights violations in its supply chain. After launching, its terms of service and collection methods are highly readable and enable even disaffected users to resolve issues easily.

Ethics leaders also fight for users and workers, who can be forgotten. These leaders may champion enforceable consumer protections in the first place, before a crisis erupts. With social movements, leaders fight powerful actors preying on vulnerable communities or the public at large—and critically examine and ameliorate their own participation in systemic violence. As a result, instead of making last-minute heroic efforts to change compromised operations, they have been iterating all along.

Finally, ethics leaders empower their users and workers. With diverse communities and employees, they co-create new products that help meet basic needs and enable more people, including the vulnerable, to increase their autonomy and economic mobility. These entrepreneurial efforts validate new revenue streams and relationships while incubating next-generation workers who self-govern and push the company’s mission forward. Employees voice their values and diversify their relationships. Alison Taylor, the Executive Director of Ethical Systems, argues that internal processes should “improve [workers’] reasoning and creativity, instead of short-circuiting them.” Enabling this is a culture of psychological safety and training to engage kindly with divergent ideas.

These purpose-led strategies boost employee performance and retention, drive deep customer loyalty, and carve legacies.

To be clear, Apple may be implementing at least some of these strategies already—but perhaps not uniformly or transparently. For instance, Apple has implemented some provisions of the European Union’s General Data Protection Regulation for all US residents—not just EU and CA residents—including the ability to access and edit data. This expensive move, which goes beyond strict legal requirements, was implemented even without public pressure.

But ethics strategies have major limitations leaders must address

As demonstrated by the waves of ethical “principles” released by Fortune 500 companies and commissions, ethics programs can be murky, dominated by a white, male, and Western interpretation.

Furthermore, focusing purely on ethics gives companies an easy way to “free ride” off social goodwill, but ultimately stay unaccountable, given the lack of external oversight over ethics programs. When companies substitute unaccountable data ethics principles for thoughtful engagement with the enforceable data regulation principles, users will be harmed.

Long-term, without the ability to wave a $100 million fine with clear-cut requirements and lawyers trained to advocate for them internally, ethics leaders may face barriers to buy-in. Unlike their sales, marketing, or compliance counterparts, ethics programs do not directly add revenue or reduce costs. In recessions, these “soft” programs may be the first on the chopping block.

As a result of these factors, we will likely see a surge in ethics-washing: well-intentioned companies that talk ethics, but don’t walk it. More will view these efforts as PR-driven ethics stunts, which don’t deeply engage with actual ethical issues. If harmful business models do not change, ethics leaders will be fighting a losing battle….(More)”.

Synthetic data offers advanced privacy for the Census Bureau, business


Kate Kaye at IAPP: “In the early 2000s, internet accessibility made it more likely than ever that individuals could be exposed through population demographic data. So, the U.S. Census Bureau turned to an emerging privacy approach: synthetic data.

Some argue the algorithmic techniques used to develop privacy-secure synthetic datasets go beyond traditional deidentification methods. Today, along with the Census Bureau, clinical researchers, autonomous vehicle system developers and banks use these fake datasets that mimic statistically valid data.

In many cases, synthetic data is built from existing data by filtering it through machine learning models. Real data representing real individuals flows in, and fake data mimicking individuals with corresponding characteristics flows out.
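A minimal sketch of that flow, using a simple Gaussian mixture model as a stand-in for the bureau’s far more sophisticated generators (the two-attribute toy dataset and all variable names below are assumptions for illustration, not the agency’s actual pipeline):

```python
# Illustrative sketch only: fit a generative model to confidential records,
# then release samples that mimic their joint distribution instead of the
# real records themselves. The toy data and model choice are assumptions.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Stand-in for confidential microdata: two correlated attributes
# (say, age and log income) for 10,000 real individuals.
age = rng.normal(45, 12, 10_000)
log_income = 8 + 0.02 * age + rng.normal(0, 0.5, 10_000)
real = np.column_stack([age, log_income])

# Fit the generative model to the real data...
model = GaussianMixture(n_components=5, random_state=0).fit(real)

# ...and publish synthetic draws from the model rather than the real rows.
synthetic, _ = model.sample(10_000)
print("real means:     ", real.mean(axis=0))
print("synthetic means:", synthetic.mean(axis=0))
```

In practice the bureau layers formal disclosure-avoidance guarantees on top of such models; the sketch only captures the “real data in, fake data out” idea described above.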

When data scientists at the Census Bureau began exploring synthetic data methods, adoption of the internet had made deidentified, open-source data on U.S. residents, their households and businesses more accessible than in the past.

Especially concerning, census-block-level information was now widely available. Because in rural areas, a census block could represent data associated with as few as one house, simply stripping names, addresses and phone numbers from that information might not be enough to prevent exposure of individuals.

“There was pretty widespread angst” among statisticians, said John Abowd, the bureau’s associate director for research and methodology and chief scientist. The hand-wringing led to a “gradual awakening” that prompted the agency to begin developing synthetic data methods, he said.

Synthetic data built from the real data preserves privacy while providing information that is still relevant for research purposes, Abowd said: “The basic idea is to try to get a model that accurately produces an image of the confidential data.”

The plan for the 2020 census is to produce a synthetic image of that original data. The bureau also produces On the Map, a web-based mapping and reporting application that provides synthetic data showing where workers are employed and where they live along with reports on age, earnings, industry distributions, race, ethnicity, educational attainment and sex.

Of course, the real census data is still locked away, too, Abowd said: “We have a copy and the national archives have a copy of the confidential microdata.”…(More)”.

Scraping the Web for Public Health Gains: Ethical Considerations from a ‘Big Data’ Research Project on HIV and Incarceration


Stuart Rennie, Mara Buchbinder, Eric Juengst, Lauren Brinkley-Rubinstein, and David L Rosen at Public Health Ethics: “Web scraping involves using computer programs for automated extraction and organization of data from the Web for the purpose of further data analysis and use. It is frequently used by commercial companies, but also has become a valuable tool in epidemiological research and public health planning. In this paper, we explore ethical issues in a project that “scrapes” public websites of U.S. county jails as part of an effort to develop a comprehensive database (including individual-level jail incarcerations, court records and confidential HIV records) to enhance HIV surveillance and improve continuity of care for incarcerated populations. We argue that the well-known framework of Emanuel et al. (2000) provides only partial ethical guidance for the activities we describe, which lie at a complex intersection of public health research and public health practice. We suggest some ethical considerations from the ethics of public health practice to help fill gaps in this relatively unexplored area….(More)”.
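As a purely illustrative sketch of the scraping step the authors describe (the URL, page structure and field names below are hypothetical; the paper does not publish the project’s actual code):

```python
# Hypothetical example of scraping a public county jail roster.
# The URL and HTML structure are invented for illustration only.
import requests
from bs4 import BeautifulSoup

ROSTER_URL = "https://example-county.gov/jail/roster"  # placeholder address

response = requests.get(ROSTER_URL, timeout=30)
response.raise_for_status()
soup = BeautifulSoup(response.text, "html.parser")

# Keep one record per table row, using only fields the site displays publicly.
records = []
for row in soup.select("table.roster tr")[1:]:  # skip the header row
    cells = [cell.get_text(strip=True) for cell in row.find_all("td")]
    if len(cells) >= 3:
        records.append({"name": cells[0], "booking_date": cells[1], "charges": cells[2]})

print(f"Collected {len(records)} booking records")
```

Even a toy extractor like this makes the ethical stakes concrete: the output is an organized, linkable dataset about identifiable individuals, which is exactly what raises the questions the authors explore.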

How Taiwan Used Big Data, Transparency and a Central Command to Protect Its People from Coronavirus


Article by Beth Duff-Brown: “…So what steps did Taiwan take to protect its people? And could those steps be replicated here at home?

Stanford Health Policy’s Jason Wang, MD, PhD, an associate professor of pediatrics at Stanford Medicine who also has a PhD in policy analysis, credits his native Taiwan with using new technology and a robust pandemic prevention plan put into place after the 2003 SARS outbreak.

“The Taiwan government established the National Health Command Center (NHCC) after SARS and it’s become part of a disaster management center that focuses on large-outbreak responses and acts as the operational command point for direct communications,” said Wang, a pediatrician and the director of the Center for Policy, Outcomes, and Prevention at Stanford. The NHCC also established the Central Epidemic Command Center, which was activated in early January.

“And Taiwan rapidly produced and implemented a list of at least 124 action items in the past five weeks to protect public health,” Wang said. “The policies and actions go beyond border control because they recognized that that wasn’t enough.”

Wang outlines the measures Taiwan took in the last six weeks in an article published Tuesday in the Journal of the American Medical Association.

“Given the continual spread of COVID-19 around the world, understanding the action items that were implemented quickly in Taiwan, and the effectiveness of these actions in preventing a large-scale epidemic, may be instructive for other countries,” Wang and his co-authors wrote.

Within the last five weeks, Wang said, the Taiwan epidemic command center rapidly implemented those 124 action items, including border control from the air and sea, case identification using new data and technology, quarantine of suspicious cases, educating the public while fighting misinformation, negotiating with other countries — and formulating policies for schools and businesses to follow.

Big Data Analytics

The authors note that Taiwan integrated its national health insurance database with its immigration and customs database to begin the creation of big data for analytics. That enabled case identification through real-time alerts generated during a clinical visit, based on travel history and clinical symptoms.
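A minimal sketch of the kind of point-of-care alert this database integration makes possible (the region code, field names and 14-day window used here are assumptions for illustration, not a description of Taiwan’s actual system):

```python
# Illustrative only: flag a patient at a clinical visit if linked immigration
# records show recent entry from a high-risk area. Not Taiwan's actual logic.
from datetime import date, timedelta

HIGH_RISK_ORIGINS = {"CN-HUBEI"}        # assumed region code
INCUBATION_WINDOW = timedelta(days=14)

# Stand-in for the integrated immigration/customs records, keyed by the same
# national ID used in the health insurance database.
travel_records = {
    "A123456789": [{"origin": "CN-HUBEI", "entry_date": date(2020, 2, 10)}],
}

def travel_alert(national_id: str, visit_date: date) -> bool:
    """Return True if the patient entered from a high-risk area within the window."""
    for trip in travel_records.get(national_id, []):
        if (visit_date - trip["entry_date"] <= INCUBATION_WINDOW
                and trip["origin"] in HIGH_RISK_ORIGINS):
            return True
    return False

print(travel_alert("A123456789", date(2020, 2, 20)))  # True -> alert clinician
```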

Taipei also used Quick Response (QR) code scanning and online reporting of travel history and health symptoms to classify travelers’ infectious risks based on flight origin and travel history in the last 14 days. People who had not traveled to high-risk areas were sent a health declaration border pass via SMS for faster immigration clearance; those who had traveled to high-risk areas were quarantined at home and tracked through their mobile phones to ensure that they stayed home during the incubation period.
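The border triage described here can likewise be pictured as a simple decision rule, sketched below (the categories and the symptom branch are assumptions; the actual classification criteria were set by Taiwan’s command center and are not fully described in the article):

```python
# Illustrative only: route an arriving traveler based on declared travel
# history and symptoms. The two main outcomes mirror the article's account.
def classify_traveler(visited_high_risk_area_in_14_days: bool,
                      has_symptoms: bool) -> str:
    if visited_high_risk_area_in_14_days:
        # Quarantined at home and tracked via mobile phone during incubation.
        return "home_quarantine_14_days"
    if has_symptoms:
        # Assumed branch: symptomatic travelers referred for assessment.
        return "refer_for_assessment"
    # Low-risk travelers receive a health declaration border pass via SMS.
    return "sms_border_pass"

print(classify_traveler(False, False))  # sms_border_pass
```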

The country also instituted a toll-free hotline for citizens to report suspicious symptoms in themselves or others. As the disease progressed, the government called on major cities to establish their own hotlines so that the main hotline would not become jammed….(More)”.

Facebook Ads as a Demographic Tool to Measure the Urban-Rural Divide


Paper by Daniele Rama, Yelena Mejova, Michele Tizzoni, Kyriaki Kalimeri, and Ingmar Weber: “In the global move toward urbanization, making sure the people remaining in rural areas are not left behind in terms of development and policy considerations is a priority for governments worldwide. However, it is increasingly challenging to track important statistics concerning this sparse, geographically dispersed population, resulting in a lack of reliable, up-to-date data. In this study, we examine the usefulness of the Facebook Advertising platform, which offers a digital “census” of over two billion of its users, in measuring potential rural-urban inequalities.

We focus on Italy, a country where about 30% of the population lives in rural areas. First, we show that the population statistics that Facebook produces suffer from instability across time and incomplete coverage of sparsely populated municipalities. To overcome this limitation, we propose an alternative methodology for estimating Facebook Ads audiences that nearly triples the coverage of rural municipalities, from 19% to 55%, and makes fine-grained sub-population analysis feasible. Using official national census data, we evaluate our approach and confirm known significant urban-rural divides in terms of educational attainment and income. Extending the analysis to Facebook-specific user “interests” and behaviors, we provide further insights on the divide, for instance, finding that rural areas show a higher interest in gambling. Notably, we find that the most predictive features of income in rural areas differ from those for urban centres, suggesting researchers need to consider a broader range of attributes when examining rural wellbeing. The findings of this study illustrate the necessity of improving existing tools and methodologies to include under-represented populations in digital demographic studies — the failure to do so could result in misleading observations, conclusions, and most importantly, policies….(More)”.

How big data is dividing the public in China’s coronavirus fight – green, yellow, red


Article by Viola Zhou: “On Valentine’s Day, Matt Ma, a 36-year-old lawyer in the eastern Chinese province of Zhejiang, discovered he had been coded “red”. The colour, displayed in a payment app on his smartphone, indicated that he needed to be quarantined at home even though he had no symptoms of the dangerous coronavirus.

Without a green light from the system, Ma could not travel from his ancestral hometown of Lishui to his new home city of Hangzhou, which is now surrounded by checkpoints set up to contain the epidemic.

Ma is one of the millions of people whose movements are being choreographed by the government through software that feeds on troves of data and issues orders that effectively dictate whether they must stay in or can go to work. Their experience represents a slice of China’s desperate attempt to stop the coronavirus by using a mixed bag of cutting-edge technologies and old-fashioned surveillance. It was also a rare real-world test of the use of technology on a large scale to halt the spread of communicable diseases.
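The article stresses that the actual scoring formula is opaque; purely as a hypothetical illustration of the kind of rule such a system might apply (all inputs and thresholds below are assumptions, not Hangzhou’s real criteria):

```python
# Hypothetical illustration only; the real Hangzhou health-code algorithm is
# not public and, by the government's own admission, not perfect.
from typing import Optional

def health_code(days_since_leaving_hotspot: Optional[int],
                close_contact_with_case: bool,
                has_symptoms: bool) -> str:
    if close_contact_with_case or has_symptoms:
        return "red"     # quarantine required, blocked at checkpoints
    if days_since_leaving_hotspot is not None and days_since_leaving_hotspot < 14:
        return "yellow"  # movement restricted until the window passes
    return "green"       # free to travel and return to work

print(health_code(days_since_leaving_hotspot=4,
                  close_contact_with_case=False,
                  has_symptoms=False))  # yellow
```

When the inputs and weights behind such a score are hidden, residents like Ma have no straightforward way to check why they were coded red or to contest the result.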

“This kind of massive use of technology is unprecedented,” said Christos Lynteris, a medical anthropologist at the University of St Andrews who has studied epidemics in China.

But Hangzhou’s experiment has also revealed the pitfalls of applying opaque formulas to a large population.

In the city’s case, there are reports of people being marked incorrectly, falling victim to an algorithm that is, by the government’s own admission, not perfect….(More)”.

Who will benefit most from the data economy?


Special Report by The Economist: “The data economy is a work in progress. Its economics still have to be worked out; its infrastructure and its businesses need to be fully built; geopolitical arrangements must be found. But there is one final major tension: between the wealth the data economy will create and how it will be distributed. The data economy—or the “second economy”, as Brian Arthur of the Santa Fe Institute terms it—will make the world a more productive place no matter what, he predicts. But who gets what and how is less clear. “We will move from an economy where the main challenge is to produce more and more efficiently,” says Mr Arthur, “to one where distribution of the wealth produced becomes the biggest issue.”

The data economy as it exists today is already very unequal. It is dominated by a few big platforms. In the most recent quarter, Amazon, Apple, Alphabet, Microsoft and Facebook made a combined profit of $55bn, more than the next five most valuable American tech firms over the past 12 months. This corporate inequality is largely the result of network effects—economic forces that mean size begets size. A firm that can collect a lot of data, for instance, can make better use of artificial intelligence and attract more users, who in turn supply more data. Such firms can also recruit the best data scientists and have the cash to buy the best AI startups.

It is also becoming clear that, as the data economy expands, these sorts of dynamics will increasingly apply to non-tech companies and even countries. In many sectors, the race to become a dominant data platform is on. This is the mission of Compass, a startup, in residential property. It is one goal of Tesla in self-driving cars. And Apple and Google hope to repeat the trick in health care. As for countries, America and China account for 90% of the market capitalisation of the world’s 70 largest platforms (see chart), Africa and Latin America for just 1%. Economies on both continents risk “becoming mere providers of raw data…while having to pay for the digital intelligence produced,” the United Nations Conference on Trade and Development recently warned.

Yet it is the skewed distribution of income between capital and labour that may turn out to be the most pressing problem of the data economy. As it grows, more labour will migrate into the mirror worlds, just as other economic activity will. It is not only that people will do more digitally, but they will perform actual “data work”: generating the digital information needed to train and improve AI services. This can mean simply moving about online and providing feedback, as most people already do. But it will increasingly include more active tasks, such as labelling pictures, driving data-gathering vehicles and perhaps, one day, putting one’s digital twin through its paces. This is the reason why some say AI should actually be called “collective intelligence”: it takes in a lot of human input—something big tech firms hate to admit….(More)”.