Paper by Sarah Tahamont et al: “While linking records across large administrative datasets [“big data”] has the potential to revolutionize empirical social science research, many administrative data files do not have common identifiers and are thus not designed to be linked to others. To address this problem, researchers have developed probabilistic record linkage algorithms which use statistical patterns in identifying characteristics to perform linking tasks. Naturally, the accuracy of a candidate linking algorithm can be substantially improved when an algorithm has access to “ground-truth” examples — matches which can be validated using institutional knowledge or auxiliary data. Unfortunately, the cost of obtaining these examples is typically high, often requiring a researcher to manually review pairs of records in order to make an informed judgement about whether they are a match. When a pool of ground-truth information is unavailable, researchers can use “active learning” algorithms for linking, which ask the user to provide ground-truth information for select candidate pairs. In this paper, we investigate the value of providing ground-truth examples via active learning for linking performance. We confirm popular intuition that data linking can be dramatically improved with the availability of ground truth examples. But critically, in many real-world applications, only a relatively small number of tactically-selected ground-truth examples are needed to obtain most of the achievable gains. With a modest investment in ground truth, researchers can approximate the performance of a supervised learning algorithm that has access to a large database of ground truth examples using a readily available off-the-shelf tool…(More)”.
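The abstract's active-learning idea can be sketched in miniature: score candidate record pairs with a string-similarity measure, then ask the human to label the pair the scorer is least certain about. A toy illustration follows — the names, the similarity measure, and the 0.5 decision boundary are all invented for the sketch, not taken from the paper:

```python
from difflib import SequenceMatcher

# Candidate record pairs drawn from two hypothetical files (toy data).
pairs = [
    ("Jon Smith",    "John Smith"),
    ("Maria Garcia", "Maria Garcia"),
    ("R. Jones",     "Robert Jones"),
    ("Alice Chen",   "Bob Patel"),
    ("K. O'Neil",    "Kate ONeil"),
]

def similarity(a, b):
    """Crude string-similarity score in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

scores = [similarity(a, b) for a, b in pairs]

# Active learning: query the human about the pair whose score sits
# closest to the decision boundary (here 0.5), i.e. the pair the
# scorer is least certain about. Obvious matches and obvious
# non-matches are never sent for costly manual review.
uncertainty = [abs(s - 0.5) for s in scores]
query_idx = uncertainty.index(min(uncertainty))
print("Ask the user to label:", pairs[query_idx])
```

In a real linking task the similarity model would be retrained after each labeled pair, so a handful of tactically selected queries can stand in for a large ground-truth database — the paper's central finding.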
Am I Normal? The 200-Year Search for Normal People (and Why They Don’t Exist)
Book by Sarah Chaney: “Before the 19th century, the term ‘normal’ was rarely associated with human behaviour. Normal was a term used in maths, for right angles. People weren’t normal; triangles were.
But from the 1830s, the science of the normal really took off across Europe and North America, with a proliferation of IQ tests, sex studies, a census of hallucinations – even a UK beauty map (which concluded that the women in Aberdeen were “the most repellent”). This book tells the surprising history of how the very notion of the normal came about and how it shaped us all, often while entrenching oppressive values.
Sarah Chaney looks at why we’re still asking the internet: Do I have a normal body? Is my sex life normal? Are my kids normal? And along the way, she challenges why we ever thought it might be a desirable thing to be…(More)”.
Protecting the integrity of survey research
Paper by Jamieson, Kathleen Hall, et al: “Although polling is not irredeemably broken, changes in technology and society create challenges that, if not addressed well, can threaten the quality of election polls and other important surveys on topics such as the economy. This essay describes some of these challenges and recommends remediations to protect the integrity of all kinds of survey research, including election polls. These 12 recommendations specify ways that survey researchers, and those who use polls and other public-oriented surveys, can increase the accuracy and trustworthiness of their data and analyses. Many of these recommendations align practice with the scientific norms of transparency, clarity, and self-correction. The transparency recommendations focus on improving disclosure of factors that affect the nature and quality of survey data. The clarity recommendations call for more precise use of terms such as “representative sample” and clear description of survey attributes that can affect accuracy. The recommendation about correcting the record urges the creation of a publicly available, professionally curated archive of identified technical problems and their remedies. The paper also calls for development of better benchmarks and for additional research on the effects of panel conditioning. Finally, the authors suggest ways to help people who want to use or learn from survey research understand the strengths and limitations of surveys and distinguish legitimate and problematic uses of these methods…(More)”.
China’s fake science industry: how ‘paper mills’ threaten progress
Article by Eleanor Olcott, Clive Cookson and Alan Smith at the Financial Times: “…Over the past two decades, Chinese researchers have become some of the world’s most prolific publishers of scientific papers. The Institute for Scientific Information, a US-based research analysis organisation, calculated that China produced 3.7mn papers in 2021 — 23 per cent of global output, just behind the 4.4mn total from the US.
At the same time, China has been climbing the rankings for the number of times its papers are cited by other authors, a metric used to judge output quality. Last year, China surpassed the US for the first time in the number of most cited papers, according to Japan’s National Institute of Science and Technology Policy, although that figure was flattered by multiple references to Chinese research that first sequenced the Covid-19 virus genome.
The soaring output has sparked concern in western capitals. Chinese advances in high-profile fields such as quantum technology, genomics and space science, as well as Beijing’s surprise hypersonic missile test two years ago, have amplified the view that China is marching towards its goal of achieving global hegemony in science and technology.
That concern is a part of a wider breakdown of trust in some quarters between western institutions and Chinese ones, with some universities introducing background checks on Chinese academics amid fears of intellectual property theft.
But experts say that China’s impressive output masks systemic inefficiencies and an underbelly of low-quality and fraudulent research. Academics complain about the crushing pressure to publish to gain prized positions at research universities…(More)”.
Machine Learning as a Tool for Hypothesis Generation
Paper by Jens Ludwig & Sendhil Mullainathan: “While hypothesis testing is a highly formalized activity, hypothesis generation remains largely informal. We propose a systematic procedure to generate novel hypotheses about human behavior, which uses the capacity of machine learning algorithms to notice patterns people might not. We illustrate the procedure with a concrete application: judge decisions about who to jail. We begin with a striking fact: The defendant’s face alone matters greatly for the judge’s jailing decision. In fact, an algorithm given only the pixels in the defendant’s mugshot accounts for up to half of the predictable variation. We develop a procedure that allows human subjects to interact with this black-box algorithm to produce hypotheses about what in the face influences judge decisions. The procedure generates hypotheses that are both interpretable and novel: They are not explained by demographics (e.g. race) or existing psychology research; nor are they already known (even if tacitly) to people or even experts. Though these results are specific, our procedure is general. It provides a way to produce novel, interpretable hypotheses from any high-dimensional dataset (e.g. cell phones, satellites, online behavior, news headlines, corporate filings, and high-frequency time series). A central tenet of our paper is that hypothesis generation is in and of itself a valuable activity, and we hope this encourages future work in this largely “pre-scientific” stage of science…(More)”.
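The paper's human-in-the-loop procedure is far richer than anything shown here, but its first ingredient — an algorithm that surfaces predictive signal in high-dimensional data that a person might miss — can be crudely approximated with permutation importance on synthetic data. Everything below is an invented stand-in for illustration, not the authors' method:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "high-dimensional" data: only feature 0 truly matters.
n = 500
X = rng.normal(size=(n, 3))
y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=n)

# Fit a simple predictor (a stand-in for the paper's black-box model).
coef, *_ = np.linalg.lstsq(np.c_[np.ones(n), X], y, rcond=None)

def r2(Xm):
    """R-squared of the fitted predictor on a (possibly permuted) X."""
    pred = np.c_[np.ones(n), Xm] @ coef
    ss_res = np.sum((y - pred) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1 - ss_res / ss_tot

base = r2(X)

# Permutation importance: shuffle one feature at a time and record how
# much predictive power is lost. Large drops flag candidate hypotheses
# worth handing to human subjects for interpretation.
importance = []
for j in range(X.shape[1]):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])
    importance.append(base - r2(Xp))

print(importance)  # feature 0 should dominate
```

The paper's contribution is precisely what this sketch lacks: a way to turn opaque predictive signal (pixels, not tidy columns) into interpretable, novel hypotheses.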
Collaborative Advantage: Creating Global Commons for Science, Technology, and Innovation
Essay by Leonard Lynn and Hal Salzman: “…We argue that abandoning this techno-nationalist approach and instead investing in systems of global innovation commons, modeled on successful past experiences, and developing new principles and policies for collaborative STI could bring substantially greater benefits—not only for the world, but specifically for the United States. Key to this effort will be creating systems of governance that enable nations to contribute to the commons and to benefit from its innovations, while also allowing each country substantial freedom of action…
The competitive and insular tone of contemporary discourse about STI stands in contrast to our era’s most urgent challenges, which are global in scale: the COVID-19 pandemic, climate change, and governance of complex emerging technologies such as gene editing and artificial intelligence. These global challenges, we believe, require resources, scientific understanding, and know-how that can best be developed through common resource pools to enable both global scale and rapid dissemination. Moreover, aside from moral or ethical considerations about sharing such innovations, the reality of current globalization means that solutions—such as pandemic vaccines—must spread beyond national borders to fully benefit the world. Consequently, each separate national interest will be better served by collaboratively building up the global stocks of STI as public goods. Global scientific commons could be vital in addressing these challenges, but will require new frameworks for governance that are fair and attractive to many nations while also enabling them to act individually.
A valuable perspective on the governance of common pool resources (CPR) can be found in the work that Nobel laureate Elinor Ostrom did with her colleagues beginning in the 1950s. Ostrom, a political scientist, studied how communities that must share common resources—water, fisheries, or grazing land—use trust, cooperation, and collective deliberation to manage those resources over the long term. Before Ostrom’s work, many economists believed that shared resource systems were inherently unsustainable because individuals acting in their own self-interest would ultimately undermine the good of the group, often described as “the tragedy of the commons.” Instead, Ostrom demonstrated that communities can create durable “practical algorithms” for sharing pooled resources, whether that be irrigation in Nepal or lobster fishing in Maine…(More)”.
The Statistics That Come Out of Nowhere
Article by Ray Fisman, Andrew Gelman, and Matthew C. Stephenson: “This winter, the university where one of us works sent out an email urging employees to wear a hat on particularly cold days because “most body heat is lost through the top of the head.” Many people we know have childhood memories of a specific figure—that perhaps 50 percent or, by some accounts, 80 percent of the heat you lose is through your head. But neither figure is scientific: One is flawed, and the other is patently wrong. A 2004 New York Times column debunking the claim traced its origin to a U.S. military study from the 1950s in which people dressed in neck-high Arctic-survival suits were sent out into the cold. Participants lost about half of their heat through the only part of their body that was exposed to the elements. Exaggeration by generations of parents got us up to 80 percent. (According to a hypothermia expert cited by the Times, a more accurate figure is 10 percent.)
This rather trivial piece of medical folklore is an example of a more serious problem: Through endless repetition, numbers of dubious origin take on the veneer of scientific fact, in many cases in the context of vital public-policy debates. Unreliable numbers are always just an internet search away, and serious people and institutions depend on and repeat seemingly precise quantitative measurements that turn out to have no reliable support…(More)”.
The big idea: should governments run more experiments?
Article by Stian Westlake: “…Conceived in haste in the early days of the pandemic, Recovery (which stands for Randomised Evaluation of Covid-19 Therapy) sought to find drugs to help treat people seriously ill with the novel disease. It brought together epidemiologists, statisticians and health workers to test a range of promising existing drugs at massive scale across the NHS.
The secret of Recovery’s success is that it was a series of large, fast, randomised experiments, designed to be as easy as possible for doctors and nurses to administer in the midst of a medical emergency. And it worked wonders: within three months, it had demonstrated that dexamethasone, a cheap and widely available steroid, reduced Covid deaths by a fifth to a third. In the months that followed, Recovery identified four more effective drugs, and along the way showed that various popular treatments, including hydroxychloroquine, President Trump’s tonic of choice, were useless. All in all, it is thought that Recovery saved a million lives around the world, and it’s still going.
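For readers unfamiliar with how such trial results are summarised, a minimal sketch of a two-arm comparison computes a risk ratio and a normal-approximation confidence interval. The counts below are invented for illustration, not the RECOVERY data:

```python
import math

# Illustrative two-arm trial counts (invented): deaths / patients.
deaths_treat, n_treat = 95, 500   # treatment arm
deaths_ctrl,  n_ctrl  = 140, 500  # usual-care arm

risk_treat = deaths_treat / n_treat   # 0.19
risk_ctrl  = deaths_ctrl / n_ctrl     # 0.28
rr = risk_treat / risk_ctrl           # risk ratio; < 1 means fewer deaths

# 95% CI via the normal approximation on log(RR).
se = math.sqrt(
    (1 - risk_treat) / deaths_treat + (1 - risk_ctrl) / deaths_ctrl
)
lo = math.exp(math.log(rr) - 1.96 * se)
hi = math.exp(math.log(rr) + 1.96 * se)

print(f"risk ratio {rr:.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
```

Randomisation is what makes this arithmetic meaningful: because arms differ only by chance, a confidence interval that excludes 1 is evidence the drug itself — not patient selection — reduced deaths.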
But Recovery’s incredible success should prompt us to ask a more challenging question: why don’t we do this more often? The question of which drugs to use was far from the only unknown we had to navigate in the early days of the pandemic. Consider the decision to delay second doses of the vaccine, when to close schools, or the right regime for Covid testing. In each case, the UK took a calculated risk and hoped for the best. But as the Royal Statistical Society pointed out at the time, it would have been cheap and quick to undertake trials so we could know for sure what the right choice was, and then double down on it.
There is a growing movement to apply randomised trials not just in healthcare but in other things government does…(More)”.
When Ideology Drives Social Science
Article by Michael Jindra and Arthur Sakamoto: “Last summer in these pages, Mordechai Levy-Eichel and Daniel Scheinerman uncovered a major flaw in Richard Jean So’s Redlining Culture: A Data History of Racial Inequality and Postwar Fiction, one that rendered the book’s conclusion null and void. Unfortunately, what they found was not an isolated incident. In complex areas like the study of racial inequality, a fundamentalism has taken hold that discourages sound methodology and the use of reliable evidence about the roots of social problems.
We are not talking about mere differences in interpretation of results, which are common. We are talking about mistakes so clear that they should cause research to be seriously questioned or even disregarded. A great deal of research — we will focus on examinations of Asian American class mobility — rigs its statistical methods in order to arrive at ideologically preferred conclusions.
Most sophisticated quantitative work in sociology involves multivariate research, often in a search for causes of social problems. This work might ask how a particular independent variable (e.g., education level) “causes” an outcome or dependent variable (e.g., income). Or it could study the reverse: How does parental income influence children’s education?
Human behavior is too complicated to be explained by only one variable, so social scientists typically try to “control” for various causes simultaneously. If you are trying to test for a particular cause, you want to isolate that cause and hold all other possible causes constant. One can control for a given variable using what is called multiple regression, a statistical tool that parcels out the separate net effects of several variables simultaneously.
If you want to determine whether income causes better education outcomes, you’d want to compare, for instance, only people from two-parent families, since family status might be another causal factor. You’d likewise want to isolate the effect of family status by comparing people with similar incomes. And so on for other variables.
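The logic of "holding other causes constant" can be sketched with simulated data: regress the outcome on income alone, then again while controlling for family status, and watch the income coefficient shrink toward its true value. The model and numbers below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000

# Simulated data: income and two-parent family status both raise an
# education outcome, and the two causes are correlated (invented model:
# true income effect is 0.05, true family effect is 2.0).
family = rng.binomial(1, 0.5, n)                  # 1 = two-parent family
income = 40 + 20 * family + rng.normal(0, 10, n)
outcome = 0.05 * income + 2.0 * family + rng.normal(0, 1, n)

def ols(y, *cols):
    """Least-squares coefficients with an intercept (crude multiple regression)."""
    X = np.column_stack([np.ones(len(y)), *cols])
    return np.linalg.lstsq(X, y, rcond=None)[0]

naive = ols(outcome, income)               # omits family status
controlled = ols(outcome, income, family)  # "holds family status constant"

print("income effect, no control:  ", round(naive[1], 3))
print("income effect, with control:", round(controlled[1], 3))
```

The naive regression credits income with part of the family-status effect; adding the control recovers roughly the true 0.05. Omitting a correlated cause — the article's next point — biases whatever is left in.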
The problem is that there are potentially so many variables that a researcher inevitably leaves some out…(More)”.
Toward a 21st Century National Data Infrastructure: Mobilizing Information for the Common Good
Report by National Academies of Sciences, Engineering, and Medicine: “Historically, the U.S. national data infrastructure has relied on the operations of the federal statistical system and the data assets that it holds. Throughout the 20th century, federal statistical agencies aggregated survey responses of households and businesses to produce information about the nation and diverse subpopulations. The statistics created from such surveys provide most of what people know about the well-being of society, including health, education, employment, safety, housing, and food security. The surveys also contribute to an infrastructure for empirical social- and economic-sciences research. Research using survey-response data, with strict privacy protections, led to important discoveries about the causes and consequences of major societal challenges and also informed policymakers. Like other infrastructure, people can easily take these essential statistics for granted. Only when they are threatened do people recognize the need to protect them…(More)”.