Digital Privacy for Reproductive Choice in the Post-Roe Era


Paper by Aziz Z. Huq and Rebecca Wexler: “The overruling of Roe v. Wade unleashed a torrent of regulatory and punitive activity restricting lawful reproductive options. The turn to the expansive criminal law and new schemes of civil liability creates new, and quite different, concerns from the pre-Roe landscape a half-century ago. Reproductive choice, and its nemesis, rests on information. For pregnant people, deciding on a choice of medical care entails a search for advice and services. Information is at a premium for them. Meanwhile, efforts to regulate abortion begin with clinic closings, but quickly will extend to civil actions and criminal indictments of patients, providers, and those who facilitate abortions. Like the pregnant themselves, criminal and civil enforcers depend on information. And in the contemporary context, the informational landscape, and hence access to counseling and services such as medication abortion, is largely digital. In an era when most people use search engines or social media to access information, the digital architecture and data retention policies of those platforms will determine not only whether the pregnant can access medically accurate advice but also whether the mere act of doing so places them in legal peril.

This Article offers the first comprehensive accounting of abortion-related digital privacy after the end of Roe. It demonstrates first that digital privacy for pregnant persons in the United States has suddenly become a tremendously fraught and complex question. It then maps the treacherous social, legal and economic terrain upon which firms, individuals, and states will make privacy-related decisions. Building on this political economy, we develop a moral and economic argument to the effect that digital firms should maximize digital privacy for pregnant persons within the scope of the law, and should actively resist restrictionist states’ efforts to instrumentalize them into their war on reproductive choice. We then lay out precise, tangible steps that firms should take to enact this active resistance, explaining in particular a range of powerful yet legal options for firms to refuse cooperation with restrictionist criminal and civil investigations. Finally, we present an original, concrete and immediately actionable proposal for federal and state legislative intervention: a statutory evidentiary privilege to shield abortion-relevant data from restrictionist warrants, subpoenas, court orders, and judicial proceedings…(More)”

Income Inequality Is Rising. Are We Even Measuring It Correctly?


Article by Jon Jachimowicz et al: “Income inequality is on the rise in many countries around the world, according to the United Nations. What’s more, disparities in global income were exacerbated by the COVID-19 pandemic, with some countries facing greater economic losses than others.

Policymakers are increasingly focusing on finding ways to reduce inequality to create a more just and equal society for all. In making decisions on how to best intervene, policymakers commonly rely on the Gini coefficient, a statistical measure of resource distribution, including wealth and income levels, within a population. The Gini coefficient measures perfect equality as zero and maximum inequality as one, with higher numbers indicating a greater concentration of resources in the hands of a few.
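
As a rough illustration of what the coefficient captures, here is a minimal Python sketch (not from the article; the function and sample incomes are hypothetical) that computes the Gini coefficient for a small set of incomes:

```python
import numpy as np

def gini(incomes):
    """Gini coefficient of a list of non-negative incomes.

    0 means perfect equality; values approaching 1 mean resources are
    concentrated in the hands of a few.
    """
    x = np.sort(np.asarray(incomes, dtype=float))
    n = x.size
    if n == 0 or x.sum() == 0:
        return 0.0
    ranks = np.arange(1, n + 1)
    # Standard formula on sorted data: G = 2*sum(i*x_i)/(n*sum(x_i)) - (n+1)/n
    return 2.0 * np.sum(ranks * x) / (n * x.sum()) - (n + 1.0) / n

print(gini([50_000] * 5))                          # 0.0  (everyone earns the same)
print(round(gini([1, 1, 1, 1, 1, 1_000_000]), 3))  # ~0.833 (one person holds almost everything)
```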

This measure has long dominated our understanding (pdf) of what inequality means, largely because this metric is used by governments around the world, is released by statistics bureaus in multiple countries, and is commonly discussed in news media and policy discussions alike.

In our paper, recently published in Nature Human Behaviour, we argue that researchers and policymakers rely too heavily on the Gini coefficient—and that by broadening our understanding of how we measure inequality, we can both uncover its impact and intervene to more effectively correct it…(More)”.

Nudging Science Towards Fairer Evaluations: Evidence From Peer Review


Paper by Inna Smirnova, Daniel M. Romero, and Misha Teplitskiy: “Peer review is widely used to select scientific projects for funding and publication, but there is growing evidence that it is biased towards prestigious individuals and institutions. Although anonymizing submissions can reduce prestige bias, many organizations do not implement anonymization, in part because enforcing it can be prohibitively costly. Here, we examine whether nudging but not forcing authors to anonymize their submissions reduces prestige bias. We partnered with IOP Publishing, one of the largest academic publishers, which adopted a policy strongly encouraging authors to anonymize their submissions and staggered the policy rollout across its physics journal portfolio. We examine 156,015 submissions to 57 peer-reviewed journals received between January 2018 and February 2022 and measure author prestige with citations accrued at submission time. Higher-prestige first authors were less likely to anonymize. Nevertheless, for low-prestige authors, the policy increased positive peer reviews by 2.4% and acceptance by 5.6%. For middle- and high-prestige authors, the policy decreased positive reviews (1.8% and 1%) and final acceptance (4.6% and 2.2%). The policy did not have unintended consequences on reviewer recruitment or the characteristics of submitting authors. Overall, nudges are a simple, low-cost, and effective method to reduce prestige bias and should be considered by organizations for which enforced anonymization is impractical…(More)”.

The Low Threshold for Face Recognition in New Delhi


Article by Varsha Bansal: “Indian law enforcement is starting to place huge importance on facial recognition technology. Delhi police, looking into identifying people involved in civil unrest in northern India in the past few years, said that they would consider 80 percent accuracy and above as a “positive” match, according to documents obtained by the Internet Freedom Foundation through a public records request.

Facial recognition’s arrival in India’s capital region marks the expansion of Indian law enforcement officials using facial recognition data as evidence for potential prosecution, ringing alarm bells among privacy and civil liberties experts. There are also concerns about the 80 percent accuracy threshold, which critics say is arbitrary and far too low, given the potential consequences for those marked as a match. India’s lack of a comprehensive data protection law makes matters even more concerning.

The documents further state that even if a match is under 80 percent, it would be considered a “false positive” rather than a negative, which would make that individual “subject to due verification with other corroborative evidence.”

“This means that even though facial recognition is not giving them the result that they themselves have decided is the threshold, they will continue to investigate,” says Anushka Jain, associate policy counsel for surveillance and technology with the IFF, who filed for this information. “This could lead to harassment of the individual just because the technology is saying that they look similar to the person the police are looking for.” She added that this move by the Delhi Police could also result in harassment of people from communities that have been historically targeted by law enforcement officials…(More)”
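
To make the reported decision rule concrete, the following Python sketch shows how an 80 percent cutoff might be applied to candidate matches. It is a hypothetical illustration, not the Delhi Police system; the names, scores, and labels are invented:

```python
from dataclasses import dataclass

POSITIVE_THRESHOLD = 0.80  # the 80 percent cutoff described in the documents

@dataclass
class Candidate:
    person_id: str
    similarity: float  # similarity score from a face-matching model, 0.0 to 1.0

def triage(candidate: Candidate) -> str:
    """Label a candidate the way the reported policy describes.

    Scores at or above the threshold count as a "positive" match; scores
    below it are not discarded but flagged for "due verification with
    other corroborative evidence".
    """
    if candidate.similarity >= POSITIVE_THRESHOLD:
        return "positive match"
    return "below threshold: subject to further verification"

print(triage(Candidate("A-102", 0.86)))  # positive match
print(triage(Candidate("B-417", 0.62)))  # below threshold: subject to further verification
```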

Blue Spoons: Sparking Communication About Appropriate Technology Use


Paper by Arun G. Chandrasekhar, Esther Duflo, Michael Kremer, João F. Pugliese, Jonathan Robinson & Frank Schilbach: “An enduring puzzle regarding technology adoption in developing countries is that new technologies often diffuse slowly through the social network. Two of the key predictions of the canonical epidemiological model of technology diffusion are that forums to share information and higher returns to technology should both spur social transmission. We design a large-scale experiment to test these predictions among farmers in Western Kenya, and we fail to find support for either. However, in the same context, we introduce a technology that diffuses very fast: a simple kitchen spoon (painted in blue) to measure out how much fertilizer to use. We develop a model that explains both the failure of the standard approaches and the surprising success of this new technology. The core idea of the model is that not all information is reliable, and farmers are reluctant to develop a reputation for passing along false information. The model and data suggest that there is value in developing simple, transparent technologies to facilitate communication…(More)”.

A journey toward an open data culture through transformation of shared data into a data resource


Paper by Scott D. Kahn and Anne Koralova: “The transition to open data practices is straightforward albeit surprisingly challenging to implement largely due to cultural and policy issues. A general data sharing framework is presented along with two case studies that highlight these challenges and offer practical solutions that can be adjusted depending on the type of data collected, the country in which the study is initiated, and the prevailing research culture. Embracing the constraints imposed by data privacy considerations, especially for biomedical data, must be emphasized for data outside of the United States until data privacy law(s) are established at the Federal and/or State level…(More).”

Without appropriate metadata, data-sharing mandates are pointless


Article by Mark A. Musen: “Last month, the US government announced that research articles and most underlying data generated with federal funds should be made publicly available without cost, a policy to be implemented by the end of 2025. That’s atop other important moves. The European Union’s programme for science funding, Horizon Europe, already mandates that almost all data be FAIR (that is, findable, accessible, interoperable and reusable). The motivation behind such data-sharing policies is to make data more accessible so others can use them to both verify results and conduct further analyses.

But just getting those data sets online will not bring anticipated benefits: few data sets will really be FAIR, because most will be unfindable. What’s needed are policies and infrastructure to organize metadata.

Imagine having to search for publications on some topic — say, methods for carbon reclamation — but you could use only the article titles (no keywords, abstracts or search terms). That’s essentially the situation for finding data sets. If I wanted to identify all the deposited data related to carbon reclamation, the task would be futile. Current metadata often contain only administrative and organizational information, such as the name of the investigator and the date when the data were acquired.

What’s more, for scientific data to be useful to other researchers, metadata must sensibly and consistently communicate essentials of the experiments — what was measured, and under what conditions. As an investigator who builds technology to assist with data annotation, I find it frustrating that, in the majority of fields, the metadata standards needed to make data FAIR don’t even exist.

Metadata about data sets typically lack experiment-specific descriptors. If present, they’re sparse and idiosyncratic. An investigator searching the Gene Expression Omnibus (GEO), for example, might seek genomic data sets containing information on how a disease or condition manifests itself in young animals or humans. Performing such a search requires knowledge of how the age of individuals is represented — which, in the GEO repository, could be age, AGE, age (after birth), age (years), Age (yr-old) or dozens of other possibilities. (Often, such information is missing from data sets altogether.) Because the metadata are so ad hoc, automated searches fail, and investigators waste enormous amounts of time manually sifting through records to locate relevant data sets, with no guarantee that most (or any) can be found…(More)”.
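
To see why such ad hoc field names defeat automated search, here is a minimal Python sketch (the records and matching rule are invented for illustration) that tries to recover an age value from variously labeled GEO-style metadata:

```python
import re

# A few of the variant field names the article reports seeing in GEO records
# (the record contents here are invented for illustration).
raw_records = [
    {"age": "6"},
    {"AGE": "6 weeks"},
    {"age (after birth)": "6 wk"},
    {"Age (yr-old)": "0.12"},
    {"strain": "C57BL/6"},  # record with no age field at all
]

def find_age(record):
    """Return the value of any field whose name starts with 'age', else None.

    This kind of guesswork is exactly the manual sifting the article
    describes: without a shared metadata standard, a search tool has to
    anticipate every spelling an author might have chosen.
    """
    for key, value in record.items():
        if re.match(r"age\b", key.strip(), flags=re.IGNORECASE):
            return value
    return None

for rec in raw_records:
    print(rec, "->", find_age(rec))
```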

The New ADP National Employment Report


Press Release: “The new ADP National Employment Report (NER) launched today in collaboration with the Stanford Digital Economy Lab. Earlier this spring, the ADP Research Institute paused the NER in order to refine the methodology and design of the report. Part of that evolution was teaming up with data scientists at the Stanford Digital Economy Lab to add a new perspective and rigor to the report. The new report uses fine-grained, high-frequency data on jobs and wages to deliver a richer and more useful analysis of the labor market.

Let’s take a look at some of the key changes with the new NER, along with the new ADP® Pay Insights Report.

It’s independent. The key change is that the new ADP NER is an independent measure of the US labor market, rather than a forecast of the BLS monthly jobs number. The jobs report and pay insights are based on anonymized and aggregated payroll data from more than 25 million US employees across 500,000 companies. The new report focuses solely on ADP’s clients and private-sector change…(More)”.

Measuring Small Business Dynamics and Employment with Private-Sector Real-Time Data


Paper by André Kurmann, Étienne Lalé and Lien Ta: “The COVID-19 pandemic has led to an explosion of research using private-sector datasets to measure business dynamics and employment in real-time. Yet questions remain about the representativeness of these datasets and how to distinguish business openings and closings from sample churn – i.e., sample entry of already operating businesses and sample exits of businesses that continue operating. This paper proposes new methods to address these issues and applies them to the case of Homebase, a real-time dataset of mostly small service-sector businesses that has been used extensively in the literature to study the effects of the pandemic. We match the Homebase establishment records with information on business activity from Safegraph, Google, and Facebook to assess the representativeness of the data and to estimate the probability of business closings and openings among sample exits and entries. We then exploit the high frequency / geographic detail of the data to study whether small service-sector businesses have been hit harder by the pandemic than larger firms, and the extent to which the Paycheck Protection Program (PPP) helped small businesses keep their workforce employed. We find that our real-time estimates of small business dynamics and employment during the pandemic are remarkably representative and closely fit population counterparts from administrative data that have recently become available. Distinguishing business closings and openings from sample churn is critical for these results. We also find that while employment by small businesses contracted more severely in the beginning of the pandemic than employment of larger businesses, it also recovered more strongly thereafter. In turn, our estimates suggest that the rapid rollout of PPP loans significantly mitigated the negative employment effects of the pandemic. Business closings and openings are a key driver for both results, thus underlining the importance of properly correcting for sample churn…(More)”.

One Data Point Can Beat Big Data


Essay by Gerd Gigerenzer: “…In my research group at the Max Planck Institute for Human Development, we’ve studied simple algorithms (heuristics) that perform well under volatile conditions. One way to derive these rules is to rely on psychological AI: to investigate how the human brain deals with situations of disruption and change. Back in 1838, for instance, Thomas Brown formulated the Law of Recency, which states that recent experiences come to mind faster than those in the distant past and are often the sole information that guides human decision. Contemporary research indicates that people do not automatically rely on what they recently experienced, but only do so in unstable situations where the distant past is not a reliable guide for the future. In this spirit, my colleagues and I developed and tested the following “brain algorithm”:

Recency heuristic for predicting the flu: Predict that this week’s proportion of flu-related doctor visits will equal that of the most recent data, from one week ago.

Unlike Google’s secret Flu Trends algorithm, this rule is transparent and can be easily applied by everyone. Its logic can be understood. It relies on a single data point only, which can be looked up on the website of the Centers for Disease Control and Prevention. And it dispenses with combing through 50 million search terms and trial-and-error testing of millions of algorithms. But how well does it actually predict the flu?

Three fellow researchers and I tested the recency rule using the same eight years of data on which the Google Flu Trends algorithm was tested, that is, weekly observations between March 2007 and August 2015. During that time, the proportion of flu-related visits among all doctor visits ranged between one percent and eight percent, with an average of 1.8 percent per week (Figure 1). This means that if every week you were to make the simple but false prediction that there are zero flu-related doctor visits, you would have a mean absolute error of 1.8 percentage points over four years. Google Flu Trends predicted much better than that, with a mean error of 0.38 percentage points (Figure 2). The recency heuristic had a mean error of only 0.20 percentage points, which is even better. If we exclude the period when the swine flu happened, that is, before the first update of Google Flu Trends, the result remains essentially the same (0.38 and 0.19, respectively)….(More)”.
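
For readers who want to see the mechanics, here is a minimal Python sketch of the recency heuristic and the mean-absolute-error comparison it is judged by; the weekly series is invented for illustration, not the CDC data used in the essay:

```python
import numpy as np

# Illustrative weekly percentages of flu-related doctor visits. These numbers
# are invented for demonstration; the essay's results use CDC data from
# March 2007 to August 2015.
observed = np.array([1.2, 1.4, 1.9, 2.6, 3.1, 2.4, 1.7, 1.3])

# Recency heuristic: this week's forecast is simply last week's observation.
predictions = observed[:-1]   # forecasts for weeks 2..N
actuals = observed[1:]        # realized values for weeks 2..N

mae = np.mean(np.abs(predictions - actuals))
print(f"Mean absolute error of the recency heuristic: {mae:.2f} percentage points")
```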