The Sensitive Politics Of Information For Digital States


Essay by Federica Carugati, Cyanne E. Loyle and Jessica Steinberg: “In 2020, Vice revealed that the U.S. military had signed a contract with Babel Street, a Virginia-based company that created a product called Locate X, which collects location data from users across a variety of digital applications. Some of these apps are seemingly innocuous: one for following storms, a Muslim dating app and a level for DIY home repair. Less innocuously, these reports indicate that the U.S. government is outsourcing some of its counterterrorism and counterinsurgency information-gathering activities to a private company.

While states have always collected information about citizens and their activities, advances in digital technologies — including new kinds of data and infrastructure — have fundamentally altered their ability to access, gather and analyze information. Bargaining with and relying on non-state actors like private companies creates tradeoffs between a state’s effectiveness and legitimacy. Those tradeoffs might be unacceptable to citizens, undermining our very understanding of what states do and how we should interact with them …(More)”

The Statistics That Come Out of Nowhere


Article by Ray Fisman, Andrew Gelman, and Matthew C. Stephenson: “This winter, the university where one of us works sent out an email urging employees to wear a hat on particularly cold days because “most body heat is lost through the top of the head.” Many people we know have childhood memories of a specific figure: that 50 percent, or by some accounts 80 percent, of the heat you lose is through your head. But neither figure is scientific: One is flawed, and the other is patently wrong. A 2004 New York Times column debunking the claim traced its origin to a U.S. military study from the 1950s in which people dressed in neck-high Arctic-survival suits were sent out into the cold. Participants lost about half of their heat through the only part of their body that was exposed to the elements. Exaggeration by generations of parents got us up to 80 percent. (According to a hypothermia expert cited by the Times, a more accurate figure is 10 percent.)

This rather trivial piece of medical folklore is an example of a more serious problem: Through endless repetition, numbers of dubious origin take on the veneer of scientific fact, in many cases in the context of vital public-policy debates. Unreliable numbers are always just an internet search away, and serious people and institutions depend on and repeat seemingly precise quantitative measurements that turn out to have no reliable support…(More)”.

The big idea: should governments run more experiments?


Article by Stian Westlake: “…Conceived in haste in the early days of the pandemic, Recovery (which stands for Randomised Evaluation of Covid-19 Therapy) sought to find drugs to help treat people seriously ill with the novel disease. It brought together epidemiologists, statisticians and health workers to test a range of promising existing drugs at massive scale across the NHS.

The secret of Recovery’s success is that it was a series of large, fast, randomised experiments, designed to be as easy as possible for doctors and nurses to administer in the midst of a medical emergency. And it worked wonders: within three months, it had demonstrated that dexamethasone, a cheap and widely available steroid, reduced Covid deaths by a fifth to a third. In the months that followed, Recovery identified four more effective drugs, and along the way showed that various popular treatments, including hydroxychloroquine, President Trump’s tonic of choice, were useless. All in all, it is thought that Recovery saved a million lives around the world, and it’s still going.

But Recovery’s incredible success should prompt us to ask a more challenging question: why don’t we do this more often? The question of which drugs to use was far from the only unknown we had to navigate in the early days of the pandemic. Consider the decision to delay second doses of the vaccine, when to close schools, or the right regime for Covid testing. In each case, the UK took a calculated risk and hoped for the best. But as the Royal Statistical Society pointed out at the time, it would have been cheap and quick to undertake trials so we could know for sure what the right choice was, and then double down on it.
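To see why such trials are so informative, here is a toy simulation of a two-arm randomized trial; the patient counts and death rates are invented for illustration and are not the actual Recovery figures.

```python
# Toy two-arm randomized trial. All numbers are invented for
# illustration; they are not the actual Recovery data.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000                           # patients randomized to each arm
p_control, p_treated = 0.25, 0.20    # hypothetical 28-day death rates

# Random assignment makes the arms comparable on average, so any
# difference in outcomes can be attributed to the treatment.
deaths_control = rng.binomial(n, p_control)
deaths_treated = rng.binomial(n, p_treated)

risk_ratio = (deaths_treated / n) / (deaths_control / n)
print(f"risk ratio: {risk_ratio:.2f} "
      f"(~{1 - risk_ratio:.0%} relative reduction in deaths)")
```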

There is a growing movement to apply randomised trials not just in healthcare but in other things government does…(More)”.

LocalView, a database of public meetings for the study of local politics and policy-making in the United States


Paper by Soubhik Barari and Tyler Simko: “Despite the fundamental importance of American local governments for service provision in areas like education and public health, local policy-making remains difficult and expensive to study at scale due to a lack of centralized data. This article introduces LocalView, the largest existing dataset of real-time local government public meetings, the central policy-making process in local government. In sum, the dataset currently covers 139,616 videos and their corresponding textual and audio transcripts of local government meetings publicly uploaded to YouTube, the world’s largest public video-sharing website, from 1,012 places and 2,861 distinct governments across the United States between 2006 and 2022. The data are downloaded, processed, cleaned, and publicly disseminated (at localview.net) for analysis across places and over time. We validate this dataset using a variety of methods and demonstrate how it can be used to map local governments’ attention to policy areas of interest. Finally, we discuss how LocalView may be used by journalists, academics, and other users for understanding how local communities deliberate crucial policy questions on topics including climate change, public health, and immigration…(More)”.
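To give a flavor of the kind of analysis the authors describe, the sketch below counts topic mentions in meeting transcripts; the file name and column names are hypothetical stand-ins rather than the actual LocalView schema.

```python
# Hypothetical sketch: measuring local governments' attention to a topic.
# The file name and column names below are illustrative stand-ins, not
# the actual LocalView schema.
import pandas as pd

transcripts = pd.read_csv("localview_transcripts.csv")  # hypothetical export
keywords = ["climate change", "sea level", "emissions"]

pattern = "|".join(keywords)
transcripts["mentions_topic"] = transcripts["transcript_text"].str.contains(
    pattern, case=False, na=False
)

# Share of meetings per state and year that touch the topic.
attention = transcripts.groupby(["state", "year"])["mentions_topic"].mean()
print(attention.head())
```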

It’s Time to Rethink the Idea of “Indigenous” 


Essay by Manvir Singh: “Identity evolves. Social categories shrink or expand, become stiffer or more elastic, more specific or more abstract. What it means to be white or Black, Indian or American, able-bodied or not shifts as we tussle over language, as new groups take on those labels and others strip them away.

On August 3, 1989, the Indigenous identity evolved. Moringe ole Parkipuny, a Maasai activist and a former member of the Tanzanian Parliament, spoke before the U.N. Working Group on Indigenous Populations, in Geneva—the first African ever to do so. “Our cultures and way of life are viewed as outmoded, inimical to national pride, and a hindrance to progress,” he said. As a result, pastoralists like the Maasai, along with hunter-gatherers, “suffer from common problems which characterize the plight of indigenous peoples throughout the world. The most fundamental rights to maintain our specific cultural identity and the land that constitutes the foundation of our existence as a people are not respected by the state and fellow citizens who belong to the mainstream population.”

Parkipuny’s speech was the culmination of an astonishing ascent. Born in a remote village near Tanzania’s Rift Valley, he attended school after British authorities demanded that each family “contribute” a son to be educated. His grandfather urged him to flunk out, but he refused. “I already had a sense of how Maasai were being treated,” he told the anthropologist Dorothy Hodgson in 2005. “I decided I must go on.” He eventually earned an M.A. in development studies from the University of Dar es Salaam.

In his master’s thesis, Parkipuny condemned the Masai Range Project, a twenty-million-dollar scheme funded by the U.S. Agency for International Development to boost livestock productivity. Naturally, then, U.S.A.I.D. was resistant when the Tanzanian government hired him to join the project. In the end, he was sent to the United States to learn about “proper ranches.” He travelled around until, one day, a Navajo man invited him to visit the Navajo Nation, the reservation in the Southwest.

“I stayed with them for two weeks, and then with the Hopi for two weeks,” he told Hodgson. “It was my first introduction to the indigenous world. I was struck by the similarities of our problems.” The disrepair of the roads reminded him of the poor condition of cattle trails in Maasailand…

By the time Parkipuny showed up in Geneva, the concept of “indigenous” had already undergone major transformations. The word—from the Latin indigena, meaning “native” or “sprung from the land”—has been used in English since at least 1588, when a diplomat referred to Samoyed peoples in Siberia as “Indigenæ, or people bred upon that very soyle.” Like “native,” “indigenous” was used not just for people but for flora and fauna as well, suffusing the term with an air of wildness and detaching it from history and civilization. The racial flavor intensified during the colonial period until, again like “native,” “indigenous” served as a partition, distinguishing white settlers—and, in many cases, their slaves—from the non-Europeans who occupied lands before them….When Parkipuny showed up in Geneva, activists were consciously remodelling indigeneity to encompass marginalized peoples worldwide, including, with Parkipuny’s help, in Africa.

Today, nearly half a billion people qualify as Indigenous…(More)”.

When Ideology Drives Social Science


Article by Michael Jindra and Arthur Sakamoto: “Last summer in these pages, Mordechai Levy-Eichel and Daniel Scheinerman uncovered a major flaw in Richard Jean So’s Redlining Culture: A Data History of Racial Inequality and Postwar Fiction, one that rendered the book’s conclusion null and void. Unfortunately, what they found was not an isolated incident. In complex areas like the study of racial inequality, a fundamentalism has taken hold that discourages sound methodology and the use of reliable evidence about the roots of social problems.

We are not talking about mere differences in interpretation of results, which are common. We are talking about mistakes so clear that they should cause research to be seriously questioned or even disregarded. A great deal of research — we will focus on examinations of Asian American class mobility — rigs its statistical methods in order to arrive at ideologically preferred conclusions.

Most sophisticated quantitative work in sociology involves multivariate research, often in a search for causes of social problems. This work might ask how a particular independent variable (e.g., education level) “causes” an outcome or dependent variable (e.g., income). Or it could study the reverse: How does parental income influence children’s education?

Human behavior is too complicated to be explained by only one variable, so social scientists typically try to “control” for various causes simultaneously. If you are trying to test for a particular cause, you want to isolate that cause and hold all other possible causes constant. One can control for a given variable using what is called multiple regression, a statistical tool that parcels out the separate net effects of several variables simultaneously.

If you want to determine whether income causes better education outcomes, you’d want to hold family structure constant, for instance by comparing only people from two-parent families, since family status might be another causal factor. You’d likewise want to isolate the effect of family status by comparing people with similar incomes. And so on for other variables.
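A small simulation makes the logic concrete: when a confounder is omitted, the estimated effect of the variable of interest shifts; when it is controlled for, the estimate recovers the truth. All numbers here are invented.

```python
# Simulated data in which income and family structure BOTH affect
# education; all coefficients are invented for illustration.
import numpy as np

rng = np.random.default_rng(1)
n = 5_000
two_parent = rng.binomial(1, 0.6, n)                  # family structure
income = 30 + 20 * two_parent + rng.normal(0, 10, n)  # confounded with it
education = 10 + 0.10 * income + 2.0 * two_parent + rng.normal(0, 2, n)

def ols(y, *regressors):
    """Least-squares coefficients, with an intercept in column 0."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    return np.linalg.lstsq(X, y, rcond=None)[0]

naive = ols(education, income)                  # omits family structure
controlled = ols(education, income, two_parent)
print(f"income effect, no control:   {naive[1]:.3f}")       # biased upward
print(f"income effect, with control: {controlled[1]:.3f}")  # close to 0.10
```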

The problem is that there are potentially so many variables that a researcher inevitably leaves some out…(More)”.

The False Promise of ChatGPT


Article by Noam Chomsky: “…OpenAI’s ChatGPT, Google’s Bard and Microsoft’s Sydney are marvels of machine learning. Roughly speaking, they take huge amounts of data, search for patterns in it and become increasingly proficient at generating statistically probable outputs — such as seemingly humanlike language and thought. These programs have been hailed as the first glimmers on the horizon of artificial general intelligence — that long-prophesied moment when mechanical minds surpass human brains not only quantitatively in terms of processing speed and memory size but also qualitatively in terms of intellectual insight, artistic creativity and every other distinctively human faculty.

That day may come, but its dawn is not yet breaking, contrary to what can be read in hyperbolic headlines and reckoned by injudicious investments. The Borgesian revelation of understanding has not and will not — and, we submit, cannot — occur if machine learning programs like ChatGPT continue to dominate the field of A.I. However useful these programs may be in some narrow domains (they can be helpful in computer programming, for example, or in suggesting rhymes for light verse), we know from the science of linguistics and the philosophy of knowledge that they differ profoundly from how humans reason and use language. These differences place significant limitations on what these programs can do, encoding them with ineradicable defects.
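The “statistically probable outputs” described above can be made concrete with a toy bigram model, orders of magnitude simpler than ChatGPT but similar in spirit: count which words follow which, then emit the likeliest continuation.

```python
# Toy bigram model: emit the statistically most probable next word,
# given only counts of which words follow which in a tiny corpus.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat . the dog sat on the rug .".split()

follows = defaultdict(Counter)
for prev, word in zip(corpus, corpus[1:]):
    follows[prev][word] += 1    # how often `word` appears after `prev`

def most_probable_next(prev):
    return follows[prev].most_common(1)[0][0]

print(most_probable_next("sat"))   # 'on'  (follows 'sat' twice)
print(most_probable_next("the"))   # 'cat' (four words tie; first wins)
```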

It is at once comic and tragic, as Borges might have noted, that so much money and attention should be concentrated on so little a thing — something so trivial when contrasted with the human mind, which by dint of language, in the words of Wilhelm von Humboldt, can make “infinite use of finite means,” creating ideas and theories with universal reach…(More)”.

Innovation Power: Why Technology Will Define the Future of Geopolitics


Essay by Eric Schmidt: “When Russian forces marched on Kyiv in February 2022, few thought Ukraine could survive. Russia had more than twice as many soldiers as Ukraine. Its military budget was more than ten times as large. The U.S. intelligence community estimated that Kyiv would fall within one to two weeks at most.

Outgunned and outmanned, Ukraine turned to one area in which it held an advantage over the enemy: technology. Shortly after the invasion, the Ukrainian government uploaded all its critical data to the cloud, so that it could safeguard information and keep functioning even if Russian missiles turned its ministerial offices into rubble. The country’s Ministry of Digital Transformation, which Ukrainian President Volodymyr Zelensky had established just two years earlier, repurposed its e-government mobile app, Diia, for open-source intelligence collection, so that citizens could upload photos and videos of enemy military units. With their communications infrastructure in jeopardy, the Ukrainians turned to Starlink satellites and ground stations provided by SpaceX to stay connected. When Russia sent Iranian-made drones across the border, Ukraine acquired its own drones specially designed to intercept their attacks—while its military learned how to use unfamiliar weapons supplied by Western allies. In the cat-and-mouse game of innovation, Ukraine simply proved nimbler. And so what Russia had imagined would be a quick and easy invasion has turned out to be anything but.

Ukraine’s success can be credited in part to the resolve of the Ukrainian people, the weakness of the Russian military, and the strength of Western support. But it also owes to the defining new force of international politics: innovation power. Innovation power is the ability to invent, adopt, and adapt new technologies. It contributes to both hard and soft power. High-tech weapons systems increase military might, new platforms and the standards that govern them provide economic leverage, and cutting-edge research and technologies enhance global appeal. There is a long tradition of states harnessing innovation to project power abroad, but what has changed is the self-perpetuating nature of scientific advances. Developments in artificial intelligence in particular not only unlock new areas of scientific discovery; they also speed up that very process. Artificial intelligence supercharges the ability of scientists and engineers to discover ever more powerful technologies, fostering advances in artificial intelligence itself as well as in other fields—and reshaping the world in the process…(More)”.

Ten lessons for data sharing with a data commons


Article by Robert L. Grossman: “…Lesson 1. Build a commons for a specific community with a specific set of research challenges

Although a few data repositories that serve the general scientific community have proved successful, data commons that target a specific user community have generally been the most successful. The first lesson, then, is to build a data commons for a specific research community that is struggling to answer specific research challenges with data. As a consequence, a data commons is a partnership between the data scientists developing and supporting the commons and the disciplinary scientists with the research challenges.

Lesson 2. Successful commons curate and harmonize the data

Successful commons curate and harmonize the data and produce data products of broad interest to the community. It’s time consuming, expensive, and labor intensive to curate and harmonize data, but much of the value of a data commons is centralizing this work so that it can be done once instead of many times by each group that needs the data. These days, it is very easy to think of a data commons as a platform containing data, not spend the time curating or harmonizing it, and then be surprised that the data in the commons is not more widely used and its impact is not as high as expected.

Lesson 3. It’s ultimately about the data and its value in generating new research discoveries

However important a study, few scientists will try to replicate it once published. Instead, data is usually accessed if it can lead to a new high-impact paper. For this reason, data commons play two different but related roles. First, they preserve data for reproducible science; this accounts for a small fraction of data access but plays a critical role. Second, data commons make data available for new high-value science.

Lesson 4. Reduce barriers to access to increase usage

A useful rule of thumb is that every barrier to data access cuts down access by a factor of 10. Common barriers that reduce use of a commons include: registration vs no registration; open access vs controlled access; click-through agreements vs signing of data usage agreements and approval by data access committees; license restrictions on the use of the data vs no license restrictions…(More)”.
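Taken literally, the factor-of-10 rule compounds quickly. A toy calculation, using purely illustrative numbers:

```python
# Toy calculation of the factor-of-10 rule: each barrier cuts expected
# usage by roughly 10x (illustrative numbers only).
potential_users = 100_000

for barriers in range(4):
    expected = potential_users / 10 ** barriers
    print(f"{barriers} barrier(s): ~{expected:,.0f} users")
```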

Satellite data: The other type of smartphone data you might not know about


Article by Tommy Cooke et al: “Smartphones determine your location in several ways. The first way involves phones triangulating their position using distances to nearby cell towers or Wi-Fi routers.

The second way involves smartphones interacting with navigation satellites. When satellites pass overhead, they transmit signals to smartphones, which allows smartphones to calculate their own location. This process uses a specialized piece of hardware called the Global Navigation Satellite System (GNSS) chipset. Every smartphone has one.

When these GNSS chipsets process navigation satellite signals, they output data in two standardized formats (known as protocols or languages): the GNSS raw measurement protocol and the National Marine Electronics Association protocol (NMEA 0183).

GNSS raw measurements include data such as the distance between satellites and cellphones and measurements of the signal itself.

NMEA 0183 contains similar information to GNSS raw measurements, but also includes additional information such as satellite identification numbers, the number of satellites in a constellation, what country owns a satellite, and the position of a satellite.
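For readers curious what an NMEA 0183 sentence actually looks like, here is a short sketch that verifies and reads one; the sentence is a widely circulated textbook example of a GGA position fix, not real device output.

```python
# One NMEA 0183 sentence (a GGA position fix). This is the widely
# circulated textbook example, not output from a real device.
sentence = "$GPGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,*47"

def checksum_ok(s):
    body, _, received = s.lstrip("$").partition("*")
    calc = 0
    for ch in body:
        calc ^= ord(ch)            # XOR of every character between $ and *
    return f"{calc:02X}" == received.strip().upper()

fields = sentence.split(",")
print("checksum valid: ", checksum_ok(sentence))
print("UTC time:       ", fields[1])               # hhmmss
print("latitude:       ", fields[2], fields[3])    # ddmm.mmm + hemisphere
print("longitude:      ", fields[4], fields[5])    # dddmm.mmm + hemisphere
print("satellites used:", fields[7])
```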

NMEA 0183 was created and is governed by the NMEA, a not-for-profit lobby group that is also a marine electronics trade organization. The NMEA was formed at the 1957 New York Boat Show when boating equipment manufacturers decided to build stronger relationships within the electronic manufacturing industry.

In the decades since, the NMEA 0183 data standard has improved marine electronics communications and is now found on a wide variety of non-marine communications devices today, including smartphones…

It is difficult to know who has access to data produced by these protocols. Access to NMEA protocols is only available under licence to businesses for a fee.

GNSS raw measurements, on the other hand, are a universal standard and can be read by different devices in the same way without a licence. In 2016, Google gave industry open access to them to foster innovation around device-tracking accuracy and precision, analytics about how we move in real time, and predictions about our future movements.

While automated processes can quietly harvest location data — like when a French-based company extracted location data from Salaat First, a Muslim prayer app — these data don’t need to be taken directly from smartphones to be exploited.

Data can be modelled, experimented with, or emulated in licensed devices in labs for innovation and algorithmic development.

Satellite-driven raw measurements from our devices were used to power global surveillance networks like STRIKE3, a now-defunct European-led initiative that monitored and reported perceived threats to navigation satellites…(More)”.