Facial Recognition Plan from IRS Raises Big Concerns


Article by James Hendler: “The U.S. Internal Revenue Service is planning to require citizens to create accounts with a private facial recognition company in order to file taxes online. The IRS is joining a growing number of federal and state agencies that have contracted with ID.me to authenticate the identities of people accessing services.

The IRS’s move is aimed at cutting down on identity theft, a crime that affects millions of Americans. The IRS, in particular, has reported a number of tax filings from people claiming to be others, and fraud in many of the programs administered as part of the American Rescue Plan has been a major concern for the government.

The IRS decision has prompted a backlash, in part over concerns about requiring citizens to use facial recognition technology and in part over difficulties some people have had in using the system, particularly with some state agencies that provide unemployment benefits. The reaction has prompted the IRS to revisit its decision.

As a computer science researcher and the chair of the Global Technology Policy Council of the Association for Computing Machinery, I have been involved in exploring some of the issues with government use of facial recognition technology, both how it is used and its potential flaws. Many concerns have been raised over the general use of this technology in policing and other government functions, often focused on whether the accuracy of these algorithms can have discriminatory effects. In the case of ID.me, there are other issues involved as well….(More)”.

COVID’s lesson for governments? Don’t cherry-pick advice, synthesize it


Essay by Geoff Mulgan: “Too many national leaders get good guidance yet make poor decisions…Handling complex scientific issues in government is never easy — especially during a crisis, when uncertainty is high, stakes are huge and information is changing fast. But for some of the nations that have fared the worst in the COVID-19 pandemic, there’s a striking imbalance between the scientific advice available and the capacity to make sense of it. Some advice is ignored because it’s politically infeasible or impractical. Nonetheless, much good scientific input has fallen by the wayside because there’s no means to pick it up.

Part of the problem has been a failure of synthesis — the ability to combine insights and transcend disciplinary boundaries. Creating better syntheses should be a governmental priority as the crisis moves into a new phase….

Input from evidence synthesis is crucial for policymaking. But the capacity of governments to absorb such evidence is limited, and syntheses for decisions must go much further in terms of transparently incorporating assessments of political or practical feasibility, implementation, benefits and cost, among many other factors. The gap between input and absorption is glaring.

I’ve addressed teams in the UK prime minister’s office, the European Commission and the German Chancellery about this issue. In responding to the pandemic, some countries (including France and the United Kingdom) have tried to look at epidemiological models alongside economic ones, but none has modelled the social or psychological effects of different policy choices, and none would claim to have achieved a truly synthetic approach.

There are dozens of good examples of holistic thinking and action: programmes to improve public health in Finland, cut UK street homelessness, reduce poverty in China. But for many governments, the capacity to see things in the round has waned over the past decade. The financial crisis of 2007 and then populism both shortened governments’ time horizons for planning and policy in the United States and Europe….

The worst governments rely on intuition. But even the best resort to simple heuristics — for example, that it’s best to act fast, or that prioritizing health is also good for the economy. This was certainly true in 2020 and 2021. But that might change with higher vaccination and immunity rates.

What would it mean to transcend simple heuristics and achieve a truly synthetic approach? It would involve mapping and ranking relevant factors (from potential impacts on hospital capacity to the long-run effects of isolation); using formal and informal models to capture feedbacks, trade-offs and synergies; and more creative work to shape options.

Usually, such work is best done by teams that encompass breadth and depth, disparate disciplines, diverse perspectives and both officials and outsiders. Good examples include Singapore’s Strategy Group (and Centre for Strategic Futures), which helps the country to execute sophisticated plans on anything from cybercrime to climate resilience. But most big countries, despite having large bureaucracies, lack comparable teams…(More)”.

Sample Truths


Christopher Beha at Harper’s Magazine: “…How did we ever come to believe that surveys of this kind could tell us something significant about ourselves?

One version of the story begins in the middle of the seventeenth century, after the Thirty Years’ War left the Holy Roman Empire a patchwork of sovereign territories with uncertain borders, contentious relationships, and varied legal conventions. The resulting “weakness and need for self-definition,” the French researcher Alain Desrosières writes, created a demand among local rulers for “systematic cataloging.” This generally took the form of descriptive reports. Over time the proper methods and parameters of these reports became codified, and thus was born the discipline of Statistik: the systematic study of the attributes of a state.

As Germany was being consolidated in the nineteenth century, “certain officials proposed using the formal, detailed framework of descriptive statistics to present comparisons between the states” by way of tables in which “the countries appeared in rows, and different (literary) elements of the description appeared in columns.” In this way, a single feature, such as population or climate, could be easily removed from its context. Statistics went from being a method for creating a holistic description of one place to what Desrosières calls a “cognitive space of equivalence.” Once this change occurred, it was only a matter of time before the descriptions themselves were put into the language of equivalence, which is to say, numbers.

The development of statistical reasoning was central to the “project of legibility,” as the anthropologist James C. Scott calls it, ushered in by the rise of nation-states. Strong centralized governments, Scott writes in Seeing Like a State, required that local communities be made “legible,” their features abstracted to enable management by distant authorities. In some cases, such “state simplifications” occurred at the level of observation. Cadastral maps, for example, ignored local land-use customs, focusing instead on the points relevant to the state: How big was each plot, and who was responsible for paying taxes on it?

But legibility inevitably requires simplifying the underlying facts, often through coercion. The paradigmatic example here is postrevolutionary France. For administrative purposes, the country was divided into dozens of “departments” of roughly equal size whose boundaries were drawn to break up culturally cohesive regions such as Normandy and Provence. Local dialects were effectively banned, and use of the new, highly rational metric system was required. (As many commentators have noted, this work was a kind of domestic trial run for colonialism.)

One thing these centralized states did not need to make legible was their citizens’ opinions—on the state itself, or anything else for that matter. This was just as true of democratic regimes as authoritarian ones. What eventually helped bring about opinion polling was the rise of consumer capitalism, which created the need for market research.

But expanding the opinion poll beyond questions like “Pepsi or Coke?” required working out a few kinks. As the historian Theodore M. Porter notes, pollsters quickly learned that “logically equivalent forms of the same question produce quite different distributions of responses.” This fact might have led them to doubt the whole undertaking. Instead, they “enforced a strict discipline on employees and respondents,” instructing pollsters to “recite each question with exactly the same wording and in a specified order.” Subjects were then made “to choose one of a small number of packaged statements as the best expression of their opinions.”

This approach has become so familiar that it may be worth noting how odd it is to record people’s opinions on complex matters by asking them to choose among prefabricated options. Yet the method has its advantages. What it sacrifices in accuracy it makes up in pseudoscientific precision and quantifiability. Above all, the results are legible: the easiest way to be sure you understand what a person is telling you is to put your own words in his mouth.

Scott notes a kind of Heisenberg principle to state simplifications: “They frequently have the power to transform the facts they take note of.” This is another advantage to multiple-choice polling. If people are given a narrow range of opinions, they may well think that those are the only options available, and in choosing one, they may well accept it as wholly their own. Even those of us who reject the stricture of these options for ourselves are apt to believe that they fairly represent the opinions of others. One doesn’t have to be a postmodern relativist to suspect that what’s going on here is as much the construction of a reality as the depiction of one….(More)”.

Leveraging Non-Traditional Data For The Covid-19 Socioeconomic Recovery Strategy


Article by Deepali Khanna: “To this end, it is opportune to ask the following questions: Can we harness the power of data routinely collected by companies—including transportation providers, mobile network operators, social media networks and others—for the public good? Can we bridge the data gap to give governments access to data, insights and tools that can inform national and local response and recovery strategies?

There is increasing recognition that traditional and non-traditional data should be seen as complementary resources. Non-traditional data can bring significant benefits in bridging existing data gaps but must still be calibrated against benchmarks based on established traditional data sources. These traditional datasets are widely seen as reliable because they are subject to established, stringent international and national standards. However, they are often limited in frequency and granularity, especially in low- and middle-income countries, given the cost and time required to collect such data. For example, official economic indicators such as GDP, household consumption and consumer confidence may be available only at the national or regional level, with quarterly updates…

In the Philippines, UNDP, with support from The Rockefeller Foundation and the government of Japan, recently set up the Pintig Lab: a multidisciplinary network of data scientists, economists, epidemiologists, mathematicians and political scientists, tasked with supporting data-driven crisis response and development strategies. In early 2021, the Lab conducted a study which explored how household spending on consumer-packaged goods, or fast-moving consumer goods (FMCGs), can be used to assess the socioeconomic impact of Covid-19 and identify heterogeneities in the pace of recovery across households in the Philippines. The Philippine National Economic and Development Authority is now in the process of incorporating this data into its GDP forecasting, as additional input to its predictive models for consumption. Further, this data can be combined with other non-traditional datasets such as credit card or mobile wallet transactions, and machine learning techniques for higher-frequency GDP nowcasting, to allow for more nimble and responsive economic policies that can both absorb and anticipate the shocks of crisis….(More)”.
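To make the nowcasting idea above concrete, here is a minimal sketch with invented placeholder data, not the Pintig Lab's or the Authority's actual inputs or model: an official quarterly consumption series is regressed on higher-frequency indicators (an FMCG spending index and card-transaction volumes), and the fitted model estimates the current quarter before the official statistic is published.

```python
# Hypothetical sketch of indicator-based nowcasting; all figures are
# synthetic placeholders, not real Philippine data.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)

# Quarterly averages of two higher-frequency indicators (index values)...
fmcg_index = rng.normal(100, 5, size=20)
card_spend = rng.normal(100, 8, size=20)
# ...and the official consumption growth figure published later each quarter (%).
consumption_growth = 0.04 * fmcg_index + 0.02 * card_spend + rng.normal(0, 0.3, 20)

X = np.column_stack([fmcg_index, card_spend])

# Fit on past quarters, for which the official figure is already known.
model = Ridge(alpha=1.0).fit(X[:-1], consumption_growth[:-1])

# "Nowcast" the latest quarter from indicators that are available immediately,
# well before the official statistic is released.
print(model.predict(X[-1:]))
```

A real model would draw on many more indicators and be validated against revised official figures; the sketch only illustrates the timing advantage of non-traditional data.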

Automation exacts a toll in inequality


Rana Foroohar at The Financial Times: “When humans compete with machines, wages go down and jobs go away. But, ultimately, new categories of better work are created. The mechanisation of agriculture in the first half of the 20th century, or advances in computing and communications technology in the 1950s and 1960s, for example, went hand in hand with strong, broadly shared economic growth in the US and other developed economies.

But, in later decades, something in this relationship began to break down. Since the 1980s, we’ve seen the robotics revolution in manufacturing; the rise of software in everything; the consumer internet and the internet of things; and the growth of artificial intelligence. But during this time trend GDP growth in the US has slowed, inequality has risen and many workers — particularly, men without college degrees — have seen their real earnings fall sharply.

Globalisation and the decline of unions have played a part. But so has technological job disruption. That issue is beginning to get serious attention in Washington. In particular, politicians and policymakers are homing in on the work of MIT professor Daron Acemoglu, whose research shows that mass automation is no longer a win-win for both capital and labour. He testified before a US House of Representatives select committee in November that automation — the substitution of machines and algorithms for tasks previously performed by workers — is responsible for 50-70 per cent of the economic disparities experienced between 1980 and 2016.

Why is this happening? Basically, while the automation of the early 20th century and the post-1945 period “increased worker productivity in a diverse set of industries and created myriad opportunities for them”, as Acemoglu said in his testimony, “what we’ve experienced since the mid 1980s is an acceleration in automation and a very sharp deceleration in the introduction of new tasks”. Put simply, he added, “the technological portfolio of the American economy has become much less balanced, and in a way that is highly detrimental to workers and especially low-education workers.”

What’s more, some things we are automating these days aren’t so economically beneficial. Consider those annoying computerised checkout stations in drug stores and groceries that force you to self-scan your purchases. They may save retailers a bit in labour costs, but they are hardly the productivity enhancer of, say, a self-driving combine harvester. Cecilia Rouse, chair of the White House’s Council of Economic Advisers, spoke for many when she told a Council on Foreign Relations event that she’d rather “stand in line [at the pharmacy] so that someone else has a job — it may not be a great job, but it is a job — and where I actually feel like I get better assistance.”

Still, there’s no holding back technology. The question is how to make sure more workers can capture its benefits. In her “Virtual Davos” speech a couple of weeks ago, Treasury secretary Janet Yellen pointed out that recent technologically driven productivity gains might exacerbate rather than mitigate inequality. She pointed to the fact that, while the “pandemic-induced surge in telework” will ultimately raise US productivity by 2.7 per cent, the gains will accrue mostly to upper income, white-collar workers, just as online learning has been better accessed and leveraged by wealthier, white students.

Education is where the rubber meets the road in fixing technology-driven inequality. As Harvard researchers Claudia Goldin and Lawrence Katz have shown, when the relationship between education and technology gains breaks down, tech-driven prosperity is no longer as widely shared. This is why the Biden administration has been pushing investments into community college, apprenticeships and worker training…(More)”.

Suicide hotline shares data with for-profit spinoff, raising ethical questions


Alexandra Levine at Politico: “Crisis Text Line is one of the world’s most prominent mental health support lines, a tech-driven nonprofit that uses big data and artificial intelligence to help people cope with traumas such as self-harm, emotional abuse and thoughts of suicide.

But the data the charity collects from its online text conversations with people in their darkest moments does not end there: The organization’s for-profit spinoff uses a sliced and repackaged version of that information to create and market customer service software.

Crisis Text Line says any data it shares with that company, Loris.ai, has been wholly “anonymized,” stripped of any details that could be used to identify people who contacted the helpline in distress. Both entities say their goal is to improve the world — in Loris’ case, by making “customer support more human, empathetic, and scalable.”

In turn, Loris has pledged to share some of its revenue with Crisis Text Line. The nonprofit also holds an ownership stake in the company, and the two entities shared the same CEO for at least a year and a half. The two call their relationship a model for how commercial enterprises can help charitable endeavors thrive…(More).”

We Still Can’t See American Slavery for What It Was


Jamelle Bouie at the New York Times: “…It is thanks to decades of painstaking, difficult work that we know a great deal about the scale of human trafficking across the Atlantic Ocean and about the people aboard each ship. Much of that research is available to the public in the form of the SlaveVoyages database. A detailed repository of information on individual ships, individual voyages and even individual people, it is a groundbreaking tool for scholars of slavery, the slave trade and the Atlantic world. And it continues to grow. Last year, the team behind SlaveVoyages introduced a new data set with information on the domestic slave trade within the United States, titled “Oceans of Kinfolk.”

The systematic effort to quantify the slave trade goes back at least as far as the 19th century…

Because of its specificity with regard to individual enslaved people, this new information is as pathbreaking for lay researchers and genealogists as it is for scholars and historians. It is also, for me, an opportunity to think about the difficult ethical questions that surround this work: How exactly do we relate to data that allows someone — anyone — to identify a specific enslaved person? How do we wield these powerful tools for quantitative analysis without abstracting the human reality away from the story? And what does it mean to study something as wicked and monstrous as the slave trade using some of the tools of the trade itself?…

“The data that we have about those ships is also kind of caught in a stranglehold of ship captains who care about some things and don’t care about others,” Jennifer Morgan said. We know what was important to them. It is the task of the historian to bring other resources to bear on this knowledge, to shed light on what the documents, and the data, might obscure.

“By merely reproducing the metrics of slave traders,” Fuentes said, “you’re not actually providing us with information about the people, the humans, who actually bore the brunt of this violence. And that’s important. It is important to humanize this history, to understand that this happened to African human beings.”

It’s here that we must engage with the question of the public. Work like the SlaveVoyages database exists in the “digital humanities,” a frequently public-facing realm of scholarship and inquiry. And within that context, an important part of respecting the humanity of the enslaved is thinking about their descendants.

“If you’re doing a digital humanities project, it exists in the world,” said Jessica Marie Johnson, an assistant professor of history at Johns Hopkins and the author of “Wicked Flesh: Black Women, Intimacy, and Freedom in the Atlantic World.” “It exists among a public that is beyond the academy and beyond Silicon Valley. And that means that there should be certain other questions that we ask, a different kind of ethics of care and a different morality that we bring to things.”…(More)”.

The UN is testing technology that processes data confidentially


The Economist: “Reasons of confidentiality mean that many medical, financial, educational and other personal records, from the analysis of which much public good could be derived, are in practice unavailable. A lot of commercial data are similarly sequestered. For example, firms have more granular and timely information on the economy than governments can obtain from surveys. But such intelligence would be useful to rivals. If companies could be certain it would remain secret, they might be more willing to make it available to officialdom.

A range of novel data-processing techniques might make such sharing possible. These so-called privacy-enhancing technologies (PETs) are still in the early stages of development. But they are about to get a boost from a project launched by the United Nations’ statistics division. The UN PETs Lab, which opened for business officially on January 25th, enables national statistics offices, academic researchers and companies to collaborate to carry out projects which will test various PETs, permitting technical and administrative hiccups to be identified and overcome.

The first such effort, which actually began last summer, before the PETs Lab’s formal inauguration, analysed import and export data from national statistical offices in America, Britain, Canada, Italy and the Netherlands, to look for anomalies. Those could be a result of fraud, of faulty record keeping or of innocuous re-exporting.
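As a plain, non-private illustration of the kind of anomaly the pilot was looking for (the actual analysis ran under the PETs described below; the figures, the flag_anomalies helper and the tolerance threshold here are invented), one could compare "mirror" statistics: the exporter's reported value of a flow against the importer's report of the same flow, flagging pairs that diverge sharply.

```python
# Illustrative mirror-statistics check; all values are made-up placeholders,
# not the UN PETs Lab's data or code.
reported_exports = {
    # (exporter, importer, commodity): value reported by the exporter (US$ m)
    ("NLD", "USA", "wood pulp"): 120.0,
    ("ITA", "GBR", "clocks"): 45.0,
}
reported_imports = {
    # the same flows as reported by the importer
    ("NLD", "USA", "wood pulp"): 118.5,
    ("ITA", "GBR", "clocks"): 19.0,
}

def flag_anomalies(exports, imports, tolerance=0.25):
    """Return flows whose two reports differ by more than `tolerance`."""
    flagged = []
    for flow, export_value in exports.items():
        import_value = imports.get(flow)
        if import_value is None:
            continue
        gap = abs(export_value - import_value) / max(export_value, import_value)
        if gap > tolerance:
            flagged.append((flow, export_value, import_value, round(gap, 2)))
    return flagged

print(flag_anomalies(reported_exports, reported_imports))
# -> [(('ITA', 'GBR', 'clocks'), 45.0, 19.0, 0.58)]
```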

For the pilot scheme, the researchers used categories already in the public domain—in this case international trade in things such as wood pulp and clocks. They thus hoped to show that the system would work, before applying it to information where confidentiality matters.

They put several kinds of PETs through their paces. In one trial, OpenMined, a charity based in Oxford, tested a technique called secure multiparty computation (SMPC). In this approach, the data to be analysed are encrypted by their keeper and never leave the keeper’s premises. The organisation running the analysis (in this case OpenMined) sends its algorithm to the keeper, who runs it on the encrypted data. That is mathematically complex, but possible. The findings are then sent back to the original inquirer…(More)”.
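The article does not spell out OpenMined's protocol, but the core trick behind many SMPC schemes is additive secret sharing, sketched below with invented numbers and a hypothetical make_shares helper: each keeper splits its private figure into random shares, parties exchange only shares and partial sums, and only the aggregate is revealed at the end.

```python
# Minimal additive secret-sharing sketch (illustrative only; real SMPC
# deployments such as OpenMined's involve far more machinery).
import random

PRIME = 2**61 - 1  # all arithmetic is done modulo a large prime

def make_shares(secret, n_parties):
    """Split `secret` into n_parties additive shares modulo PRIME."""
    shares = [random.randrange(PRIME) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares

# Three statistics offices each hold one private figure.
private_values = [120, 45, 310]
all_shares = [make_shares(v, 3) for v in private_values]

# Party i receives the i-th share of every value and publishes only the
# sum of the shares it holds, never the underlying figures.
partial_sums = [sum(shares[i] for shares in all_shares) % PRIME for i in range(3)]

# Combining the partial sums reveals the aggregate, and nothing else.
print(sum(partial_sums) % PRIME)  # -> 475
```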

The West already monopolized scientific publishing. Covid made it worse.


Samanth Subramanian at Quartz: “For nearly a decade, Jorge Contreras has been railing against the broken system of scientific publishing. Academic journals are dominated by Western scientists, who not only fill their pages but also work for institutions that can afford the hefty subscription fees to these journals. “These issues have been brewing for decades,” said Contreras, a professor at the University of Utah’s College of Law who specializes in intellectual property in the sciences. “The covid crisis has certainly exacerbated things, though.”

The coronavirus pandemic triggered a torrent of academic papers. By August 2021, at least 210,000 new papers on covid-19 had been published, according to a Royal Society study. Of the 720,000-odd authors of these papers, nearly 270,000 were from the US, the UK, Italy or Spain.

These papers carry research forward, of course—but they also advance their authors’ careers, and earn them grants and patents. But many of these papers are often based on data gathered in the global south, by scientists who perhaps don’t have the resources to expand on their research and publish. Such scientists aren’t always credited in the papers their data give rise to; to make things worse, the papers appear in journals that are out of the financial reach of these scientists and their institutes.

These imbalances have, as Contreras said, been a part of the publishing landscape for years. (And it doesn’t occur just in the sciences; economists from the US or the UK, for instance, tend to study countries where English is the most common language.) But the pace and pressures of covid-19 have rendered these iniquities especially stark.

Scientists have paid to publish their covid-19 research—sometimes as much as $5,200 per article. Subscriber-only journals maintain their high fees, running into thousands of dollars a year; in 2020, the Dutch publishing house Elsevier, which puts out journals such as Cell and Gene, reported a profit of nearly $1 billion, at a margin higher than that of Apple or Amazon. And Western scientists are pressing to keep data out of GISAID, a genome database that compels users to acknowledge or collaborate with anyone who deposits the data…(More)”

Building machines that work for everyone – how diversity of test subjects is a technology blind spot, and what to do about it


Article by Tahira Reid and James Gibert: “People interact with machines in countless ways every day. In some cases, they actively control a device, like driving a car or using an app on a smartphone. Sometimes people passively interact with a device, like being imaged by an MRI machine. And sometimes they interact with machines without consent or even knowing about the interaction, like being scanned by a law enforcement facial recognition system.

Human-Machine Interaction (HMI) is an umbrella term that describes the ways people interact with machines. HMI is a key aspect of researching, designing and building new technologies, and also studying how people use and are affected by technologies.

Researchers, especially those traditionally trained in engineering, are increasingly taking a human-centered approach when developing systems and devices. This means striving to make technology that works as expected for the people who will use it by taking into account what’s known about the people and by testing the technology with them. But even as engineering researchers increasingly prioritize these considerations, some in the field have a blind spot: diversity.

As an interdisciplinary researcher who thinks holistically about engineering and design and an expert in dynamics and smart materials with interests in policy, we have examined the lack of inclusion in technology design, the negative consequences and possible solutions….

It is possible to use a homogeneous sample of people in publishing a research paper that adds to a field’s body of knowledge. And some researchers who conduct studies this way acknowledge the limitations of homogeneous study populations. However, when it comes to developing systems that rely on algorithms, such oversights can cause real-world problems. Algorithms are only as good as the data used to build them.

Algorithms are often based on mathematical models that capture patterns and then inform a computer about those patterns to perform a given task. Imagine an algorithm designed to detect when colors appear on a clear surface. If the set of images used to train that algorithm consists of mostly shades of red, the algorithm might not detect when a shade of blue or yellow is present…(More)”.
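To see why a skewed training set matters, consider a toy version of that example, a hypothetical sketch in which a "colour present" detector is built from training images that are almost all red; the detector's learned notion of colour then fails on blue, exactly the kind of blind spot the authors describe.

```python
# Toy illustration of training-data bias (invented example, not from the
# article): a colour detector fitted to mostly-red samples misses blue.
import numpy as np

rng = np.random.default_rng(0)

# Training set of RGB pixels labelled "colour present": 95% reds, 5% yellows,
# and no blues at all.
reds = rng.normal([0.9, 0.1, 0.1], 0.05, size=(950, 3))
yellows = rng.normal([0.9, 0.9, 0.1], 0.05, size=(50, 3))
train = np.clip(np.vstack([reds, yellows]), 0, 1)

# "Model": flag a pixel as coloured if it lies close to what the training
# data looked like.
centre = train.mean(axis=0)
threshold = np.percentile(np.linalg.norm(train - centre, axis=1), 99)

def detects_colour(rgb):
    return np.linalg.norm(np.asarray(rgb) - centre) <= threshold

print(detects_colour([0.9, 0.1, 0.1]))  # red  -> True
print(detects_colour([0.1, 0.1, 0.9]))  # blue -> False: never seen in training
```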