Digital Anthropology Meets Data Science


Article by Katie Hillier: “Analyzing online ecosystems in real time, teams of anthropologists and data scientists can begin to understand rapid social changes as they happen.

Ask not what data science can do for anthropology, but what anthropology can do for data science. —Anders Kristian Munk, Why the World Needs Anthropologists Symposium 2022

In the last decade, emerging technologies, such as AI, immersive realities, and new and more addictive social networks, have permeated almost every aspect of our lives. These innovations are influencing how we form identities and belief systems. Social media influences the rise of subcultures on TikTok, the communications of extremist communities on Telegram, and the rapid spread of conspiracy theories that bounce around various online echo chambers. 

People with shared values or experiences can connect and form online cultures at unprecedented scales and speeds. But these new cultures are evolving and shifting faster than our current ability to understand them. 

To keep up with the depth and speed of online transformations, digital anthropologists are teaming up with data scientists to develop interdisciplinary methods and tools to bring the deep cultural context of anthropology to scales available only through data science—producing a surge in innovative methodologies for more effectively decoding online cultures in real time…(More)”.

Five Enablers for a New Phase of Behavioral Science


Article by Michael Hallsworth: “Over recent weeks I’ve been sharing parts of a “manifesto” that tries to give a coherent vision for the future of applied behavioral science. Stepping back, if I had to identify a theme that comes through the various proposals, it would be the need for self-reflective practice.

Behavioral science has seen a tremendous amount of growth and interest over the last decade, largely focused on expanding its uses and methods. My sense is it’s ready for a new phase of maturity. That maturity involves behavioral scientists reflecting on the various ways that their actions are shaped by structural, institutional, environmental, economic, and historical factors.

I’m definitely not exempt from this need for self-reflection. There are times when I’ve focused on a cognitive bias when I should have been spending more time exploring the context and motivations for a decision instead. Sometimes I’ve homed in on a narrow slice of a problem that we can measure, even if that means dispensing with wider systemic effects and challenges. Once I spent a long time trying to apply the language of heuristics and biases to explain why people were failing to use the urgent care alternatives to hospital emergency departments, before realizing that their behavior was completely reasonable.     

The manifesto critiques things like this, but it doesn’t have all the answers. Because it tries to both cover a lot of ground and go into detail, many of the hard knots of implementation go unpicked. The truth is that writing reports and setting goals is the easy part. Turning those goals into practice is much tougher; as behavioral scientists know, there is often a gap between intention and action.

Right now, I and others don’t always realize the ambitions set out in the manifesto. Changing that is going to take time and effort, and it will involve the discomfort of disrupting familiar practices. Some have made public commitments in this direction; my organization is working on upgrading its practices in line with proposals around making predictions prior to implementation, strengthening RCTs to cope with complexity, and enabling people to use behavioral science, among others.

But changes by individual actors will not be enough. The big issue is that several of the proposals require coordination. For example, one of the key ideas is the need for more multisite studies that are well coordinated and have clear goals. Another prioritizes developing international professional networks to support projects in low- and middle-income countries…(More)”.

Let’s Randomize America! 


Article by Dalton Conley: “…As our society has become less random, it has become more unequal. Many people know that inequality has been rising steadily over time, but a less-remarked-on development is that there’s been a parallel geographic shift, with high- and low-income people moving into separate, ever more distinct communities…As a sociologist, I study inequality and what can be done about it. It is, to say the least, a difficult problem to solve…I’ve come to believe that lotteries could help to crack this nut and make our society fairer and more equal. We can’t randomly assign where people live, of course. And we can’t integrate neighborhoods by fiat, either. We learned that lesson in the nineteen-seventies, when counties tried busing schoolchildren across town. Those programs aimed to create more racially and economically integrated schools; they resulted in the withdrawal of affluent students from urban public-school systems, and set off a political backlash that can still be felt today…

As a political tool, lotteries have come and gone throughout history. Sortition—the selection of political officials by lot—was first practiced in Athens in the sixth century B.C.E., and later reappeared in Renaissance city-states such as Florence, Venice, and Lombardy, and in Switzerland and elsewhere. In recent years, citizens’ councils—randomly chosen groups of individuals who meet to hammer out a particular issue, such as climate policy—have been tried in Canada, France, Iceland, Ireland, and the U.K. Some political theorists, such as Hélène Landemore, Jane Mansbridge, and the Belgian writer David Van Reybrouck, have argued that randomly selected decision-makers who don’t have to campaign are less likely to be corrupt or self-interested than those who must run for office; people chosen at random are also unlikely to be typically privileged, power-hungry politicians. The wisdom of the crowd improves when the crowd is more diverse…(More)”.
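To make the mechanics concrete, here is a minimal sketch, in Python, of how a stratified lottery could draw a panel that mirrors the population it comes from. The candidate pool, the "region" strata, and the quota rule are illustrative assumptions, not details from Conley's essay:

```python
import random

def draw_assembly(pool, size, strata_key, seed=None):
    """Draw a panel by lot, allocating seats across strata in
    proportion to each stratum's share of the candidate pool."""
    rng = random.Random(seed)
    strata = {}
    for person in pool:
        strata.setdefault(person[strata_key], []).append(person)
    chosen = []
    for group in strata.values():
        # Proportional allocation; rounding means the final panel
        # can be off by a seat or two from `size`.
        seats = max(1, round(size * len(group) / len(pool)))
        chosen.extend(rng.sample(group, min(seats, len(group))))
    return chosen

# Hypothetical pool: 120 residents split across three regions.
pool = [{"name": f"resident-{i}", "region": region}
        for i, region in enumerate(["urban", "rural", "suburban"] * 40)]
panel = draw_assembly(pool, size=12, strata_key="region", seed=7)
```

Real sortition bodies layer more onto this (consent to serve, quotas on several demographics at once), but the core idea is the same: random selection, lightly constrained so the panel reflects the diversity of the whole.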

Data portability and interoperability: A primer on two policy tools for regulation of digitized industries


Article by Sukhi Gulati-Gilbert and Robert Seamans: “…In this article we describe two other tools, data portability and interoperability, that may be particularly useful in technology-enabled sectors. Data portability allows users to move data from one company to another, helping to reduce switching costs and providing rival firms with access to valuable customer data. Interoperability allows two or more technical systems to exchange data interactively. Due to its interactive nature, interoperability can help prevent lock-in to a specific platform by allowing users to connect across platforms. Data portability and interoperability share some similarities; in addition to potential pro-competitive benefits, the tools promote values of openness, transparency, and consumer choice.

After providing an overview of these topics, we describe the tradeoffs involved with implementing data portability and interoperability. While these policy tools offer much promise, in practice there are many challenges in determining how to fund and design an implementation that is secure and intuitive and accomplishes the intended result. These challenges require that policymakers think carefully about the initial implementation of data portability and interoperability. Finally, to better show how data portability and interoperability can increase competition in an industry, we discuss how they could be applied in the banking and social media sectors. These are just two examples of how data portability and interoperability policy could be applied to many different industries facing increased digitization. Our definitions and examples should be helpful to those interested in understanding the tradeoffs involved in using these tools to promote competition and innovation in the U.S. economy…(More)” See also: Data to Go: The Value of Data Portability as a Means to Data Liquidity.
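The distinction is easiest to see in code. Below is a minimal sketch in Python; the toy platform model and method names are invented for illustration and come from neither the article nor any real system. Portability is a one-shot export and import of a user's data in a common format, while interoperability is an ongoing exchange between two live systems:

```python
import json
from dataclasses import dataclass, field

@dataclass
class Platform:
    """Toy platform that stores each user's posts in memory."""
    name: str
    posts: dict = field(default_factory=dict)  # user_id -> list of posts

    # Data portability: one-shot export/import in a common format.
    def export_user_data(self, user_id):
        return json.dumps({"user": user_id,
                           "posts": self.posts.get(user_id, [])})

    def import_user_data(self, payload):
        data = json.loads(payload)
        self.posts.setdefault(data["user"], []).extend(data["posts"])

    # Interoperability: interactive exchange between live systems.
    def send_post(self, other, user_id, text):
        other.posts.setdefault(user_id, []).append(
            f"[via {self.name}] {text}")

incumbent = Platform("incumbent", {"alice": ["hello world"]})
rival = Platform("rival")

# Portability lowers switching costs: alice moves without losing history.
rival.import_user_data(incumbent.export_user_data("alice"))

# Interoperability prevents lock-in: posts keep flowing across platforms.
incumbent.send_post(rival, "alice", "still reachable after switching")
```

Note the asymmetry: the export is a single transaction, while `send_post` requires both systems to stay online and agree on a shared protocol, one reason interoperability raises the harder design and funding questions.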

German lawmakers mull creating first citizen assembly


APNews: “German lawmakers considered Wednesday whether to create the country’s first ‘citizen assembly’ to advise parliament on the issue of food and nutrition.

Germany’s three governing parties back the idea of appointing consultative bodies made up of members of the public selected through a lottery system who would discuss specific topics and provide nonbinding feedback to legislators. But opposition parties have rejected the idea, warning that such citizen assemblies risk undermining the primacy of parliament in Germany’s political system.

Baerbel Bas, the speaker of the lower house, or Bundestag, said that she views such bodies as a “bridge between citizens and politicians that can provide a fresh perspective and create new confidence in established institutions.”

“Everyone should be able to have a say,” Bas told daily Passauer Neue Presse. “We want to better reflect the diversity in our society.”

Environmental activists from the group Last Generation have campaigned for the creation of a citizen assembly to address issues surrounding climate change. However, the group argues that proposals drawn up by such a body should at the very least result in bills that lawmakers would then vote on.

Similar efforts to create citizen assemblies have taken place in other European countries such as Spain, Finland, Austria, Britain and Ireland…(More)”.

Misunderstanding Misinformation


Article by Claire Wardle: “In the fall of 2017, Collins Dictionary named fake news word of the year. It was hard to argue with the decision. Journalists were using the phrase to raise awareness of false and misleading information online. Academics had started publishing copiously on the subject and even named conferences after it. And of course, US president Donald Trump regularly used the epithet from the podium to discredit nearly anything he disliked.

By spring of that year, I had already become exasperated by how this term was being used to attack the news media. Worse, it had never captured the problem: most content wasn’t actually fake, but genuine content used out of context—and only rarely did it look like news. I made a rallying cry to stop using fake news and instead use misinformation, disinformation, and malinformation under the umbrella term information disorder. These terms, especially the first two, have caught on, but they represent an overly simple, tidy framework I no longer find useful.

Both disinformation and misinformation describe false or misleading claims, but disinformation is distributed with the intent to cause harm, whereas misinformation is the mistaken sharing of the same content. Analyses of both generally focus on whether a post is accurate and whether it is intended to mislead. The result? We researchers become so obsessed with labeling the dots that we can’t see the larger pattern they show.

By focusing narrowly on problematic content, researchers are failing to understand the increasingly sizable number of people who create and share this content, and also overlooking the larger context of what information people actually need. Academics are not going to effectively strengthen the information ecosystem until we shift our perspective from classifying every post to understanding the social contexts of this information, how it fits into narratives and identities, and its short-term impacts and long-term harms…(More)”.

AI Is Tearing Wikipedia Apart


Article by Claire Woodcock: “As generative artificial intelligence continues to permeate all aspects of culture, the people who steward Wikipedia are divided on how best to proceed. 

During a recent community call, it became apparent that there is a community split over whether or not to use large language models to generate content. While some people expressed that tools like OpenAI’s ChatGPT could help with generating and summarizing articles, others remained wary.

The concern is that machine-generated content has to be balanced with a lot of human review and would overwhelm lesser-known wikis with bad content. While AI generators are useful for writing believable, human-like text, they are also prone to including erroneous information, and even citing sources and academic papers which don’t exist. This often results in text summaries which seem accurate, but on closer inspection are revealed to be completely fabricated.

“The risk for Wikipedia is people could be lowering the quality by throwing in stuff that they haven’t checked,” Bruckman added. “I don’t think there’s anything wrong with using it as a first draft, but every point has to be verified.” 

The Wikimedia Foundation, the nonprofit organization behind the website, is looking into building tools to make it easier for volunteers to identify bot-generated content. Meanwhile, Wikipedia is working to draft a policy that lays out the limits to how volunteers can use large language models to create content.

The current draft policy notes that anyone unfamiliar with the risks of large language models should avoid using them to create Wikipedia content, because it can open the Wikimedia Foundation up to libel suits and copyright violations—both of which the nonprofit gets protections from but the Wikipedia volunteers do not. These large language models also contain implicit biases, which often result in content skewed against marginalized and underrepresented groups of people.

The community is also divided on whether large language models should be allowed to train on Wikipedia content. While open access is a cornerstone of Wikipedia’s design principles, some worry the unrestricted scraping of internet data allows AI companies like OpenAI to exploit the open web to create closed commercial datasets for their models. This is especially a problem if the Wikipedia content itself is AI-generated, creating a feedback loop of potentially biased information, if left unchecked…(More)”.

Will A.I. Become the New McKinsey?


Essay by Ted Chiang: “When we talk about artificial intelligence, we rely on metaphor, as we always do when dealing with something new and unfamiliar. Metaphors are, by their nature, imperfect, but we still need to choose them carefully, because bad ones can lead us astray. For example, it’s become very common to compare powerful A.I.s to genies in fairy tales. The metaphor is meant to highlight the difficulty of making powerful entities obey your commands; the computer scientist Stuart Russell has cited the parable of King Midas, who demanded that everything he touched turn into gold, to illustrate the dangers of an A.I. doing what you tell it to do instead of what you want it to do. There are multiple problems with this metaphor, but one of them is that it derives the wrong lessons from the tale to which it refers. The point of the Midas parable is that greed will destroy you, and that the pursuit of wealth will cost you everything that is truly important. If your reading of the parable is that, when you are granted a wish by the gods, you should phrase your wish very, very carefully, then you have missed the point.

So, I would like to propose another metaphor for the risks of artificial intelligence. I suggest that we think about A.I. as a management-consulting firm, along the lines of McKinsey & Company. Firms like McKinsey are hired for a wide variety of reasons, and A.I. systems are used for many reasons, too. But the similarities between McKinsey—a consulting firm that works with ninety per cent of the Fortune 100—and A.I. are also clear. Social-media companies use machine learning to keep users glued to their feeds. In a similar way, Purdue Pharma used McKinsey to figure out how to “turbocharge” sales of OxyContin during the opioid epidemic. Just as A.I. promises to offer managers a cheap replacement for human workers, so McKinsey and similar firms helped normalize the practice of mass layoffs as a way of increasing stock prices and executive compensation, contributing to the destruction of the middle class in America…(More)”.

Spamming democracy


Article by Natalie Alms: “The White House’s Office of Information and Regulatory Affairs is considering AI’s effect on the regulatory process, including the potential for generative chatbots to fuel mass campaigns or inject spam comments into the federal agency rulemaking process.

A recent executive order directed the office to consider using guidance or tools to address mass comments, computer-generated comments and falsely attributed comments, something an administration official told FCW that OIRA is “moving forward” on.

Mark Febrizio, a senior policy analyst at George Washington University’s Regulatory Studies Center, has experimented with OpenAI’s generative AI system ChatGPT to create what he called a “convincing” public comment submission to a Labor Department proposal.

“Generative AI also takes the possibility of mass and malattributed comments to the next level,” wrote Febrizio and co-author Bridget Dooling, a research professor at the center, in a paper published in April by the Brookings Institution.

The executive order comes years after astroturfing during the rollback of net neutrality policies by the Federal Communications Commission in 2017 garnered public attention. That rulemaking docket received a record-breaking 22 million-plus comments, but over 8.5 million came from a campaign against net neutrality led by broadband companies, according to an investigation by the New York Attorney General released in 2021. 

The investigation found that lead generators paid by these companies submitted many comments with real names and addresses attached without the knowledge or consent of those individuals.  In the same docket were over 7 million comments supporting net neutrality submitted by a computer science student, who used software to submit comments attached to computer-generated names and addresses.

While the numbers are staggering, experts told FCW that agencies aren’t just counting comments when reading through submissions from the public…(More)”
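One way to look past raw counts is to cluster near-duplicate submissions, since mass campaigns and bot runs tend to reuse the same text with trivial edits. The sketch below, in Python, is a purely illustrative heuristic; the threshold and normalization are assumptions, not anything OIRA or the agencies have described:

```python
import re
from difflib import SequenceMatcher

def normalize(text):
    """Lowercase and collapse whitespace so trivial edits don't hide duplicates."""
    return re.sub(r"\s+", " ", text.lower()).strip()

def cluster_near_duplicates(comments, threshold=0.9):
    """Greedily group comments whose normalized text is nearly identical.
    Pairwise matching is O(n^2): fine for a sketch, far too slow for a
    22-million-comment docket, where one would hash or shingle instead."""
    clusters = []  # each cluster is a list of indices into `comments`
    for i, comment in enumerate(comments):
        norm = normalize(comment)
        for cluster in clusters:
            representative = normalize(comments[cluster[0]])
            if SequenceMatcher(None, norm, representative).ratio() >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters

comments = [
    "I oppose this rule. It hurts small businesses.",
    "I OPPOSE this rule.  It hurts small businesses.",
    "Please keep these protections; they matter to my family.",
]
print(cluster_near_duplicates(comments))  # [[0, 1], [2]]
```

A large cluster is not proof of fraud, since organized but legitimate campaigns also produce identical form letters, which is precisely why, as the experts note, agencies weigh the substance of comments rather than their number.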

Unlocking the Power of Data Refineries for Social Impact


Essay by Jason Saul & Kriss Deiglmeier: “In 2021, US companies generated $2.77 trillion in profits—the largest ever recorded in history. This is a significant increase since 2000, when corporate profits totaled $786 billion. Social progress, on the other hand, shows a very different picture. From 2000 to 2021, progress on the United Nations Sustainable Development Goals has been anemic, registering less than 10 percent growth over 20 years.

What explains this massive split between the corporate and the social sectors? One explanation could be the role of data. In other words, companies are benefiting from a culture of using data to make decisions. Some refer to this as the “data divide”—the increasing gap between the use of data to maximize profit and the use of data to solve social problems…

Our theory is that there is something more systemic going on. Even if nonprofit practitioners and policy makers had the budget, capacity, and cultural appetite to use data, does the data they need even exist in the form they need it? We submit that the answer to this question is a resounding no. Usable data doesn’t yet exist for the sector because the sector lacks a fully functioning data ecosystem to create, analyze, and use data at the same level of effectiveness as the commercial sector…(More)”.