Language Machinery


Essay by Richard Hughes Gibson: “… current debates about writing machines are not as fresh as they seem. As is quietly acknowledged in the footnotes of scientific papers, much of the intellectual infrastructure of today’s advances was laid decades ago. In the 1940s, the mathematician Claude Shannon demonstrated that language use could be both described by statistics and imitated with statistics, whether those statistics were in human heads or a machine’s memory. Shannon, in other words, was the first statistical language modeler, which makes ChatGPT and its ilk his distant brainchildren. Shannon never tried to build such a machine, but some astute early readers of his work recognized that computers were primed to translate his paper-and-ink experiments into a powerful new medium. In writings now discussed largely in niche scholarly and computing circles, these readers imagined—and even made preliminary sketches of—machines that would translate Shannon’s proposals into reality. These readers likewise raised questions about the meaning of such machines’ outputs and wondered what the machines revealed about our capacity to write.

The current barrage of commentary has largely neglected this backstory, and our discussions suffer for forgetting that issues that appear novel to us belong to the mid-twentieth century. Shannon and his first readers were the original residents of the headspace in which so many of us now find ourselves. Their ambitions and insights have left traces on our discourse, just as their silences and uncertainties haunt our exchanges. If writing machines constitute a “philosophical event” or a “prompt for philosophizing,” then I submit that we are already living in the event’s aftermath, which is to say, in Shannon’s aftermath. Amid the rampant speculation about a future dominated by writing machines, I propose that we turn in the other direction to listen to field reports from some of the first people to consider what it meant to read and write in Shannon’s world…(More)”.
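As a rough illustration of what "describing and imitating language with statistics" means in practice, here is a minimal word-bigram sketch in Python. It is an illustration only, not Shannon's original procedure (he tallied letter- and word-level approximations by hand from printed text), and the toy corpus and function names are invented for the example.

    import random
    from collections import Counter, defaultdict

    # Toy corpus; Shannon worked from printed English text tallied by hand.
    corpus = (
        "the cat sat on the mat and the dog sat on the rug "
        "while the cat watched the dog and the dog watched the cat"
    ).split()

    # Count how often each word follows each other word (a bigram model).
    bigrams = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        bigrams[prev][nxt] += 1

    def generate(seed: str, length: int = 12) -> str:
        """Imitate the corpus by repeatedly sampling a statistically likely next word."""
        words = [seed]
        for _ in range(length):
            followers = bigrams.get(words[-1])
            if not followers:
                break
            choices, weights = zip(*followers.items())
            words.append(random.choices(choices, weights=weights)[0])
        return " ".join(words)

    print(generate("the"))

Scaled up from a toy corpus to the web, and from bigrams to billions of parameters, this is the lineage the essay traces from Shannon to ChatGPT.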

Copyright Policy Options for Generative Artificial Intelligence


Paper by Joshua S. Gans: “New generative artificial intelligence (AI) models, including large language models and image generators, have created new challenges for copyright policy, as such models may be trained on data that includes copyright-protected content. This paper examines this issue from an economics perspective and analyses how different copyright regimes for generative AI will impact the quality of content generated as well as the quality of AI training. A key factor is whether generative AI models are small (with content providers capable of negotiating with AI providers) or large (where negotiations are prohibitive). For small AI models, it is found that giving original content providers copyright protection leads to superior social welfare outcomes compared to having no copyright protection. For large AI models, this comparison is ambiguous and depends on the level of potential harm to original content providers and the importance of content for AI training quality. However, it is demonstrated that an ex-post “fair use”-type mechanism can lead to higher expected social welfare than traditional copyright regimes…(More)”.

How Health Data Integrity Can Earn Trust and Advance Health


Article by Jochen Lennerz, Nick Schneider and Karl Lauterbach: “Efforts to share health data across borders snag on legal and regulatory barriers. Before detangling the fine print, let’s agree on overarching principles.

Imagine a scenario in which Mary, an individual with a rare disease, has agreed to share her medical records for a research project aimed at finding better treatments for genetic disorders. Mary’s consent is grounded in trust that her data will be handled with the utmost care, protected from unauthorized access, and used according to her wishes. 

It may sound simple, but meeting these standards comes with myriad complications. Whose job is it to weigh the risk that Mary might be reidentified, even if her information is de-identified and stored securely? How should that assessment be done? How can data from Mary’s records be aggregated with data from patients in health systems in other countries, each with its own requirements for data protection and formats for record keeping? How can Mary’s wishes be respected, both in terms of what research is conducted and in returning relevant results to her?

From electronic medical records to genomic sequencing, health care providers and researchers now have an unprecedented wealth of information that could help tailor treatments to individual needs, revolutionize understanding of disease, and enhance the overall quality of health care. Data protection, privacy safeguards, and cybersecurity are all paramount for safeguarding sensitive medical information, but much of the potential that lies in this abundance of data is being lost because well-intentioned regulations have not been set up to allow for data sharing and collaboration. This stymies efforts to study rare diseases, map disease patterns, improve public health surveillance, and advance evidence-based policymaking (for instance, by comparing effectiveness of interventions across regions and demographics). Projects that could excel with enough data get bogged down in bureaucracy and uncertainty. For example, Germany now has strict data protection laws—with heavy punishment for violations—that should allow de-identified health insurance claims to be used for research within secure processing environments, but the legality of such use has been challenged…(More)”.

Data and density: Two tools to boost health equity in cities


Article by Ann Aerts and Diana Rodríguez Franco: “Improving health and health equity for vulnerable populations requires addressing the social determinants of health. In the US, it is estimated that medical care accounts for only 10-20% of health outcomes, while social determinants like education and income account for the remaining 80-90%.

Place-based interventions, however, are showing promise for improving health outcomes despite persistent inequalities. Research and practice increasingly point to the role of cities in promoting health equity — or reversing health inequities — as 56% of the global population lives in cities, and several social determinants of health are directly tied to urban factors like opportunity, environmental health, neighbourhoods and physical environments, access to food and more.

Thus, it is critical to identify the true drivers of both good and poor health outcomes so that underserved populations can be better served.

Place-based strategies can address health inequities and lead to meaningful improvements for vulnerable populations…

Initial data analysis revealed a strong correlation between cardiovascular disease risk in city residents and social determinants such as higher education, commuting time, access to Medicaid, rental costs and internet access.

Understanding which data points are correlated with health risks is key to effectively tailoring interventions.

Determined to reverse this trend, city authorities have launched a “HealthyNYC” campaign and are working with the Novartis Foundation to uncover the behavioural and social determinants behind non-communicable diseases (NCDs) such as diabetes and cardiovascular disease, which cause 87% of all deaths in New York City…(More)”.
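The kind of analysis described above, relating a health outcome to candidate social determinants, can be prototyped with a simple correlation matrix. The sketch below uses pandas with hypothetical neighbourhood-level columns; the variable names and values are illustrative and are not drawn from the NYC data.

    import pandas as pd

    # Hypothetical neighbourhood-level data; columns and values are illustrative only.
    df = pd.DataFrame({
        "cvd_risk":        [0.12, 0.18, 0.09, 0.22, 0.15],   # cardiovascular disease risk
        "pct_college":     [0.45, 0.21, 0.62, 0.18, 0.33],   # higher-education attainment
        "avg_commute_min": [34, 48, 29, 52, 41],
        "pct_medicaid":    [0.22, 0.41, 0.15, 0.47, 0.30],
        "median_rent":     [1850, 1400, 2300, 1250, 1600],
        "pct_broadband":   [0.78, 0.60, 0.88, 0.55, 0.70],
    })

    # Pearson correlation of each social determinant with the outcome, ranked by
    # strength; correlation flags candidates, it does not establish causation.
    corr = df.corr(numeric_only=True)["cvd_risk"].drop("cvd_risk")
    print(corr.sort_values(key=abs, ascending=False))

In practice, such screening would be followed by causal or quasi-experimental analysis before interventions are tailored.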

Does information about citizen participation initiatives increase political trust?


Paper by Martin Ardanaz, Susana Otálvaro-Ramírez, and Carlos Scartascini: “Participatory programs can reduce the informational and power asymmetries that engender mistrust. These programs, however, cannot include every citizen. Hence, it is important to evaluate whether providing information about those programs could affect trust among those who do not participate. We assess the effect of an informational campaign about these programs in the context of a survey experiment conducted in the city of Buenos Aires, Argentina. Results show that providing detailed information about citizen involvement and the outputs of a participatory budget initiative marginally shapes voters’ assessments of government performance and political trust. In particular, it increases voters’ perceptions of the benevolence and honesty of the government. Effects are larger for individuals with ex ante more negative views about the local government’s quality, and they differ according to respondents’ interpersonal trust and their beliefs about the ability of their communities to solve the type of collective-action problems that the program seeks to address. This article complements the literature that has examined the effects of participatory interventions on trust, and the literature that evaluates the role of information. The results suggest that participatory budget programs can directly affect budget allocations and trust for those who participate, and that well-disseminated programs could also affect trust in the broader population. Because mistrustful individuals tend to shy away from demanding that the government provide public goods that increase overall welfare, well-disseminated participatory budget programs could affect budget allocations both directly and through their effect on trust…(More)”.
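For readers curious how such heterogeneous effects are typically estimated, one common approach is an OLS regression of trust on treatment assignment interacted with a prior-attitudes indicator. The sketch below uses simulated data and invented variable names; it is not the authors' specification.

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(0)
    n = 1_000

    # Simulated survey-experiment data; variable names are invented for illustration.
    df = pd.DataFrame({
        "treated": rng.integers(0, 2, n),            # randomly shown program information
        "negative_prior": rng.integers(0, 2, n),     # ex ante negative view of government
    })
    df["trust"] = (
        0.40
        + 0.05 * df["treated"]
        + 0.10 * df["treated"] * df["negative_prior"]   # larger effect among sceptics
        - 0.15 * df["negative_prior"]
        + rng.normal(0, 0.20, n)
    )

    # The interaction coefficient tests whether information moves trust
    # more among respondents who started out distrustful.
    model = smf.ols("trust ~ treated * negative_prior", data=df).fit(cov_type="HC1")
    print(model.params)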

Computing Power and the Governance of AI


Blog by Lennart Heim, Markus Anderljung, Emma Bluemke, and Robert Trager: “Computing power – compute for short – is a key driver of AI progress. Over the past thirteen years, the amount of compute used to train leading AI systems has increased by a factor of 350 million. This has enabled the major AI advances that have recently gained global attention.

Governments have taken notice. They are increasingly engaged in compute governance: using compute as a lever to pursue AI policy goals, such as limiting misuse risks, supporting domestic industries, or engaging in geopolitical competition. 

There are at least three ways compute can be used to govern AI. Governments can: 

  • Track or monitor compute to gain visibility into AI development and use
  • Subsidize or limit access to compute to shape the allocation of resources across AI projects
  • Monitor activity, limit access, or build “guardrails” into hardware to enforce rules

Compute governance is a particularly important approach to AI governance because it is feasible. Compute is detectable: training advanced AI systems requires tens of thousands of highly advanced AI chips, which cannot be acquired or used inconspicuously. It is excludable: AI chips, being physical goods, can be given to or taken away from specific actors, and their use can be restricted to specific purposes. And it is quantifiable: chips, their features, and their usage can be measured. Compute’s detectability and excludability are further enhanced by the highly concentrated structure of the AI supply chain: very few companies are capable of producing the tools needed to design advanced chips, the machines needed to make them, or the data centers that house them.

However, just because compute can be used as a tool to govern AI doesn’t mean that it should be used in all cases. Compute governance is a double-edged sword, with both potential benefits and the risk of negative consequences: it can support widely shared goals like safety, but it can also be used to infringe on civil liberties, perpetuate existing power structures, and entrench authoritarian regimes. Indeed, some things are better ungoverned. 

In our paper we argue that compute is a particularly promising node for AI governance. We also highlight the risks of compute governance and offer suggestions for how to mitigate them. This post summarizes our findings and key takeaways, while also offering some of our own commentary…(More)”.
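As a back-of-the-envelope check on the growth figure quoted above, a 350-million-fold increase over thirteen years corresponds to roughly a 4.5x increase in training compute per year:

    # Compound annual growth implied by a 350-million-fold increase over 13 years.
    total_factor = 350e6
    years = 13
    annual_factor = total_factor ** (1 / years)
    print(f"~{annual_factor:.1f}x more training compute per year")  # prints ~4.5x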

AI is too important to be monopolised


Article by Marietje Schaake: “…From the promise of medical breakthroughs to the perils of election interference, the hopes of helpful climate research to the challenge of cracking fundamental physics, AI is too important to be monopolised.

Yet the market is moving in exactly that direction, as resources and talent to develop the most advanced AI sit firmly in the hands of a very small number of companies. That is particularly true for resource-intensive data and computing power (termed “compute”), which are required to train large language models for a variety of AI applications. Researchers and small and medium-sized enterprises risk fatal dependency on Big Tech once again, or else they will miss out on the latest wave of innovation. 

On both sides of the Atlantic, feverish public investments are being made in an attempt to level the computational playing field. To ensure scientists have access to capacities comparable to those of Silicon Valley giants, the US government established the National AI Research Resource last month. This pilot project is being led by the US National Science Foundation. Working with 10 other federal agencies and 25 civil society groups, it will facilitate access to government-funded data and compute to help the research and education community build and understand AI. 

The EU set up a decentralised network of supercomputers with a similar aim back in 2018, before the recent wave of generative AI created a new sense of urgency. The EuroHPC has lived in relative obscurity, and the initiative appears to have been under-exploited. As European Commission president Ursula von der Leyen said late last year, we need to put this power to use. The EU now imagines that democratised supercomputer access can also help with the creation of “AI factories,” where small businesses pool their resources to develop new cutting-edge models.

There has long been talk of considering access to the internet a public utility, because of how important it is for education, employment and acquiring information. Yet rules to that end were never adopted. But with the unlocking of compute as a shared good, the US and the EU are showing real willingness to make investments into public digital infrastructure.

Even if the latest measures are viewed as industrial policy in a new jacket, they are part of a long overdue step to shape the digital market and offset the outsized power of big tech companies in various corners of our societies…(More)”.

Toward a 21st Century National Data Infrastructure: Managing Privacy and Confidentiality Risks with Blended Data


Report by the National Academies of Sciences, Engineering, and Medicine: “Protecting privacy and ensuring confidentiality in data is a critical component of modernizing our national data infrastructure. The use of blended data – combining previously collected data sources – presents new considerations for responsible data stewardship. Toward a 21st Century National Data Infrastructure: Managing Privacy and Confidentiality Risks with Blended Data provides a framework for managing disclosure risks that accounts for the unique attributes of blended data and poses a series of questions to guide considered decision-making.

Technical approaches to managing disclosure risk have advanced. Recent federal legislation, regulation, and guidance have broadly described the roles and responsibilities for stewardship of blended data. The report, drawing on the panel’s review of both technical and policy approaches, addresses these emerging opportunities and the new challenges and responsibilities they present. The report underscores that trade-offs among disclosure risks, disclosure harms, and data usefulness are unavoidable and are central considerations when planning data-release strategies, particularly for blended data…(More)”.
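As one concrete example of a disclosure-risk heuristic (not drawn from the report itself), k-anonymity flags combinations of quasi-identifiers that describe too few records to be released safely. The sketch below is illustrative; the dataset and column names are hypothetical.

    import pandas as pd

    # Hypothetical blended records (e.g., survey responses linked to administrative data).
    blended = pd.DataFrame({
        "zip3":     ["021", "021", "100", "100", "100", "606"],
        "age_band": ["30-39", "30-39", "60-69", "60-69", "60-69", "40-49"],
        "sex":      ["F", "F", "M", "M", "M", "F"],
        "income":   [52_000, 61_000, 38_000, 45_000, 40_000, 75_000],
    })

    quasi_identifiers = ["zip3", "age_band", "sex"]
    k = 3  # require every quasi-identifier combination to cover at least k records

    # Groups smaller than k are re-identification risks and would be suppressed
    # or coarsened (e.g., wider age bands) before release.
    group_sizes = blended.groupby(quasi_identifiers).size()
    print(group_sizes[group_sizes < k])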

Developing skills for digital government


OECD “review of good practices across OECD governments”: “Digital technologies are having a profound impact on economies, labour markets and societies. They also have the potential to transform government, by enabling the implementation of more accessible and effective services. To support a shift towards digital government, investment is needed in developing the skills of civil servants. This paper reviews good practices across OECD countries to foster skills for digital government. It presents different approaches in public administration to organising training activities as well as opportunities for informal learning. It also provides insights into how relevant skills can be identified through competence frameworks, how they can be assessed, and how learning opportunities can be evaluated…(More)”.

Tech Strikes Back


Essay by Nadia Asparouhova: “A new tech ideology is ascendant online. “Introducing effective accelerationism,” the pseudonymous user Beff Jezos tweeted, rather grandly, in May 2022. “E/acc” — pronounced ee-ack — “is a direct product [of the] tech Twitter schizosphere,” he wrote. “We hope you join us in this new endeavour.”

The reaction from Jezos’s peers was a mix of positive, critical, and perplexed. “What the f*** is e/acc,” posted multiple users. “Accelerationism is unfortunately now just a buzzword,” sighed political scientist Samo Burja, referring to a related concept popularized around 2017. “I guess unavoidable for Twitter subcultures?” “These [people] are absolutely bonkers,” grumbled Timnit Gebru, an artificial intelligence researcher and activist who frequently criticizes the tech industry. “Their fanaticism + god complex is exhausting.”

Despite the criticism, e/acc persists, and is growing, in the tech hive mind. E/acc’s founders believe that the tech world has become captive to a monoculture. If it becomes paralyzed by a fear of the future, it will never produce meaningful benefits. Instead, e/acc encourages more ideas, more growth, more competition, more action. “Whether you’re building a family, a startup, a spaceship, a robot, or better energy policy, just build,” writes one anonymous poster. “Do something hard. Do it for everyone who comes next. That’s it. Existence will take care of the rest.”…(More)”.