The danger of building strong narratives on weak data


Article by John Burn-Murdoch: “Measuring gross domestic product is extremely complicated. Around the world, national statistics offices are struggling to get the sums right the first time around.

Some struggle more than others. When Ireland first reported its estimate for GDP growth in Q1 2015, it came in at 1.4 per cent. One year later, and with some distortions unique to its status as headquarters for many US big tech and pharma companies, this was revised upwards to an eye-watering 21.4 per cent.

On average, five years after an estimate of quarterly Irish GDP growth is first published, the latest revision of that figure is two full percentage points off the original value. The equivalent for the UK is almost 10 times smaller at 0.25 percentage points, making the ONS’s initial estimates among the most accurate in the developed world, narrowly ahead of the US at 0.26 and well ahead of the likes of Japan (0.46) and Norway (0.56).

But it’s not just the size of revisions that matters, it’s the direction. Out of 24 developed countries that consistently report quarterly GDP revisions to the OECD, the UK’s initial estimates are the most pessimistic. Britain’s quarterly growth figures typically end up 0.15 percentage points higher than first thought. The Germans go up by 0.07 on average, the French by 0.04, while the Americans, ever optimistic, typically end up revising their estimates down by 0.11 percentage points.

In other words, next time you hear a set of quarterly growth figures, it wouldn’t be unreasonable to mentally add 0.15 to the UK one and subtract 0.11 from the US.

This may all sound like nerdy detail, but it matters because people graft strong narratives on to this remarkably flimsy data. Britain was the only G7 economy yet to rebound past pre-Covid levels, until it wasn’t. Ireland is booming, apparently, except its actual individual consumption per capita — a much better measure of living standards than GDP — has fallen steadily from just above the western European average in 2007 to 10 per cent below last year.

And the phenomenon is not exclusive to economic data. Two years ago, progressives critical of the government’s handling of the pandemic took to calling the UK “Plague Island”, citing Britain’s reported Covid death rates, which were among the highest in the developed world. But with the benefit of hindsight, we know that Britain was simply better at counting its deaths than most countries…(More)”
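
As a back-of-the-envelope illustration of the mental adjustment the article suggests, the sketch below (our own, not the article's) nudges a preliminary quarterly growth estimate by the country's average historical revision; the bias values are the ones quoted above, and the function name is purely illustrative.

```python
# A minimal sketch (not from the article): adjust a preliminary quarterly GDP
# growth estimate by the country's average historical revision.
# Bias values are the percentage-point figures quoted in the article.

AVERAGE_REVISION_PP = {  # first estimate -> latest revision, percentage points
    "UK": +0.15,
    "Germany": +0.07,
    "France": +0.04,
    "US": -0.11,
}

def adjusted_growth(country: str, preliminary_pp: float) -> float:
    """Return a revision-adjusted guess for quarterly GDP growth, in percentage points."""
    return round(preliminary_pp + AVERAGE_REVISION_PP.get(country, 0.0), 2)

# A first UK estimate of 0.2% would, on past form, end up nearer 0.35%;
# a first US estimate of 0.5% nearer 0.39%.
print(adjusted_growth("UK", 0.2), adjusted_growth("US", 0.5))
```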

Making AI Less “Thirsty”: Uncovering and Addressing the Secret Water Footprint of AI Models


Paper by Pengfei Li, Jianyi Yang, Mohammad A. Islam, Shaolei Ren: “The growing carbon footprint of artificial intelligence (AI) models, especially large ones such as GPT-3 and GPT-4, has been undergoing public scrutiny. Unfortunately, however, the equally important and enormous water footprint of AI models has remained under the radar. For example, training GPT-3 in Microsoft’s state-of-the-art U.S. data centers can directly consume 700,000 liters of clean freshwater (enough for producing 370 BMW cars or 320 Tesla electric vehicles) and the water consumption would have been tripled if training were done in Microsoft’s Asian data centers, but such information has been kept as a secret. This is extremely concerning, as freshwater scarcity has become one of the most pressing challenges shared by all of us in the wake of the rapidly growing population, depleting water resources, and aging water infrastructures. To respond to the global water challenges, AI models can, and also should, take social responsibility and lead by example by addressing their own water footprint. In this paper, we provide a principled methodology to estimate fine-grained water footprint of AI models, and also discuss the unique spatial-temporal diversities of AI models’ runtime water efficiency. Finally, we highlight the necessity of holistically addressing water footprint along with carbon footprint to enable truly sustainable AI…(More)”.
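
For readers curious about the general accounting involved, here is a minimal sketch of one way such an estimate can be framed. It is not the authors' code and the numbers are placeholders, not the paper's figures: it simply combines a training run's energy use with an assumed on-site water usage effectiveness (WUE) for data-center cooling and an assumed off-site water intensity of electricity generation.

```python
# A minimal sketch (not the authors' methodology verbatim) of the accounting idea:
# operational water use = on-site cooling water + off-site water embedded in the
# electricity consumed. The WUE and electricity water-intensity defaults below
# are placeholder values for illustration only.

def operational_water_liters(energy_kwh: float,
                             onsite_wue_l_per_kwh: float = 0.5,
                             offsite_water_l_per_kwh: float = 3.0) -> float:
    """Estimate liters of freshwater consumed for a given electricity use.

    onsite_wue_l_per_kwh: water evaporated by data-center cooling per kWh.
    offsite_water_l_per_kwh: water consumed by electricity generation per kWh.
    """
    return energy_kwh * (onsite_wue_l_per_kwh + offsite_water_l_per_kwh)

# Example: a hypothetical training run consuming 1 GWh of electricity.
print(f"{operational_water_liters(1_000_000):,.0f} liters")  # 3,500,000 liters
```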

Assembling an Assembly Guide


Guide prepared by DemocracyNext: “The Assembling an Assembly Guide is a resource for any institution, organisation, city administration, or policy maker interested in running a Citizens’ Assembly. It is also a useful tool for citizens and activists wishing to learn more about what a Citizens’ Assembly is and how it works, in order to strengthen their advocacy efforts.

This 3-stage guide will accompany you through the different steps of designing, running, and acting on the results of a Citizens’ Assembly. It draws on and points to a curated selection of the best available resources. From deciding how to choose and define an issue, to setting the budget, timeline, and which people to involve, this guide aims to make it a simple and clear process…(More)”.

Valuing Data: Where Are We, and Where Do We Go Next?


Article by Tim Sargent and Laura Denniston: “The importance of data as a driver of technological advancement cannot be overstated, but how can it be measured? This paper looks at measuring the value of data in national accounts using three different categories of data-related assets: data itself, databases and data science. The focus then turns to three recent studies by statistical agencies in Canada, the Netherlands and the United States to examine how each country uses a cost-based analysis to value data-related assets. Although there are two other superior ways of valuing data (the income-based method and the market-based method, as well as a hybrid approach), the authors find that these methods will be difficult to implement. The paper concludes with recommendations that include widening data-valuation efforts to the public sector, which is a major holder of data. The social value of data also needs to be calculated by considering both the positive and negative aspects of data-related investment and use. Appropriate data governance strategies are needed to ensure that data is being used for everyone’s benefit…(More)”.
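
To make the cost-based approach mentioned above concrete, the sketch below (our own illustration with invented figures, not taken from the paper or the statistical agencies' studies) values data-related assets at the cost of producing them: labour costs scaled by the share of working time spent on data tasks, grossed up by a rough markup for non-labour inputs.

```python
# A minimal sketch of a cost-based (sum-of-costs) valuation: value data-related
# assets at the cost of producing them, approximated by labour costs scaled by
# the share of time spent on data tasks, plus a markup for non-labour inputs.
# All figures below are illustrative placeholders.

occupations = [
    # (annual wage bill, share of working time devoted to data-related tasks)
    {"name": "database administrators", "wage_bill": 40_000_000, "data_share": 0.9},
    {"name": "statisticians",           "wage_bill": 25_000_000, "data_share": 0.6},
    {"name": "survey interviewers",     "wage_bill": 10_000_000, "data_share": 0.8},
]

NON_LABOUR_MARKUP = 1.5  # rough gross-up for overheads, software, and equipment

def cost_based_value(occs, markup=NON_LABOUR_MARKUP) -> float:
    labour_cost = sum(o["wage_bill"] * o["data_share"] for o in occs)
    return labour_cost * markup

# With the toy numbers above this prints roughly 88,500,000.
print(f"Estimated investment in data-related assets: {cost_based_value(occupations):,.0f}")
```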

Mapping the landscape of data intermediaries


Report by the European Commission’s Joint Research Centre: “…provides a landscape analysis of key emerging types of data intermediaries. It reviews and synthesises current academic and policy literature, with the goal of identifying shared elements and definitions. An overall objective is to contribute to establishing a common vocabulary among EU policy makers, experts, and practitioners. Six types are presented in detail: personal information management systems (PIMS), data cooperatives, data trusts, data unions, data marketplaces, and data sharing pools. For each one, the report provides information about how it works, its main features, key examples, and business model considerations. The report is grounded in multiple perspectives from sociological, legal, and economic disciplines. The analysis is informed by the notion of inclusive data governance, contextualised in the recent EU Data Governance Act, and problematised according to the economic literature on business models.

The findings highlight the fragmentation and heterogeneity of the field. Data intermediaries range from individualistic and business-oriented types to more collective and inclusive models that support greater engagement in data governance: while certain types aim at facilitating economic transactions between data holders and users, others mainly seek to produce collective benefits or public value. The conclusions derive a series of take-aways regarding the main obstacles faced by data intermediaries and identify lines of empirical work in this field…(More)”.

AI could choke on its own exhaust as it fills the web


Article by Ina Fried and Scott Rosenberg: “The internet is beginning to fill up with more and more content generated by artificial intelligence rather than human beings, posing weird new dangers both to human society and to the AI programs themselves.

What’s happening: Experts estimate that AI-generated content could account for as much as 90% of information on the internet in a few years’ time, as ChatGPT, Dall-E and similar programs spill torrents of verbiage and images into online spaces.

  • That’s happening in a world that hasn’t yet figured out how to reliably label AI-generated output and differentiate it from human-created content.

The danger to human society is the now-familiar problem of information overload and degradation.

  • AI turbocharges the ability to create mountains of new content while it undermines the ability to check that material for reliability and recycles biases and errors in the data that was used to train it.
  • There’s also widespread fear that AI could undermine the jobs of people who create content today, from artists and performers to journalists, editors and publishers. The current strike by Hollywood actors and writers underlines this risk.

The danger to AI itself is newer and stranger. A raft of recent research papers have introduced a novel lexicon of potential AI disorders that are just coming into view as the technology is more widely deployed and used.

  • “Model collapse” is researchers’ name for what happens to generative AI models, like OpenAI’s GPT-3 and GPT-4, when they’re trained using data produced by other AIs rather than human beings.
  • Feed a model enough of this “synthetic” data, and the quality of the AI’s answers can rapidly deteriorate, as the systems lock in on the most probable word choices and discard the “tail” choices that keep their output interesting.
  • “Model Autophagy Disorder,” or MAD, is the name one set of researchers at Rice and Stanford universities gave to the result of AI consuming its own products.
  • “Habsburg AI” is what another researcher earlier this year labeled the phenomenon, likening it to inbreeding: “A system that is so heavily trained on the outputs of other generative AIs that it becomes an inbred mutant, likely with exaggerated, grotesque features.”…(More)”.
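
As a toy illustration of the "tail loss" dynamic described above (our own sketch, not the cited papers' experiments), the simulation below repeatedly estimates a category distribution from its own samples and then samples from that estimate; once a rare category draws zero samples in some generation it can never reappear, so diversity only ratchets downward.

```python
# A toy analogue of recursive training on synthetic data: each "generation"
# estimates category frequencies from the previous generation's samples and
# samples from that estimate. Rare ("tail") categories that draw zero samples
# are lost for good, so the distribution narrows over time.

import random
from collections import Counter

random.seed(1)
categories = list(range(50))
true_probs = [0.5] + [0.5 / 49] * 49   # one common category, many rare ones

probs = true_probs
for gen in range(20):
    sample = random.choices(categories, weights=probs, k=200)
    counts = Counter(sample)
    probs = [counts.get(c, 0) / 200 for c in categories]
    surviving = sum(1 for p in probs if p > 0)
    print(f"generation {gen + 1}: {surviving} of 50 categories survive")
```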

Toward Bridging the Data Divide


Blog by Randeep Sudan, Craig Hammer, and Yaroslav Eferin: “Developing countries face a data conundrum. Despite more data being available than ever in the world, low- and middle-income countries often lack adequate access to valuable data and struggle to fully use the data they have.

This seemingly paradoxical situation represents a data divide. The terms “digital divide” and “data divide” are often used interchangeably but differ. The digital divide is the gap between those with access to digital technologies and those without access. On the other hand, the data divide is the gap between those who have access to high-quality data and those who do not. The data divide can negatively skew development across countries and therefore is a serious issue that needs to be addressed…

The effects of the data divide are alarming, with low- and middle-income countries getting left behind. McKinsey estimates that 75% of the value that could be created through Generative AI (such as ChatGPT) would be in four areas of economic activity: customer operations, marketing and sales, software engineering, and research and development. They further estimate that Generative AI  could add between $2.6 trillion and $4.4 trillion in value in these four areas.

PwC estimates that approximately 70% of all economic value generated by AI will likely accrue to just two countries: the USA and China. These two countries account for nearly two-thirds of the world’s hyperscale data centers, high rates of 5G adoption, the highest number of AI researchers, and the most funding for AI startups. This situation raises serious concerns about growing global disparities in accessing the benefits of data collection and processing, and the related generation of insights and opportunities. These disparities will only increase over time without deliberate efforts to counteract this imbalance…(More)”

A Blueprint for the EU Citizens’ Assembly


Paper by Carsten Berg, Claudia Chwalisz, Kalypso Nicolaidis, and Yves Sintomer: “The European Union has recognised that citizens are not sufficiently involved or empowered in its governance—how can we solve this problem?

Today, ahead of President von der Leyen’s 2023 State of the Union address on 13 September, we’re proud to co-publish a paper with the European University Institute written by four leading experts. The paper offers a blueprint for a solution: establishing the EU Citizens’ Assembly (EUCA) to share power with the other three institutions of the European Council, Commission, and Parliament.

After all, “a new push for democracy” is one of the European Commission’s self-declared top priorities for this coming year. This needs to be more than lip service. Unless citizens are given genuine agency and voice in deciding the big issues facing us in this age of turbulence, the authors argue, we will have lost the global battle in defence of democracy. The foundation has been laid for the EUCA with the success of lottery-selected EU Citizens’ Panels during the Conference on the Future of Europe, as well as those initiated by the European Commission over the past year, but more work must be done. 

In the paper, the authors explain why such an Assembly is needed, suggest how it could be designed in an iterative fashion and how it would operate, and outline what powers it could have in the EU system.

“In a broader context of democratic crisis and green, digital, and geopolitical transitions, we need to open up our imaginations to radical political change,” the authors say. “Political and technocratic elites must start giving up some control and allow for a modicum of self-determination by citizens.”…(More)”

Developing Wearable Technologies to Advance Understanding of Precision Environmental Health


Report by the National Academies of Sciences, Engineering, and Medicine: “Wearable devices that gather data on physical activity and physiology have rapidly proliferated and become commonplace across various sectors of society. Concurrently, the development of advanced wearables and sensors capable of detecting a multitude of compounds presents new opportunities for monitoring environmental exposure risks. Wearable technologies are additionally showing promise in disease prediction, detection, and management, thereby offering potential advancements in the interdisciplinary fields of both environmental health and biomedicine.

To gain insight into this burgeoning field, on June 1 and 2, 2023, the National Academies of Sciences, Engineering, and Medicine organized a 2-day virtual workshop titled Developing Wearable Technologies to Advance Understanding of Precision Environmental Health. Experts from government, industry, and academia convened to discuss emerging applications and the latest advances in wearable technologies. The workshop aimed to explore the potential of wearables in capturing, monitoring, and predicting environmental exposures and risks to inform precision environmental health…(More)”.

The Coming Wave


Book by Mustafa Suleyman and Michael Bhaskar: “Soon you will live surrounded by AIs. They will organise your life, operate your business, and run core government services. You will live in a world of DNA printers and quantum computers, engineered pathogens and autonomous weapons, robot assistants and abundant energy.

None of us are prepared.

As co-founder of the pioneering AI company DeepMind, part of Google, Mustafa Suleyman has been at the centre of this revolution. The coming decade, he argues, will be defined by this wave of powerful, fast-proliferating new technologies.

In The Coming Wave, Suleyman shows how these forces will create immense prosperity but also threaten the nation-state, the foundation of global order. As our fragile governments sleepwalk into disaster, we face an existential dilemma: unprecedented harms on one side and the threat of overbearing surveillance on the other…(More)”.