Valuing Data: Where Are We, and Where Do We Go Next?


Article by Tim Sargent and Laura Denniston: “The importance of data as a driver of technological advancement cannot be overestimated, but how can it be measured? This paper looks at measuring the value of data in national accounts using three different categories of data-related assets: data itself, databases and data science. The focus then turns to three recent studies by statistical agencies in Canada, the Netherlands and the United States to examine how each country uses a cost-based analysis to value data-related assets. Although there are two other superior ways of valuing data (the income-based method and the market-based method, as well as a hybrid approach), the authors find that these methods will be difficult to implement. The paper concludes with recommendations that include widening data-valuation efforts to the public sector, which is a major holder of data. The social value of data also needs to be calculated by considering both the positive and negative aspects of data-related investment and use. Appropriate data governance strategies are needed to ensure that data is being used for everyone’s benefit…(More)”.

Mapping the landscape of data intermediaries


Report by the European Commission’s Joint Research Centre: “…provides a landscape analysis of key emerging types of data intermediaries. It reviews and synthesises current academic and policy literature, with the goal of identifying shared elements and definitions. An overall objective is to contribute to establishing a common vocabulary among EU policy makers, experts, and practitioners. Six types are presented in detail: personal information management systems (PIMS), data cooperatives, data trusts, data unions, data marketplaces, and data sharing pools. For each one, the report provides information about how it works, its main features, key examples, and business model considerations. The report is grounded in multiple perspectives from sociological, legal, and economic disciplines. The analysis is informed by the notion of inclusive data governance, contextualised in the recent EU Data Governance Act, and problematised according to the economic literature on business models.

The findings highlight the fragmentation and heterogeneity of the field. Data intermediaries range from individualistic and business-oriented types to more collective and inclusive models that support greater engagement in data governance. While certain types do aim at facilitating economic transactions between data holders and users, others mainly seek to produce collective benefits or public value. In the conclusions, it derives a series of take-aways regarding main obstacles faced by data intermediaries and identifies lines of empirical work in this field…(More)”.

AI could choke on its own exhaust as it fills the web


Article by Ina Fried and Scott Rosenberg: “The internet is beginning to fill up with more and more content generated by artificial intelligence rather than human beings, posing weird new dangers both to human society and to the AI programs themselves.

What’s happening: Experts estimate that AI-generated content could account for as much as 90% of information on the internet in a few years’ time, as ChatGPT, Dall-E and similar programs spill torrents of verbiage and images into online spaces.

  • That’s happening in a world that hasn’t yet figured out how to reliably label AI-generated output and differentiate it from human-created content.

The danger to human society is the now-familiar problem of information overload and degradation.

  • AI turbocharges the ability to create mountains of new content while it undermines the ability to check that material for reliability and recycles biases and errors in the data that was used to train it.
  • There’s also widespread fear that AI could undermine the jobs of people who create content today, from artists and performers to journalists, editors and publishers. The current strike by Hollywood actors and writers underlines this risk.

The danger to AI itself is newer and stranger. A raft of recent research papers have introduced a novel lexicon of potential AI disorders that are just coming into view as the technology is more widely deployed and used.

  • “Model collapse” is researchers’ name for what happens to generative AI models, like OpenAI’s GPT-3 and GPT-4, when they’re trained using data produced by other AIs rather than human beings.
  • Feed a model enough of this “synthetic” data, and the quality of the AI’s answers can rapidly deteriorate, as the systems lock in on the most probable word choices and discard the “tail” choices that keep their output interesting.
  • Model Autophagy Disorder, or MAD, is what one set of researchers at Rice and Stanford universities dubbed the result of AI consuming its own products.
  • “Habsburg AI” is what another researcher earlier this year labeled the phenomenon, likening it to inbreeding: “A system that is so heavily trained on the outputs of other generative AIs that it becomes an inbred mutant, likely with exaggerated, grotesque features.”…(More)”.

Toward Bridging the Data Divide


Blog by Randeep Sudan, Craig Hammer, and Yaroslav Eferin: “Developing countries face a data conundrum. Despite more data being available than ever in the world, low- and middle-income countries often lack adequate access to valuable data and struggle to fully use the data they have.

This seemingly paradoxical situation represents a data divide. The terms “digital divide” and “data divide” are often used interchangeably but differ. The digital divide is the gap between those with access to digital technologies and those without access. On the other hand, the data divide is the gap between those who have access to high-quality data and those who do not. The data divide can negatively skew development across countries and therefore is a serious issue that needs to be addressed…

The effects of the data divide are alarming, with low- and middle-income countries getting left behind. McKinsey estimates that 75% of the value that could be created through Generative AI (such as ChatGPT) would be in four areas of economic activity: customer operations, marketing and sales, software engineering, and research and development. They further estimate that Generative AI could add between $2.6 trillion and $4.4 trillion in value in these four areas.

PwC estimates that approximately 70% of all economic value generated by AI will likely accrue to just two countries: the USA and China. These two countries account for nearly two-thirds of the world’s hyperscale data centers, high rates of 5G adoption, the highest number of AI researchers, and the most funding for AI startups. This situation creates serious concerns for growing global disparities in accessing benefits from data collection and processing, and the related generation of insights and opportunities. These disparities will only increase over time without deliberate efforts to counteract this imbalance…(More)”.

The Coming Wave


Book by Mustafa Suleyman and Michael Bhaskar: “Soon you will live surrounded by AIs. They will organise your life, operate your business, and run core government services. You will live in a world of DNA printers and quantum computers, engineered pathogens and autonomous weapons, robot assistants and abundant energy.

None of us are prepared.

As co-founder of the pioneering AI company DeepMind, part of Google, Mustafa Suleyman has been at the centre of this revolution. The coming decade, he argues, will be defined by this wave of powerful, fast-proliferating new technologies.

In The Coming Wave, Suleyman shows how these forces will create immense prosperity but also threaten the nation-state, the foundation of global order. As our fragile governments sleepwalk into disaster, we face an existential dilemma: unprecedented harms on one side and the threat of overbearing surveillance on the other…(More)”.

Regulation of Artificial Intelligence Around the World


Report by the Law Library of Congress: “…provides a list of jurisdictions in the world where legislation that specifically refers to artificial intelligence (AI) or systems utilizing AI have been adopted or proposed. Researchers of the Law Library surveyed all jurisdictions in their research portfolios to find such legislation, and those encountered have been compiled in the annexed list with citations and brief descriptions of the relevant legislation. Only adopted or proposed instruments that have legal effect are reported for national and subnational jurisdictions and the European Union (EU); guidance or policy documents that have no legal effect are not included for these jurisdictions. Major international organizations have also been surveyed and documents adopted or proposed by these organizations that specifically refer to AI are reported in the list…(More)”.

Using Data Science for Improving the Use of Scholarly Research in Public Policy


Blog by Basil Mahfouz: “Scientists worldwide published over 2.6 million papers in 2022 – almost 5 papers per minute and more than double what they published in the year 2000. Are policy makers making the most of the wealth of available scientific knowledge? In this blog, we describe how we are applying data science methods on the bibliometric database of Elsevier’s International Centre for the Study of Research (ICSR) to analyse how scholarly research is being used by policy makers. More specifically, we will discuss how we are applying natural language processing and network dynamics to identify where there is policy action and also strong evidence; where there is policy interest but a lack of evidence; and where potential policies and strategies are not making full use of available knowledge or tools…(More)”.

Data Is Everybody’s Business


Book by Barbara H. Wixom, Cynthia M. Beath and Leslie Owens: “Most organizations view data monetization—converting data into money—too narrowly: as merely selling data sets. But data monetization is a core business activity for both commercial and noncommercial organizations, and, within organizations, it’s critical to have wide-ranging support for this pursuit. In Data Is Everybody’s Business, the authors offer a clear and engaging way for people across the entire organization to understand data monetization and make it happen. The authors identify three viable ways to convert data into money—improving work with data, wrapping products with data, and selling information offerings—and explain when to pursue each and how to succeed…(More)”.

Guess who’s getting the world’s first self-sovereign national digital ID?


Article by Durga M Sengupta: “Bhutan — a small Himalayan nation with less than 800,000 people — has decided to roll out a national digital identity system for all its citizens. “National digital ID is the platform on which digitization and online services of banks to hospitals to taxation to universities, everything can come online with 100% assurance,” Ujjwal Deep Dahal, CEO of Druk Holding and Investments, the commercial and investment arm of the government which developed the system, told me over a video call from the capital city of Thimphu.

The national ID system has been built using blockchain technology, which will provide each individual a “self-sovereign” identity, meaning it can only be controlled by the citizen and no other entity, similar to how cryptocurrencies work.

The country’s 7-year-old crown prince, Jigme Namgyel Wangchuck, was the first to enroll in the new system, and it is expected to reach the rest of the population within the year, Dahal said. 

“Once I’m onboarded, the interesting part about self-sovereign identity is that only I have my verified credentials in my wallet, in my phone. Nobody has access to it thereon but me, not even the government,” he said. The onboarding process takes about 5 seconds, Dahal estimated. “In our system, you will not visit any booth to register yourself. You’ll just download an app; share your details, selfie, and national ID card; and in the back end, the AI algorithm will run and say, ‘Okay, I can give you a verified credential,’” he said. This timeline would differ for people who don’t have smartphones or require assistance.

Druk Holding and Investments has been instrumental in setting up various other parallel projects, including the recently announced Bhutanverse — a metaverse that displays Bhutanese art, architecture, and motifs…(More)”. See also: Field Report: On the Emergent Use of Distributed Ledger Technologies for Identity Management

It’s Official: Cars Are the Worst Product Category We Have Ever Reviewed for Privacy


Article by the Mozilla Foundation: “Car makers have been bragging about their cars being “computers on wheels” for years to promote their advanced features. However, the conversation about what driving a computer means for its occupants’ privacy hasn’t really caught up. While we worried that our doorbells and watches that connect to the internet might be spying on us, car brands quietly entered the data business by turning their vehicles into powerful data-gobbling machines. Machines that, because of all those brag-worthy bells and whistles, have an unmatched power to watch, listen, and collect information about what you do and where you go in your car.

All 25 car brands we researched earned our *Privacy Not Included warning label — making cars the official worst category of products for privacy that we have ever reviewed…(More)”.