Data property, data governance and Common European Data Spaces


Paper by Thomas Margoni, Charlotte Ducuing and Luca Schirru: “The Data Act proposal of February 2022 constitutes a central element of a broader and ambitious initiative of the European Commission (EC) to regulate the data economy through the erection of a new general regulatory framework for data and digital markets. The resulting framework may be represented as a model of governance between a pure market-driven model and a fully regulated approach, thereby combining elements that traditionally belong to private law (e.g., property rights, contracts) and public law (e.g., regulatory authorities, limitation of contractual freedom). This article discusses the role of (intellectual) property rights as well as of other forms of rights allocation in data legislation with particular attention to the Data Act proposal. We argue that the proposed Data Act has the potential to play a key role in the way in which data, especially privately held data, may be accessed, used, and shared. Nevertheless, it is only by looking at the whole body of data (and data related) legislation that the broader plan for a data economy can be grasped in its entirety. Additionally, the Data Act proposal may also arguably reveal the elements for a transition from a property-based to a governance-based paradigm in the EU data strategy. Whereas elements of data governance abound, the stickiness of property rights and rhetoric seem however hard to overcome. The resulting regulatory framework, at least for now, is therefore an interesting but not always perfectly coordinated mix of both. Finally, this article suggests that the Data Act Proposal may have missed the chance to properly address the issue of data holders’ power and related information asymmetries, as well as the need for coordination mechanisms…(More)”.

Africa fell in love with crypto. Now, it’s complicated


Article by Martin K.N Siele: “Chiamaka, a former product manager at a Nigerian cryptocurrency startup, has sworn off digital currencies. The 22-year-old has weathered a layoff and lost savings worth 4,603,500 naira ($9,900) after the collapse of FTX in November 2022. She now works for a corporate finance company in Lagos, earning a salary that is 45% lower than her previous job.

“I used to be bullish on crypto because I believed it could liberate Africans financially,” Chiamaka, who asked to be identified by a pseudonym as she was concerned about breaching her contract with her current employer, told Rest of World. “Instead, it has managed to do the opposite so far … at least to me and a few of my friends.”

Chiamaka is among the tens of millions of Africans who bought into the cryptocurrency frenzy over the last few years. According to one estimate in mid-2022, around 53 million Africans owned crypto — 16.5% of the total global crypto users. Nigeria led with over 22 million users, ranking fourth globally. Blockchain startups and businesses on the continent raised $474 million in 2022, a 429% increase from the previous year, according to the African Blockchain Report. Young African creatives also became major proponents of non-fungible tokens (NFTs), taking inspiration from pop culture and the continent’s history. Several decentralized autonomous organizations (DAOs), touted as the next big thing, emerged across Africa…(More)”.

Accept All: Unacceptable? 


Report by Demos and Schillings: “…sought to investigate how our data footprints are being created and exploited online. It involved an exploratory investigation into how data sharing and data regulation practices are impacting citizens: looking into how individuals’ data footprints are created, what people experience when they want to exercise their data rights, and how they feel about how their data is being used. This was a novel approach, using live case studies as they embarked on a data odyssey in order to understand, in real time, the data challenge people face.

We then held a series of stakeholder roundtables with academics, lawyers, technologists, and people working in industry and civil society, which focused on diagnosing the problems and on what potential solutions already look like, or could look like in the future, across multiple stakeholder groups…(More)”. See also: a documentary produced alongside this report by the project partners, law firm Schillings and the independent consumer data action service Rightly, together with TVN.

End of data sharing could make Covid-19 harder to control, experts and high-risk patients warn


Article by Sam Whitehead: “…The federal government’s public health emergency that’s been in effect since January 2020 expires May 11. The emergency declaration allowed for sweeping changes in the U.S. health care system, like requiring state and local health departments, hospitals, and commercial labs to regularly share data with federal officials.

But some data-sharing requirements will come to an end, and the federal government will lose access to key metrics, as a skeptical Congress seems unlikely to grant agencies additional powers. And private projects, like those from The New York Times and Johns Hopkins University, which made covid data understandable and useful for everyday people, stopped collecting data in March.

Public health legal scholars, data experts, former and current federal officials, and patients at high risk of severe covid outcomes worry the scaling back of data access could make it harder to control covid.

There have been improvements in recent years, such as major investments in public health infrastructure and updated data reporting requirements in some states. But concerns remain that the overall shambolic state of U.S. public health data infrastructure could hobble the response to any future threats.

“We’re all less safe when there’s not the national amassing of this information in a timely and coherent way,” said Anne Schuchat, former principal deputy director of the Centers for Disease Control and Prevention.

A lack of data in the early days of the pandemic left federal officials, like Schuchat, with an unclear picture of the rapidly spreading coronavirus. And even as the public health emergency opened the door for data-sharing, the CDC labored for months to expand its authority.

Eventually, more than a year into the pandemic, the CDC gained access to data from private health care settings, such as hospitals and nursing homes, commercial labs, and state and local health departments…(More)”. See also: Why we still need data to understand the COVID-19 pandemic

How to worry wisely about AI


The Economist: “Should we automate away all the jobs, including the fulfilling ones? Should we develop non-human minds that might eventually outnumber, outsmart…and replace us? Should we risk loss of control of our civilisation?” These questions were asked last month in an open letter from the Future of Life Institute, an NGO. It called for a six-month “pause” in the creation of the most advanced forms of artificial intelligence (AI), and was signed by tech luminaries including Elon Musk. It is the most prominent example yet of how rapid progress in AI has sparked anxiety about the potential dangers of the technology.

In particular, new “large language models” (LLMs)—the sort that powers ChatGPT, a chatbot made by OpenAI, a startup—have surprised even their creators with their unexpected talents as they have been scaled up. Such “emergent” abilities include everything from solving logic puzzles and writing computer code to identifying films from plot summaries written in emoji…(More)”.

Speaking in Tongues — Teaching Local Languages to Machines


Report by DIAL: “…Machines learn to talk to people by digesting digital content in languages people speak through a technique called Natural Language Processing (NLP). As things stand, only about 85 of the world’s approximately 7,500 languages are represented in the major NLPs — and just 7 languages, with English being the most advanced, comprise the majority of the world’s digital knowledge corpus. Fortunately, many initiatives are underway to fill this knowledge gap. My new mini-report with Digital Impact Alliance (DIAL) highlights a few of them from Serbia, India, Estonia, and Africa.

The examples in the report are just a subset of initiatives on the ground to make digital services accessible to people in their local languages. They are a cause for excitement and hope (tempered by realistic expectations). A few themes across the initiatives include –

  • Despite the excitement and enthusiasm, most of the programs above are still at a very nascent stage — many may fail, and others will require investment and time to succeed. While countries such as India have initiated formal national NLP programs (one that is too early to assess), others such as Serbia have so far taken a more ad hoc approach.
  • Smaller countries like Estonia recognize the need for state intervention as the local population isn’t large enough to attract private sector investment. Countries will need to balance their local, cultural, and political interests against commercial realities as languages become digital or are digitally excluded.
  • Community engagement is an important component of almost all initiatives. India has set up a formal crowdsourcing program; other programs in Africa are experimenting with elements of participatory design and crowd curation.
  • While critics have accused ChatGPT and others of paying contributors from the global south very poorly for their labeling and other content services, it appears that many initiatives in the south are beginning to dabble with payment models to incentivize crowdsourcing and sustain contributions from the ground.
  • The engagement of local populations can ensure that NLP models learn appropriate cultural nuances, and better embody local social and ethical norms…(More)”.

AI translation is jeopardizing Afghan asylum claims


Article by Andrew Deck: “In 2020, Uma Mirkhail got a firsthand demonstration of how damaging a bad translation can be.

A crisis translator specializing in Afghan languages, Mirkhail was working with a Pashto-speaking refugee who had fled Afghanistan. A U.S. court had denied the refugee’s asylum bid because her written application didn’t match the story told in the initial interviews.

In the interviews, the refugee had first maintained that she’d made it through one particular event alone, but the written statement seemed to reference other people with her at the time — a discrepancy large enough for a judge to reject her asylum claim.

After Mirkhail went over the documents, she saw what had gone wrong: An automated translation tool had swapped the “I” pronouns in the woman’s statement to “we.”

Mirkhail works with Respond Crisis Translation, a coalition of over 2,500 translators that provides interpretation and translation services for migrants and asylum seekers around the world. She told Rest of World this kind of small mistake can be life-changing for a refugee. In the wake of the Taliban’s return to power in Afghanistan, there is an urgent demand for crisis translators working in languages such as Pashto and Dari. Working alongside refugees, these translators can help clients navigate complex immigration systems, including drafting immigration forms such as asylum applications. But a new generation of machine translation tools is changing the landscape of this field — and adding a new set of risks for refugees…(More)”.

The Coming Age of AI-Powered Propaganda


Essay by Josh A. Goldstein and Girish Sastry: “In the seven years since Russian operatives interfered in the 2016 U.S. presidential election, in part by posing as Americans in thousands of fake social media accounts, another technology with the potential to accelerate the spread of propaganda has taken center stage: artificial intelligence, or AI. Much of the concern has focused on the risks of audio and visual “deepfakes,” which use AI to invent images or events that did not actually occur. But another AI capability is just as worrisome. Researchers have warned for years that generative AI systems trained to produce original language—“language models,” for short—could be used by U.S. adversaries to mount influence operations. And now, these models appear to be on the cusp of enabling users to generate a near limitless supply of original text with limited human effort. This could improve the ability of propagandists to persuade unwitting voters, overwhelm online information environments, and personalize phishing emails. The danger is twofold: not only could language models sway beliefs; they could also corrode public trust in the information people rely on to form judgments and make decisions.

The progress of generative AI research has outpaced expectations. Last year, language models were used to generate functional proteins, beat human players in strategy games requiring dialogue, and create online assistants. Conversational language models have come into wide use almost overnight: more than 100 million people used OpenAI’s ChatGPT program in the first two months after it was launched, in December 2022, and millions more have likely used the AI tools that Google and Microsoft introduced soon thereafter. As a result, risks that seemed theoretical only a few years ago now appear increasingly realistic. For example, the AI-powered “chatbot” that powers Microsoft’s Bing search engine has shown itself to be capable of attempting to manipulate users—and even threatening them.

As generative AI tools sweep the world, it is hard to imagine that propagandists will not make use of them to lie and mislead…(More)”.

Harnessing Data Innovation for Migration Policy: A Handbook for Practitioners


Report by IOM: “The Practitioners’ Handbook provides first-hand insights into why and how non-traditional data sources can contribute to better understanding migration-related phenomena. The Handbook aims to (a) bridge the practical and technical aspects of using data innovations in migration statistics, (b) demonstrate the added value of using new data sources and innovative methodologies to analyse key migration topics that may be hard to fully grasp using traditional data sources, and (c) identify good practices in addressing issues of data access and collaboration with multiple stakeholders (including the private sector), ethical standards, and security and data protection issues…(More)” See also Big Data for Migration Alliance.

The Myth of Objective Data


Article by Melanie Feinberg: “The notion that human judgment pollutes scientific attempts to understand natural phenomena as they really are may seem like a stable and uncontroversial value. However, as Lorraine Daston and Peter Galison have established, objectivity is a fairly recent historical development.

In Daston and Galison’s account, which focuses on scientific visualization, objectivity arose in the 19th century, congruent with the development of photography. Before photography, scientific illustration attempted to portray an ideal exemplar rather than an actually existing specimen. In other words, instead of drawing a realistic portrait of an individual fruit fly — which has unique, idiosyncratic characteristics — an 18th-century scientific illustrator drew an ideal fruit fly. This ideal representation would better portray average fruit fly characteristics, even as no actual fruit fly is ever perfectly average.

With the advent of photography, drawings of ideal types began to lose favor. The machinic eye of the lens was seen as enabling nature to speak for itself, providing access to a truer, more objective reality than the human eye of the illustrator. Daston and Galison emphasize, however, that this initial confidence in the pure eye of the machine was swiftly undermined. Scientists soon realized that photographic devices introduce their own distortions into the images that they produce, and that no eye provides an unmediated view onto nature. From the perspective of scientific visualization, the idea that machines allow us to see true has long been outmoded. In everyday discourse, however, there is a continuing tendency to characterize the objective as that which speaks for itself without the interference of human perception, interpretation, judgment, and so on.

This everyday definition of objectivity particularly affects our understanding of data collection. If in our daily lives we tend to overlook the diverse, situationally textured sense-making actions that information seekers, conversation listeners, and other recipients of communicative acts perform to make automated information systems function, we are even less likely to acknowledge and value the interpretive work of data collectors, even as these actions create the conditions of possibility upon which data analysis can operate…(More)”.