Stefaan Verhulst
Report by Columbia World Projects: “Citizens cannot make active choices about what they see on social media. Independent regulators cannot hold companies accountable for their obligations under a growing number of national and regional online safety regimes. The research community — made up of academics, civil society groups and the media — cannot highlight potential deficiencies in both platform and regulatory action. Collectively, this represents a deficit in social media platform transparency and accountability that is a direct threat to individuals’ fundamental rights, as well as to wider societal democratic norms. Funders, regulators and researchers must act within the next 6-12 months to establish foundational infrastructure and standards related to social media data access. Without swift action, democratic institutions are vulnerable to the weaponization of social media platforms whose activities remain opaque and subject to potential manipulation by malign actors.
It is within this context that the Columbia-Hertie initiative provides clear funding recommendations, as outlined in the chart below. At its core, this work is based on upholding the highest levels of data protection and security practices so that any form of social media data access protects the privacy rights of individual social media users — no matter where they are located. That is the guiding principle for all recommendations.
The report is divided into three sections:
1. Supporting Underlying Data Access Infrastructure
2. Building Best Practices for the Research Community
3. Fostering Researcher-Regulator Relationships
Each of these sections provides specific recommendations on how public and private funders can seize the existing opportunities in social media data access. The recommendations include which type of funder is most appropriate; how much money is required to meet the objectives; and a timescale for results…(More)”
Report by James P. Cummings: “Humanity stands at the threshold of a new era in biological understanding, disease treatment, and overall wellness. The convergence of evolving patient and caregiver (consumer) behaviors, increased data collection, advancements in health technology and standards, federal policies, and the rise of artificial intelligence (AI) is driving one of the most significant transformations in human history. To achieve transformative health care insights, AI must have access to comprehensive longitudinal health records (LHRs) that span clinical, genomic, nonclinical, wearable, and patient-generated data. Despite the extensive use of electronic medical records and widespread interoperability efforts, current health care organizations, electronic medical record vendors, and public agencies are not incentivized to develop and maintain complete LHRs. This paper explores the new paradigm of consumers as the common provenance and singular custodian of LHRs. With fully aligned intentions and ample time to dedicate to optimizing their health outcomes, patients and caregivers must assume the sole responsibility to manage or delegate aggregation of complete, accurate, and real-time LHRs. Significant gaps persist in empowering consumers to act as primary custodians of their health data and to aggregate their complete LHRs, a foundational requirement for the effective application of AI. Rare disease communities, leaders in participatory care, offer a compelling model for demonstrating how consumer-driven data aggregation can be achieved and underscore the need for improved policy frameworks and technological tools. The convergence of AI and LHRs promises to transform medicine by enhancing clinical decision-making, accelerating accurate diagnoses, and dramatically advancing our ability to understand and treat disease at an unprecedented pace…(More)”.
Article by Stefaan G. Verhulst: “Since 2016, the FAIR principles — specifying that data should be Findable, Accessible, Interoperable, and Reusable — have served as the foundation for responsible open data management. Especially within the open science community, FAIR has shaped how we publish, share, and reuse scientific and public data. It brought a common language to a fragmented ecosystem.
But as artificial intelligence transforms how knowledge is produced and decisions are made, FAIR alone may no longer be enough. We now face a new question:
What does it mean for data to be AI-ready — and ready for what kind of AI?
Earlier this year, we sought to provide an answer to that question by proposing the FAIR-R Principles and Framework. Last week, Frontiers released its own FAIR² data management platform. Both seek to extend FAIR, but they diverge in focus and method. FAIR-R introduces a conceptual expansion; FAIR² adds operational guidance. Together, they reveal how our understanding of data readiness is evolving in the age of AI…(More)”
Book by Tim Wu: “Our world is dominated by a handful of tech platforms. They provide great conveniences and entertainment, but also stand as some of the most effective instruments of wealth extraction ever invented, seizing immense amounts of money, data, and attention from all of us. An economy driven by digital platforms and AI influence offers the potential to enrich us, and also threatens to marginalize entire industries, widen the wealth gap, and foster a two-class nation. As technology evolves and our markets adapt, can society cultivate a better life for everyone? Is it possible to balance economic growth and egalitarianism, or are we too far gone?
Tim Wu—the preeminent scholar and former White House official who coined the phrase “net neutrality”—explores the rise of platform power and details the risks and rewards of working within such systems. The Age of Extraction tells the story of an Internet that promised widespread wealth and democracy in the 1990s and 2000s, only to create new economic classes and aid the spread of autocracy instead. Wu frames our current moment with lessons from recent history—from generative AI and predictive social data to the antimonopoly and crypto movements—and envisions a future where technological advances can serve the greatest possible good. Concise and hopeful, The Age of Extraction offers consequential proposals for taking back control in order to achieve a better economic balance and prosperity for all…(More)”.
UNECE Report: “Developed under the Applying Data Science and Modern Methods Group of the High-Level Group for the Modernisation of Official Statistics (HLG-MOS), this framework provides practical guidance on the responsible use of Artificial Intelligence (AI) and Machine Learning (ML) in the production of official statistics. It outlines key principles such as fairness, accountability, transparency, and validity, accompanied by concrete guidelines and examples. The framework aims to help national and international statistical organisations adopt AI technologies in a trustworthy, ethical, and sustainable manner…(More)”.
Paper by Marta Zorrilla and Juan Yebenes: “The growing importance of data as a driver of the digital economy is promoting the creation of data spaces for the secure and controlled exchange of data between organizations. Data governance is emerging as an essential pillar to ensure efficient, ethical and transparent access to and use of data in these ecosystems. The article reviews the state of the art to identify the specific requirements that data governance must address in data spaces and proposes a reference enterprise architecture to facilitate the design, development and implementation of a data governance system for a data space scenario. The proposed framework has already been formally defined and validated in the context of Industry 4.0, and is now adapted to the particular characteristics and needs of data spaces. This architecture focuses on key aspects of data governance in data spaces, such as new requirements, principles, organization, roles and responsibilities, and data quality, security and metadata management, as well as the data lifecycle in the data space. This research contributes to guiding data space governance bodies in formalizing data strategies and high-level governance principles into concrete architectural components that establish the capacities to be implemented within the data ecosystem. To support practical adoption, this work also provides clarifying examples of the different architectural blocks…(More)”.
Press Release by the European Commission: “As of today, new rules under the Digital Services Act (DSA) will allow researchers to gain unprecedented access to very large online platforms’ data to study the societal impact stemming from the platforms’ systems.

Such access is now possible following the entry into force of the delegated act on data access.
The measures will allow qualified researchers to request access to previously unavailable data from very large online platforms and search engines. Platforms’ own data is a key element in understanding the possible systemic risks stemming from, for example, recommender systems. Access to these data will also help address risks such as the spread of illegal content and financial scams, helping to ensure a safer online experience for users and, importantly, minors.
While creating opportunities for new studies, these measures also include safeguards to protect the companies’ interests. To get access to platforms’ data, researchers will have to undergo a strict assessment carried out by Digital Services Coordinators, the national authorities responsible for the implementation of the DSA. If researchers fulfil all the criteria prescribed by the law and their research projects are relevant for studying systemic risks under the DSA, such as the spread of illegal content or negative effects on mental health, the platforms are legally required to comply with their data requests. Digital Services Coordinators are already working together to ensure that data access applications will be assessed uniformly across Member States and in due time…(More)”.
Paper by Joseph Bak-Coleman, et al: “Emerging information technologies like social media, search engines, and AI can have a broad impact on public health, political institutions, social dynamics, and the natural world. It is critical to develop a scientific understanding of these impacts to inform evidence-based technology policy that minimizes harm and maximizes benefits. Unlike most other global-scale scientific challenges, however, the data necessary for scientific progress are generated and controlled by the same industry that might be subject to evidence-based regulation. Moreover, technology companies historically have been, and continue to be, a major source of funding for this field. These asymmetries in information and funding raise significant concerns about the potential for undue industry influence on the scientific record. In this Perspective, we explore how technology companies can influence our scientific understanding of their products. We argue that science faces unique challenges in the context of technology research that will require strengthening existing safeguards and constructing wholly new ones…(More)”.
Article by Yale Insights: “Let’s say you see an interesting headline about new research, suggesting that a ride-share service reduced traffic jams in cities. You click the link, scroll down the article, and then discover that the study authors based their results on private data from the ride-share company itself. How much would that disclosure dampen your trust in the finding?
Quite a bit, according to a new study co-authored by Prof. John Barrios. On average, the use of private data decreased people’s trust in economic results by one-fifth. And that’s a problem for economists, who often rely on these data sets to fuel their studies.
“That’s the lifeblood of modern economic research,” Barrios says. “But it also creates a tension: the very data that let us answer new questions can make our work seem less independent.” His team’s study suggested that other conflicts, such as accepting consulting fees from companies and ideological biases, also substantially lowered trust.
When conflicts undermine credibility, studies lose some of their power to shape public policy decisions. “We engage in creating knowledge, and that marketplace of ideas runs on the currency of trust,” he says. “Once that trust erodes, so too does our research in impacting the world around us.”…(More)”. See also: “The Conflict-of-Interest Discount in the Marketplace of Ideas”
Article by Daniel Björkegren: “The cost of using artificial intelligence (AI) has plummeted. Almost every smartphone in the world can access AI chatbots for free for several queries a day, through ChatGPT, WhatsApp, Google, or other providers. However, these services have yet to reach most mobile phones that lack active internet service, which number around 1.7 billion based on statistics from the International Telecommunication Union, or ITU. Those phones are in the hands of roughly a billion people who are excluded from AI. They include some of the world’s poorest and most remote residents, who might use the technology to obtain information of all sorts, including advice on business, education, health, and agriculture.
Basic mobile phones could theoretically access AI services through text messages, using a technology developed in the 1990s called short message service, or SMS. SMS allows the sending and receiving of messages of up to 160 characters of text. It is extremely efficient: SMS uses the leftover capacity of a cellular network, so the marginal resource use per message is tiny. And it is popular: over 6 trillion SMS messages were sent in the most recent year for which data are available.
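The 160-character limit is the practical constraint any SMS-delivered chatbot would have to work around, since a useful AI reply rarely fits in a single message. The sketch below illustrates one simple way a service might split a longer reply into SMS-sized segments; the 153-character segment size (the conventional payload when concatenated-SMS headers are used) and the greedy word-wrapping strategy are illustrative assumptions, not details taken from the article.

```python
# Illustrative sketch: split a long chatbot reply into SMS-sized segments.
# Assumes plain GSM-7 text and 153 characters of payload per concatenated
# segment; both are assumptions for illustration, not from the article.

def split_into_sms_segments(reply: str, segment_chars: int = 153) -> list[str]:
    """Greedy word-wrap of a reply into SMS-sized segments (sketch only)."""
    segments: list[str] = []
    current = ""
    for word in reply.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) <= segment_chars:
            current = candidate
        else:
            if current:
                segments.append(current)
            # Note: a single word longer than segment_chars is not split here.
            current = word
    if current:
        segments.append(current)
    return segments


if __name__ == "__main__":
    # Hypothetical reply, repeated to exceed one segment.
    answer = "Rotate maize and legumes to restore soil nitrogen. " * 8
    for i, segment in enumerate(split_into_sms_segments(answer), start=1):
        print(f"[{i}] {segment}")
```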
However, in many countries mobile phone operators charge application providers extremely high prices for SMS, which limit the viability of digital services for basic phones. SMS pricing in different countries can be opaque, so I gathered prices from several reputable international platforms and, when available, from major operators themselves. The prices I could find vary dramatically: for example, it would cost $0.08 to send 100 SMS messages to users in Ghana (through an operator)—but $44.75 to send the same messages to users in Pakistan (through the platform Vonage Nexmo; a major operator did not provide prices in response to my request). Overall, prices are high: for the median phone without internet, the price I found is $5.33 per 100 messages sent. These prices represent the lowest bulk or application-to-person (A2P) SMS rate that would be paid by an organization sending messages from a digital service (the end of the post describes the methodology). Consumers are typically also charged separate retail prices for any messages they send, though message allowances may be included in their plans. While it may be possible for organizations to negotiate lower prices, A2P SMS is expensive any way you look at it. The median A2P SMS price corresponds to $380,616 per gigabyte of data transmitted. In comparison, when using mobile data on a smartphone, the median price of bandwidth is only $6 per gigabyte. For the median price of sending one message via SMS, you could send 58,775 similarly sized messages via mobile data on a smartphone. SMS is expensive relative to mobile internet around the world, as shown in Table 1, which reports statistics by country…(More)”.
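To make the per-gigabyte comparison concrete, here is a rough back-of-the-envelope calculation. The inputs are the rounded figures quoted in the excerpt (140 bytes of payload per 160-character GSM-7 message, $5.33 per 100 A2P messages, roughly $6 per gigabyte of smartphone data); the script is an illustrative sketch, not the author's methodology.

```python
# Back-of-the-envelope comparison of A2P SMS pricing vs. mobile-data pricing.
# All inputs are the rounded figures quoted in the excerpt.

SMS_PAYLOAD_BYTES = 140          # 160 GSM-7 chars x 7 bits = 1,120 bits = 140 bytes
PRICE_PER_100_SMS_USD = 5.33     # median A2P price for phones without internet
MOBILE_DATA_USD_PER_GB = 6.0     # median smartphone bandwidth price
BYTES_PER_GB = 10**9             # decimal gigabyte

price_per_sms = PRICE_PER_100_SMS_USD / 100
sms_price_per_gb = price_per_sms / SMS_PAYLOAD_BYTES * BYTES_PER_GB

# How many SMS-sized messages could mobile data deliver for the price of one SMS?
data_cost_per_message = SMS_PAYLOAD_BYTES / BYTES_PER_GB * MOBILE_DATA_USD_PER_GB
messages_via_mobile_data = price_per_sms / data_cost_per_message

print(f"Effective A2P SMS price per gigabyte: ${sms_price_per_gb:,.0f}")
print(f"SMS-sized messages per SMS price, via mobile data: {messages_via_mobile_data:,.0f}")
```

With these rounded inputs the sketch gives roughly $381,000 per gigabyte and about 63,000 data-equivalent messages, close to the $380,616 and 58,775 figures quoted above; the small gaps come from rounding in the published medians.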