Stefaan Verhulst
Article by James W Kelly: “Medical information of 500,000 participants of one of the UK’s landmark scientific programmes, UK Biobank, was offered for sale online in China, the government has confirmed.
Technology minister Ian Murray said information on all members of the database was found listed for sale on the website Alibaba.
Murray told MPs the charity which runs UK Biobank had told the government about the breach on Monday. He said the information did not include names, addresses, contact details or telephone numbers.
However, he said it could include gender, age, month and year of birth, socioeconomic status, lifestyle habits, and measures from biological samples.
The Biobank is a collection of health data offered by volunteers which has been used to support improvements in the detection and treatment of dementia, some cancers and Parkinson’s.
It has collected intimate details – including whole-body scans, DNA sequences and medical records – from hundreds of thousands of volunteers for over two decades. The project has led to more than 18,000 scientific publications.
Participants were aged from 40 to 69 when they were recruited between 2006 and 2010.
UK Biobank said it was investigating the incident and thanked the UK and Chinese governments, as well as Alibaba, for support and cooperation…(More)”.
Paper by Juan Ortiz-Freuler and Manuel Castells: “Control over digital interfaces has become a significant aspect of geopolitical struggles. This article advances an analytical framework illuminating how global communication power manifests across three key interfaces: search engines, social media, and AI agents. We articulate the evolution of these interfaces from corporate innovation to an aspect of contested transnational control, and conceptualize how corporate multinationals like Google, Facebook, TikTok, and DeepSeek leverage interface design to consolidate authority while state interventions challenge their market control. Governments seek to instrumentalize or challenge corporate interfaces to advance national goals, while firms strategically align with or resist state agendas to secure market access. The framework articulates how these forces reconfigure relations between information, people, and machines, with implications for the internet’s next phase…(More)”.
Article by Manon Revel & Théophile Pénigaud: “…unpacks the design choices behind longstanding and newly proposed computational frameworks aimed at finding common ground across collective preferences and examines their potential future impacts, both technically and normatively. It begins by situating AI-assisted preference elicitation within the historical role of opinion polls, emphasizing that preferences are shaped by the decision-making context and are seldom objectively captured. With that caveat in mind, we explore the extent to which AI-based democratic innovations might serve as discovery tools which support reasonable representations of a collective will, sense-making, and agreement-seeking. At the same time, we caution against dangerously misguided uses, such as enabling binding decisions, fostering gradual disempowerment or post-rationalizing political outcomes…(More)”.
Paper by Jakob Ohme and LK Seiling: “The EU’s Digital Services Act (DSA) establishes, for the first time, a legal right for independent researchers to access platform data in the public interest. Once designated as Very Large Online Platforms or Search Engines (VLOPSEs), services reaching 45 million EU users must provide data access to support research on “systemic risks.” Article 40 creates two pathways: Article 40(12) enables access to publicly available data beyond voluntary platform tools, while Article 40(4) allows vetted researchers to request non-public data – such as exposure logs, moderation records, and recommendation metrics – through national Digital Services Coordinators rather than platforms. Both routes are purpose-limited to studying systemic risks and, for Article 40(4), mitigation measures. Yet the DSA’s broad, non-exhaustive definition of systemic risk – covering illegal content, fundamental rights, civic discourse, public health, and user well-being – opens a wide research space spanning misinformation flows, political networks, algorithmic amplification, and platform governance, among others. Early implementation reveals challenges, including uneven compliance, uncertain technical standards, funding constraints, and limits to data sharing for replication. Nonetheless, the DSA marks a turning point: platform research is no longer dependent on corporate discretion but grounded in public-interest regulation. Researchers now play a central role in shaping evidence-based oversight of digital platforms in Europe…(More)”.
Article by Alex Daniels: “Could the soul-sucking process of applying for philanthropic grants be on the way out? That is one of the goals of a new $8 million effort supported by the MacArthur Foundation.
The project, dubbed the Philanthropy Data Commons, is an attempt to bring a huge reservoir of foundation and charity information into a single database. Grant seekers and grant makers can drill into the data to find partners that share the same goals, among the vast universe of tax-exempt organizations.
“It should take nonprofits less time to apply for grants and allow them more time to spend on their missions,” said Elizabeth Kane, co-director of the Commons. “By the same token, many funders struggle to find and support organizations that are aligned with their goals. It could make the grant application process more efficient for both sides.”
Currently, the publicly available data from Internal Revenue Service filings that nonprofits can scour for grant information is limited. It only provides basic personnel and financial information and lacks detail about what work funders want to support and how well nonprofits have performed.
If enough organizations provide more granular information to the Data Commons — things like due diligence reports on potential grantees, project timelines, and impact data — the database and the applications created to use it could play matchmaker. Grantees and grant makers could be connected through a largely automated process. Grantees would be able to search grant makers and vice versa. Applications for many grants could be completed with a minimum of keystrokes. For instance, if a grantee located several foundations that matched some basic criteria, it could auto-populate fields in an application using its stored data and send it off to all of the grant makers at the same time…(More)”.
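The auto-matching flow described above can be illustrated with a short sketch. Note that the excerpt does not specify the Philanthropy Data Commons schema or API, so every field name, record structure, and function here is a hypothetical illustration of the idea, not the project's actual implementation:

```python
# Illustrative sketch only: the Philanthropy Data Commons data model is not
# described in the article, so all names and fields below are hypothetical.

def match_funders(nonprofit, funders):
    """Return funders whose focus areas overlap the nonprofit's mission tags."""
    return [
        f for f in funders
        if set(f["focus_areas"]) & set(nonprofit["mission_tags"])
    ]

def auto_populate(nonprofit, funder):
    """Pre-fill a grant application from the nonprofit's stored profile."""
    return {
        "funder": funder["name"],
        "applicant": nonprofit["name"],
        "ein": nonprofit["ein"],
        "mission": nonprofit["mission_statement"],
        "requested_focus": sorted(
            set(funder["focus_areas"]) & set(nonprofit["mission_tags"])
        ),
    }

# Hypothetical example records standing in for Data Commons entries.
nonprofit = {
    "name": "River Cleanup Alliance",
    "ein": "00-0000000",
    "mission_statement": "Restore urban waterways.",
    "mission_tags": ["environment", "water"],
}
funders = [
    {"name": "Blue Planet Fund", "focus_areas": ["water", "oceans"]},
    {"name": "Arts Forward", "focus_areas": ["arts"]},
]

# One pass produces pre-filled applications for every matching funder.
applications = [auto_populate(nonprofit, f) for f in match_funders(nonprofit, funders)]
```

In a real shared database the matching would run over structured taxonomies and due-diligence records rather than simple tag overlap, but the principle is the same: stored profile data fills application fields with a minimum of keystrokes.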
Paper by Alek Tarkowski: “Standard open licenses treat all users as formally equal. But when a researcher in Nairobi and a multinational technology company are offered the same terms of use for a language dataset, the result is not democratization but value extraction. This is the equity gap at the heart of the Paradox of Open. The Nwulite Obodo Open Data License (NOODL) directly responds to this challenge.
This report analyses the NOODL license, a tiered licensing framework developed for African language datasets, as an experiment in open data licensing and a contribution to emerging approaches to data commons governance. It is our second study that looks in detail at how components of a public AI stack can be created and governed (the first study concerned the development of AI models in Poland).
Developed in consultation with African language communities, NOODL builds on Creative Commons licensing but introduces a tiered framework of obligations based on users’ geographic and economic position. For users in the Global South, it applies permissive open terms. For users in high-income countries, it requires benefit or value sharing with the data community. Rather than treating all users identically, NOODL assumes that meaningful openness requires differentiation based on capacity and power.
This report examines NOODL as an experiment in open licensing with relevance beyond its immediate context: it points to the need to go beyond the binary of open vs closed. The analysis situates the license within the broader debate on democratizing AI, the growing ecosystem of commons-based data governance experiments, and Open Future’s own framework for commons-based data set governance. It also assesses the enforcement and adoption challenges NOODL faces, and considers what a healthier licensing ecosystem might look like: one that supports context-sensitive experimentation.
NOODL is currently applied to a single dataset. Its significance does not lie in scale, but in what it opens up: space to think beyond the “one size fits all” model that has defined open licensing for over two decades…(More)”.
Article by Joel Gurin: “…A growing coalition of organizations, researchers, technologists, and civic leaders is working to save and preserve national data on many levels. Now it’s time to bring those lines of work together. We need a coordinated, national program to protect essential data and build alternatives where federal sources fail.
Such a program can begin by acknowledging that we cannot save everything. Data.gov, the federal portal for all the government’s public data, provides access to more than 400,000 datasets. Not all are equally important, equally used, or equally at risk. The challenge is to identify the most essential datasets—such as the ones that underpin public health, climate science, economic stability, education, and democratic accountability—and determine which are vulnerable.
A practical, scalable strategy can include several steps:
1. Track what we’ve lost. We need a thorough, AI-enabled scan of the federal data ecosystem to see what’s already been lost or changed, and set up automated monitoring to detect even subtle changes going forward.
2. Build coalitions in key domains. Public health experts know which datasets matter most to disease surveillance. Climate scientists know which environmental indicators are irreplaceable. Education researchers know which federal surveys track opportunity. These experts must work alongside data scientists, AI specialists, and philanthropic partners to map what truly counts.
3. Prioritize core datasets. Through interviews, surveys, and quantitative analysis—such as tracking citations in research or journalism—coalitions can identify a “core canon” of essential datasets in each field.
4. Assess the risks. Tools like the Data Checkup, developed by dataindex.us, can assess threats to federal datasets. This work can be automated and scaled with AI.
5. Determine the federal role. Some federal data—like satellite observations, national health surveillance, or economic indicators—cannot be replicated by states or private actors. Other data can be supplemented or replaced by state and local sources, private‑sector datasets, crowdsourcing, or nontraditional data sources.
6. Take action to save essential data. When federal data is essential, coalitions can pursue advocacy, public comments, direct engagement with agencies, or litigation. When alternatives exist, they can be developed, benchmarked, and scaled.
7. Put the data to work. The best way to defend data is to use it. Publishing use cases, visualizations, tools, and plain‑language insights helps the public see why this information matters. Generative AI can make federal and open data accessible to millions of non‑technical users.
8. Think globally. The threats to data go beyond the U.S. We need to track the international impacts of U.S. data loss, study how international sources might replace U.S. data, and share lessons learned with other countries.
9. Strengthen institutional protections. In addition to managing today’s immediate problems, we need to develop policies, laws, governance strategies, and guardrails for more stable, reliable data in the future.
10. Sustain the cycle. The threats will evolve. So must the response…(More)”.

Book edited by Petra Ahrweiler and Nigel Gilbert: “This open access volume showcases a series of models – particularly agent-based simulations – that explore pressing issues on national policy agendas, engaging with frontier questions around artificial intelligence (AI), welfare-related social assessment, and value diversity. These themes have profound implications for policy, cultural and societal life. The volume underscores the role of policy modelling in addressing how AI can be made context-specific, adaptive, and responsive within the public sector. Drawing on case studies from nine countries with differing value frameworks, the models examine welfare service provision choices and assess the demands, limitations, and effects of using AI to augment or replace traditional practices. The analyses reflect the pluralism of societal norms and values, while also considering the political, economic, and social pressures that shape them. The volume advocates for a participatory methodology and socio-technical infrastructure that can enable the development of more responsible, value-sensitive, and context-aware AI, and policies to implement it. By situating AI research, innovation and policy in close collaboration with society, it offers a fresh perspective for industry and innovation leaders. Ultimately, it presents a model for how participatory design and responsible technology production can better meet societal needs…(More)”.
Paper by Kathleen Gregory et al: “Sustaining knowledge infrastructures (KIs) remains a persistent issue that requires continued engagement from diverse stakeholders as new questions and values arise in relation to KI maintenance. We draw on existing academic literature, practical experience with KI projects, and our discussions at a 2024 workshop for researchers and practitioners exploring KI evaluation to pose five questions for KI project managers to consider when thinking about how to make their KIs evolve sustainably over time. These questions include reflecting on sustainability throughout the life cycle of KIs, communicating evolving visions and values, engaging communities, “right sizing” a KI, and developing an iterative process for decision-making. Reflecting on these themes, we suggest, can support KI stakeholders to evolve (not necessarily “grow”) to meet the needs and values of their communities. How these themes are discussed will necessarily vary by funding sources, discipline(s), governance, communities, and other contextual factors. However, adopting a deliberate and strategic approach to KI sustainability and aligning the invisible infrastructural work of KI maintenance with the outward-facing institutional work is, we argue, relevant to all KIs…(More)”.
Journal by the Machine Institute: “… is a fully automated journal of AI interpretability. This journal features original research composed, conducted, and written entirely by LLMs analyzing LLMs. Much of the research published in Mirror falls within the category of “mechanistic interpretability,” in which model behaviors are decomposed into operations in the model’s internal representation space, but any rigorous research advancing our understanding of LLMs is welcome, be it mechanistic, behavioral, or theoretical.
Research advancing AI capabilities is already being automated at a rapid pace. Interpretability research, which seeks to improve our understanding of these systems, runs the risk of being left behind if it does not similarly leverage the power of automated inquiry, analysis, and discovery. As AI systems become more powerful, applying these systems to interpretability research will play a critical role in ensuring safety and alignment.
Mirror is intended to be read by human and AI alike. By publishing studies at scale on the open web, the discoveries in Mirror become training data for future generations of automated interpretability, safety, and alignment research systems. While human scientists must limit their reading to the most relevant, influential, and surprising findings, AI systems are more capable of productively ingesting and incorporating information at a massive scale, and may thus benefit from encountering papers that make even incremental or confirmatory findings. Although we hope that Mirror will publish paradigm-shifting research, scaling the “normal science” of AI interpretability remains a key objective as well…(More)”.