Paper by Uri Y. Hacohen: “Data is often heralded as “the world’s most valuable resource,” yet its potential to benefit society remains unrealized due to systemic barriers in both public and private sectors. While open data (defined as data that is available, accessible, and usable) holds immense promise to advance open science, innovation, economic growth, and democratic values, its utilization is hindered by legal, technical, and organizational challenges. Public sector initiatives, such as U.S. and European Union open data regulations, face uneven enforcement and regulatory complexity, disproportionately affecting under-resourced stakeholders such as researchers. In the private sector, companies prioritize commercial interests and user privacy, often obstructing data openness through restrictive policies and technological barriers. This article proposes an innovative, four-layered policy framework to overcome these obstacles and foster data openness. The framework includes (1) improving open data infrastructures, (2) ensuring legal frameworks for open data, (3) incentivizing voluntary data sharing, and (4) imposing mandatory data sharing obligations. Each policy cluster is tailored to address sector-specific challenges and balance competing values such as privacy, property, and national security. Drawing from academic research and international case studies, the framework provides actionable solutions to transition from a siloed, proprietary data ecosystem to one that maximizes societal value. This comprehensive approach aims to reimagine data governance and unlock the transformative potential of open data…(More)”.
Trump Wants to Merge Government Data. Here Are 314 Things It Might Know About You.
Article by Emily Badger and Sheera Frenkel: “The federal government knows your mother’s maiden name and your bank account number. The student debt you hold. Your disability status. The company that employs you and the wages you earn there. And that’s just a start. It may also know your …and at least 263 more categories of data. These intimate details about the personal lives of people who live in the United States are held in disconnected data systems across the federal government — some at the Treasury, some at the Social Security Administration and some at the Department of Education, among other agencies.
The Trump administration is now trying to connect the dots of that disparate information. Last month, President Trump signed an executive order calling for the “consolidation” of these segregated records, raising the prospect of creating a kind of data trove about Americans that the government has never had before, and that members of the president’s own party have historically opposed.
The effort is being driven by Elon Musk, the world’s richest man, and his lieutenants with the Department of Government Efficiency, who have sought access to dozens of databases as they have swept through agencies across the federal government. Along the way, they have elbowed past the objections of career staff, data security protocols, national security experts and legal privacy protections…(More)”.

We Must Steward, Not Subjugate Nor Worship AI
Essay by Brian J. A. Boyd: “…How could stewardship of artificially living AI be pursued on a broader, even global, level? Here, the concept of “integral ecology” is helpful. Pope Francis uses the phrase to highlight the ways in which everything is connected, both through the web of life and in the sense that social, political, and environmental challenges cannot be solved in isolation. The immediate need for stewardship over AI is to ensure that its demands for power and industrial production are addressed in a way that benefits those most in need, rather than de-prioritizing them further. For example, the energy requirements to develop tomorrow’s AI should spur research into small modular nuclear reactors and updated distribution systems, making energy abundant rather than causing regressive harms by driving up prices on an already overtaxed grid. More broadly, we will need to find the right institutional arrangements and incentive structures to make AI Amistics possible.
We are having a painfully overdue conversation about the nature and purpose of social media, and tech whistleblowers like Tristan Harris have offered grave warnings about how the “race to the bottom of the brain stem” is underway in AI as well. The AI equivalent of the addictive “infinite scroll” design feature of social media will likely be engagement with simulated friends — but we need not resign ourselves to it becoming part of our lives as did social media. And as there are proposals to switch from privately held Big Data to a public Data Commons, so perhaps could there be space for AI that is governed not for maximizing profit but for being sustainable as a common-pool resource, with applications and protocols ordered toward long-run benefit as defined by local communities…(More)”.
Data Localization: A Global Threat to Human Rights Online
Article by Freedom House: “From Pakistan to Zambia, governments around the world are increasingly proposing and passing data localization legislation. These laws, which set rules governing the storage and transfer of electronic data across jurisdictions, are often justified as addressing concerns such as user privacy, cybersecurity, national security, and monopolistic market practices. Notwithstanding these laudable goals, data localization initiatives cause more harm than good, especially in legal environments with poor rule of law.
Data localization requirements can take many different forms. A government may require all companies collecting and processing certain types of data about local users to store the data on servers located in the country. Authorities may also restrict the foreign transfer of certain types of data or allow it only under narrow circumstances, such as after obtaining the explicit consent of users, receiving a license or permit from a public authority, or conducting a privacy assessment of the country to which the data will be transferred.
While data localization can have significant economic and security implications, the focus of this piece—in line with that of the Global Network Initiative and Freedom House—is on its potential human rights impacts, which are varied. Freedom House’s research shows that the rise in data localization policies worldwide is contributing to the global decline of internet freedom. Without robust transparency and accountability frameworks embedded into these provisions, digital rights are often put on the line. As these types of legislation continue to pop up globally, the need for rights-respecting solutions and norms for cross-border data flows is greater than ever…(More)”.
Global data-driven prediction of fire activity
Paper by Francesca Di Giuseppe, Joe McNorton, Anna Lombardi & Fredrik Wetterhall: “Recent advancements in machine learning (ML) have expanded its potential use across scientific applications, including weather and hazard forecasting. The ability of these methods to extract information from diverse and novel data types enables the transition from forecasting fire weather to predicting actual fire activity. In this study we demonstrate that this shift is also feasible within an operational context. Traditional fire forecasts tend to over-predict high fire danger, particularly in fuel-limited biomes, often resulting in false alarms. By using data on fuel characteristics, ignitions and observed fire activity, data-driven predictions reduce the false-alarm rate of high-danger forecasts, enhancing their accuracy. This is made possible by high-quality global datasets of fuel evolution and fire detection. We find that the quality of the input data is more important for improving forecasts than the complexity of the ML architecture. While the focus on ML advancements is often justified, our findings highlight the importance of investing in high-quality data and, where necessary, creating it through physical models. Neglecting this aspect would undermine the potential gains from ML-based approaches, emphasizing that data quality is essential to achieve meaningful progress in fire activity forecasting…(More)”.
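The core idea the abstract describes is to move from forecasting fire weather to predicting fire activity by adding fuel and ignition observations as model inputs. A minimal sketch of that setup, assuming synthetic data and invented feature names (the paper’s operational datasets and model are not reproduced here):

```python
# Illustrative sketch only: a data-driven fire-activity classifier that adds
# fuel and ignition features on top of a fire-weather index. Features and data
# are synthetic stand-ins, not the paper's operational system.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5000

# Synthetic grid cells: fire weather alone over-predicts in fuel-limited cells.
fire_weather_index = rng.uniform(0, 100, n)   # weather-based danger rating
fuel_load = rng.uniform(0, 10, n)             # available fuel (t/ha)
fuel_moisture = rng.uniform(5, 40, n)         # percent
ignition_density = rng.poisson(2, n)          # detected ignition sources

# "Observed" fire activity: requires dangerous weather AND fuel AND an ignition.
fire_occurred = (
    (fire_weather_index > 60) & (fuel_load > 3)
    & (fuel_moisture < 20) & (ignition_density > 0)
).astype(int)

X = np.column_stack([fire_weather_index, fuel_load, fuel_moisture, ignition_density])
X_tr, X_te, y_tr, y_te = train_test_split(X, fire_occurred, random_state=0)

model = GradientBoostingClassifier().fit(X_tr, y_tr)

# Baseline: flag every cell with high fire-weather danger as "high risk".
baseline_alarm = X_te[:, 0] > 60
ml_alarm = model.predict(X_te).astype(bool)

def false_alarm_rate(alarm, truth):
    """Share of raised alarms where no fire was actually observed."""
    return (alarm & (truth == 0)).sum() / max(alarm.sum(), 1)

print("weather-only false-alarm rate:", false_alarm_rate(baseline_alarm, y_te))
print("data-driven false-alarm rate: ", false_alarm_rate(ml_alarm, y_te))
```

In this toy setting, the fuel-aware model suppresses alarms in cells where weather is dangerous but fuel is absent, which is the mechanism the authors credit for the reduced false-alarm rate.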
Privacy-Enhancing and Privacy-Preserving Technologies in AI: Enabling Data Use and Operationalizing Privacy by Design and Default
Paper by the Centre for Information Policy Leadership at Hunton (“CIPL”) that “provides an in-depth exploration of how privacy-enhancing technologies (“PETs”) are being deployed to address privacy within artificial intelligence (“AI”) systems. It aims to describe how these technologies can help operationalize privacy by design and default and serve as key business enablers, allowing companies and public sector organizations to access, share and use data that would otherwise be unavailable. It also seeks to demonstrate how PETs can address challenges and provide new opportunities across the AI life cycle, from data sourcing to model deployment, and includes real-world case studies…
As further detailed in the Paper, CIPL’s recommendations for boosting the adoption of PETs for AI are as follows:
Stakeholders should adopt a holistic view of the benefits of PETs in AI. PETs deliver value beyond addressing privacy and security concerns, such as fostering trust and enabling data sharing. It is crucial that stakeholders consider all these advantages when making decisions about their use.
Regulators should issue clearer and more practical guidance to reduce regulatory uncertainty in the use of PETs in AI. While regulators increasingly recognize the value of PETs, clearer and more practical guidance is needed to help organizations implement these technologies effectively.
Regulators should adopt a risk-based approach to assess how PETs can meet standards for data anonymization, providing clear guidance to eliminate uncertainty. There is uncertainty around whether various PETs meet legal standards for data anonymization. A risk-based approach to defining anonymization standards could encourage wider adoption of PETs.
Deployers should take steps to provide contextually appropriate transparency to customers and data subjects. Given the complexity of PETs, deployers should ensure customers and data subjects understand how PETs function within AI models…(More)”.
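As one concrete illustration of the kind of PET the paper surveys, the sketch below adds calibrated Laplace noise to an aggregate statistic before it leaves the data holder, a basic form of differential privacy. This is a generic textbook example, not drawn from the CIPL paper or any of its case studies:

```python
# Minimal, generic illustration of one PET: differential privacy via the
# Laplace mechanism applied to an aggregate query. Not from the CIPL paper.
import numpy as np

def dp_mean(values, lower, upper, epsilon, rng=None):
    """Release a differentially private mean of `values` clipped to [lower, upper].

    For a bounded mean over n records, the sensitivity is (upper - lower) / n,
    so Laplace noise with scale sensitivity / epsilon gives epsilon-DP for this query.
    """
    rng = rng or np.random.default_rng()
    clipped = np.clip(values, lower, upper)
    sensitivity = (upper - lower) / len(clipped)
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon)
    return clipped.mean() + noise

# Example: share an average salary for model training without exposing records.
salaries = np.array([42_000, 51_500, 38_900, 60_250, 47_100])
print(dp_mean(salaries, lower=20_000, upper=100_000, epsilon=1.0))
```

Smaller epsilon values add more noise and give stronger privacy at the cost of utility, which is exactly the kind of trade-off the paper argues deployers should surface transparently to customers and data subjects.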
Exploring Human Mobility in Urban Nightlife: Insights from Foursquare Data
Article by Ehsan Dorostkar: “In today’s digital age, social media platforms like Foursquare provide a wealth of data that can reveal fascinating insights into human behavior, especially in urban environments. Our recent study, published in Cities, delves into how virtual mobility on Foursquare translates into actual human mobility in Tehran’s nightlife scenes. By analyzing user-generated data, we uncovered patterns that can help urban planners create more vibrant and functional nightlife spaces…
Our study aimed to answer two key questions:
- How does virtual mobility on Foursquare influence real-world human mobility in urban nightlife?
- What spatial patterns emerge from these movements, and how can they inform urban planning?
To explore these questions, we focused on two bustling nightlife spots in Tehran—Region 1 (Darband Square) and Region 6 (Valiasr crossroads)—where Foursquare data indicated high user activity.
Methodology
We combined data from two sources:
- Foursquare API: To track user check-ins and identify popular nightlife venues.
- Tehran Municipality API: To contextualize the data within the city’s urban framework.
Using triangulation and interpolation techniques, we mapped the “human mobility triangles” in these areas, calculating the density and spread of user activity…(More)”.
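The article does not publish its code, but the “human mobility triangle” idea can be sketched roughly: take the coordinates of three high-activity venues, form the triangle they span, and divide the check-ins observed inside it by its area to get an activity density. The coordinates, check-in count, and helper function below are hypothetical, not the study’s data or method:

```python
# Rough, hypothetical sketch of a "human mobility triangle": three venue
# locations define a triangle, and check-ins inside it give an activity density.
import math

def triangle_area_km2(p1, p2, p3):
    """Approximate area of a small lat/lon triangle via an equirectangular projection."""
    lat0 = math.radians((p1[0] + p2[0] + p3[0]) / 3)
    km_per_deg_lat = 110.574
    km_per_deg_lon = 111.320 * math.cos(lat0)

    def to_km(p):
        return (p[1] * km_per_deg_lon, p[0] * km_per_deg_lat)

    (x1, y1), (x2, y2), (x3, y3) = map(to_km, (p1, p2, p3))
    # Shoelace formula for the projected triangle.
    return abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)) / 2

# Hypothetical nightlife venues (lat, lon) and check-ins observed between them.
venues = [(35.820, 51.425), (35.823, 51.431), (35.818, 51.433)]
checkins_inside = 312

area = triangle_area_km2(*venues)
print(f"area: {area:.3f} km^2, density: {checkins_inside / area:.0f} check-ins/km^2")
```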
LLM Social Simulations Are a Promising Research Method
Paper by Jacy Reese Anthis et al: “Accurate and verifiable large language model (LLM) simulations of human research subjects promise an accessible data source for understanding human behavior and training new AI systems. However, results to date have been limited, and few social scientists have adopted these methods. In this position paper, we argue that the promise of LLM social simulations can be achieved by addressing five tractable challenges. We ground our argument in a literature survey of empirical comparisons between LLMs and human research subjects, commentaries on the topic, and related work. We identify promising directions with prompting, fine-tuning, and complementary methods. We believe that LLM social simulations can already be used for exploratory research, such as pilot experiments for psychology, economics, sociology, and marketing. More widespread use may soon be possible with rapidly advancing LLM capabilities, and researchers should prioritize developing conceptual models and evaluations that can be iteratively deployed and refined at pace with ongoing AI advances…(More)”.
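As a concrete example of the “pilot experiment” use the authors describe, the sketch below prompts an LLM to answer a survey item as a simulated respondent with a given profile. It uses the OpenAI Python client for illustration; the model name, persona fields, and question are placeholders, and nothing here reproduces the paper’s evaluation setup:

```python
# Hedged illustration of an LLM social simulation for a pilot survey question.
# Persona, question, and model name are placeholders; not the paper's method.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def simulate_respondent(persona: dict, question: str, model: str = "gpt-4o-mini") -> str:
    """Ask the model to answer `question` in character as `persona`."""
    profile = ", ".join(f"{k}: {v}" for k, v in persona.items())
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system",
             "content": f"You are a survey respondent with this profile: {profile}. "
                        "Answer briefly and in character, as this person plausibly would."},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

answer = simulate_respondent(
    persona={"age": 34, "occupation": "teacher", "region": "midwestern US"},
    question="On a 1-5 scale, how much do you trust self-checkout kiosks, and why?",
)
print(answer)
```

Runs like this would still need the empirical comparisons against human subjects that the paper calls for before the simulated answers are treated as more than exploratory signal.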
Enabling an Open-Source AI Ecosystem as a Building Block for Public AI
Policy brief by Katarzyna Odrozek, Vidisha Mishra, Anshul Pachouri, Arnav Nigam: “…informed by insights from 30 open dataset builders convened by Mozilla and EleutherAI and a policy analysis on open-source artificial intelligence (AI) development, outlines four key areas for G7 action: expanding access to open data, supporting sustainable governance, encouraging policy alignment in open-source AI, and building local capacity and identifying use cases. These steps will enhance AI competitiveness, accountability, and innovation, positioning the G7 as a leader in Responsible AI development…(More)”.
Massive, Unarchivable Datasets of Cancer, Covid, and Alzheimer’s Research Could Be Lost Forever
Article by Sam Cole: “Almost two dozen repositories of research and public health data supported by the National Institutes of Health are marked for “review” under the Trump administration’s direction, and researchers and archivists say the data is at risk of being lost forever if the repositories go down.
“The problem with archiving this data is that we can’t,” Lisa Chinn, Head of Research Data Services at the University of Chicago, told 404 Media. Unlike other government datasets or web pages, downloading or otherwise archiving NIH data often requires a Data Use Agreement between a research institution and the agency, and those agreements are carefully administered through a disclosure risk review process.
A message appeared at the top of multiple NIH websites last week that says: “This repository is under review for potential modification in compliance with Administration directives.”
Repositories with the message include archives of cancer imagery, Alzheimer’s disease research, sleep studies, HIV databases, and COVID-19 vaccination and mortality data…
“So far, it seems like what is happening is less that these data sets are actively being deleted or clawed back and more that they are laying off the workers whose job is to maintain them, update them and maintain the infrastructure that supports them,” a librarian affiliated with the Data Rescue Project told 404 Media. “In time, this will have the same effect, but it’s really hard to predict. People don’t usually appreciate, much less our current administration, how much labor goes into maintaining a large research dataset.”…(More)”.