A standardised differential privacy framework for epidemiological modeling with mobile phone data


Paper by Merveille Koissi Savi et al: “During the COVID-19 pandemic, the use of mobile phone data for monitoring human mobility patterns has become increasingly common, both to study the impact of travel restrictions on population movement and to support epidemiological modeling. Despite the importance of these data, the use of location information to guide public policy can raise issues of privacy and ethical use. Studies have shown that simple aggregation does not protect the privacy of an individual, and there are no universal standards for aggregation that guarantee anonymity. Newer methods, such as differential privacy, can provide statistically verifiable protection against identifiability but have been largely untested as inputs for compartment models used in infectious disease epidemiology. Our study examines the application of differential privacy as an anonymisation tool in epidemiological models, studying the impact of adding quantifiable statistical noise to mobile phone-based location data on the bias of ten common epidemiological metrics. We find that many epidemiological metrics are preserved and remain close to their non-private values when the true noise state is less than 20 in a count transition matrix, which corresponds to a privacy loss parameter ϵ = 0.05 per release. We show that differential privacy offers a robust approach to preserving individual privacy in mobility data while providing useful population-level insights for public health. Importantly, we have built a modular software pipeline to facilitate the replication and expansion of our framework…(More)”.
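
To make the privacy mechanism concrete, the sketch below applies Laplace noise to an origin–destination count transition matrix at a privacy loss parameter ϵ = 0.05 per release. This is a minimal illustration, not the authors’ pipeline: the function name, the unit sensitivity, and the clipping step are assumptions made for the example.

```python
import numpy as np

def privatize_transition_counts(counts, epsilon=0.05, sensitivity=1.0, rng=None):
    """Release a differentially private origin-destination count matrix.

    counts      : 2-D array of trip counts between regions (origin x destination)
    epsilon     : privacy loss parameter spent on this single release
    sensitivity : maximum change in any cell from adding or removing one person
                  (assumed to be 1 here; the real value depends on how trips are counted)
    """
    rng = np.random.default_rng() if rng is None else rng
    scale = sensitivity / epsilon          # Laplace scale grows as epsilon shrinks
    noisy = counts + rng.laplace(loc=0.0, scale=scale, size=counts.shape)
    # Clip negative cells so a downstream compartmental model receives valid counts;
    # post-processing like this does not weaken the differential privacy guarantee.
    return np.clip(noisy, 0, None)

# Example: a hypothetical 3-region transition matrix released at epsilon = 0.05
od_counts = np.array([[120., 30., 5.],
                      [25., 200., 10.],
                      [8., 15., 90.]])
private_counts = privatize_transition_counts(od_counts, epsilon=0.05)
```

With unit sensitivity, ϵ = 0.05 gives a Laplace scale of 1/0.05 = 20, consistent with the noise magnitude discussed in the abstract; the smaller the budget, the larger the noise added to each cell and the larger the potential bias in downstream epidemiological metrics.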

Data Equity: Foundational Concepts for Generative AI


WEF Report: “This briefing paper focuses on data equity within foundation models, both in terms of the impact of Generative AI (genAI) on society and on the further development of genAI tools.

GenAI holds immense potential to drive digital and social innovation, such as improving efficiency, enhancing creativity and augmenting existing data. GenAI could also democratize access to and use of technologies. However, left unchecked, it could deepen inequities. With the advent of genAI significantly increasing the rate at which AI is developed and deployed, exploring frameworks for data equity is more urgent than ever.

The goals of the briefing paper are threefold: to establish a shared vocabulary to facilitate collaboration and dialogue; to scope initial concerns to establish a framework for inquiry on which stakeholders can focus; and to shape future development of promising technologies.

The paper represents a first step in exploring and promoting data equity in the context of genAI. The proposed definitions, framework and recommendations are intended to proactively shape the development of promising genAI technologies…(More)”.

Artificial intelligence in government: Concepts, standards, and a unified framework


Paper by Vincent J. Straub, Deborah Morgan, Jonathan Bright, Helen Margetts: “Recent advances in artificial intelligence (AI), especially in generative language modelling, hold the promise of transforming government. Given the advanced capabilities of new AI systems, it is critical that these are embedded using standard operational procedures and clear epistemic criteria, and that they behave in alignment with the normative expectations of society. Scholars in multiple domains have subsequently begun to conceptualize the different forms that AI applications may take, highlighting both their potential benefits and pitfalls. However, the literature remains fragmented, with researchers in social science disciplines like public administration and political science, and the fast-moving fields of AI, machine learning (ML), and robotics, all developing concepts in relative isolation. Although there are calls to formalize the emerging study of AI in government, a balanced account that captures the full depth of theoretical perspectives needed to understand the consequences of embedding AI into a public sector context is lacking. Here, we unify efforts across social and technical disciplines by first conducting an integrative literature review to identify and cluster 69 key terms that frequently co-occur in the multidisciplinary study of AI. We then build on the results of this bibliometric analysis to propose three new multifaceted concepts for understanding and analysing AI-based systems for government (AI-GOV) in a more unified way: (1) operational fitness, (2) epistemic alignment, and (3) normative divergence. Finally, we put these concepts to work by using them as dimensions in a conceptual typology of AI-GOV and connecting each with emerging AI technical measurement standards to encourage operationalization, foster cross-disciplinary dialogue, and stimulate debate among those aiming to rethink government with AI…(More)”.

A Feasibility Study of Differentially Private Summary Statistics and Regression Analyses with Evaluations on Administrative and Survey Data


Report by Andrés F. Barrientos, Aaron R. Williams, Joshua Snoke, Claire McKay Bowen: “Federal administrative data, such as tax data, are invaluable for research, but because of privacy concerns, access to these data is typically limited to select agencies and a few individuals. An alternative to sharing microlevel data is to allow individuals to query statistics without directly accessing the confidential data. This paper studies the feasibility of using differentially private (DP) methods to make certain queries while preserving privacy. We also include new methodological adaptations to existing DP regression methods for using new data types and returning standard error estimates. We define feasibility as the impact of DP methods on analyses for making public policy decisions and the accuracy of the queries according to several utility metrics. We evaluate the methods using Internal Revenue Service data and public-use Current Population Survey data and identify how specific data features might challenge some of these methods. Our findings show that DP methods are feasible for simple, univariate statistics but struggle to produce accurate regression estimates and confidence intervals. To the best of our knowledge, this is the first comprehensive statistical study of DP regression methodology on real, complex datasets, and the findings have significant implications for the direction of a growing research field and public policy…(More)”.
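
As a toy illustration of the univariate queries the study finds feasible, the sketch below releases a differentially private mean via the Laplace mechanism with publicly chosen clipping bounds. It is not the authors’ method (which also covers DP regression with standard error estimates); the bounds, the ϵ value, and the function name are assumptions for the example.

```python
import numpy as np

def dp_mean(values, lower, upper, epsilon, rng=None):
    """Differentially private mean of a bounded numeric column (Laplace mechanism).

    lower, upper : publicly known clipping bounds; clipping caps the sensitivity
                   of the mean at (upper - lower) / n for replace-one neighbours
    epsilon      : privacy loss budget spent on this single query
    """
    rng = np.random.default_rng() if rng is None else rng
    clipped = np.clip(np.asarray(values, dtype=float), lower, upper)
    n = clipped.size
    sensitivity = (upper - lower) / n
    return clipped.mean() + rng.laplace(0.0, sensitivity / epsilon)

# Example: a private mean income estimate from a small hypothetical sample
incomes = [41_000, 58_500, 73_200, 39_900, 120_000]
print(dp_mean(incomes, lower=0, upper=200_000, epsilon=1.0))
```

The accuracy of such a release hinges on the clipping bounds and the sample size, which is one reason simple summary statistics tend to fare better than regression coefficients and confidence intervals under a comparable privacy budget.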

Executive Order on Safe, Secure, and Trustworthy Artificial Intelligence


The White House: “Today, President Biden is issuing a landmark Executive Order to ensure that America leads the way in seizing the promise and managing the risks of artificial intelligence (AI). The Executive Order establishes new standards for AI safety and security, protects Americans’ privacy, advances equity and civil rights, stands up for consumers and workers, promotes innovation and competition, advances American leadership around the world, and more.

As part of the Biden-Harris Administration’s comprehensive strategy for responsible innovation, the Executive Order builds on previous actions the President has taken, including work that led to voluntary commitments from 15 leading companies to drive safe, secure, and trustworthy development of AI…(More)”.

AI is Like… A Literature Review of AI Metaphors and Why They Matter for Policy


Paper by Matthijs M. Maas: “As AI systems have become increasingly capable and impactful, there has been significant public and policymaker debate over this technology’s impacts—and the appropriate legal or regulatory responses. Within these debates many have deployed—and contested—a dazzling range of analogies, metaphors, and comparisons for AI systems, their impact, or their regulation.

This report reviews why and how metaphors matter to both the study and practice of AI governance, in order to contribute to more productive dialogue and more reflective policymaking. It first reviews five stages at which different foundational metaphors play a role in shaping the processes of technological innovation, the academic study of their impacts, the regulatory agenda, the terms of the policymaking process, and legislative and judicial responses to new technology. It then surveys a series of cases where the choice of analogy materially influenced the regulation of internet issues, as well as (recent) AI law issues. The report then provides a non-exhaustive survey of 55 analogies that have been given for AI technology, and some of their policy implications. Finally, it discusses the risks of utilizing unreflexive analogies in AI law and regulation.

By disentangling the role of metaphors and frames in these debates, and the space of analogies for AI, this survey does not aim to argue against the use or role of analogies in AI regulation—but rather to facilitate more reflective and productive conversations on these timely challenges…(More)”.

Urban Development and the State of Open Data


Chapter by Stefaan G. Verhulst and Sampriti Saxena: “Nearly 4.4 billion people, or about 55% of the world’s population, lived in cities in 2018. By 2045, this number is anticipated to grow to 6 billion. Such a level of growth requires innovative and targeted urban solutions. By leveraging open data more effectively, cities can meet the needs of an ever-growing population in a sustainable manner. This paper updates the previous contribution by Jean-Noé Landry, titled “Open Data and Urban Development” in the 2019 edition of The State of Open Data. It also aims to contribute to a further deepening of the Third Wave of Open Data, which highlights the significance of open data at the subnational level as a more direct and immediate response to the on-the-ground needs of citizens. It considers recent developments in how the use of, and approach to, open data has evolved within an urban development context. It seeks to discuss emerging applications of open data in cities, recent developments in open data infrastructure, governance and policies related to open data, and the future outlook of the role of open data in urbanization…(More)”.

Governing Urban Data for the Public Interest


Report by The New Hanse: “…This report represents the culmination of our efforts and offers actionable guidelines for European cities seeking to harness the power of data for the public good.

The key recommendations outlined in the report are:

1. Shift the Paradigm towards Democratic Control of Data: Advocate for a policy that defaults to making urban data accessible, requiring private data holders to share in the public interest.

2. Provide Legal Clarity in a Dynamic Environment: Address legal uncertainties by balancing privacy and confidentiality needs with the public interest in data accessibility, working collaboratively with relevant authorities at national and EU level.

3. Build a Data Commons Repository of Use Cases: Streamline data sharing efforts by establishing a standardised use case repository with common technical frameworks, procedures, and contracts.

4. Set up an Urban Data Intermediary for the Public Interest: Institutionalise data sharing by building urban data intermediaries to address complexities, following principles of public purpose, transparency, and accountability.

5. Learn from the Hamburg Experiment and Scale It across Europe: Embrace experimentation as a vital step, even if outcomes are uncertain, to adapt processes for future innovations. Experiments at the local level can inform policy and scale nationally and across Europe…(More)”.

AI Adoption in America: Who, What, and Where


Paper by Kristina McElheran: “…We study the early adoption and diffusion of five AI-related technologies (automated-guided vehicles, machine learning, machine vision, natural language processing, and voice recognition) as documented in the 2018 Annual Business Survey of 850,000 firms across the United States. We find that fewer than 6% of firms used any of the AI-related technologies we measure, though most very large firms reported at least some AI use. Weighted by employment, average adoption was just over 18%. AI use in production, while varying considerably by industry, nevertheless was found in every sector of the economy and clustered with emerging technologies such as cloud computing and robotics. Among dynamic young firms, AI use was highest alongside more-educated, more-experienced, and younger owners, including owners motivated by bringing new ideas to market or helping the community. AI adoption was also more common alongside indicators of high-growth entrepreneurship, including venture capital funding, recent product and process innovation, and growth-oriented business strategies. Early adoption was far from evenly distributed: a handful of “superstar” cities and emerging hubs led startups’ adoption of AI. These patterns of early AI use foreshadow economic and social impacts far beyond this limited initial diffusion, with the possibility of a growing “AI divide” if early patterns persist…(More)”.

How language gaps constrain generative AI development


Article by Regina Ta and Nicol Turner Lee: “Prompt-based generative artificial intelligence (AI) tools are quickly being deployed for a range of use cases, from writing emails and compiling legal cases to personalizing research essays in a wide range of educational, professional, and vocational disciplines. But language is not monolithic, and opportunities may be missed in developing generative AI tools for non-standard languages and dialects. Current applications often are not optimized for certain populations or communities and, in some instances, may exacerbate social and economic divisions. As noted by the Austrian linguist and philosopher Ludwig Wittgenstein, “The limits of my language mean the limits of my world.” This is especially true today, when the language we speak can change how we engage with technology, and the limits of our online vernacular can constrain the full and fair use of existing and emerging technologies.

As it stands now, the majority of the world’s speakers are being left behind if they are not part of one of the world’s dominant languages, such as English, French, German, Spanish, Chinese, or Russian. There are over 7,000 languages spoken worldwide, yet a plurality of content on the internet is written in English, with the largest remaining online shares claimed by Asian and European languages like Mandarin or Spanish. Moreover, in the English language alone, there are over 150 dialects beyond “standard” U.S. English. Consequently, large language models (LLMs) that train AI tools, like generative AI, rely on binary internet data that serve to increase the gap between standard and non-standard speakers, widening the digital language divide.

Among sociologists, anthropologists, and linguists, language is a source of power and one that significantly influences the development and dissemination of new tools that are dependent upon learned, linguistic capabilities. Depending on where one sits within socio-ethnic contexts, native language can internally strengthen communities while also amplifying and replicating inequalities when co-opted by incumbent power structures to restrict immigrant and historically marginalized communities. For example, during the transatlantic slave trade, literacy was a weapon used by white supremacists to reinforce the dependence of Blacks on slave masters, which resulted in many anti-literacy laws being passed in the 1800s in most Confederate states…(More)”.