DC launched an AI tool for navigating the city’s open data


Article by Kaela Roeder: “In a move echoing local governments’ increasing attention toward generative artificial intelligence across the country, the nation’s capital now aims to make navigating its open data easier through a new public beta pilot.

DC Compass, launched in March, uses generative AI to answer user questions and create maps from open data sets, ranging from the district’s population to what different trees are planted in the city. The Office of the Chief Technology Officer (OCTO) partnered with the geographic information system (GIS) technology company Esri, which has an office in Vienna, Virginia, to create the new tool.

This debut follows Mayor Muriel Bowser’s signing of DC’s AI Values and Strategic Plan in February. The order requires agencies to assess if using AI is in alignment with the values it sets forth, including that there’s a clear benefit to people; a plan for “meaningful accountability” for the tool; and transparency, sustainability, privacy and equity at the forefront of deployment.

These values are key when launching something like DC Compass, said Michael Rupert, the interim chief technology officer for digital services at the Office of the Chief Technology Officer.

“The way Mayor Bowser rolled out the mayor’s order and this value statement, I think gives residents and businesses a little more comfort that we aren’t just writing a check and seeing what happens,” Rupert said. “That we’re actually methodically going about it in a responsible way, both morally and fiscally.”…(More)”.


DC COMPASS IN ACTION. (SCREENSHOT/COURTESY OCTO)

The Potential of Artificial Intelligence for the SDGs and Official Statistics


Report by PARIS21: “Artificial Intelligence (AI) and its impact on people’s lives is growing rapidly. AI is already leading to significant developments from healthcare to education, which can contribute to the efficient monitoring and achievement of the Sustainable Development Goals (SDGs), a call to action to address the world’s greatest challenges. AI is also raising concerns because, if not addressed carefully, its risks may outweigh its benefits. As a result, AI is garnering increasing attention from National Statistical Offices (NSOs) and the official statistics community as they are challenged to produce more comprehensive, timely, and high-quality data for decision-making with limited resources in a rapidly changing world of data and technologies and in light of complex and converging global issues from pandemics to climate change. This paper has been prepared as an input to the “Data and AI for Sustainable Development: Building a Smarter Future” Conference, organized in partnership with The Partnership in Statistics for Development in the 21st Century (PARIS21), the World Bank and the International Monetary Fund (IMF). Building on case studies that examine the use of AI by NSOs, the paper presents the benefits and risks of AI with a focus on NSO operations related to sustainable development. The objective is to spark discussions and to initiate a dialogue around how AI can be leveraged to inform decisions and take action to better monitor and achieve sustainable development, while mitigating its risks…(More)”.

Counting Feminicide: Data Feminism in Action


Book by Catherine D’Ignazio: “What isn’t counted doesn’t count. And mainstream institutions systematically fail to account for feminicide, the gender-related killing of women and girls, including cisgender and transgender women. Against this failure, Counting Feminicide brings to the fore the work of data activists across the Americas who are documenting such murders—and challenging the reigning logic of data science by centering care, memory, and justice in their work. Drawing on Data Against Feminicide, a large-scale collaborative research project, Catherine D’Ignazio describes the creative, intellectual, and emotional labor of feminicide data activists who are at the forefront of a data ethics that rigorously and consistently takes power and people into account.

Individuals, researchers, and journalists—these data activists scour news sources to assemble spreadsheets and databases of women killed by gender-related violence, then circulate those data in a variety of creative and political forms. Their work reveals the potential of restorative/transformative data science—the use of systematic information to, first, heal communities from the violence and trauma produced by structural inequality and, second, envision and work toward the world in which such violence has been eliminated. Specifically, D’Ignazio explores the possibilities and limitations of counting and quantification—reducing complex social phenomena to convenient, sortable, aggregable forms—when the goal is nothing short of the elimination of gender-related violence.

Counting Feminicide showcases the incredible power of data feminism in practice, in which each murdered woman or girl counts, and, in being counted, joins a collective demand for the restoration of rights and a transformation of the gendered order of the world…(More)”.

Generative AI in Journalism


Report by Nicholas Diakopoulos et al: “The introduction of ChatGPT by OpenAI in late 2022 captured the imagination of the public—and the news industry—with the potential of generative AI to upend how people create and consume media. Generative AI is a type of artificial intelligence technology that can create new content, such as text, images, audio, video, or other media, based on the data it has been trained on and according to written prompts provided by users. ChatGPT is the chat-based user interface that made the power and potential of generative AI salient to a wide audience, reaching 100 million users within two months of its launch.

Although similar technology had been around, by late 2022 it was suddenly working, spurring its integration into various products and presenting not only a host of opportunities for productivity and new experiences but also some serious concerns about accuracy, provenance and attribution of source information, and the increased potential for creating misinformation.

This report serves as a snapshot of how the news industry has grappled with the initial promises and challenges of generative AI towards the end of 2023. The sample of participants reflects how some of the more savvy and experienced members of the profession are reacting to the technology.

Based on participants’ responses, the researchers found that generative AI is already changing work structure and organization, even as it triggers ethical concerns around its use. Here are some key takeaways:

  • Applications in News Production. The most predominant current use cases for generative AI include various forms of textual content production, information gathering and sensemaking, multimedia content production, and business uses.
  • Changing Work Structure and Organization. There are a host of new roles emerging to grapple with the changes introduced by generative AI including for leadership, editorial, product, legal, and engineering positions.
  • Work Redesign. There is an unmet opportunity to design new interfaces to support journalistic work with generative AI, in particular to enable the human oversight needed for the efficient and confident checking and verification of outputs…(More)”.

How Copyright May Destroy Our Access To The World’s Academic Knowledge


Article by Glyn Moody: “The shift from analogue to digital has had a massive impact on most aspects of life. One area where that shift has the potential for huge benefits is in the world of academic publishing. Academic papers are costly to publish and distribute on paper, but in a digital format they can be shared globally for almost no cost. That’s one of the driving forces behind the open access movement. But as Walled Culture has reported, resistance from the traditional publishing world has slowed the shift to open access, and undercut the benefits that could flow from it.

That in itself is bad news, but new research from Martin Paul Eve (available as open access) shows that the way the shift to digital has been managed by publishers brings with it a new problem. For all their flaws, analogue publications have the great virtue that they are durable: once a library has a copy, it is likely to be available for decades, if not centuries. Digital scholarly articles come with no such guarantee. The Internet is constantly in flux, with many publishers and sites closing down each year, often without notice. That’s a problem when sites holding archival copies of scholarly articles vanish, making it harder, perhaps impossible, to access important papers. Eve explored whether publishers were placing copies of the articles they published in key archives. Ideally, digital papers would be available in multiple archives to ensure resilience, but the reality is that very few publishers did this. Ars Technica has a good summary of Eve’s results:

When Eve broke down the results by publisher, less than 1 percent of the 204 publishers had put the majority of their content into multiple archives. (The cutoff was 75 percent of their content in three or more archives.) Fewer than 10 percent had put more than half their content in at least two archives. And a full third seemed to be doing no organized archiving at all.

At the individual publication level, under 60 percent were present in at least one archive, and over a quarter didn’t appear to be in any of the archives at all. (Another 14 percent were published too recently to have been archived or had incomplete records.)…(More)”.

The Unintended Consequences of Data Standardization


Article by Cathleen Clerkin: “The benefits of data standardization within the social sector—and indeed just about any industry—are multiple, important, and undeniable. Access to the same type of data over time lends the ability to track progress and increase accountability. For example, over the last 20 years, my organization, Candid, has tracked grantmaking by the largest foundations to assess changes in giving trends. The data allowed us to demonstrate philanthropy’s disinvestment in historically Black colleges and universities. Data standardization also creates opportunities for benchmarking—allowing individuals and organizations to assess how they stack up to their colleagues and competitors. Moreover, large amounts of standardized data can help predict trends in the sector. Finally—and perhaps most importantly to the social sector—data standardization invariably reduces the significant reporting burdens placed on nonprofits.

Yet, for all of its benefits, data is too often proposed as a universal cure that will allow us to unequivocally determine the success of social change programs and processes. The reality is far more complex and nuanced. Left unchecked, the unintended consequences of data standardization pose significant risks to achieving a more effective, efficient, and equitable social sector…(More)”.

Data Authenticity, Consent, and Provenance for AI Are All Broken: What Will It Take to Fix Them?


Article by Shayne Longpre et al: “New AI capabilities are owed in large part to massive, widely sourced, and underdocumented training data collections. Dubious collection practices have spurred crises in data transparency, authenticity, consent, privacy, representation, bias, copyright infringement, and the overall development of ethical and trustworthy AI systems. In response, AI regulation is emphasizing the need for training data transparency to understand AI model limitations. Based on a large-scale analysis of the AI training data landscape and existing solutions, we identify the missing infrastructure to facilitate responsible AI development practices. We explain why existing tools for data authenticity, consent, and documentation alone are unable to solve the core problems facing the AI community, and outline how policymakers, developers, and data creators can facilitate responsible AI development, through universal data provenance standards…(More)”.

Why data about people are so hard to govern


Paper by Wendy H. Wong, Jamie Duncan, and David A. Lake: “How data on individuals are gathered, analyzed, and stored remains largely ungoverned at both domestic and global levels. We address the unique governance problem posed by digital data to provide a framework for understanding why data governance remains elusive. Data are easily transferable and replicable, making them a useful tool. But this characteristic creates massive governance problems for all of us who want to have some agency and choice over how (or if) our data are collected and used. Moreover, data are co-created: individuals are the object from which data are culled by an interested party. Yet, any data point has a marginal value of close to zero and thus individuals have little bargaining power when it comes to negotiating with data collectors. Relatedly, data follow the rule of winner take all—the parties that have the most can leverage that data for greater accuracy and utility, leading to natural oligopolies. Finally, data’s value lies in combination with proprietary algorithms that analyze and predict the patterns. Given these characteristics, private governance solutions are ineffective. Public solutions will also likely be insufficient. The imbalance in market power between platforms that collect data and individuals will be reproduced in the political sphere. We conclude that some form of collective data governance is required. We examine the challenges to data governance by looking at a public effort, the EU’s General Data Protection Regulation; a private effort, Apple’s “privacy nutrition labels” in its App Store; and a collective effort, the First Nations Information Governance Centre in Canada…(More)”

Creating an Integrated System of Data and Statistics on Household Income, Consumption, and Wealth: Time to Build


Report by the National Academies: “Many federal agencies provide data and statistics on inequality and related aspects of household income, consumption, and wealth (ICW). However, because the information provided by these agencies is often produced using different concepts, underlying data, and methods, the resulting estimates of poverty, inequality, mean and median household income, consumption, and wealth, as well as other statistics, do not always tell a consistent or easily interpretable story. Measures also differ in their accuracy, timeliness, and relevance, so that it is difficult to address such questions as the effects of the Great Recession on household finances or of the Covid-19 pandemic and the ensuing relief efforts on household income and consumption. The presence of multiple, sometimes conflicting statistics at best muddies the waters of policy debates and, at worst, enables advocates with different policy perspectives to cherry-pick their preferred set of estimates. Achieving an integrated system of relevant, high-quality, and transparent household ICW data and statistics should go far to reduce disagreement about who has how much, and from what sources. Further, such data are essential to advance research on economic wellbeing and to ensure that policies are well targeted to achieve societal goals…(More)”.

Objectivity vs affect: how competing forms of legitimacy can polarize public debate in data-driven public consultation


Paper by Alison Powell: “How do data and objectivity become politicized? How do processes intended to include citizen voices instead push them into social media that intensify negative expression? This paper examines the possibility and limits of ‘agonistic data practices’ (Crooks & Currie, 2021) examining how data-driven consultation practices create competing forms of legitimacy for quantifiable knowledge and affective lived experience. Drawing on a two-year study of a private Facebook group self-presenting as a supportive space for working-class people critical of the development of ‘low-traffic neighbourhoods’ (LTNs), the paper reveals how the dynamics of ‘affective polarization’ associated the use of data with elite and exclusionary politics. Participants addressed this by framing their online contributions as ‘vernacular data’ and also by associating numerical data with exclusion and inequality. Over time the strong statements of feeling began to support content of a conspiratorial nature, reflected at the social level of discourse in the broader media environment where stories of strong feeling gain legitimacy in right-wing sources. The paper concludes that ideologies of dataism and practices of datafication may create conditions for political extremism to develop when the potential conditions of ‘agonistic data practices’ are not met, and that consultation processes must avoid overly valorizing data and calculable knowledge if they wish to retain democratic accountability…(More)”.