The 2010 Census Confidentiality Protections Failed, Here’s How and Why

Paper by John M. Abowd, et al: “Using only 34 published tables, we reconstruct five variables (census block, sex, age, race, and ethnicity) in the confidential 2010 Census person records. Using the 38-bin age variable tabulated at the census block level, at most 20.1% of reconstructed records can differ from their confidential source on even a single value for these five variables. Using only published data, an attacker can verify that all records in 70% of all census blocks (97 million people) are perfectly reconstructed. The tabular publications in Summary File 1 thus have prohibited disclosure risk similar to the unreleased confidential microdata. Reidentification studies confirm that an attacker can, within blocks with perfect reconstruction accuracy, correctly infer the actual census response on race and ethnicity for 3.4 million vulnerable population uniques (persons with nonmodal characteristics) with 95% accuracy, the same precision as the confidential data achieve and far greater than statistical baselines. The flaw in the 2010 Census framework was the assumption that aggregation prevented accurate microdata reconstruction, justifying weaker disclosure limitation methods than were applied to 2010 Census public microdata. The framework used for 2020 Census publications defends against attacks that are based on reconstruction, as we also demonstrate here. Finally, we show that alternatives to the 2020 Census Disclosure Avoidance System with similar accuracy (enhanced swapping) also fail to protect confidentiality, and those that partially defend against reconstruction attacks (incomplete suppression implementations) destroy the primary statutory use case: data for redistricting all legislatures in the country in compliance with the 1965 Voting Rights Act…(More)”.
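
To make the reconstruction idea in the abstract concrete, here is a toy sketch, not the authors' method. It assumes a single block described by only two variables (sex and a three-bin age variable) and three published tables; the names `tabulate` and `reconstruct` and the category labels are hypothetical. It enumerates every set of person records consistent with the published counts. When only one set fits, the block is exactly reconstructed from the tables alone.

```python
from collections import Counter
from itertools import combinations_with_replacement

# Hypothetical, heavily simplified categories (the paper uses five variables).
SEXES = ("F", "M")
AGE_BINS = ("0-17", "18-64", "65+")


def tabulate(records):
    """Return the toy 'published' tables for one block of (sex, age) records."""
    return {
        "total": len(records),
        "by_sex": Counter(sex for sex, _ in records),
        "by_age": Counter(age for _, age in records),
    }


def reconstruct(tables):
    """Brute-force every multiset of records that reproduces the tables."""
    domain = [(sex, age) for sex in SEXES for age in AGE_BINS]
    solutions = []
    for candidate in combinations_with_replacement(domain, tables["total"]):
        if tabulate(candidate) == tables:
            solutions.append(sorted(candidate))
    return solutions


# Assumed confidential records for two tiny blocks (invented for illustration).
small_block = [("F", "0-17"), ("F", "65+")]
larger_block = [("F", "0-17"), ("F", "65+"), ("M", "18-64")]

for block in (small_block, larger_block):
    matches = reconstruct(tabulate(block))
    print(len(block), "people,", len(matches), "consistent reconstruction(s)")
```

In this sketch the two-person block has a single consistent solution, while the three-person block has several, so these two marginal tables alone do not pin it down. The paper's point is that the 34 real tables, taken together, leave little or no such ambiguity for most blocks.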

Foundational Research Gaps and Future Directions for Digital Twins

Report by the National Academy of Engineering; National Academies of Sciences, Engineering, and Medicine: “Across multiple domains of science, engineering, and medicine, excitement is growing about the potential of digital twins to transform scientific research, industrial practices, and many aspects of daily life. A digital twin couples computational models with a physical counterpart to create a system that is dynamically updated through bidirectional data flows as conditions change. Going beyond traditional simulation and modeling, digital twins could enable improved medical decision-making at the individual patient level, predictions of future weather and climate conditions over longer timescales, and safer, more efficient engineering processes. However, many challenges remain before these applications can be realized.

This report identifies the foundational research and resources needed to support the development of digital twin technologies. The report presents critical future research priorities and an interdisciplinary research agenda for the field, including how federal agencies and researchers across domains can best collaborate…(More)”.

Conversing with Congress: An Experiment in AI-Enabled Communication

Blog by Beth Noveck: “Each Member of the US House of Representatives speaks for 747,184 people – a staggering increase from 50 years ago. In the Senate, this disproportion is even more pronounced: on average each Senator represents 1.6 million more constituents than her predecessor a generation ago. That’s a lower level of representation than any other industrialized democracy.

As the population grows (over 60% since 1970), so, too, do constituent communications.

But that communication is not working well. According to the Congressional Management Foundation, this overwhelming communication volume leads to dissatisfaction among voters who feel their views are not adequately considered by their representatives….A pioneering and important new study published in Government Information Quarterly entitled “Can AI communication tools increase legislative responsiveness and trust in democratic institutions?” (Volume 40, Issue 3, June 2023, 101829) from two Cornell researchers is shedding new light on the practical potential for AI to create more meaningful constituent communication….Depending on the treatment group they either were or were not told when replies were AI-drafted.

Their findings are telling. Standard, generic responses fare poorly in gaining trust. In contrast, all AI-assisted responses, particularly those with human involvement, significantly boost trust. “Legislative correspondence generated by AI with human oversight may be received favorably.” 


While the study found AI-assisted replies to be more trustworthy, it also explored how the quality of these replies impacts perception. When they conducted this study, ChatGPT was still in its infancy and more prone to linguistic hallucinations, so they also tested in a second experiment how people perceived higher-quality, more relevant and responsive replies against lower-quality, irrelevant replies drafted with AI…(More)”.

Using Data for Good: Identifying Who Could Benefit from Simplified Tax Filing

Blog by New America: “For years, New America Chicago has been working with state agencies, national and local advocates and thought leaders, as well as community members on getting beneficial tax credits, like the Earned Income Tax Credit (EITC) and Child Tax Credit (CTC), into the hands of those who need them most. Illinois paved the way recently with its innovative simplified filing initiative which helps residents easily claim their state Earned Income Credit (EIC) by confirming their refund with a prepopulated return.

This past year we had discussions with Illinois policymakers and state agencies, like the Illinois Department of Revenue (IDoR) and the Illinois Department of Human Services (IDHS), to envision new ways for expanding the simplified filing initiative. It is currently designed to reach those who have filed a federal tax return and claimed their EITC, leaving out non-filer households who typically do not file taxes because they earn less than the federal income requirement or have other barriers.

In Illinois, over 600,000 households are enrolled in SNAP, and over 1 million households are enrolled in Medicaid. Every year thousands of families spend countless hours applying for these and other social safety net programs using IDHS’ Application for Benefits Eligibility (ABE). Unfortunately, many of these households are most in need of the federal EITC and the recently expanded state EIC but will never receive it. We posed the question, what if Illinois could save families time and money by using that already provided income and household information to streamline access to the state EIC for low-income families that don’t normally file taxes?

Our friends at Inclusive Economy Lab (IEL) conducted analysis using Census microdata to estimate the number of Illinois households who are enrolled in Medicaid and SNAP but do not file their federal or state tax forms…(More)”.

Informing Decisionmakers in Real Time

Article by Robert M. Groves: “In response, the National Science Foundation (NSF) proposed the creation of a complementary group to provide decisionmakers at all levels with the best available evidence from the social sciences to inform pandemic policymaking. In May 2020, with funding from NSF and additional support from the Alfred P. Sloan Foundation and the David and Lucile Packard Foundation, NASEM established the Societal Experts Action Network (SEAN) to connect “decisionmakers grappling with difficult issues to the evidence, trends, and expert guidance that can help them lead their communities and speed their recovery.” We chose to build a network because of the widespread recognition that no one small group of social scientists would have the expertise or the bandwidth to answer all the questions facing decisionmakers. What was needed was a structure that enabled an ongoing feedback loop between researchers and decisionmakers. This structure would foster the integration of evidence, research, and advice in real time, which broke with NASEM’s traditional form of aggregating expert guidance over lengthier periods.

In its first phase, SEAN’s executive committee set about building a network that could both gather and disseminate knowledge. To start, we brought in organizations of decisionmakers—including the National Association of Counties, the National League of Cities, the International City/County Management Association, and the National Conference of State Legislatures—to solicit their questions. Then we added capacity to the network by inviting social and behavioral organizations—like the National Bureau of Economic Research, the National Hazards Center at the University of Colorado Boulder, the Kaiser Family Foundation, the National Opinion Research Center at the University of Chicago, The Policy Lab at Brown University, and Testing for America—to join and respond to questions and disseminate guidance. In this way, SEAN connected teams of experts with evidence and answers to leaders and communities looking for advice…(More)”.

How to make data open? Stop overlooking librarians

Article by Jessica Farrell: “The ‘Year of Open Science’, as declared by the US Office of Science and Technology Policy (OSTP), is now wrapping up. This followed an August 2022 memo from OSTP acting director Alondra Nelson, which mandated that data and peer-reviewed publications from federally funded research should be made freely accessible by the end of 2025. Federal agencies are required to publish full plans for the switch by the end of 2024.

But the specifics of how data will be preserved and made publicly available are far from being nailed down. I worked in archives for ten years and now facilitate two digital-archiving communities, the Software Preservation Network and BitCurator Consortium, at Educopia in Atlanta, Georgia. The expertise of people such as myself is often overlooked. More open-science projects need to integrate digital archivists and librarians, to capitalize on the tools and approaches that we have already created to make knowledge accessible and open to the public.

Making data open and ‘FAIR’ — findable, accessible, interoperable and reusable — poses technical, legal, organizational and financial questions. How can organizations best coordinate to ensure universal access to disparate data? Who will do that work? How can we ensure that the data remain open long after grant funding runs dry?

Many archivists agree that technical questions are the most solvable, given enough funding to cover the labour involved. But they are nonetheless complex. Ideally, any open research should be testable for reproducibility, but re-running scripts or procedures might not be possible unless all of the required coding libraries and environments used to analyse the data have also been preserved. Besides the contents of spreadsheets and databases, scientific-research data can include 2D or 3D images, audio, video, websites and other digital media, all in a variety of formats. Some of these might be accessible only with proprietary or outdated software…(More)”.

After USTR’s Move, Global Governance of Digital Trade Is Fraught with Unknowns

Article by Patrick Leblond: “On October 25, the United States announced at the World Trade Organization (WTO) that it was dropping its support for provisions meant to promote the free flow of data across borders. Also abandoned were efforts to continue negotiations on international e-commerce, to protect the source code in applications and algorithms (the so-called Joint Statement Initiative process).

According to the Office of the US Trade Representative (USTR): “In order to provide enough policy space for those debates to unfold, the United States has removed its support for proposals that might prejudice or hinder those domestic policy considerations.” In other words, the domestic regulation of data, privacy, artificial intelligence, online content and the like, seems to have taken precedence over unhindered international digital trade, which the United States previously strongly defended in trade agreements such as the Trans-Pacific Partnership (TPP) and the Canada-United States-Mexico Agreement (CUSMA)…

One pathway for the future sees the digital governance noodle bowl getting bigger and messier. In this scenario, international digital trade suffers. Agreements continue proliferating but remain ineffective at fostering cross-border digital trade: either they remain hortatory with attempts at cooperation on non-strategic issues, or no one pays attention to the binding provisions because business can’t keep up and governments want to retain their “policy space.” After all, why has there not yet been any dispute launched based on binding provisions in a digital trade agreement (either on its own or as part of a larger trade deal) when there has been increasing digital fragmentation?

The other pathway leads to the creation of a new international standards-setting and governance body (call it an International Digital Standards Board), like there exists for banking and finance. Countries that are members of such an international organization and effectively apply the commonly agreed standards become part of a single digital area where they can conduct cross-border digital trade without impediments. This is the only way to realize the G7’s “data free flow with trust” vision, originally proposed by Japan…(More)”.

Shaping the Future: Indigenous Voices Reshaping Artificial Intelligence in Latin America

Blog by Enzo Maria Le Fevre Cervini: “In a groundbreaking move toward inclusivity and respect for diversity, a comprehensive report “Inteligencia artificial centrada en los pueblos indígenas: perspectivas desde América Latina y el Caribe” authored by Cristina Martinez and Luz Elena Gonzalez has been released by UNESCO, outlining the pivotal role of Indigenous perspectives in shaping the trajectory of Artificial Intelligence (AI) in Latin America. The report, a collaborative effort involving Indigenous communities, researchers, and various stakeholders, emphasizes the need for a fundamental shift in the development of AI technologies, ensuring they align with the values, needs, and priorities of Indigenous peoples.

The core theme of the report revolves around the idea that for AI to be truly respectful of human rights, it must incorporate the perspectives of Indigenous communities in Latin America, the Caribbean, and beyond. Recognizing the UNESCO Recommendation on the Ethics of Artificial Intelligence, the report highlights the urgency of developing a framework of shared responsibility among different actors, urging them to leverage their influence for the collective public interest.

While acknowledging the immense potential of AI in preserving Indigenous identities, conserving cultural heritage, and revitalizing languages, the report notes a critical gap. Many initiatives are often conceived externally, prompting a call to reevaluate these projects to ensure Indigenous leadership, development, and implementation…(More)”.

New York City Takes Aim at AI

Article by Samuel Greengard: “As concerns over artificial intelligence (AI) grow and angst about its potential impact increases, political leaders and government agencies are taking notice. In November, U.S. president Joe Biden issued an executive order designed to build guardrails around the technology. Meanwhile, the European Union (EU) is currently developing a legal framework around responsible AI.

Yet, what is often overlooked about artificial intelligence is that it’s more likely to impact people on a local level. AI touches housing, transportation, healthcare, policing and numerous other areas relating to business and daily life. It increasingly affects citizens, government employees, and businesses in both obvious and unintended ways.

One city attempting to position itself at the vanguard of AI is New York. In October 2023, New York City announced a blueprint for developing, managing, and using the technology responsibly. The New York City Artificial Intelligence Action Plan—the first of its kind in the U.S.—is designed to help officials and the public navigate the AI space.

“It’s a fairly comprehensive plan that addresses both the use of AI within city government and the responsible use of the technology,” says Clifford S. Stein, Wai T. Chang Professor of Industrial Engineering and Operations Research and Interim Director of the Data Science Institute at Columbia University.

Adds Stefaan Verhulst, co-founder and chief research and development officer at The GovLab and Senior Fellow at the Center for Democracy and Technology (CDT), “AI localism focuses on the idea that cities are where most of the action is in regard to AI.”…(More)”.

Boston experimented with using generative AI for governing. It went surprisingly well

Article by Santiago Garces and Stephen Goldsmith: “…we see the possible advances of generative AI as having the most potential. For example, Boston asked OpenAI to “suggest interesting analyses” after we uploaded 311 data. In response, it suggested two things: time series analysis by case time, and a comparative analysis by neighborhood. This meant that city officials spent less time navigating the mechanics of computing an analysis, and had more time to dive into the patterns of discrepancy in service. The tools make graphs, maps, and other visualizations with a simple prompt. With lower barriers to analyze data, our city officials can formulate more hypotheses and challenge assumptions, resulting in better decisions.

Not all city officials have the engineering and web development experience needed to run these tests and code. But this experiment shows that other city employees, without any STEM background, could, with just a bit of training, utilize these generative AI tools to supplement their work.

To make this possible, more authority would need to be granted to frontline workers who too often have their hands tied with red tape. Therefore, we encourage government leaders to allow workers more discretion to solve problems, identify risks, and check data. This is not inconsistent with accountability; rather, supervisors can utilize these same generative AI tools, to identify patterns or outliers—say, where race is inappropriately playing a part in decision-making, or where program effectiveness drops off (and why). These new tools will more quickly provide an indication as to which interventions are making a difference, or precisely where a historic barrier is continuing to harm an already marginalized community.  

Civic groups will be able to hold government accountable in new ways, too. This is where the linguistic power of large language models really shines: Public employees and community leaders alike can request that tools create visual process maps, build checklists based on a description of a project, or monitor progress compliance. Imagine if people who have a deep understanding of a city—its operations, neighborhoods, history, and hopes for the future—can work toward shared goals, equipped with the most powerful tools of the digital age. Gatekeepers of formerly mysterious processes will lose their stranglehold, and expediters versed in state and local ordinances, codes, and standards, will no longer be necessary to maneuver around things like zoning or permitting processes. 

Numerous challenges would remain. Public workforces would still need better data analysis skills in order to verify whether a tool is following the right steps and producing correct information. City and state officials would need technology partners in the private sector to develop and refine the necessary tools, and these relationships raise challenging questions about privacy, security, and algorithmic bias…(More)”
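
For readers curious what the two analyses suggested in the Boston experiment (a time series of cases and a comparison across neighborhoods) might look like in code, the following is a minimal sketch. The records, neighborhood labels, and function names are all invented for illustration; this is not Boston's 311 data or the code the generative AI tool produced.

```python
from collections import Counter, defaultdict
from datetime import date
from statistics import mean

# Hypothetical 311 cases: (date opened, date closed, neighborhood label).
CASES = [
    (date(2023, 1, 5), date(2023, 1, 7), "Neighborhood A"),
    (date(2023, 1, 20), date(2023, 1, 21), "Neighborhood B"),
    (date(2023, 2, 3), date(2023, 2, 10), "Neighborhood A"),
    (date(2023, 2, 14), date(2023, 2, 15), "Neighborhood B"),
    (date(2023, 2, 27), date(2023, 3, 2), "Neighborhood A"),
]


def cases_per_month(cases):
    """Time series: number of cases opened in each (year, month)."""
    counts = Counter((opened.year, opened.month) for opened, _, _ in cases)
    return dict(sorted(counts.items()))


def mean_days_to_close(cases):
    """Comparison: average days from open to close, per neighborhood."""
    durations = defaultdict(list)
    for opened, closed, neighborhood in cases:
        durations[neighborhood].append((closed - opened).days)
    return {name: mean(days) for name, days in durations.items()}


print(cases_per_month(CASES))
print(mean_days_to_close(CASES))
```

A gap between neighborhoods in the second result is the kind of "pattern of discrepancy in service" the authors describe officials having more time to investigate.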