Vulnerability and Data Protection Law


Book by Gianclaudio Malgieri: “Vulnerability has traditionally been viewed through the lens of specific groups of people, such as ethnic minorities, children, the elderly, or people with disabilities. With the rise of digital media, our perceptions of vulnerable groups and individuals have been reshaped as new vulnerabilities and different vulnerable sub-groups of users, consumers, citizens, and data subjects emerge.

Vulnerability and Data Protection Law not only depicts these problems but offers the reader a detailed investigation of the concept of data subjects and a reconceptualization of the notion of vulnerability within the General Data Protection Regulation. The regulation offers a forward-facing set of tools that, though largely underexplored, are essential in rebalancing power asymmetries and mitigating induced vulnerabilities in the age of artificial intelligence.

Considering the new risks and potentialities of the digital market, the new awareness about cognitive weaknesses, and the new philosophical sensitivity about the condition of human vulnerability, the author looks for a more general and layered definition of the data subject’s vulnerability that goes beyond traditional labels. In doing so, he seeks to promote a ‘vulnerability-aware’ interpretation of the GDPR.

A heuristic analysis that re-interprets the whole GDPR, this work is essential both for scholars of data protection law and for policymakers looking to strengthen regulations and protect the data of vulnerable individuals…(More)”.

Digital Sovereignty and Governance in the Data Economy: Data Trusteeship Instead of Property Rights on Data


Chapter by Ingrid Schneider: “This chapter challenges the current business models of the dominant platforms in the digital economy. In the search for alternatives, and towards the aim of achieving digital sovereignty, it proceeds in four steps: First, it discusses scholarly proposals to constitute a new intellectual property right on data. Second, it examines four models of data governance distilled from the literature that seek to see data administered (1) as a private good regulated by the market, (2) as a public good regulated by the state, (3) as a common good managed by a commons’ community, and (4) as a data trust supervised by means of stewardship by a trustee. Third, the strengths and weaknesses of each of these models, which are ideal types and serve as heuristics, are critically appraised. Fourth, data trusteeship, which at present seems to be emerging as a promising implementation model for better data governance, is discussed in more detail, both in an empirical-descriptive way, by referring to initiatives in several countries, and analytically, by highlighting the challenges and pitfalls of data trusteeship…(More)”.

Can AI help governments clean out bureaucratic “Sludge”?


Blog by Abhi Nemani: “Government services often entail a plethora of paperwork and processes that can be exasperating and time-consuming for citizens. Whether it’s applying for a passport, filing taxes, or registering a business, chances are one has encountered some form of sludge.

Sludge is a term coined by Cass Sunstein, a legal scholar and former administrator of the White House Office of Information and Regulatory Affairs, in his straightforward book Sludge, to describe unnecessarily effortful processes, bureaucratic procedures, and other barriers to desirable outcomes in government services…

So how can sludge be reduced or eliminated in government services? Sunstein suggests that one way to achieve this is to conduct Sludge Audits, which are systematic evaluations of the costs and benefits of existing or proposed sludge. He also recommends that governments adopt ethical principles and guidelines for the design and use of public services. He argues that by reducing sludge, governments can enhance the quality of life and well-being of their citizens.
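
Sunstein’s idea of a Sludge Audit lends itself to simple quantification. The sketch below is a minimal, purely illustrative take on that idea (the process steps, times, and completion rates are assumptions, not figures from the book): it tallies the time burden of each step in a hypothetical application process and shows where applicants drop out.

```python
# Minimal, hypothetical sketch of a "sludge audit": estimate the time burden
# of each step in a public-service process and where applicants give up.
# All step names, durations, and completion rates are invented.

from dataclasses import dataclass

@dataclass
class Step:
    name: str
    minutes_per_applicant: float  # average time an applicant spends on this step
    completion_rate: float        # share of applicants who make it through the step

def audit(steps: list[Step], applicants: int) -> None:
    remaining = float(applicants)
    total_hours = 0.0
    for step in steps:
        hours = remaining * step.minutes_per_applicant / 60
        dropped = remaining * (1 - step.completion_rate)
        total_hours += hours
        print(f"{step.name:<28}{hours:8.0f} hours{dropped:8.0f} applicants lost")
        remaining -= dropped
    print(f"Total burden: {total_hours:.0f} hours; "
          f"{remaining:.0f} of {applicants} applicants finish the process")

audit(
    [
        Step("Create online account", 10, 0.95),
        Step("Upload identity documents", 25, 0.80),
        Step("In-person verification", 90, 0.70),
    ],
    applicants=10_000,
)
```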

One example of sludge reduction in government is the simplification and automation of tax filing in some countries. According to a study by the World Bank, countries that have implemented electronic tax filing systems have reduced the time and cost of tax compliance for businesses and individuals. The study also found that electronic tax filing systems have improved tax administration efficiency, transparency, and revenue collection. Some countries, such as Estonia and Chile, have gone further by pre-filling tax returns with information from various sources, such as employers, banks, and other government agencies. This reduces the burden on taxpayers to provide or verify data, and increases the accuracy and completeness of tax returns.

Future Opportunities for AI in Cutting Sludge

AI technology is rapidly evolving, and its potential applications are manifold. Here are a few opportunities for further AI deployment:

  • AI-assisted policy design: AI can analyze vast amounts of data to inform policy design, identifying areas of administrative burden and suggesting improvements.
  • Smart contracts and blockchain: These technologies could automate complex procedures, such as contract execution or asset transfer, reducing the need for paperwork.
  • Enhanced citizen engagement: AI could personalize government services, making them more accessible and less burdensome.

Key Takeaways:

  • AI could play a significant role in policy design, contract execution, and citizen engagement.
  • These technologies hold the potential to significantly reduce sludge…(More)”.

Use of AI in social sciences could mean humans will no longer be needed in data collection


Article by Michael Lee: “A team of researchers from four Canadian and American universities say artificial intelligence could replace humans when it comes to collecting data for social science research.

Researchers from the University of Waterloo, University of Toronto, Yale University and the University of Pennsylvania published an article in the journal Science on June 15 about how AI, specifically large language models (LLMs), could affect their work.

“AI models can represent a vast array of human experiences and perspectives, possibly giving them a higher degree of freedom to generate diverse responses than conventional human participant methods, which can help to reduce generalizability concerns in research,” Igor Grossmann, professor of psychology at Waterloo and a co-author of the article, said in a news release.

Philip Tetlock, a psychology professor at UPenn and article co-author, goes so far as to say that LLMs will “revolutionize human-based forecasting” in just three years.

In their article, the authors pose the question: “How can social science research practices be adapted, even reinvented, to harness the power of foundational AI? And how can this be done while ensuring transparent and replicable research?”

The authors say the social sciences have traditionally relied on methods such as questionnaires and observational studies.

But with the ability of LLMs to pore over vast amounts of text data and generate human-like responses, the authors say this presents a “novel” opportunity for researchers to test theories about human behaviour at a faster rate and on a much larger scale.

Scientists could use LLMs to test theories in a simulated environment before applying them in the real world, the article says, or gather differing perspectives on a complex policy issue and generate potential solutions.
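
To make the idea of simulated participants concrete, here is a generic sketch of that workflow rather than the authors’ method. The call_llm helper is a hypothetical stand-in for whichever language model API a researcher uses, and the personas and survey question are invented.

```python
# Hypothetical sketch: gathering simulated survey responses from varied
# personas before fielding a study with human participants.
# `call_llm` is a placeholder for any chat-style model API.

def call_llm(prompt: str) -> str:
    """Stand-in for a real model call; wire this up to the LLM of your choice."""
    raise NotImplementedError

PERSONAS = [
    "a 24-year-old gig worker in a large city",
    "a 68-year-old retiree in a rural town",
    "a small-business owner with two employees",
]

QUESTION = "Should municipal broadband be publicly funded? Answer in 2-3 sentences."

def simulate_responses(question: str, personas: list[str]) -> dict[str, str]:
    """Collect one simulated response per persona."""
    responses = {}
    for persona in personas:
        prompt = f"You are {persona}. {question}"
        responses[persona] = call_llm(prompt)
    return responses

# The resulting responses could be coded and analyzed like pilot data, with
# the caveat noted below that model outputs may not reflect real populations.
```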

“It won’t make sense for humans unassisted by AIs to venture probabilistic judgments in serious policy debates. I put a 90 per cent chance on that,” Tetlock said. “Of course, how humans react to all of that is another matter.”

One issue the authors identified, however, is that LLMs are often trained to exclude sociocultural biases, which raises the question of whether the models correctly reflect the populations they study…(More)”

Artificial Intelligence for Emergency Response


Paper by Ayan Mukhopadhyay: “Emergency response management (ERM) is a challenge faced by communities across the globe. First responders must respond to various incidents, such as fires, traffic accidents, and medical emergencies. They must respond quickly to incidents to minimize the risk to human life. Consequently, considerable attention has been devoted to studying emergency incidents and response in the last several decades. In particular, data-driven models help reduce human and financial loss and improve design codes, traffic regulations, and safety measures. This tutorial paper explores four sub-problems within emergency response: incident prediction, incident detection, resource allocation, and resource dispatch. We aim to present mathematical formulations for these problems and broad frameworks for each problem. We also share open-source (synthetic) data from a large metropolitan area in the USA for future work on data-driven emergency response…(More)”.
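
The paper gives formal formulations for each sub-problem; as a toy illustration of just one of them, resource dispatch, the sketch below greedily assigns the nearest available responder to each incoming incident. The unit names and coordinates are invented, and the greedy rule is a simplification, not the paper’s approach.

```python
# Toy sketch of the resource-dispatch sub-problem: send the nearest
# available responder to each incident (greedy heuristic on invented data).

import math

def dispatch(incident, responders):
    """Return the closest free responder to an incident, or None if all are busy."""
    free = [r for r in responders if r["available"]]
    if not free:
        return None
    best = min(free, key=lambda r: math.dist(r["pos"], incident["pos"]))
    best["available"] = False  # mark the chosen unit as busy
    return best

responders = [
    {"id": "medic-1", "pos": (0.0, 0.0), "available": True},
    {"id": "medic-2", "pos": (5.0, 1.0), "available": True},
]
incidents = [
    {"id": "call-17", "pos": (4.0, 2.0)},
    {"id": "call-18", "pos": (1.0, 0.5)},
]

for inc in incidents:
    unit = dispatch(inc, responders)
    print(inc["id"], "->", unit["id"] if unit else "queued")
```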

Fighting poverty with synthetic data


Article by Jack Gisby, Anna Kiknadze, Thomas Mitterling, and Isabell Roitner-Fransecky: “If you have ever used a smartwatch or other wearable tech to track your steps, heart rate, or sleep, you are part of the “quantified self” movement. You are voluntarily submitting millions of intimate data points for collection and analysis. The Economist highlighted the benefits of good quality personal health and wellness data—increased physical activity, more efficient healthcare, and constant monitoring of chronic conditions. However, not everyone is enthusiastic about this trend. Many fear corporations will use the data to discriminate against the poor and vulnerable. For example, insurance firms could exclude patients based on pre-existing conditions inferred from shared personal data.

Can we strike a balance between protecting the privacy of individuals and gathering valuable information? This blog explores applying a synthetic populations approach in New York City, a city with an established reputation for using big data approaches to support urban management, including for welfare provisions and targeted policy interventions.

To better understand poverty rates at the census tract level, World Data Lab, with the support of the Sloan Foundation, generated a synthetic population based on the borough of Brooklyn. Synthetic populations rely on a combination of microdata and summary statistics:

  • Microdata consists of personal information at the individual level. In the U.S., such data is available at the Public Use Microdata Area (PUMA) level. PUMAs are geographic areas partitioning the state, each containing no fewer than 100,000 people. However, due to privacy concerns, microdata is unavailable at the more granular census tract level. Microdata consists of both household and individual-level information, including last year’s household income, the household size, the number of rooms, and the age, sex, and educational attainment of each individual living in the household.
  • Summary statistics are based on populations rather than individuals and are available at the census tract level, given that there are fewer privacy concerns. Census tracts are small statistical subdivisions of a county, averaging about 4,000 inhabitants. In New York City, a census tract roughly corresponds to a city block. Similar to microdata, summary statistics are available for individuals and households. On the census tract level, we know the total population, the corresponding demographic breakdown, the number of households within different income brackets, the number of households by number of rooms, and other similar variables…(More)”.
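
Combining person-level microdata from PUMAs with tract-level margins is typically handled with reweighting methods such as iterative proportional fitting. The sketch below illustrates that generic idea, not World Data Lab’s actual pipeline; the records, categories, and target counts are invented.

```python
# Generic sketch: reweight PUMA microdata to census-tract margins via
# iterative proportional fitting (IPF). Records and targets are invented.

micro = [  # simplified microdata records (one per household)
    {"income": "low",  "size": "1-2"},
    {"income": "low",  "size": "3+"},
    {"income": "high", "size": "1-2"},
    {"income": "high", "size": "3+"},
]
weights = [1.0] * len(micro)

# Tract-level summary statistics: target household counts per category
targets = {
    "income": {"low": 900, "high": 600},
    "size":   {"1-2": 1000, "3+": 500},
}

for _ in range(50):  # alternately rescale weights to match each margin
    for dim, margin in targets.items():
        for category, target in margin.items():
            idx = [i for i, rec in enumerate(micro) if rec[dim] == category]
            current = sum(weights[i] for i in idx)
            if current > 0:
                factor = target / current
                for i in idx:
                    weights[i] *= factor

# A synthetic population is then drawn by replicating or sampling records
# in proportion to their fitted weights.
print({rec["income"] + "/" + rec["size"]: round(w) for rec, w in zip(micro, weights)})
```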

Open Data on GitHub: Unlocking the Potential of AI


Paper by Anthony Cintron Roman, Kevin Xu, Arfon Smith, Jehu Torres Vega, Caleb Robinson, Juan M Lavista Ferres: “GitHub is the world’s largest platform for collaborative software development, with over 100 million users. GitHub is also used extensively for open data collaboration, hosting more than 800 million open data files, totaling 142 terabytes of data. This study highlights the potential of open data on GitHub and demonstrates how it can accelerate AI research. We analyze the existing landscape of open data on GitHub and the patterns of how users share datasets. Our findings show that GitHub is one of the largest hosts of open data in the world and has experienced an accelerated growth of open data assets over the past four years. By examining the open data landscape on GitHub, we aim to empower users and organizations to leverage existing open datasets and improve their discoverability — ultimately contributing to the ongoing AI revolution to help address complex societal issues. We release the three datasets that we have collected to support this analysis as open datasets at this https URL…(More)”
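
Readers who want to explore this landscape themselves can start with GitHub’s public search API. The sketch below is not the authors’ methodology; it simply lists a few popular repositories carrying the open-data topic, and unauthenticated requests are rate-limited, so a token should be added for anything beyond a quick look.

```python
# Minimal sketch: list popular repositories tagged "open-data" using GitHub's
# public REST search API. Unauthenticated requests are heavily rate-limited.

import requests

resp = requests.get(
    "https://api.github.com/search/repositories",
    params={"q": "topic:open-data", "sort": "stars", "order": "desc", "per_page": 5},
    headers={"Accept": "application/vnd.github+json"},
    timeout=30,
)
resp.raise_for_status()

for repo in resp.json()["items"]:
    print(f'{repo["full_name"]:<40} {repo["stargazers_count"]} stars')
```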

Ethical Considerations Towards Protestware


Paper by Marc Cheong, Raula Gaikovina Kula, and Christoph Treude: “A key drawback to using an Open Source third-party library is the risk of introducing malicious attacks. In recent times, these threats have taken a new form, as maintainers turn their Open Source libraries into protestware. This is defined as software containing political messages delivered through these libraries, which can either be malicious or benign. Since developers are willing to freely open up their software to these libraries, much trust and responsibility are placed on the maintainers to ensure that the library does what it promises to do. This paper takes a look into the possible scenarios where developers might consider turning their Open Source Software into protestware, using an ethico-philosophical lens. Using different frameworks commonly used in AI ethics, we explore the different dilemmas that may result in protestware. Additionally, we illustrate how an open-source maintainer’s decision to protest is influenced by different stakeholders (viz., their membership in the OSS community, their personal views, financial motivations, social status, and moral viewpoints), making protestware a multifaceted and intricate matter…(More)”

Rewiring The Web: The future of personal data


Paper by Jon Nash and Charlie Smith: “In this paper, we argue that the widespread use of personal information online represents a fundamental flaw in our digital infrastructure that enables staggeringly high levels of fraud, undermines our right to privacy, and limits competition.

To realise a web fit for the twenty-first century, we need to fundamentally rethink the ways in which we interact with organisations online.

If we are to preserve the founding values of an open, interoperable web in the face of such profound change, we must update the institutions, regulatory regimes, and technologies that make up this network of networks.

Many of the problems we face stem from the vast amounts of personal information that currently flow through the internet—and fixing this fundamental flaw would have a profound effect on the quality of our lives and the workings of the web…(More)”

A Snapshot of Artificial Intelligence Procurement Challenges


Press Release: “The GovLab has released a new report offering recommendations for government in procuring artificial intelligence (AI) tools. As the largest purchaser of technology, the federal government must adapt its procurement practices to ensure that beneficial AI tools can be responsibly and rapidly acquired and that safeguards are in place so that technology improves people’s lives while minimizing risks. 

Based on conversations with over 35 leaders in government technology, the report identifies key challenges impeding successful procurement of AI, and offers five urgent recommendations to ensure that government is leveraging the benefits of AI to serve residents:

  1. Training: Invest in training public sector professionals to understand and differentiate between high- and low-risk AI opportunities. This includes teaching individuals and government entities to define problems accurately and assess algorithm outcomes. Frequent training updates are necessary to adapt to the evolving AI landscape.
  2. Tools: Develop decision frameworks, contract templates, auditing tools, and pricing models that empower procurement officers to confidently acquire AI. Open data and simulated datasets can aid in testing algorithms and identifying discriminatory effects.
  3. Regulation and Guidance: Recognize the varying complexity of AI use cases and develop a system that guides acquisition professionals to allocate time appropriately. This approach ensures more problematic cases receive thorough consideration.
  4. Organizational Change: Foster collaboration, knowledge sharing, and coordination among procurement officials and policymakers. Including mechanisms for public input allows for a multidisciplinary approach to address AI challenges.
  5. Narrow the Expertise Gap: Integrate individuals with expertise in new technologies into various government departments, including procurement, legal, and policy teams. Strengthen connections with academia and expand fellowship programs to facilitate the acquisition of relevant talent capable of auditing AI outcomes. Implement these programs at federal, state, and local government levels…(More)”
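
Recommendation 2 mentions using open or simulated data to surface discriminatory effects. As one deliberately simplified illustration of what such an auditing tool might compute, the sketch below applies a demographic-parity check (the “four-fifths” rule often used as a screening threshold) to a simulated decision log; the group labels and records are invented.

```python
# Simplified illustration of an algorithm-audit check: compare approval rates
# across groups in a simulated decision log (demographic parity / "four-fifths"
# screening rule). Groups and records are invented.

records = [
    {"group": "A", "approved": True},  {"group": "A", "approved": True},
    {"group": "A", "approved": False}, {"group": "A", "approved": True},
    {"group": "B", "approved": True},  {"group": "B", "approved": False},
    {"group": "B", "approved": False}, {"group": "B", "approved": False},
]

def approval_rates(rows):
    """Share of approved decisions per group."""
    rates = {}
    for group in {r["group"] for r in rows}:
        members = [r for r in rows if r["group"] == group]
        rates[group] = sum(r["approved"] for r in members) / len(members)
    return rates

rates = approval_rates(records)
ratio = min(rates.values()) / max(rates.values())
print(rates, f"parity ratio = {ratio:.2f}")
if ratio < 0.8:  # common "four-fifths" screening threshold
    print("Flag: approval rates differ enough to warrant review.")
```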