The Prediction Society: Algorithms and the Problems of Forecasting the Future


Paper by Hideyuki Matsumi and Daniel J. Solove: “Predictions about the future have been made since the earliest days of humankind, but today, we are living in a brave new world of prediction. Today’s predictions are produced by machine learning algorithms that analyze massive quantities of personal data. Increasingly, important decisions about people are being made based on these predictions.

Algorithmic predictions are a type of inference. Many laws struggle to account for inferences, and even when they do, the laws lump all inferences together. But as we argue in this Article, predictions are different from other inferences. Predictions raise several unique problems that current law is ill-suited to address. First, algorithmic predictions create a fossilization problem because they reinforce patterns in past data and can further solidify bias and inequality from the past. Second, algorithmic predictions often raise an unfalsifiability problem. Predictions involve an assertion about future events. Until these events happen, predictions remain unverifiable, leaving individuals unable to challenge them as false. Third, algorithmic predictions can involve a preemptive intervention problem, where decisions or interventions render it impossible to determine whether the predictions would have come true. Fourth, algorithmic predictions can lead to a self-fulfilling prophecy problem where they actively shape the future they aim to forecast.

More broadly, the rise of algorithmic predictions raises an overarching concern: Algorithmic predictions not only forecast the future but also have the power to create and control it. The increasing pervasiveness of decisions based on algorithmic predictions is leading to a prediction society where individuals’ ability to author their own future is diminished while the organizations developing and using predictive systems are gaining greater power to shape the future…(More)”

From LogFrames to Logarithms – A Travel Log


Article by Karl Steinacker and Michael Kubach: “…Today, authorities all over the world are experimenting with predictive algorithms. That sounds technical and innocent, but as we dive deeper into the issue, we realise that the real meaning is rather specific: fraud detection systems in social welfare payment systems. In the meantime, the hitherto banned terminology has made its comeback: welfare and social safety nets have, for some years now, been en vogue again. But in the centuries-old Western tradition, welfare recipients must be monitored and, if necessary, sanctioned, while those who work and contribute must be assured that there is no waste. So it comes as no surprise that even today’s algorithms focus on the prime suspect: the individual fraudster, the undeserving poor.

Fraud detection systems promise that the taxpayer will no longer fall victim to fraud and that efficiency gains can be redirected to serve more people. Yet the true extent of welfare fraud is regularly exaggerated, while the costs of such systems are routinely underestimated. A comparison of the estimated losses and the investments does not take place; it is the principle of detecting and punishing fraudsters that prevails. Other issues, such as how to distinguish honest mistakes from deliberate fraud, don’t rank high either. And the more time caseworkers spend entering and analysing data in front of a computer screen, the less time and inclination they have to talk to real people and to understand the context of their lives at the margins of society.

Thus, hundreds of thousands of people are routinely being scored. Take Denmark: there, a system called Udbetaling Danmark was created in 2012 to streamline the payment of welfare benefits. Its fraud control algorithms can access the personal data of millions of citizens, not all of whom receive welfare payments. In contrast to the hundreds of thousands affected by this data mining, the number of cases referred to the police for further investigation is minute.

In the Dutch city of Rotterdam, data on 30,000 welfare recipients is analysed every year in order to flag suspected welfare cheats. However, an analysis of its machine-learning-based scoring system showed systemic discrimination with regard to ethnicity, age, gender, and parenthood, and revealed evidence of other fundamental flaws that make the system both inaccurate and unfair. What might appear to a caseworker as a vulnerability is treated by the machine as grounds for suspicion. Despite the scale of data used to calculate risk scores, the output of the system is no better than random guessing. Yet the consequences of being flagged by the “suspicion machine” can be drastic, with fraud controllers empowered to turn the lives of suspects inside out.

As reported by the World Bank, the recent Covid-19 pandemic provided a great push to implement digital social welfare systems in the global South. In fact, for the World Bank, so-called Digital Public Infrastructure (DPI), enabling “Digitizing Government-to-Person Payments (G2Px)”, is as fundamental for social and economic development today as physical infrastructure was for previous generations. Hence, the World Bank finances systems around the globe modelled after the Indian Aadhaar system, under which more than a billion people have been registered biometrically. Aadhaar has become, for all intents and purposes, a precondition for 800 million Indian citizens to receive subsidised food and other assistance.

Important international aid organisations are not behaving differently from states. The World Food Programme alone holds data on more than 40 million people in its SCOPE database. Unfortunately, the WFP, like other UN organisations, is not subject to data protection laws or the jurisdiction of courts. This makes the communities they work with particularly vulnerable.

In most places, the social will become the metric, with logarithms determining the operational conduit for delivering, controlling and withholding assistance, especially welfare payments. In other places, the power of logarithms may go even further, as part of trust systems, creditworthiness, and social credit. Such social credit systems for individuals are highly controversial, as they require mass surveillance: they aim to track behaviour beyond financial solvency. A citizen’s social credit score might suffer not only from incomplete or inaccurate data, but also from assessments of political loyalty and conformist social behaviour…(More)”.
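
To see what the Rotterdam finding that the scores are “no better than random guessing” means in practice, here is a minimal sketch using entirely synthetic labels and scores and scikit-learn’s roc_auc_score. It illustrates how such a claim is checked (an AUC near 0.5 means ranking by risk score is uninformative), and why a fixed investigation budget is nonetheless spent on whoever happens to score highest. The 5% base rate and the 1,000-case budget are assumptions, not figures from the Rotterdam system.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
n_cases = 30_000  # roughly the number of recipients scored per year in Rotterdam

# Hypothetical ground truth: 1 = confirmed irregularity, 0 = none (assumed 5% base rate)
y_true = rng.binomial(1, 0.05, size=n_cases)

# Hypothetical risk scores that carry no information about the outcome
risk_scores = rng.uniform(0.0, 1.0, size=n_cases)

auc = roc_auc_score(y_true, risk_scores)
print(f"AUC = {auc:.3f}")  # ~0.5: ranking by score is no better than chance

# A fixed investigation budget is still spent on whoever happens to score highest
flagged = np.argsort(risk_scores)[-1_000:]
print(f"{len(flagged)} people flagged despite the score being uninformative")
```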

How Differential Privacy Will Affect Estimates of Air Pollution Exposure and Disparities in the United States


Article by Madalsa Singh: “Census data is crucial to understanding energy and environmental justice outcomes, such as poor air quality, which disproportionately impacts people of color in the U.S. With the advent of sophisticated personal datasets and analysis, the Census Bureau is considering adding top-down noise (differential privacy) to, and post-processing, 2020 census data to reduce the risk of identification of individual respondents. Using 2010 demonstration census and pollution data, I find that, compared to the original census, the differentially private (DP) census significantly changes ambient pollution exposure in areas with sparse populations. White Americans have the lowest variability, followed by Latinos, Asian, and Black Americans. DP underestimates pollution disparities for SO2 and PM2.5 while overestimating the disparities for PM10…(More)”.
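
To make the mechanism behind these shifts concrete, here is a minimal sketch of the top-down-noise idea: Laplace noise calibrated to a privacy budget is added to small-area counts, and post-processing forces the released values to be non-negative integers. This is a simplified illustration, not the Census Bureau’s actual TopDown algorithm, and all counts and parameter values below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def laplace_noisy_counts(true_counts, epsilon):
    """Add Laplace noise with scale sensitivity/epsilon (sensitivity is 1 for counts)."""
    noise = rng.laplace(loc=0.0, scale=1.0 / epsilon, size=len(true_counts))
    return np.asarray(true_counts, dtype=float) + noise

def post_process(noisy_counts):
    """Simple post-processing: clip negatives and round to whole persons."""
    return np.clip(np.round(noisy_counts), 0, None).astype(int)

# Hypothetical block-level population counts for a sparsely populated area
true_counts = [3, 0, 12, 1, 250]
released = post_process(laplace_noisy_counts(true_counts, epsilon=0.5))
print(true_counts, released.tolist())
```

Because the noise scale depends only on the privacy budget and not on the size of the true count, sparsely populated blocks suffer the largest relative distortion, which is what drives the changes in estimated pollution exposure and disparities.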

Yes, No, Maybe? Legal & Ethical Considerations for Informed Consent in Data Sharing and Integration


Report by Deja Kemp, Amy Hawn Nelson, & Della Jenkins: “Data sharing and integration are increasingly commonplace at every level of government, as cross-program and cross-sector data provide valuable insights to inform resource allocation, guide program implementation, and evaluate policies. Data sharing, while routine, is not without risks, and clear legal frameworks for data sharing are essential to mitigate those risks, protect privacy, and guide responsible data use. In some cases, federal privacy laws offer clear consent requirements and outline explicit exceptions where consent is not required to share data. In other cases, the law is unclear or silent regarding whether consent is needed for data sharing. Importantly, consent can present both ethical and logistical challenges, particularly when integrating cross-sector data. This brief will frame out key concepts related to consent; explore major federal laws governing the sharing of administrative data, including individually identifiable information; and examine important ethical implications of consent, particularly in cases when the law is silent or unclear. Finally, this brief will outline the foundational role of strong governance and consent frameworks in ensuring ethical data use and offer technical alternatives to consent that may be appropriate for certain data uses….(More)”.

Generative Artificial Intelligence and Data Privacy: A Primer


Report by Congressional Research Service: “Since the public release of OpenAI’s ChatGPT, Google’s Bard, and other similar systems, some Members of Congress have expressed interest in the risks associated with “generative artificial intelligence (AI).” Although exact definitions vary, generative AI is a type of AI that can generate new content—such as text, images, and videos—through learning patterns from pre-existing data.
It is a broad term that may include various technologies and techniques from AI and machine learning (ML). Generative AI models have received significant attention and scrutiny due to their potential harms, such as risks involving privacy, misinformation, copyright, and non-consensual sexual imagery. This report focuses on privacy issues and relevant policy considerations for Congress. Some policymakers and stakeholders have raised privacy concerns about how individual data may be used to develop and deploy generative models. These concerns are not new or unique to generative AI, but the scale, scope, and capacity of such technologies may present new privacy challenges for Congress…(More)”.

The latest in homomorphic encryption: A game-changer shaping up


Article by Katharina Koerner: “Privacy professionals are witnessing a revolution in privacy technology. The emergence and maturation of new privacy-enhancing technologies (PETs) that allow for data use and collaboration without sharing plain text data or sending data to a central location are part of this revolution.

The United Nations, the Organisation for Economic Co-operation and Development, the U.S. White House, the European Union Agency for Cybersecurity, the UK Royal Society, and Singapore’s media and privacy authorities all released reports and guidelines and set up regulatory sandboxes around the use of PETs in quick succession. We are in an era where there are high hopes for data insights to be leveraged for the public good while maintaining privacy principles and enhanced security.

A prominent example of a PET is fully homomorphic encryption, often mentioned in the same breath as differential privacy, federated learning, secure multiparty computation, private set intersection, synthetic data, zero knowledge proofs or trusted execution environments.

As FHE advances and becomes standardized, it has the potential to revolutionize the way we handle, protect and utilize personal data. Staying informed about the latest advancements in this field can help privacy pros prepare for the changes ahead in this rapidly evolving digital landscape.

Homomorphic encryption: A game changer?

FHE is a groundbreaking cryptographic technique that enables third parties to process information without revealing the data itself by running computations on encrypted data.

This technology can have far-reaching implications for secure data analytics. Requests to a databank can be answered without accessing its plain text data, as the analysis is conducted on data that remains encrypted. This adds a third layer of security for data when in use, along with protecting data at rest and in transit…(More)”.
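
To illustrate what “computations on encrypted data” looks like, the sketch below implements the additively homomorphic Paillier scheme, a simpler relative of FHE that supports only addition of ciphertexts (FHE schemes support arbitrary computation). The tiny hard-coded primes keep the example readable and provide no real security.

```python
import math
import random

# Toy parameters: real deployments use primes of 1024+ bits.
p, q = 293, 433
n = p * q                       # public modulus
n_sq = n * n
lam = math.lcm(p - 1, q - 1)    # Carmichael function of n
mu = pow(lam, -1, n)            # modular inverse of lambda mod n (private key part)

def encrypt(m):
    """Paillier encryption with g = n + 1: c = (1 + n)^m * r^n mod n^2."""
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(1 + n, m, n_sq) * pow(r, n, n_sq)) % n_sq

def decrypt(c):
    """m = L(c^lambda mod n^2) * mu mod n, where L(x) = (x - 1) // n."""
    x = pow(c, lam, n_sq)
    return ((x - 1) // n * mu) % n

def add_encrypted(c1, c2):
    """Adding plaintexts corresponds to multiplying ciphertexts."""
    return (c1 * c2) % n_sq

# A third party can aggregate values it cannot read:
c_total = add_encrypted(encrypt(1200), encrypt(345))
assert decrypt(c_total) == 1545
```

The party running add_encrypted never needs the private key; only the data owner holding lam and mu can decrypt the aggregated result.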

Data Privacy and Algorithmic Inequality


Paper by Zhuang Liu, Michael Sockin & Wei Xiong: “This paper develops a foundation for a consumer’s preference for data privacy by linking it to the desire to hide behavioral vulnerabilities. Data sharing with digital platforms enhances the matching efficiency for standard consumption goods, but also exposes individuals with self-control issues to temptation goods. This creates a new form of inequality in the digital era—algorithmic inequality. Although data privacy regulations provide consumers with the option to opt out of data sharing, these regulations cannot fully protect vulnerable consumers because of data-sharing externalities. The coordination problem among consumers may also lead to multiple equilibria with drastically different levels of data sharing by consumers. Our quantitative analysis further illustrates that although data is non-rival and beneficial to social welfare, it can also exacerbate algorithmic inequality…(More)”.

Accept All: Unacceptable? 


Report by Demos and Schillings: “…sought to investigate how our data footprints are being created and exploited online. It involved an exploratory investigation into how data sharing and data regulation practices are impacting citizens: how individuals’ data footprints are created, what people experience when they want to exercise their data rights, and how they feel about how their data is being used. This was a novel approach, following live case studies as participants embarked on a data odyssey in order to understand, in real time, the data challenges people face.

We then held a series of stakeholder roundtables with academics, lawyers, technologists, and people working in industry and civil society, which focused on diagnosing the problems and on what potential solutions already look like, or could look like in the future, across multiple stakeholder groups…(More)”. See also the documentary produced alongside this report by the project partners, law firm Schillings and the independent consumer data action service Rightly, together with TVN, here.

The Future of Consent: The Coming Revolution in Privacy and Consumer Trust


Report by Ogilvy: “The future of consent will be determined by how we – as individuals, nations, and a global species – evolve our understanding of what counts as meaningful consent. For consumers and users, the greatest challenge lies in connecting consent to a mechanism of relevant, personal control over their data. For businesses and other organizations, the task will be to recast consent as a driver of positive economic outcomes, rather than an obstacle.

In the coming years of digital privacy innovation, regulation, and increasing market maturity, everyone will need to think more deeply about their relationship with consent. As an initial step, we’ve assembled this snapshot on the current and future state of (meaningful) consent: what it means, what the obstacles are, and which critical changes we need to embrace to evolve…(More)”.

The Surveillance Ad Model Is Toxic — Let’s Not Install Something Worse


Article by Elizabeth M. Renieris: “At this stage, law and policy makers, civil society and academic researchers largely agree that the existing business model of the Web — algorithmically targeted behavioural advertising based on personal data, sometimes also referred to as surveillance advertising — is toxic. They blame it for everything from the erosion of individual privacy to the breakdown of democracy. Efforts to address this toxicity have largely focused on a flurry of new laws (and legislative proposals) requiring enhanced notice to, and consent from, users and limiting the sharing or sale of personal data by third parties and data brokers, as well as the application of existing laws to challenge ad-targeting practices.

In response to the changing regulatory landscape and zeitgeist, industry is also adjusting its practices. For example, Google has introduced its Privacy Sandbox, a project that includes a planned phaseout of third-party cookies from its Chrome browser — a move that, although lagging behind other browsers, is nonetheless significant given Google’s market share. And Apple has arguably dealt one of the biggest blows to the existing paradigm with the introduction of its AppTrackingTransparency (ATT) tool, which requires apps to obtain specific, opt-in consent from iPhone users before collecting and sharing their data for tracking purposes. The ATT effectively prevents apps from collecting a user’s Identifier for Advertisers, or IDFA, which is a unique Apple identifier that allows companies to recognize a user’s device and track its activity across apps and websites.

But the shift away from third-party cookies on the Web and third-party tracking of mobile device identifiers does not equate to the end of tracking or even targeted ads; it just changes who is doing the tracking or targeting and how they go about it. Specifically, it doesn’t provide any privacy protections from first parties, who are more likely to be hegemonic platforms with the most user data. The large walled gardens of Apple, Google and Meta will be less impacted than smaller players with limited first-party data at their disposal…(More)”.