Federated machine learning in data-protection-compliant research


Paper by Alissa Brauneck et al.: “In recent years, interest in machine learning (ML) as well as in multi-institutional collaborations has grown, especially in the medical field. However, strict application of data-protection laws reduces the size of training datasets, hurts the performance of ML systems and, in the worst case, can prevent the implementation of research insights in clinical practice. Federated learning can help overcome this bottleneck through decentralised training of ML models within the local data environment, while maintaining the predictive performance of ‘classical’ ML. Thus, federated learning provides immense benefits for cross-institutional collaboration by avoiding the sharing of sensitive personal data (Fig. 1; refs.). Because existing regulations (especially the General Data Protection Regulation 2016/679 of the European Union, or GDPR) set stringent requirements for medical data and rather vague rules for ML systems, researchers are faced with uncertainty. In this comment, we provide recommendations for researchers who intend to use federated learning, a privacy-preserving ML technique, in their research. We also point to areas where regulations are lacking, discussing some fundamental conceptual problems with ML regulation through the GDPR, related especially to notions of transparency, fairness and error-free data. We then provide an outlook on how implications from data-protection laws can be directly incorporated into federated learning tools…(More)”.

Computational Social Science for the Public Good: Towards a Taxonomy of Governance and Policy Challenges


Chapter by Stefaan G. Verhulst: “Computational Social Science (CSS) has grown exponentially as the process of datafication and computation has increased. This expansion, however, is yet to translate into effective actions to strengthen public good in the form of policy insights and interventions. This chapter presents 20 limiting factors in how data is accessed and analysed in the field of CSS. The challenges are grouped into the following six categories based on their area of direct impact: Data Ecosystem, Data Governance, Research Design, Computational Structures and Processes, the Scientific Ecosystem, and Societal Impact. Through this chapter, we seek to construct a taxonomy of CSS governance and policy challenges. By first identifying the problems, we can then move to effectively address them through research, funding, and governance agendas that drive stronger outcomes…(More)”. Full Book: Handbook of Computational Social Science for Policy

Predicting Socio-Economic Well-being Using Mobile Apps Data: A Case Study of France


Paper by Rahul Goel, Angelo Furno, and Rajesh Sharma: “Socio-economic indicators provide context for assessing a country’s overall condition. These indicators contain information about education, gender, poverty, employment, and other factors. Therefore, reliable and accurate information is critical for social research and government policing. Most data sources available today, such as censuses, have sparse population coverage or are updated infrequently. Nonetheless, alternative data sources, such as call data records (CDR) and mobile app usage, can serve as cost-effective and up-to-date sources for identifying socio-economic indicators.
This work investigates mobile app data to predict socio-economic features. We present a large-scale study using data that captures the traffic of thousands of mobile applications by approximately 30 million users distributed over 550,000 km square and served by over 25,000 base stations. The dataset covers the whole France territory and spans more than 2.5 months, starting from 16th March 2019 to 6th June 2019. Using the app usage patterns, our best model can estimate socio-economic indicators (attaining an R-squared score up to 0.66). Furthermore, using models’ explainability, we discover that mobile app usage patterns have the potential to reveal socio-economic disparities in IRIS. Insights of this study provide several avenues for future interventions, including users’ temporal network analysis and exploration of alternative data sources…(More)”.

The Health of Democracies During the Pandemic: Results from a Randomized Survey Experiment


Paper by Marcella Alsan et al.: “Concerns have been raised about the “demise of democracy”, possibly accelerated by pandemic-related restrictions. Using a survey experiment involving 8,206 respondents from five Western democracies, we find that subjects randomly exposed to information regarding civil liberties infringements undertaken by China and South Korea to contain COVID-19 became less willing to sacrifice rights and more worried about their long-term erosion. However, our treatment did not increase support for democratic procedures more generally, despite our prior evidence that pandemic-related health risks diminished such support. These results suggest that the start of the COVID-19 crisis was a particularly vulnerable time for democracies…(More)”.

Data Is What Data Does: Regulating Use, Harm, and Risk Instead of Sensitive Data


Paper by Daniel J. Solove: “Heightened protection for sensitive data is becoming quite trendy in privacy laws around the world. Originating in European Union (EU) data protection law and included in the EU’s General Data Protection Regulation (GDPR), sensitive data singles out certain categories of personal data for extra protection. Commonly recognized special categories of sensitive data include racial or ethnic origin, political opinions, religious or philosophical beliefs, trade union membership, health, sexual orientation and sex life, biometric data, and genetic data.

Although heightened protection for sensitive data appropriately recognizes that not all situations involving personal data should be protected uniformly, the sensitive data approach is a dead end. The sensitive data categories are arbitrary and lack any coherent theory for identifying them. The borderlines of many categories are so blurry that they are useless. Moreover, it is easy to use non-sensitive data as a proxy for certain types of sensitive data.

Personal data is akin to a grand tapestry, with different types of data interwoven to a degree that makes it impossible to separate out the strands. With Big Data and powerful machine learning algorithms, most non-sensitive data can give rise to inferences about sensitive data. In many privacy laws, data that can give rise to inferences about sensitive data is also protected as sensitive data. Arguably, then, nearly all personal data can be sensitive, and the sensitive data categories can swallow up everything. As a result, most organizations are currently processing a vast amount of data in violation of the laws.

This Article argues that the problems with the sensitive data approach make it unworkable and counterproductive — as well as expose a deeper flaw at the root of many privacy laws. These laws make a fundamental conceptual mistake — they embrace the idea that the nature of personal data is a sufficiently useful focal point for the law. But nothing meaningful for regulation can be determined solely by looking at the data itself. Data is what data does. Personal data is harmful when its use causes harm or creates a risk of harm. It is not harmful if it is not used in a way to cause harm or risk of harm.

To be effective, privacy law must focus on use, harm, and risk rather than on the nature of personal data. The implications of this point extend far beyond sensitive data provisions. In many elements of privacy laws, protections should be based on the use of personal data and proportionate to the harm and risk involved with those uses…(More)”.

Why Do Innovations Fail? Lessons Learned from a Digital Democratic Innovation


Paper by Jenny Lindholm and Janne Berg: “Democratic innovations are brought forward by political scientists as a response to worrying democratic deficits. This paper aims to evaluate the design, process, and outcome of digital democratic innovations. We study a mobile application for following local politics. Data is collected using three online surveys with different groups, and a workshop with young citizens. The results show that the app did not fully meet the democratic ideal of inclusiveness at the process stage, especially in reaching young people. However, the user groups that had used the app reported positive democratic effects…(More)”.

Who owns the map? Data sovereignty and government spatial data collection, use, and dissemination


Paper by Peter A. Johnson and Teresa Scassa: “Maps, created through the collection, assembly, and analysis of spatial data are used to support government planning and decision-making. Traditionally, spatial data used to create maps are collected, controlled, and disseminated by government, although over time, this role has shifted. This shift has been driven by the availability of alternate sources of data collected by private sector companies, and data contributed by volunteers to open mapping platforms, such as OpenStreetMap. In theorizing this shift, we provide examples of how governments use data sovereignty as a tool to shape spatial data collection, use, and sharing. We frame four models of how governments may navigate shifting spatial data sovereignty regimes; first, with government retaining complete control over data collection; second, with government contracting a third party to provide specific data collection services, but with data ownership and dissemination responsibilities resting with government; third, with government purchasing data under terms of access set by third party data collectors, who disseminate data to several parties, and finally, with government retreating from or relinquishing data sovereignty altogether. Within this rapidly changing landscape of data providers, we propose that governments must consider how to address data sovereignty concerns to retain their ability to control data use in the public interest…(More)”.

Nudge and Nudging in Public Policy


Paper by Sanchayan Banerjee and Peter John: “Nudging has been used to make public policies widely, in various fields such as personal finance, health, education, environment/climate, privacy, law, and human well-being. Nonetheless, with an increase in the applications of nudging, the toolkit of nudges also expanded massively, which ultimately led to multiple different conceptualisations and definitions of the nudge. In this entry, we review developments to nudge and nudging in public policy. First, we briefly discuss the political philosophy and psychological paradigm behind the conventional nudge, and examples of economically modelling nudge applications. Then, we highlight the role of nudges in behavioural public policy, an emerging subdiscipline of public policy which uses insights from behavioural sciences to develop new policies. We review the many definitions of nudge and introduce alternative toolkits of behaviour change, such as thinks, boosts, nudge+. We conclude with a discussion on the limitations of nudging in public policy and future research in behavioural public policy…(More)”.

Experiments of Living Constitutionalism


Paper by Cass R. Sunstein: “Experiments of Living Constitutionalism urges that the Constitution should be interpreted so as to allow both individuals and groups to experiment with different ways of living, whether we are speaking of religious practices, family arrangements, political associations, civic associations, child-rearing, schooling, romance, or work. Experiments of Living Constitutionalism prizes diversity and plurality; it gives pride of place to freedom of speech, freedom of association, and free exercise of religion (which it would protect against the imposition of secular values); it cherishes federalism; it opposes authoritarianism in all its forms. While Experiments of Living Constitutionalism has considerable appeal, my purpose in naming it is not to endorse or defend it, but as a thought experiment and to contrast it to Common Good Constitutionalism, with the aim of specifying the criteria on which one might embrace or defend any approach to constitutional law. My central conclusion is that we cannot know whether to accept or reject Experiments of Living Constitutionalism, Common Good Constitutionalism, Common Law Constitutionalism, democracy-reinforcing approaches, moral readings, originalism, or any other proposed approach without a concrete sense of what it entails – of what kind of constitutional order it would likely bring about or produce. No approach to constitutional interpretation can be evaluated without asking how it fits with the evaluator’s “fixed points,” which operate at multiple levels of generality. The search for reflective equilibrium is essential in deciding whether to accept a theory of constitutional interpretation…(More)”.

Studying open government data: Acknowledging practices and politics


Paper by Gijs van Maanen: “Open government and open data are often presented as the Asterix and Obelix of modern government—one cannot discuss one, without involving the other. Modern government, in this narrative, should open itself up, be more transparent, and allow the governed to have a say in their governance. The usage of technologies, and especially the communication of governmental data, is then thought to be one of the crucial instruments helping governments achieve these goals. Much open government data research, hence, focuses on the publication of open government data, their reuse, and re-users. Recent research trends, by contrast, divert from this focus on data and emphasize the importance of studying open government data in practice, in interaction with practitioners, while simultaneously paying attention to their political character. This commentary looks more closely at the implications of emphasizing the practical and political dimensions of open government data. It argues that researchers should explicate how and in what way open government data policies present solutions to what kind of problems. Such explications should be based on a detailed empirical analysis of how different actors do or do not do open data. The key question to be continuously asked and answered when studying and implementing open government data is how the solutions openness present latch onto the problem they aim to solve…(More)”.