Understanding how to build a social licence for using novel linked datasets for planning and research in Kent, Surrey and Sussex: results of deliberative focus groups.


Paper by Elizabeth Ford et al: “Digital programmes in the newly created NHS integrated care boards (ICBs) in the United Kingdom mean that curation and linkage of anonymised patient data is underway in many areas for the first time. In Kent, Surrey and Sussex (KSS), in Southeast England, public health teams want to use these datasets to answer strategic population health questions, but public expectations around use of patient data are unknown… We aimed to engage with citizens of KSS to gather their views and expectations of data linkage and re-use, through deliberative discussions…
We held five 3-hour deliberative focus groups with 79 citizens of KSS, presenting information about potential uses of data, safeguards, and mechanisms for public involvement in governance and decision making about datasets. After each presentation, participants discussed their views in facilitated small groups which were recorded, transcribed and analysed thematically…
The focus groups generated 15 themes representing participants’ views on the benefits, risks and values for safeguarding linked data. Participants largely supported use of patient data to improve health service efficiency and resource management, preventative services and out of hospital care, joined-up services and information flows. Most participants expressed concerns about data accuracy, breaches and hacking, and worried about commercial use of data. They suggested that transparency of data usage through audit trails and clear information about accountability, ensuring data re-use does not perpetuate stigma and discrimination, ongoing, inclusive and valued involvement of the public in dataset decision-making, and a commitment to building trust, would meet their expectations for responsible data use…
Participants were largely favourable about the proposed uses of patient linked datasets but expected a commitment to transparency and public involvement. Findings were mapped to previous tenets of social license and can be used to inform ICB digital programme teams on how to proceed with use of linked datasets in a trustworthy and socially acceptable way…(More)”.

Citizen Participation and Knowledge Support in Urban Public Energy Transition—A Quadruple Helix Perspective


Paper by Peter Nijkamp et al: “Climate change, energy transition needs and the current energy crisis have prompted cities to implement far-reaching changes in public energy supply. The present paper seeks to map out the conditions for sustainable energy provision and use, with a particular view to the role of citizens in a quadruple helix context. Citizen participation is often seen as a sine qua non for a successful local or district energy policy in an urban area but needs due scientific and digital support based on evidence-based knowledge (using proper user-oriented techniques such as Q-analysis). The paper sets out to explore the citizen engagement and knowledge base for drastic energy transitions in the city based on the newly developed “diabolo” model, in which in particular digital tools (e.g., dashboards, digital twins) are proposed as useful tools for the interface between citizens and municipal policy. The approach adopted in this paper is empirically illustrated for local energy policy in the city of Rotterdam…(More)”.

Secondary data for global health digitalisation


Paper by Anatol-Fiete Näher et al: “Substantial opportunities for global health intelligence and research arise from the combined and optimised use of secondary data within data ecosystems. Secondary data are information being used for purposes other than those intended when they were collected. These data can be gathered from sources on the verge of widespread use such as the internet, wearables, mobile phone apps, electronic health records, or genome sequencing. To utilise their full potential, we offer guidance by outlining available sources and approaches for the processing of secondary data. Furthermore, in addition to indicators for the regulatory and ethical evaluation of strategies for the best use of secondary data, we also propose criteria for assessing reusability. This overview supports more precise and effective policy decision making leading to earlier detection and better prevention of emerging health threats than is currently the case…(More)”.

Measuring Partial Democracies: Rules and their Implementation


Paper by Debarati Basu, Shabana Mitra & Archana Purohit: “This paper proposes a new index that focuses on capturing the extent of democracy in a country using not only the existence of rules but also the extent of their implementation. The measure, based on the axiomatically robust framework of (Alkire and Foster, J Public Econ 95:476–487, 2011), is able to moderate the existence of democratic rules by their actual implementation. By doing this we have a meaningful way of capturing the notion of a partial democracy within a continuum between non-democratic and democratic, separating out situations when the rules exist but are not implemented well. We construct our index using V-Dem data from 1900 to 2010 for over 100 countries to measure the process of democratization across the world. Our results show that we can track the progress in democratization, even when the regime remains either a democracy or an autarchy. This is the notion of partial democracy that our implementation-based index measures through a wide-based index that is consistent, replicable, extendable, easy to interpret, and more nuanced in its ability to capture the essence of democracy…(More)”.

Federated machine learning in data-protection-compliant research


Paper by Alissa Brauneck et al: “In recent years, interest in machine learning (ML) as well as in multi-institutional collaborations has grown, especially in the medical field. However, strict application of data-protection laws reduces the size of training datasets, hurts the performance of ML systems and, in the worst case, can prevent the implementation of research insights in clinical practice. Federated learning can help overcome this bottleneck through decentralised training of ML models within the local data environment, while maintaining the predictive performance of ‘classical’ ML. Thus, federated learning provides immense benefits for cross-institutional collaboration by avoiding the sharing of sensitive personal data. Because existing regulations (especially the General Data Protection Regulation 2016/679 of the European Union, or GDPR) set stringent requirements for medical data and rather vague rules for ML systems, researchers are faced with uncertainty. In this comment, we provide recommendations for researchers who intend to use federated learning, a privacy-preserving ML technique, in their research. We also point to areas where regulations are lacking, discussing some fundamental conceptual problems with ML regulation through the GDPR, related especially to notions of transparency, fairness and error-free data. We then provide an outlook on how implications from data-protection laws can be directly incorporated into federated learning tools…(More)”.
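
To make the idea of decentralised training concrete, here is a minimal sketch of federated averaging on a toy linear model. It is an illustration only, not the method or tooling described in the paper: the function names (`local_update`, `federated_average`), the two "sites", the learning rate, and the data are all hypothetical. Each site fits the model on its own records, and only the model parameters, never the records, are sent to the coordinator, which averages them weighted by site size.

```python
from typing import List, Tuple

# Hypothetical local dataset: (x, y) pairs that never leave the site.
Site = List[Tuple[float, float]]


def local_update(w: float, b: float, data: Site,
                 lr: float = 0.05, epochs: int = 5) -> Tuple[float, float]:
    """Run a few gradient-descent steps on one site's data (mean squared error)."""
    n = len(data)
    for _ in range(epochs):
        grad_w = sum((w * x + b - y) * x for x, y in data) / n
        grad_b = sum((w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def federated_average(sites: List[Site], rounds: int = 500) -> Tuple[float, float]:
    """Coordinator loop: broadcast parameters, collect local updates, average them."""
    w, b = 0.0, 0.0
    total = sum(len(s) for s in sites)
    for _ in range(rounds):
        # Each site trains locally; only (w, b) and the record count are shared.
        updates = [(local_update(w, b, s), len(s)) for s in sites]
        w = sum(params[0] * n for params, n in updates) / total
        b = sum(params[1] * n for params, n in updates) / total
    return w, b


# Toy data generated from y = 2x + 1, split across two assumed institutions.
sites = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
]
w, b = federated_average(sites)
print(round(w, 3), round(b, 3))
```

In this toy setting the averaged model recovers roughly the same slope and intercept that pooled training would, even though no site sees the other's records. Real deployments add further safeguards (for example secure aggregation), which this sketch omits.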

Computational Social Science for the Public Good: Towards a Taxonomy of Governance and Policy Challenges


Chapter by Stefaan G. Verhulst: “Computational Social Science (CSS) has grown exponentially as the process of datafication and computation has increased. This expansion, however, is yet to translate into effective actions to strengthen public good in the form of policy insights and interventions. This chapter presents 20 limiting factors in how data is accessed and analysed in the field of CSS. The challenges are grouped into the following six categories based on their area of direct impact: Data Ecosystem, Data Governance, Research Design, Computational Structures and Processes, the Scientific Ecosystem, and Societal Impact. Through this chapter, we seek to construct a taxonomy of CSS governance and policy challenges. By first identifying the problems, we can then move to effectively address them through research, funding, and governance agendas that drive stronger outcomes…(More)”. Full Book: Handbook of Computational Social Science for Policy

Predicting Socio-Economic Well-being Using Mobile Apps Data: A Case Study of France


Paper by Rahul Goel, Angelo Furno, and Rajesh Sharma: “Socio-economic indicators provide context for assessing a country’s overall condition. These indicators contain information about education, gender, poverty, employment, and other factors. Therefore, reliable and accurate information is critical for social research and government policing. Most data sources available today, such as censuses, have sparse population coverage or are updated infrequently. Nonetheless, alternative data sources, such as call data records (CDR) and mobile app usage, can serve as cost-effective and up-to-date sources for identifying socio-economic indicators.
This work investigates mobile app data to predict socio-economic features. We present a large-scale study using data that captures the traffic of thousands of mobile applications by approximately 30 million users distributed over 550,000 km² and served by over 25,000 base stations. The dataset covers the whole French territory and spans more than 2.5 months, starting from 16th March 2019 to 6th June 2019. Using the app usage patterns, our best model can estimate socio-economic indicators (attaining an R-squared score up to 0.66). Furthermore, using models’ explainability, we discover that mobile app usage patterns have the potential to reveal socio-economic disparities in IRIS. Insights of this study provide several avenues for future interventions, including users’ temporal network analysis and exploration of alternative data sources…(More)”.
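
For readers unfamiliar with the R-squared score quoted above, the sketch below shows how the statistic is conventionally computed: one minus the ratio of the residual sum of squares to the total sum of squares. The function name and the numbers are made up for illustration and are not taken from the study.

```python
from typing import Sequence


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical indicator values and model estimates for four areas.
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
score = r_squared(actual, predicted)
print(round(score, 3))  # 0.948
```

A score of 1 means the predictions match the observed values exactly, and a score of 0 means the model does no better than always predicting the mean.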

The Health of Democracies During the Pandemic: Results from a Randomized Survey Experiment


Paper by Marcella Alsan et al: “Concerns have been raised about the “demise of democracy”, possibly accelerated by pandemic-related restrictions. Using a survey experiment involving 8,206 respondents from five Western democracies, we find that subjects randomly exposed to information regarding civil liberties infringements undertaken by China and South Korea to contain COVID-19 became less willing to sacrifice rights and more worried about their long-term erosion. However, our treatment did not increase support for democratic procedures more generally, despite our prior evidence that pandemic-related health risks diminished such support. These results suggest that the start of the COVID-19 crisis was a particularly vulnerable time for democracies…(More)”.

Data Is What Data Does: Regulating Use, Harm, and Risk Instead of Sensitive Data


Paper by Daniel J. Solove: “Heightened protection for sensitive data is becoming quite trendy in privacy laws around the world. Originating in European Union (EU) data protection law and included in the EU’s General Data Protection Regulation (GDPR), sensitive data singles out certain categories of personal data for extra protection. Commonly recognized special categories of sensitive data include racial or ethnic origin, political opinions, religious or philosophical beliefs, trade union membership, health, sexual orientation and sex life, biometric data, and genetic data.

Although heightened protection for sensitive data appropriately recognizes that not all situations involving personal data should be protected uniformly, the sensitive data approach is a dead end. The sensitive data categories are arbitrary and lack any coherent theory for identifying them. The borderlines of many categories are so blurry that they are useless. Moreover, it is easy to use non-sensitive data as a proxy for certain types of sensitive data.

Personal data is akin to a grand tapestry, with different types of data interwoven to a degree that makes it impossible to separate out the strands. With Big Data and powerful machine learning algorithms, most non-sensitive data can give rise to inferences about sensitive data. In many privacy laws, data that can give rise to inferences about sensitive data is also protected as sensitive data. Arguably, then, nearly all personal data can be sensitive, and the sensitive data categories can swallow up everything. As a result, most organizations are currently processing a vast amount of data in violation of the laws.

This Article argues that the problems with the sensitive data approach make it unworkable and counterproductive — as well as expose a deeper flaw at the root of many privacy laws. These laws make a fundamental conceptual mistake — they embrace the idea that the nature of personal data is a sufficiently useful focal point for the law. But nothing meaningful for regulation can be determined solely by looking at the data itself. Data is what data does. Personal data is harmful when its use causes harm or creates a risk of harm. It is not harmful if it is not used in a way to cause harm or risk of harm.

To be effective, privacy law must focus on use, harm, and risk rather than on the nature of personal data. The implications of this point extend far beyond sensitive data provisions. In many elements of privacy laws, protections should be based on the use of personal data and proportionate to the harm and risk involved with those uses…(More)”.

Why Do Innovations Fail? Lessons Learned from a Digital Democratic Innovation


Paper by Jenny Lindholm and Janne Berg: “Democratic innovations are brought forward by political scientists as a response to worrying democratic deficits. This paper aims to evaluate the design, process, and outcome of digital democratic innovations. We study a mobile application for following local politics. Data is collected using three online surveys with different groups, and a workshop with young citizens. The results show that the app did not fully meet the democratic ideal of inclusiveness at the process stage, especially in reaching young people. However, the user groups that had used the app reported positive democratic effects…(More)”.