Leveraging Data for Racial Equity in Workforce Opportunity


Report by CODE: “Across many decades, obstacles to gainful employment have limited the ability of Black Americans and other people of color to obtain well-paying jobs that create wealth and contribute to health and well-being.

A dearth of opportunity in the job market is related to inequalities in education, bias in hiring, and other forms of systemic inequality in the U.S.

Over time, federal efforts have addressed the need to increase diversity, equity, and inclusion in the government workforce, and promoted similar changes in the broader society. While these efforts have brought progress, they have not been entirely effective. At the same time, federal action has made new kinds of data available—data that can shed light on some of the historic drivers of workforce inequity and help inform solutions to their ongoing impact.

This report explores a number of current opportunities to strengthen longstanding data-driven tools to address workforce inequity. The report shows how the effects of workforce discrimination and other historic practices are still being felt today. At the same time, it outlines opportunities to apply data to increase equity in many areas related to the workforce gap, including disparities in health and well-being, socioeconomic status, and housing insecurity…(More)”.

Because Data Can’t Speak for Itself


“A Practical Guide to Telling Persuasive Policy Stories” by David Chrisinger and Lauren Brodsky: “People with important evidence-based ideas often struggle to translate data into stories their readers can relate to and understand. And if leaders can’t communicate well to their audience, they will not be able to make important changes in the world.

Why do some evidence-based ideas thrive while others die? And how do we improve the chances of worthy ideas? In Because Data Can’t Speak for Itself, accomplished educators and writers David Chrisinger and Lauren Brodsky tackle these questions head-on. They reveal the parts and functions of effective data-driven stories and explain myriad ways to turn your data dump into a narrative that can inform, persuade, and inspire action.

Chrisinger and Brodsky show that convincing data-driven stories draw their power from the same three traits, which they call people, purpose, and persistence. Writers need to find the real people behind the numbers and share their stories. At the same time, they need to remember their own purpose and be honest about what data says—and, just as importantly, what it does not.

Compelling and concise, this fast-paced tour of success stories—and several failures—includes examples on topics such as COVID-19, public diplomacy, and criminal justice…(More)”

Big Data and Public Policy


Book by Rebecca Moody and Victor Bekkers: “This book provides a comprehensive overview of how the course, content and outcome of policy making is affected by big data. It scrutinises the notion that big and open data makes policymaking a more rational process, in which policy makers are able to predict, assess and evaluate societal problems. It also examines how policy makers deal with big data, the problems and limitations they face, and how big data shapes policymaking on the ground. The book considers big data from various perspectives, not just the political, but also the technological, legal, institutional and ethical dimensions. The potential of big data use in the public sector is also assessed, as well as the risks and dangers this might pose. Through several extended case studies, it demonstrates the dynamics of big data and public policy. Offering a holistic approach to the study of big data, this book will appeal to students and scholars of public policy, public administration and data science, as well as those interested in governance and politics…(More)”.

8 Strategies for Chief Data Officers to Create — and Demonstrate — Value


Article by Thomas H. Davenport, Richard Y. Wang, and Priyanka Tiwari: “The chief data officer (CDO) role was only established in 2002, but it has grown enormously since then. In one recent survey of large companies, 83% reported having a CDO. This isn’t surprising: Data and approaches to understanding it (analytics and AI) are incredibly important in contemporary organizations. What is eyebrow-raising, however, is that the CDO job is terribly ill-defined. Sixty-two percent of CDOs surveyed in the research we describe below reported that the CDO role is poorly understood, and incumbents have often faced diffuse expectations and short tenures. There is a clear need for CDOs to focus on adding visible value to their organizations.

Part of the problem is that traditional data management approaches are unlikely to provide visible value in themselves. Many nontechnical executives don’t really understand the CDO’s work and struggle to recognize when it’s being done well. CDOs are often asked to focus on preventing data problems (defense-oriented initiatives) and such data management projects as improving data architectures, data governance, and data quality. But data will never be perfect, meaning executives will always be somewhat frustrated with their organization’s data situation. And while improvements in data management are difficult to recognize or measure, major problems such as hacks, breaches, lost or inaccessible data, or poor quality are all too easy to see.

So how can CDOs demonstrate that they’re creating value?…(More)”

Data Free Disney


Essay by Janet Vertesy: “…Once upon a time, you could just go to Disneyland. You could get tickets at the gates, stand in line for rides, buy food and tchotchkes, even pick up copies of your favorite Disney movies at a local store. It wasn’t even that long ago. The last time I visited, in 2010, the company didn’t record what I ate for dinner or detect that I went on Pirates of the Caribbean five times. It was none of their business.

But sometime in the last few years, tracking and tracing became their business. Like many corporations out there, Walt Disney Studios spent the last decade transforming into a data company.

The theme parks alone are a data scientist’s dream. Just imagine: 50,000 visitors a day, most equipped with cell phones and a specialized app. Millions of location traces, along with rides statistics, lineup times, and food-order preferences. Thousands and thousands of credit card swipes, each populating a database with names and addresses, each one linking purchases across the park grounds. A QR-code scavenger hunt that records the path people took through Star Wars: Galaxy’s Edge. Hotel keycards with entrance times, purchases, snack orders, and more. Millions of photos snapped on rides and security cameras throughout the park, feeding facial-recognition systems. Tickets with names, birthdates, and portraits attached. At Florida’s Disney World, MagicBands—bracelets using RFID (radio-frequency identification) technology—around visitors’ wrists gather all that information plus fingerprints in one place, while sensors ambiently detect their every move. What couldn’t you do with all that data?…(More)”.

Secondary data for global health digitalisation


Paper by Anatol-Fiete Näher, et al: “Substantial opportunities for global health intelligence and research arise from the combined and optimised use of secondary data within data ecosystems. Secondary data are information being used for purposes other than those intended when they were collected. These data can be gathered from sources on the verge of widespread use such as the internet, wearables, mobile phone apps, electronic health records, or genome sequencing. To utilise their full potential, we offer guidance by outlining available sources and approaches for the processing of secondary data. Furthermore, we propose indicators for the regulatory and ethical evaluation of strategies for the best use of secondary data, as well as criteria for assessing reusability. This overview supports more precise and effective policy decision making, leading to earlier detection and better prevention of emerging health threats than is currently the case…(More)”.

Engineering Personal Data Sharing


Report by ENISA: “This report takes a closer look at specific use cases relating to personal data sharing, primarily in the health sector, and discusses how specific technologies and considerations of implementation can support the meeting of specific data protection principles. After discussing some challenges in (personal) data sharing, this report demonstrates how to engineer specific technologies and techniques in order to enable privacy-preserving data sharing. More specifically, it discusses specific use cases for sharing data in the health sector, with the aim of demonstrating how data protection principles can be met through the proper use of technological solutions relying on advanced cryptographic techniques. Next, it discusses data sharing that takes place as part of another process or service, where the data is processed through some secondary channel or entity before reaching its primary recipient. Lastly, it identifies challenges, considerations and possible architectural solutions for intervenability aspects (such as the right to erasure and the right to rectification when sharing data)…(More)”.

Rethinking the impact of open data: A first step towards a European impact assessment for open data


Report for data.europa.eu: “This report is the first in a series of four that aims to establish a standard methodology for open data impact assessments that can be used across Europe. This exercise is key because a consistent definition of the impact of open data does not exist. The lack of a robust, conceptual foundation has made it more difficult for data portals to demonstrate their value through empirical evidence. It also limits the EU’s ability to understand and compare performance across Member States. Most academic articles that explore the impact of data refer to existing open data frameworks, with the open data maturity (ODM) and open data barometer (ODB) frameworks the most frequently represented. These two frameworks distinguish between different kinds of impact, and both mention social, political and economic impacts in particular. The ODM also includes the environmental impact in its framework.

Sometimes, these frameworks diverge from the European Commission’s own recommendations on how best to measure impact, as explained in specific sections of the better regulation guidelines and the better regulation toolbox. These guidelines help to answer a critical question for policymakers: do the benefits provided outweigh the costs of assembling and distributing (open) data? Future reports in this series will further explore how to better align existing frameworks, such as the ODM, with these critically important guidelines…(More)”.

Big Data and the Law of War


Essay by Paul Stephan: “Big data looms large in today’s world. Much of the tech sector regards the building up of large sets of searchable data as part (sometimes the greater part) of its business model. Surveillance-oriented states, of which China is the foremost example, use big data to guide and bolster monitoring of their own people as well as potential foreign threats. Many other states are not far behind in the surveillance arms race, notwithstanding the attempts of the European Union to put its metaphorical finger in the dike. Finally, ChatGPT has revived popular interest in artificial intelligence (AI) as a cultural, economic, and social phenomenon; AI uses big data to optimize the training and algorithm design on which it depends.

If big data is growing in significance, might it join territory, people, and property as objects of international conflict, including armed conflict? So far it has not been front and center in Russia’s invasion of Ukraine, the war that currently consumes much of our attention. But future conflicts could certainly feature attacks on big data. China and Taiwan, for example, both have sophisticated technological infrastructures that encompass big data and AI capabilities. The risk that they might find themselves at war in the near future is larger than anyone would like. What, then, might the law of war have to say about big data? More generally, if existing law does not meet our needs, how might new international law address the issue?

In a recent essay, part of an edited volume on “The Future Law of Armed Conflict,” I argue that big data is a resource and therefore a potential target in an armed conflict. I address two issues: Under the law governing the legality of war (jus ad bellum), what kinds of attacks on big data might justify an armed response, touching off a bilateral (or multilateral) armed conflict (a war)? And within an existing armed conflict, what are the rules (jus in bello, also known as international humanitarian law, or IHL) governing such attacks?

The distinction is meaningful. If cyber operations rise to the level of an armed attack, then the targeted state has, according to Article 51 of the U.N. Charter, an “inherent right” to respond with armed force. Moreover, the target need not confine its response to a symmetrical cyber operation. Once attacked, a state may use all forms of armed force in response, albeit subject to the restrictions imposed by IHL. If the state regards, say, a takedown of its financial system as an armed attack, it may respond with missiles…(More)”.

Federated machine learning in data-protection-compliant research


Paper by Alissa Brauneck et al.: “In recent years, interest in machine learning (ML) as well as in multi-institutional collaborations has grown, especially in the medical field. However, strict application of data-protection laws reduces the size of training datasets, hurts the performance of ML systems and, in the worst case, can prevent the implementation of research insights in clinical practice. Federated learning can help overcome this bottleneck through decentralised training of ML models within the local data environment, while maintaining the predictive performance of ‘classical’ ML. Thus, federated learning provides immense benefits for cross-institutional collaboration by avoiding the sharing of sensitive personal data (Fig. 1). Because existing regulations (especially the General Data Protection Regulation 2016/679 of the European Union, or GDPR) set stringent requirements for medical data and rather vague rules for ML systems, researchers are faced with uncertainty. In this comment, we provide recommendations for researchers who intend to use federated learning, a privacy-preserving ML technique, in their research. We also point to areas where regulations are lacking, discussing some fundamental conceptual problems with ML regulation through the GDPR, related especially to notions of transparency, fairness and error-free data. We then provide an outlook on how implications from data-protection laws can be directly incorporated into federated learning tools…(More)”.
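The decentralised training the excerpt describes is typically implemented as federated averaging: each institution fits a model on its own records, and only the resulting parameters (never the raw data) are pooled by a coordinating server. A minimal sketch of that idea, using a toy linear model and invented data rather than anything from the paper:

```python
import numpy as np

def local_update(w, X, y, lr=0.1, epochs=50):
    """One institution's gradient-descent update on its private data."""
    w = w.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def fed_avg(clients, rounds=20, dim=2):
    """Federated averaging: pool model weights, never raw records."""
    w_global = np.zeros(dim)
    for _ in range(rounds):
        # Each site trains locally on data that never leaves its premises.
        local_ws = [local_update(w_global, X, y) for X, y in clients]
        # The server averages the weights, weighted by local dataset size.
        sizes = np.array([len(y) for _, y in clients], dtype=float)
        w_global = np.average(local_ws, axis=0, weights=sizes)
    return w_global

# Three hypothetical institutions, each holding a private dataset drawn
# from the same underlying relationship (true_w is illustrative only).
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(100, 2))
    y = X @ true_w + rng.normal(scale=0.05, size=100)
    clients.append((X, y))

w = fed_avg(clients)
print(np.round(w, 2))  # close to true_w, without any data pooling
```

The sketch omits the safeguards a real deployment needs (secure aggregation, differential privacy, authenticated transport), which is precisely the gap between GDPR requirements and current federated learning tooling that the paper addresses.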