European Health Data Space


European Commission Press Release: “The set-up of the European Health Data Space will be an integral part of building a European Health Union, a process launched by the Commission today with a first set of proposals to reinforce preparedness and response during health crises. This is also a direct follow-up to the Data strategy adopted by the Commission in February this year, where the Commission had already stressed the importance of creating European data spaces, including on health….

In this perspective, as part of the implementation of the Data strategy, a data governance act is set to be presented still this year, which will support the reuse of public sensitive data such as health data. A dedicated legislative proposal on a European health data space is planned for next year, as set out in the 2021 Commission work programme.

As first steps, the following activities starting in 2021 will pave the way for better data-driven health care in Europe:

  • The Commission proposes a European Health Data Space in 2021;
  • A Joint Action with 22 Member States to propose options on governance, infrastructure, data quality, data solidarity and empowering citizens with regard to secondary health data use in the EU;
  • Investments to support the European Health Data Space under the EU4Health programme, as well as common data spaces and digital health related innovation under Horizon Europe and the Digital Europe programmes;
  • Engagement with relevant actors to develop targeted Codes of Conduct for secondary health data use;
  • A pilot project, to demonstrate the feasibility of cross border analysis for healthcare improvement, regulation and innovation;
  • Other EU funding opportunities for digital transformation of health and care will be available for Member States as of 2021 under Recovery and Resilience Facility, European Regional Development Fund, European Social Fund+, InvestEU.

The set of proposals adopted by the Commission today to strengthen the EU’s crisis preparedness and response, taking the first steps towards a European Health Union, also pave the way for the participation of the European Medicines Agency (EMA) and the European Centre for Disease Prevention and Control (ECDC) in the future European Health Data Space infrastructure, along with research institutes, public health bodies, and data permit authorities in the Member States….(More)”.

The responsible use of data for and about children: treading carefully and ethically


Q&A with Stefaan G. Verhulst and Andrew Young: “…working in collaboration with UNICEF on an initiative called Responsible Data for Children (RD4C). Its focus is on data – the risks it poses to children, as well as the opportunities it offers.

You have been working with UNICEF on the Responsible Data for Children initiative (RD4C). What is this and why do we need to be talking more about ‘responsible data’?

To date, the relationship between the datafication of everyday life and child welfare has been under-explored, both by researchers in data ethics and those who work to advance the rights of children. This neglect is a lost opportunity, and also poses a risk to children.

Today’s children are the first generation to grow up amid the rapid datafication of virtually every aspect of social, cultural, political and economic life. This alone calls for greater scrutiny of the role played by data. An entire generation is being datafied, often starting before birth. Every year the average child will have more data collected about them in their lifetime than would a similar child born any year prior. Ironically, humanitarian and development organizations working with children are themselves among the key actors contributing to the increased collection of data. These organizations rely on a wide range of technologies, including biometrics, digital identity systems, remote-sensing technologies, mobile and social media messaging apps, and administrative data systems. The data generated by these tools and platforms inevitably includes potentially sensitive PII (personally identifiable information) and DII (demographically identifiable information). All of this demands much closer scrutiny, and a more systematic framework to guide how child-related data is collected, stored, and used.

Towards this aim, we have also been working with the Data for Children Collaborative, based in Edinburgh, to establish innovative and ethical practices around the use of data to improve the lives of children worldwide….(More)”.

Federated Learning for Privacy-Preserving Data Access


Paper by Małgorzata Śmietanka, Hirsh Pithadia and Philip Treleaven: “Federated learning is a pioneering privacy-preserving data technology and also a new machine learning model trained on distributed data sets.

Companies collect huge amounts of historic and real-time data to drive their business and collaborate with other organisations. However, data privacy is becoming increasingly important because of regulations (e.g. EU GDPR) and the need to protect their sensitive and personal data. Companies need to manage data access: firstly within their organisations (so they can control staff access), and secondly by protecting raw data when collaborating with third parties. What is more, companies are increasingly looking to ‘monetize’ the data they’ve collected. However, under new legislation, the use of data by different organisations is becoming increasingly difficult (Yu, 2016).

Federated learning, pioneered by Google, is an emerging privacy-preserving data technology and also a new class of distributed machine learning models. This paper discusses federated learning as a solution for privacy-preserving data access and distributed machine learning applied to distributed data sets. It also presents a privacy-preserving federated learning infrastructure….(More)”.
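To make the idea of training on distributed data sets concrete, here is a minimal sketch of federated averaging, one common federated learning scheme. It is not taken from the paper: the function names (`local_update`, `federated_average`, `train`, `make_client`), the one-dimensional linear model, and the synthetic client data are all assumptions chosen for illustration. Each client fits the model on its own data, and only the resulting weights, never the raw records, are sent back and combined.

```python
import random


def local_update(weights, data, lr=0.1, epochs=5):
    """Run a few gradient-descent steps on one client's private data.

    The model is y = w * x + b, trained with mean squared error.
    Only the updated (w, b) pair leaves the client.
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)


def federated_average(client_weights, client_sizes):
    """Combine client models, weighting each by its number of samples."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)


def train(clients, rounds=100):
    """Coordinator loop: broadcast the global model, collect updates, average."""
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        sizes = [len(data) for data in clients]
        global_weights = federated_average(updates, sizes)
    return global_weights


def make_client(n, seed):
    """Hypothetical private data set: noise-free samples of y = 3x + 1."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, 3.0 * x + 1.0) for x in xs]


# Three hypothetical organisations with data sets of different sizes.
clients = [make_client(20, 1), make_client(50, 2), make_client(30, 3)]
w, b = train(clients)
print(f"learned w={w:.3f}, b={b:.3f}")
```

Note that exchanging model weights instead of raw data reduces exposure but does not by itself guarantee privacy; practical systems typically layer further protections, such as secure aggregation or differential privacy, on top of this basic loop.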

Four Principles to Make Data Tools Work Better for Kids and Families


Blog by the Annie E. Casey Foundation: “Advanced data analytics are deeply embedded in the operations of public and private institutions and shape the opportunities available to youth and families. Whether these tools benefit or harm communities depends on their design, use and oversight, according to a report from the Annie E. Casey Foundation.

Four Principles to Make Advanced Data Analytics Work for Children and Families examines the growing field of advanced data analytics and offers guidance to steer the use of big data in social programs and policy….

The Foundation report identifies four principles — complete with examples and recommendations — to help steer the growing field of data science in the right direction.

Four Principles for Data Tools

  1. Expand opportunity for children and families. Most established uses of advanced analytics in education, social services and criminal justice focus on problems facing youth and families. Promising uses of advanced analytics go beyond mitigating harm and help to identify so-called odds beaters and new opportunities for youth.
    • Example: The Children’s Data Network at the University of Southern California is helping the state’s departments of education and social services explore why some students succeed despite negative experiences and what protective factors merit more investment.
    • Recommendation: Government and its philanthropic partners need to test if novel data science applications can create new insights and when it’s best to apply them.
       
  2. Provide transparency and evidence. Advanced analytical tools must earn and maintain a social license to operate. The public has a right to know what decisions these tools are informing or automating, how they have been independently validated, and who is accountable for answering and addressing concerns about how they work.
    • Recommendations: Local and state task forces can be excellent laboratories for testing how to engage youth and communities in discussions about advanced analytics applications and the policy frameworks needed to regulate their use. In addition, public and private funders should avoid supporting private algorithms whose design and performance are shielded by trade secrecy claims. Instead, they should fund and promote efforts to develop, evaluate and adapt transparent and effective models.
       
  3. Empower communities. The field of advanced data analytics often treats children and families as clients, patients and consumers. Put to better use, these same tools can help elucidate and reform the systems acting upon children and families. For this shift to occur, institutions must focus analyses and risk assessments on structural barriers to opportunity rather than individual profiles.
    • Recommendation: In debates about the use of data science, greater investment is needed to amplify the voices of youth and their communities.
       
  4. Promote equitable outcomes. Useful advanced analytics tools should promote more equitable outcomes for historically disadvantaged groups. New investments in advanced analytics are only worthwhile if they aim to correct the well-documented bias embedded in existing models.
    • Recommendations: Advanced analytical tools should only be introduced when they reduce the opportunity deficit for disadvantaged groups — a move that will take organizing and advocacy to establish and new policy development to institutionalize. Philanthropy and government also have roles to play in helping communities test and improve tools and examples that already exist….(More)”.

A Legal Framework for Access to Data – A Competition Policy Perspective


Paper by Heike Schweitzer and Robert Welker: “The paper strives to systematise the debate on access to data from a competition policy angle. At the outset, two general policy approaches to access to data are distinguished: a “private control of data” approach versus an “open access” approach. We argue that, when it comes to private sector data, the “private control of data” approach is preferable. According to this approach, the “whether” and “how” of data access should generally be left to the market. However, public intervention can be justified by significant market failures. We discuss the presence of such market failures and the policy responses, including, in particular, competition policy responses, with a view to three different data access scenarios: access to data by co-generators of usage data (Scenario 1); requests for access to bundled or aggregated usage data by third parties vis-à-vis a service or product provider who controls such datasets, with the goal to enter complementary markets (Scenario 2); requests by firms to access the large usage data troves of the Big Tech online platforms for innovative purposes (Scenario 3). On this basis we develop recommendations for data access policies….(More)”.

Not fit for Purpose: A critical analysis of the ‘Five Safes’


Paper by Chris Culnane, Benjamin I. P. Rubinstein, and David Watts: “Adopted by government agencies in Australia, New Zealand, and the UK as a policy instrument or as embodied into legislation, the ‘Five Safes’ framework aims to manage risks of releasing data derived from personal information. Despite its popularity, the Five Safes has undergone little legal or technical critical analysis. We argue that the Five Safes is fundamentally flawed: from being disconnected from existing legal protections and appropriation of notions of safety without providing any means to prefer strong technical measures, to viewing disclosure risk as static through time and not requiring repeat assessment. The Five Safes provides little confidence that resulting data sharing is performed using ‘safety’ best practice or for purposes in service of public interest….(More)”.

COVID-19 Data and Data Sharing Agreements: The Potential of Sunset Clauses and Sunset Provisions


A report by SDSN TReNDS and DataReady Limited on behalf of Contracts4DataCollaboration: “Building upon issues discussed in the C4DC report, “Laying the Foundation for Effective Partnerships: An Examination of Data Sharing Agreements,” this brief examines the potential of sunset clauses or sunset provisions to be a legally binding, enforceable, and accountable way of ensuring COVID-19 related data sharing agreements are wound down responsibly at the end of the pandemic. The brief is divided into four substantive parts: Part I introduces sunset clauses as legislative tools, highlighting a number of examples of how they have been used in both COVID-19 related and other contexts; Part II discusses sunset provisions in the context of data sharing agreements and attempts to explain the complex interrelationship between data ownership, intellectual property, and sunset provisions; Part III identifies some key issues policymakers should consider when assessing the utility and viability of sunset provisions within their data sharing agreements and arrangements; and Part IV highlights the value of a memorandum of understanding (MoU) as a viable vehicle for sunset provisions in contexts where data sharing agreements are either non-existent or not regularly used….(More)”. (Contracts 4 Data Collaboration Framework).

NIH Releases New Policy for Data Management and Sharing


NIH Blogpost by Carrie Wolinetz: “Today, nearly twenty years after the publication of the Final NIH Statement on Sharing Research Data in 2003, we have released a Final NIH Policy for Data Management and Sharing. This represents the agency’s continued commitment to share and make broadly available the results of publicly funded biomedical research. We hope it will be a critical step in moving towards a culture change, in which data management and sharing is seen as integral to the conduct of research. Responsible data management and sharing is good for science; it maximizes availability of data to the best and brightest minds, underlies reproducibility, honors the participation of human participants by ensuring their data is both protected and fully utilized, and provides an element of transparency to ensure public trust and accountability.

This policy has been years in the making and has benefited enormously from feedback and input from stakeholders throughout the process. We are grateful to all those who took the time to comment on the Request for Information or the Draft policy, or to participate in workshops or Tribal consultations. That thoughtful feedback has helped shape the Final policy, which we believe strikes a balance between reasonable expectations for data sharing and flexibility to allow for a diversity of data types and circumstances. How we incorporated public comments and decision points that led to the Final policy are detailed in the Preamble to the DMS policy.

The Final policy applies to all research funded or conducted by NIH that results in the generation of scientific data. The Final Policy has two main requirements: (1) the submission of a Data Management and Sharing Plan (Plan); and (2) compliance with the approved Plan. We are asking for Plans at the time of submission of the application, because we believe planning and budgeting for data management and sharing needs to occur hand in hand with planning the research itself. NIH recognizes that science evolves throughout the research process, which is why we have built in the ability to update DMS Plans, but at the end of the day, we are expecting investigators and institutions to be accountable to the Plans they have laid out for themselves….

Anticipating that variation in readiness, and in recognition of the cultural change we are trying to seed, there is a two-year implementation period. This time will be spent developing the information, support, and tools that the biomedical enterprise will need to comply with this new policy. NIH has already provided additional supplementary information – on (1) elements of a data management and sharing plan; (2) allowable costs; and (3) selecting a data repository – in concert with the policy release….(More)”

Your phone already tracks your location. Now that data could fight voter suppression


Article by Seth Rosenblatt: “Smartphone location data is a dream for marketers who want to know where you go and how long you spend there—and a privacy nightmare. But this kind of geolocation data could also be used to protect people’s voting rights on Election Day.

The newly founded nonprofit Center for New Data is now tracking voters at the polls using smartphone location data to help researchers understand how easy—or difficult—it is for people to vote in different places. Called the Observing Democracy project, the nonpartisan effort is making data on how far people have to travel to vote and how long they have to wait in line available in a privacy-friendly way so it can be used to craft election policies that ensure voting is accessible for everyone.

Election data has already fueled changes in various municipalities and states. A 66-page lawsuit filed by Fair Fight Action against the state of Georgia in the wake of Stacey Abrams’s narrow loss to Brian Kemp in the 2018 gubernatorial race relies heavily on data to back its assertions of unconstitutionally delayed and deferred voter registration, unfair challenges to absentee and provisional ballots, and unjustified purges of voter rolls—all hallmarks of voter suppression.

The promise of Observing Democracy is to make this type of impactful data available much more rapidly than ever before. Barely a month old, Observing Democracy isn’t wasting any time: Its all-volunteer staffers will be receiving data potentially as soon as Nov. 4 on voter wait times at polling locations, travel times to polling stations, and how frequently ballot drop-off boxes are visited, courtesy of location-data mining companies X-Mode Social and Veraset, which was spun off from SafeGraph….(More)”.

To mitigate the costs of future pandemics, establish a common data space


Article by Stephanie Chin and Caitlin Chin: “To improve data sharing during global public health crises, it is time to explore the establishment of a common data space for highly infectious diseases. Common data spaces integrate multiple data sources, enabling a more comprehensive analysis of data based on greater volume, range, and access. At its essence, a common data space is like a public library system, which has collections of different types of resources from books to video games; processes to integrate new resources and to borrow resources from other libraries; a catalog system to organize, sort, and search through resources; a library card system to manage users and authorization; and even curated collections or displays that highlight themes among resources.

Even before the COVID-19 pandemic, there was significant momentum to make critical data more widely accessible. In the United States, Title II of the Foundations for Evidence-Based Policymaking Act of 2018, or the OPEN Government Data Act, requires federal agencies to publish their information online as open data, using standardized, machine-readable data formats. This information is now available on the federal data.gov catalog and includes 50 state- or regional-level data hubs and 47 city- or county-level data hubs. In Europe, the European Commission released a data strategy in February 2020 that calls for common data spaces in nine sectors, including healthcare, shared by EU businesses and governments.

Going further, a common data space could help identify outbreaks and accelerate the development of new treatments by compiling line list incidence data, epidemiological information and models, genome and protein sequencing, testing protocols, results of clinical trials, passive environmental monitoring data, and more.

Moreover, it could foster a common understanding and consensus around the facts—a prerequisite to reach international buy-in on policies to address situations unique to COVID-19 or future pandemics, such as the distribution of medical equipment and PPE, disruption to the tourism industry and global supply chains, social distancing or quarantine, and mass closures of businesses….(More)”. See also Call for Action for a Data Infrastructure to tackle Pandemics and other Dynamic Threats.