Commission proposes measures to boost data sharing and support European data spaces


Press Release: “To better exploit the potential of ever-growing data in a trustworthy European framework, the Commission today proposes new rules on data governance. The Regulation will facilitate data sharing across the EU and between sectors to create wealth for society, increase control and trust of both citizens and companies regarding their data, and offer an alternative European model to data handling practice of major tech platforms.

The amount of data generated by public bodies, businesses and citizens is constantly growing. It is expected to multiply by five between 2018 and 2025. These new rules will allow this data to be harnessed and will pave the way for sectoral European data spaces to benefit society, citizens and companies. In the Commission’s data strategy of February this year, nine such data spaces have been proposed, ranging from industry to energy, and from health to the European Green Deal. They will, for example, contribute to the green transition by improving the management of energy consumption, make delivery of personalised medicine a reality, and facilitate access to public services.

The Regulation includes:

  • A number of measures to increase trust in data sharing, as the lack of trust is currently a major obstacle and results in high costs.
  • New EU rules on neutrality to allow novel data intermediaries to function as trustworthy organisers of data sharing.
  • Measures to facilitate the reuse of certain data held by the public sector. For example, the reuse of health data could advance research to find cures for rare or chronic diseases.
  • Means to give Europeans control on the use of the data they generate, by making it easier and safer for companies and individuals to voluntarily make their data available for the wider common good under clear conditions….(More)”.

Responsible Data Re-Use for COVID19


“The Governance Lab (The GovLab) at the NYU Tandon School of Engineering, with support from the Henry Luce Foundation, today released guidance to inform decision-making in the responsible re-use of data (re-purposing data for a use other than that for which it was originally intended) to address COVID-19. The findings, recommendations, and a new Responsible Data Re-Use framework stem from The Data Assembly initiative in New York City. An effort to solicit diverse, actionable public input on data re-use for crisis response in the United States, the Data Assembly brought together New York City-based stakeholders from government, the private sector, civic rights and advocacy organizations, and the general public to deliberate on innovative, though potentially risky, uses of data to inform crisis response in New York City. The findings and guidance from the initiative will inform policymaking and practice regarding data re-use in New York City, as well as free data literacy training offerings.

The Data Assembly’s Responsible Data Re-Use Framework provides clarity on a major element of the ongoing crisis. Though leaders throughout the world have relied on data to reduce uncertainty and make better decisions, expectations around the use and sharing of siloed data assets have remained unclear. This summer, along with the New York Public Library and Brooklyn Public Library, The GovLab co-hosted four months of remote deliberations with New York-based civil rights organizations, key data holders, and policymakers. Today’s release is a product of these discussions, to show how New Yorkers and their leaders think about the opportunities and risks involved in the data-driven response to COVID-19….(More)”

See: The Data Assembly Synthesis Report by Andrew Young, Stefaan G. Verhulst, Nadiya Safonova, and Andrew J. Zahuranec

Data could hold the key to stopping Alzheimer’s


Blog post by Bill Gates: “My family loves to do jigsaw puzzles. It’s one of our favorite activities to do together, especially when we’re on vacation. There is something so satisfying about everyone working as a team to put down piece after piece until finally the whole thing is done.

In a lot of ways, the fight against Alzheimer’s disease reminds me of doing a puzzle. Your goal is to see the whole picture, so that you can understand the disease well enough to better diagnose and treat it. But in order to see the complete picture, you need to figure out how all of the pieces fit together.

Right now, all over the world, researchers are collecting data about Alzheimer’s disease. Some of these scientists are working on drug trials aimed at finding a way to stop the disease’s progression. Others are studying how our brain works, or how it changes as we age. In each case, they’re learning new things about the disease.

But until recently, Alzheimer’s researchers often had to jump through a lot of hoops to share their data—to see if and how the puzzle pieces fit together. There are a few reasons for this. For one thing, there is a lot of confusion about what information you can and can’t share because of patient privacy. Often there weren’t easily available tools and technologies to facilitate broad data-sharing and access. In addition, pharmaceutical companies invest a lot of money into clinical trials, and often they aren’t eager for their competitors to benefit from that investment, especially when the programs are still ongoing.

Unfortunately, this siloed approach to research data hasn’t yielded great results. We have only made incremental progress in therapeutics since the late 1990s. There’s a lot that we still don’t know about Alzheimer’s, including what part of the brain breaks down first and how or when you should intervene. But I’m hopeful that will change soon thanks in part to the Alzheimer’s Disease Data Initiative, or ADDI….(More)”.

Building Trust for Inter-Organizational Data Sharing: The Case of the MLDE


Paper by Heather McKay, Sara Haviland, and Suzanne Michael: “There is increasing interest in sharing data that was once siloed in separate agencies, both across agencies and even between states. Driving this is a need to better understand how people experience education and work, and their pathways through each. A data-sharing approach offers many possible advantages, allowing states to leverage pre-existing data systems to conduct increasingly sophisticated and complete analyses. However, information sharing across state organizations presents a series of complex challenges, one of which is the central role trust plays in building successful data-sharing systems. Trust building between organizations is therefore crucial to ensuring project success.

This brief examines the process of building trust within the context of the development and implementation of the Multistate Longitudinal Data Exchange (MLDE). The brief is based on research and evaluation activities conducted by Rutgers’ Education & Employment Research Center (EERC) over the past five years, which included 40 interviews with state leaders and the Western Interstate Commission for Higher Education (WICHE) staff, observations of user group meetings, surveys, and MLDE document analysis. It is one in a series of MLDE briefs developed by EERC….(More)”.

European Health Data Space


European Commission Press Release: “The set-up of the European Health Data Space will be an integral part of building a European Health Union, a process launched by the Commission today with a first set of proposals to reinforce preparedness and response during health crises. This is also a direct follow-up to the Data strategy adopted by the Commission in February this year, where the Commission had already stressed the importance of creating European data spaces, including on health….

In this perspective, as part of the implementation of the Data strategy, a data governance act is set to be presented later this year, which will support the reuse of sensitive public data such as health data. A dedicated legislative proposal on a European health data space is planned for next year, as set out in the 2021 Commission work programme.

As first steps, the following activities starting in 2021 will pave the way for better data-driven health care in Europe:

  • The Commission proposes a European Health Data Space in 2021;
  • A Joint Action with 22 Member States to propose options on governance, infrastructure, data quality and data solidarity and empowering citizens with regard to secondary health data use in the EU;
  • Investments to support the European Health Data Space under the EU4Health programme, as well as common data spaces and digital health related innovation under Horizon Europe and the Digital Europe programmes;
  • Engagement with relevant actors to develop targeted Codes of Conduct for secondary health data use;
  • A pilot project, to demonstrate the feasibility of cross border analysis for healthcare improvement, regulation and innovation;
  • Other EU funding opportunities for digital transformation of health and care will be available for Member States as of 2021 under the Recovery and Resilience Facility, the European Regional Development Fund, the European Social Fund+ and InvestEU.

The set of proposals adopted by the Commission today to strengthen the EU’s crisis preparedness and response, taking the first steps towards a European Health Union, also paves the way for the participation of the European Medicines Agency (EMA) and the European Centre for Disease Prevention and Control (ECDC) in the future European Health Data Space infrastructure, along with research institutes, public health bodies, and data permit authorities in the Member States….(More)”.

The responsible use of data for and about children: treading carefully and ethically


Q&A with Stefaan G. Verhulst and Andrew Young: “…working in collaboration with UNICEF on an initiative called Responsible Data for Children (RD4C). Its focus is on data – the risks it poses to children, as well as the opportunities it offers.

You have been working with UNICEF on the Responsible Data for Children initiative (RD4C). What is this and why do we need to be talking more about ‘responsible data’?

To date, the relationship between the datafication of everyday life and child welfare has been under-explored, both by researchers in data ethics and those who work to advance the rights of children. This neglect is a lost opportunity, and also poses a risk to children.

Today’s children are the first generation to grow up amid the rapid datafication of virtually every aspect of social, cultural, political and economic life. This alone calls for greater scrutiny of the role played by data. An entire generation is being datafied, often starting before birth. Every year the average child will have more data collected about them in their lifetime than would a similar child born any year prior. Ironically, humanitarian and development organizations working with children are themselves among the key actors contributing to the increased collection of data. These organizations rely on a wide range of technologies, including biometrics, digital identity systems, remote-sensing technologies, mobile and social media messaging apps, and administrative data systems. The data generated by these tools and platforms inevitably includes potentially sensitive PII data (personally identifiable information) and DII data (demographically identifiable information). All of this begs much closer scrutiny, and a more systematic framework to guide how child-related data is collected, stored, and used.

Towards this aim, we have also been working with the Data for Children Collaborative, based in Edinburgh, to establish innovative and ethical practices around the use of data to improve the lives of children worldwide….(More)”.

Federated Learning for Privacy-Preserving Data Access


Paper by Małgorzata Śmietanka, Hirsh Pithadia and Philip Treleaven: “Federated learning is a pioneering privacy-preserving data technology and also a new machine learning model trained on distributed data sets.

Companies collect huge amounts of historic and real-time data to drive their business and collaborate with other organisations. However, data privacy is becoming increasingly important because of regulations (e.g. EU GDPR) and the need to protect their sensitive and personal data. Companies need to manage data access: firstly within their organisations (so they can control staff access), and secondly by protecting raw data when collaborating with third parties. What is more, companies are increasingly looking to ‘monetize’ the data they’ve collected. However, under new legislation, the use of data by different organisations is becoming increasingly difficult (Yu, 2016).

Federated learning, pioneered by Google, is an emerging privacy-preserving data technology and also a new class of distributed machine learning models. This paper discusses federated learning as a solution for privacy-preserving data access and distributed machine learning applied to distributed data sets. It also presents a privacy-preserving federated learning infrastructure….(More)”.
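
To illustrate the idea of training one model on distributed data sets without pooling the raw data, here is a minimal sketch of federated averaging, one common federated learning algorithm. It is not taken from the paper and does not represent the infrastructure it presents. The function names, the toy linear model, the learning rate and the two example clients are all assumptions made for illustration. It also omits the safeguards a real deployment would likely add, such as secure aggregation or differential privacy.

```python
from typing import List, Tuple

# One client's private data: (x, y) pairs that never leave the client.
Dataset = List[Tuple[float, float]]


def local_update(w: float, b: float, data: Dataset,
                 lr: float = 0.05, epochs: int = 5) -> Tuple[float, float]:
    """Run a few gradient-descent steps for y = w*x + b on one client's data."""
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error over this client's data only.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def federated_average(updates: List[Tuple[float, float]],
                      sizes: List[int]) -> Tuple[float, float]:
    """Combine client parameters, weighting each client by its data set size."""
    total = sum(sizes)
    w = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    b = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return w, b


def train(clients: List[Dataset], rounds: int = 200) -> Tuple[float, float]:
    """Server loop: broadcast the model, collect local updates, average them."""
    w, b = 0.0, 0.0
    for _ in range(rounds):
        # Only model parameters are exchanged, never the raw (x, y) pairs.
        updates = [local_update(w, b, data) for data in clients]
        w, b = federated_average(updates, [len(d) for d in clients])
    return w, b


if __name__ == "__main__":
    # Hypothetical clients whose data all follow y = 2x + 1.
    clients = [
        [(0.0, 1.0), (1.0, 3.0)],
        [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
    ]
    w, b = train(clients)
    print(f"w={w:.3f}, b={b:.3f}")
```

In this sketch the server only ever sees the parameters `w` and `b`; weighting by data set size means clients with more data have proportionally more influence on the shared model.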

Four Principles to Make Data Tools Work Better for Kids and Families


Blog by the Annie E. Casey Foundation: “Advanced data analytics are deeply embedded in the operations of public and private institutions and shape the opportunities available to youth and families. Whether these tools benefit or harm communities depends on their design, use and oversight, according to a report from the Annie E. Casey Foundation.

Four Principles to Make Advanced Data Analytics Work for Children and Families examines the growing field of advanced data analytics and offers guidance to steer the use of big data in social programs and policy….

The Foundation report identifies four principles — complete with examples and recommendations — to help steer the growing field of data science in the right direction.

Four Principles for Data Tools

  1. Expand opportunity for children and families. Most established uses of advanced analytics in education, social services and criminal justice focus on problems facing youth and families. Promising uses of advanced analytics go beyond mitigating harm and help to identify so-called odds beaters and new opportunities for youth.
    • Example: The Children’s Data Network at the University of Southern California is helping the state’s departments of education and social services explore why some students succeed despite negative experiences and what protective factors merit more investment.
    • Recommendation: Government and its philanthropic partners need to test if novel data science applications can create new insights and when it’s best to apply them.
       
  2. Provide transparency and evidence. Advanced analytical tools must earn and maintain a social license to operate. The public has a right to know what decisions these tools are informing or automating, how they have been independently validated, and who is accountable for answering and addressing concerns about how they work.
    • Recommendations: Local and state task forces can be excellent laboratories for testing how to engage youth and communities in discussions about advanced analytics applications and the policy frameworks needed to regulate their use. In addition, public and private funders should avoid supporting private algorithms whose design and performance are shielded by trade secrecy claims. Instead, they should fund and promote efforts to develop, evaluate and adapt transparent and effective models.
       
  3. Empower communities. The field of advanced data analytics often treats children and families as clients, patients and consumers. Put to better use, these same tools can help elucidate and reform the systems acting upon children and families. For this shift to occur, institutions must focus analyses and risk assessments on structural barriers to opportunity rather than individual profiles.
    • Recommendation: In debates about the use of data science, greater investment is needed to amplify the voices of youth and their communities.
       
  4. Promote equitable outcomes. Useful advanced analytics tools should promote more equitable outcomes for historically disadvantaged groups. New investments in advanced analytics are only worthwhile if they aim to correct the well-documented bias embedded in existing models.
    • Recommendations: Advanced analytical tools should only be introduced when they reduce the opportunity deficit for disadvantaged groups — a move that will take organizing and advocacy to establish and new policy development to institutionalize. Philanthropy and government also have roles to play in helping communities test and improve tools and examples that already exist….(More)”.

A Legal Framework for Access to Data – A Competition Policy Perspective


Paper by Heike Schweitzer and Robert Welker: “The paper strives to systematise the debate on access to data from a competition policy angle. At the outset, two general policy approaches to access to data are distinguished: a “private control of data” approach versus an “open access” approach. We argue that, when it comes to private sector data, the “private control of data” approach is preferable. According to this approach, the “whether” and “how” of data access should generally be left to the market. However, public intervention can be justified by significant market failures. We discuss the presence of such market failures and the policy responses, including, in particular, competition policy responses, with a view to three different data access scenarios: access to data by co-generators of usage data (Scenario 1); requests for access to bundled or aggregated usage data by third parties vis-à-vis a service or product provider who controls such datasets, with the goal to enter complementary markets (Scenario 2); requests by firms to access the large usage data troves of the Big Tech online platforms for innovative purposes (Scenario 3). On this basis we develop recommendations for data access policies….(More)”.

Not fit for Purpose: A critical analysis of the ‘Five Safes’


Paper by Chris Culnane, Benjamin I. P. Rubinstein, and David Watts: “Adopted by government agencies in Australia, New Zealand, and the UK as a policy instrument or as embodied into legislation, the ‘Five Safes’ framework aims to manage risks of releasing data derived from personal information. Despite its popularity, the Five Safes has undergone little legal or technical critical analysis. We argue that the Five Safes is fundamentally flawed: from being disconnected from existing legal protections and appropriation of notions of safety without providing any means to prefer strong technical measures, to viewing disclosure risk as static through time and not requiring repeat assessment. The Five Safes provides little confidence that resulting data sharing is performed using ‘safety’ best practice or for purposes in service of public interest….(More)”.