

Paper by Tamara Ehs and Monika Mokre: “The yellow vest movement started in November 2018 and has become the longest-lasting protest movement in France since 1945. The movement provoked different reactions from the French government—on the one hand, violence and repression; on the other hand, concessions. One of them was to provide a possibility for citizens’ participation by organizing the so-called “Grand Débat.” It was clear to all observers that this was less an attempt to further democracy in France than to calm down the protests of the yellow vests. Thus, it seemed doubtful from the beginning whether this form of participatory democracy could be understood as a real form of citizens’ deliberation, and in fact, several shortcomings with regard to procedure and participation were pointed out by theorists of deliberative democracy. The aim of this article is to analyze the Grand Débat with regard to its deliberative qualities and shortcomings…(More)”.

Data to the rescue: how humanitarian aid NGOs should collect information based on the GDPR


Paper by Theodora Gazi: “Data collection is valuable before, during and after interventions in order to increase the effectiveness of humanitarian projects. Although the General Data Protection Regulation (GDPR) sets forth rules for the processing of personal data, its implementation by humanitarian aid actors is crucial and presents challenges. Failure to comply triggers severe risks for both data subjects and the reputation of the actor. This article provides insights into the implementation of the guiding principles of the GDPR, the legal bases for data processing, data subjects’ rights and data sharing during the provision of humanitarian assistance…(More)”

The economics of Business to Government data sharing


Paper by Bertin Martens and Nestor Duch-Brown: “Data and information are fundamental pieces for effective evidence-based policy making and provision of public services. In recent years, some private firms have been collecting large amounts of data, which, were they available to governments, could greatly improve their capacity to take better policy decisions and to increase social welfare. Business-to-Government (B2G) data sharing can result in substantial benefits for society. It can save governments money by allowing them to benefit from the use of data collected by businesses without having to collect the same data again. Moreover, it can support the production of new and innovative outputs based on the shared data by different users. Finally, the data available to government may give only an incomplete or even biased picture, while aggregating complementary datasets shared by different parties (including businesses) may result in improved policies with strong social welfare benefits.


The examples assembled by the High-Level Expert Group on B2G data sharing show that most current B2G data transactions remain one-off experimental pilot projects that do not seem to be sustainable over time. Overall, the volume of B2G operations still appears relatively small and clearly sub-optimal from a social welfare perspective: the market does not scale in line with the potential welfare gains for society. There are likely to be significant economic benefits from additional B2G data sharing operations. These could be enabled by measures that improve their governance conditions and thereby increase the overall number of transactions. To design such measures, it is important to understand the nature of the current barriers to B2G data sharing operations. In this paper, we focus on the most important barriers from an economic perspective: (a) monopolistic data markets, (b) high transaction costs and perceived risks in data sharing, and (c) a lack of incentives for private firms to contribute to the production of public benefits. The following reflections are mainly conceptual, since there is currently little quantitative empirical evidence on the different aspects of B2G transactions.

  • Monopolistic data markets. Some firms (big tech companies, for instance) may be in a privileged position as the exclusive providers of the type of data that a public body seeks to access. This position enables the firms to charge a high price for the data, beyond a reasonable rate of return on costs. While a monopolistic market is still a functioning market, the resulting price may lead to some governments not being able or willing to purchase the data and may therefore cause social welfare losses. Nonetheless, monopolistic pricing may still be justified from an innovation perspective: it strengthens incentives to invest in more and better data collection systems and thereby increases the supply of data in the long run. In some cases, the data seller may be in a position to price-discriminate between commercial buyers and a public body, charging a lower price to the latter since the data would not be used for commercial purposes.
  • High transaction costs and perceived risks. An important barrier for data sharing comes from the ex-ante costs related to finding a suitable data sharing partner, negotiating a contractual arrangement, re-formatting and cleaning the data, among others. Potentially interested public bodies may not be aware of available datasets or may not be in a position to handle them or understand their advantages and disadvantages. There may also be ex-post risks related to uncertainties in the quality and/or usefulness of the data, the technical implementation of the data sharing deal, ensuring compliance with the agreed conditions, the risk of data leaks to unauthorized third-parties and exposure of personal and confidential data.
  • Lack of incentives. Firms may be reluctant to share data with governments because doing so might have a negative impact on them. This could be due to suspicions that the data delivered might be used to implement market regulations or to enforce competition rules that could reduce firms’ profits. Moreover, if firms share data with government under preferential conditions, they may have difficulty justifying the foregone profit to shareholders, since the benefits generated by better policies or public services fuelled by the private data will accrue to society as a whole and are often difficult to express in monetary terms. Finally, firms might fear being put at a competitive disadvantage if they provide data to public bodies – perhaps under preferential conditions – and their competitors do not.

Several mechanisms could be designed to remove the barriers that may be holding back B2G data sharing initiatives. One would be to provide stronger incentives for the data-supplying firm to engage in this type of transaction. These incentives can be direct, i.e., monetary, or indirect, i.e., reputational (e.g. as part of corporate social responsibility programmes). Another would be to guarantee the data transfer by making the transaction mandatory, with fair cost compensation. An intermediate way would be based on solutions that facilitate voluntary B2G operations without mandating them, for example by reducing the transaction costs and perceived risks for the data supplier, e.g. by setting up trusted data intermediary platforms or appropriate contractual provisions. A possible EU governance framework for B2G data sharing operations could cover these options….(More)”.

UK’s National Data Strategy


DCMS (UK): “…With the increasing ascendance of data, it has become ever more important that the government removes the unnecessary barriers that prevent businesses and organisations from accessing such information.

The importance of data sharing was demonstrated during the first few months of the coronavirus pandemic, when government departments, local authorities, charities and the private sector came together to provide essential services. One notable example is the Vulnerable Person Service, which in a very short space of time enabled secure data-sharing across the public and private sectors to provide millions of food deliveries and access to priority supermarket delivery slots for clinically extremely vulnerable people.

Aggregation of data from different sources can also lead to new insights that otherwise would not have been possible. For example, the Connected Health Cities project anonymises and links data from different health and social care services, providing new insights into the way services are used.
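A minimal sketch of how such linkage can work in general (hypothetical identifiers and a hypothetical key; the project’s actual pipeline is not described here): direct identifiers are replaced with a keyed pseudonym before records from different services are joined.

```python
import hmac
import hashlib

SECRET_KEY = b"held-by-a-trusted-third-party"  # hypothetical linkage key


def pseudonymise(patient_id: str) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA256)."""
    return hmac.new(SECRET_KEY, patient_id.encode(), hashlib.sha256).hexdigest()


# Two hypothetical datasets from different services, already stripped of names
health_records = {pseudonymise("NHS-123"): {"admissions": 2}}
social_care = {pseudonymise("NHS-123"): {"home_visits": 5}}

# Records can now be linked on the shared pseudonym without exposing the raw ID
linked = {
    pid: {**health_records[pid], **social_care[pid]}
    for pid in health_records.keys() & social_care.keys()
}
print(linked)  # one linked record, keyed by pseudonym
```

Because the key is held by a trusted party rather than the analysts, the linked dataset carries no direct identifier, yet the same person’s records still join correctly across services.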

Vitally, data sharing can also fuel growth and innovation. For new and innovative organisations, increasing data availability will mean that they, too, will be able to gain better insights from their work – from charities able to pool beneficiary data to better evaluate the effectiveness of interventions, to new entrants able to access new markets. Often this happens as part of commercial arrangements; in other instances government has sought to intervene where there are clear consumer benefits, such as in relation to Open Banking and Smart Data. Government has also invested in the research and development of new mechanisms for better data sharing, such as the Office for AI and Innovate UK’s partnership with the Open Data Institute to explore data trusts.

However, our call for evidence, along with engagement with stakeholders, has identified a range of barriers to data availability, including:

  • a culture of risk aversion
  • issues with current licensing regulations
  • market barriers to greater re-use, including data hoarding and differential market power
  • inconsistent formatting of public sector data
  • issues pertaining to the discoverability of data
  • privacy and security concerns
  • the benefits of increased data sharing not always being felt by the organisation that incurs the cost of collection and maintenance

This is a complex environment, and heavy-handed intervention may have the unwanted effect of reducing incentives to collect, maintain and share data for the benefit of the UK. It is clear that any way forward must be carefully considered to avoid unintended negative consequences. There is a balance to be struck between maintaining appropriate commercial incentives to collect data and ensuring that data can be used widely for the benefit of the UK. For personal data, we must also take account of the balance between individual rights and public benefit.

This is a new issue for all digital economies that has come to the fore as data has become a significant modern economic asset. Our approach will take account of those incentives, and consider how innovation can overcome perceived barriers to availability. For example, access can be limited to users with specific characteristics, by licence or regulator accreditation; data can be shared within a collaborating group of organisations; there may also be value in creating and sharing synthetic data to support research and innovation, as well as in other privacy-enhancing technologies and techniques….(More)”.

Synthetic data: Unlocking the power of data and skills for machine learning


Karen Walker at Gov.UK: “Defence generates and holds a lot of data. We want to be able to get the best out of it, unlocking new insights that aren’t currently visible, through the use of innovative data science and analytics techniques tailored to defence’s specific needs. But this can be difficult because our data is often sensitive for a variety of reasons. For example, this might include information about the performance of particular vehicles, or personnel’s operational deployment details.

It is therefore often challenging to share data with experts who sit outside the Ministry of Defence, particularly amongst the wider data science community in government, small companies and academia. The use of synthetic data gives us a way to address this challenge and to benefit from the expertise of a wider range of people by creating datasets which aren’t sensitive. We have recently published a report from this work….(More)”.
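As a toy illustration of the principle (not the method used in the report): fit summary statistics to a sensitive numeric dataset, then sample an entirely artificial dataset with the same statistical shape, which can be shared in place of the real records.

```python
import random
import statistics

random.seed(42)

# Hypothetical sensitive measurements (e.g. a performance figure we cannot share)
original = [random.gauss(100.0, 15.0) for _ in range(1000)]

# Fit simple summary statistics to the real data...
mu = statistics.mean(original)
sigma = statistics.stdev(original)

# ...then sample an artificial dataset with the same aggregate shape.
synthetic = [random.gauss(mu, sigma) for _ in range(1000)]

# The synthetic data preserves aggregate structure but contains no real record
print(round(statistics.mean(synthetic), 1), round(statistics.stdev(synthetic), 1))
```

Real synthetic-data generators are far more sophisticated (they must preserve joint distributions and guard against re-identification), but the trade-off is the same: the closer the synthetic data tracks the original, the more useful and the more carefully it must be assessed for disclosure risk.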

[Image: original data and synthetic data side by side in a 2D chart; the two look almost identical]

Business-to-Business Data Sharing: An Economic and Legal Analysis


Paper by Bertin Martens et al: “The European Commission announced in its Data Strategy (2020) its intentions to propose an enabling legislative framework for the governance of common European data spaces, to review and operationalize data portability, to prioritize standardization activities and foster data interoperability and to clarify usage rights for co-generated IoT data. This Strategy starts from the premise that there is not enough data sharing and that much data remains locked up and is not available for innovative re-use. The Commission will also consider the adoption of a New Competition Tool as well as the adoption of ex ante regulation for large online gate-keeping platforms as part of the announced Digital Services Act package. In this context, the goal of this report is to examine the obstacles to Business-to-Business (B2B) data sharing: what keeps businesses from sharing or trading more of their data with other businesses and what can be done about it? For this purpose, this report uses the well-known tools of legal and economic thinking about market failures. It starts from the economic characteristics of data and explores to what extent private B2B data markets result in a socially optimal degree of data sharing, or whether there are market failures in data markets that might justify public policy intervention.

It examines the conditions under which monopolistic data market failures may occur. It contrasts these welfare losses with the welfare gains from economies of scope in data aggregation in large pools. It also discusses other potential sources of B2B data market failures due to negative externalities, risks and transaction costs and asymmetric information situations. As a next step, the paper explores solutions to overcome these market failures. Private third-party data intermediaries may be in a position to overcome market failures due to high transaction costs and risks. They can aggregate data in large pools to harvest the benefits of economies of scale and scope in data. Where third-party intervention fails, regulators can step in, with ex-post competition instruments and with ex-ante regulation. The latter includes data portability rights for personal data and mandatory data access rights….(More)”.

Privacy-Preserving Record Linkage in the context of a National Statistics Institute


Guidance by Rainer Schnell: “Linking existing administrative data sets on the same units is used increasingly as a research strategy in many different fields. Depending on the academic field, this kind of operation has been given different names, but in application areas, this approach is mostly denoted as record linkage. Although linking data on organisations or economic entities is common, the most interesting applications of record linkage concern data on persons. Starting in medicine, this approach is now also being used in the social sciences and official statistics. Furthermore, the joint use of survey data with administrative data is now standard practice. For example, victimisation surveys are linked to police records, labour force surveys are linked to social security databases, and censuses are linked to surveys.

Merging different databases containing information on the same unit is technically trivial if all involved databases have a common identification number, such as a social security number or, as in the Scandinavian countries, a permanent personal identification number. Most of the modern identification numbers contain checksum mechanisms so that errors in these identifiers can be easily detected and corrected. Due to the many advantages of permanent personal identification numbers, similar systems have been introduced or discussed in some European countries outside Scandinavia.
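As an illustration of such checksum mechanisms, the widely used Luhn check-digit scheme (shown here generically, not tied to any particular national ID system) detects all single-digit errors and most adjacent transpositions:

```python
def luhn_valid(number: str) -> bool:
    """Luhn checksum: double every second digit from the right,
    subtract 9 from results above 9, and require a total divisible by 10."""
    digits = [int(d) for d in number]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


print(luhn_valid("79927398713"))  # True: the canonical Luhn test number
print(luhn_valid("79927398712"))  # False: a single-digit error is detected
print(luhn_valid("79927398731"))  # False: an adjacent transposition is detected
```

A national registry receiving an ID that fails such a check can reject or correct it at entry time, which is one reason databases keyed on checksummed identifiers are much cleaner than those relying on names and addresses.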

In many jurisdictions, no permanent personal identification number is available for linkage. Examples are New Zealand, Australia, the UK, and Germany. Here, the linkage is most often based on alphanumeric identifiers such as surname, first name, address, and place of birth. In the literature, such identifiers are most often denoted as indirect or quasi-identifiers. Such identifiers are prone to error, for example, due to typographical errors, memory faults (previous addresses), different recordings of the same identifier (for example, swapping of substrings: reversal of first name and last name), deliberately false information (for example, year of birth) or changes of values over time (for example name changes due to marriages). Linking on exact matching information, therefore, yields only a non-randomly selected subset of records.
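A toy example (entirely hypothetical records) makes the point: exact matching fails on a simple first-name/surname swap that a modest normalisation step would still link.

```python
def exact_key(record):
    """Match key built directly from the raw quasi-identifiers."""
    return (record["name"], record["year_of_birth"])


def normalised_key(record):
    # Lower-case and sort name tokens so "Marie Dupont" == "DUPONT Marie"
    tokens = sorted(record["name"].lower().split())
    return (" ".join(tokens), record["year_of_birth"])


survey = {"name": "Marie Dupont", "year_of_birth": 1980}
admin = {"name": "DUPONT Marie", "year_of_birth": 1980}

print(exact_key(survey) == exact_key(admin))            # False: exact match fails
print(normalised_key(survey) == normalised_key(admin))  # True: records link
```

Normalisation handles only systematic recording differences; typographical errors and outdated values still require the approximate-matching techniques discussed in the record linkage literature.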

Furthermore, the quality of identifying information in databases containing only indirect identifiers is much lower than usually expected. Error rates in excess of 20% of records containing incomplete or erroneous identifiers are encountered in practice….(More)”.
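One widely cited approach to linking such error-prone quasi-identifiers without revealing them, associated with Schnell’s own work, encodes name q-grams into Bloom filters and compares the encodings with a similarity score. The sketch below uses toy parameters and a hypothetical shared key; a production scheme would harden the hashing and tune the filter size.

```python
import hmac
import hashlib

BITS = 256    # toy Bloom-filter size
HASHES = 4    # number of hash functions per q-gram
KEY = b"shared-linkage-secret"  # hypothetical key shared by the linking parties


def bigrams(s: str):
    """Split a string into its set of 2-character substrings."""
    s = s.lower()
    return {s[i:i + 2] for i in range(len(s) - 1)}


def encode(s: str) -> set:
    """Encode a string's bigrams into the set of Bloom-filter bit positions."""
    positions = set()
    for gram in bigrams(s):
        for i in range(HASHES):
            digest = hmac.new(KEY, f"{i}:{gram}".encode(), hashlib.sha256).digest()
            positions.add(int.from_bytes(digest[:4], "big") % BITS)
    return positions


def dice(a: set, b: set) -> float:
    """Dice coefficient on the sets of set bits."""
    return 2 * len(a & b) / (len(a) + len(b))


# A typo ("Schmitt" vs "Schmidt") no longer breaks the comparison
print(round(dice(encode("schmidt"), encode("schmitt")), 2))  # high similarity
print(round(dice(encode("schmidt"), encode("mueller")), 2))  # low similarity
```

Because comparison happens on the encodings, the parties never exchange the clear-text names, yet similar spellings still score highly and can be linked above a chosen threshold.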

Applying new models of data stewardship to health and care data


Report by the Open Data Institute: “The outbreak of the coronavirus (Covid-19) has amplified and accelerated the need for an effective technology ecosystem that benefits everyone’s health. This report explores the models of ‘data stewardship’ (the collection, maintenance and sharing of data) required to enable better evaluation.

The pandemic has been accompanied by a marked increase in the use of digital technology, including the introduction of remote consultations in general practice, new data flows to support the distribution of food and other essentials, and applications to support digital contact tracing.

It argues that everybody involved in technology has a shared responsibility to enable evaluation, whether that means innovators sharing data for evaluation purposes, or healthcare providers being clearer, from the outset, about what data is needed to support effective evaluation.

This report re-envisages the role of evaluators as data stewards, who could use their positions as intermediaries to encourage stakeholders to share data, and help increase access to data for public benefit…(More)”.

EU risks being dethroned as world’s lead digital regulator


Marietje Schaake at the Financial Times: “With a series of executive orders, US president Donald Trump has quickly changed the digital regulatory game. His administration has adopted unprecedented sanctions against the Chinese technology group Huawei; next on the list of likely targets is the Chinese ecommerce group Alibaba.

The TikTok takeover saga continues, since the president this month ordered the sale of its US operations within 90 days. The administration’s Clean Network programme also claims to protect privacy by keeping “unsafe” companies out of US cable, cloud and app infrastructure. Engaging with a shared privacy agenda, which the EU has enshrined in law, would be a constructive step.

Instead, US secretary of state Mike Pompeo has prioritised warnings about the dangers posed by Huawei to individual EU member states during a recent visit. Yet these unilateral American actions also highlight weaknesses in Europe’s own preparedness and unity on issues of national security in the digital world. Beyond emphasising fundamental rights and economic rules, Europe must move fast if it does not want to see other global actors draw the road maps of regulation.

Recent years have seen the acceleration of national security arguments to restrict market access for global technology companies. Decisions on bans and sanctions tend to rely on the type of executive power that the EU lacks, especially in the national security domain. The bloc has never fully developed a common security policy — and deliberately so. In its white paper on artificial intelligence, the European Commission explicitly omits AI in the military context, and European geopolitical clout remains underused by politicians keen to advance their national postures.

Tensions between the promise of a digital single market and the absence of a common approach to security were revealed in fragmented responses to 5G concerns, as well as foreign acquisitions of strategic tech companies. This ad hoc policy toolbox may well prove inadequate to build the co-ordination needed for a forceful European strategy. The US tussle with TikTok and Huawei should be a lesson to European politicians on their approach to regulating tech.

A confident Europe might argue that concerns about terabytes of the most intimate information being shared with foreign companies were promptly met with the EU’s General Data Protection Regulation. A more critical voice would counter that Europe does not appreciate the risks of integrating Chinese tech into 5G networks, and that its narrow focus on fundamental rights and market regulations in the digital world was always naive.

Either way, now that geopolitics is integrating with tech policy, the EU risks being dethroned as the lead regulator of the digital world. In many ways it is remarkable that a reckoning took this long. For decades, online products and services have evaded restrictions on their reach into global communities. But the long-anticipated collision of geopolitics and technological disruption is finally here. It will do significant collateral damage to the open internet.

The challenge for democracies is to preserve their own core values and interests, along with the benefits of an open, global internet. A series of nationalistic bans and restrictions will not achieve these goals. Instead it will unleash a digital trade war at the expense of internet users worldwide…(More)”.

Personal data, public data, privacy & power: GDPR & company data


OpenCorporates: “…there are three other aspects which are relevant when talking about access to EU company data.

Cargo-culting GDPR

The first is a tendency to take the complex and subtle legislation that is the GDPR and copy a poorly understood version of it into other legislation and regulation, even when the activity in question is already covered by the GDPR. This undermines the GDPR regime, prevents it from working effectively, and should be strongly resisted. In the tech world, such approaches are called ‘cargo-culting’.

Similarly GDPR is often used as an excuse for not releasing company information as open data, even when the same data is being sold to third parties apparently without concerns — if one is covered by GDPR, the other certainly should be.

Widened power asymmetries

The second issue is the unintended consequences of GDPR, specifically the way it increases asymmetries of power and agency. For example, something like the so-called Right To Be Forgotten takes very significant resources to implement, and so actually strengthens the position of the giant tech companies — for such companies, investing millions in large teams to decide who should and should not be given the Right To Be Forgotten is just a relatively small cost of doing business.

Another issue is the growth of a whole new industry dedicated to removing traces of people’s past from the internet, which is also increasing the asymmetries of power. The vast majority of people are not directors of companies, or beneficial owners, and it is only the relatively rich and powerful (including politicians and criminals) who can afford lawyers to stifle free speech, or remove parts of their past they would rather not be there, from business failures to associations with criminals.

OpenCorporates, for example, was threatened with a lawsuit from a member of one of the wealthiest families in Europe for reproducing a gazette notice from the Luxembourg official gazette (a publication that contains public notices). We refused to back down, believing we had a good case in law and in the public interest, and the other side gave up. But such so-called SLAPP suits are becoming increasingly common, and, unlike many US states, the EU currently has no defences in place to resist them, despite pressure from civil society to address this….

At the same time, the automatic assumption that all Personally Identifiable Information (PII), someone’s name for example, is private is highly problematic, confusing both citizens and policy makers, and further undermining democracies and fair societies. As an obvious case, it’s critical that we know the names of our elected representatives, and those in positions of power, otherwise we would have an opaque society where decisions are made by nameless individuals with opaque agendas and personal interests — such as a leader awarding a contract to their brother’s company, for example.

As the diagram below illustrates, there is some personally identifiable information that it’s strongly in the public interest to know. Take the director or beneficial owner of a company, for example, of course their details are PII — clearly you need to know their name (and other information too), otherwise what actually do you know about them, or the company (only that some unnamed individual has been given special protection under law to be shielded from the company’s debts and actions, and yet can benefit from its profits)?

On the other hand, much of the data which is truly about our privacy — the profiles, inferences and scores that companies store on us — is explicitly outside GDPR, if it doesn’t contain PII.

[Diagram: some personally identifiable information is strongly in the public interest to know]

Hopefully, as awareness of these issues increases, we will develop a more nuanced, deeper understanding of privacy, such that case law around the GDPR, and successors to this legislation, begin to rebalance and bring clarity to its ambiguities….(More)”.