DATA – Page 313 – The Living Library

The Data Gaze: Capitalism, Power and Perception

Curated on April 3, 2019April 3, 2019 by Stefaan Verhulst

Book by David Beer: “A significant new way of understanding contemporary capitalism is to understand the intensification and spread of data analytics. This text is about the powerful promises and visions that have led to the expansion of data analytics and data-led forms of social ordering.

It is centrally concerned with examining the types of knowledge associated with data analytics and shows that how these analytics are envisioned is central to the emergence and prominence of data at various scales of social life. This text aims to understand the powerful role of the data analytics industry and how this industry facilitates the spread and intensification of data-led processes. As such, The Data Gaze is concerned with understanding how data-led, data-driven and data-reliant forms of capitalism pervade organisational and everyday life.

Using a clear theoretical approach derived from Foucault and critical data studies the text develops the concept of the data gaze and shows how powerful and persuasive it is. It’s an essential and subversive guide to data analytics and data capitalism. …(More)”.

The Economics of Social Data

Curated on April 3, 2019April 3, 2019 by Stefaan Verhulst

Paper by Dirk Bergemann and Alessandro Bonatti: “Large internet platforms collect data from individual users in almost every interaction on the internet. Whenever an individual browses a news website, searches for a medical term or for a travel recommendation, or simply checks the weather forecast on an app, that individual generates data. A central feature of the datacollected from the individuals is its social aspect. Namely, the data captured from an individual user is not only informative about this speciﬁc individual, but also about users in some metric similar to the individual. Thus, the individual data is really social data. The social nature of the data generates an informational externality that we investigate in this note….(More)”.

Trustworthy Privacy Indicators: Grades, Labels, Certifications and Dashboards

Curated on April 3, 2019April 3, 2019 by Stefaan Verhulst

Paper by Joel R. Reidenberg et al: “Despite numerous groups’ efforts to score, grade, label, and rate the privacy of websites, apps, and network-connected devices, these attempts at privacy indicators have, thus far, not been widely adopted. Privacy policies, however, remain long, complex, and impractical for consumers. Communicating in some short-hand form, synthesized privacy content is now crucial to empower internet users and provide them more meaningful notice, as well as nudge consumers and data processors toward more meaningful privacy. Indeed, on the basis of these needs, the National Institute of Standards and Technology and the Federal Trade Commission in the United States, as well as lawmakers and policymakers in the European Union, have advocated for the development of privacy indicator systems.

Efforts to develop privacy grades, scores, labels, icons, certifications, seals, and dashboards have wrestled with various deficiencies and obstacles for the wide-scale deployment as meaningful and trustworthy privacy indicators. This paper seeks to identify and explain these deficiencies and obstacles that have hampered past and current attempts. With these lessons, the article then offers criteria that will need to be established in law and policy for trustworthy indicators to be successfully deployed and adopted through technological tools. The lack of standardization prevents user-recognizability and dependability in the online marketplace, diminishes the ability to create automated tools for privacy, and reduces incentives for consumers and industry to invest in a privacy indicators. Flawed methods in selection and weighting of privacy evaluation criteria and issues interpreting language that is often ambiguous and vague jeopardize success and reliability when baked into an indicator of privacy protectiveness or invasiveness. Likewise, indicators fall short when those organizations rating or certifying the privacy practices are not objective, trustworthy, and sustainable.

Nonetheless, trustworthy privacy rating systems that are meaningful, accurate, and adoptable can be developed to assure effective and enduring empowerment of consumers. This paper proposes a framework using examples from prior and current attempts to create privacy indicator systems in order to provide a valuable resource for present-day, real world policymaking….(More)”.

Understanding algorithmic decision-making: Opportunities and challenges

Curated on April 1, 2019April 1, 2019 by Stefaan Verhulst

Study by Claude Castelluccia and Daniel Le Métayer for the European Parliament: “While algorithms are hardly a recent invention, they are nevertheless increasingly involved in systems used to support decision-making. These systems, known as ‘ADS’ (algorithmic decision systems), often rely on the analysis of large amounts of personal data to infer correlations or, more generally, to derive information deemed useful to make decisions. Human intervention in the decision-making may vary, and may even be completely out of the loop in entirely automated systems. In many situations, the impact of the decision on people can be significant, such as access to credit, employment, medical treatment, or judicial sentences, among other things.

Entrusting ADS to make or to influence such decisions raises a variety of ethical, political, legal, or technical issues, where great care must be taken to analyse and address them correctly. If they are neglected, the expected benefits of these systems may be negated by a variety of different risks for individuals (discrimination, unfair practices, loss of autonomy, etc.), the economy (unfair practices, limited access to markets, etc.), and society as a whole (manipulation, threat to democracy, etc.).

This study reviews the opportunities and risks related to the use of ADS. It presents policy options to reduce the risks and explain their limitations. We sketch some options to overcome these limitations to be able to benefit from the tremendous possibilities of ADS while limiting the risks related to their use. Beyond providing an up-to date and systematic review of the situation, the study gives a precise definition of a number of key terms and an analysis of their differences to help clarify the debate. The main focus of the study is the technical aspects of ADS. However, to broaden the discussion, other legal, ethical and social dimensions are considered….(More)”.

Privacy-Preserved Data Sharing for Evidence-Based Policy Decisions: A Demonstration Project Using Human Services Administrative Records for Evidence-Building Activities

Curated on April 1, 2019April 4, 2019 by Stefaan Verhulst

Paper by the Bipartisan Policy Center: “Emerging privacy-preserving technologies and approaches hold considerable promise for improving data privacy and confidentiality in the 21st century. At the same time, more information is becoming accessible to support evidence-based policymaking.

In 2017, the U.S. Commission on Evidence-Based Policymaking unanimously recommended that further attention be given to the deployment of privacy-preserving data-sharing applications. If these types of applications can be tested and scaled in the near-term, they could vastly improve insights about important policy problems by using disparate datasets. At the same time, the approaches could promote substantial gains in privacy for the American public.

There are numerous ways to engage in privacy-preserving data sharing. This paper primarily focuses on secure computation, which allows information to be accessed securely, guarantees privacy, and permits analysis without making private information available. Three key issues motivated the launch of a domestic secure computation demonstration project using real government-collected data:

Using new privacy-preserving approaches addresses pressing needs in society. Current widely accepted approaches to managing privacy risks—like preventing the identification of individuals or organizations in public datasets—will become less effective over time. While there are many practices currently in use to keep government-collected data confidential, they do not often incorporate modern developments in computer science, mathematics, and statistics in a timely way. New approaches can enable researchers to combine datasets to improve the capability for insights, without being impeded by traditional concerns about bringing large, identifiable datasets together. In fact, if successful, traditional approaches to combining data for analysis may not be as necessary.
There are emerging technical applications to deploy certain privacy-preserving approaches in targeted settings. These emerging procedures are increasingly enabling larger-scale testing of privacy-preserving approaches across a variety of policy domains, governmental jurisdictions, and agency settings to demonstrate the privacy guarantees that accompany data access and use.
Widespread adoption and use by public administrators will only follow meaningful and successful demonstration projects. For example, secure computation approaches are complex and can be difficult to understand for those unfamiliar with their potential. Implementing new privacy-preserving approaches will require thoughtful attention to public policy implications, public opinions, legal restrictions, and other administrative limitations that vary by agency and governmental entity.

This project used real-world government data to illustrate the applicability of secure computation compared to the classic data infrastructure available to some local governments. The project took place in a domestic, non-intelligence setting to increase the salience of potential lessons for public agencies….(More)”.

Know-how: Big Data, AI and the peculiar dignity of tacit knowledge

Curated on March 31, 2019July 18, 2019 by Stefaan Verhulst

Essay by Tim Rogan: “Machine learning – a kind of sub-field of artificial intelligence (AI) – is a means of training algorithms to discern empirical relationships within immense reams of data. Run a purpose-built algorithm by a pile of images of moles that might or might not be cancerous. Then show it images of diagnosed melanoma. Using analytical protocols modelled on the neurons of the human brain, in an iterative process of trial and error, the algorithm figures out how to discriminate between cancers and freckles. It can approximate its answers with a specified and steadily increasing degree of certainty, reaching levels of accuracy that surpass human specialists. Similar processes that refine algorithms to recognise or discover patterns in reams of data are now running right across the global economy: medicine, law, tax collection, marketing and research science are among the domains affected. Welcome to the future, say the economist Erik Brynjolfsson and the computer scientist Tom Mitchell: machine learning is about to transform our lives in something like the way that steam engines and then electricity did in the 19th and 20th centuries.

Signs of this impending change can still be hard to see. Productivity statistics, for instance, remain worryingly unaffected. This lag is consistent with earlier episodes of the advent of new ‘general purpose technologies’. In past cases, technological innovation took decades to prove transformative. But ideas often move ahead of social and political change. Some of the ways in which machine learning might upend the status quo are already becoming apparent in political economy debates.

The discipline of political economy was created to make sense of a world set spinning by steam-powered and then electric industrialisation. Its central question became how best to regulate economic activity. Centralised control by government or industry, or market freedoms – which optimised outcomes? By the end of the 20th century, the answer seemed, emphatically, to be market-based order. But the advent of machine learning is reopening the state vs market debate. Which between state, firm or market is the better means of coordinating supply and demand? Old answers to that question are coming under new scrutiny. In an eye-catching paper in 2017, the economists Binbin Wang and Xiaoyan Li at Sichuan University in China argued that big data and machine learning give centralised planning a new lease of life. The notion that market coordination of supply and demand encompassed more information than any single intelligence could handle would soon be proved false by 21st-century AI.

How seriously should we take such speculations? Might machine learning bring us full-circle in the history of economic thought, to where measures of economic centralisation and control – condemned long ago as dangerous utopian schemes – return, boasting new levels of efficiency, to constitute a new orthodoxy?

A great deal turns on the status of tacit knowledge….(More)”.

Data: The Lever to Promote Innovation in the EU

Curated on March 28, 2019March 28, 2019 by Stefaan Verhulst

Blog Post by Juan Murillo Arias: “…But in order for data to truly become a lever that foments innovation in benefit of society as a whole, we must understand and address the following factors:

1. Disconnected, disperse sources. As users of digital services (transportation, finance, telecommunications, news or entertainment) we leave a different digital footprint for each service that we use. These footprints, which are different facets of the same polyhedron, can even be contradictory on occasion. For this reason, they must be seen as complementary. Analysts should be aware that they must cross data sources from different origins in order to create a reliable picture of our preferences, otherwise we will be basing decisions on partial or biased information. How many times do we receive advertising for items we have already purchased, or tourist destinations where we have already been? And this is just one example of digital marketing. When scoring financial solvency, or monitoring health, the more complete the digital picture is of the person, the more accurate the diagnosis will be.

Furthermore, from the user’s standpoint, proper management of their entire, disperse digital footprint is a challenge. Perhaps centralized consent would be very beneficial. In the financial world, the PSD2 regulations have already forced banks to open this information to other banks if customers so desire. Fostering competition and facilitating portability is the purpose, but this opening up has also enabled the development of new services of information aggregation that are very useful to financial services users. It would be ideal if this step of breaking down barriers and moving toward a more transparent market took place simultaneously in all sectors in order to avoid possible distortions to competition and by extension, consumer harm. Therefore, customer consent would open the door to building a more accurate picture of our preferences.

2. The public and private sectors’ asymmetric capacity to gather data.This is related to citizens using public services less frequently than private services in the new digital channels. However, governments could benefit from the information possessed by private companies. These anonymous, aggregated data can help to ensure a more dynamic public management. Even personal data could open the door to customized education or healthcare on an individual level. In order to analyze all of this, the European Commissionhas created a working group including 23 experts. The purpose is to come up with a series of recommendations regarding the best legal, technical and economic framework to encourage this information transfer across sectors.

3. The lack of incentives for companies and citizens to encourage the reuse of their data.The reality today is that most companies solely use the sources internally. Only a few have decided to explore data sharing through different models (for academic research or for the development of commercial services). As a result of this and other factors, the public sector largely continues using the survey method to gather information instead of reading the digital footprint citizens produce. Multiple studies have demonstrated that this digital footprint would be useful to describe socioeconomic dynamics and monitor the evolution of official statistical indicators. However, these studies have rarely gone on to become pilot projects due to the lack of incentives for a private company to open up to the public sector, or to society in general, making this new activity sustainable.

4. Limited commitment to the diversification of services.Another barrier is the fact that information based product development is somewhat removed from the type of services that the main data generators (telecommunications, banks, commerce, electricity, transportation, etc.) traditionally provide. Therefore, these data based initiatives are not part of their main business and are more closely tied to companies’ innovation areas where exploratory proofs of concept are often not consolidated as a new line of business.

5. Bidirectionality. Data should also flow from the public sector to the rest of society. The first regulatory framework was created for this purpose. Although it is still very recent (the PSI Directive on the re-use of public sector data was passed in 2013), it is currently being revised, in attempt to foster the consolidation of an open data ecosystem that emanates from the public sector as well. On the one hand it would enable greater transparency, and on the other, the development of solutions to improve multiple fields in which public actors are key, such as the environment, transportation and mobility, health, education, justice and the planning and execution of public works. Special emphasis will be placed on high value data sets, such as statistical or geospatial data — data with tremendous potential to accelerate the emergence of a wide variety of information based data products and services that add value.The Commission will begin working with the Member States to identify these data sets.

In its report, Creating Data through Open Data, the European open data portal estimates that government agencies making their data accessible will inject an extra €65 billion in the EU economy this year.

6. The commitment to analytical training and financial incentives for innovation.They are the key factors that have given rise to the digital unicorns that have emerged, more so in the U.S. and China than in Europe….(More)”

Protection of health-related data: new guidelines

Curated on March 28, 2019March 28, 2019 by Stefaan Verhulst

Press Release: “The Council of Europe has issued a set of guidelines to its 47 member states urging them to ensure, in law and practice, that the processing of health-related data is done in full respect of human rights, notably the right to privacy and data protection.

With the development of new technological tools in the health sector the volume of health-related data processed has grown exponentially showing the need for guidance for health administrations and professionals.

In a Recommendation, applicable to both the public and private sectors, the Council of Europe´s Committee of Ministers, calls on governments to transmit these guidelines to health-care systems and to actors processing health-related data, in particular health-care professionals and data protection officers.

The recommendation contains a set of principles to protect health-related data incorporating the novelties introduced in the updated Council of Europe data protection convention, known as “Convention 108+”, opened for signature in October 2018.

The Committee of Ministers underlines that health-related data should be protected by appropriate security measures taking into account the latest technological developments, their sensitive nature and the assessment of potential risks. Protection measures should be incorporated by design to any information system which processes health-related data.

The recommendation contains guidance with regard to various issues including the legitimate basis for the data processing of health-care data – notably consent by the data subject -, data concerning unborn children, health-related genetic data, the sharing of health-related data by professionals and the storage of data.

The guidelines list a number of rights of data subjects, crucially the transparency of data processing. They also contain a number of principles that should be respected when data are processed for scientific research, when they are collected by mobile devices or when they are transferred across borders….(More)”.

New York City ‘Open Data’ Paves Way for Innovative Technology

Curated on March 28, 2019May 29, 2019 by Stefaan Verhulst

Leo Gringut at the International Policy Digest: “The philosophy behind “Open Data for All” turns on the idea that easy access to government data offers everyday New Yorkers the chance to grow and innovate: “Data is more than just numbers – it’s information that can create new opportunities and level the playing field for New Yorkers. It’s the illumination that changes frameworks, the insight that turns impenetrable issues into solvable problems.” Fundamentally, the newfound accessibility of City data is revolutionizing NYC business. According to Albert Webber, Program Manager for Open Data, City of New York, a key part of his job is “to engage the civic technology community that we have, which is very strong, very powerful in New York City.”

Fundamentally, Open Data is a game-changer for hundreds of New York companies, from startups to corporate giants, all of whom rely on data for their operations. The effect is set to be particularly profound in New York City’s most important economic sector: real estate. Seeking to transform the real estate and construction market in the City, valued at a record-setting $1 trillion in 2016, companies have been racing to develop tools that will harness the power of Open Data to streamline bureaucracy and management processes.

One such technology is the Citiscape app. Developed by a passionate team of real estate experts with more than 15 years of experience in the field, the app assembles data from the Department of Building and the Environmental Control Board into one easy-to-navigate interface. According to Citiscape Chief Operational Officer Olga Khaykina, the secret is in the app’s simplicity, which puts every aspect of project management at the user’s fingertips. “We made DOB and ECB just one tap away,” said Khaykina. “You’re one tap away from instant and accurate updates and alerts from the DOB that will keep you informed about any changes to ongoing project. One tap away from organized and cloud-saved projects, including accessible and coordinated interaction with all team members through our in-app messenger. And one tap away from uncovering technical information about any building in NYC, just by entering its address.” Gone are the days of continuously refreshing the DOB website in hopes of an update on a minor complaint or a status change regarding your project; Citiscape does the busywork so you can focus on your project.

The Citiscape team emphasized that, without access to Open Data, this project would have been impossible….(More)”.

Fearful of fake news blitz, U.S. Census enlists help of tech giants

Curated on March 28, 2019March 28, 2019 by Stefaan Verhulst

Nick Brown at Reuters: “The U.S. Census Bureau has asked tech giants Google, Facebook and Twitter to help it fend off “fake news” campaigns it fears could disrupt the upcoming 2020 count, according to Census officials and multiple sources briefed on the matter.

The push, the details of which have not been previously reported, follows warnings from data and cybersecurity experts dating back to 2016 that right-wing groups and foreign actors may borrow the “fake news” playbook from the last presidential election to dissuade immigrants from participating in the decennial count, the officials and sources told Reuters.

The sources, who asked not to be named, said evidence included increasing chatter on platforms like “4chan” by domestic and foreign networks keen to undermine the survey. The census, they said, is a powerful target because it shapes U.S. election districts and the allocation of more than $800 billion a year in federal spending.

Ron Jarmin, the Deputy Director of the Census Bureau, confirmed the bureau was anticipating disinformation campaigns, and was enlisting the help of big tech companies to fend off the threat.

“We expect that (the census) will be a target for those sorts of efforts in 2020,” he said.

Census Bureau officials have held multiple meetings with tech companies since 2017 to discuss ways they could help, including as recently as last week, Jarmin said.

So far, the bureau has gotten initial commitments from Alphabet Inc’s Google, Twitter Inc and Facebook Inc to help quash disinformation campaigns online, according to documents summarizing some of those meetings reviewed by Reuters.

But neither Census nor the companies have said how advanced any of the efforts are….(More)”.