Making NHS data work for everyone


Reform: This report looks at the access and use of NHS data by private sector companies for research or product and service development purposes….

The private sector is an important partner to the NHS and plays a crucial role in the development of healthcare technologies that use data collected by hospitals or GP practices. It provides the skills and know-how to develop data-driven tools which can be used to improve patient care. However, this is not a one-sided exchange as the NHS makes the data available to build these tools and offers medical expertise to make sense of the data. This is known as the “value exchange”. Our research uncovered that there is a lack of clarity over what a fair value exchange looks like. This lack of clarity in conjunction with the lack of national guidance on the types of partnerships that could be developed has led to a patchwork on the ground….

Knowing what the “value exchange” is between patients, the NHS and industry allows for a more informed conversation about what constitutes a fair partnership when access to data is granted to create a product or service.

WHAT NEEDS TO CHANGE?

  1. Engage with the public
  2. A national strategy
  3. Access to good quality data
  4. Commercial and legal skills…(More)

We Need an FDA For Algorithms


Interview with Hannah Fry on the promise and danger of an AI world by Michael Segal: “…Why do we need an FDA for algorithms?

It used to be the case that you could just put any old colored liquid in a glass bottle and sell it as medicine and make an absolute fortune. And then not worry about whether or not it’s poisonous. We stopped that from happening because, well, for starters it’s kind of morally repugnant. But also, it harms people. We’re in that position right now with data and algorithms. You can harvest any data that you want, on anybody. You can infer any data that you like, and you can use it to manipulate them in any way that you choose. And you can roll out an algorithm that genuinely makes massive differences to people’s lives, both good and bad, without any checks and balances. To me that seems completely bonkers. So I think we need something like the FDA for algorithms. A regulatory body that can protect the intellectual property of algorithms, but at the same time ensure that the benefits to society outweigh the harms.

Why is the regulation of medicine an appropriate comparison?

If you swallow a bottle of colored liquid and then you keel over the next day, then you know for sure it was poisonous. But there are much more subtle things in pharmaceuticals that require expert analysis to be able to weigh up the benefits and the harms. To study the chemical profile of these drugs that are being sold and make sure that they actually are doing what they say they’re doing. With algorithms it’s the same thing. You can’t expect the average person in the street to study Bayesian inference or be totally well read in random forests, and have the kind of computing prowess to look up a code and analyze whether it’s doing something fairly. That’s not realistic. Simultaneously, you can’t have some code of conduct that every data science person signs up to, and agrees that they won’t tread over some lines. It has to be a government, really, that does this. It has to be government that analyzes this stuff on our behalf and makes sure that it is doing what it says it does, and in a way that doesn’t end up harming people.

How did you come to write a book about algorithms?

Back in 2011, we had these really bad riots in London. I’d been working on a project with the Metropolitan Police, trying mathematically to look at how these riots had spread and to use algorithms to ask how the police could have done better. I went to give a talk in Berlin about this paper we’d published about our work, and they completely tore me apart. They were asking questions like, “Hang on a second, you’re creating this algorithm that has the potential to be used to suppress peaceful demonstrations in the future. How can you morally justify the work that you’re doing?” I’m kind of ashamed to say that it just hadn’t occurred to me at that point in time. Ever since, I have really thought a lot about the point that they made. And started to notice around me that other researchers in the area weren’t necessarily treating the data that they were working with, and the algorithms that they were creating, with the ethical concern they really warranted. We have this imbalance where the people who are making algorithms aren’t talking to the people who are using them. And the people who are using them aren’t talking to the people who are having decisions made about their lives by them. I wanted to write something that united those three groups….(More)”.

G20/OECD Compendium of good practices on the use of open data for Anti-corruption


OECD: “This compendium of good practices was prepared by the OECD at the request of the G20 Anti-corruption Working Group (ACWG), to raise awareness of the benefits of open data policies and initiatives in: 

  • fighting corruption,
  • increasing public sector transparency and integrity,
  • fostering economic development and social innovation.

This compendium provides an overview of initiatives for the publication and re-use of open data to fight corruption across OECD and G20 countries and underscores the impact that a digital transformation of the public sector can deliver in terms of better governance across policy areas.  The practices illustrate the use of open data as a way of fighting corruption and show how open data principles can be translated into concrete initiatives.

The publication is divided into three sections:

Section 1 discusses the benefits of open data for greater public sector transparency and performance, national competitiveness and social engagement, and how these initiatives contribute to greater public trust in government.

Section 2 highlights the preconditions necessary across different policy areas related to anti-corruption (e.g. open government, public procurement) to sustain the implementation of an “Open by default” approach that could help government move from a perspective that focuses on increasing access to public sector information to one that enhances the publication of open government data for re-use and value co-creation. 

Section 3 presents the results of the OECD survey administered across OECD and G20 countries, good practices on the publishing and reusing of open data for anti-corruption in G20 countries, and lessons learned from the definition and implementation of these initiatives. This chapter also discusses the implications for broader national matters such as freedom of press, and the involvement of key actors of the open data ecosystem (e.g. journalists and civil society organisations) as key partners in open data re-use for anti-corruption…(More)”.

Data Flow in the Smart City: Open Data Versus the Commons


Chapter by Richard Beckwith, John Sherry and David Prendergast in The Hackable City: “Much of the recent excitement around data, especially ‘Big Data,’ focuses on the potential commercial or economic value of data. How that data will affect people isn’t much discussed. People know that smart cities will deploy Internet-based monitoring and that flows of the collected data promise to produce new values. Less considered is that smart cities will be sites of new forms of citizen action—enabled by an ‘economy’ of data that will lead to new methods of collectivization, accountability, and control which, themselves, can provide both positive and negative values to the citizenry. Therefore, smart city design needs to consider not just measurement and publication of data but also the implications of city-wide deployment, data openness, and the possibility of unintended consequences if data leave the city….(More)”.

The Seductive Diversion of ‘Solving’ Bias in Artificial Intelligence


Blog by Julia Powles and Helen Nissenbaum: “Serious thinkers in academia and business have swarmed to the A.I. bias problem, eager to tweak and improve the data and algorithms that drive artificial intelligence. They’ve latched onto fairness as the objective, obsessing over competing constructs of the term that can be rendered in measurable, mathematical form. If the hunt for a science of computational fairness was restricted to engineers, it would be one thing. But given our contemporary exaltation and deference to technologists, it has limited the entire imagination of ethics, law, and the media as well.

There are three problems with this focus on A.I. bias. The first is that addressing bias as a computational problem obscures its root causes. Bias is a social problem, and seeking to solve it within the logic of automation is always going to be inadequate.

Second, even apparent success in tackling bias can have perverse consequences. Take the example of a facial recognition system that works poorly on women of color because of the group’s underrepresentation both in the training data and among system designers. Alleviating this problem by seeking to “equalize” representation merely co-opts designers in perfecting vast instruments of surveillance and classification.

When underlying systemic issues remain fundamentally untouched, the bias fighters simply render humans more machine readable, exposing minorities in particular to additional harms.

Third — and most dangerous and urgent of all — is the way in which the seductive controversy of A.I. bias, and the false allure of “solving” it, detracts from bigger, more pressing questions. Bias is real, but it’s also a captivating diversion.

What has been remarkably underappreciated is the key interdependence of the twin stories of A.I. inevitability and A.I. bias. Against the corporate projection of an otherwise sunny horizon of unstoppable A.I. integration, recognizing and acknowledging bias can be seen as a strategic concession — one that subdues the scale of the challenge. Bias, like job losses and safety hazards, becomes part of the grand bargain of innovation.

The reality that bias is primarily a social problem and cannot be fully solved technically becomes a strength, rather than a weakness, for the inevitability narrative. It flips the script. It absorbs and regularizes the classification practices and underlying systems of inequality perpetuated by automation, allowing relative increases in “fairness” to be claimed as victories — even if all that is being done is to slice, dice, and redistribute the makeup of those negatively affected by actuarial decision-making.

In short, the preoccupation with narrow computational puzzles distracts us from the far more important issue of the colossal asymmetry between societal cost and private gain in the rollout of automated systems. It also denies us the possibility of asking: Should we be building these systems at all?…(More)”.

To Reduce Privacy Risks, the Census Plans to Report Less Accurate Data


Mark Hansen at the New York Times: “When the Census Bureau gathered data in 2010, it made two promises. The form would be “quick and easy,” it said. And “your answers are protected by law.”

But mathematical breakthroughs, easy access to more powerful computing, and widespread availability of large and varied public data sets have made the bureau reconsider whether the protection it offers Americans is strong enough. To preserve confidentiality, the bureau’s directors have determined they need to adopt a “formal privacy” approach, one that adds uncertainty to census data before it is published and achieves privacy assurances that are provable mathematically.

The census has always added some uncertainty to its data, but a key innovation of this new framework, known as “differential privacy,” is a numerical value describing how much privacy loss a person will experience. It determines the amount of randomness — “noise” — that needs to be added to a data set before it is released, and sets up a balancing act between accuracy and privacy. Too much noise would mean the data would not be accurate enough to be useful — in redistricting, in enforcing the Voting Rights Act or in conducting academic research. But too little, and someone’s personal data could be revealed.
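The bureau’s production system is far more elaborate, but the core idea behind this balancing act can be sketched with the classic Laplace mechanism: noise is calibrated to a query’s sensitivity (how much one person can change the answer) divided by the privacy-loss parameter ε. The function names and the example counts below are illustrative assumptions, not the Census Bureau’s actual code or parameters.

```python
import math
import random

def laplace_sample(scale: float) -> float:
    # Inverse-CDF sampling from a Laplace(0, scale) distribution.
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def private_count(true_count: int, epsilon: float) -> float:
    # A counting query changes by at most 1 when one person is added or
    # removed (sensitivity 1), so Laplace noise with scale 1/epsilon
    # gives an epsilon-differentially-private release of the count.
    return true_count + laplace_sample(1.0 / epsilon)

# Smaller epsilon = stronger privacy guarantee, but noisier answers.
random.seed(0)
print(private_count(1200, 1.0))   # typically close to 1200
print(private_count(1200, 0.01))  # can be off by hundreds
```

The trade-off the article describes is visible directly in the `scale = 1/epsilon` term: halving ε doubles the typical noise added to every published count.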

On Thursday, the bureau will announce the trade-off it has chosen for data publications from the 2018 End-to-End Census Test it conducted in Rhode Island, the only dress rehearsal before the actual census in 2020. The bureau has decided to enforce stronger privacy protections than companies like Apple or Google had when they each first took up differential privacy….

In presentation materials for Thursday’s announcement, special attention is paid to lessening any problems with redistricting: the potential complications of using noisy counts of voting-age people to draw district lines. (By contrast, in 2000 and 2010 the swapping mechanism produced exact counts of potential voters down to the block level.)

The Census Bureau has been an early adopter of differential privacy. Still, instituting the framework on such a large scale is not an easy task, and even some of the big technology firms have had difficulties. For example, shortly after Apple’s announcement in 2016 that it would use differential privacy for data collected from its macOS and iOS operating systems, it was revealed that the actual privacy loss of their systems was much higher than advertised.

Some scholars question the bureau’s abandonment of techniques like swapping in favor of differential privacy. Steven Ruggles, Regents Professor of history and population studies at the University of Minnesota, has relied on census data for decades. Through the Integrated Public Use Microdata Series, he and his team have regularized census data dating to 1850, providing consistency between questionnaires as the forms have changed, and enabling researchers to analyze data across years.

“All of the sudden, Title 13 gets equated with differential privacy — it’s not,” he said, adding that if you make a guess about someone’s identity from looking at census data, you are probably wrong. “That has been regarded in the past as protection of privacy. They want to make it so that you can’t even guess.”

“There is a trade-off between usability and risk,” he added. “I am concerned they may go far too far on privileging an absolutist standard of risk.”

In a working paper published Friday, he said that with the number of private services offering personal data, a prospective hacker would have little incentive to turn to public data such as the census “in an attempt to uncover uncertain, imprecise and outdated information about a particular individual.”…(More)”.

New methods help identify what drives sensitive or socially unacceptable behaviors


Mary Guiden at Physorg: “Conservation scientists and statisticians at Colorado State University have teamed up to solve a key problem for the study of sensitive behaviors like poaching, harassment, bribery, and drug use.

Sensitive behaviors—defined as socially unacceptable or not compliant with rules and regulations—are notoriously hard to study, researchers say, because people often do not want to answer direct questions about them.

To overcome this challenge, scientists have developed indirect questioning approaches that protect responders’ identities. However, these methods also make it difficult to predict which sectors of a population are more likely to participate in sensitive behaviors, and which factors, such as knowledge of laws, education, or income, influence the probability that an individual will engage in a sensitive behavior.

Assistant Professor Jennifer Solomon and Associate Professor Michael Gavin of the Department of Human Dimensions of Natural Resources at CSU, and Abu Conteh from MacEwan University in Alberta, Canada, have teamed up with Professor Jay Breidt and doctoral student Meng Cao in the CSU Department of Statistics to develop a new method to solve the problem.

The study, “Understanding the drivers of sensitive behavior using Poisson regression from quantitative randomized response technique data,” was published recently in PLOS One.

Conteh, who, as a doctoral student, worked with Gavin in New Zealand, used a specific technique, known as quantitative randomized response, to elicit confidential answers to questions on behaviors related to non-compliance with natural resource regulations from a protected area in Sierra Leone.

In this technique, the researcher conducting interviews has a large container of pingpong balls, some with numbers and some without. The interviewer asks the respondent to pick a ball at random, without revealing it to the interviewer. If the ball has a number, the respondent tells the interviewer the number. If the ball does not have a number, the respondent reveals how many times he illegally hunted animals in a given time period….
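The study’s actual analysis fits a Poisson regression to such responses; as a simpler hedged sketch of why the protocol works, the mechanism and its aggregate estimator can be simulated as follows. The ball values, mixing probability, and simulated counts are illustrative assumptions, not the values used in the Sierra Leone study.

```python
import random

def interview(true_count: int, numbered_balls: list, p_numbered: float) -> int:
    # One interview: with probability p_numbered the respondent draws a
    # numbered ball and reports its printed number; otherwise (a blank
    # ball) they report their true sensitive count. The interviewer
    # never learns which case occurred, so no single answer is revealing.
    if random.random() < p_numbered:
        return random.choice(numbered_balls)
    return true_count

def estimate_mean(responses, numbered_balls, p_numbered):
    # E[response] = p * mean(balls) + (1 - p) * mean(true counts),
    # so the population mean of the sensitive behavior can be recovered
    # in aggregate even though individual answers stay confidential.
    mean_r = sum(responses) / len(responses)
    mean_b = sum(numbered_balls) / len(numbered_balls)
    return (mean_r - p_numbered * mean_b) / (1.0 - p_numbered)

random.seed(1)
balls = [0, 1, 2, 3, 4, 5]
true_counts = [random.randint(0, 3) for _ in range(10000)]
responses = [interview(c, balls, 0.5) for c in true_counts]
print(estimate_mean(responses, balls, 0.5))  # close to the true mean (~1.5)
```

The price of the privacy protection is visible in the `1 - p_numbered` divisor: the more often respondents report a ball’s number instead of the truth, the noisier the recovered estimate becomes.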

Armed with the new computer program, the scientists found that people from rural communities with less access to jobs in urban centers were more likely to hunt in the reserve. People in communities with a greater proportion of people displaced by Sierra Leone’s 10-year civil war were also more likely to hunt illegally.

The researchers said that collaborating across disciplines was and is key to addressing complex problems like this one. It is commonplace for people to be noncompliant with rules and regulations and equally important for social scientists to analyze these behaviors….(More)”.

Beyond GDP: Measuring What Counts for Economic and Social Performance


OECD Book: “Metrics matter for policy and policy matters for well-being. In this report, the co-chairs of the OECD-hosted High Level Expert Group on the Measurement of Economic Performance and Social Progress, Joseph E. Stiglitz, Jean-Paul Fitoussi and Martine Durand, show how over-reliance on GDP as the yardstick of economic performance misled policy makers who did not see the 2008 crisis coming. When the crisis did hit, concentrating on the wrong indicators meant that governments made inadequate policy choices, with severe and long-lasting consequences for many people.

While GDP is the most well-known and most powerful economic indicator, it can’t tell us everything we need to know about the health of countries and societies. In fact, it can’t even tell us everything we need to know about economic performance. We need to develop dashboards of indicators that reveal who is benefitting from growth, whether that growth is environmentally sustainable, how people feel about their lives, what factors contribute to an individual’s or a country’s success. This book looks at progress made over the past 10 years in collecting well-being data, and in using them to inform policies. An accompanying volume, For Good Measure: Advancing Research on Well-being Metrics Beyond GDP, presents the latest findings from leading economists and statisticians on selected issues within the broader agenda on defining and measuring well-being….(More)”

Open Government Data for Inclusive Development


Chapter by F. van Schalkwyk and M. Cañares in “Making Open Development Inclusive”, MIT Press, Matthew L. Smith and Ruhiya Kris Seward (Eds.): “This chapter examines the relationship between open government data and social inclusion. Twenty-eight open data initiatives from the Global South are analyzed to find out how and in what contexts the publication of open government data tends to result in the inclusion of habitually marginalized communities in governance processes such that they may lead better lives.

The relationship between open government data and social inclusion is examined by presenting an analysis of the outcomes of open data projects. This analysis is based on a constellation of factors that were identified as having a bearing on open data initiatives with respect to inclusion. The findings indicate that open data can contribute to an increase in access and participation— both components of inclusion. In these cases, this particular finding indicates that a more open, participatory approach to governance practice is taking root. However, the findings also show that access and participation approaches to open government data have, in the cases studied here, not successfully disrupted the concentration of power in political and other networks, and this has placed limits on open data’s contribution to a more inclusive society.

The chapter starts by presenting a theoretical framework for the analysis of the relationship between open data and inclusion. The framework sets out the complex relationship between social actors, information and power in the network society. This is critical, we suggest, in developing a realistic analysis of the contexts in which open data activates its potential for transformation. The chapter then articulates the research questions and presents the methodology used to operationalize those questions. The findings and discussion section that follows examines the factors affecting the relationship between open data and inclusion, and how these factors are observed to play out across several open data initiatives in different contexts. The chapter ends with concluding remarks and an attempt to synthesize the insights that emerged in the preceding sections….(More)”.

Better Data for Doing Good: Responsible Use of Big Data and Artificial Intelligence


Report by the World Bank: “Describes opportunities for harnessing the value of big data and artificial intelligence (AI) for social good and how new families of AI algorithms now make it possible to obtain actionable insights automatically and at scale. Beyond internet business or commercial applications, multiple examples already exist of how big data and AI can help achieve shared development objectives, such as the 2030 Agenda for Sustainable Development and the Sustainable Development Goals (SDGs). But ethical frameworks in line with increased uptake of these new technologies remain necessary—not only concerning data privacy but also relating to the impact and consequences of using data and algorithms. Public recognition has grown concerning AI’s potential to create both opportunities for societal benefit and risks to human rights. Development calls for seizing the opportunity to shape future use as a force for good, while at the same time ensuring the technologies address inequalities and avoid widening the digital divide….(More)”.