Data-Driven Development


Report by the World Bank: “…Decisions based on data can greatly improve people’s lives. Data can uncover patterns, unexpected relationships and market trends, making it possible to address previously intractable problems and leverage hidden opportunities. For example, tracking genes associated with certain types of cancer to improve treatment, or using commuter travel patterns to devise public transportation that is affordable and accessible for users, as well as profitable for operators.

Data is clearly a precious commodity, and the report points out that people should have greater control over the use of their personal data. Broadly speaking, there are three possible answers to the question “Who controls our data?”: firms, governments, or users. No global consensus yet exists on the extent to which private firms that mine data about individuals should be free to use the data for profit and to improve services.

Users’ willingness to share data in return for benefits and free services – such as virtually unrestricted use of social media platforms – varies widely by country. In addition, early internet adopters, who grew up with the internet and are now aged 30–40, are the most willing to share (GfK 2017).

Are you willing to share your data? (source: GfK 2017)


On the other hand, data can worsen the digital divide – the data poor, who leave no digital trail because they have limited access, are most at risk of exclusion from services, opportunities and rights, as are those who lack a digital ID, for instance.

Firms and Data

For private sector firms, particularly those in developing countries, the report suggests how they might expand their markets and improve their competitive edge. Companies are already developing new markets and making profits by analyzing data to better understand their customers. This is transforming conventional business models. For years, telecommunications has been funded by users paying for phone calls. Today, advertisers paying for users’ data and attention are funding the internet, social media, and other platforms, such as apps, reversing the value flow.

Governments and Data

For governments and development professionals, the report provides guidance on how they might use data more creatively to help tackle key global challenges, such as eliminating extreme poverty, promoting shared prosperity, or mitigating the effects of climate change. The first step is developing appropriate guidelines for data sharing and use, and for anonymizing personal data. Governments are already beginning to use the huge quantities of data they hold to enhance service delivery, though they still have far to go to catch up with the commercial giants, the report finds.

Data for Development

The Information and Communications for Development report analyses how the data revolution is changing the behavior of governments, individuals, and firms and how these changes affect economic, social, and cultural development. This is a topic of growing importance that cannot be ignored, and the report aims to stimulate wider debate on the unique challenges and opportunities of data for development. It will be useful for policy makers, but also for anyone concerned about how their personal data is used and how the data revolution might affect their future job prospects….(More)”.

NHS Pulls Out Of Data-Sharing Deal With Home Office Immigration Enforcers


Jasmin Gray at Huffington Post: “The NHS has pulled out of a controversial data-sharing arrangement with the Home Office which saw confidential patients’ details passed on to immigration enforcers.

In May, the government suspended the ‘memorandum of understanding’ agreement between the health service and the Home Office after MPs, doctors and health charities warned it was leaving seriously ill migrants too afraid to seek medical treatment. 

But on Tuesday, NHS Digital announced that it was cutting itself out of the agreement altogether. 

“NHS Digital has received a revised narrowed request from the Home Office and is discussing this request with them,” a spokesperson for the data-branch of the health service said, adding that they have “formally closed-out our participation” in the previous memorandum of understanding. 

The anxieties of “multiple stakeholder communities” about ensuring the agreement made by the government was respected were taken into account in the decision, they added.

Meanwhile, the Home Office confirmed it was working to agree a new deal with NHS Digital which would only allow it to make requests for data about migrants “facing deportation action because they have committed serious crimes, or where information necessary to protect someone’s welfare”. 

The move has been welcomed by campaigners, with Migrants’ Rights Network director Rita Chadha saying that many migrants had missed out on “the right to privacy and access to healthcare” because of the data-sharing mechanism….(More)”.

Beyond Open vs. Closed: Balancing Individual Privacy and Public Accountability in Data Sharing


Paper by Bill Howe et al: “Data too sensitive to be “open” for analysis and re-purposing typically remains “closed” as proprietary information. This dichotomy undermines efforts to make algorithmic systems more fair, transparent, and accountable. Access to proprietary data in particular is needed by government agencies to enforce policy, researchers to evaluate methods, and the public to hold agencies accountable; all of these needs must be met while preserving individual privacy and firm competitiveness. In this paper, we describe an integrated legal-technical approach provided by a third-party public-private data trust designed to balance these competing interests.

Basic membership allows firms and agencies to enable low-risk access to data for compliance reporting and core methods research, while modular data sharing agreements support a wide array of projects and use cases. Unless specifically stated otherwise in an agreement, all data access is initially provided to end users through customized synthetic datasets that offer a) strong privacy guarantees, b) removal of signals that could expose competitive advantage for the data providers, and c) removal of biases that could reinforce discriminatory policies, all while maintaining empirically good fidelity to the original data. We find that the liberal use of synthetic data, in conjunction with strong legal protections over raw data, strikes a tunable balance between transparency, proprietorship, privacy, and research objectives; and that the legal-technical framework we describe can form the basis for organizational data trusts in a variety of contexts….(More)”.

To turn the open data revolution from idea to reality, we need more evidence


Stefaan Verhulst at apolitical: “The idea that we are living in a data age — one characterised by unprecedented amounts of information with unprecedented potential — has become mainstream. We regularly read “data is the new oil,” or “data is the most valuable commodity in the global economy.”

Doubtlessly, there is truth in these statements. But a major, often unacknowledged problem is how much data remains inaccessible, hidden in siloes and behind walls.

For close to a decade, the technology and public interest community has pushed the idea of open data. At its core, open data represents a new paradigm of information and information access.

Rooted in notions of an information commons — developed by scholars like Nobel Prize winner Elinor Ostrom — and borrowing from the language of open source, open data begins from the premise that data collected from the public, often using public funds or publicly funded infrastructure, should also belong to the public — or at least, be made broadly accessible to those pursuing public-interest goals.

The open data movement has reached significant milestones in its short history. An ever-increasing number of governments across both developed and developing economies have released large datasets for the public’s benefit….

Similarly, a growing number of private companies have “Data Collaboratives” leveraging their data — with various degrees of limitations — to serve the public interest.

Despite such initiatives, many open data projects (and data collaboratives) remain fledgling. The field has trouble scaling projects beyond initial pilots. In addition, many potential stakeholders — private sector and government “owners” of data, as well as public beneficiaries — remain sceptical of open data’s value. Such limitations need to be overcome if open data and its benefits are to spread. We need hard evidence of its impact.

Ironically, the field is held back by an absence of good data on open data — that is, a lack of reliable empirical evidence that could guide new initiatives.

At the GovLab, a do-tank at New York University, we study the impact of open data. One of our overarching conclusions is that we need a far more solid evidence base to move open data from being a good idea to reality.

What do we know? Several initiatives undertaken at the GovLab offer insight. Our ODImpact website now includes more than 35 detailed case studies of open government data projects. These examples provide powerful evidence not only that open data can work but also about how it works….

We have also launched an Open Data Periodic Table to better understand what conditions predispose an open data project toward success or failure. For example, having a clear problem definition, as well as the capacity and culture to carry out open data projects, are vital. Successful projects also build cross-sector partnerships around open data and its potential uses, establish practices to assess and mitigate risks, and have transparent and responsive governance structures….(More)”.

The Three Goals and Five Functions of Data Stewards


Medium Article by Stefaan G. Verhulst: “…Yet even as we see more data steward-type roles defined within companies, there exists considerable confusion about just what they should be doing. In particular, we have noticed a tendency to conflate the roles of data stewards with those of individuals or groups who might be better described as chief privacy, chief data or security officers. This slippage is perhaps understandable, but our notion of the role is somewhat broader. While privacy and security are of course key components of trusted and effective data collaboratives, the real goal is to leverage private data for broader social goals — while preventing harm.

So what are the necessary attributes of data stewards? What are their roles, responsibilities, and goals? And how can they be most effective, both as champions of sharing within organizations and as facilitators for leveraging data with external entities? These are some of the questions we seek to address in our current research, and below we outline some key preliminary findings.

The following “Three Goals” and “Five Functions” can help define the aspirations of data stewards, and what is needed to achieve the goals. While clearly only a start, these attributes can help guide companies currently considering setting up sharing initiatives or establishing data steward-like roles.

The Three Goals of Data Stewards

  • Collaborate: Data stewards are committed to working and collaborating with others, with the goal of unlocking the inherent value of data when a clear case exists that it serves the public good and that it can be used in a responsible manner.
  • Protect: Data stewards are committed to managing private data ethically, which means sharing information responsibly, and preventing harm to potential customers, users, corporate interests, the wider public and of course those individuals whose data may be shared.
  • Act: Data stewards are committed to acting proactively to identify partners who may be in a better position to unlock value and insights contained within privately held data.

…(More)”.

Google, T-Mobile Tackle 911 Call Problem


Sarah Krouse at the Wall Street Journal: “Emergency call operators will soon have an easier time pinpointing the whereabouts of Android phone users.

Google has struck a deal with T-Mobile US to pipe location data from cellphones with Android operating systems in the U.S. to emergency call centers, said Fiona Lee, who works on global partnerships for Android emergency location services.

The move is a sign that smartphone operating system providers and carriers are taking steps to improve the quality of location data they send when customers call 911. Locating callers has become a growing problem for 911 operators as cellphone usage has proliferated. Wireless devices now make 80% or more of the 911 calls placed in some parts of the U.S., according to the trade group National Emergency Number Association. There are roughly 240 million calls made to 911 annually.

While landlines deliver an exact address, cellphones typically register only an estimated location provided by wireless carriers that can be as wide as a few hundred yards and imprecise indoors.

That has meant that while many popular applications like Uber can pinpoint users, 911 call takers can’t always do so. Technology giants such as Google and Apple Inc. that run phone operating systems need a direct link to the technology used within emergency call centers to transmit precise location data….

Google currently offers emergency location services in 14 countries around the world by partnering with carriers and companies that are part of local emergency communications infrastructure. Its location data is based on a combination of inputs, from Wi-Fi to sensors, GPS and mobile network information.

Jim Lake, director at the Charleston County Consolidated 9-1-1 Center, participated in a pilot of Google’s emergency location services and said it made it easier to find people who didn’t know their location, particularly because the area draws tourists.

“On a day-to-day basis, most people know where they are, but when they don’t, usually those are the most horrifying calls and we need to know right away,” Mr. Lake said.

In June, Apple said it had partnered with RapidSOS to send iPhone users’ location information to 911 call centers….(More)”

How Charities Are Using Artificial Intelligence to Boost Impact


Nicole Wallace at the Chronicle of Philanthropy: “The chaos and confusion of conflict often separate family members fleeing for safety. The nonprofit Refunite uses advanced technology to help loved ones reconnect, sometimes across continents and after years of separation.

Refugees register with the service by providing basic information — their name, age, birthplace, clan and subclan, and so forth — along with similar facts about the people they’re trying to find. Powerful algorithms search for possible matches among the more than 1.1 million individuals in the Refunite system. The analytics are further refined using the more than 2,000 searches that the refugees themselves do daily.

The goal: find loved ones or those connected to them who might help in the hunt. Since Refunite introduced the first version of the system in 2010, it has helped more than 40,000 people reconnect.

One factor complicating the work: Cultures define family lineage differently. Refunite co-founder Christopher Mikkelsen confronted this problem when he asked a boy in a refugee camp if he knew where his mother was. “He asked me, ‘Well, what mother do you mean?’ ” Mikkelsen remembers. “And I went, ‘Uh-huh, this is going to be challenging.’ ”

Fortunately, artificial intelligence is well suited to learn and recognize different family patterns. But the technology struggles with some simple things like distinguishing the image of a chicken from that of a car. Mikkelsen believes refugees in camps could offset this weakness by tagging photographs — “car” or “not car” — to help train algorithms. Such work could earn them badly needed cash: The group hopes to set up a system that pays refugees for doing such work.

“To an American, earning $4 a day just isn’t viable as a living,” Mikkelsen says. “But to the global poor, getting an access point to earning this is revolutionizing.”

Another group, Wild Me, a nonprofit created by scientists and technologists, has created an open-source software platform that combines artificial intelligence and image recognition to identify and track individual animals. Using the system, scientists can better estimate the number of endangered animals and follow them over large expanses without using invasive techniques….

To fight sex trafficking, police officers often go undercover and interact with people trying to buy sex online. Sadly, demand is high, and there are never enough officers.

Enter Seattle Against Slavery. The nonprofit’s tech-savvy volunteers created chatbots designed to disrupt sex trafficking significantly. Using input from trafficking survivors and law-enforcement agencies, the bots can conduct simultaneous conversations with hundreds of people, engaging them in multiple, drawn-out conversations, and arranging rendezvous that don’t materialize. The group hopes to frustrate buyers so much that they give up their hunt for sex online….

A Philadelphia charity is using machine learning to adapt its services to clients’ needs.

Benefits Data Trust helps people enroll for government-assistance programs like food stamps and Medicaid. Since 2005, the group has helped more than 650,000 people access $7 billion in aid.

The nonprofit has data-sharing agreements with jurisdictions to access more than 40 lists of people who likely qualify for government benefits but do not receive them. The charity contacts those who might be eligible and encourages them to call the Benefits Data Trust for help applying….(More)”.

What is a data trust?


Essay by Jack Hardinges at ODI: “There are different interpretations of what a data trust is, or should be…

There’s not a well-used definition of ‘a data trust’, or even consensus on what one is. Much of the recent interest in data trusts in the UK has been fuelled by them being recommended as a way to ‘share data in a fair, safe and equitable way’ by a UK government-commissioned independent review into Artificial Intelligence (AI) in 2017. However, there has been wider international interest in the concept for some time.

At a very high level, the aim of data trusts appears to be to give people and organisations confidence when enabling access to data in ways that provide them with some value (either directly or indirectly) in return. Beyond that high level goal, there are a variety of thoughts about what form they should take. In our work so far, we’ve found different interpretations of the term ‘data trust’:

  • A data trust as a repeatable framework of terms and mechanisms.
  • A data trust as a mutual organisation.
  • A data trust as a legal structure.
  • A data trust as a store of data.
  • A data trust as public oversight of data access….(More)”

Data Stewards: Data Leadership to Address 21st Century Challenges


Post by Stefaan Verhulst: “…Over the last two years, we have focused on the opportunities (and challenges) surrounding what we call “data collaboratives.” Data collaboratives are an emerging form of public-private partnership, in which information held by companies (or other entities) is shared with the public sector, civil society groups, research institutes and international organizations. …

For all its promise, the practice of data collaboratives remains ad hoc and limited. In part, this is a result of the lack of a well-defined, professionalized concept of data stewardship within corporations that has a mandate to explore ways to harness the potential of their data towards positive public ends.

Today, each attempt to establish a cross-sector partnership built on the analysis of private-sector data requires significant and time-consuming efforts, and businesses rarely have personnel tasked with undertaking such efforts and making relevant decisions.

As a consequence, the process of establishing data collaboratives and leveraging privately held data for evidence-based policy making and service delivery is onerous, generally one-off, not informed by best practices or any shared knowledge base, and prone to dissolution when the champions involved move on to other functions.

By establishing data stewardship as a corporate function, recognized and trusted within corporations as a valued responsibility, and by creating the methods and tools needed for responsible data-sharing, the practice of data collaboratives can become regularized, predictable, and de-risked….

To take stock of current practice and scope needs and opportunities, we held a small yet in-depth kick-off event at the offices of the Cloudera Foundation in San Francisco on May 8th 2018 that was attended by representatives from LinkedIn, Facebook, Uber, Mastercard, DigitalGlobe, Cognizant, Streetlight Data, the World Economic Forum, and NetHope — among others.

Four Key Takeaways

The discussions were varied and wide-ranging.

Several reflected on the risks involved — including the risks of NOT sharing or collaborating on privately held data that could improve people’s lives (and on some occasions save lives).

Others warned that the window of opportunity to increase the practice of data collaboratives may be closing — given new regulatory requirements and other barriers that may disincentivize corporations from engaging with third parties around their data.

Ultimately, four key takeaways emerged. These areas — at the nexus of opportunities and challenges — are worth considering further, because they help us better understand both the potential and limitations of data collaboratives….(More)”