Researchers wrestle with a privacy problem


Erika Check Hayden at Nature: “The data contained in tax returns, health and welfare records could be a gold mine for scientists — but only if they can protect people’s identities….In 2011, six US economists tackled a question at the heart of education policy: how much does great teaching help children in the long run?

They started with the records of more than 11,500 Tennessee schoolchildren who, as part of an experiment in the 1980s, had been randomly assigned to high- and average-quality teachers between the ages of five and eight. Then they gauged the children’s earnings as adults from federal tax returns filed in the 2000s. The analysis showed that the benefits of a good early education last for decades: each year of better teaching in childhood boosted an individual’s annual earnings by some 3.5% on average. Other data showed the same individuals besting their peers on measures such as university attendance, retirement savings, marriage rates and home ownership.

The economists’ work was widely hailed in education-policy circles, and US President Barack Obama cited it in his 2012 State of the Union address when he called for more investment in teacher training.

But for many social scientists, the most impressive thing was that the authors had been able to examine US federal tax returns: a closely guarded data set that was then available to researchers only with tight restrictions. This has made the study an emblem for both the challenges and the enormous potential power of ‘administrative data’ — information collected during routine provision of services, including tax returns, records of welfare benefits, data on visits to doctors and hospitals, and criminal records. Unlike Internet searches, social-media posts and the rest of the digital trails that people establish in their daily lives, administrative data cover entire populations with minimal self-selection effects: in the US census, for example, everyone sampled is required by law to respond and tell the truth.

This puts administrative data sets at the frontier of social science, says John Friedman, an economist at Brown University in Providence, Rhode Island, and one of the lead authors of the education study. “They allow researchers to not just get at old questions in a new way,” he says, “but to come at problems that were completely impossible before.”….

But there is also concern that the rush to use these data could pose new threats to citizens’ privacy. “The types of protections that we’re used to thinking about have been based on the twin pillars of anonymity and informed consent, and neither of those hold in this new world,” says Julia Lane, an economist at New York University. In 2013, for instance, researchers showed that they could uncover the identities of supposedly anonymous participants in a genetic study simply by cross-referencing their data with publicly available genealogical information.

Many people are looking for ways to address these concerns without inhibiting research. Suggested solutions include policy measures, such as an international code of conduct for data privacy, and technical methods that allow the use of the data while protecting privacy. Crucially, notes Lane, although preserving privacy sometimes complicates researchers’ lives, it is necessary to uphold the public trust that makes the work possible.

“Difficulty in access is a feature, not a bug,” she says. “It should be hard to get access to data, but it’s very important that such access be made possible.” Many nations collect administrative data on a massive scale, but only a few, notably in northern Europe, have so far made it easy for researchers to use those data.

In Denmark, for instance, every newborn child is assigned a unique identification number that tracks his or her lifelong interactions with the country’s free health-care system and almost every other government service. In 2002, researchers used data gathered through this identification system to retrospectively analyse the vaccination and health status of almost every child born in the country from 1991 to 1998 — 537,000 in all. At the time, it was the largest study ever to disprove the now-debunked link between measles vaccination and autism.

Other countries have begun to catch up. In 2012, for instance, Britain launched the unified UK Data Service to facilitate research access to data from the country’s census and other surveys. A year later, the service added a new Administrative Data Research Network, which has centres in England, Scotland, Northern Ireland and Wales to provide secure environments for researchers to access anonymized administrative data.

In the United States, the Census Bureau has been expanding its network of Research Data Centers, which currently includes 19 sites around the country at which researchers with the appropriate permissions can access confidential data from the bureau itself, as well as from other agencies. “We’re trying to explore all the available ways that we can expand access to these rich data sets,” says Ron Jarmin, the bureau’s assistant director for research and methodology.

In January, a group of federal agencies, foundations and universities created the Institute for Research on Innovation and Science at the University of Michigan in Ann Arbor to combine university and government data and measure the impact of research spending on economic outcomes. And in July, the US House of Representatives passed a bipartisan bill to study whether the federal government should provide a central clearing house of statistical administrative data.

Yet vast swathes of administrative data are still inaccessible, says George Alter, director of the Inter-university Consortium for Political and Social Research based at the University of Michigan, which serves as a data repository for approximately 760 institutions. “Health systems, social-welfare systems, financial transactions, business records — those things are just not available in most cases because of privacy concerns,” says Alter. “This is a big drag on research.”…

Many researchers argue, however, that there are legitimate scientific uses for such data. Jarmin says that the Census Bureau is exploring the use of data from credit-card companies to monitor economic activity. And researchers funded by the US National Science Foundation are studying how to use public Twitter posts to keep track of trends in phenomena such as unemployment.

….Computer scientists and cryptographers are experimenting with technological solutions. One, called differential privacy, adds a small amount of distortion to a data set, so that querying the data gives a roughly accurate result without revealing the identity of the individuals involved. The US Census Bureau uses this approach for its OnTheMap project, which tracks workers’ daily commutes. ….In any case, although synthetic data potentially solve the privacy problem, there are some research applications that cannot tolerate any noise in the data. A good example is the work showing the effect of neighbourhood on earning potential, which was carried out by Raj Chetty, an economist at Harvard University in Cambridge, Massachusetts. Chetty needed to track specific individuals to show that the areas in which children live their early lives correlate with their ability to earn more or less than their parents. In subsequent studies, Chetty and his colleagues showed that moving children from resource-poor to resource-rich neighbourhoods can boost their earnings in adulthood, proving a causal link.
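As an illustration of the differential-privacy idea described above, here is a minimal sketch that is not taken from the article. It assumes one common variant, the Laplace mechanism, which adds random noise to the answer of a counting query rather than to the stored records. The function names, the `epsilon` parameter value and the toy commute records are all hypothetical.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from a Laplace distribution centred on 0."""
    # The difference of two independent exponential variables with
    # mean `scale` follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count of the records that satisfy `predicate`."""
    true_count = sum(1 for record in records if predicate(record))
    # Adding or removing one person changes a count by at most 1,
    # so the noise scale is 1 / epsilon (smaller epsilon, more noise).
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical toy data: one made-up commute distance per worker.
rng = random.Random()
workers = [{"commute_km": km} for km in range(100)]
noisy = private_count(workers, lambda w: w["commute_km"] > 50, epsilon=1.0, rng=rng)
print(f"Noisy number of long commutes: {noisy:.1f}")
```

Each query returns a slightly different value, so the aggregate stays roughly accurate while the presence or absence of any single record is masked. Choosing `epsilon` is a policy decision that trades accuracy against privacy; the value used here is only a placeholder.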

Secure multiparty computation is a technique that attempts to address this issue by allowing multiple data holders to analyse parts of the total data set, without revealing the underlying data to each other. Only the results of the analyses are shared….(More)”
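To make the secure multiparty computation idea concrete, here is a minimal sketch that is not taken from the article. It assumes one simple building block, additive secret sharing, and simulates three data holders inside a single process; a real deployment would exchange the shares over a network. The names `make_shares`, `secure_sum` and `MODULUS`, and the example figures, are hypothetical.

```python
import secrets

# Assumed public modulus; it must exceed any total the holders could produce.
MODULUS = 2**61 - 1


def make_shares(value: int, n_parties: int) -> list[int]:
    """Split `value` into random shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    # The last share is chosen so that all shares add up to the value.
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_sum(private_values: list[int]) -> int:
    """Compute the total of all private values from shares alone."""
    n = len(private_values)
    # Each holder splits its own value into one share per holder.
    distributed = [make_shares(value, n) for value in private_values]
    # Holder j adds up the j-th share it received from every holder.
    # Taken alone, a partial sum looks like a random number.
    partial_sums = [sum(row[j] for row in distributed) % MODULUS for j in range(n)]
    # Only the partial sums are published; together they give the total.
    return sum(partial_sums) % MODULUS


# Hypothetical example: three agencies each hold a private case count.
print(secure_sum([120, 45, 300]))  # prints 465
```

No holder ever sees another holder's raw number, only random-looking shares, yet the published partial sums combine to the exact total. This matches the description above in which only the results of the analyses are shared.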

Routledge International Handbook of Ignorance Studies


Book edited by Matthias Gross and Linsey McGoey: “Once treated as the absence of knowledge, ignorance today has become a highly influential topic in its own right, commanding growing attention across the natural and social sciences where a wide range of scholars have begun to explore the social life and political issues involved in the distribution and strategic use of not knowing. The field is growing fast and this handbook reflects this interdisciplinary field of study by drawing contributions from economics, sociology, history, philosophy, cultural studies, anthropology, feminist studies, and related fields in order to serve as a seminal guide to the political, legal and social uses of ignorance in social and political life….(More)”

The Data Revolution for Sustainable Development


Jeffrey D. Sachs at Project Syndicate: “There is growing recognition that the success of the Sustainable Development Goals (SDGs), which will be adopted on September 25 at a special United Nations summit, will depend on the ability of governments, businesses, and civil society to harness data for decision-making…

One way to improve data collection and use for sustainable development is to create an active link between the provision of services and the collection and processing of data for decision-making. Take health-care services. Every day, in remote villages of developing countries, community health workers help patients fight diseases (such as malaria), get to clinics for checkups, receive vital immunizations, obtain diagnoses (through telemedicine), and access emergency aid for their infants and young children (such as for chronic under-nutrition). But the information from such visits is usually not collected, and even if it is put on paper, it is never used again.
We now have a much smarter way to proceed. Community health workers are increasingly supported by smart-phone applications, which they can use to log patient information at each visit. That information can go directly onto public-health dashboards, which health managers can use to spot disease outbreaks, failures in supply chains, or the need to bolster technical staff. Such systems can provide a real-time log of vital events, including births and deaths, and even use so-called verbal autopsies to help identify causes of death. And, as part of electronic medical records, the information can be used at future visits to the doctor or to remind patients of the need for follow-up visits or medical interventions….
Fortunately, the information and communications technology revolution and the spread of broadband coverage nearly everywhere can quickly make such time lags a thing of the past. As indicated in the report A World that Counts: Mobilizing the Data Revolution for Sustainable Development, we must modernize the practices used by statistical offices and other public agencies, while tapping into new sources of data in a thoughtful and creative way that complements traditional approaches.
Through more effective use of smart data – collected during service delivery, economic transactions, and remote sensing – the fight against extreme poverty will be bolstered; the global energy system will be made much more efficient and less polluting; and vital services such as health and education will be made far more effective and accessible.
With this breakthrough in sight, several governments, including that of the United States, as well as businesses and other partners, have announced plans to launch a new “Global Partnership for Sustainable Development Data” at the UN this month. The new partnership aims to strengthen data collection and monitoring efforts by raising more funds, encouraging knowledge-sharing, addressing key barriers to access and use of data, and identifying new big-data strategies to upgrade the world’s statistical systems.
The UN Sustainable Development Solutions Network will support the new Global Partnership by creating a new Thematic Network on Data for Sustainable Development, which will bring together leading data scientists, thinkers, and academics from across multiple sectors and disciplines to form a center of data excellence….(More)”

Using Big Data to Understand the Human Condition: The Kavli HUMAN Project


Okan Azmak et al. in the journal “Big Data”: “Until now, most large-scale studies of humans have either focused on very specific domains of inquiry or have relied on between-subjects approaches. While these previous studies have been invaluable for revealing important biological factors in cardiac health or social factors in retirement choices, no single repository contains anything like a complete record of the health, education, genetics, environmental, and lifestyle profiles of a large group of individuals at the within-subject level. This seems critical today because emerging evidence about the dynamic interplay between biology, behavior, and the environment points to a pressing need for just the kind of large-scale, long-term synoptic dataset that does not yet exist at the within-subject level. At the same time that the need for such a dataset is becoming clear, there is also growing evidence that just such a synoptic dataset may now be obtainable, at least at moderate scale, using contemporary big data approaches. To this end, we introduce the Kavli HUMAN Project (KHP), an effort to aggregate data from 2,500 New York City households in all five boroughs (roughly 10,000 individuals) whose biology and behavior will be measured using an unprecedented array of modalities over 20 years. It will also richly measure environmental conditions and events that KHP members experience using a geographic information system database of unparalleled scale, currently under construction in New York. In this manner, KHP will offer both synoptic and granular views of how human health and behavior coevolve over the life cycle and why they evolve differently for different people. In turn, we argue that this will allow for new discovery-based scientific approaches, rooted in big data analytics, to improving the health and quality of human life, particularly in urban contexts….(More)”

Syria refugees tap in to legal advice by text


Hannah Kuchler in the Financial Times: “Syrian refugees can now access free legal advice by text message after a Palestinian start-up launched a service in Turkey, which it hopes to expand to reach refugees across Europe.

Refugees fleeing the conflict in Syria can receive legal guidance via their mobile phones on everything from whether they have the right to work to education services available for their children, after Souktel, a small start-up, partnered with the American Bar Association.

The 30-person start-up employs both former humanitarian workers from Oxfam and USAID, who understand the problems faced by refugees, and software engineers who tackle the challenge of sorting, tagging and translating enquiries which are then sent to a team of Turkish lawyers.

Jacob Korenblum, president and chief executive of Souktel, said more than 10,000 individuals have used the service since it launched less than three weeks ago, with lawyers busy answering a steady stream of questions.

“Given the strength and rapid interest in this service and the uptake since its launch, we want to scale into Greece and other European countries to meet the same need,” he said. “This is very much becoming a pan-European problem at the very least.”…

The American Bar Association approached Souktel and asked them to build a service that could offer remote legal support and uses funds from international donors to pay the company….

Smartphones — or even basic mobile phones — have fast become one of the easiest ways of communicating for the poor or dispossessed. Even when basic infrastructure has failed, people are able to access information and connect with relatives abroad via their devices.

Mr Korenblum, a Canadian former aid worker, helped found Souktel after he saw young people in Palestine relying on their mobile devices when working there 10 years ago. The company has built similar services on behalf of humanitarian organisations working in other areas — including the UK’s department for international development in Gaza, Iraq and Somalia, among other places…(More)”

Civic engagement platform brings the town meeting online


Springwise: “Citizens may have the ability to express enthusiasm or disgust for government policies online, but these opinions are only as valuable as the ears they reach. We recently saw Balancing Act offer citizens the ability to view and play around with their city’s budget, providing governments with a better understanding of the wants and needs of their constituents. Now, CitizenLab is another civic engagement platform, which is bringing the town meeting into the digital age — providing a space for citizens to communicate with their government, and for governments to ‘citizensource’ opinions on their policies.

To begin, participants visit the platform and enter their city. This will take them to a collection of ‘labs’: categories such as education, health and public spaces. They can then post new ideas, join existing conversations and upvote interesting topics. Local governments can then use the platform as a resource to discover the priorities of their citizens. They can respond directly to discussions and consult public opinion on important issues. Governments can also acknowledge the most vital issues raised by taking them to city council for discussion. The platform is designed to host positive ideas, rather than raise issues.”

Website: www.citizenlab.co

Our World in Data


“Life around the world is changing rapidly – here you find the data visualizations that show you how. Poverty, violence, health, education, the environment and much more. Our World In Data covers a wide range of topics and visualizes the empirical evidence of how living standards changed over the last decades, centuries, and millennia. A web publication authored by Max Roser. (work in progress)”

How Startups Are Transforming the Smart City Movement


Jason Shueh at GovTech: “Remember the 1990s visions of the future? Those first incantations of the sweeping “smart city,” so technologically utopian and Tomorrowland-ish in design? The concept and solutions were pitched by tech titans like IBM and Cisco, cost obscene amounts of money, and promised equally outlandish levels of innovation.

It was a drive, as idealistic as it was expedient, to spark a new industry that infused cities with data, analytics, sensors and clean energy. Two-and-a-half decades later, the smart city market has evolved. Its solutions are more pragmatic and its benefits more potent. Evidence brims in Singapore, where officials boast that they can predict traffic congestion an hour in advance with 90 percent accuracy. Similarly, in Chicago, the city has embraced analytics to estimate rodent infestations and prioritize restaurant inspections. These of course are a few standouts, but as many know, the movement is highly diverse and runs its fingers through cities and across continents.

And yet what’s not as well-known is what’s happened in the last few years. The industry appears to be undergoing another metamorphosis, one that takes the ingenuity inspired by its beginnings and reimagines it with the help of do-it-yourself entrepreneurs….

Asked for a definition, Abrahamson centered his interpretation on tech that enhances quality of life. With the possible exception of health care, finance and education (systems large enough to merit their own categories), Abrahamson explains smart cities by highlighting investment areas at Urban.us. Specific areas are packaged as follows:

Mobility and Logistics: How cities move people and things to, from and within cities.

Built Environment: The public and private spaces in which citizens work and live.

Utilities: Critical resources including water, waste and energy.

Service Delivery: How local governments provide services ranging from public works to law enforcement….

Who’s Investing?

….Here is a sampling of a few types, with examples of their startup investments.

General Venture Capitalists

a16z (Andreessen Horowitz) – Mapillary and Moovit

Specialty Venture Capitalists

Fontinalis – Lyft, ParkMe, LocoMobi

Black Coral Capital – Digital Lumens, Clean Energy Collective, newterra

Govtech Fund – AmigoCloud, Mark43, MindMixer

Corporate Venture Capitalists

Google Ventures – Uber, Skycatch, Nest

Motorola Solutions Venture Capital – CyPhy Works and SceneDoc

BMW i Ventures – Life360 and ChargePoint

Impact/Social Investors

Omidyar Network – SeeClickFix and Nationbuilder

Knight Foundation – Public Stuff, Captricity

Kapor Capital – Uber, Via, Blocpower

1776 – Radiator Labs, Water Lens… (More)

Global platform launched to promote positive plagiarism among foundations


Ellie Ward at PioneersPost: “A group of leading foundations and NGOs, including the Rockefeller Foundation, Oxfam and the Skoll Foundation, have launched a peer-to-peer platform to make solving pressing social issues easier.

Sphaera (pronounced s’faira) is a peer-to-peer online platform that will collate the knowledge of funders and practitioners working to solve social and environmental issues around the world.

Organisations will share their evidence-based solutions and research within the portal, which will then repurpose the information into tools, processes and frameworks that can be used by others. In theory a solution that helps fishermen log their catch could be repurposed for healthcare workers to track and improve treatment of contagious disease. … “Sphaera makes it easy to discover, share and remix solutions. We put the collective, practical knowledge of what works – in health, finance, conservation, education, in every sector relevant to wellbeing – at the fingertips of practitioners everywhere. Our hope is that together we are better, faster, and more effective in tackling the urgent problems of our time.”

Arthur Wood, founding partner of Total Impact Capital and a global leader in social finance, said: “With the birth of cloud technology we have seen a plethora of models changing the way we use, share, purchase and allocate resources. From Airbnb to Uber, folks are now asking why this trend has had zero impact in Philanthropy.”

Wood explained that Sphaera is “designed to liberate the silos of individual project knowledge and to leverage that expertise and knowledge to create scale and collaboration across the philanthropic landscape… Or simply stated, how can a great idea in one stovepipe be shared to the benefit of all?” (More)

White House debuts open source tool for visualizing government work across the country


Wired: “Data is immensely powerful. The trick lies in organizing the stuff. The good news is that so many organizations are now offering tools that help with this—and so many of these tools are open source.

The White House is among the many who are tapping into this trend. Today, the administration revealed a new tool meant to help anyone visualize government work across the country. Built in partnership with more than 15 Federal agencies, it’s basically a huge map of the country—with data layers you can select or deselect—that lets you see where certain community-based initiatives are gaining ground.

“This new approach focuses on the direction that cities and small towns want to go rather than the laundry list of programs the government has,” a representative from the White House Office of Management and Budget tells WIRED.

The initiatives include My Brother’s Keeper, a program designed to help residents succeed in education, in their careers, and beyond; Climate Action Champions, a program that aims to help local leaders address climate change issues; and Promise Zones, which hopes to increase economic security and expand educational opportunities within the community. The map also includes demographic information, such as US Census data on counties of persistent poverty and data from Harvard about upward economic mobility by county.

“From the start, [the map] has been built in the open, and source code is available on GitHub,” the White House says, inviting data enthusiasts to make use of the map—which the administration also promised would get the benefit of regular data updates.”