Researchers wrestle with a privacy problem


Erika Check Hayden at Nature: “The data contained in tax returns, health and welfare records could be a gold mine for scientists — but only if they can protect people’s identities…. In 2011, six US economists tackled a question at the heart of education policy: how much does great teaching help children in the long run?

They started with the records of more than 11,500 Tennessee schoolchildren who, as part of an experiment in the 1980s, had been randomly assigned to high- and average-quality teachers between the ages of five and eight. Then they gauged the children’s earnings as adults from federal tax returns filed in the 2000s. The analysis showed that the benefits of a good early education last for decades: each year of better teaching in childhood boosted an individual’s annual earnings by some 3.5% on average. Other data showed the same individuals besting their peers on measures such as university attendance, retirement savings, marriage rates and home ownership.

The economists’ work was widely hailed in education-policy circles, and US President Barack Obama cited it in his 2012 State of the Union address when he called for more investment in teacher training.

But for many social scientists, the most impressive thing was that the authors had been able to examine US federal tax returns: a closely guarded data set that was then available to researchers only with tight restrictions. This has made the study an emblem for both the challenges and the enormous potential power of ‘administrative data’ — information collected during routine provision of services, including tax returns, records of welfare benefits, data on visits to doctors and hospitals, and criminal records. Unlike Internet searches, social-media posts and the rest of the digital trails that people establish in their daily lives, administrative data cover entire populations with minimal self-selection effects: in the US census, for example, everyone sampled is required by law to respond and tell the truth.

This puts administrative data sets at the frontier of social science, says John Friedman, an economist at Brown University in Providence, Rhode Island, and one of the lead authors of the education study. “They allow researchers to not just get at old questions in a new way,” he says, “but to come at problems that were completely impossible before.”….

But there is also concern that the rush to use these data could pose new threats to citizens’ privacy. “The types of protections that we’re used to thinking about have been based on the twin pillars of anonymity and informed consent, and neither of those hold in this new world,” says Julia Lane, an economist at New York University. In 2013, for instance, researchers showed that they could uncover the identities of supposedly anonymous participants in a genetic study simply by cross-referencing their data with publicly available genealogical information.

Many people are looking for ways to address these concerns without inhibiting research. Suggested solutions include policy measures, such as an international code of conduct for data privacy, and technical methods that allow the use of the data while protecting privacy. Crucially, notes Lane, although preserving privacy sometimes complicates researchers’ lives, it is necessary to uphold the public trust that makes the work possible.

“Difficulty in access is a feature, not a bug,” she says. “It should be hard to get access to data, but it’s very important that such access be made possible.” Many nations collect administrative data on a massive scale, but only a few, notably in northern Europe, have so far made it easy for researchers to use those data.

In Denmark, for instance, every newborn child is assigned a unique identification number that tracks his or her lifelong interactions with the country’s free health-care system and almost every other government service. In 2002, researchers used data gathered through this identification system to retrospectively analyse the vaccination and health status of almost every child born in the country from 1991 to 1998 — 537,000 in all. At the time, it was the largest study ever to disprove the now-debunked link between measles vaccination and autism.

Other countries have begun to catch up. In 2012, for instance, Britain launched the unified UK Data Service to facilitate research access to data from the country’s census and other surveys. A year later, the service added a new Administrative Data Research Network, which has centres in England, Scotland, Northern Ireland and Wales to provide secure environments for researchers to access anonymized administrative data.

In the United States, the Census Bureau has been expanding its network of Research Data Centers, which currently includes 19 sites around the country at which researchers with the appropriate permissions can access confidential data from the bureau itself, as well as from other agencies. “We’re trying to explore all the available ways that we can expand access to these rich data sets,” says Ron Jarmin, the bureau’s assistant director for research and methodology.

In January, a group of federal agencies, foundations and universities created the Institute for Research on Innovation and Science at the University of Michigan in Ann Arbor to combine university and government data and measure the impact of research spending on economic outcomes. And in July, the US House of Representatives passed a bipartisan bill to study whether the federal government should provide a central clearing house of statistical administrative data.

Yet vast swathes of administrative data are still inaccessible, says George Alter, director of the Inter-university Consortium for Political and Social Research based at the University of Michigan, which serves as a data repository for approximately 760 institutions. “Health systems, social-welfare systems, financial transactions, business records — those things are just not available in most cases because of privacy concerns,” says Alter. “This is a big drag on research.”…

Many researchers argue, however, that there are legitimate scientific uses for such data. Jarmin says that the Census Bureau is exploring the use of data from credit-card companies to monitor economic activity. And researchers funded by the US National Science Foundation are studying how to use public Twitter posts to keep track of trends in phenomena such as unemployment.

 

….Computer scientists and cryptographers are experimenting with technological solutions. One, called differential privacy, adds a small amount of distortion to a data set, so that querying the data gives a roughly accurate result without revealing the identity of the individuals involved. The US Census Bureau uses this approach for its OnTheMap project, which tracks workers’ daily commutes.

….In any case, although synthetic data potentially solve the privacy problem, there are some research applications that cannot tolerate any noise in the data. A good example is the work showing the effect of neighbourhood on earning potential, which was carried out by Raj Chetty, an economist at Harvard University in Cambridge, Massachusetts. Chetty needed to track specific individuals to show that the areas in which children live their early lives correlate with their ability to earn more or less than their parents. In subsequent studies, Chetty and his colleagues showed that moving children from resource-poor to resource-rich neighbourhoods can boost their earnings in adulthood, proving a causal link.
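As a rough illustration of the differential-privacy idea described in this passage, the sketch below answers a counting query with random noise added. It uses the Laplace mechanism, which is one common textbook approach and is an assumption here; the article does not say which mechanism the Census Bureau uses. The function names, the toy commuter records and the `epsilon` value are all hypothetical. It also shows the kind of noise that, as the passage notes, some research applications cannot tolerate.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centred on 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(records, predicate, epsilon, rng):
    # True answer to the counting query.
    true_count = sum(1 for r in records if predicate(r))
    # Adding or removing one person changes a count by at most 1,
    # so the noise scale is 1 / epsilon (smaller epsilon = more noise).
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical toy data: 120 people commute to area "B", 80 to area "C".
commuters = [{"work": "B"}] * 120 + [{"work": "C"}] * 80

rng = random.Random(0)  # seeded only so the sketch is repeatable
answer = noisy_count(commuters, lambda r: r["work"] == "B", epsilon=0.5, rng=rng)
print(f"noisy count of commuters to B: {answer:.1f} (true count is 120)")
```

Each individual answer is off by a small random amount, but averaged over many queries the noise cancels out, which is why aggregate statistics stay roughly accurate.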

Secure multiparty computation is a technique that attempts to address this issue by allowing multiple data holders to analyse parts of the total data set, without revealing the underlying data to each other. Only the results of the analyses are shared….(More)”
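To make the idea of sharing only results more concrete, here is a minimal sketch of one simple building block, additive secret sharing, used to compute a joint total. It is a single-process simulation under stated assumptions: every party follows the protocol honestly, the values are non-negative integers smaller than `MODULUS`, and the function names and figures are hypothetical. It is not the specific system discussed in the article.

```python
import secrets

# All arithmetic is done modulo a large number so that a single share
# looks like a uniformly random value and reveals nothing on its own.
MODULUS = 2**61 - 1


def split_into_shares(value, n_parties):
    # Pick n-1 random shares, then choose the last one so that
    # all shares add up to the original value modulo MODULUS.
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def secure_sum(private_values):
    n = len(private_values)
    # Each data holder splits its private value and sends share j to party j.
    all_shares = [split_into_shares(v, n) for v in private_values]
    # Party j adds up the shares it received; this partial sum is random-looking.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    # Only the partial sums are published; together they give the total.
    return sum(partial_sums) % MODULUS


# Hypothetical counts held by three separate agencies.
total = secure_sum([1200, 450, 3075])
print(total)
```

No party ever sees another party's raw value, only random-looking shares and the final total. Real deployments add networking, protection against dishonest parties and support for richer analyses than a sum.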

Opening City Hall’s Wallets to Innovation


Tina Rosenberg at the New York Times: “Six years ago, the city of San Francisco decided to upgrade its streetlights. This is its story: O.K., stop. This is a parody, right? Government procurement is surely too nerdy even for Fixes. Procurement is a clerical task that cities do on autopilot: Decide what you need. Write a mind-numbing couple of dozen pages of specifications. Collect a few bids from the usual suspects. Yep, that’s procurement. But it doesn’t have to be. Instead of a rote purchasing exercise, what if procurement could be a way for cities to find new approaches to their problems?….

“Instead of saying to the marketplace ‘here’s the solution we want,’ we said ‘here’s the challenge, here’s the problem we’re having’,” said Barbara Hale, assistant general manager of the city’s Public Utilities Commission. “That opened us up to what other people thought the solution to the problem was, rather than us in our own little world deciding we knew the answer.”

The city got 59 different ideas from businesses in numerous countries. A Swiss company called Paradox won an agreement to do a 12-streetlight pilot test.

So — a happy ending for the scrappy and innovative Paradox? No. Paradox’s system worked, but the city could not award a contract for 18,500 streetlights that way. It held another competition for just the control systems, and tried out three of them. Last year the city issued a traditional R.F.P., using what it learned from the pilots. The contract has not yet been awarded.

Dozens of cities around the world are using problem-based procurement. Barcelona has posed six challenges that it will spend a million euros on, and Moscow announced last year that five percent of city spending would be set aside for innovative procurement. But in the vast majority of cities, as in San Francisco, problem-based procurement is still just for small pilot projects — a novelty.

It will grow, however. This is largely because of the efforts of CityMart, a company based in New York and Barcelona that has almost single-handedly taken the concept from a neat idea to something cities all over want to figure out how to do.

The concept is new enough that there’s not yet a lot of evidence about its effects. There’s plenty of proof, however, of the deficiencies of business-as-usual.

With the typical R.F.P., a city uses a consultant, working with local officials, to design what to ask for. Then city engineers and lawyers write the specifications, and the R.F.P. goes out for bids.

“If it’s a road safety issue it’s likely it will be the traffic engineers who will be asked to tell you what you can do, what you should invest in,” said Sascha Haselmayer, CityMart’s chief executive. “They tend to come up with things like traffic lights. They do not know there’s a world of entrepreneurs who work on educating drivers better, or that have a different design approach to public space — things that may not fit into the professional profile of the consultant.”

Such a process is guaranteed to be innovation-free. Innovation is far more likely when expertise from one discipline is applied to another. If you want the most creative solution to a traffic problem, ask people who aren’t traffic engineers.

The R.F.P. process itself was designed to give anyone a shot at a contract, but in reality, the winners almost always come from a small group of businesses with the required financial stability, legal know-how to negotiate the bureaucracy, and connections. Put those together, and cities get to consider only a tiny spectrum of the possible solutions to their problems.

Problem-based procurement can provide them with a whole rainbow. But to do that, the process needs clearinghouses — eBays or Craigslists for urban ideas….(More)”

Data Collaboratives: Sharing Public Data in Private Hands for Social Good


Beth Simone Noveck (The GovLab) in Forbes: “Sensor-rich consumer electronics such as mobile phones, wearable devices, commercial cameras and even cars are collecting zettabytes of data about the environment and about us. According to one McKinsey study, the volume of data is growing at fifty percent a year. No one needs convincing that these private storehouses of information represent a goldmine for business, but these data can do double duty as rich social assets—if they are shared wisely.

Think about a couple of recent examples: Sharing data held by businesses and corporations (i.e. public data in private hands) can help to improve policy interventions. California planners make water allocation decisions based upon expertise, data and analytical tools from public and private sources, including Intel, the Earth Research Institute at the University of California at Santa Barbara, and the World Food Center at the University of California at Davis.

In Europe, several phone companies have made anonymized datasets available, making it possible for researchers to track calling and commuting patterns and gain better insight into social problems from unemployment to mental health. In the United States, LinkedIn is providing free data about demand for IT jobs in different markets which, when combined with open data from the Department of Labor, helps communities target efforts around training….

Despite the promise of data sharing, these kinds of data collaboratives remain relatively new. There is a need to accelerate their use by giving companies strong tax incentives for sharing data for public good. There’s a need for more study to identify models for data sharing in ways that respect personal privacy and security and enable companies to do well by doing good. My colleagues at The GovLab together with UN Global Pulse and the University of Leiden, for example, published this initial analysis of terms and conditions used when exchanging data as part of a prize-backed challenge. We also need philanthropy to start putting money into “meta research;” it’s not going to be enough to just open up databases: we need to know if the data is good.

After years of growing disenchantment with closed-door institutions, the push for greater use of data in governing can be seen both as a response and as a mirror to the Big Data revolution in business. Although more than 1,000,000 government datasets about everything from air quality to farmers markets are openly available online in downloadable formats, much of the data about environmental, biometric, epidemiological, and physical conditions rests in private hands. Governing better requires a new empiricism for developing solutions together. That will depend on access to these private, not just public, data….(More)”

Can Yelp Help Government Win Back the Public’s Trust?


Tod Newcombe at Governing: “Look out, DMV, IRS and TSA. Yelp, the popular review website that’s best known for its rants or cheers regarding restaurants and retailers, is about to make it easier to review and rank government services.

Last month, Yelp and the General Services Administration (GSA), which manages the basic functions of the federal government, announced that government workers will soon be able to read and respond to their agencies’ Yelp reviews — and, hopefully, incorporate the feedback into service improvements.

At first glance, the news might not seem so special. There already are Yelp pages for government agencies like Departments of Motor Vehicles, which have been particularly popular. San Francisco’s DMV office, for example, has received more than 450 reviews and has a three-star rating. But federal agencies and workers haven’t been allowed to respond to the reviewers nor could they collect data from the pages because Yelp hadn’t been approved by the GSA. The agreement changes that situation, also making it possible for agencies to set up new Yelp pages….

Yelp has been posting online reviews about restaurants, bars, nail salons and other retailers since 2004. Despite its reputation as a place to vent about bad service, more than two-thirds of the 82 million reviews posted since Yelp started have been positive with most rated at either four or five stars, according to the company’s website. And when businesses boost their Yelp rating by one star, revenues have increased by as much as 9 percent, according to a 2011 study by Harvard Business School Professor Michael Luca.

Now the public sector is about to start paying more attention to those rankings. More importantly, they will find out if engaging the public in a timely fashion changes their perception of government.

While all levels of government have become active with social media, direct interaction between an agency and citizens is still the exception rather than the rule. Agencies typically use Facebook and Twitter to inform followers about services or to provide information updates, not as a feedback mechanism. That’s why having a more direct connection between the comments on a Yelp page and a government agency represents a shift in engagement….(More)”

Open data is not just for startups


Mike Altendorf at CIO: “…Surely open data is just for start-ups, market research companies and people that want to save the world? Well there are two reasons why I wanted to dedicate a bit of time to the subject of open data. First, one of the major barriers to internal innovation that I hear about all the time is the inability to use internal data to inform that innovation. This is usually because data is deemed too sensitive, too complex, too siloed or too difficult to make usable. Leaving aside the issues that any of those problems are going to cause for the organisation more generally, it is easy to see how this can create a problem. So why not use someone else’s data?

The point of creating internal labs and innovation centres is to explore the art of the possible. I quite agree that insight from your own data is a good place to start but it isn’t the only place. You could also argue that by using your own data you are restricting your thinking because you are only looking at information that already relates to your business. If the point of a lab is to explore ideas for supporting the business then you may be better off looking outwards at what is happening in the world around you rather than inwards into the constrained world of the industry you already inhabit….

The fact is there are vast numbers of data sets that are freely available and that can be made to work for you if you can just apply the creativity and technical smarts to them.

My second point is less about open data than about opening up data. Organisations collect information on their business operations, customers and suppliers all the time. The smart ones know how to use it to build competitive advantage but the really smart ones also know that there is significant extra value to be gained from sharing that data with the customer or supplier that it relates to. The customer or supplier can then use it to make informed decisions themselves. Some organisations have been doing this for a while. Customers of First Direct have been able to analyse their own spending patterns for years (although the data has been somewhat limited). The benefit to the customer is that they can make informed decisions based on actual data about their past behaviours and so adapt their spending habits accordingly (or put their head firmly in the sand and carry on as before in my case!). The benefit to the bank is that they are able to suggest ideas for how to improve a customer’s financial health alongside the data. Others have looked at how they can help customers by sharing (anonymised) information about what people with similar lifestyles/needs are doing/buying so customers can learn from each other. Trials have shown that customers welcomed the insight….(More)”

 

Civic Jazz in the New Maker Cities


at Techonomy: “Our civic innovation movement is about 6 years old. It began when cities started opening up data to citizens, journalists, public-sector companies, non-profits, and government agencies. Open data is an invitation: it’s something to go to work on — both to innovate and to create a more transparent environment about what works and what doesn’t. I remember when we first opened data in SF and began holding conferences and hackathons. In short order we saw a community emerge with remarkable capacity to contribute to, tinker with, hack, explore and improve the city.

Early on this took the form of visualizing data, like crime patterns in Oakland. This was followed by engagement: “Look, the police are skating by and not enforcing prostitution laws. Let’s call them on it!” Civic hackathons brought together journalists, software developers, hardware people, and urbanists. I recall when artists teamed with the Arup engineering firm to build noise sensors and deployed them in the Tenderloin neighborhood (with absolutely no permission from anybody). Noise was an issue. How could you understand the problem unless you measured it?

Something as wonky as an API invited people in, at which point a sense of civic possibility and wonder set in. Suddenly whole swaths of the city were working on the city. During the SF elections four years ago Gray Area Foundation for the Arts (which I chair) led a project with candidates, bureaucrats, and hundreds of volunteers for a summer-long set of hackathons and projects. We were stunned so many people would come together and collaborate so broadly. It was a movement, fueled by a sense of agency and informed by social media. Today cities are competing on innovation. It has become a movement.

All this has been accelerated by startups, incubators, and the economy’s whole open innovation conversation. Remarkably, we now see capital flowing in to support urban and social ventures where we saw none just a few years ago. The accelerator Tumml in SF is a premier example, but there are similar efforts in many cities.

This initial civic innovation movement was focused on apps and data, a relatively easy place to start. With such an approach you’re not contending for real estate or creating something that might gentrify neighborhoods. Today this movement is at work on how we design the city itself. As millennials pour in and cities are where most of us live, enormous experimentation is at play. Ours is a highly interdisciplinary age, mixing new forms of software code and various physical materials, using all sorts of new manufacturing techniques.

Brooklyn is a great example. A few weeks ago I met with Bob Bland, CEO of Manufacture New York. This ambitious 160,000 square foot public/private partnership is reimagining the New York fashion business. In one place it co-locates contract manufacturers, emerging fashion brands and advanced fashion research. Think wearables, sensors, smart fabrics, and the application of advanced manufacturing to fashion. By bringing all these elements under one roof, the supply chain can be compressed and sped up, and products made more innovative.

New York City’s Economic Development office envisions a local urban supply chain that can offer a scalable alternative to the giant extended global one. In fashion it makes more and more sense for brands to be located near their suppliers. Social media speeds up fashion cycles, so we’re moving beyond predictable seasons and looks specified ahead of time. Manufacturers want to place smaller orders more frequently, so they can take less inventory risk and keep current with trends.

When you put so much talent in one space, creativity flourishes. In fashion, unlike tech, there isn’t a lot of IP protection. So designers can riff off each other’s ideas and incorporate influences as artists do. What might be called stealing ideas in the software business is seen in fashion as jazz and a way to create a more interesting work environment.

A few blocks away is the Brooklyn Navy Yard, a mammoth facility at the center of New York’s emerging maker economy. …In San Francisco this urban innovation movement is working on the form of the city itself. Our main boulevard, Market Street, is to be reimagined, repaved, and made greener with far fewer private vehicles over the next two years. Our planning department, in concert with art organizations here, has made citizen-led urban prototyping the centerpiece of the planning process….(More)”

Public service coding: the BBC as an open software developer


Juan Mateos-Garcia at NESTA: “On Monday, the BBC published British, Bold, Creative, a paper where it put forward a vision for its future based on openness and collaboration with its audiences and the UK’s wider creative industries.

In this blog post, we focus on an area where the BBC is already using an open and collaborative model for innovation: software development.

The value of software

Although less visible to the public than its TV, radio and online content programming, the BBC’s software development activities may create value and drive innovation beyond the BBC, providing an example of how the corporation can put its “technology and digital capabilities at the service of the wider industry.”

Software is an important form of innovation investment that helps the BBC deliver new products and services, and become more efficient. One might expect that much of the software developed by the BBC would also be of value to other media and digital organisations. Such beneficial “spillovers” are encouraged by the BBC’s use of open source licensing, which enables other organisations to download its software for free, change it as they see fit, and share the results.

Current debates about the future of the BBC – including the questions about its role in influencing the future technology landscape in the Government’s Charter Review Consultation – need to be informed by robust evidence about how it develops software, and the impact that this has.

In this blog post, we use data from the world’s biggest collaborative software development platform, GitHub, to study the BBC as an open software developer.

GitHub gives organisations and individuals hosting space to store their projects (referred to as “repos”), and tools to coordinate development. This includes the option to “fork” (copy) other users’ software, change it and redistribute the improvements. Our key questions are:

  • How active is the BBC on GitHub?
  • How has its presence on GitHub changed over time?
  • What is the level of adoption (forking) of BBC projects on GitHub?
  • What types of open source projects is the BBC developing?
  • Where in the UK and in the rest of the world are the people interested in BBC projects based?

But before tackling these questions, it is important to address a question often raised in relation to open source software:

Why might an organisation like the BBC want to share its valuable code on a platform like GitHub?

There are several possible reasons:

  • Quality: Opening up a software project attracts help from other developers, making it better
  • Adoption: Releasing software openly can help turn it into a widely adopted standard
  • Signalling: It signals the organisation as an interesting place to work and partner with
  • Public value: Some organisations release their code openly with the explicit goal of creating public value

The webpage introducing TAL (Television Application Layer), a BBC project on GitHub, is a case in point: “Sharing TAL should make building applications on TV easier for others, helping to drive the uptake of this nascent technology. The BBC has a history of doing this and we are always looking at new ways to reach our audience.”…(More)”

Census Business Builder


Census: “Are you looking for data to help you start or grow a business?…The U.S. Census Bureau today released Census Business Builder: Small Business Edition, a new Web tool that allows business owners and entrepreneurs to easily navigate and use key demographic and economic data to help guide their research into opening a new business or adding to an existing one. The Census Business Builder was developed with user-centered design at its core and incorporated feedback from customers and stakeholders, including small business owners, trade associations and other government agencies. The tool combines data from the American Community Survey, the economic census, County Business Patterns and other economic surveys to provide a complete business profile of an area. Business statistics include the number of establishments, employment, payroll and sales. American Community Survey statistics include population characteristics, economic characteristics and housing characteristics. The new tool also combines third-party consumer spending data with the Census Bureau economic and demographic data.”

Governance Networks in the Public Sector


New book by E.H. Klijn and J. Koppenjan: “Governance Networks in the Public Sector presents a comprehensive study of governance networks and the management of complexities in network settings. Public, private and non-profit organizations are increasingly faced with complex, wicked problems when making decisions, developing policies or delivering services in the public sector. These activities take place in networks of interdependent actors guided by diverging and sometimes conflicting perceptions and strategies. As a result these networks are dominated by cognitive, strategic and institutional complexities. Dealing with these complexities requires sophisticated forms of coordination: network governance.

This book presents the most recent theoretical and empirical insights into governance networks. It provides a conceptual framework and analytical tools to study the complexities involved in handling wicked problems in governance networks in the public sector. The book also discusses strategies and management recommendations for governments, business and third sector organisations operating in and governing networks….(More)”

Design Thinking Comes of Age


Jon Kolko at HBR: “There’s a shift under way in large organizations, one that puts design much closer to the center of the enterprise. But the shift isn’t about aesthetics. It’s about applying the principles of design to the way people work.

This new approach is in large part a response to the increasing complexity of modern technology and modern business. That complexity takes many forms. Sometimes software is at the center of a product and needs to be integrated with hardware (itself a complex task) and made intuitive and simple from the user’s point of view (another difficult challenge). Sometimes the problem being tackled is itself multi-faceted: Think about how much tougher it is to reinvent a health care delivery system than to design a shoe. And sometimes the business environment is so volatile that a company must experiment with multiple paths in order to survive.

I could list a dozen other types of complexity that businesses grapple with every day. But here’s what they all have in common: People need help making sense of them. Specifically, people need their interactions with technologies and other complex systems to be simple, intuitive, and pleasurable.

A set of principles collectively known as design thinking—empathy with users, a discipline of prototyping, and tolerance for failure chief among them—is the best tool we have for creating those kinds of interactions and developing a responsive, flexible organizational culture….

Design thinking, first used to make physical objects, is increasingly being applied to complex, intangible issues, such as how a customer experiences a service. Regardless of the context, design thinkers tend to use physical models, also known as design artifacts, to explore, define, and communicate. Those models—primarily diagrams and sketches—supplement and in some cases replace the spreadsheets, specifications, and other documents that have come to define the traditional organizational environment. They add a fluid dimension to the exploration of complexity, allowing for nonlinear thought when tackling nonlinear problems.

For example, the U.S. Department of Veterans Affairs’ Center for Innovation has used a design artifact called a customer journey map to understand veterans’ emotional highs and lows in their interactions with the VA….

In design-centric organizations, you’ll typically see prototypes of new ideas, new products, and new services scattered throughout offices and meeting rooms. Whereas diagrams such as customer journey maps explore the problem space, prototypes explore the solution space. They may be digital, physical, or diagrammatic, but in all cases they are a way to communicate ideas. The habit of publicly displaying rough prototypes hints at an open-minded culture, one that values exploration and experimentation over rule following….(More)”