Why Google’s Waze Is Trading User Data With Local Governments


Parmy Olson at Forbes: “In Rio de Janeiro most eyes are on the final, nail-biting matches of the World Cup. Over in the command center of the city’s department of transport though, they’re on a different set of screens altogether.

Planners there are watching the aggregated data feeds of thousands of smartphones being walked or driven around a city, thanks to two popular travel apps, Waze and Moovit.

The goal is traffic management, and it involves swapping data for data. More cities are lining up to get access, and while the data the apps are sharing is all anonymous for now, identifying details could get more specific if cities like what they see, and people become more comfortable with being monitored through their smartphones in return for incentives.

Rio is the first city in the world to collect real-time data both from drivers who use the Waze navigation app and pedestrians who use the public-transportation app Moovit, giving it an unprecedented view of thousands of moving points across the sprawling city. Rio is also talking to the popular cycling app Strava about monitoring how cyclists move around the city.

All three apps are popular consumer services that, in the last few months, have found a new way to make their crowdsourced data useful to someone other than advertisers. While consumers use Waze and Moovit to get around, both companies are flipping the use case and turning those millions of users into a network of sensors that municipalities can tap into for a better view of traffic and hazards. Local governments can also use these apps as a channel to send alerts.

On an average day in June, Rio’s transport planners could get an aggregated view of 110,000 drivers (half a million over the course of the month) and see nearly 60,000 incidents being reported each day – everything from built-up traffic to hazards on the road, Waze says. Until now they’ve been relying on road cameras and other basic transport-department information.

What may be especially tantalizing for planners is the super-accurate read Waze gets on exactly where drivers are going, by pinging their phones’ GPS once every second. The app can tell how fast a driver is moving and even get a complete record of their driving history, according to Waze spokesperson Julie Mossler. (UPDATE: Since this story was first published Waze has asked to clarify that it separates users’ names and their 30-day driving info. The driving history is categorized under an alias.)

This passively-tracked GPS data “is not something we share,” she adds. Waze, which Google bought last year for $1.3 billion, can turn the data spigots on and off through its application programming interface (API).
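As an illustration only (the field names, aliasing scheme, and five-minute aggregation window below are assumptions, not Waze’s actual API), per-second GPS pings could be reduced to the kind of aggregate, alias-keyed feed described above along these lines:

```python
# Hypothetical sketch: rolling per-second GPS pings up into an aggregate feed.
# Field names and the aliasing scheme are illustrative, not Waze's API.
import hashlib
from collections import defaultdict
from statistics import mean

def alias(user_id: str, salt: str = "rotating-salt") -> str:
    """Replace a real user identifier with an opaque alias."""
    return hashlib.sha256((salt + user_id).encode()).hexdigest()[:12]

def aggregate(pings, window_seconds=300):
    """Roll raw pings (user_id, timestamp, road_segment, speed_kmh) into
    per-segment, per-window averages that carry no raw identifiers."""
    buckets = defaultdict(list)
    for user_id, ts, segment, speed in pings:
        window = int(ts // window_seconds) * window_seconds
        buckets[(segment, window)].append((alias(user_id), speed))
    return [
        {
            "segment": segment,
            "window_start": window,
            "drivers": len({a for a, _ in rows}),      # distinct aliased drivers
            "avg_speed_kmh": round(mean(s for _, s in rows), 1),
        }
        for (segment, window), rows in buckets.items()
    ]

# Two drivers on the same road segment within one five-minute window.
print(aggregate([
    ("user-1", 1_404_000_000, "Av. Brasil km 12", 42.0),
    ("user-2", 1_404_000_030, "Av. Brasil km 12", 38.0),
]))
```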

Waze has been sharing user data with Rio since summer 2013 and it just signed up the State of Florida. It says more departments of transport are in the pipeline.

But none of these partnerships are making Waze any money. The app’s currency of choice is data. “It’s a two-way street,” says Mossler. “Literally.”

In return for its user updates, Waze gets real-time information from Rio on highways, from road sensors and even from cameras, while Florida will give the app data on construction projects or city events.

Florida’s department of transport could not be reached for comment, but one of its spokesmen recently told a local news station: “We’re going to share our information, our camera images, all of our information that comes from the sensors on the roadway, and Waze is going to share its data with us.”…

To get Moovit’s data, municipalities download a web interface that gives them an aggregated view of where pedestrians using Moovit are going. In return, the city feeds Moovit’s database with a stream of real-time GPS data for buses and trains, and can issue transport alerts to Moovit’s users. Erez notes the cities aren’t allowed to make “any sort of commercial approach to the users.”

Erez may be saving that for advertisers, an avenue he says he’s still exploring. For now getting data from cities is the bigger priority. It gives Moovit “a competitive advantage,” he says.

Cycling app Strava also recently started sharing its real-time user data as part of a paid-for service called Strava Metro.

Municipalities pay 80 cents a year for every Strava member being tracked. Metro only launched in May, but it already counts the state of Oregon; London, UK; Glasgow, Scotland; Queensland, Australia; and Evanston, Illinois as customers.
….
Privacy advocates will naturally want to keep a wary eye on what data is being fed to cities, and on whether it leaks or gets misused by City Hall. The data-sharing might not be ubiquitous enough for that to be a problem yet, and it should be noted that any kind of deal-making with the public sector can get wrapped up in bureaucracy and take years to get off the ground.

For now Waze says it’s acting for the public good….(More)

Methods to Protect and Secure “Big Data” May Be Unknowingly Corrupting Research


New paper by John M. Abowd and Ian M. Schmutte: “…As the government and private companies increase the amount of data made available for public use (e.g. Census data, employment surveys, medical data), efforts to protect privacy and confidentiality (through statistical disclosure limitation, or SDL) can mislead and compromise economic research and analysis, particularly in cases where data properties are unclear to the end-user.

Data swapping, a particularly insidious SDL method frequently used by important data aggregators like the Census Bureau and the National Center for Health Statistics, interferes with the results of empirical analysis in ways that few economists and other social scientists are aware of.
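To make the mechanism concrete, here is a toy sketch of data swapping; the pairing rule and swap rate are invented for illustration and are not any agency’s actual SDL procedure. A small share of records exchange a sensitive attribute with a partner record that matches on a coarser characteristic, which preserves marginal totals but can distort the record-level relationships researchers estimate.

```python
# Toy illustration of data swapping as an SDL method. The matching rule and
# swap rate are invented; this is not an agency's real disclosure procedure.
import random

def swap(records, key="income", match_on="county", swap_rate=0.05, seed=42):
    """Swap the `key` attribute between pairs of records that share
    `match_on`, for roughly `swap_rate` of the file."""
    rng = random.Random(seed)
    out = [dict(r) for r in records]            # work on a copy
    by_group = {}
    for i, r in enumerate(out):
        by_group.setdefault(r[match_on], []).append(i)
    for indices in by_group.values():
        rng.shuffle(indices)
        n_pairs = int(len(indices) * swap_rate / 2)
        for k in range(n_pairs):
            i, j = indices[2 * k], indices[2 * k + 1]
            out[i][key], out[j][key] = out[j][key], out[i][key]
    return out

# County totals of income are unchanged, but within-record links
# (e.g. income vs. education) may now be broken.
sample = [
    {"county": "A", "income": 30000, "education": 12},
    {"county": "A", "income": 90000, "education": 18},
    {"county": "B", "income": 55000, "education": 16},
]
print(swap(sample, swap_rate=1.0))
```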

To encourage more transparency, the authors call for both government statistical agencies and the private sector (Amazon, Google, Microsoft, Netflix, Yahoo!, etc.) to release more information about the parameters used in SDL methods, and insist that journals and editors publishing such research require documentation of the author’s entire methodological process….(More)

VIDEO:

Why governments need guinea pigs for policies


Jonathan Breckon in the Guardian: “People are unlikely to react positively to the idea of using citizens as guinea pigs; many will be downright disgusted. But there are times when government must experiment on us in the search for knowledge and better policy….

Though history calls into question the ethics of experimentation, unless we try things out, we will never learn. The National Audit Office says that £66bn worth of government projects have no plans to evaluate their impact. It is unethical to roll out policies in this arbitrary way. We have to experiment on a small scale to have a better understanding of how things work before rolling out policies across the UK. This is just as relevant to social policy as it is to science and medicine, as set out in a new report by the Alliance for Useful Evidence.

Whether it’s the best ways to teach our kids to read, designing programmes to get unemployed people back to work, or encouraging organ donation – if the old ways don’t work, we have to test new ones. And that testing can’t always be done by a committee in Whitehall or in a university lab.

Experimentation can’t happen in isolation. What works in Lewisham or Londonderry might not work in Lincoln – or indeed across the UK. For instance, there is a huge amount of debate around the current practice of teaching children to read and spell using phonics, which was based on a small-scale study in Clackmannanshire, as well as evidence from the US. A government-commissioned review of the evidence for phonics led Professor Carole Torgerson, then at York University, to warn against making national policy off the back of just one small Scottish trial.

One way round this problem is to do larger experiments. The increasing use of the internet in public services allows for more and faster experimentation, on a larger scale and at lower cost – the randomised controlled trial on voter mobilisation that went to 61 million users in the 2010 US midterm elections, for example. However, the use of the internet doesn’t get us off the ethical hook. Facebook had to apologise after a global backlash to secret psychological tests on 689,000 of its users.

Contentious experiments should be approved by ethics committees – normal practice for trials in hospitals and universities.

We are also not interested in freewheeling trial-and-error; robust and appropriate research techniques to learn from experiments are vital. It’s best to see experimentation as a continuum, ranging from the messiness of attempts to try something new to experiments using the best available social science, such as randomised controlled trials.
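The mechanics of a randomised controlled trial are simple to state: assign units at random to a treatment or control group, apply the intervention to the treatment group only, and compare average outcomes. A minimal sketch, with invented turnout figures rather than data from any of the studies cited above:

```python
# Minimal sketch of a randomised controlled trial. Turnout rates are invented.
import random

def run_trial(population, treat, outcome, seed=1):
    """Randomly split `population` into treatment and control, apply `treat`
    to the treatment group, and compare mean outcomes."""
    rng = random.Random(seed)
    units = list(population)
    rng.shuffle(units)
    half = len(units) // 2
    treatment, control = units[:half], units[half:]
    for unit in treatment:
        treat(unit)
    t_rate = sum(outcome(u) for u in treatment) / len(treatment)
    c_rate = sum(outcome(u) for u in control) / len(control)
    return t_rate, c_rate, t_rate - c_rate

# Toy example: a "get out the vote" message nudges turnout probability up.
people = [{"saw_message": False} for _ in range(10_000)]

def show_message(person):
    person["saw_message"] = True

def voted(person):
    base_rate = 0.40 + (0.02 if person["saw_message"] else 0.0)
    return random.random() < base_rate

t, c, effect = run_trial(people, show_message, voted)
print(f"treatment turnout {t:.3f}, control turnout {c:.3f}, effect {effect:+.3f}")
```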

Experimental government means avoiding an approach where everything is fixed from the outset. What we need is “a spirit of experimentation, unburdened by promises of success”, as recommended by the late professor Roger Jowell, author of the 2003 Cabinet Office report, Trying it out [pdf]….(More)”

Big Data for Social Good


Introduction to a Special Issue of the Journal “Big Data” by Charlie Catlett and Rayid Ghani: “…organizations focused on social good are realizing the potential as well but face several challenges as they seek to become more data-driven. The biggest challenge they face is a paucity of examples and case studies on how data can be used for social good. This special issue of Big Data is targeted at tackling that challenge and focuses on highlighting some exciting and impactful examples of work that uses data for social good. The special issue is just one example of the recent surge in such efforts by the data science community. …

This special issue solicited case studies and problem statements that would either highlight (1) the use of data to solve a social problem or (2) social challenges that need data-driven solutions. From roughly 20 submissions, we selected 5 articles that exemplify this type of work. These cover five broad application areas: international development, healthcare, democracy and government, human rights, and crime prevention.

“Understanding Democracy and Development Traps Using a Data-Driven Approach” (Ranganathan et al.) details a data-driven model linking democracy, cultural values, and socioeconomic indicators, identifying two types of “traps” that hinder the development of democracy. The authors use historical data to detect causal factors and to predict how long a given country can be expected to take to overcome these traps.

“Targeting Villages for Rural Development Using Satellite Image Analysis” (Varshney et al.) discusses two case studies that use data and machine learning techniques for international economic development—solar-powered microgrids in rural India and targeting financial aid to villages in sub-Saharan Africa. In the process, the authors stress the importance of understanding the characteristics and provenance of the data and the criticality of incorporating local “on the ground” expertise.

In “Human Rights Event Detection from Heterogeneous Social Media Graphs,” Chen and Neil describe efficient and scalable techniques to use social media in order to detect emerging patterns in human rights events. They test their approach on recent events in Mexico and show that they can accurately detect relevant human rights–related tweets prior to international news sources, and in some cases, prior to local news reports, which could potentially lead to more timely, targeted, and effective advocacy by relevant human rights groups.

“Finding Patterns with a Rotten Core: Data Mining for Crime Series with Core Sets” (Wang et al.) describes a case study with the Cambridge Police Department, using a subspace clustering method to analyze the department’s full housebreak database, which contains detailed information from thousands of crimes from over a decade. They find that the method allows human crime analysts to handle vast amounts of data and provides new insights into true patterns of crime committed in Cambridge…..(More)

An In-Depth Analysis of Open Data Portals as an Emerging Public E-Service


Paper by Martin Lnenicka: “Governments collect and produce large amounts of data. Increasingly, governments worldwide have started to implement open data initiatives and to launch open data portals that release these data in open and reusable formats. As a result, a large number of open data repositories, catalogues and portals have been emerging around the world. The greater availability of interoperable and linkable open government data catalyzes secondary use of such data, so they can be used to build useful applications that leverage their value, allow insight, provide access to government services, and support transparency. The efficient development of successful open data portals makes it necessary to evaluate them systematically, in order to understand them better, assess the various types of value they generate, and identify the improvements required to increase this value. The attention of this paper is therefore directed to the field of open data portals. Its main aim is to compare selected open data portals at the national level using content analysis and to propose a new evaluation framework that further improves the quality of these portals. It also establishes a set of considerations for involving businesses and citizens in creating e-services and applications that leverage the datasets available from these portals….(More)”

Turning Government Data into Better Public Service


OMB Blog: “Every day, millions of people use their laptops, phones, and tablets to check the status of their tax refund, get the latest forecast from the National Weather Service, book a campsite at one of our national parks, and much more. There were more than 1.3 billion visits to websites across the Federal Government in just the past 90 days.

Today, during Sunshine Week when we celebrate openness and transparency in government, we are pleased to release the Digital Analytics Dashboard, a new window into the way people access the government online. For the first time, you can see how many people are using a Federal Government website, which pages are most popular, and which devices, browsers, and operating systems people are using. We’ll use the data from the Digital Analytics Program to focus our digital service teams on the services that matter most to the American people, and analyze how much progress we are making. The Dashboard will help government agencies understand how people find, access, and use government services online to better serve the public – all while protecting privacy.  The program does not track individuals. It anonymizes the IP addresses of all visitors and then uses the resulting information in the aggregate….(More)
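The post does not spell out the anonymisation scheme, but a common pattern (shown here purely as an illustration, not as the Digital Analytics Program’s implementation) is to discard the host portion of each visitor’s IP address before aggregating page-view counts:

```python
# Illustrative only: zeroing the last octet of visitor IPs, then aggregating
# page views. This is not the Digital Analytics Program's actual code.
import ipaddress
from collections import Counter

def anonymize_ip(ip: str) -> str:
    """Drop the host portion of an IPv4 address (keep only the /24 network)."""
    network = ipaddress.ip_network(f"{ip}/24", strict=False)
    return str(network.network_address)

def aggregate_hits(log_lines):
    """Count page views per URL without retaining full visitor IPs."""
    pages, networks = Counter(), Counter()
    for ip, url in log_lines:
        pages[url] += 1
        networks[anonymize_ip(ip)] += 1
    return pages, networks

pages, networks = aggregate_hits([
    ("192.0.2.17", "/weather/forecast"),
    ("192.0.2.201", "/weather/forecast"),
    ("198.51.100.5", "/parks/book-campsite"),
])
print(pages.most_common(1))   # most popular page
print(networks)               # anonymised visitor networks
```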

 

What Your Tweets Say About You


From the New Yorker: “How much can your tweets reveal about you? Judging by the last nine hundred and seventy-two words that I used on Twitter, I’m about average when it comes to feeling upbeat and being personable, and I’m less likely than most people to be depressed or angry. That, at least, is the snapshot provided by AnalyzeWords, one of the latest creations from James Pennebaker, a psychologist at the University of Texas who studies how language relates to well-being and personality. One of Pennebaker’s most famous projects is a computer program called Linguistic Inquiry and Word Count (L.I.W.C.), which looks at the words we use, and in what frequency and context, and uses this information to gauge our psychological states and various aspects of our personality….

Take a study, out last month, from a group of researchers based at the University of Pennsylvania. The psychologist Johannes Eichstaedt and his colleagues analyzed eight hundred and twenty-six million tweets across fourteen hundred American counties. (The counties contained close to ninety per cent of the U.S. population.) Then, using lists of words—some developed by Pennebaker, others by Eichstaedt’s team—that can be reliably associated with anger, anxiety, social engagement, and positive and negative emotions, they gave each county an emotional profile. Finally, they asked a simple question: Could those profiles help determine which counties were likely to have more deaths from heart disease?

The answer, it turned out, was yes….

The researchers have a theory: they suggest that “the language of Twitter may be a window into the aggregated and powerful effects of the community context.” They point to other epidemiological studies which have shown that general facts about a community, such as its “social cohesion and social capital,” have consequences for the health of individuals. Broadly speaking, people who live in poorer, more fragmented communities are less healthy than people living in richer, integrated ones. “When we do a sub-analysis, we find that the power that Twitter has is in large part accounted for by community and socioeconomic variables,” Eichstaedt told me when we spoke over Skype. In short, a young person’s negative, angry, and stressed-out tweets might reflect his or her stress-inducing environment—and that same environment may have negative health repercussions for other, older members of the same community….(More)”
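In spirit the pipeline is word counting plus correlation: score each county’s tweets against category word lists, then relate those scores to a health outcome. A toy sketch with a made-up word list and made-up figures (not the Pennsylvania team’s lexica or data):

```python
# Toy LIWC-style scoring and correlation. Word list and figures are invented.
from statistics import correlation  # available in Python 3.10+

ANGER = {"hate", "furious", "annoyed", "rage"}

def anger_score(tweets):
    """Share of words falling in the anger category, across a county's tweets."""
    words = [w.strip(".,!?").lower() for t in tweets for w in t.split()]
    return sum(w in ANGER for w in words) / max(len(words), 1)

county_tweets = {
    "County A": ["I hate this traffic", "furious about the rage outside"],
    "County B": ["lovely morning run", "grateful for my friends today"],
    "County C": ["annoyed again", "so much hate in these comments"],
}
heart_disease_rate = {"County A": 210.0, "County B": 150.0, "County C": 195.0}

names = list(county_tweets)
scores = [anger_score(county_tweets[n]) for n in names]
rates = [heart_disease_rate[n] for n in names]
print({n: round(s, 3) for n, s in zip(names, scores)})
print("correlation:", round(correlation(scores, rates), 2))
```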

Using open legislative data to map bill co-sponsorship networks in 15 countries


François Briatte at OpeningParliament.org: “A few years back, Kamil Gregor published a post under the title “Visualizing politics: Network analysis of bill sponsors”. His post, which focused on the lower chamber of the Czech Parliament, showed how basic social network analysis can support the exploration of parliamentary work, by revealing the ties that members of parliament create between each other through the co-sponsorship of private bills…. In what follows, I would like to quickly report on a small research project that I have developed over the years, under the name “parlnet”.

Legislative data on bill co-sponsorship

This project looks at bill co-sponsorship networks in European countries. Many parliaments allow their members to co-sponsor each other’s private bills, which makes it possible to represent these parliaments as collaborative networks, where a tie exists between two MPs if they have co-sponsored legislation together.

This idea is not new: it was pioneered by James Fowler in the United States, and has been the subject of extensive research in American politics, both on the U.S. Congress and on state legislatures. Similar research also exists on the bill co-sponsorship networks of parliaments in Argentina, Chile and Romania.

Inspired by this research and by Baptiste Coulmont’s visualisation of the French lower chamber, I surveyed the parliamentary websites of the following countries:

  • all 28 current members of the European Union;
  • 4 members of the EFTA: Iceland, Liechtenstein, Norway, and Switzerland.

This search returned 19 parliamentary chambers from 15 countries for which it was (relatively) easy to extract legislative data, either through open data portals like data.riksdagen.se in Sweden or data.stortinget.no in Norway, or from official parliamentary websites directly…. After splitting the data into legislative periods separated by nationwide elections, I was able to draw a large collection of networks showing bill co-sponsorship in these 19 chambers…. In this graph, each point (or node) is a Belgian MP, and each tie between two MPs indicates that they have co-sponsored at least one bill together. The colors and abbreviations used in the graph are party-related codes, which combine information on the parliamentary group and linguistic community of each MP. Because this kind of graph can be interesting to explore in more detail, I have also built interactive visualizations out of them, in order to show more detailed information on the MPs who participate in bill co-sponsorship…

The parlnet project was coded in R, and its code is public so that it might benefit from external contributions. The list of countries and chambers that it covers is not exhaustive: in some cases like Portugal, I simply failed to retrieve the data. More talented coders might therefore be able to add to the current database.
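parlnet itself is written in R, but the core construction is easy to sketch in Python. The fragment below uses made-up bill records rather than the project’s data: nodes are MPs, and a tie joins any two MPs who have co-sponsored at least one bill together, with edge weights counting shared bills.

```python
# Hedged sketch of building a co-sponsorship network from bill records.
# The bill data are invented; parlnet's actual pipeline is written in R.
from itertools import combinations
import networkx as nx  # pip install networkx

bills = {
    "Bill 001": ["MP Janssens", "MP Dupont", "MP Peeters"],
    "Bill 002": ["MP Dupont", "MP Peeters"],
    "Bill 003": ["MP Janssens", "MP Claes"],
}

G = nx.Graph()
for sponsors in bills.values():
    for a, b in combinations(sponsors, 2):
        if G.has_edge(a, b):
            G[a][b]["weight"] += 1          # co-sponsored more than one bill
        else:
            G.add_edge(a, b, weight=1)

# Simple descriptive statistics of the kind used to compare chambers.
print("MPs:", G.number_of_nodes(), "ties:", G.number_of_edges())
print("most connected:", max(dict(G.degree()).items(), key=lambda kv: kv[1]))
```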

Bill cosponsorship networks illustrate how open legislative data provided by parliaments can be turned into interactive tools that easily convey some information about parliamentary work, including, but not limited to:

  • the role of parliamentary party leaders in managing the legislation produced by their groups
  • the impact of partisan discipline and ideology on legislative collaboration between MPs
  • the extent of cross-party cooperation in various parliamentary environments and chambers… (More)

UNESCO demonstrates global impact through new transparency portal


“Opendata.UNESCO.org is intended to present comprehensive, quality and timely information about UNESCO’s projects, enabling users to find information by country/region, funding source, and sector, and providing comprehensive project data, including budget, expenditure, completion status, implementing organization, project documents, and more. It publishes programme and financial information in line with UN system experience of the IATI (International Aid Transparency Initiative) standards and other relevant transparency initiatives. UNESCO is now one of more than 230 organizations that have published to the IATI Registry, which brings together donor and developing countries, civil society organizations and other experts in aid information who are committed to working together to increase the transparency of aid.

Since its creation 70 years ago, UNESCO has tirelessly championed the causes of education, culture, natural sciences, social and human sciences, and communication and information globally. For instance, started in March 2010, the programme for the Enhancement of Literacy in Afghanistan (ELA) benefited from a $19.5 million contribution by Japan. It aimed to improve the level of literacy, numeracy and vocational skills of the adult population in 70 districts of 15 provinces of Afghanistan. Over the next three years, until April 2013, the ELA programme helped some 360,000 adult learners gain general literacy competency. An interactive map allows for easy identification of UNESCO’s high-impact programmes and up-to-date information on current and future aid allocations within and across countries.

Public participation and interactivity are key to the success of any open data project. http://Opendata.UNESCO.org will evolve as Member States and partners get involved, by displaying data on their own websites, sharing data among different networks, building and sharing applications, and providing feedback, comments, and recommendations. …(More)”

Data for policy: when the haystack is made of needles. A call for contributions


Diana Vlad-Câlcic at the European Commission: “If policy-making is ‘whatever government chooses to do or not to do’ (Th. Dye), then how do governments actually decide? Evidence-based policy-making is not a new answer to this question, but it is constantly challenging both policy-makers and scientists to sharpen their thinking, their tools and their responsiveness.  The European Commission has recognised this and has embedded in its processes, namely through Impact Assessment, policy monitoring and evaluation, an evidence-informed decision-making approach.

With four parameters I can fit an elephant, and with five I can make him wiggle his trunk. (John von Neumann)

New data technologies raise the bar high for advanced modelling, dynamic visualisation, real-time data flows and a variety of data sources, from sensors, to cell phones or the Internet as such. An abundance of (big) data, a haystack made of needles, but do public administrations have the right tools and skills to exploit it? How much of it adds real value to established statistics and to scientific evidence? Are the high hopes and the high expectations partly just hype? And what lessons can we learn from experience?

To explore these questions, the European Commission is launching a study with the Oxford Internet Institute, Technopolis and CEPS on ‘Data for policy: big data and other innovative data-driven approaches for evidence-informed policymaking’. As a first step, the study will collect examples of initiatives in public institutions at national and international level, where innovative data technologies contribute to the policy process. It will eventually develop case-studies for EU policies.

Contribute to the collective reflection by sharing with us good practices and examples you have from other public administrations. Follow the developments of the study also on Twitter @data4policyEU