Potholes and Big Data: Crowdsourcing Our Way to Better Government


Phil Simon in Wired: “Big Data is transforming many industries and functions within organizations with relatively limited budgets.
Consider Thomas M. Menino, up until recently Boston’s longest-serving mayor. At some point in the past few years, Menino realized that it was no longer 1950. Perhaps he was hobnobbing with some techies from MIT at dinner one night. Whatever his motivation, he decided that there just had to be a better, more cost-effective way to maintain and fix the city’s roads. Maybe smartphones could help the city take a more proactive approach to road maintenance.
To that end, in July 2012, the Mayor’s Office of New Urban Mechanics launched a new project called Street Bump, an app that allows drivers to automatically report the road hazards to the city as soon as they hear that unfortunate “thud,” with their smartphones doing all the work.
The app’s developers say their work has already sparked interest from other cities in the U.S., Europe, Africa and elsewhere that are imagining other ways to harness the technology.
Before they even start their trip, drivers using Street Bump fire up the app, then set their smartphones either on the dashboard or in a cup holder. The app takes care of the rest, using the phone’s accelerometer — a motion detector — to sense when a bump is hit. GPS records the location, and the phone transmits it to an AWS remote server.
But that’s not the end of the story. It turned out that the first version of the app reported far too many false positives (i.e., phantom potholes). This finding no doubt gave ammunition to the many naysayers who believe that technology will never be able to do what people can and that things are just fine as they are, thank you. Street Bump 1.0 “collected lots of data but couldn’t differentiate between potholes and other bumps.” After all, your smartphone or cell phone isn’t inert; it moves in the car naturally because the car is moving. And what about the scores of people whose phones “move” because they check their messages at a stoplight?
To their credit, Menino and his motley crew weren’t entirely discouraged by this initial setback. In their gut, they knew that they were on to something. The idea and potential of the Street Bump app were worth pursuing and refining, even if the first version was a bit lacking. Plus, they have plenty of examples from which to learn. It’s not like the iPad, iPod, and iPhone haven’t evolved considerably over time.
Enter InnoCentive, a Massachusetts-based firm specializing in open innovation and crowdsourcing. The City of Boston contracted InnoCentive to improve Street Bump and reduce the amount of tail chasing. The company accepted the challenge and essentially turned it into a contest, a process sometimes called gamification. InnoCentive offered a network of 400,000 experts a share of $25,000 in prize money donated by Liberty Mutual.
Almost immediately, the ideas to improve Street Bump poured in from unexpected places. This crowd had wisdom. Ultimately, the best suggestions came from:

  • A group of hackers in Somerville, Massachusetts, that promotes community education and research
  • The head of the mathematics department at Grand Valley State University in Allendale, MI.
  • An anonymous software engineer

…Crowdsourcing roadside maintenance isn’t just cool. Increasingly, projects like Street Bump are resulting in substantial savings — and better government.”
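
The sensing step described in the excerpt (the accelerometer notices a jolt, GPS tags the location) can be sketched in a few lines. This is a minimal illustration, not Street Bump's actual code: the `Sample` and `BumpReport` types, the 4.0 m/s² threshold and the one-second quiet period are all assumptions made for the example. It also shows why a first version might over-report: a simple threshold cannot tell a pothole from a phone being picked up at a stoplight.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    t: float        # seconds since the trip started
    accel_z: float  # vertical acceleration in m/s^2, gravity already removed
    lat: float
    lon: float


@dataclass
class BumpReport:
    t: float
    lat: float
    lon: float
    magnitude: float  # absolute acceleration of the sample that triggered the report


def detect_bumps(samples: List[Sample],
                 threshold: float = 4.0,      # hypothetical value, not from the article
                 quiet_period_s: float = 1.0  # hypothetical value, not from the article
                 ) -> List[BumpReport]:
    """Flag every sample whose vertical jolt reaches the threshold.

    Samples that arrive within quiet_period_s of the previous report are
    skipped so one bump is not reported several times.
    """
    reports: List[BumpReport] = []
    last_t = None
    for s in samples:
        if abs(s.accel_z) < threshold:
            continue
        if last_t is not None and s.t - last_t < quiet_period_s:
            continue
        reports.append(BumpReport(s.t, s.lat, s.lon, abs(s.accel_z)))
        last_t = s.t
    return reports


if __name__ == "__main__":
    trip = [
        Sample(0.0, 0.3, 42.3601, -71.0589),   # smooth road
        Sample(2.0, 6.0, 42.3602, -71.0590),   # real jolt
        Sample(2.1, 5.0, 42.3602, -71.0590),   # same jolt, still ringing
        Sample(5.0, -4.5, 42.3605, -71.0593),  # phone grabbed at a stoplight
    ]
    for report in detect_bumps(trip):
        print(report)
```

In this toy trip the detector emits two reports, one for the real jolt and one for the handled phone, which is the kind of phantom pothole the excerpt describes.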

Can NewsGenius make annotated government documents more understandable?


at E Pluribus Unum: “Last year, Rap Genius launched News Genius to help decode current events. Today, the General Services Administration (GSA) announced that digital annotation service News Genius is now available to help decode federal government Web projects:

“The federal government can now unlock the collaborative “genius” of citizens and communities to make public services easier to access and understand with a new free social media platform launched by GSA today at the Federal #SocialGov Summit on Entrepreneurship and Small Business,” writes Justin Herman, federal social media manager.

“News Genius, an annotation wiki based on Rap Genius now featuring federal-friendly Terms of Service, allows users to enhance policies, regulations and other documents with in-depth explanations, background information and paths to more resources. In the hands of government managers it will improve public services through citizen feedback and plain language, and will reduce costs by delivering these benefits on a free platform that doesn’t require a contract.”

This could be a significant improvement in making complicated policy documents and regulations understandable to the governed. While plain writing is indispensable for open government and mandated by law and regulation, it isn’t exactly uniformly practiced in Washington.

If people can understand more about what a given policy, proposed rule or regulation actually says, they may well be more likely to participate in the process of revising it. We’ll see if people adopt the tool, but on balance, that sounds like a step ahead.”

Artists Show How Anyone Can Fight the Man with Open Data


MotherBoard: “The UK’s Open Data Institute usually looks, as you’d probably expect, like an office full of people staring at screens. But visit at the moment and you might see a potato gun among the desks or a bunch of drone photos on the wall—all in the name of encouraging public discussion around and engagement with open data.
The ODI was set up by World Wide Web inventor Tim Berners-Lee and interdisciplinary researcher Nigel Shadbolt in London to push for an open data culture, and from Monday it will be hosting the second Data as Culture exhibition, which presents a more artistic take on questions surrounding the practicalities of open data. In doing so, it shows quite how the general public can (and probably really should) use data to inform their own lives and to engage with political issues.
All of the exhibits are based on freely available data, which is made a lot more animated and accessible than numbers in a spreadsheet. “I made the decision straight away to move away from anything screen-based,” curator Shiri Shalmy told me as she gave me a tour, winding through office workers tapping away on keyboards. “Everything had to be physical.”…
James Bridle’s work on drone warfare touches a similar theme, though in this case the data are not hidden: his images of military UAVs come from Google Maps. “They’re there for anybody to look at, they’re kind of secret but available,” said Shalmy, who added that with the data out there, we can’t pretend we don’t know what’s going on. “They can do things in secret as long as we pretend it’s a secret.”
We’ve looked at Bridle’s work before, from his Dronestagram photos to his chalk outlines of drones, and he’s been commissioned to do something new for the Data as Culture show: Shalmy has asked him to compare the open data on military drones against that of London’s financial centre. He’ll present what he digs up in summer.

From the series ‘Watching the Watchers.’ Image: James Bridle/ODI

Using this kind of government data—from local council expenses to military movements—shows quite how much information is available and how it can be used to hold politicians to account. In essence, anyone can do surveillance to some level. While activists including Berners-Lee push for more data to be made accessible, it’s only useful if we actually bother to engage with it, and work like Bridle’s poses the uneasy suggestion that sometimes it’s more comfortable to remain ignorant.
And in addition to reading data, we can collect it. Rather than delving into government files, a knitted banner by artist Sam Meech uses publicly generated data to make a political point. The banner bears the phrase “8 hour labour,” a reference to the eight-hour workday movement that sprang up in Britain’s Industrial Revolution. The idea was that people would have eight hours work, eight hours rest, and eight hours recreation.

A detail from Sam Meech’s Punchcard Economy. Image: Sam Meech/ODI

But the black-and-white pattern in the banner is made up of much less regular working hours: those logged by self-employed creatives, who can take part by entering their own timesheet data via virtual punchcards. Shalmy pointed out her own schedule in a week when she was setting up the exhibition: a 70-hour block woven into the knit. It’s an example of how individuals can use data to make a political point—the work is reminiscent of trade union banners and seems particularly relevant at a time when controversial zero hours contracts are on the rise.
Also garnering data from the public, artist collective Thickear are asking people to fill in data forms on their arrival, which they’ll file on an old-fashioned spike. I took one of the forms, only to be confronted with nonsensical bureaucratic-type boxes. “The data itself is not informative in any way,” said Shalmy. It’s more about the idea of who we trust to give our data to. How often do we accept privacy policies without giving ourselves the chance to even blink at the small print?…”

The six types of Twitter conversations


Lee Rainie: “Have you ever wondered what a Twitter conversation looks like from 10,000 feet? A new report from the Pew Research Center, in association with the Social Media Research Foundation, provides an aerial view of the social media network. By analyzing many thousands of Twitter conversations, we identified six different conversational archetypes. Our infographic describes each type of conversation network and explains how it is shaped by the topic being discussed and the people driving the conversation.
Read the full report: Mapping the Twitter Conversation”

Crowdsourced transit app shows what time the bus will really come


Springwise: “The problem with most transport apps is that they rely on fixed data from transport company schedules and don’t truly reflect exactly what’s going on with the city’s trains and buses at any given moment. Operating like a Waze for public transport, Israel’s Ototo app crowdsources real-time information from passengers to give users the best suggestions for their commute.
The app relies on a community of ‘Riders’, who allow anonymous location data to be sent from their smartphone whenever they’re using public transport. By collating this data, Ototo offers more realistic information about bus and train routes. While a bus may be due in five minutes, a Rider currently on that bus might be located more than five minutes away, indicating that the bus isn’t on time. Ototo can then suggest a quicker route for users. According to Fast Company, the service currently has a 12,000-strong global Riders community that powers its travel recommendations. On top of this, the app is designed in an easy-to-use infographic format that quickly and efficiently tells users where they need to be going and how long it will take. The app is free to download from the App Store.
Ototo faces competition from similar services such as New York City’s Moovit, which also details how crowded buses are.”
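
The inference in the excerpt (a bus is due in five minutes, but a Rider on it is more than five minutes away, so the bus is late) can be sketched as below. This is a rough illustration under stated assumptions, not Ototo's method: the `RiderPing` type, the `estimated_arrival` function and the idea that each ping already carries a travel-time estimate to the stop are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RiderPing:
    route_id: str
    minutes_from_stop: float  # assumed: travel time from the rider's position to the stop


def estimated_arrival(scheduled_min: float,
                      pings: List[RiderPing],
                      route_id: str) -> Tuple[float, bool]:
    """Return (eta_in_minutes, is_late) for the next bus on route_id.

    With no riders on the route, fall back to the timetable. Otherwise use
    the closest rider as a stand-in for the next bus, which ignores any
    nearer bus that happens to carry no app users.
    """
    on_route = [p.minutes_from_stop for p in pings if p.route_id == route_id]
    if not on_route:
        return scheduled_min, False
    eta = min(on_route)
    return eta, eta > scheduled_min


if __name__ == "__main__":
    pings = [RiderPing("42", 9.0), RiderPing("42", 14.0), RiderPing("7", 2.0)]
    print(estimated_arrival(5.0, pings, "42"))
```

The fallback to the timetable matters for a small community of riders: on routes nobody is currently reporting from, the app can do no better than the fixed schedule.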

Exploration, Extraction and ‘Rawification’. The Shaping of Transparency in the Back Rooms of Open Data


Paper by Denis, Jerome and Goëta, Samuel: “With the advent of open data initiatives, raw data has been staged as a crucial element of government transparency. If the consequences of such data-driven transparency have already been discussed, we still don’t know much about its back rooms. What does it mean for an administration to open its data? Following information infrastructure studies, this communication aims to question the modes of existence of raw data in administrations. Drawing on an ethnography of open government data projects in several French administrations, it shows that data are not ready-at-hand resources. Indeed, three kinds of operations are conducted that progressively instantiate open data. The first one is exploration. Where are, and what are, the data within the institution are tough questions, the response to which entails organizational and technical inquiries. The second one is extraction. Data are encapsulated in databases and their release implies a sometimes complex disarticulation process. The third kind of operation is ‘rawification’. It consists in a series of tasks that transforms what used to be indexical professional data into raw data. To become opened, data are (re)formatted, cleaned, ungrounded. Though largely invisible, these operations foreground specific ‘frictions’ that emerge during the sociotechnical shaping of transparency, even before data publication and reuses.”

Government Surveillance and Internet Search Behavior


New paper by Marthews, Alex and Tucker, Catherine: “This paper uses data from Google Trends on search terms from before and after the surveillance revelations of June 2013 to analyze whether Google users’ search behavior shifted as a result of an exogenous shock in information about how closely their internet searches were being monitored by the U.S. government. We use data from Google Trends on search volume for 282 search terms across eleven different countries. These search terms were independently rated for their degree of privacy-sensitivity along multiple dimensions. Using panel data, our results suggest that cross-nationally, users were less likely to search using search terms that they believed might get them in trouble with the U.S. government. In the U.S., this was the main subset of search terms that were affected. However, internationally there was also a drop in traffic for search terms that were rated as personally sensitive. These results have implications for policy makers in terms of understanding the actual effects on search behavior of disclosures relating to the scale of government surveillance on the Internet and their potential effects on international competitiveness.”
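
The before-and-after comparison described in the abstract can be illustrated with a bare-bones difference-in-differences calculation. This is a simplification chosen for illustration, not the authors' panel model: the `diff_in_diff` function, the row layout and the numbers in the usage example are all made up and are not results from the paper.

```python
from statistics import mean
from typing import Iterable, Tuple

Row = Tuple[int, bool, float]  # (week number, term rated sensitive?, search index)


def diff_in_diff(rows: Iterable[Row], cutoff_week: int) -> float:
    """Change in mean index for sensitive terms minus the change for other terms.

    Weeks >= cutoff_week count as "after" the event. A negative result means
    sensitive terms fell further than the comparison terms did.
    """
    rows = list(rows)

    def avg(sensitive: bool, after: bool) -> float:
        return mean(v for (w, s, v) in rows
                    if s == sensitive and (w >= cutoff_week) == after)

    change_sensitive = avg(True, True) - avg(True, False)
    change_other = avg(False, True) - avg(False, False)
    return change_sensitive - change_other


if __name__ == "__main__":
    # Made-up index values for illustration only.
    toy = [
        (1, True, 100.0), (2, True, 100.0), (3, True, 90.0), (4, True, 90.0),
        (1, False, 100.0), (2, False, 100.0), (3, False, 98.0), (4, False, 98.0),
    ]
    print(diff_in_diff(toy, cutoff_week=3))
```

Subtracting the change in the comparison terms is what separates a shift specific to sensitive searches from a general rise or fall in search volume over the same weeks.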

Index: Privacy and Security


The Living Library Index – inspired by the Harper’s Index – provides important statistics and highlights global trends in governance innovation. This installment focuses on privacy and security and was originally published in 2014.

Globally

  • Percentage of people who feel the Internet is eroding their personal privacy: 56%
  • Internet users who feel comfortable sharing personal data with an app: 37%
  • Percentage of users who consider it important to know when an app is gathering information about them: 70%
  • How many people in the online world use privacy tools to disguise their identity or location: 28%, or 415 million people
  • Country with the highest penetration of general anonymity tools among Internet users: Indonesia, where 42% of users surveyed use proxy servers
  • Percentage of China’s online population that disguises their online location to bypass governmental filters: 34%

In the United States

Over the Years

  • In 1996, percentage of the American public who were categorized as having “high privacy concerns”: 25%
    • Those with “Medium privacy concerns”: 59%
    • Those who were unconcerned with privacy: 16%
  • In 1998, percentage of computer users concerned about threats to personal privacy: 87%
  • In 2001, those who reported “medium to high” privacy concerns: 88%
  • Individuals who are unconcerned about privacy: 18% in 1990, down to 10% in 2004
  • How many online American adults are more concerned about their privacy in 2014 than they were a year ago, indicating rising privacy concerns: 64%
  • Percentage of respondents in 2012 who believe they have control over their personal information: 35%, downward trend for seven years
  • How many respondents in 2012 continue to perceive privacy and the protection of their personal information as very important or important to the overall trust equation: 78%, upward trend for seven years
  • How many consumers in 2013 trust that their bank is committed to ensuring the privacy of their personal information is protected: 35%, down from 48% in 2004

Privacy Concerns and Beliefs

  • How many Internet users worry about their privacy online: 92%
    • Those who report that their level of concern has increased from 2013 to 2014: 7 in 10
    • How many are at least sometimes worried when shopping online: 93%, up from 89% in 2012
    • Those who have some concerns when banking online: 90%, up from 86% in 2012
  • Percentage of Internet users who are worried about the amount of personal information about them online: 50%, up from 33% in 2009
    • Those who report that their photograph is available online: 66%
      • Their birthdate: 50%
      • Home address: 30%
      • Cell number: 24%
      • A video: 21%
      • Political affiliation: 20%
  • Consumers who are concerned about companies tracking their activities: 58%
    • Those who are concerned about the government tracking their activities: 38%
  • How many users surveyed felt that the National Security Agency (NSA) overstepped its bounds in light of recent NSA revelations: 44%
  • Respondents who are comfortable with advertisers using their web browsing history to tailor advertisements as long as it is not tied to any other personally identifiable information: 36%, up from 29% in 2012
  • Percentage of voters who do not want political campaigns to tailor their advertisements based on their interests: 86%
  • Percentage of respondents who do not want news tailored to their interests: 56%
  • Percentage of users who are worried that their information will be stolen by hackers: 75%
    • Those who are worried about companies tracking their browsing history for targeted advertising: 54%
  • How many consumers say they do not trust businesses with their personal information online: 54%
  • Top 3 most trusted companies for privacy identified by consumers from across 25 different industries in 2012: American Express, Hewlett Packard and Amazon
    • Most trusted industries for privacy: Healthcare, Consumer Products and Banking
    • Least trusted industries for privacy: Internet and Social Media, Non-Profits and Toys
  • Respondents who admit to sharing their personal information with companies they did not trust in 2012 for reasons such as convenience when making a purchase: 63%
  • Percentage of users who say they prefer free online services supported by targeted ads: 61%
    • Those who prefer paid online services without targeted ads: 33%
  • How many Internet users believe that it is not possible to be completely anonymous online: 59%
    • Those who believe complete online anonymity is still possible: 37%
    • Those who say people should have the ability to use the Internet anonymously: 59%
  • Percentage of Internet users who believe that current laws are not good enough in protecting people’s privacy online: 68%
    • Those who believe current laws provide reasonable protection: 24%

Security Related Issues

  • How many have had an email or social networking account compromised or taken over without permission: 21%
  • Those who have been stalked or harassed online: 12%
  • Those who think the federal government should do more to act against identity theft: 74%
  • Consumers who agree that they will avoid doing business with companies who they do not believe protect their privacy online: 89%
    • Among 65+ year old consumers: 96%

Privacy-Related Behavior

  • How many mobile phone users have decided not to install an app after discovering the amount of information it collects: 54%
  • Percentage of Internet users who have taken steps to remove or mask their digital footprint (including clearing cookies, encrypting emails, and using virtual networks to mask their IP addresses): 86%
  • Those who have set their browser to disable cookies: 65%
  • Percentage of users who have not allowed a service to remember their credit card information: 73%
  • Those who have chosen to block an app from accessing their location information: 53%
  • How many have signed up for a two-step sign-in process: 57%
  • Percentage of Gen-X (33-48 year olds) and Millennials (18-32 year olds) who say they never change their passwords or only change them when forced to: 41%
    • How many report using a unique password for each site and service: 4 in 10
    • Those who use the same password everywhere: 7%

Statistics and Open Data: Harvesting unused knowledge, empowering citizens and improving public services


House of Commons Public Administration Committee (Tenth Report):
“1. Open data is playing an increasingly important role in Government and society. It is data that is accessible to all, free of restrictions on use or redistribution and also digital and machine-readable so that it can be combined with other data, and thereby made more useful. This report looks at how the vast amounts of data generated by central and local Government can be used in open ways to improve accountability, make Government work better and strengthen the economy.

2. In this inquiry, we examined progress against a series of major government policy announcements on open data in recent years, and considered the prospects for further development. We heard of government open data initiatives going back some years, including the decision in 2009 to release some Ordnance Survey (OS) data as open data, and the Public Sector Mapping Agreement (PSMA) which makes OS data available for free to the public sector.  The 2012 Open Data White Paper ‘Unleashing the Potential’ says that transparency through open data is “at the heart” of the Government’s agenda and that opening up would “foster innovation and reform public services”. In 2013 the report of the independently-chaired review by Stephan Shakespeare, Chief Executive of the market research and polling company YouGov, of the use, re-use, funding and regulation of Public Sector Information urged Government to move fast to make use of data. He criticised traditional public service attitudes to data before setting out his vision:

    • To paraphrase the great retailer Sir Terry Leahy, to run an enterprise without data is like driving by night with no headlights. And yet that is what Government often does. It has a strong institutional tendency to proceed by hunch, or prejudice, or by the easy option. So the new world of data is good for government, good for business, and above all good for citizens. Imagine if we could combine all the data we produce on education and health, tax and spending, work and productivity, and use that to enhance the myriad decisions which define our future; well, we can, right now. And Britain can be first to make it happen for real.

3. This was followed by publication in October 2013 of a National Action Plan which sets out the Government’s view of the economic potential of open data as well as its aspirations for greater transparency.

4. This inquiry is part of our wider programme of work on statistics and their use in Government. A full description of the studies is set out under the heading “Statistics” in the inquiries section of our website, which can be found at www.parliament.uk/pasc. For this inquiry we received 30 pieces of written evidence and took oral evidence from 12 witnesses. We are grateful to all those who have provided evidence and to our Specialist Adviser on statistics, Simon Briscoe, for his assistance with this inquiry.”

Table of Contents:

Summary
1 Introduction
2 Improving accountability through open data
3 Open Data and Economic Growth
4 Improving Government through open data
5 Moving faster to make a reality of open data
6 A strategic approach to open data?
Conclusion
Conclusions and recommendations

How Twitter Could Help Police Departments Predict Crime


Eric Jaffe in Atlantic Cities: “Initially, Matthew Gerber didn’t believe Twitter could help predict where crimes might occur. For one thing, Twitter’s 140-character limit leads to slang and abbreviations and neologisms that are hard to analyze from a linguistic perspective. Beyond that, while criminals occasionally taunt law enforcement via Twitter, few are dumb or bold enough to tweet their plans ahead of time. “My hypothesis was there was nothing there,” says Gerber.
But then, that’s why you run the data. Gerber, a systems engineer at the University of Virginia’s Predictive Technology Lab, did indeed find something there. He reports in a new research paper that public Twitter data improved the predictions for 19 of 25 crimes that occurred early last year in metropolitan Chicago, compared with predictions based on historical crime patterns alone. Predictions for stalking, criminal damage, and gambling saw the biggest bump…
Of course, the method says nothing about why Twitter data improved the predictions. Gerber speculates that people are tweeting about plans that correlate highly with illegal activity, as opposed to tweeting about crimes themselves.
Let’s use criminal damage as an example. The algorithm identified 700 Twitter topics related to criminal damage; of these, one topic involved the words “united center blackhawks bulls” and so on. Gather enough sports fans with similar tweets and some are bound to get drunk enough to damage public property after the game. Again this scenario extrapolates far more than the data tells, but it offers a possible window into the algorithm’s predictive power.

The map on the left shows predicted crime threat based on historical patterns; the one on the right includes Twitter data. (Via Decision Support Systems)
From a logistical standpoint, it wouldn’t be too difficult for police departments to use this method in their own predictions; both the Twitter data and modeling software Gerber used are freely available. The big question, he says, is whether a department used the same historical crime “hot spot” data as a baseline for comparison. If not, a new round of tests would have to be done to show that the addition of Twitter data still offered a predictive upgrade.
There’s also the matter of public acceptance. Data-driven crime prediction tends to raise any number of civil rights concerns. In 2012, privacy advocates criticized the FBI for a similar plan to use Twitter for crime predictions. In recent months the Chicago Police Department’s own methods have been knocked as a high-tech means of racial profiling. Gerber says his algorithms don’t target any individuals and only cull data posted voluntarily to a public account.”