Can NewsGenius make annotated government documents more understandable?


at E Pluribus Unum: “Last year, Rap Genius launched News Genius to help decode current events. Today, the General Services Administration (GSA) announced that digital annotation service News Genius is now available to help decode federal government Web projects:

“The federal government can now unlock the collaborative “genius” of citizens and communities to make public services easier to access and understand with a new free social media platform launched by GSA today at the Federal #SocialGov Summit on Entrepreneurship and Small Business,” writes Justin Herman, federal social media manager.

“News Genius, an annotation wiki based on Rap Genius now featuring federal-friendly Terms of Service, allows users to enhance policies, regulations and other documents with in-depth explanations, background information and paths to more resources. In the hands of government managers it will improve public services through citizen feedback and plain language, and will reduce costs by delivering these benefits on a free platform that doesn’t require a contract.”

This could be a significant improvement in making complicated policy documents and regulations understandable to the governed. While plain writing is indispensable for open government and mandated by law and regulation, the practice isn’t exactly uniformly practiced in Washington.

If people can understand more about what a given policy, proposed rule or regulation actually says, they may well be more likely to participate in the process of revising it. We’ll see if people adopt the tool, but on balance, that sounds like a step ahead.”

Artists Show How Anyone Can Fight the Man with Open Data


MotherBoard: “The UK’s Open Data Institute usually looks, as you’d probably expect, like an office full of people staring at screens. But visit at the moment and you might see a potato gun among the desks or a bunch of drone photos on the wall—all in the name of encouraging public discussion around and engagement with open data.
The ODI was set up by World Wide Web inventor Tim Berners-Lee and interdisciplinary researcher Nigel Shadbolt in London to push for an open data culture, and from Monday it will be hosting the second Data as Culture exhibition, which presents a more artistic take on questions surrounding the practicalities of open data. In doing so, it shows quite how the general public can (and probably really should) use data to inform their own lives and to engage with political issues.
All of the exhibits are based on freely available data, which is made a lot more animated and accessible than numbers in a spreadsheet. “I made the decision straight away to move away from anything screen-based,” curator Shiri Shalmy told me as she gave me a tour, winding through office workers tapping away on keyboards. “Everything had to be physical.”…
James Bridle’s work on drone warfare touches a similar theme, though in this case the data are not hidden: his images of military UAVs come from Google Maps. “They’re there for anybody to look at, they’re kind of secret but available,” said Shalmy, who added that with the data out there, we can’t pretend we don’t know what’s going on. “They can do things in secret as long as we pretend it’s a secret.”
We’ve looked at Bridle’s work before, from his Dronestagram photos to his chalk outlines of drones, and he’s been commissioned to do something new for the Data as Culture show: Shalmy has asked him to compare the open data on military drones against that of London’s financial centre. He’ll present what he digs up in summer.

From the series ‘Watching the Watchers.’ Image: James Bridle/ODI

Using this kind of government data, from local council expenses to military movements, shows quite how much information is available and how it can be used to hold politicians to account. In essence, anyone can do surveillance to some level. While activists including Berners-Lee push for more data to be made accessible, it’s only useful if we actually bother to engage with it, and work like Bridle’s poses the uneasy suggestion that sometimes it’s more comfortable to remain ignorant.
And in addition to reading data, we can collect it. Rather than delving into government files, a knitted banner by artist Sam Meech uses publicly generated data to make a political point. The banner bears the phrase “8 hour labour,” a reference to the eight-hour workday movement that sprang up in Britain’s Industrial Revolution. The idea was that people would have eight hours work, eight hours rest, and eight hours recreation.

A detail from Sam Meech’s Punchcard Economy. Image: Sam Meech/ODI

But the black-and-white pattern in the banner is made up of much less regular working hours: those logged by self-employed creatives, who can take part by entering their own timesheet data via virtual punchcards. Shalmy pointed out her own schedule in a week when she was setting up the exhibition: a 70-hour block woven into the knit. It’s an example of how individuals can use data to make a political point—the work is reminiscent of trade union banners and seems particularly relevant at a time when controversial zero hours contracts are on the rise.
Also garnering data from the public, artist collective Thickear are asking people to fill in data forms on their arrival, which they’ll file on an old-fashioned spike. I took one of the forms, only to be confronted with nonsensical bureaucratic-type boxes. “The data itself is not informative in any way,” said Shalmy. It’s more about the idea of who we trust to give our data to. How often do we accept privacy policies without giving ourselves the chance to even blink at the small print?…”

The six types of Twitter conversations


Lee Rainie: “Have you ever wondered what a Twitter conversation looks like from 10,000 feet? A new report from the Pew Research Center, in association with the Social Media Research Foundation, provides an aerial view of the social media network. By analyzing many thousands of Twitter conversations, we identified six different conversational archetypes. Our infographic describes each type of conversation network and explains how it is shaped by the topic being discussed and the people driving the conversation.
Read the full report: Mapping the Twitter Conversation”

Crowdsourced transit app shows what time the bus will really come


Springwise: “The problem with most transport apps is that they rely on fixed data from transport company schedules and don’t truly reflect exactly what’s going on with the city’s trains and buses at any given moment. Operating like a Waze for public transport, Israel’s Ototo app crowdsources real-time information from passengers to give users the best suggestions for their commute.
The app relies on a community of ‘Riders’, who allow anonymous location data to be sent from their smartphone whenever they’re using public transport. By collating this data, Ototo offers more realistic information about bus and train routes. While a bus may be due in five minutes, a Rider currently on that bus might be located more than five minutes away, indicating that the bus isn’t on time. Ototo can then suggest a quicker route for users. According to Fast Company, the service currently has a 12,000-strong global Riders community that powers its travel recommendations. On top of this, the app is designed in an easy-to-use infographic format that quickly and efficiently tells users where they need to be going and how long it will take. The app is free to download from the App Store.


Ototo faces competition from similar services such as New York City’s Moovit, which also details how crowded buses are.”
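
The delay check described in the Springwise piece (a Rider on the bus is further away than the timetable implies) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `RiderPing` shape, the field names, and the one-minute tolerance are hypothetical and do not describe Ototo's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class RiderPing:
    """Anonymous location report from a Rider on a bus (hypothetical shape)."""
    distance_to_stop_km: float  # remaining route distance to the user's stop
    speed_kmh: float            # recent average speed of the bus


def estimated_minutes_away(ping: RiderPing) -> float:
    """Travel time to the stop implied by the Rider's position and speed."""
    if ping.speed_kmh <= 0:
        raise ValueError("speed must be positive")
    return ping.distance_to_stop_km * 60 / ping.speed_kmh


def is_running_late(scheduled_minutes: float, ping: RiderPing,
                    tolerance_minutes: float = 1.0) -> bool:
    """True when the crowdsourced estimate exceeds the timetable by more
    than the tolerance (the tolerance value is an assumption)."""
    return estimated_minutes_away(ping) - scheduled_minutes > tolerance_minutes


# A bus scheduled to arrive in 5 minutes, with a Rider 4 km away at 20 km/h.
ping = RiderPing(distance_to_stop_km=4.0, speed_kmh=20.0)
print(estimated_minutes_away(ping))
print(is_running_late(5.0, ping))
```

A real service would presumably combine many pings per vehicle and smooth out noisy readings; this sketch only shows the comparison between timetable and observed position.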

Exploration, Extraction and ‘Rawification’. The Shaping of Transparency in the Back Rooms of Open Data


Paper by Denis, Jerome and Goëta, Samuel: “With the advent of open data initiatives, raw data has been staged as a crucial element of government transparency. If the consequences of such data-driven transparency have already been discussed, we still don’t know much about its back rooms. What does it mean for an administration to open its data? Following information infrastructure studies, this communication aims to question the modes of existence of raw data in administrations. Drawing on an ethnography of open government data projects in several French administrations, it shows that data are not ready-at-hand resources. Indeed, three kinds of operations are conducted that progressively instantiate open data. The first one is exploration. Where are, and what are, the data within the institution are tough questions, the response to which entails organizational and technical inquiries. The second one is extraction. Data are encapsulated in databases and their release implies a sometimes complex disarticulation process. The third kind of operation is ‘rawification’. It consists in a series of tasks that transform what used to be indexical professional data into raw data. To become opened, data are (re)formatted, cleaned, ungrounded. Though largely invisible, these operations foreground specific ‘frictions’ that emerge during the sociotechnical shaping of transparency, even before data publication and reuses.”

Government Surveillance and Internet Search Behavior


New paper by Marthews, Alex and Tucker, Catherine: “This paper uses data from Google Trends on search terms from before and after the surveillance revelations of June 2013 to analyze whether Google users’ search behavior shifted as a result of an exogenous shock in information about how closely their internet searches were being monitored by the U.S. government. We use data from Google Trends on search volume for 282 search terms across eleven different countries. These search terms were independently rated for their degree of privacy-sensitivity along multiple dimensions. Using panel data, our results suggest that cross-nationally, users were less likely to search using search terms that they believed might get them in trouble with the U.S. government. In the U.S., this was the main subset of search terms that were affected. However, internationally there was also a drop in traffic for search terms that were rated as personally sensitive. These results have implications for policy makers in terms of understanding the actual effects on search behavior of disclosures relating to the scale of government surveillance on the Internet and their potential effects on international competitiveness.”
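
The before-and-after comparison in this abstract can be illustrated with a simple difference-in-differences calculation. This is a simplified sketch, not the authors' panel-data model: the function names are hypothetical and the search-volume values are made up purely for illustration.

```python
from statistics import mean


def before_after_change(before, after):
    """Change in average search-volume index across the event date."""
    return mean(after) - mean(before)


def difference_in_differences(sensitive, other):
    """Compare the change for sensitive terms against the change for other terms.

    Each argument is a (before, after) pair of lists of index values.
    A negative result means sensitive terms fell relative to the rest.
    """
    return before_after_change(*sensitive) - before_after_change(*other)


# Made-up index values, NOT data from the paper.
sensitive_terms = ([50, 52, 48], [44, 46, 45])
other_terms = ([60, 61, 59], [60, 62, 58])

effect = difference_in_differences(sensitive_terms, other_terms)
print(effect)
```

Subtracting the change in the comparison group is what separates a shift specific to sensitive terms from a general movement in search volume over the same period.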

Index: Privacy and Security


The Living Library Index – inspired by the Harper’s Index – provides important statistics and highlights global trends in governance innovation. This installment focuses on privacy and security and was originally published in 2014.

Globally

  • Percentage of people who feel the Internet is eroding their personal privacy: 56%
  • Internet users who feel comfortable sharing personal data with an app: 37%
  • Number of users who consider it important to know when an app is gathering information about them: 70%
  • How many people in the online world use privacy tools to disguise their identity or location: 28%, or 415 million people
  • Country with the highest penetration of general anonymity tools among Internet users: Indonesia, where 42% of users surveyed use proxy servers
  • Percentage of China’s online population that disguises their online location to bypass governmental filters: 34%

In the United States

Over the Years

  • In 1996, percentage of the American public who were categorized as having “high privacy concerns”: 25%
    • Those with “Medium privacy concerns”: 59%
    • Those who were unconcerned with privacy: 16%
  • In 1998, number of computer users concerned about threats to personal privacy: 87%
  • In 2001, those who reported “medium to high” privacy concerns: 88%
  • Individuals who are unconcerned about privacy: 18% in 1990, down to 10% in 2004
  • How many online American adults are more concerned about their privacy in 2014 than they were a year ago, indicating rising privacy concerns: 64%
  • Number of respondents in 2012 who believe they have control over their personal information: 35%, downward trend for seven years
  • How many respondents in 2012 continue to perceive privacy and the protection of their personal information as very important or important to the overall trust equation: 78%, upward trend for seven years
  • How many consumers in 2013 trust that their bank is committed to ensuring the privacy of their personal information is protected: 35%, down from 48% in 2004

Privacy Concerns and Beliefs

  • How many Internet users worry about their privacy online: 92%
    • Those who report that their level of concern has increased from 2013 to 2014: 7 in 10
    • How many are at least sometimes worried when shopping online: 93%, up from 89% in 2012
    • Those who have some concerns when banking online: 90%, up from 86% in 2012
  • Number of Internet users who are worried about the amount of personal information about them online: 50%, up from 33% in 2009
    • Those who report that their photograph is available online: 66%
      • Their birthdate: 50%
      • Home address: 30%
      • Cell number: 24%
      • A video: 21%
      • Political affiliation: 20%
  • Consumers who are concerned about companies tracking their activities: 58%
    • Those who are concerned about the government tracking their activities: 38%
  • How many users surveyed felt that the National Security Agency (NSA) overstepped its bounds in light of recent NSA revelations: 44%
  • Respondents who are comfortable with advertisers using their web browsing history to tailor advertisements as long as it is not tied to any other personally identifiable information: 36%, up from 29% in 2012
  • Percentage of voters who do not want political campaigns to tailor their advertisements based on their interests: 86%
  • Percentage of respondents who do not want news tailored to their interests: 56%
  • Percentage of users who are worried that their information will be stolen by hackers: 75%
    • Those who are worried about companies tracking their browsing history for targeted advertising: 54%
  • How many consumers say they do not trust businesses with their personal information online: 54%
  • Top 3 most trusted companies for privacy identified by consumers from across 25 different industries in 2012: American Express, Hewlett Packard and Amazon
    • Most trusted industries for privacy: Healthcare, Consumer Products and Banking
    • Least trusted industries for privacy: Internet and Social Media, Non-Profits and Toys
  • Respondents who admit to sharing their personal information with companies they did not trust in 2012 for reasons such as convenience when making a purchase: 63%
  • Percentage of users who say they prefer free online services supported by targeted ads: 61%
    • Those who prefer paid online services without targeted ads: 33%
  • How many Internet users believe that it is not possible to be completely anonymous online: 59%
    • Those who believe complete online anonymity is still possible: 37%
    • Those who say people should have the ability to use the Internet anonymously: 59%
  • Percentage of Internet users who believe that current laws are not good enough in protecting people’s privacy online: 68%
    • Those who believe current laws provide reasonable protection: 24%

Security Related Issues

  • How many have had an email or social networking account compromised or taken over without permission: 21%
  • Those who have been stalked or harassed online: 12%
  • Those who think the federal government should do more to act against identity theft: 74%
  • Consumers who agree that they will avoid doing business with companies who they do not believe protect their privacy online: 89%
    • Among 65+ year old consumers: 96%

Privacy-Related Behavior

  • How many mobile phone users have decided not to install an app after discovering the amount of information it collects: 54%
  • Number of Internet users who have taken steps to remove or mask their digital footprint (including clearing cookies, encrypting emails, and using virtual networks to mask their IP addresses): 86%
  • Those who have set their browser to disable cookies: 65%
  • Number of users who have not allowed a service to remember their credit card information: 73%
  • Those who have chosen to block an app from accessing their location information: 53%
  • How many have signed up for a two-step sign-in process: 57%
  • Percentage of Gen-X (33-48 year olds) and Millennials (18-32 year olds) who say they never change their passwords or only change them when forced to: 41%
    • How many report using a unique password for each site and service: 4 in 10
    • Those who use the same password everywhere: 7%

Statistics and Open Data: Harvesting unused knowledge, empowering citizens and improving public services


House of Commons Public Administration Committee (Tenth Report):
“1. Open data is playing an increasingly important role in Government and society. It is data that is accessible to all, free of restrictions on use or redistribution and also digital and machine-readable so that it can be combined with other data, and thereby made more useful. This report looks at how the vast amounts of data generated by central and local Government can be used in open ways to improve accountability, make Government work better and strengthen the economy.

2. In this inquiry, we examined progress against a series of major government policy announcements on open data in recent years, and considered the prospects for further development. We heard of government open data initiatives going back some years, including the decision in 2009 to release some Ordnance Survey (OS) data as open data, and the Public Sector Mapping Agreement (PSMA) which makes OS data available for free to the public sector. The 2012 Open Data White Paper ‘Unleashing the Potential’ says that transparency through open data is “at the heart” of the Government’s agenda and that opening up would “foster innovation and reform public services”. In 2013 the report of the independently-chaired review by Stephan Shakespeare, Chief Executive of the market research and polling company YouGov, of the use, re-use, funding and regulation of Public Sector Information urged Government to move fast to make use of data. He criticised traditional public service attitudes to data before setting out his vision:

    • To paraphrase the great retailer Sir Terry Leahy, to run an enterprise without data is like driving by night with no headlights. And yet that is what Government often does. It has a strong institutional tendency to proceed by hunch, or prejudice, or by the easy option. So the new world of data is good for government, good for business, and above all good for citizens. Imagine if we could combine all the data we produce on education and health, tax and spending, work and productivity, and use that to enhance the myriad decisions which define our future; well, we can, right now. And Britain can be first to make it happen for real.

3. This was followed by publication in October 2013 of a National Action Plan which sets out the Government’s view of the economic potential of open data as well as its aspirations for greater transparency.

4. This inquiry is part of our wider programme of work on statistics and their use in Government. A full description of the studies is set out under the heading “Statistics” in the inquiries section of our website, which can be found at www.parliament.uk/pasc. For this inquiry we received 30 pieces of written evidence and took oral evidence from 12 witnesses. We are grateful to all those who have provided evidence and to our Specialist Adviser on statistics, Simon Briscoe, for his assistance with this inquiry.”

Table of Contents:

Summary
1 Introduction
2 Improving accountability through open data
3 Open Data and Economic Growth
4 Improving Government through open data
5 Moving faster to make a reality of open data
6 A strategic approach to open data?
Conclusion
Conclusions and recommendations

How Twitter Could Help Police Departments Predict Crime


Eric Jaffe in Atlantic Cities: “Initially, Matthew Gerber didn’t believe Twitter could help predict where crimes might occur. For one thing, Twitter’s 140-character limit leads to slang and abbreviations and neologisms that are hard to analyze from a linguistic perspective. Beyond that, while criminals occasionally taunt law enforcement via Twitter, few are dumb or bold enough to tweet their plans ahead of time. “My hypothesis was there was nothing there,” says Gerber.
But then, that’s why you run the data. Gerber, a systems engineer at the University of Virginia’s Predictive Technology Lab, did indeed find something there. He reports in a new research paper that public Twitter data improved the predictions for 19 of 25 crimes that occurred early last year in metropolitan Chicago, compared with predictions based on historical crime patterns alone. Predictions for stalking, criminal damage, and gambling saw the biggest bump…
Of course, the method says nothing about why Twitter data improved the predictions. Gerber speculates that people are tweeting about plans that correlate highly with illegal activity, as opposed to tweeting about crimes themselves.
Let’s use criminal damage as an example. The algorithm identified 700 Twitter topics related to criminal damage; of these, one topic involved the words “united center blackhawks bulls” and so on. Gather enough sports fans with similar tweets and some are bound to get drunk enough to damage public property after the game. Again this scenario extrapolates far more than the data tells, but it offers a possible window into the algorithm’s predictive power.

The map on the left shows predicted crime threat based on historical patterns; the one on the right includes Twitter data. (Via Decision Support Systems)
From a logistical standpoint, it wouldn’t be too difficult for police departments to use this method in their own predictions; both the Twitter data and modeling software Gerber used are freely available. The big question, he says, is whether a department used the same historical crime “hot spot” data as a baseline for comparison. If not, a new round of tests would have to be done to show that the addition of Twitter data still offered a predictive upgrade.
There’s also the matter of public acceptance. Data-driven crime prediction tends to raise any number of civil rights concerns. In 2012, privacy advocates criticized the FBI for a similar plan to use Twitter for crime predictions. In recent months the Chicago Police Department’s own methods have been knocked as a high-tech means of racial profiling. Gerber says his algorithms don’t target any individuals and only cull data posted voluntarily to a public account.”

New Field Guide Explores Open Data Innovations in Disaster Risk and Resilience


World Bank: “From Indonesia to Bangladesh to Nepal, community members armed with smartphones and GPS systems are contributing to some of the most extensive and versatile maps ever created, helping inform policy and better prepare their communities for disaster risk.
In Jakarta, more than 500 community members have been trained to collect data on thousands of hospitals, schools, private buildings, and critical infrastructure. In Sri Lanka, government and academic volunteers mapped over 30,000 buildings and 450 km of roadways using a collaborative online resource called OpenStreetMap.
These are just a few of the projects that have been catalyzed by the Open Data for Resilience Initiative (OpenDRI), developed by the World Bank’s Global Facility for Disaster Reduction and Recovery (GFDRR). Launched in 2011, OpenDRI is active in more than 20 countries today, mapping tens of thousands of buildings and urban infrastructure, providing more than 1,000 geospatial datasets to the public, and developing innovative application tools.
To expand this work, the World Bank Group has launched the OpenDRI Field Guide as a showcase of successful projects and a practical guide for governments and other organizations to shape their own open data programs….
The field guide walks readers through the steps to build open data programs based on the OpenDRI methodology. One of the first steps is data collation. Relevant datasets are often locked because of proprietary arrangements or fragmented in government bureaucracies. The field guide explores tools and methods to enable the participatory mapping projects that can fill in gaps and keep existing data relevant as cities rapidly expand.

GeoNode: Mapping Disaster Damage for Faster Recovery
One example is GeoNode, a locally controlled and open source cataloguing tool that helps manage and visualize geospatial data. The tool, already in use in two dozen countries, can be modified and easily integrated into existing platforms, giving communities greater control over mapping information.
GeoNode was used extensively after Typhoon Yolanda (Haiyan) swept the Philippines with 300 km/hour winds and a storm surge of over six meters last fall. The storm displaced nearly 11 million people and killed more than 6,000.
An event-specific GeoNode project was created immediately and ultimately collected more than 72 layers of geospatial data, from damage assessments to situation reports. The data and quick analysis capability contributed to recovery efforts, and the project is still operating in response mode at Yolandadata.org.
InaSAFE: Targeting Risk Reduction
A sister project, InaSAFE, is an open, easy-to-use tool for creating impact assessments for targeted risk reduction. The assessments are based on how an impact layer – such as a tsunami, flood, or earthquake – affects exposure data, such as population or buildings.
With InaSAFE, users can generate maps and statistical information that can be easily disseminated and even fed back into projects like GeoNode for simple, open source sharing.
The initiative, developed in collaboration with AusAID and the Government of Indonesia, was put to the test in the 2012 flood season in Jakarta, and its successes provoked a rapid national rollout and widespread interest from the international community.
Open Cities: Improving Urban Planning & Resilience
The Open Cities project, another program operating under the OpenDRI platform, aims to catalyze the creation, management and use of open data to produce innovative solutions for urban planning and resilience challenges across South Asia.
In 2013, Kathmandu was chosen as a pilot city, in part because the population faces the highest mortality threat from earthquakes in the world. Under the project, teams from the World Bank assembled partners and community mobilizers to help execute the largest regional community mapping project to date. The project surveyed more than 2,200 schools and 350 health facilities, along with road networks, points of interest, and digitized building footprints – representing nearly 340,000 individual data nodes.”
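
The impact assessment idea described for InaSAFE above, where a hazard layer is applied to exposure data, can be sketched in a few lines. This is a hedged illustration only: it does not use InaSAFE's actual API, the grids and the one-metre flood threshold are assumptions, and the numbers are invented.

```python
def affected_population(flood_depth_m, population, threshold_m=1.0):
    """Sum the population in grid cells where flood depth meets the threshold.

    Both arguments are 2D lists covering the same grid. The threshold is an
    assumed cut-off for 'affected', not a value taken from InaSAFE.
    """
    if len(flood_depth_m) != len(population):
        raise ValueError("grids must have the same number of rows")
    total = 0
    for depth_row, pop_row in zip(flood_depth_m, population):
        if len(depth_row) != len(pop_row):
            raise ValueError("grid rows must have the same length")
        for depth, people in zip(depth_row, pop_row):
            if depth >= threshold_m:
                total += people
    return total


# Invented hazard layer (flood depth in metres) and exposure layer (people per cell).
flood = [[0.0, 0.5, 1.2],
         [0.2, 1.5, 2.0]]
people = [[100, 250, 80],
          [40, 300, 120]]

print(affected_population(flood, people))  # 500
```

Real tools work with georeferenced raster and vector layers rather than plain lists, but the overlay step follows the same logic: the hazard layer selects cells, and the exposure layer supplies what is counted.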