New paper by Steven Aftergood in the Harvard Civil Rights-Civil Liberties Law Review (Vol. 48, No. 2, Summer 2013): “This Article reviews selected aspects of secrecy policy in the Obama Administration to better comprehend the dynamics of official secrecy, particularly in the national security realm. An understanding emerges: secrecy policy is founded on a set of principles so broadly conceived that they do not provide unequivocal guidance to government officials who are responsible for deciding whether or not to classify particular topics. In the absence of such guidance, individual classification decisions are apt to be shaped by extraneous factors, including bureaucratic self-interest and public controversy. The lack of clear guidance has unwholesome implications for the scope and operation of the classification system, leading it to stray from its legitimate national security foundations. But an insight into the various drivers of classification policy also suggests new remedial approaches to curtail inappropriate secrecy.”
The Shame Game: U.S. Department of Labor Smartphone App Will Allow Public to Effortlessly Scrutinize Business Employment Practices
Charles B. Palmer in the National Law Review: “The United States Department of Labor (DOL) recently launched a contest to find a new smartphone app that will allow the general public to effortlessly search for and scrutinize businesses and employers that have faced DOL citations. Dubbed the DOL Fair Labor Data Challenge, the contest seeks app entries that integrate information from consumer ratings websites, location tracking services, DOL Wage & Hour Division (WHD) citation data, and Occupational Safety & Health Administration (OSHA) citation data into one software platform. The contest also encourages app developers to include other features in their entries, such as information from state health boards and various licensing agencies.
The DOL Fair Labor Data Challenge is part of the DOL’s plan to amplify its enforcement efforts through increased public awareness and easier access to citation data. Consumers and job applicants will soon be able to search for and publicly shame employers that hold one or more citations in the DOL database, all from their smartphones.”
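The contest asks for an app, but the underlying work is data integration. As a minimal sketch of that step, here is how WHD and OSHA citation records might be joined on a normalized employer name; the file names and column names are hypothetical placeholders, not the DOL’s actual schemas:

```python
import pandas as pd

# Hypothetical exports of agency citation data (columns are assumed).
whd = pd.read_csv("whd_citations.csv")
osha = pd.read_csv("osha_citations.csv")

def normalize(name: str) -> str:
    """Crude employer-name normalization so the two datasets can be joined."""
    return name.strip().lower().replace(",", "").replace(".", "")

whd["key"] = whd["employer_name"].map(normalize)
osha["key"] = osha["establishment_name"].map(normalize)

# One row per employer, with citation counts from both agencies.
combined = (
    whd.groupby("key").size().rename("whd_citations").to_frame()
    .join(osha.groupby("key").size().rename("osha_citations"), how="outer")
    .fillna(0)
)
print(combined.sort_values("whd_citations", ascending=False).head())
```

A real entry would need fuzzier matching (addresses, employer identification numbers), since names rarely line up exactly across agencies.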
Big data + politics = open data: The case of health care data in England
New Paper in Policy & Internet: “There is a great deal of enthusiasm about the prospects for Big Data held in health care systems around the world. Health care appears to offer the ideal combination of circumstances for its exploitation, with a need to improve productivity on the one hand and the availability of data that can be used to identify opportunities for improvement on the other. The enthusiasm rests on two assumptions. First, that the data sets held by hospitals and other organizations, and the technological infrastructure needed for their acquisition, storage, and manipulation, are up to the task. Second, that organizations outside health care systems will be able to access detailed datasets. We argue that both assumptions can be challenged. The article uses the example of the National Health Service in England to identify data, technology, and information governance challenges. The public acceptability of third party access to detailed health care datasets is, at best, unclear.”
Sitegeist
“Sitegeist is a mobile application that helps you to learn more about your surroundings in seconds. Drawing on publicly available information, the app presents solid data in a simple at-a-glance format to help you tap into the pulse of your location. From demographics about people and housing to the latest popular spots or weather, Sitegeist presents localized information visually so you can get back to enjoying the neighborhood. The application draws on free APIs such as the U.S. Census, Yelp! and others to showcase what’s possible with access to data. Sitegeist was created by the Sunlight Foundation in consultation with design firm IDEO and with support from the John S. and James L. Knight Foundation. It is the third in a series of National Data Apps.”
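To get a feel for the “free APIs” Sitegeist draws on, here is a rough sketch of the lookup pattern: reverse-geocode a point to a Census tract, then pull a demographic figure for it. The endpoints and the variable code reflect our reading of the public Census APIs, not code from the app itself:

```python
import requests

lat, lon = 38.8977, -77.0365  # example coordinates

# Step 1: reverse-geocode the point to Census geographies.
geo = requests.get(
    "https://geocoding.geo.census.gov/geocoder/geographies/coordinates",
    params={"x": lon, "y": lat, "benchmark": "Public_AR_Current",
            "vintage": "Current_Current", "format": "json"},
).json()
tract = geo["result"]["geographies"]["Census Tracts"][0]

# Step 2: fetch total population (ACS 5-year variable B01003_001E) for that tract.
acs = requests.get(
    "https://api.census.gov/data/2021/acs/acs5",
    params={"get": "B01003_001E",
            "for": f"tract:{tract['TRACT']}",
            "in": f"state:{tract['STATE']} county:{tract['COUNTY']}"},
).json()
print("Population of surrounding tract:", acs[1][0])
```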
5 Big Data Projects That Could Impact Your Life
Mashable: “We reached out to a few organizations using information, both hand- and algorithm-collected, to create helpful tools for their communities. This is only a small sample of what’s out there — plenty more pop up each day, and as more information becomes public, the trend will only grow….
1. Transit Time NYC
Transit Time NYC, an interactive map developed by WNYC, lets New Yorkers click a spot in any of the city’s five boroughs for an estimate of subway or train travel times. To create it, WNYC lead developer Steve Melendez broke the city into 2,930 hexagons, then pulled data from open source itinerary platform OpenTripPlanner — the Wikipedia of mapping software — and coupled it with the MTA’s publicly downloadable subway schedule….
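(WNYC precomputed its estimates; to make the idea concrete, here is a sketch that assumes a local OpenTripPlanner 1.x instance loaded with the MTA’s public GTFS schedule, using OTP’s standard plan endpoint.)

```python
import requests

OTP = "http://localhost:8080/otp/routers/default/plan"  # assumed local OTP instance
ORIGIN = (40.7580, -73.9855)  # example origin: Times Square

def travel_time_minutes(dest_lat: float, dest_lon: float):
    """Ask OTP for a transit itinerary and return its duration in minutes."""
    resp = requests.get(OTP, params={
        "fromPlace": f"{ORIGIN[0]},{ORIGIN[1]}",
        "toPlace": f"{dest_lat},{dest_lon}",
        "mode": "TRANSIT,WALK",
    }).json()
    itineraries = resp.get("plan", {}).get("itineraries", [])
    return itineraries[0]["duration"] / 60 if itineraries else None

# The real map would run this once per hexagon centroid (2,930 cells).
print(travel_time_minutes(40.6782, -73.9442))  # a point in Brooklyn
```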
2. Twitter’s ‘Topography of Tweets’
In a blog post, Twitter unveiled a new data visualization map that displays billions of geotagged tweets in a 3D landscape format. The purpose is to display, topographically, which parts of certain cities most people are tweeting from…
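(Twitter hasn’t published its rendering code; the underlying idea, though, is a two-dimensional histogram whose per-cell counts become terrain height. A self-contained sketch, with synthetic coordinates standing in for geotagged tweets:)

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic lat/long pairs standing in for a large set of geotagged tweets.
rng = np.random.default_rng(0)
lats = 40.75 + 0.05 * rng.standard_normal(100_000)
lons = -73.98 + 0.05 * rng.standard_normal(100_000)

# Bin tweets on a grid; per-cell counts become elevation.
counts, xedges, yedges = np.histogram2d(lons, lats, bins=100)
X, Y = np.meshgrid(xedges[:-1], yedges[:-1], indexing="ij")

ax = plt.figure().add_subplot(projection="3d")
ax.plot_surface(X, Y, counts, cmap="viridis")
ax.set_xlabel("longitude")
ax.set_ylabel("latitude")
ax.set_zlabel("tweets per cell")
plt.show()
```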
3. Homicide Watch D.C.
Homicide Watch D.C. is a community-driven data site that aims to cover every murder in the District of Columbia. It’s organized into “suspect” and “victim” profiles that break down each person’s name, age, gender and race, alongside original articles reported by Homicide Watch staff…
4. Falling Fruit
Can you find a hidden apple tree along your daily bike commute? Falling Fruit can.
The website highlights overlooked or hidden edibles in urban areas across the world. By collecting public information from the U.S. Department of Agriculture, municipal tree inventories, foraging maps and street tree databases, the site has created a network of 615 types of edibles in more than 570,000 locations. The purpose is to remind urban dwellers that agriculture does exist within city boundaries — it’s just more difficult to find….
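(As a toy version of the aggregation step Falling Fruit describes, one might combine municipal street-tree inventories and keep species known to bear edible fruit. The file names, column names, and genus list here are illustrative assumptions:)

```python
import pandas as pd

EDIBLE_GENERA = {"Malus", "Prunus", "Ficus", "Morus", "Juglans"}  # deliberately partial

# Hypothetical municipal street-tree inventory exports.
inventories = [pd.read_csv(f) for f in ("nyc_trees.csv", "sf_trees.csv")]
trees = pd.concat(inventories, ignore_index=True)

# Keep rows whose species name starts with an edible genus.
genus = trees["species"].str.split().str[0]
edible = trees[genus.isin(EDIBLE_GENERA)][["species", "latitude", "longitude"]]
print(f"{len(edible)} edible trees across {edible['species'].nunique()} species")
```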
5. AIDSVu
AIDSVu is an interactive map that illustrates the prevalence of HIV in the United States. The data is pulled from the U.S. Centers for Disease Control and Prevention’s national HIV surveillance reports, which are collected at both state and county levels each year…”
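(The core computation behind a map like AIDSVu is small: turn county-level case counts into rates suitable for shading. The input file and its columns below are hypothetical stand-ins for the published surveillance tables:)

```python
import pandas as pd

counties = pd.read_csv("hiv_surveillance_counties.csv")  # hypothetical export
counties["rate_per_100k"] = 100_000 * counties["living_cases"] / counties["population"]
print(counties.nlargest(5, "rate_per_100k")[["county", "state", "rate_per_100k"]])
```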
9 models to scale open data – past, present and future
“Ones that are working now
1) Form a community to enter new data. Open Street Map and MusicBrainz are two big examples. It works because the community is the originator of the data. That said, neither has dominated its industry as much as I expected by now.
2) Sell tools to an upstream generator of open data. This is what CKAN does for central governments (and what the new ScraperWiki CKAN tool helps with); a minimal CKAN query appears after this list. It’s what mySociety does when it sells FixMyStreet installs to local councils, thereby publishing their potholes as RSS feeds.
3) Use open data (quietly). Every organisation does this and never talks about it. It’s key to long-established data resellers like Bloomberg, and it’s what most of ScraperWiki’s professional services customers ask us to do. The value to society is enormous and invisible. The big flaw is that it doesn’t help scale the supply of open data.
4) Sell tools to downstream users. This isn’t necessarily specific to open data – existing software like spreadsheets and Business Intelligence tools can be used with open or closed data. Lots of open data is on the web, so tools like the new ScraperWiki, which work well with web data, are particularly suited to it.
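What makes model 2 workable is that CKAN portals expose a uniform JSON API, so one tool can serve many governments. A minimal query against a CKAN action endpoint (demo.ckan.org is CKAN’s public test instance; any CKAN portal exposes the same route):

```python
import requests

# Search a CKAN catalogue for datasets matching a keyword.
resp = requests.get(
    "https://demo.ckan.org/api/3/action/package_search",
    params={"q": "transport", "rows": 5},
).json()

for dataset in resp["result"]["results"]:
    print(dataset["name"], "-", dataset["title"])
```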
Ones that haven’t worked
5) Collaborative curation. ScraperWiki started as an audacious attempt to create an open data curation community, based on editing scraping code in a wiki. In its original form (now called ScraperWiki Classic) this didn’t scale. …With a few exceptions, notably OpenCorporates, there aren’t yet open data curation projects.
6) General purpose data marketplaces, particularly ones that mainly reuse open data, haven’t taken off. They might one day, but I think they first need well-adopted, higher-level standards for data formatting and syncing (perhaps something like dat, perhaps something based on CSV files).
Ones I expect more of in the future
These are quite exciting models which I expect to see a lot more of.
7) Give labour/money to upstream to help them create better data. This is quite new. The only example so far, and a most excellent one, is the UK’s National Archives curating the Statute Law Database. They do the work with the help of staff seconded from commercial legal publishers and other parts of Government.
It’s clever because it generates money for the upstream source, which people trust the most and which has the most ability to improve data quality.
8) Viral open data licensing. MySQL made lots of money this way, offering proprietary dual licenses of GPL-licensed software to embedded systems makers. For data, this could use the OKFN’s Open Database License, with organisations paying when they want to mix the open data with their own closed data. I don’t know anyone actively using this model, although Chris Taggart from OpenCorporates mentioned it to me years ago.
9) Corporations release data for strategic advantage. Companies are starting to release their own data for strategic gain. This is very new. Expect more of it.”
The Little Data Book on Information and Communication Technology 2013
The World Bank: “This new addition to the Little Data Book series presents at-a-glance tables for more than 200 economies showing the most recent national data on key indicators of information and communications technology (ICT), including access, quality, affordability, efficiency, sustainability and applications.”
Power of open data reveals global corporate networks
Open Data Institute: “The ODI today welcomed the move by OpenCorporates to release open data visualisations that show the global corporate networks of millions of businesses, demonstrating the power of open data.
OpenCorporates, a company based at the ODI, has produced visuals using several sources, which it has published as open data for the first time:
- Filings made by large domestic and foreign companies to the U.S. Securities and Exchange Commission
- Banking data held by the National Information Center of the Federal Reserve System in the U.S.
- Information about individual shareholders published by the official New Zealand corporate registry
Launched today, the visualisations are available through the main OpenCorporates website.”
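Network visualisations like these are assembled from records such as the ones below. Here is a minimal query against the OpenCorporates search API; the v0.4 endpoint and response shape reflect our reading of its public documentation, and sustained use requires an API key:

```python
import requests

resp = requests.get(
    "https://api.opencorporates.com/v0.4/companies/search",
    params={"q": "Barclays"},
).json()

# Each hit wraps the record under a "company" key.
for item in resp["results"]["companies"]:
    company = item["company"]
    print(company["name"], company["jurisdiction_code"], company["company_number"])
```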
The 20 Basics of Open Government
“Watching what is going on around the world in national, state, and local governments, we think opengov is maturing and that the time has come for a basics resource for newbies. Our goal was to include the full expanse of open government and show how it all ties together, so that when you, the astute reader, meet up with one of the various opengov cliques that use the terminology in a narrowly defined way, you can see how they fit into the bigger picture. You should also be able to determine how opengov can best be applied to benefit whatever you’re up to, while keeping in mind the need to provide both access for citizens to engage with government and access to information.
Have a read through it, and let us know what you think! When you find a typo – or something you disagree with – or something we missed, let us know that as well. The easiest way is right there in the comments (we’re not afraid to be called out in public!), but we’re open to email and Twitter as well. We’re looking forward to hearing what you think!”
The Real-Time City? Big Data and Smart Urbanism
New paper by Rob Kitchin from the National University of Ireland, Maynooth (NUI Maynooth) – NIRSA: “‘Smart cities’ is a term that has gained traction in academia, business and government to describe cities that, on the one hand, are increasingly composed of and monitored by pervasive and ubiquitous computing and, on the other, whose economy and governance are being driven by innovation, creativity and entrepreneurship, enacted by smart people. This paper focuses on the former and how cities are being instrumented with digital devices and infrastructure that produce ‘big data’ which enable real-time analysis of city life, new modes of technocratic urban governance, and a re-imagining of cities. The paper details a number of projects that seek to produce a real-time analysis of the city and provides a critical reflection on the implications of big data and smart urbanism.”