9 models to scale open data – past, present and future


Open Knowledge Foundation Blog: “The possibilities of open data have been enthralling us for 10 years… But that excitement isn’t what matters in the end. What matters is scale – which organisational structures will make this movement explode? This post quickly and provocatively goes through some that haven’t worked (yet!) and some that have.
Ones that are working now
1) Form a community to enter new data. OpenStreetMap and MusicBrainz are two big examples. It works because the community is the originator of the data. That said, neither has dominated its industry as much as I thought they would have by now.
2) Sell tools to an upstream generator of open data. This is what CKAN does for central governments (and what the new ScraperWiki CKAN tool helps with). It’s what mySociety does when selling FixMyStreet installs to local councils, thereby publishing their potholes as RSS feeds.
3) Use open data (quietly). Every organisation does this and never talks about it. It’s key to quite old data resellers like Bloomberg. It is what most of ScraperWiki’s professional services customers ask us to do. The value to society is enormous and invisible. The big flaw is that it doesn’t help scale supply of open data.
4) Sell tools to downstream users. This isn’t necessarily open data specific – existing software like spreadsheets and Business Intelligence can be used with open or closed data. Lots of open data is on the web, so tools like the new ScraperWiki which work well with web data are particularly suited to it.
Ones that haven’t worked
5) Collaborative curation. ScraperWiki started as an audacious attempt to create an open data curation community, based on editing scraping code in a wiki. In its original form (now called ScraperWiki Classic) this didn’t scale. …With a few exceptions, notably OpenCorporates, there aren’t yet many open data curation projects.
6) General purpose data marketplaces, particularly ones that are mainly reusing open data, haven’t taken off. They might do one day; however, I think they need well-adopted, higher-level standards for data formatting and syncing first (perhaps something like dat, perhaps something based on CSV files).
Ones I expect more of in the future
These are quite exciting models which I expect to see a lot more of.
7) Give labour/money to upstream to help them create better data. This is quite new. The only, and most excellent, example of it is the UK’s National Archives curating the Statute Law Database. They do the work with the help of staff seconded from commercial legal publishers and other parts of Government.
It’s clever because it generates money for upstream, which people trust the most, and which has the most ability to improve data quality.
8) Viral open data licensing. MySQL made lots of money this way, offering proprietary dual licenses of GPL-licensed software to embedded systems makers. In data this could use OKFN’s Open Database License, and organisations would pay when they wanted to mix the open data with their own closed data. I don’t know anyone actively using it, although Chris Taggart from OpenCorporates mentioned this model to me years ago.
9) Corporations release data for strategic advantage. Companies are starting to release their own data for strategic gain. This is very new. Expect more of it.”

Digital Public Spaces


FutureEverything Publications: “This publication gathers a range of short explorations of the idea of the Digital Public Space. The central vision of the Digital Public Space is to give everyone everywhere unrestricted access to an open resource of culture and knowledge. This vision emerged from ideas about building platforms for engagement around cultural archives and has grown into something wider, which this publication seeks to hone and explore.
This is the first publication to look at the emergence of the Digital Public Space. Contributors include some of the people who are working to make the Digital Public Space happen.
The Digital Public Spaces publication has been developed by FutureEverything working with Bill Thompson of the BBC and in association with The Creative Exchange.”

Understanding Smart Data Disclosure Policy Success: The Case of Green Button


New Paper by Djoko Sigit Sayogo and Theresa Pardo: “Open data policies are expected to promote innovations that stimulate social, political and economic change. In pursuit of innovation potential, open data has expanded to a wider environment involving government, business and citizens. The US government recently launched such a collaboration through a smart data policy supporting energy efficiency called Green Button. This paper explores the implementation of Green Button and identifies motivations and success factors facilitating successful collaboration between public and private organizations to support smart disclosure policy. Analyzing qualitative data from semi-structured interviews with experts involved in Green Button initiation and implementation, this paper presents some key findings. The success of Green Button can be attributed to the interaction between internal and external factors. The external factors consist of both market and non-market drivers: economic factors, technology-related factors, regulatory contexts and policy incentives, and some factors that stimulate imitative behavior among the adopters. The external factors create the necessary institutional environment for the Green Button implementation. On the other hand, the acceptance and adoption of Green Button itself is influenced by the fit of Green Button’s capabilities to the strategic mission of energy and utility companies in providing energy efficiency programs. We also identify the different roles of government during the different stages of Green Button implementation.”
[Recipient of Best Management/Policy Paper Award, dgo2013]

Next.Data.gov


Nick Sinai at the White House Blog: “Today, we’re excited to share a sneak preview of a new design for Data.gov, called Next.Data.gov. The upgrade builds on the President’s May 2013 Open Data Executive Order that aims to fuse open-data practices into the Federal Government’s DNA. Next.Data.gov is far from complete (think of it as a very early beta), but we couldn’t wait to share our design approach and the technical details behind it – knowing that we need your help to make it even better.  Here are some key features of the new design:
 


Leading with Data: The Data.gov team at General Services Administration (GSA), a handful of Presidential Innovation Fellows, and OSTP staff designed Next.Data.gov to put data first. The team studied the usage patterns on Data.gov and found that visitors were hungry for examples of how data are used. The team also noticed many sources outside of Data.gov, such as tweets and articles, featuring Federal datasets in action. So Next.Data.gov includes a rich stream that enables each data community to communicate how its datasets are impacting companies and the public.


In this dynamic stream, you’ll find blog posts, tweets, quotes, and other features that more fully showcase the wide range of information assets that exist within the vaults of government.
Powerful Search: The backend of Next.Data.gov is CKAN and is powered by Solr—a powerful search engine that will make it even easier to find relevant datasets online. Suggested search terms have been added to help users find (and type) things faster. Next.Data.gov will start to index datasets from agencies that publish their catalogs publicly, in line with the President’s Open Data Executive Order. The early preview launching today features datasets from the Department of Health and Human Services—one of the first Federal agencies to publish a machine-readable version of its data catalog.
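For readers curious what “powered by CKAN and Solr” looks like in practice, here is a minimal sketch of querying a CKAN catalog’s search action over HTTP. The base URL and the query term are illustrative assumptions, not details taken from the post; point the script at whichever CKAN instance you care about.

```python
# Minimal sketch of querying a CKAN catalog's search API (the layer Solr backs).
# The base URL below is an illustrative assumption; swap in the CKAN instance
# you actually want to search.
import requests

CKAN_BASE = "https://catalog.data.gov"  # assumed endpoint, for illustration only


def search_datasets(query, rows=5):
    """Return titles of datasets matching `query` via CKAN's package_search action."""
    resp = requests.get(
        f"{CKAN_BASE}/api/3/action/package_search",
        params={"q": query, "rows": rows},
        timeout=30,
    )
    resp.raise_for_status()
    payload = resp.json()
    if not payload.get("success"):
        raise RuntimeError("CKAN reported an unsuccessful query")
    result = payload["result"]
    print(f"{result['count']} datasets match {query!r}")
    return [pkg["title"] for pkg in result["results"]]


if __name__ == "__main__":
    for title in search_datasets("health"):
        print("-", title)
```

Because CKAN exposes this same action API on any instance that publishes its catalog publicly, the sketch also hints at how Next.Data.gov can index agency catalogs without bespoke integrations.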
Rotating Data Visualizations: Building on the theme of leading with data, even the masthead design for Next.Data.gov is an open-data-powered visualization—for now, it’s a cool U.S. Geological Survey earthquake plot showing the magnitudes of earthquakes measured around the globe over the past week.


This particular visualization was built using D3.js. The visualization will be updated periodically to spotlight different ways open data is used and illustrated….
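The D3.js code behind the masthead isn’t included in the post, but the underlying open data is easy to reach. Below is a rough Python sketch that pulls the same kind of source material, the USGS past-week earthquake feed, and lists magnitudes; the feed URL and GeoJSON field names are assumptions about the public USGS summary format rather than anything documented in the post.

```python
# Rough sketch: fetch the USGS "past week" earthquake feed and list magnitudes.
# The feed URL and field names are assumptions about USGS's GeoJSON summaries.
import requests

FEED_URL = "https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_week.geojson"


def weekly_magnitudes():
    """Return (place, magnitude) pairs for the past week's recorded earthquakes."""
    resp = requests.get(FEED_URL, timeout=30)
    resp.raise_for_status()
    features = resp.json().get("features", [])
    quakes = []
    for feature in features:
        props = feature.get("properties", {})
        if props.get("mag") is not None:
            quakes.append((props.get("place", "unknown"), props["mag"]))
    return quakes


if __name__ == "__main__":
    quakes = weekly_magnitudes()
    print(f"{len(quakes)} earthquakes recorded in the past week")
    for place, mag in sorted(quakes, key=lambda q: q[1], reverse=True)[:10]:
        print(f"M{mag:.1f}  {place}")
```

From there the magnitudes could be binned or charted with any plotting tool; the masthead presumably does something similar client-side in D3.js.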
We encourage you to collaborate in the design process by creating pull requests or providing feedback via Quora or Twitter.”

Metrics for Government Reform


Geoff Mulgan: “How do you measure a programme of government reform? What counts as evidence that it’s working or not? I’ve been asked this question many times, so this very brief note suggests some simple answers – mainly prompted by seeing a few writings on this question which I thought confused some basic points.
Any type of reform programme will combine elements at very different levels. These may include:

  • A new device – for example, adjusting the wording in an official letter or a call centre script to see what impact this has on such things as tax compliance.
  • A new kind of action – for example a new way of teaching maths in schools, treating patients with diabetes, handling prison leavers.
  • A new kind of policy – for example opening up planning processes to more local input; making welfare payments more conditional.
  • A new strategy – for example a scheme to cut carbon in cities, combining retrofitting of housing with promoting bicycle use; or a strategy for public health.
  • A new approach to strategy – for example making more use of foresight, scenarios or big data.
  • A new approach to governance – for example bringing hitherto excluded groups into political debate and decision-making.

This rough list hopefully shows just how different these levels are in their nature. Generally as we go down the list the following things rise:

  • The number of variables and the complexity of the processes involved
  • The timescales over which any judgements can be made
  • The difficulty involved in making judgements about causation
  • The importance of qualitative relative to quantitative assessment”

Internet Association's New Website Lets Users Comment on Bills


Mashable: “The Internet Association, the lobbying conglomerate of big tech companies like Google, Amazon and Facebook, has launched a new website that allows users to comment on proposed bills.
The association unveiled its redesigned website on Monday, and it hopes its new, interactive features will give citizens a way to speak up…
In the “Take Action” section of the website, under “Leave Your Mark,” the association plans to upload bills, declarations and other context documents for netizens to peruse and, most importantly, interact with. After logging in, a user can comment on the bill in general, and even make line edits.”

The Durkheim Project


Co.Labs: “A new project, launched by DARPA and Dartmouth College, is trying something new: data-mining social networks to spot patterns indicating suicidal behavior.
Called The Durkheim Project, after the nineteenth-century sociologist Émile Durkheim, it is asking veterans to offer their Twitter and Facebook authorization keys for an ambitious effort to match social media behavior with indications of suicidal thought. Veterans’ online behavior is then fed into a real-time analytics dashboard which predicts suicide risks and psychological episodes… The Durkheim Project is led by New Hampshire-based Patterns and Predictions, a Dartmouth College spin-off with close ties to academics there…
The Durkheim Project is part of DARPA’s Detection and Computational Analysis of Psychological Signals (DCAPS) project. DCAPS is a larger effort designed to harness predictive analytics for veteran mental health–and not just from social media. According to DARPA’s Russell Shilling’s program introduction, DCAPS is also developing algorithms that can data mine voice communications, daily eating and sleeping patterns, in-person social interactions, facial expressions, and emotional states for signs of suicidal thought. While participants in Durkheim won’t receive mental health assistance directly from the project, their contributions will go a long way toward treating suicidal veterans in the future….
The project launched on July 1; the number of veterans participating is not currently known, but the final number is expected to be around 100,000.”

Why the Share Economy is Important for Disaster Response and Resilience


Patrick Meier at iRevolution: “A unique and detailed survey funded by the Rockefeller Foundation confirms the important role that social and community bonds play vis-à-vis disaster resilience. The new study, which focuses on resilience and social capital in the wake of Hurricane Sandy, reveals how disaster-affected communities self-organized, “with reports of many people sharing access to power, food and water, and providing shelter.” This mutual aid was primarily coordinated face-to-face. This may not always be possible, however. So the “Share Economy” can also play an important role in coordinating self-help during disasters….
In a share economy, “asset owners use digital clearinghouses to capitalize the unused capacity of things they already have, and consumers rent from their peers rather than rent or buy from a company”. During disasters, these asset owners can use the same digital clearinghouses to offer what they have at no cost. For example, over 1,400 kindhearted New Yorkers offered free housing to people heavily affected by the hurricane. They did this using AirBnB, as shown in the short video below. Meanwhile, on the West Coast, the City of San Francisco has just launched a partnership with BayShare, a sharing economy advocacy group in the Bay Area. The partnership’s goal is to “harness the power of sharing to ensure the best response to future disasters in San Francisco.”

https://web.archive.org/web/2000/https://www.youtube.com/watch?v=vIWxAWRq4t0

Open Data Tools: Turning Data into ‘Actionable Intelligence’


Shannon Bohle in SciLogs: “My previous two articles were on open access and open data. They conveyed major changes that are underway around the globe in the methods by which scientific and medical research findings and data sets are circulated among researchers and disseminated to the public. I showed how E-science and ‘big data’ fit into the philosophy of science through a paradigm shift toward a trilogy of approaches: deductive, empirical, and computational, which, it was pointed out, provides a logical extension of Robert Boyle’s tradition of scientific inquiry involving “skepticism, transparency, and reproducibility for independent verification” to the computational age…
This third article on open access and open data evaluates new and suggested tools when it comes to making the most of the open access and open data OSTP mandates. According to an article published in The Harvard Business Review’s “HBR Blog Network,” this is because, as its title suggests, “open data has  little value if people can’t use it.” Indeed, “the goal is for this data to become actionable intelligence: a launchpad for investigation, analysis, triangulation, and improved decision making at all levels.” Librarians and archivists have key roles to play in not only storing data, but packaging it for proper accessibility and use, including adding descriptive metadata and linking to existing tools or designing new ones for their users. Later, in a comment following the article, the author, Craig Hammer, remarks on the importance of archivists and international standards, “Certified archivists have always been important, but their skillset is crucially in demand now, as more and more data are becoming available. Accessibility—in the knowledge management sense—must be on par with digestibility / ‘data literacy’ as priorities for continuing open data ecosystem development. The good news is that several governments and multilaterals (in consultation with data scientists and – yep! – certified archivists) are having continuing ‘shared metadata’ conversations, toward the possible development of harmonized data standards…If these folks get this right, there’s a real shot of (eventual proliferation of) interoperability (i.e. a data platform from Country A can ‘talk to’ a data platform from Country B), which is the only way any of this will make sense at the macro level.”
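Hammer’s point about metadata and harmonized standards is easier to see with a concrete, if toy, example. The sketch below builds the kind of descriptive metadata record an archivist might attach to a dataset and serializes it to JSON; the field names are loosely modeled on common catalog vocabularies and are illustrative assumptions, not a formal standards profile, and the dataset shown is hypothetical.

```python
# Minimal sketch of a descriptive metadata record for a dataset.
# Field names are illustrative (loosely modeled on common catalog vocabularies),
# not a formal standard; the dataset and URL below are hypothetical.
import json
from dataclasses import dataclass, field, asdict


@dataclass
class DatasetRecord:
    title: str
    description: str
    publisher: str
    license: str
    keywords: list = field(default_factory=list)
    distribution_url: str = ""

    def to_json(self):
        """Serialize the record so another catalog can ingest ('talk to') it."""
        return json.dumps(asdict(self), indent=2)


if __name__ == "__main__":
    record = DatasetRecord(
        title="City Pothole Reports",  # hypothetical dataset
        description="Weekly extract of reported road defects.",
        publisher="Example City Open Data Portal",
        license="ODbL-1.0",
        keywords=["roads", "potholes", "311"],
        distribution_url="https://example.org/data/potholes.csv",
    )
    print(record.to_json())
```

If two catalogs agree on even this much shared structure, a data platform from Country A can ingest records from Country B, which is the interoperability Hammer describes as the payoff of shared metadata conversations.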