Open Data and Beyond


Paper by Frederika Welle Donker, Bastiaan van Loenen and Arnold K. Bregt: “In recent years, there has been an increasing trend of releasing public sector information as open data. Governments worldwide see the potential benefits of opening up their data. The potential benefits are more transparency, increased governmental efficiency and effectiveness, and external benefits, including societal and economic benefits. The private sector also recognizes potential benefits of making their datasets available as open data. One such company is Liander, an energy network administrator in the Netherlands. Liander views open data as a contributing factor to energy conservation. However, to date there has been little research done into the actual effects of open data. This research has developed a monitoring framework to assess the effects of open data, and has applied the framework to Liander’s small-scale energy consumption dataset….(More)”

OpenTrials: towards a collaborative open database of all available information on all clinical trials


Paper by Ben Goldacre and Jonathan Gray at BioMed Central: “OpenTrials is a collaborative and open database for all available structured data and documents on all clinical trials, threaded together by individual trial. With a versatile and expandable data schema, it is initially designed to host and match the following documents and data for each trial: registry entries; links, abstracts, or texts of academic journal papers; portions of regulatory documents describing individual trials; structured data on methods and results extracted by systematic reviewers or other researchers; clinical study reports; and additional documents such as blank consent forms, blank case report forms, and protocols. The intention is to create an open, freely re-usable index of all such information and to increase discoverability, facilitate research, identify inconsistent data, enable audits on the availability and completeness of this information, support advocacy for better data and drive up standards around open data in evidence-based medicine. The project has phase I funding. This will allow us to create a practical data schema and populate the database initially through web-scraping, basic record linkage techniques, crowd-sourced curation around selected drug areas, and import of existing sources of structured data and documents. It will also allow us to create user-friendly web interfaces onto the data and conduct user engagement workshops to optimise the database and interface designs. Where other projects have set out to manually and perfectly curate a narrow range of information on a smaller number of trials, we aim to use a broader range of techniques and attempt to match a very large quantity of information on all trials. We are currently seeking feedback and additional sources of structured data….(More)”
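
The idea of threading documents together by individual trial can be sketched in a few lines. What follows is a minimal illustration, not OpenTrials' actual schema or matching logic: the `Document` fields, the `normalize_trial_id` helper, and the sample records are all hypothetical, and matching on a normalized registry identifier is assumed here as one simple form of record linkage.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """One item linked to a trial (hypothetical fields, for illustration only)."""
    kind: str      # e.g. "registry_entry", "journal_paper", "protocol"
    trial_id: str  # registry identifier as it appears in the source
    source: str    # where the document came from


def normalize_trial_id(raw: str) -> str:
    """Uppercase and drop non-alphanumerics so trivially different spellings match."""
    return "".join(ch for ch in raw.upper() if ch.isalnum())


def thread_by_trial(documents):
    """Group documents under their normalized trial identifier."""
    threads = defaultdict(list)
    for doc in documents:
        threads[normalize_trial_id(doc.trial_id)].append(doc)
    return dict(threads)


# Made-up sample records; the identifiers are placeholders, not real trials.
SAMPLE = [
    Document("registry_entry", "NCT00000001", "registry"),
    Document("journal_paper", "nct-00000001", "journal"),
    Document("protocol", "NCT 00000002", "sponsor"),
]

threads = thread_by_trial(SAMPLE)
for trial_id, docs in sorted(threads.items()):
    print(trial_id, [d.kind for d in docs])
```

Exact matching on a cleaned identifier only catches the easy cases. Documents that cite no identifier at all would need fuzzier linkage or the crowd-sourced curation the abstract mentions.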

New Orleans Gamifies the City Budget


Kelsey E. Thomas at Next City: “New Orleanians can try their hand at being “mayor for a day” with a new interactive website released by the Committee for a Better New Orleans Wednesday.

The Big Easy Budget Game uses open data from the city to allow players to create their own version of an operating budget. Players are given a digital $602 million, and have to balance the budget — keeping in mind the government’s responsibilities, previous year’s spending and their personal priorities.

Each department in the game has a minimum funding level (players can’t just quit funding public schools if they feel like it), and restricted funding, such as state or federal dollars, is off limits.

CBNO hopes to attract 600 players this year, and plans to compile the data from each player into a crowdsourced meta-budget called “The People’s Budget.” Next fall, the People’s Budget will be released along with the city’s proposed 2017 budget.

Along with the budgeting game, CBNO released a more detailed website, also using the city’s open data, that breaks down the city’s budgeted versus actual spending from 2007 to now and is filterable. The goal is to allow users without big data experience to easily research funding relevant to their neighborhoods.

Many cities have been releasing interactive websites to make their data more accessible to residents. Checkbook NYC updates more than $70 billion in city expenses daily and breaks them down by transaction. Fiscal Focus Pittsburgh is an online visualization tool that outlines revenues and expenses in the city’s budget….(More)”
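
The game's rules as described above (a fixed total, a minimum funding level per department, and a meta-budget compiled from every player's submission) can be sketched as follows. This is an illustrative sketch only, not CBNO's implementation. The department names and minimum levels are invented, "balanced" is assumed to mean that allocations sum exactly to the total, and a simple per-department average is assumed for the meta-budget. Only the $602 million total comes from the article.

```python
TOTAL_BUDGET = 602_000_000  # total given to players, per the article

# Hypothetical departments and minimum funding levels (not real city figures).
MINIMUMS = {
    "police": 100_000_000,
    "fire": 60_000_000,
    "parks": 10_000_000,
}


def validate_budget(allocation, minimums=MINIMUMS, total=TOTAL_BUDGET):
    """Return a list of rule violations; an empty list means the budget is valid."""
    problems = []
    for dept, minimum in minimums.items():
        if allocation.get(dept, 0) < minimum:
            problems.append(f"{dept} is below its minimum of {minimum}")
    spent = sum(allocation.values())
    if spent != total:
        problems.append(f"not balanced: allocated {spent}, expected {total}")
    return problems


def meta_budget(allocations, minimums=MINIMUMS):
    """Average each department's funding across all valid player budgets."""
    valid = [a for a in allocations if not validate_budget(a, minimums)]
    if not valid:
        return {}
    return {dept: sum(a[dept] for a in valid) / len(valid) for dept in minimums}


# Three made-up player submissions; the third underfunds parks and is excluded.
player_a = {"police": 300_000_000, "fire": 200_000_000, "parks": 102_000_000}
player_b = {"police": 400_000_000, "fire": 100_000_000, "parks": 102_000_000}
player_c = {"police": 500_000_000, "fire": 100_000_000, "parks": 2_000_000}

peoples_budget = meta_budget([player_a, player_b, player_c])
```

Averaging is only one way to combine submissions. A median per department would be less sensitive to extreme players, but the results would then need rescaling to sum to the total again.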

Open data and the API economy: when it makes sense to give away data


 at ZDNet: “Open data is one of those refreshing trends that flows in the opposite direction of the culture of fear that has developed around data security. Instead of putting data under lock and key, surrounded by firewalls and sandboxes, some organizations see value in making data available to all comers — especially developers.

The GovLab.org, a nonprofit advocacy group, published an overview of the benefits governments and organizations are realizing from open data, as well as some of the challenges. The group defines open data as “publicly available data that can be universally and readily accessed, used and redistributed free of charge. It is structured for usability and computability.”…

For enterprises, an open-data stance may be the fuel to build a vibrant ecosystem of developers and business partners. Scott Feinberg, API architect for The New York Times, is one of the people helping to lead the charge to open-data ecosystems. In a recent CXOTalk interview with ZDNet colleague Michael Krigsman, he explains how through the NYT APIs program, developers can sign up for access to 165 years’ worth of content.

But it requires a lot more than simply throwing some APIs out into the market. Establishing such a comprehensive effort across APIs requires a change in mindset that many organizations may not be ready for, Feinberg cautions. “You can’t be stingy,” he says. “You have to just give it out. When we launched our developer portal there’s a lot of questions like, are people going to be stealing our data, questions like that. Just give it away. You don’t have to give it all but don’t be stingy, and you will find that first off not that many people are going to use it at first. You’re going to find that out, but the people who do, you’re going to find those passionate people who are really interested in using your data in new ways.”

Feinberg clarifies that the NYT’s APIs are not giving out articles for free. Rather, he explains, “what we give is everything but article content. You can search for articles. You can find out what’s trending. You can almost do anything you want with our data through our APIs with the exception of actually reading all of the content. It’s really about giving people the opportunity to really interact with your content in ways that you’ve never thought of, and empowering your community to figure out what they want. You know while we don’t give our actual article text away, we give pretty much everything else and people build a lot of really cool stuff on top of that.”

Open data sets, of course, have to be worthy of the APIs that offer them. In his post, Borne outlines the seven qualities open data needs to have to be of value to developers and consumers. (Yes, they’re also “Vs” like big data.)

  1. Validity: It’s “critical to pay attention to these data validity concerns when your organization’s data are exposed to scrutiny and inspection by others,” Borne states.
  2. Value: The data needs to be the font of new ideas, new businesses, and innovations.
  3. Variety: Exposing the wide variety of data available can be “a scary proposition for any data scientist,” Borne observes, but nonetheless is essential.
  4. Voice: Remember that “your open data becomes the voice of your organization to your stakeholders.”
  5. Vocabulary: “The semantics and schema (data models) that describe your data are more critical than ever when you provide the data for others to use,” says Borne. “Search, discovery, and proper reuse of data all require good metadata, descriptions, and data modeling.”
  6. Vulnerability: Accept that open data, because it is so open, will be subjected to “misuse, abuse, manipulation, or alteration.”
  7. proVenance: This is the governance requirement behind open data offerings. “Provenance includes ownership, origin, chain of custody, transformations that have been made to it, processing that has been applied to it (including which versions of processing software were used), the data’s uses and their context, and more,” says Borne….(More)”
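
The provenance elements Borne lists map naturally onto a metadata record published alongside a dataset. The sketch below is one hypothetical way to structure such a record. The class names, field names, and sample values are assumptions made for illustration, not a standard and not anything prescribed in the post.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ProcessingStep:
    """One transformation applied to the data, with the software version used."""
    description: str
    software: str
    version: str


@dataclass
class Provenance:
    """Hypothetical provenance record covering the elements named in the quote."""
    owner: str
    origin: str
    chain_of_custody: list = field(default_factory=list)
    transformations: list = field(default_factory=list)
    uses: list = field(default_factory=list)

    def record_step(self, step: ProcessingStep) -> None:
        """Append a processing step so the history stays in order."""
        self.transformations.append(step)

    def to_json(self) -> str:
        """Serialize the record, including nested steps, for publication."""
        return json.dumps(asdict(self), indent=2)


# Placeholder values only; none of these names refer to real systems.
prov = Provenance(owner="Example Agency", origin="2015 household survey (placeholder)")
prov.chain_of_custody.append("Example Agency data office")
prov.record_step(ProcessingStep("removed duplicate rows", "example-cleaner", "1.2.0"))

record = json.loads(prov.to_json())
```

Keeping each processing step as its own entry, with the software version attached, is what lets a reuser tell which transformations a given copy of the data has already been through.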

When open data is a Trojan Horse: The weaponization of transparency in science and governance


Karen E.C. Levy and David Merritt Johns in Big Data and Society: “Openness and transparency are becoming hallmarks of responsible data practice in science and governance. Concerns about data falsification, erroneous analysis, and misleading presentation of research results have recently strengthened the call for new procedures that ensure public accountability for data-driven decisions. Though we generally count ourselves in favor of increased transparency in data practice, this Commentary highlights a caveat. We suggest that legislative efforts that invoke the language of data transparency can sometimes function as “Trojan Horses” through which other political goals are pursued. Framing these maneuvers in the language of transparency can be strategic, because approaches that emphasize open access to data carry tremendous appeal, particularly in current political and technological contexts. We illustrate our argument through two examples of pro-transparency policy efforts, one historical and one current: industry-backed “sound science” initiatives in the 1990s, and contemporary legislative efforts to open environmental data to public inspection. Rules that exist mainly to impede science-based policy processes weaponize the concept of data transparency. The discussion illustrates that, much as Big Data itself requires critical assessment, the processes and principles that attend it—like transparency—also carry political valence, and, as such, warrant careful analysis….(More)”

Liberating data for public value: The case of Data.gov


Paper by Rashmi Krishnamurthy and Yukika Awazu in the International Journal of Information Management: “Public agencies around the globe are liberating their data. Drawing on a case of Data.gov, we outline the challenges and opportunities that lie ahead for the liberation of public data. Data.gov is an online portal that provides open access to datasets generated by US public agencies and countries around the world in a machine-readable format. By discussing the challenges and opportunities faced by Data.gov, we provide several lessons that can inform research and practice. We suggest that providing access to open data in itself does not spur innovation. Specifically, we claim that public agencies need to spend resources to improve the capacities of their organizations to move toward ‘open data by default’; develop capacities of community to use data to solve problems; and think critically about the unintended consequences of providing access to public data. We also suggest that public agencies need better metrics to evaluate the success of open-data efforts in achieving its goals….(More)”

Open Data Impact: When Demand and Supply Meet


Stefaan Verhulst and Andrew Young at the GovLab: “Today, in “Open Data Impact: When Demand and Supply Meet,” the GovLab and Omidyar Network release key findings about the social, economic, cultural and political impact of open data. The findings are based on 19 detailed case studies of open data projects from around the world. These case studies were prepared in order to address an important shortcoming in our understanding of when, and how, open data works. While there is no shortage of enthusiasm for open data’s potential, nor of conjectural estimates of its hypothetical impact, few rigorous, systematic analyses exist of its concrete, real-world impact…. The 19 case studies that inform this report, all of which can be found at Open Data’s Impact (odimpact.org), a website specially set up for this project, were chosen for their geographic and sectoral representativeness. They seek to go beyond the descriptive (what happened) to the explanatory (why it happened, and what is the wider relevance or impact)….

In order to achieve the potential of open data and scale the impact of the individual projects discussed in our report, we need a better – and more granular – understanding of the enabling conditions that lead to success. We found 4 central conditions (“4Ps”) that play an important role in ensuring success:

Conditions

  • Partnerships: Intermediaries and data collaboratives play an important role in ensuring success, allowing for enhanced matching of supply and demand of data.
  • Public infrastructure: Developing open data as a public infrastructure, open to all, enables wider participation, and a broader impact across issues and sectors.
  • Policies: Clear policies regarding open data, including those promoting regular assessments of open data projects, are also critical for success.
  • Problem definition: Open data initiatives that have a clear target or problem definition have more impact and are more likely to succeed than those with vaguely worded statements of intent or unclear reasons for existence. 

Core Challenges

Finally, the success of a project is also determined by the obstacles and challenges it confronts. Our research uncovered 4 major challenges (“4Rs”) confronting open data initiatives across the globe:

  • Readiness: A lack of readiness or capacity (evident, for example, in low Internet penetration or technical literacy rates) can severely limit the impact of open data.
  • Responsiveness: Open data projects are significantly more likely to be successful when they remain agile and responsive—adapting, for instance, to user feedback or early indications of success and failure.
  • Risks: For all its potential, open data does pose certain risks, notably to privacy and security; a greater, more nuanced understanding of these risks will be necessary to address and mitigate them.
  • Resource Allocation: While open data projects can often be launched cheaply, those projects that receive generous, sustained and committed funding have a better chance of success over the medium and long term.

Toward a Next Generation Open Data Roadmap

The report we release today concludes with ten recommendations for policymakers, advocates, users, funders and other stakeholders in the open data community. For each step, we include a few concrete methods of implementation – ways to translate the broader recommendation into meaningful impact.

Together, these 10 recommendations and their means of implementation amount to what we call a “Next Generation Open Data Roadmap.” This roadmap is just a start, and we plan to continue fleshing it out in the near future. For now, it offers a way forward. It is our hope that this roadmap will help guide future research and experimentation so that we can continue to better understand how the potential of open data can be fulfilled across geographies, sectors and demographics.

Additional Resources

In conjunction with the release of our key findings paper, we also launch today an “Additional Resources” section on the Open Data’s Impact website. The goal of that section is to provide context on our case studies, and to point in the direction of other, complementary research. It includes the following elements:

  • A “repository of repositories,” including other compendiums of open data case studies and sources;
  • A compilation of some popular open data glossaries;
  • A number of open data research publications and reports, with a particular focus on impact;
  • A collection of open data definitions and a matrix of analysis to help assess those definitions….(More)”

Crowdlaw and open data policy: A perfect match?


 at Sunlight: “The open government community has long envisioned a future where all public policy is collaboratively drafted online and in the open — a future in which we (the people) don’t just have a say in who writes and votes on the rules that govern our society, but are empowered in a substantive way to participate, annotating or even crafting those rules ourselves. If that future seems far away, it’s because we’ve seen few successful instances of this approach in the United States. But an increasing amount of open and collaborative online approaches to drafting legislation — a set of practices the NYU GovLab and others have called “crowdlaw” — seem to have found their niche in open data policy.

This trend has taken hold at the local level, where multiple cities have employed crowdlaw techniques to draft or revise the regulations which establish and govern open data initiatives. But what explains this trend and the apparent connection between crowdlaw and the proactive release of government information online? Is it simply that both are “open government” practices? Or is there something more fundamental at play here?…

Since 2012, several high-profile U.S. cities have utilized collaborative tools such as Google Docs, GitHub, and Madison to open up the process of open data policymaking. The below chronology of notable instances of open data policy drafted using crowdlaw techniques gives the distinct impression of a good idea spreading in American cities:….

While many cities may not be ready to take their hands off of the wheel and trust the public to help engage in meaningful decisions about public policy, it’s encouraging to see some giving it a try when it comes to open data policy. Even for cities still feeling skeptical, this approach can be applied internally; it allows other departments impacted by changes that come about through an open data policy to weigh in, too. Cities can open up varying degrees of the process, retaining as much autonomy as they feel comfortable with. In the end, utilizing the crowdlaw process with open data legislation can increase its effectiveness and accountability by engaging the public directly — a win-win for governments and their citizens alike….(More)”

Cities, Data, and Digital Innovation


Paper by Mark Kleinman: “Developments in digital innovation and the availability of large-scale data sets create opportunities for new economic activities and new ways of delivering city services while raising concerns about privacy. This paper defines the terms Big Data, Open Data, Open Government, and Smart Cities and uses two case studies – London (U.K.) and Toronto – to examine questions about using data to drive economic growth, improve the accountability of government to citizens, and offer more digitally enabled services. The paper notes that London has been one of a handful of cities at the forefront of the Open Data movement and has been successful in developing its high-tech sector, although it has so far been less innovative in the use of “smart city” technology to improve services and lower costs. Toronto has also made efforts to harness data, although it is behind London in promoting Open Data. Moreover, although Toronto has many assets that could contribute to innovation and economic growth, including a growing high-technology sector, world-class universities and research base, and its role as a leading financial centre, it lacks a clear narrative about how these assets could be used to promote the city. The paper draws some general conclusions about the links between data innovation and economic growth, and between open data and open government, as well as ways to use big data and technological innovation to ensure greater efficiency in the provision of city services…(More)”

The Opportunity Project: Utilizing Open Data to Build Stronger Ladders of Opportunity for All


White House Factsheet: “In the lead up to the President’s historic visit to SxSW, today the Administration is announcing the launch of “The Opportunity Project,” a new open data effort to improve economic mobility for all Americans. As the President said in his State of the Union address, we must harness 21st century technology and innovation to expand access to opportunity and tackle our greatest challenges.

The Opportunity Project will put data and tools in the hands of civic leaders, community organizations, and families to help them navigate information about critical resources such as access to jobs, housing, transportation, schools, and other neighborhood amenities. This project is about unleashing the power of data to help our children and our children’s children access the resources they need to thrive. Today, the Administration is releasing a unique package of Federal and local datasets in an easy-to-use format and accelerating a new way for the federal government to collaborate with local leaders, technologists, and community members to use data and technology to tackle inequities and strengthen their communities.

Key components of this announcement include:

  • The launch of “The Opportunity Project” and Opportunity.Census.gov to provide easy access to the new package of Opportunity Project data, a combination of Federal and local data, on key assets that determine access to opportunity at the neighborhood level. This data can now be used by technologists, community groups, and local governments in order to help families find affordable housing, help businesses identify services they need, and help policymakers see inequities in their communities and make investments to expand fair housing and increase economic mobility.

  • The release of a dozen new private sector and non-profit digital tools that were built in collaboration with eight cities and using the Opportunity Project data to help families, local leaders, advocates, and the media navigate information about access to jobs, housing, transportation, schools, neighborhood amenities, and other critical resources. Participating cities include Baltimore, Detroit, Kansas City, MO, New Orleans, New York, Philadelphia, San Francisco, and Washington, D.C., as well as organizations and companies such as Redfin, Zillow, GreatSchools, PolicyLink and Streetwyze.

  • More than thirty additional non-profits, community organizations, coding boot camps, academic institutions, and local governments have already committed to use the Opportunity Project data to build stronger ladders of opportunity in communities across the country.

  • The Administration is issuing a Call to Action to the public to develop new tools, offer additional sources of data, deepen community engagement through the use of the data, and other actions. We want to hear about what new steps you are taking or programs you are implementing to address these topics.

This project represents an important continuation of how the Federal government is working with communities and technologists to enhance the power of open data by making it more accessible to a wide variety of users across the country, and by facilitating collaborations between software developers and community members to build digital tools that make it easier for communities and families to solve their greatest challenges….(More)”