Open Data: What Is It and Why Should You Care?


Jason Shueh at Government Technology: “Though the debate about open data in government is an evolving one, it is indisputably here to stay — it can be heard in both houses of Congress, in state legislatures, and in city halls around the nation.
Already, 39 states and 46 localities provide data sets to data.gov, the federal government’s online open data repository. And 30 jurisdictions, including the federal government, have taken the additional step of institutionalizing their practices in formal open data policies.
Though the term “open data” is spoken of frequently — and has been since President Obama took office in 2009 — what it is and why it’s important isn’t always clear. That’s understandable, perhaps, given that open data lacks a unified definition.
“People tend to conflate it with big data,” said Emily Shaw, the national policy manager at the Sunlight Foundation, “and I think it’s useful to think about how it’s different from big data in the sense that open data is the idea that public information should be accessible to the public online.”
Shaw said the foundation, a Washington, D.C., non-profit advocacy group promoting open and transparent government, believes the term open data can be applied to a variety of information created or collected by public entities. Among the benefits of open data are improved measurement of policies, better government efficiency, deeper analytical insights, greater citizen participation, and a boost to local companies by way of products and services that use government data (think civic apps and software programs).
“The way I personally think of open data,” Shaw said, “is that it is a manifestation of the idea of open government.”

What Makes Data Open

For governments hoping to adopt open data in policy and in practice, simply making data available to the public isn’t enough to make that data useful. Open data, though straightforward in principle, requires a specific approach based on the agency or organization releasing it, the kind of data being released and, perhaps most importantly, its targeted audience.
According to the foundation’s California Open Data Handbook, published in collaboration with Stewards of Change Institute, a national group supporting innovation in human services, data must first be both “technically open” and “legally open.” The guide defines the terms in this way:
Technically open: [data] available in a machine-readable standard format, which means it can be retrieved and meaningfully processed by a computer application
Legally open: [data] explicitly licensed in a way that permits commercial and non-commercial use and re-use without restrictions.
Technically open means that data is easily accessible to its intended audience. If the intended users are developers and programmers, Shaw said, the data should be presented within an application programming interface (API); if it’s intended for researchers in academia, data might be structured in a bulk download; and if it’s aimed at the average citizen, data should be available without requiring software purchases.
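As a minimal sketch of what "technically open" and "legally open" might look like in practice, the Python snippet below serializes a small table to CSV and JSON (two common machine-readable formats) and attaches an explicit license to a metadata record. The sample records, the field names, the `describe_dataset` helper and the choice of the CC0-1.0 license identifier are all illustrative assumptions, not taken from the handbook.

```python
import csv
import io
import json

# Hypothetical records; a real agency would export these from its own systems.
RECORDS = [
    {"permit_id": "P-001", "issued": "2014-03-01", "type": "building"},
    {"permit_id": "P-002", "issued": "2014-03-04", "type": "electrical"},
]


def to_csv(records):
    """Serialize a list of dicts as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def to_json(records):
    """Serialize the same records as a JSON array."""
    return json.dumps(records, indent=2)


def describe_dataset(name, records, license_id):
    """Build a small metadata record that states the license explicitly."""
    return {
        "name": name,
        "license": license_id,  # assumed identifier for an open license
        "formats": ["csv", "json"],
        "fields": list(records[0].keys()),
        "row_count": len(records),
    }


csv_text = to_csv(RECORDS)
json_text = to_json(RECORDS)
metadata = describe_dataset("building-permits", RECORDS, "CC0-1.0")

print(csv_text)
print(json.dumps(metadata, indent=2))
```

Offering the same records in more than one format reflects the advice to match the format to the audience: a spreadsheet user can open the CSV directly, while a developer can parse the JSON.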
….

4 Steps to Open Data

Creating open data isn’t without its complexities. There are many tasks that need to happen before an open data project ever begins. A full endorsement from leadership is paramount. Adding the project into the work flow is another. And allaying fears and misunderstandings is expected with any government project.
After the basic table stakes are placed, the handbook prescribes four steps: choosing a set of data, attaching an open license, making it available through a proper format and ensuring the data is discoverable.
1. Choose a Data Set
Choosing a data set can appear daunting, but it doesn’t have to be. Shaw said ample resources are available from the foundation and others on how to get started with this — see our list of open data resources for more information. In the case of selecting a data set, or sets, she referred to the foundation’s recently updated guidelines that urge identifying data sets based on goals and the demand from citizen feedback.
2. Attach an Open License
Open licenses dispel ambiguity and encourage use. However, they need to be proactive, and this means users should not be forced to request the information in order to use it — a common symptom of data accessed through the Freedom of Information Act. Tips for reference can be found at Opendefinition.org, a site that has a list of examples and links to open licenses that meet the definition of open use.
3. Format the Data to Your Audience
As previously stated, Shaw recommends tailoring the format of data to the audience, with the ideal being that data is packaged in formats that can be digested by all users: developers, civic hackers, department staff, researchers and citizens. This could mean it’s put into APIs, spreadsheet docs, text and zip files, FTP servers and torrent networking systems (a way to download files from different sources). The file type and the system for download both depend on the audience.
“Part of learning about what formats government should offer data in is to engage with the prospective users,” Shaw said.
4. Make it Discoverable
If open data is strewn across multiple download links and wedged into various nooks and crannies of a website, it probably won’t be found. Shaw recommends a centralized hub that acts as a one-stop shop for all open data downloads. In many jurisdictions, these Web pages and websites have been called “portals;” they are the online repositories for a jurisdiction’s open data publishing.
“It is important for thinking about how people can become aware of what their governments hold. If the government doesn’t make it easy for people to know what kinds of data is publicly available on the website, it doesn’t matter what format it’s in,” Shaw said. She pointed to public participation — a recurring theme in open data development — to incorporate into the process to improve accessibility.
 
Examples of portals can be found in numerous cities across the U.S., such as San Francisco, New York, Los Angeles, Chicago and Sacramento, Calif.

Browser extension automates citations of online material


Springwise: “Plagiarism is a major concern for colleges today, meaning when it comes to writing a thesis or essay, college students can often spend an inordinate amount of time ensuring their bibliographies are up to scratch, to the detriment of the quality of the actual writing. In the past, services such as ReadCube have made it easier to annotate and search online articles, and now Citelighter automatically generates a citation for any web resource, along with a number of tools to help students organize their research.

The service is a toolbar that sits at the top of the user’s browser while they search for material for their paper. When they’ve found a fact or quote that’s useful, users simply highlight the text and click the Capture button, which saves the clipping to the project they’re working on. Citelighter automatically captures the bibliographic information necessary to create a citation that reaches academic standards, and users can also add their own comments for when they come to use the quote in their essay. Citations can be re-ordered within each project to enable students to plot out a rough version of their paper before sitting down to write…”

“Government Entrepreneur” is Not an Oxymoron


Mitchell Weiss in Harvard Business Review Blog: “Entrepreneurship almost always involves pushing against the status quo to capture opportunities and create value. So it shouldn’t be surprising when a new business model, such as ridesharing, disrupts existing systems and causes friction between entrepreneurs and local government officials, right?
But imagine if the road that led to the Seattle City Council ridesharing hearings this month — with rulings that sharply curtail UberX, Lyft, and Sidecar’s operations there — had been a vastly different one.  Imagine that public leaders had conceived and built a platform to provide this new, shared model of transit.  Or at the very least, that instead of having a revolution of the current transit regime done to Seattle public leaders, it was done with them.  Amidst the acrimony, it seems hard to imagine that public leaders could envision and operate such a platform, or that private innovators could work with them more collaboratively on it — but it’s not impossible. What would it take? Answer: more public entrepreneurs.
The idea of “public entrepreneurship” may sound to you like it belongs on a list of oxymorons right alongside “government intelligence.” But it doesn’t. Public entrepreneurs around the world are improving our lives, inventing entirely new ways to serve the public. They are using sensors to detect potholes; word pedometers to help students learn; harnessing behavioral economics to encourage organ donation; crowdsourcing patent review; and transforming Medellin, Colombia with cable cars. They are coding in civic hackathons and competing in the Bloomberg challenge. They are partnering with an Office of New Urban Mechanics in Boston or in Philadelphia, co-developing products in San Francisco’s Entrepreneurship-in-Residence program, or deploying some of the more than $430 million invested into civic-tech in the last two years.
There is, however, a big problem with public entrepreneurs: there just aren’t enough of them.  Without more public entrepreneurship, it’s hard to imagine meeting our public challenges or making the most of private innovation. One might argue that bungled healthcare website roll-outs or internet spying are evidence of too much activity on the part of public leaders, but I would argue that what they really show is too little entrepreneurial skill and judgment.
The solution to creating more public entrepreneurs is straightforward: train them. But, by and large, we don’t.  Consider Howard Stevenson’s definition of entrepreneurship: “the pursuit of opportunity without regard to resources currently controlled.” We could teach that approach to people heading towards the public sector. But now consider the following list of terms: “acknowledgement of multiple constituencies,” “risk reduction,” “formal planning,” “coordination,” “efficiency measures,” “clearly defined responsibility,” and “organizational culture.” It reads like a list of the kinds of concepts we would want a new public official to know; like it might be drawn from an interview evaluation form or graduate school syllabus.  In fact, it’s from Stevenson’s list of pressures that pull managers away from entrepreneurship and towards administration.  Of course, that’s not all bad. We must have more great public administrators.  But with all our challenges and amidst all the dynamism, we are going to need more than analysts and strategists in the public sector, we need inventors and builders, too.
Public entrepreneurship is not simply innovation in the public sector (though it makes use of innovation), and it’s not just policy reform (though it can help drive reform). Public entrepreneurs build something from nothing with resources — be they financial capital or human talent or new rules — they didn’t command. In Boston, I worked with many amazing public managers and a handful of outstanding public entrepreneurs. Chris Osgood and Nigel Jacob brought the country’s first major-city mobile 311 app to life, and they are public entrepreneurs. They created Citizens Connect in 2009 by bringing together iPhones on loan, a local coder and the most under-tapped resource in the public sector: the public. They transformed the way basic neighborhood issues are reported and responded to (20% of all constituent cases in Boston are reported over smartphones now), and their model is now accessible to 40 towns in Massachusetts and cities across the country. The Mayor’s team in Boston that started up the One Fund in the days after the Marathon bombings were public entrepreneurs. We built the organization from PayPal and a Post Office Box, and it went on to channel $61 million from donors to victims and survivors in just 75 days. It still operates today….
It’s worth noting that public entrepreneurship, perhaps newly buzzworthy, is not actually new. Elinor Ostrom (44 years before her Nobel Prize) observed public entrepreneurs inventing new models in the 1960s. Back when Ronald Reagan was president, Peter Drucker wrote that it was entrepreneurship that would keep public service “flexible and self-renewing.” And almost two decades have passed since David Osborne and Ted Gaebler’s “Reinventing Government” (the then handbook for public officials) carried the promising subtitle: “How the Entrepreneurial Spirit is Transforming the Public Sector”.  Public entrepreneurship, though not nearly as widespread as its private complement, or perhaps as fashionable as its “social” counterpart (focussed on non-profits and their ecosystem), has been around for a while and so have those who practiced it.
But still today, we mostly train future public leaders to be public administrators. We school them in performance management and leave them too inclined to run from risk instead of managing it. And we communicate often, explicitly or not, to private entrepreneurs that government officials are failures and dinosaurs. It’s easy to see how that road led to Seattle this month, but hard to see how it empowers public officials to take on the enormous challenges that still lie ahead of us, or how it enables the public to help them.”

HarassMap: Using Crowdsourced Data to Map Sexual Harassment in Egypt


Chelsea Young in Technology Innovation Management Review: “Through a case study of HarassMap, an advocacy, prevention, and response tool that uses crowdsourced data to map incidents of sexual harassment in Egypt, this article examines the application of crowdsourcing technology to drive innovation in the field of social policy. This article applies a framework that explores the potential, limitations, and future applications of crowdsourcing technology in this sector to reveal how crowdsourcing technology can be applied to overcome cultural and environmental constraints that have traditionally impeded the collection of data. Many of the lessons emerging from this case study hold relevance beyond the field of social policy. Applied to specific problems, this technology can be used to improve the efficiency and effectiveness of mitigation strategies, while facilitating rapid and informed decision making based on “good enough” data. However, this case also illustrates a number of challenges arising from the integrity of crowdsourced data and the potential for ethical conflict when using this data to inform policy formulation.”

The Potential of Crowdsourcing to Improve Patient-Centered Care


Michael Weiner in the Journal The Patient – Patient-Centered Outcomes Research: “Crowdsourcing (CS) is the outsourcing of a problem or task to a crowd. Although patient-centered care (PCC) may aim to be tailored to an individual’s needs, the uses of CS for generating ideas, identifying values, solving problems, facilitating research, and educating an audience represent powerful roles that can shape both allocation of shared resources and delivery of personalized care and treatment. CS can often be conducted quickly and at relatively low cost. Pitfalls include bias, risks of research ethics, inadequate quality of data, inadequate metrics, and observer-expectancy effect. Health professionals and consumers in the US should increase their attention to CS for the benefit of PCC. Patients’ participation in CS to shape health policy and decisions is one way to pursue PCC itself and may help to improve clinical outcomes through a better understanding of patients’ perspectives. CS should especially be used to traverse the quality-cost curve, or decrease costs while preserving or improving quality of care.”

Infomediary Business Models for Connecting Open Data Providers and Users


Paper by Marijn Janssen and Anneke Zuiderwijk in Social Science Computer Review: “Many public organizations are opening their data to the general public and embracing social media in order to stimulate innovation. These developments have resulted in the rise of new, infomediary business models, positioned between open data providers and users. Yet the variation among types of infomediary business models is little understood. The aim of this article is to contribute to the understanding of the diversity of existing infomediary business models that are driven by open data and social media. Cases presenting different modes of open data utilization in the Netherlands are investigated and compared. Six types of business models are identified: single-purpose apps, interactive apps, information aggregators, comparison models, open data repositories, and service platforms. The investigated cases differ in their levels of access to raw data and in how much they stimulate dialogue between different stakeholders involved in open data publication and use. Apps often are easy to use and provide predefined views on data, whereas service platforms provide comprehensive functionality but are more difficult to use. In the various business models, social media is sometimes used for rating and discussion purposes, but it is rarely used for stimulating dialogue or as input to policy making. Hybrid business models were identified in which both public and private organizations contribute to value creation. Distinguishing between different types of open data users was found to be critical in explaining different business models.”

Rethinking Institutions and Organizations


Essay by Royston Greenwood, C.R. Hinings and Dave Whetten in the Journal of Management Studies: “In this essay we argue that institutional scholarship has become overly concerned with explaining institutions and institutional processes, notably at the level of the organization field, rather than with using them to explain and understand organizations. Especially missing is an attempt to gain a coherent, holistic account of how organizations are structured and managed. We also argue that when institutional theory does give attention to organizations it inappropriately treats them as though they are the same, or at least as though any differences are irrelevant for purposes of theory. We propose a return to the study of organizations with an emphasis upon comparative analysis, and suggest the institutional logics perspective as an appropriate means for doing so.”

Ten Innovations to Compete for Global Innovation Award


Making All Voices Count: “The Global Innovation Competition was launched at the Open Government Partnership Summit in November, 2013 and set out to scout the globe for fresh ideas to enhance government accountability and boost citizen engagement. The call was worldwide and in response, nearly 200 innovative ideas were submitted. After a process of public voting and peer review, these have been reduced to ten.
Below, we highlight the innovations that will now compete for a prize of £65,000 plus six months’ mentorship at the Global Innovation Week, March 31 – April 4, 2014, in Kenya.
The first seven emerged from a process of peer review and the following three were selected by the Global Innovation Jury.

An SMS gateway, connected to local hospitals and the web, to channel citizens’ requests for pregnancy services. At risk women, in need of information such as hospital locations and general advice, will receive relevant and targeted updates utilising both an SMS and a GIS-based system.  The aim is to reduce maternal mortality by targeting at risk women in poorer communities in Indonesia.

“One of the causes of high maternal mortality rate in Indonesia is late response in childbirth treatment and lack of pregnancy care information.”

This project, led by a civil servant, aims to engage citizens in Pakistan in service delivery governance. The project aims to enable and motivate citizens to collect, analyze and disseminate service delivery performance data in order to drive performance and help effective decision making.

“BSDU will serve as a model of better management aided by the citizens, for the citizens.”

A Geographic Information System that gives Indonesian citizens access to information regarding government funded projects. The idea is to enable and motivate citizens to compare a project’s information with its real-world implementation and to provide feedback on this. The ultimate aim is to fight corruption in the public sector by making it easier for citizens to monitor, and provide feedback on, government-funded projects.

“On-the-map information about government-funded projects, where citizens are able to submit their opinions, should become a global standard in budget transparency!”

A digital payment system in South Africa that rewards citizens who participate in activities such as waste separation and community gardening. Citizens are able to ‘spend’ rewards on airtime, pre-paid electricity and groceries. By rewarding social volunteers this project aims to boost citizen engagement, build trust and establish the link between government and citizen actors.

“GEM offers a direct channel for communication and rewards between governments and citizens.”

An app created by a team of software developers to provide Ghanaian citizens with information about the oil and gas industry, with the aim of raising awareness of the revenue generated and to spark debate about how this could be used to improve national development.

“The idea is to bring citizens, the oil and gas companies and the government all onto one platform.”

Ghana Petrol Watch seeks to deliver basic facts and figures associated with oil and gas exploration to the average Ghanaian. The solution employs mobile technology to deliver this information. The audience can voice their concerns as comments on the issue via replies to the SMS. These would then be published on the web portal for further exposure and publicity.

“The information on the petroleum industry is publicly available, but not readily accessible and often does not reach the grassroots community in an easily comprehensible manner.”

A common platform to be implemented in Khulna City, Bangladesh, where citizens and elected officials will interact on budget, expenditure and information.

“The concept of citizen engagement for the fulfillment of pre-election commitment is an innovation in establishing governance.”

The aim of this project is an increase in child engagement in governmental budgeting and policy formulation in Mwanza City, Tanzania. This project was selected as a wildcard by the Global Innovation Jury.

“In many projects I have seen, children are always the perceived beneficiaries, rarely do you see innovations where children are active participants in achieving a goal in their society. It was great to see children as active contributors to their own discourse.” – Jury Member, Shikoh Gitau.

A ‘watchdog’ newsletter in Kenya focusing on monitoring the actions of officials with the aim of educating, empowering and motivating citizens to hold their leaders to account. This project was selected as a wildcard by the Global Innovation Jury.

“We endeavor to bridge the information gap in northern Kenya by giving voice to the voiceless and also highlighting their challenges. The aim is an increase in the educational level of the people through information.”

Citizen Desk is an open-source tool that combines the ability of citizens to share eyewitness reports with the public need for verified information in real time. Citizen Desk lets citizen journalists file reports via SMS or social media, with no need for technical training. This project was selected as a wildcard by the Global Innovation Jury.

“It has become evident for some time now that good technical innovation must rest on a strong bedrock of social and political activity, on the ground, deeply in touch with local conditions, and sometimes in the face of power and privilege.” – Jury Member Bright Simons.”

Big data: are we making a big mistake?


Tim Harford in the Financial Times: “Cheerleaders for big data have made four exciting claims, each one reflected in the success of Google Flu Trends: that data analysis produces uncannily accurate results; that every single data point can be captured, making old statistical sampling techniques obsolete; that it is passé to fret about what causes what, because statistical correlation tells us what we need to know; and that scientific or statistical models aren’t needed because, to quote “The End of Theory”, a provocative essay published in Wired in 2008, “with enough data, the numbers speak for themselves”. Unfortunately, these four articles of faith are at best optimistic oversimplifications. At worst, according to David Spiegelhalter, Winton Professor of the Public Understanding of Risk at Cambridge university, they can be “complete bollocks. Absolute nonsense.”…
But big data do not solve the problem that has obsessed statisticians and scientists for centuries: the problem of insight, of inferring what is going on, and figuring out how we might intervene to change a system for the better.
“We have a new resource here,” says Professor David Hand of Imperial College London. “But nobody wants ‘data’. What they want are the answers.”
To use big data to produce such answers will require large strides in statistical methods.
“It’s the wild west right now,” says Patrick Wolfe of UCL. “People who are clever and driven will twist and turn and use every tool to get sense out of these data sets, and that’s cool. But we’re flying a little bit blind at the moment.”
Statisticians are scrambling to develop new methods to seize the opportunity of big data. Such new methods are essential but they will work by building on the old statistical lessons, not by ignoring them.
Recall big data’s four articles of faith. Uncanny accuracy is easy to overrate if we simply ignore false positives, as with Target’s pregnancy predictor. The claim that causation has been “knocked off its pedestal” is fine if we are making predictions in a stable environment but not if the world is changing (as with Flu Trends) or if we ourselves hope to change it. The promise that “N = All”, and therefore that sampling bias does not matter, is simply not true in most cases that count. As for the idea that “with enough data, the numbers speak for themselves” – that seems hopelessly naive in data sets where spurious patterns vastly outnumber genuine discoveries.
“Big data” has arrived, but big insights have not. The challenge now is to solve new problems and gain new answers – without making the same old statistical mistakes on a grander scale than ever.”
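The warning about spurious patterns can be illustrated with a small simulation. The sketch below is not from Harford's article: it generates series of pure random noise and counts how many pairs nonetheless look correlated. The number of series, the series length, the 0.5 cutoff and the seed are arbitrary assumptions chosen only for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def count_spurious(n_series, n_points, threshold, seed):
    """Count pairs of independent noise series whose |r| reaches the threshold."""
    rng = random.Random(seed)
    series = [
        [rng.gauss(0, 1) for _ in range(n_points)] for _ in range(n_series)
    ]
    hits = 0
    pairs = 0
    for i in range(n_series):
        for j in range(i + 1, n_series):
            pairs += 1
            if abs(pearson(series[i], series[j])) >= threshold:
                hits += 1
    return hits, pairs


hits, pairs = count_spurious(n_series=100, n_points=20, threshold=0.5, seed=0)
print(f"{hits} of {pairs} pairs of unrelated series have |r| >= 0.5")
```

Every series here is independent noise, so any pair that clears the cutoff is a false discovery. Because the number of pairs grows roughly with the square of the number of series, adding more variables produces more such apparent patterns even when nothing real is there.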

The GovLab Index: Privacy and Security


Please find below the latest installment in The GovLab Index series, inspired by the Harper’s Index. “The GovLab Index: Privacy and Security” examines the attitudes and concerns of American citizens regarding online privacy. Previous installments include Designing for Behavior Change, The Networked Public, Measuring Impact with Evidence, Open Data, The Data Universe, Participation and Civic Engagement and Trust in Institutions.
Globally

  • Percentage of people who feel the Internet is eroding their personal privacy: 56%
  • Internet users who feel comfortable sharing personal data with an app: 37%
  • Number of users who consider it important to know when an app is gathering information about them: 70%
  • How many people in the online world use privacy tools to disguise their identity or location: 28%, or 415 million people
  • Country with the highest penetration of general anonymity tools among Internet users: Indonesia, where 42% of users surveyed use proxy servers
  • Percentage of China’s online population that disguises their online location to bypass governmental filters: 34%

In the United States
Over the Years

  • In 1996, percentage of the American public who were categorized as having “high privacy concerns”: 25%
    • Those with “medium privacy concerns”: 59%
    • Those who were unconcerned with privacy: 16%
  • In 1998, number of computer users concerned about threats to personal privacy: 87%
  • In 2001, those who reported “medium to high” privacy concerns: 88%
  • Individuals who are unconcerned about privacy: 18% in 1990, down to 10% in 2004
  • How many online American adults are more concerned about their privacy in 2014 than they were a year ago, indicating rising privacy concerns: 64%
  • Number of respondents in 2012 who believe they have control over their personal information: 35%, downward trend for 7 years
  • How many respondents in 2012 continue to perceive privacy and the protection of their personal information as very important or important to the overall trust equation: 78%, upward trend for seven years
  • How many consumers in 2013 trust that their bank is committed to ensuring the privacy of their personal information is protected: 35%, down from 48% in 2004

Privacy Concerns and Beliefs

  • How many Internet users worry about their privacy online: 92%
    • Those who report that their level of concern has increased from 2013 to 2014: 7 in 10
    • How many are at least sometimes worried when shopping online: 93%, up from 89% in 2012
    • Those who have some concerns when banking online: 90%, up from 86% in 2012
  • Number of Internet users who are worried about the amount of personal information about them online: 50%, up from 33% in 2009
    • Those who report that their photograph is available online: 66%
      • Their birthdate: 50%
      • Home address: 30%
      • Cell number: 24%
      • A video: 21%
      • Political affiliation: 20%
  • Consumers who are concerned about companies tracking their activities: 58%
    • Those who are concerned about the government tracking their activities: 38%
  • How many users surveyed felt that the National Security Agency (NSA) overstepped its bounds in light of recent NSA revelations: 44%
  • Respondents who are comfortable with advertisers using their web browsing history to tailor advertisements as long as it is not tied to any other personally identifiable information: 36%, up from 29% in 2012
  • Percentage of voters who do not want political campaigns to tailor their advertisements based on their interests: 86%
  • Percentage of respondents who do not want news tailored to their interests: 56%
  • Percentage of users who are worried that their information will be stolen by hackers: 75%
    • Those who are worried about companies tracking their browsing history for targeted advertising: 54%
  • How many consumers say they do not trust businesses with their personal information online: 54%
  • Top 3 most trusted companies for privacy identified by consumers from across 25 different industries in 2012: American Express, Hewlett Packard and Amazon
    • Most trusted industries for privacy: Healthcare, Consumer Products and Banking
    • Least trusted industries for privacy: Internet and Social Media, Non-Profits and Toys
  • Respondents who admit to sharing their personal information with companies they did not trust in 2012 for reasons such as convenience when making a purchase: 63%
  • Percentage of users who say they prefer free online services supported by targeted ads: 61%
    • Those who prefer paid online services without targeted ads: 33%
  • How many Internet users believe that it is not possible to be completely anonymous online: 59%
    • Those who believe complete online anonymity is still possible: 37%
    • Those who say people should have the ability to use the Internet anonymously: 59%
  • Percentage of Internet users who believe that current laws are not good enough in protecting people’s privacy online: 68%
    • Those who believe current laws provide reasonable protection: 24%

FULL LIST at http://thegovlab.org/the-govlab-index-privacy-and-trust/