How Open Data Policies Unlock Innovation


Tim Cashman at Socrata: “Several trends made the Web 2.0 world we now live in possible. Arguably, the most important of these has been the evolution of online services as extensible technology platforms that enable users, application developers, and other collaborators to create value that extends far beyond the original offering itself.

The Era of ‘Government-as-a-Platform’

The same principles that have shaped the consumer web are now permeating government. Forward-thinking public sector organizations are catching on to the idea that, to stay relevant and vital, governments must go beyond offering a few basic services online. Some have even come to the realization that they are custodians of an enormously valuable resource: the data they collect through their day-to-day operations.  By opening up this data for public consumption online, innovative governments are facilitating the same kind of digital networks that consumer web services have fostered for years.  The era of government as a platform is here, and open data is the catalyst.

The Role of Open Data Policy in Unlocking Innovation in Government

The open data movement continues to transition from an emphasis on transparency to measuring the civic and economic impact of open data programs. As part of this transition, governments are realizing the importance of creating a formal policy to define strategic goals, describe the desired benefits, and provide the scope for data publishing efforts over time.  When well executed, open data policies yield a clear set of benefits. These range from spurring slow-moving bureaucracies into action to procuring the necessary funding to sustain open data initiatives beyond a single elected official’s term.

Four Types of Open Data Policies

There are four main types of policy levers currently in use regarding open data: executive orders, non-binding resolutions, internal regulations, and codified laws. Each of these tools has specific advantages and potential limitations.

Executive Orders

The prime example of an open data executive order in action is President Barack Obama’s Open Data Initiative. While this executive order was short – only four paragraphs on two pages – the real policy magic was a mandate-by-reference that required all U.S. federal agencies to comply with a detailed set of time-bound actions. All of these requirements are publicly viewable on a GitHub repository – a free hosting service for open source software development projects – which is revolutionary in and of itself. Detailed discussions on government transparency took place not in closed-door boardrooms, but online for everyone to see, edit, and improve.

Non-Binding Resolutions

A classic example of a non-binding resolution can be found by doing an online search for the resolution of Palo Alto, California. Short and sweet, this town square-like exercise delivers additional attention to the movement inside and outside of government. The lightweight policy tool also has the benefit of outlasting any particular government official. Still, given the countless resolutions that small towns have passed over the years, resolutions are only as durable as people’s memory.

Internal Regulations

The New York State Handbook on Open Data is a great example of internal regulations put to good use. Originating from the Office of Information Technology Resources, the handbook is a comprehensive, clear, and authoritative guide on how open data is actually supposed to work. Also available on GitHub, the handbook resembles the federal open data project in many ways.

Codified Laws

The archetypal example of open data law comes from San Francisco.
Interestingly, what started as an “Executive Directive” from Mayor Gavin Newsom later turned into legislation and brought with it the power of stronger department mandates and a significant budget. Once enacted, laws are generally hard to revise. However, in the case of San Francisco, the city council has already revised the law twice in four years.
At the federal government level, the Digital Accountability and Transparency Act, or DATA Act, was introduced in both the U.S. House of Representatives (H.R. 2061) and the U.S. Senate (S. 994) in 2013. The act mandates the standardization and publication of a wide variety of the federal government’s financial reports as open data. Although the House voted to pass the DATA Act, it still awaits a vote in the Senate.

The Path to Government-as-a-Platform

Open data policies are an effective way to motivate action and provide clear guidance for open data programs. But they are not a precondition for public-sector organizations to embrace the government-as-a-platform model. In fact, the first step does not involve technology at all. Instead, it involves government leaders realizing that public data belongs to the people. And, it requires the vision to appreciate this data as a shared resource that only increases in value the more widely it is distributed and re-used for analytics, web and mobile apps, and more.
The consumer web has shown the value of open data networks in spades (think Facebook). Now, it’s government’s turn to create the next web.”

The myth of the keyboard warrior: public participation and 38 Degrees


James Dennis in Open Democracy: “A cursory glance at the comment section of the UK’s leading newspapers suggests that democratic engagement is at an all-time low; we are generation apathetic. In their annual health check, the Audit of Political Engagement, the Hansard Society paint a bleak picture of participation trends in Britain. Only 41% of those surveyed are committed to voting in the next General Election. Moreover, less than 1% of the population is a member of a political party. However, 38 Degrees, the political activist movement, bucks these downward trends. In the four years since their foundation in 2009, 38 Degrees have amassed a membership of 1.8 million individuals—more than three times the entire combined memberships of all of Britain’s political parties.

The organisation is not without its critics, however. Earlier this week, during a debate in the House of Commons on the Care Bill, David T. C. Davies MP cast doubt on the authenticity of the organisation’s ethos, “People. Power. Change”, claiming that:

These people purport to be happy-go-lucky students. They are always on first name terms; Ben and Fred and Rebecca and Sarah and the rest of it. The reality is that it is a hard-nosed left-wing Labour-supporting organisation with links to some very wealthy upper middle-class socialists, despite the pretence that it likes to give out.

Likewise, in a comment piece for The Guardian, Oscar Rickett argued that the form of participation cultivated by 38 Degrees is not beneficial to our civic culture as it encourages fragmented, issue-driven collective action in which “small urges are satisfied with the implication that they are bringing about large change”.
However, given the lack of empirical research undertaken on 38 Degrees, such criticisms are often anecdotal or campaign-specific. So here are just a couple of the significant findings emerging from my ongoing research.

New organisations

38 Degrees bears little resemblance to the organisational models that we’ve become accustomed to. Unlike political parties or traditional pressure groups, 38 Degrees operates on a more level playing field. Members are central to the key decisions that are made before and during a campaign and the staff facilitate these choices. Essentially, the organisation acts as a conduit for its membership, removing the layers of elite-level decision-making that characterised political groups in the twentieth century.
38 Degrees seeks to structure grassroots engagement in two ways. Firstly, the group fuses a vast range of qualitative and quantitative data sources from its membership to guide its campaign decisions and strategy. By using digital media, members are able to express their opinion very quickly on an unprecedented scale. One way they do this is through ad-hoc surveys that settle key strategic decisions, such as the survey on whether to campaign against plans by the NHS to compile a database of medical records for potential use by private firms. In just 24 hours the group had a response from 137,000 of its members, with 93 per cent backing plans to organise a mass opt-out.
Secondly, the group offers the platform Campaigns By You, which provides members with the technological opportunities to structure and undertake their own campaigns, retaining complete autonomy over the decision-making process. In both cases, albeit to a differing degree, it is the mass of individual participants that direct the group strategy, with 38 Degrees offering the technological capacity to structure this. 38 Degrees assimilates the fragmented, competing individual voices of its membership, and offers cohesive, collective action.
David Karpf proposes that we consider this phenomenon as characteristic of a new type of organisation. These new organisations challenge our traditional understanding of collective action as they are structurally fluid. 38 Degrees relies on central staff to structure the wants and needs of its membership. However, this doesn’t necessarily lead to a regimented hierarchy. Paolo Gerbaudo describes this as ‘soft leadership’, where the central staff act as choreographers, organising and structuring collective action whilst minimising their encroachment on the will of individual members. …
In conclusion, the successes of 38 Degrees, in terms of mobilising public participation, come down to how the organisation maximises the membership’s sense of efficacy, the feeling that each individual member has, or can have, an impact.
By providing influence over the decision-making process, either explicitly or implicitly, members become more than just cheerleaders observing elites from the sidelines; they are active and involved in the planning and execution of public participation.”

Personal Data for the Public Good


Final report on “New Opportunities to Enrich Understanding of Individual and Population Health” of the health data exploration project: “Individuals are tracking a variety of health-related data via a growing number of wearable devices and smartphone apps. More and more data relevant to health are also being captured passively as people communicate with one another on social networks, shop, work, or do any number of activities that leave “digital footprints.”
Almost all of these forms of “personal health data” (PHD) are outside of the mainstream of traditional health care, public health or health research. Medical, behavioral, social and public health research still largely rely on traditional sources of health data such as those collected in clinical trials, sifting through electronic medical records, or conducting periodic surveys.
Self-tracking data can provide better measures of everyday behavior and lifestyle and can fill in gaps in more traditional clinical data collection, giving us a more complete picture of health. With support from the Robert Wood Johnson Foundation, the Health Data Exploration (HDE) project conducted a study to better understand the barriers to using personal health data in research—from the individuals who track data about their own health, the companies that market self-tracking devices, apps, or services and aggregate and manage that data, and the researchers who might use the data as part of their research.
Perspectives
Through a series of interviews and surveys, we discovered strong interest in contributing and using PHD for research. It should be noted that, because our goal was to access individuals and researchers who are already generating or using digital self-tracking data, there was some bias in our survey findings—participants tended to have more education and higher household incomes than the general population. Our survey also drew slightly more white and Asian participants and more female participants than in the general population.
Individuals were very willing to share their self-tracking data for research, in particular if they knew the data would advance knowledge in the fields related to PHD such as public health, health care, computer science and social and behavioral science. Most expressed an explicit desire to have their information shared anonymously, and we discovered a wide range of thoughts and concerns regarding privacy.
Equally, researchers were generally enthusiastic about the potential for using self-tracking data in their research. Researchers see value in these kinds of data and think these data can answer important research questions. Many consider it to be of equal quality and importance to data from existing high quality clinical or public health data sources.
Companies operating in this space noted that advancing research was a worthy goal but not their primary business concern. Many companies expressed interest in research conducted outside of their company that would validate the utility of their device or application but noted the critical importance of maintaining their customer relationships. A number were open to data sharing with academics but noted the slow pace and administrative burden of working with universities as a challenge.
In addition to this considerable enthusiasm, it seems a new PHD research ecosystem may well be emerging. Forty-six percent of the researchers who participated in the study have already used self-tracking data in their research, and 23 percent of the researchers have already collaborated with application, device, or social media companies.
The Personal Health Data Research Ecosystem
A great deal of experimentation with PHD is taking place. Some individuals are experimenting with personal data stores or sharing their data directly with researchers in a small set of clinical experiments. Some researchers have secured one-off access to unique data sets for analysis. A small number of companies, primarily those with more of a health research focus, are working with others to develop data commons to regularize data sharing with the public and researchers.
SmallStepsLab serves as an intermediary between Fitbit, a data-rich company, and academic researchers via a “preferred status” API held by the company. Researchers pay SmallStepsLab for this access as well as other enhancements that they might want.
These promising early examples foreshadow a much larger set of activities with the potential to transform how research is conducted in medicine, public health and the social and behavioral sciences.
Opportunities and Obstacles
There is still work to be done to enhance the potential to generate knowledge out of personal health data:

  • Privacy and Data Ownership: Among individuals surveyed, the dominant condition (57%) for making their PHD available for research was an assurance of privacy for their data, and over 90% of respondents said that it was important that the data be anonymous. Further, while some didn’t care who owned the data they generate, a clear majority wanted to own or at least share ownership of the data with the company that collected it.
  • Informed Consent: Researchers are concerned about the privacy of PHD as well as respecting the rights of those who provide it. For most of our researchers, this came down to a straightforward question of whether there is informed consent. Our research found that current methods of informed consent are challenged by the ways PHD are being used and reused in research. A variety of new approaches to informed consent are being evaluated and this area is ripe for guidance to assure optimal outcomes for all stakeholders.
  • Data Sharing and Access: Among individuals, there is growing interest in, as well as willingness and opportunity to, share personal health data with others. People now share these data with others with similar medical conditions in online groups like PatientsLikeMe or Crohnology, with the intention to learn as much as possible about mutual health concerns. Looking across our data, we find that individuals’ willingness to share is dependent on what data is shared, how the data will be used, who will have access to the data and when, what regulations and legal protections are in place, and the level of compensation or benefit (both personal and public).
  • Data Quality: Researchers highlighted concerns about the validity of PHD and lack of standardization of devices. While some of this may be addressed as the consumer health device, apps and services market matures, reaching the optimal outcome for researchers might benefit from strategic engagement of important stakeholder groups.

We are reaching a tipping point. More and more people are tracking their health, and there is a growing number of tracking apps and devices on the market with many more in development. There is overwhelming enthusiasm from individuals and researchers to use this data to better understand health. To maximize personal data for the public good, we must develop creative solutions that allow individual rights to be respected while providing access to high-quality and relevant PHD for research, that balance open science with intellectual property, and that enable productive and mutually beneficial collaborations between the private sector and the academic research community.”

Why the wealthiest countries are also the most open with their data


Emily Badger in the Washington Post: “The Oxford Internet Institute this week posted a nice visualization of the state of open data in 70 countries around the world, reflecting the willingness of national governments to release everything from transportation timetables to election results to machine-readable national maps. The tool is based on the Open Knowledge Foundation’s Open Data Index, an admittedly incomplete but telling assessment of who willingly publishes updated, accurate national information on, say, pollutants (Sweden) and who does not (ahem, South Africa).

Oxford Internet Institute
Tally up the open data scores for these 70 countries, and the picture looks like this, per the Oxford Internet Institute:
…With apologies for the tiny, tiny type (and the fact that many countries aren’t listed here at all), a couple of broad trends are apparent. For one, there’s a prominent global “openness divide,” in the words of the Oxford Internet Institute. The high scores mostly come from Europe and North America, the low scores from Asia, Africa and Latin America. Wealth is strongly correlated with “openness” by this measure, whether we look at World Bank income groups or Gross National Income per capita. By the OII’s calculation, wealth accounts for about a third of the variation in these Open Data Index scores.
Perhaps this is an obvious correlation, but the reasons why open data looks like the luxury of rich economies are many, and they point to the reality that poor countries face a lot more obstacles to openness than do places like the United States. For one thing, openness is also closely correlated with Internet penetration. Why open your census results if people don’t have ways to access them (or means to demand them)? It’s no easy task to do this, either.”
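The OII’s claim that wealth “accounts for about a third of the variation” corresponds to an R² from a simple regression of openness scores on income. A minimal sketch of that calculation, using invented country figures in place of the real Open Data Index data:

```python
# R^2 of a simple linear fit of openness score on income.
# All country figures below are invented for illustration only.
import statistics

def r_squared(x, y):
    """R^2 of the least-squares line y ~ a + b*x (squared Pearson correlation)."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy * sxy / (sxx * syy)

gni = [55, 48, 12, 6, 30, 3]               # GNI per capita, $ thousands (invented)
openness = [700, 400, 500, 200, 350, 300]  # Open Data Index scores (invented)
print(round(r_squared(gni, openness), 2))  # share of variance explained by income
```

With the real index scores and income data substituted in, the same one-number summary is what underlies the “about a third” figure.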

The Next Frontier in Crowdsourcing: Your Smartphone


Rachel Metz in MIT TechnologyReview: “Rather than swiping the screen or entering a passcode to unlock the smartphone in my hand, I have to tell it how energetic the people around me are feeling by tapping one of four icons. I’m the only one here, and the one that best fits my actual energy level, to be honest, is a figure lying down and emitting a trail of z’s.
I’m trying out an Android app called Twitch. Created by Stanford researchers, it asks you to complete a few simple tasks each time you unlock your phone—contributing information, as with the reported energy levels, or ranking images, or structuring data extracted from Wikipedia pages. The information collected by apps like Twitch could be useful to academics, market researchers, or local businesses. Such software could also provide a low-cost way to perform useful work that can easily be broken up into pieces and fed to millions of devices.

Twitch is one of several projects exploring crowdsourcing via the lock screen. Plenty of people already contribute freely to crowdsourcing websites like Wikipedia and Quora or paid services like Amazon’s Mechanical Turk, and the sustained popularity of traffic app Waze shows that people are willing to contribute to a common cause from their handsets if it provides a timely, helpful result.
There are certainly enough smartphones with lock screens ready to be harnessed. According to data from market researcher comScore, 160 million people in the U.S.—or 67 percent of cell phone users—have smartphones, and nearly 52 percent of these run Google’s Android OS, which allows apps like Twitch to replace the standard lock screen….”
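A back-of-envelope pass over the comScore figures quoted above gives the addressable base for an Android lock-screen app such as Twitch (the multiplication is illustrative; comScore’s underlying counts may differ slightly):

```python
# Addressable lock screens implied by the comScore figures cited above.
us_smartphone_users = 160_000_000   # "160 million people in the U.S."
android_share = 0.52                # "nearly 52 percent ... run Android"

android_users = round(us_smartphone_users * android_share)
print(f"{android_users:,} Android lock screens")  # roughly 83 million
```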

Computational Social Science: Exciting Progress and Future Directions


Duncan Watts in The Bridge: “The past 15 years have witnessed a remarkable increase in both the scale and scope of social and behavioral data available to researchers. Over the same period, and driven by the same explosion in data, the study of social phenomena has increasingly become the province of computer scientists, physicists, and other “hard” scientists. Papers on social networks and related topics appear routinely in top science journals and computer science conferences; network science research centers and institutes are sprouting up at top universities; and funding agencies from DARPA to NSF have moved quickly to embrace what is being called computational social science.
Against these exciting developments stands a stubborn fact: in spite of many thousands of published papers, there’s been surprisingly little progress on the “big” questions that motivated the field of computational social science—questions concerning systemic risk in financial systems, problem solving in complex organizations, and the dynamics of epidemics or social movements, among others.
Of the many reasons for this state of affairs, I concentrate here on three. First, social science problems are almost always more difficult than they seem. Second, the data required to address many problems of interest to social scientists remain difficult to assemble. And third, thorough exploration of complex social problems often requires the complementary application of multiple research traditions—statistical modeling and simulation, social and economic theory, lab experiments, surveys, ethnographic fieldwork, historical or archival research, and practical experience—many of which will be unfamiliar to any one researcher. In addition to explaining the particulars of these challenges, I sketch out some ideas for addressing them….”

The Parable of Google Flu: Traps in Big Data Analysis


David Lazer: “…big data last winter had its “Dewey beats Truman” moment, when the poster child of big data (at least for behavioral data), Google Flu Trends (GFT), went way off the rails in “nowcasting” the flu–overshooting the peak last winter by 130% (and indeed, it has been systematically overshooting by wide margins for 3 years). Tomorrow we (Ryan Kennedy, Alessandro Vespignani, and Gary King) have a paper out in Science dissecting why GFT went off the rails, how that could have been prevented, and the broader lessons to be learned regarding big data.
[We have also posted the version we submitted before acceptance, The Parable of Google Flu (WP-Final).pdf, as well as an SSRN paper evaluating GFT for 2013-14, since it was reworked in the Fall.] Key lessons that I’d highlight:
1) Big data are typically not scientifically calibrated. This goes back to my post last month regarding measurement. This does not make them useless from a scientific point of view, but you do need to build into the analysis that the “measures” of behavior are being affected by unseen things. In this case, the likely culprit was the Google search algorithm, which was modified in various ways that we believe likely to have increased flu related searches.
2) Big data + analytic code used in scientific venues with scientific claims need to be more transparent. This is a tricky issue, because there are both legitimate proprietary interests involved and privacy concerns, but much more can be done in this regard than has been done in the 3 GFT papers. [One of my aspirations over the next year is to work together with big data companies, researchers, and privacy advocates to figure out how this can be done.]
3) It’s about the questions, not the size of the data. In this particular case, one could have done a better job stating the likely flu prevalence today by ignoring GFT altogether and just projecting 3-week-old CDC data forward to today (better still would have been to combine the two). That is, a synthesis would have been more effective than a pure “big data” approach. I think this is likely the general pattern.
4) More generally, I’d note that there is much more that the academy needs to do. First, the academy needs to build the foundation for collaborations around big data (e.g., secure infrastructures, legal understandings around data sharing, etc). Second, there needs to be MUCH more work done to build bridges between the computer scientists who work on big data and social scientists who think about deriving insights about human behavior from data more generally. We have moved perhaps 5% of the way that we need to in this regard.”

How Maps Drive Decisions at EPA


Joseph Marks at NextGov: “The Environmental Protection Agency office charged with taking civil and criminal actions against water and air polluters used to organize its enforcement targeting meetings and conference calls around spreadsheets and graphs.

The USA National Wetlands Inventory is one of the interactive maps produced by the Geoplatform.gov tool.

Those spreadsheets detailed places with large oil and gas production and other possible pollutants where EPA might want to focus its own inspection efforts or reach out to state-level enforcement agencies.
During the past two years, the agency has largely replaced those spreadsheets and tables with digital maps, which make it easier for participants to visualize precisely where the top polluting areas are and how those areas correspond to population centers, said Harvey Simon, EPA’s geospatial information officer. That, in turn, has made it easier for the agency to focus inspections and enforcement efforts where they will do the most good.
“Rather than verbally going through tables and spreadsheets you have a lot of people who are not [geographic information systems] practitioners who are able to share map information,” Simon said. “That’s allowed them to take a more targeted and data-driven approach to deciding what to do where.”
The change is a result of the EPA Geoplatform, a tool built off Esri’s ArcGIS Online product, which allows companies and government agencies to build custom Web maps using base maps provided by Esri mashed up with their own data.
When the EPA Geoplatform launched in May 2012 there were about 250 people registered to create and share mapping data within the agency. That number has grown to more than 1,000 during the past 20 months, Simon said.
“The whole idea of the platform effort is to democratize the use of geospatial information within the agency,” he said. “It’s relatively simple now to make a Web map and mash up data that’s useful for your work, so many users are creating Web maps themselves without any support from a consultant or from a GIS expert in their office.”
A governmentwide Geoplatform launched in 2012, spurred largely by agencies’ frustrations with the difficulty of sharing mapping data after the 2010 explosion of the Deepwater Horizon oil rig in the Gulf of Mexico. The platform’s goal was twofold. First, officials wanted to share mapping data more widely among agencies, both to avoid duplicating each other’s work and to exchange data more easily during an emergency.
Second, the government wanted to simplify the process for viewing and creating Web maps so they could be used more easily by nonspecialists.
EPA’s geoplatform has essentially the same goals. The majority of the maps the agency builds using the platform aren’t  publicly accessible so the EPA doesn’t have to worry about scrubbing maps of data that could reveal personal information about citizens or proprietary data about companies. It publishes some maps that don’t pose any privacy concerns on EPA websites as well as on the national geoplatform and to Data.gov, the government data repository.
Once ArcGIS Online is judged compliant with the Federal Information Security Management Act, or FISMA, which is expected this month, EPA will be able to share significantly more nonpublic maps through the national geoplatform and rely on more maps produced by other agencies, Simon said.
EPA’s geoplatform has also made it easier for the agency’s environmental justice office to share common data….”

Participatory Budgeting Platform


Hollie Gilman:  “Stanford’s Social Algorithms Lab (SOAL) has built an interactive Participatory Budgeting Platform that allows users to simulate budgetary decision-making on $1 million of public monies.  The center brings together economics, computer science, and networking to work on problems and understand the impact of social networking.   This project is part of Stanford’s Widescope Project to enable people to make political decisions on budgets through data-driven social networks.
The Participatory Budgeting simulation highlights the fourth annual participatory budgeting process in Chicago’s 49th Ward — the first place to implement PB in the U.S.  This year, $1 million of the $1.3 million in aldermanic capital funds will be allocated through participatory budgeting.
One goal of the platform is to build consensus. The interactive geo-spatial mapping software enables citizens to more intuitively identify projects in a given area.  Importantly, the platform forces users to make tough choices and balance competing priorities in real time.
The platform is an interesting example of a collaborative governance prototype that could be transformative in its ability to engage citizens with easily accessible mapping software.”
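The “tough choices” dynamic described above can be sketched as a simple budget-constrained selection rule. Project names, costs, and vote counts below are invented, and real PB ballots use richer rules than this greedy pass:

```python
# Toy participatory-budgeting tally: fund the most-voted projects until
# the $1,000,000 cap is exhausted. All projects and figures are invented.

BUDGET = 1_000_000

projects = [                        # (name, cost in $, community votes)
    ("Sidewalk repairs",  300_000, 840),
    ("Park lighting",     250_000, 910),
    ("Bike lanes",        450_000, 620),
    ("Community garden",  120_000, 700),
    ("Mural restoration",  90_000, 480),
]

def allocate(projects, budget):
    """Fund projects in descending vote order, skipping any that no longer fit."""
    funded, remaining = [], budget
    for name, cost, votes in sorted(projects, key=lambda p: -p[2]):
        if cost <= remaining:
            funded.append(name)
            remaining -= cost
    return funded, remaining

funded, leftover = allocate(projects, BUDGET)
print(funded, leftover)
```

Note the trade-off this surfaces: the bike lanes out-poll the mural but are skipped because they no longer fit the remaining budget, which is exactly the kind of competing-priority choice the platform asks users to confront in real time.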

Open Data is a Civil Right


Yo Yoshida, Founder & CEO, Appallicious in GovTech: “As Americans, we expect a certain standardization of basic services, infrastructure and laws — no matter where we call home. When you live in Seattle and take a business trip to New York, the electric outlet in the hotel you’re staying in is always compatible with your computer charger. When you drive from San Francisco to Los Angeles, I-5 doesn’t all-of-a-sudden turn into a dirt country road because some cities won’t cover maintenance costs. If you take a 10-minute bus ride from Boston to the city of Cambridge, you know the money in your wallet is still considered legal tender.

But what if these expectations of consistency were not always a given? What if cities, counties and states had absolutely zero coordination when it came to basic services? This is what it is like for us in the open data movement. There are so many important applications and products that have been built by civic startups and concerned citizens. However, all too often these efforts are confined to city limits, and unavailable to anyone outside of them. It’s time to start reimagining the way cities function and how local governments operate. There is a wealth of information housed in local governments that should be public by default to help fuel a new wave of civic participation.
Appallicious’ Neighborhood Score provides an overall health and sustainability score, block-by-block, for every neighborhood in the city of San Francisco. It is the first time metrics have been applied to neighborhoods, so we can judge how government allocates our resources and better plan how to move forward. But if you’re thinking about moving to Oakland, just a subway stop away from San Francisco, and want to see the score for a neighborhood, our app can’t help you, because that city has yet to release the data sets we need.
In Contra Costa County, there is the lifesaving PulsePoint app, which notifies smartphone users who are trained in CPR when someone nearby may be in need of help. This is an amazing app—for residents of Contra Costa County. But if someone in neighboring Alameda County needs CPR, the app, unfortunately, is completely useless.
Buildingeye visualizes planning and building permit data to allow users to see what projects are being proposed in their area or city. However, Buildingeye is only available in a handful of places, simply because most cities have yet to make permits publicly available. Think about what this could do for the construction sector — an industry that provides millions of jobs for Americans. Buildingeye also gives concerned citizens access to public documents like never before, so they can see what might be built in their cities or on their streets.
Along with other open data advocates, I have been going from city-to-city, county-to-county and state-to-state, trying to get governments and departments to open up their massive amounts of valuable data. Each time one city, or one county, agrees to make their data publicly accessible, I can’t help but think it’s only a drop in the bucket. We need to think bigger.
Every government, every agency and every department in the country that has already released this information to the public is a case study that points to the success of open data — and why every public entity should follow their lead. There needs to be a national mandate requiring that all government data be open and accessible to the public.
Last May, President Obama issued an executive order requiring that going forward, any data generated by the federal government must be made available to the public in open, machine-readable formats. In the executive order, Obama stated that, “openness in government strengthens our democracy, promotes the delivery of efficient and effective services to the public, and contributes to economic growth.”
If this is truly the case, Washington has an obligation to compel local and state governments to release their data as well. Many have tried to spur this effort. California Lt. Gov. Gavin Newsom created the Citizenville Challenge to speed up adoption on the local level. The U.S. Conference of Mayors has also been vocal in promoting open data efforts. But none of these initiatives could have the same effect as a federal mandate.
What I am proposing is no small feat, and it won’t happen overnight. But there should be a concerted effort by those in the technology industry, specifically civic startups, to call on Congress to draft legislation that would require every city in the country to make its data open, free and machine-readable. Passing federal legislation will not be an easy task — but creating a “universal open data” law is possible. It would require little to no funding, and it is completely nonpartisan. It’s actually not a political issue at all; it is, for lack of a better word, an administrative issue.
Often good legislation is blocked because lawmakers and citizens are concerned about project funding. While there should be support to help cities and towns open their data, much of the time they don’t need it. In 2009, the city and county of San Francisco opened up its data with zero dollars. Many other cities have done the same. There will be cities and municipalities that will need financial assistance to accomplish this. But it is worth it, and it will not require a significant investment for a substantial return. There are free open data portals, like CKAN, DKAN and a new effort from Accela, CivicData.com, to centralize open data efforts.
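Part of what makes portals like CKAN useful is that they expose their catalogs through a simple JSON API, so every dataset published this way is machine-readable by default. A minimal sketch of consuming a CKAN-style `package_list` response — the sample payload and dataset names below are illustrative, not from any real portal:

```python
import json

# A CKAN-style response to the `package_list` action, which returns the
# identifiers of every dataset in a portal's catalog. This payload is an
# illustrative sample, not pulled from a live portal.
sample_response = json.dumps({
    "success": True,
    "result": ["building-permits", "crime-incidents", "farmers-markets"],
})


def list_datasets(raw: str) -> list:
    """Parse a CKAN-style package_list response into dataset identifiers."""
    body = json.loads(raw)
    if not body.get("success"):
        raise RuntimeError("portal API call failed")
    return body["result"]


print(list_datasets(sample_response))
```

In a live deployment the same JSON would come from an HTTP GET against the portal's `/api/3/action/package_list` endpoint; the point is that one small parser works against any city that publishes through the same standard.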
When the UK Government recently announced a £1.5 million investment to support open data initiatives, its Cabinet Office Minister said, “We know that it creates a more accountable, efficient and effective government. Open Data is a raw material for economic growth, supporting the creation of new markets, business and jobs and helping us compete in the global race.”
We should not fall behind these efforts. There is too much at stake for our citizens, not to mention our economy. A recent McKinsey report found that open data has the potential to create $3 trillion in value worldwide.
Former Speaker Tip O’Neill famously said, “all politics is local.” But we in the civic startup space believe all data is local. Data is reporting potholes in your neighborhood and identifying high-crime areas in your communities. It’s seeing how many farmers’ markets there are in your town compared to liquor stores. Data helps predict which areas of a city are most at risk during a heat wave and other natural disasters. A federal open data law would give us the raw material needed to create tools to improve the lives of all Americans, not just those who are lucky enough to live in a city that has released this information on its own.
It’s a different way of thinking about how a government operates and the relationship it has with its citizens. Open data gives the public an amazing opportunity to be more involved with governmental decisions. We can increase accountability and transparency, but most importantly we can revolutionize the way local residents communicate and work with their government.
Access to this data is a civil right. If this is truly a government by, of, and for the people, then its data needs to be available to all of us. By opening up this wealth of information, we will design a better government that takes advantage of the technology and skills of civic startups and innovative citizens….”