Why the wealthiest countries are also the most open with their data


Emily Badger in the Washington Post: “The Oxford Internet Institute this week posted a nice visualization of the state of open data in 70 countries around the world, reflecting the willingness of national governments to release everything from transportation timetables to election results to machine-readable national maps. The tool is based on the Open Knowledge Foundation’s Open Data Index, an admittedly incomplete but telling assessment of who willingly publishes updated, accurate national information on, say, pollutants (Sweden) and who does not (ahem, South Africa).

Tally up the open data scores for these 70 countries, and the picture looks like this, per the Oxford Internet Institute (click on the picture to link through to the larger interactive version):
…With apologies for the tiny, tiny type (and the fact that many countries aren’t listed here at all), a couple of broad trends are apparent. For one, there’s a prominent global “openness divide,” in the words of the Oxford Internet Institute. The high scores mostly come from Europe and North America, the low scores from Asia, Africa and Latin America. Wealth is strongly correlated with “openness” by this measure, whether we look at World Bank income groups or Gross National Income per capita. By the OII’s calculation, wealth accounts for about a third of the variation in these Open Data Index scores.
Perhaps this is an obvious correlation, but the reasons why open data looks like the luxury of rich economies are many, and they point to the reality that poor countries face a lot more obstacles to openness than do places like the United States. For one thing, openness is also closely correlated with Internet penetration. Why open your census results if people don’t have ways to access them (or means to demand them)? It’s no easy task to do this, either.”
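To make "accounts for about a third of the variation" concrete, here is a minimal sketch of how a share of explained variance can be computed. It assumes the figure is an R-squared from a straight-line fit, which the excerpt does not confirm. The function name `r_squared` and all numbers in the two lists are hypothetical, invented for illustration, and are not OII or World Bank data.

```python
from math import sqrt


def r_squared(xs, ys):
    """Share of the variance in ys explained by a straight-line fit on xs.

    For a one-variable linear fit this equals the squared Pearson correlation.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / sqrt(sxx * syy)
    return r * r


# Hypothetical values, made up for this sketch only.
log_income = [6.5, 7.0, 7.8, 8.4, 9.1, 9.9, 10.4, 10.8]
open_score = [300, 520, 250, 610, 400, 480, 850, 560]

share = r_squared(log_income, open_score)

# Under the same assumption, an explained share of one third
# corresponds to a correlation of roughly 0.58 in absolute value.
implied_r = sqrt(1 / 3)
```

Read this way, "about a third" means that wealth is a meaningful predictor of openness while leaving roughly two thirds of the variation to other factors.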

Open Data is a Civil Right


Yo Yoshida, Founder & CEO, Appallicious in GovTech: “As Americans, we expect a certain standardization of basic services, infrastructure and laws — no matter where we call home. When you live in Seattle and take a business trip to New York, the electric outlet in the hotel you’re staying in is always compatible with your computer charger. When you drive from San Francisco to Los Angeles, I-5 doesn’t all-of-a-sudden turn into a dirt country road because some cities won’t cover maintenance costs. If you take a 10-minute bus ride from Boston to the city of Cambridge, you know the money in your wallet is still considered legal tender.

But what if these expectations of consistency were not always a given? What if cities, counties and states had absolutely zero coordination when it came to basic services? This is what it is like for us in the open data movement. There are so many important applications and products that have been built by civic startups and concerned citizens. However, all too often these efforts are confined to city limits, and unavailable to anyone outside of them. It’s time to start reimagining the way cities function and how local governments operate. There is a wealth of information housed in local governments that should be public by default to help fuel a new wave of civic participation.
Appallicious’ Neighborhood Score provides an overall health and sustainability score, block-by-block for every neighborhood in the city of San Francisco. This is the first time metrics have been applied to neighborhoods so we can judge how government allocates our resources and better plan how to move forward. But, if you’re thinking about moving to Oakland, just a subway stop away from San Francisco, and want to see the score for a neighborhood, our app can’t help you, because that city has yet to release the data sets we need.
In Contra Costa County, there is the lifesaving PulsePoint app, which notifies smartphone users who are trained in CPR when someone nearby may be in need of help. This is an amazing app—for residents of Contra Costa County. But if someone in neighboring Alameda County needs CPR, the app, unfortunately, is completely useless.
Buildingeye visualizes planning and building permit data to allow users to see what projects are being proposed in their area or city. However, buildingeye is only available in a handful of places, simply because most cities have yet to make permits publicly available. Think about what this could do for the construction sector — an industry that has millions of jobs for Americans. Buildingeye also gives concerned citizens access to public documents like never before, so they can see what might be built in their cities or on their streets.
Along with other open data advocates, I have been going from city-to-city, county-to-county and state-to-state, trying to get governments and departments to open up their massive amounts of valuable data. Each time one city, or one county, agrees to make their data publicly accessible, I can’t help but think it’s only a drop in the bucket. We need to think bigger.
Every government, every agency and every department in the country that has already released this information to the public is a case study that points to the success of open data — and why every public entity should follow their lead. There needs to be a national referendum that instructs that all government data should be open and accessible to the public.
Last May, President Obama issued an executive order requiring that going forward, any data generated by the federal government must be made available to the public in open, machine-readable formats. In the executive order, Obama stated that, “openness in government strengthens our democracy, promotes the delivery of efficient and effective services to the public, and contributes to economic growth.”
If this is truly the case, Washington has an obligation to compel local and state governments to release their data as well. Many have tried to spur this effort. California Lt. Gov. Gavin Newsom created the Citizenville Challenge to speed up adoption on the local level. The U.S. Conference of Mayors has also been vocal in promoting open data efforts. But none of these initiatives could have the same effect as a federal mandate.
What I am proposing is no small feat, and it won’t happen overnight. But there should be a concerted effort by those in the technology industry, specifically civic startups, to call on Congress to draft legislation that would require every city in the country to make their data open, free and machine readable. Passing federal legislation will not be an easy task, but creating a “universal open data” law is possible. It would require little to no funding, and it is completely nonpartisan. It’s actually not a political issue at all; it is, for lack of a better word, an administrative issue.
Often good legislation is blocked because lawmakers and citizens are concerned about project funding. While there should be support to help cities and towns achieve the capability of opening their data, a lot of the time, they don’t need it. In 2009, the city and county of San Francisco opened up its data with zero dollars. Many other cities have done the same. There will be cities and municipalities that will need financial assistance to accomplish this. But it is worth it, and it will not require a significant investment for a substantial return. There are free online open data portals, like CKAN, DKAN and a new effort from Accela, CivicData.com, to centralize open data efforts.
When the UK Government recently announced a £1.5 million investment to support open data initiatives, its Cabinet Office Minister said, “We know that it creates a more accountable, efficient and effective government. Open Data is a raw material for economic growth, supporting the creation of new markets, business and jobs and helping us compete in the global race.”
We should not fall behind these efforts. There is too much at stake for our citizens, not to mention our economy. A recent McKinsey report found that open data has the potential to create $3 trillion in value worldwide.
Former Speaker Tip O’Neill famously said, “all politics is local.” But we in the civic startup space believe all data is local. Data is reporting potholes in your neighborhood and identifying high crime areas in your communities. It’s seeing how many farmers’ markets there are in your town compared to liquor stores. Data helps predict which areas of a city are most at risk during a heat wave and other natural disasters. A federal open data law would give the raw material needed to create tools to improve the lives of all Americans, not just those who are lucky enough to live in a city that has released this information on its own.
It’s a different way of thinking about how a government operates and the relationship it has with its citizens. Open data gives the public an amazing opportunity to be more involved with governmental decisions. We can increase accountability and transparency, but most importantly we can revolutionize the way local residents communicate and work with their government.
Access to this data is a civil right. If this is truly a government by, of and for the people, then its data needs to be available to all of us. By opening up this wealth of information, we will design a better government that takes advantage of the technology and skills of civic startups and innovative citizens….”

Procurement and Civic Innovation


Derek Eder: “Have you ever used a government website and had a not-so-awesome experience? In our slick 2014 world of Google, Twitter and Facebook, why does government tech feel like it’s stuck in the 1990s?
The culprit: bad technology procurement.
Procurement is the procedure a government follows to buy something: letting suppliers know what they want, asking for proposals, restricting what kinds of proposal they will consider, limiting what kinds of firms they will do business with, and deciding if they got what they paid for.
The City of Chicago buys technology about the same way that they buy health insurance, a bridge, or anything else in between. And that’s the problem.
Chicago’s government has a long history of corruption, nepotism and patronage. After each outrage, new rules are piled upon existing rules to prevent that crisis from happening again. Unfortunately, this accumulation of rules does not just protect against the bad guys, it also forms a huge barrier to entry for technology innovators.
So, the firms that end up building our city’s digital public services tend to be good at picking their way through the barriers of the procurement process, not at building good technology. Instead of making government tech contracting fair and competitive, procurement has unfortunately had the opposite effect.
So where does this leave us? Despite Chicago’s flourishing startup scene, and despite having one of the country’s largest communities of civic technologists, the Windy City’s digital public services are still terribly designed and far too expensive to the taxpayer.

The Technology Gap

The best way to see the gap between Chicago’s volunteer civic tech community and the technology that the City pays for is to look at an entire class of civic apps that are essentially facelifts on existing government websites….
You may have noticed an increase in quality and usability between these three civic apps and their official government counterparts.
Now consider this: all of the government sites took months to build and cost hundreds of thousands of dollars. Was My Car Towed, 2nd City Zoning and CrimeAround.us were all built by one to two people in a matter of days, for no money.
Think about that for a second. Consider how much the City is overpaying for websites its citizens can barely use. And imagine how much better our digital city services would be if the City worked with the very same tech startups they’re trying to nurture.
Why do these civic apps exist? Well, with the City of Chicago releasing hundreds of high quality datasets on their data portal over the past three years (for which they should be commended), a group of highly passionate and skilled technologists have started using their skills to develop these apps and many others.
It’s mostly for fun, learning, and a sense of civic duty, but it demonstrates there’s no shortage of highly skilled developers who are interested in using technology to make their city a better place to live in…
Two years ago, in the Fall of 2011, I learned about procurement in Chicago for the first time. An awesome group of developers, designers and I had just built ChicagoLobbyists.org – our very first civic app – for the City of Chicago’s first open data hackathon….
Since then, the City has often cited ChicagoLobbyists.org as evidence of the innovation-sparking potential of open data.
Shortly after our site launched, a Request For Proposals, or RFP, was issued by the City for an ‘Online Lobbyist Disclosure System.’
Hey! We just built one of those! Sure, we would need to make some updates to it—adding a way for lobbyists to log in and submit their info—but we had a solid start. So, our scrappy group of tech volunteers decided to respond to the RFP.
After reading all 152 pages of the document, we realized we had no chance of getting the bid. It was impossible for the ChicagoLobbyists.org group to meet the legal requirements (as it would have been for any small software shop):

  • audited financial statements for the past 3 years
  • an economic disclosure statement (EDS) and affidavit
  • proof of $500k workers compensation and employers liability
  • proof of $2 million in professional liability insurance”

How government can engage with citizens online – expert views


The Guardian: “In our livechat on 28 February the experts discussed how to connect up government and citizens online. Digital public services are not just for ‘techno wizzy people’, so government should make them easier for everyone…
Michael Sanders, head of research for the Behavioural Insights Team (@mike_t_sanders)
It’s important that government is a part of people’s lives: when people interact with government it shouldn’t be a weird and alienating experience, but one that feels part of their everyday lives.
Online services are still too often difficult to use: most people who use the HMRC website will do so infrequently, and will forget its many nuances between visits. This is getting better but there’s a long way to go.
Digital by default keeps things simple: one of our main findings from our research on improving public services is that we should do all we can to “make it easy”.
There is always a risk of exclusion: we should avoid “digital by default” becoming “digital only”.
Ben Matthews, head of communications at Futuregov (@benrmatthews)
We prefer digital by design to digital by default: sometimes people can use technology badly, under the guise of ‘digital by default’. We should take a more thoughtful approach to technology, using it as a means to an end – to help us be open, accountable and human.
Leadership is important: you can get enthusiasm from the frontline or younger workers who are comfortable with digital tools, but until they’re empowered by the top of the organisation to use them actively and effectively, we’ll see little progress.
Jargon scares people off: ‘big data’ or ‘open data’, for example….”

Open Government: Opportunities and Challenges for Public Governance


New volume of Public Administration and Information Technology series: “Given this global context, and taking into account the needs of both academics and practitioners, it is the intention of this book to shed light on the open government concept and, in particular:
• To provide comprehensive knowledge of recent major developments of open government around the world.
• To analyze the importance of open government efforts for public governance.
• To provide insightful analysis about those factors that are critical when designing, implementing and evaluating open government initiatives.
• To discuss how contextual factors affect open government initiatives’ success or failure.
• To explore the existence of theoretical models of open government.
• To propose strategies to move forward and to address future challenges in an international context.”

New study proves economic benefits of open data for Berlin


ePSI Platform: “The study “Digitales Gold: Nutzen und Wertschöpfung durch Open Data für Berlin” (in English, “Digital Gold: the open data benefits and its added value for Berlin”), released by TSB Technologiestiftung Berlin, estimates that Open Data will bring around 32 million euros per year of economic benefit to the city of Berlin for the next few years. …

The estimations made for Berlin are inspired by previous reasoning included in two other studies: Pollock R. (2011), Welfare Gains from opening up public sector information in the UK; and Fuchs, S. et al. (2013), Open Government Data – Offene Daten für Österreich. Mit Community-Strategien von heute zum Potential von morgen.
Upon presenting the study, data journalist Michael Hörz shows various examples of how to develop interesting new information and services with publicly available information. You can read more about it (in German) here.”

Big Data, Big New Businesses


Nigel Shadbolt and Michael Chui: “Many people have long believed that if government and the private sector agreed to share their data more freely, and allow it to be processed using the right analytics, previously unimaginable solutions to countless social, economic, and commercial problems would emerge. They may have no idea how right they are.

Even the most vocal proponents of open data appear to have underestimated how many profitable ideas and businesses stand to be created. More than 40 governments worldwide have committed to opening up their electronic data – including weather records, crime statistics, transport information, and much more – to businesses, consumers, and the general public. The McKinsey Global Institute estimates that the annual value of open data in education, transportation, consumer products, electricity, oil and gas, health care, and consumer finance could reach $3 trillion.

These benefits come in the form of new and better goods and services, as well as efficiency savings for businesses, consumers, and citizens. The range is vast. For example, drawing on data from various government agencies, the Climate Corporation (recently bought for $1 billion) has taken 30 years of weather data, 60 years of data on crop yields, and 14 terabytes of information on soil types to create customized insurance products.

Similarly, real-time traffic and transit information can be accessed on smartphone apps to inform users when the next bus is coming or how to avoid traffic congestion. And, by analyzing online comments about their products, manufacturers can identify which features consumers are most willing to pay for, and develop their business and investment strategies accordingly.

Opportunities are everywhere. A raft of open-data start-ups are now being incubated at the London-based Open Data Institute (ODI), which focuses on improving our understanding of corporate ownership, health-care delivery, energy, finance, transport, and many other areas of public interest.

Consumers are the main beneficiaries, especially in the household-goods market. Consumers making better-informed buying decisions across sectors could capture an estimated $1.1 trillion in value annually. Third-party data aggregators are already allowing customers to compare prices across online and brick-and-mortar shops. Many also permit customers to compare quality ratings, safety data (drawn, for example, from official injury reports), information about the provenance of food, and producers’ environmental and labor practices.

Consider the book industry. Bookstores once regarded their inventory as a trade secret. Customers, competitors, and even suppliers seldom knew what stock bookstores held. Nowadays, by contrast, bookstores not only report what stock they carry but also when customers’ orders will arrive. If they did not, they would be excluded from the product-aggregation sites that have come to determine so many buying decisions.

The health-care sector is a prime target for achieving new efficiencies. By sharing the treatment data of a large patient population, for example, care providers can better identify practices that could save $180 billion annually.

The Open Data Institute-backed start-up Mastodon C uses open data on doctors’ prescriptions to differentiate among expensive patent medicines and cheaper “off-patent” varieties; when applied to just one class of drug, that could save around $400 million in one year for the British National Health Service. Meanwhile, open data on acquired infections in British hospitals has led to the publication of hospital-performance tables, a major factor in the 85% drop in reported infections.

There are also opportunities to prevent lifestyle-related diseases and improve treatment by enabling patients to compare their own data with aggregated data on similar patients. This has been shown to motivate patients to improve their diet, exercise more often, and take their medicines regularly. Similarly, letting people compare their energy use with that of their peers could prompt them to save hundreds of billions of dollars in electricity costs each year, to say nothing of reducing carbon emissions.

Such benchmarking is even more valuable for businesses seeking to improve their operational efficiency. The oil and gas industry, for example, could save $450 billion annually by sharing anonymized and aggregated data on the management of upstream and downstream facilities.

Finally, the move toward open data serves a variety of socially desirable ends, ranging from the reuse of publicly funded research to support work on poverty, inclusion, or discrimination, to the disclosure by corporations such as Nike of their supply-chain data and environmental impact.

There are, of course, challenges arising from the proliferation and systematic use of open data. Companies fear for their intellectual property; ordinary citizens worry about how their private information might be used and abused. Last year, Telefónica, the world’s fifth-largest mobile-network provider, tried to allay such fears by launching a digital confidence program to reassure customers that innovations in transparency would be implemented responsibly and without compromising users’ personal information.

The sensitive handling of these issues will be essential if we are to reap the potential $3 trillion in value that usage of open data could deliver each year. Consumers, policymakers, and companies must work together, not just to agree on common standards of analysis, but also to set the ground rules for the protection of privacy and property.”

Crowdsourcing voices to study Parkinson’s disease


TedMed: “Mathematician Max Little is launching a project that aims to literally give Parkinson’s disease (PD) patients a voice in their own diagnosis and help them monitor their disease progression.
Patients Voice Analysis (PVA) is an open science project that uses phone-based voice recordings and self-reported symptoms, along with software Little designed, to track disease progression. Little, a TEDMED 2013 speaker and TED Fellow, is partnering with the online community PatientsLikeMe, co-founded by TEDMED 2009 speaker James Heywood, and Sage Bionetworks, a non-profit research organization, to conduct the research.
The new project is an extension of Little’s Parkinson’s Voice Initiative, which used speech analysis algorithms to diagnose Parkinson’s from voice records with the help of 17,000 volunteers. This time, he seeks to not only detect markers of PD, but also to add information reported by patients using PatientsLikeMe’s Parkinson’s Disease Rating Scale (PDRS), a tool that documents patients’ answers to questions that measure treatment effectiveness and disease progression….
As openly shared information, the collected data has potential to help vast numbers of individuals by tapping into collective ingenuity. Little has long argued that for science to progress, researchers need to democratize research and move past jostling for credit. Sage Bionetworks has designed a platform called Synapse to allow data sharing with collaborative version control, an effort led by open data advocate John Wilbanks.
“If you can’t share your data, how can you reproduce your science? One of the big problems we’re facing with this kind of medical research is the data is not open and getting access to it is a nightmare,” Little says.
With the PVA project, “Basically anyone can log on, download the anonymized data and play around with data mining techniques. We don’t really care what people are able to come up with. We just want the most accurate prediction we can get.
“In research, you’re almost always constrained by what you think is the best way to do things. Unless you open it to the community at large, you’ll never know,” he says.”

Are bots taking over Wikipedia?


Kurzweil News: “As crowdsourced Wikipedia has grown too large — with more than 30 million articles in 287 languages — to be entirely edited and managed by volunteers, 12 Wikipedia bots have emerged to pick up the slack.

The bots use Wikidata — a free knowledge base that can be read and edited by both humans and bots — to exchange information between entries and between the 287 languages.

Which raises an interesting question: what portion of Wikipedia edits are generated by humans versus bots?

To find out (and keep track of other bot activity), Thomas Steiner of Google Germany has created an open-source application (and API): Wikipedia and Wikidata Realtime Edit Stats, described in an arXiv paper.
The percentages of bot vs. human edits shown in the application are constantly changing. A KurzweilAI snapshot on Feb. 20 at 5:19 AM EST showed an astonishing 42% of Wikipedia being edited by bots. (The application lists the 12 bots.)


Anonymous vs. logged-in humans (credit: Thomas Steiner)
The percentages also vary by language. Only 5% of English edits were by bots; but for Serbian pages, in which few Wikipedians apparently participate, 96% of edits were by bots.

The application also tracks what percentage of edits are by anonymous users. Globally, it was 25 percent in our snapshot and a surprising 34 percent for English, raising interesting questions about corporate and other interests covertly manipulating Wikipedia information.”
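The percentages quoted above come down to simple tallies over a stream of edit events. The following is a minimal sketch of that arithmetic, not the application's actual code or API. The record layout (dictionaries with `bot` and `anonymous` flags), the function name `edit_shares`, and the choice to measure anonymous edits as a share of human edits only are all assumptions made for illustration.

```python
from collections import Counter


def edit_shares(edits):
    """Return (bot share of all edits, anonymous share of human edits).

    Each edit is a dict with boolean "bot" and "anonymous" keys
    (a hypothetical layout chosen for this sketch).
    """
    counts = Counter()
    for edit in edits:
        if edit["bot"]:
            counts["bot"] += 1
        elif edit["anonymous"]:
            counts["anonymous"] += 1
        else:
            counts["logged_in"] += 1

    total = sum(counts.values())
    humans = counts["anonymous"] + counts["logged_in"]
    # Guard against empty input so the function never divides by zero.
    bot_share = counts["bot"] / total if total else 0.0
    anon_share = counts["anonymous"] / humans if humans else 0.0
    return bot_share, anon_share


# Made-up sample: 2 bot edits, 1 anonymous edit, 2 logged-in edits.
sample = [
    {"bot": True, "anonymous": False},
    {"bot": True, "anonymous": False},
    {"bot": False, "anonymous": True},
    {"bot": False, "anonymous": False},
    {"bot": False, "anonymous": False},
]
bot_share, anon_share = edit_shares(sample)
```

Because the shares are recomputed as new edits arrive, any single snapshot such as the one quoted here reflects only the events seen up to that moment.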

LocalWiki turns open local data into open local knowledge


Marina Kukso at OpenGovVoices: “LocalWiki is an open knowledge project focusing on giving everyone the opportunity to collaborate to create and share all kinds of information about the place where they live.

The project started in 2004 in Davis, Calif. as the Davis Wiki, now the primary local information resource for Davis residents. One-in-seven residents have contributed to the project and, in a given month, almost every resident uses it.

In 2010, we received funding from the Knight Foundation to bring LocalWiki to many more communities. We created wiki software specifically designed for local collaboration and have seen adoption in more than 70 communities worldwide. People now use LocalWiki for everything from mapping out nature trails to planning a grassroots mayoral election candidate debate….

There’s a great deal of expertise within our communities, and at LocalWiki we see part of the mission of our work as providing a platform for people to contextualize and make meaning out of the information made available through open data and open gov efforts at the local level.

There are obviously limitations to the ability of programming laypeople to make use of open data to create new knowledge to drive action, most notably many people’s lack of expertise in data analysis, but with LocalWiki we hope to at least address some of those limitations by making it significantly easier for people to collaborate to create meaning out of open data and to share it with others. This is why LocalWiki has a WYSIWYG editor, which includes mapping as a core feature and prioritizes usability in design.

Finally, adding information about a community on LocalWiki is a way to create new open data. It’s incredibly important to make things like internal city crime statistics public, but residents’ perspectives on the relative safety of their neighborhoods are a different kind of data that provides additional insights into public safety challenges and adds complexity to the picture created by statistics.”