Defining Open Data


Open Knowledge Foundation Blog: “Open data is data that can be freely used, shared and built-on by anyone, anywhere, for any purpose. This is the summary of the full Open Definition which the Open Knowledge Foundation created in 2005 to provide both a succinct explanation and a detailed definition of open data.
As the open data movement grows, and even more governments and organisations sign up to open data, it becomes ever more important that there is a clear and agreed definition for what “open data” means if we are to realise the full benefits of openness, and avoid the risks of creating incompatibility between projects and splintering the community.

Open can apply to information from any source and about any topic. Anyone can release their data under an open licence for free use by and benefit to the public. Although we may think mostly about government and public sector bodies releasing public information such as budgets or maps, or researchers sharing their results data and publications, any organisation can open information (corporations, universities, NGOs, startups, charities, community groups and individuals).

Read more about different kinds of data in our one page introduction to open data
There is open information in transport, science, products, education, sustainability, maps, legislation, libraries, economics, culture, development, business, design, finance …. So the explanation of what open means applies to all of these information sources and types. Open may also apply both to data – big data and small data – or to content, like images, text and music!
So here we set out clearly what open means, and why this agreed definition is vital for us to collaborate, share and scale as open data and open content grow and reach new communities.

What is Open?

The full Open Definition provides a precise definition of what open data is. There are 2 important elements to openness:

  • Legal openness: you must be allowed to get the data legally, to build on it, and to share it. Legal openness is usually provided by applying an appropriate (open) license which allows for free access to and reuse of the data, or by placing data into the public domain.
  • Technical openness: there should be no technical barriers to using that data. For example, providing data as printouts on paper (or as tables in PDF documents) makes the information extremely difficult to work with. So the Open Definition has various requirements for “technical openness,” such as requiring that data be machine readable and available in bulk.”…
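
To make the point about machine readability concrete, here is a minimal Python sketch. The budget figures, department names and column names are invented for illustration and do not come from any real dataset. It suggests that once data is published as CSV rather than as a printout or a PDF table, a few lines of standard-library code are enough to load and total it.

```python
import csv
import io

# Hypothetical open budget data, as it might be published in CSV form.
# All names and numbers here are made up for this example.
SAMPLE_CSV = """department,year,amount_gbp
Transport,2013,1200000
Education,2013,3400000
Libraries,2013,250000
"""


def total_spending(csv_text):
    """Sum the amount_gbp column of a CSV document supplied as text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return sum(int(row["amount_gbp"]) for row in reader)


total = total_spending(SAMPLE_CSV)
print(total)
```

The same table locked in a scanned PDF would need manual retyping or error-prone extraction before any such calculation could run.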

The transition towards transparency


Roland Harwood at the Open Data Institute Blog: “It’s a very exciting time for the field of open data, especially in the UK public sector which is arguably leading the world in this emerging discipline right now, in no small part thanks to the efforts of the Open Data Institute. There is a strong push to release public data and to explore new innovations that can be created as a result.
For instance, the Ordnance Survey have been leading the way with opening up half of their data for others to use, complemented by their GeoVation programme which provides support and incentive for external innovators to develop new products and services.
More recently the Technology Strategy Board have been working with the likes of NERC, Met Office, Environment Agency and other public agencies to help solve business problems using environmental data.
It goes without saying that data won’t leap up and create any value by itself any more than a pile of discarded parts outside a factory will assemble itself into a car. We’ve found that the secret of successful open data innovation is to be with people working to solve some specific problem. Simply releasing the data is not enough. See below a summary of our Do’s and Don’ts of opening up data.
Do…

  • Make sure data quality is high (ODI Certificates can help!)
  • Promote innovation using data sets. Transparency is only a means to an end
  • Enhance communication with external innovators
  • Make sure your co-creators are incentivised
  • Get organised, create a community around an issue
  • Pass on learnings to other similar organisations
  • Experiment – open data requires new mindsets and business models
  • Create safe spaces – Innovation Airlocks – to share and prototype with trusted partners
  • Be brave – people may do things with the data that you don’t like
  • Set out to create commercial or social value with data

Don’t…

  • Just release data and expect people to understand or create with it. Publication is not the same as communication
  • Wait for data requests, put the data out first informally
  • Avoid challenges to current income streams
  • Go straight for the finished article, use rapid prototyping
  • Be put off by the tensions between confidentiality, data protection and publishing
  • Wait for the big budget or formal process but start big things with small amounts now
  • Be technology led, be business led instead
  • Expect the community to entirely self-manage
  • Restrict open data to the IT literate – create interdisciplinary partnerships
  • Get caught in the false dichotomy that is commercial vs. social

In summary we believe we need to assume openness as the default (for organisations that is, not individuals) and secrecy as the exception – the exact opposite to how most commercial organisations currently operate. …”

5 Ways Cities Are Using Big Data


Eric Larson in Mashable: “New York City released more than 200 high-value data sets to the public on Monday — a way, in part, to provide more content for open-sourced mapping projects like OpenStreetMap.
It’s one of the many releases since the Local Law 11 of 2012 passed in February, which calls for more transparency of the city government’s collected data.
But it’s not just New York: Cities across the world, large and small, are utilizing big data sets — like traffic statistics, energy consumption rates and GPS mapping — to launch projects to help their respective communities.
We rounded up a few of our favorites below….

1. Seattle’s Power Consumption

The city of Seattle recently partnered with Microsoft and Accenture on a pilot project to reduce the area’s energy usage. Using Microsoft’s Azure cloud, the project will collect and analyze hundreds of data sets collected from four downtown buildings’ management systems.
With predictive analytics, then, the system will work to find out what’s working and what’s not — i.e. where energy can be used less, or not at all. The goal is to reduce power usage by 25%.

2. SpotHero

Finding parking spots, especially in big cities, is undoubtedly a headache.

SpotHero is an app, for both iOS and Android devices, that tracks down parking spots in a select number of cities. How it works: Users type in an address or neighborhood (say, Adams Morgan in Washington, D.C.) and are taken to a listing of available garages and lots nearby — complete with prices and time durations.
The app tracks availability in real-time, too, so a spot is updated in the system as soon as it’s snagged.
Seven cities are currently synced with the app: Washington, D.C., New York, Chicago, Baltimore, Boston, Milwaukee and Newark, N.J.

3. Adopt-a-Hydrant

Anyone who’s spent a winter in Boston will agree: it snows.

In January, the city’s Office of New Urban Mechanics released an app called Adopt-a-Hydrant. The program is mapped with every fire hydrant in the city proper — more than 13,000, according to a Harvard blog post — and lets residents pledge to shovel out one, or as many as they choose, in the almost inevitable event of a blizzard.
Once a pledge is made, volunteers receive a notification if their hydrant — or hydrants — become buried in snow.

4. Adopt-a-Sidewalk

Similar to Adopt-a-Hydrant, Chicago’s Adopt-a-Sidewalk app lets residents of the Windy City pledge to shovel sidewalks after snowfall. In a city just as notorious for snowstorms as Boston, it’s an effective way to ensure public spaces remain free of snow and ice — especially spaces belonging to the elderly or disabled.

If you’re unsure which part of town you’d like to “adopt,” just register on the website and browse the map — you’ll receive a pop-up notification for each street you swipe that’s still available.

5. Less Congestion for Lyon

Last year, researchers at IBM teamed up with the city of Lyon, France (about four hours south of Paris), to build a system that helps traffic operators reduce congestion on the road.

The system, called the “Decision Support System Optimizer (DSSO),” uses real-time traffic reports to detect and predict congestions. If an operator sees that a traffic jam is likely to occur, then, she/he can adjust traffic signals accordingly to keep the flow of cars moving smoothly.
It’s an especially helpful tool for emergencies — say, when an ambulance is en route to the hospital. Over time, the algorithms in the system will “learn” from its most successful recommendations, then apply that knowledge when making future predictions.”

Explore the world’s constitutions with a new online tool


Official Google Blog: “Constitutions are as unique as the people they govern, and have been around in one form or another for millennia. But did you know that every year approximately five new constitutions are written, and 20-30 are amended or revised? Or that Africa has the youngest set of constitutions, with 19 out of the 39 constitutions written globally since 2000 from the region?
The process of redesigning and drafting a new constitution can play a critical role in uniting a country, especially following periods of conflict and instability. In the past, it’s been difficult to access and compare existing constitutional documents and language—which is critical to drafters—because the texts are locked up in libraries or on the hard drives of constitutional experts. Although the process of drafting constitutions has evolved from chisels and stone tablets to pens and modern computers, there has been little innovation in how their content is sourced and referenced.
With this in mind, Google Ideas supported the Comparative Constitutions Project to build Constitute, a new site that digitizes and makes searchable the world’s constitutions. Constitute enables people to browse and search constitutions via curated and tagged topics, as well as by country and year. The Comparative Constitutions Project cataloged and tagged nearly 350 themes, so people can easily find and compare specific constitutional material. This ranges from the fairly general, such as “Citizenship” and “Foreign Policy,” to the very specific, such as “Suffrage and turnouts” and “Judicial Autonomy and Power.”
Our aim is to arm drafters with a better tool for constitution design and writing. We also hope citizens will use Constitute to learn more about their own constitutions, and those of countries around the world.”

Participatory Budgeting Around the World


Jay Colburn, from the International Budget Partnership:  “Public participation in budget decision making can occur in many different forms. Participatory budgeting (PB) is an increasingly popular process in which the public is involved directly in making budgetary decisions, most often at the local level. The involvement of community members usually includes identifying and prioritizing the community’s needs and then voting on spending for specific projects.
PB was first developed in Porto Alegre, Brazil, in 1989 as an innovative reform to address the city’s severe inequality. Since then it has spread around the world. Though the specifics of how the PB process works vary depending on the context in which it is implemented, most PB processes have four basic similarities: 1) community members identify spending ideas; 2) delegates are selected to develop spending proposals based on those ideas; 3) residents vote on which proposals to fund; and 4) the government implements the chosen proposals.
During the 1990s PB spread throughout Brazil and across Latin America. Examples of participatory budgeting can now be found in every region of the world, including Central Asia, Europe, and the Middle East. As the use of PB has expanded, it has been adapted in many ways. One example is to incorporate new information and communication technologies as a way to broaden opportunities for participation (see Using Technology to Improve Transparency and Citizen Engagement in this newsletter for more on this topic.)…
There are also a number of different models of PB that have been developed, each with slightly different rules and processes. Using the different models and methods has expanded our knowledge on the potential impacts of PB. In addition to having demonstrable and measurable results on mobilizing public funds for services for the poor, participatory budgeting has also been linked to greater tax compliance, increased demands for transparency, and greater access to budget information and oversight.
However, not all instances of PB are equally successful; there are many variables to consider when weighing the impact of different cases. These can include the level and mechanisms of participation, information accessibility, knowledge of opportunities to participate, political context, and prevailing socioeconomic factors. There is a large and growing literature on the benefits and challenges of PB. The IBP Open Budgets Blog recently featured posts on participatory budgeting initiatives in Peru, Kyrgyzstan, and Kenya. While there are still many lessons to be learned about how PB can be used in different contexts, it is certainly a positive step toward increased citizen engagement in the budget process and influence over how public funds are spent.
For more information and resources on PB, visit the participatory budgeting Facebook group”

Three ways to think of the future…


Geoff Mulgan’s blog: “Here I suggest three complementary ways of thinking about the future which provide partial protection against the pitfalls.
The shape of the future
First, create your own composite future by engaging with the trends. There are many methods available for mapping the future – from Foresight to scenarios to the Delphi method.
Behind all are implicit views about the shapes of change. Indeed any quantitative exploration of the future uses a common language of patterns (shown in this table above) which summarises the fact that some things will go up, some go down, some change suddenly and some not at all.
All of us have implicit or explicit assumptions about these. But it’s rare to interrogate them systematically and test whether our assumptions about what fits in which category are right.
Let’s start with the J shaped curves. Many of the long-term trends around physical phenomena look J-curved: rising carbon emissions, water usage and energy consumption have been exponential in shape over the centuries. As we know, physical constraints mean that these simply can’t go on – the J curves have to become S shaped sooner or later, or else crash. That is the ecological challenge of the 21st century.
New revolutions
But there are other J curves, particularly the ones associated with digital technology.  Moore’s Law and Metcalfe’s Law describe the dramatically expanding processing power of chips, and the growing connectedness of the world.  Some hope that the sheer pace of technological progress will somehow solve the ecological challenges. That hope has more to do with culture than evidence. But these J curves are much faster than the physical ones – any factor that doubles every 18 months achieves stupendous rates of change over decades.
That’s why we can be pretty confident that digital technologies will continue to throw up new revolutions – whether around the Internet of Things, the quantified self, machine learning, robots, mass surveillance or new kinds of social movement. But what form these will take is much harder to predict, and most digital prediction has been unreliable – we have YouTube but not the Interactive TV many predicted (when did you last vote on how a drama should end?); relatively simple SMS and Twitter spread much more than ISDN or fibre to the home. And plausible ideas like the long tail theory turned out to be largely wrong.
If the J curves are dramatic but unusual, much more of the world is shaped by straight line trends – like ageing or the rising price of disease that some predict will take costs of healthcare up towards 40 or 50% of GDP by late in the century, or incremental advances in fuel efficiency, or the likely relative growth of the Chinese economy.
Also important are the flat straight lines – the things that probably won’t change in the next decade or two:  the continued existence of nation states not unlike those of the 19th century? Air travel making use of fifty year old technologies?
Great imponderables
If the Js are the most challenging trends, the most interesting ones are the ‘U’s’ – the examples of trends bending: like crime which went up for a century and then started going down, or world population that has been going up but could start going down in the later part of this century, or divorce rates which seem to have plateaued, or Chinese labour supply which is forecast to turn down in the 2020s.
No one knows if the apparently remorseless upward trends of obesity and depression will turn downwards. No one knows if the next generation in the West will be poorer than their parents. And no one knows if democratic politics will reinvent itself and restore trust. In every case, much depends on what we do. None of these trends is a fact of nature or an act of God.
That’s one reason why it’s good to immerse yourself in these trends and interrogate what shape they really are. Out of that interrogation we can build a rough mental model and generate our own hypotheses – ones not based on the latest fashion or bestseller but hopefully on a sense of what the data shows and in particular what’s happening to the deltas – the current rates of change of different phenomena.”
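
The excerpt's remark that any factor doubling every 18 months reaches stupendous rates of change over decades can be checked with simple arithmetic. The sketch below is only an illustration: it assumes a perfectly constant 18-month doubling period, which is an idealisation rather than a claim about any real technology, and the function name is made up for this example.

```python
def growth_factor(years, doubling_period_years=1.5):
    """Return the multiple reached after `years` of steady doubling.

    Assumes the quantity doubles once every `doubling_period_years`
    (1.5 years = 18 months), with no slowdown or saturation.
    """
    return 2 ** (years / doubling_period_years)


# Compare a few horizons: 3 years, 15 years and 30 years.
for years in (3, 15, 30):
    print(years, "years ->", growth_factor(years), "times the starting value")
```

Under that assumption, three years gives a fourfold increase, fifteen years roughly a thousandfold, and thirty years roughly a millionfold, which is why such curves dwarf slower physical trends.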

Open data for accountable governance: Is data literacy the key to citizen engagement?


at UNDP’s Voices of Eurasia blog: “How can technology connect citizens with governments, and how can we foster, harness, and sustain the citizen engagement that is so essential to anti-corruption efforts?
UNDP has worked on a number of projects that use technology to make it easier for citizens to report corruption to authorities:

These projects are showing some promising results, and provide insights into how a more participatory, interactive government could develop.
At the heart of the projects is the ability to use citizen generated data to identify and report problems for governments to address….

Wanted: Citizen experts

As Kenneth Cukier, The Economist’s Data Editor, has discussed, data literacy will become the new computer literacy. Big data is still nascent and it is impossible to predict exactly how it will affect society as a whole. What we do know is that it is here to stay and data literacy will be integral to our lives.
It is essential that we understand how to interact with big data and the possibilities it holds.
Data literacy needs to be integrated into the education system. Educating non-experts to analyze data is critical to enabling broad participation in this new data age.
As technology advances, key government functions become automated, and government data sharing increases, newer ways for citizens to engage will multiply.
Technology changes rapidly, but the human mind and societal habits cannot. After years of closed government and bureaucratic inefficiency, adaptation of a new approach to governance will take time and education.
We need to bring up a generation that sees being involved in government decisions as normal, and that views participatory government as a right, not an ‘innovative’ service extended by governments.

What now?

In the meantime, while data literacy lies in the hands of a few, we must continue to connect those who have the technological skills with citizen experts seeking to change their communities for the better – as has been done in many Social Innovation Camps recently (in Montenegro, Ukraine and Armenia at Mardamej and Mardamej Reloaded and across the region at Hurilab).
The social innovation camp and hackathon models are an increasingly debated topic (covered by Susannah Vila, David Eaves, Alex Howard and Clay Johnson).
On the whole, evaluations are leading to newer models that focus on greater integration of mentorship to increase sustainability – which I readily support. However, I do have one comment:
Social innovation camps are often criticized for a lack of sustainability – a claim based on the limited number of apps that go beyond the prototype phase. I find a certain sense of irony in this, for isn’t this what innovation is about: Opening oneself up to the risk of failure in the hope of striking something great?
In the words of Vinod Khosla:

“No failure means no risk, which means nothing new.”

As more data is released, the opportunity for new apps and new ways for citizen interaction will multiply and, who knows, someone might come along and transform government just as TripAdvisor transformed the travel industry.”

Innovating to Improve Disaster Response and Recovery


Todd Park at OSTP blog: “Last week, the White House Office of Science and Technology Policy (OSTP) and the Federal Emergency Management Agency (FEMA) jointly challenged a group of over 80 top innovators from around the country to come up with ways to improve disaster response and recovery efforts.  This diverse group of stakeholders, consisting of representatives from Zappos, Airbnb, Marriott International, the Parsons School of Design, AOL/Huffington Post’s Social Impact, The Weather Channel, Twitter, Topix.com, Twilio, New York City, Google and the Red Cross, to name a few, spent an entire day at the White House collaborating on ideas for tools, products, services, programs, and apps that can assist disaster survivors and communities…
During the “Data Jam/Think Tank,” we discussed response and recovery challenges…Below are some of the ideas that were developed throughout the day. In the case of the first two ideas, participants wrote code and created actual working prototypes.

  • A real-time communications platform that allows survivors dependent on electricity-powered medical devices to text or call in their needs—such as batteries, medication, or a power generator—and connect those needs with a collaborative transportation network to make real-time deliveries.
  • A technical schema that tags all disaster-related information from social media and news sites – enabling municipalities and first responders to better understand all of the invaluable information generated during a disaster and help identify where they can help.
  • A Disaster Relief Innovation Vendor Engine (DRIVE) which aggregates pre-approved vendors for disaster-related needs, including transportation, power, housing, and medical supplies, to make it as easy as possible to find scarce local resources.
  • A crowdfunding platform for small businesses and others to receive access to capital to help rebuild after a disaster, including a rating system that encourages rebuilding efforts that improve the community.
  • Promoting preparedness through talk shows, working closely with celebrities, musicians, and children to raise awareness.
  • A “community power-go-round” that, like a merry-go-round, can be pushed to generate electricity and additional power for battery-charged devices including cell phones or a Wi-Fi network to provide community internet access.
  • Aggregating crowdsourced imagery taken and shared through social media sites to help identify where trees have fallen, electrical lines have been toppled, and streets have been obstructed.
  • A kid-run local radio station used to educate youth about preparedness for a disaster and activated to support relief efforts during a disaster that allows youth to share their experiences.”

White House: "We Want Your Input on Building a More Open Government"


Nick Sinai at the White House Blog: “…We are proud of this progress, but recognize that there is always more we can do to build a more efficient, effective, and accountable government. In that spirit, the Obama Administration has committed to develop a second National Action Plan on Open Government: “NAP 2.0.”
In order to develop a Plan with the most creative and ambitious solutions, we need all-hands-on-deck. That’s why we are asking for your input on what should be in the NAP 2.0:

  1. How can we better encourage and enable the public to participate in government and increase public integrity? For example, in the first National Action Plan, we required Federal enforcement agencies to make publicly available compliance information easily accessible, downloadable and searchable online – helping the public to hold the government and regulated entities accountable.
  • What other kinds of government information should be made more available to help inform decisions in your communities or in your lives?
  • How would you like to be able to interact with Federal agencies making decisions which impact where you live?
  • How can the Federal government better ensure broad feedback and public participation when considering a new policy?
  2. The American people must be able to trust that their Government is doing everything in its power to stop wasteful practices and earn a high return on every tax dollar that is spent. How can the government better manage public resources?
  • What suggestions do you have to help the government achieve savings while also improving the way that government operates?
  • What suggestions do you have to improve transparency in government spending?
  3. The American people deserve a Government that is responsive to their needs, makes information readily accessible, and leverages Federal resources to help foster innovation both in the public and private sector. How can the government more effectively work in collaboration with the public to improve services?
  • What are your suggestions for ways the government can better serve you when you are seeking information or help in trying to receive benefits?
  • In the past few years, the government has promoted the use of “grand challenges,” ambitious yet achievable goals to solve problems of national priority, and incentive prizes, where the government identifies challenging problems and provides prizes and awards to the best solutions submitted by the public.  Are there areas of public services that you think could be especially benefited by a grand challenge or incentive prize?
  • What information or data could the government make more accessible to help you start or improve your business?

Please think about these questions and send your thoughts to opengov@ostp.gov by September 23. We will post a summary of your submissions online in the future.”

How Mechanical Turkers Crowdsourced a Huge Lexicon of Links Between Words and Emotion


The Physics arXiv Blog: “Sentiment analysis on the social web depends on how a person’s state of mind is expressed in words. Now a new database of the links between words and emotions could provide a better foundation for this kind of analysis.


One of the buzzphrases associated with the social web is sentiment analysis. This is the ability to determine a person’s opinion or state of mind by analysing the words they post on Twitter, Facebook or some other medium.
Much has been promised with this method—the ability to measure satisfaction with politicians, movies and products; the ability to better manage customer relations; the ability to create dialogue for emotion-aware games; the ability to measure the flow of emotion in novels; and so on.
The idea is to entirely automate this process—to analyse the firehose of words produced by social websites using advanced data mining techniques to gauge sentiment on a vast scale.
But all this depends on how well we understand the emotion and polarity (whether negative or positive) that people associate with each word or combinations of words.
Today, Saif Mohammad and Peter Turney at the National Research Council Canada in Ottawa unveil a huge database of words and their associated emotions and polarity, which they have assembled quickly and inexpensively using Amazon’s crowdsourcing Mechanical Turk website. They say this crowdsourcing mechanism makes it possible to increase the size and quality of the database quickly and easily….The result is a comprehensive word-emotion lexicon for over 10,000 words or two-word phrases which they call EmoLex….
The bottom line is that sentiment analysis can only ever be as good as the database on which it relies. With EmoLex, analysts have a new tool for their box of tricks.”
Ref: arxiv.org/abs/1308.6297: Crowdsourcing a Word-Emotion Association Lexicon
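
As a rough illustration of how a word-emotion lexicon can feed sentiment analysis, the Python sketch below tallies emotion and polarity labels for the words in a sentence. It is not the method from the paper: the tiny lexicon is invented for this example and its entries are not taken from EmoLex, and the function name and the simple whitespace tokenisation are assumptions made to keep the sketch short.

```python
from collections import Counter

# Toy lexicon invented for illustration only.
# These entries are NOT taken from EmoLex.
TOY_LEXICON = {
    "happy": {"joy", "positive"},
    "delighted": {"joy", "positive"},
    "afraid": {"fear", "negative"},
    "terrible": {"fear", "sadness", "negative"},
}


def tally_emotions(text, lexicon=TOY_LEXICON):
    """Count emotion and polarity labels for words found in the lexicon."""
    counts = Counter()
    for raw in text.lower().split():
        # Strip common punctuation so "happy," matches "happy".
        word = raw.strip(".,!?;:\"'")
        counts.update(lexicon.get(word, ()))
    return counts


result = tally_emotions("I was happy, then afraid. What a terrible day!")
print(dict(result))
```

Words missing from the lexicon contribute nothing, which reflects the excerpt's point that this kind of analysis can only be as good as the database behind it.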