Nick Grossman in Idea Lab: “The big idea in all of this is that through open data and standards and API-based interoperability, it’s possible not just to build more “civic apps,” but to make all apps more civic:
So in a perfect world, I’d not only be able to get my transit information from anywhere (say, Citymapper), I’d be able to read restaurant inspection data from anywhere (say, Foursquare), be able to submit a 311 request from anywhere (say, Twitter), etc.
These examples only scratch the surface of how apps can “become more civic” (i.e., integrate with government/civic information and services). And that’s only really describing one direction: apps tapping into government information and services.
Another, even more powerful direction is the reverse: helping governments tap into the people-power in web networks. In fact, I heard an amazing stat earlier this year:
It’s incredible to think about how web-enabled networks can extend the reach and increase the leverage of public-interest programs and government services, even when (perhaps especially when) that is not their primary function — i.e., Waze is a traffic avoidance app, not a “civic” app. Other examples include the Airbnb community coming together to provide emergency housing after Sandy, and the Etsy community helping to “craft a comeback” in Rockford, Ill.
In other words, helping all apps “be more civic,” rather than just building more civic apps. I think there is a ton of leverage there, and it’s a direction that has just barely begun to be explored.”
Defining Open Data
As the open data movement grows, and even more governments and organisations sign up to open data, it becomes ever more important that there is a clear and agreed definition for what “open data” means if we are to realise the full benefits of openness, and avoid the risks of creating incompatibility between projects and splintering the community.

Open can apply to information from any source and about any topic. Anyone can release their data under an open licence for free use by and benefit to the public. Although we may think mostly about government and public sector bodies releasing public information such as budgets or maps, or researchers sharing their results data and publications, any organisation can open information (corporations, universities, NGOs, startups, charities, community groups and individuals).

Read more about different kinds of data in our one-page introduction to open data.
There is open information in transport, science, products, education, sustainability, maps, legislation, libraries, economics, culture, development, business, design, finance… So the explanation of what open means applies to all of these information sources and types. Open may apply both to data – big data and small data – and to content, like images, text and music!
So here we set out clearly what open means, and why this agreed definition is vital for us to collaborate, share and scale as open data and open content grow and reach new communities.
What is Open?
The full Open Definition provides a precise definition of what open data is. There are two important elements to openness:
- Legal openness: you must be allowed to get the data legally, to build on it, and to share it. Legal openness is usually provided by applying an appropriate (open) license which allows for free access to and reuse of the data, or by placing data into the public domain.
- Technical openness: there should be no technical barriers to using that data. For example, providing data as printouts on paper (or as tables in PDF documents) makes the information extremely difficult to work with. So the Open Definition has various requirements for “technical openness,” such as requiring that data be machine-readable and available in bulk…
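As a concrete illustration of “machine-readable and available in bulk”, here is a minimal Python sketch – the records and field names are invented – that serialises a dataset as one CSV string for bulk download, rather than leaving it as a table locked inside a PDF:

```python
import csv
import io

# Hypothetical inspection records -- field names and values are invented.
records = [
    {"restaurant": "Cafe A", "score": 92, "date": "2013-06-01"},
    {"restaurant": "Diner B", "score": 78, "date": "2013-06-03"},
]

def to_bulk_csv(rows):
    """Serialise all records as one machine-readable CSV string,
    suitable for bulk download (one requirement of technical openness)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["restaurant", "score", "date"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

print(to_bulk_csv(records))
```

The point is not the format itself but the property: any programme, not just a human reader, can consume the whole dataset in one pass.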
Imagining Data Without Division
Thomas Lin in Quanta Magazine: “As science dives into an ocean of data, the demands of large-scale interdisciplinary collaborations are growing increasingly acute…Seven years ago, when David Schimel was asked to design an ambitious data project called the National Ecological Observatory Network, it was little more than a National Science Foundation grant. There was no formal organization, no employees, no detailed science plan. Emboldened by advances in remote sensing, data storage and computing power, NEON sought answers to the biggest question in ecology: How do global climate change, land use and biodiversity influence natural and managed ecosystems and the biosphere as a whole?…
For projects like NEON, interpreting the data is a complicated business. Early on, the team realized that its data, while mid-size compared with the largest physics and biology projects, would be big in complexity. “NEON’s contribution to big data is not in its volume,” said Steve Berukoff, the project’s assistant director for data products. “It’s in the heterogeneity and spatial and temporal distribution of data.”
Unlike the roughly 20 critical measurements in climate science or the vast but relatively structured data in particle physics, NEON will have more than 500 quantities to keep track of, from temperature, soil and water measurements to insect, bird, mammal and microbial samples to remote sensing and aerial imaging. Much of the data is highly unstructured and difficult to parse — for example, taxonomic names and behavioral observations, which are sometimes subject to debate and revision.
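A minimal sketch of what handling that heterogeneity can look like in practice, using invented field names: mapping a numeric sensor reading and a free-text species observation onto one common record shape so that downstream tools can treat all 500-plus quantities uniformly:

```python
# Invented observations of two very different kinds: a numeric sensor
# reading and a free-text species record.
raw = [
    {"type": "temperature", "site": "D01", "value": 18.4, "unit": "C"},
    {"type": "bird_obs", "site": "D01", "taxon": "Corvus brachyrhynchos",
     "note": "3 individuals, foraging"},
]

def normalise(obs):
    """Map any observation onto a common (site, kind, payload) shape,
    keeping the type-specific fields inside the payload."""
    payload = {k: v for k, v in obs.items() if k not in ("type", "site")}
    return {"site": obs["site"], "kind": obs["type"], "payload": payload}

normalised = [normalise(o) for o in raw]
```

The hard part NEON faces is exactly what this sketch glosses over: deciding what belongs in the shared schema when taxonomic names and behavioural notes are themselves subject to revision.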
And, as daunting as the looming data crush appears from a technical perspective, some of the greatest challenges are wholly nontechnical. Many researchers say the big science projects and analytical tools of the future can succeed only with the right mix of science, statistics, computer science, pure mathematics and deft leadership. In the big data age of distributed computing — in which enormously complex tasks are divided across a network of computers — the question remains: How should distributed science be conducted across a network of researchers?
Part of the adjustment involves embracing “open science” practices, including open-source platforms and data analysis tools, data sharing and open access to scientific publications, said Chris Mattmann, 32, who helped develop a precursor to Hadoop, a popular open-source data analysis framework that is used by tech giants like Yahoo, Amazon and Apple and that NEON is exploring. Without developing shared tools to analyze big, messy data sets, Mattmann said, each new project or lab will squander precious time and resources reinventing the same tools. Likewise, sharing data and published results will obviate redundant research.
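The map/reduce pattern that Hadoop popularised can be sketched in a few lines of plain Python (the observation records here are invented), which shows why a shared tool beats each lab reinventing the same grouping-and-aggregation machinery:

```python
from collections import defaultdict

def map_phase(records, mapper):
    """Map step: turn each record into zero or more (key, value) pairs."""
    for record in records:
        yield from mapper(record)

def reduce_phase(pairs, reducer):
    """Reduce step: group values by key, then reduce each group."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: reducer(values) for key, values in groups.items()}

# Example: count observations per taxon (records are invented).
obs = [{"taxon": "quercus"}, {"taxon": "acer"}, {"taxon": "quercus"}]
counts = reduce_phase(map_phase(obs, lambda r: [(r["taxon"], 1)]), sum)
# counts == {"quercus": 2, "acer": 1}
```

Frameworks like Hadoop add the part that matters at scale – distributing the map and reduce steps across many machines – but the programming model is this simple.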
To this end, international representatives from the newly formed Research Data Alliance met this month in Washington to map out their plans for a global open data infrastructure.”
Social media analytics for future oriented policy making
New paper by Verena Grubmüller, Katharina Götsch, and Bernhard Krieger: “Research indicates that evidence-based policy making is most successful when public administrators refer to diversified information portfolios. With the rising prominence of social media in the last decade, this paper argues that governments can benefit from integrating this publicly available, user-generated data through the technique of social media analytics (SMA). Several initiatives have already been set up to predict future policy issues, e.g. in the policy fields of crisis mitigation or migrant integration. The authors analyse these endeavours and their potential for providing more efficient and effective public policies. Furthermore, they scrutinise the challenges to governmental SMA usage, particularly with regard to legal and ethical aspects. Reflecting the latter, this paper provides forward-looking recommendations on how these technologies can best be used for future policy making in a legally and ethically sound manner.”
Online public services and Design Thinking for governments
Ela Alptekin: “The digital era has changed the expectations citizens have regarding the communication of public services and their engagement with government agencies. ‘Digital Citizenship’ is commonplace, and this is a great opportunity for institutions to explore the benefits this online presence offers.
Most government agencies have moved their public services to digital platforms by applying technology to the exact same workflow they had earlier. They’ve replaced hard copies with emails and signatures with digital prints. However, Information Technologies don’t just improve the efficiency of governments, they also have the power to transform how governments work by redefining their engagement with citizens. With this outlook they can expand the array of services that could be provided and implemented.
When it comes to online public services there are two different paths to building up a strategy. Governments can either use stats, trends and quantitative surveys to measure and produce “reliable results”, or they can develop a deeper understanding of the basic needs of their consumers for a specific problem. With that focus, they may propose a solid solution that satisfies those needs.
Two of the primary criteria of evaluation in any measurement or observation are:
- Does the same measurement process yield the same results?
- Are we measuring what we intend to measure?
These two concepts are reliability and validity.
According to Roger Martin, author of “The Design of Business”, truly innovative organisations are those that have managed to balance the “reliability” of analytical thinking with the “validity” of abductive thinking. Many organisations often don’t find this balance between reliability and validity and choose only the reliable data to move on with their future implementations.
So what is the relationship between reliability and validity? The two do not necessarily go hand-in-hand.
“At best, we have a measure that has both high validity and high reliability. It yields consistent results in repeated application and it accurately reflects what we hope to represent.
It is possible to have a measure that has high reliability but low validity – one that is consistent in getting bad information or consistent in missing the mark. It is also possible to have one that has low reliability and low validity – inconsistent and not on target.
Finally, it is not possible to have a measure that has low reliability and high validity – you can’t really get at what you want or what you’re interested in if your measure fluctuates wildly.”
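The combinations above can be illustrated numerically. A toy sketch with invented measurements, treating reliability as consistency (low spread across repeated measurements) and validity as closeness to a known true value:

```python
import statistics

TRUE_VALUE = 100.0  # the quantity we are trying to measure (invented)

reliable_but_invalid = [120.1, 119.9, 120.0, 120.2]  # consistent, off target
reliable_and_valid = [100.1, 99.9, 100.0, 100.2]     # consistent, on target

def spread(measurements):
    """Reliability proxy: lower spread means more consistent results."""
    return statistics.stdev(measurements)

def bias(measurements):
    """Validity proxy: smaller bias means closer to the true value."""
    return abs(statistics.mean(measurements) - TRUE_VALUE)

# Both series are highly reliable (spread well under 1), but only the
# second is valid: the first is consistently about 20 units off target.
```

This is why consistent results alone prove nothing: the first series would pass any repeatability check while systematically missing the mark.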
Many online government public services are based on reliable data and pay no attention to the validity of the results (first figure, “reliable but not valid”).
What can government agencies use to balance reliability and validity when it comes to public services? The answer lies in Design Thinking and abductive reasoning.
…Design thinking helps agencies go back to the basics of what citizens need from their governments. It can be used to develop both reliable and valid online public services that are able to satisfy those needs…
As Government accelerates towards a world of public services that are digital by default, is this going to deliver the kind of digital services that move the public with them?
To find out, thinkpublic partnered with Consumer Focus (UK) to undertake detailed research into some of the fundamental questions and issues that users of digital public services are interested in. The findings have been published today in the Manifesto for Online Public Services, which sets out simple guiding principles to be placed at the heart of online service design.”
The transition towards transparency
More recently the Technology Strategy Board have been working with the likes of NERC, the Met Office, the Environment Agency and other public agencies to help solve business problems using environmental data…
It goes without saying that data won’t leap up and create any value by itself, any more than a pile of discarded parts outside a factory will assemble itself into a car. We’ve found that the secret of successful open data innovation is to work with people trying to solve a specific problem. Simply releasing the data is not enough. See below a summary of our Do’s and Don’ts of opening up data:
Do…
- Make sure data quality is high (ODI Certificates can help!)
- Promote innovation using data sets. Transparency is only a means to an end
- Enhance communication with external innovators
- Make sure your co-creators are incentivised
- Get organised, create a community around an issue
- Pass on learnings to other similar organisations
- Experiment – open data requires new mindsets and business models
- Create safe spaces – Innovation Airlocks – to share and prototype with trusted partners
- Be brave – people may do things with the data that you don’t like
- Set out to create commercial or social value with data
Don’t…
- Just release data and expect people to understand or create with it. Publication is not the same as communication
- Wait for data requests, put the data out first informally
- Avoid challenges to current income streams
- Go straight for the finished article, use rapid prototyping
- Be put off by the tensions between confidentiality, data protection and publishing
- Wait for a big budget or formal process; start big things with small amounts now
- Be technology led, be business led instead
- Expect the community to entirely self-manage
- Restrict open data to the IT literate – create interdisciplinary partnerships
- Get caught in the false dichotomy that is commercial vs. social
In summary we believe we need to assume openness as the default (for organisations that is, not individuals) and secrecy as the exception – the exact opposite to how most commercial organisations currently operate. …”
Undefined By Data: A Survey of Big Data Definitions
Using Participatory Crowdsourcing in South Africa to Create a Safer Living Environment
The study illustrates how participatory crowdsourcing (specifically humans as sensors) can be used as a Smart City initiative focusing on public safety by illustrating what is required to contribute to the Smart City, and developing a roadmap in the form of a model to assist decision making when selecting an optimal crowdsourcing initiative. Public safety data quality criteria were developed to assess and identify the problems affecting data quality.
This study is guided by design science methodology and applies three driving theories: the Data Information Knowledge Action Result (DIKAR) model, the characteristics of a Smart City, and a credible Data Quality Framework. Four critical success factors were developed to ensure high quality public safety data is collected through participatory crowdsourcing utilising voice technologies.”
Mobile phone data are a treasure-trove for development
Paul van der Boor and Amy Wesolowski in SciDevNet: “Each of us generates streams of digital information — a digital ‘exhaust trail’ that provides real-time information to guide decisions that affect our lives. For example, Google informs us about traffic by using both its ‘My Location’ feature on mobile phones and third-party databases to aggregate location data. BBVA, one of Spain’s largest banks, analyses transactions such as credit card payments as well as ATM withdrawals to find out when and where peak spending occurs. This type of data harvest is of great value. But, often, there is so much data that its owners lack the know-how to process it and fail to realise its potential value to policymakers.
Meanwhile, many countries, particularly in the developing world, have a dearth of information. In resource-poor nations, the public sector often lives in an analogue world where piles of paper impede operations and policymakers are hindered by uncertainty about their own strengths and capabilities. Nonetheless, mobile phones have quickly pervaded the lives of even the poorest: 75 per cent of the world’s 5.5 billion mobile subscriptions are in emerging markets. These people are also generating digital trails of anything from their movements to mobile phone top-up patterns. It may seem that putting this information to use would take vast analytical capacity. But using relatively simple methods, researchers can analyse existing mobile phone data, especially in poor countries, to improve decision-making.
Think of existing, available data as low-hanging fruit that we — two graduate students — could analyse in less than a month. This is not a test of data-scientist prowess, but more a way of saying that anyone could do it.
There are three areas that should be ‘low-hanging fruit’ in terms of their potential to dramatically improve decision-making in information-poor countries: coupling healthcare data with mobile phone data to predict disease outbreaks; using mobile phone money transactions and top-up data to assess economic growth; and predicting travel patterns after a natural disaster using historical movement patterns from mobile phone data to design robust response programmes.
Another possibility is using call-data records to analyse urban movement to identify traffic congestion points. Nationally, this can be used to prioritise infrastructure projects such as road expansion and bridge building.
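A minimal sketch of that idea, with invented records and an arbitrary threshold: count call-data pings per (tower, hour) cell and flag the busiest cells as candidate congestion points.

```python
from collections import Counter

# Invented call-data records (CDRs), reduced to (tower_id, hour-of-day).
cdrs = [
    ("tower_12", 8), ("tower_12", 8), ("tower_12", 8),
    ("tower_07", 8), ("tower_12", 14),
]

def congestion_hotspots(records, threshold=3):
    """Flag (tower, hour) cells whose ping count reaches the threshold --
    a crude proxy for where and when people concentrate."""
    counts = Counter(records)
    return [cell for cell, n in counts.items() if n >= threshold]

hotspots = congestion_hotspots(cdrs)
# hotspots == [("tower_12", 8)]
```

Real analyses are far more involved – they must correct for tower density, subscriber bias and privacy constraints – but this “relatively simple method” captures the basic aggregation step.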
The information that these analyses could provide would be lifesaving — not just informative or revenue-increasing, like much of this work currently performed in developed countries.
But some work of high social value is being done. For example, different teams of European and US researchers are trying to estimate the links between mobile phone use and regional economic development. They are using various techniques, such as merging night-time satellite imagery from NASA with mobile phone data to create behavioural fingerprints. They have found that this may be a cost-effective way to understand a country’s economic activity and, potentially, guide government spending.
Another example is given by researchers (including one of this article’s authors) who have analysed call-data records from subscribers in Kenya to understand malaria transmission within the country and design better strategies for its elimination. [1]
In this study, published in Science, the location data of the mobile phones of more than 14 million Kenyan subscribers was combined with national malaria prevalence data. After identifying the sources and sinks of malaria parasites and overlaying these with phone movements, analysis was used to identify likely transmission corridors. UK scientists later used similar methods to create different epidemic scenarios for the Côte d’Ivoire.”
Making All Voices Count
Launch of Making All Voices Count: “Making All Voices Count is a global initiative that supports innovation, scaling-up, and research to deepen existing innovations and help harness new technologies to enable citizen engagement and government responsiveness….Solvable problems need not remain unsolved. Democratic systems in the 21st century continue to be inhibited by 19th century timescales, with only occasional opportunities for citizens to express their views formally, such as during elections. In this century, many citizens have access to numerous tools that enable them to express their views – and measure government performance – in real time.
For example, online reporting platforms enable citizens to monitor the election process by reporting intimidation, vote buying, bias and misinformation; access to mobile technology allows citizens to update water suppliers on gaps in service delivery; and crisis information can be crowdsourced via eyewitness reports of violence, submitted by email and SMS.
The rise of mobile communication, the installation of broadband and the fast-growing availability of open data, offer tremendous opportunities for data journalism and new media channels. They can inspire governments to develop new ways to fight corruption and respond to citizens efficiently, effectively and fairly. In short, developments in technology and innovation mean that government and citizens can interact like never before.
Making All Voices Count is about seizing this moment to strengthen our commitments to promote transparency, fight corruption, empower citizens, and harness the power of new technologies to make government more effective and accountable.
The programme specifically aims to address the following barriers that weaken the link between governments and citizens:
- Citizens lack incentives: Citizens may not have the necessary incentives to express their feedback on government performance – due to a sense of powerlessness, distrust in the government, fear of retribution, or lack of reliable information
- Governments lack incentives: At the same time, governments need incentives to respond to citizen input whenever possible and to leverage citizen participation. The government’s response to citizens should be reinforced by proactive, public communication. This initiative will help create incentives for government to respond. Where government responds effectively, citizens’ confidence in government performance and approval ratings are likely to increase
- Governments lack the ability to translate citizen feedback into action: This could be due to anything from political constraints to a lack of skills and systems. Governments need better tools to effectively analyze and translate citizen input into information that will lead to solutions and shape resource allocation. Once captured, citizens’ feedback (on their experiences with government performance) must be communicated so as to engage both the government and the broader public in finding a solution.
- Citizens lack meaningful opportunities: Citizens need greater access to better tools and know-how to easily engage with government in a way that results in government action and citizen empowerment”