Toward an Open Data Demand Assessment and Segmentation Methodology


Stefaan Verhulst and Andrew Young at IADB: “Across the world, significant time and resources are being invested in making government data accessible to all with the broad goal of improving people’s lives. Evidence of open data’s impact – on improving governance, empowering citizens, creating economic opportunity, and solving public problems – is emerging and is largely encouraging. Yet much of the potential value of open data remains untapped, in part because we often do not understand who is using open data or, more importantly, who is not using open data but could benefit from the insights it may generate. By identifying, prioritizing, segmenting, and engaging with the actual and future demand for open data in a systemic and systematic way, practitioners can ensure that open data is more targeted. Understanding and meeting the demand for open data can increase overall impact and return on investment of public funds.

The GovLab, in partnership with the Inter-American Development Bank, and with the support of the French Development Agency, developed the Open Data Demand Assessment and Segmentation Methodology to provide open data policymakers and practitioners with an approach for identifying, segmenting, and engaging with demand. This process specifically seeks to empower data champions within public agencies who want to improve their data’s ability to improve people’s lives….(More)”.

New Urban Centres Database sets new standards for information on cities at global scale


EU Science Hub: “Data analysis highlights very diverse development patterns and inequalities across cities and world regions.

Building on the Global Human Settlement Layer (GHSL), the new database provides more detailed information on the cities’ location and size as well as characteristics such as greenness, night-time light emission, population size, the built-up areas exposed to natural hazards, and travel time to the capital city.

For several of these attributes, the database contains information recorded over time, dating as far back as 1975. 

Responding to a lack of consistent data, or data only limited to large cities, the Urban Centre Database now makes it possible to map, classify and count all human settlements in the world in a standardised way.

An analysis of the data reveals very different development patterns in the different parts of the world.

“The data shows that in the low-income countries, high population growth has resulted only into moderate increases in the built-up areas, while in the high-income countries, moderate population growth has resulted into very big increases in the built-up areas. In practice, cities have grown more in size in richer countries, with respect to poorer countries where the populations are growing faster”, said JRC researcher Thomas Kemper.

According to JRC scientists, around 75% of the global population now live in cities, towns or suburbs….

The Urban Centre Database provides new open data supporting the monitoring of the UN Sustainable Development Goals, the UN’s New Urban Agenda and the Sendai Framework for Disaster Risk Reduction.

The main findings based on the Urban Centre Database are summarised in a new edition of the Atlas of the Human Planet, published together with the database….(More)”.

“Giving something back”: A systematic review and ethical enquiry into public views on the use of patient data for research in the United Kingdom and the Republic of Ireland


Paper by Jessica Stockdale, Jackie Cassell and Elizabeth Ford: “The use of patients’ medical data for secondary purposes such as health research, audit, and service planning is well established in the UK, and technological innovation in analytical methods for new discoveries using these data resources is developing quickly. Data scientists have developed, and are improving, many ways to extract and process information in medical records. This continues to lead to an exciting range of health related discoveries, improving population health and saving lives. Nevertheless, as the development of analytic technologies accelerates, the decision-making and governance environment as well as public views and understanding about this work, has been lagging behind1.

Public opinion and data use

A range of small studies canvassing patient views, mainly in the USA, have found an overall positive orientation to the use of patient data for societal benefit27. However, recent case studies, like NHS England’s ill-fated Care.data scheme, indicate that certain schemes for secondary data use can prove unpopular in the UK. Launched in 2013, Care.data aimed to extract and upload the whole population’s general practice patient records to a central database for prevalence studies and service planning8. Despite the stated intention of Care.data to “make major advances in quality and patient safety”8, this programme was met with a widely reported public outcry leading to its suspension and eventual closure in 2016. Several factors may have been involved in this failure, from poor public communication about the project, to a lack of social licence9, or, as pressure group MedConfidential suggests, dislike of selling data to profit-making companies10. However, beyond these specific explanations for the project’s failure, what ignited public controversy was a concern with the impact that its aim to collect and share data on a large scale might have on patient privacy. The case of Care.data indicates a reluctance on the part of the public to share their patient data, and it is still not wholly clear whether the public are willing to accept future attempts at extracting and linking large datasets of medical information. The picture of mixed opinion makes taking an evidence-based position, drawing on social consensus, difficult for legislators, regulators, and data custodians, who may respond to personal or media-generated perceptions of public views. However, despite the differing results of studies canvassing public views, we hypothesise that there may be underlying ethical principles that could be extracted from the literature on public views, which may provide guidance to policy-makers for future data-sharing….(More)”.

EU negotiators agree on new rules for sharing of public sector data


European Commission Press Release: “Negotiators from the European Parliament, the Council of the EU and the Commission have reached an agreement on a revised directive that will facilitate the availability and re-use of public sector data.

Data is the fuel that drives the growth of many digital products and services. Making sure that high-quality, high-value data from publicly funded services is widely and freely available is a key factor in accelerating European innovation in highly competitive fields such as artificial intelligence, which require access to vast amounts of high-quality data.

In full compliance with the EU General Data Protection Regulation, the new Directive on Open Data and Public Sector Information (PSI) updates the framework setting out the conditions under which public sector data (which can be anything from anonymised personal data on household energy use to general information about national education or literacy levels) should be made available for re-use, with a particular focus on the increasing amounts of high-value data now available.

Vice-President for the Digital Single Market Andrus Ansip said: “Data is increasingly the lifeblood of today’s economy and unlocking the potential of public open data can bring significant economic benefits. The total direct economic value of public sector information and data from public undertakings is expected to increase from €52 billion in 2018 to €194 billion by 2030. With these new rules in place, we will ensure that we can make the most of this growth.”

Commissioner for Digital Economy and Society Mariya Gabriel said: “Public sector information has already been paid for by the taxpayer. Making it more open for re-use benefits the European data economy by enabling new innovative products and services, for example based on artificial intelligence technologies. But beyond the economy, open data from the public sector is also important for our democracy and society because it increases transparency and supports a facts-based public debate.”

As part of the EU Open Data policy, rules are in place to encourage Member States to facilitate the re-use of data from the public sector with minimal or no legal, technical and financial constraints. But the digital world has changed dramatically since they were first introduced in 2003.

What do the new rules cover?

  • All public sector content that can be accessed under national access to documents rules is in principle freely available for re-use. Public sector bodies will not be able to charge more than the marginal cost for the re-use of their data, except in very limited cases. This will allow more SMEs and start-ups to enter new markets in providing data-based products and services.
  • A particular focus will be placed on high-value datasets such as statistics or geospatial data. These datasets have a high commercial potential, and can speed up the emergence of a wide variety of value-added information products and services.
  • Public service companies in the transport and utilities sector generate valuable data. The decision on whether or not their data has to be made available is covered by different national or European rules, but when their data is available for re-use, they will now be covered by the Open Data and Public Sector Information Directive. This means they will have to comply with the principles of the Directive and ensure the use of appropriate data formats and dissemination methods, while still being able to set reasonable charges to recover related costs.
  • Some public bodies strike complex data deals with private companies, which can potentially lead to public sector information being ‘locked in’. Safeguards will therefore be put in place to reinforce transparency and to limit the conclusion of agreements which could lead to exclusive re-use of public sector data by private partners.
  • More real-time data, available via Application Programming Interfaces (APIs), will allow companies, especially start-ups, to develop innovative products and services, e.g. mobility apps. Publicly-funded research data is also being brought into the scope of the directive: Member States will be required to develop policies for open access to publicly funded research data while harmonised rules on re-use will be applied to all publicly-funded research data which is made accessible via repositories….(More)”.
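The API provision in the last bullet can be illustrated with a short sketch of what a re-user might do with a portal’s catalogue endpoint. The response shape and field names below are hypothetical (no specific portal defines this schema); a real portal would document its own.

```python
import json

# Hypothetical JSON payload, shaped like a typical open data portal's
# catalogue API response. Field names are illustrative, not taken from
# any specific portal.
SAMPLE_RESPONSE = """
{
  "datasets": [
    {"title": "Real-time bus positions", "format": "JSON", "updated": "2019-01-15T09:00:00Z"},
    {"title": "Air quality sensors", "format": "CSV", "updated": "2019-01-15T08:55:00Z"},
    {"title": "Road works", "format": "XML", "updated": "2019-01-14T17:30:00Z"}
  ]
}
"""

def summarize_catalogue(payload: str) -> dict:
    """Count catalogue entries per data format, as a mobility-app
    developer scouting a portal for machine-readable feeds might."""
    catalogue = json.loads(payload)
    counts: dict = {}
    for entry in catalogue["datasets"]:
        counts[entry["format"]] = counts.get(entry["format"], 0) + 1
    return counts

print(summarize_catalogue(SAMPLE_RESPONSE))
# prints {'JSON': 1, 'CSV': 1, 'XML': 1}
```

In practice the payload would come from an HTTP request to the portal’s API rather than a string literal; the point is that standardised, machine-readable responses are what make such re-use cheap for start-ups.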

Info We Trust: How to Inspire the World with Data


Book by R.J. Andrews: “How do we create new ways of looking at the world? Join award-winning data storyteller RJ Andrews as he pushes beyond the usual how-to, and takes you on an adventure into the rich art of informing.

Creating Info We Trust is a craft that puts the world into forms that are strong and true.  It begins with maps, diagrams, and charts — but must push further than dry defaults to be truly effective. How do we attract attention? How can we offer audiences valuable experiences worth their time? How can we help people access complexity?

Dark and mysterious, but full of potential, data is the raw material from which new understanding can emerge. Become a hero of the information age as you learn how to dip into the chaos of data and emerge with new understanding that can entertain, improve, and inspire. Whether you call the craft data storytelling, data visualization, data journalism, dashboard design, or infographic creation — what matters is that you are courageously confronting the chaos of it all in order to improve how people see the world. Info We Trust is written for everyone who straddles the domains of data and people: data visualization professionals, analysts, and all who are enthusiastic for seeing the world in new ways.

This book draws from the entirety of human experience, quantitative and poetic. It teaches advanced techniques, such as visual metaphor and data transformations, in order to create more human presentations of data.  It also shows how we can learn from print advertising, engineering, museum curation, and mythology archetypes. This human-centered approach works with machines to design information for people. Advance your understanding by learning from a broad tradition of putting things “in formation” to create new and wonderful ways of opening our eyes to the world….(More)”.

Societal costs and benefits of high-value open government data: a case study in the Netherlands


Paper by F.M. Welle Donker and B. van Loenen: “Much research has emphasised the benefits of open government data, and especially high-value data. The G8 Open Data Charter defines high-value data as data that improve democracy and encourage the innovative reuse of the particular data. Thus, governments worldwide invest resources to identify potential high-value datasets and to publish these data as open data. However, while the benefits of open data are well researched, the costs of publishing data as open data are less researched. This research examines the relationship between the costs of making data suitable for publication as (linked) open data and the societal benefits thereof. A case study of five high-value datasets was carried out in the Netherlands to provide a societal cost-benefit analysis of open high-value data. Different options were investigated, ranging from not publishing the dataset at all to publishing the dataset as linked open data.

In general, it can be concluded that the societal benefits of (linked) open data are higher than the costs. The case studies show that there are differences between the datasets. In many cases, costs for open data are an integral part of general data management costs and hardly lead to additional costs. In certain cases, however, the costs to anonymize/aggregate the data are high compared to the potential value of an open data version of the dataset. Although, for these datasets, this leads to a less favourable relationship between costs and benefits, the societal benefits would still be higher than without an open data version….(More)”.

Index: Open Data


By Alexandra Shaw, Michelle Winowatan, Andrew Young, and Stefaan Verhulst

The Living Library Index – inspired by the Harper’s Index – provides important statistics and highlights global trends in governance innovation. This installment focuses on open data and was originally published in 2018.

Value and Impact

  • The projected year by which all EU28+ countries will have a fully operating open data portal: 2020

  • Projected market size of open data in Europe in 2020, a 36.9% increase over 2016: EUR 75.7 billion

Public Views on and Use of Open Government Data

  • Number of Americans who do not trust the federal government or social media sites to protect their data: Approximately 50%

  • Key findings from The Economist Intelligence Unit report on Open Government Data Demand:

    • Percentage of respondents who say the key reason why governments open up their data is to create greater trust between the government and citizens: 70%

    • Percentage of respondents who say OGD plays an important role in improving lives of citizens: 78%

    • Percentage of respondents who say OGD helps with daily decision making especially for transportation, education, environment: 53%

    • Percentage of respondents who cite lack of awareness about OGD and its potential use and benefits as the greatest barrier to usage: 50%

    • Percentage of respondents who say they lack access to usable and relevant data: 31%

    • Percentage of respondents who think they don’t have sufficient technical skills to use open government data: 25%

    • Percentage of respondents who feel the number of OGD apps available is insufficient, indicating an opportunity for app developers: 20%

    • Percentage of respondents who say OGD has the potential to generate economic value and new business opportunity: 61%

    • Percentage of respondents who say they don’t trust governments to keep data safe, protected, and anonymized: 19%

Efforts and Involvement

  • Time that’s passed since open government advocates convened to create a set of principles for open government data – the moment that started the open government data movement: 10 years

  • Countries participating in the Open Government Partnership today: 79 OGP participating countries and 20 subnational governments

  • Percentage of “open data readiness” in Europe according to European Data Portal: 72%

    • Open data readiness consists of four indicators: presence of policy, national coordination, licensing norms, and use of data.

  • Number of U.S. cities with Open Data portals: 27

  • Number of governments who have adopted the International Open Data Charter: 62

  • Number of non-state organizations endorsing the International Open Data Charter: 57

  • Number of countries analyzed by the Open Data Index: 94

  • Number of Latin American countries that do not have open data portals as of 2017: 4 total – Belize, Guatemala, Honduras and Nicaragua

  • Number of cities participating in the Open Data Census: 39

Demand for Open Data

  • Open data demand measured by frequency of open government data use according to The Economist Intelligence Unit report:

    • Australia

      • Monthly: 15% of respondents

      • Quarterly: 22% of respondents

      • Annually: 10% of respondents

    • Finland

      • Monthly: 28% of respondents

      • Quarterly: 18% of respondents

      • Annually: 20% of respondents

    • France

      • Monthly: 27% of respondents

      • Quarterly: 17% of respondents

      • Annually: 19% of respondents


      • Monthly: 29% of respondents

      • Quarterly: 20% of respondents

      • Annually: 10% of respondents

    • Singapore

      • Monthly: 28% of respondents

      • Quarterly: 15% of respondents

      • Annually: 17% of respondents 

    • UK

      • Monthly: 23% of respondents

      • Quarterly: 21% of respondents

      • Annually: 15% of respondents

    • US

      • Monthly: 16% of respondents

      • Quarterly: 15% of respondents

      • Annually: 20% of respondents

  • Number of FOIA requests received in the US for fiscal year 2017: 818,271

  • Number of FOIA requests processed in the US for fiscal year 2017: 823,222

  • Distribution of FOIA requests in 2017 among the top 5 agencies with the highest number of requests:

    • DHS: 45%

    • DOJ: 10%

    • NARA: 7%

    • DOD: 7%

    • HHS: 4%

Examining Datasets

  • Country with highest index score according to ODB Leaders Edition: Canada (76 out of 100)

  • Country with lowest index score according to ODB Leaders Edition: Sierra Leone (22 out of 100)

  • Proportion of datasets that are open in the top 30 governments according to ODB Leaders Edition: Fewer than 1 in 5

  • Average percentage of datasets that are open in the top 30 open data governments according to ODB Leaders Edition: 19%

  • Average percentage of datasets that are open in the top 30 open data governments according to ODB Leaders Edition by sector/subject:

    • Budget: 30%

    • Companies: 13%

    • Contracts: 27%

    • Crime: 17%

    • Education: 13%

    • Elections: 17%

    • Environment: 20%

    • Health: 17%

    • Land: 7%

    • Legislation: 13%

    • Maps: 20%

    • Spending: 13%

    • Statistics: 27%

    • Trade: 23%

    • Transport: 30%

  • Percentage of countries that release data on government spending according to ODB Leaders Edition: 13%

  • Percentage of government data that is updated at regular intervals according to ODB Leaders Edition: 74%

  • Percentage of datasets classed as “open” in 94 places worldwide analyzed by the Open Data Index: 11%

  • Percentage of open datasets in the Caribbean, according to Open Data Census: 7%

  • Number of companies whose data is available through OpenCorporates: 158,589,950

City Open Data

  • Singapore

    • Number of datasets published in Singapore: 1,480

    • Percentage of datasets with standardized format: 35%

    • Percentage of datasets made as raw as possible: 25%

  • Barcelona

    • Number of datasets published in Barcelona: 443

    • Open data demand in Barcelona measured by:

      • Number of unique sessions in the month of September 2018: 5,401

    • Quality of datasets published in Barcelona according to Tim Berners-Lee’s 5-star Open Data scheme: 3 stars

  • London

    • Number of datasets published in London: 762

    • Number of data requests since October 2014: 325

  • Bandung

    • Number of datasets published in Bandung: 1,417

  • Buenos Aires

    • Number of datasets published in Buenos Aires: 216

  • Dubai

    • Number of datasets published in Dubai: 267

  • Melbourne

    • Number of datasets published in Melbourne: 199

Sources

  • About OGP, Open Government Partnership. 2018.  

The Paradox of Police Data


Stacy Wood in KULA: knowledge creation, dissemination, and preservation studies: “This paper considers the history and politics of ‘police data.’ Police data, I contend, is a category of endangered data reliant on voluntary and inconsistent reporting by law enforcement agencies; it is also inconsistently described and routinely housed in systems that were not designed with long-term strategies for data preservation, curation or management in mind. Moreover, whereas US law enforcement agencies have, for over a century, produced and published a great deal of data about crime, data about the ways in which police officers spend their time and make decisions about resources—as well as information about patterns of individual officer behavior, use of force, and in-custody deaths—is difficult to find. This presents a paradoxical situation wherein vast stores of extant data are completely inaccessible to the public. This paradoxical state is not new, but the continuation of a long history co-constituted by technologies, epistemologies and context….(More)”.

Congress passes ‘Open Government Data Act’ to make open data part of the US Code


Melisha Dsouza at Packt: “22nd December marked a win for the U.S. government in terms of efficiency, accountability, and transparency of open data. Following the Senate vote held on 19th December, Congress passed the Foundations for Evidence-Based Policymaking (FEBP) Act (H.R. 4174, S. 2046). Title II of this package is the Open, Public, Electronic and Necessary (OPEN) Government Data Act, which requires all non-sensitive government data to be made available in open and machine-readable formats by default.

The federal government possesses a huge amount of public data which should ideally be used to improve government services and promote private sector innovation. The open data proposal will mandate that federal agencies publish their information online, using machine-readable data formats.

Here are some of the key points that the Open Government Data Act seeks to do:

  • Define open data without locking in yesterday’s technology.
  • Create minimal standards for making federal government data available to the public.
  • Require the federal government to use open data for better decision making.
  • Ensure accountability by requiring regular oversight.
  • Establish and formalize Chief Data Officers (CDO) at federal agencies with data governance and implementation responsibilities.
  • Require agencies to maintain and publish a comprehensive data inventory of all data assets, helping open data advocates identify key government information resources and transform them from documents and siloed databases into open data….(More)”.
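The inventory requirement in the last bullet is, in practice, met with a machine-readable catalogue file (US federal agencies already publish a data.json inventory). The sketch below is a simplified illustration of checking inventory entries for a minimal set of metadata fields; the field list is a hypothetical minimum, not the full federal metadata schema.

```python
# Illustrative check that data inventory entries carry the handful of
# fields a machine-readable catalogue needs. This is a simplified
# sketch; the real federal metadata schema defines many more fields.
REQUIRED_FIELDS = {"title", "description", "identifier", "accessLevel"}

def missing_fields(entry: dict) -> set:
    """Return the required fields absent from an inventory entry."""
    return REQUIRED_FIELDS - entry.keys()

inventory = [
    {
        "title": "Agency grant awards",
        "description": "All grants awarded since 2015.",
        "identifier": "agency-grants-2015",
        "accessLevel": "public",
    },
    {
        # A siloed legacy record, still missing open-data metadata.
        "title": "Regional office contacts",
        "identifier": "regional-contacts",
    },
]

for entry in inventory:
    gaps = missing_fields(entry)
    status = "ok" if not gaps else "missing: " + ", ".join(sorted(gaps))
    print(f"{entry['identifier']}: {status}")
```

An inventory published in a form like this is what lets advocates (and other agencies) discover datasets programmatically rather than by combing through documents.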

For a more extensive discussion see: Congress votes to make open government data the default in the United States by Alex Howard.

It’s time for a Bill of Data Rights


Article by Martin Tisne: “…The proliferation of data in recent decades has led some reformers to a rallying cry: “You own your data!” Eric Posner of the University of Chicago, Eric Weyl of Microsoft Research, and virtual-reality guru Jaron Lanier, among others, argue that data should be treated as a possession. Mark Zuckerberg, the founder and head of Facebook, says so as well. Facebook now says that you “own all of the content and information you post on Facebook” and “can control how it is shared.” The Financial Times argues that “a key part of the answer lies in giving consumers ownership of their own personal data.” In a recent speech, Tim Cook, Apple’s CEO, agreed, saying, “Companies should recognize that data belongs to users.”

This essay argues that “data ownership” is a flawed, counterproductive way of thinking about data. It not only does not fix existing problems; it creates new ones. Instead, we need a framework that gives people rights to stipulate how their data is used without requiring them to take ownership of it themselves….

The notion of “ownership” is appealing because it suggests giving you power and control over your data. But owning and “renting” out data is a bad analogy. Control over how particular bits of data are used is only one problem among many. The real questions are questions about how data shapes society and individuals. Rachel’s story will show us why data rights are important and how they might work to protect not just Rachel as an individual, but society as a whole.

Tomorrow never knows

To see why data ownership is a flawed concept, first think about this article you’re reading. The very act of opening it on an electronic device created data—an entry in your browser’s history, cookies the website sent to your browser, an entry in the website’s server log to record a visit from your IP address. It’s virtually impossible to do anything online—reading, shopping, or even just going somewhere with an internet-connected phone in your pocket—without leaving a “digital shadow” behind. These shadows cannot be owned—the way you own, say, a bicycle—any more than can the ephemeral patches of shade that follow you around on sunny days.

Your data on its own is not very useful to a marketer or an insurer. Analyzed in conjunction with similar data from thousands of other people, however, it feeds algorithms and bucketizes you (e.g., “heavy smoker with a drink habit” or “healthy runner, always on time”). If an algorithm is unfair—if, for example, it wrongly classifies you as a health risk because it was trained on a skewed data set or simply because you’re an outlier—then letting you “own” your data won’t make it fair. The only way to avoid being affected by the algorithm would be to never, ever give anyone access to your data. But even if you tried to hoard data that pertains to you, corporations and governments with access to large amounts of data about other people could use that data to make inferences about you. Data is not a neutral impression of reality. The creation and consumption of data reflects how power is distributed in society. …(More)”.