Index: Open Data


By Alexandra Shaw, Michelle Winowatan, Andrew Young, and Stefaan Verhulst

The Living Library Index – inspired by the Harper’s Index – provides important statistics and highlights global trends in governance innovation. This installment focuses on open data and was originally published in 2018.

Value and Impact

  • The projected year at which all 28+ EU member countries will have a fully operating open data portal: 2020

  • Between 2016 and 2020, the market size of open data in Europe is expected to increase by 36.9%, and reach this value by 2020: EUR 75.7 billion

Public Views on and Use of Open Government Data

  • Share of Americans who do not trust the federal government or social media sites to protect their data: Approximately 50%

  • Key findings from The Economist Intelligence Unit report on Open Government Data Demand:

    • Percentage of respondents who say the key reason why governments open up their data is to create greater trust between the government and citizens: 70%

    • Percentage of respondents who say OGD plays an important role in improving lives of citizens: 78%

    • Percentage of respondents who say OGD helps with daily decision making especially for transportation, education, environment: 53%

    • Percentage of respondents who cite lack of awareness about OGD and its potential use and benefits as the greatest barrier to usage: 50%

    • Percentage of respondents who say they lack access to usable and relevant data: 31%

    • Percentage of respondents who think they don’t have sufficient technical skills to use open government data: 25%

    • Percentage of respondents who feel the number of OGD apps available is insufficient, indicating an opportunity for app developers: 20%

    • Percentage of respondents who say OGD has the potential to generate economic value and new business opportunity: 61%

    • Percentage of respondents who say they don’t trust governments to keep data safe, protected, and anonymized: 19%

Efforts and Involvement

  • Time that’s passed since open government advocates convened to create a set of principles for open government data – the instance that started the open government data movement: 10 years

  • Countries participating in the Open Government Partnership today: 79 OGP participating countries and 20 subnational governments

  • Percentage of “open data readiness” in Europe according to European Data Portal: 72%

    • Open data readiness consists of four indicators which are presence of policy, national coordination, licensing norms, and use of data.

  • Number of U.S. cities with Open Data portals: 27

  • Number of governments who have adopted the International Open Data Charter: 62

  • Number of non-state organizations endorsing the International Open Data Charter: 57

  • Number of countries analyzed by the Open Data Index: 94

  • Number of Latin American countries that do not have open data portals as of 2017: 4 total – Belize, Guatemala, Honduras and Nicaragua

  • Number of cities participating in the Open Data Census: 39

Demand for Open Data

  • Open data demand measured by frequency of open government data use according to The Economist Intelligence Unit report:

    • Australia

      • Monthly: 15% of respondents

      • Quarterly: 22% of respondents

      • Annually: 10% of respondents

    • Finland

      • Monthly: 28% of respondents

      • Quarterly: 18% of respondents

      • Annually: 20% of respondents

    • France

      • Monthly: 27% of respondents

      • Quarterly: 17% of respondents

      • Annually: 19% of respondents

    • India

      • Monthly: 29% of respondents

      • Quarterly: 20% of respondents

      • Annually: 10% of respondents

    • Singapore

      • Monthly: 28% of respondents

      • Quarterly: 15% of respondents

      • Annually: 17% of respondents 

    • UK

      • Monthly: 23% of respondents

      • Quarterly: 21% of respondents

      • Annually: 15% of respondents

    • US

      • Monthly: 16% of respondents

      • Quarterly: 15% of respondents

      • Annually: 20% of respondents

  • Number of FOIA requests received in the US for fiscal year 2017: 818,271

  • Number of FOIA requests processed in the US for fiscal year 2017: 823,222

  • Distribution of FOIA requests in 2017 among the top 5 agencies with the highest number of requests:

    • DHS: 45%

    • DOJ: 10%

    • NARA: 7%

    • DOD: 7%

    • HHS: 4%
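
The per-country usage frequencies listed above can be tallied to compare overall demand. The following is a minimal Python sketch, assuming the monthly, quarterly and annual categories are mutually exclusive (the index does not say so explicitly); the variable names are illustrative only.

```python
# Shares of respondents (percent) using open government data
# monthly, quarterly and annually, as listed in the index above.
usage = {
    "Australia": (15, 22, 10),
    "Finland": (28, 18, 20),
    "France": (27, 17, 19),
    "India": (29, 20, 10),
    "Singapore": (28, 15, 17),
    "UK": (23, 21, 15),
    "US": (16, 15, 20),
}

# Assumption: the three categories do not overlap, so their sum
# approximates the share using OGD at least once a year.
totals = {country: sum(shares) for country, shares in usage.items()}

# Rank countries from highest to lowest combined share.
ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)

for country, total in ranked:
    print(f"{country}: {total}% of respondents")
```

Under that assumption, Finland shows the highest combined share (66%) and Australia the lowest (47%).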

Examining Datasets

  • Country with highest index score according to ODB Leaders Edition: Canada (76 out of 100)

  • Country with lowest index score according to ODB Leaders Edition: Sierra Leone (22 out of 100)

  • Share of datasets that are open in the top 30 governments according to ODB Leaders Edition: Fewer than 1 in 5

  • Average percentage of datasets that are open in the top 30 open data governments according to ODB Leaders Edition: 19%

  • Average percentage of datasets that are open in the top 30 open data governments according to ODB Leaders Edition by sector/subject:

    • Budget: 30%

    • Companies: 13%

    • Contracts: 27%

    • Crime: 17%

    • Education: 13%

    • Elections: 17%

    • Environment: 20%

    • Health: 17%

    • Land: 7%

    • Legislation: 13%

    • Maps: 20%

    • Spending: 13%

    • Statistics: 27%

    • Trade: 23%

    • Transport: 30%

  • Percentage of countries that release data on government spending according to ODB Leaders Edition: 13%

  • Percentage of government data that is updated at regular intervals according to ODB Leaders Edition: 74%

  • Number of datasets available through:

  • Percentage of datasets classed as “open” in 94 places worldwide analyzed by the Open Data Index: 11%

  • Percentage of open datasets in the Caribbean, according to Open Data Census: 7%

  • Number of companies whose data is available through OpenCorporates: 158,589,950

City Open Data

  • New York City

  • Singapore

    • Number of datasets published in Singapore: 1,480

    • Percentage of datasets with standardized format: 35%

    • Percentage of datasets made as raw as possible: 25%

  • Barcelona

    • Number of datasets published in Barcelona: 443

    • Open data demand in Barcelona measured by:

      • Number of unique sessions in the month of September 2018: 5,401

    • Quality of datasets published in Barcelona according to Tim Berners-Lee’s 5-star Open Data scheme: 3 stars

  • London

    • Number of datasets published in London: 762

    • Number of data requests since October 2014: 325

  • Bandung

    • Number of datasets published in Bandung: 1,417

  • Buenos Aires

    • Number of datasets published in Buenos Aires: 216

  • Dubai

    • Number of datasets published in Dubai: 267

  • Melbourne

    • Number of datasets published in Melbourne: 199

Sources

  • About OGP, Open Government Partnership. 2018.  

Digital Investigative Journalism


Book edited by Oliver Hahn and Florian Stalph: “In the post-digital era, investigative journalism around the world faces a revolutionary shift in the way information is gathered and interpreted. Reporters in the field are confronted with data sources, new logics of information dissemination, and a flood of disinformation. Investigative journalists are working with programmers, designers and scientists to develop innovative tools and hands-on approaches that assist them in disclosing the misuse of power and uncovering injustice.

This volume provides an overview of the most sophisticated techniques of digital investigative journalism: data and computational journalism, which investigates stories hidden in numbers; immersive journalism, which digs into virtual reality; drone journalism, which conquers hitherto inaccessible territories; visual and interactive journalism, which reforms storytelling with images and audience perspectives; and digital forensics and visual analytics, which help to authenticate digital content and identify sources in order to detect manipulation. All these techniques are discussed against the backdrop of international political scenarios and globally networked societies….(More)”.

The Social Fact: News and Knowledge in a Networked World


Book by John Wihbey: “While the public believes that journalism remains crucial for democracy, there is a general sense that the news media are performing this role poorly. In The Social Fact, John Wihbey makes the case that journalism can better serve democracy by focusing on ways of fostering social connection. Wihbey explores how the structure of news, information, and knowledge and their flow through society are changing, and he considers ways in which news media can demonstrate the highest possible societal value in the context of these changes.

Wihbey examines network science as well as the interplay between information and communications technologies (ICTs) and the structure of knowledge in society. He discusses the underlying patterns that characterize our increasingly networked world of information—with its viral phenomena and whiplash-inducing trends, its extremes and surprises. How can the traditional media world be reconciled with the world of social, peer-to-peer platforms, crowdsourcing, and user-generated content? Wihbey outlines a synthesis for news producers and advocates innovation in approach, form, and purpose. The Social Fact provides a valuable framework for doing audience-engaged media work of many kinds in our networked, hybrid media environment. It will be of interest to all those concerned about the future of news and public affairs….(More)”.

Time to step away from the ‘bright, shiny things’? Towards a sustainable model of journalism innovation in an era of perpetual change


Paper by Julie Posetti: “The news industry has a focus problem. ‘Shiny Things Syndrome’ – obsessive pursuit of technology in the absence of clear and research-informed strategies – is the diagnosis offered by participants in this research. The cure suggested involves a conscious shift by news publishers from being technology-led, to audience-focused and technology-empowered.

This report presents the first research from the Journalism Innovation Project anchored within the Reuters Institute for the Study of Journalism at the University of Oxford. It is based on analysis of discussions with 39 leading journalism innovators from around the world, representing 27 different news publishers. The main finding of this research is that relentless, high-speed pursuit of technology-driven innovation could be almost as dangerous as stagnation. While ‘random acts of innovation’, organic experimentation, and willingness to embrace new technology remain valuable features of an innovation culture, there is evidence of an increasingly urgent requirement for the cultivation of sustainable innovation frameworks and clear, longer-term strategies within news organisations.

Such a ‘pivot’ could also address the growing problem of burnout associated with ‘innovation fatigue’. To be effective, such strategies need to be focused on engaging audiences – the ‘end users’ – and they would benefit from research-informed innovation ‘indicators’.

The key themes identified in this report are:
a. The risks of ‘Shiny Things Syndrome’ and the impacts of ‘innovation fatigue’ in an era of perpetual change
b. Audiences: starting (again) with the end user
c. The need for a ‘user-led’ approach to researching journalism innovation and developing foundational frameworks to support it

Additionally, new journalism innovation considerations are noted, such as the implications of digital technologies’ ‘unintended consequences’, and the need to respond innovatively to media freedom threats – such as gendered online harassment, privacy breaches, and orchestrated disinformation campaigns….(More)”.

Lost and Saved . . . Again: The Moral Panic about the Loss of Community Takes Hold of Social Media


Keith N. Hampton and Barry Wellman in Contemporary Sociology: “Why does every generation believe that relationships were stronger and community better in the recent past? Lamenting about the loss of community, based on a selective perception of the present and an idealization of ‘traditional community,’ dims awareness of powerful inequalities and cleavages that have always pervaded human society and favors deterministic models over a nuanced understanding of how network affordances contribute to different outcomes. The bêtes noirs have varied according to the moral panic of the times: industrialization, bureaucratization, urbanization, capitalism, socialism, and technological developments have all been tabbed by such diverse commentators as Thomas Jefferson (1784), Karl Marx (1852), Louis Wirth (1938), Maurice Stein (1960), Robert Bellah et al. (1996), and Tom Brokaw (1998). Each time, observers look back nostalgically to what they supposed were the supportive, solidary communities of the previous generation. Since the advent of the internet, the moral panicers have seized on this technology as the latest cause of lost community, pointing with alarm to what digital technologies are doing to relationships. As the focus shifts to social media and mobile devices, the panic seems particularly acute….

Taylor Dotson’s (2017) recent book Technically Together has a broader timeline for the demise of community. He sees it as happening around the time the internet was popularized, with community even worse off as a result of Facebook and mobile devices. Dotson not only blames new technologies for the decline of community, but social theory, specifically the theory and the practice of ‘networked individualism’: the relational turn from bounded, densely knit local groups to multiple, partial, often far-flung social networks (Rainie and Wellman 2012). Dotson takes the admirable position that social science should do more to imagine different outcomes, new technological possibilities that can be created by tossing aside the trends of today and engineering social change through design….

Some alarm in the recognition that the nature of community is changing as technologies change is sensible, and we have no quarrel with the collective desire to have better, more supportive friends, families, and communities. As Dotson implies, the maneuverability in having one’s own individually networked community can come at the cost of local group solidarity. Indeed, we have also taken action that does more than pontificate to promote local community, building community on and offline (Hampton 2011).

Yet part of contemporary unease comes from a selective perception of the present and an idealization of other forms of community. There is nostalgia for a perfect form of community that never was. Longing for a time when the grass was ever greener dims an awareness of the powerful stresses and cleavages that have always pervaded human society. And advocates, such as Dotson (2017), who suggest the need to save a particular type of community at the expense of another, often do so blind of the potential tradeoffs….(More)”

The Nail Finds a Hammer: Self-Sovereign Identity, Design Principles, and Property Rights in the Developing World


Report by Michael Graglia, Christopher Mellon and Tim Robustelli: “Our interest in identity systems was an inevitable outgrowth of our earlier work on blockchain-based land registries. Property registries, which at the simplest level are ledgers of who has which rights to which asset, require a very secure and reliable means of identifying both people and properties. In the course of investigating solutions to that problem, we began to appreciate the broader challenges of digital identity and its role in international development. And the more we learned about digital identity, the more convinced we became of the need for self-sovereign identity, or SSI. This model, and the underlying principles of identity which it incorporates, will be described in detail in this paper.

We believe that the great potential of SSI is that it can make identity in the digital world function more like identity in the physical world, in which every person has a unique and persistent identity which is represented to others by means of both their physical attributes and a collection of credentials attested to by various external sources of authority. These credentials are stored and controlled by the identity holder—typically in a wallet—and presented to different people for different reasons at the identity holder’s discretion. Crucially, the identity holder controls what information to present based on the environment, trust level, and type of interaction. Moreover, their fundamental identity persists even though the credentials by which it is represented may change over time.

The digital incarnation of this model has many benefits, including both greatly improved privacy and security, and the ability to create more trustworthy online spaces. Social media and news sites, for example, might limit participation to users with verified identities, excluding bots and impersonators.

The need for identification in the physical world varies based on location and social context. We expect to walk in relative anonymity down a busy city street, but will show a driver’s license to enter a bar, and both a driver’s license and a birth certificate to apply for a passport. There are different levels of ID and supporting documents required for each activity. But in each case, access to personal information is controlled by the user who may choose whether or not to share it.

Self-sovereign identity gives users complete control of their own identities and related personal data, which sits encrypted in distributed storage instead of being stored by a third party in a central database. In older, “federated identity” models, a single account—a Google account, for example—might be used to log in to a number of third-party sites, like news sites or social media platforms. But in this model a third party brokers all of these ID transactions, meaning that in exchange for the convenience of having to remember fewer passwords, the user must sacrifice a degree of privacy.

A real world equivalent would be having to ask the state to share a copy of your driver’s license with the bar every time you wanted to prove that you were over the age of 21. SSI, in contrast, gives the user a portable, digital credential (like a driver’s license or some other document that proves your age), the authenticity of which can be securely validated via cryptography without the recipient having to check with the authority that issued it. This means that while the credential can be used to access many different sites and services, there is no third-party broker to track the services to which the user is authenticating. Furthermore, cryptographic techniques called “zero-knowledge proofs” (ZKPs) can be used to prove possession of a credential without revealing the credential itself. This makes it possible, for example, for users to prove that they are over the age of 21 without having to share their actual birth dates, which are both sensitive information and irrelevant to a binary, yes-or-no ID transaction….(More)”.
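
The idea that a credential can be validated cryptographically without contacting its issuer can be illustrated with a digital signature. The sketch below is not taken from the report and is not an SSI implementation: it uses textbook RSA with a deliberately tiny, unpadded toy key built from two known Mersenne primes, and the credential fields (`holder`, `over_21`) and function names are hypothetical. It also does not implement a zero-knowledge proof; it only shows an issuer attesting to a yes-or-no attribute rather than a birth date.

```python
import hashlib
import json

# Toy textbook-RSA key pair for illustration only.
# No padding, small modulus: NOT secure, never use in practice.
P = 2**127 - 1                      # known Mersenne prime
Q = 2**89 - 1                       # known Mersenne prime
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # issuer's private exponent (Python 3.8+)


def digest(credential: dict) -> int:
    """Hash a credential deterministically to an integer below N."""
    payload = json.dumps(credential, sort_keys=True).encode("utf-8")
    return int.from_bytes(hashlib.sha256(payload).digest(), "big") % N


def issue(credential: dict) -> int:
    """The issuer signs the credential once, using its private key."""
    return pow(digest(credential), D, N)


def verify(credential: dict, signature: int) -> bool:
    """A verifier checks the signature with only the public key (N, E)."""
    return pow(signature, E, N) == digest(credential)


# Hypothetical credential: it asserts an attribute, not a birth date.
credential = {"holder": "alice", "over_21": True}
signature = issue(credential)

# The verifier accepts the genuine credential offline.
print(verify(credential, signature))

# A credential altered after issuance no longer matches the signature.
tampered = {"holder": "mallory", "over_21": True}
print(verify(tampered, signature))
```

The verifier needs only the issuer's public key, so the issuer never learns where the credential is presented. A production system would rely on a vetted signature scheme and library rather than this toy arithmetic.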

A Behavioral Economics Approach to Digitalisation


Paper by Dirk Beerbaum and Julia M. Puaschunder: “A growing body of academic research in the field of behavioural economics, political science and psychology demonstrate how an invisible hand can nudge people’s decisions towards a preferred option. Contrary to the assumptions of the neoclassical economics, supporters of nudging argue that people have problems coping with a complex world, because of their limited knowledge and their restricted rationality. Technological improvement in the age of information has increased the possibilities to control the innocent social media users or penalise private investors and reap the benefits of their existence in hidden persuasion and discrimination. Nudging enables nudgers to plunder the simple uneducated and uninformed citizen and investor, who is neither aware of the nudging strategies nor able to oversee the tactics used by the nudgers (Puaschunder 2017a, b; 2018a, b).

The nudgers are thereby legally protected by democratically assigned positions they hold. The law of motion of the nudging societies holds an unequal concentration of power of those who have access to compiled data and coding rules, relevant for political power and influencing the investor’s decision usefulness (Puaschunder 2017a, b; 2018a, b). This paper takes as a case the “transparency technology XBRL (eXtensible Business Reporting Language)” (Sunstein 2013, 20), which should make data more accessible as well as usable for private investors. It is part of the choice architecture on regulation by governments (Sunstein 2013). However, XBRL is bounded to a taxonomy (Piechocki and Felden 2007).

Considering theoretical literature and field research, a representation issue (Beerbaum, Piechocki and Weber 2017) for principles-based accounting taxonomies exists, which intelligent machines applying Artificial Intelligence (AI) (Mwilu, Prat and Comyn-Wattiau 2015) nudge to facilitate decision usefulness. This paper conceptualizes ethical questions arising from the taxonomy engineering based on machine learning systems: Should the objective of the coding rule be to support or to influence human decision making or rational artificiality? This paper therefore advocates for a democratisation of information, education and transparency about nudges and coding rules (Puaschunder 2017a, b; 2018a, b)…(More)”.

Mapping humanitarian action on Instagram


Report by Anthony McCosker, Jane Farmer, Tracy De Cotta, Peter Kamstra, Natalie Jovanovski, Arezou Soltani Panah, Zoe Teh, and Sam Wilson: “Every day, people undertake many different kinds of voluntary service and humanitarian action. This might involve fundraising and charity work, giving time, helping or inspiring others, or promoting causes. However, because so much of the research on volunteering and humanitarian action focuses on formal activities along with large-scale campaigns and global crisis events, we know very little about what people are doing informally and in their local community.

Humanitarianism is changing with the digital age and with new modes of networked communication and interaction. The research presented in this report offers new insights into the way people engage with humanitarian activities in their local contexts and everyday lives. We turned to Instagram as a novel data source that can offer insights into everyday humanitarian action. As a popular visual social media platform, Instagram provides a certain kind of intimate access to the humanitarian acts and the social good values that people want to capture, share and promote to others.

We sought to develop a typology of everyday humanitarian actions, the targets of those actions and situations and contexts they happen in through an analysis of Instagram data. Our research methodology and findings unlock a new approach to understanding humanitarian action in situ, and opens opportunities for organisation-led campaigns to improve and support self-mobilisation.

By using geographical information provided by Instagram users when they post, we demonstrate the relationships between humanitarian activities and locations across Victoria, Australia, illustrating the heavy concentration of activity within Melbourne’s CBD and inner suburbs. The data shows patterns in the kinds of actions, the situations in which they occur, and the humanitarian targets and values shared. On the basis of the findings, the report points to next steps in how humanitarian and charity organisations can innovate using social data to build a digitally active humanitarian movement by mapping and amplifying and better understanding humanitarian deeds where and when they happen. While the analysis offers many nuanced insights into everyday humanitarian activity, we highlight three key findings.

  • When people post to Instagram about humanitarian action they are most often promoting causes and activities, fundraising and giving time
  • Groups give time (volunteering, giving), individuals give or raise money (charity, fundraising)
  • Humanitarian action posted to Instagram is heavily concentrated around Melbourne CBD and inner suburbs, with a focus on public spaces, restaurant and entertainment precincts along the Yarra River and Swanston Street…(More)”.

Folksonomies: how to do things with words on social media


Oxford Dictionaries: “Folksonomy, a portmanteau word for ‘folk taxonomy’, is a term for collaborative tagging: the production of user-created ‘tags’ on social media that help readers to find and sort content. In other words, hashtags: #ThrowbackThursday, #DogLife, #MeToo. Because ordinary people create folksonomy tags, folksonomies include categories devised by small communities, subcultures, or even individuals, not merely those by accepted taxonomic systems like the Dewey Decimal System.

The term first arose in the wake of Web 2.0 – the Web’s transition, in the early 2000s, from a read-only platform to a read-write platform that allows users to comment on and collaboratively tag what they read. Rather unusually, we know the exact date it was coined: 24 July, 2004. The information architect Thomas Vander Wal came up with it in response to a query over what to call this kind of informal social classification.

Perhaps the most visible folksonomies are those on social-media platforms like Facebook, Twitter, Tumblr, Flickr, and Instagram. Often, people create tags on these platforms in order to gather under a single tag content that many different users have created, making it easier to find posts related to that tag. (If I’m interested in dogs, I might look at content gathered under the tag #DogLife.) Because tags reflect the interests of people who create them, researchers have pursued ways to use tags to build more comprehensive profiles of users, with an eye to surveillance or to selling them relevant ads.
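
Gathering content from many users under a single tag amounts to building an index from tags to posts. The following is a minimal Python sketch under simplifying assumptions: the sample posts are invented, tags are matched with a basic `#\w+` pattern, and matching is case-insensitive, which real platforms may handle differently.

```python
import re
from collections import defaultdict

# Simplified hashtag pattern: '#' followed by word characters.
HASHTAG = re.compile(r"#(\w+)")


def build_tag_index(posts):
    """Map each lower-cased tag to the ids of the posts that use it."""
    index = defaultdict(list)
    for post_id, text in posts.items():
        # A set avoids listing a post twice if it repeats a tag.
        for tag in {t.lower() for t in HASHTAG.findall(text)}:
            index[tag].append(post_id)
    return index


# Invented sample posts, keyed by a hypothetical post id.
posts = {
    1: "Morning walk with the pup #DogLife",
    2: "Found my old school photo #ThrowbackThursday",
    3: "Nap time again #doglife #sorrynotsorry",
}

index = build_tag_index(posts)
print(sorted(index["doglife"]))  # [1, 3]
```

A tag used only once, such as `#sorrynotsorry` here, still gets an index entry, even though it works more as an aside than as a search term.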

But people may also use tags as prompts for the creation of new content, not merely the curation of content they would have posted anyway. As I write this post, a trending tag on Twitter, #MakeAHorrorMovieMoreHorrific, is prompting thousands of people to write satirical takes on how classic horror movies might be made more ‘horrifying’ by adding unhappy features of our ordinary lives. (‘I Know What You Did Last Summer, and I Put It on Facebook’; ‘Rosemary’s Baby Is Teething’; ‘The Exercise’)

From a certain perspective, this is not so different from a library’s acknowledgment of a new category of text: if a new academic field, like ‘the history of the book’, catches on, then libraries rearrange their shelves and catalogues to accommodate the history of the book as a category; the new shelf space and catalogue space creates a demand for new books in that category, which encourages authors and publishers to produce new books to meet the demand.

But new folksonomy tags (with important exceptions, as in the realm of activism) are often short-lived and meant to be short-lived, obscure and meant to be obscure. What library cataloguer would think to accommodate the category #glitterhorse, which has a surprising number of posts on Twitter and Instagram? How can Vander Wal’s original definition of folksonomy as a tool for information retrieval accommodate tags that function, not as search terms, but as theatrical asides, like #sorrynotsorry? What about tags that are so narrowly specific that no search could ever turn up more than one usage?

Perhaps the best way to understand the weird things that people do with folksonomy tags is to appeal, not to information science, but to narratology, the study of narrative structures. …(More)”.