Towards “Government as a Platform”? Preliminary Lessons from Australia, the United Kingdom and the United States


Paper by J. Ramon Gil‐Garcia, Paul Henman, and Martha Alicia Avila‐Maravilla: “In the last two decades, Internet portals have been used by governments around the world as part of very diverse strategies from service provision to citizen engagement. Several authors propose that there is an evolution of digital government reflected in the functionality and sophistication of these portals and other technologies. More recently, scholars and practitioners are proposing different conceptualizations of “government as a platform” and, for some, this could be the next stage of digital government. However, it is not clear what are the main differences between a sophisticated Internet portal and a platform. Therefore, based on an analysis of three of the most advanced national portals, this ongoing research paper explores to what extent these digital efforts clearly represent the basic characteristics of platforms. So, this paper explores questions such as: (1) to what extent current national portals reflect the characteristics of what has been called “government as a platform”?; and (2) Are current national portals evolving towards “government as a platform”?…(More)”.

Index: The Data Universe 2019


By Michelle Winowatan, Andrew J. Zahuranec, Andrew Young, Stefaan Verhulst, Max Jun Kim

The Living Library Index – inspired by the Harper’s Index – provides important statistics and highlights global trends in governance innovation. This installment focuses on the data universe.

Please share any additional, illustrative statistics on data, or other issues at the nexus of technology and governance, with us at info@thelivinglib.org

Internet Traffic:

  • Percentage of the world’s population that uses the internet: 51.2% (3.9 billion people) – 2018
  • Number of searches processed worldwide by Google every year: at least 2 trillion – 2016
  • Website traffic worldwide generated through mobile phones: 52.2% – 2018
  • Total number of mobile subscriptions in the first quarter of 2019: 7.9 billion (an addition of 44 million in the quarter) – 2019
  • Amount of mobile data traffic worldwide: nearly 30 billion GB – 2018
  • Data category with highest traffic worldwide: video (60%) – 2018
  • Global average of data traffic per smartphone per month: 5.6 GB – 2018
    • North America: 7 GB – 2018
    • Latin America: 3.1 GB – 2018
    • Western Europe: 6.7 GB – 2018
    • Central and Eastern Europe: 4.5 GB – 2018
    • North East Asia: 7.1 GB – 2018
    • Southeast Asia and Oceania: 3.6 GB – 2018
    • India, Nepal, and Bhutan: 9.8 GB – 2018
    • Middle East and Africa: 3.0 GB – 2018
  • Time between the creation of each new bitcoin block: 9.27 minutes – 2019

Streaming Services:

  • Total hours of video streamed by Netflix users every minute: 97,222 – 2017
  • Hours of YouTube watched per day: over 1 billion – 2018
  • Number of tracks uploaded to Spotify every day: over 20,000 – 2019
  • Number of Spotify’s monthly active users: 232 million – 2019
  • Spotify’s total subscribers: 108 million – 2019
  • Spotify’s hours of content listened: 17 billion – 2019
  • Total number of songs in Spotify’s catalog: over 30 million – 2019
  • Apple Music’s total subscribers: 60 million – 2019
  • Total number of songs in Apple Music’s catalog: 45 million – 2019

Social Media:

Calls and Messaging:

Retail/Financial Transaction:

  • Number of packages shipped by Amazon in a year: 5 billion – 2017
  • Total value of payments processed by Venmo in a year: USD 62 billion – 2019
  • Based on an independent analysis of public transactions on Venmo in 2017:
  • Based on a non-representative survey of 2,436 US consumers between the ages of 21 and 72 on P2P platforms:
    • The average volume of transactions handled by Venmo: USD 64.2 billion – 2019
    • The average volume of transactions handled by Zelle: USD 122.0 billion – 2019
    • The average volume of transactions handled by PayPal: USD 141.8 billion – 2019 
    • Platform with the highest percent adoption among all consumers: PayPal (48%) – 2019 

Internet of Things:

Sources:

Companies Collect a Lot of Data, But How Much Do They Actually Use?


Article by Priceonomics Data Studio: “For all the talk of how data is the new oil and the most valuable resource of any enterprise, there is a deep dark secret companies are reluctant to share — most of the data collected by businesses simply goes unused.

This unknown and unused data, known as dark data, comprises more than half the data collected by companies. Given that some estimates indicate that 7.5 septillion (7,700,000,000,000,000,000,000) gigabytes of data are generated every single day, not using most of it is a considerable issue.

In this article, we’ll look at this dark data: just how much of it is created by companies, what the reasons are that this data isn’t being analyzed, and what the costs and implications are of companies not using the majority of the data they collect.

Before diving into the analysis, it’s worth spending a moment clarifying what we mean by the term “dark data.” Gartner defines dark data as:

“The information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes (for example, analytics, business relationships and direct monetizing).”

To learn more about this phenomenon, Splunk commissioned a global survey of 1,300+ business leaders to better understand how much data they collect, and how much is dark. Respondents were from IT and business roles across various industries, and were located in Australia, China, France, Germany, Japan, the United States, and the United Kingdom. For the report, Splunk defines dark data as: “all the unknown and untapped data across an organization, generated by systems, devices and interactions.”

While the cost of storing data has decreased over time, the cost of saving septillions of gigabytes of wasted data is still significant. What’s more, during this time the strategic importance of data has increased as companies have found more and more uses for it. Given the cost of storage and the value of data, why does so much of it go unused?

The following chart shows the reasons why dark data isn’t currently being harnessed:

By a large margin, the number one reason given for not using dark data is that companies lack a tool to capture or analyze the data. Companies accumulate data from server logs, GPS networks, security tools, call records, web traffic and more. Companies track everything from digital transactions to the temperature of their server rooms to the contents of retail shelves. Most of this data lies in separate systems, is unstructured, and cannot be connected or analyzed.

Second, the data captured just isn’t good enough. You might have important customer information about a transaction, but it’s missing location or other important metadata because that information sits somewhere else or was never captured in a usable format.

Additionally, dark data exists because there is simply too much data out there and a lot of it is unstructured. The larger the dataset (or the less structured it is), the more sophisticated the tool required for analysis. These kinds of datasets also often require analysis by individuals with significant data science expertise, who are often in short supply.

The implications of this prevalence are vast. As a result of the data deluge, companies often don’t know where all the sensitive data is stored and can’t be confident they are complying with consumer data protection measures like GDPR. …(More)”.

How does Finland use health and social data for the public benefit?


Karolina Mackiewicz at ICT & Health: “…Better innovation opportunities, quicker access to comprehensive ready-combined data, smoother permit procedures needed for research – those are some of the benefits for society, academia or business announced by the Ministry of Social Affairs and Health of Finland when the Act on the Secondary Use of Health and Social Data was introduced.

It came into force on 1 May 2019. According to the Finnish Innovation Fund SITRA, which was involved in the development of the legislation and carried out the pilot projects, it’s a ‘groundbreaking’ piece of legislation. It not only effectively introduces a one-stop shop for data but is also one of the first, if not the first, implementations of the GDPR (the EU’s General Data Protection Regulation) for the secondary use of data in Europe.

The aim of the Act is “to facilitate the effective and safe processing and access to the personal social and health data for steering, supervision, research, statistics and development in the health and social sector”. A second objective is to guarantee an individual’s legitimate expectations as well as their rights and freedoms when processing personal data. In other words, the Ministry of Health promises that the Act will help eliminate the administrative burden in access to the data by the researchers and innovative businesses while respecting the privacy of individuals and providing conditions for the ethically sustainable way of using data….(More)”.

Introduction to Decision Intelligence


Blog post by Cassie Kozyrkov: “…Decision intelligence is a new academic discipline concerned with all aspects of selecting between options. It brings together the best of applied data science, social science, and managerial science into a unified field that helps people use data to improve their lives, their businesses, and the world around them. It’s a vital science for the AI era, covering the skills needed to lead AI projects responsibly and design objectives, metrics, and safety-nets for automation at scale.

Let’s take a tour of its basic terminology and concepts. The sections are designed to be friendly to skim-reading (and skip-reading too, that’s where you skip the boring bits… and sometimes skip the act of reading entirely).

What’s a decision?

Data are beautiful, but it’s decisions that are important. It’s through our decisions — our actions — that we affect the world around us.

We define the word “decision” to mean any selection between options by any entity, so the conversation is broader than MBA-style dilemmas (like whether to open a branch of your business in London).

In this terminology, labeling a photo as cat versus not-cat is a decision executed by a computer system, while figuring out whether to launch that system is a decision taken thoughtfully by the human leader (I hope!) in charge of the project.

What’s a decision-maker?

In our parlance, a “decision-maker” is not that stakeholder or investor who swoops in to veto the machinations of the project team, but rather the person who is responsible for decision architecture and context framing. In other words, a creator of meticulously-phrased objectives as opposed to their destroyer.

What’s decision-making?

Decision-making is a word that is used differently by different disciplines, so it can refer to:

  • taking an action when there were alternative options (in this sense it’s possible to talk about decision-making by a computer or a lizard).
  • performing the function of a (human) decision-maker, part of which is taking responsibility for decisions. Even though a computer system can execute a decision, it will not be called a decision-maker because it does not bear responsibility for its outputs — that responsibility rests squarely on the shoulders of the humans who created it.

Decision intelligence taxonomy

One way to approach learning about decision intelligence is to break it along traditional lines into its quantitative aspects (largely overlapping with applied data science) and qualitative aspects (developed primarily by researchers in the social and managerial sciences)….(More)”.


“Anonymous” Data Won’t Protect Your Identity


Sophie Bushwick at Scientific American: “The world produces roughly 2.5 quintillion bytes of digital data per day, adding to a sea of information that includes intimate details about many individuals’ health and habits. To protect privacy, data brokers must anonymize such records before sharing them with researchers and marketers. But a new study finds it is relatively easy to reidentify a person from a supposedly anonymized data set—even when that set is incomplete.

Massive data repositories can reveal trends that teach medical researchers about disease, demonstrate issues such as the effects of income inequality, coach artificial intelligence into humanlike behavior and, of course, aim advertising more efficiently. To shield people who—wittingly or not—contribute personal information to these digital storehouses, most brokers send their data through a process of deidentification. This procedure involves removing obvious markers, including names and social security numbers, and sometimes taking other precautions, such as introducing random “noise” data to the collection or replacing specific details with general ones (for example, swapping a birth date of “March 7, 1990” for “January–April 1990”). The brokers then release or sell a portion of this information.

“Data anonymization is basically how, for the past 25 years, we’ve been using data for statistical purposes and research while preserving people’s privacy,” says Yves-Alexandre de Montjoye, an assistant professor of computational privacy at Imperial College London and co-author of the new study, published this week in Nature Communications.  Many commonly used anonymization techniques, however, originated in the 1990s, before the Internet’s rapid development made it possible to collect such an enormous amount of detail about things such as an individual’s health, finances, and shopping and browsing habits. This discrepancy has made it relatively easy to connect an anonymous line of data to a specific person: if a private detective is searching for someone in New York City and knows the subject is male, is 30 to 35 years old and has diabetes, the sleuth would not be able to deduce the man’s name—but could likely do so quite easily if he or she also knows the target’s birthday, number of children, zip code, employer and car model….(More)”
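The deidentification step described in the excerpt (swapping an exact birth date for a coarser range) and the reidentification risk it is meant to reduce can be illustrated with a small sketch. Everything below is an assumption made for illustration: the records are invented, the three-attribute layout (birth date, zip code, sex) is hypothetical, and counting records that share the same attribute combination is a simple stand-in for, not a reproduction of, the statistical model used in the Nature Communications study.

```python
from collections import Counter
from datetime import date

# Hypothetical records with names already stripped: (birth date, zip code, sex).
RECORDS = [
    (date(1990, 3, 7), "10001", "M"),
    (date(1990, 2, 19), "10001", "M"),
    (date(1990, 4, 30), "10001", "M"),
    (date(1988, 11, 2), "10002", "F"),
]

# Four-month windows, matching the "January–April 1990" example in the excerpt.
WINDOWS = ["January–April", "May–August", "September–December"]


def generalize_birth_date(d: date) -> str:
    """Replace an exact birth date with a four-month window in the same year."""
    return f"{WINDOWS[(d.month - 1) // 4]} {d.year}"


def unique_fraction(records, key) -> float:
    """Fraction of records whose attribute combination appears exactly once.

    A record that is alone in its group can be singled out by anyone who
    knows those attributes about the person.
    """
    sizes = Counter(key(r) for r in records)
    return sum(1 for r in records if sizes[key(r)] == 1) / len(records)


def exact(record):
    """Use the attributes as released, without generalization."""
    return record


def coarse(record):
    """Use the attributes with the birth date generalized to a window."""
    birth, zip_code, sex = record
    return (generalize_birth_date(birth), zip_code, sex)


if __name__ == "__main__":
    print(generalize_birth_date(date(1990, 3, 7)))
    print("unique with exact dates:", unique_fraction(RECORDS, exact))
    print("unique with generalized dates:", unique_fraction(RECORDS, coarse))
```

In this toy data, every record is unique on exact attributes, while generalizing the birth date merges three of the four records into one group. The fourth record stays unique, which mirrors the article's point: each extra known attribute (employer, car model, number of children) splits groups further, so coarsening one field is often not enough.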

The plan to mine the world’s research papers


Priyanka Pulla in Nature: “Carl Malamud is on a crusade to liberate information locked up behind paywalls — and his campaigns have scored many victories. He has spent decades publishing copyrighted legal documents, from building codes to court records, and then arguing that such texts represent public-domain law that ought to be available to any citizen online. Sometimes, he has won those arguments in court. Now, the 60-year-old American technologist is turning his sights on a new objective: freeing paywalled scientific literature. And he thinks he has a legal way to do it.

Over the past year, Malamud has — without asking publishers — teamed up with Indian researchers to build a gigantic store of text and images extracted from 73 million journal articles dating from 1847 up to the present day. The cache, which is still being created, will be kept on a 576-terabyte storage facility at Jawaharlal Nehru University (JNU) in New Delhi. “This is not every journal article ever written, but it’s a lot,” Malamud says. It’s comparable to the size of the core collection in the Web of Science database, for instance. Malamud and his JNU collaborator, bioinformatician Andrew Lynn, call their facility the JNU data depot.

No one will be allowed to read or download work from the repository, because that would breach publishers’ copyright. Instead, Malamud envisages, researchers could crawl over its text and data with computer software, scanning through the world’s scientific literature to pull out insights without actually reading the text.

The unprecedented project is generating much excitement because it could, for the first time, open up vast swathes of the paywalled literature for easy computerized analysis. Dozens of research groups already mine papers to build databases of genes and chemicals, map associations between proteins and diseases, and generate useful scientific hypotheses. But publishers control — and often limit — the speed and scope of such projects, which typically confine themselves to abstracts, not full text. Researchers in India, the United States and the United Kingdom are already making plans to use the JNU store instead. Malamud and Lynn have held workshops at Indian government laboratories and universities to explain the idea. “We bring in professors and explain what we are doing. They get all excited and they say, ‘Oh gosh, this is wonderful’,” says Malamud.

But the depot’s legal status isn’t yet clear. Malamud, who contacted several intellectual-property (IP) lawyers before starting work on the depot, hopes to avoid a lawsuit. “Our position is that what we are doing is perfectly legal,” he says. For the moment, he is proceeding with caution: the JNU data depot is air-gapped, meaning that no one can access it from the Internet. Users have to physically visit the facility, and only researchers who want to mine for non-commercial purposes are currently allowed in. Malamud says his team does plan to allow remote access in the future. “The hope is to do this slowly and deliberately. We are not throwing this open right away,” he says….(More)”.

Improving access to information and restoring the public’s faith in democracy through deliberative institutions


Katherine R. Knobloch at Democratic Audit: “Both scholars and citizens have begun to believe that democracy is in decline. Authoritarian power grabs, polarising rhetoric, and increasing inequality can all claim responsibility for democratic systems that feel broken. Democracy depends on a polity who believe that their engagement matters, but evidence suggests democratic institutions have become unresponsive to the will of the public. How can we restore faith in self-government when both research and personal experience tell us that the public is losing power, not gaining it?

Deliberative public engagement

Deliberative democracy offers one solution, and it’s slowly shifting how the public engages in political decision-making. In Oregon, the Citizens’ Initiative Review (CIR) asks a group of randomly selected voters to carefully study public issues and then make policy recommendations based on their collective experience and insight. In Ireland, Citizens’ Assemblies are being used to amend the country’s constitution to better reflect changing cultural norms. In communities across the world, Participatory Budgeting is giving the public control over local government spending. Far from squashing democratic power, these deliberative institutions bolster it. They exemplify a new wave in democratic government, one that aims to bring community members together across political and cultural divides to make decisions about how to govern themselves.

Though the contours of deliberative events vary, most share key characteristics. A diverse body of community members gather together to learn from experts and one another, think through the short- and long-term implications of different policy positions, and discuss how issues affect not only themselves but their wider communities. At the end of those conversations, they make decisions that are representative of the diversity of participants and their ideas and which have been tested through collective reasoning….(More)”.

Google and the University of Chicago Are Sued Over Data Sharing


Daisuke Wakabayashi in The New York Times: “When the University of Chicago Medical Center announced a partnership to share patient data with Google in 2017, the alliance was promoted as a way to unlock information trapped in electronic health records and improve predictive analysis in medicine.

On Wednesday, the University of Chicago, the medical center and Google were sued in a potential class-action lawsuit accusing the hospital of sharing hundreds of thousands of patients’ records with the technology giant without stripping identifiable date stamps or doctor’s notes.

The suit, filed in United States District Court for the Northern District of Illinois, demonstrates the difficulties technology companies face in handling health data as they forge ahead into one of the most promising — and potentially lucrative — areas of artificial intelligence: diagnosing medical problems.

Google is at the forefront of an effort to build technology that can read electronic health records and help physicians identify medical conditions. But the effort requires machines to learn this skill by analyzing a vast array of old health records collected by hospitals and other medical institutions.

That raises privacy concerns, especially when it is used by a company like Google, which already knows what you search for, where you are and what interests you hold.

In 2016, DeepMind, a London-based A.I. lab owned by Google’s parent company, Alphabet, was accused of violating patient privacy after it struck a deal with Britain’s National Health Service to process medical data for research….(More)”.

Self-Sovereign Identity


/sɛlf-ˈsɑvrən aɪˈdɛntəti/

A decentralized identification mechanism that gives individuals control over what, when, and to whom their personal information is shared.

An identification document (ID) is a crucial part of every individual’s life, in that it is often a prerequisite for accessing a variety of services—ranging from creating a bank account to enrolling children in school to buying alcoholic beverages to signing up for an email account to voting in an election—and also a proof of simply being. This system poses fundamental problems, which a field report by The GovLab on Blockchain and Identity frames as follows:

“One of the central challenges of modern identity is its fragmentation and variation across platform and individuals. There are also issues related to interoperability between different forms of identity, and the fact that different identities confer very different privileges, rights, services or forms of access. The universe of identities is vast and manifold. Every identity in effect poses its own set of challenges and difficulties—and, of course, opportunities.”

A report published by New America echoed this point, arguing that:

“Societally, we lack a coherent approach to regulating the handling of personal data. Users share and generate far too much data—both personally identifiable information (PII) and metadata, or “data exhaust”—without a way to manage it. Private companies, by storing an increasing amount of PII, are taking on an increasing level of risk. Solution architects are recreating the wheel, instead of flying over the treacherous terrain we have just described.”

SSI is billed as a solution to the identity problems described above. Identity Woman, a researcher and advocate for SSI, goes even further by arguing that generating “a digital identity that is not under the control of a corporation, an organization or a government” is essential “in pursuit of social justice, deep democracy, and the development of new economies that share wealth and protect the environment.”

To inform the analysis of blockchain-based Self-Sovereign Identity (SSI), The GovLab report argues that identity is “a process, not a thing” and breaks it into a five-stage lifecycle: provisioning, administration, authentication, authorization, and auditing/monitoring. At each stage, identification serves a unique function and poses different challenges.

With SSI, individuals have full control over how their personal information is shared, who gets access to it, and when. The New America report summarizes the potential of SSI in the following paragraphs:

“We believe that the great potential of SSI is that it can make identity in the digital world function more like identity in the physical world, in which every person has a unique and persistent identity which is represented to others by means of both their physical attributes and a collection of credentials attested to by various external sources of authority.”

[…]

“SSI, in contrast, gives the user a portable, digital credential (like a driver’s license or some other document that proves your age), the authenticity of which can be securely validated via cryptography without the recipient having to check with the authority that issued it. This means that while the credential can be used to access many different sites and services, there is no third-party broker to track the services to which the user is authenticating. Furthermore, cryptographic techniques called “zero-knowledge proofs” (ZKPs) can be used to prove possession of a credential without revealing the credential itself. This makes it possible, for example, for users to prove that they are over the age of 21 without having to share their actual birth dates, which are both sensitive information and irrelevant to a binary, yes-or-no ID transaction.”

Some case studies on the application of SSI in the real world presented on The GovLab Blockchange website include a government-issued self-sovereign ID using blockchain technology in the city of Zug in Switzerland; a mobile election voting platform, secured via smart biometrics, real-time ID verification and the blockchain for irrefutability piloted in West Virginia; and a blockchain-based land and property transaction/registration in Sweden.

Nevertheless, on the hype of this new and emerging technology, the authors write:

“At their core, blockchain technologies offer new capacity for increasing the immutability, integrity, and resilience of information capture and disclosure mechanisms, fostering the potential to address some of the information asymmetries described above. By leveraging a shared and verified database of ledgers stored in a distributed manner, blockchain seeks to redesign information ecosystems in a more transparent, immutable, and trusted manner. Solving information asymmetries may turn out to be the real contribution of blockchain, and this—much more than the current enthusiasm over virtual currencies—is the real reason to assess its potential.

“It is important to emphasize, of course, that blockchain’s potential remains just that for the moment—only potential. Considerable hype surrounds the emerging technology, and much remains to be done and many obstacles to overcome if blockchain is to achieve the enthusiasts’ vision of “radical transparency.””

Further readings: