Data Activism


ˈdeɪtə ˈæktɪˌvɪzəm

New social practices enabled by data and technology which aim to create political change (Milan and Gutiérrez).

The large-scale generation of data that has occurred over the past decade has given rise to data activism, defined by Stefania Milan and Miren Gutiérrez, scholars in technology and society at the University of Amsterdam and University of Deusto, as “new social practices rooted in technology and data.” These authors further discuss this term, arguing:

“Data activism indicates social practices that take a critical approach to big data. Examples include the collective mapping and geo-referencing of the messages of victims of natural disasters in order to facilitate disaster relief operations, or the elaboration of open government data for advocacy and campaigning. But data activism also embraces tactics of resistance to massive data collection by private companies and governments, such as the encryption of private communication, or obfuscation tactics that put sand into the data collection machine.”

Milan and Gutiérrez further elaborate on these two forms of data activism in their paper “Technopolitics in the Age of Big Data.” There, they argue that all data activism is either proactive or reactive. They state:

“We identify two forms of data activism: proactive data activism, whereby citizens take advantage of the possibilities offered by big data infrastructure for advocacy and social change, and reactive data activism, namely grassroots efforts aimed at resisting massive data collection and protecting users from malicious snooping.”

An example of reactive data activism comes from the Media Action Grassroots Network, a network of social justice organizations based in the United States. This network provides digital security training to grassroots activists working on racial justice issues.

An example of proactive data activism is discussed in “Data witnessing: attending to injustice with data in Amnesty International’s Decoders project.” There, author Jonathan Gray, a critical data scholar, examines “what digital data practices at Amnesty International’s Decoders initiative can add to the understanding of witnessing.” According to Gray, witnessing is a concept that has been used in law, religion, and media, among other fields, to explore the construction of evidence and experience. In the paper, Gray describes four data witnessing projects:

“(i) witnessing historical abuses with structured data from digitised documents; (ii) witnessing the destruction of villages with satellite imagery and machine learning; (iii) witnessing environmental injustice with company reports and photographs; and (iv) witnessing online abuse through the classification of Twitter data. These projects illustrate the configuration of experimental apparatuses for witnessing injustices with data.”

More recently, proactive data activism has produced several notable examples. Civil rights activists in Zanesville, Ohio used data to demonstrate inequitable access to clean water between predominantly white and Black communities. A collection of activists, organizers, and mathematicians formed Data 4 Black Lives to promote justice for Black communities through data and data science. Finally, in an effort to monitor government accountability in providing COVID-19 case data, Indonesian activists created a platform where citizens can independently report COVID-19 cases.

Multisolving


ˌmʌltiˈsɑlvɪŋ

Pooling expertise, funding, and political will to solve multiple problems with a single investment of time and money (Sawin, 2018).

Elizabeth Sawin, Co-Director of Climate Interactive, a not-for-profit energy and environment think tank, wrote an article on multisolving in the Stanford Social Innovation Review (SSIR) after a year-long study of the approach’s implementation for climate and health. Defined as a way of solving multiple problems with a single investment of time and money, the multisolving approach brings together stakeholders from different sectors and disciplines to tackle public issues in a cost-efficient manner.

In the article, Sawin provides examples of multisolving that have been implemented in countries across the globe:

In Japan, manufacturing facilities use “green curtains”—living panels of climbing plants—to clean the air, provide vegetables for company cafeterias, and reduce energy use for cooling. A walk-to-school program in the United Kingdom fights a decline in childhood physical activity while reducing traffic congestion and greenhouse gas emissions from transportation. A food-gleaning program staffed by young volunteers and families facing food insecurity in Spain addresses food waste, hunger, and a desire for sustainability.

A Climate Interactive report provides three principles and three practices that can help stakeholders develop a multisolving strategy. In the SSIR article, Sawin summarizes those principles in three points. First, she argues that a solution must serve everyone in a system, without exception. Second, she suggests that multisolvers must recognize that problems are multifaceted and that multisolving provides solutions to multiple facets of a larger issue. Third, Sawin posits that experimentation and learning are key to measuring the success of multisolving.

In the article, Sawin also outlines three good multisolving practices. First, she identifies openness to collaboration with actors from different sectors or groups in society as a critical ingredient in developing a multisolving strategy. Second, Sawin stresses the importance of learning, documenting, and improving to ensure that multisolving delivers optimal benefits for the public. Finally, she argues that communicating the benefits of multisolving to various stakeholders can help generate buy-in for a multisolving project.

In concluding the article, Sawin wrote “[n]one of these multisolving principles or tools, on their own, are revolutionary. They need no new apps or state-of-the-art techniques to work. What makes multisolving unique is that it weaves together these principles and practices in a way that builds over time to create big results.”

Informational Autocrats


ˌɪnfərˈmeɪʃənəl ˈɔtəˌkræts

Rulers who control and manipulate information in order to maintain power (Guriev and Treisman, 2019).

Sergei Guriev (Professor of Economics, Sciences Po, Paris) and Daniel Treisman (Professor of Political Science, University of California, Los Angeles) detail in their paper “Informational Autocrats” a new, more surreptitious type of authoritarian leader. The authors write:

“In this article, we document the changing characteristics of authoritarian states worldwide. Using newly collected data, we show that recent autocrats employ violent repression and impose official ideologies far less often than their predecessors. They also appear more prone to conceal rather than to publicize cases of state brutality. Analyzing texts of leaders’ speeches, we show that “informational autocrats” favor a rhetoric of economic performance and provision of public services that resembles that of democratic leaders far more than it does the discourse of threats and fear embraced by old-style dictators. Authoritarian leaders are increasingly mimicking democracy by holding elections and, where necessary, falsifying the results.”

Today, informational autocrats often employ “cyber troops” to spread disinformation. They specifically target and take advantage of the “uninformed masses” in order to advance their interests. Guriev and Treisman further argue:

“A key element in our theory of informational autocracy is the gap in political knowledge between the “informed elite” and the general public. While the elite accurately observes the limitations of an incompetent incumbent, the public is susceptible to the ruler’s propaganda. Using individual-level data from the Gallup World Poll, we show that such a gap does indeed exist in many authoritarian states today. Unlike in democracies, where the highly educated are more likely than others to approve of their government, in authoritarian states the highly educated tend to be more critical. The highly educated are also more aware of media censorship than their less-schooled compatriots.”

Separately, writing in Foreign Affairs, Andrea Kendall-Taylor, Erica Frantz, and Joseph Wright echo this argument:

“Dictatorships can also use new technologies to shape public perception of the regime and its legitimacy. Automated accounts (or “bots”) on social media can amplify influence campaigns and produce a flurry of distracting or misleading posts that crowd out opponents’ messaging.”

Additionally:

“Digital tools might even help regimes make themselves appear less repressive and more responsive to their citizens. In some cases, authoritarian regimes have deployed new technologies to mimic components of democracy, such as participation and deliberation.”

The globalization of ideas and technological advances has created a hostile environment for traditional, overt dictatorship. At the same time, this same combination has been exploited by informational autocrats to advance their own interests. Promoting accountability across all sectors, for example through open government data and algorithmic transparency, can help counter such efforts to control and manipulate information.

Kludge


ˈklʌdʒ

A clumsy but temporarily effective solution to a particular problem (Oxford English Dictionary).

The term kludge is often used in the world of computer programming to refer to an inelegant temporary patch intended to solve a problem.
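To make the programming sense concrete, the snippet below is a minimal, hypothetical sketch (the function, tax rates, and the hardcoded special case are all invented for illustration) of what a kludge often looks like in practice: a special-case patch bolted on to meet a deadline rather than a fix to the underlying problem.

```python
# Hypothetical illustration of a kludge: a hardcoded special case papering
# over a gap in the rate table, instead of fixing the table itself.

def sales_tax(order_total: float, state: str) -> float:
    """Return the sales tax owed, based on a per-state rate table."""
    rates = {"NY": 0.08875, "CA": 0.0725, "TX": 0.0625}

    # KLUDGE: Ohio is missing from the table and crashes checkout, so we
    # hardcode a rate here to ship on time. TODO: maintain the table properly.
    if state == "OH":
        return round(order_total * 0.0575, 2)

    return round(order_total * rates[state], 2)

print(sales_tax(100.0, "OH"))  # 5.75
```

The patch works today, but each such shortcut makes the system harder to reason about later, which is the same dynamic Konczal and Teles describe in policymaking below.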

In an article for the Washington Post, Mike Konczal—a fellow at the Roosevelt Institute—discusses how kludges are also found in policymaking. Konczal argues that, in a well-intentioned effort to make governing simpler, policymakers tend to adopt simple fixes instead of policies that would make the decision-making process genuinely simple.

Policies that make the decision-making process simple can involve “nudges”—a behavioral economics concept proposed by Richard Thaler and Cass Sunstein. In the article, Konczal writes:

“A simple policy is one that simply “nudges” people into one choice or another using a variety of default rules, disclosure requirements, and other market structures. Think, for instance, of rules that require fast-food restaurants to post calories on their menus, or a mortgage that has certain terms clearly marked in disclosures.

“These sorts of regulations are deemed “choice preserving.” Consumers are still allowed to buy unhealthy fast-food meals or sign up for mortgages they can’t reasonably afford. The regulations are just there to inform people about their choices. These rules are designed to keep the market “free,” where all possibilities are ultimately possible, although there are rules to encourage certain outcomes.”

On the other hand, there are policy “kludges,” which, according to Steve Teles—professor of political science at Johns Hopkins University—characterize the current public policy situation in the United States, reflected in the complexity of its healthcare, education, and environmental protection systems. Teles argues that “America has chosen to govern itself through more indirect and incoherent policy mechanisms than can be found in any comparable country.”

According to Teles, these kludges can accumulate into something costly and complex, with no clear guiding principles. The continued accretion of policy kludges has increased the transaction costs individuals face in accessing services and the compliance costs borne by government and business, and it has created unequal opportunities for individuals and institutions to benefit from democracy. Teles outlines the costs of kludges as follows:

“The most insidious feature of kludgeocracy is the hidden, indirect, and frequently corrupt distribution of its costs. Those costs can be put into three categories — costs borne by individual citizens, costs borne by the government that must implement the complex policies, and costs to the character of our democracy.”

Technochauvinism


ˈtɛknoʊˈʃoʊvəˌnɪzəm

The belief that technology is always the solution (Broussard, 2018).

Since the beginning of its rise in the late 20th century, digital and computer technology has promised to improve many of the ways society operates. Personal computers, mobile phones, and the internet are among the most ubiquitous examples of technologies that have, to a certain extent, demonstrably made lives easier.

However, recent years have seen a growing techlash—defined by the Oxford English Dictionary as “a strong and widespread negative reaction to the growing power and influence of large technology companies, particularly those based in Silicon Valley”—in response to the harms that technology has helped create. Misinformation, privacy violations, and algorithmic bias are phrases that can often be found in the same sentence as the names of one or more tech companies.

Computer scientist and data journalist Meredith Broussard, a professor at New York University, argues that these problems stem from technochauvinism—the belief that technology is always the solution. The summary of her book, Artificial Unintelligence, reads:

“… it’s just not true that social problems would inevitably retreat before a digitally enabled Utopia. To prove her point, she undertakes a series of adventures in computer programming. She goes for an alarming ride in a driverless car, concluding “the cyborg future is not coming any time soon”; uses artificial intelligence to investigate why students can’t pass standardized tests; deploys machine learning to predict which passengers survived the Titanic disaster; and attempts to repair the U.S. campaign finance system by building AI software. If we understand the limits of what we can do with technology, Broussard tells us, we can make better choices about what we should do with it to make the world better for everyone.”

The term technochauvinism is similar to technosolutionism. In that, they both describe the belief that most, if not all, complex issues can be solved with the right computation and engineering. However, the use of “chauvinism” is intentional because part of the criticism is about the rampant gender inequality in the tech industry, which manifest in many ways including algorithmic sexism.

The book’s description further elaborates:

“In Artificial Unintelligence, Meredith Broussard argues that our collective enthusiasm for applying computer technology to every aspect of life has resulted in a tremendous amount of poorly designed systems. We are so eager to do everything digitally—hiring, driving, paying bills, even choosing romantic partners—that we have stopped demanding that our technology actually work. Broussard, a software developer and journalist, reminds us that there are fundamental limits to what we can (and should) do with technology. With this book, she offers a guide to understanding the inner workings and outer limits of technology—and issues a warning that we should never assume that computers always get things right.”

Nowcasting


naʊˈkæstɪŋ

A method of describing the present or the near future by analyzing datasets that are not traditionally included in the analysis (e.g. web searches, reviews, social media data, etc.)

Nowcasting is a term that originates in meteorology, which refers to “the detailed description of the current weather along with forecasts obtained by extrapolation for a period of 0 to 6 hours ahead.” Today, nowcasting is also used in other fields, such as macroeconomics and health, to provide more up-to-date statistics.

Traditionally, macroeconomic statistics are collected on a quarterly basis and released with a substantial lag. For example, GDP data for the euro area “is only available at quarterly frequency and is released six weeks after the close of the quarter.” Further, economic datasets from government agencies such as the US Census Bureau “typically appear only after multi-year lags, and the public-facing versions are aggregated to the county or ZIP code level.”

The arrival of the big data era has shown some promise for improving nowcasting. A paper by Edward L. Glaeser, Hyunjin Kim, and Michael Luca presents “evidence that Yelp data can complement government surveys by measuring economic activity in close to real time, at a granular level, and at almost any geographic scale.” In the paper, the authors conclude:

“Our analyses of one possible data source, Yelp, suggests that these new data sources can be a useful complement to official government data. Yelp can help predict contemporaneous changes in the local economy. It can also provide a snapshot of economic change at the local level. It is a useful addition to the data tools that local policy-makers can access.

“Yet our analysis also highlights the challenges with the idea of replacing the Census altogether at any point in the near future. Government statistical agencies invest heavily in developing relatively complete coverage, for a wide set of metrics. The variation in coverage inherent in data from online platforms make it difficult to replace the role of providing official statistics that government data sources play.

“Ultimately, data from platforms like Yelp –combined with official government statistics – can provide valuable complementary datasets that will ultimately allow for more timely and granular forecasts and policy analyses, with a wider set of variables and more complete view of the local economy.”

Another example comes from the United States Federal Reserve (the Fed), which used data from the payroll-processing company ADP to nowcast payroll employment, a statistic traditionally provided by the Current Employment Statistics (CES) survey. Although the CES is “one of the most carefully conducted measures of labor market activity and uses an extremely large sample, it is still subject to significant sampling error and nonsampling errors.” The Fed sought to improve the reliability of this measure by incorporating data provided by ADP, and the study found that combining CES and ADP data “reduces the error inherent in both data sources.”
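The Fed’s actual approach relies on a formal statistical model, but the core intuition, that two noisy measurements of the same quantity can be combined into an estimate with lower error than either alone, can be sketched with simple inverse-variance weighting. The figures below are invented for illustration and are not real CES or ADP values.

```python
# Sketch of combining two noisy, independent estimates of the same quantity
# (e.g., monthly payroll growth) by inverse-variance weighting. This
# illustrates the general idea only; it is not the Fed's actual methodology.

def combine(est_a: float, var_a: float, est_b: float, var_b: float):
    """Precision-weighted average; the combined variance is below either input's."""
    w_a, w_b = 1.0 / var_a, 1.0 / var_b
    estimate = (w_a * est_a + w_b * est_b) / (w_a + w_b)
    variance = 1.0 / (w_a + w_b)
    return estimate, variance

# Hypothetical monthly job gains (in thousands) and sampling variances.
ces_est, ces_var = 180.0, 55.0 ** 2   # survey-based estimate
adp_est, adp_var = 205.0, 65.0 ** 2   # payroll-processor-based estimate

est, var = combine(ces_est, ces_var, adp_est, adp_var)
print(f"combined estimate: {est:.1f}k jobs, standard error: {var ** 0.5:.1f}k")
```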

However, nowcasting with big data comes with limitations. Several researchers evaluated the accuracy of Google Flu Trends (GFT) during the 2012-2013 and 2013-2014 flu seasons. GFT used flu-related Google searches to make its predictions. The study found that GFT significantly overestimated flu prevalence compared with the flu activity reported by the Centers for Disease Control and Prevention (CDC).

Writing in Nautilus, Jesse Dunietz describes how to address the limitations of big data and make nowcasting efforts more accurate:

“But when big data isn’t seen as a panacea, it can be transformative. Several groups, like Columbia University researcher Jeffrey Shaman’s, for example, have outperformed the flu predictions of both the CDC and GFT by using the former to compensate for the skew of the latter. “Shaman’s team tested their model against actual flu activity that had already occurred during the season,” according to the CDC. By taking the immediate past into consideration, Shaman and his team fine-tuned their mathematical model to better predict the future. All it takes is for teams to critically assess their assumptions about their data.”
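The kind of correction Dunietz describes can be sketched in a few lines. The toy example below uses invented numbers and a deliberately crude method (it is not Shaman’s actual model, which assimilates observations into an epidemiological model): it estimates a proxy’s systematic skew from the weeks where lagged official figures are already available, then rescales the most recent proxy values accordingly.

```python
import numpy as np

# Toy sketch: correct a timely but biased proxy (e.g., search-based flu
# estimates) using lagged official data. All numbers are invented.

proxy = np.array([2.1, 2.6, 3.4, 4.5, 5.9, 7.2])            # weekly proxy estimates (% ILI)
official = np.array([1.4, 1.7, 2.2, 2.9, np.nan, np.nan])   # official figures arrive with a lag

# Estimate the proxy's average overestimation where both series overlap.
overlap = ~np.isnan(official)
skew = (proxy[overlap] / official[overlap]).mean()

# Nowcast the latest weeks by dividing out that skew.
nowcast = proxy[~overlap] / skew
print(f"estimated skew factor: {skew:.2f}")
print("corrected nowcasts for the latest weeks:", np.round(nowcast, 2))
```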

Index: Secondary Uses of Personal Data


By Alexandra Shaw, Andrew Zahuranec, Andrew Young, Stefaan Verhulst

The Living Library Index–inspired by the Harper’s Index–provides important statistics and highlights global trends in governance innovation. This installment focuses on public perceptions regarding secondary uses of personal data (or the re-use of data initially collected for a different purpose). It provides a summary of societal perspectives toward personal data usage, sharing, and control. It is not meant to be comprehensive–rather, it intends to illustrate conflicting, and often confusing, attitudes toward the re-use of personal data. 

Please share any additional, illustrative statistics on data, or other issues at the nexus of technology and governance, with us at info@thelivinglib.org

Data ownership and control 

  • Percentage of Americans who say it is “very important” they control information collected about them: 74% – 2016
  • Americans who think that today’s privacy laws are not good enough at protecting people’s privacy online: 68% – 2016
  • Americans who say they have “a lot” of control over how companies collect and use their information: 9% – 2015
  • In a survey of 507 online shoppers, the percentage of respondents who indicated they don’t want brands tracking their location: 62% – 2015
  • In a survey of 507 online shoppers, the percentage who “prefer offers that are targeted to where they are and what they are doing:” 60% – 2015
  • Percentage of surveyed American consumers willing to provide data to corporations under the following conditions:
    • “Data about my social concerns to better connect me with non-profit organizations that advance those causes:” 19% – 2018
    • “Data about my DNA to help me uncover any hereditary illnesses:” 21% – 2018
    • “Data about my interests and hobbies to receive relevant information and offers from online sellers:” 32% – 2018
    • “Data about my location to help me find the fastest route to my destination:” 40% – 2018
    • “My email address to receive exclusive offers from my favorite brands:”  56% – 2018  

Consumer Attitudes 

  • Academic study participants willing to donate personal data to research if it could lead to public good: 60% – 2014
  • Academic study participants willing to share personal data for research purposes in the interest of public good: 25% – 2014
  • Percentage who expect companies to “treat [them] like an individual, not as a member of some segment like ‘millennials’ or ‘suburban mothers:’” 74% – 2018 
    • Percentage who believe that brands should understand a “consumer’s individual situation (e.g. marital status, age, location, etc.)” when they’re being marketed to: 70% – 2018
    • Percentage who are “more annoyed” by companies now compared to 5 years ago: 40% – 2018
    • Percentage worried their data is shared across companies without their permission: 88% – 2018
    • Percentage worried about a brand’s ability to track their behavior while on the brand’s website, app, or neither: 75% – 2018
  • Consumers globally who expect brands to anticipate needs before they arise: 33%  – 2018 
  • Surveyed residents of the United Kingdom who identify as:
    • “Data pragmatists” willing to share personal data “under the right circumstances:” 58% – 2017
    • “Fundamentalists,” who would not share personal data for better services: 24% – 2017
  • Respondents who think data sharing is part of participating in the modern economy: 62% – 2018
  • Respondents who believe that data sharing benefits enterprises more than consumers: 75% – 2018
  • People who want more control over the data that enterprises collect about them: 84% – 2018
  • Percentage “unconcerned” about personal data protection: 18% – 2018
  • Percentage of Americans who think that government should do more to regulate large technology companies: 55% – 2018
  • Registered American voters who trust broadband companies with personal data “a great deal” or “a fair amount”: 43% – 2017
  • Americans who report experiencing a major data breach: 64% – 2017
  • Percentage of Americans who believe that their personal data is less secure than it was 5 years ago: 49% – 2019
  • Percentage of surveyed Americans who consider trust in a company an important factor for sharing data: 54% – 2018

Convenience

Microsoft’s 2015 Consumer Data Value Exchange Report attempts to understand consumer attitudes on the exchange of personal data across the global markets of Australia, Brazil, Canada, Colombia, Egypt, Germany, Kenya, Mexico, Nigeria, Spain, South Africa, the United Kingdom, and the United States. From its survey of 16,500 users, the report finds:

  • The most popular incentives for sharing data are: 
    • Cash rewards: 64% – 2015
    • Significant discounts: 49% – 2015
    • Streamlined processes: 29% – 2015
    • New ideas: 28% – 2015
  • Respondents who would prefer to see more ads to get new services: 34% – 2015
  • Respondents willing to share search terms for a service that enabled fewer steps to get things done: 70% – 2015 
  • Respondents willing to share activity data for such an improvement: 82% – 2015
  • Respondents willing to share their gender for “a service that inspires something new based on others like them:” 79% – 2015

A 2015 Pew Research Center survey presented Americans with several data-sharing scenarios related to convenience. Participants could respond: “acceptable,” “it depends,” or “not acceptable” to the following scenarios: 

  • Share health information to get access to personal health records and arrange appointments more easily:
    • Acceptable: 52% – 2015
    • It depends: 20% – 2015
    • Not acceptable: 26% – 2015
  • Share data for discounted auto insurance rates: 
    • Acceptable: 37% – 2015
    • It depends: 16% – 2015
    • Not acceptable: 45% – 2015
  • Share data for free social media services: 
    • Acceptable: 33% – 2015
    • It depends: 15% – 2015
    • Not acceptable: 51% – 2015
  • Share data on smart thermostats for cheaper energy bills: 
    • Acceptable: 33% – 2015
    • It depends: 15% – 2015
    • Not acceptable: 51% – 2015

Other Studies

  • Surveyed banking and insurance customers who would exchange personal data for:
    • Targeted auto insurance premiums: 64% – 2019
    • Better life insurance premiums for healthy lifestyle choices: 52% – 2019 
  • Surveyed banking and insurance customers willing to share data specifically related to income, location and lifestyle habits to: 
    • Secure faster loan approvals: 81.3% – 2019
    • Lower the chances of injury or loss: 79.7% – 2019 
    • Receive discounts on non-insurance products or services: 74.6% – 2019
    • Receive text alerts related to banking account activity: 59.8% – 2019 
    • Get saving advice based on spending patterns: 56.6% – 2019
  • In a survey of over 7,000 members of the public around the globe, respondents indicated:
    • They thought “smartphone and tablet apps used for navigation, chat, and news that can access your contacts, photos, and browsing history” are “creepy:” 16% – 2016
    • Emailing a friend about a trip to Paris and receiving advertisements for hotels, restaurants and excursions in Paris is “creepy:” 32% – 2016
    • A free fitness-tracking device that monitors your well-being and sends a monthly report to you and your employer is “creepy:” 45% – 2016
    • A telematics device that allows emergency services to track your vehicle is “creepy:” 78% – 2016
  • Percentage of British residents who do not want to work with virtual agents of any kind: 48% – 2017
  • Americans who disagree that “if companies give me a discount, it is a fair exchange for them to collect information about me without my knowing”: 91% – 2015

Data Brokers, Intermediaries, and Third Parties 

  • Americans who consider it acceptable for a grocery store to offer a free loyalty card in exchange for selling their shopping data to third parties: 47% – 2016
  • Percentage of people who know that “searches, site visits and purchases” are reviewed without consent: 55% – 2015
  • Percentage of people surveyed in 1991 who wanted companies to ask them for permission first before collecting their personal information and selling that data to intermediaries: 93% – 1991
    • Percentage of Americans who “would be very concerned if the company at which their data were stored sold it to another party:” 90% – 2008
    • Percentage of Americans who think it’s unacceptable for their grocery store to share their shopping data with third parties in exchange for a free loyalty card: 32% – 2016
  • Percentage of Americans who think that government needs to do more to regulate advertisers: 64% – 2016
    • Percentage of Americans who “want to have control over what marketers can learn about” them online: 84% – 2015
    • Percentage of Americans who feel they have no power to figure out what marketers are learning about them: 58% – 2015
  • Registered American voters who are “somewhat uncomfortable” or “very uncomfortable” with companies like Internet service providers or websites using personal data to recommend stories, articles, or videos:  56% – 2017
  • Registered American voters who are “somewhat uncomfortable” or “very uncomfortable” with companies like Internet service providers or websites selling their personal information to third parties for advertising purposes: 64% – 2017

Personal Health Data

The Robert Wood Johnson Foundation’s 2014 Health Data Exploration Project Report analyzes attitudes about personal health data (PHD). PHD is self-tracking data related to health that is traceable through wearable devices and sensors. The three major stakeholder groups involved in using PHD for public good are users, companies that track the users’ data, and researchers. 

  • Overall Respondents:
    • Percentage who believe anonymity is “very” or “extremely” important: 67% – 2014
    • Percentage who “probably would” or “definitely would” share their personal data with researchers: 78% – 2014
    • Percentage who believe that they own—or should own—all the data about them, even when it is indirectly collected: 54% – 2014
    • Percentage who think they share or ought to share ownership with the company: 30% – 2014
    • Percentage who think companies alone own or should own all the data about them: 4% – 2014
    • Percentage for whom data ownership “is not something I care about”: 13% – 2014
    • Percentage who indicated they wanted to own their data: 75% – 2014 
    • Percentage who would share data only if “privacy were assured:” 68% – 2014
    • People who would supply data regardless of privacy or compensation: 27% – 2014
    • Percentage of participants who mentioned privacy, anonymity, or confidentiality when asked under what conditions they would share their data: 63% – 2014
    • Percentage who would be “more” or “much more” likely to share data for compensation: 56% – 2014
    • Percentage who indicated compensation would make no difference: 38% – 2014
    • Percentage opposed to commercial or profit-making use of their data: 13% – 2014
    • Percentage of people who would only share personal health data with a guarantee of:
      • Privacy: 57% – 2014
      • Anonymization: 90% – 2014
  • Surveyed Researchers: 
    • Percentage who agree or strongly agree that self-tracking data would help provide more insights in their research: 89% – 2014
    • Percentage who say PHD could answer questions that other data sources could not: 95% – 2014
    • Percentage who have used public datasets: 57% – 2014
    • Percentage who have paid for data for research: 19% – 2014
    • Percentage who have used self-tracking data before for research purposes: 46% – 2014
    • Percentage who have worked with application, device, or social media companies: 23% – 2014
    • Percentage who “somewhat disagree” or “strongly disagree” there are barriers that cannot be overcome to using self-tracking data in their research: 82% – 2014 

SOURCES: 

“2019 Accenture Global Financial Services Consumer Study: Discover the Patterns in Personality”, Accenture, 2019. 

“Americans’ Views About Data Collection and Security”, Pew Research Center, 2015. 

“Data Donation: Sharing Personal Data for Public Good?”, ResearchGate, 2014.

“Data privacy: What the consumer really thinks,” Acxiom, 2018.

“Exclusive: Public wants Big Tech regulated”, Axios, 2018.

“Consumer data value exchange,” Microsoft, 2015.

“Crossing the Line: Staying on the right side of consumer privacy,” KPMG International Cooperative, 2016.

“How do you feel about the government sharing our personal data? – livechat”, The Guardian, 2017. 

“Personal data for public good: using health information in medical research”, The Academy of Medical Sciences, 2006. 

“Personal Data for the Public Good: New Opportunities to Enrich Understanding of Individual and Population Health”, Robert Wood Johnson Foundation, Health Data Exploration Project, Calit2, UC Irvine and UC San Diego, 2014. 

“Pew Internet and American Life Project: Cloud Computing Raises Privacy Concerns”, Pew Research Center, 2008. 

“Poll: Little Trust That Tech Giants Will Keep Personal Data Private”, Morning Consult & Politico, 2017. 

“Privacy and Information Sharing”, Pew Research Center, 2016. 

“Privacy, Data and the Consumer: What US Thinks About Sharing Data”, MarTech Advisor, 2018. 

“Public Opinion on Privacy”, Electronic Privacy Information Center, 2019. 

“Selligent Marketing Cloud Study Finds Consumer Expectations and Marketer Challenges are Rising in Tandem”, Selligent Marketing Cloud, 2018. 

“The Data-Sharing Disconnect: The Impact of Context, Consumer Trust, and Relevance in Retail Marketing,” Boxever, 2015.

“Microsoft Research reveals understanding gap in the brand-consumer data exchange,” Microsoft Research, 2015.

“Survey: 58% will share personal data under the right circumstances”, Marketing Land: Third Door Media, 2019. 

“The state of privacy in post-Snowden America”, Pew Research Center, 2016. 

“The Tradeoff Fallacy: How Marketers Are Misrepresenting American Consumers And Opening Them Up to Exploitation”, University of Pennsylvania, 2015.

Index: The Data Universe 2019


By Michelle Winowatan, Andrew J. Zahuranec, Andrew Young, Stefaan Verhulst, Max Jun Kim

The Living Library Index – inspired by the Harper’s Index – provides important statistics and highlights global trends in governance innovation. This installment focuses on the data universe.

Please share any additional, illustrative statistics on data, or other issues at the nexus of technology and governance, with us at info@thelivinglib.org

Internet Traffic:

  • Percentage of the world’s population that uses the internet: 51.2% (3.9 billion people) – 2018
  • Number of searches processed worldwide by Google every year: at least 2 trillion – 2016
  • Website traffic worldwide generated through mobile phones: 52.2% – 2018
  • The total number of mobile subscriptions in the first quarter of 2019: 7.9 billion (addition of 44 million in quarter) – 2019
  • Amount of mobile data traffic worldwide: nearly 30 billion GB – 2018
  • Data category with highest traffic worldwide: video (60%) – 2018
  • Global average of data traffic per smartphone per month: 5.6 GB – 2018
    • North America: 7 GB – 2018
    • Latin America: 3.1 GB – 2018
    • Western Europe: 6.7 GB – 2018
    • Central and Eastern Europe: 4.5 GB – 2018
    • North East Asia: 7.1 GB – 2018
    • Southeast Asia and Oceania: 3.6 GB – 2018
    • India, Nepal, and Bhutan: 9.8 GB – 2018
    • Middle East and Africa: 3.0 GB – 2018
  • Time between the creation of each new bitcoin block: 9.27 minutes – 2019

Streaming Services:

  • Total hours of video streamed by Netflix users every minute: 97,222 – 2017
  • Hours of YouTube watched per day: over 1 billion – 2018
  • Number of tracks uploaded to Spotify every day: Over 20,000 – 2019
  • Number of Spotify’s monthly active users: 232 million – 2019
  • Spotify’s total subscribers: 108 million – 2019
  • Hours of content listened to on Spotify: 17 billion – 2019
  • Total number of songs on Spotify’s catalog: over 30 million – 2019
  • Apple Music’s total subscribers: 60 million – 2019
  • Total number of songs on Apple Music’s catalog: 45 million – 2019

Social Media:

Calls and Messaging:

Retail/Financial Transaction:

  • Number of packages shipped by Amazon in a year: 5 billion – 2017
  • Total value of payments processed by Venmo in a year: USD 62 billion – 2019
  • Based on an independent analysis of public transactions on Venmo in 2017:
  • Based on a non-representative survey of 2,436 US consumers between the ages of 21 and 72 on P2P platforms:
    • The average volume of transactions handled by Venmo: USD 64.2 billion – 2019
    • The average volume of transactions handled by Zelle: USD 122.0 billion – 2019
    • The average volume of transactions handled by PayPal: USD 141.8 billion – 2019 
    • Platform with the highest percent adoption among all consumers: PayPal (48%) – 2019 

Internet of Things:

Sources: