Index: The Data Universe


The Living Library Index – inspired by the Harper’s Index – provides important statistics and highlights global trends in governance innovation. This installment focuses on the data universe and was originally published in 2013.

  • How much data exists in the digital universe as of 2012: 2.7 zettabytes*
  • Increase in the quantity of Internet data from 2005 to 2012: +1,696%
  • Percent of the world’s data created in the last two years: 90
  • Number of exabytes (=1 billion gigabytes) created every day in 2012: 2.5; that number doubles every month
  • Percent of the digital universe in 2005 created by the U.S. and western Europe vs. emerging markets: 48 vs. 20
  • Percent of the digital universe in 2012 created by emerging markets: 36
  • Percent of the digital universe in 2020 predicted to be created by China alone: 21
  • How much information in the digital universe is created and consumed by consumers (video, social media, photos, etc.) in 2012: 68%
  • Percent of which enterprises have liability or responsibility for (copyright, privacy, compliance with regulations, etc.): 80
  • Amount included in the Obama Administration’s 2012 Big Data initiative: over $200 million
  • Amount the Department of Defense is investing annually on Big Data projects as of 2012: over $250 million
  • Data created per day in 2012: 2.5 quintillion bytes
  • How many terabytes* of data collected by the U.S. Library of Congress as of April 2011: 235
  • How many terabytes of data collected by Walmart per hour as of 2012: 2,560, or 2.5 petabytes*
  • Projected growth in global data generated per year, as of 2011: 40%
  • Number of IT jobs created globally by 2015 to support big data: 4.4 million (1.9 million in the U.S.)
  • Potential shortage of data scientists in the U.S. alone predicted for 2018: 140,000-190,000, in addition to 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions
  • Time needed to sequence the complete human genome (analyzing 3 billion base pairs) in 2003: ten years
  • Time needed in 2013: one week
  • The world’s annual effective capacity to exchange information through telecommunication networks in 1986, 2007, and (predicted) 2013: 281 petabytes, 65 exabytes, 667 exabytes
  • Projected amount of digital information created annually that will either live in or pass through the cloud: 1/3
  • Increase in data collection volume year-over-year in 2012: 400%
  • Increase in number of individual data collectors from 2011 to 2012: nearly double (over 300 data collection parties in 2012)

*1 zettabyte = 1 billion terabytes | 1 petabyte = 1,000 terabytes | 1 terabyte = 1,000 gigabytes | 1 gigabyte = 1 billion bytes
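The unit definitions in the footnote can be checked against a few of the figures in the list with a short script. This is only an illustrative sketch: the `BYTES_PER` table and the `convert` helper are hypothetical names introduced here, not part of the original index, and the table assumes the decimal (powers of 1,000) definitions given in the footnote.

```python
# Decimal (SI) byte units, following the footnote's definitions.
# BYTES_PER and convert() are illustrative names, not from the index.
BYTES_PER = {
    "byte": 1,
    "gigabyte": 10**9,
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
    "zettabyte": 10**21,
}


def convert(amount, from_unit, to_unit):
    """Convert an amount between two units in BYTES_PER."""
    return amount * BYTES_PER[from_unit] / BYTES_PER[to_unit]


# Footnote: 1 zettabyte = 1 billion terabytes.
zettabyte_in_terabytes = convert(1, "zettabyte", "terabyte")

# "2.5 quintillion bytes" per day is the same quantity as 2.5 exabytes.
daily_exabytes = convert(2.5e18, "byte", "exabyte")

# Walmart: 2,560 terabytes per hour, in decimal and binary petabytes.
walmart_decimal_pb = convert(2560, "terabyte", "petabyte")
walmart_binary_pb = 2560 / 1024  # assumes 1 petabyte = 1,024 terabytes

print(zettabyte_in_terabytes, daily_exabytes)
print(walmart_decimal_pb, walmart_binary_pb)
```

Under the footnote's decimal definitions, 2,560 terabytes is 2.56 petabytes. The "2.5 petabytes" figure in the list matches the binary convention (1 petabyte = 1,024 terabytes), which appears to be what the original source used.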

The Logic of Connective Action: Digital Media and the Personalization of Contentious Politics


New book by W. Lance Bennett and Alexandra Segerberg: “The Logic of Connective Action explains the rise of a personalized digitally networked politics in which diverse individuals address the common problems of our times such as economic fairness and climate change. Rich case studies from the United States, United Kingdom, and Germany illustrate a theoretical framework for understanding how large-scale connective action is coordinated using inclusive discourses such as “We Are the 99%” that travel easily through social media. In many of these mobilizations, communication operates as an organizational process that may replace or supplement familiar forms of collective action based on organizational resource mobilization, leadership, and collective action framing. In some cases, connective action emerges from crowds that shun leaders, as when Occupy protesters created media networks to channel resources and create loose ties among dispersed physical groups. In other cases, conventional political organizations deploy personalized communication logics to enable large-scale engagement with a variety of political causes. The Logic of Connective Action shows how power is organized in communication-based networks, and what political outcomes may result.”

Is Online Transparency Just a Feel-Good Sham?


Billy House in the National Journal: “It drew more than a few laughs in Washington. Not long after the White House launched its We the People website in 2011, where citizens could write online petitions and get a response if they garnered enough signatures, someone called for construction of a Star Wars-style Death Star.
With laudable humor, the White House dispatched Paul Shawcross, chief of the Science and Space Branch of the Office of Management and Budget, to explain that the administration “does not support blowing up planets.”
The incident caused a few chuckles, but it also made a more serious point: Years after politicians and government officials began using Internet surveys and online outreach as tools to engage people, the results overall have been questionable….
But skepticism over the value of these programs—and their genuineness—remains strong. Peter Levine, a professor at Tufts University’s Jonathan M. Tisch College of Citizenship and Public Service, said programs like online petitioning and citizen cosponsoring do not necessarily produce a real, representative voice for the people.
It can be “pretty easy to overwhelm these efforts with deliberate strategic action,” he said, noting that similar petitioning efforts in the European Union often find marijuana legalization as the most popular measure.”

Civic Innovation Fellowships Go Global


Some thoughts from Panthea Lee from Reboot: “In recent years, civic innovation fellowships have shown great promise to improve the relationships between citizens and government. In the United States, Code for America and the Presidential Innovation Fellows have demonstrated the positive impact a small group of technologists can have working hand-in-hand with government. With the launch of Code for All, Code for Europe, Code4Kenya, and Code4Africa, among others, the model is going global.
But despite the increasing popularity of civic innovation fellowships, there are few templates for how a “Code for” program can be adapted to a different context. In the US, the success of Code for America has drawn from a wealth of tech talent eager to volunteer skills, public and private support, and the active participation of municipal governments. Elsewhere, new “Code for” programs are surely going to have to operate within a different set of capacities and constraints.”

Smartphones As Weather Surveillance Systems


Tom Simonite in MIT Technology Review: “You probably never think about the temperature of your smartphone’s battery, but it turns out to provide an interesting method for tracking outdoor air temperature. It’s a discovery that adds to other evidence that mobile apps could provide a new way to measure what’s happening in the atmosphere and improve weather forecasting.
Startup OpenSignal, whose app crowdsources data on cellphone reception, first noticed in 2012 that changes in battery temperature correlated with those outdoors. On Tuesday, they published a scientific paper on that technique in a geophysics journal and announced that the technique will be used to interpret data from a weather crowdsourcing app. OpenSignal originally started collecting data on battery temperatures to try and understand the connections between signal strength and how quickly a device chews through its battery.
OpenSignal’s crowdsourced weather-tracking effort joins another accidentally enabled by smartphones. A project called PressureNET that collects air pressure data by taking advantage of the fact many Android phones have a barometer inside to aid their GPS function (see “App Feeds Scientists Atmospheric Data From Thousands of Smartphones”). Cliff Mass, an atmospheric scientist at the University of Washington, is working to incorporate PressureNET data into weather models that usually rely on data from weather stations. He believes that smartphones could provide valuable data from places where there are no weather stations, if enough people start sharing data using apps like PressureNET.
Other research suggests that logging changes in cell network signal strength perceived by smartphones could provide yet more weather data. In February researchers in the Netherlands produced detailed maps of rainfall compiled by monitoring fluctuations in the signal strength measured by cellular network masts, caused by water droplets in the atmosphere.”

Internet Governance is Our Shared Responsibility


New paper by Vint Cerf, Patrick Ryan and Max Senges: “This essay looks at the different roles that multistakeholder institutions play in the Internet governance ecosystem. We propose a model for thinking of Internet governance within the context of the Internet’s layered model. We use the example of the negotiations in Dubai in 2012 at the World Conference on International Telecommunications as an illustration for why it is important for different institutions within the governance system to focus on their respective areas of expertise (e.g., the ITU, ICANN, and IGF). Several areas of conflict (a “tussle”) are reviewed, such as the desire to promote more broadband infrastructure, a topic that is in the remit of the International Telecommunications Union, but also the recurring desire of countries like Russia and China to use the ITU to regulate content and restrict free expression on the Internet through onerous cybersecurity and spam provisions. We conclude that it is folly to try and regulate all these areas through an international treaty, and encourage further development of mechanisms for global debate like the Internet Governance Forum (IGF).”

Why the world’s governments are interested in creating hubs for open data


in Gigaom: “Amid the tech giants and eager startups that have camped out in East London’s trendy Shoreditch neighborhood, the Open Data Institute is the rare nonprofit on the block that talks about feel-good sorts of things like “triple-bottom line” and “social and environmental value.” …Governments everywhere are embracing the idea that open data is the right way to manage services for citizens. The U.K. has been a leader on this — just check out the simplicity of gov.uk — which is one of the reasons why ODI is U.K. born….“Open data” is open access to the data that has exploded on the scene in recent years, some of it due to the rise of our connected, digital lifestyles from the internet, sensors, GPS, and cell phones, just to name a few resources. But ODI is particularly interested in working with data sets that can have big global and societal impacts, like health, financial, environmental and government data. For example, in conjunction with startup OpenCorporates, ODI recently helped launch a data visualization about Goldman Sachs’s insanely complex corporate structure.”

How to do scientific research without even trying (much)


Ars Technica: “To some extent, scientific research requires expensive or specialized equipment—some work just requires a particle accelerator or a virus containment facility. But plenty of other research has very simple requirements: a decent camera, a bit of patience, or being in the right place at the right time. Since that sort of work is open to anyone, getting the public involved can be a huge win for scientists, who can then obtain much more information than they could have gathered on their own.
A group of Spanish researchers has now written an article that is a mixture of praise for this sort of citizen science, a resource list for people hoping to get involved, and a how-to guide for anyone inspired to join in. The researchers focus on their own area of interest—insects, specifically the hemiptera or “true bugs”—but a lot of what they say applies to other areas of research.

The paper also lists a variety of regional-specific sites that focus on insect identification and tracking, such as ones for the UK, Belgium, and Slovenia. But a dedicated system isn’t required for this sort of resource. In the researchers’ home base on the Iberian Peninsula, insects are tracked via a Flickr group. (If you’re interested in insect research and based in the US, you can also find dozens of projects at the SciStarter site.) We’ve uploaded some of the most amazing images into a gallery that accompanies this article.”
ZooKeys, 2013. DOI: 10.3897/zookeys.319.4342

The Recent Rise of Government Open Data APIs


Janet Wagner in ProgrammableWeb: “In recent months, the number of government open data APIs has been increasing rapidly due to a variety of factors including the development of open data technology platforms, the launch of Project Open Data and a recent White House executive order regarding government data.
ProgrammableWeb writer Mark Boyd has recently written three articles related to open data APIs; an article about the latest release of the CKAN API, an article about the UK Open Data Institute and an article about the CivOmega Open Data Search Engine. This post is a brief overview of several recent factors that have led to the rise of government open data APIs.”

 

Copyright Done Right? Finland To Vote On Crowdsourced Regulations


Fast-Feed: “Talk about crowdsourcing: Finland is set to vote on a set of copyright laws that weren’t proposed by government or content-making agencies: They were drafted by citizens.
Finns are able to propose laws that the government must consider if 50,000 supporters sign a petition calling for the law within six months. A set of copyright regulations that are fairer to everyone just passed that threshold, and TorrentFreak.com reports that a government vote is likely in early 2014. The new laws were created with the help of the Finnish Electronic Frontier Foundation, and the body has promised that it will maintain pressure on the political system so that the law will actually be changed.
The proposed new laws would decriminalize file sharing and prevent house searches and surveillance of pirates. TorrentFreak reminds us of the international media outcry that happened last year when during a police raid a 9-year-old girl’s laptop was confiscated on the grounds that she stole copyrighted content. Finland’s existing copyright laws, under what’s called the Lex Karpela amendment, are very strict and criminalize the breaking of DRM for copying purposes as well as preventing discussion of the technology for doing so. The laws have been criticized by activists and observers for their strictness and infringement upon freedom of speech.”