The Dictatorship of Data


Kenneth Cukier and Viktor Mayer-Schönberger in MIT Technology Review: “Big data is poised to transform society, from how we diagnose illness to how we educate children, even making it possible for a car to drive itself. Information is emerging as a new economic input, a vital resource. Companies, governments, and even individuals will be measuring and optimizing everything possible.
But there is a dark side. Big data erodes privacy. And when it is used to make predictions about what we are likely to do but haven’t yet done, it threatens freedom as well. Yet big data also exacerbates a very old problem: relying on the numbers when they are far more fallible than we think. Nothing underscores the consequences of data analysis gone awry more than the story of Robert McNamara.”

"A bite of me"


Federico Zannier @ Kickstarter: “I’ve data mined myself. I’ve violated my own privacy. Now I am selling it all. But how much am I worth?

I spend hours every day surfing the internet. Meanwhile, companies like Facebook and Google have been using my online information (the websites I visit, the friends I have, the videos I watch) for their own benefit.
In 2012, advertising revenue in the United States was around $30 billion. That same year, I made exactly $0 from my own data. But what if I tracked everything myself? Could I at least make a couple bucks back?
I started looking at the terms of service for the websites I often use. In their privacy policies, I have found sentences like this: “You grant a worldwide, non-exclusive, royalty-free license to use, copy, reproduce, process, adapt, modify, publish, transmit, display and distribute such content in any and all media or distribution methods (now known or later developed).” I’ve basically agreed to give away a lifelong, international, sub-licensable right to use my personal data….
Check out myprivacy.info to see some of the visualizations I’ve made.
http://myprivacy.info”

Life and Death of Tweets Not so Random After All


MIT Technology Review: “MIT assistant professor Tauhid Zaman and two other researchers (Emily Fox at the University of Washington and Eric Bradlow at the University of Pennsylvania’s Wharton School) have come up with a model that can predict how many times a tweet will ultimately be retweeted, minutes after it is posted. The model was created by collecting retweets on a slew of topics and looking at the time when the original tweet was posted and how fast it spread. That provided knowledge used to predict how popular a new tweet will be by looking at how many times it was retweeted shortly after it was first posted.
The researchers’ findings were explained in a paper submitted to the Annals of Applied Statistics. In the paper, the authors note that “understanding retweet behavior could lead to a better understanding of how broader ideas spread in Twitter and in other social networks,” and such data may be helpful in a number of areas, like marketing and political campaigning.
You can check out the model here.”
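
To make the idea of predicting a final retweet count from early activity concrete, here is a minimal sketch. It is not the researchers' model. It assumes, purely for illustration, that the delay between a tweet and each of its retweets follows a log-normal distribution with known parameters, so the count observed so far can be scaled by the fraction of retweets expected to have arrived by that time. The function names and the default values of `mu` and `sigma` are hypothetical.

```python
import math


def lognormal_cdf(t, mu, sigma):
    """Probability that a log-normal reaction time is at most t (t in minutes)."""
    if t <= 0:
        return 0.0
    z = (math.log(t) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def predict_final_retweets(observed, minutes_elapsed, mu=3.0, sigma=1.5):
    """Extrapolate the eventual retweet count from the count seen so far.

    mu and sigma are illustrative placeholders (assumptions), not fitted values.
    """
    fraction_seen = lognormal_cdf(minutes_elapsed, mu, sigma)
    if fraction_seen == 0.0:
        raise ValueError("need a positive observation window")
    return observed / fraction_seen


if __name__ == "__main__":
    # 40 retweets seen in the first 10 minutes after posting.
    print(round(predict_final_retweets(40, 10)))
```

Under these assumptions, the same early count implies a smaller final total the longer it took to accumulate, which matches the intuition that speed of spread, not only volume, carries the signal.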

Mary Meeker’s Internet Trends Report


AllThingsD: For the second year in a row, Mary Meeker is unveiling her now famed Internet Trends report at the D11 Conference.

Meeker, the Kleiner Perkins Caufield & Byers partner, highlights growth of Internet usage and other activities on mobile devices and updates that now infamous gap between mobile internet usage and mobile monetization.
But there are many new additions. Among them are the rise of wearable tech as perhaps the next big tech cycle of the coming decade and a look at how Americans’ online sharing habits compare to the rest of the world.
Here’s Meeker’s full presentation:

KPCB Internet Trends 2013 from Kleiner Perkins Caufield & Byers

The Internet as Politicizing Instrument


New Issue of Transformations (Editorial): “This issue of Transformations presents essays responding to Marcus Breen’s recent book Uprising: The Internet’s Unintended Consequences. Breen asks whether the Internet can become a politicising instrument for the new online proletariat – the individualised users isolated by the monitor screen. He asks “if the proletariat can use the Internet, is it freed from the moral and social constraints of the past that were imposed by conventional media and its regulation of the public space?” (32) This question raises further issues. Does this freedom translate into an emancipatory politics where the proletariat is able to pursue its own ends, or does it simply reproduce the power relation between the user-subject and the Internet and those who control and manage it? The articles in this issue respond in various ways to these questions.
Marcus Breen’s own article “The Internet and Privatism: Reconstructing the Monitor Space” makes a case for privatism – the restriction of subjective life to isolated or privatised experience, especially in relation to the computer monitor – as the new modality of meaning making in the Internet era. Using approaches associated with cultural and media studies, the paper traces the way the Internet has influenced the shift in the culture towards values associated with the confluence of ideas around the private, best described by privatism.
Fidele Vlavo’s article investigates the central discourses that have constructed the internet as a democratic and public environment removed from state and corporate control. The aim is to call attention to the issues that have limited the development of the internet as a tool for socio-political empowerment. The paper first retraces the early discursive constructions that insist on representing the internet as a decentralised and open structure. It also questions the role played by the digerati (or cyber elite) in the formulation of contradictory demands for public interests, self-governance, and entrepreneurial rights. Finally, it examines the emergence of two early virtual communities and their attempts to facilitate free speech and self-regulation. In the context of activists advocating freedom of expression and government institutions re-organizing legislation to control the Internet, the examination of these discourses provides a useful starting point for the (re)assessment of the potential of direct online mobilization.
Emit Snake-Beings’ article examines the limits of the Internet as a politicising instrument by showing how Internet users are subject to the controls of the search engine algorithm, managed by elite groups whose purpose is to reproduce themselves in terms of neo-liberal capitalism. Invoking recent political events in the Middle East and in London in which a wired proletariat sought to resist and overturn political authorities through Internet communication, Snake-Beings argues that such events are compromised by the fact that they owe their possibility to Internet providers and their commercial imperatives. Snake-Beings’ article, as well as most of the other articles in this issue, offers a timely reminder not only of the possibilities, but of the limits of the Internet as a politicising instrument for progressive, emancipatory politics.
Frances Shaw’s paper concerns the way in which the logic of surveillance operates in contested sites in cities where live coverage of demonstrations against capitalism leads to confrontation between demonstrators and police. Through a detailed account of the “Occupy Sydney” demonstration in 2011, Shaw shows how both demonstrators and police engaged in tactics of surveillance and resistance to counter each other’s power and authority. In an age of instant communication and global surveillance, freedom of movement and freedom from surveillance in public spaces are drawn into the logics of power mediated by mobile ‘phones and computer-based communication technology.
Karyl Ketchum’s paper offers detailed analysis of two Internet sites to show how the proletarianisation of the Internet is gendered in terms of male interests. Picking up on Breen’s argument that Internet proletarianisation leads to an open system that “supports both anything and anyone,” she argues that, in the domain of online pornography, this new-found freedom turns out to be “the power of computer analytics to harness and hone the shifting meanings of white Western Enlightenment masculinities in new globalising postcolonial contexts, economies and geopolitical struggles.” Furthermore, Ketchum shows how this default to male interests was also at work in American reporting of the Arab Spring revolutions in Egypt and other Middle Eastern countries. The YouTube video posted by a young Egyptian woman, Asmaa Mahfouz, which sparked the revolution in Egypt that eventually overthrew the Mubarak government, was not given due coverage by the Western media, so that “women like Mahfouz all but disappear from Western accounts of the Arab Spring.”
Liden and Giritli Nygren’s paper addresses the challenges to the theories of the political sphere posed by a digital society. It is suggested that this is most evident at the intersection between understandings of technology, performativities, and politics that combines empirical closeness with abstract understandings of socio-political and cultural contexts. The paper exemplifies this by reporting on a study of online citizen dialogue in the making, in this case concerning school planning in a Swedish municipality. Applying these theoretical perspectives to this case provides some key findings. The technological design is regarded as restricting the potential dialogue, as is outlined in different themes where the participants enact varying positions—taxpayers, citizen consumers, or local residents. The political analysis stresses a dialogue that lacks both polemic and public perspectives, and rather is characterized by the expression of different special interests. Together, these perspectives can provide the foundation for the development of applying theories in a digital society.
The Internet and Privatism: Reconstructing the Monitor Space (Marcus Breen)
The Digital Hysterias of Decentralisation, Entrepreneurship and Open Community (Fidele Vlavo)
From Ideology to Algorithm: the Opaque Politics of the Internet (Emit Snake-Beings)
“Walls of Seeing”: Protest Surveillance, Embodied Boundaries, and Counter-Surveillance at Occupy Sydney (Frances Shaw)
Gendered Uprisings: Desire, Revolution, and the Internet’s “Unintended Consequences” (Karyl E. Ketchum)
Analysing the Intersections between Technology, Performativity, and Politics: the Case of Local Citizen Dialogue (Gustav Lidén and Katarina Giritli Nygren)”

If My Data Is an Open Book, Why Can’t I Read It?


Natasha Singer in the New York Times: “Never mind all the hoopla about the presumed benefits of an “open data” society. In our day-to-day lives, many of us are being kept in the data dark.

“The fact that I am producing data and companies are collecting it to monetize it, if I can’t get a copy myself, I do consider it unfair,” says Latanya Sweeney, the director of the Data Privacy Lab at Harvard, where she is a professor of government and technology….

In fact, a few companies are challenging the norm of corporate data hoarding by actually sharing some information with the customers who generate it — and offering tools to put it to use. It’s a small but provocative trend in the United States, where only a handful of industries, like health care and credit, are required by federal law to provide people with access to their records.

Last year, San Diego Gas and Electric, a utility, introduced an online energy management program in which customers can view their electricity use in monthly, daily or hourly increments. There is even a practical benefit: customers can earn credits by reducing energy consumption during peak hours….”

The Declassification Engine


Wired: “The CIA offers an electronic search engine that lets you mine about 11 million agency documents that have been declassified over the years. It’s called CREST, short for CIA Records Search Tool. But this represents only a portion of the CIA’s declassified materials, and if you want unfettered access to the search engine, you’ll have to physically visit the National Archives at College Park, Maryland….
a new project launched by a team of historians, mathematicians, and computer scientists at Columbia University in New York City. Led by Matthew Connelly — a Columbia professor trained in diplomatic history — the project is known as The Declassification Engine, and it seeks to provide a single online database for declassified documents from across the federal government, including the CIA, the State Department, and potentially any other agency.
The project is still in the early stages, but the team has already assembled a database of documents that stretches back to the 1940s, and it has begun building new tools for analyzing these materials. In aggregating all documents into a single database, the researchers hope to not only provide quicker access to declassified materials, but to glean far more information from these documents than we otherwise could.
In the parlance of the day, the project is tackling these documents with the help of Big Data. If you put enough of this declassified information in a single place, Connelly believes, you can begin to predict what government information is still being withheld.”

Digital Strategy: Delivering Better Results for the Public


The White House Blog: “Today marks one year since we released the Digital Government Strategy (PDF/HTML5), as part of the President’s directive to build a 21st Century Government that delivers better services to the American people.
The Strategy is built on the proposition that all Americans should be able to access information from their Government anywhere, anytime, and on any device; that open government data – data that are publicly accessible in easy-to-use formats – can fuel innovation and economic growth; and that technology can make government more transparent, more efficient, and more effective.
A year later, there’s a lot to be proud of:
Information Centric
In twelve months, the Federal Government has significantly shifted how it thinks about digital information – treating data as a valuable national asset that should be open and available to the public, to entrepreneurs, and others, instead of keeping it trapped in government systems. …
Shared Platform
The Federal Government and the American people cannot afford to have each agency build isolated and duplicative technology solutions. Instead, we must use modern platforms for digital services that can be shared across agencies….
Customer-Centric
Citizens shouldn’t have to struggle to access the information they need. To ensure that the American people can easily find government services, we implemented a government-wide Digital Analytics Program across all Federal websites….
Security and Privacy
Throughout all of these efforts, maintaining cyber security and protecting privacy have been paramount….
In the end, the digital strategy is all about connecting people to government resources in useful ways. And by “connecting” we mean a two-way street….
Learn more at: http://www.whitehouse.gov/digitalgov/strategy-milestones and http://www.whitehouse.gov/digitalgov/deliverables.”

Deepbills project


Cato Institute: “The Deepbills project takes the raw XML of Congressional bills (available at FDsys and Thomas) and adds semantic information to them inside the text.

You can download the continuously-updated data at http://deepbills.cato.org/download

Congress already produces machine-readable XML of almost every bill it proposes, but that XML is designed primarily for formatting a paper copy, not for extracting information. For example, it’s not currently possible to find every mention of an Agency, every legal reference, or even every spending authorization in a bill without having a human being read it….
Currently the following information is tagged:

  • Legal citations…
  • Budget Authorities (both Authorizations of Appropriations and Appropriations)…
  • Agencies, bureaus, and subunits of the federal government.
  • Congressional committees
  • Federal elective officeholders (Congressmen)”
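
As a rough illustration of what inline semantic tagging makes possible, the sketch below parses a small invented bill fragment and groups the tagged entities by type. The namespace URI, the `entity-ref` element, the `entity-type` attribute, and the type names are placeholders chosen for this example. They are assumptions, not the project's documented schema, and the sample text is made up.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Invented fragment: the markup vocabulary here is hypothetical.
SAMPLE = """<bill xmlns:ex="http://example.org/annotations">
  <section>
    <text>There is authorized to be appropriated to the
      <ex:entity-ref entity-type="federal-body">Department of Energy</ex:entity-ref>
      such sums as are necessary to carry out
      <ex:entity-ref entity-type="law-citation">section 101 of the Example Act</ex:entity-ref>.
    </text>
  </section>
</bill>"""

EX = "{http://example.org/annotations}"  # assumed namespace, for illustration only


def entities_by_type(xml_text):
    """Return {entity type: [tagged text, ...]} for every annotation in the bill."""
    root = ET.fromstring(xml_text)
    found = defaultdict(list)
    for ref in root.iter(EX + "entity-ref"):
        # Join nested text and collapse whitespace left over from line wrapping.
        label = " ".join("".join(ref.itertext()).split())
        found[ref.get("entity-type", "unknown")].append(label)
    return dict(found)


if __name__ == "__main__":
    for kind, labels in entities_by_type(SAMPLE).items():
        print(kind, labels)
```

With tags like these in place, finding every agency or legal reference becomes a tree traversal rather than a job for a human reader.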

Introducing: Project Open Data


White House Blog: “Technology evolves rapidly, and it can be challenging for policy and its implementation to evolve at the same pace.  Last week, President Obama launched the Administration’s new Open Data Policy and Executive Order aimed at ensuring that data released by the government will be as accessible and useful as possible.  To make sure this tech-focused policy can keep up with the speed of innovation, we created Project Open Data.
Project Open Data is an online, public repository intended to foster collaboration and promote the continual improvement of the Open Data Policy. We wanted to foster a culture change in government where we embrace collaboration and where anyone can help us make open data work better. The project is published on GitHub, an open source platform that allows communities of developers to collaboratively share and enhance code.  The resources and plug-and-play tools in Project Open Data can help accelerate the adoption of open data practices.  For example, one tool instantly converts spreadsheets and databases into APIs for easier consumption by developers.  The idea is that anyone, from Federal agencies to state and local governments to private citizens, can freely use and adapt these open source tools—and that’s exactly what’s happening.
Within the first 24 hours after Project Open Data was published, more than two dozen contributions (or “pull requests” in GitHub speak) were submitted by the public. The submissions included everything from fixing broken links, to providing policy suggestions, to contributing new code and tools. One pull request even included new code that translates geographic data from locked formats into open data that is freely available for use by anyone…”
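
The spreadsheet-to-API idea mentioned above can be sketched in a few lines. This is not the Project Open Data tool itself, only a minimal illustration of the concept using Python's standard library: rows from a CSV export are loaded once, and a query function returns the matching rows as JSON, which is the shape of response a developer would expect from a simple data API. The sample data, column names, and function names are invented for the example.

```python
import csv
import io
import json

# Invented sample data standing in for a spreadsheet export.
SAMPLE_CSV = """agency,dataset,records
Energy,Hourly Load,1200
Education,School Locations,350
Energy,Fuel Prices,80
"""


def load_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def query(rows, **filters):
    """Return rows whose columns equal every given filter, serialized as JSON."""
    matches = [
        row for row in rows
        if all(row.get(column) == value for column, value in filters.items())
    ]
    return json.dumps(matches, indent=2)


if __name__ == "__main__":
    rows = load_rows(SAMPLE_CSV)
    print(query(rows, agency="Energy"))
```

A real service would put an HTTP layer in front of `query`. Keeping the lookup separate from the transport makes the conversion step easy to test on its own.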