Undefined By Data: A Survey of Big Data Definitions


Paper by Jonathan Stuart Ward and Adam Barker: “The term big data has become ubiquitous. Owing to a shared origin between academia, industry and the media there is no single unified definition, and various stakeholders provide diverse and often contradictory definitions. The lack of a consistent definition introduces ambiguity and hampers discourse relating to big data. This short paper attempts to collate the various definitions which have gained some degree of traction and to furnish a clear and concise definition of an otherwise ambiguous term…
Despite the range and differences existing within each of the aforementioned definitions there are some points of similarity. Notably all definitions make at least one of the following assertions:
Size: the volume of the datasets is a critical factor.
Complexity: the structure, behaviour and permutations of the datasets is a critical factor.
Technologies: the tools and techniques which are used to process a sizable or complex dataset is a critical factor.
The definitions surveyed here all encompass at least one of these factors, most encompass two. An extrapolation of these factors would therefore postulate the following: Big data is a term describing the storage and analysis of large and/or complex data sets using a series of techniques including, but not limited to: NoSQL, MapReduce and machine learning.”

Making All Voices Count


Launch of Making All Voices Count: “Making All Voices Count is a global initiative that supports innovation, scaling-up, and research to deepen existing innovations and help harness new technologies to enable citizen engagement and government responsiveness…. Solvable problems need not remain unsolved. Democratic systems in the 21st century continue to be inhibited by 19th century timescales, with only occasional opportunities for citizens to express their views formally, such as during elections. In this century, many citizens have access to numerous tools that enable them to express their views – and measure government performance – in real time.
For example, online reporting platforms enable citizens to monitor the election process by reporting intimidation, vote buying, bias and misinformation; access to mobile technology allows citizens to update water suppliers on gaps in service delivery; crisis information can be crowdsourced via eyewitness reports of violence, as reported by email and sms.
The rise of mobile communication, the installation of broadband and the fast-growing availability of open data, offer tremendous opportunities for data journalism and new media channels. They can inspire governments to develop new ways to fight corruption and respond to citizens efficiently, effectively and fairly. In short, developments in technology and innovation mean that government and citizens can interact like never before.
Making All Voices Count is about seizing this moment to strengthen our commitments to promote transparency, fight corruption, empower citizens, and harness the power of new technologies to make government more effective and accountable.
The programme specifically aims to address the following barriers that weaken the link between governments and citizens:

  • Citizens lack incentives: Citizens may not have the necessary incentives to express their feedback on government performance – due to a sense of powerlessness, distrust in the government, fear of retribution, or lack of reliable information
  • Governments lack incentives: At the same time, governments need incentives to respond to citizen input whenever possible and to leverage citizen participation. The government’s response to citizens should be reinforced by proactive, public communication. This initiative will help create incentives for government to respond. Where government responds effectively, citizens’ confidence in government performance and approval ratings are likely to increase
  • Governments lack the ability to translate citizen feedback into action: This could be due to anything from political constraints to a lack of skills and systems. Governments need better tools to effectively analyze and translate citizen input into information that will lead to solutions and shape resource allocation. Once captured, citizens’ feedback (on their experiences with government performance) must be communicated so as to engage both the government and the broader public in finding a solution.
  • Citizens lack meaningful opportunities: Citizens need greater access to better tools and know-how to easily engage with government in a way that results in government action and citizen empowerment”

MicroMappers: Microtasking for Disaster Response


Patrick Meier: “My team and I at QCRI are about to launch MicroMappers: the first ever set of microtasking apps specifically customized for digital humanitarian response. If you’re new to microtasking in the context of disaster response, then I recommend reading this, this and this. The purpose of our web-based microtasking apps (we call them Clickers) is to quickly make sense of all the user-generated, multi-media content posted on social media during disasters. How? By using microtasking and making it as easy as a single click of the mouse to become a digital humanitarian volunteer. This is how volunteers with Zooniverse were able to click-and-thus-tag well over 2,000,000 images in under 48 hours.
We have already developed and customized four Clickers using the free and open source microtasking platform CrowdCrafting: TweetClicker, TweetGeoClicker, ImageClicker and ImageGeoClicker. Each Clicker includes a mini-tutorial to guide volunteers.”

Social media: its emerging importance and impact on citizen engagement


New article by Victoria Burton in International Affairs Forum that “examines the impact of social media which not only provides citizens alternative avenues to express themselves about government policies but presents new challenges and means for government to provide services to the public. An example is the CovJam online venture presented by Coventry City and IBM that used social media as part of a three-day brainstorming event about the city. Social media have facilitated government programs to carry out surveys and fine-tune services but perhaps the greatest aspect is that of greater public participation. Moving forward, it will be important to address social media across public sectors and establish strategies to leverage its advantages and benefits.”

From Crowd-Sourcing Potholes to Community Policing


New paper by Manik Suri (GovLab): “The tragic Boston Marathon bombing and hair-raising manhunt that ensued was a sobering event. It also served as a reminder that emerging “civic technologies” – platforms and applications that enable citizens to connect and collaborate with each other and with government – are more important today than ever before. As commentators have noted, local police and federal agents utilized a range of technological platforms to tap the “wisdom of the crowd,” relying on thousands of private citizens to develop a “hive mind” that identified two suspects within a record period of time.
In the immediate wake of the devastating attack on April 15th, investigators had few leads. But within twenty-four hours, senior FBI officials, determined to seek “assistance from the public,” called on everyone with information to submit all media, tips, and leads related to the Boston Marathon attack. This unusual request for help yielded thousands of images and videos from local Bostonians, tourists, and private companies through technological channels ranging from telephone calls and emails to Flickr posts and Twitter messages. In mere hours, investigators were able to “crowd-source” a tremendous amount of data – including thousands of images from personal cameras, amateur videos from smart phones, and cell-tower information from private carriers. Combing through data from this massive network of “eyes and ears,” law enforcement officials were quickly able to generate images of two lead suspects – enabling a “modern manhunt” to commence immediately.
Technological innovations have transformed our commercial, political, and social realities. These advances include new approaches to how we generate knowledge, access information, and interact with one another, as well as new pathways for building social movements and catalyzing political change. While a significant body of academic research has focused on the role of technology in transforming electoral politics and social movements, less attention has been paid to how technological innovation can improve the process of governance itself.
A growing number of platforms and applications lie at this intersection of technology and governance, in what might be termed the “civic technology” sector. Broadly speaking, this sector involves the application of new information and communication technologies – ranging from robust social media platforms to state-of-the-art big data analysis systems – to address public policy problems. Civic technologies encompass enterprises that “bring web technologies directly to government, build services on top of government data for citizens, and change the way citizens ask, get, or need services from government.” These technologies have the potential to transform governance by promoting greater transparency in policy-making, increasing government efficiency, and enhancing citizens’ participation in public sector decision-making.”

GovLab Seeks Open Data Success Stories


Wyatt Kash in InformationWeek: “A team of open government advocates, led by former White House aide Beth Noveck, has launched a campaign to identify 500 examples of how freely available government data is being put to profitable use in the private sector. Open Data 500 is part of a broader effort by New York University’s Governance Lab (GovLab) to conduct the “first real, comprehensive study of the use of open government data in the private sector,” said Joel Gurin, founder of OpenDataNow.com and senior adviser at GovLab.
Noveck, who served in the White House as the first U.S. deputy CTO and led the White House Open Government Initiative from 2009-2011, founded GovLab while also teaching at the MIT Media Lab and NYU’s Robert F. Wagner Graduate School of Public Service.
In an interview with InformationWeek Government, Gurin explained that the goal of GovLab, and the Open Data 500 project, is to show how technology and new uses of data can make government more effective, and create more of a partnership between government and the public. “We’re also trying to draw on more public expertise to solve government problems,” he said….
Gurin said Open Data 500 will primarily look at U.S.-based, revenue-producing companies or organizations where government data is a key resource for their business. While the GovLab will focus initially on the use of federal data, it will also look at cases where entrepreneurs are making use of state or local data, but in scalable fashion.
“This goes one step further than the datapaloozas” championed by U.S. CTO Todd Park to showcase tools developed by the private sector using government data. “We’re trying to show how we can make data sets even more impactful and useful.”
Gurin said the GovLab team hopes to complete the study by the end of this year. The team has already identified 150 companies as candidates. To submit your company for consideration, visit thegovlab.org/submit-your-company; to submit another company, visit thegovlab.org/open500.”

Smarter Than You Think: How Technology is Changing Our Minds for the Better


New book by Clive Thompson: “It’s undeniable—technology is changing the way we think. But is it for the better? Amid a chorus of doomsayers, Clive Thompson delivers a resounding “yes.” The Internet age has produced a radical new style of human intelligence, worthy of both celebration and analysis. We learn more and retain it longer, write and think with global audiences, and even gain an ESP-like awareness of the world around us. Modern technology is making us smarter, better connected, and often deeper—both as individuals and as a society.
In Smarter Than You Think Thompson shows that every technological innovation—from the written word to the printing press to the telegraph—has provoked the very same anxieties that plague us today. We panic that life will never be the same, that our attentions are eroding, that culture is being trivialized. But as in the past, we adapt—learning to use the new and retaining what’s good of the old.”

Cyberpsychology and New Media


A thematic reader, edited by Andrew Power and Grainne Kirwan: “Cyberpsychology is the study of human interactions with the internet, mobile computing and telephony, games consoles, virtual reality, artificial intelligence, and other contemporary electronic technologies. The field has grown substantially over the past few years and this book surveys how researchers are tackling the impact of new technology on human behaviour and how people interact with this technology.

Examining topics as diverse as online dating, social networking, online communications, artificial intelligence, health-information seeking behaviour, education online, online therapies and cybercrime, Cyberpsychology and New Media provides an in-depth overview of this burgeoning field, and allows those with little previous knowledge to gain an appreciation of the diversity of the research being undertaken in the area.”

(Appropriate) Big Data for Climate Resilience?


Amy Luers at the Stanford Social Innovation Review: “The answer to whether big data can help communities build resilience to climate change is yes—there are huge opportunities, but there are also risks.

Opportunities

  • Feedback: Strong negative feedback is core to resilience. A simple example is our body’s response to heat stress—sweating, which is a natural feedback to cool down our body. In social systems, feedbacks are also critical for maintaining functions under stress. For example, communication by affected communities after a hurricane provides feedback for how and where organizations and individuals can provide help. While this kind of feedback used to rely completely on traditional communication channels, now crowdsourcing and data mining projects, such as Ushahidi and Twitter Earthquake detector, enable faster and more-targeted relief.
  • Diversity: Big data is enhancing diversity in a number of ways. Consider public health systems. Health officials are increasingly relying on digital detection methods, such as Google Flu Trends or Flu Near You, to augment and diversify traditional disease surveillance.
  • Self-Organization: A central characteristic of resilient communities is the ability to self-organize. This characteristic must exist within a community (see the National Research Council Resilience Report); it is not something you can impose on it. However, social media and related data-mining tools (InfoAmazonia, Healthmap) can enhance situational awareness and facilitate collective action by helping people identify others with common interests, communicate with them, and coordinate efforts.

Risks

  • Eroding trust: Trust is well established as a core feature of community resilience. Yet the NSA PRISM escapade made it clear that big data projects are raising privacy concerns and possibly eroding trust. And it is not just an issue in government. For example, Target analyzes shopping patterns and can fairly accurately guess if someone in your family is pregnant (which is awkward if they know your daughter is pregnant before you do). When our trust in government, business, and communities weakens, it can decrease a society’s resilience to climate stress.
  • Mistaking correlation for causation: Data mining seeks meaning in patterns that are completely independent of theory (suggesting to some that theory is dead). This approach can lead to erroneous conclusions when correlation is mistakenly taken for causation. For example, one study demonstrated that data mining techniques could show a strong (however spurious) correlation between the changes in the S&P 500 stock index and butter production in Bangladesh. While interesting, a decision support system based on this correlation would likely prove misleading.
  • Failing to see the big picture: One of the biggest challenges with big data mining for building climate resilience is its overemphasis on the hyper-local and hyper-now. While this hyper-local, hyper-now information may be critical for business decisions, without a broader understanding of the longer-term and more-systemic dynamism of social and biophysical systems, big data provides no ability to understand future trends or anticipate vulnerabilities. We must not let our obsession with the here and now divert us from slower-changing variables such as declining groundwater, loss of biodiversity, and melting ice caps—all of which may silently define our future. A related challenge is the fact that big data mining tends to overlook the most vulnerable populations. We must not let the lure of the big data microscope on the “well-to-do” populations of the world make us blind to the less well-off populations within cities and communities that have more limited access to smart phones and the Internet.”

The Tech Intellectuals


New essay by Henry Farrell in Democracy: “A quarter of a century ago, Russell Jacoby lamented the demise of the public intellectual. The cause of death was an improvement in material conditions. Public intellectuals—Dwight Macdonald, I.F. Stone, and their like—once had little choice but to be independent. They had difficulty getting permanent well-paying jobs. However, as universities began to expand, they offered new opportunities to erstwhile unemployables. The academy demanded a high price. Intellectuals had to turn away from the public and toward the practiced obscurities of academic research and prose. In Jacoby’s description, these intellectuals “no longer need[ed] or want[ed] a larger public…. Campuses [were] their homes; colleagues their audience; monographs and specialized journals their media.”
Over the last decade, conditions have changed again. New possibilities are opening up for public intellectuals. Internet-fueled media such as blogs have made it much easier for aspiring intellectuals to publish their opinions. They have fostered the creation of new intellectual outlets (Jacobin, The New Inquiry, The Los Angeles Review of Books), and helped revitalize some old ones too (The Baffler, Dissent). Finally, and not least, they have provided the meat for a new set of arguments about how communications technology is reshaping society.
These debates have created opportunities for an emergent breed of professional argument-crafters: technology intellectuals. Like their predecessors of the 1950s and ’60s, they often make a living without having to work for a university. Indeed, the professoriate is being left behind. Traditional academic disciplines (except for law, which has a magpie-like fascination with new and shiny things) have had a hard time keeping up. New technologies, to traditionalists, are suspect: They are difficult to pin down within traditional academic boundaries, and they look a little too fashionable to senior academics, who are often nervous that their fields might somehow become publicly relevant.
Many of these new public intellectuals are more or less self-made. Others are scholars (often with uncomfortable relationships with the academy, such as Clay Shirky, an unorthodox professor who is skeptical that the traditional university model can survive). Others still are entrepreneurs, like technology and media writer and podcaster Jeff Jarvis, working the angles between public argument and emerging business models….
Different incentives would lead to different debates. In a better world, technology intellectuals might think more seriously about the relationship between technological change and economic inequality. Many technology intellectuals think of the culture of Silicon Valley as inherently egalitarian, yet economist James Galbraith argues that income inequality in the United States “has been driven by capital gains and stock options, mostly in the tech sector.”
They might think more seriously about how technology is changing politics. Current debates are still dominated by pointless arguments between enthusiasts who believe the Internet is a model for a radically better democracy, and skeptics who claim it is the dictator’s best friend.
Finally, they might pay more attention to the burgeoning relationship between technology companies and the U.S. government. Technology intellectuals like to think that a powerful technology sector can enhance personal freedom and constrain the excesses of government. Instead, we are now seeing how a powerful technology sector may enable government excesses. Without big semi-monopolies like Facebook, Google, and Microsoft to hoover up personal information, surveillance would be far more difficult for the U.S. government.
Debating these issues would require a more diverse group of technology intellectuals. The current crop are not diverse in some immediately obvious ways—there are few women, few nonwhites, and few non-English speakers who have ascended to the peak of attention. Yet there is also far less intellectual diversity than there ought to be. The core assumptions of public debates over technology get less attention than they need and deserve.”