Big Data Before the Web


Evan Hepler-Smith in the Wall Street Journal: “Sometime in the early 1950s, on a reservation in Wisconsin, a Menominee Indian man looked at an ink blot. An anthropologist recorded the man’s reaction according to a standard Rorschach-test protocol. The researcher submitted a copy of these notes to an enormous cache of records collected over the course of decades by American social scientists working among various “societies ‘other than our own.’ ” This entire collection of social-scientific data was photographed and printed in arrays of microscopic images on 3-by-5-inch cards. Sets of these cards were shipped to research libraries around the world. They gathered dust.

In the results of this Rorschach test, the anthropologist saw evidence of a culture eroded by modernity. Sixty years later, these documents also testify to the aspirations and fate of the social-scientific project for which they were generated. Deep within this forgotten Ozymandian card file sits the Menominee man’s reaction to Rorschach card VI: “It is like a dead planet. It seems to tell the story of a people once great who have lost . . . like something happened. All that’s left is the symbol.”

In “Database of Dreams: The Lost Quest to Catalog Humanity,” Rebecca Lemov delves into the ambitious efforts of mid-20th-century social scientists to build a “capacious and reliable science of the varieties of the human being” by generating an archive of human experience through interviews and tests and by storing the information on the high-tech media of the day.

For these psychologists and anthropologists, the key to a universal human science lay in studying members of cultures in transition between traditional and modern ways of life and in rendering their individuality as data. Interweaving stories of social scientists, Native American research subjects, and information technologies, Ms. Lemov presents a compelling account of “what ‘humanness’ came to mean in an age of rapid change in technological and social conditions.”

Ms. Lemov, an associate professor of the history of science at Harvard University, follows two contrasting threads through a story that she calls “a parable for our time.” She shows, first, how collecting data about human experience shapes human experience and, second, how a high-tech data repository of the 1950s became, as she puts it, a “data ruin.”…(More) – See also: Database of Dreams: The Lost Quest to Catalog Humanity

OpenFDA: an innovative platform providing access to a wealth of FDA’s publicly available data


Paper by Taha A Kass-Hout et al in JAMIA: “The objective of openFDA is to facilitate access to and use of big, important Food and Drug Administration (FDA) public datasets by developers, researchers, and the public through harmonization of data across disparate FDA datasets provided via application programming interfaces (APIs).

Materials and Methods: Using cutting-edge technologies deployed on FDA’s new public cloud computing infrastructure, openFDA provides open data for easier, faster (over 300 requests per second per process), and better access to FDA datasets; open source code and documentation shared on GitHub for open community contributions of examples, apps and ideas; and infrastructure that can be adopted for other public health big data challenges.

Results: Since its launch on June 2, 2014, openFDA has developed four APIs for drug and device adverse events, recall information for all FDA-regulated products, and drug labeling. There have been more than 20 million API calls (more than half from outside the United States), 6,000 registered users, 20,000 connected Internet Protocol addresses, and dozens of new software (mobile or web) apps developed. A case study demonstrates a use of openFDA data to understand an apparent association of a drug with an adverse event.

Conclusion: With easier and faster access to these datasets, consumers worldwide can learn more about FDA-regulated products…(More)”
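As a concrete illustration of the access pattern the paper describes, here is a minimal sketch of a query against the openFDA drug adverse event API, following the conventions of the public documentation at open.fda.gov; the drug name and the reaction-count aggregation are illustrative choices, not examples drawn from the paper:

```python
import requests

# openFDA drug adverse event endpoint (one of the four APIs described above).
BASE = "https://api.fda.gov/drug/event.json"

params = {
    # Restrict to adverse event reports that mention a given drug
    # (illustrative choice of drug).
    "search": 'patient.drug.medicinalproduct:"ASPIRIN"',
    # Aggregate: count matching reports by reported reaction term.
    "count": "patient.reaction.reactionmeddrapt.exact",
    "limit": 10,  # return the ten most frequent reaction terms
}

resp = requests.get(BASE, params=params)
resp.raise_for_status()

for row in resp.json()["results"]:
    print(f'{row["term"]}: {row["count"]}')
```

The same search/count/limit parameters apply across the other openFDA endpoints, such as recall enforcement reports and drug labeling.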

Big Data in the Policy Cycle: Policy Decision Making in the Digital Era


Paper by Johann Höchtl et al in the Journal of Organizational Computing and Electronic Commerce: “Although of high relevance to political science, the interaction between technological change and political change in the era of Big Data remains a somewhat neglected topic. Most studies focus on the concepts of e-government and e-governance, and on how existing government activities performed through the bureaucratic body of public administration could be improved by technology. This paper attempts to build a bridge between the field of e-governance and theories of public administration that goes beyond the service delivery approach that dominates a large part of e-government research. Using the policy cycle as a generic model for policy processes and policy development, the paper presents a new look at how policy decision making could be conducted on the basis of ICT and Big Data….(More)”

Citizenship, Social Media, and Big Data: Current and Future Research in the Social Sciences


Homero Gil de Zúñiga at Social Science Computer Review: “This special issue of the Social Science Computer Review provides a sample of the latest strategies employing large data sets in social media and political communication research. The proliferation of information communication technologies, social media, and the Internet, alongside the ubiquity of high-performance computing and storage technologies, has ushered in the era of computational social science. However, in no way does the use of “big data” represent a standardized area of inquiry in any field. This article briefly summarizes pressing issues when employing big data for political communication research. Major challenges remain to ensure the validity and generalizability of findings. Strong theoretical arguments are still a central part of conducting meaningful research. In addition, ethical practices concerning how data are collected remain an area of open discussion. The article surveys studies that offer unique and creative ways to combine methods and introduce new tools, while also proposing solutions to some of these ethical questions….(More)”

What Privacy Papers Should Policymakers Be Reading in 2016?


Stacy Gray at the Future of Privacy Forum: “Each year, FPF invites privacy scholars and authors to submit articles and papers to be considered by members of our Advisory Board, with an aim toward showcasing those articles that should inform any conversation about privacy among policymakers in Congress, as well as at the Federal Trade Commission and in other government agencies. For our sixth annual Privacy Papers for Policymakers, we received submissions on topics ranging from mobile app privacy, to location tracking, to drone policy.

Our Advisory Board selected papers that describe the challenges and best practices of designing privacy notices, ways to minimize the risks of re-identification of data by focusing on process-based data release policy and taking a precautionary approach to data release, the relationship between privacy and markets, and bringing the concept of trust more strongly into privacy principles.

Our top privacy papers for 2015 are, in alphabetical order:
  • Florian Schaub, Rebecca Balebako, Adam L. Durity, and Lorrie Faith Cranor
  • Ira S. Rubinstein and Woodrow Hartzog
  • Arvind Narayanan, Joanna Huey, and Edward W. Felten
  • Ryan Calo
  • Neil Richards and Woodrow Hartzog
Our two papers selected for Notable Mention are:
  • Peter Swire (Testimony, Senate Judiciary Committee Hearing, July 8, 2015)
  • Joel R. Reidenberg
….(More)”

Big Data as Governmentality – Digital Traces, Algorithms, and the Reconfiguration of Data in International Development


Paper by Mikkel Flyverbom, Anders Klinkby Madsen, and Andreas Rasche: “This paper conceptualizes how large-scale data and algorithms condition and reshape knowledge production when addressing international development challenges. The concept of governmentality and four dimensions of an analytics of government are proposed as a theoretical framework to examine how big data is constituted as an aspiration to improve the data and knowledge underpinning development efforts. Based on this framework, we argue that big data’s impact on how relevant problems are governed is enabled by (1) new techniques of visualizing development issues, (2) linking aspects of international development agendas to algorithms that synthesize large-scale data, (3) novel ways of rationalizing knowledge claims that underlie development efforts, and (4) shifts in professional and organizational identities of those concerned with producing and processing data for development. Our discussion shows that big data problematizes selected aspects of traditional ways to collect and analyze data for development (e.g., via household surveys). We also demonstrate that using big data analyses to address development challenges raises a number of questions that can undermine its impact….(More)”

Tech and Innovation to Re-engage Civic Life


Hollie Russon Gilman at the Stanford Social Innovation Review: “Sometimes even the best-intentioned policymakers overlook the power of people. And even the best-intentioned discussions on social impact and leveraging big data for the social sector can obscure the power of everyday people in their communities.

But time and time again, I’ve seen the transformative power of civic engagement when initiatives are structured well. For example, the other year I witnessed a high school student walk into a school auditorium one evening during Boston’s first-ever youth-driven participatory budgeting project. Participatory budgeting gives residents a structured opportunity to work together to identify neighborhood priorities, work in tandem with government officials to draft viable projects, and prioritize projects to fund. Elected officials in turn pledge to implement these projects and are held accountable to their constituents. Initially intrigued by an experiment in democracy (and maybe the free pizza), this student remained engaged over several months, because she met new members of her community; got to interact with elected officials; and felt like she was working on a concrete objective that could have a tangible, positive impact on her neighborhood.

For many of the young participants, ages 12-25, being part of a participatory budgeting initiative is the first time they are involved in civic life. Many were excited that the City of Boston, in collaboration with the nonprofit Participatory Budgeting Project, empowered young people with the opportunity to allocate $1 million in public funds. Through participating, young people gain invaluable civic skills, and sometimes even a passion that can fuel other engagements in civic and communal life.

This is just one example of a broader civic and social innovation trend. Across the globe, people are working together with their communities to solve seemingly intractable problems, but as diverse as those efforts are, there are also commonalities. Well-structured civic engagement creates the space and provides the tools for people to exert agency over policies. When citizens have concrete objectives, access to necessary technology (whether it’s postcards, trucks, or open data portals), and an eye toward outcomes, social change happens.

Using Technology to Distribute Expertise

Technology is allowing citizens around the world to participate in solving local, national, and global problems. When it comes to large, public bureaucracies, expertise is largely top-down and concentrated. Leveraging technology creates opportunities for people to work together in new ways to solve public problems. One way is through civic crowdfunding platforms like Citizinvestor.com, which cities can use to develop public sector projects for citizen support; several cities in Rhode Island and Oregon, as well as Philadelphia, have successfully pooled citizen resources to fund new public works. Another way is through citizen science. Old Weather, a crowdsourcing project from the National Archives and Zooniverse, enrolls people to transcribe old British ship logs to identify climate change patterns. Platforms like these allow anyone to devote a small amount of time or resources toward a broader public good. And because they have a degree of transparency, people can see the progress and impact of their efforts….(More)”

Big Data and Big Cities: The Promises and Limitations of Improved Measures of Urban Life


Paper by Edward L. Glaeser et al: “New, “big” data sources allow measurement of city characteristics and outcome variables at higher frequencies and finer geographic scales than ever before. However, big data will not solve large urban social science questions on its own. Big data has the most value for the study of cities when it allows measurement of the previously opaque, or when it can be coupled with exogenous shocks to people or place. We describe a number of new urban data sources and illustrate how they can be used to improve the study and function of cities. We first show how Google Street View images can be used to predict income in New York City, suggesting that similar image data can be used to map wealth and poverty in previously unmeasured areas of the developing world. We then discuss how survey techniques can be improved to better measure willingness to pay for urban amenities. Finally, we explain how Internet data is being used to improve the quality of city services….(More)”
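The abstract only names the Street View exercise; as a rough, hypothetical sketch of the general approach (image-derived features fed to a regularized regression against neighborhood income), the following uses synthetic placeholder data, since the paper’s actual features and model are not described here:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Placeholder data: in the setting the abstract describes, each row would be
# a feature vector extracted from Street View images of one block (e.g., by a
# pretrained vision model) and y that block's median household income.
# Both are synthetic here; nothing below reproduces the paper's data or model.
n_blocks, n_features = 1000, 256
X = rng.normal(size=(n_blocks, n_features))
weights = rng.normal(size=n_features)
y = X @ weights + rng.normal(scale=5.0, size=n_blocks)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Regularized regression keeps the high-dimensional image features in check.
model = Ridge(alpha=1.0).fit(X_train, y_train)
print("held-out R^2:", round(model.score(X_test, y_test), 3))
```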

Meeting the Challenges of Big Data


Opinion by the European Data Protection Supervisor: “Big data, if done responsibly, can deliver significant benefits and efficiencies for society and individuals in areas such as health, scientific research, and the environment. But there are serious concerns about the actual and potential impact of the processing of huge amounts of data on the rights and freedoms of individuals, including their right to privacy. The challenges and risks of big data therefore call for more effective data protection.

Technology should not dictate our values and rights, but neither should promoting innovation and preserving fundamental rights be perceived as incompatible. New business models exploiting new capabilities for the massive collection, instantaneous transmission, combination, and reuse of personal information for unforeseen purposes have placed the principles of data protection under new strains, calling for thorough consideration of how they are applied.

European data protection law has been developed to protect our fundamental rights and values, including our right to privacy. The question is not whether to apply data protection law to big data, but rather how to apply it innovatively in new environments. Our current data protection principles, including transparency, proportionality, and purpose limitation, provide the baseline we will need to protect our fundamental rights more dynamically in the world of big data. They must, however, be complemented by ‘new’ principles which have developed over the years, such as accountability and privacy by design and by default. The EU data protection reform package is expected to strengthen and modernise the regulatory framework.

The EU intends to maximise growth and competitiveness by exploiting big data. But the Digital Single Market cannot uncritically import the data-driven technologies and business models which have become economic mainstream in other areas of the world. Instead, it needs to show leadership in developing accountable personal data processing. The internet has evolved in such a way that surveillance – tracking people’s behaviour – is considered the indispensable revenue model for some of the most successful companies. This development calls for critical assessment and a search for other options.

In any event, and irrespective of the business models chosen, organisations that process large volumes of personal information must comply with applicable data protection law. The European Data Protection Supervisor (EDPS) believes that responsible and sustainable development of big data must rely on four essential elements:

  • organisations must be much more transparent about how they process personal data;
  • afford users a higher degree of control over how their data is used;
  • design user-friendly data protection into their products and services; and
  • become more accountable for what they do….(More)”

Open Government: Missing Questions


Vadym Pyrozhenko at Administration & Society: “This article places the Obama administration’s open government initiative within the context of the evolution of the U.S. information society. It examines the concept of openness along the three dimensions of Daniel Bell’s social analysis of the postindustrial society: structure, polity, and culture. Four “missing questions” raise the challenge of the compatibility of public service values with the culture of openness, address the right balance between postindustrial information management practices and the capacity of public organizations to accomplish their missions, and call for reconsidering the idea that greater structural openness of public organizations will necessarily increase their democratic legitimacy….(More)”