big data

Data and Humanitarian Response

Curated on April 7, 2016August 3, 2018 by Stefaan Verhulst

The GovLab: “As part of an ongoing effort to build a knowledge base for the field of opening governance by organizing and disseminating its learnings, the GovLab Selected Readings series provides an annotated and curated collection of recommended works on key opening governance topics. In this edition, we explore the literature on Data and Humanitarian Response. To suggest additional readings on this or any other topic, please email [email protected]. All our Selected Readings can be found here.

Context

Data, when used well in a trusted manner , allows humanitarian organizations to innovate how to respond to emergency events, including better coordination of post-disaster relief efforts, the ability to harness local knowledge to create more targeted relief strategies, and tools to predict and monitor disasters in real time. Consequently, in recent years both multinational groups and community-based advocates have begun to integrate data collection and evaluation strategies into their humanitarian operations, to better and more quickly respond to emergencies. However, this movement poses a number of challenges. Compared to the private sector, humanitarian organizations are often less equipped to successfully analyze and manage big data, which pose a number of risks related to the security of victims’ data. Furthermore, complex power dynamics which exist within humanitarian spaces may be further exacerbated through the introduction of new technologies and big data collection mechanisms. In the below we share:

Selected Reading List (summaries and hyperlinks)
Annotated Selected Reading List
Additional Readings….(More)”

What Should We Do About Big Data Leaks?

Curated on April 6, 2016October 10, 2018 by Stefaan Verhulst

Paul Ford at the New Republic: “I have a great fondness for government data, and the government has a great fondness for making more of it. Federal elections financial data, for example, with every contribution identified, connected to a name and address. Or the results of the census. I don’t know if you’ve ever had the experience of downloading census data but it’s pretty exciting. You can hold America on your hard drive! Meditate on the miracles of zip codes, the way the country is held together and addressable by arbitrary sets of digits.

You can download whole books, in PDF format, about the foreign policy of the Reagan Administration as it related to Russia. Negotiations over which door the Soviet ambassador would use to enter a building. Gigabytes and gigabytes of pure joy for the ephemeralist. The government is the greatest creator of ephemera ever.

Consider the Financial Crisis Inquiry Commission, or FCIC, created in 2009 to figure out exactly how the global economic pooch was screwed. The FCIC has made so much data, and has done an admirable job (caveats noted below) of arranging it. So much stuff. There are reams of treasure on a single FCIC web site, hosted at Stanford Law School: Hundreds of MP3 files, for example, with interviews with Jamie Dimonof JPMorgan Chase and Lloyd Blankfein of Goldman Sachs. I am desperate to find time to write some code that automatically extracts random audio snippets from each and puts them on top of a slow ambient drone with plenty of reverb, so that I can relax to the dulcet tones of the financial industry explaining away its failings. (There’s a Paul Krugman interview that I assume is more critical.)

The recordings are just the beginning. They’ve released so many documents, and with the documents, a finding aid that you can download in handy PDF format, which will tell you where to, well, find things, pointing to thousands of documents. That aid alone is 1,439 pages.

Look, it is excellent that this exists, in public, on the web. But it also presents a very contemporary problem: What is transparency in the age of massive database drops? The data is available, but locked in MP3s and PDFs and other documents; it’s not searchable in the way a web page is searchable, not easy to comment on or share.

Consider the WikiLeaks release of State Department cables. They were exhausting, there were so many of them, they were in all caps. Or the trove of data Edward Snowden gathered on aUSB drive, or Chelsea Manning on CD. And the Ashley Madison leak, spread across database files and logs of credit card receipts. The massive and sprawling Sony leak, complete with whole email inboxes. And with the just-released Panama Papers, we see two exciting new developments: First, the consortium of media organizations that managed the leak actually came together and collectively, well, branded the papers, down to a hashtag (#panamapapers), informational website, etc. Second, the size of the leak itself—2.5 terabytes!—become a talking point, even though that exact description of what was contained within those terabytes was harder to understand. This, said the consortia of journalists that notably did not include The New York Times, The Washington Post, etc., is the big one. Stay tuned. And we are. But the fact remains: These artifacts are not accessible to any but the most assiduous amateur conspiracist; they’re the domain of professionals with the time and money to deal with them. Who else could be bothered?

If you watched the movie Spotlight, you saw journalists at work, pawing through reams of documents, going through, essentially, phone books. I am an inveterate downloader of such things. I love what they represent. And I’m also comfortable with many-gigabyte corpora spread across web sites. I know how to fetch data, how to consolidate it, and how to search it. I share this skill set with many data journalists, and these capacities have, in some ways, become the sole province of the media. Organs of journalism are among the only remaining cultural institutions that can fund investigations of this size and tease the data apart, identifying linkages and thus constructing informational webs that can, with great effort, be turned into narratives, yielding something like what we call “a story” or “the truth.”

Spotlight was set around 2001, and it features a lot of people looking at things on paper. The problem has changed greatly since then: The data is everywhere. The media has been forced into a new cultural role, that of the arbiter of the giant and semi-legal database. ProPublica, a nonprofit that does a great deal of data gathering and data journalism and then shares its findings with other media outlets, is one example; it funded a project called DocumentCloud with other media organizations that simplifies the process of searching through giant piles of PDFs (e.g., court records, or the results of Freedom of Information Act requests).

At some level the sheer boredom and drudgery of managing these large data leaks make them immune to casual interest; even the Ashley Madison leak, which I downloaded, was basically an opaque pile of data and really quite boring unless you had some motive to poke around.

If this is the age of the citizen journalist, or at least the citizen opinion columnist, it’s also the age of the data journalist, with the news media acting as product managers of data leaks, making the information usable, browsable, attractive. There is an uneasy partnership between leakers and the media, just as there is an uneasy partnership between the press and the government, which would like some credit for its efforts, thank you very much, and wouldn’t mind if you gave it some points for transparency while you’re at it.

Pause for a second. There’s a glut of data, but most of it comes to us in ugly formats. What would happen if the things released in the interest of transparency were released in actual transparent formats?…(More)”

New Orleans Gamifies the City Budget

Curated on April 5, 2016August 3, 2018 by Stefaan Verhulst

Kelsey E. Thomas at Next City: “New Orleanians can try their hand at being “mayor for a day” with a new interactive website released by the Committee for a Better New Orleans Wednesday.

The Big Easy Budget Game uses open data from the city to allow players to create their own version of an operating budget. Players are given a digital $602 million, and have to balance the budget — keeping in mind the government’s responsibilities, previous year’s spending and their personal priorities.

Each department in the game has a minimum funding level (players can’t just quit funding public schools if they feel like it), and restricted funding, such as state or federal dollars, is off limits.

CBNO hopes to attract 600 players this year, and plans to compile the data from each player into a crowdsourced meta-budget called “The People’s Budget.” Next fall, the People’s Budget will be released along with the city’s proposed 2017 budget.

Along with the budgeting game, CBNO released a more detailed website, also using the city’s open data, that breaks down the city’s budgeted versus actual spending from 2007 to now and is filterable. The goal is to allow users without big data experience to easily research funding relevant to their neighborhoods.

Many cities have been releasing interactive websites to make their data more accessible to residents. Checkbook NYC updates more than $70 billion in city expenses daily and breaks them down by transaction. Fiscal Focus Pittsburgh is an online visualization tool that outlines revenues and expenses in the city’s budget….(More)”

Open data and the API economy: when it makes sense to give away data

Curated on April 5, 2016May 29, 2019 by Stefaan Verhulst

Joe McKendrick at ZDNet: “Open data is one of those refreshing trends that flows in the opposite direction of the culture of fear that has developed around data security. Instead of putting data under lock and key, surrounded by firewalls and sandboxes, some organizations see value in making data available to all comers — especially developers.

The GovLab.org, a nonprofit advocacy group, published an overview of the benefits governments and organizations are realizing from open data, as well as some of the challenges. The group defines open data as “publicly available data that can be universally and readily accessed, used and redistributed free of charge. It is structured for usability and computability.”…

For enterprises, an open-data stance may be the fuel to build a vibrant ecosystem of developers and business partners. Scott Feinberg, API architect for The New York Times, is one of the people helping to lead the charge to open-data ecosystems. In a recent CXOTalk interview with ZDNet colleague Michael Krigsman, he explains how through the NYT APIs program, developers can sign up for access to 165 years worth of content.

But it requires a lot more than simply throwing some APIs out into the market. Establishing such a comprehensive effort across APIs requires a change in mindset that many organizations may not be ready for, Feinberg cautions. “You can’t be stingy,” he says. “You have to just give it out. When we launched our developer portal there’s a lot of questions like, are people going to be stealing our data, questions like that. Just give it away. You don’t have to give it all but don’t be stingy, and you will find that first off not that many people are going to use it at first. you’re going to find that out, but the people who do, you’re going to find those passionate people who are really interested in using your data in new ways.”

Feinberg clarifies that the NYT’s APIs are not giving out articles for free. Rather, he explains, “we give is everything but article content. You can search for articles. You can find out what’s trending. You can almost do anything you want with our data through our APIs with the exception of actually reading all of the content. It’s really about giving people the opportunity to really interact with your content in ways that you’ve never thought of, and empowering your community to figure out what they want. You know while we don’t give our actual article text away, we give pretty much everything else and people build a lot of really cool stuff on top of that.”

Open data sets, of course, have to worthy of the APIs that offer them. In his post, Borne outlines the seven qualities open data needs to have to be of value to developers and consumers. (Yes, they’re also “Vs” like big data.)

Validity: It’s “critical to pay attention to these data validity concerns when your organization’s data are exposed to scrutiny and inspection by others,” Borne states.
Value: The data needs to be the font of new ideas, new businesses, and innovations.
Variety: Exposing the wide variety of data available can be “a scary proposition for any data scientist,” Borne observes, but nonetheless is essential.
Voice: Remember that “your open data becomes the voice of your organization to your stakeholders.”
Vocabulary: “The semantics and schema (data models) that describe your data are more critical than ever when you provide the data for others to use,” says Borne. “Search, discovery, and proper reuse of data all require good metadata, descriptions, and data modeling.”
Vulnerability: Accept that open data, because it is so open, will be subjected to “misuse, abuse, manipulation, or alteration.”
proVenance: This is the governance requirement behind open data offerings. “Provenance includes ownership, origin, chain of custody, transformations that been made to it, processing that has been applied to it (including which versions of processing software were used), the data’s uses and their context, and more,” says Borne….(More)”

Selected Readings on Data and Humanitarian Response

Curated on April 3, 2016May 29, 2019 by Stefaan Verhulst

By Prianka Srinivasan and Stefaan G. Verhulst *

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of data and humanitarian response was originally published in 2016.

Data, when used well in a trusted manner, allows humanitarian organizations to innovate how to respond to emergency events, including better coordination of post-disaster relief efforts, the ability to harness local knowledge to create more targeted relief strategies, and tools to predict and monitor disasters in real time. Consequently, in recent years both multinational groups and community-based advocates have begun to integrate data collection and evaluation strategies into their humanitarian operations, to better and more quickly respond to emergencies. However, this movement poses a number of challenges. Compared to the private sector, humanitarian organizations are often less equipped to successfully analyze and manage big data, which pose a number of risks related to the security of victims’ data. Furthermore, complex power dynamics which exist within humanitarian spaces may be further exacerbated through the introduction of new technologies and big data collection mechanisms. In the below we share:

Selected Reading List (summaries and hyperlinks)
Annotated Selected Reading List
Additional Readings

Selected Reading List (summaries in alphabetical order)

Data and Humanitarian Response

John Karlsrud – Peacekeeping 4.0: Harnessing the Potential of Big Data, Social Media, and Cyber Technologies – Recommends that UN peacekeeping initiatives should better integrate big data and new technologies into their operations, adopting a “Peacekeeping 4.0” for the modern world.
Fancesco Mancini, International Peace Institute – New Technology and the prevention of Violence and Conflict – Explores the ways in which new tools available in communications technology can assist humanitarian workers in preventing violence and conflict.
Patrick Meier – Digital Humanitarians- How Big Data is changing the face of humanitarian response – Profiles the emergence of ‘Digital Humanitarians’—humanitarian workers who are using big data, crowdsourcing and new technologies to transform the way societies respond to humanitarian disasters.
Andrew Robertson and Steve Olson (USIP) – Using Data Sharing to Improve Coordination in Peacebuilding – Summarises the findings of a United States Institute of Peace workshop which investigated the use of data-sharing systems between government and non-government actors in conflict zones. It identifies some of the challenges and benefits of data-sharing in peacebuilding efforts.
United Nations Independent Expert Advisory Group on a Data Revolution for Sustainable Development – A World That Counts, Mobilizing the Data Revolution – Compiled by a group of 20 international experts, this report proposes ways to improve data management and monitoring, whilst mitigating some of the risks data poses.
Katie Whipkey and Andrej Verity – Guidance for Incorporating Big Data into Humanitarian Operations – Created as part of the Digital Humanitarian Network with the support of UN-OCHA, this is a manual for humanitarian organizations looking to strategically incorporate Big Data into their work.

Risks of Using Big Data in Humanitarian Context

Kate Crawford and Megan Finn – The limits of crisis data: analytical and ethical challenges of using social and mobile data to understand disasters – Analyzes the use of big data techniques following a crisis event, arguing that a reliance of social and mobile data can lead to significant oversights and ethical concerns in the wake of humanitarian disasters.
Katja Lindskov Jacobsen – Making design safe for citizens: A hidden history of humanitarian experimentation – Argues that the UNHCR’s use of iris recognition technology in 2002 and 2007 during the repatriation of Afghan refugees from Pakistan constitutes a case of “humanitarian experimentation.” It questions this sort of experimentation which compromises the security of refugees in the pursuit of safer technologies for the rest of the world.
Responsible Data Forum – Responsible Data Reflection Stories: an Overview – compiles various stories sourced by the Responsible Data Forum blog relating to data challenges faced by advocacy organizations, and draws recommendations based on these cases.
Kristin Bergtora Sandvik – The humanitarian cyberspace: shrinking space or an expanding frontier? – Provides a detailed account of the development of a “humanitarian cyberspace” and how information and communication technologies have been further integrated into humanitarian operations since the mid-1990s.

Annotated Selected Reading List (in alphabetical order)

Karlsrud, John. “Peacekeeping 4.0: Harnessing the Potential of Big Data, Social Media, and Cyber Technologies.” Cyberspace and International Relations, 2013. http://bit.ly/235Qb3e

This chapter from the book “Cyberspace and International Relations” suggests that advances in big data give humanitarian organizations unprecedented opportunities to prevent and mitigate natural disasters and humanitarian crises. However, the sheer amount of unstructured data necessitates effective “data mining” strategies for multinational organizations to make the most use of this data.
By profiling some civil-society organizations who use big data in their peacekeeping efforts, Karlsrud suggests that these community-focused initiatives are leading the movement toward analyzing and using big data in countries vulnerable to crisis.
The chapter concludes by offering ten recommendations to UN peacekeeping forces to best realize the potential of big data and new technology in supporting their operations.

Mancini, Fancesco. “New Technology and the prevention of Violence and Conflict.” International Peace Institute, 2013. http://bit.ly/1ltLfNV

This report from the International Peace Institute looks at five case studies to assess how information and communications technologies (ICTs) can help prevent humanitarian conflicts and violence. Their findings suggest that context has a significant impact on the ability for these ICTs for conflict prevention, and any strategies must take into account the specific contingencies of the region to be successful.
The report suggests seven lessons gleaned from the five case studies:
- New technologies are just one in a variety of tools to combat violence. Consequently, organizations must investigate a variety of complementary strategies to prevent conflicts, and not simply rely on ICTs.
- Not every community or social group will have the same relationship to technology, and their ability to adopt new technologies are similarly influenced by their context. Therefore, a detailed needs assessment must take place before any new technologies are implemented.
- New technologies may be co-opted by violent groups seeking to maintain conflict in the region. Consequently, humanitarian groups must be sensitive to existing political actors and be aware of possible negative consequences these new technologies may spark.
- Local input is integral to support conflict prevention measures, and there exists need for collaboration and awareness-raising with communities to ensure new technologies are sustainable and effective.
- Information shared between civil-society has more potential to develop early-warning systems. This horizontal distribution of information can also allow communities to maintain the accountability of local leaders.

Meier, Patrick. “Digital humanitarians: how big data is changing the face of humanitarian response.” Crc Press, 2015. http://amzn.to/1RQ4ozc

This book traces the emergence of “Digital Humanitarians”—people who harness new digital tools and technologies to support humanitarian action. Meier suggests that this has created a “nervous system” to connect people from disparate parts of the world, revolutionizing the way we respond to humanitarian crises.
Meier argues that such technology is reconfiguring the structure of the humanitarian space, where victims are not simply passive recipients of aid but can contribute with other global citizens. This in turn makes us more humane and engaged people.

Robertson, Andrew and Olson, Steve. “Using Data Sharing to Improve Coordination in Peacebuilding.” United States Institute for Peace, 2012. http://bit.ly/235QuLm

This report functions as an overview of a roundtable workshop on Technology, Science and Peace Building held at the United States Institute of Peace. The workshop aimed to investigate how data-sharing techniques can be developed for use in peace building or conflict management.
Four main themes emerged from discussions during the workshop:
- “Data sharing requires working across a technology-culture divide”—Data sharing needs the foundation of a strong relationship, which can depend on sociocultural, rather than technological, factors.
- “Information sharing requires building and maintaining trust”—These relationships are often built on trust, which can include both technological and social perspectives.
- “Information sharing requires linking civilian-military policy discussions to technology”—Even when sophisticated data-sharing technologies exist, continuous engagement between different stakeholders is necessary. Therefore, procedures used to maintain civil-military engagement should be broadened to include technology.
- “Collaboration software needs to be aligned with user needs”—technology providers need to keep in mind the needs of its users, in this case peacebuilders, in order to ensure sustainability.

United Nations Independent Expert Advisory Group on a Data Revolution for Sustainable Development. “A World That Counts, Mobilizing the Data Revolution.” 2014. https://bit.ly/2Cb3lXq

This report focuses on the potential benefits and risks data holds for sustainable development. Included in this is a strategic framework for using and managing data for humanitarian purposes. It describes a need for a multinational consensus to be developed to ensure data is shared effectively and efficiently.
It suggests that “people who are counted”—i.e., those who are included in data collection processes—have better development outcomes and a better chance for humanitarian response in emergency or conflict situations.

Katie Whipkey and Andrej Verity. “Guidance for Incorporating Big Data into Humanitarian Operations.” Digital Humanitarian Network, 2015. http://bit.ly/1Y2BMkQ

This report produced by the Digital Humanitarian Network provides an overview of big data, and how humanitarian organizations can integrate this technology into their humanitarian response. It primarily functions as a guide for organizations, and provides concise, brief outlines of what big data is, and how it can benefit humanitarian groups.
The report puts forward four main benefits acquired through the use of big data by humanitarian organizations: 1) the ability to leverage real-time information; 2) the ability to make more informed decisions; 3) the ability to learn new insights; 4) the ability for organizations to be more prepared.
It goes on to assess seven challenges big data poses for humanitarian organizations: 1) geography, and the unequal access to technology across regions; 2) the potential for user error when processing data; 3) limited technology; 4) questionable validity of data; 5) underdeveloped policies and ethics relating to data management; 6) limitations relating to staff knowledge.

Risks of Using Big Data in Humanitarian Context
Crawford, Kate, and Megan Finn. “The limits of crisis data: analytical and ethical challenges of using social and mobile data to understand disasters.” GeoJournal 80.4, 2015. http://bit.ly/1X0F7AI

Crawford & Finn present a critical analysis of the use of big data in disaster management, taking a more skeptical tone to the data revolution facing humanitarian response.
They argue that though social and mobile data analysis can yield important insights and tools in crisis events, it also presents a number of limitations which can lead to oversights being made by researchers or humanitarian response teams.
Crawford & Finn explore the ethical concerns the use of big data in disaster events introduces, including issues of power, privacy, and consent.
The paper concludes by recommending that critical data studies, such as those presented in the paper, be integrated into crisis event research in order to analyze some of the assumptions which underlie mobile and social data.

Jacobsen, Katja Lindskov (2010) Making design safe for citizens: A hidden history of humanitarian experimentation. Citizenship Studies 14.1: 89-103. http://bit.ly/1YaRTwG

This paper explores the phenomenon of “humanitarian experimentation,” where victims of disaster or conflict are the subjects of experiments to test the application of technologies before they are administered in greater civilian populations.
By analyzing the particular use of iris recognition technology during the repatriation of Afghan refugees to Pakistan in 2002 to 2007, Jacobsen suggests that this “humanitarian experimentation” compromises the security of already vulnerable refugees in order to better deliver biometric product to the rest of the world.

Responsible Data Forum. “Responsible Data Reflection Stories: An Overview.” http://bit.ly/1Rszrz1

This piece from the Responsible Data forum is primarily a compilation of “war stories” which follow some of the challenges in using big data for social good. By drawing on these crowdsourced cases, the Forum also presents an overview which makes key recommendations to overcome some of the challenges associated with big data in humanitarian organizations.
It finds that most of these challenges occur when organizations are ill-equipped to manage data and new technologies, or are unaware about how different groups interact in digital spaces in different ways.

Sandvik, Kristin Bergtora. “The humanitarian cyberspace: shrinking space or an expanding frontier?” Third World Quarterly 37:1, 17-32, 2016. http://bit.ly/1PIiACK

This paper analyzes the shift toward more technology-driven humanitarian work, where humanitarian work increasingly takes place online in cyberspace, reshaping the definition and application of aid. This has occurred along with what many suggest is a shrinking of the humanitarian space.
Sandvik provides three interpretations of this phenomena:
- First, traditional threats remain in the humanitarian space, which are both modified and reinforced by technology.
- Second, new threats are introduced by the increasing use of technology in humanitarianism, and consequently the humanitarian space may be broadening, not shrinking.
- Finally, if the shrinking humanitarian space theory holds, cyberspace offers one example of this, where the increasing use of digital technology to manage disasters leads to a contraction of space through the proliferation of remote services.

Additional Readings on Data and Humanitarian Response

Kristin Bergtora Sandvik, et al. – Humanitarian technology: a critical research agenda. – Takes a critical look at the field of humanitarian technology, analyzing what challenges this poses to post-disaster and conflict environment.
Kristin Bergtora Sandvik – “The Risks of Technological Innovation.” – Suggests that despite the evident benefits such technology presents, it can also undermine humanitarian action and lead to “catastrophic events” themselves needing a new type of humanitarian response.
Ryan Burns – Rethinking big data in digital humanitarianism: practices, epistemologies, and social relations – Takes a critical look at the use of big data in humanitarian spaces, arguing that the advent of digital humanitarianism has profound political and social implications, and can in fact limit information available following a humanitarian crisis.
Kate Crawford – Is Data a Danger to the Developing World? – Argues that it is not simply risks to privacy that data poses to developing countries, but suggests that “data discrimination” can affect even the basic human rights of individuals, and introduce problematic power hierarchies between those who can access data and those who cannot.
Paul Currion – Eyes Wide Shut: The challenge of humanitarian biometrics – Examines the use of biometrics by humanitarian organizations and national governments, and suggests stronger accountability is needed to ensure data from marginalized groups remain protected.
Yves-Alexandre de Montjoye, Jake Kendall and Cameron F. Kerry – Enabling Humanitarian Use of Mobile Phone Data – Analyzes how data from mobile communication can provide insights into the spread of infectious disease, and how such data can also compromise individual privacy.
Michael F. Goodchild and Alan Glennon – Crowdsourcing geographic information for disaster response: a research frontier – Explores how though volunteered geographic data may be messy and unreliable, it can provide many benefits in emergency situations.
Raphael Horler – Crowdsourcing in the Humanitarian Network – An Analysis of the Literature – A Bachelor thesis which explores the increasing use of crowdsourced data by organizations involved in disaster response, investigating some of the challenges such use of crowdsourcing poses.
Gus Hosein and Carly Nyst – Aiding Surveillance – Suggests that the unregulated use of technologies and surveillance systems by humanitarian organizations create systems which pose serious threats to individuals’ rights, particularly their right to privacy.
L. Jacobsen – The Politics of Humanitarian Technology: Good Intentions, Unintended Consequences and Insecurity – Raises concerns about the rise of data collection and digital technology in humanitarian aid organizations, arguing that its unquestioned prominence creates new structures of power and control, which remain hidden under the rubric of liberal humanitarianism.
Mirca Madianou – Digital Inequality and Second-Order Disasters: Social Media in the Typhoon Haiyan Recovery – Taking the effects of Typhoon Haiyan as a key case study, this paper investigates how digital inequalities and an unequal access to data can exacerbate existing social inequalities in a post-disaster environment.
Sean Martin McDonald – Ebola: A Big Data Disaster. Privacy, Property, and the Law of Disaster Experimentation – Analyzes the challenges and privacy risks of using unregulated data in public health coordination by taking the use of Call Detail Record (CDR) data during the Ebola crisis as a key case study.
National Academy of Engineering – Sensing and Shaping Emerging Conflicts: Report of a Joint Workshop of the National Academy of Engineering and the United States Institute of Peace: Roundtable on Technology, Science, and Peacebuilding – Building on the overview report of the United States Institute of Peace workshop examines what opportunities new technologies and data sharing provides for humanitarian groups.
Mary K.Pratt – Big Data’s role in humanitarian aid – A Computer World article which provides an overview of Big Data, and how it is improving the efficiency and efficacy of humanitarian response, especially in conflict zones.
Bertrand Taithe Róisínand and Roger Mac Ginty – Data hubris? Humanitarian information systems and the mirage of technology – Specifically looks at visual technology and crisis mapping, and big data, and suggests that there exists an over-enthusiasm in these claims made on behalf of technologically advanced humanitarian information systems.
Linnet Taylor – No place to hide? The ethics and analytics of tracking mobility using mobile phone data – Examines the ethical problems associated with the tracking of mobile phone data, especially in low or middle-income countries.
UN Office for the Coordination of Humanitarian Affairs (UN-OCHA) – Big data and humanitarianism: 5 things you need to know – Briefly outlines five issues that face humanitarian organizations as they integrate big data into their operations.
United Nations Global Pulse – Mapping the Risk-Utility Landscape of Mobile Data for Sustainable Development and Humanitarian Action – Reports on a Global Pulse project (done in partnership with Massachusetts Institute of Technology) which aimed to find how aggregated mobile data can be maximized to protect privacy and provide effective support to crisis response.
The Wilson Center – Connecting Grassroots to Government for Disaster Management: Workshop Summary – Summarizes the key points drawn from a two day Wilson Center workshop, which investigated how new technologies could engage whole communities in disaster management.

* Thanks to: Kristen B. Sandvik; Zara Rahman; Jennifer Schulte; Sean McDonald; Paul Currion; Dinorah Cantú-Pedraza and the Responsible Data Listserve for valuable input.

Elements of a New Ethical Framework for Big Data Research

Curated on April 2, 2016August 3, 2018 by Stefaan Verhulst

“The Berkman Center is pleased to announce the publication of a new paper from the Privacy Tools for Sharing Research Data project team. In this paper, Effy Vayena, Urs Gasser, Alexandra Wood, and David O’Brien from the Berkman Center, with Micah Altman from MIT Libraries, outline elements of a new ethical framework for big data research.

Emerging large-scale data sources hold tremendous potential for new scientific research into human biology, behaviors, and relationships. At the same time, big data research presents privacy and ethical challenges that the current regulatory framework is ill-suited to address. In light of the immense value of large-scale research data, the central question moving forward is not whether such data should be made available for research, but rather how the benefits can be captured in a way that respects fundamental principles of ethics and privacy.

The authors argue that a framework with the following elements would support big data utilization and help harness the value of big data in a sustainable and trust-building manner:

Oversight should aim to provide universal coverage of human subjects research, regardless of funding source, across all stages of the information lifecycle.
New definitions and standards should be developed based on a modern understanding of privacy science and the expectations of research subjects.
Researchers and review boards should be encouraged to incorporate systematic risk-benefit assessments and new procedural and technological solutions from the wide range of interventions that are available.
Oversight mechanisms and the safeguards implemented should be tailored to the intended uses, benefits, threats, harms, and vulnerabilities associated with a specific research activity.

Development of a new ethical framework with these elements should be the product of a dynamic multistakeholder process that is designed to capture the latest scientific understanding of privacy, analytical methods, available safeguards, community and social norms, and best practices for research ethics as they evolve over time.

The full paper is available for download through the Washington and Lee Law Review Online as part of a collection of papers featured at the Future of Privacy Forum workshop Beyond IRBs: Designing Ethical Review Processes for Big Data Research held on December 10, 2015, in Washington, DC….(More)”

The Bottom of the Data Pyramid: Big Data and the Global South

Curated on April 1, 2016August 3, 2018 by Stefaan Verhulst

Payal Arora at the International Journal of Communication: “To date, little attention has been given to the impact of big data in the Global South, about 60% of whose residents are below the poverty line. Big data manifests in novel and unprecedented ways in these neglected contexts. For instance, India has created biometric national identities for her 1.2 billion people, linking them to welfare schemes, and social entrepreneurial initiatives like the Ushahidi project that leveraged crowdsourcing to provide real-time crisis maps for humanitarian relief.

While these projects are indeed inspirational, this article argues that in the context of the Global South there is a bias in the framing of big data as an instrument of empowerment. Here, the poor, or the “bottom of the pyramid” populace are the new consumer base, agents of social change instead of passive beneficiaries. This neoliberal outlook of big data facilitating inclusive capitalism for the common good sidelines critical perspectives urgently needed if we are to channel big data as a positive social force in emerging economies. This article proposes to assess these new technological developments through the lens of databased democracies, databased identities, and databased geographies to make evident normative assumptions and perspectives in this under-examined context….(More)”.

When open data is a Trojan Horse: The weaponization of transparency in science and governance

Curated on April 1, 2016August 3, 2018 by Stefaan Verhulst

Karen E.C. Levy and David Merritt Johns in Big Data and Society: “Openness and transparency are becoming hallmarks of responsible data practice in science and governance. Concerns about data falsification, erroneous analysis, and misleading presentation of research results have recently strengthened the call for new procedures that ensure public accountability for data-driven decisions. Though we generally count ourselves in favor of increased transparency in data practice, this Commentary highlights a caveat. We suggest that legislative efforts that invoke the language of data transparency can sometimes function as “Trojan Horses” through which other political goals are pursued. Framing these maneuvers in the language of transparency can be strategic, because approaches that emphasize open access to data carry tremendous appeal, particularly in current political and technological contexts. We illustrate our argument through two examples of pro-transparency policy efforts, one historical and one current: industry-backed “sound science” initiatives in the 1990s, and contemporary legislative efforts to open environmental data to public inspection. Rules that exist mainly to impede science-based policy processes weaponize the concept of data transparency. The discussion illustrates that, much as Big Data itself requires critical assessment, the processes and principles that attend it—like transparency—also carry political valence, and, as such, warrant careful analysis….(More)”

The Total Archive

Curated on March 30, 2016November 14, 2018 by Stefaan Verhulst

LIMN issue edited by Boris Jardine and Christopher Kelty: “Vast accumulations saturate our world: phone calls and emails stored by security agencies; every preference of every individual collected by advertisers; ID numbers, and maybe an iris scan, for every Indian; hundreds of thousands of whole genome sequences; seed banks of all existing plants, and of course, books… all of them. Just what is the purpose of these optimistically total archives, and how are they changing us?

This issue of Limn asks authors and artists to consider how these accumulations govern us, where this obsession with totality came from and how we might think differently about big data and algorithms, by thinking carefully through the figure of the archive.

Contributors: Miriam Austin, Jenny Bangham, Reuben Binns, Balázs Bodó, Geoffry C. Bowker, Finn Brunton,Lawrence Cohen, Stephen Collier, Vadig De Croehling, Lukas Engelmann, Nicholas HA Evans, Fabienne Hess, Anna Hughes, Boris Jardine, Emily Jones, Judith Kaplan, Whitney Laemmli, Andrew Lakoff, Rebecca Lemov, Branwyn Poleykett, Mary Murrell, Ben Outhwaite, Julien Prévieux, and Jenny Reardon….(More)”

Accountable machines: bureaucratic cybernetics?

Curated on March 28, 2016November 14, 2018 by Stefaan Verhulst

Alison Powell at LSE Media Policy Project Blog: “Algorithms are everywhere, or so we are told, and the black boxes of algorithmic decision-making make oversight of processes that regulators and activists argue ought to be transparent more difficult than in the past. But when, and where, and which machines do we wish to make accountable, and for what purpose? In this post I discuss how algorithms discussed by scholars are most commonly those at work on media platforms whose main products are the social networks and attention of individuals. Algorithms, in this case, construct individual identities through patterns of behaviour, and provide the opportunity for finely targeted products and services. While there are serious concerns about, for instance, price discrimination, algorithmic systems for communicating and consuming are, in my view, less inherently problematic than processes that impact on our collective participation and belonging as citizenship. In this second sphere, algorithmic processes – especially machine learning – combine with processes of governance that focus on individual identity performance to profoundly transform how citizenship is understood and undertaken.

Communicating and consuming

In the communications sphere, algorithms are what makes it possible to make money from the web for example through advertising brokerage platforms that help companies bid for ads on major newspaper websites. IP address monitoring, which tracks clicks and web activity, creates detailed consumer profiles and transform the everyday experience of communication into a constantly-updated production of consumer information. This process of personal profiling is at the heart of many of the concerns about algorithmic accountability. The consequence of perpetual production of data by individuals and the increasing capacity to analyse it even when it doesn’t appear to relate has certainly revolutionalised advertising by allowing more precise targeting, but what has it done for areas of public interest?

John Cheney-Lippold identifies how the categories of identity are now developed algorithmically, since a category like gender is not based on self-discloure, but instead on patterns of behaviour that fit with expectations set by previous alignment to a norm. In assessing ‘algorithmic identities’, he notes that these produce identity profiles which are narrower and more behaviour-based than the identities that we perform. This is a result of the fact that many of the systems that inspired the design of algorithmic systems were based on using behaviour and other markers to optimise consumption. Algorithmic identity construction has spread from the world of marketing to the broader world of citizenship – as evidenced by the Citizen Ex experiment shown at the Web We Want Festival in 2015.

Individual consumer-citizens

What’s really at stake is that the expansion of algorithmic assessment of commercially derived big data has extended the frame of the individual consumer into all kinds of other areas of experience. In a supposed ‘age of austerity’ when governments believe it’s important to cut costs, this connects with the view of citizens as primarily consumers of services, and furthermore, with the idea that a citizen is an individual subject whose relation to a state can be disintermediated given enough technology. So, with sensors on your garbage bins you don’t need to even remember to take them out. With pothole reporting platforms like FixMyStreet, a city government can be responsive to an aggregate of individual reports. But what aspects of our citizenship are collective? When, in the algorithmic state, can we expect to be together?

Put another way, is there any algorithmic process to value the long term education, inclusion, and sustenance of a whole community for example through library services?…

Seeing algorithms – machine learning in particular – as supporting decision-making for broad collective benefit rather than as part of ever more specific individual targeting and segmentation might make them more accountable. But more importantly, this would help algorithms support society – not just individual consumers….(More)”