Jenkins A., Croitoru A., Crooks A.T., Stefanidis A. in PLOS: “Place can be generally defined as a location that has been assigned meaning through human experience, and as such it is of multidisciplinary scientific interest. Up to this point place has been studied primarily within the context of social sciences as a theoretical construct. The availability of large amounts of user-generated content, e.g. in the form of social media feeds or Wikipedia contributions, allows us for the first time to computationally analyze and quantify the shared meaning of place. By aggregating references to human activities within urban spaces we can observe the emergence of unique themes that characterize different locations, thus identifying places through their discernible sociocultural signatures. In this paper we present results from a novel quantitative approach to derive such sociocultural signatures from Twitter contributions and also from corresponding Wikipedia entries. By contrasting the two we show how particular thematic characteristics of places (referred to herein as platial themes) are emerging from such crowd-contributed content, allowing us to observe the meaning that the general public, either individually or collectively, is assigning to specific locations. Our approach leverages probabilistic topic modelling, semantic association, and spatial clustering to find locations that convey a collective sense of place. Deriving and quantifying such meaning allows us to observe how people transform a location into a place and shape its characteristics….(More)”
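The workflow sketched in the abstract, topic modelling over geolocated posts combined with spatial clustering, can be illustrated in a few lines of Python. The snippet below is only a minimal sketch of that general approach, not the authors' implementation: the toy posts, the use of gensim's LdaModel and scikit-learn's DBSCAN, and the clustering radius are all illustrative assumptions.

```python
# Illustrative sketch (not the paper's pipeline): cluster geotagged posts into
# candidate "places", then topic-model each cluster to surface its themes.
from gensim import corpora, models
from sklearn.cluster import DBSCAN
import numpy as np

posts = [  # (lat, lon, text) toy examples
    (40.7580, -73.9855, "broadway show tonight amazing theater"),
    (40.7587, -73.9851, "times square lights crowds tourists"),
    (40.7484, -73.9857, "empire state building observation deck view"),
    (40.7486, -73.9860, "skyline photo from the observation deck"),
]

# 1. Spatial clustering: group nearby posts into candidate places.
coords = np.array([[lat, lon] for lat, lon, _ in posts])
labels = DBSCAN(eps=0.001, min_samples=2).fit_predict(coords)  # ~100 m radius, an assumed threshold

# 2. Topic modelling: extract the themes that characterize each place.
for place_id in sorted(set(labels) - {-1}):
    docs = [text.split() for (_, _, text), label in zip(posts, labels) if label == place_id]
    dictionary = corpora.Dictionary(docs)
    bow = [dictionary.doc2bow(doc) for doc in docs]
    lda = models.LdaModel(bow, num_topics=1, id2word=dictionary, random_state=0)
    print(f"place {place_id}:", lda.print_topics(num_words=5))
```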
Automating power: Social bot interference in global politics
Samuel C. Woolley at First Monday: “Over the last several years political actors worldwide have begun harnessing the digital power of social bots — software programs designed to mimic human social media users on platforms like Facebook, Twitter, and Reddit. Increasingly, politicians, militaries, and government-contracted firms use these automated actors in online attempts to manipulate public opinion and disrupt organizational communication. Politicized social bots — here ‘political bots’ — are used to massively boost politicians’ follower levels on social media sites in attempts to generate false impressions of popularity. They are programmed to actively and automatically flood news streams with spam during political crises, elections, and conflicts in order to interrupt the efforts of activists and political dissidents who publicize and organize online. They are used by regimes to send out sophisticated computational propaganda. This paper conducts a content analysis of available media articles on political bots in order to build an event dataset of global political bot deployment that codes for usage, capability, and history. This information is then analyzed, generating a global outline of this phenomenon. This outline seeks to explain the variety of political bot-oriented strategies and presents details crucial to building understandings of these automated software actors in the humanities, social and computer sciences….(More)”
Website Seeks to Make Government Data Easier to Sift Through
Steve Lohr at the New York Times: “For years, the federal government, states and some cities have enthusiastically made vast troves of data open to the public. Acres of paper records on demographics, public health, traffic patterns, energy consumption, family incomes and many other topics have been digitized and posted on the web.
This abundance of data can be a gold mine for discovery and insights, but finding the nuggets can be arduous, requiring special skills.
A project coming out of the M.I.T. Media Lab on Monday seeks to ease that challenge and to make the value of government data available to a wider audience. The project, called Data USA, bills itself as “the most comprehensive visualization of U.S. public data.” It is free, and its software code is open source, meaning that developers can build custom applications by adding other data.
Cesar A. Hidalgo, an assistant professor of media arts and sciences at the M.I.T. Media Lab who led the development of Data USA, said the website was devised to “transform data into stories.” Those stories are typically presented as graphics, charts and written summaries….Type “New York” into the Data USA search box, and a drop-down menu presents choices — the city, the metropolitan area, the state and other options. Select the city, and the page displays an aerial shot of Manhattan with three basic statistics: population (8.49 million), median household income ($52,996) and median age (35.8).
Lower on the page are six icons for related subject categories, including economy, demographics and education. If you click on demographics, one of the so-called data stories appears, based largely on data from the American Community Survey of the United States Census Bureau.
Using colorful graphics and short sentences, it shows the median age of foreign-born residents of New York (44.7) and of residents born in the United States (28.6); the most common countries of origin for immigrants (the Dominican Republic, China and Mexico); and the percentage of residents who are American citizens (82.8 percent, compared with a national average of 93 percent).
Data USA features a selection of data results on its home page. They include the gender wage gap in Connecticut; the racial breakdown of poverty in Flint, Mich.; the wages of physicians and surgeons across the United States; and the institutions that award the most computer science degrees….(More)
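Because Data USA's code is open source and its statistics are served through a public API, the figures shown on the site can also be pulled programmatically. Below is a minimal sketch assuming the endpoint pattern shown in Data USA's public API documentation (drilldowns and measures query parameters); parameter names and response fields should be verified against the current docs.

```python
# Minimal sketch: request national population figures from the Data USA API.
# Endpoint and parameters follow the commonly documented example
# (drilldowns=Nation&measures=Population); verify against current documentation.
import requests

resp = requests.get(
    "https://datausa.io/api/data",
    params={"drilldowns": "Nation", "measures": "Population"},
    timeout=30,
)
resp.raise_for_status()
for row in resp.json().get("data", []):
    print(row.get("Year"), row.get("Population"))
```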
Selected Readings on Data and Humanitarian Response
By Prianka Srinivasan and Stefaan G. Verhulst *
The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of data and humanitarian response was originally published in 2016.
Data, when used well and in a trusted manner, allows humanitarian organizations to innovate how they respond to emergency events, including better coordination of post-disaster relief efforts, the ability to harness local knowledge to create more targeted relief strategies, and tools to predict and monitor disasters in real time. Consequently, in recent years both multinational groups and community-based advocates have begun to integrate data collection and evaluation strategies into their humanitarian operations in order to respond to emergencies better and more quickly. However, this movement poses a number of challenges. Compared to the private sector, humanitarian organizations are often less equipped to successfully analyze and manage big data, a gap that poses a number of risks to the security of victims’ data. Furthermore, complex power dynamics that exist within humanitarian spaces may be further exacerbated by the introduction of new technologies and big data collection mechanisms. Below we share:
- Selected Reading List (summaries and hyperlinks)
- Annotated Selected Reading List
- Additional Readings
Selected Reading List (summaries in alphabetical order)
Data and Humanitarian Response
- John Karlsrud – Peacekeeping 4.0: Harnessing the Potential of Big Data, Social Media, and Cyber Technologies – Recommends that UN peacekeeping initiatives should better integrate big data and new technologies into their operations, adopting a “Peacekeeping 4.0” for the modern world.
- Francesco Mancini, International Peace Institute – New Technology and the Prevention of Violence and Conflict – Explores the ways in which new communications technology tools can assist humanitarian workers in preventing violence and conflict.
- Patrick Meier – Digital Humanitarians: How Big Data Is Changing the Face of Humanitarian Response – Profiles the emergence of ‘Digital Humanitarians’—humanitarian workers who use big data, crowdsourcing, and new technologies to transform the way societies respond to humanitarian disasters.
- Andrew Robertson and Steve Olson (USIP) – Using Data Sharing to Improve Coordination in Peacebuilding – Summarizes the findings of a United States Institute of Peace workshop that investigated the use of data-sharing systems between government and non-government actors in conflict zones. It identifies some of the challenges and benefits of data-sharing in peacebuilding efforts.
- United Nations Independent Expert Advisory Group on a Data Revolution for Sustainable Development – A World That Counts, Mobilizing the Data Revolution – Compiled by a group of 20 international experts, this report proposes ways to improve data management and monitoring, whilst mitigating some of the risks data poses.
- Katie Whipkey and Andrej Verity – Guidance for Incorporating Big Data into Humanitarian Operations – Created as part of the Digital Humanitarian Network with the support of UN-OCHA, this is a manual for humanitarian organizations looking to strategically incorporate Big Data into their work.
Risks of Using Big Data in Humanitarian Contexts
- Kate Crawford and Megan Finn – The limits of crisis data: analytical and ethical challenges of using social and mobile data to understand disasters – Analyzes the use of big data techniques following a crisis event, arguing that a reliance on social and mobile data can lead to significant oversights and ethical concerns in the wake of humanitarian disasters.
- Katja Lindskov Jacobsen – Making design safe for citizens: A hidden history of humanitarian experimentation – Argues that the UNHCR’s use of iris recognition technology in 2002 and 2007 during the repatriation of Afghan refugees from Pakistan constitutes a case of “humanitarian experimentation.” It questions this sort of experimentation which compromises the security of refugees in the pursuit of safer technologies for the rest of the world.
- Responsible Data Forum – Responsible Data Reflection Stories: an Overview – Compiles various stories sourced by the Responsible Data Forum blog relating to data challenges faced by advocacy organizations, and draws recommendations based on these cases.
- Kristin Bergtora Sandvik – The humanitarian cyberspace: shrinking space or an expanding frontier? – Provides a detailed account of the development of a “humanitarian cyberspace” and how information and communication technologies have been further integrated into humanitarian operations since the mid-1990s.
Annotated Selected Reading List (in alphabetical order)
Karlsrud, John. “Peacekeeping 4.0: Harnessing the Potential of Big Data, Social Media, and Cyber Technologies.” Cyberspace and International Relations, 2013. http://bit.ly/235Qb3e
- This chapter from the book “Cyberspace and International Relations” suggests that advances in big data give humanitarian organizations unprecedented opportunities to prevent and mitigate natural disasters and humanitarian crises. However, the sheer amount of unstructured data necessitates effective “data mining” strategies for multinational organizations to make the most use of this data.
- By profiling some civil-society organizations who use big data in their peacekeeping efforts, Karlsrud suggests that these community-focused initiatives are leading the movement toward analyzing and using big data in countries vulnerable to crisis.
- The chapter concludes by offering ten recommendations to UN peacekeeping forces to best realize the potential of big data and new technology in supporting their operations.
Mancini, Francesco. “New Technology and the Prevention of Violence and Conflict.” International Peace Institute, 2013. http://bit.ly/1ltLfNV
- This report from the International Peace Institute looks at five case studies to assess how information and communications technologies (ICTs) can help prevent humanitarian conflicts and violence. Its findings suggest that context has a significant impact on the effectiveness of these ICTs for conflict prevention, and that any strategy must take into account the specific contingencies of the region to be successful.
- The report draws seven lessons from the five case studies, including:
- New technologies are just one of a variety of tools to combat violence. Consequently, organizations must investigate a variety of complementary strategies to prevent conflicts, and not simply rely on ICTs.
- Not every community or social group will have the same relationship to technology, and their ability to adopt new technologies is similarly influenced by their context. Therefore, a detailed needs assessment must take place before any new technologies are implemented.
- New technologies may be co-opted by violent groups seeking to maintain conflict in the region. Consequently, humanitarian groups must be sensitive to existing political actors and be aware of possible negative consequences these new technologies may spark.
- Local input is integral to conflict prevention measures, and collaboration and awareness-raising with communities are needed to ensure new technologies are sustainable and effective.
- Information shared among civil-society groups has greater potential to support early-warning systems. This horizontal distribution of information can also help communities hold local leaders accountable.
Meier, Patrick. “Digital humanitarians: how big data is changing the face of humanitarian response.” CRC Press, 2015. http://amzn.to/1RQ4ozc
- This book traces the emergence of “Digital Humanitarians”—people who harness new digital tools and technologies to support humanitarian action. Meier suggests that this has created a “nervous system” to connect people from disparate parts of the world, revolutionizing the way we respond to humanitarian crises.
- Meier argues that such technology is reconfiguring the structure of the humanitarian space, where victims are not simply passive recipients of aid but can contribute alongside other global citizens. This in turn makes us more humane and engaged people.
Robertson, Andrew and Olson, Steve. “Using Data Sharing to Improve Coordination in Peacebuilding.” United States Institute of Peace, 2012. http://bit.ly/235QuLm
- This report functions as an overview of a roundtable workshop on Technology, Science and Peacebuilding held at the United States Institute of Peace. The workshop aimed to investigate how data-sharing techniques can be developed for use in peacebuilding or conflict management.
- Four main themes emerged from discussions during the workshop:
- “Data sharing requires working across a technology-culture divide”—Data sharing needs the foundation of a strong relationship, which can depend on sociocultural, rather than technological, factors.
- “Information sharing requires building and maintaining trust”—These relationships are often built on trust, which can include both technological and social perspectives.
- “Information sharing requires linking civilian-military policy discussions to technology”—Even when sophisticated data-sharing technologies exist, continuous engagement between different stakeholders is necessary. Therefore, procedures used to maintain civil-military engagement should be broadened to include technology.
- “Collaboration software needs to be aligned with user needs”—Technology providers need to keep in mind the needs of their users, in this case peacebuilders, in order to ensure sustainability.
United Nations Independent Expert Advisory Group on a Data Revolution for Sustainable Development. “A World That Counts, Mobilizing the Data Revolution.” 2014. https://bit.ly/2Cb3lXq
- This report focuses on the potential benefits and risks data holds for sustainable development. Included in this is a strategic framework for using and managing data for humanitarian purposes. It describes a need for a multinational consensus to be developed to ensure data is shared effectively and efficiently.
- It suggests that “people who are counted”—i.e., those who are included in data collection processes—have better development outcomes and a better chance for humanitarian response in emergency or conflict situations.
Whipkey, Katie, and Andrej Verity. “Guidance for Incorporating Big Data into Humanitarian Operations.” Digital Humanitarian Network, 2015. http://bit.ly/1Y2BMkQ
- This report produced by the Digital Humanitarian Network provides an overview of big data, and how humanitarian organizations can integrate this technology into their humanitarian response. It primarily functions as a guide for organizations, and provides concise, brief outlines of what big data is, and how it can benefit humanitarian groups.
- The report puts forward four main benefits acquired through the use of big data by humanitarian organizations: 1) the ability to leverage real-time information; 2) the ability to make more informed decisions; 3) the ability to learn new insights; 4) the ability for organizations to be more prepared.
- It goes on to assess the challenges big data poses for humanitarian organizations, including: 1) geography and unequal access to technology across regions; 2) the potential for user error when processing data; 3) limited technology; 4) questionable validity of data; 5) underdeveloped policies and ethics relating to data management; and 6) limitations relating to staff knowledge.
Risks of Using Big Data in Humanitarian Contexts
Crawford, Kate, and Megan Finn. “The limits of crisis data: analytical and ethical challenges of using social and mobile data to understand disasters.” GeoJournal 80.4, 2015. http://bit.ly/1X0F7AI
- Crawford & Finn present a critical analysis of the use of big data in disaster management, taking a more skeptical tone to the data revolution facing humanitarian response.
- They argue that though social and mobile data analysis can yield important insights and tools in crisis events, it also presents a number of limitations that can lead researchers or humanitarian response teams to significant oversights.
- Crawford & Finn explore the ethical concerns the use of big data in disaster events introduces, including issues of power, privacy, and consent.
- The paper concludes by recommending that critical data studies, such as those presented in the paper, be integrated into crisis event research in order to analyze some of the assumptions which underlie mobile and social data.
Jacobsen, Katja Lindskov. “Making design safe for citizens: A hidden history of humanitarian experimentation.” Citizenship Studies 14.1: 89-103, 2010. http://bit.ly/1YaRTwG
- This paper explores the phenomenon of “humanitarian experimentation,” where victims of disaster or conflict are the subjects of experiments to test the application of technologies before they are administered to wider civilian populations.
- By analyzing the use of iris recognition technology during the repatriation of Afghan refugees from Pakistan between 2002 and 2007, Jacobsen suggests that this “humanitarian experimentation” compromises the security of already vulnerable refugees in order to deliver better biometric products to the rest of the world.
Responsible Data Forum. “Responsible Data Reflection Stories: An Overview.” http://bit.ly/1Rszrz1
- This piece from the Responsible Data Forum is primarily a compilation of “war stories” that follow some of the challenges in using big data for social good. By drawing on these crowdsourced cases, the Forum also presents an overview that makes key recommendations for overcoming some of the challenges associated with big data in humanitarian organizations.
- It finds that most of these challenges occur when organizations are ill-equipped to manage data and new technologies, or are unaware of how different groups interact differently in digital spaces.
Sandvik, Kristin Bergtora. “The humanitarian cyberspace: shrinking space or an expanding frontier?” Third World Quarterly 37:1, 17-32, 2016. http://bit.ly/1PIiACK
- This paper analyzes the shift toward technology-driven humanitarian work, which increasingly takes place online in cyberspace and reshapes the definition and application of aid. This shift has occurred alongside what many suggest is a shrinking of the humanitarian space.
- Sandvik provides three interpretations of this phenomenon:
- First, traditional threats remain in the humanitarian space, which are both modified and reinforced by technology.
- Second, new threats are introduced by the increasing use of technology in humanitarianism, and consequently the humanitarian space may be broadening, not shrinking.
- Finally, if the shrinking humanitarian space theory holds, cyberspace offers one example of this, where the increasing use of digital technology to manage disasters leads to a contraction of space through the proliferation of remote services.
Additional Readings on Data and Humanitarian Response
- Kristin Bergtora Sandvik, et al. – Humanitarian technology: a critical research agenda – Takes a critical look at the field of humanitarian technology, analyzing the challenges it poses in post-disaster and conflict environments.
- Kristin Bergtora Sandvik – The Risks of Technological Innovation – Suggests that despite the evident benefits such technology presents, it can also undermine humanitarian action and lead to “catastrophic events” that themselves require a new type of humanitarian response.
- Ryan Burns – Rethinking big data in digital humanitarianism: practices, epistemologies, and social relations – Takes a critical look at the use of big data in humanitarian spaces, arguing that the advent of digital humanitarianism has profound political and social implications, and can in fact limit information available following a humanitarian crisis.
- Kate Crawford – Is Data a Danger to the Developing World? – Argues that data poses more than privacy risks to developing countries, suggesting that “data discrimination” can affect even the basic human rights of individuals and introduce problematic power hierarchies between those who can access data and those who cannot.
- Paul Currion – Eyes Wide Shut: The challenge of humanitarian biometrics – Examines the use of biometrics by humanitarian organizations and national governments, and suggests stronger accountability is needed to ensure data from marginalized groups remains protected.
- Yves-Alexandre de Montjoye, Jake Kendall and Cameron F. Kerry – Enabling Humanitarian Use of Mobile Phone Data – Analyzes how data from mobile communication can provide insights into the spread of infectious disease, and how such data can also compromise individual privacy.
- Michael F. Goodchild and Alan Glennon – Crowdsourcing geographic information for disaster response: a research frontier – Explores how volunteered geographic data, though often messy and unreliable, can provide many benefits in emergency situations.
- Raphael Horler – Crowdsourcing in the Humanitarian Network – An Analysis of the Literature – A Bachelor thesis which explores the increasing use of crowdsourced data by organizations involved in disaster response, investigating some of the challenges such use of crowdsourcing poses.
- Gus Hosein and Carly Nyst – Aiding Surveillance – Suggests that the unregulated use of technologies and surveillance systems by humanitarian organizations creates systems that pose serious threats to individuals’ rights, particularly their right to privacy.
- Katja Lindskov Jacobsen – The Politics of Humanitarian Technology: Good Intentions, Unintended Consequences and Insecurity – Raises concerns about the rise of data collection and digital technology in humanitarian aid organizations, arguing that its unquestioned prominence creates new structures of power and control, which remain hidden under the rubric of liberal humanitarianism.
- Mirca Madianou – Digital Inequality and Second-Order Disasters: Social Media in the Typhoon Haiyan Recovery – Taking the effects of Typhoon Haiyan as a key case study, this paper investigates how digital inequalities and an unequal access to data can exacerbate existing social inequalities in a post-disaster environment.
- Sean Martin McDonald – Ebola: A Big Data Disaster. Privacy, Property, and the Law of Disaster Experimentation – Analyzes the challenges and privacy risks of using unregulated data in public health coordination by taking the use of Call Detail Record (CDR) data during the Ebola crisis as a key case study.
- National Academy of Engineering – Sensing and Shaping Emerging Conflicts: Report of a Joint Workshop of the National Academy of Engineering and the United States Institute of Peace: Roundtable on Technology, Science, and Peacebuilding – Building on the overview report of the United States Institute of Peace workshop, this report examines the opportunities new technologies and data sharing provide for humanitarian groups.
- Mary K. Pratt – Big Data’s role in humanitarian aid – A Computerworld article that provides an overview of big data and how it is improving the efficiency and efficacy of humanitarian response, especially in conflict zones.
- Róisín Read, Bertrand Taithe and Roger Mac Ginty – Data hubris? Humanitarian information systems and the mirage of technology – Looks specifically at crisis mapping, visual technology, and big data, and suggests that claims made on behalf of technologically advanced humanitarian information systems are marked by over-enthusiasm.
- Linnet Taylor – No place to hide? The ethics and analytics of tracking mobility using mobile phone data – Examines the ethical problems associated with the tracking of mobile phone data, especially in low- or middle-income countries.
- UN Office for the Coordination of Humanitarian Affairs (UN-OCHA) – Big data and humanitarianism: 5 things you need to know – Briefly outlines five issues that face humanitarian organizations as they integrate big data into their operations.
- United Nations Global Pulse – Mapping the Risk-Utility Landscape of Mobile Data for Sustainable Development and Humanitarian Action – Reports on a Global Pulse project (conducted in partnership with the Massachusetts Institute of Technology) that aimed to determine how the utility of aggregated mobile data can be maximized while protecting privacy and providing effective support for crisis response.
- The Wilson Center – Connecting Grassroots to Government for Disaster Management: Workshop Summary – Summarizes the key points from a two-day Wilson Center workshop that investigated how new technologies could engage whole communities in disaster management.
* Thanks to: Kristin B. Sandvik; Zara Rahman; Jennifer Schulte; Sean McDonald; Paul Currion; Dinorah Cantú-Pedraza and the Responsible Data Listserve for valuable input.
Elements of a New Ethical Framework for Big Data Research
“The Berkman Center is pleased to announce the publication of a new paper from the Privacy Tools for Sharing Research Data project team. In this paper, Effy Vayena, Urs Gasser, Alexandra Wood, and David O’Brien from the Berkman Center, with Micah Altman from MIT Libraries, outline elements of a new ethical framework for big data research.
Emerging large-scale data sources hold tremendous potential for new scientific research into human biology, behaviors, and relationships. At the same time, big data research presents privacy and ethical challenges that the current regulatory framework is ill-suited to address. In light of the immense value of large-scale research data, the central question moving forward is not whether such data should be made available for research, but rather how the benefits can be captured in a way that respects fundamental principles of ethics and privacy.
The authors argue that a framework with the following elements would support big data utilization and help harness the value of big data in a sustainable and trust-building manner:
- Oversight should aim to provide universal coverage of human subjects research, regardless of funding source, across all stages of the information lifecycle.
- New definitions and standards should be developed based on a modern understanding of privacy science and the expectations of research subjects.
- Researchers and review boards should be encouraged to incorporate systematic risk-benefit assessments and new procedural and technological solutions from the wide range of interventions that are available.
- Oversight mechanisms and the safeguards implemented should be tailored to the intended uses, benefits, threats, harms, and vulnerabilities associated with a specific research activity.
Development of a new ethical framework with these elements should be the product of a dynamic multistakeholder process that is designed to capture the latest scientific understanding of privacy, analytical methods, available safeguards, community and social norms, and best practices for research ethics as they evolve over time.
The full paper is available for download through the Washington and Lee Law Review Online as part of a collection of papers featured at the Future of Privacy Forum workshop Beyond IRBs: Designing Ethical Review Processes for Big Data Research held on December 10, 2015, in Washington, DC….(More)”
Governance by Algorithms: Reality Construction by Algorithmic Selection on the Internet
Paper by Natascha Just & Michael Latzer in Media, Culture & Society (forthcoming): “This paper explores governance by algorithms in information societies. Theoretically, it builds on (co-)evolutionary innovation studies in order to adequately grasp the interplay of technological and societal change, and combines these with institutional approaches to incorporate governance by technology or rather software as institutions. Methodologically it draws from an empirical survey of Internet-based services that rely on automated algorithmic selection, a functional typology derived from it, and an analysis of associated potential social risks. It shows how algorithmic selection has become a growing source of social order, of a shared social reality in information societies. It argues that – similar to the construction of realities by traditional mass media – automated algorithmic selection applications shape daily lives and realities, affect the perception of the world, and influence behavior. However, the co-evolutionary perspective on algorithms as institutions, ideologies, intermediaries and actors highlights differences that are to be found first in the growing personalization of constructed realities, and second in the constellation of involved actors. Altogether, compared to reality construction by traditional mass media, algorithmic reality construction tends to increase individualization, commercialization, inequalities and deterritorialization, and to decrease transparency, controllability and predictability…(Full Paper)”
Liberating data for public value: The case of Data.gov
Paper by Rashmi Krishnamurthy and Yukika Awazu in the International Journal of Information Management: “Public agencies around the globe are liberating their data. Drawing on a case of Data.gov, we outline the challenges and opportunities that lie ahead for the liberation of public data. Data.gov is an online portal that provides open access to datasets generated by US public agencies and countries around the world in a machine-readable format. By discussing the challenges and opportunities faced by Data.gov, we provide several lessons that can inform research and practice. We suggest that providing access to open data in itself does not spur innovation. Specifically, we claim that public agencies need to spend resources to improve the capacities of their organizations to move toward ‘open data by default’; develop capacities of community to use data to solve problems; and think critically about the unintended consequences of providing access to public data. We also suggest that public agencies need better metrics to evaluate the success of open-data efforts in achieving its goals….(More)”
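Data.gov's catalog is built on the open-source CKAN platform, so the datasets discussed in the paper can also be searched programmatically. A minimal sketch, assuming catalog.data.gov exposes CKAN's standard package_search action (field names follow CKAN's documented response schema):

```python
# Minimal sketch: search the Data.gov catalog through the CKAN action API.
# Assumes the standard /api/3/action/package_search endpoint is available.
import requests

resp = requests.get(
    "https://catalog.data.gov/api/3/action/package_search",
    params={"q": "energy consumption", "rows": 5},
    timeout=30,
)
resp.raise_for_status()
result = resp.json()["result"]
print("matching datasets:", result["count"])
for pkg in result["results"]:
    org = (pkg.get("organization") or {}).get("title", "unknown publisher")
    print("-", pkg["title"], "|", org)
```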
Open Data Impact: When Demand and Supply Meet
Stefaan Verhulst and Andrew Young at the GovLab: “Today, in “Open Data Impact: When Demand and Supply Meet,” the GovLab and Omidyar Network release key findings about the social, economic, cultural and political impact of open data. The findings are based on 19 detailed case studies of open data projects from around the world. These case studies were prepared in order to address an important shortcoming in our understanding of when, and how, open data works. While there is no shortage of enthusiasm for open data’s potential, nor of conjectural estimates of its hypothetical impact, few rigorous, systematic analyses exist of its concrete, real-world impact…. The 19 case studies that inform this report, all of which can be found at Open Data’s Impact (odimpact.org), a website specially set up for this project, were chosen for their geographic and sectoral representativeness. They seek to go beyond the descriptive (what happened) to the explanatory (why it happened, and what is the wider relevance or impact)….
In order to achieve the potential of open data and scale the impact of the individual projects discussed in our report, we need a better – and more granular – understanding of the enabling conditions that lead to success. We found 4 central conditions (“4Ps”) that play an important role in ensuring success:
- Partnerships: Intermediaries and data collaboratives play an important role in ensuring success, allowing for enhanced matching of supply and demand of data.
- Public infrastructure: Developing open data as a public infrastructure, open to all, enables wider participation, and a broader impact across issues and sectors.
- Policies: Clear policies regarding open data, including those promoting regular assessments of open data projects, are also critical for success.
- Problem definition: Open data initiatives that have a clear target or problem definition have more impact and are more likely to succeed than those with vaguely worded statements of intent or unclear reasons for existence.
Core Challenges
Finally, the success of a project is also determined by the obstacles and challenges it confronts. Our research uncovered 4 major challenges (“4Rs”) confronting open data initiatives across the globe:
- Readiness: A lack of readiness or capacity (evident, for example, in low Internet penetration or technical literacy rates) can severely limit the impact of open data.
- Responsiveness: Open data projects are significantly more likely to be successful when they remain agile and responsive—adapting, for instance, to user feedback or early indications of success and failure.
- Risks: For all its potential, open data does pose certain risks, notably to privacy and security; a greater, more nuanced understanding of these risks will be necessary to address and mitigate them.
- Resource Allocation: While open data projects can often be launched cheaply, those projects that receive generous, sustained and committed funding have a better chance of success over the medium and long term.
Toward a Next Generation Open Data Roadmap
The report we release today concludes with ten recommendations for policymakers, advocates, users, funders and other stakeholders in the open data community. For each step, we include a few concrete methods of implementation – ways to translate the broader recommendation into meaningful impact.
Together, these 10 recommendations and their means of implementation amount to what we call a “Next Generation Open Data Roadmap.” This roadmap is just a start, and we plan to continue fleshing it out in the near future. For now, it offers a way forward. It is our hope that this roadmap will help guide future research and experimentation so that we can continue to better understand how the potential of open data can be fulfilled across geographies, sectors and demographics.
Additional Resources
In conjunction with the release of our key findings paper, we also launch today an “Additional Resources” section on the Open Data’s Impact website. The goal of that section is to provide context on our case studies, and to point in the direction of other, complementary research. It includes the following elements:
- A “repository of repositories,” including other compendiums of open data case studies and sources;
- A compilation of some popular open data glossaries;
- A number of open data research publications and reports, with a particular focus on impact;
- A collection of open data definitions and a matrix of analysis to help assess those definitions….(More)
Innovation Prizes in Practice and Theory
Paper by Michael J. Burstein and Fiona Murray: “Innovation prizes in reality are significantly different from innovation prizes in theory. The former are familiar from popular accounts of historical prizes like the Longitude Prize: the government offers a set amount for a solution to a known problem, like £20,000 for a method of calculating longitude at sea. The latter are modeled as compensation to inventors in return for donating their inventions to the public domain. Neither the economic literature nor the policy literature that led to the 2010 America COMPETES Reauthorization Act — which made prizes a prominent tool of government innovation policy — provides a satisfying justification for the use of prizes, nor does either literature address their operation. In this article, we address both of these problems. We use a case study of one canonical, high-profile innovation prize — the Progressive Insurance Automotive X Prize — to explain how prizes function as institutional means to achieve exogenously defined innovation policy goals in the face of significant uncertainty and information asymmetries. Focusing on the structure and function of actual innovation prizes as an empirical matter enables us to make three theoretical contributions to the current understanding of prizes. First, we offer a stronger normative justification for prizes grounded in their status as a key institutional arrangement for solving a specified innovation problem. Second, we develop a model of innovation prize governance and then situate that model in the administrative state, as a species of “new governance” or “experimental” regulation. Third, we derive from those analyses a novel framework for choosing among prizes, patents, and grants, one in which the ultimate choice depends on a trade-off between the efficacy and scalability of the institutional solution….(More)”
“Big data” and “open data”: What kind of access should researchers enjoy?
Paper by Gilles Chatellier, Vincent Varlet, and Corinne Blachier-Poisson in Thérapie: “The healthcare sector is currently facing a new paradigm, the explosion of “big data”. Coupled with advances in computer technology, the field of “big data” appears promising, allowing us to better understand the natural history of diseases, to follow up on the implementation of new technologies (devices, drugs) and to participate in precision medicine, etc. Data sources are multiple (medical and administrative data, electronic medical records, data from rapidly developing technologies such as DNA sequencing, connected devices, etc.) and heterogeneous, while their use requires complex methods for accurate analysis. Moreover, faced with this new paradigm, we must determine who could (or should) have access to which data, how to balance collective interest with the protection of personal data, and how to finance, in the long term, both operating costs and database interrogation. This article analyses the opportunities and challenges related to the use of open and/or “big data”, … (More)”