Big Data in the Public Sector


Chapter by Ricard Munné in New Horizons for a Data-Driven Economy: “The public sector is becoming increasingly aware of the potential value to be gained from big data, as governments generate and collect vast quantities of data through their everyday activities.

The benefits of big data in the public sector can be grouped into three major areas, based on a classification of the types of benefits: advanced analytics, through automated algorithms; improvements in effectiveness, providing greater internal transparency; and improvements in efficiency, where better services can be provided through the personalization of services and through learning from the performance of such services.

The chapter examines several drivers and constraints that can boost or stall the development of big data in the sector, depending on how they are addressed. The findings, after analysing the requirements and the technologies currently available, show that open research questions remain to be addressed before competitive and effective solutions can be built. The main developments required are in the fields of scalability of data analysis, pattern discovery, and real-time applications. Also required are improvements in provenance for the sharing and integration of data from the public sector. It is also extremely important to provide integrated security and privacy mechanisms in big data applications, as the public sector collects vast amounts of sensitive data. Finally, respecting the privacy of citizens is a mandatory obligation in the European Union….(More)”

Selected Readings on Data and Humanitarian Response


By Prianka Srinivasan and Stefaan G. Verhulst *

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of data and humanitarian response was originally published in 2016.

Data, when used well and in a trusted manner, allows humanitarian organizations to innovate in how they respond to emergency events, including better coordination of post-disaster relief efforts, the ability to harness local knowledge to create more targeted relief strategies, and tools to predict and monitor disasters in real time. Consequently, in recent years both multinational groups and community-based advocates have begun to integrate data collection and evaluation strategies into their humanitarian operations in order to respond to emergencies better and more quickly. This movement, however, poses a number of challenges. Compared to the private sector, humanitarian organizations are often less equipped to analyze and manage big data successfully, a gap that poses a number of risks to the security of victims’ data. Furthermore, the complex power dynamics that exist within humanitarian spaces may be further exacerbated by the introduction of new technologies and big data collection mechanisms. Below we share:

  • Selected Reading List (summaries and hyperlinks)
  • Annotated Selected Reading List
  • Additional Readings

Selected Reading List  (summaries in alphabetical order)

Data and Humanitarian Response

Risks of Using Big Data in a Humanitarian Context

Annotated Selected Reading List (in alphabetical order)

Karlsrud, John. “Peacekeeping 4.0: Harnessing the Potential of Big Data, Social Media, and Cyber Technologies.” Cyberspace and International Relations, 2013. http://bit.ly/235Qb3e

  • This chapter from the book “Cyberspace and International Relations” suggests that advances in big data give humanitarian organizations unprecedented opportunities to prevent and mitigate natural disasters and humanitarian crises. However, the sheer amount of unstructured data necessitates effective “data mining” strategies for multinational organizations to make the best use of this data.
  • By profiling some civil-society organizations that use big data in their peacekeeping efforts, Karlsrud suggests that these community-focused initiatives are leading the movement toward analyzing and using big data in countries vulnerable to crisis.
  • The chapter concludes by offering ten recommendations to UN peacekeeping forces to best realize the potential of big data and new technology in supporting their operations.

Mancini, Francesco. “New Technology and the Prevention of Violence and Conflict.” International Peace Institute, 2013. http://bit.ly/1ltLfNV

  • This report from the International Peace Institute looks at five case studies to assess how information and communications technologies (ICTs) can help prevent humanitarian conflicts and violence. Its findings suggest that context has a significant impact on the ability of these ICTs to prevent conflict, and that any strategy must take the specific contingencies of the region into account to be successful.
  • The report suggests seven lessons gleaned from the five case studies, including the following:
    • New technologies are just one in a variety of tools to combat violence. Consequently, organizations must investigate a variety of complementary strategies to prevent conflicts, and not simply rely on ICTs.
    • Not every community or social group will have the same relationship to technology, and their ability to adopt new technologies is similarly influenced by their context. Therefore, a detailed needs assessment must take place before any new technologies are implemented.
    • New technologies may be co-opted by violent groups seeking to maintain conflict in the region. Consequently, humanitarian groups must be sensitive to existing political actors and be aware of possible negative consequences these new technologies may spark.
    • Local input is integral to supporting conflict prevention measures, and there is a need for collaboration and awareness-raising with communities to ensure new technologies are sustainable and effective.
    • Information shared between civil-society groups has greater potential to develop early-warning systems. This horizontal distribution of information can also allow communities to hold local leaders accountable.

Meier, Patrick. “Digital Humanitarians: How Big Data Is Changing the Face of Humanitarian Response.” CRC Press, 2015. http://amzn.to/1RQ4ozc

  • This book traces the emergence of “Digital Humanitarians”—people who harness new digital tools and technologies to support humanitarian action. Meier suggests that this has created a “nervous system” to connect people from disparate parts of the world, revolutionizing the way we respond to humanitarian crises.
  • Meier argues that such technology is reconfiguring the structure of the humanitarian space, where victims are not simply passive recipients of aid but can contribute alongside other global citizens. This, in turn, makes us more humane and engaged people.

Robertson, Andrew and Olson, Steve. “Using Data Sharing to Improve Coordination in Peacebuilding.” United States Institute of Peace, 2012. http://bit.ly/235QuLm

  • This report provides an overview of a roundtable workshop on Technology, Science and Peacebuilding held at the United States Institute of Peace. The workshop aimed to investigate how data-sharing techniques can be developed for use in peacebuilding or conflict management.
  • Four main themes emerged from discussions during the workshop:
    • “Data sharing requires working across a technology-culture divide”—Data sharing needs the foundation of a strong relationship, which can depend on sociocultural, rather than technological, factors.
    • “Information sharing requires building and maintaining trust”—These relationships are often built on trust, which can include both technological and social perspectives.
    • “Information sharing requires linking civilian-military policy discussions to technology”—Even when sophisticated data-sharing technologies exist, continuous engagement between different stakeholders is necessary. Therefore, procedures used to maintain civil-military engagement should be broadened to include technology.
    • “Collaboration software needs to be aligned with user needs”—Technology providers need to keep in mind the needs of their users, in this case peacebuilders, in order to ensure sustainability.

United Nations Independent Expert Advisory Group on a Data Revolution for Sustainable Development. “A World That Counts: Mobilizing the Data Revolution.” 2014. https://bit.ly/2Cb3lXq

  • This report focuses on the potential benefits and risks data holds for sustainable development. Included in this is a strategic framework for using and managing data for humanitarian purposes. It describes a need for a multinational consensus to be developed to ensure data is shared effectively and efficiently.
  • It suggests that “people who are counted”—i.e., those who are included in data collection processes—have better development outcomes and a better chance of receiving humanitarian response in emergency or conflict situations.

Whipkey, Katie and Verity, Andrej. “Guidance for Incorporating Big Data into Humanitarian Operations.” Digital Humanitarian Network, 2015. http://bit.ly/1Y2BMkQ

  • This report produced by the Digital Humanitarian Network provides an overview of big data, and how humanitarian organizations can integrate this technology into their humanitarian response. It primarily functions as a guide for organizations, and provides concise, brief outlines of what big data is, and how it can benefit humanitarian groups.
  • The report puts forward four main benefits acquired through the use of big data by humanitarian organizations: 1) the ability to leverage real-time information; 2) the ability to make more informed decisions; 3) the ability to gain new insights; 4) the ability for organizations to be better prepared.
  • It goes on to assess seven challenges big data poses for humanitarian organizations, among them: 1) geography, and the unequal access to technology across regions; 2) the potential for user error when processing data; 3) limited technology; 4) questionable validity of data; 5) underdeveloped policies and ethics relating to data management; and 6) limitations relating to staff knowledge.

Risks of Using Big Data in a Humanitarian Context

Crawford, Kate, and Megan Finn. “The limits of crisis data: analytical and ethical challenges of using social and mobile data to understand disasters.” GeoJournal 80.4, 2015. http://bit.ly/1X0F7AI

  • Crawford & Finn present a critical analysis of the use of big data in disaster management, taking a more skeptical tone to the data revolution facing humanitarian response.
  • They argue that though social and mobile data analysis can yield important insights and tools in crisis events, it also presents a number of limitations that can lead to oversights by researchers or humanitarian response teams.
  • Crawford & Finn explore the ethical concerns the use of big data in disaster events introduces, including issues of power, privacy, and consent.
  • The paper concludes by recommending that critical data studies, such as those presented in the paper, be integrated into crisis event research in order to analyze some of the assumptions which underlie mobile and social data.

Jacobsen, Katja Lindskov. “Making design safe for citizens: A hidden history of humanitarian experimentation.” Citizenship Studies 14.1: 89-103, 2010. http://bit.ly/1YaRTwG

  • This paper explores the phenomenon of “humanitarian experimentation,” where victims of disaster or conflict are the subjects of experiments to test the application of technologies before they are administered in greater civilian populations.
  • By analyzing the use of iris recognition technology during the repatriation of Afghan refugees from Pakistan between 2002 and 2007, Jacobsen suggests that this “humanitarian experimentation” compromises the security of already vulnerable refugees in order to better deliver biometric products to the rest of the world.

Responsible Data Forum. “Responsible Data Reflection Stories: An Overview.” http://bit.ly/1Rszrz1

  • This piece from the Responsible Data Forum is primarily a compilation of “war stories” that trace some of the challenges of using big data for social good. Drawing on these crowdsourced cases, the Forum also presents an overview with key recommendations for overcoming some of the challenges associated with big data in humanitarian organizations.
  • It finds that most of these challenges occur when organizations are ill-equipped to manage data and new technologies, or are unaware of how different groups interact differently in digital spaces.

Sandvik, Kristin Bergtora. “The humanitarian cyberspace: shrinking space or an expanding frontier?” Third World Quarterly 37:1, 17-32, 2016. http://bit.ly/1PIiACK

  • This paper analyzes the shift toward more technology-driven humanitarian work, which increasingly takes place online in cyberspace and is reshaping the definition and application of aid. This has occurred alongside what many suggest is a shrinking of the humanitarian space.
  • Sandvik provides three interpretations of this phenomenon:
    • First, traditional threats remain in the humanitarian space, which are both modified and reinforced by technology.
    • Second, new threats are introduced by the increasing use of technology in humanitarianism, and consequently the humanitarian space may be broadening, not shrinking.
    • Finally, if the shrinking humanitarian space theory holds, cyberspace offers one example of this, where the increasing use of digital technology to manage disasters leads to a contraction of space through the proliferation of remote services.

Additional Readings on Data and Humanitarian Response

* Thanks to: Kristin B. Sandvik; Zara Rahman; Jennifer Schulte; Sean McDonald; Paul Currion; Dinorah Cantú-Pedraza and the Responsible Data Listserv for valuable input.

Elements of a New Ethical Framework for Big Data Research


The Berkman Center is pleased to announce the publication of a new paper from the Privacy Tools for Sharing Research Data project team. In this paper, Effy Vayena, Urs Gasser, Alexandra Wood, and David O’Brien from the Berkman Center, with Micah Altman from MIT Libraries, outline elements of a new ethical framework for big data research.

Emerging large-scale data sources hold tremendous potential for new scientific research into human biology, behaviors, and relationships. At the same time, big data research presents privacy and ethical challenges that the current regulatory framework is ill-suited to address. In light of the immense value of large-scale research data, the central question moving forward is not whether such data should be made available for research, but rather how the benefits can be captured in a way that respects fundamental principles of ethics and privacy.

The authors argue that a framework with the following elements would support big data utilization and help harness the value of big data in a sustainable and trust-building manner:

  • Oversight should aim to provide universal coverage of human subjects research, regardless of funding source, across all stages of the information lifecycle.

  • New definitions and standards should be developed based on a modern understanding of privacy science and the expectations of research subjects.

  • Researchers and review boards should be encouraged to incorporate systematic risk-benefit assessments and new procedural and technological solutions from the wide range of interventions that are available.

  • Oversight mechanisms and the safeguards implemented should be tailored to the intended uses, benefits, threats, harms, and vulnerabilities associated with a specific research activity.

Development of a new ethical framework with these elements should be the product of a dynamic multistakeholder process that is designed to capture the latest scientific understanding of privacy, analytical methods, available safeguards, community and social norms, and best practices for research ethics as they evolve over time.

The full paper is available for download through the Washington and Lee Law Review Online as part of a collection of papers featured at the Future of Privacy Forum workshop Beyond IRBs: Designing Ethical Review Processes for Big Data Research held on December 10, 2015, in Washington, DC….(More)”

Mapping a flood of new data


Rebecca Lipman at Economist Intelligence Unit Perspectives on “One city tweets to stay dry: From drones to old-fashioned phone calls, data come from many unlikely sources. In a disaster, such as a flood or earthquake, responders will take whatever information they can get to visualise the crisis and best direct their resources. Increasingly, cities prone to natural disasters are learning to better aid their citizens by empowering their local agencies and responders with sophisticated tools to cut through the large volume and velocity of disaster-related data and synthesise actionable information.

Consider the plight of the metro area of Jakarta, Indonesia, home to some 28m people, 13 rivers and 1,100 km of canals. With 40% of the city below sea level (and sinking), and regularly subject to extreme weather events including torrential downpours in monsoon season, Jakarta’s residents face far-too-frequent, life-threatening floods. Despite the unpredictability of flooding conditions, citizens have long taken a passive approach that depended on government entities to manage the response. But the information Jakarta’s responders had on the flooding conditions was patchy at best. So in the last few years, the government began to turn to the local population for help. It helped.

Today, Jakarta’s municipal government is relying on the web-based PetaJakarta.org project and a handful of other crowdsourcing mobile apps such as Qlue and CROP to collect data and respond to floods and other disasters. Through these programmes, crowdsourced, time-sensitive data derived from citizens’ social-media inputs have made it possible for city agencies to more precisely map the locations of rising floods and help the residents at risk. In January 2015, for example, the web-based Peta Jakarta received 5,209 reports on floods via tweets with detailed text and photos. Anytime there’s a flood, Peta Jakarta’s data from the tweets are mapped and updated every minute, and often cross-checked by Jakarta Disaster Management Agency (BPBD) officials through calls with community leaders to assess the information and guide responders.

But in any city Twitter is only one piece of a very large puzzle. …

Even with such life-and-death examples, government agencies remain deeply protective of data because of issues of security, data ownership and citizen privacy. They are also concerned about liability issues if incorrect data lead to an activity that has unsuccessful outcomes. These concerns encumber the combination of crowdsourced data with operational systems of record, and impede the fast progress needed in disaster situations….Download the case study.”
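
The Jakarta excerpt above describes a pipeline in which geotagged, crowdsourced flood reports are mapped and refreshed minute by minute so responders can see where water is rising. As a purely illustrative sketch of that idea (not PetaJakarta’s actual code), the short Python program below bins report coordinates onto a coarse grid and flags cells with enough corroborating reports; the field names, grid size and threshold are assumptions made for the example only.

from collections import Counter
from dataclasses import dataclass

@dataclass
class FloodReport:
    lat: float   # latitude of the report (e.g. from a geotagged tweet)
    lon: float   # longitude of the report
    text: str    # free-text description supplied by the citizen

def grid_cell(lat: float, lon: float, cell_size: float = 0.01) -> tuple[int, int]:
    """Map a coordinate to a coarse grid cell (roughly 1 km at Jakarta's latitude)."""
    return (int(lat / cell_size), int(lon / cell_size))

def hotspots(reports: list[FloodReport], min_reports: int = 3) -> dict[tuple[int, int], int]:
    """Count reports per cell and keep cells with enough corroborating reports.

    Cells below the threshold would, in practice, be queued for manual
    cross-checking (e.g. calls to community leaders) rather than discarded.
    """
    counts = Counter(grid_cell(r.lat, r.lon) for r in reports)
    return {cell: n for cell, n in counts.items() if n >= min_reports}

if __name__ == "__main__":
    sample = [
        FloodReport(-6.2101, 106.8451, "water knee-deep on our street"),
        FloodReport(-6.2105, 106.8449, "flooding near the canal"),
        FloodReport(-6.2103, 106.8460, "road impassable"),
        FloodReport(-6.1800, 106.8300, "light rain only"),
    ]
    for cell, n in hotspots(sample).items():
        print(f"grid cell {cell}: {n} reports -> flag for responders")

Run as-is, the sketch flags the one cell containing the three clustered reports; a real deployment would instead pull reports from a live stream and render flagged cells on a map rather than printing them.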

The Function of—and Need for—Institutional Review Boards


Review of The Censor’s Hand: The Misregulation of Human-Subject Research (Carl E. Schneider, The MIT Press): “Scientific research can be a laborious and frustrating process even before it gets started—especially when it involves living human subjects. Universities and other research institutions maintain Institutional Review Boards that scrutinize research proposals and their methodologies, consent and privacy procedures, and so on. Similarly intensive reviews are required when the intention is to use human tissue—if, say, tissue from diagnostic cancer biopsies could potentially be used to gauge the prevalence of some other illness across the population. These procedures can generate absurdities. A doctor who wanted to know which television characters children recognized, for example, was advised to seek ethics committee approval, and told that he needed to do a pilot study as a precursor.

Today’s IRB system is the response to a historic problem: academic researchers’ tendency to behave abominably when left unmonitored. Nazi medical and pseudomedical experiments provide an obvious and well-known reference, but such horrors are not found only in totalitarian regimes. The Tuskegee syphilis study, for example, deliberately left black men untreated over the course of decades so researchers could study the natural course of the disease. On a much smaller but equally disturbing scale is the case of Dan Markingson, a 26-year-old University of Michigan graduate. Suffering from psychotic illness, Markingson was coercively enrolled in a study of antipsychotics to which he could not consent, and concerns about his deteriorating condition were ignored. In 2004, he was found dead, having almost decapitated himself with a box cutter.

Many thoughtful ethicists are aware of the imperfections of IRBs. They have worried publicly for some time that the IRB system, or parts of it, may claim an authority with which even many bioethicists are uncomfortable, and hinder science for no particularly good reason. Does the system need re-tuning, a total re-build, or something even more drastic?

When it comes to IRBs, Carl E. Schneider, a professor of law and internal medicine at the University of Michigan, belongs to the abolitionist camp. In The Censor’s Hand: The Misregulation of Human-Subject Research, he presents the case against the IRB system plainly. It is a case that rests on seven related charges.

IRBs, Schneider posits, cannot be shown to do good, with regulators able to produce “no direct evidence that IRBs prevented harm”; that an IRB at least went through the motions of reviewing the trial in which Markingson died might be cited as evidence of this. On top of that, he claims, IRBs sometimes cause harm, at least insofar as they slow down medical innovation. They are built to err on the side of caution, since “research on humans” can cover a vast range of activities and disciplines, and they struggle to take this range into proper account. Correspondingly, they “lack a legible and convincing ethics”; the autonomy of IRBs means that they come to different decisions on identical cases. (In one case, an IRB thought that providing supplemental vitamin A in a study was so dangerous that it should not be allowed; another thought that withholding it in the same study was so dangerous that it should not be allowed.) IRBs have unrealistically high expectations of their members, who are often fairly ad hoc groupings with no obvious relevant expertise. They overemphasize informed consent, with the unintended consequence that cramming every possible eventuality into a consent form makes it utterly incomprehensible. Finally, Schneider argues, IRBs corrode free expression by restricting what researchers can do and how they can do it….(More)”

Open Data Impact: When Demand and Supply Meet


Stefaan Verhulst and Andrew Young at the GovLab: “Today, in “Open Data Impact: When Demand and Supply Meet,” the GovLab and Omidyar Network release key findings about the social, economic, cultural and political impact of open data. The findings are based on 19 detailed case studies of open data projects from around the world. These case studies were prepared in order to address an important shortcoming in our understanding of when, and how, open data works. While there is no shortage of enthusiasm for open data’s potential, nor of conjectural estimates of its hypothetical impact, few rigorous, systematic analyses exist of its concrete, real-world impact…. The 19 case studies that inform this report, all of which can be found at Open Data’s Impact (odimpact.org), a website specially set up for this project, were chosen for their geographic and sectoral representativeness. They seek to go beyond the descriptive (what happened) to the explanatory (why it happened, and what is the wider relevance or impact)….

In order to achieve the potential of open data and scale the impact of the individual projects discussed in our report, we need a better – and more granular – understanding of the enabling conditions that lead to success. We found 4 central conditions (“4Ps”) that play an important role in ensuring success:

  • Partnerships: Intermediaries and data collaboratives play an important role in ensuring success, allowing for enhanced matching of supply and demand of data.
  • Public infrastructure: Developing open data as a public infrastructure, open to all, enables wider participation, and a broader impact across issues and sectors.
  • Policies: Clear policies regarding open data, including those promoting regular assessments of open data projects, are also critical for success.
  • Problem definition: Open data initiatives that have a clear target or problem definition have more impact and are more likely to succeed than those with vaguely worded statements of intent or unclear reasons for existence. 

Core Challenges

Finally, the success of a project is also determined by the obstacles and challenges it confronts. Our research uncovered 4 major challenges (“4Rs”) confronting open data initiatives across the globe:

  • Readiness: A lack of readiness or capacity (evident, for example, in low Internet penetration or technical literacy rates) can severely limit the impact of open data.
  • Responsiveness: Open data projects are significantly more likely to be successful when they remain agile and responsive—adapting, for instance, to user feedback or early indications of success and failure.
  • Risks: For all its potential, open data does pose certain risks, notably to privacy and security; a greater, more nuanced understanding of these risks will be necessary to address and mitigate them.
  • Resource Allocation: While open data projects can often be launched cheaply, those projects that receive generous, sustained and committed funding have a better chance of success over the medium and long term.

Toward a Next Generation Open Data Roadmap

The report we release today concludes with ten recommendations for policymakers, advocates, users, funders and other stakeholders in the open data community. For each step, we include a few concrete methods of implementation – ways to translate the broader recommendation into meaningful impact.

Together, these 10 recommendations and their means of implementation amount to what we call a “Next Generation Open Data Roadmap.” This roadmap is just a start, and we plan to continue fleshing it out in the near future. For now, it offers a way forward. It is our hope that this roadmap will help guide future research and experimentation so that we can continue to better understand how the potential of open data can be fulfilled across geographies, sectors and demographics.

Additional Resources

In conjunction with the release of our key findings paper, we also launch today an “Additional Resources” section on the Open Data’s Impact website. The goal of that section is to provide context on our case studies, and to point in the direction of other, complementary research. It includes the following elements:

  • A “repository of repositories,” including other compendiums of open data case studies and sources;
  • A compilation of some popular open data glossaries;
  • A number of open data research publications and reports, with a particular focus on impact;
  • A collection of open data definitions and a matrix of analysis to help assess those definitions….(More)

Crowdsourced Health


Book by Elad Yom-Tov: “Most of us have gone online to search for information about health. What are the symptoms of a migraine? How effective is this drug? Where can I find more resources for cancer patients? Could I have an STD? Am I fat? A Pew survey reports more than 80 percent of American Internet users have logged on to ask questions like these. But what if the digital traces left by our searches could show doctors and medical researchers something new and interesting? What if the data generated by our searches could reveal information about health that would be difficult to gather in other ways? In this book, Elad Yom-Tov argues that Internet data could change the way medical research is done, supplementing traditional tools to provide insights not otherwise available. He describes how studies of Internet searches have, among other things, already helped researchers to track the side effects of prescription drugs, to understand the information needs of cancer patients and their families, and to recognize some of the causes of anorexia.

Yom-Tov shows that the information collected can benefit humanity without sacrificing individual privacy. He explains why people go to the Internet with health questions; for one thing, it seems to be a safe place to ask anonymously about such matters as obesity, sex, and pregnancy. He describes the detrimental effects of “pro-anorexia” online content; tells how computer scientists can scour search engine data to improve public health by, for example, identifying risk factors for disease and centers of contagion; and explains how analyses of the ways people deal with upsetting diagnoses can help doctors to treat patients and patients to understand their conditions….(More)

Access to Government Information in the United States: A Primer


Wendy Ginsberg and Michael Greene at Congressional Research Service: “No provision in the U.S. Constitution expressly establishes a procedure for public access to executive branch records or meetings. Congress, however, has legislated various public access laws. Among these laws are two records access statutes,

  • the Freedom of Information Act (FOIA; 5 U.S.C. §552), and
  • the Privacy Act (5 U.S.C. §552a),

and two meetings access statutes,

  • the Federal Advisory Committee Act (FACA; 5 U.S.C. App.), and
  • the Government in the Sunshine Act (5 U.S.C. §552b).

These four laws provide the foundation for access to executive branch information in the American federal government. The records-access statutes provide the public with a variety of methods to examine how executive branch departments and agencies execute their missions. The meeting-access statutes provide the public the opportunity to participate in and inform the policy process. These four laws are also among the most used and most litigated federal access laws.

While the four statutes provide the public with access to executive branch federal records and meetings, they do not apply to the legislative or judicial branches of the U.S. government. The American separation of powers model of government provides a collection of formal and informal methods that the branches can use to provide information to one another. Moreover, the separation of powers anticipates conflicts over the accessibility of information. These conflicts are neither unexpected nor necessarily destructive. Although there is considerable interbranch cooperation in the sharing of information and records, such conflicts over access may continue on occasion.

This report offers an introduction to the four access laws and provides citations to additional resources related to these statutes. This report includes statistics on the use of FOIA and FACA and on litigation related to FOIA. The 114th Congress may have an interest in overseeing the implementation of these laws or may consider amending the laws. In addition, this report provides some examples of the methods Congress, the President, and the courts have employed to provide or require the provision of information to one another. This report is a primer on information access in the U.S. federal government and provides a list of resources related to transparency, secrecy, access, and nondisclosure….(More)”

Social Media for Government: Theory and Practice


Book edited by Staci M. Zavattaro and Thomas A. Bryer: “Social media is playing a growing role within public administration, and with it, there is an increasing need to understand the connection between social media research and what actually takes place in government agencies. Most of the existing books on the topic are scholarly in nature, often leaving out the vital theory-practice connection. This book joins theory with practice within the public sector, and explains how the effectiveness of social media can be maximized. The chapters are written by leading practitioners and span topics like how to manage employee use of social media sites, how emergency managers reach the public during a crisis situation, how to apply public record management methods to social media efforts, how to create a social media brand, how social media can help meet government objectives such as transparency while juggling privacy laws, and much more. For each topic, a collection of practitioner insights regarding the best practices and tools they have discovered is included. Social Media for Government responds to calls within the overall public administration discipline to enhance the theory-practice connection, giving practitioners space to tell academics what is happening in the field in order to encourage further meaningful research into social media use within government….(More)

Cities, Data, and Digital Innovation


Paper by Mark Kleinman: “Developments in digital innovation and the availability of large-scale data sets create opportunities for new economic activities and new ways of delivering city services while raising concerns about privacy. This paper defines the terms Big Data, Open Data, Open Government, and Smart Cities and uses two case studies – London (U.K.) and Toronto – to examine questions about using data to drive economic growth, improve the accountability of government to citizens, and offer more digitally enabled services. The paper notes that London has been one of a handful of cities at the forefront of the Open Data movement and has been successful in developing its high-tech sector, although it has so far been less innovative in the use of “smart city” technology to improve services and lower costs. Toronto has also made efforts to harness data, although it is behind London in promoting Open Data. Moreover, although Toronto has many assets that could contribute to innovation and economic growth, including a growing high-technology sector, world-class universities and research base, and its role as a leading financial centre, it lacks a clear narrative about how these assets could be used to promote the city. The paper draws some general conclusions about the links between data innovation and economic growth, and between open data and open government, as well as ways to use big data and technological innovation to ensure greater efficiency in the provision of city services…(More)