The State of Open Data Portals in Latin America


Michael Steinberg at the Center for Data Innovation: “Many Latin American countries publish open data—government data made freely available online in machine-readable formats and without license restrictions. However, there is tremendous variation in the quantity and type of datasets governments publish on national open data portals—central online repositories for open data that make it easier for users to find data. Despite the wide variation among the countries, the most popular datasets tend to be those that either provide transparency into government operations or offer information that citizens can use directly. As governments continue to update and improve their open data portals, they should take steps to ensure that they are publishing the datasets most valuable to their citizens.

To better understand this variation, we collected information about open data portals in 20 Latin American countries, including Argentina, Bolivia, Brazil, Chile, Colombia, Costa Rica, Ecuador, Mexico, Panama, Paraguay, Peru, and Uruguay. Not all Latin American countries have an open data portal, but even governments that do not operate a unified portal may still publish open data. Four Latin American countries—Belize, Guatemala, Honduras, and Nicaragua—do not have open data portals. One country—El Salvador—does not have a government-run open data portal, but does have a national open data portal (datoselsalvador.org) run by volunteers….

There are many steps Latin American governments can take to improve open data in their countries. Nations without open data portals should create them, and those that already have them should continue to update them and publish more datasets to better serve their constituents. One way to do this is to monitor the popular datasets on other countries’ open data portals and, where applicable, ensure the government produces similar datasets. Those running open data portals should also routinely monitor search queries to see what users are looking for, and if users are searching for datasets that have not yet been posted, work with the relevant government agencies to make those datasets available.
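The query-monitoring recommendation lends itself to a simple batch job. Below is a minimal sketch, assuming a hypothetical CSV export of portal search logs and a hand-maintained set of catalog keywords; real portal software (CKAN, Socrata, and the like) records searches differently, so the file name and column used here are illustrative only.

```python
# Minimal sketch: surface search queries with no matching published dataset.
# Assumes a hypothetical "queries.csv" log with one search per row in a
# "query" column; the catalog keyword set is likewise illustrative.
import csv
from collections import Counter

def unmet_demand(log_path: str, catalog_keywords: set, top_n: int = 20):
    """Return the most frequent queries that match no catalog keyword."""
    misses = Counter()
    with open(log_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            query = row["query"].strip().lower()
            if query and not any(kw in query for kw in catalog_keywords):
                misses[query] += 1
    return misses.most_common(top_n)  # candidate datasets to publish next

if __name__ == "__main__":
    catalog = {"budget", "health", "transport", "education", "crime"}
    for query, count in unmet_demand("queries.csv", catalog):
        print(f"{count:5d}  {query}")
```

A report like this, reviewed periodically with the agencies that hold the underlying records, is one concrete way to turn search logs into a publication queue.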

In summary, there are stark differences in the amount of data published, the format of the data, and the most popular datasets in open data portals in Latin America. However, in every country there is an appetite for data that either provides public accountability for government functions or supplies helpful information to citizens…(More)”.

Using Collaboration to Harness Big Data for Social Good


Jake Porway at SSIR: “These days, it’s hard to get away from the hype around “big data.” We read articles about how Silicon Valley is using data to drive everything from website traffic to autonomous cars. We hear speakers at social sector conferences talk about how nonprofits can maximize their impact by leveraging new sources of digital information like social media data, open data, and satellite imagery.

Braving this world can be challenging, we know. Creating a data-driven organization can require big changes in culture and process. Some nonprofits, like Crisis Text Line and Watsi, started off boldly by building their own data science teams. But for the many other organizations wondering how to best use data to advance their mission, we’ve found that one ingredient works better than all the software and tech that you can throw at a problem: collaboration.

As a nonprofit dedicated to applying data science for social good, DataKind has run more than 200 projects in collaboration with other nonprofits worldwide by connecting them to teams of volunteer data scientists. What do the most successful ones have in common? Strong collaborations on three levels: with data science experts, within the organization itself, and across the nonprofit sector as a whole.

1. Collaborate with data science experts to define your project. As we often say, finding problems can be harder than finding solutions. ….

2. Collaborate across your organization to “build with, not for.” Our projects follow the principles of human-centered design and the philosophy pioneered in the civic tech world of “design with, not for.” ….

3. Collaborate across your sector to move the needle. Many organizations think about building data science solutions for unique challenges they face, such as predicting the best location for their next field office. However, most of us are fighting common causes shared by many other groups….

By focusing on building strong collaborations on these three levels—with data experts, across your organization, and across your sector—you’ll go from merely talking about big data to making big impact….(More)”.

Using Open Data to Combat Corruption


Paper by Richard Rose: “Open data makes transparent whether public officials are conducting their activities in conformity with standards that can be bureaucratic, political or moral. Actions that violate these standards are colloquially lumped together under the heterogeneous heading of corruption. However, the payment of a large bribe for a multi-million contract differs in kind from a party saying one thing to win votes and doing another once in office or an individual public figure promoting high standards of personal morality while conducting himself in private very differently. This paper conceptually distinguishes different forms of corruption with concrete examples. It also shows how different forms of corruption require different sanctions: punishment by the courts, by political leaders or the electorate, or by public morality and a sense of individual shame. Such sanctions are most effective when there is normative agreement that standards have been violated. However, there can be partisan as well as normative disagreement about whether standards have been violated. The paper concludes by pointing out that differences in violating standards require different policy responses….(More)”

Index: Collective Intelligence


By Hannah Pierce and Audrie Pirkl

The Living Library Index – inspired by the Harper’s Index – provides important statistics and highlights global trends in governance innovation. This installment focuses on collective intelligence and was originally published in 2017.

The Collective Intelligence Universe

  • Amount of money that Reykjavik’s Better Neighbourhoods program has provided each year to crowdsourced citizen projects since 2012: €2 million (Citizens Foundation)
  • Number of U.S. government challenges currently open for public participation and solution submissions: 778 (Challenge.gov)
  • Percent of U.S. arts organizations that used social media in 2013 to crowdsource ideas, from programming decisions to seminar scheduling details: 52% (Pew Research)
  • Number of Wikipedia editors who have contributed to a page in the last 30 days: over 120,000 (Wikipedia Page Statistics)
  • Number of languages that the multinational crowdsourced Letters for Black Lives has been translated into: 23 (Letters for Black Lives)
  • Number of comments in a Reddit thread that established a more comprehensive timeline of the Aurora theater shooting than the news media had produced: 1,272 (Reddit)
  • Number of physicians who are members of SERMO, a platform to crowdsource medical research: 800,000 (SERMO)
  • Number of citizen science projects registered on SciStarter: over 1,500 (Collective Intelligence 2017 Plenary Talk: Darlene Cavalier)
  • Entrants to NASA’s 2009 TopCoder Challenge: over 1,800 (NASA)

Infrastructure

  • Number of submissions to Blockholm (a digital platform that allows citizens to build “Minecraft” ideas on vacant Stockholm city lots) within the first six months: over 10,000 (OpenLearn)
  • Number of people engaged by the Participatory Budgeting Project in the U.S.: over 300,000 (Participatory Budgeting Project)
    • Amount of money allocated to community projects through this initiative: $238,000,000

Health

  • Percentage of U.S. internet-using adults with chronic health conditions who have gone online to connect with others suffering from similar conditions: 23% (Pew Research)
  • Number of posts to Patient Opinion, a UK-based platform for patients to provide anonymous feedback to healthcare providers: over 120,000 (Nesta)
    • Percent of NHS health trusts utilizing the posts to improve services in 2015: 90%
    • Stories posted per month: nearly 1,000 (The Guardian)
  • Number of tumors reported to England’s National Cancer Registration Service each year: over 300,000 (Gov.UK)
  • Number of users of an open source artificial pancreas system: 310 (Collective Intelligence 2017 Plenary Talk: Dana Lewis)

Government

  • Number of submissions from 40 countries to the 2017 Open (Government) Contracting Innovation Challenge: 88 (The Open Data Institute)
  • Public-service complaints received each day via Indonesian digital platform Lapor!: over 500 (McKinsey & Company)
  • Number of registered users of Unicef Uganda’s weekly SMS poll, U-Report: 356,468 (U-Report)
  • Number of reports regarding government corruption in India submitted to IPaidaBribe since 2011: over 140,000 (IPaidaBribe)

Business

  • Reviews posted since Yelp’s founding in 2004: 121 million (Statista)
  • Percent of Americans in 2016 who trust online customer reviews as much as personal recommendations: 84% (BrightLocal)
  • Number of companies and their subsidiaries mapped through the OpenCorporates platform: 60 million (Omidyar Network)

Crisis Response and Public Safety

  • Number of sexual harassment reports submitted from 50 cities in India and Nepal to SafeCity, a crowdsourcing site and mobile app: over 4,000 (SafeCity)
  • Number of people who used Facebook’s Safety Check, a feature that is being used in a new disaster mapping project, in the first 24 hours after the terror attacks in Paris: 4.1 million (Facebook)

Rawification and the careful generation of open government data


Jérôme Denis and Samuel Goëta in Social Studies of Science: “Drawing on a two-year ethnographic study within several French administrations involved in open data programs, this article aims to investigate the conditions of the release of government data – the rawness of which open data policies require. This article describes two sets of phenomena. First, far from being taken for granted, open data emerge in administrations through a progressive process that entails uncertain collective inquiries and extraction work. Second, the opening process draws on a series of transformations, as data are modified to satisfy an important criterion of open data policies: the need for both human and technical intelligibility. There are organizational consequences of these two points, which can notably lead to the visibilization or the invisibilization of data labour. Finally, the article invites us to reconsider the apparent contradiction between the process of data release and the existence of raw data. Echoing the vocabulary of one of the interviewees, the multiple operations can be seen as a ‘rawification’ process by which open government data are carefully generated. Such a notion notably helps to build a relational model of what counts as data and what counts as work….(More)”.

Public Data Is More Important Than Ever–And Now It’s Easier To Find


Meg Miller at Co.Design: “Public data, in theory, is meant to be accessible to everyone. But in practice, even finding it can be near impossible, to say nothing of figuring out what to do with it once you do. Government data websites are often clunky and outdated, and some data is still trapped on physical media–like CDs or individual hard drives.

Tens of thousands of these CDs and hard drives, full of data on topics from Arkansas amusement parks to fire incident reporting, have arrived at the doorstep of the New York-based start-up Enigma over the past four years. The company has obtained thousands upon thousands more datasets by way of Freedom of Information Act (FOIA) requests. Enigma specializes in open data: gathering it, curating it, and analyzing it for insights into a client’s industry, for example, or for public service initiatives.

Enigma also shares its 100,000 datasets with the world through an online platform called Public—the broadest collection of public data that is open and searchable by everyone. Public has been around since Enigma launched in 2013, but today the company is introducing a redesigned version of the site that’s fresher and more user-friendly, with easier navigation and additional features that allow users to drill further down into the data.

But while the first iteration of Public was mostly concerned with making Enigma’s enormous trove of data—which it was already gathering and reformatting for client work—accessible to the public, the new site focuses more on linking that data in new ways. For journalists, researchers, and data scientists, the tool will offer more sophisticated ways of making sense of the data that they have access to through Enigma….

…the new homepage also curates featured datasets and collections to reinforce a sense of discoverability. For example, an Enigma-curated collection of U.S. sanctions data from the U.S. Treasury Department’s Office of Foreign Assets Control (OFAC) shows data on the restrictions on entities or individuals that American companies can and can’t do business with in an effort to achieve specific national security or foreign policy objectives. A new round of sanctions against Russia has been in the news lately, as an effort by President Trump to loosen restrictions on blacklisted businesses and individuals in Russia was overruled by the Senate last week. Enigma’s curated data selection on U.S. sanctions could help journalists contextualize recent events with data that shows changes in sanctions lists over time by presidential administration, for instance–or they could compare the U.S. sanctions list to the European Union’s….(More)”.
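As a hedged illustration of the kind of comparison the article describes: given two snapshots of a sanctions list saved as CSV files, a short script can surface which entities were added or removed between them. The file names and the single “name” column below are hypothetical; OFAC’s actual exports use their own layout.

```python
# Illustrative sketch of diffing two snapshots of a sanctions list.
# The snapshot files and their "name" column are hypothetical.
import pandas as pd

def diff_snapshots(old_path: str, new_path: str) -> pd.DataFrame:
    """List entities added to or removed from a sanctions list."""
    old = set(pd.read_csv(old_path)["name"].str.upper())
    new = set(pd.read_csv(new_path)["name"].str.upper())
    added, removed = sorted(new - old), sorted(old - new)
    return pd.DataFrame({
        "name": added + removed,
        "change": ["added"] * len(added) + ["removed"] * len(removed),
    })

# e.g. diff_snapshots("sdn_2016.csv", "sdn_2017.csv") would list entities
# designated or delisted between two administrations' snapshots.
```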

Regulation of Big Data: Perspectives on Strategy, Policy, Law and Privacy


Paper by Pompeu Casanovas, Louis de Koker, Danuta Mendelson and David Watts: “…presents four complementary perspectives stemming from governance, law, ethics, and computer science. Big, Linked, and Open Data constitute complex phenomena whose economic and political dimensions require a plurality of instruments to enhance and protect citizens’ rights. Some conclusions are offered at the end to foster a more general discussion.

This article contends that the effective regulation of Big Data requires a combination of legal tools and other instruments of a semantic and algorithmic nature. It commences with a brief discussion of the concept of Big Data and views expressed by Australian and UK participants in a study of Big Data use from a law enforcement and national security perspective. The second part of the article highlights the UN Special Rapporteur on the Right to Privacy’s interest in these themes and the focus of the Rapporteur’s new program on Big Data. UK law reforms regarding the authorisation of warrants for the exercise of bulk data powers are discussed in the third part. Reflecting on these developments, the paper closes with an exploration of the complex relationship between law and Big Data and the implications for the regulation and governance of Big Data….(More)”.

Open Data’s Effect on Food Security


Jeremy de Beer, Jeremiah Baarbé, and Sarah Thuswaldner at Open AIR: “Agricultural data is a vital resource in the effort to address food insecurity. This data is used across the food-production chain. For example, farmers rely on agricultural data to decide when to plant crops, scientists use data to conduct research on pests and design disease-resistant plants, and governments make policy based on land use data. As the value of agricultural data becomes better understood, there is a growing call for governments and firms to open their agricultural data.

Open data is data that anyone can access, use, or share. Open agricultural data has the potential to address food insecurity by making it easier for farmers and other stakeholders to access and use the data they need. Open data also builds trust and fosters collaboration among stakeholders that can lead to new discoveries to address the problems of feeding a growing population.

A network of partnerships is growing around agricultural data research. The Open African Innovation Research (Open AIR) network is researching open agricultural data in partnership with the Plant Phenotyping and Imaging Research Centre (P2IRC) and the Global Institute for Food Security (GIFS). This research builds on a partnership with the Global Open Data for Agriculture and Nutrition (GODAN) and they are exploring partnerships with Open Data for Development (OD4D) and other open data organizations.

…published two works on open agricultural data. Published in partnership with GODAN, “Ownership of Open Data” describes how intellectual property law defines ownership rights in data. Firms that collect data own the rights to that data, which is a major factor in the power dynamics of open data. In July, Jeremiah Baarbé and Jeremy de Beer will be presenting “A Data Commons for Food Security” …The paper proposes a licensing model that allows farmers to benefit from the datasets to which they contribute. The license supports SME data collectors, who need sophisticated legal tools; contributors, who need engagement, privacy, control, and benefit sharing; and consumers, who need open access….(More)”.

The final Global Open Data Index is now live


Open Knowledge International: “The updated Global Open Data Index has been published today, along with our report on the state of Open Data this year. The report includes a broad overview of the problems we found around data publication and how we can improve government open data. You can download the full report here.

Also, after the Public Dialogue phase, we have updated the Index. You can see the updated edition here

We will also keep our forum open for discussions about open data quality and publication. You can see the conversation here.”

Inside the Algorithm That Tries to Predict Gun Violence in Chicago


Jeff Asher and Rob Arthur at The Upshot: “Gun violence in Chicago has surged since late 2015, and much of the news media attention on how the city plans to address this problem has focused on the Strategic Subject List, or S.S.L.

The list is made by an algorithm that tries to predict who is most likely to be involved in a shooting, either as perpetrator or victim. The algorithm is not public, but the city has now placed a version of the list — without names — online through its open data portal, making it possible for the first time to see how Chicago evaluates risk.

We analyzed that information and found that the assigned risk scores — and what characteristics go into them — are sometimes at odds with the Chicago Police Department’s public statements and cut against some common perceptions.

■ Violence in the city is less concentrated at the top — among a group of about 1,400 people with the highest risk scores — than some public comments from the Chicago police have suggested.

■ Gangs are often blamed for the devastating increase in gun violence in Chicago, but gang membership had a small predictive effect and is being dropped from the most recent version of the algorithm.

■ Being a victim of a shooting or an assault is far more predictive of future gun violence than being arrested on charges of domestic violence or weapons possession.

■ The algorithm has been used in Chicago for several years, and its effectiveness is far from clear. Chicago accounted for a large share of the increase in urban murders last year….(More)”.
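For readers curious about the general shape of such a tool: the actual S.S.L. model is not public, but risk lists of this kind are typically built from regression-style models over per-person features. The sketch below is purely illustrative, using synthetic data and invented feature weights that loosely echo the article’s findings (victimization history mattering far more than gang affiliation); it is not Chicago’s algorithm.

```python
# Purely illustrative risk-scoring sketch; synthetic data, not Chicago's.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5_000
# Hypothetical per-person features.
X = np.column_stack([
    rng.poisson(0.1, n),    # prior shooting/assault victimizations
    rng.poisson(0.3, n),    # assault arrests
    rng.poisson(0.2, n),    # weapons arrests
    rng.integers(0, 2, n),  # gang affiliation flag (0/1)
])
# Invented weights that loosely echo the article: victimization history
# dominates, gang affiliation contributes little.
logit = -4 + 2.0 * X[:, 0] + 0.8 * X[:, 1] + 0.5 * X[:, 2] + 0.1 * X[:, 3]
y = rng.random(n) < 1 / (1 + np.exp(-logit))

# Fit the model and report the learned weight per feature.
model = LogisticRegression().fit(X, y)
print(dict(zip(
    ["victimization", "assault_arrests", "weapons_arrests", "gang"],
    model.coef_[0].round(2),
)))
```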