Regulation of Big Data: Perspectives on Strategy, Policy, Law and Privacy


Paper by Pompeu Casanovas, Louis de Koker, Danuta Mendelson and David Watts: “…presents four complementary perspectives stemming from governance, law, ethics, and computer science. Big, Linked, and Open Data constitute complex phenomena whose economic and political dimensions require a plurality of instruments to enhance and protect citizens’ rights. Some conclusions are offered at the end to foster a more general discussion.

This article contends that the effective regulation of Big Data requires a combination of legal tools and other instruments of a semantic and algorithmic nature. It commences with a brief discussion of the concept of Big Data and views expressed by Australian and UK participants in a study of Big Data use from a law enforcement and national security perspective. The second part of the article highlights the interest of the UN Special Rapporteur on the Right to Privacy in these themes and the focus of their new program on Big Data. UK law reforms regarding the authorisation of warrants for the exercise of bulk data powers are discussed in the third part. Reflecting on these developments, the paper closes with an exploration of the complex relationship between law and Big Data and the implications for the regulation and governance of Big Data….(More)”.

Open Data’s Effect on Food Security


Jeremy de Beer, Jeremiah Baarbé, and Sarah Thuswaldner at Open AIR: “Agricultural data is a vital resource in the effort to address food insecurity. This data is used across the food-production chain. For example, farmers rely on agricultural data to decide when to plant crops, scientists use data to conduct research on pests and design disease resistant plants, and governments make policy based on land use data. As the value of agricultural data is understood, there is a growing call for governments and firms to open their agricultural data.

Open data is data that anyone can access, use, or share. Open agricultural data has the potential to address food insecurity by making it easier for farmers and other stakeholders to access and use the data they need. Open data also builds trust and fosters collaboration among stakeholders that can lead to new discoveries to address the problems of feeding a growing population.

 

A network of partnerships is growing around agricultural data research. The Open African Innovation Research (Open AIR) network is researching open agricultural data in partnership with the Plant Phenotyping and Imaging Research Centre (P2IRC) and the Global Institute for Food Security (GIFS). This research builds on a partnership with Global Open Data for Agriculture and Nutrition (GODAN), and the network is exploring partnerships with Open Data for Development (OD4D) and other open data organizations.

…published two works on open agricultural data. Published in partnership with GODAN, “Ownership of Open Data” describes how intellectual property law defines ownership rights in data. Firms that collect data own the rights to that data, which is a major factor in the power dynamics of open data. In July, Jeremiah Baarbé and Jeremy de Beer will be presenting “A Data Commons for Food Security” …The paper proposes a licensing model that allows farmers to benefit from the datasets to which they contribute. The license supports SME data collectors, who need sophisticated legal tools; contributors, who need engagement, privacy, control, and benefit sharing; and consumers, who need open access….(More)“.

The final Global Open Data Index is now live


Open Knowledge International: “The updated Global Open Data Index has been published today, along with our report on the state of Open Data this year. The report includes a broad overview of the problems we found around data publication and how we can improve government open data. You can download the full report here.

Also, after the Public Dialogue phase, we have updated the Index. You can see the updated edition here.

We will also keep our forum open for discussions about open data quality and publication. You can see the conversation here.”

Inside the Algorithm That Tries to Predict Gun Violence in Chicago


Gun violence in Chicago has surged since late 2015, and much of the news media attention on how the city plans to address this problem has focused on the Strategic Subject List, or S.S.L.

The list is made by an algorithm that tries to predict who is most likely to be involved in a shooting, either as perpetrator or victim. The algorithm is not public, but the city has now placed a version of the list — without names — online through its open data portal, making it possible for the first time to see how Chicago evaluates risk.

We analyzed that information and found that the assigned risk scores — and what characteristics go into them — are sometimes at odds with the Chicago Police Department’s public statements and cut against some common perceptions.

■ Violence in the city is less concentrated at the top — among a group of about 1,400 people with the highest risk scores — than some public comments from the Chicago police have suggested.

■ Gangs are often blamed for the devastating increase in gun violence in Chicago, but gang membership had a small predictive effect and is being dropped from the most recent version of the algorithm.

■ Being a victim of a shooting or an assault is far more predictive of future gun violence than being arrested on charges of domestic violence or weapons possession.

■ The algorithm has been used in Chicago for several years, and its effectiveness is far from clear. Chicago accounted for a large share of the increase in urban murders last year….(More)”.
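The first finding above, about how concentrated violence is among the roughly 1,400 highest-scoring people, is the kind of question the de-identified portal data lets anyone probe. Here is a minimal Python sketch of that analysis using entirely synthetic data (the real dataset's fields, score scale, and involvement rates differ; this only illustrates the shape of the computation):

```python
import random

random.seed(42)

# Synthetic stand-in for the de-identified list: risk scores from 0 to 500.
n_people = 100_000
scores = [random.uniform(0, 500) for _ in range(n_people)]

# Toy assumption: probability of involvement in a shooting rises with score.
involved = [random.random() < (s / 500) * 0.02 for s in scores]

# How concentrated is simulated violence among the 1,400 highest scorers?
ranked = sorted(zip(scores, involved), reverse=True)
top_involved = sum(inv for _, inv in ranked[:1400])
total_involved = sum(involved)
share = top_involved / total_involved
print(f"Top 1,400 account for {share:.1%} of simulated shootings")
```

With the real portal data, the same comparison (share of actual outcomes attributable to the top of the list) is what tests the police department's public claims about concentration.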

Our path to better science in less time using open data science tools


Julia S. Stewart Lowndes et al in Nature: “Reproducibility has long been a tenet of science but has been challenging to achieve—we learned this the hard way when our old approaches proved inadequate to efficiently reproduce our own work. Here we describe how several free software tools have fundamentally upgraded our approach to collaborative research, making our entire workflow more transparent and streamlined. By describing specific tools and how we incrementally began using them for the Ocean Health Index project, we hope to encourage others in the scientific community to do the same—so we can all produce better science in less time.

Figure 1: Better science in less time, illustrated by the Ocean Health Index project.

Every year since 2012 we have repeated Ocean Health Index (OHI) methods to track change in global ocean health [36,37]. Increased reproducibility and collaboration have reduced the amount of time required to repeat methods (size of bubbles) with updated data annually, allowing us to focus on improving methods each year (text labels show the biggest innovations). The original assessment in 2012 focused solely on scientific methods (for example, obtaining and analysing data, developing models, calculating, and presenting results; dark shading). In 2013, by necessity we gave more focus to data science (for example, data organization and wrangling, coding, versioning, and documentation; light shading), using open data science tools. We established R as the main language for all data preparation and modelling (using RStudio), which drastically decreased the time involved to complete the assessment. In 2014, we adopted Git and GitHub for version control, project management, and collaboration. This further decreased the time required to repeat the assessment. We also created the OHI Toolbox, which includes our R package ohicore for core analytical operations used in all OHI assessments. In subsequent years we have continued (and plan to continue) this trajectory towards better science in less time by improving code with principles of tidy data [33]; standardizing file and data structure; and focusing more on communication, in part by creating websites with the same open data science tools and workflow. See text and Table 1 for more details….(More)”
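The tidy-data principle the authors adopt (one observation per row, one variable per column) can be illustrated with a small reshape. The sketch below uses Python for brevity, though the OHI project itself works in R, and the table values are invented:

```python
# A wide table: one row per region, one column per year. Convenient to
# read, but awkward to analyse, because each new year changes the schema.
wide = [
    {"region": "Pacific", "2012": 68, "2013": 70, "2014": 71},
    {"region": "Atlantic", "2012": 62, "2013": 63, "2014": 65},
]

# Tidy (long) form: one observation per row, one variable per column,
# so every year's assessment can reuse the same analysis code unchanged.
tidy = [
    {"region": row["region"], "year": int(year), "score": score}
    for row in wide
    for year, score in row.items()
    if year != "region"
]

print(tidy[0])  # {'region': 'Pacific', 'year': 2012, 'score': 68}
```

This restructuring is a large part of why annual reassessment gets cheaper: the data layout, not the code, absorbs each new year of observations.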

Open Data Barometer 2016


Open Data Barometer: “Produced by the World Wide Web Foundation as a collaborative work of the Open Data for Development (OD4D) network and with the support of the Omidyar Network, the Open Data Barometer (ODB) aims to uncover the true prevalence and impact of open data initiatives around the world. It analyses global trends, and provides comparative data on countries and regions using an in-depth methodology that combines contextual data, technical assessments and secondary indicators.

Covering 115 jurisdictions in the fourth edition, the Barometer ranks governments on:

  • Readiness for open data initiatives.
  • Implementation of open data programmes.
  • Impact that open data is having on business, politics and civil society.
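As a toy illustration only (the weights and scores below are invented, not the Barometer's published methodology), combining sub-scores on these three dimensions into a composite ranking might look like:

```python
# Invented sub-scores (0-100) on the three dimensions above; the ODB's
# actual weighting and normalisation are set out in its methodology notes.
countries = {
    "Country A": {"readiness": 90, "implementation": 75, "impact": 60},
    "Country B": {"readiness": 70, "implementation": 85, "impact": 80},
}
weights = {"readiness": 0.25, "implementation": 0.50, "impact": 0.25}  # assumed

# Weighted composite score per country, then rank highest first.
composite = {
    name: sum(scores[dim] * w for dim, w in weights.items())
    for name, scores in countries.items()
}
ranking = sorted(composite, key=composite.get, reverse=True)
print(ranking)  # ['Country B', 'Country A']
```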

After three successful editions, the fourth marks another step towards becoming a global policymaking tool with a participatory and inclusive process and a strong regional focus. This year’s Barometer includes an assessment of government performance in fulfilling the Open Data Charter principles.

The Barometer is a truly global and collaborative effort, with input from more than 100 researchers and government representatives. It takes over six months and more than 10,000 hours of research work to compile. During this process, we address more than 20,000 questions and respond to more than 5,000 comments and suggestions.

The ODB global report is a summary of some of the most striking findings. The full data and methodology are available, and are intended to support secondary research and inform better decisions for the progression of open data policies and practices across the world…(More)”.

Using Open Data to Combat Corruption


Robert Palmer at Open Data Charter: “…today we’re launching the Open Up Guide: Using Open Data to Combat Corruption. We think that with the right conditions in place, greater transparency can lead to more accountability, less corruption and better outcomes for citizens. This guide builds on the work in this area already done by the G20’s anti-corruption working group, Transparency International and the Web Foundation.

Inside the guide you’ll find a number of tools including:

  • A short overview on how open data can be used to combat corruption.
  • Use cases and methodologies. A series of case studies highlighting existing and future approaches to the use of open data in the anti-corruption field.
  • 30 priority datasets and the key attributes needed so that they can talk to each other. To address corruption networks it is particularly important that connections can be established and followed across data sets, national borders and different sectors.
  • Data standards. Standards describe what should be published, and the technical details of how it should be made available. The report includes some of the relevant standards for anti-corruption work, and highlights the areas where there are currently no standards.
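A minimal sketch of why shared key attributes matter: joining a hypothetical company register to a hypothetical contracts dataset on a common company identifier (all names and fields below are invented for illustration):

```python
# Two "priority datasets" from different sources. The shared supplier
# identifier is the key attribute that lets them talk to each other.
company_register = {
    "GB-12345": {"name": "Acme Ltd", "jurisdiction": "GB"},
    "MX-98765": {"name": "Ejemplo SA", "jurisdiction": "MX"},
}

public_contracts = [
    {"contract_id": "C-001", "supplier_id": "GB-12345", "value_usd": 2_500_000},
    {"contract_id": "C-002", "supplier_id": "MX-98765", "value_usd": 740_000},
    {"contract_id": "C-003", "supplier_id": "GB-12345", "value_usd": 310_000},
]

# Join contracts to the register on the shared identifier, then total
# contract value per supplier: the cross-dataset view that can surface
# unusual concentrations worth investigating.
totals = {}
for c in public_contracts:
    supplier = company_register.get(c["supplier_id"])
    if supplier:  # keep only contracts that resolve to a registered company
        totals[supplier["name"]] = totals.get(supplier["name"], 0) + c["value_usd"]

print(totals)  # {'Acme Ltd': 2810000, 'Ejemplo SA': 740000}
```

Without a standardised identifier published in both datasets, this join, and any trail across borders or sectors, is impossible.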

The guide has been developed by Transparency International-Mexico, Open Contracting Partnership and the Open Data Charter, building on input from government officials, open data experts, civil society and journalists. It’s been designed as a practical tool for governments who want to use open data to fight corruption. However, it’s still a work in progress and we want feedback on how to make it more useful. Please either comment directly on the Google Doc version of the guide, or email us at [email protected]….View the full guide.”

CityDash: Visualising a Changing City Using Open Data


Chapter by Christopher Pettit, Scott N. Lieske and Murad Jamal in Planning Support Science for Smarter Urban Futures: “In an increasingly urbanised world, there are pressures being placed on our cities, which planners, decision-makers, and communities need to be able to respond to. Data-driven responses and tools that can support the communication of information, and indicators on a city’s performance are becoming increasingly available and have the potential to play a critical role in understanding and managing complex urban systems. In this research, we will review international efforts in the creation of city dashboards and introduce the City of Sydney Dashboard, known as CityDash. This chapter culminates in a number of recommendations for city dashboards’ implementation. The recommendations for city dashboards include: consolidated information on a single web page, live data feeds relevant to planners and decision-makers as well as citizens’ daily lives, and site analytics as a way of evaluating user interactions and preferences….(More)”.

Dubai Data Releases Findings of ‘The Dubai Data Economic Impact Report’


Press Release: “the ‘Dubai Data Economic Impact Report’…provides the Dubai Government with insights into the potential economic impacts of opening and sharing data and includes a methodology for more rigorous measurement of the economic impacts of open and shared data, to allow regular assessment of the actual impacts in the future.

The study estimates that the opening and sharing of government and private sector data will potentially add a total of AED 10.4 billion in Gross Value Added (GVA) to Dubai’s economy annually by 2021. Opening government data alone will result in a GVA impact of AED 6.6 billion annually as of 2021, equivalent to approximately 0.8% to 1.2% of Dubai’s forecasted GDP for 2021. Transport, storage, and communications are set to be the highest contributor to this potential GVA of opening government data, with the sector breakdown as follows:

  • Transport, storage, and communications: 27.8% (AED 1.85 billion).
  • Public administration: 23.6% (AED 1.57 billion).
  • Wholesale, retail, restaurants, and hotels: 13.7% (AED 908 million).
  • Real estate: 9.6% (AED 639 million).
  • Professional services: 8.9% (AED 588 million).
  • Finance and insurance: 6.5% (AED 433 million).
  • Mining, manufacturing, and utilities: 6.0% (AED 395 million).
  • Construction: 3.5% (AED 230 million).
  • Entertainment and arts: 0.4% (AED 27 million).
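The sector figures quoted above can be sanity-checked with a few lines of arithmetic (values transcribed from the report's breakdown, in AED millions):

```python
# Sector shares and values as reported for the government open data GVA.
sectors = {
    "transport, storage, communications": (27.8, 1850),
    "public administration": (23.6, 1570),
    "wholesale, retail, restaurants, hotels": (13.7, 908),
    "real estate": (9.6, 639),
    "professional services": (8.9, 588),
    "finance and insurance": (6.5, 433),
    "mining, manufacturing, utilities": (6.0, 395),
    "construction": (3.5, 230),
    "entertainment and arts": (0.4, 27),
}

total_pct = sum(pct for pct, _ in sectors.values())
total_aed = sum(val for _, val in sectors.values())
print(f"shares sum to {total_pct:.1f}%; values sum to AED {total_aed / 1000:.2f} bn")
# shares sum to 100.0%; values sum to AED 6.64 bn, consistent with the
# report's AED 6.6 billion figure for government data alone.
```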

This economic impact will be realized through the publication, exchange, use and reuse of Dubai data. The Dubai Data Law of 2015 mandates that data providers publish open data and exchange shared data. It defines open data as any Dubai data which is published and can be downloaded, used and re-used without restrictions by all types of users, while shared data is the data that has been classified as either confidential, sensitive, or secret, and can only be accessed by other government entities or by other authorised persons. The law pertains to local government entities, federal government entities which have any data relating to the emirate, individuals and companies who produce, own, disseminate, or exchange any data relating to the emirate. It aims to realise Dubai’s vision of transforming itself into a smart city, manage Dubai Data in accordance with a clear and specific methodology that is consistent with international best practices, integrate the services provided by federal and local government entities, and optimise the use of the data available to data providers, among other objectives….

The study identifies several stakeholders involved in the use and reuse of open and shared data. These stakeholders – some of whom are qualified as “data creators” – play an important role in the process of generating the economic impacts. They include: data enrichers, who combine open data with their own sources and/or knowledge; data enablers, who do not profit directly from the data but from the platforms and technologies on which it is provided; data developers, who design and build Application Programming Interfaces (APIs); and data aggregators, who collect and pool data, providing it to other stakeholders….(More)”

Updated N.Y.P.D. Anti-Crime System to Ask: ‘How We Doing?’


It was a policing invention with a futuristic sounding name — CompStat — when the New York Police Department introduced it as a management system for fighting crime in an era of much higher violence in the 1990s. Police departments around the country, and the world, adapted its system of mapping muggings, robberies and other crimes; measuring police activity; and holding local commanders accountable.

Now, a quarter-century later, it is getting a broad reimagining and being brought into the mobile age. Moving away from simple stats and figures, CompStat is getting touchy-feely. It’s going to ask New Yorkers — via thousands of questions on their phones — “How are you feeling?” and “How are we, the police, doing?”

Whether this new approach will be mimicked elsewhere is still unknown, but as is the case with almost all new tactics in the N.Y.P.D. — the largest municipal police force in the United States by far — it will be closely watched. Nor is it clear if New Yorkers will embrace this approach, reject it as intrusive or simply be annoyed by it.

The system, using location technology, sends out short sets of questions to smartphones along three themes: Do you feel safe in your neighborhood? Do you trust the police? Are you confident in the New York Police Department?

The questions stream out every day, around the clock, on 50,000 different smartphone applications and present themselves on screens as eight-second surveys.

The department believes it will get a more diverse measure of community satisfaction, and allow it to further drive down crime. For now, Police Commissioner James P. O’Neill is calling the tool a “sentiment meter,” though he is open to suggestions for a better name….(More)”.