Our path to better science in less time using open data science tools


Julia S. Stewart Lowndes et al in Nature: “Reproducibility has long been a tenet of science but has been challenging to achieve—we learned this the hard way when our old approaches proved inadequate to efficiently reproduce our own work. Here we describe how several free software tools have fundamentally upgraded our approach to collaborative research, making our entire workflow more transparent and streamlined. By describing specific tools and how we incrementally began using them for the Ocean Health Index project, we hope to encourage others in the scientific community to do the same—so we can all produce better science in less time.

Figure 1: Better science in less time, illustrated by the Ocean Health Index project.

Every year since 2012 we have repeated Ocean Health Index (OHI) methods to track change in global ocean health [36, 37]. Increased reproducibility and collaboration has reduced the amount of time required to repeat methods (size of bubbles) with updated data annually, allowing us to focus on improving methods each year (text labels show the biggest innovations). The original assessment in 2012 focused solely on scientific methods (for example, obtaining and analysing data, developing models, calculating, and presenting results; dark shading). In 2013, by necessity we gave more focus to data science (for example, data organization and wrangling, coding, versioning, and documentation; light shading), using open data science tools. We established R as the main language for all data preparation and modelling (using RStudio), which drastically decreased the time involved to complete the assessment. In 2014, we adopted Git and GitHub for version control, project management, and collaboration. This further decreased the time required to repeat the assessment. We also created the OHI Toolbox, which includes our R package ohicore for core analytical operations used in all OHI assessments. In subsequent years we have continued (and plan to continue) this trajectory towards better science in less time by improving code with principles of tidy data [33]; standardizing file and data structure; and focusing more on communication, in part by creating websites with the same open data science tools and workflow. See text and Table 1 for more details….(More)”

Open Data Barometer 2016


Open Data Barometer: “Produced by the World Wide Web Foundation as a collaborative work of the Open Data for Development (OD4D) network and with the support of the Omidyar Network, the Open Data Barometer (ODB) aims to uncover the true prevalence and impact of open data initiatives around the world. It analyses global trends, and provides comparative data on countries and regions using an in-depth methodology that combines contextual data, technical assessments and secondary indicators.

Covering 115 jurisdictions in the fourth edition, the Barometer ranks governments on:

  • Readiness for open data initiatives.
  • Implementation of open data programmes.
  • Impact that open data is having on business, politics and civil society.

After three successful editions, the fourth marks another step towards becoming a global policymaking tool with a participatory and inclusive process and a strong regional focus. This year’s Barometer includes an assessment of government performance in fulfilling the Open Data Charter principles.

The Barometer is a truly global and collaborative effort, with input from more than 100 researchers and government representatives. It takes over six months and more than 10,000 hours of research work to compile. During this process, we address more than 20,000 questions and respond to more than 5,000 comments and suggestions.

The ODB global report is a summary of some of the most striking findings. The full data and methodology is available, and is intended to support secondary research and inform better decisions for the progression of open data policies and practices across the world…(More)”.

Using Open Data to Combat Corruption


Robert Palmer at Open Data Charter: “…today we’re launching the Open Up Guide: Using Open Data to Combat Corruption. We think that with the right conditions in place, greater transparency can lead to more accountability, less corruption and better outcomes for citizens. This guide builds on the work in this area already done by the G20’s anti-corruption working group, Transparency International and the Web Foundation.

Inside the guide you’ll find a number of tools including:

  • A short overview on how open data can be used to combat corruption.
  • Use cases and methodologies. A series of case studies highlighting existing and future approaches to the use of open data in the anti-corruption field.
  • 30 priority datasets and the key attributes needed so that they can talk to each other. To address corruption networks it is particularly important that connections can be established and followed across data sets, national borders and different sectors.
  • Data standards. Standards describe what should be published, and the technical details of how it should be made available. The report includes some of the relevant standards for anti-corruption work, and highlights the areas where there are currently no standards.

The guide has been developed by Transparency International-Mexico, Open Contracting Partnership and the Open Data Charter, building on input from government officials, open data experts, civil society and journalists. It’s been designed as a practical tool for governments who want to use open data to fight corruption. However, it’s still a work in progress and we want feedback on how to make it more useful. Please either comment directly on the Google Doc version of the guide, or email us at info@opendatacharter.net….View the full guide.”

CityDash: Visualising a Changing City Using Open Data


Chapter by Christopher Pettit, Scott N. Lieske and Murad Jamal in Planning Support Science for Smarter Urban Futures: “In an increasingly urbanised world, there are pressures being placed on our cities, which planners, decision-makers, and communities need to be able to respond to. Data driven responses and tools that can support the communication of information, and indicators on a city’s performance are becoming increasingly available and have the potential to play a critical role in understanding and managing complex urban systems. In this research, we will review international efforts in the creation of city dashboards and introduce the City of Sydney Dashboard, known as CityDash. This chapter culminates in a number of recommendations for city dashboards’ implementation. The recommendations for city dashboards include: consolidated information on a single web page, live data feeds relevant to planners and decision-makers as well as citizens’ daily lives, and site analytics as a way of evaluating user interactions and preferences….(More)”.

Dubai Data Releases Findings of ‘The Dubai Data Economic Impact Report’


Press Release: “the ‘Dubai Data Economic Impact Report’…provides the Dubai Government with insights into the potential economic impacts of opening and sharing data and includes a methodology for more rigorous measurement of the economic impacts of open and shared data, to allow regular assessment of the actual impacts in the future.

The study estimates that the opening and sharing of government and private sector data will potentially add a total of 10.4 billion AED Gross Value Added (GVA) impact to Dubai’s economy annually by 2021. Opening government data alone will result in a GVA impact of 6.6 billion AED annually as of 2021. This is equivalent to approximately 0.8% to 1.2% of Dubai’s forecasted GDP for 2021. Transport, storage, and communications are set to be the highest contributor to this potential GVA of opening government data, accounting for 27.8% (or AED 1.85 bn) of the total amount, followed by public administration (23.6% or AED 1.57 bn); wholesale, retail, restaurants, and hotels (13.7% or AED 908 million); real estate (9.6% or AED 639 million); and professional services (8.9% or AED 588 million). Finance and insurance, meanwhile, is calculated to make up 6.5% (AED 433 million) of the GVA, while mining, manufacturing, and utilities (6% or AED 395 million); construction (3.5% or AED 230 million); and entertainment and arts (0.4% or AED 27 million) account for the remaining proportion.
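As a quick consistency check on the sector breakdown quoted above, the figures can be tabulated and summed. The following is a minimal sketch, not part of the report: it assumes the rounded values given in the press release (with AED 1.85 bn and AED 1.57 bn expressed as 1,850 and 1,570 million), and the names `SECTORS` and `summarize` are hypothetical, introduced only for illustration.

```python
# Sector figures as quoted in the press release: (AED millions, share in %).
# Values are the rounded numbers from the text, not the report's raw data.
SECTORS = {
    "Transport, storage, and communications": (1850, 27.8),
    "Public administration": (1570, 23.6),
    "Wholesale, retail, restaurants, and hotels": (908, 13.7),
    "Real estate": (639, 9.6),
    "Professional services": (588, 8.9),
    "Finance and insurance": (433, 6.5),
    "Mining, manufacturing, and utilities": (395, 6.0),
    "Construction": (230, 3.5),
    "Entertainment and arts": (27, 0.4),
}


def summarize(sectors):
    """Return (total AED millions, total share in %, largest share mismatch)."""
    total_aed_mn = sum(aed for aed, _ in sectors.values())
    total_share = sum(share for _, share in sectors.values())
    # Largest gap between a quoted share and the share implied by the AED figures.
    max_gap = max(
        abs(100 * aed / total_aed_mn - share) for aed, share in sectors.values()
    )
    return total_aed_mn, total_share, max_gap


if __name__ == "__main__":
    total_aed_mn, total_share, max_gap = summarize(SECTORS)
    print(f"Total GVA from sectors: AED {total_aed_mn / 1000:.2f} bn")
    print(f"Sum of quoted shares: {total_share:.1f}%")
    print(f"Largest share mismatch: {max_gap:.2f} percentage points")
```

Under these assumptions the sector amounts add up to roughly the 6.6 billion AED headline figure for government data, and the quoted percentages agree with the amounts to within rounding.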

This economic impact will be realized through the publication, exchange, use and reuse of Dubai data. The Dubai Data Law of 2015 mandates that data providers publish open data and exchange shared data. It defines open data as any Dubai data which is published and can be downloaded, used and re-used without restrictions by all types of users, while shared data is the data that has been classified as either confidential, sensitive, or secret, and can only be accessed by other government entities or by other authorised persons. The law pertains to local government entities, federal government entities which have any data relating to the emirate, individuals and companies who produce, own, disseminate, or exchange any data relating to the emirate. It aims to realise Dubai’s vision of transforming itself into a smart city, manage Dubai Data in accordance with a clear and specific methodology that is consistent with international best practices, integrate the services provided by federal and local government entities, and optimise the use of the data available to data providers, among other objectives….

The study identifies several stakeholders involved in the use and reuse of open and shared data. These stakeholders – some of whom are qualified as “data creators” – play an important role in the process of generating the economic impacts. They include: data enrichers, who combine open data with their own sources and/or knowledge; data enablers, who do not profit directly from the data, but do so via the platforms and technologies they are provided on; data developers, who design and build Application Programming Interfaces (APIs); and data aggregators, who collect and pool data, providing it to other stakeholders….(More)”

Updated N.Y.P.D. Anti-Crime System to Ask: ‘How We Doing?’


It was a policing invention with a futuristic sounding name — CompStat — when the New York Police Department introduced it as a management system for fighting crime in an era of much higher violence in the 1990s. Police departments around the country, and the world, adapted its system of mapping muggings, robberies and other crimes; measuring police activity; and holding local commanders accountable.

Now, a quarter-century later, it is getting a broad reimagining and being brought into the mobile age. Moving away from simple stats and figures, CompStat is getting touchy-feely. It’s going to ask New Yorkers — via thousands of questions on their phones — “How are you feeling?” and “How are we, the police, doing?”

Whether this new approach will be mimicked elsewhere is still unknown, but as is the case with almost all new tactics in the N.Y.P.D. — the largest municipal police force in the United States by far — it will be closely watched. Nor is it clear if New Yorkers will embrace this approach, reject it as intrusive or simply be annoyed by it.

The system, using location technology, sends out short sets of questions to smartphones along three themes: Do you feel safe in your neighborhood? Do you trust the police? Are you confident in the New York Police Department?

The questions stream out every day, around the clock, on 50,000 different smartphone applications and present themselves on screens as eight-second surveys.

The department believes it will get a more diverse measure of community satisfaction, and allow it to further drive down crime. For now, Police Commissioner James P. O’Neill is calling the tool a “sentiment meter,” though he is open to suggestions for a better name….(More)”.

What data do we want? Understanding demands for open data among civil society organisations in South Africa


Report by Kaliati, Andrew; Kachieng’a, Paskaliah and de Lanerolle, Indra: “Many governments, international agencies and civil society organisations (CSOs) support and promote open data. Most open government data initiatives have focused on supply – creating portals and publishing information. But much less attention has been given to demand – understanding data needs and nurturing engagement. This research examines the demand for open data in South Africa, and asks under what conditions meeting this demand might influence accountability. Recognising that not all open data projects are developed for accountability reasons, it also examines barriers to using government data for accountability processes. The research team identified and tested ‘use stories’ and ‘use cases’. How did a range of civil society groups with an established interest in holding local government accountable use – or feel that they could use – data in their work? The report identifies and highlights ten broad types of open data use, which they divided into two streams: ‘strategy and planning’ – in which CSOs used government data internally to guide their own actions; and ‘monitoring, mobilising and advocacy’ – in which CSOs undertake outward-facing activities….(More)”

Open data and the war on hunger – a challenge to be met


Diginomica: “Although the private sector is seen as the villain of the piece in some quarters, it actually has a substantial role to play in helping solve the problem of world hunger.

This is the view of Andre Laperriere, executive director of the Global Open Data for Agriculture and Nutrition (Godan) initiative, …

Laperriere himself heads up Godan’s small secretariat of five full-time equivalent employees who are based in Oxfordshire in the UK. The goal of the organisation, which currently has 511 members, is to encourage governmental, non-governmental (NGO) and private sector organisations to share open data about agriculture and nutrition. The idea is to make such information more available, accessible and usable in order to help tackle world food security in the face of mounting threats such as climate change.

But to do so, it is necessary to bring the three key actors originally identified by James Wolfensohn, former president of the World Bank, into play, believes Laperriere. He explains:

You have states, which generate and possess much of the data. There are citizens with lots of specific needs for which the data can be used, and there’s the private sector in between. It’s in the best position to exploit the data and use it to develop products that help meet the needs of the population. So the private sector is the motor of development and has a big role to play.

This is not least because NGOs, cooperatives and civil societies of all kinds often simply do not have the resources or technical knowledge to either find or deal with the massive quantities of open data that is released. Laperriere explains:

It’s a moral dilemma for a lot of research organisations. If, for example, they release 8,000 data sets about every kind of cattle disease, they’re doing so for the benefit of small farmers. But the only ones that can often do anything with it are the big companies as they have the appropriate skills. So the goal is the little guy rather than the big companies, but the alternative is not to release anything at all.

But for private sector businesses to truly get the most out of this open data as it is made available, Laperriere advocates getting together to create so-called pre-competition spaces. These spaces involve competitors collaborating in the early stages of commercial product development to solve common problems. To illustrate how such activity works, Laperriere cites his own past experience when working for a lighting company:

We were pushing fluorescent rather than incandescent lighting, but it contains mercury which pollutes, although it has a lower carbon footprint. It was also a lot more expensive. But we sat down together with the other manufacturers and shared our data to fix the problem together, which meant that everyone benefited by reducing the cost, the mercury pollution and the amount of energy consumed.

Next revolution

While Laperriere understands the fear of many organisations in potentially making themselves vulnerable to competition by disclosing their data, in reality, he attests, “it is not the case”. Instead he points out:

If you release data in the right way to stimulate collaboration, it is positive economically and benefits both consumers and companies too as it helps reduce their costs and minimise other problems.

Due to growing amounts of government legislation and policies that require processed food manufacturers around the world to disclose product ingredients, he is, in fact, seeing rising interest in the approach not only among the manufacturers themselves but also among packaging and food preservation companies. The fact that agriculture and nutrition is a vast, complex area does mean there is still a long way to go, however….(More)”

Community-based app gets Londoners walking


Springwise: “Apps that measure a user’s exercise have been 10-a-penny for some years, but Go Jauntly is set to offer something brand new and leans much more into crowdsourcing than its rivals. Launched by a new start-up of nature-loving digital experts, and co-developed with Transport for London, Go Jauntly is a community-based initiative that’s as much about exploration and sharing with fellow jaunt-lovers. It also had £10,000 backing from the Ordnance Survey’s Geovation fund that helps start-ups using geo-based technology. Big players are involved.

It’s directly tapped into TFL’s dynamic open data, and keeps users informed of everything from congestion to pollution. According to statistics, some 3.6 million journeys a day are made in London using cars and public transport, all of which could have been walked.

“We’re hoping that with Go Jauntly we’re creating technology for good that has a positive impact on society from a health, wellness and environmental perspective,” explains Hana Sutch, CEO and co-founder. “We wanted to start something that would get people out of the house and more active. Our team at Go Jauntly are all nature-loving city dwellers who spend too much of our time deskbound and wanted to be a bit more active.”

Go Jauntly is available now on the App Store with a variety of walks including Richmond and Regent’s Parks, plus a selection of South East London’s cemeteries. This isn’t just a London-centric innovation; anyone in the UK can download it, walk-the-walk, and share their jaunt. The company is hoping to get an Android version out by the end of the year.

Other apps that encourage walking include Norway’s Traffic Agent, and the UK’s Walkability was also designed to get users on the hoof….(More)”

Citizen-generated data in the information ecosystem


Ssanyu Rebecca at Making All Voices Count: “The call for a data revolution by the UN Secretary General’s High Level Panel in the run up to Agenda 2030 stimulated debate and action in terms of innovative ways of generating and sharing data.

Since then, technological advances have supported increased access to data and information through initiatives such as open data platforms and SMS-based citizen reporting systems. The main driving force for these advances is for data to be timely and usable in decision-making. Among the several actors in the data field are the proponents of citizen-generated data (CGD) who assert its potential in the context of the sustainable development agenda.

Nevertheless, there is need for more evidence on the potential of CGD in influencing policy and service delivery, and contributing to the achievement of the sustainable development goals. Our study on Citizen-generated data in the information ecosystem: exploring links for sustainable development sought to obtain answers. Using case studies on the use of CGD in two different scenarios in Uganda and Kenya, Development Research and Training (DRT) and Development Initiatives (DI) collaborated to carry out this one-year study.

In Uganda, we focused on a process of providing unsolicited citizen feedback to duty-bearers and service providers in communities. This was based on the work of Community Resource Trackers, a group of volunteers supported by DRT in five post-conflict districts (Gulu, Kitgum, Pader, Katakwi and Kotido) to identify and track community resources and provide feedback on their use. These included financial and in-kind resources, allocated through central and local government, NGOs and donors.

In Kenya, we focused on a formalised process of CGD involving the Ministry of Education and National Taxpayers Association. The School Report Card (SRC) is an effort to increase parental participation in schooling. SRC is a scorecard for parents to assess the performance of their school each year in ten areas relating to the quality of education.

What were the findings?

The two processes provided insights into the changes CGD influences in the areas of accountability, resource allocation, service delivery and government response.

Both cases demonstrated the relevance of CGD in improving service delivery. They showed that the uptake of CGD and response by government depends significantly on the quality of relationships that CGD producers create with government, and whether the initiatives relate to existing policy priorities and interests of government.

The study revealed important effects on improving citizen behaviours. Community members who participated in CGD processes understood their role as citizens and participated fully in development processes, with strong skills, knowledge and confidence.

The Kenya case study revealed that CGD can influence policy change if it is generated and used at large scale, and in direct linkage with a specific sector; but it also revealed that this is difficult to measure.

In Uganda we observed distinct improvements in service delivery and accessibility at the local level – which was the motivation for engaging in CGD in the first instance….(More) (Full Report)”.