The hidden costs of open data


Sara Friedman at GCN: “As more local governments open their data for public use, the emphasis is often on “free” — using open source tools to freely share already-created government datasets, often with pro bono help from outside groups. But according to a new report, there are unforeseen costs when it comes to pushing government datasets out to public-facing platforms — especially when geospatial data is involved.

The research, led by University of Waterloo professor Peter A. Johnson and McGill University professor Renee Sieber, was based on work carried out as part of the Geothink.ca partnership research grant and an exploration of the direct and indirect costs of open data.

Costs related to data collection, publishing, data sharing, maintenance and updates are increasingly driving governments to third-party providers to help with hosting, standardization and analytical tools for data inspection, the researchers found. GIS implementation also has associated costs to train staff, develop standards, create valuations for geospatial data, connect data to various user communities and get feedback on challenges.

Due to these direct costs, some governments are more likely to avoid opening datasets that need complex assessment or anonymization techniques for GIS concerns. Johnson and Sieber identified four areas where the benefits of open geospatial data can generate unexpected costs.

First, open data can create a “smoke and mirrors” situation where insufficient resources are put toward deploying open data for government use. Users then experience “transaction costs” when it comes to working in specialist data formats that need additional skills, training and software to use.

Second, the level of investment and quality of open data can lead to “material benefits and social privilege” for communities that devote resources to providing more comprehensive platforms.

While there are some open source data platforms, the majority of solutions are proprietary and charged on a pro-rata basis, which can present a challenge for cities with larger, poor populations compared to smaller, wealthier cities. Issues also arise when governments try to combine their data sets, leading to increased costs to reconcile problems.

The third problem revolves around the private sector pushing for the release of data sets that can benefit their business objectives. Companies could push for the release of high-value sets, such as real-time transit data, to help with their product development goals. This can divert attention from low-value sets, such as those detailing municipal services or installations, that could have a bigger impact on residents “from a civil society perspective.”

If communities decide to release the low-value sets first, Johnson and Sieber think the focus can then be shifted to high-value sets that can help recoup the costs of developing the platforms.

Lastly, the report finds inadvertent consequences could result from tying open data resources to private-sector companies. Public-private open data partnerships could lead to infrastructure problems that prevent data from being widely shared, and help private companies in developing their bids for public services….

Johnson and Sieber encourage communities to ask the following questions before investing in open data:

  1. Who are the intended constituents for this open data?
  2. What is the purpose behind the structure for providing this data set?
  3. Does this data enable the intended users to meet their goals?
  4. How are privacy concerns addressed?
  5. Who sets the priorities for release and updates?…(More)”
