The social value of data

Working paper by Diane Coyle and Annabel Manley: “Data sets, and the inferences made from them, are generating an increasing amount of value in modern economies. However, this value is typically not well captured in GDP, and in general, the absence of markets for data assets means there is no easy approach to measuring the value of data. Yet given the potential value that can be created from investing in data and making it available, this oversight could lead to underinvestment or too little access to data.

Data has certain economic characteristics that make market-based methods of determining value insufficient for understanding its true potential value to society.

First is its non-rival nature, in that one person or company’s use of a dataset does not affect whether another person or company can also use it.

Second is that datasets often involve externalities. For example, information externalities mean that the presence of one data point will increase the value of all other data points in the dataset. Conversely, loss of privacy would be a negative externality. Therefore, the potential to link two datasets creates complications for valuations as the combined dataset will have a value possibly greater than the sum of its parts. These characteristics mean that private markets will not deliver economically efficient social availability of data, and that market prices will not reflect social value.

The experiment

In our new working paper we test one potential method of determining the social value of a dataset: discrete choice analysis.

Discrete choice analysis is a type of ‘contingent valuation’ method used to elicit individuals’ willingness to pay, a measure of consumer surplus. The method we tested is frequently used in marketing research for pricing strategies, and so there are a number of software tools that will automate the survey design and analysis (we used …). More recently, contingent methods have also been used to value ‘free’ digital goods, and in a pilot study by the ONS to value its own datasets…”
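
To make the willingness-to-pay idea concrete, here is a minimal Python sketch of how a discrete choice (logit) model is commonly turned into a monetary value. It is not the authors' method or tool: the attribute names, coefficient values, and choice set below are all hypothetical and chosen purely for illustration. The sketch assumes respondents pick among dataset options described by a price and two features, and that a linear utility model has already been fitted.

```python
import math

# Hypothetical coefficients from a fitted conditional logit model.
# All values are illustrative assumptions, not estimates from the paper.
COEFS = {
    "price": -0.04,          # utility change per extra unit of price
    "weekly_updates": 0.60,  # utility of weekly rather than yearly updates
    "local_detail": 0.90,    # utility of local rather than national detail
}


def utility(option):
    """Linear utility of one option: sum of coefficient * attribute level."""
    return sum(COEFS[name] * level for name, level in option.items())


def choice_probabilities(options):
    """Logit (softmax) probability that each option in a choice set is picked."""
    utilities = [utility(o) for o in options]
    top = max(utilities)  # subtract the max for numerical stability
    weights = [math.exp(u - top) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def willingness_to_pay(attribute):
    """Implied WTP: attribute coefficient divided by minus the price coefficient."""
    return -COEFS[attribute] / COEFS["price"]


# A hypothetical choice set shown to one respondent.
choice_set = [
    {"price": 0, "weekly_updates": 0, "local_detail": 0},
    {"price": 10, "weekly_updates": 1, "local_detail": 0},
    {"price": 20, "weekly_updates": 1, "local_detail": 1},
]
probs = choice_probabilities(choice_set)
wtp = {name: willingness_to_pay(name) for name in COEFS if name != "price"}
```

The ratio in `willingness_to_pay` is the price change that would exactly offset the utility gained from a feature, which is why a more negative price coefficient implies a lower willingness to pay for the same feature. In a real study the coefficients would be estimated from survey responses rather than set by hand.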