Reuse of open data in Quebec: from economic development to government transparency
Paper by Christian Boudreau: “Based on the history of open data in Quebec, this article discusses the reuse of these data by various actors within society, with the aim of securing desired economic, administrative and democratic benefits. Drawing on an analysis of government measures and community practices in the field of data reuse, the study shows that the benefits of open data appear to be inconclusive in terms of economic growth. On the other hand, their benefits seem promising from the point of view of government transparency, in that they allow various civil society actors to monitor the integrity and performance of government activities. In the age of digital data and networks, the state must be seen not only as a platform conducive to innovation, but also as a rich field of study that is closely monitored by various actors driven by political and social goals….
Although the economic benefits of open data have been inconclusive so far, governments, at least in Quebec, must not stop investing in opening up their data. In terms of transparency, the results of the study suggest that the benefits of open data are sufficiently promising to continue releasing government data, if only to support the evaluation and planning activities of public programmes and services….(More)”.
Paper by Charlotte Ducuing: “The article discusses the concept of infrastructure in the digital environment, through a study of three data sharing legal regimes: the Public Sector Information Directive (PSI Directive), the discussions on in-vehicle data governance and the freshly adopted data sharing legal regime in the Electricity Directive.
While aiming to contribute to the scholarship on data governance, the article deliberately focuses on network industries. Characterised by the existence of physical infrastructure, they have a special relationship to digitisation and ‘platformisation’ and are exposed to specific risks. Adopting an explanatory methodology, the article shows that these regimes rest on two close but distinct sources of inspiration, which remain intertwined and insufficiently articulated. By targeting entities deemed ‘monopolist’ with regard to the data they create and hold, data sharing obligations are inspired by competition law, and especially the essential facility doctrine. On the other hand, beneficiaries appear to include both operators in related markets needing data to conduct their business (except for the PSI Directive), and third parties at large to foster innovation. The latter rationale illustrates what is called here a purposive view of data as infrastructure. The underlying understanding of ‘raw’ data (management) as infrastructure for all to use may run counter to the ability of the regulated entities to obtain fair remuneration for ‘their’ data.
Finally, the article pleads for more granularity when mandating data sharing obligations depending upon the purpose. Shifting away from a ‘one-size-fits-all’ solution, the regulation of data could also extend to the ensuing context-specific data governance regime, subject to further research…(More)”.
Ruoxi Jia at Berkeley Artificial Intelligence Research (BAIR): “People give massive amounts of their personal data to companies every day, and these data are used to generate tremendous business value. Some economists and politicians argue that people should be paid for their contributions—but the million-dollar question is: by how much?
This article discusses methods proposed in our recent AISTATS and VLDB papers that attempt to answer this question in the machine learning context. This is joint work with David Dao, Boxin Wang, Frances Ann Hubis, Nezihe Merve Gurel, Nick Hynes, Bo Li, Ce Zhang, Costas J. Spanos, and Dawn Song, as well as a collaborative effort between UC Berkeley, ETH Zurich, and UIUC. More information about the work in our group can be found here.
What are the existing approaches to data valuation?
Various ad-hoc data valuation schemes have been studied in the literature and some of them have been deployed in the existing data marketplaces. From a practitioner’s point of view, they can be grouped into three categories:
- Query-based pricing attaches values to user-initiated queries. One simple example is to set the price based on the number of queries allowed during a time window. Other more sophisticated examples attempt to adjust the price to some specific criteria, such as arbitrage avoidance.
- Data attribute-based pricing constructs a price model that takes into account various parameters, such as data age, credibility, potential benefits, etc. The model is trained to match market prices released in public registries.
- Auction-based pricing designs auctions that dynamically set the price based on bids offered by buyers and sellers.
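To make the first category concrete, here is a minimal sketch of query-based pricing with a simple arbitrage-avoidance property: the price is a concave (square-root) function of the number of rows a query touches, so splitting a query into pieces never makes it cheaper than buying it whole. The function name, rate, and row-count proxy are illustrative assumptions, not a scheme from any particular marketplace.

```python
import math

def query_price(rows_accessed: int, base_rate: float = 0.10) -> float:
    """Price a query by the number of rows it touches.

    A concave price curve is subadditive: for any split a + b = n,
    sqrt(n) <= sqrt(a) + sqrt(b), so buying two sub-queries separately
    never costs less than buying the combined query. This removes the
    simplest form of arbitrage a buyer could attempt.
    """
    return base_rate * math.sqrt(rows_accessed)

# A combined query is never cheaper to assemble piecewise:
combined = query_price(400)                     # one query over 400 rows
piecewise = query_price(100) + query_price(300) # same rows, bought in two parts
assert combined <= piecewise
```

Real marketplaces layer more structure on top (per-query-type rates, time windows, buyer-specific discounts), but the subadditivity constraint shown here is the core of arbitrage avoidance.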
However, existing data valuation schemes do not take into account the following important desiderata:
- Task-specificity: The value of data depends on the task it helps to fulfill. For instance, if Alice’s medical record indicates that she has disease A, then her data will be more useful for predicting disease A than for predicting other diseases.
- Fairness: The quality of data from different sources varies dramatically. In the worst-case scenario, adversarial data sources may even degrade model performance via data poisoning attacks. Hence, the data value should reflect the efficacy of data by assigning high values to data which can notably improve the model’s performance.
- Efficiency: Practical machine learning tasks may involve thousands or billions of data contributors; thus, data valuation techniques should be capable of scaling up.
With the desiderata above, we now discuss a principled notion of data value and computationally efficient algorithms for data valuation….(More)”.
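As a rough illustration of what such a principled notion looks like, the sketch below estimates each data point’s Shapley value by plain Monte Carlo permutation sampling: a point’s value is its average marginal contribution to the utility over random orderings. The papers use a trained model’s validation performance as the utility and far more efficient estimators; the toy label-coverage utility and parameters here are assumptions for demonstration only.

```python
import random

def shapley_values(n, utility, rounds=500, seed=0):
    """Monte Carlo estimate of the Shapley value of each of n data points.

    For each random permutation, a point's marginal contribution is the
    utility gain from adding it after the points that precede it in the
    permutation; averaging these gains over many permutations converges
    to the Shapley value.
    """
    rng = random.Random(seed)
    values = [0.0] * n
    for _ in range(rounds):
        order = rng.sample(range(n), n)  # a random permutation of the points
        subset, prev = [], utility([])
        for i in order:
            subset.append(i)
            u = utility(subset)
            values[i] += u - prev
            prev = u
    return [v / rounds for v in values]

# Toy utility: how many distinct labels the subset covers.
labels = [0, 0, 1]
covers = lambda s: len({labels[i] for i in s})
vals = shapley_values(len(labels), covers)
# Point 2 holds the only example of label 1, so it earns full credit for
# that label; points 0 and 1 split the credit for label 0 between them.
```

This naive estimator already satisfies task-specificity (value is defined relative to a utility) and fairness (redundant points split credit); the efficiency desideratum is what motivates the approximation algorithms developed in the papers.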
Article by Joseph E. Stiglitz, Todd N. Tucker, and Gabriel Zucman at Foreign Affairs: “For millennia, markets have not flourished without the help of the state. Without regulations and government support, the nineteenth-century English cloth-makers and Portuguese winemakers whom the economist David Ricardo made famous in his theory of comparative advantage would have never attained the scale necessary to drive international trade. Most economists rightly emphasize the role of the state in providing public goods and correcting market failures, but they often neglect the history of how markets came into being in the first place. The invisible hand of the market depended on the heavier hand of the state.
The state requires something simple to perform its multiple roles: revenue. It takes money to build roads and ports, to provide education for the young and health care for the sick, to finance the basic research that is the wellspring of all progress, and to staff the bureaucracies that keep societies and economies in motion. No successful market can survive without the underpinnings of a strong, functioning state.
That simple truth is being forgotten today. In the United States, total tax revenues paid to all levels of government shrank by close to four percent of national income over the last two decades, from about 32 percent in 1999 to approximately 28 percent today, a decline unique in modern history among wealthy nations. The direct consequences of this shift are clear: crumbling infrastructure, a slowing pace of innovation, a diminishing rate of growth, booming inequality, shorter life expectancy, and a sense of despair among large parts of the population. These consequences add up to something much larger: a threat to the sustainability of democracy and the global market economy….(More)”.
Paper by Rainer Kattel, Wolfgang Drechsler and Erkki Karo: “In this paper, we offer to redefine what entrepreneurial states are: these are states that are capable of unleashing innovations, and wealth resulting from those innovations, and of maintaining socio-political stability at the same time. Innovation bureaucracies are constellations of public organisations that deliver such agile stability. Such balancing acts make public bureaucracies unique in how they work, succeed and fail. The paper looks at the historical evolution of innovation bureaucracy by focusing on public organisations dealing with knowledge and technology, economic development and growth. We briefly show how agility and stability are delivered through starkly different bureaucratic organisations; hence, what matters for capacity and capabilities are not individual organisations, but organisational configurations and how they evolve….(More)”.
The Economist: “Faster, cheaper, better—technology is one field many people rely upon to offer a vision of a brighter future. But as the 2020s dawn, optimism is in short supply. The new technologies that dominated the past decade seem to be making things worse. Social media were supposed to bring people together. In the Arab spring of 2011 they were hailed as a liberating force. Today they are better known for invading privacy, spreading propaganda and undermining democracy. E-commerce, ride-hailing and the gig economy may be convenient, but they are charged with underpaying workers, exacerbating inequality and clogging the streets with vehicles. Parents worry that smartphones have turned their children into screen-addicted zombies.
The technologies expected to dominate the new decade also seem to cast a dark shadow. Artificial intelligence (AI) may well entrench bias and prejudice, threaten your job and shore up authoritarian rulers (see article). 5G is at the heart of the Sino-American trade war. Autonomous cars still do not work, but manage to kill people all the same. Polls show that internet firms are now less trusted than the banking industry. At the very moment banks are striving to rebrand themselves as tech firms, internet giants have become the new banks, morphing from talent magnets to pariahs. Even their employees are in revolt.
The New York Times sums up the encroaching gloom. “A mood of pessimism”, it writes, has displaced “the idea of inevitable progress born in the scientific and industrial revolutions.” Except those words are from an article published in 1979. Back then the paper fretted that the anxiety was “fed by growing doubts about society’s ability to rein in the seemingly runaway forces of technology”.
Today’s gloomy mood is centred on smartphones and social media, which took off a decade ago. Yet concerns that humanity has taken a technological wrong turn, or that particular technologies might be doing more harm than good, have arisen before. In the 1970s the despondency was prompted by concerns about overpopulation, environmental damage and the prospect of nuclear immolation. The 1920s witnessed a backlash against cars, which had earlier been seen as a miraculous answer to the affliction of horse-drawn vehicles—which filled the streets with noise and dung, and caused congestion and accidents. And the blight of industrialisation was decried in the 19th century by Luddites, Romantics and socialists, who worried (with good reason) about the displacement of skilled artisans, the despoiling of the countryside and the suffering of factory hands toiling in smoke-belching mills….(More)”.
Compendium developed by Andrew Reamer: “The E.M. Kauffman Foundation has asked the George Washington Institute of Public Policy (GWIPP) to prepare a compendium of federal sources of data on self-employment, entrepreneurship, and small business development. The Foundation believes that the availability of useful, reliable federal data on these topics would enable robust descriptions and explanations of entrepreneurship trends in the United States and so help guide the development of effective entrepreneurship policies.
Achieving these ends first requires the identification and detailed description of available federal datasets, as provided in this compendium. Its contents include:
- An overview and discussion of 18 datasets from four federal agencies, organized by two categories and five subcategories.
- Tables providing information on each dataset, including:
  - scope of coverage of the self-employed, entrepreneurs, and businesses;
  - data collection methods (nature of data source, periodicity, sampling frame, sample size);
  - dataset variables (owner characteristics, business characteristics and operations, geographic areas);
  - data release schedule; and
  - data access by format (including fixed tables, interactive tools, API, FTP download, public use microdata samples [PUMS], and confidential microdata).
- For each dataset, examples of studies, if any, that use the data source to describe and explain trends in entrepreneurship.
The author’s aim is for the compendium to facilitate an assessment of the strengths and weaknesses of currently available federal datasets, discussion about how data availability and value can be improved, and implementation of desired improvements…(More)”
Ulises Ali Mejias at AlJazeera: “The recent coup in Bolivia reminds us that poor countries rich in resources continue to be plagued by the legacy of colonialism. Anything that stands in the way of a foreign corporation’s ability to extract cheap resources must be removed.
Today, apart from minerals and fossil fuels, corporations are after another precious resource: personal data. As with natural resources, data too have become the target of extractive corporate practices.
As sociologist Nick Couldry and I argue in our book, The Costs of Connection: How Data is Colonizing Human Life and Appropriating It for Capitalism, there is a new form of colonialism emerging in the world: data colonialism. By this, we mean a new resource-grab whereby human life itself has become a direct input into economic production in the form of extracted data.
We acknowledge that this term is controversial, given the extreme physical violence and structures of racism that historical colonialism employed. However, our point is not to say that data colonialism is the same as historical colonialism, but rather to suggest that it shares the same core function: extraction, exploitation, and dispossession.
Like classical colonialism, data colonialism violently reconfigures human relations to economic production. Things like land, water, and other natural resources were valued by native people in the precolonial era, but not in the same way that colonisers (and later, capitalists) came to value them: as private property. Likewise, we are experiencing a situation in which things that were once primarily outside the economic realm – things like our most intimate social interactions with friends and family, or our medical records – have now been commodified and made part of an economic cycle of data extraction that benefits a few corporations.
So what could countries in the Global South do to avoid the dangers of data colonialism?…(More)”.
Paper by Bart Cammaerts and Robin Mansell: “This article considers challenges to policy and regulation presented by the dominant digital platforms. A radical democratic framing of the deliberative process is developed to acknowledge the full complexity of power relations that are in play in policy and regulatory debates and this view is contrasted with a liberal democratic perspective.
We show how these different framings have informed historical and contemporary approaches to the challenges presented by conflicting interests in economic value and a range of public values in the context of media content, communication infrastructure and digital platform policy and regulation. We argue for an agonistic approach to digital platform policy and regulatory debate so as to encourage a denaturalization of the prevailing logics of commercial datafication. We offer some suggestions about how such a generative discourse might be encouraged in such a way that it starts to yield a new common sense about the further development of digital platforms; one that might favor a digital ecology better attuned to consumer and citizen interests in democratic societies….(More)”.
Jan Michael Nolin at the Journal of Information, Communication and Ethics in Society: “Principled discussions on the economic value of data are frequently pursued through metaphors. This study aims to explore three influential metaphors for talking about the economic value of data: data are the new oil, data as infrastructure and data as an asset.
With the help of conceptual metaphor theory, various meanings surrounding the three metaphors are explored. Meanings clarified or hidden through various metaphors are identified. Specific emphasis is placed on the economic value of ownership of data.
In discussions on data as economic resource, the three different metaphors are used for separate purposes. The most used metaphor, data are the new oil, communicates that ownership of data could lead to great wealth. However, with data as infrastructure data have no intrinsic value. Therefore, profits generated from data resources belong to those processing the data, not those owning it. The data as an asset metaphor can be used to convince organizational leadership that they own data of great value….(More)”.