Becoming a data steward


Shalini Kurapati at the LSE Impact Blog: “In the context of higher education, data stewards are the first point of reference for all data-related questions. In my role as a data steward at TU Delft, I was able to advise, support and train researchers on various aspects of data management throughout the life cycle of a research project, from initial planning to post-publication. This included storing, managing and sharing research outputs such as data, images, models and code.

Data stewards also advise researchers on the ethical, policy and legal considerations during data collection, processing and dissemination. In a way, they are general practitioners for research data management and can usually solve most problems faced by academics. In cases that require specialist intervention, they also serve as a key point for referral (e.g., IT, patent or legal experts).

Data stewardship is often organised centrally through the university library. (Subject) data librarians, research data consultants and research data officers usually perform similar roles to data stewards. However, TU Delft operates a decentralised model, where data stewards are placed within faculties as disciplinary experts with research experience. This allows data stewards to provide discipline-specific support to researchers, which is particularly beneficial, as the concept of what data is itself varies across disciplines….(More)”.

New Zealand launches draft algorithm charter for government agencies


Mia Hunt at Global Government Forum: “The New Zealand government has launched a draft ‘algorithm charter’ that sets out how agencies should analyse data in a way that is fair, ethical and transparent.

The charter, which is open for public consultation, sets out 10 points that agencies would have to adhere to. These include pledging to explain how significant decisions are informed by algorithms or, where they cannot – for national security reasons, for example – explain the reason; taking into account the perspectives of communities, such as LGBTQI+ people, Pacific Islanders and people with disabilities; and identifying and consulting with groups or stakeholders with an interest in algorithm development.

Agencies would also have to publish information about how data is collected and stored; use tools and processes to ensure that privacy, ethics, and human rights considerations are integrated as part of algorithm development and procurement; and periodically assess decisions made by algorithms for unintended bias.

They would commit to implementing a “robust” peer-review process, and have to explain clearly who is responsible for automated decisions and what methods exist for challenge or appeal “via a human”….

The charter – which fits on a single page, and is designed to be simple and easily understood – explains that algorithms are a “fundamental element” of data analytics, which supports public services and delivers “new, innovative and well-targeted” policy aims.

The charter begins: “In a world where technology is moving rapidly, and artificial intelligence is on the rise, it’s essential that government has the right safeguards in place when it uses public data for decision-making. The government must ensure that data ethics are embedded in its work, and always keep in mind the people and communities being served by these tools.”

It says Stats NZ, the country’s official data agency, is “committed to transparent and accountable use of operational algorithms and other advanced data analytics techniques that inform decisions significantly impacting on individuals or groups”….(More)”.

Breaking Down Information Silos with Big Data: A Legal Analysis of Data Sharing


Chapter by Giovanni De Gregorio and Sofia Ranchordas in J. Cannataci, V. Falce & O. Pollicino (Eds), New Legal Challenges of Big Data (Edward Elgar, 2020, Forthcoming): “In the digital society, individuals play different roles depending on the situation they are placed in: they are consumers when they purchase a good, citizens when they vote in elections, content providers when they post information on a platform, and data subjects when their data is collected. Public authorities have thus far regulated citizens and the data collected on their different roles in silos (e.g., bankruptcy registrations, social welfare databases), resulting in inconsistent decisions, redundant paperwork, and delays in processing citizen requests. Data silos are considered to be inefficient both for companies and governments. Big data and data analytics are disrupting these silos, allowing the different roles of individuals and the respective data to converge. In practice, this happens in several countries with data sharing arrangements or ad hoc data requests. However, breaking down the existing structure of information silos in the public sector remains problematic. While big data disrupts artificial silos that may not make sense in the digital society and promotes a truly efficient digitalization of data, removing information from its original context may alter its meaning and violate the privacy of citizens. In addition, silos ensure that citizens are not assessed in one field by information generated in a totally different context. This chapter discusses how big data and data analytics are changing information silos and how digital technology is challenging citizens’ autonomy and right to privacy and data protection. This chapter also explores the need for a more integrated approach to the study of information, particularly in the public sector…(More)”.

Should Consumers Be Able to Sell Their Own Personal Data?


The Wall Street Journal: “People around the world are confused and concerned about what companies do with the data they collect from their interactions with consumers.

A global survey conducted last fall by the research firm Ipsos gives a sense of the scale of people’s worries and uncertainty. Roughly two-thirds of those surveyed said they knew little or nothing about how much data companies held about them or what companies did with that data. And only about a third of respondents on average said they had at least a fair amount of trust that a variety of corporate and government organizations would use the information they had about them in the right way….

Christopher Tonetti, an associate professor of economics at Stanford Graduate School of Business, says consumers should own and be able to sell their personal data. Cameron F. Kerry, a visiting fellow at the Brookings Institution and former general counsel and acting secretary of the U.S. Department of Commerce, opposes the idea….

YES: It Would Encourage Sharing of Data—a Plus for Consumers and Society…Data isn’t like other commodities in one fundamental way—it doesn’t diminish with use. And that difference is the key to why consumers should own the data that’s created when they interact with companies, and have the right to sell it.

NO: It Would Do Little to Help Consumers, and Could Leave Them Worse Off Than Now…

But owning data will do little to help consumers’ privacy—and may well leave them worse off. Meanwhile, consumer property rights would create enormous friction for valid business uses of personal information and for the free flow of information we value as a society.

In our current system, consumers reflexively click away rights to data in exchange for convenience, free services, connection, endorphins or other motivations. In a market where consumers could sell or license personal information they generate from web browsing, ride-sharing apps and other digital activities, is there any reason to expect that they would be less motivated to share their information? …(More)”.

Linked Democracy: Foundations, Tools, and Applications


Book edited by Marta Poblet, Pompeu Casanovas and Víctor Rodríguez-Doncel: “This open access book shows the factors linking information flow, social intelligence, rights management and modelling with epistemic democracy, offering licensed linked data along with information about the rights involved. This model of democracy for the web of data brings new challenges for the social organisation of knowledge, collective innovation, and the coordination of actions. Licensed linked data, licensed linguistic linked data, rights expression languages, semantic web regulatory models, electronic institutions and artificial socio-cognitive systems are all examples of regulatory and institutional design (regulations by design). The web has been massively populated with both data and services, and semantically structured data, the linked data cloud, facilitates and fosters human-machine interaction. Linked data aims to create ecosystems that make it possible to browse, discover, exploit and reuse data sets for applications. Rights expression languages semi-automatically regulate the use and reuse of content…(More)”.
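To make the notion of licensed linked data concrete, below is a minimal, hypothetical sketch in Python using the rdflib library (not code from the book): it describes a dataset with the standard DCAT vocabulary and attaches a machine-readable licence via dcterms:license, so that reuse conditions travel with the dataset description itself. The dataset URI and title are invented for illustration.

```python
# A minimal, hypothetical sketch of licensed linked data with rdflib.
# The dataset URI and title are invented; only the vocabularies
# (DCAT, Dublin Core Terms) are standard.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DCTERMS, RDF

DCAT = Namespace("http://www.w3.org/ns/dcat#")

g = Graph()
g.bind("dcat", DCAT)
g.bind("dcterms", DCTERMS)

dataset = URIRef("http://example.org/dataset/parliamentary-votes")  # hypothetical
g.add((dataset, RDF.type, DCAT.Dataset))
g.add((dataset, DCTERMS.title, Literal("Parliamentary voting records")))
# The licence triple is what makes this "licensed" linked data: reuse
# conditions are published alongside the dataset description itself.
g.add((dataset, DCTERMS.license,
       URIRef("https://creativecommons.org/licenses/by/4.0/")))

print(g.serialize(format="turtle"))
```

A consumer (human or machine) browsing the linked data cloud can then query for the dcterms:license triple and check reuse rights before building on the data, which is the kind of semi-automatic rights regulation the book discusses.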

Official Statistics 4.0: Verified Facts for People in the 21st Century


Book by Walter J. Radermacher: “This book explores official statistics and their social function in modern societies. Digitisation and globalisation are creating completely new opportunities and risks, a context in which facts (can) play an enormously important part if they are produced with a quality that makes them credible and purpose-specific. In order for this to actually happen, official statistics must continue to actively pursue the modernisation of their working methods. This book is not about the technical and methodological challenges associated with digitisation and globalisation; rather, it focuses on statistical sociology, which scientifically deals with the peculiarities and pitfalls of governing-by-numbers, and assigns statistics a suitable position in the future informational ecosystem. Further, the book provides a comprehensive overview of modern issues in official statistics, embodied in a historical and conceptual framework that endows it with different and innovative perspectives. Central to this work is the quality of statistical information provided by official statistics. The implementation of the UN Sustainable Development Goals in the form of indicators is another driving force in the search for answers, and is addressed here….(More)”.

The Economics of Artificial Intelligence


Book edited by Ajay Agrawal, Joshua Gans and Avi Goldfarb: “Advances in artificial intelligence (AI) highlight the potential of this technology to affect productivity, growth, inequality, market power, innovation, and employment. This volume seeks to set the agenda for economic research on the impact of AI.

It covers four broad themes: AI as a general purpose technology; the relationships between AI, growth, jobs, and inequality; regulatory responses to changes brought on by AI; and the effects of AI on the way economic research is conducted. It explores the economic influence of machine learning, the branch of computational statistics that has driven much of the recent excitement around AI, as well as the economic impact of robotics and automation and the potential economic consequences of a still-hypothetical artificial general intelligence. The volume provides frameworks for understanding the economic impact of AI and identifies a number of open research questions….(More)”.

Data gaps threaten achievement of development goals in Africa


Sara Jerving at Devex: “Data gaps across the African continent threaten to hinder the achievement of the Sustainable Development Goals and the African Union’s Agenda 2063, according to the Mo Ibrahim Foundation’s first governance report released on Tuesday.

The report, “Agendas 2063 & 2030: Is Africa On Track?”, based on an analysis of the foundation’s Ibrahim index of African governance, found that since the adoption of both of these agendas, the availability of public data in Africa has declined. Among data focused on social outcomes, there has been a notable decline in education, population and vital statistics, such as birth and death records, which allow citizens to access public services.

The index, on which the report is based, is the most comprehensive dataset on African governance, drawing on ten years of data from all 54 African nations. An updated index is released every two years….

The main challenge in the production of quality, timely data, according to the report, is a lack of funding and lack of independence of the national statistical offices.

Only one country, Mauritius, had a perfect score in terms of independence of its national statistics office – meaning that its office can collect the data it chooses, publish without approval from other arms of the government, and is sufficiently funded. Fifteen African nations scored zero in terms of the independence of their offices….(More)”.

Road Traffic Accidents Analysis in Mexico City through Crowdsourcing Data and Data Mining Techniques


Paper by Gabriela V. Angeles et al.: “Road traffic accidents are among the principal causes of traffic congestion, causing human losses, damage to health and the environment, economic losses and material damage. Traditional studies of road traffic accidents in urban zones require a very high investment of time and money, and their results quickly become outdated.

However, in many countries crowdsourced GPS-based traffic and navigation apps have now emerged as an important low-cost source of information for studies of road traffic accidents and the urban congestion they cause. In this article we identify the zones, roads and specific times in Mexico City (CDMX) where the largest number of road traffic accidents were concentrated during 2016. We built a database compiling information obtained from the social network known as Waze.

The methodology employed was Knowledge Discovery in Databases (KDD) to uncover patterns in the accident reports, applying data mining techniques with the help of Weka. The selected algorithms were Expectation Maximization (EM), used to obtain the ideal number of clusters for the data, and k-means as the grouping method. Finally, the results were visualized with the QGIS geographic information system….(More)”.
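A minimal sketch of that pipeline follows, assuming scikit-learn as a stand-in for Weka (the paper’s actual tool) and hypothetical file and column names for the Waze-derived reports. Here the cluster count is chosen by the Bayesian information criterion over Gaussian-mixture (EM) fits, which approximates but does not reproduce Weka’s own EM-based selection.

```python
# A minimal sketch of the EM + k-means pipeline described above, using
# scikit-learn in place of Weka. The file name and column names are
# hypothetical stand-ins for the Waze-derived accident reports.
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

reports = pd.read_csv("waze_accidents_2016.csv")  # hypothetical input file
X = StandardScaler().fit_transform(reports[["latitude", "longitude", "hour"]])

# Step 1: fit Gaussian mixtures (EM) over a range of cluster counts and
# keep the count with the lowest BIC, approximating the EM-based
# selection of the ideal number of clusters.
bic = {k: GaussianMixture(n_components=k, random_state=0).fit(X).bic(X)
       for k in range(2, 11)}
best_k = min(bic, key=bic.get)

# Step 2: group the reports with k-means using the selected cluster count.
reports["cluster"] = KMeans(n_clusters=best_k, n_init=10,
                            random_state=0).fit_predict(X)

# Export the labelled points, e.g. for mapping in QGIS.
reports.to_csv("accident_clusters.csv", index=False)
```

Each resulting cluster corresponds to a concentration of accident reports in space and time, and the exported CSV can be loaded as a point layer over a CDMX base map in QGIS for the kind of visualization the authors describe.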

Data Power: tactics, access and shaping


Introduction to the Data Power Special Issue of Online Information Review by Ysabel Gerrard and Jo Bates: “…The Data Power Conference 2017, and by extension the seven papers in this Special Issue, addressed three questions:

  1. How can we reclaim some form of data-based power and autonomy, and advance data-based technological citizenship, while living in regimes of data power?
  2. Is it possible to regain agency and mobilise data for the common good? To do so, which theories help to interrogate and make sense of the operations of data power?
  3. What kind of design frameworks are needed to build and deploy data-based technologies with values and ethics that are equitable and fair? How can big data be mobilised to improve how we live, beyond notions of efficiency and innovation?

These questions broadly emphasise the reclamation of power, retention of agency and ethics of data-based technologies, and they reflect a broader moment in recent data studies scholarship. While early critical research on “big data” – a term that captures the technologies, analytics and mythologies of increasingly large data sets (Boyd and Crawford, 2012) – could only hypothesise the inequalities and deepened forms of discrimination that might emerge as data sets grew in volume, many of those predictions have now become real. The articles in this Special Issue ask pressing questions about data power at a time when we have learned that data are too frequently handled in a way that deepens social inequalities and injustices (amongst others, Eubanks, 2018; Noble, 2018).

The papers in this Special Issue approach discussions of inequality and injustice through three broad lenses: the tactics people use to confront unequal distributions of (data) power; the access to data that are most relevant and essential for particular social groups, coupled with the changing and uncertain legalities of data access; and the shaping of social relations by and through data, whether through the demands placed on app users to disclose more personal information, the use of data to construct cultures of compliance or through the very methodologies commonly used to organise and label information. While these three themes do not exhaustively capture the range of topics addressed in this Special Issue, at the Data Power Conferences, or within the field at large, they represent an emphasis within data studies scholarship on shedding light on the most pressing issues confronting our increasingly datafied world…(More)”.