‘In Situ’ Data Rights


Essay by Marshall W Van Alstyne, Georgios Petropoulos, Geoffrey Parker, and Bertin Martens: “…Data portability sounds good in theory—number portability improved telephony—but this theory has its flaws.

  • Context: The value of data depends on context. Removing data from that context removes value. A portability exercise by experts at the ProgrammableWeb succeeded in downloading basic Facebook data but failed on a re-upload. Individual posts shed the prompts that preceded them and the replies that followed them. After all, that data concerns others.
  • Stagnation: Without a flow of updates, a captured stock depreciates. Data must be refreshed to stay current, and potential users must see those data updates to stay informed.
  • Impotence: Facts removed from their place of residence become less actionable. We cannot use them to make a purchase when removed from their markets or reach a friend when they are removed from their social networks. Data must be reconnected to be reanimated.
  • Market Failure: Innovation is slowed. Consider how markets for business analytics and B2B services develop. Lacking complete context, third parties can only offer incomplete benchmarking and analysis. Platforms that do offer market overview services can charge monopoly prices because they have context that partners and competitors do not.
  • Moral Hazard: Proposed laws seek to give merchants data portability rights but these entail a problem that competition authorities have not anticipated. Regulators seek to help merchants “multihome,” to affiliate with more than one platform. Merchants can take their earned ratings from one platform to another and foster competition. But, when a merchant gains control over its ratings data, magically, low reviews can disappear! Consumers fraudulently edited their personal records under early U.K. open banking rules. With data editing capability, either side can increase fraud, surely not the goal of data portability.

Evidence suggests that following GDPR, E.U. ad effectiveness fell, E.U. Web revenues fell, investment in E.U. startups fell, the stock and flow of apps available in the E.U. fell, while Google and Facebook, who already had user data, gained rather than lost market share as small firms faced new hurdles the incumbents managed to avoid. To date, the results are far from regulators’ intentions.

We propose a new in situ data right for individuals and firms, and a new theory of benefits. Rather than take data from the platform, or ex situ as portability implies, let us grant users the right to use their data in the location where it resides. Bring the algorithms to the data instead of bringing the data to the algorithms. Users determine when and under what conditions third parties access their in situ data in exchange for new kinds of benefits. Users can revoke access at any time and third parties must respect that. This patches and repairs the portability problems…(More).”
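The access model the authors describe can be sketched in code. The sketch below is a hypothetical illustration, not the authors' implementation: the platform keeps data in place, users grant and revoke third-party access, and only computed results ever leave the platform. All class, user, and party names here are invented for illustration.

```python
# Hypothetical sketch of an "in situ" data right: data never leaves the
# platform; third parties bring their algorithms to it instead.

class Platform:
    def __init__(self):
        self._data = {}       # user_id -> that user's data, held in place
        self._grants = set()  # (user_id, third_party) pairs currently allowed

    def store(self, user_id, data):
        self._data[user_id] = data

    def grant(self, user_id, third_party):
        """User decides when and under what conditions a third party gets access."""
        self._grants.add((user_id, third_party))

    def revoke(self, user_id, third_party):
        """Users can revoke access at any time; the platform must respect it."""
        self._grants.discard((user_id, third_party))

    def run_in_situ(self, user_id, third_party, algorithm):
        """Bring the algorithm to the data: only the computed result leaves."""
        if (user_id, third_party) not in self._grants:
            raise PermissionError("no in situ grant from this user")
        return algorithm(self._data[user_id])


platform = Platform()
platform.store("merchant42", {"ratings": [5, 4, 5, 2, 5]})
platform.grant("merchant42", "analytics-co")

# The third party computes on the ratings where they reside.
avg = platform.run_in_situ("merchant42", "analytics-co",
                           lambda d: sum(d["ratings"]) / len(d["ratings"]))
print(avg)  # 4.2

# Revocation takes effect immediately; further access attempts fail.
platform.revoke("merchant42", "analytics-co")
```

Note that because the raw ratings never leave the platform, the moral-hazard problem above does not arise: the merchant can authorize analysis of its reviews without ever gaining the ability to edit or delete them.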

Quarantined Data? The impact, scope & challenges of open data during COVID


Chapter by Álvaro V. Ramírez-Alujas: “How do rates of COVID-19 infection increase? How do populations respond to lockdown measures? How is the pandemic affecting the economic and social activity of communities beyond health? What can we do to mitigate risks and support families in this context? The answer to these and other key questions is part of the intense global public debate on the management of the health crisis and how appropriate public policy measures have been taken in order to combat the impact and effects of COVID-19 around the world. The common ground to all of them? The availability and use of public data and information. This chapter reflects on the relevance of public information and the availability, processing and use of open data as the primary hub and key ingredient in the responsiveness of governments and public institutions to the COVID-19 pandemic and its multiple impacts on society. Discussions are underway concerning the scope, paradoxes, lessons learned, and visible challenges with respect to the available evidence and comparative analysis of government strategies in the region. These discussions incorporate the urgent need to shift towards a more robust, sustainable data infrastructure anchored in a logic of strengthening the ecosystem of actors (public and private sectors, civil society and the scientific community) to shape a framework of governance, and a strong, emerging institutional architecture based on data management for sustainable development on a human scale…(More)”.

The Quiet Before


Book by Gal Beckerman: “We tend to think of revolutions as loud: frustrations and demands shouted in the streets. But the ideas fueling them have traditionally been conceived in much quieter spaces, in the small, secluded corners where a vanguard can whisper among themselves, imagine alternate realities, and deliberate about how to achieve their goals. This extraordinary book is a search for those spaces, over centuries and across continents, and a warning that—in a world dominated by social media—they might soon go extinct.

Gal Beckerman, an editor at The New York Times Book Review, takes us back to the seventeenth century, to the correspondence that jump-started the scientific revolution, and then forward through time to examine engines of social change: the petitions that secured the right to vote in 1830s Britain, the zines that gave voice to women’s rage in the early 1990s, and even the messaging apps used by epidemiologists fighting the pandemic in the shadow of an inept administration. In each case, Beckerman shows that our most defining social movements—from decolonization to feminism—were formed in quiet, closed networks that allowed a small group to incubate their ideas before broadcasting them widely.

But Facebook and Twitter are replacing these productive, private spaces, to the detriment of activists around the world. Why did the Arab Spring fall apart? Why did Occupy Wall Street never gain traction? Has Black Lives Matter lived up to its full potential? Beckerman reveals what this new social media ecosystem lacks—everything from patience to focus—and offers a recipe for growing radical ideas again…(More)”.

Incentivising research data sharing: a scoping review


Paper by Helen Buckley Woods and Stephen Pinfield: “Numerous mechanisms exist to incentivise researchers to share their data. This scoping review aims to identify and summarise evidence of the efficacy of different interventions to promote open data practices and provide an overview of current research….Seven major themes in the literature were identified: publisher/journal data sharing policies, metrics, software solutions, research data sharing agreements in general, open science ‘badges’, funder mandates, and initiatives….

A number of key messages for data sharing include: the need to build on existing cultures and practices, meeting people where they are and tailoring interventions to support them; the importance of publicising and explaining the policy/service widely; the need to have disciplinary data champions to model good practice and drive cultural change; the requirement to resource interventions properly; and the imperative to provide robust technical infrastructure and protocols, such as labelling of data sets, use of DOIs, data standards and use of data repositories….(More)”.

If AI Is Predicting Your Future, Are You Still Free?


Essay by Carissa Véliz: “…Today, prediction is mostly done through machine learning algorithms that use statistics to fill in the blanks of the unknown. Text algorithms use enormous language databases to predict the most plausible ending to a string of words. Game algorithms use data from past games to predict the best possible next move. And algorithms that are applied to human behavior use historical data to infer our future: what we are going to buy, whether we are planning to change jobs, whether we are going to get sick, whether we are going to commit a crime or crash our car. Under such a model, insurance is no longer about pooling risk from large sets of people. Rather, predictions have become individualized, and you are increasingly paying your own way, according to your personal risk scores—which raises a new set of ethical concerns.

An important characteristic of predictions is that they do not describe reality. Forecasting is about the future, not the present, and the future is something that has yet to become real. A prediction is a guess, and all sorts of subjective assessments and biases regarding risk and values are built into it. There can be forecasts that are more or less accurate, to be sure, but the relationship between probability and actuality is much more tenuous and ethically problematic than some assume.

Institutions today, however, often try to pass off predictions as if they were a model of objective reality. And even when AI’s forecasts are merely probabilistic, they are often interpreted as deterministic in practice—partly because human beings are bad at understanding probability and partly because the incentives around avoiding risk end up reinforcing the prediction. (For example, if someone is predicted to be 75 percent likely to be a bad employee, companies will not want to take the risk of hiring them when they have candidates with a lower risk score)…(More)”.
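The hiring example can be made concrete with a toy calculation (the scores and threshold below are invented for illustration, not drawn from the essay): once a risk threshold is applied, a probabilistic score behaves exactly like a deterministic verdict, and the 25 percent chance the prediction is wrong never enters the decision.

```python
# Toy illustration: a probabilistic risk score collapses into a
# deterministic decision once a hiring threshold is applied.

RISK_THRESHOLD = 0.5  # invented policy: reject anyone scored above this

candidates = {
    "A": 0.75,  # predicted 75% likely to be a "bad employee"
    "B": 0.40,
}

def decide(risk_score):
    # The outcome is binary. The 25% chance that candidate A would have
    # been a fine employee plays no role in the decision at all.
    return "reject" if risk_score > RISK_THRESHOLD else "hire"

decisions = {name: decide(score) for name, score in candidates.items()}
print(decisions)  # {'A': 'reject', 'B': 'hire'}
```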

Mapping data portability initiatives, opportunities and challenges


OECD Report: “Data portability has become an essential tool for enhancing access to and sharing of data across digital services and platforms. This report explores to what extent data portability can empower users (natural and legal persons) to play a more active role in the re-use of their data across digital services and platforms. It also examines how data portability can help increase interoperability and data flows and thus enhance competition and innovation by reducing switching costs and lock-in effects….(More)”.

The 2021 Good Tech Awards


Kevin Roose at the New York Times: “…Especially at a time when many of tech’s leaders seem more interested in building new, virtual worlds than improving the world we live in, it’s worth praising the technologists who are stepping up to solve some of our biggest problems.

So here, without further ado, are this year’s Good Tech Awards…

One of the year’s most exciting A.I. breakthroughs came in July when DeepMind — a Google-owned artificial intelligence company — published data and open-source code from its groundbreaking AlphaFold project.

The project, which used A.I. to predict the structures of proteins, solved a problem that had vexed scientists for decades, and was hailed by experts as one of the greatest scientific discoveries of all time. And by publishing its data freely, AlphaFold set off a frenzy among researchers, some of whom are already using it to develop new drugs and better understand the proteins involved in viruses like SARS-CoV-2.

Google’s overall A.I. efforts have been fraught with controversy and missteps, but AlphaFold seems like an unequivocally good use of the company’s vast expertise and resources…

Prisons aren’t known as hotbeds of innovation. But two tech projects this year tried to make our criminal justice system more humane.

Recidiviz is a nonprofit tech start-up that builds open-source data tools for criminal justice reform. It was started by Clementine Jacoby, a former Google employee who saw an opportunity to corral data about the prison system and make it available to prison officials, lawmakers, activists and researchers to inform their decisions. Its tools are in use in seven states, including North Dakota, where the data tools helped prison officials assess the risk of Covid-19 outbreaks and identify incarcerated people who were eligible for early release….(More)”.

Biases in human mobility data impact epidemic modeling


Paper by Frank Schlosser, Vedran Sekara, Dirk Brockmann, and Manuel Garcia-Herranz: “Large-scale human mobility data is a key resource in data-driven policy making and across many scientific fields. Most recently, mobility data was extensively used during the COVID-19 pandemic to study the effects of governmental policies and to inform epidemic models. Large-scale mobility is often measured using digital tools such as mobile phones. However, it remains an open question how truthfully these digital proxies represent the actual travel behavior of the general population. Here, we examine mobility datasets from multiple countries and identify two fundamentally different types of bias caused by unequal access to, and unequal usage of mobile phones. We introduce the concept of data generation bias, a previously overlooked type of bias, which is present when the amount of data that an individual produces influences their representation in the dataset. We find evidence for data generation bias in all examined datasets in that high-wealth individuals are overrepresented, with the richest 20% contributing over 50% of all recorded trips, substantially skewing the datasets. This inequality is consequential, as we find mobility patterns of different wealth groups to be structurally different, where the mobility networks of high-wealth users are denser and contain more long-range connections. To mitigate the skew, we present a framework to debias data and show how simple techniques can be used to increase representativeness. Using our approach we show how biases can severely impact outcomes of dynamic processes such as epidemic simulations, where biased data incorrectly estimates the severity and speed of disease transmission. Overall, we show that a failure to account for biases can have detrimental effects on the results of studies and urge researchers and practitioners to account for data-fairness in all future studies of human mobility…(More)”.
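One simple debiasing technique consistent with the paper's description is post-stratification: reweight each wealth group's trips so that its contribution matches its population share rather than its share of recorded trips. The sketch below is illustrative only; the numbers are invented (echoing the paper's finding that the richest 20% contribute over 50% of trips) and the authors' actual framework is described in the paper itself.

```python
# Post-stratification sketch: reweight recorded trips so that each wealth
# group counts in proportion to its population share, not its data share.
# All numbers are invented for illustration.

population_share = {"top20": 0.20, "bottom80": 0.80}
recorded_trips = {"top20": 55_000, "bottom80": 45_000}  # top 20% overrepresented

total_trips = sum(recorded_trips.values())
data_share = {g: n / total_trips for g, n in recorded_trips.items()}

# Weight per trip = population share / share of trips in the dataset.
weights = {g: population_share[g] / data_share[g] for g in recorded_trips}

# After weighting, each group's effective contribution matches the population.
effective = {g: recorded_trips[g] * weights[g] for g in recorded_trips}
total_effective = sum(effective.values())
shares = {g: v / total_effective for g, v in effective.items()}

print(weights)  # top20 trips are down-weighted, bottom80 trips up-weighted
print(shares)   # approximately {'top20': 0.2, 'bottom80': 0.8}
```

Reweighting of this kind corrects group-level representation, though it cannot by itself recover the structurally different mobility patterns (denser, longer-range networks) that the paper finds among high-wealth users.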

Expecting the Unexpected: Effects of Data Collection Design Choices on the Quality of Crowdsourced User-Generated Content


Paper by Roman Lukyanenko: “As crowdsourced user-generated content becomes an important source of data for organizations, a pressing question is how to ensure that data contributed by ordinary people outside of traditional organizational boundaries is of suitable quality to be useful for both known and unanticipated purposes. This research examines the impact of different information quality management strategies, and corresponding data collection design choices, on key dimensions of information quality in crowdsourced user-generated content. We conceptualize a contributor-centric information quality management approach focusing on instance-based data collection. We contrast it with the traditional consumer-centric fitness-for-use conceptualization of information quality that emphasizes class-based data collection. We present laboratory and field experiments conducted in a citizen science domain that demonstrate trade-offs in the quality dimensions of accuracy, completeness (including discoveries), and precision between the two information management approaches and their corresponding data collection designs. Specifically, we show that instance-based data collection results in higher accuracy, dataset completeness and number of discoveries, but this comes at the expense of lower precision. We further validate the practical value of the instance-based approach by conducting an applicability check with potential data consumers (scientists, in our context of citizen science). In a follow-up study, we show, using human experts and supervised machine learning techniques, that substantial precision gains on instance-based data can be achieved with post-processing. We conclude by discussing the benefits and limitations of different information quality and data collection design choices for information quality in crowdsourced user-generated content…(More)”.
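The trade-off the paper reports can be sketched schematically. In the sketch below (the records and mapping rules are invented, not the paper's materials), instance-based collection stores whatever attributes a contributor actually observed, and a simple rule-based post-processing step, standing in for the human experts and supervised models used in the study, later maps attributes onto classes.

```python
# Schematic contrast of class-based vs instance-based crowdsourced records,
# with a simple post-processing step. Data and rules are invented.

# Class-based collection forces contributors to pick from a fixed list;
# observations they cannot classify are lost (lower completeness).
CLASSES = {"mallard", "canada goose"}

# Instance-based collection records observed attributes even when the
# contributor cannot name the class (higher completeness, lower precision).
instance_records = [
    {"attributes": {"green head", "yellow bill"}},      # a mallard, unnamed
    {"attributes": {"black neck", "white chinstrap"}},  # a canada goose
    {"attributes": {"webbed feet"}},                    # too vague to resolve
]

# Post-processing: map attribute sets to classes where a rule applies.
RULES = {
    frozenset({"green head", "yellow bill"}): "mallard",
    frozenset({"black neck", "white chinstrap"}): "canada goose",
}

def postprocess(record):
    return RULES.get(frozenset(record["attributes"]), "unresolved")

labels = [postprocess(r) for r in instance_records]
print(labels)  # ['mallard', 'canada goose', 'unresolved']
```

The third record illustrates the trade-off: it would have been discarded under class-based collection, and it survives instance-based collection at the cost of precision until post-processing (or a later, richer observation) can resolve it.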

Research Anthology on Citizen Engagement and Activism for Social Change


Book by the Information Resources Management Association (IRMA): “Activism and the role everyday people play in making a change in society are increasingly popular topics in the world right now, especially as younger generations begin to speak out. From traditional protests to activities on college campuses, to the use of social media, more individuals are finding accessible platforms with which to share their views and become more actively involved in politics and social welfare. With the emergence of new technologies and a spotlight on important social issues, people are able to become more involved in society than ever before as they fight for what they believe. It is essential to consider the recent trends, technologies, and movements in order to understand where society is headed in the future.

The Research Anthology on Citizen Engagement and Activism for Social Change examines a plethora of innovative research surrounding social change and the various ways citizens are involved in shaping society. Covering topics such as accountability, social media, voter turnout, and leadership, it is an ideal work for activists, sociologists, social workers, politicians, public administrators, journalists, policymakers, social media analysts, government administrators, academicians, researchers, practitioners, and students….(More)”.