The Global Commons of Data


Paper by Jennifer Shkabatur: “Data platform companies (such as Facebook, Google, or Twitter) amass and process immense amounts of data that is generated by their users. These companies primarily use the data to advance their commercial interests, but there is a growing public dismay regarding the adverse and discriminatory impacts of their algorithms on society at large. The regulation of data platform companies and their algorithms has been hotly debated in the literature, but current approaches often neglect the value of data collection, defy the logic of algorithmic decision-making, and exceed the platform companies’ operational capacities.

This Article suggests a different approach — an open, collaborative, and incentives-based stance toward data platforms that takes full advantage of the tremendous societal value of user-generated data. It contends that this data shall be recognized as a “global commons,” and access to it shall be made available to a wide range of independent stakeholders — research institutions, journalists, public authorities, and international organizations. These external actors would be able to utilize the data to address a variety of public challenges, as well as observe from within the operation and impacts of the platforms’ algorithms.

After making the theoretical case for the “global commons of data,” the Article explores the practical implementation of this model. First, it argues that a data commons regime should operate through a spectrum of data sharing and usage modalities that would protect the commercial interests of data platforms and the privacy of data users. Second, it discusses regulatory measures and incentives that can solicit the collaboration of platform companies with the commons model. Lastly, it explores the challenges embedded in this approach….(More)”.

The Nail Finds a Hammer: Self-Sovereign Identity, Design Principles, and Property Rights in the Developing World


Report by Michael Graglia, Christopher Mellon and Tim Robustelli: “Our interest in identity systems was an inevitable outgrowth of our earlier work on blockchain-based land registries. Property registries, which at the simplest level are ledgers of who has which rights to which asset, require a very secure and reliable means of identifying both people and properties. In the course of investigating solutions to that problem, we began to appreciate the broader challenges of digital identity and its role in international development. And the more we learned about digital identity, the more convinced we became of the need for self-sovereign identity, or SSI. This model, and the underlying principles of identity which it incorporates, will be described in detail in this paper.

We believe that the great potential of SSI is that it can make identity in the digital world function more like identity in the physical world, in which every person has a unique and persistent identity which is represented to others by means of both their physical attributes and a collection of credentials attested to by various external sources of authority. These credentials are stored and controlled by the identity holder—typically in a wallet—and presented to different people for different reasons at the identity holder’s discretion. Crucially, the identity holder controls what information to present based on the environment, trust level, and type of interaction. Moreover, their fundamental identity persists even though the credentials by which it is represented may change over time.

The digital incarnation of this model has many benefits, including both greatly improved privacy and security, and the ability to create more trustworthy online spaces. Social media and news sites, for example, might limit participation to users with verified identities, excluding bots and impersonators.

The need for identification in the physical world varies based on location and social context. We expect to walk in relative anonymity down a busy city street, but will show a driver’s license to enter a bar, and both a driver’s license and a birth certificate to apply for a passport. There are different levels of ID and supporting documents required for each activity. But in each case, access to personal information is controlled by the user who may choose whether or not to share it.

Self-sovereign identity gives users complete control of their own identities and related personal data, which sits encrypted in distributed storage instead of being stored by a third party in a central database. In older, “federated identity” models, a single account—a Google account, for example—might be used to log in to a number of third-party sites, like news sites or social media platforms. But in this model a third party brokers all of these ID transactions, meaning that in exchange for the convenience of having to remember fewer passwords, the user must sacrifice a degree of privacy.

A real world equivalent would be having to ask the state to share a copy of your driver’s license with the bar every time you wanted to prove that you were over the age of 21. SSI, in contrast, gives the user a portable, digital credential (like a driver’s license or some other document that proves your age), the authenticity of which can be securely validated via cryptography without the recipient having to check with the authority that issued it. This means that while the credential can be used to access many different sites and services, there is no third-party broker to track the services to which the user is authenticating. Furthermore, cryptographic techniques called “zero-knowledge proofs” (ZKPs) can be used to prove possession of a credential without revealing the credential itself. This makes it possible, for example, for users to prove that they are over the age of 21 without having to share their actual birth dates, which are both sensitive information and irrelevant to a binary, yes-or-no ID transaction….(More)”.
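
The offline check described in the excerpt above can be illustrated with a minimal sketch. It is not the report's design and it is not a zero-knowledge proof: it uses a simpler technique, salted hash commitments, so that a holder can reveal a single claim ("over_21") while keeping the birth date hidden. The function names (`issue`, `present`, `verify`), the claim names, and the toy textbook-RSA key are all assumptions made for illustration; the key is deliberately insecure and only stands in for a real signature scheme.

```python
import hashlib
import json
import secrets

# Toy textbook-RSA key built from two known Mersenne primes. NOT secure:
# it only stands in for a real signature scheme (assumption for this sketch).
P, Q, E = 2**127 - 1, 2**89 - 1, 65537
N = P * Q                            # public modulus
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent, held by the issuer


def digest_int(data: bytes) -> int:
    """SHA-256 digest reduced modulo N so the toy key can sign it."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def commit(name: str, value, salt: str) -> str:
    """Salted hash that hides a claim until the holder reveals it."""
    return hashlib.sha256(json.dumps([salt, name, value]).encode()).hexdigest()


def issue(claims: dict) -> dict:
    """Issuer: commit to every claim, then sign the sorted commitments."""
    salts = {k: secrets.token_hex(16) for k in claims}
    commitments = sorted(commit(k, v, salts[k]) for k, v in claims.items())
    signature = pow(digest_int(json.dumps(commitments).encode()), D, N)
    return {"claims": claims, "salts": salts,
            "commitments": commitments, "signature": signature}


def present(credential: dict, name: str) -> dict:
    """Holder: reveal one claim and its salt; other claims stay hidden."""
    return {"name": name, "value": credential["claims"][name],
            "salt": credential["salts"][name],
            "commitments": credential["commitments"],
            "signature": credential["signature"]}


def verify(presentation: dict) -> bool:
    """Verifier: uses only the issuer's public key (N, E), no call to the issuer."""
    expected = digest_int(json.dumps(presentation["commitments"]).encode())
    signed_ok = pow(presentation["signature"], E, N) == expected
    revealed_ok = commit(presentation["name"], presentation["value"],
                         presentation["salt"]) in presentation["commitments"]
    return signed_ok and revealed_ok
```

In this sketch a bar would call `verify(present(credential, "over_21"))` and learn only the yes-or-no answer. A production system would replace the toy key with a standard signature scheme and, where unlinkability matters, with an actual zero-knowledge proof.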

A Behavioral Economics Approach to Digitalisation


Paper by Dirk Beerbaum and Julia M. Puaschunder: “A growing body of academic research in the field of behavioural economics, political science and psychology demonstrates how an invisible hand can nudge people’s decisions towards a preferred option. Contrary to the assumptions of neoclassical economics, supporters of nudging argue that people have problems coping with a complex world, because of their limited knowledge and their restricted rationality. Technological improvement in the age of information has increased the possibilities to control the innocent social media users or penalise private investors and reap the benefits of their existence in hidden persuasion and discrimination. Nudging enables nudgers to plunder the simple uneducated and uninformed citizen and investor, who is neither aware of the nudging strategies nor able to oversee the tactics used by the nudgers (Puaschunder 2017a, b; 2018a, b).

The nudgers are thereby legally protected by democratically assigned positions they hold. The law of motion of the nudging societies holds an unequal concentration of power of those who have access to compiled data and coding rules, relevant for political power and influencing the investor’s decision usefulness (Puaschunder 2017a, b; 2018a, b). This paper takes as a case the “transparency technology XBRL (eXtensible Business Reporting Language)” (Sunstein 2013, 20), which should make data more accessible as well as usable for private investors. It is part of the choice architecture on regulation by governments (Sunstein 2013). However, XBRL is bound to a taxonomy (Piechocki and Felden 2007).

Considering theoretical literature and field research, a representation issue (Beerbaum, Piechocki and Weber 2017) for principles-based accounting taxonomies exists, which intelligent machines applying Artificial Intelligence (AI) (Mwilu, Prat and Comyn-Wattiau 2015) nudge to facilitate decision usefulness. This paper conceptualizes ethical questions arising from the taxonomy engineering based on machine learning systems: Should the objective of the coding rule be to support or to influence human decision making or rational artificiality? This paper therefore advocates for a democratisation of information, education and transparency about nudges and coding rules (Puaschunder 2017a, b; 2018a, b)…(More)”.

Mapping humanitarian action on Instagram


Report by Anthony McCosker, Jane Farmer, Tracy De Cotta, Peter Kamstra, Natalie Jovanovski, Arezou Soltani Panah, Zoe Teh, and Sam Wilson: “Every day, people undertake many different kinds of voluntary service and humanitarian action. This might involve fundraising and charity work, giving time, helping or inspiring others, or promoting causes. However, because so much of the research on volunteering and humanitarian action focuses on formal activities along with large-scale campaigns and global crisis events, we know very little about what people are doing informally and in their local community.

Humanitarianism is changing with the digital age and with new modes of networked communication and interaction. The research presented in this report offers new insights into the way people engage with humanitarian activities in their local contexts and everyday lives. We turned to Instagram as a novel data source that can offer insights into everyday humanitarian action. As a popular visual social media platform, Instagram provides a certain kind of intimate access to the humanitarian acts and the social good values that people want to capture, share and promote to others.

Through an analysis of Instagram data, we sought to develop a typology of everyday humanitarian actions, the targets of those actions, and the situations and contexts in which they happen. Our research methodology and findings unlock a new approach to understanding humanitarian action in situ, and open opportunities for organisation-led campaigns to improve and support self-mobilisation.

By using geographical information provided by Instagram users when they post, we demonstrate the relationships between humanitarian activities and locations across Victoria, Australia, illustrating the heavy concentration of activity within Melbourne’s CBD and inner suburbs. The data shows patterns in the kinds of actions, the situations in which they occur, and the humanitarian targets and values shared. On the basis of the findings, the report points to next steps in how humanitarian and charity organisations can innovate using social data to build a digitally active humanitarian movement by mapping and amplifying and better understanding humanitarian deeds where and when they happen. While the analysis offers many nuanced insights into everyday humanitarian activity, we highlight three key findings.

  • When people post to Instagram about humanitarian action they are most often promoting causes and activities, fundraising and giving time
  • Groups give time (volunteering, giving), individuals give or raise money (charity, fundraising)
  • Humanitarian action posted to Instagram is heavily concentrated around Melbourne CBD and inner suburbs, with a focus on public spaces, restaurant and entertainment precincts along the Yarra River and Swanston Street…(More)”.

Folksonomies: how to do things with words on social media


Oxford Dictionaries: “Folksonomy, a portmanteau word for ‘folk taxonomy’, is a term for collaborative tagging: the production of user-created ‘tags’ on social media that help readers to find and sort content. In other words, hashtags: #ThrowbackThursday, #DogLife, #MeToo. Because ordinary people create folksonomy tags, folksonomies include categories devised by small communities, subcultures, or even individuals, not merely those by accepted taxonomic systems like the Dewey Decimal System.

The term first arose in the wake of Web 2.0 – the Web’s transition, in the early 2000s, from a read-only platform to a read-write platform that allows users to comment on and collaboratively tag what they read. Rather unusually, we know the exact date it was coined: 24 July, 2004. The information architect Thomas Vander Wal came up with it in response to a query over what to call this kind of informal social classification.

Perhaps the most visible folksonomies are those on social-media platforms like Facebook, Twitter, Tumblr, Flickr, and Instagram. Often, people create tags on these platforms in order to gather under a single tag content that many different users have created, making it easier to find posts related to that tag. (If I’m interested in dogs, I might look at content gathered under the tag #DogLife.) Because tags reflect the interests of people who create them, researchers have pursued ways to use tags to build more comprehensive profiles of users, with an eye to surveillance or to selling them relevant ads.

But people may also use tags as prompts for the creation of new content, not merely the curation of content they would have posted anyway. As I write this post, a trending tag on Twitter, #MakeAHorrorMovieMoreHorrific, is prompting thousands of people to write satirical takes on how classic horror movies might be made more ‘horrifying’ by adding unhappy features of our ordinary lives. (‘I Know What You Did Last Summer, and I Put It on Facebook’; ‘Rosemary’s Baby Is Teething’; ‘The Exercise’)

From a certain perspective, this is not so different from a library’s acknowledgment of a new category of text: if a new academic field, like ‘the history of the book’, catches on, then libraries rearrange their shelves and catalogues to accommodate the history of the book as a category; the new shelf space and catalogue space creates a demand for new books in that category, which encourages authors and publishers to produce new books to meet the demand.

But new folksonomy tags (with important exceptions, as in the realm of activism) are often short-lived and meant to be short-lived, obscure and meant to be obscure. What library cataloguer would think to accommodate the category #glitterhorse, which has a surprising number of posts on Twitter and Instagram? How can Vander Wal’s original definition of folksonomy as a tool for information retrieval accommodate tags that function, not as search terms, but as theatrical asides, like #sorrynotsorry? What about tags that are so narrowly specific that no search could ever turn up more than one usage?

Perhaps the best way to understand the weird things that people do with folksonomy tags is to appeal, not to information science, but to narratology, the study of narrative structures. …(More)”.
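
The retrieval use of tags described in the excerpt above, gathering content from many users under a single tag, amounts to an inverted index. A minimal sketch follows, assuming posts arrive as a mapping from a post ID to its text; the name `build_tag_index` and the sample data are hypothetical and do not reflect any platform's API.

```python
import re
from collections import defaultdict

# A hashtag is taken to be '#' followed by word characters (a simplification).
HASHTAG = re.compile(r"#(\w+)")


def build_tag_index(posts: dict) -> dict:
    """Map each lowercased tag to the IDs of the posts that carry it."""
    index = defaultdict(list)
    for post_id, text in posts.items():
        # A set avoids listing a post twice when it repeats a tag.
        for tag in {t.lower() for t in HASHTAG.findall(text)}:
            index[tag].append(post_id)
    return index
```

Looking up `build_tag_index(posts)["doglife"]` returns every post gathered under that tag. Tags used as asides, such as #sorrynotsorry, land in the same index even though nobody is expected to search for them, which is the tension the excerpt points to.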

Open Data Exposed


Book by Bastiaan van Loenen, Glenn Vancauwenberghe, Joep Crompvoets and Lorenzo Dalla Corte: “This book is about open data, i.e. data that does not have any barriers to its (re)use. Open data aims to optimize access to, sharing of, and use of data from a technical, legal, financial, and intellectual perspective.

Data increasingly determines the way people live their lives today. Nowadays, we cannot imagine a life without real-time traffic information about our route to work, information of the daily news or information about the local weather. At the same time, citizens themselves now are constantly generating and sharing data and information via many different devices and social media systems. Especially for governments, collection, management, exchange, and use of data and information have always been key tasks, since data is both the primary input to and output of government activities. Also for businesses, non-profit organizations, researchers and various other actors, data and information are essential….(More)”.

Positive deviance, big data, and development: A systematic literature review


Paper by Basma Albanna and Richard Heeks: “Positive deviance is a growing approach in international development that identifies those within a population who are outperforming their peers in some way, e.g., children in low‐income families who are well nourished when those around them are not. Analysing and then disseminating the behaviours and other factors underpinning positive deviance are demonstrably effective in delivering development results.

However, positive deviance faces a number of challenges that are restricting its diffusion. In this paper, using a systematic literature review, we analyse the current state of positive deviance and the potential for big data to address the challenges facing positive deviance. From this, we evaluate the promise of “big data‐based positive deviance”: This would analyse typical sources of big data in developing countries—mobile phone records, social media, remote sensing data, etc—to identify both positive deviants and the factors underpinning their superior performance.

While big data cannot solve all the challenges facing positive deviance as a development tool, they could reduce time, cost, and effort; identify positive deviants in new or better ways; and enable positive deviance to break out of its current preoccupation with public health into domains such as agriculture, education, and urban planning. In turn, positive deviance could provide a new and systematic basis for extracting real‐world development impacts from big data…(More)”.
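
The identification step described in the excerpt above, finding those who outperform their peers, can be sketched as a within-group outlier test. This is only one plausible reading and not the authors' method: the record layout, the function name `positive_deviants`, and the 1.5 standard-deviation threshold are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def positive_deviants(records, threshold=1.5):
    """Return IDs whose outcome is at least `threshold` standard deviations
    above the mean of their own peer group.

    `records` is an iterable of (person_id, peer_group, outcome) tuples.
    """
    by_group = {}
    for person_id, group, outcome in records:
        by_group.setdefault(group, []).append((person_id, outcome))

    flagged = []
    for members in by_group.values():
        outcomes = [outcome for _, outcome in members]
        mu, sigma = mean(outcomes), pstdev(outcomes)
        if sigma == 0:
            continue  # everyone performs identically, so no one deviates
        flagged += [pid for pid, outcome in members
                    if (outcome - mu) / sigma >= threshold]
    return flagged
```

Comparing each person only with their own peer group matters here: a well-nourished child in a low-income village is a positive deviant relative to that village, not relative to the national average.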

Surveillance Studies: A Reader


Book edited by Torin Monahan and David Murakami Wood: “Surveillance is everywhere: in workplaces monitoring the performance of employees, social media sites tracking clicks and uploads, financial institutions logging transactions, advertisers amassing fine-grained data on customers, and security agencies siphoning up everyone’s telecommunications activities. Surveillance practices, although often hidden, have come to define the way modern institutions operate. Because of the growing awareness of the central role of surveillance in shaping power relations and knowledge across social and cultural contexts, scholars from many different academic disciplines have been drawn to “surveillance studies,” which in recent years has solidified as a major field of study.

Torin Monahan and David Murakami Wood’s Surveillance Studies is a broad-ranging reader that provides a comprehensive overview of the dynamic field. In fifteen sections, the book features selections from key historical and theoretical texts, samples of the best empirical research done on surveillance, introductions to debates about privacy and power, and cutting-edge treatments of art, film, and literature. While the disciplinary perspectives and foci of scholars in surveillance studies may be diverse, there is coherence and agreement about core concepts, ideas, and texts. This reader outlines these core dimensions and highlights various differences and tensions. In addition to a thorough introduction that maps the development of the field, the volume offers helpful editorial remarks for each section and brief prologues that frame the included excerpts. …(More)”.

When AI Misjudgment Is Not an Accident


Douglas Yeung at Scientific American: “The conversation about unconscious bias in artificial intelligence often focuses on algorithms that unintentionally cause disproportionate harm to entire swaths of society—those that wrongly predict black defendants will commit future crimes, for example, or facial-recognition technologies developed mainly by using photos of white men that do a poor job of identifying women and people with darker skin.

But the problem could run much deeper than that. Society should be on guard for another twist: the possibility that nefarious actors could seek to attack artificial intelligence systems by deliberately introducing bias into them, smuggled inside the data that helps those systems learn. This could introduce a worrisome new dimension to cyberattacks, disinformation campaigns or the proliferation of fake news.

According to a U.S. government study on big data and privacy, biased algorithms could make it easier to mask discriminatory lending, hiring or other unsavory business practices. Algorithms could be designed to take advantage of seemingly innocuous factors that can be discriminatory. Employing existing techniques, but with biased data or algorithms, could make it easier to hide nefarious intent. Commercial data brokers collect and hold onto all kinds of information, such as online browsing or shopping habits, that could be used in this way.

Biased data could also serve as bait. Corporations could release biased data with the hope competitors would use it to train artificial intelligence algorithms, causing competitors to diminish the quality of their own products and consumer confidence in them.

Algorithmic bias attacks could also be used to more easily advance ideological agendas. If hate groups or political advocacy organizations want to target or exclude people on the basis of race, gender, religion or other characteristics, biased algorithms could give them either the justification or more advanced means to directly do so. Biased data also could come into play in redistricting efforts that entrench racial segregation (“redlining”) or restrict voting rights.

Finally, national security threats from foreign actors could use deliberate bias attacks to destabilize societies by undermining government legitimacy or sharpening public polarization. This would fit naturally with tactics that reportedly seek to exploit ideological divides by creating social media posts and buying online ads designed to inflame racial tensions….(More)”.