Online Political Microtargeting: Promises and Threats for Democracy


Frederik Zuiderveen Borgesius et al. in Utrecht Law Review: “Online political microtargeting involves monitoring people’s online behaviour, and using the collected data, sometimes enriched with other data, to show people targeted political advertisements. Online political microtargeting is widely used in the US; Europe may not be far behind.

This paper maps microtargeting’s promises and threats to democracy. For example, microtargeting promises to optimise the match between the electorate’s concerns and political campaigns, and to boost campaign engagement and political participation. But online microtargeting could also threaten democracy. For instance, a political party could, misleadingly, present itself as a different one-issue party to different individuals. And data collection for microtargeting raises privacy concerns. We sketch possibilities for policymakers if they seek to regulate online political microtargeting. We discuss which measures would be possible, while complying with the right to freedom of expression under the European Convention on Human Rights….(More)”.

The Rise of Virtual Citizenship


James Bridle in The Atlantic: “In Cyprus, Estonia, the United Arab Emirates, and elsewhere, passports can now be bought and sold….“If you believe you are a citizen of the world, you are a citizen of nowhere. You don’t understand what citizenship means,” the British prime minister, Theresa May, declared in October 2016. Not long after, at his first postelection rally, Donald Trump asserted, “There is no global anthem. No global currency. No certificate of global citizenship. We pledge allegiance to one flag and that flag is the American flag.” And in Hungary, Prime Minister Viktor Orbán has increased his national-conservative party’s popularity with statements like “all the terrorists are basically migrants” and “the best migrant is the migrant who does not come.”

Citizenship and its varying legal definition have become one of the key battlegrounds of the 21st century, as nations attempt to stake out their power in a G-Zero, globalized world, one increasingly defined by transnational, borderless trade and liquid, virtual finance. In a climate of pervasive nationalism, jingoism, xenophobia, and ever-building resentment toward those who move, it’s tempting to think that moving would become more difficult. But alongside the rise of populist, identitarian movements across the globe, identity itself is being virtualized, too. It no longer needs to be tied to place or nation to function in the global marketplace.

Hannah Arendt called citizenship “the right to have rights.” Like any other right, it can be bestowed and withheld by those in power, but in its newer forms it can also be bought, traded, and rewritten. Virtual citizenship is a commodity that can be acquired through the purchase of real estate or financial investments, subscribed to via an online service, or assembled by peer-to-peer digital networks. And as these options become available, they’re also used, like so many technologies, to exclude those who don’t fit in.

In a world that increasingly operates online, geography and physical infrastructure still remain crucial to control and management. Undersea fiber-optic cables trace the legacy of imperial trading routes. Google and Facebook erect data centers in Scandinavia and the Pacific Northwest, close to cheap hydroelectric power and natural cooling. The trade in citizenship itself often manifests locally as architecture. From luxury apartments in the Caribbean and the Mediterranean to data centers in Europe and refugee settlements in the Middle East, a scattered geography of buildings brings a different reality into focus: one in which political decisions and national laws transform physical space into virtual territory…(More)”.

How Blockchain can benefit migration programmes and migrants


Solon Ardittis at the Migration Data Portal: “According to a recent report published by CB Insights, there are today at least 36 major industries that are likely to benefit from the use of Blockchain technology, ranging from voting procedures, critical infrastructure security, education and healthcare, to car leasing, forecasting, real estate, energy management, government and public records, wills and inheritance, corporate governance and crowdfunding.

In the international aid sector, a number of experiments are currently being conducted to distribute aid funding through the use of Blockchain and thus to improve the tracing of the ways in which aid is disbursed. Among several other examples, the Start Network, which consists of 42 aid agencies across five continents, ranging from large international organizations to national NGOs, has launched a Blockchain-based project that enables the organization both to speed up the distribution of aid funding and to facilitate the tracing of every single payment, from the original donor to each individual assisted.

As Katherine Purvis of The Guardian noted, “Blockchain enthusiasts are hopeful it could be the next big development disruptor. In providing a transparent, instantaneous and indisputable record of transactions, its potential to remove corruption and provide transparency and accountability is one area of intrigue.”

In the field of international migration and refugee affairs, however, Blockchain technology is still in its infancy.

One of the few notable examples is the launch by the United Nations (UN) World Food Programme (WFP) in May 2017 of a project in the Azraq Refugee Camp in Jordan which, through the use of Blockchain technology, enables the creation of virtual accounts for refugees and the uploading of monthly entitlements that can be spent in the camp’s supermarket through the use of an authorization code. Reportedly, the programme has contributed to a 98% reduction in the bank costs entailed by the use of a financial service provider.

This is a noteworthy achievement considering that organizations working in international relief can lose up to 3.5% of each aid transaction to various fees and costs and that an estimated 30% of all development funds do not reach their intended recipients because of third-party theft or mismanagement.
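
To make the account-and-entitlement pattern described above more concrete, here is a minimal, purely illustrative Python sketch. It is not the WFP’s actual system: it simulates an append-only ledger in memory, with hypothetical account identifiers and authorization codes, simply to show how monthly entitlement uploads and code-authorised redemptions could be recorded and reconciled.

```python
import hashlib
import json
import time

class EntitlementLedger:
    """Toy append-only ledger: each block records one credit or redemption.

    Purely illustrative -- a real deployment would use a distributed
    blockchain with consensus and identity management, not an in-memory list.
    """

    def __init__(self):
        self.chain = []

    def _append(self, record: dict) -> dict:
        prev_hash = self.chain[-1]["hash"] if self.chain else "0" * 64
        block = {"record": record, "prev_hash": prev_hash, "ts": time.time()}
        block["hash"] = hashlib.sha256(
            json.dumps(block, sort_keys=True).encode()).hexdigest()
        self.chain.append(block)
        return block

    def credit(self, account_id: str, amount: float) -> dict:
        """Upload a monthly entitlement to a beneficiary's virtual account."""
        return self._append({"type": "credit", "account": account_id,
                             "amount": amount})

    def redeem(self, account_id: str, amount: float, auth_code: str) -> dict:
        """Record a supermarket purchase authorised by a one-time code."""
        if self.balance(account_id) < amount:
            raise ValueError("insufficient entitlement")
        return self._append({"type": "redeem", "account": account_id,
                             "amount": amount, "auth": auth_code})

    def balance(self, account_id: str) -> float:
        total = 0.0
        for block in self.chain:
            rec = block["record"]
            if rec["account"] == account_id:
                total += rec["amount"] if rec["type"] == "credit" else -rec["amount"]
        return total


ledger = EntitlementLedger()
ledger.credit("refugee-001", 35.0)          # monthly entitlement upload
ledger.redeem("refugee-001", 12.5, "A7F3")  # purchase at the camp supermarket
print(ledger.balance("refugee-001"))        # 22.5
```

The point here is only the data model: every transfer is a tamper-evident record that can be traced from donor to individual recipient, which is what drives the cost and accountability gains described above.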

At least six other UN agencies including the UN Office for Project Services (UNOPS), the UN Development Programme (UNDP), the UN Children’s Fund (UNICEF), UN Women, the UN High Commissioner for Refugees (UNHCR) and the UN Development Group (UNDG), are now considering Blockchain applications that could help support international assistance, particularly supply chain management tools, self-auditing of payments, identity management and data storage.

The potential of Blockchain technology in the field of migration and asylum affairs should therefore be fully explored.

At the European Union (EU) level, while a Blockchain task force has been established by the European Parliament to assess the ways in which the technology could be used to provide digital identities to refugees, and while the European Commission has recently launched a call for project proposals to examine the potential of Blockchain in a range of sectors, little focus has been placed so far on EU assistance in the field of migration and asylum, both within the EU and in third countries with which the EU has negotiated migration partnership agreements.

This is despite the fact that the use of Blockchain in a number of major programme interventions in the field of migration and asylum could help improve not only their cost-efficiency but also, at least as importantly, their degree of transparency and accountability. This comes at a time when media and civil society organizations exercise increased scrutiny over the quality and ethical standards of such interventions.

In Europe, for example, Blockchain could help administer the EU Asylum, Migration and Integration Fund (AMIF), both in terms of transferring funds from the European Commission to the eligible NGOs in the Member States and in terms of project managers then reporting on spending. This would help alleviate many of the recurrent challenges faced by NGOs in managing funds in line with stringent EU regulations.

As crucially, Blockchain would have the potential to increase transparency and accountability in the channeling and spending of EU funds in third countries, particularly under the Partnership Framework and other recent schemes to prevent irregular migration to Europe.

A case in point is the administration of EU aid in response to the refugee emergency in Greece where, reportedly, there continues to be insufficient oversight of the full range of commitments and outcomes of large EU-funded investments, particularly in the housing sector. Another example is the set of recent programme interventions in Libya, where a growing number of incidents of human rights abuses and financial mismanagement are being brought to light….(More)”.

Data Collaboratives can transform the way civil society organisations find solutions


Stefaan G. Verhulst at Disrupt & Innovate: “The need for innovation is clear: The twenty-first century is shaping up to be one of the most challenging in recent history. From climate change to income inequality to geopolitical upheaval and terrorism: the difficulties confronting International Civil Society Organisations (ICSOs) are unprecedented not only in their variety but also in their complexity. At the same time, today’s practices and tools used by ICSOs seem stale and outdated. Increasingly, it is clear, we need not only new solutions but new methods for arriving at solutions.

Data will likely become more central to meeting these challenges. We live in a quantified era. It is estimated that 90% of the world’s data was generated in just the last two years. We know that this data can help us understand the world in new ways and help us meet the challenges mentioned above. However, we need new data collaboration methods to help us extract the insights from that data.

UNTAPPED DATA POTENTIAL

For all of data’s potential to address public challenges, the truth remains that most data generated today is in fact collected by the private sector – including ICSOs who are often collecting a vast amount of data – such as, for instance, the International Committee of the Red Cross, which generates various (often sensitive) data related to humanitarian activities. This data, typically ensconced in tightly held databases to maintain competitive advantage or to protect against harmful intrusion, contains tremendous possible insights and avenues for innovation in how we solve public problems. But because of access restrictions and often limited data science capacity, its vast potential often goes untapped.

DATA COLLABORATIVES AS A SOLUTION

Data Collaboratives offer a way around this limitation. They represent an emerging public-private partnership model, in which participants from different areas — including the private sector, government, and civil society — come together to exchange data and pool analytical expertise.

While still an emerging practice, examples of such partnerships now exist around the world, across sectors and public policy domains. Importantly, several ICSOs have started to collaborate with others around their own data and that of the private and public sector. For example:

  • Several civil society organisations, academics, and donor agencies are partnering in the Health Data Collaborative to improve the global data infrastructure necessary to make smarter global and local health decisions and to track progress against the Sustainable Development Goals (SDGs).
  • Additionally, the UN Office for the Coordination of Humanitarian Affairs (UNOCHA) built the Humanitarian Data Exchange (HDX), a platform for sharing humanitarian data from and for ICSOs – including Caritas, InterAction and others – donor agencies, national and international bodies, and other humanitarian organisations.

These are a few examples of Data Collaboratives that ICSOs are participating in. Yet, the potential for collaboration goes beyond these examples. Likewise, so do the concerns regarding data protection and privacy….(More)”.

Big data and food retail: Nudging out citizens by creating dependent consumers


Michael Carolan at Geoforum: “The paper takes a critical look at how food retail firms use big data, looking specifically at how these techniques and technologies govern our ability to imagine food worlds. It does this by drawing on two sets of data: (1) interviews with twenty-one individuals who oversaw the use of big data applications in a retail setting and (2) five consumer focus groups composed of individuals who regularly shopped at major food chains along Colorado’s Front Range.

For reasons described below, the “nudge” provides the conceptual entry point for this analysis, as these techniques are typically expressed through big data-driven nudges. The argument begins by describing the nudge concept and how it is used in the context of retail big data. This is followed by a discussion of methods.

The remainder of the paper discusses how big data are used to nudge consumers and the effects of these practices. This analysis is organized around three themes that emerged out of the qualitative data: path dependency, products; path dependency, retail; and path dependency, habitus. The paper concludes by connecting these themes through the concept of governance, particularly by way of their ability to, in Foucault’s (2003: 241) words, have “the power to ‘make’ live and ‘let’ die” worlds….(More)”.

The future of statistics and data science


Paper by Sofia C. Olhede and Patrick J. Wolfe in Statistics & Probability Letters: “The Danish physicist Niels Bohr is said to have remarked: “Prediction is very difficult, especially about the future”. Predicting the future of statistics in the era of big data is not so very different from prediction about anything else. Ever since we started to collect data to predict cycles of the moon, seasons, and hence future agriculture yields, humankind has worked to infer information from indirect observations for the purpose of making predictions.

Even while acknowledging the momentous difficulty in making predictions about the future, a few topics stand out clearly as lying at the current and future intersection of statistics and data science. Not all of these topics are of a strictly technical nature, but all have technical repercussions for our field. How might these repercussions shape the still relatively young field of statistics? And what can sound statistical theory and methods bring to our understanding of the foundations of data science? In this article we discuss these issues and explore how new open questions motivated by data science may in turn necessitate new statistical theory and methods now and in the future.

Together, the ubiquity of sensing devices, the low cost of data storage, and the commoditization of computing have led to a volume and variety of modern data sets that would have been unthinkable even a decade ago. We see four important implications for statistics.

First, many modern data sets are related in some way to human behavior. Data might have been collected by interacting with human beings, or personal or private information traceable back to a given set of individuals might have been handled at some stage. Mathematical or theoretical statistics traditionally does not concern itself with the finer points of human behavior, and indeed many of us have only had limited training in the rules and regulations that pertain to data derived from human subjects. Yet inevitably in a data-rich world, our technical developments cannot be divorced from the types of data sets we can collect and analyze, and how we can handle and store them.

Second, the importance of data to our economies and civil societies means that the future of regulation will look not only to protect our privacy, and how we store information about ourselves, but also to include what we are allowed to do with that data. For example, as we collect high-dimensional vectors about many family units across time and space in a given region or country, privacy will be limited by that high-dimensional space, but our wish to control what we do with data will go beyond that….

Third, the growing complexity of algorithms is matched by an increasing variety and complexity of data. Data sets now come in a variety of forms that can be highly unstructured, including images, text, sound, and various other new forms. These different types of observations have to be understood together, resulting in multimodal data, in which a single phenomenon or event is observed through different types of measurement devices. Rather than having one phenomenon corresponding to single scalar values, a much more complex object is typically recorded. This could be a three-dimensional shape, for example in medical imaging, or multiple types of recordings such as functional magnetic resonance imaging and simultaneous electroencephalography in neuroscience. Data science therefore challenges us to describe these more complex structures, modeling them in terms of their intrinsic patterns.

Finally, the types of data sets we now face are far from satisfying the classical statistical assumptions of identically distributed and independent observations. Observations are often “found” or repurposed from other sampling mechanisms, rather than necessarily resulting from designed experiments….

 Our field will either meet these challenges and become increasingly ubiquitous, or risk rapidly becoming irrelevant to the future of data science and artificial intelligence….(More)”.

What if technology could help improve conversations online?


Introduction to “Perspective”: “Discussing things you care about can be difficult. The threat of abuse and harassment online means that many people stop expressing themselves and give up on seeking different opinions….Perspective is an API that makes it easier to host better conversations. The API uses machine learning models to score the perceived impact a comment might have on a conversation. Developers and publishers can use this score to give real-time feedback to commenters, help moderators do their job, or allow readers to more easily find relevant information, as illustrated in two experiments below. We’ll be releasing more machine learning models later in the year, but our first model identifies whether a comment could be perceived as “toxic” to a discussion….(More)”.
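
As a rough illustration of how a developer might call such a scoring service, the sketch below posts a single comment and reads back a TOXICITY summary score. It assumes the commentanalyzer comments:analyze REST endpoint and response fields as documented at the time; the API key is a placeholder, and the exact request and response formats should be checked against the current Perspective API documentation.

```python
import json
import urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder: obtained by registering for Perspective API access
URL = ("https://commentanalyzer.googleapis.com/v1alpha1/"
       "comments:analyze?key=" + API_KEY)

def toxicity_score(comment: str) -> float:
    """Return the model's perceived-toxicity score (0.0 to 1.0) for one comment."""
    payload = {
        "comment": {"text": comment},
        "languages": ["en"],
        "requestedAttributes": {"TOXICITY": {}},
    }
    request = urllib.request.Request(
        URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        result = json.load(response)
    return result["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

# Example use: surface a comment to a human moderator above a chosen threshold.
if toxicity_score("You are such an idiot.") > 0.8:
    print("Flag for moderator review")
```

A publisher could, for instance, route comments above a chosen threshold to human moderation rather than blocking them automatically, keeping the model in an advisory role.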

Who Killed Albert Einstein? From Open Data to Murder Mystery Games


Gabriella A. B. Barros et al. at arXiv: “This paper presents a framework for generating adventure games from open data. Focusing on the murder mystery type of adventure games, the generator is able to transform open data from Wikipedia articles, OpenStreetMap and images from Wikimedia Commons into WikiMysteries. Every WikiMystery game revolves around the murder of a person with a Wikipedia article, and populates the game with suspects who must be arrested by the player if guilty of the murder or absolved if innocent. Starting from only one person as the victim, an extensive generative pipeline finds suspects, their alibis, and paths connecting them from open data, and transforms open data into cities, buildings, non-player characters, locks and keys, and dialog options. The paper describes in detail each generative step, provides a specific playthrough of one WikiMystery where Albert Einstein is murdered, and evaluates the outcomes of games generated for the 100 most influential people of the 20th century….(More)”.
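
The paper’s full pipeline is considerably richer than anything shown here, but a toy version of its first step, harvesting candidate suspects from the links in the victim’s Wikipedia article via the public MediaWiki API, might look like the sketch below; filtering the links down to actual people and generating alibis, cities, buildings and dialog are deliberately omitted.

```python
import json
import urllib.parse
import urllib.request

API = "https://en.wikipedia.org/w/api.php"

def linked_titles(victim: str, limit: int = 25) -> list[str]:
    """Return article titles linked from the victim's Wikipedia page.

    In a WikiMystery-style pipeline these links would be filtered further
    (e.g. to people only, via Wikidata) to become candidate suspects; that
    filtering is omitted here for brevity.
    """
    params = {
        "action": "query",
        "prop": "links",
        "titles": victim,
        "plnamespace": 0,   # main/article namespace only
        "pllimit": "max",
        "format": "json",
    }
    with urllib.request.urlopen(API + "?" + urllib.parse.urlencode(params)) as resp:
        data = json.load(resp)
    pages = data["query"]["pages"]
    links = next(iter(pages.values())).get("links", [])
    return [link["title"] for link in links][:limit]

if __name__ == "__main__":
    print(linked_titles("Albert Einstein"))
```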

Data journalism and the ethics of publishing Twitter data


Matthew L. Williams at Data Driven Journalism: “Collecting and publishing data from social media sites such as Twitter are everyday practices for the data journalist. Recent findings from Cardiff University’s Social Data Science Lab question the practice of publishing Twitter content without seeking some form of informed consent from users beforehand. Researchers found that tweets collected around certain topics, such as those related to terrorism, political votes, changes in the law and health problems, create datasets that might contain sensitive content, such as extreme political opinion, grossly offensive comments, overly personal revelations and threats to life (both to oneself and to others). Handling these data in the process of analysis (such as classifying content as hateful and potentially illegal) and reporting has brought the ethics of using social media in social research and journalism into sharp focus.

Ethics is an issue that is becoming increasingly salient in research and journalism using social media data. The digital revolution has outpaced parallel developments in research governance and agreed good practice. Codes of ethical conduct that were written in the mid twentieth century are being relied upon to guide the collection, analysis and representation of digital data in the twenty-first century. Social media is particularly ethically challenging because of the open availability of the data (particularly from Twitter). Many platforms’ terms of service specifically state that users’ public data will be made available to third parties, and by accepting these terms users legally consent to this. However, researchers and data journalists must interpret and engage with these commercially motivated terms of service through a more reflexive lens, which implies a context-sensitive approach, rather than focusing on the legally permissible uses of these data.

Social media researchers and data journalists have experimented with data from a range of sources, including Facebook, YouTube, Flickr, Tumblr and Twitter to name a few. Twitter is by far the most studied of all these networks. This is because Twitter differs from other networks, such as Facebook, that are organised around groups of ‘friends’, in that it is more ‘open’ and the data (in part) are freely available to researchers. This makes Twitter a more public digital space that promotes the free exchange of opinions and ideas. Twitter has become the primary space for online citizens to publicly express their reaction to events of national significance, and also the primary source of data for social science research into digital publics.

The Twitter streaming API provides three levels of data access: the free random 1% sample, which provides ~5M tweets daily, and the random 10% and 100% streams (chargeable, or free to academic researchers upon request). Datasets on social interactions of this scale, speed and ease of access have been hitherto unrealisable in the social sciences and journalism, and have led to a flood of journal articles and news pieces, many of which include tweets with full text content and author identity without informed consent. This is presumably because of Twitter’s ‘open’ nature, which leads to the assumption that ‘these are public data’ and that using them does not require the rigour and scrutiny of ethical oversight. Even when these data are scrutinised, journalists need little convincing by the ‘public data’ argument, given the lack of a framework for evaluating the potential harms to users. The Social Data Science Lab takes a more ethically reflexive approach to the use of social media data in social research, and carefully considers users’ perceptions, online context and the role of algorithms in estimating potentially sensitive user characteristics.
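
To give a concrete sense of one minimal precaution such a reflexive workflow can include, the sketch below pseudonymises author identifiers and redacts @mentions before collected tweets are analysed or quoted. The field names assume a generic tweet export and the salt is a placeholder; this is an illustration rather than the Lab’s own tooling, and pseudonymisation reduces rather than removes re-identification risk (quoted text can often be searched), so it complements, not replaces, informed consent.

```python
import hashlib
import re

SALT = "replace-with-a-project-secret"  # placeholder: keep the real salt offline

def anonymise_tweet(tweet: dict) -> dict:
    """Return a copy of a tweet record with direct identifiers removed.

    Assumes a generic export with 'user' and 'text' fields; adapt to the
    actual dataset. Pseudonymisation limits, but does not eliminate,
    re-identification risk, since quoted text can often be searched.
    """
    pseudonym = hashlib.sha256((SALT + str(tweet["user"])).encode()).hexdigest()[:12]
    text = re.sub(r"@\w+", "@[redacted]", tweet["text"])  # strip @mentions
    return {"user": pseudonym, "text": text}

example = {"user": "some_handle", "text": "@someone this new law is a disgrace"}
print(anonymise_tweet(example))
```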

A recent Lab survey into users’ perceptions of the use of their social media posts found the following:

  • 94% were aware that social media companies had Terms of Service
  • 65% had read the Terms of Service in whole or in part
  • 76% knew that when accepting Terms of Service they were giving permission for some of their information to be accessed by third parties
  • 80% agreed that if their social media information is used in a publication they would expect to be asked for consent
  • 90% agreed that if their tweets were used without their consent they should be anonymized…(More)”.

Spanning Today’s Chasms: Seven Steps to Building Trusted Data Intermediaries


James Shulman at the Mellon Foundation: “In 2001, when hundreds of individual colleges and universities were scrambling to scan their slide libraries, The Andrew W. Mellon Foundation created a new organization, Artstor, to assemble a massive library of digital images from disparate sources to support teaching and research in the arts and humanities.

Rather than encouraging—or paying for—each school to scan its own slide of the Mona Lisa, the Mellon Foundation created an intermediary organization that would balance the interests of those who created, photographed and cared for art works, such as artists and museums, and those who wanted to use such images for the admirable calling of teaching and studying history and culture.  This organization would reach across the gap that separated these two communities and would respect and balance the interests of both sides, while helping each accomplish their missions.  At the same time that Napster was using technology to facilitate the un-balanced transfer of digital content from creators to users, the Mellon Foundation set up a new institution aimed at respecting the interests of one side of the market and supporting the socially desirable work of the other.

As the internet has enabled the sharing of data across the world, new intermediaries have emerged as entire platforms. A networked world needs such bridges—think Etsy or eBay sitting between sellers and buyers, or Facebook sitting between advertisers and users. While intermediaries that match sellers and buyers of things provide a marketplace to bridge from one side or the other, aggregators of data work in admittedly more shadowy territories.

In the many realms that market forces won’t support, however, a great deal of public good can be done by aggregating and managing access to datasets that might otherwise continue to live in isolation. Whether due to institutional sociology that favors local solutions, the technical challenges associated with merging heterogeneous databases built with different data models, intellectual property limitations, or privacy concerns, datasets are built and maintained by independent groups that—if networked—could be used to further each other’s work.

Think of those studying coral reefs, or those studying labor practices in developing markets, or child welfare offices seeking to call upon court records in different states, or medical researchers working in different sub-disciplines but on essentially the same disease.  What intermediary invests in joining these datasets?  Many people assume that computers can simply “talk” to each other and share data intuitively, but without targeted investment in connecting them, they can’t.  Unlike modern databases that are now often designed with the cloud in mind, decades of locally created databases churn away in isolation, at great opportunity cost to us all.

Art history research is an unusually vivid example. Most people can understand that if you want to study Caravaggio, you don’t want to hunt and peck across hundreds of museums, books, photo archives, libraries, churches, and private collections.  You want all that content in one place—exactly what Mellon sought to achieve by creating Artstor.

What did we learn in creating Artstor that might be distilled as lessons for others taking on an aggregation project to serve the public good?….(More)”.