What if people were paid for their data?


The Economist: “Data slavery. Jennifer Lyn Morone, an American artist, thinks this is the state in which most people now live. To get free online services, she laments, they hand over intimate information to technology firms. “Personal data are much more valuable than you think,” she says. To highlight this sorry state of affairs, Ms Morone has resorted to what she calls “extreme capitalism”: she registered herself as a company in Delaware in an effort to exploit her personal data for financial gain. She created dossiers containing different subsets of data, which she displayed in a London gallery in 2016 and offered for sale, starting at £100 ($135). The entire collection, including her health data and social-security number, can be had for £7,000.

Only a few buyers have taken her up on this offer and she finds “the whole thing really absurd”. …Given the current state of digital affairs, in which the collection and exploitation of personal data is dominated by big tech firms, Ms Morone’s approach, in which individuals offer their data for sale, seems unlikely to catch on. But what if people really controlled their data—and the tech giants were required to pay for access? What would such a data economy look like?…

Labour, like data, is a resource that is hard to pin down. Workers were not properly compensated for labour for most of human history. Even once people were free to sell their labour, it took decades for wages to reach liveable levels on average. History won’t repeat itself, but chances are that it will rhyme, predicts Glen Weyl, an economist at Microsoft Research, in “Radical Markets”, a provocative new book he has co-written with Eric Posner of the University of Chicago. He argues that in the age of artificial intelligence, it makes sense to treat data as a form of labour.

To understand why, it helps to keep in mind that “artificial intelligence” is something of a misnomer. Messrs Weyl and Posner call it “collective intelligence”: most AI algorithms need to be trained using reams of human-generated examples, in a process called machine learning. Unless they know what the right answers (provided by humans) are meant to be, algorithms cannot translate languages, understand speech or recognise objects in images. Data provided by humans can thus be seen as a form of labour which powers AI. As the data economy grows up, such data work will take many forms. Much of it will be passive, as people engage in all kinds of activities—liking social-media posts, listening to music, recommending restaurants—that generate the data needed to power new services. But some people’s data work will be more active, as they make decisions (such as labelling images or steering a car through a busy city) that can be used as the basis for training AI systems….
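To make the “data as labour” point concrete, here is a minimal, purely illustrative sketch of supervised learning: a classifier can only label new examples because people supplied labels for the training examples first. The reviews, labels, and choice of scikit-learn are assumptions made for illustration and are not drawn from the article.

```python
# Minimal sketch: human-labelled examples ("data work") are what a
# supervised-learning algorithm actually consumes. All data invented.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Imagine each entry is a restaurant review someone wrote, and the label
# is the rating they attached to it -- the "passive" data work.
reviews = [
    "loved the food, great service",
    "terrible wait, cold meal",
    "fantastic atmosphere, would return",
    "rude staff and overpriced",
]
labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative, supplied by humans

# The algorithm cannot invent the right answers; it only generalises
# from the answers people already provided.
vectorizer = CountVectorizer().fit(reviews)
model = LogisticRegression().fit(vectorizer.transform(reviews), labels)

print(model.predict(vectorizer.transform(["great meal, friendly staff"])))
```

Without the human-provided labels there is nothing to fit; the snippet is a toy stand-in for the industrial-scale labelling and behavioural data collection the article describes.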

But much still needs to happen for personal data to be widely considered as labour, and paid for as such. For one thing, the right legal framework will be needed to encourage the emergence of a new data economy. The European Union’s new General Data Protection Regulation, which came into effect in May, already gives people extensive rights to check, download and even delete personal data held by companies. Second, the technology to keep track of data flows needs to become much more capable. Research to calculate the value of particular data to an AI service is in its infancy.

Third, and most important, people will have to develop a “class consciousness” as data workers. Most people say they want their personal information to be protected, but then trade it away for nearly nothing, something known as the “privacy paradox”. Yet things may be changing: more than 90% of Americans think being in control of who can get data on them is important, according to the Pew Research Centre, a think-tank….(More)”.

Virtualization of government‐to‐citizen engagement process: Enablers and constraints


Paper by Joshua Ofoeda et al: “The purpose of this study is to investigate the factors that constrain or enable process virtualization in a government‐to‐citizen engagement process. Past research has established that most e‐government projects, especially in developing countries, are regarded as total or partial failures.

Citizens’ unwillingness to use government electronic services and a lack of awareness are among the reasons why these electronic services fail.

Using process virtualization theory (PVT) as a theoretical lens, the authors investigated the various activities within the driver license acquisition process at the Driver and Vehicle Licensing Authority.

The PVT helped in identifying factors which enable or inhibit the virtualization of the driver license acquisition process in Ghana. Based on survey data from 317 participants, we report that process characteristics in the form of relationship requirements affect citizens’ willingness to use government virtualized processes. Situating the PVT within a developing country context, our findings reveal that some cultural and behavioral attributes, such as socialization, hinder the virtualization of some activities within the driver licensing process….(More)”.

Small Wars, Big Data: The Information Revolution in Modern Conflict


Book by Eli Berman, Joseph H. Felter & Jacob N. Shapiro: “The way wars are fought has changed starkly over the past sixty years. International military campaigns used to play out between large armies at central fronts. Today’s conflicts find major powers facing rebel insurgencies that deploy elusive methods, from improvised explosives to terrorist attacks. Small Wars, Big Data presents a transformative understanding of these contemporary confrontations and how they should be fought. The authors show that a revolution in the study of conflict, enabled by vast data, rich qualitative evidence, and modern methods, yields new insights into terrorism, civil wars, and foreign interventions. Modern warfare is not about struggles over territory but over people; civilians—and the information they might choose to provide—can turn the tide at critical junctures.

The authors draw practical lessons from the past two decades of conflict in locations ranging from Latin America and the Middle East to Central and Southeast Asia. Building an information-centric understanding of insurgencies, the authors examine the relationships between rebels, the government, and civilians. This approach serves as a springboard for exploring other aspects of modern conflict, including the suppression of rebel activity, the role of mobile communications networks, the links between aid and violence, and why conventional military methods might provide short-term success but undermine lasting peace. Ultimately the authors show how the stronger side can almost always win the villages, but why that does not guarantee winning the war.

Small Wars, Big Data provides groundbreaking perspectives on how small wars can be strategized more effectively and won to the benefit of the local population….(More)”.

Sentiment Analysis of Big Data: Methods, Applications, and Open Challenges


Paper by Shahid Shayaa et al at IEEE: “With the development of IoT technologies and the widespread adoption of social media tools and applications, new doors of opportunity have opened for using data analytics to gain meaningful insights from unstructured information. In the era of big data, opinion mining and sentiment analysis (OMSA) have become a useful way to categorize opinions by sentiment and, more generally, to gauge the mood of the public. Moreover, different OMSA techniques have been developed over the years on different datasets and applied in various experimental settings. In this regard, this study presents a comprehensive systematic literature review that discusses both the technical aspects of OMSA (techniques, types) and its non-technical aspects, in the form of application areas. Furthermore, the study also highlights the technical challenges in developing OMSA techniques and the non-technical challenges arising mainly from its application. These challenges are presented as directions for future research….(More)”.
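The survey covers far richer techniques, but as a purely illustrative sketch of what “categorizing opinions by sentiment” means in practice, the toy lexicon-based scorer below classifies short posts as positive, negative, or neutral. The word lists and scoring rule are invented here and are not taken from the paper.

```python
# Toy lexicon-based sentiment scorer (illustrative only; word lists and
# the counting rule are invented, not drawn from the OMSA survey).
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "poor", "angry"}

def sentiment(text: str) -> str:
    """Label a post by counting positive vs. negative lexicon hits."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

posts = ["I love this new phone", "terrible battery and poor screen", "it arrived today"]
print({p: sentiment(p) for p in posts})
```

Real sentiment-analysis systems go well beyond this simple counting rule, which is part of why the survey treats technique development and application at scale as open challenges.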

Migration Data using Social Media


European Commission JRC Technical Report: “Migration is a top political priority for the European Union (EU). Data on international migrant stocks and flows are essential for effective migration management. In this report, we estimated the number of expatriates in 17 EU countries based on the number of Facebook Network users who are classified by Facebook as “expats”. To this end, we proposed a method for correcting the over- or under-representativeness of Facebook Network users compared to countries’ actual populations.

This method uses Facebook penetration rates by age group and gender in the country of previous residence and country of destination of a Facebook expat. The purpose of the Facebook Network expat estimates is not to reproduce migration statistics, but rather to generate separate estimates of expatriates, since migration statistics and Facebook Network expat estimates do not measure the same quantities of interest.
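The report’s estimator is more involved than this, but the core idea of a penetration-rate correction can be sketched roughly as follows: scale each demographic group’s Facebook “expat” count by the inverse of that group’s Facebook penetration rate, so that groups who use the platform less are not under-counted. The groups and numbers below are invented for illustration.

```python
# Simplified sketch of a penetration-rate correction (illustrative only;
# all figures are made up and the actual JRC estimator is more involved).
facebook_expats = {             # Facebook users flagged as "expats",
    ("18-29", "female"): 1200,  # broken down by age group and gender
    ("18-29", "male"):   1500,
    ("30-49", "female"):  800,
    ("30-49", "male"):    900,
}
penetration = {                 # assumed share of each real population
    ("18-29", "female"): 0.80,  # group that uses Facebook
    ("18-29", "male"):   0.75,
    ("30-49", "female"): 0.60,
    ("30-49", "male"):   0.55,
}

# Scale each group's count by the inverse of its penetration rate.
estimate = sum(count / penetration[group] for group, count in facebook_expats.items())
print(f"Corrected expat estimate: {estimate:,.0f}")
```

In the report’s method, penetration rates in both the country of previous residence and the country of destination feed into the correction; the single rate per group used here is a simplification.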

Estimates of social media application users who are classified as expats can be a timely, low-cost, and almost globally available source of information for estimating stocks of international migrants. Our methodology allowed for the timely capture of the increase in Venezuelan migrants in Spain. However, there are important methodological and data integrity issues with using social media data sources for studying migration-related phenomena. For example, our methodology led us to significantly overestimate the number of expats from the Philippines in Spain and in Italy, and there is no evidence that this overestimation is valid. While research on the use of big data sources for migration is in its infancy, and the diffusion of internet technologies in less developed countries is still limited, the use of big data sources can unveil useful insights on quantitative and qualitative characteristics of migration….(More)”.

Reduced‐Boundary Governance: The Advantages of Working Together


Introduction by Jeremy L. Hall and R. Paul Battaglio of a Special Issue of the Public Administration Review: “Collaboration, cooperation, and coproduction are all approaches that reflect the realization that creative solutions look beyond traditional, organizational, and structural boundaries to overcome various capacity deficiencies while working toward shared goals….One of the factors complicating measurement and analysis in multistakeholder approaches to solving problems and delivering services is the inherently intergovernmental and intersectoral nature of the work. Performance now depends on accumulated capacity across organizations, including a special form of capacity—the ability to work together collaboratively. Such activity within a government has been referred to as “whole of government” approaches or “joined up government” (Christensen and Lægreid 2007). We have terms for work across levels of government (intergovernmental relations) and between government and the public and private sectors (intersectoral relations), but on the whole, the creative, collaborative, and interactive activities in which governments are involved today transcend even these neat categories and classifications. We might call this phenomenon reduced‐boundary governance. Moving between levels of government or between sectors often changes the variables that are available for analysis, or at least introduces validity issues associated with differences in measurement and estimation (see Brandsen and Honingh 2016; Nabatchi, Sancino, and Sicilia 2017). Sometimes data are not available at all. And, of course, collaboration or pooling of resources typically occurs on an ad hoc or one‐off basis that is limited to a single problem, a single program, or a single defined period of time, further complicating study and knowledge accumulation.

Increasingly, public service is accomplished together rather than alone. Boundaries between organizations are becoming blurred in new approaches to solving public problems (Christensen and Lægreid 2007). PAR is committed to better understanding the circumstances under which collaboration, cooperation, and coproduction occur. What are the necessary antecedents? What are the deterrents? We are interested in the challenges that organizations face as they pursue collaborative action that transcends boundaries. And, of course, we are interested in the efficiency and performance gains that are achieved as a result of those efforts, as well as in their long‐term sustainability.

In this issue, we feature a series of articles that highlight research that focuses on working together, through collaboration, coproduction, or cooperation. The issue begins with a look at right‐sizing the use of volunteerism in public and nonprofit organizations given their limitations and possibilities (Nesbit, Christensen, and Brudney 2018). Uzochukwu and Thomas (2018) then explore coproduction using a case study of Atlanta to better understand who uses it and why. Klok et al. (2018) present a fascinating look at intermunicipal cooperation through polycentric regional governance in the Netherlands, with an eye toward the costs and effectiveness of those arrangements. McGuire, Hoang, and Prakash (2018) look at the effectiveness of voluntary environmental programs in pollution reduction. Using different policy tools as lenses for analysis, Jung, Malatesta, and LaLonde (2018) ask whether work release programs are improved by working together or working alone. Finally, Yi et al. (2018) explore the role of regional governance and institutional collective action in promoting environmental sustainability. Each of these pieces explores unique dimensions of working together, or governing beyond traditional boundaries….(More)”.

Ways to think about machine learning


Benedict Evans: “We’re now four or five years into the current explosion of machine learning, and pretty much everyone has heard of it. It’s not just that startups are forming every day or that the big tech platform companies are rebuilding themselves around it – everyone outside tech has read the Economist or BusinessWeek cover story, and many big companies have some projects underway. We know this is a Next Big Thing.

Going a step further, we mostly understand what neural networks might be, in theory, and we get that this might be about patterns and data. Machine learning lets us find patterns or structures in data that are implicit and probabilistic (hence ‘inferred’) rather than explicit, that previously only people and not computers could find. They address a class of questions that were previously ‘hard for computers and easy for people’, or, perhaps more usefully, ‘hard for people to describe to computers’. And we’ve seen some cool (or worrying, depending on your perspective) speech and vision demos.

I don’t think, though, that we yet have a settled sense of quite what machine learning means – what it will mean for tech companies or for companies in the broader economy, how to think structurally about what new things it could enable, or what machine learning means for all the rest of us, and what important problems it might actually be able to solve.

This isn’t helped by the term ‘artificial intelligence’, which tends to end any conversation as soon as it’s begun. As soon as we say ‘AI’, it’s as though the black monolith from the beginning of 2001 has appeared, and we all become apes screaming at it and shaking our fists. You can’t analyze ‘AI’.

Indeed, I think one could propose a whole list of unhelpful ways of talking about current developments in machine learning. For example:

  • Data is the new oil
  • Google and China (or Facebook, or Amazon, or BAT) have all the data
  • AI will take all the jobs
  • And, of course, saying AI itself.

More useful things to talk about, perhaps, might be:

  • Automation
  • Enabling technology layers
  • Relational databases. …(More).

Microsoft Research Open Data


Microsoft Research Open Data: “… is a data repository that makes available datasets that researchers at Microsoft have created and published in conjunction with their research. You can browse available datasets and either download them or directly copy them to an Azure-based Virtual Machine or Data Science Virtual Machine. To the extent possible, we follow FAIR (findable, accessible, interoperable and reusable) data principles and will continue to push towards the highest standards for data sharing. We recognize that there are dozens of data repositories already in use by researchers and expect that the capabilities of this repository will augment existing efforts. Datasets are categorized by their primary research area. You can find links to research projects or publications with the dataset.

What is our goal?

Our goal is to provide a simple platform to Microsoft’s researchers and collaborators to share datasets and related research technologies and tools. The site has been designed to simplify access to these data sets, facilitate collaboration between researchers using cloud-based resources, and enable the reproducibility of research. We will continue to evolve and grow this repository and add features to it based on feedback from the community.

How did this project come to be?

Over the past few years, our team, based at Microsoft Research, has worked extensively with the research community to create cloud-based research infrastructure. We started this project as a prototype about a year ago and are excited to finally share it with the research community to support data-intensive research in the cloud. Because almost all research projects have a data component, there is a real need for curated and meaningful datasets in the research community, not only in computer science but in interdisciplinary and domain sciences. We have now made several such datasets available for download or use directly on cloud infrastructure….(More)”.

The Global Council on Extended Intelligence


“The IEEE Standards Association (IEEE-SA) and the MIT Media Lab are joining forces to launch a global Council on Extended Intelligence (CXI) composed of individuals who agree on the following:

One of the most powerful narratives of modern times is the story of scientific and technological progress. While our future will undoubtedly be shaped by the use of existing and emerging technologies – in particular, of autonomous and intelligent systems (A/IS) – there is no guarantee that progress defined by “the next” is beneficial. Growth for humanity’s future should not be defined by reductionist ideas of speed or size alone but as the holistic evolution of our species in positive alignment with the environmental and other systems comprising the modern algorithmic world.

We believe all systems must be responsibly created to best utilize science and technology for tangible social and ethical progress. Individuals, businesses and communities involved in the development and deployment of autonomous and intelligent technologies should mitigate predictable risks at the inception and design phase and not as an afterthought. This will help ensure these systems are created in such a way that their outcomes are beneficial to society, culture and the environment.

Autonomous and intelligent technologies also need to be created via participatory design, where systems thinking can help us avoid repeating past failures stemming from attempts to control and govern the complex-adaptive systems we are part of. Responsible living with or in the systems we are part of requires an awareness of the constrictive paradigms we operate in today. Our future practices will be shaped by our individual and collective imaginations and by the stories we tell about who we are and what we desire, for ourselves and the societies in which we live.

These stories must move beyond the “us versus them” media mentality pitting humans against machines. Autonomous and intelligent technologies have the potential to enhance our personal and social skills; they are much more fully integrated and less discrete than the term “artificial intelligence” implies. And while this process may enlarge our cognitive intelligence or make certain individuals or groups more powerful, it does not necessarily make our systems more stable or socially beneficial.

We cannot create sound governance for autonomous and intelligent systems in the Algorithmic Age while utilizing reductionist methodologies. By proliferating the ideals of responsible participant design, data symmetry, and metrics of economic prosperity that prioritize people and the planet over profit and productivity, the Council on Extended Intelligence will work to transform the reductionist thinking of the past to prepare for a flourishing future.

Three Priority Areas to Fulfill Our Vision

1 – Build a new narrative for intelligent and autonomous technologies inspired by principles of systems dynamics and design.

“Extended Intelligence” is based on the hypothesis that intelligence, ideas, analysis and action are not formed in any one individual collection of neurons or code….

2 – Reclaim our digital identity in the algorithmic age

Business models based on tracking behavior and using outdated modes of consent are compounded by the appetites of states, industries and agencies for all data that may be gathered….

3 – Rethink our metrics for success

Although very widely used, concepts of exponential growth and productivity such as the gross domestic product (GDP) index are insufficient to holistically measure societal prosperity. … (More)”.

Blockchain Ethical Design Framework


Report by Cara LaPointe and Lara Fishbane: “There are dramatic predictions about the potential of blockchain to “revolutionize” everything from worldwide financial markets and the distribution of humanitarian assistance to the very way that we recognize human identity for billions of people around the globe. Some dismiss these claims as excessive technology hype by citing flaws in the technology or the robustness of incumbent solutions and infrastructure.

The reality will likely fall somewhere between these two extremes across multiple sectors. Where initial applications of blockchain were focused on the financial industry, current applications have rapidly expanded to address a wide array of sectors with major implications for social impact.

This paper aims to demonstrate the capacity of blockchain to create scalable social impact and to identify the elements that need to be addressed to mitigate challenges in its application. We are at a moment when technology is enabling society to experiment with new solutions and business models. Ubiquity and global reach, increased capabilities, and affordability have made technology a critical tool for solving problems, making this an exciting time to think about achieving greater social impact. We can address issues for underserved or marginalized people in ways that were previously unimaginable.

Blockchain is a technology that holds real promise for dealing with key inefficiencies and transforming operations in the social sector and for improving lives. Because of its immutability and decentralization, blockchain has the potential to create transparency, provide distributed verification, and build trust across multiple systems. For instance, blockchain applications could provide the means for establishing identities for individuals without identification papers, improving access to finance and banking services for underserved populations, and distributing aid to refugees in a more transparent and efficient manner. Similarly, national and subnational governments are putting land registry information onto blockchains to create greater transparency and avoid corruption and manipulation by third parties.
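A toy sketch of the immutability property the report leans on: in a hash-chained ledger, each block commits to the hash of the previous block, so retroactively editing an earlier entry (say, a land-registry record) breaks the chain and is detectable. The record fields and chain structure below are assumptions for illustration, not any production blockchain design; real systems add consensus, signatures, and replication across many nodes.

```python
# Minimal hash-chained ledger sketch (illustrative only).
import hashlib
import json

def make_block(record: dict, previous_hash: str) -> dict:
    """Bind a record to the chain by hashing it with the previous block's hash."""
    body = {"record": record, "previous_hash": previous_hash}
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    return {**body, "hash": digest}

# A toy land-registry chain: each entry commits to everything before it.
genesis = make_block({"parcel": "A-1", "owner": "Registry"}, previous_hash="0" * 64)
transfer = make_block({"parcel": "A-1", "owner": "New owner"}, previous_hash=genesis["hash"])

# Tamper with the first record: its hash no longer matches the link
# stored in the second block, so the alteration is detectable.
genesis["record"]["owner"] = "Someone else"
recomputed = make_block(genesis["record"], "0" * 64)["hash"]
print("chain still consistent:", recomputed == transfer["previous_hash"])  # prints False
```

This detectability, combined with distribution of copies across many parties, is what underpins the transparency and anti-manipulation claims for applications such as land registries and aid distribution.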

From increasing access to capital, to tracking health and education data across multiple generations, to improving voter records and voting systems, blockchain has countless potential applications for social impact. As developers take on building these types of solutions, the social effects of blockchain can be powerful and lasting. With the potential for such a powerful impact, the design, application, and approach to the development and implementation of blockchain technologies have long-term implications for society and individuals.

This paper outlines why intentionality of design, which is important with any technology, is particularly crucial with blockchain, and offers a framework to guide policymakers and social impact organizations. As social media, cryptocurrencies, and algorithms have shown, technology is not neutral. Values are embedded in the code. How the problem is defined and by whom, who is building the solution, how it gets programmed and implemented, who has access, and what rules are created have consequences, in intentional and unintentional ways. In the applications and implementation of blockchain, it is critical to understand that seemingly innocuous design choices have resounding ethical implications on people’s lives.

This white paper addresses why intentionality of design matters, identifies the key questions that should be asked, and provides a framework to approach use of blockchain, especially as it relates to social impact. It examines the key attributes of blockchain, its broad applicability as well as its particular potential for social impact, and the challenges in fully realizing that potential. Social impact organizations and policymakers have an obligation to understand the ethical approaches used in designing blockchain technology, especially how they affect marginalized and vulnerable populations….(More)”