Guide to Mobile Data Analytics in Refugee Scenarios


Book edited by Albert Ali Salah, Alex Pentland, Bruno Lepri and Emmanuel Letouzé: “After the start of the Syrian Civil War in 2011–12, increasing numbers of civilians sought refuge in neighboring countries. By May 2017, Turkey had received over 3 million refugees — the largest refugee population in the world. Some lived in government-run camps near the Syrian border, but many moved to cities looking for work and better living conditions. They faced problems of integration, income, welfare, employment, health, education, language, social tension, and discrimination. In order to develop sound policies to solve these interlinked problems, a good understanding of refugee dynamics is necessary.

This book summarizes the most important findings of the Data for Refugees (D4R) Challenge, a non-profit project initiated to improve the conditions of Syrian refugees in Turkey by providing a database for the scientific community to enable research on urgent problems concerning refugees. The database, based on anonymized mobile call detail records (CDRs) of phone calls and SMS messages from one million Türk Telekom customers, indicates the broad activity and mobility patterns of refugees and citizens in Turkey for the period 1 January to 31 December 2017. Over 100 teams from around the globe applied to take part in the challenge, and 61 teams were granted access to the data.

This book describes the challenge and presents selected and revised project reports on five major themes: unemployment, health, education, social integration, and safety. These are complemented by additional invited chapters describing related projects from international governmental organizations, technological infrastructure, as well as ethical aspects. The last chapter includes policy recommendations based on the lessons learned.

The book will serve as a guideline for creating innovative data-centered collaborations between industry, academia, government, and non-profit humanitarian agencies to deal with complex problems in refugee scenarios. It illustrates the possibilities of big data analytics in coping with refugee crises and humanitarian responses, by showcasing innovative approaches drawing on multiple data sources, information visualization, pattern analysis, and statistical analysis. It will also provide researchers and students working with mobility data with excellent coverage of data science, economics, sociology, urban computing, education, migration studies, and more….(More)”.

To Regain Policy Competence: The Software of American Public Problem-Solving


Philip Zelikow at the Texas National Security Review: “Policymaking is a discipline, a craft, and a profession. Policymakers apply specialized knowledge — about other countries, politics, diplomacy, conflict, economics, public health, and more — to the practical solution of public problems. Effective policymaking is difficult. The “hardware” of policymaking — the tools and structures of government that frame the possibilities for useful work — is obviously important. Less obvious is that policy performance in practice often rests more on the “software” of public problem-solving: the way people size up problems, design actions, and implement policy. In other words, the quality of the policymaking.

Like policymaking, engineering is a discipline, a craft, and a profession. Engineers learn how to apply specialized knowledge — about chemistry, physics, biology, hydraulics, electricity, and more — to the solution of practical problems. Effective engineering is similarly difficult. People work hard to learn how to practice it with professional skill. But, unlike the methods taught for engineering, the software of policy work is rarely recognized or studied. It is not adequately taught. There is no canon or norms of professional practice. American policymaking is less about deliberate engineering and more about improvised guesswork and bureaucratized habits.

My experience is as a historian who studies the details of policy episodes and the related staff work, but also as a former official who has analyzed a variety of domestic and foreign policy issues at all three levels of American government, including federal work from different bureaucratic perspectives in five presidential administrations from Ronald Reagan to Barack Obama. From this historical and contemporary vantage point, I am struck (and a bit depressed) that the quality of U.S. policy engineering is actually much, much worse in recent decades than it was throughout much of the 20th century. This is not a partisan observation — the decline spans both Republican and Democratic administrations.

I am not alone in my observations. Francis Fukuyama recently concluded that, “[T]he overall quality of the American government has been deteriorating steadily for more than a generation,” notably since the 1970s. In the United States, “the apparently irreversible increase in the scope of government has masked a large decay in its quality.”1 This worried assessment is echoed by other nonpartisan and longtime scholars who have studied the workings of American government.2 The 2003 National Commission on Public Service observed,

The notion of public service, once a noble calling proudly pursued by the most talented Americans of every generation, draws an indifferent response from today’s young people and repels many of the country’s leading private citizens. … The system has evolved not by plan or considered analysis but by accretion over time, politically inspired tinkering, and neglect. … The need to improve performance is urgent and compelling.3

And they wrote that as the American occupation of Iraq was just beginning.

In this article, I offer hypotheses to help explain why American policymaking has declined, and why it was so much more effective in the mid-20th century than it is today. I offer a brief sketch of how American education about policy work evolved over the past hundred years, and I argue that the key software qualities that made for effective policy engineering neither came out of the academy nor migrated back into it.

I then outline a template for doing and teaching policy engineering. I break the engineering methods down into three interacting sets of analytical judgments: about assessment, design, and implementation. In teaching, I lean away from new, cumbersome standalone degree programs and toward more flexible forms of education that can pair more easily with many subject-matter specializations. I emphasize the value of practicing methods in detailed and more lifelike case studies. I stress the significance of an organizational culture that prizes written staff work of the quality that used to be routine but has now degraded into bureaucratic or opinionated dross….(More)”.

Real-time maps warn Hong Kong protesters of water cannons and riot police


Mary Hui at Quartz: “The “Be Water” nature of Hong Kong’s protests means that crowds move quickly and spread across the city. They might stage a protest in the central business district one weekend, then industrial neighborhoods and far-flung suburban towns the next. And a lot is happening at any one time at each protest. One of the key difficulties for protesters is to figure out what’s happening in the crowded, fast-changing, and often chaotic circumstances.

Citizen-led efforts to map protests in real-time are an attempt to address those challenges and answer some pressing questions for protesters and bystanders alike: Where should they go? Where have tear gas and water cannons been deployed? Where are police advancing, and are there armed thugs attacking civilians?

One of the most widely used real-time maps of the protests is HKMap.live, a volunteer-run and crowdsourced effort that officially launched in early August. It’s a dynamic map of Hong Kong that users can zoom in and out of, much like Google Maps. But in addition to detailed street and building names, this one features various emoji to communicate information at a glance: a dog for police, a worker in a yellow hardhat for protesters, a dinosaur for the police’s black-clad special tactical squad, a white speech-bubble for tear gas, two exclamation marks for danger.

HKMap during a protest on August 31, 2019.

Founded by a finance professional in his 20s who wished only to be identified as Kuma, HKMap is an attempt to level the playing field between protesters and officers, he said in an interview over chat app Telegram. While earlier in the protest movement people relied on text-based, on-the-ground live updates through public Telegram channels, Kuma found these to be too scattered to be effective, and hard to visualize unless someone knew the particular neighborhood inside out.

“The huge asymmetric information between protesters and officers led to multiple occasions of surround and capture,” said Kuma. Passersby and non-frontline protesters could also make use of the map, he said, to avoid tense conflict zones. After some of his friends were arrested in late July, he decided to build HKMap….(More)”.
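The article does not describe HKMap.live's internals, but the core idea — crowdsourced, categorized sightings that must expire quickly to stay trustworthy — can be sketched in a few lines. This is a hypothetical illustration; the `Report` structure, category names, and the 10-minute expiry window are assumptions, with the legend mirroring the emoji described above.

```python
from dataclasses import dataclass

# Hypothetical legend matching the article's description of HKMap's emoji
LEGEND = {"police": "dog", "protester": "hardhat worker",
          "tactical_squad": "dinosaur", "tear_gas": "speech bubble",
          "danger": "double exclamation"}

@dataclass
class Report:
    lat: float
    lon: float
    category: str
    timestamp: float  # Unix seconds when the sighting was posted

def active_reports(reports, now, ttl=600):
    """Keep only reports newer than `ttl` seconds, so stale sightings expire."""
    return [r for r in reports if now - r.timestamp <= ttl]

reports = [Report(22.28, 114.16, "police", 0.0),
           Report(22.30, 114.17, "tear_gas", 900.0)]
print([r.category for r in active_reports(reports, now=1000.0)])  # → ['tear_gas']
```

The expiry rule is the key design choice: on a fast-moving "Be Water" protest map, a ten-minute-old police sighting is arguably worse than none at all.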

Index: The Data Universe 2019


By Michelle Winowatan, Andrew J. Zahuranec, Andrew Young, Stefaan Verhulst, Max Jun Kim

The Living Library Index – inspired by the Harper’s Index – provides important statistics and highlights global trends in governance innovation. This installment focuses on the data universe.

Please share any additional, illustrative statistics on data, or other issues at the nexus of technology and governance, with us at info@thelivinglib.org.

Internet Traffic:

  • Percentage of the world’s population that uses the internet: 51.2% (3.9 billion people) – 2018
  • Number of searches processed worldwide by Google every year: at least 2 trillion – 2016
  • Website traffic worldwide generated through mobile phones: 52.2% – 2018
  • Total number of mobile subscriptions in the first quarter of 2019: 7.9 billion (44 million added during the quarter) – 2019
  • Amount of mobile data traffic worldwide: nearly 30 billion GB – 2018
  • Data category with highest traffic worldwide: video (60%) – 2018
  • Global average of data traffic per smartphone per month: 5.6 GB – 2018
    • North America: 7 GB – 2018
    • Latin America: 3.1 GB – 2018
    • Western Europe: 6.7 GB – 2018
    • Central and Eastern Europe: 4.5 GB – 2018
    • North East Asia: 7.1 GB – 2018
    • Southeast Asia and Oceania: 3.6 GB – 2018
    • India, Nepal, and Bhutan: 9.8 GB – 2018
    • Middle East and Africa: 3.0 GB – 2018
  • Time between the creation of each new bitcoin block: 9.27 minutes – 2019
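The block-interval figure above is simply the mean gap between consecutive block timestamps. A minimal sketch of that calculation (the timestamps here are made-up values roughly ten minutes apart, not real chain data):

```python
def avg_block_interval_minutes(timestamps):
    """Mean gap in minutes between consecutive block timestamps (Unix seconds)."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return sum(gaps) / len(gaps) / 60

# Four hypothetical blocks mined roughly ten minutes apart
ts = [1_546_300_800, 1_546_301_400, 1_546_301_970, 1_546_302_540]
print(round(avg_block_interval_minutes(ts), 2))  # → 9.67
```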

Streaming Services:

  • Total hours of video streamed by Netflix users every minute: 97,222 – 2017
  • Hours of YouTube watched per day: over 1 billion – 2018
  • Number of tracks uploaded to Spotify every day: Over 20,000 – 2019
  • Number of Spotify’s monthly active users: 232 million – 2019
  • Spotify’s total subscribers: 108 million – 2019
  • Spotify’s total hours of content listened to: 17 billion – 2019
  • Total number of songs on Spotify’s catalog: over 30 million – 2019
  • Apple Music’s total subscribers: 60 million – 2019
  • Total number of songs on Apple Music’s catalog: 45 million – 2019

Social Media:

Calls and Messaging:

Retail/Financial Transactions:

  • Number of packages shipped by Amazon in a year: 5 billion – 2017
  • Total value of payments processed by Venmo in a year: USD 62 billion – 2019
  • Based on a non-representative survey of 2,436 US consumers between the ages of 21 and 72 on P2P platforms:
    • The average volume of transactions handled by Venmo: USD 64.2 billion – 2019
    • The average volume of transactions handled by Zelle: USD 122.0 billion – 2019
    • The average volume of transactions handled by PayPal: USD 141.8 billion – 2019 
    • Platform with the highest percent adoption among all consumers: PayPal (48%) – 2019 

Internet of Things:


Misinformation Has Created a New World Disorder


Claire Wardle at Scientific American: “…Online misinformation has been around since the mid-1990s. But in 2016 several events made it broadly clear that darker forces had emerged: automation, microtargeting and coordination were fueling information campaigns designed to manipulate public opinion at scale. Journalists in the Philippines started raising flags as Rodrigo Duterte rose to power, buoyed by intensive Facebook activity. This was followed by unexpected results in the Brexit referendum in June and then the U.S. presidential election in November—all of which sparked researchers to systematically investigate the ways in which information was being used as a weapon.

During the past three years the discussion around the causes of our polluted information ecosystem has focused almost entirely on actions taken (or not taken) by the technology companies. But this fixation is too simplistic. A complex web of societal shifts is making people more susceptible to misinformation and conspiracy. Trust in institutions is falling because of political and economic upheaval, most notably through ever widening income inequality. The effects of climate change are becoming more pronounced. Global migration trends spark concern that communities will change irrevocably. The rise of automation makes people fear for their jobs and their privacy.

Bad actors who want to deepen existing tensions understand these societal trends, designing content that they hope will so anger or excite targeted users that the audience will become the messenger. The goal is that users will use their own social capital to reinforce and give credibility to that original message.

Most of this content is designed not to persuade people in any particular direction but to cause confusion, to overwhelm and to undermine trust in democratic institutions from the electoral system to journalism. And although much is being made about preparing the U.S. electorate for the 2020 election, misleading and conspiratorial content did not begin with the 2016 presidential race, and it will not end after this one. As tools designed to manipulate and amplify content become cheaper and more accessible, it will be even easier to weaponize users as unwitting agents of disinformation….(More)”.

Credit: Jen Christiansen; Source: Information Disorder: Toward an Interdisciplinary Framework for Research and Policymaking, by Claire Wardle and Hossein Derakhshan. Council of Europe, October 2017

The Internet Freedom League: How to Push Back Against the Authoritarian Assault on the Web


Essay by Richard A. Clarke and Rob Knake in Foreign Affairs: “The early days of the Internet inspired a lofty dream: authoritarian states, faced with the prospect of either connecting to a new system of global communication or being left out of it, would choose to connect. According to this line of utopian thinking, once those countries connected, the flow of new information and ideas from the outside world would inexorably pull them toward economic openness and political liberalization. In reality, something quite different has happened. Instead of spreading democratic values and liberal ideals, the Internet has become the backbone of authoritarian surveillance states all over the world. Regimes in China, Russia, and elsewhere have used the Internet’s infrastructure to build their own national networks. At the same time, they have installed technical and legal barriers to prevent their citizens from reaching the wider Internet and to limit Western companies from entering their digital markets.

But despite handwringing in Washington and Brussels about authoritarian schemes to split the Internet, the last thing Beijing and Moscow want is to find themselves relegated to their own networks and cut off from the global Internet. After all, they need access to the Internet to steal intellectual property, spread propaganda, interfere with elections in other countries, and threaten critical infrastructure in rival countries. China and Russia would ideally like to re-create the Internet in their own images and force the world to play by their repressive rules. But they haven’t been able to do that—so instead they have ramped up their efforts to tightly control outside access to their markets, limit their citizens’ ability to reach the wider Internet, and exploit the vulnerability that comes with the digital freedom and openness enjoyed in the West.

The United States and its allies and partners should stop worrying about the risk of authoritarians splitting the Internet. Instead, they should split it themselves, by creating a digital bloc within which data, services, and products can flow freely…(More)”.

Fostering an Enabling Policy and Regulatory Environment in APEC for Data-Utilizing Businesses


APEC: “The objectives of this study are to better understand: 1) how firms from different sectors use data in their business models; and, considering the significant increase in data-related policies and regulations enacted by governments across the world, 2) how such policies and regulations are affecting their use of data and hence their business models. The study also tries 3) to identify some of the middle-ground approaches that would enable governments to achieve public policy objectives, such as data security and privacy, and, at the same time, promote the growth of data-utilizing businesses. 39 firms from 12 economies participated in this project, and they come from a diverse group of industries, including aviation, logistics, shipping, payment services, encryption services, and manufacturing. The synthesis report can be found in Chapter 1, while the case study chapters can be found in Chapters 2 to 10….(More)”.

Companies Collect a Lot of Data, But How Much Do They Actually Use?


Article by Priceonomics Data Studio: “For all the talk of how data is the new oil and the most valuable resource of any enterprise, there is a deep dark secret companies are reluctant to share — most of the data collected by businesses simply goes unused.

This unknown and unused data, known as dark data, comprises more than half the data collected by companies. Given that some estimates indicate that 7.5 septillion (7,700,000,000,000,000,000,000) gigabytes of data are generated every single day, not using most of it is a considerable issue.

In this article, we’ll look at this dark data: just how much of it companies create, why it isn’t being analyzed, and what the costs and implications are of companies not using the majority of the data they collect.

Before diving into the analysis, it’s worth spending a moment clarifying what we mean by the term “dark data.” Gartner defines dark data as:

“The information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes (for example, analytics, business relationships and direct monetizing).”

To learn more about this phenomenon, Splunk commissioned a global survey of 1,300+ business leaders to better understand how much data they collect, and how much of it is dark. Respondents were from IT and business roles across various industries, and were located in Australia, China, France, Germany, Japan, the United States, and the United Kingdom. For the report, Splunk defines dark data as: “all the unknown and untapped data across an organization, generated by systems, devices and interactions.”

While the cost of storing data has decreased over time, the cost of saving septillions of gigabytes of wasted data is still significant. What’s more, during this time the strategic importance of data has increased as companies have found more and more uses for it. Given the cost of storage and the value of data, why does so much of it go unused?
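A rough back-of-the-envelope sketch of that storage bill (the per-gigabyte rate below is an assumed round number for illustration, not a figure from the report):

```python
# Assumed flat price per gigabyte-month of commodity cloud storage;
# a made-up round figure for illustration, not a quoted market rate.
COST_PER_GB_MONTH_USD = 0.02

def monthly_storage_cost(gigabytes, rate=COST_PER_GB_MONTH_USD):
    """Naive monthly storage bill for a given volume of retained data."""
    return gigabytes * rate

# Storing a single petabyte (1 million GB) of unused "dark" data:
print(f"${monthly_storage_cost(1_000_000):,.0f} per month")  # → $20,000 per month
```

Even under generous pricing assumptions, retaining data that is never analyzed is a recurring cost with no offsetting return.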

The following chart shows the reasons why dark data isn’t currently being harnessed:

By a large margin, the number one reason given for not using dark data is that companies lack a tool to capture or analyze the data. Companies accumulate data from server logs, GPS networks, security tools, call records, web traffic and more. Companies track everything from digital transactions to the temperature of their server rooms to the contents of retail shelves. Most of this data lies in separate systems, is unstructured, and cannot be connected or analyzed.

Second, the data captured just isn’t good enough. You might have important customer information about a transaction, but it’s missing location or other important metadata because that information sits somewhere else or was never captured in a usable format.

Additionally, dark data exists because there is simply too much data out there, and a lot of it is unstructured. The larger the dataset (or the less structured it is), the more sophisticated the tool required for analysis. These kinds of datasets often require analysis by individuals with significant data science expertise, who are often in short supply.

The implications of this prevalence of dark data are vast. As a result of the data deluge, companies often don’t know where all their sensitive data is stored and can’t be confident they are complying with consumer data protection measures like GDPR. …(More)”.
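The compliance problem described here starts with inventory: before a company can protect sensitive data, it has to find it. A toy sketch of one naive first step — scanning free-form records for email-like strings — is shown below; the regex and record format are illustrative assumptions, and real PII discovery tools are far more sophisticated.

```python
import re

# Naive pattern for email-like strings; illustrative only, not a robust
# email validator or a substitute for a real data-discovery tool.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def flag_sensitive(records):
    """Return indices of records that appear to contain an email address."""
    return [i for i, r in enumerate(records) if EMAIL_RE.search(r)]

logs = ["order 1182 shipped", "contact: jane.doe@example.com", "temp=21C"]
print(flag_sensitive(logs))  # → [1]
```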

The Costs of Connection: How Data Is Colonizing Human Life and Appropriating It for Capitalism


Book by Nick Couldry: “We are told that progress requires human beings to be connected, and that science, medicine and much else that is good demands the kind of massive data collection only possible if every thing and person are continuously connected.

But connection, and the continuous surveillance that connection makes possible, usher in an era of neocolonial appropriation. In this new era, social life becomes a direct input to capitalist production, and data – the data collected and processed when we are connected – is the means for this transformation. Hence the need to start counting the costs of connection.

Capturing and processing social data is today handled by an emerging social quantification sector. We are familiar with its leading players, from Acxiom to Equifax, from Facebook to Uber. Together, they ensure the regular and seemingly natural conversion of daily life into a stream of data that can be appropriated for value. This stream is extracted from sensors embedded in bodies and objects, and from the traces left by human interaction online. The result is a new social order based on continuous tracking, and offering unprecedented new opportunities for social discrimination and behavioral influence.  This order has disturbing consequences for freedom, justice and power — indeed, for the quality of human life.

The true violence of this order is best understood through the history of colonialism. But because we assume that colonialism has been replaced by advanced capitalism, we often miss the connection. The concept of data colonialism can thus be used to trace continuities from colonialism’s historic appropriation of territories and material resources to the datafication of everyday life today. While the modes, intensities, scales and contexts of dispossession have changed, the underlying function remains the same: to acquire resources from which economic value can be extracted.

In data colonialism, data is appropriated through a new type of social relation: data relations. We are living through a time when the organization of capital and the configurations of power are changing dramatically because of this contemporary form of social relation. Data colonialism justifies what it does as an advance in scientific knowledge, personalized marketing, or rational management, just as historic colonialism claimed a civilizing mission. Data colonialism is global, dominated by powerful forces in East and West, in the USA and China. The result is a world where, wherever we are connected, we are colonized by data.

Where is data colonialism heading in the long term? Just as historical colonialism paved the way for industrial capitalism, data colonialism is paving the way for a new stage of capitalism whose outlines we only partly see: the capitalization of life without limit. There will be no part of human life, no layer of experience, that is not extractable for economic value. Human life will be there for mining by corporations without reserve as governments look on appreciatively. This process of capitalization will be the foundation for a highly unequal new social arrangement, a social order that is deeply incompatible with human freedom and autonomy.

But resistance is still possible, drawing on past and present decolonial struggles, as well as on the best of the humanities, philosophy, political economy, and information and social science. The goal is to name what is happening and imagine better ways of living together without the exploitation on which today’s models of ‘connection’ are founded….(More)”

Trust in Contemporary Society


Book edited by Masamichi Sasaki: “… deals with conceptual, theoretical and social interaction analyses, historical data on societies, national surveys or cross-national comparative studies, and methodological issues related to trust. The authors are from a variety of disciplines: psychology, sociology, political science, organizational studies, history, and philosophy, and from Britain, the United States, the Czech Republic, the Netherlands, Australia, Germany, and Japan. They bring their vast knowledge from different historical and cultural backgrounds to illuminate contemporary issues of trust and distrust. The socio-cultural perspective of trust is important and increasingly acknowledged as central to trust research. Accordingly, future directions for comparative trust research are also discussed….(More)”.