How the Data That Internet Companies Collect Can Be Used for the Public Good


Stefaan G. Verhulst and Andrew Young at Harvard Business Review: “…In particular, the vast streams of data generated through social media platforms, when analyzed responsibly, can offer insights into societal patterns and behaviors. These types of insights are hard to generate with existing social science methods. All this information poses its own problems, of complexity and noise, of risks to privacy and security, but it also represents tremendous potential for mobilizing new forms of intelligence.

In a recent report, we examine ways to harness this potential while limiting and addressing the challenges. Developed in collaboration with Facebook, the report seeks to understand how public and private organizations can join forces to use social media data — through data collaboratives — to mitigate and perhaps solve some of our most intractable policy dilemmas.

Data Collaboratives: Public-Private Partnerships for Our Data Age 

For all of data’s potential to address public challenges, most data generated today is collected by the private sector. Typically ensconced in corporate databases, and tightly held in order to maintain competitive advantage, this data contains tremendous possible insights and avenues for policy innovation. But because the analytical expertise brought to bear on it is narrow, and limited by private ownership and access restrictions, its vast potential often goes untapped.

Data collaboratives offer a way around this limitation. They represent an emerging public-private partnership model, in which participants from different areas, including the private sector, government, and civil society, can come together to exchange data and pool analytical expertise in order to create new public value. While still an emerging practice, examples of such partnerships now exist around the world, across sectors and public policy domains….

Professionalizing the Responsible Use of Private Data for Public Good

For all its promise, the practice of data collaboratives remains ad hoc and limited. In part, this is a result of the lack of a well-defined, professionalized concept of data stewardship within corporations. Today, each attempt to establish a cross-sector partnership built on the analysis of social media data requires significant and time-consuming efforts, and businesses rarely have personnel tasked with undertaking such efforts and making relevant decisions.

As a consequence, the process of establishing data collaboratives and leveraging privately held data for evidence-based policy making and service delivery is onerous, generally one-off, not informed by best practices or any shared knowledge base, and prone to dissolution when the champions involved move on to other functions.

By establishing data stewardship as a corporate function, recognized within corporations as a valued responsibility, and by creating the methods and tools needed for responsible data-sharing, the practice of data collaboratives can become regularized, predictable, and de-risked.

If early efforts toward this end — from initiatives such as Facebook’s Data for Good efforts in the social media space and MasterCard’s Data Philanthropy approach around finance data — are meaningfully scaled and expanded, data stewards across the private sector can act as change agents responsible for determining what data to share and when, how to protect data, and how to act on insights gathered from the data.

Still, many companies (and others) continue to balk at the prospect of sharing “their” data, which is an understandable response given the reflex to guard corporate interests. But our research has indicated that many benefits can accrue not only to data recipients but also to those who share it. Data collaboration is not a zero-sum game.

With support from the Hewlett Foundation, we are embarking on a two-year project toward professionalizing data stewardship (and the use of data collaboratives) and establishing well-defined data responsibility approaches. We invite others to join us in working to transform this practice into a widespread, impactful means of leveraging private-sector assets, including social media data, to create positive public-sector outcomes around the world….(More)”.

 

Extracting crowd intelligence from pervasive and social big data


Introduction by Leye Wang, Vincent Gauthier, Guanling Chen and Luis Moreira-Matias of Special Issue of the Journal of Ambient Intelligence and Humanized Computing: “With the prevalence of ubiquitous computing devices (smartphones, wearable devices, etc.) and social network services (Facebook, Twitter, etc.), humans are generating massive digital traces continuously in their daily life. Considering the invaluable crowd intelligence residing in these pervasive and social big data, a spectrum of opportunities is emerging to enable promising smart applications for easing individual life, increasing company profit, as well as facilitating city development. However, the nature of big data also poses fundamental challenges on the techniques and applications relying on the pervasive and social big data from multiple perspectives such as algorithm effectiveness, computation speed, energy efficiency, user privacy, server security, data heterogeneity and system scalability. This special issue presents the state-of-the-art research achievements in addressing these challenges. After the rigorous review process of reviewers and guest editors, eight papers were accepted as follows.

The first paper “Automated recognition of hypertension through overnight continuous HRV monitoring” by Ni et al. proposes a non-invasive way to differentiate hypertension patients from healthy people with the pervasive sensors such as a waist belt. To this end, the authors train a machine learning model based on the heart rate data sensed from waist belts worn by a crowd of people, and the experiments show that the detection accuracy is around 93%.

The second paper “The workforce analyzer: group discovery among LinkedIn public profiles” by Dai et al. describes two users’ group discovery methods among LinkedIn public profiles. One is based on K-means and another is based on SVM. The authors contrast results of both methods and provide insights about the trending professional orientations of the workforce from an online perspective.

The third paper “Tweet and followee personalized recommendations based on knowledge graphs” by Pla Karidi et al. presents an efficient semantic recommendation method that helps users filter the Twitter stream for interesting content. The foundation of this method is a knowledge graph that can represent all user topics of interest as a variety of concepts, objects, events, persons, entities, locations and the relations between them. An important advantage of the authors’ method is that it reduces the effects of problems such as over-recommendation and over-specialization.

The fourth paper “CrowdTravel: scenic spot profiling by using heterogeneous crowdsourced data” by Guo et al. proposes CrowdTravel, a multi-source social media data fusion approach for multi-aspect tourism information perception, which can provide travelling assistance for tourists by crowd intelligence mining. Experiments over a dataset of several popular scenic spots in Beijing and Xi’an, China, indicate that the authors’ approach attains fine-grained characterization for the scenic spots and delivers excellent performance.

The fifth paper “Internet of Things based activity surveillance of defence personnel” by Bhatia et al. presents a comprehensive IoT-based framework for analyzing national integrity of defence personnel with consideration to his/her daily activities. Specifically, Integrity Index Value is defined for every defence personnel based on different social engagements, and activities for detecting the vulnerability to national security. In addition to this, a probabilistic decision tree based automated decision making is presented to aid defence officials in analyzing various activities of a defence personnel for his/her integrity assessment.

The sixth paper “Recommending property with short days-on-market for estate agency” by Mou et al. proposes an estate with short days-on-market appraisal framework to automatically recommend those estates using transaction data and profile information crawled from websites. Both the spatial and temporal characteristics of an estate are integrated into the framework. The results show that the proposed framework can accurately estimate about 78% of estates.

The seventh paper “An anonymous data reporting strategy with ensuring incentives for mobile crowd-sensing” by Li et al. proposes a system and a strategy to ensure anonymous data reporting while ensuring incentives simultaneously. The proposed protocol is arranged in five stages that mainly leverage three concepts: (1) slot reservation based on shuffle, (2) data submission based on bulk transfer and multi-player dc-nets, and (3) incentive mechanism based on blind signature.

The last paper “Semantic place prediction from crowd-sensed mobile phone data” by Celik et al. semantically classifies places visited by smart phone users utilizing the data collected from sensors and wireless interfaces available on the phones as well as phone usage patterns, such as battery level, and time-related information, with machine learning algorithms. For this study, the authors collect data from 15 participants at Galatasaray University for 1 month, and try different classification algorithms such as decision tree, random forest, k-nearest neighbour, naive Bayes, and multi-layer perceptron….(More)”.

It’s the (Democracy-Poisoning) Golden Age of Free Speech


Zeynep Tufekci in Wired: “…In today’s networked environment, when anyone can broadcast live or post their thoughts to a social network, it would seem that censorship ought to be impossible. This should be the golden age of free speech.

And sure, it is a golden age of free speech—if you can believe your lying eyes….

The most effective forms of censorship today involve meddling with trust and attention, not muzzling speech itself. As a result, they don’t look much like the old forms of censorship at all. They look like viral or coordinated harassment campaigns, which harness the dynamics of viral outrage to impose an unbearable and disproportionate cost on the act of speaking out. They look like epidemics of disinformation, meant to undercut the credibility of valid information sources. They look like bot-fueled campaigns of trolling and distraction, or piecemeal leaks of hacked materials, meant to swamp the attention of traditional media.

These tactics usually don’t break any laws or set off any First Amendment alarm bells. But they all serve the same purpose that the old forms of censorship did: They are the best available tools to stop ideas from spreading and gaining purchase. They can also make the big platforms a terrible place to interact with other people.

Even when the big platforms themselves suspend or boot someone off their networks for violating “community standards”—an act that does look to many people like old-fashioned censorship—it’s not technically an infringement on free speech, even if it is a display of immense platform power. Anyone in the world can still read what the far-right troll Tim “Baked Alaska” Gionet has to say on the internet. What Twitter has denied him, by kicking him off, is attention.

Many more of the most noble old ideas about free speech simply don’t compute in the age of social media. John Stuart Mill’s notion that a “marketplace of ideas” will elevate the truth is flatly belied by the virality of fake news. And the famous American saying that “the best cure for bad speech is more speech”—a paraphrase of Supreme Court justice Louis Brandeis—loses all its meaning when speech is at once mass but also nonpublic. How do you respond to what you cannot see? How can you cure the effects of “bad” speech with more speech when you have no means to target the same audience that received the original message?

This is not a call for nostalgia. In the past, marginalized voices had a hard time reaching a mass audience at all. They often never made it past the gatekeepers who put out the evening news, who worked and lived within a few blocks of one another in Manhattan and Washington, DC. The best that dissidents could do, often, was to engineer self-sacrificing public spectacles that those gatekeepers would find hard to ignore—as US civil rights leaders did when they sent schoolchildren out to march on the streets of Birmingham, Alabama, drawing out the most naked forms of Southern police brutality for the cameras.

But back then, every political actor could at least see more or less what everyone else was seeing. Today, even the most powerful elites often cannot effectively convene the right swath of the public to counter viral messages. …(More)”.

Technology as a Driver for Governance by the People for the People


Chapter by Ruth Kattumuri in the book Governance and Governed: “The changing dynamics of leadership and growing involvement of people in the process of governance can be attributed to an enhanced access to technology, which enables the governed to engage directly and instantly. This is expected to lead to a greater sense of accountability on the part of leaders to render outcomes for the benefit of the public at large. Effective leadership is increasingly seen to play a significant role in institutionalising citizen’s involvement through social media in order to improve the responsibility of political decision-makers towards the citizens. “Governed” have discovered the ability to transform “governance” through the use of technology, such as social media. This chapter examines the role of technology and media, and the interface between the two, as key drivers in the evolving dynamics of state, society and the governance process….(More)”.

Selected Readings on Data, Gender, and Mobility


By Michelle Winowatan, Andrew Young, and Stefaan Verhulst

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of data, gender, and mobility was originally published in 2017.

This edition of the Selected Readings was developed as part of an ongoing project at the GovLab, supported by Data2X, in collaboration with UNICEF, DigitalGlobe, IDS (UDD/Telefonica R&D), and the ISI Foundation, to establish a data collaborative to analyze unequal access to urban transportation for women and girls in Chile. We thank all our partners for their suggestions to the below curation – in particular Leo Ferres at IDS who got us started with this collection; Ciro Cattuto and Michele Tizzoni from the ISI Foundation; and Bapu Vaitla at Data2X for their pointers to the growing data and mobility literature.

Introduction

Daily mobility is key for gender equity. Access to transportation contributes to women’s agency and independence. The ability to move from place to place safely and efficiently can allow women to access education, work, and the public domain more generally. Yet, mobility is not just a means to access various opportunities. It is also a means to enter the public domain.

Women’s mobility is a multi-layered challenge
Women’s daily mobility, however, is often hampered by social, cultural, infrastructural, and technical barriers. Cultural bias, for instance, limits women’s mobility: women are often confined to areas close to their homes because of society’s double standard that expects them to be homemakers. From an infrastructural perspective, public transportation mostly accommodates home-to-work trips, when in reality women often make more complex trips with stops at, for example, the market, school, or a healthcare provider – sometimes called “trip chaining.” From a safety perspective, women tend to avoid making trips in certain areas and/or at certain times, due to a constant risk of being sexually harassed in public places. Women are also pushed toward more expensive transportation – such as taking a cab instead of a bus or train – based on safety concerns.

The growing importance of (new sources of) data
Researchers are increasingly experimenting with ways to address these interdependent problems through the analysis of diverse datasets, often collected by private sector businesses and other non-governmental entities. Gender-disaggregated mobile phone records, geospatial data, satellite imagery, and social media data, to name a few, are providing evidence-based insight into gender and mobility concerns. Such data collaboratives – the exchange of data across sectors to create public value – can help governments, international organizations, and other public sector entities in the move toward more inclusive urban and transportation planning, and the promotion of gender equity.
The curated set of readings below focuses on the following areas:

  1. Insights on how data can inform gender empowerment initiatives,
  2. Emergent research into the capacity of new data sources – like call detail records (CDRs) and satellite imagery – to increase our understanding of human mobility patterns, and
  3. Publications exploring data-driven policy for gender equity in mobility.

Readings are listed in alphabetical order.

We selected the readings based upon their focus (gender and/or mobility related); scope and representativeness (going beyond one project or context); type of data used (such as CDRs and satellite imagery); and date of publication.

Annotated Reading List

Data and Gender

Blumenstock, Joshua, and Nathan Eagle. Mobile Divides: Gender, Socioeconomic Status, and Mobile Phone Use in Rwanda. ACM Press, 2010.

  • Using traditional survey and mobile phone operator data, this study analyzes gender and socioeconomic divides in mobile phone use in Rwanda, where it finds that the use of mobile phones is significantly more prevalent among men and higher socioeconomic classes.
  • The study also shows differences in the way men and women use phones; for example, women are more likely than men to use a shared phone.
  • The authors frame their findings around gender and economic inequality in the country, with the aim of providing pointers for government action.

Bosco, Claudio, et al. Mapping Indicators of Female Welfare at High Spatial Resolution. WorldPop and Flowminder, 2015.

  • This report focuses on early adolescence in girls, which often comes with higher risk of violence, fewer economic opportunities, and restrictions on mobility. Significant data gaps, along with methodological and ethical issues surrounding data collection for girls, also make it harder for policymakers to create evidence-based policy to address those issues.
  • The authors analyze geolocated household survey data, using statistical models and validation techniques, and create high-resolution maps of various sex-disaggregated indicators, such as nutrition level, access to contraception, and literacy, to better inform local policy making processes.
  • Further, the report identifies the gender data gap and issues surrounding gender data collection, and provides arguments for why having comprehensive data can help create better policy and contribute to the achievement of the Sustainable Development Goals (SDGs).

Buvinic, Mayra, Rebecca Furst-Nichols, and Gayatri Koolwal. Mapping Gender Data Gaps. Data2X, 2014.

  • This study identifies gaps in gender data in developing countries on health, education, economic opportunities, political participation, and human security issues.
  • It recommends ways to close the gender data gap through censuses and micro-level surveys as well as service and administrative records, and emphasizes how “big data” in particular can supply the missing data needed to measure progress in women’s and girls’ well-being. The authors argue that identifying these gaps is key to advancing gender equality and women’s empowerment, one of the SDGs.

Catalyzing Inclusive Financial System: Chile’s Commitment to Women’s Data. Data2X, 2014.

  • This article analyzes global and national data in the banking sector to fill the gap of sex-disaggregated data in Chile. The purpose of the study is to describe the difference in spending behavior and priorities between women and men, identify the challenges for women in accessing financial services, and create policies that promote women’s inclusion in Chile.

Ready to Measure: Twenty Indicators for Monitoring SDG Gender Targets. Open Data Watch and Data2X, 2016.

  • Using readily available data, this study identifies 20 SDG indicators related to gender issues that can serve as a baseline measurement for advancing gender equality, such as the percentage of women aged 20-24 who were married or in a union before age 18 (child marriage), the proportion of seats held by women in national parliament, and the share of women among mobile telephone owners, among others.

Ready to Measure Phase II: Indicators Available to Monitor SDG Gender Targets. Open Data Watch and Data2X, 2017.

  • The Phase II paper is an extension of the Ready to Measure Phase I above. Where Phase I identifies the readily available data to measure women’s and girls’ well-being, Phase II provides information on how to access this data and summarizes insights from it.
  • Phase II elaborates the insights about data gathered from ready to measure indicators and finds that although underlying data to measure indicators of women and girls’ wellbeing is readily available in most cases, it is typically not sex-disaggregated.
  • Over one in five – 53 out of 232 – SDG indicators specifically refer to women and girls. However, further analysis from this study reveals that at least 34 more indicators should be disaggregated by sex. For instance, there should be 15 more sex-disaggregated indicators for SDG number 3: “Ensure healthy lives and promote well-being for all at all ages.”
  • The report recommends that national statistical agencies take the lead and exert additional effort to fill the current gender data gap for each of the SDGs, utilizing tools such as statistical modeling.

Reed, Philip J., Muhammad Raza Khan, and Joshua Blumenstock. Observing gender dynamics and disparities with mobile phone metadata. International Conference on Information and Communication Technologies and Development (ICTD), 2016.

  • The study analyzes mobile phone logs of millions of Pakistani residents to explore whether mobile phone usage behavior differs between men and women and to determine the extent to which gender inequality is reflected in mobile phone usage.
  • It utilizes mobile phone data to analyze the pattern of usage behavior between genders, and socioeconomic and demographic data obtained from census and advocacy groups to assess the state of gender equality in each region in Pakistan.
  • One of its findings is a strong positive correlation between proportion of female mobile phone users and education score.

Stehlé, Juliette, et al. Gender homophily from spatial behavior in a primary school: A sociometric study. 2013.

  • This paper seeks to understand homophily, a human behavior characterized by interaction with peers who share similarities ranging from “physical attributes to tastes or political opinions”. Further, it seeks to identify the magnitude of influence that a type of homophily has on social structures.
  • Focusing on gender interaction among primary school-aged children in France, this paper collects data from wearable devices worn by 200 children over a period of 2 days and measures the physical proximity and duration of interactions among those children in the playground.
  • It finds that interaction patterns are significantly determined by the grade and class structure of the school: children belonging to the same class interact the most, and lower grades usually do not interact with higher grades.
  • From a gender lens, this study finds that mixed-gender interactions are shorter than same-gender interactions. In addition, interactions among girls last longer than interactions among boys. These findings indicate that the children in this school tend to have stronger relationships within their own gender, or what the study calls gender homophily. It further finds that gender homophily is apparent in all classes.
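The comparison of same-gender and mixed-gender interaction durations described above can be illustrated with a small Python sketch. It is only a toy illustration under stated assumptions, not the authors' method: the function name `mean_contact_duration`, the record layout (gender of each child, contact duration in seconds), and the sample values are all hypothetical.

```python
from statistics import mean


def mean_contact_duration(contacts):
    """Return (same-gender mean, mixed-gender mean) contact duration.

    Each record is a hypothetical tuple: (gender_a, gender_b, seconds).
    """
    same = [sec for g1, g2, sec in contacts if g1 == g2]
    mixed = [sec for g1, g2, sec in contacts if g1 != g2]
    return mean(same), mean(mixed)


# Invented sample contacts, for illustration only.
contacts = [
    ("F", "F", 120),
    ("M", "M", 80),
    ("F", "M", 40),
    ("M", "F", 60),
    ("F", "F", 100),
]

same_avg, mixed_avg = mean_contact_duration(contacts)
# A longer same-gender average would be consistent with gender homophily.
print(same_avg, mixed_avg)
```

A real analysis would also need to account for how many same-gender and mixed-gender pairs are possible in each class, which this sketch ignores.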

Data and Mobility

Bengtsson, Linus, et al. Using Mobile Phone Data to Predict the Spatial Spread of Cholera. Flowminder, 2015.

  • This study seeks to predict the spread of the 2010 cholera epidemic in Haiti using data from 2.9 million anonymous mobile phone SIM cards and reported cases of cholera from the Haitian Directorate of Health, analyzing 78 study areas over the period of October 16 – December 16, 2010.
  • From this dataset, the study creates a mobility matrix that indicates mobile phone movement from one study area to another and combines that with the number of reported cases of cholera in the study areas to calculate the infectious pressure level of those areas.
  • The main finding of its analysis shows that the outbreak risk of a study area correlates positively with the infectious pressure level, where an infectious pressure of over 22 results in an outbreak within 7 days. Further, it finds that the infectious pressure level can inform the sensitivity and specificity of the outbreak prediction.
  • It hopes to improve infectious disease containment by identifying areas with highest risks of outbreaks.
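The combination of a mobility matrix with reported case counts can be sketched in Python as follows. This is an illustrative simplification and not the study's actual model: the function name `infectious_pressure`, the weighting (cases in each origin area multiplied by the share of that area's phones moving to the destination), and the three-area toy data are assumptions. Only the threshold of 22 is taken from the summary above.

```python
def infectious_pressure(mobility, cases):
    """Return an assumed pressure score for each destination area.

    mobility[i][j]: assumed share of phones seen moving from area i to area j.
    cases[i]: reported cholera cases in area i.
    """
    n = len(cases)
    pressure = [0.0] * n
    for j in range(n):
        for i in range(n):
            if i != j:  # only imported pressure from other areas
                pressure[j] += cases[i] * mobility[i][j]
    return pressure


OUTBREAK_THRESHOLD = 22  # threshold reported in the summary above

# Invented toy data for three study areas.
mobility = [
    [0.0, 0.10, 0.02],
    [0.05, 0.0, 0.01],
    [0.01, 0.03, 0.0],
]
cases = [300, 40, 0]

pressure = infectious_pressure(mobility, cases)
at_risk = [area for area, p in enumerate(pressure) if p > OUTBREAK_THRESHOLD]
print(pressure, at_risk)
```

In this toy example only the second area exceeds the threshold, because it receives a large share of movement from the area with the most cases.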

Calabrese, Francesco, et al. Understanding Individual Mobility Patterns from Urban Sensing Data: A Mobile Phone Trace Example. SENSEable City Lab, MIT, 2012.

  • This study compares mobile phone data and odometer readings from annual safety inspections to characterize individual mobility and vehicular mobility in the Boston Metropolitan Area, measured by the average daily total trip length of mobile phone users and average daily Vehicular Kilometers Traveled (VKT).
  • The study found that “accessibility to work and non-work destinations are the two most important factors in explaining the regional variations in individual and vehicular mobility, while the impacts of population density and land use mix on both mobility measures are insignificant.” Further, “a well-connected street network is negatively associated with daily vehicular total trip length.”
  • This study demonstrates the potential for mobile phone data to provide useful and updatable information on individual mobility patterns to inform transportation and mobility research.

Campos-Cordobés, Sergio, et al. “Chapter 5 – Big Data in Road Transport and Mobility Research.” Intelligent Vehicles. Edited by Felipe Jiménez. Butterworth-Heinemann, 2018.

  • This study outlines a number of techniques and data sources – such as geolocation information, mobile phone data, and social network observation – that could be leveraged to predict human mobility.
  • The authors also provide a number of examples of real-world applications of big data to address transportation and mobility problems, such as transport demand modeling, short-term traffic prediction, and route planning.

Lin, Miao, and Wen-Jing Hsu. Mining GPS Data for Mobility Patterns: A Survey. Pervasive and Mobile Computing vol. 12, 2014.

  • This study surveys the current field of research using high resolution positioning data (GPS) to capture mobility patterns.
  • The survey focuses on analyses related to frequently visited locations, modes of transportation, trajectory patterns, and place-based activities. The authors find “high regularity” in human mobility patterns despite high levels of variation among the mobility areas covered by individuals.

Phithakkitnukoon, Santi, Zbigniew Smoreda, and Patrick Olivier. Socio-Geography of Human Mobility: A Study Using Longitudinal Mobile Phone Data. PLoS ONE, 2012.

  • This study used a year’s call logs and location data of approximately one million mobile phone users in Portugal to analyze the association between individuals’ mobility and their social networks.
  • It measures and analyzes travel scope (locations visited) and geo-social radius (distance from friends, family, and acquaintances) to determine the association.
  • It finds that 80% of places visited are within 20 km of an individual’s nearest social ties’ location, rising to 90% at a 45 km radius. Further, as population density increases, distance between individuals and their social networks decreases.
  • The findings in this study demonstrate how mobile phone data can provide insights into “the socio-geography of human mobility”.
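The "share of visited places within a given radius of the nearest social tie" measure can be sketched in Python using great-circle distances. This is a rough illustration, not the study's pipeline: the helper names `haversine_km` and `share_within`, the use of the haversine formula with a mean Earth radius of 6371 km, and the sample coordinates are all assumptions.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def share_within(visited, ties, radius_km):
    """Fraction of visited places lying within radius_km of the nearest tie."""
    near = sum(
        1 for place in visited
        if min(haversine_km(place, tie) for tie in ties) <= radius_km
    )
    return near / len(visited)


# Invented example: one social tie and three visited places in Portugal.
ties = [(38.7369, -9.1427)]
visited = [(38.7223, -9.1393), (38.6979, -9.2068), (41.1579, -8.6291)]

share_20km = share_within(visited, ties, 20)
print(share_20km)
```

Here two of the three places fall within 20 km of the tie; the third, roughly 270 km away, does not.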

Semanjski, Ivana, and Sidharta Gautama. Crowdsourcing Mobility Insights – Reflection of Attitude Based Segments on High Resolution Mobility Behaviour Data. vol. 71, Transportation Research, 2016.

  • Using cellphone data, this study maps attitudinal segments that explain how age, gender, occupation, household size, income, and car ownership influence an individual’s mobility patterns. This type of segment analysis is seen as particularly useful for targeted messaging.
  • The authors argue that these time- and space-specific insights could also provide value for government officials and policymakers, by, for example, allowing for evidence-based transportation pricing options and public sector advertising campaign placement.

Silveira, Lucas M., et al. MobHet: Predicting Human Mobility using Heterogeneous Data Sources. vol. 95, Computer Communications, 2016.

  • This study explores the potential of using data from multiple sources (e.g., Twitter and Foursquare), in addition to GPS data, to provide a more accurate prediction of human mobility. This heterogeneous data captures the popularity of different locations, the frequency of visits to those locations, and the relationships among people who are moving around the target area. The authors’ initial experimentation finds that combining these data sources identifies human mobility patterns more accurately.

Wilson, Robin, et al. Rapid and Near Real-Time Assessments of Population Displacement Using Mobile Phone Data Following Disasters: The 2015 Nepal Earthquake. PLOS Current Disasters, 2016.

  • Utilizing call detail records of 12 million mobile phone users in Nepal, this study seeks to capture spatio-temporal details of population movements after the earthquake on April 25, 2015.
  • It seeks to address the problem of slow and ineffective disaster response by capturing near real-time displacement patterns from mobile phone call detail records, in order to inform humanitarian agencies where to distribute their assistance. The preliminary results of this study were available nine days after the earthquake.
  • This project relies on foundational cooperation with a mobile phone operator, which supplied the de-identified data from 12 million users, before the earthquake.
  • The study finds that shortly after the earthquake there was an anomalous population movement out of the Kathmandu Valley, the most impacted area, to surrounding areas. The study estimates that 390,000 more people than normal had left the valley.
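An "above normal" estimate of this kind compares observed movement against a pre-event baseline. The Python sketch below shows the basic arithmetic only. It is not the study's estimation method: the function name `above_normal_outflow`, the use of a simple mean as the baseline, the `scaling_factor` for extrapolating from subscribers to the whole population, and all numbers are hypothetical.

```python
from statistics import mean


def above_normal_outflow(baseline_counts, observed_count, scaling_factor=1.0):
    """Estimate outflow in excess of the pre-event baseline.

    baseline_counts: subscribers seen leaving the area on normal days.
    observed_count: subscribers seen leaving after the event.
    scaling_factor: assumed people-per-subscriber multiplier.
    """
    expected = mean(baseline_counts)
    return (observed_count - expected) * scaling_factor


# Invented numbers, for illustration only.
excess = above_normal_outflow([1000, 1100, 900], 2500, scaling_factor=2.0)
print(excess)
```

A real analysis would also need to handle weekly and seasonal patterns in the baseline as well as changes in network coverage after the disaster.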

Data, Gender and Mobility

Althoff, Tim, et al. “Large-Scale Physical Activity Data Reveal Worldwide Activity Inequality.” Nature, 2017.

  • This study’s analysis of worldwide physical activity is built on a dataset containing 68 million days of physical activity of 717,527 people collected through their smartphone accelerometers.
  • The authors find a significant reduction in female activity levels in cities with high activity inequality, where high activity inequality is associated with low city walkability – walkability indicators include pedestrian facilities (city block length, intersection density, etc.) and amenities (shops, parks, etc.).
  • Further, they find that high activity inequality is associated with high levels of inactivity-related health problems, like obesity.

Borker, Girija. “Safety First: Street Harassment and Women’s Educational Choices in India.” Stop Street Harassment, 2017.

  • Using data collected from SafetiPin, an application that allows users to mark an area on a map as safe or not, and Safecity, another application that lets users share their experience of harassment in public places, the researcher analyzes the safety of travel routes surrounding different colleges in India and their effect on women’s college choices.
  • The study finds that women are willing to go to a lower ranked college in order to avoid a higher risk of street harassment. Women who choose the best college from their set of options spend an average of $250 more each year to access safer modes of transportation.

Frias-Martinez, Vanessa, Enrique Frias-Martinez, and Nuria Oliver. A Gender-Centric Analysis of Calling Behavior in a Developing Economy Using Call Detail Records. Association for the Advancement of Artificial Intelligence, 2010.

  • Using encrypted Call Detail Records (CDRs) of 10,000 participants in a developing economy, this study analyzes behavioral, social, and mobility variables to determine the gender of a mobile phone user, and finds that behavioral and social variables in mobile phone use differ between female and male users.
  • It finds that women use their phones more than men in terms of number of calls made, call duration, and call expenses. Women also have larger social networks, meaning that the number of unique phone numbers they contact or are contacted by is larger. It finds no statistically significant difference between men and women in the distance traveled between calls.
  • Frias-Martinez et al. recommend taking these findings into consideration when designing cellphone-based services.
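
The social network measure described above (the number of unique phone numbers that contact or are contacted by a user) can be sketched as follows. This is an illustrative reading of that definition, not the authors' code. The record format and the identifiers are hypothetical.

```python
from collections import defaultdict


def contact_counts(call_records):
    """Map each user to the number of distinct numbers they called or were called by.

    call_records: iterable of (caller, callee) pairs, one per call.
    """
    contacts = defaultdict(set)
    for caller, callee in call_records:
        contacts[caller].add(callee)  # outgoing contact
        contacts[callee].add(caller)  # incoming contact
    return {user: len(peers) for user, peers in contacts.items()}


# Hypothetical, anonymized identifiers for illustration only.
records = [("A", "B"), ("A", "C"), ("B", "A"), ("D", "A")]
counts = contact_counts(records)
print(counts["A"])
```

Repeated calls between the same pair count once, because each user's contacts are stored in a set.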

Psylla, Ioanna, Piotr Sapiezynski, Enys Mones, Sune Lehmann. “The role of gender in social network organization.” PLoS ONE 12, December 20, 2017.

  • Using a large dataset of high resolution data collected through mobile phones, as well as detailed questionnaires, this report studies gender differences in a large cohort. The researchers consider mobility behavior and individual personality traits among a group of more than 800 university students.
  • Analyzing mobility data, they find both that women visit more unique locations over time, and that they have more homogeneous time distribution over their visited locations than men, indicating the time commitment of women is more widely spread across places.
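
One common way to quantify how evenly time is spread across visited locations is the Shannon entropy of the time shares. Whether the authors use exactly this measure is an assumption; the sketch below only illustrates the idea, with hypothetical location names and hours.

```python
import math


def time_entropy(hours_by_location):
    """Shannon entropy (in bits) of the share of time spent at each location.

    Higher values mean time is spread more evenly across locations.
    """
    total = sum(hours_by_location.values())
    shares = [h / total for h in hours_by_location.values() if h > 0]
    return -sum(p * math.log2(p) for p in shares)


# Toy data for illustration only.
even = {"home": 8, "campus": 8, "gym": 8}
skewed = {"home": 22, "campus": 1, "gym": 1}
print(time_entropy(even) > time_entropy(skewed))
```

Both toy profiles cover the same three locations and the same 24 hours, but the evenly split profile has the higher entropy, which matches the "more homogeneous time distribution" described above.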

Vaitla, Bapu. Big Data and the Well-Being of Women and Girls: Applications on the Social Scientific Frontier. Data2X, Apr. 2017.

  • In this study, the researchers use geospatial data, credit card and cell phone information, and social media posts to identify problems (such as malnutrition, education, access to healthcare, and mental health) facing women and girls in developing countries.
  • From the credit card and cell phone data in particular, the report finds that analyzing patterns of women’s spending and mobility can provide useful insight into Latin American women’s “economic lifestyles.”
  • Based on this analysis, Vaitla recommends that nontraditional big data sources be used to fill gaps in conventional data sources and to address the common invisibility of women and girls in institutional databases.

Tracking Metrics in Social 3.0


Nancy Lim in AdWeek: “…Facebook is the world’s most popular social network, with incomparable reach and real value for marketers. However, as engagement on the channel increases, marketers are in a pickle. While they want to track and support valuable experiences on Facebook, they’re unsure if they can trust the channel’s metrics…. Marketers’ wavering trust in Facebook metrics warrants a look back at the evolution of social media itself.

At social’s advent (Social 1.0), metrics focused strictly on likes and comments. Content simply wasn’t as important as users learned to build social profiles and make the platform work for them.

Then, Social 2.0 invited brands to enter the fray. With them came the new role of content as a driver of top-line metrics.

Now, we’re in the midst of Social 3.0, where advancements in the technology have made it possible for social channels to result in real ad conversions.

When it comes to these conversions, it’s no longer all about the click. There’s been a marked shift away from social interactions of the past, which centered around intangible things like likes and engagement-based activities. Now, marketers are tasked with tracking more tangible metrics like conversions. Another way to look at this evolution is from social objectives (likes, shares, comments) to real business objectives (conversions, units sold, cost per sale)….

To thrive in Social 3.0, marketers must provide more direct channels for responses with lower barriers of entry, and do more of this work themselves.

They must also come to terms with the fact that while Facebook often feels like an owned channel, it’s first and foremost a platform designed for consumers. This means they cannot blindly put all their trust in Facebook’s metrics. Rather, marketers should be partnering with available third-party technologies to truly understand, trust and drive full value from Facebook insights.

Call tracking provides an avenue for this. Armed with call tracking software, marketers can determine which campaigns are causing Facebook users to pick up their phones. For instance, marketers can assign unique call numbers to separate Facebook campaigns to A/B test different copy and CTAs. Once they know what’s working best, they can incorporate that feedback into future campaigns.

Other analytics tools then provide a clearer picture. For example, marketers can leverage insights from Google Analytics, third-party data providers or other big analytics tools to learn more about the users that are engaging. Such a holistic perspective results in the creation of more personalized campaigns and, therefore, conversions….(More)”.

Friendship, Robots, and Social Media: False Friends and Second Selves


Book by Alexis M. Elder: “Various emerging technologies, from social robotics to social media, appeal to our desire for social interactions, while avoiding some of the risks and costs of face-to-face human interaction. But can they offer us real friendship? In this book, Alexis Elder outlines a theory of friendship drawing on Aristotle and contemporary work on social ontology, and then uses it to evaluate the real value of social robotics and emerging social technologies.

In the first part of the book Elder develops a robust and rigorous ontology of friendship: what it is, how it functions, what harms it, and how it relates to familiar ethical and philosophical questions about character, value, and well-being. In Part II she applies this ontology to emerging trends in social robotics and human-robot interaction, including robotic companions for lonely seniors, therapeutic robots used to teach social skills to children on the autism spectrum, and companionate robots currently being developed for consumer markets. Elder articulates the moral hazards presented by these robots, while at the same time acknowledging their real and measurable benefits. In the final section she shifts her focus to connections between real people, especially those enabled by social media. Arguing against critics who have charged that these new communication technologies are weakening our social connections, Elder explores ways in which text messaging, video chats, Facebook, and Snapchat are enabling us to develop, sustain, and enrich our friendship in new and meaningful ways….(More)”.

Psychopolitics: Neoliberalism and New Technologies of Power


Review by Stuart Jeffries of new book by Byung-Chul Han: “During a commercial break in the 1984 Super Bowl, Apple broadcast an ad directed by Ridley Scott. Glum, grey workers sat in a vast grey hall listening to Big Brother’s declamations on a huge screen. Then a maverick athlete-cum-Steve-Jobs-lackey hurled a sledgehammer at the screen, shattering it and bathing workers in healing light. “On January 24th,” the voiceover announced, “Apple Computer will introduce the Macintosh. And you’ll see why 1984 won’t be like [Orwell’s] Nineteen Eighty-Four.”

The ad’s idea, writes Korean-born German philosopher Byung-Chul Han, was that the Apple Mac would liberate downtrodden masses from the totalitarian surveillance state. And indeed, the subsequent rise of Apple, the internet, Twitter, Facebook, Amazon and Google Glass means that today we live in nothing like the nightmare Orwell imagined. After all, Big Brother needed electroshock, sleep deprivation, solitary confinement, drugs and hectoring propaganda broadcasts to keep power, while his Ministry of Plenty ensured that consumer goods were lacking to make sure subjects were in an artificial state of need.

The new surveillance society that has arisen since 1984, argues Han, works differently yet is more elegantly totalitarian and oppressive than anything described by Orwell or Jeremy Bentham. “Confession obtained by force has been replaced by voluntary disclosure,” he writes. “Smartphones have been substituted for torture chambers.” Well, not quite. Torture chambers still exist, it’s just that we in the neoliberal west have outsourced them (thanks, rendition flights) so that that obscenity called polite society can pretend they don’t exist.

Nonetheless, what capitalism realised in the neoliberal era, Han argues, is that it didn’t need to be tough, but seductive. This is what he calls smartpolitics. Instead of saying no, it says yes: instead of denying us with commandments, discipline and shortages, it seems to allow us to buy what we want when we want, become what we want and realise our dream of freedom. “Instead of forbidding and depriving it works through pleasing and fulfilling. Instead of making people compliant, it seeks to make them dependent.”…(More)”.

Computational Propaganda and Political Big Data: Moving Toward a More Critical Research Agenda


Gillian Bolsover and Philip Howard in the Journal Big Data: “Computational propaganda has recently exploded into public consciousness. The U.S. presidential campaign of 2016 was marred by evidence, which continues to emerge, of targeted political propaganda and the use of bots to distribute political messages on social media. This computational propaganda is both a social and technical phenomenon. Technical knowledge is necessary to work with the massive databases used for audience targeting; it is necessary to create the bots and algorithms that distribute propaganda; it is necessary to monitor and evaluate the results of these efforts in agile campaigning. Thus, a technical knowledge comparable to those who create and distribute this propaganda is necessary to investigate the phenomenon.

However, viewing computational propaganda only from a technical perspective—as a set of variables, models, codes, and algorithms—plays into the hands of those who create it, the platforms that serve it, and the firms that profit from it. The very act of making something technical and impartial makes it seem inevitable and unbiased. This undermines the opportunities to argue for change in the social value and meaning of this content and the structures in which it exists. Big-data research is necessary to understand the socio-technical issue of computational propaganda and the influence of technology in politics. However, big data researchers must maintain a critical stance toward the data being used and analyzed so as to ensure that we are critiquing as we go about describing, predicting, or recommending changes. If research studies of computational propaganda and political big data do not engage with the forms of power and knowledge that produce it, then the very possibility for improving the role of social-media platforms in public life evaporates.

Definitionally, computational propaganda has two important parts: the technical and the social. Focusing on the technical, Woolley and Howard define computational propaganda as the assemblage of social-media platforms, autonomous agents, and big data tasked with the manipulation of public opinion. In contrast, the social definition of computational propaganda derives from the definition of propaganda (communications that deliberately misrepresent symbols, appealing to emotions and prejudices and bypassing rational thought, to achieve a specific goal of its creators), with computational propaganda understood as propaganda created or disseminated using computational (technical) means…(More)” (Full Text HTML; Full Text PDF).

From Territorial to Functional Sovereignty: The Case of Amazon


Essay by Frank Pasquale: “…Who needs city housing regulators when AirBnB can use data-driven methods to effectively regulate room-letting, then house-letting, and eventually urban planning generally? Why not let Amazon have its own jurisdiction or charter city, or establish special judicial procedures for Foxconn? Some vanguardists of functional sovereignty believe online rating systems could replace state occupational licensure—so rather than having government boards credential workers, a platform like LinkedIn could collect star ratings on them.

In this and later posts, I want to explain how this shift from territorial to functional sovereignty is creating a new digital political economy. Amazon’s rise is instructive. As Lina Khan explains, “the company has positioned itself at the center of e-commerce and now serves as essential infrastructure for a host of other businesses that depend upon it.” The “everything store” may seem like just another service in the economy—a virtual mall. But when a firm combines tens of millions of customers with a “marketing platform, a delivery and logistics network, a payment service, a credit lender, an auction house…a hardware manufacturer, and a leading host of cloud server space,” as Khan observes, it’s not just another shopping option.

Digital political economy helps us understand how platforms accumulate power. With online platforms, it’s not a simple narrative of “best service wins.” Network effects have been on the cyberlaw (and digital economics) agenda for over twenty years. Amazon’s dominance has exhibited how network effects can be self-reinforcing. The more merchants there are selling on (or to) Amazon, the better shoppers can be assured that they are searching all possible vendors. The more shoppers there are, the more vendors consider Amazon a “must-have” venue. As crowds build on either side of the platform, the middleman becomes ever more indispensable. Oh, sure, a new platform can enter the market—but until it gets access to the 480 million items Amazon sells (often at deep discounts), why should the median consumer defect to it? If I want garbage bags, do I really want to go over to Target.com to re-enter all my credit card details, create a new log-in, read the small print about shipping, and hope that this retailer can negotiate a better deal with Glad? Or do I, ala Sunstein, want a predictive shopping purveyor that intimately knows my past purchase habits, with satisfaction just a click away?
As artificial intelligence improves, the tracking of shopping into the Amazon groove will tend to become ever more rational for both buyers and sellers. Like a path through a forest trod ever clearer of debris, it becomes the natural default. To examine just one of many centripetal forces sucking money, data, and commerce into online behemoths, play out game theoretically how the possibility of online conflict redounds in Amazon’s favor. If you have a problem with a merchant online, do you want to pursue it as a one-off buyer? Or as someone whose reputation has been established over dozens or hundreds of transactions—and someone who can credibly threaten to deny Amazon hundreds or thousands of dollars of revenue each year? The same goes for merchants: The more tribute they can pay to Amazon, the more likely they are to achieve visibility in search results and attention (and perhaps even favor) when disputes come up. What Bruce Schneier said about security is increasingly true of commerce online: You want to be in the good graces of one of the neo-feudal giants who bring order to a lawless realm. Yet few hesitate to think about exactly how the digital lords might use their data advantages against those they ostensibly protect.

Forward-thinking legal thinkers are helping us grasp these dynamics. For example, Rory van Loo has described the status of the “corporation as courthouse”—that is, when platforms like Amazon run dispute resolution schemes to settle conflicts between buyers and sellers. Van Loo describes both the efficiency gains that an Amazon settlement process might have over small claims court, and the potential pitfalls for consumers (such as opaque standards for deciding cases). I believe that, on top of such economic considerations, we may want to consider the political economic origins of e-commerce feudalism. For example, as consumer rights shrivel, it’s rational for buyers to turn to Amazon (rather than overwhelmed small claims courts) to press their case. The evisceration of class actions, the rise of arbitration, boilerplate contracts—all these make the judicial system an increasingly vestigial organ in consumer disputes. Individuals rationally turn to online giants for powers to impose order that libertarian legal doctrine stripped from the state. And in so doing, they reinforce the very dynamics that led to the state’s etiolation in the first place….(More)”.