Who Maps the World?


Sarah Holder at CityLab: “For most of human history, maps have been very exclusive,” said Marie Price, the first woman president of the American Geographical Society, appointed 165 years into its 167-year history. “Only a few people got to make maps, and they were carefully guarded, and they were not participatory.” That’s slowly changing, she said, thanks to democratizing projects like OpenStreetMap (OSM)….

But despite OSM’s democratic aims, and despite the long (albeit mostly hidden) history of lady cartographers, the OSM volunteer community is still composed overwhelmingly of men. A comprehensive statistical breakdown of gender equity in the OSM space has not yet been conducted, but Rachel Levine, a GIS operations and training coordinator with the American Red Cross, said experts estimate that only 2 to 5 percent of OSMers are women. The professional field of cartography is also male-dominated, as is the smaller subset of GIS professionals. While it would follow that the numbers of mappers of color and LGBTQ and gender-nonconforming mappers are similarly small, those statistics have gone largely unexamined….

When it comes to increasing access to health services, safety, and education (things women in many developing countries disproportionately lack), equitable cartographic representation matters. It’s the people who make the map who shape what shows up. On OSM, buildings aren’t just identified as buildings; they’re “tagged” with specifics according to mappers’ and editors’ preferences. “If two to five percent of our mappers are women, that means only a subset of that get[s] to decide what tags are important, and what tags get our attention,” said Levine.

Sports arenas? Lots of those. Strip clubs? Cities contain multitudes. Bars? More than one could possibly comprehend.

Meanwhile, childcare centers, health clinics, abortion clinics, and specialty clinics that deal with women’s health are vastly underrepresented. In 2011, the OSM community rejected an appeal to add the “childcare” tag at all. It was finally approved in 2013, and in the time since, it’s been used more than 12,000 times.

Doctors have been tagged more than 80,000 times, while healthcare facilities that specialize in abortion have been tagged only 10 times; gynecology, near 1,500; midwife, 233; fertility clinics, none. Only one building has been tagged as a domestic violence facility, and 15 as a gender-based violence facility. That’s not because these facilities don’t exist; it’s because the men mapping them don’t know they do, or don’t care enough to notice.

So much of the importance of mapping is about navigating the world safely. For women, especially women in less developed countries, that safety is harder to secure. “If we tag something as a public toilet, does that mean it has facilities for women? Does it mean the facilities are safe?” asked Levine. “When we’re tagging specifically, ‘This is a female toilet,’ that means somebody has gone in and said, ‘This is accessible to me.’ When women aren’t doing the tagging, we just get the toilet tag.”

“Women’s geography,” Price tells her students, is made up of more than bridges and tunnels. It’s shaped by asking things like: Where on the map do you feel safe? How would you walk from A to B in the city without having to look over your shoulder? It’s hard to map these intangibles—but not impossible….(More).

Empowerment tool for women maps cases of harassment


Springwise: “We have previously written about innovations that promote inclusion and equal rights such as edible pie charts that highlight gender inequality. Another example is a predictive text app that finds alternative words for gendered language. Now, NINA, created in Brazil, is an app for empowering women to report violence that occurs in public spaces. The project was shared to Red Bull Amaphiko, a platform for social entrepreneurs to share their work and stories.

A 2016 survey released by ActionAid and conducted by YouGov found that 86 percent of Brazilian women were victims of harassment in public spaces. Responding to these statistics, Simony César created project NINA two years ago to help tackle gender-based violence. The app collects data in real time, mapping locations in which cases of harassment have taken place. The launch and testing of the app took place on public transport. It saw 76 thousand users per day at 17 bus lines at the Federal University of Pernambuco (UFPE).

César states “The premise of NINA aims to empower women through an application that denounces the types of violence they suffer within public spaces”. It combats violence against women by making cases of harassment in the city locatable on a map. NINA can then use this data to find out which bus lines have the highest rate of harassment. It can also record the most common times that cases occur and store photographic records and short videos of harassers.

Another survey by ActionAid in March 2018 revealed that 64 percent of Brazilian women surveyed were victims of sexual harassment. These results demonstrate that the need for empowerment tools, such as NINA, is still necessary. The exposure of women to violence in public city spaces is a global issue and as a result, accessibility within cities is unequal based on gender….(More)”.

Psychographics: the behavioural analysis that helped Cambridge Analytica know voters’ minds


Michael Wade at The Conversation: “Much of the discussion has been on how Cambridge Analytica was able to obtain data on more than 50m Facebook users – and how it allegedly failed to delete this data when told to do so. But there is also the matter of what Cambridge Analytica actually did with the data. In fact the data crunching company’s approach represents a step change in how analytics can today be used as a tool to generate insights – and to exert influence.

For example, pollsters have long used segmentation to target particular groups of voters, such as through categorising audiences by gender, age, income, education and family size. Segments can also be created around political affiliation or purchase preferences. The data analytics machine that presidential candidate Hillary Clinton used in her 2016 campaign – named Ada after the 19th-century mathematician and early computing pioneer – used state-of-the-art segmentation techniques to target groups of eligible voters in the same way that Barack Obama had done four years previously.

Cambridge Analytica was contracted to the Trump campaign and provided an entirely new weapon for the election machine. While it also used demographic segments to identify groups of voters, as Clinton’s campaign had, Cambridge Analytica also segmented using psychographics. As definitions of class, education, employment, age and so on, demographics are informational. Psychographics are behavioural – a means to segment by personality.

This makes a lot of sense. It’s obvious that two people with the same demographic profile (for example, white, middle-aged, employed, married men) can have markedly different personalities and opinions. We also know that adapting a message to a person’s personality – whether they are open, introverted, argumentative, and so on – goes a long way to help getting that message across….

There have traditionally been two routes to ascertaining someone’s personality. You can either get to know them really well – usually over an extended time. Or you can get them to take a personality test and ask them to share it with you. Neither of these methods is realistically open to pollsters. Cambridge Analytica found a third way, with the assistance of two University of Cambridge academics.

The first, Aleksandr Kogan, sold them access to 270,000 personality tests completed by Facebook users through an online app he had created for research purposes. Providing the data to Cambridge Analytica was, it seems, against Facebook’s internal code of conduct, but only now in March 2018 has Kogan been banned by Facebook from the platform. In addition, Kogan’s data also came with a bonus: he had reportedly collected Facebook data from the test-takers’ friends – and, at an average of 200 friends per person, that added up to some 50m people.

However, these 50m people had not all taken personality tests. This is where the second Cambridge academic, Michal Kosinski, came in. Kosinski – who is said to believe that micro-targeting based on online data could strengthen democracy – had figured out a way to reverse engineer a personality profile from Facebook activity such as likes. Whether you choose to like pictures of sunsets, puppies or people apparently says a lot about your personality. So much, in fact, that on the basis of 300 likes, Kosinski’s model is able to predict someone’s personality profile with the same accuracy as a spouse….(More)”

Bias in Online Classes: Evidence from a Field Experiment


Paper by Rachel Baker, Thomas Dee, Brent Evans and June John: “While online learning environments are increasingly common, relatively little is known about issues of equity in these settings. We test for the presence of race and gender biases among postsecondary students and instructors in online classes by measuring student and instructor responses to discussion comments we posted in the discussion forums of 124 different online courses. Each comment was randomly assigned a student name connoting a specific race and gender. We find that instructors are 94% more likely to respond to forum posts by White male students. In contrast, we do not find general evidence of biases in student responses. However, we do find that comments placed by White females are more likely to receive a response from White female peers. We discuss the implications of our findings for our understanding of social identity dynamics in classrooms and the design of equitable online learning environments….(More)”.

How tech used to track the flu could change the game for public health response


Cathie Anderson in the Sacramento Bee: “Tech entrepreneurs and academic researchers are tracking the spread of flu in real-time, collecting data from social media and internet-connected devices that show startling accuracy when compared against surveillance data that public health officials don’t report until a week or two later….

Smart devices and mobile apps have the potential to reshape public health alerts and responses,…, for instance, the staff of smart thermometer maker Kinsa were receiving temperature readings that augured the surge of flu patients in emergency rooms there.

Kinsa thermometers are part of the movement toward the Internet of Things – devices that automatically transmit information to a database. No personal information is shared, unless users decide to input information such as age and gender. Using data from more than 1 million devices in U.S. homes, the staff is able to track fever as it hits and use an algorithm to estimate impact for a broader population….

Computational researcher Aaron Miller worked with an epidemiological team at the University of Iowa to assess the feasibility of using Kinsa data to forecast the spread of flu. He said the team first built a model using surveillance data from the CDC and used it to forecast the spread of influenza. Then the team created a model where they integrated the data from Kinsa along with that from the CDC.

“We got predictions that were … 10 to 50 percent better at predicting the spread of flu than when we used CDC data alone,” Miller said. “Potentially, in the future, if you had granular information from the devices and you had enough information, you could imagine doing analysis on a really local level to inform things like school closings.”

While Kinsa uses readings taken in homes, academic researchers and companies such as sickweather.com are using crowdsourcing from social media networks to provide information on the spread of flu. Siddharth Shah, a transformational health industry analyst at Frost & Sullivan, pointed to an award-winning international study led by researchers at Northeastern University that tracked flu through Twitter posts and other key parameters of flu.

When compared with official influenza surveillance systems, the researchers said, the model accurately forecast the evolution of influenza up to six weeks in advance, much earlier than prior models. Such advance warnings would give health agencies significantly more time to expand upon medical resources or to alert the public to measures they can take to prevent transmission of the disease….

For now, Shah said, technology will probably only augment or complement traditional public data streams. However, he added, innovations already are changing how diseases are tracked. Chronic disease management, for instance, is going digital with devices such as Omada health that helps people with Type 2 diabetes better manage health challenges and Noom, a mobile app that helps people stop dieting and instead work toward true lifestyle change….(More).

Infection forecasts powered by big data


Michael Eisenstein at Nature: “…The good news is that the present era of widespread access to the Internet and digital health has created a rich reservoir of valuable data for researchers to dive into….By harvesting and combining these streams of big data with conventional ways of monitoring infectious diseases, the public-health community could gain fresh powers to catch and curb emerging outbreaks before they rage out of control.

Going viral

Data scientists at Google were the first to make a major splash using data gathered online to track infectious diseases. The Google Flu Trends algorithm, launched in November 2008, combed through hundreds of billions of users’ queries on the popular search engine to look for small increases in flu-related terms such as symptoms or vaccine availability. Initial data suggested that Google Flu Trends could accurately map the incidence of flu with a lag of roughly one day. “It was a very exciting use of these data for the purpose of public health,” says Brownstein. “It really did start a whole revolution and new field of work in query data.”

Unfortunately, Google Flu Trends faltered when it mattered the most, completely missing the onset in April 2009 of the H1N1 pandemic. The algorithm also ran into trouble later on in the pandemic. It had been trained against seasonal fluctuations of flu, says Viboud, but people’s behaviour changed in the wake of panic fuelled by media reports — and that threw off Google’s data. …

Nevertheless, its work with Internet usage data was inspirational for infectious-disease researchers. A subsequent study from a team led by Cecilia Marques-Toledo at the Federal University of Minas Gerais in Belo Horizonte, Brazil, used Twitter to get high-resolution data on the spread of dengue fever in the country. The researchers could quickly map new cases to specific cities and even predict where the disease might spread to next (C. A. Marques-Toledo et al. PLoS Negl. Trop. Dis. 11, e0005729; 2017). Similarly, Brownstein and his colleagues were able to use search data from Google and Twitter to project the spread of Zika virus in Latin America several weeks before formal outbreak declarations were made by public-health officials. Both Internet services are used widely, which makes them data-rich resources. But they are also proprietary systems for which access to data is controlled by a third party; for that reason, Generous and his colleagues have opted instead to make use of search data from Wikipedia, which is open source. “You can get the access logs, and how many people are viewing articles, which serves as a pretty good proxy for search interest,” he says.

However, the problems that sank Google Flu Trends still exist….Additionally, online activity differs for infectious conditions with a social stigma such as syphilis or AIDS, because people who are or might be affected are more likely to be concerned about privacy. Appropriate search-term selection is essential: Generous notes that initial attempts to track flu on Twitter were confounded by irrelevant tweets about ‘Bieber fever’ — a decidedly non-fatal condition affecting fans of Canadian pop star Justin Bieber.

Alternatively, researchers can go straight to the source — by using smartphone apps to ask people directly about their health. Brownstein’s team has partnered with the Skoll Global Threats Fund to develop an app called Flu Near You, through which users can voluntarily report symptoms of infection and other information. “You get more detailed demographics about age and gender and vaccination status — things that you can’t get from other sources,” says Brownstein. Ten European Union member states are involved in a similar surveillance programme known as Influenzanet, which has generally maintained 30,000–40,000 active users for seven consecutive flu seasons. These voluntary reporting systems are particularly useful for diseases such as flu, for which many people do not bother going to the doctor — although it can be hard to persuade people to participate for no immediate benefit, says Brownstein. “But we still get a good signal from the people that are willing to be a part of this.”…(More)”.

Predictive text app helps reverse gendered language


Springwise: “Research has shown that people talk differently to children depending on their gender. Without even realising it, adults tend to talk to boys in terms of their abilities, but to girls in terms of their looks. Over time, this difference can affect how children, and in particular girls, see themselves, and can affect their self-confidence. Finnish child rights organisation Plan International, in conjunction with Samsung Electronics Nordic, decided to try and change this unconscious behaviour with a predictive text app. Sheboard seeks to empower girls by raising awareness of the impacts of gendered speech.

As users are typing, Sheboard will suggest gender neutral words as well as words that are designed to empower girls, such as “I’m capable” and “I deserve”. The app also swaps stereotypical expressions with those that are more positive. The goal is to remind girls about the qualities and abilities they have. In the words of Nora Lindström, Plan’s Global Coordinator for Digital Development, “We want to help people see the impact that words have, and make them consider ways in which they can change how they talk in order to empower girls.”

In developing the app, Plan had girls and women of different ages contribute their personal empowerment phrases. Plan acknowledges that technological innovations in and of themselves won’t change gender-stereotypical behaviour. However, the hope is that the app will lead to a greater awareness and understanding of these issues in Finland and elsewhere. The app is currently available on Google Play. Sheboard joins other girls-power products such as an app that adds augmented reality statues of remarkable women to public places and toys designed to encourage girls to enter STEM fields….(More)”.

Is your software racist?


Li Zhou at Politico: “Late last year, a St. Louis tech executive named Emre Şarbak noticed something strange about Google Translate. He was translating phrases from Turkish — a language that uses a single gender-neutral pronoun “o” instead of “he” or “she.” But when he asked Google’s tool to turn the sentences into English, they seemed to read like a children’s book out of the 1950’s. The ungendered Turkish sentence “o is a nurse” would become “she is a nurse,” while “o is a doctor” would become “he is a doctor.”

The website Quartz went on to compose a sort-of poem highlighting some of these phrases; Google’s translation program decided that soldiers, doctors and entrepreneurs were men, while teachers and nurses were women. Overwhelmingly, the professions were male. Finnish and Chinese translations had similar problems of their own, Quartz noted.

What was going on? Google’s Translate tool “learns” language from an existing corpus of writing, and the writing often includes cultural patterns regarding how men and women are described. Because the model is trained on data that already has biases of its own, the results that it spits out serve only to further replicate and even amplify them.

It might seem strange that a seemingly objective piece of software would yield gender-biased results, but the problem is an increasing concern in the technology world. The term is “algorithmic bias” — the idea that artificially intelligent software, the stuff we count on to do everything from power our Netflix recommendations to determine our qualifications for a loan, often turns out to perpetuate social bias.

Voice-based assistants, like Amazon’s Alexa, have struggled to recognize different accents. A Microsoft chatbot on Twitter started spewing racist posts after learning from other users on the platform. In a particularly embarrassing example in 2015, a black computer programmer found that Google’s photo-recognition tool labeled him and a friend as “gorillas.”

Sometimes the results of hidden computer bias are insulting, other times merely annoying. And sometimes the effects are potentially life-changing….(More)”.

Selected Readings on Data, Gender, and Mobility


By Michelle Winowatan, Andrew Young, and Stefaan Verhulst

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of data, gender, and mobility was originally published in 2017.

This edition of the Selected Readings was developed as part of an ongoing project at the GovLab, supported by Data2X, in collaboration with UNICEF, DigitalGlobe, IDS (UDD/Telefonica R&D), and the ISI Foundation, to establish a data collaborative to analyze unequal access to urban transportation for women and girls in Chile. We thank all our partners for their suggestions to the below curation – in particular Leo Ferres at IDS who got us started with this collection; Ciro Cattuto and Michele Tizzoni from the ISI Foundation; and Bapu Vaitla at Data2X for their pointers to the growing data and mobility literature.

Introduction

Daily mobility is key for gender equity. Access to transportation contributes to women’s agency and independence. The ability to move from place to place safely and efficiently can allow women to access education, work, and the public domain more generally. Yet, mobility is not just a means to access various opportunities. It is also a means to enter the public domain.

Women’s mobility is a multi-layered challenge
Women’s daily mobility, however, is often hampered by social, cultural, infrastructural, and technical barriers. Cultural bias, for instance, limits women’s mobility by confining them to areas close to their homes, a result of society’s double standard that expects women to be homemakers. From an infrastructural perspective, public transportation mostly accommodates only home-to-work trips, when in reality women often make more complex trips with stops at, for example, the market, school, or a healthcare provider – sometimes called “trip chaining.” From a safety perspective, women tend to avoid making trips in certain areas and/or at certain times, due to a constant risk of being sexually harassed in public places. Women are also pushed toward more expensive transportation – such as taking a cab instead of a bus or train – based on safety concerns.

The growing importance of (new sources of) data
Researchers are increasingly experimenting with ways to address these interdependent problems through the analysis of diverse datasets, often collected by private sector businesses and other non-governmental entities. Gender-disaggregated mobile phone records, geospatial data, satellite imagery, and social media data, to name a few, are providing evidence-based insight into gender and mobility concerns. Such data collaboratives – the exchange of data across sectors to create public value – can help governments, international organizations, and other public sector entities in the move toward more inclusive urban and transportation planning, and the promotion of gender equity.
The below curated set of readings seeks to focus on the following areas:

  1. Insights on how data can inform gender empowerment initiatives,
  2. Emergent research into the capacity of new data sources – like call detail records (CDRs) and satellite imagery – to increase our understanding of human mobility patterns, and
  3. Publications exploring data-driven policy for gender equity in mobility.

Readings are listed in alphabetical order.

We selected the readings based upon their focus (gender and/or mobility related); scope and representativeness (going beyond one project or context); type of data used (such as CDRs and satellite imagery); and date of publication.

Annotated Reading List

Data and Gender

Blumenstock, Joshua, and Nathan Eagle. Mobile Divides: Gender, Socioeconomic Status, and Mobile Phone Use in Rwanda. ACM Press, 2010.

  • Using traditional survey and mobile phone operator data, this study analyzes gender and socioeconomic divides in mobile phone use in Rwanda, where it finds that the use of mobile phones is significantly more prevalent among men and the upper classes.
  • The study also shows the differences in the way men and women use phones, for example: women are more likely to use a shared phone than men.
  • The authors frame their findings around gender and economic inequality in the country to the end of providing pointers for government action.

Bosco, Claudio, et al. Mapping Indicators of Female Welfare at High Spatial Resolution. WorldPop and Flowminder, 2015.

  • This report focuses on early adolescence in girls, which often comes with a higher risk of violence, fewer economic opportunities, and restrictions on mobility. Significant data gaps, as well as methodological and ethical issues surrounding data collection for girls, also make it harder for policymakers to create evidence-based policy to address those issues.
  • The authors analyze geolocated household survey data, using statistical models and validation techniques, and create high-resolution maps of various sex-disaggregated indicators, such as nutrition level, access to contraception, and literacy, to better inform local policy making processes.
  • Further, it identifies the gender data gap and issues surrounding gender data collection, and provides arguments for why having comprehensive data can help create better policy and contribute to the achievement of the Sustainable Development Goals (SDGs).

Buvinic, Mayra, Rebecca Furst-Nichols, and Gayatri Koolwal. Mapping Gender Data Gaps. Data2X, 2014.

  • This study identifies gaps in gender data in developing countries on health, education, economic opportunities, political participation, and human security issues.
  • It recommends ways to close the gender data gap through censuses and micro-level surveys and through service and administrative records, and emphasizes how “big data” in particular can supply the missing data needed to measure progress in women’s and girls’ well-being. The authors argue that identifying these gaps is key to advancing gender equality and women’s empowerment, one of the SDGs.

Catalyzing Inclusive Financial Systems: Chile’s Commitment to Women’s Data. Data2X, 2014.

  • This article analyzes global and national data in the banking sector to fill the gap in sex-disaggregated data in Chile. The purpose of the study is to describe the difference in spending behavior and priorities between women and men, identify the challenges for women in accessing financial services, and create policies that promote women’s inclusion in Chile.

Ready to Measure: Twenty Indicators for Monitoring SDG Gender Targets. Open Data Watch and Data2X, 2016.

  • Using readily available data this study identifies 20 SDG indicators related to gender issues that can serve as a baseline measurement for advancing gender equality, such as percentage of women aged 20-24 who were married or in a union before age 18 (child marriage), proportion of seats held by women in national parliament, and share of women among mobile telephone owners, among others.

Ready to Measure Phase II: Indicators Available to Monitor SDG Gender Targets. Open Data Watch and Data2X, 2017.

  • The Phase II paper is an extension of the Ready to Measure Phase I above. Where Phase I identifies the readily available data to measure women’s and girls’ well-being, Phase II provides information on how to access this data and summarizes insights from it.
  • Phase II elaborates on the insights about data gathered from ready-to-measure indicators and finds that although underlying data to measure indicators of women’s and girls’ well-being is readily available in most cases, it is typically not sex-disaggregated.
  • Over one in five – 53 out of 232 – SDG indicators specifically refer to women and girls. However, further analysis from this study reveals that at least 34 more indicators should be disaggregated by sex. For instance, there should be 15 more sex-disaggregated indicators for SDG number 3: “Ensure healthy lives and promote well-being for all at all ages.”
  • The report recommends that national statistical agencies take the lead and exert additional effort to fill the current gender data gap for each of the SDGs by utilizing tools such as statistical models.

Reed, Philip J., Muhammad Raza Khan, and Joshua Blumenstock. Observing gender dynamics and disparities with mobile phone metadata. International Conference on Information and Communication Technologies and Development (ICTD), 2016.

  • The study analyzes mobile phone logs of millions of Pakistani residents to explore whether there is a difference in mobile phone usage behavior between men and women and to determine the extent to which gender inequality is reflected in mobile phone usage.
  • It utilizes mobile phone data to analyze the pattern of usage behavior between genders, and socioeconomic and demographic data obtained from census and advocacy groups to assess the state of gender equality in each region in Pakistan.
  • One of its findings is a strong positive correlation between proportion of female mobile phone users and education score.

Stehlé, Juliette, et al. Gender homophily from spatial behavior in a primary school: A sociometric study. 2013.

  • This paper seeks to understand homophily, a human behavior characterized by interaction with peers who share similarities ranging from “physical attributes to tastes or political opinions”. Further, it seeks to identify the magnitude of influence that a type of homophily has on social structures.
  • Focusing on gender interaction among primary-school-aged children in France, this paper collects data from wearable devices worn by 200 children over a period of 2 days and measures the physical proximity and duration of interactions among those children in the playground.
  • It finds that interaction patterns are significantly determined by the grade and class structure of the school: children belonging to the same class interact the most, and lower grades usually do not interact with higher grades.
  • From a gender lens, this study finds that mixed-gender interactions are shorter than same-gender interactions. In addition, interactions among girls last longer than interactions among boys. These findings indicate that the children in this school tend to have stronger relationships within their own gender, or what the study calls gender homophily. It further finds that gender homophily is apparent in all classes.
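
The comparison of interaction durations described above can be illustrated with a short sketch. It is not the authors' analysis: the record layout, the function name mean_duration_by_pair, and the toy contact durations are all invented for this example, which only shows how proximity records might be grouped into same-gender and mixed-gender interactions before their average durations are compared.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical contact records: (gender of child A, gender of child B, seconds).
# These values are invented for illustration and are not data from the study.
contacts = [
    ("F", "F", 120),
    ("F", "F", 80),
    ("M", "M", 60),
    ("M", "M", 40),
    ("F", "M", 20),
    ("M", "F", 40),
]


def mean_duration_by_pair(records):
    """Average contact duration for girl-girl, boy-boy and mixed pairs."""
    buckets = defaultdict(list)
    for gender_a, gender_b, seconds in records:
        # Same-gender contacts are keyed by gender; all others count as mixed.
        key = "same:" + gender_a if gender_a == gender_b else "mixed"
        buckets[key].append(seconds)
    return {key: mean(values) for key, values in buckets.items()}


summary = mean_duration_by_pair(contacts)
for pair_type, seconds in sorted(summary.items()):
    print(pair_type, seconds)
```

With toy inputs arranged this way, the mixed bucket has the shortest average and the girl-girl bucket the longest, which is the pattern the study reports. A real analysis would also need to account for how many contacts each pair type could have had.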

Data and Mobility

Bengtsson, Linus, et al. Using Mobile Phone Data to Predict the Spatial Spread of Cholera. Flowminder, 2015.

  • This study seeks to predict the spread of the 2010 cholera epidemic in Haiti using data from 2.9 million anonymous mobile phone SIM cards and reported cases of cholera from the Haitian Directorate of Health, with 78 study areas analyzed over the period of October 16 – December 16, 2010.
  • From this dataset, the study creates a mobility matrix that indicates mobile phone movement from one study area to another and combines that with the number of reported cases of cholera in the study areas to calculate the infectious pressure level of those areas.
  • The main finding of its analysis shows that the outbreak risk of a study area correlates positively with the infectious pressure level, where an infectious pressure of over 22 results in an outbreak within 7 days. Further, it finds that the infectious pressure level can inform the sensitivity and specificity of the outbreak prediction.
  • It hopes to improve infectious disease containment by identifying areas with highest risks of outbreaks.
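
The combination of a mobility matrix with reported case counts described above can be illustrated with a minimal sketch. The function name, the toy numbers, and the simple weighted-sum formula are assumptions made for illustration and are not the study's actual model; only the threshold of 22 is taken from the summary above.

```python
def infectious_pressure(mobility, cases):
    """Hypothetical simplification: the pressure on area j is the sum,
    over every other area i, of cases[i] * mobility[i][j], where
    mobility[i][j] is the share of area i's phones that moved to j."""
    n = len(cases)
    return [
        sum(cases[i] * mobility[i][j] for i in range(n) if i != j)
        for j in range(n)
    ]

# Invented toy data for three study areas (not from the study).
mobility = [
    [0.00, 0.10, 0.02],
    [0.05, 0.00, 0.01],
    [0.00, 0.20, 0.00],
]
cases = [300, 0, 50]

THRESHOLD = 22  # pressure level reported in the summary above

pressure = infectious_pressure(mobility, cases)
# Indices of areas whose pressure exceeds the reported threshold.
at_risk = [j for j, p in enumerate(pressure) if p > THRESHOLD]
print(pressure, at_risk)
```

In this toy example only the second area, which receives travelers from both areas with reported cases, exceeds the threshold.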

Calabrese, Francesco, et al. Understanding Individual Mobility Patterns from Urban Sensing Data: A Mobile Phone Trace Example. SENSEable City Lab, MIT, 2012.

  • This study compares mobile phone data and odometer readings from annual safety inspections to characterize individual mobility and vehicular mobility in the Boston Metropolitan Area, measured by the average daily total trip length of mobile phone users and average daily Vehicular Kilometers Traveled (VKT).
  • The study found that “accessibility to work and non-work destinations are the two most important factors in explaining the regional variations in individual and vehicular mobility, while the impacts of population density and land use mix on both mobility measures are insignificant.” Further, “a well-connected street network is negatively associated with daily vehicular total trip length.”
  • This study demonstrates the potential for mobile phone data to provide useful and updatable information on individual mobility patterns to inform transportation and mobility research.

Campos-Cordobés, Sergio, et al. “Chapter 5 – Big Data in Road Transport and Mobility Research.” Intelligent Vehicles. Edited by Felipe Jiménez. Butterworth-Heinemann, 2018.

  • This study outlines a number of techniques and data sources – such as geolocation information, mobile phone data, and social network observation – that could be leveraged to predict human mobility.
  • The authors also provide a number of examples of real-world applications of big data to address transportation and mobility problems, such as transport demand modeling, short-term traffic prediction, and route planning.

Lin, Miao, and Wen-Jing Hsu. Mining GPS Data for Mobility Patterns: A Survey. Pervasive and Mobile Computing vol. 12, 2014.

  • This study surveys the current field of research using high resolution positioning data (GPS) to capture mobility patterns.
  • The survey focuses on analyses related to frequently visited locations, modes of transportation, trajectory patterns, and place-based activities. The authors find “high regularity” in human mobility patterns despite high levels of variation among the mobility areas covered by individuals.

Phithakkitnukoon, Santi, Zbigniew Smoreda, and Patrick Olivier. Socio-Geography of Human Mobility: A Study Using Longitudinal Mobile Phone Data. PLoS ONE, 2012.

  • This study used a year’s call logs and location data of approximately one million mobile phone users in Portugal to analyze the association between individuals’ mobility and their social networks.
  • It measures and analyzes travel scope (locations visited) and geo-social radius (distance from friends, family, and acquaintances) to determine the association.
  • It finds that 80% of places visited are within 20 km of an individual’s nearest social ties’ location, rising to 90% within a 45 km radius. Further, as population density increases, distance between individuals and their social networks decreases.
  • The findings of this study demonstrate how mobile phone data can provide insights into “the socio-geography of human mobility”.
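
The kind of measurement behind the 20 km and 45 km figures above can be sketched as follows. This is a minimal illustration, not the authors' method: the function names and coordinates are invented, and great-circle (haversine) distance is assumed as the distance measure.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))

def share_within(visited, ties, radius_km):
    """Fraction of visited places within radius_km of the nearest social tie."""
    near = sum(
        1 for place in visited
        if min(haversine_km(place, tie) for tie in ties) <= radius_km
    )
    return near / len(visited)

# Invented example: one social tie in Lisbon and four visited places.
ties = [(38.72, -9.14)]
visited = [(38.75, -9.20), (38.80, -9.10), (41.15, -8.61), (38.70, -9.40)]

share_20 = share_within(visited, ties, 20)
share_45 = share_within(visited, ties, 45)
print(share_20, share_45)
```

Applied to real call logs, the same calculation would be repeated per user and aggregated; the sketch only shows the per-user step.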

Semanjski, Ivana, and Sidharta Gautama. Crowdsourcing Mobility Insights – Reflection of Attitude Based Segments on High Resolution Mobility Behaviour Data. vol. 71, Transportation Research, 2016.

  • Using cellphone data, this study maps attitudinal segments that explain how age, gender, occupation, household size, income, and car ownership influence an individual’s mobility patterns. This type of segment analysis is seen as particularly useful for targeted messaging.
  • The authors argue that these time- and space-specific insights could also provide value for government officials and policymakers, by, for example, allowing for evidence-based transportation pricing options and public sector advertising campaign placement.

Silveira, Lucas M., et al. MobHet: Predicting Human Mobility using Heterogeneous Data Sources. vol. 95, Computer Communications, 2016.

  • This study explores the potential of using data from multiple sources (e.g., Twitter and Foursquare), in addition to GPS data, to provide a more accurate prediction of human mobility. This heterogeneous data captures the popularity of different locations, the frequency of visits to those locations, and the relationships among people who are moving around the target area. The authors’ initial experimentation finds that combining these data sources identifies human mobility patterns more accurately.

Wilson, Robin, et al. Rapid and Near Real-Time Assessments of Population Displacement Using Mobile Phone Data Following Disasters: The 2015 Nepal Earthquake. PLOS Currents Disasters, 2016.

  • Utilizing call detail records of 12 million mobile phone users in Nepal, this study seeks spatio-temporal details of the population after the earthquake on April 25, 2015.
  • It seeks to address the problem of slow and ineffective disaster response by capturing near real-time displacement patterns from mobile phone call detail records, in order to inform humanitarian agencies on where to distribute their assistance. The preliminary results of this study were available nine days after the earthquake.
  • This project relies on foundational cooperation with a mobile phone operator, which supplied the de-identified data from 12 million users, before the earthquake.
  • The study finds that shortly after the earthquake there was an anomalous population movement out of the Kathmandu Valley, the most impacted area, to surrounding areas. The study estimates 390,000 people above normal had left the valley.

Data, Gender and Mobility

Althoff, Tim, et al. “Large-Scale Physical Activity Data Reveal Worldwide Activity Inequality.” Nature, 2017.

  • This study’s analysis of worldwide physical activity is built on a dataset containing 68 million days of physical activity of 717,527 people collected through their smartphone accelerometers.
  • The authors find a significant reduction in female activity levels in cities with high activity inequality, where high activity inequality is associated with low city walkability – walkability indicators include pedestrian facilities (city block length, intersection density, etc.) and amenities (shops, parks, etc.).
  • Further, they find that high activity inequality is associated with high levels of inactivity-related health problems, like obesity.

Borker, Girija. “Safety First: Street Harassment and Women’s Educational Choices in India.” Stop Street Harassment, 2017.

  • Using data collected from SafetiPin, an application that allows users to mark an area on a map as safe or not, and Safecity, another application that lets users share their experience of harassment in public places, the researcher analyzes the safety of travel routes surrounding different colleges in India and their effect on women’s college choices.
  • The study finds that women are willing to go to a lower ranked college in order to avoid a higher risk of street harassment. Women who choose the best college from their set of options spend an average of $250 more each year to access safer modes of transportation.

Frias-Martinez, Vanessa, Enrique Frias-Martinez, and Nuria Oliver. A Gender-Centric Analysis of Calling Behavior in a Developing Economy Using Call Detail Records. Association for the Advancement of Artificial Intelligence, 2010.

  • Using encrypted Call Detail Records (CDRs) of 10,000 participants in a developing economy, this study analyzes behavioral, social, and mobility variables to determine the gender of a mobile phone user, and finds differences in behavioral and social variables in mobile phone use between women and men.
  • It finds that women use their phones more than men in terms of number of calls made, call duration, and call expenses. Women also have larger social networks, meaning that the number of unique phone numbers they contact or are contacted by is larger. It finds no statistically significant difference between men and women in the distance traveled between calls.
  • Frias-Martinez et al. recommend taking these findings into consideration when designing a cellphone-based service.

Psylla, Ioanna, Piotr Sapiezynski, Enys Mones, Sune Lehmann. “The role of gender in social network organization.” PLoS ONE 12, December 20, 2017.

  • Using a large dataset of high resolution data collected through mobile phones, as well as detailed questionnaires, this report studies gender differences in a large cohort. The researchers consider mobility behavior and individual personality traits among a group of more than 800 university students.
  • Analyzing mobility data, they find both that women visit more unique locations over time, and that they have more homogeneous time distribution over their visited locations than men, indicating the time commitment of women is more widely spread across places.

Vaitla, Bapu. Big Data and the Well-Being of Women and Girls: Applications on the Social Scientific Frontier. Data2X, Apr. 2017.

  • In this study, the researchers use geospatial data, credit card and cell phone information, and social media posts to identify problems–such as malnutrition, education, access to healthcare, mental health–facing women and girls in developing countries.
  • From the credit card and cell phone data in particular, the report finds that analyzing patterns of women’s spending and mobility can provide useful insight into Latin American women’s “economic lifestyles.”
  • Based on this analysis, Vaitla recommends that various nontraditional big data sources be used to fill gaps in conventional data sources and to address the common invisibility of women and girls’ data in institutional databases.

New York City moves to create accountability for algorithms


Lauren Kirchner at ArsTechnica: “The algorithms that play increasingly central roles in our lives often emanate from Silicon Valley, but the effort to hold them accountable may have another epicenter: New York City. Last week, the New York City Council unanimously passed a bill to tackle algorithmic discrimination—the first measure of its kind in the country.

The algorithmic accountability bill, waiting to be signed into law by Mayor Bill de Blasio, establishes a task force that will study how city agencies use algorithms to make decisions that affect New Yorkers’ lives, and whether any of the systems appear to discriminate against people based on age, race, religion, gender, sexual orientation, or citizenship status. The task force’s report will also explore how to make these decision-making processes understandable to the public.

The bill’s sponsor, Council Member James Vacca, said he was inspired by ProPublica’s investigation into racially biased algorithms used to assess the criminal risk of defendants….

A previous, more sweeping version of the bill had mandated that city agencies publish the source code of all algorithms being used for “targeting services” or “imposing penalties upon persons or policing” and to make them available for “self-testing” by the public. At a hearing at City Hall in October, representatives from the mayor’s office expressed concerns that this mandate would threaten New Yorkers’ privacy and the government’s cybersecurity.

The bill was one of two moves the City Council made last week concerning algorithms. On Thursday, the committees on health and public safety held a hearing on the city’s forensic methods, including controversial tools that the chief medical examiner’s office crime lab has used for difficult-to-analyze samples of DNA.

As a ProPublica/New York Times investigation detailed in September, an algorithm created by the lab for complex DNA samples has been called into question by scientific experts and former crime lab employees.

The software, called the Forensic Statistical Tool, or FST, has never been adopted by any other lab in the country….(More)”.