StatCan now crowdsourcing cannabis data


Kyle Duggan at iPolitics: “The national statistics agency is launching a crowdsourcing project to find out how much weed Canadians are consuming and how much it costs them.

Statistics Canada is searching for the best picture of consumption it can find ahead of legalization, and is turning to average Canadians to improve its rough estimates about a product that’s largely been accessed illegally by the population.

Thursday it released a suite of “experimental” data that make up its current best guesses on Canadian consumption habits, along with a crowdsourcing website and app to get its own estimates – a project officials said is an experiment itself.

Statscan is also rolling out a quarterly cannabis survey this year.

The agency has been combing through historical research on legal and illegal cannabis prices, scraping price data from illegal vendors online and, for some data, is relying largely on the self-reporting website priceofweed.com to assemble as much pot information as possible, even if it’s not perfect data.

The agency has been quietly preparing for the July legalization deadline by compiling health, justice and economic datasets and scouring to fill in the blanks where it can. Come July, legal cannabis will suddenly also need to be rolled into other important data products, like the GDP accounts….(More)”.

Nudge Units to Improve the Delivery of Health Care


Mitesh S. Patel et al in The New England Journal of Medicine: “The final common pathway for the application of nearly every advance in medicine is human behavior. No matter how effective a drug, how protective a vaccine, or how targeted a therapy may be, a clinician usually has to prescribe it, and a patient accept and use it as directed, for it to improve health. Clinicians’ and patients’ environments influence their decisions about taking these actions, and the seemingly subtle design of information and choices can have outsize effects on our behavior. When the “choice architecture” is designed to influence behavior in a predictable way but without restricting choice, it is often called a “nudge.”…

In 2016, we launched the Penn Medicine Nudge Unit to systematically develop and test approaches using nudges to improve health care delivery. The goals are to improve health care value and outcomes, advance knowledge about how to best implement nudges for impact, evaluate our efforts, and disseminate our findings. Ideas are generated by health system leadership, frontline clinicians and staff, and members of the unit itself. Our early successes and failures reveal some lessons about the role that nudge units can play in improving health care (see table).

First, these units can help health systems understand when it makes sense to use a nudge and when it doesn’t. Nudges can be designed to remind, guide, or motivate behavior. Promising opportunities are those in which suboptimal care can be addressed by targeting a specific decision that drives a less-than-optimal behavior. For example, when prescribing medications, physicians must decide between brand-name and generic formulations. Systems can set generics as the default choice, so that ordering them becomes the path of least resistance even as the ability to opt out and order a brand-name drug is preserved. When we implemented this change in our EHR, prescribing rates for generics increased from 75% to 98%. Clinical settings also play an important role. We found that reducing the default duration of opioid prescriptions may make sense for acute conditions often seen by clinicians in the emergency department but may be inappropriate for clinicians caring for patients with chronic pain.

Second, although nudges have typically been deployed for simple one-off decisions, we’ve found that they can also support more complex decision paths. For example, only 15% of our eligible patients were being referred for cardiac rehabilitation after myocardial infarction. When we asked the cardiologists why, we discovered that the process remained manual, so they had to take action to initiate the referrals — in other words, it was an opt-in system. The process was redesigned as an opt-out system in which referral for rehab was the default; patient identification was automated using the EHR; staff were notified using secure text messaging; and processes were restructured so that cardiologists signed orders in a template for referral on rounds and staff met with patients to set up rehab placement before discharge. The referral rate increased to more than 80%.

Third, stakeholder alignment is critical to nudges’ success. ….

Fourth, nudges can lead to unintended behavior that’s invisible without proper evaluation….(More)”.

 

The Future Computed: Artificial Intelligence and its role in society


Brad Smith at the Microsoft Blog: “Today Microsoft is releasing a new book, The Future Computed: Artificial Intelligence and its role in society. The two of us have written the foreword for the book, and our teams collaborated to write its contents. As the title suggests, the book provides our perspective on where AI technology is going and the new societal issues it has raised.

On a personal level, our work on the foreword provided an opportunity to step back and think about how much technology has changed our lives over the past two decades and to consider the changes that are likely to come over the next 20 years. In 1998, we both worked at Microsoft, but on opposite sides of the globe. While we lived on separate continents and in quite different cultures, we shared similar experiences and daily routines which were managed by manual planning and movement. Twenty years later, we take for granted the digital world that was once the stuff of science fiction.

Technology – including mobile devices and cloud computing – has fundamentally changed the way we consume news, plan our day, communicate, shop and interact with our family, friends and colleagues. Two decades from now, what will our world look like? At Microsoft, we imagine that artificial intelligence will help us do more with one of our most precious commodities: time. By 2038, personal digital assistants will be trained to anticipate our needs, help manage our schedule, prepare us for meetings, assist as we plan our social lives, reply to and route communications, and drive cars.

Beyond our personal lives, AI will enable breakthrough advances in areas like healthcare, agriculture, education and transportation. It’s already happening in impressive ways.

But as we’ve witnessed over the past 20 years, new technology also inevitably raises complex questions and broad societal concerns. As we look to a future powered by a partnership between computers and humans, it’s important that we address these challenges head on.

How do we ensure that AI is designed and used responsibly? How do we establish ethical principles to protect people? How should we govern its use? And how will AI impact employment and jobs?

To answer these tough questions, technologists will need to work closely with government, academia, business, civil society and other stakeholders. At Microsoft, we’ve identified six ethical principles – fairness, reliability and safety, privacy and security, inclusivity, transparency, and accountability – to guide the cross-disciplinary development and use of artificial intelligence. The better we understand these or similar issues — and the more technology developers and users can share best practices to address them — the better served the world will be as we contemplate societal rules to govern AI.

We must also pay attention to AI’s impact on workers. What jobs will AI eliminate? What jobs will it create? If there has been one constant over 250 years of technological change, it has been the ongoing impact of technology on jobs — the creation of new jobs, the elimination of existing jobs and the evolution of job tasks and content. This too is certain to continue.

Some key conclusions are emerging….

The Future Computed is available here and additional content related to the book can be found here.”

Satellites Predict a Cholera Outbreak Weeks in Advance


Sarah Derouin at Scientific American: “Orbiting satellites can warn us of bad weather and help us navigate to that new taco joint. Scientists are also using data satellites to solve a worldwide problem: predicting cholera outbreaks.

Cholera infects millions of people each year, leading to thousands of deaths. Often communities do not realize an epidemic is underway until infected individuals swarm hospitals. Advanced warning for impending epidemics could help health workers prepare for the onslaught—stockpiling rehydration supplies, medicines and vaccines—which can save lives and quell the disease’s spread. Back in May 2017 a team of scientists used satellite information to assess whether an outbreak would occur in Yemen, and they ended up predicting an outburst that spread across the country in June….

At the American Geophysical Union annual meeting in December, Jutla presented the group’s prediction model of cholera for Yemen. The team used a handful of satellites to monitor temperatures, water storage, precipitation and land around the country. By processing that information in algorithms they developed, the team predicted areas most at risk for an outbreak over the upcoming month.

Weeks later an epidemic occurred that closely resembled what the model had predicted. “It was something we did not expect,” Jutla says, because they had built the algorithms—and calibrated and validated them—on data from the Bengal Delta in southern Asia as well as parts of Africa. They were unable to go into war-torn Yemen directly, however. For those reasons, the team had not informed Yemen officials of the predicted June outbreak….(More).”

Government data: How open is too open?


Sharon Fisher at HPE: “The notion of “open government” appeals to both citizens and IT professionals seeking access to freely available government data. But is there such a thing as data access being too open? Governments may want to be transparent, yet they need to avoid releasing personally identifiable information.

There’s no question that open government data offers many benefits. It gives citizens access to the data their taxes paid for, enables government oversight, and powers the applications developed by government, vendors, and citizens that improve people’s lives.

However, data breaches and concerns about the amount of data that government is collecting makes some people wonder: When is it too much?

“As we think through the big questions about what kind of data a state should collect, how it should use it, and how to disclose it, these nuances become not some theoretical issue but a matter of life and death to some people,” says Alexander Howard, deputy director of the Sunlight Foundation, a Washington nonprofit that advocates for open government. “There are people in government databases where the disclosure of their [physical] location is the difference between a life-changing day and Wednesday.

Open data supporters point out that much of this data has been considered a public record all along and tout the value of its use in analytics. But having personal data aggregated in a single place that is accessible online—as opposed to, say, having to go to an office and physically look up each record—makes some people uneasy.

Privacy breaches, wholesale

“We’ve seen a real change in how people perceive privacy,” says Michael Morisy, executive director at MuckRock, a Cambridge, Massachusetts, nonprofit that helps media and citizens file public records requests. “It’s been driven by a long-standing concept in transparency: practical obscurity.” Even if something was technically a public record, effort needed to be expended to get one’s hands on it. That amount of work might be worth it about, say, someone running for office, but on the whole, private citizens didn’t have to worry. Things are different now, says Morisy. “With Google, and so much data being available at the click of a mouse or the tap of a phone, what was once practically obscure is now instantly available.”

People are sometimes also surprised to find out that public records can contain their personally identifiable information (PII), such as addresses, phone numbers, and even Social Security numbers. That may be on purpose or because someone failed to redact the data properly.

That’s had consequences. Over the years, there have been a number of incidents in which PII from public records, including addresses, was used to harass and sometimes even kill people. For example, in 1989, Rebecca Schaeffer was murdered by a stalker who learned her address from the Department of Motor Vehicles. Other examples of harassment via driver’s license numbers include thieves who tracked down the address of owners of expensive cars and activists who sent anti-abortion literature to women who had visited health clinics that performed abortions.

In response, in 1994, Congress enacted the Driver’s Privacy Protection Act to restrict the sale of such data. More recently, the state of Idaho passed a law protecting the identity of hunters who shot wolves, because the hunters were being harassed by wolf supporters. Similarly, the state of New York allowed concealed pistol permit holders to make their name and address private after a newspaper published an online interactive map showing the names and addresses of all handgun permit holders in Westchester and Rockland counties….(More)”.

When census taking is a recipe for controversy


Anjana Ahuja in the Financial Times: “Population counts are important tools for development, but also politically fraught…The UN describes a census as “among the most complex and massive peacetime exercises a nation undertakes”. Given that social trends, migration patterns and inequalities can be determined from questions that range from health to wealth, housing and even religious beliefs, censuses can also be controversial. So it is with the next one in the US, due to be conducted in 2020. The US Department of Justice has proposed that participants should be quizzed on their citizenship status. Vanita Gupta, president of the Leadership Conference on Civil and Human Rights, warned the journal Science that many would refuse to take part. Ms Gupta said that, in the current political climate, enquiring about citizenship “would destroy any chance of an accurate count, discard years of careful research and increase costs significantly”.

The row has taken on a new urgency because the 2020 census must be finalised by April. The DoJ claims that a citizenship question will ensure that ethnic minorities are treated fairly in the voting process. Currently, only about one in six households is asked about citizenship, with the results extrapolated for the whole population, a process observers say is statistically acceptable and less intrusive. In 2011, the census for England and Wales asked for country of birth and passports held — but not citizenship explicitly. It is one of those curious cases when fewer questions might lead to more accurate and useful data….(More)”.

Cops, Docs, and Code: A Dialogue between Big Data in Health Care and Predictive Policing


Paper by I. Glenn Cohen and Harry Graver: “Big data” has become the ubiquitous watchword of this decade. Predictive analytics, which is something we want to do with big data — to use of electronic algorithms to forecast future events in real time. Predictive analytics is interfacing with the law in a myriad of settings: how votes are counted and voter rolls revised, the targeting of taxpayers for auditing, the selection of travelers for more intensive searching, pharmacovigilance, the creation of new drugs and diagnostics, etc.

In this paper, written for the symposium “Future Proofing the Law,” we want to engage in a bit of legal arbitrage; that is, we want to examine which insights from legal analysis of predictive analytics in better-trodden ground — predictive policing — can be useful for understanding relatively newer ground for legal scholars — the use of predictive analytics in health care. To the degree lessons can be learned from this dialogue, we think they go in both directions….(More)”.

Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor


Book by Virginia Eubanks: “The State of Indiana denies one million applications for healthcare, foodstamps and cash benefits in three years—because a new computer system interprets any mistake as “failure to cooperate.” In Los Angeles, an algorithm calculates the comparative vulnerability of tens of thousands of homeless people in order to prioritize them for an inadequate pool of housing resources. In Pittsburgh, a child welfare agency uses a statistical model to try to predict which children might be future victims of abuse or neglect.

Since the dawn of the digital age, decision-making in finance, employment, politics, health and human services has undergone revolutionary change. Today, automated systems—rather than humans—control which neighborhoods get policed, which families attain needed resources, and who is investigated for fraud. While we all live under this new regime of data, the most invasive and punitive systems are aimed at the poor.

In Automating Inequality, Virginia Eubanks systematically investigates the impacts of data mining, policy algorithms, and predictive risk models on poor and working-class people in America. The book is full of heart-wrenching and eye-opening stories, from a woman in Indiana whose benefits are literally cut off as she lays dying to a family in Pennsylvania in daily fear of losing their daughter because they fit a certain statistical profile.

The U.S. has always used its most cutting-edge science and technology to contain, investigate, discipline and punish the destitute. Like the county poorhouse and scientific charity before them, digital tracking and automated decision-making hide poverty from the middle-class public and give the nation the ethical distance it needs to make inhumane choices: which families get food and which starve, who has housing and who remains homeless, and which families are broken up by the state. In the process, they weaken democracy and betray our most cherished national values….(More)”.

“Crowdsourcing” ten years in: A review


Kerri Wazny at the Journal of Global Health: “First coined by Howe in 2006, the field of crowdsourcing has grown exponentially. Despite its growth and its transcendence across many fields, the definition of crowdsourcing has still not been agreed upon, and examples are poorly indexed in peer–reviewed literature. Many examples of crowdsourcing have not been scaled–up past the pilot phase.

In spite of this, crowdsourcing has great potential, especially in global health where resources are lacking. This narrative review seeks to review both indexed and grey crowdsourcing literature broadly in order to explore the current state of the field….(More)”.

Selected Readings on Data, Gender, and Mobility


By Michelle Winowatan, Andrew Young, and Stefaan Verhulst

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of data, gender, and mobility was originally published in 2017.

This edition of the Selected Readings was  developed as part of an ongoing project at the GovLab, supported by Data2X, in collaboration with UNICEF, DigitalGlobe, IDS (UDD/Telefonica R&D), and the ISI Foundation, to establish a data collaborative to analyze unequal access to urban transportation for women and girls in Chile. We thank all our partners for their suggestions to the below curation – in particular Leo Ferres at IDS who got us started with this collection; Ciro Cattuto and Michele Tizzoni from the ISI Foundation; and Bapu Vaitla at Data2X for their pointers to the growing data and mobility literature. 

Introduction

Daily mobility is key for gender equity. Access to transportation contributes to women’s agency and independence. The ability to move from place to place safely and efficiently can allow women to access education, work, and the public domain more generally. Yet, mobility is not just a means to access various opportunities. It is also a means to enter the public domain.

Women’s mobility is a multi-layered challenge
Women’s daily mobility, however, is often hampered by social, cultural, infrastructural, and technical barriers. Cultural bias, for instance, limits women mobility in a way that women are confined to an area with close proximity to their house due to society’s double standard on women to be homemakers. From an infrastructural perspective, public transportation mostly only accommodates home-to-work trips, when in reality women often make more complex trips with stops, for example, at the market, school, healthcare provider – sometimes called “trip chaining.” From a safety perspective, women tend to avoid making trips in certain areas and/or at certain time, due to a constant risk of being sexually harassed on public places. Women are also pushed toward more expensive transportation – such as taking a cab instead of a bus or train – based on safety concerns.

The growing importance of (new sources of) data
Researchers are increasingly experimenting with ways to address these interdependent problems through the analysis of diverse datasets, often collected by private sector businesses and other non-governmental entities. Gender-disaggregated mobile phone records, geospatial data, satellite imagery, and social media data, to name a few, are providing evidence-based insight into gender and mobility concerns. Such data collaboratives – the exchange of data across sectors to create public value – can help governments, international organizations, and other public sector entities in the move toward more inclusive urban and transportation planning, and the promotion of gender equity.
The below curated set of readings seek to focus on the following areas:

  1. Insights on how data can inform gender empowerment initiatives,
  2. Emergent research into the capacity of new data sources – like call detail records (CDRs) and satellite imagery – to increase our understanding of human mobility patterns, and
  3. Publications exploring data-driven policy for gender equity in mobility.

Readings are listed in alphabetical order.

We selected the readings based upon their focus (gender and/or mobility related); scope and representativeness (going beyond one project or context); type of data used (such as CDRs and satellite imagery); and date of publication.

Annotated Reading List

Data and Gender

Blumenstock, Joshua, and Nathan Eagle. Mobile Divides: Gender, Socioeconomic Status, and Mobile Phone Use in Rwanda. ACM Press, 2010.

  • Using traditional survey and mobile phone operator data, this study analyzes gender and socioeconomic divides in mobile phone use in Rwanda, where it is found that the use of mobile phones is significantly more prevalent in men and the higher class.
  • The study also shows the differences in the way men and women use phones, for example: women are more likely to use a shared phone than men.
  • The authors frame their findings around gender and economic inequality in the country to the end of providing pointers for government action.

Bosco, Claudio, et al. Mapping Indicators of Female Welfare at High Spatial Resolution. WorldPop and Flowminder, 2015.

  • This report focuses on early adolescence in girls, which often comes with higher risk of violence, fewer economic opportunity, and restrictions on mobility. Significant data gaps, methodological and ethical issues surrounding data collection for girls also create barriers for policymakers to create evidence-based policy to address those issues.
  • The authors analyze geolocated household survey data, using statistical models and validation techniques, and creates high-resolution maps of various sex-disaggregated indicators, such as nutrition level, access to contraception, and literacy, to better inform local policy making processes.
  • Further, it identifies the gender data gap and issues surrounding gender data collection, and provides arguments for why having a comprehensive data can help create better policy and contribute to the achievements of the Sustainable Development Goals (SDGs).

Buvinic, Mayra, Rebecca Furst-Nichols, and Gayatri Koolwal. Mapping Gender Data Gaps. Data2X, 2014.

  • This study identifies gaps in gender data in developing countries on health, education, economic opportunities, political participation, and human security issues.
  • It recommends ways to close the gender data gap through censuses and micro-level surveys, service and administrative records, and emphasizes how “big data” in particular can fill the missing data that will be able to measure the progress of women and girls well being. The authors argue that dentifying these gaps is key to advancing gender equality and women’s empowerment, one of the SDGs.

Catalyzing Inclusive FInancial System: Chile’s Commitment to Women’s Data. Data2X, 2014.

  • This article analyzes global and national data in the banking sector to fill the gap of sex-disaggregated data in Chile. The purpose of the study is to describe the difference in spending behavior and priorities between women and men, identify the challenges for women in accessing financial services, and create policies that promote women inclusion in Chile.

Ready to Measure: Twenty Indicators for Monitoring SDG Gender Targets. Open Data Watch and Data2X, 2016.

  • Using readily available data this study identifies 20 SDG indicators related to gender issues that can serve as a baseline measurement for advancing gender equality, such as percentage of women aged 20-24 who were married or in a union before age 18 (child marriage), proportion of seats held by women in national parliament, and share of women among mobile telephone owners, among others.

Ready to Measure Phase II: Indicators Available to Monitor SDG Gender Targets. Open Data Watch and Data2X, 2017.

  • The Phase II paper is an extension of the Ready to Measure Phase I above. Where Phase I identifies the readily available data to measure women and girls well-being, Phase II provides informations on how to access and summarizes insights from this data.
  • Phase II elaborates the insights about data gathered from ready to measure indicators and finds that although underlying data to measure indicators of women and girls’ wellbeing is readily available in most cases, it is typically not sex-disaggregated.
  • Over one in five – 53 out of 232 – SDG indicators specifically refer to women and girls. However, further analysis from this study reveals that at least 34 more indicators should be disaggregated by sex. For instance, there should be 15 more sex-disaggregated indicators for SDG number 3: “Ensure healthy lives and promote well-being for all at all ages.”
  • The report recommends national statistical agencies to take the lead and assert additional effort to fill the data gap by utilizing tools such as the statistical model to fill the current gender data gap for each of the SDGs.

Reed, Philip J., Muhammad Raza Khan, and Joshua Blumenstock. Observing gender dynamics and disparities with mobile phone metadata. International Conference on Information and Communication Technologies and Development (ICTD), 2016.

  • The study analyzes mobile phone logs of millions of Pakistani residents to explore whether there is a difference in mobile phone usage behavior between male and female and determine the extent to which gender inequality is reflected in mobile phone usage.
  • It utilizes mobile phone data to analyze the pattern of usage behavior between genders, and socioeconomic and demographic data obtained from census and advocacy groups to assess the state of gender equality in each region in Pakistan.
  • One of its findings is a strong positive correlation between proportion of female mobile phone users and education score.

Stehlé, Juliette, et al. Gender homophily from spatial behavior in a primary school: A sociometric study. 2013.

    • This paper seeks to understand homophily, a human behavior characterizes by interaction with peers who have similarities in “physical attributes to tastes or political opinions”. Further, it seeks to identify the magnitude of influence, a type of homophily has to social structures.
    • Focusing on gender interaction among primary school aged children in France, this paper collects data from wearable devices from 200 children in the period of 2 days and measure the physical proximity and duration of the interaction among those children in the playground.
  • It finds that interaction patterns are significantly determined by grade and class structure of the school. Meaning that children belonging to the same class have most interactions, and that lower grades usually do not interact with higher grades.
  • From a gender lens, this study finds that mixed-gender interaction lasts shorter relative to same-gender interaction. In addition, interaction among girls is also longer compared to interaction among boys. These indicate that the children in this school tend to have stronger relationships within their own gender, or what the study calls gender homophily. It further finds that gender homophily is apparent in all classes.

Data and Mobility

Bengtsson, Linus, et al. Using Mobile Phone Data to Predict the Spatial Spread of Cholera. Flowminder, 2015.

  • This study seeks to predict the 2010 cholera epidemic in Haiti using 2.9 million anonymous mobile phone SIM cards and reported cases of Cholera from the Haitian Directorate of Health, where 78 study areas were analyzed in the period of October 16 – December 16, 2010.
  • From this dataset, the study creates a mobility matrix that indicates mobile phone movement from one study area to another and combines that with the number of reported case of cholera in the study areas to calculate the infectious pressure level of those areas.
  • The main finding of its analysis shows that the outbreak risk of a study area correlates positively with the infectious pressure level, where an infectious pressure of over 22 results in an outbreak within 7 days. Further, it finds that the infectious pressure level can inform the sensitivity and specificity of the outbreak prediction.
  • It hopes to improve infectious disease containment by identifying areas with highest risks of outbreaks.

Calabrese, Francesco, et al. Understanding Individual Mobility Patterns from Urban Sensing Data: A Mobile Phone Trace Example. SENSEable City Lab, MIT, 2012.

  • This study compares mobile phone data and odometer readings from annual safety inspections to characterize individual mobility and vehicular mobility in the Boston Metropolitan Area, measured by the average daily total trip length of mobile phone users and average daily Vehicular Kilometers Traveled (VKT).
  • The study found that, “accessibility to work and non-work destinations are the two most important factors in explaining the regional variations in individual and vehicular mobility, while the impacts of populations density and land use mix on both mobility measures are insignificant.” Further, “a well-connected street network is negatively associated with daily vehicular total trip length.”
  • This study demonstrates the potential for mobile phone data to provide useful and updatable information on individual mobility patterns to inform transportation and mobility research.

Campos-Cordobés, Sergio, et al. “Chapter 5 – Big Data in Road Transport and Mobility Research.” Intelligent Vehicles. Edited by Felipe Jiménez. Butterworth-Heinemann, 2018.

  • This study outlines a number of techniques and data sources – such as geolocation information, mobile phone data, and social network observation – that could be leveraged to predict human mobility.
  • The authors also provide a number of examples of real-world applications of big data to address transportation and mobility problems, such as transport demand modeling, short-term traffic prediction, and route planning.

Lin, Miao, and Wen-Jing Hsu. Mining GPS Data for Mobility Patterns: A Survey. Pervasive and Mobile Computing vol. 12,, 2014.

  • This study surveys the current field of research using high resolution positioning data (GPS) to capture mobility patterns.
  • The survey focuses on analyses related to frequently visited locations, modes of transportation, trajectory patterns, and placed-based activities. The authors find “high regularity” in human mobility patterns despite high levels of variation among the mobility areas covered by individuals.

Phithakkitnukoon, Santi, Zbigniew Smoreda, and Patrick Olivier. Socio-Geography of Human Mobility: A Study Using Longitudinal Mobile Phone Data. PLoS ONE, 2012.

  • This study used a year’s call logs and location data of approximately one million mobile phone users in Portugal to analyze the association between individuals’ mobility and their social networks.
  • It measures and analyze travel scope (locations visited) and geo-social radius (distance from friends, family, and acquaintances) to determine the association.
  • It finds that 80% of places visited are within 20 km of an individual’s nearest social ties’ location and it rises to 90% at 45 km radius. Further, as population density increases, distance between individuals and their social networks decreases.
  • The findings in this study demonstrates how mobile phone data can provide insights to “the socio-geography of human mobility”.

Semanjski, Ivana, and Sidharta Gautama. Crowdsourcing Mobility Insights – Reflection of Attitude Based Segments on High Resolution Mobility Behaviour Data. vol. 71, Transportation Research, 2016.

  • Using cellphone data, this study maps attitudinal segments that explain how age, gender, occupation, household size, income, and car ownership influence an individual’s mobility patterns. This type of segment analysis is seen as particularly useful for targeted messaging.
  • The authors argue that these time- and space-specific insights could also provide value for government officials and policymakers, by, for example, allowing for evidence-based transportation pricing options and public sector advertising campaign placement.

Silveira, Lucas M., et al. MobHet: Predicting Human Mobility using Heterogeneous Data Sources. vol. 95, Computer Communications , 2016.

  • This study explores the potential of using data from multiple sources (e.g., Twitter and Foursquare), in addition to GPS data, to provide a more accurate prediction of human mobility. This heterogenous data captures popularity of different locations, frequency of visits to those locations, and the relationships among people who are moving around the target area. The authors’ initial experimentation finds that the combination of these sources of data are demonstrated to be more accurate in identifying human mobility patterns.

Wilson, Robin, et al. Rapid and Near Real-Time Assessments of Population Displacement Using Mobile Phone Data Following Disasters: The 2015 Nepal Earthquake. PLOS Current Disasters, 2016.

  • Utilizing call detail records of 12 million mobile phone users in Nepal, this study seeks spatio-temporal details of the population after the earthquake on April 25, 2015.
  • It seeks to answer the problem of slow and ineffective disaster response, by capturing near real-time displacement pattern provided by mobile phone call detail records, in order to inform humanitarian agencies on where to distribute their assistance. The preliminary results of this study were available nine days after the earthquake.
  • This project relies on the foundational cooperation with mobile phone operator, who supplied the de-identified data from 12 million users, before the earthquake.
  • The study finds that shortly after the earthquake there was an anomalous population movement out of the Kathmandu Valley, the most impacted area, to surrounding areas. The study estimates 390,000 people above normal had left the valley.

Data, Gender and Mobility

Althoff, Tim, et al. “Large-Scale Physical Activity Data Reveal Worldwide Activity Inequality.” Nature, 2017.

  • This study’s analysis of worldwide physical activity is built on a dataset containing 68 million days of physical activity of 717,527 people collected through their smartphone accelerometers.
  • The authors find a significant reduction in female activity levels in cities with high active inequality, where high active inequality is associated with low city walkability – walkability indicators include pedestrian facilities (city block length, intersection density, etc.) and amenities (shops, parks, etc.).
  • Further, they find that high active inequality is associated with high levels of inactivity-related health problems, like obesity.

Borker, Girija. “Safety First: Street Harassment and Women’s Educational Choices in India.” Stop Street Harassment, 2017.

  • Using data collected from SafetiPin, an application that allows user to mark an area on a map as safe or not, and Safecity, another application that lets users share their experience of harassment in public places, the researcher analyzes the safety of travel routes surrounding different colleges in India and their effect on women’s college choices.
  • The study finds that women are willing to go to a lower ranked college in order to avoid higher risk of street harassment. Women who choose the best college from their set of options, spend an average of $250 more each year to access safer modes of transportation.

Frias-Martinez, Vanessa, Enrique Frias-Martinez, and Nuria Oliver. A Gender-Centric Analysis of Calling Behavior in a Developing Economy Using Call Detail Records. Association for the Advancement of Articial Intelligence, 2010.

  • Using encrypted Call Detail Records (CDRs) of 10,000 participants in a developing economy, this study analyzes the behavioral, social, and mobility variables to determine the gender of a mobile phone user, and finds that there is a difference in behavioral and social variables in mobile phone use between female and male.
  • It finds that women have higher usage of phone in terms of number of calls made, call duration, and call expenses compared to men. Women also have bigger social network, meaning that the number of unique phone numbers that contact or get contacted is larger. It finds no statistically significant difference in terms of distance made between calls in men and women.
  • Frias-Martinez et al recommends to take these findings into consideration when designing a cellphone based service.

Psylla, Ioanna, Piotr Sapiezynski, Enys Mones, Sune Lehmann. “The role of gender in social network organization.” PLoS ONE 12, December 20, 2017.

  • Using a large dataset of high resolution data collected through mobile phones, as well as detailed questionnaires, this report studies gender differences in a large cohort. The researchers consider mobility behavior and individual personality traits among a group of more than 800 university students.
  • Analyzing mobility data, they find both that women visit more unique locations over time, and that they have more homogeneous time distribution over their visited locations than men, indicating the time commitment of women is more widely spread across places.

Vaitla, Bapu. Big Data and the Well-Being of Women and Girls: Applications on the Social Scientific Frontier. Data2X, Apr. 2017.

  • In this study, the researchers use geospatial data, credit card and cell phone information, and social media posts to identify problems–such as malnutrition, education, access to healthcare, mental health–facing women and girls in developing countries.
  • From the credit card and cell phone data in particular, the report finds that analyzing patterns of women’s spending and mobility can provide useful insight into Latin American women’s “economic lifestyles.”
  • Based on this analysis, Vaitla recommends that various untraditional big data be used to fill gaps in conventional data sources to address the common issues of invisibility of women and girls’ data in institutional databases.