The Paths to Digital Self-Determination – A Foundational Theoretical Framework


Paper by Nydia Remolina and Mark Findlay: “A deluge of data is giving rise to new understandings and experiences of society and economy as our digital footprint grows steadily. Are data subjects able to determine themselves in this data-driven society? The emerging debates about autonomy and communal responsibility in the context of data access or protection, highlight a pressing imperative to re-imagine the ‘self’ in the digital space. Empowerment, autonomy, sovereignty, human centricity, are all terms often associated with the notion of digital self-determination in current policy language. More academics, industry experts, policymakers, regulators are now advocating self-determination in a data-driven world. The attitudes to self-determination range from alienating data as property through to broad considerations of communal access and enrichment. Digital self-determination is a complex notion to be viewed from different perspectives and in unique spaces, re-shaping what we understand as self-determination in the non-digital world. This paper explores the notion of digital self-determination by presenting a foundational theoretical framework based on pre-existent self-determination theories and exploring the implications of the digital society in the determination of the self. Only by better appreciating and critically framing the discussion of digital self-determination, is it possible to engage in trustworthy data spaces, and ensure ethical human-centric approaches when living in a data driven society….(More)”.

The Case for Open Land-Data Systems


Tim Hanstad at Project Syndicate: “Last month, a former Zimbabwean cabinet minister was arrested for illegally selling parcels of state land. A few days earlier, a Malaysian court convicted the ex-chairman of a state-owned land development agency of corruption. And in January, the Estonian government collapsed amid allegations of corrupt property dealings. These recent events all turned the spotlight on the growing but neglected threat of land-related corruption.

Such corruption can flourish in countries that are unprepared to manage the heightened demand for land that accompanies economic and population growth. Land governance in these countries – institutions, policies, rules, and records for managing land rights and use – is underdeveloped, which undermines the security of citizens’ land rights and enables covert land grabs by the well connected.

In Ghana, for example, the government keeps land records for only about 2% of currently operating farms; the ownership of the remainder is largely undocumented. In India, these records were, until recently, often kept in disorganized stacks in government offices.

Under such circumstances, corruption becomes relatively easy and lucrative. After all, when recordkeeping is nonexistent or chaotic, who can confidently identify the rightful owner of a parcel of land? As the United Nations Food and Agriculture Organization and Transparency International put it in a report a decade ago, “where land governance is deficient, high levels of corruption often flourish.” This corruption “is pervasive and without effective means of control.”

Globally, one in five people report having paid a bribe to access land services. In Africa, two out of three people believe the rich are likely to pay bribes or use their connections to grab land. Uncertainty about land rights can also affect housing security – around a billion people worldwide say they expect to be forced from their homes over the next five years.

Inevitably, the marginalized and vulnerable are the worst affected, whether they are widows driven from their homes by speculators or entire communities subjected to forced eviction by developers. Weak land rights and corruption also fuel conflict within communities, such as in Kenya, where political parties promise already-occupied land to supporters in an attempt to win votes.

But there is reason for hope. The ongoing revolution in information and communications technology provides unprecedented opportunities to digitize and open land records. Doing so would clarify the land rights of hundreds of millions of people globally and limit the scope for corrupt practices….(More)”.

Governing Privacy in Knowledge Commons


Open Access Book edited by Madelyn Rose Sanfilippo et al: “…explores how privacy impacts knowledge production, community formation, and collaborative governance in diverse contexts, ranging from academia and IoT, to social media and mental health. Using nine new case studies and a meta-analysis of previous knowledge commons literature, the book integrates the Governing Knowledge Commons framework with Helen Nissenbaum’s Contextual Integrity framework. The multidisciplinary case studies show that personal information is often a key component of the resources created by knowledge commons. Moreover, even when it is not the focus of the commons, personal information governance may require community participation and boundaries. Taken together, the chapters illustrate the importance of exit and voice in constructing and sustaining knowledge commons through appropriate personal information flows. They also shed light on the shortcomings of current notice-and-consent style regulation of social media platforms….(More)”.

‘Master,’ ‘Slave’ and the Fight Over Offensive Terms in Computing


Kate Conger at the New York Times: “Anyone who joined a video call during the pandemic probably has a global volunteer organization called the Internet Engineering Task Force to thank for making the technology work.

The group, which helped create the technical foundations of the internet, designed the language that allows most video to run smoothly online. It made it possible for someone with a Gmail account to communicate with a friend who uses Yahoo, and for shoppers to safely enter their credit card information on e-commerce sites.

Now the organization is tackling an even thornier issue: getting rid of computer engineering terms that evoke racist history, like “master” and “slave” and “whitelist” and “blacklist.”

But what started as an earnest proposal has stalled as members of the task force have debated the history of slavery and the prevalence of racism in tech. Some companies and tech organizations have forged ahead anyway, raising the possibility that important technical terms will have different meanings to different people — a troubling proposition for an engineering world that needs broad agreement so technologies work together.

While the fight over terminology reflects the intractability of racial issues in society, it is also indicative of a peculiar organizational culture that relies on informal consensus to get things done.

The Internet Engineering Task Force eschews voting, and it often measures consensus by asking opposing factions of engineers to hum during meetings. The hums are then assessed by volume and ferocity. Vigorous humming, even from only a few people, could indicate strong disagreement, a sign that consensus has not yet been reached…(More)”.

Combining Racial Groups in Data Analysis Can Mask Important Differences in Communities


Blog by Jonathan Schwabish and Alice Feng: “Surveys, datasets, and published research often lump together racial and ethnic groups, which can erase the experiences of certain communities. Combining groups with different experiences can mask how specific groups and communities are faring and, in turn, affect how government funds are distributed, how services are provided, and how groups are perceived.

Large surveys that collect data on race and ethnicity are used to disburse government funds and services in a number of ways. The US Department of Housing and Urban Development, for instance, distributes millions of dollars annually to Native American tribes through the Indian Housing Block Grant. And statistics on race and ethnicity are used as evidence in employment discrimination lawsuits and to help determine whether banks are discriminating against people and communities of color.

Despite the potentially large effects these data can have, researchers don’t always disaggregate their analysis to more racial groups. Many point to small sample sizes as a limitation for including more race and ethnicity categories in their analysis, but efforts to gather more specific data and disaggregate available survey results are critical to creating better policy for everyone.

To illustrate how aggregating racial groups can mask important variation, we looked at the 2019 poverty rate across 139 detailed race categories in the Census Bureau’s annual American Community Survey (ACS). The ACS provides information that helps determine how more than $675 billion in government funds is distributed each year.

The official poverty rate in the United States stood at 10.5 percent in 2019, with significant variation across racial and ethnic groups. The primary question in the ACS concerning race includes 15 separate checkboxes, with space to print additional names or races for some options (a separate question refers to Hispanic or Latino origin).

Screenshot of the American Community Survey's race question

Although the survey offers ample latitude for interviewees to respond with their race, researchers have a tendency to aggregate racial categories. People who identify as Asian or Pacific Islander (API), for example, are often combined in economic analyses.

This aggregation can mask variation within racial or ethnic categories. As an example, one analysis that used the ACS showed 11 percent of children in the API group are in poverty, relative to 18 percent of the overall population. But that estimate could understate the poverty rate among children who identify as Pacific Islanders and could overstate the poverty rate among children who identify as Asian, which itself is a broad grouping that encompasses many different communities with various experiences. Similar aggregating can be found across economic literature, including on education, immigration (PDF), and wealth….(More)”.
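The masking effect described in the excerpt above is a property of population-weighted averages: a pooled rate sits close to the rate of the larger subgroup and can hide a much higher rate in the smaller one. The sketch below illustrates this with made-up counts. The subgroup names and all numbers are hypothetical and are not ACS estimates.

```python
# Hypothetical counts for illustration only; these are NOT ACS estimates.
subgroups = {
    "Subgroup A": {"children": 9000, "in_poverty": 900},  # larger group, lower rate
    "Subgroup B": {"children": 1000, "in_poverty": 250},  # smaller group, higher rate
}


def poverty_rate(in_poverty, total):
    """Share of a population below the poverty line, as a percentage."""
    return 100.0 * in_poverty / total


def combined_rate(groups):
    """Poverty rate of the pooled group (a population-weighted average)."""
    total = sum(g["children"] for g in groups.values())
    poor = sum(g["in_poverty"] for g in groups.values())
    return poverty_rate(poor, total)


# Rate for each subgroup on its own.
rates = {
    name: poverty_rate(g["in_poverty"], g["children"])
    for name, g in subgroups.items()
}

# Rate after lumping both subgroups together.
pooled = combined_rate(subgroups)

for name, rate in rates.items():
    print(f"{name}: {rate:.1f}%")
print(f"Pooled: {pooled:.1f}%")
```

With these invented counts the pooled rate lands much nearer to Subgroup A's rate than to Subgroup B's, which is the pattern the authors warn about when small groups are folded into larger ones.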

Decisions, Decisions, Decisions.


BSR Report on “Responsible Business Decision-Making Before, During, and After Public Health Emergencies: A Rights-Based Approach to Technology and Data Use…The COVID-19 public health emergency has surfaced important questions about the relationship between the right to privacy and other rights, such as the right to health, work, movement, expression, and assembly. Data and digital infrastructures can be used for many positive outcomes, such as facilitating “back to work” efforts, enhancing research into COVID-19 vaccines and treatments, and allowing the resumption of economic activity while also protecting public health.

However, these uses may also result in the infringement of privacy rights, new forms of discrimination, and harm to vulnerable groups. Some governments are using the emergency as an excuse to expand their power, leading to concerns that initiatives launched to address COVID-19 could become permanent forms of state surveillance.

As the providers of data, systems, and software, technology companies are often central in these public health emergency response efforts. For this reason, companies need to address the human rights risks associated with their involvement in disease response to avoid being connected to human rights violations.

This paper sets out the key elements of a human rights-based approach to the use of data and technology solutions during public health emergencies in today’s and tomorrow’s digital era, with a focus on the role of business and impacts to privacy.

These elements are primarily captured in a human rights-based decision-making framework for companies that can guide them through future public health emergencies. This framework can be found on page 5 of the report or can be downloaded separately.

COVID-19 is the first truly global pandemic of the modern age, but it won’t be the last. We hope this paper highlights lessons learned from COVID-19 that can be applied during the public health emergencies of the future….(More)”.

More than a number: The telephone and the history of digital identification


Article by Jennifer Holt and Michael Palm: “This article examines the telephone’s entangled history within contemporary infrastructural systems of ‘big data’, identity and, ultimately, surveillance. It explores the use of telephone numbers, keypads and wires to offer new perspective on the imbrication of telephonic information, interface and infrastructure within contemporary surveillance regimes. The article explores telephone exchanges as arbiters of cultural identities, keypads as the foundation of digital transactions and wireline networks as enacting the transformation of citizens and consumers into digital subjects ripe for commodification and surveillance. Ultimately, this article argues that telephone history – specifically the histories of telephone numbers and keypads as well as infrastructure and policy in the United States – continues to inform contemporary practices of social and economic exchange as they relate to consumer identity, as well as to current discourses about surveillance and privacy in a digital age…(More)”.

Selected Readings on Data, Gender, and Mobility


By Michelle Winowatan, Uma Kalkar, Andrew Young, and Stefaan Verhulst

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of data, gender, and mobility was originally published in 2017, and updated in 2021.

This edition of the Selected Readings was developed as part of an ongoing project at the GovLab, supported by Data2X, in collaboration with UNICEF, DigitalGlobe, IDS (UDD/Telefonica R&D), and the ISI Foundation, to establish a data collaborative to analyze unequal access to urban transportation for women and girls in Chile. We thank all our partners for their suggestions to the below curation – in particular Leo Ferres at IDS who got us started with this collection; Ciro Cattuto and Michele Tizzoni from the ISI Foundation; and Bapu Vaitla at Data2X for their pointers to the growing data and mobility literature.

Introduction

Daily mobility is key for gender equity. Access to transportation contributes to women’s agency and independence. The ability to move from place to place safely and efficiently can allow women to access education, work, and the public domain more generally. Yet, mobility is not just a means to access various opportunities. It is also a means to enter the public domain.

Women’s mobility is a multi-layered challenge

Women’s daily mobility, however, is often hampered by social, cultural, infrastructural, and technical barriers. Cultural bias, for instance, can confine women to areas close to their homes because of society’s double standard that expects women to be homemakers. From an infrastructural perspective, public transportation mostly accommodates home-to-work trips, when in reality women often make more complex trips with multiple stops (for example, at the market, school, or healthcare provider), sometimes called “trip chaining.” From a safety perspective, women tend to avoid making trips in certain areas and/or at certain times due to a constant risk of being sexually harassed in public places. Women are also pushed toward more expensive transportation – such as taking a cab instead of a bus or train – based on safety concerns.

The growing importance of (new sources of) data

Researchers are increasingly experimenting with ways to address these interdependent problems through the analysis of diverse datasets, often collected by private sector businesses and other non-governmental entities. Gender-disaggregated mobile phone records, geospatial data, satellite imagery, and social media data, to name a few, are providing evidence-based insight into gender and mobility concerns. Such data collaboratives – the exchange of data across sectors to create public value – can help governments, international organizations, and other public sector entities in the move toward more inclusive urban and transportation planning, and the promotion of gender equity.

The curated set of readings below focuses on the following areas:

  1. Insights on how data can inform gender empowerment initiatives,
  2. Emergent research into the capacity of new data sources – like call detail records (CDRs) and satellite imagery – to increase our understanding of human mobility patterns, and,
  3. Publications exploring data-driven policy for gender equity in mobility.

Readings are listed in alphabetical order.

We selected the readings based upon their focus (gender and/or mobility related); scope and representativeness (going beyond one project or context); type of data used (such as CDRs and satellite imagery); and date of publication.

Annotated Reading List

Data and Gender

Blumenstock, Joshua, and Nathan Eagle. Mobile Divides: Gender, Socioeconomic Status, and Mobile Phone Use in Rwanda. ACM Press, 2010.

  • Using traditional survey and mobile phone operator data, this study analyzes gender and socioeconomic divides in mobile phone use in Rwanda, finding that mobile phone use is significantly more prevalent among men and people of higher socioeconomic status.
  • The study also shows the differences in the way men and women use phones, for example: women are more likely to use a shared phone than men.
  • The authors frame their findings around gender and economic inequality in the country to the end of providing pointers for government action.

Bosco, Claudio, et al. Mapping Indicators of Female Welfare at High Spatial Resolution. WorldPop and Flowminder, 2015.

  • This report focuses on early adolescence in girls, which often comes with higher risk of violence, fewer economic opportunities, and restrictions on mobility. Significant data gaps and methodological and ethical issues surrounding data collection for girls also make it harder for policymakers to create evidence-based policy to address those issues.
  • The authors analyze geolocated household survey data, using statistical models and validation techniques, and create high-resolution maps of various sex-disaggregated indicators, such as nutrition level, access to contraception, and literacy, to better inform local policy making processes.
  • Further, the report identifies the gender data gap and issues surrounding gender data collection, and provides arguments for why having comprehensive data can help create better policy and contribute to the achievement of the Sustainable Development Goals (SDGs).

Buvinic, Mayra, Rebecca Furst-Nichols, and Gayatri Koolwal. Mapping Gender Data Gaps. Data2X, 2014.

  • This study identifies gaps in gender data in developing countries on health, education, economic opportunities, political participation, and human security issues.
  • It recommends ways to close the gender data gap through censuses and micro-level surveys and through service and administrative records, and emphasizes how “big data” in particular can supply the missing data needed to measure progress in women’s and girls’ well-being. The authors argue that identifying these gaps is key to achieving SDG 5: advancing gender equality and women’s empowerment.

Catalyzing Inclusive Financial Systems: Chile’s Commitment to Women’s Data. Data2X, 2014.

  • This article analyzes global and national data in the banking sector to fill the gap of sex-disaggregated data in Chile. The purpose of the study is to describe the difference in spending behavior and priorities between women and men, identify the challenges for women in accessing financial services, and create policies that promote women’s inclusion in Chile.

Ready to Measure: Twenty Indicators for Monitoring SDG Gender Targets. Open Data Watch and Data2X, 2016.

  • Using readily available data, this study identifies 20 SDG indicators related to gender issues that can serve as a baseline measurement for advancing gender equality, such as percentage of women aged 20-24 who were married or in a union before age 18 (child marriage), proportion of seats held by women in national parliament, and share of women among mobile telephone owners, among others.

Ready to Measure Phase II: Indicators Available to Monitor SDG Gender Targets. Open Data Watch and Data2X, 2017.

  • The Phase II paper is an extension of the Ready to Measure Phase I above. Where Phase I identifies the readily available data to measure women’s and girls’ well-being, Phase II provides information on how to access this data and summarizes insights extracted from it.
  • Phase II elaborates the insights about data gathered from ready to measure indicators and finds that although underlying data to measure indicators of women and girls’ wellbeing is readily available in most cases, it is typically not sex-disaggregated.
  • Over one in five – 53 out of 232 – SDG indicators specifically refer to women and girls. However, further analysis from this study reveals that at least 34 more indicators should be disaggregated by sex. For instance, there should be 15 more sex-disaggregated indicators for SDG number 3: “Ensure healthy lives and promote well-being for all at all ages.”
  • The report recommends that national statistical agencies take the lead and exert additional effort to fill the current gender data gap for each of the SDGs, using tools such as statistical modeling.

Reed, Philip J., Muhammad Raza Khan, and Joshua Blumenstock. Observing gender dynamics and disparities with mobile phone metadata. International Conference on Information and Communication Technologies and Development (ICTD), 2016.

  • The study analyzes mobile phone logs of millions of Pakistani residents to explore whether mobile phone usage behavior differs between men and women and to determine the extent to which gender inequality is reflected in mobile phone usage.
  • It utilizes mobile phone data to analyze the pattern of usage behavior between genders, and socioeconomic and demographic data obtained from census and advocacy groups to assess the state of gender equality in each region in Pakistan.
  • One of its findings is a strong positive correlation between the proportion of female mobile phone users and education score.

Stehlé, Juliette, et al. Gender homophily from spatial behavior in a primary school: A sociometric study. 2013.

  • This paper seeks to understand homophily, a human behavior that characterizes interactions with peers who have similarities in “physical attributes to tastes or political opinions”. Further, it seeks to identify the magnitude of influence, a type of homophily applied to social structures.
  • Focusing on gender interaction among primary school aged children in France, this paper collects data from wearable devices worn by 200 children over a period of 2 days and measures the physical proximity and duration of the interactions among those children in the playground.
  • It finds that interaction patterns are significantly determined by grade and class structure of the school. This means that children belonging to the same class have the most interactions, and that lower grades usually do not interact with higher grades.
  • From a gender lens, this study finds that mixed-gender interactions are shorter than same-gender interactions. In addition, interactions among girls last longer than interactions among boys. These findings indicate that the children in this school tend to have stronger relationships within their own gender, or what the study calls gender homophily. It further finds that gender homophily is apparent in all classes.

Strengthening Gender Measures and Data in the COVID-19 Era: An Urgent Need for Change. Paris 21, 2021.

  • COVID-19 has exacerbated gender disparities, especially with regard to women’s livelihoods, unpaid labor, mental health, and risk of gender-based violence. Gaps in gender data impede robust, data-driven, and effective policies to quantify, analyse, and respond to these issues.
  • Without this information, the full effects of the COVID-19 pandemic cannot be understood. This report calls on National Statistical Systems, survey managers, funders, multilateral agencies, researchers, and policymakers to collect gender-intentional and disaggregated data that is standardized and comparable to address key areas of concern for women and girls. Additionally, it seeks to link non-traditional data sources, such as social media and news media, with existing frameworks to fill in knowledge gaps. Moreover, this information must be rendered accessible for all stakeholders to maximize the potential of the information. Post-pandemic, conscious collection and collation of gendered data is vital to preempt policy problems.

The Sex, Gender and COVID-19 Project: The COVID-19 Sex-Disaggregated Data Tracker. 2021.

  • This data tracker, produced by Global Health 50/50, the African Population and Health Research Center, and the International Center for Research on Women, tracks which countries and datasets have reported sex-disaggregated data on COVID-19 testing, confirmed cases, hospitalizations, and deaths.

Data and Mobility

Bengtsson, Linus, et al. Using Mobile Phone Data to Predict the Spatial Spread of Cholera. Flowminder, 2015.

  • This study seeks to predict the spread of the 2010 cholera epidemic in Haiti using data from 2.9 million anonymous mobile phone SIM cards and reported cases of cholera from the Haitian Directorate of Health, where 78 study areas were analyzed in the period of October 16 – December 16, 2010.
  • From this dataset, the study creates a mobility matrix that indicates mobile phone movement from one study area to another and combines that with the number of reported cases of cholera in the study areas to calculate the infectious pressure level of those areas.
  • The main finding of its analysis shows that the outbreak risk of a study area correlates positively with the infectious pressure level, where an infectious pressure of over 22 results in an outbreak within 7 days. Further, it finds that the infectious pressure level can inform the sensitivity and specificity of the outbreak prediction.
  • It hopes to improve infectious disease containment by identifying areas with highest risks of outbreaks.
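The combination of a mobility matrix with reported case counts summarized above can be sketched as a weighted sum. This is a minimal illustration under stated assumptions, not the study's actual formula: the function name `infectious_pressure`, the three toy areas, the mobility shares, and the case counts are all hypothetical. Only the threshold of 22 comes from the summary above, and it is meaningful only in the study's own units.

```python
# Assumed reading of "infectious pressure": for each destination area j,
# sum over all other areas i of (movement from i to j) * (cases reported in i).
# The study's exact definition may differ.


def infectious_pressure(mobility, cases):
    """Return one pressure value per area.

    mobility[i][j] is the (hypothetical) share of area i's phones seen moving to area j.
    cases[i] is the number of reported cases in area i.
    """
    n = len(cases)
    return [
        sum(mobility[i][j] * cases[i] for i in range(n) if i != j)
        for j in range(n)
    ]


# Toy data for three areas; all values are invented for illustration.
mobility = [
    [0.00, 0.30, 0.05],
    [0.10, 0.00, 0.10],
    [0.02, 0.02, 0.00],
]
cases = [100, 0, 0]

THRESHOLD = 22  # value quoted in the summary above; units follow the study, not this toy data

pressure = infectious_pressure(mobility, cases)
at_risk = [p > THRESHOLD for p in pressure]

for area, (p, flag) in enumerate(zip(pressure, at_risk)):
    print(f"area {area}: pressure={p:.1f} at_risk={flag}")
```

In this toy setup only the area that receives heavy movement from the affected area is flagged, which mirrors the idea of ranking areas by exposure rather than by their own case counts.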

Calabrese, Francesco, et al. Understanding Individual Mobility Patterns from Urban Sensing Data: A Mobile Phone Trace Example. SENSEable City Lab, MIT, 2012.

  • This study compares mobile phone data and odometer readings from annual safety inspections to characterize individual mobility and vehicular mobility in the Boston Metropolitan Area, measured by the average daily total trip length of mobile phone users and average daily Vehicular Kilometers Traveled (VKT).
  • The study found that, “accessibility to work and non-work destinations are the two most important factors in explaining the regional variations in individual and vehicular mobility, while the impacts of population density and land use mix on both mobility measures are insignificant.” Further, “a well-connected street network is negatively associated with daily vehicular total trip length.”
  • This study demonstrates the potential for mobile phone data to provide useful and updatable information on individual mobility patterns to inform transportation and mobility research.

Campos-Cordobés, Sergio, et al. “Chapter 5 – Big Data in Road Transport and Mobility Research.” Intelligent Vehicles. Edited by Felipe Jiménez. Butterworth-Heinemann, 2018.

  • This study outlines a number of techniques and data sources – such as geolocation information, mobile phone data, and social network observation – that could be leveraged to predict human mobility.
  • The authors also provide a number of examples of real-world applications of big data to address transportation and mobility problems, such as transport demand modeling, short-term traffic prediction, and route planning.

Gauvin, Laetitia et al. Gender gaps in urban mobility. Humanities and Information Science. Humanities & Social Sciences Communications vol. 7, issue 11, 2020.

  • This article discusses how urbanization affects mobility of women in realizing their rights. It points out the historic lack of gender disaggregated data for urban planning, leading to transportation designs that do not best accommodate the needs of women.
  • Taking the large and growing metropolitan area of Santiago, Chile, as a case study of urban mobility through a gendered lens, the article examines the mobility traces from Call Detail Records (CDRs) of an anonymized cohort of mobile phone users, sorted by gender, over 3 months. It then maps differences between men and women with regard to socio-demographic indicators and mobility across the city and through the Santiago transportation network structure, and identifies points of interest frequented by either sex to inform gendered mobility needs in urban areas.

Lin, Miao, and Wen-Jing Hsu. Mining GPS Data for Mobility Patterns: A Survey. Pervasive and Mobile Computing vol. 12, 2014.

  • This study surveys the current field of research using high resolution positioning data (GPS) to capture mobility patterns.
  • The survey focuses on analyses related to frequently visited locations, modes of transportation, trajectory patterns, and placed-based activities. The authors find “high regularity” in human mobility patterns despite high levels of variation among the mobility areas covered by individuals.

Phithakkitnukoon, Santi, Zbigniew Smoreda, and Patrick Olivier. Socio-Geography of Human Mobility: A Study Using Longitudinal Mobile Phone Data. PLoS ONE, 2012.

  • This study used a year’s call logs and location data of approximately one million mobile phone users in Portugal to analyze the association between individuals’ mobility and their social networks.
  • It measures and analyzes travel scope (locations visited) and geo-social radius (distance from friends, family, and acquaintances) to determine the association.
  • It finds that 80% of places visited are within 20 km of an individual’s nearest social ties’ location and it rises to 90% at 45 km radius. Further, as population density increases, distance between individuals and their social networks decreases.
  • The findings in this study demonstrate how mobile phone data can provide insights into “the socio-geography of human mobility”.
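A measure like "share of visited places within a given distance of the nearest social tie" can be sketched with great-circle distances. The code below is a hedged illustration, not the study's method: the helper names `haversine_km` and `share_within`, the coordinates, and the single-tie setup are hypothetical. Only the 20 km radius is taken from the summary above.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(a, b):
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def share_within(visited, ties, radius_km):
    """Fraction of visited places whose nearest social tie lies within radius_km."""
    near = sum(
        1 for place in visited
        if min(haversine_km(place, tie) for tie in ties) <= radius_km
    )
    return near / len(visited)


# Invented coordinates for illustration: one social tie and three visited places.
ties = [(38.72, -9.14)]
visited = [
    (38.72, -9.14),  # same location as the tie
    (38.80, -9.14),  # a few kilometres north of the tie
    (41.15, -8.61),  # a few hundred kilometres away
]

share_20km = share_within(visited, ties, 20)
print(f"Share of visited places within 20 km of a tie: {share_20km:.2f}")
```

Repeating the calculation for several radii would produce a curve of the kind the summary describes, where the share of visited places rises as the radius grows.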

Semanjski, Ivana, and Sidharta Gautama. Crowdsourcing Mobility Insights – Reflection of Attitude Based Segments on High Resolution Mobility Behaviour Data. vol. 71, Transportation Research, 2016.

  • Using cellphone data, this study maps attitudinal segments that explain how age, gender, occupation, household size, income, and car ownership influence an individual’s mobility patterns. This type of segment analysis is seen as particularly useful for targeted messaging.
  • The authors argue that these time- and space-specific insights could also provide value for government officials and policymakers, by, for example, allowing for evidence-based transportation pricing options and public sector advertising campaign placement.

Silveira, Lucas M., et al. MobHet: Predicting Human Mobility using Heterogeneous Data Sources. vol. 95, Computer Communications, 2016.

  • This study explores the potential of using data from multiple sources (e.g., Twitter and Foursquare), in addition to GPS data, to provide a more accurate prediction of human mobility. This heterogeneous data captures the popularity of different locations, the frequency of visits to those locations, and the relationships among people who are moving around the target area. The authors’ initial experimentation finds that combining these data sources identifies human mobility patterns more accurately.

Wilson, Robin, et al. Rapid and Near Real-Time Assessments of Population Displacement Using Mobile Phone Data Following Disasters: The 2015 Nepal Earthquake. PLOS Currents Disasters, 2016.

  • Utilizing call detail records of 12 million mobile phone users in Nepal, this study seeks spatio-temporal details of the population after the earthquake on April 25, 2015.
  • It seeks to address the problem of slow and ineffective disaster response by capturing near real-time displacement patterns from mobile phone call detail records, in order to inform humanitarian agencies where to distribute their assistance. The preliminary results of this study were available nine days after the earthquake.
  • This project relies on the foundational cooperation with mobile phone operators, who supplied the de-identified data from 12 million users before the earthquake.
  • The study finds that shortly after the earthquake there was an anomalous population movement out of the Kathmandu Valley, the most impacted area, to surrounding areas. The study estimates 390,000 more people than normal had left the valley.

Data, Gender and Mobility

Althoff, Tim, et al. Large-Scale Physical Activity Data Reveal Worldwide Activity Inequality. Nature, 2017.

  • This study’s analysis of worldwide physical activity is built on a dataset containing 68 million days of physical activity of 717,527 people collected through their smartphone accelerometers.
  • The authors find a significant reduction in female activity levels in cities with high activity inequality, where high activity inequality is associated with low city walkability. Walkability indicators include pedestrian facilities (city block length, intersection density, etc.) and amenities (shops, parks, etc.).
  • Further, they find that high activity inequality is associated with high levels of inactivity-related health problems, like obesity.

Borker, Girija. Safety First: Street Harassment and Women’s Educational Choices in India. Stop Street Harassment, 2017.

  • Using data collected from SafetiPin, an application that allows users to mark an area on a map as safe or not, and Safecity, another application that lets users share their experience of harassment in public places, Borker analyzes the safety of travel routes surrounding different colleges in India and their effect on women’s college choices.
  • The study finds that women are willing to attend a lower-ranked college in order to avoid a higher risk of street harassment. Women who choose the best college from their set of options spend an average of $250 more each year to access safer modes of transportation.

Frias-Martinez, Vanessa, Enrique Frias-Martinez, and Nuria Oliver. A Gender-Centric Analysis of Calling Behavior in a Developing Economy Using Call Detail Records. Association for the Advancement of Artificial Intelligence, 2010.

  • Using encrypted Call Detail Records (CDRs) of 10,000 participants in a developing economy, this study analyzes behavioral, social, and mobility variables to determine the gender of a mobile phone user, and finds that behavioral and social variables in mobile phone use differ between female and male users.
  • It finds that women use their phones more than men in terms of number of calls made, call duration, and call expenses. Women also have larger social networks, meaning that the number of unique phone numbers they contact or are contacted by is larger. It finds no statistically significant difference between men and women in the distance traveled between calls.
  • Frias-Martinez et al. recommend taking these findings into consideration when designing cellphone-based services.

Psylla, Ioanna, Piotr Sapiezynski, Enys Mones, Sune Lehmann. The role of gender in social network organization. PLoS ONE 12, December 20, 2017.

  • Using a large dataset of high resolution data collected through mobile phones, as well as detailed questionnaires, this report studies gender differences in a large cohort. The researchers consider mobility behavior and individual personality traits among a group of more than 800 university students.
  • Analyzing mobility data, they find both that women visit more unique locations over time, and that they have more homogeneous time distribution over their visited locations than men, indicating the time commitment of women is more widely spread across places.

The Landscape of Big Data and Gender. Data2X, February, 2021.

  • Against the backdrop of COVID-19, this report reaffirms that big data initiatives to study mobility, health, and social norms through gendered lenses have greatly progressed. More private companies and think tanks have launched data collection and sharing efforts to spur innovative projects to address COVID-19 complications.
  • However, work on economic opportunity, security, and civic action has been lagging behind. Big data collection on these topics is complicated by the lack of sex-disaggregated datasets, gender disparities in technology access, and the lack of gender tags in big data.
  • Large technology firms, especially social networks like Facebook, LinkedIn, Uber, and more, create a large amount of gender-organized data. The report found that users and data-holding companies are willing to share this information for public policy reasons so long as it provides value and is protected. To this end, Data2X, alongside its partners, champions the use of data collaboratives to use gender-sorted information for social good.

Vaitla, Bapu. Big Data and the Well Being of Women and Girls: Applications on the Social Scientific Frontier. Data2X, Apr. 2017.

  • In this study, the researchers use geospatial data, credit card and cell phone information, and social media posts to identify problems facing women and girls in developing countries, such as malnutrition, education, access to healthcare, and mental health.
  • From the credit card and cell phone data in particular, the report finds that analyzing patterns of women’s spending and mobility can provide useful insight into Latin American women’s “economic lifestyles.”
  • Based on this analysis, Vaitla recommends that nontraditional big data sources be used to fill gaps in conventional data sources, in order to address the common invisibility of women and girls in institutional databases.

How can stakeholder engagement and mini-publics better inform the use of data for pandemic response?


Andrew Zahuranec, Andrew Young and Stefaan G. Verhulst at the OECD Participo Blog Series:

“What does the public expect from data-driven responses to the COVID-19 pandemic? And under what conditions?” These are the motivating questions behind The Data Assembly, a recent initiative by The GovLab at New York University Tandon School of Engineering — an action research center that aims to help institutions work more openly, collaboratively, effectively, and legitimately.

Launched with support from The Henry Luce Foundation, The Data Assembly solicited diverse, actionable public input on data re-use for crisis response in the United States. In particular, we sought to engage the public on how to facilitate, if deemed acceptable, the re-use of data collected for one purpose to inform the COVID-19 response. One additional objective was to inform the broader emergence of data collaboration (through formal and ad hoc arrangements between the public sector, civil society, and those in the private sector) by evaluating public expectations and concerns regarding the current institutional, contractual, and technical structures and instruments that may underpin these partnerships.

The Data Assembly used a new methodology that re-imagines how organisations can engage with society to better understand local expectations regarding data re-use and related issues. This work goes beyond soliciting input from just the “usual suspects”. Instead, data assemblies provide a forum for a much more diverse set of participants to share their insights and voice their concerns.

This article is informed by our experience piloting The Data Assembly in New York City in summer 2020. It provides an overview of The Data Assembly’s methodology and outcomes and describes major elements of the effort to support organisations working on similar issues in other cities, regions, and countries….(More)”.

How a Google Street View image of your house predicts your risk of a car accident


MIT Technology Review: “Google Street View has become a surprisingly useful way to learn about the world without stepping into it. People use it to plan journeys, to explore holiday destinations, and to virtually stalk friends and enemies alike.

But researchers have found more insidious uses. In 2017 a team of researchers used the images to study the distribution of car types in the US and then used that data to determine the demographic makeup of the country. It turns out that the car you drive is a surprisingly reliable proxy for your income level, your education, your occupation, and even the way you vote in elections.

Now a different group has gone even further. Łukasz Kidziński at Stanford University in California and Kinga Kita-Wojciechowska at the University of Warsaw in Poland have used Street View images of people’s houses to determine how likely they are to be involved in a car accident. That’s valuable information that an insurance company could use to set premiums.

The result raises important questions about the way personal information can leak from seemingly innocent data sets and whether organizations should be able to use it for commercial purposes.

Insurance data

The researchers’ method is straightforward. They began with a data set of 20,000 records of people who had taken out car insurance in Poland between 2013 and 2015. These were randomly selected from the database of an undisclosed insurance company.

Each record included the address of the policyholder and the number of damage claims he or she made during the 2013–’15 period. The insurer also shared its own prediction of future claims, calculated using its state-of-the-art risk model that takes into account the policyholder’s zip code and the driver’s age, sex, claim history, and so on.

The question that Kidziński and Kita-Wojciechowska investigated is whether they could make a more accurate prediction using a Google Street View image of the policyholder’s house….(More)”.