By Uma Kalkar, Salwa Mansuri, Marine Ragnet and Andrew J. Zahuranec

As part of an ongoing effort to contribute to current topics in data, technology, and governance, The GovLab’s Selected Readings series provides an annotated and curated collection of recommended works on themes such as open data, data collaboration, and civic technology.

Around the world, LGBTQ+ people face exclusion and discrimination that undermines their capacity to live their lives and succeed. Together with allies, many LGBTQ+ people are fighting to exercise their rights and achieve full equality. However, this struggle has been undermined by a lack of specific, quantifiable information on the challenges they face.

When collected and managed responsibly, data about sexual and gender minorities can be used to protect and empower LGBTQ+ people through informed policy and advocacy work. To this end, this Selected Reading investigates what data is (and is not) collected about LGBTQ+ individuals in healthcare, education, economics, and public policy, and the ramifications of these outcomes. It offers a perspective on some of the existing gaps regarding LGBTQ+ data collection. It also examines, through a data lens, the various challenges that LGBTQ+ groups have had to overcome. While activism and advocacy have increased the visibility and acceptance of sexual and gender minorities and allowed them to better exercise their rights in society, significant inequities remain. Our literature review puts forward some of these recent efforts.

Most of the papers included in this review, however, reach similar conclusions: data about LGBTQ+ communities is still lacking and, as a result, research on the topic often lags behind. This is particularly problematic, as detailed in some of our readings, because LGBTQ+ populations are often targets of discrimination and still face disparate health vulnerabilities. The LGBTQI+ Data Inclusion Act, which recently passed the US House of Representatives and would require over 100 federal agencies to improve data collection and surveying of LGBTQ communities, seeks to address this gap.

We hope this selection of readings can provide some clarity on current data-driven research for and about LGBTQ+ individuals. The readings are presented in alphabetical order.

***

Selected Reading List (in alphabetical order)

D’Ignazio, Catherine, and Lauren F. Klein. Data Feminism. MIT Press, 2020. https://mitpress.mit.edu/books/data-feminism.
Giblon, Rachel, and Greta R. Bauer. “Health care availability, quality, and unmet need: a comparison of transgender and cisgender residents of Ontario, Canada.” BMC Health Services Research 17, no. 1 (2017): 1–10. https://bmchealthservres.biomedcentral.com/articles/10.1186/s12913-017-2226-z.
Marshall, Zack, Vivian Welch, Alexa Minichiello, Michelle Swab, Fern Brunger, and Chris Kaposy. “Documenting research with transgender, nonbinary, and other gender diverse (trans) individuals and communities: introducing the global trans research evidence map.” Transgender Health 4, no. 1 (2019): 68–80. https://www.liebertpub.com/doi/10.1089/trgh.2018.0020.
Medina, Caroline and Lindsay Mahowald. “Collecting Data about LGBTQI+ and Other Sexual and Gender-Diverse Communities.” Center for American Progress, May 26, 2022. https://www.americanprogress.org/article/collecting-data-about-lgbtqi-and-other-sexual-and-gender-diverse-communities.
Miner, Michael H., Walter O. Bockting, Rebecca Swinburne Romine, and Sivakumaran Raman. “Conducting internet research with the transgender population: Reaching broad samples and collecting valid data.” Social science computer review 30, no. 2 (2012): 202–211. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3769415/.
Pega, Frank, Sari L. Reisner, Randall L. Sell, and Jaimie F. Veale. “Transgender health: New Zealand’s innovative statistical standard for gender identity.” American journal of public health 107, no. 2 (2017): 217–221. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5227923/.
Ruberg, Bonnie, and Spencer Ruelos. “Data for Queer Lives: How LGBTQ Gender and Sexuality Identities Challenge Norms of Demographics.” Big Data & Society 7, no. 1 (June 18, 2020): 205395172093328. https://journals.sagepub.com/doi/full/10.1177/2053951720933286.
Sell, Randall L. “LGBTQ health surveillance: data = power.” American Journal of Public Health 107, no. 6 (2017): 843–844. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5425894/.
Snapp, Shannon D., Stephen T. Russell, Mariella Arredondo, and Russell Skiba. “A right to disclose: LGBTQ youth representation in data, science, and policy.” Advances in child development and behavior 50 (2016): 135–159. https://pubmed.ncbi.nlm.nih.gov/26956072/.
Wimberly, George L. “Chapter 10: Use of large-scale data sets and LGBTQ education.” LGBTQ issues in education: Advancing a research agenda (2015): 175–218. https://ebooks.aera.net/LGBTQCH10.

***

Annotated Selected Reading List (in alphabetical order):

D’Ignazio, Catherine, and Lauren F. Klein. Data Feminism. MIT Press, 2020. https://mitpress.mit.edu/books/data-feminism.

D’Ignazio and Klein investigate how data has historically been used to maintain specific social status quos. To overcome this challenge, they approach data collection and use through an intersectional, feminist lens that identifies issues in current data handling systems and looks toward solutions for more inclusive data applications.
The authors describe data feminism as being about “power, about who has it and who doesn’t, and about how those differentials of power can be challenged and changed using data.” The book centers on seven principles that identify and challenge existing power structures around data and seek pluralist, context-based data processes that illuminate hidden and missed data.

Giblon, Rachel, and Greta R. Bauer. “Health care availability, quality, and unmet need: a comparison of transgender and cisgender residents of Ontario, Canada.” BMC Health Services Research 17, no. 1 (2017): 1–10. https://bmchealthservres.biomedcentral.com/articles/10.1186/s12913-017-2226-z.

Canada boasts a universal healthcare and insurance system, yet disparities exist in treatment quality, services, and knowledge about transgender patients.
Canadian health surveys do not collect data on transgender, non-binary, and intersex individuals, making it difficult to compare the healthcare provided to transgender people with that provided to cisgender people. Moreover, a lack of physician knowledge about trans needs and/or refusal to provide hormone therapy or gender-affirming procedures result in trans individuals explicitly avoiding medical services. The lack of services, comfort, and data about transgender people in Canada demonstrates their severely “unmet health care need.”
Using data about Ontario residents from the Canadian Community Health Survey and the Trans PULSE survey, the researchers find that 33% of transgender Ontarians had an unmet health need that would not be unmet if they were cisgender. In addition, transgender men and women rated the quality of healthcare in their community as poorer than cisgender individuals did. Twenty-one percent of transgender people avoided going to emergency rooms because of their gender identity.

Bowleg, Lisa, and Stewart Landers. “The need for COVID-19 LGBTQ-specific data.” American Journal of Public Health 111, no. 9 (2021): 1604–1605. https://pubmed.ncbi.nlm.nih.gov/34436923/.

The adage “no data, no problem” has been magnified during the pandemic, highlighting gaps around data collection for LGBTQ communities, which often intersect with other communities who are disproportionately at risk for COVID-19, such as minority populations in the service industry and those who smoke.
Despite concerns about the stigma facing LGBTQ communities, data collection from these demographics has been relatively feasible, with federal governments drastically increasing their data collection from LGBTQ communities.
However, the lack of direction and guidance at a federal level to collect sexual and gender minority data has stunted information about how this demographic has experienced COVID-19 when compared to cisgender, heterosexual groups. The authors stress the need for data collection from LGBTQ communities and advocacy to encourage these practices to help address the pandemic.

Marshall, Zack, Vivian Welch, Alexa Minichiello, Michelle Swab, Fern Brunger, and Chris Kaposy. “Documenting research with transgender, nonbinary, and other gender diverse (trans) individuals and communities: introducing the global trans research evidence map.” Transgender Health 4, no. 1 (2019): 68–80. https://www.liebertpub.com/doi/10.1089/trgh.2018.0020.

Marshall and colleagues search 15 academic databases to assemble a dataset describing 690 trans-focused articles. They then map where and how transgender people “have been studied and represented within and across multiple fields of research” to understand the landscape of existing research on transgender people. They find that research on the trans community focused on physical and mental healthcare services and marginalization and was primarily observational.
The authors found that social determinants of health for transgender people were the least studied, along with ethnicity, culture, and race; violence; early life experiences; activism; and education.
With this evidence map, researchers have a strong starting point to further explore issues through an LGBTQ lens and better engage with trans people and perspectives when looking at social problems.

Medina, Caroline and Lindsay Mahowald. “Collecting Data about LGBTQI+ and Other Sexual and Gender-Diverse Communities.” Center for American Progress, May 26, 2022. https://www.americanprogress.org/article/collecting-data-about-lgbtqi-and-other-sexual-and-gender-diverse-communities.

The paper argues that, despite advances, “a persistent lack of routine data collection on sexual orientation, gender identity, and variations in sex characteristics (SOGISC) is still a substantial roadblock for policymakers, researchers, service providers, and advocates seeking to improve the health and well-being of LGBTQI+ people.”
Even though various types of data are integral to understanding the experiences of LGBTQI+ people, the report narrows its focus to data collection in two settings: general population surveys and surveys specifically of LGBTQI+ people. Population-specific surveys such as the latter offer a significant advantage in capturing specific and sensitive data.
It argues that a range of precautions can be adopted at the research design stage to ensure that personal data and information are handled with care and meet the ethical standards outlined in the Data Ethics Framework of the Federal Data Strategy, which range from privacy and confidentiality to honesty and transparency.

Miner, Michael H., Walter O. Bockting, Rebecca Swinburne Romine, and Sivakumaran Raman. “Conducting internet research with the transgender population: Reaching broad samples and collecting valid data.” Social science computer review 30, no. 2 (2012): 202–211. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3769415/.

The internet has the potential to collect information from transgender people, who are “a hard-to-reach, relatively small, and geographically dispersed population” in a diverse and representative manner.
To study HIV risk behaviors of transgender individuals in the U.S., Miner et al. developed an online tool that recruited individuals who frequent websites important to the transgender community and used quantitative and qualitative methods to learn more about these individuals. They conclude that, while ensuring internal validity in online data collection can be difficult, careful testing and methods can overcome these issues and improve data quality on transgender people.

Pega, Frank, Sari L. Reisner, Randall L. Sell, and Jaimie F. Veale. “Transgender health: New Zealand’s innovative statistical standard for gender identity.” American journal of public health 107, no. 2 (2017): 217–221. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5227923/.

Pega et al. discuss New Zealand’s national statistical standard for gender identity data collection, the first of its kind. Governments in Australia and the United States are now following suit to address the health access and information disparities that transgender people face.
Data about transgender people has advanced progressive policy action in New Zealand, and the authors celebrate this statistical standard as a way to collect high quality data for data-driven policies to support these groups.
While this move will help uncover LGBTQ individuals currently hidden in data, the authors critique the standard because it does not “promote the two-question method, risking misclassification and undercounts; does promote the use of the ambiguous response category ‘gender diverse’ in standard questions; and is not intersex inclusive.”

Ruberg, Bonnie, and Spencer Ruelos. “Data for Queer Lives: How LGBTQ Gender and Sexuality Identities Challenge Norms of Demographics.” Big Data & Society 7, no. 1 (June 18, 2020): 205395172093328. https://journals.sagepub.com/doi/full/10.1177/2053951720933286.

Drawing from the responses of 178 people who identified as non-heterosexual or non-cisgender in a survey, this paper argues that “dominant notions of demographic data, […] that seeks to accurately categorize and ‘capture’ identity do not sufficiently account for the complexities of LGBTQ lives.”
Demographic data commonly imagines identity as fixed, singular, and discrete. However, the researchers’ findings suggest that, for LGBTQ people, gender and sexual identities are often multiple and in flux. Most respondents reported that their understanding of their identity shifted over time. For many, “gender identity was made up of overlapping factors, including the relationship between gender and transgender identities. These findings challenge researchers to reconsider how identity is understood as and through data.” They argue that treating identities as fixed and discrete is not only exclusionary but also fails to represent the dynamic and fluid nature of gender identities.
The piece offers several recommendations to address this challenge. First, the researchers argue for removing data discreteness, which would enable respondents to select multiple identities rather than choose one from a drop-down list. Second, they call for creating communication and feedback channels through which LGBTQ+ people can express whether surveys and other data collection methods are sufficiently inclusive and gender-sensitive.

Sell, Randall L. “LGBTQ health surveillance: data = power.” American Journal of Public Health 107, no. 6 (2017): 843–844. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5425894/.

Sell recounts his motto, ‘data = power;’ ‘silence = death,’ and how LGBTQ people have been victims of this situation. He argues that health research and surveillance have systematically ignored sexual and gender minorities, leading to gaps in administrative understanding and policies for the LGBTQ population.
He laments that very few surveys on American health collect sexual orientation and gender identity data, and that the lack of standardization around this data collection muddies researchers’ ability to collate and use the information meaningfully.
He calls for legislation that requires the National Institutes of Health to include sexual and gender minorities in all publicly funded research, similar to the specific inclusion requirement for women and racial and ethnic minorities in studies. Despite concerns about surveillance and targeting of LGBTQ minorities, Sell argues that data collection is imperative now for a long-term understanding of the needs of the community, transcending political terms.

Snapp, Shannon D., Stephen T. Russell, Mariella Arredondo, and Russell Skiba. “A right to disclose: LGBTQ youth representation in data, science, and policy.” Advances in child development and behavior 50 (2016): 135–159. https://pubmed.ncbi.nlm.nih.gov/26956072/.

Despite significant and positive reforms in the United States, such as the legalization of same-sex marriage and protection from intersectional sexual harassment (Webb, 2011), there is a striking gap in the literature on evidence-based practices that support LGBTQ+ youth (Kosciw & Pizmony-Levy, 2013; Mustanski, 2011). The lack of data-driven solutions stifles the creation of inclusive environments where members of the LGBTQI+ community feel heard and seen.
At present, federal and state data sets do not include sexual orientation and gender identity (SOGI) in demographic questions. Data sets that do allow respondents to disclose SOGI are largely health-related, such as those from the Centers for Disease Control and Prevention or the Youth Risk Behavior Survey. As such, disparities in learning and education outcomes are not accurately measured.
Missing systematic SOGI data renders members of the LGBTQ+ community invisible and sidelined. Several members of civil society have therefore called for SOGI data to be gathered in the Departments of Health, Education, and Justice. Such data is central to holistically capturing the discriminatory experiences LGBTQ+ youth face in educational settings, which are integral to well-being and development. Scholars and research teams have thus far overcome barriers of data reliability and validity (see Ridolfo, Miller, & Maitland, 2012) by collating the most effective methods for data collection (Sexual Minority Assessment Research Team, 2009).

Wimberly, George L. “Chapter 10: Use of large-scale data sets and LGBTQ education.” LGBTQ issues in education: Advancing a research agenda (2015): 175–218. https://ebooks.aera.net/LGBTQCH10.

This book chapter highlights the importance of large-scale data sets for understanding LGBTQ students, school experiences, and academic achievement.
Young people who identify as LGBTQ tend to be treated as a single group, and the ways surveys ask LGBTQ identification questions change across years, making it important to disaggregate large-scale data for more granular knowledge about LGBTQ people in education.
Wimberly provides information about multiple datasets that collect this information, how they ask questions on LGBTQ identity, and ways in which the datasets have been used or have the potential to be leveraged for a more comprehensive understanding of students. He also points out the limitations of existing data sets, namely that they tend to be retrospective of the LGBTQ adolescent experience and collected from convenience samples, such as college students. This limitation also impacts the external validity of the data, especially with regard to rural, racialized, and lower-income LGBTQ students.

Selected Readings on the LGBTQ+ Community and Data

By: Uma Kalkar, Salwa Mansuri, Andrew J. Zahuranec

As part of an ongoing effort to contribute to current topics in data, technology, and governance, The GovLab’s Selected Readings series provides an annotated and curated collection of recommended readings on themes such as open data, data collaboration, and civic technology.

In this edition, we reflect on the intersection between data, abortion, and women’s health following the United States Supreme Court ruling in Dobbs v. Jackson Women’s Health Organization, which held that there is no constitutional right to abortion and that individual states have the authority to regulate access to abortion services. In the days before and since the decision, a large amount of literature has been produced both on the implications of this ruling for individuals’ data privacy and on its effects on women’s social and economic lives. It is clear that, while opinions on access to abortion services are often influenced by deeply held attitudes about women’s bodily autonomy and when life begins, data has critical importance both as a potential source of risk and as a tool to understand the decision’s impact.

Below we curate some stories from news sources and academic papers on the role of data in abortion services as well as data-driven research by institutions into the effects of abortion. We hope this selection of readings provides a broader perspective on how data and women’s rights and health intersect.

As well, we urge that anyone seeking further information about abortion access visit www.ineedana.com via a secure site, and preferably via a VPN. For those looking for menstrual apps, Spot On by the Planned Parenthood Federation of America saves data locally on phones, does not provide information to third parties, and allows for anonymous accounts.

The readings are presented in alphabetical order.

***

Data & Privacy Concerns

Conti-Cook, Cynthia. “Surveilling the Digital Abortion Diary: A Preview of How Anti-Abortion Prosecutors Will Weaponize Commonly-Used Digital Devices As Criminal Evidence Against Pregnant People and Abortion Providers in a Post-Roe America.” University of Baltimore Law Review, forthcoming. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3666305

In this four-part article, Conti-Cook discusses the history of health data rights and the long-standing ways in which digital evidence produced by pregnant people has been used to prosecute their actions. She discusses how digital technologies help prosecutors lay charges against those seeking abortions and how they help “the state see[k] control over [them] by virtue of their pregnancy status” by digitally surveilling them.
The author examines how “digital, biometric, and genetic surveillance” serves as a vehicle to “microtarget” historically oppressed communities under a patriarchal and racist social structure.
She also discusses how online searches relating to pregnancy termination and abortion, location and tracking data, site history, wearable devices, and app data can be factored into risk assessment tools to assess social service outcomes and federal prosecutions.
Conti-Cook ends by reviewing digital hygiene strategies to stop the use of personal data against oneself and foster a more critical use of digital tools for reproductive and pregnancy-related health needs.

Diamant, Jeff, and Besheer Mohamed. “What the Data Says about Abortion in the U.S.” Pew Research Center, June 24, 2022. https://www.pewresearch.org/fact-tank/2022/06/24/what-the-data-says-about-abortion-in-the-u-s-2

In the aftermath of the overturning of Roe v. Wade (1973), the Pew Research Center published a compilation of facts and statistics about abortion care in the United States obtained through the Centers for Disease Control and Prevention and the Guttmacher Institute.
The piece describes shifting trends in the number of legal abortions conducted each year in the United States since the 1970s, the abortion rate among women, the most common types of abortions, and the number of abortion providers over time. It describes, for example, how the number of abortions has generally declined at “a slow yet steady pace” since the early 1990s. It also notes that the number of providers has declined over time.

Paul, Kari. “Tech Firms under Pressure to Safeguard User Data as Abortion Prosecutions Loom.” The Guardian, June 25, 2022, sec. US news. https://www.theguardian.com/us-news/2022/jun/25/tech-companies-health-data-security-abortion-prosecution

Paul writes about the concerns of abortion and civil rights activists that data collected about individuals through apps and online searches might incriminate those seeking or providing abortion services. The article notes how geolocation data used by tech companies can make “it easy for law enforcement officials to access incriminating data on location, internet searches, and communication history.”
While period tracking apps have received significant attention, the article notes that companies such as Meta, Uber, Lyft, Google, and Apple have yet to publicly announce how they would respond to law enforcement requests on abortion evidence.
The piece finally includes a recommendation from the digital rights advocacy group Electronic Frontier Foundation that companies preemptively prepare “for a future in which they are served with subpoenas and warrants seeking user data to prosecute abortion seekers and providers.” It suggests end-to-end encryption as a default, refraining from collecting location information, and allowing anonymous or pseudonymous access to apps.

Nguyen, Nicole, and Cordilia James. “How Period-Tracker Apps Treat Your Data, and What That Means If Roe v. Wade Is Overturned.” Wall Street Journal, June 21, 2022. https://www.wsj.com/articles/how-period-tracker-apps-treat-your-data-and-what-that-means-if-roe-v-wade-is-overturned-11655561595

Nguyen and James provide an extensive analysis of the ways that period tracking apps track, collect, store, and share data about women’s fertility and menstrual cycle. Following Dobbs v. Jackson Women’s Health Organization (2022), which overturned Roe v. Wade (1973), there has been significant public concern about the (re)use of the data these apps collect.
They detail the different kinds of data that could be subpoenaed from period trackers and the terminology that users can search for in an app’s privacy policy to understand how their data will be used. The article explains, for example, what it means when Terms & Conditions describe how a company will “encrypt” data (that is, scramble it into an unreadable string of code), “share” or “sell” it (data can be given to third parties such as advertisers), and respond to “requests” (companies may notify the user when a court or government asks for data).
The article closes with an overview of the most-downloaded fertility apps — including Flo, Apple Health, Clue, FitBit, Glow, and Natural Cycles — and where they stand on data privacy.

Sherman, Jenna. “How Abortion Misinformation and Disinformation Spread Online.” Scientific American, June 24, 2022. https://www.scientificamerican.com/article/how-abortion-misinformation-and-disinformation-spread-online/

In Scientific American, Sherman writes an opinion piece on the growth of online dis- and misinformation in the aftermath of Dobbs. She summarizes how, according to current data-driven research, much of the information people find online about abortion is not reliable and that the highest volume of online searches about abortion tends to be in those states with the most restricted access.
Despite much research on abortion, Sherman notes “a lack of access to quality information or care” online, especially for marginalized communities. She also summarizes the results of studies on social media and search engines. In one 2021 study, searches for “abortion pill” tended not to yield scientifically accurate and moderately accessible information.
Another study cited in the article found that half of the web pages surfaced by Google on abortion contained misinformation. This appears to be by design — with false information about “abortion pill reversal” and abortion practices generating large revenues for platforms like Facebook.

Data on the Impact of Abortion Access

Amador, Diego. “The Consequences of Abortion and Contraception Policies on Young Women’s Reproductive Choices, Schooling and Labor Supply.” Documento CEDE №2017–43 (2017). https://ssrn.com/abstract=2987367

Amador analyzes aggregate provider data from the Guttmacher Institute to assess the relationship between contraceptive use, abortion, schooling, and labor decisions of US women. The dataset follows a sample of women born between 1980 and 1984, with data from interviews starting in 1997 and ending in 2011.
A counterfactual model based on the data suggests that a perfectly enforced ban on abortions would raise the rate of standard contraceptive use among women by 9.1%. The fraction of children born to single mothers would increase from 30% to 34%, while the average amount of schooling after high school would decrease by 3.1%. The share of women with college degrees would drop by 1.8 percentage points. The average loss in lifetime earnings for women who would have had at least one abortion was estimated at USD 39,172.
The author also assesses the impact that free contraception would have, suggesting a 15.7 decrease in pregnancies per 1000 women and an 11.6 reduction in abortions per 1000 women. Accumulated schooling after high school increased by an estimated 3%. An assessment of mandatory counseling laws found that the long-run effect of these laws on women ages 18 to 30 was a 10% decrease in abortion rates.
The author concludes that policies such as an abortion ban and free contraception have important effects on schooling and lifetime earnings but only a moderate impact on labor supply.

ANSIRH. “Introduction to the Turnaway Study.” ANSIRH, March 2020. https://www.ansirh.org/sites/default/files/publications/files/turnawaystudyannotatedbibliography.pdf

This fact sheet summarizes various analyses stemming from the Turnaway Study, the first study to rigorously examine the effects of receiving abortion services versus being denied access to them. The study is an initiative by Advancing New Standards in Reproductive Health (ANSIRH), a program within the UCSF Bixby Center for Global Reproductive Health. It examines 1,000 women seeking abortion from 30 facilities around the country, with interviews conducted over five years.
Studies conducted with the dataset find that the most common reason for women to seek an abortion was not being able to afford a child and/or not having a suitable partner/parent involved to assist with childrearing. Most women don’t feel pressured by counseling that occurs in clinics but find it less helpful when it is state-mandated. Half of all women report seeing anti-abortion protestors at clinics and greater contact with them tends to be more upsetting.
Studies also suggest no evidence that abortion causes negative mental health outcomes, although being denied an abortion is associated with elevated anxiety and stress and lower self-esteem. Those who receive an abortion experience “a mix of positive and negative emotions in the days after […] with relief predominating.” The intensity of the emotion diminishes over time but over 95% of women report “abortion was the right decision for them at all times over five years after.”
Carrying an unwanted pregnancy tended to be associated with worse outcomes for women’s physical health and socioeconomic status. Women denied abortion who later gave birth reported more chronic pain and rated their overall health as worse. Economic insecurity for women and their families increased almost four-fold. In terms of education, women who received abortions tended to have higher odds of having positive one-year plans while women denied abortions were no more or less likely to drop out of school.

Donohue, John J., and Steven D. Levitt. “The Impact of Legalized Abortion on Crime Over the Last Two Decades.” The University of Chicago, Becker Friedman Institute for Economics Working Paper №2019–75 (May 2017). https://ssrn.com/abstract=3391510

This paper argues that legalizing abortion in the 1970s contributed to a significant reduction in crime that was still evident two decades later, in the 1990s. In particular, the paper suggests an approximate 20% decrease in crime rates between 1997 and 2014. The authors contend that abortion legalization is not merely one factor but perhaps one of the most important ones in this reduction (see Donohue and Levitt, 2001).
A notable aspect of the data was that it took close to a decade for the “number of abortions performed to reach a steady-state,” a lag attributed to the heterogeneity of state-level data as abortion legislation and reform evolved.
Moreover, the effect of abortion on crime rates was only incrementally visible as “crime-aged cohorts” were gradually exposed to legalized abortion. Donohue and Levitt’s work supports the abortion-crime hypothesis — that increased access to abortion would decrease crime.

Frost, Jennifer J., Jennifer Mueller, and Zoe H. Pleasure. “Trends and Differentials in Receipt of Sexual and Reproductive Health Services in the United States: Services Received and Sources of Care, 2006–2019.” The Guttmacher Institute, June 24, 2021. https://doi.org/10.1363/2021.33017

This report describes trends in reproductive and sexual health care across the United States over a 13-year period as told by the National Survey of Family Growth, the only national data source that contains detailed information on sexual and reproductive health. It finds that some 7 in 10 women of reproductive age (44 million people) make at least one medical visit for sexual and reproductive health care each year. However, disparities exist — Hispanic women are less likely to receive care than White women, and the uninsured are substantially less likely to receive care than privately insured women.
It further finds that publicly funded clinics were a critical source of care for young women, lower-income women, women of color, foreign-born women, women on Medicaid, and women without insurance.
The report also finds that the Affordable Care Act increased the number of women receiving contraceptive services by 8% among women with private providers. There was a complementary drop among women receiving contraceptive care from publicly funded clinics.

Hill, J. Jackson IV. “The Need for a National Abortion Reporting Requirement: Why Both Sides Should Be in Support of Better Data.” Available at SSRN (May 2, 2014). https://ssrn.com/abstract=2306667.

Hill writes a paper urging organizations to improve the status of abortion reporting in the United States. Examining statistics collected by the Centers for Disease Control and the Guttmacher Institute, the author finds serious deficiencies, including a lack of voluntary reporting from states, conflicting requirements (or unenforced requirements) about what data is collected, and an absence of timely data.
After the passage of Roe, state legislatures attempted to mandate abortion reporting and monitoring; however, concerns over the safety of women’s choice, undue administrative hurdles, and issues over pervasive data collection made it difficult to impose a standardized, non-intrusive, and anonymized data collection practice.
Hill argues that these data gaps and paternalistic methods of collecting data have had consequences on the ability of policymakers to make decisions around abortion policy and undermine the public’s knowledge on the issue. He assesses the feasibility of federally regulated abortion data and potential other strategies for achieving reliable, uniform data. He proposes two avenues for a “comprehensive, uniform abortion data” set: a ‘command’ option that requires states to provide and collect abortion information for a federal database or a ‘bribe’ option that monetarily incentivizes states to provide this information.

Knowles Myers, Caitlin, and Morgan Welch. “What Can Economic Research Tell Us about the Effect of Abortion Access on Women’s Lives?” Brookings, November 30, 2021. https://www.brookings.edu/research/what-can-economic-research-tell-us-about-the-effect-of-abortion-access-on-womens-lives/

Knowles Myers and Welch write on what current economic research suggests about the effect of abortion access on women’s reproductive, social, and economic outcomes.
Comparing Alaska, California, Hawaii, New York, Washington, and the District of Columbia (jurisdictions that repealed abortion bans prior to Roe) with other states, research suggests that those repealing abortion bans saw a 4–11% decline in births relative to the rest of the country, with effects particularly large for teens and women of color. Studies also suggest that abortion legalization reduced the number of teen mothers by 34% and reduced maternal mortality by 30–40%, with little impact on white women.
Additional studies indicate that abortion access has a large impact on the circumstances under which children are born. Various studies find that abortion legalization reduced the number of unwanted children, cases of neglect and abuse, and the number of children living in poverty. It also improved long-term outcomes by increasing the likelihood that children attend college.
Other studies find that abortion and pregnancy have substantial impact on women’s economic and social lives, with pregnancy frequently lowering women’s wages. This fact has substantial implications for “low-income mothers experiencing disruptive life events.” Based on various studies, the authors argue that “access to abortion could be pivotal to these women’s financial lives.”
While the abortion debate is driven by views on women’s bodily autonomy and when life begins, the authors find a clear causal link between access to abortion and “whether, when, and under what circumstances women become mothers.” All studies suggest that access to abortion can have substantial implications on education, earnings, careers, and life outcomes. Restricting or eliminating access would diminish women’s personal and economic lives along with those of their families.

Maxmen, Amy. “Why Hundreds of Scientists Are Weighing in on a High-Stakes US Abortion Case.” Nature 599, no. 7884 (October 26, 2021): 187–89. https://doi.org/10.1038/d41586-021-02834-7

A piece by Amy Maxmen for Nature summarizes a recent amicus brief filed by more than 800 scientists and several scientific organizations providing data-driven research into how abortion access is an important aspect of reproductive health.
It notes, for example, more than 40 studies suggesting that receiving an abortion does not harm a woman’s mental or physical health but that being denied an abortion can result in negative financial and health outcomes. It also cites a 2019 study of nearly 900 women in which those “who sought but were unable to get abortions reported higher rates of chronic headaches and joint pain five years later, compared with those who got an abortion,” while a similar 2017 study finds no similar physical or psychological effects.
A separate amicus brief submitted to the Court by about 550 public health and reproductive health researchers described how unwanted pregnancies can result in worse health outcomes. It also can disproportionately harm the physical, mental, and economic well-being of Black people according to a separate study.
An additional amicus brief filed by economists notes several studies that found that “abortion legalization in the 1970s helped to increase women’s educational attainment, participation in the labor force and earnings — especially for single Black women.”

Myers, Caitlin, and Ladd, Daniel. “Did parental involvement laws grow teeth? The effects of state restrictions on minors’ access to abortion.” Journal of Health Economics, 71, (2020): p.102302. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3029823

A paper by Caitlin Knowles Myers of Germany’s IZA Institute of Labor Economics and Daniel Ladd of the University of California, Irvine compiles data on the location of abortion providers and enforcement of parental involvement laws. The researchers seek to assess the impact that laws requiring parental approval for an abortion have on minors seeking abortions.
The paper concludes that parental involvement laws may have contributed to a modest decline in teen births (a 1.4% reduction) during the 1980s and 1990s but a 2.8% increase from 1993 to 2014 in women aged 15 to 18.
It further finds that a law’s avoidance distance (the distance minors have to travel to avoid parental involvement and seek an abortion confidentially) has significant effects. In the 1980s, a parental involvement law with an avoidance distance of 100 miles decreased teen births by 1.48%. A parental involvement law with a 400-mile avoidance distance, about a day’s drive, increases the teen birth rate by 4.3%.

Popinchalk, Anna, Cynthia Beavin, and Jonathan Bearak. “The State of Global Abortion Data: An Overview and Call to Action.” BMJ Sexual & Reproductive Health 48, no. 1 (January 1, 2022): 3–6. https://doi.org/10.1136/bmjsrh-2021-201109.

Popinchalk and colleagues at the Guttmacher Institute write in the journal BMJ Sexual & Reproductive Health on the urgent need for data on abortion incidence and access to examine disparities in people’s ability to safely terminate a pregnancy.
The authors note that the three sources of data on abortion are official statistics, surveys of women, and scientific studies. However, stigmatization and varying legal access undermine the quality of this data and can lead to substantial under-reporting. Even in high-income countries, there can be significant variation in the frequency with which data is published. This variation in quality and availability exacerbates inequities by limiting the number of experiences that can be studied.
The authors argue that data availability and quality of abortion care can be improved by investing in country-level surveys and scientific studies. They also argue for reducing stigma through community and provider messaging, as stigma can hinder the accuracy and completeness of datasets.

Tierney, Katherine I. “Abortion Underreporting in Add Health: Findings and Implications.” Population Research and Policy Review 38, no. 3 (June 1, 2019): 417–28. https://doi.org/10.1007/s11113-019-09511-8

Tierney notes that there is substantial evidence that abortion is significantly underreported in the United States, especially among Black women and those in lower socioeconomic classes.
She supplements this review with her own evaluation of the abortion data in the National Longitudinal Study of Adolescent to Adult Health (Add Health), finding that the dataset captures only 35% of expected abortions. Examining data from 1994–1995, 1996, 2001–2002, and 2008–2009, she found severe abortion underreporting; however, there were no significant differences between race/ethnicity, age, or time of abortion and underreporting.
Tierney argues that this fact means that Add Health is no better than other surveys in collecting abortion data. She also argues that this underreporting, likely caused by stigma, has substantial implications for research and that researchers should be cautious with self-reports of abortion. Figures need to be evaluated, contextualized, and used with caution.

Selected Readings on the Intersection of Data, Abortion Care, and Women’s Health

By Uma Kalkar, Marine Ragnet, and Stefaan Verhulst

Digital self-determination (DSD) is a multidisciplinary concept that extends self-determination to the digital sphere. Self-determination places humans (and their ability to make ‘moral’ decisions) at the center of decision-making actions. While self-determination is considered a jus cogens rule (i.e., a global norm), the concept of digital self-determination only came to light in the early 2010s as a result of the increasing digitization of most aspects of society.

While digitalization has opened up new opportunities for self-expression and communication for individuals across the globe, its reach and benefits have not been evenly distributed. For instance, migrants and refugees are particularly vulnerable to the deepening inequalities and power structures brought on by increased digitization and the subsequent datafication. Further, non-traditional data, such as social media and telecom data, have brought great potential to improve our understanding of the migration experience and patterns of mobility, which can inform more targeted migration policies and services. Yet they have also brought new concerns related to the lack of agency to determine how the data is being used and who determines the migration narrative.

These selected readings look at DSD in light of the growing ubiquity of technology applications and specifically focus on their impacts on migrants. They were produced to inform the first studio on DSD and migration co-hosted by the Big Data for Migration Alliance and the International Digital Self Determination Network. The readings are listed in alphabetical order.

These readings serve as a primer to offer base perspectives on DSD and its manifestations, as well as provide a better understanding of how migration data is managed today to advance or hinder life for those on the move. Please alert us to any other publication we should include moving forward.

Berens, Jos, Nathaniel Raymond, Gideon Shimshon, Stefaan Verhulst, and Lucy Bernholz. “The Humanitarian Data Ecosystem: the Case for Collective Responsibility.” Stanford Center for Philanthropy and Civil Society, 2017.

The authors explore the challenges to, and potential solutions for, the responsible use of digital data in the context of international humanitarian action. Data governance is related to DSD because it oversees how the information extracted from an individual—understood by DSD as an extension of oneself in the digital sphere—is handled.
They argue that in the digital age, the basic service provision activities of NGOs and aid organizations have become data collection processes. However, the ecosystem of actors is “uncoordinated,” creating inefficiencies and vulnerabilities in the humanitarian space.
The paper presents a new framework for responsible data use in the humanitarian domain. The authors advocate for data users to follow three steps:

“[L]ook beyond the role they take up in the ‘data-lifecycle’ and consider previous and following steps and roles;
Develop sound data responsibility strategies not only to prevent harm to their own operations but also to other organizations in the ‘data-lifecycle;’ and,
Collaborate with and learn from other organizations, both in the humanitarian field and beyond, to establish broadly supported guidelines and standards for humanitarian data use.”

Currion, Paul. “The Refugee Identity.” Caribou Digital (via Medium), March 13, 2018.

Developed as part of a DFID-funded initiative, this essay outlines the Data Requirements for Service Delivery within Refugee Camps project that investigated current data standards and design of refugee identity systems.
Currion finds that since “the digitisation of aid has already begun…aid agencies must therefore pay more attention to the way in which identity systems affect the lives and livelihoods of the forcibly displaced, both positively and negatively.” He argues that an interoperable digital identity for refugees is essential to access financial, social, and material resources while on the move but also to tap into IoT services.
However, many refugees are wary of digital tracking and data collection services that could further marginalize them as they search for safety. At present, there are no sector-level data standards around refugee identity data collection, combination, and centralization. How can regulators balance data protection with government and NGO requirements so that refugees are served in the ways they want and their DSD is upheld?
Currion argues that a Responsible Data approach, as opposed to a process defined by a Data Minimization principle, provides “useful guidelines” but notes that data responsibility “still needs to be translated into organizational policy, then into institutional processes, and finally into operational practice.” He further adds that “the digitization of aid, if approached from a position that empowers the individual as much as the institution, offers a chance to give refugees back their voices.”

Decker, Rianne, Paul Koot, S. Ilker Birbil, and Mark van Embden Andres. “Co-designing algorithms for governance: Ensuring responsible and accountable algorithmic management of refugee camp supplies.” Big Data and Society, April 2022.

While recent literature has looked at the negative impacts of big data and algorithms in public governance, claiming they may reinforce existing biases and defy scrutiny by public officials, this paper argues that designing algorithms with relevant government and society stakeholders might be a way to make them more accountable and transparent.
It presents a case study of the development of an algorithmic tool to estimate the populations of refugee camps to manage the delivery of emergency supplies. The algorithms included in this tool were co-designed with relevant stakeholders.
This may provide a way to uphold DSD by contributing to the “accountability of the algorithm by making the estimations transparent and explicable to its users.”
The authors found that the co-design process enabled better accuracy and responsibility and fostered collaboration between partners, creating a suitable purpose for the tool and making the algorithm understandable to its users. This enabled algorithmic accountability.
The authors note, however, that the beneficiaries of the tools were not included in the design process, limiting the legitimacy of the initiative.

European Migration Network. “The Use of Digitalisation and Artificial Intelligence in Migration Management.” EMN-OECD Inform Series, February 2022.

This paper explores the role of new digital technologies in the management of migration and asylum, focusing specifically on where digital technologies, such as online portals, blockchain, and AI-powered speech and facial recognition systems are being used across Europe to navigate the processes of obtaining visas, claiming asylum, gaining citizenship, and deploying border control management.
Further, it points to friction between GDPR and new technologies like blockchain, which by design does not allow for the right to be forgotten, and to potential workarounds, such as two-step pseudonymisation.
As well, it highlights steps taken to oversee and open up data protection processes for immigration. Austria, Belgium, and France have begun to conduct Data Protection Impact Assessments; France has a portal that allows one to request the right to be forgotten; Ireland informs online service users on how data can be shared or used with third-party agencies; and Spain outlines which personal data are used in immigration as per the Registry Public Treatment Activities.
Lastly, the paper points out next steps for policy development that upholds DSD, including universal access and digital literacy, trust in digital systems, willingness for government digital transformations, and bias and risk reduction.

Martin, Aaron, Gargi Sharma, Siddharth Peter de Souza, Linnet Taylor, Boudewijn van Eerd, Sean Martin McDonald, Massimo Marelli, Margie Cheesman, Stephan Scheel, and Huub Dijstelbloem. “Digitisation and Sovereignty in Humanitarian Space: Technologies, Territories and Tensions.” Geopolitics (2022): 1-36.

This paper explores how digitisation and datafication are reshaping sovereign authority, power, and control in humanitarian spaces.
Building on the notion that technology is political, Martin et al. discuss three cases in which digital tools powered by partnerships among international organizations, NGOs, and private firms such as Palantir and Facebook have raised concerns that data could be “repurposed” to undermine national sovereignty and distort humanitarian aims with for-profit motivations.
The authors draw attention to how cyber dependencies threaten international humanitarian organizations’ purported digital sovereignty. They touch on the tensions between national and digital sovereignty and self-governance.
The paper further argues that the rise of digital technologies in the governance of international mobility and migration policies “has all kinds of humanitarian and security consequences,” including (but not limited to) surveillance, privacy infringement, profiling, selection, inclusion/exclusion, and access barriers. Specifically, Scheel introduces the notion of function creep—the use of digital data beyond initially defined purposes—and emphasizes its common use in the context of migration as part “of the modus operandi of sovereign power.”

McAuliffe, Marie, Jenna Blower, and Ana Beduschi. “Digitalization and Artificial Intelligence in Migration and Mobility: Transnational Implications of the COVID-19 Pandemic.” Societies 11, no. 135 (2021): 1-13.

This paper critically examines the implications of intensifying digitalization and AI for migration and mobility systems in a post-COVID transnational context.
The authors first situate digitalization and AI in migration by analyzing their uptake throughout the Migration Cycle, i.e., to verify identities and visas, “enable ‘smart’ border processing,” and understand travelers’ adherence to legal frameworks. They then evaluate the current challenges and opportunities to migrants and migration systems brought about by deepening digitalization due to COVID-19. For example, contact tracing, infection screening, and quarantining procedures generate increased data about an individual and are meant, by design, to track and trace people, which raises concerns about migrants’ safety, privacy, and autonomy.
This essay argues that recent changes show the need for further computational advances that incorporate human rights throughout the design and development stages, “to mitigate potential risks to migrants’ human rights.” AI is severely flawed when it comes to decision-making around minority groups because of biased training data and could further marginalize vulnerable populations, while intrusive data collection for public health could erode the power of one’s universal right to privacy. Leaving migrants at the mercy of black-box AI systems fails to uphold their right to DSD because it forces them to relinquish their agency and power to an opaque system.

Ponzanesi, Sandra. “Migration and Mobility in a Digital Age: (Re)Mapping Connectivity and Belonging.” Television & New Media 20, no. 6 (2019): 547-557.

This article explores the role of new media technologies in rethinking the dynamics of migration and globalization by focusing on the role of migrant users as “connected” and active participants, as well as “screened” and subject to biometric datafication, visualization, and surveillance.
Elaborating on concepts such as “migration” and “mobility,” the article analyzes the paradoxes of intermittent connectivity and troubled belonging, which are seen as relational definitions that are always fluid, negotiable, and porous.
It states that a city’s digital infrastructures are “complex sociotechnical systems” that have a functional side related to access and connectivity and a performative side where people engage with technology. Digital access and action represent areas of individual and collective manifestations of DSD. For migrants, gaining digital access and skills and “enacting citizenship” are important for resettlement. Ponzanesi advocates for further research that combines a bottom-up approach, which leans on migrants’ experiences of using technology to resettle and remain in contact with their homeland, with a top-down approach that looks at datafication, surveillance, and digital/e-governance as a part of the larger technology application ecosystem, in order to understand contemporary processes and problems of migration.

Remolina, Nydia, and Mark James Findlay. “The Paths to Digital Self-Determination — A Foundational Theoretical Framework.” SMU Centre for AI & Data Governance Research Paper No. 03 (2021): 1-34.

Remolina and Findlay stress that self-determination is the vehicle by which people “decide their own destiny in the international order.” Decision-making ability powers humans to be in control of their own lives and excited to pursue a set of actions. Collective action, or the ability to make decisions as a part of a group—be it based on ethnicity, nationality, shared viewpoints, etc.—further motivates oneself.
The authors discuss how the European Union and European Court of Human Rights’ “principle of subsidiarity” aligns with self-determination because it advocates for power to be placed at the lowest level possible to preserve bottom-up agency with a “reasonable level of efficiency.” In practice, the results of subsidiarity have been disappointing.
The paper provides examples of indigenous populations’ fight for self-determination, offline and online. Here, digital self-determination refers to the challenges indigenous peoples face in accessing growing government uses of technology for unlocking innovative solutions because of a lack of physical infrastructure due to structural and social inequities between settler and indigenous communities.
Understanding self-determination, and by extension digital self-determination, as a human right, the report investigates autonomy, sovereignty, the legal definition of a ‘right,’ inclusion, agency, data governance, data ownership, data control, and data quality.
Lastly, the paper presents a foundational theoretical framework that goes beyond just protecting personal data and privacy. Understanding that DSD “cannot be detached from duties for responsible data use,” the authors present a collective and individual dimension to DSD. They extend the individual dimension of DSD to include both my data and data about me that can be used to influence a person’s actions through micro-targeting and nudge techniques. They update the collective dimension of DSD to include the views and influences of organizations, businesses, and communities online and call for a better way of visualizing the ‘social self’ and its control over data.

Ziebart, Astrid, and Jessica Bither. “AI, Digital Identities, Biometrics, Blockchain: A Primer on the Use of Technology in Migration Management.” Migration Strategy Group on International Cooperation and Development, June 2020.

Ziebart and Bither note the implications of increasingly sophisticated use of technology and data collection by governments with respect to their citizens. They note that migrants and refugees “often are exposed to particular vulnerabilities” during these processes and underscore the need to bring migrants into data gathering and use policy conversations.
The authors discuss the promise of technology, namely predicting migration through AI-powered analyses, reducing friction in asylum-seeking processes, and harnessing the power of digital identities for those on the move. However, they stress the need to combine these tools with informational self-determination that allows migrants to own and control what data they share and how and where the data are used.
The migration and refugee policy space faces issues of “tech evangelism,” where technologies are being employed just because they exist, rather than because they serve an actual policy need or provide an answer to a particular policy question. This supply-driven policy implementation signals the need for more migrant voices to inform policymakers on what tools are actually useful for the migratory experience. In order to advance the digital agency of migrants, the paper offers recommendations for some of the ethical challenges these technologies might pose and ultimately advocates for greater participation of migrants and refugees in devising technology-driven policy instruments for migration issues.

On-the-go interesting resources

Empowering Digital Self-Determination, mediaX at Stanford University: This short video presents definitions of DSD, digital personhood, identity, and privacy, along with an overview of their applications across ethics, law, and the private sector.
Digital Self-Determination — A Living Syllabus: This syllabus and assorted materials have been created and curated from the 2021 Research Sprint run by the Digital Asia Hub and Berkman Klein Center for Internet & Society at Harvard University. It introduces learners to the fundamentals of DSD across a variety of industries to enrich understanding of its existing and potential applications.
Digital Self-Determination Wikipedia Page: This Wikipedia page was developed by the students who took part in the Berkman Klein Center research sprint on digital self-determination. It provides a comprehensive overview of DSD definitions and its key elements, which include human-centered design, robust privacy mandates and data governance, and control over data use to give data subjects the ability to choose how algorithms manipulate their data for autonomous decision-making.
Roger Dubach on Digital Self-Determination: This short video presents DSD in the public sector and the dangers of creating a ‘data-protected’ world, emphasizing instead the need to understand how governments can efficiently use data and protect privacy. Note: this video is part of the Living Syllabus course materials (Digital Self-Determination/Module 1: Beginning Inquiries).

Selected Readings on Digital Self-Determination for Migrants

By Fiona Cece, Uma Kalkar, Stefaan Verhulst, and Andrew J. Zahuranec

In this edition, we reflect on the one-year anniversary of the January 6, 2021 Capitol Hill Insurrection and on the implications of disinformation and data misuse in support of malicious objectives. This selected reading builds on the previous edition, published last year, on misinformation’s effect on violence and riots. Readings are listed in alphabetical order. New additions are highlighted in green.

The mob attack on the US Congress was alarming and the result of various efforts to undermine the trust in and legitimacy of longstanding democratic processes and institutions. The use of inaccurate data, half-truths, and disinformation to spread hate and division is considered a key driver behind last year’s attack. Altering data to support conspiracy theories or challenging and undermining the credibility of trusted data sources to allow for alternative narratives to flourish, if left unchallenged, has consequences — including the increased acceptance and use of violence both offline and online.

The January 6th insurrection was unfortunately not a unique event, nor was it contained to the United States. While efforts to bring perpetrators of the attack to justice have been fruitful, much work remains to be done to address the willful dissemination of disinformation online. Below, we provide a curation of findings and readings that illustrate the global danger of inaccurate data, half-truths, and disinformation. As well, The GovLab, in partnership with the OECD, has explored data-actionable questions around how disinformation can spread across and affect society, and ways to mitigate it. Learn more at disinformation.the100questions.org.

To suggest additional readings on this or any other topic, please email info@thelivinglib.org. All our Selected Readings can be found here.

Readings and Annotations

Al-Zaman, Md. Sayeed. “Digital Disinformation and Communalism in Bangladesh.” China Media Research 15, no. 2 (2019): 68–76.

Md. Sayeed Al-Zaman, Lecturer at Jahangirnagar University in Bangladesh, discusses how the country’s increasing number of “netizens” are being manipulated by online disinformation and inciting violence along religious lines. Social media helps quickly spread anti-Hindu and anti-Buddhist rhetoric, inflaming religious divisions between these groups and Bangladesh’s Muslim majority, impeding possibilities for “peaceful coexistence.”
Swaths of online information make it difficult to fact-check, and alluring stories that feed on people’s fear and anxieties are highly likely to be disseminated, leading to a spread of rumors across Bangladesh. Moreover, disruptors and politicians wield religion to target citizens’ emotionality and create violence.
Al-Zaman recounts two instances of digital disinformation and communalism. First, in 2016, following a Facebook post supposedly criticizing Islam, riots destroyed 17 temples and 100 houses in Nasirnagar and led to protests in neighboring villages. While the exact source of the disinformation post was never confirmed, a man was beaten and jailed for it despite a lack of robust evidence of his wrongdoing. Second, in 2012, after a Facebook post that circulated an image of someone desecrating the Quran tagged a Buddhist youth in the picture, 12 Buddhist monasteries and 100 houses in Ramu were destroyed. Mobilized through social media, a mob of over 6,000 people, including local Muslim community leaders, attacked the town of Ramu. Later investigation found that the image had been doctored and spread by an Islamic extremist group member in a coordinated attack, manipulating Islamic religious sentiment via fake news to target Buddhist minorities.

Banaji, Shakuntala, and Ram Bhat. “WhatsApp Vigilantes: An exploration of citizen reception and circulation of WhatsApp misinformation linked to mob violence in India.” London School of Economics and Political Science, 2019.

London School of Economics and Political Science Associate Professor Shakuntala Banaji and Researcher Ram Bhat articulate how discriminated groups (Dalits, Muslims, Christians, and Adivasis) have been targeted by peer-to-peer communications spreading allegations of bovine-related issues, child-snatching, and organ harvesting, culminating in violence against these groups with fatal consequences.
WhatsApp messages work in tandem with ideas, tropes, messages, and stereotypes already in the public domain, providing “verification” of fake news.
WhatsApp use is gendered, and users are predisposed to believe and spread misinformation, particularly if it targets a discriminated group toward which they already hold negative and discriminatory feelings.
Among most WhatsApp users, civic trust is based on ideological, family, and community ties.
Restricting sharing, tracking, and reporting of misinformation using “beacon” features and imposing penalties on groups can serve to mitigate the harmful effects of fake news.

Funke, Daniel, and Susan Benkelman. “Misinformation is inciting violence around the world. And tech platforms don’t seem to have a plan to stop it.” Poynter, April 4, 2019.

Misinformation leading to violence has been on the rise worldwide. PolitiFact writer Daniel Funke and Susan Benkelman, former Director of Accountability Journalism at the American Press Institute, point to mob violence against Roma in France after rumors of kidnapping attempts circulated on Facebook and Snapchat; the immolation of two men in Puebla, Mexico following fake news spread on WhatsApp of a gang of organ harvesters on the prowl; and false kidnapping claims sent through WhatsApp fueling lynch mobs in India.
Slow (re)action to fake news allows mis/disinformation to prey on vulnerable people and infiltrate society. Examples covered in the article discuss how fake news preys on older Americans who lack strong digital literacy. Virulent online rumors have made it difficult for citizens to separate fact from fiction during the Indian general election. Foreign adversaries like Russia are bribing Facebook users for their accounts in order to spread false political news in Ukraine.
The article notes that increases in violence caused by disinformation are doubly enabled by “a lack of proper law enforcement” and inaction by technology companies. Facebook, YouTube, and WhatsApp have no coordinated, comprehensive plans to fight fake news and attempt to shift responsibility to “fact-checking partners.” Troublingly, it appears that some platforms deliberately delay the removal of mis/disinformation to attract more engagement. Only when facing intense pressure from policymakers do these companies seem to remove misleading information.

Kyaw, Nyi Nyi. “Facebooking in Myanmar: From Hate Speech to Fake News to Partisan Political Communication.” ISEAS — Yusof Ishak Institute, no. 36 (2019): 1–10.

In the past decade, the number of plugged-in Myanmar citizens has skyrocketed to 39% of the population. All of these 21 million internet users are active on Facebook, where much political rhetoric occurs. Widespread fake news disseminated through Facebook has led to an increase in anti-Muslim sentiment and the spread of misleading, inflammatory headlines.
Attempts to curtail fake news on Facebook are difficult. In Myanmar, a developing country where “the rule of law is weak,” monitoring and regulation on social media is not easily enforceable. Criticism from Myanmar and international governments and civil society organizations resulted in Facebook banning and suspending fake news accounts and pages and employing stricter, more invasive monitoring of citizen Facebook use — usually without their knowledge. However, despite Facebook’s key role in agitating and spreading fake news, no political or oversight bodies have “explicitly held the company accountable.”
Nyi Nyi Kyaw, Visiting Fellow at the Yusof Ishak Institute in Singapore, notes a cyber law initiative set in motion by the Myanmar government to strengthen social media monitoring methods but is wary of Myanmar’s “human and technological capacity” to enforce these regulations.

Lewandowsky, Stephan, and Sander van der Linden. “Countering Misinformation and Fake News Through Inoculation and Prebunking.” European Review of Social Psychology 32, no. 2 (2020): 348-384.

Researchers Stephan Lewandowsky and Sander van der Linden present a scan of conventional instances and tools to combat misinformation. They note the staying power and spread of sensational sound bites, especially in the political arena, and their real-life consequences on problems such as anti-vaccination campaigns, ethnically charged violence in Myanmar, and mob lynchings in India spurred by WhatsApp rumors.
To proactively stop misinformation, the authors introduce the psychological theory of “inoculation,” which forewarns people that they have been exposed to misinformation and alerts them to the ways by which they could be misled to make them more resilient to false information. The paper highlights numerous successes of inoculation in combating misinformation and presents it as a strategy to prevent disinformation-fueled violence.
The authors then discuss best strategies to deploy fake news inoculation and generate “herd” cognitive immunity in the face of microtargeting and filter bubbles online.

Osmundsen, Mathias, Alexander Bor, Peter Bjerregaard Vahlstrup, Anja Bechmann, and Michael Bang Petersen. “Partisan polarization is the primary psychological motivation behind ‘fake news’ sharing on Twitter.” American Political Science Review 115, no. 3 (2020): 999-1015.

Mathias Osmundsen and colleagues explore the proliferation of fake news on digital platforms. Are those who share fake news “ignorant and lazy,” malicious actors, or playing political games online? Through a psychological mapping of over 2,000 Twitter users across 500,000 stories, the authors find that disruption and polarization fuel fake news dissemination more so than ignorance.
Given the increasingly polarized American landscape, spreading fake news can help spread “partisan feelings,” increase interparty social and political cohesion, and call supporters to incendiary and violent action. Thus, misinformation prioritizes usefulness to reach end goals over accuracy and veracity of information.
Overall, the authors find that those with low political awareness and media literacy are the least likely to share fake news. While older individuals were more likely to share fake news, the inability to distinguish real from fake information was not a major driver of the spread of misinformation.
For the most part, those who share fake news are knowledgeable about the political sphere and online spaces. They are primarily motivated to ‘troll’ or create online disruption, or to further their partisan stance. In the United States, right-leaning individuals are more likely to follow fake news because they “must turn to more extreme news sources” to find information aligned with their politics, while left-leaning people can find more credible sources from liberal and centrist outlets.

Piazza, James A. “Fake news: the effects of social media disinformation on domestic terrorism.” Dynamics of Asymmetric Conflict (2021): 1-23.

James A. Piazza of Pennsylvania State University examines the role of online misinformation in driving distrust, political extremism, and political violence. He reviews some of the ongoing literature on online misinformation and disinformation in driving these and other adverse outcomes.
Using data on incidents of terrorism from the Global Terrorism Database and three independent measures of disinformation derived from the Digital Society Project, Piazza finds “disinformation propagated through online social media outlets is statistically associated with increases in domestic terrorism in affected countries. The impact of disinformation on terrorism is mediated, significantly and substantially, through increased political polarization.”
Piazza notes that his results support other literature that shows the real-world effects of online disinformation. He emphasizes the need for further research and investigation to better understand the issue.

Posetti, Julie, Nermine Aboulez, Kalina Bontcheva, Jackie Harrison, and Silvio Waisbord. “Online violence Against Women Journalists: A Global Snapshot of Incidence and Impacts.” United Nations Educational, Scientific and Cultural Organization, 2020.

The survey focuses on the incidence and impacts of, and responses to, online violence against women journalists that results from “coordinated disinformation campaigns leveraging misogyny and other forms of hate speech.” There were 901 respondents, hailing from 125 countries and covering various ethnicities.
73% of women journalists reported facing online violence and harassment in the course of their work, suggesting escalating gendered violence against women in online media.
The impact of COVID-19 and populist politics is evident in the gender-based harassment and disinformation campaigns, the source of which is traced to political actors (37%) or anonymous/troll accounts (57%).
Investigative reporting on gender issues, politics and elections, immigration and human rights abuses, or fake news itself seems to attract online retaliation and targeted disinformation campaigns against the reporters.

Rajeshwari, Rema. “Mob Lynching and Social Media.” Yale Journal of International Affairs, June 1, 2019.

District Police Chief of Jogulamba Gadwal, India, and Yale World Fellow (’17) Rema Rajeshwari writes about how misinformation and disinformation are becoming a growing problem and security threat in India. The fake news phenomenon has spread hatred, fueled sectarian tensions, and continues to diminish social trust in society.
One example of this can be found in Jogulamba Gadwal, where videos and rumors were spread throughout social media about how the Parthis, a stigmatized tribal group, were committing acts of violence in the village. This led to a series of mob attacks and killings — “thirty-three people were killed in sixty-nine mob attacks since January 2018 due to rumors” — that could be traced to rumors spread on social media.
More importantly, however, Rajeshwari elaborates on how self-regulation and local campaigns can be used as an effective intervention for mis/dis-information. As a police officer, Rajeshwari fought a battle that was both online and on the ground, including the formation of a group of “tech-savvy” cops who could monitor local social media content and flag inaccurate and/or malicious posts, and mobilizing local WhatsApp groups alongside village headmen who could encourage community members to not forward fake messages. These interventions effectively combined local traditions and technology to achieve an “early warning-focused deterrence.”

Taylor, Luke. “Covid-19 Misinformation Sparks Threats and Violence against Doctors in Latin America.” BMJ (2020): m3088.

Journalist Luke Taylor details the many incidents of how disinformation campaigns across Latin America have resulted in the mistreatment of health care workers during the Coronavirus pandemic. Examining case studies from Mexico and Colombia, Taylor finds that these mis/disinformation campaigns have resulted in health workers receiving death threats and being subject to acts of aggression.
One instance of this link between disinformation and acts of aggression is the 47 reported cases of aggression towards health workers in Mexico, along with 265 reported complaints against health workers. The National Council to Prevent Discrimination noted these acts were the result of a loss of trust in government and government institutions, which was further exacerbated by conspiracy theories that circulated on WhatsApp and other social media channels.
Another example of false narratives can be seen in Colombia, where a politician theorized that a “covid cartel” of doctors was admitting COVID-19 patients to ICUs in order to receive payments (e.g., a cash payment of ~17,000 Colombian pesos for every dead patient with a covid-19 diagnosis). This false narrative of doctors being incentivized to increase beds for COVID-19 patients quickly spread across social media platforms, leading many of those who were ill to avoid seeking care. This rumor also led to doctors in Colombia receiving death threats and facing acts of intimidation.

“The Danger of Fake News in Inflaming or Suppressing Social Conflict.” Center for Information Technology and Society — University of California Santa Barbara, n.d.

The article provides case studies of how fake news can be used to intensify social conflict for political gains (e.g., by distracting citizens from having a conversation about critical issues and undermining the democratic process).
The cases elaborated upon are 1) Pizzagate: a fake news story that linked human trafficking to a presidential candidate and a political party, and ultimately led to a shooting; 2) Russia’s Internet Research Agency: Russian agents created social media accounts to spread fake news that favored Donald Trump during the 2016 election, and even instigated online protests about social issues (e.g., a BLM protest); and 3) Cambridge Analytica: a British company that used unauthorized social media data for sensationalistic and inflammatory targeted US political advertisements.
Notably, it points out that fake news undermines a citizen’s ability to participate in the democratic process and make accurate decisions in important elections.

Tworek, Heidi. “Disinformation: It’s History.” Center for International Governance Innovation, July 14, 2021.

While some public narratives frame online disinformation and its influence on real-world violence as “unprecedented and unparalleled” to occurrences in the past, Professor Heidi Tworek of the University of British Columbia points out that “assumptions about the history of disinformation” have influenced (and continue to influence) policymaking to combat fake news. She argues that today’s seemingly unprecedented events are rooted in tactics similar to those of the past, such as how Finnish policymakers invested in a national communications strategy to fight foreign disinformation coming from Russia and the Soviet Union.
She emphasizes the power of learning from historical events to guide modern methods of fighting political misinformation. Connecting today’s concerns of election fraud, foreign interference, and conspiracy theories to those of the past, such as the Soviet and American practices of “funding magazines [and] spreading rumors” during the Cold War to further anti-opposition sentiment and hatred, reinforces that disinformation is a long-standing problem.

Ward, Megan, and Jessica Beyer. “Vulnerable Landscapes: Case Studies of Violence and Disinformation.” Wilson Center, August 2019.

This article discusses instances where disinformation inflamed already existing social, political, and ideological cleavages, and ultimately caused violence. Specifically, it elaborates on instances from the US-Mexico border, India, Sri Lanka, and during the course of three Latin American elections.
Though the cases are meant to be illustrative and highlight the spread of disinformation globally, the violence in these cases was shown to be affected by the distinct social fabric of each place. Their findings lend credence to the idea that disinformation helped spark violence in places that were already vulnerable and tense.
Indeed, now that disinformation can be so quickly distributed using social media, coupled with declining trust in public institutions, low levels of media literacy, meager actions taken by social media companies, and government actors who exploit disinformation for political gain, there has been a rise of these cases globally. It is an interaction of factors such as distrust in traditional media and public institutions, lack of content moderation on social media, and ethnic divides that render societies vulnerable and susceptible to violence.
One example of this is at the US/Mexico border, where disinformation campaigns have built on pre-existing xenophobia and have led to instances of mob violence and mass shootings. Inflamed by disinformation campaigns claiming that migrant caravans contain criminals (e.g., invasion narratives often used to describe migrant caravans), the armed group United Constitutional Patriots (UCP) impersonated law enforcement and detained migrants at the US border, often turning them over to border officials. UCP has since been arrested by the FBI for impersonating law enforcement.

We welcome other sources we may have missed — please share any suggested additions with us at datastewards [at] thegovlab.org or The GovLab on Twitter.

Updated Selected Readings on Inaccurate Data, Half-Truths, Disinformation, and Mob Violence

By Kateryna Gazaryan and Uma Kalkar

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works focuses on algorithms and artificial intelligence in the public sector.

As Artificial Intelligence becomes more developed, governments have turned to it to improve the speed and quality of public sector service delivery, among other objectives. Below, we provide a selection of recent literature that examines how the public sector has adopted AI to serve constituents and solve public problems. While the use of AI in governments can cut down costs and administrative work, these technologies are often early in development and difficult for organizations to understand and control, with potentially harmful effects as a result. As such, this selected reading explores not only the use of artificial intelligence in governance but also its benefits and consequences.

Readings are listed in alphabetical order.

Berryhill, Jamie, Kévin Kok Heang, Rob Clogher, and Keegan McBride. “Hello, World: Artificial intelligence and its use in the public sector.” OECD Working Papers on Public Governance no. 36 (2019): https://doi.org/10.1787/726fd39d-en.

This working paper emphasizes the importance of defining AI for the public sector and outlining use cases of AI within governments. It provides a map of 50 countries that have implemented or set in motion the development of AI strategies and highlights where and how these initiatives are cross-cutting, innovative, and dynamic. Additionally, the piece provides policy recommendations governments should consider when exploring public AI strategies to adopt holistic and humanistic approaches.

Kuziemski, Maciej, and Gianluca Misuraca. “AI Governance in the Public Sector: Three Tales from the Frontiers of Automated Decision-Making in Democratic Settings.” Telecommunications Policy 44, no. 6 (2020): 101976.

Kuziemski and Misuraca explore how the use of artificial intelligence in the public sector can exacerbate existing power imbalances between the public and the government. They consider the European Union’s artificial intelligence “governance and regulatory frameworks” and compare these policies with those of Canada, Finland, and Poland. Drawing on previous scholarship, the authors outline the goals, drivers, barriers, and risks of incorporating artificial intelligence into public services and assess existing regulations against these factors. Ultimately, they find that the “current AI policy debate is heavily skewed towards voluntary standards and self-governance” while minimizing the influence of power dynamics between governments and constituents.

Misuraca, Gianluca, and Colin van Noordt. “AI Watch, Artificial Intelligence in Public Services: Overview of the Use and Impact of AI in Public Services in the EU.” 30255 (2020).

This study provides “evidence-based scientific support” for the European Commission as it navigates AI regulation via an overview of ways in which European Union member-states use AI to enhance their public sector operations. While AI has the potential to positively disrupt existing policies and functionalities, this report finds gaps in how AI gets applied by governments. It suggests the need for further research centered on the humanistic, ethical, and social ramifications of AI use and a rigorous risk assessment from a “public-value perspective” when implementing AI technologies. Additionally, efforts must be made to empower all European countries to adopt responsible and coherent AI policies and techniques.

Saldanha, Douglas Morgan Fullin, and Marcela Barbosa da Silva. “Transparency and Accountability of Government Algorithms: The Case of the Brazilian Electronic Voting System.” Cadernos EBAPE.BR 18 (2020): 697–712.

Saldanha and da Silva note that open data and open government revolutions have increased citizen demand for algorithmic transparency. Algorithms are increasingly used by governments to speed up processes and reduce costs, but their black-box systems and lack of explainability allow implicit and explicit bias and discrimination to enter their calculations. The authors conduct a qualitative study of the “practices and characteristics of the transparency and accountability” in the Brazilian e-voting system across seven dimensions: consciousness; access and reparations; accountability; explanation; data origin, privacy and justice; auditing; and validation, precision and tests. They find the Brazilian e-voting system fulfilled the need to inform citizens about the benefits and consequences of data collection and algorithm use but severely lacked in demonstrating accountability and opening algorithm processes for citizen oversight. They put forth policy recommendations to increase the e-voting system’s accountability to Brazilians and strengthen auditing and oversight processes to reduce the current distrust in the system.

Sharma, Gagan Deep, Anshita Yadav, and Ritika Chopra. “Artificial intelligence and effective governance: A review, critique and research agenda.” Sustainable Futures 2 (2020): 100004.

This paper conducts a systematic review of the literature on how AI is used across different branches of government, specifically the healthcare; information, communication, and technology; environment; transportation; policy making; and economic sectors. Across the 74 papers surveyed, the authors find a gap in the research on selecting and implementing AI technologies, as well as their monitoring and evaluation. They call on future research to assess the impact of AI pre- and post-adoption in governance, along with the risks and challenges associated with the technology.

Tallerås, Kim, Terje Colbjørnsen, Knut Oterholm, and Håkon Larsen. “Cultural Policies, Social Missions, Algorithms and Discretion: What Should Public Service Institutions Recommend?” Part of the Lecture Notes in Computer Science book series (2020).

Tallerås et al. examine how the use of algorithms by public services, such as public radio and libraries, influences broader society and culture. For instance, to modernize its offerings, Norway’s broadcasting corporation (NRK) has adopted online platforms similar to popular private streaming services. However, NRK’s filtering process has faced “exposure diversity” problems that narrow recommendations to already popular entertainment and move Norway’s cultural offerings towards a singularity. As a public institution, NRK is required to “fulfill […] some cultural policy goals,” raising the question of how public media services can remain relevant in the era of algorithms fed by “individualized digital culture.” Efforts are currently underway to employ recommendation systems that balance cultural diversity with personalized content relevance to engage individuals and uphold the socio-cultural mission of public media.

Vogl, Thomas, Cathrine Seidelin, Bharath Ganesh, and Jonathan Bright. “Smart Technology and the Emergence of Algorithmic Bureaucracy: Artificial Intelligence in UK Local Authorities.” Public Administration Review 80, no. 6 (2020): 946–961.

Local governments are using “smart technologies” to create more efficient and effective public service delivery. These tools are twofold: not only do they help the public interact with local authorities, they also streamline the tasks of government officials. To better understand the digitization of local government, the authors conducted surveys, desk research, and in-depth interviews with stakeholders from local British governments, examining the reasoning, processes, and experiences within a changing government framework. Vogl et al. found an increase in “algorithmic bureaucracy” at the local level to reduce administrative tasks for government employees, generate feedback loops, and use data to enhance services. While the shift toward digital local government demonstrates initiatives to utilize emerging technology for public good, further research is required to determine which demographics are not involved in the design and implementation of smart technology services and how to identify and include these audiences.

Wirtz, Bernd W., Jan C. Weyerer, and Carolin Geyer. “Artificial intelligence and the public sector—Applications and challenges.” International Journal of Public Administration 42, no. 7 (2019): 596-615.

The authors provide an extensive review of the existing literature on AI uses and challenges in the public sector to identify the gaps in current applications. The developing nature of AI in public service has led to differing definitions of what constitutes AI and what risks and benefits it poses to the public. As well, the authors note the lack of focus on the downsides of AI in governance, with studies tending to primarily focus on the positive aspects of the technology. From this qualitative analysis, the researchers highlight ten AI applications: knowledge management, process automation, virtual agents, predictive analytics and data visualization, identity analytics, autonomous systems, recommendation systems, digital assistants, speech analytics, and threat intelligence. As well, they note four challenge dimensions: technology implementation, laws and regulation, ethics, and society. From these applications and risks, Wirtz et al. provide a “checklist for public managers” to make informed decisions on how to integrate AI into their operations.

Wirtz, Bernd W., Jan C. Weyerer, and Benjamin J. Sturm. “The dark sides of artificial intelligence: An integrated AI governance framework for public administration.” International Journal of Public Administration 43, no. 9 (2020): 818-829.

As AI is increasingly popularized and picked up by governments, Wirtz et al. highlight the lack of research on the challenges and risks—specifically, privacy and security—associated with implementing AI systems in the public sector. After assessing existing literature and uncovering gaps in the main governance frameworks, the authors outline the three areas of challenges of public AI: law and regulations, society, and ethics. Last, they propose an “integrated AI governance framework” that takes into account the risks of AI for a more holistic “big picture” approach to AI in the public sector.

Zuiderwijk, Anneke, Yu-Che Chen, and Fadi Salem. “Implications of the use of artificial intelligence in public governance: A systematic literature review and a research agenda.” Government Information Quarterly (2021): 101577.

Following a literature review on the risks and possibilities of AI in the public sector, Zuiderwijk, Chen, and Salem design a research agenda centered around the “implications of the use of AI for public governance.” The authors provide eight process recommendations, including: avoiding superficial buzzwords in research; conducting domain- and locality-specific research on AI in governance; shifting from qualitative analysis to diverse research methods; applying private sector “practice-driven research” to public sector study; furthering quantitative research on AI use by governments; creating “explanatory research designs”; sharing data for broader study; and adopting multidisciplinary reference theories. Further, they note the need for scholarship to delve into best practices, risk management, stakeholder communication, multisector use, and impact assessments of AI in the public sector to help decision-makers make informed decisions on the introduction, implementation, and oversight of AI in the public sector.

Selected Readings on the Use of Artificial Intelligence in the Public Sector

By Michelle Winowatan, Uma Kalkar, Andrew Young, and Stefaan Verhulst

The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of data, gender, and mobility was originally published in 2017, and updated in 2021.

This edition of the Selected Readings was developed as part of an ongoing project at the GovLab, supported by Data2X, in collaboration with UNICEF, DigitalGlobe, IDS (UDD/Telefonica R&D), and the ISI Foundation, to establish a data collaborative to analyze unequal access to urban transportation for women and girls in Chile. We thank all our partners for their suggestions to the below curation – in particular Leo Ferres at IDS who got us started with this collection; Ciro Cattuto and Michele Tizzoni from the ISI Foundation; and Bapu Vaitla at Data2X for their pointers to the growing data and mobility literature.

Introduction

Daily mobility is key for gender equity. Access to transportation contributes to women’s agency and independence. The ability to move from place to place safely and efficiently can allow women to access education, work, and the public domain more generally. Yet, mobility is not just a means to access various opportunities. It is also a means to enter the public domain.

Women’s mobility is a multi-layered challenge

Women’s daily mobility, however, is often hampered by social, cultural, infrastructural, and technical barriers. Cultural bias, for instance, limits women’s mobility by confining them to areas close to their homes because of society’s double standard that expects women to be homemakers. From an infrastructural perspective, public transportation mostly only accommodates home-to-work trips, when in reality women often make more complex trips with multiple stops, for example, at the market, school, or healthcare provider – sometimes called “trip chaining.” From a safety perspective, women tend to avoid making trips in certain areas and/or at certain times due to a constant risk of being sexually harassed in public places. Women are also pushed toward more expensive transportation – such as taking a cab instead of a bus or train – based on safety concerns.

The growing importance of (new sources of) data

Researchers are increasingly experimenting with ways to address these interdependent problems through the analysis of diverse datasets, often collected by private sector businesses and other non-governmental entities. Gender-disaggregated mobile phone records, geospatial data, satellite imagery, and social media data, to name a few, are providing evidence-based insight into gender and mobility concerns. Such data collaboratives – the exchange of data across sectors to create public value – can help governments, international organizations, and other public sector entities in the move toward more inclusive urban and transportation planning, and the promotion of gender equity.

The below curated set of readings seeks to focus on the following areas:

Insights on how data can inform gender empowerment initiatives,
Emergent research into the capacity of new data sources – like call detail records (CDRs) and satellite imagery – to increase our understanding of human mobility patterns, and,
Publications exploring data-driven policy for gender equity in mobility.

Readings are listed in alphabetical order.

We selected the readings based upon their focus (gender and/or mobility related); scope and representativeness (going beyond one project or context); type of data used (such as CDRs and satellite imagery); and date of publication.

Annotated Reading List

Data and Gender

Blumenstock, Joshua, and Nathan Eagle. Mobile Divides: Gender, Socioeconomic Status, and Mobile Phone Use in Rwanda. ACM Press, 2010.

Using traditional survey and mobile phone operator data, this study analyzes gender and socioeconomic divides in mobile phone use in Rwanda. It finds that mobile phone use is significantly more prevalent among men and people of higher socioeconomic status.
The study also shows the differences in the way men and women use phones, for example: women are more likely to use a shared phone than men.
The authors frame their findings around gender and economic inequality in the country to the end of providing pointers for government action.

Bosco, Claudio, et al. Mapping Indicators of Female Welfare at High Spatial Resolution. WorldPop and Flowminder, 2015.

This report focuses on early adolescence in girls, which often comes with a higher risk of violence, fewer economic opportunities, and restrictions on mobility. Significant data gaps, along with methodological and ethical issues surrounding data collection for girls, also make it harder for policymakers to create evidence-based policy to address those issues.
The authors analyze geolocated household survey data, using statistical models and validation techniques, and create high-resolution maps of various sex-disaggregated indicators, such as nutrition level, access to contraception, and literacy, to better inform local policymaking processes.
Further, the report identifies the gender data gap and issues surrounding gender data collection, and provides arguments for why having comprehensive data can help create better policy and contribute to the achievement of the Sustainable Development Goals (SDGs).

Buvinic, Mayra, Rebecca Furst-Nichols, and Gayatri Koolwal. Mapping Gender Data Gaps. Data2X, 2014.

This study identifies gaps in gender data in developing countries on health, education, economic opportunities, political participation, and human security issues.
It recommends ways to close the gender data gap through censuses and micro-level surveys as well as service and administrative records, and emphasizes how “big data” in particular can supply the missing data needed to measure progress in women’s and girls’ well-being. The authors argue that identifying these gaps is key to achieving SDG 5: advancing gender equality and women’s empowerment.

Catalyzing Inclusive Financial Systems: Chile’s Commitment to Women’s Data. Data2X, 2014.

This article analyzes global and national data in the banking sector to fill the gap in sex-disaggregated data in Chile. The purpose of the study is to describe the difference in spending behavior and priorities between women and men, identify the challenges women face in accessing financial services, and create policies that promote women’s inclusion in Chile.

Ready to Measure: Twenty Indicators for Monitoring SDG Gender Targets. Open Data Watch and Data2X, 2016.

Using readily available data, this study identifies 20 SDG indicators related to gender issues that can serve as a baseline measurement for advancing gender equality, such as percentage of women aged 20-24 who were married or in a union before age 18 (child marriage), proportion of seats held by women in national parliament, and share of women among mobile telephone owners, among others.

Ready to Measure Phase II: Indicators Available to Monitor SDG Gender Targets. Open Data Watch and Data2X, 2017.

The Phase II paper is an extension of the Ready to Measure Phase I paper above. Where Phase I identifies the readily available data to measure women’s and girls’ well-being, Phase II provides information on how to access this data and summarizes insights extracted from it.
Phase II elaborates on the insights gathered from the ready-to-measure indicators and finds that although the underlying data to measure indicators of women’s and girls’ well-being is readily available in most cases, it is typically not sex-disaggregated.
Over one in five – 53 out of 232 – SDG indicators specifically refer to women and girls. However, further analysis from this study reveals that at least 34 more indicators should be disaggregated by sex. For instance, there should be 15 more sex-disaggregated indicators for SDG number 3: “Ensure healthy lives and promote well-being for all at all ages.”
The report recommends that national statistical agencies take the lead and exert additional effort to fill the current gender data gap for each of the SDGs, using tools such as statistical modeling.

Reed, Philip J., Muhammad Raza Khan, and Joshua Blumenstock. Observing gender dynamics and disparities with mobile phone metadata. International Conference on Information and Communication Technologies and Development (ICTD), 2016.

The study analyzes mobile phone logs of millions of Pakistani residents to explore whether mobile phone usage behavior differs between men and women, and to determine the extent to which gender inequality is reflected in mobile phone usage.
It utilizes mobile phone data to analyze the pattern of usage behavior between genders, and socioeconomic and demographic data obtained from census and advocacy groups to assess the state of gender equality in each region in Pakistan.
One of its findings is a strong positive correlation between the proportion of female mobile phone users and education score.

Stehlé, Juliette, et al. Gender homophily from spatial behavior in a primary school: A sociometric study. 2013.

This paper seeks to understand homophily, the tendency of people to interact with peers who are similar to them in anything from “physical attributes to tastes or political opinions”. Further, it seeks to identify the magnitude of influence, a type of homophily applied to social structures.
Focusing on gender interaction among primary school aged children in France, this paper collects data from wearable devices worn by 200 children over a period of 2 days and measures the physical proximity and duration of interactions among those children in the playground.
It finds that interaction patterns are significantly determined by the grade and class structure of the school. This means that children belonging to the same class interact the most, and that lower grades usually do not interact with higher grades.
From a gender lens, this study finds that mixed-gender interactions are shorter than same-gender interactions. In addition, interactions among girls last longer than interactions among boys. These findings indicate that the children in this school tend to have stronger relationships within their own gender, or what the study calls gender homophily. It further finds that gender homophily is apparent in all classes.

Strengthening Gender Measures and Data in the COVID-19 Era: An Urgent Need for Change. Paris 21, 2021.

COVID-19 has exacerbated gender disparities, especially with regard to women’s livelihoods, unpaid labor, mental health, and risk of gender-based violence. Gaps in gender data impede robust, data-driven, and effective policies to quantify, analyze, and respond to these issues.
Without this information, the full effects of the COVID-19 pandemic cannot be understood. This report calls on National Statistical Systems, survey managers, funders, multilateral agencies, researchers, and policymakers to collect gender-intentional and disaggregated data that is standardized and comparable to address key areas of concern for women and girls. Additionally, it seeks to link non-traditional data sources, such as social media and news media, with existing frameworks to fill in knowledge gaps. Moreover, this information must be rendered accessible for all stakeholders to maximize the potential of the information. Post-pandemic, conscious collection and collation of gendered data is vital to preempt policy problems.

The Sex, Gender and COVID-19 Project: The COVID-19 Sex-Disaggregated Data Tracker. 2021.

This data tracker, produced by Global Health 50/50, the African Population and Health Research Center, and the International Center for Research on Women, tracks which countries and datasets have reported sex-disaggregated data on COVID-19 testing, confirmed cases, hospitalizations, and deaths.

Data and Mobility

Bengtsson, Linus, et al. Using Mobile Phone Data to Predict the Spatial Spread of Cholera. Flowminder, 2015.

This study seeks to predict the spatial spread of the 2010 cholera epidemic in Haiti using data from 2.9 million anonymous mobile phone SIM cards and reported cases of cholera from the Haitian Directorate of Health. It analyzes 78 study areas over the period of October 16 – December 16, 2010.
From this dataset, the study creates a mobility matrix that indicates mobile phone movement from one study area to another and combines that with the number of reported cases of cholera in the study areas to calculate the infectious pressure level of those areas.
The main finding of its analysis shows that the outbreak risk of a study area correlates positively with the infectious pressure level, where an infectious pressure of over 22 results in an outbreak within 7 days. Further, it finds that the infectious pressure level can inform the sensitivity and specificity of the outbreak prediction.
It hopes to improve infectious disease containment by identifying areas with highest risks of outbreaks.
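As a rough illustration of the approach summarized above, the sketch below combines a mobility matrix with reported case counts to score each study area. The weighting formula, the function name `infectious_pressure`, and the toy numbers are assumptions made for illustration and are not the authors' model; only the threshold of 22 is taken from the summary above.

```python
def infectious_pressure(mobility, cases):
    """Score each area by case-weighted inbound movement.

    mobility[i][j] is a hypothetical share of phones seen in area i
    that later appear in area j; cases[i] is the reported case count
    in area i. This is an assumed, simplified formula.
    """
    n = len(cases)
    return [
        sum(mobility[i][j] * cases[i] for i in range(n) if i != j)
        for j in range(n)
    ]


OUTBREAK_THRESHOLD = 22  # threshold quoted in the study summary

# Toy data for three areas (illustrative values only).
mobility = [
    [0.0, 0.5, 0.1],
    [0.2, 0.0, 0.1],
    [0.0, 0.3, 0.0],
]
cases = [100, 10, 0]

pressure = infectious_pressure(mobility, cases)
# Areas whose score exceeds the threshold would be flagged as at risk.
at_risk = [j for j, p in enumerate(pressure) if p > OUTBREAK_THRESHOLD]
```

In this toy setup, the area receiving the most movement from the high-case area is the one flagged, which mirrors the intuition behind the study: risk follows people, not just distance.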

Calabrese, Francesco, et al. Understanding Individual Mobility Patterns from Urban Sensing Data: A Mobile Phone Trace Example. SENSEable City Lab, MIT, 2012.

This study compares mobile phone data and odometer readings from annual safety inspections to characterize individual mobility and vehicular mobility in the Boston Metropolitan Area, measured by the average daily total trip length of mobile phone users and average daily Vehicular Kilometers Traveled (VKT).
The study found that, “accessibility to work and non-work destinations are the two most important factors in explaining the regional variations in individual and vehicular mobility, while the impacts of populations density and land use mix on both mobility measures are insignificant.” Further, “a well-connected street network is negatively associated with daily vehicular total trip length.”
This study demonstrates the potential for mobile phone data to provide useful and updatable information on individual mobility patterns to inform transportation and mobility research.

Campos-Cordobés, Sergio, et al. “Chapter 5 – Big Data in Road Transport and Mobility Research.” Intelligent Vehicles. Edited by Felipe Jiménez. Butterworth-Heinemann, 2018.

This study outlines a number of techniques and data sources – such as geolocation information, mobile phone data, and social network observation – that could be leveraged to predict human mobility.
The authors also provide a number of examples of real-world applications of big data to address transportation and mobility problems, such as transport demand modeling, short-term traffic prediction, and route planning.

Gauvin, Laetitia, et al. Gender gaps in urban mobility. Humanities & Social Sciences Communications vol. 7, issue 11, 2020.

This article discusses how urbanization affects mobility of women in realizing their rights. It points out the historic lack of gender disaggregated data for urban planning, leading to transportation designs that do not best accommodate the needs of women.
Taking the large and growing metropolitan area of Santiago, Chile, as a case study, the article examines the mobility traces from Call Detail Records (CDRs) of an anonymized cohort of mobile phone users, sorted by gender, over 3 months. It then maps differences between men and women with regard to socio-demographic indicators and mobility across the city and through the Santiago transportation network, and identifies points of interest frequented by either sex to inform gendered mobility needs in urban areas.

Lin, Miao, and Wen-Jing Hsu. Mining GPS Data for Mobility Patterns: A Survey. Pervasive and Mobile Computing vol. 12, 2014.

This study surveys the current field of research using high resolution positioning data (GPS) to capture mobility patterns.
The survey focuses on analyses related to frequently visited locations, modes of transportation, trajectory patterns, and place-based activities. The authors find “high regularity” in human mobility patterns despite high levels of variation among the mobility areas covered by individuals.

Phithakkitnukoon, Santi, Zbigniew Smoreda, and Patrick Olivier. Socio-Geography of Human Mobility: A Study Using Longitudinal Mobile Phone Data. PLoS ONE, 2012.

This study used a year’s call logs and location data of approximately one million mobile phone users in Portugal to analyze the association between individuals’ mobility and their social networks.
It measures and analyzes travel scope (locations visited) and geo-social radius (distance from friends, family, and acquaintances) to determine the association.
It finds that 80% of places visited are within 20 km of an individual’s nearest social ties’ location, rising to 90% within a 45 km radius. Further, as population density increases, the distance between individuals and their social networks decreases.
The findings of this study demonstrate how mobile phone data can provide insights into “the socio-geography of human mobility”.
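To make the kind of measurement described above concrete, here is a minimal sketch of how one might compute the share of visited places that lie within a given radius of a person's nearest social tie. The function names, the use of the haversine great-circle formula, and the sample coordinates are assumptions for illustration; the study's own method may differ.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def share_within(visited, ties, radius_km):
    """Fraction of visited places within radius_km of the nearest social tie."""
    near = sum(
        1 for place in visited
        if min(haversine_km(place, tie) for tie in ties) <= radius_km
    )
    return near / len(visited)


# Hypothetical example: two visited places, one social tie.
lisbon = (38.72, -9.14)
porto = (41.15, -8.61)

share_20km = share_within([lisbon, porto], [lisbon], radius_km=20)
share_400km = share_within([lisbon, porto], [lisbon], radius_km=400)
```

Repeating this calculation across many users and radii would yield a curve of the kind the study reports, where the share of visited places rises as the radius grows.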

Semanjski, Ivana, and Sidharta Gautama. Crowdsourcing Mobility Insights – Reflection of Attitude Based Segments on High Resolution Mobility Behaviour Data. vol. 71, Transportation Research, 2016.

Using cellphone data, this study maps attitudinal segments that explain how age, gender, occupation, household size, income, and car ownership influence an individual’s mobility patterns. This type of segment analysis is seen as particularly useful for targeted messaging.
The authors argue that these time- and space-specific insights could also provide value for government officials and policymakers, by, for example, allowing for evidence-based transportation pricing options and public sector advertising campaign placement.

Silveira, Lucas M., et al. MobHet: Predicting Human Mobility using Heterogeneous Data Sources. vol. 95, Computer Communications, 2016.

This study explores the potential of using data from multiple sources (e.g., Twitter and Foursquare), in addition to GPS data, to provide a more accurate prediction of human mobility. This heterogeneous data captures the popularity of different locations, the frequency of visits to those locations, and the relationships among people who are moving around the target area. The authors’ initial experimentation finds that combining these data sources identifies human mobility patterns more accurately.

Wilson, Robin, et al. Rapid and Near Real-Time Assessments of Population Displacement Using Mobile Phone Data Following Disasters: The 2015 Nepal Earthquake. PLOS Currents Disasters, 2016.

Utilizing call detail records of 12 million mobile phone users in Nepal, this study seeks to capture spatio-temporal details of population movements after the earthquake on April 25, 2015.
It seeks to answer the problem of slow and ineffective disaster response, by capturing near real-time displacement patterns provided by mobile phone call detail records, in order to inform humanitarian agencies on where to distribute their assistance. The preliminary results of this study were available nine days after the earthquake.
This project relies on the foundational cooperation with mobile phone operators, who supplied the de-identified data from 12 million users before the earthquake.
The study finds that shortly after the earthquake there was an anomalous population movement out of the Kathmandu Valley, the most impacted area, to surrounding areas. The study estimates 390,000 more people than normal had left the valley.

Data, Gender and Mobility

Althoff, Tim, et al. “Large-Scale Physical Activity Data Reveal Worldwide Activity Inequality.” Nature, 2017.

This study’s analysis of worldwide physical activity is built on a dataset containing 68 million days of physical activity of 717,527 people collected through their smartphone accelerometers.
The authors find a significant reduction in female activity levels in cities with high activity inequality, where high activity inequality is associated with low city walkability – walkability indicators include pedestrian facilities (city block length, intersection density, etc.) and amenities (shops, parks, etc.).
Further, they find that high activity inequality is associated with high levels of inactivity-related health problems, like obesity.

Borker, Girija. Safety First: Street Harassment and Women’s Educational Choices in India. Stop Street Harassment, 2017.

Using data collected from SafetiPin, an application that allows users to mark an area on a map as safe or not, and Safecity, another application that lets users share their experience of harassment in public places, Borker analyzes the safety of travel routes surrounding different colleges in India and their effect on women’s college choices.
The study finds that women are willing to go to a lower ranked college in order to avoid a higher risk of street harassment. Women who choose the best college from their set of options spend an average of $250 more each year to access safer modes of transportation.

Frias-Martinez, Vanessa, Enrique Frias-Martinez, and Nuria Oliver. A Gender-Centric Analysis of Calling Behavior in a Developing Economy Using Call Detail Records. Association for the Advancement of Artificial Intelligence, 2010.

Using encrypted Call Detail Records (CDRs) of 10,000 participants in a developing economy, this study analyzes behavioral, social, and mobility variables to determine the gender of a mobile phone user, and finds that behavioral and social variables in mobile phone use differ between women and men.
It finds that women use their phones more than men in terms of number of calls made, call duration, and call expenses. Women also have bigger social networks, meaning that they contact or are contacted by a larger number of unique phone numbers. It finds no statistically significant difference between men and women in the distance traveled between calls.
Frias-Martinez et al. recommend taking these findings into consideration when designing a cellphone-based service.

Psylla, Ioanna, Piotr Sapiezynski, Enys Mones, Sune Lehmann. The role of gender in social network organization. PLoS ONE 12, December 20, 2017.

Using a large dataset of high resolution data collected through mobile phones, as well as detailed questionnaires, this report studies gender differences in a large cohort. The researchers consider mobility behavior and individual personality traits among a group of more than 800 university students.
Analyzing mobility data, they find both that women visit more unique locations over time, and that they have more homogeneous time distribution over their visited locations than men, indicating the time commitment of women is more widely spread across places.
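One common way to quantify how evenly a person's time is spread across locations is Shannon entropy, sketched below. Whether the authors use this exact measure is an assumption here, and the function name `time_entropy` and the sample values are hypothetical.

```python
import math


def time_entropy(hours_by_location):
    """Shannon entropy (in bits) of the time spent across locations.

    Higher values mean time is spread more evenly over more places;
    0 means all time is spent in a single place.
    """
    total = sum(hours_by_location.values())
    shares = [h / total for h in hours_by_location.values() if h > 0]
    return -sum(p * math.log2(p) for p in shares)


# Hypothetical weekly hours for two people.
evenly_spread = {"home": 10, "campus": 10, "gym": 10, "cafe": 10}
concentrated = {"home": 37, "campus": 1, "gym": 1, "cafe": 1}

even_score = time_entropy(evenly_spread)
concentrated_score = time_entropy(concentrated)
```

Under this measure, a person who splits time equally among four places scores higher than one who spends nearly all of it at home, which is the sense of "more homogeneous" used in the summary above.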

The Landscape of Big Data and Gender. Data2X, February, 2021.

Against the backdrop of COVID-19, this report reaffirms that big data initiatives to study mobility, health, and social norms through gendered lenses have greatly progressed. More private companies and think tanks have launched data collection and sharing efforts to spur innovative projects to address COVID-19 complications.
However, economic opportunity, security, and civic action have been lagging behind. Big data collection among these topics is complicated by the lack of sex-disaggregated datasets, gender disparities in technology access, and the lack of gender-tags among big data.
Large technology firms, such as Facebook, LinkedIn, Uber, and more, create a large amount of gender-organized data. The report found that users and data-holding companies are willing to share this information for public policy reasons so long as it provides value and is protected. To this end, Data2X, alongside its partners, champions the use of data collaboratives to put gender-sorted information to work for social good.

Vaitla, Bapu. Big Data and the Well Being of Women and Girls: Applications on the Social Scientific Frontier. Data2X, Apr. 2017.

In this study, the researchers use geospatial data, credit card and cell phone information, and social media posts to identify problems–such as malnutrition, education, access to healthcare, mental health–facing women and girls in developing countries.
From the credit card and cell phone data in particular, the report finds that analyzing patterns of women’s spending and mobility can provide useful insight into Latin American women’s “economic lifestyles.”
Based on this analysis, Vaitla recommends that various non-traditional big data sources be used to fill gaps in conventional data sources and address the common invisibility of women’s and girls’ data in institutional databases.

Selected Readings on Data, Gender, and Mobility
