Are Your Data Visualizations Racist?


Article by Alice Feng & Jonathan Schwabish: “Through rigorous, data-based analysis, researchers and analysts can add to our understanding of societal shortcomings and point toward evidence-based solutions. But carelessly collecting and communicating data can lead to analyses and visualizations that have an outsized capacity to mislead, misrepresent, and harm communities already experiencing inequity and discrimination.

To unlock the full potential of data, researchers and analysts must consider and apply equity at every step of the research process. Ensuring responsible data collection, representing the communities surveyed accurately, and incorporating community input whenever possible will lead to more equitable data analyses and visualizations. Although there is no one-size-fits-all approach to working with data, for researchers to truly do no harm, they must build their work on a foundation of empathy.

In our recent report, Do No Harm Guide: Applying Equity Awareness in Data Visualization, we focus on how data practitioners can approach their work through a lens of diversity, equity, and inclusion. To create this report, we conducted more than a dozen interviews with nearly 20 people who work with data to hear how they approach inclusivity. In those interviews, we heard time and time again that demonstrating empathy for the people and communities you are focusing on and communicating with should be the guiding light for those working with data. Journalist Kim Bui succinctly captured how researchers and analysts can apply empathy, saying: “If I were one of the data points on this visualization, would I feel offended?”…(More)”.

Leveraging Location and Mobility Data: Perils & Practices


Paper by Suha Mohamed: “…Mobility data refers to information (often passively captured) that provides insights into the location and movement of a population – often through their interactions with digital mobility devices (like our smartphones) or transport services. Sources of mobility data, while diverse, include call detail records from telecom companies, GPS details from phones or vehicles, geotagged social media data or first or third-party software data. 

Geolocation, a subset of mobility data, may be useful in shaping responsive courses of action as it can be leveraged in granular form to understand hyperlocal realities or, when aggregated, regional, national or international patterns. However, privacy concerns arise from the sensitive or personal data that may be inferred from these records and the often opaque conditions around its usage. The ongoing deployment of contact tracing applications, which largely depend on individual-level location data, has demonstrated extensive potential for misuse and surveillance….

Despite the surveillance and privacy concerns around the use of contact tracing apps and mobility data, it is undeniable that this data has immense public value and has helped officials understand the development of the COVID-19 virus and map its variants and waves. It has also been used to track: areas of mobility that contribute towards increased transmission of the virus, adherence to social distancing norms and the effectiveness of measures like lockdowns or restrictions….(More)”.

Improving Consumer Welfare with Data Portability


Report by Daniel Castro: “Data protection laws and regulations can contain restrictive provisions, which limit data sharing and use, as well as permissive provisions, which increase it. Data portability is an example of a permissive provision that allows consumers to obtain a digital copy of their personal information from an online service and provide this information to other services. By carefully crafting data portability provisions, policymakers can enable consumers to obtain more value from their data, create new opportunities for businesses to innovate with data, and foster competition….(More)”.

Sharing Student Data Across Public Sectors: Importance of Community Engagement to Support Responsible and Equitable Use


Report by CDT: “Data and technology play a critical role in today’s education institutions, with 85 percent of K-12 teachers anticipating that online learning and use of education technology at their school will play a larger role in the future than it did before the pandemic. The growth in data-driven decision-making has helped fuel the increasing prevalence of data sharing practices between K-12 education agencies and adjacent public sectors like social services. Yet the sharing of personal data can pose risks as well as benefits, and many communities have historically experienced harm as a result of irresponsible data sharing practices. For example, if the underlying data itself is biased, sharing that information exacerbates those inequities and increases the likelihood that potential harms fall disproportionately on certain communities. As a result, it is critical that agencies participating in data sharing initiatives take steps to ensure the benefits are available to all and no groups of students experience disproportionate harm.

A core component of sharing data responsibly is proactive, robust community engagement with the group of people whose data is being shared, as well as their surrounding community. This population has the greatest stake in the success or failure of a given data sharing initiative; as such, public agencies have a practical incentive, and a moral obligation, to engage them regarding decisions being made about their data…

This paper presents guidance on how practitioners can conduct effective community engagement around the sharing of student data between K-12 education agencies and adjacent public sectors. We explore the importance of community engagement around data sharing initiatives, and highlight four dimensions of effective community engagement:

  • Plan: Establish Goals, Processes, and Roles
  • Enable: Build Collective Capacity
  • Resource: Dedicate Appropriate People, Time, and Money
  • Implement: Carry Out Vision Effectively and Monitor Implementation…(More)”.

Group Backed by Top Companies Moves to Combat A.I. Bias in Hiring


Steve Lohr at The New York Times: “Artificial intelligence software is increasingly used by human resources departments to screen résumés, conduct video interviews and assess a job seeker’s mental agility.

Now, some of the largest corporations in America are joining an effort to prevent that technology from delivering biased results that could perpetuate or even worsen past discrimination.

The Data & Trust Alliance, announced on Wednesday, has signed up major employers across a variety of industries, including CVS Health, Deloitte, General Motors, Humana, IBM, Mastercard, Meta (Facebook’s parent company), Nike and Walmart.

The corporate group is not a lobbying organization or a think tank. Instead, it has developed an evaluation and scoring system for artificial intelligence software.

The Data & Trust Alliance, tapping corporate and outside experts, has devised a 55-question evaluation, which covers 13 topics, and a scoring system. The goal is to detect and combat algorithmic bias.

“This is not just adopting principles, but actually implementing something concrete,” said Kenneth Chenault, co-chairman of the group and a former chief executive of American Express, which has agreed to adopt the anti-bias tool kit…(More)”.

Making change: What works?


Report by Laurie Laybourn-Langton, Harry Quilter-Pinner, and Nicolas Treloar: “Movements change the world. Throughout history, loosely organised networks of individuals and organisations have sought changes to societies – and won. From the abolitionist struggle and campaigns for voting rights to #MeToo and #BlackLivesMatter, the impact of movements can be seen everywhere.

Over the last year, IPPR and the Runnymede Trust have sought to understand what we can learn from movements that have made change – as well as those who have fallen short – for our efforts to create change today.

We did this by exploring what worked and didn’t work for four movements from recent decades. These were: LGBTQ+ rights, race equality, climate action, and health inequality….(More)”.

The State of Open Data 2021


Report by Digital Science (Australia): “Since 2016, we have monitored levels of data sharing and usage. Over the years, we have had 21,000 responses from researchers worldwide providing unparalleled insight into their motivations, challenges, perceptions, and behaviours toward open data.

In our sixth survey, we asked about motivations as well as perceived discoverability and credibility of data that is shared openly. The State of Open Data is a critical piece of information that enables us to identify the barriers to open data from a researcher perspective, laying the foundation for future action. 

Key findings from this year’s survey

  • 73% support the idea of a national mandate for making research data openly available
  • 52% said funders should make the sharing of research data part of their requirements for awarding grants
  • 47% said they would be motivated to share their data if there was a journal or publisher requirement to do so
  • About a third of respondents indicated that they have reused their own or someone else’s openly accessible data more during the pandemic than before
  • There are growing concerns over misuse and lack of credit for open sharing…(More)”.

Crime Prediction Software Promised to Be Free of Biases. New Data Shows It Perpetuates Them


Article by Aaron Sankin, Dhruv Mehrotra for Gizmodo, Surya Mattu, and Annie Gilbertson: “Between 2018 and 2021, more than one in 33 U.S. residents were potentially subject to police patrol decisions directed by crime prediction software called PredPol.

The company that makes it sent more than 5.9 million of these crime predictions to law enforcement agencies across the country—from California to Florida, Texas to New Jersey—and we found those reports on an unsecured server.

The Markup and Gizmodo analyzed them and found persistent patterns.

Residents of neighborhoods where PredPol suggested few patrols tended to be Whiter and more middle- to upper-income. Many of these areas went years without a single crime prediction.

By contrast, neighborhoods the software targeted for increased patrols were more likely to be home to Blacks, Latinos, and families that would qualify for the federal free and reduced lunch program.

These communities weren’t just targeted more—in some cases they were targeted relentlessly. Crimes were predicted every day, sometimes multiple times a day, sometimes in multiple locations in the same neighborhood: thousands upon thousands of crime predictions over years. A few neighborhoods in our data were the subject of more than 11,000 predictions.

The software often recommended daily patrols in and around public and subsidized housing, targeting the poorest of the poor.

“Communities with troubled relationships with police—this is not what they need,” said Jay Stanley, a senior policy analyst at the ACLU Speech, Privacy, and Technology Project. “They need resources to fill basic social needs.”…(More)”.

How Courts Embraced Technology, Met the Pandemic Challenge, and Revolutionized Their Operations


Report by The Pew Charitable Trusts: “To begin to assess whether, and to what extent, the rapid improvements in court technology undertaken in 2020 and 2021 made the civil legal system easier to navigate, The Pew Charitable Trusts examined pandemic-related emergency orders issued by the supreme courts of all 50 states and Washington, D.C. The researchers supplemented that review with an analysis of court approaches to virtual hearings, e-filing, and digital notarization, with a focus on how these tools affected litigants in three of the most common types of civil cases: debt claims, evictions, and child support. The key findings of this research are:

  • Civil courts’ adoption of technology was unprecedented in pace and scale. Despite having almost no history of using remote civil court proceedings, beginning in March 2020 every state and D.C. initiated online hearings at record rates to resolve many types of cases. For example, the Texas court system, which had never held a civil hearing via video before the pandemic, conducted 1.1 million remote proceedings across its civil and criminal divisions between March 2020 and February 2021. Similarly, Michigan courts held more than 35,000 video hearings totaling nearly 200,000 hours between April 1 and June 1, 2020, compared with no such hearings during the same two months in 2019. Courts moved other routine functions online as well. Before the pandemic, 37 states and D.C. allowed people without lawyers to electronically file court documents in at least some civil cases. But since March 2020, 10 more states have created similar processes, making e-filing available to more litigants in more jurisdictions and types of cases. In addition, after 11 states and D.C. made pandemic-driven changes to their policies on electronic notarization (e-notarization), 42 states and D.C. either allowed it or had waived notarization requirements altogether as of fall 2020.
  • Courts leveraged technology not only to stay open, but also to improve participation rates and help users resolve disputes more efficiently. Arizona civil courts, for example, saw an 8% drop year-over-year in June 2020 in the rate of default, or automatic, judgment—which results when defendants fail to appear in court—indicating an increase in participation. Although national and other state data is limited, court officials across the country, including judges, administrators, and attorneys, report increases in civil court appearance rates.
  • The accelerated adoption of technology disproportionately benefited people and businesses with legal representation—and in some instances, made the civil legal system more difficult to navigate for those without...(More)”.

Articulating Value from Data


Report by the World Economic Forum: “The distinct characteristics and dynamics of data – contextual, relational and cumulative – call for new approaches to articulating its value. Businesses should value data based on cases that go beyond the transactional monetization of data and take into account the broader context, future opportunities to collaborate and innovate, and value created for its ecosystem stakeholders. Doing so will encourage companies to think about the future value data can help generate, beyond the existing data lakes they sit on, and open them up to collaboration opportunities….(More)”.