Making data for good better


Article by Caroline Buckee, Satchit Balsari, and Andrew Schroeder: “…Despite the long standing excitement about the potential for digital tools, Big Data and AI to transform our lives, these innovations–with some exceptions–have so far had little impact on the greatest public health emergency of our time.

Attempts to use digital data streams to rapidly produce public health insights that were not only relevant for local contexts in cities and countries around the world, but also available to decision makers who needed them, exposed enormous gaps across the translational pipeline. The insights from novel data streams which could help drive precise, impactful health programs, and bring effective aid to communities, found limited use among public health and emergency response systems. We share here our experience from the COVID-19 Mobility Data Network (CMDN), now Crisis Ready (crisisready.io), a global collaboration of researchers, mostly infectious disease epidemiologists and data scientists, who served as trusted intermediaries between technology companies willing to share vast amounts of digital data, and policy makers, struggling to incorporate insights from these novel data streams into their decision making. Through our experience with the Network, and using human mobility data as an illustrative example, we recognize three sets of barriers to the successful application of large digital datasets for public good.

First, in the absence of pre-established working relationships with technology companies and data brokers, the data remain primarily confined within private circuits of ownership and control. During the pandemic, data sharing agreements between large technology companies and researchers were hastily cobbled together, often without the right kind of domain expertise in the mix. Second, the lack of standardization, interoperability and information on the uncertainty and biases associated with these data, necessitated complex analytical processing by highly specialized domain experts. And finally, local public health departments, understandably unfamiliar with these novel data streams, had neither the bandwidth nor the expertise to sift noise from signal. Ultimately, most efforts did not yield consistently useful information for decision making, particularly in low resource settings, where capacity limitations in the public sector are most acute…(More)”.

Nonprofit Websites Are Riddled With Ad Trackers


Article by Alfred Ng and Maddy Varner: “Last year, nearly 200 million people visited the website of Planned Parenthood, a nonprofit that many people turn to for very private matters like sex education, access to contraceptives, and access to abortions. What those visitors may not have known is that as soon as they opened plannedparenthood.org, some two dozen ad trackers embedded in the site alerted a slew of companies whose business is not reproductive freedom but gathering, selling, and using browsing data.

The Markup ran Planned Parenthood’s website through our Blacklight tool and found 28 ad trackers and 40 third-party cookies tracking visitors, in addition to so-called “session recorders” that could be capturing the mouse movements and keystrokes of people visiting the homepage in search of things like information on contraceptives and abortions. The site also contained trackers that tell Facebook and Google if users visited the site.

The Markup’s scan found Planned Parenthood’s site communicating with companies like Oracle, Verizon, LiveRamp, TowerData, and Quantcast—some of which have made a business of assembling and selling access to masses of digital data about people’s habits.

Katie Skibinski, vice president for digital products at Planned Parenthood, said the data collected on its website is “used only for internal purposes by Planned Parenthood and our affiliates,” and the company doesn’t “sell” data to third parties.

“While we aim to use data to learn how we can be most impactful, at Planned Parenthood, data-driven learning is always thoughtfully executed with respect for patient and user privacy,” Skibinski said. “This means using analytics platforms to collect aggregate data to gather insights and identify trends that help us improve our digital programs.”

Skibinski did not dispute that the organization shares data with third parties, including data brokers.

A Blacklight scan of Planned Parenthood Gulf Coast (a localized website specifically for people in the Gulf region, including Texas, where abortion has been essentially outlawed) churned up similar results.

Planned Parenthood is not alone when it comes to nonprofits, some operating in sensitive areas like mental health and addiction, gathering and sharing data on website visitors.

Using our Blacklight tool, The Markup scanned more than 23,000 websites of nonprofit organizations, including those belonging to abortion providers and nonprofit addiction treatment centers. The Markup used the IRS’s nonprofit master file to identify nonprofits that have filed a tax return since 2019 and that the agency categorizes as focusing on areas like mental health and crisis intervention, civil rights, and medical research. We then examined each nonprofit’s website as publicly listed in GuideStar. We found that about 86 percent of them had third-party cookies or tracking network requests. By comparison, when The Markup did a survey of the top 80,000 websites in 2020, we found 87 percent used some type of third-party tracking.

About 11 percent of the 23,856 nonprofit websites we scanned had a Facebook pixel embedded, while 18 percent used the Google Analytics “Remarketing Audiences” feature.

The Markup found that 439 of the nonprofit websites loaded scripts called session recorders, which can monitor visitors’ clicks and keystrokes. Eighty-nine of those were for websites that belonged to nonprofits that the IRS categorizes as primarily focusing on mental health and crisis intervention issues…(More)”.

Data Science for Social Good: Philanthropy and Social Impact in a Complex World


Book edited by Ciro Cattuto and Massimo Lapucci: “This book is a collection of insights by thought leaders at first-mover organizations in the emerging field of “Data Science for Social Good”. It examines the application of knowledge from computer science, complex systems, and computational social science to challenges such as humanitarian response, public health, and sustainable development. The book provides an overview of scientific approaches to social impact – identifying a social need, targeting an intervention, measuring impact – and the complementary perspective of funders and philanthropies pushing forward this new sector.

TABLE OF CONTENTS


Introduction; By Massimo Lapucci

The Value of Data and Data Collaboratives for Good: A Roadmap for Philanthropies to Facilitate Systems Change Through Data; By Stefaan G. Verhulst

UN Global Pulse: A UN Innovation Initiative with a Multiplier Effect; By Dr. Paula Hidalgo-Sanchis

Building the Field of Data for Good; By Claudia Juech

When Philanthropy Meets Data Science: A Framework for Governance to Achieve Data-Driven Decision-Making for Public Good; By Nuria Oliver

Data for Good: Unlocking Privately-Held Data to the Benefit of the Many; By Alberto Alemanno

Building a Funding Data Ecosystem: Grantmaking in the UK; By Rachel Rank

A Reflection on the Role of Data for Health: COVID-19 and Beyond; By Stefan E. Germann and Ursula Jasper….(More)”

Impact Evidence and Beyond: Using Evidence to Drive Adoption of Humanitarian Innovations


Learning paper by DevLearn: “…provides guidance to humanitarian innovators on how to use evidence to enable and drive adoption of innovation.

Innovation literature and practice show time and time again that it is difficult to scale innovations. Even when an innovation is demonstrably impactful, better than the existing solution and good value for money, it does not automatically get adopted or used in mainstream humanitarian programming.

Why do evidence-based innovations face difficulties in scaling and how can innovators best position their innovation to scale?

This learning paper is for innovators who want to effectively use evidence to support and enable their journey to scale. It explores the underlying social, organisational and behavioural factors that stifle uptake of innovations.

It also provides guidance on how to use, prioritise and communicate evidence to overcome these barriers. The paper aims to help innovators generate and present their evidence in more tailored and nuanced ways to improve adoption and scaling of their innovations….(More)”.

Where Is Everyone? The Importance of Population Density Data


Data Artefact Study by Aditi Ramesh, Stefaan Verhulst, Andrew Young and Andrew Zahuranec: “In this paper, we explore new and traditional approaches to measuring population density, and ways in which density information has frequently been used by humanitarian, private-sector and government actors to advance a range of private and public goals. We explain how new innovations are leading to fresh ways of collecting data—and fresh forms of data—and how this may open up new avenues for using density information in a variety of contexts. Section III examines one particular example: Facebook’s High-Resolution Population Density Maps (also referred to as HRSL, or high resolution settlement layer). This recent initiative, created in collaboration with a number of external organizations, shows not only the potential of mapping innovations but also the potential benefits of inter-sectoral partnerships and sharing. We examine three particular use cases of HRSL, and we follow with an assessment and some lessons learned. These lessons are applicable to HRSL in particular, but also more broadly. We conclude with some thoughts on avenues for future research….(More)”.

Introducing collective crisis intelligence


Blogpost by Annemarie Poorterman et al: “…It has been estimated that over 600,000 Syrians have been killed since the start of the civil war, including tens of thousands of civilians killed in airstrike attacks. Predicting where and when strikes will occur and issuing time-critical warnings enabling civilians to seek safety is an ongoing challenge. It was this problem that motivated the development of Sentry Syria, an early warning system that alerts citizens to a possible airstrike. Sentry uses acoustic sensor data, reports from on-the-ground volunteers, and open media ‘scraping’ to detect warplanes in flight. It uses historical data and AI to validate the information from these different data sources and then issues warnings to civilians 5-10 minutes in advance of a strike via social media, TV, radio and sirens. These extra minutes can be the difference between life and death.

Sentry Syria is just one example of an emerging approach in the humanitarian response we call collective crisis intelligence (CCI). CCI methods combine the collective intelligence (CI) of local community actors (e.g. volunteer plane spotters in the case of Sentry) with a wide range of additional data sources, artificial intelligence (AI) and predictive analytics to support crisis management and reduce the devastating impacts of humanitarian emergencies….(More)”

The Innovation Project: Can advanced data science methods be a game-changer for data sharing?


Report by JIPS (Joint Internal Displacement Profiling Service): “Much has changed in the humanitarian data landscape in the last decade and not primarily with the arrival of big data and artificial intelligence. Mostly, the changes are due to increased capacity and resources to collect more data quicker, leading to the professionalisation of information management as a domain of work. Larger amounts of data are becoming available in a more predictable way. We believe that as the field has progressed in filling critical data gaps, the problem is not the availability of data, but the curation and sharing of that data between actors as well as the use of that data to its full potential.

In 2018, JIPS embarked on an innovation journey to explore the potential of state-of-the-art technologies to incentivise data sharing and collaboration. This report covers the first phase of the innovation project and launches a series of articles in which we will share more about the innovation journey itself, discuss safe data sharing and collaboration, and look at the prototype we developed – made possible by the UNHCR Innovation Fund.

We argue that by making data and insights safe and secure to share between stakeholders, it will allow for a more efficient use of available data, reduce the resources needed to collect new data, strengthen collaboration and foster a culture of trust in the evidence-informed protection of people in displacement and crises.

The paper first defines the problem and outlines the processes through which data is currently shared among the humanitarian community. It explores questions such as: what are the existing data sharing methods and technologies? Which ones constitute a feasible option for humanitarian and development organisations? How can different actors share and collaborate on datasets without impairing confidentiality and exposing them to disclosure threats?…(More)”.

Building a Responsible Open Data Ecosystem: Mobility Data & COVID-19


Blog by Anna Livaccari: “Over the last year and a half, COVID-19 has changed the way people move, work, shop, and live. The pandemic has necessitated new data-sharing initiatives to understand new patterns of movement, analyze the spread of COVID-19, and inform research and decision-making. Earlier this year, Cuebiq collaborated with the Open Data Institute (ODI) and NYU’s The GovLab to explore the efficacy of these new initiatives. 

The ODI is a non-profit organization that brings together commercial and non-commercial organizations and governments to address global issues as well as advise on how data can be used for positive social good. As part of a larger project titled “COVID-19: Building an open and trustworthy data ecosystem,” the ODI published a new report with Cuebiq and The GovLab, an action research center at NYU’s Tandon School of Engineering that has pioneered the concept of data collaboratives and runs the data stewards network among other initiatives to advance data-driven decision making in the public interest. This report, “The Use of Mobility Data for Responding to the COVID-19 Pandemic,” specifically addresses key enablers and obstacles to the successful sharing of mobility data between public and private organizations during the pandemic….

Since early 2020, researchers and policy makers have been eager to understand the impact of COVID-19. With the help of mobility data, organizations from different sectors were able to answer some of the most pressing questions regarding the pandemic: questions about policy decisions, mass-communication strategies, and overall socioeconomic impact. Mobility data can be applied to specific use cases and can help answer complex questions, a fact that The GovLab discusses in its short-form mobility data brief. Understanding exactly how organizations employ mobility data can also improve how institutions operate post-pandemic and make data collaboration as a whole more responsible, sustainable, and systemic.

Cuebiq and the GovLab identified 51 projects where mobility data was used for pandemic response, and then selected five case studies to analyze further. The report defines mobility data, the ethics surrounding it, and the lessons learned for the future….(More)”.

Co-Develop: Digital Public Infrastructure for an Equitable Recovery


A report by The Rockefeller Foundation: “Digital systems that accomplish basic, society-wide functions played a critical role in the response to the Covid-19 pandemic, enabling both public health and social protection measures. The pandemic has shown the value of these systems, but it has also revealed how they are non-existent or weak in far too many places.

As we build back better, we have an unprecedented opportunity to build digital public infrastructure that promotes inclusion, human rights, and progress toward global goals. This report outlines an agenda for international cooperation on digital public infrastructure to guide future investments and expansion of this critical tool.

6 Key Areas for International Cooperation on Digital Public Infrastructure

  1. A vision for digital public infrastructure as a whole, backed by practice, research, and evaluation.
  2. A global commons based on digital public goods.
  3. Safeguards for inclusion, trust, competition, security, and privacy.
  4. Tools that use data in digital public infrastructure for public value and private empowerment.
  5. Private and public capacity, particularly in implementing countries.
  6. Silo-busting, built-for-purpose coordinating, funding, and financing….(More)”.

Chinese web users are writing a new playbook for disaster response


Shen Lu at Protocol: “Severe floods caused by torrential rains in Central China’s Henan province have killed dozens and displaced tens of thousands of residents since last weekend. In parallel with local and central governments’ disaster relief and rescue efforts, Chinese web users have organized online, using technology in novel ways to mitigate risks and rescue those who were trapped in subway cars and neighborhoods submerged in floodwaters.

Chinese web users are no strangers to digital crowdsourcing efforts. During the COVID-19 outbreak, volunteers archived censored media reports and personal stories of suffering from disease or injustice that were scattered on social media, saving them on sharable files on GitHub and broadcasting them via Telegram. Despite pervasive censorship, in times of crisis, Chinese web users have managed to keep information and communications channels open among themselves, and with the rest of the world.

Now, people in one of the most oppressive information environments in the world might be helping write the future playbook for disaster response…

In hard-hit Zhengzhou, the capital city of Henan province, tens of thousands of residents crowdsourced relief assistance over the past 48 hours through a simple shared spreadsheet powered by the Tencent equivalent of Google Sheets (Google products are banned in China). It was created by a college student to allow those awaiting rescue to log their contact and location information.

In the 36 hours that followed, droves of volunteers have logged on, vastly expanding the breadth of information that lives on the document. It now includes contact information for official and unofficial rescue teams, relief resources, shelter locations, phone-charging stations and online medical consultations. At certain points, over 200 people have edited the sheet simultaneously.

Tencent reported that by Wednesday evening Beijing time, volunteers had entered nearly 1,000 data points. The document has received over 2.5 million visits, becoming the most visited Tencent Doc ever and one of the most efficient and powerful rescue and aid platforms started and contributed by civilians.

Similar crowdsourced documents for flooding victims live elsewhere on the internet. On Shimo Docs, a cloud-based productivity suite developed by the Beijing-based startup Shimo, volunteers have aggregated relief and rescue resources’ contacts by cities and counties. These shared documents have made the rounds on social media platforms like Weibo and WeChat in the past few days….(More)”.