Google is using AI to better detect searches from people in crisis


Article by James Vincent: “In a personal crisis, many people turn to an impersonal source of support: Google. Every day, the company fields searches on topics like suicide, sexual assault, and domestic abuse. But Google wants to do more to direct people to the information they need, and says new AI techniques that better parse the complexities of language are helping.

Specifically, Google is integrating its latest machine learning model, MUM, into its search engine to “more accurately detect a wider range of personal crisis searches.” The company unveiled MUM at its IO conference last year, and has since used it to augment search with features that try to answer questions connected to the original search.

In this case, MUM will be able to spot search queries related to difficult personal situations that earlier search tools could not, says Anne Merritt, a Google product manager for health and information quality.

“MUM is able to help us understand longer or more complex queries like ‘why did he attack me when i said i dont love him,’” Merritt told The Verge. “It may be obvious to humans that this query is about domestic violence, but long, natural-language queries like these are difficult for our systems to understand without advanced AI.”

Other examples of queries that MUM can react to include “most common ways suicide is completed” (a search Merritt says earlier systems “may have previously understood as information seeking”) and “Sydney suicide hot spots” (where, again, earlier responses would have likely returned travel information, ignoring the mention of “suicide” in favor of the more popular query for “hot spots”). When Google detects such crisis searches, it responds with an information box telling users “Help is available,” usually accompanied by a phone number or website for a mental health charity like Samaritans.

In addition to using MUM to respond to personal crises, Google says it’s also using an older AI language model, BERT, to better identify searches looking for explicit content like pornography. By leveraging BERT, Google says it’s “reduced unexpected shocking results by 30%” year-on-year. However, the company was unable to share absolute figures for how many “shocking results” its users come across on average, so while this is a comparative improvement, it gives no indication of how big or small the problem actually is.

Google is keen to tell you that AI is helping the company improve its search products — especially at a time when there’s a building narrative that “Google search is dying.” But integrating this technology comes with its downsides, too.

Many AI experts warn that Google’s increasing use of machine learning language models could surface new problems for the company, like introducing biases and misinformation into search results. AI systems are also opaque, offering engineers restricted insight into how they come to certain conclusions…(More)”.

Machine learning and phone data can improve targeting of humanitarian aid


Paper by Emily Aiken, Suzanne Bellue, Dean Karlan, Chris Udry & Joshua E. Blumenstock: “The COVID-19 pandemic has devastated many low- and middle-income countries, causing widespread food insecurity and a sharp decline in living standards. In response to this crisis, governments and humanitarian organizations worldwide have distributed social assistance to more than 1.5 billion people. Targeting is a central challenge in administering these programmes: it remains a difficult task to rapidly identify those with the greatest need given available data. Here we show that data from mobile phone networks can improve the targeting of humanitarian assistance. Our approach uses traditional survey data to train machine-learning algorithms to recognize patterns of poverty in mobile phone data; the trained algorithms can then prioritize aid to the poorest mobile subscribers. We evaluate this approach by studying a flagship emergency cash transfer program in Togo, which used these algorithms to disburse millions of US dollars worth of COVID-19 relief aid. Our analysis compares outcomes—including exclusion errors, total social welfare and measures of fairness—under different targeting regimes. Relative to the geographic targeting options considered by the Government of Togo, the machine-learning approach reduces errors of exclusion by 4–21%. Relative to methods requiring a comprehensive social registry (a hypothetical exercise; no such registry exists in Togo), the machine-learning approach increases exclusion errors by 9–35%. These results highlight the potential for new data sources to complement traditional methods for targeting humanitarian assistance, particularly in crisis settings in which traditional data are missing or out of date…(More)”.
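
The comparison of targeting regimes above rests on exclusion errors. A minimal sketch of that metric follows, under assumptions that are not taken from the paper: each person has a true consumption value and a score from some targeting rule (lower means poorer), the programme can pay exactly k people, and the exclusion error is the share of the k truly poorest people who are left out. The function name and the toy numbers are hypothetical.

```python
def exclusion_error(true_consumption, predicted_score, k):
    """Share of the k truly poorest people who are NOT among the k people
    ranked poorest by the targeting score (lower score = predicted poorer)."""
    if len(true_consumption) != len(predicted_score):
        raise ValueError("inputs must have the same length")
    if not 0 < k <= len(true_consumption):
        raise ValueError("k must be between 1 and the population size")
    ids = range(len(true_consumption))
    # People who should receive aid under perfect information.
    truly_poorest = set(sorted(ids, key=lambda i: true_consumption[i])[:k])
    # People the targeting rule would actually select.
    selected = set(sorted(ids, key=lambda i: predicted_score[i])[:k])
    return len(truly_poorest - selected) / k


# Toy data (hypothetical): daily consumption in USD for six people.
true_consumption = [0.8, 1.1, 1.5, 2.4, 3.0, 5.2]
# Hypothetical scores from two targeting rules.
phone_model_score = [0.9, 1.6, 1.2, 2.0, 3.5, 4.8]  # per-person prediction
geographic_score = [2.0, 2.0, 1.0, 1.0, 3.0, 3.0]   # one value per region

phone_error = exclusion_error(true_consumption, phone_model_score, k=2)
geo_error = exclusion_error(true_consumption, geographic_score, k=2)
print(f"phone-data rule: {phone_error:.2f}, geographic rule: {geo_error:.2f}")
```

In this toy case the per-person scores miss one of the two poorest people while the region-level scores miss both. That only illustrates how such a comparison is computed; it says nothing about the magnitudes reported in the paper.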

Cities4Cities: new matchmaking platform launched to support Ukrainian local and regional authorities


Council of Europe: “A new matchmaking online platform, Cities4Cities, developed to help Ukrainian cities was launched in Strasbourg today. The platform is a free online exchange tool; it allows local authorities in Ukraine and in the rest of Europe to share their needs and offers related to local infrastructure and get in direct contact to receive practical help.

The platform was launched at the initiative of Bernd Vöhringer (Germany, EPP/CCE), President of the Chamber of Local Authorities of the Congress of Local and Regional Authorities and Mayor of the city of Sindelfingen, with the support of the Congress of Local and Regional Authorities of the Council of Europe.

Bernd Vöhringer explained that the need to co-ordinate support coming from the local level became very clear to him after a visit at the end of March to Chełm, Sindelfingen’s Polish twin city near the Ukrainian border, where he saw first-hand the “urgent need for material, financial and human resources support”. “The platform will be a place to match the demands/needs of Ukrainian cities with the capacity, know-how and supply of other European cities,” he noted. “It will enable faster and more efficient support to our Ukrainian friends and partners”.

Secretary General of the Congress, Andreas Kiefer, said that the Congress “welcomes the efforts of local and regional authorities of the member States of the Council of Europe and their associations in support for their Ukrainian counterparts and citizens”, and the Cities4Cities initiative is an example of such result-oriented solidarity action at the local level. “In the recently adopted Declaration the Congress stressed that democracy, multilevel governance and human rights are stronger than war, and reiterated its firm stand by Ukraine and its people”, Kiefer concluded.

Ambassador Borys Tarasyuk, Permanent Representative of Ukraine to the Council of Europe, stressed that the initiative will serve well the purpose of providing practical assistance to the most vulnerable, amidst the immense human tragedy and challenges, and will complement the political support and solidarity expressed by the Congress of Local and Regional Authorities and the Council of Europe as a whole…(More)”.

Open Data for Social Impact Framework


Framework by Microsoft: “The global pandemic has shown us the important role of data in understanding, assessing, and taking action to solve the challenges created by COVID-19. However, nearly all organizations, large and small, still struggle to make data relevant to their work. Despite the value data provides, many organizations fail to harness its power to improve outcomes.

Part of this struggle stems from the “data divide” – the gap that exists between countries and organizations that have effective access to data to help them innovate and solve problems and those that do not. To close this divide, Microsoft launched the Open Data Campaign in 2020 to help realize the promise of more open data and data collaborations that drive innovation.

One of the key lessons we’ve learned from the Campaign and the work we’ve been doing with our partners, the Open Data Institute and The GovLab, is that the ability to access and use data to improve outcomes involves much more than technological tools and the data itself. It is also important to be able to leverage and share the experiences and practices that promote effective data collaboration and decision-making. This is especially true when it comes to working with governments, multi-lateral organizations, nonprofits, research institutions, and others who seek to open and reuse data to address important social issues, particularly those faced by developing countries.

Put another way, just having access to data and technology does not magically create value and improve outcomes. Making the most of open data and data collaboration requires thinking about how an organization’s leadership can commit to making data useful towards its mission, defining the questions it wants to answer with data, identifying the skills its team needs to use data, and determining how best to develop and establish trust among collaborators and communities served to derive more insight and benefit from data.

The Open Data for Social Impact Framework is a tool leaders can use to put data to work to solve the challenges most important to them. Recognizing that not all data can be made publicly accessible, we see the tremendous benefits that can come from advancing more open data, whether that takes shape as trusted data collaborations or truly open and public data. We use the phrase ‘social impact’ to mean a positive change towards addressing a societal problem, such as reducing carbon emissions, closing the broadband gap, building skills for jobs, and advancing accessibility and inclusion.

We believe in the limitless opportunities that opening, sharing, and collaborating around data can create to draw out new insights, make better decisions, and improve efficiencies when tackling some of the world’s most pressing challenges….(More)”.

Digitisation and Sovereignty in Humanitarian Space: Technologies, Territories and Tensions


Paper by Aaron Martin: “Debates are ongoing on the limits of – and possibilities for – sovereignty in the digital era. While most observers spotlight the implications of the Internet, cryptocurrencies, artificial intelligence/machine learning and advanced data analytics for the sovereignty of nation states, a critical yet under-examined question concerns what digital innovations mean for authority, power and control in the humanitarian sphere in which different rules, values and expectations are thought to apply. This forum brings together practitioners and scholars to explore both conceptually and empirically how digitisation and datafication in aid are (re)shaping notions of sovereign power in humanitarian space. The forum’s contributors challenge established understandings of sovereignty in new forms of digital humanitarian action. Among other focus areas, the forum draws attention to how cyber dependencies threaten international humanitarian organisations’ purported digital sovereignty. It also contests the potential of technologies like blockchain to revolutionise notions of sovereignty in humanitarian assistance and hypothesises about the ineluctable parasitic qualities of humanitarian technology. The forum concludes by proposing that digital technologies deployed in migration contexts might be understood as ‘sovereignty experiments’. We invite readers from scholarly, policy and practitioner communities alike to engage closely with these critical perspectives on digitisation and sovereignty in humanitarian space….(More)”.

‘It’s like the wild west’: Data security in frontline aid


A Q&A on how aid workers handle sensitive data by Irwin Loy: “The cyber-attack on the International Committee of the Red Cross, discovered in January, was the latest high-profile breach to connect the dots between humanitarian data risks and real-world harms. Personal information belonging to more than 515,000 people was exposed in what the ICRC said was a “highly sophisticated” hack using tools employed mainly by states or state-backed groups.

But there are countless other examples of how the reams of data collected from some of the world’s most vulnerable communities can be compromised, misused, and mishandled.

“The biggest frontier in the humanitarian sector is the weaponisation of humanitarian data,” said Olivia Williams, a former aid worker who now specialises in information security at Apache iX, a UK-based defence consultancy.

She recently completed research – including surveys and interviews with more than 180 aid workers from 28 countries – examining how data is handled, and what agencies and frontline staff say they do to protect it.

Sensitive data is often collected on personal devices, sent over hotel WiFi, scrawled on scraps of paper then photographed and sent to headquarters via WhatsApp, or simply emailed and widely shared with partner organisations, aid workers told her.

The organisational security and privacy policies meant to guide how data is stored and protected? Impractical, irrelevant, and often ignored, Williams said.

Some frontline staff are taking information security into their own hands, devising their own systems of coding, filing, and securing data. One respondent kept paper files locked in their bedroom.

Aid workers from dozens of major UN agencies, NGOs, Red Cross organisations, and civil society groups took part in the survey.

Williams’ findings echo her own misgivings about data security in her previous deployments to crisis zones from northern Iraq to Nepal and the Philippines. Aid workers are increasingly alarmed about how data is handled, she said, while their employers are largely “oblivious” to what actually happens on the ground.

Williams spoke to The New Humanitarian about the unspoken power imbalance in data collection, why there’s so much data, and what aid workers can do to better protect it….(More)”.

New and updated building footprints


Bing Blogs: “…The Microsoft Maps Team has been leveraging that investment to identify map features at scale and produce high-quality building footprint data sets with the overall goal to add to the OpenStreetMap and MissingMaps humanitarian efforts.

As of this post, the following locations are available and Microsoft offers access to this data under the Open Data Commons Open Database License (ODbL).

Country/Region              Million buildings
United States of America    129.6
Nigeria and Kenya           50.5
South America               44.5
Uganda and Tanzania         17.9
Canada                      11.8
Australia                   11.3

As you might expect, the vintage of the footprints depends on the collection date of the underlying imagery. Bing Maps Imagery is a composite of multiple sources with different capture dates (ranging from 2012 to 2021). To ensure we are setting the right expectation for that building, each footprint has a capture date tag associated if we could deduce the vintage of imagery used…(More)”.

Data Re-Use and Collaboration for Development


Stefaan G. Verhulst at Data & Policy: “It is often pointed out that we live in an era of unprecedented data, and that data holds great promise for development. Yet equally often overlooked is the fact that, as in so many domains, there exist tremendous inequalities and asymmetries in where this data is generated, and how it is accessed. The gap that separates high-income from low-income countries is among the most important (or at least most persistent) of these asymmetries…

Data collaboratives are an emerging form of public-private partnership that, when designed responsibly, can offer a potentially innovative solution to this problem. Data collaboratives offer at least three key benefits for developing countries:

1. Cost Efficiencies: Data and data analytic capacity are often hugely expensive and beyond the limited capacities of many low-income countries. Data reuse, facilitated by data collaboratives, can bring down the cost of data initiatives for development projects.

2. Fresh insights for better policy: Combining data from various sources by breaking down silos has the potential to lead to new and innovative insights that can help policy makers make better decisions. Digital data can also be triangulated with existing, more traditional sources of information (e.g., census data) to generate new insights and help verify the accuracy of information.

3. Overcoming inequalities and asymmetries: Social and economic inequalities, both within and among countries, are often mapped onto data inequalities. Data collaboratives can help ease some of these inequalities and asymmetries, for example by allowing costs and analytical tools and techniques to be pooled. Cloud computing, which allows information and technical tools to be easily shared and accessed, is an important example. It can play a vital role in enabling the transfer of skills and technologies between low-income and high-income countries…(More)”. See also: Reusing data responsibly to achieve development goals (OECD Report).

Making data for good better


Article by Caroline Buckee, Satchit Balsari, and Andrew Schroeder: “…Despite the long standing excitement about the potential for digital tools, Big Data and AI to transform our lives, these innovations–with some exceptions–have so far had little impact on the greatest public health emergency of our time.

Attempts to use digital data streams to rapidly produce public health insights that were not only relevant for local contexts in cities and countries around the world, but also available to decision makers who needed them, exposed enormous gaps across the translational pipeline. The insights from novel data streams which could help drive precise, impactful health programs, and bring effective aid to communities, found limited use among public health and emergency response systems. We share here our experience from the COVID-19 Mobility Data Network (CMDN), now Crisis Ready (crisisready.io), a global collaboration of researchers, mostly infectious disease epidemiologists and data scientists, who served as trusted intermediaries between technology companies willing to share vast amounts of digital data, and policy makers, struggling to incorporate insights from these novel data streams into their decision making. Through our experience with the Network, and using human mobility data as an illustrative example, we recognize three sets of barriers to the successful application of large digital datasets for public good.

First, in the absence of pre-established working relationships with technology companies and data brokers, the data remain primarily confined within private circuits of ownership and control. During the pandemic, data sharing agreements between large technology companies and researchers were hastily cobbled together, often without the right kind of domain expertise in the mix. Second, the lack of standardization, interoperability and information on the uncertainty and biases associated with these data, necessitated complex analytical processing by highly specialized domain experts. And finally, local public health departments, understandably unfamiliar with these novel data streams, had neither the bandwidth nor the expertise to sift noise from signal. Ultimately, most efforts did not yield consistently useful information for decision making, particularly in low resource settings, where capacity limitations in the public sector are most acute…(More)”.

Nonprofit Websites Are Riddled With Ad Trackers


Article by Alfred Ng and Maddy Varner: “Last year, nearly 200 million people visited the website of Planned Parenthood, a nonprofit that many people turn to for very private matters like sex education, access to contraceptives, and access to abortions. What those visitors may not have known is that as soon as they opened plannedparenthood.org, some two dozen ad trackers embedded in the site alerted a slew of companies whose business is not reproductive freedom but gathering, selling, and using browsing data.

The Markup ran Planned Parenthood’s website through our Blacklight tool and found 28 ad trackers and 40 third-party cookies tracking visitors, in addition to so-called “session recorders” that could be capturing the mouse movements and keystrokes of people visiting the homepage in search of things like information on contraceptives and abortions. The site also contained trackers that tell Facebook and Google if users visited the site.

The Markup’s scan found Planned Parenthood’s site communicating with companies like Oracle, Verizon, LiveRamp, TowerData, and Quantcast—some of which have made a business of assembling and selling access to masses of digital data about people’s habits.

Katie Skibinski, vice president for digital products at Planned Parenthood, said the data collected on its website is “used only for internal purposes by Planned Parenthood and our affiliates,” and the company doesn’t “sell” data to third parties.

“While we aim to use data to learn how we can be most impactful, at Planned Parenthood, data-driven learning is always thoughtfully executed with respect for patient and user privacy,” Skibinski said. “This means using analytics platforms to collect aggregate data to gather insights and identify trends that help us improve our digital programs.”

Skibinski did not dispute that the organization shares data with third parties, including data brokers.

A Blacklight scan of Planned Parenthood Gulf Coast (a localized website specifically for people in the Gulf region, including Texas, where abortion has been essentially outlawed) churned up similar results.

Planned Parenthood is not alone when it comes to nonprofits, some operating in sensitive areas like mental health and addiction, gathering and sharing data on website visitors.

Using our Blacklight tool, The Markup scanned more than 23,000 websites of nonprofit organizations, including those belonging to abortion providers and nonprofit addiction treatment centers. The Markup used the IRS’s nonprofit master file to identify nonprofits that have filed a tax return since 2019 and that the agency categorizes as focusing on areas like mental health and crisis intervention, civil rights, and medical research. We then examined each nonprofit’s website as publicly listed in GuideStar. We found that about 86 percent of them had third-party cookies or tracking network requests. By comparison, when The Markup did a survey of the top 80,000 websites in 2020, we found 87 percent used some type of third-party tracking.

About 11 percent of the 23,856 nonprofit websites we scanned had a Facebook pixel embedded, while 18 percent used the Google Analytics “Remarketing Audiences” feature.

The Markup found that 439 of the nonprofit websites loaded scripts called session recorders, which can monitor visitors’ clicks and keystrokes. Eighty-nine of those were for websites that belonged to nonprofits that the IRS categorizes as primarily focusing on mental health and crisis intervention issues…(More)”.
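
The scan figures quoted above mix percentages and raw counts, so a short worked calculation may help put them on the same footing. This is only a sketch. The inputs are the numbers given in the excerpt, the variable and function names are illustrative, and the derived site counts are approximate because the article gives rounded percentages ("about 11 percent").

```python
total_sites = 23_856           # nonprofit websites scanned
session_recorder_sites = 439   # sites that loaded session recorders
mental_health_recorders = 89   # of those, mental health / crisis intervention


def pct(part, whole):
    """Percentage of `whole` represented by `part`, to one decimal place."""
    return round(100 * part / whole, 1)


# Counts expressed as shares.
recorder_share = pct(session_recorder_sites, total_sites)
mental_health_share = pct(mental_health_recorders, session_recorder_sites)

# Rounded percentages expressed as approximate site counts.
facebook_pixel_sites = round(total_sites * 11 / 100)
google_remarketing_sites = round(total_sites * 18 / 100)

print(f"Session recorders: {recorder_share}% of scanned sites")
print(f"Mental health share of those: {mental_health_share}%")
print(f"Facebook pixel: roughly {facebook_pixel_sites} sites")
print(f"Google remarketing: roughly {google_remarketing_sites} sites")
```

On these figures, session recorders appear on under 2 percent of the scanned sites, but about one in five of those sites belongs to a mental health or crisis intervention nonprofit.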