Improved targeting for mobile phone surveys: A public-private data collaboration


Blogpost by Kristen Himelein and Lorna McPherson: “Mobile phone surveys have been rapidly deployed by the World Bank to measure the impact of COVID-19 in nearly 100 countries across the world. Previous posts on this blog have discussed the sampling and implementation challenges associated with these efforts, and coverage errors are an inherent problem to the approach. The survey methodology literature has shown mobile phone survey respondents in the poorest countries are more likely to be male, urban, wealthier, and more highly educated. This bias can stem from phone ownership, as mobile phone surveys are at best representative of mobile phone owners, a group which, particularly in poor countries, may differ from the overall population; or from differential response rates among these owners, with some groups more or less likely to respond to a call from an unknown number. In this post, we share our experiences in trying to improve representativeness and boost sample sizes for the poor in Papua New Guinea (PNG)….(More)”.

An Open Data Team Experiments with a New Way to Tell City Stories


Article by Sean Finnan: “Can you see me?” says Mark Linnane, over Zoom, as he walks around a plastic structure on the floor of an office at Maynooth University. “That gives you some sense of the size of it. It’s 3.5 metres by 2.”

Linnane trails his laptop’s webcam over the surface of the off-white 3D model, giving a bird’s-eye view of tens of thousands of tiny buildings, the trails of roads and the clear pathway of the Liffey.

This replica of the heart of the city from Phoenix Park to Dublin Port was created to scale by the university’s Building City Dashboards team, using data from the Ordnance Survey Ireland.

In the five years since they started to grapple with the question of how to present data about the city in an engaging and accessible way, the team has experimented with virtual reality, and augmented reality – and most recently, with this new form of mapping, which blends the lego-like miniature of Dublin’s centre with changeable data projected on.

This could really come into its own as a public exhibit if they start to tell meaningful data-driven and empirical stories, says Linnane, a digital exhibition developer at Maynooth University.

Stories that are “relevant in terms of the everyday daily lives of people who will be coming to see it”, he says.

Layers of Meaning

Getting the projector that throws the visualisations onto the model to work right was Linnane’s job, he says.

He had to mesh the Ordnance Survey data with others that showed building heights for example. “Every single building down to the sheds in someone’s garden have a unique identifier,” says Linnane.

Projectors are built to project onto flat surfaces and not 3D models so that had to be finessed, too, he says. “Every step on the way was a new development. There wasn’t really a process there before.”

The printed 3D model shows 7km by 4km of Dublin and 122,355 structures, says Linnane. That includes bigger buildings but also small outbuildings, railway platforms, public toilets and glasshouses – all mocked up and serving as a canvas for a kaleidoscope of data.

“We’re just projecting data on to it and seeing what’s going on with that,” says Rob Kitchin, principal investigator at Maynooth University’s Programmable City project….(More)”

Image of model courtesy of Mark Linnane.

2020 was the year activists mastered hashtag flooding


Nicole Gallucci at Mashable: “A lone hashtag might not look very mighty, but when used en masse, the symbols can become incredibly powerful activism tools.

Over the past two decades — largely since product designer Chris Messina pitched hashtags to Twitter in 2007 — activists have learned to harness the symbols to form online communities, raise awareness on pressing issues, organize protests, shape digital narratives, and redirect social media discourse. 

On any given day, a series of hashtags are spotlighted in the “Trending” section of Twitter. The hashtags featured are those that have gained traction online and reflect topics being heavily discussed in the moment. More often than not, a trending hashtag’s popularity is organic, but a hashtag’s origin and initial purpose can become clouded when people partake in a clever tactic called hashtag flooding.

Hashtag flooding, or the act of hijacking a hashtag on social media platforms to change its meaning, has been around for years. But in 2020, particularly in the months leading up to the presidential election, activists and social media users looking to make their voices heard used the technique to drown out hateful narratives.

From K-pop fans flooding Donald Trump-related hashtags to members of the gay community putting their own spin on the #ProudBoys hashtag, the method of online communication dominated timelines this year and should be in every activist’s playbook….(More)”.

New York Temporarily Bans Facial Recognition Technology in Schools


Hunton’s Privacy Blog: “On December 22, 2020, New York Governor Andrew Cuomo signed into law legislation that temporarily bans the use or purchase of facial recognition and other biometric identifying technology in public and private schools until at least July 1, 2022. The legislation also directs the New York Commissioner of Education (the “Commissioner”) to conduct a study on whether this technology is appropriate for use in schools.

In his press statement, Governor Cuomo indicated that the legislation comes after concerns were raised about potential risks to students, including issues surrounding misidentification by the technology as well as safety, security and privacy concerns. “This legislation requires state education policymakers to take a step back, consult with experts and address privacy issues before determining whether any kind of biometric identifying technology can be brought into New York’s schools. The safety and security of our children is vital to every parent, and whether to use this technology is not a decision to be made lightly,” the Governor explained.

Key elements of the legislation include:

  • Defining “facial recognition” as “any tool using an automated or semi-automated process that assists in uniquely identifying or verifying a person by comparing and analyzing patterns based on the person’s face,” and “biometric identifying technology” as “any tool using an automated or semi-automated process that assists in verifying a person’s identity based on a person’s biometric information”;
  • Prohibiting the purchase and use of facial recognition and other biometric identifying technology in all public and private elementary and secondary schools until July 1, 2022, or until the Commissioner authorizes the purchase and use of such technology, whichever occurs later; and
  • Directing the Commissioner, in consultation with New York’s Office of Information Technology, Division of Criminal Justice Services, Education Department’s Chief Privacy Officer and other stakeholders, to conduct a study and make recommendations as to the circumstances in which facial recognition and other biometric identifying technology is appropriate for use in schools and what restrictions and guidelines should be enacted to protect privacy, civil rights and civil liberties interests….(More)”.
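
Read literally, the sunset rule in the second bullet means the ban lifts on whichever is later: July 1, 2022, or the date the Commissioner authorizes the technology. The following is a minimal sketch of that reading. It illustrates the summary above, not the text of the statute, and the function and parameter names are hypothetical.

```python
from datetime import date
from typing import Optional

EARLIEST_END = date(2022, 7, 1)  # date named in the summary above


def ban_in_effect(on: date, authorized_on: Optional[date]) -> bool:
    """Return True if the ban applies on `on` under the "whichever occurs later" reading.

    `authorized_on` is the hypothetical date of the Commissioner's authorization,
    or None if no authorization has been given.
    """
    if authorized_on is None:
        # Assumption: with no authorization, the later of the two events never arrives.
        return True
    # The ban ends on the later of the fixed date and the authorization date.
    return on < max(EARLIEST_END, authorized_on)
```

Under this reading, an authorization issued before July 1, 2022 would not lift the ban early, which matches the “until at least July 1, 2022” wording in the opening paragraph.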

Data for Good: New Tools to Help Small Businesses and Communities During the COVID-19 Pandemic


Blogpost by Laura McGorman and Alex Pompe at Facebook: “Small businesses and people around the world are suffering devastating financial losses due to the ongoing COVID-19 pandemic, and public institutions need real time information to help. Today Facebook is launching new datasets and insights to help support economic recovery through our Data for Good program. 

Researchers estimate that over the next five years, the global economy could suffer over $80 trillion in losses due to COVID-19. Small businesses in particular are being hit hard — our Global State of Small Business Report found that over one in four had closed their doors in 2020. Governments around the world are looking to effectively distribute financial aid as well as accurately forecast when and how economies will recover. These four datasets — Business Activity Trends, Commuting Zones, Economic Insights from the Symptom Survey and the latest Future of Business Survey results — will help researchers, nonprofits and local officials identify which areas and businesses may need the most support.

Business Activity Trends

Many factors influence the pandemic’s impact on local economies around the world. However, real time information on business activity is scarce, leaving institutions seeking to provide economic aid with limited information on how to distribute it. To address these information gaps, we partnered with the University of Bristol to aggregate information from Facebook Business Pages to estimate the change in activity among local businesses around the world and how they respond and recover from crises over time.

[Graph: average business activity in the UK]
The above graph shows the drop in Business Page posting on Facebook across cities in the UK the day after the Prime Minister announced lockdown measures. Business Activity Trends can be used to determine how businesses and customers are reacting to local COVID-19 containment policies.

“Determining whether small and medium businesses are open is very important to assess the recovery after events like mandatory stay-at-home orders,” said Dr. Flavia De Luca, Senior Lecturer in Structural and Earthquake Engineering at the University of Bristol. “The traditional ways of collecting this information, such as surveys and interviews, are usually costly, time consuming, and do not scale. By using real time information from Facebook, we hope to make it easier for public institutions to better respond to these events.”…(More)”.

The Modern World Has Finally Become Too Complex for Any of Us to Understand


Blog by Tim Maughan: “One of the dominant themes of the last few years is that nothing makes sense. Donald Trump is president, QAnon has mainstreamed fringe conspiracy theories, and hundreds of thousands are dead from a pandemic and climate change while many Americans do not believe that the pandemic or climate change are deadly. It’s incomprehensible.

I am here to tell you that the reason so much of the world seems incomprehensible is that it is incomprehensible. From social media to the global economy to supply chains, our lives rest precariously on systems that have become so complex, and we have yielded so much of it to technologies and autonomous actors that no one totally comprehends it all….

And those platforms of technology and software that glue all these huge networks together have become a complex system themselves. The internet might be the system that we interact with in the most direct and intimate ways, but most of us have little comprehension of what lies behind our finger-smudged touchscreens, truly understood by few. Made up of data centers, internet exchanges, huge corporations, tiny startups, investors, social media platforms, datasets, adtech companies, and billions of users and their connected devices, it’s a vast network dedicated to mining, creating, and moving data on scales we can’t comprehend. YouTube users upload more than 500 hours of video every minute — which works out as 82.2 years of video uploaded to YouTube every day. As of June 30, 2020, there are over 2.7 billion monthly active Facebook users, with 1.79 billion people on average logging on daily. Each day, 500 million tweets are sent — or 6,000 tweets every second, with a day’s worth of tweets filling a 10-million-page book. Every day, 65 billion messages are sent on WhatsApp. By 2025, it’s estimated that 463 million terabytes of data will be created each day — the equivalent of 212,765,957 DVDs.

So, what we’ve ended up with is a civilization built on the constant flow of physical goods, capital, and data, and the networks we’ve built to manage those flows in the most efficient ways have become so vast and complex that they’re now beyond the scale of any single (and, arguably, any group or team of) human understanding them. It’s tempting to think of these networks as huge organisms, with tentacles spanning the globe that touch everything and interlink with one another, but I’m not sure the metaphor is apt. An organism suggests some form of centralized intelligence, a nervous system with a brain at its center, processing data through feedback loops and making decisions. But the reality with these networks is much closer to the concept of distributed intelligence or distributed knowledge, where many different agents with limited information beyond their immediate environment interact in ways that lead to decision-making, often without them even knowing that’s what they’re doing….(More)”.
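
The per-day figures quoted in the excerpt above can be checked with quick arithmetic. The following is a minimal sketch that assumes a 365-day year and takes the quoted inputs (500 hours of video per minute, 500 million tweets per day) at face value. The function names are hypothetical and do not come from the original post.

```python
MINUTES_PER_DAY = 24 * 60
SECONDS_PER_DAY = 24 * 60 * 60
HOURS_PER_YEAR = 24 * 365  # assumption: a non-leap, 365-day year


def years_of_video_per_day(hours_uploaded_per_minute: float) -> float:
    """Convert an upload rate in hours per minute into years of video per day."""
    hours_per_day = hours_uploaded_per_minute * MINUTES_PER_DAY
    return hours_per_day / HOURS_PER_YEAR


def per_second(count_per_day: float) -> float:
    """Convert a daily count into an average per-second rate."""
    return count_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    print(round(years_of_video_per_day(500), 1))  # 82.2
    print(round(per_second(500_000_000)))  # 5787
```

The first result matches the 82.2 years quoted in the post. The second comes to roughly 5,787 tweets per second, which the post rounds up to 6,000.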

Introducing Reach: find and track research being put into action


Blog by Dawn Duhaney: “At Wellcome Data Labs we’re releasing our first product, Reach. Our goal is to support funding organisations and researchers by making it easier to find and track scientific research being put into action by governments and global health organisations.

https://reach.wellcomedatalabs.org/

We focused on solving this problem in collaboration with our internal Insights and Analysis team for Wellcome and with partner organisations before deciding to release Reach more widely.

We found that evaluation teams wanted tools to help them measure the influence academic research was having on policy making institutions. We noticed that it is often challenging to track how scientific evidence makes its way into policy making. Institutions like the UK Government and the World Health Organisation have hundreds of thousands of policy documents available — it’s a heavily manual task to search through them to find evidence of our funded research.

At Wellcome we have some established methods for collecting evidence of policy influence from our funded research such as end of scheme reporting and via word of mouth. Through these methods we found great examples of how funded research was being put into policy and practice by government and global health organisations.

One example is from Kenya. The KEMRI Research Programme (a collaboration between the Kenyan Medical Research Institute, Wellcome and Oxford University) launched a research programme to improve maternal health in 2005. Their research was cited by the World Health Organisation and, with advocacy efforts from the KEMRI team, influenced the development of new Kenyan national guidelines for paediatric care.

In Wellcome Data Labs we wanted to build a tool that would aid the discovery of evidence based policy making and be a step in the process of assessing research influence for evaluators, researchers and funding institutions….(More)”.

Data could hold the key to stopping Alzheimer’s


Blog post by Bill Gates: “My family loves to do jigsaw puzzles. It’s one of our favorite activities to do together, especially when we’re on vacation. There is something so satisfying about everyone working as a team to put down piece after piece until finally the whole thing is done.

In a lot of ways, the fight against Alzheimer’s disease reminds me of doing a puzzle. Your goal is to see the whole picture, so that you can understand the disease well enough to better diagnose and treat it. But in order to see the complete picture, you need to figure out how all of the pieces fit together.

Right now, all over the world, researchers are collecting data about Alzheimer’s disease. Some of these scientists are working on drug trials aimed at finding a way to stop the disease’s progression. Others are studying how our brain works, or how it changes as we age. In each case, they’re learning new things about the disease.

But until recently, Alzheimer’s researchers often had to jump through a lot of hoops to share their data—to see if and how the puzzle pieces fit together. There are a few reasons for this. For one thing, there is a lot of confusion about what information you can and can’t share because of patient privacy. Often there weren’t easily available tools and technologies to facilitate broad data-sharing and access. In addition, pharmaceutical companies invest a lot of money into clinical trials, and often they aren’t eager for their competitors to benefit from that investment, especially when the programs are still ongoing.

Unfortunately, this siloed approach to research data hasn’t yielded great results. We have only made incremental progress in therapeutics since the late 1990s. There’s a lot that we still don’t know about Alzheimer’s, including what part of the brain breaks down first and how or when you should intervene. But I’m hopeful that will change soon thanks in part to the Alzheimer’s Disease Data Initiative, or ADDI….(More)”.

How Can Policy Makers Predict the Unpredictable?


Essay by Meg King and Aaron Shull: “Policy makers around the world are leaning on historical analogies to try to predict how artificial intelligence, or AI — which, ironically, is itself a prediction technology — will develop. They are searching for clues to inform and create appropriate policies to help foster innovation while addressing possible security risks. Much in the way that electrical power completely changed our world more than a century ago — transforming every industry from transportation to health care to manufacturing — AI’s power could effect similar, if not even greater, disruption.

Whether it is the “next electricity” or not, one fact all can agree on is that AI is not a thing in itself. Most authors contributing to this essay series focus on the concept that AI is a general-purpose technology — or GPT — that will enable many applications across a variety of sectors. While AI applications are expected to have a significantly positive impact on our lives, those same applications will also likely be abused or manipulated by bad actors. Setting rules at both the national and the international level — in careful consultation with industry — will be crucial for ensuring that AI offers new capabilities and efficiencies safely.

Situating this discussion, though, requires a look back, in order to determine where we may be going. While AI is not new — Marvin Minsky developed what is widely believed to be the first neural network learning machine in the early 1950s — its scale, scope, speed of adoption and potential use cases today highlight a number of new challenges. There are now many ominous signs pointing to extreme danger should AI be deployed in an unchecked manner, particularly in military applications, as well as worrying trends in the commercial context related to potential discrimination, undermining of privacy, and upended traditional employment structures and economic models….(More)”

To mitigate the costs of future pandemics, establish a common data space


Article by Stephanie Chin and Caitlin Chin: “To improve data sharing during global public health crises, it is time to explore the establishment of a common data space for highly infectious diseases. Common data spaces integrate multiple data sources, enabling a more comprehensive analysis of data based on greater volume, range, and access. At its essence, a common data space is like a public library system, which has collections of different types of resources from books to video games; processes to integrate new resources and to borrow resources from other libraries; a catalog system to organize, sort, and search through resources; a library card system to manage users and authorization; and even curated collections or displays that highlight themes among resources.

Even before the COVID-19 pandemic, there was significant momentum to make critical data more widely accessible. In the United States, Title II of the Foundations for Evidence-Based Policymaking Act of 2018, or the OPEN Government Data Act, requires federal agencies to publish their information online as open data, using standardized, machine-readable data formats. This information is now available on the federal data.gov catalog and includes 50 state- or regional-level data hubs and 47 city- or county-level data hubs. In Europe, the European Commission released a data strategy in February 2020 that calls for common data spaces in nine sectors, including healthcare, shared by EU businesses and governments.

Going further, a common data space could help identify outbreaks and accelerate the development of new treatments by compiling line list incidence data, epidemiological information and models, genome and protein sequencing, testing protocols, results of clinical trials, passive environmental monitoring data, and more.

Moreover, it could foster a common understanding and consensus around the facts—a prerequisite to reach international buy-in on policies to address situations unique to COVID-19 or future pandemics, such as the distribution of medical equipment and PPE, disruption to the tourism industry and global supply chains, social distancing or quarantine, and mass closures of businesses….(More)”. See also Call for Action for a Data Infrastructure to tackle Pandemics and other Dynamic Threats.