Stefaan Verhulst
Nigel Cory at ITIF: “If nations could regulate viruses the way many regulate data, there would be no global pandemics. But the sad reality is that, in the midst of the worst global pandemic in living memory, many nations make it unnecessarily complicated and costly, if not illegal, for health data to cross their borders. In so doing, they are hindering critically needed medical progress.
In the COVID-19 crisis, data analytics powered by artificial intelligence (AI) is critical to identifying the exact nature of the pandemic and developing effective treatments. The technology can produce powerful insights and innovations, but only if researchers can aggregate and analyze data from populations around the globe. And that requires data to move across borders as part of international research efforts by private firms, universities, and other research institutions. Yet, some countries, most notably China, are stopping health and genomic data at their borders.
Indeed, despite the significant benefits to companies, citizens, and economies that arise from the ability to easily share data across borders, dozens of countries—across every stage of development—have erected barriers to cross-border data flows. These data-residency requirements strictly confine data within a country’s borders, a concept known as “data localization,” and many countries have especially strict requirements for health data.
China is a noteworthy offender, having created a new digital iron curtain that requires data localization for a range of data types, including health data, as part of its so-called “cyber sovereignty” strategy. A May 2019 State Council regulation required genomic data to be stored and processed locally by Chinese firms—and foreign organizations are prohibited. This is in service of China’s mercantilist strategy to advance its domestic life sciences industry. While there has been collaboration between U.S. and Chinese medical researchers on COVID-19, including on clinical trials for potential treatments, these restrictions mean that it won’t involve the transfer, aggregation, and analysis of Chinese personal data, which otherwise might help find a treatment or vaccine. If China truly wanted to make amends for blocking critical information during the early stages of the outbreak in Wuhan, then it should abolish this restriction and allow genomic and other health data to cross its borders.
But China is not alone in limiting data flows. Russia requires all personal data, health-related or not, to be stored locally. India’s draft data protection bill permits the government to classify any sensitive personal data as critical personal data and mandate that it be stored and processed only within the country. This would be consistent with recent debates and decisions to require localization for payments data and other types of data. And despite its leading role in pushing for the free flow of data as part of new digital trade agreements, Australia requires genomic and other data attached to personal electronic health records to be only stored and processed within its borders.
Countries also enact de facto barriers to health and genomic data transfers by making it harder and more expensive, if not impractical, for firms to transfer it overseas than to store it locally. For example, South Korea and Turkey require firms to get explicit consent from people to transfer sensitive data like genomic data overseas. Doing this for hundreds or thousands of people adds considerable costs and complexity.
And the European Union’s General Data Protection Regulation encourages data localization as firms feel pressured to store and process personal data within the EU given the restrictions it places on data transfers to many countries. This is in addition to the renewed push for local data storage and processing under the EU’s new data strategy.
Countries rationalize these steps on the basis that health data, particularly genomic data, is sensitive. But requiring health data to be stored locally does little to increase privacy or data security. The confidentiality of data does not depend on which country the information is stored in, only on the measures used to store it securely, such as via encryption, and the policies and procedures the firms follow in storing or analyzing the data. For example, if a nation has limits on the use of genomics data, then domestic organizations using that data face the same restrictions, whether they store the data in the country or outside of it. And if they share the data with other organizations, they must require those organizations, regardless of where they are located, to abide by the home government’s rules.
As such, policymakers need to stop treating health data differently when it comes to cross-border movement, and instead build technical, legal, and ethical protections into both domestic and international data-governance mechanisms, which together allow the responsible sharing and transfer of health and genomic data.
This is clearly possible—and needed. In February 2020, leading health researchers called for an international code of conduct for genomic data following the end of their first-of-its-kind international data-driven research project. The project used a purpose-built cloud service that stored 800 terabytes of genomic data on 2,658 cancer genomes across 13 data centers on three continents. The collaboration and use of cloud computing were transformational in enabling large-scale genomic analysis….(More)”.
L. M. Sacasas at The New Atlantis: “…The challenges we are facing are not merely the bad actors, whether they be foreign agents, big tech companies, or political extremists. We are in the middle of a deep transformation of our political culture, as digital technology is reshaping the human experience at both an individual and a social level. The Internet is not simply a tool with which we do politics well or badly; it has created a new environment that yields a different set of assumptions, principles, and habits from those that ordered American politics in the pre-digital age.
We are caught between two ages, as it were, and we are experiencing all of the attendant confusion, frustration, and exhaustion that such a liminal state involves. To borrow a line from the Marxist thinker Antonio Gramsci, “The crisis consists precisely in the fact that the old is dying and the new cannot be born; in this interregnum a great variety of morbid symptoms appear.”
Although it’s not hard to see how the Internet, given its scope, ubiquity, and closeness to human life, radically reshapes human consciousness and social structures, that does not mean that the nature of that reshaping is altogether preordained or that it will unfold predictably and neatly. We must then avoid crassly deterministic just-so stories, and this essay is not an account of how digital media will necessarily change American politics irrespective of competing ideologies, economic forces, or already existing political and cultural realities. Rather, it is an account of how the ground on which these realities play out is shifting. Communication technologies are the material infrastructure on which so much of the work of human society is built. One cannot radically transform that infrastructure without radically altering the character of the culture built upon it. As Neil Postman once put it, “In the year 1500, fifty years after the printing press was invented, we did not have old Europe plus the printing press. We had a different Europe.” So, likewise, we may say that in the year 2020, fifty years after the Internet was invented, we do not have old America plus the Internet. We have a different America….(More)”.
Paper by Patrick Diamond: “In countries worldwide, the provision of policy advice to central governments has been transformed by the deinstitutionalisation of policymaking, which has engaged a diverse range of actors in the policy process. Scholarship should therefore address the impact of deinstitutionalisation in terms of the scope and scale of policy advisory systems, as well as in terms of the influence of policy advisors. This article addresses this gap, presenting a programme of research on policy advice in Whitehall. Building on Craft and Halligan’s conceptualisation of a ‘policy advisory system’, it argues that in an era of polycentric governance, policy advice is shaped by ‘interlocking actors’ beyond government bureaucracy, and that the pluralisation of advisory bodies marginalises the civil service. The implications of such alterations are considered against the backdrop of governance changes, particularly the hybridisation of institutions, which has made policymaking processes complex, prone to unpredictability and at risk of policy blunders….(More)”.
Jonathan Fuller at the Boston Review: “COVID-19 has revealed a contest between two competing philosophies of scientific knowledge. To manage the crisis, we must draw on both….The lasting icon of the COVID-19 pandemic will likely be the graphic associated with “flattening the curve.” The image is now familiar: a skewed bell curve measuring coronavirus cases that towers above a horizontal line—the health system’s capacity—only to be flattened by an invisible force representing “non-pharmaceutical interventions” such as school closures, social distancing, and full-on lockdowns.
How do the coronavirus models generating these hypothetical curves square with the evidence? What roles do models and evidence play in a pandemic? Answering these questions requires reconciling two competing philosophies in the science of COVID-19.
To some extent, public health epidemiology and clinical epidemiology are distinct traditions in health care, competing philosophies of scientific knowledge.
In one camp are infectious disease epidemiologists, who work very closely with institutions of public health. They have used a multitude of models to create virtual worlds in which sim viruses wash over sim populations—sometimes unabated, sometimes held back by a virtual dam of social interventions. This deluge of simulated outcomes played a significant role in leading government actors to shut borders as well as doors to schools and businesses. But the hypothetical curves are smooth, while real-world data are rough. Some detractors have questioned whether we have good evidence for the assumptions the models rely on, and even the necessity of the dramatic steps taken to curb the pandemic. Among this camp are several clinical epidemiologists, who typically provide guidance for clinical practice—regarding, for example, the effectiveness of medical interventions—rather than public health.
The latter camp has won significant media attention in recent weeks. Bill Gates—whose foundation funds the research behind the most visible outbreak model in the United States, developed by the Institute for Health Metrics and Evaluation (IHME) at the University of Washington—worries that COVID-19 might be a “once-in-a-century pandemic.” A notable detractor from this view is Stanford’s John Ioannidis, a clinical epidemiologist, meta-researcher, and reliable skeptic who has openly wondered whether the coronavirus pandemic might rather be a “once-in-a-century evidence fiasco.” He argues that better data are needed to justify the drastic measures undertaken to contain the pandemic in the United States and elsewhere.
Ioannidis claims, in particular, that our data about the pandemic are unreliable, leading to exaggerated estimates of risk. He also points to a systematic review published in 2011 of the evidence regarding physical interventions that aim to reduce the spread of respiratory viruses, worrying that the available evidence is nonrandomized and prone to bias. (A systematic review specific to COVID-19 has now been published; it concurs that the quality of evidence is “low” to “very low” but nonetheless supports the use of quarantine and other public health measures.) According to Ioannidis, the current steps we are taking are “non-evidence-based.”…(More)”.
Book edited by Leon van den Dool: “This book presents international experiences in urban network learning. It is vital for cities to learn as it is necessary to constantly adapt and improve public performance and address complex challenges in a constantly changing environment. It is therefore highly relevant to gain more insight into how cities can learn. Cities address problems and challenges in networks of co-operation between existing and new actors, such as state actors, market players and civil society. This book presents various learning environments and methods for urban network learning, and aims to learn from experiences across the globe. How does learning take place in these urban networks? What factors and situations help or hinder these learning practices? Can we move from intuition to a strategy to improve urban network learning?…(More)”.
Book by Ivana Bartoletti: “AI has unparalleled transformative potential to reshape society but without legal scrutiny, international oversight and public debate, we are sleepwalking into a future written by algorithms which encode regressive biases into our daily lives. As governments and corporations worldwide embrace AI technologies in pursuit of efficiency and profit, we are at risk of losing our common humanity: an attack that is as insidious as it is pervasive.
Leading privacy expert Ivana Bartoletti exposes the reality behind the AI revolution, from the low-paid workers who train algorithms to recognise cancerous polyps, to the rise of data violence and the symbiotic relationship between AI and right-wing populism.
Impassioned and timely, An Artificial Revolution is an essential primer to understand the intersection of technology and geopolitical forces shaping the future of civilisation, and the political response that will be required to ensure the protection of democracy and human rights….(More)”.
Article by Satchit Balsari, Caroline Buckee and Tarun Khanna: “The Covid-19 pandemic has created a tidal wave of data. As countries and cities struggle to grab hold of the scope and scale of the problem, tech corporations and data aggregators have stepped up, filling the gap with dashboards scoring social distancing based on location data from mobile phone apps and cell towers, contact-tracing apps using geolocation services and Bluetooth, and modeling efforts to predict epidemic burden and hospital needs. In the face of uncertainty, these data can provide comfort — tangible facts in the face of many unknowns.
In a crisis situation like the one we are in, data can be an essential tool for crafting responses, allocating resources, measuring the effectiveness of interventions, such as social distancing, and telling us when we might reopen economies. However, incomplete or incorrect data can also muddy the waters, obscuring important nuances within communities, ignoring important factors such as socioeconomic realities, and creating false senses of panic or safety, not to mention other harms such as needlessly exposing private information. Right now, bad data could produce serious missteps with consequences for millions.
Unfortunately, many of these technological solutions — however well intended — do not provide the clear picture they purport to. In many cases, there is insufficient engagement with subject-matter experts, such as epidemiologists who specialize in modeling the spread of infectious diseases or front-line clinicians who can help prioritize needs. But because technology and telecom companies have greater access to mobile device data, enormous financial resources, and larger teams of data scientists, than academic researchers do, their data products are being rolled out at a higher volume than high quality studies.
Whether you’re a CEO, a consultant, a policymaker, or just someone who is trying to make sense of what’s going on, it’s essential to be able to sort the good data from the misleading — or even misguided.
Common Pitfalls
While you may not be qualified to evaluate the particulars of every dashboard, chart, and study you see, there are common red flags to let you know data might not be reliable. Here’s what to look out for:
Data products that are too broad, too specific, or lack context. Over-aggregated data — such as national metrics of physical distancing that some of our largest data aggregators in the world are putting out — obscure important local and regional variation, are not actionable, and mean little if used for inter-nation comparisons given the massive social, demographic, and economic disparities in the world….(More)”.
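The over-aggregation pitfall can be illustrated with a minimal sketch. The numbers below are invented for illustration and do not come from any real dashboard; the regions are assumed to have equal populations so that a simple mean stands in for a national metric.

```python
from statistics import mean

# Hypothetical figures, invented for illustration only: percentage reduction
# in mobility for four regions, assumed to have equal populations.
regional_reduction = {
    "Region A": 70,
    "Region B": 65,
    "Region C": 10,
    "Region D": 15,
}


def summarize(values_by_region):
    """Return the aggregate mean and the gap between the extreme regions."""
    values = list(values_by_region.values())
    return {
        "aggregate_mean": mean(values),
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
    }


summary = summarize(regional_reduction)
print(summary)
```

In this made-up case the aggregate figure matches none of the four regions, which is the sense in which a single national number may not be actionable locally.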
Tom Lamont at 1843 (Economist): “…Information overload was a term coined in the mid-1960s by Bertram Gross, an American social scientist. In 1970 a writer called Alvin Toffler, who was known at the time as a dependable futurist – someone who prognosticated for a living – popularised the idea of information overload as part of a set of bleak predictions about eventual human dependence on technology. (Good call, Alvin.) Information overload can occur in man or machine, wrote another set of academics in a 1977 study, “when the amount of input to a system exceeds its processing capacity”. Then came VHS, home computers, the internet, mobile phones, mobile-phones-with-the-internet – and waves of anxiety that we might be reaching the limits of our capacity.
A study in 2011 found that on a typical day Americans were taking in five times as much information as they had done 25 years earlier – and this was before most people had bought smartphones. In 2019 a study by academics in Germany, Ireland and Denmark identified that humans’ attention span is shrinking, probably because of digital intrusion, but was manifesting itself both “online and offline”.
By that time an organisation called the Information Overload Research Group had done a study which estimated that hundreds of billions of dollars were being shucked away from the American economy every year, in miscellaneous productivity costs, by an overload of data. The group had been co-founded in 2007 by a computer engineer-turned-consultant, Nathan Zeldes, who had once been asked by Intel, a computer-chip maker, to reduce the burden of email imposed on its workers. By the end of 2019 Zeldes was ready to sound a note of defeat. “I’d love to give you a magic potion that would restore your attention span to that of your grandparents,” he wrote in a blog, “but I can’t. After over a decade of smartphone use and social media, the harm is probably irreversible.” He advised people to take up a hobby.
In an age of overload it can feel as though technology has rather chanced its luck. Pushed too much, too far, bone-deep. Even before coronavirus spread across the world, parts of the culture had started to tack towards isolation and deprivation as desirable lifestyle signifiers, hot-this-year, as if some time spent alone and without a device was the new season’s outfit, the next Cronut, another twerk.
Before a pandemic limited the appeal of wallowing in someone else’s tepid water, flotation-tank centres were opening all over London. In the Czech Republic there are spas that sell clients a week in the dark in shuttered, serviced suites. “Social distancing is underrated,” Edward Snowden tweeted, deadpan, in March 2020: a corona-joke, but one that will have spoken to the tech bros of Silicon Valley, for whom retreats were the treat of choice.
Recently, I saw that a person called Celine in San Francisco had tweeted to her 2,500-odd followers about the difficulty of “trying to date SF guys in between their week-long meditation retreats, Tahoe weekends, month-long remote work sessions…” About 4,000 people tapped to endorse the sentiment, launching Celine onto an exponential number of strangers’ screens, including my own. The default sound for any new tweet is a whistle, somewhere between a neighbourly “yoo-hoo” and a dog-walker’s call to heel.
Hilda Burke, a British psychotherapist who has written about smartphone addiction, told me that part of the problem in this age of overload is the yoo-hooing insistence with which each new parcel of information seeks our attention. Speakers chime. Pixelated columns shuffle urgently or icons bounce, as if to signal that here is the fire. Our twitch response to urgency is triggered, in bad faith.
When Celine’s tweet whistled onto my phone one idle Friday I couldn’t understand why I found it mildly stressful to read. Was it that it made me feel old? That I already had enough to think about? Eventually I realised that, for me, every tweet is a bit stressful. Every trifling, whistling update that comes at us, Burke said, “is like a sheep dressed in wolf’s clothing. The body springs to attention, ready to run or fight, and for nothing that’s worth it. This is confusing.”…(More)”
Press Release: “As part of efforts to identify priorities across sectors in which data and data science could make a difference, The Governance Lab (The GovLab) at the New York University Tandon School of Engineering has partnered with Data2X, the gender data alliance housed at the United Nations Foundation, to release ten pressing questions on gender that experts have determined can be answered using data. Members of the public are invited to share their views and vote to help develop a data agenda on gender.
The questions are part of the 100 Questions Initiative, an effort to identify the most important societal questions that can be answered by data. The project relies on an innovative process of sourcing “bilinguals,” individuals with both subject-matter and data expertise, who in this instance provided questions related to gender they considered to be urgent and answerable. The results span issues of labor, health, climate change, and gender-based violence.
Through the initiative’s new online platform, anyone can now vote on what they consider to be the most pressing, data-related questions about gender that researchers and institutions should prioritize. Through voting, the public can steer the conversation and determine which topics should be the subject of data collaboratives, an emerging form of collaboration that allows organizations from different sectors to exchange data to create public value.

The GovLab has conducted significant research on the value and practice of data collaboratives, and its research shows that inter-sectoral collaboration can both increase access to data as well as unleash the potential of that data to serve the public good.
Data2X supported the 100 Questions Initiative by providing expertise and connecting The GovLab with relevant communities, events, and resources. The initiative helped inform Data2X’s “Big Data, Big Impact? Towards Gender-Sensitive Data Systems” report, which identifies gaps of information on gender equality across key policy domains.
“Asking the right questions is a critical first step in fostering data production and encouraging data use to truly meet the unique experiences and needs of women and girls,” said Emily Courey Pryor, executive director of Data2X. “Obtaining public feedback is a crucial way to identify the most urgent questions, and to ultimately incentivize investment in gender data collection and use to find the answers.”
Said Stefaan Verhulst, co-founder and chief research and development officer at The GovLab, “Sourcing and prioritizing questions related to gender can inform resource and funding allocation to address gender data gaps and support projects with the greatest potential impact. This way, we can be confident about solutions that address the challenges facing women and girls.”…(More)”.
Blog Post by Manuela Di Fusco: “Real-world data (RWD) and real-world evidence (RWE) are playing an increasing role in healthcare decision making.
The conduct of RWD studies involves many interconnected stages, ranging from the definition of research questions of high scientific interest, to the design of a study protocol and statistical plan, and the conduct of the analyses, quality reviews, publication and presentation to the scientific community. Every stage requires extensive knowledge, expertise and efforts from the multidisciplinary research team.
There are a number of well-accepted guidelines for good procedural practices in RWD. Despite their stress on the importance of data reliability, relevance and studies being fit for purpose, their recommendations generally focus on methods/analyses and transparent reporting of results. There often is little focus on feasibility concerns at the early stages of a study; ongoing RWD initiatives, too, focus on improving standards and practices for data collection and analyses.
The availability and use of new data sources, which have the ability to store health-related data, have been growing globally, and include mobile technologies, electronic patient-reported outcome tools and wearables [1].
As data sources exist in various formats, and are often created for non-research purposes, they have inherent associated limitations – such as missing data. Determining the best approach for collecting complete and quality data is of critical importance. At study conception, it is not always clear if it is reasonable to expect that the research question of interest could be fully answered and all analyses carried out. Numerous methodological and data collection challenges can emerge during study execution. However, some of these downstream study challenges could be proactively addressed through an early feasibility study, concurrent to protocol development. For example, during this exploratory study, datasets may be explored carefully to ensure data points deemed relevant for the study are routinely ascertained and captured sufficiently, despite potential missing data and/or other data source limitations.
This feasibility assessment serves primarily as a first step to gain knowledge of the data and ensure realistic assumptions are included in the protocol; relevant sensitivity analyses can test those assumptions, hence setting the basis for successful study development.
Below is a list of key feasibility questions which may guide the technical exploration and conceptualization of a retrospective RWD study. The list is based on experience supporting observational studies on a global scale and is not intended to be exhaustive or representative of all preparatory activities. This technical feasibility analysis should be carried out while considering other relevant aspects, including the novelty and strategic value of the study versus the existing evidence (in the form of randomized controlled trial data and other RWE), the intended audience, data access/protection, reporting requirements and external validity aspects.
The list may support early discussions among study team members during the preparation and determination of a RWD study.
- Can the population be accurately identified in the data source?
Diagnoses and procedures can be identified through International Classification of Diseases codes; published code validation studies on the population of interest can be a useful guide.
- How generalizable is the population of the data source?
Generalizability issues should be recognized upfront. For example, the patient population for which data is available in the data source might be restricted to a specific geographic region, health insurance plan (e.g. Medicare or commercial), system (hospital/inpatient and ambulatory) or group (e.g. age, gender)…(More)”.
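As a rough illustration of the first two feasibility questions, the sketch below filters a small set of records by ICD-10 code prefix and then checks how completely key fields are captured in the resulting cohort. The `Record` structure, the field names, and the sample rows are all hypothetical; a real study would rely on validated code lists and the actual schema of the data source.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    """Hypothetical, simplified patient record; not a real data-source schema."""
    patient_id: str
    icd10_code: Optional[str]
    region: Optional[str]
    age: Optional[int]


def identify_population(records, code_prefixes):
    """Keep records whose ICD-10 code starts with one of the given prefixes."""
    prefixes = tuple(code_prefixes)
    return [
        r for r in records
        if r.icd10_code is not None and r.icd10_code.startswith(prefixes)
    ]


def completeness(records, field_name):
    """Fraction of records in which the named field is not missing."""
    if not records:
        return 0.0
    present = sum(1 for r in records if getattr(r, field_name) is not None)
    return present / len(records)


def region_distribution(records):
    """Count records per region to expose geographic skew (None = missing)."""
    return Counter(r.region for r in records)


# Invented sample rows, for illustration only.
records = [
    Record("p1", "I48.0", "North", 71),
    Record("p2", "I48.91", "North", None),
    Record("p3", "E11.9", "South", 64),
    Record("p4", None, "South", 58),
    Record("p5", "I48.2", None, 80),
]

# "I48" (atrial fibrillation and flutter) is used here purely as an example.
cohort = identify_population(records, ["I48"])
print([r.patient_id for r in cohort])
print(completeness(cohort, "age"), completeness(cohort, "region"))
print(region_distribution(cohort))
```

A check of this kind, run before the protocol is finalized, can show whether the cohort is large enough, whether required variables are routinely captured, and whether the population is concentrated in one region or plan.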