Introducing Reach: find and track research being put into action


Blog by Dawn Duhaney: “At Wellcome Data Labs we’re releasing our first product, Reach. Our goal is to support funding organisations and researchers by making it easier to find and track scientific research being put into action by governments and global health organisations.

https://reach.wellcomedatalabs.org/

We focused on solving this problem in collaboration with our internal Insights and Analysis team for Wellcome and with partner organisations before deciding to release Reach more widely.

We found that evaluation teams wanted tools to help them measure the influence academic research was having on policy making institutions. We noticed that it is often challenging to track how scientific evidence makes its way into policy making. Institutions like the UK Government and the World Health Organisation have hundreds of thousands of policy documents available — it’s a heavily manual task to search through them to find evidence of our funded research.

At Wellcome we have some established methods for collecting evidence of policy influence from our funded research, such as end-of-scheme reporting and word of mouth. Through these methods we found great examples of how funded research was being put into policy and practice by government and global health organisations.

One example is from Kenya. The KEMRI Research Programme, a collaboration between the Kenyan Medical Research Institute, Wellcome and Oxford University, launched a research programme to improve maternal health in 2005. Their research was cited by the World Health Organisation and, with advocacy efforts from the KEMRI team, influenced the development of new Kenyan national guidelines for paediatric care.

In Wellcome Data Labs we wanted to build a tool that would aid the discovery of evidence based policy making and be a step in the process of assessing research influence for evaluators, researchers and funding institutions….(More)”.

Data could hold the key to stopping Alzheimer’s


Blog post by Bill Gates: “My family loves to do jigsaw puzzles. It’s one of our favorite activities to do together, especially when we’re on vacation. There is something so satisfying about everyone working as a team to put down piece after piece until finally the whole thing is done.

In a lot of ways, the fight against Alzheimer’s disease reminds me of doing a puzzle. Your goal is to see the whole picture, so that you can understand the disease well enough to better diagnose and treat it. But in order to see the complete picture, you need to figure out how all of the pieces fit together.

Right now, all over the world, researchers are collecting data about Alzheimer’s disease. Some of these scientists are working on drug trials aimed at finding a way to stop the disease’s progression. Others are studying how our brain works, or how it changes as we age. In each case, they’re learning new things about the disease.

But until recently, Alzheimer’s researchers often had to jump through a lot of hoops to share their data—to see if and how the puzzle pieces fit together. There are a few reasons for this. For one thing, there is a lot of confusion about what information you can and can’t share because of patient privacy. Often there weren’t easily available tools and technologies to facilitate broad data-sharing and access. In addition, pharmaceutical companies invest a lot of money into clinical trials, and often they aren’t eager for their competitors to benefit from that investment, especially when the programs are still ongoing.

Unfortunately, this siloed approach to research data hasn’t yielded great results. We have only made incremental progress in therapeutics since the late 1990s. There’s a lot that we still don’t know about Alzheimer’s, including what part of the brain breaks down first and how or when you should intervene. But I’m hopeful that will change soon thanks in part to the Alzheimer’s Disease Data Initiative, or ADDI….(More)”.

How Can Policy Makers Predict the Unpredictable?


Essay by Meg King and Aaron Shull: “Policy makers around the world are leaning on historical analogies to try to predict how artificial intelligence, or AI — which, ironically, is itself a prediction technology — will develop. They are searching for clues to inform and create appropriate policies to help foster innovation while addressing possible security risks. Much in the way that electrical power completely changed our world more than a century ago — transforming every industry from transportation to health care to manufacturing — AI’s power could effect similar, if not even greater, disruption.

Whether it is the “next electricity” or not, one fact all can agree on is that AI is not a thing in itself. Most authors contributing to this essay series focus on the concept that AI is a general-purpose technology — or GPT — that will enable many applications across a variety of sectors. While AI applications are expected to have a significantly positive impact on our lives, those same applications will also likely be abused or manipulated by bad actors. Setting rules at both the national and the international level — in careful consultation with industry — will be crucial for ensuring that AI offers new capabilities and efficiencies safely.

Situating this discussion, though, requires a look back, in order to determine where we may be going. While AI is not new — Marvin Minsky developed what is widely believed to be the first neural network learning machine in the early 1950s — its scale, scope, speed of adoption and potential use cases today highlight a number of new challenges. There are now many ominous signs pointing to extreme danger should AI be deployed in an unchecked manner, particularly in military applications, as well as worrying trends in the commercial context related to potential discrimination, undermining of privacy, and upended traditional employment structures and economic models….(More)”

To mitigate the costs of future pandemics, establish a common data space


Article by Stephanie Chin and Caitlin Chin: “To improve data sharing during global public health crises, it is time to explore the establishment of a common data space for highly infectious diseases. Common data spaces integrate multiple data sources, enabling a more comprehensive analysis of data based on greater volume, range, and access. At its essence, a common data space is like a public library system, which has collections of different types of resources from books to video games; processes to integrate new resources and to borrow resources from other libraries; a catalog system to organize, sort, and search through resources; a library card system to manage users and authorization; and even curated collections or displays that highlight themes among resources.

Even before the COVID-19 pandemic, there was significant momentum to make critical data more widely accessible. In the United States, Title II of the Foundations for Evidence-Based Policymaking Act of 2018, or the OPEN Government Data Act, requires federal agencies to publish their information online as open data, using standardized, machine-readable data formats. This information is now available on the federal data.gov catalog and includes 50 state- or regional-level data hubs and 47 city- or county-level data hubs. In Europe, the European Commission released a data strategy in February 2020 that calls for common data spaces in nine sectors, including healthcare, shared by EU businesses and governments.

Going further, a common data space could help identify outbreaks and accelerate the development of new treatments by compiling line list incidence data, epidemiological information and models, genome and protein sequencing, testing protocols, results of clinical trials, passive environmental monitoring data, and more.

Moreover, it could foster a common understanding and consensus around the facts—a prerequisite to reach international buy-in on policies to address situations unique to COVID-19 or future pandemics, such as the distribution of medical equipment and PPE, disruption to the tourism industry and global supply chains, social distancing or quarantine, and mass closures of businesses….(More). See also Call for Action for a Data Infrastructure to tackle Pandemics and other Dynamic Threats.

How to Use the Bureaucracy to Govern Well


Good Governance Paper by Rebecca Ingber: “…Below I offer four concrete recommendations for deploying Intentional Bureaucratic Architecture within the executive branch. But first, I will establish three key background considerations that provide context for these recommendations. The focus of this piece is primarily executive branch legal decisionmaking, but many of these recommendations apply equally to other areas of policymaking.

First, make room for the views and expertise of career officials. As a political appointee entering a new office, ask those career officials: What are the big issues on the horizon on which we will need to take policy or legal views?  What are the problems with the positions I am inheriting?  What is and is not working?  Where are the points of conflict with our allies abroad or with Congress?  Career officials are the institutional memory of the government and often the only real experts in the specific work of their agency.  They will know about the skeletons in the closet and where the bodies are buried and all the other metaphors for knowing things that other people do not. Turn to them early. Value them. They will have views informed by experience rather than partisan politics. But all bureaucratic actors, including civil servants, also bring to the table their own biases, and they may overvalue the priorities of their own office over others. Valuing their role does not mean handing the reins over to the civil service—good governance requires exercising judgement and balancing the benefits of experience and expertise with fresh eyes and leadership. A savvy bureaucratic actor might know how to “get around” the bureaucratic roadblocks, but the wise bureaucratic player also knows how much the career bureaucracy has to offer and exercises judgment based in clear values about when to defer and when to overrule.

Second, get ahead of decisions: choose vehicles for action carefully and early. The reality of government life is that much of the big decisionmaking happens in the face of a fire drill. As I’ve written elsewhere, the trigger or “interpretation catalyst” that compels the government to consider and assert a position—in other words, the cause of that fire drill—shapes the whole process of decisionmaking and the resulting decision. When an issue arises in defensive litigation, a litigation-driven process controls.  That means that career line attorneys shape the government’s legal posture, drawing from longstanding positions and often using language from old briefs. DOJ calls the shots in a context biased toward zealous defense of past action. That looks very different from a decisionmaking process that results from the president issuing an executive order or presidential memorandum, a White House official deciding to make a speech, the State Department filing a report with a treaty body, or DOD considering whether to engage in an operation involving force. Each of these interpretation catalysts triggers a different process for decisionmaking that will shape the resulting outcome.  But because of the stickiness of government decisions—and the urgent need to move on to the next fire drill—these positions become entrenched once taken. That means that the process and outcome are driven by the hazards of external events, unless officials find ways to take the reins and get ahead of them.

And finally, an incoming administration must put real effort into Intentional Bureaucratic Architecture by deliberately and deliberatively creating and managing the bureaucratic processes in which decisionmaking happens. Novel issues arise and fire drills will inevitably happen in even the best prepared administrations.  The bureaucratic architecture will dictate how decisionmaking happens from the novel crises to the bread and butter of daily agency work. There are countless varieties of decisionmaking models inside the executive branch, which I have classified in other work. These include a unitary decider model, of which DOJ’s Office of Legal Counsel (OLC) is a prime example, an agency decider model, and a group lawyering model. All of these models will continue to co-exist. Most modern national security decisionmaking engages the interests and operations of multiple agencies. Therefore, in a functional government, most of these decisions will involve group lawyering in some format—from agency lawyers picking up the phone to coordinate with counterparts in other agencies to ad hoc meetings to formal regularized working groups with clear hierarchies all the way up to the cabinet. Often these processes evolve organically, as issues arise. Some are created from the top down by presidential administrations that want to impose order on the process. But all of these group lawyering dynamics often lack a well-defined process for determining the outcome in cases of conflict or deciding how to establish a clear output. This requires rule setting and organizing the process from the top down….(More).

Data to Go: The Value of Data Portability as a Means to Data Liquidity


Juliet McMurren and Stefaan G. Verhulst at Data & Policy: “If data is the “new oil,” why isn’t it flowing? For almost two decades, data management in fields such as government, healthcare, finance, and research has aspired to achieve a state of data liquidity, in which data can be reused where and when it is needed. For the most part, however, this aspiration remains unrealized. The majority of the world’s data continues to stagnate in silos, controlled by data holders and inaccessible to both its subjects and others who could use it to create or improve services, for research, or to solve pressing public problems.

Efforts to increase liquidity have focused on forms of voluntary institutional data sharing such as data pools or other forms of data collaboratives. Although useful, these arrangements can only advance liquidity so far. Because they vest responsibility and control over liquidity in the hands of data holders, their success depends on data holders’ willingness and ability to provide access to their data for the greater good. While that willingness exists in some fields, particularly medical research, a willingness to share data is much less likely where data holders are commercial competitors and data is the source of their competitive advantage. And even where willingness exists, the ability of data holders to share data safely, securely, and interoperably may not. Without a common set of secure, standardized, and interoperable tools and practices, the best that such bottom-up collaboration can achieve is a disconnected patchwork of initiatives, rather than the data liquidity proponents are seeking.

Data portability is one potential solution to this problem. As enacted in the EU General Data Protection Regulation (2018) and the California Consumer Privacy Act (2018), the right to data portability asserts that individuals have a right to obtain, copy, and reuse their personal data and transfer it between platforms or services. In so doing, it shifts control over data liquidity to data subjects, obliging data holders to release data whether or not it is in their commercial interests to do so. Proponents of data portability argue that, once data is unlocked and free to move between platforms, it can be combined and reused in novel ways and in contexts well beyond those in which it was originally collected, all while enabling greater individual control.

To date, however, arguments for the benefits of the right to data portability have typically failed to connect this rights-based approach with the larger goal of data liquidity and how portability might advance it. This failure to connect these principles and to demonstrate their collective benefits to data subjects, data holders, and society has real-world consequences. Without a clear view of what can be achieved, policymakers are unlikely to develop interventions and incentives to advance liquidity and portability, individuals will not exercise their rights to data portability, and industry will not experiment with use cases and develop the tools and standards needed to make portability and liquidity a reality.

Toward these ends, we have been exploring the current literature on data portability and liquidity, searching for lessons and insights into the benefits that can be unlocked when data liquidity is enabled through the right to data portability. Below we identify some of the greatest potential benefits for society, individuals, and data-holding organizations. These benefits are sometimes in conflict with one another, making the field a contentious one that demands further research on the trade-offs and empirical evidence of impact. In the final section, we also discuss some barriers and challenges to achieving greater data liquidity….(More)”.

Data as Property?


Blog by Salomé Viljoen: “Since the proliferation of the World Wide Web in the 1990s, critics of widely used internet communications services have warned of the misuse of personal data. Alongside familiar concerns regarding user privacy and state surveillance, a now-decades-long thread connects a group of theorists who view data—and in particular data about people—as central to what they have termed informational capitalism.1 Critics locate in datafication—the transformation of information into commodity—a particular economic process of value creation that demarcates informational capitalism from its predecessors. Whether these critics take “information” or “capitalism” as the modifier warranting primary concern, datafication, in their analysis, serves a dual role: both a process of production and a form of injustice.

In arguments levied against informational capitalism, the creation, collection, and use of data feature prominently as an unjust way to order productive activity. For instance, in her 2019 blockbuster The Age of Surveillance Capitalism, Shoshana Zuboff likens our inner lives to a pre-Colonial continent, invaded and strip-mined of data by technology companies seeking profits.2 Elsewhere, Jathan Sadowski identifies data as a distinct form of capital, and accordingly links the imperative to collect data to the perpetual cycle of capital accumulation.3 Julie Cohen, in the Polanyian tradition, traces the “quasi-ownership through enclosure” of data and identifies the processing of personal information in “data refineries” as a fourth factor of production under informational capitalism.4

Critiques breed proposals for reform. Thus, data governance emerges as key terrain on which to discipline firms engaged in datafication and to respond to the injustices of informational capitalism. Scholars, activists, technologists and even presidential candidates have all proposed data governance reforms to address the social ills generated by the technology industry.

These reforms generally come in two varieties. Propertarian reforms diagnose the source of datafication’s injustice in the absence of formal property (or alternatively, labor) rights regulating the process of production. In 2016, inventor of the world wide web Sir Tim Berners-Lee founded Solid, a web decentralization platform, out of his concern over how data extraction fuels the growing power imbalance of the web which, he notes, “has evolved into an engine of inequity and division; swayed by powerful forces who use it for their own agendas.” In response, Solid “aims to radically change the way Web applications work today, resulting in true data ownership as well as improved privacy.” Solid is one popular project within the blockchain community’s #ownyourdata movement; another is Radical Markets, a suite of proposals from Glen Weyl (an economist and researcher at Microsoft) that includes developing a labor market for data. Like Solid, Weyl’s project is in part a response to inequality: it aims to disrupt the digital economy’s “technofeudalism,” where the unremunerated fruits of data laborers’ toil help drive the inequality of the technology economy writ large.5 Progressive politicians from Andrew Yang to Alexandria Ocasio-Cortez have similarly advanced proposals to reform the information economy, proposing variations on the theme of user-ownership over their personal data.

The second type of reforms, which I call dignitarian, take a further step beyond asserting rights to data-as-property, and resist data’s commodification altogether, drawing on a framework of civil and human rights to advocate for increased protections. Proposed reforms along these lines grant individuals meaningful capacity to say no to forms of data collection they disagree with, to determine the fate of data collected about them, and to grant them rights against data about them being used in ways that violate their interests….(More)”.

A New Normal for Data Collection: Using the Power of Community to Tackle Gender Violence Amid COVID-19


Claudia Wells at SDG Knowledge Hub: “A shocking increase in violence against women and girls has been reported in many countries during the COVID-19 pandemic, amounting to what UN Women calls a “shadow pandemic.”

The jarring facts are:

  • Globally, 243 million women and girls have been subjected to sexual and/or physical violence by an intimate partner in the past 12 months.
  • The UNFPA estimates that the pandemic will cause a one-third reduction in progress towards ending gender-based violence by 2030.
  • UNFPA predicts an additional 15 million cases of gender-based violence for every three months of lockdown.
  • Official data captures only a fraction of the true prevalence and nature of gender-based violence.

The response to these new challenges was discussed at a meeting in July, with a community-led response delivered through local actors highlighted as key. This means that timely, disaggregated, community-level data on the nature and prevalence of gender-based violence has never been more important. Data collected within communities can play a vital role in filling the gaps and ensuring that data-informed policies reflect the lived experiences of the most marginalized women and girls.

Community Scorecards: Example from Nepal

Collecting and using community-level data can be challenging, particularly under the restrictions of the pandemic. Working in partnerships is therefore vital if we are to respond quickly and flexibly to new and old challenges.

A great example of this is the Leave No One Behind Partnership, which responds to these challenges while delivering on crucial data and evidence at the community level. This important partnership brings together international civil society organizations with national NGOs, civic platforms and community-based organizations to monitor progress towards the SDGs….

While COVID-19 has highlighted the need for local, community-driven data, public health restrictions have also made it more challenging to collect such data. For example, the usual focus group approach to creating a community scorecard is no longer possible.

The coalition in Nepal therefore faces an increased demand for community-driven data while needing to develop a “new normal for data collection.” Partners must: make data collection more targeted; consider how data on gender-based violence are included in secondary sources; and map online resources and other forms of data collection.

Addressing these new challenges may include using more blended collection approaches such as mobile phones or web-based platforms. However, while these may help to facilitate data collection, they come with increased privacy and safeguarding risks that have to be carefully considered to ensure that participants, particularly women and girls, are not put at increased risk of violence and do not have their privacy and confidentiality compromised….(More)”.

Poor data on groundwater jeopardizes climate resilience


Rebecca Root at Devex: “A lack of data on groundwater is impeding water management and could jeopardize climate resilience efforts in some places, according to recent research by WaterAid and the HSBC Water Programme.

Groundwater is found underground in gaps between soil, sand, and rock. Over 2.5 billion people are thought to depend on groundwater, which has a higher tolerance to droughts than other water sources, for drinking.

The report looked at groundwater security and sustainability in Bangladesh, Ghana, India, Nepal, and Nigeria, where collectively more than 160 million people lack access to clean water close to home. It found that groundwater data tends to be limited — including on issues such as overextraction, pollution, and contamination — leaving little evidence for decision-makers to consider for its management.

“There’s a general lack of information and data … which makes it very hard to manage the resource sustainably,” said Vincent Casey, senior water, sanitation, and hygiene manager for waste at WaterAid…(More)”.

High-Stakes AI Decisions Need to Be Automatically Audited


Oren Etzioni and Michael Li in Wired: “…To achieve increased transparency, we advocate for auditable AI, an AI system that is queried externally with hypothetical cases. Those hypothetical cases can be either synthetic or real—allowing automated, instantaneous, fine-grained interrogation of the model. It’s a straightforward way to monitor AI systems for signs of bias or brittleness: What happens if we change the gender of a defendant? What happens if the loan applicants reside in a historically minority neighborhood?
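The kind of external querying described above can be illustrated with a small counterfactual audit. The sketch below is only an illustration, not the authors' method: `Applicant`, `toy_model`, and `flip_rate` are hypothetical names invented here, and the toy model stands in for a black-box system that an auditor can call but not inspect. The audit changes one attribute at a time and measures how often the decision changes.

```python
from dataclasses import dataclass, replace
from typing import Callable, Iterable


@dataclass(frozen=True)
class Applicant:
    """Hypothetical loan applicant used as an audit case."""
    income: float
    gender: str
    minority_neighborhood: bool


def flip_rate(
    model: Callable[[Applicant], bool],
    cases: Iterable[Applicant],
    perturb: Callable[[Applicant], Applicant],
) -> float:
    """Return the fraction of cases whose decision changes after `perturb`."""
    cases = list(cases)
    if not cases:
        return 0.0
    changed = sum(model(c) != model(perturb(c)) for c in cases)
    return changed / len(cases)


def toy_model(a: Applicant) -> bool:
    """Hypothetical black box: it quietly raises the bar for one neighborhood."""
    threshold = 60_000 if a.minority_neighborhood else 50_000
    return a.income >= threshold


# Synthetic audit cases: every combination of a few attribute values.
cases = [
    Applicant(income, gender, neighborhood)
    for income in (40_000, 55_000, 70_000)
    for gender in ("F", "M")
    for neighborhood in (False, True)
]

# Hypothetical question 1: what happens if we change the applicant's gender?
gender_rate = flip_rate(
    toy_model, cases, lambda a: replace(a, gender="M" if a.gender == "F" else "F")
)

# Hypothetical question 2: what happens if we change the neighborhood flag?
neighborhood_rate = flip_rate(
    toy_model,
    cases,
    lambda a: replace(a, minority_neighborhood=not a.minority_neighborhood),
)

print(f"decisions changed by gender swap:       {gender_rate:.2f}")
print(f"decisions changed by neighborhood swap: {neighborhood_rate:.2f}")
```

In this toy setup the gender swap never changes a decision, while the neighborhood swap changes it for the middle-income cases, which is the kind of signal an auditor would want to investigate further. A real audit would need far more cases and care about correlated attributes.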

Auditable AI has several advantages over explainable AI. First, having a neutral third party investigate these questions is a far better check on bias than explanations controlled by the algorithm’s creator. Second, auditing means the producers of the software do not have to expose trade secrets of their proprietary systems and data sets. Thus, AI audits will likely face less resistance.

Auditing is complementary to explanations. In fact, auditing can help to investigate and validate (or invalidate) AI explanations. Say Netflix recommends The Twilight Zone because I watched Stranger Things. Will it also recommend other science fiction horror shows? Does it recommend The Twilight Zone to everyone who’s watched Stranger Things?

Early examples of auditable AI are already having a positive impact. The ACLU recently revealed that Amazon’s auditable facial-recognition algorithms were nearly twice as likely to misidentify people of color. There is growing evidence that public audits can improve model accuracy for under-represented groups.

In the future, we can envision a robust ecosystem of auditing systems that provide insights into AI. We can even imagine “AI guardians” that build external models of AI systems based on audits. Instead of requiring AI systems to provide low-fidelity explanations, regulators can insist that AI systems used for high-stakes decisions provide auditing interfaces.

Auditable AI is not a panacea. If an AI system is performing a cancer diagnostic, the patient will still want an accurate and understandable explanation, not just an audit. Such explanations are the subject of ongoing research and will hopefully be ready for commercial use in the near future. But in the meantime, auditable AI can increase transparency and combat bias….(More)”.