AI+1: Shaping Our Integrated Future


Report edited by the Rockefeller Foundation: “As we speak—and browse, and post photos, and move about—artificial intelligence is transforming the fabric of our lives. It is making life easier, better informed, healthier, more convenient. It also threatens to crimp our freedoms, worsen social disparities, and give inordinate powers to unseen forces.

Both AI’s virtues and risks have been on vivid display during this moment of global turmoil, forcing a deeper conversation around its responsible use and, more importantly, the rules and regulations needed to harness its power for good.

This is a vastly complex subject, with no easy conclusions. With no roadmap, however, we risk creating more problems instead of solving meaningful ones.

Last fall The Rockefeller Foundation convened a unique group of thinkers and doers at its Bellagio Center in Italy to weigh one of the great challenges of our time: How to harness the powers of machine learning for social good and minimize its harms. The resulting AI + 1 report includes diverse perspectives from top technologists, philosophers, economists, and artists at a critical moment during the current Covid-19 pandemic.

The report’s authors present a mix of skepticism and hope centered on three themes:

  1. AI is more than a technology. It reflects the values in its system, suggesting that any ethical lapses simply mirror our own deficiencies. And yet, there’s hope: AI can also inspire us, augment us, and make us go deeper.
  2. AI’s goals need to be society’s goals. As opposed to the market-driven, profit-making ones that dominate its use today, applying AI responsibly means using it to support systems that have human goals.
  3. We need a new rule-making system to guide its responsible development. Self-regulation simply isn’t enough. Cross-sector oversight must start with transparency and access to meaningful information, as well as an ability to expose harm.

AI itself is a slippery force, hard to pin down and define, much less regulate. We describe it using imprecise metaphors and deepen our understanding of it through nuanced conversation. This collection of essays provokes the kind of thoughtful consideration that will help us wrestle with AI’s complexity, develop a common language, create bridges between sectors and communities, and build practical solutions. We hope that you join us….(More)”.

COVID-19 from the Margins: What We Have Learned So Far


Blog by Silvia Masiero, Stefania Milan and Emiliano Treré: “Since the World Health Organisation declared the outbreak of COVID-19 a pandemic on 11 March 2020, narratives of the virus outbreak centred on counting and measuring have become dominant in public discourse. Enumerating and comparing cases and locations, victims or the progressive occupancy of intensive care units, policymakers and experts alike have turned data into the condition of existence of the first pandemic of the datafied society. However, many communities at the margins—from workers in the informal economy to low-income countries to victims of domestic violence—were left in the dark.

This is why our attention as researchers of datafication across the many Souths inhabiting the globe turned to the untold stories of the pandemic. We decided to make space for narratives from those individuals, communities, countries and regions that have thus far remained at the margins of global news reports and relief efforts. The multilingual blog COVID-19 from the Margins, launched on 4 May 2020, hosts stories of invisibility, including from migrants and communities living in countries and regions with limited statistical capacity or in cities and slums where pre-existing inequality and vulnerability have been augmented by the pandemic. As this initiative enters its third month, a reflection on the main threads that have emerged from the 28 articles published so far is in order as we look to the future. In what follows, we identify four threads that have informed discussions on this blog so far, namely data visualisation, perpetuated vulnerabilities and inequalities, datafied social policies, and digital activism at the time of the pandemic…(More)”.

The AI Powered State: What can we learn from China’s approach to public sector innovation?


Essay collection edited by Nesta: “China is striding ahead of the rest of the world in terms of its investment in artificial intelligence (AI), rate of experimentation and adoption, and breadth of applications. In 2017, China announced its aim of becoming the world leader in AI technology by 2030. AI innovation is now a key national priority, with central and local government spending on AI estimated to be in the tens of billions of dollars.

While Europe and the US are also following AI strategies designed to transform the public sector, there has been surprisingly little analysis of what practical lessons can be learnt from China’s use of AI in public services. Given China’s rapid progress in this area, it is important for the rest of the world to pay attention to developments in China if it wants to keep pace.

This essay collection finds that examining China’s experience of public sector innovation offers valuable insights for policymakers. Not everything is applicable to a western context – there are social, political and ethical concerns that arise from China’s use of new technologies in public services and governance – but there is still much that can be learned from its experience while also acknowledging what should be criticized and avoided….(More)”.

Data is Dangerous: Comparing the Risks that the United States, Canada and Germany See in Data Troves


Paper by Susan Ariel Aaronson: “Data and national security have a complex relationship. Data is essential to national defense — to understanding and countering adversaries. Data underpins many modern military tools from drones to artificial intelligence. Moreover, to protect their citizens, governments collect lots of data about their constituents. Those same datasets are vulnerable to theft, hacking, and misuse. In 2013, the Department of Defense’s research arm (DARPA) funded a study examining whether “the availability of data provide[s] a determined adversary with the tools necessary to inflict nation-state level damage.” The results were not made public. Given the risks to the data of their citizens, defense officials should be vociferous advocates for interoperable data protection rules.

This policy brief uses case studies to show that inadequate governance of personal data can also undermine national security. The case studies represent diverse internet sectors affecting netizens differently. I do not address malware or disinformation, which are also issues of data governance, but have already been widely researched by other scholars. I illuminate how policymakers, technologists, and the public were, and remain, unprepared for the ways spillovers from inadequate governance affect national security. I then make some specific recommendations about what we can do about this problem….(More)”.

Social Research in Times of Big Data: The Challenges of New Data Worlds and the Need for a Sociology of Social Research


Paper by Rainer Diaz-Bone et al: “The phenomenon of big data not only deeply affects current societies but also poses crucial challenges to social research. This article argues for moving towards a sociology of social research in order to characterize the new qualities of big data and its deficiencies. We draw on the neopragmatist approach of economics of convention (EC) as a conceptual basis for such a sociological perspective.

This framework suggests investigating processes of quantification in their interplay with orders of justifications and logics of evaluation. Methodological issues such as the question of the “quality of big data” must accordingly be discussed in their deep entanglement with epistemic values, institutional forms, and historical contexts and as necessarily implying political issues such as who controls and has access to data infrastructures. On this conceptual basis, the article uses the example of health to discuss the challenges of big data analysis for social research.

Phenomena such as the rise of new and massive privately owned data infrastructures, the economic valuation of huge amounts of connected data, or the movement of “quantified self” are presented as indications of a profound transformation compared to established forms of doing social research. Methodological and epistemological, but also institutional and political, strategies are presented to face the risk of being “outperformed” and “replaced” by big data analysis as it is already practiced in large US and Chinese Internet enterprises. In conclusion, we argue that the sketched developments have important implications both for research practices and methods teaching in the era of big data…(More)”.

Blockchain for the public good


Blog by Camille Crittenden: “Over the last year, I have had the privilege to lead the California Blockchain Working Group, which delivered its report to the Legislature in early July. Established by AB 2658, the 20-member Working Group comprised experts with backgrounds in computer science, cybersecurity, information technology, law, and policy. We were charged with drafting a working definition of blockchain, providing advice to State offices and agencies considering implementation of blockchain platforms, and offering guidance to policymakers to foster an open and equitable regulatory environment for the technology in California.

What did we learn? Enough to make a few outright recommendations as well as identify areas where further research is warranted.

A few guiding principles: Refine the application of blockchain systems first on things, not people. This could mean implementing blockchain for tracing food from farms to stores to reduce the economic and human harm of food-borne illnesses; for reducing paperwork and increasing the reliability of tracing vehicles and parts from the manufacturing floor to consumers, future owners or dismantlers; and for improving workflows for digitizing, cataloging and storing the reams of documents held in the State Archives.

Similarly, blockchain solutions could be implemented for public vital records, such as birth, death and marriage certificates or real estate titles without risk of compromising private information. Greater caution should be taken in applications that affect public service delivery to populations in precarious circumstances, such as the homeless or unemployed. Overarching problems to address, especially for sensitive records, include the need for reliable, persistent digital identification and the evolving requirements for cybersecurity….
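To make the tamper-evidence idea behind such tracing and record-keeping applications concrete, here is a minimal sketch of a hash-chained ledger in Python. It is a hypothetical illustration, not the Working Group's design: real blockchain deployments add distributed consensus, digital signatures, and access control, and all record names below are invented.

```python
import hashlib
import json
import time

def make_block(record: dict, prev_hash: str) -> dict:
    """Bundle a record with the hash of the block that precedes it."""
    block = {"timestamp": time.time(), "record": record, "prev_hash": prev_hash}
    # The block's own hash covers its contents *and* prev_hash, so
    # altering any earlier block invalidates every later one.
    block["hash"] = hashlib.sha256(
        json.dumps(block, sort_keys=True).encode()
    ).hexdigest()
    return block

def verify(chain: list[dict]) -> bool:
    """Recompute every hash; any tampering breaks the links."""
    for i, block in enumerate(chain):
        body = {k: v for k, v in block.items() if k != "hash"}
        expected = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        if block["hash"] != expected:
            return False
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True

# Example: tracing a lot of produce from farm to store (illustrative data).
chain = [make_block({"event": "harvested", "lot": "A-17"}, prev_hash="0" * 64)]
chain.append(make_block({"event": "shipped", "carrier": "truck-42"}, chain[-1]["hash"]))
chain.append(make_block({"event": "received", "store": "store-9"}, chain[-1]["hash"]))
assert verify(chain)
```

Because each hash depends on the one before it, a regulator or consumer can detect after-the-fact edits to any step in the chain, which is the property that makes blockchain attractive for provenance use cases like these.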

The Working Group’s final report, Blockchain in California: A Roadmap, avoids the magical thinking or technological solutionism that sometimes attends shiny new tech ideas. Blockchain won’t cure Covid-19, fix systemic racism, or reverse alarming unemployment trends. But if implemented conscientiously on a case-by-case basis, it could make a dent in improving health outcomes, increasing autonomy for property owners and consumers, and alleviating some bureaucratic practices that may be a drag on the economy. And those are contributions we can all welcome….(More)”.

Medical data has a silo problem. These models could help fix it.


Scott Khan at the WEF: “Every day, more and more data about our health is generated. If analyzed, this data could hold the key to unlocking cures for rare diseases, help us manage our health risk factors and provide evidence for public policy decisions. However, due to the highly sensitive nature of health data, much is out of reach to researchers, halting discovery and innovation. The problem is amplified further in the international context when governments naturally want to protect their citizens’ privacy and therefore restrict the movement of health data across international borders. To address this challenge, governments will need to pursue a special approach to policymaking that acknowledges new technology capabilities.

Understanding data silos

Data becomes siloed for a range of well-considered reasons, from restrictions on terms of use (e.g., commercial, non-commercial, disease-specific) to regulations imposed by governments (e.g., Safe Harbor, privacy) to an inability to obtain informed consent from historically marginalized populations.

Siloed data, however, also creates a range of problems for researchers looking to make that data useful to the general population. Silos, for example, block researchers from accessing the most up-to-date information or the most diverse, comprehensive datasets. They can slow the development of new treatments and curtail key findings that could lead to much-needed therapies or cures.

Even when these challenges are overcome, incidents of data misuse – where health data is used to explore non-health-related topics or without an individual’s consent – continue to erode public trust in the same research institutions that depend on such data to advance medical knowledge.

Solving this problem through technology

Technology designed to better protect and decentralize data is being developed to address many of these challenges. Techniques such as homomorphic encryption (a cryptosystem that allows computations to be performed on data while it remains encrypted) and differential privacy (a system for learning about a group without revealing details about individuals) both provide means to protect and centralize data while distributing the control of its use to the parties that steward the respective data sets.
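As a toy illustration of the differential-privacy idea (not an implementation from the article), the sketch below releases a patient count with Laplace noise calibrated to the query's sensitivity; the cohort, epsilon value, and function names are invented for the example.

```python
import numpy as np

def dp_count(has_condition: np.ndarray, epsilon: float) -> float:
    """Release a count with epsilon-differential privacy.

    A counting query has sensitivity 1 (adding or removing one person
    changes the result by at most 1), so Laplace noise with scale
    1/epsilon is enough to mask any individual's contribution.
    """
    true_count = int(has_condition.sum())
    return true_count + np.random.laplace(loc=0.0, scale=1.0 / epsilon)

# Example: report roughly how many of 1,000 patients have a condition
# without revealing whether any specific patient does.
cohort = np.random.rand(1000) < 0.1  # simulated yes/no flags
print(dp_count(cohort, epsilon=0.5))
```

Smaller epsilon values add more noise and give stronger privacy; the steward of the data set chooses that trade-off, which is the sense in which control stays with the party holding the data.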

Federated data leverages a special type of distributed database management system that can provide an alternative to centralizing encoded data, leaving the data sets in place rather than moving them across jurisdictions or between institutions. Such an approach can help connect data sources while accounting for privacy. To further forge trust in the system, a federated model can be implemented so that only encoded results are returned, preventing unauthorized distribution of the data and of the learnings produced by the research activity.
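A highly simplified sketch of that federated pattern, under the assumption that the shared statistic is a pooled mean: the computation travels to each institution and only aggregate summaries travel back, so row-level records never cross borders. Site names and values are illustrative.

```python
from dataclasses import dataclass

@dataclass
class SiteSummary:
    n: int        # number of local records
    total: float  # sum of the measured value

def local_summary(measurements: list[float]) -> SiteSummary:
    """Runs inside each institution; raw values never leave the site."""
    return SiteSummary(n=len(measurements), total=sum(measurements))

def federated_mean(summaries: list[SiteSummary]) -> float:
    """The coordinator sees only per-site aggregates, never records."""
    n = sum(s.n for s in summaries)
    return sum(s.total for s in summaries) / n

# Example: three hospitals contribute to a pooled average biomarker level.
site_a = local_summary([4.1, 3.8, 5.0])
site_b = local_summary([4.6, 4.4])
site_c = local_summary([3.9, 4.2, 4.8, 4.0])
print(federated_mean([site_a, site_b, site_c]))
```

Production systems layer the encoding and access controls discussed above on top of this pattern, but the division of labor (local computation, aggregate-only returns) is the core of the federated approach.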

To be sure, within every discussion of the analysis of aggregated data lie challenges of data fusion between data sets, between different studies, between data silos, and between institutions. Despite there being several data standards that could be used, most data exist within bespoke data models built for a single purpose rather than for the facilitation of data sharing and data fusion. Furthermore, even when data has been captured in a standardized data model (e.g., the Global Alliance for Genomics and Health offers some models for standardizing sensitive health data), many data sets are still narrowly defined. They often lack any shared identifiers to combine data from different sources into a coherent aggregate data source useful for research. Within a model of data centralization, data fusion can be addressed through data curation of each data set, whereas within a federated model, data fusion is much more vexing….(More)”.
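One commonly discussed mitigation for the missing-shared-identifier problem (an addition for illustration, not a proposal from the article) is privacy-preserving record linkage: each institution derives the same pseudonymous join key from quasi-identifiers using a shared secret, so data sets can be matched without exchanging raw identifiers. The sketch below is hypothetical and omits the salting, error-tolerant matching, and key governance a real deployment would require.

```python
import hashlib
import hmac

# Hypothetical secret agreed between institutions through a governance process.
SHARED_SECRET = b"agreed-between-institutions"

def pseudonym(name: str, birth_date: str) -> str:
    """Keyed hash of quasi-identifiers -> a stable, opaque join key."""
    msg = f"{name.lower().strip()}|{birth_date}".encode()
    return hmac.new(SHARED_SECRET, msg, hashlib.sha256).hexdigest()

# Both sites compute the same token for the same person...
assert pseudonym("Ada Lovelace", "1815-12-10") == pseudonym(" ada lovelace ", "1815-12-10")
# ...so their records can be joined on the token without sharing names.
```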

The European data market


European Commission: “The first European Data Market study (SMART 2013/0063), contracted by the European Commission in 2013, made a first attempt to provide facts and figures on the size and trends of the EU data economy by developing a European data market monitoring tool.

The final report of the updated European Data Market (EDM) study (SMART 2016/0063) now presents in detail the results of the final round of measurement of the updated European Data Market Monitoring Tool contracted for the 2017-2020 period.

Designed along a modular structure, as a first pillar of the study, the European Data Market Monitoring Tool is built around a core set of quantitative indicators to provide a series of assessments of the emerging market of data at present, i.e. for the years 2018 through 2020, and with projections to 2025.

The key areas covered by the indicators measured in this report are:

  • The data professionals and the balance between demand and supply of data skills;
  • The data companies and their revenues;
  • The data user companies and their spending on data technologies;
  • The market of digital products and services (“Data market”);
  • The data economy and its impacts on the European economy;
  • Forecast scenarios for all the indicators, based on alternative market trajectories.

Additionally, as a second major work stream, the study also presents a series of descriptive stories providing a complementary view to the one offered by the Monitoring Tool (for example, “How Big Data is driving AI” or “The Secondary Use of Health Data and Data-driven Innovation in the European Healthcare Industry”), adding fresh, real-life information around the quantitative indicators. By focusing on specific issues and aspects of the data market, the stories offer an initial, indicative “catalogue” of good practices of what is happening in the data economy today in Europe and what is likely to affect the development of the EU data economy in the medium term.

Finally, as a third work stream of the study, a landscaping exercise on the EU data ecosystem was carried out, together with some community-building activities to bring stakeholders together from all segments of the data value chain. The map containing the results of the landscaping of the EU data economy, as well as reports from the webinars organised by the study, is available on the www.datalandscape.eu website….(More)”.

The National Cancer Institute Cancer Moonshot Public Access and Data Sharing Policy—Initial assessment and implications


Paper by Tammy M. Frisby and Jorge L. Contreras: “Since 2013, federal research-funding agencies have been required to develop and implement broad data sharing policies. Yet agencies today continue to grapple with the mechanisms necessary to enable the sharing of a wide range of data types, from genomic and other -omics data to clinical and pharmacological data to survey and qualitative data. In 2016, the National Cancer Institute (NCI) launched the ambitious $1.8 billion Cancer Moonshot Program, which included a new Public Access and Data Sharing (PADS) Policy applicable to funding applications submitted on or after October 1, 2017. The PADS Policy encourages the immediate public release of published research results and data and requires all Cancer Moonshot grant applicants to submit a PADS plan describing how they will meet these goals. We reviewed the PADS plans submitted with approximately half of all funded Cancer Moonshot grant applications in fiscal year 2018, and found that a majority did not address one or more elements required by the PADS Policy. Many such plans made no reference to the PADS Policy at all, and several referenced obsolete or outdated National Institutes of Health (NIH) policies instead. We believe that these omissions arose from a combination of insufficient education and outreach by NCI concerning its PADS Policy, both to potential grant applicants and among NCI’s program staff and external grant reviewers. We recommend that other research funding agencies heed these findings as they develop and roll out new data sharing policies….(More)”.

The Computermen


Podcast Episode by Jill Lepore: “In 1966, just as the foundations of the Internet were being imagined, the federal government considered building a National Data Center. It would be a centralized federal facility to hold computer records from each federal agency, in the same way that the Library of Congress holds books and the National Archives holds manuscripts. Proponents argued that it would help regulate and compile the vast quantities of data the government was collecting. Quickly, though, fears about privacy, government conspiracies, and government ineptitude buried the idea. But now, that National Data Center looks like a missed opportunity to create rules about data and privacy before the Internet took off. And in the absence of government action, corporations have made those rules themselves….(More)”.