The Open-Source Movement Comes to Medical Datasets


Blog by Edmund L. Andrews: “In a move to democratize research on artificial intelligence and medicine, Stanford’s Center for Artificial Intelligence in Medicine and Imaging (AIMI) is dramatically expanding what is already the world’s largest free repository of AI-ready annotated medical imaging datasets.

Artificial intelligence has become an increasingly pervasive tool for interpreting medical images, from detecting tumors in mammograms and brain scans to analyzing ultrasound videos of a person’s pumping heart.

Many AI-powered devices now rival the accuracy of human doctors. Beyond simply spotting a likely tumor or bone fracture, some systems predict the course of a patient’s illness and make recommendations.

But AI tools have to be trained on expensive datasets of images that have been meticulously annotated by human experts. Because those datasets can cost millions of dollars to acquire or create, much of the research is being funded by big corporations that don’t necessarily share their data with the public.

“What drives this technology, whether you’re a surgeon or an obstetrician, is data,” says Matthew Lungren, co-director of AIMI and an assistant professor of radiology at Stanford. “We want to double down on the idea that medical data is a public good, and that it should be open to the talents of researchers anywhere in the world.”

Launched two years ago, AIMI has already acquired annotated datasets for more than 1 million images, many of them from the Stanford University Medical Center. Researchers can download those datasets at no cost and use them to train AI models that recommend certain kinds of action.

Now, AIMI has teamed up with Microsoft’s AI for Health program to launch a new platform that will be more automated, accessible, and visible. It will be capable of hosting and organizing scores of additional images from institutions around the world. Part of the idea is to create an open and global repository. The platform will also provide a hub for sharing research, making it easier to refine different models and identify differences between population groups. The platform can even offer cloud-based computing power so researchers don’t have to worry about building local, resource-intensive clinical machine-learning infrastructure….(More)”.

The “Onion Model”: A Layered Approach to Documenting How the Third Wave of Open Data Can Provide Societal Value


Blog post by Andrew Zahuranec, Andrew Young and Stefaan Verhulst: “There’s a lot that goes into data-driven decision-making. Behind the datasets, platforms, and analysts is a complex series of processes that inform what kinds of insight data can produce and what kinds of ends it can achieve. These individual processes can be hard to understand when viewed together but, by separating the stages out, we can not only track how data leads to decisions but promote better and more impactful data management.

Earlier this year, The Open Data Policy Lab published the Third Wave of Open Data Toolkit to explore the elements of data re-use. At the center of this toolkit was an abstraction that we call the Open Data Framework. Divided into individual, onion-like layers, the framework shows all the processes that go into capitalizing on data in the third wave, starting with the creation of a dataset through data collaboration, creating insights, and using those insights to produce value.

This blog tries to re-iterate what’s included in each layer of this data “onion model” and demonstrate how organizations can create societal value by making their data available for re-use by other parties….(More)”.

Why I’m a proud solutionist


Blog by Jason Crawford: “Debates about technology and progress are often framed in terms of “optimism” vs. “pessimism.” For instance, Steven Pinker, Matt Ridley, Johan Norberg, Max Roser, and the late Hans Rosling have been called the “New Optimists” for their focus on the economic, scientific, and social progress of the last two centuries. Their opponents, such as David Runciman and Jason Hickel, accuse them of being blind to real problems in the world, such as poverty, and to risks of catastrophe, such as nuclear war.

Economic historian Robert Gordon calls himself “the prophet of pessimism.” His book The Rise and Fall of American Growth warned that the days of high economic growth are over for the United States and will not return. Gordon’s opponents include a group he calls the “techno-optimists,” such as Andrew McAfee and Erik Brynjolfsson, who have predicted a growth spurt in productivity from information technology.

It’s tempting to choose sides. But while it can be rational to be optimistic or pessimistic on any specific question, these terms are too imprecise to be adopted as a general intellectual identity. Those who identify as optimists can be too quick to dismiss or downplay the problems of technology, while self-styled technology pessimists or progress skeptics can be too reluctant to believe in solutions.

As we look forward to the post-pandemic recovery, once again we’re being tugged between the optimists, who highlight all the diseases that may soon be beaten through new vaccines, and the pessimists, who warn that humanity will never win the evolutionary arms race against microbes. But this represents a false choice. History provides us with powerful examples of people who were brutally honest in identifying a crisis but were equally active in seeking solutions.

At the end of the 19th century, William Crookes—physicist, chemist, and inventor of the Crookes tube (an early type of vacuum tube)—was the president of the British Association for the Advancement of Science. On September 7, 1898, he used the traditional annual address to the association to issue a dire warning.

The British Isles, he said, were at grave risk of running out of food. His reasoning was simple: the population was growing exponentially, but the amount of land under cultivation could not keep pace. The only way to continue to increase production was to improve crop yields. But the limiting factor on yields was the availability of nitrogen fertilizer, and the sources of nitrogen, such as the rock salts of the Chilean desert and the guano deposits of the Peruvian islands, were running out. His argument was detailed and comprehensive, based on figures for wheat production and land availability from every major European country and colony; he apologized in advance for boring his audience with statistics….(More)”.

In Need of Speed: Data can Accelerate Progress Towards Water and Sanitation for All


Article by Joakim Harlin et al: “Even before COVID-19, the world was off-track to meet Sustainable Development Goal (SDG) 6 – ensuring water and sanitation for all by 2030.

The latest data, which is provided in seven SDG indicators reports published today by the UN-Water Integrated Monitoring Initiative for SDG 6 (IMI-SDG6), show us that 2 billion people worldwide still live without safely managed drinking water and 3.6 billion without safely managed sanitation. In addition, 2.3 billion people lack a basic handwashing facility with soap and water at home. Most wastewater is returned to nature untreated. One in five of the world’s river basins are experiencing rapid changes, such as flooding or drought with increased frequency and intensity, and 80% of wetland ecosystems are already lost….

We can only sustainably manage what we measure, and right now, there are too many gaps in the data, despite unprecedented, heroic levels of reporting during the chaos of the pandemic.

Last year, the IMI-SDG6 combined the efforts of WHO, UNICEF, UN-Habitat, UNEP, FAO, UNECE and UNESCO (as custodian agencies of the various SDG 6 global indicators) to reach out to countries with requests for data: this was our ‘2020 Data Drive.’

COVID-19 caused extreme difficulties for the SDG 6 national focal points in every country, with people forced to work from home with little equipment, few in-person consultations, and many data collection activities cancelled. Under the circumstances, the focal points made a remarkable effort. On average, UN Member States now have data on 8.2 out of 12 indicators (up from 7.0 in 2019), and the number reporting on nine or more indicators increased from 37 in 2019 to 92.

Despite this significant progress, large data gaps remain for some indicators, typically those that rely on in situ monitoring networks, such as water quality and aquifers. For example, many countries base their ambient water quality reporting on relatively few measurements; the poorest 20 countries reported on only 1,000 water bodies in total, whereas the richest 24 reported on nearly 60,000. Addressing these issues is a long-term, capital-intensive effort.

Our country monitoring focal points know better than anyone about the benefits and costs of robust water and sanitation monitoring systems, and the urgent need to establish them. We encourage high-level officials in national ministries to listen to what the focal points have to say. And, as we continue our capacity-building activities in countries, we also call on development partners to support this work. We call on academia, the private sector, and civil society to contribute to the joint effort by bringing their water and sanitation datasets to the table. …(More)”

“We do not feel safe”: A Kabul-based crisis alert app struggles to protect its own employees


Q and A with Sara Wahedi by Hajira Maryam: “Ehtesab, a Kabul-based startup, emerged out of a personal security-related incident that Sara Wahedi, a former Afghan government employee, experienced in May 2018. After witnessing a suicide bomb attack firsthand, Wahedi rushed home, where she could see militants roaming the streets from her balcony. The city was put on lockdown for 12 hours and left without electricity. No one, Wahedi said, knew when the electricity would be restored or when roads would be cleared. The authorities were of little help. 

“Since that moment, I kept pondering about the idea of accountability and information provision. I jotted down a few words in different languages for accountability, namely Dari and Pashto. That was the moment the term Ehtesab came to my mind.” 

Ehtesab means “accountability” in Dari and Pashto, and the app, formally launched in March 2020, offers streamlined security-related information, including general security updates in Kabul to its users. With real-time, crowdsourced alerts, users across the city can track bomb blasts, roadblocks, electricity outages, or other problems in locations close to them. The app, which generates push notifications about nearby security risks, is supported by 20 employees working out of the company’s Kabul office, according to Wahedi. 

Despite the company’s single-minded focus on security, the Ehtesab team was caught off-guard by the sudden collapse of the Afghan government over the weekend. “It was inevitable that there would be a significant shift in governance … but we weren’t expecting the Taliban to come in within the first eight hours of the day,” Wahedi said….(More)”.
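
The excerpt describes push notifications about security risks "close to" a user. As a rough illustration of how such a proximity filter could work, here is a minimal sketch. It is not Ehtesab's actual implementation: the `Alert` type, the `nearby_alerts` function, the 2 km radius, and the sample coordinates are all hypothetical, and the distance is computed with the standard haversine formula.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


@dataclass
class Alert:
    """Hypothetical crowdsourced alert; field names are assumptions."""
    kind: str
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby_alerts(alerts, user_lat, user_lon, radius_km=2.0):
    """Return the alerts within radius_km of the user (radius is an assumed default)."""
    return [
        a for a in alerts
        if haversine_km(user_lat, user_lon, a.lat, a.lon) <= radius_km
    ]


# Illustrative data only: made-up alerts around central Kabul.
sample_alerts = [
    Alert("roadblock", 34.5560, 69.2080),
    Alert("power outage", 34.7000, 69.2000),
]
close_by = nearby_alerts(sample_alerts, 34.5553, 69.2075)
```

A real service would likely also weigh alert age, verification status, and user preferences before sending a notification; this sketch covers only the distance check.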

Satellite Earth observation for sustainable rural development


A blog post by Peter Hargreaves: “…We find ourselves in a “golden age for satellite exploration”. ‘Big Data’ from satellite Earth observation – hereafter denoted ‘EO’ – could be an important part of the solution to the shortage of socioeconomic data required to inform several of the goals and targets that compose the United Nations (UN) Sustainable Development Goals (SDGs). In particular, the goals that pertain to socioeconomic and human wellbeing dimensions of development. EO data could play a significant role in producing the transparent data system necessary to achieve sustainable development….

Census and nationally representative household surveys are the medium through which most socioeconomic data are collected. It is impossible to understand socioeconomic conditions without them – I cannot stress this enough. But they have limitations, particularly in terms of cost and spatio-temporal coverage. In an ideal world, we would vastly upscale the spatial and temporal reporting of these surveys to cover more places and points in time. But this mass enumeration would be prohibitively expensive and *logistically impossible*. Imagine the quantity of data produced and the burden placed upon National Statistics Offices (NSOs) and governmental institutions? The 2030 end point for the SDGs would be upon us before much of the data was processed leaving very little time to use the outputs for policy.

This is where unconventional data enters the debate, and in this sphere – that of measuring socioeconomic conditions for development – EO data is unconventional. EO data has considerable potential to augment survey and census data for measuring rural poverty development in rural spaces, especially during intercensal periods, and where ground data are patchy, or non-existent. While on the subject, there is an important point to make: you can’t use EO to understand everything about a particular context. It does not matter how elaborate the model or the effort put in. Quite simply, EO cannot give you the full picture.

What EO *does* have is a five-decade temporal legacy (most platforms and data products are near continuous), and it’s broadly open access with low to negligible acquisition costs. EO data is also available across multiple spatial resolutions and is often easily comparable and complementary. When we say, ‘five-decade temporal legacy’, this means that there are roughly 50 years of EO data (if we use the Landsat program as an anchor). Not all EO platforms have operated across the whole timeline – Figure 1 below offers an idea of when different platforms were launched and for how long they were, or have been, operational. What’s more, data will be increasingly available and accessible, catalysed by technological innovation and investment in public and private ventures. A lot of this data is open access, e.g. EO platforms operated by NASA or the ESA Copernicus programme, which include Landsat, MODIS, AVHRR, VIIRS, and the Sentinels amongst others. Meanwhile, the availability of EO data across multiple spatial resolutions enables disaggregation of data alongside survey and census data for subnational monitoring of socioeconomic conditions….(More)”.

The controversy over the term ‘citizen science’


CBC News: “The term citizen science has been around for decades. Its original definition, coined in the 1990s, refers to institution-guided projects that invite the public to contribute to scientific knowledge in all kinds of ways, from the cataloguing of plants, animals and insects in people’s backyards to watching space.

Anyone is invited to participate in citizen science, regardless of whether they have an academic background in the sciences, and every year these projects number in the thousands. 

Recently, however, some large institutions, scientists and community members have proposed replacing the term citizen science with “community science.” 

Those in favour of the terminology change — such as eBird, one of the world’s largest biodiversity databases — say they want to avoid using the word citizen. They do so because they want to be “welcoming to any birder or person who wants to learn more about bird watching, regardless of their citizen status,” said Lynn Fuller, an eBird spokesperson, in a news release earlier this year. 

Some argue that while the intention is valid, the term community science already holds another definition — namely projects that gather different groups of people around environmental justice focused on social action. 

To add to the confusion, renaming citizen science could impact policies and legislation that have been established in countries such as the U.S. and Canada to support projects and efforts in favour of citizen science. 

For example, if we suddenly decided to call all species of birds “waterbirds,” then the specific meaning of this category of bird species that lives on or around water would eventually be lost. This would, in turn, make communication between people and the various fields of science incredibly difficult. 

A paper published in Science magazine last month pointed out some of the reasons why rebranding citizen science in the name of inclusion could backfire. 

Caren Cooper, a professor of forestry and environmental resources at North Carolina State University and one of the authors of the paper, said that the term citizen science didn’t originally mean to imply that people should have a certain citizenship status to participate in such projects. 

Rather, citizen science is meant to convey the idea of responsibilities and rights to access science. 

She said there are other terms being used to describe this meaning, including “public science, participatory science [and] civic science.”

Chris Hawn, a professor of geography and environmental systems at the University of Maryland Baltimore County and one of Cooper’s co-authors, said that being aware of the need for change is a good first step, but any decision to rename should be made carefully….(More)”.

What is the difference between current awareness and horizon scanning?


“An informed perspective is more important than ever in order to anticipate what comes next and succeed in emerging futures”. HBR, October 16, 2015

Article by Clare Brown: “Legal professionals are busy people. They are concerned with doing the best they can for their clients and making sure that their business runs smoothly. Trend spotting or horizon scanning isn’t necessarily at the top of their daily “to do” lists but if they want to grow the firm effectively, everyone – from trainee to managing partner – needs to anticipate future events. 

The best way information people can help to do this is to first understand how everything fits together. We need to look at the difference between current awareness and horizon scanning – and put them both into a wider strategic context. When we present our management teams with evidence that they need automated current awareness, we should also be dazzling them with future information possibilities. 

…The answer might lie in a strategic and collaborative form of foresight, or as Kerstin E. Cuhls defines it, “a systematic debate of complex futures”. Large corporations, governments and intergovernmental organisations have used various methods to use information in their efforts to predict all possible outcomes. For instance, 

Georghiou (2007) reported that foresight activities have been conducted in conjunction with NIS in the USA, Canada, UK, Germany, The Netherlands, Austria, Russia, Australia, New Zealand, Colombia, India, South Korea, Kazakhstan, Taiwan, Malaysia, Egypt, Morocco, South Africa and other countries. In Germany, the Fraunhofer Society has taken the lead in progressively applying foresight not only in NIS but also in the preparation of strategic scenarios at the corporate level (Cuhls, 2015). (Yuichi Washida and Akihisa Yahata, “Predictive value of horizon scanning for future scenarios” in Foresight, 3 February 2021)

The excellent article on horizon scanning I mentioned above explains how they attempt it. In essence, it involves literature searches, conversations, taking a broad view, and being open to any and all possibilities:

  • Structured: it is a systematic approach by applying methods of futures research, science-based, and based on new theories of futures research
  • Debate: it includes interaction of relevant actors, active preparation for the future or different futures, and orientation towards shaping the future
  • Complex: it includes the consideration of systemic interdependencies, takes a holistic view
  • Futures is plural: it is an open view on different paths into the future with thinking in alternatives. We also envisage different types of futures, in futures research we differentiate between possible, probable and preferable futures…(More)”.

Proposal for a European Interoperability Framework for Smart Cities and Communities (EIF4SCC) published


Article by Nóirín Ní Earcáin: “In recognition of the importance of interoperability and the specific challenges it presents in a city context, the Commission (DG DIGIT and DG CONNECT) appointed Deloitte and KU Leuven to prepare a Proposal for a European Interoperability Framework for Smart Cities and Communities. While an EIF for eGovernment has been in place since 2010, this is the first time the concepts and ideas developed there have been adapted to the local context.

The aim of the EIF4SCC is to provide EU local administration leaders with definitions, principles, recommendations, practical use cases drawn from cities and communities from around Europe and beyond, and a common model to facilitate delivery of services to the public across domains, cities, regions and borders.

The framework was developed by building on and finding complementarities with previous and ongoing initiatives, such as the Living-in.EU movement, the 2017 European Interoperability Framework (EIF), the Minimal Interoperability Mechanisms (MIMs Plus) and the outcomes of EU funded initiatives (e.g. Connecting Europe Facility (CEF) Digital Building Blocks, Smart Cities Marketplace, Intelligent Cities Challenge, Digital Transition Partnership under the Urban Agenda) and EU funded projects (Synchronicity, Triangulum, etc.).

Why do cities and communities need interoperability?

The EIF4SCC is targeted at EU local administration leaders and aims to provide a generic framework of interoperability of all types, and how it can contribute to the development of a Smart(er) City/Community. This will pave the way for services for citizens and business to be offered not only in a single city, but also across cities, regions and across borders.

European Interoperability Framework for Smart Cities and Communities

The EIF4SCC includes three concepts (interoperability, smart city or community, EIF4SCC), five principles (drawing on the Living-in.EU declaration), and seven elements (consisting of the five components of interoperability, one cross-cutting layer – Integrated Service Governance, and a foundational layer of Interoperability Governance)….The European Commission encourages local administrations at regional, city and community level to review the Proposed EIF4SCC, and the accompanying Final Study Report which details the methodology, literature review, and stakeholder engagement process undertaken. It will be discussed through the Living-in.EU community and other fora, with a view to its adoption as an official Commission document, based on users’ and stakeholders’ feedback…(More)”.

An Obsolete Paradigm


Blogpost by Paul Wormelli: “…Our national system of describing the extent of crime in the U.S. is broken beyond repair and deserves to be replaced by a totally new paradigm (system). 

Since 1930, we have relied on the metrics generated by the Uniform Crime Reporting (UCR) Program to describe crime in the U.S., but it simply does not do so, even with its evolution into the National Incident-Based Reporting System (NIBRS). Criminologists have long recognized the limited scope of the UCR summary crime data, leading to the creation of the National Crime Victimization Survey (NCVS) and other supplementary crime data measurement vehicles. However, despite these measures, the United States still has no comprehensive national data on the amount of crime that has occurred. Even after decades of collecting data, the 1968 Presidential Crime Commission report on the Challenge of Crime in a Free Society lamented the absence of sound and complete data on crime in the U.S., and called for the creation of a National Crime Survey (NCS) that eventually led to the creation of the NCVS. Since then, we have slowly attempted to make improvements that will lead to more robust data. Only in 2021 did the FBI end UCR summary-based crime data collection and move to NIBRS crime data collection on a national scale.

Admittedly, the shift to NIBRS will unleash a sea change in how we analyze crime data and use it for decision making. However, it still lacks the completeness of national crime reporting. In the landmark study of the National Academy of Sciences Committee on Statistics (funded by the FBI and the Bureau of Justice Statistics to make recommendations on modernizing crime statistics), the panel members grappled with this reality and called out the absence of national statistics on crime that would fully inform policymaking on this critical subject….(More)”