Five Conjectures to Explore in 2023 as They Relate to Data for Good

Essay by Hannah Chafetz, Uma Kalkar, Marine Ragnet, Stefaan Verhulst: “From the regulations proposed in the European Artificial Intelligence (AI) Act to the launch of OpenAI’s ChatGPT tool, 2022 was a year that saw many policy and technological developments. Taking stock of recent data and technology trends, we offer some conjectures as to how these ideas may play out over the next year. Indeed, predictions can be dangerous, which is why we position the below as conjectures — propositions that remain tentative till more evidence emerges — that can help advance the agenda and direction of responsible use of data for the public good focus areas.

Below, we provide a summary of the five conjectures that The GovLab will track and revisit throughout 2023.

Conjecture 1. In 2023 … non-traditional data may be used with increasing frequency to solve public problems.

Complex crises, from COVID-19 to climate change, demonstrate a need for information about a variety of developments quickly and at scale. Traditional sources are not enough: growing awareness and (re)use of non-traditional data sources (NTD) to fill the gaps in traditional data cast a spotlight on the value of using and combining new data sources for problem-solving. Over the next year, NTD sources could increasingly be called upon by decision-makers to address large-scale public problems.

NTD refers to data that is “digitally captured (for example, mobile phone records and financial data), mediated (for example, social media and online data), or observed (for example, satellite imagery),” using new instrumentation mechanisms and is often privately held. Our recent report discussed how COVID-19 was a “watershed moment” in terms of generating access to non-traditional health, mobility, economic, and sentiment data. As detailed in the report, decision-makers around the world increasingly recognize the potential of NTD sources when combined with traditional data responsibly. Similarly, developments in the war in Ukraine presented a pivotal moment regarding the use of NTD sources. For instance, satellite images, social media narrative trends, and real-time location mapping have supported humanitarian action and peacebuilding.

These are just two examples of the increasing interest in NTD to solve public problems. We predict that this trend could continue to expand as technological advances continue to make non-traditional data more widely available to decision-makers. Already, the financial sector is increasingly incorporating non-traditional data to inform decisions such as assessing lending risks, for example. Recently, the fintech business Nova Credit and HSBC partnered together to exploit cross-border data to allow immigrants access to credit by predicting creditworthiness via digital footprint and psychometric data. This trend is compounded by increased legislation aiming to open up the re-use of private sector data, particularly in Europe. The increased attention to NTD sources signals a need to prioritize the alignment of the supply and demand of NTD and develop a systematized approach to how it can be integrated within decision-making cycles…(More)”.

Explore the first Open Science Indicators dataset

Article by Lauren Cadwallader, Lindsay Morton, and Iain Hrynaszkiewicz: “Open Science is on the rise. We can infer as much from the proliferation of Open Access publishing options; the steady upward trend in bioRxiv postings; the periodic rollout of new national, institutional, or funder policies. 

But what do we actually know about the day-to-day realities of Open Science practice? What are the norms? How do they vary across different research subject areas and regions? Are Open Science practices shifting over time? Where might the next opportunity lie and where do barriers to adoption persist? 

To even begin exploring these questions and others like them we need to establish a shared understanding of how we define and measure Open Science practices. We also need to understand the current state of adoption in order to track progress over time. That’s where the Open Science Indicators project comes in. PLOS conceptualized a framework for measuring Open Science practices according to the FAIR principles, and partnered with DataSeer to develop a set of numerical “indicators” linked to specific Open Science characteristics and behaviors observable in published research articles. Our very first dataset, now available for download at Figshare, focuses on three Open Science practices: data sharing, code sharing, and preprint posting…(More)”.

Seemingly contrasting disciplines

Blog by Andreas Pawelke: “Development organizations increasingly embrace systems thinking (and portfolio approaches) in tackling complex challenges.

At the same time, there is a growing supply of (novel) data sources and analytical methods available to the development sector.

Little evidence exists, however, of these two seemingly contrasting disciplines being combined by development practitioners for systems transformation, and little progress has been made since 2019, when Thea Snow called for system thinkers and data scientists to work together.

This is not to say that system thinkers disregard data in their work. A range of data types is used, in particular the thick, rich, qualitative data from observations, deep listening and micro-narratives. And already back in 2013, MIT researchers organized an entire conference around big data and systems thinking.

When it comes to the use of non-traditional data in the work of system innovators in international development, however, there seems to be little in terms of examples and experiences.

Enhancing system innovation?

Is there a (bigger) role to play for non-traditional data in the systems work of development organizations?

Let’s start with definitions:

A system is an interconnected set of elements that form a unified whole or serve a function.

Systems thinking is about recognizing and taking into account the complexity of the world while trying to understand how the elements of a system are interconnected and how they influence each other.

System innovation emphasizes the act of changing (shifting) systems through innovations to a system (transformation), not within a system (improvement).

Non-traditional data refers to data that is digitally captured, mediated or observed. Such data is often (but not always) unstructured, big and used as proxies for purposes unrelated to its initial collection. We’re talking about the large quantities of digital data generated from our digital interactions and transactions but also (more or less) novel sources like satellites and drones that generate data that is readily available at large spatial and temporal scales.

There are at least three ways in which non-traditional data could be used to enhance the practice of system innovation in the development sector:

  1. Observe: gain a better understanding of a system
  2. Shift: identify entry points of interventions and model potential outcomes
  3. Learn: measure and observe changes in a system over time…(More)”

Smart OCR – Advancing the Use of Artificial Intelligence with Open Data

Article by Parth Jain, Abhinay Mannepalli, Raj Parikh, and Jim Samuel: “Optical character recognition (OCR) is growing at a projected compounded annual growth rate (CAGR) of 16%, and is expected to have a value of 39.7 billion USD by 2030, as estimated by Straits research. There has been a growing interest in OCR technologies over the past decade. Optical character recognition is the technological process for transforming images of typed, handwritten, scanned, or printed texts into machine-encoded and machine-readable texts (Tappert, et al., 1990). OCR can be used with a broad range of image or scan formats – for example, these could be in the form of a scanned document such as a .pdf file, a picture of a piece of paper in .png or .jpeg format, or images with embedded text, such as characters on a coffee cup, title on the cover page of a book, the license number on vehicular plates, and images of code on websites. OCR has proven to be a valuable technological process for tackling the important challenge of transforming non-machine-readable data into machine readable data. This enables the use of natural language processing and computational methods on information-rich data which were previously largely non-processable. Given the broad array of scanned and image documents in open government data and other open data sources, OCR holds tremendous promise for value generation with open data.

Open data has been defined as “being data that is made freely available for open consumption, at no direct cost to the public, which can be efficiently located, filtered, downloaded, processed, shared, and reused without any significant restrictions on associated derivatives, use, and reuse” (Chidipothu et al., 2022). Large segments of open data contain images, visuals, scans, and other non-machine-readable content. The size and complexity associated with the manual analysis of such content is prohibitive. The most efficient way would be to establish standardized processes for transforming documents into their OCR output versions. Such machine-readable text could then be analyzed using a range of NLP methods. Artificial Intelligence (AI) can be viewed as being a “set of technologies that mimic the functions and expressions of human intelligence, specifically cognition and logic” (Samuel, 2021). OCR was one of the earliest AI technologies implemented. The first ever optical reader to identify handwritten numerals was the advanced reading machine “IBM 1287,” presented at the World Fair in New York in 1965 (Mori, et al., 1990). The value of open data is well established – however, the extent of usefulness of open data is dependent on “accessibility, machine readability, quality” and the degree to which data can be processed by using analytical and NLP methods (John, et al., 2022)…(More)”

The rise and fall of peer review

Blog by Adam Mastroianni: “For the last 60 years or so, science has been running an experiment on itself. The experimental design wasn’t great; there was no randomization and no control group. Nobody was in charge, exactly, and nobody was really taking consistent measurements. And yet it was the most massive experiment ever run, and it included every scientist on Earth.

Most of those folks didn’t even realize they were in an experiment. Many of them, including me, weren’t born when the experiment started. If we had noticed what was going on, maybe we would have demanded a basic level of scientific rigor. Maybe nobody objected because the hypothesis seemed so obviously true: science will be better off if we have someone check every paper and reject the ones that don’t pass muster. They called it “peer review.”

This was a massive change. From antiquity to modernity, scientists wrote letters and circulated monographs, and the main barriers stopping them from communicating their findings were the cost of paper, postage, or a printing press, or on rare occasions, the cost of a visit from the Catholic Church. Scientific journals appeared in the 1600s, but they operated more like magazines or newsletters, and their processes of picking articles ranged from “we print whatever we get” to “the editor asks his friend what he thinks” to “the whole society votes.” Sometimes journals couldn’t get enough papers to publish, so editors had to go around begging their friends to submit manuscripts, or fill the space themselves. Scientific publishing remained a hodgepodge for centuries.

(Only one of Einstein’s papers was ever peer-reviewed, by the way, and he was so surprised and upset that he published his paper in a different journal instead.)

That all changed after World War II. Governments poured funding into research, and they convened “peer reviewers” to ensure they weren’t wasting their money on foolish proposals. That funding turned into a deluge of papers, and journals that previously struggled to fill their pages now struggled to pick which articles to print. Reviewing papers before publication, which was “quite rare” until the 1960s, became much more common. Then it became universal.

Now pretty much every journal uses outside experts to vet papers, and papers that don’t please reviewers get rejected. You can still write to your friends about your findings, but hiring committees and grant agencies act as if the only science that exists is the stuff published in peer-reviewed journals. This is the grand experiment we’ve been running for six decades.

The results are in. It failed…(More)”.

Learnings on the Importance of Youth Engagement

Blog by Anna Ibru and Dane Gambrell at The GovLab: “…In recent years, public institutions around the world are piloting new youth engagement initiatives like Creamos that tap the expertise and experiences of young people to develop projects, programs, and policies and address complex social challenges within communities.

To learn from and scale best practices from international models of youth engagement, The GovLab has developed case studies about three path-breaking initiatives: Nuortenbudjetti, Helsinki’s participatory budgeting initiative for youth; Forum Jove BCN, Barcelona’s youth-led citizens’ assembly; and Creamos, an open innovation and coaching program for young social innovators in Chile. For government decision makers and institutions who are looking to engage and empower young people to get involved in their communities, develop real-world solutions, and strengthen democracy, these examples describe these initiatives and their outcomes along with guidance on how to design and replicate such projects in your community. Young people are still a widely untapped resource who are too often left out in policy and program design. The United Nations affirms that it is impossible to meet the UN SDGs by 2030 without active participation of the 1.8 billion youth in the world. Government decision makers and institutions must capitalize on the opportunity to engage and empower young people. The successes of Nuortenbudjetti, Forum Jove BCN, and Creamos provide a roadmap for policymakers looking to engage in this space….(More)” See also: Nuortenbudjetti: Helsinki’s Youth Budget; Creamos: Co-creating youth-led social innovation projects in Chile; and Forum Jove BCN: Barcelona’s Youth Forum.

Can citizen deliberation address the climate crisis? Not if it is disconnected from politics and policymaking

Blog by John Boswell, Rikki Dean and Graham Smith: “..Modelled on the deliberative democratic ideal, much of the attention on climate assemblies focuses on their internal features. The emphasis is on their novelty in providing respite from the partisan bickering of politics-as-usual, instead creating space for the respectful free and fair exchange of reasons.

On these grounds, the Global Citizens’ Assembly in 2021 and experimental ‘wave’ of climate assemblies across European countries are promising. Participating citizens have demonstrated they can grapple with complex information, deliberate respectfully, and come to a well thought-through set of recommendations that are – every time – more progressive than current climate policies.

But, before we get carried away with this enthusiasm, it is important to focus on a fundamental point usually glossed over. Assemblies are too often talked about in magical terms, as if by their moral weight alone citizen recommendations will win the day through the forceless force of their arguments. But this expectation is naive.

Designing for impact requires much more attention to the nitty-gritty of how policy actually gets made. That means taking seriously the technical uncertainties and complexities associated with policy interventions, and confronting the political challenges and trade-offs required in balancing priorities in the shadow of powerful interests.

In a recent study, we have examined the first six national climate assemblies – in Ireland, France, the UK, Scotland, Germany and Denmark – to see how they tried to achieve impact. Our novel approach is to take the focus away from their (very similar) ‘internal design characteristics’ – such as random selection – and instead put it on their ‘integrative design characteristics’…(More)”.

Bridging Data Gaps Can Help Tackle the Climate Crisis

Article by Bo Li and Bert Kroese: “A famous physicist once said: “When you can measure what you are speaking about, and express it in numbers, you know something about it”.

Nearly 140 years later, this maxim remains true and is particularly poignant for policymakers tasked with addressing climate mitigation and adaptation.

That’s because they face major information gaps that impede their ability to understand the impact of policies—from measures to incentivize cuts in emissions, to regulations that reduce physical risks and boost resilience to climate shocks. And without comprehensive and internationally comparable data to monitor progress, it’s impossible to know what works, and where course corrections are needed.

This underscores the importance of the support of G20 leaders for a new Data Gaps Initiative to make official statistics more detailed and timely. It calls for better data to understand climate change, together with indicators that cover income and wealth, financial innovation and inclusion, access to private and administrative data, and data sharing. In short, official statistics need to be broader, more detailed, and timely.

The sector where change is needed the most is energy, the largest contributor to greenhouse gas emissions, accounting for around three-quarters of the total.

Economies must expand their renewable energy sources and curb fossil fuel use, but while there’s been a gradual shift in that direction, the pace is still not sufficient. And not only is there a lack of policy ambition in many cases, there also is a lack of comprehensive and internationally comparable data to monitor progress.

To accelerate cuts to emissions, policymakers need detailed statistics to monitor the path of the energy transition and assist them in devising effective mitigation measures that can deliver the fastest and least disruptive pathway toward net zero emissions…(More)”.

Building Trust and Reinforcing Democracy

OECD Report: “Democracies are at a critical juncture, under growing internal and external pressures. This publication sheds light on the important public governance challenges countries face today in preserving and strengthening their democracies, including fighting mis- and disinformation; improving government openness, citizen participation and inclusiveness; and embracing global responsibilities and building resilience to foreign influence. It also looks at two cross-cutting themes that will be crucial for robust, effective democracies: transforming public governance for digital democracy and gearing up government to deliver on climate and other environmental challenges. These areas lay out the foundations of the new OECD Reinforcing Democracy Initiative, which has also involved the development of action plans to support governments in responding to these challenges…(More)”.

Behavioral Economics and the Energy Crisis in Europe

Blog by Carlos Scartascini: “European nations, stunned by Russia’s aggression, have mostly rallied in support of Ukraine, sending weapons and welcoming millions of refugees. But European citizens are paying dearly for it. Apart from the costs in direct assistance, the energy conflict with Russia had sent prices of gas soaring to eight times their 10-year average by the end of September and helped push inflation to around 10%. With a partial embargo of Russian oil going into effect in December and cold weather coming, many Europeans now fear an icy, bitter and poorer winter of 2023.

European governments hope to take the edge off by enacting price regulations, providing energy subsidies for households, and crucially curbing energy demand. Germany’s government, for example, imposed limits on heating in public offices and buildings to 19 degrees Celsius (66.2 Fahrenheit). France has introduced a raft of voluntary measures ranging from asking public officials to travel by train rather than car, suggesting that municipalities swap old lamps for LEDs and designing incentives to get people to car share…

As we know from years of experiments at the IDB in using behavioral economics to achieve policy goals, however, rules and recommendations are not enough. Trust in fellow citizens and in the government is also crucial when calling for a shared sacrifice. That means not appealing to fear, which can lead to deeper divisions in society, energy hoarding, resignation and indifference. Rather, it means appealing to social norms of morality and community.

In using behavioral economics to boost tax compliance in Argentina, for example, we found that sending messages that revealed how fellow citizens were paying their taxes significantly improved tax collection. Revealing how the government was using tax funds to improve people’s lives provided an additional boost to the effort. Posters and television ads in Europe showing people wearing sweaters, turning down their thermostats, insulating their homes and putting up solar panels might similarly instill a sense of common purpose. And signals that governments are trying to relieve hardship might help instill in citizens the need for sacrifice…(More)”.