World Development Report 2021: Data for Better Lives — Leveraging greater value from data to help the poor


Report by the World Bank: “Data has become ubiquitous—with global data flows increasing one thousand times over the last 20 years. What is not always appreciated is the extent to which data offer the potential to improve people’s lives, including the poor and those living in lower-income countries.

Consider this example. The Indian state of Odisha is susceptible to devastating cyclones. When disaster struck in 1999, as many as 10,000 people lost their lives. This tragedy prompted the Odisha State Disaster Management Authority to invest heavily in weather forecast data. When another, similarly powerful storm struck in 2013, the capture and broadcast of early warning data allowed nearly one million people to be evacuated to safety, slashing the death toll to just 38.

Data’s direct benefits on lives and livelihoods can come not only from government initiatives, as in Odisha, but also through a plethora of new private business models. Many of us are familiar with on-demand ride-hailing platforms that have revolutionized public transportation in major cities. In Nigeria, the platform business Hello Tractor has adapted the ride-hailing concept, allowing farmers to rent agricultural equipment on demand and increase their agricultural productivity.

Furthermore, civil society organizations across the world are using crowdsourced data collected from citizens as a way of holding governments accountable. For example, the platform ForestWatchers allows people to directly report deforestation of the Amazon. And in Egypt, the HarassMap tool allows women to report the location of sexual harassment incidents.

Despite all these innovative uses, data still remain grossly under-utilized, leaving much of the economic and social value of data untapped. Collecting and using data for a single purpose without making it available to others for reuse is a waste of resources. By reusing and combining data from both public and private sources, and applying modern analytical techniques, merged data sets can cover more people, more precisely, and more frequently. Leveraging these data synergies can bring real benefits….(More)”.

The (Im)possibility of Fairness: Different Value Systems Require Different Mechanisms For Fair Decision Making


Sorelle A. Friedler, Carlos Scheidegger, Suresh Venkatasubramanian at Communications of the ACM: “Automated decision-making systems (often machine learning-based) now commonly determine criminal sentences, hiring choices, and loan applications. This widespread deployment is concerning, since these systems have the potential to discriminate against people based on their demographic characteristics. Current sentencing risk assessments are racially biased, and job advertisements discriminate on gender. These concerns have led to an explosive growth in fairness-aware machine learning, a field that aims to enable algorithmic systems that are fair by design.

To design fair systems, we must agree precisely on what it means to be fair. One such definition is individual fairness: individuals who are similar (with respect to some task) should be treated similarly (with respect to that task). Simultaneously, a different definition states that demographic groups should, on the whole, receive similar decisions. This group fairness definition is inspired by civil rights law in the U.S. and U.K. Other definitions state that fair systems should err evenly across demographic groups. Many of these definitions have been incorporated into machine learning pipelines.
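The group-fairness definition can be made concrete in code. Below is a minimal sketch of a demographic-parity check — the function names, toy data, and the choice of demographic parity as the group-fairness metric are illustrative assumptions, not taken from the article:

```python
from collections import defaultdict

def positive_rates(decisions):
    """Positive-decision rate per demographic group.

    `decisions` is a list of (group, decision) pairs, decision in {0, 1}.
    """
    totals, positives = defaultdict(int), defaultdict(int)
    for group, decision in decisions:
        totals[group] += 1
        positives[group] += decision
    return {g: positives[g] / totals[g] for g in totals}

def demographic_parity_gap(decisions):
    """Largest difference in positive-decision rates between groups.

    A gap near 0 means groups, on the whole, receive similar decisions
    (the group-fairness notion described above); larger gaps indicate
    disparate treatment under that definition.
    """
    rates = positive_rates(decisions).values()
    return max(rates) - min(rates)

# Toy data: group A approved 3 of 4 times, group B approved 1 of 4.
toy = [("A", 1), ("A", 1), ("A", 1), ("A", 0),
       ("B", 1), ("B", 0), ("B", 0), ("B", 0)]
print(demographic_parity_gap(toy))  # 0.5
```

Individual fairness, by contrast, would require a task-specific similarity metric over individuals, which is exactly where the two worldviews discussed below begin to diverge.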

In this article, we introduce a framework for understanding these different definitions of fairness and how they relate to each other. Crucially, our framework shows these definitions and their implementations correspond to different axiomatic beliefs about the world. We present two such worldviews and will show they are fundamentally incompatible. First, one can believe the observation processes that generate data for machine learning are structurally biased. This belief provides a justification for seeking non-discrimination. When one believes that demographic groups are, on the whole, fundamentally similar, group fairness mechanisms successfully guarantee the top-level goal of non-discrimination: similar groups receiving similar treatment. Alternatively, one can assume the observed data generally reflects the true underlying reality about differences between people. These worldviews are in conflict; a single algorithm cannot satisfy either definition of fairness under both worldviews. Thus, researchers and practitioners ought to be intentional and explicit about worldviews and value assumptions: the systems they design will always encode some belief about the world….(More)”.

Hospitals Hide Pricing Data From Search Results


Tom McGinty, Anna Wilde Mathews, and Melanie Evans at the Wall Street Journal: “Hospitals that have published their previously confidential prices to comply with a new federal rule have also blocked that information from web searches with special coding embedded on their websites, according to a Wall Street Journal examination.

The information must be disclosed under a federal rule aimed at making the $1 trillion sector more consumer friendly. But hundreds of hospitals embedded code in their websites that prevented Alphabet Inc.’s Google and other search engines from displaying pages with the price lists, according to the Journal examination of more than 3,100 sites.

The code keeps pages from appearing in searches, such as those related to a hospital’s name and prices, computer-science experts said. The prices are often accessible other ways, such as through links that can require clicking through multiple layers of pages.
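The Journal does not specify the exact markup involved, but one common way a page signals search engines not to index it is a `robots` meta tag. A minimal sketch of detecting such a tag with Python’s standard-library HTML parser — the sample page is invented for illustration:

```python
from html.parser import HTMLParser

class NoindexDetector(HTMLParser):
    """Flags <meta name="robots" content="...noindex..."> tags."""
    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta" and (a.get("name") or "").lower() == "robots":
            if "noindex" in (a.get("content") or "").lower():
                self.noindex = True

# Hypothetical price-list page that blocks search-engine indexing.
sample = """<html><head>
<meta name="robots" content="noindex,nofollow">
<title>Standard Charges</title>
</head><body>...</body></html>"""

detector = NoindexDetector()
detector.feed(sample)
print(detector.noindex)  # True
```

A page tagged this way remains reachable by anyone who has the link, which matches the pattern the Journal describes: the data is technically published, but search engines are told not to surface it.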

“It’s technically there, but good luck finding it,” said Chirag Shah, an associate professor at the University of Washington who studies human interactions with computers. “It’s one thing not to optimize your site for searchability, it’s another thing to tag it so it can’t be searched. It’s a clear indication of intentionality.”…(More)”.

Negligence, Not Politics, Drives Most Misinformation Sharing


John Timmer at Wired: “…a small international team of researchers… decided to take a look at how a group of US residents decided on which news to share. Their results suggest that some of the standard factors that people point to when explaining the tsunami of misinformation—inability to evaluate information and partisan biases—aren’t having as much influence as most of us think. Instead, a lot of the blame gets directed at people just not paying careful attention.

The researchers ran a number of fairly similar experiments to get at the details of misinformation sharing. This involved panels of US-based participants recruited either through Mechanical Turk or via a survey population that provided a more representative sample of the US. Each panel had several hundred to over 1,000 individuals, and the results were consistent across different experiments, so there was a degree of reproducibility to the data.

To do the experiments, the researchers gathered a set of headlines and lead sentences from news stories that had been shared on social media. The set was evenly mixed between headlines that were clearly true and clearly false, and each of these categories was split again between those headlines that favored Democrats and those that favored Republicans.

One thing that was clear is that people are generally capable of judging the accuracy of the headlines. There was a 56 percentage point gap between how often an accurate headline was rated as true and how often a false headline was. People aren’t perfect—they still got things wrong fairly often—but they’re clearly quite a bit better at this than they’re given credit for.

The second thing is that ideology doesn’t really seem to be a major factor in driving judgments of whether a headline was accurate. People were more likely to rate headlines as accurate when the headlines agreed with their politics, but the difference here was only 10 percentage points. That’s significant (both societally and statistically), but it’s certainly not a large enough gap to explain the flood of misinformation.

But when the same people were asked whether they’d share these same stories, politics played a big role, and the truth receded. The difference in intention to share between true and false headlines was only 6 percentage points. Meanwhile, whether or not a headline agreed with a person’s politics made a 20 percentage point difference. Putting it in concrete terms, the authors look at the false headline “Over 500 ‘Migrant Caravaners’ Arrested With Suicide Vests.” Only 16 percent of conservatives in the survey population rated it as true. But over half of them were amenable to sharing it on social media….(More)”.
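The gaps above are simple differences in percentage points between two rates. The article reports only the gaps, so the base rates below are hypothetical, chosen purely to show the arithmetic:

```python
# Hypothetical base rates (the study reports the gaps, not these numbers).
rated_true = {"true_headline": 0.73, "false_headline": 0.17}
accuracy_gap = rated_true["true_headline"] - rated_true["false_headline"]
print(round(accuracy_gap * 100))  # 56 percentage points: good discernment

share_intent = {"true_headline": 0.28, "false_headline": 0.22}
truth_share_gap = share_intent["true_headline"] - share_intent["false_headline"]
print(round(truth_share_gap * 100))  # 6 percentage points: truth barely matters
```

The point of the comparison is that the 56-point gap when judging accuracy collapses to 6 points when deciding what to share.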

The speed of science


Essay by Saloni Dattani & Nathaniel Bechhofer: “The 21st century has seen some phenomenal advances in our ability to make scientific discoveries. Scientists have developed new technology to build vaccines swiftly, new algorithms to predict the structure of proteins accurately, new equipment to sequence DNA rapidly, and new engineering solutions to harvest energy efficiently. But in many fields of science, reliable knowledge and progress advance staggeringly slowly. What slows it down? And what can we learn from individual fields of science to pick up the pace across the board – without compromising on quality?

By and large, scientific research is published in journals in the form of papers – static documents that do not update with new data or new methods. Instead of sharing the data and the code that produces their results, most scientists simply publish a textual description of their research in online publications. These publications are usually hidden behind paywalls, making it harder for outsiders to verify their authenticity.

On the occasion when a reader spots a discrepancy in the data or an error in the methods, they must read the intricate details of a study’s method scrupulously, and cross-check the statistics manually. When scientists don’t share the data to produce their results openly, the task becomes even harder. The process of error correction – from scientists publishing a paper, to readers spotting errors, to having the paper corrected or retracted – can take years, assuming those errors are spotted at all.

When scientists reference previous research, they cite entire papers, not specific results or values from them. And although there is evidence that scientists hold back from citing papers once they have been retracted, the problem is compounded over time – consider, for example, a researcher who cites a study that itself derives its data or assumptions from prior research that has been disputed, corrected or retracted. The longer it takes to sift through the science, to identify which results are accurate, the longer it takes to gather an understanding of scientific knowledge.

What makes the problem even more challenging is that flaws in a study are not necessarily mathematical errors. In many situations, researchers make fairly arbitrary decisions as to how they collect their data, which methods they apply to analyse them, and which results they report – altogether leaving readers blind to the impact of these decisions on the results.

This murkiness can result in what is known as p-hacking: when researchers selectively apply arbitrary methods in order to achieve a particular result. For example, in a study that compares the well-being of overweight people to that of underweight people, researchers may find that certain cut-offs of weight (or certain subgroups in their sample) provide the result they’re looking for, while others don’t. And they may decide to only publish the particular methods that provided that result…(More)”.
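The cut-off-shopping version of p-hacking described above can be simulated. This is a sketch under stated assumptions — invented distributions, arbitrary cut-offs, and a simple large-sample z-test rather than whatever tests a real study would use:

```python
import math
import random

random.seed(0)

def p_value(sample_a, sample_b):
    """Two-sided p-value from a simple z-test on means (large samples)."""
    mean_a = sum(sample_a) / len(sample_a)
    mean_b = sum(sample_b) / len(sample_b)
    var_a = sum((x - mean_a) ** 2 for x in sample_a) / (len(sample_a) - 1)
    var_b = sum((x - mean_b) ** 2 for x in sample_b) / (len(sample_b) - 1)
    se = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return math.erfc(abs(mean_a - mean_b) / se / math.sqrt(2))

# Null world: weight and well-being are independent, so any "effect"
# of a weight cut-off is pure chance.
people = [(random.gauss(70, 15), random.gauss(5, 2)) for _ in range(2000)]

# A p-hacker tries many arbitrary cut-offs and keeps only the best result.
cutoffs = [55, 60, 65, 70, 75, 80, 85]
best = min(
    p_value([wb for wt, wb in people if wt < c],
            [wb for wt, wb in people if wt >= c])
    for c in cutoffs
)
print(f"smallest p across {len(cutoffs)} cut-offs: {best:.3f}")
```

Because each cut-off gives another draw at chance significance, the smallest p-value across many analyses is systematically smaller than any single pre-registered test would produce — which is exactly why reporting only the “successful” method misleads readers.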

Governance Innovation ver.2: A Guide to Designing and Implementing Agile Governance


Draft report by the Ministry of Economy, Trade and Industry (METI): “Japan has been aiming at the realization of “Society 5.0,” a policy for building a human-centric society which realizes both economic development and solutions to social challenges by taking advantage of a system in which cyberspaces, including AI, IoT and big data, and physical spaces are integrated in a sophisticated manner (CPSs: cyber-physical systems). In advancing social implementation of innovative technologies toward the realization of the Society 5.0, it is considered necessary to fundamentally reform governance models in view of changes in social structures which new technologies may bring about.

Triggered by this problem awareness, at the G20 Ministerial Meeting on Trade and Digital Economy, which Japan hosted in June 2019, the ministers declared in the ministerial statement the need for “governance innovation” tailored to social changes which will be brought about by digital technologies and social implementation thereof.

In light of this, METI inaugurated its Study Group on a New Governance Model in Society 5.0 (hereinafter referred to as the “study group”) and in July 2020, the study group published a report titled “GOVERNANCE INNOVATION: Redesigning Law and Architecture for Society 5.0” (hereinafter referred to as the “first report”). The first report explains ideal approaches to cross-sectoral governance by multi-stakeholders, including goal-based regulations, importance for businesses to fulfill their accountability, and enforcement of laws with an emphasis on incentives.

Against this backdrop, the study group, while taking into consideration the outcomes of the first report, presented approaches to “agile governance” as an underlying idea of the governance shown in the Society 5.0 policy, and then prepared the draft report titled “Governance Innovation ver.2: A Guide to Designing and Implementing Agile Governance” as a compilation presenting a variety of ideal approaches to governance mechanisms based on agile governance, including corporate governance, regulations, infrastructures, markets and social norms.

In response, METI opened a call for public comments on this draft report in order to receive opinions from a variety of people. As the subjects shown in the draft report are common challenges seen across the world and many parts of the subjects require international cooperation, METI wishes to receive wide-ranging, frank opinions not only from people in Japan but also from those in overseas countries….(More)”.

Coming wave of video games could build empathy on racism, environment and aftermath of war


Mike Snider at USA Today: “Some of the newest video games in development aren’t really games at all, but experiences that seek to build empathy for others.

Among the five such projects getting funding grants and support from 3D software engine maker Unity is “Our America,” in which the player takes the role of a Black man who is driving with his son when their car is pulled over by a police officer.

The father worries about getting his car registration from the glove compartment because the officer “might think it’s a gun or something,” the character says in the trailer.

On the project’s website, the developers describe “Our America” as “an autobiographical VR Experience” in which “the audience must make quick decisions, answer questions – but any wrong move is the difference between life and death.”…

The other Unity for Humanity winners include:

  • Ahi Kā Rangers: An ecological mobile game with development led by Māori creators. 
  • Dot’s Home: A game that explores historical housing injustices faced by Black and brown home buyers. 
  • Future Aleppo: A VR experience for children to rebuild homes and cities destroyed by war. 
  • Samudra: A children’s environmental puzzle game that takes the player across a polluted sea to learn about plastic waste.

While “Our America” may serve best as a VR experience, other projects such as “Dot’s Home” may be available on mobile devices to expand their accessibility….(More)”.

European Data Economy: Between Competition and Regulation


Report by René Arnold, Christian Hildebrandt, and Serpil Taş: “Data and its economic impact permeate all sectors of the economy. The data economy is not a new sector but rather a challenge for all firms to compete and innovate as part of a new wave of economic value creation.

With data playing an increasingly important role across all sectors of the economy, the results of this report point European policymakers toward promoting the development and adoption of unified reference architectures. These architectures constitute a technology-neutral and cross-sectoral approach that will enable companies small and large to compete and to innovate—unlocking the economic potential of data capture in an increasingly digitized world.

Data access appears to be less of a hindrance to a thriving data economy due to the net increase in capabilities in data capture, elevation, and analysis. What does prove difficult for firms is discovering existing datasets and establishing their suitability for achieving their economic objectives. Reference architectures can facilitate this process as they provide a framework to locate potential providers of relevant datasets and carry sufficient additional information (metadata) about datasets to enable firms to understand whether a particular dataset, or parts of it, fits their purpose.

Whether third-party data access is suitable to solve a specific business task in the first place ought to be a decision at the discretion of the economic actors involved. As our report underscores, data captured in one context with a specific purpose may not be fit for another context or another purpose. Consequently, a firm has to evaluate case-by-case whether first-party data capture, third-party data access, or a mixed approach is the best solution. This evaluation will naturally depend on whether there is any other firm capturing data suitable for the task that is willing to negotiate conditions for third-party access to this data. Unified data architectures may also lower the barriers for a firm capturing suitable data to engage in negotiations, since its adoption will lower the costs of making the data ready for a successful exchange. Such architectures may further integrate licensing provisions ensuring that data, once exchanged, is not used beyond the agreed purpose. It can also bring in functions that improve the discoverability of potential data providers….(More)”.
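The report does not prescribe a concrete schema, but the discoverability role that metadata plays in such an architecture can be sketched with a hypothetical catalogue record — all field names, providers, and entries below are illustrative:

```python
from dataclasses import dataclass, field

@dataclass
class DatasetRecord:
    """Hypothetical catalogue entry in a unified reference architecture."""
    identifier: str
    provider: str
    description: str
    capture_context: str   # purpose and setting of the original capture
    licence_terms: str     # limits on reuse beyond the agreed purpose
    keywords: list = field(default_factory=list)

    def matches(self, query: str) -> bool:
        """Naive keyword discovery across description and tags."""
        q = query.lower()
        return q in self.description.lower() or any(
            q in k.lower() for k in self.keywords
        )

catalogue = [
    DatasetRecord("ds-001", "RetailCo", "Hourly footfall in stores",
                  "in-store sensors, operations planning",
                  "research use only", ["retail", "mobility"]),
    DatasetRecord("ds-002", "GridOp", "Regional electricity load",
                  "grid telemetry, balancing",
                  "commercial reuse permitted", ["energy"]),
]
hits = [r.identifier for r in catalogue if r.matches("mobility")]
print(hits)  # ['ds-001']
```

The `capture_context` and `licence_terms` fields correspond to the report’s two concerns: a prospective user can judge whether data captured for one purpose fits another, and the provider can state up front how far reuse may go.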

How can we measure productivity in the public sector?


Ravi Somani at the World Bank: “In most economies, the public sector is a major purchaser of goods, services and labor. According to the Worldwide Bureaucracy Indicators, globally the public sector accounts for around 25% of GDP and 38% of formal employment. Generating efficiency gains in the public sector can, therefore, have important implications for a country’s overall economic performance.  

Public-sector productivity measures the rate at which inputs are converted into desirable outputs in the public sector. Measures can be developed at the level of the employee, organization, or overall public sector, and can be tracked over time. Such information allows policymakers to identify good and bad performers, understand what might be correlated with good performance, and measure the returns to different types of public expenditures. This knowledge can be used to improve the allocation of public resources in the future and maximize the impact of the public purse.
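As a minimal sketch of the outputs-over-inputs idea — using a cost-weighted output measure, with service names and figures that are entirely invented:

```python
def cost_weighted_productivity(activities, total_input_cost):
    """Cost-weighted output per unit of input cost.

    `activities` maps each service to (quantity delivered, unit-cost weight).
    Output is the sum of quantities weighted by unit cost, divided by the
    total cost of inputs; tracking this ratio over time shows whether
    inputs are being converted into outputs more or less efficiently.
    """
    weighted_output = sum(qty * unit_cost
                          for qty, unit_cost in activities.values())
    return weighted_output / total_input_cost

# Hypothetical agency: two services, measured in two consecutive years
# with the same input budget.
year1 = cost_weighted_productivity(
    {"permits_issued": (10_000, 50), "inspections": (2_000, 250)}, 1_100_000)
year2 = cost_weighted_productivity(
    {"permits_issued": (11_000, 50), "inspections": (2_100, 250)}, 1_100_000)
print(f"year-on-year productivity change: {year2 / year1 - 1:.1%}")  # 7.5%
```

Note that the absolute level of the ratio is hard to interpret on its own; the informative quantity is its change over time or its comparison across similar organizations.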

But how can we measure it?

Measuring productivity in the public sector can be tricky because:

  • There are often no market transactions for public services, or they are distorted by subsidies and other market imperfections.
  • Many public services are complex, requiring (often immeasurable) inputs from multiple individuals and organizations.
  • There is often a substantial time lag between investments in inputs and the realization of outputs and outcomes.

This recent World Bank publication provides a summary of the different approaches to measuring productivity in the public sector, presented in the table below. For simplicity, the approaches are separated into: ‘macro’ approaches, which provide aggregate information at the level of an organization, sector, or service as a whole; and ‘micro’ approaches, which can be applied to the individual employee, task, project, and process.
 

Macro and Micro Approaches to measure public-sector productivity

There is no silver bullet for accurately measuring public-sector productivity – each approach has its own limitations. For example, the cost-weighted-output approach requires activity-level data, necessitates different approaches for different sectors, and results in metrics with difficult-to-interpret absolute levels. Project-completion rates require access to project-level data and may not fully account for differences in the quality and complexity of projects. The publication includes a list of the pros, cons, and implementation requirements for each approach….(More)”.

Using Data and Citizen Science for Gardening Success


Article by Elizabeth Waddington: “…Data can help you personally by providing information you can use. And it also allows you to play a wider role in boosting understanding of our planet and tackling the global crises we face in a collaborative way. Consider the following examples.

Grow Observatory

This is one great example of data gathering and citizen science. Grow Observatory is a European citizens’ observatory through which people work together to take action on climate change, build better soil, grow healthier food and corroborate data from the new generation of Copernicus satellites.

Twenty-four Grow communities in 13 European countries created a network of over 6,500 ground-based soil sensors and collected a lot of soil-related data. And many insights have helped people learn about and test regenerative food growing techniques.

On their website, you can explore sensor locations, or make use of dynamic soil moisture maps. With the Grow Observatory app, you can get crop and planting advice tailored to your location, and get detailed, science-based information about regenerative growing practices. Their water planner also allows small-scale growers to learn more about how much water their plants will need in their location over the coming months if they live in one of the areas which currently have available data…

Cooperative Citizen Science: iNaturalist, Bioblitzes, Bird Counts, and More

Wherever you live, there are many different ways to get involved and help build data. From submitting observations on wildlife in your garden through apps like iNaturalist to taking part in local Bioblitzes, bird counts, and more – there are plenty of ways we can collect data that will help us – and others – down the road.

Collecting data through our observations, and, crucially, sharing that data with others can help us create the future we all want to see. We, as individuals, can often feel powerless. But citizen science projects help us to see the collective power we can wield when we work together. Modern technology means we can be hyper-connected, and affect wider systems, even when we are alone in our own gardens….(More)”