Japan to pitch data-sharing framework to bolster Asia supply chains


Nikkei coverage: “The Japanese government is set to propose a scheme to promote data-sharing among companies in Asia to strengthen supply chains in the region, Nikkei has learned.

The Ministry of Economy, Trade and Industry (METI) hopes that a secure data-sharing framework like the one developed in Europe will enable companies in Asia to smoothly exchange data, such as inventory information on products and parts, as well as information on potential disruptions in procurement.

The ministry will propose the idea as a key part of Japan’s digital trade policy at an expert panel meeting on Friday. The meeting will propose a major review of industrial policy to emphasize digitization and a decarbonized economy.

It sees Europe’s efforts as a role model in terms of information-sharing. The European Union is building a data distribution infrastructure, Gaia-X, to let companies in the region share information on supply chains.

The goal is to counter the monopoly on data held by large technology companies in the U.S. and China. The EU is promoting the sharing of data by connecting different cloud services among companies. Under Gaia-X, companies can limit the scope of data disclosure and the use of data provided to others, based on the concept of data sovereignty.

The scheme envisioned by METI will also allow companies to decide what type of data they share and how much. The infrastructure will be developed on a regional basis, with the participation of various countries.

Google and China’s Alibaba Group Holding offer data-sharing services for supply chain, but the Japanese government is concerned that it will be difficult to protect Japanese companies’ industrial secrets unless it develops its own data infrastructure….(More)”

Financing Models for Digital Ecosystems


Paper by Rahul Matthan, Prakhar Misra and Harshita Agrawal: “This paper explores various financing models for the digital ecosystem within the Indian setup. It uses the market/non-market failure distinction and applies it to different parts of the ecosystem, outlined in the Open Digital Ecosystems framework. It identifies which form of financing — public, private and philanthropic — is suitable for the relevant component of the digital world — data registries, exchanges, open stacks, marketplaces, co-creation platforms, and information access portals. Finally, it treats philanthropic financing as a special case of financing mechanisms available and analyses their pros and cons in the Indian digital ecosystem…(More)”.

Data Innovation in Demography, Migration and Human Mobility


Report by Bosco, C., Grubanov-Boskovic, S., Iacus, S., Minora, U., Sermi, F. and Spyratos, S.: “With the consolidation of the culture of evidence-based policymaking, the availability of data has become central for policymakers. Nowadays, innovative data sources have offered opportunity to describe more accurately demographic, mobility- and migration-related phenomena by making available large volumes of real-time and spatially detailed data. At the same time, however, data innovation has brought up new challenges (ethics, privacy, data governance models, data quality) for citizens, statistical offices, policymakers and the private sector.

Focusing on the fields of demography, mobility and migration studies, the aim of this report is to assess the current state of utilisation of data innovation in the scientific literature as well as to identify areas in which data innovation has the most concrete potential for policymaking. For that purpose, this study has reviewed more than 300 articles and scientific reports, as well as numerous tools, that employed non-traditional data sources for demographic, human mobility or migration research. The specific findings of our report contribute to a discussion on a) how innovative data is used in respect to traditional data sources; b) domains in which innovative data have the highest potential to contribute to policymaking; c) prospects for an innovative data transition towards systematic contribution to official statistics and policymaking…(More)”. See also Big Data for Migration Alliance.

Leveraging Non-Traditional Data For The Covid-19 Socioeconomic Recovery Strategy


Article by Deepali Khanna: “To this end, it is opportune to ask the following questions: Can we harness the power of data routinely collected by companies—including transportation providers, mobile network operators, social media networks and others—for the public good? Can we bridge the data gap to give governments access to data, insights and tools that can inform national and local response and recovery strategies?

There is increasing recognition that traditional and non-traditional data should be seen as complementary resources. Non-traditional data can bring significant benefits in bridging existing data gaps but must still be calibrated against benchmarks based on established traditional data sources. These traditional datasets are widely seen as reliable as they are subject to established stringent international and national standards. However, they are often limited in frequency and granularity, especially in low- and middle-income countries, given the cost and time required to collect such data. For example, official economic indicators such as GDP, household consumption and consumer confidence may be available only up to national or regional level with quarterly updates…

In the Philippines, UNDP, with support from The Rockefeller Foundation and the government of Japan, recently set up the Pintig Lab: a multidisciplinary network of data scientists, economists, epidemiologists, mathematicians and political scientists, tasked with supporting data-driven crisis response and development strategies. In early 2021, the Lab conducted a study which explored how household spending on consumer-packaged goods, or fast-moving consumer goods (FMCGs), can be used to assess the socioeconomic impact of Covid-19 and identify heterogeneities in the pace of recovery across households in the Philippines. The Philippine National Economic Development Agency is now in the process of incorporating this data for their GDP forecasting, as additional input to their predictive models for consumption. Further, this data can be combined with other non-traditional datasets such as credit card or mobile wallet transactions, and machine learning techniques for higher-frequency GDP nowcasting, to allow for more nimble and responsive economic policies that can both absorb and anticipate the shocks of crisis….(More)”.

The UN is testing technology that processes data confidentially


The Economist: “Reasons of confidentiality mean that many medical, financial, educational and other personal records, from the analysis of which much public good could be derived, are in practice unavailable. A lot of commercial data are similarly sequestered. For example, firms have more granular and timely information on the economy than governments can obtain from surveys. But such intelligence would be useful to rivals. If companies could be certain it would remain secret, they might be more willing to make it available to officialdom.

A range of novel data-processing techniques might make such sharing possible. These so-called privacy-enhancing technologies (PETs) are still in the early stages of development. But they are about to get a boost from a project launched by the United Nations’ statistics division. The UN PETs Lab, which opened for business officially on January 25th, enables national statistics offices, academic researchers and companies to collaborate to carry out projects which will test various PETs, permitting technical and administrative hiccups to be identified and overcome.

The first such effort, which actually began last summer, before the PETs Lab’s formal inauguration, analysed import and export data from national statistical offices in America, Britain, Canada, Italy and the Netherlands, to look for anomalies. Those could be a result of fraud, of faulty record keeping or of innocuous re-exporting.

For the pilot scheme, the researchers used categories already in the public domain—in this case international trade in things such as wood pulp and clocks. They thus hoped to show that the system would work, before applying it to information where confidentiality matters.

They put several kinds of PETs through their paces. In one trial, OpenMined, a charity based in Oxford, tested a technique called secure multiparty computation (SMPC). This approach involves the data to be analysed being encrypted by their keeper and staying on the premises. The organisation running the analysis (in this case OpenMined) sends its algorithm to the keeper, who runs it on the encrypted data. That is mathematically complex, but possible. The findings are then sent back to the original inquirer…(More)”.
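
The excerpt does not spell out the protocol used in the trial, so the following is only a minimal sketch of one textbook building block of SMPC, additive secret sharing, and not OpenMined's implementation. Each data keeper splits its private number into random-looking shares, hands one share to every participant, and only the sum of all inputs is ever reconstructed. The function names, the modulus and the export figures are all assumptions made up for illustration.

```python
import secrets

# Modulus for share arithmetic. An illustrative choice; inputs must be
# non-negative integers whose total stays below this value.
PRIME = 2**61 - 1


def split_into_shares(value, n_parties):
    """Split an integer into n additive shares that sum to value mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]


def secure_sum(private_values):
    """Compute the total of all inputs without any party revealing its own."""
    n = len(private_values)
    # all_shares[i][j] is the share that party i hands to party j.
    all_shares = [split_into_shares(v, n) for v in private_values]
    # Each party j adds up the shares it received; a single partial sum
    # on its own reveals nothing about any individual input.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % PRIME for j in range(n)
    ]
    # Only the combination of all partial sums yields the true total.
    return sum(partial_sums) % PRIME


# Hypothetical export totals held by three statistics offices (made-up numbers).
exports = [1200, 950, 430]
total = secure_sum(exports)
```

In this toy version every share lives in one process; in a real deployment each share would travel over the network to a different organisation, which is where most of the engineering difficulty mentioned in the article arises.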

Data Sharing in Transport


Technical Note by the European Investment Bank: “Traveller and transport related data are essential for planning efficient urban mobility and delivering effective public transport services while adequately managing infrastructure investment costs. They also support local authorities in their efforts towards decarbonisation of transport as well as improving air quality.

Nowadays, most of the data are generated by location-based mobile phone applications and connected vehicles or other mobility equipment like scooters and bikes. This opens up new opportunities in public sector engagement with private sector and partnerships.

This report, through an extensive literature review and interviews, identifies seven Data Partnership Models that could be used by public and private sector entities in the field of transport. It also provides a concise roadmap for local authorities as a guidance in their efforts when engaging with private sector in transport data sharing…(More)”.

Counting Crimes: An Obsolete Paradigm


Paul Wormeli at The Criminologist: “To the extent that a paradigm is defined as the way we view things, the crime statistics paradigm in the United States is inadequate and requires reinvention….The statement—“not all crime is reported to the police”—lies at the very heart of why our current crime data are inherently incomplete. It is a direct reference to the fact that not all “street crime” is reported and that state and local law enforcement are not the only entities responsible for overseeing violations of societally established norms (“street crime” or otherwise). Two significant gaps exist, in that: 1) official reporting of crime from state and local law enforcement agencies cannot provide insight into unreported incidents, and 2) state and local law enforcement may not have or acknowledge jurisdiction over certain types of matters, such as cybercrime, corruption, environmental crime, or terrorism, and therefore cannot or do not report on those incidents…

All of these gaps in crime reporting mask the portrait of crime in the U.S. If there was a complete accounting of crime that could serve as the basis of policy formulation, including the distribution of federal funds to state and local agencies, there could be a substantial impact across the nation. Such a calculation would move the country toward a more rational basis for determining federal support for communities based on a comprehensive measure of community wellness.

In its deliberations, the NAS Panel recognized that it is essential to consider both the concepts of classification and the rules of counting as we seek a better and more practical path to describing crime in the U.S. and its consequences. The panel postulated that a meaningful classification of incidents found to be crimes would go beyond the traditional emphasis on street crime and include all crime categories.

The NAS study identified the missing elements of a national crime report as including more complete data on crimes involving drug-related offenses, criminal acts where juveniles are involved, so-called white-collar crimes such as fraud and corruption, cybercrime, crime against businesses, environmental crimes, and crimes against animals. Just as one example, it is highly unlikely that we will know the full extent of fraudulent claims against all federal, state, and local governments in the face of the massive influx of funding from recent and forthcoming Congressional action.

In proposing a set of crime classifications, the NAS panel recommended 11 major categories, 5 of which are not addressed in our current crime data collection systems. While there are parallel data systems that collect some of the missing data within these five crime categories, it remains unclear which federal agency, if any, has the authority to gather the information and aggregate it to give us anywhere near a complete estimate of crime in the United States. No federal or national entity has the assignment of estimating the total amount of crime that takes place in the United States. Without such leadership, we are left with an uninformed understanding of the health and wellness of communities throughout the country…(More)”

New and updated building footprints


Bing Blogs: “…The Microsoft Maps Team has been leveraging that investment to identify map features at scale and produce high-quality building footprint data sets with the overall goal to add to the OpenStreetMap and MissingMaps humanitarian efforts.

As of this post, the following locations are available and Microsoft offers access to this data under the Open Data Commons Open Database License (ODbL).

Country/Region              Million buildings
United States of America    129.6
Nigeria and Kenya           50.5
South America               44.5
Uganda and Tanzania         17.9
Canada                      11.8
Australia                   11.3

As you might expect, the vintage of the footprints depends on the collection date of the underlying imagery. Bing Maps Imagery is a composite of multiple sources with different capture dates (ranging 2012 to 2021). To ensure we are setting the right expectation for that building, each footprint has a capture date tag associated if we could deduce the vintage of imagery used…(More)”

Data Re-Use and Collaboration for Development


Stefaan G. Verhulst at Data & Policy: “It is often pointed out that we live in an era of unprecedented data, and that data holds great promise for development. Yet equally often overlooked is the fact that, as in so many domains, there exist tremendous inequalities and asymmetries in where this data is generated, and how it is accessed. The gap that separates high-income from low-income countries is among the most important (or at least most persistent) of these asymmetries…

Data collaboratives are an emerging form of public-private partnership that, when designed responsibly, can offer a potentially innovative solution to this problem. Data collaboratives offer at least three key benefits for developing countries:

1. Cost Efficiencies: Data and data analytic capacity are often hugely expensive and beyond the limited capacities of many low-income countries. Data reuse, facilitated by data collaboratives, can bring down the cost of data initiatives for development projects.

2. Fresh insights for better policy: Combining data from various sources by breaking down silos has the potential to lead to new and innovative insights that can help policy makers make better decisions. Digital data can also be triangulated with existing, more traditional sources of information (e.g., census data) to generate new insights and help verify the accuracy of information.

3. Overcoming inequalities and asymmetries: Social and economic inequalities, both within and among countries, are often mapped onto data inequalities. Data collaboratives can help ease some of these inequalities and asymmetries, for example by allowing costs and analytical tools and techniques to be pooled. Cloud computing, which allows information and technical tools to be easily shared and accessed, is an important example. It can play a vital role in enabling the transfer of skills and technologies between low-income and high-income countries…(More)”. See also: Reusing data responsibly to achieve development goals (OECD Report).

Making data for good better


Article by Caroline Buckee, Satchit Balsari, and Andrew Schroeder: “…Despite the long-standing excitement about the potential for digital tools, Big Data and AI to transform our lives, these innovations–with some exceptions–have so far had little impact on the greatest public health emergency of our time.

Attempts to use digital data streams to rapidly produce public health insights that were not only relevant for local contexts in cities and countries around the world, but also available to decision makers who needed them, exposed enormous gaps across the translational pipeline. The insights from novel data streams which could help drive precise, impactful health programs, and bring effective aid to communities, found limited use among public health and emergency response systems. We share here our experience from the COVID-19 Mobility Data Network (CMDN), now Crisis Ready (crisisready.io), a global collaboration of researchers, mostly infectious disease epidemiologists and data scientists, who served as trusted intermediaries between technology companies willing to share vast amounts of digital data, and policy makers, struggling to incorporate insights from these novel data streams into their decision making. Through our experience with the Network, and using human mobility data as an illustrative example, we recognize three sets of barriers to the successful application of large digital datasets for public good.

First, in the absence of pre-established working relationships with technology companies and data brokers, the data remain primarily confined within private circuits of ownership and control. During the pandemic, data sharing agreements between large technology companies and researchers were hastily cobbled together, often without the right kind of domain expertise in the mix. Second, the lack of standardization, interoperability and information on the uncertainty and biases associated with these data, necessitated complex analytical processing by highly specialized domain experts. And finally, local public health departments, understandably unfamiliar with these novel data streams, had neither the bandwidth nor the expertise to sift noise from signal. Ultimately, most efforts did not yield consistently useful information for decision making, particularly in low resource settings, where capacity limitations in the public sector are most acute…(More)”.