“Anonymous” Data Won’t Protect Your Identity


Sophie Bushwick at Scientific American: “The world produces roughly 2.5 quintillion bytes of digital data per day, adding to a sea of information that includes intimate details about many individuals’ health and habits. To protect privacy, data brokers must anonymize such records before sharing them with researchers and marketers. But a new study finds it is relatively easy to reidentify a person from a supposedly anonymized data set—even when that set is incomplete.

Massive data repositories can reveal trends that teach medical researchers about disease, demonstrate issues such as the effects of income inequality, coach artificial intelligence into humanlike behavior and, of course, aim advertising more efficiently. To shield people who—wittingly or not—contribute personal information to these digital storehouses, most brokers send their data through a process of deidentification. This procedure involves removing obvious markers, including names and social security numbers, and sometimes taking other precautions, such as introducing random “noise” data to the collection or replacing specific details with general ones (for example, swapping a birth date of “March 7, 1990” for “January–April 1990”). The brokers then release or sell a portion of this information.

“Data anonymization is basically how, for the past 25 years, we’ve been using data for statistical purposes and research while preserving people’s privacy,” says Yves-Alexandre de Montjoye, an assistant professor of computational privacy at Imperial College London and co-author of the new study, published this week in Nature Communications.  Many commonly used anonymization techniques, however, originated in the 1990s, before the Internet’s rapid development made it possible to collect such an enormous amount of detail about things such as an individual’s health, finances, and shopping and browsing habits. This discrepancy has made it relatively easy to connect an anonymous line of data to a specific person: if a private detective is searching for someone in New York City and knows the subject is male, is 30 to 35 years old and has diabetes, the sleuth would not be able to deduce the man’s name—but could likely do so quite easily if he or she also knows the target’s birthday, number of children, zip code, employer and car model….(More)”

The value of data in Canada: Experimental estimates


Statistics Canada: “As data and information take on a far more prominent role in Canada and, indeed, all over the world, data, databases and data science have become a staple of modern life. When the electricity goes out, Canadians are as much in search of their data feed as they are food and heat. Consumers are using more and more data that is embodied in the products they buy, whether those products are music, reading material, cars and other appliances, or a wide range of other goods and services. Manufacturers, merchants and other businesses depend increasingly on the collection, processing and analysis of data to make their production processes more efficient and to drive their marketing strategies.

The increasing use of and investment in all things data is driving economic growth, changing the employment landscape and reshaping how and from where we buy and sell goods. Yet the rapid rise in the use and importance of data is not well measured in the existing statistical system. Given the ‘lack of data on data’, Statistics Canada has initiated new research to produce a first set of estimates of the value of data, databases and data science. The development of these estimates benefited from collaboration with the Bureau of Economic Analysis in the United States and the Organisation for Economic Co-operation and Development.

In 2018, Canadian investment in data, databases and data science was estimated to be as high as $40 billion. This was greater than the annual investment in industrial machinery, transportation equipment, and research and development and represented approximately 12% of total non-residential investment in 2018….

Statistics Canada recently released a conceptual framework outlining how one might measure the economic value of data, databases and data science. Thanks to this new framework, the growing role of data in Canada can be measured through time. This framework is described in a paper that was released in The Daily on June 24, 2019 entitled “Measuring investments in data, databases and data science: Conceptual framework.” That paper describes the concept of an ‘information chain’ in which data are derived from everyday observations, databases are constructed from data, and data science creates new knowledge by analyzing the contents of databases….(More)”.

What Restaurant Reviews Reveal About Cities


Linda Poon at CityLab: “Online review sites can tell you a lot about a city’s restaurant scene, and they can reveal a lot about the city itself, too.

Researchers at MIT recently found that information about restaurants gathered from popular review sites can be used to uncover a number of socioeconomic factors of a neighborhood, including its employment rates and demographic profiles of the people who live, work, and travel there.

A report published last week in the Proceedings of the National Academy of Sciences explains how the researchers used information found on Dianping—a Yelp-like site in China—to find information that might usually be gleaned from an official government census. The model could prove especially useful for gathering information about cities that don’t have that kind of reliable or up-to-date government data, especially in developing countries with limited resources to conduct regular surveys….

Zheng and her colleagues tested out their machine-learning model using restaurant data from nine Chinese cities of various sizes—from crowded ones like Beijing, with a population of more than 10 million, to smaller ones like Baoding, a city of fewer than 3 million people.

They pulled data from 630,000 restaurants listed on Dianping, including each business’s location, menu prices, opening day, and customer ratings. Then they ran it through a machine-learning model with official census data and with anonymous location and spending data gathered from cell phones and bank cards. By comparing the information, they were able to determine where the restaurant data reflected the other data they had about neighborhoods’ characteristics.

They found that the local restaurant scene can predict, with 95 percent accuracy, variations in a neighborhood’s daytime and nighttime populations, which are measured using mobile phone data. They can also predict, with 90 and 93 percent accuracy, respectively, the number of businesses and the volume of consumer consumption. The type of cuisines offered and kind of eateries available (coffeeshop vs. traditional teahouses, for example), can also predict the proportion of immigrants or age and income breakdown of residents. The predictions are more accurate for neighborhoods near urban centers as opposed to those near suburbs, and for smaller cities, where neighborhoods don’t vary as widely as those in bigger metropolises….(More)”.

The Power of Global Performance Indicators


Introduction to Special Issue of International Organization by
Judith G. Kelley and Beth A. Simmons: “In recent decades, IGOs, NGOs, private firms and even states have begun to regularly package and distribute information on the relative performance of states. From the World Bank’s Ease of Doing Business Index to the Financial Action Task Force blacklist, global performance indicators (GPIs) are increasingly deployed to influence governance globally. We argue that GPIs derive influence from their ability to frame issues, extend the authority of the creator, and — most importantly — to invoke recurrent comparison that stimulates governments’ concerns for their own and their country’s reputation. Their public and ongoing ratings and rankings of states are particularly adept at capturing attention not only at elite policy levels but also among other domestic and transnational actors. GPIs thus raise new questions for research on politics and governance globally. What are the social and political effects of this form of information on discourse, policies and behavior? What types of actors can effectively wield GPIs and on what types of issues? In this symposium introduction, we define GPIs, describe their rise, and theorize and discuss these questions in light of the findings of the symposium contributions…(More)”.

How an AI Utopia Would Work


Sami Mahroum at Project Syndicate: “…It is more than 500 years since Sir Thomas More found inspiration for the “Kingdom of Utopia” while strolling the streets of Antwerp. So, when I traveled there from Dubai in May to speak about artificial intelligence (AI), I couldn’t help but draw parallels to Raphael Hythloday, the character in Utopia who regales sixteenth-century Englanders with tales of a better world.

As home to the world’s first Minister of AI, as well as museumsacademies, and foundations dedicated to studying the future, Dubai is on its own Hythloday-esque voyage. Whereas Europe, in general, has grown increasingly anxious about technological threats to employment, the United Arab Emirates has enthusiastically embraced the labor-saving potential of AI and automation.

There are practical reasons for this. The ratio of indigenous-to-foreign labor in the Gulf states is highly imbalanced, ranging from a high of 67% in Saudi Arabia to a low of 11% in the UAE. And because the region’s desert environment cannot support further population growth, the prospect of replacing people with machines has become increasingly attractive.

But there is also a deeper cultural difference between the two regions. Unlike Western Europe, the birthplace of both the Industrial Revolution and the “Protestant work ethic,” Arab societies generally do not “live to work,” but rather “work to live,” placing a greater value on leisure time. Such attitudes are not particularly compatible with economic systems that require squeezing ever more productivity out of labor, but they are well suited for an age of AI and automation….

Fortunately, AI and data-driven innovation could offer a way forward. In what could be perceived as a kind of AI utopia, the paradox of a bigger state with a smaller budget could be reconciled, because the government would have the tools to expand public goods and services at a very small cost.

The biggest hurdle would be cultural: As early as 1948, the German philosopher Joseph Pieper warned against the “proletarianization” of people and called for leisure to be the basis for culture. Westerners would have to abandon their obsession with the work ethic, as well as their deep-seated resentment toward “free riders.” They would have to start differentiating between work that is necessary for a dignified existence, and work that is geared toward amassing wealth and achieving status. The former could potentially be all but eliminated.

With the right mindset, all societies could start to forge a new AI-driven social contract, wherein the state would capture a larger share of the return on assets, and distribute the surplus generated by AI and automation to residents. Publicly-owned machines would produce a wide range of goods and services, from generic drugs, food, clothes, and housing, to basic research, security, and transportation….(More)”.

Trusted data and the future of information sharing


 MIT Technology Review: “Data in some form underpins almost every action or process in today’s modern world. Consider that even farming, the world’s oldest industry, is on the verge of a digital revolution, with AI, drones, sensors, and blockchain technology promising to boost efficiencies. The market value of an apple will increasingly reflect not only traditional farming inputs but also some value of modern data, such as weather patterns, soil acidity levels and agri-supply-chain information. By 2022 more than 60% of global GDP will be digitized, according to IDC.

Governments seeking to foster growth in their digital economies need to be more active in encouraging safe data sharing between organizations. Tolerating the sharing of data and stepping in only where security breaches occur is no longer enough. Sharing data across different organizations enables the whole ecosystem to grow and can be a unique source of competitive advantage. But businesses need guidelines and support in how to do this effectively.   

This is how Singapore’s data-sharing worldview has evolved, according to Janil Puthucheary, senior minister of state for communications and information and transport, upon launching the city-state’s new Trusted Data Sharing Framework in June 2019.

The Framework, a product of consultations between Singapore’s Infocomm Media Development Authority (IMDA), its Personal Data Protection Commission (PDPC), and industry players, is intended to create a common data-sharing language for relevant stakeholders. Specifically, it addresses four common categories of concerns with data sharing: how to formulate an overall data-sharing strategy, legal and regulatory considerations, technical and organizational considerations, and the actual operationalizing of data sharing.

For instance, companies often have trouble assessing the value of their own data, a necessary first step before sharing should even be considered. The framework describes the three general approaches used: market-, cost-, and income-based. The legal and regulatory section details when businesses can, among other things, seek exemptions from Singapore’s Personal Data Protection Act.

The technical and organizational chapter includes details on governance, infrastructure security, and risk management. Finally, the section on operational aspects of data sharing includes guidelines for when it is appropriate to use shared data for a secondary purpose or not….(More)”.

Age of the expert as policymaker is coming to an end


Wolfgang Münchau at the Financial Times: “…Where the conflation of the expert and the policymaker did real damage was not to policy but to expertdom itself. It compromised the experts’ most prized asset — their independence.

When economics blogging started to become fashionable, I sat on a podium with an academic blogger who predicted that people like him would usurp the role of the economics newspaper columnist within a period of 10 years. That was a decade ago. His argument was that trained economists were just smarter. What he did not reckon with is that it is hard to speak truth to power when you have to beg that power to fund your think-tank or institute. E

ven less so once you are politically attached or appointed. Independence matters. A good example of a current issue where lack of independence gets in the way is the debate on cryptocurrencies. I agree that governments should not lightly concede the money monopoly of the state, which is at the heart of our economic system. But I sometimes wonder whether those who hyperventilate about crypto do so because they find the whole concept offensive. Cryptocurrencies embody a denial of economics. There are no monetary policy committees. Cryptocurrencies may, or may not, damage the economy. But they surely damage the economist.

Even the best arguments lose power when they get mixed up with personal interests. If you want to be treated as an independent authority, do not join a policy committee, or become a minister or central banker. As soon as you do, you have changed camps. You may think of yourself as an expert. The rest of the world does not. The minimum needed to maintain or regain credibility is to state conflicts of interests openly. The only option in such cases is to be transparent. This is also why financial journalists have to declare the shares they own. The experts I listen to are those who are independent, and who do not follow a political agenda. The ones I avoid are the zealots and those who wander off their reservation and make pronouncements without inhibition. An economist may have strong views on the benefits of vaccination, for example, but is still no expert on the subject. And I often cringe when I hear a doctor trying to prove a point by using statistics. The world will continue to need policymakers and the experts who advise them. But more than that, it needs them to be independent….(More)”.

Future Studies and Counterfactual Analysis


Book by Theodore J. Gordon and Mariana Todorova: “In this volume, the authors contribute to futures research by placing the counterfactual question in the future tense. They explore the possible outcomes of future, and consider how future decisions are turning points that may produce different global outcomes. This book focuses on a dozen or so intractable issues that span politics, religion, and technology, each addressed in individual chapters. Until now, most scenarios written by futurists have been built on cause and effect narratives or depended on numerical models derived from historical relationships. In contrast, many of the scenarios written for this book are point descriptions of future discontinuities, a form allows more thought-provoking presentations. Ultimately, this book demonstrates that counterfactual thinking and point scenarios of discontinuities are new, groundbreaking tools for futurists….(More)”.

The Age of Digital Interdependence


Report of the High-level Panel on Digital Cooperation: “The immense power and value of data in the modern economy can and must be harnessed to meet the SDGs, but this will require new models of collaboration. The Panel discussed potential pooling of data in areas such as health, agriculture and the environment to enable scientists and thought leaders to use data and artificial intelligence to better understand issues and find new ways to make progress on the SDGs. Such data commons would require criteria for establishing relevance to the SDGs, standards for interoperability, rules on access and safeguards to ensure privacy and security.

Anonymised data – information that is rendered anonymous in such a way that the data subject is not or no longer identifiable – about progress toward the SDGs is generally less sensitive and controversial than the use of personal data of the kind companies such as Facebook, Twitter or Google may collect to drive their business models, or facial and gait data that could be used for surveillance. However, personal data can also serve development goals, if handled with proper oversight to ensure its security and privacy.

For example, individual health data is extremely sensitive – but many people’s health data, taken together, can allow researchers to map disease outbreaks, compare the effectiveness of treatments and improve understanding of conditions. Aggregated data from individual patient cases was crucial to containing the Ebola outbreak in West Africa. Private and public sector healthcare providers around the world are now using various forms of electronic medical records. These help individual patients by making it easier to personalise health services, but the public health benefits require these records to be interoperable.

There is scope to launch collaborative projects to test the interoperability of data, standards and safeguards across the globe. The World Health Assembly’s consideration of a global strategy for digital health in 2020 presents an opportunity to launch such projects, which could initially be aimed at global health challenges such as Alzheimer’s and hypertension.

Improved digital cooperation on a data-driven approach to public health has the potential to lower costs, build new partnerships among hospitals, technology companies, insurance providers and research institutes and support the shift from treating diseases to improving wellness. Appropriate safeguards are needed to ensure the focus remains on improving health care outcomes. With testing, experience and necessary protective measures as well as guidelines for the responsible use of data, similar cooperation could emerge in many other fields related to the SDGs, from education to urban planning to agriculture…(More)”.

Study finds that a GPS outage would cost $1 billion per day


Eric Berger at Ars Technica: “….one of the most comprehensive studies on the subject has assessed the value of this GPS technology to the US economy and examined what effect a 30-day outage would have—whether it’s due to a severe space weather event or “nefarious activity by a bad actor.” The study was sponsored by the US government’s National Institutes of Standards and Technology and performed by a North Carolina-based research organization named RTI International.

Economic effect

As part of the analysis, researchers spoke to more than 200 experts in the use of GPS technology for various services, from agriculture to the positioning of offshore drilling rigs to location services for delivery drivers. (If they’d spoken to me, I’d have said the value of using GPS to navigate Los Angeles freeways and side streets was incalculable). The study covered a period from 1984, when the nascent GPS network was first opened to commercial use, through 2017. It found that GPS has generated an estimated $1.4 trillion in economic benefits during that time period.

The researchers found that the largest benefit, valued at $685.9 billion, came in the “telecommunications” category,  including improved reliability and bandwidth utilization for wireless networks. Telematics (efficiency gains, cost reductions, and environmental benefits through improved vehicle dispatch and navigation) ranked as the second most valuable category at $325 billion. Location-based services on smartphones was third, valued at $215 billion.

Notably, the value of GPS technology to the US economy is growing. According to the study, 90 percent of the technology’s financial impact has come since just 2010, or just 20 percent of the study period. Some sectors of the economy are only beginning to realize the value of GPS technology, or are identifying new uses for it, the report says, indicating that its value as a platform for innovation will continue to grow.

Outage impact

In the case of some adverse event leading to a widespread outage, the study estimates that the loss of GPS service would have a $1 billion per-day impact, although the authors acknowledge this is at best a rough estimate. It would likely be higher during the planting season of April and May, when farmers are highly reliant on GPS technology for information about their fields.

To assess the effect of an outage, the study looked at several different variables. Among them was “precision timing” that enables a number of wireless services, including the synchronization of traffic between carrier networks, wireless handoff between base stations, and billing management. Moreover, higher levels of precision timing enable higher bandwidth and provide access to more devices. (For example, the implementation of 4G LTE technology would have been impossible without GPS technology)….(More)”