International Open Data Roadmap


IODC16: We have entered the next phase in the evolution of the open data movement. Just making data publicly available can no longer be the beginning and end of every conversation about open data. The focus of the movement is now shifting to building open data communities, and an increasingly sophisticated network of communities has begun to make data truly useful in addressing the myriad problems facing citizens and their governments around the world:

  • More than 40 national and local governments have already committed to implement the principles of the International Open Data Charter;
  • Open data is central to many commitments made this year by world leaders, including the Sustainable Development Goals (SDGs), the Paris Climate Agreement, and the G20 Anti Corruption Data Principles; and
  • Open data is also an increasingly local issue, as hundreds of cities and sub-national governments implement open data policies to drive transparency, economic growth, and service delivery in close collaboration with citizens.

To further accelerate collaboration and increase the impact of open data activities globally, the Government of Spain, the International Development Research Centre, the World Bank, and the Open Data for Development Network recently hosted the fourth International Open Data Conference (IODC) on October 6-7, 2016 in Madrid, Spain.

Under the theme of Global Goals, Local Impact, the fourth IODC reconvened an ever expanding open data community to showcase best practices, confront shared challenges, and deepen global and regional collaboration in an effort to maximize the impact of open data. Supported by a full online archive of the 80+ sessions and 20+ special events held in Madrid during the first week of October 2016, this report reflects on the discussions and debates that took place, as well as the information shared on a wide range of vibrant global initiatives, in order to map out the road ahead, strengthen cohesion among existing efforts, and explore new ways to use open data to drive social and economic inclusion around the world….(More)”

Empirical data on the privacy paradox


Benjamin Wittes and Emma Kohse at Brookings: “The contemporary debate about the effects of new technology on individual privacy centers on the idea that privacy is an eroding value. The erosion is ongoing and takes place because of the government and big corporations that collect data on us all: In the consumer space, technology and the companies that create it erode privacy, as consumers trade away their solitude either unknowingly or in exchange for convenience and efficiency.

On January 13, we released a Brookings paper that challenges this idea. In the paper, entitled “The Privacy Paradox II: Measuring the Privacy Benefits of Privacy Threats,” we try to measure the extent to which this focus ignores the significant privacy benefits of the technologies that concern privacy advocates. And we conclude that quantifiable effects in consumer behavior strongly support the reality of these benefits.

In 2015, one of us, writing with Jodie Liu, laid out the basic idea in a paper published by Brookings called “The Privacy Paradox: the Privacy Benefits of Privacy Threats.” (The title, incidentally, became the name of Lawfare’s privacy-oriented subsidiary page.) Individuals, we argued, might be more concerned with keeping private information from specific people—friends, neighbors, parents, or even store clerks—than from large, remote corporations, and they might actively prefer to give information to remote corporations as a way of shielding it from those immediately around them. By failing to associate this concern with the concept of privacy, academic and public debates tend to ignore the countervailing privacy benefits associated with privacy threats, and thereby keep score in a way biased toward the threats side of the ledger.

To cite a few examples, an individual may choose to use a Kindle e-reader to read Fifty Shades of Grey precisely because she values the privacy benefit of hiding her book choice from the eyes of people on the bus or the store clerk at the book store, rather than for reasons of mere convenience. This privacy benefit, for many consumers, can outweigh the privacy concern presented by Amazon’s data mining. At the very least, the privacy benefits of the Kindle should enter into the discussion.

In this paper, we tried to begin measuring the effect and testing the reasoning that supported the thesis of “The Privacy Paradox,” using Google Surveys, an online survey tool….(More)”.

Open Data Inventory 2016


Open Data Watch is pleased to announce the release of the 2016 Open Data Inventory (ODIN). The new ODIN results provide a comprehensive review of the coverage and openness of official statistics in 173 countries around the world, including most OECD countries.  Featuring a methodology updated to reflect the latest international open data standards, ODIN 2016 results are fully available online at odin.opendatawatch.com, including interactive functions to compare year-to-year results from 122 countries.

ODIN assesses the coverage and openness of data provided on the websites maintained by national statistical offices (NSOs). The overall ODIN score is an indicator of how complete and open an NSO’s data offerings are. In addition to ratings of coverage and openness in twenty statistical categories, ODIN assessments provide the online location of key indicators in each data category, permitting quick access to hundreds of indicators.

ODIN 2016 Top Scores Reveal Gaps Between Openness and Coverage

In the 2016 round, the top scores went to high-income and OECD countries. Sweden was ranked first overall with a score of 81. Sweden also had the most open site, with an openness score of 91. Among non-OECD countries, Lithuania ranked highest, with an overall score of 77. Among non-high-income countries, Mexico again earned the highest ranking with a score of 67, followed by the lower-middle-income economies of Mongolia (61) and Moldova (59). Among low-income countries, Rwanda received the highest score of 55. ODIN overall scores are scaled from 0 to 100 and provide equal weighting for social, economic, and environmental statistics….

The new ODIN website allows users to compare and download scores for 2015 and 2016….(More)”

Open Traffic Data to Revolutionize Transport


World Bank: “Congestion in metropolitan Manila costs the economy more than $60 million per day, and it is not atypical to spend more than 2 hours to travel 8 km during the evening commute there. But beyond these statistics, until recently, very little was actually known about Manila’s congestion, because the equipment and manpower required to collect traffic data has far exceeded available resources. Most cities in developing countries face similar issues.

Traditional methods of collecting traffic data rely either on labor-intensive fieldwork or capital-intensive sensor data networks. The former is slow and results in low-quality data, and the latter requires substantial capital and maintenance outlays, while only covering a small portion of a metropolitan area. In the era of big data, shouldn’t we be able to do better?

Responding to this need, Easy Taxi, Grab, and Le.Taxi, three ridesharing companies—which, combined, cover more than 30 countries and millions of customers—are working with the World Bank and partners to make traffic data derived from their drivers’ GPS streams available to the public through an open data license. Through the new Open Transport Partnership, these companies, along with founding members Mapzen, the World Resources Institute, Miovision, and NDrive, will empower resource-constrained transport agencies to make better, evidence-based decisions that previously had been out of reach.
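The core idea—deriving congestion measures from drivers’ GPS streams rather than roadside sensors—can be sketched in a few lines: consecutive fixes from the same driver yield a distance and a duration, and averaging those ratios per road segment estimates its typical speed. The snippet below is an illustrative sketch only, not the partnership’s actual pipeline; the ping tuple format and the `segment_id` field are assumptions made for the example.

```python
from collections import defaultdict
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two GPS fixes, in kilometres."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

def segment_speeds(pings):
    """Average speed (km/h) per road segment from consecutive driver pings.

    `pings`: list of (driver_id, segment_id, timestamp_s, lat, lon) tuples,
    assumed sorted by driver and then by time.
    """
    speeds = defaultdict(list)
    for (d1, seg, t1, la1, lo1), (d2, _, t2, la2, lo2) in zip(pings, pings[1:]):
        if d1 != d2 or t2 <= t1:
            continue  # pair spans two drivers, or timestamps are bad: skip it
        km = haversine_km(la1, lo1, la2, lo2)
        speeds[seg].append(km / ((t2 - t1) / 3600.0))
    # Aggregating across many drivers is also what protects individual privacy:
    # only the per-segment averages, not raw traces, need to be published.
    return {seg: sum(v) / len(v) for seg, v in speeds.items()}
```

In practice such a pipeline would also need map-matching (snapping raw fixes to road segments) and outlier filtering, but even this toy version shows why a fleet of existing GPS-equipped vehicles can stand in for a capital-intensive sensor network.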

Issues that this data will help address include, among others, traffic signal timing plans, public transit provision, roadway infrastructure needs, emergency traffic management, and travel demand management. According to Alyssa Wright, president of the US OpenStreetMap Foundation, the partnership “seeks to improve the efficiency and efficacy of global transportation use and provision through open data and capacity building.” …(More)

See also http://opentraffic.io/

Open data for democracy: Developing a theoretical framework for open data use


Erna Ruijer, Stephan Grimmelikhuijsen, and Albert Meijer in Government Information Quarterly: “Open data platforms are hoped to foster democratic processes, yet recent empirical research shows that so far they have failed to do so. We argue that current open data platforms do not take into account the complexity of democratic processes which results in overly simplistic approaches to open data platform design. Democratic processes are multifaceted and open data can be used for various purposes, with diverging roles, rules and tools by citizens and public administrators. This study develops a Democratic Activity Model of Open Data Use, which is illustrated by an exploratory qualitative multiple case study outlining three democratic processes: monitorial, deliberative and participatory. We find that each type of democratic process requires a different approach and open data design. We conclude that a context-sensitive open data design facilitates the transformation of raw data into meaningful information constructed collectively by public administrators and citizens….(More)”

Fighting Ebola with information


Larissa Fast and Adele Waugaman at Global Innovation Exchange: “What can be learned from the use of data, information, and digital technologies, such as mobile-based systems and internet connectivity, during the Ebola outbreak response in West Africa? What worked, what didn’t, and how can we apply these lessons to improve data and information flows in the future? This report details key findings and recommendations about the collection, management, analysis, and use of paper-based and digital data and information, drawing upon the insights of more than 130 individuals and organizations who worked tirelessly to end the Ebola outbreak in West Africa in 2014 and 2015….(More)”

The Emergence of a Post-Fact World


Francis Fukuyama in Project Syndicate: “One of the more striking developments of 2016 and its highly unusual politics was the emergence of a “post-fact” world, in which virtually all authoritative information sources were called into question and challenged by contrary facts of dubious quality and provenance.

The emergence of the Internet and the World Wide Web in the 1990s was greeted as a moment of liberation and a boon for democracy worldwide. Information constitutes a form of power, and to the extent that information was becoming cheaper and more accessible, democratic publics would be able to participate in domains from which they had been hitherto excluded.

The development of social media in the early 2000s appeared to accelerate this trend, permitting the mass mobilization that fueled various democratic “color revolutions” around the world, from Ukraine to Burma (Myanmar) to Egypt. In a world of peer-to-peer communication, the old gatekeepers of information, largely seen to be oppressive authoritarian states, could now be bypassed.

While there was some truth to this positive narrative, another, darker one was also taking shape. Those old authoritarian forces were responding in dialectical fashion, learning to control the Internet, as in China, with its tens of thousands of censors, or, as in Russia, by recruiting legions of trolls and unleashing bots to flood social media with bad information. These trends all came together in a hugely visible way during 2016, in ways that bridged foreign and domestic politics….

The traditional remedy for bad information, according to freedom-of-information advocates, is simply to put out good information, which in a marketplace of ideas will rise to the top. This solution, unfortunately, works much less well in a social-media world of trolls and bots. There are estimates that as many as a third to a quarter of Twitter users fall into this category. The Internet was supposed to liberate us from gatekeepers; and, indeed, information now comes at us from all possible sources, all with equal credibility. There is no reason to think that good information will win out over bad information….

The inability to agree on the most basic facts is the direct product of an across-the-board assault on democratic institutions – in the US, in Britain, and around the world. And this is where the democracies are headed for trouble. In the US, there has in fact been real institutional decay, whereby powerful interest groups have been able to protect themselves through a system of unlimited campaign finance. The primary locus of this decay is Congress, and the bad behavior is for the most part as legal as it is widespread. So ordinary people are right to be upset.

And yet, the US election campaign has shifted the ground to a general belief that everything has been rigged or politicized, and that outright bribery is rampant. If the election authorities certify that your favored candidate is not the victor, or if the other candidate seemed to perform better in a debate, it must be the result of an elaborate conspiracy by the other side to corrupt the outcome. The belief in the corruptibility of all institutions leads to a dead end of universal distrust. American democracy, all democracy, will not survive a lack of belief in the possibility of impartial institutions; instead, partisan political combat will come to pervade every aspect of life….(More)”

The social data revolution will be crowdsourced


Nicholas B. Adams at SSRC Parameters: “It is now abundantly clear to librarians, archivists, computer scientists, and many social scientists that we are in a transformational age. If we can understand and measure meaning from all of these data describing so much of human activity, we will finally be able to test and revise our most intricate theories of how the world is socially constructed through our symbolic interactions….

We cannot write enough rules to teach a computer to read like us. And because the social world is not a game per se, we can’t design a reinforcement-learning scenario teaching a computer to “score points” and just “win.” But AlphaGo’s example does show a path forward. Recall that much of AlphaGo’s training came in the form of supervised machine learning, where humans taught it to play like them by showing the machine how human experts played the game. Already, humans have used this same supervised learning approach to teach computers to classify images, identify parts of speech in text, or categorize inventories into various bins. Without writing any rules, simply by letting the computer guess, then giving it human-generated feedback about whether it guessed right or wrong, humans can teach computers to label data as we do. The problem is (or has been): humans label textual data slowly—very, very slowly. So, we have generated precious little data with which to teach computers to understand natural language as we do. But that is going to change….
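The guess-then-feedback loop described above can be illustrated with a toy word-count classifier: human-labeled snippets supply the supervision, and the machine labels new text by vocabulary overlap, with no hand-written rules. This is a minimal sketch of the general idea, not the author’s system; the snippets and label names are invented for the example.

```python
from collections import Counter, defaultdict

def tokenize(text):
    return text.lower().split()

def train(labeled_examples):
    """Learn per-label word counts from human-labeled text snippets."""
    counts = defaultdict(Counter)
    for text, label in labeled_examples:
        counts[label].update(tokenize(text))
    return counts

def predict(counts, text):
    """Guess the label whose learned vocabulary best overlaps the new snippet."""
    words = tokenize(text)
    def score(label):
        total = sum(counts[label].values())
        return sum(counts[label][w] / total for w in words)
    return max(counts, key=score)

# Hypothetical human-labeled snippets: this is the "supervision".
examples = [
    ("the council approved the new zoning permit", "government"),
    ("protesters marched downtown holding signs", "protest"),
    ("the mayor signed the ordinance into law", "government"),
    ("the crowd chanted outside city hall", "protest"),
]
model = train(examples)
print(predict(model, "the council signed a permit"))  # → government
```

Real systems use far richer models, but the division of labor is the same: humans do the slow, meaning-laden labeling, and the statistics generalize from it—which is why speeding up human labeling, as the author proposes, matters so much.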

The single greatest factor dilating the duration of such large-scale text-labeling projects has been workforce training and turnover…. The key to organizing work for the crowd, I had learned from talking to computer scientists, was task decomposition. The work had to be broken down into simple pieces that any (moderately intelligent) person could do through a web interface without requiring face-to-face training. I knew from previous experiments with my team that I could not expect a crowd worker to read a whole article, or to know our whole conceptual scheme defining everything of potential interest in those articles. Requiring either or both would be asking too much. But when I realized that my conceptual scheme could actually be treated as multiple smaller conceptual schemes, the idea came to me: Why not have my RAs identify units of text that corresponded with the units of analysis of my conceptual scheme? Then, crowd workers reading those much smaller units of text could just label them according to a smaller sub-scheme. Moreover, I came to realize, we could ask them leading questions about the text to elicit information about the variables and attributes in the scheme, so they wouldn’t have to memorize the scheme either. By having them highlight the words justifying their answers, they would be labeling text according to our scheme without any face-to-face training. Bingo….

This approach promises more, too. The databases generated by crowd workers, citizen scientists, and students can also be used to train machines to see in social data what we humans see comparatively easily. Just as AlphaGo learned from humans how to play a strategy game, our supervision can also help it learn to see the social world in textual or video data. The final products of social data analysis assembly lines, therefore, are not merely rich and massive databases allowing us to refine our most intricate, elaborate, and heretofore data-starved theories; they are also computer algorithms that will do most or all social data labeling in the future. In other words, whether we know it or not, we social scientists hold the key to developing artificial intelligences capable of understanding our social world….

At stake is a social science with the capacity to quantify and qualify so many of our human practices, from the quotidian to mythic, and to lead efforts to improve them. In decades to come, we may even be able to follow the path of other mature sciences (including physics, biology, and chemistry) and shift our focus toward engineering better forms of sociality. All the more so because it engages the public, a crowd-supported social science could enlist a new generation in the confident and competent re-construction of society….(More)”

The Power of Networks: Six Principles That Connect Our Lives


Book by Christopher Brinton and Mung Chiang: “What makes WiFi faster at home than at a coffee shop? How does Google order search results? Why do Amazon, Netflix, and YouTube use fundamentally different rating and recommendation methods—and why does it matter? Is it really true that everyone on Facebook is connected in six steps or less? And how do cat videos—or anything else—go viral? The Power of Networks answers questions like these for the first time in a way that all of us can understand and use, whether at home, the office, or school. Using simple language, analogies, stories, hundreds of illustrations, and no more math than simple addition and multiplication, Christopher Brinton and Mung Chiang provide a smart but accessible introduction to the handful of big ideas that drive the technical and social networks we use every day—from cellular phone networks and cloud computing to the Internet and social media platforms.

The Power of Networks unifies these ideas through six fundamental principles of networking, which explain the difficulties in sharing network resources efficiently, how crowds can be wise or not so wise depending on the nature of their connections, how there are many building-blocks of layers in a network, and more. Understanding these simple ideas unlocks the workings of everything from the connections we make on Facebook to the technology that runs such platforms. Along the way, the authors also talk with and share the special insights of renowned experts such as Google’s Eric Schmidt, former Verizon Wireless CEO Dennis Strigl, and “fathers of the Internet” Vint Cerf and Bob Kahn….(More)”

#Republic: Divided Democracy in the Age of Social Media


Book by Cass Sunstein: “As the Internet grows more sophisticated, it is creating new threats to democracy. Social media companies such as Facebook can sort us ever more efficiently into groups of the like-minded, creating echo chambers that amplify our views. It’s no accident that on some occasions, people of different political views cannot even understand each other. It’s also no surprise that terrorist groups have been able to exploit social media to deadly effect.

Welcome to the age of #Republic.

In this revealing book, Cass Sunstein, the New York Times bestselling author of Nudge and The World According to Star Wars, shows how today’s Internet is driving political fragmentation, polarization, and even extremism—and what can be done about it.

Thoroughly rethinking the critical relationship between democracy and the Internet, Sunstein describes how the online world creates “cybercascades,” exploits “confirmation bias,” and assists “polarization entrepreneurs.” And he explains why online fragmentation endangers the shared conversations, experiences, and understandings that are the lifeblood of democracy.

In response, Sunstein proposes practical and legal changes to make the Internet friendlier to democratic deliberation. These changes would get us out of our information cocoons by increasing the frequency of unchosen, unplanned encounters and exposing us to people, places, things, and ideas that we would never have picked for our Twitter feed.

#Republic need not be an ironic term. As Sunstein shows, it can be a rallying cry for the kind of democracy that citizens of diverse societies most need….(More)”