big data

Seeing Cities Through Big Data

Curated on October 10, 2016August 3, 2018 by Stefaan Verhulst

Book edited by Thakuriah, Piyushimita (Vonu), Tilahun, Nebiyou, and Zellner, Moira: “… introduces the latest thinking on the use of Big Data in the context of urban systems, including research and insights on human behavior, urban dynamics, resource use, sustainability and spatial disparities, where it promises improved planning, management and governance in the urban sectors (e.g., transportation, energy, smart cities, crime, housing, urban and regional economies, public health, public engagement, urban governance and political systems), as well as Big Data’s utility in decision-making, and development of indicators to monitor economic and social activity, and for urban sustainability, transparency, livability, social inclusion, place-making, accessibility and resilience…(More)”

The challenges and limits of big data algorithms in technocratic governance

Curated on October 6, 2016August 3, 2018 by Stefaan Verhulst

Paper by Marijn Janssen and George Kuk in Government Information Quarterly: “Big data is driving the use of algorithm in governing mundane but mission-critical tasks. Algorithms seldom operate on their own and their (dis)utilities are dependent on the everyday aspects of data capture, processing and utilization. However, as algorithms become increasingly autonomous and invisible, they become harder for the public to detect and scrutinize their impartiality status. Algorithms can systematically introduce inadvertent bias, reinforce historical discrimination, favor a political orientation or reinforce undesired practices. Yet it is difficult to hold algorithms accountable as they continuously evolve with technologies, systems, data and people, the ebb and flow of policy priorities, and the clashes between new and old institutional logics. Greater openness and transparency do not necessarily improve understanding. In this editorial we argue that through unraveling the imperceptibility, materiality and governmentality of how algorithms work, we can better tackle the inherent challenges in the curatorial practice of data and algorithm. Fruitful avenues for further research on using algorithm to harness the merits and utilities of a computational form of technocratic governance are presented….(More)”

Helping Smart Cities Harness Big Data

Curated on October 3, 2016August 3, 2018 by Stefaan Verhulst

Linda Poon in CityLab: “Harnessing the power of open data is key to developing the smart cities of the future. But not all governments have the capacity—be that funding or human capital—to collect all the necessary information and turn it into a tool. That’s where Mapbox comes in.

Mapbox offers open-source mapping platforms, and is no stranger to turning complex data into visualizations cities can use, whether it’s mapping traffic fatalities in the U.S. or the conditions of streets in Washington, D.C., during last year’s East Coast blizzard. As part of the White House Smart Cities Initiative, which announced this week that it would make more than $80 million in tech investments this year, the company is rolling out Mapbox Cities, a new “mentorship” program that, for now, will give three cities the tools and support they need to solve some of their most pressing urban challenges. It issued a call for applications earlier this week, and responses have poured in from across the globe says Christina Franken, who specializes in smart cities at Mapbox.

“It’s very much an experimental approach to working with cities,” she says. “A lot of cities have open-data platforms but they don’t really do something with the data. So we’re trying to bridge that gap.”

During Hurricane Sandy, Mapbox launched a tool to help New Yorkers figure out if they were in an evacuation zone. (Mapbox)

But the company isn’t approaching the project blindly. In a way, Mapbox has the necessary experience to help cities jumpstart their own projects. Its resume includes, for example, a map that visualizes the sheer quantity of traffic fatalities along any commuting route in the U.S., showcasing its ability to turn a whopping five years’ worth of data into a public-safety tool. During 2012’s Hurricane Sandy, they created a disaster-relief tool to help New Yorkers find shelter.

And that’s just in the United States. Mapbox recently also started a group focusing primarily on humanitarian issues and bringing their mapping and data-collecting tools to aid organizations all over the world in times of crisis. It provides free access to its vast collection of resources, and works closely with collaborators to help them customize maps based on specific needs….(More)”

Social Machines: The Coming Collision of Artificial Intelligence, Social Networking, and Humanity

Curated on September 27, 2016August 3, 2018 by Stefaan Verhulst

Book by James Hendler and Alice Mulvehill: “Will your next doctor be a human being—or a machine? Will you have a choice? If you do, what should you know before making it?

This book introduces the reader to the pitfalls and promises of artificial intelligence in its modern incarnation and the growing trend of systems to “reach off the Web” into the real world. The convergence of AI, social networking, and modern computing is creating an historic inflection point in the partnership between human beings and machines with potentially profound impacts on the future not only of computing but of our world.

AI experts and researchers James Hendler and Alice Mulvehill explore the social implications of AI systems in the context of a close examination of the technologies that make them possible. The authors critically evaluate the utopian claims and dystopian counterclaims of prognosticators. Social Machines: The Coming Collision of Artificial Intelligence, Social Networking, and Humanity is your richly illustrated field guide to the future of your machine-mediated relationships with other human beings and with increasingly intelligent machines.

What you’ll learn

• What the concept of a social machine is and how the activities of non-programmers are contributing to machine intelligence• How modern artificial intelligence technologies, such as Watson, are evolving and how they process knowledge from both carefully produced information (such as Wikipedia or journal articles) and from big data collections

• The fundamentals of neuromorphic computing

• The fundamentals of knowledge graph search and linked data as well as the basic technology concepts that underlie networking applications such as Facebook and Twitter

• How the change in attitudes towards cooperative work on the Web, especially in the younger demographic, is critical to the future of Web applications…(More)”

Informed Choice? Motivations and methods of data usage among public officials in India

Curated on September 27, 2016October 24, 2018 by Stefaan Verhulst

Report by Rwitwika Bhattacharya and Mohitkumar Daga: “The importance of data in informing the policy-making process is being increasingly realized across the world. With India facing significant developmental challenges, use of data offers an important opportunity to improve the quality of public services. However, lack of formal structures to internalize a data-informed decision-making process impedes the path to robust policy formation. This paper seeks to highlight these challenges through a case study of data dashboard implementation in the state of Andhra Pradesh. The study suggests the importance of capacity building, improvement of data collection and engagement of non-governmental players as measures to address issues….(More)”

Twitter, UN Global Pulse announce data partnership

Curated on September 27, 2016July 19, 2019 by Stefaan Verhulst

PressRelease: “Twitter and UN Global Pulse today announced a partnership that will provide the United Nations with access to Twitter’s data tools to support efforts to achieve the Sustainable Development Goals, which were adopted by world leaders last year.

Every day, people around the world send hundreds of millions of Tweets in dozens of languages. This public data contains real-time information on many issues including the cost of food, availability of jobs, access to health care, quality of education, and reports of natural disasters. This partnership will allow the development and humanitarian agencies of the UN to turn these social conversations into actionable information to aid communities around the globe.

“The Sustainable Development Goals are first and foremost about people, and Twitter’s unique data stream can help us truly take a real-time pulse on priorities and concerns — particularly in regions where social media use is common — to strengthen decision-making. Strong public-private partnerships like this show the vast potential of big data to serve the public good,” said Robert Kirkpatrick, Director of UN Global Pulse.

“We are incredibly proud to partner with the UN in support of the Sustainable Development Goals,” said Chris Moody, Twitter’s VP of Data Services. “Twitter data provides a live window into the public conversations that communities around the world are having, and we believe that the increased potential for research and innovation through this partnership will further the UN’s efforts to reach the Sustainable Development Goals.”

Organizations and business around the world currently use Twitter data in many meaningful ways, and this unique data source enables them to leverage public information at scale to better inform their policies and decisions. These partnerships enable innovative uses of Twitter data, while protecting the privacy and safety of Twitter users.

UN Global Pulse’s new collaboration with Twitter builds on existing R&D that has shown the power of social media for social impact, like measuring the impact of public health campaigns, tracking reports of rising food prices, or prioritizing needs after natural disasters….(More)”

Beware of the gaps in Big Data

Curated on September 21, 2016August 3, 2018 by Stefaan Verhulst

Edd Gent at E&T: “When the municipal authority in charge of Boston, Massachusetts, was looking for a smarter way to find which roads it needed to repair, it hit on the idea of crowdsourcing the data. The authority released a mobile app called Street Bump in 2011 that employed an elegantly simple idea: use a smartphone’s accelerometer to detect jolts as cars go over potholes and look up the location using the Global Positioning System. But the approach ran into a pothole of its own.The system reported a disproportionate number of potholes in wealthier neighbourhoods. It turned out it was oversampling the younger, more affluent citizens who were digitally clued up enough to download and use the app in the first place. The city reacted quickly, but the incident shows how easy it is to develop a system that can handle large quantities of data but which, through its own design, is still unlikely to have enough data to work as planned.

As we entrust more of our lives to big data analytics, automation problems like this could become increasingly common, with their errors difficult to spot after the fact. Systems that ‘feel like they work’ are where the trouble starts.

Harvard University professor Gary King, who is also founder of social media analytics company Crimson Hexagon, recalls a project that used social media to predict unemployment. The model was built by correlating US unemployment figures with the frequency that people used words like ‘jobs’, ‘unemployment’ and ‘classifieds’. A sudden spike convinced researchers they had predicted a big rise in joblessness, but it turned out Steve Jobs had died and their model was simply picking up posts with his name. “This was an example of really bad analytics and it’s even worse because it’s the kind of thing that feels like it should work and does work a little bit,” says King.

Big data can shed light on areas with historic information deficits, and systems that seem to automatically highlight the best course of action can be seductive for executives and officials. “In the vacuum of no decision any decision is attractive,” says Jim Adler, head of data at Toyota Research Institute in Palo Alto. “Policymakers will say, ‘there’s a decision here let’s take it’, without really looking at what led to it. Was the data trustworthy, clean?”…(More)”

Data Love: The Seduction and Betrayal of Digital Technologies

Curated on September 14, 2016August 3, 2018 by Stefaan Verhulst

Book by Roberto Simanowski: “Intelligence services, government administrations, businesses, and a growing majority of the population are hooked on the idea that big data can reveal patterns and correlations in everyday life. Initiated by software engineers and carried out through algorithms, the mining of big data has sparked a silent revolution. But algorithmic analysis and data mining are not simply byproducts of media development or the logical consequences of computation. They are the radicalization of the Enlightenment’s quest for knowledge and progress. Data Love argues that the “cold civil war” of big data is taking place not among citizens or between the citizen and government but within each of us.

Roberto Simanowski elaborates on the changes data love has brought to the human condition while exploring the entanglements of those who―out of stinginess, convenience, ignorance, narcissism, or passion―contribute to the amassing of ever more data about their lives, leading to the statistical evaluation and individual profiling of their selves. Writing from a philosophical standpoint, Simanowski illustrates the social implications of technological development and retrieves the concepts, events, and cultural artifacts of past centuries to help decode the programming of our present….(More)”

National Transit Map Seeks to Close the Transit Data Gap

Curated on September 14, 2016October 24, 2018 by Stefaan Verhulst

Ben Miller at GovTech: “In bringing together the first ever map illustrating the nation’s transit system, the U.S. Department of Transportation isn’t just making data more accessible — it’s also aiming to modernize data collection and dissemination for many of the country’s transit agencies.

With more than 10,000 routes and 98,000 stops represented, the National Transit Map is already enormous. But Dan Morgan, chief data officer of the department, says it’s not enough. When measuring vehicles operated in maximum service — a metric illustrating peak service at a transit agency — the National Transit Map captures only about half of all transit in the U.S.

“Not all of these transit agencies have this data available,” Morgan said, “so this is an ongoing project to really close the transit data gap.”Which is why, in the process of building out the map, the DOT is working with transit agencies to make their data available.

Which is why, in the process of building out the map, the DOT is working with transit agencies to make their data available.

On the whole, transit data is easier to collect and process than a lot of transportation data because many agencies have adopted a standard called General Transit Feed Specification (GTFS) that applies to schedule-related data. That’s what made the National Transit Map an easy candidate for completion, Morgan said.

But as popular as GTFS has become, many agencies — especially smaller ones — haven’t been able to use it. The tools to convert to GTFS come with a learning curve.

“It’s really a matter of priority and availability of resources,” he said.

Bringing those agencies into the mainstream is important to achieving the goals of the map. In the map, Morgan said he sees an opportunity to achieve a new level of clarity where it has never existed before.

That’s because transit has long suffered from difficulty in seeing its own history. Transit officials can describe their systems as they exist, but looking at how they got there is trickier.

“There’s no archive,” Morgan said, “there’s no picture of how transit changes over time.”

And that’s a problem for assessing what works and what doesn’t, for understanding why the system operates the way it does and how it responds to changes. …(More)”

Recent Developments in Open Data Policy

Curated on September 14, 2016May 29, 2019 by Stefaan Verhulst

Presentation by Paul Uhlir: “Several International organizations have issued policy statements on open data policies in the past two years. This presentation provides an overview of those statements and their relevance to developing countries.

International Statements on Open Data Policy

Open data policies have become much more supported internationally in recent years. Policy statements in just the most recent 2014-2016 period that endorse and promote openness to research data derived from public funding include: the African Data Consensus (UNECA 2014); the CODATA Nairobi Principles for Data Sharing for Science and Development in Developing Countries (PASTD 2014); the Hague Declaration on Knowledge Discovery in the Digital Age (LIBER 2014); Policy Guidelines for Open Access and Data Dissemination and Preservation (RECODE 2015); Accord on Open Data in a Big Data World (Science International 2015). This presentation will present the principal guidelines of these policy statements.

The Relevance of Open Data from Publicly Funded Research for Development

There are many reasons that publicly funded research data should be made as freely and openly available as possible. Some of these are noted here, although many other benefits are possible. For research, it is closing the gap with more economically developed countries, making researchers more visible on the web, enhancing their collaborative potential, and linking them globally. For educational benefits, open data assists greatly in helping students learn how to do data science and to manage data better. From a socioeconomic standpoint, open data policies have been shown to enhance economic opportunities and to enable citizens to improve their lives in myriad ways. Such policies are more ethical in allowing access to those that have no means to pay and not having to pay for the data twice—once through taxes to create the data in the first place and again at the user level . Finally, access to factual data can improve governance, leading to better decision making by policymakers, improved oversight by constituents, and digital repatriation of objects held by former colonial powers.

Some of these benefits are cited directly in the policy statements themselves, while others are developed more fully in other documents (Bailey Mathae and Uhlir 2012, Uhlir 2015). Of course, not all publicly funded data and information can be made available and there are appropriate reasons—such as the protection of national security, personal privacy, commercial concerns, and confidentiality of all kinds—that make the withholding of them legal and ethical. However, the default rule should be one of openness, balanced against a legitimate reason not to make the data public….(More)”