Can Big Data Help Measure Inflation?


Bourree Lam in The Atlantic: “…As more and more people are shopping online, calculating this index has gotten more difficult, because there haven’t been any great ways of recording prices from the sites of disparate retailers. Data shared by retailers and compiled by the technology firm Adobe might help close this gap. The company is perhaps best known for its visual software, including Photoshop, but it has also become a provider of software and analytics for online retailers. Adobe is now aggregating the sales data that flows through its software for its Digital Price Index (DPI) project, an initiative that’s meant to answer some of the questions that have been dogging researchers now that e-commerce is such a big part of the economy.

The project, which tracks billions of online transactions and the prices of over a million products, was developed with the help of the economists Austan Goolsbee, the former chairman of Obama’s Council of Economic Advisors and a professor at the University of Chicago’s Booth School of Business, and Peter Klenow, a professor at Stanford University. “We’ve been excited to help them set up various measures of the digital economy, and of prices, and also to see what the Adobe data can teach us about some of the questions that everybody’s had about the CPI,” says Goolsbee. “People are asking questions like ‘How price sensitive is online commerce?’ ‘How much is it growing?’ ‘How substitutable is it for non-electronic commerce?’ A lot of issues you can address with this in a way that we haven’t really been able to do before.” These are some questions that the DPI has the potential to answer.

…While this new trove of data will certainly be helpful to economists and analysts looking at inflation, it surely won’t replace the CPI. Currently, the government sends out hundreds of BLS employees to stores around the country to collect price data. Online prices are a small part of the BLS calculation; they are incorporated into the methodology as people increasingly report shopping from online retailers, but with a significant time lag. While it’s unlikely that the BLS would incorporate private sources of data into its inflation calculations, as e-commerce grows it might look to improve the way it includes online prices. Still, economists are optimistic about the potential of Adobe’s DPI. “I don’t think we know the digital economy as well as we should,” says Klenow, “and this data can help us eventually nail that better.”…(More)”
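For a sense of the mechanics, here is a deliberately simplified sketch of a chained, matched-model price index computed from transaction-level data. It is illustrative only: neither Adobe’s DPI nor the BLS methodology works exactly this way, and the products, months and prices below are made up.

```python
import pandas as pd

# Hypothetical transaction-level price observations: product, month, price.
prices = pd.DataFrame({
    "product": ["laptop", "laptop", "laptop", "tv", "tv", "tv"],
    "month":   ["2016-01", "2016-02", "2016-03"] * 2,
    "price":   [1000.0, 980.0, 950.0, 500.0, 510.0, 505.0],
})

# Matched-model index: for each pair of consecutive months, take the geometric
# mean of price relatives over products observed in both months, then chain the
# month-to-month changes into an index (base month = 100).
wide = prices.pivot(index="product", columns="month", values="price")
months = sorted(wide.columns)
index = [100.0]
for prev, cur in zip(months, months[1:]):
    matched = wide[[prev, cur]].dropna()       # products priced in both months
    relatives = matched[cur] / matched[prev]   # month-over-month price relatives
    index.append(index[-1] * relatives.prod() ** (1 / len(relatives)))

print(dict(zip(months, index)))  # index level per month, starting at 100
```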

Cities, Data, and Digital Innovation


Paper by Mark Kleinman: “Developments in digital innovation and the availability of large-scale data sets create opportunities for new economic activities and new ways of delivering city services while raising concerns about privacy. This paper defines the terms Big Data, Open Data, Open Government, and Smart Cities and uses two case studies – London (U.K.) and Toronto – to examine questions about using data to drive economic growth, improve the accountability of government to citizens, and offer more digitally enabled services. The paper notes that London has been one of a handful of cities at the forefront of the Open Data movement and has been successful in developing its high-tech sector, although it has so far been less innovative in the use of “smart city” technology to improve services and lower costs. Toronto has also made efforts to harness data, although it is behind London in promoting Open Data. Moreover, although Toronto has many assets that could contribute to innovation and economic growth, including a growing high-technology sector, world-class universities and research base, and its role as a leading financial centre, it lacks a clear narrative about how these assets could be used to promote the city. The paper draws some general conclusions about the links between data innovation and economic growth, and between open data and open government, as well as ways to use big data and technological innovation to ensure greater efficiency in the provision of city services…(More)

App turns smartphones into seismic monitors


Springwise: “MyShake is an app that enables anyone to contribute to a worldwide seismic network and help people prepare for earthquakes.

The sheer number of smartphones on the planet makes them excellent tools for collecting scientific data. We have already seen citizen scientists use their devices to help crowdsource big data about jellyfish and pollution. Now, MyShake is an Android app from UC Berkeley that enables anyone to contribute to a worldwide seismic network and help reduce the effects of earthquakes.

To begin, users download the app and enable it to run silently in the background of their smartphone. The app monitors for movement that fits the vibrational profile of an earthquake and sends anonymous information to a central system whenever relevant. The crowdsourced data enables the system to confirm an impending quake and estimate its origin time, location and magnitude. Then, the app can send warnings to those in the network who are likely to be affected by the earthquake. MyShake makes use of the fact that the average smartphone can record earthquakes larger than magnitude five within 10 km.
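The excerpt describes the detection step only in outline. As a purely illustrative sketch (this is not MyShake’s actual classifier; the function name, window lengths and threshold are assumptions), a classic short-term-average / long-term-average (STA/LTA) trigger flags moments when the recent energy in an accelerometer trace jumps well above its background level:

```python
import numpy as np

def sta_lta_trigger(magnitude, fs, sta_win=1.0, lta_win=30.0, threshold=3.0):
    """Return sample indices where the short-term average energy exceeds
    `threshold` times the long-term average -- a simplified quake trigger."""
    sta_n, lta_n = int(sta_win * fs), int(lta_win * fs)
    energy = magnitude ** 2
    sta = np.convolve(energy, np.ones(sta_n) / sta_n, mode="same")  # recent energy
    lta = np.convolve(energy, np.ones(lta_n) / lta_n, mode="same")  # background energy
    ratio = sta / np.maximum(lta, 1e-12)
    return np.flatnonzero(ratio > threshold)

# Toy usage: 60 s of quiet accelerometer noise at 50 Hz with a burst of shaking.
fs = 50
accel = np.random.normal(0, 0.01, 60 * fs)
accel[40 * fs:42 * fs] += np.random.normal(0, 0.5, 2 * fs)
print(sta_lta_trigger(accel, fs)[:5])  # indices where the trigger first fires
```

In a crowdsourced system of this kind, a trigger like this runs on each phone, and only the flagged events, with their timestamps and locations, are sent to the central system for confirmation.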


MyShake is free to download and the team hopes to launch an iPhone version in the future….(More)”

Political Behavior and Big Data


Special issue of the International Journal of Sociology: “Interest in the use of “big data” in the social sciences is growing dramatically. Yet, adequate methodological research on what constitutes such data, and about their validity, is lacking. Scholars face both opportunities and challenges inherent in this new era of unprecedented quantification of information, including that related to political actions and attitudes. This special issue of the International Journal of Sociology addresses recent uses of “big data,” its multiple meanings, and the potential that this may have in building a stronger understanding of political behavior. We present a working definition of “big data” and summarize the major issues involved in their use. While the papers in this volume deal with various problems – how to integrate “big data” sources with cross-national survey research, the methodological challenges involved in building cross-national longitudinal network data of country memberships in international nongovernmental organizations, methods of detecting and correcting for source selection bias in event data derived from news and other online sources, the challenges and solutions to ex post harmonization of international social survey data – they share a common viewpoint. To make good on the substantive promise of “big data,” scholars need to engage with their inherent methodological problems. At this date, scholars are only beginning to identify and solve them….(More)”

Big data, meet behavioral science


At Brookings: “America’s community colleges offer the promise of a more affordable pathway to a bachelor’s degree. Students can pay substantially less for the first two years of college, transfer to a four-year college or university, and still earn their diploma in the same amount of time. At least in theory. Most community college students—80 percent of them—enter with the intention to transfer, but only 20 percent actually do so within five years of entering college. This divide represents a classic case of what behavioral scientists call an intention-action gap.

Why would so many students who enter community colleges intending to transfer fail to actually do so? Put yourself in the shoes of a 20-something community college student. You’ve worked hard for the past couple of years, earning credits and paying a lot less in tuition than you would have if you had enrolled immediately in a four-year college or university. But now you want to transfer, so that you can complete your bachelor’s degree. How do you figure out where to go? Ideally you’d probably like to find a college that would take most of your credits, where you’re likely to graduate from, and where the degree is going to count for something in the labor market. A college advisor could probably help you figure this out, but at many community colleges there are at least 1,000 other students assigned to your advisor, so you might have a hard time getting a quality meeting. Some states have articulation agreements between two- and four-year institutions that guarantee admission for students who complete certain course sequences and perform at a high enough level. But these agreements are often dense and inaccessible.

The combination of big data and behavioral insights has the potential to help students navigate these complex decisions and successfully follow through on their intentions. Big data analytic techniques allow us to identify concrete transfer pathways where students are positioned to succeed; behavioral insights ensure we communicate these options in a way that maximizes students’ engagement and responsiveness….A growing body of innovative research has demonstrated that, by applying behavioral science insights to the way we communicate with students and families about the opportunities and resources available to them, we can help people navigate these complex decisions and experience better outcomes as a result. A combination of simplified information, reminders, and access to assistance has improved achievement and attainment up and down the education pipeline, nudging parents to practice early-literacy activities with their kids or check in with their high schoolers about missed assignments, and encouraging students to renew their financial aid for college….

These types of big data techniques are already being used in some education sectors. For instance, a growing number of colleges use predictive analytics to identify struggling students who need additional assistance, so faculty and administrators can intervene before the student drops out. But frequently, once the results of these predictive analyses are in hand, too little attention is paid to how to communicate the information in a way that is likely to lead to behavior change among students or educators. And much of the predictive analytics work has been on the side of plugging leaks in the pipeline (e.g., preventing drop-outs from higher education), rather than on the side of proactively sending students and families personalized information about educational and career pathways where they are likely to flourish…(More)”
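As a concrete, hedged illustration of the predictive-analytics step described above, the sketch below fits a simple logistic regression to made-up student records and surfaces the students with the highest estimated dropout risk. Every feature, value and column name here is hypothetical; it is not any institution’s actual model.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical student records -- all features and values are made up.
df = pd.DataFrame({
    "gpa":            [3.2, 2.1, 3.8, 1.9, 2.8, 3.5, 2.0, 3.0],
    "credits_earned": [ 24,  12,  30,   9,  21,  27,  10,  18],
    "missed_classes": [  2,   9,   0,  12,   5,   1,  11,   4],
    "dropped_out":    [  0,   1,   0,   1,   0,   0,   1,   0],
})

features = ["gpa", "credits_earned", "missed_classes"]
model = LogisticRegression().fit(df[features], df["dropped_out"])

# Score every student and surface the highest-risk ones for advisors to contact.
df["dropout_risk"] = model.predict_proba(df[features])[:, 1]
print(df.sort_values("dropout_risk", ascending=False).head(3))
```

The harder problem the excerpt points to is what happens next: how the resulting risk scores are communicated to students and educators in a way that actually changes behavior.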

Ebola: A Big Data Disaster


Study by Sean Martin McDonald: “…undertaken with support from the Open Society Foundation, Ford Foundation, and Media Democracy Fund, explores the use of Big Data, in the form of Call Detail Record (CDR) data, in humanitarian crises.

It discusses the challenges of digital humanitarian coordination in health emergencies like the Ebola outbreak in West Africa, and the marked tension in the debate around experimentation with humanitarian technologies and the impact on privacy. McDonald’s research focuses on the two primary legal and human rights frameworks, privacy and property, to question the impact of the unregulated use of CDRs on human rights. It also highlights how the diffusion of data science into the realm of international development constitutes a genuine opportunity to bring powerful new tools to fight crises and emergencies.

Analysing the risks of using CDRs to perform migration analysis and contact tracing without user consent, as well as the application of big data to disease surveillance, is an important entry point into the debate around the use of Big Data for development and humanitarian aid. The paper also raises crucial questions of legal significance about access to information, the limitation of data sharing, and the concept of proportionality in privacy invasion for the public good. These issues hold great relevance today, when big data and its emerging role in development, including its actual and potential uses as well as its harms, is under consideration across the world.
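To make the migration-analysis idea concrete, here is a minimal sketch of turning call-detail records into an aggregate origin-destination flow table, the kind of output used for mobility analysis. It assumes an already-anonymised CDR extract; the column names, subscribers and region labels below are illustrative only.

```python
import pandas as pd

# Hypothetical, anonymised call-detail records: one row per call event.
cdr = pd.DataFrame({
    "subscriber": ["a", "a", "a", "b", "b", "c", "c"],
    "timestamp": pd.to_datetime([
        "2014-08-01 09:00", "2014-08-02 18:30", "2014-08-05 11:00",
        "2014-08-01 08:15", "2014-08-03 20:00",
        "2014-08-02 10:00", "2014-08-04 16:45",
    ]),
    "cell_region": ["Kailahun", "Kenema", "Kenema",
                    "Freetown", "Bo",
                    "Bo", "Bo"],
})

# Order each subscriber's events in time and compare consecutive locations.
cdr = cdr.sort_values(["subscriber", "timestamp"])
cdr["next_region"] = cdr.groupby("subscriber")["cell_region"].shift(-1)

# Keep only events where the subscriber appears to have changed region,
# then aggregate into an origin-destination flow table.
moves = cdr.dropna(subset=["next_region"])
moves = moves[moves["cell_region"] != moves["next_region"]]
flows = (moves.groupby(["cell_region", "next_region"])
              .size().rename("trips").reset_index())
print(flows)
```

Even an aggregate table like this can be sensitive, which is exactly the proportionality and consent question the paper raises.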

The paper highlights the absence of a dialogue around the significant legal risks posed by the collection, use, and international transfer of personally identifiable data and humanitarian information, and the grey areas around assumptions of public good. The paper calls for a critical discussion around the experimental nature of data modelling in emergency response, since mismanagement of information puts at risk the very human rights such efforts are meant to protect….

See Sean Martin McDonald – “Ebola: A Big Data Disaster” (PDF).

 

A machine intelligence commission for the UK


Geoff Mulgan at NESTA: “This paper makes the case for creating a Machine Intelligence Commission – a new public institution to help the development of new generations of algorithms, machine learning tools and uses of big data, ensuring that the public interest is protected.

I argue that new institutions of this kind – which can interrogate, inspect and influence technological development – are a precondition for growing informed public trust. That trust will, in turn, be essential if we are to reap the full potential public and economic benefits from new technologies. The proposal draws on lessons from fields such as human fertilisation, biotech and energy, which have shown how trust can be earned, and how new industries can be grown.  It also draws on lessons from the mistakes made in fields like GM crops and personal health data, where lack of trust has impeded progress….(More)”

Big Data Visualization: Review of 20 Tools


Edoardo L’Astorina at BluFrame: “Big Data is amazing. It describes our everyday behavior, keeps track of the places we go, stores what we like to do and how much time we spend doing our favorite activities.

Big Data is made of numbers, and I think we all agree when we say: Numbers are difficult to look at. Enter Big Data visualization….Data visualization lets you interact with data. It goes beyond analysis. Visualization brings a presentation to life. It keeps your audience’s eyes on the screen. And gets people interested….

We made everything easy for you and prepared a series of reviews that cover all the features of the best data visualization tools out there. And we divided our reviews in two sections: data visualization tools for presentations and data visualization tools for developers.

Here are reviews of our 20 best tools for Big Data visualization.

Data Visualization Tools for Presentations: Zero Coding Required:…

Tableau… is the big data visualization tool for the enterprise. Tableau lets you create charts, graphs, maps and many other graphics. A desktop app is available for visual analytics….

Infogram… lets you link your visualizations and infographics to real-time big data…

ChartBlocks… is an easy-to-use online tool that requires no coding, and builds visualizations from spreadsheets, databases… and live feeds….

Datawrapper… is aimed squarely at publishers and journalists…

Plotly…will help you create a sharp and slick chart in just a few minutes, starting from a simple spreadsheet….

RAW… boasts on its homepage to be “the missing link between spreadsheets and vector graphics”….

Visual.ly… is a visual content service….

Data Visualization Tools for Developers: JavaScript libraries

D3.js…runs on JavaScript and uses HTML, CSS and SVG. D3.js is open-source and applies data-driven transformation to a webpage and – as you can see from their examples – allows for beautiful and fast visualizations….

Ember Charts is – as the name suggests – based on the Ember.js framework and uses D3.js under the hood….

NVD3…runs on top of D3.js –surprise surprise– and aims to build re-usable charts and components….

Google Charts… runs on HTML5 and SVG and aims at Android, iOS and total cross-browser compatibility, including older Internet Explorer versions supported via VML

FusionCharts is – according to their site – the most comprehensive JavaScript charting library, and includes over 90 charts and 900 maps….

Highcharts…is a JavaScript API that integrates easily with jQuery and boasts being used by 61 out of the world’s 100 largest companies….

Chart.js…For a small chart project, Chart.js is your go-to place….

Leaflet… leverages OpenStreetMap data and adds HTML5/CSS3 visualizations and interactivity on top to ensure everything is responsive and mobile ready….

Chartist.js.. is born out of a community effort to blow all other JavaScript charting libraries out of the water…

n3-charts…is for the AngularJS lovers out there….

Sigma JS…is what you want for interactivity….

Polymaps…visualizes…. you guessed it: maps….

Processing.js …is a JavaScript library that sits on top of the Processing visual programming language…(More)
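Several of the tools above can also be driven programmatically. As a single illustrative sketch, using Plotly’s open-source Python library rather than its hosted web tool, and with made-up data, a basic interactive chart takes only a few lines:

```python
import plotly.express as px

# Made-up monthly figures, purely for illustration.
data = {
    "month": ["Jan", "Feb", "Mar", "Apr", "May", "Jun"],
    "sales": [120, 135, 128, 160, 172, 190],
}

# Build an interactive line chart; hovering shows the underlying values.
fig = px.line(data, x="month", y="sales",
              title="Monthly sales (illustrative data)")
fig.show()  # renders in the browser or a notebook
```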

Public-Private Partnerships for Statistics: Lessons Learned, Future Steps


Report by Nicholas Robin, Thilo Klein and Johannes Jütting for Paris 21: “Non-official sources of data, big data in particular, are currently attracting enormous interest in the world of official statistics. An impressive body of work focuses on how different types of big data (telecom data, social media, sensors, etc.) can be used to fill specific data gaps, especially with regard to the post-2015 agenda and the associated technology challenges. The focus of this paper is on a different aspect, but one that is of crucial importance: what are the perspectives of the commercial operations and national statistical offices which respectively produce and might use this data and which incentives, business models and protocols are needed in order to leverage non-official data sources within the official statistics community?

Public-private partnerships (PPPs) offer significant opportunities such as cost effectiveness, timeliness, granularity and new indicators, but also present a range of challenges that need to be surmounted. These comprise technical difficulties, risks related to data confidentiality, as well as a lack of incentives.

Nevertheless, a number of collaborative projects have already emerged and can be classified into four ideal types: namely the in-house production of statistics by the data provider, the transfer of private data sets to the end user, the transfer of private data sets to a trusted third party for processing and/or analysis, and the outsourcing of national statistical office functions (the only model which is not centred around a data-sharing dimension). In developing countries, a severe lack of resources and particular statistical needs (to adopt a system-wide approach within national statistical systems and fill statistical gaps which are relevant to national development plans) highlight the importance of harnessing the private sector’s resources and point to the most holistic models (in-house and third party) in which the private sector contributes to the processing and analysis of data. The following key lessons are drawn from four case studies….(More)”

The Geography of Cultural Ties and Human Mobility: Big Data in Urban Contexts


Wenjie Wu, Jianghao Wang and Tianshi Dai in Annals of the American Association of Geographers: “A largely unexplored big data application in urban contexts is how cultural ties affect human mobility patterns. This article explores China’s intercity human mobility patterns from social media data to contribute to our understanding of this question. Exposure to human mobility patterns is measured by a big data computational strategy for identifying hundreds of millions of individuals’ space–time footprint trajectories. Linguistic data are coded as a proxy for cultural ties from a unique geographically coded atlas of dialect distributions. We find that cultural ties are associated with human mobility flows between city pairs, contingent on commuting costs and geographical distances. Such effects are not distributed evenly over time and space, however. These findings present useful insights in support of the cultural mechanism that can account for the rise, decline, and dynamics of human mobility between regions….(More)”