Twitter Analytics Project HealthMap Outperforming WHO in Ebola Tracking


HIS Talk: “HealthMap, a collaborative data analytics project launched in 2006 between Harvard Medical School and Boston Children’s Hospital, has been quietly tracking the recent Ebola outbreak in Western Africa with notable accuracy, beating the World Health Organization’s own tracking efforts by two weeks in some instances.
HealthMap aggregates information from a variety of online sources to plot real-time disease outbreaks. Currently, the platform analyzes data from the World Health Organization, Google News, and GeoSentinel, a global disease tracking platform that tracks major geography changes in diseases carried through travelers, foreign visitors, and immigrants. The analytics project also got a new source of feeder-data this February when Twitter announced that the HealthMap project had been selected as a Twitter Data Grant recipient, which gives the 45 epidemiologists working on the project access to the “fire hose” of unfiltered data generated from Twitter’s 500 million daily tweets….”

Open Data: Going Beyond Solving Problems to Making the Impossible Possible


at the Huffington Post: “As a global community, we are producing data at an astounding rate. The pace was recently described as a “new Google every four days” by the highly respected Andreesen Horowitz partner, Peter Levine, in a thought-provoking post addressing the challenge of making sense of this mountain of data.

“… we are now collecting more data each day, so much that 90 percent of the data in the world today has been created in the last two years alone. In fact, every day, we create 2.5 quintillion bytes of data — by some estimates that’s one new Google every four days, and the rate is only increasing. Our desire to use, interact, and learn from this data will become increasingly important and strategic to businesses and society as a whole.”

For the past year and a half, my cofounders and team have focused on what it will take to use, interact and learn from data being produced within the civic sector. It’s one thing to be able to build an app for civic; it’s quite another to build a platform that can manage multiple apps across multiple platforms while addressing the challenges plaguing the “wild west” nature of growth in the quickly emerging market of open data. I was recently invited to share our lessons learned and the promise of the future in mobile open data at the TEDxABQ Technology Salon. Here is a bit of what I shared:
Beyond Solving Problems to New Possibilities
When we first looked at the opportunities for creating apps built on open data, our priority was finding pain points for both cities and the people who lived there. We focused on solving real problems, and it led to some early success. We worked with the City of Albuquerque to deploy their ABQ RIDE app on iOS and Android platforms, and the app not only solved real problems for riders, it also saved real money for the city. The app has grown to over 20,000 regular users and continues to be one of the highest downloaded apps on our platform.
But recently, we’ve started asking questions that go beyond the basic, that change the experience or make things possible in ways that never were before. Here are two I’m incredibly proud to be a part of…”

Knowledge is Beautiful


New book by David McCandless: “In this mind-blowing follow-up to the bestselling Information is Beautiful, the undisputed king of infographics David McCandless uses stunning and unique visuals to reveal unexpected insights into how the world really works. Every minute of every hour of every day we are bombarded with information – be it on television, in print or online. How can we relate to this mind-numbing overload? Enter David McCandless and his amazing infographics: simple, elegant ways to understand information too complex or abstract to grasp any way but visually. McCandless creates dazzling displays that blend the facts with their connections, contexts and relationships, making information meaningful, entertaining – and beautiful. Knowledge is Beautiful is an endlessly fascinating spin through the world of visualized data, all of it bearing the hallmark of David McCandless’s ground-breaking signature style. Taking infographics to the next level, Knowledge is Beautiful offers a deeper, more wide-ranging look at the world and its history. Covering everything from dog breeds and movie plots to the most commonly used passwords and crazy global warming solutions, Knowledge is Beautiful is guaranteed to enrich your understanding of the world.”

What Cars Did for Today’s World, Data May Do for Tomorrow’s


Quentin Hardy in the New York Times: “New technology products head at us constantly. There’s the latest smartphone, the shiny new app, the hot social network, even the smarter thermostat.

As great (or not) as all these may be, each thing is a small part of a much bigger process that’s rarely admired. They all belong inside a world-changing ecosystem of digital hardware and software, spreading into every area of our lives.

Thinking about what is going on behind the scenes is easier if we consider the automobile, also known as “the machine that changed the world.” Cars succeeded through the widespread construction of highways and gas stations. Those things created a global supply chain of steel plants and refineries. Seemingly unrelated things, including suburbs, fast food and drive-time talk radio, arose in the success.

Today’s dominant industrial ecosystem is relentlessly acquiring and processing digital information. It demands newer and better ways of collecting, shipping, and processing data, much the way cars needed better road building. And it’s spinning out its own unseen businesses.

A few recent developments illustrate the new ecosystem. General Electric plans to announce Monday that it has created a “data lake” method of analyzing sensor information from industrial machinery in places like railroads, airlines, hospitals and utilities. G.E. has been putting sensors on everything it can for a couple of years, and now it is out to read all that information quickly.

The company, working with an outfit called Pivotal, said that in the last three months it has looked at information from 3.4 million miles of flights by 24 airlines using G.E. jet engines. G.E. said it figured out things like possible defects 2,000 times as fast as it could before.

The company has to, since it’s getting so much more data. “In 10 years, 17 billion pieces of equipment will have sensors,” said William Ruh, vice president of G.E. software. “We’re only one-tenth of the way there.”

It hardly matters if Mr. Ruh is off by five billion or so. Billions of humans are already augmenting that number with their own packages of sensors, called smartphones, fitness bands and wearable computers. Almost all of that will get uploaded someplace too.

Shipping that data creates challenges. In June, researchers at the University of California, San Diego announced a method of engineering fiber optic cable that could make digital networks run 10 times faster. The idea is to get more parts of the system working closer to the speed of light, without involving the “slow” processing of electronic semiconductors.

“We’re going from millions of personal computers and billions of smartphones to tens of billions of devices, with and without people, and that is the early phase of all this,” said Larry Smarr, drector of the California Institute for Telecommunications and Information Technology, located inside U.C.S.D. “A gigabit a second was fast in commercial networks, now we’re at 100 gigabits a second. A terabit a second will come and go. A petabit a second will come and go.”

In other words, Mr. Smarr thinks commercial networks will eventually be 10,000 times as fast as today’s best systems. “It will have to grow, if we’re going to continue what has become our primary basis of wealth creation,” he said.

Add computation to collection and transport. Last month, U.C. Berkeley’s AMP Lab, created two years ago for research into new kinds of large-scale computing, spun out a company called Databricks, that uses new kinds of software for fast data analysis on a rental basis. Databricks plugs into the one million-plus computer servers inside the global system of Amazon Web Services, and will soon work inside similar-size megacomputing systems from Google and Microsoft.

It was the second company out of the AMP Lab this year. The first, called Mesosphere, enables a kind of pooling of computing services, building the efficiency of even million-computer systems….”

The infrastructure Africa really needs is better data reporting


Data reporting on the continent is sketchy. Just look at the recent GDP revisions of large countries. How is it that Nigeria’s April GDP recalculation catapulted it ahead of South Africa, making it the largest economy in Africa overnight? Or that Kenya’s economy is actually 20% larger (paywall) than previously thought?

Indeed, countries in Africa get noticeably bad scores on the World Bank’s Bulletin Board on Statistical Capacity, an index of data reporting integrity.

Bad data is not simply the result of inconsistencies or miscalculations: African governments have an incentive to produce statistics that overstate their economic development.

A recent working paper from the Center for Global Development (CGD) shows how politics influence the statistics released by many African countries…

But in the long run, dodgy statistics aren’t good for anyone. They “distort the way we understand the opportunities that are available,” says Amanda Glassman, one of the CGD report’s authors. US firms have pledged $14 billion in trade deals at the summit in Washington. No doubt they would like to know whether high school enrollment promises to create a more educated workforce in a given country, or whether its people have been immunized for viruses.

Overly optimistic indicators also distort how a government decides where to focus its efforts. If school enrollment appears to be high, why implement programs intended to increase it?

The CGD report suggests increased funding to national statistical agencies, and making sure that they are wholly independent from their governments. President Obama is talking up $7 billion into African agriculture. But unless cash and attention are given to improving statistical integrity, he may never know whether that investment has borne fruit”

The Emergence of Government Innovation Teams


Hollie Russon Gilman at TechTank: “A new global currency is emerging.  Governments understand that people at home and abroad evaluate them based on how they use technology and innovative approaches in their service delivery and citizen engagement.  This raises opportunities, and critical questions about the role of innovation in 21st century governance.
Bloomberg Philanthropies and Nesta, the UK’s Innovation foundation, recently released a global report highlighting 20 government innovation teams.  Importantly, the study included teams that were established and funded by all levels of government (city, regional and national), and aims to find creative solutions to seemingly intractable solutions. This report features 20 teams across six continents and features some basic principles and commonalities that are instructive for all types of innovators, inside and outside, of government.
Using Government to Locally Engage
One of the challenges of representational democracy is that elected officials and government officials spend time in bureaucracies isolated from the very people they aim to serve.  Perhaps there can be different models.  For example, Seoul’s Innovation Bureau is engaging citizens to re-design and re-imagine public services.  Seoul is dedicated to becoming a Sharing City; including Tool Kit Centers where citizens can borrow machinery they would rarely use that would also benefit the whole community. This approach puts citizens at the center of their communities and leverages government to work for the people…
As I’ve outlined in a earlier TechTank post, there are institutional constraints for governments to try the unknown.  There are potential electoral costs, greater disillusionment, and gaps in vital service delivery. Yet, despite all of these barriers there are a variety of promising tools. For example, Finland has Sitra, an Innovation fund, whose mission is to foster experimentation to transform a diverse set of policy issues including sustainable energy and healthcare. Sitra invests in both the practical research and experiments to further public sector issues as well as invest in early stage companies.
We need a deeper understanding of the opportunities, and challenges, of innovation in government.    Luckily there are many researchers, think-tanks, and organizations beginning analysis.  For example, Professor and Associate Dean Anita McGahan, of the Rotman School of Management at the University of Toronto, calls for a more strategic approach toward understanding the use of innovation, including big data, in the public sector…”

Digital Footprints: Opportunities and Challenges for Online Social Research


Paper by Golder, Scott A. and Macy, Michael for the Annual Review of Sociology: “Online interaction is now a regular part of daily life for a demographically diverse population of hundreds of millions of people worldwide. These interactions generate fine-grained time-stamped records of human behavior and social interaction at the level of individual events, yet are global in scale, allowing researchers to address fundamental questions about social identity, status, conflict, cooperation, collective action, and diffusion, both by using observational data and by conducting in vivo field experiments. This unprecedented opportunity comes with a number of methodological challenges, including generalizing observations to the offline world, protecting individual privacy, and solving the logistical challenges posed by “big data” and web-based experiments. We review current advances in online social research and critically assess the theoretical and methodological opportunities and limitations. [J]ust as the invention of the telescope revolutionized the study of the heavens, so too by rendering the unmeasurable measurable, the technological revolution in mobile, Web, and Internet communications has the potential to revolutionize our understanding of ourselves and how we interact…. [T]hree hundred years after Alexander Pope argued that the proper study of mankind should lie not in the heavens but in ourselves, we have finally found our telescope. Let the revolution begin. —Duncan Watts”

Fifteen open data insights


Tim Davies from ODRN: “…below are the 15 points from the three-page briefing version, and you can find a full write-up of these points for download. You can also find reports from all the individual project partners, including a collection of quick-read research posters over on the Open Data Research Network website.

15 insights into open data supply, use and impacts

(1) There are many gaps to overcome before open data availability, can lead to widespread effective use and impact. Open data can lead to change through a ‘domino effect’, or by creating ripples of change that gradually spread out. However, often many of the key ‘domino pieces’ are missing, and local political contexts limit the reach of ripples. Poor data quality, low connectivity, scarce technical skills, weak legal frameworks and political barriers may all prevent open data triggering sustainable change. Attentiveness to all the components of open data impact is needed when designing interventions.
(2) There is a frequent mismatch between open data supply and demand in developing countries. Counting datasets is a poor way of assessing the quality of an open data initiative. The datasets published on portals are often the datasets that are easiest to publish, not the datasets most in demand. Politically sensitive datasets are particularly unlikely to be published without civil society pressure. Sometimes the gap is on the demand side – as potential open data users often do not articulate demands for key datasets.
(3) Open data initiatives can create new spaces for civil society to pursue government accountability and effectiveness. The conversation around transparency and accountability that ideas of open data can support is as important as the datasets in some developing countries.
(4) Working on open data projects can change how government creates, prepares and uses its own data. The motivations behind an open data initiative shape how government uses the data itself. Civil society and entrepreneurs interacting with government through open data projects can help shape government data practices. This makes it important to consider which intermediaries gain insider roles shaping data supply.
(5) Intermediaries are vital to both the supply and the use of open data. Not all data needed for governance in developing countries comes from government. Intermediaries can create data, articulate demands for data, and help translate open data visions from political leaders into effective implementations. Traditional local intermediaries are an important source of information, in particular because they are trusted parties.
(6) Digital divides create data divides in both the supply and use of data. In some developing countries key data is not digitised, or a lack of technical staff has left data management patchy and inconsistent. Where Internet access is scarce, few citizens can have direct access to data or services built with it. Full access is needed for full empowerment, but offline intermediaries, including journalists and community radio stations, also play a vital role in bridging the gaps between data and citizens.
(7) Where information is already available and used, the shift to open data involves data evolution rather than data revolution. Many NGOs and intermediaries already access the information which is now becoming available as data. Capacity building should start from existing information and data practices in organisations, and should look for the step-by-step gains to be made from a data-driven approach.
(8) Officials’ fears about the integrity of data are a barrier to more machine-readable data being made available. The publication of data as PDF or in scanned copies is often down to a misunderstanding of how open data works. Only copies can be changed, and originals can be kept authoritative. Helping officials understand this may help increase the supply of data.
(9) Very few datasets are clearly openly licensed, and there is low understanding of what open licenses entail. There are mixed opinions on the importance of a focus on licensing in different contexts. Clear licenses are important to building a global commons of interoperable data, but may be less relevant to particular uses of data on the ground. In many countries wider conversation about licensing are yet to take place.
(10) Privacy issues are not on the radar of most developing country open data projects, although commercial confidentiality does arise as a reason preventing greater data transparency. Much state held data is collected either from citizens or from companies. Few countries in the ODDC study have weak or absent privacy laws and frameworks, yet participants in the studies raised few personal privacy considerations. By contrast, a lack of clarity, and officials’ concerns, about potential breaches of commercial confidentiality when sharing data gathered from firms was a barrier to opening data.
(11) There is more to open data than policies and portals. Whilst central open data portals act as a visible symbol of open data initiatives, a focus on portal building can distract attention from wider reforms. Open data elements can also be built on existing data sharing practices, and data made available through the locations where citizens, NGOs are businesses already go to access information.
(12) Open data advocacy should be aware of, and build upon, existing policy foundations in specific countries and sectors. Sectoral transparency policies for local government, budget and energy industry regulation, amongst others, could all have open data requirements and standards attached, drawing on existing mechanisms to secure sustainable supplies of relevant open data in developing countries. In addition, open data conversations could help make existing data collection and disclosure requirements fit better with the information and data demands of citizens.
(13) Open data is not just a central government issue: local government data, city data, and data from the judicial and legislative branches are all important. Many open data projects focus on the national level, and only on the executive branch. However, local government is closer to citizens, urban areas bring together many of the key ingredients for successful open data initiatives, and transparency in other branches of government is important to secure citizens democratic rights.
(14) Flexibility is needed in the application of definitions of open data to allow locally relevant and effective open data debates and advocacy to emerge. Open data is made up of various elements, including proactive publication, machine-readability and permissions to re-use. Countries at different stages of open data development may choose to focus on one or more of these, but recognising that adopting all elements at once could hinder progress. It is important to find ways to both define open data clearly, and to avoid a reductive debate that does not recognise progressive steps towards greater openness.
(15) There are many different models for an open data initiative: including top-down, bottom-up and sector-specific. Initiatives may also be state-led, civil society-led and entrepreneur-led in their goals and how they are implemented – with consequences for the resources and models required to make them sustainable. There is no one-size-fits-all approach to open data. More experimentation, evaluation and shared learning on the components, partners and processes for putting open data ideas into practice must be a priority for all who want to see a world where open-by-default data drives real social, political and economic change.
You can read more about each of these points in the full report.”

Quantifying the Interoperability of Open Government Datasets


Paper by Pieter Colpaert, Mathias Van Compernolle, Laurens De Vocht, Anastasia Dimou, Miel Vander Sande, Peter Mechant, Ruben Verborgh, and Erik Mannens, to be published in Computer: “Open Governments use the Web as a global dataspace for datasets. It is in the interest of these governments to be interoperable with other governments worldwide, yet there is currently no way to identify relevant datasets to be interoperable with and there is no way to measure the interoperability itself. In this article we discuss the possibility of comparing identifiers used within various datasets as a way to measure semantic interoperability. We introduce three metrics to express the interoperability between two datasets: the identifier interoperability, the relevance and the number of conflicts. The metrics are calculated from a list of statements which indicate for each pair of identifiers in the system whether they identify the same concept or not. While a lot of effort is needed to collect these statements, the return is high: not only relevant datasets are identified, also machine-readable feedback is provided to the data maintainer.”

Selected Readings on Economic Impact of Open Data


The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of open data was originally published in 2014.

Open data is publicly available data – often released by governments, scientists, and occasionally private companies – that is made available for anyone to use, in a machine-readable format, free of charge. Considerable attention has been devoted to the economic potential of open data for businesses and other organizations, and it is now widely accepted that open data plays an important role in spurring innovation, growth, and job creation. From new business models to innovation in local governance, open data is being quickly adopted as a valuable resource at many levels.

Measuring and analyzing the economic impact of open data in a systematic way is challenging, and governments as well as other providers of open data seek to provide access to the data in a standardized way. As governmental transparency increases and open data changes business models and activities in many economic sectors, it is important to understand best practices for releasing and using non-proprietary, public information. Costs, social challenges, and technical barriers also influence the economic impact of open data.

These selected readings are intended as a first step in the direction of answering the question of if we can and how we consider if opening data spurs economic impact.

Selected Reading List (in alphabetical order)

Annotated Selected Reading List (in alphabetical order)

Bonina, Carla. New Business Models and the Values of Open Data: Definitions, Challenges, and Opportunities. NEMODE 3K – Small Grants Call 2013. http://bit.ly/1xGf9oe

  • In this paper, Dr. Carla Bonina provides an introduction to open data and open data business models, evaluating their potential economic value and identifying future challenges for the effectiveness of open data, such as personal data and privacy, the emerging data divide, and the costs of collecting, producing and releasing open (government) data.

Carpenter, John and Phil Watts. Assessing the Value of OS OpenData™ to the Economy of Great Britain – Synopsis. June 2013. Accessed July 25, 2014. http://bit.ly/1rTLVUE

  • John Carpenter and Phil Watts of Ordnance Survey undertook a study to examine the economic impact of open data to the economy of Great Britain. Using a variety of methods such as case studies, interviews, downlad analysis, adoption rates, impact calculation, and CGE modeling, the authors estimates that the OS OpenData initiative will deliver a net of increase in GDP of £13 – 28.5 million for Great Britain in 2013.

Capgemini Consulting. The Open Data Economy: Unlocking Economic Value by Opening Government and Public Data. Capgemini Consulting. Accessed July 24, 2014. http://bit.ly/1n7MR02

  • This report explores how governments are leveraging open data for economic benefits. Through using a compariative approach, the authors study important open data from organizational, technological, social and political perspectives. The study highlights the potential of open data to drive profit through increasing the effectiveness of benchmarking and other data-driven business strategies.

Deloitte. Open Growth: Stimulating Demand for Open Data in the UK. Deloitte Analytics. December 2012. Accessed July 24, 2014. http://bit.ly/1oeFhks

  • This early paper on open data by Deloitte uses case studies and statistical analysis on open government data to create models of businesses using open data. They also review the market supply and demand of open government data in emerging sectors of the economy.

Gruen, Nicholas, John Houghton and Richard Tooth. Open for Business: How Open Data Can Help Achieve the G20 Growth Target.  Accessed July 24, 2014, http://bit.ly/UOmBRe

  • This report highlights the potential economic value of the open data agenda in Australia and the G20. The report provides an initial literature review on the economic value of open data, as well as a asset of case studies on the economic value of open data, and a set of recommendations for how open data can help the G20 and Australia achieve target objectives in the areas of trade, finance, fiscal and monetary policy, anti-corruption, employment, energy, and infrastructure.

Heusser, Felipe I. Understanding Open Government Data and Addressing Its Impact (draft version). World Wide Web Foundation. http://bit.ly/1o9Egym

  • The World Wide Web Foundation, in collaboration with IDRC has begun a research network to explore the impacts of open data in developing countries. In addition to the Web Foundation and IDRC, the network includes the Berkman Center for Internet and Society at Harvard, the Open Development Technology Alliance and Practical Participation.

Howard, Alex. San Francisco Looks to Tap Into the Open Data Economy. O’Reilly Radar: Insight, Analysis, and Reach about Emerging Technologies.  October 19, 2012.  Accessed July 24, 2014. http://oreil.ly/1qNRt3h

  • Alex Howard points to San Francisco as one of the first municipalities in the United States to embrace an open data platform.  He outlines how open data has driven innovation in local governance.  Moreover, he discusses the potential impact of open data on job creation and government technology infrastructure in the City and County of San Francisco.

Huijboom, Noor and Tijs Van den Broek. Open Data: An International Comparison of Strategies. European Journal of ePractice. March 2011. Accessed July 24, 2014.  http://bit.ly/1AE24jq

  • This article examines five countries and their open data strategies, identifying key features, main barriers, and drivers of progress for of open data programs. The authors outline the key challenges facing European, and other national open data policies, highlighting the emerging role open data initiatives are playing in political and administrative agendas around the world.

Manyika, J., Michael Chui, Diana Farrell, Steve Van Kuiken, Peter Groves, and Elizabeth Almasi Doshi. Open Data: Unlocking Innovation and Performance with Liquid Innovation. McKinsey Global Institute. October 2013. Accessed July 24, 2014.  http://bit.ly/1lgDX0v

  • This research focuses on quantifying the potential value of open data in seven “domains” in the global economy: education, transportation, consumer products, electricity, oil and gas, health care, and consumer finance.

Moore, Alida. Congressional Transparency Caucus: How Open Data Creates Jobs. April 2, 2014. Accessed July 30, 2014. Socrata. http://bit.ly/1n7OJpp

  • Socrata provides a summary of the March 24th briefing of the Congressional Transparency Caucus on the need to increase government transparency through adopting open data initiatives. They include key takeaways from the panel discussion, as well as their role in making open data available for businesses.

Stott, Andrew. Open Data for Economic Growth. The World Bank. June 25, 2014. Accessed July 24, 2014. http://bit.ly/1n7PRJF

  • In this report, The World Bank examines the evidence for the economic potential of open data, holding that the economic potential is quite large, despite a variation in the published estimates, and difficulties assessing its potential methodologically. They provide five archetypes of businesses using open data, and provides recommendations for governments trying to maximize economic growth from open data.