David Lazer: “…big data last winter had its “Dewey beats Truman” moment, when the poster child of big data (at least for behavioral data), Google Flu Trends (GFT), went way off the rails in “nowcasting” the flu–overshooting the peak last winter by 130% (and indeed, it has been systematically overshooting by wide margins for 3 years). Tomorrow we (Ryan Kennedy, Alessandro Vespignani, and Gary King) have a paper out in Science dissecting why GFT went off the rails, how that could have been prevented, and the broader lessons to be learned regarding big data.
[The Parable of Google Flu (WP-Final).pdf is the version we submitted before acceptance. We have also posted an SSRN paper evaluating GFT for 2013-14, since it was reworked in the Fall.]
Key lessons that I’d highlight:
1) Big data are typically not scientifically calibrated. This goes back to my post last month regarding measurement. This does not make them useless from a scientific point of view, but you do need to build into the analysis that the “measures” of behavior are being affected by unseen things. In this case, the likely culprit was the Google search algorithm, which was modified in various ways that we believe are likely to have increased flu-related searches.
2) Big data + analytic code used in scientific venues with scientific claims need to be more transparent. This is a tricky issue, because there are both legitimate proprietary interests involved and privacy concerns, but much more can be done in this regard than has been done in the 3 GFT papers. [One of my aspirations over the next year is to work together with big data companies, researchers, and privacy advocates to figure out how this can be done.]
3) It’s about the questions, not the size of the data. In this particular case, one could have done a better job stating the likely flu prevalence today by ignoring GFT altogether and just projecting 3-week-old CDC data to today (better still would have been to combine the two). That is, a synthesis would have been more effective than a pure “big data” approach. I think this is likely the general pattern.
4) More generally, I’d note that there is much more that the academy needs to do. First, the academy needs to build the foundation for collaborations around big data (e.g., secure infrastructures, legal understandings around data sharing, etc). Second, there needs to be MUCH more work done to build bridges between the computer scientists who work on big data and social scientists who think about deriving insights about human behavior from data more generally. We have moved perhaps 5% of the way that we need to in this regard.”
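Lesson 3 in the quote above (projecting lagged CDC data forward, then combining it with GFT) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names `project_lagged_cdc` and `combine_estimates`, the linear-trend projection, the fixed blending weight, and the numbers are hypothetical and are not taken from the Science paper.

```python
from typing import Sequence


def project_lagged_cdc(published_cdc: Sequence[float], lag_weeks: int = 3) -> float:
    """Project the latest published CDC value forward to today.

    Assumption: the linear trend between the last two published weeks is
    extended over `lag_weeks`. This is an illustrative choice, not the
    model used by the paper's authors.
    """
    if len(published_cdc) < 2:
        raise ValueError("need at least two published weeks")
    last, previous = published_cdc[-1], published_cdc[-2]
    weekly_change = last - previous
    # Prevalence cannot be negative, so clamp the projection at zero.
    return max(0.0, last + lag_weeks * weekly_change)


def combine_estimates(cdc_projection: float, gft_estimate: float,
                      gft_weight: float = 0.5) -> float:
    """Blend the CDC projection with a search-based estimate (weighted average)."""
    if not 0.0 <= gft_weight <= 1.0:
        raise ValueError("gft_weight must be between 0 and 1")
    return (1.0 - gft_weight) * cdc_projection + gft_weight * gft_estimate


# Made-up numbers for illustration only (percent of visits for flu-like illness).
published_cdc = [2.0, 2.5, 3.0]  # the latest entry is already three weeks old
gft_today = 6.0                  # hypothetical search-based nowcast

projection = project_lagged_cdc(published_cdc)
blended = combine_estimates(projection, gft_today, gft_weight=0.25)
print(projection, blended)
```

In practice the weight would presumably be fitted against historical CDC data rather than set by hand; the fixed value here only shows the shape of the synthesis.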
Intelligent Demand: Policy Rationale, Design and Potential Benefits
New OECD paper: “Policy interest in demand-side initiatives has grown in recent years. This may reflect an expectation that demand-side policy could be particularly effective in steering innovation to meet societal needs. In addition, owing to constrained public finances in most OECD countries, the possibility that demand-side policies might be less expensive than direct support measures is attractive. Interest may also reflect some degree of disappointment with the outcomes of traditional supply-side measures. This paper reviews demand-side innovation policies, their rationales and importance across countries, different approaches to their design, the challenges entailed in their implementation and evaluation, and good practices. Three main forms of demand-side policy are considered: innovation-oriented public procurement, innovation-oriented regulations, and standards. Emphasis is placed on innovation-oriented public procurement.”
A Framework for Benchmarking Open Government Data Efforts
DS Sayogo, TA Pardo, M Cook in the HICSS ’14 Proceedings of the 2014 47th Hawaii International Conference on System Sciences: “This paper presents a preliminary exploration of the status of open government data worldwide as well as an in-depth evaluation of selected open government data portals. Using web content analysis of the open government data portals from 35 countries, this study outlines the progress of open government data efforts at the national government level. This paper also conducted an in-depth evaluation of selected cases to justify the application of a proposed framework for understanding the status of open government data initiatives. This paper suggests that the findings of this exploration offer a new level of understanding of the depth, breadth, and impact of current open government data efforts. The review results also point to the different stages of open government data portal development in terms of data content, data manipulation capability and participatory and engagement capability. This finding suggests that the development of open government portals follows an incremental approach similar to that of e-government development stages in general. Subsequently, this paper offers several observations in terms of policy and practical implications of open government data portal development drawn from the application of the proposed framework.”
Social media effects on fostering online civic engagement and building citizen trust and trust in institutions
Anne Marie Warren, Ainin Sulaiman and Noor Ismawati Jaafar in Government Information Quarterly: “This paper tests the extent to which social media is shaping civic engagement initiatives to build trust among people and increase trust in their institutions, particularly the government, police and justice systems. A survey of 502 citizens showed that using social media for civic engagement has a significant positive impact on trust propensity and that this trust had led to an increase in trust towards institutions. Interestingly, while group incentives encouraged citizens to engage online for civic matters, it is civic publications through postings on social media that intensify the urge of citizens for civic action to address social issues. Post-hoc analysis via ten interviews with social activists was conducted to further examine their perceptions on trust towards institutions. The overall findings suggest that institutions, in their effort to promote a meaningful and trusting citizen engagement, need to enhance trust among the public by fostering social capital via online civic engagement and closing the public–police disengagement gap.”
Revolutionising Digital Public Service Delivery: A UK Government Perspective
Coordinating the Commons: Diversity & Dynamics in Open Collaborations
Dissertation by Jonathan T. Morgan: “The success of Wikipedia demonstrates that open collaboration can be an effective model for organizing geographically-distributed volunteers to perform complex, sustained work at a massive scale. However, Wikipedia’s history also demonstrates some of the challenges that large, long-term open collaborations face: the core community of Wikipedia editors (the volunteers who contribute most of the encyclopedia’s content and ensure that articles are correct and consistent) has been gradually shrinking since 2007, in part because Wikipedia’s social climate has become increasingly inhospitable for newcomers, female editors, and editors from other underrepresented demographics. Previous research studies of change over time within other work contexts, such as corporations, suggest that incremental processes such as bureaucratic formalization can make organizations more rule-bound and less adaptable (in effect, less open) as they grow and age. There has been little research on how open collaborations like Wikipedia change over time, and on the impact of those changes on the social dynamics of the collaborating community and the way community members prioritize and perform work. Learning from Wikipedia’s successes and failures can help researchers and designers understand how to support open collaborations in other domains, such as Free/Libre Open Source Software, Citizen Science, and Citizen Journalism.”
True Collective Intelligence? A Sketch of a Possible New Field
Paper by Geoff Mulgan in Philosophy & Technology: “Collective intelligence is much talked about but remains very underdeveloped as a field. There are small pockets in computer science and psychology and fragments in other fields, ranging from economics to biology. New networks and social media also provide a rich source of emerging evidence. However, there are surprisingly few useable theories, and many of the fashionable claims have not stood up to scrutiny. The field of analysis should be how intelligence is organised at large scale—in organisations, cities, nations and networks. The paper sets out some of the potential theoretical building blocks, suggests an experimental and research agenda, shows how it could be analysed within an organisation or business sector and points to the possible intellectual barriers to progress.”
Overcoming 'Tragedies of the Commons' with a Self-Regulating, Participatory Market Society
Paper by Dirk Helbing: “Our society is fundamentally changing. These days, almost nothing works without a computer chip. Processing power doubles every 18 months and will exceed the capabilities of human brains in about ten years from now. Some time ago, IBM’s Deep Blue computer already beat the best chess player. Meanwhile, computers perform about 70 percent of all financial transactions, and IBM’s Watson advises customers better than human telephone hotlines. Will computers and robots soon replace skilled labor? In many European countries, unemployment is reaching historical heights. The forthcoming economic and social impact of future information and communication technologies (ICT) will be huge – probably more significant than that caused by the steam engine, or by nano- or biotechnology.
The storage capacity for data is growing even faster than computational capacity. Within just a year we will soon generate more data than in the entire history of humankind. The “Internet of Things” will network trillions of sensors. Unimaginable amounts of data will be collected. Big Data is already being praised as the “oil of the 21st century”. What opportunities and risks does this create for our society, economy, and environment?”
Smart Governance: A Roadmap for Research and Practice
New report by Hans J. Scholl and Margit C. Scholl: “It has been the object of this article to make the case and present a roadmap for the study of the phenomena of smart governance as well as smart and open governance as an enactment of smart governance in practice. As a concept paper, this contribution aimed at sparking interest and at inspiring scholarly and practitioner discourse in this area of study inside the community of electronic government research and practice, and beyond. The roadmap presented here comprises and details seven elements of smart governance along with eight areas of focus in practice.
Smart governance along with its administrative enactment of smart and open government, it was argued, can help effectively address the three grand challenges to 21st century societal and individual well-being, which are (a) the Third Industrial Revolution with the information revolution at its core, (b) the rapidity of change and the lack of timely and effective government intervention, and (c) expansive government spending and exorbitant public debt financing. Although not seen as a panacea, it was also argued that smart governance principles could guide the relatively complex administrative enactment of smart and open government more intelligently than traditional static and inflexible governance approaches could do.
Since much of the road ahead metaphorically speaking leads through uncharted territory, dedicated research is needed that accompanies projects in this area and evaluates them. Research could further be embedded into practical projects providing for fast and systematic learning. We believe that such embedding of research into smart governance projects should become an integral part of smart projects’ agendas.”
The Web at 25 in the U.S.
Paper by Lee Rainie and Susannah Fox from Pew: “The overall verdict: The internet has been a plus for society and an especially good thing for individual users… This report is the first part of a sustained effort through 2014 by the Pew Research Center to mark the 25th anniversary of the creation of the World Wide Web by Sir Tim Berners-Lee. Berners-Lee wrote a paper on March 12, 1989 proposing an “information management” system that became the conceptual and architectural structure for the Web. He eventually released the code for his system, for free, to the world on Christmas Day in 1990. It became a milestone in easing the way for ordinary people to access documents and interact over a network of computers called the internet, a system that linked computers and that had been around for years. The Web became especially appealing after Web browsers were perfected in the early 1990s to facilitate graphical displays of pages on those linked computers.”