How Internet surveillance predicts disease outbreaks before the WHO


Kurzweil News: “Have you ever Googled for an online diagnosis before visiting a doctor? If so, you may have helped provide early warning of an infectious disease epidemic.
In a new study published in The Lancet Infectious Diseases, Internet-based surveillance has been found to detect infectious diseases such as dengue fever and influenza up to two weeks earlier than traditional surveillance methods, according to Queensland University of Technology (QUT) research fellow and senior author of the paper Wenbiao Hu.
Hu, based at the Institute for Health and Biomedical Innovation, said there was often a lag time of two weeks before traditional surveillance methods could detect an emerging infectious disease.
“This is because traditional surveillance relies on the patient recognizing the symptoms and seeking treatment before diagnosis, along with the time taken for health professionals to alert authorities through their health networks. In contrast, digital surveillance can provide real-time detection of epidemics.”
Hu said the study used search engine tools such as Google Trends and Google Insights. It found that the 2005–06 avian influenza (“bird flu”) outbreak could have been detected one to two weeks earlier than official surveillance reports.
“In another example, a digital data collection network was found to be able to detect the SARS outbreak more than two months before the first publications by the World Health Organization (WHO),” Hu said.
According to this week’s CDC FluView report published Jan. 17, 2014, influenza activity in the United States remains high overall, with 3,745 laboratory-confirmed influenza-associated hospitalizations reported since October 1, 2013 (credit: CDC)
“Early detection means early warning and that can help reduce or contain an epidemic, as well as alert public health authorities to ensure risk management strategies such as the provision of adequate medication are implemented.”
Hu said the study found that social media, including Twitter, Facebook, and microblogs, could also be effective in detecting disease outbreaks. “The next step would be to combine the approaches currently available such as social media, aggregator websites, and search engines, along with other factors such as climate and temperature, and develop a real-time infectious disease predictor.”
“The international nature of emerging infectious diseases, combined with the globalization of travel and trade, has increased the interconnectedness of all countries, and that means detecting, monitoring and controlling these diseases is a global concern.”
The other authors of the paper were Gabriel Milinovich (first author), Gail Williams and Archie Clements from the University of Queensland School of Population Health.
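The general approach Hu describes, checking whether search interest leads officially reported cases, is straightforward to prototype. The following is a minimal sketch rather than the study's actual method: it assumes the unofficial pytrends client for Google Trends and a hypothetical weekly case-count file, cases.csv, drawn from a traditional surveillance source, then finds the lag at which search interest best anticipates cases.

```python
"""Sketch: does search interest lead reported cases? (Illustrative only;
assumes the unofficial pytrends client and a hypothetical cases.csv.)"""
import pandas as pd
from pytrends.request import TrendReq

# Weekly search interest for a flu-related query over the 2013-14 US season.
pytrends = TrendReq(hl="en-US", tz=360)
pytrends.build_payload(["flu symptoms"], timeframe="2013-09-01 2014-03-01", geo="US")
trends = pytrends.interest_over_time().rename(columns={"flu symptoms": "search"})

# Hypothetical weekly case counts from traditional surveillance.
cases = pd.read_csv("cases.csv", parse_dates=["week"], index_col="week")

# Align the series and find the lead (in weeks) that maximises correlation
# between earlier search interest and current case counts.
joined = trends.join(cases, how="inner")
corr_by_lag = {k: joined["search"].shift(k).corr(joined["cases"]) for k in range(5)}
best = max(corr_by_lag, key=corr_by_lag.get)
print(f"Search interest best anticipates cases {best} week(s) ahead "
      f"(r = {corr_by_lag[best]:.2f})")
```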
Supramap 
Another powerful tool is Supramap, a web application that synthesizes large, diverse datasets so that researchers can better understand the spread of infectious diseases across hosts and geography by integrating genetic, evolutionary, geospatial, and temporal data. It is now open source, and anyone can create their own maps with it.
Associate Professor Daniel Janies, Ph.D., an expert in computational genomics at the Wexner Medical Center at The Ohio State University (OSU), worked with software engineers at the Ohio Supercomputer Center (OSC) to allow researchers and public safety officials to develop other front-end applications that draw on the logic and computing resources of Supramap.
It was originally developed in 2007 to track the spread and evolution of pandemic influenza (H1N1) and avian influenza (H5N1).
“Using SUPRAMAP, we initially developed maps that illustrated the spread of drug-resistant influenza and host shifts in H1N1 and H5N1 influenza and in coronaviruses, such as SARS,” said Janies. “SUPRAMAP allows the user to track strains carrying key mutations in a geospatial browser such as Google Earth. Our software allows public health scientists to update and view maps on the evolution and spread of pathogens.”
Grant funding through the U.S. Army Research Laboratory and Office supports this Innovation Group on Global Infectious Disease Research project. Support for the computational requirements of the project comes from the American Museum of Natural History (AMNH) and OSC. Ohio State’s Wexner Medical Center, Department of Biomedical Informatics and offices of Academic Affairs and Research provide additional support.”
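Janies mentions viewing strains and their key mutations in a geospatial browser such as Google Earth, which reads KML files. The snippet below is not Supramap's export code; it is a minimal, hypothetical illustration of the underlying idea (strain, mutation, location and date rendered as a placemark), using only the Python standard library.

```python
"""Sketch: write pathogen strain records as Google Earth (KML) placemarks.
Hypothetical records and output; not Supramap's actual format."""
import xml.etree.ElementTree as ET

# Hypothetical strain records: name, key mutation, sampling location, date.
strains = [
    {"name": "A/example/2005 (H5N1)", "mutation": "PB2 E627K",
     "lat": 22.3, "lon": 114.2, "date": "2005-11-14"},
]

kml = ET.Element("kml", xmlns="http://www.opengis.net/kml/2.2")
doc = ET.SubElement(kml, "Document")

for s in strains:
    pm = ET.SubElement(doc, "Placemark")
    ET.SubElement(pm, "name").text = s["name"]
    ET.SubElement(pm, "description").text = f"Key mutation: {s['mutation']}"
    ts = ET.SubElement(pm, "TimeStamp")
    ET.SubElement(ts, "when").text = s["date"]
    point = ET.SubElement(pm, "Point")
    # KML orders coordinates as longitude,latitude[,altitude].
    ET.SubElement(point, "coordinates").text = f"{s['lon']},{s['lat']},0"

ET.ElementTree(kml).write("strains.kml", xml_declaration=True, encoding="UTF-8")
```

Opening strains.kml in Google Earth then places each record on the globe, with its mutation noted in the description balloon.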

How should we analyse our lives?


Gillian Tett in the Financial Times on the challenge of using the new form of data science: “A few years ago, Alex “Sandy” Pentland, a professor of computational social sciences at MIT Media Lab, conducted a curious experiment at a Bank of America call centre in Rhode Island. He fitted 80 employees with biometric devices to track all their movements, physical conversations and email interactions for six weeks, and then used a computer to analyse “some 10 gigabytes of behaviour data”, as he recalls.
The results showed that the workers were isolated from each other, partly because at this call centre, like others of its ilk, the staff took their breaks in rotation so that the phones were constantly manned. In response, Bank of America decided to change its system to enable staff to hang out together over coffee and swap ideas in an unstructured way. Almost immediately there was a dramatic improvement in performance. “The average call-handle time decreased sharply, which means that the employees were much more productive,” Pentland writes in his forthcoming book Social Physics. “[So] the call centre management staff converted the break structure of all their call centres to this new system and forecast a $15m per year productivity increase.”
When I first heard Pentland relate this tale, I was tempted to give a loud cheer on behalf of all long-suffering call centre staff and corporate drones. Pentland’s data essentially give credibility to a point that many people know instinctively: that it is horribly dispiriting – and unproductive – to have to toil in a tiny isolated cubicle by yourself all day. Bank of America deserves credit both for letting Pentland’s team engage in this people-watching – and for changing its coffee-break schedule in response.
But there is a bigger issue at stake here too: namely how academics such as Pentland analyse our lives. We have known for centuries that cultural and social dynamics influence how we behave but until now academics could usually only measure this by looking at micro-level data, which were often subjective. Anthropology (a discipline I know well) is a case in point: anthropologists typically study cultures by painstakingly observing small groups of people and then extrapolating this in a subjective manner.

Pentland and others like him are now convinced that the great academic divide between “hard” and “soft” sciences is set to disappear, since researchers these days can gather massive volumes of data about human behaviour with precision. Sometimes this information is volunteered by individuals, on sites such as Facebook; sometimes it can be gathered from the electronic traces – the “digital breadcrumbs” – that we all deposit (when we use a mobile phone, say) or deliberately collected with biometric devices like the ones used at Bank of America. Either way, it can enable academics to monitor and forecast social interaction in a manner we could never have dreamed of before. “Social physics helps us understand how ideas flow from person to person . . . and ends up shaping the norms, productivity and creative output of our companies, cities and societies,” writes Pentland. “Just as the goal of traditional physics is to understand how the flow of energy translates into change in motion, social physics seeks to understand how the flow of ideas and information translates into changes in behaviour….

But perhaps the most important point is this: whether you love or hate this new form of data science, the genie cannot be put back in the bottle. The experiments that Pentland and many others are conducting at call centres, offices and other institutions across America are simply the leading edge of a trend.

The only question now is whether these powerful new tools will be mostly used for good (to predict traffic queues or flu epidemics) or for more malevolent ends (to enable companies to flog needless goods, say, or for government control). Sadly, “social physics” and data crunching don’t offer any prediction on this issue, even though it is one of the dominant questions of our age.”

Mapping the Data Shadows of Hurricane Sandy: Uncovering the Sociospatial Dimensions of ‘Big Data’


New paper by Shelton, T., Poorthuis, A., Graham, M., and Zook, M.: “Digital social data are now practically ubiquitous, with increasingly large and interconnected databases leading researchers, politicians, and the private sector to focus on how such ‘big data’ can allow potentially unprecedented insights into our world. This paper investigates Twitter activity in the wake of Hurricane Sandy in order to demonstrate the complex relationship between the material world and its digital representations. Through documenting the various spatial patterns of Sandy-related tweeting both within the New York metropolitan region and across the United States, we make a series of broader conceptual and methodological interventions into the nascent geographic literature on big data. Rather than focus on how these massive databases are causing necessary and irreversible shifts in the ways that knowledge is produced, we instead find it more productive to ask how small subsets of big data, especially georeferenced social media information scraped from the internet, can reveal the geographies of a range of social processes and practices. Utilizing both qualitative and quantitative methods, we can uncover broad spatial patterns within this data, as well as understand how this data reflects the lived experiences of the people creating it. We also seek to fill a conceptual lacuna in studies of user-generated geographic information, which have often avoided any explicit theorizing of sociospatial relations, by employing Jessop et al.’s TPSN framework. Through these interventions, we demonstrate that any analysis of user-generated geographic information must take into account the existence of more complex spatialities than the relatively simple spatial ontology implied by latitude and longitude coordinates.”
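The workflow the abstract gestures at, pulling a small georeferenced subset out of a large stream and mapping its density, can be sketched simply. The example below is not the authors' code or data; it assumes a hypothetical tweets.csv with lat, lon and text columns, filters to a rough New York bounding box, and counts tweets per small grid cell.

```python
"""Sketch: spatial pattern of georeferenced tweets (illustrative only;
assumes a hypothetical tweets.csv with lat, lon, text columns)."""
import pandas as pd

tweets = pd.read_csv("tweets.csv")

# Keep storm-related tweets inside a rough New York metro bounding box.
nyc = tweets[
    tweets["text"].str.contains("sandy", case=False, na=False)
    & tweets["lat"].between(40.4, 41.0)
    & tweets["lon"].between(-74.3, -73.6)
]

# Bin to ~0.01-degree cells (roughly a kilometre) and count tweets per cell.
cells = (
    nyc.assign(cell_lat=nyc["lat"].round(2), cell_lon=nyc["lon"].round(2))
       .groupby(["cell_lat", "cell_lon"])
       .size()
       .sort_values(ascending=False)
)
print(cells.head(10))  # the most tweet-dense cells in the metro area
```

As the paper stresses, such counts describe where georeferenced tweets cluster, not necessarily where the storm's effects, or the people experiencing them, actually were.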

Needed: A New Generation of Game Changers to Solve Public Problems


Beth Noveck: “In order to change the way we govern, it is important to train and nurture a new generation of problem solvers who possess the multidisciplinary skills to become effective agents of change. That’s why we at the GovLab have launched The GovLab Academy with the support of the Knight Foundation.
In an effort to help people in their own communities become more effective at developing and implementing creative solutions to compelling challenges, The GovLab Academy is offering two new training programs:
1) An online platform with an unbundled and evolving set of topics, modules and instructors on innovations in governance, including themes such as big and open data and crowdsourcing, and forthcoming topics on behavioral economics, prizes and challenges, open contracting and performance management for governance;
2) Gov 3.0: A curated and sequenced, 14-week mentoring and training program.
While the online platform is always freely available, Gov 3.0 begins on January 29, 2014, and we invite you to participate. Please forward this email to your networks and help us spread the word about the opportunity to participate.
Please consider applying (individuals or teams may apply) if you are:

  • an expert in communications, public policy, law, computer science, engineering, business or design who wants to expand your ability to bring about social change;

  • a public servant who wants to bring innovation to your job;

  • someone with an important idea for positive change but who lacks key skills or resources to realize the vision;

  • interested in joining a network of like-minded, purpose-driven individuals across the country; or

  • someone who is passionate about using technology to solve public problems.

The program includes live instruction and conversation every Wednesday from 5:00–6:30 PM EST for 14 weeks starting Jan 29, 2014. You will be able to participate remotely via Google Hangout.

Gov 3.0 will allow you to apply evolving technology to the design and implementation of effective solutions to public interest challenges. It will give you an overview of the most current approaches to smarter governance and help you improve your skills in collaboration, communication, and developing and presenting innovative ideas.

Over 14 weeks, you will develop a project and a plan for its implementation, including a long and short description, a presentation deck, a persuasive video and a project blog. Last term’s projects covered such diverse issues as post-Fukushima food safety, science literacy for high schoolers and prison reform for the elderly. In every case, the goal was to identify realistic strategies for making a difference quickly.  You can read the entire Gov 3.0 syllabus here.

The program will include national experts and instructors in technology and governance both as guests and as mentors to help you design your project. Last term’s mentors included current and former officials from the White House and various state, local and international governments, academics from a variety of fields, and prominent philanthropists.

People who complete the program will have the opportunity to apply for a special fellowship to pursue their projects further.

Gov 3.0 was previously taught only on campus; we are now offering it in beta as an online program. This is not a MOOC. It is a mentoring-intensive coaching experience. To maximize the quality of the experience, enrollment is limited.

Please submit your application by January 22, 2014. Accepted applicants (individuals and teams) will be notified on January 24, 2014. We hope to expand the program in the future so please use the same form to let us know if you would like to be kept informed about future opportunities.”

Safety Datapalooza Shows Power of Data.gov Communities


Lisa Nelson at DigitalGov: “The White House Office of Public Engagement held the first Safety Datapalooza illustrating the power of Data.gov communities. Federal Chief Technology Officer Todd Park and Deputy Secretary of Transportation John Porcari hosted the event, which touted the data available on Safety.Data.gov and the community of innovators using it to make effective tools for consumers.
The event showcased many of the tools that have been produced as a result of opening this safety data, including:

  • PulsePoint, from the San Ramon Valley Fire Protection District, a lifesaving mobile app that allows CPR-trained volunteers to be notified if someone nearby is in need of emergency assistance;
  • Commute and crime maps, from Trulia, allow home buyers to choose their new residence based on two important everyday factors; and
  • Hurricane App, from the American Red Cross, to monitor storm conditions, prepare your family and home, find help, and let others know you’re safe even if the power is out.

Safety data is far from alone in generating innovative ideas and gathering a community of developers and entrepreneurs. Data.gov currently has 16 topically diverse communities on land and sea — the Cities and Oceans communities being two such examples. Data.gov’s communities are a virtual meeting spot for interested parties across government, academia and industry to come together and put the data to use. Data.gov enables a whole set of tools to make these communities come to life: apps, blogs, challenges, forums, ranking, rating and wikis.
For a summary of the Safety Datapalooza visit Transportation’s “Fast Lane” blog.”
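For developers who want to explore these communities programmatically, the Data.gov catalog exposes a CKAN-style search API. The sketch below assumes the standard CKAN package_search endpoint at catalog.data.gov and the third-party requests library; the exact fields returned may vary.

```python
"""Sketch: search the Data.gov catalog for safety-related datasets.
Assumes the CKAN action API at catalog.data.gov and the requests package."""
import requests

resp = requests.get(
    "https://catalog.data.gov/api/3/action/package_search",
    params={"q": "safety", "rows": 5},
    timeout=30,
)
resp.raise_for_status()
result = resp.json()["result"]

print(f"{result['count']} datasets match 'safety'. First few:")
for pkg in result["results"]:
    org = (pkg.get("organization") or {}).get("title", "unknown publisher")
    print(f"- {pkg['title']} ({org}), {len(pkg.get('resources', []))} resource(s)")
```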

The LinkedIn Volunteer Marketplace: Connecting Professionals to Nonprofit Volunteer Opportunities


LinkedIn: “Last spring, a shelter in Berkeley, CA needed an architect to help it expand its facilities. A young architect who lives nearby had just made a New Year’s resolution to join a nonprofit board. In an earlier era, they would not have known each other existed.
But in this instance the shelter’s executive director used LinkedIn to contact the architect – and the architect jumped at the opportunity to serve on the shelter’s board. The connection brought enormous value to both parties involved – the nonprofit shelter got the expertise it needed and the young architect was able to amplify her social impact while broadening her professional skills.
This story inspired me and my colleagues at LinkedIn. As someone who studies and invests (as a venture capitalist) in internet marketplaces, I realized the somewhat serendipitous connection between architect and shelter would happen more often if there were a dedicated volunteer marketplace. After all, there are hundreds of thousands of “nonprofit needs” in the world, and even more professionals who want to donate their skills to help meet these needs.
The challenge is that nonprofits and professionals don’t know how to easily find each other. LinkedIn Volunteer Marketplace aims to solve that problem.
Changing the professional definition of “opportunity”
When I talk with LinkedIn members, many tell me they aren’t actively looking for traditional job opportunities. Instead, they want to hone or leverage their skills while also making a positive impact on the world.
Students often fall into this category. Retired professionals and stay-at-home parents seek ways to continue to leverage their skills and experience. And while busy professionals who love their current gigs may not necessarily be looking for a new position, these are often the very people who are most actively engaged in “meaningful searches” – a volunteer opportunity that will enhance their life in ways beyond what their primary vocation provides.
By providing opportunities for all these different kinds of LinkedIn members, we aim to help the social sector by doing what we do best as a company: connecting talent with opportunity at massive scale.
And to ensure that the volunteer opportunities you see in the LinkedIn Volunteer Marketplace are high quality, we’re partnering with the most trusted organizations in this space, including Catchafire, Taproot Foundation, BoardSource and VolunteerMatch.”
 

Bad Data


Bad Data is a site providing real-world examples of how not to prepare or provide data. It showcases the poorly structured, the mis-formatted, or the just plain ugly. Its primary purpose is to educate – though there may also be some aspect of entertainment.
As a side-product it also provides a source of good practice material for budding data wranglers (the repo in fact began as a place to keep practice data for Data Explorer).
New examples are wanted and welcome – submit them here.
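In the same spirit, here is a small, self-contained taste of the kind of clean-up such practice files invite. The messy CSV below is invented for illustration and is not one of the site's actual examples; it shows junk banner rows, padded column names, thousands separators and blank rows being tidied with pandas.

```python
"""Sketch: tidying a deliberately messy CSV (invented example, not from the repo)."""
import io
import pandas as pd

messy = io.StringIO(
    "Report generated 01/02/2014,,\n"
    ",,\n"
    "Region ,  Total Spend , Year\n"
    "North,\"1,234\",2013\n"
    "South , 987 ,2013\n"
    ",,\n"
)

df = (
    pd.read_csv(messy, skiprows=2, thousands=",")   # skip the junk banner rows
      .dropna(how="all")                            # drop blank padding rows
      .rename(columns=lambda c: c.strip().lower().replace(" ", "_"))
)
df["region"] = df["region"].str.strip()             # trim stray whitespace
print(df.dtypes)   # total_spend is now numeric rather than a string
print(df)
```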


Open data movement faces fresh hurdles


SciDevNet: “The open-data community made great strides in 2013 towards increasing the reliability of and access to information, but more efforts are needed to increase its usability on the ground and the general capacity of those using it, experts say.
An international network of innovation hubs, the first extensive open data certification system and a data for development partnership are three initiatives launched last year by the fledgling Open Data Institute (ODI), a UK-based not-for-profit firm that champions the use of open data to aid social, economic and environmental development.
Before open data can be used effectively, the biggest hurdles to be cleared are agreeing on common formats for data sets and improving their trustworthiness and searchability, says the ODI’s chief statistician, Ulrich Atz.
“As it is so new, open data is often inconsistent in its format, making it difficult to reuse. We see a great need for standards and tools,” he tells SciDev.Net. Data that is standardised is of “incredible value,” he says, because this makes it easier and faster to use and gives it a longer usable lifetime.
The ODI — which celebrated its first anniversary last month — is attempting to achieve this with a first-of-its-kind certification system that gives publishers and users important details about online data sets, including publishers’ names and contact information, the type of sharing licence, the quality of information and how long it will be available.
Certificates encourage businesses and governments to make use of open data by guaranteeing their quality and usability, and making them easier to find online, says Atz.
Finding more and better ways to apply open data will also be supported by a growing network of ODI ‘nodes’: centres that bring together companies, universities and NGOs to support open-data projects and communities….
Because lower-income countries often lack well-established data collection systems, they have greater freedom to rethink how data are collected and how they flow between governments and civil society, he says.
But there is still a long way to go. Open-data projects currently rely on governments and other providers sharing their data on online platforms, whereas in a truly effective system, information would be published in an open format from the start, says Davies.
Furthermore, even where advances are being made at a strategic level, open-data initiatives are still having only a modest impact in the real world, he says.
“Transferring [progress at a policy level] into availability of data on the ground and the capacity to use it is a lot tougher and slower,” Davies says.”
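The certificate fields mentioned above (publisher name, contact, licence, quality and availability) map naturally onto a small machine-readable record. The snippet below is a hypothetical record and completeness check, not the ODI's actual certificate schema or API.

```python
"""Sketch: a hypothetical machine-readable open-data certificate record.
Field names mirror those mentioned in the article; this is not the ODI's schema."""
REQUIRED = ("publisher", "contact", "licence", "quality", "available_until")

certificate = {
    "dataset": "city-air-quality-2013",          # hypothetical dataset
    "publisher": "Example City Council",
    "contact": "opendata@example.gov",
    "licence": "CC-BY-4.0",
    "quality": "pilot",                           # e.g. a self-assessed level
    "available_until": "2016-12-31",
}

missing = [field for field in REQUIRED if not certificate.get(field)]
if missing:
    raise ValueError(f"certificate incomplete, missing: {', '.join(missing)}")
print(f"{certificate['dataset']}: all certificate fields present")
```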

Open Development (Networked Innovations in International Development)


New book edited by Matthew L. Smith and Katherine M. A. Reilly (Foreword by Yochai Benkler): “The emergence of open networked models made possible by digital technology has the potential to transform international development. Open network structures allow people to come together to share information, organize, and collaborate. Open development harnesses this power, to create new organizational forms and improve people’s lives; it is not only an agenda for research and practice but also a statement about how to approach international development. In this volume, experts explore a variety of applications of openness, addressing challenges as well as opportunities.
Open development requires new theoretical tools that focus on real world problems, consider a variety of solutions, and recognize the complexity of local contexts. After exploring the new theoretical terrain, the book describes a range of cases in which open models address such specific development issues as biotechnology research, improving education, and access to scholarly publications. Contributors then examine tensions between open models and existing structures, including struggles over privacy, intellectual property, and implementation. Finally, contributors offer broader conceptual perspectives, considering processes of social construction, knowledge management, and the role of individual intent in the development and outcomes of social models.”

New Book: Open Data Now


New book by Joel Gurin (The GovLab): “Open Data is the world’s greatest free resource–unprecedented access to thousands of databases–and it is one of the most revolutionary developments since the Information Age began. Combining two major trends–the exponential growth of digital data and the emerging culture of disclosure and transparency–Open Data gives you and your business full access to information that has never been available to the average person until now. Unlike most Big Data, Open Data is transparent, accessible, and reusable in ways that give it the power to transform business, government, and society.
Open Data Now is an essential guide to understanding all kinds of open databases–business, government, science, technology, retail, social media, and more–and using those resources to your best advantage. You’ll learn how to tap crowds for fast innovation, conduct research through open collaboration, and manage and market your business in a transparent marketplace.
Open Data is open for business–and the opportunities are as big and boundless as the Internet itself. This powerful, practical book shows you how to harness the power of Open Data in a variety of applications:

  • HOT STARTUPS: turn government data into profitable ventures
  • SAVVY MARKETING: understand how reputational data drives your brand
  • DATA-DRIVEN INVESTING: apply new tools for business analysis
  • CONSUMER INFORMATION: connect with your customers using smart disclosure
  • GREEN BUSINESS: use data to bet on sustainable companies
  • FAST R&D: turn the online world into your research lab
  • NEW OPPORTUNITIES: explore open fields for new businesses

Whether you’re a marketing professional who wants to stay on top of what’s trending, a budding entrepreneur with a billion-dollar idea and limited resources, or a struggling business owner trying to stay competitive in a changing global market–or if you just want to understand the cutting edge of information technology–Open Data Now offers a wealth of big ideas, strategies, and techniques that wouldn’t have been possible before Open Data leveled the playing field.
The revolution is here and it’s now. It’s Open Data Now.”