Can Government Mine Tweets to Assess Public Opinion?


at Government Technology: “What if instead of going to a city meeting, you could go on Twitter, tweet your opinion, and still be heard by those in government? New research suggests this is a possibility.
The Urban Attitudes Lab at Tufts University has conducted research on accessing “big data” on social networking sites for civic purposes, according to Justin Hollander, associate professor in the Department of Urban and Environmental Policy and Planning at Tufts.
About six months ago, Hollander began researching new ways of accessing how people think about the places they live, work and play. “We’re looking to see how tapping into social media data to understand attitudes and opinions can benefit both urban planning and public policy,” he said.
Harnessing natural comments — there are about one billion tweets per day — could help governments learn what people are saying and feeling, said Hollander. And while formal types of data can be used as proxies for how happy people are, people openly share their sentiments on social networking sites.
Twitter and other social media sites can also provide information in an unobtrusive way. “The idea is that we can capture a potentially more valid and reliable view [of people’s] opinions about the world,” he said. As an inexact science, social science relies on a wide range of data sources to inform research, including surveys, interviews and focus groups; but people respond to being the subject of study, possibly affecting outcomes, Hollander said.
Hollander is also interested in extracting data from social sites because it can be done on a 24/7 basis, which means not having to wait for government to administer surveys, like the Decennial Census. Information from Twitter can also be connected to place; Hollander has approximated that about 10 percent of all tweets are geotagged to location.
In its first study earlier this year, the lab looked at using big data to learn about people’s sentiments and civic interests in New Bedford, Mass., comparing Twitter messages with the city’s published meeting minutes.
To extract tweets over a six-week period from February to April, researchers used the lab’s own software to capture 122,186 tweets geotagged within the city that also had words pertaining to the New Bedford area. Hollander said anyone can get API information from Twitter to also mine data from an area as small as a neighborhood containing a couple hundred houses.
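The geotag-plus-keyword filter described above can be sketched roughly as follows. This is a minimal illustration, not the lab's software: it assumes tweets have already been collected as dictionaries with hypothetical `lat`, `lon` and `text` fields, and the bounding box and keyword list are placeholder values chosen only for the example.

```python
# Hypothetical bounding box and keywords; illustrative values only,
# not the ones used in the study.
BBOX = {"south": 41.59, "north": 41.75, "west": -70.99, "east": -70.89}
KEYWORDS = {"new bedford", "whaling city"}


def in_bbox(lat, lon, bbox):
    """Return True if the point lies inside the rectangular bounding box."""
    return (bbox["south"] <= lat <= bbox["north"]
            and bbox["west"] <= lon <= bbox["east"])


def filter_tweets(tweets, bbox=BBOX, keywords=KEYWORDS):
    """Keep tweets that are geotagged inside bbox and mention any keyword."""
    kept = []
    for tweet in tweets:
        lat, lon = tweet.get("lat"), tweet.get("lon")
        if lat is None or lon is None:
            continue  # not geotagged
        if not in_bbox(lat, lon, bbox):
            continue  # geotagged, but outside the area of interest
        text = tweet["text"].lower()
        if any(keyword in text for keyword in keywords):
            kept.append(tweet)
    return kept


sample = [
    {"lat": 41.64, "lon": -70.93, "text": "Sunny day in New Bedford"},
    {"lat": 41.64, "lon": -70.93, "text": "Stuck in traffic again"},
    {"lat": 42.36, "lon": -71.06, "text": "Visiting New Bedford soon"},
    {"lat": None, "lon": None, "text": "New Bedford forever"},
]
print(len(filter_tweets(sample)))
```

Shrinking the bounding box is all it would take to narrow the same filter to a single neighborhood.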
Researchers used IBM’s SPSS Modeler software, comparing this to custom-designed software, to leverage a sentiment dictionary of nearly 3,000 words, assigning a sentiment score to each phrase, ranging from -5 for awful feelings to +5 for feelings of elation. The lab did this for the Twitter messages and found that about 7 percent were positive versus 5.5 percent negative; in the meeting minutes, 1.7 percent were positive and 0.7 percent negative. In total, about 11,000 messages contained sentiments.
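Dictionary-based scoring of this kind can be illustrated with a short sketch. The five-word lexicon below is a made-up stand-in for the roughly 3,000-word dictionary mentioned above, and summing word scores per message is one common convention assumed here, not necessarily what SPSS Modeler or the lab's own software does.

```python
import re

# Toy lexicon on the -5..+5 scale; entries are hypothetical examples.
LEXICON = {"awful": -5, "bad": -3, "good": 3, "great": 4, "elated": 5}


def score(text, lexicon=LEXICON):
    """Sum the lexicon scores of all words in the text (unknown words count 0)."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(word, 0) for word in words)


def sentiment_shares(messages, lexicon=LEXICON):
    """Return (percent positive, percent negative) over all messages."""
    if not messages:
        return 0.0, 0.0
    scores = [score(message, lexicon) for message in messages]
    positive = sum(1 for s in scores if s > 0)
    negative = sum(1 for s in scores if s < 0)
    return 100 * positive / len(messages), 100 * negative / len(messages)


messages = [
    "Good day at the park",
    "Awful traffic downtown",
    "Council meeting at noon",
    "Bus schedule posted",
]
print(sentiment_shares(messages))
```

Most messages score zero under such a scheme, which is consistent with only a small share of tweets and minutes being classed as positive or negative.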
The lab also used NVivo qualitative software to analyze 24 key words in a one-year sample of the city’s meeting minutes. By searching for the same words in Twitter posts, the researchers found that “school,” “health,” “safety,” “parks,” “field” and “children” were used frequently across both mediums.
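The keyword comparison across the two sources can be approximated with simple word counts. The sketch below is an assumption-laden illustration rather than a reproduction of the NVivo analysis: the sample texts are invented, and matching is exact on lowercase words, so "park" and "parks" count as different keywords.

```python
import re
from collections import Counter


def keyword_counts(texts, keywords):
    """Count how often each keyword appears as a whole word across texts."""
    counts = Counter()
    for text in texts:
        for word in re.findall(r"[a-z]+", text.lower()):
            if word in keywords:
                counts[word] += 1
    return counts


def shared_keywords(texts_a, texts_b, keywords, min_count=1):
    """Return keywords that occur at least min_count times in both sources."""
    counts_a = keyword_counts(texts_a, keywords)
    counts_b = keyword_counts(texts_b, keywords)
    return sorted(k for k in keywords
                  if counts_a[k] >= min_count and counts_b[k] >= min_count)


keywords = {"school", "parks", "budget"}
minutes = ["The school budget was approved.", "Parks committee met Tuesday."]
tweets = ["My school is closed today", "Love the parks here"]
print(shared_keywords(minutes, tweets, keywords))
```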
….
Next up for the lab is a new study contrasting Twitter posts from four Massachusetts cities with the recent election results.

Measuring the Impact of Public Innovation in the Wild


Beth Noveck at Governing: “With complex, seemingly intractable problems such as inequality, climate change and affordable access to health care plaguing contemporary society, traditional institutions such as government agencies and nonprofit organizations often lack strategies for tackling them effectively and legitimately. For this reason, this year the MacArthur Foundation launched its Research Network on Opening Governance.
The Network, which I chair and which also is supported by Google.org, is what MacArthur calls a “research institution without walls.” It brings together a dozen researchers across universities and disciplines, with an advisory network of academics, technologists, and current and former government officials, to study new ways of addressing public problems using advances in science and technology.
Through regular meetings and collaborative projects, the Network is exploring, for example, the latest techniques for more open and transparent decision-making, the uses of data to transform how we govern, and the identification of an individual’s skills and experiences to improve collaborative problem-solving between government and citizen.
One of the central questions we are grappling with is how to accelerate the pace of research so we can learn better and faster when an innovation in governance works — for whom, in which contexts and under which conditions. With better methods for doing fast-cycle research in collaboration with government — in the wild, not in the lab — our hope is to be able to predict with accuracy, not just know after the fact, whether innovations such as opening up an agency’s data or consulting with citizens using a crowdsourcing platform are likely to result in real improvements in people’s lives.
An example of such an experiment is the work that members of the Network are undertaking with the Food and Drug Administration. As one of its duties, the FDA manages the process of pre-market approval of medical devices to ensure that patients and providers have timely access to safe, effective and high-quality technology, as well as the post-market review of medical devices to ensure that unsafe ones are identified and recalled from the market. In both of these contexts, the FDA seeks to provide the medical-device industry with productive, consistent, transparent and efficient regulatory pathways.
With thousands of devices, many of them employing cutting-edge technology, to examine each year, the FDA is faced with the challenge of finding the right internal and external expertise to help it quickly study a device’s safety and efficacy. Done right, lives can be saved and companies can prosper from bringing innovations quickly to market. Done wrong, bad devices can kill…”

The Governance Of Socio-Technical Systems


New book edited by Susana Borrás and Jakob Edler: “Why are so few electric cars in our streets today? Why is it difficult to introduce electronic patient records in our hospitals? To answer these questions we need to understand how state and non-state actors interact with the purpose of transforming socio-technical systems.
Examining the “who” (agents), “how” (policy instruments) and “why” (societal legitimacy) of the governance process, this book presents a conceptual framework for the governance of change in socio-technical systems. Bridging the gap between disciplinary fields, expert contributions provide innovative empirical cases of different modes of governing change. The Governance of Socio-Technical Systems offers a stepping-stone towards building a theory of governance of change and presents a new research agenda on the interaction between science, technology and society.”

Stories of Innovative Democracy at Local Level


Special Issue of Field Actions Science Reports published in partnership with CIVICUS, coordinated by Dorothée Guénéheux, Clara Bosco, Agnès Chamayou and Henri Rouillé d’Orfeuil: “This special issue presents many and varied field actions, such as the promotion of the rights of young people, the resolution of the conflicts of agropastoral activities, or the process of participatory decision-making on community budgetary allocations, among many others. It addresses projects developed all over the world, on five continents, and covering both the northern and southern hemispheres. The legitimate initial queries and doubts about feasibility that assailed those who started this publication have been swept away by the enthusiasm and the large number of papers that have been sent in….”


Politics, Policy and Privatisation in the Everyday Experience of Big Data in the NHS


Chapter by Andrew Goffey, Lynne Pettinger and Ewen Speed in Martin Hand and Sam Hillyard (eds.), Big Data? Qualitative Approaches to Digital Research (Studies in Qualitative Methodology, Volume 13): “This chapter explains how fundamental organisational change in the UK National Health Service (NHS) is being effected by new practices of digitised information gathering and use. It analyses the taken-for-granted IT infrastructures that lie behind digitisation and considers the relationship between digitisation and big data.
Design/methodology/approach

Qualitative research methods including discourse analysis, ethnography of software and key informant interviews were used. Actor-network theories, as developed by Science and Technology Studies (STS) researchers, were used to inform the research questions, data gathering and analysis. The chapter focuses on the aftermath of legislation to change the organisation of the NHS.

Findings

The chapter shows the benefits of qualitative research into specific manifestations of information technology. It explains how apparently ‘objective’ and ‘neutral’ quantitative data gathering and analysis is mediated by complex software practices. It considers the political power of claims that data is neutral.

Originality/value

The chapter provides insight into a specific case of healthcare data. It makes explicit the role of politics and the State in digitisation and shows how STS approaches can be used to understand political and technological practice.”

Could digital badges clarify the roles of co-authors?


  at AAAS Science Magazine: “Ever look at a research paper and wonder how the half-dozen or more authors contributed to the work? After all, it’s usually only the first or last author who gets all the media attention or the scientific credit when people are considered for jobs, grants, awards, and more. Some journals try to address this issue with the “authors’ contributions” sections within a paper, but a collection of science, publishing, and software groups is now developing a more modern solution—digital “badges,” assigned on publication of a paper online, that detail what each author did for the work and that the authors can link to their profiles elsewhere on the Web.

Those organizations include publishers BioMed Central and the Public Library of Science; The Wellcome Trust research charity; software development groups Mozilla Science Lab (a group of researchers, developers, librarians, and publishers) and Digital Science (a software and technology firm); and ORCID, an effort to assign researchers digital identifiers. The collaboration presented its progress on the project at the Mozilla Festival in London that ended last week. (Mozilla is the open software community behind the Firefox browser and other programs.)
The infrastructure of the badges is still being established, with early prototypes scheduled to launch early next year, according to Amye Kenall, the journal development manager of open data initiatives and journals at BioMed Central. She envisions the badge process in the following way: Once an article is published, the publisher would alert software maintained by Mozilla to automatically set up an online form, where authors fill out roles using a detailed contributor taxonomy. After the authors have completed this, the badges would then appear next to their names on the journal article, and double-clicking on a badge would lead to the ORCID site for that particular author, where the author’s badges, integrated with their publishing record, live….
The parties behind the digital badge effort are “looking to change behavior” of scientists in the competitive dog-eat-dog world of academia by acknowledging contributions, says Kaitlin Thaney, director of Mozilla Science Lab. Amy Brand, vice president of academic and research relations and VP of North America at Digital Science, says that the collaboration believes that the badges should be optional, to accommodate old-fashioned or less tech-savvy authors. She says that the digital credentials may improve lab culture, countering situations where junior scientists are caught up in lab politics and the “star,” who didn’t do much of the actual research apart from obtaining the funding, gets to be the first author of the paper and receive the most credit. “All of this calls out for more transparency,” Brand says….”

Urban Observatory Is Snapping 9,000 Images A Day Of New York City


FastCo-Exist: “Astronomers have long built observatories to capture the night sky and beyond. Now researchers at NYU are borrowing astronomy’s methods and turning their cameras towards Manhattan’s famous skyline.
NYU’s Center for Urban Science and Progress has been running what’s likely the world’s first “urban observatory” of its kind for about a year. From atop a tall building in downtown Brooklyn (NYU won’t say its address, due to security concerns), two cameras—one regular one and one that captures infrared wavelengths—take panoramic images of lower and midtown Manhattan. One photo is snapped every 10 seconds. That’s 8,640 images a day, or more than 3 million since the project began (or about 50 terabytes of data).
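The figures above can be checked with a few lines of arithmetic. The sketch assumes a full 365-day year of uninterrupted capture and decimal terabytes (10^12 bytes); both are simplifying assumptions, so the results are rough consistency checks, not reported values.

```python
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds
CAPTURE_INTERVAL_S = 10          # one photo every 10 seconds

# Images per day at one frame every 10 seconds.
images_per_day = SECONDS_PER_DAY // CAPTURE_INTERVAL_S

# Images after one year, assuming no downtime (an assumption).
images_per_year = images_per_day * 365

# Average size per image implied by "about 50 terabytes" over
# "more than 3 million" images, using decimal units.
avg_mb_per_image = 50e12 / 3_000_000 / 1e6

print(images_per_day, images_per_year, round(avg_mb_per_image, 1))
```

At 8,640 frames a day, a year of capture comes to a little over 3.1 million images, in line with the "more than 3 million" figure, and the stated archive size works out to roughly 17 MB per image.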

“The real power of the urban observatory is that you have this synoptic imaging. By synoptic imaging, I mean these large swaths of the city,” says the project’s chief scientist Gregory Dobler, a former astrophysicist at Harvard University and the University of California, Santa Barbara who now heads the 15-person observatory team at NYU.
Dobler’s team is collaborating with New York City officials on the project, which is now expanding to set up stations that study other parts of Manhattan and Brooklyn. Its major goal is to discover information about the urban landscape that can’t be seen at other scales. Such data could lead to applications like tracking which buildings are leaking energy (with the infrared camera), or measuring occupancy patterns of buildings at night, or perhaps detecting releases of toxic chemicals in an emergency.
The video above is an example. The top panel cycles through a one-minute slice of observatory images. The bottom panel is an analysis of the same images in which everything that remains static in each image is removed, such as buildings, trees, and roads. What’s left is an imprint of everything in flux within the scene—the clouds, the cars on the FDR Drive, the boat moving down the East River, and, importantly, a plume of smoke that puffs out of a building.
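One common way to remove what stays static across a sequence of images is to estimate a background as the per-pixel median over the frames and flag pixels that differ from it. The sketch below illustrates that general technique on tiny grayscale grids; the article does not say which method the observatory team uses, and the threshold value is an arbitrary assumption.

```python
from statistics import median


def static_background(frames):
    """Estimate the static scene as the per-pixel median across all frames."""
    height, width = len(frames[0]), len(frames[0][0])
    return [[median(frame[y][x] for frame in frames) for x in range(width)]
            for y in range(height)]


def moving_mask(frame, background, threshold=20):
    """Mark pixels whose brightness differs from the background by more than threshold."""
    return [[abs(pixel - base) > threshold for pixel, base in zip(row, base_row)]
            for row, base_row in zip(frame, background)]


# Three 2x2 grayscale frames; only the middle one contains a bright "plume".
frames = [
    [[100, 100], [100, 100]],
    [[100, 200], [100, 100]],
    [[100, 100], [100, 100]],
]
background = static_background(frames)
print(moving_mask(frames[1], background))
```

Buildings and roads fall into the median background, while a transient plume or a passing car shows up only in the mask of the frames where it appears.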
“Periodically, a building will burp,” says Dobler. “It’s hard to see the puffs of smoke . . . but we can isolate that plume and essentially identify it.” (Dobler has done so by highlighting it in red in the top panel.)
To the natural privacy concerns about this kind of program, Dobler emphasizes that the pictures are only from an 8-megapixel camera (the same found in the iPhone 6) and aren’t clear enough to see inside a window or make out individuals. As a further privacy safeguard, the images are analyzed to look only at “aggregate” measures, such as the patterns of nighttime energy usage, rather than specific buildings. “We’re not really interested in looking at a given building, and saying, hey, these guys are particular offenders,” he says. (He also says the team is not looking at uses for the data in security applications.) However, Dobler was not able to answer a question as to whether the project’s partners at city agencies are able to access data analysis for individual buildings….”

Finding Collaborators: Toward Interactive Discovery Tools for Research Network Systems


New paper by Charles D Borromeo, Titus K Schleyer, Michael J Becich, and Harry Hochheiser: “Background: Research networking systems hold great promise for helping biomedical scientists identify collaborators with the expertise needed to build interdisciplinary teams. Although efforts to date have focused primarily on collecting and aggregating information, less attention has been paid to the design of end-user tools for using these collections to identify collaborators. To be effective, collaborator search tools must provide researchers with easy access to information relevant to their collaboration needs.
Objective: The aim was to study user requirements and preferences for research networking system collaborator search tools and to design and evaluate a functional prototype.
Methods: Paper prototypes exploring possible interface designs were presented to 18 participants in semistructured interviews aimed at eliciting collaborator search needs. Interview data were coded and analyzed to identify recurrent themes and related software requirements. Analysis results and elements from paper prototypes were used to design a Web-based prototype using the D3 JavaScript library and VIVO data. Preliminary usability studies asked 20 participants to use the tool and to provide feedback through semistructured interviews and completion of the System Usability Scale (SUS).
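For readers unfamiliar with the System Usability Scale, the standard scoring rule can be sketched as follows: each of the ten items is answered on a 1 to 5 scale, odd-numbered items contribute the response minus 1, even-numbered items contribute 5 minus the response, and the sum is multiplied by 2.5 to give a score from 0 to 100. The participant responses below are invented for illustration and are not data from the study.

```python
from statistics import mean, stdev


def sus_score(responses):
    """Compute one participant's SUS score (0-100) from ten 1-5 responses."""
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item responses")
    total = 0
    for item, response in enumerate(responses, start=1):
        if item % 2 == 1:
            total += response - 1   # odd items are positively worded
        else:
            total += 5 - response   # even items are negatively worded
    return total * 2.5


# Hypothetical responses from three participants.
participants = [
    [5, 1, 5, 1, 5, 1, 5, 1, 5, 1],
    [4, 2, 4, 2, 4, 2, 4, 2, 4, 2],
    [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
]
scores = [sus_score(p) for p in participants]
print(scores, mean(scores), stdev(scores))
```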
Results: Initial interviews identified consensus regarding several novel requirements for collaborator search tools, including chronological display of publication and research funding information, the need for conjunctive keyword searches, and tools for tracking candidate collaborators. Participant responses were positive (SUS score: mean 76.4%, SD 13.9). Opportunities for improving the interface design were identified.
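A conjunctive keyword search returns only those researchers who match every search term, not just any one of them. A minimal sketch, assuming researcher profiles are available as a hypothetical mapping from name to keyword list (the names and keywords are invented, and this is not the prototype's actual code):

```python
def conjunctive_search(profiles, terms):
    """Return names whose keyword set contains ALL of the search terms."""
    wanted = {term.lower() for term in terms}
    return [name for name, keywords in profiles.items()
            if wanted <= {keyword.lower() for keyword in keywords}]


# Hypothetical researcher profiles.
profiles = {
    "Researcher A": ["Genomics", "Machine Learning", "Oncology"],
    "Researcher B": ["Genomics", "Epidemiology"],
    "Researcher C": ["Machine Learning", "Genomics"],
}
print(conjunctive_search(profiles, ["genomics", "machine learning"]))
```

Adding a term can only shrink the result list, which is what makes this kind of search useful for narrowing a large pool of candidate collaborators.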
Conclusions: Interactive, timeline-based displays that support comparison of researcher productivity in funding and publication have the potential to effectively support searching for collaborators. Further refinement and longitudinal studies may be needed to better understand the implications of collaborator search tools for researcher workflows.”

How Wikipedia Data Is Revolutionizing Flu Forecasting


They say their model has the potential to transform flu forecasting from a black art to a modern science as well-founded as weather forecasting.
Flu takes between 3,000 and 49,000 lives each year in the U.S., so an accurate forecast can have a significant impact on the way society prepares for the epidemic. The current method of monitoring flu outbreaks is somewhat antiquated. It relies on a voluntary system in which public health officials report the percentage of patients they see each week with influenza-like illnesses. This is defined as the percentage of people with a temperature higher than 100 degrees Fahrenheit, a cough and no explanation other than flu.
These numbers give a sense of the incidence of flu at any instant but the accuracy is clearly limited. They do not, for example, account for people with flu who do not seek treatment or people with flu-like symptoms who seek treatment but do not have flu.
There is another significant problem. The network that reports this data is relatively slow. It takes about two weeks for the numbers to filter through the system so the data is always weeks old.
That’s why the CDC is interested in finding new ways to monitor the spread of flu in real time. Google, in particular, has used the number of searches for flu and flu-like symptoms to forecast flu in various parts of the world. That approach has had considerable success but also some puzzling failures. One problem, however, is that Google does not make its data freely available and this lack of transparency is a potential source of trouble for this kind of research.
So Hickmann and co have turned to Wikipedia. Their idea is that the variation in numbers of people accessing articles about flu is an indicator of the spread of the disease. And since Wikipedia makes this data freely available to any interested party, it is an entirely transparent source that is likely to be available for the foreseeable future….
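The idea that page-view counts track flu incidence can be illustrated with an ordinary least-squares line relating weekly article views to the reported percentage of influenza-like illness. This is a deliberately simplified stand-in, not the forecasting model from the paper, and every number below is invented for the example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(slope, intercept, x):
    """Estimate y for a new x using the fitted line."""
    return slope * x + intercept


# Hypothetical weekly page views and matching ILI percentages.
weekly_views = [1000, 1500, 2200, 3000]
ili_percent = [1.0, 1.5, 2.2, 3.0]

slope, intercept = fit_line(weekly_views, ili_percent)
print(predict(slope, intercept, 2600))
```

Because page-view counts are available almost immediately, a fitted relationship of this kind could in principle give an estimate for the current week while the official figures are still working through the reporting system.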
Ref: arxiv.org/abs/1410.7716 : Forecasting the 2013–2014 Influenza Season using Wikipedia”

The New Thing in Google Flu Trends Is Traditional Data


in the New York Times: “Google is giving its Flu Trends service an overhaul — “a brand new engine,” as it announced in a blog post on Friday.

The new thing is actually traditional data from the Centers for Disease Control and Prevention that is being integrated into the Google flu-tracking model. The goal is greater accuracy after the Google service had been criticized for consistently over-estimating flu outbreaks in recent years.

The main critique came in an analysis done by four quantitative social scientists, published earlier this year in an article in Science magazine, “The Parable of Google Flu: Traps in Big Data Analysis.” The researchers found that the most accurate flu predictor was a data mash-up that combined Google Flu Trends, which monitored flu-related search terms, with the official C.D.C. reports from doctors on influenza-like illness.

The Google Flu Trends team is heeding that advice. In the blog post, Christian Stefansen, a Google senior software engineer, wrote, “We’re launching a new Flu Trends model in the United States that, like many of the best performing methods in the literature, takes official CDC flu data into account as the flu season progresses.”

Google’s flu-tracking service has had its ups and downs. Its triumph came in 2009, when it gave an advance signal of the severity of the H1N1 outbreak, two weeks or so ahead of official statistics. In a 2009 article in Nature explaining how Google Flu Trends worked, the company’s researchers did, as the Friday post notes, say that the Google service was not intended to replace official flu surveillance methods and that it was susceptible to “false alerts” — anything that might prompt a surge in flu-related search queries.

Yet those caveats came a couple of pages into the Nature article. And Google Flu Trends became a symbol of the superiority of the new, big data approach — computer algorithms mining data trails for collective intelligence in real time. To enthusiasts, it seemed so superior to the antiquated method of collecting health data that involved doctors talking to patients, inspecting them and filing reports.

But Google’s flu service greatly overestimated the number of cases in the United States in the 2012-13 flu season — a well-known miss — and, according to the research published this year, has persistently overstated flu cases over the years. In the Science article, the social scientists called it “big data hubris.”