New flu tracker uses Google search data better than Google


 at ArsTechnica: “With big data comes big noise. Google learned this lesson the hard way with its now kaput Google Flu Trends. The online tracker, which used Internet search data to predict real-life flu outbreaks, emerged amid fanfare in 2008. Then it met a quiet death this August after repeatedly coughing up bad estimates.

But big Internet data isn’t out of the disease tracking scene yet.

With hubris firmly in check, a team of Harvard researchers has come up with a way to tame the unruly data, combine it with other data sets, and continually calibrate it to track flu outbreaks with less error. Their new model, published Monday in the Proceedings of the National Academy of Sciences, outperforms Google Flu Trends and other models with at least double the accuracy. If the model holds up in coming flu seasons, it could reinstate some optimism in using big data to monitor disease and herald a wave of more accurate second-generation models.

Big data has a lot of potential, Samuel Kou, a statistics professor at Harvard University and coauthor on the new study, told Ars. It’s just a question of using the right analytics, he said.

Kou and his colleagues built on Google’s flu tracking model for their new version, called ARGO (AutoRegression with GOogle search data). Google Flu Trends basically relied on trends in Internet search terms, such as headache and chills, to estimate the number of flu cases. Those search terms were correlated with flu outbreak data collected by the Centers for Disease Control and Prevention. The CDC’s data relies on clinical reports from around the country. But compiling and analyzing that data can be slow, leading to a lag time of one to three weeks. The Google data, on the other hand, offered near real-time tracking for health experts to manage and prepare for outbreaks.

At first, Google’s tracker appeared to be pretty good, matching the CDC’s late-breaking data somewhat closely. But two notable stumbles led to its ultimate downfall: an underestimate of the 2009 H1N1 swine flu outbreak and an alarming overestimate (almost double the real numbers) of the 2012-2013 flu season’s cases…. For ARGO, he and his colleagues took the trend data and then designed a model that could self-correct for changes in how people search. The model has a two-year sliding window in which it re-calibrates current search term trends against the CDC’s historical flu data (the gold standard for flu data). They also made sure to exclude winter search terms, such as March Madness and the Oscars, so they didn’t get accidentally correlated with seasonal flu trends. Last, they incorporated data on the historical seasonality of flu.
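A minimal sketch of that design, assuming weekly data, a 104-week (two-year) window, and an L1-regularized linear fit; the paper's exact specification, transformations, and seasonality terms are richer than this:

```python
import numpy as np
from sklearn.linear_model import Lasso

def argo_estimate(cdc_history, search_volumes, window=104):
    """Estimate current-week flu activity, ARGO-style (illustrative).

    cdc_history    : np.ndarray of weekly CDC flu rates up to LAST week
                     (CDC reporting lags by a week or more)
    search_volumes : np.ndarray, one row per week of search-term volumes,
                     up to and including the CURRENT week (one row longer)
    window         : sliding re-calibration window in weeks (two years)
    """
    y = cdc_history[-window:]                    # targets: recent CDC rates
    X = np.column_stack([
        cdc_history[-window - 1:-1],             # autoregressive lag term
        search_volumes[-window - 1:-1],          # same weeks' search volumes
    ])
    model = Lasso(alpha=0.1).fit(X, y)           # sparse fit drops noisy terms

    # Current week: last available CDC rate plus this week's searches.
    x_now = np.concatenate([[cdc_history[-1]], search_volumes[-1]])
    return model.predict(x_now.reshape(1, -1))[0]
```

Refitting this inside a loop, one week at a time, is what gives the two-year sliding window its self-correcting behavior as search habits drift.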

The result was a model that significantly outperformed the Google Flu Trends estimates for the period from March 29, 2009, to July 11, 2015. ARGO also beat out other models, including one based on current and historical CDC data….(More)”

See also Proceedings of the National Academy of Sciences, 2015. DOI: 10.1073/pnas.1515373112

Politics and the New Machine


Jill Lepore in the New Yorker on “What the turn from polls to data science means for democracy”: “…The modern public-opinion poll has been around since the Great Depression, when the response rate—the number of people who take a survey as a percentage of those who were asked—was more than ninety per cent. The participation rate—the number of people who take a survey as a percentage of the population—is far lower. Election pollsters sample only a minuscule portion of the electorate, not uncommonly something on the order of a couple of thousand people out of the more than two hundred million Americans who are eligible to vote. The promise of this work is that the sample is exquisitely representative. But the lower the response rate the harder and more expensive it becomes to realize that promise, which requires both calling many more people and trying to correct for “non-response bias” by giving greater weight to the answers of people from demographic groups that are less likely to respond. Pollster.com’s Mark Blumenthal has recalled how, in the nineteen-eighties, when the response rate at the firm where he was working had fallen to about sixty per cent, people in his office said, “What will happen when it’s only twenty? We won’t be able to be in business!” A typical response rate is now in the single digits.
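The “greater weight” correction Lepore mentions can be made concrete with a small sketch; all shares and opinions below are invented for illustration:

```python
# Post-stratification in miniature: weight up answers from groups that
# under-respond until the sample matches the population.
population_share = {"18-29": 0.22, "30-49": 0.34, "50+": 0.44}
sample_share     = {"18-29": 0.10, "30-49": 0.30, "50+": 0.60}  # young respond less
support          = {"18-29": 0.65, "30-49": 0.52, "50+": 0.41}  # observed answers

weights = {g: population_share[g] / sample_share[g] for g in population_share}

raw      = sum(sample_share[g] * support[g] for g in support)               # ~0.47
weighted = sum(weights[g] * sample_share[g] * support[g] for g in support)  # ~0.50

print(weights)        # {'18-29': 2.2, '30-49': 1.13..., '50+': 0.73...}
print(raw, weighted)  # the correction shifts the estimate by about 3 points
```

The catch Lepore describes sits exactly here: as response rates fall into the single digits, these weights grow large, and the estimate leans ever harder on a handful of respondents.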

Meanwhile, polls are wielding greater influence over American elections than ever….

Still, data science can’t solve the biggest problem with polling, because that problem is neither methodological nor technological. It’s political. Pollsters rose to prominence by claiming that measuring public opinion is good for democracy. But what if it’s bad?

A “poll” used to mean the top of your head. Ophelia says of Polonius, “His beard was as white as snow: All flaxen was his poll.” When voting involved assembling (all in favor of Smith stand here, all in favor of Jones over there), counting votes required counting heads; that is, counting polls. Eventually, a “poll” came to mean the count itself. By the nineteenth century, to vote was to go “to the polls,” where, more and more, voting was done on paper. Ballots were often printed in newspapers: you’d cut one out and bring it with you. With the turn to the secret ballot, beginning in the eighteen-eighties, the government began supplying the ballots, but newspapers kept printing them; they’d use them to conduct their own polls, called “straw polls.” Before the election, you’d cut out your ballot and mail it to the newspaper, which would make a prediction. Political parties conducted straw polls, too. That’s one of the ways the political machine worked….

Ever since Gallup, two things have been called polls: surveys of opinions and forecasts of election results. (Plenty of other surveys, of course, don’t measure opinions but instead concern status and behavior: Do you own a house? Have you seen a doctor in the past month?) It’s not a bad idea to reserve the term “polls” for the kind meant to produce election forecasts. When Gallup started out, he was skeptical about using a survey to forecast an election: “Such a test is by no means perfect, because a preelection survey must not only measure public opinion in respect to candidates but must also predict just what groups of people will actually take the trouble to cast their ballots.” Also, he didn’t think that predicting elections constituted a public good: “While such forecasts provide an interesting and legitimate activity, they probably serve no great social purpose.” Then why do it? Gallup conducted polls only to prove the accuracy of his surveys, there being no other way to demonstrate it. The polls themselves, he thought, were pointless…

If public-opinion polling is the child of a strained marriage between the press and the academy, data science is the child of a rocky marriage between the academy and Silicon Valley. The term “data science” was coined in 1960, one year after the Democratic National Committee hired Simulmatics Corporation, a company founded by Ithiel de Sola Pool, a political scientist from M.I.T., to provide strategic analysis in advance of the upcoming Presidential election. Pool and his team collected punch cards from pollsters who had archived more than sixty polls from the elections of 1952, 1954, 1956, 1958, and 1960, representing more than a hundred thousand interviews, and fed them into a UNIVAC. They then sorted voters into four hundred and eighty possible types (for example, “Eastern, metropolitan, lower-income, white, Catholic, female Democrat”) and sorted issues into fifty-two clusters (for example, foreign aid). Simulmatics’ first task, completed just before the Democratic National Convention, was a study of “the Negro vote in the North.” Its report, which is thought to have influenced the civil-rights paragraphs added to the Party’s platform, concluded that between 1954 and 1956 “a small but significant shift to the Republicans occurred among Northern Negroes, which cost the Democrats about 1 per cent of the total votes in 8 key states.” After the nominating convention, the D.N.C. commissioned Simulmatics to prepare three more reports, including one that involved running simulations about different ways in which Kennedy might discuss his Catholicism….

Data science may well turn out to be as flawed as public-opinion polling. But a stage in the development of any new tool is to imagine that you’ve perfected it, in order to ponder its consequences. I asked Hilton to suppose that there existed a flawless tool for measuring public opinion, accurately and instantly, a tool available to voters and politicians alike. Imagine that you’re a member of Congress, I said, and you’re about to head into the House to vote on an act—let’s call it the Smeadwell-Nutley Act. As you do, you use an app called iThePublic to learn the opinions of your constituents. You oppose Smeadwell-Nutley; your constituents are seventy-nine per cent in favor of it. Your constituents will instantly know how you’ve voted, and many have set up an account with Crowdpac to make automatic campaign donations. If you vote against the proposed legislation, your constituents will stop giving money to your reëlection campaign. If, contrary to your convictions but in line with your iThePublic, you vote for Smeadwell-Nutley, would that be democracy? …(More)”

 

Predictive policing is ‘technological racism’


Shaun King at the New York Daily News: “The future is here.

For years now, the NYPD, the Miami PD, and many police departments around the country have been using new technology that claims it can predict where crime will happen and where police should focus their energies in order to prevent it. They call it predictive policing. Months ago, I raised several red flags about such software because it does not appear to properly account for the presence of racism or racial profiling in how it predicts where crimes will be committed.

See, these systems claim to predict where crimes will happen based on prior arrest data. What they don’t account for is the widespread reality that race and racial profiling have everything to do with who is arrested and where they are arrested. For instance, study after study has shown that white people actually are more likely to sell drugs and do drugs than black people, but are exponentially less likely to be arrested for either crime. But, and this is where these systems fail, if the only data being entered into the systems is based not on the more complex reality of who sells and purchases drugs, but on a racial stereotype, then the system will only perpetuate the racism that preceded it…

In essence, it’s not predicting who will sell drugs and where they will sell it, as much as it is actually predicting where a certain race of people may sell or purchase drugs. It’s technological racism at its finest.
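The feedback loop King describes can be shown with a toy simulation; every number below is an invented assumption, and the point is only that equal offending plus unequal policing produces unequal arrest data, which a naive predictor then reproduces:

```python
import random
random.seed(0)

TRUE_OFFENSE_RATE = 0.05              # identical in both neighborhoods
patrols = {"A": 800, "B": 200}        # historical bias: A is over-policed

# Each patrol observes an offense with the same probability in both
# places; only the amount of police attention differs.
arrests = {h: sum(random.random() < TRUE_OFFENSE_RATE for _ in range(n))
           for h, n in patrols.items()}
print(arrests)                        # roughly {'A': 40, 'B': 10}

# A "predictive" model trained on arrest counts alone now ranks A as the
# hot spot and allocates the next period's patrols accordingly: the
# original 80/20 bias is reproduced, though offending was equal all along.
total = sum(arrests.values())
next_patrols = {h: round(1000 * arrests[h] / total) for h in arrests}
print(next_patrols)
```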

Now, in addition to predictive policing, the state of Pennsylvania is pioneering predictive prison sentencing. Through complex questionnaires and surveys completed not by inmates, but by prison staff members, inmates may be given a lower bail and shorter sentence, or a higher bail and a lengthier prison sentence. The surveys focus on family background, economic background, prior crimes, education levels and more.

When all of the data is scored, the result classifies prisoners as low, medium or high risk. While this may sound benign, it isn’t. No prisoner should ever be given a harsh sentence or an outrageous bail amount because of their family background or economic status. Even these surveys lend themselves to being racist and putting black and brown women and men in positions where it’s nearly impossible to get a good score because of prevalent problems in communities of color….(More)”
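The article does not publish Pennsylvania’s instrument, but questionnaire-based risk tools generally work by summing weighted items into a score with cutoffs; here is a sketch in which every item, weight, and cutoff is invented:

```python
# Invented weights for illustration only. Note that items like
# unemployment and family incarceration move the score directly,
# which is precisely the property the column objects to.
WEIGHTS = {"prior_convictions": 3, "unemployed": 2,
           "family_incarceration": 2, "age_over_25": -1}

def risk_class(answers):
    score = sum(WEIGHTS[item] * value for item, value in answers.items())
    if score < 4:
        return "low"
    if score < 9:
        return "medium"
    return "high"

print(risk_class({"prior_convictions": 1, "unemployed": 1,
                  "family_incarceration": 1, "age_over_25": 0}))  # 'medium'
```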

How Satellite Data and Artificial Intelligence could help us understand poverty better


Maya Craig at Fast Company: “Governments and development organizations currently measure poverty levels by conducting door-to-door surveys. The new partnership will test the use of AI to supplement these surveys and increase the accuracy of poverty data. Orbital said its AI software will analyze satellite images to see if characteristics such as building height and rooftop material can effectively indicate wealth.
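A rough sketch of what such a test might look like; this is not Orbital’s pipeline, and the feature names, values, and linear model are all illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def tile_features(tile):
    # In practice these would come from a vision model run on imagery.
    return [tile["mean_building_height"], tile["metal_roof_share"],
            tile["building_density"]]

# Imagery-derived tiles paired with ground truth from door-to-door surveys.
tiles = [
    {"mean_building_height": 3.1, "metal_roof_share": 0.9, "building_density": 0.7},
    {"mean_building_height": 8.5, "metal_roof_share": 0.3, "building_density": 0.4},
]
survey_wealth = [0.8, 2.3]            # e.g. consumption per capita

X = np.array([tile_features(t) for t in tiles])
model = LinearRegression().fit(X, np.array(survey_wealth))
# How well predictions track held-out survey data answers the pilot's
# question: do these image characteristics effectively indicate wealth?
```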

The pilot study will be conducted in Sri Lanka. If successful, the World Bank hopes to scale it worldwide. A recent study conducted by the organization found that more than 50 countries lack legitimate poverty estimates, which limits the ability of the development community to support the world’s poorest populations.

“Data deprivation is a serious issue, especially in many of the countries where we need it most,” says David Newhouse, senior economist at the World Bank. “This technology has the potential to help us get that data more frequently and at a finer level of detail than is currently possible.”

The announcement is the latest in an emerging industry of AI analysis of satellite photos. A growing number of investors and entrepreneurs are betting that the convergence of these fields will have far-reaching impacts on business, policy, resource management and disaster response.

Wall Street’s biggest hedge-fund businesses have begun using the technology to improve investment strategies. The Pew Charitable Trusts employs the method to monitor oceans for illegal fishing activities. And startups like San Francisco-based Mavrx use similar analytics to optimize crop harvests.

The commercial earth-imaging satellite market, valued at $2.7 billion in 2014, is predicted to grow by 14% each year through the decade, according to a recent report.
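Compounding those two reported figures gives a back-of-the-envelope sense of scale (a check, not a number from the report):

```python
# $2.7B in 2014 growing 14% per year implies roughly $5.9B by 2020.
value_2014 = 2.7                   # billions USD, per the report
print(value_2014 * 1.14 ** 6)      # ~5.93
```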

As recently as two years ago, there were just four commercial earth-imaging satellites operating in the U.S., and government contracts accounted for about 70% of imagery sales. By 2020, there will be hundreds of private-sector “smallsats” in orbit capturing imagery that will be easily accessible online. Companies like Skybox Imaging and Planet Labs already have the first of these smallsats active, with plans for more.

The images generated by these companies will be among the world’s largest data sets. And recent breakthroughs in AI research have made it possible to analyze these images to inform decision-making…(More)”

Push, Pull, and Spill: A Transdisciplinary Case Study in Municipal Open Government


Paper by Jan Whittington et al: “Cities hold considerable information, including details about the daily lives of residents and employees, maps of critical infrastructure, and records of the officials’ internal deliberations. Cities are beginning to realize that this data has economic and other value: If done wisely, the responsible release of city information can also release greater efficiency and innovation in the public and private sector. New services are cropping up that leverage open city data to great effect.

Meanwhile, activist groups and individual residents are placing increasing pressure on state and local government to be more transparent and accountable, even as others sound an alarm over the privacy issues that inevitably attend greater data promiscuity. This takes the form of political pressure to release more information, as well as increased requests for information under the many public records acts across the country.

The result of these forces is that cities are beginning to open their data as never before. It turns out there is surprisingly little research to date into the important and growing area of municipal open data. This article is among the first sustained, cross-disciplinary assessments of an open municipal government system. We are a team of researchers in law, computer science, information science, and urban studies. We have worked hand-in-hand with the City of Seattle, Washington, for the better part of a year to understand its current procedures from each disciplinary perspective. Based on this empirical work, we generate a set of recommendations to help the city manage risk latent in opening its data….(More)”

Open Data: Six Stories About Impact in the UK


Laura Bacon at Omidyar Network: “In 2010, the year the United Kingdom launched its open data portal, a Transparency & Accountability Initiative report highlighted both the promise and potential of open data to improve services and create economic growth.

In the five years since, the UK’s progress in opening its data has been pioneering and swift, but not without challenges and questions about impact. It’s this qualified success that prompted us to commission this report in an effort to understand if the promise and potential of open data are being realized, and, specifically, to…

… explore and document open data’s social, cultural, political, and economic impact;

… shine a light on the range of sectors and ways in which open data can make a difference; and

… profile the open data value chain, including its supply, demand, use, and re-use.

The report’s author, Becky Hogge, finds that open data has had catalytic and significant impact and that time will likely reveal even further value. She also flags critical challenges and obstacles, including closed datasets, valuable data not currently being collected, and important privacy considerations…. Download the full report here.”

Remaking Participation: Science, Environment and Emergent Publics


Book edited by Jason Chilvers and Matthew Kearnes: “Changing relations between science and democracy – and controversies over issues such as climate change, energy transitions, genetically modified organisms and smart technologies – have led to a rapid rise in new forms of public participation and citizen engagement. While most existing approaches adopt fixed meanings of ‘participation’ and are consumed by questions of method or critiquing the possible limits of democratic engagement, this book offers new insights that rethink public engagements with science, innovation and environmental issues as diverse, emergent and in the making. Bringing together leading scholars on science and democracy, working between science and technology studies, political theory, geography, sociology and anthropology, the volume develops relational and co-productionist approaches to studying and intervening in spaces of participation. New empirical insights into the making, construction, circulation and effects of participation across cultures are illustrated through examples ranging from climate change and energy to nanotechnology and mundane technologies, from institutionalised deliberative processes to citizen-led innovation and activism, and from the global north to global south. This new way of seeing participation in science and democracy opens up alternative paths for reconfiguring and remaking participation in more experimental, reflexive, anticipatory and responsible ways….(More)”

Role of Citizens in India’s Smart Cities Challenge


Florence Engasser and Tom Saunders at the World Policy Blog: “India faces a wide range of urban challenges — from serious air pollution and poor local governance, to badly planned cities and a lack of decent housing. India’s Smart Cities Challenge, which has now selected 98 of the 100 cities that will receive funding, could go a long way in addressing these issues.

According to Prime Minister Narendra Modi, there are five key instruments that make a “smart” city: the use of clean technologies, the use of information and communications technology (ICT), private sector involvement, citizen participation and smart governance. There are good examples of new practices for each of these pillars.

For example, New Delhi recently launched a program to replace streetlights with energy efficient LEDs. The Digital India program is designed to upgrade the country’s IT infrastructure and includes plans to build “broadband highways” across the country. As for private sector participation, the Indian government is trying to encourage it by listing sectors and opportunities for public-private partnerships.

Citizen participation is one of Modi’s five key instruments, but it is an area in which smart city pilots around the world have tended to perform least well. While people are the implied beneficiaries of programs that aim to improve efficiency and reduce waste, they are rarely given a chance to participate in the design or delivery of smart city projects, which are usually implemented and managed by experts who have only a vague idea of the challenges that local communities face.

Citizen Participation

Engaging citizens is especially important in an Indian context because there have already been several striking examples of failed urban redevelopments that have blatantly lacked any type of community consultation or participation….

In practice, how can Indian cities engage residents in their smart city projects?

There are many tools available to policymakers — from traditional community engagement activities such as community meetings, to websites like Mygov.in that ask for feedback on policies. Now, there are a number of reasons to think smartphones could be an important tool to help improve collaboration between residents and city governments in Indian cities.

First, while only around 10 percent of Indians currently own a smartphone, this is predicted to rise to around half by 2020, and will be much higher in urban areas. A key driver of this is local manufacturing giants like Micromax, which have revolutionized low-cost technology in India, with smartphones costing as little as $30 (compared to around $800 for the newest iPhone).

Second, smartphone apps give city governments the potential to interact directly with citizens to make the most of what they know and feel about their communities. This can happen passively: the Waze Connected Citizens program, for example, shares user location data with city governments to help improve transport planning. It can also be more active: FixMyStreet, for example, allows people to report maintenance issues like potholes to their city government.

Third, smartphones are one of the main ways for people to access social media, and researchers are now developing a range of new and innovative solutions to address urban challenges using these platforms. This includes Petajakarta, which creates crowd-sourced maps of flooding in Jakarta by aggregating tweets that mention the word ‘flood.’
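That aggregation step is simple enough to sketch; the tweet fields, keyword (the article says ‘flood’; the live deployment presumably also matched Indonesian terms such as ‘banjir’), and roughly one-kilometer grid cells are illustrative assumptions:

```python
from collections import Counter

def flood_map(tweets, keyword="flood", cell=0.01):
    """Bin geotagged tweets mentioning the keyword into grid cells."""
    counts = Counter()
    for t in tweets:
        if keyword in t["text"].lower() and "lat" in t:
            cell_id = (round(t["lat"] / cell) * cell,
                       round(t["lon"] / cell) * cell)
            counts[cell_id] += 1
    return counts

tweets = [{"text": "Flood near the station!", "lat": -6.2088, "lon": 106.8456}]
print(flood_map(tweets))   # one hotspot cell near (-6.21, 106.85)
```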

Made in India

Considering some of the above trends, it is interesting to think about the role smartphones could play in the governance of Indian cities and in better engaging communities. India is far from being behind in the field, and there are already a few really good examples of innovative smartphone applications made in India.

Swachh Bharat Abhiyan (translated as Clean India Initiative) is a campaign launched by Modi in October 2014, covering over 4,000 towns all over the country, with the aim to clean India’s streets. The Clean India mobile application, launched at the end of 2014 to coincide with Modi’s initiative, was developed by Mahek Shah and allows users to take pictures to report, geo-locate, and timestamp streets that need cleaning or problems that the local authorities need to fix.

Similar to FixMyStreet, users are able to tag their reports with keywords to categorize problems. Today, Clean India has been downloaded over 12,000 times and has 5,000 active users. Although still at a very early stage, Clean India has great potential to facilitate the complaint and reporting process by empowering people to become the eyes and ears on the ground for municipalities, which are often completely unaware of issues that matter to residents.

In Bangalore, an initiative by the MOD Institute, a local nongovernmental organization, enabled residents to come together, online and offline, to create a community vision for the redevelopment of Shanthinagar, a neighborhood of the city. The project, Next Bengaluru, used new technologies to engage local residents in urban planning and tap into their knowledge of the area to promote a vision matching their real needs.

The initiative was very successful. In just three months, between December 2014 and March 2015, over 1,200 neighbors and residents visited the on-site community space, and the team crowd-sourced more than 600 ideas for redevelopment and planning both on-site and through the Next Bengaluru website.

The MOD Institute now intends to work with local urban planners to try to get these ideas adopted by the city government. The project has also developed a pilot app that will enable people to map abandoned urban spaces via smartphone and messaging services in the future.

Finally, Safecity India is a nonprofit organization providing a platform for anyone to share, anonymously or not, personal stories of sexual harassment and abuse in public spaces. Men and women can report different types of abuses — from ogling, whistles and comments, to stalking, groping and sexual assault. The aggregated data is then mapped, allowing citizens and governments to better understand crime trends at hyper-local levels.

Since its launch in 2012, Safecity has received more than 4,000 reports of sexual crime and harassment in over 50 cities across India and Nepal. Safecity helps generate greater awareness, breaks the cultural stigma associated with reporting sexual abuse and gives voice to grassroots movements and campaigns such as Sayfty, Protsahan, or Stop Street Harassment, forcing authorities to take action….(More)

Behavioural Science, Randomized Evaluations and the Transformation of Public Policy: The Case of the UK Government


Chapter by Peter John: “Behaviour change policy conveys a powerful image: groups of psychologists and scientists, maybe wearing white coats, messing with the minds of citizens, doing experiments on them without their consent, and seeking to manipulate their behaviours. Huddled in an office in the heart of Whitehall, or maybe working out of a windowless room in the White House, behavioural scientists are redesigning the messages and regulations that governments make, operating very far from the public’s view. The unsuspecting citizen becomes something akin to the subjects of science fiction novels, such as Huxley’s Brave New World or Zamyatin’s We. The emotional response to these developments is to cry out for a more humanistic form of public policy, a more participatory form of governance, and to base public policy on the common sense and good judgements of citizens and their elected representatives.

Of course, such an account is a massive stereotype, but something of this viewpoint has emerged as a backdrop to critical academic work on the use of behavioural science in government in what is described as the rise of the psychological state (Jones et al. 2013a, b), which might be seen to represent a step-change in the use of psychological and other forms of behavioural research to design public policies. Such a claim speaks more generally to the use of scientific ideas by government since the eighteenth century, which has been subject to a considerable amount of theoretical work in recent years, drawing on the work of Foucault, and which has developed into explorations of the theory and practice of governmentality (see Jones et al. 2013: 182-188).

With behaviour change, the ‘central concern has been to critically evaluate the broader ethical concerns of behavioural governance, which includes tracing geo-historical contingencies of knowledge mobilized in the legitimation of the behavior change agenda itself’ (190). This line of work presents a subtle set of arguments and claims that an empirical account, such as that presented in this chapter, cannot—nor should—challenge. Nonetheless, it is instructive to find out more about the phenomenon under study and to understand how the uses of behavioural ideas and randomized evaluations are limited and structured by the institutions and actors in the political process, which are following political and organizational ends. Of particular interest is the incremental and patchy nature of the diffusion of ideas, and how the use of behavioural sciences meshes with existing standard operating procedures and routines of bureaucracies. That said, the behavioural sciences can make progress within the fragmented and decentralized policy process, and have the power to create innovations in public policies, often helped by articulate advocates of such measures.

The path of ideas in public policy is usually slow, one of gradual diffusion and small changes in operating assumptions, and this route is likely for the use of behavioural sciences. The implication of this line of argument is that agency as well as structure plays an important role in the adoption and diffusion of the ideas from the behavioural sciences. It implies a more limited and less uniform use of ideas and evidence than implied by the critical writers in this field, but one where public argument and debate play a central role….(More)”

Using Crowdsourcing to Track the Next Viral Disease Outbreak


The TakeAway: “Last year’s Ebola outbreak in West Africa killed more than 11,000 people. The epidemic may have diminished, but public health officials think that another major outbreak of infectious disease is fast approaching, and they’re busy preparing for it.

Boston public radio station WGBH recently partnered with The GroundTruth Project and NOVA Next on a series called “Next Outbreak.” As part of the series, they reported on an innovative global online monitoring system called HealthMap, which uses the power of the internet and crowdsourcing to detect and track emerging infectious diseases, and also more common ailments like the flu.

Researchers at Boston Children’s Hospital are the ones behind HealthMap, and they use it to tap into tens of thousands of sources of online data, including social media, news reports, and blogs, to curate information about outbreaks. Dr. John Brownstein, chief innovation officer at Boston Children’s Hospital and co-founder of HealthMap, says that smarter data collection can help to quickly detect and track emerging infectious diseases, fatal or not.
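A toy version of that pipeline, scanning incoming items for disease and place terms, might look like the sketch below; the term lists, feed format, and matching logic are illustrative assumptions, and the real system curates tens of thousands of sources with far more sophisticated processing:

```python
DISEASES = {"ebola", "flu", "influenza", "measles", "cholera"}
PLACES = {"monrovia", "boston", "jakarta", "lagos"}

def scan(items):
    """Emit candidate outbreak signals from raw text items."""
    signals = []
    for item in items:
        words = set(item["text"].lower().split())
        diseases, places = DISEASES & words, PLACES & words
        if diseases and places:
            signals.append({"disease": diseases.pop(), "place": places.pop(),
                            "source": item["source"]})
    return signals

feed = [{"text": "Cholera cases reported in Monrovia market", "source": "news"},
        {"text": "Traffic jam downtown", "source": "blog"}]
print(scan(feed))  # [{'disease': 'cholera', 'place': 'monrovia', 'source': 'news'}]
```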

“Traditional public health is really slowed down by the communication process: People get sick, they’re seen by healthcare providers, they get laboratory confirmed, information flows up the channels to state and local health [agencies], national governments, and then to places like the WHO,” says Dr. Brownstein. “Each one of those stages can take days, weeks, or even months, and that’s the problem if you’re thinking about a virus that can spread around the world in a matter of days.”

The HealthMap team looks at a variety of communication channels to undo the existing hierarchy of health information.

“We make everyone a stakeholder when it comes to data about outbreaks, including consumers,” says Dr. Brownstein. “There are a suite of different tools that public health officials have at their disposal. What we’re trying to do is think about how to communicate and empower individuals to really understand what the risks are, what the true information is about a disease event, and what they can do to protect themselves and their families. It’s all about trying to demystify outbreaks.”

In addition to the map itself, the HealthMap team has a number of interactive tools that individuals can both use and contribute to. Dr. Brownstein hopes these resources will enable the public to care more about disease outbreaks that may be happening around them—it’s a way to put the “public” back in “public health,” he says.

“We have an app called Outbreaks Near Me that allows people to know about what disease outbreaks are happening in their neighborhood,” Dr. Brownstein says. “Flu Near You is an app that people use to self-report symptoms; Vaccine Finder is a tool that allows people to know what vaccines are available to them and their community.”

In addition to developing its own apps, the HealthMap team has partnered with existing tech firms like Uber to spread the word about public health.

“We worked closely with Uber last year and actually put nurses in Uber cars and delivered vaccines to people,” Dr. Brownstein says. “The closest vaccine location might still be only a block away for people, but people are still hesitant to get it done.”…(More)”