Politics and the New Machine


Jill Lepore in the New Yorker on “What the turn from polls to data science means for democracy”: “…The modern public-opinion poll has been around since the Great Depression, when the response rate—the number of people who take a survey as a percentage of those who were asked—was more than ninety. The participation rate—the number of people who take a survey as a percentage of the population—is far lower. Election pollsters sample only a minuscule portion of the electorate, not uncommonly something on the order of a couple of thousand people out of the more than two hundred million Americans who are eligible to vote. The promise of this work is that the sample is exquisitely representative. But the lower the response rate the harder and more expensive it becomes to realize that promise, which requires both calling many more people and trying to correct for “non-response bias” by giving greater weight to the answers of people from demographic groups that are less likely to respond. Pollster.com’s Mark Blumenthal has recalled how, in the nineteen-eighties, when the response rate at the firm where he was working had fallen to about sixty per cent, people in his office said, “What will happen when it’s only twenty? We won’t be able to be in business!” A typical response rate is now in the single digits.
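The correction for non-response bias described above can be illustrated with a small Python sketch. It is not any pollster's actual method: the two groups, their population shares, and the responses are invented, and the weighting shown is a simple post-stratification in which each group's answers are scaled so that its weighted share of the sample matches its share of the population.

```python
from collections import Counter


def response_rate(completed: int, asked: int) -> float:
    """People who took the survey as a fraction of those who were asked."""
    return completed / asked


def poststratification_weights(sample_groups, population_shares):
    """Weight per group = population share / share of the sample."""
    counts = Counter(sample_groups)
    n = len(sample_groups)
    return {g: population_shares[g] / (counts[g] / n) for g in counts}


def weighted_support(responses, population_shares):
    """responses is a list of (group, supports) pairs; supports is a bool."""
    weights = poststratification_weights(
        [group for group, _ in responses], population_shares
    )
    total = sum(weights[group] for group, _ in responses)
    in_favor = sum(weights[group] for group, supports in responses if supports)
    return in_favor / total


# Hypothetical data: "young" voters are 40% of the population
# but only 20% of the people who answered.
responses = (
    [("young", True)] * 15 + [("young", False)] * 5
    + [("old", True)] * 32 + [("old", False)] * 48
)
population_shares = {"young": 0.4, "old": 0.6}

raw = sum(supports for _, supports in responses) / len(responses)
adjusted = weighted_support(responses, population_shares)
print(f"unweighted support: {raw:.2f}, weighted support: {adjusted:.2f}")
```

With these made-up numbers the under-represented group counts double, which moves estimated support from 47 to 54 per cent. The fewer respondents a group has, the more each of their answers is magnified, which is one reason low response rates make the promise of representativeness harder to keep.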

Meanwhile, polls are wielding greater influence over American elections than ever….

Still, data science can’t solve the biggest problem with polling, because that problem is neither methodological nor technological. It’s political. Pollsters rose to prominence by claiming that measuring public opinion is good for democracy. But what if it’s bad?

A “poll” used to mean the top of your head. Ophelia says of Polonius, “His beard as white as snow: All flaxen was his poll.” When voting involved assembling (all in favor of Smith stand here, all in favor of Jones over there), counting votes required counting heads; that is, counting polls. Eventually, a “poll” came to mean the count itself. By the nineteenth century, to vote was to go “to the polls,” where, more and more, voting was done on paper. Ballots were often printed in newspapers: you’d cut one out and bring it with you. With the turn to the secret ballot, beginning in the eighteen-eighties, the government began supplying the ballots, but newspapers kept printing them; they’d use them to conduct their own polls, called “straw polls.” Before the election, you’d cut out your ballot and mail it to the newspaper, which would make a prediction. Political parties conducted straw polls, too. That’s one of the ways the political machine worked….

Ever since Gallup, two things have been called polls: surveys of opinions and forecasts of election results. (Plenty of other surveys, of course, don’t measure opinions but instead concern status and behavior: Do you own a house? Have you seen a doctor in the past month?) It’s not a bad idea to reserve the term “polls” for the kind meant to produce election forecasts. When Gallup started out, he was skeptical about using a survey to forecast an election: “Such a test is by no means perfect, because a preelection survey must not only measure public opinion in respect to candidates but must also predict just what groups of people will actually take the trouble to cast their ballots.” Also, he didn’t think that predicting elections constituted a public good: “While such forecasts provide an interesting and legitimate activity, they probably serve no great social purpose.” Then why do it? Gallup conducted polls only to prove the accuracy of his surveys, there being no other way to demonstrate it. The polls themselves, he thought, were pointless…

If public-opinion polling is the child of a strained marriage between the press and the academy, data science is the child of a rocky marriage between the academy and Silicon Valley. The term “data science” was coined in 1960, one year after the Democratic National Committee hired Simulmatics Corporation, a company founded by Ithiel de Sola Pool, a political scientist from M.I.T., to provide strategic analysis in advance of the upcoming Presidential election. Pool and his team collected punch cards from pollsters who had archived more than sixty polls from the elections of 1952, 1954, 1956, 1958, and 1960, representing more than a hundred thousand interviews, and fed them into a UNIVAC. They then sorted voters into four hundred and eighty possible types (for example, “Eastern, metropolitan, lower-income, white, Catholic, female Democrat”) and sorted issues into fifty-two clusters (for example, foreign aid). Simulmatics’ first task, completed just before the Democratic National Convention, was a study of “the Negro vote in the North.” Its report, which is thought to have influenced the civil-rights paragraphs added to the Party’s platform, concluded that between 1954 and 1956 “a small but significant shift to the Republicans occurred among Northern Negroes, which cost the Democrats about 1 per cent of the total votes in 8 key states.” After the nominating convention, the D.N.C. commissioned Simulmatics to prepare three more reports, including one that involved running simulations about different ways in which Kennedy might discuss his Catholicism….

Data science may well turn out to be as flawed as public-opinion polling. But a stage in the development of any new tool is to imagine that you’ve perfected it, in order to ponder its consequences. I asked Hilton to suppose that there existed a flawless tool for measuring public opinion, accurately and instantly, a tool available to voters and politicians alike. Imagine that you’re a member of Congress, I said, and you’re about to head into the House to vote on an act—let’s call it the Smeadwell-Nutley Act. As you do, you use an app called iThePublic to learn the opinions of your constituents. You oppose Smeadwell-Nutley; your constituents are seventy-nine per cent in favor of it. Your constituents will instantly know how you’ve voted, and many have set up an account with Crowdpac to make automatic campaign donations. If you vote against the proposed legislation, your constituents will stop giving money to your reëlection campaign. If, contrary to your convictions but in line with your iThePublic, you vote for Smeadwell-Nutley, would that be democracy? …(More)”

 

Using Crowdsourcing to Track the Next Viral Disease Outbreak


The Takeaway: “Last year’s Ebola outbreak in West Africa killed more than 11,000 people. The pandemic may be diminished, but public health officials think that another major outbreak of infectious disease is fast approaching, and they’re busy preparing for it.

Boston public radio station WGBH recently partnered with The GroundTruth Project and NOVA Next on a series called “Next Outbreak.” As part of the series, they reported on an innovative global online monitoring system called HealthMap, which uses the power of the internet and crowdsourcing to detect and track emerging infectious diseases, and also more common ailments like the flu.

Researchers at Boston Children’s Hospital are the ones behind HealthMap (see below), and they use it to tap into tens of thousands of sources of online data, including social media, news reports, and blogs to curate information about outbreaks. Dr. John Brownstein, chief innovation officer at Boston Children’s Hospital and co-founder of HealthMap, says that smarter data collection can help to quickly detect and track emerging infectious diseases, fatal or not.

“Traditional public health is really slowed down by the communication process: People get sick, they’re seen by healthcare providers, they get laboratory confirmed, information flows up the channels to state and local health [agencies], national governments, and then to places like the WHO,” says Dr. Brownstein. “Each one of those stages can take days, weeks, or even months, and that’s the problem if you’re thinking about a virus that can spread around the world in a matter of days.”

The HealthMap team looks at a variety of communication channels to undo the existing hierarchy of health information.

“We make everyone a stakeholder when it comes to data about outbreaks, including consumers,” says Dr. Brownstein. “There are a suite of different tools that public health officials have at their disposal. What we’re trying to do is think about how to communicate and empower individuals to really understand what the risks are, what the true information is about a disease event, and what they can do to protect themselves and their families. It’s all about trying to demystify outbreaks.”

In addition to the map itself, the HealthMap team has a number of interactive tools that individuals can both use and contribute to. Dr. Brownstein hopes these resources will enable the public to care more about disease outbreaks that may be happening around them—it’s a way to put the “public” back in “public health,” he says.

“We have an app called Outbreaks Near Me that allows people to know about what disease outbreaks are happening in their neighborhood,” Dr. Brownstein says. “Flu Near You is an app that people use to self-report on symptoms; Vaccine Finder is a tool that allows people to know what vaccines are available to them and their community.”

In addition to developing its own apps, the HealthMap team has partnered with existing tech firms like Uber to spread the word about public health.

“We worked closely with Uber last year and actually put nurses in Uber cars and delivered vaccines to people,” Dr. Brownstein says. “The closest vaccine location might still be only a block away for people, but people are still hesitant to get it done.”…(More)”

How smartphones are solving one of China’s biggest mysteries


Ana Swanson at the Washington Post: “For decades, China has been engaged in a building boom of a scale that is hard to wrap your mind around. In the last three decades, 260 million people have moved from the countryside to Chinese cities — equivalent to around 80 percent of the population of the U.S. To make room for all of those people, the size of China’s built-up urban areas nearly quintupled between 1984 and 2010.

Much of that development has benefited people’s lives, but some has not. In a breathless rush to boost growth and development, some urban areas have built vast, unused real estate projects — China’s infamous “ghost cities.” These eerie, shining developments are complete except for one thing: people to live in them.

China’s ghost cities have sparked a lot of debate over the last few years. Some argue that the developments are evidence of the waste in top-down planning, or the result of too much cheap funding for businesses. Some blame the lack of other good places for average people to invest their money, or the desire of local officials to make a quick buck — land sales generate a lot of revenue for China’s local governments.

Others say the idea of ghost cities has been overblown. They espouse a “build it and they will come” philosophy, pointing out that, with time, some ghost cities fill up and turn into vibrant communities.

It’s been hard to evaluate these claims, since most of the research on ghost cities has been anecdotal. Even the most rigorous research methods leave a lot to be desired — for example, investment research firms sending poor junior employees out to remote locations to count how many lights are turned on in buildings at night.

Now new research from Baidu, one of China’s biggest technology companies, provides one of the first systematic looks at Chinese ghost cities. Researchers from Baidu’s Big Data Lab and Peking University in Beijing used the kind of location data gathered by mobile phones and GPS receivers to track how people moved in and out of suspected ghost cities, in real time and on a national scale, over a period of six months. You can see the interactive project here.

Google has been blocked in China for years, and Baidu dominates the market in terms of search, mobile maps and other offerings. That gave the researchers a huge database to work with — 770 million users, a hefty chunk of China’s 1.36 billion people.

To identify potential ghost cities, the researchers created an algorithm that identifies urban areas with a relatively sparse population. They define a ghost city as an urban region with a population of fewer than 5,000 people per square kilometer – about half the density recommended by the Chinese Ministry of Housing and Urban-Rural Development….(More)”
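The density rule quoted above can be written as a short Python sketch. This is only an illustration of the threshold, not the researchers' algorithm, which the excerpt does not describe: the district names, populations, and areas are invented, and only the cutoff of 5,000 people per square kilometer comes from the article.

```python
# Cutoff from the article: fewer than 5,000 people per square kilometer.
GHOST_CITY_DENSITY_THRESHOLD = 5_000


def population_density(population: float, area_km2: float) -> float:
    """People per square kilometer."""
    if area_km2 <= 0:
        raise ValueError("area_km2 must be positive")
    return population / area_km2


def is_ghost_city_candidate(
    population: float,
    area_km2: float,
    threshold: float = GHOST_CITY_DENSITY_THRESHOLD,
) -> bool:
    """True when an urban area falls below the density threshold."""
    return population_density(population, area_km2) < threshold


# Hypothetical urban areas: name -> (observed population, built-up area in km²).
areas = {
    "District A": (120_000, 40.0),  # 3,000 people per km²
    "District B": (450_000, 30.0),  # 15,000 people per km²
}

candidates = [
    name
    for name, (population, area_km2) in areas.items()
    if is_ghost_city_candidate(population, area_km2)
]
print(candidates)
```

In the study the population figure would come from location data rather than a census count, so a real implementation would also have to decide how to turn device observations into a population estimate; that step is left out here.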

Open government: a new paradigm in social change?


Rosie Williams: In a recent speech to the Australia and New Zealand School of Government (ANZSOG) annual conference, technology journalist and academic Suelette Dreyfus explained the growing ‘information asymmetry’ that characterises the current-day relationship between government and citizenry.

According to Dreyfus:

‘Big Data makes government very powerful in its relationship with the citizen. This is even more so with the rise of intelligent systems, software that increasingly trawls, matches and analyses that Big Data. And it is moving toward making more decisions once made by human beings.’

The role of technology in the delivery of government services gives much food for thought in terms of both its implications for potential good and the potential dangers it may pose. The concept of open government is an important one for the future of policy and democracy in Australia. Open government has at its core a recognition that the world has changed, that the ways people engage and who they engage with have transformed in ways that governments around the world must respond to in both technological and policy terms.

As described in the ANZSOG speech, the change within government in how it uses technology is well underway; however, in many regards we are at the very beginning of understanding and implementing the potential of data and technology in providing solutions to many of our shared problems. Australia’s pending membership of the Open Government Partnership is integral to how Australia responds to these challenges. Membership of the multilateral partnership requires the Australian government to create a National Action Plan based on consultation and demonstrate our credentials in the areas of Fiscal Transparency, Access to Information, Income and Asset Disclosure, and Citizen Engagement.

What are the implications of the National Action Plan for policy consultation, formulation, implementation and evaluation? In relative terms, Australia’s history with open government is fairly recent. Policies on open data have seen the roll out of data.gov.au – a repository of data published by government agencies and made available for re-use in efforts such as the author’s own financial transparency site OpenAus.

In this way citizen activity and government come together for the purposes of achieving open government. These efforts express a new paradigm in government and activism where the responsibility for solving the problems of democracy is shared between government and the people as opposed to the government ‘solving’ the problems of a passive, receptive citizenry.

As the famous whistle-blowers have shown, citizens are no longer passive, but this new capability also requires a consciousness of the responsibilities and accountability that go along with the powers newly developed by citizen activists through technological change.

The opening of data and communication channels in the formulation of public policy provides a way forward to create both a better informed citizenry and also better informed policy evaluation. When new standards of transparency are applied to wicked problems what shortcomings does this highlight?

This question was tested with my recent request for a basic fact missing from relevant government research and reviews but key to social issues of homelessness and domestic violence….(More)”

New traffic app and disaster prevention technology road tested


Phys.org: “A new smartphone traffic app tested by citizens in Dublin, Ireland allows users to give feedback on traffic incidents, enabling traffic management centres to respond quicker when collisions and other incidents happen around the city. The ‘CrowdAlert’ app, which is now available for download, is one of the key components utilised in the EU-funded INSIGHT project and a good example of how smartphones and social networks can be harnessed to improve public services and safety.

‘We are witnessing an explosion in the quantity, quality, and variety of available information, fuelled in large part by advances in sensor networking, the availability of low-cost sensor-enabled devices and by the widespread adoption of powerful smart-phones,’ explains coordinator Professor Dimitrios Gunopulos from the National and Kapodistrian University of Athens. ‘These revolutionary technologies are driving the development and adoption of applications where mobile devices are used for continuous data sensing and analysis.’

The project also developed a novel citywide real-time traffic monitoring tool, the ‘INSIGHT System’, which was tested in real conditions in the Dublin City control room, along with nationwide disaster monitoring technologies. The INSIGHT system was shown to provide early warnings to experts at situation centres, enabling them to monitor situations in real-time, including disasters with potentially nation-wide impacts such as severe weather conditions, floods and subsequent knock-on events such as fires and power outages.

The project’s results will be of interest to public services, which have until now lacked the necessary infrastructure for handling and integrating miscellaneous data streams, including data from static and mobile sensors as well as information coming from social network sources, in real-time. Providing cities with the ability to manage emergency situations with enhanced capabilities will also open up new markets for network technologies….(More)”

Teaching Open Data for Social Movements: a Research Strategy


Alan Freihof Tygel and Maria Luiza Machado Campos at the Journal of Community Informatics: “Since the year 2009, the release of public government data in open formats has been configured as one of the main actions taken by national states in order to respond to demands for transparency and participation by the civil society. The United States and the United Kingdom were pioneers, and today over 46 countries have their own Open Government Data Portal, many of them fostered by the Open Government Partnership (OGP), an international agreement aimed at stimulating transparency.

The premise of these open data portals is that, by making data publicly available in re-usable formats, society would take care of building applications and services, and gain value from this data (Huijboom & Broek, 2011). According to the same authors, the discourse around open data policies also includes increasing democratic control and participation and strengthening law enforcement.

Several recent works argue that the impact of open data policies, especially the release of open data portals, is still difficult to assess (Davies & Bawa, 2012; Huijboom & Broek, 2011; Zuiderwijk, Janssen, Choenni, Meijer, & Alibaks, 2012). One important consideration is that “The gap between the promise and reality of OGD [Open Government Data] re-use cannot be addressed by technological solutions alone” (Davies, 2012). Therefore, sociotechnical approaches (Mumford, 1987) are mandatory.

The targeted users of open government data lie over a wide range that includes journalists, non-governmental organizations (NGO), civil society organizations (CSO), enterprises, researchers and ordinary citizens who want to audit governments’ actions. Among them, the focus of our research is on social (or grassroots) movements. These are groups of organized citizens at local, national or international level who drive some political action, normally placing themselves in opposition to the established power relations and claiming rights for oppressed groups.

A literature definition gives a social movement as “collective social actions with a socio-political and cultural approach, which enable distinct forms of organizing the population and expressing their demands” (Gohn, 2011).

Social movements have been using data in their actions repertory with several motivations (as can be seen in Table 1 and Listing 1). From our experience, an overview of several cases where social movements use open data reveals a better understanding of reality and a more solid basis for their claims as motivations. Additionally, in some cases data produced by the social movements was used to build a counter-hegemonic discourse based on data. An interesting example is the Citizen Public Debt Audit Movement which takes place in Brazil. This movement, which is part of an international network, claims that “significant amounts registered as public debt do not correspond to money collected through loans to the country” (Fattorelli, 2011), and thus origins of this debt should be proven. According to the movement, in 2014, 45% of Brazil’s federal spending was paid to debt services.

Recently, a number of works tried to develop comparison schemes between open data strategies (Atz, Heath, & Fawcet, 2015; Caplan et al., 2014; Ubaldi, 2013; Zuiderwijk & Janssen, 2014). Huijboom & Broek (2011) listed four categories of instruments applied by the countries to implement their open data policies:

  • voluntary approaches, such as general recommendations,
  • economic instruments,
  • legislation and control, and
  • education and training.

One of the conclusions is that the latter was used to a lesser extent than the others.

Social movements, in general, are composed of people with little experience of informatics, either because of a lack of opportunities or of interest. Although it is recognized that using data is important for a social movement’s objectives, the training aspect still hinders a wider use of it.

In order to address this issue, an open data course for social movements was designed. Besides building a strategy on open data education, the course also aims to be a research strategy to understand three aspects:

  • the motivations of social movements for using open data;
  • the impediments that block a wider and better use; and
  • possible actions to be taken to enhance the use of open data by social movements….(More)”

Open Data Impact: How Zillow Uses Open Data to Level the Playing Field for Consumers


Daniel Castro at US Dept of Commerce: “In the mid-2000s, several online data firms began to integrate real estate data with national maps to make the data more accessible for consumers. Of these firms, Zillow was the most effective at attracting users by rapidly growing its database, thanks in large part to open data. Zillow’s success is based, in part, on its ability to create tailored products that blend multiple data sources to answer customers’ questions about the housing market. Zillow’s platform lets customers easily compare neighborhoods and conduct thorough real estate searches through a single portal. This ensures a level playing field of information for home buyers, sellers and real estate professionals.

The system empowers consumers by providing them all the information needed to make well-informed decisions about buying or renting a home. For example, information from the Census Bureau’s American Community Survey helps answer people’s questions about what kind of housing they can afford in any U.S. market. Zillow also creates market analysis reports, which inform consumers about whether it is a good time to buy or sell, how an individual property’s value is likely to fluctuate over time, or whether it is better to rent or to own in certain markets. These reports can even show which neighborhoods are the top buyers’ or sellers’ markets in a given city. Zillow uses a wide range of government data, not just from the Census Bureau, to produce economic analyses and products it then freely provides to the public.

In addition to creating reports from synthesized data, Zillow has made a conscious effort to make raw data more usable. It has combined rental, mortgage, and other data into granular metrics on individual neighborhoods and zip codes. For example, the “Breakeven Horizon” is a metric that gives users a snapshot of how long they would need to own a home in a given area for the accrued cost of buying to be less than renting. Zillow creates this by comparing the up-front costs of buying a home versus the amount of interest that money could generate, and then analyzing how median rents and home values are likely to fluctuate, affecting both values. By creating metrics, rankings, and indices, Zillow makes raw or difficult-to-quantify data readily accessible to the public.
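A rough sense of how a metric like the “Breakeven Horizon” works can be had from the Python sketch below. It is not Zillow's formula, which the excerpt only outlines. It is a simplified model under stated assumptions: a fixed yearly cost of owning, constant growth rates for rents and home values, and forgone interest on the cash paid up front. Every function name, parameter, and number is hypothetical.

```python
from typing import Optional


def breakeven_year(
    home_price: float,
    annual_rent: float,
    down_payment: float,
    closing_costs: float,
    annual_owner_costs: float,
    appreciation_rate: float,
    rent_growth_rate: float,
    investment_return: float,
    max_years: int = 30,
) -> Optional[int]:
    """First year in which the net cost of buying drops below cumulative rent.

    Returns None if buying never becomes cheaper within max_years.
    """
    upfront_cash = down_payment + closing_costs
    cumulative_rent = 0.0
    cumulative_owner_costs = 0.0
    rent = annual_rent
    for year in range(1, max_years + 1):
        # Renting: pay this year's rent, then let rent grow.
        cumulative_rent += rent
        rent *= 1 + rent_growth_rate

        # Buying: running costs plus interest the up-front cash could have earned,
        # offset by the gain in the home's value.
        cumulative_owner_costs += annual_owner_costs
        forgone_interest = upfront_cash * ((1 + investment_return) ** year - 1)
        value_gain = home_price * ((1 + appreciation_rate) ** year - 1)
        net_cost_of_buying = (
            closing_costs + cumulative_owner_costs + forgone_interest - value_gain
        )
        if net_cost_of_buying < cumulative_rent:
            return year
    return None


# Made-up inputs for a single hypothetical market.
horizon = breakeven_year(
    home_price=300_000,
    annual_rent=18_000,
    down_payment=60_000,
    closing_costs=9_000,
    annual_owner_costs=22_000,
    appreciation_rate=0.03,
    rent_growth_rate=0.03,
    investment_return=0.04,
)
print(horizon)  # 4 with these made-up inputs
```

The result is sensitive to the assumed growth rates, which is why the excerpt stresses the analysis of how median rents and home values are likely to fluctuate: a small change in expected appreciation can move the horizon by years.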

While real estate agents can be instrumental in the process of finding a new home or selling an old one, Zillow and other platforms add value by connecting consumers to a wealth of data, some of which may have been accessible before but was too cumbersome for the average user. Not only does this allow buyers and sellers to make more informed decisions about real estate, but it also helps to balance the share of knowledge. Buyers have more information than ever before on available properties, their valuations for specific neighborhoods, and how those valuations have changed in relation to larger markets. Sellers can use the same types of information to evaluate offers they receive, or decide whether to list their home in the first place. The success that Zillow and other companies like it have achieved in the real estate market is a testament to how effective they have been in harnessing data to address consumers’ needs and it is a marvelous example of the power of open data….(More)”

Digital Continuity 2020


National Archives of Australia: “The Digital Continuity 2020 Policy is a whole-of-government approach to digital information governance. It complements the Australian Government’s digital transformation agenda and underpins the digital economy. The policy aims to support efficiency, innovation, interoperability, information re-use and accountability by integrating robust digital information management into all government business processes.

The policy is based on three principles, and for each of them identifies what success looks like and the targets that agencies should reach by 2020. All Digital Continuity 2020 targets are expected to be achieved as part of normal business reviews and ongoing technology maintenance and investment cycles.

The principles

Principle 1 – Information is valued

Focus on governance and people

Agencies will manage their information as an asset, ensuring that it is created, stored and managed for as long as it is required, taking into account business requirements and other needs and risks.
Case study – Parliamentary Budget Office

Principle 2 – Information is managed digitally

Focus on digital assets and processes

Agencies will transition to entirely digital work processes, meaning business processes including authorisations and approvals are completed digitally, and that information is created and managed in digital format.
Case study – Federal Court of Australia

Principle 3 – Information, systems and processes are interoperable

Focus on metadata and standards

Agencies will have interoperable information, systems and processes to improve information quality and enable information to be found, managed, shared and re-used easily and efficiently.
Case study – Opening government data with the NationalMap

View the Digital Continuity 2020 Policy. (More)

Advancing Open and Citizen-Centered Government


The White House: “Today, the United States released our third Open Government National Action Plan, announcing more than 40 new or expanded initiatives to advance the President’s commitment to an open and citizen-centered government….In the third Open Government National Action Plan, the Administration both broadens and deepens efforts to help government become more open and more citizen-centered. The plan includes new and impactful steps the Administration is taking to openly and collaboratively deliver government services and to support open government efforts across the country. These efforts prioritize a citizen-centric approach to government, including improved access to publicly available data to provide everyday Americans with the knowledge and tools necessary to make informed decisions.

One example is the College Scorecard, which shares data through application programming interfaces (APIs) to help students and families make informed choices about education. Open APIs help create an ecosystem around government data in which civil society can provide useful visual tools, making this data more accessible, and commercial developers can enable even more value to be extracted to further empower students and their families. In addition to these newer approaches, the plan also highlights significant longstanding open government priorities such as access to information, fiscal transparency, and records management, and continues to push for greater progress in that work.

The plan also focuses on supporting implementation of the landmark 2030 Agenda for Sustainable Development, which sets out a vision and priorities for global development over the next 15 years and was adopted last month by 193 world leaders including President Obama. The plan includes commitments to harness open government and progress toward the Sustainable Development Goals (SDGs) both in the United States and globally, including in the areas of education, health, food security, climate resilience, science and innovation, justice and law enforcement. It also includes a commitment to take stock of existing U.S. government data that relates to the 17 SDGs, and to create and use data to support progress toward the SDGs.

Some examples of open government efforts newly included in the plan:

  • Promoting employment by unlocking workforce data, including training, skill, job, and wage listings.
  • Enhancing transparency and participation by expanding available Federal services to the Open311 platform currently available to cities, giving the public a seamless way to report problems and request assistance.
  • Releasing public information from the electronically filed tax forms of nonprofit and charitable organizations (990 forms) as open, machine-readable data.
  • Expanding access to justice through the White House Legal Aid Interagency Roundtable.
  • Promoting open and accountable implementation of the Sustainable Development Goals….(More)”

Can Mobile Phone Surveys Identify People’s Development Priorities?


Ben Leo and Robert Morello at the Center for Global Development: “Mobile phone surveys are fast, flexible, and cheap. But, can they be used to engage citizens on how billions of dollars in donor and government resources are spent? Over the last decade, donor governments and multilateral organizations have repeatedly committed to support local priorities and programs. Yet, how are they supposed to identify these priorities on a timely, regular basis? Consistent discussions with the local government are clearly essential, but so are feeding ordinary people’s views into those discussions. However, traditional tools, such as household surveys or consultative roundtables, present a range of challenges for high-frequency citizen engagement. That’s where mobile phone surveys could come in, enabled by the exponential rise in mobile coverage throughout the developing world.

Despite this potential, there have been only a handful of studies into whether mobile surveys are a reliable and representative tool across a broad range of developing-country contexts. Moreover, there have been almost none that specifically look at collecting information about people’s development priorities. Along with Tiago Peixoto, Steve Davenport, and Jonathan Mellon, who focus on promoting citizen engagement and open government practices at the World Bank, we sought to address this policy research gap. Through a study focused on four low-income countries (Afghanistan, Ethiopia, Mozambique, and Zimbabwe), we rigorously tested the feasibility of interactive voice recognition (IVR) surveys for gauging citizens’ development priorities.

Specifically, we wanted to know whether respondents’ answers are sensitive to a range of different factors, such as (i) the specified executing actor (national government or external partners); (ii) time horizons; or (iii) question formats. In other words, can we be sufficiently confident that surveys about people’s priorities can be applied more generally to a range of development actors and across a range of country contexts?

Several of these potential sensitivity concerns were raised in response to an earlier CGD working paper, which found that US foreign aid is only modestly aligned with Africans’ and Latin Americans’ most pressing concerns. This analysis relied upon Afrobarometer and Latinobarometro survey data (see explanatory note below). For instance, some argued that people’s priorities for their own government might be far less relevant for donor organizations. Put differently, the World Bank or USAID shouldn’t prioritize job creation in Nigeria simply because ordinary Nigerians cite it as a pressing government priority. Our hypothesis was that development priorities would likely transcend all development actors, and possibly different timeframes and question formats as well. But, we first needed to test these assumptions.

So, what did we find? We’ve included some of the key highlights below. For a more detailed description of the study and the underlying analysis, please see our new working paper. Along with our World Bank colleagues, we also published an accompanying paper that considers a range of survey method issues, including survey representativeness….(More)”