Paper by Bendik Bygstad and Francis D’Silva: “A national administration is dependent on its archives and registers, for many purposes, such as tax collection, enforcement of law, economic governance, and welfare services. Today, these services are based on large digital infrastructures, which grow organically in volume and scope. Building on a critical realist approach we investigate a particularly successful infrastructure in Norway called Altinn, and ask: what are the evolutionary mechanisms for a successful “government as a platform”? We frame our study with two perspectives; a historical institutional perspective that traces the roots of Altinn back to the Middle Ages, and an architectural perspective that allows for a more detailed analysis of the consequences of digitalization and the role of platforms. We offer two insights from our study: we identify three evolutionary mechanisms of national registers, and we discuss a future scenario of government platforms as “digital commons”…(More)”
Robots Will Make Leeds the First Self-Repairing City
Emiko Jozuka at Motherboard: “Researchers in Britain want to make the first “self-repairing” city by 2035. How will they do this? By creating autonomous repair robots that patrol the streets and drainage systems, making sure your car doesn’t dip into a pothole, and that you don’t experience any gas leaks.
“The idea is to create a city that behaves almost like a living organism,” said Raul Fuentes, a researcher at the School of Civil Engineering at Leeds University, who is working on the project. “The robots will act like white cells that are able to identify bacteria or viruses and attack them. It’s kind of like an immune system.”
The £4.2 million ($6.4 million) national infrastructure project is a collaboration with Leeds City Council and the UK Collaboration for Research in Infrastructures and Cities (UKCRIC). The aim is to create, by 2035, a fleet of robot repair workers that live in the city of Leeds, spot problems, and sort them out before they become bigger ones, said Fuentes. The project is set to launch officially in January 2016, he added.
For their five-year project—which has a vision that extends until 2050—the researchers will develop robot designs and technologies that focus on three main areas. The first is to create drones that can perch on high structures and repair things like street lamps; the second is to develop drones that can autonomously spot when a pothole is about to form, zero in on it, and patch it up before it worsens; and the third is to develop robots that will live in utility pipes so they can inspect, repair, and report back to humans when they spot an issue.
“The robots will be living permanently in the city, and they’ll be able to identify issues before they become real problems,” explained Fuentes. The researchers are working on making the robots autonomous, and want them to live in swarms or packs in which they can communicate with one another about how best to get the repair job done….(More)”
New flu tracker uses Google search data better than Google
Beth Mole at ArsTechnica: “With big data comes big noise. Google learned this lesson the hard way with its now kaput Google Flu Trends. The online tracker, which used Internet search data to predict real-life flu outbreaks, emerged amid fanfare in 2008. Then it met a quiet death this August after repeatedly coughing up bad estimates.
But big Internet data isn’t out of the disease tracking scene yet.
With hubris firmly in check, a team of Harvard researchers has come up with a way to tame the unruly data, combine it with other data sets, and continually calibrate it to track flu outbreaks with less error. Their new model, published Monday in the Proceedings of the National Academy of Sciences, outperforms Google Flu Trends and other models with at least double the accuracy. If the model holds up in coming flu seasons, it could restore some optimism about using big data to monitor disease and herald a wave of more accurate second-generation models.
Big data has a lot of potential, Samuel Kou, a statistics professor at Harvard University and coauthor on the new study, told Ars. It’s just a question of using the right analytics, he said.
Kou and his colleagues built on Google’s flu tracking model for their new version, called ARGO (AutoRegression with GOogle search data). Google Flu Trends basically relied on trends in Internet search terms, such as headache and chills, to estimate the number of flu cases. Those search terms were correlated with flu outbreak data collected by the Centers for Disease Control and Prevention. The CDC’s data relies on clinical reports from around the country. But compiling and analyzing that data can be slow, leading to a lag time of one to three weeks. The Google data, on the other hand, offered near real-time tracking for health experts to manage and prepare for outbreaks.
At first Google’s tracker appeared to be pretty good, matching the CDC’s late-breaking data somewhat closely. But two notable stumbles led to its ultimate downfall: an underestimate of the 2009 H1N1 swine flu outbreak and an alarming overestimate (almost double the real numbers) of the 2012-2013 flu season’s cases….For ARGO, he and colleagues took the trend data and then designed a model that could self-correct for changes in how people search. The model has a two-year sliding window in which it re-calibrates current search-term trends against the CDC’s historical flu data (the gold standard for flu data). They also made sure to exclude winter search terms, such as March Madness and the Oscars, so they didn’t get accidentally correlated with seasonal flu trends. Lastly, they incorporated data on the historical seasonality of flu.
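To make the approach concrete, here is a minimal, illustrative sketch of an ARGO-style rolling regression in Python. It is not the authors’ code: the two-year window follows the description above, but the data layout, the number of autoregressive lags, the log transform, and the use of ordinary least squares (scikit-learn’s LinearRegression) are simplifying assumptions; the published model also adds regularization and seasonality handling not shown here.

```python
# Rough sketch of an ARGO-style estimator: each week, refit a linear model on the
# most recent two years of data, using lagged CDC flu values (autoregressive terms)
# plus the current week's search-term frequencies. Illustrative only.
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

WINDOW = 104   # two-year sliding window, in weeks
AR_LAGS = 3    # how many past CDC observations to feed back in (an assumption)

def argo_style_estimates(cdc_ili: pd.Series, searches: pd.DataFrame) -> pd.Series:
    """cdc_ili: weekly CDC flu activity (%ILI); searches: weekly search-term
    frequencies with the same index, one column per flu-related term."""
    log_searches = np.log(searches + 1.0)            # damp spiky search volumes
    # Feature matrix: lagged CDC values plus the current week's search activity.
    features = pd.concat(
        [cdc_ili.shift(lag).rename(f"ili_lag{lag}") for lag in range(1, AR_LAGS + 1)]
        + [log_searches],
        axis=1,
    )
    estimates = {}
    for t in range(WINDOW + AR_LAGS, len(cdc_ili)):
        train = slice(t - WINDOW, t)                 # re-calibrate on a rolling window
        model = LinearRegression().fit(features.iloc[train], cdc_ili.iloc[train])
        estimates[cdc_ili.index[t]] = model.predict(features.iloc[[t]])[0]
    return pd.Series(estimates)
```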
The result was a model that significantly outperformed the Google Flu Trends estimates for the period from March 29, 2009, to July 11, 2015. ARGO also beat out other models, including one based on current and historical CDC data….(More)”
See also Proceedings of the National Academy of Sciences, 2015. DOI: 10.1073/pnas.1515373112
Politics and the New Machine
Jill Lepore in the New Yorker on “What the turn from polls to data science means for democracy”: “…The modern public-opinion poll has been around since the Great Depression, when the response rate—the number of people who take a survey as a percentage of those who were asked—was more than ninety. The participation rate—the number of people who take a survey as a percentage of the population—is far lower. Election pollsters sample only a minuscule portion of the electorate, not uncommonly something on the order of a couple of thousand people out of the more than two hundred million Americans who are eligible to vote. The promise of this work is that the sample is exquisitely representative. But the lower the response rate the harder and more expensive it becomes to realize that promise, which requires both calling many more people and trying to correct for “non-response bias” by giving greater weight to the answers of people from demographic groups that are less likely to respond. Pollster.com’s Mark Blumenthal has recalled how, in the nineteen-eighties, when the response rate at the firm where he was working had fallen to about sixty per cent, people in his office said, “What will happen when it’s only twenty? We won’t be able to be in business!” A typical response rate is now in the single digits.
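As a side note, the “non-response bias” correction Lepore mentions is essentially a weighting scheme: answers from groups that rarely respond count for more. A toy Python sketch, with made-up shares and responses (not from the article), shows the idea:

```python
# Toy post-stratification weighting: each respondent is weighted by how
# under-represented their demographic group is among respondents, relative
# to its share of the population. All numbers below are invented.
population_share = {"18-29": 0.21, "30-44": 0.25, "45-64": 0.34, "65+": 0.20}
respondent_share = {"18-29": 0.08, "30-44": 0.17, "45-64": 0.40, "65+": 0.35}

weights = {g: population_share[g] / respondent_share[g] for g in population_share}

# Hypothetical responses: (age group, supports the candidate)
responses = [("18-29", True), ("65+", False), ("45-64", True), ("30-44", False)]
weighted_support = (
    sum(weights[g] for g, favors in responses if favors)
    / sum(weights[g] for g, _ in responses)
)
print(f"Raw support: {sum(f for _, f in responses) / len(responses):.0%}, "
      f"weighted support: {weighted_support:.0%}")
```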
Meanwhile, polls are wielding greater influence over American elections than ever….
Still, data science can’t solve the biggest problem with polling, because that problem is neither methodological nor technological. It’s political. Pollsters rose to prominence by claiming that measuring public opinion is good for democracy. But what if it’s bad?
A “poll” used to mean the top of your head. Ophelia says of Polonius, “His beard as white as snow: All flaxen was his poll.” When voting involved assembling (all in favor of Smith stand here, all in favor of Jones over there), counting votes required counting heads; that is, counting polls. Eventually, a “poll” came to mean the count itself. By the nineteenth century, to vote was to go “to the polls,” where, more and more, voting was done on paper. Ballots were often printed in newspapers: you’d cut one out and bring it with you. With the turn to the secret ballot, beginning in the eighteen-eighties, the government began supplying the ballots, but newspapers kept printing them; they’d use them to conduct their own polls, called “straw polls.” Before the election, you’d cut out your ballot and mail it to the newspaper, which would make a prediction. Political parties conducted straw polls, too. That’s one of the ways the political machine worked….
Ever since Gallup, two things have been called polls: surveys of opinions and forecasts of election results. (Plenty of other surveys, of course, don’t measure opinions but instead concern status and behavior: Do you own a house? Have you seen a doctor in the past month?) It’s not a bad idea to reserve the term “polls” for the kind meant to produce election forecasts. When Gallup started out, he was skeptical about using a survey to forecast an election: “Such a test is by no means perfect, because a preelection survey must not only measure public opinion in respect to candidates but must also predict just what groups of people will actually take the trouble to cast their ballots.” Also, he didn’t think that predicting elections constituted a public good: “While such forecasts provide an interesting and legitimate activity, they probably serve no great social purpose.” Then why do it? Gallup conducted polls only to prove the accuracy of his surveys, there being no other way to demonstrate it. The polls themselves, he thought, were pointless…
If public-opinion polling is the child of a strained marriage between the press and the academy, data science is the child of a rocky marriage between the academy and Silicon Valley. The term “data science” was coined in 1960, one year after the Democratic National Committee hired Simulmatics Corporation, a company founded by Ithiel de Sola Pool, a political scientist from M.I.T., to provide strategic analysis in advance of the upcoming Presidential election. Pool and his team collected punch cards from pollsters who had archived more than sixty polls from the elections of 1952, 1954, 1956, 1958, and 1960, representing more than a hundred thousand interviews, and fed them into a UNIVAC. They then sorted voters into four hundred and eighty possible types (for example, “Eastern, metropolitan, lower-income, white, Catholic, female Democrat”) and sorted issues into fifty-two clusters (for example, foreign aid). Simulmatics’ first task, completed just before the Democratic National Convention, was a study of “the Negro vote in the North.” Its report, which is thought to have influenced the civil-rights paragraphs added to the Party’s platform, concluded that between 1954 and 1956 “a small but significant shift to the Republicans occurred among Northern Negroes, which cost the Democrats about 1 per cent of the total votes in 8 key states.” After the nominating convention, the D.N.C. commissioned Simulmatics to prepare three more reports, including one that involved running simulations about different ways in which Kennedy might discuss his Catholicism….
Data science may well turn out to be as flawed as public-opinion polling. But a stage in the development of any new tool is to imagine that you’ve perfected it, in order to ponder its consequences. I asked Hilton to suppose that there existed a flawless tool for measuring public opinion, accurately and instantly, a tool available to voters and politicians alike. Imagine that you’re a member of Congress, I said, and you’re about to head into the House to vote on an act—let’s call it the Smeadwell-Nutley Act. As you do, you use an app called iThePublic to learn the opinions of your constituents. You oppose Smeadwell-Nutley; your constituents are seventy-nine per cent in favor of it. Your constituents will instantly know how you’ve voted, and many have set up an account with Crowdpac to make automatic campaign donations. If you vote against the proposed legislation, your constituents will stop giving money to your reëlection campaign. If, contrary to your convictions but in line with your iThePublic, you vote for Smeadwell-Nutley, would that be democracy? …(More)”
Using Crowdsourcing to Track the Next Viral Disease Outbreak
The Takeaway: “Last year’s Ebola outbreak in West Africa killed more than 11,000 people. The epidemic may have diminished, but public health officials think that another major outbreak of infectious disease is fast approaching, and they’re busy preparing for it.
Boston public radio station WGBH recently partnered with The GroundTruth Project and NOVA Next on a series called “Next Outbreak.” As part of the series, they reported on an innovative global online monitoring system called HealthMap, which uses the power of the internet and crowdsourcing to detect and track emerging infectious diseases, and also more common ailments like the flu.
Researchers at Boston Children’s Hospital are the ones behind HealthMap (see below), and they use it to tap into tens of thousands of sources of online data, including social media, news reports, and blogs to curate information about outbreaks. Dr. John Brownstein, chief innovation officer at Boston Children’s Hospital and co-founder of HealthMap, says that smarter data collection can help to quickly detect and track emerging infectious diseases, fatal or not.
“Traditional public health is really slowed down by the communication process: People get sick, they’re seen by healthcare providers, they get laboratory confirmed, information flows up the channels to state and local health [agencies], national governments, and then to places like the WHO,” says Dr. Brownstein. “Each one of those stages can take days, weeks, or even months, and that’s the problem if you’re thinking about a virus that can spread around the world in a matter of days.”
The HealthMap team looks at a variety of communication channels to undo the existing hierarchy of health information.
“We make everyone a stakeholder when it comes to data about outbreaks, including consumers,” says Dr. Brownstein. “There are a suite of different tools that public health officials have at their disposal. What we’re trying to do is think about how to communicate and empower individuals to really understand what the risks are, what the true information is about a disease event, and what they can do to protect themselves and their families. It’s all about trying to demystify outbreaks.”
In addition to the map itself, the HealthMap team has a number of interactive tools that individuals can both use and contribute to. Dr. Brownstein hopes these resources will enable the public to care more about disease outbreaks that may be happening around them—it’s a way to put the “public” back in “public health,” he says.
“We have an app called Outbreaks Near Me that allows people to know about what disease outbreaks are happening in their neighborhood,” Dr. Brownstein says. “Flu Near You is an app that people use to self-report symptoms; Vaccine Finder is a tool that allows people to know what vaccines are available to them and their community.”
In addition to developing its own apps, the HealthMap team has partnered with existing tech firms like Uber to spread the word about public health.
“We worked closely with Uber last year and actually put nurses in Uber cars and delivered vaccines to people,” Dr. Brownstein says. “The closest vaccine location might be only a block away, but people are still hesitant to get it done.”…(More)”
How smartphones are solving one of China’s biggest mysteries
Ana Swanson at the Washington Post: “For decades, China has been engaged in a building boom of a scale that is hard to wrap your mind around. In the last three decades, 260 million people have moved from the countryside to Chinese cities — equivalent to around 80 percent of the population of the U.S. To make room for all of those people, the size of China’s built-up urban areas nearly quintupled between 1984 and 2010.
Much of that development has benefited people’s lives, but some has not. In a breathless rush to boost growth and development, some urban areas have built vast, unused real estate projects — China’s infamous “ghost cities.” These eerie, shining developments are complete except for one thing: people to live in them.
China’s ghost cities have sparked a lot of debate over the last few years. Some argue that the developments are evidence of the waste in top-down planning, or the result of too much cheap funding for businesses. Some blame the lack of other good places for average people to invest their money, or the desire of local officials to make a quick buck — land sales generate a lot of revenue for China’s local governments.
Others say the idea of ghost cities has been overblown. They espouse a “build it and they will come” philosophy, pointing out that, with time, some ghost cities fill up and turn into vibrant communities.
It’s been hard to evaluate these claims, since most of the research on ghost cities has been anecdotal. Even the most rigorous research methods leave a lot to be desired — for example, investment research firms sending poor junior employees out to remote locations to count how many lights are turned on in buildings at night.
Now, new research from Baidu, one of China’s biggest technology companies, provides one of the first systematic looks at Chinese ghost cities. Researchers from Baidu’s Big Data Lab and Peking University in Beijing used the kind of location data gathered by mobile phones and GPS receivers to track how people moved in and out of suspected ghost cities, in real time and on a national scale, over a period of six months. You can see the interactive project here.
Google has been blocked in China for years, and Baidu dominates the market in search, mobile maps and other offerings. That gave the researchers a huge database to work with — 770 million users, a hefty chunk of China’s 1.36 billion people.
To identify potential ghost cities, the researchers created an algorithm that identifies urban areas with relatively sparse populations. They define a ghost city as an urban region with a population density of fewer than 5,000 people per square kilometer – about half the density recommended by the Chinese Ministry of Housing and Urban-Rural Development….(More)”
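A minimal sketch of that density rule in Python might look like the following. The scaling from sampled Baidu users to estimated residents, the home-location inference, and the example numbers are our own assumptions for illustration, not the researchers’ published method.

```python
# Simplified sketch: estimate the resident population of a built-up urban region
# from sampled home locations, divide by its area, and flag the region as a
# candidate "ghost city" if density falls below 5,000 people per square kilometre.
DENSITY_THRESHOLD = 5_000            # residents per km², from the definition above
SAMPLING_SHARE = 770 / 1_360         # Baidu users as a rough share of the population

def is_candidate_ghost_city(sampled_home_points, area_km2):
    """sampled_home_points: inferred home locations of sampled users inside the
    region; area_km2: the region's built-up area in square kilometres."""
    estimated_residents = len(sampled_home_points) / SAMPLING_SHARE
    density = estimated_residents / area_km2
    return density < DENSITY_THRESHOLD, density

# Hypothetical example: 120,000 sampled users living in a 60 km² new district
flag, density = is_candidate_ghost_city([(0.0, 0.0)] * 120_000, area_km2=60)
print(f"density ≈ {density:,.0f} people/km², ghost-city candidate: {flag}")
```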
Open government: a new paradigm in social change?
Rosie Williams: In a recent speech to the Australia and New Zealand School of Government (ANZSOG) annual conference, technology journalist and academic Suelette Dreyfus explained the growing ‘information asymmetry’ that characterises the current-day relationship between government and citizenry.
According to Dreyfus:
‘Big Data makes government very powerful in its relationship with the citizen. This is even more so with the rise of intelligent systems, software that increasingly trawls, matches and analyses that Big Data. And it is moving toward making more decisions once made by human beings.’
The role of technology in the delivery of government services gives much food for thought, in terms of both its potential for good and the dangers it may pose. The concept of open government is an important one for the future of policy and democracy in Australia. At its core, open government recognises that the world has changed: the ways people engage, and who they engage with, have transformed in ways that governments around the world must respond to in both technological and policy terms.
As described in the ANZSOG speech, the change within government in how it uses technology is well underway; however, in many regards we are at the very beginning of understanding and implementing the potential of data and technology to provide solutions to many of our shared problems. Australia’s pending membership of the Open Government Partnership is integral to how Australia responds to these challenges. Membership of the multilateral partnership requires the Australian government to create a National Action Plan based on consultation and to demonstrate our credentials in the areas of Fiscal Transparency, Access to Information, Income and Asset Disclosure, and Citizen Engagement.
What are the implications of the National Action Plan for policy consultation, formulation, implementation and evaluation? In relative terms, Australia’s history with open government is fairly recent. Policies on open data have seen the roll-out of data.gov.au – a repository of data published by government agencies and made available for re-use in efforts such as the author’s own financial transparency site, OpenAus.
In this way citizen activity and government come together for the purposes of achieving open government. These efforts express a new paradigm in government and activism where the responsibility for solving the problems of democracy are shared between government and the people as opposed to the government ‘solving’ the problems of a passive, receptive citizenry.
As the famous whistle-blowers have shown, citizens are no longer passive but this new capability also requires a consciousness of the responsibilities and accountability that go along with the powers newly developed by citizen activists through technological change.
The opening of data and communication channels in the formulation of public policy provides a way forward to create both a better informed citizenry and also better informed policy evaluation. When new standards of transparency are applied to wicked problems what shortcomings does this highlight?
This question was tested with my recent request for a basic fact missing from relevant government research and reviews but key to social issues of homelessness and domestic violence….(More)”
New traffic app and disaster prevention technology road-tested
Phys.org: “A new smartphone traffic app tested by citizens in Dublin, Ireland, allows users to give feedback on traffic incidents, enabling traffic management centres to respond more quickly when collisions and other incidents happen around the city. The ‘CrowdAlert’ app, which is now available for download, is one of the key components utilised in the EU-funded INSIGHT project and a good example of how smartphones and social networks can be harnessed to improve public services and safety.
‘We are witnessing an explosion in the quantity, quality, and variety of available information, fuelled in large part by advances in sensor networking, the availability of low-cost sensor-enabled devices and by the widespread adoption of powerful smart-phones,’ explains project coordinator professor Dimitrios Gunopulos from the National and Kapodistrian University of Athens. ‘These revolutionary technologies are driving the development and adoption of applications where mobile devices are used for continuous data sensing and analysis.’
The project also developed a novel citywide real-time traffic monitoring tool, the ‘INSIGHT System’, which was tested in real conditions in the Dublin City control room, along with nationwide disaster monitoring technologies. The INSIGHT system was shown to provide early warnings to experts at situation centres, enabling them to monitor situations in real-time, including disasters with potentially nation-wide impacts such as severe weather conditions, floods and subsequent knock-on events such as fires and power outages.
The project’s results will be of interest to public services, which have until now lacked the necessary infrastructure for handling and integrating miscellaneous data streams, including data from static and mobile sensors as well as information coming from social network sources, in real-time. Providing cities with the ability to manage emergency situations with enhanced capabilities will also open up new markets for network technologies….(More)”
Teaching Open Data for Social Movements: a Research Strategy
Alan Freihof Tygel and Maria Luiza Machado Campo at the Journal of Community Informatics: “Since the year 2009, the release of public government data in open formats has been configured as one of the main actions taken by national states in order to respond to demands for transparency and participation by civil society. The United States and the United Kingdom were pioneers, and today over 46 countries have their own Open Government Data Portal, many of them fostered by the Open Government Partnership (OGP), an international agreement aimed at stimulating transparency.
The premise of these open data portals is that, by making data publicly available in re-usable formats, society would take care of building applications and services, and gain value from this data (Huijboom & Broek, 2011). According to the same authors, the discourse around open data policies also includes increasing democratic control and participation and strengthening law enforcement.
Several recent works argue that the impact of open data policies, especially the release of open data portals, is still difficult to assess (Davies & Bawa, 2012; Huijboom & Broek, 2011; Zuiderwijk, Janssen, Choenni, Meijer, & Alibaks, 2012). One important consideration is that “The gap between the promise and reality of OGD [Open Government Data] re-use cannot be addressed by technological solutions alone” (Davies, 2012). Therefore, sociotechnical approaches (Mumford, 1987) are mandatory.
The targeted users of open government data span a wide range that includes journalists, non-governmental organizations (NGOs), civil society organizations (CSOs), enterprises, researchers and ordinary citizens who want to audit governments’ actions. Among them, the focus of our research is on social (or grassroots) movements. These are groups of organized citizens at the local, national or international level who drive some political action, normally placing themselves in opposition to established power relations and claiming rights for oppressed groups.
A definition from the literature describes social movements as “collective social actions with a socio-political and cultural approach, which enable distinct forms of organizing the population and expressing their demands” (Gohn, 2011).
Social movements have been using data in their action repertoires with several motivations (as can be seen in Table 1 and Listing 1). From our experience, an overview of several cases where social movements use open data reveals two main motivations: a better understanding of reality and a more solid basis for their claims. Additionally, in some cases data produced by the social movements themselves was used to build a counter-hegemonic discourse grounded in data. An interesting example is the Citizen Public Debt Audit Movement in Brazil. This movement, which is part of an international network, claims that “significant amounts registered as public debt do not correspond to money collected through loans to the country” (Fattorelli, 2011), and thus the origins of this debt should be proven. According to the movement, in 2014 45% of Brazil’s federal spending went to debt service.
Recently, a number of works tried to develop comparison schemes between open data strategies (Atz, Heath, & Fawcet, 2015; Caplan et al., 2014; Ubaldi, 2013; Zuiderwijk & Janssen, 2014). Huijboom & Broek (2011) listed four categories of instruments applied by the countries to implement their open data policies:
- voluntary approaches, such as general recommendations,
- economic instruments,
- legislation and control, and
- education and training.
One of the conclusions is that the latter was used to a lesser extent than the others.
Social movements, in general, are composed of people with little experience of informatics, whether from a lack of opportunity or a lack of interest. Although it is recognized that using data is important for a social movement’s objectives, the lack of training still hinders wider use of it.
In order to address this issue, an open data course for social movements was designed. Besides building a strategy on open data education, the course also aims to be a research strategy to understand three aspects:
- the motivations of social movements for using open data;
- the impediments that block a wider and better use; and
- possible actions to be taken to enhance the use of open data by social movements….(More)”
Open Data Impact: How Zillow Uses Open Data to Level the Playing Field for Consumers
Daniel Castro at US Dept of Commerce: “In the mid-2000s, several online data firms began to integrate real estate data with national maps to make the data more accessible for consumers. Of these firms, Zillow was the most effective at attracting users by rapidly growing its database, thanks in large part to open data. Zillow’s success is based, in part, on its ability to create tailored products that blend multiple data sources to answer customers’ questions about the housing market. Zillow’s platform lets customers easily compare neighborhoods and conduct thorough real estate searches through a single portal. This ensures a level playing field of information for home buyers, sellers and real estate professionals.
The system empowers consumers by providing them all the information needed to make well-informed decisions about buying or renting a home. For example, information from the Census Bureau’s American Community Survey helps answer people’s questions about what kind of housing they can afford in any U.S. market. Zillow also creates market analysis reports, which inform consumers about whether it is a good time to buy or sell, how an individual property’s value is likely to fluctuate over time, or whether it is better to rent or to own in certain markets. These reports can even show which neighborhoods are the top buyers’ or sellers’ markets in a given city. Zillow uses a wide range of government data, not just from the Census Bureau, to produce economic analyses and products it then freely provides to the public.
In addition to creating reports from synthesized data, Zillow has made a conscious effort to make raw data more usable. It has combined rental, mortgage, and other data into granular metrics on individual neighborhoods and zip codes. For example, the “Breakeven Horizon” is a metric that gives users a snapshot of how long they would need to own a home in a given area for the accrued cost of buying to be less than renting. Zillow creates this by comparing the up-front costs of buying a home versus the amount of interest that money could generate, and then analyzing how median rents and home values are likely to fluctuate, affecting both values. By creating metrics, rankings, and indices, Zillow makes raw or difficult-to-quantify data readily accessible to the public.
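For illustration, a breakeven-horizon comparison of this kind can be sketched in a few lines of Python. This is not Zillow’s methodology: the growth rates, ownership costs, and the interest-only mortgage approximation below are invented assumptions meant only to show the shape of the calculation.

```python
# Illustrative breakeven calculation: accumulate the net cost of owning versus
# renting year by year and report the first year in which owning comes out ahead.
# All rates and costs are made-up assumptions, not Zillow's model.
def breakeven_horizon(price, annual_rent, *, down=0.20, closing=0.06,
                      mortgage_rate=0.04, home_growth=0.03, rent_growth=0.03,
                      invest_return=0.05, own_cost_rate=0.015, max_years=30):
    down_payment = price * down
    loan = price - down_payment
    cumulative_rent = 0.0
    cumulative_own = price * closing                 # up-front transaction costs
    for year in range(1, max_years + 1):
        cumulative_rent += annual_rent * (1 + rent_growth) ** (year - 1)
        cumulative_own += loan * mortgage_rate       # interest-only approximation
        cumulative_own += price * own_cost_rate      # taxes, upkeep, insurance
        appreciation = price * ((1 + home_growth) ** year - 1)
        forgone_interest = down_payment * ((1 + invest_return) ** year - 1)
        net_own = cumulative_own + forgone_interest - appreciation
        if net_own < cumulative_rent:
            return year                              # owning is now cheaper
    return None                                      # no breakeven within the horizon

print(breakeven_horizon(price=300_000, annual_rent=18_000))   # -> 2 with these inputs
```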
While real estate agents can be instrumental in the process of finding a new home or selling an old one, Zillow and other platforms add value by connecting consumers to a wealth of data, some of which may have been accessible before but was too cumbersome for the average user. Not only does this allow buyers and sellers to make more informed decisions about real estate, but it also helps to balance the share of knowledge. Buyers have more information than ever before on available properties, their valuations for specific neighborhoods, and how those valuations have changed in relation to larger markets. Sellers can use the same types of information to evaluate offers they receive, or decide whether to list their home in the first place. The success that Zillow and other companies like it have achieved in the real estate market is a testament to how effective they have been in harnessing data to address consumers’ needs and it is a marvelous example of the power of open data….(More)”