China’s Biggest Polluters Face Wrath of Data-Wielding Citizens


Bloomberg News: “Besides facing hefty fines, criminal punishments and the possibility of closing, the worst emitters in China risk additional public anger as new smartphone applications and lower-cost monitoring devices widen access to data on pollution sources.

The Blue Map app, developed by the Institute of Public & Environmental Affairs with support from the SEE Foundation and the Alibaba Foundation, provides pollution data from more than 3,000 large coal-power, steel, cement and petrochemical production plants. Origins Technology Ltd. in July began sale of the Laser Egg, a palm-sized air quality monitor used to track indoor and outdoor air quality by measuring fine particulate matter in the air.

“Letting people know the sources of regional pollution will help the push for control over emissions of every chimney,” said Ma Jun, the founder and director of the Beijing-based IPE.

The phone map and Laser Egg are the latest levers in prying control over information on air quality from the hands of the few to the many, and they’re beginning to weigh on how officials respond to the issue. Numerous smartphone applications, including those developed by SINA Corp. and Moji Fengyun (Beijing) Software Technology Development Co., now provide people in China with real-time access to air quality readings, essentially democratizing what was once an information pipeline available only to the government.

“China’s continuing struggle to control and reduce air pollution exemplifies the government’s fear that lifestyle issues will mutate into demands for political change,” said Mary Gallagher, an associate professor of political science at the University of Michigan.

Even the government is getting in on the act. The Ministry of Environmental Protection rolled out a smartphone application called “Nationwide Air Quality” with the help of Wuhan Juzheng Environmental Science & Technology Co. at the end of 2013.

“As citizens know more about air pollution, more pressure will be put on the government,” said Xu Qinxiang, a technology manager at Wuhan Juzheng. “This will urge the government to control pollutant sources and upgrade heavy industries.”


The air quality data come from the China National Environment Monitoring Center, local environmental protection bureaus and non-Chinese sources such as the U.S. Embassy’s website in Beijing, Xu said.

Air quality is a controversial subject in China. Since 2012, the public has pushed the government to move more quickly than planned to begin releasing data measuring pollution levels — especially of PM2.5, the particulates most harmful to human health.

The PM2.5 reading was 267 micrograms per cubic meter at 10 a.m. Monday near Tiananmen Square, according to the Beijing Municipal Environmental Monitoring Center. The World Health Organization cautions against 24-hour exposure to concentrations higher than 25.

The availability of data appears to be filling a need, especially with the arrival of colder temperatures and the associated smog that blanketed Beijing and northern China recently….

“With more disclosure of the data, everyone becomes more sensitive, hoping the government can do something,” Li Yajuan, a 27-year-old office secretary, said in an interview in Beijing’s Fuchengmen area. “It’s our own living environment after all.”

Efforts to make products linked to air data continue. IBM has been developing artificial intelligence to help fight Beijing’s toxic air pollution, and plans to work with other municipalities in China and India on similar projects to manage air quality….(More)”

Engaging Citizens: A Review of Eight Approaches to Civic Engagement


 at User Experience: “Twenty years ago, Robert Putnam wrote about the rise of “bowling alone,” a metaphor for people participating in activities as individuals rather than in the groups that build community; this shift, he argued, led to a decline in social capital in America. The problem of individual participation as opposed to community building has only grown since the invention of smartphones, the Internet as the source of all information, social networking, and asynchronous entertainment. We never need to talk to anyone anymore, and it often feels like an imposition when we ask for an answer we know we could find online.

Putnam posited that the decline in social capital is a cause for decline in civic engagement and participation in democracy. If we aren’t engaged socially with the people around us, we don’t have as much incentive to care about what is going on that might affect them. Local elections have low voter turnout in part because people aren’t aware of or engaged in local issues.

In an attempt to chip away at this problem, platforms that attempt to encourage people to engage in civic life with government and local communities have been popping up. But how well do they actually engage people? These platforms are often criticized for producing “slacktivists” who are applying the minimum amount of effort possible and not really effecting change. Several of these platforms were evaluated to see how they work and to determine how well they actually promote civic engagement.

Measuring Civic Engagement

Code for America is an organization that works to increase engagement with local governments by putting together “brigades” of local volunteers to solve local problems using technology. They have developed an Engagement Standard that attempts to measure how well a government enables citizens to engage in civic life.

Elements of Code for America’s Engagement Standard include:

  • Reach: Defining the constituency you are trying to reach, with an emphasis on identifying those whose voices aren’t already represented.
  • Channels: Making use of a diversity of spaces, both online and off, that meet people where they are.
  • Information: Providing relevant information that is easy to find and understand, and speaking with an authentic voice.
  • Productive Actions: Identifying clear, concrete, and meaningful actions residents can take to reach desired outcomes.
  • Feedback Loops: Making sure the public understands the productive impact of their participation and that their actions have value.

These elements form a funnel, shown in Figure 1, which starts with reaching the right audience and ends with providing feedback to that audience on the effects of their actions. Platforms that have low engagement tend to get stuck at the top of the funnel and platforms that foster more engagement meet all the standards in the funnel.

Figure 1: The funnel of engagement, showing reach, channels, information, productive actions, and feedback loops.
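One hypothetical way to quantify where a platform “gets stuck” in this funnel is to track how many participants survive each stage and compute stage-to-stage conversion. The stage names below follow Code for America’s Engagement Standard; the counts are invented for illustration.

```python
# Hypothetical counts of participants remaining at each funnel stage
funnel = [
    ("reach", 10000),
    ("channels", 4000),
    ("information", 2500),
    ("productive actions", 600),
    ("feedback loops", 450),
]

def conversion_rates(stages):
    """Fraction of participants retained from each stage to the next."""
    return {f"{a} -> {b}": round(n_b / n_a, 2)
            for (a, n_a), (b, n_b) in zip(stages, stages[1:])}

print(conversion_rates(funnel))
```

In this made-up example the sharpest drop occurs between information and productive actions, which is exactly the kind of diagnosis the funnel is meant to support: the platform reaches people and informs them, but fails to convert attention into action.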

Eight Approaches to Civic Engagement

Each of the platforms described below attempts to engage citizens in civic actions. However, each platform has a different approach….(More)”

The Quest for Good Governance


New book by Alina Mungiu-Pippidi: “Why do some societies manage to control corruption so that it manifests itself only occasionally, while other societies remain systemically corrupt? This book is about how societies reach that point when integrity becomes the norm and corruption the exception in regard to how public affairs are run and public resources are allocated. It primarily asks what lessons we have learned from historical and contemporary experiences in developing corruption control, which can aid policy-makers and civil societies in steering and expediting this process. Few states now remain without either an anticorruption agency or an Ombudsman, yet no statistical evidence can be found that they actually induce progress. Using both historical and contemporary studies and easy-to-understand statistics, Alina Mungiu-Pippidi looks at how to diagnose, measure and change governance so that those entrusted with power and authority manage to defend public resources….(More)”

Opening up government data for public benefit


Keiran Hardy at the Mandarin (Australia): “…This post explains the open data movement and considers the benefits and risks of releasing government data as open data. It then outlines the steps taken by the Labor and Liberal governments in accordance with this trend. It argues that the Prime Minister’s task, while admirably intentioned, is likely to prove difficult due to ongoing challenges surrounding the requirements of privacy law and a public service culture that remains reluctant to release government data into the public domain….

A key purpose of releasing government data is to improve the effectiveness and efficiency of services delivered by the government. For example, data on crops, weather and geography might be analysed to improve current approaches to farming and industry, or data on hospital admissions might be analysed alongside demographic and census data to improve the efficiency of health services in areas of need. It has been estimated that such innovation based on open data could benefit the Australian economy by up to $16 billion per year.

Another core benefit is that the open data movement is making gains in transparency and accountability, as a greater proportion of government decisions and operations are being shared with the public. These democratic values are made clear in the OGP’s Open Government Declaration, which aims to make governments ‘more open, accountable, and responsive to citizens’.

Open data can also improve democratic participation by allowing citizens to contribute to policy innovation. Events like GovHack, an annual Australian competition in which government, industry and the general public collaborate to find new uses for open government data, epitomise a growing trend towards service delivery informed by user input. The winner of the “Best Policy Insights Hack” at GovHack 2015 developed a software program for analysing which suburbs are best placed for rooftop solar investment.

At the same time, the release of government data poses significant risks to the privacy of Australian citizens. Much of the open data currently available is spatial (geographic or satellite) data, which is relatively unproblematic to post online as it poses minimal privacy risks. However, for the full benefits of open data to be gained, these kinds of data need to be supplemented with information on welfare payments, hospital admission rates and other potentially sensitive areas which could drive policy innovation.

Policy data in these areas would be de-identified — that is, all names, addresses and other obvious identifying information would be removed so that only aggregate or statistical data remains. However, debates continue as to the reliability of de-identification techniques, as there have been prominent examples of individuals being re-identified by cross-referencing datasets….
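The aggregation step described above can be illustrated with a minimal sketch: collapsing record-level data into group counts and suppressing any cell below a minimum size, a common (though, as the re-identification debates show, not foolproof) de-identification technique. The records and the threshold of 5 here are hypothetical, not drawn from any real dataset.

```python
from collections import Counter

def aggregate_with_suppression(records, key, threshold=5):
    """Aggregate record-level data into counts per group,
    suppressing (returning None for) any group smaller than threshold."""
    counts = Counter(r[key] for r in records)
    return {group: (n if n >= threshold else None)
            for group, n in counts.items()}

# Hypothetical hospital-admission records, names already stripped
records = [{"suburb": "A"}] * 12 + [{"suburb": "B"}] * 3

print(aggregate_with_suppression(records, "suburb"))
```

Suburb B’s count of 3 falls below the threshold and is withheld, because publishing very small cells makes it easier to single out individuals when the table is cross-referenced with other datasets.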

With regard to open data, a culture resistant to releasing government information appears to be driven by several similar factors, including:

  • A generational preference amongst public service management for maintaining secrecy of information, whereas younger generations expect that data should be made freely available;
  • Concerns about the quality or accuracy of information being released;
  • Fear that mistakes or misconduct on behalf of government employees might be exposed;
  • Limited understanding of the benefits that can be gained from open data; and
  • A lack of leadership to help drive the open data movement.

If open data policies have a similar effect on public service culture as FOI legislation, it may be that open data policies in fact hinder transparency by having a chilling effect on government decision-making for fear of what might be exposed….

These legal and cultural hurdles will pose ongoing challenges for the Turnbull government in seeking to release greater amounts of government data as open data….(More)

Big Data Before the Web


Evan Hepler-Smith in the Wall Street Journal: “Sometime in the early 1950s, on a reservation in Wisconsin, a Menominee Indian man looked at an ink blot. An anthropologist recorded the man’s reaction according to a standard Rorschach-test protocol. The researcher submitted a copy of these notes to an enormous cache of records collected over the course of decades by American social scientists working among various “societies ‘other than our own.’ ” This entire collection of social-scientific data was photographed and printed in arrays of microscopic images on 3-by-5-inch cards. Sets of these cards were shipped to research libraries around the world. They gathered dust.

In the results of this Rorschach test, the anthropologist saw evidence of a culture eroded by modernity. Sixty years later, these documents also testify to the aspirations and fate of the social-scientific project for which they were generated. Deep within this forgotten Ozymandian card file sits the Menominee man’s reaction to Rorschach card VI: “It is like a dead planet. It seems to tell the story of a people once great who have lost . . . like something happened. All that’s left is the symbol.”

In “Database of Dreams: The Lost Quest to Catalog Humanity,” Rebecca Lemov delves into the ambitious efforts of mid-20th-century social scientists to build a “capacious and reliable science of the varieties of the human being” by generating an archive of human experience through interviews and tests and by storing the information on the high-tech media of the day.

 For these psychologists and anthropologists, the key to a universal human science lay in studying members of cultures in transition between traditional and modern ways of life and in rendering their individuality as data. Interweaving stories of social scientists, Native American research subjects and information technologies, Ms. Lemov presents a compelling account of “what ‘humanness’ came to mean in an age of rapid change in technological and social conditions.” Ms. Lemov, an associate professor of the history of science at Harvard University, follows two contrasting threads through a story that she calls “a parable for our time.” She shows, first, how collecting data about human experience shapes human experience and, second, how a high-tech data repository of the 1950s became, as she puts it, a “data ruin.”…(More) – See also: Database of Dreams: The Lost Quest to Catalog Humanity

OpenFDA: an innovative platform providing access to a wealth of FDA’s publicly available data


Paper by Taha A Kass-Hout et al in JAMIA: “The objective of openFDA is to facilitate access and use of big important Food and Drug Administration public datasets by developers, researchers, and the public through harmonization of data across disparate FDA datasets provided via application programming interfaces (APIs).

Materials and Methods: Using cutting-edge technologies deployed on FDA’s new public cloud computing infrastructure, openFDA provides open data for easier, faster (over 300 requests per second per process), and better access to FDA datasets; open source code and documentation shared on GitHub for open community contributions of examples, apps and ideas; and infrastructure that can be adopted for other public health big data challenges.

Results: Since its launch on June 2, 2014, openFDA has developed four APIs for drug and device adverse events, recall information for all FDA-regulated products, and drug labeling. There have been more than 20 million API calls (more than half from outside the United States), 6000 registered users, 20,000 connected Internet Protocol addresses, and dozens of new software (mobile or web) apps developed. A case study demonstrates a use of openFDA data to understand an apparent association of a drug with an adverse event.

Conclusion: With easier and faster access to these datasets, consumers worldwide can learn more about FDA-regulated products…(More)”
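As a rough illustration of the API style the paper describes, the sketch below builds a query against openFDA’s drug adverse event endpoint and tallies a sample response. The endpoint path and the `search`/`count`/`limit` parameters follow openFDA’s published query conventions, but the response snippet is invented for illustration; real use requires a network call and, for sustained traffic, an API key.

```python
from urllib.parse import urlencode

BASE = "https://api.fda.gov/drug/event.json"

def build_query(search_term, count_field, limit=10):
    """Construct an openFDA query URL: `search` filters reports,
    `count` returns aggregate tallies rather than record-level data."""
    params = {"search": search_term, "count": count_field, "limit": limit}
    return BASE + "?" + urlencode(params)

url = build_query('patient.drug.medicinalproduct:"aspirin"',
                  "patient.reaction.reactionmeddrapt.exact")
print(url)

# A trimmed, hypothetical response in openFDA's count format:
sample_response = {"results": [
    {"term": "NAUSEA", "count": 1203},
    {"term": "DIZZINESS", "count": 877},
]}
top = max(sample_response["results"], key=lambda r: r["count"])
print(top["term"])
```

Requesting counts rather than individual reports mirrors the adverse-event case study above: a developer can surface an apparent drug–reaction association without ever downloading record-level data.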

Data Science ethics


Gov.uk blog: “If Tesco knows day-to-day how poorly the nation is, how can Government access similar insights so it can better plan health services? If Airbnb can give you a tailored service depending on your tastes, how can Government provide people with the right support to help them back into work in a way that is right for them? If companies are routinely using social media data to get feedback from their customers to improve their services, how can Government also use publicly available data to do the same?

Data science allows us to use new types of data and powerful tools to analyse this more quickly and more objectively than any human could. It can put us in the vanguard of policymaking – revealing new insights that lead to better and more tailored interventions. And it can help reduce costs, freeing up resources to spend on more serious cases.

But some of these data uses and machine-learning techniques are new and still relatively untested in Government. Of course, we operate within legal frameworks such as the Data Protection Act and Intellectual Property law. These are flexible but don’t always talk explicitly about the new challenges data science throws up. For example, how are you to explain the decision-making process of a deep-learning black-box algorithm? And if you were able to, how would you do so in plain English and not a row of 0s and 1s?

We want data scientists to feel confident to innovate with data, alongside the policy makers and operational staff who make daily decisions on the data that the analysts provide. That’s why we are creating an ethical framework which brings together the relevant parts of the law and ethical considerations into a simple document that helps Government officials decide what Government can do and what it should do. We have a moral responsibility to maximise the use of data – which is never more apparent than after incidents of abuse or crime are left undetected – as well as to pay heed to the potential risks of these new tools. The guidelines are draft and not formal government policy, but we want to share them more widely in order to help iterate and improve them further….

So what’s in the framework? There is more detail in the fuller document, but it is based around six key principles:

  1. Start with a clear user need and public benefit: this will help you justify the level of data sensitivity and method you use
  2. Use the minimum level of data necessary to fulfill the public benefit: there are many techniques for doing so, such as de-identification, aggregation or querying against data
  3. Build robust data science models: the model is only as good as the data it contains and while machines are less biased than humans they can get it wrong. It’s critical to be clear about the confidence of the model and think through unintended consequences and biases contained within the data
  4. Be alert to public perceptions: put simply, what would a normal person on the street think about the project?
  5. Be as open and accountable as possible: Transparency is the antiseptic for unethical behavior. Aim to be as open as possible (with explanations in plain English), although in certain public protection cases the ability to be transparent will be constrained.
  6. Keep data safe and secure: this is not restricted to data science projects but we know that the public are most concerned about losing control of their data….(More)”

Controlling the crowd? Government and citizen interaction on emergency-response platforms


 at the Policy and Internet Blog: “My interest in the role of crowdsourcing tools and practices in emergency situations was triggered by my personal experience. In 2010 I was one of the co-founders of the Russian “Help Map” project, which facilitated volunteer-based response to wildfires in central Russia. When I was working on this project, I realized that a crowdsourcing platform can bring the participation of the citizen to a new level and transform sporadic initiatives by single citizens and groups into large-scale, relatively well coordinated operations. What was also important was that both the needs and the forms of participation required in order to address these needs be defined by the users themselves.

To some extent the citizen-based response filled the gap left by the lack of a sufficient response from the traditional institutions.[1] This suggests that the role of ICTs in disaster response should be examined within the political context of the power relationship between members of the public who use digital tools and the traditional institutions. My experience in 2010 was the first time I was able to see that, while we would expect that in a case of natural disaster both the authorities and the citizens would be mostly concerned about the emergency, the actual situation might be different.

Apparently the emergence of independent, citizen-based collective action in response to a disaster was considered as some type of threat by the institutional actors. First, it was a threat to the image of these institutions, which didn’t want citizens to be portrayed as the leading responding actors. Second, any type of citizen-based collective action, even if not purely political, may be an issue of concern in authoritarian countries in particular. Accordingly, one can argue that, while citizens are struggling against a disaster, in some cases the traditional institutions may make substantial efforts to restrain and contain the action of citizens. In this light, the role of information technologies can include not only enhancing citizen engagement and increasing the efficiency of the response, but also controlling the digital crowd of potential volunteers.

The purpose of this paper was to conceptualize the tension between the role of ICTs in the engagement of the crowd and its resources, and the role of ICTs in controlling the resources of the crowd. The research suggests a theoretical and methodological framework that allows us to explore this tension. The paper focuses on an analysis of specific platforms and suggests empirical data about the structure of the platforms, and interviews with developers and administrators of the platforms. This data is used in order to identify how tools of engagement are transformed into tools of control, and what major differences there are between platforms that seek to achieve these two goals. That said, obviously any platform can have properties of control and properties of engagement at the same time; however the proportion of these two types of elements can differ significantly.

One of the core issues for my research is how traditional actors respond to fast, bottom-up innovation by citizens.[2] On the one hand, the authorities try to restrict the empowerment of citizens by the new tools. On the other hand, the institutional actors also seek to innovate and develop new tools that can restore the balance of power that has been challenged by citizen-based innovation. The tension between using digital tools for the engagement of the crowd and for control of the crowd can be considered as one of the aspects of this dynamic.

That doesn’t mean that all state-backed platforms are created solely for the purpose of control. One can argue, however, that the development of digital tools that offer a mechanism of command and control over the resources of the crowd is prevalent among the projects that are supported by the authorities. This can also be approached as a means of using information technologies in order to include the digital crowd within the “vertical of power”, which is a top-down strategy of governance. That is why this paper seeks to conceptualize this phenomenon as “vertical crowdsourcing”.

The question of whether using a digital tool as a mechanism of control is intentional is to some extent secondary. What is important is that the analysis of platform structures relying on activity theory identifies a number of properties that allow us to argue that these tools are primarily tools of control. The conceptual framework introduced in the paper is used in order to follow the transformation of tools for the engagement of the crowd into tools of control over the crowd. That said, some of the interviews with the developers and administrators of the platforms may suggest the intentional nature of the development of tools of control, while crowd engagement is secondary….Read the full article: Asmolov, G. (2015) Vertical Crowdsourcing in Russia: Balancing Governance of Crowds and State–Citizen Partnership in Emergency Situations.”

 

The Discourse of Public Participation Media


Book by Joanna Thornborrow: “The Discourse of Public Participation Media takes a fresh look at what ‘ordinary’ people are doing on air – what they say, and how and where they get to say it.

Using techniques of discourse analysis to explore the construction of participant identities in a range of different public participation genres, Joanna Thornborrow argues that the role of the ‘ordinary’ person in these media environments is frequently anything but.

Tracing the development of discourses of public participation media, the book focusses particularly on the 1990s onwards when broadcasting was expanding rapidly: the rise of the TV talk show, increasing formats for public participation in broadcast debate and discussion, and the explosion of reality TV in the first decade of the 21st century. During this period, traditional broadcasting has also had to move with the times and incorporate mobile and web-based communication technologies as new platforms for public access and participation – text and email as well as the telephone – and an audience that moves out of the studio and into the online spaces of chat rooms, comment forums and the ‘twitterverse’.

This original study examines the shifting discourses of public engagement and participation resulting from these new forms of communication, making it an ideal companion for students of communication, media and cultural studies, media discourse, broadcast talk and social interaction….(More)”

Open Data Index 2015


Open Knowledge: “….This year’s Index showed impressive gains from non-OECD countries, with Taiwan topping the Index and Colombia and Uruguay breaking into the top ten at numbers four and seven respectively. Overall, the Index evaluated 122 places and 1586 datasets and determined that only 9%, or 156 datasets, were both technically and legally open.

The Index ranks countries based on the availability and accessibility of data in thirteen key categories, including government spending, election results, procurement, and pollution levels. Over the summer, we held a public consultation, which saw contributions from individuals within the open data community as well as from key civil society organisations across an array of sectors. As a result of this consultation, we expanded the 2015 Index to include public procurement data, water quality data, land ownership data and weather data; we also decided to remove transport timetables due to the difficulties faced when comparing transport system data globally.

Open Knowledge International began to systematically track the release of open data by national governments in 2013, with the objective of measuring whether governments were releasing the key datasets of high social and democratic value as open data. That enables us to better understand the current state of play and in turn work with civil society actors to address the gaps in data release. Over the course of the last three years, the Global Open Data Index has become more than just a benchmark – we noticed that governments began to use the Index as a reference to inform their open data priorities, and civil society actors began to use the Index as an advocacy tool to encourage governments to improve their performance in releasing key datasets.

Furthermore, indices such as the Global Open Data Index are not without their challenges. The Index measures the technical and legal openness of datasets deemed to be of critical democratic and social value – it does not measure the openness of a given government. It should be clear that the release of a few key datasets is not a sufficient measure of the openness of a government. The blurring of lines between open data and open government is nothing new and has been hotly debated by civil society groups and transparency organisations since the sharp rise in popularity of open data policies over the last decade. …Index at http://index.okfn.org/”