In democracy and disaster, emerging world embraces 'open data'


Jeremy Wagstaff at Reuters: “‘Open data’ – the trove of data-sets made publicly available by governments, organizations and businesses – isn’t normally linked to high-wire politics, but just may have saved last month’s Indonesian presidential elections from chaos.
Data is considered open when it’s released for anyone to use and in a format that’s easy for computers to read. The uses are largely commercial, such as the GPS data from U.S.-owned satellites, but data can range from budget numbers and climate and health statistics to bus and rail timetables.
It’s a revolution that’s swept the developed world in recent years as governments and agencies like the World Bank have freed up hundreds of thousands of data-sets for use by anyone who sees a use for them. Data.gov, a U.S. site, lists more than 100,000 data-sets, from food calories to magnetic fields in space.
Consultants McKinsey reckon open data could add up to $3 trillion worth of economic activity a year – from performance ratings that help parents find the best schools to governments saving money by releasing budget data and asking citizens to come up with cost-cutting ideas. All the apps, services and equipment that tap the GPS satellites, for example, generate $96 billion of economic activity each year in the United States alone, according to a 2011 study.
But so far open data has had a limited impact in the developing world, where officials are wary of giving away too much information, and where there’s the issue of just how useful it might be: for most people in emerging countries, property prices and bus schedules aren’t top priorities.
But last month’s election in Indonesia – a contentious face-off between a disgraced general and a furniture-exporter turned reformist – highlighted how powerful open data can be in tandem with a handful of tech-smart programmers, social media savvy and crowdsourcing.
“Open data may well have saved this election,” said Paul Rowland, a Jakarta-based consultant on democracy and governance…”
 

Beyond just politics: A systematic literature review of online participation


Paper by Christoph Lutz, Christian Pieter Hoffmann, and Miriam Meckel in First Monday: “This paper presents a systematic literature review of the current state–of–research on online participation. The review draws on four databases and is guided by the application of six topical search terms. The analysis strives to differentiate distinct forms of online participation and to identify salient discourses within each research field. We find that research on online participation is highly segregated into specific sub–discourses that reflect disciplinary boundaries. Research on online political participation and civic engagement is identified as the most prominent and extensive research field. Yet research on other forms of participation, such as cultural, business, education and health participation, provides distinct perspectives and valuable insights. We outline both field–specific and common findings and derive propositions for future research.”

Reprogramming Government: A Conversation With Mikey Dickerson


Q and A in The New York Times: “President Obama owes Mikey Dickerson two debts of gratitude. Mr. Dickerson was a crucial member of the team that, in just six weeks, fixed the HealthCare.gov website when the two-year, $400 million health insurance project failed almost as soon as it opened to the public in October.

Mr. Dickerson, 35, also oversaw the computers and wrote software for Mr. Obama’s 2012 re-election campaign, including crucial last-minute programs to figure out ad placement and plan “get out the vote” campaigns in critical areas. It was a good fit for him; since 2006, Mr. Dickerson had worked for Google on its computer systems, which have grown rapidly and are now among the world’s largest.

But last week Mr. Obama lured Mr. Dickerson away from Google. His new job at the White House will be to identify and fix other troubled government computer systems and websites. The engineer says he wants to change how citizens interact with the government as well as prevent catastrophes. He talked on Friday about his new role, in a conversation that has been condensed and edited….”

Can big data help build more wind and solar farms?


Rachael Post in The Guardian: “Convincing customers to switch to renewable energy is an uphill battle. But for a former political operative, finding business is as easy as mining a consumer behavior database…After his father died from cancer related to pollution from a coal-burning plant, Tom Matzzie, the former director of democratic activist group MoveOn.org, decided that he’d had enough with traditional dirty energy. But when he installed solar panels on his home, he discovered that the complicated permitting and construction process made switching to renewable energy difficult and unwieldy. The solution, he concluded, was to use his online campaigning and big data skills – honed from his years of working in politics – to find the most likely customers for renewables and convince them to switch. Ethical Electric was born.
Matzzie’s company isn’t the first to sell renewable energy, but it might be the smartest. For the most part, convincing people to switch away from dirty energy is an unprofitable and work-intensive process, requiring electrical company representatives to approach thousands of randomly chosen customers. Ethical Electric, on the other hand, uses a highly-targeted, strategic method to identify its potential customers.
From finding votes to finding customers
Matzzie, who is now CEO of Ethical Electric, explained that the secret lies in his company’s use of big data, a resource that he and his partners mastered on the political front lines. In the last few presidential elections, big data fundamentally changed the way candidates – and their teams – approached voters. “We couldn’t rely on voter registration lists to make assumptions about who would be willing to vote in the next election,” Matzzie said. “What happened in politics is a real revolution in data.”…”

Interpreting Hashtag Politics – Policy Ideas in an Era of Social Media


New book by Stephen Jeffares: “Why do policy actors create branded terms like Big Society and does launching such policy ideas on Twitter extend or curtail their life? This book argues that the practice of hashtag politics has evolved in response to an increasingly congested and mediatised environment, with the recent and rapid growth of high speed internet connections, smart phones and social media. It examines how policy analysis can adapt to offer interpretive insights into the life and death of policy ideas in an era of hashtag politics.
This text reveals that policy ideas can at the same time be ideas, instruments, visions, containers and brands, and advises readers on how to tell if a policy idea is dead or dying, how to map the diversity of viewpoints, how to capture the debate, when to engage and when to walk away. Each chapter showcases innovative analytic techniques, illustrated by application to contemporary policy ideas.”

Designing an Online Civic Engagement Platform: Balancing “More” vs. “Better” Participation in Complex Public Policymaking


Paper by Cynthia R. Farina et al. in E-Politics: “A new form of online citizen participation in government decisionmaking has arisen in the United States (U.S.) under the Obama Administration. “Civic Participation 2.0” attempts to use Web 2.0 information and communication technologies to enable wider civic participation in government policymaking, based on three pillars of open government: transparency, participation, and collaboration. Thus far, the Administration has modeled Civic Participation 2.0 almost exclusively on a universalist/populist Web 2.0 philosophy of participation. In this model, content is created by users, who are enabled to shape the discussion and assess the value of contributions with little information or guidance from government decisionmakers. The authors suggest that this model often produces “participation” unsatisfactory to both government and citizens. The authors propose instead a model of Civic Participation 2.0 rooted in the theory and practice of democratic deliberation. In this model, the goal of civic participation is to reveal the conclusions people reach when they are informed about the issues and have the opportunity and motivation seriously to discuss them. Accordingly, the task of civic participation design is to provide the factual and policy information and the kinds of participation mechanisms that support and encourage this sort of participatory output. Based on the authors’ experience with Regulation Room, an experimental online platform for broadening effective civic participation in rulemaking (the process federal agencies use to make new regulations), the authors offer specific suggestions for how designers can strike the balance between ease of engagement and quality of engagement – and so bring new voices into public policymaking processes through participatory outputs that government decisionmakers will value.”

Sharing Data Is a Form of Corporate Philanthropy


Matt Stempeck in HBR Blog: “Ever since the International Charter on Space and Major Disasters was signed in 1999, satellite companies like DMC International Imaging have had a clear protocol with which to provide valuable imagery to public actors in times of crisis. In a single week this February, DMCii tasked its fleet of satellites on flooding in the United Kingdom, fires in India, floods in Zimbabwe, and snow in South Korea. Official crisis response departments and relevant UN departments can request on-demand access to the visuals captured by these “eyes in the sky” to better assess damage and coordinate relief efforts.

DMCii is a private company, yet it provides enormous value to the public and social sectors simply by periodically sharing its data.
Back on Earth, companies create, collect, and mine data in their day-to-day business. This data has quickly emerged as one of this century’s most vital assets. Public sector and social good organizations may not have access to the same amount, quality, or frequency of data. This imbalance has inspired a new category of corporate giving foreshadowed by the 1999 Space Charter: data philanthropy.
The satellite imagery example is an area of obvious societal value, but data philanthropy holds even stronger potential closer to home, where a wide range of private companies could give back in meaningful ways by contributing data to public actors. Consider two promising contexts for data philanthropy: responsive cities and academic research.
The centralized institutions of the 20th century allowed for the most sophisticated economic and urban planning to date. But in recent decades, the information revolution has helped the private sector speed ahead in data aggregation, analysis, and applications. It’s well known that there’s enormous value in real-time usage of data in the private sector, but there are similarly huge gains to be won in the application of real-time data to mitigate common challenges.
What if sharing economy companies shared their real-time housing, transit, and economic data with city governments or public interest groups? For example, Uber maintains a “God’s Eye view” of every driver on the road in a city.
Imagine combining this single data feed with an entire portfolio of real-time information. An early leader in this space is the City of Chicago’s urban data dashboard, WindyGrid. The dashboard aggregates an ever-growing variety of public datasets to allow for more intelligent urban management.
Over time, we could design responsive cities that react to this data. A responsive city is one where services, infrastructure, and even policies can flexibly respond to the rhythms of its denizens in real-time. Private sector data contributions could greatly accelerate these nascent efforts.
Data philanthropy could similarly benefit academia. Access to data remains an unfortunate barrier to entry for many researchers. The result is that only researchers with access to certain data, such as full-volume social media streams, can analyze and produce knowledge from this compelling information. Twitter, for example, sells access to a range of real-time APIs to marketing platforms, but the price point often exceeds researchers’ budgets. To accelerate the pursuit of knowledge, Twitter has piloted a program called Data Grants offering access to segments of their real-time global trove to select groups of researchers. With this program, academics and other researchers can apply to receive access to relevant bulk data downloads, such as a period of time before and after an election, or a certain geographic area.
Humanitarian response, urban planning, and academia are just three sectors within which private data can be donated to improve the public condition. There are many more possible applications, but few examples to date. For companies looking to expand their corporate social responsibility initiatives, sharing data should be part of the conversation…
Companies considering data philanthropy can take the following steps:

  • Inventory the information your company produces, collects, and analyzes. Consider which data would be easy to share and which data will require long-term effort.
  • Think about who could benefit from this information. Who in your community doesn’t have access to this information?
  • Who could be harmed by the release of this data? If the datasets are about people, have they consented to its release? (i.e. don’t pull a Facebook emotional manipulation experiment).
  • Begin conversations with relevant public agencies and nonprofit partners to get a sense of the sort of information they might find valuable and their capacity to work with the formats you might eventually make available.
  • If you expect an onslaught of interest, an application process can help qualify partnership opportunities to maximize positive impact relative to time invested in the program.
  • Consider how you’ll handle distribution of the data to partners. Even if you don’t have the resources to set up an API, regular releases of bulk data could still provide enormous value to organizations used to relying on less-frequently updated government indices.
  • Consider your needs regarding privacy and anonymization. Strip the data of anything remotely resembling personally identifiable information (here are some guidelines).
  • If you’re making data available to researchers, plan to allow researchers to publish their results without obstruction. You might also require them to share the findings with the world under Open Access terms….”
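The anonymization step in the list above can be illustrated with a minimal sketch. It assumes the bulk release is a list of dictionaries and that the sensitive columns are already known; the field names and the sample ride records are hypothetical and do not come from the article. Dropping fields is only a first pass, since combinations of the remaining fields can still identify people, so this is a starting point and not a complete anonymization procedure.

```python
# Hypothetical field names; a real list would come from the company's own data inventory.
PII_FIELDS = {"name", "email", "phone", "address", "user_id"}


def strip_pii(records, pii_fields=PII_FIELDS):
    """Return copies of the records with the listed PII fields removed."""
    return [
        {key: value for key, value in record.items() if key not in pii_fields}
        for record in records
    ]


# Made-up sample records standing in for a bulk data release.
rides = [
    {"user_id": 17, "email": "a@example.org", "pickup_zone": "Loop", "hour": 8},
    {"user_id": 42, "email": "b@example.org", "pickup_zone": "Uptown", "hour": 18},
]

cleaned = strip_pii(rides)
print(cleaned)
```

The function builds new dictionaries instead of editing the originals, so the internal records stay intact while only the cleaned copies are shared.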

How Three Startups Are Using Data to Renew Public Trust In Government


Mark Hall: “Chances are that when you think about the word government, it is with a negative connotation. Your less-than-stellar opinion of government may be caused by everything from Washington’s dirty politics to the long lines at your local DMV. Regardless of the reason, local, state and national politics have frequently garnered a bad reputation. People feel like governments aren’t working for them. We have limited information, visibility and insight into what’s going on and why. Yes, the data is public information but it’s difficult to access and sift through.
Good news. Things are changing fast.
Innovative startups are emerging and they are changing the way we access government information at all levels.
Here are three tech startups that are taking a unique approach to opening up government data:
1. OpenGov is a Mountain View-based software company that enables government officials and local residents to easily parse through the city’s financial data.
Founded by a team with extensive technology and finance experience, this startup has already racked up some of the largest cities to join the movement, including the City of Los Angeles. OpenGov’s approach pairs data with good design in a manner that makes it easy to use. Historically, information like expenditures of public funds existed in a silo within the mayor’s office or city manager, diminishing the accountability of public employees. Imagine you are a citizen who is interested in seeing how much your city spent on a particular matter.
Now you can find out within just a few clicks.
This data is always of great importance but could also become increasingly critical during events like local elections. This level of openness and accessibility to data will be game-changing.
2. FiscalNote is a one-year-old startup that uses analytical signals and intelligent government data to map legislation and predict an outcome.
Headquartered in Washington D.C., the company has developed a search layer and unique algorithm that makes tracking legislative data extremely easy. If you are an organization that has vested interests in specific legislative bills, tools by FiscalNote can give you insights into its progress and likelihood of being passed or held up. Want to know if your local representative favors a bill that could hurt your industry? Find out early and take the steps necessary to minimize the impact. Large corporations and special interest groups have traditionally held lobbying power with elected officials. This technology is important because small businesses, nonprofits and organizations now have an additional tool to see a changing legislative landscape in ways that were previously unimaginable.
3. Civic Industries is a San Francisco startup that allows citizens and local government officials to easily access data that previously required you to drive down to city hall. Building permits, code enforcements, upcoming government projects and construction data are now openly available within a few clicks.
Civic Insight maps various projects in your community and enables you to see all the projects with the corresponding start and completion dates, along with department contacts.
Accountability of public planning is no longer concealed to the city workers in the back-office. Responsibility is made clear. The startup also pushes underutilized city resources like empty storefronts and abandoned buildings to the forefront in an effort to drive action, either by residents or government officials.
So What’s Next?
While these three startups are using data to push government transparency in the right direction, more work is needed…”

Selected Readings on Sentiment Analysis


The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of sentiment analysis was originally published in 2014.

Sentiment Analysis is a field of Computer Science that uses techniques from natural language processing, computational linguistics, and machine learning to predict subjective meaning from text. The term opinion mining is often used interchangeably with Sentiment Analysis, although it is technically a subfield focusing on the extraction of opinions (the umbrella under which sentiment, evaluation, appraisal, attitude, and emotion all lie).
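As a rough illustration of what "predicting subjective meaning from text" can look like in practice, here is a minimal lexicon-based sketch. It is far simpler than the machine-learning approaches covered in the readings below and is not taken from any of them; the word lists and the `lexicon_sentiment` function are assumptions made up for this example.

```python
import re

# Tiny hand-picked word lists, assumed for illustration only.
POSITIVE = {"good", "great", "support", "love", "excellent"}
NEGATIVE = {"bad", "poor", "oppose", "hate", "terrible"}


def lexicon_sentiment(text):
    """Label text as positive, negative, or neutral by counting lexicon hits."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = sum(token in POSITIVE for token in tokens) - sum(
        token in NEGATIVE for token in tokens
    )
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(lexicon_sentiment("I love this great policy"))
print(lexicon_sentiment("This is a terrible, bad idea"))
```

A word-counting baseline like this misses negation, sarcasm, and context, which is part of why the field relies on the richer techniques described in the works listed here.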

The rise of Web 2.0 and increased information flow has led to an increase in interest towards Sentiment Analysis — especially as applied to social networks and media. Events causing large spikes in media — such as the 2012 Presidential Election Debates — are especially ripe for analysis. Such analyses raise a variety of implications for the future of crowd participation, elections, and governance.

Annotated Selected Reading List (in alphabetical order)

Choi, Eunsol et al. “Hedge detection as a lens on framing in the GMO debates: a position paper.” Proceedings of the Workshop on Extra-Propositional Aspects of Meaning in Computational Linguistics 13 Jul. 2012: 70-79. http://bit.ly/1wweftP

  • Understanding the ways in which participants in public discussions frame their arguments is important for understanding how public opinion is formed. This paper adopts the position that it is time for more computationally-oriented research on problems involving framing. In the interests of furthering that goal, the authors propose the following question: In the controversy regarding the use of genetically-modified organisms (GMOs) in agriculture, do pro- and anti-GMO articles differ in whether they choose to adopt a more “scientific” tone?
  • Prior work on the rhetoric and sociology of science suggests that hedging may distinguish popular-science text from text written by professional scientists for their colleagues. The paper proposes a detailed approach to studying whether hedge detection can be used to understand scientific framing in the GMO debates, and provides corpora to facilitate this study. Some of the preliminary analyses suggest that hedges occur less frequently in scientific discourse than in popular text, a finding that contradicts prior assertions in the literature.

Michael, Christina, Francesca Toni, and Krysia Broda. “Sentiment analysis for debates.” (Unpublished MSc thesis). Department of Computing, Imperial College London (2013). http://bit.ly/Wi86Xv

  • This project aims to expand on existing solutions used for automatic sentiment analysis on text in order to capture support/opposition and agreement/disagreement in debates. In addition, it looks at visualizing the classification results for enhancing the ease of understanding the debates and for showing underlying trends. Finally, it evaluates proposed techniques on an existing debate system for social networking.

Murakami, Akiko, and Rudy Raymond. “Support or oppose?: classifying positions in online debates from reply activities and opinion expressions.” Proceedings of the 23rd International Conference on Computational Linguistics: Posters 23 Aug. 2010: 869-875. https://bit.ly/2Eicfnm

  • In this paper, the authors propose a method for the task of identifying the general positions of users in online debates, i.e., support or oppose the main topic of an online debate, by exploiting local information in their remarks within the debate. An online debate is a forum where each user posts an opinion on a particular topic while other users state their positions by posting their remarks within the debate. The supporting or opposing remarks are made by directly replying to the opinion, or indirectly to other remarks (to express local agreement or disagreement), which makes the task of identifying users’ general positions difficult.
  • A prior study has shown that a link-based method, which completely ignores the content of the remarks, can achieve higher accuracy for the identification task than methods based solely on the contents of the remarks. In this paper, it is shown that incorporating the textual content of the remarks into the link-based method can yield higher accuracy in the identification task.

Pang, Bo, and Lillian Lee. “Opinion mining and sentiment analysis.” Foundations and trends in information retrieval 2.1-2 (2008): 1-135. http://bit.ly/UaCBwD

  • This survey covers techniques and approaches that promise to directly enable opinion-oriented information-seeking systems. Its focus is on methods that seek to address the new challenges raised by sentiment-aware applications, as compared to those that are already present in more traditional fact-based analysis. It includes material on summarization of evaluative text and on broader issues regarding privacy, manipulation, and economic impact that the development of opinion-oriented information-access services gives rise to. To facilitate future work, a discussion of available resources, benchmark datasets, and evaluation campaigns is also provided.

Ranade, Sarvesh et al. “Online debate summarization using topic directed sentiment analysis.” Proceedings of the Second International Workshop on Issues of Sentiment Discovery and Opinion Mining 11 Aug. 2013: 7. http://bit.ly/1nbKtLn

  • Social networking sites provide users a virtual community interaction platform to share their thoughts, life experiences and opinions. An online debate forum is one such platform, where people can take a stance and argue in support or opposition of debate topics. An important feature of such forums is that they are dynamic and grow rapidly. In such situations, effective opinion summarization approaches are needed so that readers need not go through the entire debate.
  • This paper aims to summarize online debates by extracting highly topic relevant and sentiment rich sentences. The proposed approach takes into account topic relevant, document relevant and sentiment based features to capture topic opinionated sentences. ROUGE (Recall-Oriented Understudy for Gisting Evaluation, which employs a set of metrics and a software package to compare an automatically produced summary or translation against human-produced ones) scores are used to evaluate the system. This system significantly outperforms several baseline systems and shows improvement over the state-of-the-art opinion summarization system. The results verify that topic directed sentiment features are most important to generate effective debate summaries.
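To make the ROUGE metric mentioned above more concrete, the following is a minimal sketch of ROUGE-N recall: the share of the reference summary's n-grams that also appear in the automatically produced summary, with repeated n-grams counted at most as often as they occur in both. It assumes simple whitespace tokenization and a single reference; the function name and the example sentences are invented here and are not taken from the paper or from the official ROUGE software package.

```python
from collections import Counter


def rouge_n_recall(candidate, reference, n=1):
    """Fraction of the reference's n-grams that the candidate also contains."""

    def ngrams(text):
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Clip each match at the number of times the n-gram occurs in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


reference = "the debate favors the proposal"
candidate = "the debate favors reform"
print(rouge_n_recall(candidate, reference, n=1))
print(rouge_n_recall(candidate, reference, n=2))
```

In this example three of the five reference unigrams and two of the four reference bigrams are matched. Published evaluations typically report several ROUGE variants and may apply stemming or use multiple references, which this sketch leaves out.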

Schneider, Jodi. “Automated argumentation mining to the rescue? Envisioning argumentation and decision-making support for debates in open online collaboration communities.” http://bit.ly/1mi7ztx

  • Argumentation mining, a relatively new area of discourse analysis, involves automatically identifying and structuring arguments. Following a basic introduction to argumentation, the authors describe a new possible domain for argumentation mining: debates in open online collaboration communities.
  • Based on their experience with manual annotation of arguments in debates, the authors propose argumentation mining as the basis for three kinds of support tools, for authoring more persuasive arguments, finding weaknesses in others’ arguments, and summarizing a debate’s overall conclusions.