What does Big Data mean to public affairs research?


Ines Mergel, R. Karl Rethemeyer, and Kimberley R. Isett at LSE’s The Impact Blog: “…Big Data promises access to vast amounts of real-time information from public and private sources that should allow insights into behavioral preferences, policy options, and methods for public service improvement. In the private sector, marketing preferences can be aligned with customer insights gleaned from Big Data. In the public sector however, government agencies are less responsive and agile in their real-time interactions by design – instead using time for deliberation to respond to broader public goods. The responsiveness Big Data promises is a virtue in the private sector but could be a vice in the public.

Moreover, we raise several important concerns with respect to relying on Big Data as a decision and policymaking tool. While in the abstract Big Data is comprehensive and complete, in practice today’s version of Big Data has several features that should give public sector practitioners and scholars pause. First, most of what we think of as Big Data is really ‘digital exhaust’ – that is, data collected for purposes other than public sector operations or research. Data sets that might be publicly available from social networking sites such as Facebook or Twitter were designed for purely technical reasons. The degree to which this data lines up conceptually and operationally with public sector questions is purely coincidental. Use of digital exhaust for purposes not previously envisioned can go awry. A good example is Google’s attempt to predict the flu based on search terms.

Second, we believe there are ethical issues that may arise when researchers use data that was created as a byproduct of citizens’ interactions with each other or with a government social media account. Citizens are not able to understand or control how their data is used and have not given consent for storage and re-use of their data. We believe that research institutions need to examine their institutional review board processes to help researchers and their subjects understand important privacy issues that may arise. Too often it is possible to infer individual-level insights about private citizens from a combination of data points and thus predict their behaviors or choices.

Lastly, Big Data can only represent those that spend some part of their life online. Yet we know that certain segments of society opt in to life online (by using social media or network-connected devices), opt out (either knowingly or passively), or lack the resources to participate at all. The demography of the internet matters. For instance, researchers tend to use Twitter data because its API allows data collection for research purposes, but many forget that Twitter users are not representative of the overall population. Instead, as a recent Pew Social Media 2016 update shows, only 24% of all online adults use Twitter. Internet participation generally is biased in terms of age, educational attainment, and income – all of which correlate with gender, race, and ethnicity. We believe therefore that predictive insights are potentially biased toward certain parts of the population, making generalisations highly problematic at this time….(More)”

Toward Evidence-Based Open Governance by Curating and Exchanging Research: OGRX 2.0


Andrew Young and Stefaan Verhulst at OGRX: “The Open Governance Research Exchange (OGRX) is a platform that seeks to identify, collect and share curated insights on new ways of solving public problems. It was created last year by the GovLab, World Bank Digital Engagement Evaluation Team and mySociety. Today, while more than 3000 representatives from more than 70 countries are gathering in Paris for the Open Government Partnership Summit, we are launching OGRX 2.0 with new features and functionalities to further help users identify the signal in the noise of research and evidence on more innovative means of governing….

What is new?

First, the new OGRX Blog provides an outlet for more easily digestible and shareable insights from the open governance research community. OGRX currently features over 600 publications on governance innovation – but how to digest and identify insights? This space will provide summaries of important works, analyses of key trends in the field of research, and guest posts from researchers working at the leading edge of governance innovation across regions and domains. Check back often to stay on top of what’s new in open governance research.

Second, the new OGRX Selected Readings series offers curated reading lists from well-known experts in open governance. These Selected Readings will give readers a sense of how to jumpstart their knowledge by focusing on those publications that have been curated by those in the know about the topics at hand. Today we are launching this new series with the Selected Readings on Civic Technology, curated by mySociety’s head of research Rebecca Rumbul; and the Selected Readings on Policy Informatics, curated by Erik Johnston of the MacArthur Foundation Research Network on Opening Governance and director of the Arizona State University Center for Policy Informatics. New Selected Readings will be posted each month, so check back often!…Watch this space and #OGRX to stay abreast of new developments….”

How the Circle Line rogue train was caught with data


Daniel Sim at the Data.gov.sg Blog: “Singapore’s MRT Circle Line was hit by a spate of mysterious disruptions in recent months, causing much confusion and distress to thousands of commuters.

Like most of my colleagues, I take a train on the Circle Line to my office at one-north every morning. So on November 5, when my team was given the chance to investigate the cause, I volunteered without hesitation.

From prior investigations by train operator SMRT and the Land Transport Authority (LTA), we already knew that the incidents were caused by some form of signal interference, which led to loss of signals in some trains. The signal loss would trigger the emergency brake safety feature in those trains and cause them to stop randomly along the tracks.

But the incidents — which first happened in August — seemed to occur at random, making it difficult for the investigation team to pinpoint the exact cause.

We were given a dataset compiled by SMRT that contained the following information:

  • Date and time of each incident
  • Location of incident
  • ID of train involved
  • Direction of train…

LTA and SMRT eventually published a joint press release on November 11 to share the findings with the public….

When we first started, my colleagues and I were hoping to find patterns that may be of interest to the cross-agency investigation team, which included many officers at LTA, SMRT and DSTA. The tidy incident logs provided by SMRT and LTA were instrumental in getting us off to a good start, as minimal cleaning up was required before we could import and analyse the data. We were also gratified by the effective follow-up investigations by LTA and DSTA that confirmed the hardware problems on PV46.

From the data science perspective, we were lucky that incidents happened so close to one another. That allowed us to identify both the problem and the culprit in such a short time. If the incidents were more isolated, the zigzag pattern would have been less apparent, and it would have taken us more time — and data — to solve the mystery….(More).”

Open Data Collection (PLOS)


Daniella Lowenberg, Amy Ross, Emma Ganley at PLOS: “In the spirit of Open Con and highlighting the state of Open Data, PLOS is proud to release our Open Data Collection. The many values of Open Data are becoming increasingly apparent, and as a result, we are seeing an adoption of Open Data policies across publishers, funders and organizations. Open Data has proven a fantastic tool to help evaluate the replicability of published research, and even politicians are taking a stand in favor of Open Data as a mechanism to advance science rapidly. In March of 2014, PLOS updated our Data Policy to reflect the need for the underlying data to be as open as the paper itself resulting in complete transparency of the research. Two-and-a-half years later, we have seen over 60,000 published papers with open data sets and an increase in submissions reflecting open data practices and policies….

To create this Open Data Collection, we exhaustively searched for relevant articles published across PLOS that discuss open data in some way. Then, in collaboration with our external advisor, Melissa Haendel, we have selected 26 of those articles with the aim to highlight a broad scope of research articles, guidelines, and commentaries about data sharing, data practices, and data policies from different research fields. Melissa has written an engaging blog post detailing the rubric and reasons behind her selections….(More)”

Transforming government through digitization


Bjarne Corydon, Vidhya Ganesan, and Martin Lundqvist at McKinsey: “By digitizing processes and making organizational changes, governments can enhance services, save money, and improve citizens’ quality of life.

As companies have transformed themselves with digital technologies, people are calling on governments to follow suit. By digitizing, governments can provide services that meet the evolving expectations of citizens and businesses, even in a period of tight budgets and increasingly complex challenges. Our estimates suggest that government digitization, using current technology, could generate over $1 trillion annually worldwide.

Digitizing a government requires attention to two major considerations: the core capabilities for engaging citizens and businesses, and the organizational enablers that support those capabilities (exhibit). These make up a framework for setting digital priorities. In this article, we look at the capabilities and enablers in this framework, along with guidelines and real-world examples to help governments seize the opportunities that digitization offers.

A digital government has core capabilities supported by organizational enablers.

Governments typically center their digitization efforts on four capabilities: services, processes, decisions, and data sharing. For each, we believe there is a natural progression from quick wins to transformative efforts….(More)”

See also: Digital by default: A guide to transforming government (PDF–474KB) and “Never underestimate the importance of good government,” a New at McKinsey blog post with coauthor Bjarne Corydon, director of the McKinsey Center for Government.

Beyond nudging: it’s time for a second generation of behaviourally-informed social policy


Katherine Curchin at LSE Blog: “…behavioural scientists are calling for a second generation of behaviourally-informed policy. In some policy areas, nudges simply aren’t enough. Behavioural research shows stronger action is required to attack the underlying cause of problems. For example, many scholars have argued that behavioural insights provide a rationale for regulation to protect consumers from manipulation by private sector companies. But what might a second generation of behaviourally-informed social policy look like?

Behavioural insights could provide a justification to change the trajectory of income support policy. Since the 1990s policy attention has focused on the moral character of benefits recipients. Inspired by Lawrence Mead’s paternalist philosophy, governments have tried to increase the resolve of the unemployed to work their way out of poverty. More and more behavioural requirements have been attached to benefits to motivate people to fulfil their obligations to society.

But behavioural research now suggests that these harsh policies are misguided. Behavioural science supports the idea that people often make poor decisions and do things which are not in their long term interests. But the weakness of individuals’ moral constitution isn’t so much the problem as the unequal distribution of opportunities in society. There are circumstances in which humans are unlikely to flourish no matter how motivated they are.

Normal human psychological limitations – our limited cognitive capacity, limited attention and limited self-control – interact with environment to produce the behaviour that advocates of harsh welfare regimes attribute to permissive welfare. In their book Scarcity, Sendhil Mullainathan and Eldar Shafir argue that the experience of deprivation creates a mindset that makes it harder to process information, pay attention, make good decisions, plan for the future, and resist temptations.

Importantly, behavioural scientists have demonstrated that this mindset can be temporarily created in the laboratory by placing subjects in artificial situations which induce the feeling of not having enough. As a consequence, experimental subjects from middle-class backgrounds suddenly display the short-term thinking and irrational decision making often attributed to a culture of poverty.

Tying inadequate income support to a list of behavioural conditions will most punish those who are suffering most. Empirical studies of welfare conditionality have found that benefit claimants often do not comprehend the complicated rules that apply to them. Some are being punished for lack of understanding rather than deliberate non-compliance.

Behavioural insights can be used to mount a case for a more generous, less punitive approach to income support. The starting point is to acknowledge that some of Mead’s psychological assumptions have turned out to be wrong. The nature of the cognitive machinery humans share imposes limits on how self-disciplined and conscientious we can reasonably expect people living in adverse circumstances to be. We have placed too much emphasis on personal responsibility in recent decades. Why should people internalize the consequences of their behaviour when this behaviour is to a large extent the product of their environment?…(More)”

Crowdsourcing Gun Violence Research


Penn Engineering: “Gun violence is often described as an epidemic, but as visible and shocking as shooting incidents are, epidemiologists who study that particular source of mortality have a hard time tracking them. The Centers for Disease Control is prohibited by federal law from conducting gun violence research, so there is little in the way of centralized infrastructure to monitor where, how, when, why and to whom shootings occur.

Chris Callison-Burch, Aravind K. Joshi Term Assistant Professor in Computer and Information Science, and graduate student Ellie Pavlick are working to solve this problem.

They have developed the Gun Violence Database, which combines machine learning and crowdsourcing techniques to produce a national registry of shooting incidents. Callison-Burch and Pavlick’s algorithm scans thousands of articles from local newspaper and television stations, determines which are about gun violence, then asks everyday people to pull out vital statistics from those articles, compiling that information into a unified, open database.

For natural language processing experts like Callison-Burch and Pavlick, the most exciting prospect of this effort is that it is training computer systems to do this kind of analysis automatically. They recently presented their work on that front at Bloomberg’s Data for Good Exchange conference.

The Gun Violence Database project started in 2014, when it became the centerpiece of Callison-Burch’s “Crowdsourcing and Human Computation” class. There, Pavlick developed a series of homework assignments that challenged undergraduates to develop a classifier that could tell whether a given news article was about a shooting incident.
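[Editor's illustration, not part of the Penn Engineering article.] To give a sense of what such a classifier might look like, here is a minimal sketch of a bag-of-words Naive Bayes text classifier. It is not the researchers' or the students' actual method, which the article does not describe; the class name, function names, and the six toy headlines are all hypothetical and exist only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical toy training set: (headline, is_about_shooting).
# These headlines are invented for illustration, not real data.
TOY_ARTICLES = [
    ("Police say two people were shot outside a convenience store", True),
    ("Man wounded in shooting after gunfire erupts near park", True),
    ("Teen killed by gunshot in late night shooting", True),
    ("City council approves budget for new library", False),
    ("Local bakery wins award for best sourdough bread", False),
    ("High school team wins state soccer championship", False),
]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesTextClassifier:
    """Two-class multinomial Naive Bayes with add-one smoothing."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = Counter()
        self.vocab = set()

    def train(self, labeled_docs):
        for text, label in labeled_docs:
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocab.update(tokens)

    def log_score(self, text, label):
        # Log prior: share of training documents carrying this label.
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        # Log likelihood of each known word, smoothed so that a word
        # unseen in one class does not zero out the whole score.
        denom = sum(self.word_counts[label].values()) + len(self.vocab)
        for token in tokenize(text):
            if token in self.vocab:  # ignore words never seen in training
                score += math.log((self.word_counts[label][token] + 1) / denom)
        return score

    def is_about_shooting(self, text):
        return self.log_score(text, True) > self.log_score(text, False)


classifier = NaiveBayesTextClassifier()
classifier.train(TOY_ARTICLES)
print(classifier.is_about_shooting("Woman shot during robbery, police say"))
print(classifier.is_about_shooting("Library budget approved by city council"))
```

A real system would need far more training data and a held-out test set. In the pipeline the article describes, articles flagged by a classifier are then passed to human annotators, so classifier errors can still be caught downstream.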

“It allowed us to teach the things we want students to learn about data science and natural language processing, while giving them the motivation to do a project that could contribute to the greater good,” says Callison-Burch.

The articles students used to train their classifiers were sourced from “The Gun Report,” a daily blog from New York Times reporters that attempted to catalog shootings from around the country in the wake of the Sandy Hook massacre. Realizing that their algorithmic approach could be scaled up to automate what the Times’ reporters were attempting, the researchers began exploring how such a database could work. They consulted with Douglas Wiebe, an Associate Professor of Epidemiology in Biostatistics and Epidemiology in the Perelman School of Medicine, to learn more about what kind of information public health researchers needed to better study gun violence on a societal scale.

From there, the researchers enlisted people to annotate the articles their classifier found, connecting with them through Mechanical Turk, Amazon’s crowdsourcing platform, and their own website, http://gun-violence.org/…(More)”

For Better Citizenship, Scratch and Win


Tina Rosenberg in the New York Times: “China, with its largely cash economy, has a huge problem with tax evasion. Not just grand tax evasion, but the everyday “no receipt, please” kind, even though there have been harsh penalties: Before 2011, some forms of tax evasion were even punishable by death.

The country needed a different approach. So what did it do to get people to pay sales tax?
A. Hired a force of inspectors to raid restaurants and stores to catch people skipping the receipt, accompanied by big fines and prison terms.
B. Started an “It’s a citizen’s duty to denounce” exhortation campaign.
C. Installed cameras to photograph every transaction.
D. Turned receipts into scratch-off lottery games.

One of these things is not like the other, and that’s the answer: D. Instead of punishing under-the-table transactions, China wisely decided to encourage legal transactions by starting a receipt lottery. Many places have done this — Brazil, Chile, Malta, Portugal, Slovakia and Taiwan, among others. In Taiwan, for example, every month the tax authorities post lottery numbers; match a few numbers for a small prize, or all of them to win more than $300,000.

China took it further. Customers need not store their receipts and wait until the end of the month to see if they’ve won money. Gratification is instant: Each receipt, known as a fapiao, is a scratch-off lottery ticket. People still game the system, but much less. The fapiao system has greatly raised collections of sales tax, business income tax and total tax. And it’s cheap to administer: one study found that new tax revenue totaled 30 times (PDF) the cost of the lottery prizes.

When a receipt is a lottery ticket, people ask for a receipt. They hope to get money, but just as important, they like to play games. Those axioms apply around the globe.

“We have groups that say: we can give out an incentive to our customers worth $15,” said Aron Ezra, chief executive of OfferCraft, an American company that designs games for businesses. “They could do that and have everyone get an incentive for $15. But they’d get better results for the same average price by having variability — some get $10, some get $100.” The lottery makes it exciting.
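[Editor's illustration, not part of the New York Times article.] The arithmetic behind Ezra's point can be sketched as follows, assuming a simple two-prize scheme that uses only the $10 and $100 amounts he mentions. The function name is hypothetical, and real incentive programs may use more prize tiers.

```python
from fractions import Fraction


def share_receiving_small_prize(small, large, target_mean):
    """Solve p*small + (1 - p)*large = target_mean for p.

    p is the share of customers who receive the small prize.
    """
    return Fraction(large - target_mean, large - small)


# Assumed two-prize scheme: $10 or $100, averaging $15 per customer.
p_small = share_receiving_small_prize(small=10, large=100, target_mean=15)
p_large = 1 - p_small
average_cost = p_small * 10 + p_large * 100

print(f"Share receiving $10:  {p_small}")
print(f"Share receiving $100: {p_large}")
print(f"Average cost per customer: ${average_cost}")
```

Under these assumptions, 17 of every 18 customers would receive $10 and one would receive $100, for a total of $270 across 18 customers, the same as giving each of them $15.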

The huge popularity of lotteries shows this. Another example is the Save to Win program, which credit unions are using in seven states. Microscopic interest rates weren’t enough to get low-income customers to save. So instead, for every $25 they put into a savings account, depositors get one lottery entry. They can win a grand prize — in some states, $10,000 — or $100 prizes every month.

What else could lotteries do?

Los Angeles and Philadelphia have been the sites of experiments to increase dismal voter turnout in local elections by choosing a voter at random to win a large cash prize. In May 2015, the Southwest Voter Registration Education Project in Los Angeles offered $25,000 to a random voter in one district during a school board election, in a project named Voteria.

Health-related lotteries aren’t new. In 1957, Glasgow held a mass X-ray campaign to diagnose tuberculosis. Health officials aimed to X-ray 250,000 people and in the end got three times that many. One reason for the enthusiasm: a weekly prize draw. A lovely vintage newsreel reported on the campaign.

More than 50 years later, researchers set up a lottery among young adults in Lesotho, designed to promote safe sex practices. Every four months the subjects were tested for two sexually transmitted diseases, syphilis and trichomoniasis. A negative test got them entered into a lottery to win either $50 (equivalent to a week’s average salary) or $100. The idea was to see if incentives to reduce the spread of syphilis would also protect against HIV.

The results were significant — a 21.4 percent reduction in the rate of new H.I.V. infections, and a 3.4 percent lower prevalence rate of HIV in the treatment group after two years. And the effect was lasting — the gains persisted a year after the experiment ended. The lottery worked in large part because it was most attractive to those most at risk: many people who take sexual risks also enjoy taking monetary risks, and might be eager to play a lottery.

The authors wrote in a blog post: “To the best of our knowledge, this is the first H.I.V. prevention intervention focusing on sexual behavior changes (as opposed to medical interventions) to have been demonstrated to lead to a significant reduction in H.I.V. incidence, the ultimate objective of any H.I.V. prevention intervention.”…(More)”

Designing the Next Generation of Open Data Policy


Andrew Young and Stefaan Verhulst at the Open Data Charter Blog: “The international Open Data Charter has emerged from the global open data community as a galvanizing document to place open government data directly in the hands of citizens and organizations. To drive this process forward, and ensure that the outcomes are both systemic and transformational, new open data policy needs to be based on evidence of how and when open data works in practice. To support this work, the GovLab, in collaboration with Omidyar Network, has recently completed research which provides vital evidence of open data projects around the world, including an analysis of 19 in-depth, impact-focused case studies and a key findings paper. All of the research is now available in an eBook published by O’Reilly Media.

The research found that open data is making an impact in four core ways, including:…(More)”

Living in the World of Both/And


Essay by Adene Sacks & Heather McLeod Grant in SSIR: “In 2011, New York Times data scientist Jake Porway wrote a blog post lamenting the fact that most data scientists spend their days creating apps to help users find restaurants, TV shows, or parking spots, rather than addressing complicated social issues like helping identify which teens are at risk of suicide or creating a poverty index of Africa using satellite data.

That post hit a nerve. Data scientists around the world began clamoring for opportunities to “do good with data.” Porway—at the center of this storm—began to convene these scientists and connect them to nonprofits via hackathon-style events called DataDives, designed to solve big social and environmental problems. There was so much interest, he eventually quit his day job at the Times and created the organization DataKind to steward this growing global network of data science do-gooders.

At the same time, in the same city, another movement was taking shape—#GivingTuesday, an annual global giving event fueled by social media. In just five years, #GivingTuesday has reshaped how nonprofits think about fundraising and how donors give. And yet, many don’t know that 92nd Street Y (92Y)—a 140-year-old Jewish community and cultural center in Manhattan, better known for its star-studded speaker series, summer camps, and water aerobics classes—launched it.

What do these two examples have in common? One started as a loose global network that engaged data scientists in solving problems, and then became an organization to help support the larger movement. The other started with a legacy organization, based at a single site, and catalyzed a global movement that has reshaped how we think about philanthropy. In both cases, the founding groups have incorporated the best of both organizations and networks.

Much has been written about the virtues of thinking and acting collectively to solve seemingly intractable challenges. Nonprofit leaders are being implored to put mission above brand, build networks not just programs, and prioritize collaboration over individual interests. And yet, these strategies are often in direct contradiction to the conventional wisdom of organization-building: differentiating your brand, developing unique expertise, and growing a loyal donor base.

A similar tension is emerging among network and movement leaders. These leaders spend their days steering the messy process required to connect, align, and channel the collective efforts of diverse stakeholders. It’s not always easy: Those searching to sustain movements often cite the lost momentum of the Occupy movement as a cautionary note. Increasingly, network leaders are looking at how to adapt the process, structure, and operational expertise more traditionally associated with organizations to their needs—but without co-opting or diminishing the energy and momentum of their self-organizing networks…

Welcome to the World of “Both/And”

Today’s social change leaders—be they from business, government, or nonprofits—must learn to straddle the leadership mindsets and practices of both networks and organizations, and know when to use which approach. Leaders like Porway, and Henry Timms and Asha Curran of 92Y can help show us the way.

How do these leaders work with the “both/and” mindset?

First, they understand and leverage the strengths of both organizations and networks—and anticipate their limitations. As Timms describes it, leaders need to be “bilingual” and embrace what he has called “new power.” Networks can be powerful generators of new talent or innovation around complex multi-sector challenges. It’s useful to take a network approach when innovating new ideas, mobilizing and engaging others in the work, or wanting to expand reach and scale quickly. However, networks can dissipate easily without specific “handrails,” or some structure to guide and support their work. This is where they need some help from the organizational mindset and approach.

On the flip side, organizations are good at creating centralized structures to deliver products or services, manage risk, oversee quality control, and coordinate concrete functions like communications or fundraising. However, often that efficiency and effectiveness can calcify over time, becoming a barrier to new ideas and growth opportunities. When organizational boundaries are too rigid, it is difficult to engage the outside world in ideating or mobilizing on an issue. This is when organizations need an infusion of the “network mindset.”

 

…(More)