Big data for government good: using analytics for policymaking


Kent Smetters in The Hill: “Big Data and analytics are driving advancements that touch nearly every part of our lives. From improving disaster relief efforts following a storm, to enhancing patient response to specific medications to criminal justice reform and real-time traffic reporting, Big Data is saving lives, reducing costs and improving productivity across the private and the public sector.

Yet when our elected officials draft policy they lack access to advanced data and analytics that would help them understand the economic implications of proposed legislation. Instead of using Big Data to inform and shape vital policy questions, Members of Congress typically don’t receive a detailed analysis of a bill until after it has been written, and after they have sought support for it. That’s when a policy typically undergoes a detailed budgetary analysis. And even then, these assessments often ignore the broader impact on jobs and the economy.

We must do better. Just as modern marketing firms use deep analytical tools to make smart business decisions, policymakers in Washington should similarly have access to modern tools for analyzing important policy questions.
Will Social Security be solvent for our grandchildren? How will changes to immigration policy influence the number of jobs and the GDP? How will tax reform impact the budget, economic growth and the income distribution? What is the impact of new investments in health care, education and roads? These are big questions that must be answered with reliable data and analysis while legislation is being written, not afterwards. The absence leaves us with ideology-driven partisanship.

Simply put, Washington needs better tools to evaluate these complex factors. Imagine the productive conversations we could have if we applied the kinds of tools that are commonplace in the business world to help Washington make more informed choices.

For example, with the help of a nonpartisan budget model from the Wharton School of the University of Pennsylvania, policymakers and the public can uncover some valuable—and even surprising—information about our choices surrounding Social Security, immigration and other issues.

By analyzing more than 4,000 different Social Security policy options, for example, the model projects that the Social Security Trust Fund will be depleted three years earlier than the Social Security Administration’s projections, barring any changes in current law. The tool’s projected shortfalls are larger than the SSA’s, in fact—because it takes into account how changes over time will affect the outcome. We also learn that many standard policy options fail to significantly move the Trust Fund exhaustion date, as these policies phase in too slowly or are too small. Securing Social Security, we now know, requires a range of policy combinations and potentially larger changes than we may have been considering.

Immigration policy, too, is an area where we could all benefit from greater understanding. The political left argues that legalizing undocumented workers will have a positive impact on jobs and the economy. The political right argues for just the opposite—deportation of undocumented workers—for many of the same reasons. But, it turns out, the numbers don’t offer much support to either side.

On one hand, legalization actually slightly reduces the number of jobs. The reason is simple: legal immigrants have better access to school and college, and they can spend more time looking for the best job match. However, because legal immigrants can gain more skills, the actual impact on GDP from legalization alone is basically a wash.

The other option being discussed, deportation, also reduces jobs, in this case because the number of native-born workers can’t rise enough to absorb the job losses caused by deportation. GDP also declines. Calculations based on 125 different immigration policy combinations show that increasing the total amount of legal immigrants—especially those with higher skills—is the most effective policy for increasing employment rates and GDP….(More)”

How Snapchat Is Recruiting Bone Marrow Donors


PSFK: “Every ten minutes, blood cancer takes a life away. One of the ways to treat this disease is through stem cell transplants that create healthy blood cells. Since 70% of patients who need a transplant cannot find a match in their family, they need to turn to outside donors. In an effort to increase the number of bone marrow donors in the registry, Be The Match turned to Snapchat to find male donors from the ages of 18 to 24.

The aim of the campaign “Be the Guy” is to release short videos on Snapchat of regular guys acting silly. This emphasizes the idea that literally anyone can save a life, no matter who, or how quirky, you are. With a swipe up, any Snapchat user will be directed to a form that makes it easy to sign up to be a donor. To complete the registration, all it takes is to receive a kit in the mail and mail back a swab from your cheek….(More)”

Against Elections: The Case for Democracy


Book by David Van Reybrouck: “Democracy is in bad health. Against Elections offers a new diagnosis and an ancient remedy. Fear-mongering populists, distrust in the establishment, personality contests instead of reasoned debate: these are the results of the latest elections.

In fact, as this ingenious book shows, the original purpose of elections was to exclude the people from power by appointing an elite to govern over them.

Yet for most of its 3000-year history, democracy did not involve elections at all: members of the public were appointed to positions in government through a combination of volunteering and lottery.

Based on studies and trials from around the globe, this hugely influential manifesto presents the practical case for a true democracy – one that actually works.

Urgent, heretical and completely convincing, Against Elections leaves only one question to be answered: what are we waiting for?…(More)”

Fine-grained dengue forecasting using telephone triage services


Nabeel Abdur Rehman et al at Science Advances: “Thousands of lives are lost every year in developing countries for failing to detect epidemics early because of the lack of real-time disease surveillance data. We present results from a large-scale deployment of a telephone triage service as a basis for dengue forecasting in Pakistan. Our system uses statistical analysis of dengue-related phone calls to accurately forecast suspected dengue cases 2 to 3 weeks ahead of time at a subcity level (correlation of up to 0.93). Our system has been operational at scale in Pakistan for the past 3 years and has received more than 300,000 phone calls. The predictions from our system are widely disseminated to public health officials and form a critical part of active government strategies for dengue containment. Our work is the first to demonstrate, with significant empirical evidence, that an accurate, location-specific disease forecasting system can be built using analysis of call volume data from a public health hotline….(More)”
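The core idea in the abstract, that hotline call volume leads reported cases by a few weeks, can be illustrated with a lagged correlation. The following is a minimal sketch under stated assumptions, not the authors' model: the function names (`pearson`, `lagged_correlation`, `best_lag`) and the weekly counts are hypothetical, and the published system uses a fuller statistical analysis than a single correlation.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (var_x * var_y) ** 0.5


def lagged_correlation(calls, cases, lag):
    """Correlate calls in week t with cases in week t + lag."""
    if len(calls) != len(cases):
        raise ValueError("series must have the same length")
    if lag < 1 or lag >= len(calls) - 1:
        raise ValueError("lag must leave at least two overlapping weeks")
    return pearson(calls[:-lag], cases[lag:])


def best_lag(calls, cases, max_lag):
    """Return the lead time (in weeks) at which calls track cases best."""
    return max(range(1, max_lag + 1),
               key=lambda lag: lagged_correlation(calls, cases, lag))


# Hypothetical weekly counts: cases echo call volume two weeks later.
calls = [1, 3, 2, 5, 8, 4, 2, 1, 6, 9]
cases = [0, 0] + [2 * c for c in calls[:-2]]
print(best_lag(calls, cases, max_lag=3))
```

In this toy data the cases are built as a scaled copy of the calls shifted by two weeks, so the two-week lag gives the strongest correlation. Real surveillance data would be far noisier and would need to be evaluated on held-out weeks.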

Two Laws On Expertise That Make Government Dumber


Beth Noveck in Forbes: “With the announcement of Microsoft’s acquisition of LinkedIn last week comes the prospect of new tech products that can help us visualize more than ever before about what we know and can do. But the buzz about what this might mean for our ability to find a job in the 21st century (and for privacy), obscures a tantalizing possibility for improving government.

Imagine if the Department of Health and Human Services needed to craft a new policy on hospitals. With better tools for automating the identification of expertise from our calendar, email, and document data (Microsoft), our education history and credentials (LinkedIn), and skills acquired from training (Lynda), it might become possible to match the demand for know-how about healthcare to the supply of those people who have worked in the sector, have degrees in public health, or who have demonstrated passion and know-how evident from their volunteer experience.

The technological possibility of matching people to public opportunities to participate in the life of our democracy in ways that relate to our competencies and interests is impeded, however, by two decades-old statutes that prohibit the federal government from taking advantage of the possibilities of technology to tap into the expertise of the American people to solve our hardest problems.

The Federal Advisory Committee Act of 1972 (FACA) and the Paperwork Reduction Act of 1980 (PRA) entrench the committee and consultation practices of an era before the Internet. They make it illegal for wider networks of more diverse people with innovative ideas to convene to help solve public problems, and they need to be updated for the 21st century….(More)”

Use of big data risks making some people uninsurable


Oliver Ralph at the Financial Times: “More sophisticated use of data could create an “underclass” of people who cannot afford insurance. According to a new report from the Chartered Insurance Institute, consumers could miss out on some types of cover altogether if insurers deem them too risky.

Big data are one of the insurance industry’s great hopes for the future. Established insurers and a host of start-ups are investing millions in new systems to better understand the information they hold about customers, and to collect more data. They hope that by better analysing the risks that each policyholder faces, they can not only price their products more accurately but also advise customers on how to avoid problems.

However, the CII paper warns that using data in this way threatens the concept of pooling risk on which the industry was founded.

“Data is a double-edged sword,” said David Thomson, director of policy and public affairs at the CII. “The insurance sector needs to be careful about moving away from pooled risk into individual pricing. They need to think about the broader public interest.”

The report says that the concept of pooling risk “underpins the effectiveness of insurance cover”.

It adds: “Some people may be identified as such high risk to insurers that they are priced out of insurance altogether. Big data could, in effect, create groups of ‘uninsurable’ people. While in some cases this may be to do with modifiable behaviour, like driving style, it could easily be due to factors that people can’t control, such as where they live, age, genetic conditions or health problems.”

The issue of genetic data is a particularly contentious one.

In theory, genetic data could be useful to insurers when deciding how to price life or health insurance. Because of the ethical questions this poses, an agreement signed in 2000 between the government and the Association of British Insurers stops the industry from using predictive genetic test results. The agreement runs until 2019, although a review is due this year.

“You could price people out of the market for health products. There’s a danger insurers will not offer health cover to some people. The government would intervene if people are doing social sorting,” said Mr Thomson.

Better use of data in other areas has already forced the government to act. Improved mapping and data analysis have allowed insurers to more accurately assess which homes and businesses run a high risk of flooding. Many people complained that the resulting prices made cover unaffordable for people living in areas at risk….(More)”

Is internet freedom a tool for democracy or authoritarianism?


In the Conversation: “The irony of internet freedom was on full display shortly after midnight July 16 in Turkey when President Erdogan used FaceTime and independent TV news to call for public resistance against the military coup that aimed to depose him.

In response, thousands of citizens took to the streets and aided the government in beating back the coup. The military plotters had taken over state TV. In this digital age they apparently didn’t realize television was no longer sufficient to ensure control over the message.

This story may appear like a triumphant example of the internet promoting democracy over authoritarianism.

Not so fast….This duality of the internet, as a tool to promote democracy or authoritarianism, or simultaneously both, is a complex puzzle.

The U.S. has made increasing internet access around the world a foreign policy priority. This policy was supported by both Secretaries of State John Kerry and Hillary Clinton.

The U.S. State Department has allocated tens of millions of dollars to promote internet freedom, primarily in the area of censorship circumvention. And just this month, the United Nations Human Rights Council passed a resolution declaring internet freedom a fundamental human right. The resolution condemns internet shutdowns by national governments, an act that has become increasingly common in a variety of countries across the globe, including Turkey, Brazil, India and Uganda.

On the surface, this policy makes sense. The internet is an intuitive boon for democracy. It provides citizens around the world with greater freedom of expression, opportunities for civil society, education and political participation. And previous research, including our own, has been optimistic about the internet’s democratic potential.

However, this optimism is based on the assumption that citizens who gain internet access use it to expose themselves to new information, engage in political discussions, join social media groups that advocate for worthy causes and read news stories that change their outlook on the world.

And some do.

But others watch Netflix. They use the internet to post selfies to an intimate group of friends. They gain access to an infinite stream of music, movies and television shows. They spend hours playing video games.

However, our recent research shows that tuning out from politics and immersing oneself in online spectacle has political consequences for the health of democracy….Political use of the internet ranks very low globally, compared to other uses. Research has found that just 9 percent of internet users posted links to political news and only 10 percent posted their own thoughts about political or social issues. In contrast, almost three-quarters (72 percent) say they post about movies and music, and over half (54 percent) also say they post about sports online.

This inspired our study, which sought to show how the internet does not necessarily serve as democracy’s magical solution. Instead, its democratic potential is highly dependent on how citizens choose to use it….

Ensuring citizens have access to the internet is not sufficient to ensure democracy and human rights. In fact, internet access may negatively impact democracy if exploited for authoritarian gain.

The U.S. government, NGOs and other democracy advocates have invested a great deal of time and resources toward promoting internet access, fighting overt online censorship and creating circumvention technologies. Yet their success, at best, has been limited.

The reason is twofold. First, authoritarian governments have adapted their own strategies in response. Second, the “if we build it, they will come” philosophy underlying a great deal of internet freedom promotion doesn’t take into account basic human psychology in which entertainment choices are preferred over news and attitudes toward the internet determine its use, not the technology itself.

Allies in the internet freedom fight should realize that the locus of the fight has shifted. Greater efforts must be put toward tearing down “psychological firewalls,” building demand for internet freedom and influencing citizens to employ the internet’s democratic potential.

Doing so ensures that the democratic online toolkit is a match for the authoritarian one….(More)”

Big health data: the need to earn public trust


Tjeerd-Pieter van Staa et al in the BMJ: “Better use of large scale health data has the potential to benefit patient care, public health, and research. The handling of such data, however, raises concerns about patient privacy, even when the risks of disclosure are extremely small.

The problems are illustrated by recent English initiatives trying to aggregate and improve the accessibility of routinely collected healthcare and related records, sometimes loosely referred to as “big data.” One such initiative, care.data, was set to link and provide access to health and social care information from different settings, including primary care, to facilitate the planning and provision of healthcare and to advance health science. Data were to be extracted from all primary care practices in England. A related initiative, the Clinical Practice Research Datalink (CPRD), evolved from the General Practice Research Database (GPRD). CPRD was intended to build on GPRD by linking patients’ primary care records to hospital data, around 50 disease registries and clinical audits, genetic information from UK Biobank, and even the loyalty cards of a large supermarket chain, creating an integrated data repository and linked services for all of England that could be sold to universities, drug companies, and non-healthcare industries. Care.data has now been abandoned and CPRD has stalled. The flawed implementation of care.data plus earlier examples of data mismanagement have made privacy issues a mainstream public concern. We look at what went wrong and how future initiatives might gain public support….(More)”

The big health data sale


Philip Hunter at the EMBO Journal: “Personal health and medical data are a valuable commodity for a number of sectors from public health agencies to academic researchers to pharmaceutical companies. Moreover, “big data” companies are increasingly interested in tapping into this resource. One such firm is Google, whose subsidiary DeepMind was granted access to medical records on 1.6 million patients who had been treated at some time by three major hospitals in London, UK, in order to develop a diagnostic app. The public discussion it raised was just another sign of the long-running tensions between drug companies, privacy advocates, regulators, legislators, insurers and patients about privacy, consent, rights of access and ownership of medical data that is generated in pharmacies, hospitals and doctors’ surgeries. In addition, the rapid growth of eHealth will add a boon of even more health data from mobile phones, portable diagnostic devices and other sources.

These developments are driving efforts to create a legal framework for protecting confidentiality, controlling communication and governing access rights to data. Existing data protection and human rights laws are being modified to account for personal medical and health data in parallel to the campaign for greater transparency and access to clinical trial data. Healthcare agencies in particular will have to revise their procedures for handling medical or research data that is associated with patients.

Google’s foray into medical data demonstrates the key role of health agencies, in this case the Royal Free NHS Trust, which operates the three London hospitals that granted DeepMind access to patient data. Royal Free approached DeepMind with a request to develop an app for detecting acute kidney injury, which, according to the Trust, affects more than one in six inpatients….(More)”

How Twitter gives scientists a window into human happiness and health


At the Conversation: “Since its public launch 10 years ago, Twitter has been used as a social networking platform among friends, an instant messaging service for smartphone users and a promotional tool for corporations and politicians.

But it’s also been an invaluable source of data for researchers and scientists – like myself – who want to study how humans feel and function within complex social systems.

By analyzing tweets, we’ve been able to observe and collect data on the social interactions of millions of people “in the wild,” outside of controlled laboratory experiments.

It’s enabled us to develop tools for monitoring the collective emotions of large populations, find the happiest places in the United States and much more.

So how, exactly, did Twitter become such a unique resource for computational social scientists? And what has it allowed us to discover?

Twitter’s biggest gift to researchers

On July 15, 2006, Twttr (as it was then known) publicly launched as a “mobile service that helps groups of friends bounce random thoughts around with SMS.” The ability to send free 140-character group texts drove many early adopters (myself included) to use the platform.

With time, the number of users exploded: from 20 million in 2009 to 200 million in 2012 and 310 million today. Rather than communicating directly with friends, users would simply tell their followers how they felt, respond to news positively or negatively, or crack jokes.

For researchers, Twitter’s biggest gift has been the provision of large quantities of open data. Twitter was one of the first major social networks to provide data samples through something called Application Programming Interfaces (APIs), which enable researchers to query Twitter for specific types of tweets (e.g., tweets that contain certain words), as well as information on users.

This led to an explosion of research projects exploiting this data. Today, a Google Scholar search for “Twitter” produces six million hits, compared with five million for “Facebook.” The difference is especially striking given that Facebook has roughly five times as many users as Twitter (and is two years older).

Twitter’s generous data policy undoubtedly led to some excellent free publicity for the company, as interesting scientific studies got picked up by the mainstream media.

Studying happiness and health

With traditional census data slow and expensive to collect, open data feeds like Twitter have the potential to provide a real-time window to see changes in large populations.

The University of Vermont’s Computational Story Lab was founded in 2006 and studies problems across applied mathematics, sociology and physics. Since 2008, the Story Lab has collected billions of tweets through Twitter’s “Gardenhose” feed, an API that streams a random sample of 10 percent of all public tweets in real time.

I spent three years at the Computational Story Lab and was lucky to be a part of many interesting studies using this data. For example, we developed a hedonometer that measures the happiness of the Twittersphere in real time. By focusing on geolocated tweets sent from smartphones, we were able to map the happiest places in the United States. Perhaps unsurprisingly, we found Hawaii to be the happiest state and wine-growing Napa the happiest city for 2013.

A map of 13 million geolocated U.S. tweets from 2013, colored by happiness, with red indicating happiness and blue indicating sadness. PLOS ONE, Author provided

These studies had deeper applications: Correlating Twitter word usage with demographics helped us understand underlying socioeconomic patterns in cities. For example, we could link word usage with health factors like obesity, so we built a lexicocalorimeter to measure the “caloric content” of social media posts. Tweets from a particular region that mentioned high-calorie foods increased the “caloric content” of that region, while tweets that mentioned exercise activities decreased our metric. We found that this simple measure correlates with other health and well-being metrics. In other words, tweets were able to give us a snapshot, at a specific moment in time, of the overall health of a city or region.

Using the richness of Twitter data, we’ve also been able to see people’s daily movement patterns in unprecedented detail. Understanding human mobility patterns, in turn, has the capacity to transform disease modeling, opening up the new field of digital epidemiology….(More)”
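The lexicocalorimeter described in the excerpt, where mentions of high-calorie foods raise a region's score and mentions of exercise lower it, can be sketched in a few lines. This is a toy illustration only, not the Computational Story Lab's actual method: the word lists, the per-word values, and the names `FOOD_CALORIES`, `ACTIVITY_BURN` and `caloric_balance` are all made up for the example.

```python
import re

# Hypothetical lookup tables; the values are invented for illustration.
FOOD_CALORIES = {"pizza": 285.0, "donut": 250.0, "salad": 20.0}
ACTIVITY_BURN = {"running": 600.0, "walking": 250.0, "yoga": 180.0}


def tokenize(text):
    """Lowercase a tweet and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def caloric_balance(tweets):
    """Sum food scores minus activity scores over a region's tweets."""
    eaten = 0.0
    burned = 0.0
    for tweet in tweets:
        for word in tokenize(tweet):
            eaten += FOOD_CALORIES.get(word, 0.0)
            burned += ACTIVITY_BURN.get(word, 0.0)
    return eaten - burned


# Hypothetical tweets from one region.
region_tweets = ["Pizza and a donut!", "went running this morning"]
print(caloric_balance(region_tweets))
```

A higher score means the region's tweets lean toward food mentions and a lower score toward exercise mentions. A realistic version would normalize by tweet volume and use much larger, empirically derived word lists before comparing regions.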