Health Big Data in the Commercial Context


CDT Press Release: “This paper is the third in a series of three, each of which explores health big data in a different context. The first — on health big data in the government context — is available here, and the second — on health big data in the clinical context — is available here.

Consumers are increasingly using mobile phone apps and wearable devices to generate and share data on health and wellness. They are using personal health record tools to access and copy health records and move them to third party platforms. They are sharing health information on social networking sites. They leave digital health footprints when they conduct online searches for health information. The health data created, accessed, and shared by consumers using these and many other tools can range from detailed clinical information, such as downloads from an implantable device and details about medication regimens, to data about weight, caloric intake, and exercise logged with a smart phone app.

These developments offer a wealth of opportunities for health care and personal wellness. However, privacy questions arise due to the volume and sensitivity of health data generated by consumer-focused apps, devices, and platforms, including the potential analytics uses that can be made of such data.

Many of the privacy issues that face traditional health care entities in the big data era also apply to app developers, wearable device manufacturers, and other entities not part of the traditional health care ecosystem. These include questions of data minimization, retention, and secondary use. Notice and consent pose challenges, especially given the limits of presenting notices on mobile device screens, and the fact that consumer devices may be bought and used without consultation with a health care professional. Security is a critical issue as well.

However, the privacy and security provisions of the Health Insurance Portability and Accountability Act (HIPAA) do not apply to most app developers, device manufacturers or others in the consumer health space. This has benefits to innovation, as innovators would otherwise have to struggle with the complicated HIPAA rules. However, the current vacuum also leaves innovators without clear guidance on how to appropriately and effectively protect consumers’ health data. Given the promise of health apps, consumer devices, and consumer-facing services, and given the sensitivity of the data that they collect and share, it is important to provide such guidance….

As the source of privacy guidelines, we look to the framework provided by the Fair Information Practice Principles (FIPPs) and explore how it could be applied in an age of big data to patient-generated data. The FIPPs have influenced to varying degrees most modern data privacy regimes. While some have questioned the continued validity of the FIPPs in the current era of mass data collection and analysis, we consider here how the flexibility and rigor of the FIPPs provide an organizing framework for responsible data governance, promoting innovation, efficiency, and knowledge production while also protecting privacy. Rather than proposing an entirely new framework for big data, which could be years in the making at best, using the FIPPs would seem the best approach in promoting responsible big data practices. Applying the FIPPs could also help synchronize practices between the traditional health sector and emerging consumer products….(More)”

Does Twitter Increase Perceived Police Legitimacy?


Paper by Stephan G. Grimmelikhuijsen and Albert J. Meijer in Public Administration Review: “Social media use has become increasingly popular among police forces. The literature suggests that social media use can increase perceived police legitimacy by enabling transparency and participation. Employing data from a large and representative survey of Dutch citizens (N = 4,492), this article tests whether and how social media use affects perceived legitimacy for a major social media platform, Twitter. A negligible number of citizens engage online with the police, and thus the findings reveal no positive relationship between participation and perceived legitimacy. The article shows that by enhancing transparency, Twitter does increase perceived police legitimacy, albeit to a limited extent. Subsequent analysis of the mechanism shows both an affective and a cognitive path from social media use to legitimacy. Overall, the findings suggest that establishing a direct channel with citizens and using it to communicate successes does help the police strengthen their legitimacy, but only slightly and for a small group of interested citizens….(More)”

How Crowdsourcing And Machine Learning Will Change The Way We Design Cities


Shaunacy Ferro at FastCompany: “In 2011, researchers at the MIT Media Lab debuted Place Pulse, a website that served as a kind of “hot or not” for cities. Given two Google Street View images culled from a select few cities including New York City and Boston, the site asked users to click on the one that seemed safer, more affluent, or more unique. The result was an empirical way to measure urban aesthetics.

Now, that data is being used to predict what parts of cities feel the safest. StreetScore, a collaboration between the MIT Media Lab’s Macro Connections and Camera Culture groups, uses an algorithm to create a super high-resolution map of urban perceptions. The algorithmically generated data could one day be used to research the connection between urban perception and crime, as well as informing urban design decisions.

The algorithm, created by Nikhil Naik, a Ph.D. student in the Camera Culture lab, breaks an image down into its composite features, such as building texture, colors, and shapes. Based on how Place Pulse volunteers rated similar features, the algorithm assigns the streetscape a perceived safety score between 1 and 10. These scores are visualized as geographic points on a map, designed by MIT rising sophomore Jade Philipoom. Each image available from Google Maps in the two cities is represented by a colored dot: red for the locations that the algorithm tags as unsafe, and dark green for those that appear safest. The site, now limited to New York and Boston, will be expanded to feature Chicago and Detroit later this month, and eventually, with data collected from a new version of Place Pulse, will feature dozens of cities around the world….(More)”
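The scoring idea described in the excerpt can be illustrated with a toy sketch. This is not the StreetScore algorithm: the vote format, the two-number feature vectors, the win-rate scaling and the nearest-neighbour averaging are all assumptions chosen to keep the example short. It only shows how pairwise "which looks safer?" votes could become a 1 to 10 score, and how an unrated image could borrow a score from rated images with similar features.

```python
import math


def win_rate_scores(votes):
    """Turn pairwise 'which looks safer?' votes into a 1-10 score per image.

    votes: list of (winner_id, loser_id) tuples (hypothetical format).
    """
    wins, games = {}, {}
    for winner, loser in votes:
        wins[winner] = wins.get(winner, 0) + 1
        for image in (winner, loser):
            games[image] = games.get(image, 0) + 1
    # A win rate of 0 maps to 1; a win rate of 1 maps to 10.
    return {image: 1 + 9 * wins.get(image, 0) / games[image] for image in games}


def predict_score(features, rated, k=2):
    """Score an unrated image as the mean score of its k nearest rated images.

    rated: list of (feature_vector, score) pairs.
    """
    nearest = sorted(rated, key=lambda pair: math.dist(features, pair[0]))[:k]
    return sum(score for _, score in nearest) / len(nearest)


# Hypothetical votes: image "a" beat "b" and "c"; "b" beat "c".
votes = [("a", "b"), ("a", "c"), ("b", "c")]
scores = win_rate_scores(votes)

# Hypothetical feature vectors standing in for texture, color and shape features.
features = {"a": (0.9, 0.8), "b": (0.5, 0.5), "c": (0.1, 0.2)}
rated = [(features[image], scores[image]) for image in scores]

# An unrated streetscape that resembles "a" more than "c".
predicted = predict_score((0.8, 0.7), rated, k=2)
print(scores, predicted)
```

A real system would learn from far richer image features and many more votes; the averaging step here simply stands in for whatever model maps features to perceived safety.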

Big Other: Surveillance Capitalism and the Prospects of an Information Civilization


New paper by Shoshana Zuboff in the Journal of Information Technology: “This article describes an emergent logic of accumulation in the networked sphere, ‘surveillance capitalism,’ and considers its implications for ‘information civilization.’ Google is to surveillance capitalism what General Motors was to managerial capitalism. Therefore the institutionalizing practices and operational assumptions of Google Inc. are the primary lens for this analysis as they are rendered in two recent articles authored by Google Chief Economist Hal Varian. Varian asserts four uses that follow from computer-mediated transactions: ‘data extraction and analysis,’ ‘new contractual forms due to better monitoring,’ ‘personalization and customization,’ and ‘continuous experiments.’ An examination of the nature and consequences of these uses sheds light on the implicit logic of surveillance capitalism and the global architecture of computer mediation upon which it depends. This architecture produces a distributed and largely uncontested new expression of power that I christen: ‘Big Other.’ It is constituted by unexpected and often illegible mechanisms of extraction, commodification, and control that effectively exile persons from their own behavior while producing new markets of behavioral prediction and modification. Surveillance capitalism challenges democratic norms and departs in key ways from the centuries long evolution of market capitalism….(More)”

The extreme poverty of data


 in the Financial Times: “As finance ministers gather this week in Washington DC they cannot but agree and commit to fighting extreme poverty. All of us must rejoice in the fact that over the past 15 years, the world has reportedly already “halved the number of poor people living on the planet”.

But none of us really knows it for sure. It could be less, it could be more. In fact, for every crucial issue related to human development, whether it is poverty, inequality, employment, environment or urbanization, there is a seminal crisis at the heart of global decision making – the crisis of poor data.

Because the challenges are huge and the resources scarce, on these issues more maybe than anywhere else, we need data, to monitor the results and adapt the strategies whenever needed. Bad data feed bad management, weak accountability, loss of resources and, of course, corruption.

It is rather bewildering that while we live in this technology-driven age, the development communities and many of our African governments are relying too much on guesswork. Our friends in the development sector and our African leaders would not dream of driving their cars or flying without instruments. But somehow they pretend they can manage and develop countries without reliable data.

The development community must admit it has a big problem. The sector is relying on dodgy data sets. Take the data on extreme poverty. The data we have are mainly extrapolations of estimates from years back – even up to a decade or more ago. For 38 out of 54 African countries, data on poverty and inequality are either out-dated or non-existent. How can we measure progress with such a shaky baseline? To make things worse we also don’t know how much countries spend on fighting poverty. Only 3 per cent of African citizens live in countries where governmental budgets and expenditures are made open, according to the Open Budget Index. We will never end extreme poverty if we don’t know who or where the poor are, or how much is being spent to help them.

Our African countries have all fought and won their political independence. They should now consider the battle for economic sovereignty, which begins with the ownership of sound and robust national data: how many citizens, living where, and how, to begin with.

There are three levels of intervention required.

First, a significant increase in resources for credible, independent, national statistical institutions. Establishing a statistical office is less eye-catching than building a hospital or school but data-driven policy will ensure that more hospitals and schools are delivered more effectively and efficiently. We urgently need these boring statistical offices. In 2013, out of a total aid budget of $134.8bn, a mere $280m went in support of statistics. Governments must also increase the resources they put into data.

Second, innovative means of collecting data. Mobile phones, geocoding, satellites and the civic engagement of young tech-savvy citizens to collect data can all secure rapid improvements in baseline data if harnessed.

Third, everyone must take on this challenge of the global public good dimension of high quality open data. Public registers of the ownership of companies, global standards on publishing payments and contracts in the extractives sector and a global charter for open data standards will help media and citizens to track corruption and expose mismanagement. Proposals for a new world statistics body – “Worldstat” – should be developed and implemented….(More)”

New Interactive Citizen-Generated Data Platform


DataShift: “Following a study to better understand the number, type and scale of citizen-generated data initiatives across the world, the DataShift has visualised the resulting data to create an interactive online platform. Users are presented with a definition of a citizen-generated data initiative before being invited to browse the multiple initiatives according to the various themes that they address….(More)”

The big medical data miss: challenges in establishing an open medical resource


Eric J. Topol in Nature: “I call for an international open medical resource to provide a database for every individual’s genomic, metabolomic, microbiomic, epigenomic and clinical information. This resource is needed in order to facilitate genetic diagnoses and transform medical care.

“We are each, in effect, one-person clinical trials”

Laurie Becklund was a noted journalist who died in February 2015 at age 66 from breast cancer. Soon thereafter, the Los Angeles Times published her op-ed entitled “As I lay dying” (Ref. 1). She lamented, “We are each, in effect, one-person clinical trials. Yet the knowledge generated from those trials will die with us because there is no comprehensive database of metastatic breast cancer patients, their characteristics and what treatments did and didn’t help them”. She went on to assert that, in the era of big data, the lack of such a resource is “criminal”, and she is absolutely right….

Around the same time as this important op-ed, the MIT Technology Review published their issue entitled “10 Breakthrough Technologies 2015” and on the list was the “Internet of DNA” (Ref. 2). While we are often reminded that the world we live in is becoming the “Internet of Things”, I have not seen this terminology applied to DNA before. The article on the “Internet of DNA” decried, “the unfolding calamity in genomics is that a great deal of life-saving information, though already collected, is inaccessible”. It called for a global network of millions of genomes and cited the Matchmaker Exchange as a frontrunner. For this international initiative, a growing number of research and clinical teams have come together to pool and exchange phenotypic and genotypic data for individual patients with rare disorders, in order to share this information and assist in the molecular diagnosis of individuals with rare diseases….

an Internet of DNA, or what I have referred to as a massive, open, online medicine resource (MOOM), would help to quickly identify the genetic cause of the disorder (Ref. 4) and, in the process of doing so, precious guidance for prevention, if necessary, would become available for such families who are currently left in the lurch as to their risk of suddenly dying.

So why aren’t such MOOMs being assembled? ….

There has also been much discussion related to privacy concerns that patients might be unwilling to participate in a massive medical information resource. However, multiple global consumer surveys have shown that more than 80% of individuals are ready to share their medical data provided that they are anonymized and their privacy maximally assured (Ref. 4). Indeed, just 24 hours into Apple’s ResearchKit initiative, a smartphone-based medical research programme, there were tens of thousands of patients with Parkinson disease, asthma or heart disease who had signed on. Some individuals are even willing to be “open source”, that is, to make their genetic and clinical data fully available with free access online, without any assurance of privacy. This willingness is seen by the participants in the recently launched Open Humans initiative. Along with the Personal Genome Project, Go Viral and American Gut have joined in this initiative. Still, studies suggest that most individuals would only agree to be medical research participants if their identities would not be attainable. Unfortunately, to date, little has been done to protect individual medical privacy, for which there are both promising new data protection technological approaches (Ref. 4) and the need for additional governmental legislation.

This leaves us with perhaps the major obstacle that is holding back the development of MOOMs — researchers. Even with big, team science research projects culling together hundreds of investigators and institutions throughout the world, such as the Global Alliance for Genomics and Health (GA4GH), the data obtained clinically are just as Laurie Becklund asserted in her op-ed — “one-person clinical trials” (Ref. 1). While undertaking the construction of a MOOM is a huge endeavour, there is little motivation for researchers to take on this task, as this currently offers no academic credit and has no funding source. But the transformative potential of MOOMs to improve medical care is extraordinary. Rather than having the knowledge die with each of us, the time has come to take down the walls of academic medical centres and health-care systems around the world, and create a global knowledge medical resource that leverages each individual’s information to help one another…(More)”

Bloomberg Philanthropies Launches $42 Million “What Works Cities” Initiative


Press Release: “Today, Bloomberg Philanthropies announced the launch of the What Works Cities initiative, a $42 million program to help 100 mid-sized cities better use data and evidence. What Works Cities is the latest initiative from Bloomberg Philanthropies’ Government Innovation portfolio which promotes public sector innovation and spreads effective ideas amongst cities.

Through partners, Bloomberg Philanthropies will help mayors and local leaders use data and evidence to engage the public, make government more effective and improve people’s lives. U.S. cities with populations between 100,000 and 1 million people are invited to apply.

“While cities are working to meet new challenges with limited resources, they have access to more data than ever – and they are increasingly using it to improve people’s lives,” said Michael R. Bloomberg. “We’ll help them build on their progress, and help even more cities take steps to put data to work. What works? That’s a question that every city leader should ask – and we want to help them find answers.”

The $42 million effort is the nation’s most comprehensive philanthropic initiative to help accelerate the ability of local leaders to use data and evidence to improve the lives of their residents. What Works Cities will provide mayors with robust technical assistance, expertise, and peer-to-peer learning opportunities that will help them enhance their use of data and evidence to improve services to solve problems for communities. The program will help cities:

1. Create sustainable open data programs and policies that promote transparency and robust citizen engagement;

2. Better incorporate data into budget, operational, and policy decision making;

3. Conduct low-cost, rapid evaluations that allow cities to continually improve programs; and

4. Focus funding on approaches that deliver results for citizens.

Across the initiative, Bloomberg Philanthropies will document how cities currently use data and evidence in decision making, and how this unique program of support helps them advance. Over time, the initiative will also launch a benchmark system which will collect standardized, comparable data so that cities can understand their performance relative to peers.

In cities across the country, mayors are increasingly relying on data and evidence to deliver better results for city residents. For example, New Orleans’ City Hall used data to reduce blighted residences by 10,000 and increased the number of homes brought into compliance by 62% in 2 years. The City’s “BlightStat” program has put New Orleans, once behind in efforts to revitalize abandoned and decaying properties, at the forefront of national efforts.

In New York City and other jurisdictions, open data from transit agencies has led to the creation of hundreds of apps that residents now use to get around town, choose where to live based on commuting times, provide key transit information to the visually impaired, and more. And Louisville has asked volunteers to attach GPS trackers to their asthma inhalers to see where they have the hardest time breathing. The city is now using that data to better target the sources of air pollution….

To learn more and apply to be a What Works City, visit www.WhatWorksCities.org.”

A New Source of Data for Public Health Surveillance: Facebook Likes


Paper by Steven Gittelman et al in the Journal of Medical Internet Research: “The development of the Internet and the explosion of social media have provided many new opportunities for health surveillance. The use of the Internet for personal health and participatory health research has exploded, largely due to the availability of online resources and health care information technology applications [1-8]. These online developments, plus a demand for more timely, widely available, and cost-effective data, have led to new ways epidemiological data are collected, such as digital disease surveillance and Internet surveys [8-25]. Over the past 2 decades, Internet technology has been used to identify disease outbreaks, track the spread of infectious disease, monitor self-care practices among those with chronic conditions, and to assess, respond, and evaluate natural and artificial disasters at a population level [6,8,11,12,14,15,17,22,26-28]. Use of these modern communication tools for public health surveillance has proven to be less costly and more timely than traditional population surveillance modes (eg, mail surveys, telephone surveys, and face-to-face household surveys).

The Internet has spawned several sources of big data, such as Facebook [29], Twitter [30], Instagram [31], Tumblr [32], Google [33], and Amazon [34]. These online communication channels and market places provide a wealth of passively collected data that may be mined for purposes of public health, such as sociodemographic characteristics, lifestyle behaviors, and social and cultural constructs. Moreover, researchers have demonstrated that these digital data sources can be used to predict otherwise unavailable information, such as sociodemographic characteristics among anonymous Internet users [35-38]. For example, Goel et al [36] found no difference by demographic characteristics in the usage of social media and email. However, the frequency with which individuals accessed the Web for news, health care, and research was a predictor of gender, race/ethnicity, and educational attainment, potentially providing useful targeting information based on ethnicity and income [36]. Integrating these big data sources into the practice of public health surveillance is vital to move the field of epidemiology into the 21st century as called for in the 2012 US “Big Data Research and Development Initiative” [19,39].

Understanding how big data can be used to predict lifestyle behavior and health-related data is a step toward the use of these electronic data sources for epidemiologic needs…(More)”

Americans’ Views on Open Government Data


The upshot has been the appearance of a variety of “open data” and “open government” initiatives throughout the United States that try to use data as a lever to improve government performance and encourage warmer citizens’ attitudes toward government.

This report is based on the first national survey that seeks to benchmark public sentiment about the government initiatives that use data to cultivate the public square. The survey, conducted by Pew Research Center in association with the John S. and James L. Knight Foundation, captures public views at the emergent moment when new technology tools and techniques are being used to disseminate and capitalize on government data and specifically looks at:

  • People’s level of awareness of government efforts to share data
  • Whether these efforts translate into people using data to track government performance
  • If people think government data initiatives have made, or have the potential to make, government perform better or improve accountability
  • The more routine kinds of government-citizen online interactions, such as renewing licenses or searching for the hours of public facilities.

The results cover all three levels of government in America (federal, state and local) and show that government data initiatives are in their early stages in the minds of most Americans. Generally, people are optimistic that these initiatives can make government more accountable, even though many are less sure open data will improve government performance. And government does touch people online, as evidenced by high levels of use of the internet for routine information applications. But most Americans have yet to delve too deeply into government data and its possibilities to closely monitor government performance.

Among the survey’s main findings:

As open data and open government initiatives get underway, most Americans are still largely engaged in “e-Gov 1.0” online activities, with far fewer attuned to “Data-Gov 2.0” initiatives that involve agencies sharing data online for public use….

Minorities of Americans say they pay a lot of attention to how governments share data with the public and relatively few say they are aware of examples where government has done a good (or bad) job sharing data. Less than one quarter use government data to monitor how government performs in several different domains….
Americans have mixed hopes about government data initiatives. People see the potential in these initiatives as a force to improve government accountability. However, the jury is still out for many Americans as to whether government data initiatives will improve government performance….
People’s baseline level of trust in government strongly shapes how they view the possible impact of open data and open government initiatives on how government functions…
Americans’ perspectives on trusting government are shaped strongly by partisan affiliation, which in turn makes a difference in attitudes about the impacts of government data initiatives…

Americans are for the most part comfortable with government sharing online data about their communities, although they sound cautionary notes when the data hits close to home…

Smartphone users have embraced information-gathering using mobile apps that rely on government data to function, but not many see a strong link between the underlying government data and economic value…

…(More)”