Personal Data for the Public Good


Final report on “New Opportunities to Enrich Understanding of Individual and Population Health” of the Health Data Exploration project: “Individuals are tracking a variety of health-related data via a growing number of wearable devices and smartphone apps. More and more data relevant to health are also being captured passively as people communicate with one another on social networks, shop, work, or do any number of activities that leave “digital footprints.”
Almost all of these forms of “personal health data” (PHD) are outside of the mainstream of traditional health care, public health or health research. Medical, behavioral, social and public health research still largely relies on traditional sources of health data such as those collected in clinical trials, drawn from electronic medical records, or gathered through periodic surveys.
Self-tracking data can provide better measures of everyday behavior and lifestyle and can fill in gaps in more traditional clinical data collection, giving us a more complete picture of health. With support from the Robert Wood Johnson Foundation, the Health Data Exploration (HDE) project conducted a study to better understand the barriers to using personal health data in research from the individuals who track the data about their own personal health, the companies that market self-tracking devices, apps or services and aggregate and manage that data, and the researchers who might use the data as part of their research.
Perspectives
Through a series of interviews and surveys, we discovered strong interest in contributing and using PHD for research. It should be noted that, because our goal was to access individuals and researchers who are already generating or using digital self-tracking data, there was some bias in our survey findings: participants tended to have more education and higher household incomes than the general population. Our survey also drew slightly more white and Asian participants and more female participants than in the general population.
Individuals were very willing to share their self-tracking data for research, in particular if they knew the data would advance knowledge in the fields related to PHD such as public health, health care, computer science and social and behavioral science. Most expressed an explicit desire to have their information shared anonymously and we discovered a wide range of thoughts and concerns regarding privacy.
Equally, researchers were generally enthusiastic about the potential for using self-tracking data in their research. Researchers see value in these kinds of data and think these data can answer important research questions. Many consider it to be of equal quality and importance to data from existing high quality clinical or public health data sources.
Companies operating in this space noted that advancing research was a worthy goal but not their primary business concern. Many companies expressed interest in research conducted outside of their company that would validate the utility of their device or application but noted the critical importance of maintaining their customer relationships. A number were open to data sharing with academics but noted the slow pace and administrative burden of working with universities as a challenge.
In addition to this considerable enthusiasm, it seems a new PHD research ecosystem may well be emerging. Forty-six percent of the researchers who participated in the study have already used self-tracking data in their research, and 23 percent of the researchers have already collaborated with application, device, or social media companies.
The Personal Health Data Research Ecosystem
A great deal of experimentation with PHD is taking place. Some individuals are experimenting with personal data stores or sharing their data directly with researchers in a small set of clinical experiments. Some researchers have secured one-off access to unique data sets for analysis. A small number of companies, primarily those with more of a health research focus, are working with others to develop data commons to regularize data sharing with the public and researchers.
SmallStepsLab serves as an intermediary between Fitbit, a data rich company, and academic researchers via a “preferred status” API held by the company. Researchers pay SmallStepsLab for this access as well as other enhancements that they might want.
These promising early examples foreshadow a much larger set of activities with the potential to transform how research is conducted in medicine, public health and the social and behavioral sciences.
Opportunities and Obstacles
There is still work to be done to enhance the potential to generate knowledge out of personal health data:

  • Privacy and Data Ownership: Among individuals surveyed, the dominant condition (57%) for making their PHD available for research was an assurance of privacy for their data, and over 90% of respondents said that it was important that the data be anonymous. Further, while some didn’t care who owned the data they generate, a clear majority wanted to own or at least share ownership of the data with the company that collected it.
  • Informed Consent: Researchers are concerned about the privacy of PHD as well as respecting the rights of those who provide it. For most of our researchers, this came down to a straightforward question of whether there is informed consent. Our research found that current methods of informed consent are challenged by the ways PHD are being used and reused in research. A variety of new approaches to informed consent are being evaluated and this area is ripe for guidance to assure optimal outcomes for all stakeholders.
  • Data Sharing and Access: Among individuals, there is growing interest in, as well as willingness and opportunity to, share personal health data with others. People now share these data with others with similar medical conditions in online groups like PatientsLikeMe or Crohnology, with the intention to learn as much as possible about mutual health concerns. Looking across our data, we find that individuals’ willingness to share is dependent on what data is shared, how the data will be used, who will have access to the data and when, what regulations and legal protections are in place, and the level of compensation or benefit (both personal and public).
  • Data Quality: Researchers highlighted concerns about the validity of PHD and lack of standardization of devices. While some of this may be addressed as the consumer health device, apps and services market matures, reaching the optimal outcome for researchers might benefit from strategic engagement of important stakeholder groups.

We are reaching a tipping point. More and more people are tracking their health, and there is a growing number of tracking apps and devices on the market with many more in development. There is overwhelming enthusiasm from individuals and researchers to use this data to better understand health. To maximize personal data for the public good, we must develop creative solutions that allow individual rights to be respected while providing access to high-quality and relevant PHD for research, that balance open science with intellectual property, and that enable productive and mutually beneficial collaborations between the private sector and the academic research community.”

Expanding Opportunity through Open Educational Resources


Hal Plotkin and Colleen Chien at the White House: “Using advanced technology to dramatically expand the quality and reach of education has long been a key priority for the Obama Administration.
In December 2013, the President’s Council of Advisors on Science and Technology (PCAST) issued a report exploring the potential of Massive Open Online Courses (MOOCs) to expand access to higher education opportunities. Last month, the President announced a $2B down payment, and another $750M in private-sector commitments to deliver on the President’s ConnectEd initiative, which will connect 99% of American K-12 students to broadband by 2017 at no cost to American taxpayers.
This week, we are happy to be joining with educators, students, and technologists worldwide to recognize and celebrate Open Education Week.
Open Educational Resources (“OER”) are educational resources that are released with copyright licenses allowing for their free use, continuous improvement, and modification by others. The world is moving fast, and OER enables educators and students to access, customize, and remix high-quality course materials reflecting the latest understanding of the world and materials that incorporate state of the art teaching methods – adding their own insights along the way. OER is not a silver bullet solution to the many challenges that teachers, students and schools face. But it is a tool increasingly being used, for example by players like edX and the Khan Academy, to improve learning outcomes and create scalable platforms for sharing educational resources that reach millions of students worldwide.
Launched at MIT in 2001, OER became a global movement in 2007 when thousands of educators around the globe endorsed the Cape Town Declaration on Open Educational Resources. Another major milestone came in 2011, when Secretary of Education Arne Duncan and then-Secretary of Labor Hilda Solis unveiled the four-year, $2B Trade Adjustment Assistance Community College and Career Training Grant Program (TAACCCT). It was the first Federal program to leverage OER to support the development of a new generation of affordable, post-secondary educational programs that can be completed in two years or less to prepare students for careers in emerging and expanding industries….
Building on this record of success, OSTP and the U.S. Agency for International Development (USAID) are exploring an effort to inspire and empower university students through multidisciplinary OER focused on one of the USAID Grand Challenges, such as securing clean water, saving lives at birth, or improving green agriculture. This effort promises to be a stepping stone towards leveraging OER to help solve other grand challenges such as the NAE Grand Challenges in Engineering or Grand Challenges in Global Health.
This is great progress, but there is more work to do. We look forward to keeping the community updated right here. To see the winning videos from the U.S. Department of Education’s “Why Open Education Matters” Video Contest, click here.”

Computational Social Science: Exciting Progress and Future Directions


Duncan Watts in The Bridge: “The past 15 years have witnessed a remarkable increase in both the scale and scope of social and behavioral data available to researchers. Over the same period, and driven by the same explosion in data, the study of social phenomena has increasingly become the province of computer scientists, physicists, and other “hard” scientists. Papers on social networks and related topics appear routinely in top science journals and computer science conferences; network science research centers and institutes are sprouting up at top universities; and funding agencies from DARPA to NSF have moved quickly to embrace what is being called computational social science.
Against these exciting developments stands a stubborn fact: in spite of many thousands of published papers, there’s been surprisingly little progress on the “big” questions that motivated the field of computational social science—questions concerning systemic risk in financial systems, problem solving in complex organizations, and the dynamics of epidemics or social movements, among others.
Of the many reasons for this state of affairs, I concentrate here on three. First, social science problems are almost always more difficult than they seem. Second, the data required to address many problems of interest to social scientists remain difficult to assemble. And third, thorough exploration of complex social problems often requires the complementary application of multiple research traditions—statistical modeling and simulation, social and economic theory, lab experiments, surveys, ethnographic fieldwork, historical or archival research, and practical experience—many of which will be unfamiliar to any one researcher. In addition to explaining the particulars of these challenges, I sketch out some ideas for addressing them….”

New Journal Helps Behavioral Scientists Find Their Way to Washington


The PsychReport: “When it comes to being heard in Washington, classical economists have long gotten their way. Behavioral scientists, on the other hand, haven’t proved so adept at getting their message across.

It isn’t for lack of good ideas. Psychology’s applicability has been gaining momentum in recent years, namely in the U.K.’s Behavioral Insights Team, which has helped prove the discipline’s worth to policy makers. The recent (but not-yet-official) announcement that the White House is creating a similar team is another major endorsement of behavioral science’s value.

But when it comes to communicating those ideas to the public in general, psychologists and other behavioral scientists can’t name so many successes. Part of the problem is PR know-how: writing for a general audience, publicizing good ideas, reaching out to decision makers. Another is incentive: academics need to publish, and many times publishing means producing long, dense, jargon-laden articles for peer-reviewed journals read by a rarified audience of other academics. And then there’s time, or lack of it.

But a small group of prominent behavioral scientists is working to help other researchers find their way to Washington. The brainchild of UCLA’s Craig Fox and Duke’s Sim Sitkin, Behavioral Science & Policy is a peer-reviewed journal set to launch online this fall and in print early next year, whose mission is to influence policy and practice through promoting high-quality behavioral science research. Articles will be brief, well written, and will all provide straightforward, applicable policy recommendations that serve the public interest.

In bringing behavioral science to the capital, Fox echoed a motivation similar to that of David Halpern of the Behavioral Insights Team.

“What we’re trying to do is create policies that are mindful of how individuals, groups, and organizations behave. How can you create smart policies if you don’t do that?” Fox said. “Because after all, all policies affect individuals, groups, and/or organizations.”

Fox has already assembled an impressive team of scientists from around the country for the journal’s advisory board, including Richard Thaler and Cass Sunstein, authors of Nudge, which helped inspire the creation of the Behavioral Insights Team; The New York Times columnist David Brooks; and Nobel Prize winner Daniel Kahneman. They’ve created a strong partnership with the prestigious think tank the Brookings Institution, which will serve as their publishing partner and which they plan will also co-host briefings for policy makers in Washington…”

The Parable of Google Flu: Traps in Big Data Analysis


David Lazer: “…big data last winter had its “Dewey beats Truman” moment, when the poster child of big data (at least for behavioral data), Google Flu Trends (GFT), went way off the rails in “nowcasting” the flu–overshooting the peak last winter by 130% (and indeed, it has been systematically overshooting by wide margins for 3 years). Tomorrow we (Ryan Kennedy, Alessandro Vespignani, and Gary King) have a paper out in Science dissecting why GFT went off the rails, how that could have been prevented, and the broader lessons to be learned regarding big data.
[We are posting the version of the paper, The Parable of Google Flu (WP-Final).pdf, that we submitted before acceptance. We have also posted an SSRN paper evaluating GFT for 2013-14, since it was reworked in the Fall.]
Key lessons that I’d highlight:
1) Big data are typically not scientifically calibrated. This goes back to my post last month regarding measurement. This does not make them useless from a scientific point of view, but you do need to build into the analysis that the “measures” of behavior are being affected by unseen things. In this case, the likely culprit was the Google search algorithm, which was modified in various ways that we believe likely to have increased flu related searches.
2) Big data + analytic code used in scientific venues with scientific claims need to be more transparent. This is a tricky issue, because there are both legitimate proprietary interests involved and privacy concerns, but much more can be done in this regard than has been done in the 3 GFT papers. [One of my aspirations over the next year is to work together with big data companies, researchers, and privacy advocates to figure out how this can be done.]
3) It’s about the questions, not the size of the data. In this particular case, one could have done a better job stating the likely flu prevalence today by ignoring GFT altogether and just projecting 3 week old CDC data to today (better still would have been to combine the two). That is, a synthesis would have been more effective than a pure “big data” approach. I think this is likely the general pattern.
4) More generally, I’d note that there is much more that the academy needs to do. First, the academy needs to build the foundation for collaborations around big data (e.g., secure infrastructures, legal understandings around data sharing, etc). Second, there needs to be MUCH more work done to build bridges between the computer scientists who work on big data and social scientists who think about deriving insights about human behavior from data more generally. We have moved perhaps 5% of the way that we need to in this regard.”
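As a rough illustration of lesson 3 above, the sketch below compares a lagged projection of official reports, a nowcast, and an equal-weight blend of the two. Everything in it is an assumption made for illustration: the function names, the 3-week lag, the equal weighting, and above all the numbers, which are invented and are not CDC or GFT figures. It is not the method used in the paper.

```python
def lagged_projection(reports, lag):
    """Estimate each week by carrying forward the report from `lag` weeks earlier."""
    return [reports[week - lag] for week in range(lag, len(reports))]


def blend(estimates_a, estimates_b, weight_a=0.5):
    """Weighted average of two aligned estimate series."""
    return [weight_a * a + (1.0 - weight_a) * b
            for a, b in zip(estimates_a, estimates_b)]


def mean_abs_error(estimates, truth):
    """Average absolute gap between estimates and the values later observed."""
    return sum(abs(e - t) for e, t in zip(estimates, truth)) / len(truth)


if __name__ == "__main__":
    LAG = 3  # assumed reporting delay, in weeks
    # Invented weekly flu-activity values, NOT real CDC or GFT figures.
    official = [1.0, 1.5, 2.0, 3.0, 4.0, 5.0, 4.5, 3.5]
    truth = official[LAG:]
    # A hypothetical nowcast that overshoots every week by 50%.
    nowcast = [1.5 * value for value in truth]

    stale = lagged_projection(official, LAG)
    combined = blend(stale, nowcast)
    for name, series in [("lagged", stale), ("nowcast", nowcast), ("blend", combined)]:
        print(name, round(mean_abs_error(series, truth), 3))
```

In this made-up series the lagged projection mostly undershoots while the nowcast overshoots, so averaging cancels part of each error. Real data need not behave this way, and the weight would have to be chosen from held-out observations rather than fixed at one half.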

Participatory Budgeting Platform


Hollie Gilman: “Stanford’s Social Algorithms Lab (SOAL) has built an interactive Participatory Budgeting Platform that allows users to simulate budgetary decision making on $1 million of public monies. The center brings together economics, computer science, and networking to work on problems and understand the impact of social networking. This project is part of Stanford’s Widescope Project to enable people to make political decisions on the budgets through data driven social networks.
The Participatory Budgeting simulation highlights the fourth annual Participatory Budgeting in Chicago’s 49th ward, the first place to implement PB in the U.S. This year $1 million, out of $1.3 million in Alderman capital funds, will be allocated through participatory budgeting.
One goal of the platform is to build consensus. The interactive geo-spatial mapping software enables citizens to more intuitively identify projects in a given area. Importantly, the platform forces users to make tough choices and balance competing priorities in real time.
The platform is an interesting example of a collaborative governance prototype that could be transformative in its ability to engage citizens with easily accessible mapping software.”

New Research Network to Study and Design Innovative Ways of Solving Public Problems


MacArthur Foundation Research Network on Opening Governance formed to gather evidence and develop new designs for governing 

NEW YORK, NY, March 4, 2014: The Governance Lab (The GovLab) at New York University today announced the formation of a Research Network on Opening Governance, which will seek to develop blueprints for more effective and legitimate democratic institutions to help improve people’s lives.
Convened and organized by the GovLab, the MacArthur Foundation Research Network on Opening Governance is made possible by a three-year grant of $5 million from the John D. and Catherine T. MacArthur Foundation as well as a gift from Google.org, which will allow the Network to tap the latest technological advances to further its work.
Combining empirical research with real-world experiments, the Research Network will study what happens when governments and institutions open themselves to diverse participation, pursue collaborative problem-solving, and seek input and expertise from a range of people. Network members include twelve experts (see below) in computer science, political science, policy informatics, social psychology and philosophy, law, and communications. This core group is supported by an advisory network of academics, technologists, and current and former government officials. Together, they will assess existing innovations in governing and experiment with new practices and how institutions make decisions at the local, national, and international levels.
Support for the Network from Google.org will be used to build technology platforms to solve problems more openly and to run agile, real-world, empirical experiments with institutional partners such as governments and NGOs to discover what can enhance collaboration and decision-making in the public interest.
The Network’s research will be complemented by theoretical writing and compelling storytelling designed to articulate and demonstrate clearly and concretely how governing agencies might work better than they do today. “We want to arm policymakers and practitioners with evidence of what works and what does not,” says Professor Beth Simone Noveck, Network Chair and author of Wiki Government: How Technology Can Make Government Better, Democracy Stronger and Citizens More Powerful, “which is vital to drive innovation, re-establish legitimacy and more effectively target scarce resources to solve today’s problems.”
“From prize-backed challenges to spur creative thinking to the use of expert networks to get the smartest people focused on a problem no matter where they work, this shift from top-down, closed, and professional government to decentralized, open, and smarter governance may be the major social innovation of the 21st century,” says Noveck. “The MacArthur Research Network on Opening Governance is the ideal crucible for helping transition from closed and centralized to open and collaborative institutions of governance in a way that is scientifically sound and yields new insights to inform future efforts, always with an eye toward real-world impacts.”
MacArthur Foundation President Robert Gallucci added, “Recognizing that we cannot solve today’s challenges with yesterday’s tools, this interdisciplinary group will bring fresh thinking to questions about how our governing institutions operate, and how they can develop better ways to help address seemingly intractable social problems for the common good.”
Members
The MacArthur Research Network on Opening Governance comprises:
Chair: Beth Simone Noveck
Network Coordinator: Andrew Young
Chief of Research: Stefaan Verhulst
Faculty Members:

  • Sir Tim Berners-Lee (Massachusetts Institute of Technology (MIT)/University of Southampton, UK)
  • Deborah Estrin (Cornell Tech/Weill Cornell Medical College)
  • Erik Johnston (Arizona State University)
  • Henry Farrell (George Washington University)
  • Sheena S. Iyengar (Columbia Business School/Jerome A. Chazen Institute of International Business)
  • Karim Lakhani (Harvard Business School)
  • Anita McGahan (University of Toronto)
  • Cosma Shalizi (Carnegie Mellon/Santa Fe Institute)

Institutional Members:

  • Christian Bason and Jesper Christiansen (MindLab, Denmark)
  • Geoff Mulgan (National Endowment for Science Technology and the Arts – NESTA, United Kingdom)
  • Lee Rainie (Pew Research Center)

The Network is eager to hear from and engage with the public as it undertakes its work. Please contact Stefaan Verhulst to share your ideas or identify opportunities to collaborate.

Coordinating the Commons: Diversity & Dynamics in Open Collaborations


Dissertation by Jonathan T. Morgan: “The success of Wikipedia demonstrates that open collaboration can be an effective model for organizing geographically-distributed volunteers to perform complex, sustained work at a massive scale. However, Wikipedia’s history also demonstrates some of the challenges that large, long-term open collaborations face: the core community of Wikipedia editors (the volunteers who contribute most of the encyclopedia’s content and ensure that articles are correct and consistent) has been gradually shrinking since 2007, in part because Wikipedia’s social climate has become increasingly inhospitable for newcomers, female editors, and editors from other underrepresented demographics. Previous research studies of change over time within other work contexts, such as corporations, suggest that incremental processes such as bureaucratic formalization can make organizations more rule-bound and less adaptable (in effect, less open) as they grow and age. There has been little research on how open collaborations like Wikipedia change over time, and on the impact of those changes on the social dynamics of the collaborating community and the way community members prioritize and perform work. Learning from Wikipedia’s successes and failures can help researchers and designers understand how to support open collaborations in other domains, such as Free/Libre Open Source Software, Citizen Science, and Citizen Journalism.

In this dissertation, I examine the role of openness, and the potential antecedents and consequences of formalization, within Wikipedia through an analysis of three distinct but interrelated social structures: community-created rules within the Wikipedia policy environment, coordination work and group dynamics within self-organized open teams called WikiProjects, and the socialization mechanisms that Wikipedia editors use to teach new community members how to participate. To inquire further, I have designed a new editor peer support space, the Wikipedia Teahouse, based on the findings from my empirical studies. The Teahouse is a volunteer-driven project that provides a welcoming and engaging environment in which new editors can learn how to be productive members of the Wikipedia community, with the goal of increasing the number and diversity of newcomers who go on to make substantial contributions to Wikipedia …”

True Collective Intelligence? A Sketch of a Possible New Field


Paper by Geoff Mulgan in Philosophy & Technology: “Collective intelligence is much talked about but remains very underdeveloped as a field. There are small pockets in computer science and psychology and fragments in other fields, ranging from economics to biology. New networks and social media also provide a rich source of emerging evidence. However, there are surprisingly few useable theories, and many of the fashionable claims have not stood up to scrutiny. The field of analysis should be how intelligence is organised at large scale—in organisations, cities, nations and networks. The paper sets out some of the potential theoretical building blocks, suggests an experimental and research agenda, shows how it could be analysed within an organisation or business sector and points to the possible intellectual barriers to progress.”

Predicting Individual Behavior with Social Networks


Article by Sharad Goel and Daniel Goldstein (Microsoft Research): “With the availability of social network data, it has become possible to relate the behavior of individuals to that of their acquaintances on a large scale. Although the similarity of connected individuals is well established, it is unclear whether behavioral predictions based on social data are more accurate than those arising from current marketing practices. We employ a communications network of over 100 million people to forecast highly diverse behaviors, from patronizing an off-line department store to responding to advertising to joining a recreational league. Across all domains, we find that social data are informative in identifying individuals who are most likely to undertake various actions, and moreover, such data improve on both demographic and behavioral models. There are, however, limits to the utility of social data. In particular, when rich transactional data were available, social data did little to improve prediction.”