The big medical data miss: challenges in establishing an open medical resource


Eric J. Topol in Nature: “I call for an international open medical resource to provide a database for every individual’s genomic, metabolomic, microbiomic, epigenomic and clinical information. This resource is needed in order to facilitate genetic diagnoses and transform medical care.

“We are each, in effect, one-person clinical trials”

Laurie Becklund was a noted journalist who died in February 2015 at age 66 from breast cancer. Soon thereafter, the Los Angeles Times published her op-ed entitled “As I lay dying” (Ref. 1). She lamented, “We are each, in effect, one-person clinical trials. Yet the knowledge generated from those trials will die with us because there is no comprehensive database of metastatic breast cancer patients, their characteristics and what treatments did and didn’t help them”. She went on to assert that, in the era of big data, the lack of such a resource is “criminal”, and she is absolutely right….

Around the same time as this important op-ed, the MIT Technology Review published their issue entitled “10 Breakthrough Technologies 2015” and on the list was the “Internet of DNA” (Ref. 2). While we are often reminded that the world we live in is becoming the “Internet of Things”, I have not seen this terminology applied to DNA before. The article on the “Internet of DNA” decried, “the unfolding calamity in genomics is that a great deal of life-saving information, though already collected, is inaccessible”. It called for a global network of millions of genomes and cited the Matchmaker Exchange as a frontrunner. For this international initiative, a growing number of research and clinical teams have come together to pool and exchange phenotypic and genotypic data for individual patients with rare disorders, in order to share this information and assist in the molecular diagnosis of individuals with rare diseases….

…an Internet of DNA — or what I have referred to as a massive, open, online medicine resource (MOOM) — would help to quickly identify the genetic cause of the disorder (Ref. 4) and, in the process of doing so, precious guidance for prevention, if necessary, would become available for such families who are currently left in the lurch as to their risk of suddenly dying.

So why aren’t such MOOMs being assembled? ….

There has also been much discussion related to privacy concerns that patients might be unwilling to participate in a massive medical information resource. However, multiple global consumer surveys have shown that more than 80% of individuals are ready to share their medical data provided that they are anonymized and their privacy maximally assured (Ref. 4). Indeed, just 24 hours into Apple’s ResearchKit initiative, a smartphone-based medical research programme, there were tens of thousands of patients with Parkinson disease, asthma or heart disease who had signed on. Some individuals are even willing to be “open source” — that is, to make their genetic and clinical data fully available with free access online, without any assurance of privacy. This willingness is seen by the participants in the recently launched Open Humans initiative. Along with the Personal Genome Project, Go Viral and American Gut have joined in this initiative. Still, studies suggest that most individuals would only agree to be medical research participants if their identities would not be attainable. Unfortunately, to date, little has been done to protect individual medical privacy, for which there are both promising new data protection technological approaches (Ref. 4) and the need for additional governmental legislation.

This leaves us with perhaps the major obstacle that is holding back the development of MOOMs — researchers. Even with big, team science research projects culling together hundreds of investigators and institutions throughout the world, such as the Global Alliance for Genomics and Health (GA4GH), the data obtained clinically are just as Laurie Becklund asserted in her op-ed — “one-person clinical trials” (Ref. 1). While undertaking the construction of a MOOM is a huge endeavour, there is little motivation for researchers to take on this task, as this currently offers no academic credit and has no funding source. But the transformative potential of MOOMs to improve medical care is extraordinary. Rather than having the knowledge die with each of us, the time has come to take down the walls of academic medical centres and health-care systems around the world, and create a global knowledge medical resource that leverages each individual’s information to help one another…(More)”

Americans’ Views on Open Government Data


The upshot has been the appearance of a variety of “open data” and “open government” initiatives throughout the United States that try to use data as a lever to improve government performance and encourage warmer citizens’ attitudes toward government.

This report is based on the first national survey that seeks to benchmark public sentiment about the government initiatives that use data to cultivate the public square. The survey, conducted by Pew Research Center in association with the John S. and James L. Knight Foundation, captures public views at the emergent moment when new technology tools and techniques are being used to disseminate and capitalize on government data and specifically looks at:

  • People’s level of awareness of government efforts to share data
  • Whether these efforts translate into people using data to track government performance
  • If people think government data initiatives have made, or have the potential to make, government perform better or improve accountability
  • The more routine kinds of government-citizen online interactions, such as renewing licenses or searching for the hours of public facilities.

The results cover all three levels of government in America — federal, state and local — and show that government data initiatives are in their early stages in the minds of most Americans. Generally, people are optimistic that these initiatives can make government more accountable, even though many are less sure open data will improve government performance. And government does touch people online, as evidenced by high levels of use of the internet for routine information applications. But most Americans have yet to delve too deeply into government data and its possibilities to closely monitor government performance.

Among the survey’s main findings:

As open data and open government initiatives get underway, most Americans are still largely engaged in “e-Gov 1.0” online activities, with far fewer attuned to “Data-Gov 2.0” initiatives that involve agencies sharing data online for public use….

Minorities of Americans say they pay a lot of attention to how governments share data with the public and relatively few say they are aware of examples where government has done a good (or bad) job sharing data. Less than one quarter use government data to monitor how government performs in several different domains….

Americans have mixed hopes about government data initiatives. People see the potential in these initiatives as a force to improve government accountability. However, the jury is still out for many Americans as to whether government data initiatives will improve government performance….

People’s baseline level of trust in government strongly shapes how they view the possible impact of open data and open government initiatives on how government functions…

Americans’ perspectives on trusting government are shaped strongly by partisan affiliation, which in turn makes a difference in attitudes about the impacts of government data initiatives…

Americans are for the most part comfortable with government sharing online data about their communities, although they sound cautionary notes when the data hits close to home…

Smartphone users have embraced information-gathering using mobile apps that rely on government data to function, but not many see a strong link between the underlying government data and economic value…

…(More)”

21st-Century Public Servants: Using Prizes and Challenges to Spur Innovation


Jenn Gustetic at the Open Government Initiative Blog: “Thousands of Federal employees across the government are using a variety of modern tools and techniques to deliver services more effectively and efficiently, and to solve problems that relate to the missions of their Agencies. These 21st-century public servants are accomplishing meaningful results by applying new tools and techniques to their programs and projects, such as prizes and challenges, citizen science and crowdsourcing, open data, and human-centered design.

Prizes and challenges have been a particularly popular tool at Federal agencies. With 397 prizes and challenges posted on challenge.gov since September 2010, there are hundreds of examples of the many different ways these tools can be designed for a variety of goals. For example:

  • NASA’s Mars Balance Mass Challenge: When NASA’s Curiosity rover pummeled through the Martian atmosphere and came to rest on the surface of Mars in 2012, about 300 kilograms of solid tungsten mass had to be jettisoned to ensure the spacecraft was in a safe orientation for landing. In an effort to seek creative concepts for small science and technology payloads that could potentially replace a portion of such jettisoned mass on future missions, NASA released the Mars Balance Mass Challenge. In only two months, over 200 concepts were submitted by over 2,100 individuals from 43 different countries for NASA to review. Proposed concepts ranged from small drones and 3D printers to radiation detectors and pre-positioning supplies for future human missions to the planet’s surface. NASA awarded the $20,000 prize to Ted Ground of Rising Star, Texas for his idea to use the jettisoned payload to investigate the Mars atmosphere in a way similar to how NASA uses sounding rockets to study Earth’s atmosphere. This was the first time Ted worked with NASA, and NASA was impressed by the novelty and elegance of his proposal: a proposal that NASA likely would not have received through a traditional contract or grant because individuals, as opposed to organizations, are generally not eligible to participate in those types of competitions.
  • National Institutes of Health (NIH) Breast Cancer Startup Challenge (BCSC): The primary goals of the BCSC were to accelerate the process of bringing emerging breast cancer technologies to market, and to stimulate the creation of start-up businesses around nine federally conceived and owned inventions, and one invention from an Avon Foundation for Women portfolio grantee.  While NIH has the capacity to enable collaborative research or to license technology to existing businesses, many technologies are at an early stage and are ideally suited for licensing by startup companies to further develop them into commercial products. This challenge established 11 new startups that have the potential to create new jobs and help promising NIH cancer inventions support the fight against breast cancer. The BCSC turned the traditional business plan competition model on its head to create a new channel to license inventions by crowdsourcing talent to create new startups.

These two examples of challenges are very different, in terms of their purpose and the process used to design and implement them. The success they have demonstrated shouldn’t be taken for granted. It takes access to resources (both information and people), mentoring, and practical experience both to understand how to identify opportunities for innovation tools, like prizes and challenges, and to use them to achieve a desired outcome….

Last month, the Challenge.gov program at the General Services Administration (GSA), the Office of Personnel Management (OPM)’s Innovation Lab, the White House Office of Science and Technology Policy (OSTP), and a core team of Federal leaders in the prize-practitioner community began collaborating with the Federal Community of Practice for Challenges and Prizes to develop the other half of the open innovation toolkit, the prizes and challenges toolkit. In developing this toolkit, OSTP and GSA are thinking not only about the information and process resources that would be helpful to empower 21st-century public servants using these tools, but also how we help connect these people to one another to add another meaningful layer to the learning environment…

Creating an inventory of skills and knowledge across the 600-person (and growing!) Federal community of practice in prizes and challenges will likely be an important resource in support of a useful toolkit. Prize design and implementation can involve tricky questions, such as:

  • Do I have the authority to conduct a prize or challenge?
  • How should I approach problem definition and prize design?
  • Can agencies own solutions that come out of challenges?
  • How should I engage the public in developing a prize concept or rules?
  • What types of incentives work best to motivate participation in challenges?
  • What legal requirements apply to my prize competition?
  • Can non-Federal employees be included as judges for my prizes?
  • How objective do the judging criteria need to be?
  • Can I partner to conduct a challenge? What’s the right agreement to use in a partnership?
  • Who can win prize money and who is eligible to compete? …(More)”

Data Science and Ebola


Inaugural Lecture by Aske Plaat on the acceptance of the position of professor of Data Science at the Universiteit Leiden: “…Today, everybody and everything produces data. People produce large amounts of data in social networks and in commercial transactions. Medical, corporate, and government databases continue to grow. Ten years ago there were a billion Internet users. Now there are more than three billion, most of whom are mobile. Sensors continue to get cheaper and are increasingly connected, creating an Internet of Things. The next three billion users of the Internet will not all be human, and will generate a large amount of data. In every discipline, large, diverse, and rich data sets are emerging, from astrophysics, to the life sciences, to medicine, to the behavioral sciences, to finance and commerce, to the humanities and to the arts. In every discipline people want to organize, analyze, optimize and understand their data to answer questions and to deepen insights. The availability of so much data and the ability to interpret it are changing the way the world operates. The number of sciences using this approach is increasing. The science that is transforming this ocean of data into a sea of knowledge is called data science. In many sciences the impact on the research methodology is profound—some even call it a paradigm shift.

…I will address the question of why there is so much interest in data. I will answer this question by discussing one of the most visible recent challenges to public health of the moment, the 2014 Ebola outbreak in West Africa…(More)”

Citizen Science for Citizen Access to Law


Paper by Michael Curtotti, Wayne Weibel, Eric McCreath, Nicolas Ceynowa, Sara Frug, and Tom R Bruce: “This paper sits at the intersection of citizen access to law, legal informatics and plain language. The paper reports the results of a joint project of the Cornell University Legal Information Institute and the Australian National University which collected thousands of crowdsourced assessments of the readability of law through the Cornell LII site. The aim of the project is to enhance accuracy in the prediction of the readability of legal sentences. The study requested readers on legislative pages of the LII site to rate passages from the United States Code and the Code of Federal Regulations and other texts for readability and other characteristics. The research provides insight into who uses legal rules and how they do so. The study enables conclusions to be drawn as to the current readability of law and spread of readability among legal rules. The research is intended to enable the creation of a dataset of legal rules labelled by human judges as to readability. Such a dataset, in combination with machine learning, will assist in identifying factors in legal language which impede readability and access for citizens. As far as we are aware, this research is the largest ever study of readability and usability of legal language and the first research which has applied crowdsourcing to such an investigation. The research is an example of the possibilities open for enhancing access to law through engagement of end users in the online legal publishing environment for enhancement of legal accessibility and through collaboration between legal publishers and researchers….(More)”
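The abstract above does not say which features or models the authors use to predict readability. Purely as an illustration of what an automated readability score for a legal sentence can look like, the sketch below computes the classic Flesch Reading Ease formula with a crude vowel-group syllable counter. The function names and both sample sentences are hypothetical, and this is not the method used in the paper.

```python
import re


def count_syllables(word: str) -> int:
    """Crude heuristic: one syllable per run of consecutive vowels, minimum one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores indicate easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


# Hypothetical sample sentences, invented for this illustration.
plain = "You must pay the fee. We will then send the form."
legal = (
    "Notwithstanding any other provision of this subchapter, the "
    "administrator shall promulgate regulations establishing eligibility "
    "requirements applicable to designated beneficiaries."
)

if __name__ == "__main__":
    print(f"plain: {flesch_reading_ease(plain):.1f}")
    print(f"legal: {flesch_reading_ease(legal):.1f}")
```

A formula like this only looks at sentence and word length, which is one reason a dataset of human readability judgements, as the paper proposes, could be useful for training and checking better predictors.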

White House Releases 150 Data Sets to Fight Climate Change


At GovTech: “To support the president’s Climate Data Initiative, the White House revealed on Tuesday, April 7, a series of data projects and partnerships that includes more than 150 new open data sets, as well as commitments from Google, Microsoft and others to cultivate climate analysis.

The undertakings were released at a White House climate and health conference where John Holdren, director of the White House Office of Science and Technology Policy, pressed the need for greater data to compel decreases to greenhouse emissions.

“This is a science-based administration, a fact-based administration, and our climate policies have to be based on fact, have to be based on data, and we want to make those data available to everybody,” Holdren said.

The data initiative touches multiple agencies — including NASA, the Centers for Disease Control and Prevention, the National Institutes of Health and the Environmental Protection Agency — and is part of the White House proclamation of a new National Public Health Week, from April 6 to April 12, to spur national health solutions and awareness.

The 150-plus data sets are all connected to health, and are among the 560 climate-related data sets available on Data.gov, the U.S. government’s open data portal. Accompanying the release, the Department of Health and Human Services added a Health Care Facilities Toolkit on Toolkit.climate.gov, a site that delivers climate resilience techniques, strategies, case studies and tools for organizations attempting climate change initiatives.

Holdren was followed by White House Chief Data Scientist D.J. Patil, who moderated a tech industry panel with representatives from Google, Microsoft and GIS mapping software company Esri.

Google Earth Outreach Program Manager Allison Lieber confirmed that Google will continue to provide assistance with 10 million hours for high-performance computing for climate data projects — down from 50 million in 2014 — and the company will likewise provide climate data hosting on Google Earth….(More)”

Big Data, Little Data, No Data


New book by Christine L. Borgman: “‘Big Data’ is on the covers of Science, Nature, the Economist, and Wired magazines, on the front pages of the Wall Street Journal and the New York Times. But despite the media hyperbole, as Christine Borgman points out in this examination of data and scholarly research, having the right data is usually better than having more data; little data can be just as valuable as big data. In many cases, there are no data—because relevant data don’t exist, cannot be found, or are not available. Moreover, data sharing is difficult, incentives to do so are minimal, and data practices vary widely across disciplines.

Borgman, an often-cited authority on scholarly communication, argues that data have no value or meaning in isolation; they exist within a knowledge infrastructure—an ecology of people, practices, technologies, institutions, material objects, and relationships. After laying out the premises of her investigation—six “provocations” meant to inspire discussion about the uses of data in scholarship—Borgman offers case studies of data practices in the sciences, the social sciences, and the humanities, and then considers the implications of her findings for scholarly practice and research policy. To manage and exploit data over the long term, Borgman argues, requires massive investment in knowledge infrastructures; at stake is the future of scholarship….(More)”

Discovering the Language of Data: Personal Pattern Languages and the Social Construction of Meaning from Big Data


Paper in Interdisciplinary Science Reviews: “This paper attempts to address two issues relevant to the sense-making of Big Data. First, it presents a case study for how a large dataset can be transformed into both a visual language and, in effect, a ‘text’ that can be read and interpreted by human beings. The case study comes from direct observation of graduate students at the IIT Institute of Design who investigated task-switching behaviours, as documented by productivity software on a single user’s laptop and a smart phone. Through a series of experiments with the resulting dataset, the team effects a transformation of that data into a catalogue of visual primitives — a kind of iconic alphabet — that allow others to ‘read’ the data as a corpus and, more provocatively, suggest the formation of a personal pattern language. Second, this paper offers a model for human-technical collaboration in the sense-making of data, as demonstrated by this and other teams in the class. Current sense-making models tend to be data- and technology-centric, and increasingly presume data visualization as a primary point of entry of humans into Big Data systems. This alternative model proposes that meaningful interpretation of data emerges from a more elaborate interplay between algorithms, data and human beings….(More)”

Eight ways to make government more experimental


Jonathan Breckon et al at NESTA: “When the banners and bunting have been tidied away after the May election, and a new bunch of ministers sit at their Whitehall desks, could they embrace a more experimental approach to government?

Such an approach requires a degree of humility.  Facing up to the fact that we don’t have all the answers for the next five years.  We need to test things out, evaluate new ways of doing things with the best of social science, and grow what works.  And drop policies that fail.

But how best to go about it?  Here are our 8 ways to make it a reality:

  1. Make failure OK. A more benign attitude to risk is central to experimentation.  As a 2003 Cabinet Office review entitled Trying it Out said, a pilot that reveals a policy to be flawed should be ‘viewed as a success rather than a failure, having potentially helped to avert a potentially larger political and/or financial embarrassment’. Pilots are particularly important in fast moving areas such as technology to try promising fresh ideas in real-time. Our ‘Visible Classroom’ pilot tried an innovative approach to teacher CPD developed from technology for television subtitling.
  2. Avoid making policies that are set in stone.  Allowing policy to be more project–based, flexible and time-limited could encourage room for manoeuvre, according to a previous Nesta report State of Uncertainty; Innovation policy through experimentation.  The Department for Work and Pensions’ Employment Retention and Advancement pilot scheme to help people back to work was designed to influence the shape of legislation. It allowed for amendments and learning as it was rolled out.  We need more policy experiments like this.
  3. Work with the grain of the current policy environment. Experimenters need to be opportunists. We need to be nimble and flexible. Ready to seize windows of opportunity to experiment. Some services have to be rolled out in stages due to budget constraints. This offers opportunities to try things out before going national. For instance, the Mexican Oportunidades anti-poverty experiments, which eventually reached 5.8 million households in all Mexican states, had to be trialled first in a handful of areas. Greater devolution is creating a patchwork of different policy priorities, funding and delivery models – so-called ‘natural experiments’. Let’s seize the opportunity to deliberately test and compare across different jurisdictions. What about a trial of basic income in Northern Ireland, for example, along the lines of recent Finnish proposals, or universal free childcare in Scotland?
  4. Experiments need the most robust and appropriate evaluation methods such as, if appropriate, Randomised Controlled Trials. Other methods, such as qualitative research may be needed to pry open the ‘black box’ of policies – to learn about why and how things are working. Civil servants should use the government trial advice panel as a source of expertise when setting up experiments.
  5. Grow the public debate about the importance of experimentation. Facebook had to apologise after a global backlash to psychological experiments on 689,000 of its web users. Approval by ethics committees – normal practice for trials in hospitals and universities – is essential, but we can’t just rely on experts. We need a dedicated public understanding of experimentation programmes, perhaps run by Evidence Matters or Ask for Evidence campaigns at Sense about Science. Taking part in an experiment in itself can be a learning opportunity creating an appetite amongst the public, something we have found from running an RCT with schools.
  6. Create ‘Skunkworks’ institutions. New or improved institutional structures within government can also help with experimentation. The Behavioural Insights Team, located in Nesta, operates a classic ‘skunkworks’ model, semi-detached from day-to-day bureaucracy. The nine UK What Works Centres help try things out semi-detached from central power, such as the Education Endowment Foundation, which sources innovations widely from across the public and private sectors, including Nesta, rather than generating ideas exclusively in house or in government.
  7. Find low-cost ways to experiment. People sometimes worry that trials are expensive and complicated. This does not have to be the case. Experiments to encourage organ donation by the Government Digital Service and Behavioural Insights Team involved an estimated cost of £20,000. This was because the digital experiments didn’t involve setting up expensive new interventions – just changing messages on web pages for existing services. Some programmes do, however, need significant funding to evaluate and budgets need to be found for it. A memo from the White House Office of Management and Budget has asked for new Government schemes seeking funding to allocate a proportion of their budgets to ‘randomized controlled trials or carefully designed quasi-experimental techniques’.
  8. Be bold. A criticism of some experiments is that they only deal with the margins of policy and delivery. Government officials and researchers should set up more ambitious experiments on nationally important big-ticket issues, from counter-terrorism to innovation in jobs and housing….(More)”
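For readers unfamiliar with the mechanics behind points 4 and 7 above, the following is a minimal sketch of a randomised controlled trial: units are randomly split into two arms, the difference in mean outcomes is computed, and a permutation test asks how often a difference that large would arise by chance. All function names, parameters and data here are hypothetical, and the sketch is not taken from the Nesta article or from any of the trials it mentions.

```python
import random
from statistics import mean


def assign_groups(unit_ids, seed=0):
    """Randomly split units into a treatment arm and a control arm."""
    rng = random.Random(seed)
    shuffled = list(unit_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return set(shuffled[:half]), set(shuffled[half:])


def difference_in_means(outcomes, treatment, control):
    """Mean outcome in the treatment arm minus mean outcome in the control arm."""
    return mean(outcomes[u] for u in treatment) - mean(outcomes[u] for u in control)


def permutation_p_value(outcomes, treatment, control, n_permutations=2000, seed=1):
    """Share of random relabellings with a difference at least as large as observed."""
    rng = random.Random(seed)
    observed = abs(difference_in_means(outcomes, treatment, control))
    units = sorted(treatment) + sorted(control)
    n_treated = len(treatment)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(units)
        diff = abs(difference_in_means(outcomes, units[:n_treated], units[n_treated:]))
        if diff >= observed:
            extreme += 1
    return extreme / n_permutations


if __name__ == "__main__":
    # Hypothetical example: 20 web-page visitors; outcome is 1 if they signed up.
    visitors = range(20)
    treatment, control = assign_groups(visitors, seed=42)
    # Invented outcomes in which the new message always works; real data would be noisier.
    outcomes = {v: (1 if v in treatment else 0) for v in visitors}
    print("effect:", difference_in_means(outcomes, treatment, control))
    print("p-value:", permutation_p_value(outcomes, treatment, control))
```

Because the assignment is random, the two arms should differ only by chance apart from the intervention itself, which is why changing a message on an existing web page can be evaluated this cheaply.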

New Desktop Application Has Potential to Increase Asteroid Detection, Now Available to Public


NASA Press Release: “A software application based on an algorithm created by a NASA challenge has the potential to increase the number of new asteroid discoveries by amateur astronomers.

Analysis of images taken of our solar system’s main belt asteroids between Mars and Jupiter using the algorithm showed a 15 percent increase in positive identification of new asteroids.

During a panel Sunday at the South by Southwest Festival in Austin, Texas, NASA representatives discussed how citizen scientists have made a difference in asteroid hunting. They also announced the release of a desktop software application developed by NASA in partnership with Planetary Resources, Inc., of Redmond, Washington. The application is based on an Asteroid Data Hunter-derived algorithm that analyzes images for potential asteroids. It’s a tool that can be used by amateur astronomers and citizen scientists.

The Asteroid Data Hunter challenge was part of NASA’s Asteroid Grand Challenge. The data hunter contest series, which was conducted in partnership with Planetary Resources under a Space Act Agreement, was announced at the 2014 South by Southwest Festival and concluded in December. The series offered a total of $55,000 in awards for participants to develop significantly improved algorithms to identify asteroids in images captured by ground-based telescopes. The winning solutions of each piece of the contest combined to create an application using the best algorithm that increased the detection sensitivity, minimized the number of false positives, ignored imperfections in the data, and ran effectively on all computer systems.

“The Asteroid Grand Challenge is seeking non-traditional partnerships to bring the citizen science and space enthusiast community into NASA’s work,” said Jason Kessler, program executive for NASA’s Asteroid Grand Challenge. “The Asteroid Data Hunter challenge has been successful beyond our hopes, creating something that makes a tangible difference to asteroid hunting astronomers and highlights the possibility for more people to play a role in protecting our planet.”…

The new asteroid hunting application can be downloaded at:

http://topcoder.com/asteroids

For information about NASA’s Asteroid Grand Challenge, visit:

http://www.nasa.gov/asteroidinitiative