The Impact of Open: Keeping you healthy


From the Sunlight Foundation: “In healthcare, the goal-set shared widely throughout the field is known as “the Triple Aim”: improving individual experience of care, improving population health, and reducing the cost of care. Across the wide array of initiatives undertaken by health care data users, the great majority seem to fall within the scope of at least one aspect of the Triple Aim. Below is a set of examples that reveal how data — both open and not — is being used to achieve its elements.

The use of open data to reduce costs:

The use of open data to improve quality of care:

  • Using open data on a substantial series of individual hospital quality measures, CMS created a hospital comparison tool that allows consumers to compare average quality of care outcomes across their local hospitals.

  • Non-profit organizations survey hospitals and have used this data to provide another national measure of hospital quality that consumers can use to select a high-quality hospital.

  • In New York state, widely-shared data on cardiac surgery outcomes associated with individual providers has led to improved outcomes and better understanding of successful techniques.

  • In the UK, the National Health Service is actively working towards defining concrete metrics to evaluate how the system as a whole is moving towards improved quality. …

  • The broad cultural shift towards data-sharing in healthcare appears to have facilitated additional secure sharing in pursuit of the joint goal of improving healthcare quality and effectiveness. The current effort to securely network millions of patient data records through the federal PCORI system has the potential to advance understanding of disease treatment at an unprecedented pace.

  • Through third-party tools, people are able to use the products of aggregated patient data in order to begin diagnosing their own symptoms more accurately, giving them a head start in understanding how to optimize their visit to a provider.

The use of open data to improve population health:

  • Out of the three elements of the triple aim, population health may have the longest and deepest relationship with open data. Public datasets like those collected by the Centers for Disease Control and the US Census have for decades been used to monitor disease prevalence, verify access to health insurance, and track mortality and morbidity statistics.

  • Population health improvement has been a major focus for newer developments as well. Health data has been a regular feature in tech efforts to improve the ways that governments — including local health departments — reach their constituencies. The use of data in new communication tools improves population health by increasing population awareness of local health trends and disease prevention opportunities. Two examples of this work in action include the Chicago Health Atlas, which combines health data and healthcare consumer problem-solving, and Philadelphia’s map interface to city data about available flu vaccines.

One final observation for open data advocates to take from health data concerns the way the sector encourages two-way information flow: it embraces the notion that data users can also be data producers. Open data ecosystems are properly characterized by multi-directional relationships among governmental and non-governmental actors, with opportunities for feedback, correction and augmentation of open datasets. That this happens at the scale of health data is important and meaningful for open data advocates, who can face push-back when they ask their governments to ingest externally-generated data….”

The Data Revolution in Policy-Making


at the Open Institute: “There continues to be a great deal of dialogue and debate on what the data revolution from the report of the High Level Panel on the Post-2015 Development Agenda is all about. However, some have raised concerns that the emerging narrative around opening up data, strengthening national statistics offices or building capacity for e-government may not be revolutionary enough. In thinking this through, it becomes clear that revolutions are highly contextual events. The Arab Spring happened due to the unique factors of the cultural and socio-economic environment in the Middle East and North Africa (MENA). A similar ‘spring’ may not happen in the same way in sub-Saharan Africa due to the peculiarities of the region. Attempting to replicate it is therefore an exercise in futility for those hoping for regime change.
We have just published a think piece on the role of public participation in policy-making and how a data revolution could play out in that space. None of the ideas are revolutionary. They have been proposed and piloted in various countries to various extents over time. For instance, in some contexts, strengthening and safeguarding the autonomy of the national statistics office may not seem revolutionary, but in some countries it may be unprecedented (this is not part of the report). And that is okay. Nation states should be allowed, in their efforts to build capable and developmental institutions, to interpret the revolution for themselves.
In sub-Saharan Africa the availability of the underlying data used to develop public policy is almost non-existent. Even when citizens are expected to participate in the formulation and implementation of policies, the data is still difficult to find. This neuters public participation and is a disservice to open government. Making this detailed data and the accompanying rationale publicly available would therefore be a revolutionary change in both culture and policy on access to information, and could empower citizens to participate.
The data revolution is an opportunity to mainstream statistics into public discourse on public policy in ways that citizens can understand and engage with. I hope African countries will be willing to put in the effort to translate the data revolution into an African revolution. If not, there is a risk we shall continue singing about a revolution and never actually have one.
Download the ThinkPiece here”

15 Ways to bring Civic Innovation to your City


Chris Moore at AcuitasGov: “In my previous blog post I wrote about a desire to see our governments transform to be part of the 21st century. I saw a recent reference to how governments across Canada have lost their global leadership, and how government at all levels in Canada is providing analog services to a digital society. I couldn’t agree more. I have been thinking lately about some practical ways that Mayors and City Managers could innovate in their communities. I realize that a number of municipal elections are happening this fall across Canada, a time when leadership changes and new ideas emerge. So this blog is also for mayoral candidates who have a sense that technology and innovation have a role to play in their city and in their administration.
I thought I would identify 15 initiatives that cities could pursue as part of their Civic Innovation Strategy. For the last 50 years, technology in local government in Canada has been viewed as an expense, a necessary evil not always understood by elected officials and senior administrators. Information and technology is part of every aspect of a city; it is critical to delivering services. It is time to think of this not just as an expense but as an investment, a way to innovate, reduce costs, enhance citizen service delivery and transform government operations.
Here are my top 15 ways to bring Civic Innovation to your city:
1. Build 21st Century Digital Infrastructure like the Chattanooga Gig City Project.
2. Build WiFi networks like the City of Edmonton on your own and in partnership with others.
3. Provide technology and internet to children and youth in need like the City of Toronto.
4. Connect to a national Education and Research network like Cybera in Alberta and CANARIE.
5. Create a Mayor’s Task Force on Innovation and Technology, leveraging your city’s resources.
6. Run a hackathon or two or three like the City of Glasgow or maybe host a hacking health event like the City of Vancouver.
7. Launch a Startup incubator like Startup Edmonton or take it to the next level and create a civic lab like the City of Barcelona.
8. Develop an Open Government Strategy; I like the Open City Strategy from Edmonton.
9. If Open Government is too much, then just start with Open Data; Edmonton has one of the best.
10. Build a Citizen Dashboard to showcase your city’s services and commitment to the public.
11. Put your Crime data online like the Edmonton Police Service.
12. Consider a pilot project with sensor technology for parking like the City of Nice or for  waste management like the City of Barcelona.
13. Embrace Car2Go, Modo and UBER as ways to move people in your city.
14. Consider turning your IT department into the Innovation and Technology Department like they did at the City of Chicago.
15. Partner with other nearby local governments to create a shared Innovation and Technology agency.
Now more than ever, cities need to find ways to innovate, to transform and to create a foundation that is sustainable. Now is the time for both courage and innovation in government. What is your city doing to move into the 21st Century?”

Index: The Networked Public


The Living Library Index – inspired by the Harper’s Index – provides important statistics and highlights global trends in governance innovation. This installment focuses on the networked public and was originally published in 2014.

Global Overview

  • The proportion of global population who use the Internet in 2013: 38.8%, up 3 percentage points from 2012
  • Increase in average global broadband speeds from 2012 to 2013: 17%
  • Percent of internet users surveyed globally that access the internet at least once a day in 2012: 96
  • Hours spent online in 2012 each month across the globe: 35 billion
  • Country with the highest online population, as a percent of total population in 2012: United Kingdom (85%)
  • Country with the lowest online population, as a percent of total population in 2012: India (8%)
  • Trend with the highest growth rate in 2012: Location-based services (27%)
  • Years to reach 50 million users: telephone (75), radio (38), TV (13), internet (4)

Growth Rates in 2014

  • Rate at which the total number of Internet users is growing: less than 10% a year
  • Worldwide annual smartphone growth: 20%
  • Tablet growth: 52%
  • Mobile phone growth: 81%
  • Percentage of all mobile users who are now smartphone users: 30%
  • Amount of all web usage in 2013 accounted for by mobile: 14%
  • Amount of all web usage in 2014 accounted for by mobile: 25%
  • Percentage of money spent on mobile used for app purchases: 68%
  • Growth in Bitcoin wallets between 2013 and 2014: an 8x increase
  • Number of listings on AirBnB in 2014: 550k, 83% growth year on year
  • How many buyers are on Alibaba in 2014: 231MM buyers, 44% growth year on year

Social Media

  • Number of Whatsapp messages on average sent per day: 50 billion
  • Number sent per day on Snapchat: 1.2 billion
  • How many restaurants are registered on GrubHub in 2014: 29,000
  • Amount the sale of digital songs fell in 2013: 6%
  • How much song streaming grew in 2013: 32%
  • Number of photos uploaded and shared every day on Flickr, Snapchat, Instagram, Facebook and Whatsapp combined in 2014: 1.8 billion
  • How many online adults in the U.S. use a social networking site of some kind: 73%
  • Those who use multiple social networking sites: 42%
  • Dominant social networking platform: Facebook, with 71% of online adults
  • Number of Facebook users in 2004, its founding year: 1 million
  • Number of monthly active users on Facebook in September 2013: 1.19 billion, an 18% increase year-over-year
  • How many Facebook users log in to the site daily: 63%
  • Instagram users who log into the service daily: 57%
  • Twitter users who are daily visitors: 46%
  • Number of photos uploaded to Facebook every minute: over 243,000, up 16% from 2012
  • How much of the global internet population is actively using Twitter every month: 21%
  • Number of tweets per minute: 350,000, up 250% from 2012
  • Fastest growing demographic on Twitter: 55-64 year age bracket, up 79% from 2012
  • Fastest growing demographic on Facebook: 45-54 year age bracket, up 46% from 2012
  • How many LinkedIn accounts are created every minute: 120, up 20% from 2012
  • The number of Google searches in 2013: 3.5 million, up 75% from 2012
  • Percent of internet users surveyed globally that use social media in 2012: 90
  • Percent of internet users surveyed globally that use social media daily: 60
  • Time spent social networking, the most popular online activity: 22%, followed by searches (21%), reading content (20%), and emails/communication (19%)
  • The average age at which a child acquires an online presence through their parents in 10 mostly Western countries: six months
  • Number of children in those countries who have a digital footprint by age 2: 81%
  • How many new American marriages between 2005-2012 began by meeting online, according to a nationally representative study: more than one-third 
  • How many of the world’s 505 leaders are on Twitter: 3/4
  • Combined Twitter followers of 505 world leaders: 106 million
  • Combined Twitter followers of Justin Bieber, Katy Perry, and Lady Gaga: 122 million
  • How many times all Wikipedias are viewed per month: nearly 22 billion times
  • How many hits per second: more than 8,000 
  • English Wikipedia’s share of total page views: 47%
  • Number of articles in the English Wikipedia in December 2013: over 4,395,320 
  • Platform that reaches more U.S. adults between ages 18-34 than any cable network: YouTube
  • Number of unique users who visit YouTube each month: more than 1 billion
  • How many hours of video are watched on YouTube each month: over 6 billion, 50% more than 2012
  • Proportion of YouTube traffic that comes from outside the U.S.: 80%
  • Most common activity online, based on an analysis of over 10 million web users: social media
  • People on Twitter who recommend products in their tweets: 53%
  • People who trust online recommendations from people they know: 90%

Mobile and the Internet of Things

  • Number of global smartphone users in 2013: 1.5 billion
  • Number of global mobile phone users in 2013: over 5 billion
  • Percent of U.S. adults that have a cell phone in 2013: 91
  • Share of those that are smartphones: almost two-thirds
  • Mobile Facebook users in March 2013: 751 million, 54% increase since 2012
  • Global mobile traffic as a percentage of global internet traffic as of May 2013: 15%, up from 0.9% in 2009
  • How many smartphone owners ages 18–44 “keep their phone with them for all but two hours of their waking day”: 79%
  • Those who reach for their smartphone immediately upon waking up: 62%
  • Those who couldn’t recall a time their phone wasn’t within reach or in the same room: 1 in 4
  • Facebook users who access the service via a mobile device: 73.44%
  • Those who are “mobile only”: 189 million
  • Amount of YouTube’s global watch time that is on mobile devices: almost 40%
  • Number of objects connected globally in the “internet of things” in 2012: 8.7 billion
  • Number of connected objects so far in 2013: over 10 billion
  • Years from tablet introduction for tablets to surpass desktop PC and notebook shipments: less than 3 (over 55 million global units shipped in 2013, vs. 45 million notebooks and 35 million desktop PCs)
  • Number of wearable devices estimated to have been shipped worldwide in 2011: 14 million
  • Projected number of wearable devices in 2016: between 39 million and 171 million
  • How much of the wearable technology market is in the healthcare and medical sector in 2012: 35.1%
  • How many devices in the wearable tech market are fitness or activity trackers: 61%
  • The value of the global wearable technology market in 2012: $750 million
  • The forecasted value of the market in 2018: $5.8 billion
  • How many Americans are aware of wearable tech devices in 2013: 52%
  • Devices that have the highest level of awareness: wearable fitness trackers
  • Level of awareness of wearable fitness trackers among American consumers: 1 in 3
  • Value of digital fitness category in 2013: $330 million
  • How many American consumers surveyed are aware of smart glasses: 29%
  • Smart watch awareness amongst those surveyed: 36%

Access

  • How much of the developed world has mobile broadband subscriptions in 2013: 3/4
  • How much of the developing world has broadband subscription in 2013: 1/5
  • Percent of U.S. adults that had a laptop in 2012: 57
  • How many American adults did not use the internet at home, at work, or via mobile device in 2013: one in five
  • Amount of spending President Obama initiated in 2009 in an effort to expand access: $7 billion
  • Number of Americans potentially shut off from jobs, government services, health care and education, among other opportunities due to digital inequality: 60 million
  • American adults with a high-speed broadband connection at home as of May 2013: 7 out of 10
  • Americans aged 18-29 vs. 65+ with a high-speed broadband connection at home as of May 2013: 80% vs. 43%
  • American adults with college education (or more) vs. adults with no high school diploma that have a high-speed broadband connection at home as of May 2013: 89% vs. 37%
  • Percent of U.S. adults with college education (or more) that use the internet in 2011: 94
  • Those with no high school diploma that used the internet in 2011: 43
  • Percent of white American households that used the internet in 2013: 67
  • Black American households that used the internet in 2013: 57
  • States with lowest internet use rates in 2013: Mississippi, Alabama and Arkansas
  • How many American households have only wireless telephones as of the second half of 2012: nearly two in five
  • States with the highest prevalence of wireless-only adults according to predictive modeling estimates: Idaho (52.3%), Mississippi (49.4%), Arkansas (49%)
  • Those with the lowest prevalence of wireless-only adults: New Jersey (19.4%), Connecticut (20.6%), Delaware (23.3%) and New York (23.5%)


A Big Day for Big Data: The Beginning of Our Data Transformation


Mark Doms, Under Secretary for Economic Affairs at the US Department of Commerce: “Wednesday, June 18, 2014, was a big day for big data.  The Commerce Department participated in the inaugural Open Data Roundtable at the White House, with GovLab at NYU and the White House Office of Science and Technology Policy. The event brought businesses and non-profit organizations that rely on Commerce data together with Commerce Department officials to discuss how to make the data we collect and release easier to find, understand and use.  This initiative has significant potential to fuel new businesses; create jobs; and help federal, state and local governments make better decisions.
Data innovation is revolutionizing every aspect of our society and government data is playing a major role in the revolution. From the National Oceanic and Atmospheric Administration’s (NOAA’s) climate data to the U.S. Census Bureau’s American Community Survey, the U.S. Patent and Trademark Office (USPTO) patent and trademark records, and National Institute of Standards and Technology (NIST) research, companies, organizations and people are using this information to innovate, grow our economy and better plan for the future.
 At this week’s Open Data 500, some key insights I came away with include: 

  • There is a strong desire for data consistency across the Commerce Department, and indeed the federal government. 
  • Data should be catalogued in a common, machine-readable format. 
  • Data should be accessible in bulk, allowing the private sector greater flexibility to harness the information. 
  • The use of a single platform for access to government data would create efficiencies and help coordination across agencies.

Furthermore, business leaders stand ready to help us achieve these goals.
Secretary Pritzker is the first Secretary of Commerce to make data a departmental priority in the Commerce Department’s Strategic Plan, and has branded Commerce as “America’s Data Agency.” In keeping with that mantra, over the next several months, my team at the Economics and Statistics Administration (ESA), which includes the Bureau of Economic Analysis and the U.S. Census Bureau, will be involved in similar forums.  We will be engaging our users – businesses, academia, advocacy organizations, and state and local governments – to drive this open data conversation forward. 
Today was a big first step in that process. The insight gained will help inform our efforts ahead. Thanks again to the team at GovLab and the White House for their hard work in making it possible!”

Open for Business: How Open Data Can Help Achieve the G20 Growth Target


New Report commissioned by Omidyar Network on the Business Case for Open Data: “Economic analysis has confirmed the significant contribution to economic growth and productivity achievable through an open data agenda. Governments, the private sector, individuals and communities all stand to benefit from the innovation and information that will inform investment, drive the creation of new industries, and inform decision making and research. To mark a step change in the way valuable information is created and reused, the G20 should release information as open data.
In May 2014, Omidyar Network commissioned Lateral Economics to undertake economic analysis of the potential of open data to support the G20’s 2% growth target and to illustrate how an open data agenda can make a significant contribution to economic growth and productivity. Combining all G20 economies, output could increase by USD 13 trillion cumulatively over the next five years. Implementation of open data policies would thus boost cumulative G20 GDP by around 1.1 percentage points, almost 55% of the G20’s 2% growth target, over five years.
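A quick check of the arithmetic behind that last figure, using only the two numbers quoted above (this is a gloss on the summary, not a calculation reproduced from the report itself):

```latex
% Estimated open data boost as a share of the G20 growth target
\[
\frac{1.1\ \text{percentage points}}{2.0\ \text{percentage points}} = 0.55 \approx 55\%
\]
```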
Recommendations
Importantly, open data cuts across a number of this year’s G20 priorities: attracting private infrastructure investment, creating jobs and lifting participation, strengthening tax systems and fighting corruption. This memo suggests an open data thread that runs across all G20 priorities. The more data is opened, the more it can be used, reused, repurposed and built on—in combination with other data—for everyone’s benefit.
We call on G20 economies to sign up to the Open Data Charter.
The G20 should ensure that data released by G20 working groups and themes is in line with agreed open data standards. This will lead to more accountable, efficient and effective governments that go further to expose inadequacy, fight corruption and spur innovation.
Data is a national resource and open data is a ‘win-win’ policy. It is about making more of existing resources. We know that the cost of opening data is smaller than the economic returns, which could be significant. Methods to respect privacy concerns must be taken into account. If this is done, as the public and private sector share of information grows, there will be increasing positive returns.
The G20 opportunity
This November, leaders of the G20 Member States will meet in Australia to drive forward commitments made in the St Petersburg G20 Leaders Declaration last September and to make firm progress on stimulating growth. Actions across the G20 will include increasing investment, lifting employment and participation, enhancing trade and promoting competition.
The resulting ‘Brisbane Action Plan’ will encapsulate all of these commitments with the aim of raising the level of G20 output by at least 2% above the currently projected level over the next five years. There are major opportunities for cooperative and collective action by G20 governments.
Governments should intensify the release of existing public sector data – both government and publicly funded research data. But much more can be done to promote open data than simply releasing more government data. In appropriate circumstances, governments can mandate public disclosure of private sector data (e.g. in corporate financial reporting).
Recommendations for action

  • G20 governments should adopt the principles of the Open Data Charter to encourage the building of stronger, more interconnected societies that better meet the needs of our citizens and allow innovation and prosperity to flourish.
  • G20 governments should adopt specific open data targets under each G20 theme, as illustrated below, such as releasing open data related to beneficial owners of companies, as well as revenues from extractive industries.
  • G20 governments should consider harmonising licensing regimes across the G20.
  • G20 governments should adopt metrics for measuring the quantity and quality of open data publication, e.g. using the Open Data Institute’s Open Data Certificates as a bottom-up mechanism for driving the adoption of common standards.

Illustrative G20 examples
Fiscal and monetary policy
Governments possess rich real-time data that is not open or accessed by government macro-economic managers. G20 governments should:

  • Open up models that lie behind economic forecasts and help assess alternative policy settings;
  • Publish spending and contractual data to enable government to comparison-shop between suppliers.

Anti-corruption
Open data may directly contribute to reduced corruption by increasing the likelihood that corruption will be detected. G20 governments should:

  • Release open data related to beneficial owners of companies, as well as revenues from extractive industries.
  • Collaborate on harmonised technical standards that permit the tracing of international money flows – including the tracing of beneficial owners of commercial entities, and the comparison and reconciliation of transactions across borders.

Trade
Obtaining and using trade data from multiple jurisdictions is difficult. Access fees, specific licenses, and non-machine readable formats all involve large transaction costs. G20 governments should:

  • Harmonise open data policies related to trade data.
  • Use standard trade schema and formats.

Employment
Higher quality information on employment conditions would facilitate better matching of employees to organizations, producing greater job-satisfaction and improved productivity. G20 governments should:

  • Open up centralised job vacancy registers to provide new mechanisms for people to find jobs.
  • Provide open statistical information about the demand for skills in particular areas to help those supporting training and education to hone their offerings.

Energy
Open data will help reduce the cost of energy supply and improve energy efficiency. G20 governments should:

  • Provide incentives for energy companies to publish open data from consumers and suppliers to enable cost savings through optimizing energy plans.
  • Release energy performance certifications for buildings.
  • Publish real-time energy consumption for government buildings.

Infrastructure
Current infrastructure asset information is fragmented and inefficient. Exposing current asset data would be a significant first step in understanding gaps and providing new insights. G20 governments should:

  • Publish open data on governments’ infrastructure assets and plans to better understand infrastructure gaps, enable greater efficiency and insight in infrastructure development and use, and analyse costs and benefits.
  • Publish open infrastructure data, including contracts via Open Contracting Partnership, in a consistent and harmonised way across G20 countries…”

Lawsuit Would Force IRS to Release Nonprofit Tax Forms Digitally


Suzanne Perry at the Chronicle of Philanthropy on how “Open Data Could Shine a Light on Pay and Lobbying”: “Nonprofits that want to find out what their peers are doing can find a wealth of information in the forms the groups must file each year with the Internal Revenue Service—how much they pay their chief executives, how much they spend on fundraising, who is on their boards, where they offer services.
But the way the IRS makes those data available harks back to the digital dark ages, and critics who want to overhaul the system have been shaking up the generally polite nonprofit world with legal challenges, charges of monopoly, and talk of “disrupting” the status quo.
The issue will take center stage in a courtroom this week when a federal district judge in San Francisco is scheduled to consider arguments about whether to approve the IRS’s move to dismiss a lawsuit filed by an open-records group.
The group wants to obtain some specific Forms 990, the informational tax documents filed by nonprofits, in a format that can be read by computers.
In theory, that shouldn’t be difficult since the nine nonprofits involved — including the American National Standards Institute, the New Horizons Foundation, and the International Code Council — submitted the forms electronically. But the IRS converts all 990s, no matter how they were filed, into images, rendering them useless for digital operations like searching multiple forms for information.
That means watchdog groups and those that provide information on charities, like Charity Navigator, GuideStar, and the Urban Institute, have to spend money to manually enter the data they get from the IRS before making it available to the public, even if it has previously been digitized.
The lawsuit against the IRS, filed by Public.Resource.Org, aims to end that practice.
Carl Malamud, who heads the group, is a longtime activist who successfully pushed the Securities and Exchange Commission to post corporate filings free online in the 1990s, among other projects.
He wants to do the same with the IRS, arguing that data should be readily available at no cost about a sector that represents more than 1.5 million tax-exempt organizations and more than $1.5 trillion in revenue…”

Putting Open Data to Work for Communities


Report by Kathryn L.S. Pettit, Leah Hendey, Brianna Losoya, and G. Thomas Kingsley at the Urban Institute: “The National Neighborhood Indicators Partnership (NNIP) is a network of local organizations that collect, organize, and use neighborhood data to tackle issues in their communities. As the movement for government transparency has spread at the local level, more NNIP partners are participating in the call for governments to release data and are using open data to provide information for decisionmaking and community engagement. Local NNIP partners and open data advocates have complementary strengths and should work together to more effectively advance open government data that benefits all residents.”

Big Data, My Data


Jane Sarasohn-Kahn  at iHealthBeat: “The routine operation of modern health care systems produces an abundance of electronically stored data on an ongoing basis,” Sebastian Schneeweis writes in a recent New England Journal of Medicine Perspective.
Is this abundance of data a treasure trove for improving patient care and growing knowledge about effective treatments? Is that data trove a Pandora’s black box that can be mined by obscure third parties to benefit for-profit companies without rewarding those whose data are said to be the new currency of the economy? That is, patients themselves?
In this emerging world of data analytics in health care, there’s Big Data and there’s My Data (“small data”). Who most benefits from the use of My Data may not actually be the consumer.
Big focus on Big Data. Several reports published in the first half of 2014 talk about the promise and perils of Big Data in health care. The Federal Trade Commission’s study, titled “Data Brokers: A Call for Transparency and Accountability,” analyzed the business practices of nine “data brokers,” companies that buy and sell consumers’ personal information from a broad array of sources. Data brokers sell consumers’ information to buyers looking to use those data for marketing, managing financial risk or identifying people. There are health implications in all of these activities, and the use of such data generally is not covered by HIPAA. The report discusses the example of a data segment called “Smoker in Household,” which a company selling a new air filter for the home could use to target-market to an individual who might seek such a product. On the downside, without the consumers’ knowledge, the information could be used by a financial services company to identify the consumer as a bad health insurance risk.
“Big Data and Privacy: A Technological Perspective,” a report from the President’s Office of Science and Technology Policy, considers the growth of Big Data’s role in helping inform new ways to treat diseases and presents two scenarios of the “near future” of health care. The first, on personalized medicine, recognizes that not all patients are alike or respond identically to treatments. Data collected from a large number of similar patients (such as digital images, genomic information and granular responses to clinical trials) can be mined to develop a treatment with an optimal outcome for the patients. In this case, patients may have provided their data based on the promise of anonymity but would like to be informed if a useful treatment has been found. In the second scenario, detecting symptoms via mobile devices, people wishing to detect early signs of Alzheimer’s Disease in themselves use a mobile device connecting to a personal coach in the Internet cloud that supports and records activities of daily living: say, gait when walking, notes on conversations and physical navigation instructions. For both of these scenarios, the authors ask, “Can the information about individuals’ health be sold, without additional consent, to third parties? What if this is a stated condition of use of the app? Should information go to the individual’s personal physicians with their initial consent but not a subsequent confirmation?”
The World Privacy Forum’s report, titled “The Scoring of America: How Secret Consumer Scores Threaten Your Privacy and Your Future,” describes the growing market for developing indices on consumer behavior, identifying over a dozen health-related scores. Health scores include the Affordable Care Act Individual Health Risk Score, the FICO Medication Adherence Score, various frailty scores, personal health scores (from WebMD and OneHealth, whose default sharing setting is based on the user’s sharing setting with the RunKeeper mobile health app), Medicaid Resource Utilization Group Scores, the SF-36 survey on physical and mental health and complexity scores (such as the Aristotle score for congenital heart surgery). WPF presents a history of consumer scoring beginning with the FICO score for personal creditworthiness and recommends regulatory scrutiny on the new consumer scores for fairness, transparency and accessibility to consumers.
At the same time these three reports went to press, scores of news stories emerged discussing the Big Opportunities Big Data present. The June issue of CFO Magazine published a piece called “Big Data: Where the Money Is.” InformationWeek published “Health Care Dives Into Big Data,” Motley Fool wrote about “Big Data’s Big Future in Health Care” and WIRED called “Cloud Computing, Big Data and Health Care” the “trifecta.”
Well-timed on June 5, the Office of the National Coordinator for Health IT’s Roadmap for Interoperability was detailed in a white paper, titled “Connecting Health and Care for the Nation: A 10-Year Vision to Achieve an Interoperable Health IT Infrastructure.” The document envisions the long view for the U.S. health IT ecosystem enabling people to share and access health information, ensuring quality and safety in care delivery, managing population health, and leveraging Big Data and analytics. Notably, “Building Block #3” in this vision is ensuring privacy and security protections for health information. ONC will “support developers creating health tools for consumers to encourage responsible privacy and security practices and greater transparency about how they use personal health information.” Looking forward, ONC notes the need for “scaling trust across communities.”
Consumer trust: going, going, gone? In the stakeholder community of U.S. consumers, there is declining trust between people and the companies and government agencies with whom people deal. Only 47% of U.S. adults trust companies with whom they regularly do business to keep their personal information secure, according to a June 6 Gallup poll. Furthermore, 37% of people say this trust has decreased in the past year. Who’s most trusted to keep information secure? Banks and credit card companies come in first place, trusted by 39% of people, and health insurance companies come in second, trusted by 26% of people.
Trust is a basic requirement for health engagement. Health researchers need patients to share personal data to drive insights, knowledge and treatments back to the people who need them. PatientsLikeMe, the online social network, launched the Data for Good project to inspire people to share personal health information imploring people to “Donate your data for You. For Others. For Good.” For 10 years, patients have been sharing personal health information on the PatientsLikeMe site, which has developed trusted relationships with more than 250,000 community members…”

Why Statistically Significant Studies Aren’t Necessarily Significant


Michael White in PSMagazine on how modern statistics have made it easier than ever for us to fool ourselves: “Scientific results often defy common sense. Sometimes this is because science deals with phenomena that occur on scales we don’t experience directly, like evolution over billions of years or molecules that span billionths of meters. Even when it comes to things that happen on scales we’re familiar with, scientists often draw counter-intuitive conclusions from subtle patterns in the data. Because these patterns are not obvious, researchers rely on statistics to distinguish the signal from the noise. Without the aid of statistics, it would be difficult to convincingly show that smoking causes cancer, that drugged bees can still find their way home, that hurricanes with female names are deadlier than ones with male names, or that some people have a precognitive sense for porn.
OK, very few scientists accept the existence of precognition. But Cornell psychologist Daryl Bem’s widely reported porn precognition study illustrates the thorny relationship between science, statistics, and common sense. While many criticisms were leveled against Bem’s study, in the end it became clear that the study did not suffer from an obvious killer flaw. If it hadn’t dealt with the paranormal, it’s unlikely that Bem’s work would have drawn much criticism. As one psychologist put it after explaining how the study went wrong, “I think Bem’s actually been relatively careful. The thing to remember is that this type of fudging isn’t unusual; to the contrary, it’s rampant–everyone does it. And that’s because it’s very difficult, and often outright impossible, to avoid.”…
That you can lie with statistics is well known; what is less commonly noted is how much scientists still struggle to define proper statistical procedures for handling the noisy data we collect in the real world. In an exchange published last month in the Proceedings of the National Academy of Sciences, statisticians argued over how to address the problem of false positive results, statistically significant findings that on further investigation don’t hold up. Non-reproducible results in science are a growing concern; so do researchers need to change their approach to statistics?
Valen Johnson, at Texas A&M University, argued that the commonly used threshold for statistical significance isn’t as stringent as scientists think it is, and therefore researchers should adopt a tighter threshold to better filter out spurious results. In reply, statisticians Andrew Gelman and Christian Robert argued that tighter thresholds won’t solve the problem; they simply “dodge the essential nature of any such rule, which is that it expresses a tradeoff between the risks of publishing misleading results and of important results being left unpublished.” The acceptable level of statistical significance should vary with the nature of the study. Another team of statisticians raised a similar point, arguing that a more stringent significance threshold would exacerbate the worrying publishing bias against negative results. Ultimately, good statistical decision making “depends on the magnitude of effects, the plausibility of scientific explanations of the mechanism, and the reproducibility of the findings by others.”
However, arguments over statistics usually occur because it is not always obvious how to make good statistical decisions. Some bad decisions are clear. As xkcd’s Randall Munroe illustrated in his comic on the spurious link between green jelly beans and acne, most people understand that if you keep testing slightly different versions of a hypothesis on the same set of data, sooner or later you’re likely to get a statistically significant result just by chance. This kind of statistical malpractice is called fishing or p-hacking, and most scientists know how to avoid it.
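The multiple-comparisons problem behind the jelly bean comic is easy to demonstrate with a short simulation. The sketch below is a minimal illustration: the 20 colors, the group sizes and the 0.05 threshold are assumptions chosen for the example, not figures from the article.

```python
# Minimal sketch of the multiple-comparisons problem: 20 subgroup tests on pure noise.
# All parameters here are illustrative assumptions, not values from the article.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_colors, n_per_group, alpha, n_trials = 20, 200, 0.05, 5_000

trials_with_false_alarm = 0
for _ in range(n_trials):
    for _ in range(n_colors):
        # "Acne scores" for people who ate this color vs. those who did not.
        # Both groups come from the same distribution, so the true effect is zero.
        treated = rng.normal(0.0, 1.0, n_per_group)
        control = rng.normal(0.0, 1.0, n_per_group)
        _, p_value = stats.ttest_ind(treated, control)
        if p_value < alpha:
            trials_with_false_alarm += 1
            break

print(f"At least one 'significant' color in {trials_with_false_alarm / n_trials:.0%} of trials")
# Roughly 1 - 0.95**20, i.e. about 64%, far above the nominal 5% error rate.
```

Running it shows that at least one "significant" color turns up in well over half of the simulated studies, even though every true effect is zero by construction, which is exactly the trap the comic lampoons.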
But there are more subtle forms of the problem that pervade the scientific literature. In an unpublished paper (PDF), statisticians Andrew Gelman, at Columbia University, and Eric Loken, at Penn State, argue that researchers who deliberately avoid p-hacking still unknowingly engage in a similar practice. The problem is that one scientific hypothesis can be translated into many different statistical hypotheses, with many chances for a spuriously significant result. After looking at their data, researchers decide which statistical hypothesis to test, but that decision is skewed by the data itself.
To see how this might happen, imagine a study designed to test the idea that green jellybeans cause acne. There are many ways the results could come out statistically significant in favor of the researchers’ hypothesis. Green jellybeans could cause acne in men, but not in women, or in women but not men. The results may be statistically significant if the jellybeans you call “green” include Lemon Lime, Kiwi, and Margarita but not Sour Apple. Gelman and Loken write that “researchers can perform a reasonable analysis given their assumptions and their data, but had the data turned out differently, they could have done other analyses that were just as reasonable in those circumstances.” In the end, the researchers may explicitly test only one or a few statistical hypotheses, but their decision-making process has already biased them toward the hypotheses most likely to be supported by their data. The result is “a sort of machine for producing and publicizing random patterns.”
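Gelman and Loken's point is subtler: the false-positive rate can be inflated even when only a single test is ever reported, because the choice of which test to run is made after looking at the data. The sketch below is again a hedged illustration with made-up groupings and parameters, not an analysis from their paper: it peeks at null data, picks the comparison with the largest apparent difference, and reports just that one t-test.

```python
# Minimal sketch of the "garden of forking paths": only ONE test is reported per study,
# but which test gets run depends on the data. All numbers are illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, alpha, n_trials = 200, 0.05, 5_000

false_positives = 0
for _ in range(n_trials):
    # Null data: the outcome is unrelated to any of the groupings below.
    outcome = rng.normal(size=n)
    sex = rng.integers(0, 2, size=n)      # e.g. men vs. women
    flavor = rng.integers(0, 2, size=n)   # e.g. one grouping of "green" flavors vs. another

    # Analyses a researcher might find "reasonable" after seeing the data:
    candidate_splits = [
        (outcome[sex == 0], outcome[sex == 1]),
        (outcome[flavor == 0], outcome[flavor == 1]),
        (outcome[: n // 2], outcome[n // 2:]),  # early vs. late participants
    ]
    # Peek at the data, keep the split with the biggest apparent difference...
    a, b = max(candidate_splits, key=lambda ab: abs(ab[0].mean() - ab[1].mean()))
    # ...and report a single, individually "reasonable" t-test on that split.
    _, p_value = stats.ttest_ind(a, b)
    if p_value < alpha:
        false_positives += 1

print(f"The single reported test is 'significant' in {false_positives / n_trials:.0%} of trials")
# Noticeably above the nominal 5%, even though only one hypothesis was formally tested each time.
```

The reported p-value looks legitimate in isolation; the bias lives entirely in the data-dependent choice of which comparison to make.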
Gelman and Loken are not alone in their concern. Last year Daniele Fanelli, at the University of Edinburgh, and John Ioannidis, at Stanford University, reported that many U.S. studies, particularly in the social sciences, may overestimate the effect sizes of their results. “All scientists have to make choices throughout a research project, from formulating the question to submitting results for publication.” These choices can be swayed “consciously or unconsciously, by scientists’ own beliefs, expectations, and wishes, and the most basic scientific desire is that of producing an important research finding.”
What is the solution? Part of the answer is to not let measures of statistical significance override our common sense—not our naïve common sense, but our scientifically-informed common sense…”