I’ve just finished two packed days at the Health Datapalooza, put on by the Health Data Consortium with the Department of Health and Human Services. As I’ve just heard someone say, many of the 2,000 people here are a bit “Palooza’d out.” But this fourth annual event shows the growing power of open government data on health and health care services. The two-day event covered both the knowledge and applications that can come from the release of data such as Medicare claims data, and the ways in which the Affordable Care Act is driving the use of data for better delivery of high-quality care. The participation of leaders from the United Kingdom’s National Health Service added an international perspective as well.
There’s too much to summarize in a single blog post, but you can follow these links to read about the Health Data Consortium and its new CEO’s goals; the Datapalooza’s opening plenary session, with luminaries from government, business, and the New Yorker; and today’s keynote by Todd Park, with reflections on some of the new companies that open government data is supporting.
– Joel Gurin, GovLab network member and Founder and Editor, OpenDataNow.com
The Crowdstorm Effect
Peter Ryder and Shaun Abrahamson in Innovation Excellence: “When we open up the innovation process to talent outside our organization we are trying to channel the abilities of a lot of people we don’t know, in the hope that a few of them have ideas we need. Crowdsourcing is the term most closely associated with the process. But over the last decade, many organizations have been not only sourcing ideas from crowds but also getting feedback on ideas….
We call the intersection of lower transaction costs and brainstorming at scale enabled by online connections crowdstorming.
Getting ideas, getting feedback, identifying talent to work with, filtering ideas, earning media, enabling stakeholders to select ideas to change the organization/stakeholder relationship — the crowd’s role and the crowdstorming process have become more complex as they have expanded to involve external talent in new ways. …
Seventy-five years ago, the British economist Ronald Coase suggested that high transaction costs (the overhead to find, recruit, negotiate, and contract with talent) required organizations to bring the best talent in house. While Coase’s equation still holds true, the Internet has allowed organizations to revisit under what conditions they want and need full-time employees. When we have the ability to efficiently tap resources anywhere, anytime, at low cost, new opportunities emerge.”
Is Crowdsourcing the Future for Crime Investigation?
Joe Harris in IFSEC Global: “Following April’s Boston Marathon bombings, many people around the world wanted to help in any way they could. Previously, there would have been little but financial assistance that they could have offered.
However, with the advent of high-quality cameras on smartphone devices, and services such as YouTube and Flickr, it was not long before the well-known online collectives such as Reddit and 4chan mobilized members of the public to ask them to review hundreds of thousands of photos and videos taken on the day to try and identify potential suspects….Here in the UK, we recently had the successful launch of Facewatch, and we have seen other regional attempts — such as Greater Manchester Police’s services and appeals app — to use the goodwill of members of the public to help trace, identify, or report suspected criminals and the crimes that they commit.
Does this herald a new era in transparency? Are we seeing the first steps towards a more transparent future where rapid information flow means that there really is nowhere to hide? Or are we instead falling into some Orwellian society construct where people are scared to speak out or think for themselves?”
Why Big Data Is Not Truth
Quentin Hardy in the New York Times: “Kate Crawford, a researcher at Microsoft Research, calls the problem “Big Data fundamentalism — the idea that with larger data sets, we get closer to objective truth.” Speaking at a conference in Berkeley, Calif., on Thursday, she identified what she calls “six myths of Big Data.”
Myth 1: Big Data is New
In 1997, there was a paper that discussed the difficulty of visualizing Big Data, and in 1999, a paper that discussed the problems of gaining insight from the numbers in Big Data. That indicates that two prominent issues in Big Data today, display and insight, have been around for a while…
Myth 2: Big Data Is Objective
Over 20 million Twitter messages about Hurricane Sandy were posted last year. … “These were very privileged urban stories.” And some people, privileged or otherwise, put information like their home addresses on Twitter in an effort to seek aid. That sensitive information is still out there, even though the threat is gone.
Myth 3: Big Data Doesn’t Discriminate
“Big Data is neither color blind nor gender blind,” Ms. Crawford said. “We can see how it is used in marketing to segment people.” …
Myth 4: Big Data Makes Cities Smart
… moving cities toward digital initiatives like predictive policing, or creating systems where people are seen, whether they like it or not, can promote lots of tension between individuals and their governments.
Myth 5: Big Data Is Anonymous
A study published in Nature last March looked at 1.5 million phone records that had personally identifying information removed. It found that just four data points of when and where a call was made could identify 95 percent of individuals. …
Myth 6: You Can Opt Out
… given the ways that information can be obtained in these big systems, “what are the chances that your personal information will never be used?”
Before Big Data disappears into the background as another fact of life, Ms. Crawford said, “We need to think about how we will navigate these systems. Not just individually, but as a society.”
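To make Myth 5 concrete, here is a minimal, hypothetical sketch in Python (toy data, not the Nature study’s dataset or method) of why a few coarse points of when and where a call was made can single out one person in a “de-identified” table:

# Toy call records: user -> set of (hour, cell tower) observations.
records = {
    "alice": {(8, "tower_3"), (9, "tower_7"), (13, "tower_7"), (18, "tower_1")},
    "bob":   {(8, "tower_3"), (10, "tower_2"), (13, "tower_5"), (19, "tower_1")},
    "carol": {(9, "tower_7"), (11, "tower_4"), (13, "tower_7"), (20, "tower_6")},
}

def matching_users(points):
    """Return the users whose traces contain every given (hour, tower) point."""
    return [user for user, trace in records.items() if points <= trace]

# One point can match several people; each added point narrows the match quickly.
print(matching_users({(9, "tower_7")}))                  # ['alice', 'carol']
print(matching_users({(9, "tower_7"), (8, "tower_3")}))  # ['alice']

With only three users the effect is exaggerated, but the study’s finding is the same idea at scale: a handful of spatio-temporal points is usually enough to isolate a single trace.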
Complex Algorithm Auto-Writes Books, Could Transform Science
Mashable: “Could a sophisticated algorithm be the future of science? One innovative economist thinks so.
Phil Parker, who holds a doctorate in business economics from the Wharton School, has built an algorithm that auto-writes books. Now he’s taking that model and applying it to loftier goals than simply penning periodicals: namely, medicine and forensics. Working with professors and researchers at NYU, Parker is trying to decode complex genetic structures and find cures for diseases. And he’s doing it with the help of man’s real best friend: technology.
Parker’s recipe is a complex computer program that mimics formulaic writing….
Parker’s been at this for years. His formula, originally used for printing, is able to churn out entire books in minutes. It’s similar to the work being done by Narrative Science and StatSheet, except those companies are known for short form auto-writing for newspapers. Parker’s work is much longer, focusing on obscure non-fiction and even poetry.
It’s not creative writing, though, and Parker isn’t interested in introspection, exploring emotion or storytelling. He’s interested in exploiting reproducible patterns — that’s how his algorithm can find, collect and “write” so quickly. And how he can apply that model to other disciplines, like science.
Parker’s method seems to be a success; indeed, his ICON Group International, Inc., has auto-written so many books that Parker has lost count. But this isn’t the holy grail of literature, he insists. Instead, he says, his work is a play on mechanizing processes to create a simple formula. And he thinks that “finding new knowledge structures within data” stretches far beyond print.”
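As a rough illustration of what “exploiting reproducible patterns” can mean in practice, here is a toy template-filling sketch in Python; it is not Parker’s proprietary system, and the data rows are invented placeholders:

# A fixed prose pattern filled from structured data: the essence of formulaic auto-writing.
TEMPLATE = (
    "The {year} outlook for {product} estimates a market of {size}, "
    "with {region} accounting for the largest regional share."
)

# Placeholder rows standing in for a structured data source; the figures are made up.
facts = [
    {"product": "widget coatings", "size": "$1 billion", "year": 2013, "region": "Asia-Pacific"},
    {"product": "gasket sealants", "size": "$2 billion", "year": 2013, "region": "Europe"},
]

for row in facts:
    print(TEMPLATE.format(**row))

Multiply the templates and the underlying data source by several orders of magnitude and the economics of auto-writing thousands of narrow titles becomes plausible.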
New Book: Digital Methods
New book by Richard Rogers, Director of the Govcom.org Foundation (Amsterdam) and the Digital Methods Initiative: “In Digital Methods, Richard Rogers proposes a methodological outlook for social and cultural scholarly research on the Web that seeks to move Internet research beyond the study of online culture. It is not a toolkit for Internet research, or operating instructions for a software package; it deals with broader questions. How can we study social media to learn something about society rather than about social media use? How can hyperlinks reveal not just the value of a Web site but the politics of association? Rogers proposes repurposing Web-native techniques for research into cultural change and societal conditions. We can learn to reapply such “methods of the medium” as crawling and crowdsourcing, PageRank and similar algorithms, tag clouds and other visualizations; we can learn how they handle hits, likes, tags, date stamps, and other Web-native objects. By “thinking along” with devices and the objects they handle, digital research methods can follow the evolving methods of the medium.
Rogers uses this new methodological outlook to examine the findings of inquiries into 9/11 search results, the recognition of climate change skeptics by climate-change-related Web sites, the events surrounding the Srebrenica massacre according to Dutch, Serbian, Bosnian, and Croatian Wikipedias, presidential candidates’ social media “friends,” and the censorship of the Iranian Web. With Digital Methods, Rogers introduces a new vision and method for Internet research and at the same time applies them to the Web’s objects of study, from tiny particles (hyperlinks) to large masses (social media).”
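One of the “methods of the medium” named above, PageRank, is compact enough to sketch. This is a minimal, hypothetical Python version (a toy link graph and plain power iteration, not Rogers’s tooling) showing how the same scoring a search engine uses could instead rank sites within an issue space:

# PageRank by power iteration over a toy hyperlink graph of hypothetical sites.
links = {
    "ngo.example":  ["gov.example", "news.example"],
    "gov.example":  ["ngo.example"],
    "news.example": ["ngo.example", "gov.example"],
}

def pagerank(links, damping=0.85, iterations=50):
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        # Each page keeps a baseline share and passes the rest along its outlinks.
        new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
        for page, outlinks in links.items():
            share = damping * rank[page] / len(outlinks)
            for target in outlinks:
                new_rank[target] += share
        rank = new_rank
    return rank

for page, score in sorted(pagerank(links).items(), key=lambda item: -item[1]):
    print(page, round(score, 3))

Repurposed this way, the scores describe which actors dominate a linked debate rather than which page best answers a search query.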
Techs and the City
Alec Appelbaum, who teaches at Pratt Institute, in The New York Times: “This spring New York City is rolling out its much-ballyhooed bike-sharing program, which relies on a sophisticated set of smartphone apps and other digital tools to manage it. The city isn’t alone: across the country, municipalities are buying ever more complicated technological “solutions” for urban life.
But higher tech is not always essential tech. Cities could instead be making savvier investments in cheaper technology that may work better to stoke civic involvement than the more complicated, expensive products being peddled by information-technology developers….
To be sure, big tech can zap some city weaknesses. According to I.B.M., its predictive-analysis technology, which examines historical data to estimate the next crime hot spots, has helped Memphis lower its violent crime rate by 30 percent.
But many problems require a decidedly different approach. Take the seven-acre site in Lower Manhattan called the Seward Park Urban Renewal Area, where 1,000 mixed-income apartments are set to rise. A working-class neighborhood that fell to bulldozers in 1969, it stayed bare as co-ops nearby filled with affluent families, including my own.
In 2010, with the city ready to invite developers to bid for the site, long-simmering tensions between nearby public-housing tenants and wealthier dwellers like me turned suddenly — well, civil.
What changed? Was it some multimillion-dollar “open democracy” platform from Cisco, or a Big Data program to suss out the community’s real priorities? Nope. According to Dominic Pisciotta Berg, then the chairman of the local community board, it was plain old e-mail, and the dialogue it facilitated. “We simply set up an e-mail box dedicated to receiving e-mail comments” on the renewal project, and organizers would then “pull them together by comment type and then consolidate them for display during the meetings,” he said. “So those who couldn’t be there had their voices considered and those who were there could see them up on a screen and adopted, modified or rejected.”
Through e-mail conversations, neighbors articulated priorities — permanently affordable homes, a movie theater, protections for small merchants — that even a supercomputer wouldn’t necessarily have identified in the data.
The point is not that software is useless. But like anything else in a city, it’s only as useful as its ability to facilitate the messy clash of real human beings and their myriad interests and opinions. And often, it’s the simpler software, the technology that merely puts people in contact and steps out of the way, that works best.”
The Dictatorship of Data
Kenneth Cukier and Viktor Mayer-Schönberger in MIT Technology Review: “Big data is poised to transform society, from how we diagnose illness to how we educate children, even making it possible for a car to drive itself. Information is emerging as a new economic input, a vital resource. Companies, governments, and even individuals will be measuring and optimizing everything possible.
But there is a dark side. Big data erodes privacy. And when it is used to make predictions about what we are likely to do but haven’t yet done, it threatens freedom as well. Yet big data also exacerbates a very old problem: relying on the numbers when they are far more fallible than we think. Nothing underscores the consequences of data analysis gone awry more than the story of Robert McNamara.”
Empowering Consumers through the Smart Disclosure of Data
OSTP: “Today, the Administration’s interagency National Science and Technology Council released Smart Disclosure and Consumer Decision Making: Report of the Task Force on Smart Disclosure—the first comprehensive description of the Federal Government’s efforts to promote the smart disclosure of information that can help consumers make wise decisions in the marketplace.
Whether they are searching for colleges, health insurance, credit cards, airline flights, or energy providers, consumers can find it difficult to identify the specific product or service that best suits their particular needs. In some cases, the effort required to sift through all of the available information is so large that consumers make decisions using inadequate information. As a result, they may overpay, miss out on a product that would better meet their needs, or be surprised by fees.
The report released today outlines ways in which Federal agencies and other governmental and non-governmental organizations can use—and in many cases are already using—smart disclosure approaches that increase market transparency and empower consumers facing complex choices in domains such as health, education, energy and personal finance.”
A Data-Powered Revolution in Health Care
Todd Park @ White House Blog: “Thomas Friedman’s New York Times column, “Obamacare’s Other Surprise,” highlights a rising tide of innovation that has been unleashed by the Affordable Care Act and the Administration’s health IT and data initiatives. Supported by digital data, new data-driven tools, and payment policies that reward improving the quality and value of care, doctors, hospitals, patients, and entrepreneurs across the nation are demonstrating that smarter, better, more accessible, and more proactive care is the best way to improve quality and control health care costs.
We are witnessing the emergence of a data-powered revolution in health care. Catalyzed by the Recovery Act, adoption of electronic health records is increasing dramatically. More than half of all doctors and other eligible providers and nearly 80 percent of hospitals are using electronic health records to improve care, an increase of more than 200 percent since 2008. In addition, the Administration’s Health Data Initiative is making a growing supply of key government data on everything from hospital charges and quality to regional health care system performance statistics freely available in computer-readable, downloadable form, as fuel for innovation, entrepreneurship, and discovery.