Developing Public Policy To Advance The Use Of Big Data In Health Care


Paper by Axel Heitmueller et al. in Health Affairs:  “The vast amount of health data generated and stored around the world each day offers significant opportunities for advances such as the real-time tracking of diseases, predicting disease outbreaks, and developing health care that is truly personalized. However, capturing, analyzing, and sharing health data is difficult, expensive, and controversial. This article explores four central questions that policy makers should consider when developing public policy for the use of “big data” in health care. We discuss what aspects of big data are most relevant for health care and present a taxonomy of data types and levels of access. We suggest that successful policies require clear objectives and provide examples, discuss barriers to achieving policy objectives based on a recent policy experiment in the United Kingdom, and propose levers that policy makers should consider using to advance data sharing. We argue that the case for data sharing can be won only by providing real-life examples of the ways in which it can improve health care.”

The Rise of Data Poverty in America


Report by Daniel Castro for the Center for Data Innovation: “Data-driven innovations offer enormous opportunities to advance important societal goals. However, to take advantage of these opportunities, individuals must have access to high-quality data about themselves and their communities. If certain groups routinely do not have data collected about them, their problems may be overlooked and their communities held back in spite of progress elsewhere. Given this risk, policymakers should begin a concerted effort to address the “data divide”—the social and economic inequalities that may result from a lack of collection or use of data about individuals or communities.”

When Big Data Maps Your Safest, Shortest Walk Home


Sarah Laskow at NextCity: “Boston University and University of Pittsburgh researchers are trying to do the same thing that got the creators of the app SketchFactor into so much trouble over the summer. They’re trying to show people how to avoid dangerous spots on city streets while walking from one place to another.
“What we are interested in is finding paths that offer trade-offs between safety and distance,” Esther Galbrun, a postdoc at Boston University, recently said in New York at the 3rd International Workshop on Urban Computing, held in conjunction with KDD2014.
She was presenting “Safe Navigation in Urban Environments,” which describes a set of algorithms that would give a person walking through a city options for getting from one place to another — the shortest path, the safest path and a number of alternatives that balance the two factors. The paper takes existing algorithms, well defined in theory — nothing new or fancy, Galbrun says — and applies them to a problem that people face every day.
Imagine, she suggests, that a person is standing at the Philadelphia Museum of Art, and he wants to walk home, to his place on Wharton Street. (Galbrun and her colleagues looked at Philadelphia and Chicago because those cities have made their crime data openly available.) The walk is about three miles, and one option would be to take the shortest path back. But maybe he’s worried about safety. Maybe he’s willing to take a little bit of a longer walk if it means he has to worry less about crime. What route should he take then?
Services like Google Maps have excelled at finding the shortest, most direct routes from Point A to Point B. But, increasingly, urban computing is looking to capture other aspects of moving about a place. “Fast is only one option,” says co-author Konstantinos Pelechrinis. “There are noble objectives beyond the surface path that you can put inside this navigation problem.” You might look for the path that will burn the most calories; a Yahoo! lab has considered how to send people along the most scenic route.
But working on routes that do more than give simple directions can have its pitfalls. The SketchFactor app relies both on crime data, when it’s available, and crowdsourced comments to reveal potential trouble spots to users. When it was released this summer, tech reporters and other critics immediately started talking about how it could easily become a conduit for racism. (“Sketchy” is, after all, a very subjective measure.)
So far, though, the problem with the SketchFactor app is less that it offers racially skewed perspectives than that the information it does offer is pretty useless — if entertaining. A pinpoint marked “very sketchy” is just as likely to flag an incident like a Jewish man eating pork products or hipster kids making too much noise as it is to flag a mugging.
Here, then, is a clear example of how Big Data has an advantage over Big Anecdata. The SafePath set-up measures risk more objectively and elegantly. It pulls in openly available crime data and considers simple data like time, location and types of crime. While a crime occurs at a discrete point, the researchers wanted to estimate the risk of a crime on every street, at every point. So they use a mathematical tool that smooths out the crime data over the space of the city and allows them to measure the relative risk of witnessing a crime on every street segment in a city….”
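The trade-off Galbrun describes can be captured with a standard weighted shortest-path search: score each street segment by its length plus a penalty proportional to its estimated crime risk, and vary the penalty weight to move between the fastest and the safest route. The sketch below is not the researchers' SafePath code; it is a minimal illustration using the networkx library, with invented segment lengths and risk values.

```python
# A minimal sketch (not the paper's implementation): trade off distance vs. risk
# on a street graph. Edge "length" is in meters; "risk" is a hypothetical
# kernel-smoothed crime density for that street segment.
import networkx as nx

def build_toy_street_graph():
    G = nx.Graph()
    # (u, v, length_m, risk) -- toy numbers, not real Philadelphia data
    edges = [
        ("museum", "a", 400, 0.10),
        ("a", "home", 600, 0.80),   # short but through a high-risk segment
        ("museum", "b", 500, 0.05),
        ("b", "c", 500, 0.05),
        ("c", "home", 450, 0.05),   # longer but safer detour
    ]
    for u, v, length, risk in edges:
        G.add_edge(u, v, length=length, risk=risk)
    return G

def best_path(G, source, target, alpha):
    """alpha = 0 gives the shortest path; larger alpha weights safety more."""
    cost = lambda u, v, d: d["length"] + alpha * d["risk"] * d["length"]
    return nx.shortest_path(G, source, target, weight=cost)

if __name__ == "__main__":
    G = build_toy_street_graph()
    for alpha in (0.0, 5.0):
        path = best_path(G, "museum", "home", alpha)
        print(f"alpha={alpha}: {' -> '.join(path)}")
```

With the risk weight at zero the search returns the direct route; raising it pushes the route onto the longer, lower-risk detour, which is the family of alternatives the paper offers a walker.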

What Is Big Data?


datascience@berkeley Blog: ““Big Data.” It seems like the phrase is everywhere. The term was added to the Oxford English Dictionary in 2013, appeared in Merriam-Webster’s Collegiate Dictionary by 2014, and Gartner’s just-released 2014 Hype Cycle shows “Big Data” passing the “Peak of Inflated Expectations” and on its way down into the “Trough of Disillusionment.” Big Data is all the rage. But what does it actually mean?
A commonly repeated definition cites the three Vs: volume, velocity, and variety. But others argue that it’s not the size of data that counts, but the tools being used, or the insights that can be drawn from a dataset.
To settle the question once and for all, we asked 40+ thought leaders in publishing, fashion, food, automobiles, medicine, marketing and every industry in between how exactly they would define the phrase “Big Data.” Their answers might surprise you! Take a look below to find out what big data is:

  1. John Akred, Founder and CTO, Silicon Valley Data Science
  2. Philip Ashlock, Chief Architect of Data.gov
  3. Jon Bruner, Editor-at-Large, O’Reilly Media
  4. Reid Bryant, Data Scientist, Brooks Bell
  5. Mike Cavaretta, Data Scientist and Manager, Ford Motor Company
  6. Drew Conway, Head of Data, Project Florida
  7. Rohan Deuskar, CEO and Co-Founder, Stylitics
  8. Amy Escobar, Data Scientist, 2U
  9. Josh Ferguson, Chief Technology Officer, Mode Analytics
  10. John Foreman, Chief Data Scientist, MailChimp

FULL LIST at datascience@berkeley Blog”

Big Data and Chicago's Traffic-cam Scandal


Holman Jenkins in the Wall Street Journal: “The danger is microscopic regulation that we invite via the democratic process.
Big data techniques are new in the world. It will take time to know how to feel about them and whether and how they should be legally corralled. For sheer inanity, though, there’s no beating a recent White House report quivering about the alleged menace of “digital redlining,” or the use of big-data marketing tactics in ways that supposedly disadvantage minority groups.
This alarm rests on an extravagant misunderstanding. Redlining was a crude method banks used to avoid losses in bad neighborhoods even at the cost of missing some profitable transactions—exactly the inefficiency big data is meant to improve upon. Failing to lure an eligible customer into a sale, after all, is hardly the goal of any business.
The real danger of the new technologies lies elsewhere, which the White House slightly touches upon in some of its fretting about police surveillance. The danger is microscopic regulation of our daily activities that we will invite on ourselves through the democratic process.
Soon it may be impossible to leave our homes without our movements being tracked by traffic and security cameras able to read license plates, identify faces and pull up data about any individual, from social media postings to credit reports.
Private businesses are just starting to use these techniques to monitor shoppers in front of shelves of goodies. Towns and cities have already embraced such techniques as revenue grabs, encouraged by private contractors peddling automated traffic cameras.
Witness a festering Chicago scandal. This month came federal indictments of a former city bureaucrat, an outside consultant, and the former CEO of Redflex Traffic Systems, the company that operated the city’s traffic cameras until last year….”

The Changing Nature of Privacy Practice


Numerous commenters have observed that Facebook, among many marketers (including political campaigns like U.S. President Barack Obama’s), regularly conducts A-B tests and other research to measure how consumers respond to different products, messages and messengers. So what makes the Facebook-Cornell study different from what goes on all the time in an increasingly data-driven world? After all, the ability to conduct such testing continuously on a large scale is considered one of the special features of big data.
The answer calls for broader judgments than parsing the language of privacy policies or managing compliance with privacy laws and regulations. Existing legal tools such as notice-and-choice and use limitations are simply too narrow to address the array of issues presented and inform the judgment needed. Deciding whether Facebook ought to participate in research like its newsfeed study is not really about what the company can do but what it should do.
As Omer Tene and Jules Polonetsky, CIPP/US, point out in an article on Facebook’s research study, “Increasingly, corporate officers find themselves struggling to decipher subtle social norms and make ethical choices that are more befitting of philosophers than business managers or lawyers.” They add, “Going forward, companies will need to create new processes, deploying a toolbox of innovative solutions to engender trust and mitigate normative friction.” Tene and Polonetsky themselves have proposed a number of such tools. In recent comments on Consumer Privacy Bill of Rights legislation filed with the Commerce Department, the Future of Privacy Forum (FPF) endorsed the use of internal review boards along the lines of those used in academia for human-subject research. The FPF also submitted an initial framework for benefit-risk analysis in the big data context “to understand whether assuming the risk is ethical, fair, legitimate and cost-effective.” Increasingly, companies and other institutions are bringing to bear more holistic review of privacy issues. Conferences and panels on big data research ethics are proliferating.
The expanding variety and complexity of data uses also call for a broader public policy approach. The Obama administration’s Consumer Privacy Bill of Rights (of which I was an architect) adapted existing Fair Information Practice Principles to a principles-based approach that is intended not as a formalistic checklist but as a set of principles that work holistically in ways that are “flexible” and “dynamic.” In turn, much of the commentary submitted to the Commerce Department on the Consumer Privacy Bill of Rights addressed the question of the relationship between these principles and a “responsible use framework” as discussed in the White House Big Data Report….”
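For readers unfamiliar with the mechanics, the A/B testing mentioned at the top of this excerpt usually comes down to comparing response rates between two randomly assigned groups. The snippet below is a generic, minimal sketch of that comparison with invented numbers; it is not Facebook's, Cornell's, or any campaign's actual tooling.

```python
# A minimal, generic sketch of the kind of A/B test the passage describes:
# compare response rates for two message variants with a two-proportion z-test.
from math import sqrt
from statistics import NormalDist

def ab_test(clicks_a, n_a, clicks_b, n_b):
    p_a, p_b = clicks_a / n_a, clicks_b / n_b
    p_pool = (clicks_a + clicks_b) / (n_a + n_b)          # pooled rate under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))          # two-sided
    return p_a, p_b, z, p_value

# Hypothetical numbers: variant B lifts the response rate from 2.0% to 2.4%.
print(ab_test(clicks_a=400, n_a=20_000, clicks_b=480, n_b=20_000))
```

The statistics are routine; as the passage argues, the harder questions are about whether and how such experiments should be run at all.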

Detroit and Big Data Take on Blight


Susan Crawford in Bloomberg View: “The urban blight that has been plaguing Detroit was, until very recently, made worse by a dearth of information about the problem. No one could tell how many buildings needed fixing or demolition, or how effectively city services were being delivered to them (or not). Today, thanks to the combined efforts of a scrappy small business, tech-savvy city leadership and substantial philanthropic support, the extent of the problem is clear.
The question now is whether Detroit has the heart to use the information to make hard choices about its future.
In the past, when the city foreclosed on properties for failure to pay back taxes, it had no sense of where those properties were clustered. The city would auction off the houses for the bargain-basement price of $500 each, but the auction was entirely undocumented, so neighbors were unaware of investment opportunities, big buyers were gaming the system, and, as often as not, arsonists would then burn the properties down. The result of this blind spot was lost population, lost revenue and even more blight.
Then along came Jerry Paffendorf, a San Francisco transplant, who saw what was needed. His company, Loveland Technologies, started mapping all the tax-foreclosed and auctioned properties. Impressed with Paffendorf’s zeal, the city’s Blight Task Force, established by President Barack Obama and funded by foundations and the state Housing Development Authority, hired his team to visit every property in the city. That led to MotorCityMapping.org, the first user-friendly collection of information about all the attributes of every property in Detroit — including photographs.
Paffendorf calls this map a “scan of the genome of the city.” It shows more than 84,000 blighted structures and vacant lots; in eight neighborhoods, crime, fires and other torments have led to the abandonment of more than a third of houses and businesses. To demolish all those houses, as recommended by the Blight Task Force, will cost almost $2 billion. Still more money will then be needed to repurpose the sites….”

Big Data: Google Searches Predict Unemployment in Finland


Paper by Joonas Tuhkuri: “There are over 3 billion searches globally on Google every day. This report examines whether Google search queries can be used to predict the present and the near future unemployment rate in Finland. Predicting the present and the near future is of interest, as the official records of the state of the economy are published with a delay. To assess the information contained in Google search queries, the report compares a simple predictive model of unemployment to a model that contains a variable, Google Index, formed from Google data. In addition, cross-correlation analysis and Granger-causality tests are performed. Compared to a simple benchmark, Google search queries improve the prediction of the present by 10 % measured by mean absolute error. Moreover, predictions using search terms perform 39 % better over the benchmark for near future unemployment 3 months ahead. Google search queries also tend to improve the prediction accuracy around turning points. The results suggest that Google searches contain useful information of the present and the near future unemployment rate in Finland.”
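The comparison the abstract describes (a simple benchmark model versus the same model augmented with a Google search index, scored by mean absolute error) can be sketched in a few lines. The code below uses synthetic data and ordinary least squares purely for illustration; it is not the paper's code, data, or Google Index construction.

```python
# A minimal sketch of the abstract's comparison: nowcast unemployment with an
# AR(1) benchmark vs. the same model augmented with a "Google Index" of search
# volume, scored by mean absolute error (MAE). All data here are synthetic.
import numpy as np

def fit_predict(X_train, y_train, X_test):
    # Ordinary least squares with an intercept column.
    Xtr = np.column_stack([np.ones(len(X_train)), X_train])
    Xte = np.column_stack([np.ones(len(X_test)), X_test])
    beta, *_ = np.linalg.lstsq(Xtr, y_train, rcond=None)
    return Xte @ beta

def mae(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))

# Hypothetical monthly series: unemployment rate and a search-volume index.
rng = np.random.default_rng(0)
n = 120
google = rng.normal(size=n)
unemp = np.empty(n)
unemp[0] = 8.0
for t in range(1, n):   # unemployment partly driven by contemporaneous searches
    unemp[t] = 0.8 * unemp[t - 1] + 0.5 * google[t] + rng.normal(scale=0.2) + 1.6

y = unemp[1:]                               # target: current month
lag = unemp[:-1].reshape(-1, 1)             # benchmark predictor: last month
lag_google = np.column_stack([unemp[:-1], google[1:]])  # plus the search index

split = 90
for name, X in [("AR(1) benchmark", lag), ("AR(1) + Google Index", lag_google)]:
    pred = fit_predict(X[:split], y[:split], X[split:])
    print(f"{name}: MAE = {mae(y[split:], pred):.3f}")
```

The point of the exercise mirrors the paper's: if the search index carries real signal about current labor-market conditions, the augmented model's MAE falls below the benchmark's, which is useful precisely because official statistics arrive with a lag.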

Crowd-Sourced, Gamified Solutions to Geopolitical Issues


Gamification Corp: “Daniel Green, co-founder and CTO of Wikistrat, spoke at GSummit 2014 on an intriguing topic: How Gamification Motivates All Age Groups: Or How to Get Retired Generals to Play Games Alongside Students and Interns.

Wikistrat, a crowdsourced consulting company, leverages a worldwide network of experts from various industries to solve some of the world’s geopolitical problems through the power of gamification. Wikistrat also leverages fun, training, mentorship, and networking as core concepts in their company.

Dan (@wsdan) spoke with TechnologyAdvice host Clark Buckner about Wikistrat’s work, origins, what clients can expect from working with Wikistrat, and how gamification correlates with big data and business intelligence. Listen to the podcast and read the summary below:

Wikistrat aims to solve a common problem faced by most governments and organizations when generating strategies: “groupthink.” Such entities can devise a diverse set of strategies, but they always seem to find their resolution in the most popular answer.

In order to break groupthink, Wikistrat carries out geopolitical simulations that work around “collaborative competition.” The process involves:

  • Securing analysts: Wikistrat recruits a diverse group of analysts who are experts in certain fields and located in different strategic places.

  • Competing with ideas: These analysts are placed in an online environment where, instead of competing with each other, one analyst contributes an idea, then other analysts create 2-3 more ideas based on the initial idea.

  • Breaking groupthink: Now the competition becomes only about ideas. People champion the ideas they care about rather than arguing with other analysts. That’s when Wikistrat breaks groupthink and helps their clients discover ideas they may have never considered before.

Gamification occurs when analysts create different scenarios for a specific angle or question the client raises. Plus, Wikistrat’s global analyst coverage is so good that they tout having at least one expert in every country. They accomplished this by allowing anyone—not just four-star generals—to register as an analyst. However, applicants must submit a resume and a writing sample, as well as pass a face-to-face interview….”

Can big data help build more wind and solar farms?


Rachael Post in The Guardian: “Convincing customers to switch to renewable energy is an uphill battle. But for a former political operative, finding business is as easy as mining a consumer behavior database…After his father died from cancer related to pollution from a coal-burning plant, Tom Matzzie, the former director of democratic activist group MoveOn.org, decided that he’d had enough with traditional dirty energy. But when he installed solar panels on his home, he discovered that the complicated permitting and construction process made switching to renewable energy difficult and unwieldy. The solution, he concluded, was to use his online campaigning and big data skills – honed from his years of working in politics – to find the most likely customers for renewables and convince them to switch. Ethical Electric was born.
Matzzie’s company isn’t the first to sell renewable energy, but it might be the smartest. For the most part, convincing people to switch away from dirty energy is an unprofitable and work-intensive process, requiring electrical company representatives to approach thousands of randomly chosen customers. Ethical Electric, on the other hand, uses a highly targeted, strategic method to identify its potential customers.
From finding votes to finding customers
Matzzie, who is now CEO of Ethical Electric, explained that the secret lies in his company’s use of big data, a resource that he and his partners mastered on the political front lines. In the last few presidential elections, big data fundamentally changed the way candidates – and their teams – approached voters. “We couldn’t rely on voter registration lists to make assumptions about who would be willing to vote in the next election,” Matzzie said. “What happened in politics is a real revolution in data.”…”
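The "highly targeted" approach the article describes is, in essence, propensity scoring: fit a model on past contacts to learn which consumer attributes predict switching, then rank new prospects by their predicted likelihood of saying yes. The sketch below is hypothetical throughout (invented attributes, synthetic outcomes, scikit-learn's logistic regression) and is not Ethical Electric's actual model or data.

```python
# A hypothetical sketch of propensity-based targeting of the kind the article
# describes: score households by how likely they are to switch to a
# renewable-energy plan, then contact the highest-scoring prospects first.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 5_000

# Invented consumer attributes, purely for illustration.
donated_to_env_group = rng.integers(0, 2, n)
owns_home = rng.integers(0, 2, n)
monthly_bill = rng.normal(120, 30, n)

# Synthetic past outcomes: who switched when previously contacted.
logit = (-2.5 + 1.4 * donated_to_env_group + 0.6 * owns_home
         + 0.01 * (monthly_bill - 120))
switched = rng.random(n) < 1 / (1 + np.exp(-logit))

X = np.column_stack([donated_to_env_group, owns_home, monthly_bill])
model = LogisticRegression(max_iter=1000).fit(X, switched)

# Rank prospects by predicted propensity to switch and surface the top ten.
scores = model.predict_proba(X)[:, 1]
top = np.argsort(scores)[::-1][:10]
print("Contact these prospect indices first:", top)
```

The design choice is the one the article attributes to campaign-style data work: rather than approaching thousands of randomly chosen customers, spend outreach effort only on the households the model ranks as most likely to convert.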