UK Department of Health: Citizen Space


at the UK Department of Health: “We recently ran a survey of our internal DH Citizen Space users. Citizen Space is the digital tool that DH and a number of other local and central Government Departments use to run their consultations.
Overall, our survey results were positive, with staff reporting that they had found the tool relatively easy to use and access. The survey did flag some internal issues, e.g. visibility of the tool in the Department and minor technical issues, which we’re planning to address through better promotion of Citizen Space and training, but on the whole our internal user experience seemed to be good.
However, there was one area where internal users did seem to be experiencing problems, and ironically it wasn’t with the tool itself. Many of our survey respondents seemed to be struggling with the analysis of their consultation responses, with some teams even questioning the usefulness of the data they were amassing from their digital consultations.
Some common mistakes
To help us get to the bottom of what was going on, we contacted some of our respondents and met with some consultation teams to talk about how they design, run and analyse the responses from their digital consultations. We found some common mistakes:

  • Not thinking ‘digital first’ – not designing consultations with a digital audience and digital responses in mind, e.g. writing consultations for print and then trying to shoehorn them into a digital tool
  • Not identifying what ‘real’ success means for a consultation before launching it, or not putting in place the metrics needed to measure success, e.g. not setting benchmarks, not measuring qualitative data, or not identifying key target audiences and how to reach them
  • Not thinking about the type and/or amount of data that will be returned, and not planning resources and tools accordingly, e.g. asking lots of free-text questions and then drowning in responses

As a team, we are trying to address many of these issues by improving the way the Department approaches and designs its digital consultations. The next iteration of our Digital policymaking toolkit, which will combine a new set of Policy Standards with our digital tools, techniques and advice for policymakers, should help, alongside other work our team is doing to build up digital capability in the Department and to produce analytical tools for data mining and sentiment analysis that will help teams with free-text analysis.
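
To make the free-text problem concrete, here is a minimal sketch of the kind of lightweight triage a consultation team might run over exported responses. It is an illustration only, not part of Citizen Space or the DH toolkit; the keyword lists and sample responses are invented for the example.

```python
# Minimal sketch: crude keyword-based triage of free-text consultation
# responses. Keyword lists and sample data are illustrative assumptions.

POSITIVE = {"support", "agree", "welcome", "helpful", "clear"}
NEGATIVE = {"oppose", "concern", "unclear", "confusing", "object"}

def score_response(text: str) -> int:
    """Return +1 per positive keyword and -1 per negative keyword."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

responses = [
    "We welcome and support this proposal",
    "The wording is unclear and we have a concern about cost",
]

for r in responses:
    print(score_response(r), r)  # prints 2 and -2 for the samples above
```

Even a rough score like this lets a team sort thousands of responses by likely sentiment before reading them in detail, which is exactly the tooling gap the survey surfaced.
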
…how to use or build a consultation in Citizen Space, you can find one of those in the Citizen Space User Guide and further guides and user forums on the Citizen Space Knowledge Base website.

Twenty-one European Cities Advance in Bloomberg Philanthropies' Mayors Challenge Competition to Create Innovative Solutions to Urban Challenges


Press Release: “Bloomberg Philanthropies today revealed the 21 European cities that have emerged as final contenders in its 2013-2014 Mayors Challenge, a competition to inspire cities to generate innovative ideas that solve major challenges and improve city life, and that ultimately can spread to other cities. One grand prize winner will receive €5 million for the most creative and transferable idea. Four additional cities will be awarded €1 million each, and all will be announced in the fall. The finalists’ proposed solutions address some of Europe’s most critical issue areas: youth unemployment, aging populations, civic engagement, economic development, environment and energy concerns, public health and safety, and making government more efficient…
James Anderson, the head of government innovation for Bloomberg Philanthropies, said: “While the ideas are very diverse, we identified key themes. The ideas tended toward networked, distributed solutions as opposed to costly centralized ones. There was a lot of interest in citizen engagement as both a means and end. Technology that concretely and positively affects the lives of individual citizens – from the blind person in Warsaw to the unemployed youth in Amsterdam to the homeowner in Schaerbeek — also played a significant role.”
Bloomberg Philanthropies staff and an independent selection committee of 12 members from across Europe closely considered each application over multiple rounds of review, culminating in feedback and selection earlier this month and resulting in 21 cities’ ideas moving forward for further development. The submissions will be judged on four criteria: vision, potential for impact, implementation plan, and potential to spread to other cities. The finalists and their ideas are:

  1. AMSTERDAM, Netherlands – Youth Unemployment: Tackling widespread youth unemployment by equipping young people with 21st century skills and connecting them with jobs and apprenticeships across Europe through an online game
  2. ATHENS, Greece – Civic Engagement: Empowering citizens with a new online platform to address the large number of small-scale urban challenges accelerated by the Greek economic crisis
  3. BARCELONA, Spain – Aging: Improving quality of life and limiting social isolation by establishing a network of public and private support – including family, friends, social workers, and volunteers – for each elderly citizen
  4. BOLOGNA, Italy – Youth Unemployment: Building an urban scale model of informal education labs and civic engagement to prevent youth unemployment by teaching children aged 6-16 entrepreneurship and 21st century skills
  5. BRISTOL, United Kingdom – Health/Anti-obesity: Tackling obesity and unemployment by creating a new economic system that increases access to locally grown, healthy foods
  6. BRNO, Czech Republic – Public Safety/Civic Engagement: Engaging citizens in keeping their own communities safe to build social cohesion and reduce crime
  7. CARDIFF, United Kingdom – Economic Development: Increasing productivity little by little in residents’ personal and professional lives, so that a series of small improvements add up to a much more productive city
  8. FLORENCE, Italy – Economic Development: Combatting unemployment with a new economic development model that combines technology and social innovation, targeting the city’s historic artisan and maker community
  9. GDAŃSK, Poland – Civic Engagement: Re-instilling faith in local democracy by mandating that city government formally debate local issues put forward by citizens
  10. KIRKLEES, United Kingdom – Social Capital: Pooling the city and community’s idle assets – from vehicles to unused spaces to citizens’ untapped time and expertise – to help the area make the most of what it has and do more with less
  11. KRAKOW, Poland – Transportation: Implementing smart, personalized transportation incentives and a seamless and unified public transit payment system to convince residents to opt for greener modes of transportation
  12. LISBON, Portugal – Energy: Transforming wasted kinetic energy generated by the city’s commuting traffic into electricity, reducing the carbon footprint and increasing environmental sustainability
  13. LONDON, United Kingdom – Public Health: Empowering citizens to monitor and improve their own health through a coordinated, multi-stakeholder platform and new technologies that dramatically improve quality of life and reduce health care costs
  14. MADRID, Spain – Energy: Diversifying its renewable energy options by finding and funding the best ways to harvest underground power, such as wasted heat generated by the city’s below-ground infrastructure
  15. SCHAERBEEK, Belgium – Energy: Using proven flyover and 3D geothermal mapping technology to provide each homeowner and tenant with a personalized energy audit and incentives to invest in energy-saving strategies
  16. SOFIA, Bulgaria – Civic Engagement: Transforming public spaces by deploying mobile art units to work side-by-side with local residents, re-envisioning and rejuvenating underused spaces and increasing civic engagement
  17. STARA ZAGORA, Bulgaria – Economic Development: Reversing the brain-drain of the city’s best and brightest by helping young entrepreneurs turn promising ideas into local high-tech businesses
  18. STOCKHOLM, Sweden – Environment: Combatting climate change by engaging citizens to produce biochar, an organic material that increases tree growth, sequesters carbon, and purifies storm runoff
  19. THE HAGUE, Netherlands – Civic Engagement: Enabling citizens to allocate a portion of their own tax money to support the local projects they most believe in
  20. WARSAW, Poland – Transportation/Accessibility: Enabling the blind and visually impaired to navigate the city as easily as their sighted peers by providing high-tech auditory alerts which will save them travel time and increase their independence
  21. YORK, United Kingdom – Government Systems: Revolutionizing the way citizens, businesses, and others can propose new ideas to solve top city problems, providing a more intelligent way to acquire or develop the best solutions, thus enabling greater civic participation and saving the city both time and money

Further detail and related elements for this year’s Mayors Challenge can be found via: http://mayorschallenge.bloomberg.org/”

Crowdsourcing medical expertise in near real time


Paper by Max H. Sims et al. in Journal of Hospital Medicine: “Given the pace of discovery in medicine, accessing the literature to make informed decisions at the point of care has become increasingly difficult. Although the Internet creates unprecedented access to information, gaps in the medical literature and inefficient searches often leave healthcare providers’ questions unanswered. Advances in social computation and human-computer interaction offer a potential solution to this problem. We developed and piloted the mobile application DocCHIRP, which uses a system of point-to-multipoint push notifications designed to help providers problem solve by crowdsourcing from their peers. Over the 244-day pilot period, 85 registered users logged 1544 page views and sent 45 consult questions. The median first response from the crowd occurred within 19 minutes. Review of the transcripts revealed several dominant themes, including complex medical decision making and inquiries related to prescription medication use. Feedback from the post-trial survey identified potential hurdles related to medical crowdsourcing, including a reluctance to expose personal knowledge gaps and the potential risk for “distracted doctoring.” Users also suggested program modifications that could support future adoption, including changes to the mobile interface and mechanisms that could expand the crowd of participating healthcare providers.”
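
The paper describes the architecture only at a high level, so the sketch below merely illustrates the point-to-multipoint pattern it names: one provider’s question fanned out to every other registered user. All class and function names here are hypothetical, not DocCHIRP’s actual code.

```python
# Illustrative fan-out of one consult question to all registered peers
# (point-to-multipoint). Names are hypothetical, not the DocCHIRP codebase.

from dataclasses import dataclass, field

@dataclass
class Provider:
    name: str
    inbox: list = field(default_factory=list)  # received (sender, question) pairs

@dataclass
class CrowdConsultService:
    registered: list

    def push_question(self, sender: Provider, question: str) -> None:
        """Push one question to every registered provider except the sender."""
        for peer in self.registered:
            if peer is not sender:
                peer.inbox.append((sender.name, question))

alice, bob = Provider("alice"), Provider("bob")
service = CrowdConsultService(registered=[alice, bob, Provider("carol")])
service.push_question(alice, "Dosing adjustment for renal impairment?")
print(bob.inbox)  # [('alice', 'Dosing adjustment for renal impairment?')]
```
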

Historic release of data delivers unprecedented transparency on the medical services physicians provide and how much they are paid


Jonathan Blum, Principal Deputy Administrator, Centers for Medicare & Medicaid Services: “Today the Centers for Medicare & Medicaid Services (CMS) took a major step forward in making Medicare data more transparent and accessible, while maintaining the privacy of beneficiaries, by announcing the release of new data on medical services and procedures furnished to Medicare fee-for-service beneficiaries by physicians and other healthcare professionals (http://www.cms.gov/newsroom/newsroom-center.html). For too long, the only information on physicians readily available to consumers was physician name, address and phone number. This data will, for the first time, provide a better picture of how physicians practice in the Medicare program.
This new data set includes over nine million rows of data on more than 880,000 physicians and other healthcare professionals in all 50 states, DC and Puerto Rico providing care to Medicare beneficiaries in 2012. The data set presents key information on the provision of services by physicians and how much they are paid for those services, and is organized by provider (National Provider Identifier, or NPI), type of service (Healthcare Common Procedure Coding System, or HCPCS, code), and whether the service was performed in a facility or office setting. This public data set includes the number of services, average submitted charges, average allowed amount, average Medicare payment, and a count of unique beneficiaries treated. CMS takes beneficiary privacy very seriously and we will protect patient-identifiable information by redacting any data in cases where it includes fewer than 11 beneficiaries.
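
As a rough illustration of how a researcher might use the file CMS describes, here is a pandas sketch. The column names and filename are stand-ins for the fields listed above (NPI, HCPCS code, service counts, average payments, unique-beneficiary counts); the real layout should be taken from the CMS file documentation.

```python
# Sketch of analyzing the public data set described above. Column names and
# the filename are illustrative assumptions, not the official CMS layout.

import pandas as pd

df = pd.read_csv("medicare_physician_supplier_2012.csv")  # hypothetical filename

# CMS redacts rows aggregated over fewer than 11 beneficiaries; this filter
# simply restates that privacy rule explicitly.
df = df[df["unique_beneficiaries"] >= 11]

# Approximate total Medicare payments per provider: average payment per
# service times the number of services, summed across each provider's codes.
df["total_payment"] = df["average_medicare_payment"] * df["service_count"]
print(df.groupby("npi")["total_payment"].sum().nlargest(10))
```
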
Previously, CMS could not release this information due to a permanent injunction issued by a court in 1979. However, in May 2013, the court vacated this injunction, causing a series of events that has led CMS to be able to make this information available for the first time.
Data to Fuel Research and Innovation
In addition to the public data release, CMS is making slight modifications to the process to request CMS data for research purposes. This will allow researchers to conduct important research at the physician level. As with the public release of information described above, CMS will continue to prohibit the release of patient-identifiable information. For more information about CMS’s disclosures to researchers, please contact the Research Data Assistance Center (ResDAC) at http://www.resdac.org/.
Unprecedented Data Access
This data release follows other CMS efforts to make more data available to the public. Since 2010, the agency has released an unprecedented amount of aggregated data in machine-readable form, with much of it available at http://www.healthdata.gov. These data range from previously unpublished statistics on Medicare spending, utilization, and quality at the state, hospital referral region, and county level, to detailed information on the quality performance of hospitals, nursing homes, and other providers.
In May 2013, CMS released information on the average charges for the 100 most common inpatient services at more than 3,000 hospitals nationwide http://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/Medicare-Provider-Charge-Data/Inpatient.html.
In June 2013, CMS released average charges for 30 selected outpatient procedures http://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/Medicare-Provider-Charge-Data/Outpatient.html.
We will continue to work toward harnessing the power of data to promote quality and value, and improve the health of our seniors and persons with disabilities.”

PatientsLikeMe Gives Genentech Full Access


Susan Young Rojahn in MIT Technology Review: “PatientsLikeMe, the largest online network for patients, has established its first broad partnership with a drug company. Genentech, the South San Francisco biotechnology company bought by Roche in 2009, now has access to PatientsLikeMe’s full database for five years.
PatientsLikeMe is an online network of some 250,000 people with chronic diseases who share information about symptoms, treatments, and coping mechanisms. The largest communities within the network are built around fibromyalgia, multiple sclerosis, and amyotrophic lateral sclerosis (ALS), but as many as 2,000 conditions are represented in the system. The hope is that the information shared by people with chronic disease will help the life sciences industry identify unmet needs in patients and generate medical evidence, says co-founder Ben Heywood.
The agreement with Genentech is not the first collaboration between a life sciences company and PatientsLikeMe, named one of 50 Disruptive Companies in 2012 by MIT Technology Review, but it is the broadest. Previous collaborations were more limited in scope, says Heywood, focusing on a particular research question or a specific disease area. The deal with Genentech is an all-encompassing subscription to information posted by the entire PatientsLikeMe population, without the need for new contracts and new business deals if a research program shifts direction from its original focus. “This allows for a much more rapid real-time use of the data,” says Heywood.
In 2010, PatientsLikeMe demonstrated some of its potential to advance medicine. With data from its community of ALS patients, who suffer from a progressive and fatal neurological disease, the company could see that a drug under study was not effective (see “Patients’ Social Network Predicts Drug Outcomes”). Those findings were corroborated by an academic study published that year. Another area of medicine the network can shed light on is the quality of care patients receive, including whether or not doctors are following guidelines established by medical societies for how patients are treated. “As we try to shift to patient-centered health care, we have to understand what [patients] value,” says Heywood.
In exchange for an undisclosed payment to PatientsLikeMe, Genentech has a five-year subscription to the data in the online network. The data will be de-identified; that is, Genentech will not see patient names or email addresses. Heywood says his company is hoping to establish broad agreements with other life sciences companies soon.”

Medicare to Publish Trove of Data on Doctors


Louise Radnofsky in the Wall Street Journal: “The Obama administration said it would publish as early as next week data on what Medicare paid individual doctors in 2012, aiming to boost transparency and help root out fraud.
The move, which faced fierce resistance from doctors’ groups, would end a decades-long block on making the information public.
Federal officials said they planned to release, on April 9 or soon after, reimbursement information that would show billing data for 880,000 health-care providers treating patients in the government-run insurance program for elderly and disabled people. It will include how many times the providers carried out a particular service or procedure, whether they carried it out in a medical facility or an office setting, the average amount they charged Medicare for it, the average amount they were paid for it, and the total number of people they treated.
The data set would show the names and addresses of the providers in connection with their reimbursement information, officials at the Centers for Medicare and Medicaid Services said. The agency hasn’t previously released such data.
Physicians’ organizations had sought to prevent the release of the data, citing concerns about physician privacy. But a federal judge last year lifted a long-standing injunction placed on the publication of the information by a federal court in Florida, in response to a challenge from Dow Jones & Co., The Wall Street Journal’s parent company.
Jonathan Blum, principal deputy administrator at CMS, informed the American Medical Association and Florida Medical Association in letters dated Wednesday that the agency would move to publish the data soon.
Ardis Dee Hoven, president of the American Medical Association, said the group remained concerned that CMS was taking a “broad approach” that could result in “unwarranted bias against physicians that can destroy careers.” Dr. Hoven said the AMA wanted doctors to be able to review and correct their information before the data set was published. The Florida Medical Association couldn’t immediately be reached.
Mr. Blum said that for privacy reasons, data related to subsets of fewer than 11 Medicare patients would be redacted.
In the letters, Mr. Blum said the agency believed that news organizations seeking the information—which include the Journal—would be able to use it to shed light on problems in the Medicare program. He also specifically cited earlier reporting by the Journal that had drawn on similar data.
“The Department concluded that the data to be released would assist the public’s understanding of Medicare fraud, waste, and abuse, as well as shed light on payments to physicians for services furnished to Medicare beneficiaries,” Mr. Blum wrote. “As an example, using similar payment information, The Wall Street Journal was able to identify and report on a number of instances of Medicare fraud, waste, and abuse, using Medicare payment data in its Secrets of the System series,” Mr. Blum wrote. That series was a finalist for a Pulitzer Prize in 2011.”

'Hackathons' Aim to Solve Health Care's Ills


Amy Dockser Marcus in the Wall Street Journal: “Hackathons, the high-octane, all-night problem-solving sessions popularized by the software-coding community, are making their way into the more traditional world of health care. At Massachusetts Institute of Technology, a recent event called Hacking Medicine’s Grand Hackfest attracted more than 450 people to work for one weekend on possible solutions to problems involving diabetes, rare diseases, global health and information technology used at hospitals.
Health institutions such as New York-Presbyterian Hospital and Brigham and Women’s Hospital in Boston have held hackathons. MIT, meantime, has co-sponsored health hackathons in India, Spain and Uganda.
Hackathons of all kinds are increasingly popular. Intel Corp. recently bought a group that organizes them. Companies hoping to spark creative thinking sponsor them. And student-run hackathons have turned into intercollegiate competitions.
But in health care, where change typically comes much more slowly than in Silicon Valley, they represent a cultural shift. To solve a problem, scientists and doctors can spend years painstakingly running experiments, gathering data, applying for grants and publishing results. So the idea of an event where people give two-minute pitches describing a problem, then join a team of strangers to come up with a solution in the course of one weekend is radical.
“We are not trying to replace the medical culture with Facebook culture,” said Elliot Cohen, who wore a hoodie over a button-down dress shirt at the MIT event in March and helped start MIT Hacking Medicine while at business school. “But we want to try to blend them more.”
Mr. Cohen co-founded and is chief technology officer at PillPack, a pharmacy that sends customers personalized packages of their medications, a company that started at a hackathon.
At MIT’s health-hack, physicians, researchers, students and a smattering of people wearing Google Glass sprawled on the floor of MIT’s Media Lab and at tables with a view of the Boston skyline. At one table, a group of college students, laptops plastered with stickers, pulled juice boxes and snacks out of backpacks, trash piling up next to them as they feverishly wrote code.
Nupur Garg, an emergency-room physician and one of the eventual winners, finished her hospital shift at 2 a.m. Saturday in New York, drove to Boston and arrived at MIT in time to pitch the need for a way to capture images of patients’ ears and throats that can be shared with specialists to help make diagnoses. She and her team immediately started working on a prototype for the device, testing early versions on anyone who stopped by their table.
Dr. Garg and teammate Nancy Liang, who runs a company that makes Web apps for 3-D printers, caught a few hours of sleep in a dorm room Saturday night. They came up with the idea for their product’s name—MedSnap—later that night while watching students use cellphone cameras to send Snapchats to one another. “There was no time to conduct surveys on what was the best name,” said Ms. Liang. “Many ideas happen after midnight.”
Winning teams in each category won $1,000, as well as access to the hackathon’s sponsors for advice and pilot projects.
Yet even supporters say hackathons can’t solve medicine’s challenges overnight. Harlan Krumholz, a professor at Yale School of Medicine who ran a months-long trial that found telemonitoring didn’t reduce hospitalizations or deaths of cardiology patients, said he supports the problem-solving ethos of hackathons. But he added that “improvements require a long-term commitment, not just a weekend.”
Ned McCague, a data scientist at Blue Cross Blue Shield of Massachusetts, served as a mentor at the hackathon. He said he wasn’t representing his employer, but he used his professional experiences to push groups to think about the potential customer. “They have a good idea and are excited about it, but they haven’t thought about who is paying for it,” he said.
Zen Chu, a senior lecturer in health-care innovation and entrepreneur-in-residence at MIT, and one of the founders of Hacking Medicine, said more than a dozen startups conceived since the first hackathon, in 2011, are still in operation. Some received venture-capital funding.
The upsides of hackathons were made clear to Sharon Moalem, a physician who studies rare diseases. He had spent years developing a mobile app that can take pictures of faces to help diagnose rare genetic conditions, but was stumped on how to give the images a standard scale so they could be compared. At the hackathon, Dr. Moalem said he was approached by an MIT student who suggested sticking a coin on the subject’s forehead. Since quarters have a standard size, it “creates a scale,” said Dr. Moalem.
Dr. Moalem said he had never considered such a simple, elegant solution. The team went on to write code to help standardize facial measurements based on the dimensions of a coin and a credit card.
“Sometimes when you are too close to something, you stop seeing solutions, you only see problems,” Dr. Moalem said. “I needed to step outside my own silo.”
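
The arithmetic behind the coin trick is simple enough to sketch: because a US quarter has a known diameter of 24.26 mm, its apparent size in pixels yields a conversion factor for every other measurement in the same photo. The pixel values below are invented for illustration.

```python
# Coin-based scale calibration: known physical size / measured pixel size
# gives millimetres per pixel for the whole image. Pixel values are examples.

QUARTER_DIAMETER_MM = 24.26  # known diameter of a US quarter

def mm_per_pixel(coin_diameter_px: float) -> float:
    """Scale factor for a photo containing a quarter of known size."""
    return QUARTER_DIAMETER_MM / coin_diameter_px

scale = mm_per_pixel(coin_diameter_px=120.0)  # quarter spans 120 px (example)
eye_distance_px = 310.0                       # a facial measurement in pixels
print(round(eye_distance_px * scale, 1), "mm")  # ~62.7 mm
```
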

Let’s get geeks into government


Gillian Tett in the Financial Times: “Fifteen years ago, Brett Goldstein seemed to be just another tech entrepreneur. He was working as IT director of OpenTable, then a start-up website for restaurant bookings. The company was thriving – and subsequently did a very successful initial public offering. Life looked very sweet for Goldstein. But when the World Trade Center was attacked in 2001, Goldstein had a moment of epiphany. “I spent seven years working in a startup but, directly after 9/11, I knew I didn’t want my whole story to be about how I helped people make restaurant reservations. I wanted to work in public service, to give something back,” he recalls – not just by throwing cash into a charity tin, but by doing public service. So he swerved: in 2006, he attended the Chicago police academy and then worked for a year as a cop in one of the city’s toughest neighbourhoods. Later he pulled the disparate parts of his life together and used his number-crunching skills to build the first predictive data system for the Chicago police (and one of the first in any western police force), to indicate where crime was likely to break out.

This was such a success that Goldstein was asked by Rahm Emanuel, the city’s mayor, to create predictive data systems for the wider Chicago government. The fruits of this effort – which include a website known as “WindyGrid” – went live a couple of years ago, to considerable acclaim inside the techie scene.

This tale might seem unremarkable. We are all used to hearing politicians, business leaders and management consultants declare that the computing revolution is transforming our lives. And as my colleague Tim Harford pointed out in these pages last week, the idea of using big data is now wildly fashionable in the business and academic worlds….

In America when top bankers become rich, they often want to “give back” by having a second career in public service: just think of all those Wall Street financiers who have popped up at the US Treasury in recent years. But hoodie-wearing geeks do not usually do the same. Sure, there are some former techie business leaders who are indirectly helping government. Steve Case, a co-founder of AOL, has supported White House projects to boost entrepreneurship and combat joblessness. Tech entrepreneurs also make huge donations to philanthropy. Facebook’s Mark Zuckerberg, for example, has given funds to Newark education. And the whizz-kids have also occasionally been summoned by the White House in times of crisis. When there was a disastrous launch of the government’s healthcare website late last year, the Obama administration enlisted the help of some of the techies who had been involved with the president’s election campaign.

But what you do not see is many tech entrepreneurs doing what Goldstein did: deciding to spend a few years in public service, as a government employee. There aren’t many Zuckerberg types striding along the corridors of federal or local government.
. . .
It is not difficult to work out why. To most young entrepreneurs, the idea of working in a state bureaucracy sounds like utter hell. But if there was ever a time when it might make sense for more techies to give back by doing stints of public service, that moment is now. The civilian public sector badly needs savvier tech skills (just look at the disaster of that healthcare website for evidence of this). And as the sector’s founders become wealthier and more powerful, they need to show that they remain connected to society as a whole. It would be smart political sense.
So I applaud what Goldstein has done. I also welcome that he is now trying to persuade his peers to do the same, and that places such as the University of Chicago (where he teaches) and New York University are trying to get more young techies to think about working for government in between doing those dazzling IPOs. “It is important to see more tech entrepreneurs in public service. I am always encouraging people I know to do a ‘stint in government’. I tell them that giving back cannot just be about giving money; we need people from the tech world to actually work in government,” Goldstein says.

But what is really needed is for more technology CEOs and leaders to get involved by actively talking about the value of public service – or even encouraging their employees to interrupt their private-sector careers with the occasional spell as a government employee (even if it is not in a sector quite as challenging as the police). Who knows? Maybe it could be Sheryl Sandberg’s next big campaigning mission. After all, if she does ever jump back to Washington, that could have a powerful demonstration effect for techie women and men. And shake DC a little too.”

Eight (No, Nine!) Problems With Big Data


Gary Marcus and Ernest Davis in the New York Times: “BIG data is suddenly everywhere. Everyone seems to be collecting it, analyzing it, making money from it and celebrating (or fearing) its powers. Whether we’re talking about analyzing zillions of Google search queries to predict flu outbreaks, or zillions of phone records to detect signs of terrorist activity, or zillions of airline stats to find the best time to buy plane tickets, big data is on the case. By combining the power of modern computing with the plentiful data of the digital era, it promises to solve virtually any problem — crime, public health, the evolution of grammar, the perils of dating — just by crunching the numbers.

Or so its champions allege. “In the next two decades,” the journalist Patrick Tucker writes in the latest big data manifesto, “The Naked Future,” “we will be able to predict huge areas of the future with far greater accuracy than ever before in human history, including events long thought to be beyond the realm of human inference.” Statistical correlations have never sounded so good.

Is big data really all it’s cracked up to be? There is no doubt that big data is a valuable tool that has already had a critical impact in certain areas. For instance, almost every successful artificial intelligence computer program in the last 20 years, from Google’s search engine to the I.B.M. “Jeopardy!” champion Watson, has involved the substantial crunching of large bodies of data. But precisely because of its newfound popularity and growing use, we need to be levelheaded about what big data can — and can’t — do.

The first thing to note is that although big data is very good at detecting correlations, especially subtle correlations that an analysis of smaller data sets might miss, it never tells us which correlations are meaningful. A big data analysis might reveal, for instance, that from 2006 to 2011 the United States murder rate was well correlated with the market share of Internet Explorer: Both went down sharply. But it’s hard to imagine there is any causal relationship between the two. Likewise, from 1998 to 2007 the number of new cases of autism diagnosed was extremely well correlated with sales of organic food (both went up sharply), but identifying the correlation won’t by itself tell us whether diet has anything to do with autism.
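
The point is easy to reproduce: any two series that merely trend in the same direction will show a large Pearson correlation, whatever their causal relationship. The numbers below are invented to mimic the two declining series in the example.

```python
# Two unrelated but co-trending series yield a near-perfect correlation.
# The values are made up for illustration; only the shared trend matters.

from statistics import correlation  # Python 3.10+

murder_rate = [5.8, 5.7, 5.4, 5.0, 4.8, 4.7]  # declining, 2006-2011 (illustrative)
ie_market_share = [80, 75, 68, 60, 52, 45]    # also declining (illustrative)

print(correlation(murder_rate, ie_market_share))  # close to 1.0, no causal link
```
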

Second, big data can work well as an adjunct to scientific inquiry but rarely succeeds as a wholesale replacement. Molecular biologists, for example, would very much like to be able to infer the three-dimensional structure of proteins from their underlying DNA sequence, and scientists working on the problem use big data as one tool among many. But no scientist thinks you can solve this problem by crunching data alone, no matter how powerful the statistical analysis; you will always need to start with an analysis that relies on an understanding of physics and biochemistry.

Third, many tools that are based on big data can be easily gamed. For example, big data programs for grading student essays often rely on measures like sentence length and word sophistication, which are found to correlate well with the scores given by human graders. But once students figure out how such a program works, they start writing long sentences and using obscure words, rather than learning how to actually formulate and write clear, coherent text. Even Google’s celebrated search engine, rightly seen as a big data success story, is not immune to “Google bombing” and “spamdexing,” wily techniques for artificially elevating website search placement.
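
The gaming problem follows directly from how such graders work. Here is a sketch of the two surface features the article names, with average word length standing in for “word sophistication”; the logic mimics the general approach, not any particular grading product.

```python
# Crude surface features of the kind automated essay graders reportedly use.
# Average word length is a stand-in for "word sophistication"; this mimics
# the general approach, not any specific product.

def surface_features(essay: str) -> dict:
    sentences = [s for s in essay.replace("!", ".").replace("?", ".").split(".") if s.strip()]
    words = essay.split()
    return {
        "avg_sentence_length": len(words) / len(sentences),
        "avg_word_length": sum(len(w) for w in words) / len(words),
    }

print(surface_features("Short clear claim. Another short one."))
print(surface_features("Grandiloquent perambulations notwithstanding, obfuscation proliferates interminably."))
```

The second “essay” says nothing, yet scores higher on both features, which is precisely the gaming the authors describe.
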

Fourth, even when the results of a big data analysis aren’t intentionally gamed, they often turn out to be less robust than they initially seem. Consider Google Flu Trends, once the poster child for big data. In 2009, Google reported — to considerable fanfare — that by analyzing flu-related search queries, it had been able to detect the spread of the flu as accurately as, and more quickly than, the Centers for Disease Control and Prevention. A few years later, though, Google Flu Trends began to falter; for the last two years it has made more bad predictions than good ones.

As a recent article in the journal Science explained, one major contributing cause of the failures of Google Flu Trends may have been that the Google search engine itself constantly changes, such that patterns in data collected at one time do not necessarily apply to data collected at another time. As the statistician Kaiser Fung has noted, collections of big data that rely on web hits often merge data that was collected in different ways and with different purposes — sometimes to ill effect. It can be risky to draw conclusions from data sets of this kind.

A fifth concern might be called the echo-chamber effect, which also stems from the fact that much of big data comes from the web. Whenever the source of information for a big data analysis is itself a product of big data, opportunities for vicious cycles abound. Consider translation programs like Google Translate, which draw on many pairs of parallel texts from different languages — for example, the same Wikipedia entry in two different languages — to discern the patterns of translation between those languages. This is a perfectly reasonable strategy, except for the fact that with some of the less common languages, many of the Wikipedia articles themselves may have been written using Google Translate. In those cases, any initial errors in Google Translate infect Wikipedia, which is fed back into Google Translate, reinforcing the error.

A sixth worry is the risk of too many correlations. If you look 100 times for correlations between two variables, you risk finding, purely by chance, about five bogus correlations that appear statistically significant — even though there is no actual meaningful connection between the variables. Absent careful supervision, the sheer scale of big data can greatly amplify such errors.
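
This is the classic multiple-comparisons problem, and it is easy to simulate: test 100 pairs of variables that are independent by construction at the 5% level, and roughly five will look “significant” anyway.

```python
# Simulating the "too many correlations" risk: 100 significance tests on
# independent random variables produce ~5 spurious "significant" results.

import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(42)
false_positives = 0
for _ in range(100):
    x = rng.normal(size=30)
    y = rng.normal(size=30)  # independent of x by construction
    r, p = pearsonr(x, y)
    if p < 0.05:
        false_positives += 1
print(false_positives)  # about 5, in expectation
```
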

Seventh, big data is prone to giving scientific-sounding solutions to hopelessly imprecise questions. In the past few months, for instance, there have been two separate attempts to rank people in terms of their “historical importance” or “cultural contributions,” based on data drawn from Wikipedia. One is the book “Who’s Bigger? Where Historical Figures Really Rank,” by the computer scientist Steven Skiena and the engineer Charles Ward. The other is an M.I.T. Media Lab project called Pantheon.

Both efforts get many things right — Jesus, Lincoln and Shakespeare were surely important people — but both also make some egregious errors. “Who’s Bigger?” claims that Francis Scott Key was the 19th most important poet in history; Pantheon has claimed that Nostradamus was the 20th most important writer in history, well ahead of Jane Austen (78th) and George Eliot (380th). Worse, both projects suggest a misleading degree of scientific precision with evaluations that are inherently vague, or even meaningless. Big data can reduce anything to a single number, but you shouldn’t be fooled by the appearance of exactitude.

FINALLY, big data is at its best when analyzing things that are extremely common, but often falls short when analyzing things that are less common. For instance, programs that use big data to deal with text, such as search engines and translation programs, often rely heavily on something called trigrams: sequences of three words in a row (like “in a row”). Reliable statistical information can be compiled about common trigrams, precisely because they appear frequently. But no existing body of data will ever be large enough to include all the trigrams that people might use, because of the continuing inventiveness of language.
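
For readers unfamiliar with the term, extracting trigrams is mechanical; a minimal sketch:

```python
# Extract trigrams (three-word sequences) from text, the unit the article
# says search engines and translation programs lean on.

def trigrams(text: str) -> list[tuple[str, ...]]:
    words = text.lower().split()
    return [tuple(words[i:i + 3]) for i in range(len(words) - 2)]

print(trigrams("sequences of three words in a row"))
# [('sequences', 'of', 'three'), ('of', 'three', 'words'),
#  ('three', 'words', 'in'), ('words', 'in', 'a'), ('in', 'a', 'row')]
```
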

To select an example more or less at random, a book review that the actor Rob Lowe recently wrote for this newspaper contained nine trigrams such as “dumbed-down escapist fare” that had never before appeared anywhere in all the petabytes of text indexed by Google. To witness the limitations that big data can have with novelty, Google-translate “dumbed-down escapist fare” into German and then back into English: out comes the incoherent “scaled-flight fare.” That is a long way from what Mr. Lowe intended — and from big data’s aspirations for translation.

Wait, we almost forgot one last problem: the hype….

Smart cities are here today — and getting smarter


Computer World: “Smart cities aren’t a science-fiction, far-off-in-the-future concept. They’re here today, with municipal governments already using technologies that include wireless networks, big data/analytics, mobile applications, Web portals, social media, sensors/tracking products and other tools.
These smart city efforts have lofty goals: Enhancing the quality of life for citizens, improving government processes and reducing energy consumption, among others. Indeed, cities are already seeing some tangible benefits.
But creating a smart city comes with daunting challenges, including the need to provide effective data security and privacy, and to ensure that myriad departments work in harmony.

The global urban population is expected to grow approximately 1.5% per year between 2025 and 2030, mostly in developing countries, according to the World Health Organization.

What makes a city smart? As with any buzz term, the definition varies. But in general, it refers to using information and communications technologies to deliver sustainable economic development and a higher quality of life, while engaging citizens and effectively managing natural resources.
Making cities smarter will become increasingly important. For the first time ever, the majority of the world’s population resides in a city, and this proportion continues to grow, according to the World Health Organization, the coordinating authority for health within the United Nations.
A hundred years ago, two out of every 10 people lived in an urban area, the organization says. As recently as 1990, less than 40% of the global population lived in a city — but by 2010 more than half of all people lived in an urban area. By 2050, the proportion of city dwellers is expected to rise to 70%.
As many city populations continue to grow, here’s what five U.S. cities are doing to help manage it all:

Scottsdale, Ariz.

The city of Scottsdale, Ariz., has several initiatives underway.
One is MyScottsdale, a mobile application the city deployed in the summer of 2013 that allows citizens to report cracked sidewalks, broken street lights and traffic lights, road and sewer issues, graffiti and other problems in the community….”