Needed: A New Generation of Game Changers to Solve Public Problems


Beth Noveck: “In order to change the way we govern, it is important to train and nurture a new generation of problem solvers who possess the multidisciplinary skills to become effective agents of change. That’s why we at the GovLab have launched The GovLab Academy with the support of the Knight Foundation.
In an effort to help people in their own communities become more effective at developing and implementing creative solutions to compelling challenges, The Gov Lab Academy is offering two new training programs:
1) An online platform with an unbundled and evolving set of topics, modules and instructors on innovations in governance, including themes such as big and open data and crowdsourcing and forthcoming topics on behavioral economics, prizes and challenges, open contracting and performance management for governance;
2) Gov 3.0: A curated and sequenced, 14-week mentoring and training program.
While the online-platform is always freely available, Gov 3.0 begins on January 29, 2014 and we invite you to to participate. Please forward this email to your networks and help us spread the word about the opportunity to participate.
Please consider applying (individuals or teams may apply), if you are:

  • an expert in communications, public policy, law, computer science, engineering, business or design who wants to expand your ability to bring about social change;

  • a public servant who wants to bring innovation to your job;

  • someone with an important idea for positive change but who lacks key skills or resources to realize the vision;

  • interested in joining a network of like-minded, purpose-driven individuals across the country; or

  • someone who is passionate about using technology to solve public problems.

The program includes live instruction and conversation every Wednesday from 5:00– 6:30 PM EST for 14 weeks starting Jan 29, 2014. You will be able to participate remotely via Google Hangout.

Gov 3.0 will allow you to apply evolving technology to the design and implementation of effective solutions to public interest challenges. It will give you an overview of the most current approaches to smarter governance and help you improve your skills in collaboration, communication, and developing and presenting innovative ideas.

Over 14 weeks, you will develop a project and a plan for its implementation, including a long and short description, a presentation deck, a persuasive video and a project blog. Last term’s projects covered such diverse issues as post-Fukushima food safety, science literacy for high schoolers and prison reform for the elderly. In every case, the goal was to identify realistic strategies for making a difference quickly.  You can read the entire Gov 3.0 syllabus here.

The program will include national experts and instructors in technology and governance both as guests and as mentors to help you design your project. Last term’s mentors included current and former officials from the White House and various state, local and international governments, academics from a variety of fields, and prominent philanthropists.

People who complete the program will have the opportunity to apply for a special fellowship to pursue their projects further.

Previously taught only on campus, we are offering Gov 3.0 in beta as an online program. This is not a MOOC. It is a mentoring-intensive coaching experience. To maximize the quality of the experience, enrollment is limited.

Please submit your application by January 22, 2014. Accepted applicants (individuals and teams) will be notified on January 24, 2014. We hope to expand the program in the future so please use the same form to let us know if you would like to be kept informed about future opportunities.”

The Eight Key Issues of Digital Government


Andrea Di Maio (Gartner): “…Now, to set the record straight, I do believe digital government is profoundly different from e-government as well as from government 2.0 (although in some jurisdictions the latter terms still looks more relevant than “digital”). Whereas there are many differences as far as technologies and what they make possible,political will, and evolving citizen demand, my contention is that the single most fundamental difference is in the relevance of data and how new and unforeseen uses of data can truly transform the way governments deliver their services and perform their operations.
This is not at all just about government as a platform or open government, where government is primarily a provider of data that constituents – be they citizens, business or intermediaries – use and mash up in new ways. It is also about government themselves inventing new ways to user their own as well as constituents’ data. It is only by striking the right balance between being a data provider and being a data broker and consumer that governments will find the right path to being truly digital.
During the Gartner Symposia I attended last fall, I had numerous interesting conversations with people who are exploring very innovative ways of using its own data, such as:

  • tax authorities contemplating to use up-to-date financial information about taxpayers to proactively suggest investments that may provide tax breaks,
  • education institutions leveraging data about student location from their original purpose (giving parents information about students’ whereabouts) to providing new tools for teachers to understand behavioral patterns and relate those to more personalized learning
  • immigration authorities leveraging data coming from video analysis, whose role is to flag suspicious immigrants for secondary inspection, to inform public safety authorities or the hospitality sector about specific issues and opportunities with tourists.

In the second half of 2013, Gartner government analysts focused on distilling the fundamental components of a digital government initiative, in order to be able to shape our research and advice in ways that hit the most important issues that client face. The new government research agenda has just been published (see Agenda Overview for Government, 2014) and eight key issues, grouped in three distinct areas,  that need to be addressed to successfully transform into a digital government organization.
Engaging Citizens

  • Service Delivery Innovation: How will governments use technology to support innovative services that produce better results for society?
  • Open Government: How will governments create and sustain a digital ecosystem that citizens can trust and want to participate in?

Connecting Agencies

  • New Digital Business Models: What data-driven business models will emerge to meet the growing needs for adequate and sustainable public services?
  • Joint Governance: How will governance coordinate IT and service decisions across independent public and private organizations?
  • Scalable Interoperability: How much interoperability is needed to support connected government services and at what cost?

Resourcing Government

  • Workforce Innovation: How will the IT organization and role transform to support government workforce innovation?
  • Adaptive Sourcing: How will government IT organizations expand their sourcing strategies to take advantage of competitive cloud-based and consumer-grade solutions?
  • Sustainable Financing: How will government IT organizations obtain and manage the financial resources required to connect government and engage citizens?”

Safety Datapalooza Shows Power of Data.gov Communities


Lisa Nelson at DigitalGov: “The White House Office of Public Engagement held the first Safety Datapalooza illustrating the power of Data.gov communities. Federal Chief Technology Officer Todd Park and Deputy Secretary of Transportation John Porcari hosted the event, which touted the data available on Safety.Data.gov and the community of innovators using it to make effective tools for consumers.
The event showcased many of the  tools that have been produced as a result of  opening this safety data including:

  • PulsePoint, from the San Ramon Fire Protection District, a lifesaving mobile app that allows CPR-trained volunteers to be notified if someone nearby is in need of emergency assistance;
  • Commute and crime maps, from Trulia, allow home buyers to choose their new residence based on two important everyday factors; and
  • Hurricane App, from the American Red Cross, to monitor storm conditions, prepare your family and home, find help, and let others know you’re safe even if the power is out;

Safety data is far from alone in generating innovative ideas and gathering a community of developers and entrepreneurs, Data.gov currently has 16 different topically diverse communities on land and sea — the Cities and Oceans communities being two such examples. Data.gov’s communities are a virtual meeting spot for interested parties across government, academia and industry to come together and put the data to use. Data.gov enables a whole set of tools to make these communities come to life: apps, blogs, challenges, forums, ranking, rating and wikis.
For a summary of the Safety Datapalooza visit Transportation’s “Fast Lane” blog.”

EPA Launches New Citizen Science Website


Press Release:The U.S. Environmental Protection Agency has revamped its Citizen Science website to provide new resources and success stories to assist the public in conducting scientific research and collecting data to better understand their local environment and address issues of concern. The website can be found at www.epa.gov/region2/citizenscience.
“Citizen Science is an increasingly important part of EPA’s commitment to using sound science and technology to protect people’s health and safeguard the environment,” said Judith A. Enck, EPA Regional Administrator. “The EPA encourages the public to use the new website as a tool in furthering their scientific investigations and developing solutions to pollution problems.”
The updated website now offers detailed information about air, water and soil monitoring, including recommended types of equipment and resources for conducting investigations. It also includes case studies and videotapes that showcase successful citizen science projects in New York and New Jersey, provides funding opportunities, quality assurance information and workshops and webinars.”

Bad Data


Bad Data is a site providing real-world examples of how not to prepare or provide data. It showcases the poorly structured, the mis-formatted, or the just plain ugly. Its primary purpose is to educate – though there may also be some aspect of entertainment.
As a side-product it also provides a source of good practice material for budding data wranglers (the repo in fact began as a place to keep practice data for Data Explorer).
New examples wanted and welcome – submit them here »

Examples

Garbage In, Garbage Out… Or, How to Lie with Bad Data


Medium: For everyone who slept through Stats 101, Charles Wheelan’s Naked Statistics is a lifesaver. From batting averages and political polls to Schlitz ads and medical research, Wheelan “illustrates exactly why even the most reluctant mathophobe is well advised to achieve a personal understanding of the statistical underpinnings of life” (New York Times). What follows is adapted from the book, out now in paperback.
Behind every important study there are good data that made the analysis possible. And behind every bad study . . . well, read on. People often speak about “lying with statistics.” I would argue that some of the most egregious statistical mistakes involve lying with data; the statistical analysis is fine, but the data on which the calculations are performed are bogus or inappropriate. Here are some common examples of “garbage in, garbage out.”

Selection Bias

….Selection bias can be introduced in many other ways. A survey of consumers in an airport is going to be biased by the fact that people who fly are likely to be wealthier than the general public; a survey at a rest stop on Interstate 90 may have the opposite problem. Both surveys are likely to be biased by the fact that people who are willing to answer a survey in a public place are different from people who would prefer not to be bothered. If you ask 100 people in a public place to complete a short survey, and 60 are willing to answer your questions, those 60 are likely to be different in significant ways from the 40 who walked by without making eye contact.

Publication Bias

Positive findings are more likely to be published than negative findings, which can skew the results that we see. Suppose you have just conducted a rigorous, longitudinal study in which you find conclusively that playing video games does not prevent colon cancer. You’ve followed a representative sample of 100,000 Americans for twenty years; those participants who spend hours playing video games have roughly the same incidence of colon cancer as the participants who do not play video games at all. We’ll assume your methodology is impeccable. Which prestigious medical journal is going to publish your results?

Most things don’t prevent cancer.

None, for two reasons. First, there is no strong scientific reason to believe that playing video games has any impact on colon cancer, so it is not obvious why you were doing this study. Second, and more relevant here, the fact that something does not prevent cancer is not a particularly interesting finding. After all, most things don’t prevent cancer. Negative findings are not especially sexy, in medicine or elsewhere.
The net effect is to distort the research that we see, or do not see. Suppose that one of your graduate school classmates has conducted a different longitudinal study. She finds that people who spend a lot of time playing video games do have a lower incidence of colon cancer. Now that is interesting! That is exactly the kind of finding that would catch the attention of a medical journal, the popular press, bloggers, and video game makers (who would slap labels on their products extolling the health benefits of their products). It wouldn’t be long before Tiger Moms all over the country were “protecting” their children from cancer by snatching books out of their hands and forcing them to play video games instead.
Of course, one important recurring idea in statistics is that unusual things happen every once in a while, just as a matter of chance. If you conduct 100 studies, one of them is likely to turn up results that are pure nonsense—like a statistical association between playing video games and a lower incidence of colon cancer. Here is the problem: The 99 studies that find no link between video games and colon cancer will not get published, because they are not very interesting. The one study that does find a statistical link will make it into print and get loads of follow-on attention. The source of the bias stems not from the studies themselves but from the skewed information that actually reaches the public. Someone reading the scientific literature on video games and cancer would find only a single study, and that single study will suggest that playing video games can prevent cancer. In fact, 99 studies out of 100 would have found no such link.

Recall Bias

Memory is a fascinating thing—though not always a great source of good data. We have a natural human impulse to understand the present as a logical consequence of things that happened in the past—cause and effect. The problem is that our memories turn out to be “systematically fragile” when we are trying to explain some particularly good or bad outcome in the present. Consider a study looking at the relationship between diet and cancer. In 1993, a Harvard researcher compiled a data set comprising a group of women with breast cancer and an age-matched group of women who had not been diagnosed with cancer. Women in both groups were asked about their dietary habits earlier in life. The study produced clear results: The women with breast cancer were significantly more likely to have had diets that were high in fat when they were younger.
Ah, but this wasn’t actually a study of how diet affects the likelihood of getting cancer. This was a study of how getting cancer affects a woman’s memory of her diet earlier in life. All of the women in the study had completed a dietary survey years earlier, before any of them had been diagnosed with cancer. The striking finding was that women with breast cancer recalled a diet that was much higher in fat than what they actually consumed; the women with no cancer did not.

Women with breast cancer recalled a diet that was much higher in fat than what they actually consumed; the women with no cancer did not.

The New York Times Magazine described the insidious nature of this recall bias:

The diagnosis of breast cancer had not just changed a woman’s present and the future; it had altered her past. Women with breast cancer had (unconsciously) decided that a higher-fat diet was a likely predisposition for their disease and (unconsciously) recalled a high-fat diet. It was a pattern poignantly familiar to anyone who knows the history of this stigmatized illness: these women, like thousands of women before them, had searched their own memories for a cause and then summoned that cause into memory.

Recall bias is one reason that longitudinal studies are often preferred to cross-sectional studies. In a longitudinal study the data are collected contemporaneously. At age five, a participant can be asked about his attitudes toward school. Then, thirteen years later, we can revisit that same participant and determine whether he has dropped out of high school. In a cross-sectional study, in which all the data are collected at one point in time, we must ask an eighteen-year-old high school dropout how he or she felt about school at age five, which is inherently less reliable.

Survivorship Bias

Suppose a high school principal reports that test scores for a particular cohort of students has risen steadily for four years. The sophomore scores for this class were better than their freshman scores. The scores from junior year were better still, and the senior year scores were best of all. We’ll stipulate that there is no cheating going on, and not even any creative use of descriptive statistics. Every year this cohort of students has done better than it did the preceding year, by every possible measure: mean, median, percentage of students at grade level, and so on. Would you (a) nominate this school leader for “principal of the year” or (b) demand more data?

If you have a room of people with varying heights, forcing the short people to leave will raise the average height in the room, but it doesn’t make anyone taller.

I say “b.” I smell survivorship bias, which occurs when some or many of the observations are falling out of the sample, changing the composition of the observations that are left and therefore affecting the results of any analysis. Let’s suppose that our principal is truly awful. The students in his school are learning nothing; each year half of them drop out. Well, that could do very nice things for the school’s test scores—without any individual student testing better. If we make the reasonable assumption that the worst students (with the lowest test scores) are the most likely to drop out, then the average test scores of those students left behind will go up steadily as more and more students drop out. (If you have a room of people with varying heights, forcing the short people to leave will raise the average height in the room, but it doesn’t make anyone taller.)

Healthy User Bias

People who take vitamins regularly are likely to be healthy—because they are the kind of people who take vitamins regularly! Whether the vitamins have any impact is a separate issue. Consider the following thought experiment. Suppose public health officials promulgate a theory that all new parents should put their children to bed only in purple pajamas, because that helps stimulate brain development. Twenty years later, longitudinal research confirms that having worn purple pajamas as a child does have an overwhelmingly large positive association with success in life. We find, for example, that 98 percent of entering Harvard freshmen wore purple pajamas as children (and many still do) compared with only 3 percent of inmates in the Massachusetts state prison system.

The purple pajamas do not matter.

Of course, the purple pajamas do not matter; but having the kind of parents who put their children in purple pajamas does matter. Even when we try to control for factors like parental education, we are still going to be left with unobservable differences between those parents who obsess about putting their children in purple pajamas and those who don’t. As New York Times health writer Gary Taubes explains, “At its simplest, the problem is that people who faithfully engage in activities that are good for them—taking a drug as prescribed, for instance, or eating what they believe is a healthy diet—are fundamentally different from those who don’t.” This effect can potentially confound any study trying to evaluate the real effect of activities perceived to be healthful, such as exercising regularly or eating kale. We think we are comparing the health effects of two diets: kale versus no kale. In fact, if the treatment and control groups are not randomly assigned, we are comparing two diets that are being eaten by two different kinds of people. We have a treatment group that is different from the control group in two respects, rather than just one.

If statistics is detective work, then the data are the clues. My wife spent a year teaching high school students in rural New Hampshire. One of her students was arrested for breaking into a hardware store and stealing some tools. The police were able to crack the case because (1) it had just snowed and there were tracks in the snow leading from the hardware store to the student’s home; and (2) the stolen tools were found inside. Good clues help.
Like good data. But first you have to get good data, and that is a lot harder than it seems.

Open data movement faces fresh hurdles


SciDevNet: “The open-data community made great strides in 2013 towards increasing the reliability of and access to information, but more efforts are needed to increase its usability on the ground and the general capacity of those using it, experts say.
An international network of innovation hubs, the first extensive open data certification system and a data for development partnership are three initiatives launched last year by the fledgling Open Data Institute (ODI), a UK-based not-for-profit firm that champions the use of open data to aid social, economic and environmental development.
Before open data can be used effectively the biggest hurdles to be cleared are agreeing common formats for data sets and improving their trustworthiness and searchability, says the ODI’s chief statistician, Ulrich Atz.
“As it is so new, open data is often inconsistent in its format, making it difficult to reuse. We see a great need for standards and tools,” he tells SciDev.Net. Data that is standardised is of “incredible value” he says, because this makes it easier and faster to use and gives it a longer useable lifetime.
The ODI — which celebrated its first anniversary last month — is attempting to achieve this with a first-of-its-kind certification system that gives publishers and users important details about online data sets, including publishers’ names and contact information, the type of sharing licence, the quality of information and how long it will be available.
Certificates encourage businesses and governments to make use of open data by guaranteeing their quality and usability, and making them easier to find online, says Atz.
Finding more and better ways to apply open data will also be supported by a growing network of ODI ‘nodes’: centres that bring together companies, universities and NGOs to support open-data projects and communities….
Because lower-income countries often lack well-established data collection systems, they have greater freedom to rethink how data are collected and how they flow between governments and civil society, he says.
But there is still a long way to go. Open-data projects currently rely on governments and other providers sharing their data on online platforms, whereas in a truly effective system, information would be published in an open format from the start, says Davies.
Furthermore, even where advances are being made at a strategic level, open-data initiatives are still having only a modest impact in the real world, he says.
“Transferring [progress at a policy level] into availability of data on the ground and the capacity to use it is a lot tougher and slower,” Davies says.”

New Book: Open Data Now


New book by Joel Gurin (The GovLab): “Open Data is the world’s greatest free resource–unprecedented access to thousands of databases–and it is one of the most revolutionary developments since the Information Age began. Combining two major trends–the exponential growth of digital data and the emerging culture of disclosure and transparency–Open Data gives you and your business full access to information that has never been available to the average person until now. Unlike most Big Data, Open Data is transparent, accessible, and reusable in ways that give it the power to transform business, government, and society.
Open Data Now is an essential guide to understanding all kinds of open databases–business, government, science, technology, retail, social media, and more–and using those resources to your best advantage. You’ll learn how to tap crowds for fast innovation, conduct research through open collaboration, and manage and market your business in a transparent marketplace.
Open Data is open for business–and the opportunities are as big and boundless as the Internet itself. This powerful, practical book shows you how to harness the power of Open Data in a variety of applications:

  • HOT STARTUPS: turn government data into profitable ventures
  • SAVVY MARKETING: understand how reputational data drives your brand
  • DATA-DRIVEN INVESTING: apply new tools for business analysis
  • CONSUMER IN FORMATION: connect with your customers using smart disclosure
  • GREEN BUSINESS: use data to bet on sustainable companies
  • FAST R&D: turn the online world into your research lab
  • NEW OPPORTUNITIES: explore open fields for new businesses

Whether you’re a marketing professional who wants to stay on top of what’s trending, a budding entrepreneur with a billion-dollar idea and limited resources, or a struggling business owner trying to stay competitive in a changing global market–or if you just want to understand the cutting edge of information technology–Open Data Now offers a wealth of big ideas, strategies, and techniques that wouldn’t have been possible before Open Data leveled the playing field.
The revolution is here and it’s now. It’s Open Data Now.”

Supporting open government in New Europe


Google Europe Blog: “The “New Europe” countries that joined the European Union over the past decade are moving ahead fast to use the Internet to improve transparency and open government. We recently partnered with Techsoup Global to support online projects driving forward good governance in Romania, the Czech Republic, and most recently, in Slovakia.
Techsoup Global, in partnership with the Slovak Center for Philanthropy, recently held an exciting social-startups awards ceremony Restart Slovakia 2013 in Bratislava. Slovakia’s Deputy Minister of Finance and Digital Champion Peter Pellegrini delivered keynote promoting Internet and Open Data and announced the winners of this year contest. Ambassadors from U.S., Israel and Romania and several distinguished Slovak NGOs also attended the ceremony.
Winning projects included:

  • Vzdy a vsade – Always and Everywhere – a volunteer portal offering online and anonymous psychological advice to internet users via chat.
  • Nemlcme.sk – a portal providing counsel for victims of sexual assaults.
  • Co robim – an educational online library of job careers advising young people how to choose their career paths and dream jobs.
  • Mapa zlocinu – an online map displaying various rates of criminality in different neighbourhoods.
  • Demagog.sk – a platform focused on analyzing public statements of politicians and releasing information about politicians and truthfulness of their speeches in a user-friendly format.”

Why the Nate Silvers of the World Don’t Know Everything


Felix Salmon in Wired: “This shift in US intelligence mirrors a definite pattern of the past 30 years, one that we can see across fields and institutions. It’s the rise of the quants—that is, the ascent to power of people whose native tongue is numbers and algorithms and systems rather than personal relationships or human intuition. Michael Lewis’ Moneyball vividly recounts how the quants took over baseball, as statistical analy­sis trumped traditional scouting and propelled the underfunded Oakland A’s to a division-winning 2002 season. More recently we’ve seen the rise of the quants in politics. Commentators who “trusted their gut” about Mitt Romney’s chances had their gut kicked by Nate Silver, the stats whiz who called the election days before­hand as a lock for Obama, down to the very last electoral vote in the very last state.
The reason the quants win is that they’re almost always right—at least at first. They find numerical patterns or invent ingenious algorithms that increase profits or solve problems in ways that no amount of subjective experience can match. But what happens after the quants win is not always the data-driven paradise that they and their boosters expected. The more a field is run by a system, the more that system creates incentives for everyone (employees, customers, competitors) to change their behavior in perverse ways—providing more of whatever the system is designed to measure and produce, whether that actually creates any value or not. It’s a problem that can’t be solved until the quants learn a little bit from the old-fashioned ways of thinking they’ve displaced.
No matter the discipline or industry, the rise of the quants tends to happen in four stages. Stage one is what you might call pre-disruption, and it’s generally best visible in hindsight. Think about quaint dating agencies in the days before the arrival of Match .com and all the other algorithm-powered online replacements. Or think about retail in the era before floor-space management analytics helped quantify exactly which goods ought to go where. For a live example, consider Hollywood, which, for all the money it spends on market research, is still run by a small group of lavishly compensated studio executives, all of whom are well aware that the first rule of Hollywood, as memorably summed up by screenwriter William Goldman, is “Nobody knows anything.” On its face, Hollywood is ripe for quantifi­cation—there’s a huge amount of data to be mined, considering that every movie and TV show can be classified along hundreds of different axes, from stars to genre to running time, and they can all be correlated to box office receipts and other measures of profitability.
Next comes stage two, disruption. In most industries, the rise of the quants is a recent phenomenon, but in the world of finance it began back in the 1980s. The unmistakable sign of this change was hard to miss: the point at which you started getting targeted and personalized offers for credit cards and other financial services based not on the relationship you had with your local bank manager but on what the bank’s algorithms deduced about your finances and creditworthiness. Pretty soon, when you went into a branch to inquire about a loan, all they could do was punch numbers into a computer and then give you the computer’s answer.
For a present-day example of disruption, think about politics. In the 2012 election, Obama’s old-fashioned campaign operatives didn’t disappear. But they gave money and freedom to a core group of technologists in Chicago—including Harper Reed, former CTO of the Chicago-based online retailer Threadless—and allowed them to make huge decisions about fund-raising and voter targeting. Whereas earlier campaigns had tried to target segments of the population defined by geography or demographic profile, Obama’s team made the campaign granular right down to the individual level. So if a mom in Cedar Rapids was on the fence about who to vote for, or whether to vote at all, then instead of buying yet another TV ad, the Obama campaign would message one of her Facebook friends and try the much more effective personal approach…
After disruption, though, there comes at least some version of stage three: over­shoot. The most common problem is that all these new systems—metrics, algo­rithms, automated decisionmaking processes—result in humans gaming the system in rational but often unpredictable ways. Sociologist Donald T. Campbell noted this dynamic back in the ’70s, when he articulated what’s come to be known as Campbell’s law: “The more any quantitative social indicator is used for social decision-making,” he wrote, “the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.”…
Policing is a good example, as explained by Harvard sociologist Peter Moskos in his book Cop in the Hood: My Year Policing Baltimore’s Eastern District. Most cops have a pretty good idea of what they should be doing, if their goal is public safety: reducing crime, locking up kingpins, confiscating drugs. It involves foot patrols, deep investigations, and building good relations with the community. But under statistically driven regimes, individual officers have almost no incentive to actually do that stuff. Instead, they’re all too often judged on results—specifically, arrests. (Not even convictions, just arrests: If a suspect throws away his drugs while fleeing police, the police will chase and arrest him just to get the arrest, even when they know there’s no chance of a conviction.)…
It’s increasingly clear that for smart organizations, living by numbers alone simply won’t work. That’s why they arrive at stage four: synthesis—the practice of marrying quantitative insights with old-fashioned subjective experience. Nate Silver himself has written thoughtfully about examples of this in his book, The Signal and the Noise. He cites baseball, which in the post-Moneyball era adopted a “fusion approach” that leans on both statistics and scouting. Silver credits it with delivering the Boston Red Sox’s first World Series title in 86 years. Or consider weather forecasting: The National Weather Service employs meteorologists who, understanding the dynamics of weather systems, can improve forecasts by as much as 25 percent compared with computers alone. A similar synthesis holds in eco­nomic forecasting: Adding human judgment to statistical methods makes results roughly 15 percent more accurate. And it’s even true in chess: While the best computers can now easily beat the best humans, they can in turn be beaten by humans aided by computers….
That’s what a good synthesis of big data and human intuition tends to look like. As long as the humans are in control, and understand what it is they’re controlling, we’re fine. It’s when they become slaves to the numbers that trouble breaks out. So let’s celebrate the value of disruption by data—but let’s not forget that data isn’t everything.