Analyzing the Analyzers


An Introspective Survey of Data Scientists and Their Work, by Harlan Harris, Sean Murphy, and Marck Vaisman: “There has been intense excitement in recent years around activities labeled “data science,” “big data,” and “analytics.” However, the lack of clarity around these terms and, particularly, around the skill sets and capabilities of their practitioners has led to inefficient communication between “data scientists” and the organizations requiring their services. This lack of clarity has frequently led to missed opportunities. To address this issue, we surveyed several hundred practitioners via the Web to explore the varieties of skills, experiences, and viewpoints in the emerging data science community.

We used dimensionality reduction techniques to divide potential data scientists into five categories based on their self-ranked skill sets (Statistics, Math/Operations Research, Business, Programming, and Machine Learning/Big Data), and four categories based on their self-identification (Data Researchers, Data Businesspeople, Data Engineers, and Data Creatives). Further examining the respondents based on their division into these categories provided additional insights into the types of professional activities, educational background, and even scale of data used by different types of Data Scientists.
In this report, we combine our results with insights and data from others to provide a better understanding of the diversity of practitioners, and to argue for the value of clearer communication around roles, teams, and careers.”
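The excerpt does not say which dimensionality reduction technique the authors used, so the following is only a minimal sketch of the general idea, using principal component analysis as a stand-in. The skill names, the rankings, and the group labels "A" and "B" are all hypothetical and are not survey data.

```python
import math

# Hypothetical self-rankings (1 = weakest, 5 = strongest); not survey data.
SKILLS = ("statistics", "machine_learning", "business")
RANKINGS = [
    (5, 4, 1),
    (4, 5, 1),
    (5, 5, 2),
    (1, 2, 5),
    (2, 1, 5),
    (1, 1, 4),
]


def center(rows):
    """Subtract each column's mean so every skill averages to zero."""
    means = [sum(col) / len(col) for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(rows):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]


def leading_component(cov, iterations=100):
    """Approximate the top eigenvector of cov by power iteration."""
    v = [1.0] + [0.0] * (len(cov) - 1)
    for _ in range(iterations):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


centered = center(RANKINGS)
component = leading_component(covariance(centered))

# Project each respondent onto the single retained dimension,
# then split respondents by the sign of that projection.
scores = [sum(x * c for x, c in zip(row, component)) for row in centered]
groups = ["A" if s > 0 else "B" for s in scores]

print(dict(zip(SKILLS, (round(c, 3) for c in component))))
print(groups)
```

In this toy data the retained component weights statistics and machine learning together and against business, so three correlated rankings collapse into one axis along which respondents can be grouped. A real analysis would keep several components and use far more respondents.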

Gamification: A Short History


Ty McCormick in Foreign Policy: “If you’re checking in on Foursquare or ramping up the “strength” of your LinkedIn profile, you’ve just been gamified — whether or not you know it. “Gamification,” today’s hottest business buzzword, is gaining traction everywhere from corporate boardrooms to jihadi chat forums, and its proponents say it can revolutionize just about anything, from education to cancer treatment to ending poverty. While the global market for gamification is expected to explode from $242 million in 2012 to $2.8 billion in 2016, according to market analysis firm M2 Research, there is a growing chorus of critics who think it’s little more than a marketing gimmick. So is the application of game mechanics to everyday life more than just a passing fad? You decide.
1910
Kellogg’s cereals offers its first “premium,” the Funny Jungleland Moving-Pictures book, free with every two boxes. Two years later, Cracker Jack starts putting prizes, from stickers to baseball cards, in its boxes of caramel-coated corn snacks. “A prize in every box” is an instant hit; over the next 100 years, Cracker Jack gives away more than 23 billion in-package treasures. By the 1950s, the concept of gamification is yet to be born, but its primary building block — fun — is motivating billions of consumers around the world.
1959
Duke University sociologist Donald F. Roy publishes “Banana Time,” an ethnographic study of garment workers in Chicago. Roy chronicles how workers use “fun” and “fooling” on the factory room floor — including a daily ritual game in which workers steal a banana — to stave off the “beast of monotony.” The notion that fun can enhance job satisfaction and productivity inspires reams of research on games in the workplace….”

How Open Data Can Fight Climate Change


New blog post by Joel Gurin, Founder and Editor, OpenDataNow.com: “When people point to the value of Open Data from government, they often cite the importance of weather data from NOAA, the National Oceanic and Atmospheric Administration. That data has given us the Weather Channel, more accurate forecasts, and a number of weather-based companies. But the most impressive – and one of the best advertisements for government Open Data – may well be The Climate Corporation, headquartered in San Francisco.
Founded in 2006 under the name WeatherBill, The Climate Corporation was started to sell a better kind of weather insurance. But it’s grown into a company that could help farmers around the world plan around climate change, increase their crop yields, and become part of a new green revolution.
The company’s work is especially relevant in light of President Obama’s speech yesterday on new plans to fight climate change. We know that whatever we do to reduce carbon emissions now, we’ll still need to deal with changes that are already irreversible. The Climate Corporation’s work can be part of that solution…
The company has developed a new service, Climate.com, that is free to policyholders and available to others for a fee….
Their work may become part of a global Green Revolution 2.0. The U.S. Government’s satellite data doesn’t stop at the border: It covers the entire planet. The Climate Corporation is now looking for ways to apply its work internationally, probably starting with Australia, which has relevant data of its own.
Start with insurance sales, end up by changing the world. The power of Open Data has never been clearer.”

Quantifying Our Cities, Ourselves


David Sasaki in Next City: “Over the past few years a merry band of geeks from around the world has given rise to the movement of the quantified self. The mission, as the geeks explain it, is “self knowledge through numbers.” Vanity Fair sarcastically calls them “weirder, hive minder weight watchers.”
The basic premise of the quantified self is perhaps best summed up by a popular slogan from business consultant Peter Drucker: “What gets measured gets managed.” If we aspire to run faster, then we must use a stopwatch to time our pace. If we want to lose weight, then we must buy a scale to measure our progress until we reach our goal. Modern self-trackers have the advantages of apps that make it possible to quantitatively analyze sleep, moods, finances, vital signs and even amino acids, all without consulting a single other person….
What if we were to apply the model of the quantified self to the development of our cities? It’s a question that appears to be gaining steam. Esther Dyson, an influential angel investor and technology analyst, has observed the emergence of a suite of applications that enable citizens and governments to monitor the “health” of their communities.
Civic Insight, for example, has partnered with New Orleans to enable citizens to monitor what the local government is doing to address blight. On Monday, the project was announced as one of eight winners of the 2013 Knight News Challenge, which means that the software will be expanding for use in other cities. Yelp has partnered with New York and San Francisco to make restaurant inspection data available on restaurant profile pages. (Boston, Philadelphia and Chicago have already committed to making their restaurant inspection data available using the same standard.) The Daily Brief allows residents of Baltimore, Bloomington and Boston to monitor all the 311 service requests made by citizens each day.”

Transforming Government Acquisition Systems: Overview and Selected Issues


New Report of the Congressional Research Service: “Increasingly, the federal government uses technology to facilitate and support the federal acquisition process. Primary beneficiaries of this shift to online systems (websites and databases) are the government’s acquisition workforce and prospective and incumbent government contractors. The suite of web-based systems supports contracting officers’ efforts to ensure the government contracts only with responsible parties, is essential to the dissemination of information regarding contracting opportunities, and facilitates interagency contracting. From the contractor perspective, the government’s online systems streamline the processes involved in fulfilling various administrative requirements, provide access to possible contracting opportunities, and are potential resources for market research.
Although this report does not focus on transparency, several issues discussed here are related to transparency. First, while the Federal Business Opportunities (FedBizOpps) website and FPDS-NG provide information about executive branch agencies’ procurements, a database of federal agencies’ contracts does not exist. In 2003, GSA established a working group to examine the feasibility, challenges, and anticipated benefits of posting federal contracts online. Ultimately, the working group concluded there were insufficient data to support recommending the establishment of a central system for posting contracts online. In 2010, the Department of Defense (DOD), GSA, and the National Aeronautics and Space Administration (NASA) issued an advance notice of proposed rulemaking (ANPR) regarding posting contracts online. Comments submitted in response to the notice identified several challenges, and the matter was concluded when the agencies withdrew the ANPR. Second, transparency does not necessarily equate to comprehension. Generally, variation exists among the users of government procurement systems regarding their knowledge of government procurement and procurement data. Third, during the 113th Congress, two similar bills (H.R. 2061 and S. 994) with the same name (Digital Accountability and Transparency Act, or DATA Act) were introduced, either of which would enhance transparency of spending data, including certain procurement data. If either bill is enacted, it might have implications for FPDS-NG.”

Knight News Challenge on Open Gov


Press Release: “Knight Foundation today named eight projects as winners of the Knight News Challenge on Open Gov, awarding the recipients more than $3.2 million for their ideas.
The projects will provide new tools and approaches to improve the way people and governments interact. They tackle a range of issues from making it easier to open a local business to creating a simulator that helps citizens visualize the impact of public policies on communities….
Each of the winning projects offers a solution to a real-world need. They include:
Civic Insight: Providing up-to-date information on vacant properties so that communities can find ways to make tangible improvements to local spaces;
OpenCounter: Making it easier for residents to register and create new businesses by building open source software that governments can use to simplify the process;
Open Gov for the Rest of Us: Providing residents in low-income neighborhoods in Chicago with the tools to access and demand better data around issues important to them, like housing and education;
Outline.com: Launching a public policy simulator that helps people visualize the impact that public policies like health care reform and school budget changes might have on local economies and communities;
Oyez Project: Making state and appellate court documents freely available and useful to journalists, scholars and the public, by providing straightforward summaries of decisions, free audio recordings and more;
Procur.io: Making government contract bidding more transparent by simplifying the way smaller companies bid on government work;
GitMachines: Supporting government innovation by creating tools and servers that meet government regulations, so that developers can easily build and adopt new technology;
Plan in a Box: Making it easier to discover information about local planning projects, by creating a tool that governments and contractors can use to easily create websites with updates that also allow public input into the process.

Now in its sixth year, the Knight News Challenge accelerates media innovation by funding breakthrough ideas in news and information. Winners receive a share of $5 million in funding and support from Knight’s network of influential peers and advisors to help advance their ideas. Past News Challenge winners have created a lasting impact. They include: DocumentCloud, which analyzes and annotates public documents – turning them into data; Tools for OpenStreetMap, which makes it easier to contribute to the editable map of the world; and Safecast, which helps people measure air quality and became the leading provider of pollution data following the 2011 earthquake and tsunami in Japan.
For more, visit newschallenge.org and follow #newschallenge on Twitter.”

Weather Could Be Next On The Auction Block For Crowdsourced Data


Darrell Etherington in TechCrunch: “Waze’s big exit to Google proved one thing: if companies can harness the power of the crowd to deliver real-time, granular data, big tech corporations will be watching them closely as potential acquisition targets. There’s another category ripe for the picking, even if the problem being solved isn’t as apparent or immediately useful as traffic and navigation data: weather. A few apps are trying to harness the crowd to provide accurate, ground-level forecasts and conditions, and they’re catching on with consumers, too.
Montreal-based startup SkyMotion is one such firm, and it recently launched its 4.0 update, which not only harnesses crowdsourced weather reports, but also allows other businesses to plug into that data using a public API, to integrate real-time reporting data from SkyMotion’s users into their own products. That provides an up-to-the-minute forecast, one that probably won’t show you weather conditions completely dissimilar from the ones you’re actually feeling outside at any given moment, as can still be the case with apps that pull weather data only from specific weather monitoring stations….
SkyMotion isn’t alone in crowdsourcing weather data. There’s also Weddar, the “people-powered” weather service and mobile app that encourages location-based reporting with a very human element, since it asks people how conditions generally feel on the ground, instead of seeking out specifics…”

When Ordinary Americans Accomplish What the Government Can’t


in The National Journal: “Washington may be paralyzed by partisanship, but across the country, grassroots innovators are crafting solutions to our problems….This special issue of National Journal celebrates these pragmatic problem-solvers in business, the civic sector, local government, and partnerships that creatively combine all three. At a time of endemic stalemate in the nation’s capital, think of it as a report from the America that works (to borrow a recent phrase from The Economist)….
Another significant message is that the communications revolution, by greatly accelerating the sharing of ideas, has produced a “democratization of innovation,” as author Vijay Vaitheeswaran put it in his 2012 book, Need, Speed, and Greed. This dynamic has simultaneously allowed breakthroughs to disseminate faster than ever and empowered more people inside companies and communities to tackle problems previously left to elites. “One of the most interesting stories in social change today is how much creative problem-solving is emerging from citizens scattered far and wide who are taking it upon themselves to fix things and who, in many cases, are outperforming traditional organizations,” David Bornstein, founder of the Dowser.org website that tracks social innovation, wrote in The New York Times last year. Our honoree Eric Greitens, the former Navy SEAL who founded The Mission Continues for other post-9/11 veterans, personifies this trend. Across the categories, many honorees insist they have pursued new approaches in part because they could no longer wait for Washington to address the problems they face. In a world where barriers to the dispersal of ideas are crumbling, waiting for elites to propose answers may soon seem as outdated as waiting for a dial-up connection to the Internet.
The third conclusion limits the first two. Even many of the most dynamic grassroots innovations will remain isolated islands of excellence in this continent-sized society without energy and amplification from the top. Donald Kettl, dean of the University of Maryland’s School of Public Policy, notes the federal government is unavoidably a major force on many of the challenges facing America, particularly reforming education, health care, and training; developing regional economic strategies; and providing physical and digital infrastructure. Washington need not direct or control the response to these problems, but change on a massive scale is always harder without stronger signals and incentives than the federal government has provided in recent years. “It is possible to feed change aggressively from the bottom,” Kettl says. “[But] the federal government, for better or worse, inevitably is involved…. There’s a natural limit in what’s possible to bubble up from the bottom.”…
Special issue at https://web.archive.org/web/2013/http://www.nationaljournal.com/back-in-business ”

Big ideas can be bad ideas – even in the age of the thinktank


Mark Mazower, who teaches history at Columbia University, in the Guardian: “First there was Francis Fukuyama’s The End of History. More recently, we had Malcolm Gladwell’s The Tipping Point and Cass Sunstein’s Nudge: for years, it seems, big ideas have been heading our way across the Atlantic. It is hard to think of many similarly catchy slogans that have gone the other way of late – Tony Giddens’ notion of “the third way” may be one.
Some people think that is a problem. They are worried that Britain has been failing to produce big ideas that policymakers can use. They want to convert academic ideas into policy relevance and shake up the bureaucrats. Phillip Blond, who recently wrote a controversial article in Chatham House’s magazine, is one of them. Francis Maude is another: he wants politicians to be able to appoint senior civil servants so that fresh thinking can enter Whitehall…
And are big ideas the kind of ideas worth having anyway? They age badly for one thing and quickly look shopworn. Moreover, it’s hard to think of many scholars whose best work has been directed explicitly towards such a goal. …The tendency in recent government policy here to demand demonstrable policy relevance or public “impact” from academics shows how far this mindset has spread. It may or may not produce some policy product. But what it will do is jeopardise British universities’ ability to do what they have done so well for so long: world-class research. These days both government and business demand value for money when they fund academia, and this makes it harder and more vital to insist that there are many ways to demonstrate the value of ideas, not just policy relevance.”

Experiments in Democracy


Jeremy Rozansky, assistant editor of National Affairs, in The New Atlantis: “In his debut book Uncontrolled, entrepreneur and policy analyst Jim Manzi argues that social scientists and policymakers should instead adopt the “experimental method.” The essential tool of this method is the randomized field trial (RFT), a technique that already informs many of our successful private enterprises. Perhaps the best known example of RFTs — one that Manzi uses to illustrate the concept — is the kind of clinical trial performed to test new medicines, wherein researchers “undertake a painstaking series of replicated controlled experiments to measure the effects of various interventions under various conditions,” as he puts it.
The central argument of Uncontrolled is that RFTs should be adopted more widely by businesses as well as government. The book is helpful and holds much wisdom — although the approach he recommends is ultimately just another streetlamp in the night, casting a pale light that tapers off after a few yards. Much still lies beyond its glow….
The econometric method now dominates the social sciences because it helps to cope with the problem of high causal density. It begins with a large data set: economic records, election results, surveys, and other similar big pools of data. Then the social scientist uses statistical techniques to model the interactions of sundry independent variables (causes) and a dependent variable (the effect). But for this method to work properly, social scientists must know all the causally important variables beforehand, because a hidden conditional could easily yield a false positive.
The experimental method, which Manzi prefers, offers a different way of coping with high causal density: sidestepping the problem of isolating exact causes. To sort out whether a given treatment or policy works, a scientist or social scientist can try it out on a random section of a population, and compare the results to a different section of the population where the treatment or policy was not implemented. So while econometric models aim to identify which particular variables are responsible for different results, RFTs have more modest aims, as they do not seek to identify every hidden conditional. By using the RFT approach, we may not know precisely why we achieved a desired effect, since we do not model all possible variables. But we can gain some ability to know that we will achieve a desired effect, at least under certain conditions.
Strictly speaking, even a randomized field trial only tells us with certainty that some exact technique worked with some specific population on some specific date in the past when conducted by some specific experimenters. We cannot know whether a given treatment or policy will work again under the same conditions at a later date, much less on a different population, much less still on the population as a whole. But scientists must always be cautious about moving from particular results to general conclusions; this is why experiments need to be replicated. And the more we do replicate them, the more information we can gain from those particular results, and the more reliably they can build toward teaching us which treatments or policies might work or (more often) which probably won’t. The result is that the RFT approach is very well suited to the business of government, since policymakers usually only need to know whether a given policy will work — whether it will produce a desired outcome.”
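The contrast in the excerpt above, between modeling observational data with an unmeasured variable and running a randomized trial, can be illustrated with a small simulation. This is a minimal sketch on synthetic data and is not taken from Manzi's book or the review. The effect sizes, the uptake probabilities, and all function names are assumptions chosen for illustration.

```python
import random
from statistics import mean

TRUE_EFFECT = 1.0    # assumed effect of the policy on the outcome
HIDDEN_EFFECT = 3.0  # assumed effect of an unmeasured trait on the outcome


def outcome(treated, hidden, rng):
    """Outcome depends on treatment, a hidden trait, and random noise."""
    return TRUE_EFFECT * treated + HIDDEN_EFFECT * hidden + rng.gauss(0.0, 1.0)


def difference_in_means(rows):
    """Mean outcome of the treated minus mean outcome of the untreated."""
    treated = [y for t, y in rows if t]
    control = [y for t, y in rows if not t]
    return mean(treated) - mean(control)


def simulate(n, randomized, seed=0):
    """Estimate the policy effect from n simulated people."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        hidden = rng.random() < 0.5
        if randomized:
            # Field trial: a coin flip decides who gets the policy.
            treated = rng.random() < 0.5
        else:
            # Observational data: people with the hidden trait opt in more often.
            treated = rng.random() < (0.8 if hidden else 0.2)
        rows.append((treated, outcome(treated, hidden, rng)))
    return difference_in_means(rows)


observational_estimate = simulate(20_000, randomized=False)
randomized_estimate = simulate(20_000, randomized=True)

print(f"observational estimate: {observational_estimate:.2f}")
print(f"randomized estimate:    {randomized_estimate:.2f}")
```

Under these assumptions the observational comparison is biased upward, to about 1 + 3 × (0.8 − 0.2) = 2.8 in expectation, because the hidden trait drives both uptake and outcome. The randomized comparison lands near the true effect of 1.0 without the hidden trait ever being measured. As the excerpt notes, it still says nothing about why the policy works or whether it would work on a different population.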