Analyzing the Analyzers


An Introspective Survey of Data Scientists and Their Work, by Harlan Harris, Sean Murphy, Marck Vaisman: “There has been intense excitement in recent years around activities labeled “data science,” “big data,” and “analytics.” However, the lack of clarity around these terms and, particularly, around the skill sets and capabilities of their practitioners has led to inefficient communication between “data scientists” and the organizations requiring their services. This lack of clarity has frequently led to missed opportunities. To address this issue, we surveyed several hundred practitioners via the Web to explore the varieties of skills, experiences, and viewpoints in the emerging data science community.

We used dimensionality reduction techniques to divide potential data scientists into five categories based on their self-ranked skill sets (Statistics, Math/Operations Research, Business, Programming, and Machine Learning/Big Data), and four categories based on their self-identification (Data Researchers, Data Businesspeople, Data Engineers, and Data Creatives). Further examining the respondents based on their division into these categories provided additional insights into the types of professional activities, educational background, and even scale of data used by different types of Data Scientists.
In this report, we combine our results with insights and data from others to provide a better understanding of the diversity of practitioners, and to argue for the value of clearer communication around roles, teams, and careers.”

How Open Data Can Fight Climate Change


New blog post by Joel Gurin, Founder and Editor, OpenDataNow.com: “When people point to the value of Open Data from government, they often cite the importance of weather data from NOAA, the National Oceanic and Atmospheric Administration. That data has given us the Weather Channel, more accurate forecasts, and a number of weather-based companies. But the most impressive – and one of the best advertisements for government Open Data – may well be The Climate Corporation, headquartered in San Francisco.
Founded in 2006 under the name WeatherBill, The Climate Corporation was started to sell a better kind of weather insurance. But it’s grown into a company that could help farmers around the world plan around climate change, increase their crop yields, and become part of a new green revolution.
The company’s work is especially relevant in light of President Obama’s speech yesterday on new plans to fight climate change. We know that whatever we do to reduce carbon emissions now, we’ll still need to deal with changes that are already irreversible. The Climate Corporation’s work can be part of that solution…
The company has developed a new service, Climate.com, that is free to policyholders and available to others for a fee….
Their work may become part of a global Green Revolution 2.0. The U.S. Government’s satellite data doesn’t stop at the border: It covers the entire planet.  The Climate Corporation is now looking for ways to apply its work internationally, probably starting with Australia, which has relevant data of its own.
Start with insurance sales, end up by changing the world. The power of Open Data has never been clearer.”

Sensing and Shaping Emerging Conflicts


A new Report of a Joint Workshop of the National Academy of Engineering and the United States Institute of Peace: Roundtable on Technology, Science, and Peacebuilding: “Technology has revolutionized many aspects of modern life, from how businesses operate, to how people get information, to how countries wage war. Certain technologies in particular, including not only cell phones and the Internet but also satellites, drones, and sensors of various kinds, are transforming the work of mitigating conflict and building peaceful societies. Rapid increases in the capabilities and availability of digital technologies have put powerful communications devices in the hands of most of the world’s population.
These technologies enable one-to-one and one-to-many flows of information, connecting people in conflict settings to individuals and groups outside those settings and, conversely, linking humanitarian organizations to people threatened by violence. Communications within groups have also intensified and diversified as the group members use new technologies to exchange text, images, video, and audio. Monitoring and analysis of the flow and content of this information can yield insights into how violence can be prevented or mitigated. In this way technologies and the resulting information can be used to detect and analyze, or sense, impending conflict or developments in ongoing conflict.”

FailureFest


Geoff Mulgan’s blog: “We’ve often discussed the role of failure in innovation – and have started running FailureFests and other devices to get practitioners talking honestly about what they learned from things that didn’t work. We all know how hard this is.
There’s a new book out by the guru of failure in engineering, Henry Petroski: To forgive design: understanding failure. He argues that the best way of achieving lasting success is by understanding failure and that a single failure may show ‘weaknesses in reasoning, knowledge, and performance that all the successful designs may not even hint at’. For him the best examples are collapsing bridges. Here’s a very different, but helpful, example of trying to extract some useful lessons from a well-intentioned project that didn’t quite work in a field very distant from bridges. It’s a reminder of why it’s so important that the new What Works centres are brave enough to set out clearly the ideas that they think have been tested and shown not to work – that may be just as useful as the recommendations on best or proven practice.
Of course it’s not enough to say we should celebrate failure. No organisation or system can do that. Instead there is an unavoidable ambiguity in the relationship between innovation and failure. On the one hand if you’re not failing often, you’re probably not taking enough creative risks. On the other hand, if you fail too much don’t expect to keep your job, or your funding.”

Knight News Challenge on Open Gov


Press Release: “Knight Foundation today named eight projects as winners of the Knight News Challenge on Open Gov, awarding the recipients more than $3.2 million for their ideas.
The projects will provide new tools and approaches to improve the way people and governments interact. They tackle a range of issues from making it easier to open a local business to creating a simulator that helps citizens visualize the impact of public policies on communities….
Each of the winning projects offers a solution to a real-world need. They include:
Civic Insight: Providing up-to-date information on vacant properties so that communities can find ways to make tangible improvements to local spaces;
OpenCounter: Making it easier for residents to register and create new businesses by building open source software that governments can use to simplify the process;
Open Gov for the Rest of Us: Providing residents in low-income neighborhoods in Chicago with the tools to access and demand better data around issues important to them, like housing and education;
Outline.com: Launching a public policy simulator that helps people visualize the impact that public policies like health care reform and school budget changes might have on local economies and communities;
Oyez Project: Making state and appellate court documents freely available and useful to journalists, scholars and the public, by providing straightforward summaries of decisions, free audio recordings and more;
Procur.io: Making government contract bidding more transparent by simplifying the way smaller companies bid on government work;
GitMachines: Supporting government innovation by creating tools and servers that meet government regulations, so that developers can easily build and adopt new technology;
Plan in a Box: Making it easier to discover information about local planning projects, by creating a tool that governments and contractors can use to easily create websites with updates that also allow public input into the process.

Now in its sixth year, the Knight News Challenge accelerates media innovation by funding breakthrough ideas in news and information. Winners receive a share of $5 million in funding and support from Knight’s network of influential peers and advisors to help advance their ideas. Past News Challenge winners have created a lasting impact. They include: DocumentCloud, which analyzes and annotates public documents – turning them into data; Tools for OpenStreetMap, which makes it easier to contribute to the editable map of the world; and Safecast, which helps people measure air quality and became the leading provider of pollution data following the 2011 earthquake and tsunami in Japan.
For more, visit newschallenge.org and follow #newschallenge on Twitter.”

G8 Open Data Charter: "Open Data by Default" will "fuel innovation"


G8 Open Data Charter, June 2013: “Principle 1: Open Data by Default
13. We recognise that free access to, and subsequent re-use of, open data are of significant value to society and the economy.
14. We agree to orient our governments towards open data by default.
15. We recognise that the term government data is meant in the widest sense possible. This could apply to data owned by national, federal, local, or international government bodies, or by the wider public sector.
16. We recognise that there is national and international legislation, in particular pertaining to intellectual property, personally-identifiable and sensitive information, which must be observed.
17. We will: establish an expectation that all government data be published openly by default, as outlined in this Charter, while recognising that there are legitimate reasons why some data cannot be released….
Principle 4: Releasing Data for Improved Governance
25. We recognise that the release of open data strengthens our democratic institutions and encourages better policy-making to meet the needs of our citizens. This is true not only in our own countries but across the world.
26. We also recognise that interest in open data is growing in other multilateral organisations and initiatives.
27. We will: share technical expertise and experience with each other and with other countries across the world so that everyone can reap the benefits of open data; and be transparent about our own data collection, standards, and publishing processes, by documenting all of these related processes online.
Principle 5: Releasing Data for Innovation
28. Recognising the importance of diversity in stimulating creativity and innovation, we agree that the more people and organisations that use our data, the greater the social and economic benefits that will be generated. This is true for both commercial and non-commercial uses.
29. We will: work to increase open data literacy and encourage people, such as developers of applications and civil society organisations that work in the field of open data promotion, to unlock the value of open data; empower a future generation of data innovators by providing data in machine-readable formats.”
See also:
Professor Sir Nigel Shadbolt, Chairman and Co-Founder, Open Data Institute on G8 Open Data Charter: why it matters
Nick Sinai and Marina Martin from the White House on Open Data Going Global

Big ideas can be bad ideas – even in the age of the thinktank


Mark Mazower, who teaches history at Columbia University, in the Guardian: “First there was Francis Fukuyama’s The End of History. More recently, we had Malcolm Gladwell’s The Tipping Point and Cass Sunstein’s Nudge: for years, it seems, big ideas have been heading our way across the Atlantic. It is hard to think of many similarly catchy slogans that have gone the other way of late – Tony Giddens’ notion of “the third way” may be one.
Some people think that is a problem. They are worried that Britain has been failing to produce big ideas that policymakers can use. They want to convert academic ideas into policy relevance and shake up the bureaucrats. Phillip Blond, who recently wrote a controversial article in Chatham House’s magazine, is one of them. Francis Maude is another: he wants politicians to be able to appoint senior civil servants so that fresh thinking can enter Whitehall…
And are big ideas the kind of ideas worth having anyway? They age badly for one thing and quickly look shopworn. Moreover, it’s hard to think of many scholars whose best work has been directed explicitly towards such a goal. …The tendency in recent government policy here to demand demonstrable policy relevance or public “impact” from academics shows how far this mindset has spread. It may or may not produce some policy product. But what it will do is jeopardise British universities’ ability to do what they have done so well for so long: world-class research. These days both government and business demand value for money when they fund academia, and this makes it harder and more vital to insist that there are many ways to demonstrate the value of ideas, not just policy relevance.”

Experiments in Democracy


Jeremy Rozansky, assistant editor of National Affairs in The New Atlantis: “In his debut book Uncontrolled, entrepreneur and policy analyst Jim Manzi argues that social scientists and policymakers should instead adopt the “experimental method.” The essential tool of this method is the randomized field trial (RFT), a technique that already informs many of our successful private enterprises. Perhaps the best known example of RFTs — one that Manzi uses to illustrate the concept — is the kind of clinical trial performed to test new medicines, wherein researchers “undertake a painstaking series of replicated controlled experiments to measure the effects of various interventions under various conditions,” as he puts it.
 
The central argument of Uncontrolled is that RFTs should be adopted more widely by businesses as well as government. The book is helpful and holds much wisdom — although the approach he recommends is ultimately just another streetlamp in the night, casting a pale light that tapers off after a few yards. Much still lies beyond its glow….
The econometric method now dominates the social sciences because it helps to cope with the problem of high causal density. It begins with a large data set: economic records, election results, surveys, and other similar big pools of data. Then the social scientist uses statistical techniques to model the interactions of sundry independent variables (causes) and a dependent variable (the effect). But for this method to work properly, social scientists must know all the causally important variables beforehand, because a hidden conditional could easily yield a false positive.
The experimental method, which Manzi prefers, offers a different way of coping with high causal density: sidestepping the problem of isolating exact causes. To sort out whether a given treatment or policy works, a scientist or social scientist can try it out on a random section of a population, and compare the results to a different section of the population where the treatment or policy was not implemented. So while econometric models aim to identify which particular variables are responsible for different results, RFTs have more modest aims, as they do not seek to identify every hidden conditional. By using the RFT approach, we may not know precisely why we achieved a desired effect, since we do not model all possible variables. But we can gain some ability to know that we will achieve a desired effect, at least under certain conditions.
Strictly speaking, even a randomized field trial only tells us with certainty that some exact technique worked with some specific population on some specific date in the past when conducted by some specific experimenters. We cannot know whether a given treatment or policy will work again under the same conditions at a later date, much less on a different population, much less still on the population as a whole. But scientists must always be cautious about moving from particular results to general conclusions; this is why experiments need to be replicated. And the more we do replicate them, the more information we can gain from those particular results, and the more reliably they can build toward teaching us which treatments or policies might work or (more often) which probably won’t. The result is that the RFT approach is very well suited to the business of government, since policymakers usually only need to know whether a given policy will work — whether it will produce a desired outcome.”
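The logic of the randomized field trial described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not anything taken from Manzi's book or Rozansky's review: the function names `simulate_rft` and `replicate`, the size of the effect, and the "hidden trait" model are all hypothetical. Each unit has a trait the analyst never observes; because treatment is assigned at random, the trait averages out across the two groups, and the difference in mean outcomes estimates the effect without modelling its causes. Repeating the trial with different random draws mirrors the point about replication.

```python
import random
from statistics import mean


def simulate_rft(n_units, true_effect, seed):
    """Simulate one randomized field trial (hypothetical model).

    Returns (estimated_effect, n_treated, n_control).
    """
    rng = random.Random(seed)

    # Hidden trait per unit: it shifts the outcome, but the analyst never sees it.
    hidden = [rng.gauss(0.0, 1.0) for _ in range(n_units)]

    # Random assignment: half the units are treated, the rest are controls.
    assignment = [True] * (n_units // 2) + [False] * (n_units - n_units // 2)
    rng.shuffle(assignment)

    treated, control = [], []
    for trait, is_treated in zip(hidden, assignment):
        outcome = trait + rng.gauss(0.0, 1.0)  # trait plus unrelated noise
        if is_treated:
            treated.append(outcome + true_effect)
        else:
            control.append(outcome)

    # Difference in group means: no model of the hidden trait is needed.
    return mean(treated) - mean(control), len(treated), len(control)


def replicate(n_trials, n_units, true_effect):
    """Repeat the trial with a different seed each time and collect the estimates."""
    return [simulate_rft(n_units, true_effect, seed)[0] for seed in range(n_trials)]


if __name__ == "__main__":
    for i, estimate in enumerate(replicate(5, 10_000, 0.5)):
        print(f"trial {i}: estimated effect = {estimate:.3f}")
```

Each replication gives a slightly different estimate, scattered around the effect built into the simulation. Nothing in the output says why the treatment works, which matches the modest aim the review attributes to RFTs.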

Data-Smart City Solutions


Press Release: “Today the Ash Center for Democratic Governance and Innovation at Harvard Kennedy School announced the launch of Data-Smart City Solutions, a new initiative aimed at using big data and analytics to transform the way local government operates. Bringing together leading industry, academic, and government officials, the initiative will offer city leaders a national depository of cases and best practice examples where cities and private partners use analytics to solve city problems. Data-Smart City Solutions is funded by Bloomberg Philanthropies and the John D. and Catherine T. MacArthur Foundation.

Data-Smart City Solutions highlights best practices, curates resources, and supports cities embarking on new data projects. The initiative’s website contains feature-length articles on how data drives innovation in different policy areas, profile pieces on municipal leaders at the forefront of implementing data analytics in their cities, and resources for interested officials to begin data projects in their own communities.
Recent articles include an assessment of Boston’s Adopt-a-Hydrant program as a potential harbinger of future city work promoting civic engagement and infrastructure maintenance, and a feature on how predictive technology is transforming police work. The site also spotlights municipal use of data such as San Francisco’s efforts to integrate data from different social service departments to better identify and serve at-risk youth. In addition to visiting the initiative’s website, Data-Smart City Solutions’ work is chronicled in their newsletter as well as on their Twitter page.”

Is Cybertopianism Really Such a Bad Thing?


Ethan Zuckerman in Slate: “As the historian and technology scholar Langdon Winner suggests, “The arrival of any new technology that has significant power and practical potential always brings with it a wave of visionary enthusiasm that anticipates the rise of a utopian social order.” Technologies that connect individuals to one another—like the airplane, the telegraph, and the radio—appear particularly powerful at helping us imagine a smaller, more connected world. Seen through this lens, the Internet’s underlying architecture—it is no more and no less than a network that connects networks—and the sheer amount written about it in the past decade guaranteed that the network would be placed at the center of visions for a world made better through connection. These visions are so abundant that they’ve even spawned a neologism: “cyberutopianism.”

The term “cyberutopian” tends to be used only in the context of critique. Calling someone a cyberutopian implies that he or she has an unrealistic and naïvely overinflated sense of what technology makes possible and an insufficient understanding of the forces that govern societies. Curiously, the commonly used term for an opposite stance, a belief that Internet technologies are weakening society, coarsening discourse, and hastening conflict is described with a less weighted term: “cyberskepticism.” Whether or not either of these terms adequately serves us in this debate, we should consider cyberutopianism’s appeal, and its merits….

If we reject the notion that technology makes certain changes inevitable, but accept that the aspirations of the “cyberutopians” are worthy ones, we are left with a challenge: How do we rewire the tools we’ve built to maximize our impact on an interconnected world? Accepting the shortcomings of the systems we’ve built as inevitable and unchangeable is lazy. As Benjamin Disraeli observed in Vivian Grey, “Man is not the creature of circumstances, circumstances are the creatures of men. We are free agents, and man is more powerful than matter.” And, as Rheingold suggests, believing that people can use technology to build a world that’s more just, fair, and inclusive isn’t merely defensible. It’s practically a moral imperative.”


Excerpted from Rewire: Digital Cosmopolitans in the Age of Connection by Ethan Zuckerman.