How to Start Thinking Like a Data Scientist


Thomas C. Redman in Harvard Business Review Blog: “Slowly but steadily, data are forcing their way into every nook and cranny of every industry, company, and job. Managers who aren’t data savvy, who can’t conduct basic analyses, interpret more complex ones, and interact with data scientists are already at a disadvantage. Companies without a large and growing cadre of data-savvy managers are similarly disadvantaged.
Fortunately, you don’t have to be a data scientist or a Bayesian statistician to tease useful insights from data. This post explores an exercise I’ve used for 20 years to help those with an open mind (and a pencil, paper, and calculator) get started. One post won’t make you data savvy, but it will help you become data literate, open your eyes to the millions of small data opportunities, and enable you to work a bit more effectively with data scientists, analytics, and all things quantitative.
While the exercise is very much a how-to, each step also illustrates an important concept in analytics — from understanding variation to visualization.
First, start with something that interests, even bothers, you at work, like consistently late-starting meetings. Whatever it is, form it up as a question and write it down: “Meetings always seem to start late. Is that really true?”
Next, think through the data that can help answer your question, and develop a plan for creating them. Write down all the relevant definitions and your protocol for collecting the data. For this particular example, you have to define when the meeting actually begins. Is it the time someone says, “Ok, let’s begin.”? Or the time the real business of the meeting starts? Does kibitzing count?
Now collect the data. It is critical that you trust the data. And, as you go, you’re almost certain to find gaps in data collection. You may find that even though a meeting has started, it starts anew when a more senior person joins in. Modify your definition and protocol as you go along.
Sooner than you think, you’ll be ready to start drawing some pictures. Good pictures make it easier for you to both understand the data and communicate main points to others. There are plenty of good tools to help, but I like to draw my first picture by hand. My go-to plot is a time-series plot, where the horizontal axis has the date and time and the vertical axis has the variable of interest. Thus, a point on the graph below is the date and time of a meeting versus the number of minutes late….”
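For readers who would rather try the exercise in code than with pencil and paper, here is a minimal sketch of the steps Redman describes: record scheduled and observed start times, compute minutes late, and draw a rough time-series picture. The observations, the function names, and the text-bar "plot" are all assumptions made for illustration; none of them come from the original post.

```python
from datetime import datetime
from statistics import mean, median

# Hypothetical observations: (scheduled start, observed start).
# Made-up values for illustration only; replace with data you collect yourself.
OBSERVATIONS = [
    ("2013-11-04 09:00", "2013-11-04 09:07"),
    ("2013-11-05 14:00", "2013-11-05 14:02"),
    ("2013-11-06 10:30", "2013-11-06 10:30"),
    ("2013-11-07 16:00", "2013-11-07 16:12"),
]

FMT = "%Y-%m-%d %H:%M"


def minutes_late(scheduled, actual):
    """Minutes between the scheduled and the observed start of a meeting."""
    delta = datetime.strptime(actual, FMT) - datetime.strptime(scheduled, FMT)
    return delta.total_seconds() / 60


def lateness_series(observations):
    """Return (scheduled datetime, minutes late) pairs sorted by time."""
    rows = [(datetime.strptime(s, FMT), minutes_late(s, a)) for s, a in observations]
    return sorted(rows)


def text_plot(series):
    """Crude time-series picture: one row per meeting, one '#' per minute late."""
    lines = []
    for when, late in series:
        bar = "#" * max(0, round(late))
        lines.append(f"{when:%Y-%m-%d %H:%M} | {bar} {late:.0f}")
    return "\n".join(lines)


if __name__ == "__main__":
    series = lateness_series(OBSERVATIONS)
    print(text_plot(series))
    delays = [late for _, late in series]
    print(f"mean {mean(delays):.1f} min, median {median(delays):.1f} min")
```

Whichever definition of "start" you settle on in your protocol determines what goes in the second column, so a change of definition means re-recording the data rather than adjusting the code.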

Google's Civic Information API: now connecting US users with their representatives


Jonathan Tomer, Software Engineer at Google Blog: “Many applications track and map governmental data, but few help their users identify the relevant local public officials. Too often local problems are divorced from the government institutions designed to help. Today, we’re launching new functionality in the Google Civic Information API that lets developers connect constituents to their federal, state, county and municipal elected officials—right down to the city council district.
The Civic Information API has already helped developers create apps for US elections that incorporate polling place and ballot information, from helping those affected by Superstorm Sandy find updated polling locations over SMS to learning more about local races through social networks. We want to support these developers in their work beyond elections, including everyday civic engagement.
In addition to elected representatives, the API also returns your political jurisdictions using Open Civic Data Identifiers. We worked with the Sunlight Foundation and other civic technology groups to create this new open standard to make it easier for developers to combine the Civic Information API with their datasets. For example, once you look up districts and representatives in the Civic Information API, you can match the districts up to historical election results published by Open Elections.
Developers can head over to the documentation to get started; be sure to check out the “Map Your Reps” sample application from Bow & Arrow to get a sense of what the API can do. You can also see the API in action today through new features from some of our partners, for example:

  • Change.org has implemented a new Decision Makers feature which allows users to direct a petition to their elected representative and lists that petition publicly on the representative’s profile page. As a result, the leader has better insight into the issues being discussed in their district, and a new channel to respond to constituents.
  • PopVox helps users share their opinions on bills with their Congressional Representatives in a meaningful format. PopVox uses the API to connect the user to the correct Congressional District. Because PopVox verifies that users are real constituents, the opinions shared with elected officials have more impact on the political process.

Over time, we will expand beyond US elected representatives and elections to other data types and places. We can’t grow without your help. As you use the API, please visit our Developer Forum to share your experiences and tell us how we can help you build the next generation of civic apps and services.”
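To make the Open Civic Data Identifier idea concrete, here is a small sketch of how an app might split such an identifier into its parts and use it as a join key between two datasets. The identifier layout ("ocd-division/" followed by type:value pairs) reflects our understanding of the Open Civic Data convention; the records, field values, and function names are hypothetical, and no call to the Civic Information API is made.

```python
def parse_ocd_id(ocd_id):
    """Split an OCD division ID into ordered (type, value) pairs."""
    prefix, *parts = ocd_id.split("/")
    if prefix != "ocd-division" or not parts:
        raise ValueError(f"not an OCD division ID: {ocd_id!r}")
    pairs = []
    for part in parts:
        kind, sep, value = part.partition(":")
        if not (sep and kind and value):
            raise ValueError(f"malformed segment {part!r} in {ocd_id!r}")
        pairs.append((kind, value))
    return pairs


def join_on_division(lookup, results):
    """Pair up records from two datasets that share the same division ID."""
    return {d: (lookup[d], results[d]) for d in lookup if d in results}


# Hypothetical placeholder records, keyed by division ID.
REPRESENTATIVES = {
    "ocd-division/country:us/state:ca/cd:12": "Representative A (placeholder)",
    "ocd-division/country:us/state:ca": "Governor B (placeholder)",
}
PAST_RESULTS = {
    "ocd-division/country:us/state:ca/cd:12": "placeholder result row for district 12",
}

if __name__ == "__main__":
    print(parse_ocd_id("ocd-division/country:us/state:ca/cd:12"))
    print(join_on_division(REPRESENTATIVES, PAST_RESULTS))
```

Because both datasets use the same identifier string, the join needs no fuzzy matching on district names, which is the practical benefit the post attributes to a shared standard.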

Four critiques of open data initiatives


Blog by Rob Kitchin: “The arguments concerning the benefits of open data are now reasonably well established and include contentions that open data lead to increased transparency and accountability with respect to public bodies and services; increase the efficiency and productivity of agencies and enhance their governance; promote public participation in decision making and social innovation; and foster economic innovation and job and wealth creation (Pollock 2006; Huijboom and Van der Broek 2011; Janssen 2012; Yiu 2012).
What is less well examined are the potential problems affecting, and negative consequences of, open data initiatives.  Consequently, as a provocation for Wednesday’s (Nov 13th, 4-6pm) Programmable City open data event I thought it might be useful to outline four critiques of open data, each of which deserves and demands critical attention: open data lacks a sustainable financial model; promotes a politics of the benign and empowers the empowered; lacks utility and usability; and facilitates the neoliberalisation and marketisation of public services.  These critiques do not suggest abandoning the move towards opening data, but contend that open data initiatives need to be much more mindful of what data are being made open, how data are made available, how they are being used, and how they are being funded.”

Concerns about opening up data, and responses which have proved effective


Google doc by Christopher Gutteridge, University of Southampton and Alexander Dutton, University of Oxford: “This document is inspired by the open data excuses bingo card. Someone asked which responses have proved effective. This document is a work in progress based on our experience. Carly Strasser has also written at the Data Pub blog about these issues from an Open Science and research data perspective. You may also be interested in How to make a business case for open data, published by the ODI.
We’ll get spam…
Terrorists might use the data…
People will contact us to ask about stuff…
People will misinterpret the data…
It’s too big…
It’s not very interesting…
We might want to use it in a research paper…
There’s no API to that system…
We’re worried about the Data Protection Act…
We’re not sure that we own it…
I don’t mind making it open, but I worry someone else might object…
It’s too complicated…
Our data is embarrassingly bad…
It’s not a priority and we’re busy…
Our lawyers want to make a custom license…
It changes too quickly…
There’s already a project in progress which sounds similar…
Some of what you asked for is confidential…
I don’t own the data, so can’t give you permission…
We don’t have that data…
That data is already published via (external organisation X)….
We can’t provide that dataset because one part is not possible…
What if something breaks and the open version becomes out of date?…
We can’t see the benefit…
What if we want to sell access to this data…?
If we publish this data, people might sue us…
We want people to come direct to us so we know why they want the data…

Open Government and Its Constraints


Blog entry by Panthea Lee: “Open government” is everywhere. Search the term and you’ll find OpenGovernment.org, OpenTheGovernment.org, Open Government Initiative, Open Gov Hub and the Open Gov Foundation; you’ll find open government initiatives for New York City, Boston, Kansas, Virginia, Tennessee and the list goes on; you’ll find dedicated open government plans for the White House, State Department, USAID, Treasury, Justice Department, Commerce, Energy and just about every other major federal agency. Even the departments of Defense and Homeland Security are in on open government.
And that’s just in the United States.
There is Open Government Africa, Open Government in the EU and Open Government Data. The World Bank has an Open Government Data Toolkit and recently announced a three-year initiative to help developing countries leverage open data. And this week, over 1,000 delegates from over 60 countries are in London for the annual meeting of the Open Government Partnership, which has grown from 8 to 60 member states in just two years….
Many of us have no consensus or clarity on just what exactly “open government” is, what we hope to achieve from it or how to measure our progress. Too often, our initiatives are designed through the narrow lenses of our own biases and without a concrete understanding of those they are intended for, both those in and out of government.
If we hope to realize the promise of more open governments, let’s be clear about the barriers we face so that we may start to overcome them.
Barrier 1: “Open Gov” is…?
Open government is… not new, for starters….
Barrier 2: Open Gov is Not Inclusive
The central irony of open government is that it’s often not “open” at all….
Barrier 3: Open Gov Lacks Empathy
Open government practitioners love to speak of “the citizen” and “the government.” But who exactly are these people? Too often, we don’t really know. We are builders, makers and creators with insufficient understanding of whom we are building, making and creating for… On the flip side, who do we mean by “the government?” And why, gosh darn it, is it so slow to innovate? Simply put, “the government” is comprised of individual people working in environments that are not conducive to innovation….
For open government to realize its potential, we must overcome these barriers.”

Mozilla Location Service: crowdsourcing data to help devices find your location without GPS


“The Mozilla Location Service is an experimental pilot project to provide geolocation lookups based on publicly observable cell tower and WiFi access point information. Currently in its early stages, it already provides basic service coverage of select locations thanks to our early adopters and contributors.
A world map showing areas with location data. Map data provided by mapbox / OpenStreetMap.
While many commercial services exist in this space, there’s currently no large public service to provide this crucial part of any mobile ecosystem. Mobile phones with a weak GPS signal and laptops without GPS hardware can use this service to quickly identify their approximate location. Even though the underlying data is based on publicly accessible signals, geolocation data is by its very nature personal and privacy sensitive. Mozilla is committed to improving the privacy aspects for all participants of this service offering.
If you want to help us build our service, you can install our dedicated Android MozStumbler and enjoy competing against others on our leaderboard or choose to contribute anonymously. The service is evolving rapidly, so expect to see a more full featured experience soon. For an overview of the current experience, you can head over to the blog of Soledad Penadés, who wrote a far better introduction than we did.
We welcome any ideas or concerns about this project and would love to hear any feedback or experience you might have. Please contact us either on our dedicated mailing list or come talk to us in our IRC room #geo on Mozilla’s IRC server.
For more information please follow the links on our project page.”
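As a rough illustration of how a lookup based on observed WiFi access points could work, here is a sketch that estimates a position as the signal-weighted centroid of access points whose locations are already known. This is a generic textbook-style approach chosen for illustration, not a description of how the Mozilla Location Service computes positions; the identifiers, coordinates, and function name are hypothetical.

```python
def estimate_position(observed, known):
    """Estimate (lat, lon) as a signal-weighted centroid.

    observed: {access_point_id: signal strength in dBm} seen by the device
    known:    {access_point_id: (lat, lon)} from previously collected data
    Returns None when none of the observed access points are known.
    """
    total = lat_sum = lon_sum = 0.0
    for ap_id, dbm in observed.items():
        if ap_id not in known:
            continue  # no stored position for this access point
        weight = 10 ** (dbm / 10)  # convert dBm to linear power (milliwatts)
        ap_lat, ap_lon = known[ap_id]
        lat_sum += weight * ap_lat
        lon_sum += weight * ap_lon
        total += weight
    if total == 0:
        return None
    return lat_sum / total, lon_sum / total


if __name__ == "__main__":
    # Hypothetical access points and made-up coordinates.
    known = {"ap-1": (52.5200, 13.4050), "ap-2": (52.5210, 13.4070)}
    observed = {"ap-1": -50, "ap-2": -70, "ap-unknown": -40}
    print(estimate_position(observed, known))
```

Averaging latitude and longitude directly is only reasonable over short distances; a real service would need to handle stale data, moving access points, and the privacy concerns the announcement raises.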

Scientific Humanities


New course by Bruno Latour: “Scientific humanities” means the extension of interpretative skills to the discoveries made by science and to technical innovations. The course will equip future citizens with the means to be at ease with many issues that straddle the distinctions between science, morality, politics and society.
The course provides concepts and methods to:

  • learn the basics of the field called “science and technology studies”, a vast corpus of literature developed over the last forty years to give a realistic description of knowledge production
  • handle the flood of different opinions about contentious issues and order the various positions by using the tools now available through digital media
  • comment on those different pieces of news in a more articulated way through a specifically designed blog.

Course format: the course is organized in 8 sequences. It displays multimedia content (images, video, original documents).
Bruno Latour was trained as a philosopher and an anthropologist. From 1982 to 2006, he was professor at the CSI (Ecole des mines) in Paris. He is now professor at Sciences Po, where he created the medialab in 2009. He became famous for his social studies of science and technology. He developed with others a widely known theory called “Actor Network Theory”.
http://www.bruno-latour.fr/

IRM releases United States report for public comment


“The Open Government Partnership’s Independent Reporting Mechanism (IRM) has launched its eighth progress report for public comment; this one is on the United States and can be found below….
The United States’ action plan was highly varied and, in many respects, ambitious and innovative and significant progress was made on most of the commitments. While OGP implementation in the United States drew inspiration from an unprecedented consultation on open government during the implementation of the 2009 Open Government Directive, the dedicated public consultation for the OGP action plan was more limited and arguably more targeted.
Several of the commitments in the action plan focused on improving transparency; however, open government progress has been relatively slower in controversial areas such as national security, ethics reform, declassification of documents, and Freedom of Information Act reform.
The United States completed half of the commitments in its action plan, while the other half saw limited or substantial progress.
Due to the nature of the US government, wherein federal agencies are to some degree independent of the White House, much of the best participation took place within agencies. There were several notable examples of participation and collaboration at this level, including the commitments around the Extractive Industries Transparency Initiative, the National Dialogue on Federal Website Policy, and NASA’s Space Apps competition.
This report is a draft for public comment.  All interested parties are encouraged to comment on this blog or to send public comments to [email protected] until November 14. Comments will be collated and published, except where the requestor asks to be anonymous. Where substantive factual errors are identified, comments will be integrated into a final version of the report.”
 

United States IRM Report

Residents remix their neighborhood’s streets through platform


Springwise: “City residents may not have degrees in urban planning, but their everyday use of high streets, parks and main roads means they have some valuable input into what’s best for their local environment. A new website called Streetmix is helping to empower citizens, enabling them to become architects with an easy-to-use street-building platform.
Developed by Code for America, the site greets users with a colorful cartoon representation of a typical street, split into segments of varying widths. Designers can then swap and change each piece into road, cycle paths, pedestrian areas, bus stops, bike racks and other amenities, as well as alter their dimensions. Users can create their own perfect high street or use the exact measurements of their own neighborhood to come up with new propositions for planned construction work. Indeed, Streetmix has already found use among residents and organizations to demonstrate how to better use the local space available. Kansas City’s Bike Walk KC has utilized the platform to show how new bike lanes could figure in an upcoming study of traffic flow in the region, while New Zealand’s Transport Blog has presented several alternatives to current street layouts in Auckland.
Streetmix is an easy-to-use visualization tool that can help amateurs present their ideas to local authorities in a more coherent way, potentially increasing the chances of politicians hearing calls for change. Are there other ways to help laymen express complex ideas more eloquently?”
Spotted by Murtaza Patel, written by Springwise

You Can Predict What Government Agencies Will Buy; For Real!


Jen Clement at GovLoop: “Two great free government-run websites that show how federal government agencies are spending their money are USASpending.gov and FedBizOpps.gov. Each site allows you to research how the government has spent its procurement dollars in the last several years, and can give business owners a snapshot of what industry segments and what type of commercial products and services offer the best contracting opportunities so vendors can conduct their target business analysis and approach a select group of potential buyers.

SmartProcure offers a unique service that allows you to search thousands and thousands of government purchase orders, giving you the ability to predict future purchasing opportunities. SmartProcure lets you search specifically for a product or service you sell and shows you exactly which government agencies have bought that product or service, how much they paid, and which vendors (your competitors) they’ve purchased from. In addition to purchasing histories, you’ll have access to powerful market analysis tools to help you conduct thorough competitive and market intelligence reviews to find the right niches for your business to take advantage of. Whether it is federal, state, or local governments, a snapshot into the past can help determine the future…
For more helpful tips visit:  https://ow133.infusionsoft.com/go/blog/jc/