AU: Govt finds one third of open data was "junk"


IT News: “The number of datasets available on the Government’s open data website has slimmed by more than half after the agency discovered one third of the datasets were junk.
Since its official launch in 2011 data.gov.au grew to hold 1200 datasets from government agencies for public consumption.
In July this year the Department of Finance migrated the portal to a new open source platform – the Open Knowledge Foundation CKAN platform – for greater ease of use and publishing ability.
Since July the number of datasets fell from 1200 to 500.
Australian Government CTO John Sheridan said in his blog late yesterday the agency had needed to review the 1200 datasets as a result of the CKAN migration, and discovered a significant number of them were junk.
“We unfortunately found that a third of the “datasets” were just links to webpages or files that either didn’t exist anymore, or redirected somewhere not useful to genuine seekers of data,” Sheridan said.
“In the second instance, the original 1200 number included each individual file. On the new platform, a dataset may have multiple files. In one case we have a dataset with 200 individual files where before it was counted as 200 datasets.”
The number of datasets following the clean out now sits at 529. Around 123 government bodies contributed data to the portal.
Sheridan said the number was still too low.
“A lot of momentum has built around open data in Australia, including within governments around the country and we are pleased to report that a growing number of federal agencies are looking at how they can better publish data to be more efficient, improve policy development and analysis, deliver mobile services and support greater transparency and public innovation,” he said….
The Federal Government’s approach to open data has previously been criticised as “patchy” and slow, due in part to several shortcomings in the data.gov.au website as well as slow progress in agencies adopting an open approach by default.
The Australian Information Commissioner’s February report on open data in government outlined the manual uploading and updating of datasets, lack of automated entry for metadata and a lack of specific search functions within data.gov.au as obstacles affecting the efforts pushing a whole-of-government approach to open data.
The introduction of the new CKAN platform is expected to go some way to addressing the highlighted concerns.”
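
The two effects Sheridan describes, dead links being dropped and individual files being regrouped into multi-file datasets, can be illustrated with a small sketch. The record layout, field names and sample rows below are hypothetical; they are not taken from data.gov.au or from the CKAN API.

```python
from collections import defaultdict

# Hypothetical catalogue rows: one row per file, the way the old portal counted.
# "link_ok" stands in for the outcome of a manual or automated link check.
old_entries = [
    {"title": "Rainfall 2010", "collection": "Rainfall", "link_ok": True},
    {"title": "Rainfall 2011", "collection": "Rainfall", "link_ok": True},
    {"title": "Rainfall 2012", "collection": "Rainfall", "link_ok": True},
    {"title": "Agency homepage", "collection": "Agency homepage", "link_ok": False},
    {"title": "Bus stops", "collection": "Bus stops", "link_ok": True},
    {"title": "Old report", "collection": "Old report", "link_ok": False},
]


def regroup(entries):
    """Drop entries with dead links, then group the remaining files by collection."""
    datasets = defaultdict(list)
    for entry in entries:
        if entry["link_ok"]:
            datasets[entry["collection"]].append(entry["title"])
    return dict(datasets)


datasets = regroup(old_entries)
old_count = len(old_entries)   # every file counted as its own "dataset"
new_count = len(datasets)      # one dataset may now hold several files
junk_share = sum(not e["link_ok"] for e in old_entries) / old_count

print(f"old count: {old_count}, new count: {new_count}, junk share: {junk_share:.0%}")
```

Both effects shrink the headline number without any usable data being lost, which is why the old and new counts are not directly comparable.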

Continued Progress: Engaging Citizen Solvers through Prizes


Blog post by Cristin Dorgelo: “Today OSTP released its second annual comprehensive report detailing the use of prizes and competitions by Federal agencies to spur innovation and solve Grand Challenges. Those efforts have expanded in the last two years under the America COMPETES Reauthorization Act of 2010, which granted all Federal agencies the authority to conduct prize competitions to spur innovation, solve tough problems, and advance their core missions.
This year’s report details the remarkable benefits the Federal Government reaped in Fiscal Year (FY) 2012 from more than 45 prize competitions across 10 agencies. To date, nearly 300 prize competitions have been implemented by 45 agencies through the website Challenge.gov.
Over the past four years, the Obama Administration has taken important steps to make prizes a standard tool in every agency’s toolbox. In his September 2009 Strategy for American Innovation, President Obama called on all Federal agencies to increase their use of prizes to address some of our Nation’s most pressing challenges. In March 2010, the Office of Management and Budget issued a policy framework to guide agencies in using prizes to mobilize American ingenuity and advance their respective core missions. Then, in September 2010, the Administration launched Challenge.gov, a one-stop shop where entrepreneurs and citizen solvers can find public-sector prize competitions.
The prize authority in COMPETES is a key piece of this effort. By giving agencies a clear legal path and expanded authority to deploy competitions and challenges, the legislation makes it dramatically easier for agencies to enlist this powerful approach to problem-solving and to pursue ambitious prizes with robust incentives…
To support these ongoing efforts, the General Services Administration continues to train agencies about resources and vendors available to help them administer prize competitions. In addition, NASA’s Center of Excellence for Collaborative Innovation (CoECI) provides other agencies with a full suite of services for incentive prize pilots – from prize design, through implementation, to post-prize evaluation.”

Google Global Impact Award Expands Zooniverse


Press Release: “A $1.8 million Google Global Impact Award will enable Zooniverse, a nonprofit collaboration led by the Adler Planetarium and the University of Oxford, to make setting up a citizen science project as easy as starting a blog and could lead to thousands of innovative new projects around the world, accelerating the pace of scientific research.
The award supports the further development of the Zooniverse, the world’s leading ‘citizen science’ platform, which has already given more than 900,000 online volunteers the chance to contribute to science by taking part in activities including discovering planets, classifying plankton or searching through old ship’s logs for observations of interest to climate scientists. As part of the Global Impact Award, the Adler will receive $400,000 to support the Zooniverse platform.
With the Google Global Impact Award, Zooniverse will be able to rebuild their platform so that research groups with no web development expertise can build and launch their own citizen science projects.
“We are entering a new era of citizen science – this effort will enable prolific development of science projects in which hundreds of thousands of additional volunteers will be able to work alongside professional scientists to conduct important research – the potential for discovery is limitless,” said Michelle B. Larson, Ph.D., Adler Planetarium president and CEO. “The Adler is honored to join its fellow Zooniverse partner, the University of Oxford, as a Google Global Impact Award recipient.”
The Zooniverse – the world’s leading citizen science platform – is a global collaboration across several institutions that design and build citizen science projects. The Adler is a founding partner of the Zooniverse, which has already engaged more than 900,000 online volunteers as active scientists by discovering planets, mapping the surface of Mars and detecting solar flares. Adler-directed citizen science projects include: Galaxy Zoo (astronomy), Solar Stormwatch (solar physics), Moon Zoo (planetary science), Planet Hunters (exoplanets) and The Milky Way Project (star formation). The Zooniverse (zooniverse.org) also includes projects in environmental, biological and medical sciences. Google’s investment in the Adler and its Zooniverse partner, the University of Oxford, will further the global reach, making thousands of new projects possible.”

Where Do You Want to Dwell?


Stephen Buckner @ Census Blog:  “New Census Mobile App Showcases Local Statistics for People on the Go
America has always been a nation on the move. Whether you are looking for a career change or a new neighborhood to call home, life decisions affect each of us every day. With roughly half of Americans now owning smartphones, everyone should be able to access the wealth of statistics the Census Bureau collects to make informed decisions on the go, whether at home or on the road. What good are data if nobody but the experts can easily access them? The Census Bureau uses 21st century technology to meet its centuries-old mission, making the statistics that define our growing, changing nation more accessible to the public than ever before.
The Census Bureau’s new mobile app, dwellr, provides those on the go with immediate, personalized access to the latest demographic, socio-economic and housing statistics from the American Community Survey for neighborhoods across the nation. Using the level of importance you place on a location’s characteristics, the app generates a list of top 25 towns or cities most suitable for you. Once you have used the app, it saves your selections on your phone so you can see how they match up against each new place you visit.
With more than 30 million Americans moving last year, dwellr allows for quick and easy access to information to help make the decision, including the ages of residents, how many families have children, median income and housing costs. Dwellr allows Apple and Android smartphone users to explore a range of questions making it a powerful tool for homebuyers, members of the military being deployed domestically, real estate agents, new businesses and teachers helping students learn about their communities….
The app is just the latest product from the Census Bureau’s digital transformation and provides statistics to more Americans in a new and user-friendly way. It follows the successful release of our hugely popular America’s Economy mobile app, which now has more than 100,000 downloads. Coming soon, you will see an upgraded census.gov website with enhanced search and navigation features that are based on several years of customer feedback. We continue to open up more of our data to developers as part of our API, including 30 years of decennial statistics, in addition to the American Community Survey statistics that power dwellr.
As we continue to align ourselves with the Digital Government Strategy, our free mobile apps are just one way we are making our statistics available anytime, anywhere, and on nearly any device.”
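
The ranking step described in the post, where the importance a user places on each characteristic yields a list of best-matching places, could be approximated with a weighted score. This is only an illustrative sketch: the post does not describe dwellr's actual method, and the place names, characteristics, values and weights below are all invented.

```python
# Hypothetical places, with each characteristic already scaled to the 0..1 range.
places = {
    "Town A": {"affordability": 0.9, "families_with_children": 0.4, "median_income": 0.5},
    "Town B": {"affordability": 0.3, "families_with_children": 0.8, "median_income": 0.9},
    "Town C": {"affordability": 0.6, "families_with_children": 0.7, "median_income": 0.6},
}


def rank_places(places, weights, top_n=25):
    """Return up to top_n (name, score) pairs, best match first.

    The score is a weighted average of the characteristics, so it stays in 0..1.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one characteristic needs a positive weight")
    scored = []
    for name, traits in places.items():
        score = sum(w * traits.get(key, 0.0) for key, w in weights.items()) / total
        scored.append((name, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# A user who cares mostly about affordability (weights are assumptions).
weights = {"affordability": 3, "families_with_children": 1, "median_income": 1}
ranking = rank_places(places, weights)

for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Changing the weights reorders the list, which is the behaviour the post attributes to the app's importance settings.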

How to Start Thinking Like a Data Scientist


Thomas C. Redman in Harvard Business Review Blog: “Slowly but steadily, data are forcing their way into every nook and cranny of every industry, company, and job. Managers who aren’t data savvy, who can’t conduct basic analyses, interpret more complex ones, and interact with data scientists are already at a disadvantage. Companies without a large and growing cadre of data-savvy managers are similarly disadvantaged.
Fortunately, you don’t have to be a data scientist or a Bayesian statistician to tease useful insights from data. This post explores an exercise I’ve used for 20 years to help those with an open mind (and a pencil, paper, and calculator) get started. One post won’t make you data savvy, but it will help you become data literate, open your eyes to the millions of small data opportunities, and enable you to work a bit more effectively with data scientists, analytics, and all things quantitative.
While the exercise is very much a how-to, each step also illustrates an important concept in analytics — from understanding variation to visualization.
First, start with something that interests, even bothers, you at work, like consistently late-starting meetings. Whatever it is, form it up as a question and write it down: “Meetings always seem to start late. Is that really true?”
Next, think through the data that can help answer your question, and develop a plan for creating them. Write down all the relevant definitions and your protocol for collecting the data. For this particular example, you have to define when the meeting actually begins. Is it the time someone says, “Ok, let’s begin.”? Or the time the real business of the meeting starts? Does kibitzing count?
Now collect the data. It is critical that you trust the data. And, as you go, you’re almost certain to find gaps in data collection. You may find that even though a meeting has started, it starts anew when a more senior person joins in. Modify your definition and protocol as you go along.
Sooner than you think, you’ll be ready to start drawing some pictures. Good pictures make it easier for you to both understand the data and communicate main points to others. There are plenty of good tools to help, but I like to draw my first picture by hand. My go-to plot is a time-series plot, where the horizontal axis has the date and time and the vertical axis has the variable of interest. Thus, a point on the graph is the date and time of a meeting versus the number of minutes late….”
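
For readers who would rather script the exercise than use pencil and paper, the steps above can be sketched as follows. The observations are made up, and "start" is assumed to mean the moment someone says the meeting has begun; a real protocol would record whichever definition was written down.

```python
from datetime import datetime
from statistics import mean, median

FMT = "%Y-%m-%d %H:%M"

# Hypothetical observations: (scheduled start, actual start).
observations = [
    ("2013-11-04 09:00", "2013-11-04 09:07"),
    ("2013-11-05 14:00", "2013-11-05 14:02"),
    ("2013-11-06 10:30", "2013-11-06 10:30"),
    ("2013-11-07 16:00", "2013-11-07 16:12"),
    ("2013-11-08 11:00", "2013-11-08 11:04"),
]


def minutes_late(scheduled, actual):
    """Minutes between the scheduled and the actual start of a meeting."""
    delta = datetime.strptime(actual, FMT) - datetime.strptime(scheduled, FMT)
    return delta.total_seconds() / 60


series = [(sched, minutes_late(sched, actual)) for sched, actual in observations]

# Crude text version of the time-series plot, turned on its side:
# one row per meeting in date order, one '#' per minute late.
for sched, late in sorted(series):
    print(f"{sched}  {'#' * int(late):<15} {late:.0f} min")

lateness = [late for _, late in series]
print(f"mean {mean(lateness):.1f} min, median {median(lateness):.1f} min")
```

Even this rough picture shows the variation between meetings, which a single average would hide.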

Google's Civic Information API: now connecting US users with their representatives


Jonathan Tomer, Software Engineer at Google Blog: “Many applications track and map governmental data, but few help their users identify the relevant local public officials. Too often local problems are divorced from the government institutions designed to help. Today, we’re launching new functionality in the Google Civic Information API that lets developers connect constituents to their federal, state, county and municipal elected officials—right down to the city council district.
The Civic Information API has already helped developers create apps for US elections that incorporate polling place and ballot information, from helping those affected by Superstorm Sandy find updated polling locations over SMS to learning more about local races through social networks. We want to support these developers in their work beyond elections, including everyday civic engagement.
In addition to elected representatives, the API also returns your political jurisdictions using Open Civic Data Identifiers. We worked with the Sunlight Foundation and other civic technology groups to create this new open standard to make it easier for developers to combine the Civic Information API with their datasets. For example, once you look up districts and representatives in the Civic Information API, you can match the districts up to historical election results published by Open Elections.
Developers can head over to the documentation to get started; be sure to check out the “Map Your Reps” sample application from Bow & Arrow to get a sense of what the API can do. You can also see the API in action today through new features from some of our partners, for example:

  • Change.org has implemented a new Decision Makers feature which allows users to direct a petition to their elected representative and lists that petition publicly on the representative’s profile page. As a result, the leader has better insight into the issues being discussed in their district, and a new channel to respond to constituents.
  • PopVox helps users share their opinions on bills with their Congressional Representatives in a meaningful format. PopVox uses the API to connect the user to the correct Congressional District. Because PopVox verifies that users are real constituents, the opinions shared with elected officials have more impact on the political process.

Over time, we will expand beyond US elected representatives and elections to other data types and places. We can’t grow without your help. As you use the API, please visit our Developer Forum to share your experiences and tell us how we can help you build the next generation of civic apps and services.”
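
The matching on Open Civic Data Identifiers mentioned in the post can be sketched without calling the API at all. The identifier format (`ocd-division/country:us/state:ny/cd:12`) follows the Open Civic Data division scheme; everything else here, including both dictionaries and their values, is a hypothetical stand-in for an API response and an election-results dataset.

```python
def parse_ocd_id(ocd_id):
    """Split an OCD division ID into a {type: value} dict, e.g. {'state': 'ny'}."""
    prefix, *parts = ocd_id.split("/")
    if prefix != "ocd-division" or not parts:
        raise ValueError(f"not an OCD division ID: {ocd_id!r}")
    pairs = {}
    for part in parts:
        kind, sep, value = part.partition(":")
        if not (kind and sep and value):
            raise ValueError(f"malformed segment {part!r} in {ocd_id!r}")
        pairs[kind] = value
    return pairs


# Hypothetical stand-in for divisions returned by a representative lookup.
divisions_from_lookup = {
    "ocd-division/country:us/state:ny": "placeholder state officials",
    "ocd-division/country:us/state:ny/cd:12": "placeholder House member",
}

# Hypothetical historical results keyed by the same identifiers.
historical_results = {
    "ocd-division/country:us/state:ny/cd:12": {"year": 2012, "turnout": 0.58},
    "ocd-division/country:us/state:ca/cd:5": {"year": 2012, "turnout": 0.61},
}

# Because both sources share the identifier, the join is a plain key match.
matched = {
    ocd_id: (officials, historical_results[ocd_id])
    for ocd_id, officials in divisions_from_lookup.items()
    if ocd_id in historical_results
}

for ocd_id, (officials, result) in matched.items():
    print(parse_ocd_id(ocd_id), officials, result)
```

A shared identifier removes the need to reconcile district names or boundaries by hand, which is the benefit the post attributes to the new standard.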

Four critiques of open data initiatives


Blog by Rob Kitchin: “The arguments concerning the benefits of open data are now reasonably well established and include contentions that open data lead to increased transparency and accountability with respect to public bodies and services; increase the efficiency and productivity of agencies and enhance their governance; promote public participation in decision making and social innovation; and foster economic innovation and job and wealth creation (Pollock 2006; Huijboom and Van der Broek 2011; Janssen 2012; Yiu 2012).
What is less well examined are the potential problems affecting, and negative consequences of, open data initiatives.  Consequently, as a provocation for Wednesday’s (Nov 13th, 4-6pm) Programmable City open data event I thought it might be useful to outline four critiques of open data, each of which deserves and demands critical attention: open data lacks a sustainable financial model; promotes a politics of the benign and empowers the empowered; lacks utility and usability; and facilitates the neoliberalisation and marketisation of public services.  These critiques do not suggest abandoning the move towards opening data, but contend that open data initiatives need to be much more mindful of what data are being made open, how data are made available, how they are being used, and how they are being funded.”

Concerns about opening up data, and responses which have proved effective


Google doc by Christopher Gutteridge, University of Southampton and Alexander Dutton, University of Oxford: “This document is inspired by the open data excuses bingo card. Someone asked which responses have proved effective. This document is a work in progress based on our experience. Carly Strasser has also written at the Data Pub blog about these issues from an Open Science and research data perspective. You may also be interested in How to make a business case for open data, published by the ODI.
We’ll get spam…
Terrorists might use the data…
People will contact us to ask about stuff…
People will misinterpret the data…
It’s too big…
It’s not very interesting…
We might want to use it in a research paper…
There’s no API to that system…
We’re worried about the Data Protection Act…
We’re not sure that we own it…
I don’t mind making it open, but I worry someone else might object…
It’s too complicated…
Our data is embarrassingly bad…
It’s not a priority and we’re busy…
Our lawyers want to make a custom license…
It changes too quickly…
There’s already a project in progress which sounds similar…
Some of what you asked for is confidential…
I don’t own the data, so can’t give you permission…
We don’t have that data…
That data is already published via (external organisation X)….
We can’t provide that dataset because one part is not possible…
What if something breaks and the open version becomes out of date?…
We can’t see the benefit…
What if we want to sell access to this data…?
If we publish this data, people might sue us…
We want people to come direct to us so we know why they want the data…

Open Government and Its Constraints


Blog entry by Panthea Lee: “Open government” is everywhere. Search the term and you’ll find OpenGovernment.org, OpenTheGovernment.org, Open Government Initiative, Open Gov Hub and the Open Gov Foundation; you’ll find open government initiatives for New York City, Boston, Kansas, Virginia, Tennessee and the list goes on; you’ll find dedicated open government plans for the White House, State Department, USAID, Treasury, Justice Department, Commerce, Energy and just about every other major federal agency. Even the departments of Defense and Homeland Security are in on open government.
And that’s just in the United States.
There is Open Government Africa, Open Government in the EU and Open Government Data. The World Bank has an Open Government Data Toolkit and recently announced a three-year initiative to help developing countries leverage open data. And this week, over 1,000 delegates from over 60 countries are in London for the annual meeting of the Open Government Partnership, which has grown from 8 to 60 member states in just two years….
Many of us have no consensus or clarity on just what exactly “open government” is, what we hope to achieve from it or how to measure our progress. Too often, our initiatives are designed through the narrow lenses of our own biases and without a concrete understanding of those they are intended for, both those in and out of government.
If we hope to realize the promise of more open governments, let’s be clear about the barriers we face so that we may start to overcome them.
Barrier 1: “Open Gov” is…?
Open government is… not new, for starters….
Barrier 2: Open Gov is Not Inclusive
The central irony of open government is that it’s often not “open” at all….
Barrier 3: Open Gov Lacks Empathy
Open government practitioners love to speak of “the citizen” and “the government.” But who exactly are these people? Too often, we don’t really know. We are builders, makers and creators with insufficient understanding of whom we are building, making and creating for… On the flip side, who do we mean by “the government?” And why, gosh darn it, is it so slow to innovate? Simply put, “the government” is comprised of individual people working in environments that are not conducive to innovation….
For open government to realize its potential, we must overcome these barriers.”

Mozilla Location Service: crowdsourcing data to help devices find your location without GPS


“The Mozilla Location Service is an experimental pilot project to provide geolocation lookups based on publicly observable cell tower and WiFi access point information. Currently in its early stages, it already provides basic service coverage of select locations thanks to our early adopters and contributors.
[Figure: a world map showing areas with location data. Map data provided by Mapbox / OpenStreetMap.]
While many commercial services exist in this space, there’s currently no large public service to provide this crucial part of any mobile ecosystem. Mobile phones with a weak GPS signal and laptops without GPS hardware can use this service to quickly identify their approximate location. Even though the underlying data is based on publicly accessible signals, geolocation data is by its very nature personal and privacy sensitive. Mozilla is committed to improving the privacy aspects for all participants of this service offering.
If you want to help us build our service, you can install our dedicated Android MozStumbler and enjoy competing against others on our leaderboard or choose to contribute anonymously. The service is evolving rapidly, so expect to see a more full featured experience soon. For an overview of the current experience, you can head over to the blog of Soledad Penadés, who wrote a far better introduction than we did.
We welcome any ideas or concerns about this project and would love to hear any feedback or experience you might have. Please contact us either on our dedicated mailing list or come talk to us in our IRC room #geo on Mozilla’s IRC server.
For more information please follow the links on our project page.”