Jason Brick at PSFK: “Computer time and human hours are among the biggest obstacles to progress in the fight against cancer. Researchers have terabytes of data, but only so many processors and people with which to analyze it. Much like the SETI program (Search for Extra Terrestrial Intelligence), it’s likely that big answers are already in the information we’ve collected. They’re just waiting for somebody to find them.
Reverse the Odds, a free mobile game from Cancer Research UK, accesses the combined resources of geeks and gamers worldwide. It’s a simple app game, the kind you play in line at the bank or while waiting at the dentist’s office, in which you complete mini puzzles and buy upgrades to save an imaginary world.
Each puzzle of the game is a repurposing of cancer data. Players find patterns in the data — the exact kind of analysis grad students and volunteers in a lab look for — and the results get compiled by Cancer Research UK for use in finding a cure. Errors are expected and accounted for, because with thousands of players analyzing the same data, the occasional mistake gets rounded out….(More)”
Launching Disasters.Data.Gov
OSTP Blog: “Strengthening our Nation’s resilience to disasters is a shared responsibility, with all community members contributing their unique skills and perspectives. Whether you’re a data steward who can unlock information and foster a culture of open data, an innovator who can help address disaster preparedness challenges, or a volunteer ready to join the “Innovation for Disasters” movement, we are excited for you to visit the new disasters.data.gov site, launching today.
First previewed at the White House Innovation for Disaster Response and Recovery Initiative Demo Day, disasters.data.gov is designed to be a public resource to foster collaboration and the continual improvement of disaster-related open data, free tools, and new ways to empower first responders, survivors, and government officials with the information needed in the wake of a disaster.
A screenshot from the new disasters.data.gov web portal.
Today, the Administration is unveiling the first in a series of Innovator Challenges that highlight pressing needs from the disaster preparedness community. The inaugural Innovator Challenge focuses on a need identified from firsthand experience of local emergency management, responders, survivors, and Federal departments and agencies. The challenge asks innovators across the nation: “How might we leverage real-time sensors, open data, social media, and other tools to help reduce the number of fatalities from flooding?”
In addition to this first Innovator Challenge, here are some highlights from disasters.data.gov:….(More)”
How Government Can Unlock Economic Benefits from Open Data
Zillow is a prime example of how open data creates economic value. The Seattle-based company has grown rapidly since its launch in 2006, generating more than $78 million in revenue in its last financial quarter and employing more than 500 workers. But real estate firms aren’t the only businesses benefiting from data collected and published by government.
GovLab, a research laboratory run by New York University, publishes the Open Data 500, a list of companies that benefit from open data produced by the federal government. The list contains more than 15 categories of businesses, ranging from health care and education to energy, finance, legal and the environment. And the data flows from all the major agencies, including NASA, Defense, Transportation, Homeland Security and Labor….
Zillow’s road to success underscores the challenges that lie ahead if local government is going to grab its share of open data’s economic bonanza. One of the company’s biggest hurdles was to create a system that could integrate government data from thousands of databases in county government. “There’s no standard format, which is very frustrating,” Stan Humphries, Zillow’s chief economist, told Computerworld.com. “It’s up to us to figure out 3,000 different ways to ingest data and make sense of it…. More at GovTech”
An Introduction to the Economic Analysis of Open Data
Research Note by Soichiro Takagi: “Open data generally refers to a movement in which public organizations provide data in a machine-readable format to the public, so that anyone can reuse the data. Open data is becoming an important phenomenon in Japan. At this moment, utilization of open data in Japan is emerging with collaborative efforts among small units of production such as individuals. These collaborations have been also observed in the Open Source Software (OSS) movement, but collaboration in open data is somewhat different in respect to small-scale, distributed collaboration. The aim of this research note is to share the phenomena of open data as an object of economic analysis with readers by describing the movement and providing a preliminary analysis. This note discusses how open data is associated with mass collaboration from the viewpoint of organizational economics. It also provides the results of empirical analysis on how the regional characteristics of municipalities affect the decision of local governments to conduct open data initiatives.”
The Free 'Big Data' Sources Everyone Should Know
Bernard Marr at Linkedin Pulse: “…The moves by companies and governments to put large amounts of information into the public domain have made large volumes of data accessible to everyone….here’s my rundown of some of the best free big data sources available today.
Data.gov
The US Government pledged last year to make all government data available freely online. This site is the first stage and acts as a portal to all sorts of amazing information on everything from climate to crime. To check it out, click here.
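Much of the catalog behind Data.gov can also be searched programmatically through a CKAN-style JSON API. As a minimal sketch (the endpoint path and response shape follow CKAN's conventions and are assumptions here, not details from the article), a search and a parser for the result list might look like this; a trimmed sample payload stands in for a live response:

```python
import json
from urllib.parse import urlencode

# Assumed CKAN Action API endpoint on the Data.gov catalog.
BASE = "https://catalog.data.gov/api/3/action/package_search"

def build_search_url(query, rows=5):
    """Build a CKAN package_search URL for a free-text query."""
    return BASE + "?" + urlencode({"q": query, "rows": rows})

def extract_titles(payload):
    """Pull dataset titles out of a CKAN package_search JSON response."""
    return [pkg["title"] for pkg in payload["result"]["results"]]

# Trimmed-down sample of the CKAN response shape, so the parser can be
# exercised without a network call.
sample = json.loads("""
{"success": true,
 "result": {"count": 2,
            "results": [{"title": "Climate Data Online"},
                        {"title": "Crime Incidents 2014"}]}}
""")

print(build_search_url("climate"))
print(extract_titles(sample))
```

In a real session the URL would be fetched and the live JSON fed to the same parser.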
US Census Bureau
A wealth of information on the lives of US citizens covering population data, geographic data and education. To check it out, click here.
European Union Open Data Portal
Like the above, but based on data from European Union institutions. To check it out, click here.
Data.gov.uk
Data from the UK Government, including the British National Bibliography – metadata on all UK books and publications since 1950. To check it out, click here.
The CIA World Factbook
Information on history, population, economy, government, infrastructure and military of 267 countries. To check it out, click here.
Healthdata.gov
125 years of US healthcare data including claim-level Medicare data, epidemiology and population statistics. To check it out, click here.
NHS Health and Social Care Information Centre
Health data sets from the UK National Health Service. To check it out, click here.
Amazon Web Services public datasets
Huge resource of public data, including the 1000 Genomes Project, an attempt to build the most comprehensive database of human genetic information, and NASA’s database of satellite imagery of Earth. To check it out, click here.
Facebook Graph
Although much of the information on users’ Facebook profile is private, a lot isn’t – Facebook provide the Graph API as a way of querying the huge amount of information that its users are happy to share with the world (or can’t hide because they haven’t worked out how the privacy settings work). To check it out, click here.
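The Graph API is queried with plain HTTPS GET requests against node paths, with the fields you want and an access token passed as query parameters. A small sketch of how such a request URL is assembled (the node name and token here are stand-ins, not real credentials):

```python
from urllib.parse import urlencode

GRAPH_BASE = "https://graph.facebook.com"

def graph_url(node, fields, access_token):
    """Build a Graph API GET URL for a node, requesting specific fields."""
    params = urlencode({"fields": ",".join(fields),
                        "access_token": access_token})
    return f"{GRAPH_BASE}/{node}?{params}"

# "me" refers to the authenticated user; EXAMPLE_TOKEN is a placeholder.
print(graph_url("me", ["id", "name", "likes"], "EXAMPLE_TOKEN"))
```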
Gapminder
Compilation of data from sources including the World Health Organization and World Bank covering economic, medical and social statistics from around the world. To check it out, click here.
Google Trends
Statistics on search volume (as a proportion of total search) for any given term, since 2004. To check it out, click here.
Google Finance
40 years’ worth of stock market data, updated in real time. To check it out, click here.
Google Books Ngrams
Search and analyze the full text of any of the millions of books digitised as part of the Google Books project. To check it out, click here.
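Besides the interactive viewer, Google publishes the raw ngram counts as downloadable tab-separated files. Assuming the version-2 export format (ngram, year, match_count, volume_count per line — treat the exact column layout as an assumption), a sketch of summing total occurrences per ngram:

```python
from collections import defaultdict

def total_matches(lines):
    """Sum match_count per ngram across years from raw Ngrams TSV lines.

    Assumed line format (Ngrams v2 export):
    ngram<TAB>year<TAB>match_count<TAB>volume_count
    """
    totals = defaultdict(int)
    for line in lines:
        ngram, year, match_count, volume_count = line.rstrip("\n").split("\t")
        totals[ngram] += int(match_count)
    return dict(totals)

# Invented sample lines in the assumed format.
sample = [
    "open data\t2004\t120\t15\n",
    "open data\t2005\t340\t40\n",
    "big data\t2005\t12\t3\n",
]
print(total_matches(sample))
```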
National Climatic Data Center
Huge collection of environmental, meteorological and climate data sets from the US National Climatic Data Center. The world’s largest archive of weather data. To check it out, click here.
DBPedia
Wikipedia comprises millions of pieces of data, structured and unstructured, on every subject under the sun. DBPedia is an ambitious project to catalogue and create a public, freely distributable database allowing anyone to analyze this data. To check it out, click here.
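DBpedia exposes its extracted data through a public SPARQL endpoint, and results come back in the standard SPARQL 1.1 JSON results format. A sketch of building such a request and flattening the response (the sample payload is trimmed and hard-coded so no network call is needed; the query uses DBpedia's `dbo:` ontology prefix as an assumption):

```python
import json
from urllib.parse import urlencode

ENDPOINT = "https://dbpedia.org/sparql"

QUERY = """
SELECT ?city ?population WHERE {
  ?city a dbo:City ;
        dbo:populationTotal ?population .
} LIMIT 3
"""

def endpoint_url(query):
    """Build a GET URL asking the endpoint for JSON-formatted results."""
    return ENDPOINT + "?" + urlencode(
        {"query": query, "format": "application/sparql-results+json"})

def bindings_to_rows(payload):
    """Flatten SPARQL JSON results into plain dicts of variable -> value."""
    return [{var: cell["value"] for var, cell in row.items()}
            for row in payload["results"]["bindings"]]

# Trimmed sample in the standard SPARQL 1.1 JSON results shape.
sample = json.loads("""
{"head": {"vars": ["city", "population"]},
 "results": {"bindings": [
   {"city": {"type": "uri", "value": "http://dbpedia.org/resource/Berlin"},
    "population": {"type": "literal", "value": "3520031"}}]}}
""")

print(bindings_to_rows(sample))
```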
Topsy
Free, comprehensive social media data is hard to come by – after all, their data is what generates profits for the big players (Facebook, Twitter, etc.), so they don’t want to give it away. However, Topsy provides a searchable database of public tweets going back to 2006, as well as several tools to analyze the conversations. To check it out, click here.
Likebutton
Mines Facebook’s public data – globally and from your own network – to give an overview of what people “Like” at the moment. To check it out, click here.
New York Times
Searchable, indexed archive of news articles going back to 1851. To check it out, click here.
Freebase
A community-compiled database of structured data about people, places and things, with over 45 million entries. To check it out, click here.
Million Song Data Set
Metadata on over a million songs and pieces of music. Part of Amazon Web Services. To check it out, click here.”
See also Bernard Marr‘s blog at Big Data Guru
4 Tech Trends Changing How Cities Operate
Governing: “Louis Brandeis famously characterized states as laboratories for democracy, but cities could be called labs for innovation or new practices….When Government Technology magazine (produced by Governing’s parent company, e.Republic, Inc.) published its annual Digital Cities Survey, the results provided an interesting look at how local governments are using technology to improve how they deliver services, increase production and streamline operations…the survey also showed four technology trends changing how local government operates and serves its citizens:
1. Open Data
…Big cities were the first to open up their data and gained national attention for their transparency. New York City, which passed an open data law in 2012, leads all cities with more than 1,300 data sets open to the public; Chicago started opening up data to the public in 2010 following an executive order and is second among cities with more than 600; and San Francisco, which was the first major city to open the doors to transparency in 2009, had the highest score from the U.S. Open Data Census for the quality of its open data.
But the survey shows that a growing number of mid-sized jurisdictions are now getting involved, too. Tacoma, Wash., has a portal with 40 data sets that show how the city is spending tax dollars on public works, economic development, transportation and public safety. Ann Arbor, Mich., has a financial transparency tool that reveals what the city is spending on a daily basis, in some cases….
2. ‘Stat’ Programs and Data Analytics
…First, the so-called “stat” programs are proliferating. Started by the New York Police Department in the 1990s, CompStat was a management technique that merged data with staff feedback to drive better performance by police officers and precinct captains. Its success led to many imitations over the years and, as the digital survey shows, stat programs continue to grow in importance. For example, Louisville has used its “LouieStat” program to cut the city’s bill for unscheduled employee overtime by $23 million as well as to spot weaknesses in performance.
Second, cities are increasing their use of data analytics to measure and improve performance. Denver, Jacksonville, Fla., and Phoenix have launched programs that sift through data sets to find patterns that can lead to better governance decisions. Los Angeles has combined transparency with analytics to create an online system that tracks performance for the city’s economy, service delivery, public safety and government operations that the public can view. Robert J. O’Neill Jr., executive director of the International City/County Management Association, said that both of these tech-driven performance trends “enable real-time decision-making.” He argued that public leaders who grasp the significance of these new tools can deliver government services that today’s constituents expect.
3. Online Citizen Engagement
…Avondale, Ariz., population 78,822, is engaging citizens with a mobile app and an online forum that solicits ideas that other residents can vote up or down.
In Westminster, Colo., population 110,945, a similar forum allows citizens to vote online about community ideas and gives rewards (free passes to a local driving range or fitness program) to users who engage with the forum on a regular basis. Cities are promoting more engagement activities to combat a decline in public trust in government. A public meeting alone is no longer enough to provide citizen engagement in today’s technology-dominated world. That’s why social media tools, online surveys and even e-commerce rewards programs are popping up in cities around the country to create high-value interaction with their citizens.
4. Geographic Information Systems
… Cities now use them to analyze financial decisions to increase performance, support public safety, improve public transit, run social service activities and, increasingly, engage citizens about their city’s governance.
Augusta, Ga., won an award for its well-designed and easy-to-use transit maps. Sugar Land, Texas, uses GIS to support economic development and, as part of its citizen engagement efforts, to highlight its capital improvement projects. GIS is now used citywide by 92 percent of the survey respondents. That’s significant because GIS has long been considered a specialized (and expensive) technology primarily for city planning and environmental projects….”
The Global Open Data Index 2014
The UK tops the 2014 Index, retaining its pole position with an overall score of 96%, closely followed by Denmark and then France at number 3, up from 12th last year. Finland comes in 4th, while Australia and New Zealand share 5th place. Impressive results were seen from India at #10 (up from #27) and Latin American countries like Colombia and Uruguay, which came in joint 12th.
Sierra Leone, Mali, Haiti and Guinea rank lowest of the countries assessed, but many countries with less open governments were not assessed at all, for lack of openness or of a sufficiently engaged civil society.
Overall, whilst there is meaningful improvement in the number of open datasets (from 87 to 105), the percentage of open datasets across all the surveyed countries remained low at only 11%.
Even amongst the leaders on open government data there is still room for improvement: the US and Germany, for example, do not provide a consolidated, open register of corporations. There was also a disappointing degree of openness around the details of government spending with most countries either failing to provide information at all or limiting the information available – only two countries out of 97 (the UK and Greece) got full marks here. This is noteworthy as in a period of sluggish growth and continuing austerity in many countries, giving citizens and businesses free and open access to this sort of data would seem to be an effective means of saving money and improving government efficiency.
Explore the Global Open Data Index 2014 for yourself!”
How do we improve open data for police accountability?
Emily Shaw at the SunLight Foundation: “This is a challenging time for people who worry about the fairness of American governmental institutions. In quick succession, grand juries declined to indict two police officers accused of killing black men. In the case of Ferguson, Mo. officer Darren Wilson’s killing of Michael Brown, the grand jury’s decision appeared to center on uncertainty about whether Wilson’s action was legal and whether he killed under threat. In the case of New York City police officer Daniel Pantaleo’s killing of Eric Garner, however, a bystander recorded and made public a video of the police officer causing Garner’s death through an illegal chokehold. In Pantaleo’s case, the availability of video data has made the question about institutional fairness even more urgent, as people can see for themselves the context in which the officer exercised power. The data has given us a common set of facts to use in judging police behavior.
We grant law enforcement and corrections departments the right to exercise more physical power over the public than we do to any other part of our government. But do we generally have the data we need to evaluate how they’re using it?….
The time to find good solutions to these problems is now. Responding to widespread frustration, President Obama has just announced a three-part initiative to “strengthen community policing”: an increased focus on transparency and oversight for federal-to-local transfers of military equipment, a proposal to provide matching funding to local police departments to buy body cameras, and a “Task Force on 21st Century Policing” that will make recommendations for how to implement community-oriented policing practices.
While each element of Obama’s initiative corresponds to a distinct set of concerns about policing, one element they share in common is the need to increase access to information about police work. Each of the three approaches will rely on mechanisms to increase the flow of public information about what police officers are doing in their official roles and how they are doing it. How are police officers going about fulfilling their responsibility to ensure public safety? Are they working in ways that appropriately respect individual rights? Are they responsive to public concerns, when concerns are raised?
By encouraging the collection and publication of more data about how government is working, Obama’s initiative has the potential to support precisely the kind of increase in data availability that can transform public outcomes. When applied with the intent to improve transparency and accountability and to increase public engagement, open data — and the civic tech that uses this data — can bridge the often too-large gap between the public and government.
However, because Obama’s initiatives depend on the effective collection, publication, and communication of information, open data advocates have a particular contribution to make. It’s important to think about what lessons we can apply from our experiences with open data — and with data collected and used for police accountability — in order to ensure that this initiative has the greatest possible impact. As an open data and open government community, can we make recommendations that can help improve the data we’re collecting for police transparency and accountability?
I’m going to begin a list, but it’s just a beginning – I am certain that you have many more recommendations to make. I’ll categorize them first by Obama’s “Strengthening Community Policing” initiatives and then keep thinking about what additional data is needed. Please think along with me about what kind of datasets we will need, what potential issues with data availability and quality we’re likely to see, and what kind of laws may need to be changed to improve access to the data necessary for police accountability, then make your recommendations in the Google Doc embedded at the end of this post. If you’ve seen any great projects which improve police transparency and accountability, be sure to share those as well….”
The Year of Data-Driven Government Accountability
Pacific Standard: “Indeed, 2014 could be called the Year of Government Accountability, as voters on just about every continent have demanded that public officials govern with relentless efficiency, fiscal responsibility, and transparency….
The bottom line, in my view, is that facts must be the fundamental basis for critical and strategic decision-making at every level of government around the world today.
This belief—the foundation of massive technology and social movements, such as open data, big data, and data-driven government—is currently shared by a number of global government leaders. Just recently, for example, President Obama declared that “We must respond based on facts, not fear” when confronting the global Ebola crisis.
To be sure, presenting facts to decision-makers where and when they are needed is one of the most urgent technology priorities of our time. The good news is that we’re seeing progress on this front each and every day as civic organizations around the world rush to open their vast troves of data on the Internet and usher in a new era in data-driven government that will produce facts at the speed of light, and deliver them in context to political leaders, everyday citizens, professional academicians, scientists, journalists, and software developers wherever they are connected to the Web.
Data-driven government, which capitalizes on data, one of the most valuable natural resources of the 21st century, is a breakthrough opportunity of truly significant proportions. And it will be absolutely critical if governments everywhere are to achieve their ultimate mission. Without it, I worry that we just won’t be able to provide citizens with a higher quality of life and with greater opportunities to achieve their full potential.
Forget the FOIA Request: Cities, States Open Data Portals
Alexa Capeloto in MediaShift (PBS): “In almost any city you can read your local leaders’ emails if you formally ask for them. In Gainesville, Fla., all you have to do is go here.
In most states you can find out how tax dollars are being spent if you officially request expenditure records. In Wisconsin, you just click here.
For the last 50 years, governments have released public records in response to Freedom of Information requests. But a number of public agencies are learning the value of proactively providing information before anyone has to ask for it.
The trend is part of the open-data movement that most large cities and the federal government have already begun to embrace. The information itself can range from simple emails to complex datasets, but the general idea is the same: Deliver information directly to the public using digital tools that can save money and serve the goal of government transparency.
…And there’s the added benefit of helping the bottom line. Users don’t have to request information if it’s already posted, saving agencies time and money, and a centralized FOIA tracking system can further streamline processing.
Sean Moulton of the Center for Effective Government testified before Congress that full participation in FOIAonline could save federal agencies an estimated $40 million per year in processing costs.
And Reinvent Albany, a nonprofit that pushes for transparency in New York, estimated in a June report that New York City could reduce FOI-related costs by 66 percent – from $20 million per year down to $7 million – by adopting an open-data system and doing away with its “hodgepodge of paper-based methods that are expensive, slow and unreliable.”
So…What’s the Catch?
In one survey, chief information officers were asked to name the top three barriers to advancing open data in state government. Fifty-three percent cited “agencies’ willingness to publish data,” and 49 percent cited “the reliability of the data.”
Information is of little value to the public if it’s faulty or too complex to understand. It could become a way for agencies to claim they’re being transparent without actually providing anything useful.
Plus, some worry that public servants will self-censor if they know, for example, their emails are automatically being shared with the world….”