Data Revolutionaries: Routine Administrative Data Can Be Sexy Too


Sebastian Bauhoff at Center for Global Development: “Routine operational data on government programs lack sexiness, and are generally not trendy with Data Revolutionaries. But unlike censuses and household surveys, routine administrative data are readily available at low cost, cover key populations and service providers, and are generally at the right level of disaggregation for decision-making on payment and service delivery. Despite their potential utility, these data remain an under-appreciated asset for generating evidence and informing policy—a particularly egregious omission given that developing countries can leapfrog old, inefficient approaches for more modern methods to collect and manage data. Verifying receipt of service via biometric ID and beneficiary fingerprint at the point of service? India’s already doing it.

To better make the case for routine data, two questions need to be answered—what exactly can be learned from these data and how difficult are they to use?

In a paper just published in Health Affairs with collaborators from the World Bank and the Government of India, we probed these questions using claims data from India’s National Health Insurance Program, Rashtriya Swasthya Bima Yojana (RSBY). Using the US Medicare program as a comparison, we wondered whether reimbursement claims data that RSBY receives from participating hospitals could be used to study the quality of care provided. The main goal was to see how far we could push on an example dataset of hospital claims from Puri, a district in Orissa state.

Here’s what we learned…(More)”

Evaluating World Bank Support to Budget Analysis and Transparency


Report by Linnea Mills and Clay G. Wescott: “BOOST is a new resource launched in 2010 to facilitate improved quality, classification, and access to budget data and promote effective use for improved government decision making, transparency and accountability. Using the Government’s own data from public expenditure accounts held in the Government’s Financial Management Information System, and benefiting from a consistent methodology, the BOOST data platform makes highly granular fiscal data accessible and ready-for-use. National authorities can significantly enhance fiscal transparency by publishing summary data and analysis or by providing open access to the underlying dataset. This paper addresses four research questions: Did BOOST help improve the quality of expenditure analysis available to government decision makers? Did it help to develop capacity in central finance and selected spending agencies to sustain expenditure analysis? Did it help to improve public access to expenditure analysis and data? Did it help to increase awareness of the opportunities for BOOST and expenditure analysis in Sub-Saharan Africa as well as countries outside this region where BOOST has been used (Georgia, Haiti and Tunisia)?

Evidence has been drawn from various sources. Survey questionnaires were sent to all World Bank task team leaders for Gates Trust Fund supported countries. Completed questionnaires were received from 18 predominantly African countries (Annex 4). These 18 countries constitute the majority but not all of the countries implementing BOOST with financial support from the Trust Fund. Information has also been gathered through a BOOST stakeholder questionnaire targeting government officials, civil society representatives and representatives from parliaments at country level, field visits to Kenya, Mozambique and Uganda, interviews with stakeholders at the Bank and at country level, participation at regional conferences on BOOST in South Africa and Senegal, and document review. Interviews covered participants from some countries that did not complete questionnaires, such as Haiti.

The research will help to inform the Bill and Melinda Gates Foundation, and the World Bank, the administrator of the trust fund on the achievements of the program, and the value of continuing support. It will inform client country Governments, and non-Government actors interested in improved dissemination and analysis of quality public financial data. The research should also be useful for vendors of similar products like OpenGov; and to international scholars and experts working to better understand public expenditure management in developing countries….(More)”

Resource Library for Cross-Sector Collaboration


The Intersector Project: “Whether you’re working on a local collective impact initiative or a national public-private partnership; whether you’re a practitioner or a researcher; whether you’re looking for basics or a detailed look at a particular topic, our Resource Library can help you find the information and tools you need for your cross-sector thinking and practice. The Library — which includes resources from research organizations, advisory groups, training organizations, academic centers and journals, and more — spans issue areas, sectors, and partnership types….(More)”

Bringing together the United States of data


The U.S. Data Federation will support government-wide data standardization and data federation initiatives across both Federal agencies and local governments. This is intended to be a fundamental coordinating mechanism for a more open and interconnected digital government by profiling and supporting use-cases that demonstrate unified and coherent data architectures across disparate government agencies. These examples will highlight emerging data standards and API initiatives across all levels of government, convey the level of maturity for each effort, and facilitate greater participation by government agencies. Initiatives that may be profiled within the U.S. Data Federation include Open311, DOT’s National Transit Map, the Project Open Data metadata schema, Contact USA, and the Police Data Initiative. As part of the U.S. Data Federation, GSA will also pilot the development of reusable components needed for a successful data federation strategy including schema documentation tools, schema validation tools, and automated data aggregation and normalization capabilities. The U.S. Data Federation will provide more sophisticated and seamless opportunities on the foundation of U.S. open data initiatives by allowing the public to more easily do comparative data analysis across government bodies and create applications that work across multiple government agencies….(More)”
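The “schema validation tools” GSA mentions can be pictured as a check of a dataset record against an agreed field list. A minimal sketch in Python, using a handful of required fields from the Project Open Data metadata schema (the record contents are invented, and the rules here are a toy illustration, not GSA’s actual tooling):

```python
# Toy schema validator: check that a dataset record carries required
# Project Open Data metadata fields. Real validators also check types,
# string formats, controlled vocabularies, etc.

REQUIRED_FIELDS = {"title", "description", "keyword", "modified",
                   "publisher", "accessLevel", "identifier"}

def validate_record(record):
    """Return a sorted list of required fields missing from `record`."""
    return sorted(REQUIRED_FIELDS - record.keys())

record = {
    "title": "National Transit Map Stops",
    "description": "Stop locations reported by participating agencies.",
    "keyword": ["transit", "GTFS"],
    "modified": "2016-09-01",
    "publisher": "U.S. Department of Transportation",
    "accessLevel": "public",
    "identifier": "dot-ntm-stops",
}

print(validate_record(record))   # → []
del record["accessLevel"]
print(validate_record(record))   # → ['accessLevel']
```

The same pattern, applied feed-by-feed across agencies, is what makes federated catalogs comparable in the first place: a record either conforms to the shared schema or the gap is reported.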

Privacy and Open Data


A Research Briefing by Wood, Alexandra and O’Brien, David and Gasser, Urs: “Political leaders and civic advocates are increasingly recommending that open access be the “default state” for much of the information held by government agencies. Over the past several years, they have driven the launch of open data initiatives across hundreds of national, state, and local governments. These initiatives are founded on a presumption of openness for government data and have led to the public release of large quantities of data through a variety of channels. At the same time, much of the data that have been released, or are being considered for release, pertain to the behavior and characteristics of individual citizens, highlighting tensions between open data and privacy. This research briefing offers a snapshot of recent developments in the open data and privacy landscape, outlines an action map of various governance approaches to protecting privacy when releasing open data, and identifies key opportunities for decision-makers seeking to respond to challenges in this space….(More)”

Next big thing: The ‘uberfication’ of crowdsourced news


Ken Doctor at Politico: “Get ready to hear a lot about the “uberfication” of user-generated content.

Yes, it’s a mouthful. But it’s also the next big thing. Fresco News, a two-year-old New York start-up, sees itself becoming a hot property as it cracks the code on local amateur content generation….Fresco News now enables local TV stations to assign, receive and quickly get on air and online lots of amateur-shot newsy videos in their metro area.

Its secret sauce: Uberizing the supply chain process from station assignment to Fresco “qualified” shooter to shooting smartphone video to uploading and optimizing its quality for quick delivery to consumers, online or on the air.

Meyer’s team of 40, which includes numerous part-timers, has assiduously worked through the many frictions. That’s one hallmark of successful Uberfication.

“We just did a tremendous amount of just non-stop testing,” he says. “I would say, even with simple things like user acquisition, which is a major part of our process and entering new markets. We’ve tested hundreds of different ad types, graphics that we’ve designed internally that effectively, and I would say cheaply, bring in prospective citizen journalists.”

Stations can assign easily. Would-be shooters can see assignments, geographically displayed, on a single screen. The upload works well and stations’ ability to quickly use the videos is a strong selling point. Fresco, then, handles the billing and payment processes, much as Uber does.

As with individual taxi rides, each transaction is small. Shooters get $50 for each video used by a TV station or $20 for a still photo. Stations pay $75 for a video and $30 for a still. As a standalone business, Fresco News is a scale play.

It’s not a new idea.

UGC – or user-generated content – was supposed to be huge. The late ’90s notion: the Internet could make anyone and everyone a reporter, and make it easy for them to share their work widely and cheaply. Many newspaper chains bought into the idea, and tested it unevenly, hoping that UGC could provide what was for a while called “local-local” content. Local-local meant neighborhood-level coverage: the kind of locally differentiating news that publishers thought readers wanted, but believed cost too much if they had to pay professional reporters to produce it.

Short story: It didn’t work for the chains. In part, the technology was immature. More importantly, it turns out that reporting – and writing – remains, even in the Internet age, largely a professional skill. Publishers couldn’t find enough dependable local amateurs, and besides, they never really iterated a business model around the idea.

Then, there were the national start-up efforts. NowPublic, one memorable effort that partnered with the Associated Press, launched in 2005 but never found traction. Today, several other companies ply the territory, with Storyful a standard of quality. Importantly, Storyful focuses on national and global content. Fresco News aims squarely at local – first across the 3,000-mile breadth of the U.S.

The dots tell the story

Take a look at the many dots on the Philly map above. Each blue dot represents an active, signed-up Fresco video shooter in the area. Each yellow dot shows current assignments. In this August visualization, you get a sense of how quickly and energetically local TV station Fox 29, WTXF, has deployed – and uses – Fresco News. One shooter, who has earned “preferred” status at Fresco, has been around journalistic operations for a long time and looks forward to contributing news tips as Fresco expands what its tech can do for local stations…(More)”

Crowdsourcing at Statistics Canada


Pilot project by Statistics Canada: “Our crowdsourcing pilot project will focus on mapping buildings across Canada.

If you live in Ottawa or Gatineau, you can be among the first to collaborate with us. If you live elsewhere, stay in touch! Your town or city could be next. We are very excited to work with communities across the country on this project.

As a project contributor, you can help create a free and open source of information on commercial, industrial, government and other buildings in Canada. We need your support to close this important data gap! Your work will improve your community’s knowledge of its buildings, and in turn inform policies and programs designed to help you.

An eye on the future

There are currently no accurate national-level statistics on buildings—and their attributes—that can be used to compare specific local areas. The information you submit will help to fill existing data gaps and provide new analytical opportunities that are important to data users.

This project will also teach us about the possibilities and limitations of crowdsourcing. Crowdsourcing data collection may become a way for Statistics Canada and other organizations around the world to collect much-needed information by reaching out to citizens.

What you can do

Using your knowledge of your neighbourhood, along with an online mapping tool called OpenStreetMap, you and other members of the public will be able to input the location, physical attributes and other features of buildings.
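In OpenStreetMap terms, a contributed building is a closed way: corner nodes carry the location, and tags on the way carry the attributes. A rough sketch of what one submission amounts to, built with Python’s standard XML library (the coordinates and address are made up for illustration; the tag keys follow common OSM conventions):

```python
import xml.etree.ElementTree as ET

# Sketch of OSM data for one building: four corner nodes plus a closed
# way tagged as a commercial building. Negative ids are the editor
# convention for newly created elements; the values here are invented.
osm = ET.Element("osm", version="0.6")
corners = [(45.4215, -75.6972), (45.4216, -75.6972),
           (45.4216, -75.6970), (45.4215, -75.6970)]
for i, (lat, lon) in enumerate(corners, start=1):
    ET.SubElement(osm, "node", id=str(-i), lat=str(lat), lon=str(lon))

way = ET.SubElement(osm, "way", id="-5")
for i in list(range(1, 5)) + [1]:        # repeat first node to close the ring
    ET.SubElement(way, "nd", ref=str(-i))
for k, v in [("building", "commercial"),  # physical attributes as key=value tags
             ("building:levels", "3"),
             ("addr:housenumber", "100"),
             ("addr:street", "Example Street")]:
    ET.SubElement(way, "tag", k=k, v=v)

print(ET.tostring(osm, encoding="unicode")[:60], "...")
```

The mapping tool handles this XML for you; the point is that each contribution is just a small, structured bundle of geometry plus attributes, which is what makes crowdsourced submissions aggregable into national statistics.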


It all starts with you, on October 17, 2016

We will officially launch the crowdsourcing campaign for the pilot on October 17, 2016 and will provide further instructions and links to resources.

To subscribe to a distribution list for periodic updates on the project, send us an email at statcan.crowdsource.statcan@canada.ca. We will keep you posted!…(More)”

National Transit Map Seeks to Close the Transit Data Gap


Ben Miller at GovTech: “In bringing together the first ever map illustrating the nation’s transit system, the U.S. Department of Transportation isn’t just making data more accessible — it’s also aiming to modernize data collection and dissemination for many of the country’s transit agencies.

With more than 10,000 routes and 98,000 stops represented, the National Transit Map is already enormous. But Dan Morgan, chief data officer of the department, says it’s not enough. When measuring vehicles operated in maximum service — a metric illustrating peak service at a transit agency — the National Transit Map captures only about half of all transit in the U.S.

“Not all of these transit agencies have this data available,” Morgan said, “so this is an ongoing project to really close the transit data gap.”

Which is why, in the process of building out the map, the DOT is working with transit agencies to make their data available.

On the whole, transit data is easier to collect and process than a lot of transportation data because many agencies have adopted a standard called General Transit Feed Specification (GTFS) that applies to schedule-related data. That’s what made the National Transit Map an easy candidate for completion, Morgan said.

But as popular as GTFS has become, many agencies — especially smaller ones — haven’t been able to use it. The tools to convert to GTFS come with a learning curve.
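GTFS itself is nothing exotic: a feed is a zip archive of plain CSV files (stops.txt, routes.txt, stop_times.txt, and so on). A minimal sketch of reading stop data with Python’s standard library, using a made-up two-stop feed rather than any real agency’s file:

```python
import csv
import io

# A tiny stand-in for a GTFS stops.txt file. The column names follow
# the GTFS specification; the stops themselves are invented.
stops_txt = """\
stop_id,stop_name,stop_lat,stop_lon
S1,Main St & 1st Ave,38.8977,-77.0365
S2,Main St & 2nd Ave,38.8989,-77.0353
"""

stops = list(csv.DictReader(io.StringIO(stops_txt)))
print(len(stops), "stops")   # → 2 stops
for s in stops:
    print(s["stop_id"], s["stop_name"],
          float(s["stop_lat"]), float(s["stop_lon"]))
```

The learning curve is less about reading files like this and more about producing them: exporting a complete, internally consistent feed from whatever scheduling system an agency already runs.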

“It’s really a matter of priority and availability of resources,” he said.

Bringing those agencies into the mainstream is important to achieving the goals of the map. In the map, Morgan said he sees an opportunity to achieve a new level of clarity where it has never existed before.

That’s because transit has long suffered from difficulty in seeing its own history. Transit officials can describe their systems as they exist, but looking at how they got there is trickier.

“There’s no archive,” Morgan said, “there’s no picture of how transit changes over time.”

And that’s a problem for assessing what works and what doesn’t, for understanding why the system operates the way it does and how it responds to changes. …(More)”

Recent Developments in Open Data Policy


Presentation by Paul Uhlir: “Several international organizations have issued policy statements on open data policies in the past two years. This presentation provides an overview of those statements and their relevance to developing countries.

International Statements on Open Data Policy

Open data policies have gained much more international support in recent years. Policy statements in just the most recent 2014-2016 period that endorse and promote openness to research data derived from public funding include: the African Data Consensus (UNECA 2014); the CODATA Nairobi Principles for Data Sharing for Science and Development in Developing Countries (PASTD 2014); the Hague Declaration on Knowledge Discovery in the Digital Age (LIBER 2014); Policy Guidelines for Open Access and Data Dissemination and Preservation (RECODE 2015); Accord on Open Data in a Big Data World (Science International 2015). This presentation summarizes the principal guidelines of these policy statements.

The Relevance of Open Data from Publicly Funded Research for Development

There are many reasons that publicly funded research data should be made as freely and openly available as possible. Some of these are noted here, although many other benefits are possible. For research, open data helps close the gap with more economically developed countries, makes researchers more visible on the web, enhances their collaborative potential, and links them globally. For education, open data greatly assists students in learning how to do data science and manage data better. From a socioeconomic standpoint, open data policies have been shown to enhance economic opportunities and to enable citizens to improve their lives in myriad ways. Such policies are also more ethical: they allow access to those who have no means to pay, and they avoid charging for the data twice, once through the taxes that created the data and again at the point of use. Finally, access to factual data can improve governance, leading to better decision making by policymakers, improved oversight by constituents, and digital repatriation of objects held by former colonial powers.

Some of these benefits are cited directly in the policy statements themselves, while others are developed more fully in other documents (Bailey Mathae and Uhlir 2012, Uhlir 2015). Of course, not all publicly funded data and information can be made available and there are appropriate reasons—such as the protection of national security, personal privacy, commercial concerns, and confidentiality of all kinds—that make the withholding of them legal and ethical. However, the default rule should be one of openness, balanced against a legitimate reason not to make the data public….(More)”