How Can the Department of Education Increase Innovation, Transparency and Access to Data?


David Soo at the Department of Education: “Despite the growing amount of information about higher education, many students and families still need access to clear, helpful resources to make informed decisions about going to – and paying for – college.  President Obama has called for innovation in college access, including by making sure all students have easy-to-understand information.
Now, the U.S. Department of Education needs your input on specific ways that we can increase innovation, transparency, and access to data.  In particular, we are interested in how APIs (application programming interfaces) could make our data and processes more open and efficient.
APIs are sets of software instructions and standards that allow machine-to-machine communication.  APIs could allow developers from inside and outside government to build apps, widgets, websites, and other tools based on government information and services to let consumers access government-owned data and participate in government-run processes from more places on the Web, even beyond .gov websites. Well-designed government APIs help make data and processes freely available for use within agencies, between agencies, in the private sector, or by citizens, including students and families.
So, today, we are asking you – student advocates, designers, developers, and others – to share your ideas on how APIs could spark innovation and enable processes that can serve students better. We need you to weigh in on a Request for Information (RFI) – a formal way the government asks for feedback – on how the Department could use APIs to increase access to higher education data or financial aid programs. There may be ways that Department forms – like the Free Application for Federal Student Aid (FAFSA) – or information-gathering processes could be made easier for students by incorporating the use of APIs. We invite the best and most creative thinking on specific ways that Department of Education APIs could be used to improve outcomes for students.
To weigh in, you can email APIRFI@ed.gov by June 2, or send your input via other addresses as detailed in the online notice.
The Department wants to make sure to do this right. It must ensure the security and privacy of the data it collects or maintains, especially when the information of students and families is involved.  Openness only works if privacy and security issues are fully considered and addressed.  We encourage the field to provide comments that identify concerns and offer suggestions on ways to ensure privacy, safeguard student information, and maintain access to federal resources at no cost to the student.
Through this request, we hope to gather ideas on how APIs could be used to fuel greater innovation and, ultimately, affordability in higher education.  For further information, see the Federal Register notice.”

The Transformative Impact of Data and Communication on Governance


Steven Livingston at Brookings: “How do digital technologies affect governance in areas of limited statehood – places and circumstances characterized by the absence of state provisioning of public goods and the enforcement of binding rules with a monopoly of legitimate force?  In the first post in this series I introduced the limited statehood concept and then described the tremendous growth in mobile telephony, GIS, and other technologies in the developing world.   In the second post I offered examples of the use of ICT in initiatives intended to fill at least some of the governance vacuum created by limited statehood.  With mobile phones, for example, farmers are informed of market conditions, have access to liquidity through M-Pesa and similar mobile money platforms….
This brings to mind another type of ICT governance initiative.  Rather than fill in for or even displace the state, some ICT initiatives can strengthen governance capacity.  Digital government – the use of digital technology by the state itself – is one important possibility.  Other initiatives strengthen the state by exerting pressure. Countries with weak governance sometimes take the form of extractive states, those that cater to the needs of an elite, leaving the majority of the population in poverty and without basic public services. This is what Daron Acemoglu and James A. Robinson call extractive political and economic institutions.  Inclusive states, on the other hand, are pluralistic, bound by the rule of law, respectful of property rights, and, in general, accountable.  Accountability mechanisms such as a free press and competitive multiparty elections are instrumental in discouraging extractive institutions.  What ICT-based initiatives might lend a hand in strengthening accountability? We can point to three examples.

Example One: Using ICT to Protect Human Rights

Nonstate actors now use commercial, high-resolution remote sensing satellites to monitor weapons programs and human rights violations.  Amnesty International’s Remote Sensing for Human Rights offers one example, and Satellite Sentinel offers another.  Both use imagery from DigitalGlobe, an American remote sensing and geospatial content company.   Other organizations have used commercially available remote sensing imagery to monitor weapons proliferation.  The Institute for Science and International Security, a Washington-based NGO, revealed the Iranian nuclear weapons program in 2003 using commercial satellite imagery…

Example Two: Crowdsourcing Election Observation

Others have used mobile phones and GIS to crowdsource election observation.  For the 2011 elections in Nigeria, The Community Life Project, a civil society organization, created ReclaimNaija, an elections process monitoring system that relied on GIS and amateur observers with mobile phones to monitor the elections.  Each of the red dots represents an aggregation of geo-located incidents reported to the ReclaimNaija platform.  In a live map, clicking on a dot disaggregates the reports, eventually taking the reader to individual reports.  Rigorous statistical analysis of ReclaimNaija results and the elections suggests it contributed to the effectiveness of the election process.

ReclaimNaija: Election Incident Reporting System Map

Example Three: Using Genetic Analysis to Identify War Crimes

In recent years, more powerful computers have led to major breakthroughs in biomedical science.  The reduction in cost of analyzing the human genome has actually outpaced Moore’s Law.  This has opened up new possibilities for the use of genetic analysis in forensic anthropology.   In Guatemala, the Balkans, Argentina, Peru and in several other places where mass executions and genocides took place, forensic anthropologists are using genetic analysis to find evidence that is used to hold the killers – often state actors – accountable…”

Wikipedia Use Could Give Insights To The Flu Season


Agata Blaszczak-Boxe in Huffington Post: “By monitoring the number of times people look for flu information on Wikipedia, researchers may be better able to estimate the severity of a flu season, according to a new study.
Researchers created a new data-analysis system that looks at visits to Wikipedia articles, and found the system was able to estimate flu levels in the United States up to two weeks sooner than the flu data from the Centers for Disease Control and Prevention were released.
Looking at data spanning six flu seasons between December 2007 and August 2013, the new system estimated the peak flu week better than Google Flu Trends, another data-based system. The Wikipedia-based system accurately estimated the peak flu week in three out of six seasons, while the Google-based system got only two right, the researchers found.
“We were able to get really nice estimates of what the [flu] level is in the population,” said study author David McIver, a postdoctoral fellow at Boston Children’s Hospital.
The new system examined visits to Wikipedia articles that included terms related to flulike illnesses, whereas Google Flu Trends looks at searches typed into Google. The researchers analyzed the data from Wikipedia on how many times in an hour a certain article was viewed, and combined their data with flu data from the CDC, using a model they created.
The research team wanted to use a database that is accessible to everyone and create a system that could be more accurate than Google Flu Trends, which has flaws. For instance, during the swine flu pandemic in 2009, and during the 2012-2013 influenza season, Google Flu Trends got a bit “confused,” and overestimated flu numbers because of increased media coverage focused on the two illnesses, the researchers said.
When a pandemic strikes, people search for news stories related to the pandemic itself, but this doesn’t mean that they have the flu. In general, the problem with Internet-based estimation systems is that it is practically impossible to tell whether people are looking for information about an illness because they are sick, the researchers said.
In the new system, the researchers tried to overcome this issue by including a number of Wikipedia articles “to act as markers for general background-level activity of normal usage of Wikipedia,” the researchers wrote in the study. However, just like any other data-based system, the Wikipedia system is not immune to the issues related to figuring out the actual motivation of someone checking information related to the flu…
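The article does not spell out the researchers' model, so the following is only a minimal sketch of the general idea: scale flu-related article views by views of background articles, fit a straight line against CDC flu levels, and estimate a new week. Ordinary least squares stands in for the study's actual model here, and all view counts and flu levels are made-up illustrative numbers.

```python
# Hypothetical weekly counts; none of these figures come from the study.
flu_article_views = [100, 220, 300, 480]
background_views = [1000, 1100, 1000, 1200]   # marker articles for normal usage
cdc_flu_level = [1.0, 2.0, 3.0, 4.0]          # e.g. percent of visits for flulike illness


def normalized_signal(flu_views, bg_views):
    """Divide flu-article views by background views to damp overall traffic swings."""
    return [f / b for f, b in zip(flu_views, bg_views)]


def fit_line(xs, ys):
    """Ordinary least squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


signal = normalized_signal(flu_article_views, background_views)
slope, intercept = fit_line(signal, cdc_flu_level)

# Estimate the flu level for a new week before official figures are released.
new_week_signal = 500 / 1000
estimate = slope * new_week_signal + intercept
print(round(estimate, 2))
```

As the article notes, normalizing by background traffic cannot reveal why someone viewed a page, so a sketch like this shares the same limitation.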
The study is published … in the journal PLOS Computational Biology.”

The Open Data 500: Putting Research Into Action


TheGovLab Blog: “On April 8, the GovLab made two significant announcements. At an open data event in Washington, DC, I was pleased to announce the official launch of the Open Data 500, our study of 500 companies that use open government data as a key business resource. We also announced that the GovLab is now planning a series of Open Data Roundtables to bring together government agencies with the businesses that use their data – and that five federal agencies have agreed to participate. Video of the event, which was hosted by the Center for Data Innovation, is available here.
The Open Data 500, funded by the John S. and James L. Knight Foundation, is the first comprehensive study of U.S.-based companies that rely on open government data.  Our website at OpenData500.com includes searchable, sortable information on 500 of these companies.  Our data about them comes from responses to a survey we’ve sent to all the companies (190 have responded) and what we’ve been able to learn from research using public information.  Anyone can now explore this website, read about specific companies or groups of companies, or download our data to analyze it. The website features an interactive tool on the home page, the Open Data Compass, that shows the connections between government agencies and different categories of companies visually.
We began work on the Open Data 500 study last fall with three goals. First, we wanted to collect information that will ultimately help calculate the economic value of open data – an important question for policymakers and others. Second, we wanted to present examples of open data companies to inspire others to use this important government resource in new ways. And third – and perhaps most important – we’ve hoped that our work will be a first step in creating a dialogue between the government agencies that provide open data and the companies that use it.
That dialogue is critically important to make government open data more accessible and useful. While open government data is a huge potential resource, and federal agencies are working to make it more available, it’s too often trapped in legacy systems that make the data difficult to find and to use. To solve this problem, we plan to connect agencies to their clients in the business community and help them work together to find and liberate the most valuable datasets.
We now plan to convene and facilitate a series of Open Data Roundtables – a new approach to bringing businesses and government agencies together. In these Roundtables, which will be informed by the Open Data 500 study, companies and the agencies that provide their data will come together in structured, results-oriented meetings that we will facilitate. We hope to help figure out what can be done to make the most valuable datasets more available and usable quickly.
We’ve been gratified by the immediate positive response to our plan from several federal agencies. The Department of Commerce has committed to help plan and participate in the first of our Roundtables, now being scheduled for May. By the time we announced our launch on April 8, the Departments of Labor, Transportation, and Treasury had also signed up. And at the end of the launch event, the Deputy Chief Information Officer of the USDA publicly committed her agency to participate as well…”

Citi Bike System Data


Citi Bike: “Where do Citi Bikers ride? When do they ride? How far do they go? Which stations are most popular? What days of the week are most rides taken on? We’ve heard all of these questions and more from you and now we are happy to provide the datasets to help you discover the answers to these questions and more. We invite developers, engineers, statisticians, artists, academics and other members of the interested public to use the data we provide for analysis, development, visualization and whatever else moves you.
This data is provided according to the NYCBS Data Use Policy.
Citi Bike Trip Histories
Below are links to downloadable files of Citi Bike trip data. The data includes:

  • Trip Duration (seconds)
  • Start Time and Date
  • Stop Time and Date
  • Start Station Name
  • End Station Name
  • Station ID
  • Station Lat/Long
  • Bike ID
  • User Type (Customer = 24-hour pass or 7-day pass user; Subscriber = Annual Member)
  • Gender
  • Year of Birth”
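As a rough illustration of how the trip fields listed above could answer questions such as which days and stations are busiest, here is a minimal sketch. The column names and sample rows are assumptions made for this example, not the header of the published files, so adjust them to match the actual download.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Hypothetical sample; column names and values are illustrative assumptions.
SAMPLE_CSV = """\
tripduration,starttime,stoptime,start_station_name,end_station_name,bikeid,usertype
600,2014-03-03 08:00:00,2014-03-03 08:10:00,Station A,Station B,1001,Subscriber
1200,2014-03-03 18:30:00,2014-03-03 18:50:00,Station B,Station A,1002,Subscriber
900,2014-03-04 09:15:00,2014-03-04 09:30:00,Station A,Station C,1003,Customer
2400,2014-03-08 14:00:00,2014-03-08 14:40:00,Station A,Station B,1001,Customer
"""

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def summarize_trips(csv_text):
    """Count rides per weekday and per start station, and average the duration."""
    rides_by_weekday = Counter()
    rides_by_start = Counter()
    total_seconds = 0
    trip_count = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        start = datetime.strptime(row["starttime"], "%Y-%m-%d %H:%M:%S")
        rides_by_weekday[WEEKDAYS[start.weekday()]] += 1
        rides_by_start[row["start_station_name"]] += 1
        total_seconds += int(row["tripduration"])
        trip_count += 1
    return {
        "rides_by_weekday": rides_by_weekday,
        "rides_by_start": rides_by_start,
        "mean_duration_seconds": total_seconds / trip_count,
    }


summary = summarize_trips(SAMPLE_CSV)
print(summary["rides_by_weekday"].most_common())
print(summary["rides_by_start"].most_common(1))
print(summary["mean_duration_seconds"])
```

For a real file, the same function could be fed the text of a downloaded CSV instead of the inline sample.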

How Civil Society Organizations Close the Gap between Transparency and Accountability


In a research note in the current issue of Governance, Albert Van Zyl poses “the most critical question for activists and scholars of accountability: How and when does transparency lead to greater accountability?”  Van Zyl’s note looks particularly at the role of civil society organizations (CSOs) in demanding and using government budget information, drawing on case studies of CSO activity in eleven countries in Africa, Latin America and South Asia.  Accountability is achieved, Van Zyl suggests, when CSOs are active and closely engaged with legislators, auditors, and other formal oversight institutions.  But research is still needed on the kinds of engagement that are most likely to enhance accountability.  Read the research note.

Historic release of data delivers unprecedented transparency on the medical services physicians provide and how much they are paid


Jonathan Blum, Principal Deputy Administrator, Centers for Medicare & Medicaid Services: “Today the Centers for Medicare & Medicaid Services (CMS) took a major step forward in making Medicare data more transparent and accessible, while maintaining the privacy of beneficiaries, by announcing the release of new data on medical services and procedures furnished to Medicare fee-for-service beneficiaries by physicians and other healthcare professionals (http://www.cms.gov/newsroom/newsroom-center.html). For too long, the only information on physicians readily available to consumers was physician name, address and phone number. This data will, for the first time, provide a better picture of how physicians practice in the Medicare program.
This new data set includes over nine million rows of data on more than 880,000 physicians and other healthcare professionals in all 50 states, DC and Puerto Rico providing care to Medicare beneficiaries in 2012. The data set presents key information on the provision of services by physicians and how much they are paid for those services, and is organized by provider (National Provider Identifier, or NPI), type of service (Healthcare Common Procedure Coding System, or HCPCS, code), and whether the service was performed in a facility or office setting. This public data set includes the number of services, average submitted charges, average allowed amount, average Medicare payment, and a count of unique beneficiaries treated. CMS takes beneficiary privacy very seriously and we will protect patient-identifiable information by redacting any data in cases where it includes fewer than 11 beneficiaries.
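The redaction rule described above can be sketched as a simple filter. This is an illustration of small-cell suppression under stated assumptions, not CMS's actual procedure: the field names, provider identifiers, and service codes below are hypothetical placeholders, and only the threshold of 11 beneficiaries comes from the text.

```python
MIN_BENEFICIARIES = 11  # threshold stated in the announcement


def redact_small_cells(rows, threshold=MIN_BENEFICIARIES):
    """Drop rows whose unique-beneficiary count falls below the threshold."""
    return [row for row in rows if row["beneficiary_count"] >= threshold]


# Hypothetical rows keyed by provider, service code, and place of service.
sample_rows = [
    {"npi": "0000000001", "hcpcs": "X0001", "in_facility": False,
     "services": 40, "beneficiary_count": 25},
    {"npi": "0000000001", "hcpcs": "X0002", "in_facility": True,
     "services": 12, "beneficiary_count": 10},
    {"npi": "0000000002", "hcpcs": "X0001", "in_facility": False,
     "services": 15, "beneficiary_count": 11},
]

public_rows = redact_small_cells(sample_rows)
for row in public_rows:
    print(row["npi"], row["hcpcs"], row["beneficiary_count"])
```

Rows with exactly 11 beneficiaries are kept, since the stated rule applies only to counts below 11.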
Previously, CMS could not release this information due to a permanent injunction issued by a court in 1979. However, in May 2013, the court vacated this injunction, causing a series of events that has led CMS to be able to make this information available for the first time.
Data to Fuel Research and Innovation
In addition to the public data release, CMS is making slight modifications to the process to request CMS data for research purposes. This will allow researchers to conduct important research at the physician level. As with the public release of information described above, CMS will continue to prohibit the release of patient-identifiable information. For more information about CMS’s disclosures to researchers, please contact the Research Data Assistance Center (ResDAC) at http://www.resdac.org/.
Unprecedented Data Access
This data release follows other CMS efforts to make more data available to the public. Since 2010, the agency has released an unprecedented amount of aggregated data in machine-readable form, with much of it available at http://www.healthdata.gov. These data range from previously unpublished statistics on Medicare spending, utilization, and quality at the state, hospital referral region, and county level, to detailed information on the quality performance of hospitals, nursing homes, and other providers.
In May 2013, CMS released information on the average charges for the 100 most common inpatient services at more than 3,000 hospitals nationwide http://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/Medicare-Provider-Charge-Data/Inpatient.html.
In June 2013, CMS released average charges for 30 selected outpatient procedures http://www.cms.gov/Research-Statistics-Data-and-Systems/Statistics-Trends-and-Reports/Medicare-Provider-Charge-Data/Outpatient.html.
We will continue to work toward harnessing the power of data to promote quality and value, and improve the health of our seniors and persons with disabilities.”

In Austria, council uses app to crowdsource community issues


Springwise: “Civic authorities can’t be everywhere at once and often rely on citizens to inform them of the improvements that need making. The NYPD has already launched its own app for crowdsourcing crime reports, and now the Bürgerforum Vorarlberg mobile app is enabling citizens to flag community issues that need addressing by sending photos and text direct to the council.
Available for all residents to download from the App Store and Google Play, the app has been developed by Vorarlberg news outlets VN and Vol.at. Users who have found a problem on the streets of the Austrian province can take a photo and add a caption, while the app automatically adds a geolocation tag. The issue is then added to a map of complaints and concerns filed by other residents. The idea is that local authorities can then easily see the issues that need to be dealt with.”
Website: www.buergerforum.vol.at
Contact: www.twitter.com/vorarlberg

AU: Revitalising and revising the Innovation Showcase


From the Public Sector Innovation Toolkit unit of the Australian Government: “Do you have any case studies of innovative initiatives in the public service?
An important part of the public sector innovation agenda is sharing examples of innovation in practice. That’s why we created the Public Sector Innovation Showcase.
As noted in the APS Innovation Action Plan, “The Public Sector Innovation Showcase will enable government agencies and departments to share and celebrate case studies of innovation, and to consider how they might apply such innovative practices within their own operations to achieve better outcomes.”
The Showcase was a joint initiative with the Department of Finance and has been operating for a number of years now. We think it is time for some changes and some new examples.
To make the showcase more useful we have incorporated it into this site – you can see the examples here. We are eager to receive more examples from the public sector – from the Commonwealth, state, territory and local governments.
Please get in contact with us if you have an example that might be suitable as a case study of innovation in the public sector. The sort of things we’re after in the case studies are spelled out in our Showcase submission guidance.
We’re seeking examples that demonstrate doing things differently, rather than doing what we do now but slightly better.”

The Data Mining Techniques That Reveal Our Planet's Cultural Links and Boundaries


Emerging Technology From the arXiv: “The habits and behaviors that define a culture are complex and fascinating. But measuring them is a difficult task. What’s more, understanding the way cultures change from one part of the world to another is a task laden with challenges.
The gold standard in this area of science is known as the World Values Survey, a global network of social scientists studying values and their impact on social and political life. Between 1981 and 2008, this survey conducted over 250,000 interviews in 87 societies. That’s a significant amount of data and the work has continued since then. This work is hugely valuable but it is also challenging, time-consuming and expensive.
Today, Thiago Silva at the Universidade Federal de Minas Gerais in Brazil and a few buddies reveal another way to collect data that could revolutionize the study of global culture. These guys study cultural differences around the world using data generated by check-ins on the location-based social network, Foursquare.
That allows these researchers to gather huge amounts of data, cheaply and easily in a short period of time. “Our one-week dataset has a population of users of the same order of magnitude of the number of interviews performed in [the World Values Survey] in almost three decades,” they say.
Food and drink are fundamental aspects of society and so the behaviors and habits associated with them are important indicators. The basic question that Silva and co attempt to answer is: what are your eating and drinking habits? And how do these differ from a typical individual in another part of the world such as Japan, Malaysia, or Brazil?
Foursquare is ideally set up to explore this question. Users “check in” by indicating when they have reached a particular location that might be related to eating and drinking but also to other activities such as entertainment, sport and so on.
Silva and co are only interested in the food and drink preferences of individuals and, in particular, on the way these preferences change according to time of day and geographical location.
So their basic approach is to compare a large number of individual preferences from different parts of the world and see how closely they match or how they differ.
Because Foursquare does not share its data, Silva and co downloaded almost five million tweets containing Foursquare check-ins, URLs pointing to the Foursquare website containing information about each venue. They discarded check-ins that were unrelated to food or drink.
That left them with some 280,000 check-ins related to drink from 160,000 individuals; over 400,000 check-ins related to fast food from 230,000 people; and some 400,000 check-ins relating to ordinary restaurant food or what Silva and co call slow food.
They then divide each of these classes into subcategories. For example, the drink class has 21 subcategories such as brewery, karaoke bar, pub, and so on. The slow food class has 53 subcategories such as Chinese restaurant, Steakhouse, Greek restaurant, and so on.
Each check-in gives the time and geographical location, which allows the team to compare behaviors from all over the world. They compare, for example, eating and drinking times in different countries both during the week and at the weekend. They compare the choices of restaurants, fast food habits and drinking habits by continent and country. They even compare eating and drinking habits in New York, London, and Tokyo.
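The article does not say which similarity measure the team uses, so the following is only a sketch of one plausible way to compare places: turn each place's check-ins into shares per subcategory, then score pairs of places with cosine similarity. The places and check-in counts are invented for illustration.

```python
import math
from collections import Counter


def preference_vector(checkins):
    """Convert a list of subcategory labels into shares that sum to 1."""
    counts = Counter(checkins)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


def cosine_similarity(p, q):
    """Cosine of the angle between two sparse preference vectors."""
    categories = set(p) | set(q)
    dot = sum(p.get(c, 0.0) * q.get(c, 0.0) for c in categories)
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q)


# Hypothetical drink check-ins for three unnamed places.
place_a = preference_vector(["pub"] * 6 + ["brewery"] * 3 + ["karaoke bar"] * 1)
place_b = preference_vector(["pub"] * 5 + ["brewery"] * 4 + ["karaoke bar"] * 1)
place_c = preference_vector(["karaoke bar"] * 8 + ["pub"] * 2)

sim_ab = cosine_similarity(place_a, place_b)
sim_ac = cosine_similarity(place_a, place_c)
print(round(sim_ab, 3), round(sim_ac, 3))
```

With these invented counts, the first two places score as much more alike than the first and third. The same comparison could be repeated per hour of day to capture the timing differences the article mentions.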
The results are a fascinating insight into humanity’s differing habits. Many places have similar behaviors (Malaysia and Singapore, or Argentina and Chile, for example), just as expected given the similarities between these places.
But other resemblances are more unexpected. A comparison of drinking habits shows greater similarity between Brazil and France, separated by the Atlantic Ocean, than between France and England, separated only by the English Channel…
They point out only two major differences. The first is that no Islamic cluster appears in the Foursquare data. Countries such as Turkey are similar to Russia, while Indonesia seems related to Malaysia and Singapore.
The second is that the U.S. and Mexico make up their own individual cluster in the Foursquare data whereas the World Values Survey has them in the “English-speaking” and “Latin American” clusters respectively.
That’s exciting data mining work that has the potential to revolutionize the way sociologists and anthropologists study human culture around the world. Expect to hear more about it.
Ref: http://arxiv.org/abs/1404.1009: You Are What You Eat (and Drink): Identifying Cultural Boundaries By Analyzing Food & Drink Habits In Foursquare”.