New Tool in Fighting Corruption: Open Data


Martin Tisne at Omidyar Network: “Yesterday in Brisbane, the G20 threw its weight behind open data by featuring it prominently in the G20 Anti-Corruption Working Group action plan. Specifically, the action plan calls for efforts in three related areas:

(1)   Prepare a G20 compendium of good practices and lessons learned on open data and its application in the fight against corruption
(2)   Prepare G20 Open Data Principles, including identifying areas or sectors where their application is particularly useful
(3)   Complete self‑assessments of G20 country open data frameworks and initiatives

Open data describes information that is not simply public, but that has been published in a manner that makes it easy to access and easy to compare and connect with other information.
This matters for anti-corruption – if you are a journalist or a civil society activist investigating bribery and corruption, those connections are everything. They tell you that an anonymous person (e.g. ‘Mr Smith’) who owns an obscure company registered in a tax haven is linked to another company that has been illegally exporting timber from a neighboring country. That the same Mr. Smith is also the son-in-law of the mining minister of yet another country, who herself has been accused of embezzling mining revenues. As we have written elsewhere on this blog, investigative journalists, prosecution authorities, and civil society groups all need access to this linked data for their work.
The action plan also links open data to the wider G20 agenda, citing its impact on the ability of businesses to make better investment decisions. You can find the full detail here….”
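As a rough illustration of the record linkage described above, the sketch below joins two small open datasets on a shared company identifier. All names, fields, and records are hypothetical; real investigations work over far messier data, but the principle is the same: shared, machine-readable identifiers make the connections computable.

```python
# Hypothetical sketch: joining two open datasets on a shared company
# identifier to surface the kind of hidden link described above.
company_registry = [
    {"company_id": "BVI-0042", "name": "Obscure Holdings Ltd", "owner": "Mr Smith"},
    {"company_id": "BVI-0077", "name": "Harmless Trading Ltd", "owner": "Ms Jones"},
]

export_violations = [
    {"company_id": "BVI-0042", "violation": "illegal timber exports"},
]

# Because both datasets publish the same machine-readable identifier,
# a single join is enough to reveal the connection.
violations_by_company = {v["company_id"]: v for v in export_violations}

for company in company_registry:
    hit = violations_by_company.get(company["company_id"])
    if hit:
        print(f"{company['owner']} ({company['name']}) is linked to: {hit['violation']}")
```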

Giving Americans Easier Access to Their Own Data


Nick Sinai and Rajive Mathur at the White House Blog: “…One of the newest My Data efforts is the IRS tool, Get Transcript. Launched in 2014, Get Transcript allows taxpayers to securely view, print, and download a PDF record of the last three years of their IRS tax account. Get Transcript has produced over 17 million tax transcripts, reducing phone, mail, or in-person requests by approximately 40% from last year. Secure access to your own tax data makes it easier to demonstrate your income to prospective lenders and employers, or to help with tax preparation. What was a paper-based process that took multiple days is now instantaneous and easy for the American taxpayer.
The IRS is an agency that serves virtually every American, and runs one of the nation’s largest customer service operations. To give an idea of the size and scope of responsibilities, the Internal Revenue Service:

  • receives over 80 million phone calls per year, mostly from people eager to hear the status of their refund, understand a notice, make a payment, or update their account;
  • sends out nearly 200 million paper notices annually; and
  • receives over 50 million unique visitors to its website each month during filing season.

Meeting this demand from citizens is a challenge with limited staff and resources. Nonetheless, the IRS is committed to improving service to citizens across all of its channels – whether by phone, in person, or, especially, through its digital services.
Building on the initial success of Get Transcript, there are more exciting improvements to IRS services in the pipeline. For instance, millions of taxpayers contact the IRS every year to ask about their tax status, whether their filing was received, if their refund was processed, or if their payment posted. In the future, taxpayers will be able to answer these types of questions independently by signing in to a mobile-friendly, personalized online account to conduct transactions and see all of their tax information in one place. Users will be able to view account history and balance, make payments or see payment status, or even authorize their tax preparer to view or make changes to their tax return. This will also include the ability to download personal tax information in an easy-to-use, machine-readable format so that taxpayers can share it with trusted recipients if desired….”
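The post does not say which machine-readable format the IRS will adopt. As a minimal sketch, assuming a JSON export along the lines below (all field names are assumptions, not a documented schema), a taxpayer or an authorized tool could verify income in a few lines:

```python
import json

# Hypothetical transcript record: the IRS has not published a schema,
# so these field names are illustrative assumptions only.
downloaded_transcript = """
{
  "tax_year": 2013,
  "filing_status": "single",
  "adjusted_gross_income": 52000,
  "total_tax": 6800
}
"""

record = json.loads(downloaded_transcript)
print(f"{record['tax_year']}: AGI ${record['adjusted_gross_income']:,}")
```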

OpenUp Corporate Data while Protecting Privacy


Article by Stefaan G. Verhulst and David Sangokoya (The GovLab) for the OpenUp? Blog: “Consider a few numbers: By the end of 2014, the number of mobile phone subscriptions worldwide is expected to reach 7 billion, nearly equal to the world’s population. More than 1.82 billion people communicate on some form of social network, and almost 14 billion sensor-laden everyday objects (trucks, health monitors, GPS devices, refrigerators, etc.) are now connected and communicating over the Internet, creating a steady stream of real-time, machine-generated data.
Much of the data generated by these devices is today controlled by corporations. These companies are in effect “owners” of terabytes of data and metadata. Companies use this data to aggregate, analyze, and track individual preferences, provide more targeted consumer experiences, and add value to the corporate bottom line.
At the same time, even as we witness a rapid “datafication” of the global economy, access to data is emerging as an increasingly critical issue, essential to addressing many of our most important social, economic, and political challenges. While the rise of the Open Data movement has opened up over a million datasets around the world, much of this openness is limited to government (and, to a lesser extent, scientific) data. Access to corporate data remains extremely limited. This is a lost opportunity. If corporate data—in the form of Web clicks, tweets, online purchases, sensor data, call data records, etc.—were made available in a de-identified and aggregated manner, researchers, public interest organizations, and third parties would gain greater insight into patterns and trends that could help inform better policies and lead to greater public good (including combatting Ebola).
Corporate data sharing holds tremendous promise. But its potential—and its limitations—are also poorly understood. In what follows, we share early findings of our efforts to map this emerging open data frontier, along with a set of reflections on how to safeguard privacy and other citizen and consumer rights while sharing. Understanding the practice of corporate data sharing—and assessing the associated risks—is an essential step in increasing access to socially valuable data held by businesses today. This is a challenge certainly worth exploring during the forthcoming OpenUp conference!
Understanding and classifying current corporate data sharing practices
Corporate data sharing remains very much a fledgling field. There has been little rigorous analysis of different ways or impacts of sharing. Nonetheless, our initial mapping of the landscape suggests there have been six main categories of activity—i.e., ways of sharing—to date:…
Assessing risks of corporate data sharing
Although shared corporate data offers several benefits for researchers, public interest organizations, and other companies, there are risks, especially regarding personally identifiable information (PII). When aggregated, PII can help reveal trends and broad demographic patterns. But if PII is inadequately scrubbed and aggregated data is linked to specific individuals, the result can be identity theft, discrimination, profiling, and other violations of individual freedom. It can also carry significant legal ramifications for corporate data providers….”
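One common safeguard behind a “de-identified and aggregated” release is a minimum group size: records are rolled up into groups, and any group with fewer than k members is suppressed, so no published row describes just a handful of people. A minimal sketch under that assumption, with hypothetical call records:

```python
from collections import Counter

K = 5  # minimum group size before an aggregate row may be published

# Hypothetical de-identified call records, grouped only by district.
call_records = [
    {"caller": "user-001", "district": "North"},
    {"caller": "user-002", "district": "North"},
    {"caller": "user-003", "district": "North"},
    {"caller": "user-004", "district": "North"},
    {"caller": "user-005", "district": "North"},
    {"caller": "user-006", "district": "South"},
]

counts = Counter(rec["district"] for rec in call_records)

# Publish only aggregates large enough to blur individual identities.
released = {district: n for district, n in counts.items() if n >= K}
suppressed = [district for district, n in counts.items() if n < K]

print("published aggregates:", released)        # {'North': 5}
print("suppressed (group below k):", suppressed)  # ['South']
```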

The New Thing in Google Flu Trends Is Traditional Data


In the New York Times: “Google is giving its Flu Trends service an overhaul — “a brand new engine,” as it announced in a blog post on Friday.

The new thing is actually traditional data from the Centers for Disease Control and Prevention that is being integrated into the Google flu-tracking model. The goal is greater accuracy, after the Google service was criticized for consistently overestimating flu outbreaks in recent years.

The main critique came in an analysis done by four quantitative social scientists, published earlier this year in an article in Science magazine, “The Parable of Google Flu: Traps in Big Data Analysis.” The researchers found that the most accurate flu predictor was a data mash-up that combined Google Flu Trends, which monitored flu-related search terms, with the official C.D.C. reports from doctors on influenza-like illness.

The Google Flu Trends team is heeding that advice. In the blog post, Christian Stefansen, a Google senior software engineer, wrote, “We’re launching a new Flu Trends model in the United States that — like many of the best performing methods in the literature — takes official CDC flu data into account as the flu season progresses.”

Google’s flu-tracking service has had its ups and downs. Its triumph came in 2009, when it gave an advance signal of the severity of the H1N1 outbreak, two weeks or so ahead of official statistics. In a 2009 article in Nature explaining how Google Flu Trends worked, the company’s researchers did, as the Friday post notes, say that the Google service was not intended to replace official flu surveillance methods and that it was susceptible to “false alerts” — anything that might prompt a surge in flu-related search queries.

Yet those caveats came a couple of pages into the Nature article. And Google Flu Trends became a symbol of the superiority of the new, big data approach — computer algorithms mining data trails for collective intelligence in real time. To enthusiasts, it seemed so superior to the antiquated method of collecting health data that involved doctors talking to patients, inspecting them and filing reports.

But Google’s flu service greatly overestimated the number of cases in the United States in the 2012-13 flu season — a well-known miss — and, according to the research published this year, has persistently overstated flu cases over the years. In the Science article, the social scientists called it “big data hubris.”…”
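The mash-up the researchers recommended, combining a search-based nowcast with lagged official CDC reports, can be pictured as a simple blend whose weight is fit on past seasons. The sketch below is illustrative only, not Google’s actual model, and all numbers are made up:

```python
# Illustrative blend of two flu signals: a timely search-based estimate
# and lagged official CDC reports. Fit the blend weight on past data,
# then apply it to the current week. All values are invented.
past_search_signal = [2.1, 2.8, 3.5, 4.9]  # search-query flu estimates
past_cdc_lagged    = [1.9, 2.5, 3.4, 4.2]  # official ILI, arrives late
past_actual        = [2.0, 2.6, 3.6, 4.5]  # ground truth, known afterwards

# Grid-search the weight w minimizing squared error of w*search + (1-w)*cdc.
best_w, best_err = 0.0, float("inf")
for i in range(101):
    w = i / 100
    err = sum((w * s + (1 - w) * c - a) ** 2
              for s, c, a in zip(past_search_signal, past_cdc_lagged, past_actual))
    if err < best_err:
        best_w, best_err = w, err

this_week_search, last_cdc_report = 5.6, 4.8
estimate = best_w * this_week_search + (1 - best_w) * last_cdc_report
print(f"blend weight w = {best_w:.2f}, combined flu estimate = {estimate:.2f}")
```

The design point is the one the Science authors made: the timely-but-noisy signal and the accurate-but-lagged signal are complementary, so a fitted combination beats either alone.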

Governing the Smart, Connected City


Blog by Susan Crawford at HBR: “As politics at the federal level becomes increasingly corrosive and polarized, with trust in Congress and the President at historic lows, Americans still celebrate their cities. And cities are where the action is when it comes to using technology to thicken the mesh of civic goods — more and more cities are using data to animate and inform interactions between government and citizens to improve wellbeing.
Every day, I learn about some new civic improvement that will become possible when we can assume the presence of ubiquitous, cheap, and unlimited data connectivity in cities. Some of these are made possible by the proliferation of smartphones; others rely on the increasing number of internet-connected sensors embedded in the built environment. In both cases, the constant is data. (My new book, The Responsive City, written with co-author Stephen Goldsmith, tells stories from Chicago, Boston, New York City and elsewhere about recent developments along these lines.)
For example, with open fiber networks in place, sending video messages will become as accessible and routine as sending email is now. Take a look at rhinobird.tv, a free, lightweight, open-source video service that works in browsers (no special download needed) and allows anyone to create a hashtag-driven “channel” for particular events and places. A debate or protest could be viewed from a thousand perspectives. Elected officials and public employees could easily hold streaming, virtual town hall meetings.
Given all that video and all those livestreams, we’ll need curation and aggregation to make sense of the flow. That’s why visualization norms, still in their infancy, will become a greater part of literacy. When the Internet Archive attempted late last year to “map” 400,000 hours of television news against worldwide locations, it came up with pulsing blobs of attention. Although visionary Kevin Kelly has been talking about data visualization as a new form of literacy for years, city governments still struggle with presenting complex and changing information in standard, easy-to-consume ways.
Plenar.io is one attempt to resolve this. It’s a platform developed by former Chicago Chief Data Officer Brett Goldstein that allows public datasets to be combined and mapped, with easy-to-see relationships between, for example, weather and crime on a single city block. (A sample question anyone can ask of Plenar.io: “Tell me the story of 700 Howard Street in San Francisco.”) Right now, Plenar.io’s visual norm is a map, but it’s easy to imagine other forms of presentation that could become standard. All the city has to do is open up its widely varying datasets…”
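The kind of query behind “tell me the story of one address” — every event from every dataset near a point, over a time window — could look roughly like the sketch below. The host, endpoint, and parameter names are assumptions for illustration, not Plenar.io’s documented API:

```python
import requests

# Hypothetical point-in-time query: all events within ~one block of an
# address across whatever open datasets the platform has ingested.
params = {
    "lat": 37.7854,    # roughly 700 Howard Street, San Francisco
    "lon": -122.4005,
    "radius_m": 100,   # about one city block
    "start": "2014-01-01",
    "end": "2014-11-01",
}

# Placeholder host and endpoint, for illustration only.
response = requests.get("https://plenario.example.org/v1/point", params=params)
for event in response.json().get("events", []):
    print(event["dataset"], event["date"], event["description"])
```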

Nine Lessons for Bridging the Gap between Cities and Citizens


Soren Gigler at the World Bank Blog: “…Moving towards a citizen-centered model of government is critical for achieving better results.
But what does this mean in practice? What are some of the bottlenecks and pitfalls of such an approach?
Here are nine lessons learned from our work on Open Government and Citizen Engagement programs around the world.

  1. Open Government is more than just making Government more open and transparent. It is about rebalancing the “governance” and power structure between government institutions, civil society, the private sector and citizens.
  2. Openness and accountability of government is the basis for building a relationship of trust for effective civic participation. It can fundamentally alter the relationship between government and citizens.
  3. Open Government programs are not effective if they are not embedded in much broader institutional and cultural changes within government and fully integrated into the government’s overall economic and social development goals.
  4. New technologies can be powerful enablers to strengthen existing transparency and social accountability mechanisms that empower citizens and traditionally excluded groups. Technologies by themselves, however, are not transformational; they need to be closely embedded in the local socio-political context and amplify existing social accountability and governance processes.
  5. Enhancing the capabilities of the urban poor, youth, and minorities to engage in policy debates is as important as strengthening the capacity of government institutions to effectively respond to citizen engagement.
  6. Effective Open Government programs not only enhance the openness and responsiveness of governments but also foster the inclusiveness of institutions.
  7. It’s critical to recognize that Open Government initiatives are not just about learning how to better listen to citizens. It’s also about how to become more responsive to them and their expressed needs.
  8. Civil society plays a central role in enhancing government accountability. CSOs can form effective bridges between government and citizens. Improved government openness does not automatically translate into effective use of information by citizens. CSOs are critical ‘infomediaries’ that can strengthen the capabilities of poor communities to access, evaluate, and act upon the information provided.
  9.  A genuine process of political and institutional reforms can grow out of an effective alliance between reform-minded policymakers, civil society and private sector leaders. Thus, open governance reforms need to be driven by the local socio-economic, political and cultural context….”

USAID establishes its first open data policy


Billy Mitchell at FedScoop: “The U.S. Agency for International Development jumped on the open data wave last week, announcing its first-ever policy for sharing its data sets and tools with the public in a central repository.

Referred to as Automated Directives System 579, the open data policy is a hat tip to President Barack Obama’s directive on transparency and open government five years ago, and comes after the agency’s Frontiers in Development Forum in September, which addressed pathways for innovation in its mission to support impoverished countries. With the new policy, USAID will provide a framework to open its agency-funded data to the public and publish it in a central location, making it easy to consume and use.
“USAID has long been a data-driven and evidence-based Agency, but never has the need been greater to share our data with a diverse set of partners—including the general public—to improve development outcomes,” wrote Angelique Crumbly, USAID’s performance improvement officer, and Brandon Pustejovsky, chief data officer for USAID, in a blog post. “For the first time in history, we have the tools, technologies and approaches to end extreme poverty within two decades. And while many of these new innovations were featured at our recent Frontiers in Development Forum, we also recognize that they largely rely on an ongoing stream of data (and new insights generated by that data) to ensure their appropriate application.”…

USAID’s Development Data Library (DDL) and open data will be hosted on the USAID website, which already hosts a long list of databases. USAID has also started a GitHub page for feedback on the data.”

The New We the People Write API, and What It Means for You


White House Blog by Leigh Heyman: “The White House petitions platform, We the People, just became more accessible and open than ever before. We are very excited to announce the launch of the “write” version of the Petitions Application Programming Interface, or “API.”
Starting today, people can sign We the People petitions even when they’re not on WhiteHouse.gov. Users can now sign through third-party platforms, including other petition services, or even their own websites or blogs. All of those signatures, once validated, will count towards a petition’s objective of meeting the 100,000-signature threshold needed for an official White House response.
We the People started with a simple goal: to give more Americans a way to reach their government. To date, the platform has been more successful than we could have imagined, with more than 16 million users creating and signing more than 360,000 petitions.
We launched our Write API beta test last year, and since then we’ve been hard at work, both internally and in collaboration with our beta test participants. Last spring, as part of the National Day of Civic Hacking, we hosted a hackathon right here at the White House, where our engineers spent a day sitting side-by-side with our beta testers to help get our code and theirs ready for the big day.
That big day has finally come.
Click here if you want to get started right away, or read on to learn more about the Petitions Write API….”
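A third-party site submitting a signature might do something like the sketch below. The endpoint path, field names, and key parameter are assumptions for illustration; the official Petitions API documentation defines the real contract, and, as the post notes, every signature is validated server-side before it counts toward the 100,000 threshold.

```python
import json
import urllib.request

API_KEY = "YOUR_API_KEY"  # issued to registered third-party applications

# Field names below are illustrative assumptions, not the documented schema.
signature = {
    "petition_id": "example-petition-id",
    "email": "signer@example.com",
    "first_name": "Jane",
    "last_name": "Doe",
}

req = urllib.request.Request(
    "https://api.whitehouse.gov/v1/signatures.json?api_key=" + API_KEY,
    data=json.dumps(signature).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# The response would indicate whether the signature was accepted for validation.
with urllib.request.urlopen(req) as resp:
    print(resp.status, resp.read().decode())
```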

The Role of Open Data in Choosing a Neighborhood


PlaceILive Blog: “To what extent is it important to get familiar with our environment?
If we think about how the world around us has changed over the years, it is not surprising that, while walking to work, we might come across new little shops, restaurants, or gas stations we had never noticed before. Likewise, how many times have we wandered for hours just to find a green space for a run, only to discover that the one we found was even more polluted than other urban areas?
Citizens are not always properly informed about the evolution of the places they live in. That is why it is crucial for people to have constantly up-to-date, accurate information about the neighborhood they have chosen or are about to choose.
London is clear evidence of how transparency in providing data is fundamental to succeeding as a Smart City.
The GLA’s London Datastore, for instance, is a public platform of datasets with up-to-date figures on the city’s main services, as well as on residents’ lifestyles and environmental risks. These data are then made more easily accessible to the community through the London Dashboard.
The importance of providing free information is also evident in the integration of maps, which are an efficient means of geolocation. A map on which it is easy to find all the services you need close by can be decisive in the search for a location.
(Image: Smart London Plan)
The Open Data Index, published by the Open Knowledge Foundation in 2013, is another useful tool for data retrieval: it ranks countries by scores based on the openness and availability of data such as transport timetables and national statistics.
Here it is possible to check the UK Open Data Census and the US City Open Data Census.
As noted, making open data available and easily findable online has not only been a success for US cities but has also benefited app makers and civic hackers. Lauren Reid, a spokesperson at Code for America, told Government Technology: “The more data we have, the better picture we have of the open data landscape.”
That, on the whole, is what PlaceILive puts its biggest effort into: fostering a new awareness of the environment by providing free information, in order to help citizens choose the best place they can live.
The outcome is quickly explained: the website’s homepage offers visitors the chance to type in an address of interest, displaying an overview of neighborhood indicators and a Life Quality Index calculated for every point on the map.
Searching for the nearest medical institutions, schools, or ATMs thus becomes immediate and clear, as does surveying general information about the community. Moreover, the data’s reliability and accessibility are constantly reviewed by a strong team of professionals with expertise in data analysis, mapping, IT architecture, and global markets.
For the moment the company’s work is focused on London, Berlin, Chicago, San Francisco, and New York, with the longer-term goal of covering more than 200 cities.
In the US City Open Data Census, San Francisco’s top score stands as proof of the city’s work in putting technological expertise at everyone’s disposal and in meeting users’ needs through a meticulous selection of datasets. Building on this, San Francisco is partnering with the University of Chicago on a data analytics dashboard for sustainability performance statistics, the Sustainable Systems Framework, expected to be released in beta by the end of the first quarter of 2015.
 
Another remarkable collaboration in the spread of open data comes from the Bartlett Centre for Advanced Spatial Analysis (CASA) at University College London (UCL). Oliver O’Brien, a researcher in the UCL Department of Geography and a software developer at CASA, is one of the contributors to this cause.
Among his products, an interesting accomplishment is London’s CityDashboard, a real-time control panel of spatial data reports. The web page also lets users view the data translated into a simplified map and look at other UK cities’ dashboards.
His Bike Share Map, moreover, is a live global view of bicycle-sharing systems in over a hundred cities around the world; bike sharing has recently drawn greater public attention as a novel form of transportation, above all in Europe and China….”
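PlaceILive does not publish the formula behind the Life Quality Index mentioned above, but a composite score of this kind is typically a weighted average of normalized indicators. A minimal sketch, with hypothetical indicators, scales, and weights:

```python
# Hypothetical composite index: normalize each neighborhood indicator to
# the 0-1 range, then take a weighted average. Indicators, normalization
# scales, and weights are all invented for illustration.
indicators = {                    # raw values for one address
    "schools_within_1km": 3,
    "air_quality_index": 62,      # 0-100, higher is better
    "crime_per_1000": 18,         # lower is better
}

normalized = {
    "schools_within_1km": min(indicators["schools_within_1km"] / 5, 1.0),
    "air_quality_index": indicators["air_quality_index"] / 100,
    # Invert crime so that 1.0 means safest.
    "crime_per_1000": 1 - min(indicators["crime_per_1000"] / 50, 1.0),
}

weights = {"schools_within_1km": 0.3, "air_quality_index": 0.4, "crime_per_1000": 0.3}

life_quality_index = 100 * sum(weights[k] * normalized[k] for k in weights)
print(f"Life Quality Index: {life_quality_index:.0f}/100")
```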

Big Thinkers. Big Data. Big Opportunity: Announcing the LinkedIn Economic Graph Challenge


At the LinkedIn Official Blog: “LinkedIn’s vision is to create economic opportunity for every member of the global workforce. Facilitating economic empowerment is a big task that will require bold thinking by smart, passionate individuals and groups. Today, we’re kicking off an initiative that aims to encourage this type of big thinking: the LinkedIn Economic Graph Challenge.
The LinkedIn Economic Graph Challenge is an idea that emerged from the development of the Economic Graph, a digital mapping of the global economy comprising a profile for every professional, company, and job opportunity, the skills required to obtain those opportunities, every higher education organization, and all the professionally relevant knowledge associated with each of these entities. With these elements in place, we can connect talent with opportunity at massive scale.
We are launching the LinkedIn Economic Graph Challenge to encourage researchers, academics, and data-driven thinkers to propose how they would use data from LinkedIn to solve some of the most challenging economic problems of our times. We invite anyone who is interested to submit your most innovative, ambitious ideas. In return, we will recognize the three strongest proposals for using data from LinkedIn to generate a positive impact on the global economy, and present the team and/or individual with a $25,000 (USD) research award and the resources to complete their proposed research, with the potential to have it published….
We look forward to your submissions! For more information, please visit the LinkedIn Economic Graph Challenge website….”
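The Economic Graph described above is, at bottom, a typed graph: member, company, job, skill, and school nodes joined by typed edges, with “connecting talent with opportunity” reducing to matching over that graph. The schema below is a hypothetical simplification for illustration, not LinkedIn’s actual data model:

```python
# Hypothetical sketch of an economic graph: typed nodes and typed edges,
# with job matching as a subset check over required vs. held skills.
from collections import defaultdict

nodes = {
    "member:jane":  {"type": "member"},
    "company:acme": {"type": "company"},
    "job:acme-42":  {"type": "job_opening"},
    "skill:python": {"type": "skill"},
}

edges = [
    ("member:jane", "has_skill", "skill:python"),
    ("job:acme-42", "requires_skill", "skill:python"),
    ("company:acme", "posted", "job:acme-42"),
]

skills_of = defaultdict(set)   # member -> skills held
needs = defaultdict(set)       # job -> skills required
for src, rel, dst in edges:
    if rel == "has_skill":
        skills_of[src].add(dst)
    elif rel == "requires_skill":
        needs[src].add(dst)

# Connect talent with opportunity: members whose skills cover a job's needs.
for job, required in needs.items():
    for member, skills in skills_of.items():
        if required <= skills:
            print(f"{member} matches {job}")
```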