New Data for a New Energy Future


(This post originally appeared on the blog of the U.S. Chamber of Commerce Foundation.)

Two growing concerns—climate change and U.S. energy self-sufficiency—have accelerated the search for affordable, sustainable approaches to energy production and use. In this area, as in many others, data-driven innovation is a key to progress. Data scientists are working to help improve energy efficiency and make new forms of energy more economically viable, and are building new, profitable businesses in the process.
In the same way that government data has been used by other kinds of new businesses, the Department of Energy is releasing data that can help energy innovators. At a recent “Energy Datapalooza” held by the department, John Podesta, counselor to the President, summed up the rationale: “Just as climate data will be central to helping communities prepare for climate change, energy data can help us reduce the harmful emissions that are driving climate change.” With electric power accounting for one-third of greenhouse gas emissions in the United States, the opportunities for improvement are great.
The GovLab has been studying the business applications of public government data, or “open data,” for the past year. The resulting study, the Open Data 500, now provides structured, searchable information on more than 500 companies that use open government data as a key business driver. A review of those results shows four major areas where open data is creating new business opportunities in energy and is likely to build many more in the near future.

Commercial building efficiency
Commercial buildings are major energy consumers, and energy costs are a significant business expense. Despite programs like LEED Certification, many commercial buildings waste large amounts of energy. Now a company called FirstFuel, based in Boston, is using open data to drive energy efficiency in these buildings. At the Energy Datapalooza, Swap Shah, the company’s CEO, described how analyzing energy data together with geospatial, weather, and other open data can give a very accurate view of a building’s energy consumption and ways to reduce it. (Sometimes the solution is startlingly simple: According to Shah, the largest source of waste is running heating and cooling systems at the same time.) Other companies are taking on the same kind of task – like Lucid, which provides an operating system that can track a building’s energy use in an integrated way.

Home energy use
A number of companies are finding data-driven solutions for homeowners who want to save money by reducing their energy usage. A key to success is putting together measurements of energy use in the home with public data on energy efficiency solutions. PlotWatt, for example, promises to help consumers “save money with real-time energy tracking” through the data it provides. One of the best-known companies in this area, Opower, uses a psychological strategy: it simultaneously gives people access to their own energy data and lets them compare their energy use to their neighbors’ as an incentive to save. Opower partners with utilities to provide this information, and the Virginia-based company has been successful enough to open offices in San Francisco, London, and Singapore. Soon more and more people will have access to data on their home energy use: Green Button, a government-promoted program implemented by utilities, now gives about 100 million Americans data about their energy consumption.

Solar power and renewable energy
As solar power becomes more efficient and affordable, a number of companies are emerging to support this energy technology. Clean Power Finance, for example, uses its database to connect solar entrepreneurs with sources of capital. In a different way, a company called Solar Census is analyzing publicly available data to find exactly where solar power can be produced most efficiently. The kind of analysis that used to require an on-site survey over several days can now be done in less than a minute with their algorithms.
Other kinds of geospatial and weather data can support other forms of renewable energy. The data will make it easier to find good sites for wind power stations, water sources for small-scale hydroelectric projects, and the best opportunities to tap geothermal energy.

Supporting new energy-efficient vehicles
The Tesla and other electric vehicles are becoming commercially viable, and we will soon see even more efficient vehicles on the road. Toyota has announced that its first fuel-cell cars, which run on hydrogen, will be commercially available by mid-2015, and other auto manufacturers have announced plans to develop fuel-cell vehicles as well. But these vehicles can’t operate without a network to supply power, be it electricity for a Tesla battery or hydrogen for a fuel cell.
It’s a chicken-and-egg problem: People won’t buy large numbers of electric or fuel-cell cars unless they know they can power them, and power stations will be scarce until there are enough vehicles to support their business. Now some new companies are facilitating this transition by giving drivers data-driven tools to find and use the power sources they need. Recargo, for example, provides tools to help electric car owners find charging stations and operate their vehicles.
The development of new energy sources will involve solving social, political, economic, and technological issues. Data science can help develop solutions and bring us more quickly to a new kind of energy future.
Joel Gurin is senior advisor at the GovLab and project director of the Open Data 500. He also currently serves as a fellow of the U.S. Chamber of Commerce Foundation.

Google’s Waze announces government data exchange program with 10 initial partners


Josh Ong at TheNextWeb blog: “Waze today announced “Connected Citizens,” a new government partnership program that will see both parties exchange data in order to improve traffic conditions.

For the program, Waze will provide real-time anonymized crowdsourced traffic data to government departments in exchange for information on public projects like construction, road sensors, and pre-planned road closures.

The first 10 partners include:

  • Rio de Janeiro, Brazil
  • Barcelona, Spain and the Government of Catalonia
  • Jakarta, Indonesia
  • Tel Aviv, Israel
  • San Jose, Costa Rica
  • Boston, USA
  • State of Florida, USA
  • State of Utah, USA
  • Los Angeles County
  • The New York Police Department (NYPD)

Waze has also signed on five other government partners and has received applications from more than 80 municipal groups. The company ran an initial pilot program in Rio de Janeiro where it partnered with the city’s traffic control center to supplement the department’s sensor data with reports from Waze users.

At an event celebrating the launch, Di-Ann Eisnor, head of Growth at Waze, noted that the data exchange will only include public alerts, such as accidents and closures.

“We don’t share anything beyond that, such as where individuals are located and who they are,” she said.

Eisnor also made it clear that Waze isn’t selling the data. GPS maker TomTom came under fire several years ago after customers learned that the company had sold their data to police departments to help find the best places to put speed traps.

“We keep [the data] clean by making sure we don’t have a business model around it,” Eisnor added.

Waze will require that new Connected Citizens partners “prove their dedication to citizen engagement and commit to use Waze data to improve city efficiency.”…”

YouTube for Government


Brandon Feldman at Google Politics & Elections Blog: “From live streams of the State of the Union and legislative hearings, to explainer videos on important issues and Hangouts with constituents, YouTube has become an important platform where citizens engage with their governments and elected officials.
In order to help government officials get a better idea of what YouTube can do, we are launching youtube.com/government101, a one-stop shop where government officials can learn how to get the most out of YouTube as a communication tool.

The site offers a broad range of YouTube advice, from the basics of creating a channel to in-depth guidance on features like live streaming, annotations, playlists and more. We’ve also featured case studies from government offices around the world that are using YouTube in innovative ways.
If you’re a government official, whether you are looking for an answer to a quick question or need a full training on YouTube best practices, we hope this resource will help you engage in a rich dialogue with your constituents and increase transparency within your community….”

Why we’re failing to get the most out of open data


Victoria Lemieux at the WEF Blog: “An unprecedented number of individuals and organizations are finding ways to explore, interpret and use Open Data. Public agencies are hosting Open Data events such as meetups, hackathons and data dives. The potential of these initiatives is great, including support for economic development (McKinsey, 2013), anti-corruption (European Public Sector Information Platform, 2014) and accountability (Open Government Partnership, 2012). But is Open Data’s full potential being realized?
A news item from Computer Weekly casts doubt. A recent report notes that, in the United Kingdom, poor data quality is hindering the government’s Open Data program. The report goes on to explain that – in an effort to make the public sector more transparent and accountable – UK public bodies have been publishing spending records every month since November 2010. The authors of the report, who conducted an analysis of 50 spending-related data releases by the Cabinet Office since May 2010, found that the data was of such poor quality that using it would require advanced computer skills.
Far from being a one-off problem, research suggests that this issue is ubiquitous and endemic. Some estimates indicate that as much as 80 percent of the time and cost of an analytics project is attributable to the need to clean up “dirty data” (Dasu and Johnson, 2003).
In addition to data quality issues, data provenance can be difficult to determine. Knowing where data originates and by what means it has been disclosed is key to being able to trust data. If end users do not trust data, they are unlikely to believe they can rely upon the information for accountability purposes. Establishing data provenance does not “spring full blown from the head of Zeus.” It entails a good deal of effort undertaking such activities as enriching data with metadata – data about data – such as the date of creation, the creator of the data, who has had access to the data over time and ensuring that both data and metadata remain unalterable.
Similarly, if people think that data could be tampered with, they are unlikely to place trust in it; full comprehension of data relies on the ability to trace its origins….”
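
The provenance steps described above can be sketched in code. What follows is a minimal Python illustration, not a method taken from the article: the function names (`make_provenance_record`, `seal`, `verify`) and the record fields are hypothetical. Note also that a hash only makes tampering detectable. Making data genuinely unalterable would additionally need something like digital signatures or write-once storage.

```python
import hashlib
import json


def fingerprint(payload: bytes) -> str:
    """Return the SHA-256 hex digest of the raw bytes."""
    return hashlib.sha256(payload).hexdigest()


def seal(record: dict) -> str:
    """Hash every field of the record except the seal itself."""
    body = {k: v for k, v in record.items() if k != "record_sha256"}
    # sort_keys gives a stable serialization, so the digest is repeatable
    canonical = json.dumps(body, sort_keys=True).encode("utf-8")
    return fingerprint(canonical)


def make_provenance_record(data: bytes, creator: str, created: str,
                           accessed_by: list) -> dict:
    """Bundle metadata (hypothetical field names) with a digest of the data."""
    record = {
        "creator": creator,            # who produced the data
        "created": created,            # date of creation
        "accessed_by": accessed_by,    # who has had access over time
        "data_sha256": fingerprint(data),
    }
    record["record_sha256"] = seal(record)
    return record


def verify(data: bytes, record: dict) -> bool:
    """True only if neither the data nor its metadata changed since sealing."""
    return (fingerprint(data) == record["data_sha256"]
            and seal(record) == record["record_sha256"])
```

A publisher could release the record alongside the dataset, and an end user could call `verify` on a downloaded copy. Any change to the file or to the metadata fields makes the check fail.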

Redesigning that first encounter with online government


Nancy Scola in the Washington Post: “Teardowns,” Samuel Hulick calls them, and by that he means his step-by-step dissections of how some of the world’s most popular digital services — Gmail, Evernote, Instagram — welcome new users. But the term might give an overly negative sense of what Hulick is up to. The Portland, Ore., user-experience designer highlights both the good and bad in his critiques, and his annotated slideshows, under the banner of UserOnboard, have gained a following among design aficionados.

Now Hulick is partnering with two of those fans, a pair of Code for America fellows, to encourage the public to do the same for, say, the process of applying for food stamps. It’s called CitizenOnboard.
Using the original UserOnboard is like taking a tour through some of the digital sites you know best — but with an especially design-savvy friend by your side pointing out the kinks. “The user experience,” or UX on these sites, “is often tacked on haphazardly,” says Hulick, who launched UserOnboard in December 2013 and who is also the author of the recent book “The Elements of User Onboarding.” What he’s looking for in a good UX, he says, is something non-designers can spot, too. “If you were the Web site, what tone would you take? How would you guide people through your process?”
Hulick reviews what’s working and what’s not, and adds a bit of sass: Gmail pre-populates its inbox with a few welcome messages: “Preloading some emails is a nice way to deal with the ‘cold start’ problem,” Hulick notes. Evernote nudges new users to check out its blog and other apps: “It’s like a restaurant rolling out the dessert cart while I’m still trying to decide if I even want to eat there.” Instagram’s first backdrop is a photo of someone taking a picture: “I’m learning how to Instagram by osmosis!”….
CitizenOnboard’s pitch is to get the public to do that same work. They suggest starting with state food stamp programs. Hulick tackled his. The onboarding for Oregon’s SNAP service is 118 slides long, but that’s because there is much to address. In one step, applicants must, using a drop-down menu, identify how those in their family are related to one another. “It took a while to figure out who should be the relation ‘of’ the other,” Hulick notes in his teardown. “In fact, I’m still not 100% sure I got it right.”…”

Happy Birthday, We the People! Marking Three Years of Online Petitions


“On September 22, 2011, we launched We the People to give Americans a new way to petition their government around issues they care about. It works like this: Start a petition, get enough signatures, and the Obama administration will work with policy experts to issue an official response.
It’s three years later, and We the People remains incredibly popular: More than 15 million users have participated, collecting more than 22 million signatures on more than 360,000 petitions. To date, we’ve issued nearly 250 responses to petitions on a wide range of topics, including maintaining an open and innovative internet, reducing student loan debt, improving our economy, and even building a “Death Star.”
The We the People platform has led directly to policy changes and provided new opportunities for dialogue between citizens and their government. That’s part of the reason why, over the course of 2014, an average of response surveys showed a majority of signers thought it was “helpful to hear the Administration’s response,” even if they didn’t agree. Nearly 80 percent said they would use We the People again.
To celebrate We the People’s third birthday, the White House will host the first-ever social meetup for We the People users and petition creators right here at 1600 Pennsylvania Avenue. It will be an exciting chance for users to meet with policy experts and connect with each other in person.
Meanwhile, we continue to work to make We the People even more accessible so that people — no matter where they are on the internet — can use the platform to reach the White House. Beginning in October, third-party websites can submit signatures to We the People on behalf of their own signers, using our soon-to-be-released Write API (which is currently in beta). It’s the result of months of hard work, and we can’t wait to share it with the public.
Check out the infographic below, and take a look at some of the platform’s highlights over the last three years…”

Mapping the Next Frontier of Open Data: Corporate Data Sharing


Stefaan Verhulst at the GovLab (cross-posted at the UN Global Pulse Blog): “When it comes to data, we are living in the Cambrian Age. About ninety percent of the data that exists today has been generated within the last two years. We create 2.5 quintillion bytes of data on a daily basis—equivalent to a “new Google every four days.”
All of this means that we are certain to witness a rapid intensification in the process of “datafication”– already well underway. Use of data will grow increasingly critical. Data will confer strategic advantages; it will become essential to addressing many of our most important social, economic and political challenges.
This explains–at least in large part–why the Open Data movement has grown so rapidly in recent years. More and more, it has become evident that questions surrounding data access and use are emerging as one of the transformational opportunities of our time.
Today, it is estimated that over one million datasets have been made open or public. The vast majority of this open data is government data—information collected by agencies and departments in countries as varied as India, Uganda and the United States. But what of the terabyte after terabyte of data that is collected and stored by corporations? This data is also quite valuable, but it has been harder to access.
The topic of private sector data sharing was the focus of a recent conference organized by the Responsible Data Forum, Data and Society Research Institute and Global Pulse (see event summary). Participants at the conference, which was hosted by The Rockefeller Foundation in New York City, included representatives from a variety of sectors who converged to discuss ways to improve access to private data, that is, the data held by private entities and corporations. The purpose for that access was rooted in a broad recognition that private data has the potential to foster much public good. At the same time, a variety of constraints—notably privacy and security, but also proprietary interests and data protectionism on the part of some companies—hold back this potential.
The framing for issues surrounding sharing private data has been broadly referred to under the rubric of “corporate data philanthropy.” The term refers to an emerging trend whereby companies have started sharing anonymized and aggregated data with third-party users who can then look for patterns or otherwise analyze the data in ways that lead to policy insights and other public good. The term was coined at the World Economic Forum meeting in Davos, in 2011, and has gained wider currency through Global Pulse, a United Nations data project that has popularized the notion of a global “data commons.”
Although still far from prevalent, some examples of corporate data sharing exist….

Help us map the field

A more comprehensive mapping of the field of corporate data sharing would draw on a wide range of case studies and examples to identify opportunities and gaps, and to inspire more corporations to allow access to their data (consider, for instance, the GovLab Open Data 500 mapping for open government data). From a research point of view, the following questions would be important to ask:

  • What types of data sharing have proven most successful, and which ones least?
  • Who are the users of corporate shared data, and for what purposes?
  • What conditions encourage companies to share, and what are the concerns that prevent sharing?
  • What incentives can be created (economic, regulatory, etc.) to encourage corporate data philanthropy?
  • What differences (if any) exist between shared government data and shared private sector data?
  • What steps need to be taken to minimize potential harms (e.g., to privacy and security) when sharing data?
  • What’s the value created from using shared private data?

We (the GovLab; Global Pulse; and Data & Society) welcome your input to add to this list of questions, or to help us answer them by providing case studies and examples of corporate data philanthropy. Please add your examples below, use our Google Form or email them to us at corporatedata@thegovlab.org”

Welcoming the Third Class of Presidential Innovation Fellows


Garren Givens and Ryan Panchadsaram at the White House Blog: “We recently welcomed the newest group of Presidential Innovation Fellows into the federal government. This diverse group represents some of the nation’s most talented and creative civic-minded innovators…
You can learn more about this inspiring group of Fellows here.
Over the next 12 months, these innovators will collaborate and work with change agents inside government on three high-impact initiatives aimed at saving lives, saving taxpayer money, and fueling our economy. These initiatives include:

  • Building a 21st Century Veterans Experience
  • Unleashing the Power of Data Resources to Improve Americans’ Lives
  • Crowdsourcing to Improve Government

Read more about the projects that make up these initiatives, and the previous successes the program has helped shape.
The fellows will be supported by 18F, an innovative group focused on the delivery of digital services across the federal government, and will work alongside the U.S. Digital Service and agency innovators in continuing to build a culture, and best practice within government….”

DrivenData


DrivenData Blog: “As we begin launching our first competitions, we thought it would be a good idea to lay out what exactly we’re trying to do and why….
At DrivenData, we want to bring cutting-edge practices in data science and crowdsourcing to some of the world’s biggest social challenges and the organizations taking them on. We host online challenges, usually lasting 2-3 months, where a global community of data scientists competes to come up with the best statistical model for difficult predictive problems that make a difference.
Just like every major corporation today, nonprofits and NGOs have more data than ever before. And just like those corporations, they are trying to figure out how to make the best use of their data. We work with mission-driven organizations to identify specific predictive questions that they care about answering and can use their data to tackle.
Then we host the online competitions, where experts from around the world vie to come up with the best solution. Some competitors are experienced data scientists in the private sector, analyzing corporate data by day, saving the world by night, and testing their mettle on complex questions of impact. Others are smart, sophisticated students and researchers looking to hone their skills on real-world datasets and real-world problems. Still more have extensive experience with social sector data and want to bring their expertise to bear on new, meaningful challenges – with immediate feedback on how well their solution performs.
Like any data competition platform, we want to harness the power of crowds combined with the increasing prevalence of large, relevant datasets. Unlike other data competition platforms, our primary goal is to create actual, measurable, lasting positive change in the world with our competitions. At the end of each challenge, we work with the sponsoring organization to integrate the winning solutions, giving them the tools to drive real improvements in their impact….
We are launching soon and we want you to join us!
If you want to get updates about our launch this fall with exciting, real competitions, please sign up for our mailing list here and follow us on Twitter: @drivendataorg.
If you are a data scientist, feel free to create an account and start playing with our first sandbox competitions.
If you are a nonprofit or public sector organization, and want to squeeze every drop of mission effectiveness out of your data, check out the info on our site and let us know!”

What Is Big Data?


datascience@berkeley Blog: ““Big Data.” It seems like the phrase is everywhere. The term was added to the Oxford English Dictionary in 2013, appeared in Merriam-Webster’s Collegiate Dictionary by 2014, and Gartner’s just-released 2014 Hype Cycle shows “Big Data” passing the “Peak of Inflated Expectations” and on its way down into the “Trough of Disillusionment.” Big Data is all the rage. But what does it actually mean?
A commonly repeated definition cites the three Vs: volume, velocity, and variety. But others argue that it’s not the size of data that counts, but the tools being used, or the insights that can be drawn from a dataset.
To settle the question once and for all, we asked 40+ thought leaders in publishing, fashion, food, automobiles, medicine, marketing and every industry in between how exactly they would define the phrase “Big Data.” Their answers might surprise you! Take a look below to find out what big data is:

  1. John Akred, Founder and CTO, Silicon Valley Data Science
  2. Philip Ashlock, Chief Architect of Data.gov
  3. Jon Bruner, Editor-at-Large, O’Reilly Media
  4. Reid Bryant, Data Scientist, Brooks Bell
  5. Mike Cavaretta, Data Scientist and Manager, Ford Motor Company
  6. Drew Conway, Head of Data, Project Florida
  7. Rohan Deuskar, CEO and Co-Founder, Stylitics
  8. Amy Escobar, Data Scientist, 2U
  9. Josh Ferguson, Chief Technology Officer, Mode Analytics
  10. John Foreman, Chief Data Scientist, MailChimp

FULL LIST at datascience@berkeley Blog”