Welcoming the Third Class of Presidential Innovation Fellows


Garren Givens and Ryan Panchadsaram at the White House Blog: “We recently welcomed the newest group of Presidential Innovation Fellows into the federal government. This diverse group represents some of the nation’s most talented and creative civic-minded innovators…
You can learn more about this inspiring group of Fellows here.
Over the next 12 months, these innovators will collaborate and work with change agents inside government on three high-impact initiatives aimed at saving lives, saving taxpayer money, and fueling our economy. These initiatives include:

  • Building a 21st Century Veterans Experience
  • Unleashing the Power of Data Resources to Improve Americans’ Lives
  • Crowdsourcing to Improve Government

Read more about the projects that make up these initiatives, and the previous successes the program has helped shape.
The fellows will be supported by 18F, an innovative group focused on the delivery of digital services across the federal government, and will work alongside the U.S. Digital Service and agency innovators in continuing to build a culture and best practices within government….”

DrivenData


DrivenData Blog: “As we begin launching our first competitions, we thought it would be a good idea to lay out what exactly we’re trying to do and why….
At DrivenData, we want to bring cutting-edge practices in data science and crowdsourcing to some of the world’s biggest social challenges and the organizations taking them on. We host online challenges, usually lasting 2-3 months, where a global community of data scientists competes to come up with the best statistical model for difficult predictive problems that make a difference.
Just like every major corporation today, nonprofits and NGOs have more data than ever before. And just like those corporations, they are trying to figure out how to make the best use of their data. We work with mission-driven organizations to identify specific predictive questions that they care about answering and can use their data to tackle.
Then we host the online competitions, where experts from around the world vie to come up with the best solution. Some competitors are experienced data scientists in the private sector, analyzing corporate data by day, saving the world by night, and testing their mettle on complex questions of impact. Others are smart, sophisticated students and researchers looking to hone their skills on real-world datasets and real-world problems. Still more have extensive experience with social sector data and want to bring their expertise to bear on new, meaningful challenges – with immediate feedback on how well their solution performs.
Like any data competition platform, we want to harness the power of crowds combined with the increasing prevalence of large, relevant datasets. Unlike other data competition platforms, our primary goal is to create actual, measurable, lasting positive change in the world with our competitions. At the end of each challenge, we work with the sponsoring organization to integrate the winning solutions, giving them the tools to drive real improvements in their impact….
We are launching soon and we want you to join us!
If you want to get updates about our launch this fall with exciting, real competitions, please sign up for our mailing list here and follow us on Twitter: @drivendataorg.
If you are a data scientist, feel free to create an account and start playing with our first sandbox competitions.
If you are a nonprofit or public sector organization, and want to squeeze every drop of mission effectiveness out of your data, check out the info on our site and let us know! “

What Is Big Data?


datascience@berkeley Blog: ““Big Data.” It seems like the phrase is everywhere. The term was added to the Oxford English Dictionary in 2013, appeared in Merriam-Webster’s Collegiate Dictionary by 2014, and Gartner’s just-released 2014 Hype Cycle shows “Big Data” passing the “Peak of Inflated Expectations” and on its way down into the “Trough of Disillusionment.” Big Data is all the rage. But what does it actually mean?
A commonly repeated definition cites the three Vs: volume, velocity, and variety. But others argue that it’s not the size of data that counts, but the tools being used, or the insights that can be drawn from a dataset.
To settle the question once and for all, we asked 40+ thought leaders in publishing, fashion, food, automobiles, medicine, marketing and every industry in between how exactly they would define the phrase “Big Data.” Their answers might surprise you! Take a look below to find out what big data is:

  1. John Akred, Founder and CTO, Silicon Valley Data Science
  2. Philip Ashlock, Chief Architect of Data.gov
  3. Jon Bruner, Editor-at-Large, O’Reilly Media
  4. Reid Bryant, Data Scientist, Brooks Bell
  5. Mike Cavaretta, Data Scientist and Manager, Ford Motor Company
  6. Drew Conway, Head of Data, Project Florida
  7. Rohan Deuskar, CEO and Co-Founder, Stylitics
  8. Amy Escobar, Data Scientist, 2U
  9. Josh Ferguson, Chief Technology Officer, Mode Analytics
  10. John Foreman, Chief Data Scientist, MailChimp

FULL LIST at datascience@berkeley Blog”

Opportunities for strengthening open meetings with open data


From the Sunlight Foundation Blog: “Governments aren’t alone in thinking about how open data can help improve the open meetings process. There are an increasing number of tools governments can use to help bolster open meetings with open data. From making public records generated by meetings more easily accessible and reusable online to inviting the public to participate in the decision-making process from wherever they may be, these tools allow governments to upgrade open meetings for the opportunities and demands of the 21st Century.
Improving open meetings with open data may involve taking advantage of simple solutions already freely available online, developing new tools within government, using open-source tools, or investing in new software, but it can all help serve the same goal: bringing more information online where it’s easily accessible to the public….
It’s not just about making open meetings more accessible, either. More communities are thinking about how they can bring government to the people. Open meetings are typically held in government-designated buildings at specified times, but are those locations and times truly accessible for most of the public or for those who may be most directly impacted by what’s being discussed?
Technology presents opportunities for governments to engage with the public outside of regularly scheduled meetings. Tools like Speakup and Textizen, for example, are being used to increase public participation in the general decision-making process. A continually increasing array of tools provides new ways for government and the public to identify issues, share ideas, and work toward solutions, even outside of open meetings. Boston, for example, took an innovative approach to this issue with its City Hall To Go truck and other efforts, bringing government services to locations around the city rather than requiring people to come to a government building…”

Policy bubbles: What factors drive their birth, maturity and death?


Moshe Maor at LSE Blog: “A policy bubble is a real or perceived policy overreaction that is reinforced by positive feedback over a relatively long period of time. This type of policy imposes objective and/or perceived social costs without producing offsetting objective and/or perceived benefits over a considerable length of time. A case in point is when government spending on a policy problem increases due to public demand for more policy while the severity of the problem decreases over an extended period of time. Another case is when governments raise ‘green’ or other standards due to public demand while the severity of the problem does not justify this move…
Drawing on insights from a variety of fields – including behavioural economics, psychology, sociology, political science and public policy – three phases of the life-cycle of a policy bubble may be identified: birth, maturity and death. A policy bubble may emerge when certain individuals perceive opportunities to gain from public policy or to exploit it by rallying support for the policy, promoting word-of-mouth enthusiasm and widespread endorsement of the policy, heightening expectations for further policy, and increasing demand for this policy….
How can one identify a policy bubble? A policy bubble may be identified by measuring parliamentary concerns, media concerns, public opinion regarding the policy at hand, and the extent of a policy problem, against the budget allocation to said policy over the same period, preferably over 50 years or more. Measuring the operation of different transmission mechanisms in emotional contagion and human herding, particularly the spread of social influence and feeling, can also work to identify a policy bubble.
Here, computer-aided content analysis of verbal and non-verbal communication in social networks, especially instant messaging, may capture emotional and social contagion. A further way to identify a policy bubble revolves around studying bubble expectations and individuals’ confidence over time by distributing a questionnaire to a random sample of the population, experts in the relevant policy sub-field, as well as decision makers, and comparing the results across time and nations.
To sum up, my interpretation of the process that leads to the emergence of policy bubbles allows for the possibility that different modes of policy overreaction lead to different types of human herding, thereby resulting in different types of policy bubbles. This interpretation has the added benefit of contributing to the explanation of economic, financial, technological and social bubbles as well.”
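As a rough, purely illustrative sketch of the identification approach described above (the data are synthetic and the “bubble index” is a hypothetical construct, not Maor’s published method), one could index budget, attention, and problem-severity series to a base year and flag periods where policy effort runs well ahead of the measured problem:

```python
# Hypothetical illustration: compare indexed policy effort and attention against
# indexed problem severity over time. All numbers below are synthetic.
import pandas as pd

data = pd.DataFrame({
    "year": range(2000, 2015),
    "budget": [100, 110, 125, 140, 160, 185, 210, 240, 270, 300, 330, 360, 390, 420, 450],
    "problem_severity": [80, 78, 75, 74, 70, 66, 60, 55, 52, 50, 47, 45, 44, 43, 42],
    "media_attention": [30, 35, 40, 48, 55, 65, 72, 80, 85, 88, 90, 92, 93, 95, 96],
})

# Index each series to its first year so trends are comparable
for col in ["budget", "problem_severity", "media_attention"]:
    data[col + "_idx"] = data[col] / data[col].iloc[0]

# Crude "bubble index": effort and attention rising while the measured problem
# declines suggests a possible sustained overreaction fed by positive feedback
data["bubble_index"] = (
    (data["budget_idx"] + data["media_attention_idx"]) / 2
    - data["problem_severity_idx"]
)

# Flag years where effort and attention outpace the problem by a wide margin
print(data.loc[data["bubble_index"] > 1.0, ["year", "bubble_index"]])
```

In practice the post calls for much longer series (50 years or more) and richer signals, such as parliamentary and media concerns and emotional contagion in social networks, rather than a single synthetic index like this one.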

OkCupid reveals it’s been lying to some of its users. Just to see what’ll happen.


Brian Fung in the Washington Post: “It turns out that OkCupid has been performing some of the same psychological experiments on its users that landed Facebook in hot water recently.
In a lengthy blog post, OkCupid cofounder Christian Rudder explains that OkCupid has on occasion played around with removing text from people’s profiles, removing photos, and even telling some users they were an excellent match when in fact they were only a 30 percent match according to the company’s systems. Just to see what would happen.
OkCupid defends this behavior as something that any self-respecting Web site would do.
“OkCupid doesn’t really know what it’s doing. Neither does any other Web site,” Rudder wrote. “But guess what, everybody: if you use the Internet, you’re the subject of hundreds of experiments at any given time, on every site. That’s how websites work.”…
…we have a bigger problem on our hands: A problem about how to reconcile the sometimes valuable lessons of data science with the creep factor — particularly when you aren’t notified about being studied. But as I’ve written before, these kinds of studies happen all the time; it’s just rare that the public is presented with the results.
Short of banning the practice altogether, which seems totally unrealistic, corporate data science seems like an opportunity on a number of levels, particularly if it’s disclosed to the public. First, it helps us understand how human beings tend to behave at Internet scale. Second, it tells us more about how Internet companies work. And third, it helps consumers make better decisions about which services they’re comfortable using.
I suspect that what bothers us most of all is not that the research took place, but that we’re slowly coming to grips with how easily we ceded control over our own information — and how the machines that collect all this data may know more about us than we do ourselves. We had no idea we were even in a rabbit hole, and now we’ve discovered we’re 10 feet deep. As many as 62.5 percent of Facebook users don’t know the news feed is generated by a company algorithm, according to a recent study conducted by Christian Sandvig, an associate professor at the University of Michigan, and Karrie Karahalios, an associate professor at the University of Illinois.
OkCupid’s blog post is distinct in several ways from Facebook’s psychological experiment. OkCupid didn’t try to publish its findings in a scientific journal. It isn’t even claiming that what it did was science. Moreover, OkCupid’s research is legitimately useful to users of the service — in ways that Facebook’s research is arguably not….
But in any case, there’s no such motivating factor when it comes to Facebook. Unless you’re a page administrator or news organization, understanding how the newsfeed works doesn’t really help the average user in the way that understanding how OkCupid works does. That’s because people use Facebook for all kinds of reasons that have nothing to do with Facebook’s commercial motives. But people would stop using OkCupid if they discovered it didn’t “work.”
If you’re lying to your users in an attempt to improve your service, what’s the line between A/B testing and fraud?”

The Social Laboratory


Shane Harris in Foreign Policy: “…, Singapore has become a laboratory not only for testing how mass surveillance and big-data analysis might prevent terrorism, but for determining whether technology can be used to engineer a more harmonious society….Months after the virus abated, Ho and his colleagues ran a simulation using Poindexter’s TIA ideas to see whether they could have detected the outbreak. Ho will not reveal what forms of information he and his colleagues used — by U.S. standards, Singapore’s privacy laws are virtually nonexistent, and it’s possible that the government collected private communications, financial data, public transportation records, and medical information without any court approval or private consent — but Ho claims that the experiment was very encouraging. It showed that if Singapore had previously installed a big-data analysis system, it could have spotted the signs of a potential outbreak two months before the virus hit the country’s shores. Prior to the SARS outbreak, for example, there were reports of strange, unexplained lung infections in China. Threads of information like that, if woven together, could in theory warn analysts of pending crises.
The RAHS system was operational a year later, and it immediately began “canvassing a range of sources for weak signals of potential future shocks,” one senior Singaporean security official involved in the launch later recalled.
The system uses a mixture of proprietary and commercial technology and is based on a “cognitive model” designed to mimic the human thought process — a key design feature influenced by Poindexter’s TIA system. RAHS itself doesn’t think. It’s a tool that helps human beings sift huge stores of data for clues on just about everything. It is designed to analyze information from practically any source — the input is almost incidental — and to create models that can be used to forecast potential events. Those scenarios can then be shared across the Singaporean government and be picked up by whatever ministry or department might find them useful. Using a repository of information called an ideas database, RAHS and its teams of analysts create “narratives” about how various threats or strategic opportunities might play out. The point is not so much to predict the future as to envision a number of potential futures that can tell the government what to watch and when to dig further.
The officials running RAHS today are tight-lipped about exactly what data they monitor, though they acknowledge that a significant portion of “articles” in their databases come from publicly available information, including news reports, blog posts, Facebook updates, and Twitter messages. (“These articles have been trawled in by robots or uploaded manually” by analysts, says one program document.) But RAHS doesn’t need to rely only on open-source material or even the sorts of intelligence that most governments routinely collect: In Singapore, electronic surveillance of residents and visitors is pervasive and widely accepted…”

Crowdsourcing Ideas to Accelerate Economic Growth and Prosperity through a Strategy for American Innovation


Jason Miller and Tom Kalil at the White House Blog: “America’s future economic growth and international competitiveness depend crucially on our capacity to innovate. Creating the jobs and industries of the future will require making the right investments to unleash the unmatched creativity and imagination of the American people.
We want to gather bold ideas for how we as a nation can build on and extend into the future our historic strengths in innovation and discovery. Today we are calling on thinkers, doers, and entrepreneurs across the country to submit their proposals for promising new initiatives or pressing needs for renewed investment to be included in next year’s updated Strategy for American Innovation.
What will the next Strategy for American Innovation accomplish? In part, it’s up to you. Your input will help guide the Administration’s efforts to catalyze the transformative innovation in products, processes, and services that is the hallmark of American ingenuity.
Today, we released a set of questions for your comment, which you can access here and on Quora – an online platform that allows us to crowdsource ideas from the American people.

  Calling on America’s Inventors and Innovators for Great Ideas
Among the questions we are posing today to innovators across the country are:

  • What specific policies or initiatives should the Administration consider prioritizing in the next version of the Strategy for American Innovation?
  • What are the biggest challenges to, and opportunities for, innovation in the United States that will generate long-term economic growth and rising standards of living for more Americans?
  • What additional opportunities exist to develop high-impact platform technologies that reduce the time and cost associated with the “design, build, test” cycle for important classes of materials, products, and systems?
  • What investments, strategies, or technological advancements, across both the public and private sectors, are needed to rebuild the U.S. “industrial commons” (i.e., regional manufacturing capabilities) and ensure the latest technologies can be produced here?
  • What partnerships or novel models for collaboration between the Federal Government and regions should the Administration consider in order to promote innovation and the development of regional innovation ecosystems?

 
In today’s world of rapidly evolving technology, the Administration is adapting its approach to innovation-driven economic growth to reflect the emergence of new and exciting possibilities. Now is the time to gather input from the American people in order to envision and shape the innovations of the future. The full Request for Information can be found here and the 2011 Strategy for American Innovation can be found here. Comments are due by September 23, 2014, and can be sent to innovationstrategy@ostp.gov.  We look forward to hearing your ideas!”

Unleashing Climate Data to Empower America’s Agricultural Sector


Secretary Tom Vilsack and John P. Holdren at the White House Blog: “Today, in a major step to advance the President’s Climate Data Initiative, the Obama administration is inviting leaders of the technology and agricultural sectors to the White House to discuss new collaborative steps to unleash data that will help ensure our food system is resilient to the effects of climate change.

More intense heat waves, heavier downpours, and severe droughts and wildfires out west are already affecting the nation’s ability to produce and transport safe food. The recently released National Climate Assessment makes clear that these kinds of impacts are projected to become more severe over this century.

Food distributors, agricultural businesses, farmers, and retailers need accessible, useable data, tools, and information to ensure the effectiveness and sustainability of their operations – from water availability, to timing of planting and harvest, to storage practices, and more.

Today’s convening at the White House will include formal commitments by a host of private-sector companies and nongovernmental organizations to support the President’s Climate Data Initiative by harnessing climate data in ways that will increase the resilience of America’s food system and help reduce the contribution of the nation’s agricultural sector to climate change.

Microsoft Research, for instance, will grant 12 months of free cloud-computing resources to winners of a national challenge to create a smartphone app that helps farmers increase the resilience of their food production systems in the face of weather variability and climate change; the Michigan Agri-Business Association will soon launch a publicly available web-based mapping tool for use by the state’s agriculture sector; and the U.S. dairy industry will test and pilot four new modules – energy, feed, nutrient, and herd management – on the data-driven Farm Smart environmental-footprint calculation tool by the end of 2014. These are just a few among dozens of exciting commitments.

And the federal government is also stepping up. Today, anyone can log onto climate.data.gov and find new features that make data about the risks of climate change to food production, delivery, and nutrition more accessible and usable – including current and historical data from the Census of Agriculture on production, supply, and distribution of agricultural products, and data on climate-change-related risks such as storms, heat waves, and drought.

These steps are a direct response to the President’s call for all hands on deck to generate further innovation to help prepare America’s communities and businesses for the impacts of climate change.

We are delighted about the steps being announced by dozens of collaborators today, and we can’t wait to see what further tools, apps, and services are developed as the Administration and its partners continue to unleash data to make America’s agriculture enterprise stronger and more resilient than ever before.

Read a fact sheet about all of today’s Climate Data Initiative commitments here.”

Sharing Data Is a Form of Corporate Philanthropy


Matt Stempeck in HBR Blog:  “Ever since the International Charter on Space and Major Disasters was signed in 1999, satellite companies like DMC International Imaging have had a clear protocol with which to provide valuable imagery to public actors in times of crisis. In a single week this February, DMCii tasked its fleet of satellites on flooding in the United Kingdom, fires in India, floods in Zimbabwe, and snow in South Korea. Official crisis response departments and relevant UN departments can request on-demand access to the visuals captured by these “eyes in the sky” to better assess damage and coordinate relief efforts.

DMCii is a private company, yet it provides enormous value to the public and social sectors simply by periodically sharing its data.
Back on Earth, companies create, collect, and mine data in their day-to-day business. This data has quickly emerged as one of this century’s most vital assets. Public sector and social good organizations may not have access to the same amount, quality, or frequency of data. This imbalance has inspired a new category of corporate giving foreshadowed by the 1999 Space Charter: data philanthropy.
The satellite imagery example is an area of obvious societal value, but data philanthropy holds even stronger potential closer to home, where a wide range of private companies could give back in meaningful ways by contributing data to public actors. Consider two promising contexts for data philanthropy: responsive cities and academic research.
The centralized institutions of the 20th century allowed for the most sophisticated economic and urban planning to date. But in recent decades, the information revolution has helped the private sector speed ahead in data aggregation, analysis, and applications. It’s well known that there’s enormous value in real-time usage of data in the private sector, but there are similarly huge gains to be won in the application of real-time data to mitigate common challenges.
What if sharing economy companies shared their real-time housing, transit, and economic data with city governments or public interest groups? For example, Uber maintains a “God’s Eye view” of every driver on the road in a city.
Imagine combining this single data feed with an entire portfolio of real-time information. An early leader in this space is the City of Chicago’s urban data dashboard, WindyGrid. The dashboard aggregates an ever-growing variety of public datasets to allow for more intelligent urban management.
Over time, we could design responsive cities that react to this data. A responsive city is one where services, infrastructure, and even policies can flexibly respond to the rhythms of its denizens in real-time. Private sector data contributions could greatly accelerate these nascent efforts.
Data philanthropy could similarly benefit academia. Access to data remains an unfortunate barrier to entry for many researchers. The result is that only researchers with access to certain data, such as full-volume social media streams, can analyze and produce knowledge from this compelling information. Twitter, for example, sells access to a range of real-time APIs to marketing platforms, but the price point often exceeds researchers’ budgets. To accelerate the pursuit of knowledge, Twitter has piloted a program called Data Grants offering access to segments of their real-time global trove to select groups of researchers. With this program, academics and other researchers can apply to receive access to relevant bulk data downloads, such as a period of time before and after an election, or a certain geographic area.
Humanitarian response, urban planning, and academia are just three sectors within which private data can be donated to improve the public condition. There are many more possible applications, but few examples to date. For companies looking to expand their corporate social responsibility initiatives, sharing data should be part of the conversation…
Companies considering data philanthropy can take the following steps:

  • Inventory the information your company produces, collects, and analyzes. Consider which data would be easy to share and which data will require long-term effort.
  • Think about who could benefit from this information. Who in your community doesn’t have access to this information?
  • Who could be harmed by the release of this data? If the datasets are about people, have they consented to their release? (i.e. don’t pull a Facebook emotional manipulation experiment).
  • Begin conversations with relevant public agencies and nonprofit partners to get a sense of the sort of information they might find valuable and their capacity to work with the formats you might eventually make available.
  • If you expect an onslaught of interest, an application process can help qualify partnership opportunities to maximize positive impact relative to time invested in the program.
  • Consider how you’ll handle distribution of the data to partners. Even if you don’t have the resources to set up an API, regular releases of bulk data could still provide enormous value to organizations used to relying on less-frequently updated government indices.
  • Consider your needs regarding privacy and anonymization. Strip the data of anything remotely resembling personally identifiable information (here are some guidelines).
  • If you’re making data available to researchers, plan to allow researchers to publish their results without obstruction. You might also require them to share the findings with the world under Open Access terms….”
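As a minimal, hypothetical sketch of the PII-stripping step in the list above (the column names, the salted-hash scheme, and the helper function are illustrative assumptions, not a prescription from the post), a data release might drop direct identifiers and replace internal IDs with salted hashes before anything leaves the company:

```python
# Hypothetical sketch: remove direct identifiers and pseudonymize IDs before a
# bulk data release. Column names here are assumptions for illustration only.
import hashlib
import pandas as pd

PII_COLUMNS = ["name", "email", "phone", "street_address"]  # assumed identifiers

def prepare_release(df: pd.DataFrame, salt: str) -> pd.DataFrame:
    """Drop direct identifiers and replace user IDs with salted hashes."""
    released = df.drop(columns=[c for c in PII_COLUMNS if c in df.columns])
    if "user_id" in released.columns:
        released["user_id"] = released["user_id"].astype(str).map(
            lambda uid: hashlib.sha256((salt + uid).encode()).hexdigest()[:16]
        )
    return released

# Toy example of an internal dataset being prepared for partners
raw = pd.DataFrame({
    "user_id": [101, 102],
    "name": ["Ada", "Grace"],
    "email": ["ada@example.com", "grace@example.com"],
    "city": ["Boston", "Chicago"],
    "trips_per_week": [4, 7],
})
print(prepare_release(raw, salt="rotate-this-secret"))
```

Dropping obvious identifiers is only a starting point: combinations of quasi-identifiers (city, dates, usage patterns) can still re-identify individuals, which is why the anonymization guidelines the author points to deserve a careful read before any release.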