New book by Patrick Meier on how big data is changing humanitarian response: “The overflow of information generated during disasters can be as paralyzing to humanitarian response as the lack of information. This flash flood of information when amplified by social media and satellite imagery is increasingly referred to as Big Data—or Big Crisis Data. Making sense of Big Crisis Data during disasters is proving an impossible challenge for traditional humanitarian organizations, which explains why they’re increasingly turning to Digital Humanitarians.
Who exactly are these Digital Humanitarians? They’re you, me, all of us. Digital Humanitarians are volunteers and professionals from the world over and from all walks of life. What do they share in common? The desire to make a difference, and they do that by rapidly mobilizing online in collaboration with international humanitarian organizations. They make sense of vast volumes of social media and satellite imagery in virtually real-time to support relief efforts worldwide. How? They craft and leverage ingenious crowdsourcing solutions with trail-blazing insights from artificial intelligence.
In sum, this book charts the sudden and spectacular rise of Digital Humanitarians by sharing their remarkable, real-life stories, highlighting how their humanity coupled with innovative solutions to Big Data is changing humanitarian response forever. Digital Humanitarians will make you think differently about what it means to be humanitarian and will invite you to join the journey online.
Click here to be notified when the book becomes available. For speaking requests, please email [email protected].”
UK Department of Health: Citizen Space
Sarah Wood at the UK Department of Health: “We recently ran a survey of our internal DH Citizen Space users. Citizen Space is the digital tool that DH and a number of other local and central Government Departments use to run their consultations.
Overall, our survey results were positive with staff reporting they had found the tool relatively easy to use and access. The survey did flag some internal issues eg. visibility of the tool in the Department, minor technical issues etc, which we’re planning to address through better promotion of Citizen Space and training, but on the whole our internal user experience seemed to be good.
However, there was one area where internal users did seem to be experiencing problems, and ironically it wasn’t with the tool itself. Many of our survey respondents seemed to be struggling with the analysis of their consultation responses, with some teams even questioning the usefulness of the data they were amassing from their digital consultations.
Some common mistakes
To help us get to the bottom of what was going on, we contacted some of our respondents and met with some consultation teams to talk about how they design, run and analyse the responses from their digital consultations. We found some common mistakes:
- Not thinking ‘digital first’ – not designing consultations with a digital audience and digital responses in mind. Eg. writing consultations for print and then trying to shoehorn them into a digital tool
- Not identifying what ‘real’ success means for a consultation before launching it or not putting in place the metrics needed to measure for success. Eg. not setting benchmarks, not measuring qualitative data or not identifying key target audiences and how to reach them.
- Not thinking about the type and/or amount of data that will be returned and planning resources and tools accordingly. Eg. asking lots of free-text questions and then drowning in responses
As a team, we are trying to address many of these issues by improving the way the Department approaches and designs its digital consultations. The next iteration of our Digital policymaking toolkit, which will combine a new set of Policy Standards with our digital tools, techniques and advice for policymakers, should help, alongside other work our team is doing to build up digital capability in the department and to produce analytical tools for data mining and sentiment analysis that will help teams with free-text analysis.
…how to use or build a consultation in Citizen Space, you can find one of those in the Citizen Space User Guide and further guides and user forums on the Citizen Space Knowledge Base website.
The Right Colors Make Data Easier To Read
Sharon Lin and Jeffrey Heer at HBR Blog: “What is the color of money? Of love? Of the ocean? In the United States, most people respond that money is green, love is red and the ocean is blue. Many concepts evoke related colors — whether due to physical appearance, common metaphors, or cultural conventions. When colors are paired with the concepts that evoke them, we call these “semantically resonant color choices.”
Artists and designers regularly use semantically resonant colors in their work. And in the research we conducted with Julie Fortuna, Chinmay Kulkarni, and Maureen Stone, we found they can be remarkably important to data visualization.
Consider these charts of (fictional) fruit sales:
The only difference between the charts is the color assignment. The left-hand chart uses colors from a default palette. The right-hand chart has been assigned semantically resonant colors. (In this case, the assignment was computed automatically using an algorithm that analyzes the colors in relevant images retrieved from Google Image Search using queries for each data category name.)
Now, try answering some questions about the data in each of these charts. Which fruit had higher sales: blueberries or tangerines? How about peaches versus apples? Which chart do you find easier to read?…
To make effective visualization color choices, you need to take a number of factors into consideration. To name just two: All the colors need to be suitably different from one another, for instance, so that readers can tell them apart – what’s called “discriminability.” You also need to consider what the colors look like to the color blind — roughly 8% of the U.S. male population! Could the colors be distinguished from one another if they were reprinted in black and white?
One easy way to assign semantically resonant colors is to use colors from an existing color palette that has been carefully designed for visualization applications (ColorBrewer offers some options) but assign the colors to data values in a way that best matches concept color associations. This is the basis of our own algorithm, which acquires images for each concept and then analyzes them to learn concept color associations. However, keep in mind that color associations may vary across cultures. For example, in the United States and many western cultures, luck is often associated with green (four-leaf clovers), while red can be considered a color of danger. However, in China, luck is traditionally symbolized with the color red.
…
Semantically resonant colors can reinforce perception of a wide range of data categories. We believe similar gains would likely be seen for other forms of visualizations like maps, scatterplots, and line charts. So when designing visualizations for presentation or analysis, consider color choice and ask yourself how well the colors resonate with the underlying data.”
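The palette-matching idea described in the excerpt above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' algorithm: the palette, the "typical" concept colors, and the function names are all hypothetical, and plain squared RGB distance stands in for whatever color-association score the researchers actually learned from images. The sketch tries every one-to-one assignment of palette colors to categories and keeps the one with the smallest total distance.

```python
from itertools import permutations

# Hypothetical palette (RGB, 0-255); not taken from ColorBrewer or the study.
PALETTE = {
    "blue": (31, 119, 180),
    "orange": (255, 127, 14),
    "red": (214, 39, 40),
}

# Hypothetical "typical" colors for each data category (assumed, not measured).
CONCEPT_COLORS = {
    "blueberry": (70, 80, 160),
    "tangerine": (245, 140, 30),
    "apple": (200, 30, 45),
}


def distance(a, b):
    """Squared Euclidean distance in RGB, a crude stand-in for perceptual distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_colors(concepts, palette):
    """Try every one-to-one assignment and keep the one with the lowest total distance."""
    names = list(concepts)
    best, best_cost = None, float("inf")
    for perm in permutations(palette, len(names)):
        cost = sum(distance(concepts[n], palette[p]) for n, p in zip(names, perm))
        if cost < best_cost:
            best, best_cost = dict(zip(names, perm)), cost
    return best


assignment = assign_colors(CONCEPT_COLORS, PALETTE)
print(assignment)
```

Brute force is only practical for a handful of categories; for larger charts an assignment solver would be the usual choice. Because the palette is fixed in advance, the colors stay as distinguishable as the palette designer made them, and only the pairing with categories changes.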
In Belgium, speed camera locations are crowdsourced from citizens
Springwise: “As much as local authorities try to, they aren’t able to stop every single civic infraction because they only have a limited number of eyes on the street. However, smartphones have already enabled councils to crowdsource details of law breaches, through apps such as Parking Mobility that let users log when a driver is using a disabled parking bay without a licence. Now the ikflitsmee campaign in Belgium has encouraged citizens to send in locations where they believe speeding is a problem in order for the police to invest in safety measures.
Open until April 10, anyone could log onto the ikflitsmee website to nominate locations such as schools, playgrounds or sharp turns in the road where speeding is a particular problem. The initiative spanned the whole country, involving both local and Federal police forces. After receiving more than 50,000 suggestions, those forces were then invited to check the pinned locations near to them to see if a speed camera would be a feasible solution. The website gets residents to flag up the areas they know to be dangerous and helps authorities by creating an instant data resource to plan future audits.
By asking residents to show them where potential speeders are, local authorities can curb accidents and deliver more fines to culprits, boosting their revenue. At the same time, citizens feel empowered and involved in the improvement of road safety in the country. Are there other ways to tap citizens’ smartphones for more rapid gathering of data that can help councils improve their service to the community?
Website: www.ikflitsmee.be”
Can Government Play Moneyball?
David Bornstein in the New York Times: “…For all the attention it’s getting inside the administration, evidence-based policy-making seems unlikely to become a headline grabber; it lacks emotional appeal. But it does have intellectual heft. And one group that has been doing creative work to give the message broader appeal is Results for America, which has produced useful teaching aids under the banner “Moneyball for Government,” building on the popularity of the book and movie about Billy Beane’s Oakland A’s, and the rise of data-driven decision making in major league baseball. (Watch their video explainers here and here.)
Results for America works closely with leaders across political parties and social sectors, to build awareness about evidence-based policy making — drawing attention to key areas where government could dramatically improve people’s lives by augmenting well-tested models. They are also chronicling efforts by local governments around the country, to show how an emerging group of “Geek Cities,” including Baltimore, Denver, Miami, New York, Providence and San Antonio, are using data and evidence to drive improvements in various areas of social policy like education, youth development and employment.
“It seems like common sense to use evidence about what works to get better results,” said Michele Jolin, Results for America’s managing partner. “How could anyone be against it? But the way our system is set up, there are so many loud voices pushing to have dollars spent and policy shaped in the way that works for them. There has been no organized constituency for things that work.”
“The debate in Washington is usually about the quantity of resources,” said David Medina, a partner in Results for America. “We’re trying to bring it back to talking about quality.”
Not everyone will find this change appealing. “When you have a longstanding social service policy, there’s going to be a network of [people and groups] who are organized to keep that money flowing regardless of whether evidence suggests it’s warranted,” said Daniel Stid. “People in social services don’t like to think they’re behaving like other organized interests — like dairy farmers or mortgage brokers — but it leads to tremendous inertia in public policy.”
Beyond the politics, there are practical obstacles to overcome, too. Federal agencies lack sufficient budgets for evaluation or a common definition for what constitutes rigorous evidence. (Any lobbyist can walk into a legislator’s office and claim to have solid data to support an argument.) Up-to-date evidence also needs to be packaged in accessible ways and made available on a timely basis, so it can be used to improve programs, rather than to threaten them. Governments need to build regular evaluations into everything they do — not just conduct big, expensive studies every 10 years or so.
That means developing new ways to conduct quick and inexpensive randomized studies using data that is readily available, said Haskins, who is investigating this approach. “We should be running 10,000 evaluations a year, like they do in medicine.” That’s the only way to produce the rapid trial-and-error learning needed to drive iterative program improvements, he added. (I reported on a similar effort being undertaken by the Coalition for Evidence-Based Policy.)
Results for America has developed a scorecard to rank federal departments about how prepared they are to produce or incorporate evidence in their programs. It looks at whether a department has an office and a leader with the authority and budget to evaluate its programs. It asks: Does it make its data accessible to the public? Does it compile standards about what works and share them widely? Does it spend at least 1 percent of its budget evaluating its programs? And — most important — does it incorporate evidence in its big grant programs? For now, the Department of Education gets the top score.
The stakes are high. In 2011, for example, the Obama administration launched a process to reform Head Start, doing things like spreading best practices and forcing the worst programs to improve or lose their funding. This February, for the third time, the government released a list of Head Start providers (103 out of about 1,600) who will have to recompete for federal funding because of performance problems. That list represents tens of thousands of preschoolers, many of whom are missing out on the education they need to succeed in kindergarten — and life.
Improving flagship programs like Head Start, and others, is not just vital for the families they serve; it’s vital to restore trust in government. “I am a card-carrying member of the Republican Party and I want us to be governed well,” said Robert Shea, who pushed for better program evaluations as associate director of the Office of Management and Budget during the Bush administration, and continues to focus on this issue as chairman of the National Academy of Public Administration. “This is the most promising thing I know of to get us closer to that goal.”
“This idea has the prospect of uniting Democrats and Republicans,” said Haskins. “But it will involve a broad cultural change. It has to get down to the program administrators, board members and local staff throughout the country — so they know that evaluation is crucial to their operations.”
“There’s a deep mistrust of government and a belief that problems can’t be solved,” said Michele Jolin. “This movement will lead to better outcomes — and it will help people regain confidence in their public officials by creating a more effective, more credible way for policy choices to be made.”
Paying Farmers to Welcome Birds
Jim Robbins in The New York Times: “The Central Valley was once one of North America’s most productive wildlife habitats, a 450-mile-long expanse marbled with meandering streams and lush wetlands that provided an ideal stop for migratory shorebirds on their annual journeys from South America and Mexico to the Arctic and back.
Farmers and engineers have long since tamed the valley. Of the wetlands that existed before the valley was settled, about 95 percent are gone, and the number of migratory birds has declined drastically. But now an unusual alliance of conservationists, bird watchers and farmers has joined in an innovative plan to restore essential habitat for the migrating birds.
The program, called BirdReturns, starts with data from eBird, the pioneering citizen science project that asks birders to record sightings on a smartphone app and send the information to the Cornell Lab of Ornithology in upstate New York.
By crunching data from the Central Valley, eBird can generate maps showing where virtually every species congregates in the remaining wetlands. Then, by overlaying those maps on aerial views of existing surface water, it can determine where the birds’ need for habitat is greatest….
BirdReturns is an example of the growing movement called reconciliation ecology, in which ecosystems dominated by humans are managed to increase biodiversity.
“It’s a new ‘Moneyball,’ ” said Eric Hallstein, an economist with the Nature Conservancy and a designer of the auctions, referring to the book and movie about the Oakland Athletics’ data-driven approach to baseball. “We’re disrupting the conservation industry by taking a new kind of data, crunching it differently and contracting differently.”
Passage Of The DATA Act Is A Major Advance In Government Transparency
OpEd by Hudson Hollister in Forbes: “Even as the debate over official secrecy grows on Capitol Hill, basic information about our government’s spending remains hidden in plain sight.
Information that is technically public — federal finance, awards, and expenditures — is effectively locked within a disconnected disclosure system that relies on outdated paper-based technology. Budgets, grants, contracts, and disbursements are reported manually and separately, using forms and spreadsheets. Researchers seeking insights into federal spending must invest time and resources crafting data sets out of these documents. Without common data standards across all government spending, analyses of cross-agency spending trends require endless conversions of apples to oranges.
For a nation whose tech industry leads the world, there is no reason to allow this antiquated system to persist.
That’s why we’re excited to welcome Thursday’s unanimous Senate approval of the Digital Accountability and Transparency Act — known as the DATA Act.
The DATA Act will mandate government-wide standards for federal spending data. It will also require agencies to publish this information online, fully searchable and open to everyone.
Watchdogs and transparency advocates from across the political spectrum have endorsed the DATA Act because all Americans will benefit from clear, accessible information about how their tax dollars are being spent.
It is darkly appropriate that the only organized opposition to this bill took place behind closed doors. In January, Senate sponsors Mark Warner (D-VA) and Rob Portman (R-OH) rejected amendments offered privately by the White House Office of Management and Budget. These nonpublic proposals would have gutted the DATA Act’s key data standards requirement. But Warner and Portman went public with their opposition, and Republicans and Democrats agreed to keep a strong standards mandate.
We now await swift action by the House of Representatives to pass this bill and put it on the President’s desk.
The tech industry is already delivering the technology and expertise that will use federal spending data, once it is open and standardized, to solve problems.
If the DATA Act is fully enforced, citizens will be able to track government spending on a particular contractor or from a particular program, payment by payment. Agencies will be able to deploy sophisticated Big Data analytics to illuminate, and eliminate, waste and fraud. And states and universities will be able to automate their complex federal grant reporting tasks, freeing up more tax dollars for their intended use. Our industry can perform these tasks — as soon as we get the data.
Chairman Earl Devaney’s Recovery Accountability and Transparency Board proved this is possible. Starting in 2009, the Recovery Board applied data standards to track stimulus spending. Our members’ software used that data to help inspectors general prevent and recover over $100 million in spending on suspicious grantees and contractors. The DATA Act applies that approach across the whole of government spending.
Congress is now poised to pass this landmark legislative mandate to transform spending from disconnected documents into open data. Next, the executive branch must implement that mandate.
So our Coalition’s work continues. We will press the Treasury Department and the White House to adopt robust, durable, and nonproprietary data standards for federal spending.
And we won’t stop with spending transparency. The American people deserve access to open data across all areas of government activity — financial regulatory reporting, legislative actions, judicial filings, and much more….”
How Can the Department of Education Increase Innovation, Transparency and Access to Data?
David Soo at the Department of Education: “Despite the growing amount of information about higher education, many students and families still need access to clear, helpful resources to make informed decisions about going to – and paying for – college. President Obama has called for innovation in college access, including by making sure all students have easy-to-understand information.
Now, the U.S. Department of Education needs your input on specific ways that we can increase innovation, transparency, and access to data. In particular, we are interested in how APIs (application programming interfaces) could make our data and processes more open and efficient.
APIs are a set of software instructions and standards that allow machine-to-machine communication. APIs could allow developers from inside and outside government to build apps, widgets, websites, and other tools based on government information and services to let consumers access government-owned data and participate in government-run processes from more places on the Web, even beyond .gov websites. Well-designed government APIs help make data and processes freely available for use within agencies, between agencies, in the private sector, or by citizens, including students and families.
So, today, we are asking you – student advocates, designers, developers, and others – to share your ideas on how APIs could spark innovation and enable processes that can serve students better. We need you to weigh in on a Request for Information (RFI) – a formal way the government asks for feedback – on how the Department could use APIs to increase access to higher education data or financial aid programs. There may be ways that Department forms – like the Free Application for Federal Student Aid (FAFSA) – or information-gathering processes could be made easier for students by incorporating the use of APIs. We invite the best and most creative thinking on specific ways that Department of Education APIs could be used to improve outcomes for students.
To weigh in, you can email [email protected] by June 2, or send your input via other addresses as detailed in the online notice.
The Department wants to make sure to do this right. It must ensure the security and privacy of the data it collects or maintains, especially when the information of students and families is involved. Openness only works if privacy and security issues are fully considered and addressed. We encourage the field to provide comments that identify concerns and offer suggestions on ways to ensure privacy, safeguard student information, and maintain access to federal resources at no cost to the student.
Through this request, we hope to gather ideas on how APIs could be used to fuel greater innovation and, ultimately, affordability in higher education. For further information, see the Federal Register notice.”
The Transformative Impact of Data and Communication on Governance
Steven Livingston at Brookings: “How do digital technologies affect governance in areas of limited statehood – places and circumstances characterized by the absence of state provisioning of public goods and the enforcement of binding rules with a monopoly of legitimate force? In the first post in this series I introduced the limited statehood concept and then described the tremendous growth in mobile telephony, GIS, and other technologies in the developing world. In the second post I offered examples of the use of ICT in initiatives intended to fill at least some of the governance vacuum created by limited statehood. With mobile phones, for example, farmers are informed of market conditions, have access to liquidity through M-Pesa and similar mobile money platforms….
This brings to mind another type of ICT governance initiative. Rather than fill in for or even displace the state, some ICT initiatives can strengthen governance capacity. Digital government – the use of digital technology by the state itself – is one important possibility. Other initiatives strengthen the state by exerting pressure. Countries with weak governance sometimes take the form of extractive states, those that cater to the needs of an elite, leaving the majority of the population in poverty and without basic public services. This is what Daron Acemoglu and James A. Robinson call extractive political and economic institutions. Inclusive states, on the other hand, are pluralistic, bound by the rule of law, respectful of property rights, and, in general, accountable. Accountability mechanisms such as a free press and competitive multiparty elections are instrumental to discourage extractive institutions. What ICT-based initiatives might lend a hand in strengthening accountability? We can point to three examples.
Example One: Using ICT to Protect Human Rights
Nonstate actors now use commercial, high-resolution remote sensing satellites to monitor weapons programs and human rights violations. Amnesty International’s Remote Sensing for Human Rights offers one example, and Satellite Sentinel offers another. Both use imagery from DigitalGlobe, an American remote sensing and geospatial content company. Other organizations have used commercially available remote sensing imagery to monitor weapons proliferation. The Institute for Science and International Security, a Washington-based NGO, revealed the Iranian nuclear weapons program in 2003 using commercial satellite imagery…
Example Two: Crowdsourcing Election Observation
Others have used mobile phones and GIS to crowdsource election observation. For the 2011 elections in Nigeria, The Community Life Project, a civil society organization, created ReclaimNaija, an elections process monitoring system that relied on GIS and amateur observers with mobile phones to monitor the elections. Each of the red dots represents an aggregation of geo-located incidents reported to the ReclaimNaija platform. In a live map, clicking on a dot disaggregates the reports, eventually taking the reader to individual reports. Rigorous statistical analysis of ReclaimNaija results and the elections suggests it contributed to the effectiveness of the election process.
ReclaimNaija: Election Incident Reporting System Map
Example Three: Using Genetic Analysis to Identify War Crimes
In recent years, more powerful computers have led to major breakthroughs in biomedical science. The reduction in cost of analyzing the human genome has actually outpaced Moore’s Law. This has opened up new possibilities for the use of genetic analysis in forensic anthropology. In Guatemala, the Balkans, Argentina, Peru and in several other places where mass executions and genocides took place, forensic anthropologists are using genetic analysis to find evidence that is used to hold the killers – often state actors – accountable…”
Wikipedia Use Could Give Insights To The Flu Season
Agata Blaszczak-Boxe in Huffington Post: “By monitoring the number of times people look for flu information on Wikipedia, researchers may be better able to estimate the severity of a flu season, according to a new study.
Researchers created a new data-analysis system that looks at visits to Wikipedia articles, and found the system was able to estimate flu levels in the United States up to two weeks sooner than the flu data from the Centers for Disease Control and Prevention were released.
Looking at data spanning six flu seasons between December 2007 and August 2013, the new system estimated the peak flu week better than Google Flu Trends, another data-based system. The Wikipedia-based system accurately estimated the peak flu week in three out of six seasons, while the Google-based system got only two right, the researchers found.
“We were able to get really nice estimates of what the [flu] level is in the population,” said study author David McIver, a postdoctoral fellow at Boston Children’s Hospital.
The new system examined visits to Wikipedia articles that included terms related to flulike illnesses, whereas Google Flu Trends looks at searches typed into Google. The researchers analyzed the data from Wikipedia on how many times in an hour a certain article was viewed, and combined their data with flu data from the CDC, using a model they created.
The research team wanted to use a database that is accessible to everyone and create a system that could be more accurate than Google Flu Trends, which has flaws. For instance, during the swine flu pandemic in 2009, and during the 2012-2013 influenza season, Google Flu Trends got a bit “confused,” and overestimated flu numbers because of increased media coverage focused on the two illnesses, the researchers said.
When a pandemic strikes, people search for news stories related to the pandemic itself, but this doesn’t mean that they have the flu. In general, the problem with Internet-based estimation systems is that it is practically impossible to tell whether people are looking for information about an illness because they are sick, the researchers said.
In the new system, the researchers tried to overcome this issue by including a number of Wikipedia articles “to act as markers for general background-level activity of normal usage of Wikipedia,” the researchers wrote in the study. However, just like any other data-based system, the Wikipedia system is not immune to the issues related to figuring out the actual motivation of someone checking information related to the flu…
The study is published … in the journal PLOS Computational Biology.”
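The general approach in the excerpt above, scaling flu-article traffic against background Wikipedia activity and relating it to CDC figures, can be sketched as follows. This is a minimal illustration under stated assumptions and not the model used in the study: dividing flu-article views by views of unrelated "marker" articles and fitting a straight line by least squares are simplifications chosen here, the function names are hypothetical, and the weekly numbers are invented purely to make the example run.

```python
# Toy weekly figures, invented for illustration only (not data from the study).
flu_views = [1200, 1500, 2100, 3000, 2600]                # views of flu-related articles
background_views = [10000, 10000, 10500, 12000, 10400]    # views of unrelated marker articles
cdc_rate = [1.4, 1.7, 2.2, 2.7, 2.7]                      # officially reported flu level, percent


def normalize(flu, background):
    """Express flu-article traffic as a share of background traffic (an assumed normalization)."""
    return [f / b for f, b in zip(flu, background)]


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate(slope, intercept, flu, background):
    """Estimate the current flu level from this week's page views."""
    return slope * (flu / background) + intercept


slope, intercept = fit_line(normalize(flu_views, background_views), cdc_rate)
print(estimate(slope, intercept, flu=2800, background=11000))
```

Normalizing by background traffic is one simple way to dampen spikes in overall Wikipedia use. As the researchers note, it cannot tell whether readers are sick or merely curious.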