Using Flash Crowds to Automatically Detect Earthquakes & Impact Before Anyone Else


Patrick Meier at iRevolutions: “It is said that our planet has a new nervous system; a digital nervous system comprised of digital veins and intertwined sensors that capture the pulse of our planet in near real-time. Next generation humanitarian technologies seek to leverage this new nervous system to detect and diagnose the impact of disasters within minutes rather than hours. To this end, LastQuake may be one of the most impressive humanitarian technologies that I have recently come across. Spearheaded by the European-Mediterranean Seismological Center (EMSC), the technology combines “Flashsourcing” with social media monitoring to auto-detect earthquakes before they’re picked up by seismometers or anyone else.


Scientists typically draw on ground-motion prediction algorithms and data on building infrastructure to rapidly assess an earthquake’s potential impact. Alas, ground-motion predictions vary significantly and infrastructure data are rarely available at sufficient resolutions to accurately assess the impact of earthquakes. Moreover, a minimum of three seismometers are needed to calibrate a quake and said seismic data take several minutes to generate. This explains why the EMSC uses human sensors to rapidly collect relevant data on earthquakes as these reduce the uncertainties that come with traditional rapid impact assessment methodologies. Indeed, the Center’s important work clearly demonstrates how the Internet coupled with social media are “creating new potential for rapid and massive public involvement by both active and passive means” vis-a-vis earthquake detection and impact assessments. Indeed, the EMSC can automatically detect new quakes within 80-90 seconds of their occurrence while simultaneously publishing tweets with preliminary information on said quakes, like this one:


In reality, the first human sensors (increases in web traffic) can be detected within 15 seconds (!) of a quake…(More)
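To make the "flashsourcing" idea concrete, here is a minimal sketch (in Python) of the kind of traffic-surge detection it relies on: flag the moment when per-second visits to a "did you feel it?" page jump far above their recent baseline. The window size, threshold and traffic numbers are illustrative assumptions, not EMSC's actual parameters or pipeline.

```python
# Minimal sketch of "flashsourcing"-style detection: flag a sudden surge in
# per-second website visits against a rolling baseline. Illustrative only;
# EMSC's production system is more sophisticated (it also geolocates visitors
# to estimate the felt area).
from collections import deque
from statistics import mean, stdev


def detect_surge(visit_counts, window=300, threshold_sigma=5.0):
    """Yield the indices (seconds) at which traffic jumps well above baseline.

    visit_counts    -- iterable of visits per second to the earthquake site
    window          -- seconds of history used as the baseline (assumed value)
    threshold_sigma -- standard deviations above baseline that count as a surge
    """
    baseline = deque(maxlen=window)
    for t, count in enumerate(visit_counts):
        if len(baseline) == window:
            mu, sigma = mean(baseline), stdev(baseline)
            if sigma > 0 and count > mu + threshold_sigma * sigma:
                yield t  # candidate earthquake: people are rushing to the site
        baseline.append(count)


# Example: quiet traffic, then a sudden flash crowd.
if __name__ == "__main__":
    quiet = [10, 12, 9, 11, 10] * 120        # 600 s of normal traffic
    spike = [10, 11, 250, 400, 380, 300]     # abrupt surge in visits
    print(list(detect_surge(quiet + spike)))
```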

City Governments Are Using Yelp to Tell You Where Not to Eat


Michael Luca and Luther Lowe at HBR Blog: “…in recent years consumer-feedback platforms like TripAdvisor, Foursquare, and Chowhound have transformed the restaurant industry (as well as the hospitality industry), becoming important guides for consumers. Yelp has amassed about 67 million reviews in the last decade. So it’s logical to think that these platforms could transform hygiene awareness too — after all, people who contribute to review sites focus on some of the same things inspectors look for.

It turns out that one way user reviews can transform hygiene awareness is by helping health departments better utilize their resources. The deployment of inspectors is usually fairly random, which means time is often wasted on spot checks at clean, rule-abiding restaurants. Social media can help narrow the search for violators.
Within a given city or area, it’s possible to merge the entire history of Yelp reviews and ratings — some of which contain telltale words or phrases such as “dirty” and “made me sick” — with the history of hygiene violations and feed them into an algorithm that can predict the likelihood of finding problems at reviewed restaurants. Thus inspectors can be allocated more efficiently.
In San Francisco, for example, we broke restaurants into the top half and bottom half of hygiene scores. In a recent paper, one of us (Michael Luca, with coauthor Yejin Choi and her graduate students) showed that we could correctly classify more than 80% of restaurants into these two buckets using only Yelp text and ratings. In the next month, we plan to hold a contest on DrivenData to get even better algorithms to help cities out (we are jointly running the contest). Similar algorithms could be applied in any city and in other sorts of prediction tasks.
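As a rough illustration of the approach described above (not the authors' actual model), the sketch below trains a simple text classifier that predicts whether a restaurant falls in the bottom half of hygiene scores from its Yelp review text. The file name and column names are hypothetical.

```python
# Sketch of the approach described above: classify restaurants into the top or
# bottom half of hygiene scores using review text. The CSV file and its column
# names (review_text, hygiene_score) are hypothetical; the published study used
# richer features, including star ratings.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

df = pd.read_csv("restaurants.csv")  # one row per restaurant (assumed layout)
df["bottom_half"] = (df["hygiene_score"] < df["hygiene_score"].median()).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    df["review_text"], df["bottom_half"], test_size=0.2, random_state=0
)

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=5),  # captures phrases like "made me sick"
    LogisticRegression(max_iter=1000),
)
model.fit(X_train, y_train)
print("held-out accuracy:", accuracy_score(y_test, model.predict(X_test)))
```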
Another means for transforming hygiene awareness is through the sharing of health-department data with online review sites. The logic is simple: Diners should be informed about violations before they decide on a destination, rather than after.
Over the past two years, we have been working with cities to help them share inspection data with Yelp through an open-data standard that Yelp created in 2012 to encourage officials to put their information in places that are more useful to consumers. In San Francisco, Los Angeles, Raleigh, and Louisville, Kentucky, customers now see hygiene data alongside Yelp reviews. There’s evidence that users are starting to pay attention to this data — click-through rates are similar to those for other features on Yelp ….

And there’s no reason this type of data sharing should be limited to restaurant-inspection reports. Why not disclose data about dentists’ quality and regulatory compliance via Yelp? Why not use data from TripAdvisor to help spot bedbugs? Why not use Twitter to understand what citizens are concerned about, and what cities can do about it? Uses of social media data for policy, and widespread dissemination of official data through social media, have the potential to become important means of public accountability. (More)

Tired of Being Profiled, a Programmer Turns to Crowdsourcing Cop Reviews


Christopher Moraff at Next City: “…despite the fact that policing is arguably one of the most important and powerful service professions a civilized society can produce, it’s far easier to find out if the plumber you just hired broke someone’s pipe while fixing their toilet than it is to find out if the cop patrolling your neighborhood broke someone’s head while arresting them.
A 31-year-old computer programmer has set out to fix that glitch with a new web-based (and soon to be mobile) crowdsourced rating tool called CopScore that is designed to help communities distinguish police officers who are worthy of praise from those who are not fit to wear the uniform….
CopScore is a work in progress, and, for the time being at least, a one-man show. Hardison does all the coding himself, often working through the night to bring new features online.
Currently in the very early beta stage, the platform works by consolidating information on the service records of individual police officers together with details of their interactions with constituents. The searchable platform includes data gleaned from public sources — such as social media and news articles — cross-referenced with Yelp-style ratings from citizens.

For Hardison, CopScore is as much a personal endeavor as it is a professional one. He says his youthful interest in computer programming — which he took up as a misbehaving fifth-grader under the guiding hand of a concerned teacher — made him the butt of the occasional joke in the predominantly African-American community of North Nashville where he grew up….”(More)

Data for good


NESTA: “This report explores how capturing, sharing and analysing data in new ways can transform how charities work and how social action happens.

Key Findings

  • Citizens Advice (CAB) and DataKind partnered to develop the Civic Dashboard, a tool that mines data from CAB consultations to understand emerging social issues in the UK.
  • Shooting Star Chase volunteers refined the referral system by which children come to be at the hospices, saving children’s hospices around the country up to £90,000.
  • In a study of open grant funding data, NCVO identified 33,000 ‘below the radar’ organisations not currently captured in registers and databases on the third sector.
  • In its social media analysis of tweets on the Somerset Floods, Demos found that 39,000 tweets related to social action.

New ways of capturing, sharing and analysing data have the potential to transform how community and voluntary sector organisations work and how social action happens. However, while analysing and using data is core to how some of the world’s fastest growing businesses understand their customers and develop new products and services, civil society organisations are still some way off from making the most of this potential.
Over the last 12 months Nesta has grant funded a number of research projects that explore two dimensions of how big and open data can be used for the common good. Firstly, how it can be used by charities to develop better products and services and secondly, how it can help those interested in civil society better understand social action and civil society activity.

  • Citizens Advice Bureau (CAB) and DataKind, a global community of data scientists interested in how data can be used for a social purpose, were grant funded to explore how a data-driven approach to mining the rich data that CAB holds on social issues in the UK could be used to develop a real-time dashboard to identify emerging social issues. The project also explored how data-driven methods could better help other charities such as St Mungo’s and Buttle UK, and how data could be shared more effectively between charities as part of this process, to create collaborative data-driven projects.
  • Five organisations (The RSA, Cardiff University, The Demos Centre for Analysis of Social Media, NCVO and European Alternatives) were grant funded to explore how data-driven methods, such as open data analysis and social media analysis, can help us understand informal social action, often referred to as ‘below the radar activity’, in new ways.

This paper is not the definitive story of the opportunities in using big and open data for the common good, but it can hopefully provide insight on what can be done and lessons for others interested in exploring the opportunities in these methods….(More).”

Unleashing the Power of Data to Serve the American People


Memorandum: Unleashing the Power of Data to Serve the American People
To: The American People
From: Dr. DJ Patil, Deputy U.S. CTO for Data Policy and Chief Data Scientist

….While there is a rich history of companies using data to their competitive advantage, the disproportionate beneficiaries of big data and data science have been Internet technologies like social media, search, and e-commerce. Yet transformative uses of data in other spheres are just around the corner. Precision medicine and other forms of smarter health care delivery, individualized education, and the “Internet of Things” (which refers to devices like cars or thermostats communicating with each other using embedded sensors linked through wired and wireless networks) are just a few of the ways in which innovative data science applications will transform our future.

The Obama administration has embraced the use of data to improve the operation of the U.S. government and the interactions that people have with it. On May 9, 2013, President Obama signed Executive Order 13642, which made open and machine-readable data the new default for government information. Over the past few years, the Administration has launched a number of Open Data Initiatives aimed at scaling up open data efforts across the government, helping make troves of valuable data — data that taxpayers have already paid for — easily accessible to anyone. In fact, I used data made available by the National Oceanic and Atmospheric Administration to improve numerical methods of weather forecasting as part of my doctoral work. So I know firsthand just how valuable this data can be — it helped get me through school!

Given the substantial benefits that responsibly and creatively deployed data can provide to us and our nation, it is essential that we work together to push the frontiers of data science. Given the importance this Administration has placed on data, along with the momentum that has been created, now is a unique time to establish a legacy of data supporting the public good. That is why, after a long time in the private sector, I am returning to the federal government as the Deputy Chief Technology Officer for Data Policy and Chief Data Scientist.

Organizations are increasingly realizing that in order to maximize their benefit from data, they require dedicated leadership with the relevant skills. Many corporations, local governments, federal agencies, and others have already created such a role, which is usually called the Chief Data Officer (CDO) or the Chief Data Scientist (CDS). The role of an organization’s CDO or CDS is to help their organization acquire, process, and leverage data in a timely fashion to create efficiencies, iterate on and develop new products, and navigate the competitive landscape.

The Role of the First-Ever U.S. Chief Data Scientist

Similarly, my role as the U.S. CDS will be to responsibly source, process, and leverage data in a timely fashion to enable transparency, provide security, and foster innovation for the benefit of the American public, in order to maximize the nation’s return on its investment in data.

So what specifically am I here to do? As I start, I plan to focus on these four activities:

…(More)”

Amid Open Data Push, Agencies Feel Urge for Analytics


Jack Moore at NextGov: “Federal agencies, thanks to their unique missions, have long been collectors of valuable, vital and, no doubt, arcane data. Under a nearly two-year-old executive order from President Barack Obama, agencies are releasing more of this data in machine-readable formats to the public and entrepreneurs than ever before.
But agencies still need a little help parsing through this data for their own purposes. They are turning to industry, academia and outside researchers for cutting-edge analytics tools that can help them derive insights from their data and use those insights to drive decision-making.
Take the U.S. Agency for International Development, for example. The agency administers U.S. foreign aid programs aimed at ending extreme poverty and helping support democratic societies around the globe.
Under the agency’s own recent open data policy, it’s started collecting reams of data from its overseas missions. Starting Oct. 1, organizations doing development work on the ground – including through grants and contracts – have been directed to also collect data generated by their work and submit it back to agency headquarters. Teams go through the data, scrub it to remove sensitive material and then publish it.
The data runs the gamut from information on land ownership in South Sudan to livestock demographics in Senegal and HIV prevention activities in Zambia….The agency took the first step in solving that problem with a Jan. 20 request for information from outside groups for cutting-edge data analytics tools.
“Operating units within USAID are sometimes constrained by existing capacity to transform data into insights that could inform development programming,” the RFI stated.
The RFI queries industry on its capabilities in data mining, social media analytics, forecasting and systems modeling.
USAID is far from alone in its quest for data-driven decision-making.
A Jan. 26 RFI from the Transportation Department’s Federal Highway Administration also seeks innovative ideas from industry for “advanced analytical capabilities.”…(More)”

Opinion Mining in Social Big Data


New Paper by Wlodarczak, Peter and Ally, Mustafa and Soar, Jeffrey: “Opinion mining has rapidly gained importance due to the unprecedented amount of opinionated data on the Internet. People share their opinions on products and services; they rate movies, restaurants or vacation destinations. Social media platforms such as Facebook and Twitter have made it easier than ever for users to share their views and make them accessible to anybody on the Web. The economic potential has been recognized by companies that want to improve their products and services, detect new trends and business opportunities or find out how effective their online marketing efforts are. However, opinion mining using social media faces many challenges due to the amount and the heterogeneity of the available data. Spam and fake opinions have also become a serious issue. There are also language-related challenges, such as the use of slang and jargon on social media or special characters like smileys that are widely adopted on social media sites.
These challenges create many interesting research problems, such as determining the influence of social media on people’s actions, understanding opinion dissemination or determining the online reputation of a company. Not surprisingly, opinion mining using social media has become a very active area of research, and a lot of progress has been made over the last few years. This article describes the current state of research and the technologies that have been used in recent studies….(More)”
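As a toy illustration of the language-related challenges the authors mention (not a method from the paper), the snippet below scores polarity with a tiny hand-built lexicon that includes slang terms and emoticons, the kinds of tokens standard dictionaries tend to miss.

```python
# Bare-bones lexicon-based polarity scorer, illustrating why slang and
# emoticons matter for opinion mining on social media. The word lists are
# assumed examples, not a resource from the paper.
POSITIVE = {"good", "great", "awesome", "love", "lit"}   # "lit" = slang entry, assumed
NEGATIVE = {"bad", "terrible", "awful", "hate", "meh"}
EMOTICONS = {":)": 1, ":-)": 1, ":(": -1, ":-(": -1}


def opinion_score(text):
    """Return a crude polarity score: positive > 0, negative < 0."""
    score = 0
    for token in text.lower().split():
        score += EMOTICONS.get(token, 0)
        if token in POSITIVE:
            score += 1
        if token in NEGATIVE:
            score -= 1
    return score


print(opinion_score("the new phone is lit :)"))   # 2
print(opinion_score("service was terrible :("))   # -2
```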
 

The Tricky Task of Rating Neighborhoods on 'Livability'


Tanvi Misra at CityLab: “Jokubas Neciunas was looking to buy an apartment almost two years back in Vilnius, Lithuania. He consulted real estate platforms and government data to help him decide the best option for him. In the process, he realized that there was a lot of information out there, but no one was really using it very well.
Fast-forward two years, and Neciunas and his colleagues have created PlaceILive.com—a start-up trying to leverage open data from cities and information from social media to create a holistic, accessible tool that measures the “livability” of any apartment or house in a city.
“Smart cities are the ones that have smart citizens,” says PlaceILive co-founder Sarunas Legeckas.
The team recognizes that foraging for relevant information in the trenches of open data might not be for everyone. So they tried to “spice it up” by creating a visually appealing, user-friendly portal for people looking for a new home to buy or rent. The creators hope PlaceILive becomes a one-stop platform where people find ratings on every quality-of-life metric important to them before their housing hunt begins.
In its beta form, the site features five cities—New York, Chicago, San Francisco, London and Berlin. Once you click on the New York portal, for instance, you can search for the place you want to know about by borough, zip code, or address. I pulled up Brooklyn….The index is calculated using a variety of public information sources (from transit agencies, police departments, and the Census, for instance) as well as other available data (from the likes of Google, Socrata, and Foursquare)….(More)”
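For illustration only, a composite index of this sort can be computed as a weighted average of metrics that have each been normalised to a common 0–100 scale; the metrics, weights and numbers below are assumptions, not PlaceILive's actual methodology.

```python
# Hypothetical composite "livability" score: a weighted average of metrics that
# have already been normalised to 0-100 (higher = better). Metrics, weights and
# values are illustrative only.
def livability_index(metrics, weights):
    total_weight = sum(weights[name] for name in metrics)
    return sum(metrics[name] * weights[name] for name in metrics) / total_weight


weights = {"transit": 0.3, "safety": 0.3, "amenities": 0.2, "air_quality": 0.2}
brooklyn_block = {"transit": 85, "safety": 60, "amenities": 90, "air_quality": 55}
print(round(livability_index(brooklyn_block, weights), 1))  # 72.5
```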

2015 Edelman Trust Barometer


The 2015 Edelman Trust Barometer shows a global decline in trust over the last year, and the number of countries with trusted institutions has fallen to an all-time low among the informed public.

Among the general population, the trust deficit is even more pronounced, with nearly two-thirds of countries falling into the distruster category.
In the last year, trust has declined for three of the four institutions measured. NGOs continue to be the most trusted institution, but trust in NGOs declined from 66 to 63 percent. Sixty percent of countries now distrust media. Trust in government increased slightly, driven by big gains in India, Russia and Indonesia, but government is still distrusted in 19 of the 27 markets surveyed. And trust in business is below 50 percent in half of those markets.
 

Ebola: Call for more sharing of scientific data


From the BBC: “The devastation left by the Ebola virus in west Africa raises many questions for science, policy and international development. One issue that has yet to receive widespread media attention is the handling of genetic data on the virus. By studying its code, scientists can trace how Ebola leapt across borders, and how, like all viruses, it is constantly evolving and changing.

Yet, researchers have been privately complaining for months about the scarcity of genetic information about the virus that is entering the public domain….

At the heart of the issue is the scientific process. The main way scientists are rewarded for their work is through the quality and number of research papers they publish.
Data is only revealed for scrutiny by the wider scientific community when the research is published, which can be a lengthy process….
Dr Emma Thomson of the MRC-University of Glasgow Centre for Virus Research says all journals publishing papers on Ebola must insist that all data is released, as a collaborative approach could save lives.
“At the time of publication is really important – these days most people do it but not always and journals often insist (but not always),” she told me.
“A lot of Ebola sequencing has happened but the data hasn’t always been uploaded.
“It’s an international emergency so people need to get the data out there to allow it to be analysed in different ways by different labs.”
In the old days of the public-private race to decode the first human genome, the mood was one of making data accessible to all for the good of science and society.
Genetic science and public attitudes have moved on, but in the case of Ebola, some are saying it may be time for a rethink.
As Prof Paul Hunter, professor of health protection at the University of East Anglia, put it: “It would be tragic if, during a crisis like this, data was not being adequately shared with the public health community.
“The rapid sharing of data could help enable more rapid control of the outbreak.”…(More)”