Public Open Data: The Good, the Bad, the Future


At IDEALAB: “Some of the most powerful tools combine official public data with social media or other citizen input, such as the recent partnership between Yelp and the public health departments in New York and San Francisco for restaurant hygiene inspection ratings. In other contexts, such tools can help uncover and ultimately reduce corruption by making it easier to “follow the money.”
Despite the opportunities offered by “free data,” this trend also raises new challenges and concerns, among them personal privacy and security. While attention has been devoted to the unsettling power of big data analysis and “predictive analytics” for corporate marketing, similar questions could be asked about the value of public data. Does it contribute to community cohesion that I can find out, with a single query, how much my neighbors paid for their house or (if employed by public agencies) their salaries? Indeed, some studies suggest that greater transparency leads not to greater trust in government but to resignation and apathy.
Exposing certain law enforcement data also increases the possibility of vigilantism. California law requires the registration and publication of the home addresses of known sex offenders, for instance. Or consider the controversy and online threats that erupted when, shortly after the Newtown tragedy, a newspaper in New York posted an interactive map of gun permit owners in nearby counties.
…So what does the future hold for open data? Publishing data is only one part of the information ecosystem: to be useful, tools must also be developed for cleaning, sorting, analyzing, and visualizing it. Policymakers and officials must still mind the “big data gap.” …
For-profit companies and non-profit watchdog organizations will continue to emerge and expand, building on the foundation of this data flood. Public-private partnerships such as those between San Francisco and Appallicious or Granicus, startups created by Code for America’s Incubator, and non-partisan organizations like the Sunlight Foundation and MapLight rely on public data repositories for their innovative applications and analysis.
Making public data more accessible is an important goal and offers enormous potential to increase civic engagement. To make the most effective and equitable use of this resource for the public good, cities and other government entities should invest in the personnel and equipment — hardware and software — to make it universally accessible. At the same time, Chief Data Officers (or equivalent roles) should also be alert to the often hidden challenges of equity, inclusion, privacy, and security.”

Innovating to Improve Disaster Response and Recovery


Todd Park at OSTP blog: “Last week, the White House Office of Science and Technology Policy (OSTP) and the Federal Emergency Management Agency (FEMA) jointly challenged a group of over 80 top innovators from around the country to come up with ways to improve disaster response and recovery efforts.  This diverse group of stakeholders, consisting of representatives from Zappos, Airbnb, Marriott International, the Parsons School of Design, AOL/Huffington Post’s Social Impact, The Weather Channel, Twitter, Topix.com, Twilio, New York City, Google and the Red Cross, to name a few, spent an entire day at the White House collaborating on ideas for tools, products, services, programs, and apps that can assist disaster survivors and communities…
During the “Data Jam/Think Tank,” we discussed response and recovery challenges… Below are some of the ideas that were developed throughout the day. In the case of the first two ideas, participants wrote code and created actual working prototypes.

  • A real-time communications platform that allows survivors dependent on electricity-powered medical devices to text or call in their needs—such as batteries, medication, or a power generator—and connect those needs with a collaborative transportation network to make real-time deliveries.
  • A technical schema that tags all disaster-related information from social media and news sites, enabling municipalities and first responders to better understand the invaluable information generated during a disaster and identify where they can help (a sketch of what such a schema might look like follows this list).
  • A Disaster Relief Innovation Vendor Engine (DRIVE), which aggregates pre-approved vendors for disaster-related needs, including transportation, power, housing, and medical supplies, to make it as easy as possible to find scarce local resources.
  • A crowdfunding platform for small businesses and others to receive access to capital to help rebuild after a disaster, including a rating system that encourages rebuilding efforts that improve the community.
  • Promoting preparedness through talk shows, working closely with celebrities, musicians, and children to raise awareness.
  • A “community power-go-round” that, like a merry-go-round, can be pushed to generate electricity and additional power for battery-charged devices including cell phones or a Wi-Fi network to provide community internet access.
  • Aggregating crowdsourced imagery taken and shared through social media sites to help identify where trees have fallen, electrical lines have been toppled, and streets have been obstructed.
  • A kid-run local radio station used to educate youth about preparedness for a disaster and activated to support relief efforts during a disaster that allows youth to share their experiences.”
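The tagging-schema idea in the second bullet lends itself to a concrete illustration. Below is a minimal sketch in Python of what such a schema might look like; all field names and categories are hypothetical, invented for illustration rather than drawn from the prototype actually built at the event.

```python
# Hypothetical schema for tagging disaster-related social media and news items.
# Every field name here is illustrative; the event prototype's schema is not public.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class DisasterReport:
    source: str                       # e.g. "twitter", "news", "sms"
    url: str                          # link back to the original item
    timestamp: str                    # ISO 8601, e.g. "2013-08-30T14:05:00Z"
    disaster_id: str                  # which event this relates to, e.g. "sandy-2012"
    category: str                     # e.g. "road-blocked", "power-outage", "medical-need"
    latitude: Optional[float] = None  # geotag, if available
    longitude: Optional[float] = None
    urgency: int = 0                  # 0 = informational .. 3 = life-threatening
    keywords: List[str] = field(default_factory=list)

def urgent_reports(reports, category, min_urgency=2):
    """A first responder's dashboard could filter tagged items like this."""
    return [r for r in reports if r.category == category and r.urgency >= min_urgency]
```

The point of a shared schema is interoperability: if every feed tags items the same way, a municipality can merge tweets, news stories, and SMS reports into one triage queue.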

New! Humanitarian Computing Library


Patrick Meier at iRevolution: “The field of “Humanitarian Computing” applies Human Computing and Machine Computing to address major information-based challenges in the humanitarian space. Human Computing refers to crowdsourcing and microtasking, which is also referred to as crowd computing. In contrast, Machine Computing draws on natural language processing and machine learning, amongst other disciplines. The Next Generation Humanitarian Technologies we are prototyping at QCRI are powered by Humanitarian Computing research and development (R&D).
My QCRI colleagues and I just launched the first-ever Humanitarian Computing Library, which is publicly available here. The purpose of this library, or wiki, is to consolidate existing and future research that relates to Humanitarian Computing in order to support the development of next-generation humanitarian tech. The repository currently holds over 500 publications that span topics such as Crisis Management, Trust and Security, Software and Tools, Geographical Analysis, and Crowdsourcing. These publications are largely drawn from (but not limited to) peer-reviewed papers presented at leading conferences around the world. We invite you to add your own research on humanitarian computing to this growing collection of resources.”

How Mechanical Turkers Crowdsourced a Huge Lexicon of Links Between Words and Emotion


The Physics arXiv Blog: “Sentiment analysis on the social web depends on how a person’s state of mind is expressed in words. Now a new database of the links between words and emotions could provide a better foundation for this kind of analysis.


One of the buzzphrases associated with the social web is sentiment analysis. This is the ability to determine a person’s opinion or state of mind by analysing the words they post on Twitter, Facebook or some other medium.
Much has been promised with this method—the ability to measure satisfaction with politicians, movies and products; the ability to better manage customer relations; the ability to create dialogue for emotion-aware games; the ability to measure the flow of emotion in novels; and so on.
The idea is to entirely automate this process—to analyse the firehose of words produced by social websites using advanced data mining techniques to gauge sentiment on a vast scale.
But all this depends on how well we understand the emotion and polarity (whether negative or positive) that people associate with each word or combinations of words.
Today, Saif Mohammad and Peter Turney at the National Research Council Canada in Ottawa unveil a huge database of words and their associated emotions and polarity, which they have assembled quickly and inexpensively using Amazon’s Mechanical Turk crowdsourcing website. They say this crowdsourcing mechanism makes it possible to increase the size and quality of the database quickly and easily… The result is a comprehensive word-emotion lexicon for more than 10,000 words and two-word phrases, which they call EmoLex…
The bottom line is that sentiment analysis can only ever be as good as the database on which it relies. With EmoLex, analysts have a new tool for their box of tricks.”
Ref: arxiv.org/abs/1308.6297: Crowdsourcing a Word-Emotion Association Lexicon
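To make the idea concrete, here is a minimal sketch of lexicon-based scoring in Python. The three-entry toy lexicon and its label set are placeholders invented for illustration; the real EmoLex, as described in the paper referenced above, associates its 10,000-plus entries with emotions and positive/negative polarity.

```python
# Minimal sketch of lexicon-based sentiment/emotion scoring.
# The toy lexicon below is a placeholder; the real EmoLex covers 10,000+ entries.
from collections import Counter

lexicon = {
    "delighted": {"joy", "positive"},
    "abandoned": {"sadness", "fear", "negative"},
    "reliable":  {"trust", "positive"},
}

def score(text: str) -> Counter:
    """Count how often each emotion/polarity label is triggered by the text."""
    counts = Counter()
    for word in text.lower().split():
        counts.update(lexicon.get(word.strip(".,!?"), ()))
    return counts

print(score("Delighted with this reliable phone!"))
# Counter({'positive': 2, 'joy': 1, 'trust': 1})
```

This also illustrates the article’s bottom line: the scorer is only as good as the lexicon behind it, which is why the size and quality gains from crowdsourcing matter.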

From Collaborative Coding to Wedding Invitations: GitHub Is Going Mainstream


Wired: “With 3.4 million users, the five-year-old site is a runaway hit in the hacker community, the go-to place for coders to show off pet projects and crowdsource any improvements. But the company has grander ambitions: It wants to change the way people work. It’s starting with software developers for sure, but maybe one day anyone who edits text in one form or another — lawyers, writers, and civil servants — will do it the GitHub way.
To first-time visitors, GitHub looks like a twisted version of Facebook, built in some alternate universe where YouTube videos and photos of cats have somehow morphed into snippets of code. But many of the underlying concepts are the same. You can “follow” other hackers to see what they’re working on. You can comment on their code — much like you’d do on a Facebook photo. You can even “star” a project to show that you like it, just as you’d “favorite” something on Twitter.
But it’s much more than a social network. People discover new projects and then play around with them, making changes, trying out new ideas. Then, with the push of a button, they merge them into something better. You can also “fork” projects. That’s GitHub lingo for making a copy of a project so you can build and modify your own, independent version.
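Forking isn’t only a button on the site; GitHub also exposes it through its public REST API. A minimal sketch in Python follows (the owner/repo and token below are placeholders, not from the article):

```python
# Minimal sketch: fork a repository via GitHub's REST API.
# The owner/repo and "your-token" are placeholders.
import requests

def fork_repo(owner: str, repo: str, token: str) -> dict:
    resp = requests.post(
        f"https://api.github.com/repos/{owner}/{repo}/forks",
        headers={"Authorization": f"token {token}"},
    )
    resp.raise_for_status()
    return resp.json()  # metadata describing your new fork

fork = fork_repo("octocat", "Hello-World", "your-token")
print(fork["full_name"])  # e.g. "your-username/Hello-World"
```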
People didn’t just suggest changes to Lee’s Twitter patent license. It was forked 53 times: by Arul, by a computer science student in Portland, by a Belgian bicycle designer. These forks can now evolve and potentially even merge back into Lee’s agreement. The experiment also inspired Fenwick & West, one of Silicon Valley’s top legal firms (and GitHub’s law firm), to post 30 pages of standard documents for startups to GitHub earlier this year.”

The Other Side of Open is Not Closed


Dazza Greenwood at Civics.com: “Impliedly, the opposite of “open” is “closed,” but the other side of open data, open APIs, and open access is usually still about enabling access, only when allowed or required. Open government also needs to include adequate methods to access and work with data and other resources that are not fully open. In fact, much (most?) high-value, mission-critical, and societally important data is access-restricted in some way. If a data-set is not fully a public record, then a good practice is to think of it as “protected” and to ensure access according to proper controls.
As a metaphorical illustration, you could look at an open data system like a village square or agora that is architected and intended to be broadly accessible. On the other side of the spectrum, you could see a protected data system more like a castle or garrison, that is architected to be secure from intruders but features guarded gates and controlled access points in order to function.
In fact, this same conceptual approach applies well beyond data and includes everything you could consider a resource on the Internet. In other words, any asset, service, process or other item that can exist at a URL (or URI) is a resource and can be positioned somewhere on a spectrum from openly accessible to access protected. It is easy to forget that the “R” in URL stands for “Resource” and the whole wonderful web connects to resources of every nature and description. Data – structured, raw or otherwise – is just the tip of the iceberg.
Resources on the web could be apps and other software, large-scale enterprise network services, or just a single text file with a few lines of HTML. The concept of enabling permissioned access to “protected resources” on the web is the cornerstone of OAuth2 and is now being extended by the OpenID Connect standard, the User Managed Access protocol and other specifications to enable a powerful array of REST-based authorization possibilities…”
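To ground the “guarded gates” metaphor, here is a minimal sketch of the OAuth2 protected-resource pattern in Python. The endpoint URLs, client ID, and secret are hypothetical placeholders; only the grant and Bearer-header mechanics follow the OAuth2 specifications (RFC 6749/6750).

```python
# Minimal sketch of the OAuth2 "protected resource" pattern (RFC 6749/6750).
# All URLs and credentials below are hypothetical placeholders.
import requests

TOKEN_URL = "https://auth.example.org/oauth/token"
RESOURCE_URL = "https://api.example.org/protected/records"

# Step 1: exchange client credentials for an access token at the gate.
token_resp = requests.post(TOKEN_URL, data={
    "grant_type": "client_credentials",
    "client_id": "my-client-id",
    "client_secret": "my-client-secret",
})
access_token = token_resp.json()["access_token"]

# Step 2: present the token as a Bearer credential to reach the protected resource.
resource_resp = requests.get(
    RESOURCE_URL,
    headers={"Authorization": f"Bearer {access_token}"},
)
print(resource_resp.status_code)  # 200 if authorized, 401/403 otherwise
```

The design mirrors the castle image above: the resource is not open, but access through the gate is well defined and controllable.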

Assessing Zuckerberg’s Idea That Facebook Could Help Citizens Re-Make Their Government


Gregory Ferenstein in TechCrunch: “Mark Zuckerberg has a grand vision that Facebook will help citizens in developing countries decide their own governments. It’s a lofty and partially attainable goal. While Egypt probably won’t let citizens vote for their next president with a Like, it is theoretically possible to use Facebook to crowdsource expertise. Governments around the world are experimenting with radical online direct democracy, but it doesn’t always work out.

Very briefly, Zuckerberg laid out his broad vision for e-government to Wired’s Steven Levy, while defending Internet.org, a new consortium to bring broadband to the developing world.

“People often talk about how big a change social media had been for our culture here in the U.S. But imagine how much bigger a change it will be when a developing country comes online for the first time ever. We use things like Facebook to share news and keep in touch with our friends, but in those countries, they’ll use this for deciding what kind of government they want to have. Getting access to health care information for the first time ever.”

When he references “deciding … government,” Zuckerberg could be talking about voting, sharing ideas, or crafting a constitution. We decided to assess the possibilities of them all….
For citizens in the exciting/terrifying position of constructing a brand-new government, American-style democracy is one of many options. Britain, for instance, has a parliamentary system and no written constitution. In other cases, a government may want to heed political scientists’ advice and develop a “consensus democracy,” where more than two political parties are incentivized to work collaboratively with citizens, business, and different branches of government to craft laws.
At least once, choosing a new style of democracy has been attempted through the Internet. After the global financial meltdown wrecked Iceland’s economy, the happy citizens of the grass-covered country decided to redo their government and solicit suggestions from the public (950 Icelanders chosen by lottery and general calls for ideas through social networks). After much press about Iceland’s “crowdsourced” constitution, the effort crashed miserably when most of the elected leaders rejected it.
Crafting law, especially a constitution, is legally complex; unless there is a systematic way to translate haphazard citizen suggestions into legalese, the results are disastrous.
“Collaborative drafting, at large scale, at low costs, and that is inclusive, is something that we still don’t know how to do,” says Tiago Peixoto, a World Bank Consultant on participatory democracy (and one of our Most Innovative People In Democracy).
Peixoto, who helps the Brazilian government conduct some of the world’s only online policymaking, says he’s optimistic that Facebook could be helpful, but he wouldn’t use it to draft laws just yet.
While technically it is possible for social networks to craft a new government, we just don’t know how to do it very well, and, therefore, leaders are likely to reject the idea. In other words, don’t expect Egypt to decide their future through Facebook likes.”

Mapping the Twitterverse


Phys.org: “What does your Twitter profile reveal about you? More than you know, according to Chris Weidemann. The GIST master’s student has developed an application that follows geospatial footprints.
You start your day at your favorite breakfast spot. When your order of strawberry waffles with extra whipped cream arrives, it’s too delectable not to share with your Twitter followers. You snap a photo with your smartphone and hit send. Then, it’s time to hit the books.
You tweet your friends that you’ll be at the library on campus. Later that day, palm trees silhouette a neon-pink sunset. You can’t resist. You tweet a picture with the hashtag #ILoveLA.
You may not realize that when you tweet those breezy updates and photos of food, you are sharing information about your location.
Chris Weidemann, a graduate student in the Geographic Information Science and Technology (GIST) online master’s program at USC Dornsife, investigated just how much public geospatial data was generated by Twitter users and how their information—available through Twitter’s application programming interface (API)—could potentially be used by third parties. His study was published in June 2013 in the International Journal of Geoinformatics.
Twitter has approximately 500 million active users, and reports show that 6 percent of users opt in to allow the platform to broadcast their location using global positioning technology with each tweet they post. That’s about 30 million people sending geo-tagged data out into the Twitterverse. In their tweets, people can choose whether their location is displayed as a city and state, as an address, or as their precise latitude and longitude.
That’s only part of their geospatial footprint. Information contained in a post may reveal a user’s location, and depending upon how the account is set up, a profile may include details about the user’s hometown, time zone and language.”
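To illustrate how little effort such harvesting takes, here is a minimal sketch in Python that pulls location clues out of a single tweet object. The sample object is fabricated, and the field names assume Twitter’s v1.1-era REST API, in which geotags are GeoJSON points ordered longitude-first.

```python
# Minimal sketch: extracting location clues from one tweet object.
# Field names follow Twitter's v1.1-era API; this sample object is fabricated.
tweet = {
    "text": "Strawberry waffles! #ILoveLA",
    "coordinates": {"type": "Point", "coordinates": [-118.2851, 34.0224]},
    "user": {
        "location": "Los Angeles, CA",
        "time_zone": "Pacific Time (US & Canada)",
        "lang": "en",
    },
}

def geo_footprint(tweet: dict) -> dict:
    clues = {}
    if tweet.get("coordinates"):  # precise GPS geotag, present only if the user opted in
        lon, lat = tweet["coordinates"]["coordinates"]  # GeoJSON order: [lon, lat]
        clues["point"] = (lat, lon)
    user = tweet.get("user", {})
    for key in ("location", "time_zone", "lang"):  # coarser profile-level clues
        if user.get(key):
            clues[key] = user[key]
    return clues

print(geo_footprint(tweet))
```

Run over a stream of tweets rather than a single one, the same few lines yield exactly the kind of cumulative footprint the study describes.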
 

Twitter’s activist roots: How Twitter’s past shapes its use as a protest tool


Radio Netherlands Worldwide: “Surprised when demonstrators from all over the world took to Twitter as a protest tool? Evan “Rabble” Henshaw-Plath, member of Twitter’s founding team, was not. Rather, he sees it as a return to its roots: Inspired by protest coordination tools like TXTMob, and shaped by the values and backgrounds of Twitter’s founders, he believes activist potential was built into the service from the start.

It took a few revolutions before Twitter was taken seriously. Critics claimed that its 140-character limit only provided space for the most trivial thoughts: neat for keeping track of Ashton Kutcher’s lunch choices, but not much else. It made the transition from Silicon Valley toy into Middle East protest tool seem all the more astonishing.
Unless, Twitter co-founder Evan Henshaw-Plath argues, you know the story of how Twitter came to be. Henshaw-Plath was the lead developer at Odeo, the company that spawned and eventually became Twitter. TXTMob, an activist tool deployed during the 2004 Republican National Convention in the US to coordinate protest efforts via SMS, was, says Henshaw-Plath, a direct inspiration for Twitter.
Protest 1.0
In 2004, while Henshaw-Plath was working at Odeo, he and a few other colleagues found a fun side-project in working on TXTMob, an initiative by what he describes as a “group of academic artist/prankster/hacker/makers” that operated under the ostensibly serious moniker of Institute for Applied Autonomy (IAA). Earlier IAA projects included small graffiti robots on wheels that spray-painted slogans on pavements during demonstrations, and a pudgy talking robot with big puppy eyes made to distribute subversive literature to people who ignored less-cute human pamphleteers.
TXTMob was a more serious endeavor than these earlier projects: a tactical protest coordination tool. With TXTMob, users could quickly exchange text messages with large groups of other users about protest locations and police crackdowns….”

Big Data and Disease Prevention: From Quantified Self to Quantified Communities


New Paper by Meredith A. Barrett, Olivier Humblet, Robert A. Hiatt, and Nancy E. Adler: “Big data is often discussed in the context of improving medical care, but it also has a less appreciated but equally important role to play in preventing disease. Big data can facilitate action on the modifiable risk factors that contribute to a large fraction of the chronic disease burden, such as physical activity, diet, tobacco use, and exposure to pollution. It can do so by facilitating the discovery of risk factors for disease at population, subpopulation, and individual levels, and by improving the effectiveness of interventions to help people achieve healthier behaviors in healthier environments. In this article, we describe new sources of big data in population health, explore their applications, and present two case studies illustrating how big data can be leveraged for prevention. We also discuss the many implementation obstacles that must be overcome before this vision can become a reality.”