How Mechanical Turkers Crowdsourced a Huge Lexicon of Links Between Words and Emotion


The Physics arXiv Blog: Sentiment analysis on the social web depends on how a person’s state of mind is expressed in words. Now a new database of the links between words and emotions could provide a better foundation for this kind of analysis


One of the buzzphrases associated with the social web is sentiment analysis. This is the ability to determine a person’s opinion or state of mind by analysing the words they post on Twitter, Facebook or some other medium.
Much has been promised with this method—the ability to measure satisfaction with politicians, movies and products; the ability to better manage customer relations; the ability to create dialogue for emotion-aware games; the ability to measure the flow of emotion in novels; and so on.
The idea is to entirely automate this process—to analyse the firehose of words produced by social websites using advanced data mining techniques to gauge sentiment on a vast scale.
But all this depends on how well we understand the emotion and polarity (whether negative or positive) that people associate with each word or combinations of words.
Today, Saif Mohammad and Peter Turney at the National Research Council Canada in Ottawa unveil a huge database of words and their associated emotions and polarity, which they have assembled quickly and inexpensively using Amazon’s crowdsourcing Mechanical Turk website. They say this crowdsourcing mechanism makes it possible to increase the size and quality of the database quickly and easily….The result is a comprehensive word-emotion lexicon for over 10,000 words or two-word phrases which they call EmoLex….
The bottom line is that sentiment analysis can only ever be as good as the database on which it relies. With EmoLex, analysts have a new tool for their box of tricks.”
Ref: arxiv.org/abs/1308.6297: Crowdsourcing a Word-Emotion Association Lexicon