How Mechanical Turkers Crowdsourced a Huge Lexicon of Links Between Words and Emotion


The Physics arXiv Blog: “Sentiment analysis on the social web depends on how a person’s state of mind is expressed in words. Now a new database of the links between words and emotions could provide a better foundation for this kind of analysis.


One of the buzzphrases associated with the social web is sentiment analysis. This is the ability to determine a person’s opinion or state of mind by analysing the words they post on Twitter, Facebook or some other medium.
Much has been promised with this method—the ability to measure satisfaction with politicians, movies and products; the ability to better manage customer relations; the ability to create dialogue for emotion-aware games; the ability to measure the flow of emotion in novels; and so on.
The idea is to entirely automate this process—to analyse the firehose of words produced by social websites using advanced data mining techniques to gauge sentiment on a vast scale.
But all this depends on how well we understand the emotion and polarity (whether negative or positive) that people associate with each word or combinations of words.
Today, Saif Mohammad and Peter Turney at the National Research Council Canada in Ottawa unveil a huge database of words and their associated emotions and polarity, which they have assembled quickly and inexpensively using Amazon’s crowdsourcing Mechanical Turk website. They say this crowdsourcing mechanism makes it possible to increase the size and quality of the database quickly and easily…. The result is a comprehensive word-emotion lexicon for over 10,000 words or two-word phrases, which they call EmoLex….
The bottom line is that sentiment analysis can only ever be as good as the database on which it relies. With EmoLex, analysts have a new tool for their box of tricks.”
Ref: arxiv.org/abs/1308.6297: Crowdsourcing a Word-Emotion Association Lexicon
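
A minimal sketch of how a word-emotion lexicon of this kind could feed a simple sentiment count. The entries in `TOY_LEXICON` are invented for illustration and are not taken from EmoLex; the function name `score_text` and the plain word-counting approach are assumptions, not the authors' method.

```python
import re
from collections import Counter

# Hypothetical toy lexicon: word -> associated emotion and polarity labels.
# These entries are invented for illustration; they are NOT EmoLex data.
TOY_LEXICON = {
    "love": {"joy", "positive"},
    "great": {"joy", "positive"},
    "awful": {"disgust", "negative"},
    "afraid": {"fear", "negative"},
}

def score_text(text, lexicon=TOY_LEXICON):
    """Count how many times each label is triggered by words in the text."""
    counts = Counter()
    # Lowercase the text and split it into simple alphabetic tokens.
    for word in re.findall(r"[a-z']+", text.lower()):
        for label in lexicon.get(word, ()):
            counts[label] += 1
    return counts

scores = score_text("I love this movie, but the ending was awful.")
print(dict(scores))
```

A lookup like this can only recognise words that appear in the lexicon, which is why the size and quality of the underlying database matter so much.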

From Collaborative Coding to Wedding Invitations: GitHub Is Going Mainstream


Wired: “With 3.4 million users, the five-year-old site is a runaway hit in the hacker community, the go-to place for coders to show off pet projects and crowdsource any improvements. But the company has grander ambitions: It wants to change the way people work. It’s starting with software developers for sure, but maybe one day anyone who edits text in one form or another — lawyers, writers, and civil servants — will do it the GitHub way.
To first-time visitors, GitHub looks like a twisted version of Facebook, built in some alternate universe where YouTube videos and photos of cats have somehow morphed into snippets of code. But many of the underlying concepts are the same. You can “follow” other hackers to see what they’re working on. You can comment on their code — much like you’d do on a Facebook photo. You can even “star” a project to show that you like it, just as you’d “favorite” something on Twitter.
But it’s much more than a social network. People discover new projects and then play around with them, making changes, trying out new ideas. Then, with the push of a button, they merge into something better. You can also “fork” projects. That’s GitHub lingo for when you make a copy of a project so you can then build and modify your own, independent version.
People didn’t just suggest changes to Lee’s Twitter patent license. It was forked 53 times: by Arul, by a computer science student in Portland, by a Belgian bicycle designer. These forks can now evolve and potentially even merge back into Lee’s agreement. The experiment also inspired Fenwick & West, one of Silicon Valley’s top legal firms (and GitHub’s law firm) to post 30 pages of standard documents for startups to GitHub earlier this year.”

Assessing Zuckerberg’s Idea That Facebook Could Help Citizens Re-Make Their Government


Gregory Ferenstein in TechCrunch: “Mark Zuckerberg has a grand vision that Facebook will help citizens in developing countries decide their own governments. It’s a lofty and partially attainable goal. While Egypt probably won’t let citizens vote for their next president with a Like, it is theoretically possible to use Facebook to crowdsource expertise. Governments around the world are experimenting with radical online direct democracy, but it doesn’t always work out.

Very briefly, Zuckerberg laid out his broad vision for e-government to Wired’s Steven Levy, while defending Internet.org, a new consortium to bring broadband to the developing world.

“People often talk about how big a change social media has been for our culture here in the U.S. But imagine how much bigger a change it will be when a developing country comes online for the first time ever. We use things like Facebook to share news and keep in touch with our friends, but in those countries, they’ll use this for deciding what kind of government they want to have. Getting access to health care information for the first time ever.”

When he references “deciding … government,” Zuckerberg could be talking about voting, sharing ideas, or crafting a constitution. We decided to assess the possibilities of them all….
For citizens in the exciting/terrifying position to construct a brand-new government, American-style democracy is one of many options. Britain, for instance, has a parliamentary system and has no constitution. In other cases, a government may want to heed political scientists’ advice and develop a “consensus democracy,” where more than two political parties are incentivized to work collaboratively with citizens, business, and different branches of government to craft laws.
At least once, choosing a new style of democracy has been attempted through the Internet. After the global financial meltdown wrecked Iceland’s economy, the happy citizens of the grass-covered country decided to redo their government and solicit suggestions from the public (950 Icelanders chosen by lottery and general calls for ideas through social networks). After much press about Iceland’s “crowdsourced” constitution, it crashed miserably after most of the elected leaders rejected it.
Crafting law, especially a constitution, is legally complex; unless there is a systematic way to translate haphazard citizen suggestions into legalese, the results are disastrous.
“Collaborative drafting, at large scale, at low costs, and that is inclusive, is something that we still don’t know how to do,” says Tiago Peixoto, a World Bank Consultant on participatory democracy (and one of our Most Innovative People In Democracy).
Peixoto, who helps the Brazilian government conduct some of the world’s only online policymaking, says he’s optimistic that Facebook could be helpful, but he wouldn’t use it to draft laws just yet.
While technically it is possible for social networks to craft a new government, we just don’t know how to do it very well, and, therefore, leaders are likely to reject the idea. In other words, don’t expect Egypt to decide their future through Facebook likes.”

Mapping the Twitterverse


Phys.org: “What does your Twitter profile reveal about you? More than you know, according to Chris Weidemann. The GIST master’s student has developed an application that follows geospatial footprints.
You start your day at your favorite breakfast spot. When your order of strawberry waffles with extra whipped cream arrives, it’s too delectable not to share with your Twitter followers. You snap a photo with your smartphone and hit send. Then, it’s time to hit the books.
You tweet your friends that you’ll be at the library on campus. Later that day, palm trees silhouette a neon-pink sunset. You can’t resist. You tweet a picture with the hashtag #ILoveLA.
You may not realize that when you tweet those breezy updates and photos of food, you are sharing information about your location.
Chris Weidemann, a graduate student in the Geographic Information Science and Technology (GIST) online master’s program at USC Dornsife, investigated just how much public data was generated by Twitter users and how their information, available through Twitter’s application programming interface (API), could potentially be used by third parties. His study was published in June 2013 in the International Journal of Geoinformatics.
Twitter has approximately 500 million active users, and reports show that 6 percent of users opt in to allow the platform to broadcast their location using global positioning technology with each tweet they post. That’s about 30 million people sending geo-tagged data out into the Twitterverse. In their tweets, people can choose whether their information is displayed as a city and state, an address or their precise latitude and longitude.
That’s only part of their geospatial footprint. Information contained in a post may reveal a user’s location. Depending upon how the account is set up, profiles may include details about their hometown, time zone and language.”
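
A rough sketch of how the location clues described above could be gathered into a single footprint. The record layout (`coordinates`, `place`, `profile` and its keys), the function name and the sample data are all hypothetical, chosen for illustration; they do not reproduce Twitter's API or Weidemann's application.

```python
def geospatial_footprint(tweets):
    """Collect location clues from a list of tweet-like dicts.

    The field names used here are hypothetical, not Twitter's actual schema.
    """
    footprint = {"points": [], "places": set(), "profile": {}}
    for tweet in tweets:
        # Explicit latitude/longitude attached to the tweet, if any.
        coords = tweet.get("coordinates")
        if coords:
            footprint["points"].append((coords["lat"], coords["lon"]))
        # Coarser place label such as a city and state.
        place = tweet.get("place")
        if place:
            footprint["places"].add(place)
        # Profile details that hint at location even without geotagging.
        for key in ("hometown", "time_zone", "language"):
            value = tweet.get("profile", {}).get(key)
            if value:
                footprint["profile"][key] = value
    return footprint

# Invented sample records, loosely following the day described in the article.
sample_tweets = [
    {"text": "Strawberry waffles!", "coordinates": {"lat": 34.02, "lon": -118.28}},
    {"text": "At the library on campus", "place": "Los Angeles, CA"},
    {"text": "#ILoveLA", "profile": {"time_zone": "Pacific Time", "language": "en"}},
]

footprint = geospatial_footprint(sample_tweets)
print(footprint)
```

Even in this toy form, only one of the three records carries explicit coordinates, yet all three contribute something to the footprint.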

Twitter’s activist roots: How Twitter’s past shapes its use as a protest tool


Radio Netherlands Worldwide: “Surprised when demonstrators from all over the world took to Twitter as a protest tool? Evan “Rabble” Henshaw-Plath, member of Twitter’s founding team, was not. Rather, he sees it as a return to its roots: Inspired by protest coordination tools like TXTMob, and shaped by the values and backgrounds of Twitter’s founders, he believes activist potential was built into the service from the start.

It took a few revolutions before Twitter was taken seriously. Critics claimed that its 140-character limit only provided space for the most trivial thoughts: neat for keeping track of Ashton Kutcher’s lunch choices, but not much else. It made the transition from Silicon Valley toy into Middle East protest tool seem all the more astonishing.
Unless, Twitter co-founder Evan Henshaw-Plath argues, you know the story of how Twitter came to be. Evan Henshaw-Plath was the lead developer at Odeo, the company that started and eventually became Twitter. TXTMob, an activist tool deployed during the 2004 Republican National Convention in the US to coordinate protest efforts via SMS, was, says Henshaw-Plath, a direct inspiration for Twitter.
Protest 1.0
In 2004, while Henshaw-Plath was working at Odeo, he and a few other colleagues found a fun side-project in working on TXTMob, an initiative by what he describes as a “group of academic artist/prankster/hacker/makers” that operated under the ostensibly serious moniker of Institute for Applied Autonomy (IAA). Earlier IAA projects included small graffiti robots on wheels that spray painted slogans on pavements during demonstrations, and a pudgy talking robot with big puppy eyes made to distribute subversive literature to people who ignored less-cute human pamphleteers.
TXTMob was a more serious endeavor than these earlier projects: a tactical protest coordination tool. With TXTMob, users could quickly exchange text messages with large groups of other users about protest locations and police crackdowns….”

The Global Database of Events, Language, and Tone (GDELT)


“The Global Database of Events, Language, and Tone (GDELT) is an initiative to construct a catalog of human societal-scale behavior and beliefs across all countries of the world over the last two centuries down to the city level globally, to make all of this data freely available for open research, and to provide daily updates to create the first “realtime social sciences earth observatory.” Nearly a quarter-billion georeferenced events capture global behavior in more than 300 categories covering 1979 to present with daily updates. GDELT is designed to help support new theories and descriptive understandings of the behaviors and driving forces of global-scale social systems from the micro-level of the individual through the macro-level of the entire planet by offering realtime synthesis of global societal-scale behavior into a rich quantitative database allowing realtime monitoring and analytical exploration of those trends.
GDELT’s goal is to help uncover previously-obscured spatial, temporal, and perceptual evolutionary trends through new forms of analysis of the vast textual repositories that capture global societal activity, from news and social media archives to knowledge repositories.”

The New Reality of Social Production


Don Peppers on LinkedIn: “…Waze is yet another example of social production, or the increasingly common use of connected people working together to create value with little or no actual economic incentives involved. Instead, social production is based on a completely different set of principles – sharing and giving, rather than trading and selling. It is an important aspect of what some are now calling the “sharing economy,” and systems like Waze are ever more rapidly replacing or supplementing large portions of the commercial economy, as Martha Rogers and I document in our book Extreme Trust.
In the commercial economy, where profit-making entities operate, what you pay for determines what you get. I pay you, and you give me something of value. I may be a customer buying a product or service, or you may be the boss paying my salary, but either way neither of us is volunteering. We are trading our time or money for value in return. In the commercial economy, we all expect to pay for the things we want. When you pay the grocer $6 for a 12-pack of Diet Coke by the can, you don’t begrudge him the money. And you wouldn’t even consider asking the grocer to give you the soda voluntarily, for free – the way a Waze participant voluntarily reports a new hazard for other participants.
An economic system based on money, as ours is, facilitates the efficient division of labor, enabling us to accomplish more and more complex tasks by dividing them into simple components. The end result is that you don’t have to wire your own smartphone together or harvest your own wheat for your morning bagel. The division-of-labor principle has allowed technology to become so complex that none of us today could ever make even the simplest manufactured products all by ourselves.
But because of the very efficient way in which people are now electronically connected, many social production tasks can also be parsed up and allocated bit by bit among assorted different players – just talk to any of the 3.4 million volunteer coders and developers who work on the more than 300,000 different open-source software projects now registered at Sourceforge, for example. Moreover, these tasks are sometimes so complex, diffused, or difficult that accomplishing them with a commercial model just wouldn’t be practical. Imagine what it would have taken for Waze’s organizers to identify and monitor traffic hazards across the nation on their own, for instance. A small army of paid scouts or robotic monitors would have been required, continually updating the system, and the cost would have made the whole project completely unrealistic…”

Index: The Data Universe


The Living Library Index – inspired by the Harper’s Index – provides important statistics and highlights global trends in governance innovation. This installment focuses on the data universe and was originally published in 2013.

  • How much data exists in the digital universe as of 2012: 2.7 zettabytes*
  • Increase in the quantity of Internet data from 2005 to 2012: +1,696%
  • Percent of the world’s data created in the last two years: 90
  • Number of exabytes (=1 billion gigabytes) created every day in 2012: 2.5; that number doubles every month
  • Percent of the digital universe in 2005 created by the U.S. and western Europe vs. emerging markets: 48 vs. 20
  • Percent of the digital universe in 2012 created by emerging markets: 36
  • Percent of the digital universe in 2020 predicted to be created by China alone: 21
  • How much information in the digital universe is created and consumed by consumers (video, social media, photos, etc.) in 2012: 68%
  • Percent of which enterprises have liability or responsibility for (copyright, privacy, compliance with regulations, etc.): 80
  • Amount included in the Obama Administration’s 2012 Big Data initiative: over $200 million
  • Amount the Department of Defense is investing annually on Big Data projects as of 2012: over $250 million
  • Data created per day in 2012: 2.5 quintillion bytes
  • How many terabytes* of data collected by the U.S. Library of Congress as of April 2011: 235
  • How many terabytes of data collected by Walmart per hour as of 2012: 2,560, or 2.5 petabytes*
  • Projected growth in global data generated per year, as of 2011: 40%
  • Number of IT jobs created globally by 2015 to support big data: 4.4 million (1.9 million in the U.S.)
  • Potential shortage of data scientists in the U.S. alone predicted for 2018: 140,000-190,000, in addition to 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions
  • Time needed to sequence the complete human genome (analyzing 3 billion base pairs) in 2003: ten years
  • Time needed in 2013: one week
  • The world’s annual effective capacity to exchange information through telecommunication networks in 1986, 2007, and (predicted) 2013: 281 petabytes, 65 exabytes, 667 exabytes
  • Projected amount of digital information created annually that will either live in or pass through the cloud: 1/3
  • Increase in data collection volume year-over-year in 2012: 400%
  • Increase in number of individual data collectors from 2011 to 2012: nearly double (over 300 data collection parties in 2012)

*1 zettabyte = 1 billion terabytes | 1 petabyte = 1,000 terabytes | 1 terabyte = 1,000 gigabytes | 1 gigabyte = 1 billion bytes
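
As a quick cross-check of the units used in this index, the sketch below applies the conversion factors from the footnote to a few of the figures above. The constant and variable names are illustrative; the 1,024 divisor in the Walmart line is an assumption about how 2,560 terabytes was rounded to 2.5 petabytes, since the footnote's factor of 1,000 gives 2.56.

```python
# Decimal unit definitions taken from the footnote and the exabyte entry.
GIGABYTE = 10**9                 # 1 billion bytes
TERABYTE = 1_000 * GIGABYTE
PETABYTE = 1_000 * TERABYTE
EXABYTE = 10**9 * GIGABYTE       # 1 billion gigabytes
ZETTABYTE = 10**9 * TERABYTE     # 1 billion terabytes

# 2.5 exabytes per day, expressed in bytes (integer arithmetic avoids rounding).
daily_bytes = 25 * EXABYTE // 10
print(daily_bytes == 25 * 10**17)   # True: 2.5 quintillion bytes

# 2.7 zettabytes expressed in terabytes.
universe_tb = 27 * ZETTABYTE // 10 // TERABYTE
print(universe_tb)                  # 2700000000

# Walmart's 2,560 terabytes per hour under both conventions.
walmart_tb = 2_560
decimal_pb = walmart_tb * TERABYTE / PETABYTE
binary_pb = walmart_tb / 1024       # assumption: 1 petabyte = 1,024 terabytes
print(decimal_pb, binary_pb)        # 2.56 2.5
```

The two daily figures in the list (2.5 exabytes and 2.5 quintillion bytes) therefore describe the same quantity.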

The Logic of Connective Action: Digital Media and the Personalization of Contentious Politics


New book by W. Lance Bennett and Alexandra Segerberg: “The Logic of Connective Action explains the rise of a personalized digitally networked politics in which diverse individuals address the common problems of our times such as economic fairness and climate change. Rich case studies from the United States, United Kingdom, and Germany illustrate a theoretical framework for understanding how large-scale connective action is coordinated using inclusive discourses such as “We Are the 99%” that travel easily through social media. In many of these mobilizations, communication operates as an organizational process that may replace or supplement familiar forms of collective action based on organizational resource mobilization, leadership, and collective action framing. In some cases, connective action emerges from crowds that shun leaders, as when Occupy protesters created media networks to channel resources and create loose ties among dispersed physical groups. In other cases, conventional political organizations deploy personalized communication logics to enable large-scale engagement with a variety of political causes. The Logic of Connective Action shows how power is organized in communication-based networks, and what political outcomes may result.”

Is Connectivity A Human Right?


Mark Zuckerberg (Facebook): For almost ten years, Facebook has been on a mission to make the world more open and connected. Today we connect more than 1.15 billion people each month, but as we started thinking about connecting the next 5 billion, we realized something important: the vast majority of people in the world don’t have access to the internet.
Today, only 2.7 billion people are online, a little more than one third of the world. That number is growing by less than 9% each year, which is slow considering how early we are in the internet’s development. Even though projections show most people will get smartphones in the next decade, most people still won’t have data access because the cost of data remains much more expensive than the price of a smartphone.
Below, I’ll share a rough proposal for how we can connect the next 5 billion people, and a rough plan to work together as an industry to get there. We’ll discuss how we can make internet access more affordable by making it more efficient to deliver data, how we can use less data by improving the efficiency of the apps we build and how we can help businesses drive internet access by developing a new model to get people online.
I call this a “rough plan” because, like many long term technology projects, we expect the details to evolve. It may be possible to achieve more than we lay out here, but it may also be more challenging than we predict. The specific technical work will evolve as people contribute better ideas, and we welcome all feedback on how to improve this.
Connecting the world is one of the greatest challenges of our generation. This is just one small step toward achieving that goal. I’m excited to work together to make this a reality.