James Surowiecki in the New Yorker: “If Reddit were looking for a model to follow, it could use NASA’s Clickworkers experiment, which in 2000-01 let tens of thousands of amateurs look at photos of Mars in order to identify craters on the planet and classify them by age. That study found that the aggregated judgments of the amateur “clickworkers” were “virtually indistinguishable from the inputs of a geologist with years of experience.”
The problem from Reddit’s perspective, of course, is that this method of sleuthing would be far less exciting for users, and would probably generate less traffic, than its current free-for-all approach. The point of the “find-the-bombers” subthread, after all, wasn’t just to find the bombers—it was also to connect and talk with others, and to feel like you were part of a virtual community. But valuable as that experience may have been for users, it also diminished the chances of the community coming up with useful information. Reddit has done an excellent job of being engaging. Now it needs to figure out if it wants to be effective”.
Open Data Research Announced
WWW Foundation Press Release: “Speaking at an Open Government Partnership reception last night in London, Sir Tim Berners-Lee, founder of the World Wide Web Foundation (Web Foundation) and inventor of the Web, unveiled details of the first ever in-depth study into how the power of open data could be harnessed to tackle social challenges in the developing world. The 14-country study is funded by Canada’s International Development Research Centre (IDRC) and will be overseen by the Web Foundation’s world-leading open data experts. An interim progress update will be made at an October 2013 meeting of the Open Government Partnership, with in-depth results expected in 2014…
Sir Tim Berners-Lee, founder of the World Wide Web Foundation and inventor of the Web said:
“Open Data, accessed via a free and open Web, has the potential to create a better world. However, best practice in London or New York is not necessarily best practice in Lima or Nairobi. The Web Foundation’s research will help to ensure that Open Data initiatives in the developing world will unlock real improvements in citizens’ day-to-day lives.”
José M. Alonso, program manager at the World Wide Web Foundation, added:
“Through this study, the Web Foundation hopes not only to contribute to global understanding of open data, but also to cultivate the ability of developing world researchers and development workers to understand and apply open data for themselves.”
Further details on the project, including case study outlines, are available here: http://oddc.opendataresearch.org/
From Open Data to Information Justice
The Social Affordances of the Internet for Networked Individualism
Paper by NetLab (University of Toronto) scholars in the latest issue of the Journal of Computer-Mediated Communication: “We review the evidence from a number of surveys in which our NetLab has been involved about the extent to which the Internet is transforming or enhancing community. The studies show that the Internet is used for connectivity locally as well as globally, although the nature of its use varies in different countries. Internet use is adding on to other forms of communication, rather than replacing them. Internet use is reinforcing the pre-existing turn to societies in the developed world that are organized around networked individualism rather than group or local solidarities. The result has important implications for civic involvement.”
The Dangers of Surveillance
Paper by Neil M. Richards in Harvard Law Review. Abstract: “From the Fourth Amendment to George Orwell’s Nineteen Eighty-Four, our culture is full of warnings about state scrutiny of our lives. These warnings are commonplace, but they are rarely very specific. Other than the vague threat of an Orwellian dystopia, as a society we don’t really know why surveillance is bad, and why we should be wary of it. To the extent the answer has something to do with “privacy,” we lack an understanding of what “privacy” means in this context, and why it matters. Developments in government and corporate practices have made this problem more urgent. Although we have laws that protect us against government surveillance, secret government programs cannot be challenged until they are discovered.
… I propose a set of four principles that should guide the future development of surveillance law, allowing for a more appropriate balance between the costs and benefits of government surveillance. First, we must recognize that surveillance transcends the public-private divide. Even if we are ultimately more concerned with government surveillance, any solution must grapple with the complex relationships between government and corporate watchers. Second, we must recognize that secret surveillance is illegitimate, and prohibit the creation of any domestic surveillance programs whose existence is secret. Third, we should recognize that total surveillance is illegitimate and reject the idea that it is acceptable for the government to record all Internet activity without authorization. Fourth, we must recognize that surveillance is harmful. Surveillance menaces intellectual privacy and increases the risk of blackmail, coercion, and discrimination; accordingly, we must recognize surveillance as a harm in constitutional standing doctrine.”
How to Clean Up Social News
David Talbot in MIT Technology Review: “New platforms for fact-checking and reputation scoring aim to better channel social media’s power in the wake of a disaster…Researchers from the Masdar Institute of Technology and the Qatar Computer Research Institute plan to launch Verily, a platform that aims to verify social media information, in a beta version this summer. Verily aims to enlist people in collecting and analyzing evidence to confirm or debunk reports. As an incentive, it will award reputation points—or dings—to its contributors.
Verily will join services like Storyful that use various manual and technical means to fact-check viral information, and apps such as Swift River that, among other things, let people set up filters on social media to provide more weight to trusted users in the torrent of posts following major events…Reputation scoring has worked well for e-commerce sites like eBay and Amazon and could help to clean up social media reports in some situations.”
The Rise of Big Data
Kenneth Neil Cukier and Viktor Mayer-Schoenberger in Foreign Affairs: “Everyone knows that the Internet has changed how businesses operate, governments function, and people live. But a new, less visible technological trend is just as transformative: “big data.” Big data starts with the fact that there is a lot more information floating around these days than ever before, and it is being put to extraordinary new uses. Big data is distinct from the Internet, although the Web makes it much easier to collect and share data. Big data is about more than just communication: the idea is that we can learn from a large body of information things that we could not comprehend when we used only smaller amounts.”
Gideon Rose, editor of Foreign Affairs, sits down with Kenneth Cukier, data editor of The Economist (video):
Investigating Terror in the Age of Twitter
Michael Chertoff and Dallas Lawrence in WSJ: “A dozen years ago when the terrorists struck on 9/11, there was no Facebook or Twitter or i-anything on the market. Cellphones were relatively common, but when cell networks collapsed in 2001, many people were left disconnected and wanting for immediate answers. Last week in Boston, when mobile networks became overloaded following the bombings, the social-media-savvy Boston Police Department turned to Twitter, using the platform as a makeshift newsroom to alert media and concerned citizens to breaking news.
Law-enforcement agencies around the world will note how social media played a prominent role both in telling the story and writing its eventual conclusion. Some key lessons have emerged.”
Knowing Where to Focus the Wisdom of Crowds
Nick Bilton in NYT: “It looks as if the theory of the “wisdom of crowds” doesn’t apply to terrorist manhunts. Last week after the Boston Marathon bombings, the Internet quickly offered to help find the people responsible. In a scene metaphorically reminiscent of a movie in which vigilantes swarm the streets with pitchforks and lanterns, people took to Reddit, the popular community and social news Web site, and started scouring images posted online from the bombings.
One Reddit forum told users to search for “people carrying black bags,” and noted that “if they look suspicious, then post them. Then people will try and follow their movements using all the images.” In the process, each time a scrap of information was discovered — the color of a hat, the type of straps on a backpack, the weighted droop of a bag — it was passed out on Twitter like “Wanted” posters tacked to lampposts. It didn’t matter whether it was right, wrong or even completely made up (some images posted to forums had been manipulated) — off it went, fiction and fact indistinguishable. Some misinformation online landed on the front page of The New York Post, incorrectly identifying an innocent high school student as a suspect. Later in the week, the Web wrongly identified one of the suspects as a student from Brown University who went missing earlier this month…
Perhaps the scariest aspect of these crowd-like investigations is that when information is incorrect, no one is held responsible.
As my colleague David Carr noted in his column this week, “even good reporters with good sources can end up with stories that go bad.” But the difference between CNN, The Associated Press or The New York Post getting it wrong, is that those names are held accountable when they publish incorrect news. No one is going to remember, or punish, the users on Reddit or Twitter who incorrectly identify random high school runners and missing college students as terrorists.”
Crowd diagnosis could spot rare diseases doctors miss
New Scientist: “Diagnosing rare illnesses could get easier, thanks to new web-based tools that pool information from a wide variety of sources…CrowdMed, launched on 16 April at the TedMed conference in Washington DC, uses crowds to solve tough medical cases.
Anyone can join CrowdMed and analyse cases, regardless of their background or training. Participants are given points that they can then use to bet on the correct diagnosis from lists of suggestions. This creates a prediction market, with diagnoses falling and rising in value based on their popularity, like stocks in a stock market. Algorithms then calculate the probability that each diagnosis will be correct. In 20 initial test cases, around 700 participants identified each of the mystery diseases as one of their top three suggestions….
Frustrated patients and doctors can also turn to FindZebra, a recently launched search engine for rare diseases. It lets users search an index of rare disease databases looked after by a team of researchers. In initial trials, FindZebra returned more helpful results than Google on searches within this same dataset.”
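The prediction-market mechanism described in the New Scientist excerpt can be sketched in a few lines. This is only an illustration, not CrowdMed's actual algorithm, which the article does not specify. It assumes the simplest possible scoring rule: the probability assigned to a diagnosis is its share of all points wagered. The function names, participants, and bets below are hypothetical.

```python
from collections import defaultdict


def implied_probabilities(bets):
    """Turn bets into probabilities, one per diagnosis.

    `bets` is an iterable of (participant, diagnosis, points) tuples.
    Assumption: a diagnosis's probability is its share of all points
    wagered. This is a toy rule, not CrowdMed's real scoring method.
    """
    totals = defaultdict(float)
    for _participant, diagnosis, points in bets:
        if points <= 0:
            raise ValueError("points must be positive")
        totals[diagnosis] += points
    pool = sum(totals.values())
    if pool == 0:
        return {}
    return {diagnosis: total / pool for diagnosis, total in totals.items()}


def top_suggestions(bets, k=3):
    """Return the k diagnoses with the highest implied probability."""
    probs = implied_probabilities(bets)
    return sorted(probs, key=probs.get, reverse=True)[:k]


# Hypothetical bets on a single mystery case.
bets = [
    ("alice", "lupus", 40),
    ("bob", "lyme disease", 30),
    ("carol", "lupus", 20),
    ("dave", "sarcoidosis", 10),
]

probs = implied_probabilities(bets)
ranked = top_suggestions(bets, k=2)
```

Here the points wagered play the role of a stock price: the more points a diagnosis attracts, the higher its implied probability. A real market would also need to reward participants whose bets turn out to be correct, which this sketch leaves out.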