Edited book by William H. Dutton (Routledge, 2014, 1,888 pages): “It is commonplace to observe that the Internet—and the dizzying technologies and applications which it continues to spawn—has revolutionized human communications. But, while the medium’s impact has apparently been immense, the nature of its political implications remains highly contested. To give but a few examples, the impact of networked individuals and institutions has prompted serious scholarly debates in political science and related disciplines on: the evolution of ‘e-government’ and ‘e-politics’ (especially after recent US presidential campaigns); electronic voting and other citizen participation; activism; privacy and surveillance; and the regulation and governance of cyberspace.
As research in and around politics and the Internet flourishes as never before, this new four-volume collection from Routledge’s acclaimed Critical Concepts in Political Science series meets the need for an authoritative reference work to make sense of a rapidly growing—and ever more complex—corpus of literature. Edited by William H. Dutton, Director of the Oxford Internet Institute (OII), the collection gathers foundational and canonical work, together with innovative and cutting-edge applications and interventions.
With a full index and comprehensive bibliographies, together with a new introduction by the editor, which places the collected material in its historical and intellectual context, Politics and the Internet is an essential work of reference. The collection will be particularly useful as a database allowing scattered and often fugitive material to be easily located. It will also be welcomed as a crucial tool permitting rapid access to less familiar—and sometimes overlooked—texts. For researchers, students, practitioners, and policy-makers, it is a vital one-stop research and pedagogic resource.”
Eight (No, Nine!) Problems With Big Data
Gary Marcus and Ernest Davis in the New York Times: “BIG data is suddenly everywhere. Everyone seems to be collecting it, analyzing it, making money from it and celebrating (or fearing) its powers. Whether we’re talking about analyzing zillions of Google search queries to predict flu outbreaks, or zillions of phone records to detect signs of terrorist activity, or zillions of airline stats to find the best time to buy plane tickets, big data is on the case. By combining the power of modern computing with the plentiful data of the digital era, it promises to solve virtually any problem — crime, public health, the evolution of grammar, the perils of dating — just by crunching the numbers.
Or so its champions allege. “In the next two decades,” the journalist Patrick Tucker writes in the latest big data manifesto, “The Naked Future,” “we will be able to predict huge areas of the future with far greater accuracy than ever before in human history, including events long thought to be beyond the realm of human inference.” Statistical correlations have never sounded so good.
Is big data really all it’s cracked up to be? There is no doubt that big data is a valuable tool that has already had a critical impact in certain areas. For instance, almost every successful artificial intelligence computer program in the last 20 years, from Google’s search engine to the I.B.M. “Jeopardy!” champion Watson, has involved the substantial crunching of large bodies of data. But precisely because of its newfound popularity and growing use, we need to be levelheaded about what big data can — and can’t — do.
The first thing to note is that although big data is very good at detecting correlations, especially subtle correlations that an analysis of smaller data sets might miss, it never tells us which correlations are meaningful. A big data analysis might reveal, for instance, that from 2006 to 2011 the United States murder rate was well correlated with the market share of Internet Explorer: Both went down sharply. But it’s hard to imagine there is any causal relationship between the two. Likewise, from 1998 to 2007 the number of new cases of autism diagnosed was extremely well correlated with sales of organic food (both went up sharply), but identifying the correlation won’t by itself tell us whether diet has anything to do with autism.
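To make that point concrete, here is a minimal Python sketch with made-up numbers (not the actual murder-rate or Internet Explorer figures): two series that merely trend in the same direction show a strong correlation despite having no causal link.

```python
# Illustrative sketch with hypothetical numbers: two series that both
# trend downward over the same years produce a strong positive
# correlation even though neither plausibly causes the other.
import numpy as np

murder_rate = np.array([5.8, 5.7, 5.4, 5.0, 4.8, 4.7])     # hypothetical, per 100,000
ie_share    = np.array([68.0, 62.0, 54.0, 46.0, 40.0, 36.0])  # hypothetical, percent

r = np.corrcoef(murder_rate, ie_share)[0, 1]
print(f"Pearson r = {r:.2f}")  # close to +1.0, yet plainly not causal
```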
Second, big data can work well as an adjunct to scientific inquiry but rarely succeeds as a wholesale replacement. Molecular biologists, for example, would very much like to be able to infer the three-dimensional structure of proteins from their underlying DNA sequence, and scientists working on the problem use big data as one tool among many. But no scientist thinks you can solve this problem by crunching data alone, no matter how powerful the statistical analysis; you will always need to start with an analysis that relies on an understanding of physics and biochemistry.
Third, many tools that are based on big data can be easily gamed. For example, big data programs for grading student essays often rely on measures like sentence length and word sophistication, which are found to correlate well with the scores given by human graders. But once students figure out how such a program works, they start writing long sentences and using obscure words, rather than learning how to actually formulate and write clear, coherent text. Even Google’s celebrated search engine, rightly seen as a big data success story, is not immune to “Google bombing” and “spamdexing,” wily techniques for artificially elevating website search placement.
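A rough illustration of why such systems are easy to game: a toy scorer (not any vendor's actual grading model) that rewards long sentences and uncommon words will rate a padded, verbose essay above a short, clear one.

```python
# A toy essay scorer that, like the systems described above, rewards
# sentence length and "sophisticated" vocabulary. Padding with verbiage
# raises the score, which is exactly how such tools get gamed.
COMMON = {"the", "a", "is", "and", "to", "of", "in", "it", "that", "was"}

def naive_score(essay: str) -> float:
    sentences = [s for s in essay.split(".") if s.strip()]
    words = essay.replace(".", " ").lower().split()
    avg_sentence_len = len(words) / max(len(sentences), 1)
    rare_ratio = sum(w not in COMMON for w in words) / max(len(words), 1)
    return avg_sentence_len * rare_ratio  # longer + fancier => higher score

clear = "The data is useful. It shows a trend."
padded = ("The aforementioned data, notwithstanding manifold perambulatory "
          "digressions, demonstrates an ostensibly salient trend.")
print(naive_score(clear), naive_score(padded))  # the padded essay wins
```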
Fourth, even when the results of a big data analysis aren’t intentionally gamed, they often turn out to be less robust than they initially seem. Consider Google Flu Trends, once the poster child for big data. In 2009, Google reported — to considerable fanfare — that by analyzing flu-related search queries, it had been able to detect the spread of the flu as accurately as, and more quickly than, the Centers for Disease Control and Prevention. A few years later, though, Google Flu Trends began to falter; for the last two years it has made more bad predictions than good ones.
As a recent article in the journal Science explained, one major contributing cause of the failures of Google Flu Trends may have been that the Google search engine itself constantly changes, such that patterns in data collected at one time do not necessarily apply to data collected at another time. As the statistician Kaiser Fung has noted, collections of big data that rely on web hits often merge data that was collected in different ways and with different purposes — sometimes to ill effect. It can be risky to draw conclusions from data sets of this kind.
A fifth concern might be called the echo-chamber effect, which also stems from the fact that much of big data comes from the web. Whenever the source of information for a big data analysis is itself a product of big data, opportunities for vicious cycles abound. Consider translation programs like Google Translate, which draw on many pairs of parallel texts from different languages — for example, the same Wikipedia entry in two different languages — to discern the patterns of translation between those languages. This is a perfectly reasonable strategy, except for the fact that with some of the less common languages, many of the Wikipedia articles themselves may have been written using Google Translate. In those cases, any initial errors in Google Translate infect Wikipedia, which is fed back into Google Translate, reinforcing the error.
A sixth worry is the risk of too many correlations. If you look 100 times for correlations between two variables, you risk finding, purely by chance, about five bogus correlations that appear statistically significant — even though there is no actual meaningful connection between the variables. Absent careful supervision, the sheer scale of big data can greatly amplify such errors.
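A quick simulation makes the arithmetic plain: testing 100 pairs of completely independent random variables at the conventional p < 0.05 threshold typically yields around five “significant” correlations by chance alone.

```python
# Simulate the multiple-comparisons problem: 100 correlation tests on
# pairs of independent random variables, counting how many clear the
# p < 0.05 bar purely by chance.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
false_positives = 0
for _ in range(100):
    x = rng.normal(size=50)
    y = rng.normal(size=50)      # independent of x by construction
    _, p = pearsonr(x, y)
    if p < 0.05:
        false_positives += 1

print(false_positives, "spurious 'significant' correlations out of 100")
# Typically around 5, matching the 5% false-positive rate of the test.
```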
Seventh, big data is prone to giving scientific-sounding solutions to hopelessly imprecise questions. In the past few months, for instance, there have been two separate attempts to rank people in terms of their “historical importance” or “cultural contributions,” based on data drawn from Wikipedia. One is the book “Who’s Bigger? Where Historical Figures Really Rank,” by the computer scientist Steven Skiena and the engineer Charles Ward. The other is an M.I.T. Media Lab project called Pantheon.
Both efforts get many things right — Jesus, Lincoln and Shakespeare were surely important people — but both also make some egregious errors. “Who’s Bigger?” claims that Francis Scott Key was the 19th most important poet in history; Pantheon has claimed that Nostradamus was the 20th most important writer in history, well ahead of Jane Austen (78th) and George Eliot (380th). Worse, both projects suggest a misleading degree of scientific precision with evaluations that are inherently vague, or even meaningless. Big data can reduce anything to a single number, but you shouldn’t be fooled by the appearance of exactitude.
FINALLY, big data is at its best when analyzing things that are extremely common, but often falls short when analyzing things that are less common. For instance, programs that use big data to deal with text, such as search engines and translation programs, often rely heavily on something called trigrams: sequences of three words in a row (like “in a row”). Reliable statistical information can be compiled about common trigrams, precisely because they appear frequently. But no existing body of data will ever be large enough to include all the trigrams that people might use, because of the continuing inventiveness of language.
To select an example more or less at random, a book review that the actor Rob Lowe recently wrote for this newspaper contained nine trigrams such as “dumbed-down escapist fare” that had never before appeared anywhere in all the petabytes of text indexed by Google. To witness the limitations that big data can have with novelty, Google-translate “dumbed-down escapist fare” into German and then back into English: out comes the incoherent “scaled-flight fare.” That is a long way from what Mr. Lowe intended — and from big data’s aspirations for translation.
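For readers unfamiliar with the term, a trigram is simply a sliding three-word window over text. A minimal sketch of extracting and counting them shows why common sequences accumulate reliable statistics while a novel phrase may never appear in any corpus at all.

```python
# Extract and count trigrams (three-word sequences) from text.
# Frequent trigrams pile up quickly; novel phrases like
# "dumbed-down escapist fare" may never occur in even a huge corpus.
from collections import Counter

def trigrams(text: str):
    words = text.lower().split()
    return [" ".join(words[i:i + 3]) for i in range(len(words) - 2)]

sample = "three words in a row like in a row appear in a row counts"
counts = Counter(trigrams(sample))
print(counts.most_common(3))                   # "in a row" dominates this toy corpus
print(trigrams("dumbed-down escapist fare"))   # a trigram unlikely to exist in any index
```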
Wait, we almost forgot one last problem: the hype….
Effective metrics for measurement and target setting in online citizen engagement
“In building the latest version of the EngagementHQ software, we not only thought about new tools and ways to engage the community; we also watched how our clients had been using the reports, and set ourselves to thinking about how we could build a set of metrics for target setting and the measurement of results that will remain relevant as we add more and more functionality to EngagementHQ.
Things have changed a lot since we designed our old reports. You can now get information from your community using forums, guestbooks, a story tool, interactive mapping, surveys, quick polls, submission forms, a news feed with discussions, or the QandA tool. You can provide information to the community not just through the library, dates, photos and FAQs, but also using videos, link boxes and embedded content from all over the web.
Our old reports could tell you that 600 people had viewed the documents and that 70 people had read the FAQs, but they could not tell you whether these were the same people, so you didn’t really know how many people had accessed information through your site. Generally we used those who had viewed documents in the library as a proxy, but as time goes on our more engaged clients are communicating less and less through documents and more through other channels.
Similarly, whilst registrations were once a good proxy for engagement (why else would you sign up?), they have failed to keep pace with the technology. All of our tools can now be configured to require sign-up or to be exempt from it, so the proxy no longer holds. Moreover, many of our clients bulk-load groups into the database, which inflates the registration numbers.
What we came up with was a simple solution. We would calculate Aware, Informed and Engaged cohorts in the reports.
Aware – a measure of the number of people who have visited your project;
Informed – a measure of the visitors who have clicked to access further information resources, to learn more;
Engaged – a measure of the number of people who have given you feedback using any of the means available on the site.”
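As a rough sketch of how such cohorts might be computed from raw activity logs (the event names and log format below are hypothetical, not EngagementHQ's actual data model), counting distinct visitors in nested sets avoids the double-counting problem described above.

```python
# Hypothetical sketch of Aware / Informed / Engaged cohorts as sets of
# distinct visitors. Event names and the log format are illustrative
# assumptions, not EngagementHQ's actual schema.
INFORM_EVENTS = {"view_document", "view_faq", "watch_video", "open_link"}
ENGAGE_EVENTS = {"forum_post", "survey_response", "quick_poll", "map_pin", "submission"}

def cohorts(events):
    """events: iterable of (visitor_id, event_type) tuples."""
    aware, informed, engaged = set(), set(), set()
    for visitor, event in events:
        aware.add(visitor)                 # any visit counts as Aware
        if event in INFORM_EVENTS:
            informed.add(visitor)          # clicked through to learn more
        if event in ENGAGE_EVENTS:
            engaged.add(visitor)           # left feedback of some kind
    return len(aware), len(informed), len(engaged)

log = [("u1", "visit"), ("u1", "view_document"), ("u2", "view_faq"),
       ("u2", "survey_response"), ("u3", "visit")]
print(cohorts(log))  # (3, 2, 1): distinct people, so no double counting
```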
Quizz: Targeted Crowdsourcing with a Billion (Potential) Users
Using Social Media to Measure Labor Market Flows
Paper by Dolan Antenucci, Michael Cafarella, Margaret C. Levenstein, Christopher Ré, and Matthew D. Shapiro: “Social media enable promising new approaches to measuring economic activity and analyzing economic behavior at high frequency and in real time using information independent from standard survey and administrative sources. This paper uses data from Twitter to create indexes of job loss, job search, and job posting. Signals are derived by counting job-related phrases in Tweets such as “lost my job.” The social media indexes are constructed from the principal components of these signals. The University of Michigan Social Media Job Loss Index tracks initial claims for unemployment insurance at medium and high frequencies and predicts 15 to 20 percent of the variance of the prediction error of the consensus forecast for initial claims. The social media indexes provide real-time indicators of events such as Hurricane Sandy and the 2013 government shutdown. Comparing the job loss index with the search and posting indexes indicates that the Beveridge Curve has been shifting inward since 2011.
The University of Michigan Social Media Job Loss Index is updated weekly and is available at http://econprediction.eecs.umich.edu/.”
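The construction described in the abstract (count job-related phrases to form weekly signals, then extract a principal component as the composite index) can be sketched roughly as follows; the phrase list and data here are illustrative, not the authors' actual code or phrase set.

```python
# Rough sketch of the approach described in the abstract: count
# job-related phrases in tweets to form weekly signals, then take the
# first principal component as a composite index. Phrases and sample
# tweets are made up for illustration.
import numpy as np

PHRASES = ["lost my job", "laid off", "looking for a job", "job opening"]

def weekly_signals(tweets_by_week):
    """tweets_by_week: list of lists of tweet strings, one list per week."""
    rows = []
    for tweets in tweets_by_week:
        text = " ".join(t.lower() for t in tweets)
        rows.append([text.count(p) for p in PHRASES])
    return np.array(rows, dtype=float)

def first_principal_component(signals):
    centered = signals - signals.mean(axis=0)
    # SVD of the centered signal matrix: the first right singular vector
    # gives the phrase loadings; projecting onto it gives the index.
    # (The sign of a principal component is arbitrary.)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[0]

weeks = [["just lost my job today", "laid off again"],
         ["new job opening at the plant"],
         ["lost my job", "lost my job", "looking for a job"]]
print(first_principal_component(weekly_signals(weeks)))  # one index value per week
```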
Smart cities are here today — and getting smarter
Computer World: “Smart cities aren’t a science fiction, far-off-in-the-future concept. They’re here today, with municipal governments already using technologies that include wireless networks, big data/analytics, mobile applications, Web portals, social media, sensors/tracking products and other tools.
These smart city efforts have lofty goals: Enhancing the quality of life for citizens, improving government processes and reducing energy consumption, among others. Indeed, cities are already seeing some tangible benefits.
But creating a smart city comes with daunting challenges, including the need to provide effective data security and privacy, and to ensure that myriad departments work in harmony.
What makes a city smart? As with any buzz term, the definition varies. But in general, it refers to using information and communications technologies to deliver sustainable economic development and a higher quality of life, while engaging citizens and effectively managing natural resources.
Making cities smarter will become increasingly important. For the first time ever, the majority of the world’s population resides in a city, and this proportion continues to grow, according to the World Health Organization, the coordinating authority for health within the United Nations.
A hundred years ago, two out of every 10 people lived in an urban area, the organization says. As recently as 1990, less than 40% of the global population lived in a city — but by 2010 more than half of all people lived in an urban area. By 2050, the proportion of city dwellers is expected to rise to 70%.
As many city populations continue to grow, here’s what five U.S. cities are doing to help manage it all:
Scottsdale, Ariz.
The city of Scottsdale, Ariz., has several initiatives underway.
One is MyScottsdale, a mobile application the city deployed in the summer of 2013 that allows citizens to report cracked sidewalks, broken street lights and traffic lights, road and sewer issues, graffiti and other problems in the community….”
Clinical Nuance: Benefit Design Meets Behavioral Economics
Kavita Patel, Elizabeth Cliff, and Mark Fendrick in Health Affairs: “On Capitol Hill, there’s a growing chorus of support from both sides of the aisle to move the focus of health care payment incentives from volume to value. Earlier this month, legislators introduced proposals that would have fixed the sustainable growth rate in Medicare, as well as made other changes, including allowing for clinical nuance in Medicare benefit designs. The Centers for Medicare and Medicaid Services, too, is embracing this trend, recently asking for partners in a demonstration project to use value-based arrangements in benefit design. These efforts of policymakers and agencies to innovate Medicare’s benefit design are crucial both for the health of seniors and to ensure value in the Medicare program.
The concept of clinical nuance, implemented using value-based insurance design (V-BID), is a key innovation already widely implemented by private and public payers. It recognizes two important facts about the provision of medical care: 1) medical services differ in the amount of health produced, and 2) the clinical benefit derived from a medical service depends on who is using it, who is delivering the service, and where it is being delivered.
Today’s Medicare beneficiaries face little clinical nuance in their benefit structure. Medicare largely uses a “one-size-fits-all” structure that does not recognize that some treatments, drugs or tests are more important to health than others. Not only does it create inefficiencies in the health system, it can actually harm the health of beneficiaries.
Some discussion of economics explains why. The concept of moral hazard, which posits that individuals over-consume when they are not on the hook for the cost of their behavior, is well established in health care. It is used to explain why those who are insured use more care than they might need to remain optimally healthy. But, that’s only half the story. There are lots of beneficial medications or services that we wish people would use—and for some treatment adherence is low. So what gives?
Recently, three economists — Katherine Baicker, Sendhil Mullainathan and Joshua Schwartzstein — coined the term “behavioral hazard.” They use it to refer to suboptimal choices that people make based on their own behavioral biases. For example, a diabetic patient might feel fine and choose to forgo regular eye exams, only to have their disease progress. Here, higher levels of cost sharing worsen the problem. A beneficiary who faces both financial and behavioral obstacles to treatment adherence is less likely to behave in a way that ensures optimal health.
That brings us back to the concept of clinical nuance. Clinically nuanced insurance designs recognize both moral and behavioral hazard, and seek to shape incentives to minimize their impact. When patients’ incentives are aligned with evidence-based medicine, it improves outcomes, helps patients and, in some clinical situations, lowers costs.”
Crowdsourcing “Monopoly”
The Economist: “In 1904 a young American named Elizabeth Magie received a patent for a board game in which players used tokens to move around a four-sided board buying properties, avoiding taxes and jail, and collecting $100 every time they passed the board’s starting-point. Three decades later Charles Darrow, a struggling salesman in Pennsylvania, patented a tweaked version of the game as “Monopoly”. Now owned by Hasbro, a big toymaker, it has become one of the world’s most popular board games, available in dozens of languages and innumerable variations.
Magie was a devotee of Henry George, an economist who believed in common ownership of land; her game was designed to be a “practical demonstration of the present system of land-grabbing with all its usual outcomes and consequences.” And so it has become, though players snatch properties more in zeal than sadness. In “Monopoly” as in life, it is better to be rich than poor, children gleefully bankrupt their parents and nobody uses a flat iron any more.
Board-game makers have had to find their footing in a digital age. Hasbro’s game-and-puzzle sales fell by 4% in 2010—the year the iPad came to market—and 10% in 2011. Since then, however, its game-and-puzzle sales have rebounded, rising by 2% in 2012 and 10% in 2013. Stephanie Wissink, a youth-market analyst with Piper Jaffray, an investment bank, says that Hasbro has learned to become “co-creative…They’re infusing more social-generated content into their marketing and product development.”
Some of that content comes from Facebook. Last year, “Monopoly” fans voted on Hasbro’s Facebook page to jettison the poor old flat iron in favour of a new cat token. “Scrabble” players are voting on which word to add to the new dictionary (at press time, 16 remain, including “booyah”, “adorbs” and “cosplay”). “Monopoly” fans, meanwhile, are voting on which of ten house rules—among them collecting $400 rather than $200 for landing on “Go”, requiring players to make a full circuit of the board before buying property and “Mom always gets out of jail free. Always. No questions asked”—to make official…”
Facebook’s Connectivity Lab will develop advanced technology to provide internet across the world
Signe Brewster and Lauren Hockenson at GigaOm: “The Internet.org initiative will rely on a new team at Facebook called the Connectivity Lab, based at the company’s Menlo Park campus, to develop technology on the ground, in the air and in space, CEO Mark Zuckerberg announced Thursday. The team will develop technology like drones and satellites to expand access to the internet across the world.
“The team’s approach is based on the principle that different sized communities need different solutions and they are already working on new delivery platforms—including planes and satellites—to provide connectivity for communities with different population densities,” a post on Internet.org says.
Internet.org, which is backed by companies like Facebook, Samsung and Qualcomm, wants to provide internet to the two thirds of the world that remains disconnected due to cost, lack of infrastructure or remoteness. While many companies are developing business models and partnerships in areas that lack internet, the Connectivity Lab will focus on sustainable technology that will transmit the signals. Facebook envisions using drones that could fly for months to connect suburban areas, while more rural areas would rely on satellites. Both would use infrared lasers to blanket whole areas with connectivity.
Members of the Connectivity Lab have backgrounds at NASA’s Jet Propulsion Laboratory, NASA’s Ames Research Center and the National Optical Astronomy Observatory. Facebook also confirmed today that it acquired five employees from Ascenta, a U.K.-based company that worked on the Zephyr, a solar-powered drone capable of flying for two weeks straight.
The lab’s work will build on work the company has already done in the Philippines and Paraguay, Zuckerberg said in a Facebook post. And, like the company’s Open Compute project, there is a possibility that the lab will seek partnerships with outside countries once the bulk of the technology has been developed.”
How Cities Can Be Designed to Help—or Hinder—Sharing
Jay Walljasper in Yes!: “Centuries before someone first uttered the words “sharing economy,” the steady rise of cities embodied both the principles and promise of that phrase. The reason more than half the people on earth now live in urban areas is the advantages that come from sharing resources, infrastructure, and lives with other people. Essential commons belonging to all of us, ranging from transportation systems to public health safeguards to plentiful social connections, are easier to create and maintain in a populated area.
Think about typical urban dwellers. They are more likely to reside in an apartment building, shared household, or compact living unit (saving on heating, utilities, original construction costs, and other expenses), walk or take transit (saving the environment as well as money), know a wide range of people (expanding their circle of friends and colleagues), and encounter new experiences (increasing their knowledge and skills).
Access to these opportunities for sharing offers economic, social, environmental, and educational rewards. But living in a populated area does not automatically mean more sharing. Indeed, the classic suburban lifestyle—a big, single-family house and a big yard isolated from everything else and reachable only by automobile—makes sharing extremely difficult….
“The suburbs were designed as a landscape to maximize consumption,” Fisher explains. “It worked against sharing of any kind. People had all this stuff in their houses and garages, which was going unused most of the time.”
Autos replaced streetcars. Kids rode school buses instead of walking to school.
Everyone bought their own lawn mower, shovels, tools, sports equipment, and grills.
Even the proverbial cup of sugar borrowed from a neighbor disappeared in favor of the 10-pound bag bought at the supermarket.
As our spending grew, our need for social connections shrank. “Mass consumption was good for the economy, but bad for our well-being,” Fisher notes. He now sees changes ahead for our communities as the economy evolves.
“The new economy is all about innovation, which depends on maximizing interaction, not consumption.”
This means redesigning our communities to bring people together by giving everyone more opportunities to “walk, live close together, and share.”
This shift can already be seen in farmers markets, co-working spaces, tool libraries, bike sharing systems, co-ops, credit unions, public spaces, and other sharing projects everywhere.
“Creative people in cities around the world are rising up…” declares Neal Gorenflo, co-founder of Shareable magazine. “We are not protesting, and we are not asking for permission, and we are not waiting—we are building a people-powered economy right under everyone’s noses.”
Excited by this emerging grassroots movement, Shareable recently launched the Sharing Cities Network to be an independent resource “for sharing innovators to discover together how to create as many sharing cities around the world as fast as possible.”
The aim is to help empower and connect local initiatives around the world through online forums, peer learning, and other ways to boost collaboration, share best practices, and catalyze new projects.”