Max Read in New York Magazine: “My favorite story about the internet is the one about the anonymous Japanese guy who liberated Czechoslovakia. In 1989, as open dissent was spreading across the country, dissidents were attempting to coordinate efforts outside the watchful eye of Czechoslovak state security. The internet was a nascent technology, and the cops didn’t use it; modems were banned, and activists were able to use only those they could smuggle over the border, one at a time. Enter our Japanese guy. Bruce Sterling, who first told the story of the Japanese guy in a 1995 Wired article, says he talked to four different people who’d met the quiet stranger, but no one knew his name. What really mattered, anyway, is what he brought with him: “a valise full of brand-new and unmarked 2400-baud Taiwanese modems,” which he handed over to a group of engineering students in Prague before walking away. “The students,” Sterling would later write, “immediately used these red-hot 2400-baud scorcher modems to circulate manifestos, declarations of solidarity, rumors, and riot news.” Unrest expanded, the opposition grew, and within months, the Communist regime collapsed.
Is it true? Were free modems the catalyst for the Velvet Revolution? Probably not. But it’s a good story, the kind whose logic and lesson have become so widely understood — and so foundational to the worldview of Silicon Valley — as to make its truth irrelevant. Isn’t the best way to fortify the town square by giving more people access to it? And isn’t it nice to know, as one storied institution and industry after another falls to the internet’s disrupting sword, that everything will be okay in the end — that there might be some growing pains, but connecting billions of people to one another is both inevitable and good? Free speech will expand, democracy will flower, and we’ll all be rich enough to own MacBooks. The new princes of Silicon Valley will lead us into the rational, algorithmically enhanced, globally free future.
Or, they were going to, until earlier this month. The question we face now is: What happens when the industry destroyed is professional politics, the institutions leveled are the same few that prop up liberal democracy, and the values the internet disseminates are racism, nationalism, and demagoguery?
Powerful undemocratic states like China and Russia have for a while now put the internet to use to mislead the public, create the illusion of mass support, and either render opposition invisible or expose it to targeting…(More)”
James Bridle in the New Humanist: “In a 2008 article in Wired magazine entitled “The End of Theory”, Chris Anderson argued that the vast amounts of data now available to researchers made the traditional scientific process obsolete. No longer would they need to build models of the world and test them against sampled data. Instead, the complexities of huge and totalising datasets would be processed by immense computing clusters to produce truth itself: “With enough data, the numbers speak for themselves.” As an example, Anderson cited Google’s translation algorithms which, with no knowledge of the underlying structures of languages, were capable of inferring the relationship between them using extensive corpora of translated texts. He extended this approach to genomics, neurology and physics, where scientists are increasingly turning to massive computation to make sense of the volumes of information they have gathered about complex systems. In the age of big data, he argued, “Correlation is enough. We can stop looking for models.”
This belief in the power of data, of technology untrammelled by petty human worldviews, is the practical cousin of more metaphysical assertions. A belief in the unquestionability of data leads directly to a belief in the truth of data-derived assertions. And if data contains truth, then it will, without moral intervention, produce better outcomes. Speaking at Google’s private London Zeitgeist conference in 2013, Eric Schmidt, Google Chairman, asserted that “if they had had cellphones in Rwanda in 1994, the genocide would not have happened.” Schmidt’s claim was that technological visibility – the rendering of events and actions legible to everyone – would change the character of those actions. Not only is this statement historically inaccurate (there was plenty of evidence available of what was occurring during the genocide from UN officials, US satellite photographs and other sources), it’s also demonstrably untrue. Analysis of unrest in Kenya in 2007, when over 1,000 people were killed in ethnic conflicts, showed that mobile phones not only spread but accelerated the violence. But you don’t need to look to such extreme examples to see how a belief in technological determinism underlies much of our thinking and reasoning about the world.
“Big data” is not merely a business buzzword, but a way of seeing the world. Driven by technology, markets and politics, it has come to determine much of our thinking, but it is flawed and dangerous. It runs counter to our actual findings when we employ such technologies honestly and with the full understanding of their workings and capabilities. This over-reliance on data, which I call “quantified thinking”, has come to undermine our ability to reason meaningfully about the world, and its effects can be seen across multiple domains.
The assertion is hardly new. Writing in the Dialectic of Enlightenment in 1947, Theodor Adorno and Max Horkheimer decried “the present triumph of the factual mentality” – the predecessor to quantified thinking – and succinctly analysed the big data fallacy, set out by Anderson above. “It does not work by images or concepts, by the fortunate insights, but refers to method, the exploitation of others’ work, and capital … What men want to learn from nature is how to use it in order wholly to dominate it and other men. That is the only aim.” What is different in our own time is that we have built a world-spanning network of communication and computation to test this assertion. While it occasionally engenders entirely new forms of behaviour and interaction, the network most often shows to us with startling clarity the relationships and tendencies which have been latent or occluded until now. In the face of the increased standardisation of knowledge, it becomes harder and harder to argue against quantified thinking, because the advances of technology have been conjoined with the scientific method and social progress. But as I hope to show, technology ultimately reveals its limitations….
“Eroom’s law” – Moore’s law backwards – was recently formulated to describe a problem in pharmacology. Drug discovery has been getting more expensive. Since the 1950s the number of drugs approved for use in human patients per billion US dollars spent on research and development has halved every nine years. This problem has long perplexed researchers. According to the principles of technological growth, the trend should be in the opposite direction. In a 2012 paper in Nature entitled “Diagnosing the decline in pharmaceutical R&D efficiency” the authors propose and investigate several possible causes for this. They begin with social and physical influences, such as increased regulation, increased expectations and the exhaustion of easy targets (the “low hanging fruit” problem). Each of these is – with qualifications – disposed of, leaving open the question of the discovery process itself….(More)
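Taken at face value, “halved every nine years” is simple exponential decay. A minimal sketch — our illustration, not the Nature paper’s analysis, with a 1950 baseline assumed for convenience — makes the scale of the decline concrete:

```python
# A minimal sketch (not from the paper): drugs approved per billion
# (inflation-adjusted) R&D dollars, relative to an assumed 1950 baseline of 1.0.

def eroom_efficiency(year, baseline_year=1950, halving_period_years=9.0):
    """Relative R&D efficiency after (year - baseline_year) years of decline."""
    elapsed = year - baseline_year
    return 0.5 ** (elapsed / halving_period_years)

for year in (1950, 1970, 1990, 2010):
    print(f"{year}: {eroom_efficiency(year):.3f}x the 1950 efficiency")
# By 2010 the sketch gives roughly 0.01x -- about a hundredfold drop over six
# decades, which follows directly from the halving rate quoted above.
```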
Science Magazine: “For years, researchers at the MIT Media Lab have been developing a database of images captured at regular distances around several major cities. The images are scored according to different visual characteristics — how safe the depicted areas look, how affluent, how lively, and the like….Adjusted for factors such as population density and distance from city centers, the correlation between perceived safety and visitation rates was strong, but it was particularly strong for women and people over 50. The correlation was negative for people under 30, which means that males in their 20s were actually more likely to visit neighborhoods generally perceived to be unsafe than to visit neighborhoods perceived to be safe.
In the same paper, the researchers also identified several visual features that are highly correlated with judgments that a particular area is safe or unsafe. Consequently, the work could help guide city planners in decisions about how to revitalize declining neighborhoods….
Jacobs’ theory, Hidalgo says, is that neighborhoods in which residents can continuously keep track of street activity tend to be safer; a corollary is that buildings with street-facing windows tend to create a sense of safety, since they imply the possibility of surveillance. Newman’s theory is an elaboration on Jacobs’, suggesting that architectural features that demarcate public and private spaces, such as flights of stairs leading up to apartment entryways or archways separating plazas from the surrounding streets, foster the sense that crossing a threshold will bring on closer scrutiny….(More)”
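The “adjusted for factors such as population density and distance from city centers” step amounts to a multiple regression. A toy sketch with simulated data — not the Media Lab’s dataset or code; the variables and coefficients are invented — shows the shape of that analysis:

```python
# Toy version of an "adjusted correlation": relate visitation to perceived
# safety while controlling for density and distance to the city centre.
# All data here is simulated; column meanings are stand-ins, not the study's.
import numpy as np

rng = np.random.default_rng(0)
n = 500
safety = rng.uniform(0, 10, n)        # crowd-scored "how safe does it look?"
density = rng.uniform(1, 30, n)       # residents per hectare (stand-in)
dist_centre = rng.uniform(0, 15, n)   # km to the city centre (stand-in)
visits = 2.0 * safety - 0.5 * dist_centre + 0.1 * density + rng.normal(0, 3, n)

# Ordinary least squares: visits ~ safety + density + dist_centre + intercept.
# The safety coefficient is the association net of the two control variables.
X = np.column_stack([safety, density, dist_centre, np.ones(n)])
coef, *_ = np.linalg.lstsq(X, visits, rcond=None)
print(f"safety coefficient, adjusted for density and distance: {coef[0]:.2f}")
```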
Robert M. Patton, Christopher G. Stahl and Jack C. Wells at DLib Magazine: “Measuring scientific progress remains elusive. There is an intuitive understanding that, in general, science is progressing forward. New ideas and theories are formed, older ideas and theories are confirmed, rejected, or modified. Progress is made. But questions such as how it is made, by whom, how broadly, or how quickly present significant challenges. Historically, scientific publications reference other publications when the cited publication in some way shaped the work that was performed. In other words, one publication “impacted” a later one. The implication of this impact revolves around the intellectual content of the idea, theory, or conclusion that was formed. Several metrics such as h-index or journal impact factor (JIF) are often used as a means to assess whether an author, article, or journal creates an “impact” on science. The implied statement behind high values for such metrics is that the work must somehow be valuable to the community, which in turn implies that the author, article, or journal somehow has influenced the direction, development, or progress of what others in that field do. Unfortunately, the drive for increased publication revenue, research funding, or global recognition has led to a variety of external factors completely unrelated to the quality of the work that can be used to manipulate key metric values. In addition, advancements in the computing and data science fields have further altered the meaning of impact on science.
The remainder of this paper will highlight recent advancements in both cultural and technological factors that now influence scientific impact as well as suggest new factors to be leveraged through full content analysis of publications….(More)”
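For readers unfamiliar with the h-index invoked above, its definition is short: the largest h such that an author has h papers with at least h citations each. A minimal sketch — our illustration, with invented citation counts:

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    counts = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(counts, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h

print(h_index([25, 8, 5, 3, 3, 1]))  # -> 3: three papers with 3+ citations each
```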
Clive Thompson at the Smithsonian magazine: “As the 2016 election approaches, we’re hearing a lot about “red states” and “blue states.” That idiom has become so ingrained that we’ve almost forgotten where it originally came from: a data visualization.
In the 2000 presidential election, the race between Al Gore and George W. Bush was so razor close that broadcasters pored over electoral college maps—which they typically colored red and blue. What’s more, they talked about those shadings. NBC’s Tim Russert wondered aloud how George Bush would “get those remaining 61 electoral red states, if you will,” and that language became lodged in the popular imagination. America became divided into two colors—data spun into pure metaphor. Now Americans even talk routinely about “purple” states, a mental visualization of political information.
We live in an age of data visualization. Go to any news website and you’ll see graphics charting support for the presidential candidates; open your iPhone and the Health app will generate personalized graphs showing how active you’ve been this week, month or year. Sites publish charts showing how the climate is changing, how schools are segregating, how much housework mothers do versus fathers. And newspapers are increasingly finding that readers love “dataviz”: In 2013, the New York Times’ most-read story for the entire year was a visualization of regional accents across the United States. It makes sense. We live in an age of Big Data. If we’re going to understand our complex world, one powerful way is to graph it.
But this isn’t the first time we’ve discovered the pleasures of making information into pictures. Over a hundred years ago, scientists and thinkers found themselves drowning in their own flood of data—and to help understand it, they invented the very idea of infographics.
**********
The idea of visualizing data is old: After all, that’s what a map is—a representation of geographic information—and we’ve had maps for about 8,000 years. But it was rare to graph anything other than geography. Only a few examples exist: Around the 11th century, a now-anonymous scribe created a chart of how the planets moved through the sky. By the 18th century, scientists were warming to the idea of arranging knowledge visually. The British polymath Joseph Priestley produced a “Chart of Biography,” plotting the lives of about 2,000 historical figures on a timeline. A picture, he argued, conveyed the information “with more exactness, and in much less time, than it [would take] by reading.”
Still, data visualization was rare because data was rare. That began to change rapidly in the early 19th century, because countries began to collect—and publish—reams of information about their weather, economic activity and population. “For the first time, you could deal with important social issues with hard facts, if you could find a way to analyze it,” says Michael Friendly, a professor of psychology at York University who studies the history of data visualization. “The age of data really began.”
An early innovator was the Scottish inventor and economist William Playfair. As a teenager he apprenticed to James Watt, the Scottish inventor who perfected the steam engine. Playfair was tasked with drawing up patents, which required him to develop excellent drafting and picture-drawing skills. After he left Watt’s lab, Playfair became interested in economics and convinced that he could use his facility for illustration to make data come alive.
“An average political economist would have certainly been able to produce a table for publication, but not necessarily a graph,” notes Ian Spence, a psychologist at the University of Toronto who’s writing a biography of Playfair. Playfair, who understood both data and art, was perfectly positioned to create this new discipline.
In one famous chart, he plotted the price of wheat in the United Kingdom against the cost of labor. People often complained about the high cost of wheat and thought wages were driving the price up. Playfair’s chart showed this wasn’t true: Wages were rising much more slowly than the cost of the product.
“He wanted to discover,” Spence notes. “He wanted to find regularities or points of change.” Playfair’s illustrations often look amazingly modern: In one, he drew pie charts—his invention, too—and lines that compared the size of various countries’ populations against their tax revenues. Once again, the chart produced a new, crisp analysis: The British paid far higher taxes than citizens of other nations.
Neurology was not yet a robust science, but Playfair seemed to intuit some of its principles. He suspected the brain processed images more readily than words: A picture really was worth a thousand words. “He said things that sound almost like a 20th-century vision researcher,” Spence adds. Data, Playfair wrote, should “speak to the eyes”—because they were “the best judge of proportion, being able to estimate it with more quickness and accuracy than any other of our organs.” A really good data visualization, he argued, “produces form and shape to a number of separate ideas, which are otherwise abstract and unconnected.”
Soon, intellectuals across Europe were using data visualization to grapple with the travails of urbanization, such as crime and disease….(More)”
Allison Shapiro in Pacific Standard Magazine: “In 2014, New York City Mayor Bill de Blasio decided to adopt Vision Zero, a multi-national initiative dedicated to eliminating traffic-related deaths. Under Vision Zero, city services, including the Department of Transportation, began an engineering and public relations plan to make the streets safer for drivers, pedestrians, and cyclists. The plan included street re-designs, improved accessibility measures, and media campaigns on safer driving.
The goal may be an old one, but the approach is innovative: When New York City officials wanted to reduce traffic deaths, they crowdsourced and used data.
Many cities in the United States—from Washington, D.C., all the way to Los Angeles—have adopted some version of Vision Zero, which began in Sweden in 1997. It’s part of a growing trend to make cities “smart” by integrating data collection into things like infrastructure and policing.
Cities have access to an unprecedented amount of data about traffic patterns, driving violations, and pedestrian concerns. Although advocacy groups say Vision Zero is moving too slowly, de Blasio has invested another $115 million in this data-driven approach.
De Blasio may have been vindicated. A 2015 year-end report released by the city last week analyzes the successes and shortfalls of data-driven city life, and the early results look promising. In 2015, fewer New Yorkers lost their lives in traffic accidents than in any year since 1910, according to the report, despite the fact that the population has almost doubled in those 105 years.
Below are some of the project highlights.
New Yorkers were invited to add to this public dialogue map, where they could list information ranging from “not enough time to cross” to “red light running.” The Department of Transportation ended up with over 10,000 comments, which led to 80 safety projects in 2015, including the creation of protected bike lanes, the introduction of leading pedestrian intervals, and the simplifying of complex intersections….
Data collected from the public dialogue map, town hall meetings, and past traffic accidents led to “changes to signals, street geometry and markings and regulations that govern actions like turning and parking. These projects simplify driving, walking and bicycling, increase predictability, improve visibility and reduce conflicts,” according to Vision Zero in NYC….(More)”
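A hypothetical sketch of the kind of triage such a crowdsourced map enables — counting reports per location to surface the intersections most in need of attention. The locations and complaint categories below are invented, not NYC DOT data:

```python
# Count crowdsourced complaints per location and rank the hotspots.
# Locations and complaint categories are illustrative only.
from collections import Counter

comments = [
    ("Atlantic Ave & 4th Ave", "not enough time to cross"),
    ("Atlantic Ave & 4th Ave", "red light running"),
    ("Queens Blvd & 63rd Dr", "speeding"),
    ("Atlantic Ave & 4th Ave", "speeding"),
    ("Queens Blvd & 63rd Dr", "red light running"),
]

reports_by_location = Counter(location for location, _ in comments)
for location, n_reports in reports_by_location.most_common():
    print(f"{location}: {n_reports} reports")
```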
Athina Karatzogianni at the Conversation: “One of the key issues the West has had to face in countering Islamic State (IS) is the jihadi group’s mastery of online propaganda, seen in hundreds of thousands of messages celebrating the atrocities against civilians and spreading the message of radicalisation. It seems clear that efforts to counter IS online are missing the mark.
An internal US State Department assessment noted in June 2015 how the violent narrative of IS had “trumped” the efforts of the world’s richest and most technologically advanced nations. Meanwhile in Europe, Interpol was to track and take down social media accounts linked to IS, as if that would solve the problem – when in fact doing so meant potentially missing out on intelligence-gathering opportunities.
Into this vacuum has stepped Anonymous, a fragmented loose network of hacktivists that has for years launched occasional cyberattacks against government, corporate and civil society organisations. The group announced its intention to take on IS and its propaganda online, using its networks to crowd-source the identity of IS-linked accounts. Under the banner of #OpIsis and #OpParis, Anonymous published lists of thousands of Twitter accounts claimed to belong to IS members or sympathisers, claiming more than 5,500 had been removed.
The group pursued a similar approach following the attacks on Charlie Hebdo magazine in January 2015, with @OpCharlieHebdo taking down more than 200 jihadist Twitter accounts, bringing down the website Ansar-Alhaqq.net and publishing a list of 25,000 accounts alongside a guide on how to locate pro-IS material online….
Anonymous has been prosecuted for cyber attacks in many countries under cybercrime laws, as their activities are not seen as legitimate protest. It is worth mentioning the ethical debate around hacktivism, as some see cyber attacks that take down accounts or websites as infringing on others’ freedom of expression, while others argue that hacktivism should instead create technologies to circumvent censorship, enable digital equality and open access to information….(More)”
H. V. Jagadish in the Conversation: “Police departments, like everyone else, would like to be more effective while spending less. Given the tremendous attention to big data in recent years, and the value it has provided in fields ranging from astronomy to medicine, it should be no surprise that police departments are using data analysis to inform deployment of scarce resources. Enter the era of what is called “predictive policing.”
Some form of predictive policing is likely now in force in a city near you. Memphis was an early adopter. Cities from Minneapolis to Miami have embraced predictive policing. Time magazine named predictive policing (with particular reference to the city of Santa Cruz) one of the 50 best inventions of 2011. New York City Police Commissioner William Bratton recently said that predictive policing is “the wave of the future.”
The term “predictive policing” suggests that the police can anticipate a crime and be there to stop it before it happens and/or apprehend the culprits right away. As the Los Angeles Times points out, it depends on “sophisticated computer analysis of information about previous crimes, to predict where and when crimes will occur.”
At a very basic level, it’s easy for anyone to read a crime map and identify neighborhoods with higher crime rates. It’s also easy to recognize that burglars tend to target businesses at night, when they are unoccupied, and to target homes during the day, when residents are away at work. The challenge is to take a combination of dozens of such factors to determine where crimes are more likely to happen and who is more likely to commit them. Predictive policing algorithms are getting increasingly good at such analysis. Indeed, such was the premise of the movie Minority Report, in which the police can arrest and convict murderers before they commit their crime.
Predicting a crime with certainty is something that science fiction can have a field day with. But as a data scientist, I can assure you that in reality we can come nowhere close to certainty, even with advanced technology. To begin with, predictions can be only as good as the input data, and quite often these input data have errors.
But even with perfect, error-free input data and unbiased processing, ultimately what the algorithms are determining are correlations. Even if we have perfect knowledge of your troubled childhood, your socializing with gang members, your lack of steady employment, your wacko posts on social media and your recent gun purchases, all that the best algorithm can do is to say it is likely, but not certain, that you will commit a violent crime. After all, to believe such predictions as guaranteed is to deny free will….
What data can do is give us probabilities, rather than certainty. Good data coupled with good analysis can give us very good estimates of probability. If you sum probabilities over many instances, you can usually get a robust estimate of the total.
For example, data analysis can provide a probability that a particular house will be broken into on a particular day based on historical records for similar houses in that neighborhood on similar days. An insurance company may add this up over all days in a year to decide how much to charge for insuring that house….(More)”
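A back-of-the-envelope version of that insurance example, with invented numbers — the point is only that tiny daily probabilities, summed over a year, yield a usable expected loss:

```python
# Invented numbers for illustration; not drawn from any police or insurance data.
daily_break_in_probability = 0.0001   # estimated chance of a break-in on a given day
expected_break_ins_per_year = daily_break_in_probability * 365
average_claim_cost = 5_000.0          # assumed average cost of a burglary claim

expected_annual_loss = expected_break_ins_per_year * average_claim_cost
print(f"Expected break-ins per year: {expected_break_ins_per_year:.4f}")
print(f"Fair annual premium before overhead: ${expected_annual_loss:.2f}")
```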
Minna Ruckenstein and Mika Pantzar in New Media and Society: “This article investigates the metaphor of the Quantified Self (QS) as it is presented in the magazine Wired (2008–2012). Four interrelated themes—transparency, optimization, feedback loop, and biohacking—are identified as formative in defining a new numerical self and promoting a dataist paradigm. Wired captures certain interests and desires with the QS metaphor, while ignoring and downplaying others, suggesting that the QS positions self-tracking devices and applications as interfaces that energize technological engagements, thereby pushing us to rethink life in a data-driven manner. The thematic analysis of the QS is treated as a schematic aid for raising critical questions about self-quantification, for instance, detecting the merging of epistemological claims, technological devices, and market-making efforts. From this perspective, another definition of the QS emerges: a knowledge system that remains flexible in its aims and can be used as a resource for epistemological inquiry and in the formation of alternative paradigms….(More)”
Mark Boonshoft at The Junto: “Data. Before postmodernism, or environmental history, or the cultural turn, or the geographic turn, and even before the character on the old Star Trek series, historians began to gather and analyze quantitative evidence to understand the past. As computers became common during the 1970s and 1980s, scholars responded by painstakingly compiling and analyzing datasets, using that evidence to propose powerful new historical interpretations. Today, much of that information (as well as data compiled since) is in danger of disappearing. For that and other reasons, we have developed a website designed to preserve and share the datasets permanently (or at least until aliens destroy our planet). We appeal to all early American historians (not only the mature ones from earlier decades) to take the time both to preserve and to share their statistical evidence with present and future scholars. It will not only be a legacy to the profession but also will encourage historians to share their data more openly and to provide a foundation on which scholars can build.
In coordination with the McNeil Center for Early American Studies and specialists at the University of Pennsylvania Libraries, in addition to bepress, we have established the Magazine of Early American Datasets (MEAD), available at http://repository.upenn.edu/mead/. We’d love to have your datasets, your huddled 1’s and 0’s (and other numbers and letters) yearning to be free. The best would be in either .csv or, if you have commas in your data, .txt, because both of those are non-proprietary and somewhat close to universal. However, if the data is in other forms, like Access, Excel or SPSS, that will do fine as well. Ultimately, we should be able to convert files to a more permanent database and to preserve those files in perpetuity. In addition, we are asking scholars, out of the goodness of their hearts and commitment to the profession, to upload a separate document as a codebook explaining the meaning of the variables. The files will all be available to any scholar regardless of their academic affiliation.
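For contributors wondering what a submission might look like in practice, here is a small hypothetical sketch — the filenames, variables and values are invented, not MEAD requirements — writing a dataset to a non-proprietary .csv alongside a plain-text codebook:

```python
# Hypothetical example of a MEAD-style submission: data as .csv plus a
# separate plain-text codebook describing each variable. All values invented.
import csv

rows = [
    {"town": "Dedham", "year": 1765, "taxable_polls": 312},
    {"town": "Concord", "year": 1765, "taxable_polls": 287},
]

with open("tax_lists_1765.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["town", "year", "taxable_polls"])
    writer.writeheader()
    writer.writerows(rows)

with open("tax_lists_1765_codebook.txt", "w") as f:
    f.write("town: town name as given in the original tax list\n")
    f.write("year: year of the assessment\n")
    f.write("taxable_polls: number of taxable polls recorded for the town\n")
```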
How will a free, open, centralized data center benefit early American historians, and why should you participate in using and sharing data? Let us count just a few ways. In our experience, most historians of early America are extremely generous in sharing not only their expertise but also their evidence with other scholars. However, that generally occurs on an individual, case-by-case basis in a somewhat serendipitous fashion. A centralized website would permit scholars quickly to investigate whether quantitative evidence is available on which they might begin to construct their own research. Ideally, scholars setting out on a new topic might be guided somewhat by the existence and availability of data. Moreover, it would set a precedent that future historians might follow—routinely sharing their evidence, either before or after their publications analyzing the data have appeared in print or online….(More)”