Data-Intensive Approaches To Creating Innovation For Sustainable Smart Cities


Science Trends: “Located at the complex intersection of economic development and environmental change, cities play a central role in our efforts to move towards sustainability. Reducing air and water pollution, improving energy efficiency while securing energy supply, and minimizing vulnerabilities to disruptions and disturbances are interconnected and pose a formidable challenge, with their dynamic interactions changing in highly complex and unpredictable ways….

The Beijing City Lab demonstrates the usefulness of open urban data in mapping urbanization with a fine spatiotemporal scale and reflecting social and environmental dimensions of urbanization through visualization at multiple scales.

Open data, as a basic principle, can generate significant opportunities for promoting inter-disciplinary and inter-organizational research, producing new data sets through the integration of different sources, avoiding duplication of research, facilitating the verification of previous results, and encouraging citizen scientists and crowdsourcing approaches. Open data is also expected to help governments promote transparency, citizen participation, and access to information in policy-making processes.

Despite this significant potential, however, numerous challenges remain in facilitating innovation for urban sustainability through open data. The scope and amount of data collected and shared are still limited, and quality control, error monitoring, and cleaning of open data are indispensable for securing the reliability of any analysis. The organizational and legal frameworks of data-sharing platforms are often not well defined or established, and it is critical to address interoperability between various data standards, the balance between open and proprietary data, and normative and legal issues such as data ownership, personal privacy, confidentiality, law enforcement, and the maintenance of public safety and national security….

These findings are described in the article entitled Facilitating data-intensive approaches to innovation for sustainability: opportunities and challenges in building smart cities, published in the journal Sustainability Science. This work was led by Masaru Yarime from the City University of Hong Kong….(More)”.

When census taking is a recipe for controversy


Anjana Ahuja in the Financial Times: “Population counts are important tools for development, but also politically fraught…The UN describes a census as “among the most complex and massive peacetime exercises a nation undertakes”. Given that social trends, migration patterns and inequalities can be determined from questions that range from health to wealth, housing and even religious beliefs, censuses can also be controversial. So it is with the next one in the US, due to be conducted in 2020. The US Department of Justice has proposed that participants should be quizzed on their citizenship status. Vanita Gupta, president of the Leadership Conference on Civil and Human Rights, warned the journal Science that many would refuse to take part. Ms Gupta said that, in the current political climate, enquiring about citizenship “would destroy any chance of an accurate count, discard years of careful research and increase costs significantly”.

The row has taken on a new urgency because the 2020 census must be finalised by April. The DoJ claims that a citizenship question will ensure that ethnic minorities are treated fairly in the voting process. Currently, only about one in six households is asked about citizenship, with the results extrapolated for the whole population, a process observers say is statistically acceptable and less intrusive. In 2011, the census for England and Wales asked for country of birth and passports held — but not citizenship explicitly. It is one of those curious cases when fewer questions might lead to more accurate and useful data….(More)”.

Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor


Book by Virginia Eubanks: “The State of Indiana denies one million applications for healthcare, food stamps and cash benefits in three years—because a new computer system interprets any mistake as “failure to cooperate.” In Los Angeles, an algorithm calculates the comparative vulnerability of tens of thousands of homeless people in order to prioritize them for an inadequate pool of housing resources. In Pittsburgh, a child welfare agency uses a statistical model to try to predict which children might be future victims of abuse or neglect.

Since the dawn of the digital age, decision-making in finance, employment, politics, health and human services has undergone revolutionary change. Today, automated systems—rather than humans—control which neighborhoods get policed, which families attain needed resources, and who is investigated for fraud. While we all live under this new regime of data, the most invasive and punitive systems are aimed at the poor.

In Automating Inequality, Virginia Eubanks systematically investigates the impacts of data mining, policy algorithms, and predictive risk models on poor and working-class people in America. The book is full of heart-wrenching and eye-opening stories, from a woman in Indiana whose benefits are literally cut off as she lies dying to a family in Pennsylvania in daily fear of losing their daughter because they fit a certain statistical profile.

The U.S. has always used its most cutting-edge science and technology to contain, investigate, discipline and punish the destitute. Like the county poorhouse and scientific charity before them, digital tracking and automated decision-making hide poverty from the middle-class public and give the nation the ethical distance it needs to make inhumane choices: which families get food and which starve, who has housing and who remains homeless, and which families are broken up by the state. In the process, they weaken democracy and betray our most cherished national values….(More)”.

Visualizing the Uncertainty in Data


Nathan Yau at FlowingData: “Data is a representation of real life. It’s an abstraction, and it’s impossible to encapsulate everything in a spreadsheet, which leads to uncertainty in the numbers.

How well does a sample represent a full population? How likely is it that a dataset represents the truth? How much do you trust the numbers?

Statistics is a game where you figure out these uncertainties and make estimated judgements based on your calculations. But standard errors, confidence intervals, and likelihoods often lose their visual space in data graphics, which leads to judgements based on simplified summaries expressed as means, medians, or extremes.

That’s no good. You miss out on the interesting stuff. The important stuff. So here are some visualization options for the uncertainties in your data, each with its pros, cons, and examples….(More)”.
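
To make that concrete, here is a minimal sketch of one common option — plotting a mean together with its 95% confidence band rather than the mean alone. It is not taken from the FlowingData post itself, and the data are simulated purely for illustration.

```python
# A minimal sketch (not from the FlowingData post): show a mean with its
# 95% confidence band instead of the mean alone. Data are simulated.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = np.arange(12)                                              # e.g. months
samples = rng.normal(loc=50 + 2 * x, scale=8, size=(200, 12))  # 200 noisy observations per month

mean = samples.mean(axis=0)
sem = samples.std(axis=0, ddof=1) / np.sqrt(samples.shape[0])  # standard error of the mean
ci95 = 1.96 * sem                                              # normal-approximation 95% interval

fig, ax = plt.subplots()
ax.plot(x, mean, color="steelblue", label="mean")
ax.fill_between(x, mean - ci95, mean + ci95, color="steelblue", alpha=0.3,
                label="95% confidence band")
ax.set_xlabel("month")
ax.set_ylabel("measured value")
ax.legend()
plt.show()
```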

AI System Sorts News Articles By Whether or Not They Contain Actual Information


Michael Byrne at Motherboard: “… in a larger sense it’s worth wondering to what degree the larger news feed is being diluted by news stories that are not “content dense.” That is, what’s the real ratio between signal and noise, objectively speaking? To start, we’d need a reasonably objective metric of content density and a reasonably objective mechanism for evaluating news stories in terms of that metric.

In a recent paper published in the Journal of Artificial Intelligence Research, computer scientists Ani Nenkova and Yinfei Yang, of Google and the University of Pennsylvania, respectively, describe a new machine learning approach to classifying written journalism according to a formalized idea of “content density.” With an average accuracy of around 80 percent, their system was able to accurately classify news stories across a wide range of domains, spanning from international relations and business to sports and science journalism, when evaluated against a ground truth dataset of already correctly classified news articles.

At a high level this works like most other machine learning systems. Start with a big batch of data—news articles, in this case—and then give each item an annotation saying whether or not that item falls within a particular category. In particular, the study focused on article leads, the first paragraph or two in a story traditionally intended to summarize its contents and engage the reader. Articles were drawn from an existing New York Times linguistic dataset consisting of original articles combined with metadata and short informative summaries written by researchers….(More)”.
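
For readers who want a feel for this kind of pipeline, the sketch below builds a generic lead classifier with scikit-learn (TF-IDF features plus logistic regression). It is not the model Nenkova and Yang describe, and the annotated leads are invented placeholders; a real run would need the full annotated corpus.

```python
# A rough sketch of the general workflow described above, NOT the authors'
# actual model: article leads annotated as content-dense (1) or not (0),
# vectorised with TF-IDF and classified with logistic regression.
# The example leads and labels below are invented placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

leads = [
    "The central bank raised interest rates by half a point on Tuesday, citing new inflation data.",
    "It was a night to remember, full of twists that nobody saw coming.",
    # ...in practice, thousands of annotated leads drawn from a corpus such as the NYT dataset
]
labels = [1, 0]  # 1 = content-dense, 0 = not

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
model.fit(leads, labels)

# With a real corpus, accuracy would be estimated on held-out articles,
# analogous to the roughly 80 percent the paper reports.
print(model.predict(["Officials confirmed the merger will close in March for $2bn."]))
```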

Social Theory After the Internet: Media, Technology and Globalization


(Open Access) Book by Ralph Schroeder: “The internet has fundamentally transformed society in the past 25 years, yet existing theories of mass or interpersonal communication do not work well in understanding a digital world. Nor has this understanding been helped by disciplinary specialization and a continual focus on the latest innovations. Ralph Schroeder takes a longer-term view, synthesizing perspectives and findings from various social science disciplines in four countries: the United States, Sweden, India and China. His comparison highlights, among other observations, that smartphones are in many respects more important than PC-based internet uses.

Social Theory after the Internet focuses on everyday uses and effects of the internet, including information seeking and big data, and explains how the internet has gone beyond traditional media in, for example, enabling Donald Trump and Narendra Modi to come to power. Schroeder puts forward a sophisticated theory of the role the internet plays, and how both technological and social forces shape its significance. He provides a sweeping and penetrating study, theoretically ambitious and at the same time always empirically grounded….(More)”.

Do-it-yourself science is taking off


The Economist: “…Citizen science has been around for ages—professional astronomers, geologists and archaeologists have long had their work supplemented by enthusiastic amateurs—and new cheap instruments can usefully spread the movement’s reach. What is more striking about bGeigie and its like, though, is that citizens and communities can use such instruments to inform decisions on which science would otherwise be silent—or mistrusted. For example, getting hold of a bGeigie led some people planning to move home after Fukushima to decide they were safer staying put.

Ms Liboiron’s research at CLEAR also stresses self-determination. It is subject to “community peer review”: those who have participated in the lab’s scientific work decide whether it is valid and merits publication. In the 1980s fishermen had tried to warn government scientists that stocks were in decline. Their cries were ignored and the sudden collapse of Newfoundland’s cod stocks in 1992 had left 35,000 jobless. The people taking science into their own hands with Ms Liboiron want to make sure that in the future the findings which matter to them get heard.

Swell maps

Issues such as climate change, plastic waste and air pollution become more tangible to those with the tools in their hands to measure them. Those tools, in turn, encourage more people to get involved. Eymund Diegel, a South African urban planner who is also a keen canoeist, has long campaigned for the Gowanus canal, close to his home in Brooklyn, to be cleaned up. Effluent from paint manufacturers, tanneries, chemical plants and more used to flow into the canal with such profligacy that by the early 20th century the Gowanus was said to be jammed solid. The New York mob started using the waterway as a dumping ground for dead bodies. In the early part of this century it was still badly polluted.

In 2009 Mr Diegel contacted Public Lab, an NGO based in New Orleans that helps people investigate environmental concerns. They directed him to what became his most powerful weapon in the fight—a mapping rig consisting of a large helium balloon, 300 metres (1,000 feet) of string and an old digital camera. A camera or smartphone fixed to such a balloon can take more detailed photographs than the satellite imagery used by the likes of Google for its online maps, and Public Lab provides software, called MapKnitter, that can stitch these photos together into surveys.

These data—and community pressure—helped persuade the Environmental Protection Agency (EPA) to make the canal eligible for money from a “superfund” programme which targets some of America’s most contaminated land. Mr Diegel’s photos have revealed a milky plume flowing into the canal from a concealed chemical tank which the EPA’s own surveys had somehow missed. The agency now plans to spend $500m cleaning up the canal….(More)”.
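
As an aside on the stitching step described above, the sketch below shows the same basic idea — merging overlapping aerial photos into a single mosaic — using OpenCV's generic Stitcher. It is not MapKnitter's own code, and unlike MapKnitter it does not georeference the result; the file names are placeholders.

```python
# A rough sketch of the underlying idea (not MapKnitter itself): stitch
# overlapping balloon photos into one mosaic with OpenCV's Stitcher.
# File names below are placeholders.
import cv2

paths = ["balloon_photo_01.jpg", "balloon_photo_02.jpg", "balloon_photo_03.jpg"]
images = [cv2.imread(p) for p in paths]

stitcher = cv2.Stitcher_create(cv2.Stitcher_SCANS)  # SCANS mode suits flat, map-like scenes
status, mosaic = stitcher.stitch(images)

if status == cv2.Stitcher_OK:
    cv2.imwrite("mosaic.jpg", mosaic)
else:
    print("Stitching failed with status", status)
```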

Big Data Challenge for Social Sciences: From Society and Opinion to Replications


Symposium Paper by Dominique Boullier: “When in 2007 Savage and Burrows pointed out ‘the coming crisis of empirical methods’, they were not expecting to be so right. Their paper, however, became a landmark, signifying the social sciences’ reaction to the tremendous shock triggered by digital methods. As they frankly acknowledge in a more recent paper, they did not even imagine the extent to which their prediction might become true, in an age of Big Data, where sources and models have to be revised in the light of extended computing power and radically innovative mathematical approaches. They signalled not just a debate about academic methods but also a momentum for ‘commercial sociology’ in which platforms acquire the capacity to add ‘another major nail in the coffin of academic sociology claims to jurisdiction over knowledge of the social’, because ‘research methods (are) an intrinsic feature of contemporary capitalist organisations’ (Burrows and Savage, 2014, p. 2). This need for a serious account of research methods is well tuned with the claims of Social Studies of Science that should be applied to the social sciences as well.

I would like to build on these insights and principles of Burrows and Savage to propose an historical and systematic account of quantification during the last century, following in the footsteps of Alain Desrosières, and in which we see Big Data and Machine Learning as a major shift in the way social science can be performed. And since, according to Burrows and Savage (2014, p. 5), ‘the use of new data sources involves a contestation over the social itself’, I will take the risk here of identifying and defining the entities that are supposed to encapsulate the social for each kind of method: beyond the reign of ‘society’ and ‘opinion’, I will point at the emergence of the ‘replications’ that are fabricated by digital platforms but are radically different from previous entities. This is a challenge to invent not only new methods but also a new process of reflexivity for societies, made available by new stakeholders (namely, the digital platforms) which transform reflexivity into reactivity (as operational quantifiers always tend to)….(More)”.

Could Bitcoin technology help science?


Andy Extance at Nature: “…The much-hyped technology behind Bitcoin, known as blockchain, has intoxicated investors around the world and is now making tentative inroads into science, spurred by broad promises that it can transform key elements of the research enterprise. Supporters say that it could enhance reproducibility and the peer review process by creating incorruptible data trails and securely recording publication decisions. But some also argue that the buzz surrounding blockchain often exceeds reality and that introducing the approach into science could prove expensive and raise ethical problems.

A few collaborations, including Scienceroot and Pluto, are already developing pilot projects for science. Scienceroot aims to raise US$20 million, which will help pay both peer reviewers and authors within its electronic journal and collaboration platform. It plans to raise the funds in early 2018 by exchanging some of the science tokens it uses for payment in return for another digital currency known as ether. And the makers of the Wolfram Mathematica algebra program — which is widely used by researchers — are currently working towards offering support for an open-source blockchain platform called Multichain. Scientists could use this, for example, to upload data to a shared, open workspace that isn’t controlled by any specific party, according to Multichain….

Claudia Pagliari, who researches digital health-tracking technologies at the University of Edinburgh, UK, says that she recognizes the potential of blockchain, but researchers have yet to properly explore its ethical issues. What happens if a patient withdraws consent for a trial that is immutably recorded on a blockchain? And unscrupulous researchers could still add fake data to a blockchain, even if the process is so open that everyone can see who adds it, says Pagliari. Once added, no-one can change that information, although it’s possible they could label it as retracted….(More)”.
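
To see why a blockchain makes such records tamper-evident in the first place, here is a toy hash chain in Python — an illustration only, not Multichain or any production blockchain API. Each entry stores the hash of the previous one, so quietly editing an old record invalidates every later link.

```python
# Toy illustration (not Multichain or any real blockchain API) of why a
# hash-chained record trail is tamper-evident: each entry stores the hash
# of the previous one, so altering an old record breaks every later link.
import hashlib
import json
import time

def add_record(chain, payload):
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    block = {"payload": payload, "timestamp": time.time(), "prev_hash": prev_hash}
    block["hash"] = hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()
    chain.append(block)
    return block

def verify(chain):
    for i, block in enumerate(chain):
        expected_prev = chain[i - 1]["hash"] if i else "0" * 64
        body = {k: v for k, v in block.items() if k != "hash"}
        recomputed = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if block["prev_hash"] != expected_prev or block["hash"] != recomputed:
            return False
    return True

chain = []
add_record(chain, {"event": "trial consent recorded", "subject": "A-017"})
add_record(chain, {"event": "dataset uploaded", "note": "hash of data file goes here"})
print(verify(chain))                        # True: the trail is intact
chain[0]["payload"]["subject"] = "B-999"    # tamper with an old record
print(verify(chain))                        # False: every later link is now broken
```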

How the Index Card Cataloged the World


Daniela Blei in the Atlantic: “…The index card was a product of the Enlightenment, conceived by one of its towering figures: Carl Linnaeus, the Swedish botanist, physician, and the father of modern taxonomy. But like all information systems, the index card had unexpected political implications, too: It helped set the stage for categorizing people, and for the prejudice and violence that comes along with such classification….

In 1780, two years after Linnaeus’s death, Vienna’s Court Library introduced a card catalog, the first of its kind. Describing all the books on the library’s shelves in one ordered system, it relied on a simple, flexible tool: paper slips. Around the same time that the library catalog appeared, says Krajewski, Europeans adopted banknotes as a universal medium of exchange. He believes this wasn’t a historical coincidence. Banknotes, like bibliographical slips of paper and the books they referred to, were material, representational, and mobile. Perhaps Linnaeus took the same mental leap from “free-floating banknotes” to “little paper slips” (or vice versa). Sweden’s great botanist was also a participant in an emerging capitalist economy.

Linnaeus never grasped the full potential of his paper technology. Born of necessity, his paper slips were “idiosyncratic,” say Charmantier and Müller-Wille. “There is no sign he ever tried to rationalize or advertise the new practice.” Like his taxonomical system, paper slips were both an idea and a method, designed to bring order to the chaos of the world.

The passion for classification, a hallmark of the Enlightenment, also had a dark side. From nature’s variety came an abiding preoccupation with the differences between people. As soon as anthropologists applied Linnaeus’s taxonomical system to humans, the category of race, together with the ideology of racism, was born.

It’s fitting, then, that the index card would have a checkered history. To take one example, the FBI’s J. Edgar Hoover used skills he burnished as a cataloger at the Library of Congress to assemble his notorious “Editorial Card Index.” By 1920, he had cataloged 200,000 subversive individuals and organizations in detailed, cross-referenced entries. Nazi ideologues compiled a deadlier index-card database to classify 500,000 Jewish Germans according to racial and genetic background. Other regimes have employed similar methods, relying on the index card’s simplicity and versatility to catalog enemies real and imagined.

The act of organizing information—even notes about plants—is never neutral or objective. Anyone who has used index cards to plan a project, plot a story, or study for an exam knows that hierarchies are inevitable. Forty years ago, Michel Foucault observed in a footnote that, curiously, historians had neglected the invention of the index card. The book was Discipline and Punish, which explores the relationship between knowledge and power. The index card was a turning point, Foucault believed, in the relationship between power and technology. Like the categories they cataloged, Linnaeus’s paper slips belong to the history of politics as much as the history of science….(More)”.