The Quiet Movement to Make Government Fail Less Often


in The New York Times: “If you wanted to bestow the grandiose title of “most successful organization in modern history,” you would struggle to find a more obviously worthy nominee than the federal government of the United States.

In its earliest stirrings, it established a lasting and influential democracy. Since then, it has helped defeat totalitarianism (more than once), established the world’s currency of choice, sent men to the moon, built the Internet, nurtured the world’s largest economy, financed medical research that saved millions of lives and welcomed eager immigrants from around the world.

Of course, most Americans don’t think of their government as particularly successful. Only 19 percent say they trust the government to do the right thing most of the time, according to Gallup. Some of this mistrust reflects a healthy skepticism that Americans have always had toward centralized authority. And the disappointing economic growth of recent decades has made Americans less enamored of nearly every national institution.

But much of the mistrust really does reflect the federal government’s frequent failures – and progressives in particular will need to grapple with these failures if they want to persuade Americans to support an active government.

When the federal government is good, it’s very, very good. When it’s bad (or at least deeply inefficient), it’s the norm.

The evidence is abundant. Of the 11 large programs for low- and moderate-income people that have been subject to rigorous, randomized evaluation, only one or two show strong evidence of improving most beneficiaries’ lives. “Less than 1 percent of government spending is backed by even the most basic evidence of cost-effectiveness,” writes Peter Schuck, a Yale law professor, in his new book, “Why Government Fails So Often,” a sweeping history of policy disappointments.

As Mr. Schuck puts it, “the government has largely ignored the ‘moneyball’ revolution in which private-sector decisions are increasingly based on hard data.”

And yet there is some good news in this area, too. The explosion of available data has made evaluating success – in the government and the private sector – easier and less expensive than it used to be. At the same time, a generation of data-savvy policy makers and researchers has entered government and begun pushing it to do better. They have built on earlier efforts by the Bush and Clinton administrations.

The result is a flowering of experiments to figure out what works and what doesn’t.

New York City, Salt Lake City, New York State and Massachusetts have all begun programs to link funding for programs to their success: The more effective they are, the more money they and their backers receive. The programs span child care, job training and juvenile recidivism.

The approach is known as “pay for success,” and it’s likely to spread to Cleveland, Denver and California soon. David Cameron’s conservative government in Britain is also using it. The Obama administration likes the idea, and two House members – Todd Young, an Indiana Republican, and John Delaney, a Maryland Democrat – have introduced a modest bill to pay for a version known as “social impact bonds.”

The White House is also pushing for an expansion of randomized controlled trials to evaluate government programs. Such trials, Mr. Schuck notes, are “the gold standard” for any kind of evaluation. Using science as a model, researchers randomly select some people to enroll in a government program and others not to enroll. The researchers then study the outcomes of the two groups….”
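
To make the design described above concrete, here is a minimal sketch of random assignment followed by a comparison of group outcomes. It is an illustration only, not any agency's evaluation method: the function names, the ten participants and the outcome values are all hypothetical, and a real trial would also report uncertainty (for example, a confidence interval) rather than a bare difference in means.

```python
import random
from statistics import mean


def assign_groups(participants, seed=0):
    """Randomly split participants into an enrolled group and a control group."""
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(participants)  # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def estimated_effect(outcomes, enrolled, control):
    """Difference between the mean outcome of the two groups."""
    enrolled_mean = mean(outcomes[p] for p in enrolled)
    control_mean = mean(outcomes[p] for p in control)
    return enrolled_mean - control_mean


# Hypothetical data: ten participant IDs and made-up outcome scores.
participants = list(range(10))
enrolled, control = assign_groups(participants, seed=42)
outcomes = {p: (12.0 if p in enrolled else 10.0) for p in participants}
effect = estimated_effect(outcomes, enrolled, control)
```

Because chance alone decides who enrolls, the two groups should be similar on average in everything except the program, which is why a difference in outcomes can be attributed to the program itself.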

No silver bullet: De-identification still doesn’t work


Arvind Narayanan and Edward W. Felten: “Paul Ohm’s 2009 article Broken Promises of Privacy spurred a debate in legal and policy circles on the appropriate response to computer science research on re-identification techniques. In this debate, the empirical research has often been misunderstood or misrepresented. A new report by Ann Cavoukian and Daniel Castro is full of such inaccuracies, despite its claims of “setting the record straight.” In a response to this piece, Ed Felten and I point out eight of our most serious points of disagreement with Cavoukian and Castro. The thrust of our arguments is that (i) there is no evidence that de-identification works either in theory or in practice and (ii) attempts to quantify its efficacy are unscientific and promote a false sense of security by assuming unrealistic, artificially constrained models of what an adversary might do. Specifically, we argue that:

  1. There is no known effective method to anonymize location data, and no evidence that it’s meaningfully achievable.
  2. Computing re-identification probabilities based on proof-of-concept demonstrations is silly.
  3. Cavoukian and Castro ignore many realistic threats by focusing narrowly on a particular model of re-identification.
  4. Cavoukian and Castro concede that de-identification is inadequate for high-dimensional data. But nowadays most interesting datasets are high-dimensional.
  5. Penetrate-and-patch is not an option.
  6. Computer science knowledge is relevant and highly available.
  7. Cavoukian and Castro apply different standards to big data and re-identification techniques.
  8. Quantification of re-identification probabilities, which permeates Cavoukian and Castro’s arguments, is a fundamentally meaningless exercise.

Data privacy is a hard problem. Data custodians face a choice between roughly three alternatives: sticking with the old habit of de-identification and hoping for the best; turning to emerging technologies like differential privacy that involve some trade-offs in utility and convenience; and using legal agreements to limit the flow and use of sensitive data. These solutions aren’t fully satisfactory, either individually or in combination, nor is any one approach the best in all circumstances. Change is difficult. When faced with the challenge of fostering data science while preventing privacy risks, the urge to preserve the status quo is understandable. However, this is incompatible with the reality of re-identification science. If a “best of both worlds” solution exists, de-identification is certainly not that solution. Instead of looking for a silver bullet, policy makers must confront hard choices.”

Incentivizing Peer Review


in Wired on “The Last Obstacle for Open Access Science: The Galapagos Islands’ Charles Darwin Foundation runs on an annual operating budget of about $3.5 million. With this money, the center conducts conservation research, enacts species-saving interventions, and provides educational resources about the fragile island ecosystems. As a science-based enterprise whose work would benefit greatly from the latest research findings on ecological management, evolution, and invasive species, there’s one glaring hole in the Foundation’s budget: the $800,000 it would cost per year for subscriptions to leading academic journals.
According to Richard Price, founder and CEO of Academia.edu, this episode is symptomatic of a larger problem. “A lot of research centers” – NGOs, academic institutions in the developing world – “are just out in the cold as far as access to top journals is concerned,” says Price. “Research is being commoditized, and it’s just another aspect of the digital divide between the haves and have-nots.”
 
Academia.edu is a key player in the movement toward open access scientific publishing, with over 11 million participants who have uploaded nearly 3 million scientific papers to the site. It’s easy to understand Price’s frustration with the current model, in which academics donate their time to review articles, pay for the right to publish articles, and pay for access to articles. According to Price, journals charge an average of $4000 per article: $1500 for production costs (reformatting, designing), $1500 to orchestrate peer review (labor costs for hiring editors, administrators), and $1000 of profit.
“If there were no legacy in the scientific publishing industry, and we were looking at the best way to disseminate and view scientific results,” proposes Price, “things would look very different. Our vision is to build a complete replacement for scientific publishing,” one that would allow budget-constrained organizations like the CDF full access to information that directly impacts their work.
But getting to a sustainable new world order requires a thorough overhaul of the academic publishing industry. The alternative vision – of “open science” – has two key properties: the uninhibited sharing of research findings, and a new peer review system that incorporates the best of the scientific community’s feedback. Several groups have made progress on the former, but the latter has proven particularly difficult given the current incentive structure. The currency of scientific research is the number of papers you’ve published and their citation counts – the number of times other researchers have referred to your work in their own publications. The emphasis is on creation of new knowledge – a worthy goal, to be sure – but substantial contributions to the quality, packaging, and contextualization of that knowledge in the form of peer review go largely unrecognized. As a result, researchers view their role as reviewers as a chore, a time-consuming task required to sustain the ecosystem of research dissemination.
“Several experiments in this space have tried to incorporate online comment systems,” explains Price, “and the result is that putting a comment box online and expecting high quality comments to flood in is just unrealistic. My preference is to come up with a system where you’re just as motivated to share your feedback on a paper as you are to share your own findings.” In order to make this lofty aim a reality, reviewers’ contributions would need to be recognized. “You need something more nuanced, and more qualitative,” says Price. “For example, maybe you gather reputation points from your community online.” Translating such metrics into tangible benefits up the food chain – hirings, tenure decisions, awards – is a broader community shift that will no doubt take time.
A more iterative peer review process could allow the community to better police faulty methods by crowdsourcing their evaluation. “90% of scientific studies are not reproducible,” claims Price, a problem that is exacerbated by the strong bias toward positive results. Journals may be unlikely to publish methodological refutations, but a flurry of well-supported comments attached to a paper online could convince the researchers to marshal more convincing evidence. Typically, this sort of feedback cycle takes years….”

Do We Choose Our Friends Because They Share Our Genes?


Rob Stein at NPR: “People often talk about how their friends feel like family. Well, there’s some new research out that suggests there’s more to that than just a feeling. People appear to be more like their friends genetically than they are to strangers, the research found.
“The striking thing here is that friends are actually significantly more similar to one another than we were expecting,” says James Fowler, a professor of medical genetics at the University of California, San Diego, who conducted the study with Nicholas A. Christakis, a social scientist at Yale University.
In fact, the study in Monday’s issue of the Proceedings of the National Academy of Sciences found that friends are as genetically similar as fourth cousins.
“It’s as if they shared a great-great-great-grandparent in common,” Fowler told Shots.
Some of the genes that friends were most likely to have in common involve smell. “We tend to smell things the same way that our friends do,” Fowler says. The study involved nearly 2,000 adults.
This suggests that as humans evolved, the ability to tolerate and be drawn to certain smells may have influenced where people hung out. Today we might call this the Starbucks effect.
“You may really love the smell of coffee. And you’re drawn to a place where other people have been drawn to who also love the smell of coffee,” Fowler says. “And so that might be the opportunity space for you to make friends. You’re all there together because you love coffee and you make friends because you all love coffee.”…”

The open data imperative


Paper by Geoffrey Boulton in Insights: the UKSG journal: “The information revolution of recent decades is a world historical event that is changing the lives of individuals, societies and economies and with major implications for science, research and learning. It offers profound opportunities to explore phenomena that were hitherto beyond our power to resolve, and at the same time is undermining the process whereby concurrent publication of scientific concept and evidence (data) permitted scrutiny, replication and refutation and that has been the bedrock of scientific progress and of ‘self-correction’ since the inception of the first scientific journals in the 17th century. Open publication, release and sharing of data are vital habits that need to be redefined and redeveloped for the modern age by the research community if it is to exploit technological opportunities, maintain self-correction and maximize the contribution of research to human understanding and welfare.”

Selected Readings on Crowdsourcing Expertise


The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of crowdsourcing was originally published in 2014.

Crowdsourcing enables leaders and citizens to work together to solve public problems in new and innovative ways. New tools and platforms enable citizens with differing levels of knowledge, expertise, experience and abilities to collaborate and solve problems together. Identifying experts, or individuals with specialized skills, knowledge or abilities with regard to a specific topic, and incentivizing their participation in crowdsourcing information, knowledge or experience to achieve a shared goal can enhance the efficiency and effectiveness of problem solving.

Annotated Selected Reading List (in alphabetical order)

Börner, Katy, Michael Conlon, Jon Corson-Rikert, and Ying Ding. “VIVO: A Semantic Approach to Scholarly Networking and Discovery.” Synthesis Lectures on the Semantic Web: Theory and Technology 2, no. 1 (October 17, 2012): 1–178. http://bit.ly/17huggT.

  • This e-book “provides an introduction to VIVO…a tool for representing information about research and researchers — their scholarly works, research interests, and organizational relationships.”
  • VIVO is a response to the fact that, “Information for scholars — and about scholarly activity — has not kept pace with the increasing demands and expectations. Information remains siloed in legacy systems and behind various access controls that must be licensed or otherwise negotiated before access. Information representation is in its infancy. The raw material of scholarship — the data and information regarding previous work — is not available in common formats with common semantics.”
  • Providing access to structured information on the work and experience of a diversity of scholars enables improved expert finding: “identifying and engaging experts whose scholarly works is of value to one’s own. To find experts, one needs rich data regarding one’s own work and the work of potential related experts.” The authors argue that expert finding is of increasing importance since, “[m]ulti-disciplinary and inter-disciplinary investigation is increasingly required to address complex problems.”

Bozzon, Alessandro, Marco Brambilla, Stefano Ceri, Matteo Silvestri, and Giuliano Vesci. “Choosing the Right Crowd: Expert Finding in Social Networks.” In Proceedings of the 16th International Conference on Extending Database Technology, 637–648. EDBT ’13. New York, NY, USA: ACM, 2013. http://bit.ly/18QbtY5.

  • This paper explores the challenge of selecting experts within the population of social networks by considering the following problem: “given an expertise need (expressed for instance as a natural language query) and a set of social network members, who are the most knowledgeable people for addressing that need?”
  • The authors come to the following conclusions:
    • “profile information is generally less effective than information about resources that they directly create, own or annotate;
    • resources which are produced by others (resources appearing on the person’s Facebook wall or produced by people that she follows on Twitter) help increasing the assessment precision;
    • Twitter appears the most effective social network for expertise matching, as it very frequently outperforms all other social networks (either combined or alone);
    • Twitter appears as well very effective for matching expertise in domains such as computer engineering, science, sport, and technology & games, but Facebook is also very effective in fields such as locations, music, sport, and movies & tv;
    • surprisingly, LinkedIn appears less effective than other social networks in all domains (including computer science) and overall.”

Brabham, Daren C. “The Myth of Amateur Crowds.” Information, Communication & Society 15, no. 3 (2012): 394–410. http://bit.ly/1hdnGJV.

  • Unlike most of the related literature, this paper focuses on bringing attention to the expertise already being tapped by crowdsourcing efforts rather than determining ways to identify more dormant expertise to improve the results of crowdsourcing.
  • Brabham comes to two central conclusions: “(1) crowdsourcing is discussed in the popular press as a process driven by amateurs and hobbyists, yet empirical research on crowdsourcing indicates that crowds are largely self-selected professionals and experts who opt-in to crowdsourcing arrangements; and (2) the myth of the amateur in crowdsourcing ventures works to label crowds as mere hobbyists who see crowdsourcing ventures as opportunities for creative expression, as entertainment, or as opportunities to pass the time when bored. This amateur/hobbyist label then undermines the fact that large amounts of real work and expert knowledge are exerted by crowds for relatively little reward and to serve the profit motives of companies.”

Dutton, William H. Networking Distributed Public Expertise: Strategies for Citizen Sourcing Advice to Government. One of a Series of Occasional Papers in Science and Technology Policy, Science and Technology Policy Institute, Institute for Defense Analyses, February 23, 2011. http://bit.ly/1c1bpEB.

  • In this paper, a case is made for more structured and well-managed crowdsourcing efforts within government. Specifically, the paper “explains how collaborative networking can be used to harness the distributed expertise of citizens, as distinguished from citizen consultation, which seeks to engage citizens — each on an equal footing.” Instead of looking for answers from an undefined crowd, Dutton proposes “networking the public as advisors” by seeking to “involve experts on particular public issues and problems distributed anywhere in the world.”
  • Dutton argues that expert-based crowdsourcing can be successful for government for a number of reasons:
    • Direct communication with a diversity of independent experts
    • The convening power of government
    • Compatibility with open government and open innovation
    • Synergy with citizen consultation
    • Building on experience with paid consultants
    • Speed and urgency
    • Centrality of documents to policy and practice.
  • He also proposes a nine-step process for government to foster bottom-up collaboration networks:
    • Do not reinvent the technology
    • Focus on activities, not the tools
    • Start small, but capable of scaling up
    • Modularize
    • Be open and flexible in finding and going to communities of experts
    • Do not concentrate on one approach to all problems
    • Cultivate the bottom-up development of multiple projects
    • Experience networking and collaborating — be a networked individual
    • Capture, reward, and publicize success.

Goel, Gagan, Afshin Nikzad and Adish Singla. “Matching Workers with Tasks: Incentives in Heterogeneous Crowdsourcing Markets.” Under review by the International World Wide Web Conference (WWW). 2014. http://bit.ly/1qHBkdf

  • Combining the notions of crowdsourcing expertise and crowdsourcing tasks, this paper focuses on the challenge within platforms like Mechanical Turk related to intelligently matching tasks to workers.
  • The authors’ call for more strategic assignment of tasks in crowdsourcing markets is based on the understanding that “each worker has certain expertise and interests which define the set of tasks she can and is willing to do.”
  • Focusing on developing meaningful incentives based on varying levels of expertise, the authors sought to create a mechanism that, “i) is incentive compatible in the sense that it is truthful for agents to report their true cost, ii) picks a set of workers and assigns them to the tasks they are eligible for in order to maximize the utility of the requester, iii) makes sure total payments made to the workers doesn’t exceed the budget of the requester.”

Gubanov, D., N. Korgin, D. Novikov and A. Kalkov. E-Expertise: Modern Collective Intelligence. Springer, Studies in Computational Intelligence 558, 2014. http://bit.ly/U1sxX7

  • In this book, the authors focus on “organization and mechanisms of expert decision-making support using modern information and communication technologies, as well as information analysis and collective intelligence technologies (electronic expertise or simply e-expertise).”
  • The book, which “addresses a wide range of readers interested in management, decision-making and expert activity in political, economic, social and industrial spheres,” is broken into five chapters:
    • Chapter 1 (E-Expertise) discusses the role of e-expertise in decision-making processes. The procedures of e-expertise are classified, their benefits and shortcomings are identified, and the efficiency conditions are considered.
    • Chapter 2 (Expert Technologies and Principles) provides a comprehensive overview of modern expert technologies. A special emphasis is placed on the specifics of e-expertise. Moreover, the authors study the feasibility and reasonability of employing well-known methods and approaches in e-expertise.
    • Chapter 3 (E-Expertise: Organization and Technologies) describes some examples of up-to-date technologies to perform e-expertise.
    • Chapter 4 (Trust Networks and Competence Networks) deals with the problems of expert finding and grouping by information and communication technologies.
    • Chapter 5 (Active Expertise) treats the problem of expertise stability against any strategic manipulation by experts or coordinators pursuing individual goals.

Holst, Cathrine. “Expertise and Democracy.” ARENA Report No 1/14, Center for European Studies, University of Oslo. http://bit.ly/1nm3rh4

  • This report contains a set of 16 papers focused on the concept of “epistocracy,” meaning the “rule of knowers.” The papers inquire into the role of knowledge and expertise in modern democracies and especially in the European Union (EU). Major themes are: expert-rule and democratic legitimacy; the role of knowledge and expertise in EU governance; and the European Commission’s use of expertise.
    • Expert-rule and democratic legitimacy
      • Papers within this theme concentrate on issues such as the “implications of modern democracies’ knowledge and expertise dependence for political and democratic theory.” Topics include the accountability of experts, the legitimacy of expert arrangements within democracies, the role of evidence in policy-making, how expertise can be problematic in democratic contexts, and “ethical expertise” and its place in epistemic democracies.
    • The role of knowledge and expertise in EU governance
      • Papers within this theme concentrate on “general trends and developments in the EU with regard to the role of expertise and experts in political decision-making, the implications for the EU’s democratic legitimacy, and analytical strategies for studying expertise and democratic legitimacy in an EU context.”
    • The European Commission’s use of expertise
      • Papers within this theme concentrate on how the European Commission uses expertise and in particular the European Commission’s “expertgroup system.” Topics include the European Citizens’ Initiative, analytic-deliberative processes in EU food safety, the operation of EU environmental agencies, and the autonomy of various EU agencies.

King, Andrew and Karim R. Lakhani. “Using Open Innovation to Identify the Best Ideas.” MIT Sloan Management Review, September 11, 2013. http://bit.ly/HjVOpi.

  • In this paper, King and Lakhani examine different methods for opening innovation, where, “[i]nstead of doing everything in-house, companies can tap into the ideas cloud of external expertise to develop new products and services.”
  • The three types of open innovation discussed are: opening the idea-creation process, competitions where prizes are offered and designers bid with possible solutions; opening the idea-selection process, ‘approval contests’ in which outsiders vote to determine which entries should be pursued; and opening both idea generation and selection, an option used especially by organizations focused on quickly changing needs.

Long, Chengjiang, Gang Hua and Ashish Kapoor. “Active Visual Recognition with Expertise Estimation in Crowdsourcing.” 2013 IEEE International Conference on Computer Vision. December 2013. http://bit.ly/1lRWFur.

  • This paper is focused on improving the crowdsourced labeling of visual datasets from platforms like Mechanical Turk. The authors note that, “Although it is cheap to obtain large quantity of labels through crowdsourcing, it has been well known that the collected labels could be very noisy. So it is desirable to model the expertise level of the labelers to ensure the quality of the labels. The higher the expertise level a labeler is at, the lower the label noises he/she will produce.”
  • Based on the need for identifying expert labelers upfront, the authors developed an “active classifier learning system which determines which users to label which unlabeled examples” from collected visual datasets.
  • The researchers’ experiments in identifying expert visual dataset labelers led to findings demonstrating that the “active selection” of expert labelers is beneficial in cutting through the noise of crowdsourcing platforms.
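
The general idea of modeling labeler expertise can be illustrated with a much simpler stand-in than the authors' active learning system, which is not reproduced here. In this sketch, each labeler's accuracy is estimated on a few items whose true labels are known, and votes on new items are weighted by that accuracy. All names, labels and the fallback weight of 0.5 are assumptions made for illustration.

```python
from collections import defaultdict


def estimate_accuracy(labels, gold):
    """Estimate each labeler's accuracy on items with known (gold) answers.

    labels: {labeler: {item: label}}; gold: {item: true_label}.
    """
    accuracy = {}
    for labeler, answers in labels.items():
        scored = [item for item in answers if item in gold]
        correct = sum(answers[item] == gold[item] for item in scored)
        # Assumed fallback: an untested labeler gets a neutral weight of 0.5.
        accuracy[labeler] = correct / len(scored) if scored else 0.5
    return accuracy


def weighted_vote(labels, accuracy, item):
    """Pick the label with the largest accuracy-weighted support."""
    totals = defaultdict(float)
    for labeler, answers in labels.items():
        if item in answers:
            totals[answers[item]] += accuracy[labeler]
    return max(totals, key=totals.get)


# Hypothetical labelers: "g1" and "g2" have known answers, "x" does not.
labels = {
    "ann": {"g1": "cat", "g2": "dog", "x": "cat"},
    "bob": {"g1": "dog", "g2": "dog", "x": "dog"},
    "cy": {"g1": "dog", "g2": "cat", "x": "dog"},
}
gold = {"g1": "cat", "g2": "dog"}

accuracy = estimate_accuracy(labels, gold)
winner = weighted_vote(labels, accuracy, "x")
```

In this made-up example a plain majority vote would label item "x" as "dog", while the weighted vote follows the one labeler who answered both known items correctly and returns "cat".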

Noveck, Beth Simone. “’Peer to Patent’: Collective Intelligence, Open Review, and Patent Reform.” Harvard Journal of Law & Technology 20, no. 1 (Fall 2006): 123–162. http://bit.ly/HegzTT.

  • This law review article introduces the idea of crowdsourcing expertise to mitigate the challenge of patent processing. Noveck argues that, “access to information is the crux of the patent quality problem. Patent examiners currently make decisions about the grant of a patent that will shape an industry for a twenty-year period on the basis of a limited subset of available information. Examiners may neither consult the public, talk to experts, nor, in many cases, even use the Internet.”
  • Peer-to-Patent, which launched three years after this article, is based on the idea that, “The new generation of social software might not only make it easier to find friends but also to find expertise that can be applied to legal and policy decision-making. This way, we can improve upon the Constitutional promise to promote the progress of science and the useful arts in our democracy by ensuring that only worthy ideas receive that ‘odious monopoly’ of which Thomas Jefferson complained.”

Ober, Josiah. “Democracy’s Wisdom: An Aristotelian Middle Way for Collective Judgment.” American Political Science Review 107, no. 01 (2013): 104–122. http://bit.ly/1cgf857.

  • In this paper, Ober argues that, “A satisfactory model of decision-making in an epistemic democracy must respect democratic values, while advancing citizens’ interests, by taking account of relevant knowledge about the world.”
  • Ober describes an approach to decision-making that aggregates expertise across multiple domains. This “Relevant Expertise Aggregation (REA) enables a body of minimally competent voters to make superior choices among multiple options, on matters of common interest.”

Sims, Max H., Jeffrey Bigham, Henry Kautz and Marc W. Halterman. “Crowdsourcing medical expertise in near real time.” Journal of Hospital Medicine 9, no. 7, July 2014. http://bit.ly/1kAKvq7.

  • In this article, the authors discuss the development of a mobile application called DocCHIRP, which was built because, “although the Internet creates unprecedented access to information, gaps in the medical literature and inefficient searches often leave healthcare providers’ questions unanswered.”
  • The DocCHIRP pilot project used a “system of point-to-multipoint push notifications designed to help providers problem solve by crowdsourcing from their peers.”
  • Healthcare providers (HCPs) sought to gain intelligence from the crowd, which included 85 registered users, on questions related to medication, complex medical decision making, standard of care, administrative, testing and referrals.
  • The authors believe that, “if future iterations of the mobile crowdsourcing applications can address…adoption barriers and support the organic growth of the crowd of HCPs,” then “the approach could have a positive and transformative effect on how providers acquire relevant knowledge and care for patients.”

Spina, Alessandro. “Scientific Expertise and Open Government in the Digital Era: Some Reflections on EFSA and Other EU Agencies.” In Foundations of EU Food Law and Policy, eds. A. Alemanno and S. Gabbi. Ashgate, 2014. http://bit.ly/1k2EwdD.

  • In this paper, Spina “presents some reflections on how the collaborative and crowdsourcing practices of Open Government could be integrated in the activities of EFSA [European Food Safety Authority] and other EU agencies,” with a particular focus on “highlighting the benefits of the Open Government paradigm for expert regulatory bodies in the EU.”
  • Spina argues that the “crowdsourcing of expertise and the reconfiguration of the information flows between European agencies and the public could represent a concrete possibility of modernising the role of agencies with a new model that has a low financial burden and an almost immediate effect on the legal governance of agencies.”
  • He concludes that, “It is becoming evident that in order to guarantee that the best scientific expertise is provided to EU institutions and citizens, EFSA should strive to use the best organisational models to source science and expertise.”

Diffusers of Useful Knowledge


Book review of Visions of Science: Books and Readers at the Dawn of the Victorian Age (By James A Secord): “For a moment in time, just before Victoria became queen, popular science seemed to offer answers to everything. Around 1830, revolutionary information technology – steam-powered presses and paper-making machines – made possible the dissemination of ‘useful knowledge’ to a mass public. At that point professional scientists scarcely existed as a class, but there were genteel amateur researchers who, with literary panache, wrote for a fascinated lay audience.
The term ‘scientist’ was invented only in 1833, by the polymath William Whewell, who gave it a faintly pejorative odour, drawing analogies to ‘journalist’, ‘sciolist’, ‘atheist’, and ‘tobacconist’. ‘Better die … than bestialise our tongue by such barbarisms,’ scowled the geologist Adam Sedgwick. ‘To anyone who respects the English language,’ said T H Huxley, ‘I think “Scientist” must be about as pleasing a word as “Electrocution”.’ These men preferred to call themselves ‘natural philosophers’ and there was a real distinction. Scientists were narrowly focused utilitarian data-grubbers; natural philosophers thought deeply and wrote elegantly about the moral, cosmological and metaphysical implications of their work….
Visions of Science offers vignettes of other pre-Darwin scientific writers who generated considerable buzz in their day. Consolations in Travel, a collection of meta-scientific musings by the chemist Humphry Davy, published in 1830, played a salient role in the plot of The Tenant of Wildfell Hall (1848), with Anne Brontë being reasonably confident that her readers would get the reference. The general tone of such works was exemplified by the astronomer John Herschel in Preliminary Discourse on the Study of Natural Philosophy (1831) – clear, empirical, accessible, supremely rational and even-tempered. These authors communicated a democratic faith that science could be mastered by anyone, perhaps even a woman.
Mary Somerville’s On the Connexion of the Physical Sciences (1834) pulled together mathematics, astronomy, electricity, light, sound, chemistry and meteorology in a grand middlebrow synthesis. She even promised her readers that the sciences were converging on some kind of unified field theory, though that particular Godot has never arrived. For several decades the book sold hugely and was pirated widely, but as scientists became more specialised and professional, it began to look like a hodgepodge. Writing in Nature in 1874, James Clerk Maxwell could find no theme in her pudding, calling it a miscellany unified only by the bookbinder.
The same scientific populism made possible the brief supernova of phrenology. Anyone could learn the fairly simple art of reading bumps on the head once the basics had been broadcast by new media. The first edition of George Combe’s phrenological treatise The Constitution of Man, priced at six shillings, sold barely a hundred copies a year. But when the state-of-the-art steam presses of Chambers’s Edinburgh Journal (the first mass-market periodical) produced a much cheaper version, 43,000 copies were snapped up in a matter of months. What the phrenologists could not produce were research papers backing up their claims, and a decade later the movement was moribund.
Charles Babbage, in designing his ‘difference engine’, anticipated all the basic principles of the modern computer – including ‘garbage in, garbage out’. In Reflections on the Decline of Science in England (1830) he accused his fellow scientists of routinely suppressing, concocting or cooking data. Such corruption (he confidently insisted) could be cleaned up if the government generously subsidised scientific research. That may seem naive today, when we are all too aware that scientists often fudge results to keep the research money flowing. Yet in the era of the First Reform Act, everything appeared to be reformable. Babbage even stood for parliament in Finsbury, on a platform of freedom of information for all. But he split the scientific radical vote with Thomas Wakley, founder of The Lancet, and the Tory swept home.
After his sketches of these forgotten bestsellers, Secord concludes with the literary bomb that blew them all up. In Sartor Resartus Thomas Carlyle fiercely deconstructed everything the popular scientists stood for. Where they were cool, rational, optimistic and supremely organised, he was frenzied, mystical, apocalyptic and deliberately nonsensical. They assumed that big data represented reality; he saw that it might be all pretence, fabrication, image – in a word, ‘clothes’. A century and a half before Microsoft’s emergence, Carlyle grasped the horror of universal digitisation: ‘Shall your Science proceed in the small chink-lighted, or even oil-lighted, underground workshop of Logic alone; and man’s mind become an Arithmetical Mill?’ That was a dig at the clockwork utilitarianism of both John Stuart Mill and Babbage: the latter called his central processing unit a ‘mill’.
The scientific populists sincerely aimed to democratise information. But when the movement was institutionalised in the form of mechanics’ institutes and the Society for the Diffusion of Useful Knowledge, did it aim at anything more than making workers more productive? Babbage never completed his difference engine, in part because he treated human beings – including the artisans who were supposed to execute his designs – as programmable machines. And he was certain that Homo sapiens was not the highest form of intelligence in the universe. On another planet somewhere, he suggested, the Divine Programmer must have created Humanity 2.0….”

Eigenmorality


Blog from Scott Aaronson: “This post is about an idea I had around 1997, when I was 16 years old and a freshman computer-science major at Cornell.  Back then, I was extremely impressed by a research project called CLEVER, which one of my professors, Jon Kleinberg, had led while working at IBM Almaden.  The idea was to use the link structure of the web itself to rank which web pages were most important, and therefore which ones should be returned first in a search query.  Specifically, Kleinberg defined “hubs” as pages that linked to lots of “authorities,” and “authorities” as pages that were linked to by lots of “hubs.”  At first glance, this definition seems hopelessly circular, but Kleinberg observed that one can break the circularity by just treating the World Wide Web as a giant directed graph, and doing some linear algebra on its adjacency matrix.  Equivalently, you can imagine an iterative process where each web page starts out with the same hub/authority “starting credits,” but then in each round, the pages distribute their credits among their neighbors, so that the most popular pages get more credits, which they can then, in turn, distribute to their neighbors by linking to them.
I was also impressed by a similar research project called PageRank, which was proposed later by two guys at Stanford named Sergey Brin and Larry Page.  Brin and Page dispensed with Kleinberg’s bipartite hubs-and-authorities structure in favor of a more uniform structure, and made some other changes, but otherwise their idea was very similar.  At the time, of course, I didn’t know that CLEVER was going to languish at IBM, while PageRank (renamed Google) was going to expand to roughly the size of the entire world’s economy.
In any case, the question I asked myself about CLEVER/PageRank was not the one that, maybe in retrospect, I should have asked: namely, “how can I leverage the fact that I know the importance of this idea before most people do, in order to make millions of dollars?”
Instead I asked myself: “what other ‘vicious circles’ in science and philosophy could one unravel using the same linear-algebra trick that CLEVER and PageRank exploit?”  After all, CLEVER and PageRank were both founded on what looked like a hopelessly circular intuition: “a web page is important if other important web pages link to it.”  Yet they both managed to use math to defeat the circularity.  All you had to do was find an “importance equilibrium,” in which your assignment of “importance” to each web page was stable under a certain linear map.  And such an equilibrium could be shown to exist—indeed, to exist uniquely.
Searching for other circular notions to elucidate using linear algebra, I hit on morality.  Philosophers from Socrates on, I was vaguely aware, had struggled to define what makes a person “moral” or “virtuous,” without tacitly presupposing the answer.  Well, it seemed to me that, as a first attempt, one could do a lot worse than the following:

A moral person is someone who cooperates with other moral people, and who refuses to cooperate with immoral people.

Obviously one can quibble with this definition on numerous grounds: for example, what exactly does it mean to “cooperate,” and which other people are relevant here?  If you don’t donate money to starving children in Africa, have you implicitly “refused to cooperate” with them?  What’s the relative importance of cooperating with good people and withholding cooperation with bad people, of kindness and justice?  Is there a duty not to cooperate with bad people, or merely the lack of a duty to cooperate with them?  Should we consider intent, or only outcomes?  Surely we shouldn’t hold someone accountable for sheltering a burglar, if they didn’t know about the burgling?  Also, should we compute your “total morality” by simply summing over your interactions with everyone else in your community?  If so, then can a career’s worth of lifesaving surgeries numerically overwhelm the badness of murdering a single child?
For now, I want you to set all of these important questions aside, and just focus on the fact that the definition doesn’t even seem to work on its own terms, because of circularity.  How can we possibly know which people are moral (and hence worthy of our cooperation), and which ones immoral (and hence unworthy), without presupposing the very thing that we seek to define?
Ah, I thought—this is precisely where linear algebra can come to the rescue!  Just like in CLEVER or PageRank, we can begin by giving everyone in the community an equal number of “morality starting credits.”  Then we can apply an iterative update rule, where each person A can gain morality credits by cooperating with each other person B, and A gains more credits the more credits B has already.  We apply the rule over and over, until the number of morality credits per person converges to an equilibrium.  (Or, of course, we can shortcut the process by simply finding the principal eigenvector of the “cooperation matrix,” using whatever algorithm we like.)  We then have our objective measure of morality for each individual, solving a 2400-year-old open problem in philosophy….”
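The iterative update described in the excerpt can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Aaronson's own implementation: the function name `morality_scores`, the 0/1 `COOPERATION` matrix, and the four-person community are all hypothetical, and the sketch only rewards cooperation (it does not model penalties for cooperating with immoral people). The loop is plain power iteration, which converges to the principal eigenvector when the cooperation graph is connected and not periodic, as it is in this example.

```python
def morality_scores(cooperation, tol=1e-12, max_iter=1000):
    """Power iteration on a nonnegative 'cooperation matrix'.

    cooperation[a][b] > 0 means person a cooperates with person b.
    Returns credits normalised to sum to 1.
    """
    n = len(cooperation)
    # Everyone begins with the same number of starting credits.
    credits = [1.0 / n] * n
    for _ in range(max_iter):
        # Person a gains credits from each b they cooperate with,
        # weighted by how many credits b already holds.
        updated = [
            sum(cooperation[a][b] * credits[b] for b in range(n))
            for a in range(n)
        ]
        total = sum(updated)
        if total == 0:
            raise ValueError("nobody cooperates with anybody")
        updated = [c / total for c in updated]
        # Stop once the credits no longer change between rounds.
        if max(abs(u - c) for u, c in zip(updated, credits)) < tol:
            return updated
        credits = updated
    return credits


# Hypothetical community: persons 0, 1 and 2 all cooperate with one
# another, while person 3 cooperates only with person 0.
COOPERATION = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]

scores = morality_scores(COOPERATION)
ranking = sorted(range(len(scores)), key=lambda p: scores[p], reverse=True)
print(ranking[0], ranking[-1])  # highest-ranked and lowest-ranked person
```

In this toy community the person who cooperates with everyone ends up with the most credits, and the person with a single cooperative tie ends up with the fewest. The same function would serve for ranking web pages if the matrix recorded links instead of cooperation.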

Want to Brainstorm New Ideas? Then Limit Your Online Connections


Steve Lohr in the New York Times: “The digitally connected life is both invaluable and inevitable.

Anyone who has the slightest doubt need only walk down the sidewalk of any city street filled with people checking their smartphones for text messages, tweets, news alerts or weather reports or any number of things. So glued to their screens, they run into people or create pedestrian traffic jams.

Just when all the connectedness is useful and when it’s not is often difficult to say. But a recent research paper, published on the Social Science Research Network, titled “Facts and Figuring,” sheds some light on that question.

The research involved customizing a Pentagon lab program for measuring collaboration and information-sharing — a whodunit game, in which the subjects sitting at computers search for clues and solutions to figure out the who, what, when and where of a hypothetical terrorist attack.

The 417 subjects played more than 1,100 rounds of the 25-minute web-based game. They were mostly students from the Boston area, selected from the pool of volunteers in the Harvard Decision Science Laboratory and Harvard Business School’s Computer Lab for Experimental Research.

They could share clues and solutions. But the study was designed to measure the results from different network structures — densely clustered networks and unclustered networks of communication. Problem solving, the researchers write, involves “both search for information and search for solutions.” They found that “clustering promotes exploration in information space, but decreases exploration in solution space.”

In looking for unique facts or clues, clustering helped since members of the dense communications networks effectively split up the work and redundant facts were quickly weeded out, making them five percent more efficient. But the number of unique theories or solutions was 17.5 percent higher among subjects who were not densely connected. Clustering reduced the diversity of ideas.

The research paper, said Jesse Shore, a co-author and assistant professor at the Boston University School of Management, contributes to “the growing awareness that being connected all the time has costs. And we put a number to it, in an experimental setting.”

The research, of course, also showed where the connection paid off — finding information, the vital first step in decision making. “There are huge, huge benefits to information sharing,” said Ethan Bernstein, a co-author and assistant professor at the Harvard Business School. “But the costs are harder to measure.”…”

Facebook tinkered with users’ feeds for a massive psychology experiment


William Hughes in AVClub: “Scientists at Facebook have published a paper showing that they manipulated the content seen by more than 600,000 users in an attempt to determine whether this would affect their emotional state. The paper, “Experimental evidence of massive-scale emotional contagion through social networks,” was published in The Proceedings Of The National Academy Of Sciences. It shows how Facebook data scientists tweaked the algorithm that determines which posts appear on users’ news feeds—specifically, researchers skewed the number of positive or negative terms seen by randomly selected users. Facebook then analyzed the future postings of those users over the course of a week to see if people responded with increased positivity or negativity of their own, thus answering the question of whether emotional states can be transmitted across a social network. Result: They can! Which is great news for Facebook data scientists hoping to prove a point about modern psychology. It’s less great for the people having their emotions secretly manipulated.

In order to sign up for Facebook, users must click a box saying they agree to the Facebook Data Use Policy, giving the company the right to access and use the information posted on the site. The policy lists a variety of potential uses for your data, most of them related to advertising, but there’s also a bit about “internal operations, including troubleshooting, data analysis, testing, research and service improvement.” In the study, the authors point out that they stayed within the data policy’s liberal constraints by using machine analysis to pick out positive and negative posts, meaning no user data containing personal information was actually viewed by human researchers. And there was no need to ask study “participants” for consent, as they’d already given it by agreeing to Facebook’s terms of service in the first place.

Facebook data scientist Adam Kramer is listed as the study’s lead author. In an interview the company released a few years ago, Kramer is quoted as saying he joined Facebook because “Facebook data constitutes the largest field study in the history of the world.”

See also:
Facebook Experiments Had Few Limits: Data Science Lab Conducted Tests on Users With Little Oversight, Wall Street Journal.
Stop complaining about the Facebook study. It’s a golden age for research, Duncan Watts.