Suvodeep Mazumdar, Stuart Wrigley and Fabio Ciravegna in Remote Sensing: “The impact of crowdsourcing and citizen science activities on academia, businesses, governance and society has been enormous. This is more prevalent today, with citizens and communities collaborating with organizations, businesses and authorities to contribute in a variety of ways, from acting as mere data providers to being key stakeholders in various decision-making processes. The “Crowdsourcing for observations from Satellites” project is a recently concluded study supported by demonstration projects funded by the European Space Agency (ESA). The objective of the project was to investigate the different facets of how crowdsourcing and citizen science impact upon the validation, use and enhancement of Observations from Satellites (OS) products and services. This paper presents our findings from a stakeholder analysis activity involving participants who are experts in crowdsourcing and citizen science for Earth Observations. The activity identified three critical areas that need attention from the community and offers suggestions that could help address some of the challenges identified….(More)”.
Be the Change: Saving the World with Citizen Science
From the book’s description: “It’s so easy to be overwhelmed by everything that is wrong in the world. In 2010, there were 660,000 deaths from malaria. Dire predictions about climate change suggest that sea levels could rise enough to submerge both Los Angeles and London by 2100. Bees are dying, not by the thousands but by the millions.
But what can you do? You’re just one person, right? The good news is that you *can* do something.
It’s called citizen science, and it’s a way for ordinary people like you and me to do real, honest-to-goodness, help-answer-the-big-questions science.
This book introduces you to a world in which it is possible to go on a wildlife survey in a national park, install software on your computer to search for a cure for cancer, have your smartphone log the sound pollution in your city, transcribe ancient Greek scrolls, or sift through the dirt from a site where a mastodon died 11,000 years ago—even if you never finished high school….(More)”
Algorithmic Life
Review of several books by Massimo Mazzotti at LARB: “…As a historian of science, I have been trained to think of algorithms as sets of instructions for solving certain problems — and so as neither glamorous nor threatening. Insert the correct input, follow the instructions, and voilà, the desired output. A typical example would be the mathematical formulas used since antiquity to calculate the position of a celestial body at a given time. In the case of a digital algorithm, the instructions need to be translated into a computer program — they must, in other words, be “mechanizable.” Understood in this way — as mechanizable instructions — algorithms were around long before the dawn of electronic computers. Not only were they implemented in mechanical calculating devices, they were used by humans who behaved in machine-like fashion. Indeed, in the pre-digital world, the very term “computer” referred to a human who performed calculations according to precise instructions — like the 200 women trained at the University of Pennsylvania to perform ballistic calculations during World War II. In her classic article “When Computers Were Women,” historian Jennifer Light recounts their long-forgotten story, which takes place right before those algorithmic procedures were automated by ENIAC, the first electronic general-purpose computer.
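As a concrete illustration of “mechanizable instructions”, the following minimal Python sketch computes an approximate position of a celestial body, the Sun’s ecliptic longitude, from the days elapsed since the J2000.0 epoch. The coefficients follow a commonly cited low-precision almanac formula; the function name and structure are our own illustration and are not drawn from the books under review.

```python
import math

def sun_ecliptic_longitude(days_since_j2000: float) -> float:
    """Approximate the Sun's ecliptic longitude in degrees, using a
    low-precision formula (roughly 0.01-degree accuracy near the present era)."""
    n = days_since_j2000
    mean_longitude = (280.460 + 0.9856474 * n) % 360.0           # degrees
    mean_anomaly = math.radians((357.528 + 0.9856003 * n) % 360.0)
    # The "equation of centre" corrects the mean position to the true one.
    return (mean_longitude
            + 1.915 * math.sin(mean_anomaly)
            + 0.020 * math.sin(2 * mean_anomaly)) % 360.0

# Same input, same instructions, same output: the defining property of an
# algorithm in the pre-digital sense described above.
print(sun_ecliptic_longitude(0.0))
```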
Terse definitions have now disappeared, however. We rarely use the word “algorithm” to refer solely to a set of instructions. Rather, the word now usually signifies a program running on a physical machine — as well as its effects on other systems. Algorithms have thus become agents, which is partly why they give rise to so many suggestive metaphors. Algorithms now do things. They determine important aspects of our social reality. They generate new forms of subjectivity and new social relationships. They are how a billion-plus people get where they’re going. They free us from sorting through multitudes of irrelevant results. They drive cars. They manufacture goods. They decide whether a client is creditworthy. They buy and sell stocks, thus shaping all-powerful financial markets. They can even be creative; indeed, according to engineer and author Christopher Steiner, they have already composed symphonies “as moving as those composed by Beethoven.”
Do they, perhaps, do too much? That’s certainly the opinion of a slew of popular books on the topic, with titles like Automate This: How Algorithms Took Over Our Markets, Our Jobs, and the World.
Algorithms have captured the scholarly imagination every bit as much as the popular one. Academics variously describe them as a new technology, a particular form of decision-making, the incarnation of a new epistemology, the carriers of a new ideology, and even as a veritable modern myth — a way of saying something, a type of speech that naturalizes beliefs and worldviews. In an article published in 2009 entitled “Power Through the Algorithm,” sociologist David Beer describes algorithms as expressions of a new rationality and form of social organization. He’s onto something fundamental that’s worth exploring further: scientific knowledge and machines are never just neutral instruments. They embody, express, and naturalize specific cultures — and shape how we live according to the assumptions and priorities of those cultures….(More)”
Citizenship, Social Media, and Big Data
Homero Gil de Zúñiga and Trevor Diehl introducing a Special Issue of the Social Science Computer Review: “This special issue of the Social Science Computer Review provides a sample of the latest strategies employing large data sets in social media and political communication research. The proliferation of information communication technologies, social media, and the Internet, alongside the ubiquity of high-performance computing and storage technologies, has ushered in the era of computational social science. However, in no way does the use of “big data” represent a standardized area of inquiry in any field. This article briefly summarizes pressing issues when employing big data for political communication research. Major challenges remain to ensure the validity and generalizability of findings. Strong theoretical arguments are still a central part of conducting meaningful research. In addition, ethical practices concerning how data are collected remain an area of open discussion. The article surveys studies that offer unique and creative ways to combine methods and introduce new tools while at the same time addressing some solutions to ethical questions. (See Table of Contents)”
Data Maturity Framework
Center for Data Science and Public Policy: “Want to know if your organization is ready to start a data-driven social impact project? See where you are in our data maturity framework and how to improve your organizational, tech, and data readiness.
The Data Maturity Framework has three content areas:
- Problem Definition
- Data and Technology Readiness
- Organizational Readiness
The Data Maturity Framework consists of:
- A questionnaire and survey to assess readiness
- Data and Technology Readiness Matrix
- Organizational Readiness Matrix
The framework materials can be downloaded here, and you can complete our survey here. When we collect enough responses from enough organizations, we’ll launch an aggregate benchmarking report around the state of data in non-profits and government organizations. We ask that each problem be entered as a separate entry (rather than multiple problems from one organization entered in the same response).
We have adapted the Data Maturity Framework for specific projects:
- Lead-prevention project: organizational readiness and data and tech readiness…(More)”.
Public services and the new age of data
John Manzoni at Civil Service Quarterly: “Government holds massive amounts of data. The potential in that data for transforming the way government makes policy and delivers public services is equally huge. So, getting data right is the next phase of public service reform. And the UK Government has a strong foundation on which to build this future.
Public services have a long and proud relationship with data. In 1858, more than 50 years before the creation of the Cabinet Office, Florence Nightingale produced her famous ‘Diagram of the causes of mortality in the army in the east’ during the Crimean War. The modern era of statistics in government was born at the height of the Second World War with the creation of the Central Statistical Office in 1941.
However, the huge advances we’ve seen in technology mean there are significant new opportunities to use data to improve public services. It can help us:
- understand what works and what doesn’t, through data science techniques, so we can make better decisions: improving the way government works and saving money
- change the way citizens interact with government through new, better digital services built on reliable data
- boost the UK economy by opening and sharing better quality data, in a secure and sensitive way, to stimulate new data-based businesses
- demonstrate a trustworthy approach to data, so citizens know more about the information held about them and how and why it’s being used
In 2011 the Government embarked upon a radical improvement in its digital capability with the creation of the Government Digital Service, and over the last few years we have seen a similar revolution begin on data. Although there is much more to do, in areas like open data, the UK is already seen as world-leading.
…But if government is going to seize this opportunity, it needs to make some changes in:
- infrastructure – data is too often hard to find, hard to access, and hard to work with; so government is introducing developer-friendly open registers of trusted core data, such as countries and local authorities, and better tools to find and access personal data where appropriate through APIs for transformative digital services (see the sketch after this list);
- approach – we need the right policies in place to enable us to get the most out of data for citizens and ensure we’re acting appropriately; and the introduction of new legislation on data access will ensure government is doing the right thing – for example, through the data science code of ethics;
- data science skills – those working in government need the skills to be confident with data; that means recruiting more data scientists, developing data science skills across government, and using those skills on transformative projects….(More)”.
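As a rough sketch of what “developer-friendly open registers” accessed “through APIs” might look like to a developer, the snippet below fetches a register of countries as JSON and indexes it by country code. The endpoint URL, field names, and response shape are assumptions made for illustration; any real registers service defines its own interface.

```python
import json
import urllib.request

# Hypothetical endpoint and response shape; a real register defines its own API.
REGISTER_URL = "https://example.gov.uk/registers/country/records.json"

def load_country_register(url: str = REGISTER_URL) -> dict:
    """Fetch an open register of countries and key it by country code."""
    with urllib.request.urlopen(url) as response:
        records = json.load(response)
    # Assumed shape: {"GB": {"name": "United Kingdom", ...}, ...}
    return {code: entry.get("name", "") for code, entry in records.items()}

if __name__ == "__main__":
    countries = load_country_register()
    print(f"{len(countries)} countries in the register")
```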
Scientists have a word for studying the post-truth world: agnotology
But scientists have another word for “post-truth”. You might have heard of epistemology, or the study of knowledge. This field helps define what we know and why we know it. On the flip side of this is agnotology, or the study of ignorance. Agnotology is not often discussed, because studying the absence of something — in this case knowledge — is incredibly difficult.
Doubt is our product
Agnotology is more than the study of what we don’t know; it’s also the study of why we are not supposed to know it. One of its more important aspects is revealing how people, usually powerful ones, use ignorance as a strategic tool to hide or divert attention from societal problems in which they have a vested interest.
A perfect example is the tobacco industry’s dissemination of reports that continuously questioned the link between smoking and cancer. As one tobacco employee famously stated, “Doubt is our product.”
In a similar way, conservative think tanks such as The Heartland Institute work to discredit the science behind human-caused climate change.
Despite the fact that 97% of scientists support the anthropogenic causes of climate change, hired “experts” have been able to populate talk shows, news programmes, and the op-ed pages to suggest a lack of credible data or established consensus, even with evidence to the contrary.
These institutes generate pseudo-academic reports to counter scientific results. In this way, they are responsible for promoting ignorance….
Under agnotology 2.0, truth becomes a moot point. It is the sensation that counts. Public media leaders create an impact with whichever arguments they can muster based in whatever fictional data they can create…Donald Trump entering the White House is the pinnacle of agnotology 2.0. Washington Post journalist Fareed Zakaria has argued that in politics, what matters is no longer the economy but identity; we would like to suggest that the problem runs deeper than that.
The issue is not whether we should search for identity, for fame, or for sensational opinions and entertainment. The overarching issue is the fallen status of our collective search for truth, in its many forms. It is no longer a positive attribute to seek out truth, determine biases, evaluate facts, or share knowledge.
Under agnotology 2.0, scientific thinking itself is under attack. In a post-fact and post-truth era, we could very well become post-science….(More)”.
Harnessing the Power of Feedback Loops
Thomas Kalil and David Wilkinson at the White House: “When it comes to strengthening the public sector, the Federal Government looks for new ways to achieve better results for the people we serve. One promising tool that has gained momentum across numerous sectors in the last few years is the adoption of feedback loops. Systematically collecting data and learning from client and customer insights can benefit organizations across all sectors.
Collecting these valuable insights—and acting on them—remains an underutilized practice. The people who receive services are the experts on their effectiveness and usefulness. While the private sector has used customer feedback to improve products and services, the government and nonprofit sectors have often lagged behind. User experience is a critically important factor in driving positive outcomes. Getting honest feedback from service recipients can help nonprofit service providers and agencies at all levels of government ensure their work effectively addresses the needs of the people they serve. It’s equally important to close the loop by letting those who provided feedback know that their input was put to good use.
In September, the White House Office of Social Innovation and the White House Office of Science and Technology Policy (OSTP) hosted a workshop at the White House on data-driven feedback loops for the social and public sectors. The event brought together leaders across the philanthropy, nonprofit, and business sectors who discussed ways to collect and utilize feedback.
The program featured organizations in the nonprofit sector that use feedback to learn what works, what might not be working as well, and how to fix it. One organization, which offers comprehensive employment services to men and women with recent criminal convictions, explained that it has sought feedback from clients on its training program and learned that many people were struggling to find their work site locations and get to the sessions on time. The organization acted on this feedback, shifting their start times and providing maps and clearer directions to their participants. These two simple changes increased both participation in and satisfaction with their program.
Another organization collected feedback to learn whether factory workers attend and understand trainings on fire evacuation procedures. By collecting and acting on this feedback in Brazil, the organization was able to help a factory reduce fire-drill evacuation time from twelve minutes to two minutes—a life-saving result of seeking feedback.
With results such as these in mind, the White House has emphasized the importance of evidence and data-driven solutions across the Federal Government. …
USAID works to end extreme poverty in over 100 countries around the world. The Agency has recently changed its operational policy to enable programs to adapt to feedback from the communities in which they work. They did this by removing bureaucratic obstacles and encouraging more flexibility in their program design. For example, if a USAID-funded project designed to increase agricultural productivity is unexpectedly impacted by drought, the original plan may no longer be relevant or effective; the community may want drought-resistant crops instead. The new, more flexible policy is intended to ensure that such programs can pivot if a community provides feedback that its needs have changed or projects are not succeeding…(More)”
The social data revolution will be crowdsourced
Nicholas B. Adams at SSRC Parameters: “It is now abundantly clear to librarians, archivists, computer scientists, and many social scientists that we are in a transformational age. If we can understand and measure meaning from all of these data describing so much of human activity, we will finally be able to test and revise our most intricate theories of how the world is socially constructed through our symbolic interactions….
We cannot write enough rules to teach a computer to read like us. And because the social world is not a game per se, we can’t design a reinforcement-learning scenario teaching a computer to “score points” and just ‘win.’ But AlphaGo’s example does show a path forward. Recall that much of AlphaGo’s training came in the form of supervised machine learning, where humans taught it to play like them by showing the machine how human experts played the game. Already, humans have used this same supervised learning approach to teach computers to classify images, identify parts of speech in text, or categorize inventories into various bins. Without writing any rules, simply by letting the computer guess, then giving it human-generated feedback about whether it guessed right or wrong, humans can teach computers to label data as we do. The problem is (or has been): humans label textual data slowly—very, very slowly. So, we have generated precious little data with which to teach computers to understand natural language as we do. But that is going to change….
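The supervised approach described above, where humans show the machine labeled examples and it learns to imitate them, can be sketched in a few lines. The snippet below is a minimal illustration using scikit-learn; the example sentences and label names are invented for illustration and are not taken from the author’s project.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny stand-in for human-labeled text units (invented examples).
texts = [
    "Protesters gathered outside city hall demanding rent control.",
    "The council approved a new budget for road maintenance.",
    "Thousands marched downtown against the proposed pipeline.",
    "Officials announced routine updates to the zoning code.",
]
labels = ["protest", "routine", "protest", "routine"]

# Supervised learning: the model is shown human judgments and learns to imitate them.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

# Once trained, the model can label new, unseen text the way the humans did.
print(model.predict(["Crowds rallied at the capitol over education cuts."]))
```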
The single greatest factor dilating the duration of such large-scale text-labeling projects has been workforce training and turnover. ….The key to organizing work for the crowd, I had learned from talking to computer scientists, was task decomposition. The work had to be broken down into simple pieces that any (moderately intelligent) person could do through a web interface without requiring face-to-face training. I knew from previous experiments with my team that I could not expect a crowd worker to read a whole article, or to know our whole conceptual scheme defining everything of potential interest in those articles. Requiring either or both would be asking too much. But when I realized that my conceptual scheme could actually be treated as multiple smaller conceptual schemes, the idea came to me: Why not have my RAs identify units of text that corresponded with the units of analysis of my conceptual scheme? Then, crowd workers reading those much smaller units of text could just label them according to a smaller sub-scheme. Moreover, I came to realize, we could ask them leading questions about the text to elicit information about the variables and attributes in the scheme, so they wouldn’t have to memorize the scheme either. By having them highlight the words justifying their answers, they would be labeling text according to our scheme without any face-to-face training. Bingo….
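A rough sketch of that decomposition: each small unit of text becomes an independent micro-task carrying one leading question from a sub-scheme, and a worker’s answer comes back with the highlighted words that justify it. The class and field names below are hypothetical, meant only to show the shape of such a task, not the author’s actual interface.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class MicroTask:
    """One small unit of text plus one leading question from a sub-scheme."""
    unit_id: str
    text: str            # a sentence or short span, not the whole article
    question: str        # a leading question targeting one variable
    options: List[str]   # a closed set of answers, so no scheme must be memorized

@dataclass
class CrowdAnswer:
    unit_id: str
    worker_id: str
    answer: str
    evidence: List[str] = field(default_factory=list)  # highlighted words justifying the answer

def decompose(article_units: List[str], question: str, options: List[str]) -> List[MicroTask]:
    """Turn pre-identified units of analysis into independent micro-tasks."""
    return [
        MicroTask(unit_id=f"u{i}", text=unit, question=question, options=options)
        for i, unit in enumerate(article_units)
    ]

tasks = decompose(
    ["Hundreds rallied outside the courthouse on Monday."],
    question="Does this sentence describe a protest event?",
    options=["yes", "no", "unclear"],
)
print(tasks[0])
```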
This approach promises more, too. The databases generated by crowd workers, citizen scientists, and students can also be used to train machines to see in social data what we humans see comparatively easily. Just as AlphaGo learned from humans how to play a strategy game, our supervision can also help it learn to see the social world in textual or video data. The final products of social data analysis assembly lines, therefore, are not merely rich and massive databases allowing us to refine our most intricate, elaborate, and heretofore data-starved theories; they are also computer algorithms that will do most or all social data labeling in the future. In other words, whether we know it or not, we social scientists hold the key to developing artificial intelligences capable of understanding our social world….
At stake is a social science with the capacity to quantify and qualify so many of our human practices, from the quotidian to the mythic, and to lead efforts to improve them. In decades to come, we may even be able to follow the path of other mature sciences (including physics, biology, and chemistry) and shift our focus toward engineering better forms of sociality. All the more so because it engages the public: a crowd-supported social science could enlist a new generation in the confident and competent re-construction of society….(More)”
Crowdsourcing, Citizen Science, and Data-sharing
Sapien Labs: “The future of human neuroscience lies in crowdsourcing, citizen science and data sharing but it is not without its minefields.
A recent Scientific American article by Daniel Goodwin, “Why Neuroscience Needs Hackers,” makes the case that neuroscience, like many fields today, is drowning in data, begging for the application of advances in computer science like machine learning. Neuroscientists are able to gather reams of neural data, but often without big data mechanisms and frameworks to synthesize them.
The SA article describes the work of Sebastian Seung, a Princeton neuroscientist, who recently mapped the neural connections of the human retina from an “overwhelming mass” of electron microscopy data using state-of-the-art A.I. and massive crowdsourcing. Seung incorporated the A.I. into a game called “Eyewire” where thousands of volunteers scored points while improving the neural map. Although the article’s title emphasizes advanced A.I., Dr. Seung’s experiment points even more to crowdsourcing and open science, avenues for improving research that have suddenly become easy and powerful with today’s internet. Eyewire perhaps epitomizes successful crowdsourcing — using an application that gathers, represents, and analyzes data uniformly according to researchers’ needs.
Crowdsourcing is seductive in its potential but risky for those who aren’t sure how to control it to get what they want. For researchers who don’t want to become hackers themselves, trying to turn the diversity of data produced by a crowd into conclusive results might seem too much of a headache to make it worthwhile. This is probably why the SA article title says we need hackers. The crowd is there, but using it depends on innovative software engineering. A lot of researchers could really use software designed to flexibly support a diversity of crowdsourcing approaches, AI to enable things like crowd validation, and big data tools.
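One simple form of the “crowd validation” mentioned above is redundant labeling with a consensus rule: several volunteers label the same item, and a label is accepted only when enough of them agree. The sketch below is a generic majority-vote aggregator, not the method used by Eyewire or any particular platform.

```python
from collections import Counter
from typing import Dict, List, Optional

def validate_by_consensus(
    labels: Dict[str, List[str]],
    min_agreement: float = 0.7,
) -> Dict[str, Optional[str]]:
    """Accept a crowd label for each item only if a clear majority agrees.

    `labels` maps item ids to the labels supplied by different volunteers.
    Items without sufficient agreement come back as None and can be routed
    to an expert or to additional volunteers.
    """
    consensus = {}
    for item_id, votes in labels.items():
        top_label, top_count = Counter(votes).most_common(1)[0]
        consensus[item_id] = top_label if top_count / len(votes) >= min_agreement else None
    return consensus

# Toy example: three volunteers per neuron segment (invented data).
print(validate_by_consensus({
    "segment-1": ["branch", "branch", "branch"],
    "segment-2": ["branch", "soma", "branch"],
    "segment-3": ["soma", "branch", "axon"],
}))
```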
The Potential
The SA article also points to Open BCI (brain-computer interface), mentioned here in other posts, as an example of how traditional divisions between institutional and amateur (or “citizen”) science are now crumbling; Open BCI is a community of professional and citizen scientists doing principled research with cheap, portable EEG headsets that produce professional research-quality data. In communities of “neuro-hackers,” like NeurotechX, professional researchers, entrepreneurs, and citizen scientists are coming together to develop all kinds of applications, such as “telepathic” machine control, prostheses, and art. Other companies, like NeuroSky, sell EEG headsets and biosensors for bio-/neuro-feedback training and health monitoring at consumer-affordable prices. (Read more in Citizen Science and EEG)
Tan Le, whose company Emotiv Lifesciences also produces portable EEG headsets, says in an article in National Geographic that neuroscience needs “as much data as possible on as many brains as possible” to advance diagnosis of conditions such as epilepsy and Alzheimer’s. Human neuroscience studies have typically consisted of 20 to 50 participants, an incredibly small sampling of a 7-billion-strong humanity. It is difficult for a single lab to collect larger datasets, yet with diverse populations across the planet, real understanding may require data not from thousands of brains but from millions. With cheap mobile EEG headsets, open-source software, and online collaboration, the potential for anyone to participate in such data collection is immense; the potential for crowdsourcing, unprecedented. There are, however, significant hurdles to overcome….(More)”