Selected Readings on Crowdsourcing Tasks and Peer Production


The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of crowdsourcing was originally published in 2014.

Technological advances are creating a new paradigm by which institutions and organizations are increasingly outsourcing tasks to an open community, allocating specific needs to a flexible, willing and dispersed workforce. “Microtasking” platforms like Amazon’s Mechanical Turk are a burgeoning source of income for individuals who contribute their time, skills and knowledge on a per-task basis. In parallel, citizen science projects – task-based initiatives in which citizens of any background can help contribute to scientific research – like Galaxy Zoo are demonstrating the ability of lay and expert citizens alike to make small, useful contributions to aid large, complex undertakings. As governing institutions seek to do more with less, looking to the success of citizen science and microtasking initiatives could provide a blueprint for engaging citizens to help accomplish difficult, time-consuming objectives at little cost. Moreover, the incredible success of peer-production projects – best exemplified by Wikipedia – instills optimism regarding the public’s willingness and ability to complete relatively small tasks that feed into a greater whole and benefit the public good. You can learn more about this new wave of “collective intelligence” by following the MIT Center for Collective Intelligence and their annual Collective Intelligence Conference.

Annotated Selected Reading List (in alphabetical order)

Benkler, Yochai. The Wealth of Networks: How Social Production Transforms Markets and Freedom. Yale University Press, 2006. http://bit.ly/1aaU7Yb.

  • In this book, Benkler “describes how patterns of information, knowledge, and cultural production are changing – and shows that the way information and knowledge are made available can either limit or enlarge the ways people can create and express themselves.”
  • In his discussion on Wikipedia – one of many paradigmatic examples of people collaborating without financial reward – he calls attention to the notable ongoing cooperation taking place among a diversity of individuals. He argues that, “The important point is that Wikipedia requires not only mechanical cooperation among people, but a commitment to a particular style of writing and describing concepts that is far from intuitive or natural to people. It requires self-discipline. It enforces the behavior it requires primarily through appeal to the common enterprise that the participants are engaged in…”

Brabham, Daren C. Using Crowdsourcing in Government. Collaborating Across Boundaries Series. IBM Center for The Business of Government, 2013. http://bit.ly/17gzBTA.

  • In this report, Brabham categorizes government crowdsourcing cases into a “four-part, problem-based typology, encouraging government leaders and public administrators to consider these open problem-solving techniques as a way to engage the public and tackle difficult policy and administrative tasks more effectively and efficiently using online communities.”
  • The proposed four-part typology describes the following types of crowdsourcing in government:
    • Knowledge Discovery and Management
    • Distributed Human Intelligence Tasking
    • Broadcast Search
    • Peer-Vetted Creative Production
  • In his discussion on Distributed Human Intelligence Tasking, Brabham argues that Amazon’s Mechanical Turk and other microtasking platforms could be useful in a number of governance scenarios, including:
    • Governments and scholars transcribing historical document scans
    • Public health departments translating health campaign materials into foreign languages to benefit constituents who do not speak the native language
    • Governments translating tax documents, school enrollment and immunization brochures, and other important materials into minority languages
    • Helping governments predict citizens’ behavior, “such as for predicting their use of public transit or other services or for predicting behaviors that could inform public health practitioners and environmental policy makers”

Boudreau, Kevin J., Patrick Gaule, Karim Lakhani, Christoph Riedl, Anita Williams Woolley. “From Crowds to Collaborators: Initiating Effort & Catalyzing Interactions Among Online Creative Workers.” Harvard Business School Technology & Operations Mgt. Unit Working Paper No. 14-060. January 23, 2014. https://bit.ly/2QVmGUu.

  • In this working paper, the authors explore the “conditions necessary for eliciting effort from those affecting the quality of interdependent teamwork” and “consider the role of incentives versus social processes in catalyzing collaboration.”
  • The paper’s findings are based on an experiment involving 260 individuals randomly assigned to 52 teams working toward solutions to a complex problem.
  • The authors determined that the level of effort in such collaborative undertakings is sensitive to cash incentives. However, collaboration among teams was driven more by the active participation of teammates than by any monetary reward.

Franzoni, Chiara, and Henry Sauermann. “Crowd Science: The Organization of Scientific Research in Open Collaborative Projects.” Research Policy (August 14, 2013). http://bit.ly/HihFyj.

  • In this paper, the authors explore the concept of crowd science, which they define based on two important features: “participation in a project is open to a wide base of potential contributors, and intermediate inputs such as data or problem solving algorithms are made openly available.” The rationale for their study and conceptual framework is the “growing attention from the scientific community, but also policy makers, funding agencies and managers who seek to evaluate its potential benefits and challenges. Based on the experiences of early crowd science projects, the opportunities are considerable.”
  • Based on the study of a number of crowd science projects – including governance-related initiatives like Patients Like Me – the authors identify a number of potential benefits in the following categories:
    • Knowledge-related benefits
    • Benefits from open participation
    • Benefits from the open disclosure of intermediate inputs
    • Motivational benefits
  • The authors also identify a number of challenges:
    • Organizational challenges
    • Matching projects and people
    • Division of labor and integration of contributions
    • Project leadership
    • Motivational challenges
    • Sustaining contributor involvement
    • Supporting a broader set of motivations
    • Reconciling conflicting motivations

Kittur, Aniket, Ed H. Chi, and Bongwon Suh. “Crowdsourcing User Studies with Mechanical Turk.” In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, 453–456. CHI ’08. New York, NY, USA: ACM, 2008. http://bit.ly/1a3Op48.

  • In this paper, the authors examine “[m]icro-task markets, such as Amazon’s Mechanical Turk, [which] offer a potential paradigm for engaging a large number of users for low time and monetary costs. [They] investigate the utility of a micro-task market for collecting user measurements, and discuss design considerations for developing remote micro user evaluation tasks.”
  • The authors conclude that in addition to providing a means for crowdsourcing small, clearly defined, often non-skill-intensive tasks, “Micro-task markets such as Amazon’s Mechanical Turk are promising platforms for conducting a variety of user study tasks, ranging from surveys to rapid prototyping to quantitative measures. Hundreds of users can be recruited for highly interactive tasks for marginal costs within a timeframe of days or even minutes. However, special care must be taken in the design of the task, especially for user measurements that are subjective or qualitative.”

Kittur, Aniket, Jeffrey V. Nickerson, Michael S. Bernstein, Elizabeth M. Gerber, Aaron Shaw, John Zimmerman, Matthew Lease, and John J. Horton. “The Future of Crowd Work.” In 16th ACM Conference on Computer Supported Cooperative Work (CSCW 2013), 2012. http://bit.ly/1c1GJD3.

  • In this paper, the authors discuss paid crowd work, which “offers remarkable opportunities for improving productivity, social mobility, and the global economy by engaging a geographically distributed workforce to complete complex tasks on demand and at scale.” However, they caution that, “it is also possible that crowd work will fail to achieve its potential, focusing on assembly-line piecework.”
  • The authors argue that seven key challenges must be met to ensure that crowd work processes evolve and reach their full potential:
    • Designing workflows
    • Assigning tasks
    • Supporting hierarchical structure
    • Enabling real-time crowd work
    • Supporting synchronous collaboration
    • Controlling quality

Madison, Michael J. “Commons at the Intersection of Peer Production, Citizen Science, and Big Data: Galaxy Zoo.” In Convening Cultural Commons, 2013. http://bit.ly/1ih9Xzm.

  • This paper explores a “case of commons governance grounded in research in modern astronomy. The case, Galaxy Zoo, is a leading example of at least three different contemporary phenomena. In the first place, Galaxy Zoo is a global citizen science project, in which volunteer non-scientists have been recruited to participate in large-scale data analysis on the Internet. In the second place, Galaxy Zoo is a highly successful example of peer production, sometimes known as crowdsourcing…In the third place, [Galaxy Zoo] is a highly visible example of data-intensive science, sometimes referred to as e-science or Big Data science, by which scientific researchers develop methods to grapple with the massive volumes of digital data now available to them via modern sensing and imaging technologies.”
  • Madison concludes that the success of Galaxy Zoo has not been the result of the “character of its information resources (scientific data) and rules regarding their usage,” but rather, the fact that the “community was guided from the outset by a vision of a specific organizational solution to a specific research problem in astronomy, initiated and governed, over time, by professional astronomers in collaboration with their expanding universe of volunteers.”

Malone, Thomas W., Robert Laubacher and Chrysanthos Dellarocas. “Harnessing Crowds: Mapping the Genome of Collective Intelligence.” MIT Sloan Research Paper. February 3, 2009. https://bit.ly/2SPjxTP.

  • In this article, the authors describe and map the phenomenon of collective intelligence – also referred to as “radical decentralization, crowd-sourcing, wisdom of crowds, peer production, and wikinomics” – which they broadly define as “groups of individuals doing things collectively that seem intelligent.”
  • The article is derived from the authors’ work at MIT’s Center for Collective Intelligence, where they gathered nearly 250 examples of Web-enabled collective intelligence. To map the building blocks or “genes” of collective intelligence, the authors used two pairs of related questions:
    • Who is performing the task? Why are they doing it?
    • What is being accomplished? How is it being done?
  • The authors concede that much work remains to be done “to identify all the different genes for collective intelligence, the conditions under which these genes are useful, and the constraints governing how they can be combined,” but they believe that their framework provides a useful start and gives managers and other institutional decisionmakers looking to take advantage of collective intelligence activities the ability to “systematically consider many possible combinations of answers to questions about Who, Why, What, and How.”

Mulgan, Geoff. “True Collective Intelligence? A Sketch of a Possible New Field.” Philosophy & Technology 27, no. 1. March 2014. http://bit.ly/1p3YSdd.

  • In this paper, Mulgan explores the concept of a collective intelligence, a “much talked about but…very underdeveloped” field.
  • With a particular focus on health knowledge, Mulgan “sets out some of the potential theoretical building blocks, suggests an experimental and research agenda, shows how it could be analysed within an organisation or business sector and points to possible intellectual barriers to progress.”
  • He concludes that the “central message that comes from observing real intelligence is that intelligence has to be for something,” and that “turning this simple insight – the stuff of so many science fiction stories – into new theories, new technologies and new applications looks set to be one of the most exciting prospects of the next few years and may help give shape to a new discipline that helps us to be collectively intelligent about our own collective intelligence.”

Sauermann, Henry and Chiara Franzoni. “Participation Dynamics in Crowd-Based Knowledge Production: The Scope and Sustainability of Interest-Based Motivation.” SSRN Working Papers Series. November 28, 2013. http://bit.ly/1o6YB7f.

  • In this paper, Sauermann and Franzoni explore the issue of interest-based motivation in crowd-based knowledge production – in particular the use of the crowd science platform Zooniverse – by drawing on “research in psychology to discuss important static and dynamic features of interest and deriv[ing] a number of research questions.”
  • The authors find that interest-based motivation is often tied to a “particular object (e.g., task, project, topic)” not based on a “general trait of the person or a general characteristic of the object.” As such, they find that “most members of the installed base of users on the platform do not sign up for multiple projects, and most of those who try out a project do not return.”
  • They conclude that “interest can be a powerful motivator of individuals’ contributions to crowd-based knowledge production…However, both the scope and sustainability of this interest appear to be rather limited for the large majority of contributors…At the same time, some individuals show a strong and more enduring interest to participate both within and across projects, and these contributors are ultimately responsible for much of what crowd science projects are able to accomplish.”

Schmitt-Sands, Catherine E. and Richard J. Smith. “Prospects for Online Crowdsourcing of Social Science Research Tasks: A Case Study Using Amazon Mechanical Turk.” SSRN Working Papers Series. January 9, 2014. http://bit.ly/1ugaYja.

  • In this paper, the authors describe an experiment involving the nascent use of Amazon’s Mechanical Turk as a social science research tool. “While researchers have used crowdsourcing to find research subjects or classify texts, [they] used Mechanical Turk to conduct a policy scan of local government websites.”
  • Schmitt-Sands and Smith found that “crowdsourcing worked well for conducting an online policy program and scan.” The microtasked workers were helpful in screening out local governments that either did not have websites or did not have the types of policies and services for which the researchers were looking. However, “if the task is complicated such that it requires ongoing supervision, then crowdsourcing is not the best solution.”

Shirky, Clay. Here Comes Everybody: The Power of Organizing Without Organizations. New York: Penguin Press, 2008. https://bit.ly/2QysNif.

  • In this book, Shirky explores our current era in which, “For the first time in history, the tools for cooperating on a global scale are not solely in the hands of governments or institutions. The spread of the Internet and mobile phones are changing how people come together and get things done.”
  • Discussing Wikipedia’s “spontaneous division of labor,” Shirky argues that “the process is more like creating a coral reef, the sum of millions of individual actions, than creating a car. And the key to creating those individual actions is to hand as much freedom as possible to the average user.”

Silvertown, Jonathan. “A New Dawn for Citizen Science.” Trends in Ecology & Evolution 24, no. 9 (September 2009): 467–471. http://bit.ly/1iha6CR.

  • This article discusses the move from “Science for the people,” a slogan adopted by activists in the 1970s, to “Science by the people,” which is “a more inclusive aim, and is becoming a distinctly 21st century phenomenon.”
  • Silvertown identifies three factors that are responsible for the explosion of activity in citizen science, each of which could be similarly related to the crowdsourcing of skills by governing institutions:
    • “First is the existence of easily available technical tools for disseminating information about projects and gathering data from the public.
    • A second factor driving the growth of citizen science is the increasing realisation among professional scientists that the public represent a free source of labour, skills, computational power and even finance.
    • Third, citizen science is likely to benefit from the condition that research funders such as the National Science Foundation in the USA and the Natural Environment Research Council in the UK now impose upon every grantholder to undertake project-related science outreach. This is outreach as a form of public accountability.”

Szkuta, Katarzyna, Roberto Pizzicannella, David Osimo. “Collaborative approaches to public sector innovation: A scoping study.” Telecommunications Policy. 2014. http://bit.ly/1oBg9GY.

  • In this article, the authors explore cases where government collaboratively delivers online public services, with a focus on success factors and “incentives for services providers, citizens as users and public administration.”
  • The authors focus on six types of collaborative governance projects:
    • Services initiated by government built on government data;
    • Services initiated by government and making use of citizens’ data;
    • Services initiated by civil society built on open government data;
    • Collaborative e-government services; and
    • Services run by civil society and based on citizen data.
  • The cases explored “are all designed in the way that effectively harnesses the citizens’ potential. Services susceptible to collaboration are those that require computing efforts, i.e. many non-complicated tasks (e.g. citizen science projects – Zooniverse) or citizens’ free time in general (e.g. time banks). Those services also profit from unique citizens’ skills and their propensity to share their competencies.”

Heteromation and its (dis)contents: The invisible division of labor between humans and machines


Paper by Hamid Ekbia and Bonnie Nardi in First Monday: “The division of labor between humans and computer systems has changed along both technical and human dimensions. Technically, there has been a shift from technologies of automation, the aim of which was to disallow human intervention at nearly all points in the system, to technologies of “heteromation” that push critical tasks to end users as indispensable mediators. As this has happened, the large population of human beings who have been driven out by the first type of technology are drawn back into the computational fold by the second type. Turning artificial intelligence on its head, one technology fills the gap created by the other, but with a vengeance that unsettles established mechanisms of reward, fulfillment, and compensation. In this fashion, replacement of human beings and their irrelevance to technological systems has given way to new “modes of engagement” with remarkable social, economic, and ethical implications. In this paper we provide a historical backdrop for heteromation and explore and explicate some of these displacements through analysis of a number of cases, including Mechanical Turk, the video games FoldIt and League of Legends, and social media.”

Big Data, new epistemologies and paradigm shifts


Paper by Rob Kitchin in the journal “Big Data and Society”: “This article examines how the availability of Big Data, coupled with new data analytics, challenges established epistemologies across the sciences, social sciences and humanities, and assesses the extent to which they are engendering paradigm shifts across multiple disciplines. In particular, it critically explores new forms of empiricism that declare ‘the end of theory’, the creation of data-driven rather than knowledge-driven science, and the development of digital humanities and computational social sciences that propose radically different ways to make sense of culture, history, economy and society. It is argued that: (1) Big Data and new data analytics are disruptive innovations which are reconfiguring in many instances how research is conducted; and (2) there is an urgent need for wider critical reflection within the academy on the epistemological implications of the unfolding data revolution, a task that has barely begun to be tackled despite the rapid changes in research practices presently taking place. After critically reviewing emerging epistemological positions, it is contended that a potentially fruitful approach would be the development of a situated, reflexive and contextually nuanced epistemology.”

Technological Innovations and Future Shifts in International Politics


Paper by Askar Akaev and Vladimir Pantin in International Studies Quarterly: “How are large technological changes and important shifts in international politics interconnected? It is shown in the article that primary technological innovations, which take place in each Kondratieff cycle, change the balance of power between the leading states and cause shifts in international politics. In the beginning of the twenty-first century, the genesis and initial development of the cluster of new technologies takes place in periods of crisis and depression. Therefore, the authors forecast that the period 2013–2020 will be marked by the advancement of important technological innovations and massive geopolitical shifts in many regions of the world.”

Who Influences Whom? Reflections on U.S. Government Outreach to Think Tanks


Jeremy Shapiro at Brookings: “The U.S. government makes a big effort to reach out to important think tanks, often through the little noticed or understood mechanism of small, private and confidential roundtables. Indeed, for the ambitious Washington think-tanker nothing quite gets the pulse racing like the idea of attending one of these roundtables with the most important government officials. The very occasion is full of intrigue and ritual.

When the Government Calls for Advice

First, an understated e-mail arrives from some polite underling inviting you in to a “confidential, off-the-record” briefing with some official with an impressive title—a deputy secretary or a special assistant to the president, maybe even (heaven forfend) the secretary of state or the national security advisor. The thinker’s heart leaps, “they read my article; they finally see the light of my wisdom, I will probably be the next national security advisor.”
He clears his schedule of any conflicting brown bags on separatism in South Ossetia and, after a suitable interval to keep the government guessing as to his availability, replies that he might be able to squeeze it into his schedule. Citizenship data and social security numbers are provided for security purposes, times are confirmed and ground rules are established in a multitude of emails with a seemingly never-ending array of staffers, all of whose titles include the word “special.” The thinker says nothing directly to his colleagues, but searches desperately for opportunities to obliquely allude to the meeting: “I’d love to come to your roundtable on uncovered interest rate parity, but I unfortunately have a meeting with the secretary of defense.”
On the appointed day, the thinker arrives early as instructed at an impressively massive and well-guarded government building, clears his way through multiple layers of redundant security, and is ushered into a wood-paneled room that reeks of power and Pine-Sol. (Sometimes it is a futuristic conference room filled with television monitors and clocks that give the time wherever the President happens to be.) Nameless peons in sensible suits clutch government-issue notepads around the outer rim of the room as the thinker takes his seat at the center table, only somewhat disappointed to see so many other familiar thinkers in the room—including some to whom he had been obliquely hinting about the meeting the day before.
At the appointed hour, an officious staffer arrives to announce that “He” (the lead government official goes only by personal pronoun—names are unnecessary at this level) is unfortunately delayed at another meeting on the urgent international crisis of the day, but will arrive just as soon as he can break away from the president in the Situation Room. He is, in fact, just reading email, but his long career has taught him the advantage of making people wait.
After 15 minutes of stilted chit-chat with colleagues that the thinker has the misfortune to see at virtually every event he attends in Washington, the senior government official strides calmly into the room, plops down at the head of the table and declares solemnly what an honor it is to have such distinguished experts to help with this critical area of policy. He very briefly details how very hard the U.S. government is working on this highest priority issue and declares that “we are in listening mode and are anxious to hear your sage advice.” A brave thinker raises his hand and speaks truth to power by reciting the thesis of his latest article. From there, the group is off to the races as the thinkers each struggle to get in the conversation and rehearse their well-worn positions.
Forty-three minutes later, the thinkers’ “hour” is up because, the officious staffer interjects, “He” must attend a Principals Committee meeting. The senior government official thanks the experts for coming, compliments them on their fruitful ideas and their full and frank debate, instructs a nameless peon at random to assemble “what was learned here” for distribution in “the building” and strides purposefully out of the room.
The pantomime then ends and the thinker retreats back to his office to continue his thoughts. But what precisely has happened behind the rituals? Have we witnessed the vaunted academic-government exchange that Washington is so famous for? Is this how fresh ideas re-invigorate stale government groupthink?…”

Cataloging the World


New book on “Paul Otlet and the Birth of the Information Age”: “The dream of capturing and organizing knowledge is as old as history. From the archives of ancient Sumeria and the Library of Alexandria to the Library of Congress and Wikipedia, humanity has wrestled with the problem of harnessing its intellectual output. The timeless quest for wisdom has been as much about information storage and retrieval as creative genius.
In Cataloging the World, Alex Wright introduces us to a figure who stands out in the long line of thinkers and idealists who devoted themselves to the task. Beginning in the late nineteenth century, Paul Otlet, a librarian by training, worked at expanding the potential of the catalog card, the world’s first information chip. From there followed universal libraries and museums, connecting his native Belgium to the world by means of a vast intellectual enterprise that attempted to organize and code everything ever published. Forty years before the first personal computer and fifty years before the first browser, Otlet envisioned a network of “electric telescopes” that would allow people everywhere to search through books, newspapers, photographs, and recordings, all linked together in what he termed, in 1934, a réseau mondial–essentially, a worldwide web.
Otlet’s life achievement was the construction of the Mundaneum–a mechanical collective brain that would house and disseminate everything ever committed to paper. Filled with analog machines such as telegraphs and sorters, the Mundaneum–what some have called a “Steampunk version of hypertext”–was the embodiment of Otlet’s ambitions. It was also short-lived. By the time the Nazis, who were pilfering libraries across Europe to collect information they thought useful, carted away Otlet’s collection in 1940, the dream had ended. Broken, Otlet died in 1944.
Wright’s engaging intellectual history gives Otlet his due, restoring him to his proper place in the long continuum of visionaries and pioneers who have struggled to classify knowledge, from H.G. Wells and Melvil Dewey to Vannevar Bush, Ted Nelson, Tim Berners-Lee, and Steve Jobs. Wright shows that in the years since Otlet’s death the world has witnessed the emergence of a global network that has proved him right about the possibilities–and the perils–of networked information, and his legacy persists in our digital world today, captured for all time…”

How open data can help shape the way we analyse electoral behaviour


Harvey Lewis (Deloitte), Ulrich Atz, Gianfranco Cecconi, Tom Heath (ODI) in The Guardian: Even after the local council elections in England and Northern Ireland on 22 May, which coincided with polling for the European Parliament, the next 12 months remain a busy time for the democratic process in the UK.
In September, the people of Scotland make their choice in a referendum on the future of the Union. Finally, the first fixed-term parliament in Westminster comes to an end with a general election in all areas of Great Britain and Northern Ireland in May 2015.
To ensure that as many people as possible are eligible and able to vote, the government is launching an ambitious programme of Individual Electoral Registration (IER) this summer. This will mean that the traditional, paper-based approach to household registration will shift to a tailored and largely digital process more in keeping with the data-driven demands of the twenty-first century.
Under IER, citizens will need to provide ‘identifying information’, such as date of birth or national insurance number, when applying to register.

Ballots: stuck in the past?

However, despite the government’s attempts through IER to improve the veracity of information captured prior to ballots being posted, little has changed in terms of the vision for capturing, distributing and analysing digital data from election day itself.

Indeed, paper is still the chosen medium for data collection.
Digitising elections is fraught with difficulty, though. In the US, for example, the introduction of new voting machines created much controversy even though they are capable of providing ‘near-perfect’ ballot data.
The UK’s democratic process is not completely blind, though. Numerous opinion surveys are conducted both before and after polling, including the long-running British Election Study, to understand the shifting attitudes of a representative cross-section of the electorate.
But if the government does not retain in sufficient geographic detail digital information on the number of people who vote, then how can it learn what is necessary to reverse the long-running decline in turnout?

The effects of lack of data

To add to the debate around democratic engagement, a joint research team, with data scientists from Deloitte and the Open Data Institute (ODI), have been attempting to understand what makes voters tick.
Our research has been hampered by a significant lack of relevant data describing voter behaviour at electoral ward level, as well as difficulties in matching what little data is available to other open data sources, such as demographic data from the 2011 Census.
Even though individual ballot papers are collected and verified for counting the number of votes per candidate – the primary aim of elections, after all – the only recent elections for which aggregate turnout statistics have been published at ward level are the 2012 local council elections in England and Wales. In these elections, approximately 3,000 wards from a total of over 8,000 voted.
Data published by the Electoral Commission for the 2013 local council elections in England and Wales purports to be at ward level but is, in fact, for ‘county electoral divisions’, as explained by the Office for National Statistics.
Moreover, important factors related to the accessibility of polling stations – such as the distance from main population centres – could not be assessed because the location of polling stations remains the responsibility of individual local authorities – and only eight of these have so far published their data as open data.
Given these fundamental limitations, drawing any robust conclusions is difficult. Nevertheless, our research shows the potential for forecasting electoral turnout with relatively few census variables, the most significant of which are age and the size of the electorate in each ward.

What role can open data play?

The limited results described above provide a tantalising glimpse into a possible future scenario: where open data provides a deeper and more granular understanding of electoral behaviour.
On the back of more sophisticated analyses, policies for improving democratic engagement – particularly among young people – have the potential to become focused and evidence-driven.
And, although the data captured on election day will always remain primarily for the use of electing the public’s preferred candidate, an important secondary consideration is aggregating and publishing data that can be used more widely.
This may have been prohibitively expensive or too complex in the past but as storage and processing costs continue to fall, and the appetite for such knowledge grows, there is a compelling business case.
The benefits of this future scenario potentially include:

  • tailoring awareness and marketing campaigns to wards and other segments of the electorate most likely to respond positively and subsequently turn out to vote
  • increasing the efficiency with which European, general and local elections are held in the UK
  • improving transparency around the electoral process and stimulating increased democratic engagement
  • enhancing links to the Government’s other significant data collection activities, including the Census.

Achieving these benefits requires commitment to electoral data being collected and published in a systematic fashion at least at ward level. This would link work currently undertaken by the Electoral Commission, the ONS, Plymouth University’s Election Centre, the British Election Study and the more than 400 local authorities across the UK.”

Crowdsourcing for public safety


Paper presented by A Goncalves, C Silva, P Morreale, J Bonafide at Systems Conference (SysCon), 2014: “With advances in mobile technology, the ability to get real-time geographically accurate data, including photos and videos, becomes integrated into daily activities. Businesses use this technology edge to stay ahead of their competitors. Social media has made photo and video sharing a widely accepted and adopted behavior. This real-time data and information exchange, crowdsourcing, can be used to help first responders and personnel in emergency situations caused by extreme weather such as earthquakes, hurricanes, floods, and snow storms. Using smartphones, civilians can contribute data and images to the recovery process and make it more efficient, which can ultimately save lives and decrease the economic impact caused by extreme weather conditions.”

Linking Social, Open, and Enterprise Data


Paper by T Omitola, J Davies, A Duke, H Glaser, N Shadbolt for Proceeding WIMS ’14 (Proceedings of the 4th International Conference on Web Intelligence, Mining and Semantics): “The new world of big data, of the LOD cloud, of the app economy, and of social media means that organisations no longer own, much less control, all the data they need to make the best informed business decisions. In this paper, we describe how we built a system using Linked Data principles to bring in data from Web 2.0 sites (LinkedIn, Salesforce), and other external business sites such as OpenCorporates, linking these together with pertinent internal British Telecommunications enterprise data into that enterprise data space. We describe the challenges faced during the implementation, which include sourcing the datasets, finding the appropriate “join points” from the individual datasets, as well as developing the client application used for data publication. We describe our solutions to these challenges and discuss the design decisions made. We conclude by drawing some general principles from this work.”

Crowdsourcing platform for museums


Thesis by Kræn Vesterberg Hansen: “This thesis addresses a strategic challenge at the National Museum of Denmark to engage with external people interested in contributing information about their collection of more than half a million coins and medals. This approach of getting outsiders to help with the completion of many small tasks is popularly known as crowdsourcing. This entails a need for the transcription of handwritten protocols and the establishment of references between entries in protocols and photographs of coins. These coins also reference both structured and non-structured metadata.
Does a digital platform for crowd engagement, in the museum’s context, exist? And how is such a platform integrated with the existing infrastructure of the museum? The report considers the MediaWiki, Amazon’s Mechanical Turk and Zooniverse’s Scribe transcription interface, and finds that the MediaWiki fits approximately 70% of the requirements.
Existing cases of successful crowdsourcing projects, national as well as international, are mentioned, and the solution builds upon APIs of existing infrastructure components (such as the existing collection management system GenReg Mønt and the Canto Cumulus digital asset management system) in a modular and reusable architecture.
The report approaches the challenge in a three-part process, greatly inspired by the software process model of “Reuse-oriented software engineering” proposed by Professor of Software Engineering at the University of St Andrews, Ian Sommerville.”