From Collective Intelligence to Collective Intelligence Systems


New Paper by A. Kornrumpf and U. Baumol in the International Journal of Cooperative Information Systems: “Collective intelligence (CI) has become a popular research topic over the past few years. However, the CI debate suffers from several problems, such as the lack of a unanimously agreed-upon definition of CI that clearly differentiates between CI and related terms such as swarm intelligence (SI) and collective intelligence systems (CIS). Furthermore, a model of such CIS is lacking for purposes of research and the design of new CIS. This paper aims at untangling the definitions of CI and other related terms, especially CIS, and at providing a semi-structured model of CIS as a first step towards more structured research. The authors of this paper argue that CI can be defined as the ability of sufficiently large groups of individuals to create an emergent solution for a specific class of problems or tasks. The authors show that other alleged properties of CI which are not covered by this definition are, in fact, properties of CIS and can be understood by regarding CIS as complex socio-technical systems (STS) that enable the realization of CI. The model defined in this article serves as a means to structure open questions in CIS research and helps to understand which research methodology is adequate for different aspects of CIS.”

Towards an information systems perspective and research agenda on crowdsourcing for innovation


New paper by A. Majchrzak and A. Malhotra in The Journal of Strategic Information Systems: “Recent years have seen an increasing emphasis on open innovation by firms to keep pace with the growing intricacy of products and services and the ever-changing needs of the markets. Much has been written about open innovation and its manifestation in the form of crowdsourcing. Unfortunately, most management research has taken the information system (IS) as a given. In this essay we contend that IS is not just an enabler but rather can be a shaper that optimizes open innovation in general and crowdsourcing in particular. This essay is intended to frame crowdsourcing for innovation in a manner that makes more apparent the issues that require research from an IS perspective. In doing so, we delineate the contributions that the IS field can make to the field of crowdsourcing.

  • Reviews participation architectures supporting current crowdsourcing, finding them inadequate for innovation development by the crowd.

  • Identifies 3 tensions for explaining why a participation architecture for crowdsourced innovation is difficult.

  • Identifies affordances for the participation architectures that may help to manage the tension.

  • Uses the tensions and possible affordances to identify research questions for IS scholars.”

Commons at the Intersection of Peer Production, Citizen Science, and Big Data: Galaxy Zoo


New paper by Michael J. Madison: “The knowledge commons research framework is applied to a case of commons governance grounded in research in modern astronomy. The case, Galaxy Zoo, is a leading example of at least three different contemporary phenomena. In the first place, Galaxy Zoo is a global citizen science project, in which volunteer non-scientists have been recruited to participate in large-scale data analysis via the Internet. In the second place, Galaxy Zoo is a highly successful example of peer production, sometimes known colloquially as crowdsourcing, by which data are gathered, supplied, and/or analyzed by very large numbers of anonymous and pseudonymous contributors to an enterprise that is centrally coordinated or managed. In the third place, Galaxy Zoo is a highly visible example of data-intensive science, sometimes referred to as e-science or Big Data science, by which scientific researchers develop methods to grapple with the massive volumes of digital data now available to them via modern sensing and imaging technologies. This chapter synthesizes these three perspectives on Galaxy Zoo via the knowledge commons framework.”

Are Some Tweets More Interesting Than Others? #HardQuestion


New paper by Microsoft Research (Omar Alonso, Catherine C. Marshall, and Marc Najork): “Twitter has evolved into a significant communication nexus, coupling personal and highly contextual utterances with local news, memes, celebrity gossip, headlines, and other microblogging subgenres. If we take Twitter as a large and varied dynamic collection, how can we predict which tweets will be interesting to a broad audience in advance of lagging social indicators of interest such as retweets? The telegraphic form of tweets, coupled with the subjective notion of interestingness, makes it difficult for human judges to agree on which tweets are indeed interesting.
In this paper, we address two questions: Can we develop a reliable strategy that results in high-quality labels for a collection of tweets, and can we use this labeled collection to predict a tweet’s interestingness?
To answer the first question, we performed a series of studies using crowdsourcing to reach a diverse set of workers who served as a proxy for an audience with variable interests and perspectives. This method allowed us to explore different labeling strategies, including varying the judges, the labels they applied, the datasets, and other aspects of the task.
To address the second question, we used crowdsourcing to assemble a set of tweets rated as interesting or not; we scored these tweets using textual and contextual features; and we used these scores as inputs to a binary classifier. We were able to achieve moderate agreement (kappa = 0.52) between the best classifier and the human assessments, a figure which reflects the challenges of the judgment task.”
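For readers unfamiliar with the kappa statistic quoted above, here is a minimal sketch of how chance-corrected agreement between two sets of binary labels can be computed. It assumes Cohen's kappa for two raters; the abstract does not say which kappa variant the authors used. The labels below are invented for illustration and are not the paper's data.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length label sequences (assumed variant)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(labels_a)
    # Observed agreement: fraction of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement by chance, from each rater's label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label throughout; treat as full agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels: 1 = "interesting", 0 = "not interesting".
classifier_labels = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
human_labels = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]
kappa = cohen_kappa(classifier_labels, human_labels)
print(round(kappa, 2))
```

In this toy case the two raters agree on 8 of 10 items while chance alone would give 5 of 10, so kappa is about 0.6. Values around 0.4 to 0.6 are commonly described as moderate agreement.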

Defining Open Data


Open Knowledge Foundation Blog: “Open data is data that can be freely used, shared and built-on by anyone, anywhere, for any purpose. This is the summary of the full Open Definition which the Open Knowledge Foundation created in 2005 to provide both a succinct explanation and a detailed definition of open data.
As the open data movement grows, and even more governments and organisations sign up to open data, it becomes ever more important that there is a clear and agreed definition for what “open data” means if we are to realise the full benefits of openness, and avoid the risks of creating incompatibility between projects and splintering the community.

Open can apply to information from any source and about any topic. Anyone can release their data under an open licence for free use by and benefit to the public. Although we may think mostly about government and public sector bodies releasing public information such as budgets or maps, or researchers sharing their results data and publications, any organisation can open information (corporations, universities, NGOs, startups, charities, community groups and individuals).

There is open information in transport, science, products, education, sustainability, maps, legislation, libraries, economics, culture, development, business, design, finance …. So the explanation of what open means applies to all of these information sources and types. Open may also apply both to data – big data and small data – or to content, like images, text and music!
So here we set out clearly what open means, and why this agreed definition is vital for us to collaborate, share and scale as open data and open content grow and reach new communities.

What is Open?

The full Open Definition provides a precise definition of what open data is. There are 2 important elements to openness:

  • Legal openness: you must be allowed to get the data legally, to build on it, and to share it. Legal openness is usually provided by applying an appropriate (open) license which allows for free access to and reuse of the data, or by placing data into the public domain.
  • Technical openness: there should be no technical barriers to using that data. For example, providing data as printouts on paper (or as tables in PDF documents) makes the information extremely difficult to work with. So the Open Definition has various requirements for “technical openness,” such as requiring that data be machine readable and available in bulk.”…
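To make the "machine readable" requirement concrete, here is a minimal sketch of what it allows in practice. It assumes a small budget table published as CSV; the department names and amounts are invented for illustration and do not come from any real dataset.

```python
import csv
import io

# Hypothetical open budget data as it might be published in CSV form.
csv_text = (
    "department,amount\n"
    "Health,1200\n"
    "Education,950\n"
    "Transport,400\n"
)

# Because the data is machine readable, a few lines are enough to load it...
rows = list(csv.DictReader(io.StringIO(csv_text)))

# ...and to start building on it, here by totalling the amounts.
total = sum(int(row["amount"]) for row in rows)
print(total)
```

The same table printed on paper or embedded in a PDF would first have to be retyped or extracted before any such reuse is possible.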

The role of task difficulty in the effectiveness of collective intelligence


New article by Christian Wagner: “The article presents a framework and empirical investigation to demonstrate the role of task difficulty in the effectiveness of collective intelligence. The research contends that collective intelligence, a form of community engagement to address problem solving tasks, can be superior to individual judgment and choice, but only when the addressed tasks are in a range of appropriate difficulty, which we label the “collective range”. Outside of that difficulty range, collectives will perform about as poorly as individuals for high difficulty tasks, or only marginally better than individuals for low difficulty tasks. An empirical investigation with subjects randomly recruited online supports our conjecture. Our findings qualify prior research on the strength of collective intelligence in general and offer preliminary insights into the mechanisms that enable individuals and collectives to arrive at good solutions. Within the framework of digital ecosystems, the paper argues that collective intelligence has more survival strength than individual intelligence, with highest sustainability for tasks of medium difficulty.”
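One way to build intuition for a "collective range" is a simple majority-vote calculation. The sketch below is not the article's framework or data. It assumes independent voters on a binary question who each answer correctly with probability p (a Condorcet-style model), and it uses p as a stand-in for task difficulty.

```python
from math import comb


def majority_accuracy(p, n):
    """Probability that a majority of n independent voters is correct.

    Assumes an odd n (no ties) and that each voter is right with probability p.
    """
    if n % 2 == 0:
        raise ValueError("n must be odd so that a strict majority always exists")
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )


def collective_gain(p, n=11):
    """How much the majority outperforms a single voter of accuracy p."""
    return majority_accuracy(p, n) - p


# Hypothetical difficulty levels: hard (p = 0.5), medium (0.7), easy (0.95).
for p in (0.5, 0.7, 0.95):
    print(p, round(collective_gain(p), 3))
```

Under these assumptions the group gains nothing over an individual when individuals are at chance, gains only a little when individuals are already nearly always right, and gains most in between.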

A New Kind of Economy is Born – Social Decision-Makers Beat the "Homo Economicus"


A new paper by Dirk Helbing: “The Internet and Social Media change our way of decision-making. We are no longer the independent decision makers we used to be. Instead, we have become networked minds, social decision-makers, more than ever before. This has several fundamental implications. First of all, our economic theories must change, and second, our economic institutions must be adapted to support the social decision-maker, the “homo socialis”, rather than tailored to the perfect egoist, known as “homo economicus”….
Such developments will eventually create a participatory market society. “Prosumers”, i.e. co-producing consumers, the new “makers” movement, and the sharing economy are some examples illustrating this. Just think of the success of Wikipedia, Open Streetmap or Github. Open Streetmap now provides the most up-to-date maps of the world, thanks to more than 1 million volunteers.
This is just the beginning of a new era, where production and public engagement will more and more happen in a bottom-up way through fluid “projects”, where people can contribute as leaders (“entrepreneurs”) or participants. A new intellectual framework is emerging, and a creative and participatory era is ahead.
The paradigm shift towards participatory bottom-up self-regulation may be bigger than the paradigm shift from a geocentric to a heliocentric worldview. If we build the right institutions for the information society of the 21st century, we will finally be able to mitigate some very old problems of humanity. “Tragedies of the commons” are just one of them. After so many centuries, they are still plaguing us, but this needn’t be.”

Social media analytics for future oriented policy making


New paper by Verena Grubmüller, Katharina Götsch, and Bernhard Krieger: “Research indicates that evidence-based policy making is most successful when public administrators refer to diversified information portfolios. With the rising prominence of social media in the last decade, this paper argues that governments can benefit from integrating this publicly available, user-generated data through the technique of social media analytics (SMA). There are already several initiatives set up to predict future policy issues, e.g. for the policy fields of crisis mitigation or migrant integration insights. The authors analyse these endeavours and their potential for providing more efficient and effective public policies. Furthermore, they scrutinise the challenges to governmental SMA usage, in particular with regard to legal and ethical aspects. Reflecting the latter, this paper provides forward-looking recommendations on how these technologies can best be used for future policy making in a legally and ethically sound manner.”

Undefined By Data: A Survey of Big Data Definitions


Paper by Jonathan Stuart Ward and Adam Barker: “The term big data has become ubiquitous. Owing to shared origin between academia, industry and the media there is no single unified definition, and various stakeholders provide diverse and often contradictory definitions. The lack of a consistent definition introduces ambiguity and hampers discourse relating to big data. This short paper attempts to collate the various definitions which have gained some degree of traction and to furnish a clear and concise definition of an otherwise ambiguous term…
Despite the range and differences existing within each of the aforementioned definitions there are some points of similarity. Notably all definitions make at least one of the following assertions:

  • Size: the volume of the datasets is a critical factor.

  • Complexity: the structure, behaviour and permutations of the datasets is a critical factor.

  • Technologies: the tools and techniques which are used to process a sizable or complex dataset is a critical factor.

The definitions surveyed here all encompass at least one of these factors, most encompass two. An extrapolation of these factors would therefore postulate the following: Big data is a term describing the storage and analysis of large and/or complex data sets using a series of techniques including, but not limited to: NoSQL, MapReduce and machine learning.”

Using Participatory Crowdsourcing in South Africa to Create a Safer Living Environment


New Paper by Bhaveer Bhana, Stephen Flowerday, and Aharon Satt in the International Journal of Distributed Sensor Networks: “The increase in urbanisation is making the management of city resources a difficult task. Data collected through observations (utilising humans as sensors) of the city surroundings can be used to improve decision making in terms of managing these resources. However, the data collected must be of a certain quality in order to ensure that effective and efficient decisions are made. This study is focused on the improvement of emergency and non-emergency services (city resources) through the use of participatory crowdsourcing (humans as sensors) as a data collection method (collect public safety data), utilising voice technology in the form of an interactive voice response (IVR) system.
The study illustrates how participatory crowdsourcing (specifically humans as sensors) can be used as a Smart City initiative focusing on public safety by illustrating what is required to contribute to the Smart City, and developing a roadmap in the form of a model to assist decision making when selecting an optimal crowdsourcing initiative. Public safety data quality criteria were developed to assess and identify the problems affecting data quality.
This study is guided by design science methodology and applies three driving theories: the Data Information Knowledge Action Result (DIKAR) model, the characteristics of a Smart City, and a credible Data Quality Framework. Four critical success factors were developed to ensure high quality public safety data is collected through participatory crowdsourcing utilising voice technologies.”