Are bots taking over Wikipedia?


Kurzweil News: “As crowdsourced Wikipedia has grown too large — with more than 30 million articles in 287 languages — to be entirely edited and managed by volunteers, 12 Wikipedia bots have emerged to pick up the slack.

The bots use Wikidata — a free knowledge base that can be read and edited by both humans and bots — to exchange information between entries and between the 287 languages.
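The mechanics are easy to see from the outside: each Wikidata item carries "sitelinks" that map one concept to its article title in every language edition. As a minimal illustration (this sketch uses the public wbgetentities endpoint of the MediaWiki API on wikidata.org; the item Q42 is just a stock example, not one drawn from the article), a bot-style client can fetch that mapping like this:

```python
import requests  # third-party; pip install requests

WIKIDATA_API = "https://www.wikidata.org/w/api.php"

def sitelinks(item_id):
    """Fetch the per-language Wikipedia titles linked to one Wikidata item."""
    params = {
        "action": "wbgetentities",
        "ids": item_id,
        "props": "sitelinks",
        "format": "json",
    }
    resp = requests.get(WIKIDATA_API, params=params, timeout=30)
    resp.raise_for_status()
    links = resp.json()["entities"][item_id]["sitelinks"]
    # Keys look like "enwiki" or "srwiki"; values carry the local article title.
    return {site: link["title"] for site, link in links.items()}

titles = sitelinks("Q42")  # Q42 (Douglas Adams) is a common API example
print(len(titles), "language editions;", titles.get("enwiki"))
```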

Which raises an interesting question: what portion of Wikipedia edits are generated by humans versus bots?

To find out (and keep track of other bot activity), Thomas Steiner of Google Germany has created an open-source application (and API): Wikipedia and Wikidata Realtime Edit Stats, described in an arXiv paper.
The percentages of bot vs. human edits shown in the application are constantly changing. A KurzweilAI snapshot on Feb. 20 at 5:19 AM EST showed an astonishing 42% of Wikipedia edits being made by bots. (The application lists the 12 bots.)


Anonymous vs. logged-in humans (credit: Thomas Steiner)
The percentages also vary by language. Only 5% of English edits were made by bots, but on Serbian pages, where few human Wikipedians apparently participate, 96% of edits were by bots.

The application also tracks what percentage of edits are made by anonymous users. Globally, it was 25% in our snapshot and a surprising 34% for English, raising interesting questions about corporate and other interests covertly manipulating Wikipedia information.”
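Steiner’s application and API do the live counting; for readers who want to approximate the measurement themselves, here is a minimal sketch against Wikimedia’s public EventStreams feed. Both the feed choice and the anonymous-editor heuristic (IP-shaped usernames) are assumptions of this sketch, not a description of Steiner’s implementation.

```python
import json
import re
import requests  # third-party; pip install requests

STREAM_URL = "https://stream.wikimedia.org/v2/stream/recentchanges"
# Heuristic: anonymous editors appear under IPv4-style names (ignores IPv6).
IPV4 = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")

def edit_shares(sample_size=500):
    """Tally bot vs. human and anonymous edits from the live change feed."""
    counts = {"bot": 0, "human": 0, "anonymous": 0}
    with requests.get(STREAM_URL, stream=True, timeout=60) as resp:
        seen = 0
        for line in resp.iter_lines():
            # Keep only SSE data lines; Wikimedia sends one JSON object each.
            if not line or not line.startswith(b"data: "):
                continue
            event = json.loads(line[len(b"data: "):])
            if event.get("type") not in ("edit", "new"):
                continue
            if event.get("bot"):
                counts["bot"] += 1
            else:
                counts["human"] += 1
                if IPV4.match(event.get("user", "")):
                    counts["anonymous"] += 1  # subset of human edits
            seen += 1
            if seen >= sample_size:
                break
    total = counts["bot"] + counts["human"]
    # Percentages of all sampled edits, mirroring the article's figures.
    return {k: round(100 * v / total, 1) for k, v in counts.items()}

if __name__ == "__main__":
    print(edit_shares())
```

Because the stream is live, repeated runs show the same minute-to-minute variation in the bot share that the snapshot above describes.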

How Government Can Make Open Data Work


Joel Gurin in Information Week: “At the GovLab at New York University, where I am senior adviser, we’re taking a different approach than McKinsey’s to understanding the evolving value of government open data: We’re studying open data companies from the ground up. I’m now leading the GovLab’s Open Data 500 project, funded by the John S. and James L. Knight Foundation, to identify and examine 500 American companies that use government open data as a key business resource.
Our preliminary results show that government open data is fueling companies both large and small, across the country, and in many sectors of the economy, including health, finance, education, energy, and more. But it’s not always easy to use this resource. Companies that use government open data tell us it is often incomplete, inaccurate, or trapped in hard-to-use systems and formats.
It will take a thorough and extended effort to make government data truly useful. Based on what we are hearing and the research I did for my book, here are some of the most important steps the federal government can take, starting now, to make it easier for companies to add economic value to the government’s data.
1. Improve data quality
The Open Data Policy not only directs federal agencies to release more open data; it also requires them to release information about data quality. Agencies will have to begin improving the quality of their data simply to avoid public embarrassment. We can hope and expect that they will do some data cleanup themselves, demand better data from the businesses they regulate, or use creative solutions like turning to crowdsourcing for help, as USAID did to improve geospatial data on its grantees.
2. Keep improving open data resources
The government has steadily made Data.gov, the central repository of federal open data, more accessible and useful, including a significant relaunch last week. To the agency’s credit, the GSA, which administers Data.gov, plans to keep working to make this key website still better. As part of implementing the Open Data Policy, the administration has also set up Project Open Data on GitHub, the world’s largest community for open-source software. These resources will be helpful for anyone working with open data either inside or outside of government. They need to be maintained and continually improved.
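For a sense of what using this key resource looks like programmatically, here is a minimal sketch against the CKAN search API that catalog.data.gov exposes; the query term and the fields pulled out are illustrative choices of this sketch, not part of Project Open Data.

```python
import requests  # third-party; pip install requests

CKAN_SEARCH = "https://catalog.data.gov/api/3/action/package_search"

def search_datasets(query, rows=5):
    """Search the federal data catalog; return (title, file formats) pairs."""
    resp = requests.get(CKAN_SEARCH, params={"q": query, "rows": rows},
                        timeout=30)
    resp.raise_for_status()
    body = resp.json()
    if not body.get("success"):
        raise RuntimeError("CKAN API reported failure")
    results = []
    for pkg in body["result"]["results"]:
        # Each dataset lists downloadable resources with a format field.
        formats = sorted({r.get("format", "?") for r in pkg.get("resources", [])})
        results.append((pkg["title"], formats))
    return results

for title, formats in search_datasets("hospital quality"):
    print(f"{title}: {', '.join(formats)}")
```

A run like this also makes the data-quality complaint above concrete: the formats returned range from clean CSV and JSON to opaque ZIP and PDF files.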
3. Pass DATA
The Digital Accountability and Transparency Act would bring transparency to federal government spending at an unprecedented level of detail. The Act has strong bipartisan support. It passed the House with only one dissenting vote and was unanimously approved by a Senate committee, but still needs full Senate approval and the President’s signature to become law. DATA is also supported by technology companies who see it as a source of new open data they can use in their businesses. Congress should move forward and pass DATA as the logical next step in the work that the Obama administration’s Open Data Policy has begun.
4. Reform the Freedom of Information Act
Since it was passed in 1966, the federal Freedom of Information Act has gone through two major revisions, both of which strengthened citizens’ ability to access many kinds of government data. It’s time for another step forward. Current legislative proposals would establish a centralized web portal for all federal FOIA requests, strengthen the FOIA ombudsman’s office, and require agencies to post more high-interest information online before they receive formal requests for it. These changes could make more information from FOIA requests available as open data.
5. Engage stakeholders in a genuine way
Up to now, the government’s release of open data has largely been a one-way affair: Agencies publish datasets that they hope will be useful, without consulting the organizations and companies that want to use them. Other countries, including the UK, France, and Mexico, are building in feedback loops from data users to government data providers, and the US should, too. The Open Data Policy calls for agencies to establish points of contact for public feedback. At the GovLab, we hope that the Open Data 500 will help move that process forward. Our research will provide a basis for new, productive dialogue between government agencies and the businesses that rely on them.
6. Keep using federal challenges to encourage innovation
The federal Challenge.gov website applies the best principles of crowdsourcing and collective intelligence. Agencies should use this approach extensively, and should pose challenges using the government’s open data resources to solve business, social, or scientific problems. Other approaches to citizen engagement, including federally sponsored hackathons and the White House Champions of Change program, can play a similar role.
Through the Open Data Policy and other initiatives, the Obama administration has set the right goals. Now it’s time to implement and move toward what US CTO Todd Park calls “data liberation.” Thousands of companies, organizations, and individuals will benefit.”

MIT Crowdsources the Next Great (free) IQ Test


ThePsychReport: “Raven’s Matrices have long been a gold standard for psychologists needing to measure general intelligence. But the good ones, the ones scientists like to use, are too expensive for most research projects.

Christopher Chabris, associate professor of psychology at Union College, and David Engel, postdoctoral associate at MIT Sloan School of Management, think the public can help. They recently launched a campaign to crowdsource “the next great IQ test.” The Matrix Reasoning Challenge, created through MIT’s Center for Collective Intelligence with Anita Woolley and Tom Malone, calls on the public to design and submit matrix puzzles – 3×3 grids that ask subjects to complete a pattern by filling in a missing square.
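The excerpt doesn’t say what format submissions take, so the sketch below is purely hypothetical: it encodes one classic rule scheme for such a puzzle (shape fixed per row, item count fixed per column) and withholds the bottom-right cell along with multiple-choice distractors.

```python
import random

SHAPES = ["circle", "square", "triangle"]

def make_puzzle():
    """Build a 3x3 matrix: shape varies by row, item count by column.
    The bottom-right cell is hidden; solving means inferring both rules."""
    shapes = random.sample(SHAPES, 3)       # one shape per row
    counts = random.sample(range(1, 5), 3)  # one item count per column
    grid = [[(shapes[r], counts[c]) for c in range(3)] for r in range(3)]
    answer = grid[2][2]
    grid[2][2] = None  # the missing square the subject must fill in
    # Distractors: plausible but wrong shape/count combinations.
    options = {answer}
    while len(options) < 4:
        options.add((random.choice(SHAPES), random.randint(1, 4)))
    return grid, answer, sorted(options)

grid, answer, options = make_puzzle()
for row in grid:
    print(row)
print("options:", options, "answer:", answer)
```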

Chabris says they aren’t trying to compete with commercially available tests used for diagnostic or clinical purposes, but rather want to provide a trustworthy and free alternative for scientists. Because these types of puzzles are nonverbal, culturally neutral, and objective, they have wide-ranging applications and are particularly useful when conducting research across various demographics. If this project is successful, a lot more scientists could do a lot more research.

A simple example of a matrix puzzle. Source: Matrix Reasoning Challenge

“Researchers typically don’t have that much money,” Chabris said. “They can’t afford pay-per-use tests. Sometimes they have no research budgets, or if they do, they’re not large enough for that kind of thing. Our real goal is to create something useful for researchers.”

Through the Matrix Reasoning Challenge, Chabris and Engel also hope to better understand how crowdsourcing can be used to problem-solve in social and cognitive sciences.

Social scientists already widely use crowdsourcing sites like Amazon’s Mechanical Turk to recruit participants for their studies, but the matrix project is different in that it seeks to tap the public’s expertise to help solve scientific problems. Researchers in computer science and bioinformatics have harnessed this expertise to yield some incredible results: using TopCoder.com, NASA found a more efficient way to deploy solar panels on the International Space Station, and Harvard Medical School developed better software for analyzing immune-system genes. With the Matrix Reasoning Challenge, Chabris and Engel are beginning to explore crowdsourcing’s potential in the social sciences.”

What Jelly Means


Steven Johnson: “A few months ago, I found this strange white mold growing in my garden in California. I’m a novice gardener, and to make matters worse, a novice Californian, so I had no idea what these small white cells might portend for my flowers.
This is one of those odd blank spots — I used to call them Googleholes in the early days of the service — where the usual Delphic source of all knowledge comes up relatively useless. The Google algorithm doesn’t know what those white spots are, the way it knows more computational questions, like “what is the top-ranked page for ‘white mold’?” or “what is the capital of Illinois?” What I want, in this situation, is the distinction we usually draw between information and wisdom. I don’t just want to know what the white spots are; I want to know if I should be worried about them, or if they’re just a normal thing during late summer in Northern California gardens.
Now, I’m sure I know a dozen people who would be able to answer this question, but the problem is I don’t really know which people they are. But someone in my extended social network has likely experienced these white spots on their plants, or better yet, gotten rid of them. (Or, for all I know, eaten them — I’m trying not to be judgmental.) There are tools out there that would help me run the social search required to find that person. I could bulk-email my entire address book with images of the mold and ask for help. I could go on Quora, or a gardening site.
But the thing is, it’s a type of question that I find myself wanting to ask a lot, and there’s something inefficient about trying to figure out the exact right tool to use each time, particularly when we have seen the value of consolidating so many of our queries into a single, predictable search field at Google.
This is why I am so excited about the new app, Jelly, which launched today. …
Jelly, if you haven’t heard, is the brainchild of Biz Stone, one of Twitter’s co-founders. The service launches today with apps on iOS and Android. (Biz himself has a blog post and video, which you should check out.) I’ve known Biz since the early days of Twitter, and I’m excited to be an adviser and small investor in a company that shares so many of the values around networks and collective intelligence that I’ve been writing about since Emergence.
The thing that’s most surprising about Jelly is how fun it is to answer questions. There’s something strangely satisfying in flipping through the cards, reading questions, scanning the pictures, and looking for a place to be helpful. It’s the same broad gesture of reading, say, a Twitter feed, and pleasantly addictive in the same way, but the intent is so different. Scanning a Twitter feed while waiting for the train has the feel of “Here we are now, entertain us.” Scanning Jelly is more like: “I’m here. How can I help?”

The Effective Use of Crowdsourcing in E-Governance


Paper by Jayakumar Sowmya and Hussain Shafiq Pyarali: “The rise of the Web 2.0 paradigm has empowered Internet users to share information and generate content on social networking and media-sharing platforms such as wikis and blogs. The trend of harnessing the wisdom of the public through open calls over Web 2.0 distributed networks is termed ‘crowdsourcing’. In addition to businesses, this powerful idea of using collective intelligence, or the ‘wisdom of the crowd’, applies to other settings, such as governments and non-profit organizations, which have started utilizing crowdsourcing as an essential problem-solving tool. Moreover, widespread and easy access to technologies such as the Internet, mobile phones and other communication devices has produced exponential growth in the use of crowdsourcing for government policy advocacy, e-democracy and e-governance during the past decade. However, utilizing the collective intelligence and efforts of the public to solve real-life problems with Web 2.0 tools comes with its share of challenges and limitations. This paper aims to identify and examine the value-adding strategies that contribute to the success of crowdsourcing in e-governance. Qualitative case study analysis and empathic design methodology are employed to evaluate the effectiveness of the identified strategic and functional components by analyzing the characteristics of some notable cases of crowdsourcing in e-governance, and the findings are tabulated and discussed. The paper concludes with the limitations and the implications for future research.”

NEW: The Open Governance Knowledge Base


In its continued efforts to organize and disseminate learnings in the field of technology-enabled governance innovation, The Governance Lab is introducing a collaborative, wiki-style repository of information and research at the nexus of technology, governance and citizenship. Right now we’re calling it the Open Governance Knowledge Base, and it goes live today.
Our goal in creating this collaborative platform is to provide a single source of research and insights related to the broad, interdisciplinary field of open governance for the benefit of: 1) decision-makers in governing institutions seeking information and inspiration to guide their efforts to increase openness; 2) academics seeking to enrich and expand their scholarly pursuits in this field; 3) technology practitioners seeking insights and examples of familiar tools being used to solve public problems; and 4) average citizens simply seeking interesting information on a complex, evolving topic area.
While you can already find some pre-populated information and research on the platform, we need your help! The field of open governance is too vast, complex and interdisciplinary to meaningfully document without broad collaboration.
Here’s how you can help to ensure this shared resource is as useful and engaging as possible:

  • What should we call the platform? We want your title suggestions. Leave your ideas in the comments or tweet them to us @TheGovLab.
  • And more importantly: Share your knowledge and research. Take a look at what we’ve posted, create an account, refer to this MediaWiki formatting guide as needed and start editing!

Peer Production: A Modality of Collective Intelligence


New paper by Yochai Benkler, Aaron Shaw and Benjamin Mako Hill: “Peer production is the most significant organizational innovation that has emerged from Internet-mediated social practice, and among the most visible and important examples of collective intelligence. Following Benkler, we define peer production as a form of open creation and sharing performed by groups online that: (1) sets and executes goals in a decentralized manner; (2) harnesses a diverse range of participant motivations, particularly non-monetary motivations; and (3) separates governance and management relations from exclusive forms of property and relational contracts (i.e., projects are governed as open commons or common property regimes, and organizational governance utilizes combinations of participatory, meritocratic and charismatic, rather than proprietary or contractual, models). For early scholars of peer production, the phenomenon was both important and confounding for its ability to generate high-quality work products in the absence of formal hierarchies and monetary incentives. However, as peer production has become increasingly established in society, the economy, and scholarship, merely describing the success of some peer production projects has become less useful. In recent years, a second wave of scholarship has emerged to challenge assumptions in earlier work; probe nuances glossed over by earlier framings of the phenomenon; and identify the necessary dynamics, structures, and conditions for peer production success.
Peer production includes many of the largest and most important collaborative communities on the Internet….
Much of this academic interest in peer production stemmed from the fact that the phenomenon resisted straightforward explanation in terms of extant theories of the organization and production of functional information goods like software or encyclopedias. Participants in peer production projects join and contribute valuable resources without the hierarchical bureaucracies or strong leadership structures common to state agencies or firms, and in the absence of clear financial incentives or rewards. As a result, foundational research on peer production focused on (1) documenting and explaining the organization and governance of peer production communities, (2) understanding the motivation of contributors to peer production, and (3) establishing and evaluating the quality of peer production’s outputs.
In the rest of this chapter, we describe the development of the academic literature on peer production in these three areas – organization, motivation, and quality.”

Implementing Open Innovation in the Public Sector: The Case of Challenge.gov


Article by Ines Mergel and Kevin C. Desouza in Public Administration Review: “As part of the Open Government Initiative, the Barack Obama administration has called for new forms of collaboration with stakeholders to increase the innovativeness of public service delivery. Federal managers are employing a new policy instrument called Challenge.gov to implement open innovation concepts invented in the private sector to crowdsource solutions from previously untapped problem solvers and to leverage collective intelligence to tackle complex social and technical public management problems. The authors highlight the work conducted by the Office of Citizen Services and Innovative Technologies at the General Services Administration, the administrator of the Challenge.gov platform. Specifically, this Administrative Profile features the work of Tammi Marcoullier, program manager for Challenge.gov, and Karen Trebon, deputy program manager, and their role as change agents who mediate collaborative practices between policy makers and public agencies as they navigate the political and legal environments of their local agencies. The profile provides insights into the implementation process of crowdsourcing solutions for public management problems, as well as lessons learned for designing open innovation processes in the public sector”.

Global Collective Intelligence in Technological Societies


Paper by Juan Carlos Piedra Calderón and Javier Rainer in the International Journal of Artificial Intelligence and Interactive Multimedia: “The strong influence of Information and Communication Technologies (ICT), especially in the construction of technological societies, has generated major social changes, visible in the way people relate to one another in different environments. These changes open the possibility of expanding the frontiers of knowledge through sharing and cooperation, which has meant the creation of a new form of Collaborative Knowledge. The potential of this Collaborative Knowledge is realized through ICT in combination with Artificial Intelligence processes, from which a Collective Knowledge is obtained. When this kind of knowledge is shared, it gives rise to Global Collective Intelligence.”

Democratic Reason: Politics, Collective Intelligence, and the Rule of the Many


New book by Hélène Landemore: “Individual decision making can often be wrong due to misinformation, impulses, or biases. Collective decision making, on the other hand, can be surprisingly accurate. In Democratic Reason, Hélène Landemore demonstrates that the very factors behind the superiority of collective decision making add up to a strong case for democracy. She shows that the processes and procedures of democratic decision making form a cognitive system that ensures that decisions taken by the many are more likely to be right than decisions taken by the few. Democracy as a form of government is therefore valuable not only because it is legitimate and just, but also because it is smart.
Landemore considers how the argument plays out with respect to two main mechanisms of democratic politics: inclusive deliberation and majority rule. In deliberative settings, the truth-tracking properties of deliberation are enhanced more by inclusiveness than by individual competence. Landemore explores this idea in the contexts of representative democracy and the selection of representatives. She also discusses several models for the “wisdom of crowds” channeled by majority rule, examining the trade-offs between inclusiveness and individual competence in voting. When inclusive deliberation and majority rule are combined, they beat less inclusive methods, in which one person or a small group decide. Democratic Reason thus establishes the superiority of democracy as a way of making decisions for the common good.”
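The excerpt doesn’t name it, but the classic formal model behind majority rule’s truth-tracking is the Condorcet jury theorem: if each voter is independently correct with probability p > 0.5, the probability that the majority is correct approaches 1 as the group grows. A small numerical sketch makes the inclusiveness-versus-competence trade-off concrete:

```python
from math import comb  # Python 3.8+

def majority_correct(n, p):
    """P(a majority of n independent voters is right), each right with prob p."""
    k = n // 2 + 1  # smallest winning majority; assumes n is odd
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

# Competence barely above chance, amplified by group size:
for n in (1, 11, 101, 1001):
    print(n, round(majority_correct(n, 0.55), 4))
```

With p = 0.55, a crowd of 1,001 barely-better-than-chance voters is right well over 99% of the time, outperforming a lone expert who is right 90% of the time; that arithmetic is the core of the case for the inclusive many over the competent few.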