Crowdsourcing Tolstoy


 at the New Yorker: “When Leo Tolstoy’s great-great-granddaughter, the journalist Fyokla Tolstaya, announced that the Leo Tolstoy State Museum was looking for volunteers to proofread some forty-six thousand eight hundred pages of her relative’s writings, she hoped to generate enough interest to get the first round of corrections done in six months.

Within days, some three thousand Russians—engineers, I.T. workers, schoolteachers, retirees, a student pilot, a twenty-year-old waitress—signed on. “We were so happy and so surprised,” said Tolstaya. “They finished in fourteen days.”

Now, thanks largely to the efforts of these volunteers, nearly all of the great Russian writer’s massive body of work, including novels, diaries, letters, religious tracts, philosophical treatises, travelogues, and childhood memories, will soon be available online, in a form that can be easily downloaded, free of charge. “Of course we realized there are some novels on the Internet,” Tolstaya said. “But most [writings] are not. We in the museum decided this is not good.”…

The definitive, ninety-volume jubilee edition of Tolstoy’s works, compiled and published in Russia from the nineteen-twenties to the nineteen-fifties, had already been scanned by the Russian State Library. However, converting the PDFs into an easy-to-use digital format posed a challenge. For one thing, even after ABBYY, a company that specializes in translating printed documents into digital records, offered their services for free, proofreading costs were likely to be prohibitive. Charging readers to download the works was not an option. “At the end of his life, Tolstoy said, ‘I don’t need any money for my work. I want to give my work to the people,’” said Tolstaya. “It was important for us to make it free for everyone. It is his will.”

That was when they hit on the idea of crowdsourcing, Tolstaya said. “It’s according to Leo Tolstoy’s ideas, to do it with the help of all people around the world—vsem mirom—even the world’s hardest task can be done with the help of everyone.”…(More)”

Leveraging Mixed Expertise in Crowdsourcing


Dissertation by David Merritt: “Crowdsourcing systems promise to leverage the “wisdom of crowds” to help solve many kinds of problems that are difficult to solve using only computers. Although a crowd of people inherently represents a diversity of skill levels, knowledge, and opinions, crowdsourcing system designers typically view this diversity as noise and effectively cancel it out by aggregating responses. However, we believe that by embracing crowd workers’ diverse expertise levels, system designers can better leverage that knowledge to increase the wisdom of crowds. In this thesis, we propose solutions to a limitation of current crowdsourcing approaches: not accounting for a range of expertise levels in the crowd. The current body of work in crowdsourcing does not systematically examine this, suggesting that researchers may not believe the benefits of using mixed expertise warrants the complexities of supporting it. This thesis presents two systems, Escalier and Kurator, to show that leveraging mixed expertise is a worthwhile endeavor because it materially benefits system performance, at scale, for various types of problems. We also demonstrate an effective technique, called expertise layering, to incorporate mixed expertise into crowdsourcing systems. Finally, we show that leveraging mixed expertise enables researchers to use crowdsourcing to address new types of problems….(More)”
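
To make the contrast in the abstract concrete, the sketch below compares plain majority voting, which treats every worker alike, with a vote weighted by an expertise score. It is an illustration only, not the dissertation's Escalier or Kurator systems or its expertise layering technique; the worker IDs, labels, weights, and function names are all hypothetical.

```python
from collections import defaultdict

def majority_vote(responses):
    """Return the label chosen by the most workers (first seen wins ties)."""
    counts = defaultdict(int)
    for label in responses.values():
        counts[label] += 1
    return max(counts, key=counts.get)

def weighted_vote(responses, expertise):
    """Return the label with the largest total expertise weight.

    Workers missing from `expertise` get a default weight of 1.0
    (an assumption made for this sketch).
    """
    totals = defaultdict(float)
    for worker, label in responses.items():
        totals[label] += expertise.get(worker, 1.0)
    return max(totals, key=totals.get)

# Hypothetical data: two novices agree, one expert disagrees.
responses = {"w1": "oak", "w2": "oak", "w3": "elm"}
expertise = {"w1": 0.5, "w2": 0.5, "w3": 3.0}

print(majority_vote(responses))             # the novices outvote the expert
print(weighted_vote(responses, expertise))  # the expert's answer prevails
```

Simple aggregation discards the expert's dissent as noise, while the weighted version keeps it; how expertise scores would be obtained in practice is left open here.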

Crowdsourcing at Statistics Canada


Pilot project by Statistics Canada: “Our crowdsourcing pilot project will focus on mapping buildings across Canada.

If you live in Ottawa or Gatineau, you can be among the first to collaborate with us. If you live elsewhere, stay in touch! Your town or city could be next. We are very excited to work with communities across the country on this project.

As a project contributor, you can help create a free and open source of information on commercial, industrial, government and other buildings in Canada. We need your support to close this important data gap! Your work will improve your community’s knowledge of its buildings, and in turn inform policies and programs designed to help you.

An eye on the future

There are currently no accurate national-level statistics on buildings—and their attributes—that can be used to compare specific local areas. The information you submit will help to fill existing data gaps and provide new analytical opportunities that are important to data users.

This project will also teach us about the possibilities and limitations of crowdsourcing. Crowdsourcing data collection may become a way for Statistics Canada and other organizations around the world to collect much-needed information by reaching out to citizens.

What you can do

Using your knowledge of your neighbourhood, along with an online mapping tool called OpenStreetMap, you and other members of the public will be able to input the location, physical attributes and other features of buildings.


It all starts with you, on October 17, 2016

We will officially launch the crowdsourcing campaign for the pilot on October 17, 2016 and will provide further instructions and links to resources.

To subscribe to a distribution list for periodic updates on the project, send us an email at statcan.crowdsource.statcan@canada.ca. We will keep you posted!….(More)”

How Technology is Crowd-Sourcing the Fight Against Hunger


Beth Noveck at Media Planet: “There is more than enough food produced to feed everyone alive today. Yet access to nutritious food is a challenge everywhere and depends on getting every citizen involved, not just large organizations. Technology is helping to democratize and distribute the job of tackling the problem of hunger in America and around the world.

Real-time research

One of the hardest problems is the difficulty of gaining real-time insight into food prices and shortages. Enter technology. We no longer have to rely on professional inspectors slowly collecting information face-to-face. The UN World Food Programme, which provides food assistance to 80 million people each year, together with Nielsen is conducting mobile phone surveys in 15 countries (with plans to expand to 30), asking people by voice and text about what they are eating. Formerly blank maps are now filled in with information provided quickly and directly by the most affected people, making it easy to prioritize the allocation of resources.

Technology helps the information flow in both directions, enabling those in need to reach out, but also to become more effective at helping themselves. The Indian Ministry of Agriculture, in collaboration with Reuters Market Light, provides information services in nine Indian languages to 1.4 million registered farmers in 50,000 villages across 17 Indian states via text and voice messages.

Data to the people

New open data laws and policies that encourage more transparent publication of public information complement data collection and dissemination technologies such as phones and tablets. About 70 countries and hundreds of regions and cities have adopted open data policies, which guarantee that the information these public institutions collect be available for free use by the public. As a result, there are millions of open datasets now online on websites such as the Humanitarian Data Exchange, which hosts 4,000 datasets such as country-by-country stats on food prices and undernourishment around the world.

Companies are compiling and sharing data to combat food insecurity, too. Anyone can dig into the data on the Global Open Data for Agriculture and Nutrition platform, a data collaborative where 300 private and public partners are sharing information.

Importantly, this vast quantity of open data is available to anyone, not only to governments. As a result, large and small entrepreneurs are able to create new apps and programs to combat food insecurity, such as Plantwise, which uses government data to offer a knowledge bank and run “plant clinics” that help farmers lose less of what they grow to pests. Google uses open government data to show people the location of farmers markets near their homes.

Students, too, can learn to play a role. For the second summer in a row, the Governance Lab at New York University, in partnership with the United States Department of Agriculture (USDA), mounted a two-week open data summer camp for 40 middle and high school students. The next generation of problem solvers is learning new data science skills by working on food safety and other projects using USDA open data.

Enhancing connection

Ultimately, technology enables greater communication and collaboration among the public, social service organizations, restaurants, farmers and other food producers who must work together to avoid food crises. The European Food Safety Authority in Italy has begun exploring how to use internet-based collaboration (often called citizen science or crowdsourcing) to get more people involved in food and feed risk assessment.

In the United States, 40 percent of the food produced here is wasted, and yet 1 in 4 American children (and 1 in 6 adults) remain food insecure, according to the Rockefeller Foundation. Copia, a San Francisco-based smartphone app, facilitates donations and deliveries from those with excess food in six cities in the Bay Area. Zero Percent in Chicago similarly attacks the distribution problem by connecting restaurants to charities to donate their excess food. Full Harvest is a tech platform that facilitates the selling of surplus produce that otherwise would not have a market.

Mobilizing the world

Prize-backed challenges create the incentives for more people to collaborate online and get involved in the fight against hunger….(More)”

Beware of the gaps in Big Data


Edd Gent at E&T: “When the municipal authority in charge of Boston, Massachusetts, was looking for a smarter way to find which roads it needed to repair, it hit on the idea of crowdsourcing the data. The authority released a mobile app called Street Bump in 2011 that employed an elegantly simple idea: use a smartphone’s accelerometer to detect jolts as cars go over potholes and look up the location using the Global Positioning System. But the approach ran into a pothole of its own. The system reported a disproportionate number of potholes in wealthier neighbourhoods. It turned out it was oversampling the younger, more affluent citizens who were digitally clued up enough to download and use the app in the first place. The city reacted quickly, but the incident shows how easy it is to develop a system that can handle large quantities of data but which, through its own design, is still unlikely to have enough data to work as planned.

As we entrust more of our lives to big data analytics, automation problems like this could become increasingly common, with their errors difficult to spot after the fact. Systems that ‘feel like they work’ are where the trouble starts.

Harvard University professor Gary King, who is also founder of social media analytics company Crimson Hexagon, recalls a project that used social media to predict unemployment. The model was built by correlating US unemployment figures with the frequency that people used words like ‘jobs’, ‘unemployment’ and ‘classifieds’. A sudden spike convinced researchers they had predicted a big rise in joblessness, but it turned out Steve Jobs had died and their model was simply picking up posts with his name. “This was an example of really bad analytics and it’s even worse because it’s the kind of thing that feels like it should work and does work a little bit,” says King.

Big data can shed light on areas with historic information deficits, and systems that seem to automatically highlight the best course of action can be seductive for executives and officials. “In the vacuum of no decision any decision is attractive,” says Jim Adler, head of data at Toyota Research Institute in Palo Alto. “Policymakers will say, ‘there’s a decision here let’s take it’, without really looking at what led to it. Was the data trustworthy, clean?”…(More)”
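
The unemployment example can be reproduced in miniature. The sketch below counts posts that contain job-related keywords, the way a naive signal might, and shows how posts about Steve Jobs inflate the count. The sample posts, the keyword set, and the function name are invented for illustration; this is not the model King describes.

```python
import re

# Keywords taken from the article's description of the model.
KEYWORDS = {"jobs", "unemployment", "classifieds"}

def keyword_hits(posts):
    """Count posts that mention at least one keyword, ignoring case."""
    hits = 0
    for post in posts:
        words = set(re.findall(r"[a-z]+", post.lower()))
        if words & KEYWORDS:
            hits += 1
    return hits

# Hypothetical posts: an ordinary day versus the day of a major news event.
ordinary_day = ["Looking for jobs in Boston", "Nice weather today"]
news_day = ordinary_day + ["RIP Steve Jobs", "Steve Jobs changed computing"]

print(keyword_hits(ordinary_day))
print(keyword_hits(news_day))
```

Lowercasing the text makes the surname indistinguishable from the keyword, so the count rises on the news day even though nothing about employment has changed.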

Coming soon: The Conversation Global


The Conversation, an independent news and commentary website produced by academics and journalists, launches its Global edition this month.

The Conversation Global will publish commentary, analysis and research from the academic community worldwide. We will engage scholars from across the world, featuring perspectives from the Global South and North on the most pressing international issues. All content will be published under Creative Commons.

The site is open and free for everyone to read.

The Internet for farmers without Internet


Project Breakthrough: “Mobile Internet is rapidly becoming a primary source of knowledge for rural populations in developing countries. But not every one of the world’s 500 million smallholder farmers is connected to the Internet – which means they can struggle to solve daily agricultural challenges. With no way to access information on things like planting, growing and selling, farmers in Asia, Latin America and Africa simply cannot grow. Many live on less than a dollar a day and don’t have smartphones to ask Google what to do.

London-based startup WeFarm is the world’s first free peer-to-peer network that spreads crowdsourced knowledge via SMS messages, which only need simple mobile phones. Since launching in November 2015, its aim has been to give remote, offline farmers access to vital, innovative insight on topics such as crop diversification, tackling soil erosion or changing climatic conditions. Billing itself as ‘The internet for people without the internet’, WeFarm strongly believes in the power of grassroots information. That’s why it costs nothing.

“With WeFarm we want all farmers in the world to be able to search for and access the information they need to improve their livelihoods,” Kenny Ewan, CEO tells us. The seeds for his idea were planted after many years working with indigenous communities in Latin America, based in Peru. “To me it makes perfect sense to allow farmers to connect with other farmers in order to find solutions to their problems. These farmers are experts in agriculture, and they come up with low-cost, innovative solutions, that are easy to implement.”

Farmers send questions by SMS to a local WeFarm number. Then they are connected to a huge crowdsourcing platform. The network’s back-end uses machine-learning algorithms to match them to farmers with answers. This data creates a sort of Google for agriculture…(More)”

The Wisdom of the Crowd is what science really needs


Science/Disrupt: “In a world where technology allows for global collaboration, and in a time when we’re finally championing diversity of thought, there are few barriers to getting the right people together to work on some of our most pressing problems. Governments and research labs are attempting to apply this mentality to science through what is known as ‘Citizen Science’ – research conducted in part by the public (amateur scientists) in partnership with the professionals.

The concept of Citizen Science is brilliant: moving science forward, faster, by utilising the wisdom and volume of the crowd. …

But Citizen Science goes beyond working directly with people with specific data to share. Zooniverse – the home of Citizen Science online – lists hundreds of projects which anyone can get involved with to help advance science. From mapping the galaxy and looking for comets, to seeking out Australian wildlife and helping computers understand animal faces, the projects span across many subjects.

But when you dig deeper into the tasks being asked of these Citizen Scientists, you find that – really – it’s a simple data capture activity. There’s no skill involved other than engaging your eyes to see and fingers to click and type. It’s not the wisdom of the crowd which is being tapped into.

You could argue that people are interested purely in being a part of important research – which of course is true for many – but it misses the point that scientists are simply missing out on a great resource of intellect at their fingertips.

There has been a rise of crowdsourced solutions over the last few years. rLoop is an organisation formed over Reddit to propose a Hyperloop transportation capsule; Techfugees is a global community of technologists who team up to propose and build solutions to problems facing the increasing numbers of refugees around the world; and XPRIZE is an open competition offering winning teams large sums of money and support to solve the global problems they select each year.

The difference between crowdsourcing and Citizen Science is that in the former, a high value is placed on ideas. There’s a general understanding that ‘two minds are better than one’ and that by empowering a larger, more diverse pool of people to engage with important and purposeful work, a better solution will be found faster.

With Citizen Science, the mood is that of the public only being capable of playing hide and seek with pictures and completing menial, time consuming work that the scientists are simply too busy to do. …(More)”

Crowdsourcing: It Matters Who the Crowd Are


Paper by Alexis Comber, Peter Mooney, Ross S. Purves, Duccio Rocchini, and Ariane Walz: “Volunteered geographical information (VGI) and citizen science have become important sources of data for much scientific research. In the domain of land cover, crowdsourcing can provide high temporal resolution data to support different analyses of landscape processes. However, the scientists may have little control over what gets recorded by the crowd, providing a potential source of error and uncertainty. This study compared analyses of crowdsourced land cover data that were contributed by different groups, based on nationality (labelled Gondor and Non-Gondor) and on domain experience (labelled Expert and Non-Expert). The analyses used a geographically weighted model to generate maps of land cover and compared the maps generated by the different groups. The results highlight the differences between the maps and how specific land cover classes were under- and over-estimated. As crowdsourced data and citizen science are increasingly used to replace data collected under the designed experiment, this paper highlights the importance of considering between-group variations and their impacts on the results of analyses. Critically, differences in the way that landscape features are conceptualised by different groups of contributors need to be considered when using crowdsourced data in formal scientific analyses. The discussion considers the potential for variation in crowdsourced data, the relativist nature of land cover and suggests a number of areas for future research. The key finding is that the veracity of citizen science data is not the critical issue per se. Rather, it is important to consider the impacts of differences in the semantics, affordances and functions associated with landscape features held by different groups of crowdsourced data contributors….(More)”

How Citizen Attachment to Neighborhoods Helps to Improve Municipal Services and Public Spaces


Paper by Daniel O’Brien, Dietmar Offenhuber, Jessica Baldwin-Philippi, Melissa Sands, and Eric Gordon: “What motivates people to contact their local governments with reports about street light outages, potholes, graffiti, and other deteriorations in public spaces? Current efforts to improve government interactions with constituents operate on the premise that citizens who make such reports are motivated by broad civic values. In contrast, our recent research demonstrates that such citizens are primarily motivated by territoriality – that is, attachments to the spaces where they live. Our research focuses on Boston’s “311 system,” which provides telephone hotlines and web channels through which constituents can request non-emergency government services.

Although our study focuses on 311 users in Boston, it holds broader implications for more than 400 U.S. municipalities that administer similar systems. And our results encourage a closer look at the drivers of citizen participation in many “coproduction programs” – programs that involve people in the design and implementation of government services. Currently, 311 is just one example of government efforts to use technology to involve constituents in joint efforts.

Territorial Ties and Civic Engagement

The concept of territoriality originated in studies of animal behavior – such as bears marking trees in the forest or lions and hyenas fighting over a kill. Human beings also need to manage the ownership of objects and spaces, but social psychologists have demonstrated that human territoriality, whether at home, the workplace, or a neighborhood, entails more than the defense of objects or spaces against others. It includes maintenance and caretaking, and even extends to items shared with others….(More)”