Nicaraguans are using crowdsourcing technology to prove that a good map can change your life


Grace Dobush in Quartz: “Taking a bus in Latin America can be a disorienting experience. While the light rail systems in places like Mexico City, Buenos Aires, and Rio de Janeiro have system maps that are fairly easy to understand, most cities lack a comprehensive bus map.

Several factors have kept bus maps from taking root in much of Latin America. One reason is that bus lines running through cities tend to be operated by a collection of private companies, so they have less incentive to coordinate. Newspapers and local governments occasionally attempt to create system maps, but none have stuck.

In Managua, Nicaragua, this absence has had a constricting effect on residents’ lives. Most people know the bus lines that take them to school or to work, but don’t stray far from their usual route. “Mobility in Managua is very difficult,” says Felix Delattre, a German programmer and amateur cartographer who moved to Nicaragua 10 years ago to work for an NGO. “People don’t go out at night because they don’t have the information on when the bus is coming and where it’s going.”

Delattre led a group of volunteers who teamed up with university students in Managua to create what they believe to be the first comprehensive bus map in Latin America. The mapping venture is an independent project connected to the Humanitarian OpenStreetMap Team, a nonprofit group that uses open-source technology and crowdsourcing to create badly needed maps around the world. The team spent almost two years mapping out Managua and Ciudad Sandino and recording its 46 bus routes….(More)

How Snapchat Is Recruiting Bone Marrow Donors


PSFK: “Every ten minutes, blood cancer takes a life away. One of the ways to treat this disease is through stem cell transplants that create healthy blood cells. Since 70% of patients who need a transplant cannot find a match in their family, they need to turn to outside donors. In an effort to increase the number of bone marrow donors in the registry, Be The Match turned to Snapchat to find male donors from the ages of 18 to 24.

The aim of the campaign “Be the Guy” is to release short videos on Snapchat of regular guys acting silly. This emphasizes the idea that literally anyone can save a life, no matter who—or how quirky—you are. With a swipe up, any Snapchat user will be directed to a form that makes it easy to sign up to be a donor. To complete the registration, all it takes is receiving a kit in the mail and mailing back a swab from your cheek….(More)”

US start-up aims to steer through flood of data


Richard Waters in the Financial Times: “The “open data” movement has produced a deluge of publicly available information this decade, as governments like those in the UK and US have released large volumes of data for general use.

But the flood has left researchers and data scientists with a problem: how do they find the best data sets, ensure these are accurate and up to date, and combine them with other sources of information?

The most ambitious in a spate of start-ups trying to tackle this problem is set to be unveiled on Monday, when data.world opens for limited release. A combination of online repository and social network, the site is designed to be a central platform to support the burgeoning activity around freely available data.

The aim closely mirrors GitHub, which has been credited with spurring the open-source software movement by becoming both a place to store and find free programs and a crowdsourcing tool for identifying the most useful ones.

“We are at an inflection point,” said Jeff Meisel, chief marketing officer for the US Census Bureau. A “massive amount of data” has been released under open data provisions, he said, but “what hasn’t been there are the tools, the communities, the infrastructure to make that data easier to mash up”….

Data.world plans to seed its site with about a thousand data sets and attract academics as its first users, said Brett Hurt, the company’s co-founder and chief executive. By letting users create personal profiles on the site, follow others and collaborate around the information they are working on, the site hopes to create the kind of social dynamic that makes it more useful the more it is used.

An attraction of the service is the ability to upload data in any format and then use common web standards to link different data sets and create mash-ups with the information, said Dean Allemang, an expert in online data….(More)”
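The kind of mash-up Allemang describes can be pictured with a toy sketch: two open data sets that share a common identifier (here an ISO country code) are joined into a single view. The records, field names, and the `join_on` helper below are illustrative assumptions, not part of data.world's actual API.

```python
# Toy mash-up sketch: join two open data sets on a shared key.
# All records and field names are invented for illustration.

population = [
    {"iso": "NIC", "population": 6_200_000},
    {"iso": "UGA", "population": 41_500_000},
]
internet = [
    {"iso": "NIC", "internet_users_pct": 28},
    {"iso": "UGA", "internet_users_pct": 22},
]

def join_on(key, *tables):
    """Merge rows from several tables that share the same key value."""
    merged = {}
    for table in tables:
        for row in table:
            merged.setdefault(row[key], {}).update(row)
    return merged

mashup = join_on("iso", population, internet)
```

In practice the "common web standards" the article alludes to (shared identifiers, linked-data vocabularies) do the work that the `iso` key does in this sketch.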

Crowdsourcing biomedical research: leveraging communities as innovation engines


Julio Saez-Rodriguez et al in Nature: “The generation of large-scale biomedical data is creating unprecedented opportunities for basic and translational science. Typically, the data producers perform initial analyses, but it is very likely that the most informative methods may reside with other groups. Crowdsourcing the analysis of complex and massive data has emerged as a framework to find robust methodologies. When the crowdsourcing is done in the form of collaborative scientific competitions, known as Challenges, the validation of the methods is inherently addressed. Challenges also encourage open innovation, create collaborative communities to solve diverse and important biomedical problems, and foster the creation and dissemination of well-curated data repositories….(More)”

Research in the Crowdsourcing Age, a Case Study


Report by Pew Research Center: “How scholars, companies and workers are using Mechanical Turk, a ‘gig economy’ platform, for tasks computers can’t handle

Digital-age platforms are providing researchers the ability to outsource portions of their work – not just to increasingly intelligent machines, but also to a relatively low-cost online labor force made up of humans. These so-called “online outsourcing” services help employers connect with a global pool of free-agent workers who are willing to complete a variety of specialized or repetitive tasks.

Because it provides access to large numbers of workers at relatively low cost, online outsourcing holds a particular appeal for academics and nonprofit research organizations – many of whom have limited resources compared with corporate America. For instance, Pew Research Center has experimented with using these services to perform tasks such as classifying documents and collecting website URLs. And a Google search of scholarly academic literature shows that more than 800 studies – ranging from medical research to social science – were published using data from one such platform, Amazon’s Mechanical Turk, in 2015 alone.

The rise of these platforms has also generated considerable commentary about the so-called “gig economy” and the possible impact it will have on traditional notions about the nature of work, the structure of compensation and the “social contract” between firms and workers. Pew Research Center recently explored some of the policy and employment implications of these new platforms in a national survey of Americans.

Proponents say this technology-driven innovation can offer employers – whether companies or academics – the ability to control costs by relying on a global workforce that is available 24 hours a day to perform relatively inexpensive tasks. They also argue that these arrangements offer workers the flexibility to work when and where they want to. On the other hand, some critics worry this type of arrangement does not give employees the same type of protections offered in more traditional work environments – while others have raised concerns about the quality and consistency of data collected in this manner.

A recent report from the World Bank found that the online outsourcing industry generated roughly $2 billion in 2013 and involved 48 million registered workers (though only 10% of them were considered “active”). By 2020, the report predicted, the industry will generate between $15 billion and $25 billion.

Amazon’s Mechanical Turk is one of the largest outsourcing platforms in the United States and has become particularly popular in the social science research community as a way to conduct inexpensive surveys and experiments. The platform has also become an emblem of the way that the internet enables new businesses and social structures to arise.

In light of its widespread use by the research community and overall prominence within the emerging world of online outsourcing, Pew Research Center conducted a detailed case study examining the Mechanical Turk platform in late 2015 and early 2016. The study utilizes three different research methodologies to examine various aspects of the Mechanical Turk ecosystem. These include human content analysis of the platform, a canvassing of Mechanical Turk workers and an analysis of third party data.

The first goal of this research was to understand who uses the Mechanical Turk platform for research or business purposes, why they use it and who completes the work assignments posted there. To evaluate these issues, Pew Research Center performed a content analysis of the tasks posted on the site during the week of Dec. 7-11, 2015.

A second goal was to examine the demographics and experiences of the workers who complete the tasks appearing on the site. This is relevant not just to fellow researchers who might be interested in using the platform, but also as a snapshot of one set of “gig economy” workers. To address these questions, Pew Research Center administered a nonprobability online survey of Turkers from Feb. 9-25, 2016, by posting a task on Mechanical Turk that rewarded workers for answering questions about their demographics and work habits. The sample of 3,370 workers contains any number of interesting findings, but it has its limits. This canvassing emerges from an opt-in sample of those who were active on MTurk during this particular period, who saw our survey and who had the time and interest to respond. It does not represent all active Turkers in this period or, more broadly, all workers on MTurk.

Finally, this report uses data collected by the online tool mturk-tracker, which is run by Dr. Panagiotis G. Ipeirotis of the New York University Stern School of Business, to examine the amount of activity occurring on the site. The mturk-tracker data are publicly available online, though the insights presented here have not been previously published….(More)”

Bridging data gaps for policymaking: crowdsourcing and big data for development


From the DevPolicyBlog: “…By far the biggest innovation in data collection is the ability to access and analyse (in a meaningful way) user-generated data. This is data that is generated from forums, blogs, and social networking sites, where users purposefully contribute information and content in a public way, but also from everyday activities that inadvertently or passively provide data to those that are able to collect it.

User-generated data can help identify user views and behaviour to inform policy in a timely way rather than just relying on traditional data collection techniques (census, household surveys, stakeholder forums, focus groups, etc.), which are often cumbersome, very costly, untimely, and in many cases require some form of approval or support by government.

It might seem at first that user-generated data has limited usefulness in a development context due to the importance of the internet in generating this data combined with limited internet availability in many places. However, U-Report is one example of being able to access user-generated data independent of the internet.

U-Report was initiated by UNICEF Uganda in 2011 and is a free SMS-based platform where Ugandans are able to register as “U-Reporters” and on a weekly basis give their views on topical issues (mostly related to health, education, and access to social services) or participate in opinion polls. As an example, Figure 1 shows the result from a U-Report poll on whether polio vaccinators came to U-Reporter houses to immunise all children under 5 in Uganda, broken down by districts. Presently, there are more than 300,000 U-Reporters in Uganda and more than one million U-Reporters across 24 countries that now have U-Report. As an indication of its potential impact on policymaking, UNICEF claims that every Member of Parliament in Uganda is signed up to receive U-Report statistics.
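A district-level poll of this kind boils down to tallying SMS replies by district. A minimal sketch, with invented district names and answers (the real U-Report pipeline is of course richer):

```python
from collections import defaultdict

# Illustrative SMS replies: (district, answer) pairs. Invented data.
replies = [
    ("Kampala", "yes"), ("Kampala", "no"), ("Kampala", "yes"),
    ("Gulu", "no"), ("Gulu", "no"),
]

def tally_by_district(replies):
    """Count yes/no answers per district."""
    counts = defaultdict(lambda: {"yes": 0, "no": 0})
    for district, answer in replies:
        counts[district][answer] += 1
    return dict(counts)

poll = tally_by_district(replies)
```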

Figure 1: U-Report Uganda poll results

U-Report and other platforms such as Ushahidi (which supports, for example, I PAID A BRIBE, Watertracker, election monitoring, and crowdmapping) facilitate crowdsourcing of data where users contribute data for a specific purpose. In contrast, “big data” is a broader concept because the purpose of using the data is generally independent of the reasons why the data was generated in the first place.

Big data for development is a new phrase that we will probably hear a lot more (see here [pdf] and here). The United Nations Global Pulse, for example, supports a number of innovation labs which work on projects that aim to discover new ways in which data can help better decision-making. Many forms of “big data” are unstructured (free-form and text-based rather than table- or spreadsheet-based) and so a number of analytical techniques are required to make sense of the data before it can be used.

Measures of Twitter activity, for example, can be a real-time indicator of food price crises in Indonesia [pdf] (see Figure 2 below which shows the relationship between food-related tweet volume and food inflation: note that the large volume of tweets in the grey highlighted area is associated with policy debate on cutting the fuel subsidy rate) or provide a better understanding of the drivers of immunisation awareness. In these examples, researchers “text-mine” Twitter feeds by extracting tweets related to topics of interest and categorising text based on measures of sentiment (positive, negative, anger, joy, confusion, etc.) to better understand opinions and how they relate to the topic of interest. For example, Figure 3 shows the sentiment of tweets related to vaccination in Kenya over time and the dates of important vaccination related events.
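The categorisation step described here can be sketched as a simple lexicon-based tagger: count positive and negative keywords in each tweet and label it accordingly. The word lists and tweets below are invented; the cited studies use far richer sentiment lexicons and models.

```python
# Minimal lexicon-based sentiment sketch. Keyword lists are illustrative.
POSITIVE = {"good", "great", "safe", "protect", "happy"}
NEGATIVE = {"expensive", "crisis", "fear", "angry", "shortage"}

def classify_sentiment(tweet):
    """Tag a tweet as positive, negative, or neutral by keyword counts."""
    words = set(tweet.lower().split())
    pos = len(words & POSITIVE)
    neg = len(words & NEGATIVE)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"

tweets = [
    "rice is so expensive this month, a real crisis",
    "vaccines protect our kids, great news",
    "waiting at the market",
]
labels = [classify_sentiment(t) for t in tweets]
```

Aggregating such labels over time is what produces trend lines like those in Figures 2 and 3.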

Figure 2: Plot of monthly food-related tweet volume and official food price statistics

Figure 3: Sentiment of vaccine-related tweets in Kenya

Another big data example is the use of mobile phone records to monitor the movement of populations in Senegal in 2013. The data can help to identify changes in the mobility patterns of vulnerable population groups and thereby provide an early warning system to inform humanitarian response efforts.

The development of mobile banking too offers the potential for the generation of a staggering amount of data relevant for development research and informing policy decisions. However, it also highlights the public good nature of data collected by public and private sector institutions and the reliance that researchers have on them to access the data. Building trust and a reputation for being able to manage privacy and commercial issues will be a major challenge for researchers in this regard….(More)”

Crowdsourcing privacy policy analysis: Potential, challenges and best practices


Paper: “Privacy policies are supposed to provide transparency about a service’s data practices and help consumers make informed choices about which services to entrust with their personal information. In practice, those privacy policies are typically long and complex documents that are largely ignored by consumers. Even for regulators and data protection authorities, privacy policies are difficult to assess at scale. Crowdsourcing offers the potential to scale the analysis of privacy policies with microtasks, for instance by assessing how specific data practices are addressed in privacy policies or extracting information about data practices of interest, which can then facilitate further analysis or be provided to users in more effective notice formats. Crowdsourcing the analysis of complex privacy policy documents to non-expert crowdworkers poses particular challenges. We discuss best practices, lessons learned and research challenges for crowdsourcing privacy policy analysis….(More)”

Directory of crowdsourcing websites


Directory by Donelle McKinley: “…Here is just a selection of websites for crowdsourcing cultural heritage. Websites are actively crowdsourcing unless indicated with an asterisk…The directory is organized by the type of crowdsourcing process involved, using the typology for crowdsourcing in the humanities developed by Dunn & Hedges (2012). In their study they explain that, “a process is a sequence of tasks, through which an output is produced by operating on an asset”. For example, the Your Paintings Tagger website is for the process of tagging, which is an editorial task. The assets being tagged are images, and the output of the project is metadata, which makes the images easier to discover, retrieve and curate.

Transcription

Alexander Research Library, Wanganui Library * (NZ) Transcription of index cards from 1840 to 2002.

Ancient Lives*, University of Oxford (UK) Transcription of papyri from Greco-Roman Egypt.

AnnoTate, Tate Britain (UK) Transcription of artists’ diaries, letters and sketchbooks.

Decoding the Civil War, The Huntington Library, Abraham Lincoln Presidential Library and Museum & North Carolina State University (USA). Transcription and decoding of Civil War telegrams from the Thomas T. Eckert Papers.

DIY History, University of Iowa Libraries (USA) Transcription of historical documents.

Emigrant City, New York Public Library (USA) Transcription of handwritten mortgage and bond ledgers from the Emigrant Savings Bank records.

Field Notes of Laurence M. Klauber, San Diego Natural History Museum (USA) Transcription of field notes by the celebrated herpetologist.

Notes from Nature Transcription of natural history museum records.

Measuring the ANZACs, Archives New Zealand and Auckland War Memorial Museum (NZ). Transcription of first-hand accounts of NZ soldiers in WW1.

Old Weather (UK) Transcription of Royal Navy ships logs from the early twentieth century.

Scattered Seeds, Heritage Collections, Dunedin Public Libraries (NZ) Transcription of index cards for Dunedin newspapers, 1851-1993.

Shakespeare’s World, Folger Shakespeare Library (USA) & Oxford University Press (UK). Transcription of handwritten documents by Shakespeare’s contemporaries. Identification of words that have yet to be recorded in the authoritative Oxford English Dictionary.

Smithsonian Digital Volunteers Transcription Center (USA) Transcription of multiple collections.

Transcribe Bentham, University College London (UK) Transcription of historical manuscripts by philosopher and reformer Jeremy Bentham.

What’s on the menu? New York Public Library (USA) Transcription of historical restaurant menus. …

(Full Directory).

Better research through video games


Simon Parkin at the New Yorker: “…it occurred to Szantner and Revaz that the tremendous amount of time and energy that people put into games could be co-opted in the name of human progress. That year, they founded Massively Multiplayer Online Science, a company that pairs game makers with scientists.

This past March, the first fruits of their conversation in Geneva appeared in EVE Online, a complex science-fiction game set in a galaxy composed of tens of thousands of stars and planets, and inhabited by half a million or so people from across the Internet, who explore and do battle daily. EVE was launched in 2003 by C.C.P., a studio based in Reykjavík, but players have only recently begun to contribute to scientific research. Their task is to assist with the Human Protein Atlas (H.P.A.), a Swedish-run effort to catalogue proteins and the genes that encode them, in both normal tissue and cancerous tumors. “Humans are, by evolution, very good at quickly recognizing patterns,” Emma Lundberg, the director of the H.P.A.’s Subcellular Atlas, a database of high-resolution images of fluorescently dyed cells, told me. “This is what we exploit in the game.”

The work, dubbed Project Discovery, fits snugly into EVE Online’s universe. At any point, players can take a break from their dogfighting, trading, and political machinations to play a simple game within the game, finding commonalities and differences between some thirteen million microscope images. In each one, the cell’s innards have been color-coded—blue for the nucleus (the cell’s brain), red for microtubules (the cell’s scaffolding), and green for anywhere that a protein has been detected. After completing a tutorial, players tag the image using a list of twenty-nine options, including “nucleus,” “cytoplasm,” and “mitochondria.” When enough players reach a consensus on a single image, it is marked as “solved” and handed off to the scientists at the H.P.A. “In terms of the pattern recognition and classification, it resembles what we are doing as researchers,” Lundberg said. “But the game interface is, of course, much cooler than our laboratory information-management system. I would love to work in-game only.”
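The hand-off rule (an image counts as "solved" once enough players agree on a tag) can be sketched as a vote-counting step. The thresholds and tags below are assumptions for illustration; C.C.P.'s actual consensus parameters are not spelled out in the article.

```python
from collections import Counter

def consensus(tags, min_votes=3, min_share=0.6):
    """Return the winning tag if agreement is strong enough, else None.

    Thresholds are illustrative guesses, not C.C.P.'s real parameters.
    """
    if not tags:
        return None
    tag, votes = Counter(tags).most_common(1)[0]
    if votes >= min_votes and votes / len(tags) >= min_share:
        return tag
    return None

# Four players tagged one image; three agree, so it is "solved".
solved = consensus(["nucleus", "nucleus", "cytoplasm", "nucleus"])
# Two players split evenly; the image stays open for more votes.
unsolved = consensus(["nucleus", "cytoplasm"])
```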

Rather than presenting the project as a worthy extracurricular activity, EVE Online’s designers have cast it as an extension of the game’s broader fiction. Players work for the Sisters of EVE, a religious humanitarian-aid organization, which rewards their efforts with virtual currency. This can be used to purchase items in the game, including a unique set of armor designed by one of C.C.P.’s artists, Andrei Cristea. (The armor is available only to players who participate in Project Discovery, and therefore, like a rare Coco Chanel frock, is desirable as much for its scarcity as for its design.) Insuring that the mini-game be thought of as more than a short-term novelty or diversion was an issue that Linzi Campbell, Project Discovery’s lead designer, considered carefully. “The hardest challenge has been turning the image-analysis process into a game that is strong enough to motivate the player to continue playing,” Campbell told me. “The fun comes from the feeling of mastery.”

Evidently, her efforts were successful. On the game’s first day of release, there were four hundred thousand submissions from players. According to C.C.P., some people have been so caught up in the task that they have played for fifteen hours without interruption. “EVE players turned out to be a perfect crowd for this type of citizen science,” Lundberg said. She anticipates that the first phase of the project will be completed this summer. If the work meets this target, players will be presented with more advanced images and tasks, such as the classification of protein patterns in complex tumor-tissue samples. Eventually, their efforts could aid in the development of new cancer drugs….(More)”

In Your Neighborhood, Who Draws the Map?


Lizzie MacWillie at NextCity: “…By crowdsourcing neighborhood boundaries, residents can put themselves on the map in critical ways.

Why does this matter? Neighborhoods are the smallest organizing element in any city. A strong city is made up of strong neighborhoods, where the residents can effectively advocate for their needs. A neighborhood boundary marks off a particular geography and calls out important elements within that geography: architecture, street fabric, public spaces and natural resources, to name a few. Putting that line on a page lets residents begin to identify needs and set priorities. Without boundaries, there’s no way to know where to start.

Knowing a neighborhood’s boundaries and unique features allows a group to list its assets. What buildings have historic significance? What shops and restaurants exist? It also helps highlight gaps: What’s missing? What does the neighborhood need more of? What is there already too much of? Armed with this detailed inventory, residents can approach a developer, city council member or advocacy group with hard numbers on what they know their neighborhood needs.

With a precisely defined geography, residents living in a food desert can point to developable vacant land that’s ideal for a grocery store. They can also cite how many potential grocery shoppers live within the neighborhood.
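Putting a number on "potential grocery shoppers within the neighborhood" is, at bottom, a point-in-polygon count. A minimal sketch using the standard ray-casting test, with an invented boundary and resident coordinates (real projects would use GIS tools and proper geographic coordinates):

```python
# Ray-casting point-in-polygon test over illustrative coordinates.

def point_in_polygon(x, y, polygon):
    """Is (x, y) inside the polygon (a list of (x, y) vertices)?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            # x-coordinate where this edge crosses the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

neighborhood = [(0, 0), (4, 0), (4, 3), (0, 3)]  # a toy boundary
residents = [(1, 1), (2, 2), (5, 1), (3.5, 2.9)]
shoppers = sum(point_in_polygon(x, y, neighborhood) for x, y in residents)
```

Once the boundary exists as data, the same test answers "how many residents live within the neighborhood" for any point data set you can obtain.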

In addition to being able to organize within the neighborhood, staking a claim to a neighborhood, putting it on a map and naming it, can help a neighborhood control its own narrative and tell its story — so someone else doesn’t.

Our neighborhood map project was started in part as a response to consistent misidentification of Dallas neighborhoods by local media, which appears to be particularly common in stories about majority-minority neighborhoods. This kind of oversight can contribute to a false narrative about a place, especially when the news is about crime or violence, and takes away from residents’ ability to tell their story and shape their neighborhood’s future. Even worse is when neighborhoods are completely left off of the map, as if they have no story at all to tell.

Neighborhood mapping can also counter narrative hijacking like I’ve seen in my hometown of Brooklyn, where realtor-driven neighborhood rebranding has led to areas being renamed. These places have their own unique identities and histories, yet longtime residents saw names changed so that real estate sellers could capitalize on increasing property values in adjacent trendy neighborhoods.

Cities across the country — including Dallas, Boston, New York, Chicago, Portland and Seattle — have crowdsourced mapping projects people can contribute to. For cities lacking such an effort, tools like Google Map Maker have been effective….(More)”.