Global

Research in the Crowdsourcing Age, a Case Study

Curated on July 12, 2016 (updated October 9, 2018) by Stefaan Verhulst

Report by Paul Hitlin (Pew): “How scholars, companies and workers are using Mechanical Turk, a ‘gig economy’ platform, for tasks computers can’t handle

Digital age platforms are providing researchers the ability to outsource portions of their work – not just to increasingly intelligent machines, but also to a relatively low-cost online labor force comprised of humans. These so-called “online outsourcing” services help employers connect with a global pool of free-agent workers who are willing to complete a variety of specialized or repetitive tasks.

Because it provides access to large numbers of workers at relatively low cost, online outsourcing holds a particular appeal for academics and nonprofit research organizations – many of whom have limited resources compared with corporate America. For instance, Pew Research Center has experimented with using these services to perform tasks such as classifying documents and collecting website URLs. And a Google search of scholarly academic literature shows that more than 800 studies – ranging from medical research to social science – were published using data from one such platform, Amazon’s Mechanical Turk, in 2015 alone.¹

The rise of these platforms has also generated considerable commentary about the so-called “gig economy” and the possible impact it will have on traditional notions about the nature of work, the structure of compensation and the “social contract” between firms and workers. Pew Research Center recently explored some of the policy and employment implications of these new platforms in a national survey of Americans.

Proponents say this technology-driven innovation can offer employers – whether companies or academics – the ability to control costs by relying on a global workforce that is available 24 hours a day to perform relatively inexpensive tasks. They also argue that these arrangements offer workers the flexibility to work when and where they want to. On the other hand, some critics worry this type of arrangement does not give employees the same type of protections offered in more traditional work environments – while others have raised concerns about the quality and consistency of data collected in this manner.

A recent report from the World Bank found that the online outsourcing industry generated roughly $2 billion in 2013 and involved 48 million registered workers (though only 10% of them were considered “active”). By 2020, the report predicted, the industry will generate between $15 billion and $25 billion.

Amazon’s Mechanical Turk is one of the largest outsourcing platforms in the United States and has become particularly popular in the social science research community as a way to conduct inexpensive surveys and experiments. The platform has also become an emblem of the way that the internet enables new businesses and social structures to arise.

In light of its widespread use by the research community and overall prominence within the emerging world of online outsourcing, Pew Research Center conducted a detailed case study examining the Mechanical Turk platform in late 2015 and early 2016. The study utilizes three different research methodologies to examine various aspects of the Mechanical Turk ecosystem. These include human content analysis of the platform, a canvassing of Mechanical Turk workers and an analysis of third-party data.

The first goal of this research was to understand who uses the Mechanical Turk platform for research or business purposes, why they use it and who completes the work assignments posted there. To evaluate these issues, Pew Research Center performed a content analysis of the tasks posted on the site during the week of Dec. 7-11, 2015.

A second goal was to examine the demographics and experiences of the workers who complete the tasks appearing on the site. This is relevant not just to fellow researchers that might be interested in using the platform, but as a snapshot of one set of “gig economy” workers. To address these questions, Pew Research Center administered a nonprobability online survey of Turkers from Feb. 9-25, 2016, by posting a task on Mechanical Turk that rewarded workers for answering questions about their demographics and work habits. The sample of 3,370 workers contains any number of interesting findings, but it has its limits. This canvassing emerges from an opt-in sample of those who were active on MTurk during this particular period, who saw our survey and who had the time and interest to respond. It does not represent all active Turkers in this period or, more broadly, all workers on MTurk.

Finally, this report uses data collected by the online tool mturk-tracker, which is run by Dr. Panagiotis G. Ipeirotis of the New York University Stern School of Business, to examine the amount of activity occurring on the site. The mturk-tracker data are publicly available online, though the insights presented here have not been previously published elsewhere….(More)”

Postal big data: Global flows as proxy indicators for national wellbeing

Curated on July 11, 2016 (updated October 9, 2018) by Stefaan Verhulst

Data Driven Journalism: “A new project has developed an innovative means to approximate socioeconomic indicators by analyzing the network of international postal flows.

The project used 14 million aggregated electronic postal records from 187 countries collected by the Universal Postal Union over a four-year period (2010-2014) to create an international network showing the way post flows around the world.

In addition, the project builds upon previous research efforts using global flow networks, derived from the five following open data sources:

For each network, a country’s degree of connectivity for incoming and outgoing flows was quantified using the Jaccard coefficient and Spearman’s rank correlation coefficient….
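As a rough illustration of the two measures named above, the following sketch computes a Jaccard coefficient between two sets and a Spearman rank correlation between two lists. It is not the study's code: the function names, the country codes, and the degree and GDP figures are all hypothetical, chosen only to show how each statistic is defined.

```python
def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors.

    Assumes neither input is constant (otherwise the denominator is zero).
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical example data (not taken from the study).
postal_partners = {"DE", "FR", "US", "JP"}
trade_partners = {"DE", "US", "CN"}
overlap = jaccard(postal_partners, trade_partners)  # 2 shared of 5 total

network_degree = [10, 40, 25, 5]      # invented degrees for four countries
gdp_per_capita = [1.0, 9.0, 3.5, 0.8]  # invented indicator values
rho = spearman(network_degree, gdp_per_capita)
```

In this toy data the two lists rank the four countries identically, so `rho` comes out at (approximately) 1; a value near -1 would indicate that better-connected countries score lower on the indicator.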

To understand these connections in the context of socioeconomic indicators, the researchers then compared these positions to the values of GDP, Life expectancy, Corruption Perception Index, Internet penetration rate, Happiness index, Gini index, Economic Complexity Index, Literacy, Poverty, CO2 emissions, Fixed phone line penetration, Mobile phone users, and the Human Development Index.

Image: Spearman rank correlations between global flow network degrees and socioeconomic indicators (CC BY 4.0).

From this analysis, the researchers revealed that:

The best-performing degree, in terms of consistently high performance across indicators, is the global degree, suggesting that looking at how well connected a country is in the global multiplex can be more indicative of its socioeconomic profile as a whole than looking at single networks.
GDP per capita and life expectancy are most closely correlated with the global degree, closely followed by the postal, trade and IP weighted degrees – indicative of a relationship between national wealth and the flow of goods and information.
Similarly to GDP, the rate of poverty of a country is best represented by the global degree, followed by the postal degree. The negative correlation indicates that the more impoverished a country is, the less well connected it is to the rest of the world.
Low human development (high rank) is most highly negatively correlated with the global degree, followed by the postal, trade and IP degrees. This shows that high human development (low rank) is associated with high global connectivity and activity in terms of incoming and outgoing flows of information and goods. ….Read the full study here.”

Bridging data gaps for policymaking: crowdsourcing and big data for development

Curated on July 8, 2016 (updated August 3, 2018) by Stefaan Verhulst

Anthony Swan for the DevPolicyBlog: “…By far the biggest innovation in data collection is the ability to access and analyse (in a meaningful way) user-generated data. This is data that is generated from forums, blogs, and social networking sites, where users purposefully contribute information and content in a public way, but also from everyday activities that inadvertently or passively provide data to those that are able to collect it.

User-generated data can help identify user views and behaviour to inform policy in a timely way rather than just relying on traditional data collection techniques (census, household surveys, stakeholder forums, focus groups, etc.), which are often cumbersome, very costly, untimely, and in many cases require some form of approval or support by government.

It might seem at first that user-generated data has limited usefulness in a development context due to the importance of the internet in generating this data combined with limited internet availability in many places. However, U-Report is one example of being able to access user-generated data independent of the internet.

U-Report was initiated by UNICEF Uganda in 2011 and is a free SMS-based platform where Ugandans are able to register as “U-Reporters” and on a weekly basis give their views on topical issues (mostly related to health, education, and access to social services) or participate in opinion polls. As an example, Figure 1 shows the result from a U-Report poll on whether polio vaccinators came to U-Reporter houses to immunise all children under 5 in Uganda, broken down by districts. Presently, there are more than 300,000 U-Reporters in Uganda and more than one million U-Reporters across 24 countries that now have U-Report. As an indication of its potential impact on policymaking, UNICEF claims that every Member of Parliament in Uganda is signed up to receive U-Report statistics.

Figure 1: U-Report Uganda poll results

U-Report and other platforms such as Ushahidi (which supports, for example, I PAID A BRIBE, Watertracker, election monitoring, and crowdmapping) facilitate crowdsourcing of data where users contribute data for a specific purpose. In contrast, “big data” is a broader concept because the purpose of using the data is generally independent of the reasons why the data was generated in the first place.

Big data for development is a new phrase that we will probably hear a lot more (see here [pdf] and here). The United Nations Global Pulse, for example, supports a number of innovation labs which work on projects that aim to discover new ways in which data can help better decision-making. Many forms of “big data” are unstructured (free-form and text-based rather than table- or spreadsheet-based) and so a number of analytical techniques are required to make sense of the data before it can be used.

Measures of Twitter activity, for example, can be a real-time indicator of food price crises in Indonesia [pdf] (see Figure 2 below which shows the relationship between food-related tweet volume and food inflation: note that the large volume of tweets in the grey highlighted area is associated with policy debate on cutting the fuel subsidy rate) or provide a better understanding of the drivers of immunisation awareness. In these examples, researchers “text-mine” Twitter feeds by extracting tweets related to topics of interest and categorising text based on measures of sentiment (positive, negative, anger, joy, confusion, etc.) to better understand opinions and how they relate to the topic of interest. For example, Figure 3 shows the sentiment of tweets related to vaccination in Kenya over time and the dates of important vaccination related events.

Figure 2: Plot of monthly food-related tweet volume and official food price statistics

Figure 3: Sentiment of vaccine related tweets in Kenya
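The text-mining step described above (extract tweets on a topic, then categorise them by sentiment) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the method used in the cited studies: the keyword set, the tiny sentiment lexicon, the function names, and the sample tweets are all hypothetical, and real projects typically use far richer lexicons or trained classifiers.

```python
import re
from collections import Counter

# Hypothetical topic keywords and sentiment lexicon, for illustration only.
TOPIC_KEYWORDS = {"vaccine", "vaccination", "immunisation"}
SENTIMENT_LEXICON = {
    "positive": {"safe", "protect", "grateful"},
    "negative": {"afraid", "harm", "refuse"},
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def is_on_topic(text):
    """Keep a tweet only if it mentions at least one topic keyword."""
    return any(tok in TOPIC_KEYWORDS for tok in tokenize(text))


def classify_sentiment(text):
    """Label a tweet by the lexicon category with the most matching words.

    Returns "neutral" when no lexicon word matches or categories tie.
    """
    tokens = tokenize(text)
    scores = {
        label: sum(tok in words for tok in tokens)
        for label, words in SENTIMENT_LEXICON.items()
    }
    top = max(scores.values())
    winners = [label for label, score in scores.items() if score == top]
    if top == 0 or len(winners) > 1:
        return "neutral"
    return winners[0]


def sentiment_counts(tweets):
    """Count sentiment labels over the on-topic tweets only."""
    return Counter(classify_sentiment(t) for t in tweets if is_on_topic(t))


# Invented sample tweets.
sample = [
    "The vaccine is safe and will protect my kids",
    "I am afraid the vaccination will harm her",
    "Traffic is terrible today",
    "Vaccination clinic opens Monday",
]
counts = sentiment_counts(sample)
```

Grouping such counts by day or month would give a time series of the kind plotted in the sentiment figure; the off-topic tweet is dropped before any sentiment is assigned.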

Another big data example is the use of mobile phone usage to monitor the movement of populations in Senegal in 2013. The data can help to identify changes in the mobility patterns of vulnerable population groups and thereby provide an early warning system to inform humanitarian response effort.

The development of mobile banking too offers the potential for the generation of a staggering amount of data relevant for development research and informing policy decisions. However, it also highlights the public good nature of data collected by public and private sector institutions and the reliance that researchers have on them to access the data. Building trust and a reputation for being able to manage privacy and commercial issues will be a major challenge for researchers in this regard….(More)”

Intermediation in Open Development

Curated on July 4, 2016 (updated May 29, 2019) by Stefaan Verhulst

Katherine M. A. Reilly and Juan P. Alperin at Global Media Journal: “Open Development (OD) is a subset of ICT4D that studies the potential of IT-enabled openness to support social change among poor or marginalized populations. Early OD work examined the potential of IT-enabled openness to decentralize power and enable public engagement by disintermediating knowledge production and dissemination. However, in practice, intermediaries have emerged to facilitate open data and related knowledge production activities in development processes. We identify five models of intermediation in OD work: decentralized, arterial, ecosystem, bridging, and communities of practice and examine the implications of each for stewardship of open processes. We conclude that studying OD through these five forms of intermediation is a productive way of understanding whether and how different patterns of knowledge stewardship influence development outcomes. We also offer suggestions for future research that can improve our understanding of how to sustain openness, facilitate public engagement, and ensure that intermediation contributes to open development….(More)”

Transforming governance: how can technology help reshape democracy?

Curated on June 19, 2016 (updated August 3, 2018) by Stefaan Verhulst

Research Briefing by Matt Leighninger: “Around the world, people are asking how we can make democracy work in new and better ways. We are frustrated by political systems in which voting is the only legitimate political act, concerned that many republics don’t have the strength or appeal to withstand authoritarian figures, and disillusioned by the inability of many countries to address the fundamental challenges of health, education and economic development.

We can no longer assume that the countries of the global North have ‘advanced’ democracies, and that the nations of the global South simply need to catch up. Citizens of these older democracies have increasingly lost faith in their political institutions; Northerners cherish their human rights and free elections, but are clearly looking for something more. Meanwhile, in the global South, new regimes based on a similar formula of rights and elections have proven fragile and difficult to sustain. And in Brazil, India and other Southern countries, participatory budgeting and other valuable democratic innovations have emerged. The stage is set for a more equitable, global conversation about what we mean by democracy.

How can we adjust our democratic formulas so that they are more sustainable, powerful, fulfilling – and, well, democratic? Some of the parts of this equation may come from the development of online tools and platforms that help people to engage with their governments, with organisations and institutions, and with each other. Often referred to collectively as ‘civic technology’ or ‘civic tech’, these tools can help us map public problems, help citizens generate solutions, gather input for government, coordinate volunteer efforts, and help neighbours remain connected. If we want to create democracies in which citizens have meaningful roles in shaping public decisions and solving public problems, we should be asking a number of questions about civic tech, including:

How can online tools best support new forms of democracy?
What are the examples of how this has happened?
What are some variables to consider in comparing these examples?
How can we learn from each other as we move forward?

This background note has been developed to help democratic innovators explore these questions and examine how their work can provide answers….(More)”

The Seductions of Quantification: Measuring Human Rights, Gender Violence, and Sex Trafficking

Curated on June 17, 2016 (updated August 3, 2018) by Stefaan Verhulst

Book by Sally Engle Merry: “We live in a world where seemingly everything can be measured. We rely on indicators to translate social phenomena into simple, quantified terms, which in turn can be used to guide individuals, organizations, and governments in establishing policy. Yet counting things requires finding a way to make them comparable. And in the process of translating the confusion of social life into neat categories, we inevitably strip it of context and meaning—and risk hiding or distorting as much as we reveal.

With The Seductions of Quantification, leading legal anthropologist Sally Engle Merry investigates the techniques by which information is gathered and analyzed in the production of global indicators on human rights, gender violence, and sex trafficking. Although such numbers convey an aura of objective truth and scientific validity, Merry argues persuasively that measurement systems constitute a form of power by incorporating theories about social change in their design but rarely explicitly acknowledging them. For instance, the US State Department’s Trafficking in Persons Report, which ranks countries in terms of their compliance with antitrafficking activities, assumes that prosecuting traffickers as criminals is an effective corrective strategy—overlooking cultures where women and children are frequently sold by their own families. As Merry shows, indicators are indeed seductive in their promise of providing concrete knowledge about how the world works, but they are implemented most successfully when paired with context-rich qualitative accounts grounded in local knowledge….(More)”.

Estonia Is Demonstrating How Government Should Work in a Digital World

Curated on June 13, 2016 (updated August 3, 2018) by Stefaan Verhulst

Motherboard: “In May, Manu Sporny became the 10,000th “e-Resident” of Estonia. Sporny, the founder and CEO of a digital payments and identity company located in the United States, has never set foot in Estonia. However, he heard about the country’s e-Residency program and decided it would be an obvious choice for his company’s European headquarters.

People like Sporny are why Estonia launched a digital residency program in December 2014. The program allows anyone in the world to apply for a digital identity, which will let them: establish and run a location-independent business online, get easier access to EU markets, open a bank account and conduct e-banking, use international payment service providers, declare taxes, and sign all relevant documents and contracts remotely….

One of the most essential components of a functioning digital society is a secure digital identity. The state and the private sector need to know who is accessing these online services. Likewise, users need to feel secure that their identity is protected.

Estonia found the solution to this problem. In 2002, we started issuing residents a mandatory ID-card with a chip that empowers them to categorically identify themselves and verify legal transactions and documents through a digital signature. A digital signature has been legally equivalent to a handwritten one throughout the European Union—not just in Estonia—since 1999.

With this new digital identity system, the state could serve not only areas with a low population, but also the entire Estonian diaspora. Estonians anywhere in the world could maintain a connection to their homeland via e-services, contribute to the legislative process, and even participate in elections. Once the government realized that it could scale this service worldwide, it seemed logical to offer its e-services to those without physical residency in Estonia. This meant the Estonian country suddenly had value as a service in addition to a place to live.

What does “Country as a Service” mean?

With the rise of a global internet, we’ve seen more skilled workers and businesspeople offering their services across nations, regardless of their physical location. A survey by Intuit estimates that this number will reach 40 percent in the US alone by 2020.

These entrepreneurs and skilled artisans are ultimately looking for the simplest way to create and maintain a legal, global identity as an outlet for their global offerings.

They look to other countries, not because they are looking for a tax haven, but because they have been prevented from incorporating and maintaining a business, due to barriers from their own government.

The most important thing for these entrepreneurs is that the creation and upkeep of the company is easy and hassle-free. It is also important that, despite being incorporated in a different nation, they remain honest taxpayers within their country of physical residence.

This is exactly what Estonia offers—a location-independent, hassle-free and fully-digital economic and financial environment where entrepreneurs can run their own company globally….

When an e-Resident establishes a company, it means that the company will likely start using the services offered by other Estonian companies (like creating a bank account, partnering with a payment service provider, seeking assistance from accountants, auditors and lawyers). As more clients are created for Estonian companies, their growth potential increases, along with the growth potential of the Estonian economy.

Eventually, there will be more residents outside borders than inside them

If states fail to redesign and simplify the machinery of bureaucracy and make it location-independent, there will be an opportunity for countries that can offer such services across borders.

Estonia has learned that it’s incredibly important in a small state to serve primarily small and micro businesses. In order to sustain a nation on this, we must automate and digitize processes to scale. Estonia’s model, for instance, is location-independent, making it simple to scale successfully. We hope to acquire at least 10 million digital residents (e-Residents) in a way that is mutually beneficial for the nation-states where these people are tax residents….(More)”

Open access: All human knowledge is there—so why can’t everybody access it?

Curated on June 12, 2016 (updated August 3, 2018) by Stefaan Verhulst

Glyn Moody at ArsTechnica: “In 1836, Anthony Panizzi, who later became principal librarian of the British Museum, gave evidence before a parliamentary select committee. At that time, he was only first assistant librarian, but even then he had an ambitious vision for what would one day become the British Library. He told the committee:

I want a poor student to have the same means of indulging his learned curiosity, of following his rational pursuits, of consulting the same authorities, of fathoming the most intricate inquiry as the richest man in the kingdom, as far as books go, and I contend that the government is bound to give him the most liberal and unlimited assistance in this respect.

He went some way to achieving that goal of providing general access to human knowledge. In 1856, after 20 years of labour as Keeper of Printed Books, he had helped boost the British Museum’s collection to over half a million books, making it the largest library in the world at the time. But there was a serious problem: to enjoy the benefits of those volumes, visitors needed to go to the British Museum in London.

Imagine, for a moment, if it were possible to provide access not just to those books, but to all knowledge for everyone, everywhere—the ultimate realisation of Panizzi’s dream. In fact, we don’t have to imagine: it is possible today, thanks to the combined technologies of digital texts and the Internet. The former means that we can make as many copies of a work as we want, for vanishingly small cost; the latter provides a way to provide those copies to anyone with an Internet connection. The global rise of low-cost smartphones means that group will soon include even the poorest members of society in every country.

That is to say, we have the technical means to share all knowledge, and yet we are nowhere near providing everyone with the ability to indulge their learned curiosity as Panizzi wanted it.

What’s stopping us? That’s the central question that the “open access” movement has been asking, and trying to answer, for the last two decades. Although tremendous progress has been made, with more knowledge freely available now than ever before, there are signs that open access is at a critical point in its development, which could determine whether it will ever succeed in realising Panizzi’s plan.

Connect the corporate dots to see true transparency

Curated on June 10, 2016 (updated August 15, 2018) by Stefaan Verhulst

Gillian Tett at the Financial Times: “…In all this, a crucial point is often forgotten: simply amassing data will not solve the problem of transparency. What is also needed is a way for analysts to track the connections that exist between companies scattered across different national jurisdictions.

There are more than 45,000 companies listed on global stock exchanges and, according to Chris Taggart of OpenCorporates, an independent data company, there are between 250m and 400m unlisted groups. Many of these are listed on national registries but, since registries are extremely fragmented, it is very difficult for shareholders or regulators to form a complete picture of company activity.

It also creates financial stability risks. One reason why it is currently hard to track the scale of Chinese corporate debt, say, is that it is being issued by an opaque web of legal entities. Similarly, regulators struggled to cope with the fallout from the Lehman Brothers collapse in 2008 because the bank was operating almost 3,000 different legal entities around the world.

Is there a solution to this? A good place to start would be for governments to put their corporate registries online. Another crucial step would be for governments and companies to agree on a common standard for labelling legal entities, so that these can be tracked across borders.

Happily, work on that has begun: in 2014, the Global Legal Entity Identifier Foundation was created. It supports the implementation and use of “legal entity identifiers”, a data standard that identifies participants in financial transactions. Groups such as the Data Coalition in Washington DC are lobbying for laws that would force companies to use LEIs…. However, this inter-governmental project is moving so slowly that the private sector may be a better bet. In recent years, companies such as Dun & Bradstreet have begun to amass proprietary information about complex corporate webs, and computer nerds are also starting to use the power of big data to join up the corporate dots in a public format.

OpenCorporates is a good example. Over the past five years, a dozen staff there have been painstakingly scraping national corporate registries to create a database designed to show how companies are connected around the world. This is far from complete but data from 100m entities have already been logged. And in the wake of the Panama Papers, more governments are coming on board — data from the Cayman Islands are currently being added and France is likely to collaborate soon.

Sadly, these moves will not deliver real transparency straight away. If you type “MIO” into the search box on the OpenCorporates website, you will not see a map of all of McKinsey’s activities — at least not yet.

The good news, however, is that with every data scrape, or use of an LEI, the picture of global corporate activity is becoming slightly less opaque thanks to the work of a hidden army of geeks. They deserve acclaim and support — even (or especially) from management consultants….(More)”