Bridging data gaps for policymaking: crowdsourcing and big data for development


 for the DevPolicyBlog: “…By far the biggest innovation in data collection is the ability to access and analyse (in a meaningful way) user-generated data. This is data that is generated from forums, blogs, and social networking sites, where users purposefully contribute information and content in a public way, but also from everyday activities that inadvertently or passively provide data to those that are able to collect it.

User-generated data can help identify user views and behaviour to inform policy in a timely way rather than just relying on traditional data collection techniques (census, household surveys, stakeholder forums, focus groups, etc.), which are often cumbersome, very costly, untimely, and in many cases require some form of approval or support by government.

It might seem at first that user-generated data has limited usefulness in a development context due to the importance of the internet in generating this data combined with limited internet availability in many places. However, U-Report is one example of being able to access user-generated data independent of the internet.

U-Report was initiated by UNICEF Uganda in 2011 and is a free SMS based platform where Ugandans are able to register as “U-Reporters” and on a weekly basis give their views on topical issues (mostly related to health, education, and access to social services) or participate in opinion polls. As an example, Figure 1 shows the result from a U-Report poll on whether polio vaccinators came to U-Reporter houses to immunise all children under 5 in Uganda, broken down by districts. Presently, there are more than 300,000 U-Reporters in Uganda and more than one million U-Reporters across 24 countries that now have U-Report. As an indication of its potential impact on policymaking,UNICEF claims that every Member of Parliament in Uganda is signed up to receive U-Report statistics.

Figure 1: U-Report Uganda poll results

Figure 1: U-Report Uganda poll results

U-Report and other platforms such as Ushahidi (which supports, for example, I PAID A BRIBE, Watertracker, election monitoring, and crowdmapping) facilitate crowdsourcing of data where users contribute data for a specific purpose. In contrast, “big data” is a broader concept because the purpose of using the data is generally independent of the reasons why the data was generated in the first place.

Big data for development is a new phrase that we will probably hear a lot more (see here [pdf] and here). The United Nations Global Pulse, for example, supports a number of innovation labs which work on projects that aim to discover new ways in which data can help better decision-making. Many forms of “big data” are unstructured (free-form and text-based rather than table- or spreadsheet-based) and so a number of analytical techniques are required to make sense of the data before it can be used.

Measures of Twitter activity, for example, can be a real-time indicator of food price crises in Indonesia [pdf] (see Figure 2 below which shows the relationship between food-related tweet volume and food inflation: note that the large volume of tweets in the grey highlighted area is associated with policy debate on cutting the fuel subsidy rate) or provide a better understanding of the drivers of immunisation awareness. In these examples, researchers “text-mine” Twitter feeds by extracting tweets related to topics of interest and categorising text based on measures of sentiment (positive, negative, anger, joy, confusion, etc.) to better understand opinions and how they relate to the topic of interest. For example, Figure 3 shows the sentiment of tweets related to vaccination in Kenya over time and the dates of important vaccination related events.

Figure 2: Plot of monthly food-related tweet volume and official food price statistics

Figure 2: Plot of monthly food-related Tweet volume and official food price statistics

Figure 3: Sentiment of vaccine related tweets in Kenya

Figure 3: Sentiment of vaccine-related tweets in Kenya

Another big data example is the use of mobile phone usage to monitor the movement of populations in Senegal in 2013. The data can help to identify changes in the mobility patterns of vulnerable population groups and thereby provide an early warning system to inform humanitarian response effort.

The development of mobile banking too offers the potential for the generation of a staggering amount of data relevant for development research and informing policy decisions. However, it also highlights the public good nature of data collected by public and private sector institutions and the reliance that researchers have on them to access the data. Building trust and a reputation for being able to manage privacy and commercial issues will be a major challenge for researchers in this regard….(More)”