Human Factors in Big Data


Dossier by Jenny de Boer, Marc Steen, Maurice van Beurden, Alexander Toet, Susanne Tak, Jan van Erp and Ward Venrooij: “Much attention in Big Data is given to the technical challenges that arise. However, Human Factors challenges and opportunities are also relevant. Big Data provides human factors specialists with a new and rich way of collecting and analysing data on human factors patterns. Big Data also has implications for how human factors knowledge can be applied (and developed) to build Big Data services that people are willing to use. How people are critical to the success of big data: this first article highlights the role of people in developing Data Driven Innovations. Visualising Uncertainty: in order to interpret Big Data, visualisation is of great importance; this article gives insight into how to visualise predictive information based on Big Data. Measuring dashboard performance: the third article describes the development of a framework that can be used to develop and test the quality of data presentation in dashboards…(More)”
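
To make the dossier's "visualising uncertainty" theme concrete, here is a minimal sketch (not from the dossier itself) of one common approach: plotting a point forecast together with a shaded confidence band. All numbers are synthetic, invented purely for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic example: a forecast whose uncertainty grows with the horizon.
rng = np.random.default_rng(42)
days = np.arange(30)
forecast = 100 + 0.8 * days + rng.normal(0, 1, size=days.size).cumsum()
std_err = 2 + 0.5 * days  # assumed: standard error widens further ahead

fig, ax = plt.subplots()
ax.plot(days, forecast, label="point forecast")
# Shade the 95% interval so the reader sees the uncertainty, not just the line.
ax.fill_between(days, forecast - 1.96 * std_err, forecast + 1.96 * std_err,
                alpha=0.3, label="95% interval")
ax.set_xlabel("days ahead")
ax.set_ylabel("predicted value")
ax.legend()
plt.show()
```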

Smart Cities – International Case Studies


“These case studies were developed by the Inter-American Development Bank (IDB), in association with the Korea Research Institute for Human Settlements (KRIHS).

Anyang, Korea Anyang, a city of 600,000 near Seoul, is gaining international recognition for its smart city project, which has been implemented incrementally since 2003. The initiative began with a Bus Information System to enhance citizens’ convenience and has since expanded into a wider Intelligent Transport System as well as integrated crime and disaster prevention. Anyang is considered a smart city benchmark, earned a 2012 Presidential Award in Korea, and receives a large number of international visits. Anyang’s Integrated Operation and Control Center (IOCC) acts as the platform that gathers, analyzes and distributes information for mobility, disaster management and crime. Anyang is currently utilizing big data for policy development and is continuing its endeavor to expand its smart city services into areas such as waste and air quality management. Download Anyang case study

Medellín, Colombia Medellin is a city that went from being known for its security problems to being an international reference point for technological and social innovation, urban transformation, equity, and citizen participation. This report shows how Medellin has implemented a series of strategies that have made it a smart city that is developing capacity and organic structure in the entities that manage mobility, the environment, and security. In addition, these initiatives have created mechanisms to communicate and interact with citizens in order to promote continuous improvement of smart services.

Through the program “MDE: Medellin Smart City,” Medellin is implementing projects to create free Internet access zones, community centers, the Mi-Medellin co-creation portal, open data, online transactions, and other services. Another strategy is the creation of the Smart Mobility System which, through the use of technology, has reduced the number of accidents, improved mobility, and cut incident response times. Download Medellin case study

Namyangju, Korea

Orlando, U.S.

Pangyo, Korea

Rio de Janeiro, Brazil… 

Santander, Spain

Singapore

Songdo, Korea

Tel Aviv, Israel(More)”

Bridging data gaps for policymaking: crowdsourcing and big data for development


From the DevPolicyBlog: “…By far the biggest innovation in data collection is the ability to access and analyse (in a meaningful way) user-generated data. This is data that is generated from forums, blogs, and social networking sites, where users purposefully contribute information and content in a public way, but also from everyday activities that inadvertently or passively provide data to those that are able to collect it.

User-generated data can help identify user views and behaviour to inform policy in a timely way rather than just relying on traditional data collection techniques (census, household surveys, stakeholder forums, focus groups, etc.), which are often cumbersome, very costly, untimely, and in many cases require some form of approval or support by government.

It might seem at first that user-generated data has limited usefulness in a development context due to the importance of the internet in generating this data combined with limited internet availability in many places. However, U-Report is one example of being able to access user-generated data independent of the internet.

U-Report was initiated by UNICEF Uganda in 2011 and is a free SMS-based platform where Ugandans are able to register as “U-Reporters” and, on a weekly basis, give their views on topical issues (mostly related to health, education, and access to social services) or participate in opinion polls. As an example, Figure 1 shows the results of a U-Report poll on whether polio vaccinators came to U-Reporters’ houses to immunise all children under 5 in Uganda, broken down by district. Presently, there are more than 300,000 U-Reporters in Uganda and more than one million U-Reporters across the 24 countries that now have U-Report. As an indication of its potential impact on policymaking, UNICEF claims that every Member of Parliament in Uganda is signed up to receive U-Report statistics.

Figure 1: U-Report Uganda poll results
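
The district-level breakdown behind a poll like the one in Figure 1 is, at its core, a tally of coded SMS replies per district. Below is a minimal sketch of that aggregation step; the district names, answers, and data layout are invented for illustration and are not U-Report's actual data model.

```python
from collections import Counter, defaultdict

# Hypothetical replies: (district, answer) pairs parsed from SMS texts.
replies = [
    ("Gulu", "yes"), ("Gulu", "no"), ("Kampala", "yes"),
    ("Kampala", "yes"), ("Mbale", "no"),
]

# Tally answers per district.
by_district = defaultdict(Counter)
for district, answer in replies:
    by_district[district][answer] += 1

# Share of "yes" responses per district, as in the Figure 1 breakdown.
for district, counts in sorted(by_district.items()):
    total = sum(counts.values())
    print(f"{district}: {counts['yes'] / total:.0%} yes ({total} replies)")
```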

U-Report and other platforms such as Ushahidi (which supports, for example, I PAID A BRIBE, Watertracker, election monitoring, and crowdmapping) facilitate crowdsourcing of data where users contribute data for a specific purpose. In contrast, “big data” is a broader concept because the purpose of using the data is generally independent of the reasons why the data was generated in the first place.

Big data for development is a new phrase that we will probably hear a lot more (see here [pdf] and here). The United Nations Global Pulse, for example, supports a number of innovation labs which work on projects that aim to discover new ways in which data can help better decision-making. Many forms of “big data” are unstructured (free-form and text-based rather than table- or spreadsheet-based) and so a number of analytical techniques are required to make sense of the data before it can be used.

Measures of Twitter activity, for example, can be a real-time indicator of food price crises in Indonesia [pdf] (see Figure 2 below which shows the relationship between food-related tweet volume and food inflation: note that the large volume of tweets in the grey highlighted area is associated with policy debate on cutting the fuel subsidy rate) or provide a better understanding of the drivers of immunisation awareness. In these examples, researchers “text-mine” Twitter feeds by extracting tweets related to topics of interest and categorising text based on measures of sentiment (positive, negative, anger, joy, confusion, etc.) to better understand opinions and how they relate to the topic of interest. For example, Figure 3 shows the sentiment of tweets related to vaccination in Kenya over time and the dates of important vaccination related events.

Figure 2: Plot of monthly food-related tweet volume and official food price statistics

Figure 3: Sentiment of vaccine-related tweets in Kenya
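
As a rough illustration of the text-mining step described above, here is a toy lexicon-based version of the filter-and-score pipeline: keep tweets mentioning the topic, then score each by summing word sentiment. The keywords and lexicon are invented; real studies use trained classifiers and much richer lexicons.

```python
# Toy lexicon-based sentiment tagging of tweets. Keywords and lexicon are
# invented examples, not those used in the studies cited above.
TOPIC_KEYWORDS = {"rice", "fuel", "price", "beras"}   # filter step
LEXICON = {"cheap": 1, "good": 1, "expensive": -1, "angry": -2, "crisis": -2}

def relevant(tweet: str) -> bool:
    """Keep only tweets mentioning the topic of interest."""
    return bool(set(tweet.lower().split()) & TOPIC_KEYWORDS)

def sentiment(tweet: str) -> int:
    """Sum word-level sentiment scores; >0 positive, <0 negative."""
    return sum(LEXICON.get(w, 0) for w in tweet.lower().split())

tweets = [
    "rice price is crazy expensive this week",
    "fuel subsidy debate makes everyone angry",
    "lovely weather today",                      # filtered out: off-topic
]
for t in tweets:
    if relevant(t):
        print(f"{sentiment(t):+d}  {t}")
```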

Another big data example is the use of mobile phone records to monitor the movement of populations in Senegal in 2013. The data can help to identify changes in the mobility patterns of vulnerable population groups and thereby provide an early warning system to inform humanitarian response efforts.
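
One simplified way such mobility analyses work (the Senegal study's actual methodology is not detailed here) is to reduce call-detail records to a modal cell tower per subscriber per day and then watch for aggregate shifts. A minimal sketch with synthetic records:

```python
from collections import Counter

# Hypothetical call-detail records: (subscriber_id, day, cell_tower_id).
cdrs = [
    ("a", 1, "T1"), ("a", 1, "T1"), ("a", 2, "T9"),   # subscriber "a" moved
    ("b", 1, "T2"), ("b", 2, "T2"),
]

def modal_tower(records, day):
    """Most-used tower per subscriber on a given day (a crude 'location')."""
    towers = {}
    for sub, d, tower in records:
        if d == day:
            towers.setdefault(sub, Counter())[tower] += 1
    return {sub: c.most_common(1)[0][0] for sub, c in towers.items()}

day1, day2 = modal_tower(cdrs, 1), modal_tower(cdrs, 2)
movers = [s for s in day1 if s in day2 and day1[s] != day2[s]]
print(f"{len(movers)} of {len(day1)} subscribers changed location")  # 1 of 2
```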

The development of mobile banking, too, offers the potential to generate a staggering amount of data relevant for development research and for informing policy decisions. However, it also highlights the public-good nature of data collected by public and private sector institutions and the reliance of researchers on those institutions for access to the data. Building trust and a reputation for being able to manage privacy and commercial issues will be a major challenge for researchers in this regard…(More)”

Priorities for the National Privacy Research Strategy


James Kurose and Keith Marzullo at the White House: “Vast improvements in computing and communications are creating new opportunities for improving life and health, eliminating barriers to education and employment, and enabling advances in many sectors of the economy. The promise of these new applications frequently comes from their ability to create, collect, process, and archive information on a massive scale.

However, the rapid increase in the quantity of personal information that is being collected and retained, combined with our increased ability to analyze and combine it with other information, is creating concerns about privacy. When information about people and their activities can be collected, analyzed, and repurposed in so many ways, it can create new opportunities for crime, discrimination, inadvertent disclosure, embarrassment, and harassment.

This Administration has been a strong champion of initiatives to improve the state of privacy, such as the “Consumer Privacy Bill of Rights” proposal and the creation of the Federal Privacy Council. Similarly, the White House report Big Data: Seizing Opportunities, Preserving Values highlights the need for large-scale privacy research, stating: “We should dramatically increase investment for research and development in privacy-enhancing technologies, encouraging cross-cutting research that involves not only computer science and mathematics, but also social science, communications and legal disciplines.”

Today, we are pleased to release the National Privacy Research Strategy. Research agencies across government participated in the development of the strategy, reviewing existing Federal research activities in privacy-enhancing technologies, soliciting inputs from the private sector, and identifying priorities for privacy research funded by the Federal Government. The National Privacy Research Strategy calls for research along a continuum of challenges, from how people understand privacy in different situations and how their privacy needs can be formally specified, to how these needs can be addressed, to how to mitigate and remediate the effects when privacy expectations are violated. This strategy proposes the following priorities for privacy research:

  • Foster a multidisciplinary approach to privacy research and solutions;
  • Understand and measure privacy desires and impacts;
  • Develop system design methods that incorporate privacy desires, requirements, and controls;
  • Increase transparency of data collection, sharing, use, and retention;
  • Assure that information flows and use are consistent with privacy rules;
  • Develop approaches for remediation and recovery; and
  • Reduce privacy risks of analytical algorithms.

With this strategy, our goal is to produce knowledge and technology that will enable individuals, commercial entities, and the Federal Government to benefit from technological advancements and data use while proactively identifying and mitigating privacy risks. Following the release of this strategy, we are also launching a Federal Privacy R&D Interagency Working Group, which will lead the coordination of the Federal Government’s privacy research efforts. Among the group’s first public activities will be to host a workshop to discuss the strategic plan and explore directions of follow-on research. It is our hope that this strategy will also inspire parallel efforts in the private sector….(More)”

Open Data in Southeast Asia


Book by Manuel Stagars: “This book explores the power of greater openness, accountability, and transparency in digital information and government data for the nations of Southeast Asia. The author demonstrates that, although the term “open data” seems to be self-explanatory, it involves an evolving ecosystem of complex domains. Through empirical case studies, this book explains how governments in the ASEAN may harvest the benefits of open data to maximize their productivity, efficiency and innovation. The book also investigates how increasing digital divides in the population, boundaries to civil society, and shortfalls in civil and political rights threaten to arrest open data in early development, which may hamper post-2015 development agendas in the region. With robust open data policies and clear roadmaps, member states of the ASEAN can seize the promising opportunities of open data in their particular developmental, institutional and legal settings. Governments, policy makers, entrepreneurs and academics will gain a clearer understanding of the factors that enable open data from this timely research…(More)”

Intermediation in Open Development


Katherine M. A. Reilly and Juan P. Alperin at Global Media Journal: “Open Development (OD) is a subset of ICT4D that studies the potential of IT-enabled openness to support social change among poor or marginalized populations. Early OD work examined the potential of IT-enabled openness to decentralize power and enable public engagement by disintermediating knowledge production and dissemination. However, in practice, intermediaries have emerged to facilitate open data and related knowledge production activities in development processes. We identify five models of intermediation in OD work (decentralized, arterial, ecosystem, bridging, and communities of practice) and examine the implications of each for stewardship of open processes. We conclude that studying OD through these five forms of intermediation is a productive way of understanding whether and how different patterns of knowledge stewardship influence development outcomes. We also offer suggestions for future research that can improve our understanding of how to sustain openness, facilitate public engagement, and ensure that intermediation contributes to open development…(More)”

Due Diligence? We need an app for that


Ken Banks at kiwanja.net: “The ubiquity of mobile phones, the reach of the Internet, the sheer number of problems facing the planet, competitions and challenges galore, pots of money and strong media interest in tech-for-good projects have today created the perfect storm. Not a day goes by without the release of an app hoping to solve something, and the fact that so many people are building so many apps to fix so many problems can only be a good thing. Right?

The only problem is this. It’s become impossible to tell good from bad, even real from fake. It’s something of a Wild West out there. So it was no surprise to see this happening recently. Quoting The Guardian:

An app which purported to offer aid to refugees lost in the Mediterranean has been pulled from Apple’s App Store after it was revealed as a fake. The I Sea app, which also won a Bronze medal at the Cannes Lions conference on Monday night, presented itself as a tool to help report refugees lost at sea, using real-time satellite footage to identify boats in trouble and highlighting their location to the Malta-based Migrant Offshore Aid Station (Moas), which would provide help.

In fact, the app did nothing of the sort. Rather than presenting real-time satellite footage – a difficult and expensive task – it instead simply shows a portion of a static, unchanging image. And while it claims to show the weather in the southern Mediterranean, that too isn’t that accurate: it’s for Western Libya.

The worry isn’t only that someone would decide to build a fake app which ‘tackles’ such an emotive subject, but the fact that this particular app won an award and received favourable press. Wired, Mashable, the Evening Standard and Reuters all spoke positively about it. Did no-one check that it did what it said it did?

This whole episode reminds me of something Joel Selanikio wrote in his contributing chapter to two books I’ve recently edited and published. In his chapters, which touch on his work on the Magpi data collection tool in addition to some of the challenges facing the tech-for-development community, Joel wrote:

In going over our user activity logs for the online Magpi app, I quickly realised that no-one from any of our funding organisations was listed. Apparently no-one who was paying us had ever seen our working software! This didn’t seem to make sense. Who would pay for software without ever looking at it? And if our funders hadn’t seen the software, what information were they using when they decided whether to fund us each year?

…The sheer number of apps available that claim to solve all manner of problems may seem encouraging on the surface – 1,500 (and counting) to help refugees might be a case in point – but how many are useful? How many are being used? How many solve a problem? And how many are real?

Due diligence? Maybe it’s time we had an app for that…(More)”

The Perils of Using Technology to Solve Other People’s Problems


Ethan Zuckerman in The Atlantic: “I found Shane Snow’s essay on prison reform — “How Soylent and Oculus Could Fix the Prison System” — through hate-linking….

Some of my hate-linking friends began their eye-rolling about Snow’s article with the title, which references two of Silicon Valley’s most hyped technologies. With the current focus on the U.S. as an “innovation economy,” it’s common to read essays predicting the end of a major social problem due to a technical innovation. Bitcoin will end poverty in the developing world by enabling inexpensive money transfers. Wikipedia and One Laptop Per Child will educate the world’s poor without need for teachers or schools. Self-driving cars will obviate public transport and reshape American cities.

The writer Evgeny Morozov has offered a sharp and helpful critique of this mode of thinking, which he calls “solutionism.” Solutionism demands that we focus on problems that have “a nice and clean technological solution at our disposal.” In his book, To Save Everything, Click Here, Morozov savages ideas like Snow’s, regardless of whether they are meant as thought experiments or serious policy proposals. (Indeed, one worry I have in writing this essay is taking Snow’s ideas too seriously, as Morozov does with many of the ideas he lambastes in his book.)

The problem with the solutionist critique, though, is that it tends to remove technological innovation from the problem-solver’s toolkit. In fact, technological development is often a key component in solving complex social and political problems, and new technologies can sometimes open up a previously intractable problem. The rise of inexpensive solar panels may be an opportunity to move nations away from a dependency on fossil fuels and begin lowering atmospheric levels of carbon dioxide, much as developments in natural gas extraction and transport technologies have lessened the use of dirtier fuels like coal.

But it’s rare that technology provides a robust solution to a social problem by itself. Successful technological approaches to solving social problems usually require changes in laws and norms, as well as market incentives to make change at scale….

Design philosophies like participatory design and codesign bring this concept to the world of technology, demanding that technologies designed for a group of people be designed and built, in part, by those people. Codesign challenges many of the assumptions of engineering, requiring people who are used to working in isolation to build broad teams and to understand that those most qualified to offer a technical solution may be least qualified to identify a need or articulate a design problem. This method is hard and frustrating, but it’s also one of the best ways to ensure that you’re solving the right problem, rather than imposing your preferred solution on a situation…(More)”

Better research through video games


Simon Parkin at the New Yorker: “…it occurred to Szantner and Revaz that the tremendous amount of time and energy that people put into games could be co-opted in the name of human progress. That year, they founded Massively Multiplayer Online Science, a company that pairs game makers with scientists.

This past March, the first fruits of their conversation in Geneva appeared in EVE Online, a complex science-fiction game set in a galaxy composed of tens of thousands of stars and planets, and inhabited by half a million or so people from across the Internet, who explore and do battle daily. EVE was launched in 2003 by C.C.P., a studio based in Reykjavík, but players have only recently begun to contribute to scientific research. Their task is to assist with the Human Protein Atlas (H.P.A.), a Swedish-run effort to catalogue proteins and the genes that encode them, in both normal tissue and cancerous tumors. “Humans are, by evolution, very good at quickly recognizing patterns,” Emma Lundberg, the director of the H.P.A.’s Subcellular Atlas, a database of high-resolution images of fluorescently dyed cells, told me. “This is what we exploit in the game.”

The work, dubbed Project Discovery, fits snugly into EVE Online’s universe. At any point, players can take a break from their dogfighting, trading, and political machinations to play a simple game within the game, finding commonalities and differences between some thirteen million microscope images. In each one, the cell’s innards have been color-coded—blue for the nucleus (the cell’s brain), red for microtubules (the cell’s scaffolding), and green for anywhere that a protein has been detected. After completing a tutorial, players tag the image using a list of twenty-nine options, including “nucleus,” “cytoplasm,” and “mitochondria.” When enough players reach a consensus on a single image, it is marked as “solved” and handed off to the scientists at the H.P.A. “In terms of the pattern recognition and classification, it resembles what we are doing as researchers,” Lundberg said. “But the game interface is, of course, much cooler than our laboratory information-management system. I would love to work in-game only.”
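
The "enough players reach a consensus" step can be pictured as a simple vote-threshold rule. The sketch below is an illustration only; the thresholds are invented, and C.C.P.'s actual aggregation logic is not described in the excerpt.

```python
from collections import Counter

# Player tags for one microscope image, drawn from the game's label options.
votes = ["nucleus", "nucleus", "cytoplasm", "nucleus", "nucleus"]

MIN_VOTES = 5        # invented threshold: how many players must weigh in
MIN_AGREEMENT = 0.7  # invented threshold: share of votes that must agree

def consensus(votes):
    """Return the winning label once agreement is high enough, else None."""
    if len(votes) < MIN_VOTES:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count / len(votes) >= MIN_AGREEMENT else None

print(consensus(votes))  # "nucleus" (4/5 = 80% agreement): marked "solved"
```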

Rather than presenting the project as a worthy extracurricular activity, EVE Online’s designers have cast it as an extension of the game’s broader fiction. Players work for the Sisters of EVE, a religious humanitarian-aid organization, which rewards their efforts with virtual currency. This can be used to purchase items in the game, including a unique set of armor designed by one of C.C.P.’s artists, Andrei Cristea. (The armor is available only to players who participate in Project Discovery, and therefore, like a rare Coco Chanel frock, is desirable as much for its scarcity as for its design.) Insuring that the mini-game be thought of as more than a short-term novelty or diversion was an issue that Linzi Campbell, Project Discovery’s lead designer, considered carefully. “The hardest challenge has been turning the image-analysis process into a game that is strong enough to motivate the player to continue playing,” Campbell told me. “The fun comes from the feeling of mastery.”

Evidently, her efforts were successful. On the game’s first day of release, there were four hundred thousand submissions from players. According to C.C.P., some people have been so caught up in the task that they have played for fifteen hours without interruption. “EVE players turned out to be a perfect crowd for this type of citizen science,” Lundberg said. She anticipates that the first phase of the project will be completed this summer. If the work meets this target, players will be presented with more advanced images and tasks, such as the classification of protein patterns in complex tumor-tissue samples. Eventually, their efforts could aid in the development of new cancer drugs….(More)”

Civic Data Initiatives


Burak Arikan at Medium: “Big data is the term used to define the perpetual and massive data gathered by corporations and governments on consumers and citizens. When the subject of data is not necessarily individuals but governments and companies themselves, we can call it civic data, and when systematically generated in large amounts, civic big data. Increasingly, a new generation of initiatives are generating and organizing structured data on particular societal issues from human rights violations, to auditing government budgets, from labor crimes to climate justice.

These civic data initiatives diverge from traditional civil society organizations in their outcomes, in that they don’t just publish their research as reports, but also open it to the public as a database. Civic data initiatives are also quite different in their data work from international non-governmental organizations such as the UN, OECD, World Bank and other similar bodies. Such organizations track the social, economic, and political conditions of countries and concentrate upon producing general statistical data, whereas civic data initiatives aim to produce actionable data on issues that impact individuals directly. The change in the GDP value of a country is useless for people struggling for free transportation in their city. The incarceration rate of a country does not help the struggle of imprisoned journalists. Corruption indicators may serve as a parameter in a country’s credit score, but do not help to resolve monopolization created through public procurement. Carbon emission statistics do not prevent the energy deals between corrupt governments that destroy nature in their region.

Needless to say, civic data initiatives also differ from governmental institutions, which are reluctant to share any more than they are legally obligated to. Many governments in the world simply dump scanned hardcopies of documents on official websites instead of releasing machine-readable data, which prevents systematic auditing of government activities. Civic data initiatives, on the other hand, make it a priority to structure and release their data in formats that are both accessible and queryable.
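
The difference between a scanned hardcopy dump and queryable open data is easy to show. Below is a minimal sketch of releasing records as machine-readable JSON that anyone can then filter programmatically; the records and field names are invented for illustration.

```python
import json

# Invented example records a civic data initiative might publish.
records = [
    {"contract_id": 101, "agency": "Ministry of Transport", "amount": 250000},
    {"contract_id": 102, "agency": "Ministry of Health", "amount": 90000},
]

# Machine-readable release: one structured JSON file, not a scanned hardcopy.
with open("contracts.json", "w") as f:
    json.dump(records, f, indent=2)

# Because the data is structured, anyone can query it systematically.
with open("contracts.json") as f:
    data = json.load(f)
large = [r for r in data if r["amount"] > 100000]
print(large)  # [{'contract_id': 101, ...}]
```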

Civic data initiatives also deviate from general-purpose information commons such as Wikipedia, because they consistently engage with problems, closely watch a particular societal issue, make frequent updates, and even record from the field to generate and organize highly granular data about the matter….

Several civic data initiatives generate data on a variety of issues at different geographies, scopes, and scales. The non-exhaustive list below has information on founders, data sources, and financial support. It is sorted according to each initiative’s founding year. Please send your suggestions to contact at graphcommons.com. See more detailed information and updates on the spreadsheet of civic data initiatives.

Open Secrets tracks data about the money flow in the US government, making it more accessible for journalists, researchers, and advocates. Founded as a non-profit in 1983 by the Center for Responsive Politics, it gets support from a variety of institutions.

PolitiFact is a fact-checking website that rates the accuracy of claims by elected officials and others who speak up in American politics. It uses on-the-record interviews as its data source. Founded in 2007 as a non-profit organization by the Tampa Bay Times. Supported by the Democracy Fund, Bill & Melinda Gates Foundation, John S. and James L. Knight Foundation, Ford Foundation, Craigslist Charitable Fund, and the Collins Center for Public Policy…..

La Fabrique de la Loi (The Law Factory) tracks how bills are introduced and amended through the legislative process in France. Started in 2014, the project builds a database by tracking bills from government sources and provides a search engine as well as an API. The partners of the project are CEE Sciences Po, médialab Sciences Po, Regards Citoyens, and Density Design.

Mapping Media Freedom identifies threats, violations and limitations faced by members of the press throughout European Union member states, candidates for entry and neighbouring countries. Initiated by Index on Censorship and the European Commission in 2004, the project…(More)”