Open data and transparency: a look back at 2013


Zoe Smith in the Guardian on open data and development in 2013: “The clarion call for a “data revolution” made in the post-2015 high level panel report is a sign of a growing commitment to see freely flowing data become a tool for social change.

Web-based technology continued to offer increasing numbers of people the ability to share standardised data and statistics to demand better governance and strengthen accountability. 2013 seemed to herald the moment that the open data/transparency movement entered the mainstream.
Yet for those who have long campaigned on the issue, the call was more than just a catchphrase: it was a unique opportunity. “If we do get a global drive towards open data in relation to development or anything else, that would be really transformative and it’s quite rare to see such bold statements at such an early stage of the process. I think it set the tone for a year in which transparency was front and centre of many people’s agendas,” says David Hall-Matthews of Publish What You Fund.
This year saw high-level discussions translated into commitments at the policy level. David Cameron used the UK’s presidency of the G8 to trigger international action on the three Ts (tax, trade and transparency) through the IF campaign. The pledge at Lough Erne, in Northern Ireland, reaffirmed the commitment to the Busan open data standard as well as the specific undertaking that all G8 members would implement International Aid Transparency Initiative (IATI) standards by the end of 2015.
2013 was a particularly good year for the US Millennium Challenge Corporation (MCC), which topped the aid transparency index. While at the very top MCC and the UK’s DfID were examples of best practice, there was still much room for improvement. “There is a really long tail of agencies who are not really taking transparency seriously at all, yet. This includes important donors, the whole of France and the whole of Japan, who are not doing anything credible,” says Hall-Matthews.
Yet given the increasing number of emerging and ‘frontier’ markets whose growth is driven in large part by wealth derived from natural resources, 2013 saw a growing sense of urgency for transparency to be applied to revenues from oil, gas and mineral resources that may far outstrip aid. In May, the new Extractive Industries Transparency Initiative (EITI) standard was adopted, which is said to be far broader and deeper than its previous incarnation.
Several countries have done much to ensure that transparency leads to accountability in their extractive industries. In Nigeria, for example, EITI reports are playing an important role in the debate about how resources should be managed in the country. “In countries such as Nigeria they’re taking their commitment to transparency and EITI seriously, and are going beyond disclosing information but also ensuring that those findings are acted upon and lead to accountability. For example, the tax collection agency has started to collect more of the revenues that were previously missing,” says Jonas Moberg, head of the EITI International Secretariat.
But the extent to which transparency and open data can actually deliver on their revolutionary potential has also been called into question. Governments and donor agencies can release data, but if the power structures within which that data is consumed and acted upon do not shift, is there really any chance of significant social change?
The complexity of the challenge is illustrated by the case of Mexico which, in 2014, will succeed Indonesia as chair of the Open Government Partnership. At this year’s London summit, Mexico’s acting civil service minister spoke of the great strides his country has made in opening up the public procurement process, which accounts for around 10% of GDP and is a key area in which transparency and accountability can help tackle corruption.
There is, however, a certain paradox. As SOAS professor Leandro Vergara-Camus, who has written extensively on peasant movements in Mexico, explains: “The NGO sector in Mexico has more of a positive view of these kinds of processes than the working class or peasant organisations. The processes of transparency and accountability have gone further in urban areas than they have in rural areas.”…
With increasing numbers of organisations likely to jump on the transparency bandwagon in the coming year, the greatest challenge will be using it effectively and adequately addressing the underlying issues of power and politics.

Top 2013 transparency publications

Open data, transparency and international development, The North South Institute
Data for development: The new conflict resource?, Privacy International
The fix-rate: a key metric for transparency and accountability, Integrity Action
Making UK aid more open and transparent, DfID
Getting a seat at the table: Civil Society advocacy for budget transparency in “untransparent” countries, International Budget Partnership

The dates that mattered

23-24 May: New Extractive Industries Transparency Initiative standard adopted
30 May: Post-2015 high-level panel report calling for a ‘data revolution’ is published
17-18 June: UK prime minister David Cameron campaigns for tax, trade and transparency at the G8 summit
24 October: US Millennium Challenge Corporation tops the aid transparency index
30 October – 1 November: Open Government Partnership summit in London gathers civil society, governments and data experts”

Selected Readings on Data Visualization


The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of data visualization was originally published in 2013.

Data visualization is a response to the ever-increasing amount of information in the world. With big data, informatics and predictive analytics, we have an unprecedented opportunity to revolutionize policy-making. Yet data by itself can be overwhelming. New tools and techniques for visualizing information can help policymakers clearly articulate insights drawn from data. Moreover, the rise of open data is enabling those outside of government to create informative and visually arresting representations of public information that can be used to support decision-making by those inside or outside governing institutions.

Annotated Selected Reading List (in alphabetical order)

Duke, D.J., K.W. Brodlie, D.A. Duce and I. Herman. “Do You See What I Mean? [Data Visualization].” IEEE Computer Graphics and Applications 25, no. 3 (2005): 6–9. http://bit.ly/1aeU6yA.

  • In this paper, the authors argue that a more systematic ontology for data visualization is needed to ensure the successful communication of meaning. “Visualization begins when someone has data that they wish to explore and interpret; the data are encoded as input to a visualization system, which may in its turn interact with other systems to produce a representation. This is communicated back to the user(s), who have to assess this against their goals and knowledge, possibly leading to further cycles of activity. Each phase of this process involves communication between two parties. For this to succeed, those parties must share a common language with an agreed meaning.”
  • The authors “believe that now is the right time to consider an ontology for visualization,” and “as visualization moves from just a private enterprise involving data and tools owned by a research team into a public activity using shared data repositories, computational grids, and distributed collaboration…[m]eaning becomes a shared responsibility and resource. Through the Semantic Web, there is both the means and motivation to develop a shared picture of what we see when we turn and look within our own field.”

Friendly, Michael. “A Brief History of Data Visualization.” In Handbook of Data Visualization, 15–56. Springer Handbooks of Computational Statistics. Springer Berlin Heidelberg, 2008. http://bit.ly/17fM1e9.

  • In this paper, Friendly explores the “deep roots” of modern data visualization. “These roots reach into the histories of the earliest map making and visual depiction, and later into thematic cartography, statistics and statistical graphics, medicine and other fields. Along the way, developments in technologies (printing, reproduction), mathematical theory and practice, and empirical observation and recording enabled the wider use of graphics and new advances in form and content.”
  • Just as the visualization of data in general is far from a new practice, Friendly shows that the graphical representation of government information has a similarly long history. “The collection, organization and dissemination of official government statistics on population, trade and commerce, social, moral and political issues became widespread in most of the countries of Europe from about 1825 to 1870. Reports containing data graphics were published with some regularity in France, Germany, Hungary and Finland, and with tabular displays in Sweden, Holland, Italy and elsewhere.”

Graves, Alvaro and James Hendler. “Visualization Tools for Open Government Data.” In Proceedings of the 14th Annual International Conference on Digital Government Research, 136–145. dg.o ’13. New York, NY, USA: ACM, 2013. http://bit.ly/1eNSoXQ.

  • In this paper, the authors argue that, “there is a gap between current Open Data initiatives and an important part of the stakeholders of the Open Government Data Ecosystem.” As it stands, “there is an important portion of the population who could benefit from the use of OGD but who cannot do so because they cannot perform the essential operations needed to collect, process, merge, and make sense of the data. The reasons behind these problems are multiple, the most critical one being a fundamental lack of expertise and technical knowledge. We propose the use of visualizations to alleviate this situation. Visualizations provide a simple mechanism to understand and communicate large amounts of data.”
  • The authors also describe a prototype of a tool to create visualizations based on OGD with the following capabilities:
    • Facilitate visualization creation
    • Exploratory mechanisms
    • Viralization and sharing
    • Repurpose of visualizations

Hidalgo, César A. “Graphical Statistical Methods for the Representation of the Human Development Index and Its Components.” United Nations Development Programme Human Development Reports, September 2010. http://bit.ly/166TKur.

  • In this paper for the United Nations Development Programme, Hidalgo argues that “graphical statistical methods could be used to help communicate complex data and concepts through universal cognitive channels that are heretofore underused in the development literature.”
  • To support his argument, representations are provided that “show how graphical methods can be used to (i) compare changes in the level of development experienced by countries (ii) make it easier to understand how these changes are tied to each one of the components of the Human Development Index (iii) understand the evolution of the distribution of countries according to HDI and its components and (iv) teach and create awareness about human development by using iconographic representations that can be used to graphically narrate the story of countries and regions.”

Stowers, Genie. “The Use of Data Visualization in Government.” IBM Center for The Business of Government, Using Technology Series, 2013. http://bit.ly/1aame9K.

  • This report seeks “to help public sector managers understand one of the more important areas of data analysis today — data visualization. Data visualizations are more sophisticated, fuller graphic designs than the traditional spreadsheet charts, usually with more than two variables and, typically, incorporating interactive features.”
  • Stowers also offers numerous examples of “visualizations that include geographical and health data, or population and time data, or financial data represented in both absolute and relative terms — and each communicates more than simply the data that underpin it. In addition to these many examples of visualizations, the report discusses the history of this technique, and describes tools that can be used to create visualizations from many different kinds of data sets.”

AU: Govt finds one third of open data was "junk"


IT News: “The number of datasets available on the Government’s open data website has slimmed by more than half after the agency discovered one third of the datasets were junk.
Since its official launch in 2011, data.gov.au grew to hold 1200 datasets from government agencies for public consumption.
In July this year the Department of Finance migrated the portal to a new open source platform – the Open Knowledge Foundation’s CKAN platform – for greater ease of use and publishing ability.
Since July the number of datasets fell from 1200 to 500.
Australian Government CTO John Sheridan said in his blog late yesterday the agency had needed to review the 1200 datasets as a result of the CKAN migration, and discovered a significant amount of them were junk.
“We unfortunately found that a third of the “datasets” were just links to webpages or files that either didn’t exist anymore, or redirected somewhere not useful to genuine seekers of data,” Sheridan said.
“In the second instance, the original 1200 number included each individual file. On the new platform, a dataset may have multiple files. In one case we have a dataset with 200 individual files where before it was counted as 200 datasets.”
The number of datasets following the clean out now sits at 529. Around 123 government bodies contributed data to the portal.
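For readers curious how such counts can be derived, CKAN exposes datasets and their constituent files through its Action API. Here is a minimal sketch, assuming the standard CKAN Action API endpoints (`package_list`, `package_show`) on data.gov.au; it is illustrative only, not the department’s actual audit process:

```python
import requests  # third-party HTTP library

BASE = "https://data.gov.au/api/3/action"  # standard CKAN Action API root

# package_list returns one entry per dataset, however many files each holds
datasets = requests.get(f"{BASE}/package_list").json()["result"]
print(f"datasets: {len(datasets)}")

# A dataset's files are its "resources"; count those separately
# (sampling the first 20 datasets to keep the sketch quick)
files = 0
for name in datasets[:20]:
    pkg = requests.get(f"{BASE}/package_show", params={"id": name}).json()
    files += len(pkg["result"]["resources"])
print(f"files in sample: {files}")
```

This distinction between packages and resources is exactly why the migration shrank the headline number: one dataset with 200 files counts once, not 200 times.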
Sheridan said the number was still too low.
“A lot of momentum has built around open data in Australia, including within governments around the country and we are pleased to report that a growing number of federal agencies are looking at how they can better publish data to be more efficient, improve policy development and analysis, deliver mobile services and support greater transparency and public innovation,” he said….
The Federal Government’s approach to open data has previously been criticised as “patchy” and slow, due in part to several shortcomings in the data.gov.au website as well as slow progress in agencies adopting an open approach by default.
The Australian Information Commissioner’s February report on open data in government outlined the manual uploading and updating of datasets, the lack of automated entry for metadata and the lack of specific search functions within data.gov.au as obstacles to efforts to push a whole-of-government approach to open data.
The introduction of the new CKAN platform is expected to go some way to addressing the highlighted concerns.”

The Impact of Innovation Inducement Prizes


From the Compendium of Evidence on Innovation Policy/NESTA: “Innovation inducement prizes are one of the oldest types of innovation policy measure. The popularity of innovation inducement prizes gradually decreased during the early 20th century. However, they have regained some of their popularity since the 1990s with new prizes awarded by the US X Prize Foundation and with the current US Administration’s efforts to use them in various government departments as an innovation policy instrument. Innovation prizes are also becoming an important innovation policy instrument in the UK. A recent report by McKinsey & Company (2009) estimates the value of prizes awarded to be between £600 million and £1.2 billion. Despite the growing popularity of innovation inducement prizes, the impact of this innovation policy measure is still not well understood. This report brings together the existing evidence on the effects of innovation inducement prizes by drawing on a number of ex-ante and ex-post evaluations as well as a limited academic literature. It focuses on ex-ante innovation inducement prizes, where the aim is to induce investment or attention towards a specific goal or technology, and does not discuss the impact of ex-post recognition prizes, where the prize is given as recognition after the intended outcome happens (e.g. the Nobel Prize).
Innovation inducement prizes have a wide range of rationales, and there is no agreed-upon dominant rationale in the literature. Traditionally, prizes have been seen as an innovation policy instrument that can overcome market failure by creating an incentive for the development of a particular technology or technology application. A second rationale is the implementation of demonstration projects, in which not only the creation of a specific technology is intended but also the demonstration of its feasible application. A third rationale is the creation of a technology that will later be put in the public domain to attract subsequent research. Prizes are also increasingly organised for community and leadership building. As prizes probably allow more flexibility than most other innovation policy instruments, there is a large number of different prize characteristics and thus a vast number of prize typologies based on these characteristics.
Evidence on the effectiveness of prizes is scarce. There are only a few evaluations or academic works that deal with the creation of innovation output, and even those only rarely deal with additionality. Only a very limited number of studies have looked at whether innovation inducement prizes led to more innovation or innovation outputs. As well as developing the particular technology they target, innovation inducement prizes create prestige for both the prize sponsor and entrants. Prizes might also increase public and sectoral awareness of specific technology issues. A related issue to the prestige gained from prizes is the motivation of participants as a conditioning factor for innovation performance. Design issues are the main concern of the prizes literature. This reflects the importance of careful design for the achievement of desired effects (and the limitation of undesired effects). A relatively large number of studies have investigated the influence of the design of the prize objective on innovation performance. A number of studies point out, mostly on the basis of ex-ante evaluations, that prizes should sometimes be accompanied or followed by other demand-side initiatives to fulfil their objectives. Finally, prizes are also seen as a valuable opportunity for experimentation in innovation policy.
It is evident from the literature we analysed that the evidence on the impact of innovation inducement prizes is scarce. There is also a consensus that innovation inducement prizes are not a substitute for other innovation policy measures but are complementary under certain conditions. Prizes can be effective in creating innovation through more intense competition, engagement of a wide variety of actors, distribution of risk across many participants, and more flexible solutions enabled by the less prescriptive definition of the problem in prizes. They can overcome some of the inherent barriers to other instruments, but if prizes are poorly designed, managed and awarded, they may be ineffective or even harmful.”

Tech challenge develops algorithms to predict atrocities


SciDevNet: “Mathematical models that use existing socio-political data to predict mass atrocities could soon inform governments and NGOs on how and where to take preventative action.
The models emerged from one strand of the Tech Challenge for Atrocity Prevention, a competition run by the US Agency for International Development (USAID) and the NGO Humanity United. The winners were announced last month (18 November) and will now work with the organisers to further develop and pilot their innovations.
The five winners, from different countries, won between US$1,000 and US$12,000, and were among nearly 100 entrants who developed algorithms to predict when and where mass atrocities are likely to happen.
Around 1.5 billion people live in countries affected by conflict, sometimes including atrocities such as genocides, mass rape and ethnic cleansing, according to the World Bank’s World Development Report 2011. Many of these countries are in the developing world.
The competition organisers hope the new algorithms could help governments and human rights organisations identify at-risk regions, potentially allowing them to intervene before mass atrocities happen.
The competition started from the premise that certain social and political measurements are linked to increased likelihood of atrocities. Yet because such factors interact in complex ways, organisations working to prevent atrocities lack a reliable method of predicting when and where they might happen next.
The algorithms use sociopolitical indicators and data on past atrocities as their inputs. The data was drawn from archives such as the Global Database of Events, Language and Tone (GDELT), a data set that encodes more than 200 million globally newsworthy events, recording cultural information such as the people involved, their location and any religious connections.”
Link to the winners of the Model Challenge
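To make the premise concrete, here is a minimal sketch of the general approach described: a classifier trained on sociopolitical indicators to rank cases by atrocity risk. The features, data and model here are hypothetical placeholders, not the entrants’ actual algorithms:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical country-month features, e.g. columns for
# [protest events, state violence events, press freedom, GDP growth]
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
# Hypothetical label: 1 if a mass atrocity occurred in the following period
y = (X[:, 1] + 0.5 * X[:, 0] + rng.normal(scale=0.5, size=500) > 1.5).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression().fit(X_train, y_train)

# Rank held-out cases by predicted risk, as an early-warning list would
risk = model.predict_proba(X_test)[:, 1]
print("five highest-risk cases:", np.argsort(risk)[::-1][:5])
```

The point of such models is the ranking, not a binary verdict: an organisation with limited resources can direct attention to the regions the model scores as highest risk.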

Participation Dynamics in Crowd-Based Knowledge Production: The Scope and Sustainability of Interest-Based Motivation


New paper by Henry Sauermann and Chiara Franzoni: “Crowd-based knowledge production is attracting growing attention from scholars and practitioners. One key premise is that participants who have an intrinsic “interest” in a topic or activity are willing to expend effort at lower pay than in traditional employment relationships. However, it is not clear how strong and sustainable interest is as a source of motivation. We draw on research in psychology to discuss important static and dynamic features of interest and derive a number of research questions regarding interest-based effort in crowd-based projects. Among others, we consider the specific versus general nature of interest, highlight the potential role of matching between projects and individuals, and distinguish the intensity of interest at a point in time from the development and sustainability of interest over time. We then examine users’ participation patterns within and across 7 different crowd science projects that are hosted on a shared platform. Our results provide novel insights into contribution dynamics in crowd science projects. Moreover, given that extrinsic incentives such as pay, status, self-use, or career benefits are largely absent in these particular projects, the data also provide unique insights into the dynamics of interest-based motivation and into its potential as a driver of effort.”

Building tech-powered public services


New publication by Sarah Bickerstaffe from IPPR (UK): “Given the rapid pace of technological change and take-up by the public, it is a question of when, not if, public services become ‘tech-powered’. This new paper asks how we can ensure that innovations are successfully introduced and deployed.
Can technology improve the experience of people using public services, or does it simply mean job losses and a depersonalised offer to users?
Could tech-powered public services be an affordable, sustainable solution to some of the challenges of these times of austerity?
This report looks at 20 case studies of digital innovation in public services, using these examples to explore the impact of new and disruptive technologies. It considers how tech-powered public services can be delivered, focusing on the area of health and social care in particular.
We identify three key benefits of increasing the role of technology in public services: saving time, boosting user participation, and encouraging users to take responsibility for their own wellbeing.
In terms of how to successfully implement technological innovations in public services, five particular lessons stood out clearly and consistently:

  1. User-based iterative design is critical to delivering a product that solves real-world problems. It builds trust and ensures the technology works in the context in which it will be used.
  2. Public sector expertise is essential in order for a project to make the connections necessary for initial development and early funding.
  3. Access to seed and bridge funding is necessary to get projects off the ground and allow them to scale up.
  4. Strong leadership from within the public sector is crucial to overcoming the resistance that practitioners and managers often show initially.
  5. A strong business case that sets out the quality improvements and cost savings that the innovation can deliver is important to get attention and interest from public services.

The seven headline case studies in this report are:

  • Patchwork creates an elegant solution to join up professionals working with troubled families, in an effort to ensure that frontline support is truly coordinated.
  • Casserole Club links people who like cooking with their neighbours who are in need of a hot meal, employing the simplest possible technology to grow social connections.
  • ADL Smartcare uses a facilitated assessment tool to make professional expertise accessible to staff and service users without years of training, meaning they can carry out assessments together, engaging people in their own care and freeing up occupational therapists to focus where they are needed.
  • Mental Elf makes leading research in mental health freely available via social media, providing accessible summaries to practitioners and patients who would not otherwise have the time or ability to read journal articles, which are often hidden behind a paywall.
  • Patient Opinion provides an online platform for people to give feedback on the care they have received and for healthcare professionals and providers to respond, disrupting the typical complaints process and empowering patients and their families.
  • The Digital Pen and form system has saved the pilot hospital trust three minutes per patient by avoiding the need for manual data entry, freeing up clinical and administrative staff for other tasks.
  • Woodland Wiggle allows children in hospital to enter a magical woodland world through a giant TV screen, where they can have fun, socialise, and do their physiotherapy.”

Google Global Impact Award Expands Zooniverse


Press Release: “A $1.8 million Google Global Impact Award will enable Zooniverse, a nonprofit collaboration led by the Adler Planetarium and the University of Oxford, to make setting up a citizen science project as easy as starting a blog and could lead to thousands of innovative new projects around the world, accelerating the pace of scientific research.
The award supports the further development of the Zooniverse, the world’s leading ‘citizen science’ platform, which has already given more than 900,000 online volunteers the chance to contribute to science by taking part in activities including discovering planets, classifying plankton or searching through old ships’ logs for observations of interest to climate scientists. As part of the Global Impact Award, the Adler will receive $400,000 to support the Zooniverse platform.
With the Google Global Impact Award, Zooniverse will be able to rebuild its platform so that research groups with no web development expertise can build and launch their own citizen science projects.
“We are entering a new era of citizen science – this effort will enable prolific development of science projects in which hundreds of thousands of additional volunteers will be able to work alongside professional scientists to conduct important research – the potential for discovery is limitless,” said Michelle B. Larson, Ph.D., Adler Planetarium president and CEO. “The Adler is honored to join its fellow Zooniverse partner, the University of Oxford, as a Google Global Impact Award recipient.”
The Zooniverse – the world’s leading citizen science platform – is a global collaboration across several institutions that design and build citizen science projects. The Adler is a founding partner of the Zooniverse, which has already engaged more than 900,000 online volunteers as active scientists by discovering planets, mapping the surface of Mars and detecting solar flares. Adler-directed citizen science projects include: Galaxy Zoo (astronomy), Solar Stormwatch (solar physics), Moon Zoo (planetary science), Planet Hunters (exoplanets) and The Milky Way Project (star formation). The Zooniverse (zooniverse.org) also includes projects in environmental, biological and medical sciences. Google’s investment in the Adler and its Zooniverse partner, the University of Oxford, will further the global reach, making thousands of new projects possible.”

Selected Readings on Crowdsourcing Data


The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of crowdsourcing data was originally published in 2013.

As institutions seek to improve decision-making through data and put public data to use to improve the lives of citizens, new tools and projects are allowing citizens to play a role in both the collection and utilization of data. Participatory sensing and other citizen data collection initiatives, notably in the realm of disaster response, are allowing citizens to crowdsource important data, often using smartphones, that would be either impossible or burdensomely time-consuming for institutions to collect themselves. Civic hacking, on the other hand, often performed at hackathon events, is a growing trend in which governments encourage citizens to transform data from government and other sources into useful tools that benefit the public good.

Annotated Selected Reading List (in alphabetical order)

Baraniuk, Chris. “Power Politechs.” New Scientist 218, no. 2923 (June 29, 2013): 36–39. http://bit.ly/167ul3J.

  • In this article, Baraniuk discusses civic hackers, “an army of volunteer coders who are challenging preconceptions about hacking and changing the way your government operates. In a time of plummeting budgets and efficiency drives, those in power have realised they needn’t always rely on slow-moving, expensive outsourcing and development to improve public services. Instead, they can consider running a hackathon, at which tech-savvy members of the public come together to create apps and other digital tools that promise to enhance the provision of healthcare, schools or policing.”
  • While recognizing that “civic hacking has established a pedigree that demonstrates its potential for positive impact,” Baraniuk argues that a “more rigorous debate over how this activity should evolve, or how authorities ought to engage in it” is needed.

Barnett, Brandon, Muki Hansteen Izora, and Jose Sia. “Civic Hackathon Challenges Design Principles: Making Data Relevant and Useful for Individuals and Communities.” Hack for Change, https://bit.ly/2Ge6z09.

  • In this paper, researchers from Intel Labs offer “guiding principles to support the efforts of local civic hackathon organizers and participants as they seek to design actionable challenges and build useful solutions that will positively benefit their communities.”
  • The authors’ proposed design principles are:
    • Focus on the specific needs and concerns of people or institutions in the local community. Solve their problems and challenges by combining different kinds of data.
    • Seek out data far and wide (local, municipal, state, institutional, non-profits, companies) that is relevant to the concern or problem you are trying to solve.
    • Keep it simple! This can’t be overstated. Focus [on] making data easily understood and useful to those who will use your application or service.
    • Enable users to collaborate and form new communities and alliances around data.

Buhrmester, Michael, Tracy Kwang, and Samuel D. Gosling. “Amazon’s Mechanical Turk: A New Source of Inexpensive, Yet High-Quality, Data?” Perspectives on Psychological Science 6, no. 1 (January 1, 2011): 3–5. http://bit.ly/H56lER.

  • This article examines the capability of Amazon’s Mechanical Turk to act as a source of data for researchers, in addition to its traditional role as a microtasking platform.
  • The authors examine the demographics of MTurkers and find that “(a) MTurk participants are slightly more demographically diverse than are standard Internet samples and are significantly more diverse than typical American college samples; (b) participation is affected by compensation rate and task length, but participants can still be recruited rapidly and inexpensively; (c) realistic compensation rates do not affect data quality; and (d) the data obtained are at least as reliable as those obtained via traditional methods.”
  • The paper concludes that, just as MTurk can be a strong tool for crowdsourcing tasks, data derived from MTurk can be high quality while also being inexpensive and obtained rapidly.

Goodchild, Michael F., and J. Alan Glennon. “Crowdsourcing Geographic Information for Disaster Response: a Research Frontier.” International Journal of Digital Earth 3, no. 3 (2010): 231–241. http://bit.ly/17MBFPs.

  • This article examines issues of data quality in the face of the new phenomenon of geographic information being generated by citizens, in order to examine whether this data can play a role in emergency management.
  • The authors argue that “[d]ata quality is a major concern, since volunteered information is asserted and carries none of the assurances that lead to trust in officially created data.”
  • Due to the fact that time is crucial during emergencies, the authors argue that, “the risks associated with volunteered information are often outweighed by the benefits of its use.”
  • The paper examines four wildfires in Santa Barbara in 2007-2009 to discuss current challenges with volunteered geographical data, and concludes that further research is required to answer how volunteer citizens can be used to provide effective assistance to emergency managers and responders.

Hudson-Smith, Andrew, Michael Batty, Andrew Crooks, and Richard Milton. “Mapping for the Masses: Accessing Web 2.0 Through Crowdsourcing.” Social Science Computer Review 27, no. 4 (November 1, 2009): 524–538. http://bit.ly/1c1eFQb.

  • This article describes the way in which “we are harnessing the power of web 2.0 technologies to create new approaches to collecting, mapping, and sharing geocoded data.”
  • The authors examine GMapCreator and MapTube, which allow users to do a range of map-related functions such as create new maps, archive existing maps, and share or produce bottom-up maps through crowdsourcing.
  • They conclude that “these tools are helping to define a neogeography that is essentially ‘mapping for the masses,’” while noting that there are many issues of quality, accuracy, copyright, and trust that will influence the impact of these tools on map-based communication.

Kanhere, Salil S. “Participatory Sensing: Crowdsourcing Data from Mobile Smartphones in Urban Spaces.” In Distributed Computing and Internet Technology, edited by Chittaranjan Hota and Pradip K. Srimani, 19–26. Lecture Notes in Computer Science 7753. Springer Berlin Heidelberg, 2013. https://bit.ly/2zX8Szj.

  • This paper provides a comprehensive overview of participatory sensing — a “new paradigm for monitoring the urban landscape” in which “ordinary citizens can collect multi-modal data streams from the surrounding environment using their mobile devices and share the same using existing communications infrastructure.”
  • In addition to examining a number of innovative applications of participatory sensing, Kanhere outlines the following key research challenges (a minimal data-collection sketch follows this list):
    • Dealing with incomplete samples
    • Inferring user context
    • Protecting user privacy
    • Evaluating data trustworthiness
    • Conserving energy
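As a concrete illustration of the collection half of this paradigm, here is a minimal sketch of the kind of geotagged, multi-modal reading a phone-side participatory sensing app might assemble. The field names and values are hypothetical, and a real client would send the payload to the project’s collection server:

```python
import json
import time

# Hypothetical reading a participatory sensing app might collect on a phone
reading = {
    "device_id": "pseudonymous-device-42",  # pseudonym, protecting user privacy
    "timestamp": time.time(),
    "lat": 51.5074, "lon": -0.1278,         # GPS fix (user context)
    "noise_db": 68.3,                       # microphone-derived noise level
    "battery_pct": 81,                      # relevant to conserving energy
}

# A real client would transmit this over existing communications
# infrastructure (e.g. an HTTP POST); here we just serialise it.
print(json.dumps(reading, indent=2))
```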

Data isn't a four-letter word


Speech by Neelie Kroes, Vice-President of the European Commission responsible for the Digital Agenda: “I want to talk about data too: the opportunity as well as the threat.
Making data the engine of the European economy: safeguarding fundamental rights, capturing the data boost, and strengthening our defences.
Data is at a crossroads. We have opportunities: open data, big data, datamining, cloud computing. Tim Berners-Lee, creator of the world wide web, saw the massive potential of open data. As he put it, if you put that data online, it will be used by other people to do wonderful things, in ways that you could never imagine.
On the other hand, we have threats: to our privacy and our values, and to the openness that makes it possible to innovate, trade and exchange.
Get it right and we can safeguard a better economic future. Get it wrong, and we cut competitiveness without protecting privacy. So we remain dependent on the digital developments of others: and just as vulnerable to them.
How do we find that balance? Not with hysteria; nor by paralysis. Not by stopping the wonderful things, simply to prevent the not-so-wonderful. Not by seeing data as a dirty word.
We are seeing a whole economy develop around data and cloud computing. Businesses use them, whole industries depend on them, and data volumes are increasing exponentially. Data is not just an economic sideshow, it is a whole new asset class, requiring new skills and creating new jobs.
And with a huge range of applications: from decoding human genes to predicting traffic, and even the economy. Whatever you’re doing these days, chances are you’re using big data (like translation, search, apps, etc).
There is increasing recognition of the data boost on offer. For example, open data can make public administrations more transparent and stimulate a rich innovative market. That is what the G8 Leaders recognised in June, with their Open Data Charter. For scientists too, open data and open access offer new ways to research and progress.
That is a philosophy the Commission has shared for some time. And that is what our ‘Open Data’ package of December 2011 is all about. With new EU laws to open up public administrations, and a new EU Open Data Portal. And all EU-funded scientific publications available under open access.
Now not just the G8 and the Commission are seeing this data opportunity: but the European Council too. Last October, they recognised the potential of big data innovation, the need for a single market in cloud computing; and the urgency of Europe capitalising on both.
We will be acting on that. Next spring, I plan a strategic agenda for research on data. Working with private partners and national research funders to shape that agenda, and get the most bang for our research euro.
And, beyond research, there is much we can do to align our work and support secure big data. From training skilled workers, to modernising copyright for data and text mining, to different actors in the value chain working together: for example through a public-private partnership.
…Empowering people is not always easy in this complex online world. I want to see technical solutions emerge that can do that: give users control over their desired level of privacy and over how their data will be used, and make it easier to verify that online rights are respected.
How can we do that? How can we ensure systems that are empowering, transparent, and secure? There are a number of subtleties in play. Here’s my take.
First, companies engaged in big data will need to start thinking about privacy protection at every stage, from system development to procedures and practices.
This is the principle of “privacy by design”, set out clearly in the proposed Data Protection Regulation. In other words, from now on new business ideas have two purposes: delivering a service and protecting privacy at the right level.
Second, also under the regulation, big data applications that might put fundamental rights at risk would require the company to carry out a “Privacy Impact Assessment”. This is another good way to combine innovation and privacy: ensuring you think about any risks from the start.
Third, sometimes, particularly for personal data, a company might realise they need user consent. Consent is a cornerstone of data protection rules, and should stay that way.
But we need to get smart, and apply common sense to consent. Users can’t be expected to know everything. Nor asked to consent to what they cannot realistically understand. Nor presented with false dilemmas, a black-and-white choice between consenting or getting shut out of services.
Fourth, we can also get smart when it comes to anonymisation. Sometimes, full anonymisation means losing important information, so you can no longer make the links between data. That could make the difference between progress and paralysis. But using pseudonyms can let you analyse large amounts of data: to spot, for example, that people with genetic pattern X also respond well to therapy Y.
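To illustrate the kind of pseudonymisation Kroes describes, here is a minimal sketch using a keyed hash (an HMAC), so records about the same person stay linkable for analysis without storing the raw identifier. The key, identifiers and field values are hypothetical:

```python
import hashlib
import hmac

SECRET_KEY = b"held-separately-by-the-data-controller"  # hypothetical key

def pseudonymise(identifier: str) -> str:
    """Replace a direct identifier with a stable keyed hash."""
    return hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256).hexdigest()

# Records about the same person remain linkable across datasets...
genetics = {pseudonymise("patient-123"): "genetic pattern X"}
outcomes = {pseudonymise("patient-123"): "responded well to therapy Y"}
pid = pseudonymise("patient-123")
print(genetics[pid], "->", outcomes[pid])
# ...but without the key, the pseudonym cannot be traced back to the identity.
```

The design choice matters: a keyed hash, unlike a plain hash of the identifier, cannot be reversed by simply hashing candidate identities, which is why the key must be held separately.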
So it is understandable why the European Parliament has proposed a more flexible data protection regime for this type of data. Companies would be able to process the data on grounds of legitimate interest, rather than consent. That could make all the positive difference to big data: without endangering privacy.
Of course, in those cases, companies still need to minimise privacy risks. Their internal processes and risk assessments must show how they comply with the guiding principles of data protection law. And – if something does go wrong – the company remains accountable.
Indeed company accountability is another key element of our proposal. And here again we welcome the European Parliament’s efforts to reinforce that. Clearly, you might assure accountability in different ways for different companies. But standards for compliance and processes could make a real difference.
A single data protection law for Europe would be a big step forward. National fortresses and single market barriers just make it harder for Europe to lead in digital, harder for Europe to become the natural home of secure online services. Data protection cannot mean data protectionism. Rather, it means safeguarding privacy does not come at the expense of innovation: with laws both flexible and future proof, pragmatic and proportionate, for a changing world….
But data protection rules are really just the start. They are only part of our response to the Snowden revelations….”