Web Science: Understanding the Emergence of Macro-Level Features on the World Wide Web


Monograph by Kieron O’Hara, Noshir S. Contractor, Wendy Hall, James A. Hendler and Nigel Shadbolt in Foundations and Trends in Web Science: “Web Science considers the development of Web Science since the publication of ‘A Framework for Web Science’ (Berners-Lee et al., 2006). This monograph argues that the requirement for understanding should ideally be accompanied by some measure of control, which makes Web Science crucial in the future provision of tools for managing our interactions, our politics, our economics, our entertainment, and – not least – our knowledge and data sharing…
In this monograph we consider the development of Web Science since the launch of this journal and its inaugural publication ‘A Framework for Web Science’ [44]. The theme of emergence is discussed as the characteristic phenomenon of Web-scale applications, where many unrelated micro-level actions and decisions, uninformed by knowledge about the macro-level, still produce noticeable and coherent effects at the scale of the Web. A model of emergence is mapped onto the multitheoretical multilevel (MTML) model of communication networks explained in [252]. Four specific types of theoretical problem are outlined. First, there is the need to explain local action. Second, the global patterns that form when local actions are repeated at scale have to be detected and understood. Third, those patterns feed back into the local, with intricate and often fleeting causal connections to be traced. Finally, as Web Science is an engineering discipline, issues of control of this feedback must be addressed. The idea of a social machine is introduced, where networked interactions at scale can help to achieve goals for people and social groups in civic society; an important aim of Web Science is to understand how such networks can operate, and how they can control the effects they produce on their own environment.”

Open Data in Action


Nick Sinai at the White House: “Over the past few years, the Administration has launched a series of Open Data Initiatives, which have released troves of valuable data in areas such as health, energy, education, public safety, finance, and global development…
Today, in furtherance of this exciting economic dynamic, The Governance Lab (The GovLab) —a research institution at New York University—released the beta version of its Open Data 500 project—an initiative designed to identify, describe, and analyze companies that use open government data in order to study how these data can serve business needs more effectively. As part of this effort, the organization is compiling a list of 500+ companies that use open government data to generate new business and develop new products and services.
This working list of 500+ companies, from sectors ranging from real estate to agriculture to legal services, shines a spotlight on a surprising array of innovative and creative ways that open government data is being used to grow the economy – across different company sizes, different geographies, and different industries. The project includes information about the companies and what government datasets they have identified as critical resources for their business.
Some examples from the Open Data 500 Project include:
  • Brightscope, a San Diego-based company that leverages data from the Department of Labor, the Securities and Exchange Commission, and the Census Bureau to rate consumers’ 401k plans objectively on performance and fees, so companies can choose better plans and employees can make better decisions about their retirement options.
  • AllTuition, a Chicago-based startup that provides services—powered by data from the Department of Education on Federal student financial aid programs and student loans—to help students and parents manage the financial-aid process for college, in part by helping families keep track of deadlines, and walking them through the required forms.
  • Archimedes, a San Francisco healthcare modeling and analytics company that leverages Federal open data from the National Institutes of Health, the Centers for Disease Control and Prevention, and the Centers for Medicare and Medicaid Services, to provide doctors more effective individualized treatment plans and to enable patients to make informed health decisions.
You can learn more about the project here and view the list of open data companies here.

See also:
Open Government Data: Companies Cash In

NYU project touts 500 top open-data firms”

Open data and transparency: a look back at 2013


Zoe Smith in the Guardian on open data and development in 2013: “The clarion call for a “data revolution” made in the post-2015 high-level panel report is a sign of a growing commitment to see freely flowing data become a tool for social change.

Web-based technology continued to offer increasing numbers of people the ability to share standardised data and statistics to demand better governance and strengthen accountability. 2013 seemed to herald the moment that the open data/transparency movement entered the mainstream.
Yet for those who have long campaigned on the issue, the call was more than just a catchphrase; it was a unique opportunity. “If we do get a global drive towards open data in relation to development or anything else, that would be really transformative and it’s quite rare to see such bold statements at such an early stage of the process. I think it set the tone for a year in which transparency was front and centre of many people’s agendas,” says David Hall-Matthews, of Publish What You Fund.
This year saw high level discussions translated into commitments at the policy level. David Cameron used the UK’s presidency of the G8 to trigger international action on the three Ts (tax, trade and transparency) through the IF campaign. The pledge at Lough Erne, in Northern Ireland, reaffirmed the commitment to the Busan open data standard as well as the specific undertaking that all G8 members would implement International Aid Transparency Initiative (IATI) standards by the end of 2015.
2013 was a particularly good year for the US Millennium Challenge Corporation (MCC), which topped the aid transparency index. While at the very top the MCC and the UK’s DfID were examples of best practice, there was still much room for improvement. “There is a really long tail of agencies who are not really taking transparency seriously at all, yet. This includes important donors, the whole of France and the whole of Japan, who are not doing anything credible,” says Hall-Matthews.
Yet given the increasing number of emerging and ‘frontier’ markets whose growth is driven in large part by wealth derived from natural resources, 2013 saw a growing sense of urgency for transparency to be applied to revenues from oil, gas and mineral resources that may far outstrip aid. In May, the new Extractive Industries Transparency Initiative (EITI) standard was adopted, which is said to be far broader and deeper than its previous incarnation.
Several countries have done much to ensure that transparency leads to accountability in their extractive industries. In Nigeria, for example, EITI reports are playing an important role in the debate about how resources should be managed in the country. “In countries such as Nigeria they’re taking their commitment to transparency and EITI seriously, and are going beyond disclosing information but also ensuring that those findings are acted upon and lead to accountability. For example, the tax collection agency has started to collect more of the revenues that were previously missing,” says Jonas Moberg, head of the EITI International Secretariat.
But the extent to which transparency and open data can actually deliver on their revolutionary potential has also been called into question. Governments and donor agencies can release data, but if the power structures within which this data is consumed and acted upon do not shift, is there really any chance of significant social change?
The complexity of the challenge is illustrated by the case of Mexico, which, in 2014, will succeed Indonesia as chair of the Open Government Partnership. At this year’s London summit, Mexico’s acting civil service minister spoke of the great strides his country has made in opening up the public procurement process, which accounts for around 10% of GDP and is a key area in which transparency and accountability can help tackle corruption.
There is, however, a certain paradox. As SOAS professor Leandro Vergara-Camus, who has written extensively on peasant movements in Mexico, explains: “The NGO sector in Mexico has more of a positive view of these kinds of processes than the working class or peasant organisations. The processes of transparency and accountability have gone further in urban areas than they have in rural areas.”…
With increasing numbers of organisations likely to jump on the transparency bandwagon in the coming year, the greatest challenge is using it effectively and adequately addressing the underlying issues of power and politics.

Top 2013 transparency publications

Open data, transparency and international development, The North South Institute
Data for development: The new conflict resource?, Privacy International
The fix-rate: a key metric for transparency and accountability, Integrity Action
Making UK aid more open and transparent, DfID
Getting a seat at the table: Civil Society advocacy for budget transparency in “untransparent” countries, International Budget Partnership

The dates that mattered

23-24 May: New Extractive Industries Transparency Initiative standard adopted
30 May: Post-2015 high-level panel report calling for a ‘data revolution’ is published
17-18 June: UK premier, David Cameron, campaigns for tax, trade and transparency during the G8
24 October: US Millennium Challenge Corporation tops the aid transparency index
30 October – 1 November: Open Government Partnership summit in London gathers civil society, governments and data experts”

Selected Readings on Data Visualization


The Living Library’s Selected Readings series seeks to build a knowledge base on innovative approaches for improving the effectiveness and legitimacy of governance. This curated and annotated collection of recommended works on the topic of data visualization was originally published in 2013.

Data visualization is a response to the ever-increasing amount of information in the world. With big data, informatics and predictive analytics, we have an unprecedented opportunity to revolutionize policy-making. Yet data by itself can be overwhelming. New tools and techniques for visualizing information can help policymakers clearly articulate insights drawn from data. Moreover, the rise of open data is enabling those outside of government to create informative and visually arresting representations of public information that can be used to support decision-making by those inside or outside governing institutions.

Selected Reading List (in alphabetical order)

  • Duke, D.J., K.W. Brodlie, D.A. Duce and I. Herman. “Do You See What I Mean? [Data Visualization]”
  • Friendly, Michael. “A Brief History of Data Visualization”
  • Graves, Alvaro and James Hendler. “Visualization Tools for Open Government Data”
  • Hidalgo, César A. “Graphical Statistical Methods for the Representation of the Human Development Index and Its Components”
  • Stowers, Genie. “The Use of Data Visualization in Government”

Annotated Selected Reading List (in alphabetical order)

Duke, D.J., K.W. Brodlie, D.A. Duce and I. Herman. “Do You See What I Mean? [Data Visualization].” IEEE Computer Graphics and Applications 25, no. 3 (2005): 6–9. http://bit.ly/1aeU6yA.

  • In this paper, the authors argue that a more systematic ontology for data visualization is needed to ensure the successful communication of meaning. “Visualization begins when someone has data that they wish to explore and interpret; the data are encoded as input to a visualization system, which may in its turn interact with other systems to produce a representation. This is communicated back to the user(s), who have to assess this against their goals and knowledge, possibly leading to further cycles of activity. Each phase of this process involves communication between two parties. For this to succeed, those parties must share a common language with an agreed meaning.”
  • The authors “believe that now is the right time to consider an ontology for visualization,” and “as visualization moves from just a private enterprise involving data and tools owned by a research team into a public activity using shared data repositories, computational grids, and distributed collaboration…[m]eaning becomes a shared responsibility and resource. Through the Semantic Web, there is both the means and motivation to develop a shared picture of what we see when we turn and look within our own field.”

Friendly, Michael. “A Brief History of Data Visualization.” In Handbook of Data Visualization, 15–56. Springer Handbooks Comp.Statistics. Springer Berlin Heidelberg, 2008. http://bit.ly/17fM1e9.

  • In this paper, Friendly explores the “deep roots” of modern data visualization. “These roots reach into the histories of the earliest map making and visual depiction, and later into thematic cartography, statistics and statistical graphics, medicine and other fields. Along the way, developments in technologies (printing, reproduction), mathematical theory and practice, and empirical observation and recording enabled the wider use of graphics and new advances in form and content.”
  • Just as the visualization of data in general is far from a new practice, Friendly shows that the graphical representation of government information has a similarly long history. “The collection, organization and dissemination of official government statistics on population, trade and commerce, social, moral and political issues became widespread in most of the countries of Europe from about 1825 to 1870. Reports containing data graphics were published with some regularity in France, Germany, Hungary and Finland, and with tabular displays in Sweden, Holland, Italy and elsewhere.”

Graves, Alvaro and James Hendler. “Visualization Tools for Open Government Data.” In Proceedings of the 14th Annual International Conference on Digital Government Research, 136–145. Dg.o ’13. New York, NY, USA: ACM, 2013. http://bit.ly/1eNSoXQ.

  • In this paper, the authors argue that, “there is a gap between current Open Data initiatives and an important part of the stakeholders of the Open Government Data Ecosystem.” As it stands, “there is an important portion of the population who could benefit from the use of OGD but who cannot do so because they cannot perform the essential operations needed to collect, process, merge, and make sense of the data. The reasons behind these problems are multiple, the most critical one being a fundamental lack of expertise and technical knowledge. We propose the use of visualizations to alleviate this situation. Visualizations provide a simple mechanism to understand and communicate large amounts of data.”
  • The authors also describe a prototype of a tool to create visualizations based on OGD with the following capabilities:
    • Facilitate visualization creation
    • Exploratory mechanisms
    • Viralization and sharing
    • Repurpose of visualizations

Hidalgo, César A. “Graphical Statistical Methods for the Representation of the Human Development Index and Its Components.” United Nations Development Programme Human Development Reports, September 2010. http://bit.ly/166TKur.

  • In this paper for the United Nations Development Programme, Hidalgo argues that “graphical statistical methods could be used to help communicate complex data and concepts through universal cognitive channels that are heretofore underused in the development literature.”
  • To support his argument, representations are provided that “show how graphical methods can be used to (i) compare changes in the level of development experienced by countries (ii) make it easier to understand how these changes are tied to each one of the components of the Human Development Index (iii) understand the evolution of the distribution of countries according to HDI and its components and (iv) teach and create awareness about human development by using iconographic representations that can be used to graphically narrate the story of countries and regions.”

Stowers, Genie. “The Use of Data Visualization in Government.” IBM Center for The Business of Government, Using Technology Series, 2013. http://bit.ly/1aame9K.

  • This report seeks “to help public sector managers understand one of the more important areas of data analysis today — data visualization. Data visualizations are more sophisticated, fuller graphic designs than the traditional spreadsheet charts, usually with more than two variables and, typically, incorporating interactive features.”
  • Stowers also offers numerous examples of “visualizations that include geographical and health data, or population and time data, or financial data represented in both absolute and relative terms — and each communicates more than simply the data that underpin it. In addition to these many examples of visualizations, the report discusses the history of this technique, and describes tools that can be used to create visualizations from many different kinds of data sets.”

AU: Govt finds one third of open data was "junk"


IT News: “The number of datasets available on the Government’s open data website has shrunk by more than half after the agency discovered that one third of the datasets were junk.
Since its official launch in 2011, data.gov.au grew to hold 1200 datasets from government agencies for public consumption.
In July this year the Department of Finance migrated the portal to a new open source platform – the Open Knowledge Foundation’s CKAN platform – for greater ease of use and publishing ability.
Since July the number of datasets fell from 1200 to 500.
Australian Government CTO John Sheridan said in his blog late yesterday the agency had needed to review the 1200 datasets as a result of the CKAN migration, and discovered a significant amount of them were junk.
“We unfortunately found that a third of the “datasets” were just links to webpages or files that either didn’t exist anymore, or redirected somewhere not useful to genuine seekers of data,” Sheridan said.
“In the second instance, the original 1200 number included each individual file. On the new platform, a dataset may have multiple files. In one case we have a dataset with 200 individual files where before it was counted as 200 datasets.”
The number of datasets following the clean out now sits at 529. Around 123 government bodies contributed data to the portal.
Sheridan said the number was still too low.
“A lot of momentum has built around open data in Australia, including within governments around the country and we are pleased to report that a growing number of federal agencies are looking at how they can better publish data to be more efficient, improve policy development and analysis, deliver mobile services and support greater transparency and public innovation,” he said….
The Federal Government’s approach to open data has previously been criticised as “patchy” and slow, due in part to several shortcomings in the data.gov.au website as well as slow progress in agencies adopting an open approach by default.
The Australian Information Commissioner’s February report on open data in government outlined the manual uploading and updating of datasets, lack of automated entry for metadata and a lack of specific search functions within data.gov.au as obstacles affecting the efforts pushing a whole-of-government approach to open data.
The introduction of the new CKAN platform is expected to go some way to addressing the highlighted concerns.”
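
As context for the counting change Sheridan describes, CKAN exposes a public “action” API that distinguishes datasets (packages) from the individual files (resources) attached to them. Below is a minimal sketch of how one might tally both, assuming the standard CKAN action API at a data.gov.au base URL; the exact endpoint, field names and paging behaviour are assumptions for illustration rather than details from the article.

```python
import json
import urllib.request

BASE = "https://data.gov.au/api/3/action"  # assumed CKAN action API endpoint


def ckan(action, **params):
    """Call a CKAN action API method and return its 'result' payload."""
    query = "&".join(f"{key}={value}" for key, value in params.items())
    with urllib.request.urlopen(f"{BASE}/{action}?{query}") as response:
        return json.load(response)["result"]


# package_search with rows=0 returns only the total number of datasets.
dataset_count = ckan("package_search", rows=0)["count"]

# Each dataset ("package") can bundle several files ("resources"), which is
# why the old per-file count of 1200 shrank once files were grouped.
file_count, start, page_size = 0, 0, 100
while start < dataset_count:
    for package in ckan("package_search", rows=page_size, start=start)["results"]:
        file_count += len(package.get("resources", []))
    start += page_size

print(f"datasets: {dataset_count}, individual files: {file_count}")
```

On the old portal each file was counted separately, so under this sketch the resources total, not the packages total, would be the figure comparable to the original 1200.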

The Impact of Innovation Inducement Prizes


From the Compendium of Evidence on Innovation Policy/NESTA: “Innovation inducement prizes are one of the oldest types of innovation policy measure. The popularity of innovation inducement prizes gradually decreased during the early 20th century. However, innovation inducement prizes have regained some of their popularity since the 1990s with new prizes awarded by the US X Prize Foundation and with the current US Administration’s efforts to use them in various government departments as an innovation policy instrument. Innovation prizes are also becoming an important innovation policy instrument in the UK. A recent report by McKinsey & Company (2009) estimates the value of prizes awarded to be between £600 million and £1.2 billion. Despite the growing popularity of innovation inducement prizes, the impact of this innovation policy measure is still not understood. This report brings together the existing evidence on the effects of innovation inducement prizes by drawing on a number of ex-ante and ex-post evaluations as well as limited academic literature. This report focuses on ex-ante innovation inducement prizes, where the aim is to induce investment or attention towards a specific goal or technology. This report does not discuss the impact of ex-post recognition prizes, where the prize is given as recognition after the intended outcome happens (e.g. the Nobel Prize).
Innovation inducement prizes have a wide range of rationales and there is no agreed-upon dominant rationale in the literature. Traditionally, prizes have been seen as an innovation policy instrument that can overcome market failure by creating an incentive for the development of a particular technology or technology application. A second rationale is the implementation of demonstration projects, in which not only the creation of a specific technology is intended but also the demonstration of its feasible application. A third rationale is related to the creation of a technology that will later be put in the public domain to attract subsequent research. Prizes are also increasingly organised for community and leadership building. As prizes probably allow more flexibility than most other innovation policy instruments, there is a large number of different prize characteristics and thus a vast number of prize typologies based on these characteristics.
Evidence on the effectiveness of prizes is scarce. There are only a few evaluations or academic works that deal with the creation of innovation output, and even those which deal with innovation output only rarely deal with additionality. Only a very limited number of studies have looked at whether innovation inducement prizes led to more innovation or innovation outputs. As well as developing the particular technology that the innovation inducement prizes target, they create prestige for both the prize sponsor and entrants. Prizes might also increase public and sectoral awareness of specific technology issues. A related issue to the prestige gained from the prizes is the motivation of participants as a conditioning factor for innovation performance. Design issues are the main concern of the prizes literature. This reflects the importance of careful design for the achievement of desired effects (and the limitation of undesired effects). There are a relatively large number of studies that investigated the influence of the design of prize objectives on innovation performance. A number of studies point out that prizes should sometimes be accompanied with or followed by other demand-side initiatives to fulfil their objectives, mostly on the basis of ex-ante evaluations. Finally, prizes are also seen as a valuable opportunity for experimentation in innovation policy.
It is evident from the literature we analysed that the evidence on the impact of innovation inducement prizes is scarce. There is also a consensus that innovation inducement prizes are not a substitute for other innovation policy measures but are complementary under certain conditions. Prizes can be effective in creating innovation through more intense competition, the engagement of a wide variety of actors, the distribution of risk across many participants, and the exploitation of more flexible solutions thanks to the less prescriptive way in which problems are defined in prizes. They can overcome some of the inherent barriers to other instruments, but if prizes are poorly designed, managed and awarded, they may be ineffective or even harmful.”

Tech challenge develops algorithms to predict mass atrocities


SciDevNet: “Mathematical models that use existing socio-political data to predict mass atrocities could soon inform governments and NGOs on how and where to take preventative action.
The models emerged from one strand of the Tech Challenge for Atrocity Prevention, a competition run by the US Agency for International Development (USAID) and NGO Humanity United. The winners were announced last month (18 November) and will now work with the organiser to further develop and pilot their innovations.
The five winners, who come from different countries and won between US$1,000 and US$12,000, were among nearly 100 entrants who developed algorithms to predict when and where mass atrocities are likely to happen.
Around 1.5 billion people live in countries affected by conflict, sometimes including atrocities such as genocides, mass rape and ethnic cleansing, according to the World Bank’s World Development Report 2011. Many of these countries are in the developing world.
The competition organisers hope the new algorithms could help governments and human rights organisations identify at-risk regions, potentially allowing them to intervene before mass atrocities happen.
The competition started from the premise that certain social and political measurements are linked to increased likelihood of atrocities. Yet because such factors interact in complex ways, organisations working to prevent atrocities lack a reliable method of predicting when and where they might happen next.
The algorithms use sociopolitical indicators and data on past atrocities as their inputs. The data was drawn from archives such as the Global Database of Events, Language and Tone, a data set that encodes more than 200 million globally newsworthy events, recording cultural information such as the people involved, their location and any religious connections.”
Link to the winners of the Model Challenge
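
The article does not describe the winning algorithms in detail, but the general setup it outlines, supervised learning over country-level sociopolitical indicators with past atrocities as labels, can be sketched roughly as below. All features and data here are synthetic placeholders invented for illustration; a real model would derive its inputs from archives such as the Global Database of Events, Language and Tone.

```python
# Illustrative sketch only: synthetic data, not the competition's actual models.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(seed=0)

# Hypothetical country-year features, e.g. an instability index, recent
# conflict events per capita, and average media tone from an event database.
X = rng.normal(size=(500, 3))

# Synthetic label: whether a mass atrocity occurred in the following year,
# loosely tied to the first two indicators plus noise.
y = (0.8 * X[:, 0] + 0.6 * X[:, 1] + rng.normal(scale=0.5, size=500) > 1.0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression().fit(X_train, y_train)

# Rank held-out country-years by predicted risk so analysts can review the top of the list.
risk = model.predict_proba(X_test)[:, 1]
print("five highest-risk cases:", np.argsort(risk)[::-1][:5])
print("held-out accuracy:", model.score(X_test, y_test))
```

The point of such a model is less to classify perfectly than to produce a ranked watch list that human rights organisations can combine with expert judgement before deciding where to act.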

Participation Dynamics in Crowd-Based Knowledge Production: The Scope and Sustainability of Interest-Based Motivation


New paper by Henry Sauermann and Chiara Franzoni: “Crowd-based knowledge production is attracting growing attention from scholars and practitioners. One key premise is that participants who have an intrinsic “interest” in a topic or activity are willing to expend effort at lower pay than in traditional employment relationships. However, it is not clear how strong and sustainable interest is as a source of motivation. We draw on research in psychology to discuss important static and dynamic features of interest and derive a number of research questions regarding interest-based effort in crowd-based projects. Among others, we consider the specific versus general nature of interest, highlight the potential role of matching between projects and individuals, and distinguish the intensity of interest at a point in time from the development and sustainability of interest over time. We then examine users’ participation patterns within and across 7 different crowd science projects that are hosted on a shared platform. Our results provide novel insights into contribution dynamics in crowd science projects. Moreover, given that extrinsic incentives such as pay, status, self-use, or career benefits are largely absent in these particular projects, the data also provide unique insights into the dynamics of interest-based motivation and into its potential as a driver of effort.”

Building tech-powered public services


New publication by Sarah Bickerstaffe from IPPR (UK): “Given the rapid pace of technological change and take-up by the public, it is a question of when, not if, public services become ‘tech-powered’. This new paper asks how we can ensure that innovations are successfully introduced and deployed.
Can technology improve the experience of people using public services, or does it simply mean job losses and a depersonalised offer to users?
Could tech-powered public services be an affordable, sustainable solution to some of the challenges of these times of austerity?
This report looks at 20 case studies of digital innovation in public services, using these examples to explore the impact of new and disruptive technologies. It considers how tech-powered public services can be delivered, focusing on the area of health and social care in particular.
We identify three key benefits of increasing the role of technology in public services: saving time, boosting user participation, and encouraging users to take responsibility for their own wellbeing.
In terms of how to successfully implement technological innovations in public services, five particular lessons stood out clearly and consistently:

  1. User-based iterative design is critical to delivering a product that solves real-world problems. It builds trust and ensures the technology works in the context in which it will be used.
  2. Public sector expertise is essential in order for a project to make the connections necessary for initial development and early funding.
  3. Access to seed and bridge funding is necessary to get projects off the ground and allow them to scale up.
  4. Strong leadership from within the public sector is crucial to overcoming the resistance that practitioners and managers often show initially.
  5. A strong business case that sets out the quality improvements and cost savings that the innovation can deliver is important to get attention and interest from public services.

The seven headline case studies in this report are:

  • Patchwork creates an elegant solution to join up professionals working with troubled families, in an effort to ensure that frontline support is truly coordinated.
  • Casserole Club links people who like cooking with their neighbours who are in need of a hot meal, employing the simplest possible technology to grow social connections.
  • ADL Smartcare uses a facilitated assessment tool to make professional expertise accessible to staff and service users without years of training, meaning they can carry out assessments together, engaging people in their own care and freeing up occupational therapists to focus where they are needed.
  • Mental Elf makes leading research in mental health freely available via social media, providing accessible summaries to practitioners and patients who would not otherwise have the time or ability to read journal articles, which are often hidden behind a paywall.
  • Patient Opinion provides an online platform for people to give feedback on the care they have received and for healthcare professionals and providers to respond, disrupting the typical complaints process and empowering patients and their families.
  • The Digital Pen and form system has saved the pilot hospital trust three minutes per patient by avoiding the need for manual data entry, freeing up clinical and administrative staff for other tasks.
  • Woodland Wiggle allows children in hospital to enter a magical woodland world through a giant TV screen, where they can have fun, socialise, and do their physiotherapy.”

Google Global Impact Award Expands Zooniverse


Press Release: “A $1.8 million Google Global Impact Award will enable Zooniverse, a nonprofit collaboration led by the Adler Planetarium and the University of Oxford, to make setting up a citizen science project as easy as starting a blog and could lead to thousands of innovative new projects around the world, accelerating the pace of scientific research.
The award supports the further development of the Zooniverse, the world’s leading ‘citizen science’ platform, which has already given more than 900,000 online volunteers the chance to contribute to science by taking part in activities including discovering planets, classifying plankton or searching through old ships’ logs for observations of interest to climate scientists. As part of the Global Impact Award, the Adler will receive $400,000 to support the Zooniverse platform.
With the Google Global Impact Award, Zooniverse will be able to rebuild their platform so that research groups with no web development expertise can build and launch their own citizen science projects.
“We are entering a new era of citizen science – this effort will enable prolific development of science projects in which hundreds of thousands of additional volunteers will be able to work alongside professional scientists to conduct important research – the potential for discovery is limitless,” said Michelle B. Larson, Ph.D., Adler Planetarium president and CEO. “The Adler is honored to join its fellow Zooniverse partner, the University of Oxford, as a Google Global Impact Award recipient.”
The Zooniverse – the world’s leading citizen science platform – is a global collaboration across several institutions that design and build citizen science projects. The Adler is a founding partner of the Zooniverse, which has already engaged more than 900,000 online volunteers as active scientists by discovering planets, mapping the surface of Mars and detecting solar flares. Adler-directed citizen science projects include: Galaxy Zoo (astronomy), Solar Stormwatch (solar physics), Moon Zoo (planetary science), Planet Hunters (exoplanets) and The Milky Way Project (star formation). The Zooniverse (zooniverse.org) also includes projects in environmental, biological and medical sciences. Google’s investment in the Adler and its Zooniverse partner, the University of Oxford, will further the global reach, making thousands of new projects possible.”