Mapping information economy business with big data: findings from the UK


NESTA: “This paper uses innovative ‘big data’ resources to measure the size of the information economy in the UK.

Key Findings

  • Counts of information economy firms are 42 per cent larger than SIC-based estimates
  • Using ‘big data’ estimates, the research finds 225,800 information economy businesses in the UK
  • Information economy businesses are highly clustered across the country, with very high counts in the Greater South East, notably London (especially central and east London), as well as big cities such as Manchester, Birmingham and Bristol
  • Looking at local clusters, we find hotspots in Middlesbrough, Aberdeen, Brighton, Cambridge and Coventry, among others

Information and Communications Technologies – and the digital economy they support – are of enduring interest to researchers and policymakers. National and local government are particularly keen to understand the characteristics and growth potential of ‘their’ digital businesses.
Given the recent resurgence of interest in industrial policy across many developed countries, there is now substantial policy interest in developing stronger, more competitive digital economies. For example, the UK’s current industrial strategy combines horizontal interventions with support for seven key sectors, of which the ‘information economy’ is one.
The desire to grow high–tech clusters is often prominent in the policy mix – for instance, the UK’s Tech City UK initiative, Regional Innovation Clusters in the US and elements of ‘smart specialisation’ policies in the EU.
In this paper, NIESR and Growth Intelligence use novel ‘big data’ sources to improve our understanding of information economy businesses in the UK – that is, those involved in the production of ICTs. We use this experience to critically reflect on some of the opportunities and challenges presented by big data tools and analytics for economic research and policymaking.”
– See more at: http://www.nesta.org.uk/publications/mapping-information-economy-business-big-data-findings-uk-0

Restoring Confidence in Open, Shared and Personal Data


Report of the UK Digital Government Review: “It is obvious that government needs to be able to use data both to deliver services and to present information to public view. How else would government know which bank account to place a pension payment into, or a citizen know the results of an election or how to contact their elected representatives?

As more and more data is created, preserved and shared in ever-increasing volumes a number of urgent questions are begged: over opportunities and hazards; over the importance of using best-practice techniques, insights and technologies developed in the private sector, academia and elsewhere; over the promises and limitations of openness; and how all this might be articulated and made accessible to the public.

Government has already adopted “open data” (we will discuss this more in the next section) and there are now increasing calls for government to pay more attention to data analytics and so-called “big data” – although the first faltering steps to unlock benefits, here, have often ended in the discovery that using large-scale data is a far more nuanced business than was initially assumed.

Debates around government and data have often been extremely high-profile – the NHS care.data [27] debate was raging while this review was in progress – but they are also shrouded in terms that can generate confusion and complexities that are not easily summarized.

In this chapter we will unpick some of these terms and some parts of the debate. This is a detailed and complex area and there is much more that could have been included [28]. This is not an area that can easily be summarized into a simple bullet-pointed list of policies.

Within this report we will use the following terms and definitions, proceeding to a detailed analysis of each in turn:

Type of Data, with definition [29] and examples:

1. Open Data
Definition: Data that can be freely used, reused and redistributed by anyone – subject only, at most, to the requirement to attribute and sharealike.
Examples:
  • Insolvency notices in the London Gazette
  • Government spending information
  • Public transport information
  • Official National Statistics

2. Shared Data
Definition: Restricted data provided to restricted organisations or individuals for restricted purposes.
Examples:
  • National Pupil Database
  • NHS care.data
  • Integrated health and social care
  • Individual census returns

3. Personal Data
Definition: Data that relate to a living individual who can be identified from that data. For full legal definition see [30].
Examples:
  • Health records
  • Individual tax records
  • Insolvency notices in the London Gazette
  • National Pupil Database

NB These definitions overlap. Personal data can exist in both open and shared data.

This social productivity will help build future economic productivity; in the meantime it will improve people’s lives and it will enhance our democracy. From our analysis it was clear that there was room for improvement…”

How to use the Internet to end corrupt deals between companies and governments


Stella Dawson at the Thomson Reuters Foundation: “Every year governments worldwide spend more than $9.5 trillion on public goods and services, but who won those contracts, why, and whether they deliver as promised remain largely invisible.
Enter the Open Contracting Data Standard (OCDS).
Canada, Colombia, Costa Rica and Paraguay became the first countries to announce on Tuesday that they have adopted the new global standards for publishing contracts online as part of a project to shine a light on how public money is spent and to combat massive corruption in public procurement.
“The mission is to end secret deals between companies and governments,” said Gavin Hayman, the incoming executive director for Open Contracting Partnership.
The concept is simple. Under Open Contracting, the government publishes online the projects it is putting out for bid and the terms; companies submit bids online; the winning contract is published including the reasons why; and then citizens can monitor performance according to the terms of the contract.
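The four steps in that description (publish the tender and its terms, collect bids, publish the award with its reasons, then monitor performance) can be sketched as a small record type. This is only an illustration under stated assumptions: the class and field names below are hypothetical and are not taken from the OCDS schema itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Bid:
    company: str
    amount: float


@dataclass
class ContractingProcess:
    """Hypothetical record of one procurement (not the real OCDS schema)."""
    project: str
    terms: str
    bids: List[Bid] = field(default_factory=list)
    winner: Optional[Bid] = None
    award_reason: Optional[str] = None
    performance_notes: List[str] = field(default_factory=list)

    def submit_bid(self, company: str, amount: float) -> None:
        # Bids are only accepted while the tender is still open.
        if self.winner is not None:
            raise ValueError("bidding is closed once the contract is awarded")
        self.bids.append(Bid(company, amount))

    def award(self, company: str, reason: str) -> None:
        # The winning bid is recorded together with the reason it was chosen.
        matches = [b for b in self.bids if b.company == company]
        if not matches:
            raise ValueError(f"{company} did not submit a bid")
        self.winner = matches[0]
        self.award_reason = reason

    def report_performance(self, note: str) -> None:
        # Citizens can only monitor a contract that has been awarded.
        if self.winner is None:
            raise ValueError("no contract has been awarded yet")
        self.performance_notes.append(note)

    def stage(self) -> str:
        if self.winner is None:
            return "tender"
        return "monitoring" if self.performance_notes else "awarded"
```

Keeping the award reason and the performance notes on the same record as the bids is the point of the design: everything a citizen would need to check a contract against its terms sits in one published object.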
The Open Contracting initiative, developed by the World Wide Web Foundation with the support of the World Bank and Omidyar Network, has been several years in the making and is part of a broader global movement to increase the accountability of governments by using Internet technologies to make them more transparent.
A pioneer in data transparency was the Extractive Industries Transparency Initiative, a global coalition of governments, companies and civil society that works on improving accountability by publishing the revenues received in 35 member countries for their natural resources.
Publish What You Fund is a similar initiative for the aid industry. It delivered a common open standard in 2011 for donor countries to publish how much money they gave in development aid and details of what projects that money funded and where.
There’s also the Open Government Partnership, an international forum of 65 countries, each of which adopts an action plan laying out how it will improve the quality of government through collaboration with civil society, frequently using new technologies.
All of these initiatives have helped crack open the door of government.
What’s important about Open Contracting is the sheer scale of impact it could have. Public procurement accounts for about 15 percent of global GDP and according to Anne Jellema, CEO of the World Wide Web Foundation which seeks to expand free access to the web worldwide and backed the OCDS project, corruption adds an estimated $2.3 trillion to the cost of those contracts every year.
A study by the Center for Global Development, a Washington-based think tank, looked at four countries already publishing their contracts online — the United Kingdom, Georgia, Colombia and Slovakia. It found open contracting increased visibility and encouraged more companies to submit bids, the quality and price competitiveness improved and citizen monitoring meant better service delivery….”
 

Gov.uk quietly disrupts the problem of online identity login


The Guardian: “A new “verified identity” scheme for gov.uk is making it simpler to apply for a new driving licence, passport or to file a tax return online, allowing users to register securely using one log in that connects and securely stores their personal data.
After nearly a year of closed testing with a few thousand Britons, the “Gov.UK Verify” scheme quietly opened to general users on 14 October, expanding across more services. It could have as many as half a million users within a year.
The most popular services are expected to be one for tax credit renewals, and CAP farm information – both expected to have around 100,000 users by April next year, and on their own making up nearly half of the total use.
The team behind the system claim this is a world first. Those countries that have developed advanced government services online, such as Estonia, rely on state identity cards – which the UK has rejected.
“This is a federated model of identity, not a centralised one,” said Janet Hughes, head of policy and engagement at the Government Digital Service’s identity assurance program, which developed and tested the system.
How it works
The Verify system has taken three years to develop, and involves checking a user’s identity against details from a range of sources, including credit reference agencies, utility bills, driving licences and mobile provider bills.
But it does not retain those pieces of information, and the credit checking companies do not know what service is being used. Only a mobile or landline number is kept in order to send verification codes for subsequent logins.
When people subsequently log in, they would have to provide a user ID and password, and verify their identity by entering a code sent to the stored phone number.
To enrol in the system, users have to be over 19, be living in the UK, and have been resident for over 12 months. A faked passport would not be sufficient: “they would need a very full false ID, and have to not appear on any list of fraudulent identities,” one source at the GDS told the Guardian.
Banks now following gov.uk’s lead
Government developers are confident that it presents a higher barrier to authentication than any other digital service – so that fraudulent transactions will be minimised. That has interested banks, which are understood to be expressing interest in using the same service to verify customer identities through an arms-length verification system.
The government system would not pass on people’s data, but would instead verify that someone is who they claim to be, much like Twitter and Facebook verify users’ identity to log in to third party sites, yet don’t share their users’ data.
The US, Canada and New Zealand have also expressed interest in following the UK’s lead in the system, which requires users to supply separate pieces of verified information about themselves from different sources.
The system then cross-references that verified information with credit reference agencies and other sources, which can include a mobile phone provider, passport, bank account, utility bill or driving licence.
The level of confidence in an individual’s identity is split into four levels. The lowest is for the creation of simple accounts to receive reports or updates: “we don’t need to know who it is, only that it’s the same person returning,” said Hughes.
Level 2 requires that “on the balance of probability” someone is who they say they are – which is the level to which Verify will be able to identify people. Hughes says that this will cover the majority of services.
Level 3 requires identity “beyond reasonable doubt” – perhaps including the first application for a passport – and Level 4 would require biometric information to confirm individual identity.
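The four confidence levels described above lend themselves to an ordered enumeration, with each service declaring the minimum level it needs. The sketch below is illustrative only: the level names, the service names and the service-to-level mapping are assumptions made for the example, not part of the Verify system as reported.

```python
from enum import IntEnum


class AssuranceLevel(IntEnum):
    """The four confidence levels from the article (member names are hypothetical)."""
    RETURNING_USER = 1           # only need to know it is the same person returning
    BALANCE_OF_PROBABILITY = 2   # the level the article says Verify will reach
    BEYOND_REASONABLE_DOUBT = 3  # perhaps a first passport application
    BIOMETRIC = 4                # would require biometric confirmation


# The highest level the article says Verify can establish.
VERIFY_MAX_LEVEL = AssuranceLevel.BALANCE_OF_PROBABILITY

# Hypothetical mapping from service to required level, for illustration only.
REQUIRED_LEVEL = {
    "email_updates": AssuranceLevel.RETURNING_USER,
    "tax_credit_renewal": AssuranceLevel.BALANCE_OF_PROBABILITY,
    "first_passport_application": AssuranceLevel.BEYOND_REASONABLE_DOUBT,
}


def can_access(service: str, achieved: AssuranceLevel) -> bool:
    """Return True if the achieved level meets the service's requirement."""
    return achieved >= REQUIRED_LEVEL[service]
```

Because the levels are ordered, a user verified at Level 2 automatically qualifies for any Level 1 service, while a Level 3 service such as the hypothetical `first_passport_application` stays out of reach.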

Could digital badges clarify the roles of co-authors?


  at AAAS Science Magazine: “Ever look at a research paper and wonder how the half-dozen or more authors contributed to the work? After all, it’s usually only the first or last author who gets all the media attention or the scientific credit when people are considered for jobs, grants, awards, and more. Some journals try to address this issue with the “authors’ contributions” sections within a paper, but a collection of science, publishing, and software groups is now developing a more modern solution—digital “badges,” assigned on publication of a paper online, that detail what each author did for the work and that the authors can link to their profiles elsewhere on the Web.

Those organizations include publishers BioMed Central and the Public Library of Science; The Wellcome Trust research charity; software development groups Mozilla Science Lab (a group of researchers, developers, librarians, and publishers) and Digital Science (a software and technology firm); and ORCID, an effort to assign researchers digital identifiers. The collaboration presented its progress on the project at the Mozilla Festival in London that ended last week. (Mozilla is the open software community behind the Firefox browser and other programs.)
The infrastructure of the badges is still being established, with early prototypes scheduled to launch early next year, according to Amye Kenall, the journal development manager of open data initiatives and journals at BioMed Central. She envisions the badge process in the following way: Once an article is published, the publisher would alert software maintained by Mozilla to automatically set up an online form, where authors fill out roles using a detailed contributor taxonomy. After the authors have completed this, the badges would then appear next to their names on the journal article, and double-clicking on a badge would lead to the ORCID site for that particular author, where the author’s badges, integrated with their publishing record, live….
The parties behind the digital badge effort are “looking to change behavior” of scientists in the competitive dog-eat-dog world of academia by acknowledging contributions, says Kaitlin Thaney, director of Mozilla Science Lab. Amy Brand, vice president of academic and research relations and VP of North America at Digital Science, says that the collaboration believes that the badges should be optional, to accommodate old-fashioned or less tech-savvy authors. She says that the digital credentials may improve lab culture, countering situations where junior scientists are caught up in lab politics and the “star,” who didn’t do much of the actual research apart from obtaining the funding, gets to be the first author of the paper and receive the most credit. “All of this calls out for more transparency,” Brand says….”

City slicker


The Economist on how “Data are slowly changing the way cities operate…WAITING for a bus on a drizzly winter morning is miserable. But for London commuters Citymapper, an app, makes it a little more bearable. Users enter their destination into a search box and a range of different ways to get there pop up, along with real-time information about when a bus will arrive or when the next Tube will depart. The app is an example of how data are changing the way people view and use cities. Local governments are gradually starting to catch up.
Nearly all big British cities have started to open up access to their data. On October 23rd the second version of the London Datastore, a huge trove of information on everything from crime statistics to delays on the Tube, was launched. In April Leeds City Council opened an online “Data Mill” which contains raw data on such things as footfall in the city centre, the number of allotment sites or visits to libraries. Manchester also releases chunks of data on how the city region operates.
Mostly these websites act as tools for developers and academics to play around with. Since the first Datastore was launched in 2010, around 200 apps, such as Citymapper, have sprung up. Other initiatives have followed. “Whereabouts”, which also launched on October 23rd, is an interactive map by the Future Cities Catapult, a non-profit group, and the Greater London Authority (GLA). It uses 235 data sets, some 150 of them from the Datastore, from the age and occupation of London residents to the number of pubs or types of restaurants in an area. In doing so it suggests a different picture of London neighbourhoods based on eight different categories (see map, and its website: whereaboutslondon.org)….”

Ebola’s Information Paradox


Steven Johnson at The New York Times: “…The story of the Broad Street outbreak is perhaps the most famous case study in public health and epidemiology, in large part because it led to the revolutionary insight that cholera was a waterborne disease, not airborne as most believed at the time. But there is another element of the Broad Street outbreak that warrants attention today, as popular anxiety about Ebola surges across the airwaves and subways and living rooms of the United States: not the spread of the disease itself, but the spread of information about the disease.

It was a full seven days after Baby Lewis became ill, and four days after the Soho residents began dying in mass numbers, before the outbreak warranted the slightest mention in the London papers, a few short lines indicating that seven people had died in the neighborhood. (The report understated the growing death toll by an order of magnitude.) It took two entire weeks before the press began treating the outbreak as a major news event for the city.

Within Soho, the information channels were equally unreliable. Rumors spread throughout the neighborhood that the entire city had succumbed at the same casualty rate, and that London was facing a catastrophe on the scale of the Great Fire of 1666. But this proved to be nothing more than rumor. Because the Soho crisis had originated with a single-point source — the poisoned well — its range was limited compared with its intensity. If you lived near the Broad Street well, you were in grave danger. If you didn’t, you were likely to be unaffected.

Compare this pattern of information flow to the way news spreads now. On Thursday, Craig Spencer, a New York doctor, was given a diagnosis of Ebola after presenting a high fever, and the entire world learned of the test result within hours of the patient himself learning it. News spread with similar velocity several weeks ago with the Dallas Ebola victim, Thomas Duncan. In a sense, it took news of the cholera outbreak a week to travel the 20 blocks from Soho to Fleet Street in 1854; today, the news travels at nearly the speed of light, as data traverses fiber-optic cables. Thanks to that technology, the news channels have been on permanent Ebola watch for weeks now, despite the fact that, as the joke went on Twitter, more Americans have been married to Kim Kardashian than have died in the United States from Ebola.

As societies and technologies evolve, the velocities vary with which disease and information can spread. The tremendous population density of London in the 19th century enabled the cholera bacterium to spread through a neighborhood with terrifying speed, while the information about that terror moved more slowly. This was good news for the mental well-being of England’s wider population, which was spared the anxiety of following the death count as if it were a stock ticker. But it was terrible from a public health standpoint; the epidemic had largely faded before the official institutions of public health even realized the magnitude of the outbreak….

Information travels faster than viruses do now. This is why we are afraid. But this is also why we are safe.”

Chicago uses big data to save itself from urban ills


Aviva Rutkin in the New Scientist: “THIS year in Chicago, some kids will get lead poisoning from the paint or pipes in their homes. Some restaurants will cook food in unsanitary conditions and, here and there, a street corner will be suddenly overrun with rats. These kinds of dangers are hard to avoid in a city of more than 2.5 million people. The problem is, no one knows for certain where or when they will pop up.

The Chicago city government is hoping to change that by knitting powerful predictive models into its everyday city inspections. Its latest project, currently in pilot tests, analyses factors such as home inspection records and census data, and uses the results to guess which buildings are likely to cause lead poisoning in children – a problem that affects around 500,000 children in the US each year. The idea is to identify trouble spots before kids are exposed to dangerous lead levels.

“We are able to prevent problems instead of just respond to them,” says Jay Bhatt, chief innovation officer at the Chicago Department of Public Health. “These models are just the beginning of the use of predictive analytics in public health and we are excited to be at the forefront of these efforts.”

Chicago’s projects are based on the thinking that cities already have what they need to raise their municipal IQ: piles and piles of data. In 2012, city officials built WindyGrid, a platform that collected data like historical facts about buildings and up-to-date streams such as bus locations, tweets and 911 calls. The project was designed as a proof of concept and was never released publicly but it led to another, called Plenario, that allowed the public to access the data via an online portal.

The experience of building those tools has led to more practical applications. For example, one tool matches calls to the city’s municipal hotline complaining about rats with conditions that draw rats to a particular area, such as excessive moisture from a leaking pipe, or with an increase in complaints about garbage. This allows officials to proactively deploy sanitation crews to potential hotspots. It seems to be working: last year, resident requests for rodent control dropped by 15 per cent.
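The matching idea in that paragraph can be illustrated with a very simple heuristic: count complaints about rat-attracting conditions per area and flag the areas that cross a threshold. This is not Chicago's model, which the article describes only in outline; it is a counting rule rather than a predictive model, and the complaint categories, area names and threshold are all assumptions made for the sketch.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Complaint types treated as precursors of rat problems. The article mentions
# moisture from leaking pipes and garbage complaints; the labels are hypothetical.
PRECURSORS = {"leaking_pipe", "garbage"}


def flag_hotspots(complaints: Iterable[Tuple[str, str]], threshold: int = 2) -> List[str]:
    """Return areas with at least `threshold` precursor complaints, sorted by name.

    Each complaint is an (area, kind) pair, e.g. ("ward_1", "garbage").
    """
    counts = Counter(area for area, kind in complaints if kind in PRECURSORS)
    return sorted(area for area, n in counts.items() if n >= threshold)
```

Given calls such as `[("ward_1", "garbage"), ("ward_1", "leaking_pipe"), ("ward_2", "garbage")]`, only `ward_1` reaches the default threshold of two, so that is where a sanitation crew would be sent first.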

Some predictions are trickier to get right. Charlie Catlett, director of the Urban Center for Computation and Data in Chicago, is investigating an old axiom among city cops: that violent crime tends to spike when there’s a sudden jump in temperature. But he’s finding it difficult to test its validity in the absence of a plausible theory for why it might be the case. “For a lot of things about cities, we don’t have that underlying theory that tells us why cities work the way they do,” says Catlett.

Still, predictive modelling is maturing, as other cities succeed in using it to tackle urban ills….Such efforts can be a boon for cities, making them more productive, efficient and safe, says Rob Kitchin of Maynooth University in Ireland, who helped launch a real-time data site for Dublin last month called the Dublin Dashboard. But he cautions that there’s a limit to how far these systems can aid us. Knowing that a particular street corner is likely to be overrun with rats tomorrow doesn’t address what caused the infestation in the first place. “You might be able to create a sticking plaster or be able to manage it more efficiently, but you’re not going to be able to solve the deep structural problems….”

Traversing Digital Babel


New book by Alon Peled: “The computer systems of government agencies are notoriously complex. New technologies are piled on older technologies, creating layers that call to mind an archaeological dig. Obsolete programming languages and closed mainframe designs offer barriers to integration with other agency systems. Worldwide, these unwieldy systems waste billions of dollars, keep citizens from receiving services, and even—as seen in interoperability failures on 9/11 and during Hurricane Katrina—cost lives. In this book, Alon Peled offers a groundbreaking approach for enabling information sharing among public sector agencies: using selective incentives to “nudge” agencies to exchange information assets. Peled proposes the establishment of a Public Sector Information Exchange (PSIE), through which agencies would trade information.
After describing public sector information sharing failures and the advantages of incentivized sharing, Peled examines the U.S. Open Data program, and the gap between its rhetoric and results. He offers examples of creative public sector information sharing in the United States, Australia, Brazil, the Netherlands, and Iceland. Peled argues that information is a contested commodity, and draws lessons from the trade histories of other contested commodities—including cadavers for anatomical dissection in nineteenth-century Britain. He explains how agencies can exchange information as a contested commodity through a PSIE program tailored to an individual country’s needs, and he describes the legal, economic, and technical foundations of such a program. Touching on issues from data ownership to freedom of information, Peled offers pragmatic advice to politicians, bureaucrats, technologists, and citizens for revitalizing critical information flows.”

The Role Of Open Data In Choosing Neighborhood


PlaceILive Blog: “To what extent is it important to get familiar with our environment?
If we think about how the world surrounding us has changed throughout the years, it is not so unreasonable that, while walking to work, we might encounter some new little shops, restaurants, or gas stations we had never noticed before. Likewise, how many times did we wander about for hours just to find green spaces for a run? And the only one we noticed was even more polluted than other urban areas!
Citizens are not always properly informed about the evolution of the places they live in. That is why it is crucial for people to be constantly up to date with accurate information about the neighborhood they have chosen or are going to choose.
London is neat evidence of how essential transparency in providing data is to succeeding as a Smart City.
The GLA’s London Datastore, for instance, is a public platform of datasets revealing updated figures on the main services offered by the city, as well as on the population’s lifestyle and environmental risks. These data are then made more easily accessible to the community through the London Dashboard.
The importance of providing free information is also shown by the integration of maps, which are an efficient means of geolocation. Consulting a map that makes it easy to find all the services you need as close as possible can be significant in the search for a location.
[Image: Wheel (source: Smart London Plan)]
The Open Data Index, published by The Open Knowledge Foundation in 2013, is another useful tool for data retrieval: it ranks countries around the world with scores based on the openness and availability of data such as transport timetables and national statistics.
The UK Open Data Census and the US City Open Data Census can also be checked.
As has been stated, making open data available and easily findable online not only represented a success for US cities but also favoured app makers and civic hackers. Lauren Reid, a spokesperson at Code for America, said, according to Government Technology: “The more data we have, the better picture we have of the open data landscape.”
That is, on the whole, what Place I Live puts the biggest effort into: fostering a new awareness of the environment by providing free information, in order to support citizens who want to choose the best place to live.
The outcome is easily explained. The website’s homepage lets visitors type in an address of interest and displays an overview of how the neighborhood scores on various parameters, along with a Life Quality Index calculated for every point on the map.
The search for the nearest medical institutions, schools or ATMs thus becomes immediate and clear, as does the overview of general information about the community. Moreover, the data’s reliability and accessibility are constantly examined by a strong team of professionals with high competence in data analysis, mapping, IT architecture and global markets.
For the moment the company’s work is focused on London, Berlin, Chicago, San Francisco and New York, while the longer-term goal is to cover more than 200 cities.
In the US Open Data Census, San Francisco achieved the highest score, proof of the city’s work in putting technological expertise at everyone’s disposal and in meeting users’ needs through a meticulous selection of datasets. This challenge seems to be met by San Francisco’s new investment, in partnership with the University of Chicago, in a data analytics dashboard on sustainability performance statistics named Sustainable Systems Framework, which is expected to be released in beta version by the end of 2015’s first quarter.
 
Another remarkable contribution to the spread of open data comes from the Bartlett Centre for Advanced Spatial Analysis (CASA) at University College London (UCL); Oliver O’Brien, a researcher at the UCL Department of Geography and software developer at CASA, is one of the contributors to this cause.
Among his products, an interesting accomplishment is London’s CityDashboard, a real-time control panel of spatial data. The web page also lets users view all the data on a simplified map and look at dashboards for other UK cities.
Plus, his Bike Share Map is a live global view of bicycle sharing systems in over a hundred cities around the world, since bike sharing has recently drawn greater public attention as an original form of transportation, in Europe and China above all….”