Mapping information economy business with big data: findings from the UK


NESTA: “This paper uses innovative ‘big data’ resources to measure the size of the information economy in the UK.

Key Findings

  • Counts of information economy firms are 42 per cent larger than SIC-based estimates
  • Using ‘big data’ estimates, the research finds 225,800 information economy businesses in the UK
  • Information economy businesses are highly clustered across the country, with very high counts in the Greater South East, notably London (especially central and east London), as well as big cities such as Manchester, Birmingham and Bristol
  • Looking at local clusters, we find hotspots in Middlesbrough, Aberdeen, Brighton, Cambridge and Coventry, among others

Information and Communications Technologies – and the digital economy they support – are of enduring interest to researchers and policymakers. National and local government are particularly keen to understand the characteristics and growth potential of ‘their’ digital businesses.
Given the recent resurgence of interest in industrial policy across many developed countries, there is now substantial policy interest in developing stronger, more competitive digital economies. For example, the UK’s current industrial strategy combines horizontal interventions with support for seven key sectors, of which the ‘information economy’ is one.
The desire to grow high–tech clusters is often prominent in the policy mix – for instance, the UK’s Tech City UK initiative, Regional Innovation Clusters in the US and elements of ‘smart specialisation’ policies in the EU.
In this paper, NIESR and Growth Intelligence use novel ‘big data’ sources to improve our understanding of information economy businesses in the UK – that is, those involved in the production of ICTs. We use this experience to critically reflect on some of the opportunities and challenges presented by big data tools and analytics for economic research and policymaking.”
– See more at: http://www.nesta.org.uk/publications/mapping-information-economy-business-big-data-findings-uk-0

Restoring Confidence in Open, Shared and Personal Data


Report of the UK Digital Government Review: “It is obvious that government needs to be able to use data both to deliver services and to present information to public view. How else would government know which bank account to place a pension payment into, or a citizen know the results of an election or how to contact their elected representatives?

As more and more data is created, preserved and shared in ever-increasing volumes a number of urgent questions are begged: over opportunities and hazards; over the importance of using best-practice techniques, insights and technologies developed in the private sector, academia and elsewhere; over the promises and limitations of openness; and how all this might be articulated and made accessible to the public.

Government has already adopted “open data” (we will discuss this more in the next section) and there are now increasing calls for government to pay more attention to data analytics and so-called “big data” – although the first faltering steps to unlock benefits here have often ended in the discovery that using large-scale data is a far more nuanced business than was initially assumed.

Debates around government and data have often been extremely high-profile – the NHS care.data [27] debate was raging while this review was in progress – but they are also shrouded in terms that can generate confusion and complexities that are not easily summarized.

In this chapter we will unpick some of these terms and some parts of the debate. This is a detailed and complex area and there is much more that could have been included [28]. This is not an area that can easily be summarized into a simple bullet-pointed list of policies.

Within this report we will use the following terms and definitions, proceeding to a detailed analysis of each in turn:

Type of Data – Definition [29] – Examples:

1. Open Data – Data that can be freely used, reused and redistributed by anyone – subject only, at most, to the requirement to attribute and share-alike.
   Examples: Insolvency notices in the London Gazette; Government spending information; Public transport information; Official National Statistics.

2. Shared Data – Restricted data provided to restricted organisations or individuals for restricted purposes.
   Examples: National Pupil Database; NHS care.data; Integrated health and social care; Individual census returns.

3. Personal Data – Data that relate to a living individual who can be identified from that data (for the full legal definition see [30]).
   Examples: Health records; Individual tax records; Insolvency notices in the London Gazette; National Pupil Database.

NB These definitions overlap. Personal data can exist in both open and shared data.

This social productivity will help build future economic productivity; in the meantime it will improve people’s lives and it will enhance our democracy. From our analysis it was clear that there was room for improvement…”

Understanding "New Power"


Article by Jeremy Heimans and Henry Timms in Harvard Business Review: “We all sense that power is shifting in the world. We see increasing political protest, a crisis in representation and governance, and upstart businesses upending traditional industries. But the nature of this shift tends to be either wildly romanticized or dangerously underestimated.
There are those who cherish giddy visions of a new techno-utopia in which increased connectivity yields instant democratization and prosperity. The corporate and bureaucratic giants will be felled and the crowds coronated, each of us wearing our own 3D-printed crown. There are also those who have seen this all before. Things aren’t really changing that much, they say. Twitter supposedly toppled a dictator in Egypt, but another simply popped up in his place. We gush over the latest sharing-economy start-up, but the most powerful companies and people seem only to get more powerful.
Both views are wrong. They confine us to a narrow debate about technology in which either everything is changing or nothing is. In reality, a much more interesting and complex transformation is just beginning, one driven by a growing tension between two distinct forces: old power and new power.
Old power works like a currency. It is held by few. Once gained, it is jealously guarded, and the powerful have a substantial store of it to spend. It is closed, inaccessible, and leader-driven. It downloads, and it captures.
New power operates differently, like a current. It is made by many. It is open, participatory, and peer-driven. It uploads, and it distributes. Like water or electricity, it’s most forceful when it surges. The goal with new power is not to hoard it but to channel it.

The battle and the balancing between old and new power will be a defining feature of society and business in the coming years. In this article, we lay out a simple framework for understanding the underlying dynamics at work and how power is really shifting: who has it, how it is distributed, and where it is heading….”

Smart cities: the state-of-the-art and governance challenge


New Paper by Mark Deakin in Triple Helix – A Journal of University-Industry-Government Innovation and Entrepreneurship: “Reflecting on the governance of smart cities, the state-of-the-art this paper advances offers a critique of recent city ranking and future Internet accounts of their development. Armed with these critical insights, it goes on to explain smart cities in terms of the social networks, cultural attributes and environmental capacities, vis-a-vis, vital ecologies of the intellectual capital, wealth creation and standards of participatory governance regulating their development. The Triple Helix model which the paper advances to explain these performances in turn suggests that cities are smart when the ICTs of future Internet developments successfully embed the networks society needs for them to not only generate intellectual capital, or create wealth, but also cultivate the environmental capacity, ecology and vitality of those spaces which the direct democracy of their participatory governance open up, add value to and construct.”

Smarter Than Us: The Rise of Machine Intelligence


Book by Stuart Armstrong at the Machine Intelligence Research Institute: “What happens when machines become smarter than humans? Forget lumbering Terminators. The power of an artificial intelligence (AI) comes from its intelligence, not physical strength and laser guns. Humans steer the future not because we’re the strongest or the fastest but because we’re the smartest. When machines become smarter than humans, we’ll be handing them the steering wheel. What promises—and perils—will these powerful machines present? Stuart Armstrong’s new book navigates these questions with clarity and wit.
Can we instruct AIs to steer the future as we desire? What goals should we program into them? It turns out this question is difficult to answer! Philosophers have tried for thousands of years to define an ideal world, but there remains no consensus. The prospect of goal-driven, smarter-than-human AI gives moral philosophy a new urgency. The future could be filled with joy, art, compassion, and beings living worthwhile and wonderful lives—but only if we’re able to precisely define what a “good” world is, and skilled enough to describe it perfectly to a computer program.
AIs, like computers, will do what we say—which is not necessarily what we mean. Such precision requires encoding the entire system of human values for an AI: explaining them to a mind that is alien to us, defining every ambiguous term, clarifying every edge case. Moreover, our values are fragile: in some cases, if we mis-define a single piece of the puzzle—say, consciousness—we end up with roughly 0% of the value we intended to reap, instead of 99% of the value.
Though an understanding of the problem is only beginning to spread, researchers from fields ranging from philosophy to computer science to economics are working together to conceive and test solutions. Are we up to the challenge?
A mathematician by training, Armstrong is a Research Fellow at the Future of Humanity Institute (FHI) at Oxford University. His research focuses on formal decision theory, the risks and possibilities of AI, the long term potential for intelligent life (and the difficulties of predicting this), and anthropic (self-locating) probability. Armstrong wrote Smarter Than Us at the request of the Machine Intelligence Research Institute, a non-profit organization studying the theoretical underpinnings of artificial superintelligence.”

Linguistic Mapping Reveals How Word Meanings Sometimes Change Overnight


Emerging Technology From the arXiv: “In October 2012, Hurricane Sandy approached the eastern coast of the United States. At the same time, the English language was undergoing a small earthquake of its own. Just months before, the word “sandy” was an adjective meaning “covered in or consisting mostly of sand” or “having light yellowish brown color.” Almost overnight, this word gained an additional meaning as a proper noun for one of the costliest storms in U.S. history.
A similar change occurred to the word “mouse” in the early 1970s when it gained the new meaning of “computer input device.” In the 1980s, the word “apple” became a proper noun synonymous with the computer company. And later, the word “windows” followed a similar course after the release of the Microsoft operating system.
All this serves to show how language constantly evolves, often slowly but at other times almost overnight. Keeping track of these new senses and meanings has always been hard. But not anymore.
Today, Vivek Kulkarni at Stony Brook University in New York and a few pals show how they have tracked these linguistic changes by mining the corpus of words stored in databases such as Google Books, movie reviews from Amazon, and of course the microblogging site Twitter.
These guys have developed three ways to spot changes in the language. The first is a simple count of how often words are used, using tools such as Google Trends. For example, in October 2012, the frequency of the words “Sandy” and “hurricane” both spiked in the runup to the storm. However, only one of these words changed its meaning, something that a frequency count cannot spot.
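The frequency-count approach can be sketched in a few lines: bucket a corpus by time period, count occurrences of a word, and flag periods where usage jumps relative to the baseline. The corpus below is an invented toy stand-in, not the authors' datasets; as the article notes, a spike tells you usage surged but not whether the meaning changed.

```python
def frequency_series(docs_by_period, word):
    """Count how often `word` appears in each time period."""
    return {period: sum(doc.lower().split().count(word) for doc in docs)
            for period, docs in docs_by_period.items()}

def spikes(series, threshold=3.0):
    """Flag periods whose count is at least `threshold` times the mean of the others."""
    flagged = []
    for period, count in series.items():
        others = [c for p, c in series.items() if p != period]
        baseline = sum(others) / len(others) if others else 0
        if baseline and count / baseline >= threshold:
            flagged.append(period)
    return flagged

corpus = {
    "2012-08": ["a sandy beach", "walking the dog"],
    "2012-09": ["sandy soil in the garden"],
    "2012-10": ["hurricane sandy hits", "sandy makes landfall",
                "sandy storm surge", "after sandy"],
}
series = frequency_series(corpus, "sandy")
print(spikes(series))  # ['2012-10'] – the spike is detected; the meaning change is not
```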
So Kulkarni and co have a second method in which they label all of the words in the databases according to their parts of speech, whether a noun, a proper noun, a verb, an adjective and so on. This clearly reveals a change in the way the word “Sandy” was used, from adjective to proper noun, while also showing that the word “hurricane” had not changed.
The parts of speech technique is useful but not infallible. It cannot pick up the change in meaning of the word “mouse,” since both of its senses are nouns. So the team have a third approach.
This maps the linguistic vector space in which words are embedded. The idea is that words in this space are close to other words that appear in similar contexts. For example, the word “big” is close to words such as “large,” “huge,” “enormous,” and so on.
By examining the linguistic space at different points in history, it is possible to see how meanings have changed. For example, in the 1950s, the word “gay” was close to words such as “cheerful” and “dapper.” Today, however, it has moved significantly to be closer to words such as “lesbian,” “homosexual,” and so on.
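The vector-space idea can be illustrated with cosine similarity over toy embeddings: a word's nearest neighbours are the words whose vectors point in the most similar direction. The 3-dimensional vectors below are invented purely for illustration; the embeddings in the paper are learned from real corpora and have hundreds of dimensions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Invented 3-d vectors standing in for embeddings learned in two eras.
era_1950s = {"gay": [0.9, 0.1, 0.0], "cheerful": [0.8, 0.2, 0.1],
             "lesbian": [0.1, 0.9, 0.2]}
era_today = {"gay": [0.2, 0.8, 0.1], "cheerful": [0.8, 0.2, 0.1],
             "lesbian": [0.1, 0.9, 0.2]}

for era in (era_1950s, era_today):
    sims = {w: cosine(era["gay"], v) for w, v in era.items() if w != "gay"}
    print(max(sims, key=sims.get))  # nearest neighbour of "gay" in that era
```

Running this prints “cheerful” for the 1950s-style vectors and “lesbian” for the modern ones: the word has moved in the space even though its frequency and part of speech are unchanged.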
Kulkarni and co examine three different databases to see how words have changed: the set of five-word sequences that appear in the Google Books corpus, Amazon movie reviews since 2000, and messages posted on Twitter between September 2011 and October 2013.
Their results reveal not only which words have changed in meaning, but when the change occurred and how quickly. For example, before the 1970s, the word “tape” was used almost exclusively to describe adhesive tape but then gained an additional meaning of “cassette tape.”…”

A micro-democratic perspective on crowd-work


New paper by Karin Hansson: “Social media has provided governments with new means to improve efficiency and innovation, by engaging a crowd in the gathering and development of data. These collaborative processes are also described as a way to improve democracy by enabling a more transparent and deliberative democracy where citizens participate more directly in decision processes on different levels. However, the dominant research in the e-democratic field takes a government perspective rather than a citizen perspective. E-democracy from the perspective of the individual actor, in a global context, is less developed.
In this paper I therefore develop a model for a democratic process outside the realm of the nation state, in a performative state where inequality is the norm and the state is unclear and fluid. In this process, e-participation means an ICT-supported method to gather a diversity of opinions and perspectives rather than a single one. This micro perspective on democratic participation online might be useful for the development of tools for more democratic online crowds…”

Hungry Planet: Can Big Data Help Feed 9 Billion Humans?


Article at NBC News: “With a population set to hit 9 billion human beings by 2050, the world needs to grow more food – without cutting down forests and jungles, which are the climate’s huge lungs.

The solution, according to one soil management scientist, is Big Data.

Kenneth Cassman, an agronomist at the University of Nebraska, Lincoln, recently unveiled a new interactive mapping tool that shows in fine-grain detail where higher crop yields are possible on current arable land.

“By some estimates, 20 to 30 percent of greenhouse gas emissions are associated with agriculture and of that a large portion is due to conversion of natural systems like rainforests or grassland savannahs to crop production, agriculture,” Cassman told NBC News at a conference in suburban Seattle.

The only practical way to stop the conversion of wild lands to farmland is to grow more food on land already dedicated to agriculture, he said. Currently, the amount of farmland used to produce rice, wheat, maize and soybean, he noted, is expanding at a rate of about 20 million acres a year.

Cassman and colleagues unveiled the Global Yield Gap and Water Productivity Atlas in October at the Water for Food conference. The atlas was six years and $6 million in the making and contains site-specific data on soil, climate and cropping systems to determine potential yield versus actual yield farm by farm in nearly 20 countries around the world. Projects are ongoing to secure data for 30 more countries….
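The atlas's central quantity is the yield gap – potential yield minus actual yield at a given site. A minimal sketch of that calculation, with invented figures (the atlas itself derives potential yield from site-specific soil, climate and cropping-system data):

```python
def yield_gap(potential_t_per_ha, actual_t_per_ha):
    """Yield gap in tonnes/ha, and as a percentage of potential yield."""
    gap = potential_t_per_ha - actual_t_per_ha
    return gap, 100 * gap / potential_t_per_ha

# Invented example figures for one site (tonnes per hectare).
gap, pct = yield_gap(potential_t_per_ha=10.0, actual_t_per_ha=6.5)
print(gap, pct)  # gap of 3.5 t/ha, i.e. 35% of potential yield is unrealised
```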

A key initiative going forward is to teach smallholder farmers how to use the atlas, Cassman said. Until now, the tool has largely rested with agricultural researchers who have validated its promise of delivering information that can help grow more food on existing farmland….

New Tool in Fighting Corruption: Open Data


Martin Tisne at Omidyar Network: “Yesterday in Brisbane, the G20 threw its weight behind open data by featuring it prominently in the G20 Anti-Corruption working action plan. Specifically, the action plan calls for effort in three related areas:

(1)   Prepare a G20 compendium of good practices and lessons learned on open data and its application in the fight against corruption
(2)   Prepare G20 Open Data Principles, including identifying areas or sectors where their application is particularly useful
(3)   Complete self-assessments of G20 country open data frameworks and initiatives

Open data describes information that is not simply public, but that has been published in a manner that makes it easy to access and easy to compare and connect with other information.
This matters for anti-corruption – if you are a journalist or a civil society activist investigating bribery and corruption, those connections are everything. They tell you that an anonymous person (e.g. ‘Mr Smith’) who owns an obscure company registered in a tax haven is linked to another company that has been illegally exporting timber from a neighboring country. That the said Mr Smith is also the son-in-law of the mining minister of yet another country, who herself has been accused of embezzling mining revenues. As we have written elsewhere on this blog, investigative journalists, prosecution authorities, and civil society groups all need access to this linked data for their work.
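At its simplest, the cross-dataset linking described above is a join on a shared identifier across two separately published datasets. The records below are invented for illustration; real investigations face much messier matching (name variants, transliteration, shell-company layers):

```python
# Hypothetical records from two separately published open datasets.
company_owners = [
    {"company": "Obscure Holdings Ltd", "owner": "Mr Smith"},
    {"company": "Timber Exports SA", "owner": "Mr Smith"},
]
export_violations = [
    {"company": "Timber Exports SA", "offence": "illegal timber export"},
]

# Join on the company name to connect an owner to a violation.
violations_by_company = {v["company"]: v["offence"] for v in export_violations}
links = [(rec["owner"], rec["company"], violations_by_company[rec["company"]])
         for rec in company_owners if rec["company"] in violations_by_company]
print(links)  # [('Mr Smith', 'Timber Exports SA', 'illegal timber export')]
```

The join only works if both datasets are published in machine-readable form with comparable identifiers – which is exactly what distinguishes open data from data that is merely public.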
The action plan also links open data to the wider G20 agenda, citing its impact on the ability of businesses to make better investment decisions. You can find the full detail here….”

Building a complete Tweet index


Yi Zhuang (@yz) at Twitter: “Since that first simple Tweet over eight years ago, hundreds of billions of Tweets have captured everyday human experiences and major historical events. Our search engine excelled at surfacing breaking news and events in real time, and our search index infrastructure reflected this strong emphasis on recency. But our long-standing goal has been to let people search through every Tweet ever published.
This new infrastructure enables many use cases, providing comprehensive results for entire TV and sports seasons, conferences (#TEDGlobal), industry discussions (#MobilePayments), places, businesses and long-lived hashtag conversations across topics, such as #JapanEarthquake, #Election2012, #ScotlandDecides, #HongKong, #Ferguson and many more. This change will be rolling out to users over the next few days.
In this post, we describe how we built a search service that efficiently indexes roughly half a trillion documents and serves queries with an average latency of under 100ms….”
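At its core, a full-text search index like the one described maps each term to the documents (here, Tweet IDs) that contain it – an inverted index. The sketch below is a minimal toy version for illustration only, not Twitter's actual infrastructure, which shards and compresses such an index across many machines.

```python
from collections import defaultdict

def build_index(tweets):
    """Map each term to the set of tweet IDs containing it."""
    index = defaultdict(set)
    for tweet_id, text in tweets.items():
        for term in text.lower().split():
            index[term].add(tweet_id)
    return index

def search(index, query):
    """Return IDs of tweets containing every query term (AND semantics)."""
    terms = query.lower().split()
    if not terms:
        return set()
    results = index.get(terms[0], set()).copy()
    for term in terms[1:]:
        results &= index.get(term, set())
    return results

tweets = {1: "#JapanEarthquake relief efforts", 2: "#Ferguson protest coverage",
          3: "relief supplies arriving #JapanEarthquake"}
index = build_index(tweets)
print(sorted(search(index, "#JapanEarthquake relief")))  # [1, 3]
```

Querying is then an intersection of posting sets, which is what lets a query over a huge corpus return in milliseconds: only the sets for the query terms are touched, not the documents themselves.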