Teaching machines to understand – and summarize – text


 and  in The Conversation: “We humans are swamped with text. It’s not just news and other timely information: Regular people are drowning in legal documents. The problem is so bad we mostly ignore it. Every time a person uses a store’s loyalty rewards card or connects to an online service, his or her activities are governed by the equivalent of hundreds of pages of legalese. Most people pay no attention to these massive documents, often labeled “terms of service,” “user agreement” or “privacy policy.”

These are just part of a much wider societal problem of information overload. There is so much data stored – exabytes of it, as much stored as has ever been spoken by people in all of human history – that it’s humanly impossible to read and interpret everything. Often, we narrow down our pool of information by choosing particular topics or issues to pay attention to. But it’s important to actually know the meaning and contents of the legal documents that govern how our data is stored and who can see it.

As computer science researchers, we are working on ways artificial intelligence algorithms could digest these massive texts and extract their meaning, presenting it in terms regular people can understand….

Examining privacy policies

A modern internet-enabled life today more or less requires trusting for-profit companies with private information (like physical and email addresses, credit card numbers and bank account details) and personal data (photos and videos, email messages and location information).

These companies’ cloud-based systems typically keep multiple copies of users’ data as part of backup plans to prevent service outages. That means there are more potential targets – each data center must be securely protected both physically and electronically. Of course, internet companies recognize customers’ concerns and employ security teams to protect users’ data. But the specific and detailed legal obligations they undertake to do that are found in their impenetrable privacy policies. No regular human – and perhaps even no single attorney – can truly understand them.

In our study, we ask computers to summarize the terms and conditions regular users say they agree to when they click “Accept” or “Agree” buttons for online services. We downloaded the publicly available privacy policies of various internet companies, including Amazon AWS, Facebook, Google, HP, Oracle, PayPal, Salesforce, Snapchat, Twitter and WhatsApp….

Our software examines the text and uses information extraction techniques to identify key information specifying the legal rights, obligations and prohibitions identified in the document. It also uses linguistic analysis to identify whether each rule applies to the service provider, the user or a third-party entity, such as advertisers and marketing companies. Then it presents that information in clear, direct, human-readable statements….(More)”
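The classification step described above can be illustrated with a very small rule-based sketch. This is not the researchers' software: the cue lists, actor patterns and the `classify` function are hypothetical, and a real system would rely on much richer linguistic analysis than keyword matching.

```python
import re

# Hypothetical cue lists (assumptions, not taken from the study).
# Prohibition cues come first so that "may not" is not read as the right "may".
CUES = [
    ("prohibition", r"\b(?:must not|may not|shall not|will not|cannot)\b"),
    ("obligation", r"\b(?:must|shall|will|agree to|is required to|are required to)\b"),
    ("right", r"\b(?:may|can|is entitled to|are entitled to)\b"),
]

# Hypothetical patterns for the three kinds of party named in the text.
ACTORS = [
    ("user", r"\b(?:you|users?|customers?)\b"),
    ("provider", r"\b(?:we|our company|the service)\b"),
    ("third party", r"\b(?:advertisers?|partners?|third part(?:y|ies))\b"),
]


def classify(sentence):
    """Return (actor, rule_type) for a policy sentence, or None if no cue is found."""
    text = sentence.lower()
    for rule_type, cue in CUES:
        match = re.search(cue, text)
        if match:
            break
    else:
        return None  # no modal cue, so the sentence is not treated as a rule

    # Treat the actor mentioned closest before the cue as the subject of the rule.
    subject_text = text[:match.start()]
    actor, best_end = "unknown", -1
    for name, pattern in ACTORS:
        for found in re.finditer(pattern, subject_text):
            if found.end() > best_end:
                actor, best_end = name, found.end()
    return actor, rule_type


if __name__ == "__main__":
    for line in [
        "You may not share your password with third parties.",
        "We will notify you of any changes.",
        "Advertisers may collect information about your device.",
    ]:
        print(classify(line), "<-", line)
```

Keyword matching of this kind misfires easily (for example on the month "May"), which is one reason the approach described in the article leans on linguistic analysis rather than surface patterns.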

Artificial intelligence can predict which congressional bills will pass


Other algorithms have predicted whether a bill will survive a congressional committee, or whether the Senate or House of Representatives will vote to approve it—all with varying degrees of success. But John Nay, a computer scientist and co-founder of Skopos Labs, a Nashville-based AI company focused on studying policymaking, wanted to take things one step further. He wanted to predict whether an introduced bill would make it all the way through both chambers—and precisely what its chances were.

Nay started with data on the 103rd Congress (1993–1995) through the 113th Congress (2013–2015), downloaded from a legislation-tracking website called GovTrack. This included the full text of the bills, plus a set of variables, including the number of co-sponsors, the month the bill was introduced, and whether the sponsor was in the majority party of their chamber. Using data on Congresses 103 through 106, he trained machine-learning algorithms—programs that find patterns on their own—to associate bills’ text and contextual variables with their outcomes. He then predicted how each bill would do in the 107th Congress. Then, he trained his algorithms on Congresses 103 through 107 to predict the 108th Congress, and so on.
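The train-then-predict schedule described above is an expanding-window evaluation. A minimal sketch of how the splits could be generated follows; the function name is hypothetical, and ending at the 113th Congress is an assumption based on the range of data mentioned, since the excerpt only says "and so on".

```python
def expanding_window_splits(first, first_test, last):
    """Yield (training_congresses, test_congress) pairs.

    Each test Congress is predicted by a model trained on every
    earlier Congress, starting from `first`.
    """
    for test in range(first_test, last + 1):
        yield list(range(first, test)), test


if __name__ == "__main__":
    # Assumed range: train from the 103rd, test the 107th through the 113th.
    for train, test in expanding_window_splits(103, 107, 113):
        print(f"train on {train[0]}-{train[-1]}, predict {test}")
```

Training only on earlier Congresses keeps information from the future out of each prediction, which is the point of evaluating this way rather than shuffling all bills together.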

Nay’s most complex machine-learning algorithm combined several parts. The first part analyzed the language in the bill. It interpreted the meaning of words by how they were embedded in surrounding words. For example, it might see the phrase “obtain a loan for education” and assume “loan” has something to do with “obtain” and “education.” A word’s meaning was then represented as a string of numbers describing its relation to other words. The algorithm combined these numbers to assign each sentence a meaning. Then, it found links between the meanings of sentences and the success of bills that contained them. Three other algorithms found connections between contextual data and bill success. Finally, an umbrella algorithm used the results from those four algorithms to predict what would happen…. his program scored about 65% better than simply guessing that a bill wouldn’t pass, Nay reported last month in PLOS ONE…(More).
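To make the idea of "a string of numbers" per word concrete, here is a toy sketch that combines word vectors into a sentence vector by averaging them. Averaging is an assumption chosen for simplicity; the excerpt does not say how the numbers were combined, and the two-dimensional vectors below are invented for illustration rather than learned from any bill text.

```python
# Invented toy vectors (assumption); real embeddings are learned from text
# and typically have hundreds of dimensions.
TOY_VECTORS = {
    "obtain": [0.9, 0.1],
    "loan": [0.8, 0.3],
    "education": [0.2, 0.9],
}


def sentence_vector(sentence, vectors):
    """Average the vectors of the known words in a sentence.

    Returns None when no word in the sentence has a vector.
    """
    words = [w for w in sentence.lower().split() if w in vectors]
    if not words:
        return None
    dims = len(next(iter(vectors.values())))
    return [sum(vectors[w][i] for w in words) / len(words) for i in range(dims)]


if __name__ == "__main__":
    print(sentence_vector("obtain a loan for education", TOY_VECTORS))
```

Words without a vector ("a", "for") are simply skipped here. A downstream model could then look for links between such sentence vectors and whether the bills containing them passed.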

AI software created for drones monitors wild animals and poachers


Springwise: “Artificial intelligence software installed into drones is to be used by US tech company Neurala to help protect endangered species from poachers. Working with the region’s Lindbergh Foundation, Neurala is currently helping operations in South Africa, Malawi and Zimbabwe and has had requests from Botswana, Mozambique and Zambia for assistance with combatting poaching.

The software is designed to monitor video as it is streamed back to researchers from unmanned drones that can fly for up to five hours, identifying animals, vehicles and poachers in real time without any human input. It can then alert rangers via the mobile command center if anything out of the ordinary is detected. The software can analyze regular or infrared footage, and therefore works with video taken day or night.

The Lindbergh Foundation will be deploying the technology as part of operation Air Shepherd, which is aimed at protecting elephants and rhinos in Southern Africa from poachers. According to the Foundation, elephants and rhinos are at risk of becoming extinct in just 10 years if current poaching rates continue. The Foundation has logged 5,000 hours of drone flight time over the course of 4,000 missions to date.

The use of drones within business models is proving popular, with recent innovations including a drone painting system that created crowdfunded murals and two Swiss hospitals that used a drone to deliver lab samples between them….(More)”.

Big Data, Data Science, and Civil Rights


Paper by Solon Barocas, Elizabeth Bradley, Vasant Honavar, and Foster Provost:  “Advances in data analytics bring with them civil rights implications. Data-driven and algorithmic decision making increasingly determine how businesses target advertisements to consumers, how police departments monitor individuals or groups, how banks decide who gets a loan and who does not, how employers hire, how colleges and universities make admissions and financial aid decisions, and much more. As data-driven decisions increasingly affect every corner of our lives, there is an urgent need to ensure they do not become instruments of discrimination, barriers to equality, threats to social justice, and sources of unfairness. In this paper, we argue for a concrete research agenda aimed at addressing these concerns, comprising five areas of emphasis: (i) Determining if models and modeling procedures exhibit objectionable bias; (ii) Building awareness of fairness into machine learning methods; (iii) Improving the transparency and control of data- and model-driven decision making; (iv) Looking beyond the algorithm(s) for sources of bias and unfairness—in the myriad human decisions made during the problem formulation and modeling process; and (v) Supporting the cross-disciplinary scholarship necessary to do all of that well…(More)”.

Big Mind: How Collective Intelligence Can Change Our World


Book by Geoff Mulgan: “A new field of collective intelligence has emerged in the last few years, prompted by a wave of digital technologies that make it possible for organizations and societies to think at large scale. This “bigger mind”—human and machine capabilities working together—has the potential to solve the great challenges of our time. So why do smart technologies not automatically lead to smart results? Gathering insights from diverse fields, including philosophy, computer science, and biology, Big Mind reveals how collective intelligence can guide corporations, governments, universities, and societies to make the most of human brains and digital technologies.

Geoff Mulgan explores how collective intelligence has to be consciously organized and orchestrated in order to harness its powers. He looks at recent experiments mobilizing millions of people to solve problems, and at groundbreaking technology like Google Maps and Dove satellites. He also considers why organizations full of smart people and machines can make foolish mistakes—from investment banks losing billions to intelligence agencies misjudging geopolitical events—and shows how to avoid them.

Highlighting differences between environments that stimulate intelligence and those that blunt it, Mulgan shows how human and machine intelligence could solve challenges in business, climate change, democracy, and public health. But for that to happen we’ll need radically new professions, institutions, and ways of thinking.

Informed by the latest work on data, web platforms, and artificial intelligence, Big Mind shows how collective intelligence could help us survive and thrive….(More)”

Nobody Is Smarter or Faster Than Everybody


Rod Collins at Huffington Post: “One of the deepest beliefs of command-and-control management is the assumption that the smartest organization is the one with the smartest individuals. This belief is as old as scientific management itself. According to this way of thinking, just as there is a right way to perform every activity, there are right individuals who are essential for defining what are the right things and for making sure that things are done right. Thus, traditional organizations have long held that the key to the successful achievement of the corporation’s two basic accountabilities of strategy and execution is to hire the smartest individual managers and the brightest functional experts.

Command-and-control management assumes that intelligence fundamentally resides in a select number of star performers who are able to leverage their expertise across large groups of people through proper direction and effective control. Thus, the recruiting efforts and the promotional practices of most companies are focused on competing for and retaining the most talented people. While established management thinking holds that most individual workers are replaceable, this is not so for those star performers whose decision-making and problem-solving prowess are heroically revered. Traditional hierarchical organizations firmly believe in the myth of the individual hero. They are convinced that a single highly intelligent individual can make the difference between success and failure, whether that person is a key senior executive, a functional expert, or even a highly paid consultant.

However, in a rapidly changing world, it is becoming painfully obvious to harried executives that no single individual or even an elite cadre of star performers can adequately process the ever-evolving knowledge of fast-changing markets into operational excellence in real-time. Eric Teller, the CEO of Google X, has astutely recognized that we now live in a world where the pace of technological change exceeds the capacity for most individuals to absorb these changes in real time. If we can’t depend upon smart individuals to process change in time to respond to market developments, what options do business leaders have?

Nobody Is Smarter Than Everybody

If business executives want to build smart companies in a rapidly changing world, they will need to think differently and discover the most untapped resource in their organizations: the collective intelligence of their own people. Innovative organizations, such as Wikipedia and Google, have made this discovery and have leveraged the power of collective intelligence into powerful business models that have radically transformed their industries. The struggling online encyclopedia Nupedia rescued itself from oblivion when it serendipitously discovered an obscure application known as a wiki and transformed itself into Wikipedia by using the wiki platform to leverage the power of collective intelligence. In less than a decade, Wikipedia became the world’s most popular general reference resource. Google, which was a late entry into a crowded field of search engine upstarts, quickly garnered two-thirds of the search market by becoming the first engine to use the wisdom of crowds to rank web pages. These successful enterprises have uncovered the essential management wisdom for our times: Nobody is smarter or faster than everybody….

While smart individuals are important in any organization, it isn’t their unique intelligence that is paramount but rather their unique contributions to the overall intelligence of teams. That’s because the blending of the diverse perspectives of different types of intelligences is often the fastest path to the solution of complex problems, as we learned in the summer of 2011 when a diverse group of over 250,000 experts, non-experts, and unusual suspects in a scientific gaming community called Foldit, solved in ten days a biomolecular problem that had eluded the world’s best scientists for over ten years. This means a self-organized group that required no particular credentials for membership was 365 times more effective and efficient than the world’s most credentialed individual experts. Similarly, the non-credentialed contributors of Wikipedia were able to produce approximately 18,000 articles in its first year of operation compared to only 25 articles produced by academic experts in Nupedia’s first year. This means the wisdom of the crowd was 720 times more effective and efficient than the individual experts. These results are completely counterintuitive to everything that most of us have been taught about how intelligence works. However, as counterintuitive as this may seem, the preeminence of collective intelligence has suddenly become a practical reality thanks to the proliferation of digital technology over the last two decades.
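The two multipliers quoted above follow from simple ratios, which can be checked in a few lines. The sketch assumes a 365-day year and compares only elapsed time and article counts, as the article does; it says nothing about how many people were involved on either side.

```python
DAYS_PER_YEAR = 365  # assumption: leap days ignored, as the quoted figure implies

# Foldit: ten years of expert effort versus ten days for the gaming community.
foldit_ratio = (10 * DAYS_PER_YEAR) / 10

# Wikipedia versus Nupedia: articles produced in each project's first year.
wikipedia_ratio = 18_000 / 25

if __name__ == "__main__":
    print(f"Foldit speed-up: {foldit_ratio:.0f}x")
    print(f"Wikipedia output ratio: {wikipedia_ratio:.0f}x")
```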

As we move from the first wave of the digital revolution, which was sparked by connecting people via the Internet, to the second wave where everyone and everything will be hyper-connected in the emerging Internet of Things, our capacity to aggregate and leverage collective intelligence is likely to accelerate as practical applications of artificial intelligence become everyday realities….(More)”.

Slave to the Algorithm? Why a ‘Right to Explanation’ is Probably Not the Remedy You are Looking for


Paper by Lilian Edwards and Michael Veale: “Algorithms, particularly of the machine learning (ML) variety, are increasingly consequential to individuals’ lives but have caused a range of concerns evolving mainly around unfairness, discrimination and opacity. Transparency in the form of a “right to an explanation” has emerged as a compellingly attractive remedy since it intuitively presents as a means to “open the black box”, hence allowing individual challenge and redress, as well as possibilities to foster accountability of ML systems. In the general furore over algorithmic bias and other issues laid out in section 2, any remedy in a storm has looked attractive.

However, we argue that a right to an explanation in the GDPR is unlikely to be a complete remedy to algorithmic harms, particularly in some of the core “algorithmic war stories” that have shaped recent attitudes in this domain. We present several reasons for this conclusion. First (section 3), the law is restrictive on when any explanation-related right can be triggered, and in many places is unclear, or even seems paradoxical. Second (section 4), even were some of these restrictions to be navigated, the way that explanations are conceived of legally — as “meaningful information about the logic of processing” — is unlikely to be provided by the kind of ML “explanations” computer scientists have been developing. ML explanations are restricted both by the type of explanation sought, the multi-dimensionality of the domain and the type of user seeking an explanation. However (section 5) “subject-centric” explanations (SCEs), which restrict explanations to particular regions of a model around a query, show promise for interactive exploration, as do pedagogical rather than decompositional explanations in dodging developers’ worries of IP or trade secrets disclosure.

As an interim conclusion then, while convinced that recent research in ML explanations shows promise, we fear that the search for a “right to an explanation” in the GDPR may be at best distracting, and at worst nurture a new kind of “transparency fallacy”. However, in our final section, we argue that other parts of the GDPR related (i) to other individual rights including the right to erasure (“right to be forgotten”) and the right to data portability and (ii) to privacy by design, Data Protection Impact Assessments and certification and privacy seals, may have the seeds of building a better, more respectful and more user-friendly algorithmic society….(More)”

Could Big Data Help End Hunger in Africa?


Lenny Ruvaga at VOA News: “Computer algorithms power much of modern life from our Facebook feeds to international stock exchanges. Could they help end malnutrition and hunger in Africa? The International Center for Tropical Agriculture thinks so.

The International Center for Tropical Agriculture has spent the past four years developing the Nutrition Early Warning System, or NEWS.

The goal is to catch the subtle signs of a hunger crisis brewing in Africa as much as a year in advance.

CIAT says the system uses machine learning. As more information is fed into the system, the algorithms will get better at identifying patterns and trends. The system will get smarter.

Information Technology expert Andy Jarvis leads the project.

“The cutting edge side of this is really about bringing in streams of information from multiple sources and making sense of it. … But it is a huge volume of information and what it does, the novelty then, is making sense of that using things like artificial intelligence, machine learning, and condensing it into simple messages,” he said.

Other nutrition surveillance systems exist, like FEWSnet, the Famine Early Warning System Network which was created in the mid-1980s.

But CIAT says NEWS will be able to draw insights from a massive amount of diverse data enabling it to identify hunger risks faster than traditional methods.

“What is different about NEWS is that it pays attention to malnutrition, not just drought or famine, but the nutrition outcome that really matters, malnutrition especially in women and children. For the first time, we are saying these are the options way ahead of time. That gives policy makers an opportunity to really do what they intend to do which is make the lives of women and children better in Africa,” said Dr. Mercy Lung’aho, a CIAT nutrition expert.

While food emergencies like famine and drought grab headlines, the International Center for Tropical Agriculture says chronic malnutrition affects one in four people in Africa, taking a serious toll on economic growth and leaving them especially vulnerable in times of crisis….(More)”.

The Way Ahead


Transcript of lecture delivered by Stephen Fry on the 28th May  2017 • Hay Festival, Hay-on-Wye: “Peter Florence, the supremo of this great literary festival, asked me some months ago if I might, as part of Hay’s celebration of the five hundredth anniversary of Martin Luther’s kickstarting of the reformation, suggest a reform of the internet…

You will be relieved to know, that unlike Martin Luther, I do not have a full 95 theses to nail to the door, or in Hay’s case, to the tent flap. It might be worth reminding ourselves perhaps, however, of the great excitements of the early 16th century. I do not think it is a coincidence that Luther grew up as one of the very first generation to have access to printed books, much as some of you may have children who were the first to grow up with access to e-books, to iPads and to the internet….

The next big step for AI is the inevitable achievement of Artificial General Intelligence, or AGI, sometimes called ‘full artificial intelligence’, the point at which machines really do think like humans. In 2013, hundreds of experts were asked when they thought AGI may arise and the median prediction was the year 2040. After that the probability, most would say certain, is artificial super-intelligence and the possibility of reaching what is called the Technological Singularity – what computer pioneer John von Neumann described as the point “…beyond which human affairs, as we know them, could not continue.” I don’t think I have to worry about that. Plenty of you in this tent have cause to, and your children beyond question will certainly know all about it. Unless of course the climate causes such havoc that we reach a Meteorological Singularity. Or the nuclear codes are penetrated by a self-teaching algorithm whose only purpose is to find a way to launch…

It’s clear that, while it is hard to calculate the cascade upon cascade of new developments and their positive effects, we already know the dire consequences and frightening scenarios that threaten to engulf us. We know them because science fiction writers and dystopians in all media have got there before us and laid the nightmare visions out. Their imaginations have seen it all coming. So whether you believe Ray Bradbury, George Orwell, Aldous Huxley, Isaac Asimov, Margaret Atwood, Ridley Scott, Anthony Burgess, H. G. Wells, Stanley Kubrick, Kazuo Ishiguro, Philip K. Dick, William Gibson, John Wyndham, James Cameron, the Wachowskis or the scores and scores of other authors and film-makers who have painted scenarios of chaos and doom, you can certainly believe that a great transformation of human society is under way, greater than Gutenberg’s revolution – greater I would submit than the Industrial Revolution (though clearly dependent on it) – the greatest change to our ways of living since we moved from hunting and gathering to settling down in farms, villages and seaports and started to trade and form civilisations. Whether it will alter the behaviour, cognition and identity of the individual in the same way it is certain to alter the behaviour, cognition and identity of the group, well that is a hard question to answer.

But believe me when I say that it is happening. To be frank it has happened. The unimaginably colossal sums of money that have flowed to the first two generations of Silicon Valley pioneers have filled their coffers, their war chests, and they are all investing in autonomous cars, biotech, the IoT, robotics, Artificial Intelligence and their convergence. None more so than the outlier, the front-runner Mr Elon Musk whose neural link system is well worth your reading about online on the great waitbutwhy.com website. Its author Tim Urban is a paid consultant of Elon Musk’s so he has the advantage of knowing what he is writing about but the potential disadvantage of being parti pris and lacking in objectivity. Elon Musk made enough money from his part in the founding and running of PayPal to fund his manifold exploits. The Neuralink project joins his Tesla automobile company and subsidiary battery and solar power businesses, his SpaceX reusable spacecraft group, his OpenAI initiative and Hyperloop transport system. The 1950s and 60s Space Race was funded by sovereign governments, this race is funded by private equity, by the original investors in Google, Apple, Facebook and so on. Nation states and their agencies are not major players in this game, least of all poor old Britain. Even if our politicians were across this issue, and they absolutely are not, our votes would still be an irrelevance….

So one thesis I would have to nail up to the tent is to clamour for government to bring all this deeper into schools and colleges. The subject of the next technological wave, I mean, not pornography and prostitution. Get people working at the leading edge of AI and robotics to come into the classrooms. But more importantly listen to them – even if what they say is unpalatable, our masters must have the intellectual courage and honesty to say if they don’t understand and ask for repetition and clarification. This time, in other words, we mustn’t let the wave engulf us, we must ride its crest. It’s not quite too late to re-gear governmental and educational planning and thinking….

The witlessness of our leaders and of ourselves is indeed a problem. The real danger surely is not technology but technophobic Canute-ism, a belief that we can control, change or stem the technological tide instead of understanding that we need to learn how to harness it. Driving cars is dangerous, but we developed driving lesson requirements, traffic controls, seat-belts, maintenance protocols, proximity sensors, emission standards – all kinds of ways of mitigating the danger so as not to deny ourselves the life-changing benefits of motoring.

We understand why angry Ned Ludd destroyed the weaving machines that were threatening his occupation (Luddites were prophetic in their way, it was weaving machines that first used the punched cards on which computers relied right up to the 1970s). We understand too why French workers took their clogs, their sabots as they were called, and threw them into the machinery to jam it up, giving us the word sabotage. But we know that they were in the end, if you’ll pardon the phrase, pissing into the wind. No technology has ever been stopped.

So what is the thesis I am nailing up? Well, there is no authority for me to protest to, no equivalent of Pope Leo X for it to be delivered to, and I am certainly no Martin Luther. The only thesis I can think worth nailing up is absurdly simple. It is a cry as much from the heart as from the head and it is just one word – Prepare. We have an advantage over our hunter gatherer and farming ancestors, for whether it is Winter that is coming, or a new Spring, is entirely in our hands, so long as we prepare….(More)”.

Eliminating the Human


I suspect that we almost don’t notice this pattern because it’s hard to imagine what an alternative focus of tech development might be. Most of the news we get barraged with is about algorithms, AI, robots and self-driving cars, all of which fit this pattern, though there are indeed many technological innovations underway that have nothing to do with eliminating human interaction from our lives. CRISPR-Cas9 in genetics, new films that can efficiently and cheaply cool houses and quantum computing to name a few, but what we read about most and what touches us daily is the trajectory towards less human involvement. Note: I don’t consider chat rooms and product reviews as “human interaction”; they’re mediated and filtered by a screen.

I am not saying these developments are not efficient and convenient; this is not a judgement regarding the services and technology. I am simply noticing a pattern and wondering if that pattern means there are other possible roads we could be going down, and that the way we’re going is not in fact inevitable, but is (possibly unconsciously) chosen.

Here are some examples of tech that allows for less human interaction…

Lastly, “Social” media- social “interaction” that isn’t really social.

While the appearance on social networks is one of connection—as Facebook and others frequently claim—the fact is a lot of social media is a simulation of real social connection. As has been in evidence recently, social media actually increases divisions amongst us by amplifying echo effects and allowing us to live in cognitive bubbles. We are fed what we already like or what our similarly inclined friends like… or more likely now what someone has paid for us to see in an ad that mimics content. In this way, we actually become less connected except to those in our group….

Many transformative movements in the past succeeded based on leaders, agreed upon principles and organization. Although social media is a great tool for rallying people and bypassing government channels, it does not guarantee eventual success.

Social media is not really social—ticking boxes and having followers and getting feeds is NOT being social—it’s a screen simulation of human interaction. Human interaction is much more nuanced and complicated than what happens online. Engineers like things that are quantifiable. Smells, gestures, expression, tone of voice, etc. etc.—in short, all the various ways we communicate are VERY hard to quantify, and those are often how we tell if someone likes us or not….

To repeat what I wrote above—humans are capricious, erratic, emotional, irrational and biased in what sometimes seem like counterproductive ways. I’d argue that though those might seem like liabilities, many of those attributes actually work in our favor. Many of our emotional responses have evolved over millennia, and they are based on the probability that our responses, often prodded by an emotion, will more likely than not offer the best way to deal with a situation….

Our random accidents and odd behaviors are fun—they make life enjoyable. I’m wondering what we’re left with when there are fewer and fewer human interactions. Remove humans from the equation and we are less complete as people or as a society. “We” do not exist as isolated individuals—we as individuals are inhabitants of networks, we are relationships. That is how we prosper and thrive….(More)”.