How Satellite Data and Artificial Intelligence could help us understand poverty better


Maya Craig at Fast Company: “Governments and development organizations currently measure poverty levels by conducting door-to-door surveys. The new partnership will test the use of AI to supplement these surveys and increase the accuracy of poverty data. Orbital said its AI software will analyze satellite images to see if characteristics such as building height and rooftop material can effectively indicate wealth.

The pilot study will be conducted in Sri Lanka. If successful, the World Bank hopes to scale it worldwide. A recent study conducted by the organization found that more than 50 countries lack legitimate poverty estimates, which limits the ability of the development community to support the world’s poorest populations.

“Data depravation is a serious issue, especially in many of the countries where we need it most,” says David Newhouse, senior economist at the World Bank. “This technology has the potential to help us get that data more frequently and at a finer level of detail than is currently possible.”

The announcement is the latest in an emerging industry of AI analysis of satellite photos. A growing number of investors and entrepreneurs are betting that the convergence of these fields will have far-reaching impacts on business, policy, resource management and disaster response.

Wall Street’s biggest hedge-fund businesses have begun using the technology to improve investment strategies. The Pew Charitable Trust employs the method to monitor oceans for illegal fishing activities. And startups like San Francisco-based Mavrx use similar analytics to optimize crop harvest.

The commercial earth-imaging satellite market, valued at $2.7 billion in 2014, is predicted to grow by 14% each year through the decade, according to a recent report.

As recently as two years ago, there were just four commercial earth imaging satellites operated in the U.S., and government contracts accounted for about 70% of imagery sales. By 2020, there will be hundreds of private-sector “smallsats” in orbit capturing imagery that will be easily accessible online. Companies like Skybox Imaging and Planet Labs have the first of these smallsats already active, with plans for more.

The images generated by these companies will be among the world’s largest data sets. And recent breakthroughs in AI research have made it possible to analyze these images to inform decision-making…(More)”

The importance of human innovation in A.I. ethics


John C. Havens at Mashable: “….While welcoming the feedback that sensors, data and Artificial Intelligence provide, we’re at a critical inflection point. Demarcating the parameters between assistance and automation has never been more central to human well-being. But today, beauty is in the AI of the beholder. Desensitized to the value of personal data, we hemorrhage precious insights regarding our identity that define the moral nuances necessary to navigate algorithmic modernity.

If no values-based standards exist for Artificial Intelligence, then the biases of its manufacturers will define our universal code of human ethics. But this should not be their cross to bear alone. It’s time to stop vilifying the AI community and start defining in concert with their creations what the good life means surrounding our consciousness and code.

The intention of the ethics

“Begin as you mean to go forward.” Michael Stewart is founder, chairman & CEO of Lucid, an Artificial Intelligence company based in Austin that recently announced the formation of the industry’s first Ethics Advisory Panel (EAP). While Google claimed creation of a similar board when acquiring AI firm DeepMind in January 2014, no public realization of its efforts currently exist (as confirmed by a PR rep from Google for this piece). Lucid’s Panel, by comparison, has already begun functioning as a separate organization from the analytics side of the business and provides oversight for the company and its customers. “Our efforts,” Stewart says, “are guided by the principle that our ethics group is obsessed with making sure the impact of our technology is good.”

Kay Firth-Butterfield is chief officer of the EAP, and is charged with being on the vanguard of the ethical issues affecting the AI industry and society as a whole. Internally, the EAP provides the hub of ethical behavior for the company. Someone from Firth-Butterfield’s office even sits on all core product development teams. “Externally,” she notes, “we plan to apply Cyc intelligence (shorthand for ‘encyclopedia,’ Lucid’s AI causal reasoning platform) for research to demonstrate the benefits of AI and to advise Lucid’s leadership on key decisions, such as the recent signing of the LAWS letter and the end use of customer applications.”

Ensuring the impact of AI technology is positive doesn’t happen by default. But as Lucid is demonstrating, ethics doesn’t have to stymie innovation by dwelling solely in the realm of risk mitigation. Ethical processes aligning with a company’s core values can provide more deeply relevant products and increased public trust. Transparently including your customer’s values in these processes puts the person back into personalization….(Mashable)”

Machines of Loving Grace: The Quest for Common Ground Between Humans and Robots


Book description: “Robots are poised to transform today’s society as completely as the Internet did twenty years ago. Pulitzer prize-winning New York Times science writer John Markoff argues that we must decide to design ourselves into our future, or risk being excluded from it altogether.

In the past decade, Google introduced us to driverless cars; Apple debuted Siri, a personal assistant that we keep in our pockets; and an Internet of Things connected the smaller tasks of everyday life to the farthest reaches of the Web. Robots have become an integral part of society on the battlefield and the road; in business, education, and health care. Cheap sensors and powerful computers will ensure that in the coming years, these robots will act on their own. This new era offers the promise of immensely powerful machines, but it also reframes a question first raised more than half a century ago, when the intelligent machine was born. Will we control these systems, or will they control us?

In Machines of Loving Grace, John Markoff offers a sweeping history of the complicated and evolving relationship between humans and computers. In recent years, the pace of technological change has accelerated dramatically, posing an ethical quandary. If humans delegate decisions to machines, who will be responsible for the consequences? As Markoff chronicles the history of automation, from the birth of the artificial intelligence and intelligence augmentation communities in the 1950s and 1960s, to the modern-day brain trusts at Google and Apple in Silicon Valley, and on to the expanding robotics economy around Boston, he traces the different ways developers have addressed this fundamental problem and urges them to carefully consider the consequences of their work. We are on the brink of the next stage of the computer revolution, Markoff argues, and robots will profoundly transform modern life. Yet it remains for us to determine whether this new world will be a utopia. Moreover, it is now incumbent upon the designers of these robots to draw a bright line between what is human and what is machine.

After nearly forty years covering the tech industry, Markoff offers an unmatched perspective on the most drastic technology-driven societal shifts since the introduction of the Internet. Machines of Loving Grace draws on an extensive array of research and interviews to present an eye-opening history of one of the most pressing questions of our time, and urges us to remember that we still have the opportunity to design ourselves into the future—before it’s too late….(More)”

How Africa can benefit from the data revolution


 in The Guardian: “….The modern information infrastructure is about movement of data. From data we derive information and knowledge, and that knowledge can be propagated rapidly across the country and throughout the world. Facebook and Google have both made massive investments in machine learning, the mainstay technology for converting data into knowledge. But the potential for these technologies in Africa is much larger: instead of simply advertising products to people, we can imagine modern distributed health systems, distributed markets, knowledge systems for disease intervention. The modern infrastructure should be data driven and deployed across the mobile network. A single good idea can then be rapidly implemented and distributed via the mobile phone app ecosystems.

The information infrastructure does not require large scale thinking and investment to deliver. In fact, it requires just the reverse. It requires agility and innovation. Larger companies cannot react quickly enough to exploit technological advances. Small companies with a good idea can grow quickly. From IBM to Microsoft, Google and now Facebook. All these companies now agree on one thing: data is where the value lies. Modern internet companies are data-driven from the ground up. Could the same thing happen in Africa’s economies? Can entire countries reformulate their infrastructures to be data-driven from the ground up?

Maybe, or maybe not, but it isn’t necessary to have a grand plan to give it a go. It is already natural to use data and communication to solve real world problems. In Silicon Valley these are the challenges of getting a taxi or reserving a restaurant. In Africa they are often more fundamental. John Quinn has been in Kampala, Uganda at Makerere University for eight years now targeting these challenges. In June this year, John and other researchers from across the region came together for Africa’s first workshop on data science at Dedan Kimathi University of Technology. The objective was to spread knowledge of technologies, ideas and solutions. For the modern information infrastructure to be successful software solutions need to be locally generated. African apps to solve African problems. With this in mind the workshop began with a three day summer school on data science which was then followed by two days of talks on challenges in African data science.

The ideas and solutions presented were cutting edge. The Umati project uses social media to understand the use of ethnic hate speech in Kenya (Sidney Ochieng, iHub, Nairobi). The use of social media for monitoring the evolution and effects of Ebola in west Africa (Nuri Pashwani, IBM Research Africa). The Kudusystem for market making in Ugandan farm produce distribution via SMS messages (Kenneth Bwire, Makerere University, Kampala). Telecommunications data for inferring the source and spread of a typhoid outbreak in Kampala (UN Pulse Lab, Kampala). The Punya system for prototyping and deployment of mobile phone apps to deal with emerging crises or market opportunities (Julius Adebayor, MIT) and large scale systems for collating and sharing data resources Open Data Kenya and UN OCHA Human Data Exchange….(More)”

The Future of the Professions: How Technology Will Transform the Work of Human Experts


New book by Richard Susskind and Daniel Susskind: “This book predicts the decline of today’s professions and describes the people and systems that will replace them. In an Internet society, according to Richard Susskind and Daniel Susskind, we will neither need nor want doctors, teachers, accountants, architects, the clergy, consultants, lawyers, and many others, to work as they did in the 20th century.

The Future of the Professions explains how ‘increasingly capable systems’ – from telepresence to artificial intelligence – will bring fundamental change in the way that the ‘practical expertise’ of specialists is made available in society.

The authors challenge the ‘grand bargain’ – the arrangement that grants various monopolies to today’s professionals. They argue that our current professions are antiquated, opaque and no longer affordable, and that the expertise of the best is enjoyed only by a few. In their place, they propose six new models for producing and distributing expertise in society.

The book raises important practical and moral questions. In an era when machines can out-perform human beings at most tasks, what are the prospects for employment, who should own and control online expertise, and what tasks should be reserved exclusively for people?

Based on the authors’ in-depth research of more than ten professions, and illustrated by numerous examples from each, this is the first book to assess and question the relevance of the professions in the 21st century. (Chapter 1)”

Algorithms and Bias


Q. and A. With Cynthia Dwork in the New York Times: “Algorithms have become one of the most powerful arbiters in our lives. They make decisions about the news we read, the jobs we get, the people we meet, the schools we attend and the ads we see.

Yet there is growing evidence that algorithms and other types of software can discriminate. The people who write them incorporate their biases, and algorithms often learn from human behavior, so they reflect the biases we hold. For instance, research has shown that ad-targeting algorithms have shown ads for high-paying jobs to men but not women, and ads for high-interest loans to people in low-income neighborhoods.

Cynthia Dwork, a computer scientist at Microsoft Research in Silicon Valley, is one of the leading thinkers on these issues. In an Upshot interview, which has been edited, she discussed how algorithms learn to discriminate, who’s responsible when they do, and the trade-offs between fairness and privacy.

Q: Some people have argued that algorithms eliminate discriminationbecause they make decisions based on data, free of human bias. Others say algorithms reflect and perpetuate human biases. What do you think?

A: Algorithms do not automatically eliminate bias. Suppose a university, with admission and rejection records dating back for decades and faced with growing numbers of applicants, decides to use a machine learning algorithm that, using the historical records, identifies candidates who are more likely to be admitted. Historical biases in the training data will be learned by the algorithm, and past discrimination will lead to future discrimination.

Q: Are there examples of that happening?

A: A famous example of a system that has wrestled with bias is the resident matching program that matches graduating medical students with residency programs at hospitals. The matching could be slanted to maximize the happiness of the residency programs, or to maximize the happiness of the medical students. Prior to 1997, the match was mostly about the happiness of the programs.

This changed in 1997 in response to “a crisis of confidence concerning whether the matching algorithm was unreasonably favorable to employers at the expense of applicants, and whether applicants could ‘game the system,’ ” according to a paper by Alvin Roth and Elliott Peranson published in The American Economic Review.

Q: You have studied both privacy and algorithm design, and co-wrote a paper, “Fairness Through Awareness,” that came to some surprising conclusions about discriminatory algorithms and people’s privacy. Could you summarize those?

A: “Fairness Through Awareness” makes the observation that sometimes, in order to be fair, it is important to make use of sensitive information while carrying out the classification task. This may be a little counterintuitive: The instinct might be to hide information that could be the basis of discrimination….

Q: The law protects certain groups from discrimination. Is it possible to teach an algorithm to do the same?

A: This is a relatively new problem area in computer science, and there are grounds for optimism — for example, resources from the Fairness, Accountability and Transparency in Machine Learning workshop, which considers the role that machines play in consequential decisions in areas like employment, health care and policing. This is an exciting and valuable area for research. …(More)”

A Visual Introduction to Machine Learning


R2D3 introduction: “In machine learning, computers apply statistical learning techniques to automatically identify patterns in data. These techniques can be used to make highly accurate predictions.

Keep scrolling. Using a data set about homes, we will create a machine learning model to distinguish homes in New York from homes in San Francisco…./

 

  1. Machine learning identifies patterns using statistical learning and computers by unearthing boundaries in data sets. You can use it to make predictions.
  2. One method for making predictions is called a decision trees, which uses a series of if-then statements to identify boundaries and define patterns in the data
  3. Overfitting happens when some boundaries are based on on distinctions that don’t make a difference. You can see if a model overfits by having test data flow through the model….(More)”

AI tool turns complicated legal contracts into simple visual charts


Springwise: “We have seen a host of work related apps that aim to make tedious office tasks more approachable — there is a plugin that can find files without knowing the title, and a tracking tool which analyzes competitors online strategies. Joining this is Beagle, an intelligent contract analysis tool which provides users with a graphical summary of lengthy documents in seconds. It is a time-saving tool which translates complicated documents from elusive legal language into comprehensive visual summaries.

The Beagle system is powered by self-learning artificial intelligence which learns the client’s preferences and adapts accordingly. Users begin by dropping in a file into the app. The AI — trained by lawyers and NLP experts — then converts the information into a single page document. It processes the contract at a rate of one page per 0.05 seconds and highlights key information, displaying it in easy to read graphs and charts. The system also comes with built-in collaboration tools so multiple users can edit and export the files….(More)”

Using Twitter as a data source: An overview of current social media research tools


Wasim Ahmed at the LSE Impact Blog: “I have a social media research blog where I find and write about tools that can be used to capture and analyse data from social media platforms. My PhD looks at Twitter data for health, such as the Ebola outbreak in West Africa. I am increasingly asked why I am looking at Twitter, and what tools and methods there are of capturing and analysing data from other platforms such as Facebook, or even less traditional platforms such as Amazon book reviews. Brainstorming a couple of responses to this question by talking to members of the New Social Media New Social Science network, there are at least six reasons:

  1. Twitter is a popular platform in terms of the media attention it receives and it therefore attracts more research due to its cultural status
  2. Twitter makes it easier to find and follow conversations (i.e., by both its search feature and by tweets appearing in Google search results)
  3. Twitter has hashtag norms which make it easier gathering, sorting, and expanding searches when collecting data
  4. Twitter data is easy to retrieve as major incidents, news stories and events on Twitter are tend to be centred around a hashtag
  5. The Twitter API is more open and accessible compared to other social media platforms, which makes Twitter more favourable to developers creating tools to access data. This consequently increases the availability of tools to researchers.
  6. Many researchers themselves are using Twitter and because of their favourable personal experiences, they feel more comfortable with researching a familiar platform.

It is probable that a combination of response 1 to 6 have led to more research on Twitter. However, this raises another distinct but closely related question: when research is focused so heavily on Twitter, what (if any) are the implications of this on our methods?

As for the methods that are currently used in analysing Twitter data i.e., sentiment analysis, time series analysis (examining peaks in tweets), network analysis etc., can these be applied to other platforms or are different tools, methods and techniques required? In addition to qualitative methods such as content analysis, I have used the following four methods in analysing Twitter data for the purposes of my PhD, below I consider whether these would work for other social media platforms:

  1. Sentiment analysis works well with Twitter data, as tweets are consistent in length (i.e., <= 140) would sentiment analysis work well with, for example Facebook data where posts may be longer?
  2. Time series analysis is normally used when examining tweets overtime to see when a peak of tweets may occur, would examining time stamps in Facebook posts, or Instagram posts, for example, produce the same results? Or is this only a viable method because of the real-time nature of Twitter data?
  3. Network analysis is used to visualize the connections between people and to better understand the structure of the conversation. Would this work as well on other platforms whereby users may not be connected to each other i.e., public Facebook pages?
  4. Machine learning methods may work well with Twitter data due to the length of tweets (i.e., <= 140) but would these work for longer posts and for platforms that are not text based i.e., Instagram?

It may well be that at least some of these methods can be applied to other platforms, however they may not be the best methods, and may require the formulation of new methods, techniques, and tools.

So, what are some of the tools available to social scientists for social media data? In the table below I provide an overview of some the tools I have been using (which require no programming knowledge and can be used by social scientists):…(More)”

The Data Revolution


Review of Rob Kitchin’s The Data Revolution: Big Data, Open Data, Data Infrastructures & their Consequences by David Moats in Theory, Culture and Society: “…As an industry, academia is not immune to cycles of hype and fashion. Terms like ‘postmodernism’, ‘globalisation’, and ‘new media’ have each had their turn filling the top line of funding proposals. Although they are each grounded in tangible shifts, these terms become stretched and fudged to the point of becoming almost meaningless. Yet, they elicit strong, polarised reactions. For at least the past few years, ‘big data’ seems to be the buzzword, which elicits funding, as well as the ire of many in the social sciences and humanities.

Rob Kitchin’s book The Data Revolution is one of the first systematic attempts to strip back the hype surrounding our current data deluge and take stock of what is really going on. This is crucial because this hype is underpinned by very real societal change, threats to personal privacy and shifts in store for research methods. The book acts as a helpful wayfinding device in an unfamiliar terrain, which is still being reshaped, and is admirably written in a language relevant to social scientists, comprehensible to policy makers and accessible even to the less tech savvy among us.

The Data Revolution seems to present itself as the definitive account of this phenomena but in filling this role ends up adopting a somewhat diplomatic posture. Kitchin takes all the correct and reasonable stances on the matter and advocates all the right courses of action but he is not able to, in the context of this book, pursue these propositions fully. This review will attempt to tease out some of these latent potentials and how they might be pushed in future work, in particular the implications of the ‘performative’ character of both big data narratives and data infrastructures for social science research.

Kitchin’s book starts with the observation that ‘data’ is a misnomer – etymologically data should refer to phenomena in the world which can be abstracted, measured etc. as opposed to the representations and measurements themselves, which should by all rights be called ‘capta’. This is ironic because the worst offenders in what Kitchin calls “data boosterism” seem to conflate data with ‘reality’, unmooring data from its conditions of production and making relationship between the two given or natural.

As Kitchin notes, following Bowker (2005), ‘raw data’ is an oxymoron: data are not so much mined as produced and are necessarily framed technically, ethically, temporally, spatially and philosophically. This is the central thesis of the book, that data and data infrastructures are not neutral and technical but also social and political phenomena. For those at the critical end of research with data, this is a starting assumption, but one which not enough practitioners heed. Most of the book is thus an attempt to flesh out these rapidly expanding data infrastructures and their politics….

Kitchin is at his best when revealing the gap between the narratives and the reality of data analysis such as the fallacy of empiricism – the assertion that, given the granularity and completeness of big data sets and the availability of machine learning algorithms which identify patterns within data (with or without the supervision of human coders), data can “speak for themselves”. Kitchin reminds us that no data set is complete and even these out-of-the-box algorithms are underpinned by theories and assumptions in their creation, and require context specific knowledge to unpack their findings. Kitchin also rightly raises concerns about the limits of big data, that access and interoperability of data is not given and that these gaps and silences are also patterned (Twitter is biased as a sample towards middle class, white, tech savy people). Yet, this language of veracity and reliability seems to suggest that big data is being conceptualised in relation to traditional surveys, or that our population is still the nation state, when big data could helpfully force us to reimagine our analytic objects and truth conditions and more pressingly, our ethics (Rieder, 2013).

However, performativity may again complicate things. As Kitchin observes, supermarket loyalty cards do not just create data about shopping, they encourage particular sorts of shopping; when research subjects change their behaviour to cater to the metrics and surveillance apparatuses built into platforms like Facebook (Bucher, 2012), then these are no longer just data points representing the social, but partially constitutive of new forms of sociality (this is also true of other types of data as discussed by Savage (2010), but in perhaps less obvious ways). This might have implications for how we interpret data, the distribution between quantitative and qualitative approaches (Latour et al., 2012) or even more radical experiments (Wilkie et al., 2014). Kitchin is relatively cautious about proposing these sorts of possibilities, which is not the remit of the book, though it clearly leaves the door open…(More)”