Machine Learning, Big Data and the Regulation of Consumer Credit Markets: The Case of Algorithmic Credit Scoring


Paper by Nikita Aggarwal et al: “Recent advances in machine learning (ML) and Big Data techniques have facilitated the development of more sophisticated, automated consumer credit scoring models — a trend referred to as ‘algorithmic credit scoring’ in recognition of the increasing reliance on computer (particularly ML) algorithms for credit scoring. This chapter, which forms part of the 2018 collection of short essays ‘Autonomous Systems and the Law’, examines the rise of algorithmic credit scoring, and considers its implications for the regulation of consumer creditworthiness assessment and consumer credit markets more broadly.

The chapter argues that algorithmic credit scoring, and the Big Data and ML technologies underlying it, offer both benefits and risks for consumer credit markets. On the one hand, it could increase allocative efficiency and distributional fairness in these markets, by widening access to, and lowering the cost of, credit, particularly for ‘thin-file’ and ‘no-file’ consumers. On the other hand, algorithmic credit scoring could undermine distributional fairness and efficiency, by perpetuating discrimination in lending against certain groups and by enabling the more effective exploitation of borrowers.

The chapter considers how consumer financial regulation should respond to these risks, focusing on the UK/EU regulatory framework. As a general matter, it argues that the broadly principles- and conduct-based approach of UK consumer credit regulation provides the flexibility necessary for regulators and market participants to respond dynamically to these risks. However, this approach could be enhanced through the introduction of more robust product oversight and governance requirements for firms in relation to their use of ML systems and processes. Supervisory authorities could also themselves make greater use of ML and Big Data techniques in order to strengthen the supervision of consumer credit firms.

Finally, the chapter notes that cross-sectoral data protection regulation, recently updated in the EU under the GDPR, offers an important avenue to mitigate risks to consumers arising from the use of their personal data. However, further guidance is needed on the application and scope of this regime in the consumer financial context….(More)”.

The future is intelligent: Harnessing the potential of artificial intelligence in Africa


Youssef Travaly and Kevin Muvunyi at Brookings: “…AI in particular presents countless avenues for both the public and private sectors to optimize solutions to the most crucial problems facing the continent today, especially for struggling industries. For example, in health care, AI solutions can help scarce personnel and facilities do more with less by speeding initial processing, triage, diagnosis, and post-care follow up. Furthermore, AI-based pharmacogenomics applications, which focus on the likely response of an individual to therapeutic drugs based on certain genetic markers, can be used to tailor treatments. Considering the genetic diversity found on the African continent, it is highly likely that the application of these technologies in Africa will result in considerable advancement in medical treatment on a global level.

In agriculture, Abdoulaye Baniré Diallo, co-founder and chief scientific officer of the AI startup My Intelligent Machines, is working with advanced algorithms and machine learning methods to leverage genomic precision in livestock production models. With genomic precision, it is possible to build intelligent breeding programs that minimize the ecological footprint, address changing consumer demands, and contribute to the well-being of people and animals alike through the selection of good genetic characteristics at an early stage of the livestock production process. These are just a few examples that illustrate the transformative potential of AI technology in Africa.

However, a number of structural challenges undermine rapid adoption and implementation of AI on the continent. Inadequate basic and digital infrastructure seriously erodes efforts to activate AI-powered solutions as it reduces crucial connectivity. (For more on strategies to improve Africa’s digital infrastructure, see the viewpoint on page 67 of the full report). A lack of flexible and dynamic regulatory systems also frustrates the growth of a digital ecosystem that favors AI technology, especially as tech leaders want to scale across borders. Furthermore, lack of relevant technical skills, particularly for young people, is a growing threat. This skills gap means that those who would have otherwise been at the forefront of building AI are left out, preventing the continent from harnessing the full potential of transformative technologies and industries.

Similarly, the lack of adequate investments in research and development is an important obstacle. Africa must develop innovative financial instruments and public-private partnerships to fund human capital development, including a focus on industrial research and innovation hubs that bridge the gap between higher education institutions and the private sector to ensure the transition of AI products from lab to market….(More)”.

Data Democracy


Book by Feras Batarseh and Ruixin Yang: “Data Democracy: At the Nexus of Artificial Intelligence, Software Development, and Knowledge Engineering provides a manifesto to data democracy. After reading the chapters of this book, you are informed and suitably warned! You are already part of the data republic, and you (and all of us) need to ensure that our data fall in the right hands. Everything you click, buy, swipe, try, sell, drive, or fly is a data point. But who owns the data? At this point, not you! You do not even have access to most of it. The next best empire of our planet is one who owns and controls the world’s best dataset. If you consume or create data, if you are a citizen of the data republic (willingly or grudgingly), and if you are interested in making a decision or finding the truth through data-driven analysis, this book is for you. A group of experts, academics, data science researchers, and industry practitioners gathered to write this manifesto about data democracy.

Key Features

  • The future of the data republic, life within a data democracy, and our digital freedoms
  • An in-depth analysis of open science, open data, open source software, and their future challenges
  • A comprehensive review of data democracy’s implications within domains such as: healthcare, space exploration, earth sciences, business, and psychology
  • The democratization of Artificial Intelligence (AI), and data issues such as: Bias, imbalance, context, and knowledge extraction
  • A systematic review of AI methods applied to software engineering problems…(More)”.

Crossing the Digital Divide: Applying Technology to the Global Refugee Crisis


Report by Shelly Culbertson, James Dimarogonas, Katherine Costello, and Serafina Lanna: “In the past two decades, the global population of forcibly displaced people has more than doubled, from 34 million in 1997 to 71 million in 2018. Amid this growing crisis, refugees and the organizations that assist them have turned to technology as an important resource, and technology can and should play an important role in solving problems in humanitarian settings. In this report, the authors analyze technology uses, needs, and gaps, as well as opportunities for better using technology to help displaced people and improving the operations of responding agencies. The authors also examine inherent ethical, security, and privacy considerations; explore barriers to the successful deployment of technology; and outline some tools for building a more systematic approach to such deployment. The study approach included a literature review, semi-structured interviews with stakeholders, and focus groups with displaced people in Colombia, Greece, Jordan, and the United States. The authors provide several recommendations for more strategically using and developing technology in humanitarian settings….(More)”.

How people decide what they want to know


Tali Sharot & Cass R. Sunstein in Nature: “Immense amounts of information are now accessible to people, including information that bears on their past, present and future. An important research challenge is to determine how people decide to seek or avoid information. Here we propose a framework of information-seeking that aims to integrate the diverse motives that drive information-seeking and its avoidance. Our framework rests on the idea that information can alter people’s action, affect and cognition in both positive and negative ways. The suggestion is that people assess these influences and integrate them into a calculation of the value of information that leads to information-seeking or avoidance. The theory offers a framework for characterizing and quantifying individual differences in information-seeking, which we hypothesize may also be diagnostic of mental health. We consider biases that can lead to both insufficient and excessive information-seeking. We also discuss how the framework can help government agencies to assess the welfare effects of mandatory information disclosure….(More)”.

Technology Can't Fix Algorithmic Injustice


Annette Zimmermann, Elena Di Rosa and Hochan Kim at Boston Review: “A great deal of recent public debate about artificial intelligence has been driven by apocalyptic visions of the future. Humanity, we are told, is engaged in an existential struggle against its own creation. Such worries are fueled in large part by tech industry leaders and futurists, who anticipate systems so sophisticated that they can perform general tasks and operate autonomously, without human control. Stephen Hawking, Elon Musk, and Bill Gates have all publicly expressed their concerns about the advent of this kind of “strong” (or “general”) AI—and the associated existential risk that it may pose for humanity. In Hawking’s words, the development of strong AI “could spell the end of the human race.”

These are legitimate long-term worries. But they are not all we have to worry about, and placing them center stage distracts from ethical questions that AI is raising here and now. Some contend that strong AI may be only decades away, but this focus obscures the reality that “weak” (or “narrow”) AI is already reshaping existing social and political institutions. Algorithmic decision making and decision support systems are currently being deployed in many high-stakes domains, from criminal justice, law enforcement, and employment decisions to credit scoring, school assignment mechanisms, health care, and public benefits eligibility assessments. Never mind the far-off specter of doomsday; AI is already here, working behind the scenes of many of our social systems.

What responsibilities and obligations do we bear for AI’s social consequences in the present—not just in the distant future? To answer this question, we must resist the learned helplessness that has come to see AI development as inevitable. Instead, we should recognize that developing and deploying weak AI involves making consequential choices—choices that demand greater democratic oversight not just from AI developers and designers, but from all members of society….(More)”.

Data as infrastructure? A study of data sharing legal regimes


Paper by Charlotte Ducuing: “The article discusses the concept of infrastructure in the digital environment, through a study of three data sharing legal regimes: the Public Sector Information Directive (PSI Directive), the discussions on in-vehicle data governance and the freshly adopted data sharing legal regime in the Electricity Directive.

While aiming to contribute to the scholarship on data governance, the article deliberately focuses on network industries. Characterised by the existence of physical infrastructure, they have a special relationship to digitisation and ‘platformisation’ and are exposed to specific risks. Adopting an explanatory methodology, the article exposes that these regimes are based on two close but different sources of inspiration, yet intertwined and left unclear. By targeting entities deemed ‘monopolist’ with regard to the data they create and hold, data sharing obligations are inspired from competition law and especially the essential facility doctrine. On the other hand, beneficiaries appear to include both operators in related markets needing data to conduct their business (except for the PSI Directive), and third parties at large to foster innovation. The latter rationale illustrates what is called here a purposive view of data as infrastructure. The underlying understanding of ‘raw’ data (management) as infrastructure for all to use may run counter the ability for the regulated entities to get a fair remuneration for ‘their’ data.

Finally, the article pleads for more granularity when mandating data sharing obligations depending upon the purpose. Shifting away from a ‘one-size-fits-all’ solution, the regulation of data could also extend to the ensuing context-specific data governance regime, subject to further research…(More)”.

Paging Dr. Google: How the Tech Giant Is Laying Claim to Health Data


Wall Street Journal: “Roughly a year ago, Google offered health-data company Cerner Corp. an unusually rich proposal.

Cerner was interviewing Silicon Valley giants to pick a storage provider for 250 million health records, one of the largest collections of U.S. patient data. Google dispatched former chief executive Eric Schmidt to personally pitch Cerner over several phone calls and offered around $250 million in discounts and incentives, people familiar with the matter say. 

Google had a bigger goal in pushing for the deal than dollars and cents: a way to expand its effort to collect, analyze and aggregate health data on millions of Americans. Google representatives were vague in answering questions about how Cerner’s data would be used, making the health-care company’s executives wary, the people say. Eventually, Cerner struck a storage deal with Amazon.com Inc. instead.

The failed Cerner deal reveals an emerging challenge to Google’s move into health care: gaining the trust of health care partners and the public. So far, that has hardly slowed the search giant.

Google has struck partnerships with some of the country’s largest hospital systems and most-renowned health-care providers, many of them vast in scope and few of their details previously reported. In just a few years, the company has achieved the ability to view or analyze tens of millions of patient health records in at least three-quarters of U.S. states, according to a Wall Street Journal analysis of contractual agreements. 

In certain instances, the deals allow Google to access personally identifiable health information without the knowledge of patients or doctors. The company can review complete health records, including names, dates of birth, medications and other ailments, according to people familiar with the deals.

The prospect of tech giants’ amassing huge troves of health records has raised concerns among lawmakers, patients and doctors, who fear such intimate data could be used without individuals’ knowledge or permission, or in ways they might not anticipate. 

Google is developing a search tool, similar to its flagship search engine, in which patient information is stored, collated and analyzed by the company’s engineers, on its own servers. The portal is designed for use by doctors and nurses, and eventually perhaps patients themselves, though some Google staffers would have access sooner. 

Google executives and some health systems say that detailed data sharing has the potential to improve health outcomes. Large troves of data help fuel algorithms Google is creating to detect lung cancer, eye disease and kidney injuries. Hospital executives have long sought better electronic record systems to reduce error rates and cut down on paperwork….

Legally, the information gathered by Google can be used for purposes beyond diagnosing illnesses, under laws enacted during the dial-up era. U.S. federal privacy laws make it possible for health-care providers, with little or no input from patients, to share data with certain outside companies. That applies to partners, like Google, with significant presences outside health care. The company says its intentions in health are unconnected with its advertising business, which depends largely on data it has collected on users of its many services, including email and maps.

Medical information is perhaps the last bounty of personal data yet to be scooped up by technology companies. The health data-gathering efforts of other tech giants such as Amazon and International Business Machines Corp. face skepticism from physician and patient advocates. But Google’s push in particular has set off alarm bells in the industry, including over privacy concerns. U.S. senators, as well as health-industry executives, are questioning Google’s expansion and its potential for commercializing personal data….(More)”.

On Digital Disinformation and Democratic Myths


David Karpf at MediaWell: “…How many votes did Cambridge Analytica affect in the 2016 presidential election? How much of a difference did the company actually make?

Cambridge Analytica has become something of a Rorschach test among those who pay attention to digital disinformation and microtargeted propaganda. Some hail the company as a digital Svengali, harnessing the power of big data to reshape the behavior of the American electorate. Others suggest the company was peddling digital snake oil, with outlandish marketing claims that bore little resemblance to their mundane product.

One thing is certain: the company has become a household name, practically synonymous with disinformation and digital propaganda in the aftermath of the 2016 election. It has claimed credit for the surprising success of the Brexit referendum and for the Trump digital strategy. Journalists such as Carole Cadwalladr and Hannes Grassegger and Mikael Krogerus have published longform articles that dive into the “psychographic” breakthroughs that the company claims to have made. Cadwalladr also exposed the links between the company and a network of influential conservative donors and political operatives. Whistleblower Chris Wylie, who worked for a time as the company’s head of research, further detailed how it obtained a massive trove of Facebook data on tens of millions of American citizens, in violation of Facebook’s terms of service. The Cambridge Analytica scandal has been a driving force in the current “techlash,” and has been the topic of congressional hearings, documentaries, mass-market books, and scholarly articles.

The reasons for concern are numerous. The company’s own marketing materials boasted about radical breakthroughs in psychographic targeting—developing psychological profiles of every US voter so that political campaigns could tailor messages to exploit psychological vulnerabilities. Those marketing claims were paired with disturbing revelations about the company violating Facebook’s terms of service to scrape tens of millions of user profiles, which were then compiled into a broader database of US voters. Cambridge Analytica behaved unethically. It either broke a lot of laws or demonstrated that old laws needed updating. When the company shut down, no one seemed to shed a tear.

But what is less clear is just how different Cambridge Analytica’s product actually was from the type of microtargeted digital advertisements that every other US electoral campaign uses. Many of the most prominent researchers warning the public about how Cambridge Analytica uses our digital exhaust to “hack our brains” are marketing professors, more accustomed to studying the impact of advertising in commerce than in elections. The political science research community has been far more skeptical. An investigation from Nature magazine documented that the evidence of Cambridge Analytica’s independent impact on voter behavior is basically nonexistent (Gibney 2018). There is no evidence that psychographic targeting actually works at the scale of the American electorate, and there is also no evidence that Cambridge Analytica in fact deployed psychographic models while working for the Trump campaign. The company clearly broke Facebook’s terms of service in acquiring its massive Facebook dataset. But it is not clear that the massive dataset made much of a difference.

At issue in the Cambridge Analytica case are two baseline assumptions about political persuasion in elections. First, what should be our point of comparison for digital propaganda in elections? Second, how does political persuasion in elections compare to persuasion in commercial arenas and marketing in general?…(More)”.

Navigation Apps Changed the Politics of Traffic


Essay by Laura Bliss: “There might not be much “weather” to speak of in Los Angeles, but there is traffic. It’s the de facto small talk upon arrival at meetings or cocktail parties, comparing journeys through the proverbial storm. And in certain ways, traffic does resemble the daily expressions of climate. It follows diurnal and seasonal patterns; it shapes, and is shaped, by local conditions. There are unexpected downpours: accidents, parades, sports events, concerts.

Once upon a time, if you were really savvy, you could steer around the thunderheads—that is, evade congestion almost entirely.

Now, everyone can do that, thanks to navigation apps like Waze, which was launched in 2009 by a startup based in suburban Tel Aviv with the aspiration to save drivers five minutes on every trip by outsmarting traffic jams. Ten years later, the navigation app’s current motto is to “eliminate traffic”—to untie the knots of urban congestion once and for all. Like Google Maps, Apple Maps, Inrix, and other smartphone-based navigation tools, its routing algorithm weaves user locations with other sources of traffic data, quickly identifying the fastest routes available at any given moment.

Waze often describes itself in terms of the social goods it promotes. It likes to highlight the dedication of its active participants, who pay it forward to less-informed drivers behind them, as well as its willingness to share incident reports with city governments so that, for example, traffic engineers can rejigger stop lights or crack down on double parking. “Over the last 10 years, we’ve operated from a sense of civic responsibility within our means,” wrote Waze’s CEO and founder Noam Bardin in April 2018.

But Waze is a business, not a government agency. The goal is to be an indispensable service for its customers, and to profit from that. And it isn’t clear that those objectives align with a solution for urban congestion as a whole. This gets to the heart of the problem with any navigation app—or, for that matter, any traffic fix that prioritizes the needs of independent drivers over what’s best for the broader system. Managing traffic requires us to work together. Apps tap into our selfish desires….(More)”.

This essay is adapted from SOM Thinkers: The Future of Transportation, published by Metropolis Books.