How AI-Driven Insurance Could Reduce Gun Violence


Jason Pontin at WIRED: “As a political issue, guns have become part of America’s endless, arid culture wars, where Red and Blue tribes skirmish for political and cultural advantage. But what if there were a compromise? Economics and machine learning suggest an answer, potentially acceptable to Americans in both camps.

Economists sometimes talk about “negative externalities,” market failures where the full costs of transactions are borne by third parties. Pollution is an externality, because society bears the costs of environmental degradation. The 20th-century British economist Arthur Pigou, who formally described externalities, also proposed their solution: so-called “Pigovian taxes,” where governments charge producers or customers, reducing the quantity of the offending products and sometimes paying for ameliorative measures. Pigovian taxes have been used to fight cigarette smoking or improve air quality, and are the favorite prescription of economists for reducing greenhouse gases. But they don’t work perfectly, because it’s hard for governments to estimate the costs of externalities.

Gun violence is a negative externality too. The choices of millions of Americans to buy guns overflow into uncaptured costs for society in the form of crimes, suicides, murders, and mass shootings. A flat gun tax would be a blunt instrument: It could only reduce gun violence by raising the costs of gun ownership so high that almost no one could legally own a gun, which would swell the black market for guns and probably increase crime. But insurers are very good at estimating the risks and liabilities of individual choices; insurance could capture the externalities of gun violence in a smarter, more responsive fashion.

Here’s the proposed compromise: States should require gun owners to be licensed and pay insurance, just as car owners must be licensed and insured today….

The actuaries who research risk have always considered a wide variety of factors when helping insurers price the cost of a policy. Car, home, and life insurance can vary according to a policy holder’s age, health, criminal record, employment, residence, and many other variables. But in recent years, machine learning and data analytics have provided actuaries with new predictive powers. According to Yann LeCun, the director of artificial intelligence at Facebook and the primary inventor of an important technique in deep learning called convolution, “Deep learning systems provide better statistical models with enough data. They can be advantageously applied to risk evaluation, and convolutional neural nets can be very good at prediction, because they can take into account a long window of past values.”

State Farm, Liberty Mutual, Allstate, and Progressive Insurance have all used algorithms to improve their predictive analysis and to more accurately distribute risk among their policy holders. For instance, in late 2015, Progressive created a telematics app called Snapshot that individual drivers used to collect information on their driving. In the subsequent two years, 14 billion miles of driving data were collected all over the country and analyzed on Progressive’s machine learning platform, H2O.ai, resulting in discounts of $600 million for their policy holders. On average, machine learning produced a $130 discount for Progressive customers.

When the financial writer John Wasik popularized gun insurance in a series of posts in Forbes in 2012 and 2013, the NRA’s argument about prior constraints was a reasonable objection. Wasik proposed charging different rates to different types of gun owners, but there were too many factors that would have to be tracked over too long a period to drive down costs for low-risk policy holders. Today, using deep learning, the idea is more practical: Insurers could measure the interaction of dozens or hundreds of factors, predicting the risks of gun ownership and controlling costs for low-risk gun owners. Other, more risky bets might pay more. Some very risky would-be gun owners might be unable to find insurance at all. Gun insurance could even be dynamically priced, changing as the conditions of the policy holders’ lives altered, and the gun owners proved themselves better or worse risks.
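
The dynamic, multi-factor pricing described above can be sketched in a few lines. The factors, weights, base premium, and floor below are entirely hypothetical illustrations, not actuarial figures:

```python
# Hypothetical sketch of risk-based premium pricing.
# All factor names, weights, and dollar amounts are invented for illustration.

BASE_PREMIUM = 120.0  # hypothetical annual base rate, in dollars

# Hypothetical risk weights a model might learn from claims data.
RISK_WEIGHTS = {
    "prior_violent_offense": 3.0,
    "completed_safety_course": -0.4,
    "secure_storage": -0.3,
    "years_without_incident": -0.05,  # applied per year
}

def annual_premium(profile: dict) -> float:
    """Scale the base premium by a multiplicative risk score."""
    score = 1.0
    for factor, weight in RISK_WEIGHTS.items():
        score += weight * profile.get(factor, 0)
    return BASE_PREMIUM * max(score, 0.25)  # floor so the premium never reaches zero

low_risk = {"completed_safety_course": 1, "secure_storage": 1, "years_without_incident": 10}
high_risk = {"prior_violent_offense": 1}
```

Because the score is recomputed from the current profile, re-running the function as a policy holder’s circumstances change yields the dynamic repricing the article envisions; a deep learning model would replace the hand-set weights, but the pricing step would look much the same.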

Requiring gun owners to buy insurance wouldn’t eliminate gun violence in America. But a political solution to the problem of gun violence is chimerical….(More)”.

A primer on political bots: Part one


Stuart W. Shulman et al at Data Driven Journalism: “The rise of political bots brings into sharp focus the role of automated social media accounts in today’s democratic civil society. Events during the Brexit referendum and the 2016 U.S. Presidential election revealed the scale of this issue for the first time to the majority of citizens and policy-makers. At the same time, the deployment of Russian-linked bots designed to promote pro-gun laws in the aftermath of the Florida school shooting demonstrates the state-sponsored, real-time readiness to shape, through information warfare, the dominant narratives on platforms such as Twitter. The regular news reports on these issues lead us to conclude that the foundations of democracy have become threatened by the presence of aggressive and socially disruptive bots, which aim to manipulate online political discourse.

While there is clarity on the various functions that bot accounts can be scripted to perform, as described below, the task of accurately defining this phenomenon and identifying bot accounts remains a challenge. At Texifter, we have endeavoured to bring nuance to this issue through a research project which explores the presence of automated accounts on Twitter. Initially, this project concerned itself with an attempt to identify bots which participated in online conversations around the prevailing cryptocurrency phenomenon. This article is the first in a series of three blog posts produced by the researchers at Texifter that outlines the contemporary phenomenon of Twitter bots….

Bots in their current iteration have a relatively short, albeit rapidly evolving, history. Initially constructed with non-malicious intentions, it wasn’t until the late 1990s, with the advent of Web 2.0, that bots began to develop a more negative reputation. Although bots have been used maliciously in distributed denial-of-service (DDoS) attacks, spam emails, and mass identity theft, their purpose is not inherently to incite mayhem.

Before the most recent political events, bots existed in chat rooms, operated as automated customer service agents on websites, and were a mainstay on dating websites. This familiar form of the bot is known to the majority of the general population as a “chatbot” – for instance, CleverBot was and still is a popular platform to talk to an “AI”. Another prominent example was Microsoft’s failed Twitter chatbot Tay, which made headlines in 2016 when “her” vocabulary and conversation functions were manipulated by Twitter users until “she” espoused neo-Nazi views, at which point “she” was deleted.

Image: XKCD Comic #632.

A Twitter bot is an account controlled by an algorithm or script, typically hosted on a cloud platform such as Heroku. These accounts are typically, though not exclusively, scripted to conduct repetitive tasks. For example, there are bots that retweet content containing particular keywords, reply to new followers, and send direct messages to new followers, although they can also be used for more complex tasks such as participating in online conversations. Bot accounts make up between 9 and 15% of all active accounts on Twitter; however, it is estimated that they account for a much greater percentage of total Twitter traffic. Twitter bots are generally not created with malicious intent; they are frequently used for online chatting or for raising the professional profile of a corporation – but their ability to pervade our online experience and shape political discourse warrants heightened scrutiny….(More)”.
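
The repetitive, scripted behavior described above can be sketched as follows. The `client` and `stream` objects stand in for a hypothetical API wrapper, not any real Twitter library; only the keyword-matching logic is concrete:

```python
# Sketch of a keyword-triggered bot. The client/stream objects are
# hypothetical placeholders for whatever API wrapper a bot author uses.

KEYWORDS = {"election", "referendum"}  # hypothetical trigger terms

def should_retweet(tweet_text: str, keywords: set = KEYWORDS) -> bool:
    """Return True if the tweet contains any trigger keyword."""
    words = {w.strip(".,!?#@").lower() for w in tweet_text.split()}
    return bool(words & keywords)

def run_bot(client, stream):
    """Retweet matching tweets and greet new followers (hypothetical client API)."""
    for tweet in stream:
        if should_retweet(tweet["text"]):
            client.retweet(tweet["id"])
    for follower in client.new_followers():
        client.send_direct_message(follower, "Thanks for the follow!")
```

A political bot is rarely more sophisticated than this loop; its influence comes from running it across hundreds or thousands of coordinated accounts.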

The Rise of Virtual Citizenship


James Bridle in The Atlantic: “In Cyprus, Estonia, the United Arab Emirates, and elsewhere, passports can now be bought and sold….“If you believe you are a citizen of the world, you are a citizen of nowhere. You don’t understand what citizenship means,” the British prime minister, Theresa May, declared in October 2016. Not long after, at his first postelection rally, Donald Trump asserted, “There is no global anthem. No global currency. No certificate of global citizenship. We pledge allegiance to one flag and that flag is the American flag.” And in Hungary, Prime Minister Viktor Orbán has increased his national-conservative party’s popularity with statements like “all the terrorists are basically migrants” and “the best migrant is the migrant who does not come.”

Citizenship and its varying legal definition has become one of the key battlegrounds of the 21st century, as nations attempt to stake out their power in a G-Zero, globalized world, one increasingly defined by transnational, borderless trade and liquid, virtual finance. In a climate of pervasive nationalism, jingoism, xenophobia, and ever-building resentment toward those who move, it’s tempting to think that moving across borders would become more difficult. But alongside the rise of populist, identitarian movements across the globe, identity itself is being virtualized, too. It no longer needs to be tied to place or nation to function in the global marketplace.

Hannah Arendt called citizenship “the right to have rights.” Like any other right, it can be bestowed and withheld by those in power, but in its newer forms it can also be bought, traded, and rewritten. Virtual citizenship is a commodity that can be acquired through the purchase of real estate or financial investments, subscribed to via an online service, or assembled by peer-to-peer digital networks. And as these options become available, they’re also used, like so many technologies, to exclude those who don’t fit in.

In a world that increasingly operates online, geography and physical infrastructure still remain crucial to control and management. Undersea fiber-optic cables trace the legacy of imperial trading routes. Google and Facebook erect data centers in Scandinavia and the Pacific Northwest, close to cheap hydroelectric power and natural cooling. The trade in citizenship itself often manifests locally as architecture. From luxury apartments in the Caribbean and the Mediterranean to data centers in Europe and refugee settlements in the Middle East, a scattered geography of buildings brings a different reality into focus: one in which political decisions and national laws transform physical space into virtual territory…(More)”.

Data journalism and the ethics of publishing Twitter data


Matthew L. Williams at Data Driven Journalism: “Collecting and publishing data collected from social media sites such as Twitter are everyday practices for the data journalist. Recent findings from Cardiff University’s Social Data Science Lab question the practice of publishing Twitter content without seeking some form of informed consent from users beforehand. Researchers found that tweets collected around certain topics, such as those related to terrorism, political votes, changes in the law and health problems, create datasets that might contain sensitive content, such as extreme political opinion, grossly offensive comments, overly personal revelations and threats to life (both to oneself and to others). Handling these data in the process of analysis (such as classifying content as hateful and potentially illegal) and reporting has brought the ethics of using social media in social research and journalism into sharp focus.

Ethics is an issue that is becoming increasingly salient in research and journalism using social media data. The digital revolution has outpaced parallel developments in research governance and agreed good practice. Codes of ethical conduct that were written in the mid twentieth century are being relied upon to guide the collection, analysis and representation of digital data in the twenty-first century. Social media is particularly ethically challenging because of the open availability of the data (particularly from Twitter). Many platforms’ terms of service specifically state that users’ public data will be made available to third parties, and by accepting these terms users legally consent to this. However, researchers and data journalists must interpret and engage with these commercially motivated terms of service through a more reflexive lens, which implies a context-sensitive approach, rather than focusing only on the legally permissible uses of these data.

Social media researchers and data journalists have experimented with data from a range of sources, including Facebook, YouTube, Flickr, Tumblr and Twitter to name a few. Twitter is by far the most studied of all these networks. This is because Twitter differs from other networks, such as Facebook, that are organised around groups of ‘friends’, in that it is more ‘open’ and the data (in part) are freely available to researchers. This makes Twitter a more public digital space that promotes the free exchange of opinions and ideas. Twitter has become the primary space for online citizens to publicly express their reaction to events of national significance, and also the primary source of data for social science research into digital publics.

The Twitter streaming API provides three levels of data access: the free random 1% that provides ~5M tweets daily, and the random 10% and 100% (chargeable or free to academic researchers upon request). Datasets on social interactions of this scale, speed and ease of access have been hitherto unrealisable in the social sciences and journalism, and have led to a flood of journal articles and news pieces, many of which include tweets with full text content and author identity without informed consent. This is presumably because of Twitter’s ‘open’ nature, which leads to the assumption that ‘these are public data’ and that using them does not require the rigor and scrutiny of ethical oversight. Even when these data are scrutinised, the ‘public data’ argument is difficult to challenge, given the lack of a framework for evaluating the potential harms to users. The Social Data Science Lab takes a more ethically reflexive approach to the use of social media data in social research, and carefully considers users’ perceptions, online context and the role of algorithms in estimating potentially sensitive user characteristics.
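
The scale of these tiers is easier to grasp with a back-of-the-envelope calculation. Taking the article’s ~5M-tweets-per-day figure for the free 1% sample at face value, the implied volumes of the larger tiers follow directly:

```python
# Back-of-the-envelope: daily volumes implied by the article's ~5M figure
# for the free 1% sample. Only the 5M estimate comes from the text.

sample_tweets_per_day = 5_000_000  # the article's estimate for the 1% stream
sample_rate = 0.01

implied_firehose = sample_tweets_per_day / sample_rate   # the 100% tier
implied_decile = implied_firehose * 0.10                 # the 10% tier

print(f"Implied 100% tier: {implied_firehose:,.0f} tweets/day")
print(f"Implied 10% tier:  {implied_decile:,.0f} tweets/day")
```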

A recent Lab survey into users’ perceptions of the use of their social media posts found the following:

  • 94% were aware that social media companies had Terms of Service
  • 65% had read the Terms of Service in whole or in part
  • 76% knew that when accepting Terms of Service they were giving permission for some of their information to be accessed by third parties
  • 80% agreed that if their social media information is used in a publication they would expect to be asked for consent
  • 90% agreed that if their tweets were used without their consent they should be anonymized…(More)”.

Spanning Today’s Chasms: Seven Steps to Building Trusted Data Intermediaries


James Shulman at the Mellon Foundation: “In 2001, when hundreds of individual colleges and universities were scrambling to scan their slide libraries, The Andrew W. Mellon Foundation created a new organization, Artstor, to assemble a massive library of digital images from disparate sources to support teaching and research in the arts and humanities.

Rather than encouraging—or paying for—each school to scan its own slide of the Mona Lisa, the Mellon Foundation created an intermediary organization that would balance the interests of those who created, photographed and cared for art works, such as artists and museums, and those who wanted to use such images for the admirable calling of teaching and studying history and culture.  This organization would reach across the gap that separated these two communities and would respect and balance the interests of both sides, while helping each accomplish their missions.  At the same time that Napster was using technology to facilitate the unbalanced transfer of digital content from creators to users, the Mellon Foundation set up a new institution aimed at respecting the interests of one side of the market and supporting the socially desirable work of the other.

As the internet has enabled the sharing of data across the world, new intermediaries have emerged as entire platforms. A networked world needs such bridges—think Etsy or eBay sitting between sellers and buyers, or Facebook sitting between advertisers and users. While intermediaries that match sellers and buyers of things provide a marketplace to bridge from one side or the other, aggregators of data work in admittedly more shadowy territories.

In the many realms that market forces won’t support, however, a great deal of public good can be done by aggregating and managing access to datasets that might otherwise continue to live in isolation. Whether due to institutional sociology that favors local solutions, the technical challenges associated with merging heterogeneous databases built with different data models, intellectual property limitations, or privacy concerns, datasets are built and maintained by independent groups that—if networked—could be used to further each other’s work.

Think of those studying coral reefs, or those studying labor practices in developing markets, or child welfare offices seeking to call upon court records in different states, or medical researchers working in different sub-disciplines but on essentially the same disease.  What intermediary invests in joining these datasets?  Many people assume that computers can simply “talk” to each other and share data intuitively, but without targeted investment in connecting them, they can’t.  Unlike modern databases that are now often designed with the cloud in mind, decades of locally created databases churn away in isolation, at great opportunity cost to us all.

Art history research is an unusually vivid example. Most people can understand that if you want to study Caravaggio, you don’t want to hunt and peck across hundreds of museums, books, photo archives, libraries, churches, and private collections.  You want all that content in one place—exactly what Mellon sought to achieve by creating Artstor.

What did we learn in creating Artstor that might be distilled as lessons for others taking on an aggregation project to serve the public good?….(More)”.

Facebook’s next project: American inequality


Nancy Scola at Politico: “Facebook CEO Mark Zuckerberg is quietly cracking open his company’s vast trove of user data for a study on economic inequality in the U.S. — the latest sign of his efforts to reckon with divisions in American society that the social network is accused of making worse.

The study, which hasn’t previously been reported, is mining the social connections among Facebook’s American users to shed light on the growing income disparity in the U.S., where the top 1 percent of households is said to control 40 percent of the country’s wealth. Facebook is an incomparably rich source of information for that kind of research: By one estimate, about three of five American adults use the social network….

Facebook confirmed the broad contours of its partnership with the economist Raj Chetty but declined to elaborate on the substance of the study. Chetty, in a brief interview following a January speech in Washington, said he and his collaborators — who include researchers from Stanford and New York University — have been working on the inequality study for at least six months.

“We’re using social networks, and measuring interactions there, to understand the role of social capital much better than we’ve been able to,” he said.

Researchers say they see Facebook’s enormous cache of data as a remarkable resource, offering an unprecedentedly detailed and sweeping look at American society. That store of information contains both details that a user might tell Facebook — their age, hometown, schooling, family relationships — and insights that the company has picked up along the way, such as the interest groups they’ve joined and geographic distribution of who they call a “friend.”

It’s all the more significant, researchers say, when you consider that Facebook’s user base — about 239 million monthly users in the U.S. and Canada at last count — cuts across just about every demographic group.

And all that information, say researchers, lets them take guesses about users’ wealth. Facebook itself recently patented a way of figuring out someone’s socioeconomic status using factors ranging from their stated hobbies to how many internet-connected devices they own.
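
A heavily simplified sketch of the kind of inference such a patent describes might look like the following. The features, weights, and cutoffs are invented for illustration and bear no relation to Facebook’s actual model:

```python
# Toy sketch of guessing a socioeconomic bracket from profile signals.
# Every feature, weight, and threshold here is hypothetical.

HOBBY_WEIGHTS = {"sailing": 2.0, "golf": 1.5, "couponing": -1.0}  # invented

def socioeconomic_score(num_devices: int, hobbies: set) -> float:
    """Combine device count and stated hobbies into a single toy score."""
    score = 0.5 * num_devices  # more connected devices nudges the score up
    score += sum(HOBBY_WEIGHTS.get(h, 0.0) for h in hobbies)
    return score

def bracket(score: float) -> str:
    """Map the toy score to a coarse bracket."""
    if score >= 4.0:
        return "high"
    if score >= 2.0:
        return "middle"
    return "low"
```

The point of the sketch is how little explicit financial data such an inference requires: a handful of behavioral proxies, weighted and thresholded, is enough to take a guess at wealth.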

A Facebook spokesman addressed the potential privacy implications of the study’s access to user data, saying, “We conduct research at Facebook responsibly, which includes making sure we protect people’s information.” The spokesman added that Facebook follows an “enhanced” review process for research projects, adopted in 2014 after a controversy over a study that manipulated some people’s news feeds to see if it made them happier or sadder.

According to a Stanford University source familiar with Chetty’s study, the Facebook account data used in the research has been stripped of any details that could be used to identify users. The source added that academics involved in the study have gone through security screenings that include background checks, and can access the Facebook data only in secure facilities….(More)”.

The Social Media Threat to Society and Security


George Soros at Project Syndicate: “It takes significant effort to assert and defend what John Stuart Mill called the freedom of mind. And there is a real chance that, once lost, those who grow up in the digital age – in which the power to command and shape people’s attention is increasingly concentrated in the hands of a few companies – will have difficulty regaining it.

The current moment in world history is a painful one. Open societies are in crisis, and various forms of dictatorships and mafia states, exemplified by Vladimir Putin’s Russia, are on the rise. In the United States, President Donald Trump would like to establish his own mafia-style state but cannot, because the Constitution, other institutions, and a vibrant civil society won’t allow it….

The rise and monopolistic behavior of the giant American Internet platform companies is contributing mightily to the US government’s impotence. These companies have often played an innovative and liberating role. But as Facebook and Google have grown ever more powerful, they have become obstacles to innovation, and have caused a variety of problems of which we are only now beginning to become aware…

Social media companies’ true customers are their advertisers. But a new business model is gradually emerging, based not only on advertising but also on selling products and services directly to users. They exploit the data they control, bundle the services they offer, and use discriminatory pricing to keep more of the benefits that they would otherwise have to share with consumers. This enhances their profitability even further, but the bundling of services and discriminatory pricing undermine the efficiency of the market economy.

Social media companies deceive their users by manipulating their attention, directing it toward their own commercial purposes, and deliberately engineering addiction to the services they provide. This can be very harmful, particularly for adolescents.

There is a similarity between Internet platforms and gambling companies. Casinos have developed techniques to hook customers to the point that they gamble away all of their money, even money they don’t have.

Something similar – and potentially irreversible – is happening to human attention in our digital age. This is not a matter of mere distraction or addiction; social media companies are actually inducing people to surrender their autonomy. And this power to shape people’s attention is increasingly concentrated in the hands of a few companies.

It takes significant effort to assert and defend what John Stuart Mill called the freedom of mind. Once lost, those who grow up in the digital age may have difficulty regaining it.

This would have far-reaching political consequences. People without the freedom of mind can be easily manipulated. This danger does not loom only in the future; it already played an important role in the 2016 US presidential election.

There is an even more alarming prospect on the horizon: an alliance between authoritarian states and large, data-rich IT monopolies, bringing together nascent systems of corporate surveillance with already-developed systems of state-sponsored surveillance. This may well result in a web of totalitarian control the likes of which not even George Orwell could have imagined….(More)”.

Free Speech in the Filter Age


Alexandra Borchardt at Project Syndicate: “In a democracy, the rights of the many cannot come at the expense of the rights of the few. In the age of algorithms, government must, more than ever, ensure the protection of vulnerable voices, even erring on the victims’ side at times.

Germany’s Network Enforcement Act – according to which social-media platforms like Facebook and YouTube could be fined €50 million ($63 million) for every “obviously illegal” post within 24 hours of receiving a notification – has been controversial from the start. After it entered fully into effect in January, there was a tremendous outcry, with critics from all over the political map arguing that it was an enticement to censorship. Government was relinquishing its powers to private interests, they protested.

So, is this the beginning of the end of free speech in Germany?

Of course not. To be sure, Germany’s Netzwerkdurchsetzungsgesetz (or NetzDG) is the strictest regulation of its kind in a Europe that is growing increasingly annoyed with America’s powerful social-media companies. And critics do have some valid points about the law’s weaknesses. But the possibilities for free expression will remain abundant, even if some posts are deleted mistakenly.

The truth is that the law sends an important message: democracies won’t stay silent while their citizens are exposed to hateful and violent speech and images – content that, as we know, can spur real-life hate and violence. Refusing to protect the public, especially the most vulnerable, from dangerous content in the name of “free speech” actually serves the interests of those who are already privileged, beginning with the powerful companies that drive the dissemination of information.

Speech has always been filtered. In democratic societies, everyone has the right to express themselves within the boundaries of the law, but no one has ever been guaranteed an audience. To have an impact, citizens have always needed to appeal to – or bypass – the “gatekeepers” who decide which causes and ideas are relevant and worth amplifying, whether through the media, political institutions, or protest.

The same is true today, except that the gatekeepers are the algorithms that automatically filter and rank all contributions. Of course, algorithms can be programmed any way companies like, meaning that they may place a premium on qualities shared by professional journalists: credibility, intelligence, and coherence.

But today’s social-media platforms are far more likely to prioritize potential for advertising revenue above all else. So the noisiest are often rewarded with a megaphone, while less polarizing, less privileged voices are drowned out, even if they are providing the smart and nuanced perspectives that can truly enrich public discussions….(More)”.

Republics of Makers: From the Digital Commons to a Flat Marginal Cost Society


Mario Carpo at eFlux: “…as the costs of electronic computation have been steadily decreasing for the last forty years at least, many have recently come to the conclusion that, for most practical purposes, the cost of computation is asymptotically tending to zero. Indeed, the current notion of Big Data is based on the assumption that an almost unlimited amount of digital data will soon be available at almost no cost, and similar premises have further fueled the expectation of a forthcoming “zero marginal costs society”: a society where, except for some upfront and overhead costs (the costs of building and maintaining some facilities), many goods and services will be free for all. And indeed, against all odds, an almost zero marginal cost society is already a reality in the case of many services based on the production and delivery of electricity: from the recording, transmission, and processing of electrically encoded digital information (bits) to the production and consumption of electrical power itself. Using renewable energies (solar, wind, hydro) the generation of electrical power is free, except for the cost of building and maintaining installations and infrastructure. And given the recent progress in the micro-management of intelligent electrical grids, it is easy to imagine that in the near future the cost of servicing a network of very small, local hydro-electric generators, for example, could easily be devolved to local communities of prosumers who would take care of those installations as they tend to their living environment, on an almost voluntary, communal basis. This was already often the case during the early stages of electrification, before the rise of AC (alternating current, which, unlike DC, or direct current, could be carried over long distances): AC became the industry’s choice only after Galileo Ferraris’s and Nikola Tesla’s developments in AC technologies in the 1880s.

Likewise, at the micro-scale of the electronic production and processing of bits and bytes of information, the Open Source movement and the phenomenal surge of some crowdsourced digital media (including some so-called social media) in the first decade of the twenty-first century have already proven that a collaborative, zero cost business model can effectively compete with products priced for profit on a traditional marketplace. As the success of Wikipedia, Linux, or Firefox proves, many are happy to volunteer their time and labor for free when all can profit from the collective work of an entire community without having to pay for it. This is now technically possible precisely because the fixed costs of building, maintaining, and delivering these services are very small; hence, from the point of view of the end-user, negligible.

Yet, regardless of the fixed costs of the infrastructure, content—even user-generated content—has costs, albeit for the time being these are mostly hidden, voluntarily borne, or inadvertently absorbed by the prosumers themselves. For example, the wisdom of Wikipedia is not really a wisdom of crowds: most Wikipedia entries are de facto curated by fairly traditional scholar communities, and these communities can contribute their expertise for free only because their work has already been paid for by others—often by universities. In this sense, Wikipedia is only piggybacking on someone else’s research investments (but multiplying their outreach, which is one reason for its success). Ditto for most Open Source software, as training a software engineer, coder, or hacker takes time and money—an investment for future returns that in many countries around the world is still borne, at least in part, by public institutions….(More)”.

Crowdsourcing Judgments of News Source Quality


Paper by Gordon Pennycook and David G. Rand: “The spread of misinformation and disinformation, especially on social media, is a major societal challenge. Here, we assess whether crowdsourced ratings of trust in news sources can effectively differentiate between more and less reliable sources. To do so, we ran a preregistered experiment (N = 1,010 from Amazon Mechanical Turk) in which individuals rated familiarity with, and trust in, 60 news sources from three categories: 1) Mainstream media outlets, 2) Websites that produce hyper-partisan coverage of actual facts, and 3) Websites that produce blatantly false content (“fake news”).

Our results indicate that, despite substantial partisan bias, laypeople across the political spectrum rate mainstream media outlets as far more trustworthy than either hyper-partisan or fake news sources (every mainstream source except one, Salon, was rated as more trustworthy than every hyper-partisan or fake news source when equally weighting the ratings of Democrats and Republicans).

Critically, however, excluding ratings from participants who are not familiar with a given news source dramatically reduces the difference between mainstream media sources and hyper-partisan or fake news sites. For example, 30% of the mainstream media websites (Salon, the Guardian, Fox News, Politico, Huffington Post, and Newsweek) received lower trust scores than the most trusted fake news site (news4ktla.com) when excluding unfamiliar ratings.
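
The two aggregation rules just described (equally weighting Democrats’ and Republicans’ ratings, and excluding raters unfamiliar with a source) can be sketched as follows; the ratings below are made-up numbers, not the paper’s data:

```python
# Sketch of the paper's two aggregation rules, applied to made-up ratings.
# Each rating: (party, familiar_with_source, trust score).

ratings = [
    ("D", True, 4.0), ("D", False, 1.0), ("D", True, 3.5),
    ("R", True, 2.0), ("R", False, 1.0),
]

def mean(xs):
    return sum(xs) / len(xs)

def equal_party_trust(ratings, familiar_only=False):
    """Average within each party first, then average the two party means."""
    party_means = []
    for party in ("D", "R"):
        scores = [t for p, fam, t in ratings
                  if p == party and (fam or not familiar_only)]
        party_means.append(mean(scores))
    return mean(party_means)
```

In this toy data, the unfamiliar raters give low scores, so excluding them raises the aggregate; this mirrors the effect reported above, where familiarity filtering flatters little-known (including fake news) sources.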

This suggests that rather than being initially agnostic about unfamiliar sources, people are initially skeptical – and thus a lack of familiarity is an important cue for untrustworthiness. Overall, our findings indicate that crowdsourcing media trustworthiness judgments is a promising approach for fighting misinformation and disinformation online, but that trustworthiness ratings from participants who are unfamiliar with a given source should not be ignored….(More)”.