Deep Fake

By Emil Verhulst

/diːp feɪk/

An image or recording that has been convincingly altered and manipulated to misrepresent someone as doing or saying something that was not actually done or said (Merriam-Webster).

The term “deepfake” was coined in late 2017 by a Reddit user who “shared pornographic videos that used open source face-swapping technology.” Since then, the term has expanded to include any harmful alteration or manipulation of digital media, from audio to landscape imagery. For example, researchers have applied AI techniques to modify aerial imagery, which could potentially lead governments astray or spread false information:

“Adversaries may use fake or manipulated information to impact our understanding of the world,” says a spokesperson for the National Geospatial-Intelligence Agency, part of the Pentagon that oversees the collection, analysis, and distribution of geospatial information.

Audio can also be deepfaked. In 2019, a mysterious case emerged involving a UK-based energy company and its Germany-based parent company. The CEO of the UK energy company received a call from his boss, or at least from someone he thought was his boss. This “boss” told him to send around €220,000 to a supplier in Hungary:

“The €220,000 was moved to Mexico and channeled to other accounts, and the energy firm—which was not identified—reported the incident to its insurance company, Euler Hermes Group SA. An official with Euler Hermes said the thieves used artificial intelligence to create a deepfake of the German executive’s voice.”

This incident, among others, points to a rise in crime associated with deep fakes. Deep fakes can alter our perception of reality. In particular, they can prove dangerous in precarious social and political climates, in which false information can incite violence or hate speech online.

So what is the science behind deep fakes?

Deep fakes are usually created using Generative Adversarial Networks, or GANs. GANs are a technique within Machine Learning (ML), which is itself a subfield of AI. Machine learning is the use of computer systems that learn without following explicit instructions, instead using statistics and algorithms to find patterns in data. To create a deep fake, two ML models called “neural networks” are trained against each other: one, the generator, creates fake data (videos, images, audio, etc.) that imitates the original data (usually video or audio of a real person), while the other, the discriminator, tries to tell the counterfeit data from the real thing. The two networks compete over many training iterations, each improving in response to the other, until the discriminator can no longer reliably distinguish the fake data from the real data.
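To make that tug-of-war concrete, here is a minimal sketch in Python of the adversarial training loop. It is a toy under heavy assumptions, not how real deep fake tools are built: the "data" are single numbers rather than video frames, and each "network" is a single unit with two parameters rather than a deep neural network. The network that creates fakes is usually called the generator and the one that spots them the discriminator. The function names (`train_toy_gan`, `discriminate`) and all settings are hypothetical, chosen only for illustration.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def discriminate(x, w, c):
    """Discriminator: estimated probability that sample x is real."""
    return sigmoid(w * x + c)


def train_toy_gan(steps=3000, batch=64, lr=0.05,
                  real_mean=2.0, real_std=1.0, seed=0):
    """Train a toy generator and discriminator against each other."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator G(z) = a*z + b, fakes start centred on 0
    w, c = 0.0, 0.0   # discriminator D(x) = sigmoid(w*x + c)

    for _ in range(steps):
        real = [rng.gauss(real_mean, real_std) for _ in range(batch)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [a * z + b for z in noise]

        # Discriminator step: gradient ascent on
        # log D(real) + log(1 - D(fake)), i.e. get better at spotting fakes.
        grad_w = grad_c = 0.0
        for x in real:
            d = discriminate(x, w, c)
            grad_w += (1.0 - d) * x
            grad_c += (1.0 - d)
        for x in fake:
            d = discriminate(x, w, c)
            grad_w -= d * x
            grad_c -= d
        w += lr * grad_w / batch
        c += lr * grad_c / batch

        # Generator step: gradient ascent on log D(G(z)), i.e. nudge the
        # fakes toward whatever the discriminator currently calls real.
        grad_a = grad_b = 0.0
        for z in noise:
            d = discriminate(a * z + b, w, c)
            grad_a += (1.0 - d) * w * z
            grad_b += (1.0 - d) * w
        a += lr * grad_a / batch
        b += lr * grad_b / batch

    return (a, b), (w, c)


if __name__ == "__main__":
    (a, b), (w, c) = train_toy_gan()
    print(f"fake samples are centred on {b:.2f} (real data: 2.00)")
    print(f"D(2.0) = {discriminate(2.0, w, c):.2f}")
```

If training behaves as intended, the centre of the fake samples should drift from 0 toward the real data's centre, and the discriminator's verdict on a typical sample should hover around 0.5, meaning it is reduced to guessing. Because this discriminator is a single linear unit, it can only compare averages, so the sketch matches the centre of the data and nothing more. Real systems replace both units with deep networks and the numbers with images or audio.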

Deep fakes are likely to become more prevalent in the coming years. As they pose a threat to journalism, online speech, and internet safety, we must remain vigilant about the information we take in online.