Researchers at Carnegie Mellon University have devised a way to transfer the "style" of one video onto another.
This technique lets you transfer the speech and facial expressions of comedian John Oliver, for example, onto another bespectacled, suited late-night television host (in this case, Stephen Colbert) – or onto a cartoon frog. Alternatively, you can change the bloom of a daffodil to the bloom of a hibiscus, or slow the passing of clouds so that the weather appears much calmer than it was in reality.
The method was presented yesterday at the European Conference on Computer Vision (ECCV 2018) in Munich and is available to view on arXiv.
How does it work? Like most deepfakes, it uses an AI technique called generative adversarial networks (GANs). These are, essentially, a pair of algorithms that work in opposition to create mind-bendingly convincing fake videos. One algorithm (the discriminator) learns to spot inconsistencies in video style, while the second (the generator) learns to produce videos that fit a certain style. This means the generator is constantly trying to trick the discriminator. The result: over time and through practice, the videos become more and more lifelike.
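To make the adversarial tug-of-war concrete, here is a minimal, hypothetical sketch in plain NumPy (not the researchers' code): a one-parameter "generator" learns to shift random noise toward some real data, while a logistic "discriminator" learns to tell real from fake. The real data is a toy stand-in – a single constant value rather than video frames.

```python
import numpy as np

rng = np.random.default_rng(0)

REAL_VALUE = 5.0     # the "real data" the generator must imitate (toy stand-in)
LR_D, LR_G = 0.05, 0.05

w, c = 0.0, 0.0      # discriminator: D(x) = sigmoid(w*x + c)
b = 0.0              # generator: G(z) = z + b

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for step in range(1000):
    real = np.full(64, REAL_VALUE)
    z = rng.normal(0.0, 0.1, 64)
    fake = z + b

    # Discriminator step: ascend log D(real) + log(1 - D(fake)),
    # i.e. learn to score real data high and fakes low.
    d_real = sigmoid(w * real + c)
    d_fake = sigmoid(w * fake + c)
    w += LR_D * np.mean((1 - d_real) * real - d_fake * fake)
    c += LR_D * np.mean((1 - d_real) - d_fake)

    # Generator step: ascend log D(fake) - try to trick the discriminator.
    d_fake = sigmoid(w * fake + c)
    b += LR_G * np.mean((1 - d_fake) * w)

# After training, the generator's shift b has drifted toward the real value.
print(round(b, 2))
```

Each loop iteration alternates one discriminator update with one generator update – the same adversarial schedule used at far larger scale for video, where the generator is a deep network and the "style" is learned from thousands of frames.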
What is new is a technique called Recycle-GAN. It improves on cycle-GAN, which checks the quality of the fake images by converting them back into the style of the original – a process the researchers describe as translating English to Spanish and then back to English again. So, for example, footage of Colbert may be "translated" to match the style of Oliver. This fake footage would then be "translated" back into the style of Colbert so that you can see how good the "translation" was in the first place. Recycle-GAN extends this idea to video by also making use of the order of frames over time, so that motion is carried over smoothly rather than each frame being converted in isolation.
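The back-translation check can be illustrated with a cycle-consistency loss (a toy sketch, not the paper's code): map data forward, map it back, and score how far the round trip lands from where it started. Here the two "translators" F and B are simple hand-picked functions standing in for the learned generators.

```python
import numpy as np

def cycle_consistency_loss(x, forward, backward):
    """Mean absolute error after translating x forward and then back again."""
    return float(np.mean(np.abs(x - backward(forward(x)))))

x = np.linspace(0.0, 10.0, 50)          # stand-in for "Colbert" footage

# A faithful translator pair: B exactly inverts F, so the round trip is lossless.
f_good = lambda v: 2.0 * v + 1.0        # "Colbert -> Oliver"
b_good = lambda v: (v - 1.0) / 2.0      # "Oliver -> Colbert"

# A lossy pair: B only approximately inverts F, so the round trip drifts.
f_bad = lambda v: 2.0 * v + 1.0
b_bad = lambda v: v / 2.0               # forgets the +1 offset

good = cycle_consistency_loss(x, f_good, b_good)   # near zero
bad = cycle_consistency_loss(x, f_bad, b_bad)      # a constant 0.5 error
print(good, bad)
```

During training, this reconstruction error is added to the adversarial loss, so the generators are penalized whenever a translation cannot be faithfully undone.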