Terrifying Things Happen When an AI Generates Fake Faces Synced to Music

By Andrew Liszewski

We’re still trying to figure out the best applications for neural networks, machine learning, and all the recent advancements in artificial intelligence. Alongside the practical research being conducted, there’s also plenty of frivolous experimentation, with results that walk the line between fascinating and terrifying.

Automated image processing has emerged as a strong suit of artificial neural networks, fuelled in part by decades of people sharing photos and selfies on the internet. That has resulted in vast archives of headshots being harvested and used to train AIs to do everything from artificially ageing users in novelty mobile apps to generating huge collections of photorealistic headshots of people who don’t actually exist.

The stock photography industry will never be the same, but Mario Klingemann wondered what would happen if those same artificial neural networks churning out fake headshots were synced to music, generating the most expressive faces when a song’s beat is really banging.

Klingemann used StyleGAN2, a generative adversarial network that was created by Nvidia and eventually released as an open source tool over a year ago. He didn’t do any custom image training of his own; instead, he modified the GAN so that its output responds to the sound spectrum of a given audio file, which in this case is the song Triggernometry by Kraftamt.
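
Klingemann hasn’t been quoted here on exactly how the audio is wired into the network, so the following is only a rough sketch of one way a sound spectrum could steer a face generator, not his actual method. It assumes the generator takes a 512-number latent vector as input (as StyleGAN2 does), and every name in it (`band_levels`, `audio_to_latents`, `strength`, the choice of frequencies) is hypothetical. For each video frame it measures how loud a few chosen frequencies are, then pushes a fixed “resting” latent vector along one random direction per frequency, so louder moments should produce faces further from the resting one.

```python
import cmath
import math
import random


def band_levels(samples, sample_rate, fps, band_hz):
    """For each video frame, return the normalised level of each frequency in band_hz."""
    hop = sample_rate // fps  # audio samples that fall within one video frame
    frames = []
    for start in range(0, len(samples) - hop + 1, hop):
        chunk = samples[start:start + hop]
        levels = []
        for hz in band_hz:
            # Single-bin discrete Fourier transform: correlate the chunk with a
            # complex sinusoid at this frequency and take the magnitude.
            acc = sum(x * cmath.exp(-2j * math.pi * hz * n / sample_rate)
                      for n, x in enumerate(chunk))
            levels.append(abs(acc) / hop)
        frames.append(levels)
    # Scale so the loudest band anywhere in the track maps to 1.0.
    peak = max((v for frame in frames for v in frame), default=0.0)
    if peak > 0:
        frames = [[v / peak for v in frame] for frame in frames]
    return frames


def audio_to_latents(levels, latent_dim=512, strength=2.0, seed=0):
    """Turn per-frame band levels into one latent vector per video frame."""
    rng = random.Random(seed)
    # Hypothetical "resting" face: a single random latent vector.
    base = [rng.gauss(0, 1) for _ in range(latent_dim)]
    n_bands = len(levels[0]) if levels else 0
    # One random direction in latent space per frequency band (an assumption;
    # hand-picked directions would give more control over expressions).
    directions = [[rng.gauss(0, 1) for _ in range(latent_dim)]
                  for _ in range(n_bands)]
    latents = []
    for frame in levels:
        z = list(base)
        for level, direction in zip(frame, directions):
            for k in range(latent_dim):
                z[k] += strength * level * direction[k]
        latents.append(z)
    return latents
```

Each vector returned by `audio_to_latents` would then be fed to the generator to render one frame of video. Turning `strength` down is the sort of knob that would tame the more extreme faces, at the cost of less expressive ones.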

Some of Klingemann’s followers on Twitter have suggested he dial back some of the more extreme output from the GAN, whose horrors are really only revealed when you carefully step through the video frame by frame.

They make some good points, but if you look too closely at anything you’re bound to find something you won’t like – or in this case, something else to prevent you from sleeping soundly at night.