By Dr. Tim Oates, Chief Data Scientist
Generative Adversarial Networks, or GANs, are one of the most exciting ideas in machine learning right now. Yann LeCun, perhaps best known for inventing Convolutional Neural Networks, said in a 2016 interview:
> The most important [new idea in machine learning], in my opinion, is adversarial training (also called GAN for Generative Adversarial Networks). This, and the variations that are now being proposed, is the most interesting idea in the last 10 years in ML …
Here’s the main idea. Suppose you’re printing counterfeit money. You print a bunch of $20 bills and take one to your local Walmart and try to buy something. The cashier says “this bill is counterfeit!”, and you ask “how can you tell?” The cashier points out something like the printing being blurry, or the color being wrong. Armed with this information you go back and change your process, print more bills, and repeat. When the cashier can’t tell the difference between your fake money and real money, the adversarial game is over and you’ve won!
In GAN terminology, you are the generator and the cashier is the discriminator. The generator does what its name suggests: it generates samples from some distribution. In this case, the generator is producing $20 bills from a distribution over possible ways to counterfeit money. The discriminator is fed samples from two different distributions: one from the generator (which we’ll call fake) and one that the generator is trying to mimic (which we’ll call real). In this case, the generator is trying to mimic the distribution over real $20 bills, so the discriminator sees both real and fake $20 bills.
The goal of the discriminator is to tell real samples from fake ones, that is, to catch fake money. The goal of the generator is to fool the discriminator into thinking that its fake samples come from the real distribution, that is, to pass off fake money as real.
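These two opposing goals can be written down as a pair of loss functions. Here's a minimal sketch in plain Python (the scores and variable names are purely illustrative, not from any particular GAN implementation): the discriminator assigns each sample a score where higher means "looks more real", and each player's loss drops as it gets closer to its goal.

```python
import math

def sigmoid(u):
    """Squash a raw score into a probability of 'real'."""
    return 1.0 / (1.0 + math.exp(-u))

def mean(xs):
    return sum(xs) / len(xs)

# Hypothetical discriminator scores (higher = "looks more real").
real_scores = [2.0, 1.5, 3.0]    # scores on real $20 bills
fake_scores = [-1.0, -2.0, 0.5]  # scores on the generator's fakes

# Discriminator loss: low when it says "real" (prob near 1) on real
# samples and "fake" (prob near 0) on generated ones.
d_loss = -mean([math.log(sigmoid(s)) for s in real_scores]) \
         -mean([math.log(1.0 - sigmoid(s)) for s in fake_scores])

# Generator loss: low when the discriminator is fooled into scoring
# the fakes as real.
g_loss = -mean([math.log(sigmoid(s)) for s in fake_scores])
```

Note the tension: anything the generator does to push its fakes' scores up lowers `g_loss` but raises `d_loss`. That tug-of-war is the adversarial game.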
In a GAN, the generator and discriminator are both neural networks. When the discriminator correctly identifies a fake sample, we can use some fairly straightforward math to figure out what the generator could have done to make the sample look more real. Through many iterations of this game, and with some luck, the generator gets better and better at fooling the discriminator by producing increasingly realistic samples.
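To make that game concrete, here's a toy end-to-end sketch, not a real GAN implementation: the "networks" are a one-parameter generator and a logistic discriminator, the target distribution is a 1-D Gaussian, and the gradients are worked out by hand. All names, learning rates, and step counts are illustrative assumptions.

```python
import math
import random

random.seed(0)

def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))

# Real distribution: N(4, 1). The generator g(z) = theta + z shifts
# noise z ~ N(0, 1); the discriminator D(x) = sigmoid(w*x + b) scores
# how "real" a sample looks.
theta, w, b = 0.0, 0.0, 0.0
lr = 0.05

for _ in range(5000):
    x_real = random.gauss(4.0, 1.0)
    x_fake = theta + random.gauss(0.0, 1.0)

    # Discriminator step: gradient descent on
    # -log D(x_real) - log(1 - D(x_fake)).
    d_real = sigmoid(w * x_real + b)
    d_fake = sigmoid(w * x_fake + b)
    w += lr * ((1.0 - d_real) * x_real - d_fake * x_fake)
    b += lr * ((1.0 - d_real) - d_fake)

    # Generator step: gradient descent on -log D(x_fake). This is the
    # "what could make the sample look more real" signal: the gradient
    # flows through the discriminator back into the generator.
    d_fake = sigmoid(w * x_fake + b)
    theta += lr * (1.0 - d_fake) * w

print(f"generator mean after training: {theta:.2f}")
```

With some luck (and in this toy case, reliably), `theta` drifts toward the real mean of 4: once the generator's samples are indistinguishable from real ones, the discriminator's scores flatten out and the game reaches its equilibrium.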
It’s unlikely that you’re interested in printing counterfeit money, so what other uses are there for GANs? Here are a few:
- If the generator is given photos and the discriminator is given paintings by Van Gogh, the generator learns to make photos look like Van Gogh paintings.
- If the generator is given aerial photos and the discriminator is given Google Maps images, the generator learns to create maps from aerial images.
- If the generator is given low resolution images and the discriminator is given high resolution images, the generator learns to increase image resolution (called super-resolution).
One of the more exciting aspects of GANs is that, in some cases, they remove the need for parallel corpora. For example, if I want to learn to translate English to German, it’s reasonable to expect that I’ll need a corpus of sentence pairs, with each sentence in both languages. That’s a parallel corpus and, as you might expect, they’re a pain to build. With a GAN, though, if I’ve got lots of English text and lots of German text, which are really easy to find, I can learn to translate. The idea is to feed the generator English sentences and give its output to a discriminator that also sees native German sentences. The generator tries to improve its translations until the discriminator can’t tell a native sentence from a translation.
It’s sometimes hard to tell which bold new ideas will have staying power. But given the sheer number of papers being published on GANs, many with truly interesting applications, it’s safe to say that we’re looking at more than a simple fad. I’d bet that GANs will be among the most important frontiers of machine learning for some time.
For more on fundamentals of machine learning, read "How Is Learning Even Possible?"