Lesson 8: Generative Modeling
Generative Modeling
Generative modeling is an unsupervised learning approach with the goal of learning the underlying probability distribution of the input data. It involves learning a density estimate (a probability distribution) of the data and then generating new samples from that learned distribution.
Examples of generative models include:
- Autoencoders (AEs)
- Variational autoencoders (VAEs)
- Generative Adversarial Networks (GANs)
Autoencoders (AE)
These are neural networks consisting of an encoder network and a decoder network.
The encoder learns a mapping from an input x to a low-dimensional latent representation z.
The decoder takes the low-dimensional representation z and reconstructs the input, producing an output x̂.
Training objective: The goal of training is to minimize the reconstruction loss. A loss function such as Mean Squared Error (MSE) is used, and backpropagation of the gradient of this loss with respect to the encoder and decoder weights is used to update the encoder-decoder model.
The term “autoencoder” arises from the fact that the network automatically encodes the data into a latent representation. The data is not manually encoded.
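As an illustration, here is a minimal autoencoder sketch in PyTorch; the layer sizes, optimizer, and the random stand-in batch are assumptions made for the example, not prescribed by the lesson:

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        # Encoder: maps the input x to a low-dimensional latent code z
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),
        )
        # Decoder: maps the latent code z back to a reconstruction x_hat
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, input_dim),
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z)

# One training step: minimize the reconstruction loss (here MSE) by
# backpropagating its gradient with respect to encoder and decoder weights.
model = Autoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

x = torch.rand(64, 784)      # stand-in batch of flattened inputs
x_hat = model(x)
loss = loss_fn(x_hat, x)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```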
Variational Autoencoders
Autoencoders are deterministic: after training, passing the same input through the network at inference time always produces the same output, with no variability. To introduce variability in the generated output, we can use variational autoencoders. A variational autoencoder consists of an encoder network and a decoder network.
The encoder network maps an input x not to a single latent vector but to a distribution over the latent variable z.
The last hidden layer of the encoder network is transformed linearly to generate two vectors: the mean μ and the (log) variance of the latent Gaussian distribution.
Instead of sampling the latent vector z directly from this distribution, which would not be differentiable, the reparameterization trick is used: z = μ + σ ⊙ ε, where ε is drawn from a standard Gaussian N(0,I).
Note that the symbol ⊙ denotes element-wise multiplication.
The decoder network maps the sampled latent vector z to an output x̂ that reconstructs the input x.
The encoder network computes the parameters of the Gaussian distribution q(z∣x), namely the mean μ and the variance σ².
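A minimal PyTorch sketch of this encoder, the reparameterization step, and the decoder might look as follows; the layer sizes and the use of a log-variance output are illustrative assumptions:

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, input_dim=784, hidden_dim=256, latent_dim=20):
        super().__init__()
        self.hidden = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        # Linear transforms of the last hidden layer produce the two vectors:
        # the mean and the log-variance of the Gaussian q(z|x).
        self.mu = nn.Linear(hidden_dim, latent_dim)
        self.logvar = nn.Linear(hidden_dim, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, input_dim), nn.Sigmoid(),
        )

    def encode(self, x):
        h = self.hidden(x)
        return self.mu(h), self.logvar(h)

    def reparameterize(self, mu, logvar):
        # z = mu + sigma * eps, with eps ~ N(0, I); keeps sampling differentiable
        std = torch.exp(0.5 * logvar)
        eps = torch.randn_like(std)
        return mu + std * eps

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        return self.decoder(z), mu, logvar
```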
The training objective: The overall goal of a variational autoencoder is to learn a generative model of data that captures the underlying structure and distribution of the input data.
The training objective of the variational autoencoder is to maximize the Evidence Lower Bound (ELBO), or equivalently to minimize the negative ELBO, which consists of two parts:
- Reconstruction loss, which measures the difference between the input x and the reconstruction x̂ produced by the decoder, and
- KL divergence regularization term, which measures the divergence between the learned latent distribution q(z∣x) and a prior distribution p(z), typically a standard Gaussian N(0,I).
The Evidence Lower Bound (ELBO) for a variational autoencoder is given by:

ELBO(θ, φ; x) = E_{q_φ(z∣x)}[log p_θ(x∣z)] − KL(q_φ(z∣x) ∥ p(z))

where:
- θ are the parameters of the decoder network p_θ(x∣z),
- φ are the parameters of the encoder network q_φ(z∣x),
- the first term is the expected log-likelihood of the data (the negative of the reconstruction loss) and the second term is the KL divergence regularizer.
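Continuing the sketch above, the negative ELBO can be computed as a reconstruction term plus the closed-form KL divergence between the diagonal Gaussian q(z∣x) and the prior N(0,I); binary cross-entropy is assumed here as the reconstruction loss, which fits inputs scaled to [0, 1]:

```python
import torch
import torch.nn.functional as F

def vae_loss(x, x_hat, mu, logvar):
    # Reconstruction term: how well the decoder output matches the input
    recon = F.binary_cross_entropy(x_hat, x, reduction="sum")
    # KL(q(z|x) || N(0, I)) in closed form for a diagonal Gaussian:
    # -0.5 * sum(1 + log(sigma^2) - mu^2 - sigma^2)
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    # Minimizing this sum is equivalent to maximizing the ELBO
    return recon + kl
```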
Generative Adversarial Networks
Generative Adversarial Networks, or GANs, allow us to generate good samples without focusing on the interpretability of the latent variable.
GANs do not model the data density directly; instead, they learn to transform samples from a simple latent distribution into samples of the data. This is because a complex distribution is difficult to model explicitly.
A GAN samples from something simple, such as random noise, and learns a generator network that maps the sample to the data space. A discriminator is then used to classify the generator's fake output against the real data, and the discriminator's feedback is passed back to the generator so that the generator learns to produce data that is close to the real data.
Both the generator network G and the discriminator network D are trained jointly when training GANs, until the generator is able to produce data that is close to the real data.
GANs start from random noise, for example samples drawn from a normal distribution, and generate data such as an image.
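To make the adversarial setup concrete, here is a minimal sketch of one GAN training step in PyTorch; the network sizes, the binary cross-entropy losses, and the random stand-in data are assumptions made for illustration:

```python
import torch
import torch.nn as nn

latent_dim, data_dim = 64, 784

# Generator: maps a noise vector z ~ N(0, I) to a data sample (e.g. a flattened image)
G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                  nn.Linear(256, data_dim), nn.Tanh())
# Discriminator: classifies a sample as real (1) or fake (0)
D = nn.Sequential(nn.Linear(data_dim, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1), nn.Sigmoid())

opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

real = torch.rand(32, data_dim) * 2 - 1   # stand-in batch of real data in [-1, 1]
ones, zeros = torch.ones(32, 1), torch.zeros(32, 1)

# Discriminator step: learn to separate real data from generated (fake) data
z = torch.randn(32, latent_dim)
fake = G(z).detach()                      # detach so only D is updated here
d_loss = bce(D(real), ones) + bce(D(fake), zeros)
opt_D.zero_grad(); d_loss.backward(); opt_D.step()

# Generator step: learn to fool the discriminator into labeling fakes as real
z = torch.randn(32, latent_dim)
g_loss = bce(D(G(z)), ones)
opt_G.zero_grad(); g_loss.backward(); opt_G.step()
```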