A flow-based generative model is a generative model used in machine learning that explicitly models a probability distribution by leveraging normalizing flow,[1] a statistical method that uses the change-of-variables law of probability to transform a simple distribution into a complex one, usually the distribution (i.e. the likelihood function) of the observed data $x$.
The direct modeling of likelihood provides many advantages. For example, the negative log-likelihood can be computed directly and minimized as the loss function. Additionally, novel samples can be generated by sampling from the initial distribution and applying the flow transformation.
In contrast, many alternative generative modeling methods, such as the variational autoencoder (VAE) and the generative adversarial network (GAN), do not explicitly represent the likelihood function.
Method
Let $z_0$ be a (possibly multivariate) random variable with distribution $p_0(z_0)$.
For $i = 1, \dots, K$, let $z_i = f_i(z_{i-1})$ be a sequence of random variables transformed from $z_0$. The functions $f_1, \dots, f_K$ should be invertible, i.e. the inverse function $f_i^{-1}$ exists. The final output $z_K$ models the target distribution.
The log likelihood of $z_K$ is (see the derivation below):
$\log p_K(z_K) = \log p_0(z_0) - \sum_{i=1}^{K} \log \left| \det \frac{d f_i(z_{i-1})}{d z_{i-1}} \right|$
To efficiently compute the log likelihood, the functions $f_1, \dots, f_K$ should be 1. easy to invert, and 2. have Jacobians whose determinants are easy to compute. In practice, the functions $f_1, \dots, f_K$ are modeled using deep neural networks, and are trained to minimize the negative log-likelihood of data samples from the target distribution.
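The following is a minimal sketch (not from the original article) of how the log-likelihood and sampling computations might look in Python with NumPy. It uses a one-dimensional affine flow $f_i(z) = a_i z + b_i$ with hand-set parameters instead of a learned neural network, and a standard normal base distribution; the names AffineFlow, log_likelihood, and sample are illustrative assumptions.

import numpy as np

class AffineFlow:
    # Illustrative invertible map f(z) = a*z + b with a != 0 (not a learned network).
    def __init__(self, a, b):
        self.a, self.b = a, b

    def forward(self, z):
        return self.a * z + self.b

    def inverse(self, x):
        return (x - self.b) / self.a

    def log_abs_det_jacobian(self, z):
        # df/dz = a, so log|det J| = log|a| (constant for an affine map).
        return np.log(np.abs(self.a)) * np.ones_like(z)

def log_likelihood(x, flows):
    # log p_K(x) = log p_0(z_0) - sum_i log|det df_i/dz_{i-1}|, with p_0 standard normal.
    z = x
    log_det_sum = np.zeros_like(x)
    for flow in reversed(flows):           # invert the flow: x = z_K -> z_0
        z_prev = flow.inverse(z)
        log_det_sum += flow.log_abs_det_jacobian(z_prev)
        z = z_prev
    log_p0 = -0.5 * (z ** 2 + np.log(2 * np.pi))   # standard normal base density
    return log_p0 - log_det_sum

def sample(flows, n):
    # Draw z_0 from the base distribution and push it through the flow.
    z = np.random.randn(n)
    for flow in flows:
        z = flow.forward(z)
    return z

flows = [AffineFlow(2.0, 1.0), AffineFlow(0.5, -1.0)]
x = sample(flows, 5)
print(log_likelihood(x, flows))

In a trained model, the negative of log_likelihood would serve as the loss, and the affine parameters (or, more generally, the parameters of each $f_i$) would be adjusted by gradient descent.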
Derivation of log likelihood
Consider $z_1$ and $z_0$. Note that $z_0 = f_1^{-1}(z_1)$.
By the change of variable formula, the distribution of $z_1$ is:
$p_1(z_1) = p_0(z_0) \left| \det \frac{d f_1^{-1}(z_1)}{d z_1} \right|$
where $\det \frac{d f_1^{-1}(z_1)}{d z_1}$ is the determinant of the Jacobian matrix of $f_1^{-1}$.
By the inverse function theorem:
$\frac{d f_1^{-1}(z_1)}{d z_1} = \left( \frac{d f_1(z_0)}{d z_0} \right)^{-1}$
By the identity $\det(A^{-1}) = \det(A)^{-1}$ (where $A$ is an invertible matrix), we have:
$\left| \det \frac{d f_1^{-1}(z_1)}{d z_1} \right| = \left| \det \frac{d f_1(z_0)}{d z_0} \right|^{-1}$
The log likelihood is thus:
$\log p_1(z_1) = \log p_0(z_0) - \log \left| \det \frac{d f_1(z_0)}{d z_0} \right|$
In general, the above applies to any $z_i$ and $z_{i-1}$. Since $\log p_i(z_i)$ is equal to $\log p_{i-1}(z_{i-1})$ minus a non-recursive term, we can infer by induction that:
$\log p_K(z_K) = \log p_0(z_0) - \sum_{i=1}^{K} \log \left| \det \frac{d f_i(z_{i-1})}{d z_{i-1}} \right|$
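As an illustrative example (not part of the original derivation), consider a single one-dimensional affine transformation $f_1(z_0) = a z_0 + b$ with $a \neq 0$. Its Jacobian is simply $a$, so the formula reduces to
$\log p_1(z_1) = \log p_0(z_0) - \log |a|,$
reflecting that stretching the base distribution by a factor $|a|$ lowers its density by the same factor.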
Examples
TODO description
- RealNVP
- TODO more items, and citation
Applications
TODO description
- Point-cloud modeling
- Music generation
- TODO more items, and citation
References
- ^ Rezende, Danilo Jimenez; Mohamed, Shakir (2015). "Variational Inference with Normalizing Flows". arXiv:1505.05770.
External links
Category:Machine learning Category:Statistical models Category:Probabilistic models