Diffusion model

Diffusion models were introduced in 2015 as a method to train a model that can sample from a highly complex probability distribution. They used techniques from [[non-equilibrium thermodynamics]], especially [[diffusion]].<ref>{{Cite journal |last1=Sohl-Dickstein |first1=Jascha |last2=Weiss |first2=Eric |last3=Maheswaranathan |first3=Niru |last4=Ganguli |first4=Surya |date=2015-06-01 |title=Deep Unsupervised Learning using Nonequilibrium Thermodynamics |url=http://proceedings.mlr.press/v37/sohl-dickstein15.pdf |journal=Proceedings of the 32nd International Conference on Machine Learning |language=en |publisher=PMLR |volume=37 |pages=2256–2265|arxiv=1503.03585 }}</ref>
 
Consider, for example, how one might model the distribution of all naturally occurring photos. Each image is a point in the space of all images, and the distribution of naturally occurring photos is a "cloud" in this space which, by repeatedly adding noise to the images, diffuses out to the rest of the image space, until the cloud becomes all but indistinguishable from a [[Normal distribution|Gaussian distribution]] <math>\mathcal{N}(0, I)</math>. A model that can approximately undo the diffusion can then be used to sample from the original distribution. This is studied in "non-equilibrium" thermodynamics, as the starting distribution, unlike the final distribution, is not in equilibrium.
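The forward noising process described above can be sketched numerically. The following is a minimal illustration (not from the article; the cluster positions, noise rate, and step count are arbitrary choices): a tightly concentrated two-dimensional "cloud" is repeatedly mixed with Gaussian noise, after which it is statistically close to <math>\mathcal{N}(0, I)</math>.

```python
# Illustrative sketch: repeatedly add Gaussian noise to a concentrated 2-D
# point cloud; after many steps it is nearly indistinguishable from N(0, I).
import numpy as np

rng = np.random.default_rng(0)

# A sharply concentrated "data" distribution: two tight clusters.
x = np.concatenate([rng.normal(-3.0, 0.1, (500, 2)),
                    rng.normal(+3.0, 0.1, (500, 2))])

beta = 0.05           # noise mixed in per step (arbitrary small rate)
for _ in range(300):  # many small noising steps
    x = np.sqrt(1 - beta) * x + np.sqrt(beta) * rng.standard_normal(x.shape)

# The cloud now has mean ~ 0 and standard deviation ~ 1 in each coordinate.
print(x.mean(axis=0), x.std(axis=0))
```

The scaling by <math>\sqrt{1-\beta}</math> keeps the variance bounded, so the process converges to unit variance rather than spreading without limit.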
 
The equilibrium distribution is the Gaussian distribution <math>\mathcal{N}(0, I)</math>, with pdf <math>\rho(x) \propto e^{-\frac 12 \|x\|^2}</math>. This is just the [[Maxwell–Boltzmann distribution]] of particles in a potential well <math>V(x) = \frac 12 \|x\|^2</math> at temperature 1. The initial distribution, being very much out of equilibrium, diffuses towards the equilibrium distribution, taking biased random steps that are a sum of pure randomness (like a [[Brownian motion|Brownian walker]]) and gradient descent down the potential well. The randomness is necessary: if the particles were to undergo only gradient descent, they would all fall to the origin, collapsing the distribution.
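This combination of gradient descent and Brownian noise is discretized Langevin dynamics. A minimal sketch (illustrative constants, not from the article) contrasts the two cases for the quadratic potential <math>V(x) = \frac 12 \|x\|^2</math>, where <math>-\nabla V(x) = -x</math>: with noise, an out-of-equilibrium ensemble relaxes to the unit-variance Gaussian; without noise, every particle collapses to the origin.

```python
# Illustrative sketch: discretized Langevin dynamics in V(x) = ||x||^2 / 2.
# Each step is gradient descent (-x) plus Brownian noise of matched strength.
import numpy as np

rng = np.random.default_rng(1)
eta = 0.01                        # step size (arbitrary small value)
x = rng.normal(5.0, 0.1, 2000)    # ensemble started far out of equilibrium
y = x.copy()                      # same start, but noise-free gradient descent

for _ in range(2000):
    x = x - eta * x + np.sqrt(2 * eta) * rng.standard_normal(x.shape)
    y = y - eta * y  # pure gradient descent: no noise term

print(f"with noise:    mean={x.mean():+.3f}  std={x.std():.3f}")  # ~0, ~1
print(f"without noise: mean={y.mean():+.3f}  std={y.std():.3f}")  # collapsed to 0
```

The noise coefficient <math>\sqrt{2\eta}</math> is what sets the temperature to 1, so the stationary spread matches <math>\mathcal{N}(0, I)</math>.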
 
=== Other examples ===
Notable variants include<ref>{{Cite journal |last1=Cao |first1=Hanqun |last2=Tan |first2=Cheng |last3=Gao |first3=Zhangyang |last4=Xu |first4=Yilun |last5=Chen |first5=Guangyong |last6=Heng |first6=Pheng-Ann |last7=Li |first7=Stan Z. |date=July 2024 |title=A Survey on Generative Diffusion Models |url=https://ieeexplore.ieee.org/document/10419041 |journal=IEEE Transactions on Knowledge and Data Engineering |volume=36 |issue=7 |pages=2814–2830 |doi=10.1109/TKDE.2024.3361474 |issn=1041-4347|url-access=subscription }}</ref> the Poisson flow generative model,<ref>{{Cite journal |last1=Xu |first1=Yilun |last2=Liu |first2=Ziming |last3=Tian |first3=Yonglong |last4=Tong |first4=Shangyuan |last5=Tegmark |first5=Max |last6=Jaakkola |first6=Tommi |date=2023-07-03 |title=PFGM++: Unlocking the Potential of Physics-Inspired Generative Models |url=https://proceedings.mlr.press/v202/xu23m.html |journal=Proceedings of the 40th International Conference on Machine Learning |language=en |publisher=PMLR |pages=38566–38591|arxiv=2302.04265 }}</ref> the consistency model,<ref>{{Cite journal |last1=Song |first1=Yang |last2=Dhariwal |first2=Prafulla |last3=Chen |first3=Mark |last4=Sutskever |first4=Ilya |date=2023-07-03 |title=Consistency Models |url=https://proceedings.mlr.press/v202/song23a |journal=Proceedings of the 40th International Conference on Machine Learning |language=en |publisher=PMLR |pages=32211–32252}}</ref> critically-damped Langevin diffusion,<ref>{{Cite arXiv |last1=Dockhorn |first1=Tim |last2=Vahdat |first2=Arash |last3=Kreis |first3=Karsten |date=2021-10-06 |title=Score-Based Generative Modeling with Critically-Damped Langevin Diffusion |class=stat.ML |eprint=2112.07068 }}</ref> GenPhys,<ref>{{cite arXiv |last1=Liu |first1=Ziming |title=GenPhys: From Physical Processes to Generative Models |date=2023-04-05 |eprint=2304.02637 |last2=Luo |first2=Di |last3=Xu |first3=Yilun |last4=Jaakkola |first4=Tommi |last5=Tegmark |first5=Max|class=cs.LG }}</ref> cold diffusion,<ref>{{Cite journal |last1=Bansal |first1=Arpit |last2=Borgnia |first2=Eitan |last3=Chu |first3=Hong-Min |last4=Li |first4=Jie |last5=Kazemi |first5=Hamid |last6=Huang |first6=Furong |last7=Goldblum |first7=Micah |last8=Geiping |first8=Jonas |last9=Goldstein |first9=Tom |date=2023-12-15 |title=Cold Diffusion: Inverting Arbitrary Image Transforms Without Noise |url=https://proceedings.neurips.cc/paper_files/paper/2023/hash/80fe51a7d8d0c73ff7439c2a2554ed53-Abstract-Conference.html |journal=Advances in Neural Information Processing Systems |language=en |volume=36 |pages=41259–41282|arxiv=2208.09392 }}</ref> and discrete diffusion.<ref>{{Cite journal |last1=Gulrajani |first1=Ishaan |last2=Hashimoto |first2=Tatsunori B. |date=2023-12-15 |title=Likelihood-Based Diffusion Language Models |url=https://proceedings.neurips.cc/paper_files/paper/2023/hash/35b5c175e139bff5f22a5361270fce87-Abstract-Conference.html |journal=Advances in Neural Information Processing Systems |language=en |volume=36 |pages=16693–16715|arxiv=2305.18619 }}</ref><ref>{{cite arXiv |last1=Lou |first1=Aaron |title=Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution |date=2024-06-06 |eprint=2310.16834 |last2=Meng |first2=Chenlin |last3=Ermon |first3=Stefano|class=stat.ML }}</ref>
 
== Flow-based diffusion model ==