{{Machine learning|Artificial neural network}}
In [[machine learning]], '''diffusion models''', also known as '''diffusion-based generative models''' or '''score-based generative models''', are a class of [[latent variable model|latent variable]] [[generative model|generative]] models. A diffusion model consists of two major components: a forward diffusion process and a reverse sampling process. The goal of a diffusion model is to learn a [[diffusion process]] for a given dataset, such that the process can generate new elements distributed similarly to the original dataset. A diffusion model treats data as generated by a diffusion process, whereby a new datum performs a [[Wiener process|random walk with drift]] through the space of all possible data.<ref name="song"/> A trained diffusion model can be sampled in many ways, with different trade-offs between efficiency and sample quality.
There are various equivalent formalisms, including [[Markov chain]]s, denoising diffusion probabilistic models, noise conditioned score networks, and stochastic differential equations.<ref>{{cite journal |last1=Croitoru |first1=Florinel-Alin |last2=Hondru |first2=Vlad |last3=Ionescu |first3=Radu Tudor |last4=Shah |first4=Mubarak |date=2023 |title=Diffusion Models in Vision: A Survey |journal=IEEE Transactions on Pattern Analysis and Machine Intelligence |volume=45 |issue=9 |pages=10850–10869 |arxiv=2209.04747 |doi=10.1109/TPAMI.2023.3261988 |pmid=37030794 |s2cid=252199918}}</ref> They are typically trained using [[Variational Bayesian methods|variational inference]].<ref name="ho" /> The model responsible for denoising is typically called its "[[#Choice of architecture|backbone]]". The backbone may be of any kind, but it is typically a [[U-Net|U-net]] or a [[Transformer (deep learning architecture)|transformer]].
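The forward diffusion process can be sketched concretely. The following is a minimal illustration of the standard denoising-diffusion (DDPM-style) forward process, assuming a linear noise schedule <math>\beta_t</math>; the closed-form noising formula <math>x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\varepsilon</math> is standard, but the specific schedule values here are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)

T = 1000
betas = np.linspace(1e-4, 0.02, T)   # illustrative linear noise schedule beta_t
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)      # alpha_bar_t = prod_{s<=t} alpha_s

def forward_noise(x0, t):
    """Sample x_t ~ q(x_t | x_0) in closed form (no need to iterate steps)."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    return xt, eps

x0 = rng.standard_normal(16)         # a toy "datum"
x_mid, _ = forward_noise(x0, T // 2)
x_end, _ = forward_noise(x0, T - 1)

# By t = T the signal coefficient sqrt(alpha_bar_t) is close to zero,
# so x_T is approximately pure standard Gaussian noise.
print(float(np.sqrt(alpha_bars[-1])))
```

The reverse sampling process learned by the backbone network undoes this corruption step by step, starting from pure noise.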
<math display="block">\min_{\theta} \int_0^1 \mathbb{E}_{\pi_0, \pi_1, p_t}\left [\lVert{(x_1-x_0) - v_t(x_t)}\rVert^2\right] \,\mathrm{d}t.</math>
The data pair <math>(x_0, x_1)</math> can be any coupling of <math>\pi_0</math> and <math>\pi_1</math>, typically an independent coupling (i.e., <math>(x_0,x_1) \sim \pi_0 \times \pi_1</math>) obtained by randomly pairing observations from <math>\pi_0</math> and <math>\pi_1</math>. This process ensures that the trajectories closely mirror the density map of <math>x_t</math> trajectories but ''reroute'' at intersections to ensure causality.
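A Monte Carlo estimate of the objective above can be sketched as follows. For illustration only, a hypothetical linear velocity model <math>v_t(x) = Ax + b</math> stands in for the neural network, with <math>\pi_0</math> and <math>\pi_1</math> taken as toy Gaussians under an independent coupling; the interpolant is <math>x_t = (1-t)x_0 + t x_1</math>, whose velocity is the straight-line displacement <math>x_1 - x_0</math>:

```python
import numpy as np

rng = np.random.default_rng(0)
dim, batch = 2, 4096

def flow_matching_loss(A, b):
    """Monte Carlo estimate of E ||(x_1 - x_0) - v_t(x_t)||^2 over t ~ U[0,1]."""
    x0 = rng.standard_normal((batch, dim))        # samples from pi_0 (toy)
    x1 = rng.standard_normal((batch, dim)) + 3.0  # samples from pi_1 (toy)
    t = rng.random((batch, 1))                    # t ~ Uniform[0, 1]
    xt = (1.0 - t) * x0 + t * x1                  # linear interpolant x_t
    target = x1 - x0                              # straight-line velocity
    v = xt @ A.T + b                              # hypothetical linear model
    return float(np.mean(np.sum((target - v) ** 2, axis=1)))

# For these toy Gaussians the expected displacement is constant (about 3 per
# coordinate), so a constant velocity field v = b near that mean does far
# better than the zero field.
loss_zero = flow_matching_loss(np.zeros((dim, dim)), np.zeros(dim))
loss_mean = flow_matching_loss(np.zeros((dim, dim)), np.full(dim, 3.0))
print(loss_mean < loss_zero)
```

In practice <math>v_t</math> is a neural network conditioned on <math>t</math>, and the loss is minimized over its parameters <math>\theta</math> by stochastic gradient descent.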
[[File:Reflow Illustration.png|thumb|390px|The reflow process<ref name=":0"/>]]