PopoDameron (talk | contribs) sure, but let's not imply that the sampling is something that the model itself is doing |
The detailed and professional elaboration on the rectified flow has been updated. Tags: Visual edit Mobile edit Mobile web edit |
=== Rectified flow ===
Rectified flow, a method for learning transport maps between two distributions, offers a new perspective on diffusion models and their ODE variants. Unlike SDE-based models, rectified flow is purely ODE-based, which gives a simple, unified framework for generative and transfer modeling. Among the infinitely many ODEs and SDEs that can transport one distribution to another, rectified flow favors ODEs whose solution paths are straight lines. Learning straight flows gives a principled way to obtain ODEs with fast inference, in effect training one-step models with ODEs as an intermediate step. The rectified flow (RF) formulation is used in Stable Diffusion 3.<ref>{{Cite web |title=Stable Diffusion 3: Research Paper |url=https://stability.ai/news/stable-diffusion-3-research-paper |access-date=2024-03-22 |website=Stability AI |language=en-GB}}</ref>
Given two distributions <math>\pi_0</math> and <math>\pi_1</math>, probability flow models implicitly learn the transport map by constructing an ODE driven by a drift <math>\mathbf v</math> defined on <math>\mathbb R^d \times [0,1]</math>: <math display="block">\mathrm d \mathbf Z_t = \mathbf v(\mathbf Z_t , t) \, \mathrm dt, \quad t \in [0,1], \quad \text{starting from }\mathbf Z_0 \sim \pi_0</math> such that following the ODE yields <math>\mathbf Z_1 \sim \pi_1</math>. In general, for any time-differentiable process <math>\mathbf X_t</math>, <math>\mathbf v</math> can be estimated by solving: <math display="block">\min_{\mathbf v} \int_0^1 \mathbb{E}\left [\lVert{\dot{\mathbf X}_t - \mathbf v(\mathbf X_t, t)}\rVert^2\right] \,\mathrm{d}t.</math>
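As a brief aside on why this regression yields a usable drift: the objective is a least-squares problem at each time <math>t</math>, so its minimizer is the conditional expectation of the velocity given the current position. This assumes that <math>\mathbf v</math> ranges over all measurable functions, an idealization not stated above:
<math display="block">\mathbf v^*(\mathbf x, t) = \mathbb{E}\left[\dot{\mathbf X}_t \mid \mathbf X_t = \mathbf x\right].</math>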
By injecting the strong prior that intermediate trajectories are straight, rectified flow achieves both theoretical relevance for optimal transport<ref>{{Citation |last=Liu |first=Qiang |title=Rectified Flow: A Marginal Preserving Approach to Optimal Transport |date=2022-09-29 |url=http://arxiv.org/abs/2209.14577 |access-date=2024-03-22 |doi=10.48550/arXiv.2209.14577}}</ref> and computational efficiency, since ODEs with straight paths can be simulated exactly without time discretization.
The data pair <math>(\mathbf{X}_0, \mathbf{X}_1)</math>, from which rectified flow builds the linear interpolation <math>\mathbf{X}_t = t \mathbf{X}_1 + (1-t) \mathbf{X}_0</math>, can be any coupling of <math>\pi_0</math> and <math>\pi_1</math>. Typically it is the independent coupling (i.e., <math>(\mathbf{X}_0,\mathbf{X}_1) \sim \pi_0 \times \pi_1</math>), obtained by randomly pairing observations from <math>\pi_0</math> and <math>\pi_1</math>. Fitting <math>\mathbf v</math> to this interpolation ensures that the <math>\mathbf{Z}_t</math> trajectories closely mirror the density map of the <math>\mathbf{X}_t</math> trajectories but ''reroute'' at intersections to ensure causality. This rectifying process is also referred to as Flow Matching, Stochastic Interpolation, and Alpha-Blending.
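The following is a minimal one-dimensional sketch of this procedure, not a reference implementation. It assumes <math>\pi_0 = \mathcal N(0, 1)</math> and <math>\pi_1 = \mathcal N(3, 0.5^2)</math> (hypothetical choices), and it fits a drift that is linear in <math>x</math> at each grid time. A linear drift is adequate only because both distributions are Gaussian; in practice a neural network would normally take its place. The helper names (<code>fit_line</code>, <code>fit_drift</code>, <code>simulate</code>, <code>count_crossings</code>) are illustrative.

```python
import random

def fit_line(xs, ys):
    """Least-squares fit of y ~ a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

def fit_drift(pairs, n_steps=50):
    """Fit v(x, t_k) = a_k + b_k * x on the grid t_k = k / n_steps."""
    coeffs = []
    for k in range(n_steps):
        t = k / n_steps
        xs = [t * x1 + (1 - t) * x0 for x0, x1 in pairs]  # X_t
        ys = [x1 - x0 for x0, x1 in pairs]                # dX_t/dt
        coeffs.append(fit_line(xs, ys))
    return coeffs

def simulate(coeffs, z0):
    """Forward Euler for dZ_t = v(Z_t, t) dt from t = 0 to t = 1."""
    h = 1.0 / len(coeffs)
    z = z0
    for a, b in coeffs:
        z += h * (a + b * z)
    return z

def count_crossings(pairs):
    """Count consecutive path pairs whose start order and end order differ."""
    return sum(
        (p[0] - q[0]) * (p[1] - q[1]) < 0
        for p, q in zip(pairs, pairs[1:])
    )

rng = random.Random(0)
# Independent coupling (assumed): X_0 ~ N(0, 1), X_1 ~ N(3, 0.5^2)
pairs = [(rng.gauss(0.0, 1.0), rng.gauss(3.0, 0.5)) for _ in range(4000)]
coeffs = fit_drift(pairs)
flow_pairs = [(x0, simulate(coeffs, x0)) for x0, _ in pairs]

ends = [z1 for _, z1 in flow_pairs]
mean_z1 = sum(ends) / len(ends)
std_z1 = (sum((z - mean_z1) ** 2 for z in ends) / len(ends)) ** 0.5
crossings_interp = count_crossings(pairs[:500])      # interpolation lines
crossings_flow = count_crossings(flow_pairs[:500])   # ODE trajectories
```

In this sketch the interpolation lines of the independent coupling cross frequently, whereas the simulated ODE trajectories keep their ordering. The endpoints should still approximately reproduce the mean and standard deviation of <math>\pi_1</math>, up to Euler discretization and sampling error.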
A distinctive aspect of rectified flow is its capability for "'''reflow'''", which straightens the trajectories of ODE paths. Denote the rectified flow <math>\boldsymbol{Z} = \{\mathbf{Z}_t: t\in[0,1]\}</math> induced from <math>(\mathbf{X}_0,\mathbf{X}_1)</math> as <math>\boldsymbol{Z} = \mathsf{Rectflow}((\mathbf{X}_0,\mathbf{X}_1))</math>. Recursively applying this <math>\mathsf{Rectflow}(\cdot)</math> operator generates a series of rectified flows <math>\boldsymbol{Z}^{k+1} = \mathsf{Rectflow}((\mathbf{Z}_0^k, \mathbf{Z}_1^k))</math>, starting with <math>(\mathbf{Z}_0^0,\mathbf{Z}_1^0)=(\mathbf{X}_0,\mathbf{X}_1)</math>, where <math>\boldsymbol{Z}^k</math> is the <math>k</math>-th iteration of rectified flow induced from <math>(\mathbf{X}_0,\mathbf{X}_1)</math>. This "reflow" process not only reduces transport costs but also straightens the paths of rectified flows, making the paths of <math>\boldsymbol{Z}^k</math> straighter as <math>k</math> increases.
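The recursion can be sketched in one dimension as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the Gaussian endpoints, the linear-in-<math>x</math> drift fitted on a time grid, and the helper names (<code>rectflow</code>, <code>simulate</code>, <code>straightness</code>) are all hypothetical choices. Here <code>straightness</code> is a Monte Carlo estimate of <math>\textstyle\int_0^1 \mathbb E\left[\lVert (\mathbf Z_1 - \mathbf Z_0) - \mathbf v(\mathbf Z_t, t)\rVert^2\right]\mathrm dt</math>, which is zero exactly when the paths are straight.

```python
import random

def fit_line(xs, ys):
    """Least-squares fit of y ~ a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

def rectflow(pairs, n_steps=50):
    """Fit the drift v(x, t_k) = a_k + b_k * x induced by the coupling `pairs`."""
    coeffs = []
    for k in range(n_steps):
        t = k / n_steps
        xs = [t * x1 + (1 - t) * x0 for x0, x1 in pairs]
        ys = [x1 - x0 for x0, x1 in pairs]
        coeffs.append(fit_line(xs, ys))
    return coeffs

def simulate(coeffs, z0):
    """Forward Euler from t = 0 to t = 1 with the fitted drift."""
    h = 1.0 / len(coeffs)
    z = z0
    for a, b in coeffs:
        z += h * (a + b * z)
    return z

def straightness(coeffs, starts):
    """Estimate the time integral of E[((Z_1 - Z_0) - v(Z_t, t))^2]."""
    h = 1.0 / len(coeffs)
    total = 0.0
    for z0 in starts:
        z1 = simulate(coeffs, z0)
        z = z0
        for a, b in coeffs:
            v = a + b * z
            total += h * ((z1 - z0) - v) ** 2
            z += h * v
    return total / len(starts)

rng = random.Random(0)
# (Z_0^0, Z_1^0) = (X_0, X_1): an independent coupling (assumed Gaussians)
pairs0 = [(rng.gauss(0.0, 1.0), rng.gauss(3.0, 0.5)) for _ in range(3000)]
flow1 = rectflow(pairs0)                                   # first rectified flow
pairs1 = [(z0, simulate(flow1, z0)) for z0, _ in pairs0]   # its endpoint coupling
flow2 = rectflow(pairs1)                                   # reflow

starts = [z0 for z0, _ in pairs0[:500]]
s1 = straightness(flow1, starts)
s2 = straightness(flow2, starts)
```

In this toy setting the first rectified flow already produces a deterministic, monotone coupling, so a single reflow step makes the paths straight up to floating-point error (<code>s2</code> is far smaller than <code>s1</code>). In higher dimensions and with learned networks the improvement is generally gradual.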
Rectified flow includes a nonlinear extension in which the linear interpolation <math>\mathbf{X}_t</math> is replaced with any time-differentiable curve that connects <math>\mathbf{X}_0</math> and <math>\mathbf{X}_1</math>, given by <math>\mathbf{X}_t = \alpha_t \mathbf{X}_1 + \beta_t \mathbf{X}_0</math>. This framework encompasses DDIM and probability flow ODEs as special cases, with particular choices of <math>\alpha_t</math> and <math>\beta_t</math>. However, when the path of <math>\mathbf{X}_t</math> is not straight, the pair <math>(\mathbf{Z}_0, \mathbf{Z}_1)</math> no longer ensures a reduction in convex transport costs, and the reflow process no longer straightens the paths of <math>\mathbf{Z}_t</math>.<ref name=":0" />
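For such a curve the regression target becomes <math>\dot{\mathbf X}_t = \dot\alpha_t \mathbf X_1 + \dot\beta_t \mathbf X_0</math>. The sketch below illustrates this with a trigonometric schedule, <math>\alpha_t = \sin(\pi t/2)</math> and <math>\beta_t = \cos(\pi t/2)</math>. This schedule is a hypothetical choice made here for illustration; it is not claimed to be the schedule of DDIM or of any particular model. The sketch compares it with the linear case <math>\alpha_t = t</math>, <math>\beta_t = 1 - t</math>.

```python
import math

HALF_PI = 0.5 * math.pi

# Hypothetical nonlinear schedule with alpha_0 = 0, alpha_1 = 1, beta_0 = 1, beta_1 = 0
def alpha(t):
    return math.sin(HALF_PI * t)

def beta(t):
    return math.cos(HALF_PI * t)

def d_alpha(t):
    return HALF_PI * math.cos(HALF_PI * t)

def d_beta(t):
    return -HALF_PI * math.sin(HALF_PI * t)

def interpolate(x0, x1, t):
    """X_t = alpha_t * X_1 + beta_t * X_0."""
    return alpha(t) * x1 + beta(t) * x0

def velocity(x0, x1, t):
    """dX_t/dt, the target that v(X_t, t) is regressed onto."""
    return d_alpha(t) * x1 + d_beta(t) * x0

def linear_velocity(x0, x1, t):
    """Linear case alpha_t = t, beta_t = 1 - t: the target is constant in t."""
    return x1 - x0

x0, x1 = -1.0, 2.0
start = interpolate(x0, x1, 0.0)   # equals x0
end = interpolate(x0, x1, 1.0)     # equals x1 up to rounding

# Central finite difference as a sanity check on the analytic derivative
t, eps = 0.3, 1e-6
fd = (interpolate(x0, x1, t + eps) - interpolate(x0, x1, t - eps)) / (2 * eps)
```

With the nonlinear schedule the target velocity changes along each path, which is consistent with the loss of the straightening guarantee described above.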
Flows with nearly straight paths offer a significant computational benefit by minimizing time-discretization errors in numerical simulations. Specifically, if an ODE <math>\mathrm{d} \mathbf{Z}_t = \mathbf{v}(\mathbf{Z}_t,t)\; \mathrm{d}t</math> follows perfectly straight paths, its solution simplifies to <math>\mathbf{Z}_t = \mathbf{Z}_0 + t \cdot \mathbf{v}(\mathbf{Z}_0, 0)</math>, so it can be solved exactly with ''a single Euler step''. This addresses the main bottleneck of ODE/SDE models, namely slow inference. Consequently, the reflow/straightening procedure offers a distinctive strategy for training one-step generative models (the regime in which GANs and VAEs operate), using ODEs as an intermediate mechanism.
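The one-step property can be checked numerically. The sketch below uses two hand-written one-dimensional drifts, both hypothetical examples rather than learned models. One has straight solution paths that carry <math>Z_0</math> to <math>Z_1 = M + S Z_0</math>. The other, <math>\mathrm dZ_t = Z_t\,\mathrm dt</math>, has curved paths with <math>Z_1 = e \cdot Z_0</math>.

```python
import math

def euler(v, z0, n_steps):
    """Forward Euler for dZ_t = v(Z_t, t) dt on [0, 1]."""
    h = 1.0 / n_steps
    z, t = z0, 0.0
    for _ in range(n_steps):
        z += h * v(z, t)
        t += h
    return z

M, S = 3.0, 0.5  # hypothetical affine target map: Z_1 = M + S * Z_0

def v_straight(z, t):
    """Drift whose solution paths are the lines Z_t = Z_0 + t * (M + (S - 1) * Z_0)."""
    z0 = (z - t * M) / (1.0 + t * (S - 1.0))  # recover the start of the line through (z, t)
    return M + (S - 1.0) * z0

def v_curved(z, t):
    """dZ/dt = Z: exponential, hence curved, solution paths."""
    return z

z0 = 0.8
exact_straight = M + S * z0
exact_curved = math.e * z0

one_step_straight = euler(v_straight, z0, 1)
many_step_straight = euler(v_straight, z0, 100)
one_step_curved = euler(v_curved, z0, 1)
many_step_curved = euler(v_curved, z0, 1000)
```

For the straight drift, one step and one hundred steps agree with the exact endpoint up to rounding. For the curved drift, a single step is visibly wrong and many steps are needed to approach the exact value.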
== Choice of architecture ==