Revision as of 05:46, 22 March 2025 by Moderately Sized Greg (minor edit): →Regularized Models: Dummy edit. The previous edit added a paragraph explaining how variational approaches often use coarse-to-fine schemes.
Revision as of 00:44, 23 March 2025 by Moderately Sized Greg (minor edit): Removed excess spaces and arranged references properly as per OpalYosutebito's edits on: https://en.wikipedia.org/w/index.php?title=User:Moderately_Sized_Greg/sandbox&action=history
Line 2:
[[Image:Opticfloweg.png\|thumb\|right\|400px\|The optic flow experienced by a rotating observer (in this case a fly). The direction and magnitude of optic flow at each location are represented by the direction and length of each arrow.]]
'''Optical flow''' or '''optic flow''' is the pattern of apparent [[motion (physics)\|motion]] of objects, surfaces, and edges in a visual scene caused by the [[relative motion]] between an observer and a scene.<ref>{{Cite book \|url={{google books\|plainurl=yes\|id=CSgOAAAAQAAJ\|pg=PA77\|text=optical flow}} \|title=Thinking in Perspective: Critical Essays in the Study of Thought Processes \|last1=Burton \|first1=Andrew \|last2=Radford \|first2=John \|publisher=Routledge \|year=1978 \|isbn=978-0-416-85840-2}}</ref><ref>{{Cite book \|url={{google books\|plainurl=yes\|id=-I_Hazgqx8QC\|pg=PA414\|text=optical flow}} \|title=Electronic Spatial Sensing for the Blind: Contributions from Perception \|last1=Warren \|first1=David H. \|last2=Strelow \|first2=Edward R. \|publisher=Springer \|year=1985 \|isbn=978-90-247-2689-9}}</ref> Optical flow can also be defined as the distribution of apparent velocities of movement of brightness patterns in an image.<ref name="Horn_1980">{{Cite journal \|last1=Horn \|first1=Berthold K.P. \|last2=Schunck \|first2=Brian G. \|date=August 1981 \|title=Determining optical flow \|url=http://image.diku.dk/imagecanon/material/HornSchunckOptical_Flow.pdf \|journal=Artificial Intelligence \|language=en \|volume=17 \|issue=1–3 \|pages=185–203 \|doi=10.1016/0004-3702(81)90024-2\|hdl=1721.1/6337 }}</ref>

The concept of optical flow was introduced by the American psychologist [[James J. Gibson]] in the 1940s to describe the visual stimulus provided to animals moving through the world.<ref>{{Cite book \|title=The Perception of the Visual World \|last=Gibson \|first=J.J. 
\|publisher=Houghton Mifflin \|year=1950}}</ref> Gibson stressed the importance of optic flow for [[Affordance\|affordance perception]], the ability to discern possibilities for action within the environment. Followers of Gibson and his [[Ecological Psychology\|ecological approach to psychology]] have further demonstrated the role of the optical flow stimulus for the perception of movement by the observer in the world; perception of the shape, distance and movement of objects in the world; and the control of [[Animal locomotion\|locomotion]].<ref>{{Cite journal \|last1=Royden \|first1=C. S. \|last2=Moore \|first2=K. D. \|year=2012 \|title=Use of speed cues in the detection of moving objects by moving observers \|journal=Vision Research \|volume=59 \|pages=17–24 \|doi=10.1016/j.visres.2012.02.006\|pmid=22406544 \|s2cid=52847487 \|doi-access=free }}</ref>

The term optical flow is also used by roboticists, encompassing related techniques from image processing and control of navigation including [[motion detection]], [[Image segmentation\|object segmentation]], time-to-contact information, focus of expansion calculations, luminance, [[motion compensation\|motion compensated]] encoding, and stereo disparity measurement.<ref name="Kelson R. T. Aires, Andre M. Santana, Adelardo A. D. Medeiros 2008">{{Cite book \|url=http://www.dca.ufrn.br/~adelardo/artigos/SAC08.pdf \|title=Optical Flow Using Color Information \|last1=Aires \|first1=Kelson R. T. \|last2=Santana \|first2=Andre M. \|last3=Medeiros \|first3=Adelardo A. D. \|publisher=ACM New York, NY, USA \|year=2008 \|isbn=978-1-59593-753-7}}</ref><ref>{{Cite journal \|url=http://portal.acm.org/ft_gateway.cfm?id=212141&type=pdf&coll=GUIDE&dl=GUIDE&CFID=72158298&CFTOKEN=85078203 \|title=The computation of optical flow \|last1=Beauchemin \|first1=S. S. \|last2=Barron \|first2=J. L. 
\|journal=ACM Computing Surveys \|publisher=ACM New York, USA \|year=1995\|volume=27 \|issue=3 \|pages=433–466 \|doi=10.1145/212094.212141 \|s2cid=1334552 \|doi-access=free }}</ref>

== Estimation ==
Optical flow can be estimated in a number of ways. Broadly, optical flow estimation approaches can be divided into machine-learning-based models (sometimes called data-driven models), classical models (sometimes called knowledge-driven models), which do not use machine learning, and hybrid models, which use aspects of both learning-based and classical models.<ref>{{cite journal \|last1=Zhai \|first1=Mingliang \|last2=Xiang \|first2=Xuezhi \|last3=Lv \|first3=Ning \|last4=Kong \|first4=Xiangdong \|title=Optical flow and scene flow estimation: A survey \|journal=Pattern Recognition \|date=2021 \|volume=114 \|pages=107861 \|doi=10.1016/j.patcog.2021.107861 \|url=https://www.sciencedirect.com/science/article/pii/S0031320321000480}}</ref>

===Classical Models===
Line 29:
One can combine both of these constraints to formulate optical flow estimation as an [[Optimization problem\|optimization problem]], where the goal is to minimize a cost function of the form
:<math>E = \iint_\Omega \Psi(I(x + u, y + v, t + 1) - I(x, y, t)) + \alpha \Psi(\|\nabla u\|) + \alpha \Psi(\|\nabla v\|) dx dy, </math>
where <math>\Omega</math> is the extent of the images <math>I(x, y)</math>, <math>\nabla</math> is the gradient operator, <math>\alpha</math> is a constant, and <math>\Psi()</math> is a [[loss function]].<ref name="Fortun_Survey_2015" /><ref name="Brox_2004" />

This optimisation problem is difficult to solve owing to its non-linearity. To address this issue, one can use a ''variational approach'' and linearise the brightness constancy constraint using a first order [[Taylor series]] approximation. 
Specifically, the brightness constancy constraint is approximated as
:<math>\frac{\partial I}{\partial x}u+\frac{\partial I}{\partial y}v+\frac{\partial I}{\partial t} = 0.</math>
For convenience, the derivatives of the image, <math>\tfrac{\partial I}{\partial x}</math>, <math>\tfrac{\partial I}{\partial y}</math> and <math>\tfrac{\partial I}{\partial t}</math>, are often condensed to <math>I_x</math>, <math>I_y</math> and <math> I_t</math>. Doing so allows one to rewrite the linearised brightness constancy constraint as<ref name="Baker_2011" />
:<math>I_x u + I_y v+ I_t = 0.</math>
The optimization problem can now be rewritten as
:<math>E = \iint_\Omega \Psi(I_x u + I_y v + I_t) + \alpha \Psi(\|\nabla u\|) + \alpha \Psi(\|\nabla v\|) dx dy. </math>
For the choice of <math>\Psi(x) = x^2</math>, this method is the same as the [[Horn-Schunck method]].<ref name="Horn_1980" /> Other choices of cost function have also been used, such as <math>\Psi(x) = \sqrt{x^2 + \epsilon^2}</math>, which is a differentiable variant of the [[Taxicab geometry \|<math>L^1</math> norm]].<ref name="Fortun_Survey_2015" /><ref>{{cite conference \|url=https://ieeexplore.ieee.org/abstract/document/5539939 \|title=Secrets of optical flow estimation and their principles \|last1=Sun \|first1=Deqing \|last2=Roth \|first2=Stefan \|last3=Black \|first3=Michael J. 
\|date=2010 \|publisher=IEEE \|book-title=2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition \|pages=2432–2439 \|location=San Francisco, CA, USA \|conference=2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition}}</ref>

To solve the aforementioned optimization problem, one can use the [[Euler-Lagrange equations]] to provide a system of partial differential equations for each point in <math>I(x, y, t)</math>. In the simplest case of using <math>\Psi(x) = x^2</math>, these equations are,
Line 53 ⟶ 49:
Doing so yields a system of linear equations which can be solved for <math>(u, v)</math> at each pixel, using an iterative scheme such as [[Gauss-Seidel]].<ref name="Horn_1980" />

Although linearising the brightness constancy constraint simplifies the optimisation problem significantly, the linearisation is only valid for small displacements and/or smooth images. To avoid this problem, a multi-scale or coarse-to-fine approach is often used. In such a scheme, the images are initially [[downsampling\|downsampled]] and the linearised Euler-Lagrange equations are solved at the reduced resolution. 
The estimated flow field at this scale is then used to initialise the process at the next scale.<ref>{{cite journal \|last1=Meinhardt-Llopis \|first1=Enric \|last2=Pérez \|first2=Javier Sánchez \|last3=Kondermann \|first3=Daniel \|title=Horn-Schunck Optical Flow with a Multi-Scale Strategy \|journal=Image Processing On Line \|date=19 July 2013 \|volume=3 \|pages=151–172 \|doi=10.5201/ipol.2013.20}}</ref> This initialisation process is often performed by [[image warping\|warping]] one frame using the current estimate of the flow field so that it is as similar as possible to the other frame.<ref name="Brox_2004" /><ref>{{cite journal \|last1=Black \|first1=Michael J. \|last2=Anandan \|first2=P. \|title=The Robust Estimation of Multiple Motions: Parametric and Piecewise-Smooth Flow Fields \|journal=Computer Vision and Image Understanding \|date=1 January 1996 \|volume=63 \|issue=1 \|pages=75–104 \|doi=10.1006/cviu.1996.0006 \|issn=1077-3142}}</ref>

An alternative approach is to discretize the optimisation problem and then perform a search of the possible <math>(u, v)</math> values without linearising it.<ref>{{cite conference \|url=https://ieeexplore.ieee.org/document/5459364 \|title=Large Displacement Optical Flow Computation without Warping \|last1=Steinbrücker \|first1=Frank \|last2=Pock \|first2=Thomas \|last3=Cremers \|first3=Daniel \|last4=Weickert \|first4=Joachim \|date=2009 \|publisher=IEEE \|book-title=2009 IEEE 12th International Conference on Computer Vision \|pages=1609–1614 \|conference=2009 IEEE 12th International Conference on Computer Vision}}</ref> This search is often performed using [[Max-flow min-cut theorem]] algorithms, linear programming or [[belief propagation]] methods. 
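The linearised variational scheme with <math>\Psi(x) = x^2</math> can be illustrated with a short single-scale sketch. This is not a reference implementation: the names <code>horn_schunck</code>, <code>neighbour_mean</code>, <code>alpha</code> and <code>n_iter</code> are illustrative, the finite-difference derivative estimates are one choice among many, and the update is a Jacobi-style iteration (all pixels updated at once) rather than Gauss-Seidel. The constant relating the discrete Laplacian to the neighbourhood mean is assumed to be absorbed into <code>alpha</code>.

```python
import numpy as np


def neighbour_mean(f):
    """Mean of the 4 nearest neighbours, replicating values at the image border."""
    p = np.pad(f, 1, mode="edge")
    return 0.25 * (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:])


def horn_schunck(I1, I2, alpha=1.0, n_iter=200):
    """Estimate a dense flow field (u, v) between two greyscale frames.

    Assumes small displacements and the quadratic penalty Psi(x) = x^2.
    alpha (> 0) weights the smoothness term; any stencil constant from the
    discrete Laplacian is assumed to be folded into it.
    """
    I1 = np.asarray(I1, dtype=np.float64)
    I2 = np.asarray(I2, dtype=np.float64)

    # Spatial derivatives of the averaged frame; np.gradient returns the
    # derivative along axis 0 (rows, y) first, then axis 1 (columns, x).
    Iy, Ix = np.gradient(0.5 * (I1 + I2))
    It = I2 - I1  # temporal derivative for a unit time step

    u = np.zeros_like(I1)
    v = np.zeros_like(I1)
    denom = alpha + Ix**2 + Iy**2

    for _ in range(n_iter):
        u_bar = neighbour_mean(u)
        v_bar = neighbour_mean(v)
        # Residual of the linearised constraint I_x u + I_y v + I_t = 0,
        # evaluated at the neighbourhood means.
        residual = (Ix * u_bar + Iy * v_bar + It) / denom
        u = u_bar - Ix * residual
        v = v_bar - Iy * residual
    return u, v
```

Calling <code>horn_schunck(frame1, frame2)</code> on two same-sized greyscale arrays returns two arrays holding the horizontal and vertical flow components. For large displacements, a sketch like this would need to be wrapped in the coarse-to-fine scheme with warping described above.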
Line 67 ⟶ 63:
\hat{\boldsymbol{\alpha}} = \arg \min_{\boldsymbol{\alpha}} \sum_{(x, y) \in \mathcal{R}} g(x, y) \rho(x, y, I_1, I_2, u_{\boldsymbol{\alpha}}, v_{\boldsymbol{\alpha}}), </math>
where <math>{\boldsymbol{\alpha}}</math> is the set of parameters determining the motion in the region <math>\mathcal{R}</math>, <math>\rho()</math> is the data cost term, <math>g()</math> is a weighting function that determines the influence of pixel <math>(x, y)</math> on the total cost, and <math>I_1</math> and <math>I_2</math> are frames 1 and 2 from a pair of consecutive frames.<ref name="Fortun_Survey_2015" />

The simplest parametric model is the [[Lucas-Kanade method]]. This uses rectangular regions and parameterises the motion as purely translational. The Lucas-Kanade method uses the original brightness constancy constraint as the data cost term and selects <math>g(x, y) = 1</math>.
Line 81 ⟶ 76:
Instead of seeking to model optical flow directly, one can train a [[machine learning]] system to estimate optical flow. Since 2015, when FlowNet<ref>{{Cite conference \|last=Dosovitskiy \|first=Alexey \|last2=Fischer \|first2=Philipp \|last3=Ilg \|first3=Eddy \|last4=Hausser \|first4=Philip \|last5=Hazirbas \|first5=Caner \|last6=Golkov \|first6=Vladimir \|last7=Smagt \|first7=Patrick van der \|last8=Cremers \|first8=Daniel \|last9=Brox \|first9=Thomas \|date=2015 \|title=FlowNet: Learning Optical Flow with Convolutional Networks \|url=https://ieeexplore.ieee.org/document/7410673/ \|publisher=IEEE \|pages=2758–2766 \|doi=10.1109/ICCV.2015.316 \|isbn=978-1-4673-8391-2 \| conference=2015 IEEE International Conference on Computer Vision (ICCV)}}</ref> was proposed, learning-based models have been applied to optical flow and have gained prominence. Initially, these approaches were based on [[Convolutional neural network\|convolutional neural networks]] arranged in a [[U-Net]] architecture. 
However, with the advent of the [[Transformer (deep learning architecture)\|transformer architecture]] in 2017, transformer-based models have gained prominence.<ref>{{Cite journal \|last=Alfarano \|first=Andrea \|last2=Maiano \|first2=Luca \|last3=Papa \|first3=Lorenzo \|last4=Amerini \|first4=Irene \|date=2024 \|title=Estimating optical flow: A comprehensive review of the state of the art \|url=https://linkinghub.elsevier.com/retrieve/pii/S1077314224002418 \|journal=Computer Vision and Image Understanding \|language=en \|volume=249 \|pages=104160 \|doi=10.1016/j.cviu.2024.104160}}</ref>

Most learning-based approaches to optical flow use [[supervised learning]]. In this case, many frame pairs of video data and their corresponding [[ground truth\|ground-truth]] flow fields are used to optimise the parameters of the learning-based model to accurately estimate optical flow. This process often relies on vast training datasets due to the number of parameters involved.<ref>{{cite journal \|last1=Tu \|first1=Zhigang \|last2=Xie \|first2=Wei \|last3=Zhang \|first3=Dejun \|last4=Poppe \|first4=Ronald \|last5=Veltkamp \|first5=Remco C. \|last6=Li \|first6=Baoxin \|last7=Yuan \|first7=Junsong \|title=A survey of variational and CNN-based optical flow techniques \|journal=Signal Processing: Image Communication \|date=1 March 2019 \|volume=72 \|pages=9–24 \|doi=10.1016/j.image.2018.12.002}}</ref>

== Uses ==
Line 98 ⟶ 93:
{{distinguish\|Optical flowmeter}}
Various configurations of optical flow sensors exist. One configuration is an image sensor chip connected to a processor programmed to run an optical flow algorithm. 
Another configuration uses a vision chip, which is an integrated circuit having both the [[image sensor]] and the processor on the same die, allowing for a compact implementation.<ref>{{Cite book \|title=Vision Chips \|last=Moini \|first=Alireza \|date=2000 \|publisher=Springer US \|isbn=9781461552673 \|location=Boston, MA \|oclc=851803922}}</ref><ref>{{Cite book \|title=Analog VLSI and neural systems \|last=Mead \|first=Carver \|date=1989 \|publisher=Addison-Wesley \|isbn=0201059924 \|location=Reading, Mass. \|oclc=17954003 \|url-access=registration \|url=https://archive.org/details/analogvlsineural00mead }}</ref> An example of this is a generic optical mouse sensor used in an [[optical mouse]]. In some cases the processing circuitry may be implemented using analog or mixed-signal circuits to enable fast optical flow computation using minimal current consumption. One area of contemporary research is the use of [[neuromorphic engineering]] techniques to implement circuits that respond to optical flow, and thus may be appropriate for use in an optical flow sensor.<ref>{{Cite book \|title=Analog VLSI circuits for the perception of visual motion \|last=Stocker \|first=Alan A. \|date=2006 \|publisher=John Wiley & Sons \|isbn=0470034882 \|location=Chichester, England \|oclc=71521689}}</ref> Such circuits may draw inspiration from biological neural circuitry that similarly responds to optical flow.
Line 116 ⟶ 111:
== References ==
{{reflist\|1=2}}

== External links ==

Optical flow: Difference between revisions