Optical flow: Difference between revisions

m Regularized Models: Dummy edit. The previous edit added a paragraph explaining how variational approaches often use coarse-to-fine schemes.
m Removed excess spaces and arranged references properly as per OpalYosutebito's edits on: https://en.wikipedia.org/w/index.php?title=User:Moderately_Sized_Greg/sandbox&action=history
Line 2:
[[Image:Opticfloweg.png|thumb|right|400px|The optic flow experienced by a rotating observer (in this case a fly). The direction and magnitude of optic flow at each location is represented by the direction and length of each arrow.]]
 
'''Optical flow''' or '''optic flow''' is the pattern of apparent [[motion (physics)|motion]] of objects, surfaces, and edges in a visual scene caused by the [[relative motion]] between an observer and a scene.<ref>{{Cite book |url={{google books|plainurl=yes|id=CSgOAAAAQAAJ|pg=PA77|text=optical flow}} |title=Thinking in Perspective: Critical Essays in the Study of Thought Processes |last1=Burton |first1=Andrew |last2=Radford |first2=John |publisher=Routledge |year=1978 |isbn=978-0-416-85840-2}}</ref><ref>{{Cite book |url={{google books|plainurl=yes|id=-I_Hazgqx8QC|pg=PA414|text=optical flow}} |title=Electronic Spatial Sensing for the Blind: Contributions from Perception |last1=Warren |first1=David H. |last2=Strelow |first2=Edward R. |publisher=Springer |year=1985 |isbn=978-90-247-2689-9}}</ref> Optical flow can also be defined as the distribution of apparent velocities of movement of brightness patterns in an image.<ref name="Horn_1980">{{Cite journal |last1=Horn |first1=Berthold K.P. |last2=Schunck |first2=Brian G. |date=August 1981 |title=Determining optical flow |url=http://image.diku.dk/imagecanon/material/HornSchunckOptical_Flow.pdf |journal=Artificial Intelligence |language=en |volume=17 |issue=1–3 |pages=185–203 |doi=10.1016/0004-3702(81)90024-2|hdl=1721.1/6337 }}</ref>
 
The concept of optical flow was introduced by the American psychologist [[James J. Gibson]] in the 1940s to describe the visual stimulus provided to animals moving through the world.<ref>{{Cite book |title=The Perception of the Visual World |last=Gibson |first=J.J. |publisher=Houghton Mifflin |year=1950}}</ref> Gibson stressed the importance of optic flow for [[Affordance|affordance perception]], the ability to discern possibilities for action within the environment. Followers of Gibson and his [[Ecological Psychology|ecological approach to psychology]] have further demonstrated the role of the optical flow stimulus for the perception of movement by the observer in the world; perception of the shape, distance and movement of objects in the world; and the control of [[Animal locomotion|locomotion]].<ref>{{Cite journal |last1=Royden |first1=C. S. |last2=Moore |first2=K. D. |year=2012 |title=Use of speed cues in the detection of moving objects by moving observers |journal=Vision Research |volume=59 |pages=17–24 |doi=10.1016/j.visres.2012.02.006|pmid=22406544 |s2cid=52847487 |doi-access=free }}</ref>
 
The term optical flow is also used by roboticists, encompassing related techniques from image processing and control of navigation including [[motion detection]], [[Image segmentation|object segmentation]], time-to-contact information, focus of expansion calculations, luminance, [[motion compensation|motion compensated]] encoding, and stereo disparity measurement.<ref name="Kelson R. T. Aires, Andre M. Santana, Adelardo A. D. Medeiros 2008">{{Cite book |url=http://www.dca.ufrn.br/~adelardo/artigos/SAC08.pdf |title=Optical Flow Using Color Information |last1=Aires |first1=Kelson R. T. |last2=Santana |first2=Andre M. |last3=Medeiros |first3=Adelardo A. D. |publisher=ACM New York, NY, USA |year=2008 |isbn=978-1-59593-753-7}}</ref><ref name="S. S. Beauchemin, J. L. Barron 1995">{{Cite journal |url=http://portal.acm.org/ft_gateway.cfm?id=212141&type=pdf&coll=GUIDE&dl=GUIDE&CFID=72158298&CFTOKEN=85078203 |title=The computation of optical flow |last1=Beauchemin |first1=S. S. |last2=Barron |first2=J. L. |journal=ACM Computing Surveys |publisher=ACM New York, USA |year=1995|volume=27 |issue=3 |pages=433–466 |doi=10.1145/212094.212141 |s2cid=1334552 |doi-access=free }}</ref>
 
== Estimation ==
 
Optical flow can be estimated in a number of ways. Broadly, optical flow estimation approaches can be divided into machine-learning-based models (sometimes called data-driven models), classical models (sometimes called knowledge-driven models), which do not use machine learning, and hybrid models, which use aspects of both learning-based models and classical models.<ref name="Zhai_Survey_2021">{{cite journal |last1=Zhai |first1=Mingliang |last2=Xiang |first2=Xuezhi |last3=Lv |first3=Ning |last4=Kong |first4=Xiangdong |title=Optical flow and scene flow estimation: A survey |journal=Pattern Recognition |date=2021 |volume=114 |pages=107861 |doi=10.1016/j.patcog.2021.107861 |url=https://www.sciencedirect.com/science/article/pii/S0031320321000480}}</ref>
 
===Classical Models===
Line 29:
One can combine both of these constraints to formulate estimating optical flow as an [[Optimization problem|optimization problem]], where the goal is to minimize the cost function of the form,
:<math>E = \iint_\Omega \Psi(I(x + u, y + v, t + 1) - I(x, y, t)) + \alpha \Psi(|\nabla u|) + \alpha \Psi(|\nabla v|) dx dy, </math>
where <math>\Omega</math> is the extent of the images <math>I(x, y)</math>, <math>\nabla</math> is the gradient operator, <math>\alpha</math> is a constant, and <math>\Psi()</math> is a [[loss function]].<ref name="Fortun_Survey_2015" /><ref name="Brox_2004" />
 
This optimisation problem is difficult to solve owing to its non-linearity.
To address this issue, one can use a ''variational approach'' and linearise the brightness constancy constraint using a first order [[Taylor series]] approximation. Specifically, the brightness constancy constraint is approximated as,
:<math>\frac{\partial I}{\partial x}u+\frac{\partial I}{\partial y}v+\frac{\partial I}{\partial t} = 0.</math>
For convenience, the derivatives of the image, <math>\tfrac{\partial I}{\partial x}</math>, <math>\tfrac{\partial I}{\partial y}</math> and <math>\tfrac{\partial I}{\partial t}</math> are often condensed to become <math>I_x</math>, <math>I_y</math> and <math> I_t</math>.
Doing so allows one to rewrite the linearised brightness constancy constraint as,<ref name="Baker_2011" />
:<math>I_x u + I_y v+ I_t = 0.</math>
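As a quick numerical check of the linearised constraint, the following NumPy sketch builds a smooth synthetic frame pair related by a small known translation and evaluates <math>I_x u + I_y v + I_t</math>. The test pattern, the shift values and the use of <code>np.gradient</code> and a plain frame difference for the derivatives are assumptions made purely for illustration.

```python
import numpy as np

# Hypothetical smooth test pattern; any slowly varying image would do.
def pattern(x, y):
    return np.sin(0.1 * x) + np.cos(0.07 * y)

u, v = 0.5, 0.3                      # assumed ground-truth flow in pixels per frame
y, x = np.mgrid[0:64, 0:64].astype(float)

I1 = pattern(x, y)                   # frame at time t
I2 = pattern(x - u, y - v)           # frame at time t + 1, so I2(x + u, y + v) = I1(x, y)

I_y, I_x = np.gradient(I1)           # spatial derivatives (axis 0 is y, axis 1 is x)
I_t = I2 - I1                        # temporal derivative as a frame difference

residual = I_x * u + I_y * v + I_t   # linearised brightness constancy residual
print("largest |residual|:", np.abs(residual).max())
print("largest |I_t|:     ", np.abs(I_t).max())
```

For this small shift the residual is much smaller than the temporal derivative itself, which is the sense in which the first-order approximation holds; larger shifts or sharper image structure would make the residual grow.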
The optimization problem can now be rewritten as
:<math>E = \iint_\Omega \Psi(I_x u + I_y v + I_t) + \alpha \Psi(|\nabla u|) + \alpha \Psi(|\nabla v|) dx dy. </math>
For the choice of <math>\Psi(x) = x^2</math>, this method is the same as the [[Horn-Schunck method]].<ref name="Horn_1980" />
Other choices of cost function have also been used, such as <math>\Psi(x) = \sqrt{x^2 + \epsilon^2}</math>, which is a differentiable variant of the [[Taxicab geometry |<math>L^1</math> norm]].<ref name="Fortun_Survey_2015" /><ref>{{cite conference |url=https://ieeexplore.ieee.org/abstract/document/5539939 |title=Secrets of optical flow estimation and their principles |last1=Sun |first1=Deqing |last2=Roth |first2=Stefan |last3=Black |first3=Michael J. |date=2010 |publisher=IEEE |book-title=2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition |pages=2432–2439 |location=San Francisco, CA, USA |conference=2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition}}</ref>
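The difference between the two penalty functions can be seen by evaluating them on a few residual values. This is a minimal sketch; the function names and the value of <math>\epsilon</math> are illustrative assumptions.

```python
import numpy as np

def quadratic(x):
    """Psi(x) = x^2, the penalty that gives the Horn-Schunck energy."""
    return x ** 2

def smooth_l1(x, eps=1e-3):
    """Psi(x) = sqrt(x^2 + eps^2), a differentiable approximation of |x|."""
    return np.sqrt(x ** 2 + eps ** 2)

residuals = np.array([0.0, 0.1, 1.0, 10.0])
print(quadratic(residuals))   # grows with the square of the residual
print(smooth_l1(residuals))   # stays close to |x| away from zero
```

Because the second penalty grows only linearly, a single large residual (for example at a motion boundary) contributes far less to the total cost than it would under the quadratic penalty.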
 
To solve the aforementioned optimization problem, one can use the [[Euler-Lagrange equations]] to provide a system of partial differential equations for each point in <math>I(x, y, t)</math>. In the simplest case of using <math>\Psi(x) = x^2</math>, these equations are,
Line 53 ⟶ 49:
Doing so yields a system of linear equations which can be solved for <math>(u, v)</math> at each pixel, using an iterative scheme such as [[Gauss-Seidel]].<ref name="Horn_1980" />
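The iterative solution can be sketched in NumPy as follows. This is a simplified illustration in the spirit of the Horn-Schunck iteration, not a reference implementation: it updates all pixels simultaneously (a Jacobi-style sweep rather than Gauss-Seidel), approximates the Laplacian with a four-neighbour mean whose constant factor is absorbed into <code>alpha</code>, and estimates the derivatives with simple finite differences. The function names and default parameter values are assumptions.

```python
import numpy as np

def image_derivatives(I1, I2):
    """Return I_x, I_y, I_t estimated with simple finite differences."""
    I1 = np.asarray(I1, dtype=float)
    I2 = np.asarray(I2, dtype=float)
    I_y, I_x = np.gradient(I1)          # axis 0 is y, axis 1 is x
    return I_x, I_y, I2 - I1

def neighbour_mean(f):
    """Mean of the four direct neighbours, replicating values at the border."""
    p = np.pad(f, 1, mode="edge")
    return 0.25 * (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:])

def horn_schunck_sketch(I1, I2, alpha=0.01, n_iter=100):
    """Estimate a dense flow field (u, v) for the quadratic penalty."""
    I_x, I_y, I_t = image_derivatives(I1, I2)
    u = np.zeros_like(I_x)
    v = np.zeros_like(I_x)
    denom = alpha + I_x ** 2 + I_y ** 2
    for _ in range(n_iter):
        u_bar, v_bar = neighbour_mean(u), neighbour_mean(v)
        # Linearised brightness constancy residual at the locally averaged flow.
        r = (I_x * u_bar + I_y * v_bar + I_t) / denom
        # Solve the 2x2 linear system at every pixel given the neighbour means.
        u = u_bar - I_x * r
        v = v_bar - I_y * r
    return u, v
```

A larger <code>alpha</code> gives a smoother flow field at the cost of a looser fit to the brightness constancy constraint.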
 
Although linearising the brightness constancy constraint simplifies the optimisation problem significantly, the linearisation is only valid for small displacements and/or smooth images. To avoid this problem, a multi-scale or coarse-to-fine approach is often used. In such a scheme, the images are initially [[downsampling|downsampled]] and the linearised Euler-Lagrange equations are then solved at the reduced resolution. The estimated flow field at this scale is then used to initialise the process at the next scale.<ref name="Meinhardt-Llopis_2013">{{cite journal |last1=Meinhardt-Llopis |first1=Enric |last2=Pérez |first2=Javier Sánchez |last3=Kondermann |first3=Daniel |title=Horn-Schunck Optical Flow with a Multi-Scale Strategy |journal=Image Processing On Line |date=19 July 2013 |volume=3 |pages=151–172 |doi=10.5201/ipol.2013.20}}</ref> This initialisation process is often performed by [[image warping|warping]] one frame using the current estimate of the flow field so that it is as similar to the other frame as possible.<ref name="Brox_2004" /> <ref name="Black_1996">{{cite journal |last1=Black |first1=Michael J. |last2=Anandan |first2=P. |title=The Robust Estimation of Multiple Motions: Parametric and Piecewise-Smooth Flow Fields |journal=Computer Vision and Image Understanding |date=1 January 1996 |volume=63 |issue=1 |pages=75–104 |doi=10.1006/cviu.1996.0006 |issn=1077-3142}}</ref>
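A coarse-to-fine scheme of this kind can be sketched as a small driver around any single-scale solver. Everything below is an illustrative assumption rather than a published implementation: the pyramid uses 2×2 block averaging, the flow is upsampled by pixel repetition, the second frame is warped with bilinear interpolation, and <code>solve</code> is a hypothetical callable that returns a flow increment <code>(du, dv)</code> for a pair of frames.

```python
import numpy as np

def downsample(img):
    """Halve the resolution by averaging 2x2 blocks."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample_flow(f, shape):
    """Double the resolution of one flow component and rescale its values."""
    up = 2.0 * np.repeat(np.repeat(f, 2, axis=0), 2, axis=1)
    pad_y, pad_x = shape[0] - up.shape[0], shape[1] - up.shape[1]
    return np.pad(up, ((0, pad_y), (0, pad_x)), mode="edge")

def warp(img, u, v):
    """Sample img at (x + u, y + v) with bilinear interpolation."""
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w].astype(float)
    x = np.clip(xx + u, 0, w - 1)
    y = np.clip(yy + v, 0, h - 1)
    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    x1, y1 = np.minimum(x0 + 1, w - 1), np.minimum(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * img[y0, x0] + fx * img[y0, x1]
    bottom = (1 - fx) * img[y1, x0] + fx * img[y1, x1]
    return (1 - fy) * top + fy * bottom

def coarse_to_fine(I1, I2, solve, levels=3):
    """Refine the flow from the coarsest pyramid level to the finest."""
    pyramid = [(np.asarray(I1, dtype=float), np.asarray(I2, dtype=float))]
    for _ in range(levels - 1):
        a, b = pyramid[-1]
        pyramid.append((downsample(a), downsample(b)))
    u = np.zeros(pyramid[-1][0].shape)
    v = np.zeros_like(u)
    for a, b in reversed(pyramid):
        if u.shape != a.shape:
            u, v = upsample_flow(u, a.shape), upsample_flow(v, a.shape)
        # Warp the second frame towards the first, then estimate the remainder.
        du, dv = solve(a, warp(b, u, v))
        u, v = u + du, v + dv
    return u, v
```

Because each level only has to recover the motion left over after warping, the displacement seen by the linearised solver stays small even when the total motion is large.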
 
An alternate approach is to discretize the optimisation problem and then perform a search of the possible <math>(u, v)</math> values without linearising it.<ref name="Steinbrucker_2009">{{cite conference |url=https://ieeexplore.ieee.org/document/5459364 |title=Large Displacement Optical Flow Computation without Warping |last1=Steinbrücker |first1=Frank |last2=Pock |first2=Thomas |last3=Cremers |first3=Daniel |last4=Weickert |first4=Joachim |date=2009 |publisher=IEEE |book-title=2009 IEEE 12th International Conference on Computer Vision |pages=1609–1614 |conference=2009 IEEE 12th International Conference on Computer Vision}}</ref>
This search is often performed using [[Max-flow min-cut theorem]] algorithms, linear programming or [[belief propagation]] methods.
 
Line 67 ⟶ 63:
\hat{\boldsymbol{\alpha}} = \arg \min_{\boldsymbol{\alpha}} \sum_{(x, y) \in \mathcal{R}} g(x, y) \rho(x, y, I_1, I_2, u_{\boldsymbol{\alpha}}, v_{\boldsymbol{\alpha}}),
</math>
where <math>{\boldsymbol{\alpha}}</math> is the set of parameters determining the motion in the region <math>\mathcal{R}</math>, <math>\rho()</math> is the data cost term, <math>g()</math> is a weighting function that determines the influence of pixel <math>(x, y)</math> on the total cost, and <math>I_1</math> and <math>I_2</math> are frames 1 and 2 from a pair of consecutive frames.<ref name="Fortun_Survey_2015" />
 
The simplest parametric model is the [[Lucas-Kanade method]]. This uses rectangular regions and parameterises the motion as purely translational. The Lucas-Kanade method uses the original brightness constancy constraint as the data cost term and selects <math>g(x, y) = 1</math>.
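For a single rectangular region, a translational estimate of this kind can be sketched as a small least-squares problem. The sketch below assumes the common textbook formulation in which the brightness constancy constraint is used in its linearised form <math>I_x u + I_y v + I_t = 0</math> with uniform weights; the function name, the window parameters and the finite-difference derivatives are illustrative assumptions.

```python
import numpy as np

def lucas_kanade_window(I1, I2, top, left, size):
    """Estimate one translation (u, v) for a square window of two frames."""
    I1 = np.asarray(I1, dtype=float)
    I2 = np.asarray(I2, dtype=float)
    I_y, I_x = np.gradient(I1)          # axis 0 is y, axis 1 is x
    I_t = I2 - I1
    win = (slice(top, top + size), slice(left, left + size))
    # One equation I_x u + I_y v = -I_t per pixel, all weighted equally (g = 1).
    A = np.stack([I_x[win].ravel(), I_y[win].ravel()], axis=1)
    b = -I_t[win].ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v

# Usage on a synthetic pair with an assumed shift of (0.5, 0.3) pixels.
y, x = np.mgrid[0:64, 0:64].astype(float)
frame = lambda x, y: np.sin(0.1 * x) + np.cos(0.07 * y)
print(lucas_kanade_window(frame(x, y), frame(x - 0.5, y - 0.3), 8, 8, 48))
```

The least-squares system is only well conditioned when the window contains gradients in more than one direction, which is why regions of uniform brightness or a single straight edge give unreliable estimates.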
Line 81 ⟶ 76:
Instead of seeking to model optical flow directly, one can train a [[machine learning]] system to estimate optical flow. Since 2015, when FlowNet<ref>{{Cite conference |last=Dosovitskiy |first=Alexey |last2=Fischer |first2=Philipp |last3=Ilg |first3=Eddy |last4=Hausser |first4=Philip |last5=Hazirbas |first5=Caner |last6=Golkov |first6=Vladimir |last7=Smagt |first7=Patrick van der |last8=Cremers |first8=Daniel |last9=Brox |first9=Thomas |date=2015 |title=FlowNet: Learning Optical Flow with Convolutional Networks |url=https://ieeexplore.ieee.org/document/7410673/ |publisher=IEEE |pages=2758–2766 |doi=10.1109/ICCV.2015.316 |isbn=978-1-4673-8391-2 | conference=2015 IEEE International Conference on Computer Vision (ICCV)}}</ref> was proposed, learning-based models have been applied to optical flow and have gained prominence. Initially, these approaches were based on [[Convolutional neural network|convolutional neural networks]] arranged in a [[U-Net]] architecture. However, with the advent of the [[Transformer (deep learning architecture)|transformer architecture]] in 2017, transformer-based models have also risen to prominence.<ref>{{Cite journal |last=Alfarano |first=Andrea |last2=Maiano |first2=Luca |last3=Papa |first3=Lorenzo |last4=Amerini |first4=Irene |date=2024 |title=Estimating optical flow: A comprehensive review of the state of the art |url=https://linkinghub.elsevier.com/retrieve/pii/S1077314224002418 |journal=Computer Vision and Image Understanding |language=en |volume=249 |pages=104160 |doi=10.1016/j.cviu.2024.104160}}</ref>
 
Most learning-based approaches to optical flow use [[supervised learning]]. In this case, many frame pairs of video data and their corresponding [[ground truth|ground-truth]] flow fields are used to optimise the parameters of the learning-based model to accurately estimate optical flow. This process often relies on vast training datasets due to the number of parameters involved.<ref name="Tu_2019_Survey">{{cite journal |last1=Tu |first1=Zhigang |last2=Xie |first2=Wei |last3=Zhang |first3=Dejun |last4=Poppe |first4=Ronald |last5=Veltkamp |first5=Remco C. |last6=Li |first6=Baoxin |last7=Yuan |first7=Junsong |title=A survey of variational and CNN-based optical flow techniques |journal=Signal Processing: Image Communication |date=1 March 2019 |volume=72 |pages=9–24 |doi=10.1016/j.image.2018.12.002}}</ref>
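The text above does not name a particular training objective. One widely used way of comparing an estimated flow field with its ground truth is the average endpoint error, sketched below as an assumed example; the function name is hypothetical.

```python
import numpy as np

def average_endpoint_error(u_pred, v_pred, u_true, v_true):
    """Mean Euclidean distance between predicted and ground-truth flow vectors."""
    return float(np.mean(np.hypot(u_pred - u_true, v_pred - v_true)))

# Usage: a prediction of zero motion against a uniform true flow of (3, 4).
shape = (4, 4)
error = average_endpoint_error(np.zeros(shape), np.zeros(shape),
                               np.full(shape, 3.0), np.full(shape, 4.0))
print(error)
```

In a supervised setting a quantity of this kind would be averaged over many frame pairs and minimised with respect to the model parameters.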
 
== Uses ==
Line 98 ⟶ 93:
{{distinguish|Optical flowmeter}}
 
Various configurations of optical flow sensors exist. One configuration is an image sensor chip connected to a processor programmed to run an optical flow algorithm. Another configuration uses a vision chip, which is an integrated circuit having both the [[image sensor]] and the processor on the same die, allowing for a compact implementation.<ref>{{Cite book |title=Vision Chips |last=Moini |first=Alireza |date=2000 |publisher=Springer US |isbn=9781461552673 |location=Boston, MA |oclc=851803922}}</ref><ref>{{Cite book |title=Analog VLSI and neural systems |last=Mead |first=Carver |date=1989 |publisher=Addison-Wesley |isbn=0201059924 |location=Reading, Mass. |oclc=17954003 |url-access=registration |url=https://archive.org/details/analogvlsineural00mead }}</ref> An example of this is a generic optical mouse sensor used in an [[optical mouse]]. In some cases the processing circuitry may be implemented using analog or mixed-signal circuits to enable fast optical flow computation using minimal current consumption.
 
One area of contemporary research is the use of [[neuromorphic engineering]] techniques to implement circuits that respond to optical flow, and thus may be appropriate for use in an optical flow sensor.<ref>{{Cite book |title=Analog VLSI circuits for the perception of visual motion |last=Stocker |first=Alan A. |date=2006 |publisher=John Wiley & Sons |isbn=0470034882 |location=Chichester, England |oclc=71521689}}</ref> Such circuits may draw inspiration from biological neural circuitry that similarly responds to optical flow.
Line 116 ⟶ 111:
== References ==
 
{{reflist|1=2}}
 
== External links ==