Graph cuts in computer vision: Difference between revisions

Revision as of 18:33, 7 August 2021 by Normal Name (Extended confirmed users, 18,371 edits): m Fixed empty "not a typo" templates
Latest revision as of 08:06, 20 August 2025 by Bender the Bot (Bots, 1,064,377 edits): m →History: HTTP to HTTPS for Brown University (Tag: AWB)
(11 intermediate revisions by 5 users not shown)
Line 1:
As applied in the field of [[computer vision]], '''[[graph cut optimization]]''' can be employed to [[Polynomial time|efficiently]] solve a wide variety of low-level computer vision problems (''early vision''<ref>Adelson, Edward H., and James R. Bergen (1991), "[http://persci.mit.edu/pub_pdfs/elements91.pdf The plenoptic function and the elements of early vision]", Computational models of visual processing 1.2 (1991).</ref>), such as [[image smoothing]], the stereo [[correspondence problem]], [[image segmentation]], [[object co-segmentation]], and many other computer vision problems that can be formulated in terms of [[energy minimization]].

Many of these energy minimization problems can be approximated by solving a [[maximum flow problem]] in a [[Graph (discrete mathematics)|graph]]<ref>Boykov, Y., Veksler, O., and Zabih, R. (2001), "[https://www.cs.cornell.edu/rdz/Papers/BVZ-pami01-final.pdf Fast approximate energy minimization via graph cuts]," ''IEEE Transactions on Pattern Analysis and Machine Intelligence,'' 23(11): 1222-1239.</ref> (and thus, by the [[max-flow min-cut theorem]], define a minimal [[cut (graph theory)|cut]] of the graph).

Under most formulations of such problems in computer vision, the minimum energy solution corresponds to the [[Maximum a posteriori estimation|maximum a posteriori estimate]] of a solution. Although many computer vision algorithms involve cutting a graph (e.g., normalized cuts), the term "graph cuts" is applied specifically to those models which employ a max-flow/min-cut optimization (other graph cutting algorithms may be considered as [[graph partition]]ing algorithms).

"Binary" problems (such as [[denoising]] a [[binary image]]) can be solved exactly using this approach; problems where pixels can be labeled with more than two different labels (such as stereo correspondence, or denoising of a [[grayscale]] image) cannot be solved exactly, but solutions produced are usually near the global optimum.

== History ==
The foundational theory of [[Cut (graph theory)|graph cuts]] was first applied in [[computer vision]] in the seminal paper by Greig, Porteous and Seheult<ref name="D.M. Greig, B.T 1989">D.M. Greig, B.T. Porteous and A.H.
Seheult (1989), ''[https://classes.cs.uchicago.edu/archive/2006/fall/35040-1/gps.pdf Exact maximum a posteriori estimation for binary images]'', Journal of the Royal Statistical Society, Series B, '''51''', 271–279.</ref> of [[Durham University]]. Allan Seheult and Bruce Porteous were members of Durham's lauded statistics group of the time, led by [[Julian Besag]] and [[Peter Green (statistician)|Peter Green]], with the optimisation expert [[Margaret Greig]] notable as the first ever female member of staff of the Durham Mathematical Sciences Department.

In the [[Bayesian statistics|Bayesian]] statistical context of [[smoothing]] noisy (or corrupted) images, they showed how the [[MAP estimate|maximum a posteriori estimate]] of a [[binary image]] can be obtained ''exactly'' by maximizing the [[Flow network|flow]] through an associated image network, involving the introduction of a ''source'' and ''sink''. The problem was therefore shown to be efficiently solvable. Prior to this result, ''approximate'' techniques such as [[simulated annealing]] (as proposed by the [[Donald Geman|Geman brothers]])<ref>D. Geman and S. Geman (1984), ''[https://www.dam.brown.edu/people/documents/stochasticrelaxation.pdf Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images]'', IEEE Trans. Pattern Anal. Mach. Intell., '''6''', 721–741.</ref> or [[iterated conditional modes]] (a type of [[greedy algorithm]] as suggested by [[Julian Besag]])<ref>J.E. Besag (1986), ''On the statistical analysis of dirty pictures (with discussion)'', [[Journal of the Royal Statistical Society]] Series B, '''48''', 259–302</ref> were used to solve such image smoothing problems.

Although the general [[Graph coloring|<math>k</math>-colour problem]] is NP-hard for <math>k > 2,</math> the approach of Greig, Porteous and Seheult<ref name="D.M. Greig, B.T 1989" /> has turned out<ref>Y. Boykov, O. Veksler and R.
Zabih (1998), "[https://www.cs.cornell.edu/~rdz/Papers/BVZ-cvpr98.pdf Markov Random Fields with Efficient Approximations]", ''International Conference on Computer Vision and Pattern Recognition (CVPR)''.</ref><ref name="boykov2001fast">Y. Boykov, O. Veksler and R. Zabih (2001), "[https://www.cs.cornell.edu/~rdz/Papers/BVZ-pami01-final.pdf Fast approximate energy minimization via graph cuts]", ''IEEE Transactions on Pattern Analysis and Machine Intelligence'', '''23''', 1222–1239.</ref> to have wide applicability in general computer vision problems. For general problems, Greig, Porteous and Seheult's approach is often applied iteratively to sequences of related binary problems, usually yielding near optimal solutions.

In 2011, C. Couprie ''et al''.<ref>Camille Couprie, Leo Grady, Laurent Najman and Hugues Talbot, "[http://leogrady.net/wp-content/uploads/2017/01/couprie2011power.pdf Power Watersheds: A Unifying Graph-Based Optimization Framework]", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 33, No. 7, pp. 1384-1399, July 2011</ref> proposed a general image segmentation framework, called the "Power Watershed", that minimizes a real-valued [[indicator function]] with values in [0,1] over a graph, constrained by user seeds (or unary terms) set to 0 or 1, in which the minimization is parameterized by an exponent <math>p</math>. When <math>p=1</math>, the Power Watershed is optimized by graph cuts; when <math>p=0</math>, by shortest paths; when <math>p=2</math>, by the [[random walker algorithm]]; and when <math>p=\infty</math>, by the [[Watershed (image processing)|watershed]] algorithm. In this way, the Power Watershed may be viewed as a generalization of graph cuts that provides a straightforward connection with other energy optimization segmentation/clustering algorithms.
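
The exact binary result of Greig, Porteous and Seheult described in this section can be illustrated with a minimal sketch in Python. This is not the authors' implementation. It assumes a simple Ising-style energy in which each pixel pays a hypothetical <code>data_weight</code> for disagreeing with the observed image and each pair of 4-connected neighbours pays a hypothetical <code>smooth_weight</code> for taking different labels. The function names <code>max_flow_min_cut</code> and <code>denoise_binary</code> are illustrative, and the maximum flow is computed with a plain Edmonds–Karp search rather than a vision-specific solver.

```python
from collections import deque


def max_flow_min_cut(n, edges, s, t):
    """Edmonds-Karp max flow on n nodes.

    edges is a list of (u, v, capacity). Returns the flow value and the set
    of nodes still reachable from s in the final residual graph (the source
    side of a minimum cut).
    """
    cap = [dict() for _ in range(n)]          # residual capacities
    for u, v, c in edges:
        cap[u][v] = cap[u].get(v, 0) + c
        cap[v].setdefault(u, 0)               # reverse residual edge
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow, set(parent)          # no path left: cut found
        # Find the bottleneck capacity along the path, then push flow.
        bottleneck = float("inf")
        v = t
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        flow += bottleneck


def denoise_binary(image, data_weight=2, smooth_weight=1):
    """Exact minimum-energy labelling of a binary image via a minimum cut.

    Returns the restored image and the minimum energy. The energy model and
    both weights are assumptions made for this illustration.
    """
    rows, cols = len(image), len(image[0])
    source, sink = rows * cols, rows * cols + 1

    def node(r, c):
        return r * cols + c

    edges = []
    for r in range(rows):
        for c in range(cols):
            i = node(r, c)
            if image[r][c] == 1:
                edges.append((source, i, data_weight))  # cut if labelled 0
            else:
                edges.append((i, sink, data_weight))    # cut if labelled 1
            for rr, cc in ((r, c + 1), (r + 1, c)):     # right and down
                if rr < rows and cc < cols:
                    j = node(rr, cc)
                    edges.append((i, j, smooth_weight))
                    edges.append((j, i, smooth_weight))
    energy, source_side = max_flow_min_cut(rows * cols + 2, edges, source, sink)
    labels = [[1 if node(r, c) in source_side else 0 for c in range(cols)]
              for r in range(rows)]
    return labels, energy


noisy = [[1, 1, 1],
         [1, 0, 1],
         [1, 1, 1]]
restored, energy = denoise_binary(noisy, data_weight=2, smooth_weight=1)
print(restored, energy)  # [[1, 1, 1], [1, 1, 1], [1, 1, 1]] 2
```

In this example the isolated pixel is flipped because changing it costs 2, whereas keeping it would cost 4 in neighbour penalties. With <code>data_weight=5</code> the same pixel is kept, since disagreeing with the observation then costs more than the smoothness penalty.
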
==Binary segmentation of images==
Line 76 ⟶ 82:
{{see also|Graph cut optimization}}
* Minimization is done using a standard minimum cut algorithm.
* By the [[max-flow min-cut theorem]], the energy can be minimized by maximizing the flow over the network. The [[max-flow problem]] is posed on a directed graph whose edges are labeled with capacities and which has two distinguished nodes: the source and the sink. Intuitively, it is easy to see that the maximum flow is determined by the bottleneck.

=== Implementation (exact) ===
{{Wikibooks|Algorithm Implementation|Graphs/Maximum flow/Boykov & Kolmogorov}}
The Boykov-Kolmogorov algorithm<ref>Yuri Boykov, Vladimir Kolmogorov: [http://discovery.ucl.ac.uk/13383/1/13383.pdf An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision]. IEEE Trans. Pattern Anal. Mach. Intell. 26(9): 1124–1137 (2004)</ref> is an efficient way to compute the max-flow for computer vision-related graphs.

=== Implementation (approximation) ===
The Sim Cut algorithm<ref>P.J. Yim: "[https://patentimages.storage.googleapis.com/2b/1e/e9/5834a9cc3312a0/US9214029.pdf Method and System for Image Segmentation]," United States Patent US8929636, January 6, 2016</ref> approximates the minimum graph cut. The algorithm implements a solution by simulation of an electrical network. This is the approach suggested by [[Cederbaum's maximum flow theorem]].<ref>{{Cite journal|last=Cederbaum|first=I.|date=1962-08-01|title=On optimal operation of communication nets|journal=Journal of the Franklin Institute|volume=274|issue=2|pages=130–141|doi=10.1016/0016-0032(62)90401-5|issn=0016-0032}}</ref><ref>I.T. Frisch, "On Electrical analogs for flow networks," Proceedings of IEEE, 57:2, pp. 209-210, 1969</ref> Acceleration of the algorithm is possible through [[parallel computing]].

== Software ==