Segmentation-based object categorization

The image segmentation problem is concerned with partitioning an image into multiple regions according to some homogeneity criterion. This article is primarily concerned with graph theoretic approaches to image segmentation.
 
<!-- Unsourced image removed: [[Image:aenep11.png|right|thumb|Figure 1: Input image.]] -->
<!-- Unsourced image removed: [[Image:aenep12.png|right|thumb|Figure 2: First partition.]] -->
<!-- Unsourced image removed: [[Image:aenep24.png|right|thumb|Figure 11: Segmentation results.]] -->
 
==Applications of image segmentation==
* '''Image Compression'''
** Segment the image into homogeneous components, and use the most suitable compression algorithm for each component to improve compression.
** Automatic analysis of remote sensing data from satellites to identify and measure regions of interest.
 
==Segmentation using normalized cuts==
===Graph theoretic formulation===
The set of points in an arbitrary feature space can be represented as a weighted undirected complete graph <math>G = (V, E)</math>, where the nodes of the graph are the points in the feature space. The weight <math>w_{ij}</math> of an edge <math>(i, j) \in E</math> is a function of the similarity between the nodes <math>i</math> and <math>j</math>. In this context, we can formulate the image segmentation problem as a graph partitioning problem that asks for a partition <math>V_1, \cdots, V_k</math> of the vertex set <math>V</math>, where, according to some measure, the vertices in any set <math>V_i</math> have high similarity, and the vertices in two different sets <math>V_i, V_j</math> have low similarity.
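As a concrete illustration, the sketch below builds such a weight matrix for a small grayscale image. The Gaussian similarity on intensity and pixel position follows the general form used by Shi and Malik, but the kernel widths, the connection radius, and the toy image are illustrative choices rather than values taken from the literature.

<syntaxhighlight lang="python">
import numpy as np

def pixel_similarity_graph(image, sigma_i=0.1, sigma_x=4.0, radius=5.0):
    """Weighted adjacency matrix for a grayscale image.

    Each pixel is a node; the weight between two pixels combines a
    Gaussian on intensity difference with a Gaussian on spatial
    distance, and is set to zero beyond a small spatial radius.
    """
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    coords = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    vals = image.ravel().astype(float)
    n = h * w

    W = np.zeros((n, n))
    for i in range(n):
        d_space = np.linalg.norm(coords - coords[i], axis=1)
        d_int = np.abs(vals - vals[i])
        w_row = np.exp(-(d_int ** 2) / sigma_i ** 2) * np.exp(-(d_space ** 2) / sigma_x ** 2)
        w_row[d_space > radius] = 0.0   # keep only spatially nearby connections
        W[i] = w_row
    np.fill_diagonal(W, 0.0)            # no self-loops
    return W

# Toy example: a bright square on a dark background.
img = np.zeros((8, 8))
img[2:6, 2:6] = 1.0
W = pixel_similarity_graph(img)
print(W.shape)   # (64, 64) symmetric weight matrix
</syntaxhighlight>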
 
===Normalized cuts===
Let G = (V, E) be a weighted graph. Let <math>A</math> and <math>B</math> be two subsets of vertices.
 
Let:
 
: <math>w(A, B) = \sum \limits_{i \in A, j \in B} w_{ij}</math>
 
: <math>ncut(A, B) = \frac{w(A, B)}{w(A, V)} + \frac{w(A, B)}{w(B, V)}</math>
 
: <math>nassoc(A, B) = \frac{w(A, A)}{w(A, V)} + \frac{w(B, B)}{w(B, V)}</math>
 
In the normalized cuts approach,<ref>Jianbo Shi and [[Jitendra Malik]] (1997): "Normalized Cuts and Image Segmentation", IEEE Conference on Computer Vision and Pattern Recognition, pp. 731–737.</ref> for any cut <math>(S, \overline{S})</math> in <math>G</math>, <math>ncut(S, \overline{S})</math> measures the similarity between different parts, and <math>nassoc(S, \overline{S})</math> measures the total similarity of vertices in the same part.
 
Since <math>ncut(S, \overline{S}) = 2 - nassoc(S, \overline{S})</math>, a cut <math>(S^{*}, {\overline{S}}^{*})</math> that minimizes <math>ncut(S, \overline{S})</math> also maximizes <math>nassoc(S, \overline{S})</math>.
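The following sketch evaluates these quantities on a small hand-made weight matrix (the matrix itself is only an illustrative example) and checks the identity <math>ncut(S, \overline{S}) = 2 - nassoc(S, \overline{S})</math> numerically.

<syntaxhighlight lang="python">
import numpy as np

def w_between(W, A, B):
    """Total edge weight between vertex sets A and B: w(A, B)."""
    return W[np.ix_(A, B)].sum()

def ncut(W, A, B):
    V = list(range(W.shape[0]))
    return w_between(W, A, B) / w_between(W, A, V) + w_between(W, A, B) / w_between(W, B, V)

def nassoc(W, A, B):
    V = list(range(W.shape[0]))
    return w_between(W, A, A) / w_between(W, A, V) + w_between(W, B, B) / w_between(W, B, V)

# Small symmetric weight matrix with two loose clusters {0, 1} and {2, 3}.
W = np.array([[0.0, 0.9, 0.1, 0.0],
              [0.9, 0.0, 0.0, 0.1],
              [0.1, 0.0, 0.0, 0.8],
              [0.0, 0.1, 0.8, 0.0]])
A, B = [0, 1], [2, 3]
print(ncut(W, A, B), nassoc(W, A, B))                       # small cut, large association
print(np.isclose(ncut(W, A, B) + nassoc(W, A, B), 2.0))     # ncut = 2 - nassoc
</syntaxhighlight>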
Computing a cut <math>(S^{*}, {\overline{S}}^{*})</math> that minimizes <math>ncut(S, \overline{S})</math> is an [[NP-hard]] problem. However, we can find in polynomial time a cut <math>(S, \overline{S})</math> of small normalized weight <math>ncut(S, \overline{S})</math> using [[Spectral graph theory|spectral techniques]].
 
===The Ncut algorithm===
Let <math>d_i = \sum \limits_j w_{ij}</math> be the total connection from node <math>i</math> to all other nodes, let <math>D</math> be the diagonal matrix with <math>d_1, \dots, d_n</math> on its diagonal, and let <math>W</math> be the symmetric matrix with <math>W_{ij} = w_{ij}</math>. Minimizing <math>ncut(S, \overline{S})</math> can then be relaxed to the generalized eigenvalue problem <math>(D - W)y = \lambda D y</math>: the eigenvector corresponding to the second smallest eigenvalue yields an approximate bipartition of the graph, and the procedure can be applied recursively to produce further parts.
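A minimal sketch of this spectral relaxation is given below. It solves the generalized eigenvalue problem through the symmetrically normalized Laplacian and splits the vertices by the sign of the second eigenvector; thresholding at zero is a simplification of the splitting-point search described by Shi and Malik, and the example weight matrix is again illustrative.

<syntaxhighlight lang="python">
import numpy as np

def ncut_bipartition(W):
    """Approximate the minimum normalized cut of a weighted graph.

    Solves the relaxed problem (D - W) y = lambda * D * y via the
    symmetrically normalized Laplacian, then thresholds the eigenvector
    with the second smallest eigenvalue to split the vertices in two.
    """
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    # Normalized Laplacian: D^{-1/2} (D - W) D^{-1/2}
    L_sym = D_inv_sqrt @ (np.diag(d) - W) @ D_inv_sqrt
    eigvals, eigvecs = np.linalg.eigh(L_sym)        # eigenvalues in ascending order
    y = D_inv_sqrt @ eigvecs[:, 1]                  # map back: y = D^{-1/2} z
    return y >= 0                                   # boolean mask: the two sides of the cut

W = np.array([[0.0, 0.9, 0.1, 0.0],
              [0.9, 0.0, 0.0, 0.1],
              [0.1, 0.0, 0.0, 0.8],
              [0.0, 0.1, 0.8, 0.0]])
print(ncut_bipartition(W))   # e.g. [ True  True False False ]
</syntaxhighlight>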
 
==OBJ CUT==
 
OBJ CUT<ref>M. P. Kumar, P. H. S. Torr, and A. Zisserman. OBJ CUT. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, San Diego, pages 18–25, 2005.</ref> is an efficient method that automatically segments an object. The method is generic and can therefore be used with any object category model.
Given an image D containing an instance of a known object category, e.g. cows, the OBJ CUT algorithm computes a segmentation of the object; that is, it infers a set of labels m.
 
Let m be a set of binary labels, and let <math>\Theta</math> be a shape parameter (<math>\Theta</math> is a shape prior on the labels from a [[Layered Pictorial Structure]] (LPS) model). We define an energy function <math>E(m, \Theta)</math> as follows.
 
: <math>E(m, \Theta) = \sum \limits_x \left( \phi_x(D|m_x) + \phi_x(m_x|\Theta) \right) + \sum \limits_{(x,y)} \left( \Psi_{xy}(m_x, m_y) + \phi(D|m_x, m_y) \right)</math> (1)
 
The term <math>\phi_x(D|m_x) + \phi_x(m_x|\Theta)</math> is called a unary term, and the term <math>\Psi_{xy}(m_x, m_y) + \phi(D|m_x, m_y)</math> is called a pairwise term.
The best labeling <math>m^{*}</math> minimizes <math>\sum \limits_i w_i E(m, \Theta_i)</math>, where <math>w_i</math> is the weight of the parameter <math>\Theta_i</math>.
 
: <math>m^{*} = \arg \min \limits_m \sum \limits_i w_i E(m, \Theta_i)</math> (2)
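The toy sketch below illustrates equations (1) and (2): it evaluates an energy made of unary and pairwise terms for every binary labeling of three pixels and picks the labeling that minimizes the weighted sum over two hypothetical shape samples. The potential tables, weights, and brute-force search are purely illustrative stand-ins; the actual OBJ CUT algorithm minimizes this energy efficiently with a single st-mincut (graph cut) over the image graph rather than by enumeration.

<syntaxhighlight lang="python">
import itertools

def energy(m, unary, pairs, pairwise):
    """Evaluate E(m, Theta) as a sum of unary and pairwise terms (eq. 1).

    unary[x][label] stands in for phi_x(D|m_x) + phi_x(m_x|Theta);
    pairwise[(m_x, m_y)] stands in for Psi_xy(m_x, m_y) + phi(D|m_x, m_y).
    Both tables are hypothetical, not the actual OBJ CUT potentials.
    """
    e = sum(unary[x][m[x]] for x in range(len(m)))
    e += sum(pairwise[(m[x], m[y])] for (x, y) in pairs)
    return e

# Three pixels in a chain, two shape samples Theta_1, Theta_2 with weights w_i.
pairs = [(0, 1), (1, 2)]
unaries = [                                   # one unary table per shape sample
    [[0.2, 1.0], [0.3, 0.6], [1.0, 0.1]],
    [[0.1, 1.2], [0.8, 0.4], [0.9, 0.2]],
]
weights = [0.6, 0.4]
pairwise = {(0, 0): 0.0, (1, 1): 0.0, (0, 1): 0.5, (1, 0): 0.5}   # Potts-style smoothness

# Eq. 2: choose the labeling minimizing the weighted sum of energies over samples.
best = min(itertools.product([0, 1], repeat=3),
           key=lambda m: sum(w * energy(m, u, pairs, pairwise)
                             for w, u in zip(weights, unaries)))
print(best)   # the labeling m* with the lowest weighted energy
</syntaxhighlight>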
 
===The OBJ CUT algorithm===
 
===Example===
Figures 8–11 exemplify the OBJ CUT algorithm.
 
==Other approaches==
* Jigsaw approach<ref>E. Borenstein, S. Ullman: Class-specific, top-down segmentation. In Proceedings of the 7th European Conference on Computer Vision, Copenhagen, Denmark, pages 109–124, 2002.</ref>
* Image parsing <ref>Z. Tu, X. Chen, A. L. Yuille, S. C. Zhu: Image Parsing: Unifying Segmentation, Detection, and Recognition. Toward Category-Level Object Recognition 2006: 545–576</ref>
* Interleaved segmentation <ref>B. Leibe, A. Leonardis, B. Schiele: An Implicit Shape Model for Combined Object Categorization and Segmentation. Toward Category-Level Object Recognition 2006: 508–524</ref>
* LOCUS <ref>J. Winn, N. Jojic: LOCUS: Learning object classes with unsupervised segmentation. In Proceedings of the IEEE International Conference on Computer Vision, Beijing, 2005.</ref>
* LayoutCRF <ref>J. M. Winn, J. Shotton: The Layout Consistent Random Field for Recognizing and Segmenting Partially Occluded Objects. CVPR (1) 2006: 37–44</ref>
* [[Minimum_Spanning_Tree-based_Segmentation|Minimum Spanning Tree-based segmentation]]