Revision as of 02:48, 23 September 2016 edit Dssathe (talk \| contribs) 30 edits added link Tag: Visual edit ← Previous edit		Revision as of 05:02, 23 September 2016 edit undo HyperGaruda (talk \| contribs) Extended confirmed users 10,127 edits Inline external links are strongly discouraged per WP:ELPOINTS#2 Next edit →
Line 1: {{Refimprove\|date=September 2016}} {{Machine learning bar}} The '''Hoshen–Kopelman algorithm''' is a simple and efficient [[algorithm]] for labeling [[Cluster analysis\|clusters]] on a grid. Where the grid is a regular [[network]] of cells,with the cells being either occupied or unoccupied. This algorithm is based on a well-known [[Disjoint-set data structure\|union-finding algorithm]].<ref> https://www.cs.princeton.edu/~rs/AlgsDS07/01UnionFind.pdf </ref> The algorithm was originally described in by J. Hoshen and R. Kopelman in their 1976 paper ~~[http://journals.aps.org/prb/abstract/10.1103/PhysRevB.14.3438~~ "Percolation and Cluster Distribution. I. Cluster Multiple Labeling Technique and Critical Concentration Algorithm]".<ref>{{cite journal\|url=http://link.aps.org/doi/10.1103/PhysRevB.14.3438\|title=Percolation and cluster distribution. I. Cluster multiple labeling technique and critical concentration algorithm\|first1=J.\|last1=Hoshen\|first2=R.\|last2=Kopelman\|date=15 October 1976\|publisher=\|journal=Phys. Rev. B\|volume=14\|issue=8\|pages=3438–3445\|via=APS\|doi=10.1103/PhysRevB.14.3438}}</ref> == Percolation theory == [[Percolation theory]] is the study of the behavior and [[statistics]] of [[Cluster\|clusters]] on [[Lattice graph\|lattices]]. Suppose we have a large square lattice where each cell can be occupied with the [[probability]] ~~<code>~~''p~~</code>~~'' and can be empty with the probability ~~<code>~~1 – ''p''~~</code>~~. Each group of neighboring occupied cells forms a cluster. Neighbors are defined as cells having a common side but not those sharing only a corner i.e. we consider ~~[[Pixel connectivity\|~~4x4 neighborhood~~]] that is~~. (top, bottom, left ~~and~~, right). Each occupied cell is ~~independent~~occupied independently of the status of its neighborhood. The number of clusters, ~~the~~a size of each cluster and their distribution are important topics in percolation theory. <center> {\| Line 16 ⟶ 17: == Hoshen–Kopelman algorithm for cluster finding == In this algorithm, we scan through a grid looking for occupied cells and labeling them with cluster labels~~. The scanning process is called as [[Raster scan\|Raster Scan]]~~. The algorithm begins with scanning the grid cell by cell and check if the cell is occupied, ~~or not. If~~if the cell is occupied, then itthis cell must be ~~labeled~~labelled with a cluster label. This cluster label is decided based on the neighbors of ~~that~~the cell. ~~(For~~which ~~this~~have webeen ~~are~~previously ~~going~~scanned toand ~~use~~labelled, ~~[[Union-find~~and ~~algorithm\|Union-Find Algorithm]] which is explained in the next section.) If~~if the cell doesn’t have any occupied neighbors then~~, a~~ new label is assigned to the cell.~~<ref name=":0" />~~ == Union-find algorithm == ~~This~~The union-find algorithm is a simple method for computing [[equivalence class]]es. Calling the function <code>union(x,y)</code> specifies that, items <code>x</code> and <code>y</code> are members of the same equivalence class. Because equivalence relations are [[Transitive relation\|transitive]]; all ~~the~~ items equivalent to <code>x</code> are equivalent to all ~~the~~ items equivalent to <code>y</code>. Thus for any item <code>x</code>, there is a set of items which are all equivalent to <code>x</code> .are; ~~This~~this set is the equivalence class of which <code>x</code> is a member. A second function <code>find(x)</code> returns a representative member of the equivalence class to which <code>x</code> belongs. == Pseudo-code == ~~During~~So in the ~~[[Raster~~raster scan~~\|Raster Scan]]~~ of the grid in question, ~~whenever~~each time an occupied cell is encountered, ~~neighboring~~a ~~cells~~check ~~are~~is ~~scanned~~done to ~~check~~see whether this cell has any ofneighboring ~~them~~cells who have already been scanned. If ~~we find already scanned neighbors~~so, ~~the~~first a <code>union</code> operation is performed, to specify that these neighboring cells are in fact members of the same equivalence class. Then ~~the~~a <code>find</code> operation is performed to find a representative member of that equivalence class with which to label the current cell. ~~will~~If beon ~~labeled~~the other hand, the current cell has no neighbors, it is assigned a new, previously unused, label. The entire grid is processed in this way. The grid can be raster-scanned a second time, performing only <code>find</code> operations at each cell, to re-label the cells with their final assignment of a representative element. Following ~~[[Pseudocode\|~~pseudo-code]] is referred from ~~[https://www.ocf.berkeley.edu/~fricke/~~one ~~Tobin~~of ~~Fricke's]~~the ~~implementation~~University of ~~the~~California ~~same~~Berkeley's ~~algorithm~~projects.<ref ~~name=":0"~~>{{cite web\|url=https://www.ocf.berkeley.edu/~fricke/projects/hoshenkopelman/hoshenkopelman.html \|title=The Hoshen-Kopelman Algorithm \|website=Ocf.berkeley.edu \|date=2004-04-21 \|accessdate=2016-09-17}}</ref>▼ ~~On the other hand,if the current cell has no neighbors, it is assigned a new, previously unused, label. The entire grid is processed in this way.~~ ▲Following [[Pseudocode\|pseudo-code]] is referred from [https://www.ocf.berkeley.edu/~fricke/ Tobin Fricke's] implementation of the same algorithm.<ref name=":0">{{cite web\|url=https://www.ocf.berkeley.edu/~fricke/projects/hoshenkopelman/hoshenkopelman.html \|title=The Hoshen-Kopelman Algorithm \|website=Ocf.berkeley.edu \|date=2004-04-21 \|accessdate=2016-09-17}}</ref> <strong>Raster Scan and Labeling on the Grid</strong> largest_label = 0; Line 47 ⟶ 45: } <strong>Union</strong> void union(int x, int y) { labels[find(x)] = find(y); }~~<br>~~ <strong>Find</strong> int find(int x) { Line 73: * <code>grid[0][3]</code> is occupied so check cell to the left which is unoccupied so we increment the current label value and assign the label to the cell as <code>2</code>. * <code>grid[0][4]</code>, <code>grid[0][5]</code> and <code>grid[1][0]</code> are unoccupied so they are not labeled. * <code>grid[1][1]</code> is occupied so check cell to the left and above, both the cells are unoccupied so assign a new label <code>3</code>. * <code>grid[1][2]</code> is occupied so check cell to the left and above, only ~~the~~ cell to the left is occupied so assign the label of a cell on ~~the~~ left to this cell <code>3</code>. * <code>grid[1][3]</code> is occupied so check cell to the left and above, both the cells are occupied, so merge the two clusters and assign the cluster label of ~~the~~ cell above to the cell on ~~the~~ left and to this cell i.e. <code>2</code>. (Merging using union algorithm will label all the cells with label <code>3</code> to <code>2</code>) * <code>grid[1][4]</code> is occupied so check cell to the left and above, only ~~the~~ cell to the left is occupied so assign the label of a cell on ~~the~~ left to this cell <code>2</code>. * <code>grid[1][5]</code> , <code>grid[2][0]</code> and <code>grid[2][1]</code> are unoccupied so they are not labeled. * <code>grid[2][2]</code> is occupied so check cell to the left and above, only cell above is occupied so assign the label of ~~the~~ cell above to this cell <code>2</code>. * <code>grid[2][3]</code> , <code>grid[2][4]</code> and <code>grid[2][5]</code> are unoccupied so they are not labeled. * <code>grid[3][0]</code> is occupied so check cell above which is unoccupied so we increment the current label value and assign the label to the cell as <code>4</code>. * <code>grid[3][1]</code> is occupied so check cell to the left and above, only ~~the~~ cell to the left is occupied so assign the label of ~~the~~ cell on ~~the~~ left to this cell <code>4</code>. * <code>grid[3][2]</code> is unoccupied so it is not labeled. * <code>grid[3][3]</code> is occupied so check cell to the left and above, both the cells are unoccupied so assign a new label <code>5</code>. * <code>grid[3][4]</code> is occupied so check cell to the left and above, only ~~the~~ cell to the left is occupied so assign the label of ~~the~~ cell on ~~the~~ left to this cell <code>5</code>. * <code>grid[3][5]</code> , <code>grid[4][0]</code> and <code>grid[4][1]</code> are unoccupied so they are not labeled. * <code>grid[4][2]</code> is occupied so check cell to the left and above, both the cells are unoccupied so assign a new label <code>6</code>. * <code>grid[4][3]</code> is occupied so check cell to the left and above, both, cell to the left and above are occupied so merge the two clusters and assign the cluster label of ~~the~~ cell above to the cell on ~~the~~ left and to this cell i.e. <code>5</code>. (Merging using union algorithm will label all the cells with label <code>6</code> to <code>5</code>). * <code>grid[4][4]</code> is unoccupied so it is not labeled. * <code>grid[4][5]</code> is occupied so check cell to the left and above, both the cells are unoccupied so assign a new label <code>7</code>. * <code>grid[5][0]</code> , <code>grid[5][1]</code> , <code>grid[5][2]</code> and <code>grid[5][3]</code> are unoccupied so they are not labeled. * <code>grid[5][4]</code> is occupied so check cell to the left and above, both the cells are unoccupied so assign a new label <code>8</code>. * <code>grid[5][5]</code> is occupied so check cell to the left and above, both, cell to the left and above are occupied so merge the two clusters and assign the cluster label of ~~the~~ cell above to the cell on ~~the~~ left and to this cell i.e. <code>7</code>. (Merging using union algorithm will label all the cells with label <code>8</code> to <code>7</code>). <center> {\| Line 105: == Applications == * Segmentation and Clustering of a ~~[[Binary image\|~~Binary Image]] <ref>{{cite web\|url=http://www.scialert.net/qredirect.php?doi=jas.2008.2474.2479&linkid=pdf \|format=PDF \|title=Journal of Applied Sciences \|issn=1812-5654 \|website=Scialert.net \|accessdate=2016-09-17}}</ref> * Determination of Nodal Domain Area and Nodal Line Lengths <ref>{{cite web\|url=https://webhome.weizmann.ac.il/home/feamit/nodalweek/c_joas_nodalweek.pdf \|format=PDF \|title=Introduction to the Hoshen-Kopelman algorithm and its application to nodal ___domain statistics \|author=Christian Joas \|website=Webhome.weizmann.ac.il \|accessdate=2016-09-17}}</ref> * ~~[[Connectivity (graph theory)\|~~Nodal Connectivity Information]] * Modeling of ~~[[Electrical resistivity and conductivity\|~~electrical conduction]]. == ~~See~~More ~~Also~~clustering algorithms == * [[K-means clustering\|K-means clustering algorithm]] * [[Fuzzy clustering\|Fuzzy clustering algorithm]] * Gaussian (~~[[Expectation–maximization algorithm\|Expectation Maximization]]~~EM) clustering algorithm * Clustering Methods <ref>{{Cite web\|url=https://web.stanford.edu/class/cs345a/slides/12-clustering.pdf\|title=Clustering\|last=\|first=\|date=\|website=\|publisher=\|access-date=}}</ref> * C-means Clustering Algorithm <ref>{{Cite web\|url=https://sites.google.com/site/dataclusteringalgorithms/fuzzy-c-means-clustering-algorithm\|title=Fuzzy c-means clustering\|last=\|first=\|date=\|website=\|publisher=\|access-date=}}</ref> == References ==

Hoshen–Kopelman algorithm: Difference between revisions