Content deleted Content added
HyperGaruda (talk | contribs) Rmv unnecessary justified alignment per MOS:MARKUP and inappropriate numbered list |
m Open access bot: url-access updated in citation with #oabot. |
||
(49 intermediate revisions by 34 users not shown) | |||
Line 1:
{{short description|Algorithm for labeling clusters on a grid}}
{{Machine learning bar}}
The '''Hoshen–Kopelman algorithm''' is a simple and efficient [[algorithm]] for labeling [[Cluster analysis|clusters]] on a grid
== Percolation theory ==
[[Percolation theory]] is the study of the behavior and [[statistics]] of [[cluster analysis|clusters]] on [[Lattice graph|lattices]]. Suppose we have a large square lattice where each cell can be occupied with the [[probability]]
{| style="margin: auto;"
|-
| <gallery>
Occupied_Grids_P_%3D_0.24.png |
</gallery>
Occupied_Grids_P_%3D_0.64.png |
</gallery>
|}
== Hoshen–Kopelman algorithm for cluster finding ==
In this algorithm, we scan through a grid looking for occupied cells and labeling them with cluster labels. The scanning process is called a [[raster scan]]. The algorithm begins with scanning the grid cell by cell and
== Union-find algorithm ==
==
<strong>Raster Scan and Labeling on the Grid</strong>▼
largest_label = 0;▼
for x in 0 to n_columns {▼
if occupied[x,y] then▼
left = occupied[x-1,y];▼
above = occupied[x,y-1];▼
if (left == 0) and (above == 0) then /* Neither a label above nor to the left. */▼
largest_label = largest_label + 1; /* Make a new, as-yet-unused cluster label. */▼
else if (left != 0) and (above == 0) then /* One neighbor, to the left. */▼
else if (left == 0) and (above != 0) then /* One neighbor, above. */▼
else /* Neighbors BOTH to the left and above. */▼
union(left,above); /* Link the left and above clusters. */▼
}▼
}▼
On the other hand, if the current cell has no neighbors, it is assigned a new, previously unused, label. The entire grid is processed in this way.
Following [[pseudocode]] is referred from [https://www.ocf.berkeley.edu/~fricke/ Tobin Fricke's] implementation of the same algorithm.<ref name=":0">{{cite web|url=https://www.ocf.berkeley.edu/~fricke/projects/hoshenkopelman/hoshenkopelman.html |title=The Hoshen-Kopelman Algorithm for cluster identification|first=Tobin |last=Fricke |website=ocf.berkeley.edu |date=2004-04-21 |access-date=2016-09-17}}</ref> On completion, the cluster labels may be found in <code>labels</code>. Not shown is the second raster scan of the grid needed to produce the example output. In that scan, the value at <code>label[x,y]</code> is replaced by <code>find(label[x,y])</code>.
void union(int x, int y) {▼
labels[find(x)] = find(y);▼
label = zeros[n_columns, n_rows]
labels = [0:n_columns*n_rows] /* Array containing integers from 0 to the size of the image. */
while (labels[x] != x) {▼
int z = labels[x];▼
x = z;▼
}▼
return y;▼
}
'''Union'''
}
'''Find'''
int find(int x) {
while (labels[y] != y)
y = labels[y];
labels[x] = y;
▲ }
▲ return y;
}
== Example ==
Consider the following example. The dark cells in the grid in
The algorithm processes the input grid, cell by cell, as follows: Let's say that grid is a two-dimensional array.
Line 75 ⟶ 80:
* <code>grid[0][3]</code> is occupied so check cell to the left which is unoccupied so we increment the current label value and assign the label to the cell as <code>2</code>.
* <code>grid[0][4]</code>, <code>grid[0][5]</code> and <code>grid[1][0]</code> are unoccupied so they are not labeled.
* <code>grid[1][1]</code> is occupied so check cell to the left and above, both the cells are unoccupied so assign a new label <code>3</code>.
* <code>grid[1][2]</code> is occupied so check cell to the left and above, only the cell to the left is occupied so assign the label of a cell on the left to this cell <code>3</code>.
* <code>grid[1][3]</code> is occupied so check cell to the left and above, both the cells are occupied, so merge the two clusters and assign the cluster label of the cell above to the cell on the left and to this cell i.e. <code>2</code>. (Merging using union algorithm will label all the cells with label <code>3</code> to <code>2</code>)
* <code>grid[1][4]</code> is occupied so check cell to the left and above, only the cell to the left is occupied so assign the label of a cell on the left to this cell <code>2</code>.
* <code>grid[1][5]</code> , <code>grid[2][0]</code> and <code>grid[2][1]</code> are unoccupied so they are not labeled.
* <code>grid[2][2]</code> is occupied so check cell to the left and above, only cell above is occupied so assign the label of the cell above to this cell <code>2</code>.
* <code>grid[2][3]</code> , <code>grid[2][4]</code> and <code>grid[2][5]</code> are unoccupied so they are not labeled.
* <code>grid[3][0]</code> is occupied so check cell above which is unoccupied so we increment the current label value and assign the label to the cell as <code>4</code>.
* <code>grid[3][1]</code> is occupied so check cell to the left and above, only the cell to the left is occupied so assign the label of the cell on the left to this cell <code>4</code>.
* <code>grid[3][2]</code> is unoccupied so it is not labeled.
* <code>grid[3][3]</code> is occupied so check cell to the left and above, both the cells are unoccupied so assign a new label <code>5</code>.
* <code>grid[3][4]</code> is occupied so check cell to the left and above, only the cell to the left is occupied so assign the label of the cell on the left to this cell <code>5</code>.
* <code>grid[3][5]</code> , <code>grid[4][0]</code> and <code>grid[4][1]</code> are unoccupied so they are not labeled.
* <code>grid[4][2]</code> is occupied so check cell to the left and above, both the cells are unoccupied so assign a new label <code>6</code>.
* <code>grid[4][3]</code> is occupied so check cell to the left and above, both, cell to the left and above are occupied so merge the two clusters and assign the cluster label of the cell above to the cell on the left and to this cell i.e. <code>5</code>. (Merging using union algorithm will label all the cells with label <code>6</code> to <code>5</code>).
* <code>grid[4][4]</code> is unoccupied so it is not labeled.
* <code>grid[4][5]</code> is occupied so check cell to the left and above, both the cells are unoccupied so assign a new label <code>7</code>.
* <code>grid[5][0]</code> , <code>grid[5][1]</code> , <code>grid[5][2]</code> and <code>grid[5][3]</code> are unoccupied so they are not labeled.
* <code>grid[5][4]</code> is occupied so check cell to the left and above, both the cells are unoccupied so assign a new label <code>8</code>.
* <code>grid[5][5]</code> is occupied so check cell to the left and above, both, cell to the left and above are occupied so merge the two clusters and assign the cluster label of the cell above to the cell on the left and to this cell i.e. <code>7</code>. (Merging using union algorithm will label all the cells with label <code>8</code> to <code>7</code>).
{| style="margin: auto;"
|-
| <gallery>
H-K algorithm Input.png |
</gallery>
H-K algorithm output.png |
</gallery>
|}
== Applications ==
* Determination of Nodal Domain Area and Nodal Line Lengths <ref>{{cite web|url=https://webhome.weizmann.ac.il/home/feamit/nodalweek/c_joas_nodalweek.pdf |title=Introduction to the Hoshen-Kopelman algorithm and its application to nodal ___domain statistics |author=Christian Joas |website=Webhome.weizmann.ac.il |access-date=2016-09-17}}</ref>▼
* [[Connectivity (graph theory)|Nodal Connectivity Information]]▼
▲* Determination of Nodal Domain Area and Nodal Line Lengths<ref>https://webhome.weizmann.ac.il/home/feamit/nodalweek/c_joas_nodalweek.pdf</ref>
▲* Nodal Connectivity Information
▲* Modeling of electrical conduction.
==
* [[K-means clustering|K-means clustering algorithm]]
* [[Fuzzy clustering|Fuzzy clustering algorithm]]
* Gaussian (
* Clustering Methods <ref>{{Cite web|url=https://web.stanford.edu/class/cs345a/slides/12-clustering.pdf|title=Clustering}}</ref>
* C-means Clustering Algorithm <ref>{{Cite web|url=https://sites.google.com/site/dataclusteringalgorithms/fuzzy-c-means-clustering-algorithm|title=Fuzzy c-means clustering}}</ref>
* [[Connected-component labeling]]
== References ==
{{Reflist}}
{{DEFAULTSORT:Hoshen-Kopelman algorithm}}
[[Category:
|