Larry.europe (talk | contribs)
m Open access bot: pmc updated in citation with #oabot.
(17 intermediate revisions by 5 users not shown)
Line 1:
{{Short description|Metric of clustering solutions quality}}
[[File:DBCV clustering evaluation.png|thumb|500px|In each graph, an increasing level of noise is added to the initial data, which consist of two well-defined semicircles. As the noise, and with it the overlap between the two groups, increases, the value of the DBCV index progressively decreases. Image released under MIT license.<ref name = felsiq>GitHub.
FelSiq/DBCV Fast Density-Based Clustering Validation (DBCV) Python
package -- https://github.com/FelSiq/DBCV</ref>]]
Line 12 ⟶ 11:
This metric was introduced in 2014 by David Moulavi and colleagues.<ref name = Moulavi>{{Citation
| last1 = Moulavi
| first1 = David
| last2 = Jaskowiak
| first2 = Pablo A.
| last3 = Campello
| first3 = Ricardo J. G. B.
| last4 = Zimek
| first4 = Arthur
| last5 = Sander
| first5 = Jörg
| chapter = Density-Based Clustering Validation
| year = 2014
Line 24 ⟶ 31:
}}</ref> It utilizes density connectivity principles to quantify clustering structures, making it especially effective at detecting arbitrarily shaped clusters in concave datasets, where traditional metrics may be less reliable.
The DBCV index has been employed
| last= Di Giovanni
| first= Daniele
Line 37 ⟶ 44:
| pmid= 36833240
| pmc= 9956345
}}</ref> ecology
| last= Poutaraud
| first= Joachim
Line 47 ⟶ 54:
| publisher = Elsevier
| doi = 10.1016/j.ecoinf.2024.102687
| doi-access= free
}}</ref> techno-economic analysis,<ref name="Shim">{{Citation
| last= Shim
| first= Jaehyun
Line 61 ⟶ 67:
| bibcode= 2022ECM...27416411S
| url = https://www.sciencedirect.com/science/article/abs/pii/S019689042201189X
}}</ref> and health informatics
| last= Martínez
| first= Rubén Yáñez
| year= 2023
| title= Spanish Corpora of tweets about COVID-19 vaccination for automatic stance detection
| journal = Information Processing & Management
| volume= 60
| issue= 3
Line 72 ⟶ 79:
| publisher = Elsevier
| doi = 10.1016/j.ipm.2023.103294
| doi-access= free
}}</ref>
<ref>{{cite journal |
author= Chicco D. |
author2= Oneto L. |
author3= Cangelosi D. |
title = DBSCAN and DBCV application to open medical records heterogeneous data for identifying clinically significant clusters of patients with neuroblastoma |
journal = BioData Mining |
volume = 18 |
issue = 40 |
date = 2025 |
pages = 1–17 |
doi = 10.1186/s13040-025-00455-8 |
doi-access=free|
}}</ref> as well as in numerous other fields<ref name="Beer">{{cite arXiv |mode=cs2
| last= Beer
| first= Anna
Line 117 ⟶ 136:
| doi = 10.1007/978-0-387-39940-9_605
| url = https://doi.org/10.1007/978-0-387-39940-9_605
| url-access= subscription
}}</ref> Two points within the same cluster are considered density-connected if there exists a sequence of intermediate points linking them, where each consecutive pair meets a predefined density criterion. The '''density-based distance''' between two points is determined by identifying the optimal path that minimizes the maximum local reachability distance along its trajectory.
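The minimax idea behind the density-based distance can be illustrated with a short sketch. It assumes that the pairwise reachability distances are already available as a symmetric matrix (the hypothetical <code>weights</code> argument). The function name <code>minimax_path_cost</code> and the toy values are illustrative only and are not taken from any DBCV implementation.

```python
import heapq


def minimax_path_cost(weights, src, dst):
    """Smallest achievable 'largest edge' over all paths from src to dst.

    weights is assumed to be a symmetric matrix (list of lists) of pairwise
    reachability distances; how those distances are computed is outside
    the scope of this sketch.
    """
    n = len(weights)
    best = [float("inf")] * n  # best bottleneck cost found so far per node
    best[src] = 0.0
    heap = [(0.0, src)]
    while heap:
        cost, u = heapq.heappop(heap)
        if u == dst:
            return cost
        if cost > best[u]:
            continue  # stale heap entry
        for v in range(n):
            if v == u:
                continue
            # The cost of a path is its largest edge, not the sum of edges.
            cand = max(cost, weights[u][v])
            if cand < best[v]:
                best[v] = cand
                heapq.heappush(heap, (cand, v))
    return best[dst]


# Toy example: going from point 0 to point 2 through point 1 avoids
# the long direct edge of length 4.0.
toy_weights = [
    [0.0, 1.0, 4.0],
    [1.0, 0.0, 1.5],
    [4.0, 1.5, 0.0],
]
toy_distance = minimax_path_cost(toy_weights, 0, 2)
```

In this toy matrix the direct edge between points 0 and 2 has length 4.0, but the path through point 1 never uses an edge longer than 1.5, so 1.5 is the resulting distance. The minimax path between any two points also lies on a minimum spanning tree of the same weighted graph, so a spanning tree is an alternative way to obtain these distances.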
Line 142 ⟶ 162:
== Explanation ==
DBCV index values range between −1 and +1:
* +1: Strongly cohesive and well-separated clusters.
Line 150 ⟶ 170:
By leveraging density-based distances instead of traditional [[Euclidean distance|Euclidean measures]], the DBCV index provides a more robust evaluation of clustering performance in datasets with irregular or non-spherical distributions.<ref name = Moulavi />
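How per-cluster scores combine into a single bounded value can be sketched as follows, assuming the formulation of Moulavi et al.: each cluster's validity is its density separation from the nearest other cluster minus its own density sparseness, divided by the larger of the two, and the index is the size-weighted average of these validities. The function names and input numbers below are hypothetical, and returning 0 when both quantities are zero is a convention of this sketch only.

```python
def cluster_validity(sparseness, separation):
    """Validity of one cluster; always lies in [-1, +1].

    sparseness: density sparseness of the cluster (internal spread).
    separation: density separation to the closest other cluster.
    """
    denom = max(sparseness, separation)
    if denom == 0:
        return 0.0  # sketch-only convention for the degenerate case
    return (separation - sparseness) / denom


def dbcv_from_parts(sizes, sparseness, separation, n_total=None):
    """Size-weighted average of per-cluster validities.

    n_total is the number of objects in the whole dataset; it is assumed
    to default to the sum of the cluster sizes (i.e. no noise points).
    """
    if n_total is None:
        n_total = sum(sizes)
    return sum(
        size / n_total * cluster_validity(spa, sep)
        for size, spa, sep in zip(sizes, sparseness, separation)
    )


# Invented numbers: two equally sized clusters, both far more separated
# from each other than they are internally spread out.
example_score = dbcv_from_parts(
    sizes=[50, 50],
    sparseness=[1.0, 2.0],
    separation=[4.0, 4.0],
)
```

Because each per-cluster term is a difference divided by the larger of its two operands, it cannot leave the interval from −1 to +1, and a weighted average with weights summing to at most 1 stays in the same interval.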
==
*{{Citation
| last1 = Moulavi
| first1 = David
| last2 = Jaskowiak
| first2 = Pablo A.
| last3 = Campello
| first3 = Ricardo J. G. B.
| last4 = Zimek
| first4 = Arthur
| last5 = Sander
| first5 = Jörg
| chapter = Density-based clustering validation
| year = 2014
| title = Proceedings of the 2014 SIAM International Conference on Data Mining
| doi = 10.1137/1.9781611973440.96
| pages = 839–847
| publisher = SIAM
| isbn = 978-1-61197-344-0
| url = https://www.dbs.ifi.lmu.de/~zimek/publications/SDM2014/DBCV.pdf
| doi-access=free
}}
*{{Citation
| last1 = Chicco
| first1 = Davide
| last2 = Sabino
| first2 = Giuseppe
| last3 = Oneto
| first3 = Luca
| last4 = Jurman
| first4 = Giuseppe
| title = The DBCV index is more informative than DCSI, CDbw, and VIASCKDE indices for unsupervised clustering internal assessment of concave-shaped and density-based clusters
| year = 2025
| journal = PeerJ Computer Science
| doi = 10.7717/peerj-cs.3095
| pages = 1–37
| publisher = PeerJ Inc.
| url = https://doi.org/10.7717/peerj-cs.3095
| doi-access=free
}}
== Implementations ==
* [https://github.com/christopherjenness/DBCV Python DBCV Implementation by Christopher Jenness]
* [https://doi.org/10.32614/cran.package.dbcvindex R DBCV Implementation by Pablo Andretta Jaskowiak]
== See also ==
Line 167 ⟶ 227:
== References ==
<references/>
{{Machine learning evaluation metrics}}
[[Category:Cluster analysis]]