===[[Laplacian matrix]]===
[[Image:elastic network model.png|thumb|A 2-dimensional spring system.]]
Spectral clustering is well known to relate to partitioning of a mass-spring system, where each mass is associated with a data point and each spring stiffness corresponds to the weight of an edge describing the similarity of the two related data points, as in the [[spring system]]. Specifically, the classical reference<ref>{{cite web |first=J. |last=Demmel}}</ref> explains that the eigenvalue problem describing transversal vibration modes of a mass-spring system is exactly the same as the eigenvalue problem for the graph [[Laplacian matrix]] defined as
: <math>L:=D-A</math>,
where <math>D</math> is the [[diagonal matrix]] of degrees, <math>D_{ii} = \textstyle\sum_j A_{ij}</math>.
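For illustration, a minimal sketch in Python with [[NumPy]] (the small similarity matrix is made up for the example):
<syntaxhighlight lang="python">
import numpy as np

# Symmetric similarity (adjacency) matrix with non-negative weights; values are illustrative.
A = np.array([[0.0, 1.0, 0.2],
              [1.0, 0.0, 0.5],
              [0.2, 0.5, 0.0]])

D = np.diag(A.sum(axis=1))  # degree matrix: D_ii = sum_j A_ij
L = D - A                   # graph Laplacian L = D - A
</syntaxhighlight>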
If the similarity matrix <math>A</math> has not already been explicitly constructed, the efficiency of spectral clustering may be improved if the solution to the corresponding eigenvalue problem is performed in a [[Matrix-free methods|matrix-free fashion]] (without explicitly manipulating or even computing the similarity matrix), as in the [[Lanczos algorithm]].
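For illustration, a matrix-free sketch in Python with SciPy (the data, kernel width, and block size are made up for the example): the RBF similarity matrix is applied to a vector block by block inside a <code>LinearOperator</code>, so a Lanczos-type solver such as <code>eigsh</code> never stores the full <math>n \times n</math> matrix.
<syntaxhighlight lang="python">
import numpy as np
from scipy.sparse.linalg import LinearOperator, eigsh

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 3))      # illustrative data set
gamma = 1.0                            # RBF kernel width (illustrative)
n = X.shape[0]
sq = (X ** 2).sum(axis=1)

def affinity_matvec(v, block=100):
    """Compute A @ v for a_ij = exp(-gamma * ||x_i - x_j||^2) without storing A."""
    out = np.empty(n)
    for s in range(0, n, block):
        d2 = sq[s:s + block, None] - 2.0 * X[s:s + block] @ X.T + sq[None, :]
        out[s:s + block] = np.exp(-gamma * d2) @ v
    return out

deg = affinity_matvec(np.ones(n))      # degrees d_i = sum_j a_ij
L = LinearOperator((n, n), matvec=lambda v: deg * v - affinity_matvec(v), dtype=float)

# Smallest eigenpairs of L = D - A; in practice shift-invert or preconditioning
# (see the LOBPCG sketch below) is preferred, since 'SM' converges slowly.
vals, vecs = eigsh(L, k=3, which='SM')
</syntaxhighlight>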
For large-sized graphs, the second eigenvalue of the (normalized) graph [[Laplacian matrix]] is often [[ill-conditioned]], leading to slow convergence of iterative eigenvalue solvers. [[Preconditioner#Preconditioning for eigenvalue problems|Preconditioning]] is a key technique for accelerating convergence, e.g., in the matrix-free [[LOBPCG]] method. Spectral clustering has been successfully applied to large graphs by first identifying their [[community structure]], and then clustering communities.<ref>{{cite journal|
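A sketch of this approach using SciPy's [[LOBPCG]] implementation with an algebraic multigrid preconditioner from the optional PyAMG package (the grid graph below merely stands in for a large sparse similarity graph):
<syntaxhighlight lang="python">
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import lobpcg
import pyamg  # optional package supplying algebraic multigrid preconditioners

# Adjacency matrix of a 100x100 grid graph, a stand-in for a large sparse similarity graph.
m = 100
P = sp.diags([1, 1], [-1, 1], shape=(m, m))
A = (sp.kron(P, sp.identity(m)) + sp.kron(sp.identity(m), P)).tocsr()

deg = np.asarray(A.sum(axis=1)).ravel()
Dinv = sp.diags(1.0 / np.sqrt(deg))
L_norm = (sp.identity(A.shape[0]) - Dinv @ A @ Dinv).tocsr()   # normalized Laplacian

M = pyamg.smoothed_aggregation_solver(L_norm).aspreconditioner()  # AMG preconditioner

rng = np.random.default_rng(0)
X0 = rng.standard_normal((A.shape[0], 4))   # random block of 4 starting vectors
vals, vecs = lobpcg(L_norm, X0, M=M, largest=False, tol=1e-5, maxiter=100)
# vecs[:, 1] approximates the Fiedler vector used for bi-partitioning.
</syntaxhighlight>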
Spectral clustering is closely related to [[nonlinear dimensionality reduction]], and dimension reduction techniques such as locally-linear embedding can be used to reduce errors from noise or outliers.<ref>{{Citation
| title = Spectral clustering based on local linear approximations
| journal = Electronic Journal of Statistics
| volume = 5
| year = 2011
| doi = 10.1214/11-ejs651
| arxiv = 1001.1323
}}</ref>
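For illustration, a possible workflow in Python with [[scikit-learn]] (the toy data and parameter values are placeholders, not a recommendation): the points are first embedded with locally-linear embedding, and spectral clustering is then run on the embedded points.
<syntaxhighlight lang="python">
from sklearn.datasets import make_moons
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.cluster import SpectralClustering

X, _ = make_moons(n_samples=500, noise=0.08, random_state=0)  # noisy toy data

# Reduce the effect of noise/outliers with a locally-linear embedding, then cluster.
X_emb = LocallyLinearEmbedding(n_components=2, n_neighbors=10).fit_transform(X)
labels = SpectralClustering(n_clusters=2, affinity="nearest_neighbors",
                            n_neighbors=10, random_state=0).fit_predict(X_emb)
</syntaxhighlight>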
Moreover, a normalized Laplacian has exactly the same eigenvectors as the normalized adjacency matrix, but with the order of the eigenvalues reversed. Thus, instead of computing the eigenvectors corresponding to the smallest eigenvalues of the normalized Laplacian, one can equivalently compute the eigenvectors corresponding to the largest eigenvalues of the normalized adjacency matrix, without ever forming the Laplacian matrix.
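One way to see this is to write the symmetrically normalized Laplacian in terms of the normalized adjacency matrix (a brief sketch, using <math>A</math> and <math>D</math> as above):
: <math>D^{-1/2} L D^{-1/2} = D^{-1/2}(D - A)D^{-1/2} = I - D^{-1/2} A D^{-1/2},</math>
so an eigenvector <math>v</math> of <math>D^{-1/2} A D^{-1/2}</math> with eigenvalue <math>\mu</math> is also an eigenvector of the normalized Laplacian with eigenvalue <math>1 - \mu</math>, and the largest values of <math>\mu</math> correspond to the smallest values of <math>1 - \mu</math>.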
Naive constructions of the graph [[adjacency matrix]], e.g., using the RBF kernel, make it dense, thus requiring <math>n^2</math> memory and <math>n^2</math> arithmetic operations to compute all of its <math>n^2</math> entries. The [[Nyström method]]<ref>{{Cite journal|last=Fowlkes|first=C|date=2004|title=Spectral grouping using the Nystrom method.|url=https://escholarship.org/uc/item/29z29233|journal=IEEE Transactions on Pattern Analysis and Machine Intelligence|volume=26|issue=2|pages=214–25|doi=10.1109/TPAMI.2004.1262185|pmid=15376896|s2cid=2384316}}</ref> can be used to approximate the similarity matrix, but the approximate matrix is not elementwise positive,<ref>{{Cite journal|
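One possible sketch of this idea with scikit-learn's <code>Nystroem</code> transformer (illustrative parameters, and subject to the caveat above that the approximation is not elementwise positive): the affinity matrix is replaced by a low-rank factor <math>Z</math> with <math>A \approx ZZ^\top</math>, and the leading eigenvectors of the normalized affinity are recovered from a thin SVD.
<syntaxhighlight lang="python">
import numpy as np
from sklearn.datasets import make_moons
from sklearn.kernel_approximation import Nystroem
from sklearn.cluster import KMeans

X, _ = make_moons(n_samples=2000, noise=0.05, random_state=0)

# Low-rank factor Z with A ~= Z @ Z.T, avoiding the dense n x n RBF matrix.
Z = Nystroem(kernel="rbf", gamma=20.0, n_components=100, random_state=0).fit_transform(X)

deg = Z @ (Z.T @ np.ones(Z.shape[0]))        # approximate degrees d_i = sum_j a_ij
deg = np.maximum(deg, 1e-12)                 # guard against non-positive approximate degrees
W = Z / np.sqrt(deg)[:, None]                # rows scaled by d_i^{-1/2}

# Left singular vectors of W approximate eigenvectors of D^{-1/2} A D^{-1/2}.
U, s, _ = np.linalg.svd(W, full_matrices=False)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(U[:, :2])
</syntaxhighlight>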
Algorithms to construct the graph adjacency matrix as a [[sparse matrix]] are typically based on a [[nearest neighbor search]], which estimates or samples a neighborhood of a given data point for nearest neighbors, and computes non-zero entries of the adjacency matrix by comparing only pairs of neighbors. The number of selected nearest neighbors thus determines the number of non-zero entries, and is often fixed so that the memory footprint of the <math>n</math>-by-<math>n</math> graph adjacency matrix is only <math>O(n)</math>, only <math>O(n)</math> sequential arithmetic operations are needed to compute the <math>O(n)</math> non-zero entries, and the calculations can be trivially run in parallel.
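For illustration, such a sparse adjacency matrix can be built with scikit-learn's <code>kneighbors_graph</code> (the data set and the number of neighbors are placeholders):
<syntaxhighlight lang="python">
from sklearn.datasets import make_moons
from sklearn.neighbors import kneighbors_graph

X, _ = make_moons(n_samples=1000, noise=0.05, random_state=0)

# Each row keeps only a fixed number of nearest neighbors, giving O(n) non-zero entries.
A = kneighbors_graph(X, n_neighbors=10, mode="connectivity", include_self=False)
A = 0.5 * (A + A.T)   # symmetrize: k-nearest-neighbor relations are not mutual in general
</syntaxhighlight>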
The ideas behind spectral clustering may not be immediately obvious. It may be useful to highlight relationships with other methods. In particular, it can be described in the context of kernel clustering methods, which reveals several similarities with other approaches.<ref name="filippone2008survey">{{cite journal
| title = A survey of kernel and spectral methods for clustering
| journal = Pattern Recognition
=== Relationship with ''k''-means ===
The weighted kernel ''k''-means problem<ref name="dhillon2004kernel">{{cite conference
| year = 2004
| title = Kernel ''k''-means: spectral clustering and normalized cuts
| book-title = Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
| url = https://www.cs.utexas.edu/users/inderjit/public_papers/kdd_spectral_kernelkmeans.pdf
}}</ref>
shares the objective function with the spectral clustering problem, which can be optimized directly by multi-level methods.<ref>{{cite journal|
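Both problems can be written, after relaxation, as the same trace maximization (a sketch, writing <math>K</math> for the kernel or normalized affinity matrix and <math>Y</math> for an <math>n \times k</math> matrix with orthonormal columns encoding cluster membership):
: <math>\max_{Y^\top Y = I} \operatorname{trace}\!\left(Y^\top K Y\right),</math>
which the spectral method solves exactly by taking the <math>k</math> leading eigenvectors of <math>K</math>, while weighted kernel ''k''-means optimizes the discrete (unrelaxed) indicators directly, e.g., by the multi-level methods mentioned above.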
=== Relationship to DBSCAN ===
== History and related literature ==
Spectral clustering has a long history.<ref>{{Cite journal|last=Cheeger|first=Jeff|date=1969|title=A lower bound for the smallest eigenvalue of the Laplacian|journal=Proceedings of the Princeton Conference in Honor of Professor S. Bochner}}</ref><ref>{{Cite journal|
Ideas and network measures related to spectral clustering also play an important role in a number of applications apparently different from clustering problems. For instance, networks with stronger spectral partitions take longer to converge in opinion-updating models used in sociology and economics.<ref name="DeMarzo Vayanos Zwiebel pp. 909–968">{{cite journal | last1=DeMarzo | first1=P. M. | last2=Vayanos | first2=D. | last3=Zwiebel | first3=J. | title=Persuasion Bias, Social Influence, and Unidimensional Opinions | journal=The Quarterly Journal of Economics | publisher=Oxford University Press | year=2003 | pages=909–968}}</ref>
== See also ==