{{Short description|Geometric algorithm}}
[[File:Diffusion_map_of_a_torodial_helix.jpg|thumb|right|Given non-uniformly sampled data points on a toroidal helix (top), the first two Diffusion Map coordinates with Laplace–Beltrami normalization are plotted (bottom). The Diffusion Map unravels the toroidal helix recovering the underlying intrinsic circular geometry of the data.]]
 
'''Diffusion maps''' is a [[dimensionality reduction]] or [[feature extraction]] algorithm introduced by [[Ronald Coifman|Coifman]] and Lafon<ref name="PNAS1" /><ref name="PNAS2" /><ref name="DifussionMap" /><ref name="Diffusion" /> which computes a family of [[embedding]]s of a data set into Euclidean space (often low-dimensional) whose coordinates can be computed from the eigenvectors and eigenvalues of a diffusion operator on the data. The Euclidean distance between points in the embedded space is equal to the "diffusion distance" between probability distributions centered at those points. Unlike linear dimensionality reduction methods such as [[principal component analysis]] (PCA) and [[multi-dimensional scaling]] (MDS), diffusion maps are part of the family of [[nonlinear dimensionality reduction]] methods which focus on discovering the underlying [[manifold]] that the data has been sampled from. By integrating local similarities at different scales, diffusion maps give a global description of the data set. Compared with other methods, the diffusion map algorithm is robust to noise perturbation and computationally inexpensive.
 
==Definition of diffusion maps==
 
===Connectivity===
Diffusion maps exploit the relationship between [[heat diffusion]] and [[random walk]] [[Markov chain]]s. The basic observation is that if we take a random walk on the data, walking to a nearby data point is more likely than walking to another that is far away. Let <math>(X, \mathcal{A}, \mu)</math> be a [[measure space]], where <math>X</math> is the data set and <math>\mu</math> represents the distribution of the points on <math>X</math>.
 
Based on this, the connectivity <math>k</math> between two data points, <math>x</math> and <math>y</math>, can be defined as the probability of walking from <math>x</math> to <math>y</math> in one step of the random walk. Usually, this probability is specified in terms of a kernel function of the two points: <math>k: X \times X \rightarrow \mathbb{R}</math>. For example, the popular Gaussian kernel:
: <math>
k(x,y) = \exp\left(-\frac{||x-y||^2}{\epsilon}\right)
</math>

for some scale parameter <math>\epsilon</math>. The kernel is required to be symmetric (<math>k(x,y) = k(y,x)</math>) and positivity preserving (<math>k(x,y) \geq 0</math>).
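As a concrete illustration, the following is a minimal NumPy sketch of this kernel on a finite sample; the data array <code>X</code> and the bandwidth <code>epsilon</code> are placeholders chosen for the example, not part of the original formulation.

<syntaxhighlight lang="python">
import numpy as np

def gaussian_kernel(X, epsilon):
    """Pairwise Gaussian kernel k(x, y) = exp(-||x - y||^2 / epsilon).

    X is an (n, d) array of n data points. The result is a symmetric,
    positivity-preserving (n, n) connectivity matrix.
    """
    # Squared Euclidean distance between every pair of rows of X.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / epsilon)
</syntaxhighlight>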
 
The kernel constitutes the prior definition of the ''local'' geometry of the data set. Since a given kernel will capture a specific feature of the data set, its choice should be guided by the application that one has in mind. This is a major difference from methods such as [[principal component analysis]], where correlations between all data points are taken into account at once.
 
Given <math>(X, k)</math>, we can then construct a reversible discrete-time Markov chain on <math>X</math> (a process known as the normalized graph Laplacian construction):
: <math>
d(x) = \int_X k(x,y) \, d\mu(y)
</math>

and defining the one-step transition probabilities

: <math>
p(x,y) = \frac{k(x,y)}{d(x)},
</math>

which form the kernel of the transition matrix <math>M</math> of the chain.
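On a finite data set the integral becomes a sum, and the construction reduces to row-normalising the kernel matrix. A short sketch follows; the sample and the bandwidth are again illustrative choices.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))   # toy sample of 100 points in R^3
epsilon = 1.0                       # illustrative bandwidth

# Kernel matrix k(x_i, x_j).
K = np.exp(-np.sum((X[:, None] - X[None, :]) ** 2, axis=-1) / epsilon)

# Discrete analogue of d(x) = ∫_X k(x, y) dμ(y): the row sums of the kernel.
d = K.sum(axis=1)

# One-step transition matrix M[i, j] = k(x_i, x_j) / d(x_i); each row of M
# is a probability distribution, so M defines a Markov chain on the data.
M = K / d[:, None]
</syntaxhighlight>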
 
One of the main ideas of the diffusion framework is that running the chain forward in time (taking larger and larger powers of <math>M</math>) reveals the geometric structure of <math>X</math> at larger and larger scales (the diffusion process). Specifically, the notion of a ''cluster'' in the data set is quantified as a region in which the probability of escaping this region is low (within a certain time <math>t</math>). Therefore, <math>t</math> not only serves as a time parameter, but also has the dual role of a scale parameter.
 
The eigendecomposition of the matrix <math>M^t</math> yields
: <math>
M^t(x,y) = \sum_l \lambda_l^t \psi_l(x) \phi_l(y)
</math>
 
where <math>\{\lambda_l \}</math> is the sequence of eigenvalues of <math>M</math> and <math>\{\psi_l \}</math> and <math>\{\phi_l \}</math> are the biorthogonal right and left eigenvectors respectively.
Due to the spectral decay of the eigenvalues, only a few terms are necessary to achieve a given relative accuracy in this sum.
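The decay can be inspected numerically. The sketch below (toy data, illustrative bandwidth) builds the biorthogonal eigenvectors through the symmetric conjugate <math>D^{1/2} M D^{-1/2}</math>, a standard numerical trick rather than part of the exposition above, and compares <math>M^t</math> with its truncated expansion.

<syntaxhighlight lang="python">
import numpy as np

# Toy setup: Gaussian kernel on random points (illustrative values).
rng = np.random.default_rng(0)
X = rng.standard_normal((80, 2))
K = np.exp(-np.sum((X[:, None] - X[None, :]) ** 2, axis=-1))
d = K.sum(axis=1)

# Eigendecompose the symmetric conjugate S = D^(-1/2) K D^(-1/2), which has
# the same eigenvalues as M = D^(-1) K; this is numerically more stable than
# a general eigensolver applied to the non-symmetric M.
S = K / np.sqrt(np.outer(d, d))
lam, v = np.linalg.eigh(S)
lam, v = lam[::-1], v[:, ::-1]        # sort eigenvalues in decreasing order
psi = v / np.sqrt(d)[:, None]         # right eigenvectors of M
phi = v * np.sqrt(d)[:, None]         # left eigenvectors of M (biorthogonal)

# M^t equals psi diag(lam^t) phi^T; keeping only the k leading terms
# already approximates it closely because the eigenvalues decay.
t, k = 3, 10
Mt_full = psi @ np.diag(lam ** t) @ phi.T
Mt_trunc = psi[:, :k] @ np.diag(lam[:k] ** t) @ phi[:, :k].T
print(np.abs(Mt_full - Mt_trunc).max())   # truncation error (shrinks with k)
</syntaxhighlight>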
 
====Parameter α and the diffusion operator====
The reason to introduce the normalization step involving <math>\alpha</math> is to tune the influence of the data point density on the infinitesimal transition of the diffusion. In some applications, the sampling of the data is generally not related to the geometry of the manifold we are interested in describing. In this case, we can set <math>\alpha=1</math> and the diffusion operator approximates the [[Laplace–Beltrami operator]]. We then recover the Riemannian geometry of the data set regardless of the distribution of the points. To describe the long-term behavior of the point distribution of a system of stochastic differential equations, we can use <math>\alpha=0.5</math> and the resulting Markov chain approximates the [[Fokker–Planck equation|Fokker–Planck diffusion]]. With <math>\alpha=0</math>, it reduces to the classical graph Laplacian normalization.
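In matrix form the <math>\alpha</math>-normalisation amounts to rescaling the kernel matrix by powers of the estimated density before the usual row normalisation. A minimal sketch follows, assuming a precomputed kernel matrix <code>K</code> as in the earlier examples.

<syntaxhighlight lang="python">
import numpy as np

def alpha_normalized_markov(K, alpha):
    """Markov matrix of the alpha-normalised diffusion on a kernel matrix K.

    alpha = 0 recovers the classical graph Laplacian normalisation,
    alpha = 1/2 the Fokker-Planck diffusion, and alpha = 1 the
    Laplace-Beltrami approximation.
    """
    # K_alpha = D^(-alpha) K D^(-alpha), with D = diag(row sums of K):
    # this divides out an alpha-power of the density estimate.
    d = K.sum(axis=1)
    K_alpha = K / np.outer(d ** alpha, d ** alpha)

    # Row-normalise the rescaled kernel to obtain a stochastic matrix.
    d_alpha = K_alpha.sum(axis=1)
    return K_alpha / d_alpha[:, None]
</syntaxhighlight>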
 
===Diffusion distance===
The diffusion distance at time <math>t</math> between two points <math>x</math> and <math>y</math> compares the probability distributions <math>M^t(x,\cdot)</math> and <math>M^t(y,\cdot)</math> obtained by running the chain forward from each point; in terms of the eigenvectors it can be written as

: <math>
D_t(x,y)^2 = \sum_l \lambda_l^{2t} \left(\psi_l(x) - \psi_l(y)\right)^2.
</math>

Retaining the first <math>k</math> nontrivial eigenpairs gives the diffusion map

: <math>
\Psi_t(x) = \left(\lambda_1^t \psi_1(x), \lambda_2^t \psi_2(x), \ldots, \lambda_k^t \psi_k(x)\right).
</math>
Thus we get the diffusion map from the original data to a ''k''-dimensional space which is embedded in the original space.
 
In<ref name="Nadler05diffusionmaps" /> it is proved that
 
: <math>
D_t(x_i,x_j)^2 \approx ||\Psi_t(x_i)-\Psi_t(x_j)||^2 \,
</math>
so the Euclidean distance in the diffusion coordinates approximates the diffusion distance.
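On a finite sample this relation can be checked directly. The sketch below uses toy data and computes the diffusion distance from the rows of <math>M^t</math>, weighted by the stationary distribution <math>\phi_0</math> as in the standard definition; when all eigenpairs are kept the two quantities agree exactly.

<syntaxhighlight lang="python">
import numpy as np

# Toy setup (all names and parameter values here are illustrative).
rng = np.random.default_rng(0)
X = rng.standard_normal((60, 2))
K = np.exp(-np.sum((X[:, None] - X[None, :]) ** 2, axis=-1))  # Gaussian kernel
d = K.sum(axis=1)
M = K / d[:, None]                   # Markov transition matrix
phi0 = d / d.sum()                   # its stationary distribution
t = 2

# Diffusion distance straight from the definition: compare two rows of M^t
# in the metric weighted by 1 / phi0.
Mt = np.linalg.matrix_power(M, t)
i, j = 0, 1
D2_direct = np.sum((Mt[i] - Mt[j]) ** 2 / phi0)

# The same quantity from the diffusion coordinates lambda_l^t * psi_l, using
# the symmetric conjugate S = D^(-1/2) K D^(-1/2) for a stable eigensolve.
S = K / np.sqrt(np.outer(d, d))
lam, v = np.linalg.eigh(S)           # eigenvalues in ascending order
psi = v / np.sqrt(phi0)[:, None]     # right eigenvectors of M, normalised
D2_embed = np.sum(lam[:-1] ** (2 * t) * (psi[i, :-1] - psi[j, :-1]) ** 2)

print(np.isclose(D2_direct, D2_embed))   # True: the two distances agree
</syntaxhighlight>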
The basic framework of the diffusion map algorithm is as follows:
 
Step 1. Given the similarity matrix ''L''.
 
Step 2. Normalize the matrix according to parameter <math>\alpha</math>: <math>L^{(\alpha)} = D^{-\alpha} L D^{-\alpha} </math>.
 
Step 3. Form the normalized matrix <math>M=({D}^{(\alpha)})^{-1}L^{(\alpha)}</math>.
 
Step 4. Compute the ''k'' largest eigenvalues of <math>M^t</math> and the corresponding eigenvectors.
 
Step 5. Use the diffusion map to get the embedding <math>\Psi_t</math>.
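A compact sketch of these five steps with NumPy follows; the function name, the toy data, and the parameter values are illustrative choices, not a reference implementation.

<syntaxhighlight lang="python">
import numpy as np

def diffusion_map(L, alpha=0.5, t=1, n_components=2):
    """Diffusion-map embedding of a similarity matrix L (steps 1-5).

    L            -- symmetric (n, n) similarity matrix (step 1)
    alpha        -- density normalisation parameter
    t            -- diffusion time (power of the Markov matrix)
    n_components -- dimension k of the embedding
    """
    # Step 2: alpha-normalisation  L_alpha = D^(-alpha) L D^(-alpha).
    d = L.sum(axis=1)
    L_alpha = L / np.outer(d ** alpha, d ** alpha)

    # Step 3: row-normalise to the Markov matrix M = (D_alpha)^(-1) L_alpha.
    d_alpha = L_alpha.sum(axis=1)
    M = L_alpha / d_alpha[:, None]

    # Step 4: eigendecomposition, eigenpairs sorted by decreasing eigenvalue.
    eigvals, eigvecs = np.linalg.eig(M)
    order = np.argsort(-eigvals.real)
    eigvals = eigvals.real[order]
    eigvecs = eigvecs.real[:, order]

    # Step 5: diffusion map Psi_t(x) = (lambda_1^t psi_1(x), ...), skipping
    # the trivial constant eigenvector with eigenvalue 1.
    return eigvals[1:n_components + 1] ** t * eigvecs[:, 1:n_components + 1]

# Usage: embed noisy points on a circle into two diffusion coordinates.
rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 200)
X = np.c_[np.cos(theta), np.sin(theta)] + 0.05 * rng.standard_normal((200, 2))
L = np.exp(-np.sum((X[:, None] - X[None, :]) ** 2, axis=-1) / 0.1)
embedding = diffusion_map(L, alpha=1.0, t=1, n_components=2)
</syntaxhighlight>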
 
==Application==
In the paper,<ref name="Nadler05diffusionmaps" /> Nadler et al. showed how to design a kernel that reproduces the diffusion induced by a [[Fokker–Planck equation]]. They also explained that, when the data approximate a manifold, one can recover the geometry of this manifold by computing an approximation of the [[Laplace–Beltrami operator]]. This computation is completely insensitive to the distribution of the points and therefore provides a separation of the statistics and the geometry of the data. Since diffusion maps give a global description of the data set, they can measure the distances between pairs of sample points in the manifold in which the data is embedded. Applications based on diffusion maps include [[facial recognition system|face recognition]],<ref name="vmrs">{{cite journal
| last1 = Barkan
| first1 = Oren
| last2 = Weill
| first2 = Jonathan
| last3 = Wolf
| first3 = Lior
| last4 = Aronowitz
| first4 = Hagai
| year = 2013
| url = http://www.cv-foundation.org/openaccess/content_iccv_2013/papers/Barkan_Fast_High_Dimensional_2013_ICCV_paper.pdf
| title = Fast high dimensional vector multiplication face recognition
| journal = Proceedings of the IEEE International Conference on Computer Vision 2013
| pages = 1960–1967
}}</ref> [[spectral clustering]], low-dimensional representation of images, image segmentation,<ref name="Farbman" /> 3D model segmentation,<ref name="sidi11" /> speaker verification<ref name="spk">{{cite book
| last1 = Barkan
| first1 = Oren
| last2 = Aronowitz
| first2 = Hagai
| title = 2013 IEEE International Conference on Acoustics, Speech and Signal Processing
| chapter-url = http://smartfp7.eu/sites/default/files/field/files/page/DIFFUSION%20MAPS%20FOR%20PLDA-BASED%20SPEAKER%20VERIFICATION.pdf
| chapter = Diffusion maps for PLDA-based speaker verification
| date = 2013
| pages = 7639–7643
| doi = 10.1109/ICASSP.2013.6639149
| isbn = 978-1-4799-0356-6
}}</ref> and identification,<ref name="speakerid2011" /> sampling on manifolds, anomaly detection,<ref name="Mishne" /><ref>{{Cite journal|last1=Shabat|first1=Gil|last2=Segev|first2=David|last3=Averbuch|first3=Amir|date=2018-01-07|title=Uncovering Unknown Unknowns in Financial Services Big Data by Unsupervised Methodologies: Present and Future trends|url=http://proceedings.mlr.press/v71/shabat18a.html|journal=KDD 2017 Workshop on Anomaly Detection in Finance|language=en|volume=71|pages=8–19}}</ref> image inpainting,<ref name="Gepshtein" /> revealing brain resting state networks organization<ref name="Margulies_et_al_2016">{{cite journal
| last1 = Margulies
| first1 = Daniel S.
| last2 = Ghosh
| first2 = Satrajit S.
| last3 = Goulas
| first3 = Alexandros
| last4 = Falkiewicz
| first4 = Marcel
| last5 = Huntenburg
| first5 = Julia M.
| last6 = Langs
| first6 = Georg
| last7 = Bezgin
| first7 = Gleb
| last8 = Eickhoff
| first8 = Simon B.
| last9 = Castellanos
| first9 = F. Xavier
| last10 = Petrides
| first10 = Michael
| last11 = Jefferies
| first11 = Elizabeth
| last12 = Smallwood
| first12 = Jonathan
| title = Situating the default-mode network along a principal gradient of macroscale cortical organization
| journal = Proceedings of the National Academy of Sciences
| pages = 12574–12579
| volume = 113
| issue = 44
| year = 2016
| doi = 10.1073/pnas.1608282113
| pmid = 27791099
| pmc = 5098630
| bibcode = 2016PNAS..11312574M
| doi-access = free
}}</ref> and so on.
 
Furthermore, the diffusion maps framework has been productively extended to [[complex networks]],<ref>{{cite journal
| last1 = De Domenico
| first1 = Manlio
| url = https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.118.168301
| title = Diffusion geometry unravels the emergence of functional clusters in collective phenomena
| journal = Physical Review Letters
| volume = 118
| issue = 16
| pages = 168301
| year = 2017
| doi = 10.1103/PhysRevLett.118.168301
| pmid = 28474920
| arxiv = 1704.07068
| bibcode = 2017PhRvL.118p8301D
| s2cid = 2638868
}}</ref> revealing a functional organisation of networks which differs from the purely topological or structural one.
 
==See also==
* [[Nonlinear dimensionality reduction]]

==References==
{{reflist|refs=
<ref name="Nadler05diffusionmaps">{{cite journal
| last1 = Nadler
| first1 = Boaz
| last2 = Lafon
| first2 = Stéphane
| last3 = Coifman
| first3 = Ronald R.
| last4 = Kevrekidis
| first4 = Ioannis G.
| year = 2005
| title = Diffusion Maps, Spectral Clustering and Eigenfunctions of Fokker–Planck Operators
| url = http://www.wisdom.weizmann.ac.il/~nadler/Publications/dm_nips05.pdf
| journal = Advances in Neural Information Processing Systems
| volume = 18
| bibcode = 2005math......6090N
| arxiv = math/0506090
}}</ref>
 
<ref name="Farbman">{{cite journal
| last1 = Farbman
| first1 = Zeev
| last2 = Fattal
| first2 = Raanan
| last3 = Lischinski
| first3 = Dani
| year = 2010
| title = Diffusion maps for edge-aware image editing
| journal = ACM Trans. Graph.
| url = http://doi.acm.org/10.1145/1882261.1866171
| volume = 29
| issue = 6
| pages = 145:1–145:10
| doi = 10.1145/1882261.1866171
}}</ref>

<ref name="PNAS1">{{cite journal
| last1 = Coifman
| first1 = R. R.
| last2 = Lafon
| first2 = S.
| last3 = Lee
| first3 = A. B.
| last4 = Maggioni
| first4 = M.
| last5 = Nadler
| first5 = B.
| last6 = Warner
| first6 = F.
| last7 = Zucker
| first7 = S. W.
| year = 2005
| title = Geometric diffusions as a tool for harmonic analysis and structure definition of data: Diffusion maps
| url = http://www.pnas.org/content/102/21/7426.long
| journal = PNAS
| doi = 10.1073/pnas.0500334102
| volume = 102
| issue = 21
| pages = 7426–7431
| pmid=15899970
| pmc=1140422
| bibcode = 2005PNAS..102.7426C
| doi-access = free
}}</ref>
 
<ref name="PNAS2">{{cite journal
| last1 = Coifman
| first1 = R. R.
| last2 = Lafon
| first2 = S.
| last3 = Lee
| first3 = A. B.
| last4 = Maggioni
| first4 = M.
| last5 = Nadler
| first5 = B.
| last6 = Warner
| first6 = F.
| last7 = Zucker
| first7 = S. W.
| year = 2005
| title = Geometric diffusions as a tool for harmonic analysis and structure definition of data: Multiscale methods
| url = http://www.pnas.org/content/102/21/7432.full
| journal = PNAS
| doi = 10.1073/pnas.0500896102
| volume = 102
| issue = 21
| pages = 7432–7437
| pmid=15899969
| pmc=1140426
| bibcode = 2005PNAS..102.7432C
| doi-access = free
}}</ref>
 
<ref name="DifussionMap">{{cite journal
| last1 = Coifman
| first1 = Ronald R.
| last2 = Lafon
| first2 = Stéphane
| year = 2006
| title = Diffusion maps
| url = http://www.sciencedirect.com/science/article/pii/S1063520306000546
| journal = Applied and Computational Harmonic Analysis
| doi = 10.1016/j.acha.2006.04.006
| volume = 21
| pages = 5–30
| s2cid = 17160669
}}</ref>
 
<ref name="Diffusion">{{cite thesis
| last = Lafon
| first = Stéphane
| year = 2004
| title = Diffusion Maps and Geometric Harmonics
| degree = PhD
| publisher = Yale University
}}</ref>
 
<ref name="Introduction">{{cite journal
| last1 = De la Porte
| first1 = J.
| last2 = Herbst
| first2 = B M
| last3 = Hereman
| first3 = W
| last4 = Van der Walt
| first4 = S J
| year = 2008
| title = An Introduction to Diffusion Maps
| journal = Proceedings of the Nineteenth Annual Symposium of the Pattern Recognition Association of South Africa (PRASA)
| citeseerx = 10.1.1.309.674
| url = http://www.prasa.org/index.php/2012-03-07-10-55-15
}}</ref>
 
<ref name="sidi11">{{cite conference
| last1 = Sidi
| first1 = Oana
| first2 = Oliver
| last2 = van Kaick
| first3 = Yanir
| last3 = Kleiman
| first4 = Hao
| last4 = Zhang
| first5 = Daniel
| last5 = Cohen-Or
| year = 2011
| title = Unsupervised Co-Segmentation of a Set of Shapes via Descriptor-Space Spectral Clustering
| conference = SIGGRAPH Asia
}}</ref>
 
<ref name="speakerid2011">{{cite journalconference
| lastlast1 = Michalevsky
| firstfirst1 = Yan
| first2 = Ronen
| last2 = Talmon
| first3 = Israel
| last3 = Cohen
| year = 2011
| title = Speaker Identification Using Diffusion Maps
| conference = European Signal Processing Conference
}}</ref>
 
<ref name="Mishne">{{cite journal
| last1 = Mishne
| first1 = Gal
| first2 = Israel
| last2 = Cohen
| year = 2013
| title = Multiscale Anomaly Detection Using Diffusion Maps
| journal = IEEE Journal of Selected Topics in Signal Processing
| volume = 7
| issue = 1
| pages = 111–123
| doi = 10.1109/jstsp.2012.2232279
| bibcode = 2013ISTSP...7..111M
| s2cid = 1954466
}}</ref>
 
<ref name="Gepshtein">{{cite journal
| last1 = Gepshtein
| first1 = Shai
| first2 = Yosi
| last2 = Keller
| year = 2013
| title = Image Completion by Diffusion Maps and Spectral Relaxation
| journal = IEEE Transactions on Image Processing
| volume = 22
| issue = 8
| pages = 2983–2994
| doi = 10.1109/tip.2013.2237916
| pmid = 23322762
| bibcode = 2013ITIP...22.2983G
| s2cid = 14375333
}}</ref>
}}