T-distributed stochastic neighbor embedding
and note that <math>p_{ij} = p_{ji}</math>, <math>p_{ii} = 0 </math>, and <math>\sum_{i, j} p_{ij} = 1</math>.
 
The bandwidth of the [[Gaussian kernel]]s <math>\sigma_i</math> is set such that the [[Entropy (information theory)|entropy]] of the conditional distribution equals a predefined value, using the [[bisection method]]. As a result, the bandwidth is adapted to the [[density]] of the data: smaller values of <math>\sigma_i</math> are used in denser parts of the data space.
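The bisection search for <math>\sigma_i</math> can be sketched as follows. This is an illustrative sketch, not the reference implementation: the function names, the search interval, and the tolerance are assumptions, and the target entropy is taken as the logarithm of the desired perplexity.

```python
import numpy as np

def conditional_entropy(sq_dists, sigma):
    """Shannon entropy (in nats) of the conditional distribution p_{j|i}
    for one point, given its squared distances to all other points."""
    p = np.exp(-sq_dists / (2.0 * sigma ** 2))
    p /= p.sum()
    p = np.maximum(p, 1e-12)  # guard against log(0)
    return -np.sum(p * np.log(p))

def find_sigma(sq_dists, target_entropy, lo=1e-3, hi=1e3,
               tol=1e-5, max_iter=100):
    """Bisection search for the bandwidth sigma_i whose conditional
    distribution has the requested entropy. Entropy grows monotonically
    with sigma (wider kernels are closer to uniform), so bisection
    converges."""
    mid = 0.5 * (lo + hi)
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        h = conditional_entropy(sq_dists, mid)
        if abs(h - target_entropy) < tol:
            break
        if h > target_entropy:
            hi = mid  # distribution too flat: shrink the bandwidth
        else:
            lo = mid  # distribution too peaked: widen the bandwidth
    return mid
```

In this formulation a user-chosen perplexity <math>P</math> corresponds to the entropy target <math>\log P</math>, so denser neighborhoods (small distances) yield smaller <math>\sigma_i</math>, as described above.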
 
Since the Gaussian kernel uses the Euclidean distance <math>\lVert x_i-x_j \rVert</math>, it is affected by the [[curse of dimensionality]]: in high-dimensional data, where distances lose the ability to discriminate, the <math>p_{ij}</math> become too similar (asymptotically, they would converge to a constant). It has been proposed to adjust the distances with a power transform, based on the [[intrinsic dimension]] of each point, to alleviate this.<ref>{{Cite conference|last1=Schubert|first1=Erich|last2=Gertz|first2=Michael|date=2017-10-04|title=Intrinsic t-Stochastic Neighbor Embedding for Visualization and Outlier Detection|conference=SISAP 2017 – 10th International Conference on Similarity Search and Applications|pages=188–203|doi=10.1007/978-3-319-68474-1_13}}</ref>