T-distributed stochastic neighbor embedding
and note that <math>p_{ij} = p_{ji}</math>, <math>p_{ii} = 0 </math>, and <math>\sum_{i, j} p_{ij} = 1</math>.
 
The bandwidth of the [[Gaussian kernel]]s <math>\sigma_i</math> is set such that the [[Entropy (information theory)|entropy]] of the conditional distribution equals a predefined value, using the [[bisection method]]. As a result, the bandwidth is adapted to the [[density]] of the data: smaller values of <math>\sigma_i</math> are used in denser parts of the data space.
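The bisection search for <math>\sigma_i</math> can be sketched as follows. This is an illustrative sketch, not the reference implementation: the function names, the search interval, and the tolerance are assumptions, and the target entropy is taken as the logarithm of the desired perplexity.

```python
import numpy as np

def conditional_entropy(sq_dists, sigma):
    """Shannon entropy (in nats) of the conditional distribution p_{j|i}
    for one point, given its squared distances to all other points."""
    p = np.exp(-sq_dists / (2.0 * sigma ** 2))
    p /= p.sum()
    p = np.maximum(p, 1e-12)  # guard against log(0)
    return -np.sum(p * np.log(p))

def find_sigma(sq_dists, target_entropy, lo=1e-3, hi=1e3,
               tol=1e-5, max_iter=100):
    """Bisection search for the bandwidth sigma_i whose conditional
    distribution has the requested entropy. Entropy grows monotonically
    with sigma (wider kernels are closer to uniform), so bisection
    converges."""
    mid = 0.5 * (lo + hi)
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        h = conditional_entropy(sq_dists, mid)
        if abs(h - target_entropy) < tol:
            break
        if h > target_entropy:
            hi = mid  # distribution too flat: shrink the bandwidth
        else:
            lo = mid  # distribution too peaked: widen the bandwidth
    return mid
```

In this formulation a user-chosen perplexity <math>P</math> corresponds to the entropy target <math>\log P</math>, so denser neighborhoods (small distances) yield smaller <math>\sigma_i</math>, as described above.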
 
Since the Gaussian kernel uses the Euclidean distance <math>\lVert x_i-x_j \rVert</math>, it is affected by the [[curse of dimensionality]]: in high-dimensional data, where distances lose the ability to discriminate, the <math>p_{ij}</math> become too similar (asymptotically, they would converge to a constant). It has been proposed to adjust the distances with a power transform, based on the [[intrinsic dimension]] of each point, to alleviate this.<ref>{{Cite conference|last1=Schubert|first1=Erich|last2=Gertz|first2=Michael|date=2017-10-04|title=Intrinsic t-Stochastic Neighbor Embedding for Visualization and Outlier Detection|conference=SISAP 2017 – 10th International Conference on Similarity Search and Applications|pages=188–203|doi=10.1007/978-3-319-68474-1_13}}</ref>