Revision as of 18:44, 5 January 2025 by Mrwojo (minor edit; →Details: added 2 wikilinks, capitalized "Shannon")
Revision as of 03:13, 22 April 2025 by 130.102.13.190 (Fixed dead reference link)
{{lowercase title}}
{{Data Visualization}}
'''t-distributed stochastic neighbor embedding''' ('''t-SNE''') is a [[statistical]] method for visualizing high-dimensional data by giving each datapoint a location in a two- or three-dimensional map. It is based on Stochastic Neighbor Embedding, originally developed by [[Geoffrey Hinton]] and Sam Roweis,<ref name="SNE">{{cite conference |date=January 2002 |title=Stochastic neighbor embedding |url=https://papers.nips.cc/paper_files/paper/2002/file/6150ccc6069bea6b5716254057a194ef-Paper.pdf |conference=[[Neural Information Processing Systems]] |author1-last=Hinton |author1-first=Geoffrey |author2-last=Roweis |author2-first=Sam}}</ref> for which Laurens van der Maaten and Hinton proposed the [[Student's t-distribution|''t''-distributed]] variant.<ref name=MaatenHinton>{{cite journal|last=van der Maaten|first=L.J.P.|author2=Hinton, G.E. |title=Visualizing Data Using t-SNE|journal=Journal of Machine Learning Research |volume=9|date=Nov 2008|pages=2579–2605|url=http://jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf}}</ref>

It is a [[nonlinear dimensionality reduction]] technique for embedding high-dimensional data for visualization in a low-dimensional space of two or three dimensions. Specifically, it models each high-dimensional object by a two- or three-dimensional point in such a way that similar objects are modeled by nearby points and dissimilar objects are modeled by distant points with high probability.

The t-SNE algorithm comprises two main stages.
First, t-SNE constructs a [[probability distribution]] over pairs of high-dimensional objects in such a way that similar objects are assigned a higher probability while dissimilar objects are assigned a lower probability. Second, t-SNE defines a similar probability distribution over the points in the low-dimensional map, and it minimizes the [[Kullback–Leibler divergence]] (KL divergence) between the two distributions with respect to the locations of the points in the map. While the original algorithm uses the [[Euclidean distance]] between objects as the base of its similarity metric, another distance measure can be substituted as appropriate. A [[Riemannian metric|Riemannian]] variant is [[Uniform manifold approximation and projection|UMAP]].
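
The two stages can be illustrated with a minimal NumPy sketch. It is not the reference implementation: it assumes a single fixed Gaussian bandwidth <code>sigma</code> for every point (the published method tunes one bandwidth per point from a target perplexity) and uses plain gradient descent without momentum or early exaggeration. The function names, the parameter defaults, and the learning rate are hypothetical choices made for illustration only.

```python
import numpy as np


def pairwise_sq_dists(X):
    """Squared Euclidean distances between all rows of X."""
    s = np.sum(X ** 2, axis=1)
    D = s[:, None] + s[None, :] - 2.0 * (X @ X.T)
    return np.maximum(D, 0.0)  # clip tiny negatives from round-off


def high_dim_affinities(X, sigma=1.0):
    """Stage 1: joint distribution P over pairs of high-dimensional objects.

    Assumption: one fixed bandwidth `sigma` for all points, instead of the
    per-point, perplexity-based bandwidths of the published method.
    """
    W = np.exp(-pairwise_sq_dists(X) / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)                  # a point is not its own neighbor
    cond = W / W.sum(axis=1, keepdims=True)   # conditional p_{j|i}
    return (cond + cond.T) / (2.0 * X.shape[0])  # symmetrize; sums to 1


def low_dim_affinities(Y):
    """Stage 2: Student-t (one degree of freedom) distribution Q over the map."""
    W = 1.0 / (1.0 + pairwise_sq_dists(Y))
    np.fill_diagonal(W, 0.0)
    return W / W.sum(), W


def kl_divergence(P, Q, eps=1e-12):
    """KL(P || Q), summed over pairs with nonzero p_ij."""
    mask = P > 0
    return float(np.sum(P[mask] * np.log(P[mask] / np.maximum(Q[mask], eps))))


def tsne_sketch(X, n_components=2, sigma=1.0, lr=1.0, n_iter=300, seed=0):
    """Minimize KL(P || Q) over map locations Y by plain gradient descent."""
    rng = np.random.default_rng(seed)
    P = high_dim_affinities(X, sigma)
    Y = rng.normal(scale=1e-2, size=(X.shape[0], n_components))
    history = []
    for _ in range(n_iter):
        Q, W = low_dim_affinities(Y)
        history.append(kl_divergence(P, Q))
        # Gradient: dC/dy_i = 4 * sum_j (p_ij - q_ij) * w_ij * (y_i - y_j)
        A = (P - Q) * W
        grad = 4.0 * (np.diag(A.sum(axis=1)) - A) @ Y
        Y = Y - lr * grad
    return Y, history
```

Calling <code>tsne_sketch(X)</code> on an array with one row per object returns the map coordinates together with the KL divergence recorded at each iteration, which makes it easy to check that the objective is decreasing. Replacing <code>pairwise_sq_dists</code> in the first stage with another distance measure corresponds to the change of base metric mentioned above.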

T-distributed stochastic neighbor embedding: Difference between revisions