T-distributed stochastic neighbor embedding: Difference between revisions

{{lower case title}}
'''t-distributed stochastic neighbor embedding (t-SNE)''' is a [[machine learning]] algorithm for [[dimensionality reduction]] developed by Laurens van der Maaten and [[Geoffrey Hinton]].<ref>{{cite journal|last=van der Maaten|first=L.J.P.|author2=Hinton, G.E. |title=Visualizing High-Dimensional Data Using t-SNE|journal=Journal of Machine Learning Research 9|date=Nov 2008|pages=2579–2605|url=http://jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf}}</ref> It is a [[nonlinear dimensionality reduction]] technique that is particularly well suited for embedding high-dimensional data into a space of two or three dimensions, which can then be visualized in a scatter plot. Specifically, it models each high-dimensional object by a two- or three-dimensional point in such a way that similar objects are modeled by nearby points and dissimilar objects are modeled by distant points.
 
The t-SNE algorithm comprises two main stages. First, t-SNE constructs a [[probability distribution]] over pairs of high-dimensional objects in such a way that similar objects have a high probability of being picked, whilst dissimilar objects have an [[infinitesimal]] probability of being picked. Second, t-SNE defines a similar probability distribution over the points in the low-dimensional map, and it minimizes the [[Kullback–Leibler divergence]] between the two distributions with respect to the locations of the points in the map.
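
The two stages can be illustrated with a minimal Python sketch. It is not the reference implementation: it assumes a Gaussian kernel with a single fixed bandwidth <code>sigma</code> for the high-dimensional similarities (the published method instead tunes a bandwidth per object from a user-chosen perplexity), uses a Student-t kernel with one degree of freedom for the map, and only evaluates the Kullback–Leibler divergence for a given map rather than minimizing it. All function and variable names are hypothetical.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length coordinate tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def high_dim_affinities(points, sigma=1.0):
    """Stage 1: symmetric pair probabilities P from Gaussian similarities.

    Assumption: one fixed sigma for every object (a simplification).
    """
    n = len(points)
    cond = []
    for i in range(n):
        # Unnormalized similarity of object j to object i; an object is
        # never its own neighbor.
        w = [0.0 if j == i else
             math.exp(-sq_dist(points[i], points[j]) / (2.0 * sigma ** 2))
             for j in range(n)]
        total = sum(w)
        cond.append([v / total for v in w])
    # Symmetrize the conditional probabilities so that all entries sum to 1.
    return [[(cond[i][j] + cond[j][i]) / (2.0 * n) for j in range(n)]
            for i in range(n)]


def low_dim_affinities(map_points):
    """Stage 2: pair probabilities Q in the map from a Student-t kernel."""
    n = len(map_points)
    w = [[0.0 if i == j else
          1.0 / (1.0 + sq_dist(map_points[i], map_points[j]))
          for j in range(n)] for i in range(n)]
    total = sum(sum(row) for row in w)
    return [[v / total for v in row] for row in w]


def kl_divergence(p, q):
    """KL(P || Q) summed over all pairs with nonzero P."""
    return sum(pij * math.log(pij / qij)
               for p_row, q_row in zip(p, q)
               for pij, qij in zip(p_row, q_row)
               if pij > 0.0)


# Two tight clusters in three dimensions (made-up illustrative data).
data = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (5.0, 5.0, 5.0), (5.1, 5.0, 5.0)]
# A map that keeps each cluster together, and one that splits both clusters.
good_map = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
bad_map = [(0.0, 0.0), (5.0, 5.0), (0.1, 0.0), (5.1, 5.0)]

P = high_dim_affinities(data)
kl_good = kl_divergence(P, low_dim_affinities(good_map))
kl_bad = kl_divergence(P, low_dim_affinities(bad_map))
print("KL for neighbor-preserving map:", kl_good)
print("KL for scrambled map:", kl_bad)
```

In this toy example the map that keeps neighbors together yields the smaller divergence. A full implementation would start from a random map and move the map points along the gradient of this divergence until it stops decreasing.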
 
t-SNE has been used in a wide range of applications, including [[computer security]] research,<ref>{{cite journal|last=Gashi|first=I.|coauthors=Stankovic, V., Leita, C., Thonnard, O.|title=An Experimental Study of Diversity with Off-the-shelf AntiVirus Engines|journal=Proceedings of the IEEE International Symposium on Network Computing and Applications|year=2009|pages=4–11}}</ref> [[music analysis]],<ref>{{cite journal|last=Hamel|first=P.|author2=Eck, D. |title=Learning Features from Music Audio with Deep Belief Networks|journal=Proceedings of the International Society for Music Information Retrieval Conference|year=2010|pages=339–344}}</ref> [[cancer research]],<ref>{{cite journal|last=Jamieson|first=A.R.|coauthors=Giger, M.L., Drukker, K., Lui, H., Yuan, Y., Bhooshan, N.|title=Exploring Nonlinear Feature Space Dimension Reduction and Data Representation in Breast CADx with Laplacian Eigenmaps and t-SNE|journal=Medical Physics 37(1)|year=2010|pages=339–351|doi=10.1118/1.3267037|volume=37}}</ref> and [[bio-informatics]].<ref>{{cite journal|last=Wallach|first=I.|author2=Lilien, R. |title=The Protein-Small-Molecule Database, A Non-Redundant Structural Resource for the Analysis of Protein-Ligand Binding|journal=Bioinformatics 25(5)|year=2009|pages=615–620|doi=10.1093/bioinformatics/btp035|volume=25|issue=5}}</ref>
 
== Details ==