Revision as of 01:20, 28 June 2016 by Birjandtalab (talk | contribs; 5 edits): m Fixing a grammatical error. Tag: Visual edit
Revision as of 05:33, 29 June 2016 by Yobot (talk | contribs; Bots; 4,733,870 edits): m WP:CHECKWIKI error fixes using AWB (12041)
Line 5:
The t-SNE algorithm comprises two main stages. First, t-SNE constructs a [[probability distribution]] over pairs of high-dimensional objects in such a way that similar objects have a high probability of being picked, whilst dissimilar objects have an extremely small probability of being picked. Second, t-SNE defines a similar probability distribution over the points in the low-dimensional map, and it minimizes the [[Kullback–Leibler divergence]] between the two distributions with respect to the locations of the points in the map. Note that whilst the original algorithm uses the [[Euclidean distance]] between objects as the base of its similarity metric, a different metric may be substituted where appropriate.
t-SNE has been used in a wide range of applications, including [[computer security]] research,<ref>{{cite journal|last=Gashi|first=I.|author2=Stankovic, V. |author3=Leita, C. |author4=Thonnard, O. |title=An Experimental Study of Diversity with Off-the-shelf AntiVirus Engines|journal=Proceedings of the IEEE International Symposium on Network Computing and Applications|year=2009|pages=4–11}}</ref> [[music analysis]],<ref>{{cite journal|last=Hamel|first=P.|author2=Eck, D. |title=Learning Features from Music Audio with Deep Belief Networks|journal=Proceedings of the International Society for Music Information Retrieval Conference|year=2010|pages=339–344}}</ref> [[cancer research]],<ref>{{cite journal|last=Jamieson|first=A.R.|author2=Giger, M.L. |author3=Drukker, K. |author4=Lui, H. |author5=Yuan, Y. |author6=Bhooshan, N. |title=Exploring Nonlinear Feature Space Dimension Reduction and Data Representation in Breast CADx with Laplacian Eigenmaps and t-SNE|journal=Medical Physics |issue=1|year=2010|pages=339–351|doi=10.1118/1.3267037|volume=37}}</ref> [[bioinformatics]],<ref>{{cite journal|last=Wallach|first=I.|author2=Lilien, R. |title=The Protein-Small-Molecule Database, A Non-Redundant Structural Resource for the Analysis of Protein-Ligand Binding|journal=Bioinformatics |year=2009|pages=615–620|doi=10.1093/bioinformatics/btp035|volume=25|issue=5}}</ref> and biomedical signal processing.<ref>{{Cite journal|last=Birjandtalab|first=J.|last2=Pouyan|first2=M. B.|last3=Nourani|first3=M.|date=2016-02-01|title=Nonlinear dimension reduction for EEG-based epileptic seizure detection|url=http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=7455968|journal=2016 IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI)|pages=595–598|doi=10.1109/BHI.2016.7455968}}</ref>
== Details ==
Line 22:
Herein a heavy-tailed [[Student-t distribution]] (with one degree of freedom, which is the same as a [[Cauchy distribution]]) is used to measure similarities between low-dimensional points in order to allow dissimilar objects to be modeled far apart in the map.
The locations of the points <math>\mathbf{y}_i</math> in the map are determined by minimizing the (non-symmetric) [[Kullback–Leibler divergence]] of the distribution <math>Q</math> from the distribution <math>P</math>, that is:
: <math>KL(P||Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}}</math>
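As a rough illustration of the objective above, the following sketch evaluates the Kullback–Leibler cost for a given set of map points. It assumes the usual t-SNE definition of the low-dimensional similarities, <math>q_{ij} \propto (1 + \lVert \mathbf{y}_i - \mathbf{y}_j \rVert^2)^{-1}</math> normalised over all pairs <math>i \neq j</math>, which is not spelled out in the excerpt shown here. The function names and the toy values of <code>Y</code> and <code>P</code> are hypothetical, and <code>P</code> is simply taken as given rather than computed from high-dimensional data.

```python
import math


def student_t_similarities(Y):
    """Return q_ij for map points Y using a Student-t kernel with one degree of freedom.

    Assumed form (standard t-SNE, not quoted from the excerpt above):
    q_ij = (1 + ||y_i - y_j||^2)^-1 / sum over k != l of (1 + ||y_k - y_l||^2)^-1
    """
    n = len(Y)
    # Unnormalised kernel values; the diagonal is left at 0 because i != j.
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(Y[i], Y[j]))
                W[i][j] = 1.0 / (1.0 + d2)
    total = sum(map(sum, W))
    return [[w / total for w in row] for row in W]


def kl_divergence(P, Q):
    """KL(P||Q) = sum over i != j of p_ij * log(p_ij / q_ij).

    Pairs with p_ij = 0 are skipped, since they contribute nothing to the sum.
    """
    n = len(P)
    return sum(
        P[i][j] * math.log(P[i][j] / Q[i][j])
        for i in range(n)
        for j in range(n)
        if i != j and P[i][j] > 0.0
    )


# Hypothetical toy input: three map points in 2-D and a made-up symmetric P
# whose off-diagonal entries sum to 1.
Y = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
P = [[0.0, 0.4, 0.05], [0.4, 0.0, 0.05], [0.05, 0.05, 0.0]]

Q = student_t_similarities(Y)
cost = kl_divergence(P, Q)
```

A full implementation would move the points in <code>Y</code> to reduce <code>cost</code>, typically by gradient descent; that optimisation step is deliberately left out of this sketch.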

T-distributed stochastic neighbor embedding: Difference between revisions