m Fixed the Error equation |
Line 115:
:<math>\mathbf{T}_L = \mathbf{X} \mathbf{W}_L</math>
where the matrix '''T'''<sub>L</sub> now has ''n'' rows but only ''L'' columns. In other words, PCA learns a linear transformation <math> t = W_L^\mathsf{T} x, x \in \mathbb{R}^p, t \in \mathbb{R}^L,</math> where the columns of {{math|''p'' × ''L''}} matrix <math>W_L</math> form an orthogonal basis for the ''L'' features (the components of representation ''t'') that are decorrelated.<ref>{{Cite journal |author=Bengio, Y.|year=2013|title=Representation Learning: A Review and New Perspectives |journal=IEEE Transactions on Pattern Analysis and Machine Intelligence |volume=35 |issue=8 |pages=1798–1828 |doi=10.1109/TPAMI.2013.50|pmid=23787338|display-authors=etal|arxiv=1206.5538|s2cid=393948}}</ref> By construction, of all the transformed data matrices with only ''L'' columns, this score matrix maximises the variance in the original data that has been preserved, while minimising the total squared reconstruction error <math>\
[[File:PCA of Haplogroup J using 37 STRs.png|thumb|right|A principal components analysis scatterplot of [[Y-STR]] [[haplotype]]s calculated from repeat-count values for 37 Y-chromosomal STR markers from 354 individuals.<br /> PCA has successfully found linear combinations of the markers that separate out different clusters corresponding to different lines of individuals' Y-chromosomal genetic descent.]]
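The truncated transformation <math>\mathbf{T}_L = \mathbf{X} \mathbf{W}_L</math> described above can be illustrated with a minimal NumPy sketch. It is not a reference implementation: the function name <code>truncated_pca</code>, the random test data, and the choice ''L'' = 2 are hypothetical, and the reconstruction error is taken here to be the squared Frobenius norm of the difference between the centred data and its rank-''L'' reconstruction, which is an assumption of this example.

```python
import numpy as np

def truncated_pca(X, L):
    """Sketch: keep the first L principal components of X (rows are observations)."""
    Xc = X - X.mean(axis=0)                  # centre each column of the data
    # The right singular vectors of the centred data are the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W_L = Vt[:L].T                           # p x L matrix with orthonormal columns
    T_L = Xc @ W_L                           # n x L score matrix, T_L = X W_L
    return T_L, W_L, Xc

# Hypothetical correlated data: n = 200 observations, p = 5 variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))

T_L, W_L, Xc = truncated_pca(X, L=2)

gram = W_L.T @ W_L                           # should be close to the 2 x 2 identity
cov_T = T_L.T @ T_L / (X.shape[0] - 1)       # covariance of the scores (near-diagonal)
X_L = T_L @ W_L.T                            # rank-L reconstruction of the centred data
err = np.sum((Xc - X_L) ** 2)                # assumed error: squared Frobenius norm

print(T_L.shape)                             # n rows, L columns
print(np.round(cov_T, 6))
print(err)
```

In this sketch the off-diagonal entries of <code>cov_T</code> are numerically zero, reflecting the decorrelated features, and <code>err</code> equals the sum of the squared singular values that were discarded, which is the smallest error achievable by any rank-''L'' approximation of the centred data.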