Principal component analysis

:<math>\mathbf{T}_L = \mathbf{X} \mathbf{W}_L</math>
 
where the matrix '''T'''<sub>L</sub> now has ''n'' rows but only ''L'' columns. In other words, PCA learns a linear transformation <math> t = W_L^\mathsf{T} x, x \in \mathbb{R}^p, t \in \mathbb{R}^L,</math> where the columns of {{math|''p'' × ''L''}} matrix <math>W_L</math> form an orthogonal basis for the ''L'' features (the components of representation ''t'') that are decorrelated.<ref>{{Cite journal |author=Bengio, Y.|year=2013|title=Representation Learning: A Review and New Perspectives |journal=IEEE Transactions on Pattern Analysis and Machine Intelligence |volume=35 |issue=8 |pages=1798–1828 |doi=10.1109/TPAMI.2013.50|pmid=23787338|display-authors=etal|arxiv=1206.5538|s2cid=393948}}</ref> By construction, of all the transformed data matrices with only ''L'' columns, this score matrix maximises the variance in the original data that has been preserved, while minimising the total squared reconstruction error <math>\|\mathbf{T}\mathbf{W}^\mathsf{T} - \mathbf{T}_L\mathbf{W}^\mathsf{T}_L\|_2^2</math> or <math>\|\mathbf{X} - \mathbf{X}_L\|_2^2</math>.<ref>{{Cite book |author1=Holmes, M. |title=Introduction to Scientific Computing and Data Analysis, 2nd Ed |year=2023 |publisher=Springer |isbn=978-3-031-22429-4}}</ref>
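The construction above can be sketched numerically. The following is an illustrative example (not part of the article; the data and variable names are invented for the demonstration) that builds the truncated loading matrix <math>W_L</math> from the singular value decomposition of a centred data matrix, forms the score matrix <math>\mathbf{T}_L = \mathbf{X} \mathbf{W}_L</math>, and checks that the reconstruction error <math>\|\mathbf{X} - \mathbf{X}_L\|_2^2</math> equals the sum of the discarded squared singular values:

```python
import numpy as np

# Synthetic data: n = 100 observations of p = 5 variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
X = X - X.mean(axis=0)            # PCA assumes column-centred data

L = 2                             # number of components to keep
U, s, Vt = np.linalg.svd(X, full_matrices=False)

W_L = Vt[:L].T                    # p x L matrix of leading principal directions
T_L = X @ W_L                     # n x L score matrix  T_L = X W_L
X_L = T_L @ W_L.T                 # rank-L reconstruction of X

# The squared reconstruction error ||X - X_L||^2 equals the sum of
# the discarded squared singular values (Eckart-Young theorem).
err = np.sum((X - X_L) ** 2)
assert np.isclose(err, np.sum(s[L:] ** 2))
```

Because the columns of <math>W_L</math> are orthonormal rows of <math>V^\mathsf{T}</math>, no other rank-''L'' projection yields a smaller value of `err`.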
 
[[File:PCA of Haplogroup J using 37 STRs.png|thumb|right|A principal components analysis scatterplot of [[Y-STR]] [[haplotype]]s calculated from repeat-count values for 37 Y-chromosomal STR markers from 354 individuals.<br /> PCA has successfully found linear combinations of the markers that separate out different clusters corresponding to different lines of individuals' Y-chromosomal genetic descent.]]