Partial least squares regression: Difference between revisions

Monkbot (talk | contribs)
m Task 18 (cosmetic): eval 33 templates: del empty params (1×); hyphenate params (1×);
WikiCleanerBot (talk | contribs)
m v2.05b - Bot T20 CW#61 - Fix errors for CW project (Reference before punctuation)
 
(74 intermediate revisions by 27 users not shown)
Line 1:
{{Short description|Statistical method}}
{{Regression bar}}
'''Partial least squares regression''' ('''PLS regression''') is a [[statistics|statistical]] method that bears some relation to [[principal component regression|principal components regression]] and is a [[reduced rank regression]];<ref>{{cite book | url=https://books.google.com/books?id=GmnpCAAAQBAJ&pg=PA2 | title=Reduced Rank Regression: With Applications to Quantitative Structure-Activity Relationships | isbn=978-3-642-50015-2 | last1=Schmidli | first1=Heinz | date=13 March 2013 | publisher=Springer }}</ref> instead of finding [[hyperplane]]s of maximum [[variance]] between the response and independent variables, it finds a [[linear regression]] model by projecting the [[predicted variable]]s and the [[observable variable]]s to a new space of maximum covariance (see below). Because both the ''X'' and ''Y'' data are projected to new spaces, the PLS family of methods is known as bilinear factor models. Partial least squares discriminant analysis (PLS-DA) is a variant used when ''Y'' is categorical.
 
PLS is used to find the fundamental relations between two [[matrix (mathematics)|matrices]] (''X'' and ''Y''), i.e. a [[latent variable]] approach to modeling the [[covariance]] structures in these two spaces. A PLS model will try to find the multidimensional direction in the ''X'' space that explains the maximum multidimensional variance direction in the ''Y'' space. PLS regression is particularly suited when the matrix of predictors has more variables than observations, and when there is [[multicollinearity]] among ''X'' values. By contrast, standard regression will fail in these cases (unless it is [[Tikhonov regularization|regularized]]).
 
Partial least squares was introduced by the Swedish statistician [[Herman Wold|Herman O. A. Wold]], who then developed it with his son, Svante Wold. An alternative term for PLS is '''''projection to latent structures''''',<ref name="wold_2001">{{cite journal |last1=Wold |first1=S |last2=Sjöström |first2=M. |last3=Eriksson |first3=L. |title=PLS-regression: a basic tool of chemometrics |journal=Chemometrics and Intelligent Laboratory Systems |volume=58 |issue=2 |pages=109–130 |year=2001 |doi=10.1016/S0169-7439(01)00155-1 |s2cid=11920190 }}</ref><ref>{{cite journal |last1=Abdi |first1=Hervé |title=Partial least squares regression and projection on latent structure regression (PLS Regression) |journal=WIREs Computational Statistics |date= 2010 |volume=2 |pages=97–106 |doi=10.1002/wics.51 |s2cid=122685021 |url=https://wires.onlinelibrary.wiley.com/doi/epdf/10.1002/wics.51}}</ref> but the term ''partial least squares'' is still dominant in many areas. Although the original applications were in the social sciences, PLS regression is today most widely used in [[chemometrics]] and related areas. It is also used in [[bioinformatics]], [[sensometrics]], [[neuroscience]], and [[anthropology]].
 
== Core idea ==
[[Image:Core Idea PLS.png|thumb|450px|Core Idea of PLS. The loading vectors <math>\vec{p}_1, \vec{q}_1</math> in the input and output space are drawn in red (not normalized for better visibility). When <math>x_1</math> increases (independent of <math>x_2</math>), <math>y_1</math> and <math>y_2</math> increase.]]
We are given a sample of <math>n</math> [[paired data|paired]] observations <math>(\vec{x}_i, \vec{y}_i), i \in \{1,\ldots,n\}</math>.
In the first step <math>j=1</math>, partial least squares regression searches for the normalized directions <math>\vec{p}_j</math>, <math>\vec{q}_j</math> that maximize the covariance<ref>See lecture https://www.youtube.com/watch?v=Px2otK2nZ1c&t=46s</ref>
 
: <math>\max_{\vec{p}_j, \vec{q}_j} \operatorname E [\underbrace{(\vec{p}_j\cdot \vec{X})}_{t_j} \underbrace{(\vec{q}_j\cdot \vec{Y})}_{u_j} ]. </math>
 
Note that the algorithm below is expressed in matrix notation.
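The first step can be illustrated with a small sketch. It assumes that the expectation is replaced by the sample covariance of centred data, and it finds the maximizing unit directions by alternating multiplications with the cross-product matrix (a power iteration, chosen here for brevity; it is not taken from the cited lecture). The function name <code>first_pls_directions</code>, the helper <code>_unit</code> and the toy data are hypothetical.

```python
import math

def _unit(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]

def first_pls_directions(X, Y, n_iter=100):
    """Return (p, q, cov): unit directions maximising the sample covariance
    of the scores t = Xc p and u = Yc q, where Xc and Yc are centred data."""
    n, m, r = len(X), len(X[0]), len(Y[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = [sum(row[j] for row in Y) / n for j in range(r)]
    Xc = [[row[j] - x_mean[j] for j in range(m)] for row in X]
    Yc = [[row[j] - y_mean[j] for j in range(r)] for row in Y]
    # Cross-product matrix C = Xc^T Yc (m x r); p^T C q / (n - 1) is the
    # sample covariance of the two scores.
    C = [[sum(Xc[i][a] * Yc[i][b] for i in range(n)) for b in range(r)]
         for a in range(m)]
    p = _unit([1.0] * m)  # arbitrary starting direction (assumption)
    for _ in range(n_iter):
        # Alternate q <- C^T p and p <- C q, renormalising each time.
        q = _unit([sum(C[a][b] * p[a] for a in range(m)) for b in range(r)])
        p = _unit([sum(C[a][b] * q[b] for b in range(r)) for a in range(m)])
    cov = sum(p[a] * C[a][b] * q[b]
              for a in range(m) for b in range(r)) / (n - 1)
    return p, q, cov

# Hypothetical toy data: five paired observations, two variables per block.
X = [[1, 2], [2, 1], [3, 5], [4, 3], [5, 7]]
Y = [[1, 3], [4, 3], [2, 8], [6, 7], [4, 12]]
p, q, cov = first_pls_directions(X, Y)
```

Under these assumptions the returned covariance is at least as large as the covariance between any single ''X'' variable and any single ''Y'' variable, since those correspond to axis-aligned choices of the two directions.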
 
==Underlying model==
 
The general underlying model of multivariate PLS with <math>\ell</math> components is
 
:<math>X = T P^\mathrm{T} + E</math>
:<math>Y = U Q^\mathrm{T} + F</math>
 
where
* {{mvar|X}} is an <math>n \times m</math> matrix of predictors
* {{mvar|Y}} is an <math>n \times p</math> matrix of responses
* {{mvar|T}} and {{mvar|U}} are <math>n \times \ell</math> matrices that are, respectively, projections of {{mvar|X}} (the ''X score'', ''component'' or ''factor'' matrix) and projections of {{mvar|Y}} (the ''Y scores'')
* {{mvar|P}} and {{mvar|Q}} are, respectively, <math>m \times \ell</math> and <math>p \times \ell</math> ''loading'' matrices
* {{mvar|E}} and {{mvar|F}} are the error terms, assumed to be independent and identically distributed random normal variables.
 
The decompositions of {{mvar|X}} and {{mvar|Y}} are made so as to maximise the [[covariance]] between {{mvar|T}} and {{mvar|U}}.
 
Note that this covariance is defined pair by pair: the covariance of column ''i'' of {{mvar|T}} (length ''n'') with column ''i'' of {{mvar|U}} (length ''n'') is maximized. Additionally, the covariance of column ''i'' of {{mvar|T}} with column ''j'' of {{mvar|U}} (with <math>i \ne j</math>) is zero.

In PLSR, the loadings are thus chosen so that the scores form an orthogonal basis. This is a major difference from PCA, where orthogonality is imposed on the loadings (and not the scores).
 
==Algorithms==
 
A number of variants of PLS exist for estimating the factor and loading matrices {{mvar|T, U, P}} and {{mvar|Q}}. Most of them construct estimates of the linear regression between {{mvar|X}} and {{mvar|Y}} as <math>Y = X \tilde{B} + \tilde{B}_0</math>. Some PLS algorithms are only appropriate for the case where {{mvar|Y}} is a column vector, while others deal with the general case of a matrix {{mvar|Y}}. Algorithms also differ on whether they estimate the factor matrix {{mvar|T}} as an orthogonal (that is, [[orthonormal matrix|orthonormal]]) matrix or not.<ref>
{{cite journal |last1=Lindgren |first1=F |last2=Geladi |first2=P |last3=Wold |first3=S |title=The kernel algorithm for PLS |journal=J. Chemometrics |volume=7 |pages=45–59 |year=1993 |doi=10.1002/cem.1180070104 |s2cid=122950427 }}</ref><ref>{{cite journal |last1=de Jong |first1=S. |last2=ter Braak |first2=C.J.F. |title=Comments on the PLS kernel algorithm |journal=J. Chemometrics |volume=8 |issue=2 |pages=169–174 |year=1994 |doi=10.1002/cem.1180080208 |s2cid=221549296 }}</ref><ref>{{cite journal |last1=Dayal |first1=B.S. |last2=MacGregor |first2=J.F. |title=Improved PLS algorithms |journal=J. Chemometrics |volume=11 |issue=1 |pages=73–85 |year=1997 |doi=10.1002/(SICI)1099-128X(199701)11:1<73::AID-CEM435>3.0.CO;2-# |s2cid=120753851 }}</ref><ref>{{cite journal |last=de Jong |first=S. |title=SIMPLS: an alternative approach to partial least squares regression |journal=Chemometrics and Intelligent Laboratory Systems |volume=18 |pages=251–263 |year=1993 |doi=10.1016/0169-7439(93)85002-X |issue=3 }}</ref><ref>{{cite journal |last1=Rannar |first1=S. |last2=Lindgren |first2=F. |last3=Geladi |first3=P. |last4=Wold |first4=S. |title=A PLS Kernel Algorithm for Data Sets with Many Variables and Fewer Objects. Part 1: Theory and Algorithm |journal=J. Chemometrics |volume=8 |issue=2 |pages=111–125 |year=1994 |doi=10.1002/cem.1180080204 |s2cid=121613293 }}</ref><ref>{{cite journal |last=Abdi |first=H. |title=Partial least squares regression and projection on latent structure regression (PLS-Regression) |journal=Wiley Interdisciplinary Reviews: Computational Statistics |volume=2 |pages=97–106 |year=2010 |doi=10.1002/wics.51 |s2cid=122685021 }}</ref>
The final prediction will be the same for all these varieties of PLS, but the components will differ.
 
PLS is composed of iteratively repeating the following steps ''k'' times (for ''k'' components):
# finding the directions of maximal covariance in input and output space
# performing least squares regression on the input score
# deflating the input <math>X</math> and/or target <math>Y</math>
 
===PLS1===
 
PLS1 is a widely used algorithm appropriate for the vector {{mvar|Y}} case. It estimates {{mvar|T}} as an orthonormal matrix.
(Caution: the {{mvar|t}} vectors in the code below may not be normalized appropriately; see talk.)
In pseudocode it is expressed below (capital letters are matrices, lower case letters are vectors if they are superscripted and scalars if they are subscripted):
 
1 {{nowrap|'''function''' PLS1({{mvar|X, y, ℓ}})}}
2 {{nowrap|<math>X^{(0)} \gets X</math>}}
3 {{nowrap|<math>w^{(0)} \gets X^\mathrm{T} y/\|X^\mathrm{T}y\|</math>}}, an initial estimate of {{mvar|w}}.
4 {{nowrap|'''for''' <math>k = 0</math> '''to''' <math>\ell-1</math>}}
5 {{nowrap|<math>t^{(k)} \gets X^{(k)}w^{(k)}</math>}}
6 {{nowrap|<math>t_k \gets {t^{(k)}}^\mathrm{T} t^{(k)}</math> (note this is a scalar)}}
Line 35 ⟶ 63:
9 {{nowrap|<math>q_k \gets {y}^\mathrm{T} t^{(k)}</math> (note this is a scalar)}}
10 {{nowrap|'''if''' <math>q_k = 0</math>}}
11 {{nowrap|<math>\ell \gets k</math>, '''break''' the '''for loop'''}}
12 {{nowrap|'''if''' <math>k < (\ell-1)</math>}}
13 {{nowrap|<math>X^{(k+1)} \gets X^{(k)} - t_k t^{(k)} {p^{(k)}}^\mathrm{T}</math>}}
14 {{nowrap|<math>w^{(k+1)} \gets {X^{(k+1)}}^\mathrm{T} y </math>}}
15 {{nowrap|'''end''' '''for'''}}
16 '''define''' {{mvar|W}} to be the matrix {{nowrap|with columns <math>w^{(0)},w^{(1)},\ldots,w^{(\ell-1)}</math>.}}
Do the same to form the {{mvar|P}} matrix and {{mvar|q}} vector.
17 {{nowrap|<math>B \gets W {(P^\mathrm{T} W)}^{-1} q</math>}}
Line 47 ⟶ 75:
 
This form of the algorithm does not require centering of the input {{mvar|X}} and {{mvar|Y}}, as this is performed implicitly by the algorithm.
This algorithm features 'deflation' of the matrix {{mvar|X}} (subtraction of <math>t_k t^{(k)} {p^{(k)}}^\mathrm{T}</math>), but deflation of the vector {{mvar|y}} is not performed, as it is not necessary (it can be proved that deflating {{mvar|y}} yields the same results as not deflating<ref>{{cite journal |last1=Höskuldsson |first1=Agnar |title=PLS Regression Methods |journal=Journal of Chemometrics |date=1988 |volume=2 |issue=3 |page=219 |doi=10.1002/cem.1180020306 |s2cid=120052390 }}</ref>). The user-supplied variable {{mvar|ℓ}} is the limit on the number of latent factors in the regression; if it equals the rank of the matrix {{mvar|X}}, the algorithm will yield the least squares regression estimates for {{mvar|B}} and <math>B_0</math>.
[[File:Deflation-The-geometric-interpretation-of-the-deflation-step-in-the-PLS-Algorithm.jpg|thumb|Geometric interpretation of the deflation step in the input space]]
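A minimal Python sketch of a PLS1-style fit is given here for illustration. It is not a line-by-line transcription of the pseudocode: it assumes explicit centring of the data, keeps the scores unnormalised, computes the loadings as <math>X^\mathrm{T}t/(t^\mathrm{T}t)</math>, and normalises every weight vector. The function name <code>pls1</code> and the toy data are hypothetical. The coefficient vector is obtained from <math>B = W (P^\mathrm{T} W)^{-1} q</math> by back substitution, which relies on <math>P^\mathrm{T} W</math> being upper triangular for this construction.

```python
import math

def pls1(X, y, n_components):
    """PLS1 sketch: X is a list of n rows (m predictors), y a list of n
    responses. Returns (B, B0); a row x is predicted as B0 + sum(B[j] * x[j])."""
    n, m = len(X), len(X[0])
    # Centre explicitly (an assumption of this sketch).
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    Xk = [[row[j] - x_mean[j] for j in range(m)] for row in X]
    yc = [v - y_mean for v in y]

    W, P, q = [], [], []
    for _ in range(n_components):
        # Weight vector: normalised X^T y on the current (deflated) X.
        w = [sum(Xk[i][j] * yc[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left in X that covaries with y
            break
        w = [v / norm for v in w]
        # Scores, X loadings and the scalar y loading for this component.
        t = [sum(Xk[i][j] * w[j] for j in range(m)) for i in range(n)]
        tt = sum(v * v for v in t)
        p = [sum(Xk[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        qk = sum(yc[i] * t[i] for i in range(n)) / tt
        # Deflate X by removing the part explained by this component.
        Xk = [[Xk[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        W.append(w)
        P.append(p)
        q.append(qk)

    # Solve (P^T W) z = q by back substitution, then B = W z.
    a = len(W)
    A = [[sum(P[r][j] * W[c][j] for j in range(m)) for c in range(a)]
         for r in range(a)]
    z = [0.0] * a
    for r in reversed(range(a)):
        z[r] = (q[r] - sum(A[r][c] * z[c] for c in range(r + 1, a))) / A[r][r]
    B = [sum(W[c][j] * z[c] for c in range(a)) for j in range(m)]
    B0 = y_mean - sum(x_mean[j] * B[j] for j in range(m))
    return B, B0

# Hypothetical toy data generated from y = 1 + 2*x1 - x2 without noise.
X = [[1, 2], [2, 1], [3, 5], [4, 3], [5, 7]]
y = [1, 4, 2, 6, 4]
B_full, B0_full = pls1(X, y, 2)  # as many components as the rank of X
B_one, B0_one = pls1(X, y, 1)    # a single latent factor
```

With two components, which equals the rank of the centred toy matrix, the sketch recovers the coefficients used to generate the data, in line with the statement that the result coincides with least squares in that case. With one component the coefficient vector is proportional to the first weight vector.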
 
==Extensions==
=== OPLS ===
In 2002 a new method was published called orthogonal projections to latent structures (OPLS). In OPLS, continuous variable data is separated into predictive and uncorrelated (orthogonal) information. This leads to improved diagnostics, as well as more easily interpreted visualization. However, these changes only improve the interpretability, not the predictivity, of the PLS models.<ref>{{Cite journal
| last1 = Trygg
| first1 = J
| last2 = Wold
| first2 = S
Line 61 ⟶ 91:
| pages = 119–128
| year = 2002
| doi = 10.1002/cem.695
| s2cid = 122699039
}}
</ref> Similarly, OPLS-DA (Discriminant Analysis) may be applied when working with discrete variables, as in classification and biomarker studies.
 
The general underlying model of OPLS is
 
:<math>X = T P^\mathrm{T} +T_\text{Y-orth} P^\mathrm{T}_\text{Y-orth} + E</math>
:<math>Y = U Q^\mathrm{T} + F</math>
 
or in O2-PLS<ref>Eriksson, S. Wold, and J. Trygg. "O2PLS® for improved analysis and visualization of complex data." https://www.dynacentrix.com/telecharg/SimcaP/O2PLS.pdf</ref>
 
:<math>X = T P^\mathrm{T} +T_\text{Y-orth} P^\mathrm{T}_\text{Y-orth} + E</math>
:<math>Y = U Q^\mathrm{T} +U_\text{X-orth} Q^\mathrm{T}_\text{X-orth} + F</math>
 
=== L-PLS ===
Another extension of PLS regression, named L-PLS for its L-shaped matrices, connects 3 related data blocks to improve predictability.<ref>{{cite journal|last1=Sæbøa|first1=S.|last2=Almøya|first2=T.|last3=Flatbergb|first3=A.|last4=Aastveita|first4=A.H.|last5=Martens|first5=H.|year=2008|title=LPLS-regression: a method for prediction and classification under the influence of background information on predictor variables|journal=Chemometrics and Intelligent Laboratory Systems|volume=91|issue=2|pages=121–132|doi=10.1016/j.chemolab.2007.10.006}}</ref> In brief, a new ''Z'' matrix, with the same number of columns as the ''X'' matrix, is added to the PLS regression analysis and may be suitable for including additional background information on the interdependence of the predictor variables.
 
=== 3PRF ===
In 2015 partial least squares was related to a procedure called the three-pass regression filter (3PRF).<ref>{{Cite journal|last1=Kelly|first1=Bryan|last2=Pruitt|first2=Seth|date=2015-06-01|title=The three-pass regression filter: A new approach to forecasting using many predictors|journal=Journal of Econometrics|series=High Dimensional Problems in Econometrics|volume=186|issue=2|pages=294–316|doi=10.1016/j.jeconom.2015.02.011}}</ref> Supposing the number of observations and variables are large, the 3PRF (and hence PLS) is asymptotically normal for the "best" forecast implied by a linear latent factor model. In stock market data, PLS has been shown to provide accurate out-of-sample forecasts of returns and cash-flow growth.<ref>{{Cite journal|last1=Kelly|first1=Bryan|last2=Pruitt|first2=Seth|date=2013-10-01|title=Market Expectations in the Cross-Section of Present Values|journal=The Journal of Finance|volume=68|issue=5|pages=1721–1756|doi=10.1111/jofi.12060|issn=1540-6261|citeseerx=10.1.1.498.5973}}</ref>
 
=== Partial least squares SVD ===
A PLS version based on [[Singular value decomposition|singular value decomposition (SVD)]] provides a memory efficient implementation that can be used to address high-dimensional problems, such as relating millions of genetic markers to thousands of imaging features in imaging genetics, on consumer-grade hardware.<ref>{{Cite journal|last1=Lorenzi|first1=Marco|last2=Altmann|first2=Andre|last3=Gutman|first3=Boris|last4=Wray|first4=Selina|last5=Arber|first5=Charles|last6=Hibar|first6=Derrek P.|last7=Jahanshad|first7=Neda|last8=Schott|first8=Jonathan M.|last9=Alexander|first9=Daniel C.|date=2018-03-20|title=Susceptibility of brain atrophy to TRIB3 in Alzheimer's disease, evidence from functional prioritization in imaging genetics|journal=Proceedings of the National Academy of Sciences|volume=115|issue=12|pages=3162–3167|doi=10.1073/pnas.1706100115|issn=0027-8424|pmc=5866534|pmid=29511103|doi-access=free|bibcode=2018PNAS..115.3162L }}</ref>
 
=== PLS correlation ===
 
PLS correlation (PLSC) is another methodology related to PLS regression,<ref name=":0">{{Cite journal|last1=Krishnan|first1=Anjali|last2=Williams|first2=Lynne J.|last3=McIntosh|first3=Anthony Randal|last4=Abdi|first4=Hervé|date=May 2011|title=Partial Least Squares (PLS) methods for neuroimaging: A tutorial and review|journal=NeuroImage|volume=56|issue=2|pages=455–475|doi=10.1016/j.neuroimage.2010.07.034|pmid=20656037|s2cid=8796113}}</ref> which has been used in neuroimaging <ref name=":0" /><ref>{{Cite journal|last1=McIntosh|first1=Anthony R.|last2=Mišić|first2=Bratislav|date=2013-01-03|title=Multivariate Statistical Analyses for Neuroimaging Data|journal=Annual Review of Psychology|volume=64|issue=1|pages=499–525|doi=10.1146/annurev-psych-113011-143804|pmid=22804773|issn=0066-4308}}</ref><ref>{{Cite journal|last1=Beggs|first1=Clive B.|last2=Magnano|first2=Christopher|last3=Belov|first3=Pavel|last4=Krawiecki|first4=Jacqueline|last5=Ramasamy|first5=Deepa P.|last6=Hagemeier|first6=Jesper|last7=Zivadinov|first7=Robert|date=2016-05-02|editor-last=de Castro|editor-first=Fernando|title=Internal Jugular Vein Cross-Sectional Area and Cerebrospinal Fluid Pulsatility in the Aqueduct of Sylvius: A Comparative Study between Healthy Subjects and Multiple Sclerosis Patients|journal=PLOS ONE|volume=11|issue=5|pages=e0153960|doi=10.1371/journal.pone.0153960|issn=1932-6203|pmc=4852898|pmid=27135831|bibcode=2016PLoSO..1153960B|doi-access=free}}</ref> and sport science,<ref>{{Cite journal|last1=Weaving|first1=Dan|last2=Jones|first2=Ben|last3=Ireton|first3=Matt|last4=Whitehead|first4=Sarah|last5=Till|first5=Kevin|last6=Beggs|first6=Clive B.|date=2019-02-14|editor-last=Connaboy|editor-first=Chris|title=Overcoming the problem of multicollinearity in sports performance data: A novel application of partial least squares correlation analysis|journal=PLOS 
ONE|volume=14|issue=2|pages=e0211776|doi=10.1371/journal.pone.0211776|pmid=30763328|issn=1932-6203|pmc=6375576|bibcode=2019PLoSO..1411776W|doi-access=free}}</ref> to quantify the strength of the relationship between data sets. Typically, PLSC divides the data into two blocks (sub-groups) each containing one or more variables, and then uses [[Singular value decomposition|singular value decomposition (SVD)]] to establish the strength of any relationship (i.e. the amount of shared information) that might exist between the two component sub-groups.<ref name=":1">{{Citation|last1=Abdi|first1=Hervé|title=Partial Least Squares Methods: Partial Least Squares Correlation and Partial Least Square Regression|date=2013|work=Computational Toxicology|volume=930|pages=549–579|editor-last=Reisfeld|editor-first=Brad|publisher=Humana Press|doi=10.1007/978-1-62703-059-5_23|isbn=9781627030588|last2=Williams|first2=Lynne J.|pmid=23086857|editor2-last=Mayeno|editor2-first=Arthur N.}}</ref> It does this by using SVD to determine the inertia (i.e. the sum of the singular values) of the covariance matrix of the sub-groups under consideration.<ref name=":1" /><ref name=":0" />
 
==See also==
Line 76 ⟶ 124:
*[[Feature extraction]]
*[[Machine learning]]
*[[Multilinear subspace learning]]
*[[Partial least squares path modeling]]
*[[Principal component analysis]]
*[[Regression analysis]]
*[[Total sum of squares]]
*[[Projection pursuit regression]]
 
==References==
{{Reflist}}
 
==Literature==
*{{cite book |first=R. |last=Kramer |title=Chemometric Techniques for Quantitative Analysis |publisher=Marcel-Dekker |year=1998 |isbn=978-0-8247-0198-7 }}
*{{cite journal |last1=Frank |first1=Ildiko E. |first2=Jerome H. |last2=Friedman |title=A Statistical View of Some Chemometrics Regression Tools |journal=Technometrics |volume=35 |issue=2 |pages=109–148 |year=1993 |doi=10.1080/00401706.1993.10485033 }}
*{{cite journal |last1=Haenlein |first1=Michael |first2=Andreas M. |last2=Kaplan | title=A Beginner's Guide to Partial Least Squares Analysis |journal=Understanding Statistics |volume=3 |issue=4 |pages=283–297| year=2004 |doi=10.1207/s15328031us0304_4 }}
*{{cite book
| last1 = Henseler | first1 = Jörg
| last2 = Fassott | first2 = Georg
| editor1-last = Vinzi | editor1-first = Vincenzo Esposito
| editor2-last = Chin | editor2-first = Wynne W.
| editor3-last = Henseler | editor3-first = Jörg
| editor4-last = Wang | editor4-first = Huiwen
| contribution = Testing Moderating Effects in PLS Path Models: An Illustration of Available Procedures
| doi = 10.1007/978-3-540-32827-8_31
| isbn = 9783540328278
| pages = 713–735
| publisher = Springer
| title = Handbook of Partial Least Squares: Concepts, Methods and Applications
| year = 2010}}
*{{cite journal |last1=Lingjærde |first1=Ole-Christian |first2=Nils |last2=Christophersen | title=Shrinkage Structure of Partial Least Squares |journal=Scandinavian Journal of Statistics |volume=27 |issue=3 |pages=459–473 | year=2000 |doi=10.1111/1467-9469.00201 |s2cid=121489764 }}
*{{cite book |last=Tenenhaus |first=Michel |title=La Régression PLS: Théorie et Pratique |location=Paris |publisher=Technip |year=1998}}
*{{cite book
| last1 = Rosipal | first1 = Roman
| last2 = Krämer | first2 = Nicole
| editor1-last = Saunders | editor1-first = Craig
| editor2-last = Grobelnik | editor2-first = Marko
| editor3-last = Gunn | editor3-first = Steve
| editor4-last = Shawe-Taylor | editor4-first = John
| contribution = Overview and Recent Advances in Partial Least Squares
| doi = 10.1007/11752790_2
| isbn = 9783540341383
| pages = 34–51
| publisher = Springer
| series = Lecture Notes in Computer Science
| title = Subspace, Latent Structure and Feature Selection: Statistical and Optimization Perspectives Workshop, SLSFS 2005, Bohinj, Slovenia, February 23–25, 2005, Revised Selected Papers
| year = 2006}}
*{{cite journal |last=Helland |first=Inge S. |title=PLS regression and statistical models |journal=Scandinavian Journal of Statistics |volume=17 |issue=2 |pages=97–114 |year=1990 |jstor=4616159}}
*{{cite book |author-link=Herman Wold |last=Wold |first=Herman |chapter=Estimation of principal components and related models by iterative least squares |editor-first=P.R. |editor-last=Krishnaiah |title=Multivariate Analysis |publisher=Academic Press |location=New York |year=1966 |pages=391–420 }}
Line 97 ⟶ 175:
*{{cite journal |last=Garthwaite |first=Paul H. |title=An Interpretation of Partial Least Squares |journal=[[Journal of the American Statistical Association]] |volume=89 |pages=122–7 |year=1994 |jstor=2291207 |doi=10.1080/01621459.1994.10476452 |issue=425}}
*{{cite book |editor1-last=Wang |editor1-first=H. |title=Handbook of Partial Least Squares |year=2010 |isbn=978-3-540-32825-4 }}
*{{cite journal |last1=Stone |first1=M. |last2=Brooks |first2=R.J. |title=Continuum Regression: Cross-Validated Sequentially Constructed Prediction embracing Ordinary Least Squares, Partial Least Squares and Principal Components Regression |journal=Journal of the Royal Statistical Society, Series B |volume=52 |issue=2 |pages=237–269 |year=1990 |doi=10.1111/j.2517-6161.1990.tb01786.x |jstor=2345437}}
 
== External links ==
{{Prone to spam|date=November 2017}}
{{Z148}}<!-- {{No more links}}
 
Please be cautious adding more external links.
Line 118 ⟶ 193:
 
-->
* [https://personal.utdallas.edu/~herve/Abdi-PLS-pretty.pdf A short introduction to PLS regression and its history]
* [https://www.youtube.com/watch?v=Px2otK2nZ1c Video: Derivation of PLS by Prof. H. Harry Asada]
 
{{Least Squares and Regression Analysis}}
{{Authority control}}