Multivariate analysis of variance

In [[statistics]], '''multivariate analysis of variance''' ('''MANOVA''') is a procedure for comparing [[multivariate random variable|multivariate]] sample means. As a multivariate procedure, it is used when there are two or more [[dependent variables]],<ref name="Warne2014">{{cite journal |last=Warne |first=R. T. |year=2014 |title=A primer on multivariate analysis of variance (MANOVA) for behavioral scientists |journal=Practical Assessment, Research & Evaluation |volume=19 |issue=17 |pages=1–10 |url=https://scholarworks.umass.edu/pare/vol19/iss1/17/ }}</ref> and is often followed by significance tests involving individual dependent variables separately.<ref>Stevens, J. P. (2002). ''Applied multivariate statistics for the social sciences.'' Mahwah, NJ: Lawrence Erlbaum.</ref>
 
For example, the dependent variables may be ''k'' life satisfaction scores measured at sequential [[time point]]s and ''p'' job satisfaction scores measured at the same time points. In this case there are ''k'' + ''p'' dependent variables whose [[linear combination]] is assumed to follow a multivariate [[normal distribution]]. The analysis further assumes homogeneity of the variance–covariance matrices across groups, linear relationships among the dependent variables, no multicollinearity, and no outliers.
 
== Model ==
 
Where [[Partition of sums of squares|sums of squares]] appear in univariate analysis of variance, in multivariate analysis of variance certain [[positive-definite matrix|positive-definite matrices]] appear. The diagonal entries are the same kinds of sums of squares that appear in univariate ANOVA. The off-diagonal entries are corresponding sums of products. Under normality assumptions about [[errors and residuals in statistics|error]] distributions, the counterpart of the sum of squares due to error has a [[Wishart distribution]].
 
 
== Hypothesis testing ==
First, define the following <math display="inline">n\times q</math> matrices:
* <math display="inline">\bar Y</math>: where the <math display="inline">i</math>-th row is the best prediction given no information, that is, the [[Sample mean and covariance|empirical mean]] over all <math display="inline">n</math> observations, <math display="inline">\frac{1}{n}\sum_{k=1}^n y_k</math>
 
Then the matrix <math display="inline">S_{\text{model}} := (\hat Y - \bar Y)^T(\hat Y - \bar Y)</math> is a generalization of the sum of squares explained by the group, and <math display="inline">S_{\text{res}} := (Y - \hat Y)^T(Y - \hat Y)</math> is a generalization of the [[residual sum of squares]].<ref name="Anderson1994">{{cite book |last=Anderson |first=T. W. |title=An Introduction to Multivariate Statistical Analysis |year=1994 |publisher=Wiley}}</ref> <ref name="Krzanowski1988">{{cite book |last=Krzanowski |first=W. J. |title=Principles of Multivariate Analysis. A User's Perspective |year=1988 |publisher=Oxford University Press}}</ref>
Note that, alternatively, one could also speak about covariances by scaling the above matrices by <math display="inline">1/(n-1)</math>, since the subsequent test statistics do not change when <math display="inline">S_{\text{model}}</math> and <math display="inline">S_{\text{res}}</math> are multiplied by the same non-zero constant.
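A minimal NumPy sketch of how <math display="inline">S_{\text{model}}</math> and <math display="inline">S_{\text{res}}</math> can be computed for a one-way design (the toy data and group labels below are illustrative, not from the article):

```python
import numpy as np

# Hypothetical toy data: n = 6 observations, q = 2 dependent variables,
# two groups of 3 observations each (made-up numbers for illustration).
Y = np.array([[1.0, 2.0], [1.2, 1.8], [0.8, 2.2],
              [3.0, 4.0], [3.1, 3.9], [2.9, 4.1]])
groups = np.array([0, 0, 0, 1, 1, 1])

n = Y.shape[0]
Y_bar = np.tile(Y.mean(axis=0), (n, 1))          # each row: grand mean
Y_hat = np.vstack([Y[groups == g].mean(axis=0)   # each row: its group mean
                   for g in groups])

S_model = (Y_hat - Y_bar).T @ (Y_hat - Y_bar)    # explained sums of squares/products
S_res = (Y - Y_hat).T @ (Y - Y_hat)              # residual sums of squares/products

# Sanity check: the total SSP matrix decomposes into model + residual parts.
S_total = (Y - Y_bar).T @ (Y - Y_bar)
assert np.allclose(S_total, S_model + S_res)
```

The diagonal entries of these matrices are the univariate sums of squares; the off-diagonal entries are the corresponding sums of products.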
 
The most common<ref>{{cite web|last=Garson|first=G. David|title=Multivariate GLM, MANOVA, and MANCOVA|url=http://faculty.chass.ncsu.edu/garson/PA765/manova.htm|access-date=2011-03-22}}</ref><ref>{{cite web|last=UCLA: Academic Technology Services, Statistical Consulting Group.|title=Stata Annotated Output – MANOVA|url=https://stats.oarc.ucla.edu/stata/output/manova/|access-date=2024-02-10}}</ref> statistics are summaries based on the roots (or eigenvalues) <math display="inline">\lambda_p</math> of the matrix <math display="inline">A:= S_{\text{model}}S_{\text{res}}^{-1}</math>:
 
* [[Samuel Stanley Wilks]]' <math>\Lambda_\text{Wilks} = \prod_{i=1}^p \frac{1}{1 + \lambda_i} = \det(I + A)^{-1} = \det(S_\text{res})/\det(S_\text{res} + S_\text{model})</math> distributed as [[Wilks' lambda distribution|lambda]] (Λ)
* the [[K. C. Sreedharan Pillai]]–[[M. S. Bartlett]] [[trace of a matrix|trace]], <math>\Lambda_\text{Pillai} = \sum_{i=1}^p \frac{\lambda_i}{1 + \lambda_i} = \operatorname{tr}(A(I + A)^{-1})</math><ref>{{cite web|url=http://www.real-statistics.com/multivariate-statistics/multivariate-analysis-of-variance-manova/manova-basic-concepts/|title=MANOVA Basic Concepts – Real Statistics Using Excel|website=www.real-statistics.com|access-date=5 April 2018}}</ref>
* the [[Derrick Norman Lawley|Lawley]]–[[Harold Hotelling|Hotelling]] trace, <math>\Lambda_\text{LH} = \sum_{i=1}^p \lambda_i = \operatorname{tr}(A)</math>
* [[Roy's greatest root]] (also called ''Roy's largest root''), <math>\Lambda_\text{Roy} = \max_i(\lambda_i)</math>
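For illustration, all four statistics can be computed directly from the eigenvalues; the eigenvalues below are made up for the example:

```python
import numpy as np

# Made-up eigenvalues of A = S_model @ inv(S_res), for illustration only.
lam = np.array([2.0, 0.5])

wilks = np.prod(1.0 / (1.0 + lam))         # = det(I + A)^{-1}
pillai = np.sum(lam / (1.0 + lam))         # = tr(A (I + A)^{-1})
lawley_hotelling = np.sum(lam)             # = tr(A)
roy = lam.max()                            # largest root

# Cross-check the eigenvalue forms against the matrix forms, taking A = diag(lam).
A = np.diag(lam)
assert np.isclose(wilks, 1.0 / np.linalg.det(np.eye(2) + A))
assert np.isclose(pillai, np.trace(A @ np.linalg.inv(np.eye(2) + A)))
```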
 
In the case of two groups, all the statistics are equivalent and the test reduces to [[Hotelling's T-square]].
 
== Introducing covariates (MANCOVA) ==
{{main|Multivariate analysis of covariance}}
 
One can also test if there is a group effect after adjusting for covariates. For this, follow the procedure above but substitute <math display="inline">\hat Y</math> with the predictions of the [[general linear model]] containing the group and the covariates, and substitute <math display="inline">\bar Y</math> with the predictions of the general linear model containing only the covariates (and an intercept). Then <math display="inline">S_{\text{model}}</math> is the additional sum of squares explained by adding the grouping information and <math display="inline">S_{\text{res}}</math> is the residual sum of squares of the model containing the grouping and the covariates.<ref name="Krzanowski1988" />
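A sketch of this adjustment with NumPy least squares (the simulated data, design matrices, and the helper <code>predict</code> are illustrative assumptions, not from the article):

```python
import numpy as np

# Simulated data: one covariate x, a binary group indicator g, q = 2 responses.
rng = np.random.default_rng(0)
n, q = 12, 2
x = rng.normal(size=(n, 1))                    # covariate
g = np.repeat([0.0, 1.0], n // 2)[:, None]     # group indicator
Y = np.hstack([x + g, 2 * x - g]) + 0.1 * rng.normal(size=(n, q))

ones = np.ones((n, 1))
X_full = np.hstack([ones, x, g])               # intercept + covariate + group
X_cov = np.hstack([ones, x])                   # intercept + covariate only

def predict(X, Y):
    """Least-squares predictions of the general linear model Y ~ X @ beta."""
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return X @ beta

Y_hat = predict(X_full, Y)                     # plays the role of Y-hat above
Y_bar = predict(X_cov, Y)                      # plays the role of Y-bar above

S_model = (Y_hat - Y_bar).T @ (Y_hat - Y_bar)  # extra SSP explained by the group
S_res = (Y - Y_hat).T @ (Y - Y_hat)            # residual SSP of the full model
```

Because the covariate-only design is nested in the full design, the residual SSP of the covariate-only model decomposes exactly into <code>S_model + S_res</code>.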
 
Note that in the case of unbalanced data, the order of adding the covariates matters.
 
==Correlation of dependent variables==