General linear model: Difference between revisions

Revision as of 10:31, 22 February 2025, by Jerryobject (extended confirmed user, 16,305 edits), minor edit: Cut needless carriage return whitespace characters in paragraph, sections, WP:LISTGAPs between WP:TABLE items: to standardize, aid work via small screens.

Latest revision as of 16:28, 18 July 2025, by 129.137.96.11: →Comparison to multiple linear regression: There was a typo stating that X_{ik} is the kth observation of the kth independent variable, while it should be the ith observation.
(5 intermediate revisions by 3 users not shown)
Line 7:

where '''Y''' is a [[Matrix (mathematics)\|matrix]] with series of multivariate measurements (each column being a set of measurements on one of the [[dependent variable]]s), '''X''' is a matrix of observations on [[independent variable]]s that might be a [[design matrix]] (each column being a set of observations on one of the independent variables), '''B''' is a matrix containing parameters that are usually to be estimated and '''U''' is a matrix containing [[Errors and residuals in statistics\|errors]] (noise). The errors are usually assumed to be uncorrelated across measurements, and follow a [[multivariate normal distribution]]. If the errors do not follow a multivariate normal distribution, [[generalized linear model]]s may be used to relax assumptions about '''Y''' and '''U'''.

The general linear model ~~incorporates a number of different~~ (GLM) encompasses several statistical models~~:~~, including [[Analysis of variance\|ANOVA]], [[Analysis of covariance\|ANCOVA]], [[Multivariate analysis of variance\|MANOVA]], [[Multivariate analysis of covariance\|MANCOVA]], ordinary [[linear regression]]. Within this framework, both [[t-test\|''t''-test]] and [[F-test\|''F''-test]] can be applied. The general linear model is a generalization of multiple linear regression to the case of more than one dependent variable. If '''Y''', '''B''', and '''U''' were [[column vector]]s, the matrix equation above would represent multiple linear regression.

Hypothesis tests with the general linear model can be made in two ways: [[multivariate statistics\|multivariate]] or as several independent [[univariate]] tests. In multivariate tests the columns of '''Y''' are tested together, whereas in univariate tests the columns of '''Y''' are tested independently, i.e., as multiple univariate tests with the same design matrix.

Line 19 ⟶ 20:

for each observation ''i'' = 1, ... , ''n''.
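The diff omits the unchanged line carrying the formula that this context line closes. As a sketch only, reconstructed from the symbol definitions in the paragraph that follows (''Y''<sub>''i''</sub>, ''X''<sub>''ik''</sub>, the parameters indexed by ''k'' = 1, ..., ''p'', and ''ε''<sub>''i''</sub>), the multiple linear regression model can be written as shown here. The intercept term \beta_0 is an assumption; it does not appear in the visible diff text.

```latex
Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \ldots + \beta_p X_{ip} + \varepsilon_i
```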
In the formula above we consider ''n'' observations of one dependent variable and ''p'' independent variables. Thus, ''Y''<sub>''i''</sub> is the ''i''<sup>th</sup> observation of the dependent variable, ''X''<sub>''ik''</sub> is ~~''k''~~''i''<sup>th</sup> observation of the ''k''<sup>th</sup> independent variable, ~~''j''~~''k'' = 1, 2, ..., ''p''. The values ~~''β''<sub>''j''</sub>~~''β''<sub>''k''</sub> represent parameters to be estimated, and ''ε''<sub>''i''</sub> is the ''i''<sup>th</sup> independent identically distributed normal error.

In the more general multivariate linear regression, there is one equation of the above form for each of ''m'' > 1 dependent variables that share the same set of explanatory variables and hence are estimated simultaneously with each other:

Line 30 ⟶ 31:

== Comparison to generalized linear model ==

The general linear model and the [[generalized linear model]] (GLM)<ref name=":0">{{Cite book \|last1=McCullagh \|first1=P. \|author1-link=Peter McCullagh \|last2=Nelder \|first2=J. A. \|author2-link=John Nelder \|date=January 1, 1983 \|chapter=An outline of generalized linear models \|title=Generalized Linear Models \|pages=21–47 \|publisher=Springer US \|isbn=9780412317606 \|doi=10.1007/978-1-4899-3242-6_2 \|doi-broken-date=~~13 December 2024~~12 July 2025}}</ref><ref>Fox, J. (2015). ''Applied regression analysis and generalized linear models''. Sage Publications.</ref> are two commonly used families of [[Statistics\|statistical methods]] to relate some number of continuous and/or categorical [[Dependent and independent variables\|predictors]] to a single [[Dependent and independent variables\|outcome variable]]. The main difference between the two approaches is that the general linear model strictly assumes that the [[Errors and residuals\|residuals]] will follow a [[Conditional probability distribution\|conditionally]] [[normal distribution]],<ref name=":1">{{cite report \|last1=Cohen \|first1=J. \|last2=Cohen \|first2=P. \|last3=West \|first3=S. G. 
\|last4=Aiken \|first4=L. S. \|author4-link=Leona S. Aiken \|date=2003 \|title=Applied multiple regression/correlation analysis for the behavioral sciences}}</ref> while the GLM loosens this assumption and allows for a variety of other [[Distribution (mathematics)\|distributions]] from the [[exponential family]] for the residuals.<ref name=":0"/> The general linear model is a special case of the GLM in which the distribution of the residuals follow a conditionally normal distribution.

Line 54 ⟶ 55:

\|[[R (programming language)\|R]] package and function
\|[https://stat.ethz.ch/R-manual/R-devel/library/stats/html/lm.html lm()] in stats package (base R)
\|[https://stat.ethz.ch/R-manual/R-devel/library/stats/html/glm.html glm()] in stats package (base R) manova,
\|-
\|[[MATLAB]] function