Jerryobject (talk | contribs) m Cut needless carriage return whitespace characters in paragraph, sections, WP:LISTGAPs between WP:TABLE items: to standardize, aid work via small screens.
→Comparison to multiple linear regression: There was a typo stating that X_{ik} is the kth observation of the kth independent variable, while it should be the ith observation.
(5 intermediate revisions by 3 users not shown)
Line 7:
where '''Y''' is a [[Matrix (mathematics)|matrix]] with a series of multivariate measurements (each column being a set of measurements on one of the [[dependent variable]]s), '''X''' is a matrix of observations on [[independent variable]]s that might be a [[design matrix]] (each column being a set of observations on one of the independent variables), '''B''' is a matrix containing parameters that are usually to be estimated, and '''U''' is a matrix containing [[Errors and residuals in statistics|errors]] (noise). The errors are usually assumed to be uncorrelated across measurements and to follow a [[multivariate normal distribution]]. If the errors do not follow a multivariate normal distribution, [[generalized linear model]]s may be used to relax assumptions about '''Y''' and '''U'''.
The general linear model
Hypothesis tests with the general linear model can be carried out in two ways: as a single [[multivariate statistics|multivariate]] test or as several independent [[univariate]] tests. In multivariate tests the columns of '''Y''' are tested together, whereas in univariate tests the columns of '''Y''' are tested independently, i.e., as multiple univariate tests with the same design matrix.
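As an illustration of the shared design matrix, the following minimal sketch estimates '''B''' by ordinary least squares, once for all columns of '''Y''' together and once column by column. It assumes the least-squares estimator solves ('''X'''<sup>T</sup>'''X''')'''B''' = '''X'''<sup>T</sup>'''Y'''; the function names and the small data set are hypothetical and chosen only for this example, and exact rational arithmetic is used so the two results can be compared directly.

```python
from fractions import Fraction


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination (b may have several columns)."""
    n = len(a)
    # Augmented matrix [a | b], converted to exact fractions.
    aug = [[Fraction(v) for v in list(ra) + list(rb)] for ra, rb in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0:
            raise ValueError("X^T X is singular; columns of X are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fit_general_linear_model(X, Y):
    """Least-squares estimate of B in Y = XB + U via the normal equations."""
    Xt = transpose(X)
    return solve(matmul(Xt, X), matmul(Xt, Y))


# Hypothetical data: 4 observations, an intercept column plus one predictor,
# and 2 dependent variables (the columns of Y).
X = [[1, 0], [1, 1], [1, 2], [1, 3]]
Y = [[1, 2], [3, 1], [5, 1], [7, -1]]

# Joint fit: all columns of Y at once.
B = fit_general_linear_model(X, Y)

# Separate fits: one univariate regression per column of Y, same design matrix.
B_columnwise = [
    fit_general_linear_model(X, [[row[j]] for row in Y]) for j in range(len(Y[0]))
]
```

In this sketch the ''j''-th column of <code>B</code> coincides with the ''j''-th separate fit, so the coefficient estimates do not depend on whether the columns of '''Y''' are handled jointly or one at a time. The multivariate and univariate approaches differ in how hypotheses about those coefficients are tested, not in the least-squares estimates themselves.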
Line 19 ⟶ 20:
for each observation ''i'' = 1, ... , ''n''.
In the formula above we consider ''n'' observations of one dependent variable and ''p'' independent variables. Thus, ''Y''<sub>''i''</sub> is the ''i''<sup>th</sup> observation of the dependent variable, and ''X''<sub>''ik''</sub> is the ''i''<sup>th</sup> observation of the ''k''<sup>th</sup> independent variable, ''k'' = 1, 2, ..., ''p''.
In the more general multivariate linear regression, there is one equation of the above form for each of ''m'' > 1 dependent variables that share the same set of explanatory variables and hence are estimated simultaneously with each other:
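As a sketch of what this indexing implies, with ''i'' running over the ''n'' observations and ''j'' over the ''m'' dependent variables, one equation per dependent variable could be written as follows. The coefficient symbols ''β'' and error symbols ''ε'' are assumptions made here for illustration, since the formulas themselves are not shown in this excerpt.

```latex
Y_{ij} = \beta_{0j} + \beta_{1j} X_{i1} + \beta_{2j} X_{i2} + \cdots + \beta_{pj} X_{ip} + \varepsilon_{ij},
\qquad i = 1, \ldots, n, \quad j = 1, \ldots, m
```

The explanatory variables ''X''<sub>''ik''</sub> carry no index ''j'', which reflects that all ''m'' equations share the same set of explanatory variables, while each dependent variable has its own coefficients and errors.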
Line 30 ⟶ 31:
== Comparison to generalized linear model ==
The general linear model and the [[generalized linear model]] (GLM)<ref name=":0">{{Cite book |last1=McCullagh |first1=P. |author1-link=Peter McCullagh |last2=Nelder |first2=J. A. |author2-link=John Nelder |date=January 1, 1983 |chapter=An outline of generalized linear models |title=Generalized Linear Models |pages=21–47 |publisher=Springer US |isbn=9780412317606 |doi=10.1007/978-1-4899-3242-6_2 |doi-broken-date=
The main difference between the two approaches is that the general linear model strictly assumes that the [[Errors and residuals|residuals]] will follow a [[Conditional probability distribution|conditionally]] [[normal distribution]],<ref name=":1">{{cite report |last1=Cohen |first1=J. |last2=Cohen |first2=P. |last3=West |first3=S. G. |last4=Aiken |first4=L. S. |author4-link=Leona S. Aiken |date=2003 |title=Applied multiple regression/correlation analysis for the behavioral sciences}}</ref> while the GLM loosens this assumption and allows for a variety of other [[Distribution (mathematics)|distributions]] from the [[exponential family]] for the residuals.<ref name=":0"/> The general linear model is a special case of the GLM in which the residuals follow a conditionally normal distribution.
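The special-case relationship can be made concrete for a single dependent variable. In the usual GLM formulation, a link function ''g'' relates the conditional mean to the linear predictor. The general linear model corresponds to a normal distribution combined with the identity link. The notation below (''μ''<sub>''i''</sub>, ''σ''<sup>2</sup>, ''g'', ''β'') is assumed for illustration rather than taken from the text above.

```latex
Y_i \mid X_{i1}, \ldots, X_{ip} \sim \mathcal{N}(\mu_i, \sigma^2),
\qquad
g(\mu_i) = \mu_i = \beta_0 + \beta_1 X_{i1} + \cdots + \beta_p X_{ip}
```

Other members of the GLM family keep the linear predictor but change the distribution and the link; logistic regression, for example, pairs a Bernoulli distribution with the logit link.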
Line 54 ⟶ 55:
|[[R (programming language)|R]] package and function
|[https://stat.ethz.ch/R-manual/R-devel/library/stats/html/lm.html lm()] in stats package (base R)
|[https://stat.ethz.ch/R-manual/R-devel/library/stats/html/glm.html glm()] in stats package (base R) manova,
|-
|[[MATLAB]] function
|