Multilevel model

{{Short description|Type of statistical model}}
{{Merge|Mixed model|discuss=Talk:Mixed model#Proposed merge of Multilevel model with Mixed model|date=June 2021}}
{{Use dmy dates|date=April 2019}}
{{Regression bar}}
 
'''Multilevel models'''{{Refn|Also known as '''hierarchical linear models''', '''linear mixed-effect models''', '''mixed models''', '''nested data models''', '''random coefficient''', '''random-effects models''', '''random parameter models''', or '''split-plot designs'''.|group=lower-alpha}} are [[statistical model]]s of [[parameter]]s that vary at more than one level.<ref name="Raud" /> An example could be a model of student performance that contains measures for individual students as well as measures for classrooms within which the students are grouped. These models can be seen as generalizations of [[linear model]]s (in particular, [[linear regression]]), although they can also extend to non-linear models. These models became much more popular after sufficient computing power and software became available.<ref name="Raud" />
 
Multilevel models are particularly appropriate for research designs where data for participants are organized at more than one level (i.e., [[nested data]]).<ref name="Fidell">{{cite book|last=Fidell|first=Barbara G. Tabachnick, Linda S.|title=Using multivariate statistics|year=2007|publisher=Pearson/A & B|location=Boston ; Montreal|isbn=978-0-205-45938-4|edition=5th}}</ref> The units of analysis are usually individuals (at a lower level) who are nested within contextual/aggregate units (at a higher level).<ref name="Luke">{{cite book|last=Luke|first=Douglas A.|title=Multilevel modeling|year=2004|publisher=Sage|location=Thousand Oaks, CA|isbn=978-0-7619-2879-9|edition=3. repr.}}</ref> While the lowest level of data in multilevel models is usually an individual, repeated measurements of individuals may also be examined.<ref name="Fidell" /><ref name="Gomes2022">{{cite journal |last1=Gomes |first1=Dylan G.E. |title=Should I use fixed effects or random effects when I have fewer than five levels of a grouping factor in a mixed-effects model? |journal=PeerJ |date=20 January 2022 |volume=10 |pages=e12794 |doi=10.7717/peerj.12794|pmid=35116198 |pmc=8784019 |doi-access=free }}</ref> As such, multilevel models provide an alternative type of analysis for univariate or [[multivariate analysis]] of [[repeated measures]]. Individual differences in [[growth curve (statistics)|growth curves]] may be examined.<ref name="Fidell" /> Furthermore, multilevel models can be used as an alternative to [[ANCOVA]], where scores on the dependent variable are adjusted for covariates (e.g. individual differences) before testing treatment differences.<ref name="Cohen">{{cite book|last1=Cohen|first1=Jacob|title=Applied multiple regression/correlation analysis for the behavioral sciences|publisher=Erlbaum|location=Mahwah, NJ [u.a.]|isbn=978-0-8058-2223-6|edition=3.|date=3 October 2003}}</ref> Multilevel models are able to analyze these experiments without the assumption of homogeneity of regression slopes that ANCOVA requires.<ref name="Fidell" />
 
Multilevel models can be used on data with many levels, although 2-level models are the most common and the rest of this article deals only with these. The dependent variable must be examined at the lowest level of analysis.<ref name="Raud">{{cite book|last=Bryk|first=Stephen W. Raudenbush, Anthony S.|title=Hierarchical linear models : applications and data analysis methods|year=2002|publisher=Sage Publications|location=Thousand Oaks, CA [u.a.]|isbn=978-0-7619-1904-9|edition=2. ed., [3. Dr.]}}</ref>
 
==Level 1 regression equation==
When there is a single level 1 independent variable, the level 1 model is:
 
<math> Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + e_{ij}</math>.
 
*<math>Y_{ij} </math> refers to the score on the dependent variable for an individual observation at Level 1 (subscript i refers to individual case, subscript j refers to the group).
*<math>X_{ij} </math> refers to the Level 1 predictor.
*<math>\beta_{0j} </math> refers to the intercept of the dependent variable for group j.
*<math> \beta_{1j}</math> refers to the slope for the relationship in group j (Level 2) between the Level 1 predictor and the dependent variable.
*<math> e_{ij}</math> refers to the random errors of prediction for the Level 1 equation (it is also sometimes referred to as <math>r_{ij}</math>).

<math>e_{ij} \sim \mathcal{N}(0,\sigma_1^2)</math>
 
At Level 1, the intercepts and slopes in the groups can each be fixed (meaning that all groups have the same values, although in the real world this would be a rare occurrence), non-randomly varying (meaning that the intercepts and/or slopes are predictable from an independent variable at Level 2), or randomly varying (meaning that the intercepts and/or slopes differ between groups, and that each has its own overall mean and variance).<ref name="Fidell" /><ref name="Gomes2022"/>
When there are multiple level 1 independent variables, the model can be expanded by substituting vectors and matrices in the equation.
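
One common way to write this expansion is sketched below. The symbols <math>p</math> (the number of Level 1 predictors), <math>\mathbf{x}_{ij}</math> and <math>\boldsymbol{\beta}_j</math> are assumptions introduced here for illustration and are not used elsewhere in this article.

<math>Y_{ij} = \mathbf{x}_{ij}^\mathsf{T} \boldsymbol{\beta}_j + e_{ij}, \qquad \mathbf{x}_{ij} = (1, X_{1ij}, \ldots, X_{pij})^\mathsf{T}, \qquad \boldsymbol{\beta}_j = (\beta_{0j}, \beta_{1j}, \ldots, \beta_{pj})^\mathsf{T}</math>

With <math>p = 1</math> this reduces to the single-predictor Level 1 equation, with intercept <math>\beta_{0j}</math> and slope <math>\beta_{1j}</math>.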
 
When the relationship between the response <math> Y_{ij} </math> and the predictor <math> X_{ij} </math> cannot be described by a linear relationship, one can specify a nonlinear functional relationship between them and extend the model to a [[nonlinear mixed-effects model]]. For example, when the response <math>Y_{ij} </math> is the cumulative infection trajectory of the <math>i</math>-th country and <math> X_{ij} </math> represents the <math>j</math>-th time point, the ordered pairs <math>(X_{ij},Y_{ij})</math> for each country may trace a shape similar to a [[logistic function]].<ref>{{Cite journal |last1=Lee|first1=Se Yoon |first2=Bowen |last2=Lei|first3=Bani|last3=Mallick| title = Estimation of COVID-19 spread curves integrating global data and borrowing information|journal=PLOS ONE|year=2020|volume=15 |issue=7 |pages=e0236860 |doi=10.1371/journal.pone.0236860 |arxiv=2005.00662|pmid=32726361 |pmc=7390340 |bibcode=2020PLoSO..1536860L |doi-access=free}}</ref><ref name="ReferenceA">{{Cite journal |last1=Lee|first1=Se Yoon |first2=Bani|last2=Mallick| title = Bayesian Hierarchical Modeling: Application Towards Production Results in the Eagle Ford Shale of South Texas|journal=Sankhya B|year=2021|volume=84 |pages=1–43 |doi=10.1007/s13571-020-00245-8|doi-access=free|s2cid=234027590 }}</ref>
 
==Level 2 regression equation==
The dependent variables are the intercepts and the slopes for the independent variables at Level 1 in the groups of Level 2.
 
<math>u_{0j} \sim \mathcal{N}(0,\sigma_2^2)</math>

<math>u_{1j} \sim \mathcal{N}(0,\sigma_3^2)</math>

<math>\beta_{0j} = \gamma_{00} + \gamma_{01}w_j + u_{0j}</math>

<math>\beta_{1j} = \gamma_{10} + \gamma_{11}w_j + u_{1j} </math>

*<math>\gamma_{00}</math> refers to the overall intercept. This is the grand mean of the scores on the dependent variable across all the groups when all the predictors are equal to 0.
*<math>\gamma_{10}</math> refers to the average slope between the dependent variable and the Level 1 predictor.
*<math>w_j</math> refers to the Level 2 predictor.
*<math>\gamma_{01}</math> and <math>\gamma_{11}</math> refer to the effect of the Level 2 predictor on the Level 1 intercept and slope respectively.
*<math>u_{0j}</math> refers to the deviation in group j from the overall intercept.
*<math>u_{1j}</math> refers to the deviation in group j from the average slope between the dependent variable and the Level 1 predictor.
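
The two levels can be made concrete by simulating data from them. The following is a minimal sketch in Python using only the standard library. The function name, the default parameter values, and the choice to draw the predictors from a standard normal distribution are assumptions made for illustration. The random effects <math>u_{0j}</math> and <math>u_{1j}</math> are drawn independently here, which is a simplification.

```python
import random


def simulate_two_level(n_groups=50, n_per_group=20,
                       gamma00=2.0, gamma01=0.5,
                       gamma10=1.0, gamma11=-0.3,
                       sigma1=1.0, sigma2=0.7, sigma3=0.4,
                       seed=0):
    """Draw observations from the two-level model (hypothetical helper).

    sigma1, sigma2 and sigma3 are the standard deviations of e_ij,
    u_0j and u_1j respectively.
    """
    rng = random.Random(seed)
    rows = []
    for j in range(n_groups):
        w_j = rng.gauss(0.0, 1.0)      # Level 2 predictor (assumed N(0, 1))
        u0j = rng.gauss(0.0, sigma2)   # group deviation from overall intercept
        u1j = rng.gauss(0.0, sigma3)   # group deviation from average slope
        # Level 2 equations: group-specific intercept and slope
        beta0j = gamma00 + gamma01 * w_j + u0j
        beta1j = gamma10 + gamma11 * w_j + u1j
        for i in range(n_per_group):
            x_ij = rng.gauss(0.0, 1.0)     # Level 1 predictor (assumed N(0, 1))
            e_ij = rng.gauss(0.0, sigma1)  # Level 1 error
            # Level 1 equation
            y_ij = beta0j + beta1j * x_ij + e_ij
            rows.append({"group": j, "case": i, "w": w_j, "x": x_ij, "y": y_ij})
    return rows


rows = simulate_two_level(n_groups=3, n_per_group=4)
```

Substituting the Level 2 equations into the Level 1 equation gives the combined form y = γ00 + γ01·w + γ10·x + γ11·w·x + u0 + u1·x + e, so setting all three standard deviations to zero in the sketch leaves only the fixed part of the model.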
 
==Types of models==
 
;Normality
The assumption of normality states that the error terms at every level of the model are normally distributed.<ref name="Green" />{{disputed inline|reason=[[Variance components model]]|date=August 2016}} However, most statistical software allows one to specify different distributions for the variance terms, such as Poisson, binomial, or logistic. The multilevel modelling approach can be used for all forms of generalized linear models.
 
;Homoscedasticity
 
;Orthogonality of regressors to random effects
The regressors must not correlate with the random effects, <math>u_{0j}</math>. This assumption is testable but often ignored; when it is violated, the estimator is inconsistent.<ref name=":0">{{Cite journal |last1=Antonakis |first1=John |last2=Bastardoz |first2=Nicolas |last3=Rönkkö |first3=Mikko |date=2021 |title=On Ignoring the Random Effects Assumption in Multilevel Models: Review, Critique, and Recommendations |url=https://jyx.jyu.fi/bitstream/123456789/66704/2/Antonakisym.pdf |journal=Organizational Research Methods |language=en |volume=24 |issue=2 |pages=443–483 |doi=10.1177/1094428119877457 |s2cid=210355362 |issn=1094-4281}}</ref> If this assumption is violated, the random effect must be modeled explicitly in the fixed part of the model, either by using dummy variables or by including cluster means of all <math>X_{ij} </math> regressors.<ref name=":0" /><ref>{{Cite journal |last1=McNeish |first1=Daniel |last2=Kelley |first2=Ken |date=2019 |title=Fixed effects models versus mixed effects models for clustered data: Reviewing the approaches, disentangling the differences, and making recommendations. |url=http://doi.apa.org/getdoi.cfm?doi=10.1037/met0000182 |journal=Psychological Methods |language=en |volume=24 |issue=1 |pages=20–35 |doi=10.1037/met0000182 |pmid=29863377 |s2cid=44145669 |issn=1939-1463|url-access=subscription }}</ref><ref>{{Cite journal |last1=Bliese |first1=Paul D. |last2=Schepker |first2=Donald J. |last3=Essman |first3=Spenser M. |last4=Ployhart |first4=Robert E. |date=2020 |title=Bridging Methodological Divides Between Macro- and Microresearch: Endogeneity and Methods for Panel Data |url=http://journals.sagepub.com/doi/10.1177/0149206319868016 |journal=Journal of Management |language=en |volume=46 |issue=1 |pages=70–99 |doi=10.1177/0149206319868016 |s2cid=202288849 |issn=0149-2063|url-access=subscription }}</ref><ref>{{Cite book |last=Wooldridge |first=Jeffrey M. |url=https://books.google.com/books?id=hSs3AgAAQBAJ&dq=info:T5fz2cmyyF8J:scholar.google.com&pg=PP1 |title=Econometric Analysis of Cross Section and Panel Data, second edition |date=2010-10-01 |publisher=MIT Press |isbn=978-0-262-29679-3 |language=en}}</ref> This assumption is probably the most important one the estimator makes, yet it is misunderstood by most applied researchers using these types of models.<ref name=":0" />
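
The cluster-mean remedy mentioned above can be sketched in a few lines of Python. This is an illustrative helper, not a reference implementation: the function name, the dictionary-based data layout, and the key names are assumptions. It also computes each observation's deviation from its cluster mean, an optional extra that goes beyond what the text above requires.

```python
from collections import defaultdict


def add_cluster_means(rows, x_key="x", group_key="group"):
    """Return copies of rows with the cluster mean of x added (hypothetical helper)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for r in rows:
        sums[r[group_key]] += r[x_key]
        counts[r[group_key]] += 1
    out = []
    for r in rows:
        mean = sums[r[group_key]] / counts[r[group_key]]
        new = dict(r)
        new[x_key + "_mean"] = mean                # cluster mean, entered as a fixed regressor
        new[x_key + "_within"] = r[x_key] - mean   # optional: deviation from the cluster mean
        out.append(new)
    return out


data = [
    {"group": "a", "x": 1.0},
    {"group": "a", "x": 3.0},
    {"group": "b", "x": 10.0},
]
augmented = add_cluster_means(data)
```

The added `x_mean` column would then enter the fixed part of the model alongside the original regressor.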
 
==Statistical tests==
 
===Level===
The concept of level is the keystone of this approach. In an [[educational research]] example, the levels for a 2-level model might be:
#pupil
#class
 
However, if one were studying multiple schools and multiple school districts, a 4-level model could include:
#pupil
#class
As a simple example, consider a basic linear regression model that predicts income as a function of age, class, gender and race. It might then be observed that income levels also vary depending on the city and state of residence. A simple way to incorporate this into the regression model would be to add an additional [[independent variable|independent]] [[categorical variable]] to account for the location (i.e. a set of additional binary predictors and associated regression coefficients, one per location). This would have the effect of shifting the mean income up or down, but it would still assume, for example, that the effect of race and gender on income is the same everywhere. In reality, this is unlikely to be the case: different local laws, different retirement policies, differences in level of racial prejudice, etc. are likely to cause all of the predictors to have different sorts of effects in different locales.
 
In other words, a simple linear regression model might, for example, predict that a given randomly sampled person in [[Seattle]] would have an average yearly income $10,000 higher than a similar person in [[Mobile, Alabama]]. However, it would also predict, for example, that a white person might have an average income $7,000 above a black person, and a 65-year-old might have an income $3,000 below a 45-year-old, in both cases regardless of location. A multilevel model, however, would allow for different regression coefficients for each predictor in each location. Essentially, it would assume that people in a given location have correlated incomes generated by a single set of regression coefficients, whereas people in another location have incomes generated by a different set of coefficients. Meanwhile, the coefficients themselves are assumed to be correlated and generated from a single set of [[Hyperparameter (Bayesian statistics)|hyperparameter]]s. Additional levels are possible: for example, people might be grouped by cities, and the city-level regression coefficients grouped by state, and the state-level coefficients generated from a single hyper-hyperparameter.
 
Multilevel models are a subclass of [[hierarchical Bayesian model]]s, which are general models with multiple levels of [[random variable]]s and arbitrary relationships among the different variables. Multilevel analysis has been extended to include multilevel [[structural equation modeling]], multilevel [[latent class model]]ing, and other more general models.
==Bayesian nonlinear mixed-effects model==
 
[[File:Bayesian research cycle.png|500px|thumb|right|Bayesian research cycle using a Bayesian nonlinear mixed-effects model: (a) standard research cycle and (b) Bayesian-specific workflow.<ref name="Repeated Measurement Data 2201">{{Cite journal |last1=Lee|first1=Se Yoon| title = Bayesian Nonlinear Models for Repeated Measurement Data: An Overview, Implementation, and Applications |journal=Mathematics|year=2022|volume=10 |issue=6 |page=898 |doi=10.3390/math10060898|doi-access=free|arxiv=2201.12430}}</ref>]]
 
Multilevel modeling is frequently used in diverse applications and can be formulated within the Bayesian framework. In particular, Bayesian nonlinear mixed-effects models have recently received significant attention. A basic version of the Bayesian nonlinear mixed-effects model is represented as the following three-stage model:
'''''Stage 1: Individual-Level Model'''''
 
<math>\begin{align}
&{y}_{ij} = f(t_{ij};\theta_{1i},\theta_{2i},\ldots,\theta_{li},\ldots,\theta_{Ki} ) + \epsilon_{ij}, \\
&\epsilon_{ij} \sim N(0, \sigma^2), \\
&i =1,\ldots, N, \, j = 1,\ldots, M_i.
\end{align}</math>
 
'''''Stage 2: Population Model'''''
 
<math>\begin{align}
&\theta_{li}= \alpha_l + \sum_{b=1}^{P}\beta_{lb}x_{ib} + \eta_{li}, \\
&\eta_{li} \sim N(0, \omega_l^2), \\
&i =1,\ldots, N, \, l=1,\ldots, K.
\end{align}</math>
 
'''''Stage 3: Prior'''''
 
<math>\begin{align}
&\sigma^2 \sim \pi(\sigma^2), \\
&\alpha_l \sim \pi(\alpha_l), \\
&(\beta_{l1},\ldots,\beta_{lb},\ldots,\beta_{lP}) \sim \pi(\beta_{l1},\ldots,\beta_{lb},\ldots,\beta_{lP}), \\
&\omega_l^2 \sim \pi(\omega_l^2), \\
&l=1,\ldots, K.
\end{align}</math>
 
Here, <math>y_{ij}</math> denotes the continuous response of the <math>i</math>-th subject at the time point <math>t_{ij}</math>, and <math>x_{ib}</math> is the <math>b</math>-th covariate of the <math>i</math>-th subject. Parameters involved in the model are written in Greek letters. <math>f(t ; \theta_{1},\ldots,\theta_{K})</math> is a known function parameterized by the <math>K</math>-dimensional vector <math>(\theta_{1},\ldots,\theta_{K})</math>. Typically, <math>f</math> is a nonlinear function that describes the temporal trajectory of individuals. In the model, <math>\epsilon_{ij}</math> and <math>\eta_{li}</math> describe within-individual variability and between-individual variability, respectively. If '''''Stage 3: Prior''''' is not considered, then the model reduces to a frequentist nonlinear mixed-effect model.
 
A central task in the application of the Bayesian nonlinear mixed-effect models is to evaluate the posterior density:
<math>\propto \pi(\{y_{ij}\}_{i=1,j=1}^{N,M_i}, \{\theta_{li}\}_{i=1,l=1}^{N,K},\sigma^2, \{\alpha_l\}_{l=1}^K, \{\beta_{lb}\}_{l=1,b=1}^{K,P},\{\omega_l\}_{l=1}^K)</math>
 
<math>\begin{align}
=& ~ \left.{\pi(\{y_{ij}\}_{i=1,j=1}^{N,M_i} |\{\theta_{li}\}_{i=1,l=1}^{N,K},\sigma^2)} \right\}\text{Stage 1: Individual-Level Model} \\
\times
& ~ \left.{\pi(\{\theta_{li}\}_{i=1,l=1}^{N,K}|\{\alpha_l\}_{l=1}^K, \{\beta_{lb}\}_{l=1,b=1}^{K,P},\{\omega_l\}_{l=1}^K)} \right\}\text{Stage 2: Population Model} \\
\times
& ~ \left.{p(\sigma^2, \{\alpha_l\}_{l=1}^K, \{\beta_{lb}\}_{l=1,b=1}^{K,P},\{\omega_l\}_{l=1}^K)} \right\}\text{Stage 3: Prior}
\end{align}</math>
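
On the log scale the three-stage factorization becomes a sum, which is how it is usually evaluated numerically. The following Python sketch illustrates this under stated assumptions: the three-parameter logistic curve used for <math>f</math>, the function names, the nested-list data layout, and the flat prior in the usage lines are all hypothetical choices made for illustration, since the model above leaves <math>f</math> and the priors unspecified.

```python
import math


def normal_logpdf(x, mean, var):
    """Log density of N(mean, var) evaluated at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def logistic_curve(t, theta):
    """Assumed example of f with K = 3: theta = (asymptote, rate, midpoint)."""
    a, b, c = theta
    return a / (1.0 + math.exp(-b * (t - c)))


def log_posterior(y, t, x, theta, sigma2, alpha, beta, omega2,
                  log_prior, f=logistic_curve):
    """Unnormalized log posterior: sum of the three stage log densities.

    y[i][j], t[i][j]: response and time for subject i, occasion j
    x[i][b]: covariates; theta[i][l]: subject-level parameters
    alpha[l], beta[l][b], omega2[l]: population-level parameters
    log_prior: user-supplied callable for Stage 3
    """
    total = 0.0
    # Stage 1: individual-level model, y_ij ~ N(f(t_ij; theta_i), sigma^2)
    for i in range(len(y)):
        for j in range(len(y[i])):
            total += normal_logpdf(y[i][j], f(t[i][j], theta[i]), sigma2)
    # Stage 2: population model, theta_li ~ N(alpha_l + sum_b beta_lb x_ib, omega_l^2)
    for i in range(len(theta)):
        for l in range(len(alpha)):
            mean = alpha[l] + sum(beta[l][b] * x[i][b] for b in range(len(x[i])))
            total += normal_logpdf(theta[i][l], mean, omega2[l])
    # Stage 3: prior on sigma^2, alpha, beta and omega^2
    total += log_prior(sigma2, alpha, beta, omega2)
    return total


def flat_log_prior(sigma2, alpha, beta, omega2):
    """Placeholder prior contributing a constant (assumption for the demo)."""
    return 0.0


value = log_posterior(
    y=[[5.0]], t=[[0.0]], x=[[2.0]],
    theta=[[10.0, 1.0, 0.0]], sigma2=1.0,
    alpha=[10.0, 1.0, 0.0], beta=[[0.0], [0.0], [0.0]],
    omega2=[1.0, 1.0, 1.0], log_prior=flat_log_prior,
)
```

In practice such a function would be passed to a sampler or optimizer rather than evaluated at a single point. The sketch only shows how the three stages combine.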
 
The panel on the right displays the Bayesian research cycle using a Bayesian nonlinear mixed-effects model.<ref name="Repeated Measurement Data 2201"/> A research cycle using the Bayesian nonlinear mixed-effects model comprises two steps: (a) the standard research cycle and (b) the Bayesian-specific workflow. The standard research cycle involves reviewing the literature, defining a problem, and specifying the research question and hypothesis. The Bayesian-specific workflow comprises three sub-steps: (b)–(i) formalizing prior distributions based on background knowledge and prior elicitation; (b)–(ii) determining the likelihood function based on a nonlinear function <math> f </math>; and (b)–(iii) making a posterior inference. The resulting posterior inference can be used to start a new research cycle.
 
==See also==
*[[Hyperparameter (Bayesian statistics)|Hyperparameter]]
*[[Mixed-design analysis of variance]]
*[[Multiscale modeling]]
*[[Random effects model]]
*[[Nonlinear mixed-effects model]]
*[[Bayesian hierarchical modeling]]
*[[Restricted randomization]]
 
== Notes ==
{{Reflist|group=lower-alpha}}
 
== References ==
* {{cite book |last1=Swamy |first1=P. A. V. B. |author-link=P. A. V. B. Swamy |last2=Tavlas |first2=George S. |chapter=Random Coefficient Models |title=A Companion to Theoretical Econometrics |editor-last=Baltagi |editor-first=Badi H. |location=Oxford |publisher=Blackwell |year=2001 |isbn=978-0-631-21254-6 |pages=410–429 }}
* {{cite book |last1=Verbeke |first1=G. |last2=Molenberghs |first2=G. |year=2013 |title=Linear Mixed Models for Longitudinal Data |publisher=Springer }} Includes [[SAS (software)|SAS]] code
* {{cite journal |last1=Gomes |first1=Dylan G.E. |title=Should I use fixed effects or random effects when I have fewer than five levels of a grouping factor in a mixed-effects model? |journal=PeerJ |date=20 January 2022 |volume=10 |pages=e12794 |doi=10.7717/peerj.12794|pmid=35116198 |pmc=8784019 |doi-access=free }}
 
==External links==