Multilevel model: Difference between revisions


Revision as of 16:49, 13 January 2024 by Tanisds (→See also: Multiscale modeling)
Latest revision as of 17:38, 21 May 2025 by OAbot (Open access bot: url-access updated in citation with #oabot.)
(14 intermediate revisions by 10 users not shown)
{{Short description\|Type of statistical model}}
{{Use dmy dates\|date=April 2019}}
{{Regression bar}}
'''Multilevel models''' ({{Refn\|also known as '''hierarchical linear models''', '''linear mixed-effect models''', '''mixed models''', '''nested data models''', '''random coefficient''', '''random-effects models''', '''random parameter models''', or '''split-plot designs''')\|group=lower-alpha}} are [[statistical model]]s of [[parameter]]s that vary at more than one level.<ref name="Raud" /> An example could be a model of student performance that contains measures for individual students as well as measures for classrooms within which the students are grouped. These models can be seen as generalizations of [[linear model]]s (in particular, [[linear regression]]), although they can also extend to non-linear models. These models became much more popular after sufficient computing power and software became available.<ref name="Raud" />

Multilevel models are particularly appropriate for research designs where data for participants are organized at more than one level (i.e., [[nested data]]).<ref name="Fidell">{{cite book\|last=Fidell\|first=Barbara G. Tabachnick, Linda S.\|title=Using multivariate statistics\|year=2007\|publisher=Pearson/A & B\|location=Boston ; Montreal\|isbn=978-0-205-45938-4\|edition=5th}}</ref> The units of analysis are usually individuals (at a lower level) who are nested within contextual/aggregate units (at a higher level).<ref name="Luke">{{cite book\|last=Luke\|first=Douglas A.\|title=Multilevel modeling\|year=2004\|publisher=Sage\|location=Thousand Oaks, CA\|isbn=978-0-7619-2879-9\|edition=3. repr.}}</ref> While the lowest level of data in multilevel models is usually an individual, repeated measurements of individuals may also be examined.<ref name="Fidell" /><ref name="Gomes2022">{{cite journal \|last1=Gomes \|first1=Dylan G.E. \|title=Should I use fixed effects or random effects when I have fewer than five levels of a grouping factor in a mixed-effects model? \|journal=PeerJ \|date=20 January 2022 \|volume=10 \|pages=e12794 \|doi=10.7717/peerj.12794\|pmid=35116198 \|pmc=8784019 \|doi-access=free }}</ref> As such, multilevel models provide an alternative type of analysis for univariate or [[multivariate analysis]] of [[repeated measures]]. Individual differences in [[growth curve (statistics)\|growth curves]] may be examined.<ref name="Fidell" /> Furthermore, multilevel models can be used as an alternative to [[ANCOVA]], where scores on the dependent variable are adjusted for covariates (e.g. individual differences) before testing treatment differences.<ref name="Cohen">{{cite book\|last1=Cohen\|first1=Jacob\|title=Applied multiple regression/correlation analysis for the behavioral sciences\|publisher=Erlbaum\|location=Mahwah, NJ [u.a.]\|isbn=978-0-8058-2223-6\|edition=3.\|date=3 October 2003}}</ref> Multilevel models are able to analyze these experiments without the assumption of homogeneity of regression slopes that ANCOVA requires.<ref name="Fidell" />

Multilevel models can be used on data with many levels, although 2-level models are the most common and the rest of this article deals only with these. The dependent variable must be examined at the lowest level of analysis.<ref name="Raud">{{cite book\|last=Bryk\|first=Stephen W. Raudenbush, Anthony S.\|title=Hierarchical linear models : applications and data analysis methods\|year=2002\|publisher=Sage Publications\|location=Thousand Oaks, CA [u.a.]\|isbn=978-0-7619-1904-9\|edition=2. ed., [3. Dr.]}}</ref>

==Level 1 regression equation==
When there is a single level 1 independent variable, the level 1 model is:

<math> Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + e_{ij}</math>.

* <math>Y_{ij} </math> refers to the score on the dependent variable for an individual observation at Level 1 (subscript i refers to the individual case, subscript j refers to the group).
* <math>X_{ij} </math> refers to the Level 1 predictor.
* <math>\beta_{0j} </math> refers to the intercept of the dependent variable for group j.
* <math> \beta_{1j}</math> refers to the slope for the relationship in group j (Level 2) between the Level 1 predictor and the dependent variable.
* <math> e_{ij}</math> refers to the random errors of prediction for the Level 1 equation (it is also sometimes referred to as <math>r_{ij}</math>).

<math>e_{ij} \sim \mathcal{N}(0,\sigma_1^2) </math>

When there are multiple level 1 independent variables, the model can be expanded by substituting vectors and matrices in the equation.

When the relationship between the response <math> Y_{ij} </math> and predictor <math> X_{ij} </math> cannot be described by a linear relationship, one can find some nonlinear functional relationship between the response and predictor and extend the model to a [[nonlinear mixed-effects model]].
For example, when the response <math>Y_{ij} </math> is the cumulative infection trajectory of the <math>i</math>-th country, and <math> X_{ij} </math> represents the <math>j</math>-th time point, then the ordered pairs <math>(X_{ij},Y_{ij})</math> for each country may show a shape similar to the [[logistic function]].<ref>{{Cite journal \|last1=Lee\|first1=Se Yoon \|first2=Bowen \|last2=Lei\|first3=Bani\|last3=Mallick\| title = Estimation of COVID-19 spread curves integrating global data and borrowing information\|journal=PLOS ONE\|year=2020\|volume=15 \|issue=7 \|pages=e0236860 \|doi=10.1371/journal.pone.0236860 \|arxiv=2005.00662\|pmid=32726361 \|pmc=7390340 \|bibcode=2020PLoSO..1536860L \|doi-access=free}}</ref><ref name="ReferenceA">{{Cite journal \|last1=Lee\|first1=Se Yoon \|first2=Bani\|last2=Mallick\| title = Bayesian Hierarchical Modeling: Application Towards Production Results in the Eagle Ford Shale of South Texas\|journal=Sankhya B\|year=2021\|volume=84 \|pages=1–43 \|doi=10.1007/s13571-020-00245-8\|doi-access=\|s2cid=234027590 }}</ref>

==Level 2 regression equation==
The dependent variables are the intercepts and the slopes for the independent variables at Level 1 in the groups of Level 2.

<math>u_{0j} \sim \mathcal{N}(0,\sigma_2^2) </math>

<math>u_{1j} \sim \mathcal{N}(0,\sigma_3^2) </math>

<math>\beta_{0j} = \gamma_{00} + \gamma_{01}w_j + u_{0j}</math>

<math>\beta_{1j} = \gamma_{10} + \gamma_{11}w_j + u_{1j} </math>

* <math>\gamma_{00}</math> refers to the overall intercept. This is the grand mean of the scores on the dependent variable across all the groups when all the predictors are equal to 0.
* <math>\gamma_{10}</math> refers to the average slope between the dependent variable and the Level 1 predictor.
* <math>w_j</math> refers to the Level 2 predictor.
* <math>\gamma_{01}</math> and <math>\gamma_{11}</math> refer to the effect of the Level 2 predictor on the Level 1 intercept and slope respectively.
* <math>u_{0j}</math> refers to the deviation in group j from the overall intercept.
* <math>u_{1j}</math> refers to the deviation in group j from the average slope between the dependent variable and the Level 1 predictor.

==Types of models==

;Orthogonality of regressors to random effects
The regressors must not correlate with the random effects, <math>u_{0j}</math>. This assumption is testable but often ignored, rendering the estimator inconsistent.<ref name=":0">{{Cite journal \|last1=Antonakis \|first1=John \|last2=Bastardoz \|first2=Nicolas \|last3=Rönkkö \|first3=Mikko \|date=2021 \|title=On Ignoring the Random Effects Assumption in Multilevel Models: Review, Critique, and Recommendations \|url=https://jyx.jyu.fi/bitstream/123456789/66704/2/Antonakisym.pdf \|journal=Organizational Research Methods \|language=en \|volume=24 \|issue=2 \|pages=443–483 \|doi=10.1177/1094428119877457 \|s2cid=210355362 \|issn=1094-4281}}</ref> If this assumption is violated, the random effect must be modeled explicitly in the fixed part of the model, either by using dummy variables or by including cluster means of all <math>X_{ij} </math> regressors.<ref name=":0" /><ref>{{Cite journal \|last1=McNeish \|first1=Daniel \|last2=Kelley \|first2=Ken \|date=2019 \|title=Fixed effects models versus mixed effects models for clustered data: Reviewing the approaches, disentangling the differences, and making recommendations. \|url=http://doi.apa.org/getdoi.cfm?doi=10.1037/met0000182 \|journal=Psychological Methods \|language=en \|volume=24 \|issue=1 \|pages=20–35 \|doi=10.1037/met0000182 \|pmid=29863377 \|s2cid=44145669 \|issn=1939-1463\|url-access=subscription }}</ref><ref>{{Cite journal \|last1=Bliese \|first1=Paul D. \|last2=Schepker \|first2=Donald J. \|last3=Essman \|first3=Spenser M. \|last4=Ployhart \|first4=Robert E. \|date=2020 \|title=Bridging Methodological Divides Between Macro- and Microresearch: Endogeneity and Methods for Panel Data \|url=http://journals.sagepub.com/doi/10.1177/0149206319868016 \|journal=Journal of Management \|language=en \|volume=46 \|issue=1 \|pages=70–99 \|doi=10.1177/0149206319868016 \|s2cid=202288849 \|issn=0149-2063\|url-access=subscription }}</ref><ref>{{Cite book \|last=Wooldridge \|first=Jeffrey M. \|url=https://books.google.com/books?id=hSs3AgAAQBAJ&dq=info:T5fz2cmyyF8J:scholar.google.com&pg=PP1 \|title=Econometric Analysis of Cross Section and Panel Data, second edition \|date=2010-10-01 \|publisher=MIT Press \|isbn=978-0-262-29679-3 \|language=en}}</ref> This is probably the most important assumption the estimator makes, but one that is misunderstood by most applied researchers using these types of models.<ref name=":0" />

==Statistical tests==

===Level===
The concept of level is the keystone of this approach. In an [[educational research]] example, the levels for a 2-level model might be:
#pupil
#class

However, if one were studying multiple schools and multiple school districts, a 4-level model could include
#pupil
#class

As a simple example, consider a basic linear regression model that predicts income as a function of age, class, gender and race. It might then be observed that income levels also vary depending on the city and state of residence.
A simple way to incorporate this into the regression model would be to add an additional [[independent variable\|independent]] [[categorical variable]] to account for the location (i.e. a set of additional binary predictors and associated regression coefficients, one per location). This would have the effect of shifting the mean income up or down, but it would still assume, for example, that the effect of race and gender on income is the same everywhere. In reality, this is unlikely to be the case: different local laws, different retirement policies, differences in level of racial prejudice, etc. are likely to cause all of the predictors to have different sorts of effects in different locales.

In other words, a simple linear regression model might, for example, predict that a given randomly sampled person in [[Seattle]] would have an average yearly income $10,000 higher than a similar person in [[Mobile, Alabama]]. However, it would also predict, for example, that a white person might have an average income $7,000 above a black person, and a 65-year-old might have an income $3,000 below a 45-year-old, in both cases regardless of location. A multilevel model, however, would allow for different regression coefficients for each predictor in each location. Essentially, it would assume that people in a given location have correlated incomes generated by a single set of regression coefficients, whereas people in another location have incomes generated by a different set of coefficients. Meanwhile, the coefficients themselves are assumed to be correlated and generated from a single set of [[Hyperparameter (Bayesian statistics)\|hyperparameter]]s. Additional levels are possible: for example, people might be grouped by cities, and the city-level regression coefficients grouped by state, and the state-level coefficients generated from a single hyper-hyperparameter.

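The idea that each group has its own coefficients, drawn from a common distribution, can be illustrated with a small simulation. The following is a minimal sketch, not a fitting procedure: it generates data from the two-level equations given earlier and then recovers group-specific intercepts and slopes with a separate least-squares fit per group. The function names, the numerical values of the fixed effects and standard deviations, and the use of equal group sizes are all assumptions made for illustration; a real multilevel estimator would partially pool the group estimates rather than fit each group independently.

```python
import numpy as np

def simulate_two_level(n_groups=50, n_per_group=30, seed=0):
    """Simulate Y_ij = beta_0j + beta_1j * X_ij + e_ij with group-varying coefficients."""
    rng = np.random.default_rng(seed)
    # Hypothetical fixed effects (gamma) and standard deviations, chosen for illustration.
    g00, g01, g10, g11 = 2.0, 0.5, 1.0, -0.3
    sd_e, sd_u0, sd_u1 = 1.0, 0.8, 0.4

    w = rng.normal(size=n_groups)            # Level 2 predictor w_j
    u0 = rng.normal(0.0, sd_u0, n_groups)    # group deviation from the overall intercept
    u1 = rng.normal(0.0, sd_u1, n_groups)    # group deviation from the average slope

    # Level 2 equations: coefficients depend on w_j plus a random deviation.
    beta0 = g00 + g01 * w + u0
    beta1 = g10 + g11 * w + u1

    # Level 1 equation, evaluated for every observation i in every group j.
    x = rng.normal(size=(n_groups, n_per_group))
    e = rng.normal(0.0, sd_e, size=(n_groups, n_per_group))
    y = beta0[:, None] + beta1[:, None] * x + e
    return w, x, y, beta0, beta1

def per_group_ols(x, y):
    """Fit a separate simple regression in each group (no pooling across groups)."""
    x_centered = x - x.mean(axis=1, keepdims=True)
    y_centered = y - y.mean(axis=1, keepdims=True)
    slope = (x_centered * y_centered).sum(axis=1) / (x_centered ** 2).sum(axis=1)
    intercept = y.mean(axis=1) - slope * x.mean(axis=1)
    return intercept, slope

if __name__ == "__main__":
    w, x, y, beta0, beta1 = simulate_two_level()
    b0_hat, b1_hat = per_group_ols(x, y)
    print("correlation of true and estimated slopes:",
          np.corrcoef(beta1, b1_hat)[0, 1])
```

In this sketch the rows of `x` and `y` correspond to groups (locations in the income example), so the spread of the estimated slopes across rows reflects both the Level 2 predictor and the random deviations.
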
Multilevel models are a subclass of [[hierarchical Bayesian model]]s, which are general models with multiple levels of [[random variable]]s and arbitrary relationships among the different variables. Multilevel analysis has been extended to include multilevel [[structural equation modeling]], multilevel [[latent class model]]ing, and other more general models.

==Bayesian nonlinear mixed-effects model==
[[File:Bayesian research cycle.png\|500px\|thumb\|right\|Bayesian research cycle using Bayesian nonlinear mixed effects model: (a) standard research cycle and (b) Bayesian-specific workflow.<ref name="Repeated Measurement Data 2201">{{Cite journal \|last1=Lee\|first1=Se Yoon\| title = Bayesian Nonlinear Models for Repeated Measurement Data: An Overview, Implementation, and Applications \|journal=Mathematics\|year=2022\|volume=10 \|issue=6 \|page=898 \|doi=10.3390/math10060898\|doi-access=free\|arxiv=2201.12430}}</ref>]]

Multilevel modeling is frequently used in diverse applications and it can be formulated within the Bayesian framework. In particular, Bayesian nonlinear mixed-effects models have recently received significant attention. A basic version of the Bayesian nonlinear mixed-effects model is represented in the following three stages:

'''''Stage 1: Individual-Level Model'''''

<math>\begin{align}
&{y}_{ij} = f(t_{ij};\theta_{1i},\theta_{2i},\ldots,\theta_{li},\ldots,\theta_{Ki} ) + \epsilon_{ij}, \\
\phantom{spacer} \\
&\epsilon_{ij} \sim N(0, \sigma^2), \\
\phantom{spacer} \\
&i =1,\ldots, N, \, j = 1,\ldots, M_i.
\end{align}</math>

'''''Stage 2: Population Model'''''

<math>\begin{align}
&\theta_{li}= \alpha_l + \sum_{b=1}^{P}\beta_{lb}x_{ib} + \eta_{li}, \\
\phantom{spacer} \\
&\eta_{li} \sim N(0, \omega_l^2), \\
\phantom{spacer} \\
&i =1,\ldots, N, \, l=1,\ldots, K.
\end{align}</math>

'''''Stage 3: Prior'''''

<math>\begin{align}
&\sigma^2 \sim \pi(\sigma^2), \\
\phantom{spacer} \\
&\alpha_l \sim \pi(\alpha_l), \\
\phantom{spacer} \\
&(\beta_{l1},\ldots,\beta_{lb},\ldots,\beta_{lP}) \sim \pi(\beta_{l1},\ldots,\beta_{lb},\ldots,\beta_{lP}), \\
\phantom{spacer} \\
&\omega_l^2 \sim \pi(\omega_l^2), \\
\phantom{spacer} \\
&l=1,\ldots, K.
\end{align}</math>

Here, <math>y_{ij}</math> denotes the continuous response of the <math>i</math>-th subject at the time point <math>t_{ij}</math>, and <math>x_{ib}</math> is the <math>b</math>-th covariate of the <math>i</math>-th subject. Parameters involved in the model are written in Greek letters. <math>f(t ; \theta_{1},\ldots,\theta_{K})</math> is a known function parameterized by the <math>K</math>-dimensional vector <math>(\theta_{1},\ldots,\theta_{K})</math>. Typically, <math>f</math> is a nonlinear function and describes the temporal trajectory of individuals. In the model, <math>\epsilon_{ij}</math> and <math>\eta_{li}</math> describe within-individual variability and between-individual variability, respectively. If '''''Stage 3: Prior''''' is not considered, then the model reduces to a frequentist nonlinear mixed-effect model.

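Stages 1 and 2 together define a generative process, which can be made concrete by simulating from it. The sketch below is illustrative only: it assumes a three-parameter logistic curve for <math>f</math> (so <math>K=3</math>), two covariates (<math>P=2</math>), the same number of time points for every subject, and arbitrary numerical values for <math>\alpha_l</math>, <math>\beta_{lb}</math>, <math>\omega_l</math> and <math>\sigma</math>. None of these choices come from the model description above, and Stage 3 (the priors) is not represented, since the parameters are simply fixed.

```python
import numpy as np

def logistic_curve(t, a, b, c):
    """Assumed form of f(t; theta_1, theta_2, theta_3): plateau a, rate b, midpoint c."""
    return a / (1.0 + np.exp(-b * (t - c)))

def simulate_nlme(n_subjects=20, n_times=15, seed=1):
    """Draw subject parameters (Stage 2), then noisy trajectories (Stage 1)."""
    rng = np.random.default_rng(seed)
    n_params, n_covariates = 3, 2                  # K and P (assumed)

    # Hypothetical population-level values, fixed here instead of given priors.
    alpha = np.array([10.0, 0.8, 5.0])             # alpha_l
    beta = np.array([[1.0, 0.0],
                     [0.0, 0.1],
                     [0.5, 0.0]])                  # beta_lb, shape (K, P)
    omega = np.array([1.0, 0.05, 0.5])             # omega_l (standard deviations)
    sigma = 0.3                                    # residual standard deviation

    # Stage 2: theta_li = alpha_l + sum_b beta_lb * x_ib + eta_li
    x = rng.normal(size=(n_subjects, n_covariates))
    eta = rng.normal(0.0, omega, size=(n_subjects, n_params))
    theta = alpha + x @ beta.T + eta               # shape (N, K)

    # Stage 1: y_ij = f(t_ij; theta_i) + epsilon_ij, with a shared time grid (assumed)
    t = np.tile(np.linspace(0.0, 10.0, n_times), (n_subjects, 1))
    mean = logistic_curve(t, theta[:, [0]], theta[:, [1]], theta[:, [2]])
    y = mean + rng.normal(0.0, sigma, size=mean.shape)
    return t, x, theta, y

if __name__ == "__main__":
    t, x, theta, y = simulate_nlme()
    print("subjects x time points:", y.shape)
```

Each row of `theta` holds one subject's curve parameters, so differences between rows reflect between-individual variability, while the noise added in the last step reflects within-individual variability.
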
<math>\propto \pi(\{y_{ij}\}_{i=1,j=1}^{N,M_i}, \{\theta_{li}\}_{i=1,l=1}^{N,K},\sigma^2, \{\alpha_l\}_{l=1}^K, \{\beta_{lb}\}_{l=1,b=1}^{K,P},\{\omega_l\}_{l=1}^K)</math>

<math>\begin{align}
=& ~ \left.{\pi(\{y_{ij}\}_{i=1,j=1}^{N,M_i} \mid \{\theta_{li}\}_{i=1,l=1}^{N,K},\sigma^2)} \right\}\text{Stage 1: Individual-Level Model} \\
\phantom{spacer} \\
\times & ~ \left.{\pi(\{\theta_{li}\}_{i=1,l=1}^{N,K} \mid \{\alpha_l\}_{l=1}^K, \{\beta_{lb}\}_{l=1,b=1}^{K,P},\{\omega_l\}_{l=1}^K)} \right\}\text{Stage 2: Population Model} \\
\phantom{spacer} \\
\times & ~ \left.{p(\sigma^2, \{\alpha_l\}_{l=1}^K, \{\beta_{lb}\}_{l=1,b=1}^{K,P},\{\omega_l\}_{l=1}^K)} \right\}\text{Stage 3: Prior}
\end{align}</math>

The panel on the right displays the Bayesian research cycle using a Bayesian nonlinear mixed-effects model.<ref name="Repeated Measurement Data 2201"/> A research cycle using the Bayesian nonlinear mixed-effects model comprises two steps: (a) standard research cycle and (b) Bayesian-specific workflow. The standard research cycle involves literature review, defining a problem and specifying the research question and hypothesis. The Bayesian-specific workflow comprises three sub-steps: (b)–(i) formalizing prior distributions based on background knowledge and prior elicitation; (b)–(ii) determining the likelihood function based on a nonlinear function <math> f </math>; and (b)–(iii) making a posterior inference. The resulting posterior inference can be used to start a new research cycle.

==See also==
* [[Hyperparameter (Bayesian statistics)\|Hyperparameter]]
* [[Mixed-design analysis of variance]]
* [[Multiscale modeling]]
* [[Random effects model]]
* [[Nonlinear mixed-effects model]]
* [[Bayesian hierarchical modeling]]
* [[Restricted randomization]]

== Notes ==
{{Reflist\|group=lower-alpha}}

== References ==