[[File:Example Structural equation model.svg|alt= An example structural equation model|thumb|336x336px|Figure 1. An example structural equation model after estimation. Latent variables are sometimes indicated with ovals while observed variables are shown in rectangles. Residuals and variances are sometimes drawn as double-headed arrows (shown here) or single arrows and a circle (as in Figure 2). The latent IQ variance is fixed at 1 to provide scale to the model. Figure 1 depicts measurement errors influencing each indicator of latent intelligence and each indicator of latent achievement. Neither the indicators nor the measurement errors of the indicators are modeled as influencing the latent variables.<ref name="Salkind2007" />]]
[[File:Example SEM of Human Intelligence.png|alt=An example structural equation model pre-estimation|thumb|336x336px|Figure 2. An example structural equation model before estimation. Similar to Figure 1 but without standardized values and with fewer items. Because intelligence and academic performance are merely imagined or theory-postulated variables, their precise scale values are unknown, though the model specifies that each latent variable's values must fall somewhere along the observable scale possessed by one of the indicators. The 1.0 effect connecting a latent to an indicator specifies that each real unit increase or decrease in the latent variable's value results in a corresponding unit increase or decrease in the indicator's value. It is hoped a good indicator has been chosen for each latent, but the 1.0 values do not signal perfect measurement because this model also postulates that there are other unspecified entities causally impacting the observed indicator measurements, thereby introducing measurement error.]]
'''Structural equation modeling''' ('''SEM''') is a diverse set of methods used by scientists for both observational and experimental research. SEM is used mostly in the social and behavioral science fields, but it is also used in epidemiology,<ref name="BM08">{{cite book | doi=10.4135/9781412953948.n443 | chapter=Structural Equation Modeling | title=Encyclopedia of Epidemiology | date=2008 | isbn=978-1-4129-2816-8 }}</ref> business,<ref name="Shelley06">{{cite book | doi=10.4135/9781412939584.n544 | chapter=Structural Equation Modeling | title=Encyclopedia of Educational Leadership and Administration | date=2006 | isbn=978-0-7619-3087-7 }}</ref> and other fields.
SEM involves a model representing how various aspects of some [[phenomenon]] are thought to [[Causality|causally]] connect to one another. Structural equation models often contain postulated causal connections among some latent variables (variables thought to exist but which cannot be directly observed, such as an attitude, intelligence, or mental illness). Additional causal connections link those latent variables to observed variables whose values appear in a data set. The causal connections are represented using equations, but the postulated structuring can also be presented using diagrams containing arrows, as in Figures 1 and 2.
The boundary between what is and is not a structural equation model is not always clear. Variations among the styles of latent causal connections, variations among the observed variables measuring the latent variables, and variations in the statistical estimation strategies result in the SEM toolkit including [[confirmatory factor analysis]] (CFA), [[confirmatory composite analysis]], [[Path analysis (statistics)|path analysis]], multi-group modeling, longitudinal modeling, [[partial least squares path modeling]], [[latent growth modeling]] and hierarchical or multilevel modeling.<ref name="kline_2016">{{Cite book|last=Kline|first=Rex B. |title=Principles and practice of structural equation modeling|date=2016 |isbn=978-1-4625-2334-4|edition=4th |___location=New York|oclc=934184322}}</ref><ref name="Hayduk87">Hayduk, L. (1987) Structural Equation Modeling with LISREL: Essentials and Advances. Baltimore, Johns Hopkins University Press. ISBN 0-8018-3478-3</ref><ref>{{Cite book |last=Bollen |first=Kenneth A. |title=Structural equations with latent variables |date=1989 |publisher=Wiley |isbn=0-471-01171-1 |___location=New York |oclc=18834634}}</ref><ref>{{Cite book |last=Kaplan |first=David |title=Structural equation modeling: foundations and extensions |date=2009 |publisher=SAGE |isbn=978-1-4129-1624-0 |edition=2nd |___location=Los Angeles |oclc=225852466}}</ref>
SEM researchers use computer programs to estimate the strength and sign of the coefficients corresponding to the modeled structural connections, for example the numbers connected to the arrows in Figure 1. Because a postulated model such as Figure 1 may not correspond to the worldly forces controlling the observed data measurements, the programs also provide model tests and diagnostic clues suggesting which indicators, or which model components, might introduce inconsistency between the model and observed data. Criticisms of SEM methods often concern how researchers respond to such evidence of model-data inconsistency.
A great advantage of SEM is that all of these measurements and tests occur simultaneously in one statistical estimation procedure, where all the model coefficients are calculated using all information from the observed variables. This means the estimates are more accurate than if a researcher were to calculate each part of the model separately.{{sfn|MacCallum|Austin|2000|p=209}}
== History ==
Structural equation modeling (SEM) began differentiating itself from correlation and regression when [[Sewall Wright]] provided explicit causal interpretations for a set of regression-style equations based on a solid understanding of the physical and physiological mechanisms producing direct and indirect effects among his observed variables.<ref name="Wright21">{{cite journal |last1=Wright |first1=Sewall |title=Correlation and causation |journal=Journal of Agricultural Research |volume=20 |pages=557–585 |year=1921 }}</ref>
Different yet mathematically related modeling approaches developed in psychology, sociology, and economics. Early [[Cowles Foundation|Cowles Commission]] work on [[Simultaneous equations model|simultaneous equations]] estimation centered on Koopmans and Hood's (1953) algorithms from [[transport economics]] and optimal routing, using [[maximum likelihood estimation]] and closed-form algebraic calculations, as iterative solution-search techniques were limited in the days before computers. The convergence of two of these developmental streams (factor analysis from psychology, and path analysis from sociology via Duncan) produced the current core of SEM. One of several programs Karl Jöreskog developed at Educational Testing Service, LISREL,<ref name="JGvT70">Jöreskog, Karl; Gruvaeus, Gunnar T.; van Thillo, Marielle. (1970) ACOVS: A General Computer Program for Analysis of Covariance Structures. Princeton, N.J.; Educational Testing Services.{{pn|date=June 2025}}</ref> embedded latent variables (the factors of factor analysis) within path-analysis-style structural equations.
Traces of the historical convergence of the factor analytic and path analytic traditions persist as the distinction between the measurement and structural portions of models, and as continuing disagreements over model testing and over whether measurement should precede or accompany structural estimates.<ref name="HG00a">{{cite journal |last1=Hayduk |first1=Leslie A. |last2=Glaser |first2=Dale N. |title=Jiving the Four-Step, Waltzing Around Factor Analysis, and Other Serious Fun |journal=Structural Equation Modeling |volume=7 |issue=1 |pages=1–35 |year=2000 }}</ref>
Wright's path analysis influenced Hermann Wold, Wold's student Karl Jöreskog, and Jöreskog's student Claes Fornell, but SEM never gained a large following among U.S. econometricians, possibly due to fundamental differences in modeling objectives and typical data structures. The prolonged separation of SEM's economic branch led to procedural and terminological differences, though deep mathematical and statistical connections remain.<ref name="Westland15">Westland, J. Christopher (2015) ''Structural Equation Models: From Paths to Networks''. New York, Springer.</ref>
[[Judea Pearl]]<ref name="Pearl09" /> extended SEM from linear to nonparametric models, and proposed causal and counterfactual interpretations of the equations. Nonparametric SEMs permit estimating total, direct and indirect effects without making any commitment to linearity of effects or assumptions about the distributions of the error terms.<ref name="BP13" />
SEM analyses are popular in the social sciences because these analytic techniques help researchers break down complex concepts and understand causal processes, but the complexity of the models can introduce substantial variability in the results depending on the presence or absence of conventional control variables, the sample size, and the variables of interest.
Today, SEM forms a standard part of the methodological toolkit across the social and behavioral sciences.
== General steps and considerations ==
* and which coefficients will be given fixed/unchanging values (e.g. to provide measurement scales for latent variables as in Figure 2).
The latent level of a model is composed of [[Exogenous and endogenous variables|''endogenous'' and ''exogenous'' variables]]. The endogenous latent variables are the true-score variables postulated as receiving effects from at least one other modeled variable. Each endogenous variable is modeled as the dependent variable in a regression-style equation. The exogenous latent variables are background variables postulated as causing one or more of the endogenous variables and are modeled like the predictor variables in regression-style equations. Causal connections among the exogenous variables are not explicitly modeled but are usually acknowledged by modeling the exogenous variables as freely correlating with one another. The model may include intervening variables – variables receiving effects from some variables but also sending effects to other variables. As in regression, each endogenous variable is assigned a residual or error variable encapsulating the effects of unavailable and usually unknown causes. Each latent variable, whether [[Exogenous and endogenous variables|exogenous or endogenous]], is thought of as containing the cases' true-scores on that variable, and these true-scores causally contribute valid/genuine variations into one or more of the observed/reported indicator variables.<ref name="BMvH03">{{cite journal | doi=10.1037/0033-295X.110.2.203 | title=The theoretical status of latent variables | date=2003 | last1=Borsboom | first1=Denny | last2=Mellenbergh | first2=Gideon J. | last3=Van Heerden | first3=Jaap | journal=Psychological Review | volume=110 | issue=2 | pages=203–219 | pmid=12747522 }}</ref>
The LISREL program assigned Greek names to the elements in a set of matrices to keep track of the various model components. These names became relatively standard notation, though the notation has been extended and altered to accommodate a variety of statistical considerations.<ref name="JS76"/><ref name="Hayduk87"/><ref name="Bollen89"/><ref name="Kline16" >Kline, Rex. (2016) Principles and Practice of Structural Equation Modeling (4th ed). New York, Guilford Press. ISBN 978-1-4625-2334-4</ref> Texts and programs that "simplify" model specification via diagrams, or via equations permitting user-selected variable names, re-convert the user's model into some standard matrix-algebra form in the background. The "simplifications" are achieved by implicitly introducing default program "assumptions" about model features with which users supposedly need not concern themselves. Unfortunately, these defaults can obscure model components, leaving unrecognized issues lurking within the model's structure and underlying matrices.
Two main components of models are distinguished in SEM: the ''structural model'' showing potential causal dependencies between [[Exogenous and endogenous variables|endogenous and exogenous latent variables]], and the ''measurement model'' showing the causal connections between the latent variables and the indicators. Exploratory and confirmatory [[factor analysis]] models, for example, focus on the causal measurement connections, while [[path analysis (statistics)|path models]] more closely correspond to SEM's latent structural connections.
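In the matrix notation popularized by LISREL, one common presentation of these two components writes the measurement model as

:<math>y = \Lambda_y \eta + \varepsilon \qquad x = \Lambda_x \xi + \delta</math>

and the structural model as

:<math>\eta = B \eta + \Gamma \xi + \zeta</math>

where <math>\eta</math> and <math>\xi</math> are the endogenous and exogenous latent variables, <math>y</math> and <math>x</math> are their observed indicators, the <math>\Lambda</math> matrices contain the loadings (measurement effects), <math>B</math> and <math>\Gamma</math> contain the latent-level structural effects, and <math>\varepsilon</math>, <math>\delta</math>, and <math>\zeta</math> are the measurement errors and structural residuals.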
Modelers specify each coefficient in a model as being ''free'' to be estimated, or ''fixed'' at some value. The free coefficients may be postulated effects the researcher wishes to test, background correlations among the exogenous variables, or the variances of the residual or error variables providing additional variations in the endogenous latent variables. The fixed coefficients may be values like the 1.0 values in Figure 2 that provide scales for the latent variables, or values of 0.0 which assert causal disconnections such as the assertion of no-direct-effects (no arrows) pointing from Academic Achievement to any of the four scales in Figure 1. SEM programs provide estimates and tests of the free coefficients, while the fixed coefficients contribute importantly to testing the overall model structure. Various kinds of constraints between coefficients can also be used.<ref name="Kline16"/><ref name="Hayduk87"/><ref name="Bollen89"/> The model specification depends on what is known from the literature, the researcher's experience with the modeled indicator variables, and the features being investigated by using the specific model structure.
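How free and fixed coefficients are declared depends on the program. As one illustrative sketch, the following uses the semopy package for Python (listed under Software below) with its lavaan-style model syntax; the indicator and file names are hypothetical, and the scale-setting convention of fixing each latent variable's first loading at 1.0 is assumed to be the program default.

<syntaxhighlight lang="python">
# Minimal sketch: specifying and estimating a small structural equation model.
# Indicator names and the data file are hypothetical placeholders.
import pandas as pd
import semopy

# '=~' lines give the measurement model (latent =~ its indicators); under the
# assumed default, each latent's first loading is fixed at 1.0 to set its scale.
# The '~' line gives the structural model: a free, to-be-estimated direct effect.
MODEL_DESC = """
Intelligence =~ iq_test1 + iq_test2
Achievement =~ grade1 + grade2
Achievement ~ Intelligence
"""

data = pd.read_csv("student_scores.csv")   # observed indicator values, one row per case

model = semopy.Model(MODEL_DESC)
model.fit(data)            # all free coefficients are estimated simultaneously
print(model.inspect())     # estimates, standard errors, and p-values per coefficient
</syntaxhighlight>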
There is a limit to how many coefficients can be estimated in a model. If there are fewer data points (distinct observed variances and covariances) than the number of estimated coefficients, the resulting model is said to be "unidentified" and no coefficient estimates can be obtained. Reciprocal effects, and other causal loops, may also interfere with estimation.<ref name="Rigdon95">{{cite journal |last1=Rigdon |first1=Edward E. |title=A Necessary and Sufficient Identification Rule for Structural Models Estimated in Practice |journal=Multivariate Behavioral Research |volume=30 |issue=3 |pages=359–383 |year=1995 }}</ref>
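The counting behind this limit can be sketched directly: with <math>p</math> observed variables the data supply <math>p(p+1)/2</math> distinct variances and covariances, and subtracting the number of free coefficients gives the model's degrees of freedom. A negative result means the model cannot be identified; a non-negative result is necessary, but not sufficient, for identification. The parameter count below is illustrative only.

<syntaxhighlight lang="python">
# The "t-rule": a necessary (but not sufficient) condition for identification.
def model_degrees_of_freedom(n_observed_variables: int, n_free_coefficients: int) -> int:
    data_moments = n_observed_variables * (n_observed_variables + 1) // 2
    return data_moments - n_free_coefficients

# Example: 4 indicators supply 10 variances/covariances; a model with 8 free
# coefficients leaves 2 degrees of freedom for testing model fit.
print(model_degrees_of_freedom(4, 8))   # prints 2; a negative value signals an unidentified model
</syntaxhighlight>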
=== Estimation of free model coefficients ===
Model coefficients fixed at zero, 1.0, or other values do not require estimation because they already have specified values. Estimated values for free model coefficients are obtained by maximizing fit to the data, as sketched after the following list. A model's implications for what the data should look like for a specific set of coefficient values depend on:
a) the coefficients' locations in the model (e.g. which variables are connected/disconnected),
b) the nature of the connections between the variables (covariances or effects; with effects often assumed to be linear),
c) the nature of the error or residual variables (often assumed to be independent of, or causally disconnected from, the other modeled variables),
and d) the measurement scales appropriate for the variables (interval level measurement is often assumed).
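For covariance-based SEM under multivariate normality assumptions, maximizing fit commonly amounts to minimizing a maximum-likelihood discrepancy between the observed covariance matrix <math>S</math> of the <math>p</math> indicators and the covariance matrix <math>\Sigma(\theta)</math> implied by the model at candidate coefficient values <math>\theta</math>:

:<math>F_{ML}(\theta) = \ln\left|\Sigma(\theta)\right| + \operatorname{tr}\left(S\,\Sigma(\theta)^{-1}\right) - \ln\left|S\right| - p</math>

The estimates are the coefficient values that minimize <math>F_{ML}</math>, and the function equals zero only when the model exactly reproduces the observed variances and covariances.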
Coefficient estimates in data-inconsistent ("failing") models are interpretable as reports of how the world would appear to someone believing a model that conflicts with the available data. The estimates in data-inconsistent models do not necessarily become "obviously wrong" by becoming statistically strange, or wrongly signed according to theory. The estimates may even closely match a theory's requirements, but the remaining data inconsistency renders the match between the estimates and theory unable to provide support. Failing models remain interpretable, but only as interpretations that conflict with available evidence.
Replication is unlikely to detect misspecified models which inappropriately fit the data. If the replicate data is within random variations of the original data, the same incorrect coefficient placements that provided inappropriate fit to the original data will likely also inappropriately fit the replicate data. Replication helps detect issues such as data mistakes (made by different research groups), but is especially weak at detecting misspecifications after exploratory model modification – as when confirmatory factor analysis (CFA) is applied to a random second half of the data following exploratory factor analysis (EFA) of the first half of the data.
A modification index is an estimate of how much a model's fit to the data would "improve" (but not necessarily how much the model's structure would improve) if a specific, currently fixed, model coefficient were freed for estimation.
"Accepting" failing models as "close enough" is also not a reasonable alternative. A cautionary instance was provided by Browne, MacCallum, Kim, Anderson, and Glaser who addressed the mathematics behind why the {{math|χ<sup>2</sup>}} test can have (though it does not always have) considerable power to detect model misspecification.<ref name="BMKAG02">{{cite journal |last1=Browne
Many researchers tried to justify switching to fit-indices, rather than testing their models, by claiming that {{math|χ<sup>2</sup>}} increases (and hence {{math|χ<sup>2</sup>}} probability decreases) with increasing sample size (N). There are two mistakes in discounting {{math|χ<sup>2</sup>}} on this basis. First, for proper models, {{math|χ<sup>2</sup>}} does not increase with increasing N,<ref name="Hayduk14b"/> so if {{math|χ<sup>2</sup>}} increases with N that itself is a sign that something is detectably problematic. And second, for models that are detectably misspecified, {{math|χ<sup>2</sup>}} increase with N provides the good news of increasing statistical power to detect model misspecification (namely, a reduced risk of Type II error). Some kinds of important misspecifications cannot be detected by {{math|χ<sup>2</sup>}},<ref name="Hayduk14a"/> so any amount of ill fit beyond what might be reasonably produced by random variations warrants report and consideration.<ref name="Barrett07"/><ref name="Hayduk14b"/> The {{math|χ<sup>2</sup>}} model test, possibly adjusted,<ref name="SB94">Satorra, A.; and Bentler, P. M. (1994) “Corrections to test statistics and standard errors in covariance structure analysis”. In A. von Eye and C. C. Clogg (Eds.), Latent variables analysis: Applications for developmental research (pp. 399–419). Thousand Oaks, CA: Sage.</ref> is the strongest available structural equation model test.
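The sample-size dependence can be made concrete. With maximum-likelihood estimation the test statistic is approximately <math>\chi^2 = (N-1)\,F_{ML}</math> evaluated at the minimized fit-function value. For a correctly specified model the minimized value shrinks as the sample grows, keeping the expected statistic near the degrees of freedom, whereas a misspecified model's minimized value approaches a fixed nonzero amount, so the statistic grows roughly in proportion to N. A small sketch of that arithmetic, using an illustrative (not estimated) fit-function value:

<syntaxhighlight lang="python">
# Illustrative arithmetic only: chi-square = (N - 1) * F_ML at the minimum.
def chi_square_statistic(n_cases: int, minimized_fit_value: float) -> float:
    return (n_cases - 1) * minimized_fit_value

POPULATION_MISFIT = 0.05   # hypothetical limiting fit-function value of a misspecified model
for n in (100, 500, 2000):
    print(n, chi_square_statistic(n, POPULATION_MISFIT))
# The statistic grows roughly in proportion to N (about 5, 25, and 100 here),
# reflecting increasing power to detect the same fixed misspecification.
</syntaxhighlight>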
Numerous fit indices quantify how closely a model fits the data but all fit indices suffer from the logical difficulty that the size or amount of ill fit is not trustably coordinated with the severity or nature of the issues producing the data inconsistency.<ref name="Hayduk14a"/> Models with different causal structures that fit the data identically well have been called equivalent models.<ref name="Kline16"/> Such models are data-fit-equivalent though not causally equivalent, so at least one of the so-called equivalent models must be inconsistent with the world's structure. If there is a perfect 1.0 correlation between X and Y and we model this as X causes Y, there will be perfect fit and zero residual error. But the model may not match the world because Y may actually cause X, or both X and Y may be responding to a common cause Z, or the world may contain a mixture of these effects (e.g. like a common cause plus an effect of Y on X), or other causal structures. The perfect fit does not tell us the model's structure corresponds to the world's structure, and this in turn implies that getting closer to perfect fit does not necessarily correspond to getting closer to the world's structure – maybe it does, maybe it doesn't. This makes it incorrect for a researcher to claim that even perfect model fit implies the model is correctly causally specified. For even moderately complex models, precisely equivalently-fitting models are rare. Models almost-fitting the data, according to any index, unavoidably introduce additional potentially-important yet unknown model misspecifications. These models constitute a greater research impediment.
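The X and Y example can be written out explicitly. The model "X causes Y" implies <math>\operatorname{cov}(X,Y) = \beta_{YX}\operatorname{var}(X)</math>, while the reversed model "Y causes X" implies <math>\operatorname{cov}(X,Y) = \beta_{XY}\operatorname{var}(Y)</math>. With standardized variables, either model reproduces an observed correlation <math>r</math> exactly by setting its single free coefficient to <math>r</math>, so fit alone cannot distinguish between the two causal structures.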
This logical weakness renders all fit indices "unhelpful" whenever a structural equation model is significantly inconsistent with the data,<ref name="Barrett07"/> but several forces continue to propagate fit-index use. For example, Dag Sorbom reported that when someone asked Karl Joreskog, the developer of the first structural equation modeling program, "Why have you then added GFI to your LISREL program?", Joreskog replied "Well, users threaten us saying they would stop using LISREL if it always produces such large chi-squares. So we had to invent something to make people happy. GFI serves that purpose."<ref name="Sorbom"/>
Whether or not researchers are committed to seeking the world’s structure is a fundamental concern. Displacing test evidence of model-data inconsistency by hiding it behind index claims of acceptable fit introduces the discipline-wide cost of diverting attention away from whatever the discipline might have done to attain a structurally improved understanding of the discipline’s substance. The discipline ends up paying a real cost for index-based displacement of evidence of model misspecification. The frictions created by disagreements over the necessity of correcting model misspecifications will likely increase with increasing use of non-factor-structured models, and with use of fewer, more precise, indicators of similar yet importantly different latent variables.<ref name="HL12"/>
# whether the researcher knowingly agrees to disregard evidence pointing to the kinds of misspecifications on which the index criteria were based. (If the index criterion is based on simulating a missing factor loading or two, using that criterion acknowledges the researcher's willingness to accept a model missing a factor loading or two.);
# whether the latest, not outdated, index criteria are being used (because the criteria for some indices tightened over time);
# whether satisfying criterion values on pairs of indices are required (e.g. Hu and Bentler recommended pairing a cutoff on one index, such as SRMR, with a cutoff on another, such as CFI or RMSEA);
# whether a model test is, or is not, available. (A {{math|χ<sup>2</sup>}} value, degrees of freedom, and probability will be available for models reporting indices based on {{math|χ<sup>2</sup>}}.)
# and whether the researcher has considered both alpha (Type I) and beta (Type II) errors in making their index-based decisions (e.g. if the model is significantly data-inconsistent, the "tolerable" amount of inconsistency is likely to differ across medical, business, social, and psychological contexts).
|'''Factor Model''' proposed critical values
| .06
|
|
|References proposing revised/changed critical values,<br/>
or disagreements over critical values
|{{sfn|Hu|Bentler|1999}}
|{{sfn|Hu|Bentler|1999}}
|{{sfn|Hu|Bentler|1999}}
|-
|References indicating two-index or paired-index
criteria are required
|{{sfn|Hu|Bentler|1999}}
|{{sfn|Hu|Bentler|1999}}
|{{sfn|Hu|Bentler|1999}}
|-
=== Interpretation ===
Causal interpretations of SE models are the clearest and most understandable but those interpretations will be fallacious/wrong if the model’s structure does not correspond to the world’s causal structure. Consequently, interpretation should address the overall status and structure of the model, not merely the model’s estimated coefficients. Whether a model fits the data, and/or how a model came to fit the data, are paramount for interpretation. Data fit obtained by exploring, or by following successive modification indices, does not guarantee the model is wrong but raises serious doubts because these approaches are prone to incorrectly modeling data features. For example, exploring to see how many factors are required preempts finding the data are not factor structured, especially if the factor model has been “persuaded” to fit via inclusion of measurement error covariances. Data’s ability to speak against a postulated model is progressively eroded with each unwarranted inclusion of a “modification index suggested” effect or error covariance. It becomes exceedingly difficult to recover a proper model if the initial/base model contains several misspecifications.<ref name="HC00">{{cite journal |last1=Herting |first1=Jerald R. |last2=Costner |first2=Herbert L. |title=Another perspective on "the proper number of factors" and the appropriate number of steps |journal=Structural Equation Modeling |volume=7 |issue=1 |pages=92–110 |year=2000 }}</ref>
Direct-effect estimates are interpreted in parallel to the interpretation of coefficients in regression equations but with causal commitment. Each unit increase in a causal variable’s value is viewed as producing a change of the estimated magnitude in the dependent variable’s value given control or adjustment for all the other operative/modeled causal mechanisms. Indirect effects are interpreted similarly, with the magnitude of a specific indirect effect equaling the product of the series of direct effects comprising that indirect effect. The units involved are the real scales of observed variables’ values, and the assigned scale values for latent variables. A specified/fixed 1.0 effect of a latent on a specific indicator coordinates that indicator’s scale with the latent variable’s scale. The presumption that the remainder of the model remains constant or unchanging may require discounting indirect effects that might, in the real world, be simultaneously prompted by a real unit increase. And the unit increase itself might be inconsistent with what is possible in the real world because there may be no known way to change the causal variable’s value. If a model adjusts for measurement errors, the adjustment permits interpreting latent-level effects as referring to variations in true scores.<ref name="BMvH03"/>
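As a worked illustration of the product rule, suppose a fitted model contains hypothetical direct effects of A on B and of B on C, plus a direct effect of A on C. The indirect effect of A on C through B is the product of the two constituent direct effects, and the total effect adds the direct path:

<syntaxhighlight lang="python">
# Hypothetical direct-effect estimates taken from a fitted model.
effect_a_on_b = 0.60    # a one-unit increase in A changes B by 0.60 units
effect_b_on_c = 0.50    # a one-unit increase in B changes C by 0.50 units
effect_a_on_c = 0.20    # direct effect of A on C, if the model includes that arrow

indirect_a_on_c = effect_a_on_b * effect_b_on_c   # 0.30, routed through B
total_a_on_c = effect_a_on_c + indirect_a_on_c    # 0.50, direct plus indirect

print(indirect_a_on_c, total_a_on_c)
</syntaxhighlight>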
SE model interpretation should connect specific model causal segments to their variance and covariance implications. A single direct effect reports that the variance in the independent variable produces a specific amount of variation in the dependent variable’s values, but the causal details of precisely what makes this happen remain unspecified because a single effect coefficient does not contain sub-components available for integration into a structured story of how that effect arises. A more fine-grained SE model incorporating variables intervening between the cause and effect would be required to provide features constituting a story about how any one effect functions. Until such a model arrives, each estimated direct effect retains a tinge of the unknown, thereby invoking the essence of a theory. A parallel essential unknownness would accompany each estimated coefficient in even the more fine-grained model, so the sense of fundamental mystery is never fully eradicated from SE models.
Even if each modeled effect is unknown beyond the identity of the variables involved and the estimated magnitude of the effect, the structures linking multiple modeled effects provide opportunities to express how things function to coordinate the observed variables – thereby providing useful interpretation possibilities. For example, a common cause contributes to the covariance or correlation between two affected variables, because if the value of the cause goes up, the values of both effects should also go up (assuming positive effects) even if we do not know the full story underlying each cause.<ref name="Duncan75"/> (A correlation is the covariance between two variables that have both been standardized to have variance 1.0). Another interpretive contribution might be made by expressing how two causal variables can both explain variance in a dependent variable, as well as how covariance between two such causes can increase or decrease explained variance in the dependent variable. That is, interpretation may involve explaining how a pattern of effects and covariances can contribute to decreasing a dependent variable’s variance.<ref name="Hayduk87p20">Hayduk, L. (1987) Structural Equation Modeling with LISREL: Essentials and Advances, page 20. Baltimore, Johns Hopkins University Press. ISBN 0-8018-3478-3</ref> Understanding causal implications implicitly connects to understanding “controlling”, and potentially explaining why some variables, but not others, should be controlled.<ref name="Pearl09"/><ref name="HCSNGDGP-R03">{{cite journal |last1=Hayduk |first1=Leslie |title=Pearl's d-separation: One more step into causal thinking |journal=Structural Equation Modeling |volume=10 |issue=2 |pages=289–311 |year=2003 |display-authors=etal }}</ref>
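The common-cause contribution can be written out directly. If a hypothetical common cause Z affects <math>Y_1</math> with effect <math>b_1</math> and <math>Y_2</math> with effect <math>b_2</math>, and the two residuals are independent, the model implies

:<math>\operatorname{cov}(Y_1, Y_2) = b_1 b_2 \operatorname{var}(Z)</math>

so a nonzero covariance between the two affected variables is expected even though neither variable affects the other.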
The statistical insignificance of an effect estimate indicates the estimate could rather easily arise as a random sampling variation around a null/zero effect, so interpreting the estimate as a real effect becomes equivocal. As in regression, the proportion of each dependent variable’s variance explained by variations in the modeled causes is provided by ''R''<sup>2</sup>, though the Blocked-Error ''R''<sup>2</sup> should be used if the dependent variable is involved in reciprocal or looped effects, or if it has an error variable correlated with any predictor’s error variable.<ref name="Hayduk06">{{cite journal |last1=Hayduk |first1=Leslie A. |title=Blocked-Error-R2: A conceptually improved definition of the proportion of explained variance in models containing loops or correlated residuals |journal=Quality and Quantity |volume=40 |issue=4 |pages=629–649 |year=2006 }}</ref>
The caution appearing in the Model Assessment section warrants repeating. Interpretation should be possible whether a model is or is not consistent with the data. The estimates report how the world would appear to someone believing the model – even if that belief is unfounded because the model happens to be wrong. Interpretation should acknowledge that the model coefficients may or may not correspond to “parameters” – because the model’s coefficients may not have corresponding worldly structural features.
Interpretations become progressively more complex for models containing interactions, nonlinearities, multiple groups, multiple levels, and categorical variables.<ref name="Kline16"/> <!-- For interpretations of coefficients in models containing interactions, see { reference needed }, for multilevel models see { reference needed }, for longitudinal models see, { reference needed }, and for models containing categoric variables see { reference needed }. --> Effects touching causal loops, reciprocal effects, or correlated residuals also require slightly revised interpretations.<ref name="Hayduk87"/><ref name="Hayduk96"/>
Careful interpretation of both failing and fitting models can provide research advancement. To be dependable, the model should investigate academically informative causal structures, fit applicable data with understandable estimates, and not include vacuous coefficients.<ref name="Millsap07">{{cite journal |last1=Millsap |first1=Roger E. |title=Structural equation modeling made difficult |journal=Personality and Individual Differences |volume=42 |issue=5 |pages=875–881 |year=2007 }}</ref>
The multiple ways of conceptualizing PLS models<ref name="RSR17">{{cite journal | doi=10.15358/0344-1369-2017-3-4 | title=On Comparing Results from CB-SEM and PLS-SEM: Five Perspectives and Five Recommendations | date=2017 | last1=Rigdon | first1=Edward E. | last2=Sarstedt | first2=Marko | last3=Ringle | first3=Christian M. | journal=Marketing ZFP | volume=39 | issue=3 | pages=4–16 | doi-access=free }}</ref> complicate interpretation of PLS models. Many of the above comments are applicable if a PLS modeler adopts a realist perspective by striving to ensure their modeled indicators combine in a way that matches some existing but unavailable latent variable. Non-causal PLS models, such as those focusing primarily on ''R''<sup>2</sup> or out-of-sample predictive power, change the interpretation criteria by diminishing concern for whether or not the model’s coefficients have worldly counterparts. The fundamental features differentiating the five PLS modeling perspectives discussed by Rigdon, Sarstedt and Ringle<ref name="RSR17"/> point to differences in PLS modelers’ objectives, and corresponding differences in model features warranting interpretation.
Structural equation modeling is fraught with controversies. Researchers from the factor analytic tradition commonly attempt to reduce sets of multiple indicators to fewer, more manageable, scales or factor-scores for later use in path-structured models. This constitutes a stepwise process with the initial measurement step providing scales or factor-scores which are to be used later in a path-structured model. This stepwise approach seems obvious but actually confronts severe underlying deficiencies. The segmentation into steps interferes with thorough checking of whether the scales or factor-scores validly represent the indicators, and/or validly report on latent level effects. A structural equation model simultaneously incorporating both the measurement and latent-level structures not only checks whether the latent factors appropriately coordinate the indicators, it also checks whether each latent simultaneously and appropriately coordinates its own indicators with the indicators of theorized causes and/or consequences of that latent.<ref name="Hayduk96"/> If a latent is unable to do both these styles of coordination, the validity of that latent is questioned, and a scale or factor-scores purporting to measure that latent are questioned. The disagreements swirled around respect for, or disrespect of, evidence challenging the validity of postulated latent factors. The simmering, sometimes boiling, discussions resulted in a special issue of the journal Structural Equation Modeling focused on a target article by Hayduk and Glaser<ref name="HG00a"/> followed by several comments and a rejoinder,<ref name="HG00b"/> all made freely available, thanks to the efforts of George Marcoulides.
These discussions fueled disagreement over whether or not structural equation models should be tested for consistency with the data, and model testing became the next focus of discussions. Scholars having path-modeling histories tended to defend careful model testing while those with factor-histories tended to defend fit-indexing rather than fit-testing. These discussions led to a target article in Personality and Individual Differences by Paul Barrett<ref name="Barrett07"/> who said: “In fact, I would now recommend banning ALL such indices from ever appearing in any paper as indicative of model “acceptability” or “degree of misfit”.” <ref name="Barrett07"/>(page 821). Barrett’s article was also accompanied by commentary from both perspectives.<ref name="Millsap07"/><ref name="HCBP-RB07">{{cite journal |last1=Hayduk |first1=Leslie A. |last2=Cummings |first2=Greta |last3=Boadu |first3=Kwame |last4=Pazderka-Robinson |first4=Hannah |last5=Boulianne |first5=Shelley |title=Testing! testing! one, two, three – Testing the theory in structural equation models! |journal=Personality and Individual Differences |volume=42 |issue=5 |pages=841–850 |year=2007 }}</ref>
The controversy over model testing declined as clear reporting of significant model-data inconsistency became mandatory. Scientists do not get to ignore, or fail to report, evidence just because they do not like what the evidence reports.<ref name="Hayduk14b"/> The requirement of attending to evidence pointing toward model mis-specification underpins more recent concern for addressing “endogeneity” – a style of model mis-specification that interferes with estimation due to lack of independence of error/residual variables. In general, the controversy over the causal nature of structural equation models, including factor-models, has also been declining. Stan Mulaik, a factor-analysis stalwart, has acknowledged the causal basis of factor models.<ref name="Mulaik09">Mulaik, S.A. (2009) Foundations of Factor Analysis (second edition). Chapman and Hall/CRC. Boca Raton, pages 130-131.</ref> The comments by Bollen and Pearl regarding myths about causality in the context of SEM<ref name="BP13" /> reinforced the centrality of causal thinking in the context of SEM.
A briefer controversy focused on competing models. Comparing competing models can be very helpful but there are fundamental issues that cannot be resolved by creating two models and retaining the better fitting model. The statistical sophistication of presentations like Levy and Hancock (2007),<ref name="LH07">{{cite journal |last1=Levy |first1=Roy |last2=Hancock |first2=Gregory R. |title=A framework of statistical tests for comparing mean and covariance structure models |journal=Multivariate Behavioral Research |volume=42 |issue=1 |pages=33–66 |year=2007 }}</ref> for example, makes it easy to overlook that both of the compared models may be misspecified, so retaining the better-fitting of two models does not guarantee a structurally sound model.
An additional controversy that touched the fringes of the previous controversies awaits ignition.{{citation needed|date=March 2024}} Factor models and theory-embedded factor structures having multiple indicators tend to fail, and dropping weak indicators tends to reduce the model-data inconsistency. Reducing the number of indicators leads to concern for, and controversy over, the minimum number of indicators required to support a latent variable in a structural equation model. Researchers tied to factor tradition can be persuaded to reduce the number of indicators to three per latent variable, but three or even two indicators may still be inconsistent with a proposed underlying factor common cause. Hayduk and Littvay (2012)<ref name="HL12"/> discussed how to think about, defend, and adjust for measurement error, when using only a single indicator for each modeled latent variable. Single indicators have been used effectively in SE models for a long time,<ref name="EHR82"/> but controversy remains only as far away as a reviewer who has considered measurement from only the factor analytic perspective.
* Copulas {{citation needed|date=March 2024}}
* Deep Path Modelling <ref name="Ing2024"/>
* Exploratory Structural Equation Modeling
* Fusion validity models<ref name="HEH19">{{doi|10.3389/fpsyg.2019.01139|doi-access=free}}{{dead link|date=June 2025}}</ref>
* [[Item response theory]] models {{citation needed|date=July 2023}}
* [[Latent class models]] {{citation needed|date=July 2023}}
* [[Latent growth modeling]] {{citation needed|date=July 2023}}
* Link functions {{citation needed|date=July 2023}}
* Longitudinal models
* [[Measurement invariance]] models
* [[Mixture model]] {{citation needed|date=July 2023}}
* [[Multilevel models]], hierarchical models (e.g. people nested in groups) <ref>{{Citation |last1=Sadikaj |first1=Gentiana |title=Multilevel structural equation modeling for intensive longitudinal data: A practical guide for personality researchers |date=2021 |url=https://linkinghub.elsevier.com/retrieve/pii/B9780128139950000339 |work=The Handbook of Personality Dynamics and Processes |pages=855–885 |access-date=2023-11-03 |publisher=Elsevier |language=en |doi=10.1016/b978-0-12-813995-0.00033-9 |isbn=978-0-12-813995-0 |last2=Wright |first2=Aidan G.C. |last3=Dunkley |first3=David M. |last4=Zuroff |first4=David C. |last5=Moskowitz |first5=D.S.|url-access=subscription }}</ref>
* Multiple group modelling with or without constraints between groups (genders, cultures, test forms, languages, etc.) {{citation needed|date=July 2023}}
* Multi-method multi-trait models {{citation needed|date=July 2023}}
* Random intercepts models {{citation needed|date=July 2023}}
* Structural Equation Model Trees {{citation needed|date=July 2023}}
* Structural Equation [[Multidimensional scaling]]
== Software ==
Structural equation modeling programs differ widely in their capabilities and user requirements.
{| class="wikitable sortable"
|-
! Name !! License !! Platform !! Standalone or add-on !! Link !! Covariance-Based !! Variance-Based
|-
| [[Mplus]] || Commercial || Windows, Mac, Linux || Standalone || [https://www.statmodel.com/ statmodel.com]
|✓
|
|-
| AMOS || Commercial || Windows || Standalone || [https://www.ibm.com/software/products/de/spss-amos ibm.com]
|✓
|
|-
| [[lavaan]] || Open Source || Windows, Mac, Linux || Add-on for [[R (programming language)|R]]|| [https://lavaan.org/ lavaan.org]
|✓
|
|-
|lavaangui
|Open Source
|Windows, Mac, Linux
|Add-on for [[R (programming language)|R]] and Standalone
|[https://lavaangui.org lavaangui.org]
|✓ (uses lavaan)
|
|-
| [[LISREL]] || Commercial || Windows || Standalone || [https://www.ssicentral.com/lisrel/ ssicentral.com]
|✓
|
|-
| EQS || Commercial || Windows, Mac, Linux || Standalone || [http://www.mvsoft.com/eqs60.htm mvsoft.com]
|✓
|
|-
| [[Stata]] || Commercial || Windows, Mac, Linux || Standalone || [https://www.stata.com/stata12/structural-equation-modeling/ stata.com]
|✓
|
|-
| [[SAS Institute|SAS]] || Commercial || Windows, Mac, Linux || Standalone || [https://www.sas.com/de_at/home.html sas.com]
|✓
|
|-
|semopy
|Open Source
|Windows, Mac, Linux
|Add-on for [[Python (programming language)|Python]]
|[https://semopy.com/index.html semopy.com]
|✓
|
|-
|sem
|Open Source
|Windows, Mac, Linux
|Add-on for [[R (programming language)|R]]
|[https://cran.r-project.org/web/packages/sem/index.html cran.r-project.org]
|✓
|
|-
| [[OpenMX]] || Open Source || Windows, Mac, Linux || Add-on for [[R (programming language)|R]]|| [https://openmx.ssri.psu.edu/ openmx.ssri.psu.edu]
|✓
|
|-
| [[Ωnyx]] || Open Source || Windows, Mac, Linux || Standalone || [http://onyx.brandmaier.de/ onyx.brandmaier.de]
|✓
|
|-
| [[SmartPLS 4]] || Commercial || Windows, Mac || Standalone || [https://www.smartpls.com/ smartpls.com]
|✓
|✓
|-
| [[PLSGraph]] || Commercial || Windows || Standalone || [https://www.plsgraph.com/ plsgraph.com]
|
|✓
|-
| [[WarpPLS]] || Commercial || Windows || Standalone || [http://warppls.com/ warppls.com]
|
|✓
|-
| [[ADANCO]] || Commercial || Windows, Mac || Standalone || [http://www.composite-modeling.com/ composite-modeling.com]
|
|✓
|-
| [[LVPLS]] || Freeware || MS-DOS || Standalone || [http://www2.kuas.edu.tw/prof/fred/vpls/aboutPLSPC.htm www2.kuas.edu.tw]
|
|✓
|-
| [[matrixpls]] || Open Source || Windows, Mac, Linux || Add-on for [[R (programming language)|R]]|| [https://cran.r-project.org/web/packages/matrixpls/ cran.r-project.org]
|
|✓
|-
|[[SEMinR]]
|Open Source
|Windows, Mac, Linux
|Add-on for [[R (programming language)|R]]
|[https://github.com/sem-in-r/seminr github.com/sem-in-r/seminr]
|✓ (uses lavaan)
|✓
|}
== See also ==
{{Reflist|30em|refs=
<ref name="Barrett07">
<ref name="BC92">{{cite journal |last1=Browne
<ref name="S90">{{cite journal |last1=Steiger
<ref name="SL80">Steiger, J. H.; and Lind, J. (1980) "Statistically Based Tests for the Number of Common Factors." Paper presented at the annual meeting of the Psychometric Society, Iowa City.</ref>
<ref name="MacCallum1986">{{cite journal |doi=10.1037/0033-2909.100.1.107 |title=Specification searches in covariance structure modeling |journal=Psychological Bulletin |volume=100 |pages=107–120 |year=1986 |last1=MacCallum |first1=Robert }}</ref>
<!--
<ref name="MacCallum1996">{{cite journal |doi=10.1037/1082-989X.1.2.130 |title=Power analysis and determination of sample size for covariance structure modeling |journal=Psychological Methods |volume=1 |issue=2 |pages=130–49 |year=1996 |last1=MacCallum |first1=Robert C |last2=Browne |first2=Michael W |last3=Sugawara |first3=Hazuki M }}</ref>
<ref name="Bentler2016">{{cite journal |doi=10.1177/0049124187016001004 |title=Practical Issues in Structural Modeling |journal=Sociological Methods & Research |volume=16 |issue=1 |pages=78–117 |year=2016 |last1=Bentler |first1=P. M |last2=Chou |first2=Chih-Ping
<ref name="Browne1993">{{cite book|last1=Browne|first1=M. W.|last2=Cudeck|first2=R.|editor1-last=Bollen|editor1-first=K. A.|editor2-last=Long|editor2-first=J. S.|title=Testing structural equation models|date=1993|publisher=Sage|___location=Newbury Park, CA|chapter=Alternative ways of assessing model fit}}</ref>
<ref name="Loehlin2004">
<ref name="Chou1995">{{cite book|last1=Chou|first1=C. P.|last2=Bentler|first2=Peter|editor1-last=Hoyle|editor1-first=Rick|editor1-link=H|title=Structural equation modeling: Concepts, issues, and applications|date=1995|publisher=Sage|___location=Thousand Oaks, CA|pages=37–55|chapter=Estimates and tests in structural equation modeling}}</ref>
<!--<ref name="Boslaugh2008">{{cite book |doi=10.4135/9781412953948.n443 |chapter=Structural Equation Modeling |title=Encyclopedia of Epidemiology |year=2008 |isbn=978-1-4129-2816-8 |last1=Boslaugh |first1=Sarah |last2=McNutt |first2=Louise-Anne |hdl=2022/21973 }}</ref> -->
<ref name="Ing2024">{{cite
}}
*{{cite book|last=Kaplan |first=D. |year=2008 |title=Structural Equation Modeling: Foundations and Extensions |publisher=SAGE |edition=2nd |isbn=978-1412916240 }}
*{{cite book|last1=Kline|first1=Rex|title=Principles and Practice of Structural Equation Modeling| publisher=Guilford| isbn=978-1-60623-876-9|date=2011 |edition=Third}}
* {{cite journal |last1=MacCallum |first1=Robert C. |last2=Austin |first2=James T. |title=Applications of Structural Equation Modeling in Psychological Research |journal=Annual Review of Psychology |volume=51 |pages=201–226 |year=2000 }}
*{{cite journal|last1=Quintana|first1=Stephen M.|last2=Maxwell|first2=Scott E.|date=1999|title=Implications of Recent Developments in Structural Equation Modeling for Counseling Psychology|journal=The Counseling Psychologist|volume=27|issue=4|pages=485–527|doi=10.1177/0011000099274002}}
== Further reading ==
*{{cite journal |doi=10.1007/s11747-011-0278-x |title=Specification, evaluation, and interpretation of structural equation models |journal=Journal of the Academy of Marketing Science |volume=40 |issue=1 |pages=8–34 |year=2011 |last1=Bagozzi |first1=Richard P |last2=Yi |first2=Youjae }}
* Bartholomew, D. J., and Knott, M. (1999) ''Latent Variable Models and Factor Analysis'' Kendall's Library of Statistics, vol. 7, [[Edward Arnold (publisher)|Edward Arnold Publishers]], {{ISBN|0-340-69243-X}}
* [[Kenneth A. Bollen|Bollen, K. A.]] (1989). ''[[Structural Equations with Latent Variables]]''. Wiley, {{ISBN|0-471-01171-1}}
* Byrne, B. M. (2001) ''Structural Equation Modeling with AMOS - Basic Concepts, Applications, and Programming''. LEA, {{ISBN|0-8058-4104-0}}
* {{cite journal |last1=Goldberger |first1=Arthur S. |title=Structural equation methods in the social sciences |journal=Econometrica |volume=40 |issue=6 |pages=979–1001 |year=1972 }}
*{{cite journal |first1=Trygve |last1=Haavelmo |title=The Statistical Implications of a System of Simultaneous Equations |journal=Econometrica |volume=11 |issue=1 |date=January 1943 |pages=1–12 |doi=10.2307/1905714 |jstor=1905714 }}
* {{cite book |last1=Hoyle
*{{cite book|author-link1=Karl Jöreskog|last1=Jöreskog|first1=Karl G.|first2=Fan|last2=Yang|year=1996|chapter=Non-linear structural equation models: The Kenny-Judd model with interaction effects|editor1-first=George A.|editor1-last=Marcoulides|editor2-first=Randall E.|editor2-last=Schumacker|title=Advanced structural equation modeling: Concepts, issues, and applications|___location=Thousand Oaks, CA|publisher=Sage Publications|chapter-url={{Google books|VcHeAQAAQBAJ|page=57|plainurl=yes}}|pages=57–88|isbn=978-1-317-84380-1}}
*{{cite book|ref=none|doi=10.4135/9781412950589.n979|chapter=Structural Equation Modeling|title=The SAGE Encyclopedia of Social Science Research Methods|year=2004|isbn=978-0-7619-2363-3|last1=Lewis-Beck|first1=Michael|last2=Bryman|first2=Alan E.|last3=Bryman|first3=Emeritus Professor Alan|last4=Liao|first4=Tim Futing|hdl=2022/21973}}
* [http://archive.wikiwix.com/cache/20110707224407/http://www2.chass.ncsu.edu/garson/pa765/structur.htm Structural equation modeling page under David Garson's StatNotes, NCSU]
* [http://disc-nt.cba.uh.edu/chin/ais/ Issues and Opinion on Structural Equation Modeling], SEM in IS Research
* [http://archive.wikiwix.com/cache/20110707224414/http://www.upa.pdx.edu/IOA/newsom/semrefs.htm Structural Equation Modeling Reference List by Jason Newsom]: journal articles and book chapters on structural equation models
* [[Wikibooks:Handbook of Management Scales|Handbook of Management Scales]], a collection of previously used multi-item scales to measure constructs for SEM