Structural equation modeling: Difference between revisions

Software: clarify that this is a table
MOS:1STOCC
Line 9:
SEM involves a model representing how various aspects of some [[phenomenon]] are thought to [[Causality|causally]] connect to one another. Structural equation models often contain postulated causal connections among some latent variables (variables thought to exist but which cannot be directly observed). Additional causal connections link those latent variables to observed variables whose values appear in a data set. The causal connections are represented using ''[[equation]]s'', but the postulated structuring can also be presented using diagrams containing arrows, as in Figures 1 and 2. The causal structures imply that specific patterns should appear among the values of the observed variables. This makes it possible to use the connections between the observed variables' values to estimate the magnitudes of the postulated effects, and to test whether or not the observed data are consistent with the requirements of the hypothesized causal structures.<ref name="Pearl09">Pearl, J. (2009). Causality: Models, Reasoning, and Inference. Second edition. New York: Cambridge University Press.</ref>
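The idea that a causal structure implies specific patterns among observed values can be sketched with a small calculation. The following is a minimal illustration, assuming a hypothetical model in which one latent variable causes four observed indicators with independent measurement errors; the names and all numbers are invented for the example and do not come from any data set.

```python
# Hypothetical one-factor model with four indicators; all numbers are
# illustrative assumptions, not estimates from any data set.
loadings = [1.0, 0.8, 0.6, 0.5]          # effect of the latent variable on each indicator
latent_variance = 2.0                    # variance of the latent variable
error_variances = [0.5, 0.4, 0.3, 0.2]   # independent measurement-error variances


def implied_covariance(lam, phi, theta):
    """Covariance matrix the model implies for the observed indicators."""
    size = len(lam)
    return [
        [lam[i] * phi * lam[j] + (theta[i] if i == j else 0.0) for j in range(size)]
        for i in range(size)
    ]


sigma = implied_covariance(loadings, latent_variance, error_variances)

# Implied pattern (a "vanishing tetrad"): with a single common cause,
# cov(1,2) * cov(3,4) must equal cov(1,3) * cov(2,4).
tetrad_difference = sigma[0][1] * sigma[2][3] - sigma[0][2] * sigma[1][3]
print(tetrad_difference)
```

Under these assumptions the tetrad difference is zero up to rounding whatever loadings are chosen. An observed covariance matrix that departs from this pattern by more than sampling variation would therefore count as evidence against the one-factor structure.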
 
The boundary between what is and is not a structural equation model is not always clear, but SE models often contain postulated causal connections among a set of latent variables (variables thought to exist but which cannot be directly observed, like an attitude, intelligence or mental illness) and causal connections linking the postulated latent variables to variables that can be observed and whose values are available in some data set. Variations among the styles of latent causal connections, variations among the observed variables measuring the latent variables, and variations in the statistical estimation strategies result in the SEM toolkit including [[confirmatory factor analysis]] (CFA), [[confirmatory composite analysis]], [[Path analysis (statistics)|path analysis]], multi-group modeling, longitudinal modeling, [[partial least squares path modeling]], [[latent growth modeling]] and hierarchical or multilevel modeling.<ref name="kline_2016">{{Cite book|last=Kline|first=Rex B. |title=Principles and practice of structural equation modeling|date=2016 |isbn=978-1-4625-2334-4|edition=4th |location=New York|oclc=934184322}}</ref><ref name="Hayduk87">Hayduk, L. (1987) Structural Equation Modeling with LISREL: Essentials and Advances. Baltimore, Johns Hopkins University Press. ISBN 0-8018-3478-3</ref><ref>{{Cite book |last=Bollen |first=Kenneth A. |title=Structural equations with latent variables |date=1989 |publisher=Wiley |isbn=0-471-01171-1 |location=New York |oclc=18834634}}</ref><ref>{{Cite book |last=Kaplan |first=David |title=Structural equation modeling: foundations and extensions |date=2009 |publisher=SAGE |isbn=978-1-4129-1624-0 |edition=2nd |location=Los Angeles |oclc=225852466}}</ref><ref>{{Cite journal|last=Curran|first=Patrick J.|date=2003-10-01|title=Have Multilevel Models Been Structural Equation Models All Along?|journal=Multivariate Behavioral Research|volume=38|issue=4|pages=529–569|doi=10.1207/s15327906mbr3804_5|issn=0027-3171|pmid=26777445|s2cid=7384127}}</ref>
 
SEM researchers use computer programs to estimate the strength and sign of the coefficients corresponding to the modeled structural connections, such as the numbers attached to the arrows in Figure 1. Because a postulated model such as Figure 1 may not correspond to the worldly forces controlling the observed data measurements, the programs also provide model tests and diagnostic clues suggesting which indicators, or which model components, might introduce inconsistency between the model and observed data. Criticisms of SEM methods include disregard of available model tests, problems in the model's specification, a tendency to accept models without considering external validity, and potential philosophical biases.<ref>{{cite journal |last1=Tarka |first1=Piotr |year=2017 |title=An overview of structural equation modeling: Its beginnings, historical development, usefulness and controversies in the social sciences |journal=Quality & Quantity |volume=52 |issue=1 |pages=313–54 |doi=10.1007/s11135-017-0469-8 |pmc=5794813 |pmid=29416184}}</ref>
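How estimation and diagnostics fit together can be illustrated with a deliberately small case. The sketch below assumes a hypothetical three-variable chain model (X affects M, M affects Y, and no direct effect of X on Y) and invented covariances; it uses simple covariance ratios in place of the maximum-likelihood or other estimators that SEM programs actually apply, so it shows the logic only.

```python
# Hypothetical covariances among three observed variables X, M, Y
# (illustrative numbers only, not taken from any study).
var_x, var_m = 4.0, 5.0
cov_xm, cov_my, cov_xy = 2.0, 2.5, 1.8

# Postulated model: X -> M -> Y, with no direct X -> Y effect.
effect_x_on_m = cov_xm / var_x   # estimated slope of M on X
effect_m_on_y = cov_my / var_m   # estimated slope of Y on M

# The chain implies cov(X, Y) = (X -> M effect) * (M -> Y effect) * var(X).
implied_cov_xy = effect_x_on_m * effect_m_on_y * var_x

# Diagnostic clue: the part of the observed covariance the model fails to reproduce.
residual_xy = cov_xy - implied_cov_xy
print(effect_x_on_m, effect_m_on_y, residual_xy)
```

With these assumed numbers the model reproduces only part of the observed covariance between X and Y. The leftover residual is the kind of clue that points to where a model and the data disagree, although it does not by itself say which causal revision is correct.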
Line 88:
Coefficient estimates in data-inconsistent ("failing") models are interpretable as reports of how the world would appear to someone believing a model that conflicts with the available data. The estimates in data-inconsistent models do not necessarily become "obviously wrong" by becoming statistically strange or wrongly signed according to theory. The estimates may even closely match a theory's requirements, but the remaining data inconsistency means that this match provides no support for the theory. Failing models remain interpretable, but only as interpretations that conflict with available evidence.
 
Replication is unlikely to detect misspecified models that inappropriately fit the data. If the replicate data are within random variations of the original data, the same incorrect coefficient placements that provided an inappropriate fit to the original data will likely also inappropriately fit the replicate data. Replication helps detect issues such as data mistakes (made by different research groups), but is especially weak at detecting misspecifications after exploratory model modification – as when confirmatory factor analysis (CFA) is applied to a random second-half of data following exploratory factor analysis (EFA) of first-half data.
 
A modification index is an estimate of how much a model's fit to the data would "improve" (but not necessarily how much the model's structure would improve) if a specific currently-fixed model coefficient were freed for estimation. Researchers confronting data-inconsistent models can easily free coefficients the modification indices report as likely to produce substantial improvements in fit. This simultaneously introduces a substantial risk of moving from a causally-wrong-and-failing model to a causally-wrong-but-fitting model because improved data-fit does not provide assurance that the freed coefficients are substantively reasonable or world-matching. The original model may contain causal misspecifications such as incorrectly directed effects, or incorrect assumptions about unavailable variables, and such problems cannot be corrected by adding coefficients to the current model. Consequently, such models remain misspecified despite the closer fit provided by additional coefficients. Fitting yet worldly-inconsistent models are especially likely to arise if a researcher committed to a particular model (for example a factor model having a desired number of factors) gets an initially-failing model to fit by inserting measurement error covariances "suggested" by modification indices. MacCallum (1986) demonstrated that "even under favorable conditions, models arising from specification searches must be viewed with caution."<ref name="MacCallum1986" /> Model misspecification may sometimes be corrected by insertion of coefficients suggested by the modification indices, but many more corrective possibilities are raised by employing a few indicators of similar-yet-importantly-different latent variables.<ref name="HL12">{{cite journal | doi=10.1186/1471-2288-12-159 | doi-access=free | title=Should researchers use single indicators, best indicators, or multiple indicators in structural equation models? | date=2012 | last1=Hayduk | first1=Leslie A. | last2=Littvay | first2=Levente | journal=BMC Medical Research Methodology | volume=12 | page=159 | pmid=23088287 | pmc=3506474 }}</ref>
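The gap between improved fit and improved structure can be made concrete with a small sketch. It does not compute a modification index; it only shows, for a hypothetical three-variable model (X affects M, M affects Y) with invented covariances, what happens to the residuals once a previously fixed direct effect of X on Y is freed.

```python
# Hypothetical covariances among X, M, Y (illustrative numbers only).
var_x, var_m = 4.0, 5.0
cov_xm, cov_my, cov_xy = 2.0, 2.5, 1.8

# Free the direct effect: Y = direct * X + via_m * M + error.
# Solve the two normal equations for the two coefficients (Cramer's rule).
det = var_x * var_m - cov_xm ** 2
direct = (cov_xy * var_m - cov_xm * cov_my) / det
via_m = (var_x * cov_my - cov_xm * cov_xy) / det

# Covariances implied by the modified model.
implied_cov_xy = direct * var_x + via_m * cov_xm
implied_cov_my = direct * cov_xm + via_m * var_m

# Both residuals vanish, because the modified model now has as many free
# coefficients as there are covariances to reproduce.
residual_xy = cov_xy - implied_cov_xy
residual_my = cov_my - implied_cov_my
print(direct, via_m, residual_xy, residual_my)
```

The residuals are zero for any covariances supplied, so the perfect fit carries no information about whether a direct effect of X on Y exists in the world, or whether the effects run in the postulated directions at all.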