Exploratory Factor Analysis (EFA) is used to uncover the underlying structure of a relatively large set of variables. It is commonly used by researchers when developing a scale.
The researcher's assumption when conducting EFA is that any indicator/measured variable may be associated with any factor. When developing a scale, researchers should use EFA first before moving on to [[Confirmatory Factor Analysis]] (CFA). EFA requires the researcher to make a number of important decisions about how to conduct the analysis because there is no one set method.
==Fitting Procedures==
When selecting how many factors to include in a model, researchers must try to balance [[parsimony]] (a model with relatively few factors) and plausibility (that there are enough factors to adequately account for correlations among measured variables). It is better to include too many factors (overfactoring) than too few factors (underfactoring).
There are a number of procedures for determining the best number of factors, including the scree plot, parallel analysis, the Kaiser criterion, and model comparison:
===Scree plot===
Compute the eigenvalues for the correlation matrix and plot the values from largest to smallest. Examine the graph to determine the last substantial drop in the magnitude of the eigenvalues. The number of plotted points before the last drop is the number of factors to include in the model. This method has been criticized for its subjective nature (i.e., there is no clear objective definition of what constitutes a substantial drop).
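The eigenvalues examined in a scree plot come directly from the correlation matrix of the measured variables. A minimal sketch in Python with NumPy, using simulated data (the sample size and number of variables are purely illustrative):

```python
import numpy as np

# Illustrative data: 300 observations of 6 measured variables.
rng = np.random.default_rng(0)
data = rng.normal(size=(300, 6))

# Eigenvalues of the correlation matrix, sorted largest to smallest;
# plotting these against their rank gives the scree plot.
corr = np.corrcoef(data, rowvar=False)
eigenvalues = np.sort(np.linalg.eigvalsh(corr))[::-1]

print(eigenvalues)  # inspect the descending values for the last substantial drop
```

Because the diagonal of a correlation matrix is all ones, the eigenvalues always sum to the number of variables, which is a useful sanity check on the computation.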
===Parallel analysis===
Generate random data sets with the same number of observations and variables as the observed data, and compute the eigenvalues of their correlation matrices. Retain only those factors whose observed eigenvalues exceed the corresponding eigenvalues from the random data. This guards against retaining factors that merely reflect sampling error.

===Kaiser criterion===
Compute the eigenvalues for the correlation matrix and determine how many of these eigenvalues are greater than 1. This number is the number of factors to include in the model. A disadvantage of this procedure is that it is quite arbitrary (e.g., an eigenvalue of 1.01 is included whereas an eigenvalue of 0.99 is not). This procedure often leads to overfactoring and sometimes to underfactoring. Therefore, this procedure should not be used.
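The eigenvalue-greater-than-1 rule is simple to apply, which helps explain its popularity despite the drawbacks noted above. A minimal sketch, again using illustrative simulated data:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(300, 6))  # illustrative data set

corr = np.corrcoef(data, rowvar=False)
eigenvalues = np.linalg.eigvalsh(corr)

# Kaiser criterion: retain one factor per eigenvalue greater than 1.
# Note the arbitrariness criticized in the text: 1.01 counts, 0.99 does not.
n_factors = int(np.sum(eigenvalues > 1))
print(n_factors)
```

With pure noise data like this, roughly half the eigenvalues fall above 1 by chance, which illustrates why the rule tends toward overfactoring.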
===Model Comparison===
Choose the best model from a series of models that differ in complexity. Researchers use goodness-of-fit measures to fit models, beginning with a model with zero factors and gradually increasing the number of factors. The goal is to ultimately choose a model that explains the data significantly better than simpler models (with fewer factors) and explains the data as well as more complex models (with more factors).

There are different methods to assess model fit:
*'''Likelihood ratio statistic:''' Used to test the null hypothesis that a model has perfect fit. It is applied to models with an increasing number of factors until the result is nonsignificant, indicating that the model is not rejected as fitting the population well. This statistic should be used with a large sample size and normally distributed data. The likelihood ratio test has some drawbacks. First, when the sample size is large, even small discrepancies between the model and the data result in model rejection; when the sample size is small, even large discrepancies between the model and the data may not be significant, which leads to underfactoring. Another disadvantage is that the null hypothesis of perfect fit is an unrealistic standard.
*'''Root Mean Square Error of Approximation (RMSEA) fit index:''' RMSEA is an estimate of the discrepancy between the model and the data per degree of freedom for the model. Values less than 0.05 constitute good fit, values between 0.05 and 0.08 constitute acceptable fit, values between 0.08 and 0.10 constitute marginal fit, and values greater than 0.10 indicate poor fit. An advantage of the RMSEA fit index is that it provides confidence intervals, which allow researchers to compare a series of models with varying numbers of factors.
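A common way to compute RMSEA is from a model's chi-square statistic, its degrees of freedom, and the sample size; the formula below uses the widely used n − 1 denominator, though conventions vary slightly across software. The chi-square value, degrees of freedom, and sample size in the example are hypothetical:

```python
import math

def rmsea(chi_square, df, n):
    """RMSEA from a chi-square statistic, its degrees of freedom,
    and the sample size; negative discrepancies are clamped to zero."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))

def fit_label(value):
    # Cut-offs from the text: <0.05 good, 0.05-0.08 acceptable,
    # 0.08-0.10 marginal, >0.10 poor.
    if value < 0.05:
        return "good"
    if value <= 0.08:
        return "acceptable"
    if value <= 0.10:
        return "marginal"
    return "poor"

# Hypothetical example: chi-square = 52.3 with 24 df from a sample of 300.
value = rmsea(52.3, 24, 300)
print(round(value, 3), fit_label(value))
```

The clamping to zero reflects that a chi-square below its degrees of freedom indicates better-than-expected fit, for which RMSEA is reported as 0.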
==Factor Rotation==