Factor analysis (FA) is a statistical technique applies to a set of variables to
discover which variables in the set that is relatively independent of one
another. Principal components analysis (PCA) is extremely similar to factor analysis,
and is often used as preliminary stage to factor analysis. Exploratory factor
analysis (EFA) is used to identify the hypothetical constructs in a set of
data, while confirmatory factor analysis (CFA) is used to confirm the existence
of the hypothetical constructs in a fresh set of data. In this blog, I will focus
on exploratory factor analysis (EFA).
EFA is used to analyze the structure of the correlations among a large number of variables by defining sets of variables that are highly interrelated, known as factors (Hair et al., 2010). Thus, EFA is used to reduce a set of items and identify the internal dimensions of the scale.
To perform FA, the data should meet certain requirements: 1) data has been measured on an interval scale, 2) samples vary in their scores on the variables, 3) the scores on the variables have linear correlations with each other, 4) the scores on the variables are normally distributed, 5) absence of outliers among cases, 6) absence of multicollinearity, and 7) factorablity for EFA.
- Communality: Communality represents the amount of variance accounted for by the factor solution for each variance. Communality greater than .50 is desirable.
- Bartlett's test of Sphericity: Bartlett's test of Sphericity is the method examining the entire correlation matrix. A statistically significant Bartlett's test of Sphericity (sig. < .05) indicates that sufficient correlations exist among the variables to proceed.
Conclusion: Factor analysis is used to describe things and to attach the conceptual ideas to its statistical results. Interpretation and naming of factors depend on the meaning of the combination of variables accounted for factors. A good factor makes sense. The variables in each factor should show theoretical sense and parsimonious accounting for the factors.
EFA is used to analyze the structure of the correlations among a large number of variables by defining sets of variables that are highly interrelated, known as factors (Hair et al., 2010). Thus, EFA is used to reduce a set of items and identify the internal dimensions of the scale.
There is no certain rule to say
how much samples must have provided data for factor analysis. However,
correlation coefficients tend to be less reliable when estimated from small
samples. As a general rule of thumb, it should have at least 300 samples for
factor analysis. In instrumentation, it should have a ratio of at least 5-10 subjects
per item.
To perform FA, the data should meet certain requirements: 1) data has been measured on an interval scale, 2) samples vary in their scores on the variables, 3) the scores on the variables have linear correlations with each other, 4) the scores on the variables are normally distributed, 5) absence of outliers among cases, 6) absence of multicollinearity, and 7) factorablity for EFA.
The factorability indices for EFA
including the alpha correlation of each item, factor loading, communalities,
Bartlett's test of Sphericity, the test of KMO, and MSA will be performed
before conducting EFA.
- Alpha correlation: The
interpretation of the factorability indices for EFA includes the correlation
coefficient (r) in which all pairs of items should range from .30 to .70. Conversely, if the items correlate very
highly (.90 or more) then they are redundant, they should be dropped from the
analysis.
- Factor loading: A factor
loading is the correlation of the variable and the factor, the larger the size
of factor loading, the more important the loading in interpreting the factor
matrix. Factor loading greater than .30 is desirable. However, if a variable persists in having
cross-loadings, it becomes a candidate for deletion.
- Communality: Communality represents the amount of variance accounted for by the factor solution for each variance. Communality greater than .50 is desirable.
- Bartlett's test of Sphericity: Bartlett's test of Sphericity is the method examining the entire correlation matrix. A statistically significant Bartlett's test of Sphericity (sig. < .05) indicates that sufficient correlations exist among the variables to proceed.
- The Kaiser-Meyer-Olkin
measure (KMO): The KMO is based on the principle
that if variables share common factors, then partial correlations between pairs
of variables should be small when effects of other variables are
controlled. The KMO measurement of
sampling adequacy for factor analysis at least of .60 is desirable.
- Measure of Sampling Adequacy
(MSA): The MSA is examining the degree of intercorrelations among the variables
both the entire correlation matrix and each individual variable. An overall MSA
value of above .50 before proceeding with the factor analysis is desirable.
After
the appropriateness of performing a FA has been analyzed, factor extraction
using PCA method will be performed. This
procedure condenses items into their underlying constructs which explain the
pattern of correlations. Examine the
results of the PCA to decide how many factors which are worth keeping by
considering Eigenvalues and Scree test (Eigenvalues greater than
1, the point at which there is an 'elbow' on the Scree test).
Carry
out FA by using the number of factors determined from PCA. 'Rotation' is needed
when extraction techniques produce two or more factors. Carry out FA, with an orthogonal rotation (Varimax, Quartimax, and Equamax), to see
how clear the outcome is. Then, carry out FA again, with an oblique rotation (Oblim and Promax),
to produce a clearer outcome. In orthogonal rotation, the factors remain
uncorrelated with each other whereas in oblique rotation they are allowed to
correlate. Several runs of the analysis will be executed to explore an
appropriate factor solution.
The
criteria for the number of factors to extract consists of: 1) Eigenvalues greater
than 1, 2) a Scree test result, and 3) the value of factor loading for each
item that is .30 or greater.
Conclusion: Factor analysis is used to describe things and to attach the conceptual ideas to its statistical results. Interpretation and naming of factors depend on the meaning of the combination of variables accounted for factors. A good factor makes sense. The variables in each factor should show theoretical sense and parsimonious accounting for the factors.