
Factor Analysis Using SAS PROC FACTOR

**This page was developed by the Consulting group of the Division of Statistics and Scientific Computing at the University of Texas at Austin. We thank them for permission to distribute it via our web site.**

**26 June 1995
Usage Note: Stat-53
Copyright 1995-1997, ACITS, The University of Texas at Austin
Statistical Services, 475-9372
Originally available online at: http://ssc.utexas.edu/docs/stat53.html**

This usage note describes how to run a factor analysis, specifically an exploratory common factor analysis, using the SAS FACTOR procedure. This document is composed of three sections: Introduction, Outline of Use, and An Illustrative Example. The Introduction section explains what factor analysis is and when one should use it. The next section is a detailed outline for conducting a factor analysis. Finally, the last section illustrates the use of common factor analysis with actual data.

Factor analysis is a generic term for a family of statistical techniques concerned with reducing a set of observable variables to a small number of latent factors. It has been developed primarily for analyzing relationships among a number of measurable entities (such as survey items or test scores). The underlying assumption of factor analysis is that there exist a number of unobserved latent variables (or "factors") that account for the correlations among observed variables, such that if the latent variables are partialled out or held constant, the partial correlations among observed variables all become zero. In other words, the latent factors determine the values of the observed variables.

Each observed variable (y) can be expressed as a weighted composite of a set of latent variables (f's) such that

y_i = a_{i1} f_1 + a_{i2} f_2 + ... + a_{ik} f_k + e_i

where y_i is the *i*th observed variable, the a_{ij}'s are the weights (loadings) of y_i on the k factors, and e_i is the residual of y_i on the factors. Given the assumption that the residuals are uncorrelated across the observed variables, the correlations among the observed variables are accounted for by the factors.
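The same model can be written compactly in matrix form. The block below is a standard restatement in conventional notation (the matrix symbols are not used elsewhere in this note); it makes explicit why, under the assumption of uncorrelated residuals, the factors reproduce the correlations among the observed variables, a fact exploited later when residual correlations are inspected.

```latex
% y collects the observed variables, \Lambda the loadings a_{ij},
% f the k common factors, and e the residuals (unique factors).
\[
  y = \Lambda f + e
\]
% Assuming the residuals are uncorrelated with the factors and with each
% other, the model-implied correlation matrix is
\[
  R = \Lambda \Phi \Lambda^{\top} + \Psi ,
\]
% where \Phi is the matrix of factor correlations (the identity when the
% factors are orthogonal) and \Psi is the diagonal matrix of unique variances.
```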

The following is an example of a simple path diagram for a factor analysis model. This diagram is a schematic representation of the above formula.

F1 and F2 are two common factors. Y1, Y2, Y3, Y4, and Y5 are observed variables, possibly 5 subtests or measures of other observations such as responses to items on a survey. e1, e2, e3, e4, and e5 represent residuals or unique factors, which are assumed to be uncorrelated with each other. Any correlation between a pair of the observed variables can be explained in terms of their relationships with the latent variables.

The primary purpose of factor analysis is data reduction and summarization. Factor analysis has been widely used, especially in the behavioral sciences, to assess the construct validity of a test or a scale. For example, a psychologist developed a new battery of 15 subtests to measure three distinct psychological constructs and wanted to validate that battery. A sample of 300 subjects was drawn from the population and measured on the battery of 15 subtests. The 300 by 15 data matrix was submitted to a factor analysis procedure. The output from that procedure was a 15 by 3 factor-loading matrix, which represented the relationships among the observed variables (the 15 subtests) and the 3 latent factors. The number of factors extracted and the pattern of relationships among the observed variables and the factors provided the researcher with information on the construct validity of the test battery.

Factor analysis as a generic term includes *principal component analysis*. While
the two techniques are functionally very similar and are used for the same purpose (data
reduction), they are quite different in terms of underlying assumptions.

The term "common" in *common factor analysis* describes the variance
that is analyzed. It is assumed that the variance of a single variable can be decomposed
into common variance that is shared by other variables included in the model, and unique
variance that is unique to a particular variable and includes the error component. Common
factor analysis (CFA) analyzes only the *common* variance of the observed variables;
principal component analysis considers the *total* variance and makes no distinction
between common and unique variance.

The selection of one technique over the other is based upon several criteria. First of
all, what is the objective of the analysis? Common factor analysis and principal component
analysis are similar in the sense that the purpose of both is to reduce the original
variables into fewer composite variables, called *factors* or *principal components*.
However, they are distinct in the sense that the obtained composite variables serve
different purposes. In common factor analysis, a small number of factors are extracted to
account for the intercorrelations among the observed variables--to identify the latent
dimensions that explain why the variables are correlated with each other. In principal
component analysis, the objective is to account for the maximum portion of the variance
present in the original set of variables with a minimum number of composite variables
called principal components.

Secondly, what are the assumptions about the variance in the original variables? If the observed variables are measured relatively error free (for example, age, years of education, or number of family members), or if it is assumed that the error and specific variance represent a small portion of the total variance in the original set of variables, then principal component analysis is appropriate. But if the observed variables are only indicators of the latent constructs to be measured (such as test scores or responses to attitude scales), or if the error (unique) variance represents a significant portion of the total variance, then the appropriate technique to select is common factor analysis. Since the two methods often yield similar results, only CFA will be illustrated here.

It is not uncommon in social science studies for an investigator to conduct a factor
analysis just because some multivariate data happen to be available. The investigator
simply hunts for relationships among the variables without any *a priori* hypothesis
about the relationships among the variables. With the availability of powerful computers
and statistical packages, many advanced multivariate techniques, including factor
analysis, which were once confined to a special population for a limited use, are now
readily accessible to many individuals and are therefore subject to potential misuses. One
key issue that users of factor analysis tend to overlook is that the quality of factor
analytic research depends primarily on the quality of input data submitted to the
analysis. The expression "Garbage In, Garbage Out" fits factor analysis well.

Several important questions should be considered by a researcher preparing input data for a factor analysis. First, what variables should be included in the analysis? Factor analysis is designed to explain why certain variables are correlated. Moreover, common factor analysis is concerned only with that portion of total variance shared by the variables included in the model. Therefore, you should not include variables that are not believed to be related to each other in any way.

Second, how many variables should be included? Factors are unobserved latent variables
that can be inferred from a set of observed variables. Therefore, factors cannot emerge
unless there is a sufficient number of observed variables that vary along the latent
continuum. You cannot define a factor with a single observed variable. You should have a
minimum of three observed variables for each factor expected to emerge. In Thurstone's
terminology, the factors defined by only one or two observed variables are called
"singlet" or "doublet" factors, which are not desirable. Guttman[1] has shown that if a
correlation matrix is suitable for common factor analysis, then **R ^{-1}** (the
inverse of a correlation matrix) should approach a diagonal matrix as the number of
variables increases while the number of factors remains constant. Kaiser and Rice[2] proposed a measure of sampling
adequacy, which indicates how near **R^{-1}** is to a diagonal matrix.

Third, is the number of observations sufficient to provide reliable estimations of the correlations between the variables? Correlation coefficients tend to be unstable and greatly influenced by the presence of outliers if the sample size is not large. It is generally unwise to conduct a factor analysis on a sample of fewer than 50 observations. Moreover, the sample size should also be considered in relation to the number of variables included in the analysis. Various rules of thumb have been proposed, with the minimum number of observations per variable ranging from 5 to 10. While there seems to be no definitive answer to this problem, everyone agrees that the more observations you have, the more valid your results.

Fourth, is correlation a valid measure of association among the variables to be analyzed? The correlation coefficient is being used as a measure of conceptual similarity of the variables. If strong curvilinear relationships are present among variables, for example, the correlation coefficient is not an appropriate measure. In such cases, the results of a factor analysis based on correlation coefficients will be invalid. The variables should meet the other assumptions required for the correlation coefficient as well. However, in social and behavioral sciences, we seldom have variables that strictly meet these assumptions. Ordinal and dichotomous variables have been submitted to a factor analysis in the social and behavioral sciences. Unless the distributions of the variables are strongly nonnormal, factor analysis seems to be robust to minor violations of these assumptions.

Once the input data are prepared for the analysis, it is necessary to decide on a factoring technique, that is, a method of extracting factors. In particular, you need to decide whether you want to perform factor analysis or principal components analysis. There is a procedure in SAS specifically designed for principal components analysis (PROC PRINCOMP), which is defined by its unique extraction method. On the other hand, if you decide on factor analysis, then you must choose an extraction technique. There are a variety of different methods of factor extraction available in PROC FACTOR: principal component, principal factor, iterative principal factor, unweighted least-squares factor, maximum-likelihood factor, alpha factor, image analysis, and Harris component analysis. The two most commonly employed factor analytic techniques are principal component and principal factor analysis. As discussed above, PCA is quite different from FA. The different FA techniques employ different criteria for extracting factors. Discussions of choosing among different methods of factor extraction can be found in Loehlin[3].

As mentioned earlier, in principal components analysis we do not make a distinction
between common and unique parts of the variation present in a variable. The correlation
(covariance) matrix, with 1.0s (variances) down the main diagonal, is submitted to an
analysis. On the other hand, a common factor analysis begins by substituting the diagonal
of the correlation matrix with what are called prior communality estimates (h^{2}).
The communality estimate for a variable is the estimate of the proportion of the variance
of the variable that is both error free and shared with other variables in the matrix.
Since the concept of common variance is hypothetical, we never know exactly in advance
what proportion of the variance is common and what proportion is unique among variables.
Therefore, estimates of communalities need to be supplied for a factor analysis. These
estimates can be specified with the PRIORS= option to the PROC FACTOR statement. The
simplest approach is to use the largest absolute correlation for a variable with any other
variable as the communality estimate for the variable (PRIORS=MAX). A more sophisticated
approach is to use the squared multiple correlation (R^{2}) between the variable
and all other variables (PRIORS=SMC). As the number of variables increases, the importance
of accurate prior estimates decreases.

There are still other methods of estimating communalities available in SAS. Interested readers should refer to the SAS manual[4]. Some method should be chosen, because SAS by default sets all prior communalities to 1.0, which is the same as requesting a principal components analysis. This default setting has caused misunderstanding among novice users who are not aware of the consequences of overlooking the default settings. Many researchers claim to have conducted a common factor analysis when actually a principal components analysis was performed.
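The contrast is easy to see in the PROC FACTOR call itself. The following is a minimal sketch (the dataset name mydata is hypothetical, not from this note): the first step, with no PRIORS= option, leaves the default prior communalities of 1.0 in place and so amounts to a principal components analysis, while the second requests squared multiple correlations as prior communality estimates and therefore performs a common (principal) factor analysis.

```sas
/* Default priors of 1.0 -- effectively a principal components analysis */
PROC FACTOR DATA=mydata METHOD=PRINCIPAL;
RUN;

/* Squared multiple correlations as prior communality estimates --
   a common (principal) factor analysis */
PROC FACTOR DATA=mydata METHOD=PRINCIPAL PRIORS=SMC;
RUN;
```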

Determining the optimal number of factors to extract is not a straightforward task since the decision is ultimately subjective. There are several criteria for the number of factors to be extracted, but these are just empirical guidelines rather than an exact quantitative solution. In practice, most factor analysts seldom use a single criterion to decide on the number of factors to extract. Some of the most commonly used guidelines are the Kaiser-Guttman rule, percentage of variance, the scree test, size of the residuals, and interpretability.

The "eigenvalues greater than one" rule has been most commonly used due to
its simple nature and availability in various computer packages. It states that the number
of factors to be extracted should be equal to the number of factors having an eigenvalue
(variance) greater than 1.0. The rationale for choosing this particular value is that a
factor must have variance at least as large as that of a single standardized original
variable. Recall that in principal components analysis 1's are retained in the main
diagonal of the correlation matrix, therefore for *p* standardized variables there is
a total variance of *p* to be decomposed into factors. This rule, however, is more
appropriate for PCA than FA, and it should be adjusted downward when the common factor
model is chosen. In a common factor analysis, communality estimates are inserted in the
main diagonal of the correlation matrix. Therefore, for *p* variables the variance to
be decomposed into factors is less than *p*. It has been suggested that the latent
root (eigenvalue) criterion should be lower and around the average of the initial
communality estimates. The PROC FACTOR statement has the option MINEIGEN= allowing you to
specify the latent root cutoff value. For example, MINEIGEN=1 requests SAS to retain the
factors with eigenvalues greater than 1.

Another criterion, related to the latent root criterion, is the percentage or proportion of the common variance (defined by the sum of communality estimates) that is explained by successive factors. For example, if you set the cutting line at 75 percent of the common variance (PROPORTION=.75 or PERCENT=75), then factors will be extracted until the sum of eigenvalues for the retained factors exceeds 75 percent of the common variance, defined as the sum of initial communality estimates.
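As a sketch of how these two retention criteria are requested (again with the hypothetical dataset name mydata, and cutoff values chosen only for illustration):

```sas
/* Retain factors whose eigenvalues exceed a cutoff adjusted downward
   toward the average prior communality, as suggested above */
PROC FACTOR DATA=mydata METHOD=PRINCIPAL PRIORS=SMC MINEIGEN=0.4;
RUN;

/* Retain factors until 75 percent of the common variance is accounted for */
PROC FACTOR DATA=mydata METHOD=PRINCIPAL PRIORS=SMC PROPORTION=.75;
RUN;
```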

Sometimes plotting the eigenvalues against the corresponding factor numbers gives insight into the maximum number of factors to extract. The SCREE option in the PROC FACTOR statement produces a scree plot that illustrates the rate of change in the magnitude of the eigenvalues for the factors. The rate of decline tends to be fast for the first few factors but then levels off. The "elbow", or the point at which the curve bends, is considered to indicate the maximum number of factors to extract. The figure below illustrates an example of a rather idealistic scree plot, where a clear elbow occurred at the fourth factor, which has an eigenvalue right around 1. Notice that the eigenvalues for the first few variables drop rapidly and after the fourth factor the decline in the eigenvalues gradually levels off. The scree plot suggests a maximum of four factors in this example. One less factor than the number at the elbow might be appropriate if you are concerned about getting an overly defined solution. However, many scree plots do not give such a clear indication of the number of factors.

If the factors are doing a good job in explaining the correlations among the original
variables, we expect the predicted correlation matrix **R*** to closely approximate the
input correlation matrix. In other words, we expect the residual matrix **R **-** R***
to approximate a null matrix. The RESIDUAL (or RES) option in the PROC FACTOR statement
prints the residual correlation matrix and the partial correlation matrix (correlation
between variables after the factors are partialled out or statistically controlled). If
the residual correlations or partial correlations are relatively large (> 0.1), then
either the factors are not doing a good job explaining the data or we may need to extract
more factors to more closely explain the correlations. If maximum likelihood factors
(METHOD=ML) are extracted, then the output includes the Chi-square test for the
significance of residuals after the extraction of the given factor. This test comprises
two separate hypothesis tests. The first test, labeled "Test of H0: No common
factors", tests the null hypothesis that no common factors can sufficiently explain
the intercorrelations among the variables included in the analysis. You want this test to
be statistically significant (p < .05). A nonsignificant value for this test statistic
suggests that your intercorrelations may not be strong enough to warrant performing a
factor analysis since the results from such an analysis could probably not be replicated.

The second Chi-square test statistic, labeled "Test of H0: N factors are
sufficient" is the test of the null hypothesis that N common factors are sufficient
to explain the intercorrelations among the variables, where N is the number of factors you
specify with an NFACTORS=N option in the PROC FACTOR statement. This test is useful for
testing the hypothesis that a given number of factors are sufficient to account for your
data; in this instance your goal is a small chi-square value relative to its degrees of
freedom. This outcome results in a *large* p-value (p > .05). One downside of this
test is that the Chi-square test is very sensitive to sample size: given large degrees of
freedom, this test will normally reject the null hypothesis of the residual matrix being a
null matrix, even when the factor analysis solution is very good. Therefore, be careful in
interpreting this test's significance value. Some data sets do not lend themselves to good
factor solutions, regardless of the number of factors extracted.
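A minimal sketch of a run that produces these two tests (hypothetical dataset name mydata; the number of factors is whatever your hypothesis specifies):

```sas
/* Maximum-likelihood extraction; the output includes the chi-square tests
   "Test of H0: No common factors" and "Test of H0: N factors are sufficient" */
PROC FACTOR DATA=mydata METHOD=ML PRIORS=SMC NFACTORS=3;
RUN;
```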

Another very important but often overlooked criterion for determining the number of factors is the interpretability of the factors extracted. Factor solutions should be evaluated not only according to empirical criteria but also according to the criterion of "theoretical meaningfulness." Extracting more factors will guarantee that the residual correlations get smaller and thus that the chi-square values get smaller relative to the number of degrees of freedom. However, noninterpretable factors may have little utility. That is, an interpretable three-factor solution may be more useful (not to mention more parsimonious) than a less interpretable four-factor solution with a better goodness-of-fit statistic.

The problem of determining the number of factors is not a concern if the researcher has
an *a priori* hypothesis about the number of factors to extract. That is, an *a
priori* hypothesis can provide a criterion for the number of factors to be extracted.
If a theory or previous research suggests a certain number of factors and the analyst
wants to confirm the hypothesis or replicate the previous study, then a factor analysis
with the prespecified number of factors can be run. The NFACTOR=*n* (or N=*n*)
option in PROC FACTOR extracts the user-supplied number of factors. Ultimately, the
criterion for determining the number of factors should be the replicability of the
solution. It is important to extract only factors that can be expected to replicate
themselves when a new sample of subjects is employed.

Once you decide on the number of factors to extract, the next logical step is to determine the method of rotation. The fundamental theorem of factor analysis is invariant under rotation; that is, the initial factor pattern matrix is not unique. We can obtain an infinite number of solutions that produce the same correlation matrix by rotating the reference axes of the factor solution, the aim being to simplify the factor structure and to achieve a more meaningful and interpretable solution. The idea of simple structure has provided the most common basis for rotation, the goal being to rotate the factors simultaneously so as to have as many zero loadings on each factor as possible. The following figure is a simplified example of rotation, showing only one variable from a set of several variables.

The variable V1 initially has factor loadings (correlations) of .7 and .6 on factor 1 and factor 2 respectively. However, after rotation the factor loadings have changed to .9 and .2 on the rotated factor 1 and factor 2 respectively, which is closer to a simple structure and easier to interpret.

The simplest case of rotation is an *orthogonal rotation*, in which the angle
between the reference axes of the factors is maintained at 90 degrees. More complicated forms
of rotation allow the angle between the reference axes to be other than a right angle,
i.e., factors are allowed to be correlated with each other. These types of rotational
procedures are referred to as *oblique rotations*. Orthogonal rotation procedures are
more commonly used than oblique rotation procedures. In some situations, theory may
mandate that underlying latent constructs be uncorrelated with each other, and therefore
oblique rotation procedures will not be appropriate. In other situations where the
correlations between the underlying constructs are not assumed to be zero, oblique
rotation procedures may yield simpler and more interpretable factor patterns.

A number of orthogonal and oblique rotation procedures have been proposed. Each
procedure has a slightly different *simplicity function* to be maximized. The ROTATE=
option in the PROC FACTOR statement supports five orthogonal rotation methods: EQUAMAX,
ORTHOMAX, QUARTIMAX, PARSIMAX, and VARIMAX; and two oblique rotation methods: PROCRUSTES
and PROMAX. The VARIMAX method has been the most commonly used orthogonal rotation
procedure.
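As a sketch (hypothetical dataset name mydata), the orthogonal and oblique cases differ only in the value given to the ROTATE= option:

```sas
/* Orthogonal rotation: factors kept uncorrelated */
PROC FACTOR DATA=mydata METHOD=PRINCIPAL PRIORS=SMC NFACTORS=3
            ROTATE=VARIMAX;
RUN;

/* Oblique rotation: factors allowed to correlate */
PROC FACTOR DATA=mydata METHOD=PRINCIPAL PRIORS=SMC NFACTORS=3
            ROTATE=PROMAX;
RUN;
```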

One part of the output from a factor analysis is a matrix of factor loadings. A *factor
loading* or *factor structure matrix* is an *n* by *m* matrix of
correlations between the original variables and their factors, where *n* is the
number of variables and *m* is the number of retained factors. When an oblique
rotation method is performed, the output also includes a *factor pattern matrix*,
which is a matrix of standardized regression coefficients for each of the original
variables on the rotated factors. The meaning of the rotated factors is inferred from the
variables that load significantly on them. A decision needs to be made regarding
what constitutes a significant loading. A rule of thumb frequently used is that factor
loadings greater than .30 in absolute value are considered to be significant. This
criterion is just a guideline and may need to be adjusted. As the sample size and the
number of variables increase, the criterion may need to be adjusted slightly downward; it
may need to be adjusted upward as the number of factors increases. The procedure described
next outlines the steps of interpreting a factor matrix.

1. Identifying significant loadings: The analyst starts with the first variable (row) and examines the factor loadings horizontally from left to right, underlining them if they are significant. This process is repeated for all the other variables. You can instruct SAS to perform this step by using the FUZZ= option in the PROC FACTOR statement (see the sketch following this list). For instance, FUZZ=.30 prints only the factor loadings greater than or equal to .30 in absolute value.

Ideally, we expect a single significant loading for each variable on only one factor:
across each row there is only one underlined factor loading. It is not uncommon, however,
to observe *split loadings*, where a variable has multiple significant loadings. On
the other hand, if there are variables that fail to load significantly on any factor, then
the analyst should critically evaluate these variables and consider deriving a new factor
solution after eliminating them.

2. Naming of Factors: Once all significant loadings are identified, the analyst attempts to assign some meaning to the factors based on the patterns of the factor loadings. To do this, the analyst examines the significant loadings for each factor (column). In general, the larger the absolute size of the factor loading for a variable, the more important the variable is in interpreting the factor. The sign of the loadings also needs to be considered in labeling the factors. It may be important to reverse the scoring of negatively worded items in Likert-type instruments to prevent ambiguity. That is, in Likert-type instruments some items are often negatively worded so that high scores on these items actually reflect low degrees of the attitude or construct being measured. Remember that the factor loadings represent the correlation or linear association between a variable and the latent factor(s). Considering all the variables' loadings on a factor, including the size and sign of each loading, the investigator makes a determination as to what the underlying factor may represent.
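The sketch below (hypothetical dataset name mydata) shows the FUZZ= option mentioned in step 1; loadings smaller than .30 in absolute value are simply not printed, which makes the underlining step easier.

```sas
/* Print only loadings of at least .30 in absolute value */
PROC FACTOR DATA=mydata METHOD=PRINCIPAL PRIORS=SMC NFACTORS=3
            ROTATE=VARIMAX FUZZ=.30;
RUN;
```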

A factor is a latent continuum along which we can locate data points according to the varying amount of the construct that they possess. Factor scores can quantify individual cases on a latent continuum using a z-score scale, which ranges from approximately -3.0 to +3.0. The FACTOR procedure can provide the estimated scoring coefficients, which are then used in PROC SCORE to produce a matrix of estimated factor scores. You can then output these scores into a SAS dataset for further analysis.

The following diagram illustrates a general decision process for factor analysis. This decision process is described here as a linear flow of events for the sake of simplicity. However, it would be more realistic to have a number of feedback loops included in the diagram. That is, depending on the result at a given stage, any previously made decision may need to be modified.

Confirmatory factor analysis allows you to test very specific hypotheses regarding the number of factors, factor loadings, and factor intercorrelations. However, it is more complex to run than ordinary exploratory factor analysis, and a full discussion of it is beyond the scope of this document.

**Factor Analysis Decision Diagram**

Below is an illustrative example of the application of common factor analysis to clarify the topics described in the previous sections. Factor analysis has been widely used to examine the structure of tests or scales of various kinds, such as personality scales, attitude measures, and ability scales. The following example illustrates the application of common factor analysis to provide evidence of construct validity of the Wechsler Intelligence Scale for Children (WISC-III).

The Wechsler Intelligence Scale for Children (WISC-III) was designed as a test of general intelligence to provide estimates of the intellectual abilities of children aged between 6 and 16. The WISC-III consists of 13 subtests, each measuring a different facet of intelligence. The matrix of intercorrelations among the 13 subtests, which served as the input data, was obtained from the manual[5] and is shown in Table 1. Inspection of the correlation matrix shows that the correlations are substantial, indicating the presence of a substantial general factor.

Table 1. Correlation matrix for 13 subscales

| Subscale | Inf | Sim | Ari | Voc | Com | Dig | PiC | Cod | PiA | Blo | Obj | Sym |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Information | | | | | | | | | | | | |
| Similarities | .66 | | | | | | | | | | | |
| Arithmetic | .57 | .55 | | | | | | | | | | |
| Vocabulary | .70 | .69 | .54 | | | | | | | | | |
| Comprehension | .56 | .59 | .47 | .64 | | | | | | | | |
| Digit Span | .34 | .34 | .43 | .35 | .29 | | | | | | | |
| Pic. Completion | .47 | .45 | .39 | .45 | .38 | .25 | | | | | | |
| Coding | .21 | .20 | .27 | .26 | .25 | .23 | .18 | | | | | |
| Pic. Arrang. | .40 | .39 | .35 | .40 | .35 | .20 | .37 | .28 | | | | |
| Block Design | .48 | .49 | .52 | .46 | .40 | .32 | .52 | .27 | .41 | | | |
| Object Assembly | .41 | .42 | .39 | .41 | .34 | .26 | .49 | .24 | .37 | .61 | | |
| Symbol Search | .35 | .35 | .41 | .35 | .34 | .28 | .33 | .53 | .36 | .45 | .38 | |
| Mazes | .18 | .18 | .22 | .17 | .17 | .14 | .24 | .15 | .23 | .31 | .29 | .24 |

PROC FACTOR can handle input data consisting of either a correlation matrix or the raw data matrix used to produce the correlation matrix. The correlation matrix can be a SAS dataset generated by the PROC CORR procedure or a text file containing the lower triangle (including the main diagonal) of a correlation matrix. For our example, a text file of correlations is created and called WISC.DAT. The following SAS DATA step code defines the type of the input data file WISC.DAT as a correlation matrix and labels its variables. The _TYPE_='CORR'; statement must be typed exactly as shown:

    DATA d1 (TYPE=CORR);
       _TYPE_='CORR';
       INFILE 'wisc.dat' MISSOVER;
       INPUT inf sim ari voc com dig pic cod pia blo obj sym maz;
    RUN;
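Alternatively, if the raw subject-by-variable data were available, the same TYPE=CORR dataset could be produced directly with PROC CORR rather than read from a text file. A sketch (the raw dataset name wiscraw is hypothetical):

```sas
/* Create a TYPE=CORR output dataset from hypothetical raw data */
PROC CORR DATA=wiscraw OUTP=d1 NOPRINT;
   VAR inf sim ari voc com dig pic cod pia blo obj sym maz;
RUN;
```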

The following SAS code calls the FACTOR procedure with some options. METHOD=P or METHOD=PRINCIPAL specifies the method for extracting factors to be the principal-axis factoring method. This option, in conjunction with PRIORS=SMC, performs a principal factor analysis. The option ROTATE=PROMAX performs an oblique rotation after an orthogonal VARIMAX rotation. It is specified here because the hypothetical constructs that constitute human intelligence, which WISC-III attempts to measure, are believed to be interrelated with each other. The CORR option requests that the correlation matrix be printed, and the RES or RESIDUALS option requests that a residual correlation matrix be printed. The residual correlation matrix shows the difference between the observed correlation matrix and the predicted correlation matrix. If the retained factors are sufficient to explain the correlations among the observed variables, the residual correlation matrix is expected to approximate a null matrix (most values <= .10).

    PROC FACTOR DATA=D1 METHOD=P PRIORS=SMC ROTATE=PROMAX SCREE CORR RES;
    RUN;

Table 2 shows the prior communality estimates for the 13 subtests used in this analysis. The squared multiple correlations (SMC), which are printed below, represent the proportion of variance of each of the 13 subtests shared by all remaining subtests. The subtest MAZES has a prior communality estimate of 0.132, which means that only 13% of the variance of the subtest MAZES is shared by all other subtests, indicating that this subtest measures a somewhat different construct than the other subtests. A small communality estimate might indicate that the variable or item may need to be modified or even dropped.

Table 2. Initial Communality Estimates (Initial Factor Method: Principal Factors; Prior Communality Estimates: SMC)

| Subtest | Prior Communality Estimate (SMC) |
|---|---|
| INFO | 0.594574 |
| SIM | 0.587543 |
| ARITH | 0.481994 |
| VOC | 0.636296 |
| COMP | 0.473358 |
| DIGIT | 0.224104 |
| PICTCOM | 0.385580 |
| CODING | 0.306120 |
| PICTARG | 0.287693 |
| BLOCK | 0.533202 |
| OBJECT | 0.439176 |
| SYMBOL | 0.422932 |
| MAZES | 0.132220 |

Eigenvalues of the Reduced Correlation Matrix: Total = 5.50479208, Average = 0.42344554

The sum of all prior communality estimates, 5.505 in this example, is the estimate of the common variance among all subtests. This initial estimate of the common variance constitutes about 42% of the total variance present among all 13 subtests.

Table 3 shows the factor numbers and corresponding eigenvalues. According to the Kaiser and Guttman rule, only one factor can be retained because only the first factor has an eigenvalue greater than one. However, as suggested in the previous section, this criterion may be applicable only to principal component analysis, not common factor analysis. Two factors can be retained if the average eigenvalue (0.423) instead of 1.0 is used as the criterion. The authors of WISC-III retained all factors with positive eigenvalues and thus retained the first four factors. The fifth and following factors have negative eigenvalues, which may not be intuitively appealing just as a negative variance is not. This oddity occurs only in common factor analysis due to the restriction that the sum of eigenvalues be set equal to the estimated common variance, not the total variance.

Table 3. Eigenvalues of the Reduced Correlation Matrix

| Factor | Eigenvalue | Difference | Proportion | Cumulative |
|---|---|---|---|---|
| 1 | 5.1046 | 4.4208 | 0.9273 | 0.9273 |
| 2 | 0.6838 | 0.2817 | 0.1242 | 1.0515 |
| 3 | 0.4021 | 0.2542 | 0.0731 | 1.1246 |
| 4 | 0.1479 | 0.1609 | 0.0269 | 1.1514 |
| 5 | -0.0130 | 0.0094 | -0.0024 | 1.1491 |
| 6 | -0.0224 | 0.0345 | -0.0041 | 1.1450 |
| 7 | -0.0569 | 0.0213 | -0.0103 | 1.1347 |
| 8 | -0.0782 | 0.0065 | -0.0142 | 1.1205 |
| 9 | -0.0848 | 0.0049 | -0.0154 | 1.1051 |
| 10 | -0.0897 | 0.0412 | -0.0163 | 1.0888 |
| 11 | -0.1310 | 0.0237 | -0.0238 | 1.0650 |
| 12 | -0.1547 | 0.0485 | -0.0281 | 1.0369 |
| 13 | -0.2031 | | -0.0369 | 1.0000 |

The scree plot shown below seems to suggest the presence of a general factor, as predicted from the inspection of the correlation matrix. A large first eigenvalue (5.10) and a much smaller second eigenvalue (0.68) suggest the presence of a dominant global factor. Stretching it to the limit, one might argue that a secondary elbow occurred at the fifth factor, implying a four-factor solution. That is equivalent to retaining all factors with positive eigenvalues. Research has suggested that the structure of the Wechsler intelligence scales is hierarchical. That is, at the top of the hierarchy all subtests converge to a single general factor, below which are several less general factors defined by clusters of subtests. For investigating the hierarchical structure of the WISC-III, a four-factor solution is more interesting and meaningful than a single-factor solution. The results presented in the following section will be based on a four-factor solution, which was obtained by repeating the analysis with the NFACTORS=4 option specifying that the first four factors be retained, as sketched below.
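A sketch of that re-run, using the dataset and options from the earlier PROC FACTOR call with the number of factors fixed at four:

```sas
/* Repeat the analysis, retaining the first four factors */
PROC FACTOR DATA=d1 METHOD=P PRIORS=SMC NFACTORS=4
            ROTATE=PROMAX SCREE CORR RES;
RUN;
```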

Table 4. Initial Factor Pattern

| Subtest | FACTOR1 | FACTOR2 | FACTOR3 | FACTOR4 | Label |
|---|---|---|---|---|---|
| INFO | 0.76124 | -0.26507 | 0.00573 | -0.00419 | INFORMATION |
| SIM | 0.75825 | -0.26807 | 0.00088 | -0.01733 | SIMILARITY |
| ARITH | 0.70320 | -0.04219 | 0.07006 | 0.21817 | ARITHMETIC |
| VOC | 0.77712 | -0.29967 | 0.08268 | -0.07819 | VOCABULARY |
| COMP | 0.67220 | -0.21792 | 0.11383 | 0.09479 | COMPREHENSION |
| DIGIT | 0.45938 | 0.01293 | 0.10982 | 0.23284 | DIGIT SPAN |
| PICTCOM | 0.61799 | 0.06079 | -0.23502 | -0.05384 | PICTURE COMPLETION |
| CODING | 0.40429 | 0.33855 | 0.34093 | -0.06015 | CODING |
| PICTARG | 0.54687 | 0.11799 | -0.0165 | -0.13620 | PICTURE ARRANGEMENT |
| BLOCK | 0.71609 | 0.21503 | -0.2255 | 0.06332 | BLOCK DESIGN |
| OBJECT | 0.62675 | 0.21928 | -0.2652 | -0.01736 | OBJECT ASSEMBLY |
| SYMBOL | 0.57731 | 0.36078 | 0.23968 | -0.03620 | SYMBOL SEARCH |
| MAZES | 0.32498 | 0.21379 | -0.12221 | -0.00324 | MAZES |

Variance explained by each factor: FACTOR1 = 5.104620, FACTOR2 = 0.683788, FACTOR3 = 0.402128, FACTOR4 = 0.147927. Final Communality Estimates: Total = 6.338464.

Table 4 above shows the initial unrotated factor structure matrix, which consists of the correlations between the 13 subtests and the four retained factors. The current estimate of the common variance is now 6.338, which is somewhat larger than the initial estimate of 5.505.

The off-diagonal elements of the residual correlation matrix are all close to 0.01, indicating that the correlations among the 13 subtests can be reproduced fairly accurately from the retained factors. The root mean squared off-diagonal residual is 0.0178. The inspection of the partial correlation matrix yields similar results: the correlations among the 13 subtests after the retained factors are accounted for are all close to zero. The root mean squared partial correlation is 0.038, indicating that four latent factors can accurately account for the observed correlations among the 13 subtests.

The table shown below is the factor structure matrix after the VARIMAX rotation. The correlations greater than 0.30 are underlined. There are some split loadings where a variable is significantly (> 0.3) loaded on more than one factor. This matrix, however, is not interpreted because an oblique solution has been requested.

Table 5. Rotated Factor Pattern (VARIMAX)

| Subtest | FACTOR1 | FACTOR2 | FACTOR3 | FACTOR4 | Label |
|---|---|---|---|---|---|
| INFO | 0.71862 | 0.29392 | 0.12616 | 0.17630 | INFORMATION |
| SIM | 0.72023 | 0.29506 | 0.12237 | 0.16230 | SIMILARITY |
| ARITH | 0.49726 | 0.30656 | 0.23918 | 0.38771 | ARITHMETIC |
| VOC | 0.77718 | 0.23819 | 0.17933 | 0.11727 | VOCABULARY |
| COMP | 0.65565 | 0.19763 | 0.21399 | 0.08092 | COMPREHENSION |
| DIGIT | 0.29024 | 0.16907 | 0.20796 | 0.34843 | DIGIT SPAN |
| PICTCOM | 0.37579 | 0.53504 | 0.10572 | 0.07124 | PICTURE COMPLETION |
| CODING | 0.12040 | 0.14820 | 0.59510 | 0.08546 | CODING |
| PICTARG | 0.33269 | 0.37653 | 0.28170 | 0.00121 | PICTURE ARRANGEMENT |
| BLOCK | 0.32270 | 0.64662 | 0.21651 | 0.21154 | BLOCK DESIGN |
| OBJECT | 0.26569 | 0.63181 | 0.17377 | 0.10766 | OBJECT ASSEMBLY |
| SYMBOL | 0.21005 | 0.32244 | 0.59566 | 0.13894 | SYMBOL SEARCH |
| MAZES | 0.07226 | 0.36298 | 0.15838 | 0.06487 | MAZES |

Variance explained by each factor: FACTOR1 = 2.891010, FACTOR2 = 1.894832, FACTOR3 = 1.110948, FACTOR4 = 0.441675.

Table 6 shown below is the factor structure matrix after the oblique PROMAX rotation,
which allows the latent factors to be correlated with each other. The matrix of
inter-factor correlations (Table 7) shows that the factors are substantially correlated
with each other. The inter-factor correlations range between 0.44 and 0.65. If we submit
these intercorrelated factors to a new factor analysis, we might be able to obtain a single
second-order factor, which could correspond to the general intelligence or *g* factor
in previous research. One downside of an oblique rotation method is that if the
correlations among the factors are substantial, then it is sometimes difficult to
distinguish among factors by examining the factor loadings. In such situations, you should
investigate the factor pattern matrix, which is a matrix of the standardized coefficients
for the regression of the observed variables on the factors.

Table 6. Factor Structure (Correlations)

| Subtest | FACTOR1 | FACTOR2 | FACTOR3 | FACTOR4 | Label |
|---|---|---|---|---|---|
| INFO | 0.80153 | 0.56064 | 0.33700 | 0.52105 | INFORMATION |
| SIM | 0.80059 | 0.55913 | 0.33257 | 0.50906 | SIMILARITY |
| ARITH | 0.65384 | 0.55813 | 0.42927 | 0.65702 | ARITHMETIC |
| VOC | 0.84027 | 0.53362 | 0.37803 | 0.48942 | VOCABULARY |
| COMP | 0.71732 | 0.45943 | 0.37569 | 0.41350 | COMPREHENSION |
| DIGIT | 0.40958 | 0.35214 | 0.32514 | 0.50255 | DIGIT SPAN |
| PICTCOM | 0.53937 | 0.64229 | 0.30602 | 0.37733 | PICTURE COMPLETION |
| CODING | 0.28294 | 0.32896 | 0.63030 | 0.31811 | CODING |
| PICTARG | 0.47527 | 0.51677 | 0.41891 | 0.30366 | PICTURE ARRANGEMENT |
| BLOCK | 0.56601 | 0.77315 | 0.44326 | 0.54029 | BLOCK DESIGN |
| OBJECT | 0.48561 | 0.71459 | 0.37858 | 0.41641 | OBJECT ASSEMBLY |
| SYMBOL | 0.42630 | 0.52381 | 0.69512 | 0.44612 | SYMBOL SEARCH |
| MAZES | 0.21660 | 0.39830 | 0.25905 | 0.22942 | MAZES |

Table 7. Inter-factor Correlations

| | FACTOR1 | FACTOR2 | FACTOR3 | FACTOR4 |
|---|---|---|---|---|
| FACTOR1 | 1.00000 | 0.64770 | 0.43503 | 0.58664 |
| FACTOR2 | 0.64770 | 1.00000 | 0.52336 | 0.57564 |
| FACTOR3 | 0.43503 | 0.52336 | 1.00000 | 0.47436 |
| FACTOR4 | 0.58664 | 0.57564 | 0.47436 | 1.00000 |

Table 8 is the factor pattern matrix, which will be used to interpret the meaning of the factors. The values in this matrix are the standardized regression coefficients, which are functionally related to the part or semipartial correlation between a variable and the factor when other factors are held constant. Therefore, a value in this matrix represents the individual and nonredundant contribution that each factor is making to predict a subtest. The regression coefficients greater than 0.30 are underlined to assist the interpretation.

Table 8. Rotated Factor Pattern (Standardized Regression Coefficients)

| Subtest | FACTOR1 | FACTOR2 | FACTOR3 | FACTOR4 | Label |
|---|---|---|---|---|---|
| INFO | 0.73663 | 0.06911 | -0.0553 | 0.07540 | INFORMATION |
| SIM | 0.74378 | 0.07445 | -0.05694 | 0.05688 | SIMILARITY |
| ARITH | 0.35704 | 0.08393 | 0.05243 | 0.37438 | ARITHMETIC |
| VOC | 0.85010 | -0.02674 | 0.02492 | -0.00572 | VOCABULARY |
| COMP | 0.71870 | -0.0391 | 0.09895 | -0.0325 | COMPREHENSION |
| DIGIT | 0.16057 | -0.01159 | 0.08321 | 0.37555 | DIGIT SPAN |
| PICTCOM | 0.24101 | 0.54702 | -0.06151 | -0.04977 | PICTURE COMPLETION |
| CODING | 0.00651 | -0.01816 | 0.62315 | 0.02916 | CODING |
| PICTARG | 0.25467 | 0.31837 | 0.20034 | -0.12403 | PICTURE ARRANGEMENT |
| BLOCK | 0.06661 | 0.65410 | 0.01652 | 0.11685 | BLOCK DESIGN |
| OBJECT | 0.04111 | 0.69028 | 0.00237 | -0.00618 | OBJECT ASSEMBLY |
| SYMBOL | 0.03508 | 0.17311 | 0.56088 | 0.05983 | SYMBOL SEARCH |
| MAZES | 0.08719 | 0.40886 | 0.07943 | 0.00754 | MAZES |

The subtests significantly loaded on the first factor are the Information, Similarities, Arithmetic, Vocabulary, and Comprehension subtests. These are the subtests that are orally presented and require verbal responses. Therefore, this factor may be named "Verbal Comprehension." The second factor is identified by the following subtests: Picture Completion, Picture Arrangement, Block Design, and Object Assembly. All of these subtests have a geometric or configural component: they measure skills that require the manual manipulation or organization of pictures, objects, blocks, and the like. Therefore, this factor may be named "Perceptual Organization." The two subtests loaded on the third factor are the Coding and Symbol Search subtests. Both basically measure the speed of simple coding or searching processes. Therefore, this factor can be named "Processing Speed." Finally, the Arithmetic and Digit Span subtests identify the fourth factor. Both subtests deal with arithmetic problems or numbers, so this factor can be named "Numerical Ability." The last two factors are doublets since they are identified by only two subtests each. Therefore, they are conceptually weaker than the first two factors, and more subtests may need to be added to these factors to make them conceptually sound.

It is possible to estimate the factor scores, or a subject's relative standing on each of the factors, if the original subject-by-variable raw data matrix is available. To compute the factor scores for all subjects on all factors, use the following SAS code:

    PROC FACTOR DATA=raw {other options here} OUTSTAT=fact;
    PROC SCORE DATA=raw SCORE=fact OUT=scores;
    RUN;

where *raw* is the original data matrix, *fact* is the matrix of factor
scoring coefficients, and *scores* is the matrix of factor scores for subjects.

1. Guttman, L. (1953). "Image Theory for the Structure of Quantitative Variables," *Psychometrika*, 18, 277-296.
2. Kaiser, H. F., and Rice, J. (1974). "Little Jiffy, Mark IV," *Educational and Psychological Measurement*, 34, 111-117.
3. Loehlin, J. C. (1992). *Latent Variable Models*. Hillsdale, NJ: Erlbaum Associates.
4. *SAS/STAT User's Guide* (1990). SAS Institute Inc., p. 785.
5. *Manual for the Wechsler Intelligence Scale for Children* (WISC-III) (1991). New York.
