SAS Library
Factor Analysis Using SAS PROC FACTOR


This page was developed by the Consulting group of the Division of Statistics and Scientific Computing at the University of Texas at Austin. We thank them for permission to distribute it via our web site.

26 June 1995
Usage Note: Stat-53
Copyright 1995-1997, ACITS, The University of Texas at Austin
Statistical Services, 475-9372
Originally available online at: http://ssc.utexas.edu/docs/stat53.html


Factor Analysis Using SAS PROC FACTOR

This usage note describes how to run a factor analysis, specifically an exploratory common factor analysis, using the SAS FACTOR procedure. This document is composed of three sections: Introduction, Outline of Use, and An Illustrative Example. The Introduction section explains what factor analysis is and when one should use it. The next section is a detailed outline for conducting a factor analysis. Finally the last section illustrates the use of common factor analysis using actual data.

I. Introduction

What Is Factor Analysis?

Factor analysis is a generic term for a family of statistical techniques concerned with the reduction of a set of observable variables in terms of a small number of latent factors. It has been developed primarily for analyzing relationships among a number of measurable entities (such as survey items or test scores). The underlying assumption of factor analysis is that there exists a number of unobserved latent variables (or "factors") that account for the correlations among observed variables, such that if the latent variables are partialled out or held constant, the partial correlations among observed variables all become zero. In other words, the latent factors determine the values of the observed variables.

Each observed variable (y) can be expressed as a weighted composite of a set of latent variables (f's) such that

	y_i = a_i1 f_1 + a_i2 f_2 + ... + a_ik f_k + e_i

where y_i is the ith observed variable, a_i1 through a_ik are the weights (loadings) of y_i on the k factors, and e_i is the residual of y_i on the factors. Given the assumption that the residuals are uncorrelated across the observed variables, the correlations among the observed variables are accounted for by the factors.

The following is an example of a simple path diagram for a factor analysis model. This diagram is a schematic representation of the above formula.

F1 and F2 are two common factors. Y1, Y2, Y3, Y4, and Y5 are observed variables, possibly 5 subtests or measures of other observations such as responses to items on a survey. e1, e2, e3, e4, and e5 represent residuals or unique factors, which are assumed to be uncorrelated with each other. Any correlation between a pair of the observed variables can be explained in terms of their relationships with the latent variables.

Purposes of Factor Analysis

The primary purpose of factor analysis is data reduction and summarization. Factor analysis has been widely used, especially in the behavioral sciences, to assess the construct validity of a test or a scale. For example, a psychologist developed a new battery of 15 subtests to measure three distinct psychological constructs and wanted to validate that battery. A sample of 300 subjects was drawn from the population and measured on the battery of 15 subtests. The 300 by 15 data matrix was submitted to a factor analysis procedure. The output from that procedure was a 15 by 3 factor-loading matrix, which represented the relationships among the observed variables (the 15 subtests) and the 3 latent factors. The number of factors extracted and the pattern of relationships among the observed variables and the factors provided the researcher with information on the construct validity of the test battery.

Common Factor Analysis vs. Component Analysis

Factor analysis as a generic term includes principal component analysis. While the two techniques are functionally very similar and are used for the same purpose (data reduction), they are quite different in terms of underlying assumptions.

The term "common" in common factor analysis describes the variance that is analyzed. It is assumed that the variance of a single variable can be decomposed into common variance that is shared by other variables included in the model, and unique variance that is unique to a particular variable and includes the error component. Common factor analysis (CFA) analyzes only the common variance of the observed variables; principal component analysis considers the total variance and makes no distinction between common and unique variance.

The selection of one technique over the other is based upon several criteria. First of all, what is the objective of the analysis? Common factor analysis and principal component analysis are similar in the sense that the purpose of both is to reduce the original variables into fewer composite variables, called factors or principal components. However, they are distinct in the sense that the obtained composite variables serve different purposes. In common factor analysis, a small number of factors are extracted to account for the intercorrelations among the observed variables--to identify the latent dimensions that explain why the variables are correlated with each other. In principal component analysis, the objective is to account for the maximum portion of the variance present in the original set of variables with a minimum number of composite variables called principal components.

Secondly, what are the assumptions about the variance in the original variables? If the observed variables are measured relatively error free, (for example, age, years of education, or number of family members), or if it is assumed that the error and specific variance represent a small portion of the total variance in the original set of the variables, then principal component analysis is appropriate. But if the observed variables are only indicators of the latent constructs to be measured (such as test scores or responses to attitude scales), or if the error (unique) variance represents a significant portion of the total variance, then the appropriate technique to select is common factor analysis. Since the two methods often yield similar results, only CFA will be illustrated here.

II. Outline of Use

1. Preparing Data

It is not uncommon in social science studies for an investigator to conduct a factor analysis just because some multivariate data happen to be available. The investigator simply hunts for relationships among the variables without any a priori hypothesis about them. With the availability of powerful computers and statistical packages, many advanced multivariate techniques, including factor analysis, which were once confined to a small group of specialists, are now readily accessible to many individuals and are therefore subject to potential misuse. One key issue that users of factor analysis tend to overlook is that the quality of factor analytic research depends primarily on the quality of the input data submitted to the analysis. The expression "Garbage In, Garbage Out" fits factor analysis well.

Several important questions should be considered by a researcher preparing input data for a factor analysis. First, what variables should be included in the analysis? Factor analysis is designed to explain why certain variables are correlated. Moreover, common factor analysis is concerned only with that portion of total variance shared by the variables included in the model. Therefore, you should not include variables that are not believed to be related to each other in any way.

Second, how many variables should be included? Factors are unobserved latent variables that can be inferred from a set of observed variables. Therefore, factors cannot emerge unless there is a sufficient number of observed variables that vary along the latent continuum. You cannot define a factor with a single observed variable. You should have a minimum of three observed variables for each factor expected to emerge. In Thurstone's terminology, the factors defined by only one or two observed variables are called "singlet" or "doublet" factors, which are not desirable. Guttman[1] has shown that if a correlation matrix is suitable for common factor analysis, then R^-1 (the inverse of a correlation matrix) should approach a diagonal matrix as the number of variables increases while the number of factors remains constant. Kaiser and Rice[2] proposed a measure of sampling adequacy, which indicates how near R^-1 is to a diagonal matrix.
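In SAS, Kaiser's measure of sampling adequacy (and the partial correlations on which it is based) can be requested with the MSA option in the PROC FACTOR statement. A minimal sketch, assuming a hypothetical raw data set MYDATA with variables X1 through X10:

PROC FACTOR DATA=mydata MSA;   /* MSA prints the partial correlations and Kaiser's measure of sampling adequacy */
  VAR x1-x10;                  /* hypothetical variable list */
RUN;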

Third, is the number of observations sufficient to provide reliable estimations of the correlations between the variables? Correlation coefficients tend to be unstable and greatly influenced by the presence of outliers if the sample size is not large. It is generally unwise to conduct a factor analysis on a sample of fewer than 50 observations. Moreover, the sample size should also be considered in relation to the number of variables included in the analysis. Various rules of thumb have been proposed, with the minimum number of observations per variable ranging from 5 to 10. While there seems to be no definitive answer to this problem, everyone agrees that the more observations you have, the more valid your results.

Fourth, is correlation a valid measure of association among the variables to be analyzed? The correlation coefficient is being used as a measure of conceptual similarity of the variables. If strong curvilinear relationships are present among variables, for example, the correlation coefficient is not an appropriate measure. In such cases, the results of a factor analysis based on correlation coefficients will be invalid. The variables should meet the other assumptions required for the correlation coefficient as well. However, in social and behavioral sciences, we seldom have variables that strictly meet these assumptions. Ordinal and dichotomous variables have been submitted to a factor analysis in the social and behavioral sciences. Unless the distributions of the variables are strongly nonnormal, factor analysis seems to be robust to minor violations of these assumptions.

2. Selecting a Factor Model

Once the input data are prepared for the analysis, it is necessary to decide on a factoring technique, that is, a method of extracting factors. In particular, you need to decide whether you want to perform factor analysis or principal components analysis. There is a procedure in SAS specifically designed for principal components analysis (PROC PRINCOMP), which is defined by its unique extraction method. On the other hand, if you decide on factor analysis, then you must choose an extraction technique. There are a variety of different methods of factor extraction available in the PROC FACTOR procedure in SAS: principal component, principal factor, iterative principal factor, unweighted least-squares factor, maximum-likelihood factor, alpha factor, image analysis, and Harris component analysis. The two most commonly employed factor analytic techniques are principal component and principal factor analysis. As discussed above, PCA is quite different from FA. The different FA techniques employ different criteria for extracting factors. Discussions on choosing different methods of factor extraction can be found in Loehlin[3].

3. Estimating Communalities

As mentioned earlier, in principal components analysis we do not make a distinction between common and unique parts of the variation present in a variable. The correlation (covariance) matrix, with 1.0s (variances) down the main diagonal, is submitted to an analysis. On the other hand, a common factor analysis begins by substituting the diagonal of the correlation matrix with what are called prior communality estimates (h^2). The communality estimate for a variable is the estimate of the proportion of the variance of the variable that is both error free and shared with other variables in the matrix. Since the concept of common variance is hypothetical, we never know exactly in advance what proportion of the variance is common and what proportion is unique among variables. Therefore, estimates of communalities need to be supplied for a factor analysis. These estimates can be specified with the PRIORS= option in the PROC FACTOR statement. The simplest approach is to use the largest absolute correlation for a variable with any other variable as the communality estimate for the variable (PRIORS=MAX). A more sophisticated approach is to use the squared multiple correlation (R^2) between the variable and all other variables (PRIORS=SMC). As the number of variables increases, the importance of accurate prior estimates decreases.

There are still other methods of estimating communalities available in SAS. Interested readers should refer to the SAS manual[4]. Some method should be chosen, because SAS by default sets all prior communalities to 1.0, which is the same as requesting a principal components analysis. This default setting has caused misunderstanding among novice users who are unaware of the consequences of overlooking it. Many researchers claim to have conducted a common factor analysis when actually a principal components analysis was performed.
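For instance, the following sketch (with a hypothetical data set MYDATA and variable list) requests a common factor analysis by replacing the default prior communalities of 1.0 with squared multiple correlations:

PROC FACTOR DATA=mydata METHOD=PRINCIPAL PRIORS=SMC;  /* PRIORS=SMC gives a principal factor analysis */
  VAR x1-x10;                                         /* hypothetical variable list */
RUN;
/* Omitting PRIORS= leaves the default prior communalities of 1.0,
   which is equivalent to a principal components analysis. */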

4. Determining the Number of Factors

Determining the optimal number of factors to extract is not a straightforward task since the decision is ultimately subjective. There are several criteria for the number of factors to be extracted, but these are just empirical guidelines rather than an exact quantitative solution. In practice, most factor analysts seldom use a single criterion to decide on the number of factors to extract. Some of the most commonly used guidelines are the Kaiser-Guttman rule, percentage of variance, the scree test, size of the residuals, and interpretability.

Kaiser-Guttman rule

The "eigenvalues greater than one" rule has been most commonly used due to its simple nature and availability in various computer packages. It states that the number of factors to be extracted should be equal to the number of factors having an eigenvalue (variance) greater than 1.0. The rationale for choosing this particular value is that a factor must have variance at least as large as that of a single standardized original variable. Recall that in principal components analysis 1's are retained in the main diagonal of the correlation matrix, therefore for p standardized variables there is a total variance of p to be decomposed into factors. This rule, however, is more appropriate for PCA than FA, and it should be adjusted downward when the common factor model is chosen. In a common factor analysis, communality estimates are inserted in the main diagonal of the correlation matrix. Therefore, for p variables the variance to be decomposed into factors is less than p. It has been suggested that the latent root (eigenvalue) criterion should be lower and around the average of the initial communality estimates. The PROC FACTOR statement has the option MINEIGEN= allowing you to specify the latent root cutoff value. For example, MINEIGEN=1 requests SAS to retain the factors with eigenvaues greater than 1.

Percentage of Variance

Another criterion, related to the latent root criterion, is the percentage or proportion of the common variance (defined by the sum of communality estimates) that is explained by successive factors. For example, if you set the cutting line at 75 percent of the common variance (PROPORTION=.75 or PERCENT=75), then factors will be extracted until the sum of eigenvalues for the retained factors exceeds 75 percent of the common variance, defined as the sum of initial communality estimates.

Scree Test

Sometimes plotting the eigenvalues against the corresponding factor numbers gives insight into the maximum number of factors to extract. The SCREE option in the PROC FACTOR statement produces a scree plot that illustrates the rate of change in the magnitude of the eigenvalues for the factors. The rate of decline tends to be fast for the first few factors but then levels off. The "elbow", or the point at which the curve bends, is considered to indicate the maximum number of factors to extract. The figure below illustrates an example of a rather idealistic scree plot, where a clear elbow occurred at the fourth factor, which has an eigenvalue right around 1. Notice that the eigenvalues for the first few variables drop rapidly and after the fourth factor the decline in the eigenvalues gradually levels off. The scree plot suggests a maximum of four factors in this example. One less factor than the number at the elbow might be appropriate if you are concerned about getting an overly defined solution. However, many scree plots do not give such a clear indication of the number of factors.

Analysis of Residuals

If the factors are doing a good job in explaining the correlations among the original variables, we expect the predicted correlation matrix R* to closely approximate the input correlation matrix R. In other words, we expect the residual matrix R - R* to approximate a null matrix. The RESIDUAL (or RES) option in the PROC FACTOR statement prints the residual correlation matrix and the partial correlation matrix (the correlations between variables after the factors are partialled out or statistically controlled). If the residual correlations or partial correlations are relatively large (> 0.1), then either the factors are not doing a good job of explaining the data or we may need to extract more factors to explain the correlations more closely. If maximum-likelihood factors (METHOD=ML) are extracted, then the output includes Chi-square tests for the significance of the residuals after the extraction of a given number of factors. Two separate hypothesis tests are reported. The first test, labeled "Test of H0: No common factors", tests the null hypothesis that no common factors underlie the intercorrelations among the variables included in the analysis. You want this test to be statistically significant (p < .05). A nonsignificant value for this test statistic suggests that your intercorrelations may not be strong enough to warrant performing a factor analysis since the results from such an analysis could probably not be replicated.

The second Chi-square test statistic, labeled "Test of H0: N factors are sufficient", is the test of the null hypothesis that N common factors are sufficient to explain the intercorrelations among the variables, where N is the number of factors you specify with the NFACTORS=N option in the PROC FACTOR statement. This test is useful for testing the hypothesis that a given number of factors are sufficient to account for your data; in this instance your goal is a small chi-square value relative to its degrees of freedom. This outcome results in a large p-value (p > .05). One downside of this test is that the Chi-square test is very sensitive to sample size: given a large sample, this test will normally reject the null hypothesis that the residual matrix is a null matrix, even when the factor analysis solution is very good. Therefore, be careful in interpreting this test's significance value. Some data sets do not lend themselves to good factor solutions, regardless of the number of factors extracted.
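A sketch of a maximum-likelihood run that produces both tests, assuming a hypothetical data set MYDATA and a hypothesized three-factor structure:

PROC FACTOR DATA=mydata METHOD=ML NFACTORS=3 PRIORS=SMC RESIDUALS;  /* ML extraction: prints both Chi-square tests plus the residual matrix */
  VAR x1-x10;                                                       /* hypothetical variable list */
RUN;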

Interpretability

Another very important but often overlooked criterion for determining the number of factors is the interpretability of the factors extracted. Factor solutions should be evaluated not only according to empirical criteria but also according to the criterion of "theoretical meaningfulness." Extracting more factors will guarantee that the residual correlations get smaller and thus that the chi-square values get smaller relative to the number of degrees of freedom. However, noninterpretable factors may have little utility. That is, an interpretable three-factor solution may be more useful (not to mention more parsimonious) than a less interpretable four-factor solution with a better goodness-of-fit statistic.

A Priori Hypotheses

The problem of determining the number of factors is not a concern if the researcher has an a priori hypothesis about the number of factors to extract. That is, an a priori hypothesis can provide a criterion for the number of factors to be extracted. If a theory or previous research suggests a certain number of factors and the analyst wants to confirm the hypothesis or replicate the previous study, then a factor analysis with the prespecified number of factors can be run. The NFACTORS=n (or N=n) option in PROC FACTOR extracts the user-supplied number of factors. Ultimately, the criterion for determining the number of factors should be the replicability of the solution. It is important to extract only factors that can be expected to replicate themselves when a new sample of subjects is employed.

5. The Rotation of Factors

Once you decide on the number of factors to extract, the next logical step is to determine the method of rotation. The fundamental theorem of factor analysis is invariant under rotation. That is, the initial factor pattern matrix is not unique. We can get an infinite number of solutions, all of which produce the same correlation matrix, by rotating the reference axes of the factor solution to simplify the factor structure and to achieve a more meaningful and interpretable solution. The idea of simple structure has provided the most common basis for rotation, the goal being to rotate the factors simultaneously so as to have as many zero loadings on each factor as possible. The following figure is a simplified example of rotation, showing only one variable from a set of several variables.

The variable V1 initially has factor loadings (correlations) of .7 and .6 on factor 1 and factor 2 respectively. However, after rotation the factor loadings have changed to .9 and .2 on the rotated factor 1 and factor 2 respectively, which is closer to a simple structure and easier to interpret.

The simplest case of rotation is an orthogonal rotation, in which the angles between the reference axes of the factors are maintained at 90 degrees. More complicated forms of rotation allow the angles between the reference axes to be other than right angles, i.e., the factors are allowed to be correlated with each other. These types of rotational procedures are referred to as oblique rotations. Orthogonal rotation procedures are more commonly used than oblique rotation procedures. In some situations, theory may mandate that the underlying latent constructs be uncorrelated with each other, and therefore oblique rotation procedures will not be appropriate. In other situations where the correlations between the underlying constructs are not assumed to be zero, oblique rotation procedures may yield simpler and more interpretable factor patterns.

A number of orthogonal and oblique rotation procedures have been proposed. Each procedure has a slightly different simplicity function to be maximized. The ROTATE= option in the PROC FACTOR statement supports five orthogonal rotation methods: EQUAMAX, ORTHOMAX, QUARTIMAX, PARSIMAX, and VARIMAX; and two oblique rotation methods: PROCRUSTES and PROMAX. The VARIMAX method has been the most commonly used orthogonal rotation procedure.

6. Interpretation of Factors

One part of the output from a factor analysis is a matrix of factor loadings. A factor loading or factor structure matrix is an n by m matrix of correlations between the original variables and their factors, where n is the number of variables and m is the number of retained factors. When an oblique rotation method is performed, the output also includes a factor pattern matrix, which is a matrix of standardized regression coefficients for each of the original variables on the rotated factors. The meaning of the rotated factors is inferred from the variables that load significantly on them. A decision needs to be made regarding what constitutes a significant loading. A rule of thumb frequently used is that factor loadings greater than .30 in absolute value are considered significant. This criterion is just a guideline and may need to be adjusted. As the sample size and the number of variables increase, the criterion may need to be adjusted slightly downward; it may need to be adjusted upward as the number of factors increases. The procedure described next outlines the steps of interpreting a factor matrix.

1. Identifying significant loadings: The analyst starts with the first variable (row) and examines the factor loadings horizontally from left to right, underlining them if they are significant. This process is repeated for all the other variables. You can instruct SAS to perform this step by using the FUZZ= option in the PROC FACTOR statement. For instance, FUZZ=.30 prints only the factor loadings greater than or equal to .30 in absolute value.

Ideally, we expect a single significant loading for each variable on only one factor: across each row there is only one underlined factor loading. It is not uncommon, however, to observe split loadings, a variable which has multiple significant loadings. On the other hand, if there are variables that fail to load significantly on any factor, then the analyst should critically evaluate these variables and consider deriving a new factor solution after eliminating them.

2. Naming of Factors: Once all significant loadings are identified, the analyst attempts to assign some meaning to the factors based on the patterns of the factor loadings. To do this, the analyst examines the significant loadings for each factor (column). In general, the larger the absolute size of the factor loading for a variable, the more important the variable is in interpreting the factor. The sign of the loadings also needs to be considered in labeling the factors. It may be important to reverse the scoring of the negatively worded items in Likert-type instruments to prevent ambiguity. That is, in Likert-type instruments some items are often negatively worded so that high scores on these items actually reflect low degrees of the attitude or construct being measured. Remember that the factor loadings represent the correlation or linear association between a variable and the latent factor(s). Considering all the variables' loading on a factor, including the size and sign of the loading, the investigator makes a determination as to what the underlying factor may represent.

7. Estimating Factor Scores

A factor is a latent continuum along which we can locate data points according to the varying amount of the construct that they possess. Factor scores can quantify individual cases on a latent continuum using a z-score scale, which ranges from approximately -3.0 to +3.0. The FACTOR procedure can provide the estimated scoring coefficients, which are then used in PROC SCORE to produce a matrix of estimated factor scores. You can then output these scores into a SAS dataset for further analysis.

8. Factor Analysis Decision Diagram

The following diagram illustrates a general decision process for factor analysis. This decision process is described here as a linear flow of events for the sake of simplicity. However, it would be more realistic to have a number of feedback loops included in the diagram. That is, depending on the result at a given stage, any previously made decision may need to be modified.

9. Confirmatory Factor Analysis

Confirmatory factor analysis allows you to test very specific hypotheses regarding the number of factors, factor loadings, and factor intercorrelations. However, it is more complex to run than ordinary exploratory factor analysis, and a full discussion of it is beyond the scope of this document.


III. An Illustrative Example

Below is an illustrative example of the application of common factor analysis to clarify the topics described in the previous sections. Factor analysis has been widely used to examine the structure of tests or scales of various kinds, such as personality scales, attitude measures, and ability scales. The following example illustrates the application of common factor analysis to provide evidence of construct validity of the Wechsler Intelligence Scale for Children (WISC-III).

The Wechsler Intelligence Scale for Children (WISC-III) was designed as a test of general intelligence to provide estimates of the intellectual abilities of children aged between 6 and 16. The WISC-III consists of 13 subtests, each measuring a different facet of intelligence. The matrix of intercorrelations among the 13 subtests, which served as the input data, was obtained from the manual[5] and is shown in Table 1. Inspection of the correlation matrix shows that the correlations are substantial, indicating the presence of a substantial general factor.

Table 1.  Correlation matrix for 13 subscales
Subscale 	Inf	Sim	Ari	Voc	Com	Dig	PiC	Cod	PiA	Blo	Obj	Sym   
Information   
Similarities	.66   
Arithmetic	.57	.55   
Vocabulary	.70	.69	.54    
Comprehension	.56	.59	.47	.64   
Digit Span 	.34	.34	.43	.35	.29   
Pic. Completion	.47	.45	.39	.45	.38	.25    
Coding Subscale	.21	.20	.27	.26	.25	.23	.18   
Pic. Arrang.    .40	.39	.35	.40	.35	.20	.37	.28   
Block Design	.48	.49	.52	.46	.40	.32	.52	.27	.41   
Object Assembly	.41	.42	.39	.41	.34	.26	.49	.24	.37	.61   
Symbol Search	.35	.35	.41	.35	.34	.28	.33	.53	.36	.45	.38   
Mazes		.18	.18	.22	.17	.17	.14	.24	.15	.23	.31	.29	.24   

PROC FACTOR can handle input data consisting of either a correlation matrix or the raw data matrix used to produce the correlation matrix. The correlation matrix can be a SAS dataset generated from the PROC CORR procedure or can be a text file containing the lower triangle (including the main diagonal) of a correlation matrix. For our example, a text file of correlations is created and called WISC.DAT. The following SAS DATA step code defines the type of the input data file WISC.DAT as a correlation matrix, and labels its variables. The _TYPE_='CORR'; statement must be typed exactly as shown:

DATA d1 (TYPE=CORR);             /* declare the data set type as a correlation matrix */
  _TYPE_='CORR';                 /* flag each observation as a row of correlations    */
  INFILE 'wisc.dat' MISSOVER;    /* MISSOVER handles the short lower-triangle lines   */
  INPUT inf sim ari voc com dig pic cod pia blo obj sym maz;
RUN;
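If the raw subject-by-variable data were available instead, a TYPE=CORR data set could be produced directly with PROC CORR (in place of the DATA step above) and passed to PROC FACTOR. A sketch, assuming a hypothetical raw data set named RAWDATA containing the 13 subtest variables:

PROC CORR DATA=rawdata OUTP=d1 NOPRINT;   /* OUTP= writes a TYPE=CORR data set; NOPRINT suppresses printed output */
  VAR inf sim ari voc com dig pic cod pia blo obj sym maz;
RUN;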

The following SAS code calls the FACTOR procedure with several options. METHOD=P (or METHOD=PRINCIPAL) specifies principal-axis factoring as the method for extracting factors. This option in conjunction with PRIORS=SMC performs a principal factor analysis. The option ROTATE=PROMAX performs an oblique rotation after an orthogonal VARIMAX rotation. It is specified here because the hypothetical constructs that constitute human intelligence, which the WISC-III attempts to measure, are believed to be interrelated with each other. The SCREE option requests the scree plot discussed earlier. The CORR option requests that the correlation matrix be printed, and the RES (or RESIDUALS) option requests that a residual correlation matrix be printed. The residual correlation matrix shows the difference between the observed correlation matrix and the predicted correlation matrix. If the retained factors are sufficient to explain the correlations among the observed variables, the residual correlation matrix is expected to approximate a null matrix (most values <= .10).

PROC FACTOR DATA=D1 METHOD=P PRIORS=SMC ROTATE=PROMAX SCREE CORR RES;
RUN;

Table 2 shows the prior communality estimates for 13 subtests used in this analysis. The squared multiple correlations (SMC), which are printed below, represent the proportion of variance of each of the 13 subtests shared by all remaining subtests. The subtest MAZES has the prior communality estimate of 0.132, which means that only 13% of the variance of the subtest MAZES is shared by all other subtests, indicating that this subtest measures a somewhat different construct than the other subtests. A small communality estimate might indicate that the variable or item may need to be modified or even dropped.

Table 2.  Initial Communality Estimates 

Initial Factor Method: Principal Factors

Prior Communality Estimates: SMC

INFO		SIM		ARITH		VOC		COMP	
0.594574	0.587543	0.481994	0.636296	0.473358

DIGIT		PICTCOM		CODING		PICTARG
0.224104	0.385580 	0.306120	0.287693

BLOCK	OBJECT	SYMBOL	MAZES
0.533202	0.439176	0.422932	0.132220

Eigenvalues of the Reduced Correlation Matrix: 

 Total = 5.50479208	Average = 0.42344554

The sum of all prior communality estimates, 5.505 in this example, is the estimate of the common variance among all subtests. This initial estimate of the common variance constitutes about 42% of the total variance present among all 13 subtests.

Table 3 shows the factor numbers and corresponding eigenvalues. According to the Kaiser and Guttman rule, only one factor can be retained because only the first factor has an eigenvalue greater than one. However, as suggested in the previous section, this criterion may be applicable only to principal component analysis, not common factor analysis. Two factors can be retained if the average eigenvalue (0.423) instead of 1.0 is used as the criterion. The authors of WISC-III retained all factors with positive eigenvalues and thus retained the first four factors. The fifth and following factors have negative eigenvalues, which may not be intuitively appealing just as a negative variance is not. This oddity occurs only in common factor analysis due to the restriction that the sum of eigenvalues be set equal to the estimated common variance, not the total variance.

Table 3.  Eigenvalues of the Reduced Correlation Matrix

 

		1	2	3	4	5
Eigenvalue	5.1046 	0.6838	0.4021	0.1479	-0.0130
Difference	4.4208	0.2817	0.2542	0.1609	 0.0094
Proportion	0.9273	0.1242	0.0731	0.0269	-0.0024
Cumulative	0.9273	1.0515	1.1246	1.1514	 1.1491

		6	7	8	9	10
Eigenvalue	-0.0224	-0.0569	-0.0782	-0.0848	-0.0897
Difference	 0.0345	 0.0213	 0.0065	 0.0049	 0.0412
Proportion	-0.0041	-0.0103	-0.0142	-0.0154	-0.0163
Cumulative	 1.1450	 1.1347	 1.1205	 1.1051	 1.0888

		11	12	 13
Eigenvalue     -0.1310  -0.1547  -0.2031
Difference	0.0237   0.0485
Proportion     -0.0238	-0.0281  -0.0369
Cumulative	1.0650   1.0369   1.0000

The scree plot shown below seems to suggest the presence of a general factor, as predicted from the inspection of the correlation matrix. A large first eigenvalue (5.10) and a much smaller second eigenvalue (0.68) suggest the presence of a dominant global factor. Stretching it to the limit, one might argue that a secondary elbow occurred at the fifth factor, implying a four-factor solution. That is equivalent to retaining all factors with positive eigenvalues. Research has suggested that the structure of the Wechsler intelligence scales is hierarchical. That is, at the top of the hierarchy all subtests converge to a single general factor, below which are several less general factors defined by clusters of subtests. For investigating the hierarchical structure of the WISC-III, a four-factor solution is more interesting and meaningful than a single-factor solution. The results presented in the following section are based on a four-factor solution, which was obtained by repeating the analysis with the NFACTORS=4 option specifying that the first four factors be retained, as shown below.
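A sketch of that repeated call (the same data set and options as before, with the number of factors fixed at four):

PROC FACTOR DATA=d1 METHOD=P PRIORS=SMC NFACTORS=4 ROTATE=PROMAX CORR RES;
RUN;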

Table 4.  Initial Factor Pattern

 

	FACTOR1	FACTOR2	 FACTOR3 FACTOR4
INFO	0.76124	-0.26507 0.00573 -0.00419	INFORMATION
SIM	0.75825	-0.26807 0.00088 -0.01733	SIMILARITY
ARITH	0.70320	-0.04219 0.07006  0.21817	ARITHMETIC
VOC	0.77712	-0.29967 0.08268 -0.07819	VOCABULARY
COMP	0.67220	-0.21792 0.11383  0.09479	COMPREHENSION
DIGIT	0.45938	0.01293	 0.10982  0.23284	DIGIT SPAN
PICTCOM	0.61799	0.06079	-0.23502 -0.05384	PICTURECOMPLETION
CODING	0.40429	0.33855	 0.34093 -0.06015	CODING
PICTARG	0.54687	0.11799	-0.0165	 -0.13620	PICTURE ARRANGEMENT
BLOCK	0.71609	0.21503	-0.2255	  0.06332	BLOCK DESIGN
OBJECT	0.62675	0.21928	-0.2652	 -0.01736 	OBJECT ASSEMBLY
SYMBOL	0.57731	0.36078	 0.23968 -0.03620 	SYMBOL SEARCH
MAZES	0.32498	0.21379	-0.12221 -0.00324	MAZES
Variance explained by each factor
	FACTOR1		FACTOR2		FACTOR3 	FACTOR4
	5.104620 	0.683788	0.402128 	0.147927
Final Communality Estimates: Total = 6.338464

Table 4 above shows the initial unrotated factor structure matrix, which consists of the correlations between the 13 subtests and the four retained factors. The current estimate of the common variance is now 6.338, which is somewhat larger than the initial estimate of 5.505.

The off-diagonal elements of the residual correlation matrix are all close to 0.01, indicating that the correlations among the 13 subtests can be reproduced fairly accurately from the retained factors. The root mean squared off-diagonal residual is 0.0178. The inspection of the partial correlation matrix yields similar results: the correlations among the 13 subtests after the retained factors are accounted for are all close to zero. The root mean squared partial correlation is 0.038, indicating that four latent factors can accurately account for the observed correlations among the 13 subtests.

The table shown below is the factor structure matrix after the VARIMAX rotation. The correlations greater than 0.30 are underlined. There are some split loadings where a variable is significantly (> 0.3) loaded on more than one factor. This matrix, however, is not interpreted because an oblique solution has been requested.

Table 5.  Rotated Factor Pattern (VARIMAX)
        FACTOR1 FACTOR2 FACTOR3 FACTOR4
INFO    0.71862	0.29392 0.12616 0.17630 INFORMATION
SIM     0.72023	0.29506 0.12237 0.16230 SIMILARITY
ARITH   0.49726	0.30656	0.23918 0.38771	ARITHMETIC
VOC     0.77718	0.23819 0.17933 0.11727 VOCABULARY
COMP    0.65565	0.19763 0.21399 0.08092 COMPREHENSION
DIGIT   0.29024 0.16907 0.20796 0.34843 DIGIT SPAN
PICTCOM 0.37579	0.53504	0.10572 0.07124 PICTURE COMPLETION
CODING  0.12040 0.14820 0.59510	0.08546 CODING
PICTARG 0.33269	0.37653	0.28170 0.00121 PICTURE ARRANGEMENT
BLOCK   0.32270	0.64662	0.21651 0.21154 BLOCK DESIGN
OBJECT  0.26569 0.63181	0.17377 0.10766 OBJECT ASSEMBLY
SYMBOL  0.21005 0.32244	0.59566	0.13894 SYMBOL SEARCH
MAZES   0.07226 0.36298	0.15838 0.06487 MAZES

Variance explained by each factor

FACTOR1  FACTOR2  FACTOR3  FACTOR4
2.891010 1.894832 1.110948 0.441675

Table 6 shown below is the factor structure matrix after the oblique PROMAX rotation, which allows the latent factors to be correlated with each other. The matrix of inter-factor correlations (Table 7) shows that the factors are substantially correlated with each other; the inter-factor correlations range between 0.44 and 0.65. If we submitted these intercorrelated factors to a new factor analysis, we might be able to obtain a single second-order factor, which could correspond to the general intelligence or g factor in previous research. One downside of an oblique rotation method is that if the correlations among the factors are substantial, then it is sometimes difficult to distinguish among factors by examining the factor loadings. In such situations, you should investigate the factor pattern matrix, which is a matrix of the standardized coefficients for the regression of the observed variables on the factors.

Table 6.  Factor Structure (Correlations)
        FACTOR1 FACTOR2 FACTOR3 FACTOR4
INFO    0.80153 0.56064 0.33700 0.52105 INFORMATION
SIM     0.80059 0.55913 0.33257 0.50906 SIMILARITY
ARITH   0.65384 0.55813 0.42927 0.65702 ARITHMETIC
VOC     0.84027 0.53362 0.37803 0.48942 VOCABULARY
COMP    0.71732 0.45943 0.37569 0.41350 COMPREHENSION
DIGIT   0.40958 0.35214 0.32514 0.50255 DIGIT SPAN
PICTCOM 0.53937 0.64229 0.30602 0.37733 PICTURE COMPLETION
CODING  0.28294 0.32896 0.63030 0.31811 CODING
PICTARG 0.47527 0.51677 0.41891 0.30366 PICTURE ARRANGEMENT
BLOCK   0.56601 0.77315 0.44326 0.54029 BLOCK DESIGN
OBJECT  0.48561 0.71459 0.37858 0.41641 OBJECT ASSEMBLY
SYMBOL  0.42630 0.52381 0.69512 0.44612 SYMBOL SEARCH
MAZES   0.21660 0.39830 0.25905 0.22942 MAZES
Table 7.  Inter-factor Correlations
        FACTOR1 FACTOR2 FACTOR3 FACTOR4
FACTOR1 1.00000 0.64770 0.43503 0.58664
FACTOR2 0.64770 1.00000 0.52336 0.57564
FACTOR3 0.43503 0.52336 1.00000 0.47436
FACTOR4 0.58664 0.57564 0.47436 1.00000

Table 8 is the factor pattern matrix, which will be used to interpret the meaning of the factors. The values in this matrix are the standardized regression coefficients, which are functionally related to the part or semipartial correlation between a variable and the factor when other factors are held constant. Therefore, a value in this matrix represents the individual and nonredundant contribution that each factor is making to predict a subtest. The regression coefficients greater than 0.30 are underlined to assist the interpretation.

Table 8.  Rotated Factor Pattern (Standardized Regression
Coefficients)
        FACTOR1 FACTOR2 FACTOR3 FACTOR4
INFO    0.73663  0.06911 -0.0553   0.07540 INFORMATION
SIM     0.74378  0.07445 -0.05694  0.05688 SIMILARITY
ARITH   0.35704  0.08393  0.05243  0.37438 ARITHMETIC
VOC     0.85010 -0.02674  0.02492 -0.00572 VOCABULARY
COMP    0.71870 -0.0391   0.09895 -0.0325 COMPREHENSION
DIGIT   0.16057 -0.01159  0.08321  0.37555 DIGIT SPAN
PICTCOM 0.24101  0.54702 -0.06151 -0.04977 PICTURE COMPLETION
CODING  0.00651 -0.01816  0.62315  0.02916 CODING
PICTARG 0.25467  0.31837  0.20034 -0.12403 PICTURE ARRANGEMENT
BLOCK   0.06661  0.65410  0.01652  0.11685 BLOCK DESIGN
OBJECT  0.04111  0.69028  0.00237 -0.00618 OBJECT ASSEMBLY
SYMBOL  0.03508  0.17311  0.56088  0.05983 SYMBOL SEARCH
MAZES   0.08719  0.40886  0.07943  0.00754 MAZES

The subtests significantly loaded on the first factor are the Information, Similarities, Arithmetic, Vocabulary, and Comprehension subtests. These are the subtests that are orally presented and require verbal responses. Therefore, this factor may be named "Verbal Comprehension". The second factor is identified by the following subtests: Picture Completion, Picture Arrangement, Block Design, and Object Assembly. All of these subtests have a geometric or configural component: they measure skills that require the manual manipulation or organization of pictures, objects, blocks, and the like. Therefore, this factor may be named "Perceptual Organization." The two subtests loaded on the third factor are Coding and Symbol Search. Both subtests basically measure the speed of simple coding or searching processes. Therefore, this factor can be named "Processing Speed." Finally, the Arithmetic and Digit Span subtests identify the fourth factor. Both subtests deal with arithmetic problems or numbers, so this factor can be named "Numerical Ability." The last two factors are doublets since they are identified by only two subtests each. Therefore, they are conceptually weak compared to the first two factors, and more subtests may need to be added to them to make them conceptually sound.

It is possible to estimate the factor scores, or a subject's relative standing on each of the factors, if the original subject-by-variable raw data matrix is available. To compute the factor scores for all subjects on all factors, use the following SAS code:

PROC FACTOR DATA=raw {other options here} OUTSTAT=fact;  /* write the factor results to the data set FACT */
PROC SCORE DATA=raw SCORE=fact OUT=scores;               /* compute a factor score for each subject       */
RUN;

where raw is the original data matrix, fact is the output data set containing the factor scoring coefficients (the SCORE option typically needs to be included among the PROC FACTOR options so that the scoring coefficients are written to the OUTSTAT= data set), and scores is the matrix of factor scores for the subjects.

Footnotes

  1. Guttman, L. (1953) "Image Theory for the Structure of Quantitative Variables", Psychometrika, 18, 277-296.
  2. Kaiser, H.F., and Rice, J. (1974) "Little Jiffy, Mark IV", Educational and Psychological Measurement, 34, 111-117.
  3. Loehlin, J.C. (1992) Latent Variable Models. Erlbaum Associates, Hillsdale NJ.
  4. SAS/STAT User's Guide, 1990, SAS Institute Inc., p. 785.
  5. Manual for the Wechsler Intelligence Scale for Children (WISC-III), New York, 1991.
