To begin, we will load a Stata 11 dataset fa_missing, get some descriptive statistics and compute the complete case covariance matrix.
use http://www.ats.ucla.edu/stat/data/fa_missing, clear /* Stata 11 data file */
summarize
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
item13 | 1419 4.450317 .7374944 1 5
item14 | 1428 4.518207 .7086049 1 5
item15 | 1424 4.434691 .7478835 1 5
item16 | 1420 4.270423 .8387034 1 5
item17 | 1423 4.158819 .8969815 1 5
-------------+--------------------------------------------------------
item18 | 1424 3.924157 1.032095 1 5
item19 | 1420 4.072535 .9665034 1 5
item20 | 1396 3.770774 .9137137 1 5
item21 | 1422 3.769339 .9863042 1 5
item22 | 1414 3.592645 1.122807 1 5
-------------+--------------------------------------------------------
item23 | 1423 3.800422 .9639492 1 5
item24 | 1417 3.653493 .9308223 1 5
item25 | 1398 2.285408 .9892487 1 5
item26 | 1414 2.077086 1.058313 1 5
item27 | 1420 1.496479 .7294192 1 5
-------------+--------------------------------------------------------
item28 | 1419 2.273432 .9677116 1 5
count /* count total number of observations */
1428
corr, cov /* complete case covariance matrix */
(obs=1331)
| item13 item14 item15 item16 item17 item18 item19 item20 item21 item22
-------------+------------------------------------------------------------------------------------------
item13 | .536077
item14 | .338379 .48707
item15 | .321189 .321527 .536102
item16 | .345401 .289728 .305775 .681157
item17 | .376869 .34434 .382685 .435569 .799719
item18 | .310446 .313639 .348313 .350631 .52091 1.07943
item19 | .202336 .211924 .250935 .262903 .387361 .6311 .935065
item20 | .202417 .199169 .234828 .235136 .33403 .490213 .39099 .82419
item21 | .340947 .302693 .364249 .369725 .525901 .565762 .472584 .377907 .966934
item22 | .271303 .257165 .30236 .328247 .450612 .625471 .521898 .389491 .553577 1.24822
item23 | .39691 .374938 .405833 .35907 .526654 .567222 .406462 .352182 .558713 .52722
item24 | .305477 .28148 .290924 .325099 .432752 .457068 .332405 .30078 .449396 .456496
item25 | .008449 -.011696 -.038745 -.030012 -.039108 -.042074 -.064693 -.026436 -.025697 -.047378
item26 | .014954 -.024045 -.002687 -.019264 -.021647 -.018859 .018107 -.026555 .002384 -.019735
item27 | -.036163 -.045486 -.046055 -.065249 -.055178 -.070832 -.053228 -.036927 -.062904 -.099815
item28 | -.000554 -.013315 -.033624 -.048267 -.028426 -.051824 -.016597 -.044399 -.031681 -.07906
| item23 item24 item25 item26 item27 item28
-------------+------------------------------------------------------
item23 | .913566
item24 | .618358 .848286
item25 | -.031721 -.043576 .976103
item26 | .014638 -.025494 .10275 1.10263
item27 | -.059988 -.063666 .123452 .170048 .51931
item28 | -.004233 -.049099 .23827 .210952 .353081 .941695
From the output above, you can see that there are a total of 1,428 observations, of which 1,331 are complete
cases (the obs=1331 reported by corr). All of the variables have missing values except for item14;
item20 has the most missing data, with only 1,396 nonmissing cases.
We will use the mlong format for mi set, but this approach will work with any of the mi data formats. When you register the variables to be imputed (mi register imputed), you should also include the variables without missing values, such as item14, so that they will be included in the EM covariance matrix. Next, run the mi impute mvn command with the emonly option. Notice that there are no variables to the right of the equal sign; in fact, there is no equal sign at all. After running mi impute, the EM covariance matrix is available in the saved results as r(Sigma_em), which we then save to the matrix cov_em for use in factormat.
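As an optional check before the data are mi set, the missing-data counts described above can be tabulated directly. The following is only a minimal sketch: it assumes fa_missing is loaded and not yet mi set, and nmiss is a hypothetical helper variable that is not part of the dataset.

```stata
misstable summarize item13-item28      /* missing-value counts for variables that have any */
egen nmiss = rowmiss(item13-item28)    /* nmiss (hypothetical name): number of missing items per observation */
count if nmiss == 0                    /* number of complete cases */
drop nmiss                             /* remove the helper variable before mi set */
```

The count of complete cases should agree with the (obs=...) line that corr, cov reports, since corr uses only observations with no missing values on any of the listed variables.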
mi set mlong
mi register imputed item13-item28
(97 m=0 obs. now marked as incomplete)
mi impute mvn item13-item28, emonly
note: variable item14 contains no soft missing (.) values; imputing nothing
Iteration 0: Observed log likelihood = -9021.7844
Iteration 1: Observed log likelihood = -4116.7934
Iteration 2: Observed log likelihood = -4113.8728
Iteration 3: Observed log likelihood = -4113.8685
Iteration 4: Observed log likelihood = -4113.8685
Iteration 5: Observed log likelihood = -4113.8685
Expectation-maximization estimation Number obs = 1428
Number missing = 167
Number patterns = 34
Prior: uniform Obs per pattern: min = 1
avg = 42
max = 1331
Observed log likelihood = -4113.8685 at iteration 5
------------------------------------------------------------------------------------------------------
| item13 item14 item15 item16 item17 item18 item19 item20
-------------+----------------------------------------------------------------------------------------
Coef |
_cons | 4.451285 4.518207 4.435308 4.268804 4.156375 3.922213 4.070152 3.767296
-------------+----------------------------------------------------------------------------------------
Sigma |
item13 | .5430297 .348556 .3350714 .3450292 .3822566 .3104201 .2059764 .2103791
item14 | .348556 .5017693 .3455828 .3003632 .355927 .3212322 .2256632 .2100503
item15 | .3350714 .3455828 .5584064 .3186633 .400073 .3582909 .2667201 .2515707
item16 | .3450292 .3003632 .3186633 .705225 .4391928 .3453138 .2643872 .2404926
item17 | .3822566 .355927 .400073 .4391928 .8085157 .5141493 .3895897 .3465985
item18 | .3104201 .3212322 .3582909 .3453138 .5141493 1.068469 .6312669 .4935741
item19 | .2059764 .2256632 .2667201 .2643872 .3895897 .6312669 .9371905 .4000026
item20 | .2103791 .2100503 .2515707 .2404926 .3465985 .4935741 .4000026 .8360225
item21 | .3460404 .3144694 .3764157 .3618431 .5214286 .5644802 .4791261 .3844396
item22 | .2817349 .2724658 .3191765 .3361343 .4594059 .6214813 .5312196 .3984811
item23 | .4075676 .3925719 .4292744 .3747293 .5356915 .567577 .4185394 .3681236
item24 | .3165909 .3003985 .3143 .334972 .4410231 .4564384 .3410914 .3120366
item25 | -.004998 -.0312026 -.0558893 -.0478347 -.049516 -.0489516 -.0731514 -.0307672
item26 | .0131501 -.0212114 .0033069 -.0091686 -.0192239 -.0085362 .0180015 -.0241016
item27 | -.0416338 -.054581 -.0551466 -.0723354 -.0615237 -.0827423 -.0596916 -.0426282
item28 | -.0053221 -.0246628 -.0425885 -.0575029 -.0363375 -.0585845 -.0250774 -.0471011
------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------
| item21 item22 item23 item24 item25 item26 item27 item28
-------------+----------------------------------------------------------------------------------------
Coef |
_cons | 3.770048 3.593149 3.79814 3.655047 2.285293 2.077139 1.49686 2.273121
-------------+----------------------------------------------------------------------------------------
Sigma |
item13 | .3460404 .2817349 .4075676 .3165909 -.004998 .0131501 -.0416338 -.0053221
item14 | .3144694 .2724658 .3925719 .3003985 -.0312026 -.0212114 -.054581 -.0246628
item15 | .3764157 .3191765 .4292744 .3143 -.0558893 .0033069 -.0551466 -.0425885
item16 | .3618431 .3361343 .3747293 .334972 -.0478347 -.0091686 -.0723354 -.0575029
item17 | .5214286 .4594059 .5356915 .4410231 -.049516 -.0192239 -.0615237 -.0363375
item18 | .5644802 .6214813 .567577 .4564384 -.0489516 -.0085362 -.0827423 -.0585845
item19 | .4791261 .5312196 .4185394 .3410914 -.0731514 .0180015 -.0596916 -.0250774
item20 | .3844396 .3984811 .3681236 .3120366 -.0307672 -.0241016 -.0426282 -.0471011
item21 | .970728 .5586832 .5705437 .4616899 -.0378778 .0058816 -.0728921 -.0373168
item22 | .5586832 1.261583 .5445269 .4737303 -.0510983 -.0252185 -.0972336 -.0784298
item23 | .5705437 .5445269 .9342235 .6368121 -.0503701 .0146983 -.066706 -.009803
item24 | .4616899 .4737303 .6368121 .8657582 -.0596746 -.0287266 -.0696632 -.0554101
item25 | -.0378778 -.0510983 -.0503701 -.0596746 .9778334 .0906248 .1282772 .2393721
item26 | .0058816 -.0252185 .0146983 -.0287266 .0906248 1.118956 .1671461 .196613
item27 | -.0728921 -.0972336 -.066706 -.0696632 .1282772 .1671461 .5314816 .3568273
item28 | -.0373168 -.0784298 -.009803 -.0554101 .2393721 .196613 .3568273 .9353024
------------------------------------------------------------------------------------------------------
Note: no imputation performed.
matrix cov_em = r(Sigma_em)
matrix list cov_em
symmetric cov_em[16,16]
item13 item14 item15 item16 item17 item18 item19 item20
item13 .54302971
item14 .348556 .50176934
item15 .33507137 .34558277 .55840641
item16 .34502917 .3003632 .31866333 .705225
item17 .38225661 .35592696 .40007302 .43919281 .80851571
item18 .31042005 .32123222 .35829089 .34531382 .51414933 1.0684692
item19 .20597637 .22566319 .26672014 .26438724 .38958966 .63126692 .93719047
item20 .21037912 .21005025 .25157069 .24049261 .34659848 .49357406 .40000261 .83602251
item21 .34604038 .31446937 .37641575 .3618431 .52142858 .56448016 .47912613 .38443958
item22 .28173489 .27246583 .31917653 .33613426 .45940591 .62148129 .53121959 .39848113
item23 .40756762 .39257192 .42927439 .37472927 .53569152 .56757702 .41853944 .36812361
item24 .3165909 .30039846 .3143 .33497204 .4410231 .45643844 .34109143 .31203658
item25 -.00499796 -.03120263 -.05588928 -.04783472 -.04951603 -.04895159 -.07315143 -.03076718
item26 .01315008 -.02121138 .00330686 -.0091686 -.0192239 -.00853618 .01800149 -.02410163
item27 -.04163384 -.054581 -.05514657 -.07233544 -.06152372 -.08274233 -.05969162 -.04262815
item28 -.00532215 -.02466275 -.0425885 -.05750295 -.03633748 -.05858451 -.02507744 -.04710109
item21 item22 item23 item24 item25 item26 item27 item28
item21 .97072797
item22 .55868325 1.2615831
item23 .57054369 .54452686 .93422349
item24 .46168988 .4737303 .63681207 .86575819
item25 -.03787779 -.05109826 -.05037005 -.05967465 .97783345
item26 .00588162 -.02521849 .01469832 -.02872665 .09062481 1.1189565
item27 -.07289205 -.09723358 -.06670603 -.0696632 .1282772 .16714608 .53148159
item28 -.03731678 -.07842976 -.00980302 -.05541012 .23937206 .19661302 .35682735 .93530244
We will use the factormat command with the EM estimate of the covariance matrix to obtain
our factor solution.
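The header of the factormat output reads "Factor analysis/correlation", which suggests the solution is reported on the correlation scale. To inspect the correlation matrix implied by the EM covariance matrix, a sketch such as the following may be useful; it assumes cov_em has been created as above, and corr_em is a hypothetical matrix name introduced only for illustration.

```stata
matrix corr_em = corr(cov_em)          /* corr() converts a covariance matrix to a correlation matrix */
matrix list corr_em, format(%6.3f)     /* display with three decimal places */
```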
The factormat command requires that the sample size, n, be entered along with the
name of the covariance matrix. In her paper, Truxillo (2005) discusses three methods for specifying the nominal
sample size: 1) column-wise minimum, 2) column-wise average and 3) pairwise minimum.
The column-wise minimum is just the number of nonmissing cases for the variable with the most missing
values, which is the value we will use for this example. Recall from above that this value is
1,396 (the number of nonmissing cases for item20).
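The three nominal sample sizes can be computed from the data rather than read off the summarize table. This is a hedged sketch, not taken from Truxillo's paper: it reads "column-wise" as the per-variable count of nonmissing values and "pairwise" as the count of observations nonmissing on both variables of a pair. It also assumes no imputations have been added to the data (emonly adds none). The local macro names nmin, nsum, k and pmin are hypothetical.

```stata
local nmin = .
local nsum = 0
local k = 0
foreach v of varlist item13-item28 {
    quietly count if !missing(`v')          /* nonmissing cases for this variable */
    local nmin = min(`nmin', r(N))          /* min() ignores the initial missing value */
    local nsum = `nsum' + r(N)
    local ++k
}
local pmin = .
foreach v of varlist item13-item28 {
    foreach w of varlist item13-item28 {
        quietly count if !missing(`v', `w') /* cases observed on both variables */
        local pmin = min(`pmin', r(N))
    }
}
display "column-wise minimum = " `nmin'
display "column-wise average = " `nsum'/`k'
display "pairwise minimum    = " `pmin'
```

Under these assumptions the column-wise minimum should match the smallest Obs value in the summarize output. Whichever value is chosen is then passed to the n() option of factormat.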
factormat cov_em, n(1396) fact(4) ml
(obs=1396)
Iteration 0: log likelihood = -236.78484
Iteration 1: log likelihood = -85.766521
(...omitted...)
Iteration 90: log likelihood = -85.345691
Iteration 91: log likelihood = -85.345691
Factor analysis/correlation Number of obs = 1396
Method: maximum likelihood Retained factors = 4
Rotation: (unrotated) Number of params = 58
Schwarz's BIC = 590.691
Log likelihood = -85.34569 (Akaike's) AIC = 286.691
--------------------------------------------------------------------------
Factor | Eigenvalue Difference Proportion Cumulative
-------------+------------------------------------------------------------
Factor1 | 5.83062 4.66732 0.7030 0.7030
Factor2 | 1.16329 0.34933 0.1403 0.8432
Factor3 | 0.81396 0.32778 0.0981 0.9414
Factor4 | 0.48619 . 0.0586 1.0000
--------------------------------------------------------------------------
LR test: independent vs. saturated: chi2(120) = 9652.01 Prob>chi2 = 0.0000
LR test: 4 factors vs. saturated: chi2(62) = 169.61 Prob>chi2 = 0.0000
Factor loadings (pattern matrix) and unique variances
---------------------------------------------------------------------
Variable | Factor1 Factor2 Factor3 Factor4 | Uniqueness
-------------+----------------------------------------+--------------
item13 | 0.7069 0.0779 -0.3640 0.1850 | 0.3274
item14 | 0.7087 0.0258 -0.3268 0.1776 | 0.3588
item15 | 0.7260 -0.0043 -0.2333 0.1634 | 0.3919
item16 | 0.6158 -0.0577 -0.2016 0.2151 | 0.5306
item17 | 0.7618 -0.0252 -0.0548 0.1732 | 0.3860
item18 | 0.7099 -0.1273 0.3400 0.1912 | 0.3276
item19 | 0.5856 -0.1310 0.4220 0.2273 | 0.4102
item20 | 0.5331 -0.0986 0.2382 0.1632 | 0.6227
item21 | 0.7143 -0.0444 0.1319 0.0987 | 0.4606
item22 | 0.6024 -0.1166 0.2541 0.0639 | 0.5549
item23 | 0.8857 0.1157 0.0221 -0.2968 | 0.1134
item24 | 0.7246 0.0169 0.0385 -0.2133 | 0.4276
item25 | -0.0717 0.2901 0.0348 0.0744 | 0.9039
item26 | -0.0061 0.2680 0.0632 0.0585 | 0.9207
item27 | -0.1399 0.6198 0.1288 0.1428 | 0.5593
item28 | -0.0586 0.7349 0.1431 0.1606 | 0.4102
---------------------------------------------------------------------
rotate, varimax normalize blanks(.3)
Factor analysis/correlation Number of obs = 1396
Method: maximum likelihood Retained factors = 4
Rotation: orthogonal varimax (Kaiser on) Number of params = 58
Schwarz's BIC = 590.691
Log likelihood = -85.34569 (Akaike's) AIC = 286.691
--------------------------------------------------------------------------
Factor | Variance Difference Proportion Cumulative
-------------+------------------------------------------------------------
Factor1 | 3.25325 0.32888 0.3922 0.3922
Factor2 | 2.92437 1.70143 0.3526 0.7448
Factor3 | 1.22294 0.32944 0.1474 0.8923
Factor4 | 0.89350 . 0.1077 1.0000
--------------------------------------------------------------------------
LR test: independent vs. saturated: chi2(120) = 9652.01 Prob>chi2 = 0.0000
LR test: 4 factors vs. saturated: chi2(62) = 169.61 Prob>chi2 = 0.0000
Rotated factor loadings (pattern matrix) and unique variances
---------------------------------------------------------------------
    Variable |  Factor1   Factor2   Factor3   Factor4 |   Uniqueness
-------------+----------------------------------------+--------------
      item13 |   0.7843                               |      0.3274
      item14 |   0.7534                               |      0.3588
      item15 |   0.6961    0.3120                     |      0.3919
      item16 |   0.6114                               |      0.5306
      item17 |   0.6040    0.4695                     |      0.3860
      item18 |   0.3043    0.7487                     |      0.3276
      item19 |             0.7454                     |      0.4102
      item20 |             0.5561                     |      0.6227
      item21 |   0.4237    0.5581                     |      0.4606
      item22 |             0.5829                     |      0.5549
      item23 |   0.5049    0.4350              0.6651 |      0.1134
      item24 |   0.4010    0.3935              0.5019 |      0.4276
      item25 |                       0.3044           |      0.9039
      item26 |                                        |      0.9207
      item27 |                       0.6562           |      0.5593
      item28 |                       0.7676           |      0.4102
---------------------------------------------------------------------
(blanks represent abs(loading)<.3)
Factor rotation matrix
--------------------------------------------------
| Factor1 Factor2 Factor3 Factor4
-------------+------------------------------------
Factor1 | 0.6778 0.5954 -0.0605 0.4270
Factor2 | 0.1006 -0.1816 0.9512 0.2282
Factor3 | -0.6581 0.7251 0.1934 0.0609
Factor4 | 0.3120 0.2945 0.2326 -0.8728
--------------------------------------------------
Almost identical results were obtained using SAS proc mi with proc factor and
using Mplus with the missing data option.

Truxillo, C. 2005. Maximum likelihood parameter estimation with incomplete data. Proceedings of the Thirtieth Annual SAS(r) Users Group International Conference. <http://www2.sas.com/proceedings/sugi30/111-30.pdf>
These pages are Copyrighted (c) by UCLA Academic Technology Services