
Stata Textbook Examples
Regression with Graphics by Lawrence Hamilton
Chapter 8: Principal Components and Factor Analysis

Table 8.1, page 253.
use http://www.ats.ucla.edu/stat/stata/examples/rwg/basins, clear
(Hicks et al. (1990))

gen logro=log10(runoff)
gen logpre=log10(precip)
gen logglac=log10(glacier+1)
gen logarea=log10(area)
egen zlogro=std(logro)
egen zlogpre=std(logpre)
egen zlogglac=std(logglac)
egen zlogarea=std(logarea)
list basin zlogro zlogpre zlogglac zlogarea

                   basin     zlogro    zlogpre   zlogglac   zlogarea 
  1.               Ivory    1.30585   1.253279   1.974755   -1.74394  
  2.               Cropp   1.321216   1.467827   .1666327  -1.612833  
  3. Upper Waitangitoana   .8610425   1.030452  -.6078176  -.2809038  
  4.            Hokitika   1.093791   1.374215  -.1191939   .3364316  
  5.               Haast   .7470333   .8726353  -.1191939   .7685862  
  6. Little Hopwood Burn  -.5045639  -.6635709  -.6078176  -.8944885  
  7.            Shotover  -.7667871  -1.033303  -.1191939   .7948009  
  8.               Arrow  -1.598691  -1.542749  -.6078176   .1047718  
  9.         Manuherikia  -2.604667  -1.925677  -.6078176    .356689  
 10.             Karamea   .1025019  -.0490354  -.6078176   .8208289  
 11.            Buller A   -.373943  -.4820166  -.6078176   1.376861  
 12.            Buller B  -.1806241  -.3731881  -.6078176   1.511363  
 13.         Inangahua A  -.3277019  -.4265141  -.6078176    .170581  
 14.         Inangahua B  -.0883481  -.1786223  -.6078176   .7605425  
 15.                Grey  -.0657941  -.1786223  -.1191939   .5805332  
 16.      Butchers Creek  -.0807653  -.2247163  -.6078176   -1.48221  
 17.             Cleddau   .8228721    .973395  -.1191939   .0032735  
 18.              Hooker    .812043   .8726353   2.135663  -.1627341  
 19.     Tsidjiore Nouve  -.4744647  -.7664243   2.397095  -1.408153  
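The z-prefixed variables listed above come from egen's std() function, which subtracts the mean and divides by the standard deviation. The following is a minimal sketch of that calculation, not part of the book's example. It assumes the auto dataset that ships with Stata as a stand-in for the basins data, and the names zmpg, zmpg_byhand and gap are hypothetical.

```stata
* Stand-in data (an assumption): the auto dataset shipped with Stata,
* not the basins data used on this page.
sysuse auto, clear

* Standardize with egen, then by hand from the mean and standard deviation.
egen zmpg = std(mpg)
quietly summarize mpg
generate double zmpg_byhand = (mpg - r(mean)) / r(sd)

* The egen result should have mean 0 and standard deviation 1.
summarize zmpg

* The two versions should agree up to float precision.
generate double gap = abs(zmpg - zmpg_byhand)
summarize gap
```

Because every standardized variable has variance 1, the four variables analyzed in this chapter's first example have a total variance of 4.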
Figure 8.2, page 254.
graph twoway scatter zlogro zlogpre, xlabel(0) ylabel(0) xline(0) yline(0)
Table 8.2, page 254.
corr zlogro zlogpre zlogglac zlogarea
(obs=19)
             |   zlogro  zlogpre zlogglac zlogarea
-------------+------------------------------------
      zlogro |   1.0000
     zlogpre |   0.9738   1.0000
    zlogglac |   0.3385   0.3025   1.0000
    zlogarea |  -0.2872  -0.2829  -0.5121   1.0000
Figure 8.3, page 254.
graph matrix zlogro zlogpre zlogglac zlogarea,  half 
Table 8.3, page 255.
factor zlogro zlogpre zlogglac zlogarea, pcf
(obs=19)
            (principal component factors; 2 factors retained)
  Factor     Eigenvalue     Difference    Proportion    Cumulative
------------------------------------------------------------------
     1        2.39152         1.29575      0.5979         0.5979
     2        1.09578         0.60839      0.2739         0.8718
     3        0.48739         0.46207      0.1218         0.9937
     4        0.02532               .      0.0063         1.0000

               Factor Loadings
    Variable |      1          2    Uniqueness
-------------+--------------------------------
      zlogro |   0.90586    0.40821    0.01278
     zlogpre |   0.89510    0.43040    0.01356
    zlogglac |   0.63697   -0.58522    0.25178
    zlogarea |  -0.60333    0.63357    0.23458
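
The Proportion and Uniqueness columns above can be checked by hand. This is a small arithmetic sketch using only numbers copied from the output: with four standardized variables the total variance is 4, so each proportion is the eigenvalue divided by 4, and each uniqueness is 1 minus the sum of that variable's squared loadings on the retained factors.

```stata
* Share of total variance for the two retained factors: eigenvalue / 4.
display %6.4f 2.39152/4
display %6.4f 1.09578/4

* Uniqueness of zlogro: 1 minus the sum of its squared loadings.
display %7.5f 1 - (0.90586^2 + 0.40821^2)
```

The three results should agree with the table entries 0.5979, 0.2739 and 0.01278.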
Figure 8.4, page 258.
greigen, yline(1)
Table 8.4, page 259.
rotate

            (varimax rotation)
               Rotated Factor Loadings
    Variable |      1          2    Uniqueness
-------------+--------------------------------
      zlogro |   0.16443    0.97989    0.01278
     zlogpre |   0.14001    0.98328    0.01356
    zlogglac |   0.84060    0.20399    0.25178
    zlogarea |  -0.86208   -0.14915    0.23458
Table 8.5, page 262.  Obliquely rotated loadings for mountain basin factors (compare with Tables 8.3 and 8.4).
rotate, promax

            (promax rotation)
               Rotated Factor Loadings
    Variable |      1          2    Uniqueness
-------------+--------------------------------
      zlogro |   0.01768    0.98744    0.01278
     zlogpre |  -0.00854    0.99607    0.01356
    zlogglac |   0.85152    0.03752    0.25178
    zlogarea |  -0.88279    0.02413    0.23458
Table 8.6, page 264.
score

            (based on rotated factors)
            
               Scoring Coefficients
    Variable |      1          2
-------------+---------------------
      zlogro |   0.00522    0.50140
     zlogpre |  -0.01226    0.50595
    zlogglac |   0.56570    0.01343
    zlogarea |  -0.58689    0.01810
Table 8.7, page 265.
score f1 f2

            (based on rotated factors)
               Scoring Coefficients
    Variable |      1          2
-------------+---------------------
      zlogro |   0.00522    0.50140
     zlogpre |  -0.01226    0.50595
    zlogglac |   0.56570    0.01343
    zlogarea |  -0.58689    0.01810
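
The score and greigen commands used on this page belong to older Stata releases. From Stata 9 on, factor scores are created with predict and scree plots with screeplot. The following is a rough sketch of the same sequence in the newer syntax, not the book's code. It assumes the auto dataset shipped with Stata as stand-in data, and the variable choice is arbitrary.

```stata
* Stand-in data (an assumption): the auto dataset shipped with Stata.
sysuse auto, clear

* Principal-component factoring, forcing two factors to be retained.
factor price mpg weight length, pcf factors(2)

* Scree plot with a reference line at eigenvalue 1 (replaces greigen).
screeplot, yline(1)

* Default orthogonal varimax rotation, then factor scores (replaces score).
rotate
predict f1 f2
```

After rotate, predict bases the scores on the rotated loadings, which corresponds to the "(based on rotated factors)" note in the output above.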
Table 8.8, page 266.
gen logsed=log10(yield)
regress logsed zlogro zlogpre zlogglac zlogarea

      Source |       SS       df       MS              Number of obs =      19
-------------+------------------------------           F(  4,    14) =   13.77
       Model |   10.265554     4   2.5663885           Prob > F      =  0.0001
    Residual |  2.60862507    14  .186330362           R-squared     =  0.7974
-------------+------------------------------           Adj R-squared =  0.7395
       Total |  12.8741791    18  .715232171           Root MSE      =  .43166

------------------------------------------------------------------------------
      logsed |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      zlogro |  -.1151779   .4575377    -0.25   0.805    -1.096499    .8661428
     zlogpre |   .7751374   .4527634     1.71   0.109    -.1959435    1.746218
    zlogglac |   .1315507   .1231924     1.07   0.304    -.1326706    .3957721
    zlogarea |  -.1008783   .1200595    -0.84   0.415    -.3583802    .1566236
       _cons |   3.200158   .0990296    32.32   0.000      2.98776    3.412555
------------------------------------------------------------------------------

regress logsed f1 f2

      Source |       SS       df       MS              Number of obs =      19
-------------+------------------------------           F(  2,    16) =   28.91
       Model |  10.0835282     2  5.04176408           Prob > F      =  0.0000
    Residual |  2.79065092    16  .174415683           R-squared     =  0.7832
-------------+------------------------------           Adj R-squared =  0.7561
       Total |  12.8741791    18  .715232171           Root MSE      =  .41763

------------------------------------------------------------------------------
      logsed |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          f1 |    .192415   .1046676     1.84   0.085    -.0294705    .4143005
          f2 |   .6608587   .1046676     6.31   0.000     .4389732    .8827442
       _cons |   3.200158   .0958111    33.40   0.000     2.997047    3.403268
------------------------------------------------------------------------------
Figure 8.9, page 268.
graph twoway scatter f2 f1, ///
	mlabel(location) msymbol(i) xlabel(-1(1)2) ylabel(-2(1)2) yline(0) xline(0)
Table 8.9, page 268.
use http://www.ats.ucla.edu/stat/stata/examples/rwg/planets, clear
(Beatty et al. (1981))

list planet dsun radius masskg density moons rings, nodis

       planet       dsun     radius     masskg    density     moons     rings 
  1.  Mercury       57.9       2439   3.30e+23       5.42         0      none  
  2.    Venus      108.2       6050   4.87e+24       5.25         0      none  
  3.    Earth      149.6       6378   5.98e+24       5.52         1      none  
  4.     Mars      227.9       3398   6.42e+23       3.94         2      none  
  5.  Jupiter      778.3      71900   1.90e+27      1.314        16     rings  
  6.   Saturn       1427      60000   5.69e+26        .69        17     rings  
  7.   Uranus     2869.6      26145   8.66e+25       1.19        15     rings  
  8.  Neptune     4496.6      24750   1.03e+26       1.66         8     rings  
  9.    Pluto       5900       1550   1.10e+22        1.2         1      none  
Figure 8.10, page 269.
gen logdsun=log(dsun)
gen lograd=log(radius)
gen logmass=log(masskg)
gen logden=log(density)
gen logmoon=log(moons+1)
graph matrix logdsun lograd logmass logden logmoon rings, half 
Figure 8.11, page 270.
factor logdsun lograd logmass logden logmoon rings, pcf factor(2)
(obs=9)
            (principal component factors; 2 factors retained)
  Factor     Eigenvalue     Difference    Proportion    Cumulative
------------------------------------------------------------------
     1        4.62365         3.45469      0.7706         0.7706
     2        1.16896         1.05664      0.1948         0.9654
     3        0.11232         0.05395      0.0187         0.9842
     4        0.05837         0.02174      0.0097         0.9939
     5        0.03663         0.03657      0.0061         1.0000
     6        0.00006               .      0.0000         1.0000

               Factor Loadings
    Variable |      1          2    Uniqueness
-------------+--------------------------------
     logdsun |   0.67105   -0.71093    0.04427
      lograd |   0.92287    0.37357    0.00875
     logmass |   0.83377    0.54463    0.00821
      logden |  -0.84511    0.47053    0.06439
     logmoon |   0.97647    0.00028    0.04651
       rings |   0.97917    0.07720    0.03526
       
greigen, yline(1) xlabel(1(1)6) ylabel(0(1)4)
Table 8.10, page 270.  We have skipped this for now.
Figure 8.12, page 271.  We have skipped this for now.
Table 8.12, page 274.
use http://www.ats.ucla.edu/stat/stata/examples/rwg/tulsa, clear
(Blocker & Eckberg (1989))

corr deepwell chandler tornados floods airpol rivpol
(obs=199)
             | deepwell chandler tornados   floods   airpol   rivpol
-------------+------------------------------------------------------
    deepwell |   1.0000
    chandler |   0.4726   1.0000
    tornados |   0.1131   0.2027   1.0000
      floods |   0.0928   0.0480   0.4052   1.0000
      airpol |   0.2805   0.1661   0.1524   0.0712   1.0000
      rivpol |   0.3365   0.2587   0.1007   0.1511   0.3861   1.0000
Figure 8.13, page 274.
graph matrix deepwell chandler tornados floods airpol rivpol, half jitter(5)
NOTE: This graph differs slightly from the one in the book because of the jittering. Jittering adds a small random amount to each plotted value, so the points shift slightly every time the graph is drawn.
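If a jittered graph needs to look the same on every run, the random offsets can be pinned with the jitterseed() option. The following is a minimal sketch, not the book's command. It assumes a Stata release that supports jitterseed() and uses the auto dataset shipped with Stata as stand-in data. The seed value 12345 is arbitrary.

```stata
* Stand-in data (an assumption): the auto dataset shipped with Stata.
sysuse auto, clear

* Same jitter amount as above, but with a fixed seed so the random
* offsets are identical each time the graph is drawn.
graph matrix price mpg weight, half jitter(5) jitterseed(12345)
```

A fixed seed makes the graph repeatable, but it still will not reproduce the book's figure, because the book's random offsets are unknown.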
Figure 8.14, page 275.
factor deepwell chandler tornados floods airpol rivpol
(obs=199)
            (principal factors; 3 factors retained)
  Factor     Eigenvalue     Difference    Proportion    Cumulative
------------------------------------------------------------------
     1        1.35194         0.88877      0.9929         0.9929
     2        0.46317         0.29830      0.3402         1.3331
     3        0.16487         0.28236      0.1211         1.4542
     4       -0.11749         0.07032     -0.0863         1.3679
     5       -0.18781         0.12528     -0.1379         1.2299
     6       -0.31309               .     -0.2299         1.0000

               Factor Loadings
    Variable |      1          2          3    Uniqueness
-------------+-------------------------------------------
    deepwell |   0.59374   -0.20135   -0.11690    0.59327
    chandler |   0.53406   -0.13881   -0.23460    0.64047
    tornados |   0.36576    0.42706   -0.05948    0.68031
      floods |   0.29282    0.45047    0.02034    0.71092
      airpol |   0.46081   -0.08334    0.23495    0.72551
      rivpol |   0.53134   -0.10543    0.19240    0.66954

greigen, yline(0, 1) xlabel(1(1)6) ylabel(0 1) ytick(-.2 0 .2 .4 .6 .8 1 1.2 1.4)
Table 8.13, page 276.
factor deepwell chandler tornados floods airpol rivpol
(obs=199)
            (principal factors; 3 factors retained)
  Factor     Eigenvalue     Difference    Proportion    Cumulative
------------------------------------------------------------------
     1        1.35194         0.88877      0.9929         0.9929
     2        0.46317         0.29830      0.3402         1.3331
     3        0.16487         0.28236      0.1211         1.4542
     4       -0.11749         0.07032     -0.0863         1.3679
     5       -0.18781         0.12528     -0.1379         1.2299
     6       -0.31309               .     -0.2299         1.0000

               Factor Loadings
    Variable |      1          2          3    Uniqueness
-------------+-------------------------------------------
    deepwell |   0.59374   -0.20135   -0.11690    0.59327
    chandler |   0.53406   -0.13881   -0.23460    0.64047
    tornados |   0.36576    0.42706   -0.05948    0.68031
      floods |   0.29282    0.45047    0.02034    0.71092
      airpol |   0.46081   -0.08334    0.23495    0.72551
      rivpol |   0.53134   -0.10543    0.19240    0.66954

rotate, promax
            (promax rotation)
               Rotated Factor Loadings
    Variable |      1          2          3    Uniqueness
-------------+-------------------------------------------
    deepwell |   0.54661   -0.02946    0.14547    0.59327
    chandler |   0.61054    0.03497   -0.03720    0.64047
    tornados |   0.06995    0.54955   -0.02505    0.68031
      floods |  -0.06653    0.54282    0.03733    0.71092
      airpol |   0.04270    0.00781    0.49339    0.72551
      rivpol |   0.13743    0.01013    0.47524    0.66954

score f1 f2 f3
            (based on rotated factors)
               Scoring Coefficients
    Variable |      1          2          3
-------------+--------------------------------
    deepwell |   0.36499    0.04240    0.21836
    chandler |   0.34685    0.06944    0.10044
    tornados |   0.07552    0.38228    0.06128
      floods |   0.01167    0.35768    0.06601
      airpol |   0.11423    0.06009    0.30260
      rivpol |   0.17054    0.07305    0.33196

corr f1 f2 f3
(obs=199)
             |       f1       f2       f3
-------------+---------------------------
          f1 |   1.0000
          f2 |   0.4957   1.0000
          f3 |   0.8732   0.5503   1.0000
Figure 8.16, page 277.
histogram f3 if sex==0, ///
	fraction bin(8) start(-2) xlabel(-2(1)1) ylabel(0 .1 .2) xline(-.166)
histogram f3 if sex==1, ///
	fraction bin(8) start(-2) xlabel(-2(1)1) ylabel(0 .1 .2) xline(.128)
Table 8.15, page 279.
factor taxbabes manykids lessenvt toocons pollburd privown shutdown punish preserve, ml factor(3)
(obs=241)
Iteration 0:   log likelihood = -25.277324
Iteration 1:   log likelihood =  -10.89083
Iteration 2:   log likelihood = -10.463314
Iteration 3:   log likelihood = -10.376172
Iteration 4:   log likelihood = -10.356368
Iteration 5:   log likelihood =  -10.35112
Iteration 6:   log likelihood = -10.349526
Iteration 7:   log likelihood = -10.349001
Iteration 8:   log likelihood = -10.348822
Iteration 9:   log likelihood = -10.348759
Iteration 10:  log likelihood = -10.348737
Iteration 11:  log likelihood =  -10.34873
Iteration 12:  log likelihood = -10.348727
Iteration 13:  log likelihood = -10.348726
Iteration 14:  log likelihood = -10.348726
Iteration 15:  log likelihood = -10.348726
Iteration 16:  log likelihood = -10.348726
Iteration 17:  log likelihood = -10.348726
Iteration 18:  log likelihood = -10.348726
Iteration 19:  log likelihood = -10.348726

            (maximum likelihood factors; 3 factors retained)
  Factor     Variance       Difference    Proportion    Cumulative
------------------------------------------------------------------
     1        1.09150        -0.42946      0.3398         0.3398
     2        1.52096         0.92086      0.4734         0.8132
     3        0.60010               .      0.1868         1.0000

Test:  3 vs. no   factors.  Chi2(  27) =  223.59, Prob > chi2 =  0.0000
Test:  3 vs. more factors.  Chi2(  12) =   20.20, Prob > chi2 =  0.0635

               Factor Loadings
    Variable |      1          2          3    Uniqueness
-------------+-------------------------------------------
    taxbabes |   0.94460    0.00065   -0.01496    0.10743
    manykids |  -0.33953    0.05576    0.15998    0.85602
    lessenvt |   0.13347    0.51309    0.12787    0.70257
     toocons |   0.05995    0.48555    0.42403    0.58085
    pollburd |   0.02056    0.64636    0.17909    0.54973
     privown |   0.01190    0.36004    0.01118    0.87010
    shutdown |   0.00387   -0.49947    0.48407    0.51620
      punish |   0.23087   -0.38692    0.32437    0.69177
    preserve |   0.09313   -0.26878    0.07998    0.91269

rotate, promax
            (promax rotation)
               Rotated Factor Loadings
    Variable |      1          2          3    Uniqueness
-------------+-------------------------------------------
    taxbabes |   0.94363    0.04826    0.00168    0.10743
    manykids |  -0.36004    0.15240    0.12783    0.85602
    lessenvt |   0.12017    0.47899   -0.11160    0.70257
     toocons |   0.00568    0.70413    0.19670    0.58085
    pollburd |   0.00180    0.60911   -0.12521    0.54973
     privown |   0.01370    0.26475   -0.15846    0.87010
    shutdown |  -0.06779    0.05460    0.72057    0.51620
      punish |   0.18163    0.01425    0.51159    0.69177
    preserve |   0.07924   -0.11676    0.20860    0.91269
Figure 8.17, page 280.
graph twoway (scatter nchldn f1, jitter(3)) (lfit nchldn f1), ylabel(0(1)7) xlabel(-3(1)1)
NOTE: Because of the jittering, this graph does not look exactly like the one in the book.

These pages are Copyrighted (c) by UCLA Academic Technology Services

