UCLA Academic Technology Services

Stata Textbook Examples
Applied Linear Statistical Models by Neter, Kutner, et al.
Chapter 23: Multifactor Studies

Inputting Table 23.4a, p. 943.
clear
input exercise gender fat smoking rep
  24.1  1  1  1  1
  29.2  1  1  1  2
  24.6  1  1  1  3
  20.0  2  1  1  1
  21.9  2  1  1  2
  17.6  2  1  1  3
  14.6  1  2  1  1
  15.3  1  2  1  2
  12.3  1  2  1  3
  16.1  2  2  1  1
   9.3  2  2  1  2
  10.8  2  2  1  3
  17.6  1  1  2  1
  18.8  1  1  2  2
  23.2  1  1  2  3
  14.8  2  1  2  1
  10.3  2  1  2  2
  11.3  2  1  2  3
  14.9  1  2  2  1
  20.4  1  2  2  2
  12.8  1  2  2  3
  10.1  2  2  2  1
  14.4  2  2  2  2
   6.1  2  2  2  3
end
label define gnd 1 "male" 2 "female"
label define fat 1 "low fat" 2 "high fat"
label define smk 1 "Light" 2 "heavy"
label values gender gnd
label values fat fat
label values smoking smk
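
Before analyzing the data, it can be worth confirming that they were entered as a balanced 2 x 2 x 2 layout with three replicates per cell. The following is a minimal sketch, assuming the data entered above are in memory; n_cell is a temporary helper variable introduced here for illustration and is not part of the textbook example.

```stata
* Assumes the 24 observations entered above are in memory.
* isid exits with an error unless these variables uniquely identify the rows.
isid gender fat smoking rep

* n_cell is a hypothetical helper: number of observations per treatment cell.
* egen with by() leaves the original order of the observations unchanged.
egen n_cell = count(exercise), by(gender fat smoking)
assert n_cell == 3
drop n_cell
```

If either check fails, Stata stops with an error, which usually points to a typing mistake in the input block.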

Table 23.4b, p. 943. The cell means and the factor means.

table gender smoking, by(fat) contents(mean exercise) row col
table  gender smoking, contents(mean exercise) row col
----------------------------------------
fat and   |           smoking           
gender    |    Light     heavy     Total
----------+-----------------------------
low fat   |
     male | 25.96667  19.86667  22.91667
   female | 19.83333  12.13333  15.98333
          | 
    Total |     22.9        16     19.45
----------+-----------------------------
high fat  |
     male | 14.06667  16.03333     15.05
   female | 12.06667      10.2  11.13333
          | 
    Total | 13.06667  13.11667  13.09167
----------------------------------------
----------------------------------------
          |           smoking           
   gender |    Light     heavy     Total
----------+-----------------------------
     male | 20.01667     17.95  18.98333
   female |    15.95  11.16667  13.55833
          | 
    Total | 17.98333  14.55833  16.27083
----------------------------------------
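
The table syntax above (contents(), row, col) is that of older Stata releases; the command was redesigned in Stata 17. As a cross-check that does not rely on it, the same means can be obtained with tabulate and its summarize() option. This is a sketch, assuming the data entered above are in memory.

```stata
* Assumes the data entered above are in memory.
* Cell means of exercise by gender and smoking, separately for each fat level;
* the Total row and column give the corresponding marginal means.
tabulate gender smoking if fat == 1, summarize(exercise) nostandard nofreq
tabulate gender smoking if fat == 2, summarize(exercise) nostandard nofreq

* Means averaged over both fat levels (the second table above).
tabulate gender smoking, summarize(exercise) nostandard nofreq
```

The nostandard and nofreq options suppress the standard deviations and frequencies so that only the means are shown.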

Figure 23.6a1, p. 944.

Note that the code to generate the plots requires the user-written package "Graphing model diagnostics" by Nicholas J. Cox. The package can be located by typing "findit anovaplot" (without the quotes) in the Stata command window. For further information on using findit to find new programs, see our FAQ How do I use findit to search for programs and additional help?
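
To check whether anovaplot is already available before running the plotting commands, something like the following can be used. It is only a sketch: it reports whether the command is found and does not install anything.

```stata
* which reports where a command is installed; capture suppresses the
* error and stores the return code in _rc if the command cannot be found.
capture which anovaplot
if _rc != 0 {
    display as error "anovaplot is not installed; type: findit anovaplot"
}
```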

anova exercise fat smoking fat*smoking if gender==1
anovaplot, scatter(ms(i)) legend(off) xscale(r(.6 2.4)) text(14 2.2 "C1 Light") text(16 2.2 "C2 Heavy") title(A1 (Male))

Figure 23.6a2, p. 944.

anova exercise fat smoking fat*smoking if gender==2
anovaplot, scatter(ms(i)) legend(off) xscale(r(.6 2.4)) text(12 2.2 "C1 Light") text(10 2.2 "C2 Heavy") title(A2 (Female))

Figure 23.6b1, p. 944.

anova exercise gender fat if smoking==1
anovaplot, scatter(ms(i)) legend(off) xscale(r(.6 2.4)) text(21 2.2 "B1 Low Fat") text(11 2.2 "B2 High Fat") title(C1 (Light Smoking))

Figure 23.6b2, p. 944.

anova exercise gender fat if smoking==2
anovaplot, scatter(ms(i)) legend(off) xscale(r(.6 2.4)) text(12.5 2.2 "B1 Low Fat") text(9.5 2.2 "B2 High Fat") title(C2 (Heavy Smoking))

Figures 23.7 and 23.8, p. 945.
anova exercise gender fat smoking gender*fat gender*smoking fat*smoking gender*fat*smoking
predict r, residuals
qnorm r
                           Number of obs =      24     R-squared     =  0.7976
                           Root MSE      = 3.05539     Adj R-squared =  0.7090

                  Source |  Partial SS    df       MS           F     Prob > F
      -------------------+----------------------------------------------------
                   Model |  588.582936     7  84.0832766       9.01     0.0002
                         |
                  gender |  176.583755     1  176.583755      18.92     0.0005
                     fat |  242.570427     1  242.570427      25.98     0.0001
                 smoking |  70.3837595     1  70.3837595       7.54     0.0144
              gender*fat |  13.6504194     1  13.6504194       1.46     0.2441
          gender*smoking |  11.0704137     1  11.0704137       1.19     0.2923
             fat*smoking |  72.4537444     1  72.4537444       7.76     0.0132
      gender*fat*smoking |  1.87041736     1  1.87041736       0.20     0.6604
                         |
                Residual |  149.366666    16  9.33541665   
      -------------------+----------------------------------------------------
                   Total |  737.949602    23  32.0847653 
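
Stata 11 and later write interactions in factor-variable notation rather than with the * operator used above. The sketch below fits the same full factorial model in that notation and adds a residual-versus-fitted plot alongside the normal probability plot. It assumes the data entered above are in memory; res2 is a name introduced here for illustration, and whether the plots match the layout of the book's figures is an assumption.

```stata
* Assumes the data entered above are in memory.
* gender##fat##smoking expands to all main effects, all two-way
* interactions, and the three-way interaction, as in the model above.
anova exercise gender##fat##smoking

* res2 is a hypothetical name, chosen to avoid clashing with r above.
predict double res2, residuals

* Residuals against fitted values, then a normal probability plot.
rvfplot, yline(0)
qnorm res2
```

The ANOVA table from this call should agree with the one shown above, since only the notation for the model terms differs.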

Estimation of Contrasts of Treatment means, p. 947.  First we make dummy variables for a regression.  Then we test the contrasts.

gen male = 0
replace male = 1 if gender==1
gen lowfat = 0
replace lowfat = 1 if fat==1
gen lightsmk = 0
replace lightsmk = 1 if smoking==1
gen genderfat= male*lowfat
gen gendersmk= male*lightsmk
gen fatsmk = lowfat*lightsmk
gen genderfatsmk = male*lowfat*lightsmk
regress exercise male lowfat lightsmk genderfat gendersmk fatsmk genderfatsmk
      Source |       SS       df       MS              Number of obs =      24
-------------+------------------------------           F(  7,    16) =    9.01
       Model |  588.582936     7  84.0832766           Prob > F      =  0.0002
    Residual |  149.366666    16  9.33541665           R-squared     =  0.7976
-------------+------------------------------           Adj R-squared =  0.7090
       Total |  737.949602    23  32.0847653           Root MSE      =  3.0554

------------------------------------------------------------------------------
    exercise |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        male |   5.833333   2.494717     2.34   0.033     .5447702     11.1219
      lowfat |   1.933334   2.494717     0.77   0.450    -3.355229    7.221897
    lightsmk |   1.866667   2.494717     0.75   0.465    -3.421896     7.15523
   genderfat |        1.9   3.528062     0.54   0.598    -5.579157    9.379158
   gendersmk |  -3.833333   3.528062    -1.09   0.293    -11.31249    3.645824
      fatsmk |   5.833333   3.528062     1.65   0.118    -1.645825    13.31249
genderfatsmk |   2.233334   4.989433     0.45   0.660    -8.343792    12.81046
       _cons |       10.2   1.764031     5.78   0.000     6.460421    13.93958
------------------------------------------------------------------------------
lincom lightsmk+.5*gendersmk+fatsmk+.5*genderfatsmk
 ( 1)  lightsmk + .5 gendersmk + fatsmk + .5 genderfatsmk = 0

------------------------------------------------------------------------------
    exercise |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |        6.9   1.764031     3.91   0.001     3.160421    10.63958
------------------------------------------------------------------------------

lincom lightsmk+.5*gendersmk
------------------------------------------------------------------------------
    exercise |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |  -.0499996   1.764031    -0.03   0.978    -3.789578    3.689579
------------------------------------------------------------------------------

lincom male+.5*genderfat+.5*gendersmk+.25*genderfatsmk
 ( 1)  male + .5 genderfat + .5 gendersmk + .25 genderfatsmk = 0

------------------------------------------------------------------------------
    exercise |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |      5.425   1.247358     4.35   0.000     2.780719    8.069282
------------------------------------------------------------------------------
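
The coefficients used in the three lincom commands follow from writing each cell mean in terms of the regression coefficients. The derivation below is a sketch; the symbols are introduced here and are not in the original page. Let m, l and s denote the indicators male, lowfat and lightsmk, and let beta_1 through beta_7 be the coefficients of male, lowfat, lightsmk, genderfat, gendersmk, fatsmk and genderfatsmk.

```latex
\begin{aligned}
\mu(m,l,s) &= \beta_0 + \beta_1 m + \beta_2 l + \beta_3 s
            + \beta_4 ml + \beta_5 ms + \beta_6 ls + \beta_7 mls \\[6pt]
% light minus heavy smoking within low fat (l = 1), averaged over gender
\tfrac12 \sum_{m \in \{0,1\}} \bigl[\mu(m,1,1) - \mu(m,1,0)\bigr]
  &= \tfrac12 \bigl[(\beta_3 + \beta_5 + \beta_6 + \beta_7) + (\beta_3 + \beta_6)\bigr]
   = \beta_3 + \tfrac12 \beta_5 + \beta_6 + \tfrac12 \beta_7 \\[6pt]
% light minus heavy smoking within high fat (l = 0), averaged over gender
\tfrac12 \sum_{m \in \{0,1\}} \bigl[\mu(m,0,1) - \mu(m,0,0)\bigr]
  &= \tfrac12 \bigl[(\beta_3 + \beta_5) + \beta_3\bigr]
   = \beta_3 + \tfrac12 \beta_5 \\[6pt]
% male minus female, averaged over the four fat-by-smoking combinations
\tfrac14 \sum_{l,s \in \{0,1\}} \bigl[\mu(1,l,s) - \mu(0,l,s)\bigr]
  &= \tfrac14 \sum_{l,s \in \{0,1\}} \bigl(\beta_1 + \beta_4 l + \beta_5 s + \beta_7 ls\bigr)
   = \beta_1 + \tfrac12 \beta_4 + \tfrac12 \beta_5 + \tfrac14 \beta_7
\end{aligned}
```

These match the estimates above when compared with the tables of means: 22.9 - 16 = 6.9 for low fat, 13.06667 - 13.11667 = -0.05 for high fat, and 18.98333 - 13.55833 = 5.425 for male versus female.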

Figure 23.9b, p. 949.

anova exercise smoking fat smoking*fat
anovaplot, scatter(ms(i)) xscale(r(.5 2.5)) text(11 1.5 "High Percent Fat") text(22 1.5 "Low Percent Fat") legend(off)
                           Number of obs =      24     R-squared     =  0.5223
                           Root MSE      = 4.19846     Adj R-squared =  0.4506

                  Source |  Partial SS    df       MS           F     Prob > F
             ------------+----------------------------------------------------
                   Model |  385.407931     3   128.46931       7.29     0.0017
                         |
                 smoking |  70.3837595     1  70.3837595       3.99     0.0595
                     fat |  242.570427     1  242.570427      13.76     0.0014
             smoking*fat |  72.4537444     1  72.4537444       4.11     0.0562
                         |
                Residual |  352.541671    20  17.6270836   
             ------------+----------------------------------------------------
                   Total |  737.949602    23  32.0847653

Creating the Stress data with missing values and generating the indicator variables, Table 23.5, p. 950.
gen y = exercise
replace y = . if gender==1 & fat==1 & smoking==1 & rep==3
replace y = . if gender==2 & fat==2 & smoking==1 & rep==2
gen x1 = 1
replace x1 = -1 if gender==2
gen x2 = 1
replace x2 = -1 if fat==2
gen x3 = 1
replace x3 = -1 if smoking==2
gen x1x2 = x1*x2
gen x1x3 = x1*x3
gen x2x3 = x2*x3
gen x1x2x3 = x1*x2*x3
list exercise gender fat smoking rep y x1 x2 x3 x1x2 x1x3 x2x3 x1x2x3, clean nolabel compress
       exe~e   gen~r   fat   smo~g   rep      y   x1   x2   x3   x1x2   x1x3   x2x3   x1x..  
  1.    24.1       1     1       1     1   24.1    1    1    1      1      1      1       1  
  2.    29.2       1     1       1     2   29.2    1    1    1      1      1      1       1  
  3.    24.6       1     1       1     3      .    1    1    1      1      1      1       1  
  4.      20       2     1       1     1     20   -1    1    1     -1     -1      1      -1  
  5.    21.9       2     1       1     2   21.9   -1    1    1     -1     -1      1      -1  
  6.    17.6       2     1       1     3   17.6   -1    1    1     -1     -1      1      -1  
  7.    14.6       1     2       1     1   14.6    1   -1    1     -1      1     -1      -1  
  8.    15.3       1     2       1     2   15.3    1   -1    1     -1      1     -1      -1  
  9.    12.3       1     2       1     3   12.3    1   -1    1     -1      1     -1      -1  
 10.    16.1       2     2       1     1   16.1   -1   -1    1      1     -1     -1       1  
 11.     9.3       2     2       1     2      .   -1   -1    1      1     -1     -1       1  
 12.    10.8       2     2       1     3   10.8   -1   -1    1      1     -1     -1       1  
 13.    17.6       1     1       2     1   17.6    1    1   -1      1     -1     -1      -1  
 14.    18.8       1     1       2     2   18.8    1    1   -1      1     -1     -1      -1  
 15.    23.2       1     1       2     3   23.2    1    1   -1      1     -1     -1      -1  
 16.    14.8       2     1       2     1   14.8   -1    1   -1     -1      1     -1       1  
 17.    10.3       2     1       2     2   10.3   -1    1   -1     -1      1     -1       1  
 18.    11.3       2     1       2     3   11.3   -1    1   -1     -1      1     -1       1  
 19.    14.9       1     2       2     1   14.9    1   -1   -1     -1     -1      1       1  
 20.    20.4       1     2       2     2   20.4    1   -1   -1     -1     -1      1       1  
 21.    12.8       1     2       2     3   12.8    1   -1   -1     -1     -1      1       1  
 22.    10.1       2     2       2     1   10.1   -1   -1   -1      1      1      1      -1  
 23.    14.4       2     2       2     2   14.4   -1   -1   -1      1      1      1      -1  
 24.     6.1       2     2       2     3    6.1   -1   -1   -1      1      1      1      -1  
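
With all 24 observations, the +1/-1 coded columns are mutually uncorrelated. Once the two observations are set to missing, this no longer holds for every pair of columns. That is one way to see why dropping a term from the regression can change the estimates of the remaining coefficients. The sketch below illustrates this, assuming y and the x variables were created as above.

```stata
* Assumes y and x1 through x1x2x3 were created as above.
* With all 24 observations the coded columns are mutually uncorrelated.
correlate x1 x2 x3 x1x2 x1x3 x2x3 x1x2x3

* Restricting to the 22 observations with nonmissing y breaks this for
* some pairs of columns (for example x1 and x2).
correlate x1 x2 x3 x1x2 x1x3 x2x3 x1x2x3 if !missing(y)
```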

Testing factor A (Gender), p. 950. We first fit the full model, then drop x1 and regress y on the remaining variables (columns 3-8).

regress y x1 x2 x3 x1x2 x1x3 x2x3 x1x2x3
      Source |       SS       df       MS              Number of obs =      22
-------------+------------------------------           F(  7,    14) =    7.18
       Model |  484.814865     7  69.2592665           Prob > F      =  0.0009
    Residual |  135.083332    14  9.64880943           R-squared     =  0.7821
-------------+------------------------------           Adj R-squared =  0.6731
       Total |  619.898197    21  29.5189618           Root MSE      =  3.1063

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x1 |      2.625   .6725236     3.90   0.002      1.18258     4.06742
          x2 |   3.091667   .6725236     4.60   0.000     1.649247    4.534086
          x3 |   1.970833   .6725236     2.93   0.011     .5284139    3.413253
        x1x2 |     1.0125   .6725236     1.51   0.154    -.4299195     2.45492
        x1x3 |  -.7666666   .6725236    -1.14   0.273    -2.209086     .675753
        x2x3 |       1.65   .6725236     2.45   0.028     .2075804     3.09242
      x1x2x3 |   .5375001   .6725236     0.80   0.438    -.9049195     1.97992
       _cons |   16.52917   .6725236    24.58   0.000     15.08675    17.97159
------------------------------------------------------------------------------
regress y x2 x3 x1x2 x1x3 x2x3 x1x2x3
      Source |       SS       df       MS              Number of obs =      22
-------------+------------------------------           F(  6,    15) =    2.99
       Model |  337.814861     6  56.3024768           Prob > F      =  0.0397
    Residual |  282.083336    15  18.8055558           R-squared     =  0.5450
-------------+------------------------------           Adj R-squared =  0.3629
       Total |  619.898197    21  29.5189618           Root MSE      =  4.3365

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          x2 |        2.8   .9330743     3.00   0.009     .8111994    4.788801
          x3 |   1.970833   .9388879     2.10   0.053    -.0303587    3.972026
        x1x2 |     1.0125   .9388879     1.08   0.298     -.988692    3.013692
        x1x3 |  -1.058333   .9330743    -1.13   0.274    -3.047134    .9304675
        x2x3 |   1.358333   .9330743     1.46   0.166    -.6304675    3.347134
      x1x2x3 |   .5375001   .9388879     0.57   0.575    -1.463692    2.538692
       _cons |   16.52917   .9388879    17.61   0.000     14.52797    18.53036
------------------------------------------------------------------------------
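
The test statistic for factor A can be computed from the residual sums of squares of the two regressions above. The following sketch copies those values from the output into scalars; the scalar names are introduced here for illustration.

```stata
* Residual sums of squares and degrees of freedom copied from the two
* regression tables above (full model, then reduced model without x1).
scalar sse_full    = 135.083332
scalar df_full     = 14
scalar sse_reduced = 282.083336
scalar df_reduced  = 15

* General linear test statistic:
* F* = [(SSE(R) - SSE(F)) / (df_R - df_F)] / [SSE(F) / df_F]
scalar fstar = ((scalar(sse_reduced) - scalar(sse_full)) / ///
    (scalar(df_reduced) - scalar(df_full))) / (scalar(sse_full) / scalar(df_full))

* Upper-tail probability of the F distribution with (1, 14) df.
scalar pval = Ftail(scalar(df_reduced) - scalar(df_full), scalar(df_full), scalar(fstar))
display "F* = " %8.4f scalar(fstar) "   p-value = " %6.4f scalar(pval)
```

This gives an F* of about 15.2, which is the square of the t statistic reported for x1 in the full model (3.90), as expected for a test with one numerator degree of freedom. Typing test x1 immediately after the full-model regression should give the same result.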

These pages are Copyrighted (c) by UCLA Academic Technology Services

