UCLA Academic Technology Services

Stata Textbook Examples
Applied Linear Statistical Models by Neter, Kutner, et al.
Chapter 25: Analysis of Covariance

Inputting the Cracker Promotion data, p. 1020.
clear
input y x treat store
  38  21  1  1
  39  26  1  2
  36  22  1  3
  45  28  1  4
  33  19  1  5
  43  34  2  1
  38  26  2  2
  38  29  2  3
  27  18  2  4
  34  25  2  5
  24  23  3  1
  32  29  3  2
  31  30  3  3
  21  16  3  4
  28  29  3  5
end

Table 25.1, p. 1020.

table treat store, contents(mean y mean x)
----------------------------------------
          |            store            
    treat |    1     2     3     4     5
----------+-----------------------------
        1 |   38    39    36    45    33
          |   21    26    22    28    19
          | 
        2 |   43    38    38    27    34
          |   34    26    29    18    25
          | 
        3 |   24    32    31    21    28
          |   23    29    30    16    29
----------------------------------------

Fig. 25.5, p. 1021.

twoway (scatter y x if treat==1, ms(oh) msize(large) legend(label(1 "Treatment 1"))) ///
(scatter y x if treat==2, legend(label(2 "Treatment 2"))) ///
(scatter y x if treat==3, ms(sh) legend(label(3 "Treatment 3")))

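Creating the indicator and interaction variables for the Cracker data set, table 25.2, p. 1021. First we calculate the overall mean of X, which is used to generate the centered covariate littlex (x = X - mean). I1 and I2 are the effect-coded treatment indicators (1, 0, -1), and I1x and I2x are their interactions with littlex.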

quietly sum x
gen littlex = x-r(mean)
gen I1 = 0
replace I1 =  1 if treat==1
replace I1 = -1 if treat==3
gen I2 = 0
replace I2 =  1 if treat==2
replace I2 = -1 if treat==3
gen I1x = I1*littlex
gen I2x = I2*littlex
sort treat store
list treat store y x littlex I1 I2 I1x I2x
mean x

     +---------------------------------------------------------+
     | treat   store    y    x   littlex   I1   I2   I1x   I2x |
     |---------------------------------------------------------|
  1. |     1       1   38   21        -4    1    0    -4     0 |
  2. |     1       2   39   26         1    1    0     1     0 |
  3. |     1       3   36   22        -3    1    0    -3     0 |
  4. |     1       4   45   28         3    1    0     3     0 |
  5. |     1       5   33   19        -6    1    0    -6     0 |
     |---------------------------------------------------------|
  6. |     2       1   43   34         9    0    1     0     9 |
  7. |     2       2   38   26         1    0    1     0     1 |
  8. |     2       3   38   29         4    0    1     0     4 |
  9. |     2       4   27   18        -7    0    1     0    -7 |
 10. |     2       5   34   25         0    0    1     0     0 |
     |---------------------------------------------------------|
 11. |     3       1   24   23        -2   -1   -1     2     2 |
 12. |     3       2   32   29         4   -1   -1    -4    -4 |
 13. |     3       3   31   30         5   -1   -1    -5    -5 |
 14. |     3       4   21   16        -9   -1   -1     9     9 |
 15. |     3       5   28   29         4   -1   -1    -4    -4 |
     +---------------------------------------------------------+
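
For reference, the variables built above correspond to the single-factor covariance model with effect coding. The block below is a sketch in our own notation (the symbols are not copied from the textbook): littlex plays the role of x, and I1 and I2 carry the treatment effects, with the third effect implied by the sum-to-zero constraint.

```latex
\begin{aligned}
Y_{ij} &= \mu_{\cdot} + \tau_1 I_{ij1} + \tau_2 I_{ij2} + \gamma\, x_{ij} + \varepsilon_{ij},
\qquad x_{ij} = X_{ij} - \bar{X}_{\cdot\cdot}, \\
I_{ij1} &=
\begin{cases}
 1 & \text{treatment 1} \\
-1 & \text{treatment 3} \\
 0 & \text{otherwise}
\end{cases}
\qquad
I_{ij2} =
\begin{cases}
 1 & \text{treatment 2} \\
-1 & \text{treatment 3} \\
 0 & \text{otherwise}
\end{cases} \\
\tau_3 &= -\tau_1 - \tau_2 .
\end{aligned}
```

With this coding the constant estimates the overall adjusted mean, and the treatment 3 effect is recovered as -(I1 coefficient) - (I2 coefficient).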

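Regressing Y on littlex, I1 and I2, table 25.3, p. 1022. Testing for treatment effect, p. 1023-1024. Note that test I1=I2 below tests only whether the effects of treatments 1 and 2 are equal (tau1 = tau2). The test for any treatment effect, H0: tau1 = tau2 = 0, compares this model with the reduced model of Table 25.4 and corresponds to test I1 I2.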

regress y littlex I1 I2
predict r, resid
      Source |       SS       df       MS              Number of obs =      15
-------------+------------------------------           F(  3,    11) =   57.78
       Model |  607.828691     3  202.609564           Prob > F      =  0.0000
    Residual |  38.5713085    11  3.50648259           R-squared     =  0.9403
-------------+------------------------------           Adj R-squared =  0.9241
       Total |       646.4    14  46.1714286           Root MSE      =  1.8726

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     littlex |   .8985594   .1025849     8.76   0.000     .6727716    1.124347
          I1 |   6.017407   .7082568     8.50   0.000     4.458544     7.57627
          I2 |   .9420168   .6986826     1.35   0.205    -.5957732    2.479807
       _cons |       33.8    .483493    69.91   0.000     32.73584    34.86416
------------------------------------------------------------------------------
matrix list e(V)
symmetric e(V)[4,4]
            littlex          I1          I2       _cons
littlex   .01052366
     I1   .01894258   .50162766
     I2  -.01473312  -.26028512   .48815738
  _cons           0           0           0   .23376551
test I1=I2
 ( 1)  I1 - I2 = 0

       F(  1,    11) =   17.06
            Prob > F =    0.0017

Figure 25.6a, p. 1023.

twoway scatter treat r, ylabel(#3)

Figure 25.6b, p. 1023.

qnorm r

Table 25.4, p. 1023.

regress y littlex
      Source |       SS       df       MS              Number of obs =      15
-------------+------------------------------           F(  1,    13) =    5.44
       Model |  190.677778     1  190.677778           Prob > F      =  0.0364
    Residual |  455.722222    13  35.0555556           R-squared     =  0.2950
-------------+------------------------------           Adj R-squared =  0.2408
       Total |       646.4    14  46.1714286           Root MSE      =  5.9208

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     littlex |   .7277778   .3120521     2.33   0.036     .0536301    1.401925
       _cons |       33.8   1.528737    22.11   0.000     30.49736    37.10264
------------------------------------------------------------------------------
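
The test for treatment effects compares the full model (littlex, I1, I2) with the reduced model of Table 25.4 (littlex only). The following is a minimal sketch of that general linear test, using only the residual sums of squares and degrees of freedom printed on this page. The scalar names are ours and purely illustrative.

```stata
* General linear test of H0: tau1 = tau2 = 0.
* All numbers are copied from the regression output shown on this page.
* Scalar names (sse_full, sse_red, ...) are illustrative assumptions.
scalar sse_full = 38.5713085
scalar df_full  = 11
scalar sse_red  = 455.722222
scalar df_red   = 13
scalar df_num   = df_red - df_full

* F* = [(SSE(R) - SSE(F)) / (df_R - df_F)] / [SSE(F) / df_F]
scalar Fstar = ((sse_red - sse_full)/df_num) / (sse_full/df_full)
scalar pval  = Ftail(df_num, df_full, Fstar)

display "F* = " %6.2f Fstar
display "Prob > F = " %8.6f pval
```

The same test should be available directly as test I1 I2 after regress y littlex I1 I2.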

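Estimation of treatment effects, pairwise comparisons, p. 1024. Fitting indicators for all three treatments without a constant makes each treat coefficient the estimated mean response for that treatment at the mean of X, so lincom gives the pairwise differences directly.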

gen treat1 = 0 
replace treat1 = 1 if treat==1
gen treat2 = 0
replace treat2 = 1 if treat==2
gen treat3 = 0
replace treat3 = 1 if treat==3
regress y littlex treat1 treat2 treat3, noconstant
      Source |       SS       df       MS              Number of obs =      15
-------------+------------------------------           F(  4,    11) = 1265.12
       Model |  17744.4287     4  4436.10717           Prob > F      =  0.0000
    Residual |  38.5713085    11  3.50648259           R-squared     =  0.9978
-------------+------------------------------           Adj R-squared =  0.9970
       Total |       17783    15  1185.53333           Root MSE      =  1.8726

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     littlex |   .8985594   .1025849     8.76   0.000     .6727716    1.124347
      treat1 |   39.81741   .8575507    46.43   0.000     37.92995    41.70486
      treat2 |   34.74202   .8496605    40.89   0.000     32.87193    36.61211
      treat3 |   26.84058   .8384392    32.01   0.000     24.99518    28.68597
------------------------------------------------------------------------------

lincom treat1-treat2

 ( 1)  treat1 - treat2 = 0

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |    5.07539   1.228965     4.13   0.002     2.370456    7.780324
------------------------------------------------------------------------------

di r(se)^2
1.5103553

lincom treat1-treat3

 ( 1)  treat1 - treat3 = 0

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   12.97683   1.205623    10.76   0.000     10.32327    15.63039
------------------------------------------------------------------------------

di r(se)^2
1.4535275

lincom treat2-treat3

 ( 1)  treat2 - treat3 = 0

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   7.901441   1.188746     6.65   0.000     5.285029    10.51785
------------------------------------------------------------------------------

di r(se)^2
1.4131167
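
The three variances displayed above can also be rebuilt from the covariance matrix of the effect-coded regression (the matrix list e(V) output earlier on this page). The sketch below assumes only the I1 and I2 entries of that matrix. Because tau3 = -tau1 - tau2, each pairwise difference is a linear combination of the I1 and I2 coefficients. The matrix names are illustrative.

```stata
* Covariance matrix of the I1 and I2 coefficients, copied from e(V) above.
matrix V   = (.50162766, -.26028512 \ -.26028512, .48815738)

* Contrast vectors in terms of (tau1, tau2), using tau3 = -tau1 - tau2:
*   tau1 - tau2 =   tau1 -   tau2
*   tau1 - tau3 = 2*tau1 +   tau2
*   tau2 - tau3 =   tau1 + 2*tau2
matrix c12 = (1, -1)
matrix c13 = (2,  1)
matrix c23 = (1,  2)

* Variance of a linear combination: c V c'
matrix v12 = c12*V*c12'
matrix v13 = c13*V*c13'
matrix v23 = c23*V*c23'

display "Var(tau1 - tau2) = " v12[1,1]
display "Var(tau1 - tau3) = " v13[1,1]
display "Var(tau2 - tau3) = " v23[1,1]
```

These agree with the di r(se)^2 values above up to rounding of the printed covariance entries.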

Estimating the mean response for each treatment group when X is at its mean (X=25), p. 1026.

Note: This code uses the matrices saved by the regression command.

regress y littlex I1 I2
matrix v = e(V)

      Source |       SS       df       MS              Number of obs =      15
-------------+------------------------------           F(  3,    11) =   57.78
       Model |  607.828691     3  202.609564           Prob > F      =  0.0000
    Residual |  38.5713085    11  3.50648259           R-squared     =  0.9403
-------------+------------------------------           Adj R-squared =  0.9241
       Total |       646.4    14  46.1714286           Root MSE      =  1.8726

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     littlex |   .8985594   .1025849     8.76   0.000     .6727716    1.124347
          I1 |   6.017407   .7082568     8.50   0.000     4.458544     7.57627
          I2 |   .9420168   .6986826     1.35   0.205    -.5957732    2.479807
       _cons |       33.8    .483493    69.91   0.000     32.73584    34.86416
------------------------------------------------------------------------------

di _b[_cons]+_b[I1]
39.817407

di v[4,4]+v[2,2]+2*v[4,2]
.73539317

di _b[_cons]+_b[I2]
34.742017

di v[4,4]+v[3,3]+2*v[4,3]
.72192289

di _b[_cons]-_b[I1]-_b[I2]
26.840576

di v[4,4]+v[2,2]+v[3,3]-2*v[4,2]-2*v[4,3]+2*v[3,2]
.7029803
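
The three variance expressions above follow from the rule for the variance of a linear combination of coefficients. As a sketch in our own notation, write b0 for _cons (row 4 of v), b1 for I1 (row 2) and b2 for I2 (row 3):

```latex
\begin{aligned}
s^2\{b_0 + b_1\} &= s^2\{b_0\} + s^2\{b_1\} + 2\,s\{b_0, b_1\}, \\
s^2\{b_0 + b_2\} &= s^2\{b_0\} + s^2\{b_2\} + 2\,s\{b_0, b_2\}, \\
s^2\{b_0 - b_1 - b_2\} &= s^2\{b_0\} + s^2\{b_1\} + s^2\{b_2\}
  - 2\,s\{b_0, b_1\} - 2\,s\{b_0, b_2\} + 2\,s\{b_1, b_2\}.
\end{aligned}
```

In this data set the covariances involving _cons are zero (see the last row of e(V) listed earlier), so only the covariance between I1 and I2 contributes a cross term.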

Table 25.5 and the test for parallel slopes, that is, the test of whether the treatment-by-covariate interactions I1x and I2x are jointly zero, p. 1027.

regress y littlex I1 I2 I1x I2x

      Source |       SS       df       MS              Number of obs =      15
-------------+------------------------------           F(  5,     9) =   35.11
       Model |  614.879165     5  122.975833           Prob > F      =  0.0000
    Residual |  31.5208354     9  3.50231504           R-squared     =  0.9512
-------------+------------------------------           Adj R-squared =  0.9241
       Total |       646.4    14  46.1714286           Root MSE      =  1.8714

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     littlex |   .9387352   .1126656     8.33   0.000     .6838679    1.193603
          I1 |   6.269902   .7516696     8.34   0.000     4.569507    7.970297
          I2 |   .7179135   .7160036     1.00   0.342    -.9017991    2.337626
         I1x |   .1525057   .1843832     0.83   0.430    -.2645981    .5696094
         I2x |   .0525185    .145611     0.36   0.727    -.2768765    .3819134
       _cons |   33.89433   .5123434    66.16   0.000     32.73533    35.05333
------------------------------------------------------------------------------

test I1x=I2x=0
 ( 1)  I1x - I2x = 0
 ( 2)  I1x = 0

       F(  2,     9) =    1.01
            Prob > F =    0.4032
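
The same parallel-slopes test can be sketched without hand-built indicators by using factor-variable notation. This assumes Stata 11 or later and is an alternative to the approach above, not the method used on this page. The block re-enters the Cracker data so that it runs on its own.

```stata
clear
input y x treat store
  38  21  1  1
  39  26  1  2
  36  22  1  3
  45  28  1  4
  33  19  1  5
  43  34  2  1
  38  26  2  2
  38  29  2  3
  27  18  2  4
  34  25  2  5
  24  23  3  1
  32  29  3  2
  31  30  3  3
  21  16  3  4
  28  29  3  5
end

* Center the covariate at its overall mean, as above.
quietly summarize x
generate littlex = x - r(mean)

* Separate-slopes model: treatment, covariate, and their interaction.
regress y i.treat##c.littlex

* Joint test that both treatment-by-covariate interaction terms are zero.
testparm i.treat#c.littlex
```

Stata uses indicator (reference-group) coding here rather than the 1, 0, -1 coding, so the individual coefficients differ from Table 25.5, but the joint F test of the interaction terms is the same test of equal slopes.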

Inputting the Salable Flowers data set.

clear
input y x a b rep
  98  15  1  1  1
  60   4  1  1  2
  77   7  1  1  3
  80   9  1  1  4
  95  14  1  1  5
  64   5  1  1  6
  55   4  2  1  1
  60   5  2  1  2
  75   8  2  1  3
  65   7  2  1  4
  87  13  2  1  5
  78  11  2  1  6
  71  10  1  2  1
  80  12  1  2  2
  86  14  1  2  3
  82  13  1  2  4
  46   2  1  2  5
  55   3  1  2  6
  76  11  2  2  1
  68  10  2  2  2
  43   2  2  2  3
  47   3  2  2  4
  62   7  2  2  5
  70   9  2  2  6
end

label var y "yield"
label var x "plot size"
label var a "variety"
label var b "moisture"

Table 25.6, p. 1029.

table rep b , contents(mean y mean x) by(a)

----------------------
variety   |  moisture 
and rep   |    1     2
----------+-----------
1         |
        1 |   98    71
          |   15    10
          | 
        2 |   60    80
          |    4    12
          | 
        3 |   77    86
          |    7    14
          | 
        4 |   80    82
          |    9    13
          | 
        5 |   95    46
          |   14     2
          | 
        6 |   64    55
          |    5     3
----------+-----------
2         |
        1 |   55    76
          |    4    11
          | 
        2 |   60    68
          |    5    10
          | 
        3 |   75    43
          |    8     2
          | 
        4 |   65    47
          |    7     3
          | 
        5 |   87    62
          |   13     7
          | 
        6 |   78    70
          |   11     9
----------------------

Fig. 25.7, p. 1030.

twoway (scatter y x if a==1 & b==1, legend(label(1 "A1B1"))) ///
(scatter y x if a==1 & b==2, legend(label(2 "A1B2"))) ///
(scatter y x if a==2 & b==1, legend(label(3 "A2B1"))) ///
(scatter y x if a==2 & b==2, legend(label(4 "A2B2")))

Generating the variable for X centered at its mean, the indicator variables and their interaction, p. 1028.

sum x
gen littlex = x - r(mean)
gen I1 = 1
replace I1 = -1 if a==2
gen I2 = 1
replace I2 = -1 if b==2
gen I12 = I1*I2

Table 25.7, regression output and sums of squares, p. 1030, and the test of the interaction, p. 1031.

regress y littlex I1 I2 I12
anova y littlex I1 I2 I12, continuous(littlex)
      Source |       SS       df       MS              Number of obs =      24
-------------+------------------------------           F(  4,    19) =  197.45
       Model |  4966.51882     4   1241.6297           Prob > F      =  0.0000
    Residual |  119.481183    19  6.28848331           R-squared     =  0.9765
-------------+------------------------------           Adj R-squared =  0.9716
       Total |        5086    23  221.130435           Root MSE      =  2.5077

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     littlex |   3.276882   .1300174    25.20   0.000     3.004752    3.549011
          I1 |   2.042339   .5210844     3.92   0.001     .9516966    3.132981
          I2 |    3.68078     .51291     7.18   0.000     2.607247    4.754313
         I12 |   .8192204     .51291     1.60   0.127    -.2543125    1.892753
       _cons |         70    .511879   136.75   0.000     68.92862    71.07138
------------------------------------------------------------------------------


                           Number of obs =      24     R-squared     =  0.9765
                           Root MSE      = 2.50768     Adj R-squared =  0.9716

                  Source |  Partial SS    df       MS           F     Prob > F
              -----------+----------------------------------------------------
                   Model |  4966.51882     4   1241.6297     197.45     0.0000
                         |
                 littlex |  3994.51882     1  3994.51882     635.21     0.0000
                      I1 |  96.6018263     1  96.6018263      15.36     0.0009
                      I2 |  323.849473     1  323.849473      51.50     0.0000
                     I12 |  16.0422442     1  16.0422442       2.55     0.1267
                         |
                Residual |  119.481183    19  6.28848331   
              -----------+----------------------------------------------------
                   Total |        5086    23  221.130435
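
The anova command above uses the older continuous() option. In newer versions of Stata (11 and later, to our knowledge) continuous covariates are marked with the c. prefix instead, and factor variables may not take negative values, so the -1/1 indicators cannot be used as anova factors. A minimal sketch of an equivalent fit under those assumptions uses the original factors a and b directly:

```stata
clear
input y x a b rep
  98  15  1  1  1
  60   4  1  1  2
  77   7  1  1  3
  80   9  1  1  4
  95  14  1  1  5
  64   5  1  1  6
  55   4  2  1  1
  60   5  2  1  2
  75   8  2  1  3
  65   7  2  1  4
  87  13  2  1  5
  78  11  2  1  6
  71  10  1  2  1
  80  12  1  2  2
  86  14  1  2  3
  82  13  1  2  4
  46   2  1  2  5
  55   3  1  2  6
  76  11  2  2  1
  68  10  2  2  2
  43   2  2  2  3
  47   3  2  2  4
  62   7  2  2  5
  70   9  2  2  6
end

* Center the covariate at its overall mean, as above.
quietly summarize x
generate littlex = x - r(mean)

* Two-factor analysis of covariance: covariate plus a, b and a#b.
anova y c.littlex a##b
```

The two parameterizations describe the same model, so the Model and Residual lines should match the table above. We have not reproduced the individual term lines here, so compare those against Table 25.7 yourself.
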

Figure 25.8, p. 1031, estimated treatment means plot (x=0).

regress y littlex I1 I2 I12
twoway (function y = _b[_cons]+0*_b[littlex]+x*_b[I1]+(-1)*_b[I2]+x*(-1)*_b[I12], range(1 -1) n(2)) ///
	 (function y = _b[_cons]+0*_b[littlex]+x*_b[I1]+(1)*_b[I2]+x*1*_b[I12], range(1 -1) n(2)) ///
	 ,  xscale(reverse) xlabel(1 "a1" -1 "a2") ylabel(40 100) legend(label(1 "b2") label(2 "b1"))
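
The four end points of the two lines in this plot are the estimated treatment means at x=0. As a check, they can be computed by hand from the coefficients printed in Table 25.7 above. The scalar names below are ours, and the signs follow the 1/-1 coding of I1, I2 and I12.

```stata
* Coefficients copied from the regress output above.
scalar b0  = 70
scalar bA  = 2.042339
scalar bB  = 3.68078
scalar bAB = .8192204

* Estimated treatment means at littlex = 0.
* I1 = 1 for a1, -1 for a2; I2 = 1 for b1, -1 for b2; I12 = I1*I2.
scalar m_a1b1 = b0 + bA + bB + bAB
scalar m_a1b2 = b0 + bA - bB - bAB
scalar m_a2b1 = b0 - bA + bB - bAB
scalar m_a2b2 = b0 - bA - bB + bAB

display "a1b1: " %8.4f m_a1b1
display "a1b2: " %8.4f m_a1b2
display "a2b1: " %8.4f m_a2b1
display "a2b2: " %8.4f m_a2b2
```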


These pages are Copyrighted (c) by UCLA Academic Technology Services


The content of this web site should not be construed as an endorsement of any particular web site, book, or software product by the University of California