Stata Textbook Examples
Design and Analysis by Geoffrey Keppel
Chapter 14: The Analysis of Covariance

Version info: Code for this page was tested in Stata 12.

Page 312 shows the analysis of covariance. In Stata 11 and later you mark continuous variables with the c. prefix, which replaces the contin( ) option of anova used in older versions. In the example below, x is the covariate, and c.x tells Stata to treat it as a continuous variable.
use http://www.ats.ucla.edu/stat/stata/examples/da/chap14, clear

anova y a c.x

                           Number of obs =      24     R-squared     =  0.4522
                           Root MSE      = 4.08761     Adj R-squared =  0.3700

                  Source |  Partial SS    df       MS           F     Prob > F
              -----------+----------------------------------------------------
                   Model |  275.828162     3  91.9427208       5.50     0.0064
                         |
                       a |  165.793075     2  82.8965373       4.96     0.0178
                       x |  163.828162     1  163.828162       9.81     0.0053
                         |
                Residual |  334.171838    20  16.7085919   
              -----------+----------------------------------------------------
                   Total |      610.00    23  26.5217391
Note that later on this page we also apply c. to the contrast-coded variables acomp1 and acomp2 in order to perform contrasts, but this does not make those variables covariates; it is just a programming device in Stata for obtaining the contrasts of interest. In general, c. is used when you have a covariate and want to tell Stata to treat it as a continuous variable (as opposed to a categorical one).
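As a rough illustration of what the c. prefix does, the same model can be written as a regression, where i.a requests indicator variables for the levels of a and c.x enters x as a single slope. This is only a sketch that assumes the chap14 data are still in memory; the use of regress and testparm is our addition and is not part of Keppel's presentation.

```stata
* Same ANCOVA model written as a regression (assumes chap14 is loaded)
regress y i.a c.x

* Joint test of the a indicators; this should correspond to the
* F test for a in the anova table above
testparm i.a
```

If x were entered as i.x instead, Stata would estimate a separate effect for every distinct value of x, which is not what an analysis of covariance intends.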
Below we show how to compute the adjusted means shown on page 314. The adjust command sets x to its mean (7.5, as the output header notes) and lists the predicted mean of y separately for each level of a.
adjust x, by(a)
-------------------------------------------------------------------------------
Dependent variable: y     Command: anova
Covariate set to mean: x = 7.5
-------------------------------------------------------------------------------

----------+-----------
        a |         xb
----------+-----------
        1 |    6.53103
        2 |    10.3747
        3 |    13.0943
----------+-----------
Key:  xb         =  Linear Prediction
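In Stata 11 and later, margins is the documented replacement for adjust. The following is a minimal sketch of the equivalent request, assuming the chap14 data are loaded; the model is refit quietly so that margins is certain to see the ANCOVA with a as a factor.

```stata
* Refit quietly so margins uses the ANCOVA model with a as a factor
quietly anova y a c.x

* Predicted mean of y at each level of a, holding x at its sample mean
margins a, at((mean) x)
```

Because the model is linear in x, these predictions should agree with the adjust results above; margins additionally reports standard errors and confidence intervals.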
On pages 314-316 Keppel shows how to compare group 1 with groups 2 and 3. We create acomp1, which codes the comparison of interest (group 1 versus the average of groups 2 and 3), and add acomp2, which compares group 3 with group 2 and is orthogonal to acomp1.
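* acomp1: group 1 versus the average of groups 2 and 3
generate acomp1=a
recode acomp1 1=1 2=-.5 3=-.5
* acomp2: group 3 versus group 2 (group 1 gets weight 0)
generate acomp2=a
recode acomp2 1=0 2=-1 3=1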
We now replace a with c.acomp1 c.acomp2. The results differ slightly from Keppel's on page 316 but match results from other packages; we think the differences are simply due to rounding error.
anova y c.acomp1 c.acomp2 c.x

                           Number of obs =      24     R-squared     =  0.4522
                           Root MSE      = 4.08761     Adj R-squared =  0.3700

                  Source |  Partial SS    df       MS           F     Prob > F
              -----------+----------------------------------------------------
                   Model |  275.828162     3  91.9427208       5.50     0.0064
                         |
                  acomp1 |   142.11589     1   142.11589       8.51     0.0085
                  acomp2 |  27.5922135     1  27.5922135       1.65     0.2135
                       x |  163.828162     1  163.828162       9.81     0.0053
                         |
                Residual |  334.171838    20  16.7085919   
              -----------+----------------------------------------------------
                   Total |      610.00    23  26.5217391
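Stata 12 also provides the contrast command, which can test a user-defined comparison without creating new variables. The sketch below is our addition and assumes the chap14 data are loaded; it refits the model with a as a factor, and the weights mirror the coding of acomp1 and acomp2.

```stata
* Refit the ANCOVA with a as a factor
quietly anova y a c.x

* Group 1 versus the average of groups 2 and 3, adjusted for x
contrast {a 1 -.5 -.5}

* The orthogonal comparison of group 3 with group 2
contrast {a 0 -1 1}
```

The F tests reported by contrast should correspond to the acomp1 and acomp2 rows of the table above, since both approaches test the same comparisons among the adjusted means.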
At the bottom of page 320 Keppel shows how to test for homogeneity of regression across groups. This is equivalent to testing the a#x interaction, where a significant interaction would indicate that the regression coefficient for x differs across the levels of a. In the output below, the test of a#x (F = 0.91, p = 0.4188) corresponds to the test shown at the bottom of page 320. Note that we request the main effects and the interaction with the ## notation, which Stata expands to both main effects plus their interaction.
anova y a##c.x
                           Number of obs =      24     R-squared     =  0.5027
                           Root MSE      = 4.10535     Adj R-squared =  0.3645

                  Source |  Partial SS    df       MS           F     Prob > F
              -----------+----------------------------------------------------
                   Model |  306.630482     5  61.3260963       3.64     0.0189
                         |
                       a |  74.2381202     2  37.1190601       2.20     0.1394
                       x |  107.524781     1  107.524781       6.38     0.0211
                     a#x |  30.8023193     2  15.4011596       0.91     0.4188
                         |
                Residual |  303.369518    18  16.8538621   
              -----------+----------------------------------------------------
                   Total |         610    23  26.5217391   
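The a#x test can also be read as a comparison of two nested models: the residual sum of squares from anova y a c.x minus that from anova y a##c.x, divided by the 2 extra degrees of freedom and then by the residual mean square of the larger model. The lines below are a small arithmetic check that uses only values printed in the tables on this page; the display and Ftail() calls are our addition.

```stata
* Drop in residual SS when separate slopes are allowed, per extra df
display (334.171838 - 303.369518)/2

* F ratio against the residual MS of the interaction model
display ((334.171838 - 303.369518)/2) / 16.8538621

* Upper-tail p-value for F with 2 and 18 degrees of freedom
display Ftail(2, 18, ((334.171838 - 303.369518)/2) / 16.8538621)
```

Up to rounding, the three results should agree with the MS, F, and Prob > F reported for a#x.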

These pages are Copyrighted (c) by UCLA Academic Technology Services

