### SAS FAQ: How can I test contrasts and interaction contrasts using the estimate statement?

It can be rather tricky to program the estimate statement when higher-order interactions (e.g., three-way or four-way interactions) are included in the mixed model.  Let's look at an example where we are using proc mixed in a repeated measures model.  The data set exercise was used in our seminar on repeated measures. It consists of people who were randomly assigned to one of two diets (low-fat and not low-fat) and to one of three types of exercise (at rest, walking leisurely and running).  Their pulse rate was measured at three time points during their assigned exercise: at 1 minute, 15 minutes and 30 minutes.

We included all three variables in our mixed model: diet, which has two levels; exertype, which has three levels; and time, which also has three levels.  Even though time is a repeated factor, we can treat it in the same manner as the other variables when we want to test the various contrasts and interaction contrasts that may be of interest.  Finally, we will present an example of how to program the estimate statement for an interaction contrast involving a four-way interaction.

proc mixed data=exercise;
class diet exertype time;
model pulse = exertype|diet|time;
repeated time / subject=id type=arh(1) ;
run;
quit;

<output omitted>

Type 3 Tests of Fixed Effects

Effect              Num DF  Den DF    F Value    Pr > F
exertype                 2      24      52.17    <.0001
diet                     1      24      15.81    0.0006
diet*exertype            2      24       5.11    0.0142
time                     2      48      30.82    <.0001
exertype*time            4      48      20.25    <.0001
diet*time                2      48       2.80    0.0709
diet*exertype*time       4      48       4.45    0.0039


#### Contrasts involving only one variable

The graph of the data indicates that there might be a difference between exertype level 3 and the other two levels. Therefore, we will use reverse Helmert coding for exertype in the estimate statement in order to test this particular contrast: the coefficients -.5 -.5 1 compare level 3 with the average of levels 1 and 2.  For more information on reverse Helmert coding and other contrast coding systems, please refer to chapter 5 in our webbook on regression.


proc mixed data=exercise;
class diet exertype time;
model pulse = exertype|diet|time;
repeated time / subject=id type=arh(1) ;
estimate 'Exertype 12v3' exertype  -.5 -.5 1;
run;
quit;

<output omitted> 
                             Estimates

Label            Estimate   Std Error      DF    t Value    Pr > |t|
Exertype 12v3     20.0500      1.9975      24      10.04      <.0001
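
When a contrast is specified only for a main effect in a model that also contains interactions, it can help to look at the coefficients that proc mixed actually uses. The following is a minimal sketch, assuming the same exercise data set and model as above. The e option on the estimate statement asks SAS to print the full coefficient vector (the L matrix), and the divisor= option lets you enter whole numbers instead of fractions. Writing -1 -1 2 with divisor=2 is just another way of writing -.5 -.5 1, so the estimate should match the one shown above.

```sas
proc mixed data=exercise;
  class diet exertype time;
  model pulse = exertype|diet|time;
  repeated time / subject=id type=arh(1);
  /* (-1 -1 2)/2 is the same reverse Helmert contrast as -.5 -.5 1. */
  /* e prints the coefficients for every fixed-effect parameter.    */
  estimate 'Exertype 12v3' exertype -1 -1 2 / divisor=2 e;
run;
quit;
```

The printed coefficients should show how the contrast on exertype is carried over to the interaction terms that contain exertype.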


#### Contrast involving a two-way interaction

The output from the model indicates that there is a statistically significant interaction between exertype and diet, which is reflected in the graphs of exertype by time with one graph per level of diet.  In the graphs we see that the pulse rate of the third level of exertype increases much faster in the non-low-fat diet group than in the low-fat diet group.  We want to see if this difference is statistically significant. In order to do this, we will test the interaction of exertype, contrasting level 3 versus levels 1 and 2, and diet, contrasting level 1 versus level 2.

We want to apply reverse Helmert coding to exertype in order to compare exertype level 3 with the average of levels 1 and 2; at the same time we want to compare the two levels of diet.  In order to make it easier to see the coding systems, we will present them in a table format.

| diet level | coding |
|---|---|
| 1 | 1 |
| 2 | -1 |

| exertype level | coding |
|---|---|
| 1 | -0.5 |
| 2 | -0.5 |
| 3 | 1 |

The interaction is coded as d1e1 d1e2 d1e3 d2e1 d2e2 d2e3, where d# = coding for diet level #, e# = coding for exertype level # and d#e# is the product of the two.  The order of the factors is determined by the order in which they appear in the class statement.  In this particular case diet appears before exertype in the class statement; thus, with our coding system for diet and exertype, we would have the following coding for the interaction:

d1e1 = 1*-.5 = -.5
d1e2 = 1*-.5 = -.5
d1e3 = 1*1 = 1
d2e1 = -1*-.5 = .5
d2e2 = -1*-.5 = .5
d2e3 = -1*1 = -1

In order to more easily understand the coding for the interaction, it might help to visualize it as a matrix which equals the product of the contrast code for diet as a column matrix and the contrast coding of exertype as a row matrix.

| | exertype level 1 = -.5 | exertype level 2 = -.5 | exertype level 3 = 1 |
|---|---|---|---|
| diet level 1 = 1 | 1*-.5 = -.5 | 1*-.5 = -.5 | 1*1 = 1 |
| diet level 2 = -1 | -1*-.5 = .5 | -1*-.5 = .5 | -1*1 = -1 |

When writing the estimate statement, it does not matter whether the numbers are written as a matrix or as one long stream of numbers; the results will be the same either way.  Thus, the following two estimate statements are equivalent:

estimate 'exertype 12v3 by diet 1v2' diet*exertype   -.5 -.5  1
.5  .5 -1;
estimate 'exertype 12v3 by diet 1v2' diet*exertype   -.5 -.5  1 .5  .5 -1;

proc mixed data=exercise;
class diet exertype time;
model pulse = exertype|diet|time;
repeated time / subject=id type=arh(1) ;
estimate 'exertype 12v3 by diet 1v2' diet*exertype   -.5 -.5  1
.5  .5 -1;
run;
quit;

<output omitted> 
                                   Estimates
Label                        Estimate   Std Error      DF    t Value    Pr > |t|
exertype 12v3 by diet 1v2    -12.7667      3.9951      24      -3.20      0.0039
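
The same coefficients can also be given to a contrast statement, which reports an F test instead of a t test. The sketch below assumes the same exercise data set and model as above and simply places both statements side by side. Because the contrast has a single row, the F value from the contrast statement should equal the square of the t value from the estimate statement, with the same p-value.

```sas
proc mixed data=exercise;
  class diet exertype time;
  model pulse = exertype|diet|time;
  repeated time / subject=id type=arh(1);
  /* t test of the interaction contrast */
  estimate 'exertype 12v3 by diet 1v2' diet*exertype  -.5 -.5  1
                                                       .5  .5 -1;
  /* F test of the same single-row contrast */
  contrast 'exertype 12v3 by diet 1v2' diet*exertype  -.5 -.5  1
                                                       .5  .5 -1;
run;
quit;
```

The contrast statement becomes more useful when several rows are tested jointly; for a single row the estimate statement has the advantage of also reporting the estimate and its standard error.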


#### Contrast involving a three-way interaction

The graphs also make us suspect that there is a difference between the pulse rate of the third level of exertype at the last time point and the other levels of exertype at the other time points, and that this difference depends on which diet you follow.  So, we want to test the three-way interaction where each variable has a specific contrast coding. For exertype we are contrasting level 3 versus the other two levels; likewise, for time we are contrasting level 3 versus the two other time points and for diet we are contrasting the two diets.  Let's look at the contrast coding for each variable in a table format.

| diet level | coding |
|---|---|
| 1 | 1 |
| 2 | -1 |

| exertype level | coding |
|---|---|
| 1 | -0.5 |
| 2 | -0.5 |
| 3 | 1 |

| time level | coding |
|---|---|
| 1 | -0.5 |
| 2 | -0.5 |
| 3 | 1 |

The coding for the interaction is determined by the order in which the factors appear in the class statement, and in this example the order is: diet, exertype and time.  Therefore, the interaction is coded as: d1e1t1 d1e1t2 d1e1t3 d1e2t1 d1e2t2 d1e2t3 d1e3t1 d1e3t2 d1e3t3 d2e1t1 d2e1t2 d2e1t3 d2e2t1 d2e2t2 d2e2t3 d2e3t1 d2e3t2 d2e3t3.  Furthermore, d# = coding for diet level #, e# = coding for exertype level #, t# = coding for time level # and d#e#t# is the product of the three.  In this case the coding for the interaction is:

d1e1t1 = 1*-.5*-.5 = .25
d1e1t2 = 1*-.5*-.5 = .25
d1e1t3 = 1*-.5*1 = -.5
d1e2t1 = 1*-.5*-.5 = .25
d1e2t2 = 1*-.5*-.5 = .25
d1e2t3 = 1*-.5*1 = -.5
d1e3t1 = 1*1*-.5 = -.5
d1e3t2 = 1*1*-.5 = -.5
d1e3t3 = 1*1*1 = 1

d2e1t1 = -1*-.5*-.5 = -.25
d2e1t2 = -1*-.5*-.5 = -.25
d2e1t3 = -1*-.5*1 = .5
d2e2t1 = -1*-.5*-.5 = -.25
d2e2t2 = -1*-.5*-.5 = -.25
d2e2t3 = -1*-.5*1 = .5
d2e3t1 = -1*1*-.5 = .5
d2e3t2 = -1*1*-.5 = .5
d2e3t3 = -1*1*1 = -1

This can be more conveniently visualized as matrices.  The coding for time, which is the last factor in the class statement, can be thought of as the row matrix that is multiplied by the coding for exertype, which is the second to last factor in the class statement and which can be thought of as the column matrix.  The matrix, which is the product, is then multiplied by the coding for each level of diet, which appears before exertype and time in the class statement.

For diet level 1 = 1:

| | time level 1 = -.5 | time level 2 = -.5 | time level 3 = 1 |
|---|---|---|---|
| exertype level 1 = -.5 | 1*-.5*-.5 = .25 | 1*-.5*-.5 = .25 | 1*-.5*1 = -.5 |
| exertype level 2 = -.5 | 1*-.5*-.5 = .25 | 1*-.5*-.5 = .25 | 1*-.5*1 = -.5 |
| exertype level 3 = 1 | 1*1*-.5 = -.5 | 1*1*-.5 = -.5 | 1*1*1 = 1 |

For diet level 2 = -1:

| | time level 1 = -.5 | time level 2 = -.5 | time level 3 = 1 |
|---|---|---|---|
| exertype level 1 = -.5 | -1*-.5*-.5 = -.25 | -1*-.5*-.5 = -.25 | -1*-.5*1 = .5 |
| exertype level 2 = -.5 | -1*-.5*-.5 = -.25 | -1*-.5*-.5 = -.25 | -1*-.5*1 = .5 |
| exertype level 3 = 1 | -1*1*-.5 = .5 | -1*1*-.5 = .5 | -1*1*1 = -1 |
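
Multiplying 18 coefficients by hand is error-prone, so it may be worth letting SAS compute them. The following data step is only a sketch: the data set name coef3 and the array names d, e and t are made up for this illustration. The loops are nested in the order of the class statement (diet, then exertype, then time), so the innermost factor, time, changes fastest, which is the order the estimate statement expects.

```sas
data coef3;
  /* contrast codings for each factor, indexed by level position */
  array d[2] _temporary_ (1 -1);          /* diet     1 vs 2          */
  array e[3] _temporary_ (-0.5 -0.5 1);   /* exertype 1 and 2 vs 3    */
  array t[3] _temporary_ (-0.5 -0.5 1);   /* time     1 and 2 vs 3    */
  do diet = 1 to 2;
    do exertype = 1 to 3;
      do time = 1 to 3;
        /* interaction coefficient = product of the three codings */
        coef = d[diet] * e[exertype] * t[time];
        output;
      end;
    end;
  end;
run;

proc print data=coef3 noobs;
run;
```

The coef column, read from top to bottom, should reproduce the 18 values listed above. Note that the loop indices are level positions (first, second, third level in sorted order), not necessarily the values stored in the data.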

proc mixed data=exercise;
class diet exertype time;
model pulse = exertype|diet|time;
repeated time / subject=id type=arh(1) ;
estimate 'ex 12v3 by diet 1v2 by time 12v3'
diet*exertype*time    .25  .25 -.5
.25  .25 -.5
-.5  -.5   1
-.25 -.25  .5
-.25 -.25  .5
.5   .5  -1;
run;
quit;

<output omitted>

Estimates

Label                              Estimate  Std Error     DF   t Value   Pr > |t|
ex 12v3 by diet 1v2 by time 12v3   -21.2000     5.7463     48     -3.69     0.0006

#### Contrast involving a four-way interaction

Our model does not include any four-way interactions, but suppose that we had another categorical variable in our model.  Let's call it shoetype. Let's suppose that the subjects in the study were randomly assigned to one of two types of athletic shoes: aerobic shoes and running shoes.  We would like to test the interaction where we contrast the two diets, the two types of shoes, time point 3 versus the other two time points and runners versus non-runners (exertype level 3 versus levels 1 and 2).  Let's write the contrast coding in tables to get a clearer picture.

| shoetype level | coding |
|---|---|
| 1 | 1 |
| 2 | -1 |

| diet level | coding |
|---|---|
| 1 | 1 |
| 2 | -1 |

| exertype level | coding |
|---|---|
| 1 | -0.5 |
| 2 | -0.5 |
| 3 | 1 |

| time level | coding |
|---|---|
| 1 | -0.5 |
| 2 | -0.5 |
| 3 | 1 |

The coding for the interaction is determined by the order in which the factors appear in the class statement, and in this example the order is: shoetype, diet, exertype and time.  Therefore, the interaction is coded as: s1d1e1t1 s1d1e1t2 s1d1e1t3 s1d1e2t1 s1d1e2t2 s1d1e2t3 s1d1e3t1 s1d1e3t2 s1d1e3t3 s1d2e1t1 s1d2e1t2 s1d2e1t3 s1d2e2t1 s1d2e2t2 s1d2e2t3 s1d2e3t1 s1d2e3t2 s1d2e3t3 s2d1e1t1 s2d1e1t2 s2d1e1t3 s2d1e2t1 s2d1e2t2 s2d1e2t3 s2d1e3t1 s2d1e3t2 s2d1e3t3 s2d2e1t1 s2d2e1t2 s2d2e1t3 s2d2e2t1 s2d2e2t2 s2d2e2t3 s2d2e3t1 s2d2e3t2 s2d2e3t3.  Furthermore, s# = coding for shoetype level #, d# = coding for diet level #, e# = coding for exertype level #, t# = coding for time level # and s#d#e#t# is the product of the four.  In this case the coding for the interaction is:

s1d1e1t1 = 1*1*-.5*-.5 = .25
s1d1e1t2 = 1*1*-.5*-.5 = .25
s1d1e1t3 = 1*1*-.5*1 = -.5
s1d1e2t1 = 1*1*-.5*-.5 = .25
s1d1e2t2 = 1*1*-.5*-.5 = .25
s1d1e2t3 = 1*1*-.5*1 = -.5
s1d1e3t1 = 1*1*1*-.5 = -.5
s1d1e3t2 = 1*1*1*-.5 = -.5
s1d1e3t3 = 1*1*1*1 = 1
s1d2e1t1 = 1*-1*-.5*-.5 = -.25
s1d2e1t2 = 1*-1*-.5*-.5 = -.25
s1d2e1t3 = 1*-1*-.5*1 = .5
s1d2e2t1 = 1*-1*-.5*-.5 = -.25
s1d2e2t2 = 1*-1*-.5*-.5 = -.25
s1d2e2t3 = 1*-1*-.5*1 = .5
s1d2e3t1 = 1*-1*1*-.5 = .5
s1d2e3t2 = 1*-1*1*-.5 = .5
s1d2e3t3 = 1*-1*1*1 = -1

s2d1e1t1 = -1*1*-.5*-.5 = -.25
s2d1e1t2 = -1*1*-.5*-.5 = -.25
s2d1e1t3 = -1*1*-.5*1 = .5
s2d1e2t1 = -1*1*-.5*-.5 = -.25
s2d1e2t2 = -1*1*-.5*-.5 = -.25
s2d1e2t3 = -1*1*-.5*1 = .5
s2d1e3t1 = -1*1*1*-.5 = .5
s2d1e3t2 = -1*1*1*-.5 = .5
s2d1e3t3 = -1*1*1*1 = -1
s2d2e1t1 = -1*-1*-.5*-.5 = .25
s2d2e1t2 = -1*-1*-.5*-.5 = .25
s2d2e1t3 = -1*-1*-.5*1 = -.5
s2d2e2t1 = -1*-1*-.5*-.5 = .25
s2d2e2t2 = -1*-1*-.5*-.5 = .25
s2d2e2t3 = -1*-1*-.5*1 = -.5
s2d2e3t1 = -1*-1*1*-.5 = -.5
s2d2e3t2 = -1*-1*1*-.5 = -.5
s2d2e3t3 = -1*-1*1*1 = 1

This can also be more conveniently visualized as matrices.  The coding for time, which is the last factor in the class statement, can be thought of as the row matrix that is multiplied by the coding for exertype, which is the second to last factor in the class statement and which can be thought of as the column matrix.  The matrix, which is the product, is then multiplied by the coding for each combination of shoetype and diet levels; both of these factors appear before exertype and time in the class statement.

For shoetype level 1 = 1 and diet level 1 = 1:

| | time level 1 = -.5 | time level 2 = -.5 | time level 3 = 1 |
|---|---|---|---|
| exertype level 1 = -.5 | 1*1*-.5*-.5 = .25 | 1*1*-.5*-.5 = .25 | 1*1*-.5*1 = -.5 |
| exertype level 2 = -.5 | 1*1*-.5*-.5 = .25 | 1*1*-.5*-.5 = .25 | 1*1*-.5*1 = -.5 |
| exertype level 3 = 1 | 1*1*1*-.5 = -.5 | 1*1*1*-.5 = -.5 | 1*1*1*1 = 1 |

For shoetype level 1 = 1 and diet level 2 = -1:

| | time level 1 = -.5 | time level 2 = -.5 | time level 3 = 1 |
|---|---|---|---|
| exertype level 1 = -.5 | 1*-1*-.5*-.5 = -.25 | 1*-1*-.5*-.5 = -.25 | 1*-1*-.5*1 = .5 |
| exertype level 2 = -.5 | 1*-1*-.5*-.5 = -.25 | 1*-1*-.5*-.5 = -.25 | 1*-1*-.5*1 = .5 |
| exertype level 3 = 1 | 1*-1*1*-.5 = .5 | 1*-1*1*-.5 = .5 | 1*-1*1*1 = -1 |

For shoetype level 2 = -1 and diet level 1 = 1:

| | time level 1 = -.5 | time level 2 = -.5 | time level 3 = 1 |
|---|---|---|---|
| exertype level 1 = -.5 | -1*1*-.5*-.5 = -.25 | -1*1*-.5*-.5 = -.25 | -1*1*-.5*1 = .5 |
| exertype level 2 = -.5 | -1*1*-.5*-.5 = -.25 | -1*1*-.5*-.5 = -.25 | -1*1*-.5*1 = .5 |
| exertype level 3 = 1 | -1*1*1*-.5 = .5 | -1*1*1*-.5 = .5 | -1*1*1*1 = -1 |

For shoetype level 2 = -1 and diet level 2 = -1:

| | time level 1 = -.5 | time level 2 = -.5 | time level 3 = 1 |
|---|---|---|---|
| exertype level 1 = -.5 | -1*-1*-.5*-.5 = .25 | -1*-1*-.5*-.5 = .25 | -1*-1*-.5*1 = -.5 |
| exertype level 2 = -.5 | -1*-1*-.5*-.5 = .25 | -1*-1*-.5*-.5 = .25 | -1*-1*-.5*1 = -.5 |
| exertype level 3 = 1 | -1*-1*1*-.5 = -.5 | -1*-1*1*-.5 = -.5 | -1*-1*1*1 = 1 |

The code for this interaction would be:

proc mixed data=exercise;
class shoetype diet exertype time;
model pulse = shoetype|exertype|diet|time;
repeated time / subject=id type=arh(1) ;
estimate 'sh 1v2 & d 1v2 & ex 12v3 & t 12v3'
shoetype*diet*exertype*time    .25  .25 -.5
.25  .25 -.5
-.5  -.5   1
-.25 -.25  .5
-.25 -.25  .5
.5   .5  -1
-.25 -.25  .5
-.25 -.25  .5
.5   .5  -1
.25  .25 -.5
.25  .25 -.5
-.5  -.5   1;
run;
quit;
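
With 36 coefficients, typing the list by hand invites mistakes. One possible approach, sketched here under assumptions, is to generate the coefficients in a data step and store them in a macro variable. The names coef4 and coefs, and the array names s, d, e and t, are hypothetical and chosen only for this illustration; shoetype is the hypothetical factor introduced above. The loops follow the order of the class statement, so the last factor, time, changes fastest.

```sas
data coef4;
  /* contrast codings for each factor, indexed by level position */
  array s[2] _temporary_ (1 -1);          /* shoetype 1 vs 2       */
  array d[2] _temporary_ (1 -1);          /* diet     1 vs 2       */
  array e[3] _temporary_ (-0.5 -0.5 1);   /* exertype 1 and 2 vs 3 */
  array t[3] _temporary_ (-0.5 -0.5 1);   /* time     1 and 2 vs 3 */
  do shoetype = 1 to 2;
    do diet = 1 to 2;
      do exertype = 1 to 3;
        do time = 1 to 3;
          /* interaction coefficient = product of the four codings */
          coef = s[shoetype] * d[diet] * e[exertype] * t[time];
          output;
        end;
      end;
    end;
  end;
run;

/* Collect the 36 coefficients, in class-statement order, into &coefs */
proc sql noprint;
  select coef into :coefs separated by ' '
    from coef4
    order by shoetype, diet, exertype, time;
quit;

%put &coefs;
```

The macro variable could then replace the typed list in the estimate statement, for example `estimate 'sh 1v2 & d 1v2 & ex 12v3 & t 12v3' shoetype*diet*exertype*time &coefs;`. It is still a good idea to compare the values written to the log with the hand-computed list above before relying on them.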
