UCLA Academic Technology Services

Regression with Stata
Chapter 7: More on interactions of categorical and continuous variables

This is a draft version of this chapter.  Comments and suggestions to improve this draft are welcome.

Chapter Outline
    1. Continuous and Categorical Predictors without Interaction
    2. Continuous and Categorical Predictors with Interaction
    3. Show slopes for each group
         3.1 Show slopes by performing separate analyses
         3.2 Show slopes for each group from one analysis
    4. Compare slopes across groups
    5. Simple effects and simple comparisons of group, strategy 1
         5.1 Simple effects and comparisons when meals is 1 sd below mean
         5.2 Simple effects and comparisons when meals is at the mean
         5.3 Simple effects and comparisons when meals is 1 sd above the mean
    6. Simple effects and simple comparisons of group, strategy 2
        6.1 Simple effects and comparisons when meals is 1 sd below mean
        6.2 Simple effects and comparisons when meals is at the mean
        6.3 Simple effects and comparisons when meals is 1 sd above mean
    7. Interaction Comparison
    8. More on predicted values

In this chapter we continue to use the elemapi2 data file, which we can load as shown below.

use http://www.ats.ucla.edu/stat/stata/webbooks/reg/elemapi2, clear

1. Continuous and Categorical Predictors without Interaction

In this model we have collcat as a categorical variable with 3 levels and meals as a continuous variable. We will use reverse Helmert coding for collcat, which will be useful for later analyses.

This uses the xi3 command. You can download xi3 from within Stata by typing findit xi3 in the command line (see How can I use the findit command to search for programs and get additional help? for more information about using findit).

xi3: regress api00 r.collcat meals

If we look at the predicted values, we can see that there are 3 parallel regression lines, one for each level of collcat. We graph the predicted values below.

predict yhat
separate yhat , by(collcat)
graph twoway scatter yhat1 yhat2 yhat3 meals, connect(l l l) clpattern(solid longdash dot) msymbol(i i i)

We drop the variables yhat yhat1 yhat2 yhat3 so that these names are available if we want to create them again later.

drop yhat yhat1 yhat2 yhat3

We could also get these results using the postgr3 command. You can download postgr3 from within Stata by typing findit postgr3 in the command line (see How can I use the findit command to search for programs and get additional help? for more information about using findit).

postgr3 meals, by(collcat) 
Variables left asis: meals _Icollcat_2 _Icollcat_3
(option xb assumed; fitted values)

If we wanted to indicate the different groups using different types of lines, we could use the clpattern option.

postgr3 meals, by(collcat) clpattern(solid longdash dot)
Variables left asis: meals _Icollcat_2 _Icollcat_3
(option xb assumed; fitted values)

2. Continuous and Categorical Predictors with Interaction

The prior analysis assumed that the slopes for the 3 collcat groups are the same. We can test this assumption by including an interaction of collcat by meals.

xi3: regress api00 r.collcat*meals

We can then test the overall interaction, and we see that the interaction is significant.

test _Ico2Xme _Ico3Xme
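
As a cross-check, the same overall test can be sketched with Stata's built-in factor-variable notation (available in Stata 11 and later) rather than xi3. This is an alternative, not the approach used in this chapter: i.collcat uses indicator coding instead of reverse Helmert coding, but the joint test of the interaction terms does not depend on the coding scheme. The stored-estimates name m_xi3 is made up for this illustration. Storing and restoring the estimates keeps the xi3 model active for the commands that follow.

```stata
* Keep the xi3 model so that later commands still refer to it
* (m_xi3 is a hypothetical name)
estimates store m_xi3
* Refit the same model with indicator coding via factor variables
regress api00 i.collcat##c.meals
* Joint test of the two collcat-by-meals interaction terms
testparm i.collcat#c.meals
* Make the xi3 model the active estimates again
estimates restore m_xi3
```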

Again we use the postgr3 command to show the graph of the regression lines between meals and api00 for the 3 levels of collcat and we see that the lines are not parallel (as we would expect because of the significant interaction we found above).

postgr3 meals, by(collcat) clpattern(solid longdash dot)
Variables left asis: meals _Icollcat_2 _Icollcat_3 _Ico2Xme _Ico3Xme
(option xb assumed; fitted values)

3. Show slopes for each group

3.1 Show slopes by performing separate analyses

We can perform separate analyses by collcat to get the slope of meals at each level of collcat.

bysort collcat : regress api00 meals

3.2 Show slopes for each group from one analysis

We can also get the slope of meals at each level of collcat via one overall regression. One way is to dummy code the categorical variable collcat and to create interaction terms with the variable meals. The trick is to include all of the interaction terms while leaving both meals and one of the categories of collcat out of the regression. Then the coefficient for each interaction term is the slope of meals for that group of collcat. In our case, the slope of meals is -4.138392 for group 1 of collcat, -4.110242 for group 2, and -3.329426 for group 3, as shown in the regression output below.

tab collcat, gen(cd)
    collcat |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |        129       32.25       32.25
          2 |        134       33.50       65.75
          3 |        137       34.25      100.00
------------+-----------------------------------
      Total |        400      100.00
gen cdm1 = cd1*meals
gen cdm2 = cd2*meals
gen cdm3 = cd3*meals
regress api00 cd1 cd2 cdm1 cdm2 cdm3
      Source |       SS       df       MS              Number of obs =     400
-------------+------------------------------           F(  5,   394) =  361.86
       Model |  6629929.87     5  1325985.97           Prob > F      =  0.0000
    Residual |  1443742.13   394  3664.32012           R-squared     =  0.8212
-------------+------------------------------           Adj R-squared =  0.8189
       Total |  8073672.00   399  20234.7669           Root MSE      =  60.534
------------------------------------------------------------------------------
       api00 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         cd1 |   21.28174   16.87625     1.26   0.208    -11.89701    54.46049
         cd2 |   31.57666   16.02623     1.97   0.050       .06904    63.08428
        cdm1 |  -4.138392   .1548438   -26.73   0.000    -4.442816   -3.833969
        cdm2 |  -4.110242    .159782   -25.72   0.000    -4.424373    -3.79611
        cdm3 |  -3.329426    .204061   -16.32   0.000     -3.73061   -2.928241
       _cons |   864.8508   11.78298    73.40   0.000     841.6854    888.0162
------------------------------------------------------------------------------
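
Because each cdm coefficient is a group slope, differences among the slopes can be examined directly from this model. The commands below are a minimal sketch that assumes the regression above is still the active model. From the coefficients shown, the first difference is about 0.028 and the second is about 0.795.

```stata
* Slope for group 2 minus slope for group 1
lincom cdm2 - cdm1
* Slope for group 3 minus the average slope of groups 1 and 2
lincom cdm3 - .5*cdm1 - .5*cdm2
* Joint test that all three slopes are equal
test (cdm1 = cdm2) (cdm2 = cdm3)
```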

4. Compare slopes across groups

We compare the slope for group 3 vs. groups 1 and 2 combined, and the slope for group 2 vs. group 1.

xi3: regress api00 r.collcat*meals
desc _Ico2Xme _Ico3Xme

The slopes for groups 1 and 2 do not differ, but the slope for group 3 differs from the slopes for groups 1 and 2 combined.

5. Simple effects and simple comparisons of group, strategy 1

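We test the simple effects and simple comparisons of collcat when meals is one standard deviation below the mean, at the mean, and one standard deviation above the mean. First we use summarize to get the mean and standard deviation of meals.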

summarize meals
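
Rather than copying the mean and standard deviation by hand, the three values can be computed from the results that summarize leaves behind in r(mean) and r(sd). This is a minimal sketch that assumes the elemapi2 data are in memory. The scalar names m_lo, m_mn and m_hi are made up for this illustration.

```stata
* summarize stores the mean in r(mean) and the standard deviation in r(sd)
quietly summarize meals
* One sd below the mean, the mean, and one sd above the mean
scalar m_lo = r(mean) - r(sd)
scalar m_mn = r(mean)
scalar m_hi = r(mean) + r(sd)
display scalar(m_lo) "  " scalar(m_mn) "  " scalar(m_hi)
```

These should correspond to the values 28.4033, 60.315 and 92.2267 used in the rest of this section.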

Below we show the graph with a vertical line at each of these values of meals. We will test the simple effect of collcat at the 3 places where the vertical lines are drawn in the graph.

postgr3 meals, by(collcat) clpattern(solid longdash dot) xline(28.4033 60.315 92.2267) 
Variables left asis: meals _Icollcat_2 _Icollcat_3 _IcolXmeals_2 _IcolXmeals_3
(option xb assumed; fitted values)

5.1 Simple effects and comparisons when meals is 1 sd below mean

generate mealslow = meals-28.4033
xi3: regress api00 r.collcat*mealslow

Below we show the simple effect of collcat when meals is one standard deviation below the mean (i.e. when meals is 28.4033).

test _Icollcat_2 _Icollcat_3

If we want a simple comparison of 2 vs 1 when meals is 28.4, we can simply inspect the test for _Icollcat_2, since this compares levels 2 and 1 of collcat.

We can obtain predicted values for when collcat is 1 and 2 at meals=28.4 as shown below.

lincom _cons + -.5*_Icollcat_2 + -.333333*_Icollcat_3
lincom _cons +  .5*_Icollcat_2 + -.333333*_Icollcat_3

We can obtain the simple comparison that compares level 3 of collcat vs levels 1 and 2 combined when meals is 28.4 by inspecting the coefficient for _Icollcat_3. We can obtain predicted values for levels 1 and 2 of collcat combined and for level 3 of collcat when meals=28.4 as shown below.

lincom _cons + 0*_Icollcat_2 + -.333333*_Icollcat_3
lincom _cons + 0*_Icollcat_2 +  .666667*_Icollcat_3
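
The weights -.5, .5, -.333333 and .666667 in the lincom commands above are the values that the two reverse Helmert coded variables take for each level of collcat. As a sketch of where they come from, we can invert the matrix whose rows are the grand mean and the two contrasts (2 vs. 1, and 3 vs. the mean of 1 and 2). The matrix names K and B are made up for this illustration.

```stata
* Rows: grand mean, level 2 vs. level 1, level 3 vs. mean of levels 1 and 2
matrix K = (1/3, 1/3, 1/3 \ -1, 1, 0 \ -.5, -.5, 1)
* Rows of the inverse correspond to collcat levels 1, 2 and 3
* Columns of the inverse: constant, code for _Icollcat_2, code for _Icollcat_3
matrix B = inv(K)
matrix list B
```

Row 1 of B holds the weights 1, -.5 and -1/3 used for group 1, row 2 holds 1, .5 and -1/3 for group 2, and row 3 holds 1, 0 and 2/3 for group 3. Averaging rows 1 and 2 gives the weights 1, 0 and -1/3 used for groups 1 and 2 combined.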

5.2 Simple effects and comparisons when meals is at the mean

We can obtain simple effects and simple comparisons of collcat when meals is at the mean much like we did above, as shown below.

generate mealsmn = meals-(60.315)
xi3: regress api00 r.collcat*mealsmn

Below we test the simple effect of collcat when meals is at the mean.

test  _Icollcat_2 _Icollcat_3

The simple comparison of 2 vs 1 when meals is 60.3 can be seen in the test of _Icollcat_2, and the simple comparison of 3 vs 2 and 1 when meals is 60.3 can be seen in the test of _Icollcat_3.

5.3 Simple effects and comparisons when meals is 1 sd above the mean

The simple effects and simple comparisons of collcat when meals is one standard deviation above the mean (i.e. 60.315 + 31.9117) are illustrated below.

generate mealshi = meals - 92.2267
xi3: regress api00 r.collcat*mealshi

Below we test the simple effect of collcat when meals is one standard deviation above the mean.

test  _Icollcat_2 _Icollcat_3

The simple comparison of 2 vs. 1 when meals is 92.22 can be seen in the test of _Icollcat_2, and the simple comparison of 3 vs. 2 and 1 when meals is 92.22 can be seen in the test of _Icollcat_3.
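
The three analyses in this section differ only in the value subtracted from meals, so they could also be run in a loop. This is a sketch under the assumption that xi3 recreates its _I variables on each call (as xi does). The variable name mealsc is made up for this illustration.

```stata
* Center meals at mean - 1sd, the mean, and mean + 1sd in turn
foreach m of numlist 28.4033 60.315 92.2267 {
    display "collcat effects when meals = `m'"
    capture drop mealsc
    generate mealsc = meals - `m'
    xi3: regress api00 r.collcat*mealsc
    * Simple effect of collcat at this value of meals
    test _Icollcat_2 _Icollcat_3
}
```

Within each pass, the coefficients for _Icollcat_2 and _Icollcat_3 in the regression output are the simple comparisons at that value of meals.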

6. Simple effects and simple comparisons of group, strategy 2

This strategy shows how to do these same comparisons all via one model.

xi3: regress api00 r.collcat*meals

6.1 Simple effects and comparisons when meals is 1 sd below mean

We test the simple effect of collcat when meals is one standard deviation below the mean, i.e. 28.403.

test _Icollcat_2+ 28.403*_Ico2Xme =0
test _Icollcat_3+ 28.403*_Ico3Xme =0, accum

We compare collcat groups 2 vs. 1 when meals is 28.403.

lincom _Icollcat_2+ 28.403*_Ico2Xme

We obtain predicted values for collcat groups 1 and 2 when meals is 28.403.

lincom _cons + -.5*_Icollcat_2 + -.333333*_Icollcat_3 + 28.403*meals + ///
         (-.5*28.403)*_Ico2Xme + (-.333333*28.403)*_Ico3Xme
lincom _cons +  .5*_Icollcat_2 + -.333333*_Icollcat_3 + 28.403*meals + ///
         ( .5*28.403)*_Ico2Xme + (-.333333*28.403)*_Ico3Xme

We test the simple comparison of groups 1 and 2 vs. 3 when meals is one standard deviation below the mean.

lincom  _Icollcat_3 + 28.403*_Ico3Xme 

We obtain predicted values for groups 1 and 2 combined and for group 3.

lincom _cons +   0*_Icollcat_2 + -.333333*_Icollcat_3 + 28.403*meals + ///
         (0*28.403)*_Ico2Xme + (-.333333*28.403)*_Ico3Xme
lincom _cons +   0*_Icollcat_2 +  .666667*_Icollcat_3 + 28.403*meals + ///
         (0*28.403)*_Ico2Xme + ( .666667*28.403)*_Ico3Xme

6.2 Simple effects and comparisons when meals is at the mean

We test the simple effect of collcat when meals is at the mean.

test _Icollcat_2+ 60.315*_Ico2Xme =0
test _Icollcat_3+ 60.315*_Ico3Xme =0, accum

We test the simple comparison of 2 vs. 1.

lincom _Icollcat_2+ 60.315*_Ico2Xme

We test the simple comparison of 1 and 2 vs. 3.

lincom _Icollcat_3+ 60.315*_Ico3Xme 

6.3 Simple effects and comparisons when meals is 1 sd above mean

We test the simple effect of collcat when meals is one standard deviation above the mean.

test _Icollcat_2+ 92.23*_Ico2Xme =0
test _Icollcat_3+ 92.23*_Ico3Xme =0, accum

We test the simple comparison of 2 vs. 1.

lincom _Icollcat_2+ 92.23*_Ico2Xme

We test the simple comparison of 1 and 2 vs. 3.

lincom _Icollcat_3+ 92.23*_Ico3Xme
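
Since the simple comparisons in sections 6.1 to 6.3 differ only in the value of meals, they can also be sketched as a loop. This assumes the model from xi3: regress api00 r.collcat*meals is still the active model, and it uses the interaction names _Ico2Xme and _Ico3Xme as typed in this chapter. Your version of xi3 may name these terms differently, so check the regression output first.

```stata
* Simple comparisons at mean - 1sd, the mean, and mean + 1sd of meals
foreach m of numlist 28.403 60.315 92.23 {
    display "comparisons when meals = `m'"
    * Group 2 vs. group 1 at this value of meals
    lincom _Icollcat_2 + `m'*_Ico2Xme
    * Group 3 vs. groups 1 and 2 combined at this value of meals
    lincom _Icollcat_3 + `m'*_Ico3Xme
}
```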

7. Interaction Comparison

We can compare the size of the effect of collcat at one level of meals to the size of the effect of collcat at another level of meals.

We test whether the effect of 2 vs. 1 differs between meals at the mean and meals one standard deviation below the mean.

lincom (_Icollcat_2+ 60.315*_Ico2Xme) - (_Icollcat_2+ 28.403*_Ico2Xme) 

This is the same as

lincom 31.912 * _Ico2Xme

Notice that this is the same, after rounding error, as the difference computed below.

display 12.89125-11.99283
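
The multiplier 31.912 is simply the distance between the two values of meals. In the longer lincom expression the _Icollcat_2 terms cancel, leaving (60.315 - 28.403) times the interaction coefficient. A quick check of that arithmetic:

```stata
* Distance between meals at the mean and meals one sd below the mean
display 60.315 - 28.403
```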

We test whether the effect of 1 and 2 vs. 3 differs between meals one standard deviation below the mean and meals at the mean.

lincom (_Icollcat_3+ 60.315*_Ico3Xme) - (_Icollcat_3 + 28.403*_Ico3Xme)

This is the same as

lincom 31.912 * _Ico3Xme

Notice that this is the same, after rounding error, as

display 46.8836 - 21.51465 

8. More on predicted values

We now show how to get predicted values, also known as adjusted means in ANCOVA terminology. Below we get the predicted values for groups 1, 2 and 3 when meals is 21.

lincom _cons + 21*meals +  -.5*_Icollcat_2 + -.333333*_Icollcat_3 + ///
         (21*-.5)*_Ico2Xme + (21*-.33333)*_Ico3Xme
lincom _cons + 21*meals +   .5*_Icollcat_2 + -.333333*_Icollcat_3 + ///
         (21* .5)*_Ico2Xme + (21*-.33333)*_Ico3Xme

 ( 1)  .5 _Icollcat_2 - .333333 _Icollcat_3 + 21.0 meals + 10.5 _IcolXmeals_2 - 
       6.99993 _IcolXmeals_3 + _cons = 0.0

------------------------------------------------------------------------------
       api00 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   810.1124   8.084814   100.20   0.000     794.2177    826.0072
------------------------------------------------------------------------------
lincom _cons + 21*meals +    0*_Icollcat_2 +  .666667*_Icollcat_3 + ///
         (21*  0)*_Ico2Xme + (21* .66667)*_Ico3Xme

In case we don't trust our computations, we can use the predict command to verify that we have gotten the correct predicted values. We use the predict command and then look at the predicted values when meals is 21 for each level of collcat. This works because we checked that we really do have meals=21 in our data file for each level of collcat.

predict pred
list meals collcat _Icollcat_2 _Icollcat_3 pred if meals==21
log close
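
As an alternative sketch, the same adjusted means can be obtained without writing out the contrast weights by using factor-variable notation and the margins command (Stata 11 and later). This is not the approach used in this chapter, and it replaces the xi3 model as the active estimates.

```stata
* Same model with indicator coding; the fitted values match the xi3 model
regress api00 i.collcat##c.meals
* Predicted api00 for each level of collcat when meals is 21
margins collcat, at(meals=21)
```

The second row of the margins table should agree, up to rounding of the hand-typed weights, with the lincom result of 810.1124 shown above for group 2.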

These pages are Copyrighted (c) by UCLA Academic Technology Services

