
Stata FAQ
How can I understand a continuous by continuous interaction?

First off, let's start with what a significant continuous by continuous interaction means. It means that the slope of the response variable on one continuous predictor changes as the values of a second continuous predictor change.

We will use an example from the dataset hsb2 that has a statistically significant continuous by continuous interaction to illustrate one explanatory approach. The continuous variables math and socst are standardized test scores for math and social studies, respectively. We will make use of the command xi3, written by ATS (UCLA Academic Technology Services), which allows us to easily include continuous by continuous interactions in the regress command. To obtain xi3, just type the command findit xi3 (see How can I use the findit command to search for programs?).

use http://www.ats.ucla.edu/stat/stata/notes/hsb2, clear

xi3: regress read math*socst

      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  3,   196) =   78.61
       Model |  11424.7622     3  3808.25406           Prob > F      =  0.0000
    Residual |  9494.65783   196  48.4421318           R-squared     =  0.5461
-------------+------------------------------           Adj R-squared =  0.5392
       Total |    20919.42   199  105.122714           Root MSE      =    6.96

------------------------------------------------------------------------------
        read |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        math |  -.1105123   .2916338    -0.38   0.705    -.6856552    .4646307
       socst |  -.2200442   .2717539    -0.81   0.419    -.7559812    .3158928
     _ImaXso |   .0112807   .0052294     2.16   0.032     .0009677    .0215938
       _cons |   37.84271   14.54521     2.60   0.010     9.157506    66.52792
------------------------------------------------------------------------------
Looking at the model above we see a significant math by socst interaction. At this point all we can say is that for every unit increase in socst the slope of math will increase by .0113 units. We could, of course, discuss the change in slope of socst for each unit change in math but for this example we will stick to the first interpretation.
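
To make this statement concrete, the fitted model can be written out and differentiated with respect to math. The symbols b_0 through b_3 are notation introduced here for the coefficients _cons, math, socst and _ImaXso; they do not appear in the Stata output.

```latex
\begin{align*}
\widehat{\text{read}} &= b_0 + b_1\,\text{math} + b_2\,\text{socst} + b_3\,(\text{math}\times\text{socst}) \\
\frac{\partial\,\widehat{\text{read}}}{\partial\,\text{math}} &= b_1 + b_3\,\text{socst}
  \approx -0.1105 + 0.0113\,\text{socst}
\end{align*}
```

Read this way, the coefficient for math (-.1105) is the slope of math only when socst equals zero, and each additional unit of socst adds .0113 to that slope.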

We know that the slope of math will increase as socst increases, but it can be difficult to visualize this. It may be helpful to think of socst as an ordered categorical variable (low, medium, high) and then ask: how does the slope of math change as socst values move from low to medium to high? For our purposes low will be one standard deviation below the mean, medium will be at the mean, and high will be one standard deviation above the mean.

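Additionally, the slopes for both math and socst do not appear to be significant, but this is an artifact of the high collinearity of math and socst with the interaction. In the uncentered model each of these coefficients is the slope when the other variable equals zero, a value well outside the observed range of either variable. You will see that when we center the variables, so that the slopes for math and socst are evaluated at the mean values, the picture becomes clearer.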
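
One way to inspect this collinearity is to build the product term by hand and correlate it with its two components. This is a minimal sketch rather than part of the original example; the variable name mathXsocst is hypothetical and is dropped again so that it does not interfere with the commands that follow.

```stata
/* hypothetical helper variable: the raw product of the two predictors */
generate mathXsocst = math*socst

/* correlations among the uncentered predictors and their product */
correlate math socst mathXsocst

/* remove the helper variable again */
drop mathXsocst
```

Large correlations between the product and its components would be consistent with the inflated standard errors for math and socst in the model above.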

In this next section we will begin by centering math and socst at their means, i.e., setting socst to its "medium" value.

summarize socst

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       socst |       200      52.405    10.73579         26         71

/* save the mean and sd of socst as global macro variables */
global mean = r(mean)
global sd = r(sd)

/* create centered socst variable, soc1 */
generate soc1 = socst-$mean

/* create centered math variable, cmath */
summarize math

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        math |       200      52.645    9.368448         33         75

global mean2 = r(mean)
generate cmath = math-$mean2

summarize cmath

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       cmath |       200   -1.97e-07    9.368448    -19.645     22.355
The new variables cmath and soc1 are math and socst centered at their respective means. The descriptive statistics for cmath show that it has a range of roughly -20 to +23, which will be useful to know when we get around to graphing.

Next, we will run a regression using the centered predictor variables.

xi3: regress read cmath*soc1

      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  3,   196) =   78.61
       Model |  11424.7622     3  3808.25407           Prob > F      =  0.0000
    Residual |   9494.6578   196  48.4421316           R-squared     =  0.5461
-------------+------------------------------           Adj R-squared =  0.5392
       Total |    20919.42   199  105.122714           Root MSE      =    6.96

------------------------------------------------------------------------------
        read |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       cmath |    .480654   .0637009     7.55   0.000     .3550268    .6062812
        soc1 |   .3738294   .0555457     6.73   0.000     .2642855    .4833733
     _IcmXso |   .0112807   .0052294     2.16   0.032     .0009677    .0215938
       _cons |   51.61533   .5686851    90.76   0.000      50.4938    52.73685
------------------------------------------------------------------------------

/* save slope and intercept as global macro variables */
global b1 = _b[cmath]
global c1 = _b[_cons]
Let's interpret the coefficients for this model. The constant (51.62) is the predicted value of read when cmath and soc1 are zero, i.e., when math and socst are at their mean values. The slope of read on cmath is .48 (and is significant) when soc1 is zero, which also makes the interaction term _IcmXso zero. Further, for every one-unit increase in soc1 the slope of cmath increases by .0113 units.

We can graph the regression of read on cmath as a single line because we are holding socst constant at its mean value.

twoway (function $c1 + $b1*x, range(-20 23)) ///
        (scatter read cmath, msym(oh) jitter(2)), ///
        legend(off)

This has worked out well, so we will go ahead and center socst at two additional values: low, the mean - 1 sd, and high, the mean + 1 sd.
/* create variable soc2, centered at mean + 1 sd of socst */
generate soc2 = socst-($mean + $sd)

xi3: regress read cmath*soc2

      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  3,   196) =   78.61
       Model |  11424.7622     3  3808.25406           Prob > F      =  0.0000
    Residual |  9494.65781   196  48.4421317           R-squared     =  0.5461
-------------+------------------------------           Adj R-squared =  0.5392
       Total |    20919.42   199  105.122714           Root MSE      =    6.96

------------------------------------------------------------------------------
        read |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       cmath |   .6017615   .0774773     7.77   0.000     .4489654    .7545576
        soc2 |   .3738294   .0555457     6.73   0.000     .2642855    .4833733
     _IcmXso |   .0112807   .0052294     2.16   0.032     .0009677    .0215938
       _cons |   55.62868   .7894084    70.47   0.000     54.07186    57.18551
------------------------------------------------------------------------------

/* save slope and intercept as global macro variables */
global b2 = _b[cmath]
global c2 = _b[_cons]

/* create variable soc3, centered at mean - 1 sd of socst */
generate soc3 = socst-($mean - $sd)

xi3: regress read cmath*soc3

      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  3,   196) =   78.61
       Model |  11424.7622     3  3808.25407           Prob > F      =  0.0000
    Residual |   9494.6578   196  48.4421316           R-squared     =  0.5461
-------------+------------------------------           Adj R-squared =  0.5392
       Total |    20919.42   199  105.122714           Root MSE      =    6.96

------------------------------------------------------------------------------
        read |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       cmath |   .3595465   .0917421     3.92   0.000      .178618    .5404749
        soc3 |   .3738294   .0555457     6.73   0.000     .2642855    .4833733
     _IcmXso |   .0112807   .0052294     2.16   0.032     .0009677    .0215938
       _cons |   47.60197   .8572345    55.53   0.000     45.91138    49.29256
------------------------------------------------------------------------------

/* save slope and intercept as global macro variables */
global b3 = _b[cmath]
global c3 = _b[_cons]
You will note that the slopes and intercepts differ in these last two models. Below, we present a summary of the slopes and intercepts for cmath at the three centered values of socst. In the summary, we include neither the coefficient for socst nor the interaction because their values are the same in all three centered models. Note that math in the displayed equations stands for the centered variable cmath.
display as txt "regression equations for math on read at 3 values of socst"
display as txt "at +1sd:  $c2 + $b2*math"
display as txt "at mean:  $c1 + $b1*math"
display as txt "at -1sd:  $c3 + $b3*math"

regression equations for math on read at 3 values of socst
at +1sd:  55.62868289662283 + .6017614777967648*math
at mean:  51.61532731557158 + .4806539733117885*math 
at -1sd:  47.60197182980541 + .3595464625651616*math
What the summary table shows us is that as socst increases from one standard deviation below the mean, through the mean, to one sd above the mean, the slope of read on cmath gets larger, as does the intercept. Below is a graph that includes all three regression lines along with a scatterplot of the data.
twoway (function $c2 + $b2*x, range(-20 20))      ///
        (function $c1 + $b1*x, range(-20 20))     ///
        (function $c3 + $b3*x, range(-20 20))     ///
        (scatter read cmath, msym(oh) jitter(2)), ///
        legend(order(1 "+1sd" 2 "mean" 3 "-1sd"))

In the graph above, the top line represents the regression of read on math when the value of socst is held constant at one standard deviation above its mean. The middle line represents the regression when socst is held constant at its mean value. And the bottom line is for the case when socst is held constant one standard deviation below the mean.
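
The three lines are linked by simple arithmetic. Moving socst up or down by one standard deviation (10.73579) shifts the slope of cmath by the interaction coefficient times the sd, and shifts the intercept by the socst coefficient times the sd. The worked figures below use only the coefficients reported in the centered models above, rounded to four decimals.

```latex
\begin{align*}
\text{slope at } \pm 1\,\text{sd} &= 0.4807 \pm 0.0112807 \times 10.73579
  = 0.4807 \pm 0.1211 \;\Rightarrow\; 0.6018,\ 0.3595 \\
\text{intercept at } \pm 1\,\text{sd} &= 51.6153 \pm 0.3738294 \times 10.73579
  = 51.6153 \pm 4.0134 \;\Rightarrow\; 55.6287,\ 47.6020
\end{align*}
```

These agree with the coefficients for cmath and _cons in the soc2 and soc3 models.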

There is also a way to compute the slope coefficients and constants without centering either of the predictor variables using the lincom command. We will use the global macro variables that have already been created but we will need to rerun the original regression model.

xi3: regress read math*socst

      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  3,   196) =   78.61
       Model |  11424.7622     3  3808.25406           Prob > F      =  0.0000
    Residual |  9494.65783   196  48.4421318           R-squared     =  0.5461
-------------+------------------------------           Adj R-squared =  0.5392
       Total |    20919.42   199  105.122714           Root MSE      =    6.96

------------------------------------------------------------------------------
        read |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        math |  -.1105123   .2916338    -0.38   0.705    -.6856552    .4646307
       socst |  -.2200442   .2717539    -0.81   0.419    -.7559812    .3158928
     _ImaXso |   .0112807   .0052294     2.16   0.032     .0009677    .0215938
       _cons |   37.84271   14.54521     2.60   0.010     9.157506    66.52792
------------------------------------------------------------------------------
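
The lincom expressions that follow can be motivated by substituting math = cmath + m into the uncentered model and fixing socst at a value s. Here m stands for the mean of math ($mean2), s for the chosen value of socst, and b_0 through b_3 for the coefficients _cons, math, socst and _ImaXso; this notation is introduced here for illustration and is not part of the Stata output.

```latex
\begin{align*}
\widehat{\text{read}} &= b_0 + b_1(\text{cmath} + m) + b_2\,s + b_3(\text{cmath} + m)\,s \\
  &= \underbrace{\bigl(b_0 + b_1 m + b_2 s + b_3 m s\bigr)}_{\text{intercept}}
   + \underbrace{\bigl(b_1 + b_3 s\bigr)}_{\text{slope}}\,\text{cmath}
\end{align*}
```

The slope term corresponds to the first lincom command in each pair below and the intercept term to the second.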

/* high: at mean plus 1 sd */

lincom math + ($mean+$sd)*_ImaXso

 ( 1)  math + 63.14079 _ImaXso = 0

------------------------------------------------------------------------------
        read |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   .6017615   .0774773     7.77   0.000     .4489654    .7545576
------------------------------------------------------------------------------

lincom _cons + ($mean+$sd)*socst + ($mean+$sd)*($mean2)*_ImaXso + ($mean2)*math

 ( 1)  52.645 math + 63.14079 socst + 3324.047 _ImaXso + _cons = 0

------------------------------------------------------------------------------
        read |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   55.62868   .7894084    70.47   0.000     54.07186    57.18551
------------------------------------------------------------------------------

/* medium: at the mean */

lincom math + ($mean)*_ImaXso

 ( 1)  math + 52.405 _ImaXso = 0

------------------------------------------------------------------------------
        read |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |    .480654   .0637009     7.55   0.000     .3550268    .6062811
------------------------------------------------------------------------------

lincom _cons + ($mean)*socst + ($mean)*($mean2)*_ImaXso + ($mean2)*math

 ( 1)  52.645 math + 52.405 socst + 2758.861 _ImaXso + _cons = 0

------------------------------------------------------------------------------
        read |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   51.61533   .5686851    90.76   0.000      50.4938    52.73685
------------------------------------------------------------------------------

/* low: at mean minus 1 sd */

lincom math + ($mean-$sd)*_ImaXso

 ( 1)  math + 41.66921 _ImaXso = 0

------------------------------------------------------------------------------
        read |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   .3595465   .0917421     3.92   0.000      .178618    .5404749
------------------------------------------------------------------------------

lincom _cons + ($mean-$sd)*socst + ($mean-$sd)*($mean2)*_ImaXso + ($mean2)*math

 ( 1)  52.645 math + 41.66921 socst + 2193.675 _ImaXso + _cons = 0

------------------------------------------------------------------------------
        read |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         (1) |   47.60197   .8572345    55.53   0.000     45.91138    49.29256
------------------------------------------------------------------------------
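
As a further alternative, newer versions of Stata (11 and later) can fit the same model with factor-variable notation and obtain the slopes with the margins command. The following is a sketch under that assumption and is not part of the original xi3-based example; the local macro names lo, mid and hi are hypothetical.

```stata
/* store mean - 1 sd, mean, and mean + 1 sd of socst in local macros */
quietly summarize socst
local lo  = r(mean) - r(sd)
local mid = r(mean)
local hi  = r(mean) + r(sd)

/* same model as xi3: regress read math*socst, using factor-variable notation */
regress read c.math##c.socst

/* slope of math at the three values of socst */
margins, dydx(math) at(socst=(`lo' `mid' `hi'))

/* predicted read at the mean of math for the same three values of socst */
margins, at(math=52.645 socst=(`lo' `mid' `hi'))
```

The first margins command should reproduce the three slopes obtained with lincom, and the second the three intercepts of the centered models, since those intercepts are predictions at the mean of math (52.645).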

These pages are Copyrighted (c) by UCLA Academic Technology Services

