### Stata FAQ Part 1: How can I understand a 3-way continuous interaction? (Stata 11)

Multiple regression models often contain interaction terms. This FAQ page covers the situation in which two moderator variables jointly influence the regression of the dependent variable on an independent variable. In other words, we consider a regression model with a statistically significant three-way interaction of continuous variables.

If you are new to the topic of interactions of continuous variables, you may want to start with the FAQ page that discusses the simpler continuous by continuous two-way interaction, How can I explain a continuous by continuous interaction?

Back to the three-way interaction. There are several approaches that one might use to explain a three-way interaction. The approach that we will demonstrate is to compute simple slopes, i.e., the slopes of the dependent variable on the independent variable when the moderator variables are held constant at different combinations of high and low values. After computing the simple slopes, we will compute and test the differences among all pairs of slopes. The method used in this FAQ is adapted from a paper by Dawson and Richter (2004). The Dawson and Richter method computes the differences in simple slopes and their standard errors using a number of moderately complex formulas. We will take an easier path and let margins do all of the heavy lifting.

We will begin by looking at the regression equation that includes a three-way continuous interaction. In the formula, Y is the response variable, X is the predictor (independent) variable, and Z and W are the two moderator variables.

Y = b0 + b1X + b2Z + b3W + b4XZ + b5XW + b6ZW + b7XZW
We can reorder the terms into two groups: the first group (the terms that do not contain X) defines the intercept, while the second group (all the terms that contain an X) defines the simple slope.
Y = (b0 + b2Z + b3W + b6ZW) + (b1 + b4Z + b5W + b7ZW)X
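To convince yourself that the regrouping is just algebra, you can evaluate both forms at arbitrary values. Here is a minimal Python sketch (the coefficient and variable values are made up purely for illustration) checking that the two forms agree:

```python
# Check that the regrouped equation equals the original form of
# Y = b0 + b1*X + b2*Z + b3*W + b4*X*Z + b5*X*W + b6*Z*W + b7*X*Z*W.
# The coefficient values below are arbitrary; any values would do.
b0, b1, b2, b3, b4, b5, b6, b7 = 1.5, -3.0, 2.4, 0.7, 0.05, 0.07, 0.06, -0.001
X, Z, W = 52.0, 61.8, 43.3

original = b0 + b1*X + b2*Z + b3*W + b4*X*Z + b5*X*W + b6*Z*W + b7*X*Z*W
intercept = b0 + b2*Z + b3*W + b6*Z*W        # terms without X
slope     = b1 + b4*Z + b5*W + b7*Z*W        # terms with X, X factored out
regrouped = intercept + slope*X

print(abs(original - regrouped) < 1e-9)  # True
```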
Next, we will define high values of Z and W as being one standard deviation above their respective means and will denote them as Hz and Hw. The low values will be defined as one standard deviation below their means and will be denoted as Lz and Lw. Altogether there are four possible combinations of conditions: 1) HzHw, 2) HzLw, 3) LzHw and 4) LzLw.

Here are the formulas for the simple slope and intercept for condition 1) when both Z and W are at their high values. The formulas for the other three conditions are exactly the same except that the values for Z and W are different.

simple slope 1 = b1 + b4Hz + b5Hw + b7HzHw

intercept 1 =  b0 + b2Hz + b3Hw + b6HzHw
We will illustrate the simple slopes process using the hsb2 dataset, which has a (barely) statistically significant three-way continuous interaction. In order to keep the notation consistent we will temporarily rename the variables: y for the response variable, x for the independent variable, z for the first moderator, and w for the second moderator. As the code below shows, write is the response variable, read is the independent variable, and math and science are the moderator variables.

use http://www.ats.ucla.edu/stat/stata/notes/hsb2, clear

rename write y
rename read x
rename math w
rename science z

regress y c.x##c.z##c.w

Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  7,   192) =   26.28
Model |  8748.54751     7   1249.7925           Prob > F      =  0.0000
Residual |  9130.32749   192   47.553789           R-squared     =  0.4893
Total |   17878.875   199   89.843593           Root MSE      =  6.8959

------------------------------------------------------------------------------
y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
x |  -3.013849   1.802977    -1.67   0.096    -6.570035    .5423358
z |  -2.435316   1.757363    -1.39   0.167    -5.901532      1.0309
|
c.x#c.z |   .0553139   .0325207     1.70   0.091    -.0088297    .1194576
|
w |  -3.228097   1.926393    -1.68   0.095    -7.027708    .5715142
|
c.x#c.w |   .0737523    .035898     2.05   0.041     .0029472    .1445575
|
c.z#c.w |   .0611136    .035031     1.74   0.083    -.0079813    .1302086
|
c.x#c.z#c.w |  -.0012563   .0006277    -2.00   0.047    -.0024944   -.0000182
|
_cons |   166.5438   93.12788     1.79   0.075     -17.1413    350.2289
------------------------------------------------------------------------------
Please note that the three-way interaction, c.x#c.z#c.w, is statistically significant with a p-value of 0.047.

We need to create two global macros to hold the range of x that will be used for graphing. We will also create four global macros that contain the values of w and z at one standard deviation above and one standard deviation below their means.


/* compute range of x for graphing */
sum x
global hi=r(max)
global lo=r(min)
sum w
global Hw=r(mean)+r(sd)
global Lw=r(mean)-r(sd)
sum z
global Hz=r(mean)+r(sd)
global Lz=r(mean)-r(sd)
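The high/low values are simply the mean plus or minus one sample standard deviation. As a language-neutral illustration, here is a minimal Python sketch with made-up data (Stata's summarize reports the sample standard deviation, which statistics.stdev matches):

```python
import statistics

# Toy data standing in for a moderator variable such as w or z
values = [41.0, 52.0, 63.0, 48.0, 55.0, 60.0]

m = statistics.mean(values)    # analogous to r(mean)
s = statistics.stdev(values)   # analogous to r(sd): sample SD (n-1 divisor)

high = m + s   # one SD above the mean, e.g. $Hw
low  = m - s   # one SD below the mean, e.g. $Lw
print(round(high, 3), round(low, 3))
```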
Since we will want to graph our simple slopes, we will begin by using the margins command to compute the intercepts for each of the four simple regressions. The margins command for the intercepts is actually more complex than the one for the slopes. For one thing, we need to hold x at zero because the intercept is the value of the response variable when the predictor is zero. What makes this tricky is that the value zero does not occur in our sample, so we will need to use some options that are not needed for the slopes, namely asbalanced, noesample, noestimcheck and force. After computing the intercepts for the simple regressions we will save the values in a matrix named i. Please note, we have manually annotated the output with the various combinations of z and w in parentheses.
margins, at(x=(0) w=($Hw $Lw) z=($Hz $Lz)) ///
asbalanced noesample noestimcheck force vsquish

Adjusted predictions                              Number of obs   =        200
Model VCE    : OLS

Expression   : Linear prediction, predict()
1._at        : x               =           0
z               =    61.75089
w               =    62.01345
2._at        : x               =           0
z               =    61.75089
w               =    43.27655
3._at        : x               =           0
z               =    41.94911
w               =    62.01345
4._at        : x               =           0
z               =    41.94911
w               =    43.27655

------------------------------------------------------------------------------
|            Delta-method
|     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
_at |
(HzHw) 1  |   50.00314   5.940566     8.42   0.000     38.35985    61.64643
(HzLw) 2  |   39.77795   7.819569     5.09   0.000     24.45187    55.10402
(LzHw) 3  |   23.18061   10.64204     2.18   0.029      2.32259    44.03862
(LzLw) 4  |   35.63004   5.537431     6.43   0.000     24.77687     46.4832
------------------------------------------------------------------------------

/* put intercepts in matrix i */

mat i=r(b)

mat list i

i[1,4]
1.         2.         3.         4.
_at        _at        _at        _at
r1   50.00314  39.777948  23.180607  35.630038
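As a check on the margins call, these intercepts can be reproduced by hand from the intercept formula given earlier. Here is a minimal Python sketch (Python is used only for illustration; the coefficients and moderator values are copied, rounded, from the output above, so the results agree with margins only to a few decimal places):

```python
# Coefficients from the regress output above (rounded as displayed)
b0, b2, b3, b6 = 166.5438, -2.435316, -3.228097, 0.0611136

# High/low moderator values (mean +/- 1 SD) from the margins output
Hz, Lz = 61.75089, 41.94911
Hw, Lw = 62.01345, 43.27655

def intercept(Z, W):
    """Simple intercept: the terms of the model that do not contain x."""
    return b0 + b2*Z + b3*W + b6*Z*W

for label, Z, W in [("HzHw", Hz, Hw), ("HzLw", Hz, Lw),
                    ("LzHw", Lz, Hw), ("LzLw", Lz, Lw)]:
    print(label, intercept(Z, W))
```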
Now we can go ahead and use the margins command to compute the simple slopes for x while holding w and z constant at the various values defined above. In order to run post-estimation tests of differences in simple slopes we have included the post option. We will also save the simple slopes in a matrix named s. Once again, we have manually annotated the output.
margins, dydx(x) at(w=($Hw $Lw) z=($Hz $Lz)) post vsquish

Average marginal effects                          Number of obs   =        200
Model VCE    : OLS

Expression   : Linear prediction, predict()
dy/dx w.r.t. : x
1._at        : z               =    61.75089
w               =    62.01345
2._at        : z               =    61.75089
w               =    43.27655
3._at        : z               =    41.94911
w               =    62.01345
4._at        : z               =    41.94911
w               =    43.27655

------------------------------------------------------------------------------
|            Delta-method
|      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
x            |
_at |
(HzHw) 1  |    .164556   .0953841     1.73   0.084    -.0223935    .3515054
(HzLw) 2  |    .236248   .1429934     1.65   0.099    -.0440138    .5165098
(LzHw) 3  |   .6119673   .1962032     3.12   0.002      .227416    .9965186
(LzLw) 4  |   .2175364   .1275474     1.71   0.088    -.0324519    .4675248
------------------------------------------------------------------------------

/* put slopes in matrix s */

mat s=e(b)  /* e(b) because we used the post option */

mat list s

s[1,4]
x:         x:         x:         x:
1.         2.         3.         4.
_at        _at        _at        _at
y1  .16455596  .23624803  .61196727  .21753644
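The simple slopes can likewise be reproduced by hand from the simple slope formula. Here is a minimal Python sketch (again for illustration only; the coefficients and moderator values are copied, rounded, from the output above, so agreement with margins is only to a few decimal places):

```python
# Coefficients from the regress output above (rounded as displayed)
b1, b4, b5, b7 = -3.013849, 0.0553139, 0.0737523, -0.0012563

# High/low moderator values (mean +/- 1 SD) from the margins output
Hz, Lz = 61.75089, 41.94911
Hw, Lw = 62.01345, 43.27655

def simple_slope(Z, W):
    """Simple slope of y on x: b1 + b4*Z + b5*W + b7*Z*W."""
    return b1 + b4*Z + b5*W + b7*Z*W

for label, Z, W in [("HzHw", Hz, Hw), ("HzLw", Hz, Lw),
                    ("LzHw", Lz, Hw), ("LzLw", Lz, Lw)]:
    print(label, simple_slope(Z, W))
```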
Now that we have the intercept and slope for each of the four regression lines, we can plot the four simple regression lines. We do this by holding the values of the moderator variables, z and w, constant at either their high (mean + 1 SD) or low (mean - 1 SD) values and allowing the independent variable, x, to vary.

Please note: You must use the names y and x in the twoway function plots below; they are required by twoway function and are not variable names from your data.

twoway (function y=i[1,1] + s[1,1]*x, range($lo $hi))  ///
(function y=i[1,2] + s[1,2]*x, range($lo $hi))  ///
(function y=i[1,3] + s[1,3]*x, range($lo $hi))  ///
(function y=i[1,4] + s[1,4]*x, range($lo $hi))  ///
(scatter y x, msym(oh) jitter(3)),              ///
legend(order(1 "1 HzHw" 2 "2 HzLw" 3 "3 LzHw" 4 "4 LzLw")) ///
xtitle(X) ytitle(Y) scheme(lean1)


Inspecting the graph above, it seems that the biggest differences are between slopes 1 & 3, 2 & 3, and 3 & 4, i.e., the slope for LzHw (3) appears to be the most different from the others.

It should be noted that the lines on this graph were not plotted from the actual data in the dataset, rather they are theoretical regression lines that have the same slopes and intercepts but are plotted using the entire range of x.

Next, we will compute the tests of differences in simple slopes using the lincom command. There are six possible comparisons among the four slopes.

/* differences in simple slopes */

lincom [x]1bn._at - [x]2bn._at  /* a) HzHw vs HzLw (1 & 2) */

( 1)  [x]1bn._at - [x]2._at = 0

------------------------------------------------------------------------------
|      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
(1) |  -.0716921   .1642481    -0.44   0.662    -.3936124    .2502282
------------------------------------------------------------------------------

lincom [x]1bn._at - [x]3bn._at  /* b) HzHw vs LzHw (1 & 3) */

( 1)  [x]1bn._at - [x]3._at = 0

------------------------------------------------------------------------------
|      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
(1) |  -.4474113   .2060755    -2.17   0.030    -.8513119   -.0435107
------------------------------------------------------------------------------

lincom [x]1bn._at - [x]4bn._at  /* c) HzHw vs LzLw (1 & 4) */

( 1)  [x]1bn._at - [x]4._at = 0

------------------------------------------------------------------------------
|      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
(1) |  -.0529805    .163752    -0.32   0.746    -.3739284    .2679675
------------------------------------------------------------------------------

lincom [x]2bn._at - [x]3bn._at  /* d) HzLw vs LzHw (2 & 3) */

( 1)  [x]2._at - [x]3._at = 0

------------------------------------------------------------------------------
|      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
(1) |  -.3757192   .2579781    -1.46   0.145     -.881347    .1299085
------------------------------------------------------------------------------

lincom [x]2bn._at - [x]4bn._at  /* e) HzLw vs LzLw (2 & 4) */

( 1)  [x]2._at - [x]4._at = 0

------------------------------------------------------------------------------
|      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
(1) |   .0187116   .1722173     0.11   0.913    -.3188281    .3562513
------------------------------------------------------------------------------

lincom [x]3bn._at - [x]4bn._at  /* f) LzHw vs LzLw (3 & 4) */

( 1)  [x]3._at - [x]4._at = 0

------------------------------------------------------------------------------
|      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
(1) |   .3944308   .2202323     1.79   0.073    -.0372165    .8260782
------------------------------------------------------------------------------
With some cutting and pasting we can put together a table including each of the tests of differences in simple slopes.
   Slopes      |      Diff.   Std. Err.      z    P>|z|     [95% Conf. Interval]
---------------+----------------------------------------------------------------
(HzHw vs HzLw) |  -.0716921   .1642481    -0.44   0.662    -.3936124    .2502282
(HzHw vs LzHw) |  -.4474113   .2060755    -2.17   0.030    -.8513119   -.0435107
(HzHw vs LzLw) |  -.0529805   .163752     -0.32   0.746    -.3739284    .2679675
(HzLw vs LzHw) |  -.3757192   .2579781    -1.46   0.145     -.881347    .1299085
(HzLw vs LzLw) |   .0187116   .1722173     0.11   0.913    -.3188281    .3562513
(LzHw vs LzLw) |   .3944308   .2202323     1.79   0.073    -.0372165    .8260782
The p-values in the table above cannot be taken at face value. We need to make some kind of adjustment to account for the fact that there are six post-hoc tests. The easiest adjustment would be a Bonferroni. This is an extremely conservative adjustment and is just a matter of multiplying each of the p-values by the number of tests (6). In this case, the comparison with the smallest p-value, HzHw versus LzHw, would have a Bonferroni-adjusted p-value of .030*6 = .18.
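Both the pairwise differences and the Bonferroni adjustment are simple arithmetic on the values already computed. A minimal Python sketch (the slope estimates and the smallest p-value are copied from the output above):

```python
from itertools import combinations

# Simple slopes from the margins output (matrix s)
slopes = {"HzHw": 0.16455596, "HzLw": 0.23624803,
          "LzHw": 0.61196727, "LzLw": 0.21753644}

# All six pairwise differences, matching the lincom results above
for a, b in combinations(slopes, 2):
    print(f"{a} vs {b}: {slopes[a] - slopes[b]:+.7f}")

# Bonferroni adjustment: multiply each p-value by the number of tests,
# capping at 1.  The smallest unadjusted p-value (HzHw vs LzHw) was 0.030.
n_tests = 6
p_adjusted = min(0.030 * n_tests, 1.0)
print(round(p_adjusted, 2))  # 0.18
```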

In this example, the p-value for the three-way interaction was barely significant (.047), so it is not surprising that none of the adjusted post-hoc tests of differences in simple slopes were statistically significant.

Below are all the commands that we used, collected into a do-file to make them easier to use. You might be wondering, wouldn't it be even easier if there were an ado-file that would do everything with a single command? It wouldn't be very difficult to create such a command, but then you would not be able to see how all of the steps work together to create the tests of differences in simple slopes. In other words, it builds character.

use http://www.ats.ucla.edu/stat/stata/notes/hsb2, clear

rename write y
rename read x
rename math w
rename science z

/* compute range of x for graphing */
sum x
global hi=r(max)
global lo=r(min)
sum w
global Hw=r(mean)+r(sd)
global Lw=r(mean)-r(sd)
sum z
global Hz=r(mean)+r(sd)
global Lz=r(mean)-r(sd)

/* simple intercepts */
margins, at(x=(0) w=($Hw $Lw) z=($Hz $Lz)) ///
asbalanced noesample noestimcheck force vsquish
mat i=r(b)

/* simple slopes */
margins, dydx(x) at(w=($Hw $Lw) z=($Hz $Lz)) post vsquish
mat s=e(b)

twoway (function y=i[1,1] + s[1,1]*x, range($lo $hi))  ///
(function y=i[1,2] + s[1,2]*x, range($lo $hi))  ///
(function y=i[1,3] + s[1,3]*x, range($lo $hi))  ///
(function y=i[1,4] + s[1,4]*x, range($lo $hi))  ///
(scatter y x, msym(oh) jitter(3)),     ///
legend(order(1 "1 HzHw" 2 "2 HzLw" 3 "3 LzHw" 4 "4 LzLw")) ///
xtitle(X) ytitle(Y) scheme(lean1)

/* differences in simple slopes */
lincom [x]1bn._at - [x]2bn._at
lincom [x]1bn._at - [x]3bn._at
lincom [x]1bn._at - [x]4bn._at
lincom [x]2bn._at - [x]3bn._at
lincom [x]2bn._at - [x]4bn._at
lincom [x]3bn._at - [x]4bn._at
Hey, wait a minute, my model also has covariates. What about them? We will deal with adding covariates to the model in Part 2: How can I explain a 3-way continuous interaction?

References

Dawson, J. F., & Richter, A. W. (2004). A significance test of slope differences for three-way interactions in moderated multiple regression analysis. Working paper RP0422. Aston University, Birmingham, UK.
