|
|
|
||||
|
Help the Stat Consulting Group by
giving a gift
| |||||
|
Loading
|
|||||
If you are new to the topic of interactions of continuous variables, you may want to start with the FAQ page the discusses the simpler continuous by continuous two-way interaction, How can I explain a continuous by continuous interaction?
Back to the three-way interaction. There are several approaches that one might use to explain a three-way interaction. The approach that we will demonstrate is to compute simple slopes, i.e., the slopes of the dependent variable on the independent variable when the moderator variables are held constant at different combinations of high and low values. After computing the simple slopes, we will then compute and test the differences among all pairs of the slopes. The method used in this FAQ is adapted from a paper by Dawson and Richter (2004). The Dawson and Richter method computes the differences in simple slopes and their standard errors using a number of moderately-complex formulas. We will take an easier path and let margins do all of the heavy lifting.
We will begin by looking at the regression equation which includes a three-way continuous interaction. In the formula, Y is the response variable, X the predictor (independent) variable with Z and W being the two moderator variables.
Y = b0 + b1X + b2Z + b3W + b4XZ + b5XW + b6ZW + b7XZW
Y = (b0 + b2Z + b3W + b6ZW) + (b1 + b4Z + b5W + b7ZW)X
Here are the formulas for the simple slope and intercept for condition 1) when both Z and W are at their high values. The formulas for the other three conditions are exactly the same except that the values for Z and W are different.
simple slope 1 = b1 + b4Hz + b5Hw + b7HzHw intercept 1 = b0 + b2Hz + b3Hw + b6HzHw
use http://www.ats.ucla.edu/stat/stata/notes/hsb2, clear
rename write y
rename read x
rename math w
rename science z
regress y c.x##c.z##c.w
Source | SS df MS Number of obs = 200
-------------+------------------------------ F( 7, 192) = 26.28
Model | 8748.54751 7 1249.7925 Prob > F = 0.0000
Residual | 9130.32749 192 47.553789 R-squared = 0.4893
-------------+------------------------------ Adj R-squared = 0.4707
Total | 17878.875 199 89.843593 Root MSE = 6.8959
------------------------------------------------------------------------------
y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
x | -3.013849 1.802977 -1.67 0.096 -6.570035 .5423358
z | -2.435316 1.757363 -1.39 0.167 -5.901532 1.0309
|
c.x#c.z | .0553139 .0325207 1.70 0.091 -.0088297 .1194576
|
w | -3.228097 1.926393 -1.68 0.095 -7.027708 .5715142
|
c.x#c.w | .0737523 .035898 2.05 0.041 .0029472 .1445575
|
c.z#c.w | .0611136 .035031 1.74 0.083 -.0079813 .1302086
|
c.x#c.z#c.w | -.0012563 .0006277 -2.00 0.047 -.0024944 -.0000182
|
_cons | 166.5438 93.12788 1.79 0.075 -17.1413 350.2289
------------------------------------------------------------------------------
Please note that the three-way interaction, xzw, is statistically significant with
a p-value of .047.We need to create two global macro variables to hold the range of x that will be used for graphing. We will also create four global macros that contain the values w and z when that are one standard deviation above the mean and one standard deviation below the mean.
Since we will want to graph our simple slopes we will begin by using the margins command to compute the intercepts for each of the four simple regressions. The margins command for intercepts is actually much more complex that for the slopes. For one thing, we need to hold x at zero because the intercept is the value of the respopnse variable when the predictor is zero. What makes this tricky is that the value zero does not occur in our sample, it is just a conditional mean. So we will need to use some options that are not needed for the slopes, such as, asbalanced, noesample, noestimcheck and force. After computing the intercepts for the simple regressions we will save the values in a matrix named i. Please note, we have manually annotated the output with the various combinations of w and z in parentheses./* compute range of x for graphing */ sum x global hi=r(max) global lo=r(min) sum w global Hw=r(mean)+r(sd) global Lw=r(mean)-r(sd) sum z global Hz=r(mean)+r(sd) global Lz=r(mean)-r(sd)
margins, at( x=(0) w=($Hw $Lw) z=($Hz $Lz)) ///
asbalanced noesample noestimcheck force vsquish
Adjusted predictions Number of obs = 200
Model VCE : OLS
Expression : Linear prediction, predict()
1._at : x = 0
z = 61.75089
w = 62.01345
2._at : x = 0
z = 61.75089
w = 43.27655
3._at : x = 0
z = 41.94911
w = 62.01345
4._at : x = 0
z = 41.94911
w = 43.27655
------------------------------------------------------------------------------
| Delta-method
| Margin Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
_at |
(HwHz) 1 | 50.00314 5.940566 8.42 0.000 38.35985 61.64643
(HzLw) 2 | 39.77795 7.819569 5.09 0.000 24.45187 55.10402
(LzHw) 3 | 23.18061 10.64204 2.18 0.029 2.32259 44.03862
(LwLz) 4 | 35.63004 5.537431 6.43 0.000 24.77687 46.4832
------------------------------------------------------------------------------
/* put intercepts in matrix i */
mat i=r(b)
mat list i
i[1,4]
1. 2. 3. 4.
_at _at _at _at
r1 50.00314 39.777948 23.180607 35.630038
Now we can go ahead and use the margins command to compute the simple slopes for x while
holding w and z constant at the various values defined above. In order to do post-estimations
tests of differences in simple slopes we have included the post option. We will also
save the simple slopes in a matrix named s. Once again, we have manually annotated the output.
margins, dydx(x) at(w=($Hw $Lw) z=($Hz $Lz)) post vsquish
Average marginal effects Number of obs = 200
Model VCE : OLS
Expression : Linear prediction, predict()
dy/dx w.r.t. : x
1._at : z = 61.75089
w = 62.01345
2._at : z = 61.75089
w = 43.27655
3._at : z = 41.94911
w = 62.01345
4._at : z = 41.94911
w = 43.27655
------------------------------------------------------------------------------
| Delta-method
| dy/dx Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
x |
_at |
(HwHz) 1 | .164556 .0953841 1.73 0.084 -.0223935 .3515054
(HzLw) 2 | .236248 .1429934 1.65 0.099 -.0440138 .5165098
(LzHw) 3 | .6119673 .1962032 3.12 0.002 .227416 .9965186
(LwLz) 4 | .2175364 .1275474 1.71 0.088 -.0324519 .4675248
------------------------------------------------------------------------------
/* put slopes in matrix s */
mat s=e(b) /* e(b) because we used the post option */
mat list s
s[1,4]
x: x: x: x:
1. 2. 3. 4.
_at _at _at _at
y1 .16455596 .23624803 .61196727 .21753644
Now we have the intercepts and slopes for each of
the four regression lines we can plot
the four simple regression lines. We do this by
holding the values of the moderator variables, z and w,
constant at either their high (mean + 1 SD) or
low values (mean - 1 SD) and allowing the independent variable, x, to vary.Please note: You must use the names y and x with the twoway function plot command. Do not use variable names from your data.
twoway (function y=i[1,1] + s[1,1]*x, range($lo $hi)) ///
(function y=i[1,2] + s[1,2]*x, range($lo $hi)) ///
(function y=i[1,3] + s[1,3]*x, range($lo $hi)) ///
(function y=i[1,4] + s[1,4]*x, range($lo $hi)) ///
(scatter y x, msym(oh) jitter(3)), ///
legend(order(1 "1 HzHw" 2 "2 HzLw" 3 "3 LzHw" 4 "4 LzLw")) ///
xtitle(X) ytitle(Y) scheme(lean1)

Inspecting the graph above, it seems that the biggest differences in slopes
is between slopes 1 & 3, 2 & 3, and 4 & 3, i.e., the slope
for LzHw (3) seems to be the most different from the others.It should be noted that the lines on this graph were not plotted from the actual data in the dataset, rather they are theoretical regression lines that have the same slopes and intercepts but are plotted using the entire range of x.
Next, we will compute the tests of differences in simple slopes using the lincom command. There are six possible comparisons among the four slopes.
/* differences in simple slopes */
lincom [x]1bn._at - [x]2bn._at /* a) HwHz vs HzLw (1 & 2) */
( 1) [x]1bn._at - [x]2._at = 0
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
(1) | -.0716921 .1642481 -0.44 0.662 -.3936124 .2502282
------------------------------------------------------------------------------
lincom [x]1bn._at - [x]3bn._at /* b) HwHz vs LzHw (1 & 3) */
( 1) [x]1bn._at - [x]3._at = 0
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
(1) | -.4474113 .2060755 -2.17 0.030 -.8513119 -.0435107
------------------------------------------------------------------------------
lincom [x]1bn._at - [x]4bn._at /* c) HwHz vs LwLz (1 & 4) */
( 1) [x]1bn._at - [x]4._at = 0
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
(1) | -.0529805 .163752 -0.32 0.746 -.3739284 .2679675
------------------------------------------------------------------------------
lincom [x]2bn._at - [x]3bn._at /* d) HzLw vs LzHw (2 & 3) */
( 1) [x]2._at - [x]3._at = 0
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
(1) | -.3757192 .2579781 -1.46 0.145 -.881347 .1299085
------------------------------------------------------------------------------
lincom [x]2bn._at - [x]4bn._at /* e) HzLw vs LwLz (2 & 4) */
( 1) [x]2._at - [x]4._at = 0
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
(1) | .0187116 .1722173 0.11 0.913 -.3188281 .3562513
------------------------------------------------------------------------------
lincom [x]3bn._at - [x]4bn._at /* f) LzHw vs LwLz (3 & 4) */
( 1) [x]3._at - [x]4._at = 0
------------------------------------------------------------------------------
| Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
(1) | .3944308 .2202323 1.79 0.073 -.0372165 .8260782
------------------------------------------------------------------------------
With some cutting and pasting we can put together a table including each of
the tests of differences in simple slopes.
The p-values in the table above cannot be taken at face value. We need to make some kind of an adjustment to account for the fact that there are six post-hoc tests. The easiest adjustment would be a Bonferroni. This is an extremely conservative adjustment and is just a matter of multiplying each of the p-values by the number of tests (6). In this case, the comparison with the smallest p-value, HzHw versus LzHw, would have a Bonferroni adjusted p-value of .031*6 = .186.Slopes | Diff. Std. Err. z P>|t| [95% Conf. Interval] ---------------+---------------------------------------------------------------- (HwHz vs HzLw) | -.0716921 .1642481 -0.44 0.662 -.3936124 .2502282 (HwHz vs LzHw) | -.4474113 .2060755 -2.17 0.030 -.8513119 -.0435107 (HwHz vs LzLw) | -.0529805 .163752 -0.32 0.746 -.3739284 .2679675 (HzLw vs LzHw) | -.3757192 .2579781 -1.46 0.145 -.881347 .1299085 (HzLw vs LzLw) | .0187116 .1722173 0.11 0.913 -.3188281 .3562513 (LzHw vs LzHw) | .3944308 .2202323 1.79 0.073 -.0372165 .8260782
In this example, the p-value for the three-way interaction was barely significant (.047), so it is not surprising that none of the post-hoc tests of differences in simple slopes were statistically significant.
Below are all the commands that we used collected into a do-file to make it easier to use. You might be wondering, wouldn't it be even easier if there was an ado-file for that would do everything with a single command? It wouldn't be very difficult to create such a command but you would not be able to see how all of the steps work to create the tests of differences in simple slopes. In other words, it builds character.
use http://www.ats.ucla.edu/stat/stata/notes/hsb2, clear
rename write y
rename read x
rename math w
rename science z
/* compute range of x for graphing */
sum x
global hi=r(max)
global lo=r(min)
sum w
global Hw=r(mean)+r(sd)
global Lw=r(mean)-r(sd)
sum z
global Hz=r(mean)+r(sd)
global Lz=r(mean)-r(sd)
/* simple intercepts */
margins, at( x=(0) w=($Hw $Lw) z=($Hz $Lz)) ///
asbalanced noesample noestimcheck force vsquish
mat i=r(b)
/* simple slopes */
margins, dydx(x) at(w=($Hw $Lw) z=($Hz $Lz)) post vsquish
mat s=e(b)
twoway (function y=i[1,1] + s[1,1]*x, range($lo $hi)) ///
(function y=i[1,2] + s[1,2]*x, range($lo $hi)) ///
(function y=i[1,3] + s[1,3]*x, range($lo $hi)) ///
(function y=i[1,4]+s[1,4]*x, range($lo $hi)) ///
(scatter y x, msym(oh) jitter(3)), ///
legend(order(1 "1 HzHw" 2 "2 LwHz" 3 "3 HwLz" 4 "4 LzLw")) ///
xtitle(X) ytitle(Y) scheme(lean1)
/* differences in simple slopes */
lincom [x]1bn._at - [x]2bn._at
lincom [x]1bn._at - [x]3bn._at
lincom [x]1bn._at - [x]4bn._at
lincom [x]2bn._at - [x]3bn._at
lincom [x]2bn._at - [x]4bn._at
lincom [x]3bn._at - [x]4bn._at
Hey, wait a minute, my model also has covariates. What about them?
We will deal with adding covariates to the model in
Part 2: How can I
explain a 3-way continuous interaction?References
Dawson, J.F. & Richter, A.W. (2004) A significance test of slope differences for three-way interactions in moderated multiple regression analysis. RP0422. Aston University, Birmingham, UK
UCLA Researchers are invited to our Statistical Consulting Services
We recommend others to our list of Other Resources for Statistical Computing Help
These pages are Copyrighted (c) by UCLA Academic Technology Services