UCLA Academic Technology Services

Stata FAQ
How can I get anova main effects with dummy coding using margins? (Stata 11)

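Many researchers like to do their anova using regression with dummy coding but find it confusing when they don't get the same main effects as in anova. The reason is that, when the model includes an interaction, the dummy-coded coefficient for a factor is its effect at the base level of the other factor, not its effect averaged over the levels of the other factor. This FAQ will show you how to get those main effects using the margins command.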

Example 1

Let's begin by running the usual anova on a dataset called crf24 so that we have results to compare against.

use http://www.ats.ucla.edu/stat/stata/faq/crf24, clear

anova y a##b

                           Number of obs =      32     R-squared     =  0.9214
                           Root MSE      = .877971     Adj R-squared =  0.8985

                  Source |  Partial SS    df       MS           F     Prob > F
              -----------+----------------------------------------------------
                   Model |         217     7          31      40.22     0.0000
                         |
                       a |       3.125     1       3.125       4.05     0.0554
                       b |       194.5     3  64.8333333      84.11     0.0000
                     a#b |      19.375     3  6.45833333       8.38     0.0006
                         |
                Residual |        18.5    24  .770833333   
              -----------+----------------------------------------------------
                   Total |       235.5    31  7.59677419  
Here is how the above analyses would look using Stata 11's factor variables with the regress command. The regression model will be followed by a test of the interaction, the margins command and the test of the two main effects using the testparm command.
regress y a##b

      Source |       SS       df       MS              Number of obs =      32
-------------+------------------------------           F(  7,    24) =   40.22
       Model |         217     7          31           Prob > F      =  0.0000
    Residual |        18.5    24  .770833333           R-squared     =  0.9214
-------------+------------------------------           Adj R-squared =  0.8985
       Total |       235.5    31  7.59677419           Root MSE      =  .87797

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         2.a |         -2   .6208194    -3.22   0.004    -3.281308   -.7186918
             |
           b |
          2  |        .25   .6208194     0.40   0.691    -1.031308    1.531308
          3  |       3.25   .6208194     5.24   0.000     1.968692    4.531308
          4  |       4.25   .6208194     6.85   0.000     2.968692    5.531308
             |
         a#b |
        2 2  |          1   .8779711     1.14   0.266    -.8120434    2.812043
        2 3  |         .5   .8779711     0.57   0.574    -1.312043    2.312043
        2 4  |          4   .8779711     4.56   0.000     2.187957    5.812043
             |
       _cons |       3.75   .4389856     8.54   0.000     2.843978    4.656022
------------------------------------------------------------------------------

testparm a#b   /* test of a#b interaction */

 ( 1)  2.a#2.b = 0
 ( 2)  2.a#3.b = 0
 ( 3)  2.a#4.b = 0

       F(  3,    24) =    8.38
            Prob > F =    0.0006
Even though the interaction is statistically significant, we will go ahead and check out the main effects. We will demonstrate two methods for computing them in this example. These are not the only ways to obtain main effects with the margins command; they are just two of the easier ones.

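The first method uses margins to compute the balanced marginal means for a and b and then tests their equality using testparm with the equal option. Because the posted margins results are tested with chi-square rather than F, the p-values differ slightly from the anova table (0.0441 versus 0.0554 for a); dividing the chi-square by its degrees of freedom gives the anova F-ratio.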

estimates store m1   /* store regression results for later computations */

margins a b, asbalanced post  /* margins command for main effects: method 1 */

Adjusted predictions                              Number of obs   =         32
Model VCE    : OLS

Expression   : Linear prediction, predict()
at           : a                (asbalanced)
               b                (asbalanced)

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           a |
          1  |     5.6875   .2194928    25.91   0.000     5.257302    6.117698
          2  |     5.0625   .2194928    23.06   0.000     4.632302    5.492698
             |
           b |
          1  |       2.75   .3104097     8.86   0.000     2.141608    3.358392
          2  |        3.5   .3104097    11.28   0.000     2.891608    4.108392
          3  |       6.25   .3104097    20.13   0.000     5.641608    6.858392
          4  |          9   .3104097    28.99   0.000     8.391608    9.608392
------------------------------------------------------------------------------

testparm i.a, equal   /* a main effect */

 ( 1)  - 1bn.a + 2.a = 0

           chi2(  1) =    4.05
         Prob > chi2 =    0.0441

testparm i.b, equal   /* b main effect */

 ( 1)  - 1bn.b + 2.b = 0
 ( 2)  - 1bn.b + 3.b = 0
 ( 3)  - 1bn.b + 4.b = 0

           chi2(  3) =  252.32
         Prob > chi2 =    0.0000

display "scale as F-ratio = " r(chi2)/r(df)

scale as F-ratio = 84.108108
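The tests above use the anova test statistic but refer it to a chi-square distribution. The following is a minimal sketch of one way to refer the same F-ratios to an F distribution instead, using Stata's Ftail() function with the residual degrees of freedom from the regression. It assumes the Stata 11 behavior shown above, where testparm after margins, post returns r(chi2) and r(df); more recent versions of Stata may return different results here. The scalar name df_resid is our own choice and is not part of the original example.

```stata
estimates restore m1                /* bring back the regression results */
scalar df_resid = e(df_r)           /* residual df from regress (24 here) */

quietly margins a b, asbalanced post

quietly testparm i.a, equal         /* a main effect */
display "a: F = " r(chi2)/r(df) "   Prob > F = " Ftail(r(df), df_resid, r(chi2)/r(df))

quietly testparm i.b, equal         /* b main effect */
display "b: F = " r(chi2)/r(df) "   Prob > F = " Ftail(r(df), df_resid, r(chi2)/r(df))
```

If this runs as intended, the p-value for a should agree with the 0.0554 in the anova table rather than the 0.0441 from the chi-square test.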
Next, we demonstrate the second method for main effects using margins with the dydx option.
estimates restore m1   /* restore regression results */

margins, dydx(a b) asbalanced post  /* margins command for main effects: method 2 */

Conditional marginal effects                      Number of obs   =         32
Model VCE    : OLS

Expression   : Linear prediction, predict()
dy/dx w.r.t. : 2.a 2.b 3.b 4.b
at           : a                (asbalanced)
               b                (asbalanced)

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         2.a |      -.625   .3104097    -2.01   0.044    -1.233392   -.0166082
             |
           b |
          2  |        .75   .4389856     1.71   0.088    -.1103959    1.610396
          3  |        3.5   .4389856     7.97   0.000     2.639604    4.360396
          4  |       6.25   .4389856    14.24   0.000     5.389604    7.110396
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.

testparm i.a   /* a main effect */

 ( 1)  2.a = 0

           chi2(  1) =    4.05
         Prob > chi2 =    0.0441

testparm i.b   /* b main effect */

 ( 1)  2.b = 0
 ( 2)  3.b = 0
 ( 3)  4.b = 0

           chi2(  3) =  252.32
         Prob > chi2 =    0.0000

display "scale as F-ratio = " r(chi2)/r(df)

scale as F-ratio = 84.108108
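To see where the -.625 for 2.a comes from, it can help to build the main effect by hand from the dummy-coded regression coefficients. In a balanced sense, the a main effect is the 2.a coefficient plus the average of the a#b interaction terms over the four levels of b, that is, -2 + (0 + 1 + .5 + 4)/4 = -.625. The sketch below does this with lincom; it is an illustration we are adding, not one of the two methods described above.

```stata
estimates restore m1     /* regression results, not the posted margins */

/* a main effect: 2.a plus the average of the 2.a#b terms over the 4 levels
   of b (the term for b = 1 is the base level and equals 0) */
lincom _b[2.a] + (_b[2.a#2.b] + _b[2.a#3.b] + _b[2.a#4.b])/4
```

Because lincom is run on the regress results, it reports a t-test with the residual degrees of freedom, so the square of its t statistic should correspond to the anova F for a.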

Example 2

This method generalizes to more complex designs with multiple factors, so let's consider a 3-factor completely crossed design.

use http://www.ats.ucla.edu/stat/stata/faq/threeway, clear

anova y a##b##c

                           Number of obs =      24     R-squared     =  0.9689
                           Root MSE      =  1.1547     Adj R-squared =  0.9403

                  Source |  Partial SS    df       MS           F     Prob > F
              -----------+----------------------------------------------------
                   Model |  497.833333    11  45.2575758      33.94     0.0000
                         |
                       a |         150     1         150     112.50     0.0000
                       b |  .666666667     1  .666666667       0.50     0.4930
                       c |  127.583333     2  63.7916667      47.84     0.0000
                     a*b |  160.166667     1  160.166667     120.13     0.0000
                     a*c |       18.25     2       9.125       6.84     0.0104
                     b*c |  22.5833333     2  11.2916667       8.47     0.0051
                   a*b*c |  18.5833333     2  9.29166667       6.97     0.0098
                         |
                Residual |          16    12  1.33333333   
              -----------+----------------------------------------------------
                   Total |  513.833333    23  22.3405797 
And here is the same model using the regress command.
regress y a##b##c

      Source |       SS       df       MS              Number of obs =      24
-------------+------------------------------           F( 11,    12) =   33.94
       Model |  497.833333    11  45.2575758           Prob > F      =  0.0000
    Residual |          16    12  1.33333333           R-squared     =  0.9689
-------------+------------------------------           Adj R-squared =  0.9403
       Total |  513.833333    23  22.3405797           Root MSE      =  1.1547

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         2.a |        -.5   1.154701    -0.43   0.673    -3.015876    2.015876
         2.b |        -.5   1.154701    -0.43   0.673    -3.015876    2.015876
             |
         a#b |
        2 2  |        6.5   1.632993     3.98   0.002     2.942014    10.05799
             |
           c |
          2  |          4   1.154701     3.46   0.005     1.484124    6.515876
          3  |          8   1.154701     6.93   0.000     5.484124    10.51588
             |
         a#c |
        2 2  |          1   1.632993     0.61   0.552    -2.557986    4.557986
        2 3  |  -1.10e-14   1.632993    -0.00   1.000    -3.557986    3.557986
             |
         b#c |
        2 2  |         -4   1.632993    -2.45   0.031    -7.557986   -.4420135
        2 3  |         -9   1.632993    -5.51   0.000    -12.55799   -5.442014
             |
       a#b#c |
      2 2 2  |          3   2.309401     1.30   0.218    -2.031753    8.031753
      2 2 3  |        8.5   2.309401     3.68   0.003     3.468247    13.53175
             |
       _cons |         11   .8164966    13.47   0.000     9.221007    12.77899
------------------------------------------------------------------------------

testparm a#b#c   /* test of the a#b#c interaction */

 ( 1)  2.a#2.b#2.c = 0
 ( 2)  2.a#2.b#3.c = 0

       F(  2,    12) =    6.97
            Prob > F =    0.0098
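Before we get to the main effects, we will test the three two-way interactions. For each one, margins with the dydx() and over() options computes the effect of one factor at each level of the other, and the test command checks whether those effects are equal.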
estimates store m1   /* store regression results for later computations */

margins, dydx(a) over(b) asbal post noatlegend   /* margins for a#b interaction */

Conditional marginal effects                      Number of obs   =         24

Expression   : Linear prediction, predict()
dy/dx w.r.t. : 2.a
over         : b

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
2.a          |
           b |
          1  |  -.1666667   .6666667    -0.25   0.803    -1.473309    1.139976
          2  |   10.16667   .6666667    15.25   0.000     8.860024    11.47331
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.

test [2.a]1.b=[2.a]2.b      /* test of a#b interaction */

 ( 1)  [2.a]1bn.b - [2.a]2.b = 0

           chi2(  1) =  120.13
         Prob > chi2 =    0.0000
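The same a#b contrast can be written directly in terms of the regression coefficients, which may make it clearer what is being tested. The difference between the effect of a at b = 2 and at b = 1, averaged over the three levels of c, is the 2.a#2.b coefficient plus the average of the a#b#c terms: 6.5 + (0 + 3 + 8.5)/3 = 10.3333, which equals 10.16667 - (-.1666667) from the margins table. The lincom call below is a sketch we are adding for illustration.

```stata
estimates restore m1     /* regression results, not the posted margins */

/* a#b interaction: 2.a#2.b plus the average of the 2.a#2.b#c terms over
   the 3 levels of c (the term for c = 1 is the base level and equals 0) */
lincom _b[2.a#2.b] + (_b[2.a#2.b#2.c] + _b[2.a#2.b#3.c])/3
```

The square of the t statistic reported by lincom should correspond to the F of 120.13 for a*b in the anova table.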

estimates restore m1

margins, dydx(a) over(c) asbal post noatlegend   /* margins for a#c interaction */

Conditional marginal effects                      Number of obs   =         24

Expression   : Linear prediction, predict()
dy/dx w.r.t. : 2.a
over         : c

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
2.a          |
           c |
          1  |       2.75   .8164966     3.37   0.001     1.149696    4.350304
          2  |       5.25   .8164966     6.43   0.000     3.649696    6.850304
          3  |          7   .8164966     8.57   0.000     5.399696    8.600304
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.

test ([2.a]1.c=[2.a]2.c)([2.a]1.c=[2.a]3.c)   /* test of a#c interaction */

 ( 1)  [2.a]1bn.c - [2.a]2.c = 0
 ( 2)  [2.a]1bn.c - [2.a]3.c = 0

           chi2(  2) =   13.69
         Prob > chi2 =    0.0011

display "scale as F-ratio = " r(chi2)/r(df)

scale as F-ratio = 6.84375

estimates restore m1

margins, dydx(b) over(c) asbal post noatlegend   /* margins for b#c interaction */

Conditional marginal effects                      Number of obs   =         24

Expression   : Linear prediction, predict()
dy/dx w.r.t. : 2.b
over         : c

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
2.b          |
           c |
          1  |       2.75   .8164966     3.37   0.001     1.149696    4.350304
          2  |        .25   .8164966     0.31   0.759    -1.350304    1.850304
          3  |         -2   .8164966    -2.45   0.014    -3.600304   -.3996961
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.

test ([2.b]1.c=[2.b]2.c)([2.b]1.c=[2.b]3.c)   /* test of b#c interaction */

 ( 1)  [2.b]1bn.c - [2.b]2.c = 0
 ( 2)  [2.b]1bn.c - [2.b]3.c = 0

           chi2(  2) =   16.94
         Prob > chi2 =    0.0002

display "scale as F-ratio = " r(chi2)/r(df)

scale as F-ratio = 8.46875
Finally, we will compute the main effects using method 2 as shown above.
estimates restore m1

margins, dydx(a b c) asbalanced post   /* margins command for main effects */

Conditional marginal effects                      Number of obs   =         24
Model VCE    : OLS

Expression   : Linear prediction, predict()
dy/dx w.r.t. : 2.a 2.b 2.c 3.c
at           : a                (asbalanced)
               b                (asbalanced)
               c                (asbalanced)

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         2.a |          5   .4714045    10.61   0.000     4.076064    5.923936
         2.b |   .3333333   .4714045     0.71   0.480    -.5906025    1.257269
             |
           c |
          2  |       3.25   .5773503     5.63   0.000     2.118414    4.381586
          3  |      5.625   .5773503     9.74   0.000     4.493414    6.756586
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.

testparm i.a   /* a main-effect */

 ( 1)  2.a = 0

           chi2(  1) =  112.50
         Prob > chi2 =    0.0000

testparm i.b   /* b main-effect */

 ( 1)  2.b = 0

           chi2(  1) =    0.50
         Prob > chi2 =    0.4795

testparm i.c   /* c main-effect */

 ( 1)  2.c = 0
 ( 2)  3.c = 0

           chi2(  2) =   95.69
         Prob > chi2 =    0.0000

display "scale as F-ratio = " r(chi2)/r(df)

scale as F-ratio = 47.84375
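The chi-square p-values above differ slightly from the anova table (for example, 0.4795 versus 0.4930 for b). The loop below is a minimal sketch of how all three main effects could be rescaled to F-ratios and given F-based p-values in one pass. It assumes the Stata 11 behavior shown above, where testparm after margins, post returns r(chi2) and r(df); the scalar name df_resid is our own.

```stata
estimates restore m1                /* bring back the regression results */
scalar df_resid = e(df_r)           /* residual df from regress (12 here) */

quietly margins, dydx(a b c) asbalanced post

foreach v in a b c {
    quietly testparm i.`v'
    display "`v': F = " %8.2f r(chi2)/r(df) "   Prob > F = " %6.4f Ftail(r(df), df_resid, r(chi2)/r(df))
}
```

If this runs as intended, the F-ratios and p-values should line up with the a, b and c rows of the anova table.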

These pages are Copyrighted (c) by UCLA Academic Technology Services

