UCLA Academic Technology Services

Stata FAQ
How can I understand a categorical by categorical interaction in logistic regression? (Stata 11)

Interactions in logistic regression models can be trickier than interactions in comparable OLS regression models. This is particularly true when there are covariates in the model in addition to the categorical predictors. This FAQ page will try to help you understand categorical by categorical interactions in logistic regression models with continuous covariates.

We will use an example dataset, logit2-2, that has two binary predictors, f and h, and a continuous covariate, cv1. In addition, the model will include the f by h interaction. We will begin by loading the data and running the logit model. The factor-variable term f##h includes both main effects and their interaction, so no separate interaction variable needs to be created.

use http://www.ats.ucla.edu/stat/data/logit2-2, clear

logit y f##h cv1

Logistic regression                               Number of obs   =        200
                                                  LR chi2(4)      =     106.10
                                                  Prob > chi2     =     0.0000
Log likelihood =  -78.74193                       Pseudo R2       =     0.4025

------------------------------------------------------------------------------
           y |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         1.f |   2.996118   .7521524     3.98   0.000     1.521926    4.470309
         1.h |   2.390911   .6608498     3.62   0.000      1.09567    3.686153
             |
         f#h |
        1 1  |  -2.047755   .8807989    -2.32   0.020    -3.774089   -.3214213
             |
         cv1 |    .196476   .0328518     5.98   0.000     .1320876    .2608644
       _cons |  -11.86075   1.895828    -6.26   0.000     -15.5765   -8.144991
------------------------------------------------------------------------------
As you can see, all of the variables in the above model, including the interaction term, are statistically significant. If this were an OLS regression model, we could do a very good job of understanding the interaction using just the coefficients in the model. The situation in logistic regression is more complicated because the model is linear in the log odds but nonlinear in the probabilities, meaning that the interaction effect, expressed as a difference in probabilities, can be very different for different values of the covariate. To begin to understand what is going on, consider Table 1 below.
Table 1: Predicted probabilities when cv1=50

        h=0     h=1    Dprob      LB       UB
f=0   .1154   .5876   .4722    .2693    .6751
f=1   .7230   .7862   .0633   -.1399    .2665
Table 1 contains predicted probabilities, differences in predicted probabilities and the confidence intervals of the differences in predicted probabilities while holding cv1 at 50. The first value, .1154, is the predicted probability when f=0 and h=0. The second value, .5876, is the predicted probability when f=0 and h=1. The third value, .4722 (Dprob), is the difference in probabilities for f=0 when h changes from 0 to 1. The next two values (LB and UB) are the bounds of the 95% confidence interval on the difference in probabilities. If the confidence interval contains zero, the difference would not be considered statistically significant. In the first row (f=0), the confidence interval does not contain zero, so the difference in probabilities is statistically significant. In the second row (f=1), the difference is .0633 and its confidence interval, -.1399 to .2665, contains zero, so that difference is not statistically significant.
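
As a rough check on where the point estimates in Table 1 come from, the sketch below rebuilds them by hand from the logit coefficients. It assumes the logit model above is still the active estimation result (so it must be run before any margins command that uses the post option). The scalar names xb00 through xb11 are made up for this illustration. It does not reproduce the confidence intervals, which margins obtains with the delta method.

```stata
* Linear predictor (log odds) for each f/h cell with cv1 held at 50
scalar xb00 = _b[_cons] + _b[cv1]*50
scalar xb01 = xb00 + _b[1.h]
scalar xb10 = xb00 + _b[1.f]
scalar xb11 = xb00 + _b[1.f] + _b[1.h] + _b[1.f#1.h]

* invlogit() converts log odds to probabilities;
* the last number on each line is the h=1 minus h=0 difference
display "f=0: " %6.4f invlogit(xb00) "  " %6.4f invlogit(xb01) ///
    "  " %6.4f (invlogit(xb01) - invlogit(xb00))
display "f=1: " %6.4f invlogit(xb10) "  " %6.4f invlogit(xb11) ///
    "  " %6.4f (invlogit(xb11) - invlogit(xb10))
```

The displayed values should agree with the first three columns of Table 1 up to rounding.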

We can obtain all the values in Table 1, for a range of values of cv1, by using the margins command twice. We run margins once to get the predicted probabilities for h=0 and h=1 at both f=0 and f=1. Then we run margins again to get the differences in probabilities and the confidence intervals for those differences. We have manually annotated the output to indicate which values appear in Table 1.

margins h, at(f=(0 1) cv1=(30(5)70)) vsquish

Adjusted predictions                              Number of obs   =        200
Model VCE    : OIM

Expression   : Pr(y), predict()
1._at        : f               =           0
               cv1             =          30
2._at        : f               =           0
               cv1             =          35
3._at        : f               =           0
               cv1             =          40
4._at        : f               =           0
               cv1             =          45
5._at        : f               =           0
               cv1             =          50
6._at        : f               =           0
               cv1             =          55
7._at        : f               =           0
               cv1             =          60
8._at        : f               =           0
               cv1             =          65
9._at        : f               =           0
               cv1             =          70
10._at       : f               =           1
               cv1             =          30
11._at       : f               =           1
               cv1             =          35
12._at       : f               =           1
               cv1             =          40
13._at       : f               =           1
               cv1             =          45
14._at       : f               =           1
               cv1             =          50
15._at       : f               =           1
               cv1             =          55
16._at       : f               =           1
               cv1             =          60
17._at       : f               =           1
               cv1             =          65
18._at       : f               =           1
               cv1             =          70

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       _at#h |
        1 0  |   .0025567   .0025254     1.01   0.311    -.0023929    .0075063
        1 1  |   .0272372   .0206496     1.32   0.187    -.0132352    .0677097
        2 0  |   .0067995   .0057853     1.18   0.240    -.0045396    .0181385
        2 1  |    .069579   .0413211     1.68   0.092    -.0114088    .1505668
        3 0  |   .0179561   .0129716     1.38   0.166    -.0074678      .04338
        3 1  |   .1664783   .0709412     2.35   0.019     .0274362    .3055204
        4 0  |   .0465604    .028158     1.65   0.098    -.0086283     .101749
        4 1  |   .3478701   .0933387     3.73   0.000     .1649296    .5308105
        5 0  |    .115378   .0575106     2.01   0.045     .0026592    .2280968  value from Table 1
        5 1  |   .5875788   .0877652     6.69   0.000     .4155621    .7595955  value from Table 1
        6 0  |   .2583492   .1025789     2.52   0.012     .0572982    .4594002
        6 1  |   .7918883   .0631887    12.53   0.000     .6680407    .9157359
        7 0  |   .4819613   .1389465     3.47   0.001      .209631    .7542915
        7 1  |    .910416    .037977    23.97   0.000     .8359824    .9848496
        8 0  |   .7130398   .1272483     5.60   0.000     .4636377    .9624418
        8 1  |   .9644667   .0200004    48.22   0.000     .9252666    1.003667
        9 0  |   .8690487   .0818878    10.61   0.000     .7085515    1.029546
        9 1  |   .9863932   .0096629   102.08   0.000     .9674543    1.005332
       10 0  |   .0487835   .0294567     1.66   0.098    -.0089504    .1065175
       10 1  |   .0674087   .0446964     1.51   0.132    -.0201947    .1550121
       11 0  |   .1204719   .0549249     2.19   0.028     .0128212    .2281227
       11 1  |   .1618113     .07791     2.08   0.038     .0091104    .3145121
       12 0  |    .267844   .0851192     3.15   0.002     .1010134    .4346747
       12 1  |   .3401934   .1024706     3.32   0.001     .1393547    .5410321
       13 0  |   .4941981   .1006298     4.91   0.000     .2969673    .6914289
       13 1  |   .5793115   .0914477     6.33   0.000     .4000773    .7585456
       14 0  |   .7229559   .0872338     8.29   0.000     .5519808    .8939309  value from Table 1
       14 1  |   .7862264   .0599327    13.12   0.000     .6687605    .9036924  value from Table 1
       15 0  |   .8745225   .0571538    15.30   0.000      .762503     .986542
       15 1  |   .9076026    .034318    26.45   0.000     .8403406    .9748645
       16 0  |   .9490168   .0308608    30.75   0.000     .8885308    1.009503
       16 1  |   .9632823   .0180954    53.23   0.000     .9278159    .9987487
       17 0  |   .9802821   .0149266    65.67   0.000     .9510265    1.009538
       17 1  |    .985929   .0088829   110.99   0.000     .9685188    1.003339
       18 0  |    .992525    .006799   145.98   0.000     .9791993    1.005851
       18 1  |   .9946848   .0041367   240.46   0.000      .986577    1.002792
------------------------------------------------------------------------------

margins f, dydx(h) at(cv1=(30(5)70)) post noatlegend

Conditional marginal effects                      Number of obs   =        200
Model VCE    : OIM

Expression   : Pr(y), predict()
dy/dx w.r.t. : 1.h

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
1.h          |
       _at#f |
        1 0  |   .0246805   .0188412     1.31   0.190    -.0122475    .0616086
        1 1  |   .0186252   .0331697     0.56   0.574    -.0463863    .0836367
        2 0  |   .0627795   .0378532     1.66   0.097    -.0114115    .1369704
        2 1  |   .0413394   .0695885     0.59   0.552    -.0950517    .1777304
        3 0  |   .1485222   .0656193     2.26   0.024     .0199107    .2771337
        3 1  |   .0723494   .1167547     0.62   0.535    -.1564856    .3011843
        4 0  |   .3013097   .0902579     3.34   0.001     .1244074     .478212
        4 1  |   .0851134   .1360832     0.63   0.532    -.1816048    .3518315
        5 0  |   .4722008   .1035128     4.56   0.000     .2693194    .6750821  value from Table 1
        5 1  |   .0632706   .1036697     0.61   0.542    -.1399183    .2664595  value from Table 1
        6 0  |   .5335391   .1208833     4.41   0.000     .2966122    .7704659
        6 1  |   .0330801   .0565538     0.58   0.559    -.0777632    .1439234
        7 0  |   .4284548    .137549     3.11   0.002     .1588636    .6980459
        7 1  |   .0142654   .0255894     0.56   0.577    -.0358888    .0644197
        8 0  |   .2514269    .120641     2.08   0.037      .014975    .4878788
        8 1  |   .0056469   .0106397     0.53   0.596    -.0152065    .0265003
        9 0  |   .1173445    .076704     1.53   0.126    -.0329926    .2676816
        9 1  |   .0021597   .0042758     0.51   0.613    -.0062207    .0105402
------------------------------------------------------------------------------
Note: dy/dx for factor levels is the discrete change from the base level.
Here is what we can say based upon the output above. There are no significant differences between the two levels of h, at either level of f, when the covariate is held constant at 30 or 35. When the covariate is held constant at values from 40 to 65, there is a significant h 0-1 difference at f=0 but not at f=1. Finally, when the covariate is held constant at 70, the h differences are not significant at either level of f. It may be easier to understand these results if we graph the confidence intervals for the difference in probability separately for f=0 and f=1.
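
Before turning to the graphs, the way the h difference shifts with cv1 can be reproduced by hand. The sketch below hard-codes the coefficients printed in the logit output above, so that it leaves the posted margins results in memory untouched. The scalar names (b_0, b_f, and so on) are made up for this illustration. For each value of cv1 it displays the h 0-1 difference in probability at f=0 and at f=1, followed by the difference between those two differences, which is the interaction on the probability scale.

```stata
* Coefficients copied from the logit output above
scalar b_0   = -11.86075
scalar b_f   = 2.996118
scalar b_h   = 2.390911
scalar b_fh  = -2.047755
scalar b_cv1 = .196476

forvalues cv = 30(5)70 {
    * log odds for f=0, h=0 at this value of cv1
    scalar xb = b_0 + b_cv1*`cv'
    * change in probability when h goes from 0 to 1, at f=0 and at f=1
    scalar d0 = invlogit(xb + b_h) - invlogit(xb)
    scalar d1 = invlogit(xb + b_f + b_h + b_fh) - invlogit(xb + b_f)
    display "cv1 = `cv'   f=0: " %6.4f d0 "   f=1: " %6.4f d1 ///
        "   difference: " %6.4f (d1 - d0)
}
```

The first two columns should match the dy/dx column of the second margins output up to rounding. The last column is not constant across cv1, even though the model contains a single interaction coefficient.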

There is so much information in the margins output that it's difficult to see what is going on. A graphic representation would do a better job of organizing and displaying the results. We will produce the graphs using three programs written by Roger Newson: parmest (findit parmest), fvregen (findit fvregen), and eclplot (findit eclplot). Basically, parmest and fvregen put the results from the last margins command into memory as a dataset, and eclplot does the actual plotting.

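* Replace the data in memory with one row per estimate from the posted margins results
parmest, label norestore

* Rebuild the factor variables (here _at and f) from the parameter names
fvregen

* Drop rows that have no z statistic
drop if z==.

* _at indexes the nine values of cv1; replace the index with the actual values
rename _at cv1

recode cv1 (1=30)(2=35)(3=40)(4=45)(5=50)(6=55)(7=60)(8=65)(9=70)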

eclplot estimate min95 max95 cv1 if f==0, yline(0) estopts(scheme(lean1)) ///
  title(95% confidence interval for f=0) ytitle(difference in probability)



eclplot estimate min95 max95 cv1 if f==1, yline(0) estopts(scheme(lean1)) ///
  title(95% confidence interval for f=1) ytitle(difference in probability)
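
As an alternative sketch, and assuming Stata 12 or later rather than the Stata 11 used on this page, the built-in marginsplot command can draw similar confidence-interval plots directly from the margins results, without the extra packages. The block below reloads the data and refits the model because parmest with the norestore option has replaced the dataset in memory.

```stata
* Assumes Stata 12 or later, where marginsplot is available
use http://www.ats.ucla.edu/stat/data/logit2-2, clear
quietly logit y f##h cv1

* Same differences in probability as before; post is not needed here
margins f, dydx(h) at(cv1=(30(5)70)) noatlegend

* One panel per level of f, with a reference line at zero
marginsplot, bydimension(f) yline(0) ytitle(difference in probability)
```

An interval that crosses the zero line corresponds to a difference that is not statistically significant at that value of cv1.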


These pages are Copyrighted (c) by UCLA Academic Technology Services

