UCLA Academic Technology Services

Statistical Computing Seminars
Visualizing Interactions for Logistic Models

0. Getting Started

0.1 Getting the Programs and Data

The aim of this seminar is to help you learn how to visualize interactions in logistic regression models. It demonstrates a suite of tools named vibl for visualizing binary logit models (by the way, vibl is pronounced "vibble" and rhymes with kibble). You can get all of the programs and data files associated with the seminar as shown below.

net from http://www.ats.ucla.edu/stat/stata/ado/analysis
net install vibl
net get vibl

This page also refers to the xi3 and postgr3 commands.  If you do not have these, you can download them as shown below.

net from http://www.ats.ucla.edu/stat/stata/ado/analysis
net install xi3
net install postgr3

0.2 Movies

Some of the sections illustrate interactive use of the viblmdb command and have movies that accompany the sections. These sections start with a link that will look like this.

--- View the movie that accompanies this section ---

Clicking on the link brings up a movie that shows us interacting with Stata, with spoken explanations.

1. Adjusted Means, Logits and Probabilities in Models with Interactions

1.1 Adjusted Means in Regression

Let's look at a model that includes two dummy variables and their interaction.

use http://www.ats.ucla.edu/stat/stata/seminars/stata_vibl/hsbvibl, clear
xi3: regress socst i.academic*i.public
------------------------------------------------------------------------------
       socst |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
_Iacademic_1 |   4.041667   3.987242     1.01   0.312    -3.821737    11.90507
  _Ipublic_1 |  -4.462644   3.608323    -1.24   0.218    -11.57877    2.653478
   _Iac1Xpu1 |    5.63394   4.262881     1.32   0.188    -2.773063    14.04094
       _cons |      51.75   3.453053    14.99   0.000     44.94009    58.55991
------------------------------------------------------------------------------

We can graph the predicted means using the postgr3 command as shown below. Note that there is only one set of adjusted means.

postgr3 public, by(academic) table2
------------------------------
          |       public      
 academic |        0         1
----------+-------------------
        0 |    51.75  47.28736
        1 | 55.79167  56.96296
------------------------------

Say that we label these four cells as A, B, C and D as shown below.

-----------------------------------------
          |       public      
 academic |        0           1
----------+------------------------------
        0 |    51.75 (A)    47.28736 (B)
        1 | 55.79167 (C)    56.96296 (D)
-----------------------------------------

We can interpret the interaction by asking whether the effect of public when academic is 1 (i.e., D - C) is the same as the effect of public when academic is 0 (i.e., B - A). In other words, is (D - C) equal to (B - A)? Put another way, if (D - C) is equal to (B - A) then (D - C) - (B - A) will equal 0. The extent to which this "difference of differences" is non-zero is a way of interpreting the magnitude of the interaction. Based on these data we compute (D - C) - (B - A) and get (56.96296 - 55.79167) - (47.28736 - 51.75) = 5.63393. Note how this corresponds to the coefficient for the interaction term (5.63394) within rounding error.
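The arithmetic above can be checked with a few Stata scalars. This is only a sketch: the scalar names cellA through cellD and dd are ours, and the values are typed in from the table rather than pulled from the estimation results.

```stata
* Cell means copied from the table above (rows = academic, columns = public)
* A: academic = 0, public = 0
scalar cellA = 51.75
* B: academic = 0, public = 1
scalar cellB = 47.28736
* C: academic = 1, public = 0
scalar cellC = 55.79167
* D: academic = 1, public = 1
scalar cellD = 56.96296

* Difference of differences: (D - C) - (B - A)
scalar dd = (cellD - cellC) - (cellB - cellA)
display "difference of differences = " dd
```

The displayed value should agree with the _Iac1Xpu1 coefficient up to the rounding in the table.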

Say that we now add two covariates to this model, math and science.

xi3: regress socst i.academic*i.public math science
------------------------------------------------------------------------------
       socst |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
_Iacademic_1 |   2.561599   3.538577     0.72   0.470     -4.41742    9.540619
  _Ipublic_1 |  -2.353959   3.209158    -0.73   0.464    -8.683276    3.975358
   _Iac1Xpu1 |   3.022477    3.79753     0.80   0.427    -4.467267    10.51222
        math |   .3191042   .0940267     3.39   0.001     .1336584    .5045499
     science |   .2563785   .0803609     3.19   0.002     .0978852    .4148717
       _cons |   21.72092   5.065715     4.29   0.000     11.72998    31.71187
------------------------------------------------------------------------------

We can use the postgr3 command to get the adjusted means. The table2 option adds a table of the adjusted means to the graph, and the ylabel() option specifies the scaling for the y axis. Note that these adjusted means are computed by holding the covariates math and science at their respective means (as indicated by the notes below the command).

postgr3 public, by(academic) table2 ylabel(40 45 to 60)
Holding math constant at 52.645
Holding science constant at 51.85

Here we see the table of means from the table2 option. We have added the letters A, B, C and D to label the four cells so we can quickly refer to each cell.

------------------------------------
          |           public      
 academic |        0         1
----------+-------------------------
        0 | 51.81338(A)  49.45942(B)
        1 | 54.37498(C)   55.0435(D)
------------------------------------

We can compute the "difference of differences" as (D - C) - (B - A), which yields (55.04 - 54.37) - (49.45 - 51.81) = 3.03.
Note how this corresponds to the coefficient for the interaction (3.022477), within rounding error.

We can repeat this command showing the adjusted means by holding math and science constant at different values. In this example, we hold math constant at 40 and science constant at 40.

postgr3 public, by(academic) table2 x(math=40 science=40) ylabel(40 45 to 60)
Holding math constant at 40
Holding science constant at 40


------------------------------
          |       public      
 academic |        0         1
----------+-------------------
        0 | 44.74023  42.38627
        1 | 47.30183  47.97034
------------------------------

Although the means are generally lower, note how the "difference of differences" is still the same as above, (D - C) - (B - A) = (47.97 - 47.30) - (42.38- 44.74) = 3.03.

Likewise, we can set the covariates math constant at 60 and science constant at 60.

postgr3 public, by(academic) table2 x(math=60 science=60) ylabel(40 45 to 60)
Holding math constant at 60
Holding science constant at 60

------------------------------
          |       public      
 academic |        0         1
----------+-------------------
        0 | 56.24988  53.89592
        1 | 58.81148     59.48
------------------------------

Note how the "difference of differences" still remains the same (within rounding error). (D - C) - (B - A) = (59.48 - 58.81) - ( 53.89 - 56.24) = 3.02

Summary

  1. When there are no covariates, there is only one set of means.
  2. When there are covariates, there is actually a family of adjusted means, one set for each combination of covariate values.
  3. As the values of the covariates are altered, the graph shifts up and down, but the "difference of differences" remains the same.
  4. The coefficient for the interaction (_Iac1Xpu1 =  3.022477 ) and the difference of differences ((D-C) minus (B-A) = 3.02) are the same.  The former is used for testing the significance of the interaction, while the latter is used to describe the effect, yet they both have the same value.
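The reason the difference of differences cannot move is that the covariates add the same amount to all four cells. The sketch below rebuilds the adjusted means by hand from the coefficients in the output above; the scalar names (b0, b1, b2, b12, bmath, bsci, shift, cellA to cellD, dd) are ours and are not created by postgr3.

```stata
* Coefficients copied from the regression with math and science
scalar b0    = 21.72092
scalar b1    = 2.561599
scalar b2    = -2.353959
scalar b12   = 3.022477
scalar bmath = .3191042
scalar bsci  = .2563785

foreach v in 40 60 {
    * amount the covariates add to every cell when math = science = `v'
    scalar shift = bmath*`v' + bsci*`v'
    * b1 is the academic coefficient, b2 the public coefficient
    scalar cellA = b0 + shift
    scalar cellB = b0 + b2 + shift
    scalar cellC = b0 + b1 + shift
    scalar cellD = b0 + b1 + b2 + b12 + shift
    scalar dd = (cellD - cellC) - (cellB - cellA)
    display "math = science = `v':  A = " cellA "   (D-C) - (B-A) = " dd
}
```

Because shift cancels in every subtraction, dd equals b12 for any choice of covariate values.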

1.2 Means and Adjusted Means in Logit Models

We will repeat the steps from section 1.1, this time using a logistic regression model and generating the predicted logits from the model. Whereas the outcome in the previous model was socst (a continuous variable), the outcome in this model is honors, which is a 0/1 variable.

xi3: logit honors i.academic*i.public
------------------------------------------------------------------------------
      honors |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
_Iacademic_1 |  -.1743534    .839501    -0.21   0.835    -1.819745    1.471038
  _Ipublic_1 |  -1.475907   .7686826    -1.92   0.055    -2.982497    .0306838
   _Iac1Xpu1 |   2.064383   .9072052     2.28   0.023     .2862936    3.842473
       _cons |   .5108256   .7302967     0.70   0.484    -.9205297    1.942181
------------------------------------------------------------------------------

We can graph the predicted logits using the postgr3 command. Note we add the predict(xb) option to specify that we want the logits (not predicted probabilities). Since there are no covariates, note there is only one predicted logit per cell.

postgr3 public, by(academic) table2 predict(xb)

--------------------------------
          |        public       
 academic |         0          1
----------+---------------------
        0 |  .5108256  -.9650809
        1 |  .3364722   .9249488
--------------------------------

Note how the difference in differences corresponds to the coefficient for the interaction, (.924 - .336) - (-.965  - .510) = 2.063

Let's now add two covariates to the above model.

xi3: logit honors i.academic*i.public math science
------------------------------------------------------------------------------
      honors |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
_Iacademic_1 |  -.4114892   .8872006    -0.46   0.643    -2.150371    1.327392
  _Ipublic_1 |  -1.314945   .8047815    -1.63   0.102    -2.892288    .2623976
   _Iac1Xpu1 |   1.937694   .9691488     2.00   0.046     .0381978    3.837191
        math |   .0439395   .0259495     1.69   0.090    -.0069205    .0947996
     science |   .0761083   .0232778     3.27   0.001     .0304846     .121732
       _cons |  -5.716258   1.430697    -4.00   0.000    -8.520373   -2.912143
------------------------------------------------------------------------------

We can graph the predicted logits using postgr3 and we also show them as a table.

postgr3 public, by(academic) predict(xb) table2 ylabel(-3 -2 to 3)
Holding math constant at 52.645
Holding science constant at 51.85


--------------------------------
          |        public       
 academic |         0          1
----------+---------------------
        0 |   .543155  -.7717903
        1 |  .1316657    .754415
--------------------------------

As we have done before, we can compute the "difference of differences" like this (D - C) - (B - A) = (.754 - .131) - (-.771 - .543) = 1.937

postgr3 public, by(academic) predict(xb) table2 x(math=40 science=40) ///
  ylabel(-3 -2 to 3)
Holding math constant at 40
Holding science constant at 40


--------------------------------
          |        public       
 academic |         0          1
----------+---------------------
        0 | -.9143439  -2.229289
        1 | -1.325833  -.7030839
--------------------------------

Note how the "difference of differences" is the same as above, (D - C) - (B - A)= (-.703 - -1.325) - ( -2.229 - -.914) = 1.937

postgr3 public, by(academic) predict(xb) table2 x(math=60 science=60) ylabel(-3 -2 to 3)
Holding math constant at 60
Holding science constant at 60


------------------------------
          |       public      
 academic |        0         1
----------+-------------------
        0 | 1.486613  .1716678
        1 | 1.075124  1.697873
------------------------------

Note how the "difference of differences" is still the same, (D - C) - (B - A) = (1.697 - 1.075) - (.171 - 1.486) = 1.937

Summary

  1. When in the logit scale, this model behaves exactly like an OLS model because they BOTH are linear models.
  2. When there are no covariates, there is only one set of predicted logits.
  3. When there are covariates, there is actually a family of adjusted logits, depending on the values of the covariates.
  4. As the values of the covariates are altered, the predicted logits in the graphs shift up and down, but the "difference of differences" remains the same.
  5. The coefficient for the interaction (_Iac1Xpu1 =  1.937) and the difference of differences ((D-C) minus (B-A) = 1.937) are the same.  The former is used for testing the significance of the interaction, while the latter is used to describe the effect, yet they both have the same value.

1.3 Predicted Probabilities and Adjusted Probabilities in Logit Models

So far we have seen that the adjusted means in OLS models and adjusted logits in logistic models yield "difference of differences" that are the same regardless of the values of the covariates. Let's now repeat the analyses from above but generate predicted probabilities and adjusted probabilities.

First, we will run the same logit regression with two dummies and their interaction.

xi3: logit honors i.academic*i.public
------------------------------------------------------------------------------
      honors |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
_Iacademic_1 |  -.1743534    .839501    -0.21   0.835    -1.819745    1.471038
  _Ipublic_1 |  -1.475907   .7686826    -1.92   0.055    -2.982497    .0306838
   _Iac1Xpu1 |   2.064383   .9072052     2.28   0.023     .2862936    3.842473
       _cons |   .5108256   .7302967     0.70   0.484    -.9205297    1.942181
------------------------------------------------------------------------------

We can graph the predicted probabilities like this. Note since there are no covariates, we have only a single set of predicted probabilities.

postgr3 public, by(academic) table2
------------------------------
          |       public      
 academic |        0         1
----------+-------------------
        0 |     .625  .2758621
        1 | .5833333  .7160494
------------------------------

We can compute the difference in differences like this (D-C) - (B-A) = (.716 - .583) - (.275 - .625) = .483

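Next we refit the model with the two covariates, math and science (the output is the same as in section 1.2 and is not repeated here).

xi3: logit honors i.academic*i.public math science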

Now we will use postgr3 to create the adjusted probabilities.

postgr3 public, by(academic) table2 ylabel(0 .1 to 1)
Holding math constant at 52.645
Holding science constant at 51.85


----------------------------
          |      public     
 academic |       0        1
----------+-----------------
        0 | .632546  .316092
        1 | .532869   .68014
----------------------------

Using the same methods as before, we will compute the difference in differences as (D - C) - (B - A)= (.680 - .532) - (.316 - .632) = .464.

postgr3 public, by(academic) table2 x(math=40 science=40) ylabel(0 .1 to 1)
Holding math constant at 40
Holding science constant at 40


------------------------------
          |       public      
 academic |        0         1
----------+-------------------
        0 | .2861118   .097151
        1 | .2098494  .3311289
------------------------------

Note how the difference in differences is not the same as above, (D - C) - (B - A) =  (.331 - .209) - (.097 - .286) = .311.

postgr3 public, by(academic) table2 x(math=60 science=60) ylabel(0 .1 to 1)
Holding math constant at 60
Holding science constant at 60


------------------------------
          |       public      
 academic |        0         1
----------+-------------------
        0 | .8155693  .5428119
        1 | .7455701  .8452567
------------------------------

Note how the difference in differences is different yet again, (D - C) - (B - A) = (.845 - .745) - (.542 - .815) = .373.

To recap, here is a table showing the difference of differences and how they changed based on the covariate patterns.

------------------------------------------
              | Difference of Differences
Covariates    | (D-C) - (B-A)
--------------+---------------------------
both at mean  | .464
both at 40    | .311
both at 60    | .373
------------------------------------------
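The contrast between the two scales can be reproduced by hand with Stata's invlogit() function. The sketch below is ours, not part of postgr3: it types in the coefficients from the logit output in section 1.2 and uses scalar names of our own choosing. Because it works from unrounded probabilities, the third decimal can differ slightly from the hand calculations above, which use truncated cell values (roughly .310 rather than .311 at 40, and .372 rather than .373 at 60).

```stata
* Coefficients copied from the logit model with math and science
scalar b0    = -5.716258
scalar b1    = -.4114892
scalar b2    = -1.314945
scalar b12   = 1.937694
scalar bmath = .0439395
scalar bsci  = .0761083

foreach v in 40 60 {
    * predicted logit for cell A when math = science = `v'
    scalar xbA = b0 + bmath*`v' + bsci*`v'
    * convert each cell's logit to a probability (b1 = academic, b2 = public)
    scalar pA = invlogit(xbA)
    scalar pB = invlogit(xbA + b2)
    scalar pC = invlogit(xbA + b1)
    scalar pD = invlogit(xbA + b1 + b2 + b12)
    scalar dd = (pD - pC) - (pB - pA)
    display "math = science = `v':  logit scale = " b12 "   probability scale = " %5.3f dd
}
```

The logit-scale value is b12 on every pass, while the probability-scale value depends on where the covariates place the four cells on the logistic curve.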

Summary

  1. When there are no covariates, there is only one set of predicted probabilities.
  2. When there are covariates, there is actually a family of adjusted probabilities.
  3. As the values of the covariates are altered, the adjusted probabilities are shifted up or down. Because probabilities are a non-linear function of the logits, the "difference of differences" (D - C) - (B - A) changes.
  4. For the OLS and logit models (using adjusted logits), there was no need to distinguish between the coefficient for the interaction and the difference of differences (D - C) - (B - A), because the two had the same value. The former is used for the hypothesis test of the interaction effect, and the latter is used for interpreting the interaction. In the probability scale, we must distinguish between the coefficient, which is the value on which the hypothesis is tested, and the difference of differences, which is used as an interpretive framework for understanding the interaction. The difference of differences is no longer synonymous with the interaction effect in this case.

2.0 The concept of covariate contributions

In our simple model above, we have 2 covariates. Given that the pattern of results depends on the values of the covariates, how can we go about investigating a reasonable set of covariate patterns to understand how our pattern of results depends on the covariates? Here are some options.

  1. We could investigate every possible covariate pattern (very time consuming).
  2. We could investigate all covariate patterns that appear in our data (still time consuming).
  3. We could create an index that represents the composite influence of all of the covariates. We could investigate the pattern of the "difference in differences" at the low, middle and high values of this index. We call this index the covariate contribution and it is described and defined below.

Let's re-run our model, this time using dummy variables and a manually constructed interaction.

use http://www.ats.ucla.edu/stat/stata/seminars/stata_vibl/hsbvibl, clear
logit honors academic public acpub math science
------------------------------------------------------------------------------
      honors |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    academic |  -.4114892   .8872006    -0.46   0.643    -2.150371    1.327392
      public |  -1.314945   .8047815    -1.63   0.102    -2.892288    .2623976
       acpub |   1.937694   .9691488     2.00   0.046     .0381978    3.837191
        math |   .0439395   .0259495     1.69   0.090    -.0069205    .0947996
     science |   .0761083   .0232778     3.27   0.001     .0304846     .121732
       _cons |  -5.716258   1.430697    -4.00   0.000    -8.520373   -2.912143
------------------------------------------------------------------------------

So, the estimated model for the predicted logit (Yhat), with the coefficients truncated to two or three digits, is

Yhat = -5.71 + -.41*academic + -1.31*public + 1.93*acpub + .043*math + .076*science

We can simplify the model to

Yhat = -5.71 + -.41*academic + -1.31*public + 1.93*acpub + Covariate Contribution

where

Covariate Contribution = .043*math + .076*science

So, the Covariate Contribution when math is 40 and science is 60 is

.043*40 + .076*60 = 6.2800

and when math is 60 and science is 48.7 the covariate contribution is approximately the same

.043*60 + .076*48.7 = 6.2812.

To make a more general statement, if we are focusing on dummy variables x1, x2 and their interaction x1x2, and we have a model like this.

Yhat = B0 + B1*x1 + B2*x2 + B12*x1x2 + B3*x3 + B4*x4 + B5*x5 etc...

then the Covariate Contribution would be

Covariate Contribution = B3*x3 + B4*x4 + B5*x5 etc...
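The two covariate patterns worked out above can be checked quickly in Stata. This sketch uses the truncated coefficients from the text; the scalar names cc1, cc2, p1 and p2 are ours.

```stata
* Covariate contribution for two different covariate patterns
scalar cc1 = .043*40 + .076*60
scalar cc2 = .043*60 + .076*48.7
display "math = 40, science = 60:    cc = " cc1
display "math = 60, science = 48.7:  cc = " cc2

* Nearly equal contributions give nearly equal predicted probabilities,
* shown here for the cell with academic = 0 and public = 0
scalar p1 = invlogit(-5.71 + cc1)
scalar p2 = invlogit(-5.71 + cc2)
display "predicted probabilities: " %6.4f p1 "  and  " %6.4f p2
```

This is the sense in which the covariate contribution acts as a single index: the model cannot tell apart covariate patterns that share the same contribution.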

Rather than fretting about the individual values of the covariates, the covariate contribution forms a composite index of the influence of  the covariates on the value of Yhat. Using our data, we can calculate the covariate contribution by using a simple generate command, like this.

generate cc = .0439395*math + .0761083*science

We can then inspect the covariate contributions using the summarize command.

summarize cc 
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
          cc |       200     6.25941    1.062341   3.912154    8.77526

Here we use the centile command to get the 10th to 90th percentiles (in increments of 10).

centile cc, centile(10 20 30 40 50 60 70 80 90)

                                                       -- Binom. Interp. --
    Variable |     Obs  Percentile      Centile        [95% Conf. Interval]
-------------+-------------------------------------------------------------
          cc |     200         10       4.77296        4.582271    4.991118
             |                 20      5.239889        4.995336    5.478327
             |                 30      5.608901        5.326364     5.90322
             |                 40      6.018401        5.696843    6.193006
             |                 50      6.300938         6.09802    6.496595
             |                 60      6.628993        6.391559    6.839334
             |                 70      6.903919        6.771646    7.035632
             |                 80       7.13853          6.9714    7.391951
             |                 90       7.73037        7.430705    7.956061
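As an alternative to typing the coefficients by hand, the covariate contribution can be built from the stored estimates with _b[]. The sketch below assumes the seminar's data file is still reachable at the URL used above; the variable name cc_b is ours. It uses _pctile, which leaves the requested percentiles in r(r1), r(r2) and so on, and whose interpolation rule may differ slightly from that of centile.

```stata
* Load the data and refit the model so that _b[] holds its coefficients
use http://www.ats.ucla.edu/stat/stata/seminars/stata_vibl/hsbvibl, clear
logit honors academic public acpub math science

* Covariate contribution from the stored coefficients (no retyping)
generate double cc_b = _b[math]*math + _b[science]*science

* 20th, 50th and 80th percentiles, returned in r(r1), r(r2), r(r3)
_pctile cc_b, percentiles(20 50 80)
display "20th = " r(r1) "   50th = " r(r2) "   80th = " r(r3)
```

Using _b[] avoids transcription errors and keeps the covariate contribution in step with the model if it is refit.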

We can select a range of values for the covariate contributions and explore the adjusted probabilities that are derived from that range of covariate contributions. We could then examine the difference in differences that result from a range of covariate contributions. There are probably many ways you could select an appropriate range, for example by taking the average covariate contribution and adding and subtracting one standard deviation, or choosing quartiles, or choosing percentiles such as the 10th percentile, 50th percentile, 90th percentile. Below, we choose to explore the values based on the 20th, 50th and 80th percentiles.

We could explore graphs of the predicted probabilities and difference of differences with the covariate contribution set to 5.24, 6.30 and 7.14 (the 20th, 50th and 80th percentiles). While we can specify values for specific covariates in postgr3, we cannot directly specify covariate contribution values. We need a program that will let us visualize interactions like postgr3 but also permit us to vary the covariate contribution. We have developed a suite of programs to help with this called vibl (Visualizing Binary Logistic models).

3.0 Introduction to Visualizing Binary Logistic Models (vibl)

3.1 Introducing vibl to Understand how Probabilities are Affected by Covariates

--- View the movie that accompanies this section ---

The vibl suite of programs allows us to plot interactions like postgr3 does, but it also permits us to enter the coefficients for all of the terms in the model and to vary the covariate contribution. If you have not done so already, you can download this suite of programs and the associated data files from our web site like this.

net from http://www.ats.ucla.edu/stat/stata/ado/analysis
net install vibl
net get vibl

Let's start by focusing on the program viblidb. We can start this program by typing

viblidb , b0(-5.71) b1(-.41) b2(-1.31) b12(1.93) ccat(6.30) ccmin(5) ccmax(7.5)

or we can type

viblidb

and once in the program we can use the point and click interface to select these values.

The graph assumes you have two dummy variables, x1 and x2. In terms of the models we explored above, x1 corresponds to academic (0/1) and x2 corresponds to public (0/1). Note that we show the Stata Version 8 style graphs in this page for aesthetics, but the default graphs are Stata Version 7 graphs which are used for their great speed.

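As in the previous examples, we label the four cells A, B, C and D. Note that viblidb puts x1 (academic) in the columns and x2 (public) in the rows, so B and C trade places relative to the tables in section 1. Because (D-C) - (B-A) is the same as D - C - B + A, the difference of differences is unaffected. The cells map to our variables like this.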

-----------------------------
               |  academic (x1)
  public (x2)  |   0     1
---------------+--------------
             0 |  (A)   (B)
             1 |  (C)   (D)
----------------------------- 

In fact, you see a little table of the predicted probabilities in the Stata results window.

         |      x1   
      x2 |  0         1
---------+--------------------
       0 |  0.64 (A)   0.54 (B)   (B-A) = -0.10
       1 |  0.33 (C)   0.69 (D)   (D-C) =  0.36
                      (D-C) minus (B-A) =  0.46

For our given interaction, we have a family of graphs that vary depending on the size of the covariate contribution. We can imagine this family of graphs being printed on different pages of a book, and each page number corresponds to a different covariate contribution. If higher page numbers correspond to higher covariate contributions, then as you flip the pages forward in the book the predicted probabilities would rise and the shape of the interaction might begin to change.

3.2 Options for starting vibl

By the way, it is sometimes easier to pass parameters to viblidb to specify the starting values. For example

viblidb , b0(-5.71) b1(-.41) b2(-1.31) b12(1.93) ccat(6.30) ccmin(5) ccmax(7.5)

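The meaning of these parameters, as they are used in this seminar, is summarized below.

  b0      the constant (_cons)
  b1      the coefficient for x1
  b2      the coefficient for x2
  b12     the coefficient for the x1 by x2 interaction
  ccat    the covariate contribution at which the graph is drawn
  ccmin   the minimum covariate contribution shown (lower end of the CC range)
  ccmax   the maximum covariate contribution shown (upper end of the CC range)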

3.3 Viewing Graphs of Multiple Covariate Contributions at Once (multiple plot 1)

--- View the movie that accompanies this section ---

You might want to show graphs for a number of covariate contributions side by side so you can compare them. The viblidb tool will allow you to do that.

Start viblidb with these parameter values

viblidb , b0(-5.71) b1(-.41) b2(-1.31) b12(1.93) ccat(6.30) ccmin(5) ccmax(7.5)

In addition, the output window shows tables of probabilities that correspond to these three graphs. We can see that the difference in differences (i.e., (D-C) minus (B-A)) is .39 when the covariate contribution is low (at the 20th percentile), then is .46 when the covariate contribution is medium (at the 50th percentile) and then is .39 when the covariate contribution is high (at the 80th percentile).

**For CC=5.24**

         |      x1   
      x2 |  0         1
---------+--------------------
       0 |  0.38 (A)   0.29 (B)   (B-A) = -0.09
       1 |  0.14 (C)   0.44 (D)   (D-C) =  0.30
                      (D-C) minus (B-A) =  0.39
**For CC=6.3**

         |      x1   
      x2 |  0         1
---------+--------------------
       0 |  0.64 (A)   0.54 (B)   (B-A) = -0.10
       1 |  0.33 (C)   0.69 (D)   (D-C) =  0.36
                      (D-C) minus (B-A) =  0.46
**For CC=7.14**

         |      x1   
      x2 |  0         1
---------+--------------------
       0 |  0.81 (A)   0.73 (B)   (B-A) = -0.08
       1 |  0.53 (C)   0.84 (D)   (D-C) =  0.31
                      (D-C) minus (B-A) =  0.39

As a reminder, here is a legend for the four cells. As before, x1 is academic (coefficient b1) and x2 is public (coefficient b2).

-----------------------------
               |  academic (x1)
  public (x2)  |   0     1
---------------+--------------
             0 |  (A)   (B)
             1 |  (C)   (D)
-----------------------------

3.4 Viewing the entire family of graphs at once (plot 2)

--- View the movie that accompanies this section ---

While it is nice to be able to see three graphs at once, kind of like three pages from our imaginary flip book, you might want to be able to see all of the pages at once. While we could make such a graph three dimensionally, we have found a way to make such a graph two dimensionally, but it can be a bit tricky to interpret.

Click on Type II and then Show Plots and an additional graph appears.

On the x axis of the fourth graph you can see the covariate contribution ranging from 5 to 7.5 (because those are the values we chose in the dialog box for the min and max CC values). Also, note that three vertical lines are drawn. These correspond to values we chose on the CC list.

The vertical line at 5.24 in the bottom right panel corresponds to the graph in the top left panel. Cell D is red with a dot and Cell C is red (no dot). Cell A is blue with no dot and Cell B is blue with a dot. In this graph, you can see that the (D-C) difference is positive, while the (B-A) difference is negative.

The vertical line at 6.30 in the bottom right panel corresponds to the top right graph.

The vertical line at 7.14 corresponds to the bottom left graph.

The important part of this graph is that you can see that the (B-A) difference (blue dot minus dashed blue) is fairly constant across the levels of CC. Likewise, the (D-C) difference (red dot minus dashed red) is fairly constant across levels of the CC. However, you can see that as the covariate contribution approaches low levels, the (D-C) difference contracts as C presses against the floor. Likewise, as the covariate contribution approaches high levels, the (D-C) difference contracts as D presses against the ceiling.
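A graph in the spirit of the Type II plot can be drawn with twoway function, which plots a formula over a range of x. This is our own rough stand-in, not the code viblidb uses; here x plays the role of the covariate contribution, and the scalar names at the end are ours.

```stata
* One curve per cell, using b0 = -5.71, b1 = -.41, b2 = -1.31, b12 = 1.93
twoway (function y = invlogit(-5.71 + x), range(5 7.5))                       ///
       (function y = invlogit(-5.71 - .41 + x), range(5 7.5))                 ///
       (function y = invlogit(-5.71 - 1.31 + x), range(5 7.5))                ///
       (function y = invlogit(-5.71 - .41 - 1.31 + 1.93 + x), range(5 7.5)),  ///
       xline(5.24 6.30 7.14)                                                  ///
       legend(order(1 "A" 2 "B" 3 "C" 4 "D"))                                 ///
       xtitle("Covariate contribution") ytitle("Predicted probability")

* The (D-C) difference at the low end, middle and high end of the range
* (-5.50 is the cell D constant, -7.02 is the cell C constant)
scalar dc_lo  = invlogit(-5.50 + 5)    - invlogit(-7.02 + 5)
scalar dc_mid = invlogit(-5.50 + 6.30) - invlogit(-7.02 + 6.30)
scalar dc_hi  = invlogit(-5.50 + 7.5)  - invlogit(-7.02 + 7.5)
display "(D-C) at CC = 5, 6.30, 7.5:  " %5.3f dc_lo "  " %5.3f dc_mid "  " %5.3f dc_hi
```

The three scalars put numbers on the contraction described above: (D-C) is largest near the middle of the range and smaller at both ends.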

3.5 Viewing Entire Family of Differences at once (plot 3)

--- View the movie that accompanies this section ---

The Type II graph is pretty good at helping us see the values of cells A, B, C, and D across the covariate contribution. We can visually look for consistency (or inconsistency) of the differences in (B-A) and (D-C) across the levels of the covariate contributions. Rather than mentally subtracting (B-A) and (D-C), we can show two lines representing these differences, across the levels of the covariate contributions.

You can see that the red line has a slight upside down U shape. The blue line is relatively flat, with a very slight U shape. This suggests that across these values of the covariate contribution, these effects are fairly constant. We can view the difference in differences, (D-C) minus (B-A), using Type IV plots as shown next.

3.6 Viewing difference of differences (plot 4)

--- View the movie that accompanies this section ---

Uncheck the CC list and check Type IV plots as well (so plot types 1 2 3 and 4 are checked) and then click on update plots. The first three plots are as we have seen before, but now the bottom right graph is a Type IV graph which shows the difference in differences, (D-C) minus (B-A). As we saw above, if this were a linear model this difference in differences would be constant across the levels of the covariate contribution. While the difference in differences has a slight upside down U shape, the difference is still moderately constant across the levels of the covariate contribution.
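The lines shown in the Type III and Type IV plots can be approximated in the same way. This is our own sketch, not viblidb's code; the local macros A, B, C and D simply hold the formula for each cell (with x as the covariate contribution), so the block must be run as a whole from a do-file.

```stata
* Formula for each cell: b0 = -5.71, b0+b1 = -6.12, b0+b2 = -7.02,
* and b0+b1+b2+b12 = -5.50
local A invlogit(-5.71 + x)
local B invlogit(-6.12 + x)
local C invlogit(-7.02 + x)
local D invlogit(-5.50 + x)

* (D-C), (B-A) and their difference across the covariate contribution
twoway (function y = `D' - `C', range(5 7.5))                       ///
       (function y = `B' - `A', range(5 7.5))                       ///
       (function y = (`D' - `C') - (`B' - `A'), range(5 7.5)),      ///
       legend(order(1 "(D-C)" 2 "(B-A)" 3 "(D-C) minus (B-A)"))     ///
       xtitle("Covariate contribution") ytitle("Difference in probability")

* The difference of differences at the ends and middle of the range
foreach c in 5 6.30 7.5 {
    scalar dd = (invlogit(-5.50 + `c') - invlogit(-7.02 + `c')) ///
              - (invlogit(-6.12 + `c') - invlogit(-5.71 + `c'))
    display "CC = `c':  (D-C) minus (B-A) = " %5.3f dd
}
```

In a linear model the third curve would be a horizontal line; here it bends, which is the upside down U shape described above.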

3.7 Interacting with all four plots at once

--- View the movie that accompanies this section ---

We can click the up and dn buttons to vary the covariate contribution from 5.2 to 7.1 and view all four graphs at once.

Each time we press Up to increment the CC it is like we are moving to a different page from our flip book corresponding to the different covariate contributions.

4.0 Apparent Interactions

--- View the movie that accompanies this section ---

Let's start viblidb again to look at more options.

viblidb

We can adjust the values for b0, b1, b2, b12, and cc.

         |      x1   
      x2 |  0         1
---------+--------------------
       0 |  0.18 (A)   0.62 (B)   (B-A) =  0.44
       1 |  0.38 (C)   0.82 (D)   (D-C) =  0.44
                      (D-C) minus (B-A) =  0.00

         |      x1   
      x2 |  0         1
---------+--------------------
       0 |  0.38 (A)   0.82 (B)   (B-A) =  0.44
       1 |  0.62 (C)   0.92 (D)   (D-C) =  0.30
                      (D-C) minus (B-A) = -0.14

         |      x1   
      x2 |  0         1
---------+--------------------
       0 |  0.08 (A)   0.38 (B)   (B-A) =  0.30
       1 |  0.18 (C)   0.62 (D)   (D-C) =  0.44
                      (D-C) minus (B-A) =  0.14
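The three tables above can be reproduced with a model that has no interaction at all on the logit scale. The coefficient values in this sketch (b0 = -1.5, b1 = 2, b2 = 1, b12 = 0, with covariate contributions of 0, 1 and -1) are our own guess, chosen because they match the tables to two decimals; this page does not list the values that were entered in viblidb.

```stata
* Assumed coefficients: note that the interaction b12 is exactly zero
scalar b0  = -1.5
scalar b1  = 2
scalar b2  = 1
scalar b12 = 0

foreach c in 0 1 -1 {
    * predicted probabilities at covariate contribution `c'
    scalar pA = invlogit(b0 + `c')
    scalar pB = invlogit(b0 + b1 + `c')
    scalar pC = invlogit(b0 + b2 + `c')
    scalar pD = invlogit(b0 + b1 + b2 + b12 + `c')
    scalar dd = (pD - pC) - (pB - pA)
    display "CC = `c':  A = " %4.2f pA "  B = " %4.2f pB "  C = " %4.2f pC ///
            "  D = " %4.2f pD "  (D-C) minus (B-A) = " %5.2f dd
}
```

Under these assumed values the difference of differences in probabilities is zero only at a covariate contribution of 0 and takes opposite signs at 1 and -1, even though b12 is zero throughout. That is what makes the interaction only apparent.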

 



5.0 Are the patterns robust across CC values?

5.1 Scenario 1

--- View the movie that accompanies this section ---

Start viblidb with these coefficients.

viblidb , b0(.4) b1(.3) b2(-.8) b12(.8) ccmin(-1) ccmax(1)

         |      x1   
      x2 |  0         1
---------+--------------------
       0 |  0.60 (A)   0.67 (B)   (B-A) =  0.07
       1 |  0.40 (C)   0.67 (D)   (D-C) =  0.27
                      (D-C) minus (B-A) =  0.20

Assume b12 of .8 is significant, but you are interested in exploring the pattern of the difference in differences across covariate contributions from -1 to 1. At a covariate contribution of 0, the (B-A) difference is small (0.07) and the (D-C) difference is larger (0.27), giving a difference in differences of .20. But does this pattern hold across covariate contributions? Say that the median CC is 0, the 20th percentile is -1, and the 80th percentile is 1.

**For CC=-1**

         |      x1   
      x2 |  0         1
---------+--------------------
       0 |  0.35 (A)   0.43 (B)   (B-A) =  0.08
       1 |  0.20 (C)   0.43 (D)   (D-C) =  0.23
                      (D-C) minus (B-A) =  0.15
**For CC=0**

         |      x1   
      x2 |  0         1
---------+--------------------
       0 |  0.60 (A)   0.67 (B)   (B-A) =  0.07
       1 |  0.40 (C)   0.67 (D)   (D-C) =  0.27
                      (D-C) minus (B-A) =  0.20
**For CC=1**

         |      x1   
      x2 |  0         1
---------+--------------------
       0 |  0.80 (A)   0.85 (B)   (B-A) =  0.05
       1 |  0.65 (C)   0.85 (D)   (D-C) =  0.20
                      (D-C) minus (B-A) =  0.15

Summary. For this pattern of results, the difference in differences is positive across the levels of the covariate contributions, although the value dips from .20 at 0 down to .15 at -1 and +1.
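
To see the whole curve rather than three points, the difference in differences can be computed over a grid of covariate contributions. This is a sketch using only built-in Stata commands with the coefficients from the viblidb command above; the variable names (cc, pA, pB, pC, pD, did) are made up for illustration, and the block replaces the data in memory.

```stata
* WARNING: clear discards the data currently in memory
clear
set obs 21
generate cc  = -1 + (_n - 1)*0.1                  // cc = -1, -0.9, ..., 1
generate pA  = invlogit(.4 + cc)                  // x1=0, x2=0
generate pB  = invlogit(.4 + .3 + cc)             // x1=1, x2=0
generate pC  = invlogit(.4 - .8 + cc)             // x1=0, x2=1
generate pD  = invlogit(.4 + .3 - .8 + .8 + cc)   // x1=1, x2=1
generate did = (pD - pC) - (pB - pA)              // difference in differences

* Rows 1, 11, and 21 correspond to cc = -1, 0, and 1
list cc pA pB pC pD did if inlist(_n, 1, 11, 21), noobs
```

The tables above appear to subtract probabilities that were already rounded to two decimals, so values computed at full precision can differ from them in the second decimal place.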

5.2 Scenario 2

--- View the movie that accompanies this section ---

Let's examine a different set of coefficients and this time say the covariate contribution ranges from -2 (at the 20th percentile) to 2 (at the 80th percentile).

viblidb , b0(.8) b1(1.2) b2(-1.4) b12(-1) ccmin(-2) ccmax(2)
         |      x1   
      x2 |  0         1
---------+--------------------
       0 |  0.69 (A)   0.88 (B)   (B-A) =  0.19
       1 |  0.35 (C)   0.40 (D)   (D-C) =  0.05
                      (D-C) minus (B-A) = -0.14

Note how the difference between C and D is small (0.05) compared to the difference between A and B, which is .19, leading to a difference in differences of -.14. But this is when CC=0.

**For CC=-2**

         |      x1   
      x2 |  0         1
---------+--------------------
       0 |  0.23 (A)   0.50 (B)   (B-A) =  0.27
       1 |  0.07 (C)   0.08 (D)   (D-C) =  0.01
                      (D-C) minus (B-A) = -0.26
**For CC=2**

         |      x1   
      x2 |  0         1
---------+--------------------
       0 |  0.94 (A)   0.98 (B)   (B-A) =  0.04
       1 |  0.80 (C)   0.83 (D)   (D-C) =  0.03
                      (D-C) minus (B-A) = -0.01



Summary. The difference in differences for this example depends on the covariate contribution. A single Type I graph would not properly portray the relationship.

6.0 Comparing effects with and without interactions

6.1 Scenario 1

--- View the movie that accompanies this section ---

Let's start fresh and switch to a model that looks like this

viblidb , b0(-1.6) b1(.4) b2(.7) b12(1) ccmin(-1) ccmax(1)

Looking at this plot, we can see the evidence of the interaction. However, we have seen that we can get faked out by plots where a difference in differences makes it look like there is an underlying interaction when there actually is no interaction (see Section 4).

**For CC=-1**

         |      x1   
      x2 |  0         1
---------+--------------------
       0 |  0.07 (A)   0.10 (B)   (B-A) =  0.03
       1 |  0.13 (C)   0.38 (D)   (D-C) =  0.25
                      (D-C) minus (B-A) =  0.22
**For CC=0**

         |      x1   
      x2 |  0         1
---------+--------------------
       0 |  0.17 (A)   0.23 (B)   (B-A) =  0.06
       1 |  0.29 (C)   0.62 (D)   (D-C) =  0.33
                      (D-C) minus (B-A) =  0.27
**For CC=1**

         |      x1   
      x2 |  0         1
---------+--------------------
       0 |  0.35 (A)   0.45 (B)   (B-A) =  0.10
       1 |  0.52 (C)   0.82 (D)   (D-C) =  0.30
                      (D-C) minus (B-A) =  0.20

Summary. You can see that the difference in differences is consistently positive even though it changes from .22 to .27 to .20.  When the interaction is omitted, the difference in differences is very close to 0 (see the Type IV graph in the bottom right).  With the interaction term included, the size of the difference in differences is almost entirely due to the interaction.

6.2 Scenario 2

--- View the movie that accompanies this section ---

Consider a model with these results.

viblidb , b0(-1) b1(1.7) b2(1.5) b12(1) ccmin(0) ccmax(2)

Note how there is little difference in the predicted probabilities with and without the interaction across the levels of the covariate contribution. In the Type IV graph, you can see that the difference in differences is very similar when the interaction is included or excluded. In fact, the difference in differences is more extreme when the interaction is omitted.
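
The comparison can be checked numerically. The sketch below assumes that "omitting the interaction" means keeping b0, b1, and b2 as they are and setting b12 to 0; the page does not spell this out. The program name did_prob and its argument order are hypothetical helpers written for this illustration, not part of the vibl tools.

```stata
* Hypothetical helper: difference in differences on the probability scale
capture program drop did_prob
program define did_prob, rclass
    args b0 b1 b2 b12 cc
    local A = invlogit(`b0' + `cc')                          // x1=0, x2=0
    local B = invlogit(`b0' + `b1' + `cc')                   // x1=1, x2=0
    local C = invlogit(`b0' + `b2' + `cc')                   // x1=0, x2=1
    local D = invlogit(`b0' + `b1' + `b2' + `b12' + `cc')    // x1=1, x2=1
    return scalar did = (`D' - `C') - (`B' - `A')
end

* Coefficients from the viblidb command above, at cc = 0, 1, and 2
foreach cc of numlist 0 1 2 {
    did_prob -1 1.7 1.5 1 `cc'
    local did_with = r(did)
    did_prob -1 1.7 1.5 0 `cc'        // ASSUMPTION: omitted means b12 = 0
    local did_without = r(did)
    display "cc = " %3.0f (`cc') "   with b12: " %6.3f (`did_with') ///
        "   b12 set to 0: " %6.3f (`did_without')
}
```

Under this assumption the difference in differences is negative in both cases and is further from zero when b12 is set to 0, which agrees with the description above.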

7.0 Comparing Probabilities and Logits

--- View the movie that accompanies this section ---

Consider the results of this model.

viblidb , b0(-1) b1(1.7) b2(2.5) b12(2) ccmin(-2) ccmax(2)

CC at -2

CC at 2
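
A short calculation shows why the two scales can tell different stories. This is a sketch using the coefficients from the viblidb command above and the built-in invlogit() function; the local macro names are made up for illustration. On the logit scale the difference in differences is b12 at every covariate contribution, while on the probability scale it depends on the CC.

```stata
foreach cc of numlist -2 2 {
    * Linear predictors (logits) for the four cells
    local xbA = -1 + `cc'                    // x1=0, x2=0
    local xbB = -1 + 1.7 + `cc'              // x1=1, x2=0
    local xbC = -1 + 2.5 + `cc'              // x1=0, x2=1
    local xbD = -1 + 1.7 + 2.5 + 2 + `cc'    // x1=1, x2=1

    * Difference in differences on each scale
    local dd_logit = (`xbD' - `xbC') - (`xbB' - `xbA')
    local dd_prob  = (invlogit(`xbD') - invlogit(`xbC')) ///
                   - (invlogit(`xbB') - invlogit(`xbA'))

    display "cc = " %3.0f (`cc') "   logit scale: " %5.2f (`dd_logit') ///
        "   probability scale: " %6.3f (`dd_prob')
}
```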

   

8.0 Making high quality graphs with viblidb

Say that we run this model.

viblidb , b0(.4) b1(.3) b2(-.8) b12(.8) ccmin(-1) ccmax(1)

Say that we want to get a nice looking Version 8 graph. We can click on the Version 8 button and then we see a graph that looks like this.

So, you may ask, why do we create Version 7 graphs by default instead of Version 8 graphs? Even though Stata Version 8 graphs look terrific, they are not nearly as fast as Version 7 graphs. For the sake of speed, we use Stata Version 7 graphs by default, but give you the option of making Stata Version 8 graphs in case you wish to make a publication quality graph. However, you might want to customize these graphs further. The next section shows you how.

9.0 Pasting syntax for making graphs

Say that we use viblidb like below.

viblidb , b0(.4) b1(.3) b2(-.8) b12(.8) ccmin(-1) ccmax(1)

Suppose that we like the graph but would like to tinker with it.  We can press the Paste Syntax button and the program will display the following message.

Syntax has been pasted to _vibli_paste_syntax.do file. 

The syntax for creating the graph has been placed into the file named _vibli_paste_syntax.do and we can view the file with the type command as shown below.

type  _vibli_paste_syntax.do

/*---------------------------------------------
Session starts at 12:32:46  25 Oct 2004.
----------------------------------------------*/
vibligraph, b0(.4) b1(.3) b2(-.8) b12(.8) ccmin(-1) ccmax(1) nodraw omitint(0) 
  abcd  x1name(x1) x2name(x2)ccat(0) type(1) name(g1, replace)
graph combine g1 

You can then edit this file with the do file editor to tailor the graph to your liking. You can see help vibligraph for more options for customizing these graphs.  You can also supply regular Stata graph options at the end of the command. This is discussed further in Section 11.

10.0 Computing CCs and Graphs for your data

We have seen how we can make graphs for various covariate contributions. Rather than approaching this as a teaching tool, let's see how you can use this as a research tool to create publication quality graphs. Let's use the data file hsbvibl.

use http://www.ats.ucla.edu/stat/stata/seminars/stata_vibl/hsbvibl, clear

Then, say we want to run the following logit model (where acpub is the product of academic and public).

logit honors academic public acpub math science
------------------------------------------------------------------------------
      honors |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    academic |  -.4114892   .8872006    -0.46   0.643    -2.150371    1.327392
      public |  -1.314945   .8047815    -1.63   0.102    -2.892288    .2623976
       acpub |   1.937694   .9691488     2.00   0.046     .0381978    3.837191
        math |   .0439395   .0259495     1.69   0.090    -.0069205    .0947996
     science |   .0761083   .0232778     3.27   0.001     .0304846     .121732
       _cons |  -5.716258   1.430697    -4.00   0.000    -8.520373   -2.912143
------------------------------------------------------------------------------

We can then use the viblicc command to compute the covariate contribution and we use the generate() option to create the variable mycc1.

viblicc honors academic public acpub math science, generate(mycc1)
Saving covariate contribution as mycc1
Percentiles for Covariate Contribution
    P1   P10   P20   P30   P40   P50   P60   P70   P80   P90   P99
 4.011 4.773  5.24 5.609 6.018 6.301 6.629 6.904 7.139  7.73 8.458

The program creates the variable mycc1 and displays selected percentiles of the covariate contribution: the 1st, 10th, 20th, 30th, 40th, 50th, 60th, 70th, 80th, 90th, and 99th.
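
For readers who want to see what such a variable might contain, here is a sketch that builds a covariate contribution by hand. It assumes that the covariate contribution is the part of the linear predictor due to the covariates (here math and science), that is, the sum of each covariate times its coefficient. This is our reading of the output, not a documented description of viblicc. The variable name cc_manual is made up for illustration.

```stata
* Data file and model from this section
use http://www.ats.ucla.edu/stat/stata/seminars/stata_vibl/hsbvibl, clear
quietly logit honors academic public acpub math science

* ASSUMPTION: the covariate contribution excludes the constant, the two
* dummies, and their product, leaving only the covariates' share of xb
generate cc_manual = _b[math]*math + _b[science]*science

* Percentiles of the hand-built covariate contribution
centile cc_manual, centile(1 10 20 30 40 50 60 70 80 90 99)
```

If the assumption holds, these percentiles should be close to the ones displayed by viblicc, although centile may interpolate between observations differently than viblicc does.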

We can run the viblicc command and add the graph option so viblicc will call vibligraph to display the graph of the predicted probabilities holding the covariate contribution at the median (50th percentile).

viblicc honors academic public acpub math science, graph
Percentiles for Covariate Contribution
    P1   P10   P20   P30   P40   P50   P60   P70   P80   P90   P99
 4.011 4.773  5.24 5.609 6.018 6.301 6.629 6.904 7.139  7.73 8.458

vibligraph , b0(-5.716) b1(-.411) b2(-1.315) b12(1.938) ccat(6.301) ///
  ccmin(5.24) ccmax(7.73) x1name(academic) x2name(public)

         |   academic
  public |  0         1
---------+--------------------
       0 |  0.64 (A)   0.54 (B)   (B-A) = -0.10
       1 |  0.33 (C)   0.69 (D)   (D-C) =  0.36
                      (D-C) minus (B-A) =  0.46

However, we might want to explore the range of covariate contributions (not just display a graph where the covariate contribution is at the median). Instead of adding the graph option, we can add the db option and that will start viblidb as shown below. The coefficients from the model are automatically filled in and the covariate contribution is set to start at the median and the min CC and max CC values are set to the 20th and 80th percentiles of the covariate contribution. You can then vary the covariate contribution values or view the different types of graphs to better understand your results.

viblicc honors academic public acpub math science, db

The above viblicc command starts viblidb and is equivalent to typing

viblidb , b0(-5.71) b1(-.41)  b2(-1.31)  b12(1.93) ccat(6.30) ccmin(5.24) ///
  ccmax(7.73)

11.0 Using vibligraph

Here is a graph where we supply the coefficients.

vibligraph , b0(.1) b1(.3) b2(.2) b12(.4) ccat(0)

Here is a graph where we supply the coefficients and increase the covariate contribution to 1.

vibligraph , b0(.1) b1(.3) b2(.2) b12(.4) ccat(1)

Here is a graph where we change the label for the y axis via ylabel(.6 .7 to 1). You can add graph options at the end of the vibligraph command.

vibligraph , b0(.1) b1(.3) b2(.2) b12(.4) ccat(1) ylabel(.6 .7 to 1)

Here is the same graph in logit scale and adjusting the labeling of the y axis.

vibligraph , b0(.1) b1(.3) b2(.2) b12(.4) ccat(1) logit ylabel(1 1.5 2)

This example shows how you can label the y axis, use a different title for the x axis, change the legend, add a title, and add a note to the graph. As you can see, you can add just about any graph option you desire to customize the graph.

vibligraph , b0(.1) b1(.3) b2(.2) b12(.4) ccat(0) ylabel(.5 .6 to .8) ///
        xtitle("Academic (yes/no)") legend(label(1 Public) label(2 Private)) ///
        title(Predicted Probabilities by Public and Academic) ///
        note(Contribution of Covariates at average)

Here is a simpler version of the same graph but shown as a Type II graph.

vibligraph , b0(.1) b1(.3) b2(.2) b12(.4) ccat(1) type(2)

Here is the same graph but shown as a Type III graph.

vibligraph , b0(.1) b1(.3) b2(.2) b12(.4) ccat(1) type(3)
 

Here is the same graph, but shown as a Type IV graph.

vibligraph , b0(.1) b1(.3) b2(.2) b12(.4) ccat(1) type(4)

You can see help vibligraph to learn more about the syntax for using it.

12.0 Using vibligraph for your data (dummy by continuous)

Although all of the discussion has focused on dummy by dummy interactions, there is no reason why this cannot be extended to dummy by continuous interactions.

First, let's look at an analysis using xi3 and postgr3 where we focus on the interaction of public and write, a dummy by continuous interaction.

use http://www.ats.ucla.edu/stat/stata/seminars/stata_vibl/hsbvibl, clear
xi3: logit honors i.public*write math science, nolog
i.public          _Ipublic_0-1        (naturally coded; _Ipublic_0 omitted)

Logit estimates                                   Number of obs   =        200
                                                  LR chi2(5)      =      71.26
                                                  Prob > chi2     =     0.0000
Log likelihood =  -102.9903                       Pseudo R2       =     0.2570

------------------------------------------------------------------------------
      honors |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  _Ipublic_1 |  -7.208777   3.231223    -2.23   0.026    -13.54186   -.8756967
       write |  -.0320984   .0554697    -0.58   0.563     -.140817    .0766202
    _Ipu1Xwr |   .1298698   .0582835     2.23   0.026     .0156362    .2441035
        math |    .050613   .0259883     1.95   0.051    -.0003232    .1015493
     science |   .0431597   .0230889     1.87   0.062    -.0020938    .0884132
       _cons |    -2.8703   2.976909    -0.96   0.335    -8.704935    2.964335
------------------------------------------------------------------------------
postgr3 write, by(public)

Now let's consider this using the vibl tools. First, we will rerun the model with pubwrite as the interaction term.

gen pubwrite = public*write
logit honors write public pubwrite math science, nolog

Logit estimates                                   Number of obs   =        200
                                                  LR chi2(5)      =      71.26
                                                  Prob > chi2     =     0.0000
Log likelihood =  -102.9903                       Pseudo R2       =     0.2570

------------------------------------------------------------------------------
      honors |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       write |  -.0320984   .0554697    -0.58   0.563     -.140817    .0766202
      public |  -7.208777   3.231223    -2.23   0.026    -13.54186   -.8756967
    pubwrite |   .1298698   .0582835     2.23   0.026     .0156362    .2441035
        math |    .050613   .0259883     1.95   0.051    -.0003232    .1015493
     science |   .0431597   .0230889     1.87   0.062    -.0020938    .0884132
       _cons |    -2.8703   2.976909    -0.96   0.335    -8.704935    2.964335
------------------------------------------------------------------------------

Now we use viblicc to get a sense of the range of the covariate contribution.

viblicc honors write public pubwrite math science
Percentiles for Covariate Contribution
    P1   P10   P20   P30   P40   P50   P60   P70   P80   P90   P99
  3.35 3.781 4.077 4.379 4.685 4.881 5.203 5.357 5.576 6.022 6.672

And we can get summary statistics for write for each level of public to get a sense of its range for each group.

bys public: summ write

--------------------------------------------------------------------------------------------------------
-> public = 0

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       write |        32    55.53125     7.17965         38         67

--------------------------------------------------------------------------------------------------------
-> public = 1

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       write |       168       52.25    9.785575         31         67

Now we can pull this information together for the vibligraph command. The coefficients come from the logit output, and the ccat() value is the median covariate contribution reported by the viblicc command above. We set xmin() and xmax() to 40 and 65 so that the plotted range of write lies within the observed minimum and maximum for both groups.

vibligraph , b0(-2.87) b1(-.03) b2(-7.2) b12(.13) ccat(4.88) xmin(40) xmax(65)

We then repeat the command for the CC at the 20th percentile.

vibligraph , b0(-2.87) b1(-.03) b2(-7.2) b12(.13) ccat(4.07) xmin(40) xmax(65)

We then repeat the command for the CC at the 80th percentile.

vibligraph , b0(-2.87) b1(-.03) b2(-7.2) b12(.13) ccat(5.58) xmin(40) xmax(65)

We can specify the xat(50) and xinc(10) options to indicate that we want the predicted probabilities when x is 50 and when x is 60. We do this for the CC at the 20th, 50th, and 80th percentiles, shown below.

vibligraph , b0(-2.87) b1(-.03) b2(-7.2) b12(.13) ccat(4.07) ///
  xmin(40) xmax(65) xat(50) xinc(10) 
**For CC=4.07**

         |      x1   
      x2 |  50         60
---------+--------------------
       0 |  0.43 (A)   0.35 (B)   (B-A) = -0.08
       1 |  0.27 (C)   0.50 (D)   (D-C) =  0.23
                      (D-C) minus (B-A) =  0.31

vibligraph , b0(-2.87) b1(-.03) b2(-7.2) b12(.13) ccat(4.88) ///
  xmin(40) xmax(65) xat(50) xinc(10) 
**For CC=4.88**

         |      x1   
      x2 |  50         60
---------+--------------------
       0 |  0.62 (A)   0.55 (B)   (B-A) = -0.07
       1 |  0.45 (C)   0.69 (D)   (D-C) =  0.24
                      (D-C) minus (B-A) =  0.31

vibligraph , b0(-2.87) b1(-.03) b2(-7.2) b12(.13) ccat(5.58) ///
  xmin(40) xmax(65) xat(50) xinc(10) 
**For CC=5.58**

         |      x1   
      x2 |  50         60
---------+--------------------
       0 |  0.77 (A)   0.71 (B)   (B-A) = -0.06
       1 |  0.62 (C)   0.82 (D)   (D-C) =  0.20
                      (D-C) minus (B-A) =  0.26

Note that for graphs that involve continuous by dummy interactions, you cannot use viblidb since that is restricted just to dummy by dummy interactions.
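
The three tables above can be checked by hand with the built-in invlogit() function. This sketch uses the rounded coefficients supplied to vibligraph above, with x1 as write (evaluated at 50 and 60) and x2 as public; the local macro names are made up for illustration, and the block is meant to be run from a do-file.

```stata
* Rounded coefficients as passed to vibligraph above
local b0  = -2.87
local b1  = -.03     // write
local b2  = -7.2     // public
local b12 = .13      // public x write

foreach cc of numlist 4.07 4.88 5.58 {
    local A = invlogit(`b0' + `b1'*50 + `cc')                     // public=0, write=50
    local B = invlogit(`b0' + `b1'*60 + `cc')                     // public=0, write=60
    local C = invlogit(`b0' + `b1'*50 + `b2' + `b12'*50 + `cc')   // public=1, write=50
    local D = invlogit(`b0' + `b1'*60 + `b2' + `b12'*60 + `cc')   // public=1, write=60
    display "cc = " %4.2f (`cc') "  A=" %4.2f (`A') "  B=" %4.2f (`B') ///
        "  C=" %4.2f (`C') "  D=" %4.2f (`D')                         ///
        "  (D-C)-(B-A)=" %5.2f ((`D' - `C') - (`B' - `A'))
}
```

Because the interaction coefficient multiplies write, the logit-scale gap between the two groups changes by 10 times b12 when write moves from 50 to 60, and invlogit() then maps that change onto the probability scale at each covariate contribution.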

13.0 How do the vibli commands work together?

14.0 How do xi3 and postgr3 work?

We have used xi3 and postgr3 to compute adjusted means, but these programs should not be treated as a black box. Here we demonstrate how they work by obtaining the same values manually. Since xi3 and postgr3 are the underlying technology beneath vibl and graphengine, you can use the same logic to extend these methods to other kinds of models.

14.1 Using xi3 and postgr3

First, consider this model run via xi3.

use http://www.ats.ucla.edu/stat/stata/seminars/stata_vibl/hsbvibl, clear
xi3: logit honors i.academic*i.public math science
------------------------------------------------------------------------------
      honors |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
_Iacademic_1 |  -.4114892   .8872006    -0.46   0.643    -2.150371    1.327392
  _Ipublic_1 |  -1.314945   .8047815    -1.63   0.102    -2.892288    .2623976
   _Iac1Xpu1 |   1.937694   .9691488     2.00   0.046     .0381978    3.837191
        math |   .0439395   .0259495     1.69   0.090    -.0069205    .0947996
     science |   .0761083   .0232778     3.27   0.001     .0304846     .121732
       _cons |  -5.716258   1.430697    -4.00   0.000    -8.520373   -2.912143
------------------------------------------------------------------------------

Now we generate the adjusted probabilities using postgr3.

postgr3 public, by(academic) table2
Variables left asis: _Ipublic_1 _Iacademic_1 _Iac1Xpu1
Holding math constant at 52.645
Holding science constant at 51.85

----------------------------
          |      public     
 academic |       0        1
----------+-----------------
        0 | .632546  .316092
        1 | .532869   .68014
----------------------------

14.2 Creating values manually

Now we run the logit model manually (with manually constructed interaction terms).

logit honors academic public acpub math science
------------------------------------------------------------------------------
      honors |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    academic |  -.4114892   .8872006    -0.46   0.643    -2.150371    1.327392
      public |  -1.314945   .8047815    -1.63   0.102    -2.892288    .2623976
       acpub |   1.937694   .9691488     2.00   0.046     .0381978    3.837191
        math |   .0439395   .0259495     1.69   0.090    -.0069205    .0947996
     science |   .0761083   .0232778     3.27   0.001     .0304846     .121732
       _cons |  -5.716258   1.430697    -4.00   0.000    -8.520373   -2.912143
------------------------------------------------------------------------------

We will use the preserve command so we can restore our data back to their original state.

preserve

Now we will replace math with the mean of math, and science with the mean of science.

summarize math
replace math = r(mean)
summarize science
replace science = r(mean)

At this point, the variable math contains the mean of math and science contains the mean of science. Now, when we use the predict command, the predictions will be based on the average value of math and science. So, we issue predict yhat that creates the adjusted probability, holding math and science at their mean.

predict yhat
(option p assumed; Pr(honors))

Now we separate yhat by academic, making yhat0 (corresponding to academic==0) and yhat1 (corresponding to academic==1).

separate yhat, by(academic)
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
yhat0           float  %9.0g                  yhat, academic == 0
yhat1           float  %9.0g                  yhat, academic == 1
graph twoway line yhat0 yhat1 public

Here we show the table of predicted probabilities.

table academic public , contents(mean yhat)
----------------------------
          |      public     
 academic |       0        1
----------+-----------------
        0 | .632546  .316092
        1 | .532869   .68014
----------------------------

We can then use the restore command to restore the data file back to its original state (namely putting math and science back to their original values).

restore
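
The same four probabilities can also be obtained without touching the data, by plugging the coefficients and covariate means into the inverse logit. This sketch copies the coefficients from the logit output above and the means of math (52.645) and science (51.85) from the postgr3 output; the local macro names are made up for illustration.

```stata
* Covariates' share of the linear predictor, evaluated at the means
local cc = .0439395*52.645 + .0761083*51.85
display "covariate part at the means = " %6.3f (`cc')

* Predicted probabilities for the four academic by public cells
local p00 = invlogit(-5.716258 + `cc')                                   // academic=0, public=0
local p01 = invlogit(-5.716258 - 1.314945 + `cc')                        // academic=0, public=1
local p10 = invlogit(-5.716258 - .4114892 + `cc')                        // academic=1, public=0
local p11 = invlogit(-5.716258 - .4114892 - 1.314945 + 1.937694 + `cc')  // academic=1, public=1

display "academic=0: " %8.6f (`p00') "  " %8.6f (`p01')
display "academic=1: " %8.6f (`p10') "  " %8.6f (`p11')
```

The results should agree with the table above up to rounding of the displayed coefficients and means.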

15.0 How did we make hsbvibl?

Here is how we made the hsbvibl data file.

use http://www.ats.ucla.edu/stat/stata/notes/hsb2, clear
generate honors = socst > 51
generate public = schtyp==1
generate academic = prog==2
gen acpub = academic*public
save hsbvibl, replace

These pages are Copyrighted (c) by UCLA Academic Technology Services

