### Stata FAQ: How do I generate a saturated model using desmat?

Sometimes we need to fit a saturated model, that is, a model with as many parameters as there are cells in the table, so that it includes all main effects and all possible interactions. In Stata, this is easy to do with desmat, a user-written program by John Hendrickx. The program must be installed before it can be used; typing findit dm73_3 at the command line locates it (see "How can I use the findit command to search for programs and get additional help?" for more information about findit).

Here is an example using a data set on belief in the afterlife from An Introduction to Categorical Data Analysis by Agresti. The data set contains three categorical variables (race, gender, and belief) and a variable count holding the frequency of each cell.
use http://www.ats.ucla.edu/stat/stata/faq/afterlife, clear

list
       race   gender   belief   count
  1.      1        1        1     371
  2.      1        1        2      49
  3.      1        1        3      74
  4.      1        0        1     250
  5.      1        0        2      45
  6.      1        0        3      71
  7.      0        1        1      64
  8.      0        1        2       9
  9.      0        1        3      15
 10.      0        0        1      25
 11.      0        0        2       5
 12.      0        0        3      13
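To fit a saturated model, prefix the estimation command with desmat: and join the three predictors with "*". This requests all main effects, all two-way interactions, and the three-way interaction. With 12 cells and 12 parameters (11 effects plus the constant), the model reproduces the observed counts exactly.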
desmat: poisson count race*gender*belief

-------------------------------------------------------------------------------
poisson
-------------------------------------------------------------------------------
Dependent variable                                                     count
Number of observations:                                                   12
Initial log likelihood:                                             -665.927
Log likelihood:                                                      -33.156
LR chi square:                                                      1265.541
Model degrees of freedom:                                                 11
Pseudo R-squared:                                                      0.950
Prob:                                                                  0.000
-------------------------------------------------------------------------------
nr Effect                                                     Coeff        s.e.
-------------------------------------------------------------------------------
count
race
1      1                                                      2.303**     0.210
gender
2      1                                                      0.940**     0.236
race.gender
3      1.1                                                   -0.545*      0.250
belief
4      2                                                     -1.609**     0.490
5      3                                                     -0.654       0.342
race.belief
6      1.2                                                   -0.105       0.516
7      1.3                                                   -0.605       0.367
gender.belief
8      1.2                                                   -0.352       0.606
9      1.3                                                   -0.797       0.446
race.gender.belief
10     1.1.2                                                  0.043       0.645
11     1.1.3                                                  0.444       0.483
12   _cons                                                    3.219**     0.200
-------------------------------------------------------------------------------
*  p < .05
** p < .01
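For readers without desmat, much the same saturated model can be fit with Stata's built-in factor-variable notation. The sketch below assumes Stata 11 or later. It re-enters the twelve rows from the listing above by hand so that it runs on its own, and the variable name fitted is made up for illustration. With the default base levels (the lowest value of each variable), the reference cell is race 0, gender 0, belief 1, so the coefficients should line up with the desmat output above.

```stata
* Re-enter the afterlife data from the listing above
clear
input race gender belief count
1 1 1 371
1 1 2  49
1 1 3  74
1 0 1 250
1 0 2  45
1 0 3  71
0 1 1  64
0 1 2   9
0 1 3  15
0 0 1  25
0 0 2   5
0 0 3  13
end

* The ## operator requests main effects plus all interactions
poisson count i.race##i.gender##i.belief

* A saturated model should reproduce the observed counts
predict fitted, n
list race gender belief count fitted
```

In the final listing, the fitted column should agree with count in every row, which is a quick way to confirm that the model is saturated.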
The program generates a set of dummy variables named _x_1, _x_2, and so on. To see which model term each one belongs to and how it is parameterized, type
showtrms

Desmat generated the following design matrix:

nr   Variables        Term                  Parameterization
     First   Last
 1   _x_1             race                  ind(0)
 2   _x_2             gender                ind(0)
 3   _x_3             race.gender           ind(0).ind(0)
 4   _x_4    _x_5     belief                ind(1)
 5   _x_6    _x_7     race.belief           ind(0).ind(1)
 6   _x_8    _x_9     gender.belief         ind(0).ind(1)
 7   _x_10   _x_11    race.gender.belief    ind(0).ind(0).ind(1)
desmat accepts several options. For example, desrep(exp all) displays the exponentiated parameters together with their standard errors, z statistics, p-values, and 95% confidence limits.
desmat: poisson count race*gender*belief, desrep(exp all)

-------------------------------------------------------------------------------
poisson
-------------------------------------------------------------------------------
Dependent variable                                                     count
Number of observations:                                                   12
Initial log likelihood:                                             -665.927
Log likelihood:                                                      -33.156
LR chi square:                                                      1265.541
Model degrees of freedom:                                                 11
Pseudo R-squared:                                                      0.950
Prob:                                                                  0.000
-------------------------------------------------------------------------------
nr Effect             Coeff        s.e.       z        prob    lo 95%    hi 95%
(exponential parameters)
-------------------------------------------------------------------------------
count
race
1      1             10.000**     2.098    10.977     0.000     6.629    15.085
gender
2      1              2.560**     0.604     3.986     0.000     1.612     4.064
race.gender
3      1.1            0.580*      0.145    -2.184     0.029     0.355     0.946
belief
4      2              0.200**     0.098    -3.285     0.001     0.077     0.522
5      3              0.520       0.178    -1.912     0.056     0.266     1.016
race.belief
6      1.2            0.900       0.464    -0.204     0.838     0.327     2.474
7      1.3            0.546       0.201    -1.646     0.100     0.266     1.122
gender.belief
8      1.2            0.703       0.426    -0.582     0.561     0.215     2.304
9      1.3            0.451       0.201    -1.785     0.074     0.188     1.081
race.gender.belief
10     1.1.2          1.044       0.673     0.066     0.947     0.295     3.695
11     1.1.3          1.558       0.753     0.918     0.359     0.604     4.017
12   _cons           25.000**     5.000    16.094     0.000    16.893    36.998
-------------------------------------------------------------------------------
*  p < .05
** p < .01
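Because the model is saturated, each exponentiated parameter can be recovered directly from the observed counts, which is a handy check on how to read the table. The sketch below is plain arithmetic with Stata's display command, using counts from the data listing; it is an illustration and not part of desmat.

```stata
* _cons is the count in the reference cell (race 0, gender 0, belief 1);
* exp(3.219) is roughly 25, up to rounding of the coefficient
display exp(3.219)

* Main effects are ratios of counts relative to the reference cell
* race 1 vs. race 0 (gender 0, belief 1): 250/25 = 10
display 250/25
* gender 1 vs. gender 0 (race 0, belief 1): 64/25 = 2.56
display 64/25
* belief 2 and belief 3 vs. belief 1 (race 0, gender 0): 0.2 and 0.52
display 5/25
display 13/25

* The race.gender term is a ratio of ratios: about 0.580
display (371/64)/(250/25)
```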
After fitting a saturated model, we often want to compare it with more parsimonious models. Immediately after fitting the saturated model, the command lrtest, saving(m0) saves its log likelihood and related information under the name m0. We then fit a smaller, nested model and run lrtest, using(m0) to compare the two with a likelihood-ratio test.
lrtest, saving(m0)
desmat: poisson count race belief*gender, desrep(exp all)

-------------------------------------------------------------------------------
poisson
-------------------------------------------------------------------------------
Dependent variable                                                     count
Number of observations:                                                   12
Initial log likelihood:                                             -665.927
Log likelihood:                                                      -36.852
LR chi square:                                                      1258.149
Model degrees of freedom:                                                  6
Pseudo R-squared:                                                      0.945
Prob:                                                                  0.000
-------------------------------------------------------------------------------
nr Effect             Coeff        s.e.       z        prob    lo 95%    hi 95%
(exponential parameters)
-------------------------------------------------------------------------------
count
race
1      1              6.565**     0.616    20.063     0.000     5.463     7.890
belief
2      2              0.182**     0.028   -11.088     0.000     0.135     0.246
3      3              0.305**     0.038    -9.513     0.000     0.239     0.390
gender
4      1              1.582**     0.122     5.952     0.000     1.360     1.840
belief.gender
5      2.1            0.733       0.152    -1.493     0.136     0.488     1.102
6      3.1            0.670*      0.114    -2.350     0.019     0.480     0.936
7    _cons           36.352**     3.682    35.473     0.000    29.806    44.336
-------------------------------------------------------------------------------
*  p < .05
** p < .01

lrtest, using(m0)

Poisson:  likelihood-ratio test                       chi2(5)     =       7.39
                                                      Prob > chi2 =     0.1931
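The test statistic can be reproduced by hand from the two log likelihoods reported above, which makes explicit what the likelihood-ratio test computes: twice the difference in log likelihoods, referred to a chi-square distribution whose degrees of freedom equal the difference in model degrees of freedom. A minimal sketch using display and the built-in chi2tail() function, with the rounded values copied from the output:

```stata
* LR statistic: 2 * (logL of saturated model - logL of reduced model)
display 2*(-33.156 - (-36.852))

* Degrees of freedom: 11 - 6 = 5 parameters dropped
display 11 - 6

* Upper-tail p-value of a chi-square(5) at the LR statistic
display chi2tail(5, 2*(-33.156 - (-36.852)))
```

A p-value of about 0.19 indicates that the reduced model does not fit significantly worse than the saturated model at the .05 level.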
Another command that comes with desmat is destest. It performs Wald tests on model terms after a model has been fit; typed without arguments, it tests every term in the most recent model.
destest

Testing all model terms ...
-------------------------------------------------------------------------------
Term                                                Wald chi2      df  P > chi2
-------------------------------------------------------------------------------
race                                                  402.544**     1     0.000
belief                                                179.902**     2     0.000
gender                                                 35.431**     1     0.000
belief.gender                                           6.766*      2     0.034
-------------------------------------------------------------------------------
*  p < .05
** p < .01
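A comparable Wald test for a single term is available without destest through Stata's built-in testparm command. The sketch below assumes Stata 11 or later for the factor-variable syntax and re-enters the data by hand so that it runs on its own. It refits the reduced model and jointly tests the two belief-by-gender interaction parameters, which should correspond to the belief.gender row above.

```stata
* Re-enter the afterlife data from the listing above
clear
input race gender belief count
1 1 1 371
1 1 2  49
1 1 3  74
1 0 1 250
1 0 2  45
1 0 3  71
0 1 1  64
0 1 2   9
0 1 3  15
0 0 1  25
0 0 2   5
0 0 3  13
end

* Reduced model: race main effect plus the full belief-by-gender factorial
quietly poisson count i.race i.belief##i.gender

* Joint Wald test of the two interaction parameters (2 degrees of freedom)
testparm i.belief#i.gender
```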
For more information, type help desmat or visit the web page on DESMAT for Stata.
