UCLA Academic Technology Services

Stata FAQ: How can I perform post estimation tests with multiply imputed datasets?

Below we show how to perform post-estimation hypothesis tests on models fit to multiply imputed data using the package mim, written by John C. Galati, Patrick Royston, and John B. Carlin. You can download the package by typing findit mim. (See "How can I use the findit command to search for programs and get additional help?" for more information about using findit.) Rather than using a set of unique command names, the mim package uses a prefix (mim:) followed by a "normal" Stata command. This is similar to other prefix commands you may already be familiar with, such as xi: or svy:.
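As a minimal sketch of the installation step, the commands below show two ways to obtain the package and one way to confirm it is installed. The ssc install line is an assumption: it presumes mim is distributed through the SSC archive, which this page does not state. findit is the route described above.

```stata
* Search for the package and follow the install links (the route described above)
findit mim

* Assumption: mim is also available from the SSC archive; if so, this installs it directly
ssc install mim

* Confirm that Stata can locate the installed command
which mim
```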

The example on this page uses data on high school students. The variables read, write, and math give the student's scores in reading, writing, and math, respectively. The variable female is equal to one if the student is female and zero otherwise. Finally, prog indicates the type of program the student is in: general, academic, or vocational. The multiply imputed datasets are stored in a single file, which contains all 10 imputations as well as the original data. The variable _mj gives the imputation number; _mj = 0 identifies the original data.
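To get a feel for this stacked layout, it can help to tabulate the imputation index and compare the original data with one completed dataset. The sketch below assumes the stacked file is already loaded in memory (its file name is not given on this page) and uses only the variable names described above.

```stata
* Count rows per imputation; _mj == 0 is the original (incomplete) data
tabulate _mj

* Summarize the scores in the original data, where some values are missing
summarize read write math if _mj == 0

* Summarize the same variables in the first completed (imputed) dataset
summarize read write math if _mj == 1
```

If the file is laid out as described, tabulate should list 11 values of _mj (0 through 10).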

Below we use mim: reg to fit a linear regression model. The variable prog is nominal (also called categorical), so the xi: prefix is used to include dummy variables for prog in the model. The xi: prefix must come before the mim: prefix. The mim: prefix informs Stata that we want to analyze multiply imputed datasets; without it, the command would be run on the file as though it were a single dataset rather than a series of multiply imputed datasets.

xi: mim: reg read write female math i.prog
i.prog            _Iprog_1-3          (naturally coded; _Iprog_1 omitted)

Multiple-imputation estimates (regress)                  Imputations =      10
Linear regression                                        Minimum obs =     200
                                                         Minimum dof =    77.2

------------------------------------------------------------------------------
        read |     Coef.  Std. Err.     t    P>|t|    [95% Conf. Int.]     FMI
-------------+----------------------------------------------------------------
       write |   .374845   .085968    4.36   0.000    .203908  .545783   0.224
      female |  -1.88666    1.1745   -1.61   0.110   -4.20897   .43566   0.114
        math |   .408751   .086997    4.70   0.000    .235523  .581978   0.244
    _Iprog_2 |   2.14236    1.4504    1.48   0.142   -.728298  5.01303   0.138
    _Iprog_3 |   .475086   1.69738    0.28   0.780   -2.89809  3.84827   0.215
       _cons |   10.4923   3.95046    2.66   0.009    2.67696  18.3076   0.128
------------------------------------------------------------------------------
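For comparison, the same model can be fit to a single slice of the stacked file using an ordinary if qualifier. This is only an illustrative sketch and not part of the mim workflow: neither fit combines results across imputations, so the standard errors do not reflect the uncertainty due to the missing data.

```stata
* Complete-case fit on the original data only (_mj == 0);
* rows with missing values are dropped by regress
xi: regress read write female math i.prog if _mj == 0

* Fit on a single completed dataset (here, the first imputation)
xi: regress read write female math i.prog if _mj == 1
```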

Once the model is estimated, the testparm command with the mim: prefix can be used to perform multiple degree of freedom tests. One common use is to test for an overall effect of a nominal variable represented by a series of dummy variables. Below we use mim: testparm to test for an overall effect of type of program (prog).

mim: testparm _Iprog_2 _Iprog_3

 ( 1)  _Iprog_2 = 0
 ( 2)  _Iprog_3 = 0

       F(  2, 375.7) =    1.18
            Prob > F =    0.3078
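The reported p-value can be checked by hand from the F statistic and its degrees of freedom. The sketch below uses Stata's built-in Ftail() function, which returns the upper-tail probability of the F distribution. The inputs are the rounded values printed above, so the result should agree with the reported 0.3078 only approximately.

```stata
* Upper-tail probability for F = 1.18 with 2 and 375.7 degrees of freedom
display Ftail(2, 375.7, 1.18)
```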

The testparm command with the mim: prefix can also be used to compare nested models, where the null hypothesis is that the coefficients on two or more variables are simultaneously equal to zero. Below we test whether the coefficients on write and math are both zero.

mim: testparm write math

 ( 1)  write = 0
 ( 2)  math = 0

       F(  2, 371.0) =   51.42
            Prob > F =    0.0000

It is also possible to test linear combinations of coefficients. Below we fit a model with an interaction between math and female. The variable female is dummy coded (0 = male, 1 = female). First we create the interaction term as we normally would, then we use the regress command with the mim: prefix to fit the regression model. Finally, the lincom command (with the mim: prefix) is used to test the null hypothesis that the effect of math on read is zero when female = 1.

gen female_math = female*math
(21 missing values generated)
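The message about missing values appears because generate runs over every row of the stacked file, including the original data. A quick check, assuming female and math are fully imputed in each of the 10 completed datasets, is to count where the new variable is missing:

```stata
* Missing interaction values in the original data (_mj == 0)
count if missing(female_math) & _mj == 0

* Missing interaction values in the imputed copies; expected to be 0
* if female and math are complete in every imputation
count if missing(female_math) & _mj > 0
```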

mim: regress read female math female_math

Multiple-imputation estimates (regress)                  Imputations =      10
Linear regression                                        Minimum obs =     200
                                                         Minimum dof =    80.8

------------------------------------------------------------------------------
        read |     Coef.  Std. Err.     t    P>|t|    [95% Conf. Int.]     FMI
-------------+----------------------------------------------------------------
      female |   -4.5878   6.87084   -0.67   0.506   -18.2109  9.03536   0.177
        math |   .645248   .095011    6.79   0.000    .456248  .834248   0.231
 female_math |   .093203   .127862    0.73   0.468   -.160302  .346708   0.176
       _cons |   17.8749   5.14618    3.47   0.001    7.63534  28.1145   0.235
------------------------------------------------------------------------------

mim: lincom math + female_math

Multiple-imputation estimates for lincom                      Imputations = 10

 ( 1)  math + female_math = 0

------------------------------------------------------------------------------
        read |    Coeff.  Std. Err.     t    P>|t|    [95% Conf. Int.]     FMI
-------------+----------------------------------------------------------------
         (1) |   .738451   .085099    8.68   0.000    .571227  .905675   0.107
------------------------------------------------------------------------------
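The point estimate from lincom is simply the sum of the two coefficients in the regression table: the coefficient on math is the slope for males (female = 0), and adding the interaction coefficient gives the slope for females. The arithmetic can be reproduced with display, using the rounded values printed above. The standard error cannot be recovered this way, because it also depends on the covariance between the two estimates.

```stata
* Slope of math for females: coefficient on math plus the interaction coefficient
display .645248 + .093203

* t statistic: the combined estimate divided by its reported standard error
display .738451 / .085099
```

The first line reproduces the estimate of .738451 shown in row (1), and the second gives roughly 8.68, the reported t statistic.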

These pages are Copyrighted (c) by UCLA Academic Technology Services

