Statistical Computing Seminars
Multiple Imputation in SAS, Part 2

Outline of this seminar:

Part 1:

Part 2:

Introduction

We ended our previous seminar with an example on using proc mi. In this seminar, we will start with an example of IVEware and then move on to performing analysis after multiple imputation. For the purpose of illustration and convenience, we assume that  multiple imputed data have been provided so at the end, we will discuss some data management issues related to proc mi and proc mianalyze.

The data set hsb_mar.sas7bdat which is based on  hsb2.sas7bdat is used for this seminar and can be downloaded following the link. The SAS code for this seminar was developed using SAS 9.2. The format used by the data set can be created as follows.

proc format;
   value female 0 = "male"
                1 = "female";
   value race   1 ="hispanic"
                2 = "asian"
                3 = "african-amer"
                4 = "white";
   value ses    1 = "low"
                2 = "milddle"
                3 = "high";
   value schtyp 1 ="public"
                2 ="private";
   value prog   1 = "general"
                2 = "academic"
		3 = "vocation";
run;
options fmtsearch=(work);

An example of using IVEware

IVEware is imputation and variance estimation software developed by Raghunathan et al (1997), and it can be called from SAS or run independently. It uses a multivariate sequential regression approach for obtaining the imputed values. The regression models can be linear regression, poisson regression, logistic regression and more. IVEware also incorporates the survey design, such as stratification, cluster and sampling weights. The description of IVEware below is extracted from the User Guide of IVEware.

"IVEware includes four modules: IMPUTE, DESCRIBE, REGRESS and SASMOD.

· IMPUTE uses a multivariate sequential regression approach to imputing item missing values. IMPUTE can create multiply imputed data sets.

· DESCRIBE estimates the population means, proportions, subgroup differences, contrasts and linear combinations of means and proportions. A Taylor Series approach is used to obtain variance estimates appropriate for a user specified complex sample design. A multiple imputation analysis can be performed when there are missing values.

· REGRESS fits linear, logistic, polytomous, Poisson, Tobit and proportional hazard regression models for data resulting from a complex sample design. The repeated replication approach is used to estimate the sampling variances. A multiple imputation analysis can be performed when there are missing values.

· SASMOD allows users to take into account complex sample design features when analyzing data with several SAS procedures. Currently the following SAS PROCS can be called: CALIS, CATMOD, GENMOD, LIFEREG, MIXED, NLIN, PHREG, and PROBIT. A multiple imputation analysis can be performed when there are missing values."

Now let's see a simple example using IVEware. The data set used here is the same as before.

options set = SRCLIB "D:\work\ive_sas_windows" 
        sasautos = ('!SRCLIB' sasautos) mautosource ;
options nofmterr;
data hsb_mar;
  set ats.hsb_mar;
run;

data _null_;
  infile datalines;
  filename setup "test1.set";
  file setup;
  input;
  put _infile_;
  datalines4;
  title Descriptives;
  datain hsb_mar;
  estout hsbdes;
  mean read math write female;
  run;
;;;;
run;
%describe(name=test1, dir=.);
Descriptives

     Problem  1

                  Degrees of freedom
                           190

         Factor   Covariance of denominator
           None        0.00000

           Mean      Number of         Sum of       Weighted       Standard
           READ          Cases        Weights           Mean          Error
                           191            191       52.28796      0.7388216

                         Lower          Upper         T Test     Prob > |T|
                         Bound          Bound
                      50.83062        53.7453       70.77210        0.00000

                    Unweighted           Bias         Design
                          Mean                        Effect
                      52.28796        0.00000        1.00000

     Problem  2

                  Degrees of freedom
                           184

         Factor   Covariance of denominator
           None        0.00000

           Mean      Number of         Sum of       Weighted       Standard
           MATH          Cases        Weights           Mean          Error
                           185            185        52.8973      0.6882224

                         Lower          Upper         T Test     Prob > |T|
                         Bound          Bound
                      51.53947       54.25512       76.86076        0.00000

                    Unweighted           Bias         Design
                          Mean                        Effect
                       52.8973        0.00000        1.00000

     Problem  3

                  Degrees of freedom
                           182

         Factor   Covariance of denominator
           None        0.00000

           Mean      Number of         Sum of       Weighted       Standard
          WRITE          Cases        Weights           Mean          Error

                         Lower          Upper         T Test     Prob > |T|
                         Bound          Bound
                      51.60053       54.30111       77.37340        0.00000

                    Unweighted           Bias         Design
                          Mean                        Effect
                      52.95082        0.00000        1.00000

     Problem  4

                  Degrees of freedom
                           181

         Factor   Covariance of denominator
           None        0.00000

           Mean      Number of         Sum of       Weighted       Standard
         FEMALE          Cases        Weights           Mean          Error
                           182            182      0.5549451     0.03693963

                         Lower          Upper         T Test     Prob > |T|
                         Bound          Bound
                     0.4820576      0.6278325       15.02303        0.00000

                    Unweighted           Bias         Design
                          Mean                        Effect
                     0.5549451        0.00000        1.00000
 
data _null_;
  infile datalines;
  filename setup "impute.set";
  file setup;
  input;
  put _infile_;
datalines4;
  title Multiple imputation;
  datain hsb_mar;
  dataout hsb_mar_1;
  default continuous;
  categorical female prog race ses;
  iterations 5;
  multiples 5;
  seed 2001;
  run;
;;;;
%impute(name=impute, dir=.);
%PUTDATA(NAME=impute,DIR=. ,MULT=2,DATAOUT=hsb_mar_2);
%PUTDATA(NAME=impute,DIR=. ,MULT=3,DATAOUT=hsb_mar_3);
%PUTDATA(NAME=impute,DIR=. ,MULT=4,DATAOUT=hsb_mar_4);
%PUTDATA(NAME=impute,DIR=. ,MULT=5,DATAOUT=hsb_mar_5);
RUN;

data _null_;
  infile datalines;
  filename setup "impute_regress.set";
  file setup;
  input;
  put _infile_;
datalines4;
  title Multiply imputed jackknife regression;
  datain
    hsb_mar_1
    hsb_mar_2
    hsb_mar_3
    hsb_mar_4
    hsb_mar_5;
  estout regexam2;
  dependent socst;
  predictor write read female math;
  run;
;;;;
%regress(name=impute_regress, dir=.);
RUN;
Multiply imputed jackknife regression

Regression type:        Linear
Dependent variable:     SOCST  social studies score
Predictors:             WRITE  writing score
                        READ  reading score
                        FEMALE  female
                        MATH  math score

All imputations

Valid cases                200

Degr freedom       16.80949217

Sum of squares:
  Model            11039.10185
  Error            11897.09315
  Total              22936.195
  R-square             0.48130
  F-value              3.89931
  P-value              0.02027

Variable              Estimate         Std Error            T Test        Prob > |T|
Intercept            6.5596849         3.6069808           1.81861           0.08683
WRITE                0.3412771         0.1050343           3.24920           0.00478
READ                 0.3911219         0.0790752           4.94620           0.00013
FEMALE               0.7965601         1.5908236           0.50072           0.62306
MATH                 0.1292599         0.0922853           1.40065           0.17951

Variable              Estimate             95% Confidence Interval
                                           Lower             Upper
Intercept            6.5596849        -1.0569193        14.1762891
WRITE                0.3412771         0.1194836         0.5630706
READ                 0.3911219         0.2241444         0.5580993
FEMALE               0.7965601        -2.5626686         4.1557888
MATH                 0.1292599        -0.0656125         0.3241323

A regression example with a multivariate hypothesis test

Once we have obtained the multiply imputed data, we will move on to the second and third step of our analysis described in previous seminar, that is to perform the analysis by imputation and to combine the analyses at the end. The last step of combining the analyses should be done using the SAS procedure proc mianalyze.

Let's say that we are interested in modeling the score of a social sciences test, socst on read, math, write, female as before. Since we want to treat female as a categorical variable in the analysis model, we are going to use the imputed data set created at the end of last seminar as shown below.  Because in this setup we have forced the imputed values for female to be between 0 and 1. In an extra data step we are going to  "recreate" the values for female using binomial distribution based on the imputed values.  

proc mi data = hsb_mar seed=432156 nimpute = 20 out=t2 
    minimum = . . . . 0 0 0 0 0 0 0 0 0
    maximum = . . . . 1 1 1 1 1 1 1 1 1
    MINMAXITER = 400;
    mcmc plots=trace plots=acf;
    var socst write read  math s1 s2 p1 p2 r1-r3 schtyp female;	
run;
data t2;
  set t2;
  if 0 < female < 1 then female = ranbin(1245, 1, female);
run;

Now we are ready to perform our regression analysis. One of the hypothesis that we are interested in testing is that the effect of write, math and read are equal to each other. The hypothesis test can be carried out using the test statement in proc mianalyze as shown below.

proc reg data = t2 outest=regout covout;
  by _imputation_;
  model socst = write read math female;
run;
quit;
proc mianalyze data=regout edf=195;
  modeleffects Intercept write read math female;
  mytest: test write=read, write=math / mult;
run;
          Model Information

Data Set                  WORK.REGOUT
Number of Imputations     20


                                            Variance Information

                                                                       Relative       Fraction
             -----------------Variance-----------------                Increase        Missing       Relative
Parameter         Between         Within          Total       DF    in Variance    Information     Efficiency

Intercept        0.356772      12.835697      13.210307   186.08       0.029185       0.028440       0.998580
write            0.001502       0.007645       0.009222   128.41       0.206224       0.173506       0.991399
read             0.000386       0.005915       0.006320   173.88       0.068458       0.064476       0.996787
math             0.000973       0.007845       0.008866    152.6       0.130165       0.116407       0.994213
female           0.454866       1.482958       1.960567   100.28       0.322065       0.248289       0.987738


                                                                 Parameter Estimates

                                                                                                                                  t for H0:
Parameter        Estimate      Std Error    95% Confidence Limits        DF        Minimum        Maximum         Theta0   Parameter=Theta0   Pr > |t|

Intercept        7.071084       3.634599     -0.09923     14.24140   186.08       5.924567       8.111225              0               1.95     0.0532
write            0.339907       0.096032      0.14990      0.52992   128.41       0.269254       0.421620              0               3.54     0.0006
read             0.377972       0.079500      0.22106      0.53488   173.88       0.347713       0.408543              0               4.75     <.0001
math             0.138487       0.094162     -0.04754      0.32452    152.6       0.077483       0.202446              0               1.47     0.1434
female           0.378430       1.400203     -2.39944      3.15630   100.28      -1.014299       1.180621              0               0.27     0.7875

Test: mytest

                                           Test Specification

             ----------------------------------L Matrix----------------------------------
Parameter       Intercept           write            read            math          female               C

TestPrm1                0        1.000000       -1.000000               0               0               0
TestPrm2                0        1.000000               0       -1.000000               0               0


                                            Variance Information

                                                                       Relative       Fraction
             -----------------Variance-----------------                Increase        Missing       Relative
Parameter         Between         Within          Total       DF    in Variance    Information     Efficiency

TestPrm1         0.002665       0.017921       0.020719   143.89       0.156161       0.136724       0.993210
TestPrm2         0.004402       0.022265       0.026888   128.01       0.207616       0.174487       0.991351


                                                               Parameter Estimates

                                                                                                                             t for H0:
Parameter        Estimate      Std Error    95% Confidence Limits        DF        Minimum        Maximum              C   Parameter=C   Pr > |t|

TestPrm1        -0.038066       0.143943     -0.32258     0.246450   143.89      -0.110137       0.067655              0         -0.26     0.7918
TestPrm2         0.201420       0.163975     -0.12303     0.525872   128.01       0.078520       0.344137              0          1.23     0.2216


                    Multivariate Inference
Assuming Proportionality of Between/Within Covariance Matrices

Avg Relative
    Increase                            F for H0:
 in Variance   Num DF   Den DF   Parameter=Theta0     Pr > F

    0.143052        2   1979.5               1.37     0.2531

A variation of the regression example

If we are only interested in getting the parameter estimates of the regression model, we don't really need variance and covariance matrix, so the code below works.

ods output parameterestimates=regout1;
proc reg data = t2;
  by _imputation_;
  model socst = write read math female;
run;
quit;
proc mianalyze parms=regout1 edf=195;
  modeleffects Intercept write read math female;
run;
          Model Information

PARMS Data Set            WORK.REGOUT1
Number of Imputations     20

                                            Variance Information

                                                                       Relative       Fraction
             -----------------Variance-----------------                Increase        Missing       Relative
Parameter         Between         Within          Total       DF    in Variance    Information     Efficiency

Intercept        0.356772      12.835697      13.210307   186.08       0.029185       0.028440       0.998580
write            0.001502       0.007645       0.009222   128.41       0.206224       0.173506       0.991399
read             0.000386       0.005915       0.006320   173.88       0.068458       0.064476       0.996787
math             0.000973       0.007845       0.008866    152.6       0.130165       0.116407       0.994213
female           0.454866       1.482958       1.960567   100.28       0.322065       0.248289       0.987738

                                                                 Parameter Estimates

                                                                                                                                  t for H0:
Parameter        Estimate      Std Error    95% Confidence Limits        DF        Minimum        Maximum         Theta0   Parameter=Theta0   Pr > |t|

Intercept        7.071084       3.634599     -0.09923     14.24140   186.08       5.924567       8.111225              0               1.95     0.0532
write            0.339907       0.096032      0.14990      0.52992   128.41       0.269254       0.421620              0               3.54     0.0006
read             0.377972       0.079500      0.22106      0.53488   173.88       0.347713       0.408543              0               4.75     <.0001
math             0.138487       0.094162     -0.04754      0.32452    152.6       0.077483       0.202446              0               1.47     0.1434
female           0.378430       1.400203     -2.39944      3.15630   100.28      -1.014299       1.180621              0               0.27     0.7875

Well, as expected, the two methods produce identical output. In each method, the first step is to run the regression model and the second step is to run proc mianalyze to combine the regression model results. The difference between the two methods is the difference in data set created in step 1 for combining the results in step 2. Let's take a look at these data sets.

*method 1;
proc print data = regout (obs=24);
run;

Obs    _Imputation_    _MODEL_    _TYPE_    _NAME_       _DEPVAR_     _RMSE_    Intercept      WRITE       READ        MATH       FEMALE     SOCST

  1          1         MODEL1     PARMS                   SOCST      7.86928       6.7148     0.30645     0.34784     0.20245     1.03407      -1
  2          1         MODEL1     COV       Intercept     SOCST      7.86928      12.6760    -0.10452    -0.04320    -0.08309    -0.35284       .
  3          1         MODEL1     COV       WRITE         SOCST      7.86928      -0.1045     0.00763    -0.00226    -0.00314    -0.02925       .
  4          1         MODEL1     COV       READ          SOCST      7.86928      -0.0432    -0.00226     0.00569    -0.00265     0.00749       .
  5          1         MODEL1     COV       MATH          SOCST      7.86928      -0.0831    -0.00314    -0.00265     0.00725     0.01402       .
  6          1         MODEL1     COV       FEMALE        SOCST      7.86928      -0.3528    -0.02925     0.00749     0.01402     1.37260       .
  7          2         MODEL1     PARMS                   SOCST      7.93860       7.3575     0.36895     0.38182     0.10668    -0.32243      -1
  8          2         MODEL1     COV       Intercept     SOCST      7.93860      13.1665    -0.07921    -0.05651    -0.10380    -0.38420       .
  9          2         MODEL1     COV       WRITE         SOCST      7.93860      -0.0792     0.00823    -0.00234    -0.00399    -0.04636       .
 10          2         MODEL1     COV       READ          SOCST      7.93860      -0.0565    -0.00234     0.00622    -0.00290     0.01528       .
 11          2         MODEL1     COV       MATH          SOCST      7.93860      -0.1038    -0.00399    -0.00290     0.00863     0.02302       .
 12          2         MODEL1     COV       FEMALE        SOCST      7.93860      -0.3842    -0.04636     0.01528     0.02302     1.53388       .
 13          3         MODEL1     PARMS                   SOCST      7.87091       6.9166     0.29919     0.40275     0.15272     0.79477      -1
 14          3         MODEL1     COV       Intercept     SOCST      7.87091      12.7756    -0.09557    -0.04862    -0.09077    -0.10551       .
 15          3         MODEL1     COV       WRITE         SOCST      7.87091      -0.0956     0.00807    -0.00225    -0.00361    -0.04640       .
 16          3         MODEL1     COV       READ          SOCST      7.87091      -0.0486    -0.00225     0.00599    -0.00291     0.01422       .
 17          3         MODEL1     COV       MATH          SOCST      7.87091      -0.0908    -0.00361    -0.00291     0.00807     0.01886       .
 18          3         MODEL1     COV       FEMALE        SOCST      7.87091      -0.1055    -0.04640     0.01422     0.01886     1.51881       .
 19          4         MODEL1     PARMS                   SOCST      7.84248       7.3165     0.38010     0.37954     0.09995    -0.46673      -1
 20          4         MODEL1     COV       Intercept     SOCST      7.84248      12.5256    -0.07946    -0.04984    -0.09936    -0.26761       .
 21          4         MODEL1     COV       WRITE         SOCST      7.84248      -0.0795     0.00705    -0.00198    -0.00318    -0.04211       .
 22          4         MODEL1     COV       READ          SOCST      7.84248      -0.0498    -0.00198     0.00613    -0.00327     0.01163       .
 23          4         MODEL1     COV       MATH          SOCST      7.84248      -0.0994    -0.00318    -0.00327     0.00813     0.02048       .
 24          4         MODEL1     COV       FEMALE        SOCST      7.84248      -0.2676    -0.04211     0.01163     0.02048     1.49208       .

*method 2;
options nolabel;
proc print data = regout1 (obs=20);
run;
Obs    _Imputation_    Model     Dependent    Variable     DF       Estimate         StdErr     tValue     Probt

  1          1         MODEL1      SOCST      Intercept     1        6.71475        3.56034       1.89    0.0608
  2          1         MODEL1      SOCST      WRITE         1        0.30645        0.08735       3.51    0.0006
  3          1         MODEL1      SOCST      READ          1        0.34784        0.07541       4.61    <.0001
  4          1         MODEL1      SOCST      MATH          1        0.20245        0.08517       2.38    0.0184
  5          1         MODEL1      SOCST      FEMALE        1        1.03407        1.17158       0.88    0.3785
  6          2         MODEL1      SOCST      Intercept     1        7.35753        3.62857       2.03    0.0440
  7          2         MODEL1      SOCST      WRITE         1        0.36895        0.09070       4.07    <.0001
  8          2         MODEL1      SOCST      READ          1        0.38182        0.07887       4.84    <.0001
  9          2         MODEL1      SOCST      MATH          1        0.10668        0.09290       1.15    0.2522
 10          2         MODEL1      SOCST      FEMALE        1       -0.32243        1.23850      -0.26    0.7949
 11          3         MODEL1      SOCST      Intercept     1        6.91660        3.57430       1.94    0.0544
 12          3         MODEL1      SOCST      WRITE         1        0.29919        0.08985       3.33    0.0010
 13          3         MODEL1      SOCST      READ          1        0.40275        0.07741       5.20    <.0001
 14          3         MODEL1      SOCST      MATH          1        0.15272        0.08983       1.70    0.0907
 15          3         MODEL1      SOCST      FEMALE        1        0.79477        1.23240       0.64    0.5198
 16          4         MODEL1      SOCST      Intercept     1        7.31651        3.53915       2.07    0.0400
 17          4         MODEL1      SOCST      WRITE         1        0.38010        0.08398       4.53    <.0001
 18          4         MODEL1      SOCST      READ          1        0.37954        0.07831       4.85    <.0001
 19          4         MODEL1      SOCST      MATH          1        0.09995        0.09014       1.11    0.2689
 20          4         MODEL1      SOCST      FEMALE        1       -0.46673        1.22151      -0.38    0.7028

In method 1, the data set regout, created by proc reg using the outest = and covout options, contains not only the parameter estimates but also the variance covariance matrix of the parameter estimates. In method 2, the data set regout2, created using ods output only, contains parameter estimates and their standard errors. If we want to run any hypothesis test that involves more than just one single predictor variable in the model, we will have to use method 1 since the information on variance and covariance is needed.

Getting the averaged R-square

Oftentimes we want to report R-square for a model. Now with multiply imputed data set, we end up with multiple R-squares. You can report the averaged R-square together with the minimum and maximum.  It is fairly straightforward to get them, and here is one way to do so.

ods output FitStatistics = fitstat;
proc reg data = t2;
  by _imputation_;
  model socst = write read math female;
run;
quit;
proc means data = fitstat min max mean;
 where  Label2="R-Square";
 title "Averaged R-square";
 var nvalue2;
run;
Averaged R-square                                                                            

The MEANS Procedure

        Analysis Variable : nValue2

     Minimum         Maximum            Mean
--------------------------------------------
   0.4579211       0.4856979       0.4714016
--------------------------------------------

Another regression example with categorical variables and their interactions

Let's say that our analysis model is variable write on ses, female, read and the interaction of ses with female, treating both ses and female as categorical variables. First of all, proc reg does not have a class statement. Secondly, the test statement in proc mianalyze does not work with class variables. This means that we need to create dummy variables before running proc mianalyze. In SAS 9 and above, this becomes a lot easier with proc glmmod. Proc glmmod, like some of the SAS procedures, does not perform analysis, per se. Instead, it produces a design matrix (dummy variables) for a general linear model. It can be used with regression procedures. Here is the code for our model.  

ods output designpoints=t3;
ods listing close;
proc glmmod data=t2;
  by _imputation_;
  class ses female;
  model write = ses female ses*female read;
run;
ods listing;
proc reg data = t3 outest=regout3 covout;
  by _imputation_;
  model write = ses_1 ses_2 female_1 ses_female_1_1 ses_female_2_1 read;
run;
proc mianalyze data=regout3 edf=193;
  modeleffects intercept ses_1 ses_2 female_1 ses_female_1_1 ses_female_2_1 read;
  interaction: test ses_female_1_1, ses_female_2_1 / mult;
run;
          Model Information

Data Set                  WORK.REGOUT3
Number of Imputations     20


                                               Variance Information

                                                                            Relative       Fraction
                  -----------------Variance-----------------                Increase        Missing       Relative
Parameter              Between         Within          Total       DF    in Variance    Information     Efficiency

intercept             0.743091      10.249762      11.030007   169.59       0.076123       0.071227       0.996451
ses_1                 1.110474       5.234794       6.400792   122.74       0.222740       0.185006       0.990834
ses_2                 0.494814       2.980575       3.500130   136.86       0.174314       0.150407       0.992536
female_1              1.263178       3.577076       4.903412    90.69       0.370788       0.276047       0.986386
ses_female_1_1        2.701027       8.498906      11.334985   97.309       0.333699       0.255098       0.987406
ses_female_2_1        1.526107       5.777745       7.380157   109.08       0.277342       0.220981       0.989072
read                  0.000158       0.002728       0.002894   174.65       0.060796       0.057638       0.997126

                                                                    Parameter Estimates
                                                                                                                                       t for H0:
Parameter             Estimate      Std Error    95% Confidence Limits        DF        Minimum        Maximum         Theta0   Parameter=Theta0   Pr > |t|

intercept            23.364482       3.321145     16.80837     29.92059   169.59      21.945295      24.840872              0               7.04     <.0001
ses_1                -2.620320       2.529979     -7.62836      2.38772   122.74      -5.262403      -0.600005              0              -1.04     0.3024
ses_2                -2.022661       1.870863     -5.72220      1.67688   136.86      -3.360548      -0.761200              0              -1.08     0.2815
female_1              4.091534       2.214365     -0.30723      8.49030    90.69       2.195456       6.036303              0               1.85     0.0679
ses_female_1_1        1.767133       3.366747     -4.91466      8.44893   97.309      -1.381168       4.865506              0               0.52     0.6009
ses_female_2_1        1.946620       2.716644     -3.43764      7.33088   109.08      -0.877186       3.763880              0               0.72     0.4752
read                  0.538969       0.053792      0.43280      0.64514   174.65       0.519234       0.567900              0              10.02     <.0001

Test: interaction
                                                           Test Specification

             --------------------------------------------------L Matrix--------------------------------------------------
                                                                              ses_female_     ses_female_
Parameter       intercept           ses_1           ses_2        female_1             1_1             2_1            read               C

TestPrm1                0               0               0               0        1.000000               0               0               0
TestPrm2                0               0               0               0               0        1.000000               0               0

                                            Variance Information

                                                                       Relative       Fraction
             -----------------Variance-----------------                Increase        Missing       Relative
Parameter         Between         Within          Total       DF    in Variance    Information     Efficiency

TestPrm1         2.701027       8.498906      11.334985   97.309       0.333699       0.255098       0.987406
TestPrm2         1.526107       5.777745       7.380157   109.08       0.277342       0.220981       0.989072

                                                               Parameter Estimates
                                                                                                                             t for H0:
Parameter        Estimate      Std Error    95% Confidence Limits        DF        Minimum        Maximum              C   Parameter=C   Pr > |t|

TestPrm1         1.767133       3.366747     -4.91466     8.448925   97.309      -1.381168       4.865506              0          0.52     0.6009
TestPrm2         1.946620       2.716644     -3.43764     7.330878   109.08      -0.877186       3.763880              0          0.72     0.4752

                    Multivariate Inference
Assuming Proportionality of Between/Within Covariance Matrices

Avg Relative
    Increase                            F for H0:
 in Variance   Num DF   Den DF   Parameter=Theta0     Pr > F

    0.271089        2   690.87               0.28     0.7576

A  logistic regression example

Let's say that students are put in an honors class based on their writing score cutoff at 60, and we want to predict the probability of a student being in the honors class based on the reading score and gender. We also want to produce odds ratio for the two predictor variables. To this end, we need to calculate the odds ratio based on the parameter estimates given by proc mianalyze. This is done by first outputting the parameter estimates to a data set and then create odds ratio variables in a data step.  Also notice that we didn't ask for variance and covariance in this example, since we are not interested in performing a hypothesis test.

data t4;
  set t2;
  y = (write>=60);
run;
proc logistic data = t4 descending;
  by _imputation_;
  model y = female read ses;
  ods output parameterestimates=logparms;
run;
ods output parameterestimates = logout;
proc mianalyze parms=logparms;
  modeleffects intercept female read ses;
run;
data logout_a;
  set logout;
  if parm ^="intercept";
  odds_ratio = exp(estimate);
  odds_LCLMean = exp(lclmean);
  odds_UCLMean = exp(uclmean);
run;
proc print data = logout_a noobs;
  var parm odds:;
run;
                  odds_      odds_      odds_
Obs    Parm       ratio     LCLMean    UCLMean

 1     female    2.69291    1.11796    6.48660
 2     read      1.14077    1.08855    1.19550
 3     ses       1.24224    0.71140    2.16919

What if we want to present the model graphically? Here is an example. Let's say that we want to plot the predicted probabilities for read scores ranging from 30 to 70 by male and female, fixing the ses value at 2. The basic steps in this example are the following.

proc transpose data = logout out=score;
  id parm;
  var estimate;
run;
data score;
  set score;
  _type_="PARMS";
  drop _name_;
run;
proc print data = score noobs;
run;
   intercept          female            read             ses    _type_

   -9.314543        0.990624        0.131705        0.216918    PARMS

data toscore;
  do female = 0 to 1;
     do read = 30 to 70 by 5;
	   intercept = 1;
	   ses = 2;
	   output;
	 end;
  end;
run;

proc score data = toscore score=score out=scored type=parms;
run;
data scored_p;
  set scored;
  p = exp(estimate)/(1+exp(estimate));
run;

ods graphics on;
proc sgplot data = scored_p;
  series x = read y = p /group=female;
run;
ods graphics off;

A mixed model example

This example involves imputation of a longitudinal data set and running a mixed model after the multiple imputation. The full data set, dp.sas7bdat, and data set with missing values, dp_miss.sas7bdat, created based on the complete data set can be downloaded by following the link. Let's first run a mixed model using the full data set.

proc mixed data = dp;
  class group subj;
  model y = group pre x1 visit /solution;
  random intercept visit /subject=subj type=un;
run;
             Dimensions

Covariance Parameters             4
Columns in X                      6
Columns in Z Per Subject          2
Subjects                         61
Max Obs Per Subject               6


          Number of Observations

Number of Observations Read             295
Number of Observations Used             295
Number of Observations Not Used           0


                     Iteration History

Iteration    Evaluations    -2 Res Log Like       Criterion

        0              1      1799.97381611
        1              2      1649.18880814      0.00000115
        2              1      1649.18816211      0.00000000

                   Convergence criteria met.


 Covariance Parameter Estimates

Cov Parm     Subject    Estimate

UN(1,1)      SUBJ        23.1600
UN(2,1)      SUBJ        -2.5572
UN(2,2)      SUBJ         0.8530
Residual                  8.4011


           Fit Statistics

-2 Res Log Likelihood          1649.2
AIC (smaller is better)        1657.2
AICC (smaller is better)       1657.3
BIC (smaller is better)        1665.6


  Null Model Likelihood Ratio Test

    DF    Chi-Square      Pr > ChiSq

     3        150.79          <.0001


                       Solution for Fixed Effects

                                  Standard
Effect       GROUP    Estimate       Error      DF    t Value    Pr > |t|

Intercept               1.9654      3.8244      58       0.51      0.6093
GROUP        0          4.0593      1.1131     180       3.65      0.0003
GROUP        1               0           .       .        .         .
PRE                     0.4696      0.1484     180       3.17      0.0018
X1                     0.01702     0.01496     180       1.14      0.2569
VISIT                  -1.2094      0.1664      52      -7.27      <.0001


        Type 3 Tests of Fixed Effects

              Num     Den
Effect         DF      DF    F Value    Pr > F

GROUP           1     180      13.30    0.0003
PRE             1     180      10.02    0.0018
X1              1     180       1.29    0.2569
VISIT           1      52      52.82    <.0001

Notice that the covariate x1 is not a significant predictor in this model. Now let's run the same model using the data set with missing data.

proc mixed data = ats.dp_miss;
  class group subj;
  model y = group pre x1 visit /solution;
  random intercept visit /subject=subj type=un;
run;
            Dimensions

Covariance Parameters             4
Columns in X                      6
Columns in Z Per Subject          2
Subjects                         58
Max Obs Per Subject               6


          Number of Observations

Number of Observations Read             295
Number of Observations Used             250
Number of Observations Not Used          45


                     Iteration History

Iteration    Evaluations    -2 Res Log Like       Criterion

        0              1      1494.45188419
        1              2      1374.49573745      0.00002490
        2              1      1374.48385751      0.00000006
        3              1      1374.48382794      0.00000000

                   Convergence criteria met.

 Covariance Parameter Estimates

Cov Parm     Subject    Estimate

UN(1,1)      SUBJ        20.2010
UN(2,1)      SUBJ        -1.6866
UN(2,2)      SUBJ         0.4588
Residual                  7.8997

           Fit Statistics

-2 Res Log Likelihood          1374.5
AIC (smaller is better)        1382.5
AICC (smaller is better)       1382.7
BIC (smaller is better)        1390.7

  Null Model Likelihood Ratio Test

    DF    Chi-Square      Pr > ChiSq

     3        119.97          <.0001

                       Solution for Fixed Effects

                                  Standard
Effect       GROUP    Estimate       Error      DF    t Value    Pr > |t|

Intercept               0.8407      3.9547      55       0.21      0.8324
GROUP        0          4.3650      1.0995     141       3.97      0.0001
GROUP        1               0           .       .        .         .
PRE                     0.4023      0.1492     141       2.70      0.0079
X1                     0.03676     0.01731     141       2.12      0.0355
VISIT                  -1.3262      0.1506      49      -8.81      <.0001


        Type 3 Tests of Fixed Effects

              Num     Den
Effect         DF      DF    F Value    Pr > F

GROUP           1     141      15.76    0.0001
PRE             1     141       7.27    0.0079
X1              1     141       4.51    0.0355
VISIT           1      49      77.54    <.0001

Now the predictor variable x1 becomes a significant predictor. This is an example of what happens to the parameter estimates with listwise deletion when the data are not missing completely at random. (Since we generated the data with missing values based on the full data set, we know this is the case.) The parameter estimates are biased using listwise deletion when missing is not completely at random. Assuming that the data are missing at random, applying a multiple imputation technique to the data set will give us a chance to model the missingness and to improve the estimates of variance covariance structure needed for the analysis model.

For a longitudinal data set, the way to impute is to turn the data in wide so that each row corresponds to a subject. After the imputation, we will have to turn the data back to long for the mixed model analysis. In turning data from long to wide, we need to know which variables are within-variables; in other words, time-varying and which variables are between variables. In our example, visit is the variable for time and x1 is a time-varying covariate. The outcome variable y, apparently, is also a time-varying variable.

* number of time points;
proc freq data = dp_miss;
  tables visit;
run;
                                  Cumulative    Cumulative
VISIT    Frequency     Percent     Frequency      Percent
----------------------------------------------------------
    1          61       20.68            61        20.68
    2          53       17.97           114        38.64
    3          46       15.59           160        54.24
    4          45       15.25           205        69.49
    5          45       15.25           250        84.75
    6          45       15.25           295       100.00

proc sort data = dp_miss;
  by subj visit;
run;
data wide;
  set dp_miss;
  array yt(6);
  array xt(6);
  by subj;
  retain yt xt;
  if first.subj then do i = 1 to 6;
     yt(i) = .;
	 xt(i) = .;
  end;
  yt(visit) = y;
  xt(visit) = x1;
  if last.subj;
  drop visit y x1 i;
run;
proc print data = wide (obs=10) noobs;
run;
SUBJ    GROUP    PRE    yt1      yt2      yt3      yt4        yt5        yt6        xt1        xt2        xt3        xt4        xt5        xt6

  1       0       18     17    18.0000     15    17.0000    14.0000    15.0000    119.717    126.267    129.251    138.490    121.896    130.930
  2       0       27     26    23.0000     18    17.0000    12.0000    10.0000       .          .       123.376       .       132.156    102.482
  3       0       16     17    14.0000      .      .          .          .        131.226    124.536       .          .          .          .
  4       0       17     14    23.0000     17    13.0000    12.0000    12.0000    120.253    124.482    121.966    130.832    126.038    122.019
  5       0       15     12    10.0000      8     4.0000     5.0000     5.0000    152.194    124.859    134.678    155.972    153.952    138.948
  6       0       20     19    11.5400      9     8.0000     6.8200     5.0500    143.304    130.493    134.912    123.653    125.339    130.643
  7       0       16     13    13.0000      9     7.0000     8.0000     7.0000    125.082    113.869    127.469    126.645    129.214    100.558
  8       0       28     26    27.0000      .      .          .          .        119.613    145.215       .          .          .          .
  9       0       28     26    24.0000     19    13.9400    11.0000     9.0000       .       136.975    116.491    119.581       .          .
 10       0       25      9    12.0000     15    12.0000    13.0000    20.0000    131.701       .       127.819    126.092    124.667       .
ods select misspattern;
proc mi data = wide out=wide_imputed;
  var xt: yt: group pre ;
run;
                                                                         Missing Data Patterns
                                                                                                   -----------------------------Group Means----------------------------
Group  xt1  xt2  xt3  xt4  xt5  xt6  yt1  yt2  yt3  yt4  yt5  yt6  GROUP  PRE      Freq   Percent           xt1           xt2           xt3           xt4           xt5
    1  X    X    X    X    X    X    X    X    X    X    X    X    X      X          31     50.82    129.210454    123.651090    133.238919    133.059281    127.296889
    2  X    X    X    X    X    .    X    X    X    X    X    X    X      X           1      1.64    124.891563    114.192368    129.880737    100.408318    132.438690
    3  X    X    X    .    .    .    X    X    X    .    .    .    X      X           1      1.64    124.349800    140.247986    129.350708             .             .
    4  X    X    .    .    .    .    X    X    .    .    .    .    X      X           5      8.20    130.125708    140.110742             .             .             .
    5  X    .    X    X    X    X    X    X    X    X    X    X    X      X           1      1.64    133.692200             .    131.970581    121.127487    118.771194
    6  X    .    X    X    X    .    X    X    X    X    X    X    X      X           1      1.64    131.701416             .    127.819115    126.092209    124.666901
    7  X    .    X    .    .    .    X    X    X    X    X    X    X      X           1      1.64    153.944626             .    138.519775             .             .
    8  X    .    .    .    .    .    X    .    .    .    .    .    X      X           6      9.84    128.225189             .             .             .             .
    9  .    X    X    X    .    X    X    X    X    X    X    X    X      X           1      1.64             .    122.507767    130.384705    121.351753             .
   10  .    X    X    X    .    .    X    X    X    X    X    X    X      X           1      1.64             .    136.975418    116.491409    119.580635             .
   11  .    X    X    .    X    X    X    X    X    X    X    X    X      X           1      1.64             .    135.618896    131.715240             .    135.236145
   12  .    X    X    .    X    .    X    X    X    X    X    X    X      X           1      1.64             .    120.561295    127.346565             .    117.862259
   13  .    X    X    .    .    X    X    X    X    X    X    X    X      X           1      1.64             .    126.772293    132.519714             .             .
   14  .    X    .    X    .    .    X    X    X    X    X    X    X      X           1      1.64             .    188.888687             .    164.752579             .
   15  .    X    .    .    .    .    X    X    .    .    .    .    X      X           1      1.64             .    129.818832             .             .             .
   16  .    .    X    .    X    X    X    X    X    X    X    X    X      X           1      1.64             .             .    123.375587             .    132.156387
   17  .    .    X    .    .    .    X    X    X    X    X    X    X      X           1      1.64             .             .    132.042450             .             .
   18  .    .    .    X    X    X    X    X    X    X    X    X    X      X           1      1.64             .             .             .    119.213478    140.304291
   19  .    .    .    .    X    X    X    X    X    X    X    X    X      X           1      1.64             .             .             .             .    122.023026
   20  .    .    .    .    .    .    X    X    .    .    .    .    X      X           1      1.64             .             .             .             .             .
   21  .    .    .    .    .    .    X    .    .    .    .    .    X      X           2      3.28             .             .             .             .             .
                                                               Missing Data Patterns
       -----------------------------------------------------------------Group Means----------------------------------------------------------------
Group           xt6             yt1             yt2             yt3             yt4             yt5             yt6           GROUP             PRE
    1    129.316789       14.080968       10.947742        9.389677        9.474839        7.758387        7.240323        0.580645       18.935484
    2             .       11.000000        7.000000        5.000000        8.000000        2.000000        3.000000        1.000000       23.000000
    3             .        7.000000       12.000000       15.000000               .               .               .        1.000000       22.000000
    4             .       16.600000       19.800000               .               .               .               .        0.200000       20.000000
    5    112.064934       11.000000        7.000000        3.000000        2.000000        2.000000        2.000000        1.000000       28.000000
    6             .        9.000000       12.000000       15.000000       12.000000       13.000000       20.000000               0       25.000000
    7             .       14.000000       14.000000       13.000000       12.000000       18.000000       15.000000        1.000000       24.000000
    8             .       16.166667               .               .               .               .               .        0.333333       19.500000
    9    121.262314       19.000000       13.000000       22.000000       12.000000       18.000000       13.000000               0       26.000000
   10             .       26.000000       24.000000       19.000000       13.940000       11.000000        9.000000               0       28.000000
   11    138.241745       20.000000       18.000000       16.000000        9.000000       10.000000        6.000000        1.000000       25.000000
   12             .       16.000000       15.000000       11.000000       11.000000       11.000000       11.000000        1.000000       24.000000
   13    127.396812        8.000000       17.000000       15.000000        7.000000        5.000000        7.000000        1.000000       27.000000
   14             .       22.000000       27.000000       24.000000       22.000000       24.000000       23.000000        1.000000       27.459999
   15             .       16.000000       13.000000               .               .               .               .        1.000000       23.000000
   16    102.481682       26.000000       23.000000       18.000000       17.000000       12.000000       10.000000               0       27.000000
   17             .        1.000000       18.000000       10.000000       13.000000       12.000000       10.000000        1.000000       26.000000
   18    169.372818       15.000000       24.000000       18.000000       15.190000       13.000000       12.320000        1.000000       25.000000
   19    125.793289       11.000000        9.000000       10.000000        8.000000        7.000000        4.000000        1.000000       23.000000
   20             .       13.000000       22.000000               .               .               .               .               0       26.000000
   21             .       19.000000               .               .               .               .               .        0.500000       25.000000
ods graphics on;
proc mi data = wide out=wide_imputed nimpute=30 seed=1213445;
  var xt: yt: group pre ;
  mcmc plots=acf(wlf) nbiter=1000;
run;

                                            Variance Information

                                                                      Relative       Fraction
            -----------------Variance-----------------                Increase        Missing       Relative
Variable         Between         Within          Total       DF    in Variance    Information     Efficiency

xt1             1.531802       3.211843       4.794705   33.951       0.492820       0.335106       0.988953
xt2             0.705394       3.594408       4.323315   46.117       0.202789       0.170224       0.994358
xt3             1.180991       2.402971       3.623328   33.482       0.507853       0.341934       0.988731
xt4             1.767436       3.767083       5.593433   34.206       0.484818       0.331415       0.989074
xt5             1.936756       2.553319       4.554633   26.765       0.783809       0.446720       0.985328
xt6             4.982459       5.654930      10.803471    24.56       0.910452       0.484574       0.984104
yt2             0.091587       0.724349       0.818989   50.194       0.130655       0.116370       0.996136
yt3             0.088875       0.600429       0.692266   48.893       0.152953       0.133713       0.995563
yt4             0.080134       0.539325       0.622130    48.86       0.153535       0.134157       0.995548
yt5             0.112376       0.588447       0.704569   46.411       0.197337       0.166373       0.994485
yt6             0.104164       0.499650       0.607286   45.445       0.215424       0.179019       0.994068

                                                             Parameter Estimates

                                                                                                                          t for H0:
Variable            Mean      Std Error    95% Confidence Limits        DF        Minimum        Maximum            Mu0    Mean=Mu0   Pr > |t|

xt1           130.800280       2.189681     126.3501     135.2505   33.951     128.194969     133.084071              0       59.73     <.0001
xt2           128.420909       2.079258     124.2359     132.6060   46.117     126.590679     130.156352              0       61.76     <.0001
xt3           131.033630       1.903504     127.1630     134.9042   33.482     128.495709     133.178206              0       68.84     <.0001
xt4           132.221511       2.365044     127.4162     137.0268   34.206     130.392552     135.398365              0       55.91     <.0001
xt5           127.007770       2.134159     122.6270     131.3885   26.765     124.358792     129.587070              0       59.51     <.0001
xt6           133.330690       3.286863     126.5551     140.1063    24.56     128.635706     138.967370              0       40.56     <.0001
yt2            13.716894       0.904980      11.8994      15.5344   50.194      13.049361      14.259155              0       15.16     <.0001
yt3            11.976376       0.832026      10.3043      13.6485   48.893      11.264413      12.519671              0       14.39     <.0001
yt4            11.000523       0.788752       9.4154      12.5857    48.86      10.358900      11.422313              0       13.95     <.0001
yt5             9.877199       0.839386       8.1880      11.5664   46.411       9.058228      10.414946              0       11.77     <.0001
yt6             9.133555       0.779286       7.5644      10.7027   45.445       8.627612       9.984750              0       11.72     <.0001
data long_imputed;
  set wide_imputed;
  array yt(6) yt:;
  array xt(6) xt:;
  do visit = 1 to 6;
    y = yt(visit);
   x1 = xt(visit);
  output;
  end;
  drop xt: yt:;
run;
At this point, the data have been reshaped back to long. It seems that we are ready for our analysis model. But we are not quite there yet. Since we have turned the data to wide for the imputation procedure, we now have more data points than we started with. Now every subject has 6 rows of data! We need to restrict our analysis to the time points where we have some data. To this end, we can use proc sql to select only the time points where we have some values for each of the subjects.
proc sql;
  create table long_imputed_a as
  select a.*, a.visit-1 as time,
         a.x1-mean(a.x1) as cx1, a.pre-mean(a.pre) as cpre
  from long_imputed as a, dp_miss as b
  where a.subj=b.subj and a.visit = b.visit;
quit;
proc sort data = long_imputed_a;
  by _imputation_;
run;
proc mixed data = long_imputed_a;
  by _imputation_;
  class group subj;
  model y = group pre x1 visit /solution;
  random intercept visit /subject=subj type=un;
  ods output SolutionF=mixparms;
run;
proc mianalyze parms(classvar=full)=mixparms;
  class group;
  modeleffects Intercept group pre x1 visit;
run;
           Model Information

PARMS Data Set            WORK.MIXPARMS
Number of Imputations     30


                              Variance Information

                             -----------------Variance-----------------
Parameter           group         Between         Within          Total       DF

Intercept               .        1.254539      14.171881      15.468238   4128.9
group                   0        0.000401       1.221414       1.221829   2.52E8
group            1.000000               0              .              .        .
pre                     .     0.000015750       0.021683       0.021699   5.16E7
x1                      .     0.000077681       0.000219       0.000300    404.3
visit                   .     0.000083252       0.027082       0.027168   2.89E6

                         Variance Information

                                 Relative       Fraction
                                 Increase        Missing       Relative
Parameter           group     in Variance    Information     Efficiency

Intercept               .        0.091474       0.084251       0.997199
group                   0        0.000339       0.000339       0.999989
group            1.000000               .              .              .
pre                     .        0.000751       0.000750       0.999975
x1                      .        0.365790       0.271418       0.991034
visit                   .        0.003177       0.003167       0.999894

                                  Parameter Estimates

Parameter         group      Estimate      Std Error    95% Confidence Limits        DF

Intercept             .      1.517928       3.932968     -6.19281      9.22866   4128.9
group                 0      4.085099       1.105364      1.91863      6.25157   2.52E8
group          1.000000             0              .       .            .             .

                                       Parameter Estimates

                                                                             t for H0:
Parameter         group       Minimum        Maximum         Theta0   Parameter=Theta0   Pr > |t|

Intercept             .     -0.344025       3.600943              0               0.39     0.6996
group                 0      4.051295       4.119540              0               3.70     0.0002
group          1.000000             0              0              0                .        .

                                  Parameter Estimates

Parameter         group      Estimate      Std Error    95% Confidence Limits        DF

pre                   .      0.465283       0.147307      0.17657      0.75400   5.16E7
x1                    .      0.021101       0.017312     -0.01293      0.05513    404.3
visit                 .     -1.215938       0.164828     -1.53899     -0.89288   2.89E6

                                       Parameter Estimates

                                                                             t for H0:
Parameter         group       Minimum        Maximum         Theta0   Parameter=Theta0   Pr > |t|

pre                   .      0.455333       0.475869              0               3.16     0.0016
x1                    .      0.004767       0.036989              0               1.22     0.2236
visit                 .     -1.239579      -1.201451              0              -7.38     <.0006

Now let's put the results of the three analyses together in one single table.

--------------------------------------------------------------------
|Parameter         |     Full      |   Listwise    | MI data (30)  |
|------------------+---------------+---------------+---------------|
|INTERCEPT         |          1.965|          0.841|          1.518|
|                  |        (3.824)|        (3.955)|        (3.933)|
|GROUP             |          4.059|          4.365|          4.085|
|                  |        (1.113)|        (1.099)|        (1.105)|
|PRE               |          0.470|          0.402|          0.465|
|                  |        (0.148)|        (0.149)|        (0.147)|
|X1                |          0.017|          0.037|          0.021|
|                  |        (0.015)|        (0.017)|        (0.017)|
|VISIT             |         -1.209|         -1.326|         -1.216|
|                  |        (0.166)|        (0.151)|        (0.165)|
--------------------------------------------------------------------

Here is the sas code for creating the table above.

options formchar = '|----|+|---+=|-/<>*';
proc mixed data = ats.dp;
  class group subj;
  model y = group pre x1 visit /solution;
  random intercept visit /subject=subj type=un;
  ods output solutionF = full;
run;
proc mixed data = ats.dp_miss;
  class group subj;
  model y = group pre x1 visit /solution;
  random intercept visit /subject=subj type=un;
  ods output solutionF = listwise;
run;
ods output parameterestimates=mi;
proc mianalyze parms(classvar=full)=mixparms;
  class group;
  modeleffects Intercept group pre x1 visit;
run;
data all;
  length d $16;
  label d = "data type";
  set full (in=f1) listwise (in=f2) 
  mi(in=f3 rename=(parm=effect) keep=parm group estimate stderr df tvalue probt);
  if f1 = 1 then d = "Full";
  if f2 = 1 then d = "Listwise";
  if f3 = 1 then d = "MI data (30)";
  effect = upcase(effect);
run;
proc format;           
	picture stderrf (round)       
            low-high=' 9.999)' (prefix='(')                                
            .=' ';                                                  
run;
proc tabulate data=all order=data noseps;      
  class d effect;                      
  var estimate stderr;     
  table effect=''*(estimate =' '*sum=' '*F=15.3                          
                   stderr=' '*sum=' '*F=stderrf.),                
        d=' '                                                  
   / box=[label="Parameter"] rts=20 row=float misstext=' '; 
run;
Notice that in the process of running proc sql, we have also created centered variables and recoded visit variable into a new variable called time, to have initial value starting at 0 instead of at 1 so the intercept in the model corresponds to the initial status holding the covariates x1 and pre at their grand means. We are going to run a new model using these centered covariates together with an interaction of group with time. Using this example, we are going to show how to combine the estimation results from an estimate statement. A lot of times, we need to figure some details such as how to write the estimate statement and how to output the results from the estimate statement. To this end, we should just run our model on a sample data set, say, the first imputed data set and turn the ods trace on. Once we figure out all the details we can move on to the entire imputed data set.
ods trace on /listing;
proc mixed data = long_imputed_a covtest;
  where  _imputation_ = 1;
  class  group subj;
  model y = group time group*time cpre cx1 /solution;
  random intercept time /subject=subj type=un;
  estimate "time effect for group = 0" time 1 group*time 1; 
run;
ods trace off;

(output omitted...)

Output Added:
-------------
Name:       CovParms
Label:      Covariance Parameter Estimates
Template:   Stat.Mixed.CovParms
Path:       Mixed.CovParms
-------------

                  Covariance Parameter Estimates

                                    Standard         Z
Cov Parm     Subject    Estimate       Error     Value        Pr Z

UN(1,1)      SUBJ        18.8465      4.4290      4.26      <.0001
UN(2,1)      SUBJ        -1.7606      0.9171     -1.92      0.0549
UN(2,2)      SUBJ         0.8440      0.2934      2.88      0.0020
Residual                  8.4525      0.8855      9.55      <.0001


Output Added:
-------------
Name:       FitStatistics
Label:      Fit Statistics
Template:   Stat.Mixed.FitStatistics
Path:       Mixed.FitStatistics
-------------

           Fit Statistics

-2 Res Log Likelihood          1648.6
AIC (smaller is better)        1656.6
AICC (smaller is better)       1656.7
BIC (smaller is better)        1665.0

Output Added:
-------------
Name:       LRT
Label:      Null Model Likelihood Ratio Test
Template:   Stat.Mixed.LRT
Path:       Mixed.LRT
-------------

  Null Model Likelihood Ratio Test

    DF    Chi-Square      Pr > ChiSq

     3        140.88          <.0001

Output Added:
-------------
Name:       SolutionF
Label:      Solution for Fixed Effects
Template:   Stat.Mixed.SolutionF
Path:       Mixed.SolutionF
-------------

                        Solution for Fixed Effects

                                   Standard
Effect        GROUP    Estimate       Error      DF    t Value    Pr > |t|

Intercept               12.9740      0.8350      58      15.54      <.0001
GROUP         0          3.7834      1.2629     180       3.00      0.0031
GROUP         1               0           .       .        .         .
time                    -1.2847      0.2127      51      -6.04      <.0001
time*GROUP    0          0.1596      0.3400     180       0.47      0.6392
time*GROUP    1               0           .       .        .         .
cpre                     0.4643      0.1472     180       3.15      0.0019
cx1                     0.02163     0.01493     180       1.45      0.1492

Output Added:
-------------
Name:       Tests3
Label:      Type 3 Tests of Fixed Effects
Template:   Stat.Mixed.Tests3
Path:       Mixed.Tests3
-------------

         Type 3 Tests of Fixed Effects

               Num     Den
Effect          DF      DF    F Value    Pr > F

GROUP            1     180       8.97    0.0031
time             1      51      50.20    <.0001
time*GROUP       1     180       0.22    0.6392
cpre             1     180       9.95    0.0019
cx1              1     180       2.10    0.1492

Output Added:
-------------
Name:       Estimates
Label:      Estimates
Template:   Stat.Mixed.Estimates
Path:       Mixed.Estimates
-------------
                                   Estimates

                                         Standard
Label                        Estimate       Error      DF    t Value    Pr > |t|

time effect for group = 0     -1.1251      0.2653     180      -4.24      <.0001

From the output, we can verify that our estimate statement does what it is supposed to and the names of the data sets that we care to create. In this model, there are three sets of estimates that we are interested: the parameter estimates of the fixed effect, the results from the estimate statement and maybe the variance-covariance estimates. The point of "maybe" here is that we can mechanically combine the estimation results, but we should not use the combined results blindly. For instance, how  reliable are the estimates of the variance-covariance parameters? If the number of between subjects is not large enough, these parameter estimates are not as reliable. Now we are rerun the code above adding the ods output statement to check that the three estimation data sets will be created.

proc mixed data = long_imputed_a covtest;
  where  _imputation_ = 1;
  class  group subj;
  model y = group time group*time cpre cx1 /solution;
  random intercept time /subject=subj type=un;
  estimate "time effect for group = 0" time 1 group*time 1; 
  ods output CovParms = cov solutionf= sol estimates=est;
run;
NOTE: Convergence criteria met.
NOTE: The data set WORK.EST has 1 observations and 6 variables.
NOTE: The data set WORK.SOL has 8 observations and 7 variables.
NOTE: The data set WORK.COV has 4 observations and 6 variables.
NOTE: PROCEDURE MIXED used (Total process time):
      real time           0.09 seconds
      cpu time            0.06 seconds

Now we are ready to run the model on the entire imputed data set. We add a second estimate statement, a trivial one, for the purpose of illustrating how to combine the results from the estimate statement using proc mianalyze with different syntax.

proc mixed data = long_imputed_a covtest;
  by  _imputation_ ;
  class  group subj;
  model y = group time group*time cpre cx1 /solution;
  random intercept time /subject=subj type=un;
  estimate "time effect for group = 0" time 1 group*time 1; 
  estimate "time effect for group = 1" time 1;
  ods output CovParms = cov solutionf= sol estimates=est;
run;
proc mianalyze parms(classvar=full)=sol;
  class group;
  modeleffects Intercept group time group*time cpre cx1;
run;
                   Parameter Estimates

Parameter            group         Minimum        Maximum

Intercept                .       12.924877      13.006671
group                    0        3.742516       3.861900
group             1.000000               0              0
time                     .       -1.307816      -1.259890
time*group               0        0.143170       0.194270
time*group        1.000000               0              0
cpre                     .        0.456482       0.476776
cx1                      .        0.004594       0.037081

                          Parameter Estimates

                                                    t for H0:
Parameter            group          Theta0   Parameter=Theta0   Pr > |t|

Intercept                .               0              15.57     <.0001
group                    0               0               3.02     0.0025
group             1.000000               0                .        .
time                     .               0              -6.00     <.0001
time*group               0               0               0.47     0.6366
time*group        1.000000               0                .        .
cpre                     .               0               3.17     0.0015
cx1                      .               0               1.22     0.2233
proc print data = est (obs=10) noobs ;
run;
_Imputation_              Label              Estimate      StdErr      DF     tValue     Probt

      1         time effect for group = 0     -1.1251      0.2653     180      -4.24    <.0001
      2         time effect for group = 0     -1.1148      0.2696     180      -4.14    <.0001
      3         time effect for group = 0     -1.1331      0.2586     180      -4.38    <.0001
      4         time effect for group = 0     -1.1182      0.2699     180      -4.14    <.0001
      5         time effect for group = 0     -1.1176      0.2663     180      -4.20    <.0001
      6         time effect for group = 0     -1.1261      0.2692     180      -4.18    <.0001
      7         time effect for group = 0     -1.1105      0.2680     180      -4.14    <.0001
      8         time effect for group = 0     -1.1104      0.2636     180      -4.21    <.0001
      9         time effect for group = 0     -1.1068      0.2654     180      -4.17    <.0001
     10         time effect for group = 0     -1.1142      0.2667     180      -4.18    <.0001

proc sort data = est;
 by label _imputation_;
run;
ods select parameterestimates;
proc mianalyze data = est ;
  by label;
  modeleffects estimate;
  stderr stderr;
run;
Label=time effect for group = 0

                            Parameter Estimates

Parameter        Estimate      Std Error    95% Confidence Limits        DF

estimate        -1.117590       0.266011     -1.63896     -0.59622   2.68E7

                                 Parameter Estimates

                                                                 t for H0:
Parameter         Minimum        Maximum         Theta0   Parameter=Theta0   Pr > |t|

estimate        -1.135251      -1.101759              0              -4.20     <.0001

Label=time effect for group = 1


                            Parameter Estimates

Parameter        Estimate      Std Error    95% Confidence Limits        DF

estimate        -1.198107       0.170490     -1.53226     -0.86395   4.21E6

                                 Parameter Estimates

                                                                 t for H0:
Parameter         Minimum        Maximum         Theta0   Parameter=Theta0   Pr > |t|

estimate        -1.220460      -1.184616              0              -7.03     <.0001

proc sort data = cov;
 by covparm _imputation_;
run;
ods select parameterestimates;
proc mianalyze data = cov ;
  by covparm;
  modeleffects estimate;
  stderr stderr;
run;
Cov Parm=Residual

                            Parameter Estimates

Parameter        Estimate      Std Error    95% Confidence Limits        DF

estimate         8.437262       0.884596     6.703486     10.17104   9.61E6

                                 Parameter Estimates

                                                                 t for H0:
Parameter         Minimum        Maximum         Theta0   Parameter=Theta0   Pr > |t|

estimate         8.330932       8.489894              0               9.54     <.0001

Cov Parm=UN(1,1)


                            Parameter Estimates

Parameter        Estimate      Std Error    95% Confidence Limits        DF

estimate        18.690516       4.413127     10.04094     27.34009   5.05E6

                                 Parameter Estimates

                                                                 t for H0:
Parameter         Minimum        Maximum         Theta0   Parameter=Theta0   Pr > |t|

estimate        18.238351      19.075624              0               4.24     <.0001

Cov Parm=UN(2,1)


                            Parameter Estimates

Parameter        Estimate      Std Error    95% Confidence Limits        DF

estimate        -1.719833       0.913920     -3.51108     0.071418   7.48E6

                                 Parameter Estimates

                                                                 t for H0:
Parameter         Minimum        Maximum         Theta0   Parameter=Theta0   Pr > |t|

estimate        -1.835205      -1.617613              0              -1.88     0.0599

Cov Parm=UN(2,2)


                            Parameter Estimates

Parameter        Estimate      Std Error    95% Confidence Limits        DF

estimate         0.847049       0.295519     0.267839     1.426260   215486

                                 Parameter Estimates

                                                                 t for H0:
Parameter         Minimum        Maximum         Theta0   Parameter=Theta0   Pr > |t|

estimate         0.762822       0.890457              0               2.87     0.0042

References

T. E. Raghunathan, J. M. Lepkowski, J. Van Hoewyk and P. Solenberger, A Multivariate Technique for Multiply Imputing Missing Values Using a Sequence of Regression Models, Survey Methodology, June 2001, Vol. 27, No. 1, pp. 8595.

How to cite this page

Report an error on this page or leave a comment

The content of this web site should not be construed as an endorsement of any particular web site, book, or software product by the University of California.