Outline of this seminar:
Part 1:
Part 2:
We ended our previous seminar with an example on using proc mi. In this seminar, we will start with an example of IVEware and then move on to performing analysis after multiple imputation. For the purpose of illustration and convenience, we assume that multiple imputed data have been provided so at the end, we will discuss some data management issues related to proc mi and proc mianalyze.
The data set hsb_mar.sas7bdat which is based on hsb2.sas7bdat is used for this seminar and can be downloaded following the link. The SAS code for this seminar was developed using SAS 9.2. The format used by the data set can be created as follows.
proc format;
value female 0 = "male"
1 = "female";
value race 1 ="hispanic"
2 = "asian"
3 = "african-amer"
4 = "white";
value ses 1 = "low"
2 = "milddle"
3 = "high";
value schtyp 1 ="public"
2 ="private";
value prog 1 = "general"
2 = "academic"
3 = "vocation";
run;
options fmtsearch=(work);
IVEware is imputation and variance estimation software developed by Raghunathan et al (1997), and it can be called from SAS or run independently. It uses a multivariate sequential regression approach for obtaining the imputed values. The regression models can be linear regression, poisson regression, logistic regression and more. IVEware also incorporates the survey design, such as stratification, cluster and sampling weights. The description of IVEware below is extracted from the User Guide of IVEware.
includes four modules: IMPUTE, DESCRIBE, REGRESS and SASMOD."IVEware
·
IMPUTE uses a multivariate sequential regression approach to imputing item missing values. IMPUTE can create multiply imputed data sets.·
DESCRIBE estimates the population means, proportions, subgroup differences, contrasts and linear combinations of means and proportions. A Taylor Series approach is used to obtain variance estimates appropriate for a user specified complex sample design. A multiple imputation analysis can be performed when there are missing values.·
REGRESS fits linear, logistic, polytomous, Poisson, Tobit and proportional hazard regression models for data resulting from a complex sample design. The repeated replication approach is used to estimate the sampling variances. A multiple imputation analysis can be performed when there are missing values.·
SASMOD allows users to take into account complex sample design features when analyzing data with several SAS procedures. Currently the following SAS PROCS can be called: CALIS, CATMOD, GENMOD, LIFEREG, MIXED, NLIN, PHREG, and PROBIT. A multiple imputation analysis can be performed when there are missing values."Now let's see a simple example using IVEware. The data set used here is the same as before.
options set = SRCLIB "D:\work\ive_sas_windows"
sasautos = ('!SRCLIB' sasautos) mautosource ;
options nofmterr;
data hsb_mar; set ats.hsb_mar; run; data _null_; infile datalines; filename setup "test1.set"; file setup; input; put _infile_; datalines4; title Descriptives; datain hsb_mar; estout hsbdes; mean read math write female; run; ;;;; run; %describe(name=test1, dir=.);
Descriptives
Problem 1
Degrees of freedom
190
Factor Covariance of denominator
None 0.00000
Mean Number of Sum of Weighted Standard
READ Cases Weights Mean Error
191 191 52.28796 0.7388216
Lower Upper T Test Prob > |T|
Bound Bound
50.83062 53.7453 70.77210 0.00000
Unweighted Bias Design
Mean Effect
52.28796 0.00000 1.00000
Problem 2
Degrees of freedom
184
Factor Covariance of denominator
None 0.00000
Mean Number of Sum of Weighted Standard
MATH Cases Weights Mean Error
185 185 52.8973 0.6882224
Lower Upper T Test Prob > |T|
Bound Bound
51.53947 54.25512 76.86076 0.00000
Unweighted Bias Design
Mean Effect
52.8973 0.00000 1.00000
Problem 3
Degrees of freedom
182
Factor Covariance of denominator
None 0.00000
Mean Number of Sum of Weighted Standard
WRITE Cases Weights Mean Error
Lower Upper T Test Prob > |T|
Bound Bound
51.60053 54.30111 77.37340 0.00000
Unweighted Bias Design
Mean Effect
52.95082 0.00000 1.00000
Problem 4
Degrees of freedom
181
Factor Covariance of denominator
None 0.00000
Mean Number of Sum of Weighted Standard
FEMALE Cases Weights Mean Error
182 182 0.5549451 0.03693963
Lower Upper T Test Prob > |T|
Bound Bound
0.4820576 0.6278325 15.02303 0.00000
Unweighted Bias Design
Mean Effect
0.5549451 0.00000 1.00000
data _null_;
infile datalines;
filename setup "impute.set";
file setup;
input;
put _infile_;
datalines4;
title Multiple imputation;
datain hsb_mar;
dataout hsb_mar_1;
default continuous;
categorical female prog race ses;
iterations 5;
multiples 5;
seed 2001;
run;
;;;;
%impute(name=impute, dir=.);
%PUTDATA(NAME=impute,DIR=. ,MULT=2,DATAOUT=hsb_mar_2);
%PUTDATA(NAME=impute,DIR=. ,MULT=3,DATAOUT=hsb_mar_3);
%PUTDATA(NAME=impute,DIR=. ,MULT=4,DATAOUT=hsb_mar_4);
%PUTDATA(NAME=impute,DIR=. ,MULT=5,DATAOUT=hsb_mar_5);
RUN;
data _null_;
infile datalines;
filename setup "impute_regress.set";
file setup;
input;
put _infile_;
datalines4;
title Multiply imputed jackknife regression;
datain
hsb_mar_1
hsb_mar_2
hsb_mar_3
hsb_mar_4
hsb_mar_5;
estout regexam2;
dependent socst;
predictor write read female math;
run;
;;;;
%regress(name=impute_regress, dir=.);
RUN;
Multiply imputed jackknife regression
Regression type: Linear
Dependent variable: SOCST social studies score
Predictors: WRITE writing score
READ reading score
FEMALE female
MATH math score
All imputations
Valid cases 200
Degr freedom 16.80949217
Sum of squares:
Model 11039.10185
Error 11897.09315
Total 22936.195
R-square 0.48130
F-value 3.89931
P-value 0.02027
Variable Estimate Std Error T Test Prob > |T|
Intercept 6.5596849 3.6069808 1.81861 0.08683
WRITE 0.3412771 0.1050343 3.24920 0.00478
READ 0.3911219 0.0790752 4.94620 0.00013
FEMALE 0.7965601 1.5908236 0.50072 0.62306
MATH 0.1292599 0.0922853 1.40065 0.17951
Variable Estimate 95% Confidence Interval
Lower Upper
Intercept 6.5596849 -1.0569193 14.1762891
WRITE 0.3412771 0.1194836 0.5630706
READ 0.3911219 0.2241444 0.5580993
FEMALE 0.7965601 -2.5626686 4.1557888
MATH 0.1292599 -0.0656125 0.3241323
Once we have obtained the multiply imputed data, we will move on to the second and third step of our analysis described in previous seminar, that is to perform the analysis by imputation and to combine the analyses at the end. The last step of combining the analyses should be done using the SAS procedure proc mianalyze.
Let's say that we are interested in modeling the score of a social sciences test, socst on read, math, write, female as before. Since we want to treat female as a categorical variable in the analysis model, we are going to use the imputed data set created at the end of last seminar as shown below. Because in this setup we have forced the imputed values for female to be between 0 and 1. In an extra data step we are going to "recreate" the values for female using binomial distribution based on the imputed values.
proc mi data = hsb_mar seed=432156 nimpute = 20 out=t2
minimum = . . . . 0 0 0 0 0 0 0 0 0
maximum = . . . . 1 1 1 1 1 1 1 1 1
MINMAXITER = 400;
mcmc plots=trace plots=acf;
var socst write read math s1 s2 p1 p2 r1-r3 schtyp female;
run;
data t2; set t2; if 0 < female < 1 then female = ranbin(1245, 1, female); run;
Now we are ready to perform our regression analysis. One of the hypothesis that we are interested in testing is that the effect of write, math and read are equal to each other. The hypothesis test can be carried out using the test statement in proc mianalyze as shown below.
proc reg data = t2 outest=regout covout; by _imputation_; model socst = write read math female; run; quit; proc mianalyze data=regout edf=195; modeleffects Intercept write read math female; mytest: test write=read, write=math / mult; run;
Model Information
Data Set WORK.REGOUT
Number of Imputations 20
Variance Information
Relative Fraction
-----------------Variance----------------- Increase Missing Relative
Parameter Between Within Total DF in Variance Information Efficiency
Intercept 0.356772 12.835697 13.210307 186.08 0.029185 0.028440 0.998580
write 0.001502 0.007645 0.009222 128.41 0.206224 0.173506 0.991399
read 0.000386 0.005915 0.006320 173.88 0.068458 0.064476 0.996787
math 0.000973 0.007845 0.008866 152.6 0.130165 0.116407 0.994213
female 0.454866 1.482958 1.960567 100.28 0.322065 0.248289 0.987738
Parameter Estimates
t for H0:
Parameter Estimate Std Error 95% Confidence Limits DF Minimum Maximum Theta0 Parameter=Theta0 Pr > |t|
Intercept 7.071084 3.634599 -0.09923 14.24140 186.08 5.924567 8.111225 0 1.95 0.0532
write 0.339907 0.096032 0.14990 0.52992 128.41 0.269254 0.421620 0 3.54 0.0006
read 0.377972 0.079500 0.22106 0.53488 173.88 0.347713 0.408543 0 4.75 <.0001
math 0.138487 0.094162 -0.04754 0.32452 152.6 0.077483 0.202446 0 1.47 0.1434
female 0.378430 1.400203 -2.39944 3.15630 100.28 -1.014299 1.180621 0 0.27 0.7875
Test: mytest
Test Specification
----------------------------------L Matrix----------------------------------
Parameter Intercept write read math female C
TestPrm1 0 1.000000 -1.000000 0 0 0
TestPrm2 0 1.000000 0 -1.000000 0 0
Variance Information
Relative Fraction
-----------------Variance----------------- Increase Missing Relative
Parameter Between Within Total DF in Variance Information Efficiency
TestPrm1 0.002665 0.017921 0.020719 143.89 0.156161 0.136724 0.993210
TestPrm2 0.004402 0.022265 0.026888 128.01 0.207616 0.174487 0.991351
Parameter Estimates
t for H0:
Parameter Estimate Std Error 95% Confidence Limits DF Minimum Maximum C Parameter=C Pr > |t|
TestPrm1 -0.038066 0.143943 -0.32258 0.246450 143.89 -0.110137 0.067655 0 -0.26 0.7918
TestPrm2 0.201420 0.163975 -0.12303 0.525872 128.01 0.078520 0.344137 0 1.23 0.2216
Multivariate Inference
Assuming Proportionality of Between/Within Covariance Matrices
Avg Relative
Increase F for H0:
in Variance Num DF Den DF Parameter=Theta0 Pr > F
0.143052 2 1979.5 1.37 0.2531
A variation of the regression example
If we are only interested in getting the parameter estimates of the regression model, we don't really need variance and covariance matrix, so the code below works.
ods output parameterestimates=regout1; proc reg data = t2; by _imputation_; model socst = write read math female; run; quit; proc mianalyze parms=regout1 edf=195; modeleffects Intercept write read math female; run;
Model Information
PARMS Data Set WORK.REGOUT1
Number of Imputations 20
Variance Information
Relative Fraction
-----------------Variance----------------- Increase Missing Relative
Parameter Between Within Total DF in Variance Information Efficiency
Intercept 0.356772 12.835697 13.210307 186.08 0.029185 0.028440 0.998580
write 0.001502 0.007645 0.009222 128.41 0.206224 0.173506 0.991399
read 0.000386 0.005915 0.006320 173.88 0.068458 0.064476 0.996787
math 0.000973 0.007845 0.008866 152.6 0.130165 0.116407 0.994213
female 0.454866 1.482958 1.960567 100.28 0.322065 0.248289 0.987738
Parameter Estimates
t for H0:
Parameter Estimate Std Error 95% Confidence Limits DF Minimum Maximum Theta0 Parameter=Theta0 Pr > |t|
Intercept 7.071084 3.634599 -0.09923 14.24140 186.08 5.924567 8.111225 0 1.95 0.0532
write 0.339907 0.096032 0.14990 0.52992 128.41 0.269254 0.421620 0 3.54 0.0006
read 0.377972 0.079500 0.22106 0.53488 173.88 0.347713 0.408543 0 4.75 <.0001
math 0.138487 0.094162 -0.04754 0.32452 152.6 0.077483 0.202446 0 1.47 0.1434
female 0.378430 1.400203 -2.39944 3.15630 100.28 -1.014299 1.180621 0 0.27 0.7875
Well, as expected, the two methods produce identical output. In each method, the first step is to run the regression model and the second step is to run proc mianalyze to combine the regression model results. The difference between the two methods is the difference in data set created in step 1 for combining the results in step 2. Let's take a look at these data sets.
*method 1; proc print data = regout (obs=24); run; Obs _Imputation_ _MODEL_ _TYPE_ _NAME_ _DEPVAR_ _RMSE_ Intercept WRITE READ MATH FEMALE SOCST 1 1 MODEL1 PARMS SOCST 7.86928 6.7148 0.30645 0.34784 0.20245 1.03407 -1 2 1 MODEL1 COV Intercept SOCST 7.86928 12.6760 -0.10452 -0.04320 -0.08309 -0.35284 . 3 1 MODEL1 COV WRITE SOCST 7.86928 -0.1045 0.00763 -0.00226 -0.00314 -0.02925 . 4 1 MODEL1 COV READ SOCST 7.86928 -0.0432 -0.00226 0.00569 -0.00265 0.00749 . 5 1 MODEL1 COV MATH SOCST 7.86928 -0.0831 -0.00314 -0.00265 0.00725 0.01402 . 6 1 MODEL1 COV FEMALE SOCST 7.86928 -0.3528 -0.02925 0.00749 0.01402 1.37260 . 7 2 MODEL1 PARMS SOCST 7.93860 7.3575 0.36895 0.38182 0.10668 -0.32243 -1 8 2 MODEL1 COV Intercept SOCST 7.93860 13.1665 -0.07921 -0.05651 -0.10380 -0.38420 . 9 2 MODEL1 COV WRITE SOCST 7.93860 -0.0792 0.00823 -0.00234 -0.00399 -0.04636 . 10 2 MODEL1 COV READ SOCST 7.93860 -0.0565 -0.00234 0.00622 -0.00290 0.01528 . 11 2 MODEL1 COV MATH SOCST 7.93860 -0.1038 -0.00399 -0.00290 0.00863 0.02302 . 12 2 MODEL1 COV FEMALE SOCST 7.93860 -0.3842 -0.04636 0.01528 0.02302 1.53388 . 13 3 MODEL1 PARMS SOCST 7.87091 6.9166 0.29919 0.40275 0.15272 0.79477 -1 14 3 MODEL1 COV Intercept SOCST 7.87091 12.7756 -0.09557 -0.04862 -0.09077 -0.10551 . 15 3 MODEL1 COV WRITE SOCST 7.87091 -0.0956 0.00807 -0.00225 -0.00361 -0.04640 . 16 3 MODEL1 COV READ SOCST 7.87091 -0.0486 -0.00225 0.00599 -0.00291 0.01422 . 17 3 MODEL1 COV MATH SOCST 7.87091 -0.0908 -0.00361 -0.00291 0.00807 0.01886 . 18 3 MODEL1 COV FEMALE SOCST 7.87091 -0.1055 -0.04640 0.01422 0.01886 1.51881 . 19 4 MODEL1 PARMS SOCST 7.84248 7.3165 0.38010 0.37954 0.09995 -0.46673 -1 20 4 MODEL1 COV Intercept SOCST 7.84248 12.5256 -0.07946 -0.04984 -0.09936 -0.26761 . 21 4 MODEL1 COV WRITE SOCST 7.84248 -0.0795 0.00705 -0.00198 -0.00318 -0.04211 . 22 4 MODEL1 COV READ SOCST 7.84248 -0.0498 -0.00198 0.00613 -0.00327 0.01163 . 23 4 MODEL1 COV MATH SOCST 7.84248 -0.0994 -0.00318 -0.00327 0.00813 0.02048 . 24 4 MODEL1 COV FEMALE SOCST 7.84248 -0.2676 -0.04211 0.01163 0.02048 1.49208 . *method 2; options nolabel; proc print data = regout1 (obs=20); run;
Obs _Imputation_ Model Dependent Variable DF Estimate StdErr tValue Probt 1 1 MODEL1 SOCST Intercept 1 6.71475 3.56034 1.89 0.0608 2 1 MODEL1 SOCST WRITE 1 0.30645 0.08735 3.51 0.0006 3 1 MODEL1 SOCST READ 1 0.34784 0.07541 4.61 <.0001 4 1 MODEL1 SOCST MATH 1 0.20245 0.08517 2.38 0.0184 5 1 MODEL1 SOCST FEMALE 1 1.03407 1.17158 0.88 0.3785 6 2 MODEL1 SOCST Intercept 1 7.35753 3.62857 2.03 0.0440 7 2 MODEL1 SOCST WRITE 1 0.36895 0.09070 4.07 <.0001 8 2 MODEL1 SOCST READ 1 0.38182 0.07887 4.84 <.0001 9 2 MODEL1 SOCST MATH 1 0.10668 0.09290 1.15 0.2522 10 2 MODEL1 SOCST FEMALE 1 -0.32243 1.23850 -0.26 0.7949 11 3 MODEL1 SOCST Intercept 1 6.91660 3.57430 1.94 0.0544 12 3 MODEL1 SOCST WRITE 1 0.29919 0.08985 3.33 0.0010 13 3 MODEL1 SOCST READ 1 0.40275 0.07741 5.20 <.0001 14 3 MODEL1 SOCST MATH 1 0.15272 0.08983 1.70 0.0907 15 3 MODEL1 SOCST FEMALE 1 0.79477 1.23240 0.64 0.5198 16 4 MODEL1 SOCST Intercept 1 7.31651 3.53915 2.07 0.0400 17 4 MODEL1 SOCST WRITE 1 0.38010 0.08398 4.53 <.0001 18 4 MODEL1 SOCST READ 1 0.37954 0.07831 4.85 <.0001 19 4 MODEL1 SOCST MATH 1 0.09995 0.09014 1.11 0.2689 20 4 MODEL1 SOCST FEMALE 1 -0.46673 1.22151 -0.38 0.7028
In method 1, the data set regout, created by proc reg using the outest = and covout options, contains not only the parameter estimates but also the variance covariance matrix of the parameter estimates. In method 2, the data set regout2, created using ods output only, contains parameter estimates and their standard errors. If we want to run any hypothesis test that involves more than just one single predictor variable in the model, we will have to use method 1 since the information on variance and covariance is needed.
Getting the averaged R-square
Oftentimes we want to report R-square for a model. Now with multiply imputed data set, we end up with multiple R-squares. You can report the averaged R-square together with the minimum and maximum. It is fairly straightforward to get them, and here is one way to do so.
ods output FitStatistics = fitstat; proc reg data = t2; by _imputation_; model socst = write read math female; run; quit; proc means data = fitstat min max mean; where Label2="R-Square"; title "Averaged R-square"; var nvalue2; run;
Averaged R-square
The MEANS Procedure
Analysis Variable : nValue2
Minimum Maximum Mean
--------------------------------------------
0.4579211 0.4856979 0.4714016
--------------------------------------------
Let's say that our analysis model is variable write on ses, female, read and the interaction of ses with female, treating both ses and female as categorical variables. First of all, proc reg does not have a class statement. Secondly, the test statement in proc mianalyze does not work with class variables. This means that we need to create dummy variables before running proc mianalyze. In SAS 9 and above, this becomes a lot easier with proc glmmod. Proc glmmod, like some of the SAS procedures, does not perform analysis, per se. Instead, it produces a design matrix (dummy variables) for a general linear model. It can be used with regression procedures. Here is the code for our model.
ods output designpoints=t3; ods listing close; proc glmmod data=t2; by _imputation_; class ses female; model write = ses female ses*female read; run; ods listing; proc reg data = t3 outest=regout3 covout; by _imputation_; model write = ses_1 ses_2 female_1 ses_female_1_1 ses_female_2_1 read; run; proc mianalyze data=regout3 edf=193; modeleffects intercept ses_1 ses_2 female_1 ses_female_1_1 ses_female_2_1 read; interaction: test ses_female_1_1, ses_female_2_1 / mult; run;
Model Information
Data Set WORK.REGOUT3
Number of Imputations 20
Variance Information
Relative Fraction
-----------------Variance----------------- Increase Missing Relative
Parameter Between Within Total DF in Variance Information Efficiency
intercept 0.743091 10.249762 11.030007 169.59 0.076123 0.071227 0.996451
ses_1 1.110474 5.234794 6.400792 122.74 0.222740 0.185006 0.990834
ses_2 0.494814 2.980575 3.500130 136.86 0.174314 0.150407 0.992536
female_1 1.263178 3.577076 4.903412 90.69 0.370788 0.276047 0.986386
ses_female_1_1 2.701027 8.498906 11.334985 97.309 0.333699 0.255098 0.987406
ses_female_2_1 1.526107 5.777745 7.380157 109.08 0.277342 0.220981 0.989072
read 0.000158 0.002728 0.002894 174.65 0.060796 0.057638 0.997126
Parameter Estimates
t for H0:
Parameter Estimate Std Error 95% Confidence Limits DF Minimum Maximum Theta0 Parameter=Theta0 Pr > |t|
intercept 23.364482 3.321145 16.80837 29.92059 169.59 21.945295 24.840872 0 7.04 <.0001
ses_1 -2.620320 2.529979 -7.62836 2.38772 122.74 -5.262403 -0.600005 0 -1.04 0.3024
ses_2 -2.022661 1.870863 -5.72220 1.67688 136.86 -3.360548 -0.761200 0 -1.08 0.2815
female_1 4.091534 2.214365 -0.30723 8.49030 90.69 2.195456 6.036303 0 1.85 0.0679
ses_female_1_1 1.767133 3.366747 -4.91466 8.44893 97.309 -1.381168 4.865506 0 0.52 0.6009
ses_female_2_1 1.946620 2.716644 -3.43764 7.33088 109.08 -0.877186 3.763880 0 0.72 0.4752
read 0.538969 0.053792 0.43280 0.64514 174.65 0.519234 0.567900 0 10.02 <.0001
Test: interaction
Test Specification
--------------------------------------------------L Matrix--------------------------------------------------
ses_female_ ses_female_
Parameter intercept ses_1 ses_2 female_1 1_1 2_1 read C
TestPrm1 0 0 0 0 1.000000 0 0 0
TestPrm2 0 0 0 0 0 1.000000 0 0
Variance Information
Relative Fraction
-----------------Variance----------------- Increase Missing Relative
Parameter Between Within Total DF in Variance Information Efficiency
TestPrm1 2.701027 8.498906 11.334985 97.309 0.333699 0.255098 0.987406
TestPrm2 1.526107 5.777745 7.380157 109.08 0.277342 0.220981 0.989072
Parameter Estimates
t for H0:
Parameter Estimate Std Error 95% Confidence Limits DF Minimum Maximum C Parameter=C Pr > |t|
TestPrm1 1.767133 3.366747 -4.91466 8.448925 97.309 -1.381168 4.865506 0 0.52 0.6009
TestPrm2 1.946620 2.716644 -3.43764 7.330878 109.08 -0.877186 3.763880 0 0.72 0.4752
Multivariate Inference
Assuming Proportionality of Between/Within Covariance Matrices
Avg Relative
Increase F for H0:
in Variance Num DF Den DF Parameter=Theta0 Pr > F
0.271089 2 690.87 0.28 0.7576
Let's say that students are put in an honors class based on their writing score cutoff at 60, and we want to predict the probability of a student being in the honors class based on the reading score and gender. We also want to produce odds ratio for the two predictor variables. To this end, we need to calculate the odds ratio based on the parameter estimates given by proc mianalyze. This is done by first outputting the parameter estimates to a data set and then create odds ratio variables in a data step. Also notice that we didn't ask for variance and covariance in this example, since we are not interested in performing a hypothesis test.
data t4; set t2; y = (write>=60); run; proc logistic data = t4 descending; by _imputation_; model y = female read ses; ods output parameterestimates=logparms; run; ods output parameterestimates = logout; proc mianalyze parms=logparms; modeleffects intercept female read ses; run; data logout_a; set logout; if parm ^="intercept"; odds_ratio = exp(estimate); odds_LCLMean = exp(lclmean); odds_UCLMean = exp(uclmean); run; proc print data = logout_a noobs; var parm odds:; run;
odds_ odds_ odds_ Obs Parm ratio LCLMean UCLMean 1 female 2.69291 1.11796 6.48660 2 read 1.14077 1.08855 1.19550 3 ses 1.24224 0.71140 2.16919
What if we want to present the model graphically? Here is an example. Let's say that we want to plot the predicted probabilities for read scores ranging from 30 to 70 by male and female, fixing the ses value at 2. The basic steps in this example are the following.
proc transpose data = logout out=score; id parm; var estimate; run; data score; set score; _type_="PARMS"; drop _name_; run; proc print data = score noobs; run;
intercept female read ses _type_
-9.314543 0.990624 0.131705 0.216918 PARMS
data toscore;
do female = 0 to 1;
do read = 30 to 70 by 5;
intercept = 1;
ses = 2;
output;
end;
end;
run;
proc score data = toscore score=score out=scored type=parms;
run;
data scored_p;
set scored;
p = exp(estimate)/(1+exp(estimate));
run;
ods graphics on;
proc sgplot data = scored_p;
series x = read y = p /group=female;
run;
ods graphics off;
This example involves imputation of a longitudinal data set and running a mixed model after the multiple imputation. The full data set, dp.sas7bdat, and data set with missing values, dp_miss.sas7bdat, created based on the complete data set can be downloaded by following the link. Let's first run a mixed model using the full data set.
proc mixed data = dp; class group subj; model y = group pre x1 visit /solution; random intercept visit /subject=subj type=un; run;
Dimensions
Covariance Parameters 4
Columns in X 6
Columns in Z Per Subject 2
Subjects 61
Max Obs Per Subject 6
Number of Observations
Number of Observations Read 295
Number of Observations Used 295
Number of Observations Not Used 0
Iteration History
Iteration Evaluations -2 Res Log Like Criterion
0 1 1799.97381611
1 2 1649.18880814 0.00000115
2 1 1649.18816211 0.00000000
Convergence criteria met.
Covariance Parameter Estimates
Cov Parm Subject Estimate
UN(1,1) SUBJ 23.1600
UN(2,1) SUBJ -2.5572
UN(2,2) SUBJ 0.8530
Residual 8.4011
Fit Statistics
-2 Res Log Likelihood 1649.2
AIC (smaller is better) 1657.2
AICC (smaller is better) 1657.3
BIC (smaller is better) 1665.6
Null Model Likelihood Ratio Test
DF Chi-Square Pr > ChiSq
3 150.79 <.0001
Solution for Fixed Effects
Standard
Effect GROUP Estimate Error DF t Value Pr > |t|
Intercept 1.9654 3.8244 58 0.51 0.6093
GROUP 0 4.0593 1.1131 180 3.65 0.0003
GROUP 1 0 . . . .
PRE 0.4696 0.1484 180 3.17 0.0018
X1 0.01702 0.01496 180 1.14 0.2569
VISIT -1.2094 0.1664 52 -7.27 <.0001
Type 3 Tests of Fixed Effects
Num Den
Effect DF DF F Value Pr > F
GROUP 1 180 13.30 0.0003
PRE 1 180 10.02 0.0018
X1 1 180 1.29 0.2569
VISIT 1 52 52.82 <.0001
Notice that the covariate x1 is not a significant predictor in this model. Now let's run the same model using the data set with missing data.
proc mixed data = ats.dp_miss; class group subj; model y = group pre x1 visit /solution; random intercept visit /subject=subj type=un; run;
Dimensions
Covariance Parameters 4
Columns in X 6
Columns in Z Per Subject 2
Subjects 58
Max Obs Per Subject 6
Number of Observations
Number of Observations Read 295
Number of Observations Used 250
Number of Observations Not Used 45
Iteration History
Iteration Evaluations -2 Res Log Like Criterion
0 1 1494.45188419
1 2 1374.49573745 0.00002490
2 1 1374.48385751 0.00000006
3 1 1374.48382794 0.00000000
Convergence criteria met.
Covariance Parameter Estimates
Cov Parm Subject Estimate
UN(1,1) SUBJ 20.2010
UN(2,1) SUBJ -1.6866
UN(2,2) SUBJ 0.4588
Residual 7.8997
Fit Statistics
-2 Res Log Likelihood 1374.5
AIC (smaller is better) 1382.5
AICC (smaller is better) 1382.7
BIC (smaller is better) 1390.7
Null Model Likelihood Ratio Test
DF Chi-Square Pr > ChiSq
3 119.97 <.0001
Solution for Fixed Effects
Standard
Effect GROUP Estimate Error DF t Value Pr > |t|
Intercept 0.8407 3.9547 55 0.21 0.8324
GROUP 0 4.3650 1.0995 141 3.97 0.0001
GROUP 1 0 . . . .
PRE 0.4023 0.1492 141 2.70 0.0079
X1 0.03676 0.01731 141 2.12 0.0355
VISIT -1.3262 0.1506 49 -8.81 <.0001
Type 3 Tests of Fixed Effects
Num Den
Effect DF DF F Value Pr > F
GROUP 1 141 15.76 0.0001
PRE 1 141 7.27 0.0079
X1 1 141 4.51 0.0355
VISIT 1 49 77.54 <.0001
Now the predictor variable x1 becomes a significant predictor. This is an example of what happens to the parameter estimates with listwise deletion when the data are not missing completely at random. (Since we generated the data with missing values based on the full data set, we know this is the case.) The parameter estimates are biased using listwise deletion when missing is not completely at random. Assuming that the data are missing at random, applying a multiple imputation technique to the data set will give us a chance to model the missingness and to improve the estimates of variance covariance structure needed for the analysis model.
For a longitudinal data set, the way to impute is to turn the data in wide so that each row corresponds to a subject. After the imputation, we will have to turn the data back to long for the mixed model analysis. In turning data from long to wide, we need to know which variables are within-variables; in other words, time-varying and which variables are between variables. In our example, visit is the variable for time and x1 is a time-varying covariate. The outcome variable y, apparently, is also a time-varying variable.
* number of time points;
proc freq data = dp_miss;
tables visit;
run;
Cumulative Cumulative
VISIT Frequency Percent Frequency Percent
----------------------------------------------------------
1 61 20.68 61 20.68
2 53 17.97 114 38.64
3 46 15.59 160 54.24
4 45 15.25 205 69.49
5 45 15.25 250 84.75
6 45 15.25 295 100.00
proc sort data = dp_miss;
by subj visit;
run;
data wide;
set dp_miss;
array yt(6);
array xt(6);
by subj;
retain yt xt;
if first.subj then do i = 1 to 6;
yt(i) = .;
xt(i) = .;
end;
yt(visit) = y;
xt(visit) = x1;
if last.subj;
drop visit y x1 i;
run;
proc print data = wide (obs=10) noobs;
run;
SUBJ GROUP PRE yt1 yt2 yt3 yt4 yt5 yt6 xt1 xt2 xt3 xt4 xt5 xt6 1 0 18 17 18.0000 15 17.0000 14.0000 15.0000 119.717 126.267 129.251 138.490 121.896 130.930 2 0 27 26 23.0000 18 17.0000 12.0000 10.0000 . . 123.376 . 132.156 102.482 3 0 16 17 14.0000 . . . . 131.226 124.536 . . . . 4 0 17 14 23.0000 17 13.0000 12.0000 12.0000 120.253 124.482 121.966 130.832 126.038 122.019 5 0 15 12 10.0000 8 4.0000 5.0000 5.0000 152.194 124.859 134.678 155.972 153.952 138.948 6 0 20 19 11.5400 9 8.0000 6.8200 5.0500 143.304 130.493 134.912 123.653 125.339 130.643 7 0 16 13 13.0000 9 7.0000 8.0000 7.0000 125.082 113.869 127.469 126.645 129.214 100.558 8 0 28 26 27.0000 . . . . 119.613 145.215 . . . . 9 0 28 26 24.0000 19 13.9400 11.0000 9.0000 . 136.975 116.491 119.581 . . 10 0 25 9 12.0000 15 12.0000 13.0000 20.0000 131.701 . 127.819 126.092 124.667 .
ods select misspattern;
proc mi data = wide out=wide_imputed;
var xt: yt: group pre ;
run;
Missing Data Patterns
-----------------------------Group Means---------------------------- Group xt1 xt2 xt3 xt4 xt5 xt6 yt1 yt2 yt3 yt4 yt5 yt6 GROUP PRE Freq Percent xt1 xt2 xt3 xt4 xt5
1 X X X X X X X X X X X X X X 31 50.82 129.210454 123.651090 133.238919 133.059281 127.296889
2 X X X X X . X X X X X X X X 1 1.64 124.891563 114.192368 129.880737 100.408318 132.438690
3 X X X . . . X X X . . . X X 1 1.64 124.349800 140.247986 129.350708 . .
4 X X . . . . X X . . . . X X 5 8.20 130.125708 140.110742 . . .
5 X . X X X X X X X X X X X X 1 1.64 133.692200 . 131.970581 121.127487 118.771194
6 X . X X X . X X X X X X X X 1 1.64 131.701416 . 127.819115 126.092209 124.666901
7 X . X . . . X X X X X X X X 1 1.64 153.944626 . 138.519775 . .
8 X . . . . . X . . . . . X X 6 9.84 128.225189 . . . .
9 . X X X . X X X X X X X X X 1 1.64 . 122.507767 130.384705 121.351753 .
10 . X X X . . X X X X X X X X 1 1.64 . 136.975418 116.491409 119.580635 .
11 . X X . X X X X X X X X X X 1 1.64 . 135.618896 131.715240 . 135.236145
12 . X X . X . X X X X X X X X 1 1.64 . 120.561295 127.346565 . 117.862259
13 . X X . . X X X X X X X X X 1 1.64 . 126.772293 132.519714 . .
14 . X . X . . X X X X X X X X 1 1.64 . 188.888687 . 164.752579 .
15 . X . . . . X X . . . . X X 1 1.64 . 129.818832 . . .
16 . . X . X X X X X X X X X X 1 1.64 . . 123.375587 . 132.156387
17 . . X . . . X X X X X X X X 1 1.64 . . 132.042450 . .
18 . . . X X X X X X X X X X X 1 1.64 . . . 119.213478 140.304291
19 . . . . X X X X X X X X X X 1 1.64 . . . . 122.023026
20 . . . . . . X X . . . . X X 1 1.64 . . . . .
21 . . . . . . X . . . . . X X 2 3.28 . . . . .
Missing Data Patterns
-----------------------------------------------------------------Group Means---------------------------------------------------------------- Group xt6 yt1 yt2 yt3 yt4 yt5 yt6 GROUP PRE
1 129.316789 14.080968 10.947742 9.389677 9.474839 7.758387 7.240323 0.580645 18.935484
2 . 11.000000 7.000000 5.000000 8.000000 2.000000 3.000000 1.000000 23.000000
3 . 7.000000 12.000000 15.000000 . . . 1.000000 22.000000
4 . 16.600000 19.800000 . . . . 0.200000 20.000000
5 112.064934 11.000000 7.000000 3.000000 2.000000 2.000000 2.000000 1.000000 28.000000
6 . 9.000000 12.000000 15.000000 12.000000 13.000000 20.000000 0 25.000000
7 . 14.000000 14.000000 13.000000 12.000000 18.000000 15.000000 1.000000 24.000000
8 . 16.166667 . . . . . 0.333333 19.500000
9 121.262314 19.000000 13.000000 22.000000 12.000000 18.000000 13.000000 0 26.000000
10 . 26.000000 24.000000 19.000000 13.940000 11.000000 9.000000 0 28.000000
11 138.241745 20.000000 18.000000 16.000000 9.000000 10.000000 6.000000 1.000000 25.000000
12 . 16.000000 15.000000 11.000000 11.000000 11.000000 11.000000 1.000000 24.000000
13 127.396812 8.000000 17.000000 15.000000 7.000000 5.000000 7.000000 1.000000 27.000000
14 . 22.000000 27.000000 24.000000 22.000000 24.000000 23.000000 1.000000 27.459999
15 . 16.000000 13.000000 . . . . 1.000000 23.000000
16 102.481682 26.000000 23.000000 18.000000 17.000000 12.000000 10.000000 0 27.000000
17 . 1.000000 18.000000 10.000000 13.000000 12.000000 10.000000 1.000000 26.000000
18 169.372818 15.000000 24.000000 18.000000 15.190000 13.000000 12.320000 1.000000 25.000000
19 125.793289 11.000000 9.000000 10.000000 8.000000 7.000000 4.000000 1.000000 23.000000
20 . 13.000000 22.000000 . . . . 0 26.000000
21 . 19.000000 . . . . . 0.500000 25.000000
ods graphics on;
proc mi data = wide out=wide_imputed nimpute=30 seed=1213445;
var xt: yt: group pre ;
mcmc plots=acf(wlf) nbiter=1000;
run;
Variance Information
Relative Fraction
-----------------Variance----------------- Increase Missing Relative
Variable Between Within Total DF in Variance Information Efficiency
xt1 1.531802 3.211843 4.794705 33.951 0.492820 0.335106 0.988953
xt2 0.705394 3.594408 4.323315 46.117 0.202789 0.170224 0.994358
xt3 1.180991 2.402971 3.623328 33.482 0.507853 0.341934 0.988731
xt4 1.767436 3.767083 5.593433 34.206 0.484818 0.331415 0.989074
xt5 1.936756 2.553319 4.554633 26.765 0.783809 0.446720 0.985328
xt6 4.982459 5.654930 10.803471 24.56 0.910452 0.484574 0.984104
yt2 0.091587 0.724349 0.818989 50.194 0.130655 0.116370 0.996136
yt3 0.088875 0.600429 0.692266 48.893 0.152953 0.133713 0.995563
yt4 0.080134 0.539325 0.622130 48.86 0.153535 0.134157 0.995548
yt5 0.112376 0.588447 0.704569 46.411 0.197337 0.166373 0.994485
yt6 0.104164 0.499650 0.607286 45.445 0.215424 0.179019 0.994068
Parameter Estimates
t for H0:
Variable Mean Std Error 95% Confidence Limits DF Minimum Maximum Mu0 Mean=Mu0 Pr > |t|
xt1 130.800280 2.189681 126.3501 135.2505 33.951 128.194969 133.084071 0 59.73 <.0001
xt2 128.420909 2.079258 124.2359 132.6060 46.117 126.590679 130.156352 0 61.76 <.0001
xt3 131.033630 1.903504 127.1630 134.9042 33.482 128.495709 133.178206 0 68.84 <.0001
xt4 132.221511 2.365044 127.4162 137.0268 34.206 130.392552 135.398365 0 55.91 <.0001
xt5 127.007770 2.134159 122.6270 131.3885 26.765 124.358792 129.587070 0 59.51 <.0001
xt6 133.330690 3.286863 126.5551 140.1063 24.56 128.635706 138.967370 0 40.56 <.0001
yt2 13.716894 0.904980 11.8994 15.5344 50.194 13.049361 14.259155 0 15.16 <.0001
yt3 11.976376 0.832026 10.3043 13.6485 48.893 11.264413 12.519671 0 14.39 <.0001
yt4 11.000523 0.788752 9.4154 12.5857 48.86 10.358900 11.422313 0 13.95 <.0001
yt5 9.877199 0.839386 8.1880 11.5664 46.411 9.058228 10.414946 0 11.77 <.0001
yt6 9.133555 0.779286 7.5644 10.7027 45.445 8.627612 9.984750 0 11.72 <.0001
data long_imputed;
set wide_imputed;
array yt(6) yt:;
array xt(6) xt:;
do visit = 1 to 6;
y = yt(visit);
x1 = xt(visit);
output;
end;
drop xt: yt:;
run;
At this point, the data have been reshaped back to long. It seems that we are
ready for our analysis model. But we are not quite there yet. Since we have
turned the data to wide for the imputation procedure, we now have more data
points than we started with. Now every subject has 6 rows of data! We need to
restrict our analysis to the time points where we have some data. To this end,
we can use proc sql to select only the
time points where we have some values for each of the subjects.proc sql;
create table long_imputed_a as
select a.*, a.visit-1 as time,
a.x1-mean(a.x1) as cx1, a.pre-mean(a.pre) as cpre
from long_imputed as a, dp_miss as b
where a.subj=b.subj and a.visit = b.visit;
quit;
proc sort data = long_imputed_a;
by _imputation_;
run;
proc mixed data = long_imputed_a;
by _imputation_;
class group subj;
model y = group pre x1 visit /solution;
random intercept visit /subject=subj type=un;
ods output SolutionF=mixparms;
run;
proc mianalyze parms(classvar=full)=mixparms;
class group;
modeleffects Intercept group pre x1 visit;
run;
Model Information
PARMS Data Set WORK.MIXPARMS
Number of Imputations 30
Variance Information
-----------------Variance-----------------
Parameter group Between Within Total DF
Intercept . 1.254539 14.171881 15.468238 4128.9
group 0 0.000401 1.221414 1.221829 2.52E8
group 1.000000 0 . . .
pre . 0.000015750 0.021683 0.021699 5.16E7
x1 . 0.000077681 0.000219 0.000300 404.3
visit . 0.000083252 0.027082 0.027168 2.89E6
Variance Information
Relative Fraction
Increase Missing Relative
Parameter group in Variance Information Efficiency
Intercept . 0.091474 0.084251 0.997199
group 0 0.000339 0.000339 0.999989
group 1.000000 . . .
pre . 0.000751 0.000750 0.999975
x1 . 0.365790 0.271418 0.991034
visit . 0.003177 0.003167 0.999894
Parameter Estimates
Parameter group Estimate Std Error 95% Confidence Limits DF
Intercept . 1.517928 3.932968 -6.19281 9.22866 4128.9
group 0 4.085099 1.105364 1.91863 6.25157 2.52E8
group 1.000000 0 . . . .
Parameter Estimates
t for H0:
Parameter group Minimum Maximum Theta0 Parameter=Theta0 Pr > |t|
Intercept . -0.344025 3.600943 0 0.39 0.6996
group 0 4.051295 4.119540 0 3.70 0.0002
group 1.000000 0 0 0 . .
Parameter Estimates
Parameter group Estimate Std Error 95% Confidence Limits DF
pre . 0.465283 0.147307 0.17657 0.75400 5.16E7
x1 . 0.021101 0.017312 -0.01293 0.05513 404.3
visit . -1.215938 0.164828 -1.53899 -0.89288 2.89E6
Parameter Estimates
t for H0:
Parameter group Minimum Maximum Theta0 Parameter=Theta0 Pr > |t|
pre . 0.455333 0.475869 0 3.16 0.0016
x1 . 0.004767 0.036989 0 1.22 0.2236
visit . -1.239579 -1.201451 0 -7.38 <.0006
Now let's put the results of the three analyses together in one single table.
-------------------------------------------------------------------- |Parameter | Full | Listwise | MI data (30) | |------------------+---------------+---------------+---------------| |INTERCEPT | 1.965| 0.841| 1.518| | | (3.824)| (3.955)| (3.933)| |GROUP | 4.059| 4.365| 4.085| | | (1.113)| (1.099)| (1.105)| |PRE | 0.470| 0.402| 0.465| | | (0.148)| (0.149)| (0.147)| |X1 | 0.017| 0.037| 0.021| | | (0.015)| (0.017)| (0.017)| |VISIT | -1.209| -1.326| -1.216| | | (0.166)| (0.151)| (0.165)| --------------------------------------------------------------------
Here is the sas code for creating the table above.
options formchar = '|----|+|---+=|-/<>*';
proc mixed data = ats.dp;
class group subj;
model y = group pre x1 visit /solution;
random intercept visit /subject=subj type=un;
ods output solutionF = full;
run;
proc mixed data = ats.dp_miss;
class group subj;
model y = group pre x1 visit /solution;
random intercept visit /subject=subj type=un;
ods output solutionF = listwise;
run;
ods output parameterestimates=mi;
proc mianalyze parms(classvar=full)=mixparms;
class group;
modeleffects Intercept group pre x1 visit;
run;
data all;
length d $16;
label d = "data type";
set full (in=f1) listwise (in=f2)
mi(in=f3 rename=(parm=effect) keep=parm group estimate stderr df tvalue probt);
if f1 = 1 then d = "Full";
if f2 = 1 then d = "Listwise";
if f3 = 1 then d = "MI data (30)";
effect = upcase(effect);
run;
proc format;
picture stderrf (round)
low-high=' 9.999)' (prefix='(')
.=' ';
run;
proc tabulate data=all order=data noseps;
class d effect;
var estimate stderr;
table effect=''*(estimate =' '*sum=' '*F=15.3
stderr=' '*sum=' '*F=stderrf.),
d=' '
/ box=[label="Parameter"] rts=20 row=float misstext=' ';
run;
Notice that in the process of running proc sql, we have also created
centered variables and recoded visit variable into a new variable
called time, to have initial value starting at 0 instead of at 1 so the intercept in the model corresponds to the
initial status holding the covariates x1 and pre at their grand means. We are
going to run a new model using these centered covariates together with an
interaction of group with time. Using this example, we are going
to show how to combine the estimation results from an estimate statement.
A lot of times, we need to figure some details such as how to write the estimate
statement and how to output the results from the estimate statement. To this
end, we should just run our model on a sample data set, say, the first imputed
data set and turn the ods trace on. Once we figure out all the details we can
move on to the entire imputed data set.
ods trace on /listing;
proc mixed data = long_imputed_a covtest;
where _imputation_ = 1;
class group subj;
model y = group time group*time cpre cx1 /solution;
random intercept time /subject=subj type=un;
estimate "time effect for group = 0" time 1 group*time 1;
run;
ods trace off;
(output omitted...)
Output Added:
-------------
Name: CovParms
Label: Covariance Parameter Estimates
Template: Stat.Mixed.CovParms
Path: Mixed.CovParms
-------------
Covariance Parameter Estimates
Standard Z
Cov Parm Subject Estimate Error Value Pr Z
UN(1,1) SUBJ 18.8465 4.4290 4.26 <.0001
UN(2,1) SUBJ -1.7606 0.9171 -1.92 0.0549
UN(2,2) SUBJ 0.8440 0.2934 2.88 0.0020
Residual 8.4525 0.8855 9.55 <.0001
Output Added:
-------------
Name: FitStatistics
Label: Fit Statistics
Template: Stat.Mixed.FitStatistics
Path: Mixed.FitStatistics
-------------
Fit Statistics
-2 Res Log Likelihood 1648.6
AIC (smaller is better) 1656.6
AICC (smaller is better) 1656.7
BIC (smaller is better) 1665.0
Output Added:
-------------
Name: LRT
Label: Null Model Likelihood Ratio Test
Template: Stat.Mixed.LRT
Path: Mixed.LRT
-------------
Null Model Likelihood Ratio Test
DF Chi-Square Pr > ChiSq
3 140.88 <.0001
Output Added:
-------------
Name: SolutionF
Label: Solution for Fixed Effects
Template: Stat.Mixed.SolutionF
Path: Mixed.SolutionF
-------------
Solution for Fixed Effects
Standard
Effect GROUP Estimate Error DF t Value Pr > |t|
Intercept 12.9740 0.8350 58 15.54 <.0001
GROUP 0 3.7834 1.2629 180 3.00 0.0031
GROUP 1 0 . . . .
time -1.2847 0.2127 51 -6.04 <.0001
time*GROUP 0 0.1596 0.3400 180 0.47 0.6392
time*GROUP 1 0 . . . .
cpre 0.4643 0.1472 180 3.15 0.0019
cx1 0.02163 0.01493 180 1.45 0.1492
Output Added:
-------------
Name: Tests3
Label: Type 3 Tests of Fixed Effects
Template: Stat.Mixed.Tests3
Path: Mixed.Tests3
-------------
Type 3 Tests of Fixed Effects
Num Den
Effect DF DF F Value Pr > F
GROUP 1 180 8.97 0.0031
time 1 51 50.20 <.0001
time*GROUP 1 180 0.22 0.6392
cpre 1 180 9.95 0.0019
cx1 1 180 2.10 0.1492
Output Added:
-------------
Name: Estimates
Label: Estimates
Template: Stat.Mixed.Estimates
Path: Mixed.Estimates
-------------
Estimates
Standard
Label Estimate Error DF t Value Pr > |t|
time effect for group = 0 -1.1251 0.2653 180 -4.24 <.0001
From the output, we can verify that our estimate statement does what it is supposed to and the names of the data sets that we care to create. In this model, there are three sets of estimates that we are interested: the parameter estimates of the fixed effect, the results from the estimate statement and maybe the variance-covariance estimates. The point of "maybe" here is that we can mechanically combine the estimation results, but we should not use the combined results blindly. For instance, how reliable are the estimates of the variance-covariance parameters? If the number of between subjects is not large enough, these parameter estimates are not as reliable. Now we are rerun the code above adding the ods output statement to check that the three estimation data sets will be created.
proc mixed data = long_imputed_a covtest; where _imputation_ = 1; class group subj; model y = group time group*time cpre cx1 /solution; random intercept time /subject=subj type=un; estimate "time effect for group = 0" time 1 group*time 1; ods output CovParms = cov solutionf= sol estimates=est; run;
NOTE: Convergence criteria met.
NOTE: The data set WORK.EST has 1 observations and 6 variables.
NOTE: The data set WORK.SOL has 8 observations and 7 variables.
NOTE: The data set WORK.COV has 4 observations and 6 variables.
NOTE: PROCEDURE MIXED used (Total process time):
real time 0.09 seconds
cpu time 0.06 seconds
Now we are ready to run the model on the entire imputed data set. We add a second estimate statement, a trivial one, for the purpose of illustrating how to combine the results from the estimate statement using proc mianalyze with different syntax.
proc mixed data = long_imputed_a covtest; by _imputation_ ; class group subj; model y = group time group*time cpre cx1 /solution; random intercept time /subject=subj type=un; estimate "time effect for group = 0" time 1 group*time 1; estimate "time effect for group = 1" time 1; ods output CovParms = cov solutionf= sol estimates=est; run; proc mianalyze parms(classvar=full)=sol; class group; modeleffects Intercept group time group*time cpre cx1; run;
Parameter Estimates
Parameter group Minimum Maximum
Intercept . 12.924877 13.006671
group 0 3.742516 3.861900
group 1.000000 0 0
time . -1.307816 -1.259890
time*group 0 0.143170 0.194270
time*group 1.000000 0 0
cpre . 0.456482 0.476776
cx1 . 0.004594 0.037081
Parameter Estimates
t for H0:
Parameter group Theta0 Parameter=Theta0 Pr > |t|
Intercept . 0 15.57 <.0001
group 0 0 3.02 0.0025
group 1.000000 0 . .
time . 0 -6.00 <.0001
time*group 0 0 0.47 0.6366
time*group 1.000000 0 . .
cpre . 0 3.17 0.0015
cx1 . 0 1.22 0.2233
proc print data = est (obs=10) noobs ; run;
_Imputation_ Label Estimate StdErr DF tValue Probt
1 time effect for group = 0 -1.1251 0.2653 180 -4.24 <.0001
2 time effect for group = 0 -1.1148 0.2696 180 -4.14 <.0001
3 time effect for group = 0 -1.1331 0.2586 180 -4.38 <.0001
4 time effect for group = 0 -1.1182 0.2699 180 -4.14 <.0001
5 time effect for group = 0 -1.1176 0.2663 180 -4.20 <.0001
6 time effect for group = 0 -1.1261 0.2692 180 -4.18 <.0001
7 time effect for group = 0 -1.1105 0.2680 180 -4.14 <.0001
8 time effect for group = 0 -1.1104 0.2636 180 -4.21 <.0001
9 time effect for group = 0 -1.1068 0.2654 180 -4.17 <.0001
10 time effect for group = 0 -1.1142 0.2667 180 -4.18 <.0001
proc sort data = est;
by label _imputation_;
run;
ods select parameterestimates;
proc mianalyze data = est ;
by label;
modeleffects estimate;
stderr stderr;
run;
Label=time effect for group = 0
Parameter Estimates
Parameter Estimate Std Error 95% Confidence Limits DF
estimate -1.117590 0.266011 -1.63896 -0.59622 2.68E7
Parameter Estimates
t for H0:
Parameter Minimum Maximum Theta0 Parameter=Theta0 Pr > |t|
estimate -1.135251 -1.101759 0 -4.20 <.0001
Label=time effect for group = 1
Parameter Estimates
Parameter Estimate Std Error 95% Confidence Limits DF
estimate -1.198107 0.170490 -1.53226 -0.86395 4.21E6
Parameter Estimates
t for H0:
Parameter Minimum Maximum Theta0 Parameter=Theta0 Pr > |t|
estimate -1.220460 -1.184616 0 -7.03 <.0001
proc sort data = cov;
by covparm _imputation_;
run;
ods select parameterestimates;
proc mianalyze data = cov ;
by covparm;
modeleffects estimate;
stderr stderr;
run;
Cov Parm=Residual
Parameter Estimates
Parameter Estimate Std Error 95% Confidence Limits DF
estimate 8.437262 0.884596 6.703486 10.17104 9.61E6
Parameter Estimates
t for H0:
Parameter Minimum Maximum Theta0 Parameter=Theta0 Pr > |t|
estimate 8.330932 8.489894 0 9.54 <.0001
Cov Parm=UN(1,1)
Parameter Estimates
Parameter Estimate Std Error 95% Confidence Limits DF
estimate 18.690516 4.413127 10.04094 27.34009 5.05E6
Parameter Estimates
t for H0:
Parameter Minimum Maximum Theta0 Parameter=Theta0 Pr > |t|
estimate 18.238351 19.075624 0 4.24 <.0001
Cov Parm=UN(2,1)
Parameter Estimates
Parameter Estimate Std Error 95% Confidence Limits DF
estimate -1.719833 0.913920 -3.51108 0.071418 7.48E6
Parameter Estimates
t for H0:
Parameter Minimum Maximum Theta0 Parameter=Theta0 Pr > |t|
estimate -1.835205 -1.617613 0 -1.88 0.0599
Cov Parm=UN(2,2)
Parameter Estimates
Parameter Estimate Std Error 95% Confidence Limits DF
estimate 0.847049 0.295519 0.267839 1.426260 215486
Parameter Estimates
t for H0:
Parameter Minimum Maximum Theta0 Parameter=Theta0 Pr > |t|
estimate 0.762822 0.890457 0 2.87 0.0042
The content of this web site should not be construed as an endorsement of any particular web site, book, or software product by the University of California.