To illustrate data analysis using Mplus, we will run a multiple regression, followed by a censored regression and logistic regressions, all using the hsb2.dat data file.
We will begin with a multiple regression with write as the response variable and read and female as the predictor variables. The Mplus command file is hsbreg.inp.
We use Type = meanstructure in the Analysis block in order to get an intercept in the model. The Model command block uses the keyword ON to indicate that the model regresses write on read and female. The Standardized option in the Output block is included to obtain standardized regression coefficients.
Title:
Analyzing data linear regression
Data:
File is hsb2.dat ;
Variable:
Names are
id female race ses schtyp prog read write math science socst;
Usevariables are
female read write;
Analysis:
Type = meanstructure;
Model:
write ON read female;
Output:
Standardized;
Here is the regression output from Mplus.
INPUT READING TERMINATED NORMALLY
Analyzing data linear regression
SUMMARY OF ANALYSIS
Number of groups 1
Number of observations 200
Number of dependent variables 1
Number of independent variables 2
Number of continuous latent variables 0
Observed dependent variables
Continuous
WRITE
Observed independent variables
FEMALE READ
Estimator ML
Information matrix EXPECTED
Maximum number of iterations 1000
Convergence criterion 0.500D-04
Maximum number of steepest descent iterations 20
Input data file(s)
hsb2.dat
Input data format FREE
THE MODEL ESTIMATION TERMINATED NORMALLY
TESTS OF MODEL FIT
Chi-Square Test of Model Fit
Value 0.000
Degrees of Freedom 0
P-Value 0.0000
Chi-Square Test of Model Fit for the Baseline Model
Value 115.756
Degrees of Freedom 2
P-Value 0.0000
CFI/TLI
CFI 1.000
TLI 1.000
Loglikelihood
H0 Value -1568.077
H1 Value -1568.077
Information Criteria
Number of Free Parameters 4
Akaike (AIC) 3144.155
Bayesian (BIC) 3157.348
Sample-Size Adjusted BIC 3144.676
(n* = (n + 2) / 24)
RMSEA (Root Mean Square Error Of Approximation)
Estimate 0.000
90 Percent C.I. 0.000 0.000
Probability RMSEA <= .05 0.000
SRMR (Standardized Root Mean Square Residual)
Value 0.000
MODEL RESULTS
Estimates S.E. Est./S.E. Std StdYX
WRITE ON
READ 0.566 0.049 11.546 0.566 0.612
FEMALE 5.487 1.007 5.451 5.487 0.289
Intercepts
WRITE 20.228 2.693 7.511 20.228 2.139
Residual Variances
WRITE 50.113 5.011 10.000 50.113 0.561
R-SQUARE
Observed
Variable R-Square
WRITE 0.439
Beginning Time: 11:55:19
Ending Time: 11:55:19
Elapsed Time: 00:00:00
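The information criteria in the output above can be reproduced from the H0 loglikelihood, the number of free parameters, and the sample size. The following is a minimal Python sketch, not part of the Mplus run; the function name information_criteria is made up for illustration, and the formulas are the usual AIC and BIC definitions together with the n* = (n + 2) / 24 adjustment printed in the output.

```python
import math

def information_criteria(loglik, n_params, n_obs):
    """Return (AIC, BIC, sample-size adjusted BIC) for a fitted model."""
    deviance = -2.0 * loglik
    aic = deviance + 2 * n_params
    bic = deviance + n_params * math.log(n_obs)
    n_star = (n_obs + 2) / 24  # adjustment shown in the Mplus output
    adj_bic = deviance + n_params * math.log(n_star)
    return aic, bic, adj_bic

# Values copied from the regression output above
aic, bic, adj_bic = information_criteria(loglik=-1568.077, n_params=4, n_obs=200)
print(round(aic, 3), round(bic, 3), round(adj_bic, 3))
```

The results agree with the printed AIC, BIC, and adjusted BIC up to rounding of the loglikelihood, which is shown to only three decimals.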
Next, we will run the same regression model using the missing data file introduced in the previous units.
We need to include missing along with meanstructure in the Type command of the Analysis block in order to analyze the model with missing data. If we were to omit missing from the Type command, Mplus would analyze the model by deleting cases with missing data.
Title:
Linear regression with missing data
Data:
File is hsbmis.dat ;
Variable:
Names are
id female race ses schtyp prog read write math science socst;
Missing are all (-9999) ;
Usevariables are
female read write ;
Analysis:
Type = meanstructure missing ;
Model:
write on read female ;
Output:
Standardized ;
Here is the regression output from Mplus.
INPUT READING TERMINATED NORMALLY
Linear regression with missing data
SUMMARY OF ANALYSIS
Number of groups 1
Number of observations 200
Number of dependent variables 1
Number of independent variables 2
Number of continuous latent variables 0
Observed dependent variables
Continuous
WRITE
Observed independent variables
FEMALE READ
Estimator ML
Information matrix OBSERVED
Maximum number of iterations 1000
Convergence criterion 0.500D-04
Maximum number of steepest descent iterations 20
Input data file(s)
hsbmis.dat
Input data format FREE
SUMMARY OF DATA
Number of patterns 4
COVARIANCE COVERAGE OF DATA
Minimum covariance coverage value 0.100
PROPORTION OF DATA PRESENT
Covariance Coverage
WRITE FEMALE READ
________ ________ ________
WRITE 0.930
FEMALE 0.900 0.970
READ 0.875 0.915 0.945
THE MODEL ESTIMATION TERMINATED NORMALLY
TESTS OF MODEL FIT
Degrees of Freedom 0
Loglikelihood
H0 Value -1479.708
Information Criteria
Number of Free Parameters 4
Akaike (AIC) 2967.417
Bayesian (BIC) 2980.610
Sample-Size Adjusted BIC 2967.938
(n* = (n + 2) / 24)
MODEL RESULTS
Estimates S.E. Est./S.E. Std StdYX
WRITE ON
READ 0.548 0.052 10.495 0.548 0.596
FEMALE 5.456 1.079 5.056 5.456 0.288
Intercepts
WRITE 20.880 2.895 7.212 20.880 2.211
Residual Variances
WRITE 50.966 5.376 9.479 50.966 0.572
R-SQUARE
Observed
Variable R-Square
WRITE 0.428
Beginning Time: 08:51:30
Ending Time: 08:51:31
Elapsed Time: 00:00:01
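The covariance coverage table in the output above reports, for each pair of variables, the proportion of cases in which both variables are observed. The Python sketch below illustrates that calculation on a small made-up data set; the five records are hypothetical and are not taken from hsbmis.dat, and only the missing-value code -9999 comes from the input file. It also counts the complete cases, which are the only ones that would remain if missing were omitted from the Type command.

```python
MISSING = -9999  # missing-value code declared in the Variable block

# Hypothetical toy records (write, female, read); NOT taken from hsbmis.dat
rows = [
    (52, 0, 57),
    (59, 1, MISSING),
    (MISSING, 0, 44),
    (41, MISSING, 47),
    (65, 1, 63),
]
names = ["write", "female", "read"]

def coverage(rows, i, j):
    """Proportion of cases with both variable i and variable j observed."""
    both = sum(1 for r in rows if r[i] != MISSING and r[j] != MISSING)
    return both / len(rows)

# Lower-triangular layout, as in the Mplus coverage table
for i, name in enumerate(names):
    cells = [f"{coverage(rows, i, j):.3f}" for j in range(i + 1)]
    print(f"{name:<8}", *cells)

# Cases that would survive listwise deletion
complete = sum(1 for r in rows if MISSING not in r)
print("complete cases:", complete, "of", len(rows))
```

The diagonal entries give the proportion of cases with each variable observed, and the off-diagonal entries give the proportion available for each covariance.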
Our next example using Mplus will be a multiple regression model with the response variable censored from above at 67. We will keep write as the response variable and read and female as the predictor variables. The Mplus command file is hsbcen.inp.
We will declare Censored are write (a) in the Variable block, where (a) indicates censoring from above, and we will add Estimator = MLR to the Analysis block.
Title:
Analyzing data censored linear regression
Data:
File is hsb2.dat ;
Variable:
Names are
id female race ses schtyp prog read write math science socst;
Usevariables are
female read write;
Censored are write (a) ;
Analysis:
Type = meanstructure;
Estimator = MLR;
Model:
write ON read female;
Here is the censored regression output from Mplus.
INPUT READING TERMINATED NORMALLY
Analyzing data censored linear regression
SUMMARY OF ANALYSIS
Number of groups 1
Number of observations 200
Number of dependent variables 1
Number of independent variables 2
Number of continuous latent variables 0
Observed dependent variables
Censored
WRITE
Observed independent variables
FEMALE READ
Estimator MLR
Information matrix OBSERVED
Optimization Specifications for the Quasi-Newton Algorithm for
Continuous Outcomes
Maximum number of iterations 1000
Convergence criterion 0.100D-05
Optimization Specifications for the EM Algorithm
Maximum number of iterations 500
Convergence criteria
Loglikelihood change 0.100D-02
Relative loglikelihood change 0.100D-05
Derivative 0.100D-02
Optimization Specifications for the M step of the EM Algorithm for
Categorical Latent variables
Number of M step iterations 1
M step convergence criterion 0.100D-02
Basis for M step termination ITERATION
Optimization Specifications for the M step of the EM Algorithm for
Censored, Binary or Ordered Categorical (Ordinal), Unordered
Categorical (Nominal) and Count Outcomes
Number of M step iterations 1
M step convergence criterion 0.100D-02
Basis for M step termination ITERATION
Maximum value for logit thresholds 15
Minimum value for logit thresholds -15
Minimum expected cell size for chi-square 0.100D-01
Optimization algorithm EMA
Integration Specifications
Type STANDARD
Number of integration points 15
Dimensions of numerical integration 0
Adaptive quadrature ON
Cholesky OFF
Input data file(s)
hsb2.dat
Input data format FREE
SUMMARY OF CENSORED LIMITS
WRITE 67.000
THE MODEL ESTIMATION TERMINATED NORMALLY
TESTS OF MODEL FIT
Loglikelihood
H0 Value -663.887
Information Criteria
Number of Free Parameters 4
Akaike (AIC) 1335.775
Bayesian (BIC) 1348.968
Sample-Size Adjusted BIC 1336.296
(n* = (n + 2) / 24)
MODEL RESULTS
Estimates S.E. Est./S.E.
WRITE ON
READ 0.584 0.045 12.837
FEMALE 5.617 1.050 5.350
Intercepts
WRITE 19.352 2.570 7.531
Residual Variances
WRITE 52.814 4.615 11.443
QUALITY OF NUMERICAL RESULTS
Condition Number for the Information Matrix 0.243E-05
(ratio of smallest to largest eigenvalue)
Beginning Time: 12:30:30
Ending Time: 12:30:30
Elapsed Time: 00:00:00
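To see what censoring from above means for the likelihood, the sketch below evaluates the standard censored-normal (tobit) loglikelihood contribution for a single case. This is an illustration under the assumption that the Mplus censored model corresponds to the usual tobit form; the source does not describe the internal computation. The coefficients and censoring limit are copied from the output above, and the student with read = 60 and female = 1 is hypothetical.

```python
import math

def normal_logpdf(y, mu, sigma):
    """Log density of a normal(mu, sigma) variable at y."""
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2 * math.pi)

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def censored_above_loglik(y, mu, sigma, limit):
    """Loglikelihood contribution of one case, censored from above at limit."""
    if y >= limit:
        # Censored case: we only know the latent response is at or above limit
        return math.log(1.0 - normal_cdf((limit - mu) / sigma))
    # Uncensored case: ordinary normal density
    return normal_logpdf(y, mu, sigma)

# Estimates copied from the censored regression output above
b0, b_read, b_female = 19.352, 0.584, 5.617
sigma = math.sqrt(52.814)  # residual standard deviation

def predicted_mean(read, female):
    return b0 + b_read * read + b_female * female

mu = predicted_mean(read=60, female=1)  # hypothetical student
print(censored_above_loglik(55, mu, sigma, limit=67.0))  # observed below the limit
print(censored_above_loglik(67, mu, sigma, limit=67.0))  # observed at the limit
```

A score at the limit contributes the probability of the latent score being 67 or higher rather than a density at exactly 67, which is why the censored regression estimates differ from the ordinary regression estimates.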
In order to demonstrate logistic regression, we need to transform write into a binary response variable using the cut option in the Define command block. With cut write (59), write = 0 when write <= 59 and write = 1 when write > 59. Since write is now a zero/one variable, we need to declare it to be categorical in the Variable block.
Title:
Analyzing data logistic regression
Data:
File is hsb2.dat ;
Variable:
Names are
id female race ses schtyp prog read write math science socst;
Usevariables are
female read write;
Categorical are write;
Define:
cut write (59);
Analysis:
Type = logistic;
Model:
write ON read female;
Here is the logistic regression output from Mplus.
INPUT READING TERMINATED NORMALLY
Analyzing data logistic regression
SUMMARY OF ANALYSIS
Number of groups 1
Number of observations 200
Number of dependent variables 1
Number of independent variables 2
Number of continuous latent variables 0
Observed dependent variables
Binary and ordered categorical (ordinal)
WRITE
Observed independent variables
FEMALE READ
Estimator MLR
Maximum number of iterations 1000
Convergence criterion 0.500D-04
Maximum number of steepest descent iterations 20
Input data file(s)
hsb2.dat
Input data format FREE
SUMMARY OF CATEGORICAL DATA PROPORTIONS
WRITE
Category 1 0.735
Category 2 0.265
TESTS OF MODEL FIT
Loglikelihood
H0 Value -85.444
Information Criteria
Number of Free Parameters 3
Akaike (AIC) 176.887
Bayesian (BIC) 186.782
Sample-Size Adjusted BIC 177.278
(n* = (n + 2) / 24)
RESULTS FOR LOGISTIC REGRESSION
Odds .95 C.I.
Estimates S.E. Est./S.E. Ratio Lower Upper
Thresholds
WRITE$1 -9.603 1.293
Slopes
FEMALE 1.121 0.420 2.668 3.068 1.346 6.990
READ 0.144 0.021 6.867 1.155 1.109 1.204
Beginning Time: 12:10:59
Ending Time: 12:10:59
Elapsed Time: 00:00:00
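The odds ratios and confidence limits in the output above are obtained by exponentiating the slope estimates. The following Python sketch repeats that arithmetic with the estimates and standard errors copied from the output; it assumes the usual normal-theory interval, estimate plus or minus 1.96 standard errors, and the function name odds_ratio_ci is made up for illustration.

```python
import math

def odds_ratio_ci(estimate, se, z=1.96):
    """Return (odds ratio, lower limit, upper limit) for a logit slope."""
    return (
        math.exp(estimate),
        math.exp(estimate - z * se),
        math.exp(estimate + z * se),
    )

# Slopes and standard errors copied from the logistic regression output
for name, est, se in [("FEMALE", 1.121, 0.420), ("READ", 0.144, 0.021)]:
    ratio, lower, upper = odds_ratio_ci(est, se)
    print(f"{name:<8}{ratio:8.3f}{lower:8.3f}{upper:8.3f}")
```

The results match the printed values to within rounding, since the output shows the estimates and standard errors to only three decimals.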
In this next example, we will transform write into a four-category ordinal response variable using cut write (45.5 54 60) in the Define command block. These cutpoints split write roughly at its quartiles.
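The recoding rule can be sketched in Python as follows. This is an illustration of the rule only, not Mplus code: a value less than or equal to the first cutpoint goes into category 0, a value greater than the first cutpoint and less than or equal to the second goes into category 1, and so on. The function name cut and the example scores are chosen for illustration.

```python
from bisect import bisect_left

def cut(value, cutpoints):
    """Category index: the number of cutpoints strictly below value."""
    return bisect_left(sorted(cutpoints), value)

# Binary recode used for the logistic regression
print([cut(v, [59]) for v in (31, 59, 60, 67)])               # [0, 0, 1, 1]

# Four-category recode used for the ordinal logistic regression
print([cut(v, [45.5, 54, 60]) for v in (45, 46, 54, 60, 61)])  # [0, 1, 1, 2, 3]
```

A score that falls exactly on a cutpoint stays in the lower category, which matches the rule that write = 0 when write <= 59.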
Title:
Analyzing data ordinal logistic regression
Data:
File is hsb2.dat ;
Variable:
Names are
id female race ses schtyp prog read write math science socst;
Usevariables are
female read write;
Categorical are write;
Define:
cut write (45.5 54 60);
Analysis:
Type = logistic;
Model:
write ON read female;
Here is the ordinal logistic regression output from Mplus.
INPUT READING TERMINATED NORMALLY
Analyzing data ordinal logistic regression
SUMMARY OF ANALYSIS
Number of groups 1
Number of observations 200
Number of dependent variables 1
Number of independent variables 2
Number of continuous latent variables 0
Observed dependent variables
Binary and ordered categorical (ordinal)
WRITE
Observed independent variables
FEMALE READ
Estimator MLR
Maximum number of iterations 1000
Convergence criterion 0.500D-04
Maximum number of steepest descent iterations 20
Input data file(s)
hsb2.dat
Input data format FREE
SUMMARY OF CATEGORICAL DATA PROPORTIONS
WRITE
Category 1 0.250
Category 2 0.285
Category 3 0.220
Category 4 0.245
TESTS OF MODEL FIT
Loglikelihood
H0 Value -229.901
Information Criteria
Number of Free Parameters 5
Akaike (AIC) 469.801
Bayesian (BIC) 486.293
Sample-Size Adjusted BIC 470.453
(n* = (n + 2) / 24)
RESULTS FOR LOGISTIC REGRESSION
Odds .95 C.I.
Estimates S.E. Est./S.E. Ratio Lower Upper
Thresholds
WRITE$1 5.990 0.826
WRITE$2 7.725 0.875
WRITE$3 9.075 0.921
Slopes
FEMALE 1.253 0.292 4.283 3.500 1.973 6.208
READ 0.131 0.015 8.683 1.139 1.106 1.174
Beginning Time: 09:57:35
Ending Time: 09:57:35
Elapsed Time: 00:00:00
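The thresholds and slopes above can be turned into predicted category probabilities. The sketch below assumes the proportional-odds parameterization in which P(write <= k) is the logistic function of the k-th threshold minus the linear predictor; that parameterization is an assumption here and is not stated in the output itself. The estimates are copied from the output, and the two students with read = 52 are hypothetical.

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

# Estimates copied from the ordinal logistic regression output above
thresholds = [5.990, 7.725, 9.075]
b_read, b_female = 0.131, 1.253

def category_probs(read, female):
    """Predicted probabilities for the four ordered categories of write."""
    eta = b_read * read + b_female * female
    # Cumulative probabilities P(write <= k); the last category closes at 1
    cum = [logistic(t - eta) for t in thresholds] + [1.0]
    return [cum[0]] + [cum[k] - cum[k - 1] for k in range(1, len(cum))]

# Hypothetical students with the same reading score
for female in (0, 1):
    probs = category_probs(read=52, female=female)
    print(female, [round(p, 3) for p in probs])
```

Under this parameterization, a positive slope shifts probability toward the higher categories, so the female student has a larger predicted probability of falling in the top category of write than the male student with the same reading score.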
These pages are Copyrighted (c) by UCLA Academic Technology Services