Mplus Class Notes
Analyzing Data


To illustrate data analysis using Mplus, we will run a multiple regression (with and without missing data) followed by censored regression, logistic regression, and ordinal logistic regression. All examples are based on the hsb2.dat data file.

1.0 Multiple Regression

We will begin with a multiple regression with write as the response variable and read and female as the predictor variables. The Mplus command file is hsbreg.inp.

We use Type = meanstructure in the Analysis block in order to get an intercept in the model. In the Model block, the keyword ON indicates that write is regressed on read and female. The Standardized option in the Output block requests standardized regression coefficients.

Title:
  Analyzing data linear regression
Data:
  File is hsb2.dat ;
Variable:
  Names are
     id female race ses schtyp prog read write math science socst;
  Usevariables are
     female read write;
Analysis:
  Type = meanstructure;
Model:
   write ON read female;
Output:
  Standardized;

Here is the regression output from Mplus.

INPUT READING TERMINATED NORMALLY

Analyzing data linear regression

SUMMARY OF ANALYSIS

Number of groups                                                 1
Number of observations                                         200

Number of dependent variables                                    1
Number of independent variables                                  2
Number of continuous latent variables                            0

Observed dependent variables

  Continuous
   WRITE

Observed independent variables

   FEMALE      READ

Estimator                                                       ML
Information matrix                                        EXPECTED
Maximum number of iterations                                  1000
Convergence criterion                                    0.500D-04
Maximum number of steepest descent iterations                   20

Input data file(s)
  hsb2.dat

Input data format  FREE

THE MODEL ESTIMATION TERMINATED NORMALLY

TESTS OF MODEL FIT
 
Chi-Square Test of Model Fit
          Value                              0.000
          Degrees of Freedom                     0
          P-Value                           0.0000

Chi-Square Test of Model Fit for the Baseline Model
          Value                            115.756
          Degrees of Freedom                     2
          P-Value                           0.0000

CFI/TLI
          CFI                                1.000
          TLI                                1.000

Loglikelihood
          H0 Value                       -1568.077
          H1 Value                       -1568.077

Information Criteria
          Number of Free Parameters              4
          Akaike (AIC)                    3144.155
          Bayesian (BIC)                  3157.348
          Sample-Size Adjusted BIC        3144.676
            (n* = (n + 2) / 24)

RMSEA (Root Mean Square Error Of Approximation)
          Estimate                           0.000
          90 Percent C.I.                    0.000  0.000
          Probability RMSEA <= .05           0.000

SRMR (Standardized Root Mean Square Residual)
          Value                              0.000

MODEL RESULTS
                   Estimates     S.E.  Est./S.E.    Std     StdYX
 WRITE    ON
    READ               0.566    0.049     11.546    0.566    0.612
    FEMALE             5.487    1.007      5.451    5.487    0.289

 Intercepts
    WRITE             20.228    2.693      7.511   20.228    2.139

 Residual Variances
    WRITE             50.113    5.011     10.000   50.113    0.561

R-SQUARE

    Observed
    Variable  R-Square
    WRITE        0.439

     Beginning Time:  11:55:19
        Ending Time:  11:55:19
       Elapsed Time:  00:00:00
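
As a quick check on the output above, the information criteria can be recomputed from the H0 loglikelihood and the number of free parameters, and R-square from the standardized residual variance. The sketch below assumes the usual definitions (AIC = -2 logL + 2k, BIC = -2 logL + k ln n, with n replaced by n* = (n + 2) / 24 for the sample-size adjusted BIC). The function and variable names are ours and are not part of Mplus.

```python
import math

def information_criteria(loglik, k, n):
    """Return (AIC, BIC, sample-size adjusted BIC) from a loglikelihood."""
    deviance = -2.0 * loglik
    aic = deviance + 2 * k
    bic = deviance + k * math.log(n)
    n_star = (n + 2) / 24.0          # adjustment shown in the Mplus output
    abic = deviance + k * math.log(n_star)
    return aic, bic, abic

# Values copied from the regression output above
aic, bic, abic = information_criteria(loglik=-1568.077, k=4, n=200)
print(round(aic, 2), round(bic, 2), round(abic, 2))

# R-square is one minus the standardized (StdYX) residual variance
r_square = 1 - 0.561
print(round(r_square, 3))
```

The results should agree with the printed AIC, BIC, adjusted BIC and R-square to within rounding of the inputs.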

2.0 Multiple Regression with Missing Data

Next, we will run the same regression model using the missing data file introduced in the previous units.

We need to include missing along with meanstructure in the Type option of the Analysis block in order to analyze the model with missing data. If we were to omit missing from the Type option, Mplus would analyze the model after deleting cases with missing data.

Title:
  Linear regression with missing data
Data:
  File is hsbmis.dat ;
Variable:
  Names are
     id female race ses schtyp prog read write math science socst;
  Missing are all (-9999) ;
  Usevariables are
     female read write ;
Analysis:
  Type = meanstructure missing ;
Model:
  write on read female ;
Output:
  Standardized ;

Here is the regression output from Mplus.

INPUT READING TERMINATED NORMALLY

Linear regression with missing data

SUMMARY OF ANALYSIS

Number of groups                                                 1
Number of observations                                         200

Number of dependent variables                                    1
Number of independent variables                                  2
Number of continuous latent variables                            0

Observed dependent variables

  Continuous
   WRITE

Observed independent variables
   FEMALE      READ

Estimator                                                       ML
Information matrix                                        OBSERVED
Maximum number of iterations                                  1000
Convergence criterion                                    0.500D-04
Maximum number of steepest descent iterations                   20

Input data file(s)
  hsbmis.dat

Input data format  FREE

SUMMARY OF DATA
     Number of patterns           4

COVARIANCE COVERAGE OF DATA

Minimum covariance coverage value   0.100

     PROPORTION OF DATA PRESENT
           Covariance Coverage
              WRITE         FEMALE        READ
              ________      ________      ________
 WRITE          0.930
 FEMALE         0.900         0.970
 READ           0.875         0.915         0.945

THE MODEL ESTIMATION TERMINATED NORMALLY


TESTS OF MODEL FIT

Degrees of Freedom                               0

Loglikelihood
          H0 Value                       -1479.708

Information Criteria
          Number of Free Parameters              4
          Akaike (AIC)                    2967.417
          Bayesian (BIC)                  2980.610
          Sample-Size Adjusted BIC        2967.938
            (n* = (n + 2) / 24)

MODEL RESULTS
                   Estimates     S.E.  Est./S.E.    Std     StdYX
 WRITE    ON
    READ               0.548    0.052     10.495    0.548    0.596
    FEMALE             5.456    1.079      5.056    5.456    0.288

 Intercepts
    WRITE             20.880    2.895      7.212   20.880    2.211

 Residual Variances
    WRITE             50.966    5.376      9.479   50.966    0.572

R-SQUARE

    Observed
    Variable  R-Square
    WRITE        0.428

     Beginning Time:  08:51:30
        Ending Time:  08:51:31
       Elapsed Time:  00:00:01
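
The covariance coverage table and the number of missing data patterns in the output above can be understood with a small sketch. Each coverage entry is the proportion of cases in which both variables are observed, and a pattern is a distinct combination of observed and missing variables. The five-case data set below is hypothetical (it is not taken from hsbmis.dat), and the function name is ours.

```python
MISSING = -9999  # missing-value code declared with "Missing are all (-9999)"

def covariance_coverage(rows, i, j):
    """Proportion of cases in which variables i and j are both observed."""
    both = sum(1 for r in rows if r[i] != MISSING and r[j] != MISSING)
    return both / len(rows)

# Hypothetical mini data set: columns are write, female, read
rows = [
    (52, 1, 57),
    (MISSING, 0, 68),
    (59, 1, MISSING),
    (33, MISSING, 44),
    (44, 0, 63),
]

# Lower-triangular coverage table, laid out like the Mplus output
names = ["write", "female", "read"]
for i, row_name in enumerate(names):
    cells = [f"{covariance_coverage(rows, i, j):.3f}" for j in range(i + 1)]
    print(f"{row_name:<8}" + "  ".join(cells))

# Distinct observed/missing patterns across cases
patterns = {tuple(value != MISSING for value in r) for r in rows}
print("Number of patterns:", len(patterns))
```

With Type = meanstructure missing, every case contributes whatever values it has, so the coverage table shows how much data supports each variance and covariance.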

3.0 Censored Regression

Our next example using Mplus will be a multiple regression model with the response variable censored from above at 67. We will keep write as the response variable and read and female as the predictor variables. The Mplus command file is hsbcen.inp.

We will declare Censored are write (a) in the Variable block to indicate the censored nature of the response variable, where (a) specifies censoring from above. We will also add Estimator = MLR to the Analysis block.

Title:
  Analyzing data censored linear regression
Data:
  File is hsb2.dat ;
Variable:
  Names are
     id female race ses schtyp prog read write math science socst;
  Usevariables are
     female read write;
  Censored are write (a) ;
Analysis:
  Type = meanstructure;
  Estimator = MLR;
Model:
   write ON read female;

Here is the censored regression output from Mplus.

INPUT READING TERMINATED NORMALLY

Analyzing data censored linear regression

SUMMARY OF ANALYSIS

Number of groups                                                 1
Number of observations                                         200

Number of dependent variables                                    1
Number of independent variables                                  2
Number of continuous latent variables                            0

Observed dependent variables

  Censored
   WRITE

Observed independent variables
   FEMALE      READ

Estimator                                                      MLR
Information matrix                                        OBSERVED
Optimization Specifications for the Quasi-Newton Algorithm for
Continuous Outcomes
  Maximum number of iterations                                1000
  Convergence criterion                                  0.100D-05
Optimization Specifications for the EM Algorithm
  Maximum number of iterations                                 500
  Convergence criteria
    Loglikelihood change                                 0.100D-02
    Relative loglikelihood change                        0.100D-05
    Derivative                                           0.100D-02
Optimization Specifications for the M step of the EM Algorithm for
Categorical Latent variables
  Number of M step iterations                                    1
  M step convergence criterion                           0.100D-02
  Basis for M step termination                           ITERATION
Optimization Specifications for the M step of the EM Algorithm for
Censored, Binary or Ordered Categorical (Ordinal), Unordered
Categorical (Nominal) and Count Outcomes
  Number of M step iterations                                    1
  M step convergence criterion                           0.100D-02
  Basis for M step termination                           ITERATION
  Maximum value for logit thresholds                            15
  Minimum value for logit thresholds                           -15
  Minimum expected cell size for chi-square              0.100D-01
Optimization algorithm                                         EMA
Integration Specifications
  Type                                                    STANDARD
  Number of integration points                                  15
  Dimensions of numerical integration                            0
  Adaptive quadrature                                           ON
Cholesky                                                       OFF

Input data file(s)
  hsb2.dat

Input data format  FREE

SUMMARY OF CENSORED LIMITS
      WRITE             67.000

THE MODEL ESTIMATION TERMINATED NORMALLY

TESTS OF MODEL FIT

Loglikelihood
          H0 Value                        -663.887

Information Criteria
          Number of Free Parameters              4
          Akaike (AIC)                    1335.775
          Bayesian (BIC)                  1348.968
          Sample-Size Adjusted BIC        1336.296
            (n* = (n + 2) / 24)

MODEL RESULTS
                   Estimates     S.E.  Est./S.E.
 WRITE      ON
    READ               0.584    0.045     12.837
    FEMALE             5.617    1.050      5.350

 Intercepts
    WRITE             19.352    2.570      7.531

 Residual Variances
    WRITE             52.814    4.615     11.443

QUALITY OF NUMERICAL RESULTS
     Condition Number for the Information Matrix              0.243E-05
       (ratio of smallest to largest eigenvalue)

     Beginning Time:  12:30:30
        Ending Time:  12:30:30
       Elapsed Time:  00:00:00
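
To see what censoring from above does to the likelihood, the sketch below uses the textbook censored-normal (tobit) form: an uncensored case contributes the normal log-density, while a case at the limit contributes the log-probability that the underlying score is at or above 67. We assume this corresponds to the model Mplus fits here; the function names and the example student are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tobit_loglik_above(y, mu, sigma, limit):
    """Log-likelihood contribution of one case censored from above at limit."""
    if y >= limit:
        # Censored case: we only know the underlying score is >= limit
        return math.log(1.0 - normal_cdf((limit - mu) / sigma))
    # Uncensored case: ordinary normal log-density
    z = (y - mu) / sigma
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * z * z

# Estimates copied from the censored regression output above
intercept, b_read, b_female, resid_var = 19.352, 0.584, 5.617, 52.814
sigma = math.sqrt(resid_var)

# Hypothetical student: female with a reading score of 60
mu = intercept + b_read * 60 + b_female * 1
p_at_limit = 1.0 - normal_cdf((67.0 - mu) / sigma)
print(round(mu, 3), round(p_at_limit, 3))
```

Here mu is the predicted uncensored writing score and p_at_limit is the model-implied probability that this student's observed score sits at the ceiling of 67.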

4.0 Logistic Regression

In order to demonstrate logistic regression we will need to transform write to a binary response variable using the statement cut write (59) in the Define block. With this statement, write = 0 when write <= 59 and write = 1 when write > 59. Since write is now a zero/one variable, we need to declare it to be categorical in the Variable block.

Title:
  Analyzing data logistic regression
Data:
  File is hsb2.dat ;
Variable:
  Names are
     id female race ses schtyp prog read write math science socst;
  Usevariables are
     female read write;
  Categorical are write;
Define:
  cut write (59);
Analysis:
  Type = logistic;
Model:
  write ON read female;
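
The recoding performed by the cut statement can be sketched outside Mplus as follows. The Python function is our own illustration of the rule stated above (a score equal to the cutpoint falls in the lower category) and is not Mplus code. We assume the same rule applies at every cutpoint when more than one is given.

```python
from bisect import bisect_left

def cut(value, cutpoints):
    """Recode a score into 0, 1, 2, ... by counting the cutpoints it exceeds.

    A score equal to a cutpoint stays in the lower category, matching
    "write = 0 when write <= 59 and write = 1 when write > 59".
    """
    return bisect_left(sorted(cutpoints), value)

# Binary recode used for the logistic regression
print([cut(w, [59]) for w in (35, 59, 60, 67)])   # prints [0, 0, 1, 1]
```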

Here is the logistic regression output from Mplus.

INPUT READING TERMINATED NORMALLY

Analyzing data logistic regression

SUMMARY OF ANALYSIS

Number of groups                                                 1
Number of observations                                         200

Number of dependent variables                                    1
Number of independent variables                                  2
Number of continuous latent variables                            0

Observed dependent variables
 
  Binary and ordered categorical (ordinal)
   WRITE

Observed independent variables
   FEMALE      READ

Estimator                                                      MLR
Maximum number of iterations                                  1000
Convergence criterion                                    0.500D-04
Maximum number of steepest descent iterations                   20

Input data file(s)
  hsb2.dat

Input data format  FREE

SUMMARY OF CATEGORICAL DATA PROPORTIONS

    WRITE
      Category 1    0.735
      Category 2    0.265

TESTS OF MODEL FIT

Loglikelihood
          H0 Value                         -85.444

Information Criteria
          Number of Free Parameters              3
          Akaike (AIC)                     176.887
          Bayesian (BIC)                   186.782
          Sample-Size Adjusted BIC         177.278
            (n* = (n + 2) / 24)

RESULTS FOR LOGISTIC REGRESSION
                                                    Odds       .95 C.I.
                   Estimates     S.E.  Est./S.E.   Ratio    Lower    Upper
 Thresholds
    WRITE$1           -9.603    1.293

 Slopes
    FEMALE             1.121    0.420      2.668    3.068    1.346    6.990
    READ               0.144    0.021      6.867    1.155    1.109    1.204

     Beginning Time:  12:10:59
        Ending Time:  12:10:59
       Elapsed Time:  00:00:00
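
The odds ratios and their confidence limits in the output above are the exponentiated slopes and exponentiated Wald limits. The sketch below reproduces them from the printed estimates and standard errors, assuming a multiplier of 1.96 for the 95% interval; the function name is ours.

```python
import math

def odds_ratio_with_ci(estimate, se, z=1.96):
    """Exponentiate a logit coefficient and its Wald confidence limits."""
    return (math.exp(estimate),
            math.exp(estimate - z * se),
            math.exp(estimate + z * se))

# Slopes and standard errors copied from the logistic regression output
results = {}
for name, est, se in [("FEMALE", 1.121, 0.420), ("READ", 0.144, 0.021)]:
    results[name] = odds_ratio_with_ci(est, se)
    ratio, lower, upper = results[name]
    print(f"{name:<8}{ratio:8.3f}{lower:8.3f}{upper:8.3f}")
```

Small differences in the third decimal place relative to the Mplus output are expected, because the inputs here are already rounded to three decimals.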

5.0 Ordinal Logistic Regression

In this next example we will transform write into a four-category ordinal response variable using the statement cut write (45.5 54 60) in the Define block. These cutpoints split write at approximately each of the quartiles.

Title:
  Analyzing data ordinal logistic regression
Data:
  File is hsb2.dat ;
Variable:
  Names are
     id female race ses schtyp prog read write math science socst;
  Usevariables are
     female read write;
  Categorical are write;
Define:
  cut write (45.5 54 60);
Analysis:
  Type = logistic;
Model:
  write ON read female;

Here is the ordinal logistic regression output from Mplus.

INPUT READING TERMINATED NORMALLY

Analyzing data ordinal logistic regression

SUMMARY OF ANALYSIS

Number of groups                                                 1
Number of observations                                         200

Number of dependent variables                                    1
Number of independent variables                                  2
Number of continuous latent variables                            0

Observed dependent variables
  Binary and ordered categorical (ordinal)
   WRITE

Observed independent variables
   FEMALE      READ

Estimator                                                      MLR
Maximum number of iterations                                  1000
Convergence criterion                                    0.500D-04
Maximum number of steepest descent iterations                   20

Input data file(s)
  hsb2.dat

Input data format  FREE

SUMMARY OF CATEGORICAL DATA PROPORTIONS

    WRITE
      Category 1    0.250
      Category 2    0.285
      Category 3    0.220
      Category 4    0.245

TESTS OF MODEL FIT

Loglikelihood
          H0 Value                        -229.901

Information Criteria
          Number of Free Parameters              5
          Akaike (AIC)                     469.801
          Bayesian (BIC)                   486.293
          Sample-Size Adjusted BIC         470.453
            (n* = (n + 2) / 24)

RESULTS FOR LOGISTIC REGRESSION
                                                    Odds       .95 C.I.
                   Estimates     S.E.  Est./S.E.   Ratio    Lower    Upper
 Thresholds
    WRITE$1            5.990    0.826
    WRITE$2            7.725    0.875
    WRITE$3            9.075    0.921

 Slopes
    FEMALE             1.253    0.292      4.283    3.500    1.973    6.208
    READ               0.131    0.015      8.683    1.139    1.106    1.174

     Beginning Time:  09:57:35
        Ending Time:  09:57:35
       Elapsed Time:  00:00:00
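
The thresholds and slopes above can be turned into predicted category probabilities. The sketch below assumes the cumulative (proportional odds) parameterization logit P(write <= k) = threshold k minus the linear predictor, which is consistent with the positive thresholds and the category proportions reported above. The function names and the example student are hypothetical.

```python
import math

def logistic(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))

def category_probabilities(thresholds, linear_predictor):
    """Category probabilities under a cumulative logit model."""
    # Cumulative probabilities P(y <= k); the last category closes at 1
    cumulative = [logistic(t - linear_predictor) for t in thresholds] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)   # difference of adjacent cumulatives
        previous = c
    return probs

# Estimates copied from the ordinal logistic regression output above
thresholds = [5.990, 7.725, 9.075]
b_female, b_read = 1.253, 0.131

# Hypothetical student: female with a reading score of 50
xb = b_female * 1 + b_read * 50
probs = category_probabilities(thresholds, xb)
print([round(p, 3) for p in probs])
```

The four probabilities sum to one and give the model-implied chance that this student falls in each of the four writing categories.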

These pages are Copyrighted (c) by UCLA Academic Technology Services