Mplus Data Analysis Examples
Ordinal Logistic Regression

NOTE: This example was done using Mplus version 4.21. The syntax may not work with earlier versions of Mplus.

Examples

Example 1: A marketing research firm wants to investigate what factors influence the size of soda (small, medium, large or extra large) that people order at a fast-food chain.  These factors may include what type of sandwich is ordered (burger or chicken), whether or not fries are also ordered, and age of the consumer.  While the outcome variable, size of soda, is obviously ordered, the difference between the various sizes is not consistent.  The differences are 10, 8, 12 ounces, respectively.

Example 2:  A 5-point Likert scale is used to assess people's opinion about a local ballot measure.  The response options are "strongly disagree", "disagree", "neutral", "agree" and "strongly agree".  Predictor variables will include the measure's author, his/her political party, and how much the measure's proposals will cost.  The researchers have reason to believe that the psychological "distances" between these points are not equal.  For example, the "distance" between "strongly disagree" and "disagree" may be shorter than the distance between "disagree" and "neutral". 

Example 3:  A study looks at factors that influence the decision of whether to apply to graduate school.  College juniors are asked if they are unlikely, somewhat likely, or very likely to apply to graduate school.  Hence, our outcome variable has three categories.  Data on parental educational status, whether the undergraduate institution is public or private, and current GPA are also collected. 

Description of the Data

For our data analysis below, we are going to expand on Example 3 about applying to graduate school.  We have generated hypothetical data, which can be obtained here.

This hypothetical data set has a three-level variable called apply (coded 0, 1, 2) that we will use as our response (i.e., outcome, dependent) variable. We also have three variables that we will use as predictors: pared, which is a 0/1 variable indicating whether at least one parent has a graduate degree; public, which is a 0/1 variable where 1 indicates that the undergraduate institution is a public university and 0 indicates that it is a private university; and gpa, which is the student's grade point average. Let's start with some descriptive statistics for the variables of interest.

  Title: Ordinal logistic regression in Mplus;
  Data:
    File is D:\documents\ologit in Mplus DAE\ologit.dat ;
  Variable:
    Names are apply pared public gpa;
      categorical are apply;
  Analysis:
    type = basic;
  Plot:
    type = plot1;

For this output only, we will display all of the information in the output. You will want to look at this carefully to be sure that the data were read into Mplus correctly. In particular, make sure that you have the correct number of observations, and that the categorical and continuous variables have been correctly specified. We have not used a missing statement because we have no missing data in this data set. If any of our variables had missing data, we would have specified "missing = #" in the variable statement, where # is the numeric value given to missing values (e.g. -9999). Below the output are histograms for each of our four variables. These were produced using the plotting function in Mplus. To be able to do this, we included the plot statement and specified "type = plot1", which tells Mplus to create the auxiliary files necessary for the plotting function.

INPUT READING TERMINATED NORMALLY

Ordinal logistic regression in Mplus;

SUMMARY OF ANALYSIS

Number of groups                                                 1
Number of observations                                         400

Number of dependent variables                                    4
Number of independent variables                                  0
Number of continuous latent variables                            0

Observed dependent variables

  Continuous
   PARED       PUBLIC      GPA

  Binary and ordered categorical (ordinal)
   APPLY


Estimator                                                    WLSMV
Maximum number of iterations                                  1000
Convergence criterion                                    0.500D-04
Maximum number of steepest descent iterations                   20
Parameterization                                             DELTA

Input data file(s)
  D:\documents\ologit in Mplus DAE\ologit.dat

Input data format  FREE


SUMMARY OF CATEGORICAL DATA PROPORTIONS

    APPLY
      Category 1    0.550
      Category 2    0.350
      Category 3    0.100


RESULTS FOR BASIC ANALYSIS


     ESTIMATED SAMPLE STATISTICS


           MEANS/INTERCEPTS/THRESHOLDS
              APPLY$1       APPLY$2       PARED         PUBLIC        GPA
              ________      ________      ________      ________      ________
      1         0.126         1.282         0.157         0.143         2.999


           CORRELATION MATRIX (WITH VARIANCES ON THE DIAGONAL)
              APPLY         PARED         PUBLIC        GPA
              ________      ________      ________      ________
 APPLY
 PARED          0.234         0.133
 PUBLIC         0.052         0.079         0.122
 GPA            0.179         0.186         0.227         0.158


     STANDARD ERRORS FOR ESTIMATED SAMPLE STATISTICS


           S.E. FOR MEANS/INTERCEPTS/THRESHOLDS
              APPLY$1       APPLY$2       PARED         PUBLIC        GPA
              ________      ________      ________      ________      ________
      1         0.063         0.085     16970.182     17629.901         0.020


           S.E. FOR CORRELATION MATRIX (WITH VARIANCES ON THE DIAGONAL)
              APPLY         PARED         PUBLIC        GPA
              ________      ________      ________      ________
 APPLY
 PARED          0.053      6574.693
 PUBLIC         0.054         0.044      6025.901
 GPA            0.060         0.047         0.046         0.013
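
As a quick check on how to read the output above, the two sample thresholds for apply (0.126 and 1.282) appear to be the standard normal quantiles of the cumulative category proportions (0.550 and 0.900). The following is a minimal Python sketch of that arithmetic, not part of the Mplus run; the function name probit_thresholds is our own, and the correspondence assumes a probit link with no covariates.

```python
from statistics import NormalDist

# Category proportions for apply, copied from the
# SUMMARY OF CATEGORICAL DATA PROPORTIONS output above.
proportions = [0.550, 0.350, 0.100]


def probit_thresholds(props):
    """Return the standard normal quantiles of the cumulative proportions.

    Assumption: a probit link with no covariates, so each threshold is
    simply the z-value that cuts off the cumulative share of cases.
    """
    thresholds = []
    cumulative = 0.0
    for p in props[:-1]:  # k categories give k - 1 thresholds
        cumulative += p
        thresholds.append(NormalDist().inv_cdf(cumulative))
    return thresholds


thresholds = probit_thresholds(proportions)
print([round(t, 3) for t in thresholds])  # compare with APPLY$1 and APPLY$2
```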

Some Strategies You Might Try

Using the Ordinal Logistic Model

Before we run our ordinal logistic model, we will see if any cells (created by crossing each categorical predictor with our response variable) are empty or extremely small.  If any are, we may have difficulty running our model. We cannot do this in Mplus, so the tables below come from Stata. You can use whatever statistics package you prefer to do this.

           |         pared
     apply |         0          1 |     Total
-----------+----------------------+----------
         0 |       200         20 |       220 
         1 |       110         30 |       140 
         2 |        27         13 |        40 
-----------+----------------------+----------
     Total |       337         63 |       400 


           |        public
     apply |         0          1 |     Total
-----------+----------------------+----------
         0 |       189         31 |       220 
         1 |       124         16 |       140 
         2 |        30         10 |        40 
-----------+----------------------+----------
     Total |       343         57 |       400 
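
The same cell check can be done in any language. Below is a minimal Python sketch using only the standard library. It rebuilds case-level data from the apply-by-pared counts in the first table, so it does not read the original data file. The helper names crosstab and sparse_cells and the cutoff of 5 cases are our own illustrative choices, not a rule from Mplus or Stata.

```python
from collections import Counter


def crosstab(rows, cols):
    """Count how many cases fall in each (row value, column value) cell."""
    return Counter(zip(rows, cols))


def sparse_cells(table, row_levels, col_levels, min_count=5):
    """List cells whose count is below min_count, including empty cells.

    min_count=5 is an arbitrary illustrative cutoff (an assumption).
    """
    return [(r, c, table[(r, c)])
            for r in row_levels
            for c in col_levels
            if table[(r, c)] < min_count]


# Rebuild case-level data from the apply-by-pared counts shown above:
# keys are (apply, pared), values are the number of cases.
counts = {(0, 0): 200, (0, 1): 20,
          (1, 0): 110, (1, 1): 30,
          (2, 0): 27, (2, 1): 13}
apply_values, pared_values = [], []
for (a, p), n in counts.items():
    apply_values += [a] * n
    pared_values += [p] * n

table = crosstab(apply_values, pared_values)
flagged = sparse_cells(table, [0, 1, 2], [0, 1])
print(flagged)  # an empty list means no cell is below the cutoff
```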

None of the cells is too small or empty (has no cases), so we will run our model in Mplus. The syntax below contains our model. Under analysis we have specified "estimator = ml". Had we not specified that the estimator should be ml, Mplus would have performed a probit regression using weighted least squares. Specifying "estimator = ml" instructs Mplus to estimate an ordinal logit model and to estimate it using maximum likelihood.

  Title: Ordinal logistic regression in Mplus,
   Descriptive statistics;
  Data:
    File is D:\documents\ologit in Mplus DAE\ologit.dat ;
  Variable:
    Names are
       apply pared public gpa;
      categorical are apply;
      ! pared and public are taken out of the
      ! categorical statement because they are
      ! not dependent variables
  Analysis:
    Type = general ;
    estimator = ml;
  Model:
      apply on pared public gpa;

output omitted

SUMMARY OF ANALYSIS

Number of groups                                                 1
Number of observations                                         400

output omitted

THE MODEL ESTIMATION TERMINATED NORMALLY

TESTS OF MODEL FIT

Loglikelihood

          H0 Value                        -358.512

Information Criteria

          Number of Free Parameters              5
          Akaike (AIC)                     727.025
          Bayesian (BIC)                   746.982
          Sample-Size Adjusted BIC         731.117
            (n* = (n + 2) / 24)


MODEL RESULTS

                   Estimates     S.E.  Est./S.E.

 APPLY      ON
    PARED              1.048    0.266      3.942
    PUBLIC            -0.059    0.298     -0.197
    GPA                0.616    0.261      2.363

 Thresholds
    APPLY$1            2.203    0.780      2.826
    APPLY$2            4.299    0.804      5.345


LOGISTIC REGRESSION ODDS RATIO RESULTS

 APPLY      ON
    PARED              2.851
    PUBLIC             0.943
    GPA                1.851

At the top of the output we see that all 400 observations in our data set were used in the analysis. As discussed above, if any of our variables had missing data we would have needed to specify "missing = #" in the variable statement, where # is the numeric value given to missing values (e.g. -9999). By default Mplus will exclude cases with missing values on any of the variables in our analysis, and hence missing data will result in fewer observations being used. In Mplus there are other (good) options for handling missing data. We won't discuss them here, except to say that they are available, and are one of the reasons one might consider running this sort of analysis in Mplus (since many other packages can be used to run an ordinal logistic regression model). The next thing we see in the (abridged) output is the final log likelihood (-358.512). It can be used in comparisons of nested models, but we won't show an example of that here. Under the heading "Information Criteria" we see the Akaike and Bayesian information criterion values. Both the AIC and the BIC are measures of fit with some correction for the complexity of the model, but the BIC has a stronger correction for parsimony. In both cases, lower values indicate better fit of the model.
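
To make the information criteria less of a black box, here is a minimal Python sketch that recomputes them from the log likelihood, the number of free parameters, and the sample size reported in the output. It uses the usual definitions (AIC = -2LL + 2k, BIC = -2LL + k ln n) and, for the sample-size adjusted BIC, the n* = (n + 2) / 24 shown in the output. Small differences in the last digit are expected because the printed log likelihood is rounded.

```python
import math

# Values copied from the TESTS OF MODEL FIT output above.
log_likelihood = -358.512
n_params = 5   # three slopes plus two thresholds
n_obs = 400

deviance = -2.0 * log_likelihood

# AIC penalizes each free parameter by 2.
aic = deviance + 2 * n_params

# BIC penalizes each free parameter by ln(n), a stronger penalty once n > 7.
bic = deviance + n_params * math.log(n_obs)

# Sample-size adjusted BIC replaces n with n* = (n + 2) / 24.
n_star = (n_obs + 2) / 24
adjusted_bic = deviance + n_params * math.log(n_star)

print(round(aic, 3), round(bic, 3), round(adjusted_bic, 3))
```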

Under the heading "MODEL RESULTS" we see the coefficients, their standard errors, and the z-test (Est./S.E.). If |z| (|est./s.e.|) is greater than 1.96, the coefficient is statistically significant at the 0.05 level (for a two-tailed test); for a p-value of less than 0.01, |z| > 2.58 is necessary.  Both pared and gpa are statistically significant; public is not.  The estimates in the output are given in units of ordered logits, or ordered log odds.  So for pared, we would say that for a one unit increase in pared (i.e., going from 0 to 1), we expect a 1.048 increase in the log odds of moving from a given level of apply to any higher category, given all of the other variables in the model are held constant.  For gpa, we would say that for a one unit increase in gpa, we would expect a 0.616 increase in the expected value of apply in the log odds scale, given that all of the other variables in the model are held constant.  Below the table of coefficients are the Thresholds. The thresholds shown at the bottom of the output indicate where the latent variable is cut to make the three groups that we observe in our data.  Note that this latent variable is continuous.  In general, these are not used in the interpretation of the results.  Note that different statistics packages use different formulations for thresholds, and that some packages label these as cutpoints rather than thresholds.  Mplus produces thresholds which match the cutpoints Stata produces. For further information on the different formulations, and for help in converting from one to the other, please see the Mplus technical appendices or the Stata FAQ:  How can I convert Stata's parameterization of ordered probit and logistic models to one in which a constant is estimated?
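
The z-tests can be reproduced by hand. The minimal Python sketch below divides each estimate by its standard error, converts the result to a two-tailed p-value using the standard normal distribution, and computes the two-tailed critical values for the 0.05 and 0.01 levels (about 1.96 and 2.58). The z-values differ slightly from the Est./S.E. column because the printed estimates and standard errors are rounded.

```python
from statistics import NormalDist

# (estimate, standard error) pairs copied from MODEL RESULTS above.
estimates = {
    "PARED": (1.048, 0.266),
    "PUBLIC": (-0.059, 0.298),
    "GPA": (0.616, 0.261),
}


def two_sided_p(z):
    """Two-tailed p-value for a z statistic under the standard normal."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


for name, (est, se) in estimates.items():
    z = est / se
    print(name, round(z, 3), round(two_sided_p(z), 4))

# Two-tailed critical values: the |z| needed for p < 0.05 and p < 0.01.
crit_05 = NormalDist().inv_cdf(1 - 0.05 / 2)
crit_01 = NormalDist().inv_cdf(1 - 0.01 / 2)
print(round(crit_05, 2), round(crit_01, 2))
```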

Finally, we see the results in terms of proportional odds ratios. We would interpret these pretty much as we would odds ratios from a binary logistic regression.  For pared, we would say that for a one unit increase in pared, i.e., going from 0 to 1, the odds of high apply versus the combined middle and low categories are 2.85 times greater, given that all of the other variables in the model are held constant.  Likewise, the odds of the combined middle and high categories versus low apply are 2.85 times greater, given that all of the other variables in the model are held constant.  For a one unit increase in gpa, the odds of the high category of apply versus the combined low and middle categories are 1.85 times greater, given that the other variables in the model are held constant.  Because of the proportional odds assumption (see below for more explanation), the same increase, 1.85 times, is found for the odds of the combined middle and high categories versus low apply.
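
The odds ratios are just the exponentiated coefficients, and the "same ratio at every cut" property can be verified numerically. The Python sketch below is illustrative only. It assumes the parameterization logit[P(apply <= j)] = threshold_j - xb, which is consistent with the statement above that the Mplus thresholds match Stata's cutpoints. The helper odds_above and the choice of gpa = 3.0 are ours.

```python
import math

# Coefficients and thresholds copied from MODEL RESULTS above.
coefs = {"pared": 1.048, "public": -0.059, "gpa": 0.616}
thresholds = [2.203, 4.299]

# Odds ratios are exp(coefficient); tiny differences from the Mplus
# output arise because the printed coefficients are rounded.
odds_ratios = {name: math.exp(b) for name, b in coefs.items()}
print({name: round(v, 3) for name, v in odds_ratios.items()})


def odds_above(threshold, xb):
    """Odds of apply exceeding a threshold: P(y > j) / P(y <= j).

    Assumes logit[P(y <= j)] = threshold - xb, so the odds of being
    above the threshold are exp(xb - threshold).
    """
    return math.exp(xb - threshold)


# Compare pared = 1 with pared = 0 at gpa = 3.0 and public = 0
# (an arbitrary illustrative profile).
xb_pared0 = coefs["gpa"] * 3.0
xb_pared1 = xb_pared0 + coefs["pared"]
ratios = [odds_above(t, xb_pared1) / odds_above(t, xb_pared0)
          for t in thresholds]
print([round(r, 3) for r in ratios])  # the same ratio at both thresholds
```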

One of the assumptions underlying ordinal logistic (and ordinal probit) regression is that the relationship between each pair of outcome groups is the same.  In other words, ordinal logistic regression assumes that the coefficients that describe the relationship between, say, the lowest versus all higher categories of the response variable are the same as those that describe the relationship between the next lowest category and all higher categories, etc.  This is called the proportional odds assumption or the parallel regression assumption.  Because the relationship between all pairs of groups is the same, there is only one set of coefficients (only one model).  If this were not the case, we would need different models to describe the relationship between each pair of outcome groups. Mplus does not have a formal test for the proportional odds assumption. One way to assess whether the proportional odds assumption is reasonable is to turn your ordered dependent variable into a series of binary variables that are equal to one if y is greater than or equal to a given value, and zero otherwise. You will need k-1 of these binary variables, where k is the number of values your dependent variable takes on. You will then want to perform a series of binary logistic regression analyses, using each of these new variables as the outcome. If the proportional odds assumption is reasonable, the coefficients should be similar across each of these binary logistic regression models.
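
A simplified version of this check can be sketched in Python using only the apply-by-pared counts from the Stata table earlier on this page. With a single 0/1 predictor, the slope from a binary logistic regression equals the log of the odds ratio in the corresponding 2x2 table, so no fitting routine is needed. Note the simplification: these slopes are unadjusted for public and gpa, so this is an illustration of the idea rather than the full set of regressions described above. The function name log_odds_ratio_at_cut is our own.

```python
import math

# apply-by-pared counts from the Stata table earlier on this page:
# counts[apply level] = (cases with pared = 0, cases with pared = 1)
counts = {0: (200, 20), 1: (110, 30), 2: (27, 13)}


def log_odds_ratio_at_cut(counts, cut):
    """Log odds ratio of (apply >= cut) for pared = 1 versus pared = 0.

    This equals the slope of a binary logistic regression of the
    indicator (apply >= cut) on pared alone.
    """
    at_or_above = [0, 0]  # cases with apply >= cut, indexed by pared
    below = [0, 0]        # cases with apply < cut, indexed by pared
    for level, by_pared in counts.items():
        target = at_or_above if level >= cut else below
        for pared, n in enumerate(by_pared):
            target[pared] += n
    odds_pared0 = at_or_above[0] / below[0]
    odds_pared1 = at_or_above[1] / below[1]
    return math.log(odds_pared1 / odds_pared0)


# k = 3 outcome values give k - 1 = 2 binary outcomes:
# apply >= 1 and apply >= 2.
cuts = sorted(counts)[1:]
slopes = {cut: log_odds_ratio_at_cut(counts, cut) for cut in cuts}
print({cut: round(b, 3) for cut, b in slopes.items()})
```

If the two slopes are close to each other, that is informal evidence in favor of the proportional odds assumption for pared.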

Sample Write-up of the Analysis

The table of the results below is one example of a table that might be more appropriate for publication. 

                  b
                (s.e.)
apply           
pared           1.05***
                (0.27)
public          -0.06
                (0.30)
gpa             0.62*
                (0.26)
           
Threshold 1      2.20**
                (0.78)
Threshold 2      4.30***
                (0.80)
log_likelihood  -358.51

Below is one way of describing the results.

Parental education and grade point average are positively associated with the tendency to apply for graduate school.  For a one unit increase in pared (i.e., going from 0 to 1), the ordered log odds of being in a higher category of apply increase by 1.05.  For every one unit increase in gpa, the ordered log odds of being in a higher category of apply increase by 0.62.  There was no statistically significant effect of public on apply.

Similarities and differences between logit and probit models

Neither the ordinal logistic model nor the ordinal probit model is linear.  To make the model linear, a transformation is done on the dependent variable.  In logistic regression (including binary, ordinal and multinomial logistic models), the transformation is the logit function, which is the natural log of the odds.  In probit models (including binary, ordinal and multinomial probit models), the function used is the inverse of the standard normal cumulative distribution (a.k.a. a z-score).  In reality, this difference isn't too important:  both transformations are equally good at linearizing the model; which one you use is a matter of personal preference.  Both models are typically estimated by maximum likelihood, and so require more cases than a similar OLS model.  Unlike logistic models, you don't get odds ratios with probit models, but you can get predicted probabilities from either type of model (note: Mplus will not calculate predicted probabilities regardless of the type of model).
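
Predicted probabilities for the ordinal logit model can be computed by hand from the estimates reported earlier. The following minimal Python sketch assumes the parameterization P(apply <= j) = logistic(threshold_j - xb), consistent with the thresholds matching Stata's cutpoints. The function names and the example profile (gpa = 3.0, public = 0) are our own illustrative choices.

```python
import math

# Coefficients and thresholds copied from the MODEL RESULTS output.
coefs = {"pared": 1.048, "public": -0.059, "gpa": 0.616}
thresholds = [2.203, 4.299]


def logistic(x):
    """Inverse of the logit function."""
    return 1.0 / (1.0 + math.exp(-x))


def predicted_probabilities(x, coefs, thresholds):
    """Return P(apply = 0), P(apply = 1), P(apply = 2) for predictor values x.

    Assumes P(apply <= j) = logistic(threshold_j - xb); category
    probabilities are differences of adjacent cumulative probabilities.
    """
    xb = sum(coefs[name] * value for name, value in x.items())
    cumulative = [logistic(t - xb) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs


# Two hypothetical students who differ only in parental education.
no_grad_parent = predicted_probabilities(
    {"pared": 0, "public": 0, "gpa": 3.0}, coefs, thresholds)
grad_parent = predicted_probabilities(
    {"pared": 1, "public": 0, "gpa": 3.0}, coefs, thresholds)
print([round(p, 3) for p in no_grad_parent])
print([round(p, 3) for p in grad_parent])
```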

Cautions, Flies in the Ointment

See Also


These pages are Copyrighted (c) by UCLA Academic Technology Services