UCLA Academic Technology Services

Logistic Regression with Stata
Chapter 5 - Ordinal Logistic Regression

NOTE:  This page is under construction!!

So far in this course we have analyzed data in which the response variable has exactly two levels, but what about situations in which there are more than two? In this chapter of Logistic Regression with Stata, we cover the commands used for multinomial and ordered logistic regression, which allow for more than two categories. Multinomial response models have much in common with the logistic regression models that we have covered so far. However, you will find that there are differences in some of the assumptions, in the analyses, and in the interpretation of these models.

4.2 Ordered Logistic Regression

4.2.1 Example 1

Let's begin our discussion of ordered logistic regression with an example that has a binary outcome variable, honcomp, which indicates whether a student is enrolled in an "honors composition" course. We begin with an ordinary logistic regression. Next, we run an ordered logistic regression for the same model using Stata's ologit command. As you can see, the coefficients and the standard errors are the same, except that the sign of _cut1 is reversed from that of _cons. We will explain shortly what _cut1 is, although it is already clear that it is related to the constant found in the logistic regression models.
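The sign reversal follows from how the two models are parameterized. Here is a minimal Python sketch of that relationship, assuming the usual parameterization in which logit models P(y = 1) = F(_cons + xb) and ologit models P(y <= 0) = F(_cut1 - xb). The coefficient values and the predictor values are hypothetical and are not taken from the Stata output.

```python
import math

def logistic(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical values, for illustration only (not from the Stata output).
b = 0.08       # slope, identical in both parameterizations
cons = -5.0    # _cons from the binary logit
cut1 = -cons   # _cut1 from ologit: same magnitude, opposite sign

def p_logit(x):
    # Binary logit: P(y = 1) = F(_cons + b*x)
    return logistic(cons + b * x)

def p_ologit(x):
    # Two-level ordered logit: P(y = 1) = 1 - P(y <= 0) = 1 - F(_cut1 - b*x)
    return 1.0 - logistic(cut1 - b * x)

for x in (40, 55, 70):
    print(x, round(p_logit(x), 6), round(p_ologit(x), 6))
```

The two columns of probabilities agree for every x, which is why the two commands report the same slope and a constant and cut point that differ only in sign.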

4.2.2 Example 2

For our next example we will select ses as the response variable. It has three ordered categories. Here are the frequencies for each of the categories. We can also obtain much of the same information using the codebook command. For a predictor variable we will use academic, a dummy variable indicating whether or not students are in an academic program.

Here is the ordered logistic model predicting ses using academic. The format of these results may seem confusing at first. What isn't clear from the output is that ordered logistic regression is a multi-equation model. In this example, there are two equations, each with the same coefficients; they differ only in their cut points. This is known as the proportional odds model. Other logistic regression models, which do not assume proportional odds, have k-1 equations, each with its own constant and coefficients.

In our example, the results are formatted like a single-equation model when, in fact, this is a two-equation model because there are three levels of ses. In ordered logistic regression, Stata sets the constant to zero and estimates the cut points that separate the levels of the response variable. Other programs may parameterize the model differently by estimating the constant and setting the first cut point to zero. In order to show the multi-equation nature of this model, we will redisplay the results in a different format.
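To make the two-equation structure concrete, here is a small Python sketch of the proportional odds model under Stata's parameterization, P(y <= j) = F(cut_j - xb), with the constant fixed at zero. The coefficient and cut points are hypothetical values chosen for illustration; they are not the estimates from the ologit output.

```python
import math

def logistic(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical estimates, for illustration only.
b = 0.9             # single coefficient for academic, shared by both equations
cuts = [-0.7, 1.6]  # _cut1 and _cut2 (must be increasing)

def category_probs(x):
    """Return [P(low), P(medium), P(high)] for predictor value x."""
    # Cumulative probabilities P(y <= j) = F(cut_j - b*x); the last one is 1.
    cum = [logistic(c - b * x) for c in cuts] + [1.0]
    return [cum[0]] + [cum[j] - cum[j - 1] for j in range(1, len(cum))]

def odds_above(x, cut):
    """Odds of falling above a cut point versus at or below it."""
    p_le = logistic(cut - b * x)
    return (1.0 - p_le) / p_le

for cut in cuts:
    ratio = odds_above(1, cut) / odds_above(0, cut)
    print(f"cut {cut}: odds ratio for academic = {ratio:.4f}")  # exp(b) at both cuts

print([round(p, 4) for p in category_probs(0)])
print([round(p, 4) for p in category_probs(1)])
```

Each cut point defines one equation. Because both equations share the coefficient b, the odds ratio for academic is exp(b) at either cut, which is the proportional odds property.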

Ordered logistic regression can also be approached with methods that do not rely on the proportional odds assumption. The program omodel (available from the Stata website) can be used to test that assumption. You can download omodel from within Stata by typing findit omodel (see How can I use the findit command to search for programs and get additional help? for more information about using findit). These results suggest that the proportional odds approach is reasonable, since the chi-square test is not significant.

If the test of proportionality had been significant, we could have tried the gologit2 program by Richard Williams of the University of Notre Dame, which can likewise be downloaded from within Stata by typing findit gologit2. With the npl option, gologit2 does not assume proportional odds; let's try it just for "fun." These results clearly show the multiple-equation nature of ordered logistic regression, with different constants, coefficients and standard errors.

The gologit2 command provides us with an alternative method for testing the proportionality assumption. If the assumption of proportional odds is tenable, then there should not be a significant difference between the coefficients for academic in the two equations. The test command computes a Wald test across the two equations.

The results of this Wald test of proportionality are very similar to those found using the omodel command.
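For readers who want to see what a Wald test of equal coefficients computes, here is a minimal Python sketch for a single restriction (1 degree of freedom). The function name and all input values are hypothetical; in practice the estimates, variances and covariance come from the fitted gologit2 model, and Stata's test command does this calculation for you.

```python
import math

def wald_equal_coefs(b1, b2, var1, var2, cov12):
    """Wald chi-square (1 df) and p-value for H0: b1 == b2."""
    diff = b1 - b2
    # Variance of the difference between two correlated estimates.
    var_diff = var1 + var2 - 2.0 * cov12
    chi2 = diff * diff / var_diff
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p

# Hypothetical coefficients for academic in the two equations.
chi2, p = wald_equal_coefs(b1=1.00, b2=0.80, var1=0.10, var2=0.12, cov12=0.04)
print(f"chi2(1) = {chi2:.3f}, p = {p:.3f}")
```

A large p-value, as with these made-up numbers, would be consistent with the proportional odds assumption.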

Let's rerun the ologit command followed by the listcoef and fitstat commands.

From the listcoef output, we see that the odds ratio for academic is approximately 2.5, which means that the odds of being in high ses versus medium and low ses combined are 2.5 times greater for students in the academic program. The same odds ratio also applies to the comparison of medium and high ses combined versus low ses.
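A ratio of 2.5 on the odds scale does not mean that the probability is 2.5 times larger. The short Python sketch below converts a baseline probability to odds, applies the ratio, and converts back. The baseline probability of 0.20 is a hypothetical value chosen for illustration, and the function name is ours.

```python
def apply_odds_ratio(p_base, odds_ratio):
    """Probability obtained after multiplying the baseline odds by odds_ratio."""
    odds = p_base / (1.0 - p_base) * odds_ratio
    return odds / (1.0 + odds)

p_base = 0.20                                 # hypothetical P(high ses), non-academic
p_academic = apply_odds_ratio(p_base, 2.5)    # ratio of about 2.5 from the text
print(round(p_academic, 4))
```

With this baseline, the odds rise from 0.25 to 0.625, so the probability rises from 0.20 to roughly 0.38, less than a doubling.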

4.2.3 Example 3

The variable academic that we used in the previous example is a dichotomization of the three-category variable prog (program type). Let's look at the frequencies for each of the levels of prog and create dummy-coded variables at the same time using the tabulate command. Now we can use prog1 and prog3 in an ordered logistic regression so that the academic group will be our comparison group. Individually, prog1 and prog3 are statistically significant, and we can determine from the likelihood ratio chi-square (chi2(2) = 12.06) that they are jointly significant, i.e., that the variable prog is significant.
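The p-value behind the joint test can be checked by hand. For a chi-square with 2 degrees of freedom the upper-tail probability has the closed form exp(-x/2), so a few lines of Python are enough. The function name is ours; the test statistic is the one reported above.

```python
import math

def chi2_sf_2df(x):
    """Upper-tail probability of a chi-square with 2 df: exp(-x / 2)."""
    return math.exp(-x / 2.0)

lr_chi2 = 12.06   # likelihood ratio chi2(2) reported in the text
p = chi2_sf_2df(lr_chi2)
print(f"{p:.4f}")  # prints 0.0024
```

This is well below 0.05, which agrees with the conclusion that prog1 and prog3 are jointly significant.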

We will follow this analysis with the omodel command to check on the proportional odds assumption.

The test of proportionality is not significant, so we can continue looking at the results of the ologit command by following up with listcoef and fitstat. Note that if the ones and zeros were reversed in both prog1 and prog3, then the odds ratio for prog1 would be 1/.3569 = 2.80 and for prog3 would be 1/.4274 = 2.34.
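The reciprocals arise because reversing the 0/1 coding of a dummy variable flips the sign of its coefficient, and exp(-b) = 1/exp(b). A short Python check using the two ratios quoted above (the dictionary and variable names are ours):

```python
import math

# Ratios reported by listcoef in the text.
ratios = {"prog1": 0.3569, "prog3": 0.4274}

for name, ratio in ratios.items():
    b = math.log(ratio)          # coefficient on the log-odds scale
    reversed_ratio = math.exp(-b)  # sign flips when the coding is reversed
    print(f"{name}: b = {b:.4f}, reversed ratio = {reversed_ratio:.2f}")
# prog1 gives 2.80 and prog3 gives 2.34
```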

The fitstat output gives a deviance of 409.11, which is lower than the deviance of 409.33 for the model that used the dichotomous variable academic. This is not a very big change in the deviance. If you look at the AIC, you will see that the value for the current model (2.086) is actually larger than for the model with academic (2.077). Again, this is a very small change, which suggests that the three-category predictor, prog, is not really any better than the dichotomous predictor academic.
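The deviance and AIC figures can be related with a little arithmetic. The Python sketch below rests on two assumptions that are not stated in the text: that fitstat reports AIC per observation as (deviance + 2k)/N, where k counts the coefficients plus the two cut points, and that the sample has N = 200 observations. Under those assumptions the reported AIC values are reproduced. The sketch also compares the two deviances, treating the academic model as nested in the prog model with one restriction (equal coefficients for prog1 and prog3).

```python
import math

N = 200  # assumed sample size; not stated in the text

def aic_per_obs(deviance, n_params, n_obs):
    """AIC divided by N, assuming the form (deviance + 2k) / N."""
    return (deviance + 2 * n_params) / n_obs

# academic model: 1 coefficient + 2 cut points = 3 parameters
aic_academic = aic_per_obs(409.33, 3, N)
# prog model: 2 coefficients (prog1, prog3) + 2 cut points = 4 parameters
aic_prog = aic_per_obs(409.11, 4, N)
print(round(aic_academic, 3), round(aic_prog, 3))

# Difference in deviance as a chi-square with 1 df.
# Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
lr = 409.33 - 409.11
p = math.erfc(math.sqrt(lr / 2.0))
print(f"LR chi2(1) = {lr:.2f}, p = {p:.2f}")
```

The extra parameter costs more in the AIC penalty than it gains in deviance, and the deviance difference is far from significant, which matches the conclusion above.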

4.2.4 Example 4

Next we will look at a model that has both categorical and continuous predictor variables and their interaction. We can tell from the tests of the individual coefficients that the interaction term is not significant, but let's run a likelihood ratio test anyway, just to confirm what we already know. Now we see that both math and academic are significant.

However, the coefficient for math is for a one-point change in the math test score, which is not very meaningful. Let's create a new variable, math10, which is the math test score divided by ten. A change of ten points on the math test will be more meaningful than a one-point change. The ologit command will be followed by listcoef and fitstat. From the listcoef results we see that for every ten-point increase in math, the odds of being in high ses versus medium and low ses combined are about 1.5 times greater. The same is true for the odds of medium and high ses combined versus low ses. The odds ratio for math10 is smaller than that for academic: the odds are about 1.8 times greater for students in the academic program.
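Rescaling a predictor changes the coefficient but not the model. Dividing math by ten multiplies its coefficient by ten, so the ratio for a ten-point change equals the one-point ratio raised to the tenth power. A brief Python illustration, taking the approximate value of 1.5 from the text as given (variable names are ours):

```python
import math

ratio_math10 = 1.5                 # approximate ratio for math10 from the text
b_math10 = math.log(ratio_math10)  # coefficient for a ten-point change
b_math = b_math10 / 10.0           # implied coefficient for a one-point change
ratio_math = math.exp(b_math)      # implied ratio for a one-point change

print(f"one-point ratio: {ratio_math:.4f}")
print(f"ten-point ratio: {ratio_math ** 10:.4f}")
```

The implied one-point ratio is only about 1.04, which shows why the per-point coefficient is hard to interpret on its own.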

From the fitstat results we can see that the deviance has dropped to 401.4 and the AIC is down to 2.05, both of which indicate that this model fits better than the model without math.


These pages are Copyrighted (c) by UCLA Academic Technology Services


The content of this web site should not be construed as an endorsement of any particular web site, book, or software product by the University of California.