SPSS Data Analysis Examples
Ordinal Logistic Regression

Examples

Example 1:  A marketing research firm wants to investigate what factors influence the size of soda (small, medium, large or extra large) that people order at a fast-food chain.  These factors may include what type of sandwich is ordered (burger or chicken), whether or not fries are also ordered, and age of the consumer.  While the outcome variable, size of soda, is obviously ordered, the difference between the various sizes is not consistent.  The differences are 10, 8, 12 ounces, respectively. 

Example 2:  A 5-point Likert scale is used to assess people's opinion about a local ballot measure.  The response options are "strongly disagree", "disagree", "neutral", "agree" and "strongly agree".  Predictor variables will include the measure's author, his/her political party, and how much the measure's proposals will cost.  The researchers have reason to believe that the psychological "distances" between these points are not equal.  For example, the "distance" between "strongly disagree" and "disagree" may be shorter than the distance between "disagree" and "neutral". 

Example 3:  A study looks at factors that influence the decision of whether to apply to graduate school.  College juniors are asked if they are unlikely, somewhat likely, or very likely to apply to graduate school.  Hence, our outcome variable has three categories.  Data on parental educational status, whether the undergraduate institution is public or private, and current GPA are also collected. 

Description of the Data

For our data analysis below, we are going to expand on Example 3 about applying to graduate school.  We have generated hypothetical data, which can be downloaded here.

This hypothetical data set has a three-level variable called apply (coded 0, 1, 2) that we will use as our response (i.e., outcome, dependent) variable.  We also have three variables that we will use as predictors:  pared, which is a 0/1 variable indicating whether at least one parent has a graduate degree; public, which is a 0/1 variable where 1 indicates that the undergraduate institution is a public university and 0 indicates that it is a private university; and gpa, which is the student's grade point average. 

get file "D:\data\ologit.sav".
freq var = apply.

freq var = pared.

freq var = public.

descriptives var = gpa.

Some Strategies You Might Try

Using the Ordinal Logistic Model

Before we run our ordinal logistic model, we will see if any cells are empty or extremely small.  If any are, we may have difficulty running our model.  There are two ways in SPSS that we can do this.  The first way is to make simple crosstabs.  The second way is to use the cellinfo option on the /print subcommand.  You should use the cellinfo option only with categorical predictor variables; the table will be long and difficult to interpret if you include continuous predictors.

crosstabs
/tables = apply by pared.

crosstabs
/tables = apply by public.

plum apply with pared public 
/link = logit
/print = cellinfo.

None of the cells is too small or empty (has no cases), so we will run our model.  In the syntax below, we have included the link = logit subcommand, even though it is the default, just to remind ourselves that we are using the logit link function.  Also note that if you do not include the print subcommand, only the Case Processing Summary table is provided in the output.

plum apply with pared public gpa
/link = logit
/print = parameter summary.

In the output above, we first see a warning about empty cells.  Technically, there are empty cells because of the continuous variable in our model, gpa.  However, we are not worried about this warning message.  We checked for empty cells when we did the crosstabs with the response variable by each of the categorical predictor variables, and those tables looked OK, so we will proceed with the analysis.  In the Case Processing Summary table, we see the number and percentage of cases in each level of our response variable.  These numbers look fine, but we would be concerned if one level had very few cases in it.  We also see that all 400 observations in our data set were used in the analysis.  Fewer observations would have been used if any of our variables had missing values.  By default, SPSS does a listwise deletion of cases with missing values.  Next we see the Model Fitting Information table, which gives the -2 log likelihood for the intercept-only and final models.  The -2 log likelihood can be used in comparisons of nested models, but we won't show an example of that here.

In the Parameter Estimates table we see the coefficients, their standard errors, the Wald test and associated p-values (Sig.), and the 95% confidence interval of the coefficients.  Both pared and gpa are statistically significant; public is not.  So for pared, we would say that for a one unit increase in pared (i.e., going from 0 to 1), we expect a 1.05 increase in the ordered log odds of being in a higher level of apply, given that all of the other variables in the model are held constant.  For gpa, we would say that for a one unit increase in gpa, we would expect a 0.62 increase in the ordered log odds of being in a higher level of apply, given that all of the other variables in the model are held constant.  The thresholds are shown at the top of this output; they indicate where the continuous latent variable is cut to make the three groups that we observe in our data.  In general, the thresholds are not used in the interpretation of the results.  Some statistical packages call the thresholds "cutpoints" (thresholds and cutpoints are the same thing); other packages, such as SAS, report intercepts, which are the negative of the thresholds.  In this example, the intercepts would be -2.203 and -4.299.  For further information, please see the Stata FAQ:  How can I convert Stata's parameterization of ordered probit and logistic models to one in which a constant is estimated?
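
To make the role of the thresholds concrete, the fitted model can be written out as a rough sketch.  The form below subtracts the predictor terms from each threshold, which is consistent with the statement above that the intercepts are the negatives of the thresholds.  The symbols theta and beta are our own notation rather than SPSS output labels, and the student used for illustration (pared = 0, public = 0, gpa = 3) is hypothetical.  The coefficients are the rounded values quoted above, so the resulting probability is only approximate.

```latex
\begin{align*}
% Cumulative logit model with thresholds \theta_j (j = 0, 1 for a three-level outcome)
\operatorname{logit} P(\text{apply} \le j)
  &= \theta_j - \left(\beta_{\text{pared}}\,\text{pared}
     + \beta_{\text{public}}\,\text{public}
     + \beta_{\text{gpa}}\,\text{gpa}\right), \qquad j = 0, 1 \\
% Values taken from the text: thresholds 2.203 and 4.299, rounded coefficients 1.05 and 0.62
\theta_0 &= 2.203, \quad \theta_1 = 4.299, \quad
  \beta_{\text{pared}} \approx 1.05, \quad \beta_{\text{gpa}} \approx 0.62 \\
% Hypothetical student: pared = 0, public = 0, gpa = 3
\operatorname{logit} P(\text{apply} \le 0) &\approx 2.203 - 0.62 \times 3 = 0.343 \\
P(\text{apply} \le 0) &\approx \frac{1}{1 + e^{-0.343}} \approx 0.58
\end{align*}
```

Under these assumptions, a positive coefficient lowers the probability of being at or below a given category, which is the same as raising the odds of being in a higher category.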

As of version 15 of SPSS, you cannot directly obtain the proportional odds ratios from SPSS.  You can either use the SPSS Output Management System (OMS) to capture the parameter estimates and exponentiate them, or you can calculate them by hand.  Please see Ordinal Regression by Marija J. Norusis for examples of how to do this.  The commands for using OMS and calculating the proportional odds ratios are shown below.  For more information on how to use OMS, please see our SPSS FAQ: How can I output my results to a data file in SPSS?  Please note that the single quotes in the square brackets are important, and you will get an error message if they are omitted or unbalanced.

oms select tables
 /destination format = sav outfile = "D:\ologit_results.sav"
 /if commands = ['plum'] subtypes = ['Parameter Estimates'].

plum apply with pared public gpa
/link = logit
/print = parameter.

omsend.

get file "D:\ologit_results.sav".
rename variables Var2 = Predictor_Variables.
* the next command deletes the thresholds from the data set.
select if Var1 = "Location".
exe.
* the command below removes unnecessary variables from the data set.
* transformations cannot be pending when delete variables runs.
* the exe command above is therefore necessary.
delete variables Command_ Subtype_ Label_ Var1.
compute expb = exp(Estimate).
compute Lower_95_CI = exp(LowerBound).
compute Upper_95_CI = exp(UpperBound).
exe.

In the column expb we see the results presented as proportional odds ratios (the coefficient exponentiated).  We have also calculated the lower and upper 95% confidence interval.  We would interpret the proportional odds ratios pretty much as we would odds ratios from a binary logistic regression.  We will ignore the values for apply = 0 and apply = 1, as those are the thresholds and not usually reported in terms of proportional odds ratios.  For pared, we would say that for a one unit increase in pared, i.e., going from 0 to 1, the odds of high apply versus the combined middle and low categories are 2.85 times greater, given that all of the other variables in the model are held constant.  Likewise, the odds of the combined middle and high categories versus low apply are 2.85 times greater, given that all of the other variables in the model are held constant.  For a one unit increase in gpa, the odds of the high category of apply versus the combined low and middle categories are 1.85 times greater, given that the other variables in the model are held constant.  Because of the proportional odds assumption (see below for more explanation), the same increase, 1.85 times, is found for the combined middle and high categories versus low apply.
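
As a quick hand check of the exponentiation, the rounded coefficients quoted earlier (1.05 for pared and 0.62 for gpa) can be exponentiated directly.  This is only a sketch in our own notation.  The results differ slightly from the 2.85 and 1.85 reported in the expb column, presumably because SPSS exponentiates the unrounded estimates.

```latex
\begin{align*}
% Proportional odds ratio = exponentiated coefficient (rounded coefficients from the text)
\text{OR}_{\text{pared}} &= e^{\beta_{\text{pared}}} \approx e^{1.05} \approx 2.86 \\
\text{OR}_{\text{gpa}}   &= e^{\beta_{\text{gpa}}}   \approx e^{0.62} \approx 1.86 \\
% The same ratio applies at both cumulative splits (j = 0 and j = 1) for any predictor x_k
\frac{\operatorname{odds}(\text{apply} > j \mid x_k + 1)}
     {\operatorname{odds}(\text{apply} > j \mid x_k)} &= e^{\beta_k}, \qquad j = 0, 1
\end{align*}
```

The last line shows why a single odds ratio per predictor describes both "middle or high versus low" and "high versus low or middle": the threshold cancels out of the ratio.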

One of the assumptions underlying ordinal logistic (and ordinal probit) regression is that the relationship between each pair of outcome groups is the same.  In other words, ordinal logistic regression assumes that the coefficients that describe the relationship between, say, the lowest versus all higher categories of the response variable are the same as those that describe the relationship between the next lowest category and all higher categories, etc.  This is called the proportional odds assumption or the parallel regression assumption.  Because the relationship between all pairs of groups is the same, there is only one set of coefficients (only one model).  If this were not the case, we would need different models to describe the relationship between each pair of outcome groups.  We need to test the proportional odds assumption, and we can do so with the tparallel option on the print subcommand.  The null hypothesis of this chi-square test is that there is no difference in the coefficients between models, so we "hope" to get a non-significant result. 

plum apply with pared public gpa
/link = logit
/print = tparallel.

The above test indicates that we have not violated the proportional odds assumption. 
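
An informal complement to the test of parallel lines is to fit the two binary logistic regressions implied by the cumulative splits and compare their coefficients by eye.  The sketch below is one way this might be done.  The variable names apply_ge1 and apply_ge2 are hypothetical names introduced here, and the comparison is descriptive only, not a formal test.

```spss
* Informal look at the proportional odds assumption (a sketch only).
* apply_ge1 and apply_ge2 are hypothetical names for the two cumulative splits.
compute apply_ge1 = (apply >= 1).
compute apply_ge2 = (apply >= 2).
exe.

* Binary logit for middle or high apply versus low apply.
logistic regression variables apply_ge1
 /method = enter pared public gpa.

* Binary logit for high apply versus low or middle apply.
logistic regression variables apply_ge2
 /method = enter pared public gpa.
```

If the proportional odds assumption is reasonable, the coefficients for pared, public and gpa should be broadly similar across the two binary models, while the constants are free to differ.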

Sample Write-up of the Analysis

Below is one way of describing the results.

Parental education and grade point average are positively associated with the tendency to apply for graduate school.  For a one unit increase in pared (i.e., having at least one parent with a graduate degree), the ordered log odds of being in a higher category of apply increase by 1.05, holding the other variables constant.  For every one unit increase in gpa, the ordered log odds of being in a higher category of apply increase by 0.62, holding the other variables constant.  There was no statistically significant effect of public on apply.

Cautions, Flies in the Ointment

See Also

SPSS Annotated Output:  Ordinal Logistic Regression
Ordinal Regression by Marija J. Norusis
An Introduction to Categorical Data Analysis by Alan Agresti
Categorical Data Analysis, Second Edition by Alan Agresti
Interpreting Probability Models:  Logit, Probit, and Other Generalized Linear Models by Tim Futing Liao
Statistical Methods for Categorical Data Analysis by Daniel Powers and Yu Xie


These pages are Copyrighted (c) by UCLA Academic Technology Services


The content of this web site should not be construed as an endorsement of any particular web site, book, or software product by the University of California