UCLA Academic Technology Services

SAS Data Analysis Examples
Probit Regression

Examples

Example 1:  Suppose that we are interested in factors that influence whether or not a political candidate wins an election.  Our outcome variable has only two possible values:  win or not win.  We believe that factors such as the amount of money spent on the campaign, the amount of time spent campaigning negatively and whether the candidate is an incumbent affect whether the candidate wins the election.  Because our outcome variable is binary (either the candidate wins or does not win), we need to use a model that handles this feature correctly. 

Example 2:  Some people have heart attacks and others don't.  We would like to see if exercise, age and gender influence whether or not someone has a heart attack.  Again, we have a binary outcome:  have heart attack or not. 

Example 3:  Many undergraduates wish to continue their education in graduate school.  In their application to any given graduate program, they include their GRE scores and their GPA from their undergraduate institution.  Some students are graduating from very prestigious institutions, while others are graduating from not-so-prestigious institutions.  Many months after sending in their applications, students receive either a thick or a thin envelope from the graduate program to which they applied:  some were admitted and others were not.

Description of the Data

For our data analysis below, we are going to expand on our third example about getting into graduate school.  We have generated hypothetical data, available as the file probit.sas7bdat.  You can store this file anywhere you like, but our examples will assume it has been stored in c:\data.

This hypothetical data set has a 0/1 variable called admit that we will use as our response (i.e., outcome, dependent) variable.  We also have three variables that we will use as predictors:  gre, which is the student's Graduate Record Exam score; gpa, which is the student's grade point average; and topnotch, which is a 0/1 variable where 1 indicates that the undergraduate institution was "top notch" and 0 indicates that it was not. 
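
The examples on this page pass the quoted file path directly to data=. An equivalent approach, sketched here on the assumption that probit.sas7bdat sits in c:\data, assigns a libref first and then lists the variables to confirm the file opened. The libref name mydata is hypothetical and chosen only for this sketch.

```sas
/* Assumption: probit.sas7bdat was saved in c:\data, as in the text. */
/* "mydata" is a hypothetical libref used only in this sketch.       */
libname mydata "c:\data";

/* List the variables and their types to confirm the file is readable. */
proc contents data=mydata.probit;
run;
```

With the libref assigned, data=mydata.probit could be used wherever the examples use data="c:\data\probit".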

proc means data="c:\data\probit";
  var gre gpa;
run;

The MEANS Procedure

Variable      N            Mean         Std Dev         Minimum         Maximum
-------------------------------------------------------------------------------
GRE         400     587.7000000     115.5165364     220.0000000     800.0000000
GPA         400       3.3899000       0.3805668       2.2600000       4.0000000
-------------------------------------------------------------------------------

proc freq data="c:\data\probit";
  tables topnotch;
run;

The FREQ Procedure

                                     Cumulative    Cumulative
TOPNOTCH    Frequency     Percent     Frequency      Percent
-------------------------------------------------------------
       0         335       83.75           335        83.75
       1          65       16.25           400       100.00

Some Strategies You Might Try

Using the Probit Model

Before we run our probit model, we will see if any cells (created by the crosstab of our categorical and response variables) are empty or particularly small.  If any are, we may have difficulty running our model. 

proc freq data="c:\data\probit";
  tables admit*topnotch / norow nocol nopercent;
run;

The FREQ Procedure

Table of ADMIT by TOPNOTCH

ADMIT     TOPNOTCH

Frequency|       0|       1|  Total
---------+--------+--------+
       0 |    238 |     35 |    273
---------+--------+--------+
       1 |     97 |     30 |    127
---------+--------+--------+
Total         335       65      400

None of the cells is empty (has no cases) or too small, so we will run our model. There are two ways to run a probit model in SAS: one is proc probit, and the other is proc logistic with link=probit on the model statement. The advantage of running the model using proc logistic is that it is easier to specify the ordering of the categories than it is in proc probit. The advantage of using proc probit is that it will produce graphs that will help you interpret and explain your models.

Running a Probit Model Using PROC LOGISTIC

The code below tells SAS that we want to run proc logistic on a dataset, and that we want to model the response levels in descending order. If we did not do this, SAS would model the probability of being a 0 on admit versus a 1 on admit. Our conclusions would be the same, but the interpretation would be a bit different, since all the coefficients would be in terms of the probability of not being admitted to graduate school (i.e., having a zero on admit). The model statement tells SAS that our dependent variable is admit and our independent variables are gre, gpa, and topnotch. The option link=probit tells SAS that instead of running a logistic regression, we would like to run a probit regression.


proc logistic data="c:\data\probit" descending;
model admit = gre gpa topnotch / link=probit;
run;

The LOGISTIC Procedure

                    Model Information

Data Set                     c:\data\probit        Written by SAS
Response Variable            ADMIT
Number of Response Levels    2
Model                        binary probit
Optimization Technique       Fisher's scoring


Number of Observations Read       400
Number of Observations Used       400

          Response Profile
Ordered                      Total
  Value        ADMIT     Frequency
      1            1           127
      2            0           273
      
Probability modeled is ADMIT=1. 

This output tells us the file being analyzed and the number of observations used. We see that all 400 observations in our data set were used in the analysis (fewer observations would have been used if any of our variables had missing values). We also see that SAS is modeling admit as a binary probit and is modeling admit being 1 (if we had omitted the descending option, SAS would model admit being 0).
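
As an alternative to the descending option, the event level can be named directly on the model statement. The sketch below assumes a SAS release whose proc logistic supports the event= response option; it is meant to fit the same model as the code above, with the probability of admit=1 being modeled.

```sas
/* Name the modeled level explicitly instead of using descending. */
/* Assumption: this SAS release supports the event= response option. */
proc logistic data="c:\data\probit";
  model admit(event='1') = gre gpa topnotch / link=probit;
run;
```

Either form should produce a Response Profile note stating that the probability modeled is ADMIT=1.
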
        Model Convergence Status
                         
Convergence criterion (GCONV=1E-8) satisfied.


               Model Fit Statistics
                             Intercept
              Intercept            and
Criterion          Only     Covariates

AIC             501.977        485.887
SC              505.968        501.853
-2 Log L        499.977        477.887

               Testing Global Null Hypothesis: BETA=0

Test                 Chi-Square       DF     Pr > ChiSq

Likelihood Ratio        22.0897        3         <.0001
Score                   21.5235        3         <.0001
Wald                    21.5263        3         <.0001

This output describes and tests the overall fit of the model. The -2 Log L values (499.977 for the intercept-only model and 477.887 for the model with covariates) can be used in comparisons of nested models, but we won't show an example of that here. The likelihood ratio chi-square of 22.0897 with a p-value of less than 0.0001 tells us that our model as a whole fits significantly better than an empty model.
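
The likelihood ratio chi-square in the table is the difference between the two -2 Log L values in the Model Fit Statistics table, with degrees of freedom equal to the number of predictors. The data step below is a small check of that arithmetic using the printed values; the variable names are made up for this sketch.

```sas
data _null_;
  neg2logl_null = 499.977;  /* -2 Log L, intercept only           */
  neg2logl_full = 477.887;  /* -2 Log L, intercept and covariates */
  df            = 3;        /* gre, gpa, topnotch                 */

  /* Likelihood ratio statistic: drop in -2 Log L when predictors are added */
  lr_chisq = neg2logl_null - neg2logl_full;

  /* Upper-tail probability of a chi-square with df degrees of freedom */
  p_value = sdf('chisquare', lr_chisq, df);

  put lr_chisq= 8.3 p_value= pvalue6.4;
run;
```

The difference is 22.090, which matches the Likelihood Ratio line of the table up to rounding.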

The LOGISTIC Procedure

              Analysis of Maximum Likelihood Estimates

                                             Standard          Wald
Parameter    DF    Estimate       Error    Chi-Square    Pr > ChiSq

Intercept     1     -2.7978      0.6476       18.6630        <.0001
GRE           1     0.00152    0.000640        5.6661        0.0173
GPA           1      0.4010      0.1948        4.2370        0.0396
TOPNOTCH      1      0.2730      0.1803        2.2923        0.1300

The above table gives the degrees of freedom, the coefficients (estimate), their standard errors (error), the Wald chi-square value, and the p-value associated with the chi-square value. The coefficients for gre and gpa are statistically significant; the coefficient for topnotch is not. A discussion of the interpretation of the coefficients can be found in the sample write-up section below.
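
Each Wald chi-square in the table is the squared ratio of an estimate to its standard error, referred to a chi-square distribution with 1 degree of freedom. The sketch below recomputes the statistics from the printed estimates; the dataset name wald_check and its variable names are hypothetical. Because the printed values are rounded, the results agree with the table only approximately (the gre row is the most affected).

```sas
data wald_check;
  length parameter $ 9;
  input parameter $ estimate stderr;

  /* Wald chi-square: (estimate / standard error) squared */
  wald_chisq = (estimate / stderr)**2;

  /* Upper-tail p-value from a chi-square with 1 df */
  p_value = sdf('chisquare', wald_chisq, 1);
  datalines;
Intercept -2.7978 0.6476
GRE 0.00152 0.000640
GPA 0.4010 0.1948
TOPNOTCH 0.2730 0.1803
;
run;

proc print data=wald_check noobs;
run;
```
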
        Association of Predicted Probabilities and Observed Responses

Percent Concordant     63.9    Somers' D    0.283
Percent Discordant     35.6    Gamma        0.284
Percent Tied            0.5    Tau-a        0.123
Pairs                 34671    c            0.641

The table above gives information about the relationship between the predicted probabilities from our model, and the actual outcomes in our data.
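
The entries in this table are related by simple formulas. Pairs is the number of (admitted, not admitted) pairs of cases, and Somers' D, Gamma, Tau-a and c are all functions of the concordant, discordant and tied percentages. The data step below reproduces them from the printed values; the variable names are invented for this sketch, and because the percentages are rounded to one decimal place the results can differ from the table in the last digit.

```sas
data _null_;
  n        = 400;   /* observations used            */
  n_admit  = 127;   /* cases with admit = 1         */
  n_not    = 273;   /* cases with admit = 0         */
  pct_conc = 63.9;  /* percent concordant           */
  pct_disc = 35.6;  /* percent discordant           */
  pct_tied = 0.5;   /* percent tied                 */

  pairs      = n_admit * n_not;                              /* 34671 */
  somers_d   = (pct_conc - pct_disc) / 100;
  gamma_stat = (pct_conc - pct_disc) / (pct_conc + pct_disc);
  tau_a      = somers_d * pairs / (0.5 * n * (n - 1));
  c_stat     = (pct_conc + 0.5 * pct_tied) / 100;

  put pairs= somers_d= 6.3 gamma_stat= 6.3 tau_a= 6.3 c_stat= 6.3;
run;
```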

If we want to get the predicted probability for each case, we can do this by adding an output statement to our proc logistic step. The statement reads "output out=new_dataset prob=varname;" where new_dataset is the name of the new dataset, which will contain all the variables in the dataset we are working from along with the predicted probabilities, and varname is the name of the variable that will hold the predicted probabilities. Our complete syntax would look like this:

proc logistic data="c:\data\probit" descending;
  model admit = gre gpa topnotch / link=probit;
  output out=new_dataset prob=newvar;
run;

The output from this code will look the same as the output from the code above; the only difference is that SAS will also create a new dataset.
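
To look at the predicted probabilities, the new dataset can be printed like any other. The sketch below assumes the step above has been run, so that new_dataset and the variable newvar exist, and lists the first ten cases.

```sas
/* Show the first 10 cases with their predicted probability of admit = 1. */
proc print data=new_dataset(obs=10);
  var admit gre gpa topnotch newvar;
run;
```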

Running a Probit Model Using PROC PROBIT

As noted above, the advantage of using proc probit over proc logistic with the probit option is that proc probit will produce graphs that plot the predicted probability of an outcome against your independent variables. This can make your analysis much easier to interpret. The downside of using proc probit is that there is no descending option, so it takes a little more work to get the analysis to run so that the base category is the lowest value (e.g., 0) and the model gives the results for changes in the probability of higher values (e.g., 1). We don't have to do this, but if we don't, our model will model the probability of not being accepted to graduate school, which is a little counter-intuitive.

So the first step is to reorder our data so that the 1s (accepted) come first and the 0s (not accepted) come second. We do this using proc sort. Running proc sort can be time consuming if your dataset is very large, but for most datasets this will not be a problem. The first line of the proc sort step tells SAS which dataset we want to sort; the second line, by descending admit;, tells SAS that we want to sort on the variable admit and that we want to sort in descending order.

proc sort data="c:\data\probit"; 
by descending admit ;
run;
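
Because the step above has no out= option, the sorted data replace the original file. If you would rather leave the stored file in its original order, one option is to write the sorted copy to a separate dataset. In the sketch below, probit_sorted is a hypothetical name for a temporary dataset in the WORK library.

```sas
/* Sort into a temporary copy so c:\data\probit keeps its original order. */
proc sort data="c:\data\probit" out=probit_sorted;
  by descending admit;
run;
```

A later proc probit step would then read data=probit_sorted with order=data rather than the original file.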

Now that our dataset is sorted, we can run proc probit. The first line tells SAS we want to run proc probit and which dataset we want to use. The order=data option tells SAS to use the order in which the data are sorted to determine which category is used as the base category in our model. On the next line, the class admit; statement tells SAS that the variable admit is categorical. The variable topnotch is also categorical, but since it is coded as a 0/1 dummy variable, it doesn't matter how it is entered into the equation. Proc probit requires that the dependent variable (admit) be entered in the class statement, and that the class statement come before the model statement. The line that begins with the word model tells SAS that the model we want to estimate is admit predicted by gre, gpa, and topnotch.

The next two lines both start with predpplot; this part of the code tells SAS that we want plots of the predicted probability versus one of our independent variables. The level = ("1") option tells SAS that it should graph changes in the probability that the dependent variable (admit) is equal to one, and nodata indicates that the actual data points should not be included on the graph. These two lines are optional, in the sense that you can run a probit analysis without them, but they may help you in interpreting the analysis and in explaining it to others.

proc probit data="c:\data\probit" order=data;
  class admit;
  model admit = gre gpa topnotch;
  predpplot var = gre level = ("1") nodata;
  predpplot var = gpa level = ("1") nodata;
run;

Probit Procedure

                          Model Information

Data Set                  c:\data\probit    Written by SAS

Dependent Variable                                 ADMIT
Number of Observations                               400
Name of Distribution                              Normal
Log Likelihood                              -238.9433896

        Number of Observations Read         400
        Number of Observations Used         400

                     Class Level Information

                      Name       Levels    Values

                     ADMIT            2    1 0

                         Response Profile

                  Ordered                 Total
                    Value    ADMIT    Frequency

                        1    1              127
                        2    0              273

PROC PROBIT is modeling the probabilities of levels of ADMIT having LOWER Ordered Values in the response profile table.

Algorithm converged.

This output tells us the file being analyzed and the number of observations used. We see that all 400 observations in our data set were used in the analysis (fewer observations would have been used if any of our variables had missing values). The log likelihood (-238.9433896) can be used in comparisons of nested models, but we won't show an example of that here. The distribution is normal, which is correct for a probit model. The response profile shows that SAS is modeling admit being 1 (if we had not sorted our data, or had omitted the order=data option, SAS would model admit being 0 and the signs of our coefficients would be reversed). The last thing this section of the output tells us is that the algorithm converged. This is not very interesting, but it is important: if the algorithm does not converge, you cannot interpret the results of your model.


                  Type III Analysis of Effects

                                                   Wald
           Effect       DF    Chi-Square    Pr > ChiSq

           GRE           1        5.7055        0.0169
           GPA           1        4.3118        0.0378
           TOPNOTCH      1        2.3111        0.1284

              Analysis of Parameter Estimates

                                     Standard   95% Confidence     Chi-
Parameter  DF Estimate    Error       Limits       Square Pr > ChiSq

Intercept   1  -2.7979   0.6475  -4.0670  -1.5287   18.67     <.0001
GRE         1   0.0015   0.0006   0.0003   0.0028    5.71     0.0169
GPA         1   0.4010   0.1931   0.0225   0.7795    4.31     0.0378
TOPNOTCH    1   0.2730   0.1796  -0.0790   0.6250    2.31     0.1284

The Analysis of Parameter Estimates table gives the degrees of freedom, the coefficients (estimate), their standard errors (error), 95% confidence limits, a chi-square value, and the p-value associated with the chi-square value. The coefficients for gre and gpa are statistically significant; the coefficient for topnotch is not. A discussion of the interpretation of the coefficients can be found in the sample write-up section below.
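
The 95% confidence limits in this table are consistent with the usual Wald interval, the estimate plus or minus about 1.96 standard errors. The sketch below recomputes them from the printed estimates; the dataset name ci_check and its variable names are hypothetical, and rounding in the printed values (especially for gre) means the last digit may not match.

```sas
data ci_check;
  length parameter $ 9;
  input parameter $ estimate stderr;

  /* 97.5th percentile of the standard normal, about 1.96 */
  z = probit(0.975);

  lower = estimate - z * stderr;
  upper = estimate + z * stderr;
  datalines;
Intercept -2.7979 0.6475
GRE 0.0015 0.0006
GPA 0.4010 0.1931
TOPNOTCH 0.2730 0.1796
;
run;

proc print data=ci_check noobs;
  var parameter estimate lower upper;
run;
```

Note that the interval for topnotch includes zero, which agrees with its non-significant p-value.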

The two predpplot statements produce two graphs. The first shows how the predicted probability of a student being accepted to graduate school (admit=1) changes as gre scores increase. The second shows how the predicted probability of a student being accepted to graduate school (admit=1) changes as values of gpa change. We see that as both gpa and gre scores increase, the predicted probability of being accepted to graduate school increases. We could ask for this type of graph for topnotch as well, but since there are only two values of topnotch (0 and 1), it wouldn't make a very interesting graph.

Sample Write-up of the Analysis

Below is one way of describing the results.  Please note that the coefficients can be discussed in terms of either Z-scores or probit index.  These are equivalent terms.

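The predicted Z-score (probit index) of a person with a zero GRE score and zero GPA from a non-topnotch school is about -2.8.  For each one-point increase in GRE score, the Z-score increases by about .0015; for each one-point increase in GPA, the Z-score increases by about .4.  Graduating from a top-notch institution raises the Z-score by about .27, but this coefficient is not statistically significant.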
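
A Z-score can be turned into a predicted probability by passing it through the standard normal cumulative distribution function. The data step below is a sketch of that calculation using the proc logistic estimates; the applicant's values (GRE of 600, GPA of 3.5, top-notch institution) are hypothetical and chosen only for illustration.

```sas
data _null_;
  /* Hypothetical applicant, not a case from the data set */
  gre = 600;
  gpa = 3.5;
  topnotch = 1;

  /* Probit index (Z-score) from the estimated coefficients */
  z = -2.7978 + 0.00152*gre + 0.4010*gpa + 0.2730*topnotch;

  /* Standard normal CDF converts the index to a probability */
  p_admit = probnorm(z);

  put z= 8.4 p_admit= 6.4;
run;
```

For this applicant the Z-score is roughly -0.21, which corresponds to a predicted probability of admission of about 0.42.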

Similarities and differences between logit and probit models

Neither the logit model nor the probit model is linear in the probability of the outcome, which makes things difficult.  To make the model linear, a transformation is applied to that probability.  In logit regression, the transformation is the logit function, which is the natural log of the odds.  In probit models, the function used is the inverse of the standard normal cumulative distribution (a.k.a. a z-score).  In practice, this difference isn't too important:  both transformations are equally good at linearizing the model, and which one you use is a matter of personal preference.  Both models need to have diagnostics done afterwards to check that the assumptions of the model have not been violated.  Both methods use maximum likelihood, and so require more cases than a similar OLS model.  Unlike logit models, you don't get odds ratios with probit models.  In general, the logit coefficients are larger than the probit coefficients by a factor of about 1.7.  However, this rule often does not apply when an independent variable has a high standard error (lots of variability).
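
One way to see the comparison for these data is to refit the same model with the logit link, which is what proc logistic uses when no link= option is given. The sketch below only sets up that run; no logit results are shown on this page, so the ratio between the two sets of coefficients would need to be checked against your own output.

```sas
/* Same predictors as the probit model, but with the default logit link. */
proc logistic data="c:\data\probit" descending;
  model admit = gre gpa topnotch;
run;
```

The logit run should also include a table of odds ratio estimates, which has no counterpart in the probit output.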

Cautions, Flies in the Ointment

See Also


These pages are Copyrighted (c) by UCLA Academic Technology Services