Example 2: Some people have heart attacks and others don't. We would like to see whether exercise, age and gender influence whether or not someone has a heart attack. Again, we have a binary outcome: the person either has a heart attack or does not.
Example 3: Many undergraduates wish to continue their education in graduate school. In their application to any given graduate program, they include their GRE scores and their GPA from their undergraduate institution. Some students are graduating from very prestigious institutions, while others are graduating from not-so-prestigious institutions. Many months after sending in their applications, students receive either a thick or a thin envelope from the graduate program to which they applied: some were admitted and others were not.
use http://www.ats.ucla.edu/stat/stata/dae/probit.dta, clear
This hypothetical data set has a 0/1 variable called admit that we will use as our response (i.e., outcome, dependent) variable. We also have three variables that we will use as predictors: gre, which is the student's Graduate Record Exam score; gpa, which is the student's grade point average; and topnotch, which is a 0/1 variable where 1 indicates that the undergraduate institution was "top notch" and 0 indicates that it is not.
summarize gre gpa
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         gre |       400       587.7    115.5165        220        800
         gpa |       400      3.3899    .3805668       2.26          4
tab topnotch
   topnotch |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |        335       83.75       83.75
          1 |         65       16.25      100.00
------------+-----------------------------------
      Total |        400      100.00
Before we run our probit model, we will see if any cells (created by the crosstab of our categorical and response variables) are empty or particularly small. If any are, we may have difficulty running our model.
tab admit topnotch
           |       topnotch
     admit |         0          1 |     Total
-----------+----------------------+----------
         0 |       238         35 |       273
         1 |        97         30 |       127
-----------+----------------------+----------
     Total |       335         65 |       400
None of the cells is too small or empty (has no cases), so we will run our model.
probit admit gre topnotch gpa
Iteration 0: log likelihood = -249.98826
Iteration 1: log likelihood = -238.97735
Iteration 2: log likelihood = -238.94339
Iteration 3: log likelihood = -238.94339
Probit regression                                 Number of obs   =        400
                                                  LR chi2(3)      =      22.09
                                                  Prob > chi2     =     0.0001
Log likelihood = -238.94339                       Pseudo R2       =     0.0442
------------------------------------------------------------------------------
       admit |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         gre |   .0015244   .0006382     2.39   0.017     .0002736    .0027752
    topnotch |   .2730334   .1795984     1.52   0.128     -.078973    .6250398
         gpa |   .4009853   .1931077     2.08   0.038     .0225012    .7794694
       _cons |  -2.797884   .6475363    -4.32   0.000    -4.067032   -1.528736
------------------------------------------------------------------------------
In the output above, we first see the iteration log. In general, this is not very interesting, but it does show how quickly the model converged. The final log likelihood (-238.94339) can be used in comparisons of nested models, but we won't show an example of that here. Also at the top of the output we see that all 400 observations in our data set were used in the analysis. Fewer observations would have been used if any of our variables had missing values; by default, Stata does a listwise deletion of cases with missing values. The likelihood ratio chi-square of 22.09 with a p-value of 0.0001 tells us that our model as a whole is statistically significant, as compared to a model with no predictors. The pseudo-R-squared is also given. It is a pseudo-R-squared because there is no direct equivalent of an R-squared (from OLS regression) in non-linear models. There are many different pseudo-R-squareds, but the emphasis should be on the pseudo.
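The likelihood ratio chi-square and the pseudo-R-squared in the header can be recovered by hand from the iteration log. The sketch below assumes that the log likelihood at iteration 0 is that of the constant-only model and that the pseudo-R-squared reported is McFadden's; the scalar names are ours, chosen for illustration. The displayed values should agree with the header (22.09, 0.0001 and 0.0442) up to rounding.

```stata
* Log likelihoods copied from the iteration log above
* (assumption: iteration 0 is the constant-only model)
scalar ll_0 = -249.98826
scalar ll_m = -238.94339

* Likelihood ratio chi-square: twice the gain in log likelihood
scalar lr_chi2 = 2*(ll_m - ll_0)

* Upper-tail p-value from a chi-square distribution with 3 degrees of freedom
scalar p_lr = chi2tail(3, lr_chi2)

* McFadden's pseudo-R-squared
scalar pseudo_r2 = 1 - ll_m/ll_0

display "LR chi2(3)  = " %6.2f scalar(lr_chi2)
display "Prob > chi2 = " %6.4f scalar(p_lr)
display "Pseudo R2   = " %6.4f scalar(pseudo_r2)
```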
In the table we see the coefficients, their standard errors, the z-statistics and associated p-values, and the 95% confidence intervals of the coefficients. Both gre and gpa are statistically significant at the 0.05 level; topnotch is not. A discussion of the interpretation of the coefficients can be found in the sample write-up section below.
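To see where the columns of the table come from, the row for gre can be reproduced from its coefficient and standard error alone. This is a minimal sketch using the values printed above; the scalar names are ours. The results should match the gre row (z of 2.39, p-value of 0.017, interval from .0002736 to .0027752) up to rounding.

```stata
* Coefficient and standard error for gre, copied from the table above
scalar b_gre  = .0015244
scalar se_gre = .0006382

* z statistic: coefficient divided by its standard error
scalar z_gre = b_gre/se_gre

* Two-sided p-value from the standard normal distribution
scalar p_gre = 2*(1 - normal(abs(z_gre)))

* 95% confidence interval: coefficient plus or minus about 1.96 standard errors
scalar lo_gre = b_gre - invnormal(.975)*se_gre
scalar hi_gre = b_gre + invnormal(.975)*se_gre

display "z     = " %5.2f scalar(z_gre)
display "P>|z| = " %5.3f scalar(p_gre)
display "95% CI: " %9.7f scalar(lo_gre) " to " %9.7f scalar(hi_gre)
```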
There is no equivalent of an exponentiated coefficient (odds ratio) in probit, so if you find probit coefficients tricky to interpret, you might prefer to look at predicted probabilities, which many people find easier to understand. The commands used below are user-written and need to be downloaded, which you can do by typing findit spost. We will start with prtab. It can be used with either a categorical or a continuous variable and shows the predicted probability at each value of the variable specified. Although topnotch is not statistically significant, we will use it as an example of a categorical predictor. As you can see, the predicted probability of being accepted into the graduate program is .29 if the undergraduate institution was not "top notch" and .39 if it was. We can also see that the predicted probability of being accepted is only .14 if one's GRE score is 220 and increases to .43 if one's GRE score is 800. Beneath each output, we can see the values at which the other variables are held; by default, they are held at their means.
prtab topnotch
probit: Predicted probabilities of positive outcome for admit
----------------------
topnotch | Prediction
----------+-----------
0 | 0.2937
1 | 0.3937
----------------------
          gre  topnotch       gpa
x=      587.7     .1625    3.3899
prtab gre
probit: Predicted probabilities of positive outcome for admit
----------------------
gre | Prediction
----------+-----------
220 | 0.1448
300 | 0.1744
340 | 0.1905
360 | 0.1989
380 | 0.2076
400 | 0.2164
420 | 0.2254
440 | 0.2347
460 | 0.2442
480 | 0.2538
500 | 0.2637
520 | 0.2737
540 | 0.2840
560 | 0.2944
580 | 0.3050
600 | 0.3158
620 | 0.3267
640 | 0.3378
660 | 0.3490
680 | 0.3603
700 | 0.3718
720 | 0.3834
740 | 0.3951
760 | 0.4068
780 | 0.4187
800 | 0.4307
----------------------
          gre  topnotch       gpa
x=      587.7     .1625    3.3899
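The predictions from prtab can be checked by hand, because a probit prediction is the standard normal CDF evaluated at the linear index. The following is a minimal sketch, assuming the coefficients and sample means printed above; the scalar names are ours. The three results should agree with the prtab values (0.2937, 0.3937 and 0.1448) up to rounding of the inputs.

```stata
* Coefficients copied from the probit output above
scalar b_gre  = .0015244
scalar b_top  = .2730334
scalar b_gpa  = .4009853
scalar b_cons = -2.797884

* Linear index with gre and gpa at their means and topnotch set to 0 or 1
scalar xb_top0 = b_cons + b_gre*587.7 + b_top*0 + b_gpa*3.3899
scalar xb_top1 = b_cons + b_gre*587.7 + b_top*1 + b_gpa*3.3899

* Linear index with gre set to 220 and topnotch and gpa at their means
scalar xb_gre220 = b_cons + b_gre*220 + b_top*.1625 + b_gpa*3.3899

* normal() is the standard normal CDF, which maps the index to a probability
display "Pr(admit | topnotch = 0) = " %6.4f normal(scalar(xb_top0))
display "Pr(admit | topnotch = 1) = " %6.4f normal(scalar(xb_top1))
display "Pr(admit | gre = 220)    = " %6.4f normal(scalar(xb_gre220))
```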
We can use the prvalue command to obtain the predicted probabilities when GPA is set to specific values: 2, 3 and 4. As you can see, when one's GPA is 2, the predicted probability of being accepted is only .146 (and .854 of not being accepted). When GPA is increased to 3, the probability of being accepted increases to .26, and when one's GPA is 4, it is .40. In each case, gre and topnotch are held at their means.
prvalue , x(gpa=2)
probit: Predictions for admit

Confidence intervals by delta method

                                95% Conf. Interval
  Pr(y=1|x):          0.1456   [ 0.0200,    0.2711]
  Pr(y=0|x):          0.8544   [ 0.7289,    0.9800]

          gre  topnotch       gpa
x=      587.7     .1625         2
prvalue , x(gpa=3)
probit: Predictions for admit

Confidence intervals by delta method

                                95% Conf. Interval
  Pr(y=1|x):          0.2563   [ 0.1910,    0.3217]
  Pr(y=0|x):          0.7437   [ 0.6783,    0.8090]

          gre  topnotch       gpa
x=      587.7     .1625         3
prvalue , x(gpa=4)
probit: Predictions for admit

Confidence intervals by delta method

                                95% Conf. Interval
  Pr(y=1|x):          0.3999   [ 0.2997,    0.5000]
  Pr(y=0|x):          0.6001   [ 0.5000,    0.7003]

          gre  topnotch       gpa
x=      587.7     .1625         4
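If the spost commands are not installed, Stata's built-in margins command (available from Stata 11 onward) is one alternative way to obtain similar predictions. It is not the command used in this example, so treat the following as a sketch under stated assumptions: the model is refit first, the data set URL used earlier is still reachable, and the atmeans option is used to mimic prvalue's default of holding the other predictors at their means. The predicted probabilities should be close to those shown above.

```stata
* Refit the model so that margins has estimation results to work from
use http://www.ats.ucla.edu/stat/stata/dae/probit.dta, clear
probit admit gre topnotch gpa

* Predicted Pr(admit) at gpa = 2, 3 and 4, with gre and topnotch at their means
margins, at(gpa=(2 3 4)) atmeans
```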
We can also use the prvalue command to look at specific profiles. Below we first see the predicted probabilities for a student who had a 3.5 GPA and a GRE score of 700, with topnotch held at its mean. We then see the predicted probabilities for a student with the highest values of all three variables (a GPA of 4, a GRE score of 800 and a top-notch undergraduate institution).
prvalue , x(gpa=3.5 gre=700)
probit: Predictions for admit

Confidence intervals by delta method

                                95% Conf. Interval
  Pr(y=1|x):          0.3886   [ 0.3202,    0.4570]
  Pr(y=0|x):          0.6114   [ 0.5430,    0.6798]

          gre  topnotch       gpa
x=        700     .1625       3.5
prvalue , x(gpa=max gre=max topnotch=1)
probit: Predictions for admit

Confidence intervals by delta method

                                95% Conf. Interval
  Pr(y=1|x):          0.6174   [ 0.4765,    0.7583]
  Pr(y=0|x):          0.3826   [ 0.2417,    0.5235]

          gre  topnotch       gpa
x=        800         1         4
We will use the estout command to create a table of the results that might be more appropriate for publication. This command is user-written, so type findit estout to download it.
estout, varwidth(12) varlabels(_cons Constant) cells(b(star fmt(%8.2f)) ///
    se(par fmt(%8.2f))) ///
    stats(ll chi2 r2_p, labels(log_likelihood LR_chi_square pseudo_r2) fmt(%8.2f))

                     b/se
gre                 0.00*
                   (0.00)
topnotch            0.27
                   (0.18)
gpa                 0.40*
                   (0.19)
Constant           -2.80***
                   (0.65)
log_likelihood    -238.94
LR_chi_square       22.09
pseudo_r2            0.04
Below is one way of describing the results. Please note that the coefficients can be discussed in terms of either Z-scores or the probit index. These are equivalent terms.
The predicted Z-score for a student with a GRE score of zero and a GPA of zero from an undergraduate institution that is not top notch is about -2.8. Holding the other predictors constant, each one-point increase in GRE score increases the Z-score by .0015244, and each one-point increase in GPA increases the probit index by .4. Graduating from a top-notch institution increases the index by .27, although this coefficient is not statistically significant.
Z-scores may not be the easiest metric for your audience to understand. As we saw above, you can use the prvalue, prtab and other spost commands to obtain predicted probabilities. These are often useful for helping to tell the "story" of your results.
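One reason the index is hard to communicate is that the same change in the index produces a different change in probability depending on where it starts. As a rough sketch of this, the change in the predicted probability per one-point change in GPA is approximately the standard normal density at the index multiplied by the GPA coefficient. The code below assumes the coefficients and sample means from the output above; the scalar names are ours.

```stata
* Coefficients copied from the probit output above
scalar b_gre  = .0015244
scalar b_top  = .2730334
scalar b_gpa  = .4009853
scalar b_cons = -2.797884

* Index with gre and topnotch at their means and gpa set to 2, then to 4
scalar xb_gpa2 = b_cons + b_gre*587.7 + b_top*.1625 + b_gpa*2
scalar xb_gpa4 = b_cons + b_gre*587.7 + b_top*.1625 + b_gpa*4

* Approximate slope of Pr(admit) with respect to gpa:
* standard normal density at the index times the gpa coefficient
scalar slope_gpa2 = normalden(xb_gpa2)*b_gpa
scalar slope_gpa4 = normalden(xb_gpa4)*b_gpa

display "slope at gpa = 2: " %6.4f scalar(slope_gpa2)
display "slope at gpa = 4: " %6.4f scalar(slope_gpa4)
```

The index rises by the same .4 per GPA point in both cases, but the slope on the probability scale is larger at a GPA of 4 than at a GPA of 2, because the index there is closer to zero, where the normal CDF is steepest.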
Neither the logit model nor the probit model is linear in the probability of the outcome, which makes things difficult. To make the model linear, a transformation is applied to that probability. In logit regression, the transformation is the logit function, which is the natural log of the odds. In probit models, the function used is the inverse of the standard normal cumulative distribution function (which yields a z-score). In practice, this difference isn't too important: both transformations are about equally good at linearizing the model, and which one you use is largely a matter of personal preference. Both models need to have diagnostics done afterwards to check that the assumptions of the model have not been violated. Both methods use maximum likelihood, and so require more cases than a similar OLS model. Unlike logit models, you don't get odds ratios with probit models. In general, the logit coefficients are larger than the probit coefficients by a factor of roughly 1.7. However, this rule of thumb often does not hold when an independent variable has a high standard error (lots of variability).
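The factor of about 1.7 reflects how closely a rescaled logistic curve tracks the normal curve. A small sketch of that comparison, using only built-in functions (the grid of values from -2 to 2 is an arbitrary choice for illustration): the second and third columns should be close to each other at every value of z.

```stata
* Compare the standard normal CDF with the logistic CDF evaluated at 1.7*z
display "  z  normal(z)  invlogit(1.7*z)"
forvalues z = -2/2 {
    display %3.0f (`z') "     " %6.4f normal(`z') "           " %6.4f invlogit(1.7*(`z'))
}
```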
These pages are Copyrighted (c) by UCLA Academic Technology Services