Note: The examples on this page were done in SPSS 17. If you are using an earlier version of SPSS, you may need to use the genlog command.
Example 1. School administrators study the attendance behavior of high school juniors at two schools. Predictors of the number of days of absence include gender of the student and standardized test scores in math and language arts.
We have attendance data on 316 high school juniors from two urban high schools in the file poissonreg.sav. The response variable of interest is days absent, daysabs. The variables math and langarts give the standardized test scores for math and language arts, respectively. The variable male is a binary indicator of student gender.
Let's look at the data.
GET FILE='D:\work\data\spss\poissonreg.sav'.
DESCRIPTIVES VARIABLES=male math langarts daysabs /STATISTICS=MEAN STDDEV VAR MIN MAX .
GRAPH /HISTOGRAM=daysabs .
FREQUENCIES VARIABLES=male.
GENLIN daysabs WITH male langarts math /MODEL male langarts math INTERCEPT=YES DISTRIBUTION=NEGBIN(MLE) LINK=LOG.
Goodness of Fit (b)

| | Value | df | Value/df |
|---|---|---|---|
| Deviance | 356.935 | 311 | 1.148 |
| Scaled Deviance | 356.935 | 311 | |
| Pearson Chi-Square | 337.089 | 311 | 1.084 |
| Scaled Pearson Chi-Square | 337.089 | 311 | |
| Log Likelihood (a) | -880.873 | | |
| Akaike's Information Criterion (AIC) | 1771.746 | | |
| Finite Sample Corrected AIC (AICC) | 1771.940 | | |
| Bayesian Information Criterion (BIC) | 1790.525 | | |
| Consistent AIC (CAIC) | 1795.525 | | |

Dependent Variable: days absent
Model: (Intercept), male, langarts, math
a. The full log likelihood function is displayed and used in computing information criteria.
b. Information criteria are in small-is-better form.
Tests of Model Effects

| Source | Type III Wald Chi-Square | df | Sig. |
|---|---|---|---|
| (Intercept) | 136.380 | 1 | .000 |
| male | 9.531 | 1 | .002 |
| langarts | 6.608 | 1 | .010 |
| math | .109 | 1 | .741 |

Dependent Variable: days absent
Model: (Intercept), male, langarts, math
Parameter Estimates

| Parameter | B | Std. Error | 95% Wald CI Lower | 95% Wald CI Upper |
|---|---|---|---|---|
| (Intercept) | 2.716 | .2326 | 2.260 | 3.172 |
| male | -.431 | .1397 | -.705 | -.157 |
| langarts | -.014 | .0056 | -.025 | -.003 |
| math | -.002 | .0048 | -.011 | .008 |
| (Scale) | 1 (a) | | | |
| (Negative binomial) | 1.288 | .1231 | 1.068 | 1.554 |

Dependent Variable: days absent
Model: (Intercept), male, langarts, math
a. Fixed at the displayed value.
Parameter Estimates (Hypothesis Test)

| Parameter | Wald Chi-Square | df | Sig. |
|---|---|---|---|
| (Intercept) | 136.380 | 1 | .000 |
| male | 9.531 | 1 | .002 |
| langarts | 6.608 | 1 | .010 |
| math | .109 | 1 | .741 |

Dependent Variable: days absent
Model: (Intercept), male, langarts, math
The output looks very much like the output from an OLS regression. It begins with the Goodness of Fit table, which includes the log likelihood, AIC and BIC. These values can be used when comparing models.
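As a rough check on how the information criteria relate to the log likelihood, the sketch below recomputes them from the values in the Goodness of Fit table using the standard textbook formulas. It assumes five estimated parameters (the intercept, the three predictors, and the negative binomial dispersion parameter) and 316 observations; the variable names are illustrative only.

```python
import math

# Values read from the Goodness of Fit table for the first model.
log_likelihood = -880.873
n_obs = 316  # number of students in the data file

# Assumption: 5 estimated parameters (intercept, male, langarts, math,
# and the negative binomial dispersion parameter).
n_params = 5

neg2ll = -2.0 * log_likelihood

# Standard small-is-better information criteria.
aic = neg2ll + 2 * n_params
aicc = aic + (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1)
bic = neg2ll + n_params * math.log(n_obs)
caic = neg2ll + n_params * (math.log(n_obs) + 1)

for name, value in [("AIC", aic), ("AICC", aicc), ("BIC", bic), ("CAIC", caic)]:
    print(f"{name}: {value:.3f}")
```

Under these assumptions the results agree with the table to about three decimal places. When comparing two models fit to the same data, the one with the smaller value is preferred.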
Next comes the Tests of Model Effects table. It contains the same tests as the Parameter Estimates table because we entered all of the predictors as continuous variables, so each one has just one degree of freedom. In models with categorical predictors, this table gives the overall effect of each categorical variable as well as the effects of the continuous variables.
The Parameter Estimates table follows. It gives the negative binomial regression coefficients for each of the variables, along with their standard errors, Wald chi-square values, p-values and 95% confidence intervals.
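Because the model uses a log link, exponentiating a coefficient gives the multiplicative change in the expected number of days absent for a one-unit increase in that predictor. The short sketch below applies this to the coefficients from the Parameter Estimates table; the rounded inputs limit the precision of the results, and the dictionary names are illustrative.

```python
import math

# Coefficients (B) copied from the Parameter Estimates table (rounded).
coefficients = {"male": -0.431, "langarts": -0.014, "math": -0.002}

# With a log link, exp(B) is the ratio of expected counts for a
# one-unit increase in the predictor, holding the others fixed.
rate_ratios = {name: math.exp(b) for name, b in coefficients.items()}

for name, ratio in rate_ratios.items():
    percent_change = (ratio - 1.0) * 100.0
    print(f"{name}: exp(B) = {ratio:.3f} ({percent_change:+.1f}% per unit)")
```

For example, exp(-0.431) is about 0.65, so with test scores held constant the expected number of days absent for male = 1 is roughly 65% of that for male = 0.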
Now, just to be on the safe side, let's rerun the genlin command with the covb = robust option in order to obtain robust standard errors for the negative binomial regression coefficients.
GENLIN daysabs WITH male langarts math /MODEL male langarts math INTERCEPT=YES DISTRIBUTION=NEGBIN(MLE) LINK=LOG /CRITERIA METHOD=FISHER(1) SCALE=1 COVB=ROBUST.
Parameter Estimates

| Parameter | B | Std. Error | 95% Wald CI Lower | 95% Wald CI Upper |
|---|---|---|---|---|
| (Intercept) | 2.716 | .2135 | 2.298 | 3.135 |
| male | -.431 | .1401 | -.706 | -.157 |
| langarts | -.014 | .0054 | -.025 | -.004 |
| math | -.002 | .0063 | -.014 | .011 |
| (Scale) | 1 (a) | | | |
| (Negative binomial) | 1.288 | .1231 | 1.068 | 1.554 |

Dependent Variable: days absent
Model: (Intercept), male, langarts, math
a. Fixed at the displayed value.
Parameter Estimates (Hypothesis Test)

| Parameter | Wald Chi-Square | df | Sig. |
|---|---|---|---|
| (Intercept) | 161.818 | 1 | .000 |
| male | 9.472 | 1 | .002 |
| langarts | 7.124 | 1 | .008 |
| math | .066 | 1 | .798 |

Dependent Variable: days absent
Model: (Intercept), male, langarts, math
Using the covb = robust option has resulted in a fairly large change in the model chi-square, which is now a Wald chi-square based on the log pseudo-likelihood rather than a likelihood ratio chi-square. The robust standard errors attempt to adjust for heterogeneity in the model. The coefficient estimates themselves are unchanged; only the standard errors, and with them the Wald tests and confidence intervals, differ. The variable math was not significant without the covb = robust option (p = .741) and is even less so with it (p = .798).
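To see how the standard errors feed into the tests, the sketch below recomputes the Wald chi-square, its p-value and the 95% Wald confidence interval from B and the robust standard error for two of the parameters. It uses the usual formulas (chi-square = (B / SE)^2 with one degree of freedom, interval = B ± 1.96 × SE); the helper name `wald_summary` is hypothetical, and because the inputs are the rounded table values the results only approximately match the output.

```python
import math

# (B, robust Std. Error) copied from the robust Parameter Estimates table.
estimates = {"(Intercept)": (2.716, 0.2135), "male": (-0.431, 0.1401)}

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_summary(b, se):
    """Return (Wald chi-square, p-value, CI lower, CI upper) for one coefficient."""
    wald_chi_square = (b / se) ** 2
    # Upper-tail probability of a chi-square with 1 df, via the
    # complementary error function: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(wald_chi_square / 2.0))
    return wald_chi_square, p_value, b - Z_95 * se, b + Z_95 * se


for name, (b, se) in estimates.items():
    chi2, p, lower, upper = wald_summary(b, se)
    print(f"{name}: Wald = {chi2:.3f}, p = {p:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

Small differences from the printed table (for example in the third decimal of the Wald chi-square) come from rounding B and the standard error before the calculation.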
Since math is not significant in the model with robust standard errors, we will rerun the model dropping that variable.
GENLIN daysabs WITH male langarts /MODEL male langarts INTERCEPT=YES DISTRIBUTION=NEGBIN(MLE) LINK=LOG /CRITERIA METHOD=FISHER(1) SCALE=1 COVB=ROBUST.
Goodness of Fit (b)

| | Value | df | Value/df |
|---|---|---|---|
| Deviance | 2238.318 | 313 | 7.151 |
| Scaled Deviance | 2238.318 | 313 | |
| Pearson Chi-Square | 2752.913 | 313 | 8.795 |
| Scaled Pearson Chi-Square | 2752.913 | 313 | |
| Log Likelihood (a) | -1549.857 | | |
| Akaike's Information Criterion (AIC) | 3105.713 | | |
| Finite Sample Corrected AIC (AICC) | 3105.790 | | |
| Bayesian Information Criterion (BIC) | 3116.981 | | |
| Consistent AIC (CAIC) | 3119.981 | | |

Dependent Variable: days absent
Model: (Intercept), male, langarts
a. The full log likelihood function is displayed and used in computing information criteria.
b. Information criteria are in small-is-better form.
Omnibus Test (a)

| Likelihood Ratio Chi-Square | df | Sig. |
|---|---|---|
| 171.503 | 2 | .000 |

Dependent Variable: days absent
Model: (Intercept), male, langarts
a. Compares the fitted model against the intercept-only model.
Parameter Estimates

| Parameter | B | Std. Error | 95% Wald CI Lower | 95% Wald CI Upper |
|---|---|---|---|---|
| (Intercept) | 2.647 | .1823 | 2.290 | 3.004 |
| male | -.409 | .1352 | -.674 | -.144 |
| langarts | -.015 | .0034 | -.021 | -.008 |
| (Scale) | 1 (a) | | | |

Dependent Variable: days absent
Model: (Intercept), male, langarts
a. Fixed at the displayed value.
Parameter Estimates (Hypothesis Test)

| Parameter | Wald Chi-Square | df | Sig. |
|---|---|---|---|
| (Intercept) | 210.718 | 1 | .000 |
| male | 9.164 | 1 | .002 |
| langarts | 18.263 | 1 | .000 |

Dependent Variable: days absent
Model: (Intercept), male, langarts
Finally, we will use the emmeans option to get the predicted number of days absent for males and females. To use the emmeans option, we have to specify male as a categorical variable by listing it after BY. Because male is binary, the model specified this way is the same as the one above, except that the reference group is now male = 1. That is why the sign of the coefficient for male is reversed.
GENLIN daysabs BY male WITH langarts /MODEL male langarts INTERCEPT=YES DISTRIBUTION=NEGBIN(MLE) LINK=LOG /CRITERIA METHOD=FISHER(1) SCALE=1 COVB=ROBUST /EMMEANS TABLES=male SCALE=ORIGINAL.
Estimates

| male | Mean | Std. Error | 95% Wald CI Lower | 95% Wald CI Upper |
|---|---|---|---|---|
| 0 | 6.82 | .690 | 5.47 | 8.17 |
| 1 | 4.43 | .432 | 3.59 | 5.28 |

Covariates appearing in the model are fixed at the following values: langarts=50.0638
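Since both estimated means are computed with langarts held at the same value and the model uses a log link, their ratio should equal the exponentiated difference between the two groups on the log scale. The sketch below works this out from the rounded means in the Estimates table; it is only an arithmetic illustration, and the variable names are hypothetical.

```python
import math

# Estimated marginal means copied from the Estimates table (rounded).
mean_male_0 = 6.82  # male = 0
mean_male_1 = 4.43  # male = 1

# Ratio of expected days absent, male = 1 relative to male = 0.
ratio = mean_male_1 / mean_male_0

# On the log (link) scale this ratio is the difference between the groups.
implied_log_difference = math.log(ratio)

print(f"ratio of means: {ratio:.3f}")
print(f"implied difference on the log scale: {implied_log_difference:.3f}")
```

With male = 1 as the reference group, the coefficient reported for male = 0 would carry the opposite sign of this log difference, which is the sign reversal described above.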