Stata Data Analysis Examples
Zero-Truncated Negative Binomial

Version info: Code for this page was tested in Stata 12.

Zero-truncated negative binomial regression is used to model count data for which the value zero cannot occur and for which there is evidence of overdispersion.

Please Note: The purpose of this page is to show how to use various data analysis commands. It does not cover all aspects of the research process that researchers are expected to carry out. In particular, it does not cover data cleaning and verification, verification of assumptions, model diagnostics, and potential follow-up analyses.

Examples of zero-truncated negative binomial

Example 1. A study of the length of hospital stay, in days, as a function of age, kind of health insurance, and whether or not the patient died while in the hospital. Length of hospital stay is recorded with a minimum of one day.

Example 2. A study of the number of journal articles published by tenured faculty as a function of discipline (fine arts, science, social science, humanities, medical, etc.). To get tenure, faculty must publish, so there are no tenured faculty with zero publications.

Example 3. A study by the county traffic court on the number of tickets received by teenagers as predicted by school performance, amount of driver training and gender. Only individuals who have received at least one citation are in the traffic court files.

Description of the data

Let's pursue Example 1 from above.

We have a hypothetical data file, ztp.dta, with 1,493 observations. The variable describing length of hospital stay is stay. The variable age gives the age group from 1 to 9, which will be treated as interval in this example. The variables hmo and died are binary indicator variables for HMO-insured patients and patients who died while in the hospital, respectively. These are the same data as were used in the ztp (zero-truncated Poisson) example.

Let's look at the data.
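A minimal sketch of how the data might be loaded and inspected is shown here. It assumes that ztp.dta has been saved in the current working directory; the particular choice of summaries (a histogram of stay and frequency tables of the predictors) is ours and is only one reasonable way to look at these variables.

```stata
* Assumption: ztp.dta is in the current working directory
use ztp, clear

* Summary statistics for length of stay (in days)
summarize stay

* Histogram with one bar per integer value of the count outcome
histogram stay, discrete

* One-way frequency tables for each predictor
tab1 age hmo died
```

Because the data are zero-truncated, the minimum of stay reported by summarize should be at least one.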

Analysis methods you might consider

Before we show how you can analyze these data with a zero-truncated negative binomial analysis, let's consider some other methods that you might use.

Zero-truncated negative binomial regression

The tnbreg command fits models that are left-truncated at any value, not just zero. The ztnb command was previously used for zero-truncated negative binomial regression, but it is no longer supported in Stata 12 and has been superseded by tnbreg.
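As a sketch, a model for Example 1 could be fit as follows. We assume ztp.dta is in the working directory and that hmo and died are entered as factor variables with the i. prefix; that specification is our choice for illustration.

```stata
* Assumption: ztp.dta is in the current working directory
use ztp, clear

* ll(0) sets the truncation point at zero; this is also the default,
* but stating it makes the truncation explicit
tnbreg stay age i.hmo i.died, ll(0)
```

Adding the irr option to the same command would report incidence-rate ratios instead of coefficients on the log scale.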

The output looks very much like the output from an OLS regression:

Looking through the results we see the following:

We can also use the margins command to help understand our model. We will first compute the expected counts for the categorical variable hmo while holding the other variables, age and died, at their mean values using the atmeans option. Note that margins reports the expected stay in days, not log days.
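One way this could be written is sketched below. It assumes the model was fit with hmo entered as a factor variable (i.hmo), which margins needs in order to treat hmo as categorical; the model is refit quietly so that the block runs on its own.

```stata
* Assumption: ztp.dta is in the current working directory
use ztp, clear
quietly tnbreg stay age i.hmo i.died, ll(0)

* Expected counts at each level of hmo, other covariates at their means
margins hmo, atmeans
```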

The expected stay for non-HMO patients was 9.502 days, while it was 8.203 days for HMO patients.

The dydx option computes the difference in expected counts between HMO and non-HMO patients while still holding the other variables at their mean values.
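A sketch of such a command follows, again assuming hmo was entered as i.hmo so that margins reports a discrete change from the base level rather than a derivative.

```stata
* Assumption: ztp.dta is in the current working directory
use ztp, clear
quietly tnbreg stay age i.hmo i.died, ll(0)

* Difference in expected count, HMO versus non-HMO, at covariate means
margins, dydx(hmo) atmeans
```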

As shown above, HMO patients are expected to spend 1.299 fewer days in the hospital than non-HMO patients when the other variables are held at their mean values.

One last margins command will give the expected counts for values of the age variable from one through nine while averaging across the two levels of hmo and died. We will show these results even though age was not statistically significant.
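This could be sketched as below. Without atmeans, margins averages the predictions over the observed values of hmo and died. The follow-up marginsplot call is an optional addition of ours for viewing the results graphically.

```stata
* Assumption: ztp.dta is in the current working directory
use ztp, clear
quietly tnbreg stay age i.hmo i.died, ll(0)

* Expected counts at age = 1, 2, ..., 9, averaged over hmo and died
margins, at(age=(1(1)9))

* Optional: plot the margins just computed
marginsplot
```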

A number of model fit indicators are available using the estat ic command.
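A minimal sketch, refitting the same assumed model so that the block runs on its own:

```stata
* Assumption: ztp.dta is in the current working directory
use ztp, clear
quietly tnbreg stay age i.hmo i.died, ll(0)

* Log likelihoods, AIC, and BIC for the fitted model
estat ic
```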

Things to consider

