UCLA Academic Technology Services

SAS Data Analysis Examples
Tobit Analysis

Examples of Tobit Analysis

Example 1. In the 1980s a federal law restricted speedometer readings to no more than 85 mph. If you tried to predict a vehicle's top speed from a combination of horsepower and engine size, you would get a reading no higher than 85, regardless of how fast the vehicle was really traveling. This is a classic case of right-censoring (censoring from above). For vehicles with a reading of 85, the only thing we know for certain is that they were traveling at least 85 mph. Tobit models are designed to give improved estimates when there is left- or right-censoring.

Example 2. A research project is studying the level of lead in home drinking water as a function of the age of a house and family income. The water testing kit cannot detect lead concentrations below 5 parts per billion (ppb). The EPA considers levels above 15 ppb to be dangerous. These data are an example of left-censoring (censoring from below) and can be analyzed using tobit analysis.

Example 3. Consider the situation in which we have a measure of academic aptitude (scaled 200-800) which we want to model using reading and math test scores and whether the student is enrolled in a public or private school. The problem here is that students who answer all questions on the academic aptitude test correctly receive a score of 800, even though it is likely that these students are not "truly" equal in aptitude.
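
Example 3 can be written as a latent-variable model. The notation below is an assumption introduced for illustration (the symbols y*, beta and sigma are not used elsewhere on this page); it sketches the standard right-censored tobit setup with the ceiling at 800, where y_i is the observed aptitude score and y_i* is the unobserved "true" aptitude.

```latex
y_i^{*} = \beta_0 + \beta_1\,\mathrm{read}_i + \beta_2\,\mathrm{math}_i + \beta_3\,\mathrm{public}_i + \varepsilon_i,
\qquad \varepsilon_i \sim N(0, \sigma^2)

y_i = \min\left(y_i^{*},\; 800\right)
```

Under this sketch, every student whose latent aptitude is 800 or higher is recorded as exactly 800, which is why those scores carry less information than the uncensored ones.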

Description of the Data

Let's pursue Example 3 from above.

We have a hypothetical data file, tobitex.sas7bdat, with 200 observations. The academic aptitude variable is apt, and the reading and math test scores are read and math, respectively. The variable public is a zero-one variable, with ones indicating a public school student.

Let's look at the data.
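
A quick look might use proc means for the continuous variables and proc freq for the school indicator and for the scores at the 800 ceiling. This is only a sketch: it assumes tobitex is available in the current library (add a libname reference if it is stored elsewhere) and that the predictors are named read and math, as in the model fitted later on this page.

```sas
* Descriptive statistics for the outcome and the two test scores;
proc means data=tobitex n mean std min max;
  var apt read math;
run;

* Distribution of school type (1 = public, 0 = private);
proc freq data=tobitex;
  tables public;
run;

* Count the observations sitting at the ceiling of 800;
proc freq data=tobitex;
  where apt >= 800;
  tables apt;
run;
```

The last step is the one that matters most for a tobit analysis, since it shows how many observations are censored.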

Some Strategies You Might Be Tempted To Try

Before we show how you can analyze this with a tobit analysis, let's consider some other methods that you might use.
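
One approach a reader might reach for first is OLS regression on the observed scores, which treats every 800 as an exact value rather than as a ceiling. The page does not list here which alternative methods it has in mind, so the following is offered only as an illustrative assumption, using the same outcome and predictors as the tobit model.

```sas
* Naive OLS fit that ignores the censoring at 800;
proc reg data=tobitex;
  model apt = read math public;
run;
quit;
```

Because the censored scores are treated as if they were true values, the estimates from such a fit should be read with caution.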

SAS Tobit Analysis

The ub = option on the endogenous statement gives the value at which right-censoring begins. There is also an lb = option for the value at which left-censoring begins; it was not needed in this example.
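
As a minimal sketch of the call being described, the model can be fitted with proc qlim as follows. It assumes the tobitex data set is available in the current library; the detection limit of 5 mentioned in the closing comment is a hypothetical value for illustration only.

```sas
* Tobit model with right-censoring at the test ceiling of 800;
proc qlim data=tobitex;
  model apt = read math public;
  endogenous apt ~ censored(ub=800);
run;

* For a left-censored outcome the same statement takes lb= instead,
  for example censored(lb=5) for a hypothetical detection limit of 5;
```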

At the top, the output summarizes the number of left- and right-censored values. The table titled Parameter Estimates lists the tobit regression coefficients, their standard errors, t values and associated p-values. The ancillary statistic _sigma is analogous to the standard error of estimate in OLS regression. Its value of 73.63 can be compared with the standard deviation of academic aptitude, 101.44, which is a reduction of roughly 27%. The output also reports a standard error, t value and p-value for _sigma. A statistically significant _sigma means only that 73.63 differs significantly from 0. The validity of this test is a matter of debate among statisticians, and some programs report the estimate and its standard error but not the test of statistical significance.

To get an idea of model fit, you can use the squared multiple correlation between the outcome variable (apt) and its predicted value. The predicted values are obtained by creating an output data set (which we called temp1) with the predicted option on the output statement. The proc corr and data steps that follow the model then compute the desired value.

* Fit the tobit model and save the predicted values (p_apt) in temp1;
proc qlim data=tobitex;
  model apt = read math public;
  endogenous apt ~ censored(ub=800);
  output out=temp1 predicted;
run;

* Capture the correlation table from proc corr as a data set;
ods output PearsonCorr=tobit_corr;
proc corr data=temp1 nosimple;
  var apt p_apt;
run;

* Keep the row for apt and square its correlation with p_apt;
data _null_;
  set tobit_corr;
  if upcase(variable) = "APT";
  file print;
  a = round(p_apt**2, .0001);
  put "The squared multiple correlation between apt and the predicted value is " a;
run;

The squared multiple correlation between apt and the predicted value is 0.527

Sample Write-Up of the Analysis

In the tobit regression model predicting academic aptitude from reading, math and public school, each of the predictor variables was statistically significant at the .001 level. The squared correlation between the observed and predicted academic aptitude values was 0.53, indicating that these three predictors accounted for over 50% of the variability in the outcome variable. A one-unit increase in read and in math led to increases of 3.68 and 4.56 points in predicted aptitude, respectively. Attending a public school increased predicted aptitude by 62.16 points compared with attending a private school.

These pages are Copyrighted (c) by UCLA Academic Technology Services


The content of this web site should not be construed as an endorsement of any particular web site, book, or software product by the University of California