Supplemental notes to Applied Survival Analysis
Applied Survival Analysis

Supplementary notes
Tests of Proportionality in SAS, Stata and R

When modeling a Cox proportional hazard model a key assumption is proportional hazards. There are a number of basic concepts for testing proportionality but the implementation of these concepts differ across statistical packages. The goal of this page is to illustrate how to test for proportionality in Stata, SAS and R using an example from Applied Survival Analysis by Hosmer and Lemeshow.  Data set for SAS, Stata and R (the R file is a csv file which can be read in using the read.csv function).

1. Kaplan-Meier Curves

Works best for time fixed covariates with few levels.  If the predictor satisfy the proportional hazard assumption then the graph of the survival function versus the survival time should results in a graph with parallel curves, similarly the graph of the log(-log(survival)) versus log of survival time graph should result in parallel lines if the predictor is proportional.  This method does not work well for continuous predictor or categorical predictors that have many levels because the graph becomes to "cluttered".  Furthermore, the curves are sparse when there are fewer time points and it may be difficult to gage how close to parallel is close enough. Due to space limitations we will only show the graphs for the predictor treat.

SAS
It is very easy to create the graphs in SAS using proc lifetest.  The plot option in the model statement lets you specify both the survival function versus time as well as the log(-log(survival) versus log(time).

proc lifetest data=uis plot=(s, lls) noprint;
  time time*censor(0);
  strata treat;
run; 

Stata
The sts graph command in Stata will generate the survival function versus time graph.

sts graph, by(treat)

R
The plot function applied to a survfit object will generate a graph of the survival function versus the survival time.

plot(survfit(formula = Surv(time, censor)~ treat, data = uis, conf.type="none"), 
     lty=0, xlab="Time", ylab="Survival Probability" )

2. Including Time Dependent Covariates in the Cox Model

Generate the time dependent covariates by creating interactions of the predictors and a function of survival time and include in the model.  If any of the time dependent covariates are significant then those predictors are not proportional.

SAS
In SAS it is possible to create all the time dependent variable inside proc phreg as demonstrated.  Furthermore, by using the test statement is is possibly to test all the time dependent covariates all at once.

proc phreg data=uis;
  model time*censor(0) = age race treat site agesite aget racet treatt sitet;
  aget = age*log(time);
  racet = race*log(time);
  treatt = treat*log(time);
  sitet = site*log(time);
  proportionality_test: test aget, racet, treatt, sitet;
run; 
<output omitted>

               Analysis of Maximum Likelihood Estimates

                   Parameter      Standard
Variable    DF      Estimate         Error    Chi-Square    Pr > ChiSq
age          1      -0.02932       0.03336        0.7729        0.3793
race         1      -0.99201       0.53252        3.4702        0.0625
treat        1      -0.52086       0.40742        1.6344        0.2011
site         1      -1.48805       0.67291        4.8901        0.0270
agesite      1       0.02889       0.01547        3.4881        0.0618
aget         1       0.00124       0.00693        0.0319        0.8583
racet        1       0.14554       0.10951        1.7663        0.1838
treatt       1       0.05876       0.08536        0.4739        0.4912
sitet        1       0.07357       0.09701        0.5752        0.4482

            Linear Hypotheses Testing Results

                               Wald
 Label                   Chi-Square      DF    Pr > ChiSq
 proportionality_test        3.8875       4        0.4214

Stata
We use the tvc and the texp option in the stcox command. We list the predictors that we would like to include as interaction with log(time) in the tvc option (tvc = time varying covariates).  The texp option is where we can specify the function of time that we would like used in the time dependent covariates.  By using the lrtest commands it is possible to tests all the time dependent covariates together by comparing the smaller model without any time dependent covariates to the larger model that includes all the time dependent covariates.

stcox age race treat site c.age#site , nohr nolog noshow tvc(age race treat site) ///
 texp( ln(_t) )
 
Cox regression -- Breslow method for ties

No. of subjects =          617                     Number of obs   =       617
No. of failures =          500
Time at risk    =       145294
                                                   LR chi2(9)      =     27.48
Log likelihood  =   -2890.3829                     Prob > chi2     =    0.0012

------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
main         |
         age |   -.029325    .033356    -0.88   0.379    -.0947016    .0360516
        race |  -.9920153   .5325233    -1.86   0.062    -2.035742    .0517113
       treat |  -.5208583   .4074197    -1.28   0.201    -1.319386    .2776696
        site |  -1.488051   .6729145    -2.21   0.027     -2.80694   -.1691632
             |
  site#c.age |
          1  |   .0288933   .0154704     1.87   0.062    -.0014281    .0592146
-------------+----------------------------------------------------------------
tvc          |
         age |   .0012374   .0069302     0.18   0.858    -.0123454    .0148202
        race |   .1455428   .1095113     1.33   0.184    -.0690954    .3601809
       treat |   .0587615    .085358     0.69   0.491    -.1085371    .2260601
        site |   .0735743   .0970118     0.76   0.448    -.1165654    .2637139
------------------------------------------------------------------------------
Note: variables in tvc equation interacted with ln(_t)

lrtest, saving(0)
quietly stcox age race treat site c.age#site
lrtest, using(0) 

Cox:  likelihood-ratio test                           chi2(4)     =       2.76
                                                      Prob > chi2 =     0.5988

3. Tests and Graps Based on the Schoenfeld Residuals

Testing the time dependent covariates is equivalent to testing for a non-zero slope in a generalized linear regression of the scaled Schoenfeld residuals on functions of time.  A non-zero slope is an indication of a violation of the proportional hazard assumption. As with any regression it is highly recommended that you look at the graph of the regression in addition to performing the tests of non-zero slopes.  There are certain types on non-proportionality that will not be detected by the tests of non-zero slopes alone but that might become obvious when looking at the graphs of the residuals such as nonlinear relationship (i.e. a quadratic fit) between the residuals and the function of time or undue influence of outliers.

R
First we create the coxph object by using the coxph function. To create the plots of the Schoenfeld residuals versus log(time) create a cox.zph object by applying the cox.zph function to the cox.ph object.  Then the plot function will automatically create the Schoenfeld residual plots for each of the predictors in the model including a lowess smoothing curve. The order of the residuals in the time.dep.zph object corresponds to the order in which they were entered in the coxph model. To plot one graph at a time use the bracket notation with the number corresponding to the predictor of interest.  The abline function adds a reference line at y=0 to the individual plots.

time.dep <- coxph( Surv(time, censor)~age+race+treat+ site+age:site,
            uis, method="breslow", na.action=na.exclude)
time.dep.zph <- cox.zph(table5.11, transform = 'log')
summary(time.dep.zph)
plot(time.dep.zph)

<plots have been omitted>

             rho chisq     p 
     age  0.0245 0.283 0.595
    race  0.0601 1.851 0.174
   treat  0.0346 0.597 0.440
    site  0.0355 0.587 0.444
age:site -0.0289 0.385 0.535
  GLOBAL      NA 3.069 0.689
#plots for the predictors: age and treat including the reference line at y=0 
plot(time.dep.zph[1])
abline(h=0, lty=3)

 
plot(time.dep.zph[3])
abline(h=0, lty=3)

Stata
The tests of the non-zero slope developed by Therneau and Grambsch have been implemented in Stata in the estat phtest command.  The algorithms that Stata uses are slightly different from the algorithms used by Therneau and Grambsch and therefore the results may differ slightly.  The estat phtest with the detail option will perform the tests of each predictor as well as a global test.  There are different functions of time available including the identity function, the log of survival time and the rank of the survival times.  The stphtest command with the plot option will provide the graphs with a lowess curve.  The usual graphing options can be used to include a horizontal reference line at y=0.  The Stata graphs do not include 95% confidence intervals for the lowess curves which makes it more difficult to assess how much the curves may deviate from the y=0 line.

stcox age race treat site c.age#site, nolog noshow schoenfeld(sch*) scaledsch(sca*)

estat phtest, log detail

 Test of proportional-hazards assumption

      Time:  Log(t)
      ----------------------------------------------------------------
                  |       rho            chi2       df       Prob>chi2
      ------------+---------------------------------------------------
      age         |      0.02448         0.28        1         0.5946
      race        |      0.06006         1.85        1         0.1737
      treat       |      0.03464         0.60        1         0.4398
      site        |      0.03551         0.59        1         0.4437
      0b.site#co~e|            .            .        1             .
      1.site#c.age|     -0.02890         0.38        1         0.5351
      ------------+---------------------------------------------------
      global test |                      3.07        5         0.6894
      ----------------------------------------------------------------

estat phtest, log plot(age) yline(0)



estat phtest, log plot(treat) yline(0)

How to cite this page

Report an error on this page or leave a comment

The content of this web site should not be construed as an endorsement of any particular web site, book, or software product by the University of California.