UCLA Academic Technology Services HomeServicesClassesContactJobs
Search

S-Plus Textbook Examples
Applied Survival Analysis

Chapter 4: Interpretation of Fitted Proportional Hazards Regression Models

Sections noted by R indicate differences between R and SPLUS.

SPLUS program
R program

comma delimited data sets to be used in R:
uis.csv
hmohiv.csv

To input the data sets in R use the following code which must be executed before running the R program:

>hmohiv <- read.csv("d:/hmohiv.csv", header=T)
>uis <- read.csv("d:/uis.csv", header=T)

Here are  the SPLUS data sets: hmohiv, uis.

Table 4.2, p. 119.
Note1: The coxreg function uses the Breslow method of breaking ties as the default. The coxph function uses the Efron method as the default. Since the coxph function is easier to use it will be used in preference to coxreg.
Note2: Both the coxreg and the coxph functions only supply the confidence intervals for the hazard ratio.

R-In R the coxreg function is not available.

drug.reg <- coxreg(hmohiv$time, hmohiv$censor, hmohiv$drug)
print( drug.reg)

 Alive Dead Deleted 
    20   80       0

      coef exp(coef) se(coef)    z      p 
[1,] 0.779      2.18    0.242 3.22 0.0013

     exp(coef) exp(-coef) lower .95 upper .95 
[1,]      2.18      0.459      1.36       3.5

Likelihood ratio test= 10.2  on 1 df,   p=0.00141
Efficient score test = 10.7  on 1 df,   p=0.00105

drug.ph <- coxph( Surv(time, censor)~drug,  hmohiv, method="breslow")
summary(drug.ph)

  n= 100 

      coef exp(coef) se(coef)    z      p 
drug 0.779      2.18    0.242 3.22 0.0013

     exp(coef) exp(-coef) lower .95 upper .95 
drug      2.18      0.459      1.36       3.5

Rsquare= 0.097   (max possible= 0.997 )
Likelihood ratio test= 10.2  on 1 df,   p=0.00141
Wald test            = 10.4  on 1 df,   p=0.0013
Score (logrank) test = 10.7  on 1 df,   p=0.00105

Creating the dummy variables for categories of age.

hmohiv$age2[hmohiv$age<30 | hmohiv$age>34] <- 0 
hmohiv$age2[hmohiv$age>=30 & hmohiv$age<=34] <- 1

hmohiv$age3[hmohiv$age<35 | hmohiv$age>39] <- 0 
hmohiv$age3[hmohiv$age>=35 & hmohiv$age<=39] <- 1

hmohiv$age4[hmohiv$age<40] <- 0 
hmohiv$age4[hmohiv$age>=40] <- 1 

Table 4.4, p. 122 and table 4.5, p. 123.

dummy.ph <- coxph( Surv(time, censor)~age2+age3+age4, hmohiv, method="breslow")
summary(dummy.ph) 

  n= 100 

     coef exp(coef) se(coef)    z        p 
age2 1.20      3.31    0.451 2.65 0.008000
age3 1.31      3.72    0.459 2.86 0.004200
age4 1.86      6.42    0.469 3.96 0.000074

     exp(coef) exp(-coef) lower .95 upper .95 
age2      3.31      0.302      1.37      8.01
age3      3.72      0.269      1.51      9.14
age4      6.42      0.156      2.56     16.12

Rsquare= 0.178   (max possible= 0.997 )
Likelihood ratio test= 19.6  on 3 df,   p=0.000209
Wald test            = 16.6  on 3 df,   p=0.000876
Score (logrank) test = 18.3  on 3 df,   p=0.000389

Creating deviation from mean coded variables for the categories of age.


hmohiv$age1[ hmohiv$age >= 30] <- 0 
hmohiv$age1[ hmohiv$age < 30] <- 1 

hmohiv$age2d <- hmohiv$age2 
hmohiv$age2d[hmohiv$age1 == 1] <- -1 

hmohiv$age3d <- hmohiv$age3 
hmohiv$age3d[hmohiv$age1 == 1] <- -1 

hmohiv$age4d <- hmohiv$age4 
hmohiv$age4d[hmohiv$age1 == 1] <- -1

Table 4.7, p. 127.

dummyd.ph <- coxph( Surv(time, censor)~age2d+age3d+age4d, hmohiv, method="breslow")
summary(dummyd.ph)

  n= 100 

       coef exp(coef) se(coef)     z       p 
age2d 0.104      1.11    0.192 0.543 0.59000
age3d 0.221      1.25    0.206 1.071 0.28000
age4d 0.768      2.15    0.209 3.667 0.00025

      exp(coef) exp(-coef) lower .95 upper .95 
age2d      1.11      0.901     0.762      1.62
age3d      1.25      0.802     0.833      1.87
age4d      2.15      0.464     1.430      3.25

Rsquare= 0.178   (max possible= 0.997 )
Likelihood ratio test= 19.6  on 3 df,   p=0.000209
Wald test            = 16.6  on 3 df,   p=0.000876
Score (logrank) test = 18.3  on 3 df,   p=0.000389

Table 4.8, p. 129.

age.ph <- coxph( Surv(time, censor)~age, hmohiv, method="breslow")
summary(age.ph)

  n= 100 

      coef exp(coef) se(coef)    z      p 
age 0.0814      1.08   0.0174 4.67 3e-006

    exp(coef) exp(-coef) lower .95 upper .95 
age      1.08      0.922      1.05      1.12

Rsquare= 0.192   (max possible= 0.997 )
Likelihood ratio test= 21.4  on 1 df,   p=3.82e-006
Wald test            = 21.8  on 1 df,   p=3.02e-006
Score (logrank) test = 22  on 1 df,   p=2.72e-006

Creating the variable drug in the uis data set.

uis$drug[uis$ivhx==1] <- 0 
uis$drug[uis$ivhx>1] <- 1

Table 4.9, p. 133.
Note: The estimates for the crude model is slightly different since it only excluded observations based on whether or not they were missing in the variable drug. So, there are only 18 observations that have been excluded from calculations whereas in the adjusted model there were 23 observations excluded from the calculations.

crude.ph <- coxph( Surv(time, censor)~drug, uis, method="breslow", na.action=na.exclude)
summary(crude.ph)
adjust.ph <- coxph( Surv(time, censor)~drug+age, uis, method="breslow", na.action=na.exclude)
summary(adjust.ph)


  n=610 (18 observations deleted due to missing values)

      coef exp(coef) se(coef)    z       p 
drug 0.326      1.39   0.0946 3.44 0.00057

     exp(coef) exp(-coef) lower .95 upper .95 
drug      1.39      0.722      1.15      1.67

Rsquare= 0.02   (max possible= 1 )
Likelihood ratio test= 12.2  on 1 df,   p=0.000476
Wald test            = 11.9  on 1 df,   p=0.000571
Score (logrank) test = 12  on 1 df,   p=0.00054



  n=605 (23 observations deleted due to missing values)

        coef exp(coef) se(coef)     z        p 
drug  0.4394     1.552  0.10072  4.36 0.000013
 age -0.0264     0.974  0.00784 -3.37 0.000770

     exp(coef) exp(-coef) lower .95 upper .95 
drug     1.552      0.644     1.274     1.890
 age     0.974      1.027     0.959     0.989

Rsquare= 0.038   (max possible= 1 )
Likelihood ratio test= 23.3  on 2 df,   p=8.65e-006
Wald test            = 23.1  on 2 df,   p=9.73e-006
Score (logrank) test = 23.2  on 2 df,   p=9.21e-006

Table 4.10, p. 135.
Note: again the crude model is slightly different since it only excluded observations that had missing entries for treat (and treat did not have any missing values). The two other models excluded observations that were missing for either age or treat and therefore 5 observations were excluded from the calculations.

crude.ph <- coxph( Surv(time, censor)~treat, uis, method="breslow", na.action=na.exclude)
summary(crude.ph)
adjust.ph <- coxph( Surv(time, censor)~treat+age, uis, method="breslow", na.action=na.exclude)
summary(adjust.ph)
inter.ph <- coxph( Surv(time, censor)~treat+age+treat:age, uis, method="breslow", na.action=na.exclude)
summary(inter.ph)

Call:
coxph( Surv(time, censor) ~ treat, uis, na.action = na.exclude, method = "breslow")

  n= 628 

        coef exp(coef) se(coef)    z      p 
treat -0.231     0.794    0.089 -2.6 0.0094

      exp(coef) exp(-coef) lower .95 upper .95 
treat     0.794       1.26     0.667     0.945

Rsquare= 0.011   (max possible= 1 )
Likelihood ratio test= 6.75  on 1 df,   p=0.00937
Wald test            = 6.74  on 1 df,   p=0.00941
Score (logrank) test = 6.77  on 1 df,   p=0.00926

Call:
coxph( Surv(time, censor) ~ treat + age, uis, na.action = na.exclude, method = "breslow")

  n=623 (5 observations deleted due to missing values)

         coef exp(coef) se(coef)     z     p 
treat -0.2230     0.800  0.08933 -2.50 0.013
  age -0.0133     0.987  0.00721 -1.84 0.066

      exp(coef) exp(-coef) lower .95 upper .95 
treat     0.800       1.25     0.672     0.953
  age     0.987       1.01     0.973     1.001

Rsquare= 0.015   (max possible= 1 )
Likelihood ratio test= 9.48  on 2 df,   p=0.00876
Wald test            = 9.42  on 2 df,   p=0.00903
Score (logrank) test = 9.44  on 2 df,   p=0.00892

Call:
coxph( Surv(time, censor) ~ treat + age + treat:age, uis, na.action = na.exclude, method = "breslow")

  n=623 (5 observations deleted due to missing values)

              coef exp(coef) se(coef)      z    p 
    treat  0.52188     1.685   0.4745  1.100 0.27
      age -0.00177     0.998   0.0101 -0.175 0.86
treat:age -0.02316     0.977   0.0145 -1.597 0.11

          exp(coef) exp(-coef) lower .95 upper .95 
    treat     1.685      0.593     0.665      4.27
      age     0.998      1.002     0.979      1.02
treat:age     0.977      1.023     0.950      1.01

Rsquare= 0.019   (max possible= 1 )
Likelihood ratio test= 12  on 3 df,   p=0.00724
Wald test            = 11.2  on 3 df,   p=0.0107
Score (logrank) test = 11.3  on 3 df,   p=0.0101

Fig. 4.2, p. 139.
Note: First we fit the Cox proportional hazards model with drug as the only predictor. Then we use the function survfit to calculate the baseline survivorship function for IV drug use absent (drug=0). We then calculate the survivorship function for IV drug use present (drug=1), see formula 4.25.

crude.ph <- coxph( Surv(time, censor)~drug, hmohiv, method="breslow", na.action=na.exclude)
fit <- survfit(crude.ph, conf.type="none")
fit$surv2 <- fit$surv^exp(.779)
plot( fit$time, fit$surv, xlab="Survival Time (Months)", ylab="Survival Probability", 
     ylim=c(0, 1.0), type="s")
points(fit$time, fit$surv2,  type="s")

Creating the centered variable for age called agec from hmohiv data set.

hmohiv$agec <- hmohiv$age - 35

Table 4.12, p. 143.

full.model <- coxph( Surv(time, censor)~agec+drug, hmohiv, na.action=na.exclude, method="breslow")
print(full.model)

Call:
coxph( Surv(time, censor) ~ agec + drug, hmohiv, na.action = na.exclude, method = "breslow")


       coef exp(coef) se(coef)    z        p 
agec 0.0915      1.10   0.0185 4.95 7.4e-007
drug 0.9414      2.56   0.2555 3.68 2.3e-004

Likelihood ratio test=35  on 2 df, p=2.53e-008  n= 100 

Creating centered age and ndrugtx variables.

uis$agec <- uis$age - 30
uis$ndrugtxc <- uis$ndrugtx - 3

Table 4.13, p. 148.

full.model <- coxph( Surv(time, censor)~treat+agec+drug+ndrugtxc, uis, 
              na.action=na.exclude, method="breslow")
print(full.model)

            coef exp(coef) se(coef)     z       p 
   treat -0.2271     0.797  0.09158 -2.48 0.01300
    agec -0.0307     0.970  0.00794 -3.87 0.00011
    drug  0.3426     1.409  0.10426  3.29 0.00100
ndrugtxc  0.0309     1.031  0.00799  3.87 0.00011

Likelihood ratio test=41.1  on 4 df, p=2.57e-008  n=593 (35 observations deleted due to missing values)

Fig. 4.6, p. 148.
Note: The coefficient of treat is to generate the survivorship function for the patients that were in the treatment group (treat=1).

fit <- survfit(full.model, conf.type="none")

fit$surv2 <- fit$surv^exp(-0.2271)
plot( fit$time, fit$surv, xlab="Time to Drug Use From Admission (Days)", 
      ylab="Survival Probability", ylim=c(0, 1.0), xlim=c(0, 1000), type="s")
points(fit$time, fit$surv2,  type="s") 


How to cite this page

Report an error on this page

UCLA Researchers are invited to our Statistical Consulting Services
We recommend others to our list of Other Resources for Statistical Computing Help
These pages are Copyrighted (c) by UCLA Academic Technology Services


The content of this web site should not be construed as an endorsement of any particular web site, book, or software product by the University of California