|
|
|
||||
|
|
|||||
S-Plus Textbook Examples
Applied Survival Analysis
Chapter 4: Interpretation
of Fitted Proportional Hazards Regression Models
Sections noted by R indicate differences between R and SPLUS.
SPLUS program
R program
comma delimited data sets to be used in R:
uis.csv
hmohiv.csv
To input the data sets in R use the following code which must be executed before running the R program:
>hmohiv <- read.csv("d:/hmohiv.csv", header=T)
>uis <- read.csv("d:/uis.csv", header=T)
Here are the SPLUS data sets: hmohiv, uis.
Table 4.2, p. 119.
Note1: The coxreg function uses the Breslow method of breaking ties as the default. The coxph function uses the Efron method as the default. Since the coxph function is easier to use it will be used in preference to coxreg.
Note2: Both the coxreg and the coxph functions only supply the confidence intervals for the hazard ratio.
R-In R the coxreg function is not available.
drug.reg <- coxreg(hmohiv$time, hmohiv$censor, hmohiv$drug)
print( drug.reg)
Alive Dead Deleted
20 80 0
coef exp(coef) se(coef) z p
[1,] 0.779 2.18 0.242 3.22 0.0013
exp(coef) exp(-coef) lower .95 upper .95
[1,] 2.18 0.459 1.36 3.5
Likelihood ratio test= 10.2 on 1 df, p=0.00141
Efficient score test = 10.7 on 1 df, p=0.00105
drug.ph <- coxph( Surv(time, censor)~drug, hmohiv, method="breslow")
summary(drug.ph)
n= 100
coef exp(coef) se(coef) z p
drug 0.779 2.18 0.242 3.22 0.0013
exp(coef) exp(-coef) lower .95 upper .95
drug 2.18 0.459 1.36 3.5
Rsquare= 0.097 (max possible= 0.997 )
Likelihood ratio test= 10.2 on 1 df, p=0.00141
Wald test = 10.4 on 1 df, p=0.0013
Score (logrank) test = 10.7 on 1 df, p=0.00105
Creating the dummy variables for categories of age.
hmohiv$age2[hmohiv$age<30 | hmohiv$age>34] <- 0 hmohiv$age2[hmohiv$age>=30 & hmohiv$age<=34] <- 1 hmohiv$age3[hmohiv$age<35 | hmohiv$age>39] <- 0 hmohiv$age3[hmohiv$age>=35 & hmohiv$age<=39] <- 1 hmohiv$age4[hmohiv$age<40] <- 0 hmohiv$age4[hmohiv$age>=40] <- 1
Table 4.4, p. 122 and table 4.5, p. 123.
dummy.ph <- coxph( Surv(time, censor)~age2+age3+age4, hmohiv, method="breslow")
summary(dummy.ph)
n= 100
coef exp(coef) se(coef) z p
age2 1.20 3.31 0.451 2.65 0.008000
age3 1.31 3.72 0.459 2.86 0.004200
age4 1.86 6.42 0.469 3.96 0.000074
exp(coef) exp(-coef) lower .95 upper .95
age2 3.31 0.302 1.37 8.01
age3 3.72 0.269 1.51 9.14
age4 6.42 0.156 2.56 16.12
Rsquare= 0.178 (max possible= 0.997 )
Likelihood ratio test= 19.6 on 3 df, p=0.000209
Wald test = 16.6 on 3 df, p=0.000876
Score (logrank) test = 18.3 on 3 df, p=0.000389
Creating deviation from mean coded variables for the categories of age.
hmohiv$age1[ hmohiv$age >= 30] <- 0 hmohiv$age1[ hmohiv$age < 30] <- 1 hmohiv$age2d <- hmohiv$age2 hmohiv$age2d[hmohiv$age1 == 1] <- -1 hmohiv$age3d <- hmohiv$age3 hmohiv$age3d[hmohiv$age1 == 1] <- -1 hmohiv$age4d <- hmohiv$age4 hmohiv$age4d[hmohiv$age1 == 1] <- -1
Table 4.7, p. 127.
dummyd.ph <- coxph( Surv(time, censor)~age2d+age3d+age4d, hmohiv, method="breslow")
summary(dummyd.ph)
n= 100
coef exp(coef) se(coef) z p
age2d 0.104 1.11 0.192 0.543 0.59000
age3d 0.221 1.25 0.206 1.071 0.28000
age4d 0.768 2.15 0.209 3.667 0.00025
exp(coef) exp(-coef) lower .95 upper .95
age2d 1.11 0.901 0.762 1.62
age3d 1.25 0.802 0.833 1.87
age4d 2.15 0.464 1.430 3.25
Rsquare= 0.178 (max possible= 0.997 )
Likelihood ratio test= 19.6 on 3 df, p=0.000209
Wald test = 16.6 on 3 df, p=0.000876
Score (logrank) test = 18.3 on 3 df, p=0.000389
Table 4.8, p. 129.
age.ph <- coxph( Surv(time, censor)~age, hmohiv, method="breslow")
summary(age.ph)
n= 100
coef exp(coef) se(coef) z p
age 0.0814 1.08 0.0174 4.67 3e-006
exp(coef) exp(-coef) lower .95 upper .95
age 1.08 0.922 1.05 1.12
Rsquare= 0.192 (max possible= 0.997 )
Likelihood ratio test= 21.4 on 1 df, p=3.82e-006
Wald test = 21.8 on 1 df, p=3.02e-006
Score (logrank) test = 22 on 1 df, p=2.72e-006
Creating the variable drug in the uis data set.
uis$drug[uis$ivhx==1] <- 0 uis$drug[uis$ivhx>1] <- 1
Table 4.9, p. 133.
Note: The estimates for the crude model is slightly different since it only excluded observations based on whether or not they were missing in the variable drug. So, there are only 18 observations that have been excluded from calculations whereas in the adjusted model there were 23 observations excluded from the calculations.
crude.ph <- coxph( Surv(time, censor)~drug, uis, method="breslow", na.action=na.exclude)
summary(crude.ph)
adjust.ph <- coxph( Surv(time, censor)~drug+age, uis, method="breslow", na.action=na.exclude)
summary(adjust.ph)
n=610 (18 observations deleted due to missing values)
coef exp(coef) se(coef) z p
drug 0.326 1.39 0.0946 3.44 0.00057
exp(coef) exp(-coef) lower .95 upper .95
drug 1.39 0.722 1.15 1.67
Rsquare= 0.02 (max possible= 1 )
Likelihood ratio test= 12.2 on 1 df, p=0.000476
Wald test = 11.9 on 1 df, p=0.000571
Score (logrank) test = 12 on 1 df, p=0.00054
n=605 (23 observations deleted due to missing values)
coef exp(coef) se(coef) z p
drug 0.4394 1.552 0.10072 4.36 0.000013
age -0.0264 0.974 0.00784 -3.37 0.000770
exp(coef) exp(-coef) lower .95 upper .95
drug 1.552 0.644 1.274 1.890
age 0.974 1.027 0.959 0.989
Rsquare= 0.038 (max possible= 1 )
Likelihood ratio test= 23.3 on 2 df, p=8.65e-006
Wald test = 23.1 on 2 df, p=9.73e-006
Score (logrank) test = 23.2 on 2 df, p=9.21e-006
Table 4.10, p. 135.
Note: again the crude model is slightly different since it only excluded observations that had missing entries for treat (and treat did not have any missing values). The two other models excluded observations that were missing for either age or treat and therefore 5 observations were excluded from the calculations.
crude.ph <- coxph( Surv(time, censor)~treat, uis, method="breslow", na.action=na.exclude)
summary(crude.ph)
adjust.ph <- coxph( Surv(time, censor)~treat+age, uis, method="breslow", na.action=na.exclude)
summary(adjust.ph)
inter.ph <- coxph( Surv(time, censor)~treat+age+treat:age, uis, method="breslow", na.action=na.exclude)
summary(inter.ph)
Call:
coxph( Surv(time, censor) ~ treat, uis, na.action = na.exclude, method = "breslow")
n= 628
coef exp(coef) se(coef) z p
treat -0.231 0.794 0.089 -2.6 0.0094
exp(coef) exp(-coef) lower .95 upper .95
treat 0.794 1.26 0.667 0.945
Rsquare= 0.011 (max possible= 1 )
Likelihood ratio test= 6.75 on 1 df, p=0.00937
Wald test = 6.74 on 1 df, p=0.00941
Score (logrank) test = 6.77 on 1 df, p=0.00926
Call:
coxph( Surv(time, censor) ~ treat + age, uis, na.action = na.exclude, method = "breslow")
n=623 (5 observations deleted due to missing values)
coef exp(coef) se(coef) z p
treat -0.2230 0.800 0.08933 -2.50 0.013
age -0.0133 0.987 0.00721 -1.84 0.066
exp(coef) exp(-coef) lower .95 upper .95
treat 0.800 1.25 0.672 0.953
age 0.987 1.01 0.973 1.001
Rsquare= 0.015 (max possible= 1 )
Likelihood ratio test= 9.48 on 2 df, p=0.00876
Wald test = 9.42 on 2 df, p=0.00903
Score (logrank) test = 9.44 on 2 df, p=0.00892
Call:
coxph( Surv(time, censor) ~ treat + age + treat:age, uis, na.action = na.exclude, method = "breslow")
n=623 (5 observations deleted due to missing values)
coef exp(coef) se(coef) z p
treat 0.52188 1.685 0.4745 1.100 0.27
age -0.00177 0.998 0.0101 -0.175 0.86
treat:age -0.02316 0.977 0.0145 -1.597 0.11
exp(coef) exp(-coef) lower .95 upper .95
treat 1.685 0.593 0.665 4.27
age 0.998 1.002 0.979 1.02
treat:age 0.977 1.023 0.950 1.01
Rsquare= 0.019 (max possible= 1 )
Likelihood ratio test= 12 on 3 df, p=0.00724
Wald test = 11.2 on 3 df, p=0.0107
Score (logrank) test = 11.3 on 3 df, p=0.0101
Fig. 4.2, p. 139.
Note: First we fit the Cox proportional hazards model with drug as the only predictor. Then we use the function survfit to calculate the baseline survivorship function for IV drug use absent (drug=0). We then calculate the survivorship function for IV drug use present (drug=1), see formula 4.25.
crude.ph <- coxph( Surv(time, censor)~drug, hmohiv, method="breslow", na.action=na.exclude)
fit <- survfit(crude.ph, conf.type="none")
fit$surv2 <- fit$surv^exp(.779)
plot( fit$time, fit$surv, xlab="Survival Time (Months)", ylab="Survival Probability",
ylim=c(0, 1.0), type="s")
points(fit$time, fit$surv2, type="s")
Creating the centered variable for age called agec from hmohiv data set.
hmohiv$agec <- hmohiv$age - 35
Table 4.12, p. 143.
full.model <- coxph( Surv(time, censor)~agec+drug, hmohiv, na.action=na.exclude, method="breslow")
print(full.model)
Call:
coxph( Surv(time, censor) ~ agec + drug, hmohiv, na.action = na.exclude, method = "breslow")
coef exp(coef) se(coef) z p
agec 0.0915 1.10 0.0185 4.95 7.4e-007
drug 0.9414 2.56 0.2555 3.68 2.3e-004
Likelihood ratio test=35 on 2 df, p=2.53e-008 n= 100
Creating centered age and ndrugtx variables.
uis$agec <- uis$age - 30 uis$ndrugtxc <- uis$ndrugtx - 3
Table 4.13, p. 148.
full.model <- coxph( Surv(time, censor)~treat+agec+drug+ndrugtxc, uis,
na.action=na.exclude, method="breslow")
print(full.model)
coef exp(coef) se(coef) z p
treat -0.2271 0.797 0.09158 -2.48 0.01300
agec -0.0307 0.970 0.00794 -3.87 0.00011
drug 0.3426 1.409 0.10426 3.29 0.00100
ndrugtxc 0.0309 1.031 0.00799 3.87 0.00011
Likelihood ratio test=41.1 on 4 df, p=2.57e-008 n=593 (35 observations deleted due to missing values)
Fig. 4.6, p. 148.
Note: The coefficient of treat is to generate the survivorship function for the patients that were in the treatment group (treat=1).
fit <- survfit(full.model, conf.type="none")
fit$surv2 <- fit$surv^exp(-0.2271)
plot( fit$time, fit$surv, xlab="Time to Drug Use From Admission (Days)",
ylab="Survival Probability", ylim=c(0, 1.0), xlim=c(0, 1000), type="s")
points(fit$time, fit$surv2, type="s")
UCLA Researchers are invited to our Statistical Consulting Services
We recommend others to our list of Other Resources for Statistical Computing Help
These pages are Copyrighted (c) by UCLA Academic Technology Services