SAS Textbook Examples
Applied Survival Analysis by D. Hosmer and S. Lemeshow
Chapter 7: Extensions of the Proportional Hazards Model 

In this chapter we will be using the uis data set.
Creating the indicator, fractional polynomial, and interaction variables needed for the model specified in table 5.11, p. 213.
data uis;
  set uis;
  ivhx3 = (ivhx = 3);
  ndrugfp1 = 1/((ndrugtx+1)/10);
  ndrugfp2 = (1/((ndrugtx+1)/10))*log((ndrugtx+1)/10);
  racesite = race*site;
  agesite = age*site;
run;
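As a quick check on the two fractional polynomial terms, the arithmetic of the data step above can be sketched outside SAS. This is only an illustration in Python; the function name fp_terms is hypothetical, and the input values are made up rather than taken from the uis data.

```python
import math

def fp_terms(ndrugtx):
    """Return (ndrugfp1, ndrugfp2) using the same formulas as the data step."""
    x = (ndrugtx + 1) / 10        # adding 1 keeps x positive when ndrugtx = 0
    fp1 = 1 / x                   # x**(-1)
    fp2 = (1 / x) * math.log(x)   # x**(-1) * log(x)
    return fp1, fp2

# With 9 prior treatments x equals 1, so the terms are exactly 1 and 0.
print(fp_terms(9))
# With no prior treatments x equals 0.1, so both terms are still defined.
print(fp_terms(0))
```

Both terms are functions of the same scaled value (ndrugtx+1)/10, which is why the two variables always enter the model together.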
Table 7.1, p. 245.
Proportional hazards model stratified by site.
proc phreg data=uis;
  model time*censor(0) = age becktota ndrugfp1 ndrugfp2 ivhx3 race treat agesite racesite;
  strata site;
run;

<output omitted>
                     Analysis of Maximum Likelihood Estimates

                   Parameter      Standard                                  Hazard
Variable    DF      Estimate         Error    Chi-Square    Pr > ChiSq       Ratio
age          1      -0.04137       0.00991       17.4174        <.0001       0.959
becktota     1       0.00873       0.00497        3.0817        0.0792       1.009
ndrugfp1     1      -0.57290       0.12524       20.9256        <.0001       0.564
ndrugfp2     1      -0.21369       0.04861       19.3265        <.0001       0.808
ivhx3        1       0.23128       0.10875        4.5227        0.0334       1.260
race         1      -0.46347       0.13487       11.8099        0.0006       0.629
treat        1      -0.24850       0.09441        6.9282        0.0085       0.780
agesite      1       0.03285       0.01612        4.1551        0.0415       1.033
racesite     1       0.84667       0.24791       11.6636        0.0006       2.332
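The Hazard Ratio column is exp(Parameter Estimate). The short Python sketch below recomputes two of the ratios from the estimates printed above and, as an added illustration that is not part of the original output, combines race and racesite to get the race effect at site = 1 (since racesite = race*site).

```python
import math

# Estimates copied from the table above.
coef = {"race": -0.46347, "treat": -0.24850, "racesite": 0.84667}

def hazard_ratio(beta):
    """Hazard ratio for a one-unit increase in a covariate with coefficient beta."""
    return math.exp(beta)

hr_treat = hazard_ratio(coef["treat"])            # agrees with 0.780 in the table
hr_race_site0 = hazard_ratio(coef["race"])        # agrees with 0.629 in the table
hr_race_site1 = hazard_ratio(coef["race"] + coef["racesite"])  # race effect when site = 1

print(f"treat:          {hr_treat:.3f}")
print(f"race, site = 0: {hr_race_site0:.3f}")
print(f"race, site = 1: {hr_race_site1:.3f}")
```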
Fig. 7.1, p.247.
Graphs of the modified risk-score-adjusted stratum-specific survivorship functions for treatment.
data cov0;
  treat = 0;
  age = 0;
  becktota =  0;
  ndrugfp1 =0;
  ndrugfp2 =  0;
  ivhx3 = 0;
  race  = 0;
  agesite =  0;
  racesite =  0;
run;
proc phreg data=uis noprint;
  model time*censor(0) = age becktota ndrugfp1 ndrugfp2 ivhx3 race treat agesite racesite;
  strata site;
  baseline out = temp covariates=cov0 survival=s /method=ch nomean;
run;
data temp1;
  set temp;
  if site=0 then s00 = s**exp(-2.0079-.2486);
  if site = 1 then s01 = s**exp(-.8903-.2486);
run;
data cov1;
  treat = 1;
  age = 0;
  becktota =  0;
  ndrugfp1 =0;
  ndrugfp2 =  0;
  ivhx3 = 0;
  race  = 0;
  agesite =  0;
  racesite =  0;
run;
proc phreg data=uis noprint;
  model time*censor(0) = age becktota ndrugfp1 ndrugfp2 ivhx3 race treat agesite racesite;
  strata site;
  baseline out = temp covariates=cov1 survival=s /method=ch nomean;
run;
proc sort data=temp;
  by time;
run;

data temp2;
  set temp;
  if site=0 then s10 = s**exp(-2.0079-.2486);
  if site = 1 then s11 = s**exp(-.8903-.2486);
run;
data combo;
  set temp1 temp2;
run;
goptions reset=all;
symbol1 c=blue  h=.8 i=stepjll;
symbol2 c=red  h=.8 i=stepjll;
symbol3 c=black  h=.8 i=stepjll;
symbol4 c=green  h=.8 i=stepjll;
axis1 order=(0 to 1 by .25) label=(a=90 'Covariate Adjusted Survivorship Function');
legend1 label=none value=(height=1 font=swiss 'Treat=0, Site=0' 'Treat=0, Site=1' 
        'Treat=1, Site=0' 'Treat=1, Site=1' ) 
        position=(top right inside) mode=share cborder=black;
proc gplot data=combo;
  plot (s00 s01 s10 s11)*time / overlay vaxis=axis1 legend=legend1;
run;
quit;
Table 7.3, p. 252.
Model including time-varying covariate off_trt added to model in table 5.11.
proc phreg data=uis;
  model time*censor(0) = age becktota ndrugfp1 ndrugfp2 ivhx3 race treat site agesite 
                         racesite off_trt;
  if (time > los) then off_trt = 1; 
  else off_trt = 0;
run;

<output omitted>
                     Analysis of Maximum Likelihood Estimates

                   Parameter      Standard                                  Hazard
Variable    DF      Estimate         Error    Chi-Square    Pr > ChiSq       Ratio
age          1      -0.03788       0.01006       14.1891        0.0002       0.963
becktota     1       0.00797       0.00491        2.6328        0.1047       1.008
ndrugfp1     1      -0.60863       0.12835       22.4877        <.0001       0.544
ndrugfp2     1      -0.22558       0.04960       20.6882        <.0001       0.798
ivhx3        1       0.27467       0.10895        6.3563        0.0117       1.316
race         1      -0.51695       0.13450       14.7725        0.0001       0.596
treat        1       0.01940       0.09613        0.0407        0.8401       1.020
site         1      -0.96947       0.51587        3.5318        0.0602       0.379
agesite      1       0.03636       0.01580        5.2951        0.0214       1.037
racesite     1       0.51089       0.25690        3.9550        0.0467       1.667
off_trt      1       2.57110       0.15676      269.0111        <.0001      13.080
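The programming statements inside proc phreg are re-evaluated at each event time, with time standing for the event time currently being processed, so off_trt switches from 0 to 1 once follow-up passes the subject's length of stay. A minimal Python sketch of that rule follows; the los value and event times are hypothetical.

```python
def off_trt(t, los):
    """Return 1 once follow-up time t exceeds the length of stay los, else 0."""
    return 1 if t > los else 0

los = 100                      # hypothetical length of stay in days
event_times = [50, 100, 150]   # hypothetical event times at which the covariate is evaluated

values = [off_trt(t, los) for t in event_times]
print(values)  # [0, 0, 1]
```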
Table 7.4, p. 255.
Model with delayed entry (left truncation). The total time in the study is given by the time variable, and a subject enters the risk set only after the amount of time given by the los variable. In other words, entry is delayed by los, so the amount of time each subject is at risk in this model is time - los.
proc phreg data=uis;
  model (los, time)*censor(0) = age becktota ndrugfp1 ndrugfp2 ivhx3 race treat site agesite racesite;
run;

<output omitted>
                     Analysis of Maximum Likelihood Estimates

                   Parameter      Standard                                  Hazard
Variable    DF      Estimate         Error    Chi-Square    Pr > ChiSq       Ratio
age          1      -0.03319       0.01090        9.2714        0.0023       0.967
becktota     1       0.00455       0.00534        0.7246        0.3946       1.005
ndrugfp1     1      -0.54606       0.14272       14.6401        0.0001       0.579
ndrugfp2     1      -0.20381       0.05486       13.7995        0.0002       0.816
ivhx3        1       0.21508       0.11852        3.2931        0.0696       1.240
race         1      -0.49450       0.14244       12.0514        0.0005       0.610
treat        1       0.13957       0.10501        1.7664        0.1838       1.150
site         1      -0.96014       0.55590        2.9831        0.0841       0.383
agesite      1       0.03985       0.01711        5.4253        0.0198       1.041
racesite     1       0.22033       0.28961        0.5788        0.4468       1.246
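With the (los, time) syntax, a subject contributes to the risk set at an event time t only when los < t <= time. The Python sketch below illustrates that membership rule with three hypothetical subjects; the names and values are made up for illustration.

```python
def at_risk(t, entry, exit_time):
    """True when a subject with delayed entry is in the risk set at time t."""
    return entry < t <= exit_time

# Hypothetical subjects: name -> (los, time)
subjects = {"A": (0, 200), "B": (120, 300), "C": (50, 90)}

def risk_set(t):
    """Names of the subjects at risk at time t, in alphabetical order."""
    return sorted(name for name, (entry, exit_time) in subjects.items()
                  if at_risk(t, entry, exit_time))

print(risk_set(60))   # ['A', 'C']
print(risk_set(100))  # ['A']
print(risk_set(150))  # ['A', 'B']
```

Subject B does not count toward any risk set before day 120, which is the sense in which entry is delayed.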
Creating the data set for the interval censoring analysis, table 7.5, p. 263.
The goal is to create multiple observations per person in the data set. The number of observations depends on how long the person was in the study, and each observation represents a 6-month interval. Thus, a person who was in the study for 9 months has two observations, one for the first 6 months and another for months 6-12. If that person relapsed in month 9, the new censoring variable, censor1, equals 0 in the observation for the first interval and 1 in the observation for months 6-12. To expand the data set, the data step below uses one output statement per interval, so it writes five observations for every id. Because not everyone was in the study for more than 24 months, some of these observations are exact replicates of the person's last interval. To drop the replicates, we sort by id and interval and keep only the first observation in each id-interval group, using the first.id and first.interval variables that SAS creates when a by statement is used.
data long;
  set uis;
  month = time/ 30.4;
  /* each block sets the interval and censor1, then output writes one observation */
  if month > 0 then do;
    interval = 6;
    if month > 6 then censor1 = 0;
    else censor1 = censor;
  end;
  output;
  if month > 6 then do;
    interval = 12;
    if month > 12 then censor1 = 0;
    else censor1 = censor;
  end;
  output;
  if month > 12 then do;
    interval = 18;
    if month > 18 then censor1 = 0;
    else censor1 = censor;
  end;
  output;
  if month > 18 then do;
    interval = 24;
    if month > 24 then censor1 = 0;
    else censor1 = censor;
  end;
  output;
  /* follow-up beyond 24 months goes into a final interval labeled 30 */
  if month > 24 then do;
    interval = 30;
    censor1 = censor;
  end;
  output;
run;
proc sort data=long;
  by id interval;
run;
data long;
  set long;
  by id interval;
  if first.id or first.interval;                               
run;
proc sort data=long;
  by id;
run;
Table 7.5, p. 263.
proc print data=long noobs;
  where id in (1, 2, 3, 4, 7, 31, 5, 388);
  var id month interval censor1 age;
run;

 ID     month     interval    censor1    age
  1     6.1842        6          0        39
  1     6.1842       12          1        39
  2     0.8553        6          1        33
  3     6.8092        6          0        33
  3     6.8092       12          1        33
  4     4.7368        6          1        32
  5    18.1250        6          0        24
  5    18.1250       12          0        24
  5    18.1250       18          0        24
  5    18.1250       24          0        24
  7    15.0987        6          0        39
  7    15.0987       12          0        39
  7    15.0987       18          1        39
 31    17.2039        6          0        39
 31    17.2039       12          0        39
 31    17.2039       18          0        39
388    18.6842        6          0        43
388    18.6842       12          0        43
388    18.6842       18          0        43
388    18.6842       24          1        43
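The expansion rule can be checked by hand against the listing above. The following Python sketch mirrors the logic of the data step and the de-duplication, under the assumption that follow-up is positive. The function name expand is hypothetical, and the day counts in the examples are inferred from the printed month values (month * 30.4), not read from the uis file.

```python
def expand(time_days, censor, days_per_month=30.4):
    """Return the (interval, censor1) rows kept for one subject."""
    month = time_days / days_per_month
    rows = []
    for upper in (6, 12, 18, 24):
        if month > upper:
            rows.append((upper, 0))        # still under observation past this interval
        else:
            rows.append((upper, censor))   # follow-up ends here: carry the real status
            return rows
    rows.append((30, censor))              # follow-up beyond 24 months
    return rows

print(expand(188, 1))  # id 1: [(6, 0), (12, 1)]
print(expand(26, 1))   # id 2: [(6, 1)]
print(expand(551, 0))  # id 5: [(6, 0), (12, 0), (18, 0), (24, 0)]
```

The sketch stops at the interval in which follow-up ends instead of writing replicates and removing them afterwards, which gives the same rows as the two-step SAS approach.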
Creating the variables int1-int4 to be used in the model in table 7.6, p. 264.
data long;
  set long;
  int1 = 0;
  if interval = 6 then int1 = 1;
  int2 = 0;
  if interval = 12 then int2 = 1;
  int3 = 0;
  if interval = 18 then int3 = 1;
  int4 = 0;
  if interval ge 24 then int4 = 1;
run;
Table 7.6, p. 264.
Using proc genmod for the interval-censored model. Because censor1 is dichotomous, the appropriate distribution is binomial, and the appropriate link is the complementary log-log function. The descending option makes proc genmod model the probability that censor1 = 1 rather than the probability that censor1 = 0, which is the default. The noint option suppresses the intercept so that each of int1-int4 gets its own coefficient.
proc genmod data=long descending;
  model censor1 = age becktota ndrugfp1 ndrugfp2 ivhx3 race treat site agesite 
                         racesite  int1 int2 int3 int4/ dist=bin link=cll noint;
run;

<output omitted>
                            Analysis Of Parameter Estimates

                               Standard     Wald 95% Confidence       Chi-
Parameter    DF    Estimate       Error           Limits            Square    Pr > ChiSq

Intercept     0      0.0000      0.0000      0.0000      0.0000        .           .
age           1     -0.0398      0.0101     -0.0597     -0.0199      15.38        <.0001
becktota      1      0.0061      0.0051     -0.0039      0.0162       1.42        0.2339
ndrugfp1      1     -0.5314      0.1296     -0.7853     -0.2774      16.82        <.0001
ndrugfp2      1     -0.1969      0.0502     -0.2954     -0.0984      15.36        <.0001
ivhx3         1      0.2364      0.1119      0.0170      0.4557       4.46        0.0347
race          1     -0.4441      0.1379     -0.7143     -0.1739      10.38        0.0013
treat         1     -0.2344      0.0967     -0.4239     -0.0450       5.88        0.0153
site          1     -1.2188      0.5441     -2.2852     -0.1523       5.02        0.0251
agesite       1      0.0290      0.0164     -0.0033      0.0612       3.10        0.0782
racesite      1      0.8243      0.2536      0.3273      1.3213      10.57        0.0012
int1          1      1.8281      0.4367      0.9722      2.6840      17.52        <.0001
int2          1      1.7458      0.4505      0.8627      2.6288      15.01        0.0001
int3          1      0.9044      0.4791     -0.0347      1.8435       3.56        0.0591
int4          1     -0.8810      0.7322     -2.3161      0.5541       1.45        0.2289
Scale         0      1.0000      0.0000      1.0000      1.0000
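Under the complementary log-log link, the probability of an event in an interval is 1 - exp(-exp(eta)), where eta is the linear predictor, so adding a coefficient to eta multiplies the interval hazard -log(1 - p) by exp(coefficient). The Python sketch below illustrates this with the treat estimate from the table above; the starting probability of 0.30 is a made-up value chosen only for the example.

```python
import math

def cloglog(p):
    """Complementary log-log link: log(-log(1 - p))."""
    return math.log(-math.log(1 - p))

def inv_cloglog(eta):
    """Inverse link: event probability for linear predictor eta."""
    return 1 - math.exp(-math.exp(eta))

beta_treat = -0.2344          # treat estimate from the table above
p_short = 0.30                # hypothetical event probability with treat = 0

# Shift the linear predictor by the treat coefficient to get treat = 1.
p_long = inv_cloglog(cloglog(p_short) + beta_treat)

# Ratio of interval hazards equals exp(beta_treat).
ratio = math.log(1 - p_long) / math.log(1 - p_short)
print(f"p_long = {p_long:.4f}, hazard ratio = {ratio:.4f}")
```

This is why coefficients from the complementary log-log model can be read on the same scale as the proportional hazards coefficients earlier on this page.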
Fig. 7.2, p. 267.
Graphs of the modified risk-score-adjusted survivorship functions for the two treatments based on the fitted model in table 7.6.
data graph;
  set uis;
  month = time/ 30.41667;
  if month le 6 then interval = 6 ;
  else if month le 12 then interval = 12;
  else if month le 18 then interval = 18;
  else interval = 24;
  s0 = 0; 
  c1 = exp( -exp( 1.827) );
  c2 = c1*exp(-exp( 1.745) );
  c3 = c2*exp(-exp( .904) );
  c4 = c3*exp(-exp(  -0.816 ) );
  int1 = (interval = 6);
  int2 = (interval = 12);
  int3 = (interval = 18);
  int4 = (interval = 24);
  s0 = c1*(int1) + c2*(int2) + c3*(int3) + c4*(int4) ;
  s0_short = s0**exp( -2.027);
  s0_long = s0**exp( -2.027 -0.235);
run;
data graph;
  /* write a starting point with survival 1 at time 0 before reading the data */
  if _n_ = 1 then do;
    s0_short = 1;
    s0_long = 1;
    interval = 0;
  end;
  output;
  set graph;
run;
proc sort data=graph ;
  by interval;
run;
goptions reset=all;
symbol1 v=circle c=blue  h=.8 i=stepjll;
symbol2 v=triangle c=red  h=.8 i=stepjll;
axis1  label=(a=90 'Estimated Survivorship Function') order=(0 to 1 by .25);
axis2 order=(0, 6, 12, 18, 24) label=('Time to Relapse');
legend1 label=none value=(height=1 font=swiss 'Short Treat' 'Long Treat' ) 
        position=(top right inside) mode=share cborder=black;
proc gplot data=graph;
  plot (s0_short s0_long)*interval / overlay vaxis=axis1 legend=legend1 haxis=axis2;
run;
quit;
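The quantities c1-c4 in the data step above are cumulative products of the interval-specific conditional survival probabilities exp(-exp(coefficient)), and the two plotted curves raise that baseline to the power exp(score). The Python sketch below repeats the arithmetic with the constants that appear in the data step; the function names are hypothetical.

```python
import math

# Interval coefficients and score constants as written in the data step above.
interval_coefs = [1.827, 1.745, 0.904, -0.816]
score_short = -2.027
score_long = -2.027 - 0.235

def cumulative_baseline(coefs):
    """Running product of exp(-exp(b)) over the intervals (c1, c2, c3, c4)."""
    surv, out = 1.0, []
    for b in coefs:
        surv *= math.exp(-math.exp(b))
        out.append(surv)
    return out

def adjust(s0, score):
    """Adjusted survival: baseline survival raised to exp(score)."""
    return s0 ** math.exp(score)

c = cumulative_baseline(interval_coefs)
s_short = [adjust(s, score_short) for s in c]
s_long = [adjust(s, score_long) for s in c]

for month, a, b in zip((6, 12, 18, 24), s_short, s_long):
    print(f"month {month:2d}: short = {a:.3f}, long = {b:.3f}")
```

Because the long-treatment score is smaller, its curve lies above the short-treatment curve at every interval.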

These pages are Copyrighted (c) by UCLA Academic Technology Services