Stata Textbook Examples
Applied Longitudinal Data Analysis: Modeling Change and Event Occurrence
by Judith D. Singer and John B. Willett
Chapter 15:  Extending the Cox Regression Model

The contents of this page are still under construction; however, the page does show the use statements for obtaining the data files.
Table 15.1, page 548.
use http://www.ats.ucla.edu/stat/stata/examples/alda/data/firstcocaine, clear

generate event = ~censor
stset cokeage, failure(event)
 
     failure event:  event ~= 0 & event ~= .
obs. time interval:  (0, cokeage]
 exit on or before:  failure
 
------------------------------------------------------------------------------
     1658  total obs.
        0  exclusions
------------------------------------------------------------------------------
     1658  obs. remaining, representing
      382  failures in single record/single failure data
    56221  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =        42
  
/* Model A */
stcox birthyr earlymj earlyod, efron nohr
 
         failure _d:  event
   analysis time _t:  cokeage
 
Cox regression -- Efron method for ties
 
No. of subjects =         1658                     Number of obs   =      1658
No. of failures =          382
Time at risk    =        56221
                                                   LR chi2(3)      =    247.83
Log likelihood  =   -2638.6141                     Prob > chi2     =    0.0000
 
------------------------------------------------------------------------------
          _t |
          _d |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     birthyr |   .1550843   .0199279     7.78   0.000     .1160263    .1941422
     earlymj |   1.217073   .1640307     7.42   0.000     .8955789    1.538567
     earlyod |   .7911743   .1962008     4.03   0.000     .4066279    1.175721
------------------------------------------------------------------------------
 
display -2*e(ll) "   " -2*(e(ll)-e(df_m))
5277.2282   5283.2282
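
The two numbers printed above are the deviance (-2 times the log likelihood) and the AIC (the deviance plus twice the number of model parameters held in e(df_m)). The lines below are a minimal sketch that labels the two quantities and cross-checks the AIC with estat ic. We assume a Stata release in which estat ic is available after stcox; only its AIC column is compared here.

```stata
* Run immediately after the Model A stcox command
display "deviance = " -2*e(ll)
display "AIC      = " -2*e(ll) + 2*e(df_m)

* Cross-check: the AIC column should agree with the second display
estat ic
```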
 
/* Model B */
use http://www.ats.ucla.edu/stat/stata/examples/alda/data/firstcocaine, clear

generate event = ~censor
expand cokeage
sort id
by id: generate t = _n
generate event2 = 0
by id: replace event2 = event if _n==_N
generate usemj = 0
replace usemj = 1 if t>mjage
generate useod = 0
replace useod = 1 if t>odage
compress
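
The commands above turn the person-level file into a person-period file: expand makes cokeage copies of each record, t counts the years within a person, and event2 can equal 1 only on a person's last record. Because Stata treats a missing value as larger than any number, t>mjage is false whenever mjage is missing, so usemj stays 0 for people with no recorded age of first marijuana use (we are assuming that never-users are coded as missing). The checks below are a sketch of how the layout could be verified before running stset; the range 1/40 in the list command is arbitrary.

```stata
* Each person should contribute one record per year at risk
bysort id (t): assert _N == cokeage

* event2 should be 0 on every record except possibly the last one
bysort id (t): assert event2 == 0 if _n < _N

* Inspect the first few people (the range 1/40 is arbitrary)
list id t event2 mjage usemj odage useod in 1/40, sepby(id)
```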

stset t, fail(event2) id(id)
stcox birthyr usemj useod, nohr efron
 
         failure _d:  event2
   analysis time _t:  t
                 id:  id
 
Cox regression -- Efron method for ties
 
No. of subjects =         1658                     Number of obs   =     56221
No. of failures =          382
Time at risk    =        56221
                                                   LR chi2(3)      =    855.96
Log likelihood  =   -2334.5481                     Prob > chi2     =    0.0000
 
------------------------------------------------------------------------------
          _t |
          _d |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     birthyr |    .107414   .0214486     5.01   0.000     .0653754    .1494526
       usemj |   2.551764   .2809543     9.08   0.000     2.001104    3.102425
       useod |   1.853868   .1292125    14.35   0.000     1.600616    2.107119
------------------------------------------------------------------------------
 
display -2*e(ll) "   " -2*(e(ll)-e(df_m))
4669.0962   4675.0962
 
/* Model C */
generate soldmj = 0
replace soldmj = 1 if t>sellmjage
generate moreod = 0
replace moreod = 1 if t>sdage

stcox birthyr usemj soldmj useod moreod, nohr efron
 
         failure _d:  event2
   analysis time _t:  t
                 id:  id
 
Cox regression -- Efron method for ties
 
No. of subjects =         1658                     Number of obs   =     56221
No. of failures =          382
Time at risk    =        56221
                                                   LR chi2(5)      =    944.52
Log likelihood  =   -2290.2684                     Prob > chi2     =    0.0000

------------------------------------------------------------------------------
          _t |
          _d |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     birthyr |   .0849289   .0218326     3.89   0.000     .0421378    .1277201
       usemj |   2.459197    .283572     8.67   0.000     1.903406    3.014988
      soldmj |   .6898893   .1226253     5.63   0.000     .4495482    .9302304
       useod |   1.251102   .1565606     7.99   0.000      .944249    1.557955
      moreod |   .7603747   .1306618     5.82   0.000     .5042824    1.016467
------------------------------------------------------------------------------
 
display -2*e(ll) "   " -2*(e(ll)-e(df_m))
4580.5369   4590.5369
 
/* Model D */
stcox birthyr earlymj usemj soldmj earlyod useod moreod, nohr efron
 
         failure _d:  event2
   analysis time _t:  t
                 id:  id

Cox regression -- Efron method for ties

No. of subjects =         1658                     Number of obs   =     56221
No. of failures =          382
Time at risk    =        56221
                                                   LR chi2(7)      =    944.75
Log likelihood  =   -2290.1554                     Prob > chi2     =    0.0000
 
------------------------------------------------------------------------------
          _t |
          _d |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     birthyr |   .0835003   .0225715     3.70   0.000      .039261    .1277395
     earlymj |   .0752714   .1709014     0.44   0.660    -.2596892    .4102319
       usemj |   2.452513   .2842857     8.63   0.000     1.895323    3.009702
      soldmj |   .6788679   .1249547     5.43   0.000     .4339611    .9237747
     earlyod |  -.0802819   .2032497    -0.39   0.693    -.4786439    .3180802
       useod |    1.25428   .1572314     7.98   0.000     .9461116    1.562448
      moreod |   .7637909   .1321908     5.78   0.000     .5047017     1.02288
------------------------------------------------------------------------------
 
display -2*e(ll) "   " -2*(e(ll)-e(df_m))
4580.3108   4594.3108
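
Model C is nested within Model D, so the two deviances can be compared with a likelihood ratio test. The sketch below uses only the numbers printed above; the scalar names devC and devD are ours. Model D adds two predictors (earlymj and earlyod), which gives 2 degrees of freedom.

```stata
* Deviances copied from the Model C and Model D output above
scalar devC = 4580.5369
scalar devD = 4580.3108

* Difference in deviance and its chi-square p-value on 2 df
display "chi2(2) = " devC - devD "   p = " chi2tail(2, devC - devD)
```

The difference is about 0.23, with a p-value of roughly 0.89, which agrees with the small z statistics for earlymj and earlyod in Model D.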
Table 15.2, page 555.
We have not worked this example yet, but here is how you can get the data.
use http://www.ats.ucla.edu/stat/stata/examples/alda/data/relapse_days, clear
Table 15.3, page 560.
Note: Uses data from Table 15.1, Model C. The unstratified model is not repeated.
/* stratified model */
stcox birthyr usemj soldmj useod moreod, nohr efron strat(rural)
 
         failure _d:  event2
   analysis time _t:  t
                 id:  id
 
Stratified Cox regr. -- Efron method for ties

No. of subjects =         1658                     Number of obs   =     56221
No. of failures =          382
Time at risk    =        56221
                                                   LR chi2(5)      =    928.30
Log likelihood  =   -2135.9495                     Prob > chi2     =    0.0000
 
------------------------------------------------------------------------------
          _t |
          _d |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     birthyr |   .0853703   .0218743     3.90   0.000     .0424974    .1282432
       usemj |   2.457945   .2837022     8.66   0.000     1.901899    3.013991
      soldmj |   .6847278   .1228423     5.57   0.000     .4439613    .9254944
       useod |   1.251934   .1566545     7.99   0.000     .9448965    1.558971
      moreod |   .7468379   .1312596     5.69   0.000     .4895738    1.004102
------------------------------------------------------------------------------
                                                           Stratified by rural
 
display -2*e(ll)
4271.899 
 
/* nonrural model */
stcox birthyr usemj soldmj useod moreod if ~rural, nohr efron
 
         failure _d:  event2
   analysis time _t:  t
                 id:  id
 
Cox regression -- Efron method for ties
 
No. of subjects =         1316                     Number of obs   =     44333
No. of failures =          328
Time at risk    =        44333
                                                   LR chi2(5)      =    776.89
Log likelihood  =   -1904.6792                     Prob > chi2     =    0.0000
 
------------------------------------------------------------------------------
          _t |
          _d |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     birthyr |   .0812738   .0235786     3.45   0.000     .0350607     .127487
       usemj |   2.436957   .3154511     7.73   0.000     1.818684     3.05523
      soldmj |   .7151351   .1312688     5.45   0.000      .457853    .9724171
       useod |   1.272721   .1715663     7.42   0.000      .936457    1.608985
      moreod |   .6924939   .1410193     4.91   0.000      .416101    .9688867
------------------------------------------------------------------------------
 
display -2*e(ll)
3809.3584 
 
/* rural model */
stcox birthyr usemj soldmj useod moreod if rural, nohr efron

         failure _d:  event2
   analysis time _t:  t
                 id:  id
 
Cox regression -- Efron method for ties
 
No. of subjects =          342                     Number of obs   =     11888
No. of failures =           54
Time at risk    =        11888
                                                   LR chi2(5)      =    153.00
Log likelihood  =   -230.47825                     Prob > chi2     =    0.0000
 
------------------------------------------------------------------------------
          _t |
          _d |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     birthyr |   .1097733   .0584298     1.88   0.060    -.0047469    .2242935
       usemj |   2.517957   .6487525     3.88   0.000     1.246425    3.789488
      soldmj |   .4541637    .352952     1.29   0.198    -.2376095    1.145937
       useod |   1.145638   .3842576     2.98   0.003     .3925069    1.898769
      moreod |   1.105014   .3523088     3.14   0.002     .4145017    1.795527
------------------------------------------------------------------------------
 
display -2*e(ll)
460.9565
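
One use of these three deviances is to ask whether the coefficients differ between rural and nonrural respondents. The stratified model constrains the five coefficients to be equal across strata, while the two separate models let them differ, so the comparison has 5 degrees of freedom. This is a sketch based only on the numbers printed above; the scalar names are ours.

```stata
* Deviances copied from the three models above
scalar devStrat = 4271.899
scalar devNonrural = 3809.3584
scalar devRural = 460.9565

* Stratified model versus the two separate models combined
scalar diff = devStrat - (devNonrural + devRural)
display "chi2(5) = " diff "   p = " chi2tail(5, diff)
```

The difference is about 1.58, far below the .05 critical value of 11.07 for 5 degrees of freedom.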

Table 15.4, page 566. Notice that the data set contains a duplicated id. This causes a problem when we run stset with the id() option. We assume that this is a data entry error and recode the second occurrence to an unused id value.

use http://www.ats.ucla.edu/stat/stata/examples/alda/data/lengthofstay, clear
duplicates list id
Duplicates in terms of id
  +------------+
  | obs:    id |
  |------------|
  |   86   845 |
  |   87   845 |
  +------------+
replace id = 80000 if _n==87
(1 real change made)
stset days, failure(censor==0) id(id)
                id:  id
     failure event:  censor == 0
obs. time interval:  (days[_n-1], days]
 exit on or before:  failure
------------------------------------------------------------------------------
      174  total obs.
        0  exclusions
------------------------------------------------------------------------------
      174  obs. remaining, representing
      174  subjects
      172  failures in single failure-per-subject data
     4938  total analysis time at risk, at risk from t =         0
                             earliest observed entry t =         0
                                  last observed exit t =       100

Model A:

stcox treat, nohr efron
Cox regression -- Efron method for ties
No. of subjects =          174                     Number of obs   =       174
No. of failures =          172
Time at risk    =         4938
                                                   LR chi2(1)      =      0.89
Log likelihood  =   -718.31392                     Prob > chi2     =    0.3449
------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       treat |   .1457002   .1541486     0.95   0.345    -.1564254    .4478259
------------------------------------------------------------------------------

Model B:

stsplit new, every(1)
(4764 observations (episodes) created)
gen tt1 = treat*(_t-1)
stcox treat tt1, nohr efron
Cox regression -- Efron method for ties
No. of subjects =          174                     Number of obs   =      4938
No. of failures =          172
Time at risk    =         4938
                                                   LR chi2(2)      =      6.15
Log likelihood  =   -715.68705                     Prob > chi2     =    0.0463
------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       treat |   .7064112   .2924036     2.42   0.016     .1333106    1.279512
         tt1 |  -.0208327   .0092073    -2.26   0.024    -.0388786   -.0027868
------------------------------------------------------------------------------
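
Because tt1 = treat*(_t-1), the effect of treat on the log hazard at day t in Model B is _b[treat] + (t-1)*_b[tt1], so the coefficient for treat alone is the effect on day 1. A minimal sketch of how lincom could report the effect on other days follows; days 7 and 30 are arbitrary choices for illustration.

```stata
* Run immediately after: stcox treat tt1, nohr efron

* Effect on day 1 (just the coefficient for treat)
lincom treat

* Effect on day 7: t - 1 = 6
lincom treat + 6*tt1

* Effect on day 30, reported as a hazard ratio
lincom treat + 29*tt1, hr
```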

Model C:

recode _t (min/7=1) (8/14=2) (15/21=3) (22/28=4) (29/35=5) (36/max=6), gen(catt)
(4764 differences between _t and catt)
tab catt, gen(trt)
  RECODE of |
         _t |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |      1,169       23.67       23.67
          2 |      1,060       21.47       45.14
          3 |        893       18.08       63.22
          4 |        652       13.20       76.43
          5 |        386        7.82       84.24
          6 |        778       15.76      100.00
------------+-----------------------------------
      Total |      4,938      100.00
foreach X of numlist 1/6 {
  replace trt`X' = trt`X'*treat
}
(616 real changes made)
(585 real changes made)
(506 real changes made)
(398 real changes made)
(209 real changes made)
(357 real changes made)
stcox trt1 - trt6, nohr efron
Cox regression -- Efron method for ties
No. of subjects =          174                     Number of obs   =      4938
No. of failures =          172
Time at risk    =         4938
                                                   LR chi2(6)      =     19.74
Log likelihood  =   -708.89176                     Prob > chi2     =    0.0031
------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        trt1 |   1.571139   .6406079     2.45   0.014     .3155702    2.826707
        trt2 |   .5677856   .4928543     1.15   0.249     -.398191    1.533762
        trt3 |   .8497044   .3620698     2.35   0.019     .1400607    1.559348
        trt4 |  -.3498585   .3641467    -0.96   0.337    -1.063573    .3638559
        trt5 |  -.7696889   .4159832    -1.85   0.064    -1.585001    .0456231
        trt6 |   -.069058   .3144735    -0.22   0.826    -.6854147    .5472987
------------------------------------------------------------------------------

Model D:

gen t_lgtrt = treat*log(_t)/log(2)
stcox treat t_lgtrt, nolog nohr efron
         failure _d:  censor == 0
   analysis time _t:  days
                 id:  id
Cox regression -- Efron method for ties
No. of subjects =          174                     Number of obs   =      4938
No. of failures =          172
Time at risk    =         4938
                                                   LR chi2(2)      =     14.46
Log likelihood  =   -711.53087                     Prob > chi2     =    0.0007
------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       treat |   2.533511   .7603294     3.33   0.001     1.043292    4.023729
     t_lgtrt |  -.5301232   .1618843    -3.27   0.001    -.8474105   -.2128358
------------------------------------------------------------------------------ 
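
In Model D the interaction uses log base 2 of time, so the effect of treat on the log hazard at day t is _b[treat] + _b[t_lgtrt]*log2(t), and each doubling of time changes the effect by _b[t_lgtrt]. The sketch below evaluates this on two days and finds the day on which the fitted effect crosses zero; day 8 is chosen only because log2(8) = 3.

```stata
* Run immediately after: stcox treat t_lgtrt, nolog nohr efron

* Effect on day 1, where log2(1) = 0
display "effect at day 1 = " _b[treat]

* Effect on day 8, where log2(8) = 3
display "effect at day 8 = " _b[treat] + 3*_b[t_lgtrt]

* Day on which the fitted effect equals zero
display "crossover day   = " 2^(-_b[treat]/_b[t_lgtrt])
```

With the coefficients shown above, the crossover falls between days 27 and 28.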

Figure 15.4, Page 573

Top panel:

stcox treat, nohr strata(treat) basechazard(H0) efron
Stratified Cox regr. -- Efron method for ties
No. of subjects =          174                     Number of obs   =      4938
No. of failures =          172
Time at risk    =         4938
                                                   LR chi2(1)      =      0.00
Log likelihood  =    -602.5695                     Prob > chi2     =    1.0000
------------------------------------------------------------------------------
          _t |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       treat |  (dropped)
------------------------------------------------------------------------------
                                                           Stratified by treat
gen logH0 = log(H0)
(528 missing values generated)
separate logH0, by(treat) gen(y)
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
y0              float  %9.0g                  logH0, treat == 0
y1              float  %9.0g                  logH0, treat == 1
line y1 y0 _t if _t<=77, sort ylab(-4(1) 2) xlab(0(7) 77)
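
The basechazard() option on stcox used above is older syntax. In later releases of Stata (we believe from version 11 on), the baseline cumulative hazard is obtained with predict after estimation. The lines below are a sketch of what would replace the stcox command at the top of this panel in such a release; they have not been run against these data.

```stata
* Sketch for newer Stata releases; use in place of the stcox line above
stcox treat, nohr strata(treat) efron

* Baseline cumulative hazard within each stratum
predict H0, basechazard
```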

Bottom panel, continuing from the steps for the top panel:

keep _t treat y0 y1
save whole, replace
file whole.dta saved
drop if treat ==0
(412 observations deleted)
drop y0
sort _t
save treat1, replace
file treat1.dta saved

use whole, clear
drop if treat==1
(358 observations deleted)
drop y1
sort _t
save treat0, replace
file treat0.dta saved

merge _t using treat1
gen diff= y1-y0
(37 missing values generated)
line diff _t if _t<=77, sort ylab(-.25(.25) 1.75) xlab(0(7) 77)



Table 15.8, Page 601

We have not worked this example yet, but here is how you can get the data.

use http://www.ats.ucla.edu/stat/stata/examples/alda/data/doctors, clear

Table 15.9, Page 604

We have not worked this example yet, but here is how you can get the data.

use http://www.ats.ucla.edu/stat/stata/examples/alda/data/monkeys, clear


These pages are Copyrighted (c) by UCLA Academic Technology Services