These data are from a 1996 study (Gregoire, Kumar, Everitt, Henderson & Studd) on the efficacy of estrogen patches in treating postnatal depression. Women were randomly assigned to either a placebo control group (group=0, n=27) or an estrogen patch group (group=1, n=34). Prior to the first treatment all patients took the Edinburgh Postnatal Depression Scale (EPDS). EPDS data were collected monthly for six months once the treatment began. Higher scores on the EPDS indicate higher levels of depression.
Before reading in the data we will need to change the size of the largest matrix that Stata can use. We need to do this because one of the analyses requires a large number of coded variables:
set matsize 160
use http://www.ats.ucla.edu/stat/stata/library/depress, clear
sort group
by group: summarize pre dep1 dep2 dep3 dep4 dep5 dep6
-> group= 0
Variable | Obs Mean Std. Dev. Min Max
---------+-----------------------------------------------------
pre | 27 20.77778 3.954874 15 28
dep1 | 27 16.48148 5.279644 7 26
dep2 | 22 15.88818 6.124177 4 27
dep3 | 17 14.12882 4.974648 4.19 22
dep4 | 17 12.27471 5.848791 2 23
dep5 | 17 11.40294 4.438702 3.03 18
dep6 | 17 10.89588 4.68157 3.45 20
-> group= 1
Variable | Obs Mean Std. Dev. Min Max
---------+-----------------------------------------------------
pre | 34 21.24882 3.574432 15 28
dep1 | 34 13.36794 5.556373 1 27
dep2 | 31 11.73677 6.575079 1 27
dep3 | 29 9.134138 5.475564 1 24
dep4 | 28 8.827857 4.666653 0 22
dep5 | 28 7.309286 5.740988 0 24
dep6 | 28 6.590714 4.730158 1 23
corr pre dep1 dep2 dep3 dep4 dep5 dep6
(obs=45)
| pre dep1 dep2 dep3 dep4 dep5 dep6
---------+---------------------------------------------------------------
pre | 1.0000
dep1 | 0.1922 1.0000
dep2 | 0.3904 0.4982 1.0000
dep3 | 0.3958 0.5258 0.8672 1.0000
dep4 | 0.1658 0.3933 0.7357 0.7831 1.0000
dep5 | 0.2848 0.3674 0.7500 0.8520 0.8449 1.0000
dep6 | 0.2688 0.2795 0.6900 0.7967 0.7894 0.9014 1.0000
graph matrix dep1 dep2 dep3 dep4 dep5 dep6, half
Let's check to see if the groups differ on the pretest depression score:
ttest pre, by(group)
Two-sample t test with equal variances
------------------------------------------------------------------------------
Group | Obs Mean Std. Err. Std. Dev. [95% Conf. Interval]
---------+--------------------------------------------------------------------
0 | 27 20.77778 .7611158 3.954874 19.21328 22.34227
1 | 34 21.24882 .61301 3.574432 20.00165 22.496
---------+--------------------------------------------------------------------
combined | 61 21.04033 .476678 3.722975 20.08683 21.99383
---------+--------------------------------------------------------------------
diff | -.4710457 .9658499 -2.403707 1.461615
------------------------------------------------------------------------------
Degrees of freedom: 59
Ho: mean(0) - mean(1) = diff = 0
Ha: diff < 0 Ha: diff ~= 0 Ha: diff > 0
t = -0.4877 t = -0.4877 t = -0.4877
P < t = 0.3138 P > |t| = 0.6276 P > t = 0.6862
There isn't much of a difference between groups on the pretest so let's continue on to the panel data analysis.
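As a check on the t test output, the statistic can be rebuilt from the group summaries alone. The following is a minimal sketch in Python (not Stata) that uses only the means, standard deviations, and group sizes printed above and assumes equal variances, as the ttest header states.

```python
import math

# Summary statistics copied from the "by group: summarize" output for pre.
n0, mean0, sd0 = 27, 20.77778, 3.954874   # placebo (group=0)
n1, mean1, sd1 = 34, 21.24882, 3.574432   # estrogen patch (group=1)

# Pooled variance under the equal-variance assumption.
df = n0 + n1 - 2
pooled_var = ((n0 - 1) * sd0**2 + (n1 - 1) * sd1**2) / df

# Standard error of the difference in means, then the t statistic.
se_diff = math.sqrt(pooled_var * (1 / n0 + 1 / n1))
t_stat = (mean0 - mean1) / se_diff

print(df, round(se_diff, 4), round(t_stat, 4))
```

The degrees of freedom, standard error of the difference, and t value should agree with the Stata table up to rounding of the inputs.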
In order to use these data for our panel data analysis, the data must be reorganized into the long form using the reshape command.
reshape long dep, i(subj) j(visit)
(note: j = 1 2 3 4 5 6)
Data wide -> long
-----------------------------------------------------------------------------
Number of obs. 61 -> 366
Number of variables 9 -> 5
j variable (6 values) -> visit
xij variables:
dep1 dep2 ... dep6 -> dep
-----------------------------------------------------------------------------
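To make the wide-to-long reorganization concrete, here is a small Python sketch of the same idea. The two subjects and their scores are made up for illustration, and reshape_long is a hypothetical helper, not a Stata or library function. Each subject row with dep1 through dep6 becomes six rows with a visit index and a single dep column.

```python
# Two made-up subjects in wide form (illustrative values, not study data).
wide = [
    {"subj": 1, "group": 0, "pre": 18, "dep1": 17, "dep2": 18, "dep3": 15,
     "dep4": 17, "dep5": 14, "dep6": 15},
    {"subj": 2, "group": 1, "pre": 25, "dep1": 26, "dep2": 23, "dep3": None,
     "dep4": None, "dep5": None, "dep6": None},
]

def reshape_long(rows, stub="dep", visits=range(1, 7)):
    """Turn one row per subject into one row per subject-visit."""
    long_rows = []
    for row in rows:
        for visit in visits:
            long_rows.append({
                "subj": row["subj"],
                "group": row["group"],
                "pre": row["pre"],
                "visit": visit,
                stub: row[f"{stub}{visit}"],  # dep1..dep6 collapse into dep
            })
    return long_rows

long_rows = reshape_long(wide)
print(len(long_rows), sorted(long_rows[0]))
```

The counts mirror the reshape report: 9 variables become 5, and the number of rows is multiplied by 6 (61 subjects give 366 rows), with missing visits kept as missing values.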
Before we begin the panel data analyses let's look at some other analyses for comparison. We will begin with a repeated measures analysis of variance. This is the analysis that requires the larger matrix size.
anova dep group / subj|group visit group*visit /, repeated(visit)
Number of obs = 295 R-squared = 0.7699
Root MSE = 3.39594 Adj R-squared = 0.6980
Source | Partial SS df MS F Prob > F
------------+----------------------------------------------------
Model | 8643.81572 70 123.483082 10.71 0.0000
|
group | 548.494938 1 548.494938 5.60 0.0212
subj|group | 5775.54143 59 97.8905328
------------+----------------------------------------------------
visit | 1050.05444 5 210.010889 18.21 0.0000
group*visit | 19.3028953 5 3.86057906 0.33 0.8916
|
Residual | 2583.26536 224 11.5324346
------------+----------------------------------------------------
Total | 11227.0811 294 38.1873506
Between-subjects error term: subj|group
Levels: 61 (59 df)
Lowest b.s.e. variable: subj
Covariance pooled over: group (for repeated variable)
Repeated variable: visit
Huynh-Feldt epsilon = 0.5930
Greenhouse-Geisser epsilon = 0.5532
Box's conservative epsilon = 0.2000
------------ Prob > F ------------
Source | df F Regular H-F G-G Box
------------+----------------------------------------------------
visit | 5 18.21 0.0000 0.0000 0.0000 0.0001
group*visit | 5 0.33 0.8916 0.7979 0.7840 0.5658
Residual | 224
------------+----------------------------------------------------
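The F ratios in the anova table use two different error terms: group is tested against subj|group, while visit and group*visit are tested against the residual. A short Python check, using only the mean squares printed above, illustrates this. The formula 1/(k - 1) for Box's conservative epsilon is the standard lower bound and is stated here as background, not taken from the output itself.

```python
# Mean squares copied from the anova table.
ms_group = 548.494938
ms_subj_within_group = 97.8905328   # between-subjects error term
ms_visit = 210.010889
ms_group_by_visit = 3.86057906
ms_residual = 11.5324346            # within-subjects error term

# Between-subjects effect is tested against subj|group.
f_group = ms_group / ms_subj_within_group

# Within-subjects effects are tested against the residual.
f_visit = ms_visit / ms_residual
f_interaction = ms_group_by_visit / ms_residual

# Box's conservative epsilon: lower bound 1 / (k - 1) for k repeated levels.
k = 6
box_epsilon = 1 / (k - 1)

print(round(f_group, 2), round(f_visit, 2), round(f_interaction, 2), box_epsilon)
```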
matrix list e(Srep)
symmetric e(Srep)[6,6]
c1 c2 c3 c4 c5 c6
r1 31.361171
r2 15.71989 38.927914
r3 13.555927 28.365674 27.90249
r4 9.4625252 22.74371 20.519069 26.403025
r5 8.6149335 23.887935 23.161248 22.47211 28.026157
r6 4.6830378 19.242424 18.721233 18.46616 22.103924 22.204237
This analysis indicates that both group and visit are significant while the group*visit interaction is not. Some researchers are critical of this type of analysis since it is based on fixed-effects adjusted for the repeated factor. Also, this repeated measures analysis assumes compound symmetry in the covariance matrix (which seems to be a stretch in this case). However, we can do worse. The next several analyses are not meant to answer the research question but to show relationships among several different commands in Stata.
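One way to see why compound symmetry looks like a stretch is to convert the pooled covariance matrix e(Srep) into correlations. Under compound symmetry the variances would be equal and all off-diagonal correlations would share one value. The Python sketch below uses the matrix entries listed above; cov_to_corr is a hypothetical helper written for this illustration.

```python
import math

# Lower triangle of e(Srep), copied from the matrix list output.
lower = [
    [31.361171],
    [15.71989, 38.927914],
    [13.555927, 28.365674, 27.90249],
    [9.4625252, 22.74371, 20.519069, 26.403025],
    [8.6149335, 23.887935, 23.161248, 22.47211, 28.026157],
    [4.6830378, 19.242424, 18.721233, 18.46616, 22.103924, 22.204237],
]

def cov_to_corr(lower_tri):
    """Scale each covariance by the two standard deviations involved."""
    sd = [math.sqrt(row[i]) for i, row in enumerate(lower_tri)]
    return [[row[j] / (sd[i] * sd[j]) for j in range(len(row))]
            for i, row in enumerate(lower_tri)]

corr = cov_to_corr(lower)
off_diag = [corr[i][j] for i in range(6) for j in range(i)]
print(round(min(off_diag), 3), round(max(off_diag), 3))
```

The smallest and largest off-diagonal correlations are far apart, and correlations shrink as visits get further apart, which is not what a single common correlation would produce.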
regress dep pre group visit
Source | SS df MS Number of obs = 295
---------+------------------------------ F( 3, 291) = 48.05
Model | 3719.12931 3 1239.70977 Prob > F = 0.0000
Residual | 7507.95176 291 25.8005215 R-squared = 0.3313
---------+------------------------------ Adj R-squared = 0.3244
Total | 11227.0811 294 38.1873506 Root MSE = 5.0794
------------------------------------------------------------------------------
dep | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
pre | .4769071 .0798565 5.972 0.000 .3197376 .6340767
group | -4.290664 .6072954 -7.065 0.000 -5.485912 -3.095416
visit | -1.307841 .169842 -7.700 0.000 -1.642116 -.9735667
_cons | 8.233577 1.803945 4.564 0.000 4.683143 11.78401
------------------------------------------------------------------------------
glm dep pre group visit, fam(gaus) link(iden)
Iteration 1 : deviance = 7507.9518
Residual df = 291 No. of obs = 295
Pearson X2 = 7507.952 Deviance = 7507.952
Dispersion = 25.80052 Dispersion = 25.80052
Gaussian (normal) distribution, identity link
------------------------------------------------------------------------------
dep | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
pre | .4769071 .0798565 5.972 0.000 .3197376 .6340767
group | -4.290664 .6072954 -7.065 0.000 -5.485912 -3.095416
visit | -1.307841 .169842 -7.700 0.000 -1.642116 -.9735667
_cons | 8.233577 1.803945 4.564 0.000 4.683143 11.78401
------------------------------------------------------------------------------
(Model is ordinary regression, use regress instead)
We are finally ready to try the panel data analysis using Stata's xtgee command. xtgee allows us to specify various working covariance structures through the corr option. We will start with a covariance structure of independence. We don't believe that this is the correct covariance structure, but it allows us to compare results with the OLS regression and the glm results above. The estat wcorrelations command (which we will abbreviate as estat wcorr) will allow us to view the working correlation matrix.
xtgee dep pre group visit, fam(gaus) link(iden) i(subj) t(visit) corr(ind)
Iteration 1: tolerance = 3.270e-15
GEE population-averaged model Number of obs = 295
Group variable: subj Number of groups = 61
Link: identity Obs per group: min = 1
Family: Gaussian avg = 4.8
Correlation: independent max = 6
Wald chi2(3) = 146.13
Scale parameter: 25.45068 Prob > chi2 = 0.0000
Pearson chi2(295): 7507.95 Deviance = 7507.95
Dispersion (Pearson): 25.45068 Dispersion = 25.45068
------------------------------------------------------------------------------
dep | Coef. Std. Err. z P>|z| [95% Conf. Interval]
---------+--------------------------------------------------------------------
pre | .4769071 .0793133 6.013 0.000 .321456 .6323582
group | -4.290664 .6031641 -7.114 0.000 -5.472844 -3.108484
visit | -1.307841 .1686866 -7.753 0.000 -1.638461 -.9772215
_cons | 8.233577 1.791673 4.595 0.000 4.721962 11.74519
------------------------------------------------------------------------------
estat wcorr
Estimated within-subj correlation matrix R:
c1 c2 c3 c4 c5 c6
r1 1.0000
r2 0.0000 1.0000
r3 0.0000 0.0000 1.0000
r4 0.0000 0.0000 0.0000 1.0000
r5 0.0000 0.0000 0.0000 0.0000 1.0000
r6 0.0000 0.0000 0.0000 0.0000 0.0000 1.0000
The three previous analyses yielded identical coefficient estimates (and nearly identical standard errors) but probably incorrect results. The common thread among them is that they all assume that the observations within subjects are independent. This seems, on the face of it, to be highly unlikely. Scores on the depression scale are not likely to be independent from one visit to the next.
We can also try analyzing these data using compound symmetry for the correlational structure. Compound symmetry is obtained using exchangeable for the corr option in xtgee.
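The coefficients from regress, glm, and xtgee with corr(ind) match exactly, while the xtgee standard errors are slightly smaller. The printed numbers are consistent with xtgee dividing the residual sum of squares by N instead of N - p when it computes the scale parameter. That reading is inferred from the output above and should be treated as an assumption. A small Python check:

```python
import math

n_obs, n_params = 295, 4        # three predictors plus the constant
residual_ss = 7507.95176        # Residual SS from the regress output

ols_variance = residual_ss / (n_obs - n_params)   # regress residual MS
gee_scale = residual_ss / n_obs                   # assumed xtgee divisor: N

# If only the divisor differs, standard errors scale by this ratio.
ratio = math.sqrt(gee_scale / ols_variance)

ols_se = {"pre": .0798565, "group": .6072954, "visit": .169842, "_cons": 1.803945}
rescaled_se = {name: se * ratio for name, se in ols_se.items()}

print(round(gee_scale, 5), {k: round(v, 7) for k, v in rescaled_se.items()})
```

The rescaled values can be compared with the Std. Err. column of the xtgee corr(ind) table, and gee_scale with its reported scale parameter.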
xtgee dep pre group visit, fam(gaus) link(iden) i(subj) t(visit) corr(exc)
GEE population-averaged model Number of obs = 295
Group variable: subj Number of groups = 61
Link: identity Obs per group: min = 1
Family: Gaussian avg = 4.8
Correlation: exchangeable max = 6
Wald chi2(3) = 135.08
Scale parameter: 25.56569 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
dep | Coef. Std. Err. z P>|z| [95% Conf. Interval]
---------+--------------------------------------------------------------------
pre | .4599018 .1441533 3.190 0.001 .1773666 .742437
group | -4.024676 1.081131 -3.723 0.000 -6.143654 -1.905698
visit | -1.226764 .1175009 -10.440 0.000 -1.457062 -.9964666
_cons | 8.432806 3.120987 2.702 0.007 2.315783 14.54983
-----------------------------------------------------------------------------
estat wcorr
Estimated within-subj correlation matrix R:
c1 c2 c3 c4 c5 c6
r1 1.0000
r2 0.5554 1.0000
r3 0.5554 0.5554 1.0000
r4 0.5554 0.5554 0.5554 1.0000
r5 0.5554 0.5554 0.5554 0.5554 1.0000
r6 0.5554 0.5554 0.5554 0.5554 0.5554 1.0000
Note in particular the change in the standard errors between this analysis and the previous one. Next, what if we impose no preconceived notions about the correlations among the responses over time? In this next example, we will request an unstructured correlation matrix. This is equivalent to the assumptions made in a multivariate analysis.
xtgee dep pre group visit, fam(gaus) link(iden) i(subj) t(visit) corr(unstr)
GEE population-averaged model Number of obs = 295
Group and time vars: subj visit Number of groups = 61
Link: identity Obs per group: min = 1
Family: Gaussian avg = 4.8
Correlation: unstructured max = 6
Wald chi2(3) = 94.13
Scale parameter: 25.87029 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
dep | Coef. Std. Err. z P>|z| [95% Conf. Interval]
---------+--------------------------------------------------------------------
pre | .3399185 .1326684 2.562 0.010 .0798932 .5999437
group | -4.134413 .9986306 -4.140 0.000 -6.091693 -2.177133
visit | -1.228327 .1492831 -8.228 0.000 -1.520916 -.9357372
_cons | 11.13045 2.892903 3.848 0.000 5.460464 16.80044
------------------------------------------------------------------------------
estat wcorr
Estimated within-subj correlation matrix R:
c1 c2 c3 c4 c5 c6
r1 1.0000
r2 0.4955 1.0000
r3 0.3477 0.8622 1.0000
r4 0.3012 0.7359 0.6677 1.0000
r5 0.2328 0.7431 0.7394 0.7701 1.0000
r6 0.0943 0.5671 0.5625 0.6166 0.7179 1.0000
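Now, let's try a different correlation structure, autoregressive with lag one. This is the correlational structure that is most likely to be correct considering the repeated measures over time. Note that subjects with fewer than two observations cannot contribute to the estimated correlations, so the output below is based on 53 of the 61 subjects (287 observations).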
xtgee dep pre group visit, fam(gaus) link(iden) i(subj) t(visit) corr(ar1)
GEE population-averaged model Number of obs = 287
Group and time vars: subj visit Number of groups = 53
Link: identity Obs per group: min = 2
Family: Gaussian avg = 5.4
Correlation: AR(1) max = 6
Wald chi2(3) = 64.55
Scale parameter: 25.82413 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
dep | Coef. Std. Err. z P>|z| [95% Conf. Interval]
---------+--------------------------------------------------------------------
pre | .4268002 .1376156 3.101 0.002 .1570785 .6965219
group | -4.218194 1.053504 -4.004 0.000 -6.283023 -2.153364
visit | -1.181975 .1907298 -6.197 0.000 -1.555799 -.8081517
_cons | 9.037864 3.036076 2.977 0.003 3.087264 14.98846
------------------------------------------------------------------------------
estat wcorr
Estimated within-subj correlation matrix R:
c1 c2 c3 c4 c5 c6
r1 1.0000
r2 0.6812 1.0000
r3 0.4641 0.6812 1.0000
r4 0.3161 0.4641 0.6812 1.0000
r5 0.2154 0.3161 0.4641 0.6812 1.0000
r6 0.1467 0.2154 0.3161 0.4641 0.6812 1.000
This analysis probably more closely reflects the correlations among the depression scores over six visits that we observed in our descriptive analysis.
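The three structured working correlation matrices shown so far follow simple rules: zeros off the diagonal (independent), one common value (exchangeable), or a value that decays as a power of the lag (AR(1)). The Python sketch below builds each one. working_corr is a hypothetical helper for illustration, and the rho values are the rounded estimates printed by estat wcorr.

```python
def working_corr(structure, size, rho=0.0):
    """Return a size x size working correlation matrix as nested lists."""
    def entry(i, j):
        if i == j:
            return 1.0
        if structure == "independent":
            return 0.0
        if structure == "exchangeable":
            return rho                    # same correlation for every pair
        if structure == "ar1":
            return rho ** abs(i - j)      # decays with the lag between visits
        raise ValueError(f"unknown structure: {structure}")
    return [[entry(i, j) for j in range(size)] for i in range(size)]

ind = working_corr("independent", 6)
exc = working_corr("exchangeable", 6, rho=0.5554)
ar1 = working_corr("ar1", 6, rho=0.6812)

# First column of the AR(1) matrix: lags 0 through 5.
print([round(ar1[i][0], 4) for i in range(6)])
```

Because rho is entered here already rounded to four decimals, the powers can differ from the Stata matrix in the last printed digit.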
Now, let's back up and reconsider the group by visit interaction. We will try a model with the interaction using the ar1 correlations.
generate gxv = group*visit
xtgee dep pre group visit gxv, fam(gaus) link(iden) i(subj) t(visit) corr(ar1)
GEE population-averaged model Number of obs = 287
Group and time vars: subj visit Number of groups = 53
Link: identity Obs per group: min = 2
Family: Gaussian avg = 5.4
Correlation: AR(1) max = 6
Wald chi2(4) = 64.83
Scale parameter: 25.81682 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
dep | Coef. Std. Err. z P>|z| [95% Conf. Interval]
---------+--------------------------------------------------------------------
pre | .4284649 .1377094 3.111 0.002 .1585595 .6983703
group | -3.55197 1.654127 -2.147 0.032 -6.794 -.3099395
visit | -1.057824 .3044115 -3.475 0.001 -1.654459 -.4611881
gxv | -.2040059 .3905217 -0.522 0.601 -.9694144 .5614026
_cons | 8.606923 3.147897 2.734 0.006 2.437158 14.77669
------------------------------------------------------------------------------
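The z, P>|z|, and confidence interval columns for gxv can be reproduced from the coefficient and its standard error with a standard normal reference distribution. A minimal Python sketch, using the values in the gxv row above and 1.959964 as the 97.5th percentile of the standard normal:

```python
import math

coef, se = -.2040059, .3905217      # gxv row of the xtgee table

# Wald z statistic and two-sided p-value from the standard normal.
z = coef / se
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

# 95% confidence interval: estimate plus or minus 1.959964 standard errors.
z_crit = 1.959964
ci = (coef - z_crit * se, coef + z_crit * se)

print(round(z, 3), round(p_two_sided, 3), ci)
```

The interval comfortably includes zero, which is another way of reading the non-significant interaction.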
The group by visit interaction still is not significant even though this may be a better approach for testing it. So far we have been treating visit as a continuous variable. Is it possible that our analysis might change if we were to treat visit as a categorical variable, in the way that the anova did? Let's try one more analysis using xi to create dummy variables on-the-fly.
xi: xtgee dep pre group i.visit, fam(gaus) link(iden) i(subj) t(visit) corr(ar1)
GEE population-averaged model Number of obs = 287
Group and time vars: subj visit Number of groups = 53
Link: identity Obs per group: min = 2
Family: Gaussian avg = 5.4
Correlation: AR(1) max = 6
Wald chi2(7) = 66.85
Scale parameter: 25.67071 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
dep | Coef. Std. Err. z P>|z| [95% Conf. Interval]
---------+--------------------------------------------------------------------
pre | .4264589 .1372194 3.108 0.002 .1575137 .6954041
group | -4.197096 1.050645 -3.995 0.000 -6.256323 -2.137869
Ivisit_2 | -.964717 .5556079 -1.736 0.083 -2.053689 .1242546
Ivisit_3 | -2.790063 .7474989 -3.733 0.000 -4.255134 -1.324992
Ivisit_4 | -3.730425 .8528421 -4.374 0.000 -5.401964 -2.058885
Ivisit_5 | -5.127078 .9147959 -5.605 0.000 -6.920045 -3.334111
Ivisit_6 | -5.84916 .9534054 -6.135 0.000 -7.7178 -3.98052
_cons | 7.896145 2.998003 2.634 0.008 2.020168 13.77212
------------------------------------------------------------------------------
test Ivisit_2 Ivisit_3 Ivisit_4 Ivisit_5 Ivisit_6
( 1) Ivisit_2 = 0.0
( 2) Ivisit_3 = 0.0
( 3) Ivisit_4 = 0.0
( 4) Ivisit_5 = 0.0
( 5) Ivisit_6 = 0.0
chi2( 5) = 40.56
Prob > chi2 = 0.0000
We can test whether the categorical version of visit accounts for more variability than the continuous version by including both in the model but using only k - 2 = 4 dummy variables for time.
xi: xtgee dep pre group visit i.visit, fam(gaus) link(iden) i(subj) t(visit) corr(ar1)
GEE population-averaged model Number of obs = 287
Group and time vars: subj visit Number of groups = 53
Link: identity Obs per group: min = 2
Family: Gaussian avg = 5.4
Correlation: AR(1) max = 6
Wald chi2(7) = 66.85
Scale parameter: 25.67071 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
dep | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
pre | .4264589 .1372194 3.11 0.002 .1575137 .6954041
group | -4.197096 1.050645 -3.99 0.000 -6.256323 -2.137869
visit | -1.169832 .1906811 -6.14 0.000 -1.54356 -.7961039
_Ivisit_2 | .205115 .5196299 0.39 0.693 -.8133408 1.223571
_Ivisit_3 | -.4503992 .648481 -0.69 0.487 -1.721399 .8206003
_Ivisit_4 | -.2209286 .6602134 -0.33 0.738 -1.514923 1.073066
_Ivisit_5 | -.4477498 .5585628 -0.80 0.423 -1.542513 .6470131
_cons | 9.065977 3.031614 2.99 0.003 3.124124 15.00783
------------------------------------------------------------------------------
test _Ivisit_2 _Ivisit_3 _Ivisit_4 _Ivisit_5
( 1) _Ivisit_2 = 0
( 2) _Ivisit_3 = 0
( 3) _Ivisit_4 = 0
( 4) _Ivisit_5 = 0
chi2( 4) = 1.92
Prob > chi2 = 0.7506
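The p-value of this joint test comes from the upper tail of a chi-square distribution with 4 degrees of freedom. For even degrees of freedom the tail probability has a closed form, which the following Python sketch uses. chi2_sf_even_df is a hypothetical helper written for this illustration and deliberately handles only even degrees of freedom.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable; even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2
    # P(X > x) = exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
    terms = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * terms

p_nonlinear = chi2_sf_even_df(1.92, 4)
print(round(p_nonlinear, 4))
```

The result can differ from the Stata p-value in the fourth decimal because the chi-square statistic is entered here rounded to two decimals.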
These results indicate that the categorical version of visit does not account for significantly more variability than the continuous version. Of all the analyses run so far, I prefer the following model: xtgee dep pre group visit, fam(gaus) link(iden) i(subj) t(visit) corr(ar1). Those results were as follows:
------------------------------------------------------------------------------
dep | Coef. Std. Err. z P>|z| [95% Conf. Interval]
---------+--------------------------------------------------------------------
pre | .4268002 .1376156 3.101 0.002 .1570785 .6965219
group | -4.218194 1.053504 -4.004 0.000 -6.283023 -2.153364
visit | -1.181975 .1907298 -6.197 0.000 -1.555799 -.8081517
_cons | 9.037864 3.036076 2.977 0.003 3.087264 14.98846
------------------------------------------------------------------------------
These results indicate a significant effect for the pretest: for every one point increase in the pretest score there is about a 0.4 point increase in the depression score, controlling for treatment and visit. There is also a significant effect for the estrogen patch when controlling for pretest depression and visit: use of the estrogen patch reduces the depression score by about 4.2 points. Finally, there is a significant visit effect when controlling for pretest depression and group membership: the depression score decreases on average by 1.18 points for each visit.
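The interpretation can be made concrete by plugging values into the fitted linear predictor from the preferred AR(1) model. In the Python sketch below the coefficients are the printed estimates, and the pretest score of 21 is an arbitrary illustrative value near the sample mean, not a figure from the analysis.

```python
# Coefficients from the preferred model (corr(ar1), visit continuous).
COEF = {"_cons": 9.037864, "pre": .4268002, "group": -4.218194, "visit": -1.181975}

def predicted_dep(pre, group, visit):
    """Population-averaged predicted EPDS score from the linear predictor."""
    return (COEF["_cons"] + COEF["pre"] * pre
            + COEF["group"] * group + COEF["visit"] * visit)

placebo_v1 = predicted_dep(pre=21, group=0, visit=1)
patch_v1 = predicted_dep(pre=21, group=1, visit=1)
placebo_v6 = predicted_dep(pre=21, group=0, visit=6)

print(round(placebo_v1, 2), round(patch_v1, 2), round(placebo_v6, 2))
```

Because the model has no interaction, the gap between the groups is the group coefficient at every visit, and the change from visit 1 to visit 6 is five times the visit coefficient in both groups.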
use http://www.ats.ucla.edu/stat/stata/library/depres01, clear
We will go through a series of analyses that largely parallel the models run above with the continuous response variable, this time using the binary response depressd. To get a binary logit model we set the family to binomial and the link to logit. We will start with the independent correlation structure, followed by exchangeable (compound symmetry) and then unstructured.
xtgee depressd group visit, i(subj) fam(bin) link(logit) corr(ind)
GEE population-averaged model Number of obs = 295
Group variable: subj Number of groups = 61
Link: logit Obs per group: min = 1
Family: binomial avg = 4.8
Correlation: independent max = 6
Wald chi2(2) = 52.54
Scale parameter: 1 Prob > chi2 = 0.0000
Pearson chi2(295): 295.72 Deviance = 338.95
Dispersion (Pearson): 1.00245 Dispersion = 1.148974
------------------------------------------------------------------------------
depressd | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
group | -1.606602 .277919 -5.78 0.000 -2.151313 -1.061891
visit | -.4402142 .0802387 -5.49 0.000 -.5974791 -.2829493
_cons | 2.38366 .3675414 6.49 0.000 1.663292 3.104028
------------------------------------------------------------------------------
estat wcorr
Estimated within-subj correlation matrix R:
| c1 c2 c3 c4 c5 c6
------+------------------------------------------------------------------
r1 | 1
r2 | 0 1
r3 | 0 0 1
r4 | 0 0 0 1
r5 | 0 0 0 0 1
r6 | 0 0 0 0 0 1
xtgee depressd group visit, i(subj) fam(bin) link(logit) corr(exc)
GEE population-averaged model Number of obs = 295
Group variable: subj Number of groups = 61
Link: logit Obs per group: min = 1
Family: binomial avg = 4.8
Correlation: exchangeable max = 6
Wald chi2(2) = 45.64
Scale parameter: 1 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
depressd | Coef. Std. Err. z P>|z| [95% Conf. Interval]
---------+--------------------------------------------------------------------
group | -1.616323 .4669082 -3.462 0.001 -2.531446 -.7011994
visit | -.3984038 .0613331 -6.496 0.000 -.5186145 -.2781931
_cons | 2.409522 .4456646 5.407 0.000 1.536035 3.283008
------------------------------------------------------------------------------
estat wcorr
Estimated within-subj correlation matrix R:
c1 c2 c3 c4 c5 c6
r1 1.0000
r2 0.4518 1.0000
r3 0.4518 0.4518 1.0000
r4 0.4518 0.4518 0.4518 1.0000
r5 0.4518 0.4518 0.4518 0.4518 1.0000
r6 0.4518 0.4518 0.4518 0.4518 0.4518 1.0000
xtgee depressd group visit, i(subj) t(visit) fam(bin) link(logit) corr(unstr)
GEE population-averaged model Number of obs = 295
Group and time vars: subj visit Number of groups = 61
Link: logit Obs per group: min = 1
Family: binomial avg = 4.8
Correlation: unstructured max = 6
Wald chi2(2) = 32.57
Scale parameter: 1 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
depressd | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
group | -1.5933 .4553165 -3.50 0.000 -2.485704 -.7008963
visit | -.3897561 .0748284 -5.21 0.000 -.5364169 -.2430952
_cons | 2.311344 .4521761 5.11 0.000 1.425095 3.197593
------------------------------------------------------------------------------
estat wcorr
Estimated within-subj correlation matrix R:
| c1 c2 c3 c4 c5 c6
------+------------------------------------------------------------------
r1 | 1
r2 | .404501 1
r3 | .1803076 .6315383 1
r4 | .284646 .5602217 .5795466 1
With these data, just as with the continuous response variable, it might be more reasonable to hypothesize that the correlation structure would be autoregressive.
xtgee depressd group visit, i(subj) t(visit) fam(bin) link(logit) corr(ar1)
note: some groups have fewer than 2 observations
not possible to estimate correlations for those groups
8 groups omitted from estimation
GEE population-averaged model Number of obs = 287
Group and time vars: subj visit Number of groups = 53
Link: logit Obs per group: min = 2
Family: binomial avg = 5.4
Correlation: AR(1) max = 6
Wald chi2(2) = 26.04
Scale parameter: 1 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
depressd | Coef. Std. Err. z P>|z| [95% Conf. Interval]
---------+--------------------------------------------------------------------
group | -1.588712 .4391128 -3.618 0.000 -2.449358 -.7280672
visit | -.4036122 .0933711 -4.323 0.000 -.5866163 -.2206082
_cons | 2.259702 .4961409 4.555 0.000 1.287284 3.23212
------------------------------------------------------------------------------
estat wcorr
Estimated within-subj correlation matrix R:
c1 c2 c3 c4 c5 c6
r1 1.0000
r2 0.5643 1.0000
r3 0.3185 0.5643 1.0000
r4 0.1797 0.3185 0.5643 1.0000
r5 0.1014 0.1797 0.3185 0.5643 1.0000
r6 0.0572 0.1014 0.1797 0.3185 0.5643 1.000
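Logit coefficients are easier to read once they are converted to probabilities with the inverse logit. The Python sketch below uses the AR(1) estimates from this model and assumes that depressd = 1 codes a depressed patient, which the page does not state explicitly. prob_depressed is a hypothetical helper for illustration.

```python
import math

# Coefficients from the binary AR(1) model (log-odds scale).
COEF = {"_cons": 2.259702, "group": -1.588712, "visit": -.4036122}

def prob_depressed(group, visit):
    """Population-averaged probability that depressd = 1 (inverse logit)."""
    xb = COEF["_cons"] + COEF["group"] * group + COEF["visit"] * visit
    return 1 / (1 + math.exp(-xb))

for visit in (1, 6):
    print(visit,
          round(prob_depressed(group=0, visit=visit), 3),
          round(prob_depressed(group=1, visit=visit), 3))
```

On the probability scale the gap between groups is not constant across visits, even though the model has no interaction on the log-odds scale.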
If we want, we can also obtain the results in the odds ratio metric using the eform option.
xtgee, eform
note: some groups have fewer than 2 observations
not possible to estimate correlations for those groups
8 groups omitted from estimation
GEE population-averaged model Number of obs = 287
Group and time vars: subj visit Number of groups = 53
Link: logit Obs per group: min = 2
Family: binomial avg = 5.4
Correlation: AR(1) max = 6
Wald chi2(2) = 26.04
Scale parameter: 1 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
depressd | Odds Ratio Std. Err. z P>|z| [95% Conf. Interval]
---------+--------------------------------------------------------------------
group | .2041883 .0896617 -3.618 0.000 .086349 .4828413
visit | .6679031 .0623629 -4.323 0.000 .5562061 .8020309
------------------------------------------------------------------------------
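The odds ratio table is a transformation of the coefficient table: each odds ratio is the exponentiated coefficient, the interval endpoints are the exponentiated endpoints, and the standard errors match the delta-method value (odds ratio times the coefficient's standard error). A short Python check using the printed AR(1) estimates:

```python
import math

# Coefficient-scale results from the AR(1) logit model.
rows = {
    "group": {"coef": -1.588712, "se": .4391128, "ci": (-2.449358, -.7280672)},
    "visit": {"coef": -.4036122, "se": .0933711, "ci": (-.5866163, -.2206082)},
}

odds = {}
for name, r in rows.items():
    odds_ratio = math.exp(r["coef"])
    odds[name] = {
        "or": odds_ratio,
        "se": odds_ratio * r["se"],   # delta-method standard error
        "ci": (math.exp(r["ci"][0]), math.exp(r["ci"][1])),
    }

for name, r in odds.items():
    print(name, round(r["or"], 7), round(r["se"], 7), r["ci"])
```

The z statistics and p-values are unchanged by the transformation, which is why they are the same in both tables.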
Let's add in the pretest and a group by visit interaction.
xtgee depressd pre group visit gxv, i(subj) t(visit) fam(bin) link(logit) corr(ar1)
note: some groups have fewer than 2 observations
not possible to estimate correlations for those groups
8 groups omitted from estimation
GEE population-averaged model Number of obs = 287
Group and time vars: subj visit Number of groups = 53
Link: logit Obs per group: min = 2
Family: binomial avg = 5.4
Correlation: AR(1) max = 6
Wald chi2(4) = 29.71
Scale parameter: 1 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
depressd | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
pre | .1231682 .0565583 2.18 0.029 .012316 .2340204
group | -1.278468 .7833482 -1.63 0.103 -2.813802 .2568666
visit | -.3504923 .1484459 -2.36 0.018 -.6414409 -.0595436
gxv | -.1279848 .1946883 -0.66 0.511 -.5095669 .2535973
_cons | -.4669354 1.271484 -0.37 0.713 -2.958999 2.025128
------------------------------------------------------------------------------
Clearly, there is no interaction but we'll stick with the pretest for the moment. Next let's try the categorical version of visit and the model that contains both the categorical and continuous version of visit.
xi: xtgee depressd pre group i.visit, i(subj) fam(bin) link(logit) t(visit) corr(ar1)
note: some groups have fewer than 2 observations
not possible to estimate correlations for those groups
8 groups omitted from estimation
GEE population-averaged model Number of obs = 287
Group and time vars: subj visit Number of groups = 53
Link: logit Obs per group: min = 2
Family: binomial avg = 5.4
Correlation: AR(1) max = 6
Wald chi2(7) = 30.86
Scale parameter: 1 Prob > chi2 = 0.0001
------------------------------------------------------------------------------
depressd | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
pre | .1140311 .056433 2.02 0.043 .0034244 .2246378
group | -1.692654 .4377388 -3.87 0.000 -2.550607 -.8347021
_Ivisit_2 | -.1751772 .3106588 -0.56 0.573 -.7840573 .4337028
_Ivisit_3 | -1.015265 .3915632 -2.59 0.010 -1.782715 -.2478151
_Ivisit_4 | -1.108258 .4287682 -2.58 0.010 -1.948628 -.2678878
_Ivisit_5 | -1.489162 .4548596 -3.27 0.001 -2.380671 -.597654
_Ivisit_6 | -2.14973 .4951443 -4.34 0.000 -3.120195 -1.179265
_cons | -.4832614 1.18731 -0.41 0.684 -2.810346 1.843823
------------------------------------------------------------------------------
test _Ivisit_2 _Ivisit_3 _Ivisit_4 _Ivisit_5 _Ivisit_6
( 1) _Ivisit_2 = 0
( 2) _Ivisit_3 = 0
( 3) _Ivisit_4 = 0
( 4) _Ivisit_5 = 0
( 5) _Ivisit_6 = 0
chi2( 5) = 21.92
xi: xtgee depressd pre group visit i.visit, i(subj) fam(bin) link(logit) t(visit) corr(ar1)
note: _Ivisit_6 dropped due to collinearity
note: some groups have fewer than 2 observations
not possible to estimate correlations for those groups
8 groups omitted from estimation
GEE population-averaged model Number of obs = 287
Group and time vars: subj visit Number of groups = 53
Link: logit Obs per group: min = 2
Family: binomial avg = 5.4
Correlation: AR(1) max = 6
Wald chi2(7) = 30.86
Scale parameter: 1 Prob > chi2 = 0.0001
------------------------------------------------------------------------------
depressd | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
pre | .1140311 .056433 2.02 0.043 .0034244 .2246378
group | -1.692654 .4377388 -3.87 0.000 -2.550607 -.8347021
visit | -.429946 .0990289 -4.34 0.000 -.624039 -.235853
_Ivisit_2 | .2547688 .2901423 0.88 0.380 -.3138998 .8234373
_Ivisit_3 | -.1553729 .3440849 -0.45 0.652 -.829767 .5190212
_Ivisit_4 | .1815801 .3544878 0.51 0.608 -.5132033 .8763635
_Ivisit_5 | .2306217 .3201945 0.72 0.471 -.3969481 .8581914
_cons | -.0533153 1.201905 -0.04 0.965 -2.409005 2.302375
------------------------------------------------------------------------------
test _Ivisit_2 _Ivisit_3 _Ivisit_4 _Ivisit_5
( 1) _Ivisit_2 = 0
( 2) _Ivisit_3 = 0
( 3) _Ivisit_4 = 0
( 4) _Ivisit_5 = 0
chi2( 4) = 3.04
Prob > chi2 = 0.5507
These pages are Copyrighted (c) by UCLA Academic Technology Services