Stata Textbook Examples
Applied Linear Statistical Models by Neter, Kutner, et al.
Chapter 23: Multifactor Studies
Inputting table 23.4a, p. 943.
clear
input exercise gender fat smoking rep
24.1 1 1 1 1
29.2 1 1 1 2
24.6 1 1 1 3
20.0 2 1 1 1
21.9 2 1 1 2
17.6 2 1 1 3
14.6 1 2 1 1
15.3 1 2 1 2
12.3 1 2 1 3
16.1 2 2 1 1
9.3 2 2 1 2
10.8 2 2 1 3
17.6 1 1 2 1
18.8 1 1 2 2
23.2 1 1 2 3
14.8 2 1 2 1
10.3 2 1 2 2
11.3 2 1 2 3
14.9 1 2 2 1
20.4 1 2 2 2
12.8 1 2 2 3
10.1 2 2 2 1
14.4 2 2 2 2
6.1 2 2 2 3
end
label define gnd 1 "male" 2 "female"
label define fat 1 "low fat" 2 "high fat"
label define smk 1 "Light" 2 "heavy"
label values gender gnd
label values fat fat
label values smoking smk
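Before tabulating, it can help to confirm that the data were entered as a balanced design. The following is a minimal sketch, assuming only the variables created above. It counts the replicates in each gender-by-fat-by-smoking cell without re-sorting the data, so the observation order used later on this page is unchanged.

```stata
* Each of the 2 x 2 x 2 cells should hold exactly 3 replicates
forvalues g = 1/2 {
    forvalues f = 1/2 {
        forvalues s = 1/2 {
            quietly count if gender == `g' & fat == `f' & smoking == `s'
            assert r(N) == 3
        }
    }
}
* 24 observations in total, with no missing responses
assert _N == 24
quietly count if missing(exercise)
assert r(N) == 0
```

If any assert fails, Stata stops with an error, which points to a data-entry problem in the input block.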
Table 23.4b, p. 943. The cell means and the factor means.
table gender smoking, by(fat) contents(mean exercise) row col
table gender smoking, contents(mean exercise) row col
----------------------------------------
fat and   |           smoking
gender    |    Light     heavy     Total
----------+-----------------------------
low fat   |
     male | 25.96667  19.86667  22.91667
   female | 19.83333  12.13333  15.98333
          |
    Total |     22.9        16     19.45
----------+-----------------------------
high fat  |
     male | 14.06667  16.03333     15.05
   female | 12.06667      10.2  11.13333
          |
    Total | 13.06667  13.11667  13.09167
----------------------------------------

----------------------------------------
          |           smoking
   gender |    Light     heavy     Total
----------+-----------------------------
     male | 20.01667     17.95  18.98333
   female |    15.95  11.16667  13.55833
          |
    Total | 17.98333  14.55833  16.27083
----------------------------------------
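The table command was redesigned in recent Stata releases, so the contents() option used above may only run under version control on a newer installation (this is an assumption about your setup, not something stated in the original example). As a hedged alternative, the same cell and factor means can be requested with tabulate and its summarize() option:

```stata
* Cell means of exercise by gender and smoking within each fat level
tabulate gender smoking if fat == 1, summarize(exercise) nostandard nofreq
tabulate gender smoking if fat == 2, summarize(exercise) nostandard nofreq
* Means by gender and smoking, pooled over fat
tabulate gender smoking, summarize(exercise) nostandard nofreq
```

The if qualifier is used instead of a by prefix so that the data are not re-sorted.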
Figure 23.6a1, p. 944.
Note that the code to generate the plot requires a user-written package,
"Graphing model diagnostics" by Nicholas J. Cox. The package can be located
by typing "findit anovaplot" (without the quotes) in the Stata command
window. For further information on using findit to find new programs, see
our FAQ "How do I use findit to search for programs and additional help?".
anova exercise fat smoking fat*smoking if gender==1
anovaplot, scatter(ms(i)) legend(off) xscale(r(.6 2.4)) text(14 2.2 "C1 Light") text(16 2.2 "C2 Heavy") title(A1 (Male))

Figure 23.6a2, p. 944.
anova exercise fat smoking fat*smoking if gender==2
anovaplot, scatter(ms(i)) legend(off) xscale(r(.6 2.4)) text(12 2.2 "C1 Light") text(10 2.2 "C2 Heavy") title(A2 (Female))

Figure 23.6b1, p. 944.
anova exercise gender fat if smoking==1
anovaplot, scatter(ms(i)) legend(off) xscale(r(.6 2.4)) text(21 2.2 "B1 Low Fat") text(11 2.2 "B2 High Fat") title(C1 (Light Smoking))

Figure 23.6b2, p. 944.
anova exercise gender fat if smoking==2
anovaplot, scatter(ms(i)) legend(off) xscale(r(.6 2.4)) text(12.5 2.2 "B1 Low Fat") text(9.5 2.2 "B2 High Fat") title(C2 (Heavy Smoking))
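If anovaplot is not available, a similar interaction plot can be sketched with the built-in margins and marginsplot commands. This assumes Stata 12 or later and factor-variable notation; it is not the method used for the figures above, and the plot will differ in appearance.

```stata
* Interaction plot for males: fat on the x-axis, one line per smoking level
anova exercise fat##smoking if gender == 1
margins fat#smoking
marginsplot, xdimension(fat) title("A1 (Male)")
```

Changing the if condition and the factors in the anova and margins commands gives the counterparts of the other three panels.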

Figures 23.7 and 23.8, p. 945.
anova exercise gender fat smoking gender*fat gender*smoking fat*smoking gender*fat*smoking
predict r, residuals
qnorm r
                  Number of obs =      24     R-squared     =  0.7976
                  Root MSE      = 3.05539     Adj R-squared =  0.7090

            Source |  Partial SS    df          MS          F   Prob > F
-------------------+----------------------------------------------------
             Model |  588.582936     7  84.0832766       9.01     0.0002
                   |
            gender |  176.583755     1  176.583755      18.92     0.0005
               fat |  242.570427     1  242.570427      25.98     0.0001
           smoking |  70.3837595     1  70.3837595       7.54     0.0144
        gender*fat |  13.6504194     1  13.6504194       1.46     0.2441
    gender*smoking |  11.0704137     1  11.0704137       1.19     0.2923
       fat*smoking |  72.4537444     1  72.4537444       7.76     0.0132
gender*fat*smoking |  1.87041736     1  1.87041736       0.20     0.6604
                   |
          Residual |  149.366666    16  9.33541665
-------------------+----------------------------------------------------
             Total |  737.949602    23  32.0847653
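In Stata 11 and later, interactions in anova are written with the # and ## operators rather than *. The sketch below assumes such a version. It refits the same three-way model and also draws a residual-versus-fitted plot alongside the normal quantile plot. The variable name res is hypothetical, chosen so that it does not clash with r created above.

```stata
* Full three-way factorial; ## expands to all main effects and interactions
anova exercise gender##fat##smoking
* Residuals against fitted values
rvfplot, yline(0)
* Normal quantile plot of the residuals
predict double res, residuals
qnorm res
```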

Estimation of Contrasts of Treatment means, p. 947. First we make
dummy variables for a regression. Then we test the contrasts.
gen male = 0
replace male = 1 if gender==1
gen lowfat = 0
replace lowfat = 1 if fat==1
gen lightsmk = 0
replace lightsmk = 1 if smoking==1
gen genderfat= male*lowfat
gen gendersmk= male*lightsmk
gen fatsmk = lowfat*lightsmk
gen genderfatsmk = male*lowfat*lightsmk
regress exercise male lowfat lightsmk genderfat gendersmk fatsmk genderfatsmk
Source | SS df MS Number of obs = 24
-------------+------------------------------ F( 7, 16) = 9.01
Model | 588.582936 7 84.0832766 Prob > F = 0.0002
Residual | 149.366666 16 9.33541665 R-squared = 0.7976
-------------+------------------------------ Adj R-squared = 0.7090
Total | 737.949602 23 32.0847653 Root MSE = 3.0554
------------------------------------------------------------------------------
exercise | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
male | 5.833333 2.494717 2.34 0.033 .5447702 11.1219
lowfat | 1.933334 2.494717 0.77 0.450 -3.355229 7.221897
lightsmk | 1.866667 2.494717 0.75 0.465 -3.421896 7.15523
genderfat | 1.9 3.528062 0.54 0.598 -5.579157 9.379158
gendersmk | -3.833333 3.528062 -1.09 0.293 -11.31249 3.645824
fatsmk | 5.833333 3.528062 1.65 0.118 -1.645825 13.31249
genderfatsmk | 2.233334 4.989433 0.45 0.660 -8.343792 12.81046
_cons | 10.2 1.764031 5.78 0.000 6.460421 13.93958
------------------------------------------------------------------------------
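Because the model is saturated, every cell mean in Table 23.4b is a sum of these coefficients. As a small check (a sketch, assuming the dummy variables above are still in memory), the constant should reproduce the female, high fat, heavy smoking cell (10.2), adding the male coefficient should give the male, high fat, heavy smoking cell (16.03333), and adding all coefficients should give the male, low fat, light smoking cell (25.96667).

```stata
* Refit quietly so the coefficients are available in _b[]
quietly regress exercise male lowfat lightsmk genderfat gendersmk fatsmk genderfatsmk
* Reference cell: female, high fat, heavy smoking
display _b[_cons]
* Male, high fat, heavy smoking
display _b[_cons] + _b[male]
* Male, low fat, light smoking: every dummy equals 1
display _b[_cons] + _b[male] + _b[lowfat] + _b[lightsmk] + _b[genderfat] + _b[gendersmk] + _b[fatsmk] + _b[genderfatsmk]
```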
lincom lightsmk+.5*gendersmk+fatsmk+.5*genderfatsmk
( 1) lightsmk + .5 gendersmk + fatsmk + .5 genderfatsmk = 0
------------------------------------------------------------------------------
exercise | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
(1) | 6.9 1.764031 3.91 0.001 3.160421 10.63958
------------------------------------------------------------------------------
lincom lightsmk+.5*gendersmk
------------------------------------------------------------------------------
exercise | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
(1) | -.0499996 1.764031 -0.03 0.978 -3.789578 3.689579
------------------------------------------------------------------------------
lincom male+.5*genderfat+.5*gendersmk+.25*genderfatsmk
( 1) male + .5 genderfat + .5 gendersmk + .25 genderfatsmk = 0
------------------------------------------------------------------------------
exercise | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
(1) | 5.425 1.247358 4.35 0.000 2.780719 8.069282
------------------------------------------------------------------------------
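Given the dummy coding above, the three lincom commands estimate, in order: light minus heavy smoking at low fat (averaged over gender), light minus heavy smoking at high fat (averaged over gender), and male minus female (averaged over fat and smoking). As a sanity check, the same estimates can be recovered by hand from the means in Table 23.4b:

```stata
* Light minus heavy smoking at low fat, averaged over gender (about 6.9)
display (25.96667 + 19.83333)/2 - (19.86667 + 12.13333)/2
* Light minus heavy smoking at high fat, averaged over gender (about -.05)
display (14.06667 + 12.06667)/2 - (16.03333 + 10.2)/2
* Male minus female marginal means (about 5.425)
display 18.98333 - 13.55833
```

The averaging over gender explains the .5 weights on the interaction terms in the lincom expressions, and the .25 weight on the three-way term in the last one.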
Figure 23.9b, p. 949.
anova exercise smoking fat smoking*fat
anovaplot, scatter(ms(i)) xscale(r(.5 2.5)) text(11 1.5 "High Percent Fat") text(22 1.5 "Low Percent Fat") legend(off)
           Number of obs =      24     R-squared     =  0.5223
           Root MSE      = 4.19846     Adj R-squared =  0.4506

     Source |  Partial SS    df          MS          F   Prob > F
------------+----------------------------------------------------
      Model |  385.407931     3   128.46931       7.29     0.0017
            |
    smoking |  70.3837595     1  70.3837595       3.99     0.0595
        fat |  242.570427     1  242.570427      13.76     0.0014
smoking*fat |  72.4537444     1  72.4537444       4.11     0.0562
            |
   Residual |  352.541671    20  17.6270836
------------+----------------------------------------------------
      Total |  737.949602    23  32.0847653

Creating the Stress data with missing values and generating the indicator
variables, table 23.5, p. 950.
gen y = exercise
replace y = . if gender==1 & fat==1 & smoking==1 & rep==3
replace y = . if gender==2 & fat==2 & smoking==1 & rep==2
gen x1 = 1
replace x1 = -1 if gender==2
gen x2 = 1
replace x2 = -1 if fat==2
gen x3 = 1
replace x3 = -1 if smoking==2
gen x1x2 = x1*x2
gen x1x3 = x1*x3
gen x2x3 = x2*x3
gen x1x2x3 = x1*x2*x3
list exercise gender fat smoking rep y x1 x2 x3 x1x2 x1x3 x2x3 x1x2x3, clean nolabel compress
exe~e gen~r fat smo~g rep y x1 x2 x3 x1x2 x1x3 x2x3 x1x..
1. 24.1 1 1 1 1 24.1 1 1 1 1 1 1 1
2. 29.2 1 1 1 2 29.2 1 1 1 1 1 1 1
3. 24.6 1 1 1 3 . 1 1 1 1 1 1 1
4. 20 2 1 1 1 20 -1 1 1 -1 -1 1 -1
5. 21.9 2 1 1 2 21.9 -1 1 1 -1 -1 1 -1
6. 17.6 2 1 1 3 17.6 -1 1 1 -1 -1 1 -1
7. 14.6 1 2 1 1 14.6 1 -1 1 -1 1 -1 -1
8. 15.3 1 2 1 2 15.3 1 -1 1 -1 1 -1 -1
9. 12.3 1 2 1 3 12.3 1 -1 1 -1 1 -1 -1
10. 16.1 2 2 1 1 16.1 -1 -1 1 1 -1 -1 1
11. 9.3 2 2 1 2 . -1 -1 1 1 -1 -1 1
12. 10.8 2 2 1 3 10.8 -1 -1 1 1 -1 -1 1
13. 17.6 1 1 2 1 17.6 1 1 -1 1 -1 -1 -1
14. 18.8 1 1 2 2 18.8 1 1 -1 1 -1 -1 -1
15. 23.2 1 1 2 3 23.2 1 1 -1 1 -1 -1 -1
16. 14.8 2 1 2 1 14.8 -1 1 -1 -1 1 -1 1
17. 10.3 2 1 2 2 10.3 -1 1 -1 -1 1 -1 1
18. 11.3 2 1 2 3 11.3 -1 1 -1 -1 1 -1 1
19. 14.9 1 2 2 1 14.9 1 -1 -1 -1 -1 1 1
20. 20.4 1 2 2 2 20.4 1 -1 -1 -1 -1 1 1
21. 12.8 1 2 2 3 12.8 1 -1 -1 -1 -1 1 1
22. 10.1 2 2 2 1 10.1 -1 -1 -1 1 1 1 -1
23. 14.4 2 2 2 2 14.4 -1 -1 -1 1 1 1 -1
24. 6.1 2 2 2 3 6.1 -1 -1 -1 1 1 1 -1
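The 1/-1 indicator coding can also be written in one step per variable with the cond() function. The sketch below uses hypothetical variable names (x1c, x2c, x3c) so that it does not overwrite anything created above, and it confirms that the compact versions agree with x1, x2 and x3.

```stata
* Effect coding in one step: 1 for the first level, -1 for the second
gen x1c = cond(gender == 1, 1, -1)
gen x2c = cond(fat == 1, 1, -1)
gen x3c = cond(smoking == 1, 1, -1)
* Confirm the compact versions match the variables generated earlier
assert x1c == x1 & x2c == x2 & x3c == x3
drop x1c x2c x3c
```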
Testing factor A (gender), p. 950. We first fit the full model, then drop
x1 and regress y on the remaining variables (columns 3-8).
regress y x1 x2 x3 x1x2 x1x3 x2x3 x1x2x3
Source | SS df MS Number of obs = 22
-------------+------------------------------ F( 7, 14) = 7.18
Model | 484.814865 7 69.2592665 Prob > F = 0.0009
Residual | 135.083332 14 9.64880943 R-squared = 0.7821
-------------+------------------------------ Adj R-squared = 0.6731
Total | 619.898197 21 29.5189618 Root MSE = 3.1063
------------------------------------------------------------------------------
y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
x1 | 2.625 .6725236 3.90 0.002 1.18258 4.06742
x2 | 3.091667 .6725236 4.60 0.000 1.649247 4.534086
x3 | 1.970833 .6725236 2.93 0.011 .5284139 3.413253
x1x2 | 1.0125 .6725236 1.51 0.154 -.4299195 2.45492
x1x3 | -.7666666 .6725236 -1.14 0.273 -2.209086 .675753
x2x3 | 1.65 .6725236 2.45 0.028 .2075804 3.09242
x1x2x3 | .5375001 .6725236 0.80 0.438 -.9049195 1.97992
_cons | 16.52917 .6725236 24.58 0.000 15.08675 17.97159
------------------------------------------------------------------------------
regress y x2 x3 x1x2 x1x3 x2x3 x1x2x3
Source | SS df MS Number of obs = 22
-------------+------------------------------ F( 6, 15) = 2.99
Model | 337.814861 6 56.3024768 Prob > F = 0.0397
Residual | 282.083336 15 18.8055558 R-squared = 0.5450
-------------+------------------------------ Adj R-squared = 0.3629
Total | 619.898197 21 29.5189618 Root MSE = 4.3365
------------------------------------------------------------------------------
y | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
x2 | 2.8 .9330743 3.00 0.009 .8111994 4.788801
x3 | 1.970833 .9388879 2.10 0.053 -.0303587 3.972026
x1x2 | 1.0125 .9388879 1.08 0.298 -.988692 3.013692
x1x3 | -1.058333 .9330743 -1.13 0.274 -3.047134 .9304675
x2x3 | 1.358333 .9330743 1.46 0.166 -.6304675 3.347134
x1x2x3 | .5375001 .9388879 0.57 0.575 -1.463692 2.538692
_cons | 16.52917 .9388879 17.61 0.000 14.52797 18.53036
------------------------------------------------------------------------------
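The test of factor A compares the error sums of squares of the two fits: F* = [SSE(R) - SSE(F)] / (df_R - df_F) divided by MSE(F). From the output above, that is (282.083336 - 135.083332) / (15 - 14) / 9.64880943, or about 15.2 on 1 and 14 degrees of freedom. A minimal sketch of the same calculation from stored results follows; the scalar names are hypothetical.

```stata
* Full model: keep the error sum of squares and its degrees of freedom
quietly regress y x1 x2 x3 x1x2 x1x3 x2x3 x1x2x3
scalar sse_f = e(rss)
scalar df_f  = e(df_r)
* The same test is available directly as a Wald test of x1
test x1
* Reduced model without x1
quietly regress y x2 x3 x1x2 x1x3 x2x3 x1x2x3
scalar sse_r = e(rss)
scalar df_r  = e(df_r)
* General linear test statistic and its p-value
scalar Fstar = ((sse_r - sse_f)/(df_r - df_f)) / (sse_f/df_f)
display "F* = " Fstar "   p = " Ftail(df_r - df_f, df_f, Fstar)
```

With a single dropped variable, F* equals the square of the t statistic for x1 in the full model (3.90 squared is about 15.2).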