Stata Textbook Examples
Applied Linear Statistical Models by Neter, Kutner, et al.
Chapter 18: ANOVA Diagnostics and Remedial Measures
Inputting the Rust Inhibitor data, table 17.2a, p. 712.
clear
input performance brand experiment
43.9 1 1
39.0 1 2
46.7 1 3
43.8 1 4
44.2 1 5
47.7 1 6
43.6 1 7
38.9 1 8
43.6 1 9
40.0 1 10
89.8 2 1
87.1 2 2
92.7 2 3
90.6 2 4
87.7 2 5
92.4 2 6
86.1 2 7
88.1 2 8
90.8 2 9
89.1 2 10
68.4 3 1
69.3 3 2
68.5 3 3
66.4 3 4
70.0 3 5
68.1 3 6
70.6 3 7
65.2 3 8
63.8 3 9
69.2 3 10
36.2 4 1
45.2 4 2
40.7 4 3
40.5 4 4
39.3 4 5
40.3 4 6
43.2 4 7
38.7 4 8
40.9 4 9
39.7 4 10
end
Table 18.1, p. 758.
anova performance brand
predict r, residuals
table experiment brand, contents(mean r) cell(5) stubw(10)
---------------------------------------
| brand
experiment | 1 2 3 4
-----------+---------------------------
1 | .76 .36 .45 -4.27
2 | -4.14 -2.34 1.35 4.73
3 | 3.56 3.26 .55 .23
4 | .66 1.16 -1.55 .03
5 | 1.06 -1.74 2.05 -1.17
6 | 4.56 2.96 .15 -.17
7 | .46 -3.34 2.65 2.73
8 | -4.24 -1.34 -2.75 -1.77
9 | .46 1.36 -4.15 .43
10 | -3.14 -.34 1.25 -.77
---------------------------------------
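Each residual in the table above is an observation minus the mean of its brand. The following is a minimal sketch that recomputes the first brand 1 residual by hand. The matrix Y and the scalar names are made up for this illustration, and the ten values are copied from the data listing.

```stata
* Brand 1 observations, copied from the input block (hypothetical matrix name Y)
matrix Y = (43.9, 39.0, 46.7, 43.8, 44.2, 47.7, 43.6, 38.9, 43.6, 40.0)

* Accumulate the sum of the ten observations
scalar brand1_sum = 0
forvalues j = 1/10 {
    scalar brand1_sum = brand1_sum + Y[1,`j']
}

* Fitted value for brand 1 is the brand mean
scalar brand1_mean = brand1_sum/10

* Residual for experiment 1 of brand 1
scalar resid_11 = Y[1,1] - brand1_mean

display "brand 1 mean           = " %6.2f brand1_mean
display "residual, experiment 1 = " %6.2f resid_11
```

The displayed residual should agree with the first cell of the table (.76).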
Figure 18.1a, p. 759.
anova performance brand
predict yhat
capture drop r
predict r, residuals
twoway scatter r yhat, ms(x) msize(huge)

Figure 18.1b, p. 759.
twoway scatter brand r, ylabel(1 "A" 2 "B" 3 "C" 4 "D")

Figure 18.1c, p. 759.
qnorm r, ms(x) msize(huge)

Inputting ABT Electronics data, table 18.2, p. 765.
clear
input strength type joint
14.87 1 1
16.81 1 2
15.83 1 3
15.47 1 4
13.60 1 5
14.76 1 6
17.40 1 7
14.62 1 8
18.43 2 1
18.76 2 2
20.12 2 3
19.11 2 4
19.81 2 5
18.43 2 6
17.16 2 7
16.40 2 8
16.95 3 1
12.28 3 2
12.00 3 3
13.18 3 4
14.99 3 5
15.76 3 6
19.35 3 7
15.52 3 8
8.59 4 1
10.90 4 2
8.60 4 3
10.13 4 4
10.28 4 5
9.98 4 6
9.41 4 7
10.04 4 8
11.55 5 1
13.36 5 2
13.64 5 3
12.16 5 4
11.62 5 5
12.39 5 6
12.05 5 7
11.95 5 8
end
Table 18.2, the mean, median and variance of pull strength by flux type,
p. 765.
Note: p50 stands for the 50th percentile, which is the median.
sort type
tabstat strength, by(type) statistics(mean median variance) nosep
Summary for variables: strength
by categories of: type
type | mean p50 variance
---------+------------------------------
1 | 15.42 15.17 1.530514
2 | 18.5275 18.595 1.569936
3 | 15.00375 15.255 6.183399
4 | 9.74125 10.01 .6668407
5 | 12.34 12.105 .592
---------+------------------------------
Total | 14.2065 14.13 10.95925
----------------------------------------
Fig. 18.6, p. 766.
twoway scatter type strength

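Modified Levene Test, p. 767.
Note: W50, which centers each group at its median, is the modified Levene
statistic; W0 centers at the mean and W10 at the 10% trimmed mean.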
robvar strength, by(type)
| Summary of strength
type | Mean Std. Dev. Freq.
------------+------------------------------------
1 | 15.42 1.2371393 8
2 | 18.5275 1.252971 8
3 | 15.00375 2.4866442 8
4 | 9.7412499 .81660316 8
5 | 12.34 .76941538 8
------------+------------------------------------
Total | 14.2065 3.3104765 40
W0 = 3.0678112 df(4, 35) Pr > F = 0.02880559
W50 = 2.9357754 df(4, 35) Pr > F = 0.0341384
W10 = 3.0678112 df(4, 35) Pr > F = 0.02880559
Table 18.3, p. 768.
sort type
by type: egen median = median(strength)
gen d = abs(strength-median)
table joint type, contents(mean d) cell(5) stubw(10)
----------------------------------------------
| type
joint | 1 2 3 4 5
-----------+----------------------------------
1 | .3 .165 1.7 1.42 .555
2 | 1.64 .165 2.98 .89 1.26
3 | .66 1.52 3.26 1.41 1.54
4 | .3 .515 2.07 .12 .055
5 | 1.57 1.21 .265 .27 .485
6 | .41 .165 .505 .03 .285
7 | 2.23 1.44 4.1 .6 .055
8 | .55 2.2 .265 .03 .155
----------------------------------------------
Creating the weights and the dummy variables for type to be used in the
weighted least squares regression. Table 18.4, pp. 769-771.
by type: egen s = sd(strength)
gen weight = (1/s^2)
tab type, gen(x)
gen x = 1
list type joint strength x1 x2 x3 x4 x5 weight x, clean
type joint strength x1 x2 x3 x4 x5 weight x
1. 1 1 14.87 1 0 0 0 0 .6533754 1
2. 1 2 16.81 1 0 0 0 0 .6533754 1
3. 1 3 15.83 1 0 0 0 0 .6533754 1
4. 1 4 15.47 1 0 0 0 0 .6533754 1
5. 1 5 13.6 1 0 0 0 0 .6533754 1
6. 1 6 14.76 1 0 0 0 0 .6533754 1
7. 1 7 17.4 1 0 0 0 0 .6533754 1
8. 1 8 14.62 1 0 0 0 0 .6533754 1
9. 2 1 18.43 0 1 0 0 0 .6369686 1
10. 2 2 18.76 0 1 0 0 0 .6369686 1
11. 2 3 20.12 0 1 0 0 0 .6369686 1
12. 2 4 19.11 0 1 0 0 0 .6369686 1
13. 2 5 19.81 0 1 0 0 0 .6369686 1
14. 2 6 18.43 0 1 0 0 0 .6369686 1
15. 2 7 17.16 0 1 0 0 0 .6369686 1
16. 2 8 16.4 0 1 0 0 0 .6369686 1
17. 3 1 16.95 0 0 1 0 0 .1617233 1
18. 3 2 12.28 0 0 1 0 0 .1617233 1
19. 3 3 12 0 0 1 0 0 .1617233 1
20. 3 4 13.18 0 0 1 0 0 .1617233 1
21. 3 5 14.99 0 0 1 0 0 .1617233 1
22. 3 6 15.76 0 0 1 0 0 .1617233 1
23. 3 7 19.35 0 0 1 0 0 .1617233 1
24. 3 8 15.52 0 0 1 0 0 .1617233 1
25. 4 1 8.59 0 0 0 1 0 1.499608 1
26. 4 2 10.9 0 0 0 1 0 1.499608 1
27. 4 3 8.6 0 0 0 1 0 1.499608 1
28. 4 4 10.13 0 0 0 1 0 1.499608 1
29. 4 5 10.28 0 0 0 1 0 1.499608 1
30. 4 6 9.98 0 0 0 1 0 1.499608 1
31. 4 7 9.41 0 0 0 1 0 1.499608 1
32. 4 8 10.04 0 0 0 1 0 1.499608 1
33. 5 1 11.55 0 0 0 0 1 1.689189 1
34. 5 2 13.36 0 0 0 0 1 1.689189 1
35. 5 3 13.64 0 0 0 0 1 1.689189 1
36. 5 4 12.16 0 0 0 0 1 1.689189 1
37. 5 5 11.62 0 0 0 0 1 1.689189 1
38. 5 6 12.39 0 0 0 0 1 1.689189 1
39. 5 7 12.05 0 0 0 0 1 1.689189 1
40. 5 8 11.95 0 0 0 0 1 1.689189 1
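The weight column is the reciprocal of each type's sample variance. As a quick check, the sketch below recomputes the weights from the variances printed in the tabstat output for Table 18.2. The matrix name S2 and the scalar w_type1 are made up for this illustration.

```stata
* Sample variances by flux type, copied from the tabstat output (Table 18.2)
matrix S2 = (1.530514, 1.569936, 6.183399, .6668407, .592)

* Weight for each type is 1 / variance
forvalues i = 1/5 {
    display "type `i': weight = " %9.7f 1/S2[1,`i']
}

* Keep the type 1 weight for comparison with the listing
scalar w_type1 = 1/S2[1,1]
```

Because the printed variances are rounded, the results should match the listed weights only to about six or seven significant digits.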
Fig. 18.7a, p. 771, using the same data as the previous example.
regress strength x1 x2 x3 x4 x5 [aweight=weight], noconstant
(sum of wgt is 3.7127e+01)
Source | SS df MS Number of obs = 40
-------------+------------------------------ F( 5, 35) = 1295.90
Model | 6980.9173 5 1396.18346 Prob > F = 0.0000
Residual | 37.708488 35 1.07738537 R-squared = 0.9946
-------------+------------------------------ Adj R-squared = 0.9939
Total | 7018.62579 40 175.465645 Root MSE = 1.038
------------------------------------------------------------------------------
strength | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
x1 | 15.42 .4373948 35.25 0.000 14.53204 16.30796
x2 | 18.5275 .4429921 41.82 0.000 17.62818 19.42682
x3 | 15.00375 .8791615 17.07 0.000 13.21896 16.78854
x4 | 9.74125 .2887128 33.74 0.000 9.155132 10.32737
x5 | 12.34 .2720294 45.36 0.000 11.78775 12.89225
------------------------------------------------------------------------------
Fig. 18.7b, p. 771.
regress strength x [aweight=weight], noconstant
(sum of wgt is 3.7127e+01)
Source | SS df MS Number of obs = 40
-------------+------------------------------ F( 1, 39) = 668.28
Model | 6631.61482 1 6631.61482 Prob > F = 0.0000
Residual | 387.010975 39 9.92335833 R-squared = 0.9449
-------------+------------------------------ Adj R-squared = 0.9434
Total | 7018.62579 40 175.465645 Root MSE = 3.1501
------------------------------------------------------------------------------
strength | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
x | 12.87596 .4980803 25.85 0.000 11.8685 13.88342
------------------------------------------------------------------------------
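The two weighted regressions can be read as a full model (separate means, Fig. 18.7a) and a reduced model (one common mean, Fig. 18.7b). Assuming the usual general linear test is what is wanted here, the sketch below forms the F statistic from the residual sums of squares and degrees of freedom printed in the two outputs. The scalar names are made up for this illustration.

```stata
* Residual SS and df copied from the two regress outputs
scalar sse_full = 37.708488
scalar df_full  = 35
scalar sse_red  = 387.010975
scalar df_red   = 39

* General linear test: extra SS per extra df, relative to full-model MSE
scalar f_wls = ((sse_red - sse_full)/(df_red - df_full)) / (sse_full/df_full)

display "F statistic = " %8.2f f_wls
display "p-value     = " %8.6f Ftail(df_red - df_full, df_full, f_wls)
```

Stata rescales analytic weights to sum to the number of observations, which multiplies both residual sums of squares by the same constant, so the F ratio is unaffected by that rescaling.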
Inputting the Servo data and obtaining the mean and variance of time by
location, table 18.5, p. 774.
clear
input time location interval
4.41 1 1
100.65 1 2
14.45 1 3
47.13 1 4
85.21 1 5
8.24 2 1
81.16 2 2
7.35 2 3
12.29 2 4
1.61 2 5
106.19 3 1
33.83 3 2
78.88 3 3
342.81 3 4
44.33 3 5
end
egen rank = rank(time)
table interval location , contents(mean time mean rank)
sort location
tabstat time , statistics(mean var ) by(location)
tabstat rank, statistics(mean var ) by(location)
----------------------------------
| location
interval | 1 2 3
----------+-----------------------
1 | 4.41 8.24 106.19
| 2 4 14
|
2 | 100.65 81.16 33.83
| 13 11 7
|
3 | 14.45 7.35 78.88
| 6 3 10
|
4 | 47.13 12.29 342.81
| 9 5 15
|
5 | 85.21 1.61 44.33
| 12 1 8
----------------------------------
Summary for variables: time
by categories of: location
location | mean variance
---------+--------------------
1 | 50.37 1788.742
2 | 22.13 1103.454
3 | 121.208 16167.45
---------+--------------------
Total | 64.56933 7306.561
------------------------------
Summary for variables: rank
by categories of: location
location | mean variance
---------+--------------------
1 | 8.4 20.3
2 | 4.8 14.2
3 | 10.8 12.7
---------+--------------------
Total | 8 20
------------------------------
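Diagnostic statistics for determining the appropriate transformation of
time, bottom of p. 773.
Note: despite the variable names, a roughly constant sd^2/mean suggests a
square root transformation, a roughly constant sd/mean suggests a log
transformation, and a roughly constant sd/mean^2 suggests a reciprocal
transformation.
sort location
by location: egen sd = sd(time)
by location: egen mean = mean(time)
gen sqrt = (sd^2)/mean
gen inv = sd/mean
gen arcsinsqrt = sd/(mean^2)
table location, contents(mean sqrt mean inv mean arcsinsqrt)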
----------------------------------------------------------
location | mean(sqrt) mean(inv) mean(arcsin~t)
----------+-----------------------------------------------
1 | 35.51206 .8396571 .0166698
2 | 49.86238 1.501052 .0678288
3 | 133.386 1.049034 .0086548
----------------------------------------------------------
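The three ratios can also be recomputed from the means and variances printed in the tabstat output for time by location. The sketch below does this with made-up matrix and scalar names (MEANS, VARS, grp_mean, grp_sd); it is meant only as a check on the table above.

```stata
* Means and variances of time by location, copied from the tabstat output
matrix MEANS = (50.37, 22.13, 121.208)
matrix VARS  = (1788.742, 1103.454, 16167.45)

display "location" _col(12) "s^2/mean" _col(24) "s/mean" _col(36) "s/mean^2"
forvalues i = 1/3 {
    scalar grp_mean = MEANS[1,`i']
    scalar grp_sd   = sqrt(VARS[1,`i'])
    * One row per location: the three candidate ratios
    display `i' _col(12) %8.4f grp_sd^2/grp_mean _col(24) %8.4f grp_sd/grp_mean _col(36) %8.6f grp_sd/grp_mean^2
}
```

Of the three, s/mean varies least across locations (by a factor of less than 2, against factors of roughly 4 and 8 for the other two).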
Table 18.6, p. 775.
means time
* k2 is the geometric mean of time
scalar k2 = r(mean_g)
capture drop myw
gen myw = .
* Loop over lambda = -1, -.9, ..., 1
foreach n of numlist 0/20 {
    local lambda = (`n'-10)/10
    scalar k1 = k2^(1-`lambda')/`lambda'
    if (`lambda' == 0) {
        quietly replace myw = k2*ln(time)
    }
    else {
        quietly replace myw = k1*(time^`lambda' -1)
    }
    quietly xi: reg myw i.location
    * Error sum of squares, in thousands
    local rss_1000 = e(rss)/1000
    display in yellow "`lambda'" _col(10) %8.1f `rss_1000'
}
-1 203.5
-.9 137.7
-.8 95.1
-.7 67.1
-.6 48.7
-.5 36.5
-.4 28.3
-.3 22.8
-.2 19.2
-.1 17.0
0 15.7
.1 15.3
.2 15.6
.3 16.7
.4 18.7
.5 21.8
.6 26.4
.7 33.0
.8 42.6
.9 56.4
1 76.2
Figure 18.8a, p. 775.
anova time location
predict rloc, residuals
qnorm rloc

Figure 18.8b, p. 775.
gen lntime = ln(time)
anova lntime location
predict rtrans, residuals
qnorm rtrans

Kruskal-Wallis test of the Servo data, pp. 778-779.
Note: Use equation (18.29) on page 779 to get the F statistic for the rank
test.
kwallis time, by(location)
Kruskal-Wallis equality-of-populations rank test
+---------------------------+
| location | Obs | Rank Sum |
|----------+-----+----------|
| 1 | 5 | 42.00 |
| 2 | 5 | 24.00 |
| 3 | 5 | 54.00 |
+---------------------------+
chi-squared = 4.560 with 2 d.f.
probability = 0.1023
chi-squared with ties = 4.560 with 2 d.f.
probability = 0.1023
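kwallis does not report an F statistic. Assuming equation (18.29) is the usual ratio MSTR/MSE computed on the ranks, that ratio can be recovered from the chi-squared value when there are no ties. The sketch below does so with the numbers printed above; the scalar names are made up for this illustration.

```stata
* Values copied from the kwallis output and the data layout
scalar kw_chi2  = 4.560
scalar n_total  = 15
scalar n_groups = 3

* F on ranks, written in terms of the Kruskal-Wallis statistic (no ties)
scalar f_rank = ((n_total - n_groups)*kw_chi2) / ((n_groups - 1)*(n_total - 1 - kw_chi2))

display "F on ranks = " %6.3f f_rank
display "p-value    = " %6.4f Ftail(n_groups - 1, n_total - n_groups, f_rank)
```

With the Servo data still in memory, running anova rank location should give the same F, since the rank variable was created earlier with egen.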
These pages are Copyrighted (c) by UCLA Academic Technology Services