page 68 Table 3.1 Regression of postshortage (1981) water use on income and preshortage (1980) water use. The concord1 data set is used.
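/* Two simple regressions on water81: one for income, one for water80 */
proc reg data = concord1;
  model income water80 = water81;
run;
/* Table 3.1: water81 regressed on income and water80 */
proc reg data = concord1;
  model water81 = income water80;
run;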
The REG Procedure
Model: MODEL1
Dependent Variable: income
Analysis of Variance
Sum of Mean
Source DF Squares Square F Value Pr > F
Model 1 14732 14732 104.46 <.0001
Error 494 69669 141.03078
Corrected Total 495 84401
Root MSE 11.87564 R-Square 0.1745
Dependent Mean 23.07661 Adj R-Sq 0.1729
Coeff Var 51.46179
Parameter Estimates
Parameter Standard
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 14.63948 0.98275 14.90 <.0001
water81 1 0.00367 0.00035917 10.22 <.0001
The REG Procedure
Model: MODEL1
Dependent Variable: water80
Analysis of Variance
Sum of Mean
Source DF Squares Square F Value Pr > F
Model 1 900727515 900727515 696.11 <.0001
Error 494 639212788 1293953
Corrected Total 495 1539940302
Root MSE 1137.52055 R-Square 0.5849
Dependent Mean 2732.05645 Adj R-Sq 0.5841
Coeff Var 41.63606
Parameter Estimates
Parameter Standard
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 645.82548 94.13406 6.86 <.0001
water81 1 0.90769 0.03440 26.38 <.0001
The REG Procedure
Model: MODEL1
Dependent Variable: water81
Analysis of Variance
Sum of Mean
Source DF Squares Square F Value Pr > F
Model 2 671025350 335512675 391.76 <.0001
Error 493 422213359 856417
Corrected Total 495 1093238710
Root MSE 925.42777 R-Square 0.6138
Dependent Mean 2298.38710 Adj R-Sq 0.6122
Coeff Var 40.26423
Parameter Estimates
Parameter Standard
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 203.82169 94.36129 2.16 0.0313
income 1 20.54504 3.38341 6.07 <.0001
water80 1 0.59313 0.02505 23.68 <.0001
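As a quick reading aid, the parameter estimates for the two-predictor model can be written as a fitted equation. The check below is only a sketch: it plugs in the sample means of income and water80 reported as "Dependent Mean" in the two preceding outputs, and the result agrees with the mean of water81 up to rounding of the printed coefficients.

```latex
\[
\widehat{\text{water81}} = 203.82169 + 20.54504\,\text{income} + 0.59313\,\text{water80}
\]
\[
203.82169 + 20.54504\,(23.07661) + 0.59313\,(2732.05645)
\approx 203.82 + 474.11 + 1620.46 \approx 2298.4
\]
```

The value 2298.4 matches the dependent mean of water81 (2298.38710), as expected for a least-squares fit with an intercept.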
page 70 Figure 3.1 Partial regression leverage plot: postshortage water use (Y) versus income (X1), adjusting for preshortage water use.
proc sort data=concord1 out=concsort;
  by case;
run;
/* Full model, for reference */
proc reg data=concsort;
  model water81 = income water80;
run;
/* Residuals of Y (water81) after removing water80 */
proc reg data=concsort;
  model water81 = water80;
  output out=out1(keep=case yres) residual=yres;
run;
/* Residuals of X1 (income) after removing water80 */
proc reg data=concsort;
  model income = water80;
  output out=out2(keep=case x1res) residual=x1res;
run;
/* Combine both sets of residuals with the sorted data */
data all;
  merge concsort out1 out2;
  by case;
  label yres = 'ey/x2';
  label x1res = 'ex1/x2';
run;
The REG Procedure
Model: MODEL1
Dependent Variable: water81
Analysis of Variance
Sum of Mean
Source DF Squares Square F Value Pr > F
Model 2 671025350 335512675 391.76 <.0001
Error 493 422213359 856417
Corrected Total 495 1093238710
Root MSE 925.42777 R-Square 0.6138
Dependent Mean 2298.38710 Adj R-Sq 0.6122
Coeff Var 40.26423
Parameter Estimates
Parameter Standard
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 203.82169 94.36129 2.16 0.0313
income 1 20.54504 3.38341 6.07 <.0001
water80 1 0.59313 0.02505 23.68 <.0001
The REG Procedure
Model: MODEL1
Dependent Variable: water81
Analysis of Variance
Sum of Mean
Source DF Squares Square F Value Pr > F
Model 1 639446987 639446987 696.11 <.0001
Error 494 453791723 918607
Corrected Total 495 1093238710
Root MSE 958.43974 R-Square 0.5849
Dependent Mean 2298.38710 Adj R-Sq 0.5841
Coeff Var 41.70054
Parameter Estimates
Parameter Standard
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 537.87101 79.40114 6.77 <.0001
water80 1 0.64439 0.02442 26.38 <.0001
The REG Procedure
Model: MODEL1
Dependent Variable: income
Analysis of Variance
Sum of Mean
Source DF Squares Square F Value Pr > F
Model 1 9588.31670 9588.31670 63.31 <.0001
Error 494 74813 151.44286
Corrected Total 495 84401
Root MSE 12.30621 R-Square 0.1136
Dependent Mean 23.07661 Adj R-Sq 0.1118
Coeff Var 53.32764
Parameter Estimates
Parameter Standard
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 16.25937 1.01950 15.95 <.0001
water80 1 0.00250 0.00031360 7.96 <.0001
symbol1 color=black interpol=r value=circle height=0.5; axis1 order=(-30 to 70 by 10); proc gplot data=all; plot yres*x1res / haxis=axis1; run; quit;
Figure 3.1 (image not reproduced here)
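The slope of the line drawn in a partial regression leverage plot can be written down directly. The notation here is introduced only for illustration and does not appear in the SAS output: let $e_{Y|2}$ stand for yres (residuals of water81 regressed on water80) and $e_{1|2}$ for x1res (residuals of income regressed on water80). The least-squares slope of yres on x1res then equals the income coefficient from the full two-predictor model.

```latex
\[
b_1 \;=\; \frac{\sum_i e_{1|2,i}\; e_{Y|2,i}}{\sum_i e_{1|2,i}^{2}} \;=\; 20.54504
\]
```

No intercept term is needed because both sets of residuals have mean zero.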
page 71 Figure 3.2 Partial regression leverage plot: postshortage water use water81 (Y) versus preshortage water use water80 (X2), adjusting for income.
proc sort data=concord1 out=concsort1;
  by case;
run;
/* Full model, for reference */
proc reg data=concsort1;
  model water81 = income water80;
run;
/* Residuals of Y (water81) after removing income */
proc reg data=concsort1;
  model water81 = income;
  output out=out3(keep=case yres) residual=yres;
run;
/* Residuals of X2 (water80) after removing income; stored as x1res so the plot code is unchanged */
proc reg data=concsort1;
  model water80 = income;
  output out=out4(keep=case x1res) residual=x1res;
run;
quit;
data all;
  merge concsort1 out3 out4;
  by case;
  label yres = 'ey/x1';
  label x1res = 'ex2/x1';
run;
The REG Procedure
Model: MODEL1
Dependent Variable: water81
Analysis of Variance
Sum of Mean
Source DF Squares Square F Value Pr > F
Model 2 671025350 335512675 391.76 <.0001
Error 493 422213359 856417
Corrected Total 495 1093238710
Root MSE 925.42777 R-Square 0.6138
Dependent Mean 2298.38710 Adj R-Sq 0.6122
Coeff Var 40.26423
Parameter Estimates
Parameter Standard
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 203.82169 94.36129 2.16 0.0313
income 1 20.54504 3.38341 6.07 <.0001
water80 1 0.59313 0.02505 23.68 <.0001
The REG Procedure
Model: MODEL1
Dependent Variable: water81
Analysis of Variance
Sum of Mean
Source DF Squares Square F Value Pr > F
Model 1 190820566 190820566 104.46 <.0001
Error 494 902418143 1826757
Corrected Total 495 1093238710
Root MSE 1351.57589 R-Square 0.1745
Dependent Mean 2298.38710 Adj R-Sq 0.1729
Coeff Var 58.80541
Parameter Estimates
Parameter Standard
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 1201.12436 123.32451 9.74 <.0001
income 1 47.54869 4.65229 10.22 <.0001
The REG Procedure
Model: MODEL1
Dependent Variable: water80
Analysis of Variance
Sum of Mean
Source DF Squares Square F Value Pr > F
Model 1 174943659 174943659 63.31 <.0001
Error 494 1364996643 2763151
Corrected Total 495 1539940302
Root MSE 1662.27287 R-Square 0.1136
Dependent Mean 2732.05645 Adj R-Sq 0.1118
Coeff Var 60.84328
Parameter Estimates
Parameter Standard
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 1681.43287 151.67405 11.09 <.0001
income 1 45.52763 5.72174 7.96 <.0001
symbol1 color=black interpol=r value=circle height=0.5; axis1 order=(-4000 to 10000 by 2000); proc gplot data=all; plot yres*x1res / haxis=axis1; run; quit;
Figure 3.2 (image not reproduced here)
page 74 Table 3.2 Regression of postshortage water use on income, preshortage water use, education, retirement, number of people resident, and increase in people resident.
data concx;
  set concord1;
  retired = .;
  if retire = 'yes' then retired = 1;
  if retire = 'no' then retired = 0;
run;
proc freq data=concx;
  tables retire*retired / missing;
run;
proc reg data=concx;
  model water81 = income water80 educat retired peop81 cpeop;
run;
quit;
The FREQ Procedure
Table of retire by retired
retire retired
Frequency|
Percent |
Row Pct |
Col Pct | 0| 1| Total
---------+--------+--------+
no | 350 | 0 | 350
| 70.56 | 0.00 | 70.56
| 100.00 | 0.00 |
| 100.00 | 0.00 |
---------+--------+--------+
yes | 0 | 146 | 146
| 0.00 | 29.44 | 29.44
| 0.00 | 100.00 |
| 0.00 | 100.00 |
---------+--------+--------+
Total 350 146 496
70.56 29.44 100.00
The REG Procedure
Model: MODEL1
Dependent Variable: water81
Analysis of Variance
Sum of Mean
Source DF Squares Square F Value Pr > F
Model 6 740477522 123412920 171.08 <.0001
Error 489 352761188 721393
Corrected Total 495 1093238710
Root MSE 849.34859 R-Square 0.6773
Dependent Mean 2298.38710 Adj R-Sq 0.6734
Coeff Var 36.95411
Parameter Estimates
Parameter Standard
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 242.22043 206.86382 1.17 0.2422
income 1 20.96699 3.46372 6.05 <.0001
water80 1 0.49194 0.02635 18.67 <.0001
educat 1 -41.86552 13.22031 -3.17 0.0016
retired 1 189.18433 95.02142 1.99 0.0470
peop81 1 248.19702 28.72480 8.64 <.0001
cpeop 1 96.45360 80.51903 1.20 0.2315
page 80 Table 3.3 Regression of postshortage water use omitting income and education.
proc reg data=concx; model water81 = water80 peop81 retired cpeop ; run; quit;
The REG Procedure
Model: MODEL1
Dependent Variable: water81
Analysis of Variance
Sum of Mean
Source DF Squares Square F Value Pr > F
Model 4 712718346 178179587 229.91 <.0001
Error 491 380520363 774991
Corrected Total 495 1093238710
Root MSE 880.33548 R-Square 0.6519
Dependent Mean 2298.38710 Adj R-Sq 0.6491
Coeff Var 38.30232
Parameter Estimates
Parameter Standard
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 48.64897 107.05488 0.45 0.6497
water80 1 0.51974 0.02677 19.41 <.0001
peop81 1 265.28936 29.63234 8.95 <.0001
retired 1 67.27992 94.28846 0.71 0.4758
cpeop 1 134.46255 83.19590 1.62 0.1067
The next examples use the wells data set. First, we create the natural log of chloride concentration. In SAS, the log(x) function returns the natural logarithm.
data wells2; set wells; lnchlor = log(chlor); run;
page 86 Equation [3.32]
proc reg data = wells2; model lnchlor = deep; run; quit;
The REG Procedure
Model: MODEL1
Dependent Variable: lnchlor
Analysis of Variance
Sum of Mean
Source DF Squares Square F Value Pr > F
Model 1 4.02334 4.02334 2.19 0.1455
Error 50 91.99885 1.83998
Corrected Total 51 96.02220
Root MSE 1.35646 R-Square 0.0419
Dependent Mean 3.20505 Adj R-Sq 0.0227
Coeff Var 42.32257
Parameter Estimates
Parameter Standard
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 3.77510 0.42895 8.80 <.0001
deep 1 -0.70578 0.47729 -1.48 0.1455
The pooled (equal variances) t-test below tests the same hypothesis as the regression above: that mean lnchlor is the same for both well types.
proc ttest data = wells2; class deep; var lnchlor; run;
The TTEST Procedure
Statistics
                              Lower CL            Upper CL   Lower CL             Upper CL
Variable   Class         N      Mean      Mean      Mean     Std Dev    Std Dev   Std Dev   Std Err
lnchlor    0            10     2.5345    3.7751    5.0157     1.1929     1.7343    3.1661    0.5484
lnchlor    1            42     2.6772    3.0693    3.4615     1.0354     1.2584    1.6047    0.1942
lnchlor    Diff (1-2)          -0.253    0.7058    1.6644      1.135     1.3565    1.6862    0.4773
T-Tests
Variable Method Variances DF t Value Pr > |t|
lnchlor Pooled Equal 50 1.48 0.1455
lnchlor Satterthwaite Unequal 11.4 1.21 0.2497
Equality of Variances
Variable Method Num DF Den DF F Value Pr > F
lnchlor Folded F 9 41 1.90 0.1587
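The link between the t-test and the dummy-variable regression can be checked by hand from the two outputs above. In this sketch, $\bar{y}_0$ and $\bar{y}_1$ are labels introduced here for the mean of lnchlor when deep = 0 and deep = 1; the numbers are taken from the printed results.

```latex
\[
b_0 = \bar{y}_0 = 3.7751, \qquad
b_1 = \bar{y}_1 - \bar{y}_0 = 3.0693 - 3.7751 = -0.7058
\]
\[
|t| = \frac{0.7058}{0.4773} \approx 1.48, \qquad \text{df} = 50, \qquad p = 0.1455
\]
```

The sign differs only because PROC TTEST reports the difference as class 0 minus class 1, while the regression slope is class 1 minus class 0.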
page 87 Figure 3.3 Regression of log chloride concentration on a dummy variable for well type.
symbol1 color=black interpol=r value=circle height=0.5; axis1 order=(1 to 8 by 1); axis2 order=(0 1); proc gplot data=wells2; plot lnchlor*deep / vaxis=axis1 haxis=axis2; run; quit;
page 87 Equation [3.33]
data wells3;
  set wells2;
  lndroad = log(droad);
run;
proc reg data=wells3;
  model lnchlor = deep lndroad;
  output out = wells4 p = yhat;
run;
quit;
The REG Procedure
Model: MODEL1
Dependent Variable: lnchlor
Analysis of Variance
Sum of Mean
Source DF Squares Square F Value Pr > F
Model 2 4.50188 2.25094 1.21 0.3084
Error 49 91.52032 1.86776
Corrected Total 51 96.02220
Root MSE 1.36666 R-Square 0.0469
Dependent Mean 3.20505 Adj R-Sq 0.0080
Coeff Var 42.64091
Parameter Estimates
Parameter Standard
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 4.20954 0.96096 4.38 <.0001
deep 1 -0.69712 0.48119 -1.45 0.1538
lndroad 1 -0.09097 0.17972 -0.51 0.6150
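With an intercept dummy only, the model implies two parallel lines, one per well type. As a sketch based on the estimates printed above, substituting deep = 0 and deep = 1 gives:

```latex
\[
\begin{aligned}
\text{deep}=0:\quad & \widehat{\text{lnchlor}} = 4.20954 - 0.09097\,\text{lndroad} \\
\text{deep}=1:\quad & \widehat{\text{lnchlor}} = (4.20954 - 0.69712) - 0.09097\,\text{lndroad}
                     = 3.51242 - 0.09097\,\text{lndroad}
\end{aligned}
\]
```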
page 88 Figure 3.4 Regression of log chloride concentration on log distance from road and an intercept dummy variable for well type.
data wells5;
set wells4;
if deep=0 then yhat0=yhat;
if deep=1 then yhat1=yhat;
run;
symbol1 color=black interpol=none value=circle height=0.5;
symbol2 interpol=join;
symbol3 interpol=join;
axis1 order=(0 to 7 by 1);
axis2 order=(0 to 8 by 2);
proc gplot data=wells5;
plot lnchlor*lndroad=1 yhat0*lndroad=2 yhat1*lndroad=3 /
overlay vaxis=axis1 haxis=axis2;
run;
quit;
The overlay option tells SAS to draw the three plot requests on a single graph. The number after each equals sign selects the symbol statement (symbol1, symbol2, or symbol3) that applies to that plot request. If only one symbol statement were used, SAS would apply it to all three plots. Also note that you can issue the goptions reset=all statement before the symbol statements. This resets all graphics options so that settings from previous graphs are not applied to the current graph.
page 89 Figure 3.5 Regression of log chloride concentration on log distance from road and a slope dummy variable for well type.
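data wells6;
  set wells5;
  /* lndroad already exists in wells5; it is recomputed first so deeproad never depends on a stale value */
  lndroad = log(droad);
  deeproad = deep*lndroad;
run;
proc reg data = wells6;
  model lnchlor = lndroad deeproad;
  output out = wells7 p = yhata;
run;
quit;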
The REG Procedure
Model: MODEL1
Dependent Variable: lnchlor
Analysis of Variance
Sum of Mean
Source DF Squares Square F Value Pr > F
Model 2 1.87088 0.93544 0.49 0.6175
Error 49 94.15131 1.92146
Corrected Total 51 96.02220
Root MSE 1.38617 R-Square 0.0195
Dependent Mean 3.20505 Adj R-Sq -0.0205
Coeff Var 43.24948
Parameter Estimates
Parameter Standard
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 3.66615 0.90518 4.05 0.0002
lndroad 1 -0.02897 0.20187 -0.14 0.8865
deeproad 1 -0.08147 0.09946 -0.82 0.4167
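With a slope dummy only, the two well types share an intercept but have different slopes. As a sketch using the estimates printed above:

```latex
\[
\begin{aligned}
\text{deep}=0:\quad & \widehat{\text{lnchlor}} = 3.66615 - 0.02897\,\text{lndroad} \\
\text{deep}=1:\quad & \widehat{\text{lnchlor}} = 3.66615 + (-0.02897 - 0.08147)\,\text{lndroad}
                     = 3.66615 - 0.11044\,\text{lndroad}
\end{aligned}
\]
```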
data wells8;
set wells7;
if deep=0 then yhat0=yhata;
if deep=1 then yhat1=yhata;
run;
symbol1 color=black interpol=none value=circle height=0.5;
symbol2 interpol=join;
symbol3 interpol=join;
axis1 order=(0 to 7 by 1);
axis2 order=(0 to 8 by 2);
proc gplot data=wells8;
plot lnchlor*lndroad=1 yhat0*lndroad=2 yhat1*lndroad=3 /
overlay vaxis=axis1 haxis=axis2;
run;
quit;
page 90 Table 3.4 Regression of log chloride concentration on log distance from road, with intercept and slope dummy variables for well type.
proc reg data = wells8; model lnchlor = deep lndroad deeproad; output out = wells9 p = yhatb; run; quit;
The REG Procedure
Model: MODEL1
Dependent Variable: lnchlor
Analysis of Variance
Sum of Mean
Source DF Squares Square F Value Pr > F
Model 3 18.48313 6.16104 3.81 0.0157
Error 48 77.53907 1.61540
Corrected Total 51 96.02220
Root MSE 1.27098 R-Square 0.1925
Dependent Mean 3.20505 Adj R-Sq 0.1420
Coeff Var 39.65568
Parameter Estimates
Parameter Standard
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 9.07346 1.87938 4.83 <.0001
deep 1 -6.71737 2.09471 -3.21 0.0024
lndroad 1 -1.10942 0.38442 -2.89 0.0058
deeproad 1 1.25585 0.42688 2.94 0.0050
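With both an intercept dummy and a slope dummy, the single equation above is equivalent to two separate lines. As a sketch, substituting deep = 0 and deep = 1 into the printed estimates gives:

```latex
\[
\begin{aligned}
\text{deep}=0:\quad & \widehat{\text{lnchlor}} = 9.07346 - 1.10942\,\text{lndroad} \\
\text{deep}=1:\quad & \widehat{\text{lnchlor}} = (9.07346 - 6.71737) + (-1.10942 + 1.25585)\,\text{lndroad} \\
                    & \phantom{\widehat{\text{lnchlor}}} = 2.35609 + 0.14643\,\text{lndroad}
\end{aligned}
\]
```

The deep coefficient is therefore the difference in intercepts, and the deeproad coefficient is the difference in slopes, between deep and shallow wells.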
page 91 Figure 3.6 Regression of log chloride concentration on log distance from road, with slope and intercept dummy variables for well type.
data wells10;
  set wells9;
  if deep=0 then yhat0=yhatb;
  if deep=1 then yhat1=yhatb;
  if deep=0 then lnchlor0=lnchlor;
  if deep=1 then lnchlor1=lnchlor;
run;
symbol1 color=black interpol=none value=square height=1.0;
symbol2 color=black interpol=none value=circle height=1.0;
symbol3 interpol=join;
symbol4 interpol=join;
axis1 order=(0 to 9 by 1);
axis2 order=(0 to 8 by 2);
proc gplot data=wells10;
  plot lnchlor0*lndroad=1 lnchlor1*lndroad=2 yhat0*lndroad=3 yhat1*lndroad=4 /
       overlay vaxis=axis1 haxis=axis2;
run;
quit;
Figure 3.6 (image not reproduced here)
page 91 Figure 3.7 Separate regressions for shallow (left) and deep (right) wells, same lines as in Figure 3.6.
proc gplot data=wells10; plot lnchlor0*lndroad=1 yhat0*lndroad=3 / overlay vaxis=axis1 haxis=axis2; run; quit;
proc gplot data=wells10; plot lnchlor1*lndroad=2 yhat1*lndroad=4 / overlay vaxis=axis1 haxis=axis2; run; quit;
The next examples use the radon data set. First we create dummy variables for bedrock area (rdx1, fdx2), a three-level recode of radon (mhr), and dummy variables for radon level (lrdx3, mrdx4).
data radon1;
  set radon;
  /* Bedrock area dummies; Control is the reference category */
  if locale='RProng' then rdx1=1;
  if locale='Fringe' then rdx1=0;
  if locale='Control' then rdx1=0;
  if locale='RProng' then fdx2=0;
  if locale='Fringe' then fdx2=1;
  if locale='Control' then fdx2=0;
  /* Three-level recode of radon; note that a value of exactly 2.5, */
  /* or one between the listed cutoffs, would be left unclassified  */
  if radon >= 0 and radon <= 1.5 then mhr='low';
  if radon >= 1.6 and radon <= 2.4 then mhr='mid';
  if radon > 2.5 then mhr='hig';
  /* Radon level dummies; 'hig' is the reference category */
  if mhr='low' then lrdx3=1;
  if mhr='mid' then lrdx3=0;
  if mhr='hig' then lrdx3=0;
  if mhr='low' then mrdx4=0;
  if mhr='mid' then mrdx4=1;
  if mhr='hig' then mrdx4=0;
run;
page 93 Table 3.5 Cancer, bedrock, and radon in 26 counties.
proc print data=radon1 noobs; var county cancer locale rdx1 fdx2 mhr lrdx3 mrdx4; run;
county         cancer   locale    rdx1   fdx2   mhr   lrdx3   mrdx4
Orange            6.0   RProng       1      0   low       1       0
Putnam           10.5   RProng       1      0   mid       0       1
Sussex            6.7   RProng       1      0   mid       0       1
Warren            6.0   RProng       1      0   hig       0       0
Morris            6.1   RProng       1      0   low       1       0
Hunterdon         6.7   RProng       1      0   hig       0       0
Berks             5.2   Fringe       0      1   hig       0       0
Lehigh            5.6   Fringe       0      1   hig       0       0
Northampton       5.8   Fringe       0      1   hig       0       0
Pike              4.5   Fringe       0      1   low       1       0
Dutchess          5.5   Fringe       0      1   mid       0       1
Sullivan          5.4   Fringe       0      1   low       1       0
Ulster            6.3   Fringe       0      1   low       1       0
Columbia          6.3   Control      0      0   mid       0       1
Delaware          4.3   Control      0      0   mid       0       1
Greene            4.0   Control      0      0   mid       0       1
Otsego            5.9   Control      0      0   mid       0       1
Tioga             4.7   Control      0      0   mid       0       1
Carbon            4.8   Control      0      0   mid       0       1
Lebanon           5.8   Control      0      0   hig       0       0
Lackawanna        5.4   Control      0      0   low       1       0
Luzerne           5.2   Control      0      0   low       1       0
Schuylkill        3.6   Control      0      0   hig       0       0
Susquehanna       4.3   Control      0      0   low       1       0
Wayne             3.5   Control      0      0   low       1       0
Wyoming           6.9   Control      0      0   mid       0       1
page 94 Table 3.6 Relation between cancer rate and bedrock area.
proc reg data=radon1; model cancer = rdx1 fdx2; t1: test rdx1=0, fdx2=0; run; quit;
The REG Procedure
Model: MODEL1
Dependent Variable: cancer
Analysis of Variance
Sum of Mean
Source DF Squares Square F Value Pr > F
Model 2 16.90879 8.45440 6.41 0.0061
Error 23 30.33736 1.31902
Corrected Total 25 47.24615
Root MSE 1.14848 R-Square 0.3579
Dependent Mean 5.57692 Adj R-Sq 0.3021
Coeff Var 20.59351
Parameter Estimates
Parameter Standard
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 4.97692 0.31853 15.62 <.0001
rdx1 1 2.02308 0.56683 3.57 0.0016
fdx2 1 0.49451 0.53842 0.92 0.3679
The REG Procedure
Model: MODEL1
Test T1 Results for Dependent Variable cancer
Mean
Source DF Square F Value Pr > F
Numerator 2 8.45440 6.41 0.0061
Denominator 23 1.31902
proc glm data=radon1; model cancer = rdx1 fdx2; run; quit;
The GLM Procedure
Number of observations 26
The GLM Procedure
Dependent Variable: cancer
Sum of
Source DF Squares Mean Square F Value Pr > F
Model 2 16.90879121 8.45439560 6.41 0.0061
Error 23 30.33736264 1.31901577
Corrected Total 25 47.24615385
R-Square Coeff Var Root MSE cancer Mean
0.357887 20.59351 1.148484 5.576923
Source DF Type I SS Mean Square F Value Pr > F
rdx1 1 15.79615385 15.79615385 11.98 0.0021
fdx2 1 1.11263736 1.11263736 0.84 0.3679
Source DF Type III SS Mean Square F Value Pr > F
rdx1 1 16.80218623 16.80218623 12.74 0.0016
fdx2 1 1.11263736 1.11263736 0.84 0.3679
Standard
Parameter Estimate Error t Value Pr > |t|
Intercept 4.976923077 0.31853218 15.62 <.0001
rdx1 2.023076923 0.56683217 3.57 0.0016
fdx2 0.494505495 0.53841766 0.92 0.3679
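With 0/1 dummy coding and Control as the omitted category, the coefficients translate directly into group means. The following sketch uses the estimates printed above; the symbols $\bar{y}$ with a subscript are labels introduced here for the mean cancer rate in each bedrock area.

```latex
\[
\begin{aligned}
\bar{y}_{\text{Control}} &= b_0 = 4.97692 \\
\bar{y}_{\text{RProng}}  &= b_0 + b_1 = 4.97692 + 2.02308 = 7.00000 \\
\bar{y}_{\text{Fringe}}  &= b_0 + b_2 = 4.97692 + 0.49451 = 5.47143
\end{aligned}
\]
```

These agree with the means computed directly from the listing in Table 3.5 (for example, the six RProng counties sum to 42.0, giving a mean of 7.0).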
page 96 Table 3.7 Relation among cancer rate, bedrock area, and radon.
proc reg data=radon1; model cancer = rdx1 fdx2 lrdx3 mrdx4; t2: test rdx1=0, fdx2=0; t3: test lrdx3=0, mrdx4=0; run; quit;
The REG Procedure
Model: MODEL1
Dependent Variable: cancer
Analysis of Variance
Sum of Mean
Source DF Squares Square F Value Pr > F
Model 4 22.12361 5.53090 4.62 0.0078
Error 21 25.12254 1.19631
Corrected Total 25 47.24615
Root MSE 1.09376 R-Square 0.4683
Dependent Mean 5.57692 Adj R-Sq 0.3670
Coeff Var 19.61225
Parameter Estimates
Parameter Standard
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 4.52504 0.52808 8.57 <.0001
rdx1 1 2.21189 0.55102 4.01 0.0006
fdx2 1 0.86698 0.54921 1.58 0.1294
lrdx3 1 -0.11668 0.55602 -0.21 0.8358
mrdx4 1 0.90588 0.57595 1.57 0.1307
The REG Procedure
Model: MODEL1
Test T2 Results for Dependent Variable cancer
Mean
Source DF Square F Value Pr > F
Numerator 2 9.64232 8.06 0.0025
Denominator 21 1.19631
The REG Procedure
Model: MODEL1
Test T3 Results for Dependent Variable cancer
Mean
Source DF Square F Value Pr > F
Numerator 2 2.60741 2.18 0.1380
Denominator 21 1.19631
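The test statements compute incremental F statistics. As a sketch of the arithmetic behind test T3, the numerator can be recovered from the error sums of squares of the reduced model (rdx1 and fdx2 only, from Table 3.6) and the full model above. The symbols RSS, $q$, and $n-K$ are notation introduced here, not SAS output labels.

```latex
\[
F = \frac{(\mathrm{RSS}_{\text{reduced}} - \mathrm{RSS}_{\text{full}})/q}{\mathrm{RSS}_{\text{full}}/(n-K)}
  = \frac{(30.33736 - 25.12254)/2}{25.12254/21}
  = \frac{2.60741}{1.19631} \approx 2.18
\]
```

Test T2 follows the same pattern: $9.64232 / 1.19631 \approx 8.06$.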
proc glm data=radon1; model cancer = rdx1 fdx2 lrdx3 mrdx4; run; quit;
The GLM Procedure
Number of observations 26
The GLM Procedure
Dependent Variable: cancer
Sum of
Source DF Squares Mean Square F Value Pr > F
Model 4 22.12361500 5.53090375 4.62 0.0078
Error 21 25.12253885 1.19631137
Corrected Total 25 47.24615385
R-Square Coeff Var Root MSE cancer Mean
0.468263 19.61225 1.093760 5.576923
Source DF Type I SS Mean Square F Value Pr > F
rdx1 1 15.79615385 15.79615385 13.20 0.0016
fdx2 1 1.11263736 1.11263736 0.93 0.3458
lrdx3 1 2.25529715 2.25529715 1.89 0.1842
mrdx4 1 2.95952664 2.95952664 2.47 0.1307
Source DF Type III SS Mean Square F Value Pr > F
rdx1 1 19.27700707 19.27700707 16.11 0.0006
fdx2 1 2.98111766 2.98111766 2.49 0.1294
lrdx3 1 0.05267737 0.05267737 0.04 0.8358
mrdx4 1 2.95952664 2.95952664 2.47 0.1307
Standard
Parameter Estimate Error t Value Pr > |t|
Intercept 4.525039288 0.52807932 8.57 <.0001
rdx1 2.211891042 0.55101833 4.01 0.0006
fdx2 0.866980967 0.54921466 1.58 0.1294
lrdx3 -0.116675397 0.55601868 -0.21 0.8358
mrdx4 0.905884407 0.57594866 1.57 0.1307
page 98 Table 3.8 Relation among cancer rate, bedrock area, and radon.
data radon2;
  set radon1;
  /* Interaction terms: bedrock area dummies times radon level dummies */
  x1x3=rdx1*lrdx3;
  x1x4=rdx1*mrdx4;
  x2x3=fdx2*lrdx3;
  x2x4=fdx2*mrdx4;
run;
proc reg data=radon2;
  model cancer = rdx1 fdx2 lrdx3 mrdx4 x1x3 x1x4 x2x3 x2x4;
  t4: test x1x3=0, x1x4=0, x2x3=0, x2x4=0;
run;
quit;
The REG Procedure
Model: MODEL1
Dependent Variable: cancer
Analysis of Variance
Sum of Mean
Source DF Squares Square F Value Pr > F
Model 8 26.03520 3.25440 2.61 0.0460
Error 17 21.21095 1.24770
Corrected Total 25 47.24615
Root MSE 1.11701 R-Square 0.5511
Dependent Mean 5.57692 Adj R-Sq 0.3398
Coeff Var 20.02908
Parameter Estimates
Parameter Standard
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 4.70000 0.78984 5.95 <.0001
rdx1 1 1.65000 1.11701 1.48 0.1579
fdx2 1 0.83333 1.01968 0.82 0.4251
lrdx3 1 -0.10000 0.96736 -0.10 0.9189
mrdx4 1 0.57143 0.89560 0.64 0.5319
x1x3 1 -0.20000 1.47766 -0.14 0.8939
x1x4 1 1.67857 1.43171 1.17 0.2572
x2x3 1 -0.03333 1.32950 -0.03 0.9803
x2x4 1 -0.60476 1.57025 -0.39 0.7049
The REG Procedure
Model: MODEL1
Test T4 Results for Dependent Variable cancer
Mean
Source DF Square F Value Pr > F
Numerator 4 0.97790 0.78 0.5513
Denominator 17 1.24770
page 99 Table 3.9 Effect coding of bedrock area from Table 3.5.
data radon3;
  set radon2;
  if locale='RProng' then rev1=1;
  if locale='Fringe' then rev1=0;
  if locale='Control' then rev1=-1;
  if locale='RProng' then fev2=0;
  if locale='Fringe' then fev2=1;
  if locale='Control' then fev2=-1;
  if mhr='low' then v3=1;
  if mhr='mid' then v3=0;
  if mhr='hig' then v3=-1;
  if mhr='low' then v4=0;
  if mhr='mid' then v4=1;
  if mhr='hig' then v4=-1;
  v1v3=rev1*v3;
  v1v4=rev1*v4;
  v2v3=fev2*v3;
  v2v4=fev2*v4;
run;
proc print data=radon3 noobs;
  var county locale rdx1 rev1 fdx2 fev2;
run;
county         locale    rdx1   rev1   fdx2   fev2
Orange         RProng       1      1      0      0
Putnam         RProng       1      1      0      0
Sussex         RProng       1      1      0      0
Warren         RProng       1      1      0      0
Morris         RProng       1      1      0      0
Hunterdon      RProng       1      1      0      0
Berks          Fringe       0      0      1      1
Lehigh         Fringe       0      0      1      1
Northampton    Fringe       0      0      1      1
Pike           Fringe       0      0      1      1
Dutchess       Fringe       0      0      1      1
Sullivan       Fringe       0      0      1      1
Ulster         Fringe       0      0      1      1
Columbia       Control      0     -1      0     -1
Delaware       Control      0     -1      0     -1
Greene         Control      0     -1      0     -1
Otsego         Control      0     -1      0     -1
Tioga          Control      0     -1      0     -1
Carbon         Control      0     -1      0     -1
Lebanon        Control      0     -1      0     -1
Lackawanna     Control      0     -1      0     -1
Luzerne        Control      0     -1      0     -1
Schuylkill     Control      0     -1      0     -1
Susquehanna    Control      0     -1      0     -1
Wayne          Control      0     -1      0     -1
Wyoming        Control      0     -1      0     -1
The proc print below shows the rest of the effect coding and the interaction terms.
proc print data=radon3; var county locale rev1 fev2 mhr v3 v4 v1v3 v1v4 v2v3 v2v4; run;
Obs   county         locale    rev1   fev2   mhr    v3    v4   v1v3   v1v4   v2v3   v2v4
  1   Orange         RProng       1      0   low     1     0      1      0      0      0
  2   Putnam         RProng       1      0   mid     0     1      0      1      0      0
  3   Sussex         RProng       1      0   mid     0     1      0      1      0      0
  4   Warren         RProng       1      0   hig    -1    -1     -1     -1      0      0
  5   Morris         RProng       1      0   low     1     0      1      0      0      0
  6   Hunterdon      RProng       1      0   hig    -1    -1     -1     -1      0      0
  7   Berks          Fringe       0      1   hig    -1    -1      0      0     -1     -1
  8   Lehigh         Fringe       0      1   hig    -1    -1      0      0     -1     -1
  9   Northampton    Fringe       0      1   hig    -1    -1      0      0     -1     -1
 10   Pike           Fringe       0      1   low     1     0      0      0      1      0
 11   Dutchess       Fringe       0      1   mid     0     1      0      0      0      1
 12   Sullivan       Fringe       0      1   low     1     0      0      0      1      0
 13   Ulster         Fringe       0      1   low     1     0      0      0      1      0
 14   Columbia       Control     -1     -1   mid     0     1      0     -1      0     -1
 15   Delaware       Control     -1     -1   mid     0     1      0     -1      0     -1
 16   Greene         Control     -1     -1   mid     0     1      0     -1      0     -1
 17   Otsego         Control     -1     -1   mid     0     1      0     -1      0     -1
 18   Tioga          Control     -1     -1   mid     0     1      0     -1      0     -1
 19   Carbon         Control     -1     -1   mid     0     1      0     -1      0     -1
 20   Lebanon        Control     -1     -1   hig    -1    -1      1      1      1      1
 21   Lackawanna     Control     -1     -1   low     1     0     -1      0     -1      0
 22   Luzerne        Control     -1     -1   low     1     0     -1      0     -1      0
 23   Schuylkill     Control     -1     -1   hig    -1    -1      1      1      1      1
 24   Susquehanna    Control     -1     -1   low     1     0     -1      0     -1      0
 25   Wayne          Control     -1     -1   low     1     0     -1      0     -1      0
 26   Wyoming        Control     -1     -1   mid     0     1      0     -1      0     -1
page 100 Table 3.10 Relation among cancer rate, bedrock area, and radon.
proc reg data=radon3; model cancer = rev1 fev2 v3 v4 v1v3 v1v4 v2v3 v2v4; t5: test v1v3=0, v1v4=0, v2v3=0, v2v4=0; run; quit;
The REG Procedure
Model: MODEL1
Dependent Variable: cancer
Analysis of Variance
Sum of Mean
Source DF Squares Square F Value Pr > F
Model 8 26.03520 3.25440 2.61 0.0460
Error 17 21.21095 1.24770
Corrected Total 25 47.24615
Root MSE 1.11701 R-Square 0.5511
Dependent Mean 5.57692 Adj R-Sq 0.3398
Coeff Var 20.02908
Parameter Estimates
Parameter Standard
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 5.77831 0.25006 23.11 <.0001
rev1 1 1.22169 0.36311 3.36 0.0037
fev2 1 -0.30053 0.37356 -0.80 0.4322
v3 1 -0.42831 0.33555 -1.28 0.2190
v4 1 0.67884 0.37209 1.82 0.0857
v1v3 1 -0.52169 0.50123 -1.04 0.3125
v1v4 1 0.92116 0.52639 1.75 0.0981
v2v3 1 0.35053 0.48562 0.72 0.4802
v2v4 1 -0.65661 0.59507 -1.10 0.2852
The REG Procedure
Model: MODEL1
Test T5 Results for Dependent Variable cancer
Mean
Source DF Square F Value Pr > F
Numerator 4 0.97790 0.78 0.5513
Denominator 17 1.24770
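Dummy coding (Table 3.8) and effect coding (Table 3.10) are two parameterizations of the same model, which is why R-Square, the overall F, and the interaction test (F = 0.78, p = 0.5513) are identical. As a sketch, the predicted cancer rate for the RProng, low-radon cell can be recovered from either set of printed estimates:

```latex
\[
\begin{aligned}
\text{Dummy coding:}\quad  & 4.70000 + 1.65000 - 0.10000 - 0.20000 = 6.05 \\
\text{Effect coding:}\quad & 5.77831 + 1.22169 - 0.42831 - 0.52169 = 6.05
\end{aligned}
\]
```

Both equal the observed mean of the two counties in that cell (Orange 6.0 and Morris 6.1). Only the meaning of the individual coefficients changes: under dummy coding they are contrasts with the omitted cell, while under effect coding they are deviations from the intercept.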
These pages are Copyrighted (c) by UCLA Academic Technology Services