Inputting the data shown on page 241.
data ch6fig05;
input x1 x2 y;
label x1='targtpop'
x2='dispoinc';
cards;
68.5 16.7 174.4
45.2 16.8 164.4
91.3 18.2 244.2
47.8 16.3 154.6
46.9 17.3 181.6
66.1 18.2 207.5
49.5 15.9 152.8
52.0 17.2 163.2
48.9 16.6 145.4
38.4 16.0 137.2
87.9 18.3 241.9
72.8 17.1 191.1
88.4 17.4 232.0
42.9 15.8 145.3
52.5 17.8 161.1
85.7 18.4 209.7
41.3 16.5 146.4
51.7 16.3 144.0
89.6 18.1 232.6
82.7 19.1 224.1
52.3 16.0 166.5
;
run;
Creating the x1x2 variable to be used in Fig. 6.7
data ch6fig05a; set ch6fig05; x1x2 = x1*x2; run;
Fig. 6.4a, p. 237.
Scatterplot matrix.
Note: Invoking a macro for the scatter matrix.
%include "c:\neter\scatter.sas"; %scatter(data = ch6fig05a, var = y x1 x2);
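If the scatter.sas macro is not available locally, a similar scatterplot matrix can probably be produced with proc sgscatter. This is only a sketch of an alternative, not the macro used for the figure above, and it assumes a SAS release that includes the ODS statistical graphics procedures.

```sas
/* Alternative sketch: scatterplot matrix of y, x1 and x2            */
/* without the %scatter macro (assumes proc sgscatter is installed). */
proc sgscatter data = ch6fig05a;
  matrix y x1 x2;
run;
```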
Fig. 6.4b, p. 237.
Correlation matrix.
proc corr data = ch6fig05a; run;
The CORR Procedure
4 Variables: x1 x2 y x1x2
Simple Statistics
Variable N Mean Std Dev Sum Minimum Maximum Label
x1 21 62.01905 18.62033 1302 38.40000 91.30000 targtpop
x2 21 17.14286 0.97035 360.00000 15.80000 19.10000 dispoinc
y 21 181.90476 36.19130 3820 137.20000 244.20000
x1x2 21 1077 373.86333 22609 614.40000 1662
Pearson Correlation Coefficients, N = 21
Prob > |r| under H0: Rho=0
x1 x2 y x1x2
x1 1.00000 0.78130 0.94455 0.99442
targtpop <.0001 <.0001 <.0001
x2 0.78130 1.00000 0.83580 0.83951
dispoinc <.0001 <.0001 <.0001
y 0.94455 0.83580 1.00000 0.95558
<.0001 <.0001 <.0001
x1x2 0.99442 0.83951 0.95558 1.00000
<.0001 <.0001 <.0001
Fig. 6.5a and b, p. 241.
Note that the output statement is used to create the data set outfig05, which holds the fitted and residual values.
proc reg data = ch6fig05a; var x1x2; model y = x1 x2/ i; output out=outfig05 p = fitted r = residual; run; quit;
The REG Procedure
Model: MODEL1
Dependent Variable: y
X'X Inverse, Parameter Estimates, and SSE
Variable Label Intercept x1 x2 y
Intercept Intercept 29.728923483 0.0721834719 -1.992553186 -68.85707315
x1 targtpop 0.0721834719 0.0003701761 -0.005549917 1.4545595828
x2 dispoinc -1.992553186 -0.005549917 0.1363106368 9.3655003765
y -68.85707315 1.4545595828 9.3655003765 2180.9274114
Analysis of Variance
Sum of Mean
Source DF Squares Square F Value Pr > F
Model 2 24015 12008 99.10 <.0001
Error 18 2180.92741 121.16263
Corrected Total 20 26196
Root MSE 11.00739 R-Square 0.9167
Dependent Mean 181.90476 Adj R-Sq 0.9075
Coeff Var 6.05118
Parameter Estimates
Parameter Standard
Variable Label DF Estimate Error t Value Pr > |t|
Intercept Intercept 1 -68.85707 60.01695 -1.15 0.2663
x1 targtpop 1 1.45456 0.21178 6.87 <.0001
x2 dispoinc 1 9.36550 4.06396 2.30 0.0333
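The standard errors in the Parameter Estimates table can be recovered from the X'X inverse printed above: the estimated variance of each coefficient is MSE times the corresponding diagonal element of the inverse. A worked check using the printed values (the inputs are rounded, so the last digit may differ slightly):

```latex
\[
s^2\{\mathbf{b}\} = MSE\,(\mathbf{X}'\mathbf{X})^{-1}, \qquad MSE = 121.16263
\]
\[
\begin{aligned}
s\{b_0\} &= \sqrt{121.16263 \times 29.728923483} \approx 60.017 \\
s\{b_1\} &= \sqrt{121.16263 \times 0.0003701761} \approx 0.2118 \\
s\{b_2\} &= \sqrt{121.16263 \times 0.1363106368} \approx 4.064
\end{aligned}
\]
```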
Show all variables, including fitted and residual, as shown in Fig. 6.5b, p. 241.
proc print data = outfig05; var y x1 x2 fitted residual; run;
Obs y x1 x2 fitted residual
1 174.4 68.5 16.7 187.184 -12.7841
2 164.4 45.2 16.8 154.229 10.1706
3 244.2 91.3 18.2 234.396 9.8037
4 154.6 47.8 16.3 153.329 1.2715
5 181.6 46.9 17.3 161.385 20.2151
6 207.5 66.1 18.2 197.741 9.7586
7 152.8 49.5 15.9 152.055 0.7449
8 163.2 52.0 17.2 167.867 -4.6666
9 145.4 48.9 16.6 157.738 -12.3382
10 137.2 38.4 16.0 136.846 0.3540
11 241.9 87.9 18.3 230.387 11.5126
12 191.1 72.8 17.1 197.185 -6.0849
13 232.0 88.4 17.4 222.686 9.3143
14 145.3 42.9 15.8 141.518 3.7816
15 161.1 52.5 17.8 174.213 -13.1132
16 209.7 85.7 18.4 228.124 -18.4239
17 146.4 41.3 16.5 145.747 0.6530
18 144.0 51.7 16.3 159.001 -15.0013
19 232.6 89.6 18.1 230.987 1.6130
20 224.1 82.7 19.1 230.316 -6.2161
21 166.5 52.3 16.0 157.064 9.4356
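As a check on how the fitted and residual columns arise, the first observation can be worked by hand from the parameter estimates reported by proc reg (rounded values, so agreement is to about three decimals):

```latex
\[
\begin{aligned}
\hat{Y}_1 &= b_0 + b_1 X_{11} + b_2 X_{12} \\
          &= -68.85707 + 1.45456\,(68.5) + 9.36550\,(16.7) \\
          &\approx -68.857 + 99.637 + 156.404 = 187.184 \\
e_1 &= Y_1 - \hat{Y}_1 = 174.4 - 187.184 = -12.784
\end{aligned}
\]
```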
Note: To recreate the 3-D plots in Fig. 6.6, use interactive data analysis in SAS; see our web page: http://www.ats.ucla.edu/stat/sas/teach/reg_int/reg_int_cont.htm
Fig. 6.7, p. 246, showing 4 different diagnostic plots.
proc gplot data = outfig05; plot residual*fitted; run;
proc gplot data = outfig05; plot residual*x1; run;
proc gplot data = outfig05; plot residual*x2; run;
proc gplot data = outfig05; plot residual*x1x2; run;
The four plots of Fig. 6.7, p. 246, could also have been obtained from a single proc gplot step, as shown below.
proc gplot data = outfig05; plot residual*fitted; plot residual*x1; plot residual*x2; plot residual*x1x2; run;
Fig 6.8a, page 247.
data outfig08; set outfig05; absresid = abs(residual); run; proc gplot data=outfig08; plot absresid*fitted; run;
Fig. 6.8b, p. 247, normal probability plot.
Note: The labels on the X-axis differ from those in the book.
proc univariate data = outfig05 noprint ; qqplot residual / normal; run;
Estimation of Mean Response and Prediction Limits for New Observations, p. 249-251. Adding two extra lines of data, with y set to missing, in order to predict at those x1 and x2 values.
data ch6fig05h;
input x1 x2 y;
cards;
68.5 16.7 174.4
45.2 16.8 164.4
91.3 18.2 244.2
47.8 16.3 154.6
46.9 17.3 181.6
66.1 18.2 207.5
49.5 15.9 152.8
52.0 17.2 163.2
48.9 16.6 145.4
38.4 16.0 137.2
87.9 18.3 241.9
72.8 17.1 191.1
88.4 17.4 232.0
42.9 15.8 145.3
52.5 17.8 161.1
85.7 18.4 209.7
41.3 16.5 146.4
51.7 16.3 144.0
89.6 18.1 232.6
82.7 19.1 224.1
52.3 16.0 166.5
65.4 17.6 .
53.1 17.7 .
;
run;
Getting the predicted values, the confidence limits for E[Yh], and the prediction limits for Yh(new), p. 249-251. LowerCLMean and UpperCLMean are the limits for E[Yh]; LowerCL and UpperCL are the limits for Yh(new).
proc reg data = ch6fig05h ; model y = x1 x2 / r cli clm; ods output OutputStatistics=temp; run; quit;
The REG Procedure
Model: MODEL1
Dependent Variable: y
Analysis of Variance
Sum of Mean
Source DF Squares Square F Value Pr > F
Model 2 24015 12008 99.10 <.0001
Error 18 2180.92741 121.16263
Corrected Total 20 26196
Root MSE 11.00739 R-Square 0.9167
Dependent Mean 181.90476 Adj R-Sq 0.9075
Coeff Var 6.05118
Parameter Estimates
Parameter Standard
Variable DF Estimate Error t Value Pr > |t|
Intercept 1 -68.85707 60.01695 -1.15 0.2663
x1 1 1.45456 0.21178 6.87 <.0001
x2 1 9.36550 4.06396 2.30 0.0333
The REG Procedure
Model: MODEL1
Dependent Variable: y
Output Statistics
Dep Var Predicted Std Error
Obs y Value Mean Predict 95% CL Mean 95% CL Predict Residual
1 174.4000 187.1841 3.8409 179.1146 195.2536 162.6910 211.6772 -12.7841
2 164.4000 154.2294 3.5558 146.7591 161.6998 129.9271 178.5317 10.1706
3 244.2000 234.3963 4.5882 224.7569 244.0358 209.3421 259.4506 9.8037
4 154.6000 153.3285 3.2331 146.5361 160.1210 129.2260 177.4311 1.2715
5 181.6000 161.3849 4.4300 152.0778 170.6921 136.4566 186.3132 20.2151
6 207.5000 197.7414 4.3786 188.5424 206.9404 172.8533 222.6295 9.7586
7 152.8000 152.0551 4.1696 143.2952 160.8150 127.3259 176.7843 0.7449
8 163.2000 167.8666 3.3310 160.8684 174.8649 143.7053 192.0280 -4.6666
9 145.4000 157.7382 2.9628 151.5136 163.9628 133.7895 181.6869 -12.3382
10 137.2000 136.8460 4.0074 128.4268 145.2653 112.2354 161.4566 0.3540
11 241.9000 230.3874 4.2012 221.5610 239.2137 205.6346 255.1402 11.5126
12 191.1000 197.1849 3.4109 190.0188 204.3510 172.9744 221.3954 -6.0849
13 232.0000 222.6857 5.3808 211.3810 233.9904 196.9448 248.4266 9.3143
14 145.3000 141.5184 4.1735 132.7502 150.2866 116.7863 166.2506 3.7816
15 161.1000 174.2132 5.0377 163.6294 184.7971 148.7807 199.6458 -13.1132
16 209.7000 228.1239 4.1214 219.4652 236.7826 203.4304 252.8174 -18.4239
17 146.4000 145.7470 3.7331 137.9041 153.5899 121.3276 170.1664 0.6530
18 144.0000 159.0013 3.2529 152.1672 165.8354 134.8870 183.1157 -15.0013
19 232.6000 230.9870 4.4176 221.7059 240.2681 206.0684 255.9056 1.6130
20 224.1000 230.3161 5.8120 218.1054 242.5267 204.1647 256.4675 -6.2161
21 166.5000 157.0644 4.0792 148.4944 165.6344 132.4018 181.7270 9.4356
22 . 191.1039 2.7668 185.2911 196.9168 167.2589 214.9490 .
23 . 174.1494 4.5986 164.4881 183.8107 149.0867 199.2121 .
Output Statistics
Std Error Student Cook's
Obs Residual Residual -2-1 0 1 2 D
1 10.316 -1.239 | **| | 0.071
2 10.417 0.976 | |* | 0.037
3 10.006 0.980 | |* | 0.067
4 10.522 0.121 | | | 0.000
5 10.077 2.006 | |**** | 0.259
6 10.099 0.966 | |* | 0.059
7 10.187 0.0731 | | | 0.000
8 10.491 -0.445 | | | 0.007
9 10.601 -1.164 | **| | 0.035
10 10.252 0.0345 | | | 0.000
11 10.174 1.132 | |** | 0.073
12 10.466 -0.581 | *| | 0.012
13 9.603 0.970 | |* | 0.098
14 10.186 0.371 | | | 0.008
15 9.787 -1.340 | **| | 0.159
16 10.207 -1.805 | ***| | 0.177
17 10.355 0.0631 | | | 0.000
18 10.516 -1.427 | **| | 0.065
19 10.082 0.160 | | | 0.002
20 9.348 -0.665 | *| | 0.057
21 10.224 0.923 | |* | 0.045
22 . . .
23 . . .
Sum of Residuals 0
Sum of Squared Residuals 2180.92741
Predicted Residual SS (PRESS) 3002.92331
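The Std Error Residual, Student Residual, and Cook's D columns can be reproduced from MSE and the Std Error Mean Predict column. A worked check for observation 1, assuming the usual definitions with p = 3 estimated parameters (the inputs are the rounded values printed above):

```latex
\[
\begin{aligned}
s\{e_1\} &= \sqrt{MSE - s^2\{\hat{Y}_1\}} = \sqrt{121.16263 - 3.8409^2} \approx 10.316 \\
r_1 &= \frac{e_1}{s\{e_1\}} = \frac{-12.7841}{10.316} \approx -1.239 \\
D_1 &= \frac{r_1^2}{p}\cdot\frac{s^2\{\hat{Y}_1\}}{s^2\{e_1\}}
     = \frac{(-1.239)^2}{3}\cdot\frac{14.753}{106.410} \approx 0.071
\end{aligned}
\]
```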
We use where Observation >= 22 to show just the last two observations.
proc print data = temp; where Observation >= 22; run;
StdErr
Predicted Mean Lower Upper
Obs Model Dependent Observation DepVar Value Predict CLMean CLMean
22 MODEL1 y 22 . 191.1039 2.7668 185.2911 196.9168
23 MODEL1 y 23 . 174.1494 4.5986 164.4881 183.8107
StdErr Student
Obs LowerCL UpperCL Residual Residual Residual Picture CooksD
22 167.2589 214.9490 . . . .
23 149.0867 199.2121 . . . .
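The limits printed for observation 22 can be checked by hand. The sketch below assumes the 97.5th percentile of the t distribution with 18 error degrees of freedom, roughly 2.101, and uses the rounded MSE and StdErr Mean Predict values from the output:

```latex
\[
\begin{aligned}
\text{CL Mean:}\quad & \hat{Y}_h \pm t(0.975;18)\, s\{\hat{Y}_h\}
  = 191.1039 \pm 2.101\,(2.7668) \approx (185.29,\ 196.92) \\
s\{\text{pred}\} &= \sqrt{MSE + s^2\{\hat{Y}_h\}}
  = \sqrt{121.16263 + 2.7668^2} \approx 11.350 \\
\text{CL Predict:}\quad & \hat{Y}_h \pm t(0.975;18)\, s\{\text{pred}\}
  = 191.1039 \pm 2.101\,(11.350) \approx (167.26,\ 214.95)
\end{aligned}
\]
```

The prediction limits are wider because they include the variance of a new observation (MSE) in addition to the variance of the estimated mean response.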
These pages are Copyrighted (c) by UCLA Academic Technology Services