UCLA Academic Technology Services

SAS Textbook Examples
Regression with Graphics by Lawrence Hamilton
Chapter 3: The Basics of Multiple Regression

A Three-variable Example

page 68 Table 3.1 Regression of postshortage (1981) water use on income and preshortage (1980) water use. The concord1 data set is used.
proc reg data = concord1;
model income water80 = water81;
run;
proc reg data = concord1;
model water81 = income water80;
run;
The REG Procedure
Model: MODEL1
Dependent Variable: income

                             Analysis of Variance

                                    Sum of           Mean
Source                   DF        Squares         Square    F Value    Pr > F

Model                     1          14732          14732     104.46    <.0001
Error                   494          69669      141.03078
Corrected Total         495          84401


Root MSE             11.87564    R-Square     0.1745
Dependent Mean       23.07661    Adj R-Sq     0.1729
Coeff Var            51.46179

                        Parameter Estimates

                     Parameter       Standard
Variable     DF       Estimate          Error    t Value    Pr > |t|

Intercept     1       14.63948        0.98275      14.90      <.0001
water81       1        0.00367     0.00035917      10.22      <.0001

The REG Procedure
Model: MODEL1
Dependent Variable: water80

                             Analysis of Variance

                                    Sum of           Mean
Source                   DF        Squares         Square    F Value    Pr > F

Model                     1      900727515      900727515     696.11    <.0001
Error                   494      639212788        1293953
Corrected Total         495     1539940302


Root MSE           1137.52055    R-Square     0.5849
Dependent Mean     2732.05645    Adj R-Sq     0.5841
Coeff Var            41.63606

                        Parameter Estimates

                     Parameter       Standard
Variable     DF       Estimate          Error    t Value    Pr > |t|

Intercept     1      645.82548       94.13406       6.86      <.0001
water81       1        0.90769        0.03440      26.38      <.0001

The REG Procedure
Model: MODEL1
Dependent Variable: water81

                             Analysis of Variance

                                    Sum of           Mean
Source                   DF        Squares         Square    F Value    Pr > F

Model                     2      671025350      335512675     391.76    <.0001
Error                   493      422213359         856417
Corrected Total         495     1093238710


Root MSE            925.42777    R-Square     0.6138
Dependent Mean     2298.38710    Adj R-Sq     0.6122
Coeff Var            40.26423

                        Parameter Estimates

                     Parameter       Standard
Variable     DF       Estimate          Error    t Value    Pr > |t|

Intercept     1      203.82169       94.36129       2.16      0.0313
income        1       20.54504        3.38341       6.07      <.0001
water80       1        0.59313        0.02505      23.68      <.0001
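
Reading the parameter estimates from the last table above, the fitted equation for Table 3.1 can be written out as follows. This is only a restatement of the printed output, with coefficients rounded to three decimals and with \hat{y} denoting predicted water81.

```latex
\hat{y} = 203.822 + 20.545\,\mathit{income} + 0.593\,\mathit{water80}
```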

Partial Effects

page 70 Figure 3.1 Partial regression leverage plot: postshortage water use (Y) versus income (X1), adjusting for preshortage water use.
proc sort data=concord1 out=concsort;
  by case;
run;
proc reg data=concsort;
  model water81 = income water80;
run;
proc reg data=concsort;
  model water81 = water80;
  output out=out1(keep=case yres) residual=yres;
run;
proc reg data=concsort;
  model income = water80;
  output out=out2(keep=case x1res) residual=x1res;
run;
data all;
  merge concsort out1 out2;
  by case;
  label yres = 'ey/x2';
  label x1res = 'ex1/x2';
run;
The REG Procedure
Model: MODEL1
Dependent Variable: water81

                             Analysis of Variance

                                    Sum of           Mean
Source                   DF        Squares         Square    F Value    Pr > F

Model                     2      671025350      335512675     391.76    <.0001
Error                   493      422213359         856417
Corrected Total         495     1093238710


Root MSE            925.42777    R-Square     0.6138
Dependent Mean     2298.38710    Adj R-Sq     0.6122
Coeff Var            40.26423

                        Parameter Estimates

                     Parameter       Standard
Variable     DF       Estimate          Error    t Value    Pr > |t|

Intercept     1      203.82169       94.36129       2.16      0.0313
income        1       20.54504        3.38341       6.07      <.0001
water80       1        0.59313        0.02505      23.68      <.0001

The REG Procedure
Model: MODEL1
Dependent Variable: water81

                             Analysis of Variance

                                    Sum of           Mean
Source                   DF        Squares         Square    F Value    Pr > F

Model                     1      639446987      639446987     696.11    <.0001
Error                   494      453791723         918607
Corrected Total         495     1093238710


Root MSE            958.43974    R-Square     0.5849
Dependent Mean     2298.38710    Adj R-Sq     0.5841
Coeff Var            41.70054

                        Parameter Estimates

                     Parameter       Standard
Variable     DF       Estimate          Error    t Value    Pr > |t|

Intercept     1      537.87101       79.40114       6.77      <.0001
water80       1        0.64439        0.02442      26.38      <.0001

The REG Procedure
Model: MODEL1
Dependent Variable: income

                             Analysis of Variance

                                    Sum of           Mean
Source                   DF        Squares         Square    F Value    Pr > F

Model                     1     9588.31670     9588.31670      63.31    <.0001
Error                   494          74813      151.44286
Corrected Total         495          84401


Root MSE             12.30621    R-Square     0.1136
Dependent Mean       23.07661    Adj R-Sq     0.1118
Coeff Var            53.32764

                        Parameter Estimates

                     Parameter       Standard
Variable     DF       Estimate          Error    t Value    Pr > |t|

Intercept     1       16.25937        1.01950      15.95      <.0001
water80       1        0.00250     0.00031360       7.96      <.0001
symbol1 color=black interpol=r value=circle height=0.5;
axis1 order=(-30 to 70 by 10);
proc gplot data=all;
  plot yres*x1res / haxis=axis1;
run; 
quit;
Figure 3.1
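As a check on the construction above, regressing the Y residuals on the X1 residuals should reproduce the income coefficient from the full model (20.54504), since that is the defining property of a partial regression plot. A minimal sketch, assuming the data set all created above has not yet been overwritten:

```sas
/* Slope on x1res should equal the income coefficient in the full model */
proc reg data=all;
  model yres = x1res;
run;
quit;
```

The standard error will differ slightly from the one in the full model, because this two-variable regression has 494 rather than 493 error degrees of freedom.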
page 71 Figure 3.2 Partial regression leverage plot: postshortage water use water81 (Y) versus preshortage water use water80 (X2), adjusting for income.
proc sort data=concord1 out=concsort1;
  by case;
run;
proc reg data=concsort1;
  model water81 = income water80;
run;
proc reg data=concsort1;
  model water81 = income;
  output out=out3(keep=case yres) residual=yres;
run;
proc reg data=concsort1;
  model water80 = income;
  output out=out4(keep=case x1res) residual=x1res;
run;
quit;
data all;
  merge concsort1 out3 out4;
  by case;
  label yres = 'ey/x1';
  label x1res = 'ex2/x1';
run;
The REG Procedure
Model: MODEL1
Dependent Variable: water81

                             Analysis of Variance

                                    Sum of           Mean
Source                   DF        Squares         Square    F Value    Pr > F

Model                     2      671025350      335512675     391.76    <.0001
Error                   493      422213359         856417
Corrected Total         495     1093238710


Root MSE            925.42777    R-Square     0.6138
Dependent Mean     2298.38710    Adj R-Sq     0.6122
Coeff Var            40.26423

                        Parameter Estimates

                     Parameter       Standard
Variable     DF       Estimate          Error    t Value    Pr > |t|

Intercept     1      203.82169       94.36129       2.16      0.0313
income        1       20.54504        3.38341       6.07      <.0001
water80       1        0.59313        0.02505      23.68      <.0001

The REG Procedure
Model: MODEL1
Dependent Variable: water81

                             Analysis of Variance

                                    Sum of           Mean
Source                   DF        Squares         Square    F Value    Pr > F

Model                     1      190820566      190820566     104.46    <.0001
Error                   494      902418143        1826757
Corrected Total         495     1093238710

Root MSE           1351.57589    R-Square     0.1745
Dependent Mean     2298.38710    Adj R-Sq     0.1729
Coeff Var            58.80541

                        Parameter Estimates

                     Parameter       Standard
Variable     DF       Estimate          Error    t Value    Pr > |t|

Intercept     1     1201.12436      123.32451       9.74      <.0001
income        1       47.54869        4.65229      10.22      <.0001

The REG Procedure
Model: MODEL1
Dependent Variable: water80

                             Analysis of Variance

                                    Sum of           Mean
Source                   DF        Squares         Square    F Value    Pr > F

Model                     1      174943659      174943659      63.31    <.0001
Error                   494     1364996643        2763151
Corrected Total         495     1539940302


Root MSE           1662.27287    R-Square     0.1136
Dependent Mean     2732.05645    Adj R-Sq     0.1118
Coeff Var            60.84328

                        Parameter Estimates

                     Parameter       Standard
Variable     DF       Estimate          Error    t Value    Pr > |t|

Intercept     1     1681.43287      151.67405      11.09      <.0001
income        1       45.52763        5.72174       7.96      <.0001
symbol1 color=black interpol=r value=circle height=0.5;
axis1 order=(-4000 to 10000 by 2000);
proc gplot data=all;
  plot yres*x1res / haxis=axis1;
run; 
quit;
Figure 3.2

A Seven-variable Example

page 74 Table 3.2 Regression of postshortage water use on income, preshortage water use, education, retirement, number of people resident, and increase in people resident.
data concx;
  set concord1;
  retired = .;
  if retire = 'yes' then retired = 1;
  if retire = 'no' then retired = 0;
run;
proc freq data=concx;
  tables retire*retired / missing;
run;
proc reg data=concx;
  model water81 = income water80 educat retired peop81 cpeop ;
run;
quit;
The FREQ Procedure

Table of retire by retired

retire     retired

Frequency|
Percent  |
Row Pct  |
Col Pct  |       0|       1|  Total
---------+--------+--------+
no       |    350 |      0 |    350
         |  70.56 |   0.00 |  70.56
         | 100.00 |   0.00 |
         | 100.00 |   0.00 |
---------+--------+--------+
yes      |      0 |    146 |    146
         |   0.00 |  29.44 |  29.44
         |   0.00 | 100.00 |
         |   0.00 | 100.00 |
---------+--------+--------+
Total         350      146      496
            70.56    29.44   100.00
            
The REG Procedure
Model: MODEL1
Dependent Variable: water81

                             Analysis of Variance

                                    Sum of           Mean
Source                   DF        Squares         Square    F Value    Pr > F

Model                     6      740477522      123412920     171.08    <.0001
Error                   489      352761188         721393
Corrected Total         495     1093238710


Root MSE            849.34859    R-Square     0.6773
Dependent Mean     2298.38710    Adj R-Sq     0.6734
Coeff Var            36.95411

                        Parameter Estimates

                     Parameter       Standard
Variable     DF       Estimate          Error    t Value    Pr > |t|

Intercept     1      242.22043      206.86382       1.17      0.2422
income        1       20.96699        3.46372       6.05      <.0001
water80       1        0.49194        0.02635      18.67      <.0001
educat        1      -41.86552       13.22031      -3.17      0.0016
retired       1      189.18433       95.02142       1.99      0.0470
peop81        1      248.19702       28.72480       8.64      <.0001
cpeop         1       96.45360       80.51903       1.20      0.2315

F-tests for Sets of Coefficients

page 80 Table 3.3 Regression of postshortage water use omitting income and education.
proc reg data=concx;
  model water81 = water80 peop81 retired cpeop ;
run;
quit;
The REG Procedure
Model: MODEL1
Dependent Variable: water81

                             Analysis of Variance

                                    Sum of           Mean
Source                   DF        Squares         Square    F Value    Pr > F

Model                     4      712718346      178179587     229.91    <.0001
Error                   491      380520363         774991
Corrected Total         495     1093238710

Root MSE            880.33548    R-Square     0.6519
Dependent Mean     2298.38710    Adj R-Sq     0.6491
Coeff Var            38.30232

                        Parameter Estimates

                     Parameter       Standard
Variable     DF       Estimate          Error    t Value    Pr > |t|

Intercept     1       48.64897      107.05488       0.45      0.6497
water80       1        0.51974        0.02677      19.41      <.0001
peop81        1      265.28936       29.63234       8.95      <.0001
retired       1       67.27992       94.28846       0.71      0.4758
cpeop         1      134.46255       83.19590       1.62      0.1067
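
The comparison between Table 3.2 and Table 3.3 can also be requested directly with a test statement in the full model, the same statement this page uses later for the radon data. A sketch, assuming the concx data set created above; the label incedu is an arbitrary name chosen here for illustration:

```sas
proc reg data=concx;
  model water81 = income water80 educat retired peop81 cpeop;
  /* Joint F-test that the income and educat coefficients are both zero */
  incedu: test income=0, educat=0;
run;
quit;
```

The same statistic can be worked out by hand from the two printed ANOVA tables: F = ((380520363 - 352761188) / 2) / 721393, which is about 19.24 on 2 and 489 degrees of freedom.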

Intercept Dummy Variables

For the next examples, we will use the wells data set. First, we need to transform chloride concentration to its natural logarithm. In SAS, the log(x) function returns the natural log.
data wells2;
  set wells;
  lnchlor = log(chlor);
run;
page 86 Equation [3.32]
proc reg data = wells2;
  model lnchlor = deep;
run;
quit;
The REG Procedure
Model: MODEL1
Dependent Variable: lnchlor

                             Analysis of Variance

                                    Sum of           Mean
Source                   DF        Squares         Square    F Value    Pr > F

Model                     1        4.02334        4.02334       2.19    0.1455
Error                    50       91.99885        1.83998
Corrected Total          51       96.02220


Root MSE              1.35646    R-Square     0.0419
Dependent Mean        3.20505    Adj R-Sq     0.0227
Coeff Var            42.32257

                        Parameter Estimates

                     Parameter       Standard
Variable     DF       Estimate          Error    t Value    Pr > |t|

Intercept     1        3.77510        0.42895       8.80      <.0001
deep          1       -0.70578        0.47729      -1.48      0.1455
The t-test below is equivalent to the regression above: the pooled (equal variances) t statistic and p-value match those for the deep coefficient, apart from the sign of t.
proc ttest data = wells2;
  class deep;
  var lnchlor;
run;
The TTEST Procedure

                                           Statistics

                              Lower CL        Upper CL Lower CL        Upper CL
Variable  Class            N      Mean    Mean   Mean  Std Dev Std Dev Std Dev  Std Err
lnchlor   0               10    2.5345  3.7751  5.0157  1.1929 1.7343  3.1661   0.5484
lnchlor   1               42    2.6772  3.0693  3.4615  1.0354 1.2584  1.6047   0.1942
lnchlor   Diff (1-2)            -0.253  0.7058  1.6644  1.135  1.3565  1.6862   0.4773

                               T-Tests

Variable    Method           Variances      DF    t Value    Pr > |t|

lnchlor     Pooled           Equal          50       1.48      0.1455
lnchlor     Satterthwaite    Unequal      11.4       1.21      0.2497

                    Equality of Variances

Variable    Method      Num DF    Den DF    F Value    Pr > F

lnchlor     Folded F         9        41       1.90    0.1587
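The link between the two sets of output can be made explicit. With a single 0/1 predictor, the intercept is the mean of the group coded 0 and the slope is the difference between the group means, so the regression estimates reproduce the group means printed by proc ttest (3.7751 and 3.0693):

```latex
\begin{aligned}
\bar{y}_{deep=0} &= b_0 = 3.77510 \\
\bar{y}_{deep=1} &= b_0 + b_1 = 3.77510 - 0.70578 = 3.06932
\end{aligned}
```

The Diff (1-2) row, 0.7058, is the first group mean minus the second, which is why it has the opposite sign from the deep coefficient.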
page 87 Figure 3.3 Regression of log chloride concentration on a dummy variable for well type.
symbol1 color=black interpol=r value=circle height=0.5;
axis1 order=(1 to 8 by 1);
axis2 order=(0 1);
proc gplot data=wells2;
  plot lnchlor*deep / vaxis=axis1 haxis=axis2;
run; 
quit;
page 87 Equation [3.33]
data wells3;
  set wells2;
  lndroad = log(droad);
run;
proc reg data=wells3;
 model lnchlor = deep lndroad;
 output out = wells4 p = yhat; 
run;
quit;
The REG Procedure
Model: MODEL1
Dependent Variable: lnchlor

                             Analysis of Variance

                                    Sum of           Mean
Source                   DF        Squares         Square    F Value    Pr > F

Model                     2        4.50188        2.25094       1.21    0.3084
Error                    49       91.52032        1.86776
Corrected Total          51       96.02220


Root MSE              1.36666    R-Square     0.0469
Dependent Mean        3.20505    Adj R-Sq     0.0080
Coeff Var            42.64091

                        Parameter Estimates

                     Parameter       Standard
Variable     DF       Estimate          Error    t Value    Pr > |t|

Intercept     1        4.20954        0.96096       4.38      <.0001
deep          1       -0.69712        0.48119      -1.45      0.1538
lndroad       1       -0.09097        0.17972      -0.51      0.6150

Slope Dummy Variables

page 88 Figure 3.4 Regression of log chloride concentration on log distance from road and an intercept dummy variable for well type.
data wells5;
 set wells4;
 if deep=0 then yhat0=yhat;
 if deep=1 then yhat1=yhat;
run;
symbol1 color=black interpol=none value=circle height=0.5;
symbol2 interpol=join;
symbol3 interpol=join;
axis1 order=(0 to 7 by 1);
axis2 order=(0 to 8 by 2);
proc gplot data=wells5;
  plot lnchlor*lndroad=1 yhat0*lndroad=2 yhat1*lndroad=3 / 
       overlay vaxis=axis1 haxis=axis2;
run;
quit;
The proc gplot step above produces Figure 3.4. The overlay option tells SAS to draw the three plots on a single set of axes. The number after each equals sign refers to one of the symbol statements above, telling SAS which statement applies to which plot. If only one symbol statement were defined, SAS would apply it to all three plots. Note also that you can issue a goptions reset=all statement before the symbol statements. This resets all graphics options, so that settings from previous graphs are not carried over to the current graph.
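As a sketch of where the reset would go, the statements for this figure could be preceded by goptions reset=all as shown here. Everything after the first statement is copied from the code above; only the placement of the reset is being illustrated.

```sas
/* Cancel symbol, axis, and other graphics settings left from earlier plots */
goptions reset=all;
symbol1 color=black interpol=none value=circle height=0.5;
symbol2 interpol=join;
symbol3 interpol=join;
axis1 order=(0 to 7 by 1);
axis2 order=(0 to 8 by 2);
proc gplot data=wells5;
  plot lnchlor*lndroad=1 yhat0*lndroad=2 yhat1*lndroad=3 /
       overlay vaxis=axis1 haxis=axis2;
run;
quit;
```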
page 89 Figure 3.5 Regression of log chloride concentration on log distance from road and a slope dummy variable for well type.
data wells6;
 set wells5;
 lndroad = log(droad);
 deeproad = deep*lndroad;
run;
proc reg data = wells6;
 model lnchlor = lndroad deeproad;
 output out = wells7 p = yhata; 
run;
quit;
The REG Procedure
Model: MODEL1
Dependent Variable: lnchlor

                             Analysis of Variance

                                    Sum of           Mean
Source                   DF        Squares         Square    F Value    Pr > F

Model                     2        1.87088        0.93544       0.49    0.6175
Error                    49       94.15131        1.92146
Corrected Total          51       96.02220


Root MSE              1.38617    R-Square     0.0195
Dependent Mean        3.20505    Adj R-Sq    -0.0205
Coeff Var            43.24948

                        Parameter Estimates

                     Parameter       Standard
Variable     DF       Estimate          Error    t Value    Pr > |t|

Intercept     1        3.66615        0.90518       4.05      0.0002
lndroad       1       -0.02897        0.20187      -0.14      0.8865
deeproad      1       -0.08147        0.09946      -0.82      0.4167
data wells8;
 set wells7;
 if deep=0 then yhat0=yhata;
 if deep=1 then yhat1=yhata;
run;
symbol1 color=black interpol=none value=circle height=0.5;
symbol2 interpol=join;
symbol3 interpol=join;
axis1 order=(0 to 7 by 1);
axis2 order=(0 to 8 by 2);
proc gplot data=wells8;
  plot lnchlor*lndroad=1 yhat0*lndroad=2 yhat1*lndroad=3 / 
       overlay vaxis=axis1 haxis=axis2;
run; 
quit;
page 90 Table 3.4 Regression of log chloride concentration on log distance from road, with intercept and slope dummy variables for well type.
proc reg data = wells8;
 model lnchlor = deep lndroad deeproad;
 output out = wells9 p = yhatb; 
run;
quit;
The REG Procedure
Model: MODEL1
Dependent Variable: lnchlor

                             Analysis of Variance

                                    Sum of           Mean
Source                   DF        Squares         Square    F Value    Pr > F

Model                     3       18.48313        6.16104       3.81    0.0157
Error                    48       77.53907        1.61540
Corrected Total          51       96.02220

Root MSE              1.27098    R-Square     0.1925
Dependent Mean        3.20505    Adj R-Sq     0.1420
Coeff Var            39.65568

                        Parameter Estimates

                     Parameter       Standard
Variable     DF       Estimate          Error    t Value    Pr > |t|

Intercept     1        9.07346        1.87938       4.83      <.0001
deep          1       -6.71737        2.09471      -3.21      0.0024
lndroad       1       -1.10942        0.38442      -2.89      0.0058
deeproad      1        1.25585        0.42688       2.94      0.0050
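With both an intercept dummy and a slope dummy in the model, the estimates in Table 3.4 imply a separate line for each well type. Substituting deep = 0 and deep = 1 into the fitted equation gives the following, assuming deep = 1 marks the deep wells:

```latex
\begin{aligned}
\text{shallow } (deep=0):\quad \hat{y} &= 9.07346 - 1.10942\,\mathit{lndroad} \\
\text{deep } (deep=1):\quad \hat{y} &= (9.07346 - 6.71737) + (-1.10942 + 1.25585)\,\mathit{lndroad} \\
 &= 2.35609 + 0.14643\,\mathit{lndroad}
\end{aligned}
```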
page 91 Figure 3.6 Regression of log chloride concentration on log distance from road, with slope and intercept dummy variables for well type.
data wells10;
 set wells9;
 if deep=0 then yhat0=yhatb;
 if deep=1 then yhat1=yhatb;
 if deep=0 then lnchlor0=lnchlor;
 if deep=1 then lnchlor1=lnchlor;
run;

symbol1 color=black interpol=none value=square height=1.0; 
symbol2 color=black interpol=none value=circle height=1.0;
symbol3 interpol=join;
symbol4 interpol=join;
axis1 order=(0 to 9 by 1);
axis2 order=(0 to 8 by 2);
proc gplot data=wells10;
  plot lnchlor0*lndroad=1 lnchlor1*lndroad=2 yhat0*lndroad=3 
  yhat1*lndroad=4 / overlay vaxis=axis1 haxis=axis2;
run; 
quit;
Figure 3.6
page 91 Figure 3.7 Separate regressions for shallow (left) and deep (right) wells, same lines as in Figure 3.6.
proc gplot data=wells10;
  plot lnchlor0*lndroad=1 yhat0*lndroad=3 / overlay vaxis=axis1 haxis=axis2;
run;
quit;
proc gplot data=wells10;
 plot lnchlor1*lndroad=2 yhat1*lndroad=4 / overlay vaxis=axis1 haxis=axis2;
run;
quit;

Oneway Analysis of Variance

We will use the radon data set for the next examples. First, we need to create some new variables: the dummies rdx1 and fdx2, mhr (a three-level recode of radon), and the dummies lrdx3 and mrdx4.
data radon1;
 set radon;
 if locale='RProng' then rdx1=1;
 if locale='Fringe' then rdx1=0;
 if locale='Control' then rdx1=0;
 if locale='RProng' then fdx2=0;
 if locale='Fringe' then fdx2=1;
 if locale='Control' then fdx2=0;
 if radon >= 0 and radon <= 1.5 then mhr='low';
 if radon >= 1.6 and radon <= 2.4 then mhr='mid';
 if radon >= 2.5 then mhr='hig';
 if mhr='low' then lrdx3=1;
 if mhr='mid' then lrdx3=0;
 if mhr='hig' then lrdx3=0;
 if mhr='low' then mrdx4=0;
 if mhr='mid' then mrdx4=1;
 if mhr='hig' then mrdx4=0;
run;
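The 0/1 dummies can also be written more compactly, because a SAS comparison evaluates to 1 when true and 0 when false. A minimal sketch for the two locale dummies; the output data set name radon1b is hypothetical, chosen so that it does not overwrite radon1:

```sas
data radon1b;
  set radon;
  /* Each comparison yields 1 if true and 0 if false */
  rdx1 = (locale = 'RProng');
  fdx2 = (locale = 'Fringe');
run;
```

One difference from the if/then version: a missing or unexpected value of locale gets 0 here, whereas the if/then statements would leave the dummy missing.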
page 93 Table 3.5 Cancer, bedrock, and radon in 26 counties.
proc print data=radon1 noobs;
 var county cancer locale rdx1 fdx2 mhr lrdx3 mrdx4;
run; 
county         cancer    locale     rdx1    fdx2    mhr    lrdx3    mrdx4

Orange           6.0     RProng       1       0     low      1        0
Putnam          10.5     RProng       1       0     mid      0        1
Sussex           6.7     RProng       1       0     mid      0        1
Warren           6.0     RProng       1       0     hig      0        0
Morris           6.1     RProng       1       0     low      1        0
Hunterdon        6.7     RProng       1       0     hig      0        0
Berks            5.2     Fringe       0       1     hig      0        0
Lehigh           5.6     Fringe       0       1     hig      0        0
Northampton      5.8     Fringe       0       1     hig      0        0
Pike             4.5     Fringe       0       1     low      1        0
Dutchess         5.5     Fringe       0       1     mid      0        1
Sullivan         5.4     Fringe       0       1     low      1        0
Ulster           6.3     Fringe       0       1     low      1        0
Columbia         6.3     Control      0       0     mid      0        1
Delaware         4.3     Control      0       0     mid      0        1
Greene           4.0     Control      0       0     mid      0        1
Otsego           5.9     Control      0       0     mid      0        1
Tioga            4.7     Control      0       0     mid      0        1
Carbon           4.8     Control      0       0     mid      0        1
Lebanon          5.8     Control      0       0     hig      0        0
Lackawanna       5.4     Control      0       0     low      1        0
Luzerne          5.2     Control      0       0     low      1        0
Schuylkill       3.6     Control      0       0     hig      0        0
Susquehanna      4.3     Control      0       0     low      1        0
Wayne            3.5     Control      0       0     low      1        0
Wyoming          6.9     Control      0       0     mid      0        1
page 94 Table 3.6 Relation between cancer rate and bedrock area.
proc reg data=radon1;
 model cancer = rdx1 fdx2;
 t1: test rdx1=0, fdx2=0;
run;
quit;
The REG Procedure
Model: MODEL1
Dependent Variable: cancer

                             Analysis of Variance

                                    Sum of           Mean
Source                   DF        Squares         Square    F Value    Pr > F

Model                     2       16.90879        8.45440       6.41    0.0061
Error                    23       30.33736        1.31902
Corrected Total          25       47.24615


Root MSE              1.14848    R-Square     0.3579
Dependent Mean        5.57692    Adj R-Sq     0.3021
Coeff Var            20.59351

                        Parameter Estimates

                     Parameter       Standard
Variable     DF       Estimate          Error    t Value    Pr > |t|

Intercept     1        4.97692        0.31853      15.62      <.0001
rdx1          1        2.02308        0.56683       3.57      0.0016
fdx2          1        0.49451        0.53842       0.92      0.3679

The REG Procedure
Model: MODEL1

      Test T1 Results for Dependent Variable cancer

                                Mean
Source             DF         Square    F Value    Pr > F

Numerator           2        8.45440       6.41    0.0061
Denominator        23        1.31902
proc glm data=radon1;
 model cancer = rdx1 fdx2;
run;
quit;
The GLM Procedure

Number of observations    26
The GLM Procedure

Dependent Variable: cancer

                                        Sum of
Source                      DF         Squares     Mean Square    F Value    Pr > F

Model                        2     16.90879121      8.45439560       6.41    0.0061

Error                       23     30.33736264      1.31901577

Corrected Total             25     47.24615385

R-Square     Coeff Var      Root MSE    cancer Mean

0.357887      20.59351      1.148484       5.576923

Source                      DF       Type I SS     Mean Square    F Value    Pr > F

rdx1                         1     15.79615385     15.79615385      11.98    0.0021
fdx2                         1      1.11263736      1.11263736       0.84    0.3679

Source                      DF     Type III SS     Mean Square    F Value    Pr > F

rdx1                         1     16.80218623     16.80218623      12.74    0.0016
fdx2                         1      1.11263736      1.11263736       0.84    0.3679

                                  Standard
Parameter         Estimate           Error    t Value    Pr > |t|

Intercept      4.976923077      0.31853218      15.62      <.0001
rdx1           2.023076923      0.56683217       3.57      0.0016
fdx2           0.494505495      0.53841766       0.92      0.3679
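
Since proc glm can build the dummy variables itself, the same one-way analysis can be sketched with a class statement instead of the hand-coded rdx1 and fdx2. The overall F-test should match the one above (6.41 on 2 and 23 degrees of freedom), because the model is the same. The individual parameter estimates may differ, since proc glm chooses its own reference level.

```sas
proc glm data=radon1;
  /* locale is treated as a categorical predictor with three levels */
  class locale;
  model cancer = locale;
run;
quit;
```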

Twoway Analysis of Variance

page 96 Table 3.7 Relation among cancer rate, bedrock area, and radon.
proc reg data=radon1;
 model cancer = rdx1 fdx2 lrdx3 mrdx4;
 t2: test rdx1=0, fdx2=0;
 t3: test lrdx3=0, mrdx4=0;
run;
quit;
The REG Procedure
Model: MODEL1
Dependent Variable: cancer

                             Analysis of Variance

                                    Sum of           Mean
Source                   DF        Squares         Square    F Value    Pr > F

Model                     4       22.12361        5.53090       4.62    0.0078
Error                    21       25.12254        1.19631
Corrected Total          25       47.24615


Root MSE              1.09376    R-Square     0.4683
Dependent Mean        5.57692    Adj R-Sq     0.3670
Coeff Var            19.61225

                        Parameter Estimates

                     Parameter       Standard
Variable     DF       Estimate          Error    t Value    Pr > |t|

Intercept     1        4.52504        0.52808       8.57      <.0001
rdx1          1        2.21189        0.55102       4.01      0.0006
fdx2          1        0.86698        0.54921       1.58      0.1294
lrdx3         1       -0.11668        0.55602      -0.21      0.8358
mrdx4         1        0.90588        0.57595       1.57      0.1307

The REG Procedure
Model: MODEL1

      Test T2 Results for Dependent Variable cancer

                                Mean
Source             DF         Square    F Value    Pr > F

Numerator           2        9.64232       8.06    0.0025
Denominator        21        1.19631

The REG Procedure
Model: MODEL1

      Test T3 Results for Dependent Variable cancer

                                Mean
Source             DF         Square    F Value    Pr > F

Numerator           2        2.60741       2.18    0.1380
Denominator        21        1.19631
proc glm data=radon1;
 model cancer = rdx1 fdx2 lrdx3 mrdx4;
run;
quit;
The GLM Procedure

Number of observations    26
The GLM Procedure

Dependent Variable: cancer

                                        Sum of
Source                      DF         Squares     Mean Square    F Value    Pr > F

Model                        4     22.12361500      5.53090375       4.62    0.0078

Error                       21     25.12253885      1.19631137

Corrected Total             25     47.24615385

R-Square     Coeff Var      Root MSE    cancer Mean

0.468263      19.61225      1.093760       5.576923

Source                      DF       Type I SS     Mean Square    F Value    Pr > F

rdx1                         1     15.79615385     15.79615385      13.20    0.0016
fdx2                         1      1.11263736      1.11263736       0.93    0.3458
lrdx3                        1      2.25529715      2.25529715       1.89    0.1842
mrdx4                        1      2.95952664      2.95952664       2.47    0.1307

Source                      DF     Type III SS     Mean Square    F Value    Pr > F

rdx1                         1     19.27700707     19.27700707      16.11    0.0006
fdx2                         1      2.98111766      2.98111766       2.49    0.1294
lrdx3                        1      0.05267737      0.05267737       0.04    0.8358
mrdx4                        1      2.95952664      2.95952664       2.47    0.1307

                                  Standard
Parameter         Estimate           Error    t Value    Pr > |t|

Intercept      4.525039288      0.52807932       8.57      <.0001
rdx1           2.211891042      0.55101833       4.01      0.0006
fdx2           0.866980967      0.54921466       1.58      0.1294
lrdx3         -0.116675397      0.55601868      -0.21      0.8358
mrdx4          0.905884407      0.57594866       1.57      0.1307
page 98 Table 3.8 Relation among cancer rate, bedrock area, and radon.
data radon2;
 set radon1;
 x1x3=rdx1*lrdx3;
 x1x4=rdx1*mrdx4;
 x2x3=fdx2*lrdx3;
 x2x4=fdx2*mrdx4;
run;

proc reg data=radon2;
  model cancer = rdx1 fdx2 lrdx3 mrdx4 x1x3 x1x4 x2x3 x2x4;
  t4: test x1x3=0, x1x4=0, x2x3=0, x2x4=0;
run;
quit;
The REG Procedure
Model: MODEL1
Dependent Variable: cancer

                             Analysis of Variance

                                    Sum of           Mean
Source                   DF        Squares         Square    F Value    Pr > F

Model                     8       26.03520        3.25440       2.61    0.0460
Error                    17       21.21095        1.24770
Corrected Total          25       47.24615

Root MSE              1.11701    R-Square     0.5511
Dependent Mean        5.57692    Adj R-Sq     0.3398
Coeff Var            20.02908

                        Parameter Estimates

                     Parameter       Standard
Variable     DF       Estimate          Error    t Value    Pr > |t|

Intercept     1        4.70000        0.78984       5.95      <.0001
rdx1          1        1.65000        1.11701       1.48      0.1579
fdx2          1        0.83333        1.01968       0.82      0.4251
lrdx3         1       -0.10000        0.96736      -0.10      0.9189
mrdx4         1        0.57143        0.89560       0.64      0.5319
x1x3          1       -0.20000        1.47766      -0.14      0.8939
x1x4          1        1.67857        1.43171       1.17      0.2572
x2x3          1       -0.03333        1.32950      -0.03      0.9803
x2x4          1       -0.60476        1.57025      -0.39      0.7049

The REG Procedure
Model: MODEL1

      Test T4 Results for Dependent Variable cancer

                                Mean
Source             DF         Square    F Value    Pr > F

Numerator           4        0.97790       0.78    0.5513
Denominator        17        1.24770
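The T4 statistic can be reproduced by hand from the two analysis of variance tables above. This is a sketch, assuming the four-predictor model (error sum of squares 25.12254 on 21 df) is the reduced model nested within the eight-predictor model (error sum of squares 21.21095 on 17 df). The symbols SSE_R, SSE_F, df_R and df_F are notation introduced here, not taken from the text.

```latex
\[
F = \frac{(\mathrm{SSE}_R - \mathrm{SSE}_F)/(df_R - df_F)}{\mathrm{SSE}_F/df_F}
  = \frac{(25.12254 - 21.21095)/(21 - 17)}{21.21095/17}
  = \frac{0.97790}{1.24770} \approx 0.78
\]
```

The numerator and denominator agree with the mean squares reported for Test T4.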
page 99 Table 3.9 Effect coding of bedrock area from Table 3.5.
data radon3;
 set radon2;
 /* effect codes for bedrock area: Control is coded -1 on both variables */
 if locale='RProng' then rev1=1;
 if locale='Fringe' then rev1=0;
 if locale='Control' then rev1=-1;
 if locale='RProng' then fev2=0;
 if locale='Fringe' then fev2=1;
 if locale='Control' then fev2=-1;
 /* effect codes for radon level: 'hig' is coded -1 on both variables */
 if mhr='low' then v3=1;
 if mhr='mid' then v3=0;
 if mhr='hig' then v3=-1;
 if mhr='low' then v4=0;
 if mhr='mid' then v4=1;
 if mhr='hig' then v4=-1;
 /* interaction terms: products of the effect-coded variables */
 v1v3=rev1*v3;
 v1v4=rev1*v4;
 v2v3=fev2*v3;
 v2v4=fev2*v4;
run;
proc print data=radon3 noobs;
 var county locale rdx1 rev1 fdx2 fev2;
run;
county         locale     rdx1    rev1    fdx2    fev2

Orange         RProng       1       1       0       0
Putnam         RProng       1       1       0       0
Sussex         RProng       1       1       0       0
Warren         RProng       1       1       0       0
Morris         RProng       1       1       0       0
Hunterdon      RProng       1       1       0       0
Berks          Fringe       0       0       1       1
Lehigh         Fringe       0       0       1       1
Northampton    Fringe       0       0       1       1
Pike           Fringe       0       0       1       1
Dutchess       Fringe       0       0       1       1
Sullivan       Fringe       0       0       1       1
Ulster         Fringe       0       0       1       1
Columbia       Control      0      -1       0      -1
Delaware       Control      0      -1       0      -1
Greene         Control      0      -1       0      -1
Otsego         Control      0      -1       0      -1
Tioga          Control      0      -1       0      -1
Carbon         Control      0      -1       0      -1
Lebanon        Control      0      -1       0      -1
Lackawanna     Control      0      -1       0      -1
Luzerne        Control      0      -1       0      -1
Schuylkill     Control      0      -1       0      -1
Susquehanna    Control      0      -1       0      -1
Wayne          Control      0      -1       0      -1
Wyoming        Control      0      -1       0      -1
The proc print below shows the remaining effect codes (v3 and v4 for radon level) and the interaction terms.
proc print data=radon3;
 var county locale rev1 fev2 mhr v3 v4 v1v3 v1v4 v2v3 v2v4;
run;
Obs  county      locale  rev1 fev2    mhr    v3    v4    v1v3    v1v4    v2v3    v2v4
  1  Orange      RProng    1    0     low     1     0      1       0       0       0
  2  Putnam      RProng    1    0     mid     0     1      0       1       0       0
  3  Sussex      RProng    1    0     mid     0     1      0       1       0       0
  4  Warren      RProng    1    0     hig    -1    -1     -1      -1       0       0
  5  Morris      RProng    1    0     low     1     0      1       0       0       0
  6  Hunterdon   RProng    1    0     hig    -1    -1     -1      -1       0       0
  7  Berks       Fringe    0    1     hig    -1    -1      0       0      -1      -1
  8  Lehigh      Fringe    0    1     hig    -1    -1      0       0      -1      -1
  9  Northampton Fringe    0    1     hig    -1    -1      0       0      -1      -1
 10  Pike        Fringe    0    1     low     1     0      0       0       1       0
 11  Dutchess    Fringe    0    1     mid     0     1      0       0       0       1
 12  Sullivan    Fringe    0    1     low     1     0      0       0       1       0
 13  Ulster      Fringe    0    1     low     1     0      0       0       1       0
 14  Columbia    Control  -1   -1     mid     0     1      0      -1       0      -1
 15  Delaware    Control  -1   -1     mid     0     1      0      -1       0      -1
 16  Greene      Control  -1   -1     mid     0     1      0      -1       0      -1
 17  Otsego      Control  -1   -1     mid     0     1      0      -1       0      -1
 18  Tioga       Control  -1   -1     mid     0     1      0      -1       0      -1
 19  Carbon      Control  -1   -1     mid     0     1      0      -1       0      -1
 20  Lebanon     Control  -1   -1     hig    -1    -1      1       1       1       1
 21  Lackawanna  Control  -1   -1     low     1     0     -1       0      -1       0
 22  Luzerne     Control  -1   -1     low     1     0     -1       0      -1       0
 23  Schuylkill  Control  -1   -1     hig    -1    -1      1       1       1       1
 24  Susquehanna Control  -1   -1     low     1     0     -1       0      -1       0
 25  Wayne       Control  -1   -1     low     1     0     -1       0      -1       0
 26  Wyoming     Control  -1   -1     mid     0     1      0      -1       0      -1
page 100 Table 3.10 Relation among cancer rate, bedrock area, and radon.
proc reg data=radon3;
 model cancer = rev1 fev2 v3 v4 v1v3 v1v4 v2v3 v2v4;
 t5: test v1v3=0, v1v4=0, v2v3=0, v2v4=0;
run;
quit;
The REG Procedure
Model: MODEL1
Dependent Variable: cancer

                             Analysis of Variance

                                    Sum of           Mean
Source                   DF        Squares         Square    F Value    Pr > F

Model                     8       26.03520        3.25440       2.61    0.0460
Error                    17       21.21095        1.24770
Corrected Total          25       47.24615

Root MSE              1.11701    R-Square     0.5511
Dependent Mean        5.57692    Adj R-Sq     0.3398
Coeff Var            20.02908

                        Parameter Estimates

                     Parameter       Standard
Variable     DF       Estimate          Error    t Value    Pr > |t|

Intercept     1        5.77831        0.25006      23.11      <.0001
rev1          1        1.22169        0.36311       3.36      0.0037
fev2          1       -0.30053        0.37356      -0.80      0.4322
v3            1       -0.42831        0.33555      -1.28      0.2190
v4            1        0.67884        0.37209       1.82      0.0857
v1v3          1       -0.52169        0.50123      -1.04      0.3125
v1v4          1        0.92116        0.52639       1.75      0.0981
v2v3          1        0.35053        0.48562       0.72      0.4802
v2v4          1       -0.65661        0.59507      -1.10      0.2852

The REG Procedure
Model: MODEL1

      Test T5 Results for Dependent Variable cancer

                                Mean
Source             DF         Square    F Value    Pr > F

Numerator           4        0.97790       0.78    0.5513
Denominator        17        1.24770
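Dummy coding (Table 3.8) and effect coding (Table 3.10) fit the same nine cell means, which is why the two analysis of variance tables and the two interaction tests are identical. The worked check below assumes that lrdx3 and mrdx4 are 0/1 indicators for low and mid radon, with high radon as the reference category. That coding is inferred from the variable names and is not shown in this section. The symbol \bar{y}_{jk} is notation introduced here for the fitted mean in bedrock area j and radon level k.

```latex
% Fitted cell means implied by the dummy-coded estimates (Table 3.8)
\[
\begin{array}{lccc}
 & \text{low} & \text{mid} & \text{high} \\
\text{RProng}  & 6.05 & 8.60    & 6.35 \\
\text{Fringe}  & 5.40 & 5.50    & 5.53333 \\
\text{Control} & 4.60 & 5.27143 & 4.70
\end{array}
\]
% Example cell: RProng, mid radon
\[
\bar{y}_{\text{RProng},\text{mid}} = 4.70000 + 1.65000 + 0.57143 + 1.67857 = 8.60000
\]
% Effect-coded intercept: unweighted mean of the nine cell means
\[
\frac{1}{9}\sum_{j,k}\bar{y}_{jk} = \frac{52.00476}{9} \approx 5.77831
\]
% Effect-coded rev1: RProng row mean minus that unweighted mean
\[
\frac{6.05 + 8.60 + 6.35}{3} - 5.77831 = 7.00000 - 5.77831 = 1.22169
\]
```

These values reproduce the Intercept and rev1 estimates in Table 3.10.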

These pages are Copyrighted (c) by UCLA Academic Technology Services

