SAS Textbook Examples
Applied Linear Statistical Models by Neter, Kutner, et al.
Chapter 6: Multiple Regression I

NOTE: This page has been delinked. It is no longer being maintained, and information on this page may be out of date. Because this page has been delinked, we cannot answer questions regarding this page.

Inputting the data shown on page 241.
data ch6fig05;
  input x1 x2 y;
  label x1='targtpop'
        x2='dispoinc';
cards;
  68.5  16.7  174.4
  45.2  16.8  164.4
  91.3  18.2  244.2
  47.8  16.3  154.6
  46.9  17.3  181.6
  66.1  18.2  207.5
  49.5  15.9  152.8
  52.0  17.2  163.2
  48.9  16.6  145.4
  38.4  16.0  137.2
  87.9  18.3  241.9
  72.8  17.1  191.1
  88.4  17.4  232.0
  42.9  15.8  145.3
  52.5  17.8  161.1
  85.7  18.4  209.7
  41.3  16.5  146.4
  51.7  16.3  144.0
  89.6  18.1  232.6
  82.7  19.1  224.1
  52.3  16.0  166.5
  ;
run;
Creating the x1x2 variable to be used in Fig. 6.7
data ch6fig05a;
  set ch6fig05;
  x1x2 = x1*x2;
run;
Fig. 6.4a, p. 237.
Scatterplot matrix.
Note: Invoking a macro for the scatter matrix.
%include "c:\neter\scatter.sas";
%scatter(data = ch6fig05a, var = y x1 x2);
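If the scatter.sas macro is not available, a similar scatterplot matrix can probably be obtained with proc sgscatter. This is only a sketch, not the method used on this page; it assumes SAS 9.2 or later and the data set ch6fig05a created above.

```sas
/* Sketch: scatterplot matrix without the %scatter macro.          */
/* Assumption: SAS 9.2+ (proc sgscatter); ch6fig05a already exists. */
proc sgscatter data = ch6fig05a;
  matrix y x1 x2;   /* same three variables passed to %scatter */
run;
```

The layout and axis labels will likely differ from those of the macro and from the book's figure.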
Fig. 6.4b, p. 237.
Correlation matrix.
proc corr data = ch6fig05a;
run;
The CORR Procedure

   4  Variables:    x1       x2       y        x1x2

                                       Simple Statistics

Variable          N         Mean      Std Dev          Sum      Minimum      Maximum   Label

x1               21     62.01905     18.62033         1302     38.40000     91.30000   targtpop
x2               21     17.14286      0.97035    360.00000     15.80000     19.10000   dispoinc
y                21    181.90476     36.19130         3820    137.20000    244.20000
x1x2             21         1077    373.86333        22609    614.40000         1662

           Pearson Correlation Coefficients, N = 21
                   Prob > |r| under H0: Rho=0

                    x1            x2             y          x1x2

x1             1.00000       0.78130       0.94455       0.99442
targtpop                      <.0001        <.0001        <.0001

x2             0.78130       1.00000       0.83580       0.83951
dispoinc        <.0001                      <.0001        <.0001

y              0.94455       0.83580       1.00000       0.95558
                <.0001        <.0001                      <.0001

x1x2           0.99442       0.83951       0.95558       1.00000
                <.0001        <.0001        <.0001
Fig. 6.5a and b, p. 241.
Note that the output statement is used to create the data set outfig05, which contains the fitted and residual values.
proc reg data = ch6fig05a;
  var x1x2;
  model y = x1 x2/ i;
  output out=outfig05 p = fitted r = residual;
run;
quit;
The REG Procedure
Model: MODEL1
Dependent Variable: y

                           X'X Inverse, Parameter Estimates, and SSE

Variable       Label             Intercept                x1                x2                 y

Intercept      Intercept      29.728923483      0.0721834719      -1.992553186      -68.85707315
x1             targtpop       0.0721834719      0.0003701761      -0.005549917      1.4545595828
x2             dispoinc       -1.992553186      -0.005549917      0.1363106368      9.3655003765
y                             -68.85707315      1.4545595828      9.3655003765      2180.9274114

                             Analysis of Variance

                                    Sum of           Mean
Source                   DF        Squares         Square    F Value    Pr > F

Model                     2          24015          12008      99.10    <.0001
Error                    18     2180.92741      121.16263
Corrected Total          20          26196

Root MSE             11.00739    R-Square     0.9167
Dependent Mean      181.90476    Adj R-Sq     0.9075
Coeff Var             6.05118

                               Parameter Estimates

                                  Parameter       Standard
Variable     Label        DF       Estimate          Error    t Value    Pr > |t|

Intercept    Intercept     1      -68.85707       60.01695      -1.15      0.2663
x1           targtpop      1        1.45456        0.21178       6.87      <.0001
x2           dispoinc      1        9.36550        4.06396       2.30      0.0333
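The X'X inverse printed by the i option ties directly to the Standard Error column, because the estimated variance-covariance matrix of the coefficients is MSE times the inverse of X'X. The check below uses only values printed above; the s{b} notation is assumed to follow the textbook's convention.

```latex
\begin{align*}
s^2\{\mathbf{b}\} &= MSE\,(\mathbf{X}'\mathbf{X})^{-1}, \qquad MSE = 121.16263 \\
s\{b_0\} &= \sqrt{121.16263 \times 29.728923483} \approx 60.017 \\
s\{b_1\} &= \sqrt{121.16263 \times 0.0003701761} \approx 0.2118 \\
s\{b_2\} &= \sqrt{121.16263 \times 0.1363106368} \approx 4.064
\end{align*}
```

These agree with the printed standard errors 60.01695, 0.21178 and 4.06396. The last row and column of the same table repeat the parameter estimates and, in the corner, the error sum of squares 2180.927.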
Show all variables, including fitted and residual, as shown in Fig. 6.5b, p. 241.
proc print data = outfig05;
  var y x1 x2 fitted residual;
run;
Obs      y       x1      x2      fitted    residual

  1    174.4    68.5    16.7    187.184    -12.7841
  2    164.4    45.2    16.8    154.229     10.1706
  3    244.2    91.3    18.2    234.396      9.8037
  4    154.6    47.8    16.3    153.329      1.2715
  5    181.6    46.9    17.3    161.385     20.2151
  6    207.5    66.1    18.2    197.741      9.7586
  7    152.8    49.5    15.9    152.055      0.7449
  8    163.2    52.0    17.2    167.867     -4.6666
  9    145.4    48.9    16.6    157.738    -12.3382
 10    137.2    38.4    16.0    136.846      0.3540
 11    241.9    87.9    18.3    230.387     11.5126
 12    191.1    72.8    17.1    197.185     -6.0849
 13    232.0    88.4    17.4    222.686      9.3143
 14    145.3    42.9    15.8    141.518      3.7816
 15    161.1    52.5    17.8    174.213    -13.1132
 16    209.7    85.7    18.4    228.124    -18.4239
 17    146.4    41.3    16.5    145.747      0.6530
 18    144.0    51.7    16.3    159.001    -15.0013
 19    232.6    89.6    18.1    230.987      1.6130
 20    224.1    82.7    19.1    230.316     -6.2161
 21    166.5    52.3    16.0    157.064      9.4356
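As a quick check on the listing, the fitted value and residual for observation 1 can be reproduced by hand from the parameter estimates reported by proc reg (the estimates are rounded to five decimals, so the last digit may differ slightly):

```latex
\begin{align*}
\hat{Y}_1 &= -68.85707 + 1.45456\,(68.5) + 9.36550\,(16.7) \\
          &= -68.85707 + 99.63736 + 156.40385 \approx 187.184 \\
e_1 &= Y_1 - \hat{Y}_1 = 174.4 - 187.184 = -12.784
\end{align*}
```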
Note: To recreate the 3-D plots in Fig. 6.6, use interactive data analysis in SAS; see our web page http://www.ats.ucla.edu/stat/sas/teach/reg_int/reg_int_cont.htm .
Fig. 6.7, p. 246, showing 4 different diagnostic plots.
proc gplot data = outfig05;
  plot residual*fitted;
run;
proc gplot data = outfig05;
  plot residual*x1;
run;
proc gplot data = outfig05;
  plot residual*x2;
run;
proc gplot data = outfig05;
  plot residual*x1x2;
run;
Fig. 6.8a-Fig. 6.8d, p. 247, could all have been obtained in a single proc gplot step, as shown below.
proc gplot data = outfig05;
  plot residual*fitted;
  plot residual*x1;
  plot residual*x2;
  plot residual*x1x2;
run;
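The four plot requests can probably be shortened further, because the plot statement accepts a parenthesized list of horizontal-axis variables. This is a sketch that assumes the data set outfig05 created above.

```sas
/* Sketch: one plot statement, residual against each listed variable in turn. */
/* Assumption: outfig05 holds residual, fitted, x1, x2 and x1x2.              */
proc gplot data = outfig05;
  plot residual*(fitted x1 x2 x1x2);
run;
```

Each variable in the parentheses should yield its own graph, in the order listed.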
Fig 6.8a, page 247.
data outfig08;
  set outfig05;
  absresid = abs(residual);
run;
 
proc gplot data=outfig08;
  plot absresid*fitted;
run;
Fig. 6.8b, p. 247, normal probability plot.
Note: The labels on the X-axis differ from those in the book.
proc univariate data = outfig05 noprint ;
  qqplot residual / normal;
run;
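To get an X-axis closer to the book's, which plots residuals against their expected values under normality, one option is to compute the expected values explicitly. The sketch below assumes the book uses the approximation sqrt(MSE) * z[(k - .375)/(n + .25)], which corresponds to the Blom scores from proc rank. The names outfig08b, nscore and expected are made up for this illustration.

```sas
/* Sketch: expected residuals under normality via Blom normal scores.     */
/* Assumption: outfig05 exists; outfig08b, nscore, expected are new names. */
proc rank data = outfig05 normal = blom out = outfig08b;
  var residual;
  ranks nscore;          /* z[(k - 3/8)/(n + 1/4)] for the k-th smallest residual */
run;

data outfig08b;
  set outfig08b;
  expected = 11.00739 * nscore;   /* Root MSE taken from the proc reg output */
run;

proc gplot data = outfig08b;
  plot residual*expected;
run;
```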
Estimation of Mean Response and Prediction Limits for New Observations, p. 249-251. Adding two extra lines of data, with y set to missing, in order to obtain predictions.
data ch6fig05h;
  input x1 x2 y;
cards;
  68.5  16.7  174.4
  45.2  16.8  164.4
  91.3  18.2  244.2
  47.8  16.3  154.6
  46.9  17.3  181.6
  66.1  18.2  207.5
  49.5  15.9  152.8
  52.0  17.2  163.2
  48.9  16.6  145.4
  38.4  16.0  137.2
  87.9  18.3  241.9
  72.8  17.1  191.1
  88.4  17.4  232.0
  42.9  15.8  145.3
  52.5  17.8  161.1
  85.7  18.4  209.7
  41.3  16.5  146.4
  51.7  16.3  144.0
  89.6  18.1  232.6
  82.7  19.1  224.1
  52.3  16.0  166.5
  65.4  17.6    .  
  53.1  17.7    .  
  ;
run;
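Retyping the 21 original cases is not strictly necessary. As a sketch, the two new cases can be read into their own data set and concatenated with ch6fig05; the name newobs is made up for this illustration.

```sas
/* Sketch: append the two new cases instead of re-entering all the data. */
/* Assumption: ch6fig05 exists; newobs is a hypothetical name.           */
data newobs;
  input x1 x2;           /* y is not read, so it is missing in ch6fig05h */
cards;
  65.4  17.6
  53.1  17.7
  ;
run;

data ch6fig05h;
  set ch6fig05 newobs;   /* 21 original cases followed by the 2 new ones */
run;
```

Because y is missing for the appended cases, they should not affect the fit, but proc reg still computes predicted values for them. The variable labels from ch6fig05 carry over, so the output may show a Label column that the listing below does not have.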
Getting the predicted values and the CIs for E[Yh] and Yh(new), p. 249-251. Lower and Upper CLMean are the limits for E[Yh], and Lower and Upper CL are the limits for Yh(new).
proc reg data = ch6fig05h  ;
  model y = x1 x2 / r cli clm;
  ods output OutputStatistics=temp;
run;
quit;
The REG Procedure
Model: MODEL1
Dependent Variable: y

                             Analysis of Variance

                                    Sum of           Mean
Source                   DF        Squares         Square    F Value    Pr > F

Model                     2          24015          12008      99.10    <.0001
Error                    18     2180.92741      121.16263
Corrected Total          20          26196

Root MSE             11.00739    R-Square     0.9167
Dependent Mean      181.90476    Adj R-Sq     0.9075
Coeff Var             6.05118

                        Parameter Estimates

                     Parameter       Standard
Variable     DF       Estimate          Error    t Value    Pr > |t|

Intercept     1      -68.85707       60.01695      -1.15      0.2663
x1            1        1.45456        0.21178       6.87      <.0001
x2            1        9.36550        4.06396       2.30      0.0333

The REG Procedure
Model: MODEL1
Dependent Variable: y

                                     Output Statistics

           Dep Var Predicted    Std Error
     Obs         y     Value Mean Predict     95% CL Mean        95% CL Predict    Residual

       1  174.4000  187.1841       3.8409  179.1146  195.2536  162.6910  211.6772  -12.7841
       2  164.4000  154.2294       3.5558  146.7591  161.6998  129.9271  178.5317   10.1706
       3  244.2000  234.3963       4.5882  224.7569  244.0358  209.3421  259.4506    9.8037
       4  154.6000  153.3285       3.2331  146.5361  160.1210  129.2260  177.4311    1.2715
       5  181.6000  161.3849       4.4300  152.0778  170.6921  136.4566  186.3132   20.2151
       6  207.5000  197.7414       4.3786  188.5424  206.9404  172.8533  222.6295    9.7586
       7  152.8000  152.0551       4.1696  143.2952  160.8150  127.3259  176.7843    0.7449
       8  163.2000  167.8666       3.3310  160.8684  174.8649  143.7053  192.0280   -4.6666
       9  145.4000  157.7382       2.9628  151.5136  163.9628  133.7895  181.6869  -12.3382
      10  137.2000  136.8460       4.0074  128.4268  145.2653  112.2354  161.4566    0.3540
      11  241.9000  230.3874       4.2012  221.5610  239.2137  205.6346  255.1402   11.5126
      12  191.1000  197.1849       3.4109  190.0188  204.3510  172.9744  221.3954   -6.0849
      13  232.0000  222.6857       5.3808  211.3810  233.9904  196.9448  248.4266    9.3143
      14  145.3000  141.5184       4.1735  132.7502  150.2866  116.7863  166.2506    3.7816
      15  161.1000  174.2132       5.0377  163.6294  184.7971  148.7807  199.6458  -13.1132
      16  209.7000  228.1239       4.1214  219.4652  236.7826  203.4304  252.8174  -18.4239
      17  146.4000  145.7470       3.7331  137.9041  153.5899  121.3276  170.1664    0.6530
      18  144.0000  159.0013       3.2529  152.1672  165.8354  134.8870  183.1157  -15.0013
      19  232.6000  230.9870       4.4176  221.7059  240.2681  206.0684  255.9056    1.6130
      20  224.1000  230.3161       5.8120  218.1054  242.5267  204.1647  256.4675   -6.2161
      21  166.5000  157.0644       4.0792  148.4944  165.6344  132.4018  181.7270    9.4356
      22         .  191.1039       2.7668  185.2911  196.9168  167.2589  214.9490         .
      23         .  174.1494       4.5986  164.4881  183.8107  149.0867  199.2121         .

                      Output Statistics

         Std Error     Student                         Cook's
     Obs  Residual    Residual      -2-1 0 1 2              D

       1    10.316      -1.239    |    **|      |       0.071
       2    10.417       0.976    |      |*     |       0.037
       3    10.006       0.980    |      |*     |       0.067
       4    10.522       0.121    |      |      |       0.000
       5    10.077       2.006    |      |****  |       0.259
       6    10.099       0.966    |      |*     |       0.059
       7    10.187      0.0731    |      |      |       0.000
       8    10.491      -0.445    |      |      |       0.007
       9    10.601      -1.164    |    **|      |       0.035
      10    10.252      0.0345    |      |      |       0.000
      11    10.174       1.132    |      |**    |       0.073
      12    10.466      -0.581    |     *|      |       0.012
      13     9.603       0.970    |      |*     |       0.098
      14    10.186       0.371    |      |      |       0.008
      15     9.787      -1.340    |    **|      |       0.159
      16    10.207      -1.805    |   ***|      |       0.177
      17    10.355      0.0631    |      |      |       0.000
      18    10.516      -1.427    |    **|      |       0.065
      19    10.082       0.160    |      |      |       0.002
      20     9.348      -0.665    |     *|      |       0.057
      21    10.224       0.923    |      |*     |       0.045
      22         .           .                           .
      23         .           .                           .


Sum of Residuals                           0
Sum of Squared Residuals          2180.92741
Predicted Residual SS (PRESS)     3002.92331
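The diagnostic columns can be traced back to quantities already printed. As an illustration for observation 5, the case with the largest Cook's D, the calculation below uses the standard formulas with p = 3 parameters and the leverage recovered from Std Error Mean Predict. The symbols h, r and D are our own notation here and are not taken from this page.

```latex
\begin{align*}
h_{55} &= \frac{s^2\{\hat{Y}_5\}}{MSE} = \frac{(4.4300)^2}{121.16263} \approx 0.1620 \\
s\{e_5\} &= \sqrt{MSE\,(1 - h_{55})} = \sqrt{121.16263 \times 0.8380} \approx 10.077 \\
r_5 &= \frac{e_5}{s\{e_5\}} = \frac{20.2151}{10.077} \approx 2.006 \\
D_5 &= \frac{r_5^2}{p} \cdot \frac{h_{55}}{1 - h_{55}} = \frac{(2.006)^2}{3} \times \frac{0.1620}{0.8380} \approx 0.259
\end{align*}
```

These match the printed Std Error Residual, Student Residual and Cook's D for observation 5.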
We use where Observation >= 22 to show just the last two observations.
proc print data = temp;
  where Observation >= 22;
run;
                                                              StdErr
                                                Predicted       Mean      Lower      Upper
Obs  Model   Dependent  Observation     DepVar    Value      Predict     CLMean     CLMean

 22  MODEL1      y             22            .   191.1039     2.7668   185.2911   196.9168
 23  MODEL1      y             23            .   174.1494     4.5986   164.4881   183.8107

                                             StdErr     Student
Obs    LowerCL      UpperCL     Residual    Residual    Residual    Picture      CooksD

 22   167.2589     214.9490            .          .           .                    .
 23   149.0867     199.2121            .          .           .                    .
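The limits for observation 22 (x1 = 65.4, x2 = 17.6) can be verified by hand from the printed predicted value and StdErr Mean Predict. The check assumes the usual 95% t multiplier with 18 error degrees of freedom, t(.975; 18), which is approximately 2.1009.

```latex
\begin{align*}
\hat{Y}_h \pm t(.975;18)\, s\{\hat{Y}_h\}
  &= 191.1039 \pm 2.1009 \times 2.7668 \approx (185.29,\ 196.92) \\
s\{\text{pred}\} &= \sqrt{MSE + s^2\{\hat{Y}_h\}}
  = \sqrt{121.16263 + (2.7668)^2} \approx 11.350 \\
\hat{Y}_h \pm t(.975;18)\, s\{\text{pred}\}
  &= 191.1039 \pm 2.1009 \times 11.350 \approx (167.26,\ 214.95)
\end{align*}
```

The first interval reproduces LowerCLMean and UpperCLMean, and the second reproduces LowerCL and UpperCL.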
