UCLA Academic Technology Services

SAS Textbook Examples
Computer-Aided Multivariate Analysis, Fourth Edition, by Afifi, Clark and May
Chapter 7: Multiple regression and correlation

Page 126 Regression from chapter 6.
data lung;
set "c:\cama4\lung";
ffev1a = ffev1/100;
run;
proc reg data = lung;
model ffev1a = fheight;
run;
quit;
<some output omitted>
                        Parameter Estimates

                     Parameter       Standard
Variable     DF       Estimate          Error    t Value    Pr > |t|

Intercept     1       -4.08670        1.15198      -3.55      0.0005
FHEIGHT       1        0.11811        0.01662       7.11      <.0001
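NOTE:  As a rough check on how the table is read, the t value is the estimate divided by its standard error. The data step below is only a sketch: the numbers are copied from the table above, the variable names are made up for illustration, and the 148 error degrees of freedom assume 150 fathers and two estimated parameters.

```sas
data _null_;
  estimate = 0.11811;    /* FHEIGHT slope copied from the table above */
  se       = 0.01662;    /* its standard error */
  df_error = 150 - 2;    /* assumption: N = 150, intercept plus one slope */
  t = estimate / se;
  p = 2 * (1 - probt(abs(t), df_error));   /* two-sided p-value */
  put t= 8.2 p= pvalue6.4;
run;
```

The t value written to the log should agree with the 7.11 shown for FHEIGHT, up to rounding of the inputs.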
Page 128 Descriptive statistics at the bottom of the page.
proc means data = lung;
var fage fheight ffev1a;
run;
The MEANS Procedure

Variable      N            Mean         Std Dev         Minimum         Maximum
-------------------------------------------------------------------------------
FAGE        150      40.1333333       6.8899953      26.0000000      59.0000000
FHEIGHT     150      69.2600000       2.7791892      61.0000000      76.0000000
ffev1a      150       4.0932667       0.6507523       2.5000000       5.8500000
-------------------------------------------------------------------------------
Page 133 Covariance and correlation matrices.
Covariance:
proc corr data = lung cov noprob;
var fage fheight fweight;
run;
<some output omitted>
                 Covariance Matrix, DF = 149

                     FAGE           FHEIGHT           FWEIGHT

FAGE           47.4720358        -1.0751678        -3.6492170
FHEIGHT        -1.0751678         7.7238926        34.6954362
FWEIGHT        -3.6492170        34.6954362       573.7978076
Correlation (page 134):
    Pearson Correlation Coefficients, N = 150

                 FAGE       FHEIGHT       FWEIGHT

FAGE          1.00000      -0.05615      -0.02211

FHEIGHT      -0.05615       1.00000       0.52116

FWEIGHT      -0.02211       0.52116       1.00000
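NOTE:  The two matrices are linked: each correlation is the covariance divided by the square root of the product of the two variances. The sketch below redoes that arithmetic for FHEIGHT and FWEIGHT using the values printed in the covariance matrix; the variable names are illustrative only.

```sas
data _null_;
  cov_hw = 34.6954362;    /* covariance of FHEIGHT and FWEIGHT */
  var_h  = 7.7238926;     /* variance of FHEIGHT */
  var_w  = 573.7978076;   /* variance of FWEIGHT */
  r = cov_hw / sqrt(var_h * var_w);
  put r= 8.5;
run;
```

The result should match the 0.52116 shown in the correlation matrix to the printed precision.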
Page 138 Table 7.1  ANOVA example from the lung function data (fathers).
proc reg data = lung;
model ffev1a = fheight fage;
run;
quit;
<some output omitted>
                             Analysis of Variance

                                    Sum of           Mean
Source                   DF        Squares         Square    F Value    Pr > F

Model                     2       21.05697       10.52848      36.81    <.0001
Error                   147       42.04133        0.28600
Corrected Total         149       63.09830
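NOTE:  The F value and R-square can be recovered from the entries of this table. The following is a minimal sketch using the printed sums of squares and mean squares; the variable names are hypothetical and the inputs are rounded, so small discrepancies are expected.

```sas
data _null_;
  ss_model = 21.05697;
  ss_total = 63.09830;
  ms_model = 10.52848;
  ms_error = 0.28600;
  r_square = ss_model / ss_total;      /* share of variation explained */
  f_value  = ms_model / ms_error;      /* overall F statistic */
  p = 1 - probf(f_value, 2, 147);      /* model and error df from the table */
  put r_square= 8.4 f_value= 8.2 p= pvalue6.4;
run;
```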
Page 140 The t-test at the top of the page
NOTE:  This is given as part of the output for the proc reg above.
                        Parameter Estimates

                     Parameter       Standard
Variable     DF       Estimate          Error    t Value    Pr > |t|

Intercept     1       -2.76075        1.13775      -2.43      0.0165
FHEIGHT       1        0.11440        0.01579       7.25      <.0001
FAGE          1       -0.02664        0.00637      -4.18      <.0001
Page 150 Table 7.5  Statistical output for the lung function data for males and females.
NOTE:  To do the top part of the table, you need to reshape the data from wide to long.  Please see our FAQ on reshaping data from wide to long using a data step for a further explanation of this code.
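data long;
set lung;
/* rescale the mother's FEV1 the same way ffev1a was rescaled */
mfev1a = mfev1/100;
/* in each array, element 1 is the father's variable and element 2 the mother's */
array asex(2) fsex msex;
array aage(2) fage mage;
array aheight(2) fheight mheight;
array afev1(2) ffev1a mfev1a;

/* write one observation per parent, so each family contributes two rows */
do parent = 1 to 2;
sex = asex(parent);
age = aage(parent);
height = aheight(parent);
fev1 = afev1(parent);
output;
end;
keep id sex age height fev1;
run;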
proc means data = long mean std;
var age height fev1;
run;
The MEANS Procedure

Variable            Mean         Std Dev
----------------------------------------
age           38.8466667       6.9124837
height        66.6766667       3.6856572
fev1           3.5332000       0.8025856
----------------------------------------
proc reg data = long;
model fev1 = age height / stb;
run;
quit;
The REG Procedure
Model: MODEL1
Dependent Variable: fev1

<some output omitted>


Root MSE              0.52751    R-Square     0.5709
Dependent Mean        3.53320    Adj R-Sq     0.5680
Coeff Var            14.93008


                                Parameter Estimates

                     Parameter       Standard                           Standardized
Variable     DF       Estimate          Error    t Value    Pr > |t|        Estimate

Intercept     1       -6.73699        0.56329     -11.96      <.0001               0
age           1       -0.01860        0.00444      -4.19      <.0001        -0.16018
height        1        0.16486        0.00833      19.79      <.0001         0.75710
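NOTE:  The standardized estimates requested by the stb option are the raw coefficients rescaled by the ratio of the predictor's standard deviation to the outcome's standard deviation. The sketch below repeats that calculation from the printed values; the variable names are illustrative, and because the inputs are rounded the results should agree with the table only approximately.

```sas
data _null_;
  sd_fev1   = 0.8025856;   /* Std Dev of fev1 from proc means */
  sd_age    = 6.9124837;
  sd_height = 3.6856572;
  b_age     = -0.01860;    /* raw coefficients from proc reg */
  b_height  = 0.16486;
  beta_age    = b_age    * sd_age    / sd_fev1;
  beta_height = b_height * sd_height / sd_fev1;
  put beta_age= 8.4 beta_height= 8.4;
run;
```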
The second and third panels of the table can be obtained using the by statement in proc means and proc reg.  First, we sort the data by sex and save the sorted data set, which we call longsort.  We then use proc format to create value labels for sex so that the output is easier to read.  The format is also named sex; you can tell the format from the variable because a format name always ends in a period (sex.) wherever it is used.
proc sort data = long out=longsort;
by sex;
run;

proc format;
value sex 1 = "male"
          2 = "female";
run;

proc means data = longsort mean std;
by sex;
format sex sex.;
var age height fev1;
run;
sex=male

The MEANS Procedure

Variable            Mean         Std Dev
----------------------------------------
age           40.1333333       6.8899953
height        69.2600000       2.7791892
fev1           4.0932667       0.6507523
----------------------------------------

sex=female

Variable            Mean         Std Dev
----------------------------------------
age           37.5600000       6.7141841
height        64.0933333       2.4695370
fev1           2.9731333       0.4874136
----------------------------------------
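NOTE:  If only the descriptive statistics are needed, a class statement is one possible alternative to sorting, since proc means does not require sorted data for class variables. This is a sketch, not the approach used on this page; it assumes the sex format defined above is still available, and the output is laid out as a single table rather than one panel per by group.

```sas
proc means data = long mean std;
  class sex;           /* groups the statistics without a prior proc sort */
  format sex sex.;
  var age height fev1;
run;
```

The by statement is still needed for proc reg, which is why the sorted data set is created above.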
proc reg data = longsort;
by sex;
format sex sex.;
model fev1 = age height / stb;
run;
quit;
sex=male

The REG Procedure
Model: MODEL1
Dependent Variable: fev1

<some output omitted>


Root MSE              0.53479    R-Square     0.3337
Dependent Mean        4.09327    Adj R-Sq     0.3247
Coeff Var            13.06500

                                Parameter Estimates

                     Parameter       Standard                           Standardized
Variable     DF       Estimate          Error    t Value    Pr > |t|        Estimate

Intercept     1       -2.76075        1.13775      -2.43      0.0165               0
age           1       -0.02664        0.00637      -4.18      <.0001        -0.28205
height        1        0.11440        0.01579       7.25      <.0001         0.48856

sex=female

The REG Procedure
Model: MODEL1
Dependent Variable: fev1

<some output omitted>


Root MSE              0.41305    R-Square     0.2915
Dependent Mean        2.97313    Adj R-Sq     0.2819
Coeff Var            13.89275

                                Parameter Estimates

                     Parameter       Standard                           Standardized
Variable     DF       Estimate          Error    t Value    Pr > |t|        Estimate

Intercept     1       -2.21116        0.89607      -2.47      0.0147               0
age           1       -0.01998        0.00504      -3.96      0.0001        -0.27516
height        1        0.09259        0.01370       6.76      <.0001         0.46913

Page 152 middle of the page

NOTE:  The t-test for the fh coefficient (the female-by-height interaction) is the relevant statistic.  Its sign is the opposite of that shown in the text because the order of subtraction was reversed.  The values also differ slightly because of rounding.

data long1;
set long;
female = sex - 1;
fh = female*height;
fa = female*age;
run;
proc reg data = long1;
model fev1 = female age height fh;
run;
quit;
The REG Procedure
Model: MODEL1
Dependent Variable: fev1

                             Analysis of Variance

                                    Sum of           Mean
Source                   DF        Squares         Square    F Value    Pr > F

Model                     4      125.32516       31.33129     137.39    <.0001
Error                   295       67.27377        0.22805
Corrected Total         299      192.59893


Root MSE              0.47754    R-Square     0.6507
Dependent Mean        3.53320    Adj R-Sq     0.6460
Coeff Var            13.51586


                        Parameter Estimates

                     Parameter       Standard
Variable     DF       Estimate          Error    t Value    Pr > |t|

Intercept     1       -2.92255        0.99654      -2.93      0.0036
female        1        0.82968        1.41007       0.59      0.5567
age           1       -0.02339        0.00407      -5.75      <.0001
height        1        0.11485        0.01409       8.15      <.0001
fh            1       -0.02210        0.02121      -1.04      0.2981
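NOTE:  One way to see why the sign flips is to code the indicator the other way around. The sketch below is an assumption about how the opposite order of subtraction could be set up, not a reproduction of the textbook's calculation; long2, male and mh are hypothetical names. Because mh equals height minus fh, the coefficient on mh is the negative of the fh coefficient, and the t value keeps its magnitude but changes sign.

```sas
data long2;
  set long;
  male = 2 - sex;        /* 1 = male, 0 = female (sex is coded 1 and 2) */
  mh = male*height;      /* male-by-height interaction */
run;

proc reg data = long2;
  model fev1 = male age height mh;
run;
quit;
```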

Page 153 middle of the page

NOTE:  The F test is in the last table (Test 1 Results), on the line labeled "Numerator".

proc reg data = long1;
model fev1 = female age height fa fh;
test female, fa, fh;
run;
quit;
The REG Procedure
Model: MODEL1
Dependent Variable: fev1

                             Analysis of Variance

                                    Sum of           Mean
Source                   DF        Squares         Square    F Value    Pr > F

Model                     5      125.47790       25.09558     109.92    <.0001
Error                   294       67.12103        0.22830
Corrected Total         299      192.59893


Root MSE              0.47781    R-Square     0.6515
Dependent Mean        3.53320    Adj R-Sq     0.6456
Coeff Var            13.52345


                        Parameter Estimates

                     Parameter       Standard
Variable     DF       Estimate          Error    t Value    Pr > |t|

Intercept     1       -2.76075        1.01653      -2.72      0.0070
female        1        0.54959        1.45182       0.38      0.7053
age           1       -0.02664        0.00569      -4.68      <.0001
height        1        0.11440        0.01411       8.11      <.0001
fa            1        0.00666        0.00815       0.82      0.4141
fh            1       -0.02180        0.02122      -1.03      0.3050

The REG Procedure
Model: MODEL1

       Test 1 Results for Dependent Variable fev1

                                Mean
Source             DF         Square    F Value    Pr > F

Numerator           3        5.17471      22.67    <.0001
Denominator       294        0.22830
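NOTE:  The test statement compares this model with the pooled model that has only age and height. As a sketch of where the numbers come from, the data step below rebuilds the F statistic from the two error sums of squares. The pooled error sum of squares is not printed on this page, so it is reconstructed here from the pooled Root MSE (0.52751) and an assumed 297 error degrees of freedom (300 parents, three parameters); the variable names are illustrative.

```sas
data _null_;
  /* reduced model: fev1 = age height, fitted to all parents */
  root_mse_reduced = 0.52751;
  df_reduced  = 300 - 3;                       /* assumption: N = 300 */
  sse_reduced = df_reduced * root_mse_reduced**2;

  /* full model: adds female, fa and fh */
  sse_full = 67.12103;
  df_full  = 294;

  ms_num  = (sse_reduced - sse_full) / (df_reduced - df_full);
  f_value = ms_num / (sse_full / df_full);
  p = 1 - probf(f_value, df_reduced - df_full, df_full);
  put ms_num= 8.4 f_value= 8.2 p= pvalue6.4;
run;
```

Because the inputs are rounded, the numerator mean square and F value should match the Test 1 Results table only approximately.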

These pages are Copyrighted (c) by UCLA Academic Technology Services

