
SAS Textbook Examples
Computer-Aided Multivariate Analysis by Afifi and Clark
Chapter 7: Multiple regression and correlation

Page 125 Regression from chapter 6.
data lung;
set "c:\cama3\lung";
ffev1a = ffev1/100;
run;
proc reg data = lung;
model ffev1a = fheight;
run;
quit;
<some output omitted>
                        Parameter Estimates

                     Parameter       Standard
Variable     DF       Estimate          Error    t Value    Pr > |t|

Intercept     1       -4.08670        1.15198      -3.55      0.0005
FHEIGHT       1        0.11811        0.01662       7.11      <.0001
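As a small illustration of how the fitted line is used, the sketch below plugs a height into the equation implied by the estimates above. The height of 70 and the variable names height and pred_fev1 are assumptions chosen for illustration; they do not come from the text.

```sas
/* Sketch: predicted ffev1a from the simple regression above.      */
/* The height value is a hypothetical input, not from the book.    */
data _null_;
  height = 70;
  pred_fev1 = -4.08670 + 0.11811*height;  /* intercept + slope*height */
  put pred_fev1= 8.4;                     /* written to the SAS log   */
run;
```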
Page 127 Descriptive statistics at the bottom of the page.
proc means data = lung;
var fage fheight ffev1a;
run;
The MEANS Procedure

Variable      N            Mean         Std Dev         Minimum         Maximum
-------------------------------------------------------------------------------
FAGE        150      40.1333333       6.8899953      26.0000000      59.0000000
FHEIGHT     150      69.2600000       2.7791892      61.0000000      76.0000000
ffev1a      150       4.0932667       0.6507523       2.5000000       5.8500000
-------------------------------------------------------------------------------
Page 133 Covariance and correlation matrices.
Covariance:
proc corr data = lung cov noprob;
var fage fheight fweight;
run;
<some output omitted>
                 Covariance Matrix, DF = 149

                     FAGE           FHEIGHT           FWEIGHT

FAGE           47.4720358        -1.0751678        -3.6492170
FHEIGHT        -1.0751678         7.7238926        34.6954362
FWEIGHT        -3.6492170        34.6954362       573.7978076
Correlation:
    Pearson Correlation Coefficients, N = 150

                 FAGE       FHEIGHT       FWEIGHT

FAGE          1.00000      -0.05615      -0.02211

FHEIGHT      -0.05615       1.00000       0.52116

FWEIGHT      -0.02211       0.52116       1.00000
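The two matrices are linked by the usual relation r = cov(x, y) / sqrt(var(x) * var(y)). As a minimal sketch, the data step below recomputes the FHEIGHT-FWEIGHT correlation from the covariance matrix; the variable names are illustrative only, and the result should agree with the 0.52116 shown above up to rounding.

```sas
/* Sketch: recover a correlation from the covariance matrix above. */
data _null_;
  cov_hw = 34.6954362;    /* covariance of FHEIGHT and FWEIGHT */
  var_h  = 7.7238926;     /* variance of FHEIGHT               */
  var_w  = 573.7978076;   /* variance of FWEIGHT               */
  r = cov_hw / sqrt(var_h * var_w);
  put r= 8.5;             /* written to the SAS log            */
run;
```

The square root of var_h is also the standard deviation that proc means reports for FHEIGHT.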
Page 138 Table 7.2  ANOVA example from the lung function data (fathers).
proc reg data = lung;
model ffev1a = fheight fage;
run;
quit;
<some output omitted>
                             Analysis of Variance

                                    Sum of           Mean
Source                   DF        Squares         Square    F Value    Pr > F

Model                     2       21.05697       10.52848      36.81    <.0001
Error                   147       42.04133        0.28600
Corrected Total         149       63.09830
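The entries of the ANOVA table are related by simple arithmetic: each mean square is a sum of squares divided by its degrees of freedom, F is the ratio of the two mean squares, and R-square is the model sum of squares over the corrected total. The following sketch recomputes these quantities from the table above; the variable names are illustrative, and small differences from the printed values can arise from rounding.

```sas
/* Sketch: recompute F, R-square and root MSE from the ANOVA table. */
data _null_;
  ss_model = 21.05697;  df_model = 2;
  ss_error = 42.04133;  df_error = 147;
  ss_total = 63.09830;
  ms_model = ss_model / df_model;
  ms_error = ss_error / df_error;
  f        = ms_model / ms_error;
  rsq      = ss_model / ss_total;
  root_mse = sqrt(ms_error);
  p        = 1 - probf(f, df_model, df_error);  /* upper-tail F probability */
  put ms_model= ms_error= f= rsq= root_mse= p=;
run;
```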
Page 140 The t-test at the top of the page.
NOTE:  This is part of the output from the proc reg above.
                        Parameter Estimates

                     Parameter       Standard
Variable     DF       Estimate          Error    t Value    Pr > |t|

Intercept     1       -2.76075        1.13775      -2.43      0.0165
FHEIGHT       1        0.11440        0.01579       7.25      <.0001
FAGE          1       -0.02664        0.00637      -4.18      <.0001
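Each t value in this table is the parameter estimate divided by its standard error, with 147 error degrees of freedom (150 observations minus 3 estimated parameters). The sketch below recomputes the t values and two-sided p-values from the rounded figures above, so the results may differ from the printed ones in the last digit; the variable names are illustrative.

```sas
/* Sketch: t = estimate / standard error, with a two-sided p-value. */
data _null_;
  length variable $ 12;
  input variable $ estimate stderr;
  t = estimate / stderr;
  p = 2 * (1 - probt(abs(t), 147));   /* 147 = error degrees of freedom */
  put variable= t= 8.2 p= 8.4;
  datalines;
Intercept -2.76075 1.13775
FHEIGHT    0.11440 0.01579
FAGE      -0.02664 0.00637
;
run;
```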
Page 150 Table 7.5  Statistical output for the lung function data for males and females.
NOTE:  The top part of the table requires reshaping the data from wide to long, so that each parent becomes a separate observation.  Please see our FAQ on reshaping data from wide to long using a data step for a further explanation of this code.
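data long;
  set lung;
  mfev1a = mfev1/100;   /* rescale mother's FEV1 the same way as ffev1a */

  /* pair each father variable with the matching mother variable */
  array asex(2) fsex msex;
  array aage(2) fage mage;
  array aheight(2) fheight mheight;
  array afev1(2) ffev1a mfev1a;

  /* write one observation per parent: 1 = father, 2 = mother */
  do parent = 1 to 2;
    sex = asex(parent);
    age = aage(parent);
    height = aheight(parent);
    fev1 = afev1(parent);
    output;
  end;
  keep id sex age height fev1;
run;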
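After a reshape like this, it can help to look at the result before analyzing it. The following is a minimal sketch of such a check, not part of the textbook example: it prints the first few rows of long and tabulates sex. Each id should appear on two consecutive rows, once per parent.

```sas
/* Sketch: quick checks on the reshaped data set. */
proc print data = long (obs = 6);
run;

proc freq data = long;
  tables sex;   /* counts of observations for each value of sex */
run;
```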
proc means data = long mean std;
var age height fev1;
run;
The MEANS Procedure

Variable            Mean         Std Dev
----------------------------------------
age           38.8466667       6.9124837
height        66.6766667       3.6856572
fev1           3.5332000       0.8025856
----------------------------------------
proc reg data = long;
model fev1 = age height / stb;
run;
quit;
The REG Procedure
Model: MODEL1
Dependent Variable: fev1

<some output omitted>


Root MSE              0.52751    R-Square     0.5709
Dependent Mean        3.53320    Adj R-Sq     0.5680
Coeff Var            14.93008


                                Parameter Estimates

                     Parameter       Standard                           Standardized
Variable     DF       Estimate          Error    t Value    Pr > |t|        Estimate

Intercept     1       -6.73699        0.56329     -11.96      <.0001               0
age           1       -0.01860        0.00444      -4.19      <.0001        -0.16018
height        1        0.16486        0.00833      19.79      <.0001         0.75710
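The stb option adds the standardized estimates, which equal each raw coefficient multiplied by the standard deviation of its predictor and divided by the standard deviation of the response. As a rough check under that definition, the sketch below combines the estimates above with the standard deviations from the proc means output for long; the variable names are illustrative, and the results should match the Standardized Estimate column up to rounding.

```sas
/* Sketch: standardized estimate = b * sd(x) / sd(y). */
data _null_;
  sd_age    = 6.9124837;
  sd_height = 3.6856572;
  sd_fev1   = 0.8025856;
  stb_age    = -0.01860 * sd_age    / sd_fev1;
  stb_height =  0.16486 * sd_height / sd_fev1;
  put stb_age= 8.4 stb_height= 8.4;
run;
```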
The second and third panels of the table can be obtained with the by statement in proc means and proc reg.  First, we sort the data by sex and save the sorted data set (which we called longsort), because the by statement expects the data to be sorted by that variable.  We then use proc format to create value labels for sex so that the output is easier to read.  The format is also named sex; you can tell the format from the variable because a format name always ends in a period (sex.) when it is used.
proc sort data = long out=longsort;
by sex;
run;

proc format;
value sex 1 = "male"
          2 = "female";
run;

proc means data = longsort mean std;
by sex;
format sex sex.;
var age height fev1;
run;
sex=male

The MEANS Procedure

Variable            Mean         Std Dev
----------------------------------------
age           40.1333333       6.8899953
height        69.2600000       2.7791892
fev1           4.0932667       0.6507523
----------------------------------------

sex=female

Variable            Mean         Std Dev
----------------------------------------
age           37.5600000       6.7141841
height        64.0933333       2.4695370
fev1           2.9731333       0.4874136
----------------------------------------
proc reg data = longsort;
by sex;
format sex sex.;
model fev1 = age height / stb;
run;
quit;
sex=male

The REG Procedure
Model: MODEL1
Dependent Variable: fev1

<some output omitted>


Root MSE              0.53479    R-Square     0.3337
Dependent Mean        4.09327    Adj R-Sq     0.3247
Coeff Var            13.06500

                                Parameter Estimates

                     Parameter       Standard                           Standardized
Variable     DF       Estimate          Error    t Value    Pr > |t|        Estimate

Intercept     1       -2.76075        1.13775      -2.43      0.0165               0
age           1       -0.02664        0.00637      -4.18      <.0001        -0.28205
height        1        0.11440        0.01579       7.25      <.0001         0.48856

sex=female

The REG Procedure
Model: MODEL1
Dependent Variable: fev1

<some output omitted>


Root MSE              0.41305    R-Square     0.2915
Dependent Mean        2.97313    Adj R-Sq     0.2819
Coeff Var            13.89275

                                Parameter Estimates

                     Parameter       Standard                           Standardized
Variable     DF       Estimate          Error    t Value    Pr > |t|        Estimate

Intercept     1       -2.21116        0.89607      -2.47      0.0147               0
age           1       -0.01998        0.00504      -3.96      0.0001        -0.27516
height        1        0.09259        0.01370       6.76      <.0001         0.46913
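If only one of the groups is needed, a where statement is an alternative to by-group processing that does not require sorting first. The sketch below is an assumption about how one might do this rather than the approach used on this page: it fits the model for males only, relying on the coding 1 = male from the value labels defined earlier.

```sas
/* Sketch: fit the model for one group using where instead of by. */
proc reg data = long;
  where sex = 1;   /* 1 = male under the value labels defined above */
  model fev1 = age height / stb;
run;
quit;
```

The estimates should match the sex=male panel above, since the same observations are used.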

These pages are Copyrighted (c) by UCLA Academic Technology Services

