### Stata Textbook Examples: Practical Multivariate Analysis, Fifth Edition, by Afifi, May and Clark. Chapter 7: Multiple regression and correlation

Page 120. Regression from chapter 6.

use http://www.ats.ucla.edu/stat/stata/examples/pma5/lung, clear

generate ffev1a = ffev1/100
regress ffev1a fheight

Source |       SS       df       MS              Number of obs =     150
-------------+------------------------------           F(  1,   148) =   50.50
Model |  16.0531702     1  16.0531702           Prob > F      =  0.0000
Residual |  47.0451258   148  .317872472           R-squared     =  0.2544
Total |   63.098296   149  .423478497           Root MSE      =   .5638

------------------------------------------------------------------------------
ffev1a |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
fheight |   .1181052   .0166194     7.11   0.000     .0852633    .1509472
_cons |  -4.086702   1.151979    -3.55   0.001    -6.363155    -1.81025
------------------------------------------------------------------------------

Page 122. Descriptive statistics at the bottom of the page.

summarize fage fheight ffev1a

Variable |     Obs        Mean   Std. Dev.       Min        Max
-------------+-----------------------------------------------------
fage |     150    40.13333   6.889995         26         59
fheight |     150       69.26   2.779189         61         76
ffev1a |     150    4.093267   .6507523        2.5       5.85

Page 127. Covariance and correlation matrices.

Covariance:

correlate fage fheight fweight ffev1a, covariance
(obs=150)

|     fage  fheight  fweight   ffev1a
-------------+------------------------------------
fage |   47.472
fheight | -1.07517  7.72389
fweight | -3.64922  34.6954  573.798
ffev1a | -1.38762  .912232  2.06716  .423478

Correlation (page 127):

correlate fage fheight fweight ffev1a
(obs=150)

|     fage  fheight  fweight   ffev1a
-------------+------------------------------------
fage |   1.0000
fheight |  -0.0561   1.0000
fweight |  -0.0221   0.5212   1.0000
ffev1a |  -0.3095   0.5044   0.1326   1.0000
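
The two matrices are linked by r = cov(x, y) / sqrt(var(x) * var(y)). As a minimal check, using only the fheight and ffev1a entries printed in the covariance matrix above, the correlation can be recomputed by hand. The scalar names are illustrative and are not part of the original example.

```stata
* Recompute corr(fheight, ffev1a) from the covariance matrix entries above
* cov(fheight, ffev1a), var(fheight) and var(ffev1a), copied from the output
scalar cov_hf = .912232
scalar var_h  = 7.72389
scalar var_f  = .423478

* Correlation = covariance divided by the product of the standard deviations
scalar r_hf = cov_hf / sqrt(var_h * var_f)
display "r = " %6.4f r_hf
```

The result should agree with the 0.5044 entry in the correlation matrix, up to rounding of the printed covariances.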

Table 7.2, page 132.

regress ffev1a fheight fage

Source |       SS       df       MS              Number of obs =     150
-------------+------------------------------           F(  2,   147) =   36.81
Model |   21.056968     2   10.528484           Prob > F      =  0.0000
Residual |   42.041328   147  .285995429           R-squared     =  0.3337
Total |   63.098296   149  .423478497           Root MSE      =  .53479

------------------------------------------------------------------------------
ffev1a |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
fheight |    .114397    .015789     7.25   0.000     .0831943    .1455997
fage |  -.0266393   .0063687    -4.18   0.000    -.0392254   -.0140532
_cons |  -2.760746   1.137746    -2.43   0.016    -5.009197   -.5122958
------------------------------------------------------------------------------

Page 134. The t-test at the top of the page.
NOTE: This is given in the output above.
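
Each t statistic in the regression output is the coefficient divided by its standard error. As a sketch, taking the fage row of Table 7.2 as the example (an assumption about which test the book shows, since the page itself is not reproduced here), the value can be recomputed from the printed numbers. The scalar names are illustrative.

```stata
* t statistic for fage in Table 7.2: coefficient divided by its standard error
scalar b_age  = -.0266393
scalar se_age = .0063687
scalar t_age  = b_age / se_age

* Two-sided p-value on the 147 residual degrees of freedom
scalar p_age = 2 * ttail(147, abs(t_age))
display "t = " %5.2f t_age "   p = " %6.4f p_age
```

The t value should match the -4.18 shown for fage in the output above.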

Table 7.5, page 145.
NOTE: We need to reshape the data from wide to long to get the first panel of the table, and we use the Stata command reshape to do this. The @ symbol marks where the prefix sits in each variable name, so it acts as a "wild card" that collects, for example, all of the age variables regardless of the prefix (in this case, "f" and "m"). Before we reshape the data, however, we need to drop the variables for the children so that they will not be picked up by the "wild card". We use the string option because the "j" variable, momdad, holds the string prefixes; the numeric gender variable is generated from it afterwards.

drop oc* mc* yc*
reshape long @age @height  @fev1, i(id) j(momdad) string
generate gender = 2 if momdad == "m"
replace gender = 1 if momdad == "f"
label define gend 1 "male" 2 "female"
label values gender gend
generate fev1a = fev1/100
tabstat age height fev1a, statistics(mean sd)

stats |       age    height     fev1a
---------+------------------------------
mean |  38.84667  66.67667    3.5332
sd |  6.912484  3.685657  .8025855
----------------------------------------
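
The reshape step above can be sketched on a toy dataset to show what the @ placeholder and the string option do. The data values below are made up for illustration; only the variable naming pattern follows the lung data. Note that clear discards whatever is in memory, so this is best run in a separate Stata session.

```stata
* Toy wide data: one row per family, "f" and "m" prefixes as in the lung data
* (values are hypothetical; clear discards any data currently in memory)
clear
input id fage mage fheight mheight
1 40 38 70 64
2 35 33 68 62
end

* @ marks where the f/m prefix sits in each name; string keeps j as text
reshape long @age @height, i(id) j(momdad) string

* Result: two rows per id, with momdad equal to "f" or "m"
list, sepby(id)
```

After the reshape there should be four observations, with age and height as single columns and momdad identifying the parent.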

regress fev1a age height

Source |       SS       df       MS              Number of obs =     300
-------------+------------------------------           F(  2,   297) =  197.57
Model |  109.953774     2   54.976887           Prob > F      =  0.0000
Residual |  82.6451491   297  .278266495           R-squared     =  0.5709
Total |  192.598923   299  .644143556           Root MSE      =  .52751
------------------------------------------------------------------------------
fev1a |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
age |  -.0185978   .0044429    -4.19   0.000    -.0273413   -.0098542
height |    .164865   .0083327    19.79   0.000     .1484664    .1812635
_cons |  -6.736985   .5632885   -11.96   0.000    -7.845528   -5.628443
------------------------------------------------------------------------------

To obtain the second and third panels of the table, we need to sort the data by gender and then use the by prefix to do the descriptive statistics and regressions for each gender.

sort gender
by gender: tabstat age height fev1a, statistics(mean sd)

------------------------------------------------------------------------------------------------
-> gender = male
stats |       age    height     fev1a
---------+------------------------------
mean |  40.13333     69.26  4.093267
sd |  6.889995  2.779189  .6507523
----------------------------------------
------------------------------------------------------------------------------------------------
-> gender = female
stats |       age    height     fev1a
---------+------------------------------
mean |     37.56  64.09333  2.973133
sd |  6.714184  2.469537  .4874136
----------------------------------------
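
As an aside, the sort and by steps can be combined. The sketch below assumes the reshaped data, with gender and fev1a already created as above, are still in memory; it is an alternative way to get the same panels, not the commands used for the output shown.

```stata
* Assumes the reshaped data with gender and fev1a are already in memory
* bysort sorts the data and applies the by prefix in a single step
bysort gender: tabstat age height fev1a, statistics(mean sd)

* Alternative: one combined table using tabstat's own by() option
tabstat age height fev1a, by(gender) statistics(mean sd)
```

The by() version prints both genders and the overall total in one table, which may be easier to compare against Table 7.5.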

by gender: regress fev1a age height, beta

------------------------------------------------------------------------------------------------
-> gender = male
Source |       SS       df       MS              Number of obs =     150
-------------+------------------------------           F(  2,   147) =   36.81
Model |   21.056968     2   10.528484           Prob > F      =  0.0000
Residual |   42.041328   147  .285995429           R-squared     =  0.3337
Total |   63.098296   149  .423478497           Root MSE      =  .53479
------------------------------------------------------------------------------
fev1a |      Coef.   Std. Err.      t    P>|t|                     Beta
-------------+----------------------------------------------------------------
age |  -.0266393   .0063687    -4.18   0.000                -.2820504
height |    .114397    .015789     7.25   0.000                 .4885592
_cons |  -2.760746   1.137746    -2.43   0.016                        .
------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------
-> gender = female
Source |       SS       df       MS              Number of obs =     150
-------------+------------------------------           F(  2,   147) =   30.24
Model |  10.3185252     2  5.15926259           Prob > F      =  0.0000
Residual |  25.0797019   147  .170610217           R-squared     =  0.2915
Total |  35.3982271   149  .237571994           Root MSE      =  .41305
------------------------------------------------------------------------------
fev1a |      Coef.   Std. Err.      t    P>|t|                     Beta
-------------+----------------------------------------------------------------
age |  -.0199755   .0050405    -3.96   0.000                -.2751644
height |   .0925926   .0137042     6.76   0.000                 .4691313
_cons |   -2.21116    .896067    -2.47   0.015                        .
------------------------------------------------------------------------------
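
The Beta column holds standardized coefficients, which equal the raw coefficient times sd(x) / sd(y). As a minimal check, the male panel values can be recomputed from the coefficients and the standard deviations reported by tabstat above. The scalar names are illustrative.

```stata
* Standardized coefficient = b * sd(x) / sd(y), using male panel values
* sd(fev1a) for males, from the tabstat output
scalar sd_fev1a = .6507523

* Coefficients from the male regression; sd(age) and sd(height) from tabstat
scalar beta_age    = -.0266393 * 6.889995 / sd_fev1a
scalar beta_height =  .114397  * 2.779189 / sd_fev1a
display "Beta(age) = " %8.6f beta_age "   Beta(height) = " %8.6f beta_height
```

Both values should agree with the Beta column for males (-.2820504 and .4885592) up to rounding of the inputs.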
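
Page 147. Middle of the page.
NOTE: The coefficients and standard errors for the height variable from the separate regressions above (.093 for females and .114 for males) are used in the calculation of the Z test. The regression below instead tests the difference in height slopes with a gender-by-height interaction.

regress fev1a age i.gender##c.height
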
Source |       SS       df       MS              Number of obs =     300
-------------+------------------------------           F(  4,   295) =  137.39
Model |  125.325155     4  31.3312887           Prob > F      =  0.0000
Residual |  67.2737685   295  .228046673           R-squared     =  0.6507
Total |  192.598923   299  .644143556           Root MSE      =  .47754

---------------------------------------------------------------------------------
fev1a |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
age |  -.0233887   .0040701    -5.75   0.000    -.0313988   -.0153786
2.gender |   .8296771   1.410069     0.59   0.557    -1.945392    3.604746
height |   .1148495   .0140881     8.15   0.000     .0871236    .1425754
|
gender#c.height |
2  |  -.0221023   .0212056    -1.04   0.298    -.0638357    .0196312
|
_cons |  -2.922545   .9965403    -2.93   0.004    -4.883774   -.9613152
---------------------------------------------------------------------------------
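
The Z test mentioned in the note for page 147 can be sketched from the separate male and female regressions. This assumes the usual large-sample form for comparing two independent slopes, Z = (b1 - b2) / sqrt(se1^2 + se2^2); the book's exact presentation may differ, and the scalar names are illustrative.

```stata
* Height slopes and standard errors from the by-gender regressions above
scalar b_m  = .114397
scalar se_m = .015789
scalar b_f  = .0925926
scalar se_f = .0137042

* Z = difference in slopes divided by the standard error of the difference
scalar z = (b_m - b_f) / sqrt(se_m^2 + se_f^2)
display "Z = " %5.2f z
```

The value comes out near 1.04, similar in size to the t statistic on the gender#c.height term in the interaction model above.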

Page 148. Middle of the page.

regress fev1a i.gender##c.height i.gender##c.age

Source |       SS       df       MS              Number of obs =     300
-------------+------------------------------           F(  5,   294) =  109.92
Model |  125.477893     5  25.0955786           Prob > F      =  0.0000
Residual |  67.1210299   294  .228302823           R-squared     =  0.6515
Total |  192.598923   299  .644143556           Root MSE      =  .47781

---------------------------------------------------------------------------------
fev1a |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
----------------+----------------------------------------------------------------
2.gender |   .5495863   1.451823     0.38   0.705    -2.307697     3.40687
height |    .114397   .0141069     8.11   0.000     .0866338    .1421602
|
gender#c.height |
2  |  -.0218044   .0212206    -1.03   0.305     -.063568    .0199592
|
age |  -.0266393   .0056902    -4.68   0.000    -.0378381   -.0154406
|
gender#c.age |
2  |   .0066639   .0081472     0.82   0.414    -.0093704    .0226981
|
_cons |  -2.760746   1.016532    -2.72   0.007    -4.761349   -.7601438
---------------------------------------------------------------------------------

test 2.gender 2.gender#c.height 2.gender#c.age
( 1)  2.gender = 0
( 2)  2.gender#c.height = 0
( 3)  2.gender#c.age = 0

F(  3,   294) =   22.67
Prob > F =    0.0000
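
The joint test above compares the fully interacted model with the pooled model regress fev1a age height from Table 7.5, so the F statistic can be rebuilt from the two residual sums of squares. This is a sketch of that comparison using the printed values; the scalar names are illustrative.

```stata
* Residual SS: pooled model (Table 7.5, first panel) vs. fully interacted model
scalar rss_pooled = 82.6451491
scalar rss_full   = 67.1210299

* Residual df of the full model and number of restrictions tested
scalar df_full = 294
scalar q       = 3

* F = (increase in residual SS per restriction) / (residual MS of full model)
scalar F_joint = ((rss_pooled - rss_full) / q) / (rss_full / df_full)
display "F(3, 294) = " %6.2f F_joint "   p = " %6.4f Ftail(q, df_full, F_joint)
```

The statistic should reproduce the F(3, 294) = 22.67 reported by test.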

