UCLA Academic Technology Services

Stata Textbook Examples
Computer-Aided Multivariate Analysis, Fourth Edition, by Afifi, Clark and May
Chapter 7: Multiple regression and correlation

Page 126. Regression from chapter 6.
use http://www.ats.ucla.edu/stat/stata/examples/cama4/lung, clear

generate ffev1a = ffev1/100
regress ffev1a fheight

      Source |       SS       df       MS              Number of obs =     150
-------------+------------------------------           F(  1,   148) =   50.50
       Model |  16.0531702     1  16.0531702           Prob > F      =  0.0000
    Residual |  47.0451258   148  .317872472           R-squared     =  0.2544
-------------+------------------------------           Adj R-squared =  0.2494
       Total |   63.098296   149  .423478497           Root MSE      =   .5638

------------------------------------------------------------------------------
      ffev1a |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     fheight |   .1181052   .0166194     7.11   0.000     .0852633    .1509472
       _cons |  -4.086702   1.151979    -3.55   0.001    -6.363155    -1.81025
------------------------------------------------------------------------------
Page 128. Descriptive statistics at the bottom of the page.
summarize fage fheight ffev1a

    Variable |     Obs        Mean   Std. Dev.       Min        Max
-------------+-----------------------------------------------------
        fage |     150    40.13333   6.889995         26         59
     fheight |     150       69.26   2.779189         61         76
      ffev1a |     150    4.093267   .6507523        2.5       5.85
Page 133. Covariance and correlation matrices.
Covariance:
correlate fage fheight fweight ffev1a, covariance
(obs=150)

             |     fage  fheight  fweight   ffev1a
-------------+------------------------------------
        fage |   47.472
     fheight | -1.07517  7.72389
     fweight | -3.64922  34.6954  573.798
      ffev1a | -1.38762  .912232  2.06716  .423478
Correlation (page 134):
correlate fage fheight fweight ffev1a
(obs=150)

             |     fage  fheight  fweight   ffev1a
-------------+------------------------------------
        fage |   1.0000
     fheight |  -0.0561   1.0000
     fweight |  -0.0221   0.5212   1.0000
      ffev1a |  -0.3095   0.5044   0.1326   1.0000
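
Each correlation is the corresponding covariance divided by the product of the two standard deviations. As a rough check, assuming the rounded values printed in the covariance matrix above, the fheight/ffev1a correlation can be recomputed by hand. The scalar name r_hf is ours, chosen only for illustration.

```stata
* Values copied from the covariance matrix above
scalar cov_hf = .912232    // cov(fheight, ffev1a)
scalar var_h  = 7.72389    // var(fheight)
scalar var_f  = .423478    // var(ffev1a)

* r = cov / (sd_x * sd_y)
scalar r_hf = cov_hf / sqrt(var_h * var_f)
display "r(fheight, ffev1a) = " %6.4f r_hf
```

The result should agree with the fheight/ffev1a entry of the correlation matrix up to rounding.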
Table 7.1, page 138.
regress ffev1a fheight fage

      Source |       SS       df       MS              Number of obs =     150
-------------+------------------------------           F(  2,   147) =   36.81
       Model |   21.056968     2   10.528484           Prob > F      =  0.0000
    Residual |   42.041328   147  .285995429           R-squared     =  0.3337
-------------+------------------------------           Adj R-squared =  0.3247
       Total |   63.098296   149  .423478497           Root MSE      =  .53479

------------------------------------------------------------------------------
      ffev1a |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     fheight |    .114397    .015789     7.25   0.000     .0831943    .1455997
        fage |  -.0266393   .0063687    -4.18   0.000    -.0392254   -.0140532
       _cons |  -2.760746   1.137746    -2.43   0.016    -5.009197   -.5122958
------------------------------------------------------------------------------
Page 140. The t-test at the top of the page.
NOTE: The t-tests for the coefficients are given in the regression output for Table 7.1 above (the t and P>|t| columns).
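
Each t statistic in the Table 7.1 output is a coefficient divided by its standard error, referred to a t distribution with the residual degrees of freedom (147 here). A minimal sketch that recomputes them from the printed values follows; the scalar names are ours.

```stata
* Coefficients and standard errors copied from the Table 7.1 output
scalar t_height = .114397 / .015789
scalar t_age    = -.0266393 / .0063687

* Two-sided p-values; ttail() is the upper-tail t probability
scalar p_height = 2 * ttail(147, abs(t_height))
scalar p_age    = 2 * ttail(147, abs(t_age))

display "t(fheight) = " %6.2f t_height "   p = " %6.4f p_height
display "t(fage)    = " %6.2f t_age    "   p = " %6.4f p_age
```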
Table 7.5, page 150.
NOTE: We need to reshape the data from wide to long to get the first panel of the table, and we use the Stata command reshape to do this.  In each stub, the @ symbol marks where the "j" value appears in the wide variable names; placing it before the stub (for example, @age) acts as a "wild card" that collects all of the age variables regardless of the prefix (in this case, "f" and "m").  Before we reshape the data, however, we need to drop the variables for the children so that they will not be picked up by the "wild card".  We use the string option because the values of the "j" variable, momdad, are strings ("f" and "m"); we then recode momdad into a numeric variable, gender.
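* drop the children's variables so that only the f and m prefixes remain
drop oc* mc* yc*
reshape long @age @height  @fev1, i(id) j(momdad) string
* momdad is "f" for the father's row and "m" for the mother's row
generate gender = 2 if momdad == "m"
replace gender = 1 if momdad == "f"
label define gend 1 "male" 2 "female"
label values gender gend
* rescale fev1 as was done for ffev1a above
generate fev1a = fev1/100
tabstat age height fev1a, statistics(mean sd)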

   stats |       age    height     fev1a
---------+------------------------------
    mean |  38.84667  66.67667    3.5332
      sd |  6.912484  3.685657  .8025855
----------------------------------------
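
The reshape step above can be tried on a toy dataset to see what the @ placeholder does. This is only a sketch: the two families and their values are made up for illustration and are not taken from the lung data. Note that clear removes whatever data are currently in memory.

```stata
* Hypothetical wide data: one row per family, f = father, m = mother
clear
input id fage mage fheight mheight
1 40 38 70 64
2 35 33 68 66
end

* @ marks where the j value ("f" or "m") sits in each wide variable name
reshape long @age @height, i(id) j(momdad) string

* Long form: one row per parent, with variables id, momdad, age, height
list, sepby(id)
```

After the reshape, each family contributes two rows, one with momdad equal to "f" and one with momdad equal to "m".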

regress fev1a age height

      Source |       SS       df       MS              Number of obs =     300
-------------+------------------------------           F(  2,   297) =  197.57
       Model |  109.953774     2   54.976887           Prob > F      =  0.0000
    Residual |  82.6451491   297  .278266495           R-squared     =  0.5709
-------------+------------------------------           Adj R-squared =  0.5680
       Total |  192.598923   299  .644143556           Root MSE      =  .52751
------------------------------------------------------------------------------
       fev1a |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0185978   .0044429    -4.19   0.000    -.0273413   -.0098542
      height |    .164865   .0083327    19.79   0.000     .1484664    .1812635
       _cons |  -6.736985   .5632885   -11.96   0.000    -7.845528   -5.628443
------------------------------------------------------------------------------
To obtain the second and third panels of the table, we need to sort the data by gender and then use the by prefix to do the descriptive statistics and regressions for each gender.
sort gender
by gender: tabstat age height fev1a, statistics(mean sd)

------------------------------------------------------------------------------------------------
-> gender = male
   stats |       age    height     fev1a
---------+------------------------------
    mean |  40.13333     69.26  4.093267
      sd |  6.889995  2.779189  .6507523
----------------------------------------
------------------------------------------------------------------------------------------------
-> gender = female
   stats |       age    height     fev1a
---------+------------------------------
    mean |     37.56  64.09333  2.973133
      sd |  6.714184  2.469537  .4874136
----------------------------------------
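
The sort followed by the by prefix can also be written in one step with bysort. A minimal sketch on made-up data follows; the four observations are hypothetical and serve only to show the syntax. Note that clear removes the data currently in memory.

```stata
* Hypothetical long data: gender coded 1 = male, 2 = female
clear
input gender fev1a
1 4.1
1 3.9
2 3.0
2 2.8
end

* bysort sorts on gender and then runs the command once per group
bysort gender: summarize fev1a
```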

by gender: regress fev1a age height, beta

------------------------------------------------------------------------------------------------
-> gender = male
      Source |       SS       df       MS              Number of obs =     150
-------------+------------------------------           F(  2,   147) =   36.81
       Model |   21.056968     2   10.528484           Prob > F      =  0.0000
    Residual |   42.041328   147  .285995429           R-squared     =  0.3337
-------------+------------------------------           Adj R-squared =  0.3247
       Total |   63.098296   149  .423478497           Root MSE      =  .53479
------------------------------------------------------------------------------
       fev1a |      Coef.   Std. Err.      t    P>|t|                     Beta
-------------+----------------------------------------------------------------
         age |  -.0266393   .0063687    -4.18   0.000                -.2820504
      height |    .114397    .015789     7.25   0.000                 .4885592
       _cons |  -2.760746   1.137746    -2.43   0.016                        .
------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------
-> gender = female
      Source |       SS       df       MS              Number of obs =     150
-------------+------------------------------           F(  2,   147) =   30.24
       Model |  10.3185252     2  5.15926259           Prob > F      =  0.0000
    Residual |  25.0797019   147  .170610217           R-squared     =  0.2915
-------------+------------------------------           Adj R-squared =  0.2819
       Total |  35.3982271   149  .237571994           Root MSE      =  .41305
------------------------------------------------------------------------------
       fev1a |      Coef.   Std. Err.      t    P>|t|                     Beta
-------------+----------------------------------------------------------------
         age |  -.0199755   .0050405    -3.96   0.000                -.2751644
      height |   .0925926   .0137042     6.76   0.000                 .4691313
       _cons |   -2.21116    .896067    -2.47   0.015                        .
------------------------------------------------------------------------------
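
The Beta column holds standardized coefficients: each raw coefficient multiplied by the standard deviation of its predictor and divided by the standard deviation of the outcome. As a rough check for the male panel, assuming the standard deviations printed by tabstat above (scalar names are ours):

```stata
* Male panel: coefficients from the regression, SDs from tabstat
scalar sd_age    = 6.889995
scalar sd_height = 2.779189
scalar sd_fev1a  = .6507523

* beta = b * sd(x) / sd(y)
scalar beta_age    = -.0266393 * sd_age / sd_fev1a
scalar beta_height = .114397 * sd_height / sd_fev1a

display "Beta(age)    = " %9.7f beta_age
display "Beta(height) = " %9.7f beta_height
```

Both values should match the Beta column for gender = male up to rounding of the inputs.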
Page 152. Middle of the page.
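NOTE: The height coefficients from the separate regressions above (.093 for females and .114 for males), together with their standard errors, are used in the calculation of the Z test.  The regression below fits both genders in one model with a gender by height interaction.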
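
A minimal sketch of the Z calculation, assuming the usual large-sample formula for comparing two independent slopes, Z = (b_m - b_f) / sqrt(se_m^2 + se_f^2). The book's exact presentation may differ, and the scalar names are ours.

```stata
* Height slopes and standard errors from the by-gender regressions
scalar b_m  = .114397
scalar se_m = .015789
scalar b_f  = .0925926
scalar se_f = .0137042

* Difference in slopes divided by the standard error of the difference
scalar z_height = (b_m - b_f) / sqrt(se_m^2 + se_f^2)

* Two-sided p-value from the standard normal distribution
scalar p_z = 2 * (1 - normal(abs(z_height)))
display "Z = " %6.3f z_height "   two-sided p = " %6.4f p_z
```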
xi: regress fev1a age i.gender*height 

i.gender          _Igender_1-2        (naturally coded; _Igender_1 omitted)
i.gender*height   _IgenXheigh_#       (coded as above)

      Source |       SS       df       MS              Number of obs =     300
-------------+------------------------------           F(  4,   295) =  137.39
       Model |  125.325155     4  31.3312887           Prob > F      =  0.0000
    Residual |  67.2737685   295  .228046673           R-squared     =  0.6507
-------------+------------------------------           Adj R-squared =  0.6460
       Total |  192.598923   299  .644143556           Root MSE      =  .47754
------------------------------------------------------------------------------
       fev1a |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0233887   .0040701    -5.75   0.000    -.0313988   -.0153786
  _Igender_2 |   .8296771   1.410069     0.59   0.557    -1.945392    3.604746
      height |   .1148495   .0140881     8.15   0.000     .0871236    .1425754
_IgenXheig~2 |  -.0221023   .0212056    -1.04   0.298    -.0638357    .0196312
       _cons |  -2.922545   .9965403    -2.93   0.004    -4.883774   -.9613153
------------------------------------------------------------------------------
Page 153. Middle of the page.
xi: regress fev1a i.gender*height i.gender*age

i.gender          _Igender_1-2        (naturally coded; _Igender_1 omitted)
i.gender*height   _IgenXheigh_#       (coded as above)
i.gender*age      _IgenXage_#         (coded as above)
      Source |       SS       df       MS              Number of obs =     300
-------------+------------------------------           F(  5,   294) =  109.92
       Model |  125.477893     5  25.0955786           Prob > F      =  0.0000
    Residual |  67.1210299   294  .228302823           R-squared     =  0.6515
-------------+------------------------------           Adj R-squared =  0.6456
       Total |  192.598923   299  .644143556           Root MSE      =  .47781
------------------------------------------------------------------------------
       fev1a |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  _Igender_2 |   .5495863   1.451823     0.38   0.705    -2.307697     3.40687
      height |    .114397   .0141069     8.11   0.000     .0866338    .1421602
_IgenXheig~2 |  -.0218044   .0212206    -1.03   0.305     -.063568    .0199592
  _Igender_2 |  (dropped)
         age |  -.0266393   .0056902    -4.68   0.000    -.0378381   -.0154406
 _IgenXage_2 |   .0066639   .0081472     0.82   0.414    -.0093704    .0226981
       _cons |  -2.760746   1.016532    -2.72   0.007    -4.761349   -.7601439
------------------------------------------------------------------------------

test  _Igender_2 _IgenXheigh_2 _IgenXage_2
 ( 1)  _Igender_2 = 0
 ( 2)  _IgenXheigh_2 = 0
 ( 3)  _IgenXage_2 = 0
       F(  3,   294) =   22.67
            Prob > F =    0.0000
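
The joint F statistic above can be cross-checked from the residual sums of squares of two models already shown: the pooled regression of fev1a on age and height (the reduced model, 297 residual df) and the full interaction model (294 residual df). This is a sketch of the standard extra-sum-of-squares calculation, with scalar names of our own choosing.

```stata
* Residual SS copied from the two regression tables
scalar rss_reduced = 82.6451491   // regress fev1a age height
scalar rss_full    = 67.1210299   // full gender interaction model

* F = (drop in residual SS / number of restrictions) / full-model MSE
scalar F_change = ((rss_reduced - rss_full) / 3) / (rss_full / 294)

* Ftail() is the upper-tail probability of the F distribution
display "F(3, 294) = " %6.2f F_change "   p = " %6.4f Ftail(3, 294, F_change)
```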

These pages are Copyrighted (c) by UCLA Academic Technology Services

