Stata Textbook Examples
Computer-Aided Multivariate Analysis by Afifi and Clark
Chapter 7: Multiple Regression and Correlation

Page 125 Regression from chapter 6.
use http://www.ats.ucla.edu/stat/stata/examples/cama3/lung, clear

generate ffev1a = ffev1/100
regress ffev1a fheight

      Source |       SS       df       MS              Number of obs =     150
-------------+------------------------------           F(  1,   148) =   50.50
       Model |  16.0531702     1  16.0531702           Prob > F      =  0.0000
    Residual |  47.0451258   148  .317872472           R-squared     =  0.2544
-------------+------------------------------           Adj R-squared =  0.2494
       Total |   63.098296   149  .423478497           Root MSE      =   .5638

------------------------------------------------------------------------------
      ffev1a |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     fheight |   .1181052   .0166194     7.11   0.000     .0852633    .1509472
       _cons |  -4.086702   1.151979    -3.55   0.001    -6.363155    -1.81025
------------------------------------------------------------------------------
Page 127 Descriptive statistics at the bottom of the page.
summarize fage fheight ffev1a

    Variable |     Obs        Mean   Std. Dev.       Min        Max
-------------+-----------------------------------------------------
        fage |     150    40.13333   6.889995         26         59
     fheight |     150       69.26   2.779189         61         76
      ffev1a |     150    4.093267   .6507523        2.5       5.85
Page 133 Covariance and correlation matrices.
Covariance:
correlate fage fheight fweight ffev1a, covariance

(obs=150)

             |     fage  fheight  fweight   ffev1a
-------------+------------------------------------
        fage |   47.472
     fheight | -1.07517  7.72389
     fweight | -3.64922  34.6954  573.798
      ffev1a | -1.38762  .912232  2.06716  .423478
Correlation:
correlate fage fheight fweight ffev1a

(obs=150)

             |     fage  fheight  fweight   ffev1a
-------------+------------------------------------
        fage |   1.0000
     fheight |  -0.0561   1.0000
     fweight |  -0.0221   0.5212   1.0000
      ffev1a |  -0.3095   0.5044   0.1326   1.0000
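As a check on how these matrices relate to the simple regression of ffev1a on fheight from page 125, the slope and R-squared can be recovered from the entries above. This is a worked sketch using the standard simple-regression identities, with x = fheight and y = ffev1a (the symbols are ours, not the text's):

```latex
\begin{align*}
b   &= \frac{s_{xy}}{s_x^2} = \frac{0.912232}{7.72389} \approx 0.1181, \\
R^2 &= r_{xy}^2 = (0.5044)^2 \approx 0.2544.
\end{align*}
```

Both values agree with the fheight coefficient and the R-squared in the page 125 output.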
Page 138 Table 7.2.
regress ffev1a fheight fage

      Source |       SS       df       MS              Number of obs =     150
-------------+------------------------------           F(  2,   147) =   36.81
       Model |   21.056968     2   10.528484           Prob > F      =  0.0000
    Residual |   42.041328   147  .285995429           R-squared     =  0.3337
-------------+------------------------------           Adj R-squared =  0.3247
       Total |   63.098296   149  .423478497           Root MSE      =  .53479

------------------------------------------------------------------------------
      ffev1a |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     fheight |    .114397    .015789     7.25   0.000     .0831943    .1455997
        fage |  -.0266393   .0063687    -4.18   0.000    -.0392254   -.0140532
       _cons |  -2.760746   1.137746    -2.43   0.016    -5.009197   -.5122958
------------------------------------------------------------------------------
Page 140 The t-test at the top of the page.

NOTE: The t statistics are given in the t column of the regression output above (Table 7.2).
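If the t statistic is wanted on its own, it can be recomputed from the stored estimates. The following is a minimal sketch; which coefficient the text tests on page 140 is not shown on this page, so fage is used here only as an illustration.

```stata
* Refit the Table 7.2 model so that its results are the active estimates
quietly regress ffev1a fheight fage
* t statistic = coefficient / standard error
display _b[fage] / _se[fage]
* Equivalent Wald test; the reported F(1, 147) equals t squared
test fage
```

For fage this is -.0266393/.0063687, or about -4.18, which matches the t column above.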

Page 150 Table 7.5.

NOTE: To get the first panel of the table, we need to reshape the data from wide to long, which we do with the Stata command reshape. In each stub name, the @ symbol marks the position of the prefix that distinguishes the variables (in this case, "f" and "m"), so it acts as a "wild card": @age, for example, collects all of the age variables regardless of the prefix. Before we reshape the data, however, we need to drop the variables for the children so that they will not be picked up by the "wild card". We use the string option because the "j" variable, momdad, holds the string prefixes "f" and "m" rather than numbers. Also note that Stata does not give us the R-statistic that is shown in the text. The S-statistic is labeled "Root MSE" in the Stata output.
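To make the role of @ concrete, here is a minimal sketch on a two-family toy dataset. The values are invented for illustration and are not from the lung data; only the variable naming pattern mirrors the real file. The preserve and restore commands are there so that the data already in memory are not lost.

```stata
* Toy illustration only: made-up values, not the lung data
* Set aside the data currently in memory
preserve
clear
input id fage mage fheight mheight
1 40 38 70 64
2 35 33 68 62
end
* @ marks where the "f" or "m" prefix sits; string lets j hold text values
reshape long @age @height, i(id) j(momdad) string
list, sepby(id)
* Bring back the original data
restore
```

The listing should show four rows, one per parent, with momdad equal to "f" or "m" and a single age and height column.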

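* Drop the children's variables so the @ stubs match only the parents' variables
drop oc* mc* yc*
reshape long @age @height @fev1, i(id) j(momdad) string
* momdad is "f" or "m"; code gender as 1 = male ("f") and 2 = female ("m")
generate gender = 2 if momdad == "m"
replace gender = 1 if momdad == "f"
label define gend 1 "male" 2 "female"
label values gender gend
* Rescale fev1 by 100, as was done for ffev1a above
generate fev1a = fev1/100
tabstat age height fev1a, statistics(mean sd)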

   stats |       age    height     fev1a
---------+------------------------------
    mean |  38.84667  66.67667    3.5332
      sd |  6.912484  3.685657  .8025855
----------------------------------------

regress fev1a age height

      Source |       SS       df       MS              Number of obs =     300
-------------+------------------------------           F(  2,   297) =  197.57
       Model |  109.953774     2   54.976887           Prob > F      =  0.0000
    Residual |  82.6451491   297  .278266495           R-squared     =  0.5709
-------------+------------------------------           Adj R-squared =  0.5680
       Total |  192.598923   299  .644143556           Root MSE      =  .52751
------------------------------------------------------------------------------
       fev1a |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0185978   .0044429    -4.19   0.000    -.0273413   -.0098542
      height |    .164865   .0083327    19.79   0.000     .1484664    .1812635
       _cons |  -6.736985   .5632885   -11.96   0.000    -7.845528   -5.628443
------------------------------------------------------------------------------
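The text reports R, which Stata does not print. A minimal sketch, assuming R denotes the multiple correlation coefficient (the square root of R-squared), uses the results that regress leaves behind in e():

```stata
* Run immediately after the regression above; e(r2) holds its R-squared
display sqrt(e(r2))
* Root MSE (the S-statistic) is stored as e(rmse)
display e(rmse)
```

With R-squared = 0.5709, the first command gives R of about 0.756.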
To obtain the second and third panels of the table, we need to sort the data by gender and then use the by prefix to run the descriptive statistics and regressions for each gender.
sort gender
by gender: tabstat age height fev1a, statistics(mean sd)

------------------------------------------------------------------------------------------------
-> gender = male
   stats |       age    height     fev1a
---------+------------------------------
    mean |  40.13333     69.26  4.093267
      sd |  6.889995  2.779189  .6507523
----------------------------------------
------------------------------------------------------------------------------------------------
-> gender = female
   stats |       age    height     fev1a
---------+------------------------------
    mean |     37.56  64.09333  2.973133
      sd |  6.714184  2.469537  .4874136
----------------------------------------
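The sort and by steps can also be combined with bysort. The sketch below is an equivalent alternative, not the form used on this page; the beta option asks regress for the standardized coefficients that appear in the Beta column of the output below.

```stata
* bysort sorts by gender and then runs the command within each group
bysort gender: tabstat age height fev1a, statistics(mean sd)
bysort gender: regress fev1a age height, beta
```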

by gender: regress fev1a age height, beta

------------------------------------------------------------------------------------------------
-> gender = male
      Source |       SS       df       MS              Number of obs =     150
-------------+------------------------------           F(  2,   147) =   36.81
       Model |   21.056968     2   10.528484           Prob > F      =  0.0000
    Residual |   42.041328   147  .285995429           R-squared     =  0.3337
-------------+------------------------------           Adj R-squared =  0.3247
       Total |   63.098296   149  .423478497           Root MSE      =  .53479
------------------------------------------------------------------------------
       fev1a |      Coef.   Std. Err.      t    P>|t|                     Beta
-------------+----------------------------------------------------------------
         age |  -.0266393   .0063687    -4.18   0.000                -.2820504
      height |    .114397    .015789     7.25   0.000                 .4885592
       _cons |  -2.760746   1.137746    -2.43   0.016                        .
------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------
-> gender = female
      Source |       SS       df       MS              Number of obs =     150
-------------+------------------------------           F(  2,   147) =   30.24
       Model |  10.3185252     2  5.15926259           Prob > F      =  0.0000
    Residual |  25.0797019   147  .170610217           R-squared     =  0.2915
-------------+------------------------------           Adj R-squared =  0.2819
       Total |  35.3982271   149  .237571994           Root MSE      =  .41305
------------------------------------------------------------------------------
       fev1a |      Coef.   Std. Err.      t    P>|t|                     Beta
-------------+----------------------------------------------------------------
         age |  -.0199755   .0050405    -3.96   0.000                -.2751644
      height |   .0925926   .0137042     6.76   0.000                 .4691313
       _cons |   -2.21116    .896067    -2.47   0.015                        .
------------------------------------------------------------------------------
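The Beta column in the output above holds standardized coefficients. As a worked sketch (notation ours), each one equals the raw coefficient multiplied by the ratio of the predictor's standard deviation to the outcome's standard deviation. For the male panel, using the standard deviations from the tabstat output:

```latex
\begin{align*}
\beta_j &= b_j \,\frac{s_{x_j}}{s_y}, \\
\beta_{\text{age}} &= -0.0266393 \times \frac{6.889995}{0.6507523} \approx -0.2821, \\
\beta_{\text{height}} &= 0.114397 \times \frac{2.779189}{0.6507523} \approx 0.4886.
\end{align*}
```

These agree with the Beta values reported for gender = male.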

These pages are Copyrighted (c) by UCLA Academic Technology Services


The content of this web site should not be construed as an endorsement of any particular web site, book, or software product by the University of California.