### Stata Textbook Examples: Regression with Graphics by Lawrence Hamilton, Chapter 2: Bivariate Regression Analysis

use http://www.ats.ucla.edu/stat/stata/examples/rwg/concord1, clear
(Hamilton (1983))
Figure 2.3, page 35.
graph twoway scatter water81 income, xlabel(0(20)100) ylabel(0(2000)10000)
Figure 2.4, page 35.
graph twoway (scatter water81 income) (lfit water81 income), ///
xlabel(0(20)100) ylabel(0(2000)10000)

Regress water81 on income. The output reproduces the model estimates on page 36. Note that the residual standard deviation is reported as Root MSE.

regress water81 income

  Source |       SS       df       MS                  Number of obs =     496
---------+------------------------------               F(  1,   494) =  104.46
   Model |   190820566     1   190820566               Prob > F      =  0.0000
Residual |   902418143   494  1826757.38               R-squared     =  0.1745
   Total |  1.0932e+09   495  2208563.05               Root MSE      =  1351.6

------------------------------------------------------------------------------
 water81 |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
  income |   47.54869   4.652286     10.221   0.000       38.40798     56.6894
   _cons |   1201.124   123.3245      9.740   0.000       958.8191     1443.43
------------------------------------------------------------------------------
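The Root MSE and R-squared in this output can be recovered from the sums of squares: Root MSE is the square root of the residual mean square, and R-squared is the model sum of squares divided by the total. A minimal sketch of that check, assuming concord1 is still in memory and using the results that regress stores in e():

```stata
* Assumes concord1 is already in memory (loaded with the use command above)
quietly regress water81 income

* Root MSE as stored, and recomputed as sqrt(residual SS / residual df)
display "Root MSE (stored):      " e(rmse)
display "sqrt(RSS / df_resid):   " sqrt(e(rss)/e(df_r))

* R-squared as stored, and recomputed as model SS / total SS
display "R-squared (stored):     " e(r2)
display "MSS / (MSS + RSS):      " e(mss)/(e(mss) + e(rss))
```

Each pair of lines should display the same value, matching the Root MSE and R-squared shown in the table.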
Obtain the predicted values of water81, naming this variable pw81.
predict pw81
(option xb assumed; fitted values)
Obtain the residuals, naming this variable rw81.
predict rw81, resid
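To see what predict is doing, the fitted values and residuals can also be built by hand from the stored coefficients. This is only an illustrative sketch: the variable names pw81_p, pw81_chk, and rw81_chk are made up for the example, and it assumes concord1 is in memory.

```stata
* Assumes concord1 is in memory; the *_p and *_chk names are illustrative only
quietly regress water81 income

* Fitted values from predict, stored in double precision
predict double pw81_p, xb

* The same fitted values from the stored coefficients: intercept + slope*income
generate double pw81_chk = _b[_cons] + _b[income]*income

* Residual = observed value minus fitted value
generate double rw81_chk = water81 - pw81_chk

* The hand-built fitted values should agree with predict
assert reldif(pw81_chk, pw81_p) < 1e-8

* OLS residuals from a model with a constant average to (numerically) zero
summarize rw81_chk
```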

Use summarize to obtain the standard deviations of water81 and income on page 40. The regression coefficients and other regression estimates in this section were output previously using the regress command.

summarize income water81

Variable |     Obs        Mean   Std. Dev.       Min        Max
---------+-----------------------------------------------------
  income |     496    23.07661   13.05784          2        100
 water81 |     496    2298.387   1486.123        100      10100
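One use of these standard deviations (assumed here to be the purpose of the page 40 discussion) is to rescale the slope into a standardized coefficient. With $b$ the slope from the regress output, $s_x$ the standard deviation of income, and $s_y$ the standard deviation of water81:

```latex
b^{*} = b\,\frac{s_x}{s_y}
      = 47.54869 \times \frac{13.05784}{1486.123}
      \approx 0.418,
\qquad
\sqrt{R^{2}} = \sqrt{0.1745} \approx 0.418
```

In a bivariate regression the standardized slope equals the correlation between the two variables, which is why it agrees with the square root of R-squared.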
The estimates and test statistics in this section were obtained previously with the regress command.
Figure 2.7, page 48. Graph the scatterplot with the 99% confidence and prediction bands. The graph overlays four plots: a scatterplot (scatter water81 income), the regression line (lfit water81 income), the confidence band for the regression line (lfitci water81 income, level(99)), and the prediction band (lfitci water81 income, level(99) stdf).
graph twoway (scatter water81 income) (lfit water81 income) ///
(lfitci water81 income, level(99)  clcolor(blue)  ciplot(rline)) ///
(lfitci water81 income, level(99) stdf ciplot(rline)), ///
xlabel(0 23 100) ylabel(0 10000, nogrid)
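The two bands differ only in the standard error they use: stdp is the standard error of the fitted mean, while stdf is the standard error of a forecast for a single new observation. As a rough sketch of what lfitci draws, the limits can be computed directly with predict. All new variable and scalar names below are illustrative, and concord1 is assumed to be in memory.

```stata
* Assumes concord1 is in memory; all new names are illustrative only
quietly regress water81 income

predict double yhat99, xb        // fitted values
predict double se_mean, stdp     // standard error of the fitted mean
predict double se_pred, stdf     // standard error of a forecast for one new case

* Two-sided 99% critical value from the t distribution with residual df
scalar tcrit = invttail(e(df_r), 0.005)

* 99% confidence band for the regression line
generate double ci_lo = yhat99 - scalar(tcrit)*se_mean
generate double ci_hi = yhat99 + scalar(tcrit)*se_mean

* 99% prediction band for individual observations
generate double pi_lo = yhat99 - scalar(tcrit)*se_pred
generate double pi_hi = yhat99 + scalar(tcrit)*se_pred

* The prediction band always encloses the confidence band
assert pi_lo <= ci_lo & ci_hi <= pi_hi
```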


Figure 2.9, page 52.
rvfplot, yline(0) xlabel(1000(1000)6000) ylabel(-2000(2000)6000)
Figure 2.12, page 54.
qnorm rw81, xlab(-4000(2000)4000) ylab(-4000(2000)6000)
Figure 2.13, page 54. First graph and save the individual plots, then graph them together.
histogram income, nodraw normal fraction bin(8) start(0) xlabel(0(20)100) ylabel(0(.1).3) saving(f2_13a,replace)
graph box income, nodraw ylabel(0(20)100) saving(f2_13b,replace)
symplot income, nodraw xlabel(0(5)20) ylabel(0(20)80) saving(f2_13c,replace)
qnorm income, nodraw xlabel(-20(20)60) ylabel(-20(20)100) saving(f2_13d,replace)

graph combine f2_13a.gph f2_13b.gph f2_13c.gph f2_13d.gph


Figure 2.14, page 55. This is done the same way as Figure 2.13, after generating inc_3, which is income raised to the .3 power.
gen inc_3 = income^.3

histogram inc_3, nodraw fraction normal bin(9) xlabel(1(1)4) ylabel(0(.1).3) saving(f2_14a,replace)
graph box inc_3, nodraw ylabel(1(1)4) saving(f2_14b,replace)
symplot inc_3, nodraw xlabel(0(.5)1.5) ylabel(0(.5)1.5) saving(f2_14c,replace)
qnorm inc_3, nodraw xlabel(1(1)4) ylabel(1(1)4) saving(f2_14d,replace)
graph combine f2_14a.gph f2_14b.gph f2_14c.gph f2_14d.gph
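A numeric companion to these plots is to compare the skewness of income before and after the power transformation. A minimal sketch, assuming concord1 is in memory and inc_3 has been generated as above:

```stata
* Assumes concord1 is in memory and inc_3 = income^.3 already exists
quietly summarize income, detail
display "skewness of income:     " r(skewness)

quietly summarize inc_3, detail
display "skewness of income^.3:  " r(skewness)
```

A skewness closer to zero for inc_3 would be consistent with the more symmetric distribution the transformation is meant to produce.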


Table 2.3, page 55. Raise water81 to the .3 power and regress it on the transformed income variable.
gen wtr81_3 = water81^.3
regress wtr81_3 inc_3

  Source |       SS       df       MS                  Number of obs =     496
---------+------------------------------               F(  1,   494) =  126.22
   Model |  370.337058     1  370.337058               Prob > F      =  0.0000
Residual |  1449.41668   494  2.93404187               R-squared     =  0.2035
   Total |  1819.75374   495  3.67627019               Root MSE      =  1.7129

------------------------------------------------------------------------------
 wtr81_3 |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
   inc_3 |   1.934535   .1721913     11.235   0.000       1.596217    2.272853
   _cons |   4.989011   .4330577     11.520   0.000       4.138149    5.839873
------------------------------------------------------------------------------
Compute the mean values in Table 2.3, page 55.
summ wtr81_3 inc_3

Variable |     Obs        Mean   Std. Dev.       Min        Max
---------+-----------------------------------------------------
 wtr81_3 |     496    9.776982    1.91736   3.981072   15.89631
   inc_3 |     496    2.474998   .4471152   1.231144   3.981072
Figure 2.15, page 56.
graph twoway (scatter wtr81_3 inc_3) (lfit wtr81_3 inc_3), ylabel(4(2)16) xlabel(1(1)4)
Figure 2.16, page 56. First obtain the predicted and residual values for the model with the transformed variables. Then graph and save the individual plots before graphing them together.
predict pw81_3
predict rw81_3, resid
rvfplot, yline(0) xlab(8(1)12) ylab(-6(2)6) saving(f2_16a,replace)
qnorm rw81_3, xlab(-4(2)4) ylabel(-6(2)6) saving(f2_16b,replace)

graph combine f2_16a.gph f2_16b.gph



Figure 2.17, page 58. Generate pw81_i3, the inverse transformation of pw81_3, and graph it against income together with water81. The sort option is needed so that the line connects the points in order of income.

gen pw81_i3 = pw81_3^(1/.3)

graph twoway (scatter water81 income) (line pw81_i3 income, sort)
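The curve in this figure comes from undoing the power transformation on the fitted values. Using the coefficients from Table 2.3, with $x$ for income and $\hat{y}$ for predicted water81 on its original scale, the relationship can be written as follows (a worked sketch, not a formula quoted from the text):

```latex
\widehat{y^{0.3}} = 4.989011 + 1.934535\,x^{0.3}
\quad\Longrightarrow\quad
\hat{y} = \left(4.989011 + 1.934535\,x^{0.3}\right)^{1/0.3}
```

For example, at income = 100 (where inc_3 = 3.981072, its maximum in the summary table), the fitted value on the transformed scale is about 12.69, so the back-transformed prediction is roughly $12.69^{1/0.3} \approx 4.8 \times 10^{3}$. The straight-line model from page 36 gives about 5956 at the same income, which is why the curve bends below the line at high incomes.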
Save the modified concord1 data as a new dataset, concord1b, so that the changes are not lost when another dataset is loaded.
save concord1b, replace

use http://www.ats.ucla.edu/stat/stata/examples/rwg/oilspill, clear
(Accidental Oil Spills 1973-85)
Regress oil loss on the number of spills.
regress lost spills

  Source |       SS       df       MS                  Number of obs =      13
---------+------------------------------               F(  1,    11) =    6.03
   Model |  167218.128     1  167218.128               Prob > F      =  0.0319
Residual |  304843.616    11   27713.056               R-squared     =  0.3542
   Total |  472061.744    12  39338.4787               Root MSE      =  166.47

------------------------------------------------------------------------------
    lost |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
  spills |   6.956853   2.832131      2.456   0.032       .7233746    13.19033
   _cons |  -44.48731   102.6833     -0.433   0.673      -270.4918    181.5172
------------------------------------------------------------------------------

Regress oil loss on number of spills through the origin, using the noconstant option.

regress lost spills, noconstant

  Source |       SS       df       MS                  Number of obs =      13
---------+------------------------------               F(  1,    12) =   22.72
   Model |   587004.77     1   587004.77               Prob > F      =  0.0005
Residual |  310045.453    12  25837.1211               R-squared     =  0.6544
   Total |  897050.223    13  69003.8633               Root MSE      =  160.74

------------------------------------------------------------------------------
    lost |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
  spills |   5.860875     1.2296      4.766   0.000       3.181808    8.539943
------------------------------------------------------------------------------
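The jump in R-squared from 0.3542 to 0.6544 does not mean the model through the origin fits better; its residual sum of squares is in fact slightly larger. With noconstant, the total sum of squares is taken about zero rather than about the mean of lost (note the 13 rather than 12 total degrees of freedom), so the two R-squared values are not comparable. Using the sums of squares from the two tables:

```latex
R^{2}_{\text{with constant}}
  = \frac{167218.128}{472061.744} \approx 0.3542,
\qquad
\text{total SS} = \sum_i (y_i - \bar{y})^{2}

R^{2}_{\text{noconstant}}
  = \frac{587004.77}{897050.223} \approx 0.6544,
\qquad
\text{total SS} = \sum_i y_i^{2}
```

The slope through the origin is $b = \sum_i x_i y_i / \sum_i x_i^{2}$, with $x$ the number of spills and $y$ the oil lost.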
Figure 2.8, page 50.
graph twoway (scatter lost spills) (lfit lost spills) ///
(lfit lost spills, estopts(noconstant)), ///
ylabel(0(100)700) xlabel(0(10)70)
Save the data as a new dataset, oilspill2.
save oilspill2, replace
