### Stata Textbook Examples: Introduction to the Practice of Statistics by Moore and McCabe, Chapter 10: Inference for Regression

The first examples use the file EG10_001.
use http://www.ats.ucla.edu/stat/stata/examples/mm/webdata/EG10_001, clear
Figure 10.3, page 667.
graph twoway (scatter density lskin) (lfit density lskin), ///
ylabel(1.02(.01)1.10) xlabel(1(.1)2.1)


Figure 10.4, page 668 can be produced with the regress command below. The variable density is the dependent variable, and lskin is the predictor variable.
regress  density lskin

  Source |       SS       df       MS                  Number of obs =      92
---------+------------------------------               F(  1,    90) =  231.89
   Model |  .016908557     1  .016908557               Prob > F      =  0.0000
Residual |   .00656234    90  .000072915               R-squared     =  0.7204
   Total |  .023470897    91  .000257922               Root MSE      =  .00854

------------------------------------------------------------------------------
 density |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
   lskin |   -.063119   .0041449    -15.228   0.000      -.0713536   -.0548844
   _cons |   1.162999   .0065596    177.296   0.000       1.149967    1.176031
------------------------------------------------------------------------------
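Several numbers in this table can be recomputed from the results that regress leaves behind, which is a handy way to see how the pieces fit together. The sketch below assumes EG10_001 is still in memory; it uses the stored results _b[], _se[], e(mss), e(rss), and e(df_r), and the values it displays should agree with the t for lskin, F, R-squared, and Root MSE shown above.

```stata
* Sketch: recompute parts of the regression table from stored results.
* Assumes EG10_001 is in memory, as loaded above.
quietly regress density lskin

* t statistic for the slope: coefficient divided by its standard error
display "t for lskin = " _b[lskin] / _se[lskin]

* With a single predictor, the overall F is the square of that t
display "F           = " (_b[lskin] / _se[lskin])^2

* R-squared: model sum of squares over total sum of squares
display "R-squared   = " e(mss) / (e(mss) + e(rss))

* Root MSE: square root of the residual mean square
display "Root MSE    = " sqrt(e(rss) / e(df_r))
```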
Figure 10.6, page 670 is a plot of the residuals by case. You can generate the residual using the predict command, as shown below. We named the residual res, but we could have chosen a different name if we wanted to. Then, we use the graph command to graph res by subject (the case number). The yline(0) option puts in a line where Y is 0 (as shown in figure 10.6).
predict res, residual
graph twoway scatter res subject, yline(0) xlabel(0(10)100) ylabel(-0.015(.005).025)
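To see what predict is doing, the residuals can also be built by hand as the observed density minus the fitted value. This is only a sketch: res_check and res_diff are hypothetical variable names, and it assumes res was created as shown above.

```stata
* Sketch: build the residuals by hand and compare them with res.
* res_check and res_diff are hypothetical names chosen for this example.
quietly regress density lskin
generate res_check = density - (_b[_cons] + _b[lskin]*lskin)

* The differences should be zero apart from rounding in storage
generate res_diff = abs(res - res_check)
summarize res_diff
```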
Figure 10.7, page 670 shows the residual graphed by lskin and is produced using the graph command below.
graph twoway scatter res lskin, yline(0) xlabel(1(.1)2.1) ylabel(-0.015(.005).025)
Figure 10.8, page 671 can be obtained with the qnorm command. The x-axis is labeled differently, but conveys the same information.
qnorm res
The qnorm command changes the order of the data. We sort the data on subject to put the data back in order.
sort subject
Example 10.3, page 673. We repeat the regression from above, which contains the test of B1 and the confidence interval for B1.
regress density lskin

  Source |       SS       df       MS                  Number of obs =      92
---------+------------------------------               F(  1,    90) =  231.89
   Model |  .016908557     1  .016908557               Prob > F      =  0.0000
Residual |   .00656234    90  .000072915               R-squared     =  0.7204
   Total |  .023470897    91  .000257922               Root MSE      =  .00854

------------------------------------------------------------------------------
 density |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
   lskin |   -.063119   .0041449    -15.228   0.000      -.0713536   -.0548844
   _cons |   1.162999   .0065596    177.296   0.000       1.149967    1.176031
------------------------------------------------------------------------------
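The confidence interval for B1 in this table is the coefficient plus or minus a t critical value times its standard error. As a sketch, assuming the regression above has just been run, the commands below rebuild that interval from stored results; tstar is a hypothetical scalar name. The two limits should reproduce the interval reported for lskin.

```stata
* Sketch: rebuild the 95% confidence interval for the slope by hand.
quietly regress density lskin

* invttail(df, p) returns the t value with upper-tail probability p;
* e(df_r) is the residual degrees of freedom (90 here)
scalar tstar = invttail(e(df_r), 0.025)

display "t*    = " scalar(tstar)
display "lower = " _b[lskin] - scalar(tstar)*_se[lskin]
display "upper = " _b[lskin] + scalar(tstar)*_se[lskin]
```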
We can produce Figure 10.9, page 675 with the commands below. We already have lskin and density so we don't need to obtain those.
We use the predict command to get the predicted value of density and we call the predicted value yhat.
predict yhat
(option xb assumed; fitted values)
We use the predict command to get the standard error of the fitted value, and we call it stderr. The stdp option tells Stata that we want the standard error of the prediction, that is, of the estimated mean of density at each value of lskin.
predict stderr, stdp
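We can then create the confidence interval using the commands below. The variable yhatll is the lower confidence limit, and yhatul is the upper confidence limit. The function invttail(90,.025) gives the t value that cuts off the upper 2.5% of the t distribution with 90 degrees of freedom (the residual df in the regression output above); using it on both sides of yhat corresponds to a 95% confidence interval.
generate yhatll = yhat-stderr*invttail(90,.025)

generate yhatul = yhat+stderr*invttail(90,.025)
Note: the limits in the listing below lie about 2.28 standard errors from yhat rather than the 1.99 that invttail(90,.025) returns, so rerunning these commands gives somewhat narrower limits than those listed.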
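For reference, the interval built here has the usual textbook form for simple linear regression. In the notation below (an assumption of this note, not Stata output), x* is the value of lskin at which the mean is estimated, s is the Root MSE, n is the number of observations, and t* is the critical value from the t distribution with n - 2 degrees of freedom. The quantity SE of mu-hat is what predict returns with the stdp option.

```latex
\hat{\mu}_y \pm t^{*}\, SE_{\hat{\mu}},
\qquad
SE_{\hat{\mu}} = s \sqrt{\frac{1}{n} + \frac{(x^{*} - \bar{x})^{2}}{\sum_{i=1}^{n} (x_i - \bar{x})^{2}}}
```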
We then list out these values as shown in Figure 10.9, page 675.
list  lskin density yhat stderr yhatll yhatul in 1/24

lskin    density       yhat     stderr     yhatll     yhatul
1.      1.27      1.093   1.082838   .0015224   1.079367   1.086309
2.      1.56      1.063   1.064533   .0008909   1.062502   1.066565
3.      1.45      1.078   1.071477   .0010156   1.069161   1.073792
4.      1.52      1.056   1.067058   .0009122   1.064978   1.069138
5.      1.51      1.073   1.067689   .0009221   1.065587   1.069792
6.      1.51      1.071   1.067689   .0009221   1.065587   1.069792
7.       1.5      1.076   1.068321   .0009337   1.066192   1.070449
8.      1.62      1.047   1.060746    .000916   1.058658   1.062835
9.       1.5      1.089   1.068321   .0009337   1.066192   1.070449
10.      1.75      1.053   1.052541   .0011671    1.04988   1.055202
11.      1.43      1.057   1.072739    .001058   1.070327   1.075151
12.      1.81      1.051   1.048754   .0013414   1.045696   1.051812
13.       1.6      1.074   1.062009   .0009001   1.059956   1.064061
14.      1.49       1.07   1.068952    .000947   1.066792   1.071111
15.      1.29      1.081   1.081576   .0014559   1.078256   1.084895
16.      1.52      1.064   1.067058   .0009122   1.064978   1.069138
17.      1.83      1.037   1.047491   .0014044   1.044289   1.050693
18.      1.58       1.06   1.063271   .0008917   1.061238   1.065304
19.       1.7      1.065   1.055697   .0010451   1.053314   1.058079
20.      1.59      1.058    1.06264   .0008949   1.060599    1.06468
21.      2.02      1.042   1.035499   .0020745   1.030769   1.040228
22.      1.84      1.045    1.04686   .0014367   1.043584   1.050136
23.      1.87      1.026   1.044967   .0015363   1.041464   1.048469
24.      1.83      1.046   1.047491   .0014044   1.044289   1.050693 
The graph twoway command below creates figure 10.10, page 675. The first part of the graph, (scatter density lskin), creates the scatter plot. The second part, (lfit density lskin), overlays the linear regression line. The final part, (lfitci density lskin, ciplot(rline)), overlays the confidence interval for the regression line; the ciplot(rline) option draws the interval as lines instead of the shaded band that lfitci produces by default.

Note: Stata does not extrapolate predictions past the range of the data; therefore, the predicted values and the confidence interval that we produce in Stata do not exactly match figure 10.10 in the book.

graph twoway (scatter density lskin) (lfit density lskin) (lfitci density lskin, ciplot(rline))
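The same kind of picture can also be drawn from the variables saved earlier, which makes it clear that the bands are just the limits plotted against lskin. This is a sketch that assumes yhat, yhatll, and yhatul already exist from the steps for figure 10.9; like lfitci, it covers only the observed range of lskin.

```stata
* Sketch: overlay the saved fitted values and confidence limits.
* Assumes yhat, yhatll, and yhatul were generated as shown earlier.
* The sort option orders the points by lskin so the lines are drawn
* left to right.
graph twoway (scatter density lskin) (line yhat yhatll yhatul lskin, sort)
```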
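The following commands get the values for figure 10.11. In this figure, we want the standard error for prediction. To get this standard error, we use the stdf option (in Stata lingo, the f is for forecast), and we call the variable stderrf. We then create yhatllf and yhatulf much as we did for figure 10.9, using the t value for 90 degrees of freedom (the residual df of the regression).
predict stderrf, stdf
generate yhatllf = yhat-stderrf*invttail(90,.025)
generate yhatulf = yhat+stderrf*invttail(90,.025)
Note: the limits in the listing below lie about 2.28 standard errors (stderrf) from yhat rather than the 1.99 that invttail(90,.025) returns, so rerunning these commands gives somewhat narrower limits than those listed.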
list  lskin density yhat stderrf yhatllf yhatulf in 1/24

lskin    density       yhat    stderrf    yhatllf    yhatulf
1.      1.27      1.093   1.082838   .0086737   1.063062   1.102613
2.      1.56      1.063   1.064533   .0085854   1.044959   1.084108
3.      1.45      1.078   1.071477   .0085992   1.051871   1.091082
4.      1.52      1.056   1.067058   .0085876   1.047479   1.086637
5.      1.51      1.073   1.067689   .0085887   1.048108   1.087271
6.      1.51      1.071   1.067689   .0085887   1.048108   1.087271
7.       1.5      1.076   1.068321   .0085899   1.048736   1.087905
8.      1.62      1.047   1.060746    .008588   1.041166   1.080327
9.       1.5      1.089   1.068321   .0085899   1.048736   1.087905
10.      1.75      1.053   1.052541   .0086184   1.032891    1.07219
11.      1.43      1.057   1.072739   .0086043   1.053121   1.092356
12.      1.81      1.051   1.048754   .0086437   1.029046   1.068461
13.       1.6      1.074   1.062009   .0085863   1.042432   1.081585
14.      1.49       1.07   1.068952   .0085914   1.049364    1.08854
15.      1.29      1.081   1.081576   .0086622   1.061826   1.101325
16.      1.52      1.064   1.067058   .0085876   1.047479   1.086637
17.      1.83      1.037   1.047491   .0086537   1.027761   1.067222
18.      1.58       1.06   1.063271   .0085854   1.043697   1.082845
19.       1.7      1.065   1.055697   .0086027   1.036083   1.075311
20.      1.59      1.058    1.06264   .0085858   1.043065   1.082215
21.      2.02      1.042   1.035499   .0087874   1.015464   1.055534
22.      1.84      1.045    1.04686    .008659   1.027118   1.066602
23.      1.87      1.026   1.044967   .0086761   1.025185   1.064748
24.      1.83      1.046   1.047491   .0086537   1.027761   1.067222     
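The forecast standard error is larger than the standard error of the fitted value because it also includes the scatter of individual observations around the line: its square is the squared stdp value plus the squared Root MSE. The sketch below checks this, assuming stderr and stderrf were created as shown above; stderrf_check is a hypothetical variable name.

```stata
* Sketch: check that stdf^2 = stdp^2 + (Root MSE)^2.
* Assumes stderr (from stdp) and stderrf (from stdf) already exist.
quietly regress density lskin
generate stderrf_check = sqrt(stderr^2 + e(rmse)^2)

* The last two columns should agree apart from rounding
list stderr stderrf stderrf_check in 1/5
```

For the first observation, for example, the listed stderr of .0015224 and the Root MSE of .00854 combine to about .00867, in line with the stderrf listed above.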
Figure 10.12, page 678 is graphed just like figure 10.10, except that we add the stdf option to lfitci to get the prediction interval.

Note: Stata does not extrapolate predictions past the range of the data; therefore, the predicted values and the prediction interval that we produce in Stata do not exactly match figure 10.12 in the book.

graph twoway (scatter density lskin) (lfit density lskin) (lfitci density lskin, ciplot(rline) stdf)

We have skipped figure 10.13 for now.
Section 10.2 covers many of the same results as section 10.1 but shows how to obtain them by hand computation. You can refer to section 10.1 to see how to get these results in Stata. However, example 10.15, page 692 shows something new: how to test the significance of a correlation. This is shown below using the pwcorr command with the sig option, which prints the significance level under each correlation.
pwcorr lskin density, sig

          |    lskin  density
----------+------------------
    lskin |   1.0000
          |
          |
  density |  -0.8488   1.0000
          |   0.0000
          |
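The significance level reported by pwcorr comes from a t test with n - 2 degrees of freedom, where t = r*sqrt(n - 2)/sqrt(1 - r^2). As a sketch, the commands below compute that statistic from the results stored by correlate; rho_hat, n_obs, and t_corr are hypothetical scalar names. The t value should match the t for lskin in the regression output above, apart from rounding.

```stata
* Sketch: test the correlation by hand using results stored by correlate.
quietly correlate lskin density
scalar rho_hat = r(rho)
scalar n_obs   = r(N)

* t statistic for H0: correlation = 0, with n - 2 degrees of freedom
scalar t_corr = scalar(rho_hat)*sqrt(scalar(n_obs) - 2)/sqrt(1 - scalar(rho_hat)^2)
display "t = " scalar(t_corr)

* Two-sided p-value; ttail(df, t) returns the upper-tail probability
display "p = " 2*ttail(scalar(n_obs) - 2, abs(scalar(t_corr)))
```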
