### Regression with Stata Chapter 4: Answers to Exercises

1. Use the crime data file that was used in chapter 2 (use http://www.ats.ucla.edu/stat/stata/webbooks/reg/crime ) and look at a regression model predicting murder from pctmetro, poverty, pcths and single using OLS, and make avplots and an lvr2plot following the regression. Are there any states that look worrisome? Repeat this analysis using regression with robust standard errors and show avplots for the analysis. Repeat the analysis using robust regression and make a manually created lvr2plot. Also run the results using qreg. Compare the results of the different analyses. Look at the weights from the robust regression and comment on the weights.

First, consider the OLS regression predicting murder from pctmetro, poverty, pcths and single.

use http://www.ats.ucla.edu/stat/stata/webbooks/reg/crime , clear
(crime data from agresti & finlay - 1997)

regress murder pctmetro poverty pcths single
      Source |       SS       df       MS              Number of obs =      51
-------------+------------------------------           F(  4,    46) =   37.90
Model |  4406.42207     4  1101.60552           Prob > F      =  0.0000
Residual |  1336.89947    46  29.0630319           R-squared     =  0.7672
Total |  5743.32154    50  114.866431           Root MSE      =   5.391

------------------------------------------------------------------------------
murder |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
pctmetro |   .0682218   .0380637     1.79   0.080    -.0083964      .14484
poverty |   .4380115   .3259862     1.34   0.186    -.2181648    1.094188
pcths |   .0243003   .2220237     0.11   0.913    -.4226102    .4712109
single |   3.650532   .4982054     7.33   0.000     2.647697    4.653367
_cons |  -45.31188   19.39747    -2.34   0.024    -84.35697   -6.266792
------------------------------------------------------------------------------
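For reference, the regress command above is ordinary least squares. Below is a minimal numpy sketch of that computation, using synthetic data and hypothetical coefficients (not the crime file), just to show what the fitted coefficients solve:

```python
import numpy as np

# Synthetic data: 51 "states", two hypothetical predictors plus an intercept.
rng = np.random.default_rng(0)
n = 51
x = rng.normal(size=(n, 2))
X = np.column_stack([np.ones(n), x])          # intercept column first
beta_true = np.array([1.0, 2.0, -3.0])
y = X @ beta_true + rng.normal(scale=0.1, size=n)

# OLS solves the least-squares problem min ||y - Xb||^2.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)
```

With little noise, the estimates land close to the coefficients used to generate the data.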


These results suggest that single is the only predictor significantly related to the number of murders in a state. Let's look at the lvr2plot for this analysis. Washington DC looks like it has both very high leverage and a very high residual.

. lvr2plot, mlabel(state)
. avplots

Let's consider the same analysis using robust standard errors. The results are largely the same, except that the p value for pctmetro fell from 0.080 to 0.049, which would make it a significant predictor; however, we would be somewhat skeptical of this particular result without further investigation.

regress murder pctmetro poverty pcths single, robust
Regression with robust standard errors                 Number of obs =      51
F(  4,    46) =    7.20
Prob > F      =  0.0001
R-squared     =  0.7672
Root MSE      =   5.391

------------------------------------------------------------------------------
|               Robust
murder |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
pctmetro |   .0682218   .0337517     2.02   0.049     .0002832    .1361604
poverty |   .4380115   .2568971     1.71   0.095    -.0790955    .9551185
pcths |   .0243003   .1841403     0.13   0.896    -.3463549    .3949556
single |   3.650532   1.152474     3.17   0.003     1.330723    5.970341
_cons |  -45.31188   25.39531    -1.78   0.081    -96.42999    5.806231
------------------------------------------------------------------------------


Stata allows us to compute the residual for this analysis but will not allow us to compute the leverage (hat) value. So instead of showing an lvr2plot let's look at the avplots for this analysis.

. avplots , mlabel(state)

As you can see, we still have an observation that sticks out from the rest: Washington DC. This is especially pronounced in the lower right graph, where DC appears to have very strong leverage on the coefficient for single.
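The robust option uses the Huber-White sandwich estimator rather than the classical OLS variance formula. A numpy sketch (synthetic heteroskedastic data, hypothetical variables; HC1 scaling, which is what Stata's robust option applies) comparing the two standard errors:

```python
import numpy as np

# Synthetic data where the error variance grows with |x| (heteroskedasticity).
rng = np.random.default_rng(1)
n = 200
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n) * (0.5 + np.abs(x))
X = np.column_stack([np.ones(n), x])
k = X.shape[1]

XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y
e = y - X @ beta

# Classical OLS variance: s^2 (X'X)^-1
s2 = e @ e / (n - k)
se_ols = np.sqrt(np.diag(s2 * XtX_inv))

# HC1 sandwich: (X'X)^-1 [sum e_i^2 x_i x_i'] (X'X)^-1, scaled by n/(n-k)
meat = X.T @ (X * (e ** 2)[:, None])
V_hc1 = n / (n - k) * XtX_inv @ meat @ XtX_inv
se_hc1 = np.sqrt(np.diag(V_hc1))

print(se_ols, se_hc1)
```

Here the robust standard error for the slope exceeds the classical one, because the classical formula understates the variance when large errors occur at extreme x values.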

Now, let's look at the analysis using robust regression and save the weights, calling them rrwt.

rreg murder pctmetro poverty pcths single, genwt(rrwt)
   Huber iteration 1:  maximum difference in weights = .44857261
Huber iteration 2:  maximum difference in weights = .0399983
Biweight iteration 3:  maximum difference in weights = .15321379
Biweight iteration 4:  maximum difference in weights = .00973214

Robust regression estimates                            Number of obs =      50
F(  4,    45) =   35.25
Prob > F      =  0.0000

------------------------------------------------------------------------------
murder |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
pctmetro |   .0535439   .0146555     3.65   0.001     .0240262    .0830615
poverty |    .182561   .1259505     1.45   0.154    -.0711163    .4362383
pcths |  -.2245853   .0863452    -2.60   0.013    -.3984936   -.0506771
single |   1.392942   .2355845     5.91   0.000     .9184503    1.867434
_cons |   2.888033   7.945302     0.36   0.718    -13.11463    18.89069
------------------------------------------------------------------------------


Neither the avplots command nor the lvr2plot command is available after rreg. However, we can compute the residual and hat values ourselves and build an lvr2plot of our own, as shown below.

predict r, r
predict h, hat
generate r2=r^2
sum r2
<output omitted>
replace r2 = r2/r(sum)
summarize r2
<output omitted>
local rm = r(mean)
summarize h
<output omitted>
local hm = r(mean)
graph twoway scatter h r2 if state ~= "dc", yline(`hm') xline(`rm') mlabel(state) xlabel(0(.005).025)

As you see above, using the robust regression, no observation is high on both leverage and squared residual. Let's recap the regress results and the rreg results below and compare them.
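The quantities plotted in the manual lvr2plot can also be computed from first principles. A numpy sketch (synthetic data and a hypothetical design matrix; here the residuals come from a plain OLS fit, whereas above they came from rreg) of the hat values and the normalized squared residuals:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
x = rng.normal(size=(n, 2))
X = np.column_stack([np.ones(n), x])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

# Hat (leverage) values are the diagonal of H = X (X'X)^-1 X'.
H = X @ np.linalg.inv(X.T @ X) @ X.T
h = np.diag(H)

# Normalized squared residuals: each observation's share of the residual SS,
# matching the replace r2 = r2/r(sum) step above.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta
r2 = e ** 2 / np.sum(e ** 2)

print(h.sum(), r2.sum())
```

Two useful checks: the hat values sum to the number of parameters (here 3), and with an intercept in the model every hat value is at least 1/n, so leverage has a hard floor.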

regress murder pctmetro poverty pcths single
      Source |       SS       df       MS              Number of obs =      51
-------------+------------------------------           F(  4,    46) =   37.90
Model |  4406.42207     4  1101.60552           Prob > F      =  0.0000
Residual |  1336.89947    46  29.0630319           R-squared     =  0.7672
Total |  5743.32154    50  114.866431           Root MSE      =   5.391

------------------------------------------------------------------------------
murder |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
pctmetro |   .0682218   .0380637     1.79   0.080    -.0083964      .14484
poverty |   .4380115   .3259862     1.34   0.186    -.2181648    1.094188
pcths |   .0243003   .2220237     0.11   0.913    -.4226102    .4712109
single |   3.650532   .4982054     7.33   0.000     2.647697    4.653367
_cons |  -45.31188   19.39747    -2.34   0.024    -84.35697   -6.266792
------------------------------------------------------------------------------

rreg murder pctmetro poverty pcths single
   Huber iteration 1:  maximum difference in weights = .44857261
Huber iteration 2:  maximum difference in weights = .0399983
Biweight iteration 3:  maximum difference in weights = .15321379
Biweight iteration 4:  maximum difference in weights = .00973214

Robust regression estimates                            Number of obs =      50
F(  4,    45) =   35.25
Prob > F      =  0.0000

------------------------------------------------------------------------------
murder |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
pctmetro |   .0535439   .0146555     3.65   0.001     .0240262    .0830615
poverty |    .182561   .1259505     1.45   0.154    -.0711163    .4362383
pcths |  -.2245853   .0863452    -2.60   0.013    -.3984936   -.0506771
single |   1.392942   .2355845     5.91   0.000     .9184503    1.867434
_cons |   2.888033   7.945302     0.36   0.718    -13.11463    18.89069
------------------------------------------------------------------------------


The results are consistent for poverty and for single: poverty was not significant in either analysis and single was significant in both. However, pctmetro and pcths were not significant in the OLS analysis but were significant in the robust regression analysis.

Let's look at the weights used in the robust regression to further understand why the results were so different. Note that the weight for dc is missing (.), meaning that it was eliminated from the analysis entirely (because it had such a high residual). Also, ri received a weight of less than 0.5.

hilo rrwt state
10 lowest and highest observations on rrwt

      rrwt      state
 .46982663         ri
 .62949383         md
   .716977         nm
 .73472243         ma
 .74565543         mo
 .75750112         la
 .79708217         ky
 .82324958         ks
 .82552144         de
 .82728266         il

      rrwt      state
 .99592844         sd
 .99639177         pa
 .99799356         fl
 .99811845         vt
 .99838103         ga
 .99863411         nh
 .99981867         wy
 .99986937         nd
 .99991851         ok
         .         dc
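The weights above come from iteratively reweighted least squares. The following toy sketch (synthetic data with one planted outlier; Huber weights only, with the standard 1.345 tuning constant, whereas rreg also runs biweight iterations with its own constants) shows how a gross outlier is driven toward a small weight while ordinary points keep a weight near 1:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 60
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.3, size=n)
y[0] += 15.0                                  # plant one gross outlier
X = np.column_stack([np.ones(n), x])

beta = np.linalg.lstsq(X, y, rcond=None)[0]   # start from plain OLS
w = np.ones(n)
for _ in range(20):
    e = y - X @ beta
    s = np.median(np.abs(e)) / 0.6745         # robust scale estimate (MAD)
    u = np.abs(e) / (1.345 * s)               # standardized residual vs. Huber cutoff
    w = np.where(u <= 1.0, 1.0, 1.0 / u)      # downweight large residuals
    sw = np.sqrt(w)                           # weighted least squares refit
    beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]

print(w[0])                                   # the planted outlier's weight is small
```

The robust fit recovers the true slope despite the outlier, much as rreg recovers sensible coefficients despite dc.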

In our analyses in chapter 2 (involving different variables) we found dc to be a very serious outlier and decided that it should be excluded because it is not a state. If we investigated these variables further we might reach the same conclusion and decide that dc should be excluded. If we did, we could try using OLS regression as shown below. These results are quite similar to the rreg results. The benefit of rreg is that it deals not only with serious problems (like dc being a very bad outlier) but with minor problems as well.

regress murder pctmetro poverty pcths single if state != "dc"
      Source |       SS       df       MS              Number of obs =      50
-------------+------------------------------           F(  4,    45) =   39.88
Model |  606.611746     4  151.652936           Prob > F      =  0.0000
Residual |  171.137027    45  3.80304505           R-squared     =  0.7800
Total |  777.748773    49  15.8724239           Root MSE      =  1.9501

------------------------------------------------------------------------------
murder |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
pctmetro |   .0534333    .013795     3.87   0.000     .0256488    .0812178
poverty |   .2237151   .1185554     1.89   0.066    -.0150679     .462498
pcths |  -.1938711   .0812756    -2.39   0.021    -.3575685   -.0301737
single |   1.388337   .2217525     6.26   0.000     .9417051     1.83497
_cons |  -.0044014   7.478803    -0.00   1.000    -15.06748    15.05868
------------------------------------------------------------------------------


Let's try running the results using qreg and compare them with rreg.

qreg murder pctmetro poverty pcths single
Iteration  1:  WLS sum of weighted deviations =  187.90652

Iteration  1: sum of abs. weighted deviations =  177.16784
Iteration  2: sum of abs. weighted deviations =  167.01302
Iteration  3: sum of abs. weighted deviations =  128.40282
Iteration  4: sum of abs. weighted deviations =  125.28249
Iteration  5: sum of abs. weighted deviations =    124.226
Iteration  6: sum of abs. weighted deviations =  122.93248
Iteration  7: sum of abs. weighted deviations =   122.6427
Iteration  8: sum of abs. weighted deviations =  122.40488
Iteration  9: sum of abs. weighted deviations =  122.03476
Iteration 10: sum of abs. weighted deviations =  122.03096

Median regression                                    Number of obs =        51
Raw sum of deviations    235.3 (about 6.8000002)
Min sum of deviations  122.031                     Pseudo R2     =    0.4814

------------------------------------------------------------------------------
murder |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
pctmetro |   .0527879   .0226177     2.33   0.024     .0072608     .098315
poverty |   .0908506   .1831176     0.50   0.622    -.2777461    .4594473
pcths |  -.2686652   .1284197    -2.09   0.042    -.5271606   -.0101697
single |   1.796151   .2859057     6.28   0.000     1.220652    2.371649
_cons |   3.524669   11.34322     0.31   0.757    -19.30806    26.35739
------------------------------------------------------------------------------

rreg murder pctmetro poverty pcths single
   Huber iteration 1:  maximum difference in weights = .44857261
Huber iteration 2:  maximum difference in weights = .0399983
Biweight iteration 3:  maximum difference in weights = .15321379
Biweight iteration 4:  maximum difference in weights = .00973214

Robust regression estimates                            Number of obs =      50
F(  4,    45) =   35.25
Prob > F      =  0.0000

------------------------------------------------------------------------------
murder |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
pctmetro |   .0535439   .0146555     3.65   0.001     .0240262    .0830615
poverty |    .182561   .1259505     1.45   0.154    -.0711163    .4362383
pcths |  -.2245853   .0863452    -2.60   0.013    -.3984936   -.0506771
single |   1.392942   .2355845     5.91   0.000     .9184503    1.867434
_cons |   2.888033   7.945302     0.36   0.718    -13.11463    18.89069
------------------------------------------------------------------------------


While the coefficients do not always match up, the variables that were significant in the qreg are also significant in the rreg, and likewise for the non-significant variables. Even though these techniques use different strategies for resisting the influence of very deviant observations, they both arrive at the same conclusions regarding which variables are significantly related to murder, although they do not always agree on the strength of the relationship, i.e., the size of the coefficients.
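qreg fits median regression, minimizing the sum of absolute deviations rather than the sum of squared deviations. A quick numpy check (constant-only model, synthetic data) that this criterion really is minimized at the median:

```python
import numpy as np

rng = np.random.default_rng(4)
y = rng.normal(loc=5.0, size=501)

# qreg's criterion for a constant-only model: sum of absolute deviations.
# Evaluate it on a fine grid of candidate constants.
grid = np.linspace(y.min(), y.max(), 2001)
sad = np.abs(y[:, None] - grid[None, :]).sum(axis=0)
best = grid[np.argmin(sad)]

print(best, np.median(y))
```

The minimizer coincides with the sample median (up to grid resolution), which is why observations with huge residuals, like dc, pull a median regression far less than they pull OLS.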

2. Using the elemapi2 data file (use http://www.ats.ucla.edu/stat/stata/webbooks/reg/elemapi2 ) pretend that 550 is the lowest score that a school could achieve on api00, i.e., create a new variable with the api00 score and recode it such that any score of 550 or below becomes 550. Use meals, ell and emer to predict api scores using 1) OLS to predict the original api score (before recoding) 2) OLS to predict the recoded score where 550 was the lowest value, and 3) using tobit to predict the recoded api score indicating the lowest value is 550. Compare the results of these analyses.

First, we will use the elemapi2 data file and create the recoded version of the api score where the lowest value is 550. We will call this value api00x.

use http://www.ats.ucla.edu/stat/stata/webbooks/reg/elemapi2 , clear
gen api00x = api00
replace api00x = 550 if api00 <= 550


Analysis 1. Now, we will run an OLS regression on the un-recoded version of api.

regress api00 meals ell emer
      Source |       SS       df       MS              Number of obs =     400
-------------+------------------------------           F(  3,   396) =  673.00
Model |  6749782.75     3  2249927.58           Prob > F      =  0.0000
Residual |  1323889.25   396  3343.15467           R-squared     =  0.8360
Total |  8073672.00   399  20234.7669           Root MSE      =   57.82

------------------------------------------------------------------------------
api00 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
meals |  -3.159189   .1497371   -21.10   0.000    -3.453568   -2.864809
ell |  -.9098732   .1846442    -4.93   0.000    -1.272878   -.5468678
emer |  -1.573496    .293112    -5.37   0.000    -2.149746   -.9972456
_cons |   886.7033    6.25976   141.65   0.000     874.3967    899.0098
------------------------------------------------------------------------------


Analysis 2. Now, we run an OLS regression on the recoded version of api.

regress api00x meals ell emer
      Source |       SS       df       MS              Number of obs =     400
-------------+------------------------------           F(  3,   396) =  682.88
Model |  4567355.46     3  1522451.82           Prob > F      =  0.0000
Residual |  882862.941   396  2229.45187           R-squared     =  0.8380
Total |  5450218.40   399  13659.6952           Root MSE      =  47.217

------------------------------------------------------------------------------
api00x |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
meals |  -3.010788   .1222786   -24.62   0.000    -3.251184   -2.770392
ell |  -.3034092   .1507844    -2.01   0.045    -.5998472   -.0069713
emer |  -.7484733   .2393616    -3.13   0.002    -1.219052    -.277895
_cons |     869.31   5.111854   170.06   0.000     859.2602    879.3597
------------------------------------------------------------------------------


Analysis 3. And we use tobit to perform the analysis indicating that the lowest value possible was 550.

tobit api00x meals ell emer  , ll(550)
Tobit estimates                                   Number of obs   =        400
LR chi2(3)      =     660.74
Prob > chi2     =     0.0000
Log likelihood = -1581.8117                       Pseudo R2       =     0.1728

------------------------------------------------------------------------------
api00x |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
meals |  -3.145065   .1595799   -19.71   0.000    -3.458792   -2.831337
ell |  -.8633529    .212474    -4.06   0.000    -1.281068   -.4456381
emer |  -1.470878   .3361215    -4.38   0.000    -2.131678   -.8100772
_cons |   885.2395   6.372871   138.91   0.000     872.7107    897.7683
-------------+----------------------------------------------------------------
_se |   57.12718   2.473494           (Ancillary parameter)
------------------------------------------------------------------------------

Obs. summary:        122  left-censored observations at api00x<=550
                     278     uncensored observations

First, let's compare analyses 1 and 2. When the range of api was restricted in analysis 2, the size of the coefficients dropped due to the restriction in range of the api scores. For example, the coefficient for ell dropped from -.91 to -.30 and its p value rose to 0.045, going from clearly significant to only marginally so. Let's see how well the tobit analysis compensated for the restriction in range by comparing analyses 1 and 3. The coefficients are quite similar in these two analyses. The standard errors are slightly larger in the tobit analysis, leading the t values to be somewhat smaller. Nevertheless, the tobit estimates are much closer to the mark than the second OLS analysis on the recoded data.
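Tobit handles the censoring through the likelihood: uncensored observations contribute the usual normal density, while censored observations contribute the probability of falling at or below the limit. A sketch of that likelihood, maximized on synthetic data with scipy (hypothetical parameters; not Stata's implementation):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(5)
n = 1000
x = rng.normal(size=n)
y_star = 2.0 + 1.5 * x + rng.normal(size=n)   # latent outcome
limit = 1.0
y = np.maximum(y_star, limit)                 # left-censoring, as in the api00x recode
censored = y_star <= limit
X = np.column_stack([np.ones(n), x])

def negloglik(theta):
    b, s = theta[:2], np.exp(theta[2])        # log-sigma keeps sigma positive
    mu = X @ b
    ll = np.where(censored,
                  norm.logcdf((limit - mu) / s),  # censored: P(y* <= limit)
                  norm.logpdf(y, mu, s))          # uncensored: normal density
    return -ll.sum()

x0 = np.append(np.linalg.lstsq(X, y, rcond=None)[0], 0.0)
res = minimize(negloglik, x0, method="BFGS")
b0, b1, sigma = res.x[0], res.x[1], np.exp(res.x[2])
print(b0, b1, sigma)
```

OLS on the censored y would understate the slope; the censored likelihood recovers the latent-scale coefficients, mirroring how tobit recovered estimates close to analysis 1.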

3. Using the elemapi2 data file (use http://www.ats.ucla.edu/stat/stata/webbooks/reg/elemapi2 ) pretend that only schools with api scores of 550 or higher were included in the sample. Use meals ell and emer to predict api scores using 1) OLS to predict api from the full set of observations, 2) OLS to predict api using just the observations with api scores of 550 or higher, and 3) using truncreg to predict api using just the observations where api is 550 or higher. Compare the results of these analyses.

First, we use the elemapi2 data file and run the analysis on the complete data.

use http://www.ats.ucla.edu/stat/stata/webbooks/reg/elemapi2, clear

Analysis 1 using all of the data.

regress api00 meals ell emer
      Source |       SS       df       MS              Number of obs =     400
-------------+------------------------------           F(  3,   396) =  673.00
Model |  6749782.75     3  2249927.58           Prob > F      =  0.0000
Residual |  1323889.25   396  3343.15467           R-squared     =  0.8360
Total |  8073672.00   399  20234.7669           Root MSE      =   57.82

------------------------------------------------------------------------------
api00 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
meals |  -3.159189   .1497371   -21.10   0.000    -3.453568   -2.864809
ell |  -.9098732   .1846442    -4.93   0.000    -1.272878   -.5468678
emer |  -1.573496    .293112    -5.37   0.000    -2.149746   -.9972456
_cons |   886.7033    6.25976   141.65   0.000     874.3967    899.0098
------------------------------------------------------------------------------


Now let's keep just the schools with api scores of 550 or higher for the next 2 analyses.

keep if api00 >= 550
(122 observations deleted)


Analysis 2 using OLS on just the schools with api scores of 550 or higher.

regress api00 meals ell emer
      Source |       SS       df       MS              Number of obs =     278
-------------+------------------------------           F(  3,   274) =  292.55
Model |  2268727.43     3  756242.478           Prob > F      =  0.0000
Residual |  708297.044   274  2585.02571           R-squared     =  0.7621
Total |  2977024.48   277  10747.3808           Root MSE      =  50.843

------------------------------------------------------------------------------
api00 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
meals |  -2.798288   .1600331   -17.49   0.000    -3.113339   -2.483238
ell |  -.3584496   .2315111    -1.55   0.123    -.8142161    .0973169
emer |  -.9417814   .3547208    -2.65   0.008    -1.640106   -.2434569
_cons |    868.222   5.880858   147.64   0.000     856.6446    879.7994
------------------------------------------------------------------------------


Analysis 3 using truncreg on just the schools with api scores of 550 or higher.

truncreg api00 meals ell emer  , ll(550)
(note: 0 obs. truncated)

Fitting full model:

Iteration 0:   log likelihood = -1467.4296
Iteration 1:   log likelihood = -1460.6163
Iteration 2:   log likelihood = -1460.3638
Iteration 3:   log likelihood = -1460.3636
Iteration 4:   log likelihood = -1460.3636

Truncated regression
Limit:   lower =        550                             Number of obs =    278
upper =       +inf                             Wald chi2(3)  = 634.48
Log likelihood = -1460.3636                             Prob > chi2   = 0.0000

------------------------------------------------------------------------------
api00 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
eq1          |
meals |   -2.90758   .1872438   -15.53   0.000    -3.274571   -2.540589
ell |  -.8212468   .2983573    -2.75   0.006    -1.406016   -.2364771
emer |  -1.446235   .4549632    -3.18   0.001    -2.337946   -.5545233
_cons |   879.4212   6.595712   133.33   0.000     866.4939    892.3486
-------------+----------------------------------------------------------------
sigma        |
_cons |   53.34897   2.545858    20.96   0.000     48.35918    58.33876
------------------------------------------------------------------------------


Let's first compare the results of analysis 1 with analysis 2. When the schools with api scores of less than 550 are omitted, the coefficient for ell drops from -.91 to -.36 and is no longer statistically significant. The coefficients for meals and emer remain significant, although they both drop as well.

Now, let's compare analysis 3 using truncreg with the original OLS analysis of the complete data. In both of these analyses, all of the variables are significant and the coefficients are quite similar, although the standard errors are larger in the truncreg analysis. The truncreg did a pretty good job of recovering the coefficients of the complete sample based only on the restricted sample.
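Truncation is handled differently from censoring: the observations below the limit are simply absent, so the likelihood renormalizes the normal density by the probability of being observed at all. A scipy sketch on synthetic data (hypothetical parameters; not Stata's implementation):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(6)
n = 4000
x = rng.normal(size=n)
y = 2.0 + 1.5 * x + rng.normal(size=n)
limit = 1.0
keep = y >= limit                             # truncation: low outcomes never sampled
x, y = x[keep], y[keep]
X = np.column_stack([np.ones(len(y)), x])

def negloglik(theta):
    b, s = theta[:2], np.exp(theta[2])
    mu = X @ b
    # Truncated-normal density: normal density renormalized by P(y >= limit | x).
    return -(norm.logpdf(y, mu, s) - norm.logsf((limit - mu) / s)).sum()

x0 = np.append(np.linalg.lstsq(X, y, rcond=None)[0], 0.0)
res = minimize(negloglik, x0, method="BFGS")
b0, b1, sigma = res.x[0], res.x[1], np.exp(res.x[2])
print(b0, b1, sigma)
```

As with truncreg above, the truncated likelihood recovers coefficients close to those of the full, untruncated sample.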

4. Using the hsb2 data file (use http://www.ats.ucla.edu/stat/stata/webbooks/reg/hsb2 ) predict read from science, socst, math and write. Use the testparm and test commands to test the equality of the coefficients for science, socst and math. Use cnsreg to estimate a model where these three parameters are equal.

We start by using the hsb2 data file.

use http://www.ats.ucla.edu/stat/stata/webbooks/reg/hsb2 , clear
(highschool and beyond (200 cases))


We first run an ordinary regression predicting read from science, socst, math and write.

regress read science socst math  write
      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  4,   195) =   69.74
Model |  12312.7853     4  3078.19634           Prob > F      =  0.0000
Residual |  8606.63466   195   44.136588           R-squared     =  0.5886
Total |    20919.42   199  105.122714           Root MSE      =  6.6435

------------------------------------------------------------------------------
read |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
science |   .2736751    .064369     4.25   0.000     .1467263    .4006238
socst |    .273267   .0574246     4.76   0.000      .160014      .38652
math |   .3028976    .072581     4.17   0.000     .1597532     .446042
write |   .1104172   .0713398     1.55   0.123    -.0302795    .2511139
_cons |   1.946078   3.087346     0.63   0.529    -4.142797    8.034954
------------------------------------------------------------------------------


We use the testparm command to test that the coefficients for science, socst and math are equal.

testparm science socst math, equal
 ( 1) - science + socst = 0.0
( 2) - science + math = 0.0

F(  2,   195) =    0.05
Prob > F =    0.9554

We can also use the test command to test that the coefficients for science, socst and math are equal.

test science=socst
 ( 1)  science - socst = 0.0

F(  1,   195) =    0.00
Prob > F =    0.9964
test socst=math, accum
 ( 1)  science - socst = 0.0
( 2)  socst - math = 0.0

F(  2,   195) =    0.05
Prob > F =    0.9554

We now constrain these three coefficients to be equal.

constraint define 1 science = socst
constraint define 2 socst = math

And we use cnsreg to estimate the model with these constraints in place.

cnsreg read science socst math write, c(1 2)

Constrained linear regression                          Number of obs =     200
F(  2,   197) =  140.80
Prob > F      =  0.0000
Root MSE      =  6.6113
( 1)  science - socst = 0.0
( 2)  socst - math = 0.0
------------------------------------------------------------------------------
read |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
science |   .2828596   .0268291    10.54   0.000     .2299505    .3357687
socst |   .2828596   .0268291    10.54   0.000     .2299505    .3357687
math |   .2828596   .0268291    10.54   0.000     .2299505    .3357687
write |   .1106022   .0708452     1.56   0.120      -.02911    .2503145
_cons |   2.012299   3.061703     0.66   0.512    -4.025622     8.05022
------------------------------------------------------------------------------
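The equality test and the constrained fit are two views of the same computation: the Wald F built from the restriction matrix equals the F from comparing restricted and unrestricted residual sums of squares, and the restricted fit can be obtained by summing the constrained predictors into a single regressor. A numpy sketch with synthetic data (hypothetical coefficients, chosen to satisfy the equality):

```python
import numpy as np

rng = np.random.default_rng(7)
n, k = 200, 5                      # intercept + 4 slopes, as in the model above
X = np.column_stack([np.ones(n), rng.normal(size=(n, 4))])
y = X @ np.array([1.0, 0.3, 0.3, 0.3, 0.1]) + rng.normal(size=n)

# Unrestricted OLS
XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
rss_u = np.sum((y - X @ b) ** 2)
s2 = rss_u / (n - k)

# H0: b1 = b2 = b3, written as R b = 0 (two restrictions)
R = np.array([[0, 1, -1, 0, 0],
              [0, 1, 0, -1, 0]], dtype=float)
q = R.shape[0]
Rb = R @ b
F_wald = Rb @ np.linalg.solve(s2 * (R @ XtX_inv @ R.T), Rb) / q

# Same test via the restricted fit: under H0 the three predictors
# share one coefficient, so they collapse into a single summed regressor.
Xr = np.column_stack([X[:, 0], X[:, 1] + X[:, 2] + X[:, 3], X[:, 4]])
br = np.linalg.lstsq(Xr, y, rcond=None)[0]
rss_r = np.sum((y - Xr @ br) ** 2)
F_rss = ((rss_r - rss_u) / q) / (rss_u / (n - k))

print(F_wald, F_rss)
```

The two F statistics agree exactly, which is why testparm and the cnsreg-style restricted fit tell a single consistent story.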

5. Using the elemapi2 data file (use http://www.ats.ucla.edu/stat/stata/webbooks/reg/elemapi2 ) consider the following 2 regression equations.

api00 = meals ell emer
api99 = meals ell emer 

Estimate the coefficients for these predictors in predicting api00 and api99 taking into account the non-independence of the schools. Test the overall contribution of each of the predictors in jointly predicting api scores in these two years. Test whether the contribution of emer is the same for api00 and api99.

First, let's use the elemapi2 data file.

use http://www.ats.ucla.edu/stat/stata/webbooks/reg/elemapi2, clear

Next, let's analyze these equations separately.

regress api00 meals ell emer
      Source |       SS       df       MS              Number of obs =     400
-------------+------------------------------           F(  3,   396) =  673.00
Model |  6749782.75     3  2249927.58           Prob > F      =  0.0000
Residual |  1323889.25   396  3343.15467           R-squared     =  0.8360
Total |  8073672.00   399  20234.7669           Root MSE      =   57.82

------------------------------------------------------------------------------
api00 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
meals |  -3.159189   .1497371   -21.10   0.000    -3.453568   -2.864809
ell |  -.9098732   .1846442    -4.93   0.000    -1.272878   -.5468678
emer |  -1.573496    .293112    -5.37   0.000    -2.149746   -.9972456
_cons |   886.7033    6.25976   141.65   0.000     874.3967    899.0098
------------------------------------------------------------------------------
regress api99 meals ell emer
      Source |       SS       df       MS              Number of obs =     400
-------------+------------------------------           F(  3,   396) =  716.31
Model |  7293890.24     3  2431296.75           Prob > F      =  0.0000
Residual |  1344092.70   396  3394.17349           R-squared     =  0.8444
Total |  8637982.94   399    21649.08           Root MSE      =   58.26

------------------------------------------------------------------------------
api99 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
meals |  -3.412388   .1508754   -22.62   0.000    -3.709004   -3.115771
ell |   -.793822   .1860477    -4.27   0.000    -1.159587   -.4280573
emer |  -1.516305   .2953401    -5.13   0.000    -2.096936   -.9356748
_cons |    860.191   6.307343   136.38   0.000     847.7909     872.591
------------------------------------------------------------------------------

Now, let's analyze them using sureg, which takes into account the non-independence of these equations.

sureg (api00 api99 = meals ell emer)
Seemingly unrelated regression
----------------------------------------------------------------------
Equation          Obs  Parms        RMSE    "R-sq"       chi2        P
----------------------------------------------------------------------
api00             400      3    57.53019    0.8360    2039.38   0.0000
api99             400      3    57.96751    0.8444   2170.651   0.0000
----------------------------------------------------------------------

------------------------------------------------------------------------------
|      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
api00        |
meals |  -3.159189   .1489866   -21.20   0.000    -3.451197    -2.86718
ell |  -.9098732   .1837186    -4.95   0.000    -1.269955   -.5497913
emer |  -1.573496   .2916428    -5.40   0.000    -2.145105   -1.001886
_cons |   886.7033   6.228382   142.36   0.000     874.4959    898.9107
-------------+----------------------------------------------------------------
api99        |
meals |  -3.412388   .1501191   -22.73   0.000    -3.706616    -3.11816
ell |   -.793822   .1851151    -4.29   0.000    -1.156641    -.431003
emer |  -1.516305   .2938597    -5.16   0.000     -2.09226   -.9403509
_cons |    860.191   6.275727   137.07   0.000     847.8908    872.4912
------------------------------------------------------------------------------
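Notice that the sureg point estimates match the separate OLS fits. That is expected: when every equation has identical regressors, the GLS step in seemingly unrelated regression reduces to equation-by-equation OLS (the gain from sureg here is in the standard errors and in enabling cross-equation tests). A numpy sketch with synthetic correlated-error data checks this:

```python
import numpy as np

rng = np.random.default_rng(8)
n = 150
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
# Two outcomes with correlated errors (as with api00/api99 for the same schools).
E = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.7], [0.7, 1.0]], size=n)
y1 = X @ np.array([1.0, 2.0, -1.0]) + E[:, 0]
y2 = X @ np.array([0.5, 1.5, -2.0]) + E[:, 1]

# Equation-by-equation OLS
b1 = np.linalg.lstsq(X, y1, rcond=None)[0]
b2 = np.linalg.lstsq(X, y2, rcond=None)[0]

# One feasible-GLS (SUR) step on the stacked system
res = np.column_stack([y1 - X @ b1, y2 - X @ b2])
Sigma = res.T @ res / n                       # estimated error covariance
Xs = np.kron(np.eye(2), X)                    # block-diagonal stacked design
ys = np.concatenate([y1, y2])
W = np.kron(np.linalg.inv(Sigma), np.eye(n))  # GLS weighting matrix
b_sur = np.linalg.solve(Xs.T @ W @ Xs, Xs.T @ W @ ys)

# With identical regressors in every equation, SUR reproduces OLS exactly.
print(np.allclose(b_sur, np.concatenate([b1, b2])))
```

The same algebra explains why the sureg coefficients above are identical to the regress coefficients, while the standard errors differ slightly.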

We can test the contribution of meals, ell, and emer as shown below.

test meals
 ( 1)  [api00]meals = 0.0
( 2)  [api99]meals = 0.0

chi2(  2) =  518.30
Prob > chi2 =    0.0000

test ell
 ( 1)  [api00]ell = 0.0
( 2)  [api99]ell = 0.0

chi2(  2) =   24.80
Prob > chi2 =    0.0000
test emer
 ( 1)  [api00]emer = 0.0
( 2)  [api99]emer = 0.0

chi2(  2) =   29.48
Prob > chi2 =    0.0000

We can test whether the coefficients for emer were the same in predicting api00 and api99 as shown below.

test [api00]emer = [api99]emer

( 1)  [api00]emer - [api99]emer = 0.0

chi2(  1) =    0.21
Prob > chi2 =    0.6456

We can also test the contribution of meals, ell, and emer using more traditional multivariate tests via the mvreg and mvtest commands, as shown below.

mvreg api00 api99 = meals ell emer

Equation          Obs  Parms        RMSE    "R-sq"          F        P
----------------------------------------------------------------------
api00             400      4    57.82002    0.8360   672.9954   0.0000
api99             400      4    58.25954    0.8444   716.3148   0.0000

------------------------------------------------------------------------------
|      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
api00        |
meals |  -3.159189   .1497371   -21.10   0.000    -3.453568   -2.864809
ell |  -.9098732   .1846442    -4.93   0.000    -1.272878   -.5468678
emer |  -1.573496    .293112    -5.37   0.000    -2.149746   -.9972456
_cons |   886.7033    6.25976   141.65   0.000     874.3967    899.0098
-------------+----------------------------------------------------------------
api99        |
meals |  -3.412388   .1508754   -22.62   0.000    -3.709004   -3.115771
ell |   -.793822   .1860477    -4.27   0.000    -1.159587   -.4280573
emer |  -1.516305   .2953401    -5.13   0.000    -2.096936   -.9356748
_cons |    860.191   6.307343   136.38   0.000     847.7909     872.591
------------------------------------------------------------------------------

Below we show the multivariate tests for meals, ell, and emer.

mvtest meals

MULTIVARIATE TESTS OF SIGNIFICANCE

Multivariate Test Criteria and Exact F Statistics for
the Hypothesis of no Overall "meals" Effect(s)

S=1    M=0    N=196.5

Test                          Value          F       Num DF     Den DF   Pr > F
Wilks' Lambda              0.43558762   255.9105          2   395.0000   0.0000
Pillai's Trace             0.56441238   255.9105          2   395.0000   0.0000
Hotelling-Lawley Trace     1.29574936   255.9105          2   395.0000   0.0000

mvtest ell
MULTIVARIATE TESTS OF SIGNIFICANCE

Multivariate Test Criteria and Exact F Statistics for
the Hypothesis of no Overall "ell" Effect(s)

S=1    M=0    N=196.5

Test                          Value          F       Num DF     Den DF   Pr > F
Wilks' Lambda              0.94161436    12.2462          2   395.0000   0.0000
Pillai's Trace             0.05838564    12.2462          2   395.0000   0.0000
Hotelling-Lawley Trace     0.06200590    12.2462          2   395.0000   0.0000

mvtest emer

MULTIVARIATE TESTS OF SIGNIFICANCE

Multivariate Test Criteria and Exact F Statistics for
the Hypothesis of no Overall "emer" Effect(s)

S=1    M=0    N=196.5

Test                          Value          F       Num DF     Den DF   Pr > F
Wilks' Lambda              0.93136794    14.5537          2   395.0000   0.0000
Pillai's Trace             0.06863206    14.5537          2   395.0000   0.0000
Hotelling-Lawley Trace     0.07368952    14.5537          2   395.0000   0.0000

