Weighted least squares (WLS) provides one method for dealing with heteroscedasticity. The user-written wls0 command can be used to compute various WLS solutions. You can download wls0 over the internet by typing findit wls0 (see "How can I use the findit command to search for programs and get additional help?" for more information about using findit).
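As a small sketch of the download step, the commands below search for the package and then check that Stata can locate it. The which command is standard Stata; that wls0 installs as an ado-file on your ado-path is an assumption.

```stata
* Search for the user-written wls0 package, then follow the install link shown
findit wls0

* After installing, confirm that Stata can find the command on the ado-path
which wls0
```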
Let's use an example dataset that exhibits heteroscedasticity, hetdata.
use http://www.ats.ucla.edu/stat/stata/ado/analysis/hetdata
regress exp age ownrent income incomesq
      Source |       SS       df       MS              Number of obs =      72
-------------+------------------------------           F(  4,    67) =    5.39
       Model |  1749357.01     4  437339.252           Prob > F      =  0.0008
    Residual |  5432562.03    67  81083.0153           R-squared     =  0.2436
-------------+------------------------------           Adj R-squared =  0.1984
       Total |  7181919.03    71  101153.789           Root MSE      =  284.75
------------------------------------------------------------------------------
         exp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -3.081814   5.514717    -0.56   0.578    -14.08923    7.925606
     ownrent |   27.94091   82.92232     0.34   0.737    -137.5727    193.4546
      income |    234.347   80.36595     2.92   0.005     73.93593    394.7581
    incomesq |  -14.99684   7.469337    -2.01   0.049     -29.9057   -.0879859
       _cons |  -237.1465   199.3517    -1.19   0.238    -635.0541    160.7611
------------------------------------------------------------------------------
rvpplot income, ylab yline(0)

The residual versus income plot shows clear evidence of heteroscedasticity. Let's try a WLS weighting based on income. The WLS type abse builds the weights from the absolute value of the residuals, and in this case the noconst option requests no constant.
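To make the idea behind this weighting concrete, here is a minimal hand-rolled sketch of one common way to build weights from absolute residuals. It assumes hetdata is already in memory. The specific steps (regressing the absolute residuals on income without a constant, then weighting by the inverse squared fitted values) are an assumption about how such a scheme is typically implemented, not a confirmed description of wls0's internals, and the variable names e_ols, abs_e, sd_hat and w_abse are made up for illustration.

```stata
* Step 1: ordinary regression, saving the residuals
quietly regress exp age ownrent income incomesq
predict double e_ols, residuals

* Step 2: model the absolute residuals as a function of income (no constant)
generate double abs_e = abs(e_ols)
quietly regress abs_e income, noconstant
predict double sd_hat, xb

* Step 3: weight each observation by the inverse of its estimated variance
* (assumed weighting formula; wls0 may differ in detail)
generate double w_abse = 1/sd_hat^2
regress exp age ownrent income incomesq [aweight = w_abse]
```

The results need not match the wls0 output exactly if its internals differ from this sketch. The wls0 call for this weighting is: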
wls0 exp age ownrent income incomesq, wvar(income) type(abse) noconst graph
WLS regression - type: proportional to abs(e)
(sum of wgt is 5.7161e-01)
      Source |       SS       df       MS              Number of obs =      72
-------------+------------------------------           F(  4,    67) =    5.73
       Model |  1266234.75     4  316558.686           Prob > F      =  0.0005
    Residual |  3703808.10    67  55280.7179           R-squared     =  0.2548
-------------+------------------------------           Adj R-squared =  0.2103
       Total |  4970042.85    71  70000.6035           Root MSE      =  235.12
------------------------------------------------------------------------------
         exp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -2.935011   4.603331    -0.64   0.526     -12.1233    6.253276
     ownrent |   50.49364   69.87914     0.72   0.472     -88.9857     189.973
      income |   202.1694   76.78152     2.63   0.010     48.91285     355.426
    incomesq |  -12.11364    8.27314    -1.46   0.148    -28.62689     4.39962
       _cons |  -181.8706   165.5191    -1.10   0.276    -512.2481    148.5068
------------------------------------------------------------------------------

The residual plot is better. We can try other possibilities, such as weighting based on both income and income squared.
wls0 exp age ownrent income incomesq, wvar(income incomesq) type(abse) noconst graph
WLS regression - type: proportional to abs(e)
(sum of wgt is 4.3021e-01)
      Source |       SS       df       MS              Number of obs =      72
-------------+------------------------------           F(  4,    67) =    6.37
       Model |  1626419.82     4  406604.954           Prob > F      =  0.0002
    Residual |  4277725.74    67  63846.6528           R-squared     =  0.2755
-------------+------------------------------           Adj R-squared =  0.2322
       Total |  5904145.55    71  83156.9796           Root MSE      =  252.68
------------------------------------------------------------------------------
         exp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -3.038906   4.953024    -0.61   0.542    -12.92518    6.847371
     ownrent |   41.89772   75.32687     0.56   0.580    -108.4553    192.2508
      income |   214.7859   70.17436     3.06   0.003     74.71732    354.8545
    incomesq |  -13.41379   6.353738    -2.11   0.038    -26.09591   -.7316791
       _cons |  -199.6993   170.1115    -1.17   0.245    -539.2433    139.8448
------------------------------------------------------------------------------

Finally, let's try one more variation. This time we will make the adjustment proportional to the log of squared residuals.
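A hand-rolled sketch of a log-squared-residual weighting scheme may help show what this variation is doing. It assumes hetdata is already in memory. The steps (regressing the log of the squared residuals on income and incomesq, then weighting by the inverse of the exponentiated fitted values) are a textbook-style assumption rather than a confirmed description of wls0's internals, and the variable names e_ols2, ln_e2, lnvar_hat and w_loge2 are hypothetical.

```stata
* Step 1: ordinary regression, saving the residuals
quietly regress exp age ownrent income incomesq
predict double e_ols2, residuals

* Step 2: model the log of the squared residuals (constant included)
generate double ln_e2 = ln(e_ols2^2)
quietly regress ln_e2 income incomesq
predict double lnvar_hat, xb

* Step 3: exponentiate to get a strictly positive variance estimate,
* then weight by its inverse (assumed weighting formula)
generate double w_loge2 = 1/exp(lnvar_hat)
regress exp age ownrent income incomesq [aweight = w_loge2]
```

Working on the log scale guarantees positive weights, which is one reason this form is often preferred. The wls0 call for this weighting is: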
wls0 exp age ownrent income incomesq, wvar(income incomesq) type(loge2) graph
WLS regression - type: proportional to log(e^2)
(sum of wgt is 2.8166e-02)
      Source |       SS       df       MS              Number of obs =      72
-------------+------------------------------           F(  4,    67) =   69.69
       Model |  2872576.02     4  718144.005           Prob > F      =  0.0000
    Residual |  690414.759    67  10304.6979           R-squared     =  0.8062
-------------+------------------------------           Adj R-squared =  0.7947
       Total |  3562990.78    71  50182.9687           Root MSE      =  101.51
------------------------------------------------------------------------------
         exp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -1.233683   2.551197    -0.48   0.630    -6.325894    3.858527
     ownrent |   50.94976   52.81429     0.96   0.338      -54.468    156.3675
      income |   145.3045    46.3627     3.13   0.003     52.76413    237.8448
    incomesq |   -7.93828   3.736716    -2.12   0.037     -15.3968   -.4797648
       _cons |  -117.8675   101.3862    -1.16   0.249    -320.2352    84.50027
------------------------------------------------------------------------------

In addition to the weight types abse and loge2, there are types based on squared residuals (e2) and squared fitted values (xb2).
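As an illustration, the e2 type can presumably be requested the same way as the types shown earlier. The call below reuses the option syntax from the examples above; the choice of wvar() variables is arbitrary, and whether xb2 needs the same options is not confirmed here.

```stata
* Same model, with weights based on squared residuals (type e2).
* wvar(income incomesq) is an illustrative choice, not a recommendation.
wls0 exp age ownrent income incomesq, wvar(income incomesq) type(e2) graph
```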
Finding the optimal WLS solution to use involves detailed knowledge of your data and trying different combinations of variables and types of weighting.
These pages are Copyrighted (c) by UCLA Academic Technology Services