We will demonstrate regfungible using the hsbdemo dataset. We begin by loading the data and then running a regression model with three predictors.
use http://www.ats.ucla.edu/stat/data/hsbdemo, clear
regress write read math science
Source | SS df MS Number of obs = 200
-------------+------------------------------ F( 3, 196) = 57.30
Model | 8353.98999 3 2784.66333 Prob > F = 0.0000
Residual | 9524.88501 196 48.5963521 R-squared = 0.4673
-------------+------------------------------ Adj R-squared = 0.4591
Total | 17878.875 199 89.843593 Root MSE = 6.9711
------------------------------------------------------------------------------
write | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
read | .2356606 .0691053 3.41 0.001 .0993751 .3719461
math | .3194791 .0756752 4.22 0.000 .1702369 .4687213
science | .2016571 .0690962 2.92 0.004 .0653896 .3379246
_cons | 13.19155 3.068867 4.30 0.000 7.139308 19.24378
------------------------------------------------------------------------------
The R2 for this model is .4673. We want to obtain sets of standardized regression
weights that yield an R2 that is .005 lower. The original R2 will be
called RSQb, the new reduced R2 RSQa, and the difference between
them theta. Thus,
theta = RSQb - RSQa = .005, so RSQa = RSQb - theta = .4673 - .005 = .4623
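The same arithmetic can be checked in Stata right after the regression. This is a minimal sketch, assuming the regress command above is still the most recent estimation command, so that e(r2) holds its R2. The scalar names are chosen here for illustration only. The last line relies on the relation RSQa = r^2 * RSQb from Waller (2008), where r is the correlation between the two sets of predicted values, which regfungible reports as r_yhata_yhatb.

```stata
/* R2 of the model just fit by regress */
scalar RSQb = e(r2)
/* chosen decrement in R2 */
scalar theta = .005
/* target R2 for the alternate weights */
scalar RSQa = RSQb - theta
display "RSQb = " %8.7f RSQb "   RSQa = " %8.7f RSQa
/* implied correlation between the two sets of predicted values */
display "r_yhata_yhatb = " %8.7f sqrt(RSQa/RSQb)
```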
regfungible, sets(200) theta(.005)
OLS fungible regression weights analysis
Original R2: RSQb = .4672548
Reduced R2: RSQa = .4622548
theta = RSQb-RSQa = .005
r_yhata_yhatb = .9946352
Generating Alternate weights ...
Standardized OLS regression weights
1 2 3
+-------------------------------------------+
1 | .2549128629 .3157668631 .2106416581 |
+-------------------------------------------+
Maximum fungible regression weights for each variable
1 2 3
+-------------------------------------------+
1 | .3495330891 .2560737801 .1675002152 |
2 | .197326296 .4079093616 .1623860478 |
3 | .2075297825 .2678570952 .303304878 |
+-------------------------------------------+
Minimum fungible regression weights for each variable
1 2 3
+-------------------------------------------+
1 | .1548256496 .3681889758 .2498423376 |
2 | .3069399276 .2168678744 .254496415 |
3 | .2995772796 .3542006637 .1135491744 |
+-------------------------------------------+
Summary of fungible regresson weights
stats | v_1 v_2 v_3
---------+------------------------------
N | 200 200 200
mean | .2519706 .311021 .2100915
p5 | .1566942 .2185254 .1142415
p25 | .1886851 .2372786 .1403297
p50 | .2527676 .3169925 .2147582
p75 | .3173331 .3821019 .2787131
p95 | .3468746 .4054128 .3011541
----------------------------------------
The output above shows the standardized regression weights from the original model (.2549128629, .3157668631,
.2106416581), along with a summary of the new fungible weights, which were added to our data. These
new variables are labeled v_1 through v_3 by default. The prefix for these new
variables can be changed using the prefix option in the program.
Looking at the "Summary of fungible regression weights" in the output, we see the mean and the 5th, 25th, 50th, 75th and 95th percentiles for the 200 fungible weights. It is often more interesting to look at the maximum and minimum weights for each of the variables. For example, the maximum value of v_1 is .3495330891 and is associated with weights .2560737801 and .1675002152 for v_2 and v_3 respectively. These weights are rather different from the original weights. And if we look at the maximum for v_2 (.4079093616) with its associated v_1 and v_3 (.197326296, .1623860478), we see that sets of weights can be very different from each other.
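The rows of the maximum and minimum tables can also be located directly in the data. The following is a sketch rather than part of regfungible itself. It assumes the v_1 through v_3 variables created above are in memory, and the scalar names vmax and vmin are hypothetical. The first command uses the beta option of regress to display the standardized weights of the original model for comparison.

```stata
/* standardized (beta) weights from the original regression */
regress write read math science, beta
/* store the largest and smallest values of v_1 */
quietly summarize v_1
scalar vmax = r(max)
scalar vmin = r(min)
/* the set of weights in which v_1 is largest */
list v_1 v_2 v_3 if v_1 == scalar(vmax)
/* the set of weights in which v_1 is smallest */
list v_1 v_2 v_3 if v_1 == scalar(vmin)
```

The same steps can be repeated for v_2 and v_3.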
Next we will demonstrate that these weights generate R2 values equal to RSQa. We will select the weights for a case at random, say case 155. Note that the values will differ from run to run unless you use the seed option.
/* generate standardized predictors */
egen zr = std(read)
egen zm = std(math)
egen zs = std(science)
/* get fungible weights for observation 155 */
list v_1 v_2 v_3 in 155
+--------------------------------+
| v_1 v_2 v_3 |
|--------------------------------|
155. | .1799844 .3003776 .2969216 |
+--------------------------------+
/* generate predicted value, yhata */
generate yhata = .1799844*zr + .3003776*zm + .2969216*zs
/* correlate observed and predicted */
corr write yhata
(obs=200)
| write yhata
-------------+------------------
write | 1.0000
yhata | 0.6799 1.0000
display r(rho)^2
.46225479
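The check above uses weights copied by hand from the listing. As a sketch of how the same check could be repeated for several sets without retyping values, the loop below refers to the stored weights by observation number. It assumes zr, zm and zs were created as shown above. The variable name yh_tmp and the choice of observations are arbitrary.

```stata
/* repeat the R2 check for a few arbitrary sets of fungible weights */
foreach i of numlist 1 50 155 200 {
    /* v_1[`i'] is the value of v_1 in observation `i' */
    quietly generate double yh_tmp = v_1[`i']*zr + v_2[`i']*zm + v_3[`i']*zs
    quietly correlate write yh_tmp
    display "set `i': R2 = " %9.8f r(rho)^2
    drop yh_tmp
}
```

Each line of output should show an R2 equal to RSQa, up to rounding.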
Next, we will generate some graphs from the results of regfungible, beginning with a box plot of the
regression weights for each variable. Note the considerable variation
in the regression weights as well as the considerable overlap in values.
graph box v_*, scheme(lean1)
Let's look at the scatter plots of the fungible weights generated by the program for each pair of variates. We will use the graph matrix command for this.
graph matrix v_*, scheme(lean1)
We will follow the scatterplot matrix with a look at each of the univariate kernel density distributions.
forvalues i=1/3 {
kdensity v_`i', name(v_`i') scheme(lean1)
}
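Because the loop stores each density plot in memory under the names v_1 through v_3, the three plots can be placed side by side for comparison. This is an optional sketch that assumes the loop above has just been run. The rows(1) layout is one choice among several.

```stata
/* show the three stored kernel density plots in a single row */
graph combine v_1 v_2 v_3, rows(1) scheme(lean1)
```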

We will finish up by generating weights for two additional values of theta (.01 and .02) and
plotting all three sets of the first two variates on the same axes. Additionally, we will add a
marker for the actual values of the standardized regression weights for read and math.
regfungible, sets(200) theta(.01) prefix(w_)
regfungible, sets(200) theta(.02) prefix(x_)
twoway (scatter v_1 v_2)(scatter w_1 w_2, msym(oh))(scatter x_1 x_2, msym(oh)), ///
text(.2549128629 .3157668631 "+", place(c)) legend(off) scheme(lean1)

We end up with something that looks like a model of the solar system. You can see that as theta gets
smaller and smaller, the values of the fungible weights converge on the least squares regression
weights. The gaps in the "orbits" would be filled in if we generated a greater number of sets of
weights.
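The widening of the "orbits" can also be seen numerically. As a rough sketch, assuming the v_, w_ and x_ variables from the three regfungible runs above are in memory, tabstat can report the spread of the first weight for each value of theta.

```stata
/* spread of the first weight for theta = .005 (v_1), .01 (w_1) and .02 (x_1) */
tabstat v_1 w_1 x_1, statistics(min max range) columns(variables)
```

The range should grow as theta increases from .005 to .02.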
Waller, N. G. (2008). Fungible weights in multiple regression. Psychometrika, 73, 691-703.
These pages are Copyrighted (c) by UCLA Academic Technology Services