
When I run my OLS regression model with a constant I get an \(R^2\) of about 0.35
and an F-ratio around 100. When I run the same model without a constant the \(R^2\) is
0.97 and the F-ratio is over 7,000. Why are \(R^2\) and F-ratio so large for models without
a constant?

Let's begin by going over what it means to run an OLS regression without a constant (intercept). A regression without a constant implies that the regression line should run through the origin, i.e., the point where both the response variable and predictor variable equal zero. Let's look at a scatterplot that has both the regular regression line (dashed line) and a line without the constant (solid line).

As you can see, the "true" regression line is different from the no-constant line. How, then,
can the no-constant model have a larger \(R^2\) and F-ratio than the model with
a constant?

To answer this question, let's start with a review of how the \(R^2\) and F-ratio for OLS
regression models are computed.

\[
R^{2} = \frac{SS_{model}}{SS_{model} + SS_{residual}}
\]
\[
F = \frac{\left(\frac{SS_{model}}{df_{model}}\right)}{\left(\frac{SS_{residual}}{df_{residual}}\right)}
\]

Next, let's see how each of these sums of squares is defined. For these equations we will use \(\hat{Y}\) for the predicted value of the response variable Y and \(\bar{Y}\) for the mean value of Y.

\[
SS_{total} = \sum_{i = 1}^{N}(Y_i - \bar{Y})^2
\]
\[
SS_{model} = \sum_{i = 1}^{N}(\hat{Y}_i - \bar{Y})^2
\]
\[
SS_{residual} = \sum_{i = 1}^{N}(Y_i - \hat{Y}_i)^2
\]

When you run the regression without a constant in the model, you are declaring that the
expected value of Y when x is equal to 0 is 0. That is, \(E(Y | x = 0) = 0\). If this is
not the case, the values of \(\hat{Y}\) will be different, yielding different \(SS_{model}\)
and \(SS_{residual}\), and hence different \(R^2\) and F values. Typically, the sum of squares
of Y accounted for by the intercept is not included in the total sum of squares. That is,
it is in neither \(SS_{model}\) nor \(SS_{residual}\). In other words, the model is
predicting the sum of squares left over after taking out the intercept. When the intercept
(or constant term) is left off and it does not have a true zero effect, the total
sum of squares being modelled is increased, because the sums of squares are then measured
from zero rather than from \(\bar{Y}\). This tends to inflate both \(SS_{model}\) and
\(SS_{residual}\); however, \(SS_{model}\) increases relatively more than
\(SS_{residual}\), leading to the increase in \(R^2\) values.
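
The inflation described above can be reproduced on a handful of numbers. The following is a minimal sketch in plain Python, not the code any statistics package runs: the data are made up for illustration (a large intercept and a weak slope), and it assumes that the no-constant sums of squares are measured from zero, as Stata and R do for models without an intercept.

```python
# Toy data, invented for illustration: Y sits far from zero and rises slowly with x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [10.0, 12.0, 11.0, 13.0, 12.0]
n = len(y)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Model WITH a constant: sums of squares are measured from the mean of Y.
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar
fitted = [intercept + slope * xi for xi in x]
ss_model = sum((fi - y_bar) ** 2 for fi in fitted)
ss_resid = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
r2 = ss_model / (ss_model + ss_resid)
f = (ss_model / 1) / (ss_resid / (n - 2))       # 1 model df, n - 2 residual df

# Model WITHOUT a constant: the line is forced through the origin and
# (by assumption, see the lead-in) sums of squares are measured from zero.
slope0 = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)
fitted0 = [slope0 * xi for xi in x]
ss_model0 = sum(fi ** 2 for fi in fitted0)
ss_resid0 = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted0))
r2_0 = ss_model0 / (ss_model0 + ss_resid0)
f0 = (ss_model0 / 1) / (ss_resid0 / (n - 1))    # 1 model df, n - 1 residual df

print("with constant:    SS_resid = %.3f  R2 = %.4f  F = %.2f" % (ss_resid, r2, f))
print("without constant: SS_resid = %.3f  R2 = %.4f  F = %.2f" % (ss_resid0, r2_0, f0))
```

On these toy numbers the no-constant fit is clearly worse (its residual sum of squares is many times larger), yet its \(R^2\) rises from roughly 0.48 to roughly 0.86 because the total being "explained" now includes the distance of Y from zero.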

The formulas actually used to calculate \(R^2\) differ with and without an intercept. This is easy to see by running a model without the built-in intercept but with a manually included constant term. Here is some example code you can try:

    sysuse auto
    gen const = 1
    regress mpg weight

          Source |       SS       df       MS              Number of obs =      74
    -------------+------------------------------           F(  1,    72) =  134.62
           Model |   1591.9902     1   1591.9902           Prob > F      =  0.0000
        Residual |  851.469256    72  11.8259619           R-squared     =  0.6515
    -------------+------------------------------           Adj R-squared =  0.6467
           Total |  2443.45946    73  33.4720474           Root MSE      =  3.4389

    ------------------------------------------------------------------------------
             mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          weight |  -.0060087   .0005179   -11.60   0.000    -.0070411   -.0049763
           _cons |   39.44028   1.614003    24.44   0.000     36.22283    42.65774
    ------------------------------------------------------------------------------

    regress mpg const weight, noconstant

          Source |       SS       df       MS              Number of obs =      74
    -------------+------------------------------           F(  2,    72) = 1486.41
           Model |  35156.5307     2  17578.2654           Prob > F      =  0.0000
        Residual |  851.469256    72  11.8259619           R-squared     =  0.9764
    -------------+------------------------------           Adj R-squared =  0.9757
           Total |       36008    74  486.594595           Root MSE      =  3.4389

    ------------------------------------------------------------------------------
             mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
           const |   39.44028   1.614003    24.44   0.000     36.22283    42.65774
          weight |  -.0060087   .0005179   -11.60   0.000    -.0070411   -.0049763
    ------------------------------------------------------------------------------

    regress mpg weight, noconstant
    * note change in total SS between plain regress and without constant
    * but total SS is the same for without constant and with const
    * when using manual intercept, intercept SS included in model

          Source |       SS       df       MS              Number of obs =      74
    -------------+------------------------------           F(  1,    73) =  259.18
           Model |  28094.8545     1  28094.8545           Prob > F      =  0.0000
        Residual |  7913.14549    73  108.399253           R-squared     =  0.7802
    -------------+------------------------------           Adj R-squared =  0.7772
           Total |       36008    74  486.594595           Root MSE      =  10.411

    ------------------------------------------------------------------------------
             mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
          weight |    .006252   .0003883    16.10   0.000     .0054781     .007026
    ------------------------------------------------------------------------------
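
The Stata totals above can be recombined by hand to see where the extra sum of squares comes from. The arithmetic below is a worked sketch that uses only the numbers printed in the three tables; the term \(N\bar{Y}^2\) is not printed by Stata and is obtained here by subtraction.

\[
\sum_{i = 1}^{N} Y_i^2 = \sum_{i = 1}^{N}(Y_i - \bar{Y})^2 + N\bar{Y}^2
\quad\Rightarrow\quad
36008 = 2443.45946 + 33564.54054
\]
\[
SS_{model}\ (\text{manual constant}) = 1591.9902 + 33564.5405 = 35156.5307
\]
\[
R^2 = \frac{1591.9902}{2443.45946} \approx 0.6515, \qquad
R^2 = \frac{35156.5307}{36008} \approx 0.9764, \qquad
R^2 = \frac{28094.8545}{36008} \approx 0.7802
\]

In the model with the manual constant, the intercept's share of the sum of squares lands entirely in \(SS_{model}\) while \(SS_{residual}\) stays at 851.469256, so \(R^2\) jumps even though the fitted line is identical to the one from the plain regression.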

    one <- rep(1, 32)
    anova(lm(mpg ~ qsec, data = mtcars))

    Analysis of Variance Table

    Response: mpg
              Df Sum Sq Mean Sq F value  Pr(>F)
    qsec       1 197.39 197.392  6.3767 0.01708 *
    Residuals 30 928.66  30.955
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    anova(lm(mpg ~ 0 + one + qsec, data = mtcars))

    Analysis of Variance Table

    Response: mpg
              Df  Sum Sq Mean Sq  F value  Pr(>F)
    one        1 12916.3 12916.3 417.2570 < 2e-16 ***
    qsec       1   197.4   197.4   6.3767 0.01708 *
    Residuals 30   928.7    31.0
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    anova(lm(mpg ~ 0 + qsec, data = mtcars))

    Analysis of Variance Table

    Response: mpg
              Df  Sum Sq Mean Sq F value    Pr(>F)
    qsec       1 13105.6 13105.6  433.73 < 2.2e-16 ***
    Residuals 31   936.7    30.2
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
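
The `anova()` tables do not print \(R^2\), but it can be computed from the sums of squares they report using the formula given earlier. The values below are a worked sketch based on the rounded figures in the R output, so the last digit may differ slightly from what R itself would report.

\[
R^2_{\text{with constant}} = \frac{197.39}{197.39 + 928.66} \approx 0.1753
\]
\[
R^2_{\text{manual constant}} = \frac{12916.3 + 197.4}{12916.3 + 197.4 + 928.7} \approx 0.9339
\]
\[
R^2_{\text{no constant}} = \frac{13105.6}{13105.6 + 936.7} \approx 0.9333
\]

As in the Stata example, the residual sum of squares is the same for the first two models (928.66 and 928.7 after rounding), and the entire difference in \(R^2\) comes from the 12916.3 attributed to the manually added constant.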
