### SPSS Textbook Examples: Applied Regression Analysis by John Fox, Chapter 13: Collinearity and its purported remedies

page 343 Table 13.1 Regression of estimated 1980 U.S. census undercount on area characteristics, for 66 central cities, state remainders, and states.

GET FILE='D:\ericksen.sav'.
regression
/statistics=defaults tol
/dep=undcount
/method=enter perc_min crimrate poverty diffeng hsgrad housing city countprc.


Variables Entered/Removed(b)
Model Variables Entered Variables Removed Method
1 Percentage of households counted by conventional personal enumeration, Percentage of housing in small, multiunit buildings, Percentage having difficulty speaking or writing English, Percentage poor, Rate of serious crimes per 1000 population, City 1=yes, 0=no, Percentage age 25 or older who had not finished high school, Percentage black or Hispanic(a) . Enter
a All requested variables entered.
b Dependent Variable: Preliminary estimate of percentage undercount

Model Summary
Model R R Square Adjusted R Square Std. Error of the Estimate
1 .841(a) .708 .667 1.42647
a Predictors: (Constant), Percentage of households counted by conventional personal enumeration, Percentage of housing in small, multiunit buildings, Percentage having difficulty speaking or writing English, Percentage poor, Rate of serious crimes per 1000 population, City 1=yes, 0=no, Percentage age 25 or older who had not finished high school, Percentage black or Hispanic

ANOVA(b)
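| Model | Source | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|---|
| 1 | Regression | 280.795 | 8 | 35.099 | 17.249 | .000(a) |
| | Residual | 115.985 | 57 | 2.035 | | |
| | Total | 396.780 | 65 | | | |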

a Predictors: (Constant), Percentage of households counted by conventional personal enumeration, Percentage of housing in small, multiunit buildings, Percentage having difficulty speaking or writing English, Percentage poor, Rate of serious crimes per 1000 population, City 1=yes, 0=no, Percentage age 25 or older who had not finished high school, Percentage black or Hispanic
b Dependent Variable: Preliminary estimate of percentage undercount
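
The Model Summary figures can be recovered from the ANOVA table by simple arithmetic. The following is a minimal Python sketch, offered as a cross-check outside SPSS rather than as part of the original example; the variable names are illustrative, and the inputs are the rounded values printed above, so agreement is only to the displayed precision.

```python
import math

# Values copied from the ANOVA table above (Table 13.1 regression).
ss_regression, df_regression = 280.795, 8
ss_residual, df_residual = 115.985, 57
ss_total, df_total = 396.780, 65

ms_regression = ss_regression / df_regression   # mean square, regression
ms_residual = ss_residual / df_residual         # mean square, residual

r_square = ss_regression / ss_total
adj_r_square = 1 - ms_residual / (ss_total / df_total)
std_error = math.sqrt(ms_residual)              # "Std. Error of the Estimate"
f_stat = ms_regression / ms_residual

print(f"R Square           = {r_square:.3f}")
print(f"Adjusted R Square  = {adj_r_square:.3f}")
print(f"Std. Error of Est. = {std_error:.5f}")
print(f"F                  = {f_stat:.3f}")
```
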

Coefficients(a)

| Model 1 | B (unstandardized) | Std. Error | Beta (standardized) | t | Sig. | Tolerance | VIF |
|---|---|---|---|---|---|---|---|
| (Constant) | -1.771 | 1.382 | | -1.282 | .205 | | |
| Percentage black or Hispanic | 7.983E-02 | .023 | .566 | 3.531 | .001 | .200 | 5.009 |
| Rate of serious crimes per 1000 population | 3.012E-02 | .013 | .303 | 2.317 | .024 | .299 | 3.344 |
| Percentage poor | -.178 | .085 | -.324 | -2.101 | .040 | .216 | 4.625 |
| Percentage having difficulty speaking or writing English | .215 | .092 | .214 | 2.333 | .023 | .611 | 1.636 |
| Percentage age 25 or older who had not finished high school | 6.129E-02 | .045 | .211 | 1.369 | .176 | .216 | 4.619 |
| Percentage of housing in small, multiunit buildings | -3.496E-02 | .025 | -.139 | -1.419 | .161 | .534 | 1.872 |
| City 1=yes, 0=no | 1.160 | .771 | .203 | 1.505 | .138 | .283 | 3.538 |
| Percentage of households counted by conventional personal enumeration | 3.699E-02 | .009 | .372 | 3.997 | .000 | .591 | 1.691 |
a Dependent Variable: Preliminary estimate of percentage undercount
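
The two collinearity columns carry the same information: the variance-inflation factor is the reciprocal of the tolerance, and its square root is the factor by which collinearity widens a coefficient's standard error. The sketch below, in Python rather than SPSS, checks this against the printed values. The short names are the variable names from the syntax above, matched to the row labels by order (an assumption), and small discrepancies are expected because the tolerances are rounded to three decimals.

```python
import math

# (variable, tolerance, VIF) as printed in the Coefficients table above.
collinearity = [
    ("perc_min", 0.200, 5.009),
    ("crimrate", 0.299, 3.344),
    ("poverty",  0.216, 4.625),
    ("diffeng",  0.611, 1.636),
    ("hsgrad",   0.216, 4.619),
    ("housing",  0.534, 1.872),
    ("city",     0.283, 3.538),
    ("countprc", 0.591, 1.691),
]

results = []
for name, tolerance, vif in collinearity:
    vif_from_tolerance = 1.0 / tolerance   # VIF = 1 / tolerance
    se_inflation = math.sqrt(vif)          # widening of the standard error
    results.append((name, vif_from_tolerance, se_inflation))
    print(f"{name:<9} 1/tol = {vif_from_tolerance:.3f}  "
          f"printed VIF = {vif:.3f}  sqrt(VIF) = {se_inflation:.3f}")
```
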

page 343 Table 13.2 Correlations among eight predictors of the 1980 U.S. census undercount.

CORRELATIONS
/VARIABLES=perc_min crimrate poverty diffeng hsgrad housing city countprc.

Correlations

Column headings are the variable names from the CORRELATIONS command, in the same order as the rows.

| Variable | Statistic | perc_min | crimrate | poverty | diffeng | hsgrad | housing | city | countprc |
|---|---|---|---|---|---|---|---|---|---|
| Percentage black or Hispanic (perc_min) | Pearson Correlation | 1 | .655 | .738 | .395 | .535 | .357 | .758 | -.334 |
| | Sig. (2-tailed) | . | .000 | .000 | .001 | .000 | .003 | .000 | .006 |
| | N | 66 | 66 | 66 | 66 | 66 | 66 | 66 | 66 |
| Rate of serious crimes per 1000 population (crimrate) | Pearson Correlation | .655 | 1 | .369 | .512 | .067 | .532 | .729 | -.233 |
| | Sig. (2-tailed) | .000 | . | .002 | .000 | .595 | .000 | .000 | .060 |
| | N | 66 | 66 | 66 | 66 | 66 | 66 | 66 | 66 |
| Percentage poor (poverty) | Pearson Correlation | .738 | .369 | 1 | .152 | .751 | .335 | .538 | -.157 |
| | Sig. (2-tailed) | .000 | .002 | . | .224 | .000 | .006 | .000 | .208 |
| | N | 66 | 66 | 66 | 66 | 66 | 66 | 66 | 66 |
| Percentage having difficulty speaking or writing English (diffeng) | Pearson Correlation | .395 | .512 | .152 | 1 | -.116 | .340 | .480 | -.108 |
| | Sig. (2-tailed) | .001 | .000 | .224 | . | .352 | .005 | .000 | .387 |
| | N | 66 | 66 | 66 | 66 | 66 | 66 | 66 | 66 |
| Percentage age 25 or older who had not finished high school (hsgrad) | Pearson Correlation | .535 | .067 | .751 | -.116 | 1 | .235 | .315 | -.414 |
| | Sig. (2-tailed) | .000 | .595 | .000 | .352 | . | .058 | .010 | .001 |
| | N | 66 | 66 | 66 | 66 | 66 | 66 | 66 | 66 |
| Percentage of housing in small, multiunit buildings (housing) | Pearson Correlation | .357 | .532 | .335 | .340 | .235 | 1 | .566 | -.086 |
| | Sig. (2-tailed) | .003 | .000 | .006 | .005 | .058 | . | .000 | .491 |
| | N | 66 | 66 | 66 | 66 | 66 | 66 | 66 | 66 |
| City 1=yes, 0=no (city) | Pearson Correlation | .758 | .729 | .538 | .480 | .315 | .566 | 1 | -.269 |
| | Sig. (2-tailed) | .000 | .000 | .000 | .000 | .010 | .000 | . | .029 |
| | N | 66 | 66 | 66 | 66 | 66 | 66 | 66 | 66 |
| Percentage of households counted by conventional personal enumeration (countprc) | Pearson Correlation | -.334 | -.233 | -.157 | -.108 | -.414 | -.086 | -.269 | 1 |
| | Sig. (2-tailed) | .006 | .060 | .208 | .387 | .001 | .491 | .029 | . |
| | N | 66 | 66 | 66 | 66 | 66 | 66 | 66 | 66 |
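
The correlation matrix of the predictors determines the variance-inflation factors: the j-th VIF is the j-th diagonal element of the inverse of that matrix. The Python sketch below (not SPSS, and not part of the original example) inverts the matrix printed above with a small hand-written Gauss-Jordan routine. The `invert` helper is illustrative, and because the correlations are rounded to three decimals the results should only land close to the VIF column of Table 13.1, not match it exactly.

```python
# Predictor order follows the CORRELATIONS command above.
names = ["perc_min", "crimrate", "poverty", "diffeng",
         "hsgrad", "housing", "city", "countprc"]

# Pearson correlations copied from the table above.
R = [
    [1.000,  .655,  .738,  .395,  .535,  .357,  .758, -.334],
    [ .655, 1.000,  .369,  .512,  .067,  .532,  .729, -.233],
    [ .738,  .369, 1.000,  .152,  .751,  .335,  .538, -.157],
    [ .395,  .512,  .152, 1.000, -.116,  .340,  .480, -.108],
    [ .535,  .067,  .751, -.116, 1.000,  .235,  .315, -.414],
    [ .357,  .532,  .335,  .340,  .235, 1.000,  .566, -.086],
    [ .758,  .729,  .538,  .480,  .315,  .566, 1.000, -.269],
    [-.334, -.233, -.157, -.108, -.414, -.086, -.269, 1.000],
]


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot_row = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot_row] = aug[pivot_row], aug[col]
        pivot = aug[col][col]
        aug[col] = [x / pivot for x in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


R_inv = invert(R)
vifs = [R_inv[j][j] for j in range(len(names))]  # VIF_j = (R^-1)_jj

for name, vif in zip(names, vifs):
    print(f"{name:<9} VIF = {vif:.3f}  tolerance = {1 / vif:.3f}")
```
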

page 355 Table 13.3 B. Fox's Canadian women's labor force participation data. T is year; L is women's labor force participation rate, in percent; F is the total fertility rate, per 1000; M is men's average weekly wages in 1935 dollars; W is women's average weekly wages; D is per-capita consumer debt; and P is the percentage of part-time workers.

GET FILE='D:\bfox.sav'.
list year womwork fertil mwage fwage debt parttime.

YEAR   WOMWORK    FERTIL     MWAGE     FWAGE      DEBT  PARTTIME

1946     25.30      3748     25.35     14.05     18.18     10.28
1947     24.40      3996     26.14     14.61     28.33      9.28
1948     24.20      3725     25.11     14.23     30.55      9.51
1949     24.20      3750     25.45     14.61     35.81      8.87
1950     23.70      3669     26.79     15.26     38.39      8.54
1951     24.20      3682     26.33     14.58     26.52      8.84
1952     24.10      3845     27.89     15.66     45.65      8.60
1953     23.80      3905     29.15     16.30     52.99      5.49
1954     23.60      4047     29.52     16.57     54.84      6.67
1955     24.30      4043     32.05     17.99     65.53      6.25
1956     25.10      4092     32.98     18.33     72.56      6.32
1957     26.20      4168     32.25     17.64     69.49      7.30
1958     26.60      4073     32.52     18.16     71.71      8.65
1959     26.90      4100     33.95     18.58     78.89      8.80
1960     27.90      4119     34.63     18.95     84.99      9.39
1961     29.10      4159     35.14     18.78     87.71     10.23
1962     29.90      4134     34.49     18.74     95.31     10.77
1963     29.80      4017     35.99     19.71    104.40     10.84
1964     30.90      3886     36.68     20.06    116.80     11.70
1965     32.10      3467     37.96     20.94    130.99     12.33
1966     33.20      3150     38.68     21.20    135.25     12.18
1967     34.50      2879     39.65     21.95    142.93     13.67
1968     35.10      2681     41.20     22.68    155.47     13.82
1969     36.10      2563     42.44     23.75    165.04     14.91
1970     36.90      2571     42.02     25.63    164.53     15.52
1971     37.00      2503     45.32     26.79    169.63     15.47
1972     37.90      2302     45.61     27.51    190.62     15.85
1973     40.10      2931     45.59     27.35    209.60     15.40
1974     40.60      1875     48.06     29.64    216.66     16.23
1975     42.20      1866     46.12     29.33    224.34     16.71

Number of cases read:  30    Number of cases listed:  30
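
Since these time series appear in a chapter on collinearity, it can help to look at the pairwise correlations among the candidate predictors. The following Python sketch (a side calculation outside SPSS, not part of the original example) computes them from the listing above; the `pearson` helper and the choice of year, fertil, mwage, fwage, debt, and parttime as predictors are assumptions made for illustration.

```python
import math

# Columns: year womwork fertil mwage fwage debt parttime (copied from the listing).
DATA = """
1946 25.30 3748 25.35 14.05 18.18 10.28
1947 24.40 3996 26.14 14.61 28.33 9.28
1948 24.20 3725 25.11 14.23 30.55 9.51
1949 24.20 3750 25.45 14.61 35.81 8.87
1950 23.70 3669 26.79 15.26 38.39 8.54
1951 24.20 3682 26.33 14.58 26.52 8.84
1952 24.10 3845 27.89 15.66 45.65 8.60
1953 23.80 3905 29.15 16.30 52.99 5.49
1954 23.60 4047 29.52 16.57 54.84 6.67
1955 24.30 4043 32.05 17.99 65.53 6.25
1956 25.10 4092 32.98 18.33 72.56 6.32
1957 26.20 4168 32.25 17.64 69.49 7.30
1958 26.60 4073 32.52 18.16 71.71 8.65
1959 26.90 4100 33.95 18.58 78.89 8.80
1960 27.90 4119 34.63 18.95 84.99 9.39
1961 29.10 4159 35.14 18.78 87.71 10.23
1962 29.90 4134 34.49 18.74 95.31 10.77
1963 29.80 4017 35.99 19.71 104.40 10.84
1964 30.90 3886 36.68 20.06 116.80 11.70
1965 32.10 3467 37.96 20.94 130.99 12.33
1966 33.20 3150 38.68 21.20 135.25 12.18
1967 34.50 2879 39.65 21.95 142.93 13.67
1968 35.10 2681 41.20 22.68 155.47 13.82
1969 36.10 2563 42.44 23.75 165.04 14.91
1970 36.90 2571 42.02 25.63 164.53 15.52
1971 37.00 2503 45.32 26.79 169.63 15.47
1972 37.90 2302 45.61 27.51 190.62 15.85
1973 40.10 2931 45.59 27.35 209.60 15.40
1974 40.60 1875 48.06 29.64 216.66 16.23
1975 42.20 1866 46.12 29.33 224.34 16.71
"""

names = ["year", "womwork", "fertil", "mwage", "fwage", "debt", "parttime"]
rows = [[float(x) for x in line.split()] for line in DATA.strip().splitlines()]
columns = {name: [row[i] for row in rows] for i, name in enumerate(names)}


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


predictors = ["year", "fertil", "mwage", "fwage", "debt", "parttime"]
for i, a in enumerate(predictors):
    for b in predictors[i + 1:]:
        r = pearson(columns[a], columns[b])
        print(f"{a:>8} vs {b:<8} r = {r:.3f}")
```
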

page 358 Figure 13.6 Plot of C(p)-p against p for the census undercount regression. Only subsets for which C(p)-p < 10 are shown. The following capitalized letters are employed to label the predictors in each subset: Minority, Crime, Poverty, Language, High school, hOusing, cIty, and coNventional. Ericksen et al. (1989) selected the predictor subset MCN (i.e., Minority, Crime, and coNventional).

NOTE: We do not know how to make this plot in SPSS.

page 359 Table 13.4 "Best" subset regression models for Ericksen et al.'s census undercount data. Coefficient standard errors are in parentheses.

GET FILE='D:\ericksen.sav'.

regression
/statistics=defaults selection
/dep=undcount
/method=enter perc_min crimrate countprc
/method=enter diffeng
/method=enter poverty
/method=enter hsgrad housing city.

Variables Entered/Removed(b)
Model Variables Entered Variables Removed Method
1 Percentage of households counted by conventional personal enumeration, Rate of serious crimes per 1000 population, Percentage black or Hispanic(a) . Enter
2 Percentage having difficulty speaking or writing English(a) . Enter
3 Percentage poor(a) . Enter
4 Percentage of housing in small, multiunit buildings, City 1=yes, 0=no, Percentage age 25 or older who had not finished high school(a) . Enter
a All requested variables entered.
b Dependent Variable: Preliminary estimate of percentage undercount

Model Summary

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate | Akaike Information Criterion | Amemiya Prediction Criterion | Mallows' Prediction Criterion | Schwarz Bayesian Criterion |
|---|---|---|---|---|---|---|---|---|
| 1 | .798(a) | .638 | .620 | 1.52302 | 59.405 | .409 | 12.676 | 68.164 |
| 2 | .818(b) | .669 | .647 | 1.46699 | 55.385 | .385 | 8.515 | 66.333 |
| 3 | .828(c) | .686 | .659 | 1.44207 | 54.032 | .377 | 7.320 | 67.170 |
| 4 | .841(d) | .708 | .667 | 1.42647 | 55.211 | .385 | 9.000 | 74.918 |
a Predictors: (Constant), Percentage of households counted by conventional personal enumeration, Rate of serious crimes per 1000 population, Percentage black or Hispanic
b Predictors: (Constant), Percentage of households counted by conventional personal enumeration, Rate of serious crimes per 1000 population, Percentage black or Hispanic, Percentage having difficulty speaking or writing English
c Predictors: (Constant), Percentage of households counted by conventional personal enumeration, Rate of serious crimes per 1000 population, Percentage black or Hispanic, Percentage having difficulty speaking or writing English, Percentage poor
d Predictors: (Constant), Percentage of households counted by conventional personal enumeration, Rate of serious crimes per 1000 population, Percentage black or Hispanic, Percentage having difficulty speaking or writing English, Percentage poor, Percentage of housing in small, multiunit buildings, City 1=yes, 0=no, Percentage age 25 or older who had not finished high school

ANOVA(e)
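| Model | Source | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|---|
| 1 | Regression | 252.966 | 3 | 84.322 | 36.352 | .000(a) |
| | Residual | 143.814 | 62 | 2.320 | | |
| | Total | 396.780 | 65 | | | |
| 2 | Regression | 265.504 | 4 | 66.376 | 30.843 | .000(b) |
| | Residual | 131.276 | 61 | 2.152 | | |
| | Total | 396.780 | 65 | | | |
| 3 | Regression | 272.006 | 5 | 54.401 | 26.160 | .000(c) |
| | Residual | 124.774 | 60 | 2.080 | | |
| | Total | 396.780 | 65 | | | |
| 4 | Regression | 280.795 | 8 | 35.099 | 17.249 | .000(d) |
| | Residual | 115.985 | 57 | 2.035 | | |
| | Total | 396.780 | 65 | | | |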

a Predictors: (Constant), Percentage of households counted by conventional personal enumeration, Rate of serious crimes per 1000 population, Percentage black or Hispanic
b Predictors: (Constant), Percentage of households counted by conventional personal enumeration, Rate of serious crimes per 1000 population, Percentage black or Hispanic, Percentage having difficulty speaking or writing English
c Predictors: (Constant), Percentage of households counted by conventional personal enumeration, Rate of serious crimes per 1000 population, Percentage black or Hispanic, Percentage having difficulty speaking or writing English, Percentage poor
d Predictors: (Constant), Percentage of households counted by conventional personal enumeration, Rate of serious crimes per 1000 population, Percentage black or Hispanic, Percentage having difficulty speaking or writing English, Percentage poor, Percentage of housing in small, multiunit buildings, City 1=yes, 0=no, Percentage age 25 or older who had not finished high school
e Dependent Variable: Preliminary estimate of percentage undercount
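
The four selection criteria in the Model Summary can be reproduced from the residual sums of squares in the ANOVA table. The Python sketch below is a cross-check outside SPSS; the formulas are an assumption about how each criterion is defined (with p counting the constant, and Mallows' criterion using the error variance of the eight-predictor model), although they do agree with the printed values to three decimals. The last column gives C(p)-p, the quantity plotted in Figure 13.6, for these four nested models only.

```python
import math

n = 66                # number of cases
ss_total = 396.780    # total sum of squares

# (model, residual sum of squares, number of predictors excluding the constant)
models = [(1, 143.814, 3), (2, 131.276, 4), (3, 124.774, 5), (4, 115.985, 8)]

# Error variance estimate from the full (eight-predictor) model.
full_rss, full_k = 115.985, 8
s2_full = full_rss / (n - full_k - 1)

criteria = {}
for model, rss, k in models:
    p = k + 1  # number of coefficients, including the constant
    aic = n * math.log(rss / n) + 2 * p            # Akaike
    pc = (rss / ss_total) * (n + p) / (n - p)      # Amemiya
    cp = rss / s2_full + 2 * p - n                 # Mallows
    sbc = n * math.log(rss / n) + p * math.log(n)  # Schwarz
    criteria[model] = (aic, pc, cp, sbc)
    print(f"model {model}: AIC={aic:.3f}  PC={pc:.3f}  "
          f"Cp={cp:.3f}  SBC={sbc:.3f}  Cp-p={cp - p:.3f}")
```
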

Coefficients(a)

| Model | Predictor | B (unstandardized) | Std. Error | Beta (standardized) | t | Sig. |
|---|---|---|---|---|---|---|
| 1 | (Constant) | -2.224 | .561 | | -3.965 | .000 |
| | Percentage black or Hispanic | 7.861E-02 | .015 | .557 | 5.337 | .000 |
| | Rate of serious crimes per 1000 population | 3.630E-02 | .010 | .366 | 3.614 | .001 |
| | Percentage of households counted by conventional personal enumeration | 2.800E-02 | .008 | .282 | 3.473 | .001 |
| 2 | (Constant) | -1.976 | .550 | | -3.592 | .001 |
| | Percentage black or Hispanic | 7.519E-02 | .014 | .533 | 5.273 | .000 |
| | Rate of serious crimes per 1000 population | 2.715E-02 | .010 | .274 | 2.613 | .011 |
| | Percentage of households counted by conventional personal enumeration | 2.730E-02 | .008 | .275 | 3.512 | .001 |
| | Percentage having difficulty speaking or writing English | .209 | .087 | .208 | 2.414 | .019 |
| 3 | (Constant) | -.793 | .860 | | -.921 | .361 |
| | Percentage black or Hispanic | .101 | .020 | .716 | 4.992 | .000 |
| | Rate of serious crimes per 1000 population | 2.435E-02 | .010 | .245 | 2.355 | .022 |
| | Percentage of households counted by conventional personal enumeration | 2.933E-02 | .008 | .295 | 3.796 | .000 |
| | Percentage having difficulty speaking or writing English | .184 | .086 | .183 | 2.126 | .038 |
| | Percentage poor | -.110 | .062 | -.200 | -1.768 | .082 |
| 4 | (Constant) | -1.771 | 1.382 | | -1.282 | .205 |
| | Percentage black or Hispanic | 7.983E-02 | .023 | .566 | 3.531 | .001 |
| | Rate of serious crimes per 1000 population | 3.012E-02 | .013 | .303 | 2.317 | .024 |
| | Percentage of households counted by conventional personal enumeration | 3.699E-02 | .009 | .372 | 3.997 | .000 |
| | Percentage having difficulty speaking or writing English | .215 | .092 | .214 | 2.333 | .023 |
| | Percentage poor | -.178 | .085 | -.324 | -2.101 | .040 |
| | Percentage age 25 or older who had not finished high school | 6.129E-02 | .045 | .211 | 1.369 | .176 |
| | Percentage of housing in small, multiunit buildings | -3.496E-02 | .025 | -.139 | -1.419 | .161 |
| | City 1=yes, 0=no | 1.160 | .771 | .203 | 1.505 | .138 |
a Dependent Variable: Preliminary estimate of percentage undercount

Excluded Variables(d)

| Model | Variable | Beta In | t | Sig. | Partial Correlation | Tolerance (Collinearity Statistics) |
|---|---|---|---|---|---|---|
| 1 | Percentage having difficulty speaking or writing English | .208(a) | 2.414 | .019 | .295 | .731 |
| | Percentage poor | -.240(a) | -2.093 | .040 | -.259 | .423 |
| | Percentage age 25 or older who had not finished high school | -.124(a) | -1.158 | .251 | -.147 | .506 |
| | Percentage of housing in small, multiunit buildings | -.068(a) | -.747 | .458 | -.095 | .715 |
| | City 1=yes, 0=no | .169(a) | 1.280 | .206 | .162 | .331 |
| 2 | Percentage poor | -.200(b) | -1.768 | .082 | -.223 | .411 |
| | Percentage age 25 or older who had not finished high school | -.049(b) | -.449 | .655 | -.058 | .455 |
| | Percentage of housing in small, multiunit buildings | -.088(b) | -1.002 | .320 | -.128 | .709 |
| | City 1=yes, 0=no | .123(b) | .949 | .346 | .122 | .323 |
| 3 | Percentage age 25 or older who had not finished high school | .154(c) | 1.044 | .301 | .135 | .239 |
| | Percentage of housing in small, multiunit buildings | -.053(c) | -.591 | .557 | -.077 | .664 |
| | City 1=yes, 0=no | .146(c) | 1.147 | .256 | .148 | .320 |
a Predictors in the Model: (Constant), Percentage of households counted by conventional personal enumeration, Rate of serious crimes per 1000 population, Percentage black or Hispanic
b Predictors in the Model: (Constant), Percentage of households counted by conventional personal enumeration, Rate of serious crimes per 1000 population, Percentage black or Hispanic, Percentage having difficulty speaking or writing English
c Predictors in the Model: (Constant), Percentage of households counted by conventional personal enumeration, Rate of serious crimes per 1000 population, Percentage black or Hispanic, Percentage having difficulty speaking or writing English, Percentage poor
d Dependent Variable: Preliminary estimate of percentage undercount
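
The Excluded Variables table ties back to the ANOVA table in two ways. When a single predictor is added, the incremental F for the step equals the square of that predictor's t, and the partial correlation follows from t and the residual degrees of freedom of the larger model. The Python sketch below (a check outside SPSS; the helper names `f_change` and `partial_r` are illustrative) works from the rounded values printed above, so agreement is only approximate.

```python
import math

# Residual sums of squares and residual df for the four nested models (ANOVA table).
rss = {1: 143.814, 2: 131.276, 3: 124.774, 4: 115.985}
df_resid = {1: 62, 2: 61, 3: 60, 4: 57}


def f_change(small, large):
    """Incremental F for the terms added in going from model `small` to `large`."""
    df_added = df_resid[small] - df_resid[large]
    numerator = (rss[small] - rss[large]) / df_added
    denominator = rss[large] / df_resid[large]
    return numerator / denominator


def partial_r(t, df):
    """Partial correlation implied by a t statistic with `df` residual df."""
    return math.copysign(math.sqrt(t * t / (t * t + df)), t)


# Adding diffeng to model 1 (t = 2.414 in the Excluded Variables table).
print(f"F change 1->2 = {f_change(1, 2):.3f}, t^2 = {2.414 ** 2:.3f}")
print(f"partial r (diffeng) = {partial_r(2.414, df_resid[2]):.3f}")

# Adding poverty to model 2 (t = -1.768).
print(f"F change 2->3 = {f_change(2, 3):.3f}, t^2 = {(-1.768) ** 2:.3f}")
print(f"partial r (poverty) = {partial_r(-1.768, df_resid[3]):.3f}")

# Adding hsgrad, housing, and city together (3 numerator df).
print(f"F change 3->4 = {f_change(3, 4):.3f}")
```
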
