### SUDAAN Textbook Examples: Practical Methods for Design and Analysis of Complex Surveys, Revised Edition, by Lehtonen and Pahkinen. Chapter 3: Further use of auxiliary information

Stratified simple random sampling
page 74 Table 3.3  Estimates from an optimally allocated stratified simple random sample (n = 8); the Province'91 population.
NOTE:  In this data set, the fpc differs by stratum (7 in stratum 1, 25 in stratum 2).  This is different from all of the previous examples.
data page74;
input id str clu wt ue91 lab91 fpc;
cards;
1 1 1 1.75 4123 33786 7
2 1 2 1.75 666 6016 7
3 1 4 1.75 760 5919 7
4 1 6 1.75 457 3022 7
5 2 21 6.25 61 573 25
6 2 25 6.25 262 1737 25
7 2 26 6.25 331 2543 25
8 2 27 6.25 98 545 25
;
run;

proc descript data = page74 filetype = sas design = wor deft4 totals;
weight wt;
nest str;
var ue91;
totcnt fpc;
run;
Number of observations read    :      8    Weighted count :       32
Denominator degrees of freedom :      6
Variance Estimation Method: Taylor Series (WOR)
by: Variable, One.

-----------------------------------------------------
|                 |                  |
| Variable        |                  | One
|                 |                  | 1            |
-----------------------------------------------------
|                 |                  |              |
| UE91            | Sample Size      |            8 |
|                 | Weighted Size    |        32.00 |
|                 | Total            |     15210.50 |
|                 | SE Total         |      4279.45 |
|                 | Mean             |       475.33 |
|                 | SE Mean          |       133.73 |
|                 | DEFF Mean #4     |         0.15 |
|                 | DEFF Total #4    |         0.15 |
-----------------------------------------------------
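As a cross-check on the table above, the total and its standard error can be reproduced by hand with the usual stratified without-replacement formulas. The following is a minimal Python sketch, not SUDAAN code; the variable names are illustrative, and the stratum sizes 7 and 25 are taken from the fpc column of the data step.

```python
import math

# Stratum -> (population size N_h from the fpc column, sampled ue91 values)
strata = {
    1: (7, [4123, 666, 760, 457]),
    2: (25, [61, 262, 331, 98]),
}

total = 0.0
variance = 0.0
for N_h, y in strata.values():
    n_h = len(y)
    mean_h = sum(y) / n_h
    # Within-stratum sample variance (divisor n_h - 1)
    s2_h = sum((v - mean_h) ** 2 for v in y) / (n_h - 1)
    total += N_h * mean_h
    # Each stratum gets its own finite population correction (1 - n_h / N_h)
    variance += N_h ** 2 * (1 - n_h / N_h) * s2_h / n_h

se_total = math.sqrt(variance)
N = sum(N_h for N_h, _ in strata.values())  # 32 elements in the population

print("Total   ", round(total, 2), " SE Total", round(se_total, 2))
print("Mean    ", round(total / N, 2), " SE Mean ", round(se_total / N, 2))
```

The printed values should agree with the Total, SE Total, Mean and SE Mean rows of the listing to two decimals.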
proc ratio data = page74 filetype = sas design = strwor;
weight wt;
nest str;
totcnt fpc;
numer ue91;
denom lab91;
run;
Number of observations read    :      8    Weighted count :       32
Denominator degrees of freedom :      6
Variance Estimation Method: Taylor Series (STRWOR)
by: Variable, One.

---------------------------------------------------
|                 |                  |
| Variable        |                  | One
|                 |                  | 1          |
---------------------------------------------------
|                 |                  |            |
| UE91/LAB91      | Sample Size      |          8 |
|                 | Weighted Size    |      32.00 |
|                 | Weighted X-Sum   |  119037.75 |
|                 | Weighted Y-Sum   |   15210.50 |
|                 | Ratio Est.       |       0.13 |
|                 | SE Ratio         |       0.00 |
---------------------------------------------------
proc descript data = page74 filetype = sas design = strwor;
weight wt;
nest str;
var ue91;
totcnt fpc;
percentile / median;
run;
Cannot extrapolate to compute confidence limit for 50.00th percentile.
Generating a missing value.

Number of observations read    :      8    Weighted count :       32
Denominator degrees of freedom :      6
Variance Estimation Method: Taylor Series (WOR)
by: Variable, One, Percentiles.

for: Variable = UE91.

-----------------------------------------------------------------------------------
One                    Sample     Weighted                  Lower 95%    Upper 95%
Percentiles         Size       Size           Quantile   Limit        Limit
-----------------------------------------------------------------------------------
1
50.00                     8        32.00       189.84          .         300.36
-----------------------------------------------------------------------------------
---------------------------------
One                    SE
Percentiles         Quantile
---------------------------------
1
50.00                     .
---------------------------------
One-stage cluster sampling
page 83 Table 3.6  Estimates from a one-stage CLU sample (n = 8); the Province'91 population.
data page83;
input id str clu wt ue91 lab91;
fpc = 32;
cards;
1 1 2 4 666 6016
2 1 2 4 528 3818
3 1 2 4 760 5919
4 1 2 4 187 1448
5 1 8 4 129 927
6 1 8 4 128 819
7 1 8 4 331 2543
8 1 8 4 568 4011
;
run;
proc descript data = page83 filetype = sas design = wor;
weight wt;
nest _one_ clu;
var ue91;
totcnt fpc _zero_;
print total setotal deffmean defftotal;
run;
Number of observations read    :      8    Weighted count :       32
Denominator degrees of freedom :      1
Variance Estimation Method: Taylor Series (WOR)
by: Variable, One.

-----------------------------------------------------
|                 |                  |
| Variable        |                  | One
|                 |                  | 1            |
-----------------------------------------------------
|                 |                  |              |
| UE91            | Total            |     13188.00 |
|                 | SE Total         |      3814.89 |
|                 | DEFF Mean #4     |         1.80 |
|                 | DEFF Total #4    |         1.80 |
-----------------------------------------------------
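The SE Total above depends on which population count enters the finite population correction. The sketch below, in Python rather than SUDAAN, applies a standard between-cluster variance formula to the weighted cluster totals; it is an assumption that this matches what the procedure computes, and the function name is hypothetical. It evaluates the formula with a count of 32 (the fpc value set in the data step) and with a count of 8 (the number of clusters given as fpc1 in the two-stage example further down).

```python
import math

weight = 4
# ue91 values by sampled cluster (clu = 2 and clu = 8)
clusters = {2: [666, 528, 760, 187], 8: [129, 128, 331, 568]}

# Weighted cluster totals and the estimated population total
z = [weight * sum(y) for y in clusters.values()]
total = sum(z)

def se_total(weighted_totals, population_count):
    """Between-cluster SE of the total with fpc factor 1 - n / population_count."""
    n = len(weighted_totals)
    mean = sum(weighted_totals) / n
    ss = sum((v - mean) ** 2 for v in weighted_totals)
    return math.sqrt((1 - n / population_count) * n / (n - 1) * ss)

se_with_32 = se_total(z, 32)  # count taken from fpc = 32 in the data step
se_with_8 = se_total(z, 8)    # count taken as 8 clusters in the population

print(total, round(se_with_32, 2), round(se_with_8, 2))
```

With a count of 32 the result agrees with the 3814.89 in the listing; with a count of 8 the same formula gives a smaller value, so it is worth checking which count the book intends before comparing against Table 3.6.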
proc ratio data = page83 filetype = sas design = wor;
weight wt;
nest _one_ clu;
totcnt fpc _zero_;
numer ue91;
denom lab91;
run;
Number of observations read    :      8    Weighted count :       32
Denominator degrees of freedom :      1
Variance Estimation Method: Taylor Series (WOR)
by: Variable, One.

---------------------------------------------------
|                 |                  |
| Variable        |                  | One
|                 |                  | 1          |
---------------------------------------------------
|                 |                  |            |
| UE91/LAB91      | Sample Size      |          8 |
|                 | Weighted Size    |      32.00 |
|                 | Weighted X-Sum   |  102004.00 |
|                 | Weighted Y-Sum   |   13188.00 |
|                 | Ratio Est.       |       0.13 |
|                 | SE Ratio         |       0.01 |
---------------------------------------------------
proc descript data = page83 filetype = sas design = wor ;
weight wt;
nest _one_ clu;
var ue91;
totcnt fpc _zero_ ;
percentile / median;
run;
Cannot extrapolate to compute confidence limit for 50.00th percentile.
Generating a missing value.
Cannot extrapolate to compute confidence limit for 50.00th percentile.
Generating a missing value.

Number of observations read    :      8    Weighted count :       32
Denominator degrees of freedom :      1
Variance Estimation Method: Taylor Series (WOR)
by: Variable, One, Percentiles.

for: Variable = UE91.

-----------------------------------------------------------------------------------
One                    Sample     Weighted                  Lower 95%    Upper 95%
Percentiles         Size       Size           Quantile   Limit        Limit
-----------------------------------------------------------------------------------
1
50.00                     8        32.00       331.00          .            .
-----------------------------------------------------------------------------------
---------------------------------
One                    SE
Percentiles         Quantile
---------------------------------
1
50.00                     .
---------------------------------
Two-stage cluster sampling
page 88 Table 3.8  Estimates from a two-stage CLU sample (n = 8); the Province'91 population.
data page88;
input id str clu wt ue91 lab91 fpc1 fpc2 smplrat;
cards;
1 1 2 4 760 5919 8 4 .5
2 1 2 4 187 1448 8 4 .5
3 1 3 4 767 5823 8 4 .5
4 1 3 4 142 675 8 4 .5
5 1 4 4 94 831 8 4 .5
6 1 4 4 98 545 8 4 .5
7 1 7 4 262 1737 8 4 .5
8 1 7 4 219 1330 8 4 .5
;
run;

proc descript data = page88 filetype = sas design = wor totals deft4;
weight wt;
nest _one_ clu;
totcnt fpc1 fpc2;
var ue91;
run;
Number of observations read    :      8    Weighted count :       32
Denominator degrees of freedom :      3
Variance Estimation Method: Taylor Series (WOR)
by: Variable, One.

-----------------------------------------------------
|                 |                  |
| Variable        |                  | One
|                 |                  | 1            |
-----------------------------------------------------
|                 |                  |              |
| UE91            | Sample Size      |            8 |
|                 | Weighted Size    |        32.00 |
|                 | Total            |     10116.00 |
|                 | SE Total         |      2658.65 |
|                 | Mean             |       316.13 |
|                 | SE Mean          |        83.08 |
|                 | DEFF Mean #4     |         0.69 |
|                 | DEFF Total #4    |         0.69 |
-----------------------------------------------------
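The two-stage total and its standard error can also be checked by hand. This Python sketch (not SUDAAN code) uses the textbook two-stage without-replacement variance, a between-cluster term plus a within-cluster term, with 8 clusters in the population (fpc1) and 4 elements per cluster (fpc2). That this is the formula behind the listing is an assumption, although the numbers do agree.

```python
import math

N_clusters = 8      # fpc1: clusters in the population
M_per_cluster = 4   # fpc2: elements in each cluster
# ue91 values by sampled cluster
sample = {2: [760, 187], 3: [767, 142], 4: [94, 98], 7: [262, 219]}
n = len(sample)

cluster_totals = []
within = 0.0
for y in sample.values():
    m = len(y)
    mean = sum(y) / m
    s2 = sum((v - mean) ** 2 for v in y) / (m - 1)
    # Estimated total for this cluster
    cluster_totals.append(M_per_cluster * mean)
    # Second-stage variance contribution for this cluster
    within += M_per_cluster ** 2 * (1 - m / M_per_cluster) * s2 / m

mean_t = sum(cluster_totals) / n
s2_between = sum((t - mean_t) ** 2 for t in cluster_totals) / (n - 1)

total = N_clusters * mean_t
variance = (N_clusters ** 2 * (1 - n / N_clusters) * s2_between / n
            + (N_clusters / n) * within)
se_total = math.sqrt(variance)

print(round(total, 2), round(se_total, 2))
```

The total is 10116 and the standard error agrees with the 2658.65 in the table to within rounding.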
proc ratio data = page88 filetype = sas design = wor deff;
weight wt;
nest _one_ clu;
totcnt fpc1 fpc2;
numer ue91;
denom lab91;
run;
Number of observations read    :      8    Weighted count :       32
Denominator degrees of freedom :      3
Variance Estimation Method: Taylor Series (WOR)
by: Variable, One.

---------------------------------------------------
|                 |                  |
| Variable        |                  | One
|                 |                  | 1          |
---------------------------------------------------
|                 |                  |            |
| UE91/LAB91      | Sample Size      |          8 |
|                 | Weighted Size    |      32.00 |
|                 | Weighted X-Sum   |   73232.00 |
|                 | Weighted Y-Sum   |   10116.00 |
|                 | Ratio Est.       |       0.14 |
|                 | SE Ratio         |       0.01 |
|                 | DEFF Ratio #4    |       0.75 |
---------------------------------------------------
proc descript data = page88 filetype = sas design = wor  ;
weight wt;
nest _one_ clu;
totcnt fpc1 fpc2;
var ue91;
percentile / median;
run;
Number of observations read    :      8    Weighted count :       32
Denominator degrees of freedom :      3
Variance Estimation Method: Taylor Series (WOR)
by: Variable, One, Percentiles.

for: Variable = UE91.

-----------------------------------------------------------------------------------
One                    Sample     Weighted                  Lower 95%    Upper 95%
Percentiles         Size       Size           Quantile   Limit        Limit
-----------------------------------------------------------------------------------
1
50.00                     8        32.00       187.00        94.36       687.75
-----------------------------------------------------------------------------------
---------------------------------
One                    SE
Percentiles         Quantile
---------------------------------
1
50.00                   93.23
---------------------------------
Post-stratified weights
page 97 Table 3.10  A simple random sample drawn without replacement from the Province'91 population with poststratum weights.
data page97;
input id str clu wt ue91 lab91 poststr gwt postwt sruv srcvs ;
fpc = 32;
cards;
1 1 1 4 4123 33786 1 .5833 2.333 .25 .43
2 1 4 4 760 5919 1 .5833 2.333 .25 .43
3 1 5 4 721 4930 1 .5833 2.333 .25 .43
4 1 15 4 142 675 2 1.2500 5.0000 .25 .20
5 1 18 4 187 1448 2 1.2500 5.0000 .25 .20
6 1 26 4 331 2543 2 1.2500 5.0000 .25 .20
7 1 30 4 127 1084 2 1.2500 5.0000 .25 .20
8 1 31 4 219 1330 2 1.2500 5.0000 .25 .20
;
run;
poststratified conditional estimates
Note that you cannot get the deff with the postvar/postwgt statements.  The numbers on the postwgt statement must be integers (i.e., whole numbers); they are the population counts of the poststrata, here 7 for poststratum 1 and 25 for poststratum 2.
proc descript data = page97 filetype = sas design = wor totals ;
weight wt;
nest _one_;
totcnt fpc;
var ue91;
subgroup poststr;
levels 2;
postvar poststr;
postwgt 7 25;
run;
Number of observations read    :      8    Weighted count :       32
Denominator degrees of freedom :      7
Variance Estimation Method: Taylor Series (WOR)
Post-stratified estimates
by: Variable, POSTSTR.

-----------------------------------------------------------------------------------
|                 |                  |
| Variable        |                  | POSTSTR
|                 |                  | Total        | 1            | 2            |
-----------------------------------------------------------------------------------
|                 |                  |              |              |              |
| UE91            | Sample Size      |            8 |            3 |            5 |
|                 | Weighted Size    |        32.00 |         7.00 |        25.00 |
|                 | Total            |     18106.00 |     13076.00 |      5030.00 |
|                 | SE Total         |      6013.65 |      5966.47 |       751.81 |
|                 | Mean             |       565.81 |      1868.00 |       201.20 |
|                 | SE Mean          |       187.93 |       852.35 |        30.07 |
-----------------------------------------------------------------------------------
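The gwt and postwt columns in the data step and the point estimates in the table above follow from simple arithmetic. A minimal Python sketch, using the poststratum counts 7 and 25 from the postwgt statement (the variable names are illustrative, and the standard errors are not reproduced here):

```python
N, n = 32, 8
base_weight = N / n  # the wt column, 4 for every element

# Poststratum -> (population count from postwgt, sampled ue91 values)
poststrata = {
    1: (7, [4123, 760, 721]),
    2: (25, [142, 187, 331, 127, 219]),
}

results = {}
for g, (N_g, y) in poststrata.items():
    n_g = len(y)
    postwt = N_g / n_g          # poststratified weight
    gwt = postwt / base_weight  # adjustment applied to the base weight
    mean_g = sum(y) / n_g
    results[g] = {"gwt": gwt, "postwt": postwt,
                  "mean": mean_g, "total": N_g * mean_g}

total = sum(r["total"] for r in results.values())
mean = total / N

for g, r in results.items():
    print(g, {k: round(v, 4) for k, v in r.items()})
print("overall", round(total, 2), round(mean, 2))
```

The weights come out as .5833 and 2.333 for poststratum 1 and 1.25 and 5 for poststratum 2, and the totals and means agree with the Total and Mean rows of the table.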
poststratified unconditional estimates
This has been skipped for now.
pure design-based estimates under srs
proc descript data = page97 filetype = sas design = wor totals deff;
weight wt;
nest _one_;
totcnt fpc;
var ue91;
run;
Number of observations read    :      8    Weighted count :       32
Denominator degrees of freedom :      7
Variance Estimation Method: Taylor Series (WOR)
by: Variable, One.

-----------------------------------------------------
|                 |                  |
| Variable        |                  | One
|                 |                  | 1            |
-----------------------------------------------------
|                 |                  |              |
| UE91            | Sample Size      |            8 |
|                 | Weighted Size    |        32.00 |
|                 | Total            |     26440.00 |
|                 | SE Total         |     13282.26 |
|                 | Mean             |       826.25 |
|                 | SE Mean          |       415.07 |
|                 | DEFF Mean #4     |         0.75 |
|                 | DEFF Total #4    |         0.75 |
-----------------------------------------------------
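For comparison, the plain SRS numbers above can be reproduced with the standard without-replacement formulas. This is a Python sketch with illustrative names. The last line computes the ratio of the without-replacement variance to the with-replacement one, which is 1 - n/N = 0.75 and coincides with the DEFF values in the listing; that is an observation about the numbers, not a statement of how SUDAAN defines DEFF #4.

```python
import math

N = 32
ue91 = [4123, 760, 721, 142, 187, 331, 127, 219]
n = len(ue91)

mean = sum(ue91) / n
s2 = sum((v - mean) ** 2 for v in ue91) / (n - 1)

var_mean_wor = (1 - n / N) * s2 / n  # SRS without replacement
var_mean_wr = s2 / n                 # SRS with replacement

se_mean = math.sqrt(var_mean_wor)
total = N * mean
se_total = N * se_mean
variance_ratio = var_mean_wor / var_mean_wr  # equals 1 - n/N

print(round(total, 2), round(se_total, 2), round(mean, 2),
      round(se_mean, 2), round(variance_ratio, 2))
```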
The code below gives the numbers that are shown in the calculations on page 102.
data page102;
input id str clu wt ue91 hou85 gwt adjwt smplrat;
fpc = 32;
cards;
1 1 1 4 4123 26881 .5562 2.2248 .25
2 1 4 4 760 4896 .5562 2.2248 .25
3 1 5 4 721 3730 .5562 2.2248 .25
4 1 15 4 142 556 .5562 2.2248 .25
5 1 18 4 187 1463 .5562 2.2248 .25
6 1 26 4 331 1946 .5562 2.2248 .25
7 1 30 4 127 834 .5562 2.2248 .25
8 1 31 4 219 932 .5562 2.2248 .25
;
run;
You can get the necessary numbers in either of two ways:  you can use proc descript to get the totals for both variables and do the division on your own, or you can use proc ratio, as shown below.
NOTE:  6610/41238 = .16028905, which is the correct answer.
proc descript data = page102 filetype = sas design = wor;
weight wt;
nest _one_;
totcnt fpc;
var ue91 hou85;
subgroup str;
levels 1;
postvar str;
postwgt 8;
run;
Number of observations read    :      8    Weighted count :       32
Denominator degrees of freedom :      7
Variance Estimation Method: Taylor Series (WOR)
Post-stratified estimates
by: Variable, STR.

--------------------------------------------------------------------
|                 |                  |
| Variable        |                  | STR
|                 |                  | Total        | 1            |
--------------------------------------------------------------------
|                 |                  |              |              |
| UE91            | Sample Size      |            8 |            8 |
|                 | Weighted Size    |         8.00 |         8.00 |
|                 | Total            |      6610.00 |      6610.00 |
|                 | Mean             |       826.25 |       826.25 |
|                 | SE Mean          |       415.07 |       415.07 |
--------------------------------------------------------------------
|                 |                  |              |              |
| HOU85           | Sample Size      |            8 |            8 |
|                 | Weighted Size    |         8.00 |         8.00 |
|                 | Total            |     41238.00 |     41238.00 |
|                 | Mean             |      5154.75 |      5154.75 |
|                 | SE Mean          |      2728.08 |      2728.08 |
--------------------------------------------------------------------
The goal is to reproduce the .1603 shown in the upper middle of page 102.  You need this ratio estimate so that you can multiply it by the population total of the auxiliary variable, which gives the ratio estimate of the total of the variable of interest.
proc ratio data = page102 filetype = sas design = wor;
weight wt;
nest _one_;
totcnt fpc;
numer ue91;
denom hou85;
subgroup str;
levels 1;
postvar str;
postwgt 8;
setenv decwidth = 4;
run;
Number of observations read    :      8    Weighted count :       32
Denominator degrees of freedom :      7
Variance Estimation Method: Taylor Series (WOR)
Post-stratified estimates
by: Variable, STR.

----------------------------------------------------------------
|                 |                  |
| Variable        |                  | STR
|                 |                  | Total      | 1          |
----------------------------------------------------------------
|                 |                  |            |            |
| UE91/HOU85      | Sample Size      |     8.0000 |     8.0000 |
|                 | Weighted Size    |     8.0000 |     8.0000 |
|                 | Weighted X-Sum   | 41238.0000 | 41238.0000 |
|                 | Weighted Y-Sum   |  6610.0000 |  6610.0000 |
|                 | Ratio Est.       |     0.1603 |     0.1603 |
|                 | SE Ratio         |     0.0055 |     0.0055 |
----------------------------------------------------------------
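The multiplication step described before the proc ratio call can be sketched in Python. The population total of hou85 is taken here to be 91753, the figure quoted in the regression-estimation note later on this page; treat that value as an assumption to be checked against the book. The sketch also shows where the gwt and adjwt columns in the page102 data step appear to come from.

```python
ue91 = [4123, 760, 721, 142, 187, 331, 127, 219]
hou85 = [26881, 4896, 3730, 556, 1463, 1946, 834, 932]
weight = 4          # wt column
T_hou85 = 91753     # assumed known population total of hou85

t_y = weight * sum(ue91)    # design-based estimate of the ue91 total
t_x = weight * sum(hou85)   # design-based estimate of the hou85 total

r = t_y / t_x               # same as 6610 / 41238
ratio_total = r * T_hou85   # ratio estimate of the ue91 total

gwt = T_hou85 / t_x         # adjustment factor applied to every weight
adjwt = weight * gwt        # adjusted weight

print(round(r, 4), round(ratio_total, 1), round(gwt, 4), round(adjwt, 4))
```

The ratio rounds to .1603 and gwt to .5562, as in the data step. The adjwt computed here differs from the 2.2248 in the data in the fourth decimal, because the data value is 4 times the already rounded .5562.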
simple random sample without replacement for regression estimation
page 107 Table 3.14  Model-assisted estimation results for the population total of ue91 from an SRS sample of eight elements drawn from the Province'91 population.
data page106;
input id str clu wt ue91 meanz hou85 diffhou85 smplrat;
fpc = 32;
cards;
1 1 1 4 4123 2867 26881 -24014 .25
2 1 4 4 760 2867 4896 -2029 .25
3 1 5 4 721 2867 3730 -863 .25
4 1 15 4 142 2867 556 2311 .25
5 1 18 4 187 2867 1463 1404 .25
6 1 26 4 331 2867 1946 921 .25
7 1 30 4 127 2867 834 2033 .25
8 1 31 4 219 2867 932 1935 .25
;
run;
strategy:  design-based estimator with srs
proc descript data = page106 filetype = sas design = wor totals deft;
weight wt;
nest _one_;
totcnt fpc;
var ue91;
run;
Number of observations read    :      8    Weighted count :       32
Denominator degrees of freedom :      7
Variance Estimation Method: Taylor Series (WOR)
by: Variable, One.

-----------------------------------------------------
|                 |                  |
| Variable        |                  | One
|                 |                  | 1            |
-----------------------------------------------------
|                 |                  |              |
| UE91            | Sample Size      |            8 |
|                 | Weighted Size    |        32.00 |
|                 | Total            |     26440.00 |
|                 | SE Total         |     13282.26 |
|                 | Mean             |       826.25 |
|                 | SE Mean          |       415.07 |
|                 | DEFF Mean #4     |         0.75 |
|                 | DEFF Total #4    |         0.75 |
-----------------------------------------------------
strategy:  poststratified estimator with srs*pos
proc descript data = page106 filetype = sas design = wor totals;
weight wt;
nest _one_;
totcnt fpc;
var ue91 hou85;
subgroup str;
levels 1;
postvar str;
postwgt 8;
run;
Number of observations read    :      8    Weighted count :       32
Denominator degrees of freedom :      7
Variance Estimation Method: Taylor Series (WOR)
Post-stratified estimates
by: Variable, STR.

--------------------------------------------------------------------
|                 |                  |
| Variable        |                  | STR
|                 |                  | Total        | 1            |
--------------------------------------------------------------------
|                 |                  |              |              |
| UE91            | Sample Size      |            8 |            8 |
|                 | Weighted Size    |         8.00 |         8.00 |
|                 | Total            |      6610.00 |      6610.00 |
|                 | SE Total         |      3320.56 |      3320.56 |
|                 | Mean             |       826.25 |       826.25 |
|                 | SE Mean          |       415.07 |       415.07 |
--------------------------------------------------------------------
|                 |                  |              |              |
| HOU85           | Sample Size      |            8 |            8 |
|                 | Weighted Size    |         8.00 |         8.00 |
|                 | Total            |     41238.00 |     41238.00 |
|                 | SE Total         |     21824.64 |     21824.64 |
|                 | Mean             |      5154.75 |      5154.75 |
|                 | SE Mean          |      2728.08 |      2728.08 |
--------------------------------------------------------------------
strategy:  ratio estimator with srs*rat
This code is shown above for page 102.
strategy:  regression estimator with srs*reg
The code below produces the estimate of b-hat, 0.152, shown in the middle of page 106.
proc regress data = page106 filetype = sas design = wor;
weight wt;
nest _one_;
totcnt fpc;
model ue91 = hou85;
setenv decwidth = 3;
run;
Number of observations read       :      8    Weighted count:       32
Observations used in the analysis :      8    Weighted count:       32
Denominator degrees of freedom    :      7

Maximum number of estimable parameters for the model is  2

File PAGE106 contains    8 Clusters
8 clusters were used to fit the model
Maximum cluster size is   1 records
Minimum cluster size is   1 records

Weighted mean response is 826.250000

Multiple R-Square for the dependent variable UE91: 0.998249
Variance Estimation Method: Taylor Series (WOR)
SE Method: Robust (Binder, 1983)
Working Correlations: Independent
Response variable UE91: UE91

----------------------------------------------------------------------
Independent                                                   P-value
Variables and        Beta                                   T-Test
Effects              Coeff.          SE Beta   T-Test B=0   B=0
----------------------------------------------------------------------
Intercept                  42.655       20.540        2.077      0.076
HOU85                       0.152        0.001      212.012      0.000
----------------------------------------------------------------------
-------------------------------------------------------

Contrast               Degrees
of                      P-value
Freedom        Wald F   Wald F
-------------------------------------------------------
OVERALL MODEL             2.000   102820.497      0.000
MODEL MINUS
INTERCEPT               1.000    44949.184      0.000
INTERCEPT                 1.000        4.312      0.076
HOU85                     1.000    44949.184      0.000
-------------------------------------------------------
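This gives the estimate of the total of hou85, 164952, which is needed for the equation.  Note that 32*2867 = 91744; the population total of 91753 presumably reflects the unrounded mean (91753/32 is about 2867.28, which is rounded to 2867 in meanz).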
proc descript data = page106 filetype = sas design = wor totals;
weight wt;
nest _one_;
totcnt fpc;
var hou85;
run;
Number of observations read    :      8    Weighted count :       32
Denominator degrees of freedom :      7
Variance Estimation Method: Taylor Series (WOR)
by: Variable, One.

-----------------------------------------------------
|                 |                  |
| Variable        |                  | One
|                 |                  | 1            |
-----------------------------------------------------
|                 |                  |              |
| HOU85           | Sample Size      |            8 |
|                 | Weighted Size    |        32.00 |
|                 | Total            |    164952.00 |
|                 | SE Total         |     87298.57 |
|                 | Mean             |      5154.75 |
|                 | SE Mean          |      2728.08 |
-----------------------------------------------------
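Putting the pieces together, the regression estimate of the ue91 total can be sketched in Python. The sketch assumes the standard regression-estimator form, the design-based total plus b-hat times the difference between the known and estimated hou85 totals, and it assumes 91753 as the known population total of hou85; check both against the equation on page 106. Because all weights are equal, ordinary least squares gives the same slope as the weighted fit.

```python
ue91 = [4123, 760, 721, 142, 187, 331, 127, 219]
hou85 = [26881, 4896, 3730, 556, 1463, 1946, 834, 932]
weight = 4
T_hou85 = 91753  # assumed known population total of hou85

n = len(ue91)
x_bar = sum(hou85) / n
y_bar = sum(ue91) / n

# Least-squares slope and intercept of ue91 on hou85
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hou85, ue91))
s_xx = sum((x - x_bar) ** 2 for x in hou85)
b_hat = s_xy / s_xx
intercept = y_bar - b_hat * x_bar

t_y = weight * sum(ue91)    # 26440, design-based total of ue91
t_x = weight * sum(hou85)   # 164952, design-based total of hou85

# Regression estimator of the ue91 total
t_reg = t_y + b_hat * (T_hou85 - t_x)

print(round(b_hat, 3), round(intercept, 3), round(t_reg))
```

The slope and intercept agree with the proc regress output (0.152 and 42.655). The sample overestimates the hou85 total, so the regression estimate is pulled well below the design-based 26440.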
