### Stata FAQ: How do I analyze survey data with poststratification?

The examples below use Stata 9.

NOTE:  If you want to see the design effect or the misspecification effect, use estat effects after the command.

This example is taken from Levy and Lemeshow's Sampling of Populations, page 174.

This example uses the dogcats data set.
gen n_pop = 1300

gen n_type = .
(50 missing values generated)

replace n_type = 850 if type == "dog":type

replace n_type = 450 if type == "cat":type

svyset _n, fpc(n_pop) poststrata(type) postweight(n_type)

      pweight: <none>
          VCE: linearized
   Poststrata: type
   Postweight: n_type
     Strata 1: <one>
         SU 1: <observations>
        FPC 1: n_pop

svy: mean totexp
(running mean on estimation sample)

Survey: Mean estimation

Number of strata =       1          Number of obs    =      50
Number of PSUs   =      50          Population size  =    1300
N. of poststrata =       2          Design df        =      49

--------------------------------------------------------------
             |             Linearized
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
      totexp |   40.11513   1.163498      37.77699    42.45327
--------------------------------------------------------------

svy: mean totexp, over(type)
(running mean on estimation sample)

Survey: Mean estimation

Number of strata =       1          Number of obs    =      50
Number of PSUs   =      50          Population size  =    1300
N. of poststrata =       2          Design df        =      49

dog: type = dog
cat: type = cat

--------------------------------------------------------------
             |             Linearized
        Over |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
totexp       |
         dog |   49.85844    1.44369      46.95723    52.75964
         cat |   21.71111    1.96505       17.7622    25.66003
--------------------------------------------------------------
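
As a rough check on the numbers above, the poststratified mean should equal the per-type means weighted by each type's share of the population (850 dogs and 450 cats out of 1300). The line below is only a sketch of that arithmetic, using the rounded means from the output; it is not part of the original example.

```stata
* population-share-weighted average of the dog and cat means
display (850/1300)*49.85844 + (450/1300)*21.71111
```

The result should agree with the overall mean of 40.11513 up to rounding.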

svy: total totexp
(running total on estimation sample)

Survey: Total estimation

Number of strata =       1          Number of obs    =      50
Number of PSUs   =      50          Population size  =    1300
N. of poststrata =       2          Design df        =      49

--------------------------------------------------------------
             |             Linearized
             |      Total   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
      totexp |   52149.67   1512.548      49110.09    55189.25
--------------------------------------------------------------

svy: total totexp, over(type)
(running total on estimation sample)

Survey: Total estimation

Number of strata =       1          Number of obs    =      50
Number of PSUs   =      50          Population size  =    1300
N. of poststrata =       2          Design df        =      49

dog: type = dog
cat: type = cat

--------------------------------------------------------------
             |             Linearized
        Over |      Total   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
totexp       |
         dog |   42379.67   1227.136      39913.65    44845.69
         cat |       9770   884.2723      7992.988    11547.01
--------------------------------------------------------------
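
A quick, hedged check of the table above: with poststratification, each type's estimated total should be its poststratum size (850 or 450, the values stored in n_type) times its estimated mean from the earlier over(type) output. The lines below only redo that arithmetic with the rounded means.

```stata
* dog total: 850 dogs times the estimated mean for dogs
display 850*49.85844
* cat total: 450 cats times the estimated mean for cats
display 450*21.71111
* overall total: the two pieces added together
display 850*49.85844 + 450*21.71111
```

These should match 42379.67, 9770 and 52149.67 up to rounding in the displayed means.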

This example is taken from Lehtonen and Pahkinen's Practical Methods for Design and Analysis of Complex Surveys.

Page 97, Table 3.10: A simple random sample drawn without replacement from the Province'91 population with poststratum weights.
input id str clu wt ue91 lab91 poststr gwt postwt sruv srcvs
1 1 1 4 4123 33786 1 .5833 2.333 .25 .43
2 1 4 4 760 5919 1 .5833 2.333 .25 .43
3 1 5 4 721 4930 1 .5833 2.333 .25 .43
4 1 15 4 142 675 2 1.2500 5.0000 .25 .20
5 1 18 4 187 1448 2 1.2500 5.0000 .25 .20
6 1 26 4 331 2543 2 1.2500 5.0000 .25 .20
7 1 30 4 127 1084 2 1.2500 5.0000 .25 .20
8 1 31 4 219 1330 2 1.2500 5.0000 .25 .20
end
poststratified conditional estimates

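Note that you cannot get the deff with the postvar/postwgt statements. The numbers on the postwgt statement must be integers (i.e., whole numbers) and are the population totals for each poststratum; in the commands below they are stored in postw (7 and 25) and passed to postweight().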
gen fpc = 32

gen postw = .
(8 missing values generated)

replace postw = 7 if poststr == 1

replace postw = 25 if poststr == 2

svyset [pw=wt], fpc(fpc) poststrata(poststr) postweight(postw)

      pweight: wt
          VCE: linearized
   Poststrata: poststr
   Postweight: postw
     Strata 1: <one>
         SU 1: <observations>
        FPC 1: fpc
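
The postwt column in the input data appears to be the poststratum population count divided by the number of sampled units in that poststratum. A small sketch of that relationship, assuming the counts 7 and 25 from the replace commands and the sample sizes 3 and 5 read off the data rows:

```stata
* poststratum 1: 7 population units, 3 sampled
display 7/3
* poststratum 2: 25 population units, 5 sampled
display 25/5
```

The first value rounds to the 2.333 listed for poststratum 1, and the second is the 5.0000 listed for poststratum 2.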

svy: total ue91
(running total on estimation sample)

Survey: Total estimation

Number of strata =       1          Number of obs    =       8
Number of PSUs   =       8          Population size  =      32
N. of poststrata =       2          Design df        =       7

--------------------------------------------------------------
             |             Linearized
             |      Total   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
        ue91 |      18106   6013.646      3885.986    32326.01
--------------------------------------------------------------
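
To see where 18106 comes from, the poststratified total can be rebuilt by hand: each poststratum's population count (7 and 25) times the sample mean of ue91 in that poststratum. The ue91 values are copied from the data listing; this is a sketch of the arithmetic, not a description of how Stata computes the estimate internally.

```stata
* poststratum 1 (3 sampled units) plus poststratum 2 (5 sampled units)
display 7*(4123+760+721)/3 + 25*(142+187+331+127+219)/5
```

This reproduces the estimated total of 18106.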

svy: ratio ue91/lab91
(running ratio on estimation sample)

Survey: Ratio estimation

Number of strata =       1          Number of obs    =       8
Number of PSUs   =       8          Population size  =      32
N. of poststrata =       2          Design df        =       7

_ratio_1: ue91/lab91

--------------------------------------------------------------
             |             Linearized
             |      Ratio   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
    _ratio_1 |   .1297472    .004386       .119376    .1401184
--------------------------------------------------------------
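
The poststratified ratio can be sketched as the poststratified total of ue91 divided by the poststratified total of lab91, where each total is the poststratum population count times the poststratum sample mean. The scalar names tot_ue and tot_lab are made up for this illustration, and the values are copied from the data listing.

```stata
* poststratified total of ue91 (numerator)
scalar tot_ue  = 7*(4123+760+721)/3 + 25*(142+187+331+127+219)/5
* poststratified total of lab91 (denominator)
scalar tot_lab = 7*(33786+5919+4930)/3 + 25*(675+1448+2543+1084+1330)/5
* ratio of the two totals
display tot_ue/tot_lab
```

The result should agree with the reported ratio of .1297472 up to display precision.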

poststratified unconditional estimates
This has been skipped for now.

pure design-based estimates under SRS (simple random sampling)

svyset [pw=wt], fpc(fpc)

      pweight: wt
          VCE: linearized
     Strata 1: <one>
         SU 1: <observations>
        FPC 1: fpc

svy: total ue91
(running total on estimation sample)

Survey: Total estimation

Number of strata =       1          Number of obs    =       8
Number of PSUs   =       8          Population size  =      32
                                    Design df        =       7

--------------------------------------------------------------
             |             Linearized
             |      Total   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
        ue91 |      26440   13282.26     -4967.551    57847.55
--------------------------------------------------------------
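
Without poststratification, every sampled unit simply carries the sampling weight wt = 4 (32 population units, 8 sampled), so the estimated total is 4 times the sample sum of ue91. A minimal sketch with the values copied from the data listing:

```stata
* sampling weight times the sum of the eight sampled ue91 values
display 4*(4123+760+721+142+187+331+127+219)
```

This gives 26440, the design-based total shown above.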

svy: ratio ue91/lab91
(running ratio on estimation sample)

Survey: Ratio estimation

Number of strata =       1          Number of obs    =       8
Number of PSUs   =       8          Population size  =      32
                                    Design df        =       7

_ratio_1: ue91/lab91

--------------------------------------------------------------
             |             Linearized
             |      Ratio   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
    _ratio_1 |   .1278159   .0040873      .1181511    .1374808
--------------------------------------------------------------
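
Because all eight units carry the same weight under SRS, the weights cancel in the ratio, which reduces to the sample sum of ue91 over the sample sum of lab91. A hedged sketch of that arithmetic, with values copied from the data listing:

```stata
* sum of ue91 divided by sum of lab91 over the eight sampled units
display (4123+760+721+142+187+331+127+219)/(33786+5919+4930+675+1448+2543+1084+1330)
```

The result should agree with the reported ratio of .1278159 up to display precision.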
