Stata FAQ 
How do I analyze survey data with poststratification?

The examples below use Stata 9. 

NOTE: To see the design effect or the misspecification effect, run estat effects after the estimation command.

This example is taken from Levy and Lemeshow's Sampling of Populations, page 174.

This example uses the dogcats data set.
gen n_pop = 1300

gen n_type = .
(50 missing values generated)

replace n_type = 850 if type == "dog":type
(32 real changes made)

replace n_type = 450 if type == "cat":type
(18 real changes made)

svyset _n, fpc(n_pop) poststrata(type) postweight(n_type)

      pweight: <none>
          VCE: linearized
   Poststrata: type
   Postweight: n_type
     Strata 1: <one>
         SU 1: <observations>
        FPC 1: n_pop

svy: mean totexp
(running mean on estimation sample)

Survey: Mean estimation

Number of strata =       1          Number of obs    =      50
Number of PSUs   =      50          Population size  =    1300
N. of poststrata =       2          Design df        =      49

--------------------------------------------------------------
             |             Linearized
             |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
      totexp |   40.11513   1.163498      37.77699    42.45327
--------------------------------------------------------------

svy: mean totexp, over(type)
(running mean on estimation sample)

Survey: Mean estimation

Number of strata =       1          Number of obs    =      50
Number of PSUs   =      50          Population size  =    1300
N. of poststrata =       2          Design df        =      49

          dog: type = dog
          cat: type = cat

--------------------------------------------------------------
             |             Linearized
        Over |       Mean   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
totexp       |
         dog |   49.85844    1.44369      46.95723    52.75964
         cat |   21.71111    1.96505       17.7622    25.66003
--------------------------------------------------------------
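
As a quick check on how poststratification combines the two groups, the overall mean should equal the poststratum means weighted by the population counts supplied in n_type (850 dogs and 450 cats out of 1300). The lines below are only a sketch: they retype the rounded means from the output above, so the result should agree with the reported mean to about five decimal places, not exactly.

```stata
* Poststratified mean = sum over poststrata of (N_h / N) * mean_h
* 49.85844 and 21.71111 are the dog and cat means copied from the output above
display (850*49.85844 + 450*21.71111) / 1300
```

The displayed value should be close to the 40.11513 reported by svy: mean totexp.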

svy: total totexp
(running total on estimation sample)

Survey: Total estimation

Number of strata =       1          Number of obs    =      50
Number of PSUs   =      50          Population size  =    1300
N. of poststrata =       2          Design df        =      49

--------------------------------------------------------------
             |             Linearized
             |      Total   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
      totexp |   52149.67   1512.548      49110.09    55189.25
--------------------------------------------------------------

svy: total totexp, over(type)
(running total on estimation sample)

Survey: Total estimation

Number of strata =       1          Number of obs    =      50
Number of PSUs   =      50          Population size  =    1300
N. of poststrata =       2          Design df        =      49

          dog: type = dog
          cat: type = cat

--------------------------------------------------------------
             |             Linearized
        Over |      Total   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
totexp       |
         dog |   42379.67   1227.136      39913.65    44845.69
         cat |       9770   884.2723      7992.988    11547.01
--------------------------------------------------------------
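
Because type is also the poststratification variable, each total in this table can be checked by hand as the poststratum population count times the poststratum mean. This is a sketch that retypes the rounded means from the svy: mean totexp, over(type) output, so expect agreement only up to rounding.

```stata
* Total_h = N_h * mean_h, using the rounded means from the over(type) output
display 850 * 49.85844
display 450 * 21.71111
* the two poststratum totals should add up to the overall total
display 850*49.85844 + 450*21.71111
```

The three values should be close to 42379.67, 9770, and 52149.67, the totals reported above.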

This example is taken from Lehtonen and Pahkinen's Practical Methods for Design and Analysis of Complex Surveys.

Page 97, Table 3.10: A simple random sample drawn without replacement from the Province'91 population, with poststratum weights.
input id str clu wt ue91 lab91 poststr gwt postwt sruv srcvs
  1 1 1 4 4123 33786 1 .5833 2.333 .25 .43
  2 1 4 4 760 5919 1 .5833 2.333 .25 .43
  3 1 5 4 721 4930 1 .5833 2.333 .25 .43
  4 1 15 4 142 675 2 1.2500 5.0000 .25 .20
  5 1 18 4 187 1448 2 1.2500 5.0000 .25 .20
  6 1 26 4 331 2543 2 1.2500 5.0000 .25 .20
  7 1 30 4 127 1084 2 1.2500 5.0000 .25 .20
  8 1 31 4 219 1330 2 1.2500 5.0000 .25 .20
end
Poststratified conditional estimates

Note that you cannot get the design effect (deff) when poststratification is specified. The poststratum weights (here, postw) must be integers (i.e., whole numbers) and are the population totals for each poststratum.
gen fpc = 32

gen postw = .
(8 missing values generated)

replace postw = 7 if poststr == 1
(3 real changes made)

replace postw = 25 if poststr == 2
(5 real changes made)

svyset [pw=wt], fpc(fpc) poststrata(poststr) postweight(postw)

      pweight: wt
          VCE: linearized
   Poststrata: poststr
   Postweight: postw
     Strata 1: <one>
         SU 1: <observations>
        FPC 1: fpc

svy: total ue91
(running total on estimation sample)

Survey: Total estimation

Number of strata =       1          Number of obs    =       8
Number of PSUs   =       8          Population size  =      32
N. of poststrata =       2          Design df        =       7

--------------------------------------------------------------
             |             Linearized
             |      Total   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
        ue91 |      18106   6013.646      3885.986    32326.01
--------------------------------------------------------------
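
The poststratified total can be reproduced by hand as the sum, over poststrata, of the population count times the sample mean in that poststratum. The sketch below assumes the data entered above and the postw variable are still in memory; mean_ue91, one_per, and contrib are scratch variable names made up for this illustration. The plain (unweighted) mean is enough here only because every observation has the same sampling weight (wt = 4).

```stata
* sample mean of ue91 within each poststratum
bysort poststr: egen double mean_ue91 = mean(ue91)

* flag exactly one observation per poststratum
egen byte one_per = tag(poststr)

* N_h * mean_h, kept on one row per poststratum
gen double contrib = postw * mean_ue91 if one_per

* add up the poststratum contributions
quietly summarize contrib
display r(sum)

* remove the scratch variables
drop mean_ue91 one_per contrib
```

The displayed sum should match the 18106 reported by svy: total ue91 (7 * 1868 + 25 * 201.2).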

svy: ratio ue91/lab91
(running ratio on estimation sample)

Survey: Ratio estimation

Number of strata =       1          Number of obs    =       8
Number of PSUs   =       8          Population size  =      32
N. of poststrata =       2          Design df        =       7

     _ratio_1: ue91/lab91

--------------------------------------------------------------
             |             Linearized
             |      Ratio   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
    _ratio_1 |   .1297472    .004386       .119376    .1401184
--------------------------------------------------------------
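
The ratio estimate is the poststratified total of ue91 divided by the poststratified total of lab91. The sketch below checks the point estimate only (not the standard error) and assumes the data entered above and postw are in memory; t_ue91, t_lab91, and the scalar r_post are names invented for this illustration. The preserve and restore commands leave the original data untouched.

```stata
preserve

* one row per poststratum: sample means plus the population count
collapse (mean) ue91 lab91 (first) postw, by(poststr)

* poststratum totals = N_h * mean_h
gen double t_ue91  = postw * ue91
gen double t_lab91 = postw * lab91

* add the poststratum totals and form the ratio
collapse (sum) t_ue91 t_lab91
scalar r_post = t_ue91[1] / t_lab91[1]
display r_post

restore
```

The displayed ratio should agree with the .1297472 reported above, up to display rounding.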

Poststratified unconditional estimates
This has been skipped for now.

Pure design-based estimates under SRS

svyset [pw=wt], fpc(fpc)

      pweight: wt
          VCE: linearized
     Strata 1: <one>
         SU 1: <observations>
        FPC 1: fpc

svy: total ue91
(running total on estimation sample)

Survey: Total estimation

Number of strata =       1          Number of obs    =       8
Number of PSUs   =       8          Population size  =      32
                                    Design df        =       7

--------------------------------------------------------------
             |             Linearized
             |      Total   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
        ue91 |      26440   13282.26     -4967.551    57847.55
--------------------------------------------------------------
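
Without poststratification, the total and its standard error follow the usual simple-random-sampling formulas with a finite population correction: the total is N times the sample mean, and the standard error is N * sqrt((1 - n/N) * s^2 / n). The sketch below assumes the data entered above are in memory and hard-codes N = 32, the value stored in fpc.

```stata
* n, the sample mean, and the sample variance of ue91
quietly summarize ue91

* estimated total: N * mean
display 32 * r(mean)

* standard error with the finite population correction (1 - n/N)
display 32 * sqrt((1 - r(N)/32) * r(Var) / r(N))
```

The two values should be close to the 26440 and 13282.26 reported above.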

svy: ratio ue91/lab91
(running ratio on estimation sample)

Survey: Ratio estimation

Number of strata =       1          Number of obs    =       8
Number of PSUs   =       8          Population size  =      32
                                    Design df        =       7

     _ratio_1: ue91/lab91

--------------------------------------------------------------
             |             Linearized
             |      Ratio   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
    _ratio_1 |   .1278159   .0040873      .1181511    .1374808
--------------------------------------------------------------
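
Since this design has no poststrata, it is a reasonable place to try the estat effects command mentioned in the note at the top of the page. This is only a sketch: it assumes the svyset [pw=wt], fpc(fpc) design above is still the active one, and the output is not reproduced here.

```stata
* re-run the estimation command, then ask for design and misspecification effects
svy: ratio ue91/lab91
estat effects
```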
