Stata Textbook Examples
Practical Methods for Design and Analysis of Complex Surveys, Second Edition
by Lehtonen and Pahkinen
Chapter 2:  Basic sampling techniques

The examples below use Stata 9.  If you are using Stata versions 7 or 8, please see this page.

NOTE:  To display the design effect or the misspecification effect, run estat effects after the estimation command.

Simple random sampling

page 29 Table 2.4  Estimates from a simple random sample drawn without replacement (n = 8); the Province'91 population.
clear
input id cluster ue91 lab91
1 1 4123 33786
2 4 760 5919
3 5 721 4930
4 15 142 675
5 18 187 1448
6 26 331 2543
7 30 127 1084
8 31 219 1330
end

gen fpc = 32
gen wt = 4
gen strata = 1
svyset [pweight=wt], fpc(fpc)

      pweight: wt
          VCE: linearized
     Strata 1: <one>
         SU 1: <observations>
        FPC 1: fpc

svy: total ue91
(running total on estimation sample)

Survey: Total estimation

Number of strata =       1          Number of obs    =       8
Number of PSUs   =       8          Population size  =      32
                                    Design df        =       7

--------------------------------------------------------------
             |             Linearized
             |      Total   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
        ue91 |      26440   13282.26     -4967.551    57847.55
--------------------------------------------------------------

estat effects

----------------------------------------------------------
             |             Linearized
             |      Total   Std. Err.       Deff      Deft
-------------+--------------------------------------------
        ue91 |      26440   13282.26           1   .866025
----------------------------------------------------------
Note: Weights must represent population totals for deff to be correct when using an FPC; however, deft is
      invariant to the scale of weights.
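
The total and its standard error can be cross-checked by hand. The sketch below assumes the usual textbook formulas for simple random sampling without replacement: the total is N times the sample mean, and its variance is N^2 (1 - n/N) s^2 / n. The scalar names (Npop, nsamp, tot_hat, se_hat) are ours and do not appear in the text. The results should agree with the svy: total output above up to rounding, and the last value displayed, sqrt(1 - n/N), should match the Deft column.

```stata
* Hand check of the svy: total results (a sketch; scalar names are ours)
clear
input id cluster ue91 lab91
1 1 4123 33786
2 4 760 5919
3 5 721 4930
4 15 142 675
5 18 187 1448
6 26 331 2543
7 30 127 1084
8 31 219 1330
end

* Sample size, mean and variance of ue91 are left in r() by summarize
quietly summarize ue91
scalar Npop    = 32
scalar nsamp   = r(N)
* Estimated total: N times the sample mean
scalar tot_hat = Npop * r(mean)
* Standard error with the finite population correction (1 - n/N)
scalar se_hat  = sqrt(Npop^2 * (1 - nsamp/Npop) * r(Var) / nsamp)
display "total = " tot_hat "   se = " se_hat
display "sqrt(1 - n/N) = " sqrt(1 - nsamp/Npop)
```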

svy: ratio ue91 lab91
(running ratio on estimation sample)

Survey: Ratio estimation

Number of strata =       1          Number of obs    =       8
Number of PSUs   =       8          Population size  =      32
                                    Design df        =       7

     _ratio_1: ue91/lab91

--------------------------------------------------------------
             |             Linearized
             |      Ratio   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
    _ratio_1 |   .1278159   .0040873      .1181511    .1374808
--------------------------------------------------------------

estat effects

     _ratio_1: ue91/lab91

----------------------------------------------------------
             |             Linearized
             |      Ratio   Std. Err.       Deff      Deft
-------------+--------------------------------------------
    _ratio_1 |   .1278159   .0040873           1   .866025
----------------------------------------------------------
Note: Weights must represent population totals for deff to be correct when using an FPC; however, deft is
      invariant to the scale of weights.
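
The ratio and its linearized standard error can be checked the same way. This is a sketch under the standard linearization argument: the ratio is the quotient of the two weighted totals, and its variance is approximated by the variance of the total of the residuals ue91 - R*lab91, divided by the squared estimated total of lab91. The names sum_y, sum_x, R_hat, se_R and resid are ours. The results should agree with the svy: ratio output above up to rounding.

```stata
* Hand check of the svy: ratio results (a sketch; names are ours)
clear
input id cluster ue91 lab91
1 1 4123 33786
2 4 760 5919
3 5 721 4930
4 15 142 675
5 18 187 1448
6 26 331 2543
7 30 127 1084
8 31 219 1330
end

* With a common weight of 4, the ratio of weighted totals is the ratio of sums
quietly summarize ue91
scalar sum_y = r(sum)
quietly summarize lab91
scalar sum_x = r(sum)
scalar R_hat = sum_y / sum_x

* Linearization residuals; their sample mean is zero by construction
generate double resid = ue91 - R_hat * lab91
quietly summarize resid

* SE of the residual total (N = 32, with FPC) over the estimated total of lab91
scalar se_R = sqrt(32^2 * (1 - r(N)/32) * r(Var) / r(N)) / (4 * sum_x)
display "ratio = " R_hat "   se = " se_R
```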

Systematic sampling

page 46 Table 2.6  Estimates from a systematic sample drawn from the Province'91 population using implicit stratification.

NOTE:  The standard error of the total differs from the one shown in the text (11802).  However, we obtain 13627 in every statistical package in which we have tried to recreate this example.
clear
input id str clu wt ue91 lab91
1 1 1 4 4123 33786
2 1 5 4 721 4930
3 2 9 4 194 2069
4 2 13 4 129 927
5 2 17 4 239 2144
6 2 21 4 61 573
7 2 25 4 262 1737
8 2 29 4 166 1615
end

gen fpc = 32
svyset clu [pweight=wt], strata(str)

      pweight: wt
          VCE: linearized
     Strata 1: str
         SU 1: clu
        FPC 1: <zero>

svy: total ue91
(running total on estimation sample)

Survey: Total estimation

Number of strata =       2          Number of obs    =       8
Number of PSUs   =       8          Population size  =      32
                                    Design df        =       6

--------------------------------------------------------------
             |             Linearized
             |      Total   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
        ue91 |      23580   13627.19     -9764.529    56924.53
--------------------------------------------------------------

svy: ratio ue91 lab91
(running ratio on estimation sample)

Survey: Ratio estimation

Number of strata =       2          Number of obs    =       8
Number of PSUs   =       8          Population size  =      32
                                    Design df        =       6

     _ratio_1: ue91/lab91

--------------------------------------------------------------
             |             Linearized
             |      Ratio   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
    _ratio_1 |   .1233754    .003848      .1139596    .1327912
--------------------------------------------------------------
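
One possible source of the discrepancy mentioned in the NOTE for Table 2.6 is the finite population correction. This is our conjecture; neither the text nor this page confirms it. The svyset command above does not use the fpc variable, and the output reports FPC 1: <zero>. The sketch below assumes a sampling rate of 0.25 in each stratum (2 of 8 and 6 of 24 clusters, as implied by the weights of 4) and passes it through fpc(), which treats values of 1 or less as sampling rates. The variable name rate is ours. Under these assumptions the standard error comes out at about 11801.5, close to the 11802 printed in the text.

```stata
* Systematic sample with an assumed stratum sampling rate (a sketch)
clear
input id str clu wt ue91 lab91
1 1 1 4 4123 33786
2 1 5 4 721 4930
3 2 9 4 194 2069
4 2 13 4 129 927
5 2 17 4 239 2144
6 2 21 4 61 573
7 2 25 4 262 1737
8 2 29 4 166 1615
end

* Assumed sampling rate n_h/N_h = 0.25 in both strata (not stated on this page)
generate rate = 0.25
svyset clu [pweight=wt], strata(str) fpc(rate)
svy: total ue91

* Compare with the SE without FPC, scaled by sqrt(1 - 0.25)
display "se with FPC = " _se[ue91]
display "13627.19 * sqrt(.75) = " 13627.19 * sqrt(0.75)
```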

page 60 Table 2.8  Estimates under a PPSSYS design (n = 8); the Province'91 population.

NOTE:  The certainty PSU (the first line of the data) was entered twice, and the weight of each copy was changed from 1 to .5.  This is necessary because each stratum must contain at least two observations for the variance to be estimated.
clear
input id str clu wt hou85 ue91 lab91
1 2 1 .5 26881 4123 33786
2 2 2 .5 26881 4123 33786
3 1 10 1.004 9230 1623 13727
4 1 4 1.893 4896 760 5919
5 1 7 2.173 4264 767 5823
6 1 32 2.971 3119 568 4011
7 1 26 4.762 1946 331 2543
8 1 18 6.335 1463 187 1448
9 1 13 13.730 675 129 927
end

gen fpc = 32
svyset clu [pweight=wt], strata(str)

      pweight: wt
          VCE: linearized
     Strata 1: str
         SU 1: clu
        FPC 1: <zero>

svy: total ue91
(running total on estimation sample)

Survey: Total estimation

Number of strata =       2          Number of obs    =       9
Number of PSUs   =       9          Population size  =  33.868
                                    Design df        =       7

--------------------------------------------------------------
             |             Linearized
             |      Total   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
        ue91 |   15077.43   521.1212      13845.17    16309.68
--------------------------------------------------------------

svy: ratio ue91 lab91
(running ratio on estimation sample)

Survey: Ratio estimation

Number of strata =       2          Number of obs    =       9
Number of PSUs   =       9          Population size  =  33.868
                                    Design df        =       7

     _ratio_1: ue91/lab91

--------------------------------------------------------------
             |             Linearized
             |      Ratio   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
    _ratio_1 |   .1284791   .0022215       .123226    .1337321
--------------------------------------------------------------
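
The duplication described in the NOTE for Table 2.8 could also be scripted rather than typed by hand. This is a minimal sketch, assuming the data start with one row per sampled PSU and the certainty PSU carries a weight of 1. The eight-row starting layout, its id values, and the use of expand are our illustration and do not appear on this page. The resulting total and standard error should agree with the svy: total output above up to rounding.

```stata
* Build the nine-row PPSSYS data set from eight rows (a sketch)
clear
input id str clu wt hou85 ue91 lab91
1 2 1 1 26881 4123 33786
2 1 10 1.004 9230 1623 13727
3 1 4 1.893 4896 760 5919
4 1 7 2.173 4264 767 5823
5 1 32 2.971 3119 568 4011
6 1 26 4.762 1946 331 2543
7 1 18 6.335 1463 187 1448
8 1 13 13.730 675 129 927
end

* Duplicate the certainty PSU (stratum 2); the copy is appended as the last row
expand 2 if str == 2
* Halve the weight so that the two copies together still carry a weight of 1
replace wt = wt / 2 if str == 2
* Give the copy its own PSU identifier so the stratum contains two PSUs
replace clu = 2 in l

svyset clu [pweight=wt], strata(str)
svy: total ue91
```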

These pages are Copyrighted (c) by UCLA Academic Technology Services