SAS FAQ 
How do I analyze survey data with a stratified random sampling design?

This example is taken from Levy and Lemeshow's Sampling of Populations.

Page 138: Stratification and stratified random sampling
This example uses the hospsamp data set.

data second138;
  input id _TOTAL_ oblevel;
  cards;
  1 42 1
  2 42 1
  3 42 1
  4 42 1
  5 99 2
  6 99 2
  7 99 2
  8 99 2
  9 99 2
  10 17 3
  11 17 3
  12 17 3
  13 17 3
  14 17 3
  15 17 3
  ;
run;

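NOTE:  You cannot get the totals for both the whole group and the sub-groups in the same proc surveymeans step, so the procedure is run twice below: once for the overall total and once with a by statement for the total within each stratum.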
NOTE:  The data set second138 tells SAS the population total for each stratum.  These totals are used to compute the finite population correction (fpc).  Only a single number can be supplied directly on the proc surveymeans statement.  Because the totals differ across strata, we need to supply them to SAS in a data set.  SAS will not allow them to be in the same data set as the survey data; they must be in their own data set, which is named on the n = option.  In this data set, the variable that contains the totals must be called _TOTAL_.  The variable oblevel is included because SAS requires all of the variables listed on the strata statement to appear in this data set.
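
Because the strata statement names only oblevel, one row per stratum should be enough in the totals data set. The sketch below assumes, as we understand the SAS documentation, that the procedure needs just the strata variable plus _TOTAL_ and uses only the first value it finds for each stratum. The data set name totals138 is hypothetical, and total = is the long form of the n = option.

```sas
/* Hypothetical data set totals138: one row per stratum.           */
/* oblevel is the strata variable; _TOTAL_ is the stratum size N_h. */
data totals138;
  input oblevel _TOTAL_;
  datalines;
1 42
2 99
3 17
;
run;

/* Same analysis as on this page, pointing at the compact totals data set. */
/* hospsamp is assumed to exist already, as in the original example.       */
proc surveymeans data = hospsamp total = totals138 sum;
  weight weighta;
  strata oblevel;
  var births;
run;
```

If the assumption holds, this should reproduce the estimates obtained with second138, since the stratum totals supplied are the same.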

proc surveymeans data = hospsamp n = second138 sum;
  weight weighta;
  strata oblevel;
  var births;
run;

The SURVEYMEANS Procedure

            Data Summary

Number of Strata                   3
Number of Observations            15
Sum of Weights            157.999931

               Statistics

Variable             Sum         Std Dev
----------------------------------------
births            183983           34014
----------------------------------------

proc surveymeans data = hospsamp n = second138 sum;
  weight weighta;
  strata oblevel;
  by oblevel;
  var births;
run;

oblevel=1

The SURVEYMEANS Procedure

            Data Summary

Number of Strata                   1
Number of Observations             4
Sum of Weights                    42

               Statistics

Variable             Sum         Std Dev
----------------------------------------
births             14931     2669.856738
----------------------------------------

oblevel=2

The SURVEYMEANS Procedure

            Data Summary

Number of Strata                   1
Number of Observations             5
Sum of Weights             98.999939

               Statistics

Variable             Sum         Std Dev
----------------------------------------
births            117117           33068
----------------------------------------

oblevel=3

The SURVEYMEANS Procedure

            Data Summary

Number of Strata                   1
Number of Observations             6
Sum of Weights            16.9999924

               Statistics

Variable             Sum         Std Dev
----------------------------------------
births             51935     7508.399372
----------------------------------------
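
The by-group results can be checked against the overall result. Strata are sampled independently, so the stratum totals add up to the overall total and the stratum variances add up to the overall variance. The data step below is only an arithmetic sketch using the numbers printed above; it does not read hospsamp.

```sas
data _null_;
  /* population size: the three stratum totals from second138 */
  N = 42 + 99 + 17;
  /* estimated total: sum of the three by-group sums */
  total = 14931 + 117117 + 51935;
  /* standard deviation: square root of the summed stratum variances */
  sd = sqrt(2669.856738**2 + 33068**2 + 7508.399372**2);
  put N= total= sd=;
run;
```

The log should show N=158 (compare the Sum of Weights of 157.999931), total=183983, and a standard deviation close to the 34014 reported for the whole sample. A small difference is expected because the stratum 2 value is printed rounded to 33068.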

This example is taken from Lehtonen and Pahkinen's Practical Methods for Design and Analysis of Complex Surveys.

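Page 74, Table 3.3: Estimates from an optimally allocated stratified simple random sample (n = 8); the Province'91 population.
NOTE:  In this data set, the fpc changes with the strata.  Unlike the previous example, it is supplied as a sampling rate rather than as a population total: the data set named on the r = option must contain the strata variable and a variable called _RATE_.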

data page74;
  input id str clu wt ue91 lab91 fpc;
  cards;
  1 1 1 1.75 4123 33786 7
  2 1 2 1.75 666 6016 7
  3 1 4 1.75 760 5919 7
  4 1 6 1.75 457 3022 7
  5 2 21 6.25 61 573 25
  6 2 25 6.25 262 1737 25
  7 2 26 6.25 331 2543 25
  8 2 27 6.25 98 545 25
  ;
run;

data second74;
  input id str _RATE_;
  cards;
  1 1 0.57
  2 1 0.57
  3 1 0.57
  4 1 0.57
  5 2 0.16
  6 2 0.16
  7 2 0.16
  8 2 0.16
  ;
run;
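
The rates in second74 appear to be the stratum sampling fractions rounded to two decimals: the fpc column of page74 holds 7 and 25, each stratum has four sampled units, and 4/7 is about 0.5714 while 4/25 is 0.16. Assuming fpc really is the stratum population size, a sketch that derives unrounded rates directly from page74 could look like this (rate74 is a hypothetical name):

```sas
/* Hypothetical data set rate74: one row per stratum with              */
/* _RATE_ = sampled units / stratum population size (fpc column).       */
proc sql;
  create table rate74 as
  select str,
         count(*) / max(fpc) as _RATE_
  from page74
  group by str;
quit;

/* Same model as on this page, using the derived rates. */
proc surveymeans data = page74 rate = rate74 sum std;
  weight wt;
  strata str;
  cluster clu;
  var ue91;
run;
```

The estimated sum does not depend on the rate, but the standard deviation may differ slightly from the one obtained with second74, because 0.57 is a rounded value of 4/7.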

proc surveymeans data = page74 r = second74 sum std;
  weight wt;
  strata str;
  cluster clu;
  var ue91;
run;

The SURVEYMEANS Procedure

            Data Summary

Number of Strata                   2
Number of Clusters                 8
Number of Observations             8
Sum of Weights                    32

               Statistics

Variable             Sum         Std Dev
----------------------------------------
ue91               15211     4285.724888
----------------------------------------
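
The estimated sum is simply the weighted sum of ue91, where each weight equals the stratum size divided by the stratum sample size (7/4 = 1.75 and 25/4 = 6.25). As a quick check, sketched here with proc sql on the page74 data set defined above (the column name total_ue91 is hypothetical):

```sas
/* Weighted sum of ue91: 1.75 * 6006 + 6.25 * 752 */
proc sql;
  select sum(wt * ue91) as total_ue91
  from page74;
quit;
```

This gives 15210.5, which proc surveymeans prints as 15211.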

These pages are Copyrighted (c) by UCLA Academic Technology Services

