UCLA Academic Technology Services

WesVar FAQ
How do I analyze survey data with a probability proportional to size sampling design?

This example is taken from Levy and Lemeshow's Sampling of Populations, page 350 (probability proportional to size sampling).
A short "movie" showing how to convert the SAS data set wvhspslct into a WesVar data set can be viewed by clicking here.  A new variable called cons was added to the hspslct data set.  This variable is a constant equal to one (any other constant would work equally well).  It is there only so that the output table can be kept as simple as possible.
A second "movie" shows how to analyze the data once it is in WesVar format.  You can view that movie by clicking here.
In this example, the variable wstar is used as the weight variable, the variable drawing is used as the VarUnit, and the variables lifethrt and dxdead are used as the analysis variables.  The variable cons is used to make the table.  The jackknife-1 (JK1) method is used to create the replicate weights because this sampling design has no stratification.
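As a rough illustration of what the JK1 method does with the weight and VarUnit variables, here is a minimal Python sketch. It is not WesVar's code. It assumes the usual JK1 recipe: each replicate drops one VarUnit, the remaining weights are multiplied by G/(G-1), and the variance is (G-1)/G times the sum of squared deviations of the replicate estimates from the full-sample estimate. The function name jk1_total and the toy data are hypothetical and are not the hospital data from the text.

```python
import math


def jk1_total(weights, varunits, values):
    """Estimate a weighted total and its JK1 standard error.

    weights  -- full-sample weight for each record (the role played by wstar)
    varunits -- VarUnit label for each record (the role played by drawing)
    values   -- analysis variable for each record (e.g. a 0/1 indicator)
    """
    units = sorted(set(varunits))
    g = len(units)
    if g < 2:
        raise ValueError("JK1 needs at least two VarUnits")

    # Full-sample estimate of the total.
    full = sum(w * y for w, y in zip(weights, values))

    # One replicate per VarUnit: drop that unit, inflate the others by G/(G-1).
    factor = g / (g - 1)
    sum_sq = 0.0
    for dropped in units:
        replicate = sum(
            w * factor * y
            for w, u, y in zip(weights, varunits, values)
            if u != dropped
        )
        sum_sq += (replicate - full) ** 2

    variance = (g - 1) / g * sum_sq
    return full, math.sqrt(variance)


# Hypothetical toy data: three VarUnits, two records each, equal weights.
toy_weights = [10, 10, 10, 10, 10, 10]
toy_varunits = [1, 1, 2, 2, 3, 3]
toy_values = [1, 0, 0, 0, 1, 1]

estimate, std_error = jk1_total(toy_weights, toy_varunits, toy_values)
print(estimate, std_error)
```

With equal-sized VarUnits and equal weights, as in this toy data, the JK1 variance coincides with the usual between-unit variance estimate for a total.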
The output (shown at the end of the analysis "movie") is given below.

The marginal sum_wts value of 50056 is the estimated population total.  The marginal lifethrt value of 6006.72 is the estimated total of the variable lifethrt, and its standard error is 1001.1200.  The marginal dxdead value of 2002.2400 is the estimated total of the variable dxdead, and its standard error is 1226.1166.  The marginal m_lifethrt value of 0.1200 is the estimated mean of the variable lifethrt, and its standard error is 0.0200.  The marginal m_dxdead value of 0.0400 is the estimated mean of the variable dxdead, and its standard error is 0.0245.  The marginal ratio value of 0.3333 is the estimated ratio of dxdead/lifethrt, and its standard error is 0.2404.
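The point estimates reported above are tied together by simple arithmetic: each mean is the corresponding total divided by sum_wts, and the ratio is the dxdead total divided by the lifethrt total. The short Python check below uses only the figures quoted in the output. The variable names are chosen here for illustration. It does not recompute the standard errors, which require the replicate weights.

```python
# Point estimates quoted from the WesVar output.
sum_wts = 50056.0          # estimated population total
total_lifethrt = 6006.72   # estimated total of lifethrt
total_dxdead = 2002.24     # estimated total of dxdead

# Means are totals divided by the sum of the weights.
mean_lifethrt = total_lifethrt / sum_wts
mean_dxdead = total_dxdead / sum_wts

# The ratio estimate is one estimated total over the other.
ratio = total_dxdead / total_lifethrt

print(round(mean_lifethrt, 4), round(mean_dxdead, 4), round(ratio, 4))
```

Rounded to four decimal places, these agree with the reported m_lifethrt, m_dxdead, and ratio values.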
To reproduce the second example, you need to calculate the weight variable w2star using the formula in the text.  We used the following SAS code to create this variable in a data step, and then followed the exact same procedure shown above to reproduce the output (using w2star instead of wstar as the full sample weight variable, of course!).
data hspslct2;
  set hospslct;
  /* n is 50 */
  /* N_i is admiss */
  /* X is 7087, the total number of life-threatening conditions across all the hospitals */
  /* X_i is tl, the total number of life-threatening conditions for each hospital */
  if hospno = 2 then tl = 785;
  else if hospno = 5 then tl = 3404;
  else if hospno = 9 then tl = 778;
  /* w2star = (N_i / n) * (X / X_i) */
  w2star = (admiss / 50) * (7087 / tl);
run;
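For readers who want to check the weight calculation outside SAS, the same arithmetic can be sketched in Python. The constants 50 and 7087 and the three tl values come from the data step above. The function name w2star_weight and the admissions figure of 1000 are hypothetical, used only to show the calculation. The real admiss values are in the data set.

```python
# Per-hospital totals of life-threatening conditions (X_i), from the data step.
TL_BY_HOSPNO = {2: 785, 5: 3404, 9: 778}

N_SAMPLE = 50    # n in the formula
X_TOTAL = 7087   # X, total life-threatening conditions across all hospitals


def w2star_weight(hospno, admiss):
    """Return (N_i / n) * (X / X_i) for one record.

    hospno -- hospital number, must be one of the sampled hospitals
    admiss -- N_i, the number of admissions for that hospital
    """
    tl = TL_BY_HOSPNO[hospno]  # raises KeyError for an unsampled hospital
    return (admiss / N_SAMPLE) * (X_TOTAL / tl)


# Hypothetical admissions figure, for illustration only.
example_weight = w2star_weight(hospno=2, admiss=1000)
print(example_weight)
```

Unlike the SAS step, which leaves w2star missing when hospno is not 2, 5, or 9, this sketch raises an error for an unlisted hospital.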

These pages are Copyrighted (c) by UCLA Academic Technology Services


The content of this web site should not be construed as an endorsement of any particular web site, book, or software product by the University of California.