UCLA Academic Technology Services

Textbook Examples
Sampling: Design and Analysis by Sharon L. Lohr
Chapter 4: Stratified Sampling

The examples below use Stata 7 or 8.  If you are using Stata version 9, please see this page.

Page 96 at the bottom

use http://www.ats.ucla.edu/stat/stata/examples/lohr/agstrat.dta, clear
sort region
by region: count

---------------------------------------------------------------------------------------------------
-> region = NC
  103
---------------------------------------------------------------------------------------------------
-> region = NE
   21
---------------------------------------------------------------------------------------------------
-> region = S
  135
---------------------------------------------------------------------------------------------------
-> region = W
   41
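
The same stratum sample sizes can also be obtained in a single table, without sorting first. This is a minimal alternative sketch, not the approach used on this page:

```stata
* One-way frequency table of the stratum variable
tabulate region
```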
Page 97, figure 4.1
graph box acres92, over(region) ylabel( , nogrid) ytitle(Millions of Acres)
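
The y-axis title says "Millions of Acres", but acres92 appears to be recorded in acres. A sketch that rescales the variable first so the tick labels agree with the title; acres92_mil is a hypothetical variable name introduced here for illustration:

```stata
* Rescale to millions of acres (acres92_mil is an illustrative name)
gen acres92_mil = acres92/1000000
graph box acres92_mil, over(region) ylabel( , nogrid) ytitle(Millions of Acres)
```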
Page 97 table at the bottom
NOTE:  The format option is used here so that the numbers are not displayed in scientific notation.
tabstat acres92, s(n mean var) by(region) format(%14.0g)

Summary for variables: acres92
     by categories of: region 
      region |             N           mean       variance
-------------+--------------------------------------------
          NC |           103   300504.15534  29618183543.3
          NE |            21  97629.8095238  7647472708.16
           S |           135  211315.044444  53587487856.2
           W |            41  662295.512195   396185950266
-------------+--------------------------------------------
       Total |           300      295612.67   112039472103
----------------------------------------------------------
Page 98
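NOTE:  We need a numeric version of region for use with the by() option of the svytotal command.  The gen command assigns 1 to every observation, so NC keeps the value 1 and the replace commands recode the other three regions.  The numbers listed in the column labeled "Estimate" are the same as those in the text in the column labeled "Estimated Total of Farm Acres".  Because svyset declares only the weights (the output reports Strata: <one>), the standard errors are not computed under the stratified design and need not match those in the text.  The second svytotal command is used to get the overall total.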
svyset [pweight=weight]
gen regionnum = 1
replace regionnum = 2 if region == "NE"
replace regionnum = 3 if region == "S"
replace regionnum = 4 if region == "W"
svytotal acres92, by(regionnum)

Survey total estimation
pweight:  weight                                  Number of obs    =       300
Strata:   <one>                                   Number of strata =         1
PSU:      <observations>                          Number of PSUs   =       300
                                                  Population size  = 3077.9999
------------------------------------------------------------------------------
Total  Subpop. |   Estimate    Std. Err.   [95% Conf. Interval]        Deff
---------------+--------------------------------------------------------------
acres92        |
  regionnum==1 |  3.167e+08    3.10e+07    2.56e+08    3.78e+08    .9964524
  regionnum==2 |   21478558     6110730     9453072    3.35e+07    1.021961
  regionnum==3 |  2.920e+08    3.32e+07    2.27e+08    3.57e+08    .9971851
  regionnum==4 |  2.795e+08    5.77e+07    1.66e+08    3.93e+08    1.003436
------------------------------------------------------------------------------
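
The recode above could also be sketched with encode, which builds the numeric variable and attaches the region names as value labels. Because encode numbers the distinct strings in alphabetical order, NC, NE, S and W should receive the codes 1 through 4, matching regionnum. The name regioncode is hypothetical:

```stata
* Numeric, value-labeled copy of the string variable region
encode region, generate(regioncode)
* Cross-tabulate against the hand-coded version to confirm the codes agree
tabulate regioncode regionnum
```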

svytotal acres92

Survey total estimation
pweight:  weight                                  Number of obs    =       300
Strata:   <one>                                   Number of strata =         1
PSU:      <observations>                          Number of PSUs   =       300
                                                  Population size  = 3077.9999
------------------------------------------------------------------------------
   Total |   Estimate    Std. Err.   [95% Conf. Interval]        Deff
---------+--------------------------------------------------------------------
 acres92 |  9.097e+08    5.96e+07    7.92e+08    1.03e+09    1.001605
------------------------------------------------------------------------------
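
The output above reports Strata: <one>, so the design is treated as an unstratified sample. The following sketch declares the strata and a finite population correction in the Stata 8 syntax used on this page. It assumes the stratum population sizes 1054, 220, 1382 and 422 that appear in the Page 104 example; popsize is a hypothetical variable name:

```stata
* Stratum population sizes N_h (values taken from the Page 104 weights)
gen popsize = 1054 if regionnum == 1
replace popsize = 220 if regionnum == 2
replace popsize = 1382 if regionnum == 3
replace popsize = 422 if regionnum == 4
* Declare weights, strata, and finite population correction, then re-estimate
svyset [pweight=weight], strata(regionnum) fpc(popsize)
svytotal acres92
```

The point estimate depends only on the weights, so it should be unchanged; only the standard error and confidence interval depend on the declared design.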
Page 102 Table 4.2
clear
input str18 discipline membership num_mailed valid_ret pct_female
"Literature" 9100 915 636 38
"Classics" 1950 633 451 27
"Philosophy" 5500 658 481 18
"History" 10850 855 611 19
"Linguistics" 2100 667 493 36
"Political Science" 5500 833 575 13
"Sociology" 9000 824 588 26
end

list

     +---------------------------------------------------------------+
     |        discipline   member~p   num_ma~d   valid_~t   pct_fe~e |
     |---------------------------------------------------------------|
  1. |        Literature       9100        915        636         38 |
  2. |          Classics       1950        633        451         27 |
  3. |        Philosophy       5500        658        481         18 |
  4. |           History      10850        855        611         19 |
  5. |       Linguistics       2100        667        493         36 |
     |---------------------------------------------------------------|
  6. | Political Science       5500        833        575         13 |
  7. |         Sociology       9000        824        588         26 |
     +---------------------------------------------------------------+
     
tabstat membership num_mailed valid_ret, s(sum)

   stats |  member~p  num_ma~d  valid_~t
---------+------------------------------
     sum |     44000      5385      3835
----------------------------------------
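
Table 4.2 can be combined into a single stratified estimate of the percentage of female members by weighting each discipline's percentage by its membership. This is a sketch that treats pct_female as the stratum estimate and membership as the stratum size; weighted and wsum are hypothetical names:

```stata
* Membership-weighted sum of the stratum percentages
gen double weighted = membership*pct_female
quietly summarize weighted
scalar wsum = r(sum)
* Divide by total membership (44000 in the tabstat output above)
quietly summarize membership
display scalar(wsum)/r(sum)
```

With the figures listed above this works out to roughly 24.65 percent (1084700/44000).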
Page 104 in the middle
NOTE:  The slight difference between this result and that shown in the text is probably due to rounding error: gen stores newwt and total as float (single precision) unless another storage type is requested.
use http://www.ats.ucla.edu/stat/stata/examples/lohr/agstrat.dta, clear
gen newwt = 220/21 if region == "NE"
replace newwt = 1054/103 if region == "NC"
replace newwt = 1382/135 if region == "S"
replace newwt = 422/41 if region == "W"
gen total = acres92*newwt
tabstat total, s(sum) format(%15.0g)

    variable |            sum
-------------+---------------
       total |  909736007.481
-----------------------------
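
One possible source of the small discrepancy is storage precision, since gen creates float variables unless told otherwise. A sketch that repeats the calculation in double precision; newwt_d and total_d are hypothetical names:

```stata
* Same weights N_h/n_h, stored in double precision
gen double newwt_d = 220/21 if region == "NE"
replace newwt_d = 1054/103 if region == "NC"
replace newwt_d = 1382/135 if region == "S"
replace newwt_d = 422/41 if region == "W"
* Weighted values and their sum, also in double precision
gen double total_d = acres92*newwt_d
tabstat total_d, s(sum) format(%15.0g)
```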

These pages are Copyrighted (c) by UCLA Academic Technology Services


The content of this web site should not be construed as an endorsement of any particular web site, book, or software product by the University of California.