
Testing Bootstrapping

Bootstrapping may not work very well with small sample
sizes.  To test this, we take a data file, welfsub.dta, and
treat it as our population.  We then run 1000 simulations,
each of which draws a random sample from that population and
uses bootstrapping to get a confidence interval for the mean
of the variable agrc.  Finally, we count how many of these
intervals actually contain the population mean (which is
approximately 112).

The program bssim.ado.txt performs the bootstrapping for a single
iteration and is written to work with the simul command.
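program define bssim
  version 6.0
  * simul first calls the program with ? as its first argument
  * to ask for the names of the variables that will be posted
  if "`1'" == "?" {
    global S_1 "lb_stan ub_stan lb_n ub_n lb_p ub_p lb_bc ub_bc"
    exit
  }
  * draw a random sample from the population; the sample size
  * is the second argument, supplied through args()
  use welfsub, clear
  gen x = uniform()
  sort x
  keep if _n <= `2'
  * standard confidence interval for the mean
  ci agrc
  local ub_stan = `r(ub)'
  local lb_stan = `r(lb)'
  * bootstrap confidence intervals for the mean
  bs "summarize agrc" "r(mean)", reps(1000)
  * post the bounds in the same order as the names in S_1
  post `1' `lb_stan' `ub_stan' r(lb_n) r(ub_n) r(lb_p) r(ub_p) r(lb_bc) r(ub_bc)
end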
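
To make the single iteration more concrete, here is a minimal Python
sketch (not the Stata code, and not what bs does internally) of three
common bootstrap intervals for a mean: normal, percentile, and
bias-corrected.  We assume these correspond to the _n, _p, and _bc
suffixes posted by bssim.  The function name bootstrap_cis, the simple
quantile rule, and the use of a normal critical value are illustrative
assumptions; Stata's bs may differ in detail.

```python
import random
import statistics

_STD_NORMAL = statistics.NormalDist()


def bootstrap_cis(sample, boot_reps=1000, level=0.95, rng=None):
    """Normal, percentile, and bias-corrected bootstrap CIs for the mean.

    Illustrative sketch only; the quantile rule and critical value are
    assumptions and need not match any particular package.
    """
    rng = rng or random.Random()
    n = len(sample)
    observed = statistics.fmean(sample)
    # Resample with replacement and keep the mean of each resample.
    boot = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(boot_reps)
    )
    alpha = (1.0 - level) / 2.0
    z = _STD_NORMAL.inv_cdf(1.0 - alpha)

    def quantile(p):
        # Simple order-statistic rule; real packages may interpolate.
        idx = min(boot_reps - 1, max(0, int(p * boot_reps)))
        return boot[idx]

    # Normal: observed mean +/- z * bootstrap standard error.
    se = statistics.stdev(boot)
    normal = (observed - z * se, observed + z * se)

    # Percentile: tail quantiles of the bootstrap distribution.
    percentile = (quantile(alpha), quantile(1.0 - alpha))

    # Bias-corrected: shift the percentile points by z0, which measures
    # how far the observed mean sits from the median of the bootstrap means.
    below = sum(1 for b in boot if b <= observed)
    frac = min(max(below / boot_reps, 1.0 / boot_reps), 1.0 - 1.0 / boot_reps)
    z0 = _STD_NORMAL.inv_cdf(frac)
    bc = (
        quantile(_STD_NORMAL.cdf(2.0 * z0 - z)),
        quantile(_STD_NORMAL.cdf(2.0 * z0 + z)),
    )
    return {"normal": normal, "percentile": percentile, "bc": bc}


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical sample of size 10; NOT drawn from welfsub.dta.
    sample = [rng.expovariate(1.0 / 112.0) for _ in range(10)]
    for name, (lb, ub) in bootstrap_cis(sample, rng=rng).items():
        print(f"{name:10s} [{lb:8.2f}, {ub:8.2f}]")
```

With a sample as small as 10, the three intervals can differ noticeably
from one another, which is one reason to compare them in a simulation.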
  
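Then, the program bsvaryn.do.txt performs the simulations
with samples of size 10, 20, 30, 40, 50, and 100.  Each call
passes the sample size through args(), which bssim reads as
its second argument, and saves the 1000 results in its own file.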
simul bssim, reps(1000) args(10) saving(sim10) replace dots
simul bssim, reps(1000) args(20) saving(sim20) replace dots
simul bssim, reps(1000) args(30) saving(sim30) replace dots
simul bssim, reps(1000) args(40) saving(sim40) replace dots
simul bssim, reps(1000) args(50) saving(sim50) replace dots
simul bssim, reps(1000) args(100) saving(sim100) replace dots
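
The overall design (draw a sample of a given size, bootstrap an
interval, repeat, and count the intervals that contain the population
mean) can be sketched in Python as follows.  This is a rough sketch
under stated assumptions, not a translation of bssim: it uses only a
percentile interval, the names percentile_interval and coverage_count
are hypothetical, and the population is a synthetic stand-in because
welfsub.dta is not reproduced here.  The replication counts are reduced
so that it runs quickly.

```python
import random
import statistics


def percentile_interval(sample, boot_reps, rng, level=0.95):
    """Percentile bootstrap confidence interval for the mean of `sample`."""
    n = len(sample)
    # Resample with replacement and keep the mean of each resample.
    boot_means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(boot_reps)
    )
    # Trim the same number of bootstrap means from each tail.
    tail = int(boot_reps * (1.0 - level) / 2.0)
    return boot_means[tail], boot_means[boot_reps - 1 - tail]


def coverage_count(population, n, sims, boot_reps, rng):
    """Count how many of `sims` intervals strictly contain the population mean."""
    target = statistics.fmean(population)
    hits = 0
    for _ in range(sims):
        # Draw n observations without replacement, like sorting on a
        # uniform random number and keeping the first n rows.
        sample = rng.sample(population, n)
        lb, ub = percentile_interval(sample, boot_reps, rng)
        if lb < target < ub:
            hits += 1
    return hits


if __name__ == "__main__":
    rng = random.Random(12345)
    # Hypothetical right-skewed stand-in population; NOT the welfsub.dta data.
    population = [rng.expovariate(1.0 / 112.0) for _ in range(2000)]
    sims = 100       # the page uses 1000; reduced so the sketch runs quickly
    boot_reps = 200  # the page uses 1000 bootstrap replications
    for n in (10, 20, 30, 40, 50, 100):
        hits = coverage_count(population, n, sims, boot_reps, rng)
        print(f"n={n:3d}: {hits}/{sims} intervals contain the population mean")
```

Because the population here is synthetic, the counts it prints are not
comparable to the results for welfsub.dta; only the procedure is the same.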
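The program count.do.txt counts how many of the intervals
contain the population mean, using four different types of
confidence intervals: the standard interval from ci (stan) and
the normal (n), percentile (p), and bias-corrected (bc)
intervals from bs (see bs for information on the types of
confidence intervals it forms).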
use sim10
count if lb_stan < 112 & ub_stan > 112
count if lb_n    < 112 & ub_n    > 112
count if lb_p    < 112 & ub_p    > 112
count if lb_bc   < 112 & ub_bc   > 112

use sim20
count if lb_stan < 112 & ub_stan > 112
count if lb_n    < 112 & ub_n    > 112
count if lb_p    < 112 & ub_p    > 112
count if lb_bc   < 112 & ub_bc   > 112

use sim30
count if lb_stan < 112 & ub_stan > 112
count if lb_n    < 112 & ub_n    > 112
count if lb_p    < 112 & ub_p    > 112
count if lb_bc   < 112 & ub_bc   > 112

use sim40
count if lb_stan < 112 & ub_stan > 112
count if lb_n    < 112 & ub_n    > 112
count if lb_p    < 112 & ub_p    > 112
count if lb_bc   < 112 & ub_bc   > 112

use sim50
count if lb_stan < 112 & ub_stan > 112
count if lb_n    < 112 & ub_n    > 112
count if lb_p    < 112 & ub_p    > 112
count if lb_bc   < 112 & ub_bc   > 112

use sim100
count if lb_stan < 112 & ub_stan > 112
count if lb_n    < 112 & ub_n    > 112
count if lb_p    < 112 & ub_p    > 112
count if lb_bc   < 112 & ub_bc   > 112
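Here are the results we got when we ran this.  As you can
see, the intervals from the small samples (e.g., n=10)
captured the mean only about 72% to 78% of the time, when
they should have done so about 95% of the time.  Coverage
improves as the sample size grows, but even with n=100 it
reaches only about 91% to 92%.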
use sim10

count if lb_stan < 112 & ub_stan > 112
  739

count if lb_n    < 112 & ub_n    > 112
  721

count if lb_p    < 112 & ub_p    > 112
  723

count if lb_bc   < 112 & ub_bc   > 112
  778

use sim20

count if lb_stan < 112 & ub_stan > 112
  841

count if lb_n    < 112 & ub_n    > 112
  820

count if lb_p    < 112 & ub_p    > 112
  836

count if lb_bc   < 112 & ub_bc   > 112
  882

use sim30

count if lb_stan < 112 & ub_stan > 112
  874

count if lb_n    < 112 & ub_n    > 112
  864

count if lb_p    < 112 & ub_p    > 112
  878

count if lb_bc   < 112 & ub_bc   > 112
  909

use sim40

count if lb_stan < 112 & ub_stan > 112
  884

count if lb_n    < 112 & ub_n    > 112
  878

count if lb_p    < 112 & ub_p    > 112
  887

count if lb_bc   < 112 & ub_bc   > 112
  903

use sim50

count if lb_stan < 112 & ub_stan > 112
  911

count if lb_n    < 112 & ub_n    > 112
  905

count if lb_p    < 112 & ub_p    > 112
  913

count if lb_bc   < 112 & ub_bc   > 112
  922

use sim100

count if lb_stan < 112 & ub_stan > 112
  909

count if lb_n    < 112 & ub_n    > 112
  907

count if lb_p    < 112 & ub_p    > 112
  911

count if lb_bc   < 112 & ub_bc   > 112
  921
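
The counts above become coverage rates once divided by the 1000
simulations.  The short Python sketch below does that arithmetic and
also computes the binomial (Monte Carlo) standard error of a coverage
rate estimated from 1000 runs, assuming the runs are independent and
the nominal level is 95%.  The names coverage and mc_standard_error
are ours, chosen for illustration.

```python
import math

REPS = 1000     # simulations per sample size, from reps(1000)
NOMINAL = 0.95  # nominal coverage that the page compares against

# Counts copied from the output above: (stan, n, p, bc) for each sample size.
counts = {
    10: (739, 721, 723, 778),
    20: (841, 820, 836, 882),
    30: (874, 864, 878, 909),
    40: (884, 878, 887, 903),
    50: (911, 905, 913, 922),
    100: (909, 907, 911, 921),
}


def coverage(count, reps=REPS):
    """Share of simulated intervals that contained the population mean."""
    return count / reps


def mc_standard_error(p, reps=REPS):
    """Binomial standard error of a coverage rate estimated from `reps` runs."""
    return math.sqrt(p * (1.0 - p) / reps)


print("   n   stan      n      p     bc")
for size, row in counts.items():
    print(f"{size:4d}" + "".join(f" {coverage(c):6.3f}" for c in row))

se = mc_standard_error(NOMINAL)
print(f"Monte Carlo SE at 95% nominal coverage: {se:.4f}")
```

At 95% nominal coverage the standard error is about 0.007, so even the
best rate in the output (922 of 1000) sits roughly four standard errors
below 0.95, which suggests the shortfall is not just simulation noise.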

These pages are Copyrighted (c) by UCLA Academic Technology Services

