Historically, Stata has had a number of procedures that incorporate exact statistics, such as Fisher's exact test in tabulate, an exact binomial test in bitest, and exact two-sample Kolmogorov-Smirnov tests in ksmirnov. Stata 10 adds two new exact commands, exlogistic and expoisson, for exact logistic and exact Poisson regression models.
Running these models can be very resource intensive in terms of memory and time.
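For reference, the older exact procedures named above can be invoked as in the following sketch. It assumes Stata's shipped auto dataset (loaded with sysuse), which is not part of this page's examples, and the variables chosen are purely illustrative.

```stata
* Illustrative only: the auto dataset ships with Stata and is not used elsewhere on this page
sysuse auto, clear

* Fisher's exact test for a two-way table
tabulate foreign rep78, exact

* Exact binomial test of whether the proportion of foreign cars is 0.5
bitest foreign == 0.5

* Two-sample Kolmogorov-Smirnov test with an exact p-value
ksmirnov mpg, by(foreign) exact
```

All three run almost instantly on a dataset of this size, which is a useful contrast with the enumeration required by exlogistic and expoisson.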
Examples
Example 1: Admission to engineering school based on gender and AP calculus
input female apcalc admit n
0 0 0 12
0 1 4 8
1 0 1 5
1 1 7 7
end
generate noadmit = n - admit
clist
female apcalc admit n noadmit
1. 0 0 0 12 12
2. 0 1 4 8 4
3. 1 0 1 5 4
4. 1 1 7 7 0
tab1 female apcalc [fw=n]
-> tabulation of female
female | Freq. Percent Cum.
------------+-----------------------------------
0 | 20 62.50 62.50
1 | 12 37.50 100.00
------------+-----------------------------------
Total | 32 100.00
-> tabulation of apcalc
apcalc | Freq. Percent Cum.
------------+-----------------------------------
0 | 17 53.12 53.12
1 | 15 46.88 100.00
------------+-----------------------------------
Total | 32 100.00
/* regular logistic regression using the blogit command */
blogit admit n female apcalc, nolog
Logistic regression for grouped data Number of obs = 32
LR chi2(2) = 26.25
Prob > chi2 = 0.0000
Log likelihood = -8.0471896 Pseudo R2 = 0.6199
------------------------------------------------------------------------------
_outcome | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
female | 18.60808 1.322876 14.07 0.000 16.01529 21.20087
apcalc | 19.99437 . . . . .
_cons | -19.99437 .7071068 -28.28 0.000 -21.38027 -18.60847
------------------------------------------------------------------------------
Note: 12 failures and 7 successes completely determined.
exlogistic admit female apcalc, coef binomial(n) nolog
note: CMLE estimate for female is +inf; computing MUE
note: CMLE estimate for apcalc is +inf; computing MUE
Exact logistic regression Number of obs = 32
Binomial variable: n Model score = 18.75176
Pr >= score = 0.0000
---------------------------------------------------------------------------
admit | Coef. Suff. 2*Pr(Suff.) [95% Conf. Interval]
-------------+-------------------------------------------------------------
female | 2.336592* 8 0.0302 .2044942 +Inf
apcalc | 3.435807* 11 0.0003 1.405934 +Inf
---------------------------------------------------------------------------
(*) median unbiased estimates (MUE)
/* rerun to obtain odds ratios */
exlogistic
Exact logistic regression Number of obs = 32
Binomial variable: n Model score = 18.75176
Pr >= score = 0.0000
---------------------------------------------------------------------------
admit | Odds Ratio Suff. 2*Pr(Suff.) [95% Conf. Interval]
-------------+-------------------------------------------------------------
female | 10.34592* 8 0.0302 1.226904 +Inf
apcalc | 31.05645* 11 0.0003 4.079333 +Inf
---------------------------------------------------------------------------
(*) median unbiased estimates (MUE)
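The odds ratios in this table are the exponentiated coefficients from the earlier run with the coef option, and the same holds for the finite confidence limits. A minimal check, with the values copied by hand from the output above:

```stata
* Odds ratio = exp(coefficient); inputs are the median unbiased estimates shown above
display exp(2.336592)
display exp(3.435807)

* Lower confidence limits transform the same way (the upper limits are +Inf)
display exp(.2044942)
display exp(1.405934)
```

The displayed values should agree with the Odds Ratio and lower-limit columns up to rounding.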
/* rerun to obtain score estimates */
exlogistic, coef test(score)
Exact logistic regression Number of obs = 32
Binomial variable: n Model score = 18.75176
Pr >= score = 0.0000
---------------------------------------------------------------------------
admit | Coef. Score Pr>=Score [95% Conf. Interval]
-------------+-------------------------------------------------------------
female | 2.336592* 6.685974 0.0151 .2044942 +Inf
apcalc | 3.435807* 14.78361 0.0001 1.405934 +Inf
---------------------------------------------------------------------------
(*) median unbiased estimates (MUE)
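The Suff. column in the exlogistic output above is the observed sufficient statistic for each coefficient, which for grouped binomial data is the sum over rows of the predictor times the number of successes. The sketch below recomputes it by hand from the Example 1 data; the variable names s_female and s_apcalc are introduced here only for illustration.

```stata
* Re-enter the grouped data from Example 1
clear
input female apcalc admit n
0 0 0 12
0 1 4 8
1 0 1 5
1 1 7 7
end

* Sufficient statistic for a coefficient: sum of (predictor * number admitted)
generate s_female = female * admit
generate s_apcalc = apcalc * admit

quietly summarize s_female
display "Suff. for female = " r(sum)

quietly summarize s_apcalc
display "Suff. for apcalc = " r(sum)
```

The two sums are 8 and 11, matching the Suff. column. Exact inference is based on the distribution of each statistic conditional on the observed value of the other.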
Example 2: Honors English with 200 observations
This example demonstrates the resources necessary to run a model with two predictors and 200 observations.
use http://www.ats.ucla.edu/stat/stata/notes3/hsb2, clear
generate honors = write>=60
/* standard logistic regression */
logit honors read female
Iteration 0: log likelihood = -115.64441
Iteration 1: log likelihood = -87.936305
Iteration 2: log likelihood = -85.536982
Iteration 3: log likelihood = -85.443948
Iteration 4: log likelihood = -85.44372
Logistic regression Number of obs = 200
LR chi2(2) = 60.40
Prob > chi2 = 0.0000
Log likelihood = -85.44372 Pseudo R2 = 0.2612
------------------------------------------------------------------------------
honors | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
read | .1443657 .0233337 6.19 0.000 .0986325 .1900989
female | 1.120926 .4081028 2.75 0.006 .321059 1.920793
_cons | -9.603365 1.426404 -6.73 0.000 -12.39906 -6.807665
------------------------------------------------------------------------------
exlogistic hon read female, coef memory(150m)
Enumerating sample-space combinations:
observation 1: enumerations = 2
observation 2: enumerations = 4
observation 3: enumerations = 8
observation 4: enumerations = 16
observation 5: enumerations = 32
(output deleted)
observation 157: enumerations = 1238985
observation 158: enumerations = 1240230
(output deleted)
observation 192: enumerations = 507715
observation 193: enumerations = 457763
(output deleted)
observation 198: enumerations = 185256
observation 199: enumerations = 125236
observation 200: enumerations = 63694
Exact logistic regression Number of obs = 200
Model score = 53.1061
Pr >= score = 0.0000
---------------------------------------------------------------------------
honors | Coef. Suff. 2*Pr(Suff.) [95% Conf. Interval]
-------------+-------------------------------------------------------------
read | .1424985 3210 0.0000 .0990385 .1904602
female | 1.106306 35 0.0081 .2593431 2.010762
---------------------------------------------------------------------------
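One way to gauge the cost before committing to a model of this size is to time a small run with the memory() limit set explicitly. The sketch below re-enters the small grouped data from Example 1 so that it runs on its own; timer number 1 is an arbitrary choice, and the 150m limit simply mirrors the call above rather than being a recommended value.

```stata
* Small grouped data from Example 1, used here only so the sketch is self-contained
clear
input female apcalc admit n
0 0 0 12
0 1 4 8
1 0 1 5
1 1 7 7
end

* Time the exact fit; memory() caps the memory exlogistic may use for enumeration
timer clear 1
timer on 1
exlogistic admit female apcalc, binomial(n) memory(150m) nolog
timer off 1

* Report elapsed seconds for timer 1
timer list 1
```

Replacing the exlogistic line with the Example 2 call gives a direct measure of how much longer the 200-observation model takes on a given machine.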
Example 3: Exact poisson on cerebrovascular accidents
This example is from the Stata manual.
use http://www.stata-press.com/data/r10/cerebacc, clear
(cerebrovascular accidents in hypotensive-treated and control groups)
clist
treat count age
1. control 0 40/59
2. control 0 >=60
3. control 1 40/59
4. control 1 >=60
5. control 2 40/59
6. control 2 >=60
7. control 3 40/59
8. treatment 0 40/59
(output omitted)
35. treatment 0 40/59
36. treatment 0 40/59
37. treatment 0 40/59
38. treatment 0 40/59
39. treatment 1 40/59
40. treatment 1 40/59
41. treatment 1 40/59
tab treat age [fw=count]
hypotensiv |
e drug | age group
treatment | 40/59 >=60 | Total
-----------+----------------------+----------
control | 15 10 | 25
treatment | 4 0 | 4
-----------+----------------------+----------
Total | 19 10 | 29
expoisson count treat age
Estimating: treat
Enumerating sample-space combinations:
observation 1: enumerations = 11
(output omitted)
observation 39: enumerations = 410
observation 40: enumerations = 410
observation 41: enumerations = 30
Estimating: age
Enumerating sample-space combinations:
observation 1: enumerations = 5
observation 2: enumerations = 15
(output omitted)
observation 39: enumerations = 455
observation 40: enumerations = 455
observation 41: enumerations = 30
Exact Poisson regression
Number of obs = 41
---------------------------------------------------------------------------
count | Coef. Suff. 2*Pr(Suff.) [95% Conf. Interval]
-------------+-------------------------------------------------------------
treat | -1.594306 4 0.0026 -3.005089 -.4701708
age | -.5112067 10 0.2794 -1.416179 .3429232
---------------------------------------------------------------------------
/* standard poisson regression for comparison */
poisson count treat age, nolog
Poisson regression Number of obs = 41
LR chi2(2) = 10.64
Prob > chi2 = 0.0049
Log likelihood = -38.97981 Pseudo R2 = 0.1201
------------------------------------------------------------------------------
count | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
treat | -1.594306 .5573614 -2.86 0.004 -2.686714 -.5018975
age | -.5112067 .4043525 -1.26 0.206 -1.303723 .2813096
_cons | .233344 .2556594 0.91 0.361 -.2677391 .7344271
------------------------------------------------------------------------------
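Poisson coefficients are on the log scale, so exponentiating them gives incidence-rate ratios. The following sketch does this by hand for the point estimates shown above, which are the same in the exact and standard fits; the values are copied from the output.

```stata
* Incidence-rate ratio = exp(coefficient); inputs copied from the tables above
display exp(-1.594306)
display exp(-.5112067)
```

These come to roughly 0.203 for treat and 0.600 for age. Both expoisson and poisson also accept the irr option, which reports incidence-rate ratios directly instead of coefficients.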
These pages are Copyrighted (c) by UCLA Academic Technology Services