
Using proc multtest to perform multiple comparisons

When analyzing data, you often need to perform multiple comparisons. For example, you may want to compare one level of a variable with another level of that variable. However, if you conduct many such tests, the p-values printed in the output can be misleading. If you use the standard cutoff for statistical significance of .05 and conduct one test, then the probability of a false alarm (i.e., declaring a significant result when in fact there is not one) is 5 out of 100, or .05. However, if you run 10 tests, the probability that at least one of those 10 is a false alarm is .401. If you run 20 tests, the probability that at least one of those 20 tests is a false alarm is .642. (NOTE: The formula for calculating these probabilities is 1 - (1 - .05)^N, where N is the number of tests being conducted and the tests are assumed to be independent.) Clearly, this inflation of the Type I error rate is undesirable. What can you do?
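
The probabilities quoted above can be checked with a few lines of arithmetic. The sketch below is written in Python rather than SAS, purely as an illustration; the function name familywise_error_rate is our own, and the calculation assumes that the tests are independent.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false alarm among n_tests independent tests."""
    return 1 - (1 - alpha) ** n_tests

# One test, ten tests and twenty tests at the .05 level.
for n in (1, 10, 20):
    print(n, round(familywise_error_rate(0.05, n), 3))
```

For 10 and 20 tests this gives the .401 and .642 mentioned in the text.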

You have several choices if you want to keep your family-wise alpha levels in check.  Perhaps the simplest thing that you can do is divide your significance level (usually .05) by the number of tests that you are conducting, and use the resulting value as your new cutoff level for statistical significance.  For example, if you were conducting five tests, you would compute .05/5 = .01, and so if your observed p-value (obtained from your output) was .01 or below, you would conclude that you have a statistically significant result.  However, if your observed p-value was greater than .01, say .04, then you would conclude that you do not have a statistically significant result.  This method is called the Bonferroni method.  While it is easy to do, it is also very conservative.  This means that you are relatively more likely to miss an effect that is actually there.   
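
The Bonferroni decision rule described above amounts to a single comparison. Here is a minimal Python sketch of it (not SAS code); the function name bonferroni_significant and the first example p-value are our own choices for illustration.

```python
def bonferroni_significant(p_value, n_tests, alpha=0.05):
    """Compare a raw p-value with the Bonferroni cutoff alpha / n_tests."""
    return p_value <= alpha / n_tests

# Five tests, so the cutoff is .05 / 5 = .01.
for p in (0.004, 0.04):
    print(p, bonferroni_significant(p, n_tests=5))
```

With five tests, a raw p-value of .04 is no longer declared significant, even though it would be if it were the only test conducted.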

There are several other methods by which you can obtain adjusted p-values. In SAS, you can obtain these by using proc multtest. This procedure adjusts the p-values for multiple comparisons. Here, an adjusted p-value is defined as the smallest significance level at which the given hypothesis would be rejected when the entire family of tests is considered. There are several ways that the p-values can be adjusted, but we will consider only two of them here, Bonferroni and the bootstrap, and we will compare the adjusted p-values that they produce.

We have already explained the Bonferroni method of adjusting the p-values, so now let's look at the bootstrap method. It is a nonparametric method in which the data are resampled with replacement a large number of times (you specify how many times), p-values are computed for each resampled data set, and the adjusted p-values are calculated from those p-values. The bootstrap-adjusted p-value that proc multtest reports for a test is the proportion of the n resampled data sets in which the smallest p-value across all of the tests is less than or equal to the raw p-value from the original data. One of the desirable properties of the bootstrap method is that it incorporates all sources of correlation, from both the multiple contrasts and the multivariate structure, as well as the distributional characteristics of the data. Like the Bonferroni method, the bootstrap method can be used with either categorical or continuous variables. In the case of continuous variables, proc multtest mean-centers the variable within each group, but you can specify the nocenter option if you do not want the variable centered. The reason proc multtest mean-centers continuous variables is that pooling the uncentered groups is likely to yield a multimodal distribution, which is probably not the distribution of the data under the null hypothesis.
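
To make the resampling idea concrete, here is a simplified Python sketch of a bootstrap adjustment for two groups measured on several continuous variables. It is an illustration under stated assumptions, not the algorithm that proc multtest actually runs: the function names and the data are made up, each group is mean-centered before pooling, and each resample is summarized by its largest absolute t statistic. That is equivalent to using the smallest p-value only when every test has the same degrees of freedom, as is the case here.

```python
import random
from statistics import mean, variance

def pooled_t(x, y):
    """Absolute two-sample t statistic using a pooled variance estimate."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    if pooled_var == 0:
        return 0.0  # degenerate resample with no variation
    return abs(mean(x) - mean(y)) / (pooled_var * (1 / nx + 1 / ny)) ** 0.5

def t_stats(group_a, group_b):
    """One |t| per variable; each group is a list of rows (tuples of values)."""
    return [pooled_t(col_a, col_b)
            for col_a, col_b in zip(zip(*group_a), zip(*group_b))]

def center(rows):
    """Subtract the group's own column means from every row."""
    means = [mean(col) for col in zip(*rows)]
    return [tuple(v - m for v, m in zip(row, means)) for row in rows]

def bootstrap_adjust(group_a, group_b, n_resamples=100, seed=12345):
    """Single-step resampling adjustment based on the largest |t| per resample."""
    rng = random.Random(seed)
    observed = t_stats(group_a, group_b)
    pool = center(group_a) + center(group_b)
    exceed = [0] * len(observed)
    for _ in range(n_resamples):
        # Draw whole rows so the correlation among the variables is preserved.
        res_a = [rng.choice(pool) for _ in group_a]
        res_b = [rng.choice(pool) for _ in group_b]
        max_t = max(t_stats(res_a, res_b))
        for j, t_obs in enumerate(observed):
            if max_t >= t_obs:
                exceed[j] += 1
    return [count / n_resamples for count in exceed]

# Made-up scores on three variables for two groups (not the hsb2 data).
data_rng = random.Random(1)
group_a = [tuple(data_rng.gauss(52, 10) for _ in range(3)) for _ in range(30)]
group_b = [tuple(data_rng.gauss(56, 10) for _ in range(3)) for _ in range(20)]
print(bootstrap_adjust(group_a, group_b))
```

Because whole rows are resampled, the variables keep their correlation in every resample. That is what allows the adjustment to be less conservative than Bonferroni when the tests are correlated.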

Before using proc multtest, you would use another procedure, such as proc glm, to analyze your data. If you got statistically significant results for a main effect or for an interaction, you could then use proc multtest to conduct the multiple comparisons. Let's begin with the relatively simple case of using only a main effect. We will use the hsb2 data set for our examples. We will assume that we have done an analysis that shows that schtyp is statistically significant. The boot option on the proc multtest statement tells SAS to use the bootstrap method for adjusting the p-values. The n = option tells SAS how many bootstrap samples to draw. The s = option sets the seed; this is needed only if you want to reproduce your results. The seed is used when generating the pseudo-random samples that make up the bootstrap. The bon option gives the Bonferroni adjustment in addition to the bootstrap adjustment. The notables option suppresses the display of the discrete and/or the continuous variable tabulations. Finally, the pvals option requests the table of both the raw and the adjusted p-values. We list schtyp on the class statement and then use the contrast statement to specify the contrast in which we are interested. In this case, we are comparing the two types of schools, public and private. We use the test statement to test the means of each of the variables listed in the parentheses. The output is shown below.

data hsb2;
set 'g:\sas\hsb2';
run;

* first example using only a main effect;
proc multtest data = hsb2 boot n = 100 s = 12345 bon notables pvals;
class schtyp;
contrast 'using only a main effect' 1 -1;
test mean(read write math);
run;

The Multtest Procedure

                       Model Information

Test for continuous variables:              Mean t-test
Tails for continuous tests:                 Two-tailed
Strata weights:                             None
P-value adjustment:                         Bonferroni
P-value adjustment:                         Bootstrap
Center continuous variables?                Yes
Number of resamples:                        100
Seed:                                       12345

                Contrast Coefficients

                                    schtyp

Contrast                            1               2

using only a main eff               1              -1

                                 p-Values

Variable    Contrast                        Raw    Bonferroni     Bootstrap

read        using only a main eff        0.2249        0.6746        0.3700
write       using only a main eff        0.0726        0.2178        0.1200
math        using only a main eff        0.1661        0.4982        0.3100

While all nine of the p-values reported above are greater than .05, and hence your decision regarding statistical significance would not change no matter which set of p-values you used, you can see how conservative the Bonferroni correction is compared to the bootstrap adjusted p-values.
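
The Bonferroni column in the table above can be reproduced by hand: each raw p-value is multiplied by the number of tests (three here), and the result is capped at 1. The Python sketch below illustrates this; the function name bonferroni_adjust is our own, and because it starts from the raw p-values as printed (rounded to four decimals), the last digit can differ slightly from the SAS output.

```python
def bonferroni_adjust(raw_p_values):
    """Multiply each raw p-value by the number of tests, capping at 1."""
    m = len(raw_p_values)
    return [min(1.0, m * p) for p in raw_p_values]

# Raw p-values for read, write and math, copied from the output above.
raw = [0.2249, 0.0726, 0.1661]
for name, p, adj in zip(("read", "write", "math"), raw, bonferroni_adjust(raw)):
    print(f"{name:<6}{p:>8.4f}{adj:>8.4f}")
```

The cap at 1 matters whenever a raw p-value exceeds one divided by the number of tests; the product would otherwise be larger than 1, which is not a valid probability.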

To use proc multtest to compare groups in a statistically significant interaction, you may have to create a new variable in your data set. For example, if one factor had two levels and a second factor had three levels, you would need to create a new six-level variable that would code for the six cells in the interaction of the two factors. This can be done easily in a data step. In our example data set, we have two categorical variables: schtyp, which is a two-level variable indicating if the school is public or private, and prog, which is a three-level categorical variable indicating the program in which the student is enrolled (academic, vocational or general).

data hsb2a;
set hsb2;
if schtyp = 1 and prog = 1 then sp = 1;
if schtyp = 1 and prog = 2 then sp = 2;
if schtyp = 1 and prog = 3 then sp = 3;
if schtyp = 2 and prog = 1 then sp = 4;
if schtyp = 2 and prog = 2 then sp = 5;
if schtyp = 2 and prog = 3 then sp = 6;
run;
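
The six if statements above follow a regular pattern: the cells are numbered with prog varying fastest within schtyp. The Python sketch below (an illustration, not SAS code; the function name interaction_code is our own) shows the same coding as a single arithmetic expression. In the data step, the equivalent assignment would presumably be sp = (schtyp - 1)*3 + prog, although the explicit if statements are easier to check by eye.

```python
def interaction_code(schtyp, prog, n_prog_levels=3):
    """Map (schtyp, prog) to a single cell number, with prog varying fastest."""
    return (schtyp - 1) * n_prog_levels + prog

# List every combination of the two factors with its cell number.
for schtyp in (1, 2):
    for prog in (1, 2, 3):
        print(schtyp, prog, interaction_code(schtyp, prog))
```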

We will use proc freq with the list option on the tables statement to ensure that the new variable, called sp, was created as desired.

proc freq data = hsb2a;
tables schtyp*prog*sp / list;
run;

The FREQ Procedure

                                                 Cumulative    Cumulative
schtyp    prog    sp    Frequency     Percent     Frequency      Percent
-------------------------------------------------------------------------
     1       1     1          39       19.50            39        19.50
     1       2     2          81       40.50           120        60.00
     1       3     3          48       24.00           168        84.00
     2       1     4           6        3.00           174        87.00
     2       2     5          24       12.00           198        99.00
     2       3     6           2        1.00           200       100.00

Now we can run the proc multtest using our new sp variable.  In this example, we will compare the two types of schools at the third level of prog.

proc multtest data = hsb2a boot n = 100 s = 12345 bon notables pvals;
class sp;
contrast 'using an interaction' 0 0 1 0 0 -1;
test mean(read write math);
run;

The Multtest Procedure

                       Model Information

Test for continuous variables:              Mean t-test
Tails for continuous tests:                 Two-tailed
Strata weights:                             None
P-value adjustment:                         Bonferroni
P-value adjustment:                         Bootstrap
Center continuous variables?                Yes
Number of resamples:                        100
Seed:                                       12345

                                     Contrast Coefficients

                                                          sp

Contrast                          1              2              3              4              5

using an interaction              0              0              1              0              0

       Contrast Coefficients

                            sp

Contrast                          6

using an interaction             -1

                                 p-Values

Variable    Contrast                       Raw    Bonferroni     Bootstrap

read        using an interaction        0.7243        1.0000        0.9700
write       using an interaction        0.1984        0.5953        0.3300
math        using an interaction        0.1358        0.4075        0.2600

Again, the contrast is not statistically significant no matter which set of p-values you use.  When looking at the contrast coefficients, note that the output wraps so that the contrast coefficient for the sixth value of sp is listed on a new line in the output.

Now let's see what happens when we increase the number of samples taken during the bootstrap process.  Bootstrapping typically requires a large number of samples, but the number of samples taken must be considered in light of available computer resources and how long the process will take.  For our example, we will only increase the number of samples from 100 to 500.

proc multtest data = hsb2a boot n = 500 s = 12345 bon notables pvals;
class sp;
contrast 'using an interaction' 0 0 1 0 0 -1;
test mean(read write math);
run;

The Multtest Procedure

                       Model Information

Test for continuous variables:              Mean t-test
Tails for continuous tests:                 Two-tailed
Strata weights:                             None
P-value adjustment:                         Bonferroni
P-value adjustment:                         Bootstrap
Center continuous variables?                Yes
Number of resamples:                        500
Seed:                                       12345

                                     Contrast Coefficients

                                                          sp

Contrast                          1              2              3              4              5

using an interaction              0              0              1              0              0

       Contrast Coefficients

                            sp

Contrast                          6

using an interaction             -1

                                 p-Values

Variable    Contrast                       Raw    Bonferroni     Bootstrap

read        using an interaction        0.7243        1.0000        0.9800
write       using an interaction        0.1984        0.5953        0.4320
math        using an interaction        0.1358        0.4075        0.3220

If you compare this output with the previous output, you will notice that the bootstrap adjusted p-values changed slightly.  Of course, neither the raw nor the Bonferroni adjusted p-values changed. 
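
One way to judge how much a bootstrap-adjusted p-value can move from run to run is to treat it as a proportion estimated from n resamples. The Python sketch below uses the usual binomial approximation for the standard error of a proportion; this is our own illustration rather than something proc multtest reports, and it assumes that the resamples behave like independent draws.

```python
from math import sqrt

def monte_carlo_se(p_hat, n_resamples):
    """Approximate standard error of a resampling-based proportion."""
    return sqrt(p_hat * (1 - p_hat) / n_resamples)

# Bootstrap-adjusted p-values for write from the two runs above.
for p_hat, n in ((0.3300, 100), (0.4320, 500)):
    print(n, round(monte_carlo_se(p_hat, n), 3))
```

With only 100 resamples the standard error is close to .05, so a difference of the size seen between the two runs is not surprising. Increasing the number of resamples shrinks this error in proportion to the square root of n.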

Now let's use a binary outcome variable.  First, we need to create one.  We will take the variable write and make a new dichotomous variable called honcomp by assigning a one to all values greater than or equal to 60, and zero otherwise.  We will use proc means to examine our new variable.

data hsb2b;
set hsb2;
honcomp = (write >= 60);
run;

proc sort data = hsb2b;
by honcomp;
run;

proc means data = hsb2b;
var write;
by honcomp;
run;

honcomp=0

The MEANS Procedure

                     Analysis Variable : write

  N            Mean         Std Dev         Minimum         Maximum
-------------------------------------------------------------------
147      48.9387755       8.0573497      31.0000000      59.0000000
-------------------------------------------------------------------

honcomp=1

                     Analysis Variable : write

  N            Mean         Std Dev         Minimum         Maximum
-------------------------------------------------------------------
 53      63.4150943       2.1342803      60.0000000      67.0000000
-------------------------------------------------------------------

The only change to proc multtest is on the test statement.  Here we use ca to request the Cochran-Armitage test.  This test assumes that you have a binary variable in which zero indicates a failure and one indicates a success.  In this example, we will compare the first and second levels of the variable prog.
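
For readers who want to see what the test computes, here is a minimal Python sketch of a Cochran-Armitage z statistic that uses the contrast coefficients as scores, with a two-sided normal p-value and no continuity correction. It follows the standard textbook form of the statistic and is not guaranteed to match every detail of proc multtest; the function name and the counts are made up for illustration and are not taken from hsb2.

```python
from math import erfc, sqrt

def cochran_armitage(successes, totals, scores):
    """Z statistic and two-sided normal p-value for a linear contrast in proportions."""
    n = sum(totals)
    p_bar = sum(successes) / n  # pooled success proportion over all groups
    t_bar = sum(ni * ti for ni, ti in zip(totals, scores)) / n
    numerator = sum(ti * (xi - ni * p_bar)
                    for xi, ni, ti in zip(successes, totals, scores))
    var = p_bar * (1 - p_bar) * sum(ni * (ti - t_bar) ** 2
                                    for ni, ti in zip(totals, scores))
    z = numerator / sqrt(var)
    p_value = erfc(abs(z) / sqrt(2))  # two-sided tail area of the standard normal
    return z, p_value

# Hypothetical counts of successes and group sizes for three groups.
z, p = cochran_armitage(successes=(10, 30, 12), totals=(40, 60, 50), scores=(1, -1, 0))
print(round(z, 3), round(p, 4))
```

Note that in this form a group with a score of 0 still contributes to the pooled proportion, so the contrast 1 -1 0 is not identical to a test computed from the first two groups alone.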

proc multtest data = hsb2b boot n = 100 s = 12345 bon notables pvals;
class prog;
contrast 'using a binary variable' 1 -1 0;
test ca(honcomp);
run;

The Multtest Procedure

                       Model Information

Test for discrete variables:                Cochran-Armitage
Z-score approximation used:                 Everywhere
Continuity correction:                      0
Tails for discrete tests:                   Two-tailed
Strata weights:                             None
P-value adjustment:                         Bonferroni
P-value adjustment:                         Bootstrap
Number of resamples:                        100
Seed:                                       12345

                        Contrast Coefficients

                                             prog

Contrast                            1               2               3

using a binary variab               1              -1               0

                                 p-Values

Variable    Contrast                        Raw    Bonferroni     Bootstrap

honcomp     using a binary variab        0.0008        0.0008        <.0001

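The strata statement in proc multtest allows you to estimate the variances within levels of the stratification variable for the multiple comparisons. However, proc multtest does not report separate tests for each level of the stratification variable. (This is similar to blocking in GLM.) The by statement, in contrast, allows you to test each level of a by variable separately. In the example below, we first sort the data by schtyp, as the by statement requires, and then compare the first level of prog with the average of the second and third levels within each type of school.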

proc sort data = hsb2b;
by schtyp;
run;

proc multtest data = hsb2b boot n = 100 s = 12345 bon notables pvals;
class prog;
by schtyp;
contrast 'using a binary variable' 1 -.5 -.5;
test ca(honcomp);
run;

schtyp=1

The Multtest Procedure

                       Model Information

Test for discrete variables:                Cochran-Armitage
Z-score approximation used:                 Everywhere
Continuity correction:                      0
Tails for discrete tests:                   Two-tailed
Strata weights:                             None
P-value adjustment:                         Bonferroni
P-value adjustment:                         Bootstrap
Number of resamples:                        100
Seed:                                       12345

                        Contrast Coefficients

                                             prog

Contrast                            1               2               3

using a binary variab             1.0            -0.5            -0.5

                                 p-Values

Variable    Contrast                        Raw    Bonferroni     Bootstrap

honcomp     using a binary variab        0.1562        0.1562        0.1800

schtyp=2

The Multtest Procedure

                       Model Information

Test for discrete variables:                Cochran-Armitage
Z-score approximation used:                 Everywhere
Continuity correction:                      0
Tails for discrete tests:                   Two-tailed
Strata weights:                             None
P-value adjustment:                         Bonferroni
P-value adjustment:                         Bootstrap
Number of resamples:                        100
Seed:                                       192025240

                        Contrast Coefficients

                                             prog

Contrast                            1               2               3

using a binary variab             1.0            -0.5            -0.5

                                 p-Values

Variable    Contrast                        Raw    Bonferroni     Bootstrap

honcomp     using a binary variab        0.1225        0.1225        0.1000

Notice that the seed reported for the second level of schtyp is different from the seed reported for the first.


These pages are Copyrighted (c) by UCLA Academic Technology Services

