UCLA Academic Technology Services

SAS FAQ
How do I compute tetrachoric correlations in SAS?

When two variables of interest are dichotomous, it is sometimes useful to compute their tetrachoric correlation. The tetrachoric correlation is calculated under the assumption that each dichotomous variable reflects an underlying normally distributed variable. In SAS, proc freq can be used to obtain the tetrachoric correlations for a group of dichotomous variables.

In the following examples, we will use the data set tetra.sas7bdat. The data set contains an identifier variable, id; all the other variables are dichotomous.

Example 1: Computing tetrachoric correlation between two dichotomous variables

We specify the plcorr option in the tables statement to request the polychoric correlation, which is called the tetrachoric correlation when both variables are dichotomous.

/* plcorr adds the tetrachoric (polychoric) correlation to the measures of association */
proc freq data = tetra;
  tables female*hon / plcorr;
run;

The FREQ Procedure

Table of female by hon

female     hon

Frequency|
Percent  |
Row Pct  |
Col Pct  |       0|       1|  Total
---------+--------+--------+
       0 |     73 |     18 |     91
         |  36.50 |   9.00 |  45.50
         |  80.22 |  19.78 |
         |  49.66 |  33.96 |
---------+--------+--------+
       1 |     74 |     35 |    109
         |  37.00 |  17.50 |  54.50
         |  67.89 |  32.11 |
         |  50.34 |  66.04 |
---------+--------+--------+
Total         147       53      200
            73.50    26.50   100.00


Statistics for Table of female by hon

Statistic                              Value       ASE
------------------------------------------------------
Gamma                                 0.3146    0.1503
Kendall's Tau-b                       0.1391    0.0684
Stuart's Tau-c                        0.1223    0.0607

Somers' D C|R                         0.1233    0.0612
Somers' D R|C                         0.1570    0.0770

Pearson Correlation                   0.1391    0.0684
Spearman Correlation                  0.1391    0.0684
Tetrachoric Correlation               0.2362    0.1156

Lambda Asymmetric C|R                 0.0000    0.0000
Lambda Asymmetric R|C                 0.0000    0.0000
Lambda Symmetric                      0.0000    0.0000

Uncertainty Coefficient C|R           0.0170    0.0169
Uncertainty Coefficient R|C           0.0143    0.0142
Uncertainty Coefficient Symmetric     0.0155    0.0154


Statistics for Table of female by hon

           Estimates of the Relative Risk (Row1/Row2)

Type of Study                   Value       95% Confidence Limits
-----------------------------------------------------------------
Case-Control (Odds Ratio)      1.9182        0.9974        3.6890
Cohort (Col1 Risk)             1.1816        1.0023        1.3930
Cohort (Col2 Risk)             0.6160        0.3752        1.0113

Sample Size = 200
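
To make the normality assumption concrete, the tetrachoric estimate for the table above can be read as the solution of the equations below. This is only a sketch of the usual definition, assuming the estimate is obtained by maximum likelihood on the 2 x 2 table; the symbols \tau_1, \tau_2 and \rho are notation introduced here and do not appear in the SAS output.

```latex
\begin{aligned}
\Phi(\tau_1) &= P(\text{female}=0) = 91/200 = 0.455, \\
\Phi(\tau_2) &= P(\text{hon}=0) = 147/200 = 0.735, \\
\Phi_2(\tau_1,\tau_2;\rho) &= P(\text{female}=0,\ \text{hon}=0) = 73/200 = 0.365 .
\end{aligned}
```

Here \Phi is the standard normal distribution function and \Phi_2 is the standard bivariate normal distribution function with correlation \rho. The first two equations fix the thresholds from the row and column margins, and the value of \rho that satisfies the third is the tetrachoric correlation, reported above as 0.2362 (up to the numerical tolerance of the estimation routine). Under independence the joint proportion would be 0.455 x 0.735, or about 0.334; the observed 0.365 is larger, which is consistent with a positive estimate.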

Example 2: Computing tetrachoric correlations among two or more dichotomous variables

We will use the SAS Output Delivery System (ODS) to write the tetrachoric correlations to a data set. ODS can turn the output of a procedure into one or more output data sets. The tetrachoric correlations are in the data set called measures, since SAS puts them together with all the other measures of association. We can then subset that data set so it contains only the tetrachoric correlations, using the where= data set option in the subsequent data step.

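/* Cross every variable with every variable and capture the Measures table through ODS */
proc freq data = tetra;
  tables (female ses hon sci)*(female ses hon sci) / plcorr;
  ods output measures=mycorr;
run;

/* Keep only the rows that hold tetrachoric correlations */
data mycorra;
  set mycorr (where=(statistic="Tetrachoric Correlation"));
  keep table value;
run;

proc print data = mycorra;
run;

Obs            Table               Value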

  1    Table female * female      1.0000
  2    Table ses * female        -0.2636
  3    Table hon * female         0.2362
  4    Table sci * female        -0.2562
  5    Table female * ses        -0.2636
  6    Table ses * ses            1.0000
  7    Table hon * ses            0.0735
  8    Table sci * ses            0.2668
  9    Table female * hon         0.2362
 10    Table ses * hon            0.0735
 11    Table hon * hon            1.0000
 12    Table sci * hon            0.5037
 13    Table female * sci        -0.2562
 14    Table ses * sci            0.2668
 15    Table hon * sci            0.5037
 16    Table sci * sci            1.0000

Example 3: Obtaining a tetrachoric correlation matrix for a group of variables

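The example above shows how to obtain tetrachoric correlations for multiple variables. However, the output has one row per pair of variables rather than a matrix layout, which can be a problem if further analysis is to be done on the correlation matrix. In this example, we use a data step and proc transpose to reshape the output into a correlation matrix. The data step below creates three variables: group, which numbers each block of four consecutive rows (one block for each variable listed second in the table label); variable, the first variable named in the table label; and idc, the second variable named in the table label.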

proc freq data = tetra;
  tables (female ses hon sci)*(female ses hon sci) /plcorr;
  ods output measures=mycorr;
run;

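data mycorrt;
  set mycorr (where=(statistic="Tetrachoric Correlation"));
  /* rows come in blocks of 4, the number of variables; group numbers the blocks 0 to 3 */
  group = floor((_n_ - 1)/4);
  /* table reads like "Table ses * female": word 2 is the first variable, word 3 the second */
  variable = scan(table, 2, " *");
  idc = scan(table, 3, " *");
  keep group value table idc variable;
run;

proc print data = mycorrt;
run;

Obs            Table               Value    group    variable    idc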

  1    Table female * female      1.0000      0       female     female
  2    Table ses * female        -0.2636      0       ses        female
  3    Table hon * female         0.2362      0       hon        female
  4    Table sci * female        -0.2562      0       sci        female
  5    Table female * ses        -0.2636      1       female     ses
  6    Table ses * ses            1.0000      1       ses        ses
  7    Table hon * ses            0.0735      1       hon        ses
  8    Table sci * ses            0.2668      1       sci        ses
  9    Table female * hon         0.2362      2       female     hon
 10    Table ses * hon            0.0735      2       ses        hon
 11    Table hon * hon            1.0000      2       hon        hon
 12    Table sci * hon            0.5037      2       sci        hon
 13    Table female * sci        -0.2562      3       female     sci
 14    Table ses * sci            0.2668      3       ses        sci
 15    Table hon * sci            0.5037      3       hon        sci
 16    Table sci * sci            1.0000      3       sci        sci

/* One output row per group; each column is named w followed by the value of variable */
proc transpose data = mycorrt out=mymatrix (drop = _name_ group) prefix=w;
  id variable;
  by group;
  var value;
run;

proc print data = mymatrix;
run;

Obs     wfemale        wses        whon        wsci
 1       1.0000     -0.2636      0.2362     -0.2562
 2      -0.2636      1.0000      0.0735      0.2668
 3       0.2362      0.0735      1.0000      0.5037
 4      -0.2562      0.2668      0.5037      1.0000
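
As a possible alternative to the reshaping steps above, more recent SAS releases also provide a polychoric option in proc corr that writes the correlations in matrix form directly. The following is a minimal sketch, assuming a SAS release in which proc corr supports the polychoric and outplc= options; the output data set name mymatrix2 is hypothetical, and the results should be checked against the proc freq values above rather than taken for granted.

```sas
/* Assumption: proc corr in this SAS release supports polychoric and outplc= */
proc corr data = tetra polychoric outplc = mymatrix2;
  var female ses hon sci;
run;

/* mymatrix2 is a hypothetical name for the resulting correlation data set */
proc print data = mymatrix2;
run;
```

If these options are not available in the installed release, the proc freq and proc transpose steps shown above remain the way to build the matrix.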


These pages are Copyrighted (c) by UCLA Academic Technology Services

