When two variables of interest are dichotomous, it is sometimes useful to compute their tetrachoric correlation. The tetrachoric correlation assumes that each dichotomous variable reflects an underlying, normally distributed continuous variable that has been split at a threshold, and it estimates the correlation between those two underlying variables. In SAS, proc freq can be used to obtain the tetrachoric correlations for a group of dichotomous variables.
In the following examples, we will use the data set tetra.sas7bdat. The data set contains an identifier variable, id; all of the other variables are dichotomous.
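Before computing any correlations, it can help to confirm that the analysis variables really have only two levels. The following is a minimal sketch, assuming the data set is available as tetra in the WORK library and that female, ses, hon and sci (the variables used in the examples on this page) are the variables of interest.

```sas
/* One-way frequency tables: each variable listed here should   */
/* show exactly two levels if it is dichotomous.                */
/* Assumes tetra is in the WORK library.                        */
proc freq data = tetra;
  tables female ses hon sci;
run;
```

A variable that shows more than two levels in this output would give a polychoric, rather than a tetrachoric, correlation in the examples that follow.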
Example 1: Computing tetrachoric correlation between two dichotomous variables
We specify the plcorr option in the tables statement to request the polychoric correlation, which is called the tetrachoric correlation when both variables are dichotomous.
proc freq data = tetra;
  tables female*hon / plcorr;
run;

The FREQ Procedure

Table of female by hon

female     hon

Frequency|
Percent  |
Row Pct  |
Col Pct  |       0|       1|  Total
---------+--------+--------+
       0 |     73 |     18 |     91
         |  36.50 |   9.00 |  45.50
         |  80.22 |  19.78 |
         |  49.66 |  33.96 |
---------+--------+--------+
       1 |     74 |     35 |    109
         |  37.00 |  17.50 |  54.50
         |  67.89 |  32.11 |
         |  50.34 |  66.04 |
---------+--------+--------+
Total         147       53      200
            73.50    26.50   100.00

Statistics for Table of female by hon

Statistic                              Value       ASE
------------------------------------------------------
Gamma                                 0.3146    0.1503
Kendall's Tau-b                       0.1391    0.0684
Stuart's Tau-c                        0.1223    0.0607

Somers' D C|R                         0.1233    0.0612
Somers' D R|C                         0.1570    0.0770

Pearson Correlation                   0.1391    0.0684
Spearman Correlation                  0.1391    0.0684
Tetrachoric Correlation               0.2362    0.1156

Lambda Asymmetric C|R                 0.0000    0.0000
Lambda Asymmetric R|C                 0.0000    0.0000
Lambda Symmetric                      0.0000    0.0000

Uncertainty Coefficient C|R           0.0170    0.0169
Uncertainty Coefficient R|C           0.0143    0.0142
Uncertainty Coefficient Symmetric     0.0155    0.0154

Statistics for Table of female by hon

Estimates of the Relative Risk (Row1/Row2)

Type of Study                   Value       95% Confidence Limits
-----------------------------------------------------------------
Case-Control (Odds Ratio)      1.9182        0.9974        3.6890
Cohort (Col1 Risk)             1.1816        1.0023        1.3930
Cohort (Col2 Risk)             0.6160        0.3752        1.0113

Sample Size = 200
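To make the underlying-normal assumption concrete, the latent-variable model can be written out for the female by hon table above. This is only a sketch of the model, not a description of the numerical routine proc freq uses. The symbols h, k, Z_1 and Z_2 are notation introduced here for illustration.

```latex
% Latent-variable model behind the tetrachoric correlation (sketch).
% (Z_1, Z_2) is standard bivariate normal with correlation \rho;
% female = 0 when Z_1 \le h, and hon = 0 when Z_2 \le k.
% \Phi is the standard normal CDF, \Phi_2 the bivariate normal CDF.
\begin{align*}
h &= \Phi^{-1}\!\left(\tfrac{91}{200}\right) = \Phi^{-1}(0.455), \\
k &= \Phi^{-1}\!\left(\tfrac{147}{200}\right) = \Phi^{-1}(0.735), \\
\Phi_2(h, k; \rho) &= \tfrac{73}{200} = 0.365 .
\end{align*}
```

The thresholds come from the row and column totals, and the correlation is the value of \rho that reproduces the observed proportion in the (0, 0) cell. If the two latent variables were uncorrelated, that cell would be expected to hold 0.455 × 0.735 ≈ 0.334 of the sample. The observed 0.365 is larger, so \rho is positive, which agrees in sign with the reported value of 0.2362.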
Example 2: Computing tetrachoric correlation among two or more dichotomous variables
We will use the SAS Output Delivery System (ODS) to write the tetrachoric correlations to a data set. ODS can turn each table that a procedure produces into an output data set. The tetrachoric correlations are in the table called measures, since SAS puts them together with all the other measures of association. We can then subset that data set so that it contains only the tetrachoric correlations, using the where= data set option in the subsequent data step.
proc freq data = tetra;
  tables (female ses hon sci)*(female ses hon sci) / plcorr;
  ods output measures=mycorr;
run;

data mycorra;
  set mycorr (where=(statistic="Tetrachoric Correlation"));
  keep table value;
run;

proc print data = mycorra;
run;

Obs    Table                      Value
  1    Table female * female     1.0000
  2    Table ses * female       -0.2636
  3    Table hon * female        0.2362
  4    Table sci * female       -0.2562
  5    Table female * ses       -0.2636
  6    Table ses * ses           1.0000
  7    Table hon * ses           0.0735
  8    Table sci * ses           0.2668
  9    Table female * hon        0.2362
 10    Table ses * hon           0.0735
 11    Table hon * hon           1.0000
 12    Table sci * hon           0.5037
 13    Table female * sci       -0.2562
 14    Table ses * sci           0.2668
 15    Table hon * sci           0.5037
 16    Table sci * sci           1.0000
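The table name measures used in the ods output statement can be looked up rather than memorized. The following is a minimal sketch, assuming the same tetra data set: the ods trace statement asks SAS to list, in the log, every output object that a procedure creates.

```sas
/* Write the name, label and path of each ODS output object to the log. */
ods trace on;

proc freq data = tetra;
  tables female*hon / plcorr;
run;

/* Turn tracing off again so later steps do not clutter the log. */
ods trace off;
```

The Name field shown in the log for each object is what goes on the left-hand side of the equals sign in an ods output statement.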
Example 3: Obtaining a tetrachoric correlation matrix for a group of variables
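The example above shows how to obtain tetrachoric correlations for multiple variables. However, the output is not in matrix format, which can be a problem if further analysis is to be done on the correlation matrix. In this example, we use a data step and proc transpose to convert the output into a data set laid out as a correlation matrix. In the data step below, we create three variables: group, which numbers each block of four consecutive rows (four being the number of variables, so there is one block per column variable); variable, the first variable name in the table label; and idc, the second variable name. proc transpose then turns each group into one row of the matrix, using variable to name the columns with the prefix w.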
proc freq data = tetra;
  tables (female ses hon sci)*(female ses hon sci) / plcorr;
  ods output measures=mycorr;
run;

data mycorrt;
  set mycorr (where=(statistic="Tetrachoric Correlation"));
  group = floor((_n_ - 1)/4);
  variable = scan(table, 2, " *");
  idc = scan(table, 3, " *");
  keep group value table idc variable;
run;

proc print data = mycorrt;
run;

Obs    Table                      Value    group    variable    idc
  1    Table female * female     1.0000      0      female      female
  2    Table ses * female       -0.2636      0      ses         female
  3    Table hon * female        0.2362      0      hon         female
  4    Table sci * female       -0.2562      0      sci         female
  5    Table female * ses       -0.2636      1      female      ses
  6    Table ses * ses           1.0000      1      ses         ses
  7    Table hon * ses           0.0735      1      hon         ses
  8    Table sci * ses           0.2668      1      sci         ses
  9    Table female * hon        0.2362      2      female      hon
 10    Table ses * hon           0.0735      2      ses         hon
 11    Table hon * hon           1.0000      2      hon         hon
 12    Table sci * hon           0.5037      2      sci         hon
 13    Table female * sci       -0.2562      3      female      sci
 14    Table ses * sci           0.2668      3      ses         sci
 15    Table hon * sci           0.5037      3      hon         sci
 16    Table sci * sci           1.0000      3      sci         sci

proc transpose data = mycorrt out=mymatrix (drop = _name_ group) prefix=w;
  id variable;
  by group;
  var value;
run;

proc print data = mymatrix;
run;

Obs    wfemale       wses       whon       wsci
  1     1.0000    -0.2636     0.2362    -0.2562
  2    -0.2636     1.0000     0.0735     0.2668
  3     0.2362     0.0735     1.0000     0.5037
  4    -0.2562     0.2668     0.5037     1.0000
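The data set mymatrix holds the correlations in matrix layout, but SAS procedures that read a correlation matrix generally look for a TYPE=CORR data set with _TYPE_ and _NAME_ variables. The following is a minimal sketch of one way to add them. The data set name tetracorr is hypothetical, and the code assumes the rows of mymatrix are in the order female, ses, hon, sci, as in the printed output above.

```sas
/* Sketch: mark the transposed matrix as a TYPE=CORR data set.      */
/* tetracorr is a hypothetical name chosen for this illustration.   */
data tetracorr (type = corr);
  length _type_ $ 8 _name_ $ 32;
  /* Strip the w prefix so column names match the original variables. */
  set mymatrix (rename = (wfemale = female wses = ses
                          whon = hon wsci = sci));
  _type_ = "CORR";
  /* Assumes row order follows the tables statement: female ses hon sci. */
  _name_ = scan("female ses hon sci", _n_);
run;

proc print data = tetracorr;
run;
```

Some procedures may also expect rows such as N, MEAN or STD in a TYPE=CORR data set. Those rows are not created here and would need to be added according to the requirements of the procedure being used.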
These pages are Copyrighted (c) by UCLA Academic Technology Services