UCLA Academic Technology Services HomeServicesClassesContactJobs
Search

SAS Textbook Examples
An Introduction to Categorical Analysis by Alan Agresti
Chapter 6 - Log-linear Models for Contingency Tables

Afterlife data, table 6.1, p. 147.
data after;
  input female belief f1 b1 f2 b2 count;
cards;
  1  1   0  0  1   1 435
  1  0   0  1  1  -1 147
  0  1   1  0 -1   1 375
  0  0   1  1 -1  -1 134
;
run;
Table 6.1, p. 147.
Note: The parameter estimates from the first model corresponds to the parameters in Set 1, the second model to set 2 and the third model to set 3.
proc genmod data=after;
  model count=belief female/ d=poisson ;
  output out=temp p=predict;
run;
proc genmod data=after desc;
  model count=f1 b1/ d=poisson ;
    output out=temp p=predict;
run;
proc genmod data=after desc;
  model count=b2 f2/ d=poisson ;
run;
The GENMOD Procedure

       Model Information
Data Set              WORK.AFTER
Distribution             Poisson
Link Function                Log
Dependent Variable         count
Observations Used              4
           Criteria For Assessing Goodness Of Fit

Criterion                 DF           Value        Value/DF
Deviance                   1          0.1620          0.1620
Scaled Deviance            1          0.1620          0.1620
Pearson Chi-Square         1          0.1621          0.1621
Scaled Pearson X2          1          0.1621          0.1621
Log Likelihood                     5164.1959

Algorithm converged.
                            Analysis Of Parameter Estimates

                               Standard     Wald 95% Confidence       Chi-
Parameter    DF    Estimate       Error           Limits            Square    Pr > ChiSq
Intercept     1      4.8760      0.0679      4.7429      5.0090    5160.87        <.0001
belief        1      1.0587      0.0692      0.9230      1.1944     233.83        <.0001
female        1      0.1340      0.0607      0.0151      0.2530       4.88        0.0272
Scale         0      1.0000      0.0000      1.0000      1.0000
NOTE: The scale parameter was held fixed.

The GENMOD Procedure

       Model Information
Data Set              WORK.AFTER
Distribution             Poisson
Link Function                Log
Dependent Variable         count
Observations Used              4
           Criteria For Assessing Goodness Of Fit

Criterion                 DF           Value        Value/DF
Deviance                   1          0.1620          0.1620
Scaled Deviance            1          0.1620          0.1620
Pearson Chi-Square         1          0.1621          0.1621
Scaled Pearson X2          1          0.1621          0.1621
Log Likelihood                     5164.1959

Algorithm converged.
                            Analysis Of Parameter Estimates

                               Standard     Wald 95% Confidence       Chi-
Parameter    DF    Estimate       Error           Limits            Square    Pr > ChiSq
Intercept     1      6.0687      0.0451      5.9802      6.1571    18087.0        <.0001
f1            1     -0.1340      0.0607     -0.2530     -0.0151       4.88        0.0272
b1            1     -1.0587      0.0692     -1.1944     -0.9230     233.83        <.0001
Scale         0      1.0000      0.0000      1.0000      1.0000
NOTE: The scale parameter was held fixed.

The GENMOD Procedure

       Model Information
Data Set              WORK.AFTER
Distribution             Poisson
Link Function                Log
Dependent Variable         count
Observations Used              4
           Criteria For Assessing Goodness Of Fit

Criterion                 DF           Value        Value/DF
Deviance                   1          0.1620          0.1620
Scaled Deviance            1          0.1620          0.1620
Pearson Chi-Square         1          0.1621          0.1621
Scaled Pearson X2          1          0.1621          0.1621
Log Likelihood                     5164.1959

Algorithm converged.
                            Analysis Of Parameter Estimates

                               Standard     Wald 95% Confidence       Chi-
Parameter    DF    Estimate       Error           Limits            Square    Pr > ChiSq
Intercept     1      5.4723      0.0347      5.4043      5.5403    24904.4        <.0001
b2            1      0.5293      0.0346      0.4615      0.5972     233.83        <.0001
f2            1      0.0670      0.0303      0.0075      0.1265       4.88        0.0272
Scale         0      1.0000      0.0000      1.0000      1.0000
NOTE: The scale parameter was held fixed.
The observed frequencies and fitted values, table 6.1, p. 147.
proc format;
  value female 1='Female' 0='Male';
  value belief 1='Yes' 0='No';
run;
proc print data=temp;
  format female female. belief belief.;
  var female belief count predict;
run;
Obs    female    belief    count    predict

 1     Female     Yes       435     432.099
 2     Female     No        147     149.901
 3     Male       Yes       375     377.901
 4     Male       No        134     131.099
You can obtain both the fitted values and the G^2 and X^2 statistics from proc freq. In the model statement add the option expected to obtain the fitted values and chisq to obtain the G^2 and X^2 statistics since these values are simply the test statistics for testing the hypothesis of independence in two-way tables.
proc freq data = after ;
  format female female. belief belief.;
  tables female*belief/ norow nocol nopercent expected chisq;
  weight count;
  output out=temp;
run;
The FREQ Procedure

Table of female by belief
female     belief

Frequency|
Expected |No      |Yes     |  Total
---------+--------+--------+
Male     |    134 |    375 |    509
         |  131.1 |  377.9 |
---------+--------+--------+
Female   |    147 |    435 |    582
         |  149.9 |  432.1 |
---------+--------+--------+
Total         281      810     1091

Statistics for Table of female by belief
Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     1      0.1621    0.6872
Likelihood Ratio Chi-Square    1      0.1620    0.6873
Continuity Adj. Chi-Square     1      0.1110    0.7390
Mantel-Haenszel Chi-Square     1      0.1619    0.6874
Phi Coefficient                       0.0122
Contingency Coefficient               0.0122
Cramer's V                            0.0122
       Fisher's Exact Test
----------------------------------
Cell (1,1) Frequency (F)       134
Left-sided Pr <= F          0.6817
Right-sided Pr >= F         0.3693

Table Probability (P)       0.0510
Two-sided Pr <= P           0.7287

Sample Size = 1091
Table 6.2, p. 149 was omitted. Inputting the drug data set, table 6.3, p. 152.
data drug;
  input alcohol smoke marijuana count;
cards;
1   1   1   911
1   1   0   538
1   0   1    44
1   0   0   456
0   1   1     3
0   1   0    43
0   0   1     2
0   0   0   279
;
run;
Table 6.4, p. 152.
ods listing close;
proc catmod data=drug;
   weight count;
   model alcohol*smoke*marijuana=_response_ / pred=freq noprofile noparm noiter noresponse;
   loglin alcohol smoke marijuana;
   ods output PredictedFreqs=temp1;
run;
quit;
data temp1 (keep=p1 id) ;
  set temp1;
  rename predfunction=p1;
  id = _n_;
run;
proc catmod data=drug;
   weight count;
   model alcohol*smoke*marijuana=_response_ / pred=freq noprofile noparm noiter noresponse;
   loglin alcohol|smoke marijuana;
   ods output PredictedFreqs=temp2;
run;
quit;
data temp2 (keep=p2 id) ;
  set temp2;
  rename predfunction=p2;
  id = _n_;
run;
proc catmod data=drug;
   weight count;
   model alcohol*smoke*marijuana=_response_ / pred=freq noprofile noparm noiter noresponse;
   loglin alcohol|marijuana smoke|marijuana;
   ods output PredictedFreqs=temp3;
run;
quit;
data temp3 (keep=p3 id) ;
  set temp3;
  rename predfunction=p3;
  id = _n_;
run;
proc catmod data=drug;
   weight count;
   model alcohol*smoke*marijuana=_response_ / pred=freq noprofile noparm noiter noresponse;
   loglin alcohol|smoke alcohol|marijuana smoke|marijuana;
   ods output PredictedFreqs=temp4;
run;
quit;
data temp4 (keep=p4 id) ;
  set temp4;
  rename predfunction=p4;
  id = _n_;
run;
proc catmod data=drug;
   weight count;
   model alcohol*smoke*marijuana=_response_ / pred=freq noprofile noparm noiter noresponse;
   loglin alcohol|smoke|marijuana;
   ods output PredictedFreqs=temp5;
run;
quit;
data temp5 ;
  set temp5;
  rename predfunction=p5;
  id = _n_;
run;
ods output close;
ods listing;
data combo;
  merge temp1 temp2 temp3 temp4 temp5;
  by id;
run;
 
data combo;
  set combo;
  rename p1 = A_C_M;
  rename p2 = AC_M;
  rename p3 = AM_CM;
  rename p4 = AC_AM_CM;
  rename p5 = ACM;
  alcohol1=alcohol+0;
  marijuana1= marijuana+0;
  smoke1=smoke+0;
  run;
proc format;
   value yesno 1='Yes' 0='No';
run;
proc print data=combo;
  format alcohol1 yesno. smoke1 yesno. marijuana1 yesno. ;
  var alcohol1 smoke1 marijuana1  A_C_M AC_M AM_CM AC_AM_CM  ACM;
run;
Obs   alcohol1   smoke1   marijuana1      A_C_M       AC_M      AM_CM   AC_AM_CM        ACM

 1      No        No         No         64.8799   162.4763   179.8404   279.6168        279
 2      No        No         Yes       47.32881   118.5237   0.239583    1.38317          2
 3      No        Yes        No        124.1939   26.59754   142.1596   42.38317         43
 4      No        Yes        Yes       90.59739   19.40246   4.760418    3.61683          3
 5      Yes       No         No        386.7001   289.1037   555.1596   455.3832        456
 6      Yes       No         Yes       282.0912   210.8963   45.76042   44.61683         44
 7      Yes       Yes        No        740.2261   837.8225   438.8404   538.6168        538
 8      Yes       Yes        Yes       539.9826   611.1775   909.2396   910.3832        911
Table 6.5, p. 153 omitted.

Table 6.6, p. 155.

ods listing close;
proc catmod data=drug;
  weight count;
  model alcohol*smoke*marijuana= _response_ ;
  loglin  alcohol smoke marijuana ;
  ods output anova=temp1;
run;
data temp1;
  set temp1;
  model = 'A_C_M';
run;
proc catmod data=drug;
  weight count;
  model alcohol*smoke*marijuana= _response_ ;
  loglin  alcohol smoke|marijuana ;
  ods output anova=temp2;
run;
data temp2;
  set temp2;
  model = 'A_CM';
run;
proc catmod data=drug;
  weight count;
  model alcohol*smoke*marijuana= _response_ ;
  loglin  alcohol|marijuana smoke  ;
  ods output anova=temp3;
run;
data temp3;
  set temp3;
  model = 'AM_C';
run;
proc catmod data=drug;
  weight count;
  model alcohol*smoke*marijuana= _response_ ;
  loglin  marijuana smoke|alcohol  ;
  ods output anova=temp4;
run;
data temp4;
  set temp4;
  model = 'AC_M';
run;
proc catmod data=drug;
  weight count;
  model alcohol*smoke*marijuana= _response_ ;
  loglin  alcohol|marijuana smoke|alcohol  ;
  ods output anova=temp5;
run;
data temp5;
  set temp5;
  model = 'AC_AM';
run;
proc catmod data=drug;
  weight count;
  model alcohol*smoke*marijuana= _response_ ;
  loglin  alcohol|smoke marijuana|smoke  ;
  ods output anova=temp6;
run;
data temp6;
  set temp6;
  model = 'AC_CM';
run;
proc catmod data=drug;
  weight count;
  model alcohol*smoke*marijuana= _response_ ;
  loglin  alcohol|marijuana marijuana|smoke  ;
  ods output anova=temp7;
run;
data temp7;
  set temp7;
  model = 'AM_CM';
run;
proc catmod data=drug;
  weight count;
  model alcohol*smoke*marijuana= _response_ ;
  loglin  alcohol|smoke|marijuana @2;
  ods output anova=temp8;
run;
data temp8;
  set temp8;
  model = 'AC_CM_CM';
run;
proc catmod data=drug;
  weight count;
  model alcohol*smoke*marijuana= _response_ ;
  loglin  alcohol|smoke|marijuana;
  ods output anova=temp9;
run;
data temp9;
  set temp9;
  model = 'ACM';
run;
quit;
ods output close;
ods listing;
data combo;
  set temp1 temp2 temp3 temp4 temp5 temp6 temp7 temp8 temp9;
  rename chisq=G2;
  rename probchisq=P_value;
run;
proc print data=combo noobs;
  where source='Likelihood Ratio';
  var model G2 df P_value;
run;
Model             G2     DF    P_value

A_C_M        1286.02      4    <.0001
A_CM          534.21      3    <.0001
AM_C          939.56      3    <.0001
AC_M          843.83      3    <.0001
AC_AM         497.37      2    <.0001
AC_CM          92.02      2    <.0001
AM_CM         187.75      2    <.0001
AC_CM_CM        0.37      1    0.5408
ACM              .        0     .
Table 6.7, p. 156
proc genmod data=drug ;
  class alcohol smoke marijuana;
  model count= alcohol smoke marijuana alcohol*marijuana smoke*marijuana / dist=p;
  output out=temp reslik=reslik p=pred;
run;
data temp;
  set temp;
  id = _n_;
  rename reslik=adjres1;
  rename pred=fitted1;
run;
proc genmod data=drug;
  class alcohol smoke marijuana;
  model count= alcohol smoke marijuana alcohol*marijuana smoke*marijuana alcohol*smoke/ dist=p;
  output out=temp1 reslik=reslik p=pred;
run;
data temp1;
  set temp1;
  id = _n_;
  rename reslik=adjres2;
  rename pred=fitted2;
run;
data combo;
  merge temp temp1;
  by id;
run;
proc print data=combo;
  format alcohol yesno. smoke yesno. marijuana yesno.;
  var alcohol smoke marijuana count fitted1 adjres1 fitted2 adjres2;
run;
Obs    alcohol    smoke    marijuana    count    fitted1     adjres1    fitted2     adjres2
 1       Yes       Yes        Yes        911     909.240      3.6955    910.383     0.63332
 2       Yes       Yes        No         538     438.840     12.7451    538.617    -0.63333
 3       Yes       No         Yes         44      45.760     -3.6956     44.617    -0.63336
 4       Yes       No         No         456     555.160    -12.8498    455.383     0.63332
 5       No        Yes        Yes          3       4.761     -3.7093      3.617    -0.63848
 6       No        Yes        No          43     142.160    -13.7941     42.383     0.63329
 7       No        No         Yes          2       0.240      2.3852      1.383     0.60617
 8       No        No         No         279     179.840     12.4902    279.617    -0.63333
Inputting the Injury Data, table 6.8, p. 159.
data injury;
  input G L S I count;
cards;
0 0 0 0  7287
0 0 0 1   996
0 0 1 0 11587
0 0 1 1   759
0 1 0 0  3246
0 1 0 1   973
0 1 1 0  6134
0 1 1 1   757
1 0 0 0 10381
1 0 0 1   812
1 0 1 0 10969
1 0 1 1   380
1 1 0 0  6123
1 1 0 1  1084
1 1 1 0  6693
1 1 1 1   513
;
run;
Table 6.8, p. 159.
ods listing close;
proc catmod data=injury;
  weight count;
  model G*I*L*S= _response_/ pred=freq;
  loglin g|i g|l g|s i|l i|s l|s;
  ods output predictedfreqs=temp1 anova=temp;
run;
quit;
data temp1 (keep=p1 functionnum);
  set temp1;
  rename predfunction=p1;
run;
proc catmod data=injury;
  weight count;
  model G*I*L*S= _response_/ pred=freq;
  loglin g|l|s g|i i|l i|s;
  ods output predictedfreqs=temp2 anova=temp3;
run;
quit;
ods output close;
ods listing;
data temp2;
  set temp2;
  rename predfunction=p2;
run;
data combo;
  merge temp1 temp2;
  by functionnum;
  Male=G+0;
  Location=L+0;
  Seat=S+0;
  Injury=I+0;
  rename obsfunction = observed;
run;
proc format;
  value male 0='Female' 1='Male';
  value location 0='Urban' 1='Rural';
  value Yesno 0='No' 1='Yes';
run;
proc sort data=combo;
  by male location seat injury;
run;
proc print data= combo;
  format male male. location location. seat yesno. injury yesno.;
  var male location seat injury observed p1 p2;
run;
Obs     Male     Location    Seat    Injury    observed          p1          p2

  1    Female     Urban      No       No           7287    7166.369    7273.214
  2    Female     Urban      No       Yes           996    993.0169    1009.786
  3    Female     Urban      Yes      No          11587    11748.31    11632.62
  4    Female     Urban      Yes      Yes           759    721.3055    713.3779
  5    Female     Rural      No       No           3246    3353.829    3254.662
  6    Female     Rural      No       Yes           973    988.7848    964.3382
  7    Female     Rural      Yes      No           6134    5985.493    6093.502
  8    Female     Rural      Yes      Yes           757    781.8927    797.4979
  9    Male       Urban      No       No          10381     10471.5    10358.93
 10    Male       Urban      No       Yes           812    845.1187    834.0683
 11    Male       Urban      Yes      No          10969    10837.83    10959.23
 12    Male       Urban      Yes      Yes           380    387.5588    389.7677
 13    Male       Rural      No       No           6123    6045.306    6150.192
 14    Male       Rural      No       Yes          1084     1038.08    1056.808
 15    Male       Rural      Yes      No           6693    6811.371    6697.644
 16    Male       Rural      Yes      Yes           513    518.2429    508.3564
The dissimilarity index, D, for both models, p. 162.

Note: The index for model (GI, GL, GS, IL, IS, LS) is d1 and the index for model (GIL, GIS, GLS, ILS) is d2.

proc sql;
  select  sum( abs(observed-p1) ) / (2*sum(observed)) as d1,
           sum( abs(observed-p2) ) /( 2*sum(observed) ) as d2
  from combo;
quit;
      d1        d2
------------------
0.008219  0.002507
Table 6.9, p. 160.
ods listing close;
proc catmod data=injury;
  weight count;
  model G*I*L*S= _response_ ;
  loglin g i l s ;
  ods output anova=temp1;
run;
data temp1;
  set temp1;
  model = 'G_I_L_S';
run;
proc catmod data=injury;
  weight count;
  model G*I*L*S= _response_ ;
  loglin g|i g|l g|s i|l i|s l|s ;
  ods output anova=temp2;
run;
data temp2;
  set temp2;
  model = 'GI_GL_GS_IL_IS_LS';
run;
proc catmod data=injury;
  weight count;
  model G*I*L*S= _response_ ;
  loglin g|i|l g|i|s g|l|s i|l|s ;
  ods output anova=temp3;
run;
data temp3;
  set temp3;
  model = 'GIL_GIS_GLS_ILS';
run;
proc catmod data=injury;
  weight count;
  model G*I*L*S= _response_ ;
  loglin g|i|l g|s i|s l|s ;
  ods output anova=temp4;
run;
data temp4;
  set temp4;
  model = 'GIL_GS_LS_IS';
run;
proc catmod data=injury;
  weight count;
  model G*I*L*S= _response_ ;
  loglin g|i|s  g|l i|l l|s ;
  ods output anova=temp5;
run;
data temp5;
  set temp5;
  model = 'GIS_GL_LI_LS';
run;
proc catmod data=injury;
  weight count;
  model G*I*L*S= _response_ ;
  loglin g|l|s  g|i i|l i|s ;
  ods output anova=temp6;
run;
data temp6;
  set temp6;
  model = 'GLS_GI_IL_IS';
run;
proc catmod data=injury;
  weight count;
  model G*I*L*S= _response_ ;
  loglin i|l|s  g|i g|l g|s ;
  ods output anova=temp7;
run;
data temp7;
  set temp7;
  model = 'ILS_GI_GL_GS';
run;
ods output close;
ods listing;
data combo;
  set temp1 temp2 temp3 temp4 temp5 temp6 temp7;
  rename chisq=G2;
  rename probchisq=P_value;
run;
proc print data=combo;
  where source='Likelihood Ratio';
  var model G2 P_value;
run;
Obs    Model                      G2    P_value

  5    G_I_L_S               2792.77    <.0001
 16    GI_GL_GS_IL_IS_LS       23.35    0.0003
 31    GIL_GIS_GLS_ILS          1.33    0.2496
 43    GIL_GS_LS_IS            18.57    0.0010
 55    GIS_GL_LI_LS            22.85    0.0001
 67    GLS_GI_IL_IS             7.46    0.1133
 79    ILS_GI_GL_GS            20.63    0.0004

How to cite this page

Report an error on this page

UCLA Researchers are invited to our Statistical Consulting Services
We recommend others to our list of Other Resources for Statistical Computing Help
These pages are Copyrighted (c) by UCLA Academic Technology Services


The content of this web site should not be construed as an endorsement of any particular web site, book, or software product by the University of California