Table 3.2 on page 28 using pleural thickening data.
Model I: unconstrained
lat 1
man 3
dim 2 2 2 2
lab X A B C
mod X
A|X
B|X
C|X
dat [1513 23
59 12
21 19
11 34]
*** STATISTICS ***
Number of iterations = 122
Converge criterion = 0.0000006663
Seed random values = 2610
X-squared = 0.0000 (0.0000)
L-squared = 0.0000 (0.0000)
Cressie-Read = 0.0000 (0.0000)
Dissimilarity index = 0.0000
Degrees of freedom = 0
Log-likelihood = -891.14280
Number of parameters = 7 (+1)
Sample size = 1692.0
BIC(L-squared) = 0.0000
AIC(L-squared) = 0.0000
BIC(log-likelihood) = 1834.3213
AIC(log-likelihood) = 1796.2856
*** LATENT CLASS OUTPUT ***
X 1 X 2
0.9455 0.0545
A 1 0.9900 0.2510
A 2 0.0100 0.7490
B 1 0.9646 0.3566
B 2 0.0354 0.6434
C 1 0.9891 0.2350
C 2 0.0109 0.7650
Model II: Homogeneous
lat 1
man 3
dim 2 2 2 2
lab X A B C
mod X
A|X
B|X eq1 A|X
C|X eq1 B|X
dat [1513 23
59 12
21 19
11 34]
*** STATISTICS ***
Number of iterations = 95
Converge criterion = 0.0000006431
Seed random values = 257
X-squared = 29.3556 (0.0000)
L-squared = 27.4114 (0.0000)
Cressie-Read = 28.5695 (0.0000)
Dissimilarity index = 0.0175
Degrees of freedom = 4
Log-likelihood = -904.84848
Number of parameters = 3 (+1)
Sample size = 1692.0
BIC(L-squared) = -2.3233
AIC(L-squared) = 19.4114
BIC(log-likelihood) = 1831.9980
AIC(log-likelihood) = 1815.6970
*** LATENT CLASS OUTPUT ***
X 1 X 2
0.0546 0.9454
A 1 0.2835 0.9812
A 2 0.7165 0.0188
B 1 0.2835 0.9812
B 2 0.7165 0.0188
C 1 0.2835 0.9812
C 2 0.7165 0.0188
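The fit indices in the STATISTICS block can be checked by hand. The following is a minimal Python sketch, not part of LEM. The formulas are an assumption inferred from the printed numbers (they reproduce the Model II values to four decimals), and the function name fit_indices is hypothetical.

```python
import math


def fit_indices(l_squared, log_lik, n_cells, n_params, sample_size):
    """Recompute df and the information criteria from basic quantities.

    Assumed formulas (inferred from the LEM output, not taken from its manual):
      df       = n_cells - 1 - n_params
      BIC(L2)  = L2 - df * ln(N)        AIC(L2)  = L2 - 2 * df
      BIC(LL)  = -2*LL + n_params*ln(N) AIC(LL)  = -2*LL + 2*n_params
    """
    df = n_cells - 1 - n_params
    log_n = math.log(sample_size)
    return {
        "df": df,
        "bic_l2": l_squared - df * log_n,
        "aic_l2": l_squared - 2 * df,
        "bic_ll": -2 * log_lik + n_params * log_n,
        "aic_ll": -2 * log_lik + 2 * n_params,
    }


# Model II (homogeneous): 2 x 2 x 2 = 8 cells, 3 parameters, N = 1692.
model_ii = fit_indices(27.4114, -904.84848, n_cells=8, n_params=3,
                       sample_size=1692)
for name, value in model_ii.items():
    print(f"{name:7s} {value:12.4f}")
```

The same call with the numbers of any other model on this page gives a quick consistency check on the degrees of freedom and on the BIC and AIC columns.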
Model III: Reader B, heterogeneous
lat 1
man 3
dim 2 2 2 2
lab X A B C
mod X
A|X
B|X
C|X eq1 A|X
dat [1513 23
59 12
21 19
11 34]
*** STATISTICS ***
Number of iterations = 168
Converge criterion = 0.0000006857
Seed random values = 1327
X-squared = 0.1344 (0.9350)
L-squared = 0.1344 (0.9350)
Cressie-Read = 0.1344 (0.9350)
Dissimilarity index = 0.0009
Degrees of freedom = 2
Log-likelihood = -891.21001
Number of parameters = 5 (+1)
Sample size = 1692.0
BIC(L-squared) = -14.7329
AIC(L-squared) = -3.8656
BIC(log-likelihood) = 1819.5884
AIC(log-likelihood) = 1792.4200
*** LATENT CLASS OUTPUT ***
X 1 X 2
0.9455 0.0545
A 1 0.9896 0.2431
A 2 0.0104 0.7569
B 1 0.9646 0.3566
B 2 0.0354 0.6434
C 1 0.9896 0.2431
C 2 0.0104 0.7569
Model IV: Reader B, false negative
lat 1
man 3
dim 2 2 2 2
lab X A B C
mod X
A|X eq2
B|X eq2
C|X eq2
des [1 0 2 0
1 0 3 0
1 0 2 0]
dat [1513 23
59 12
21 19
11 34]
*** STATISTICS ***
Number of iterations = 142
Converge criterion = 0.0000009692
Seed random values = 4135
X-squared = 29.1815 (0.0000)
L-squared = 27.2505 (0.0000)
Cressie-Read = 28.3916 (0.0000)
Dissimilarity index = 0.0171
Degrees of freedom = 3
Log-likelihood = -904.76807
Number of parameters = 4 (+1)
Sample size = 1692.0
BIC(L-squared) = 4.9495
AIC(L-squared) = 21.2505
BIC(log-likelihood) = 1839.2708
AIC(log-likelihood) = 1817.5361
*** LATENT CLASS OUTPUT ***
X 1 X 2
0.9467 0.0533
A 1 0.9808 0.2626
A 2 0.0192 0.7374
B 1 0.9808 0.2952
B 2 0.0192 0.7048
C 1 0.9808 0.2626
C 2 0.0192 0.7374
Model V: Reader B, false positive
lat 1
man 3
dim 2 2 2 2
lab X A B C
mod X
A|X eq2
B|X eq2
C|X eq2
des [1 0 2 0
3 0 2 0
1 0 2 0]
dat [1513 23
59 12
21 19
11 34]
*** STATISTICS ***
Number of iterations = 116
Converge criterion = 0.0000008031
Seed random values = 584
X-squared = 3.1099 (0.3750)
L-squared = 2.9870 (0.3936)
Cressie-Read = 3.0641 (0.3818)
Dissimilarity index = 0.0039
Degrees of freedom = 3
Log-likelihood = -892.63628
Number of parameters = 4 (+1)
Sample size = 1692.0
BIC(L-squared) = -19.3140
AIC(L-squared) = -3.0130
BIC(log-likelihood) = 1815.0072
AIC(log-likelihood) = 1793.2726
*** LATENT CLASS OUTPUT ***
X 1 X 2
0.9447 0.0553
A 1 0.9892 0.2868
A 2 0.0108 0.7132
B 1 0.9659 0.2868
B 2 0.0341 0.7132
C 1 0.9892 0.2868
C 2 0.0108 0.7132
Cheating Data Example on page 30.
We have skipped the columns SE(J) and SE(B) of the table, which contain the standard errors estimated by the jackknife and bootstrap methods. Table 3.3 on page 32 corresponds to the part of the output labeled "(CONDITIONAL) PROBABILITIES". Table 3.4 on page 33 corresponds to the part of the output labeled "FREQUENCIES". The last two columns of Table 3.4 can be produced with any statistical software, such as Stata.
lat 1
man 4
dim 2 2 2 2 2
lab X A B C D
mod X
A|X
B|X
C|X
D|X
dat [207 46
7 5
13 4
1 2
10 3
1 2
11 4
1 2]
*** STATISTICS ***
Number of iterations = 367
Converge criterion = 0.0000009825
Seed random values = 1326
X-squared = 8.3209 (0.2155)
L-squared = 7.7643 (0.2559)
Cressie-Read = 8.0766 (0.2325)
Dissimilarity index = 0.0315
Degrees of freedom = 6
Log-likelihood = -440.02713
Number of parameters = 9 (+1)
Sample size = 319.0
BIC(L-squared) = -26.8269
AIC(L-squared) = -4.2357
BIC(log-likelihood) = 931.9410
AIC(log-likelihood) = 898.0543
*** FREQUENCIES ***
A B C D observed estimated std. res.
1 1 1 1 207.000 205.718 0.089
1 1 1 2 46.000 47.412 -0.205
1 1 2 1 7.000 8.955 -0.653
1 1 2 2 5.000 2.450 1.629
1 2 1 1 13.000 12.299 0.200
1 2 1 2 4.000 5.118 -0.494
1 2 2 1 1.000 1.956 -0.684
1 2 2 2 2.000 1.091 0.871
2 1 1 1 10.000 9.334 0.218
2 1 1 2 3.000 4.343 -0.644
2 1 2 1 1.000 1.770 -0.579
2 1 2 2 2.000 1.017 0.974
2 2 1 1 11.000 8.620 0.811
2 2 1 2 4.000 5.156 -0.509
2 2 2 1 1.000 2.347 -0.879
2 2 2 2 2.000 1.413 0.494
*** (CONDITIONAL) PROBABILITIES ***
* P(X) *
1 0.8390 (0.0781)
2 0.1610 (0.0781)
* P(A|X) *
1 | 1 0.9835 (0.0289)
2 | 1 0.0165 (0.0289)
1 | 2 0.4239 (0.1806)
2 | 2 0.5761 (0.1806)
* P(B|X) *
1 | 1 0.9709 (0.0306)
2 | 1 0.0291 (0.0306)
1 | 2 0.4118 (0.1753)
2 | 2 0.5882 (0.1753)
* P(C|X) *
1 | 1 0.9629 (0.0152)
2 | 1 0.0371 (0.0152)
1 | 2 0.7843 (0.0846)
2 | 2 0.2157 (0.0846)
* P(D|X) *
1 | 1 0.8181 (0.0263)
2 | 1 0.1819 (0.0263)
1 | 2 0.6240 (0.1004)
2 | 2 0.3760 (0.1004)
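The FREQUENCIES table above shows how LEM reads the numbers in the dat command: the counts are matched to response patterns with the last variable (D) changing fastest. The following Python sketch illustrates that ordering for the cheating data; the variable names are hypothetical.

```python
from itertools import product

# Counts exactly as listed in the dat command (read row by row).
freqs = [207, 46, 7, 5, 13, 4, 1, 2, 10, 3, 1, 2, 11, 4, 1, 2]

# itertools.product varies the last position fastest, which matches the
# order of the response patterns (A B C D) in the FREQUENCIES table.
patterns = list(product((1, 2), repeat=4))
table = dict(zip(patterns, freqs))

for pattern, count in table.items():
    print(*pattern, count)
print("sample size:", sum(freqs))
```

The total of the counts equals the sample size of 319 reported in the STATISTICS block.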
The last two columns of Table 3.4 on page 33 using Stata:
clear
input A B C D observed estimated std_res
1 1 1 1 207.000 205.718 0.089
1 1 1 2 46.000 47.412 -0.205
1 1 2 1 7.000 8.955 -0.653
1 1 2 2 5.000 2.450 1.629
1 2 1 1 13.000 12.299 0.200
1 2 1 2 4.000 5.118 -0.494
1 2 2 1 1.000 1.956 -0.684
1 2 2 2 2.000 1.091 0.871
2 1 1 1 10.000 9.334 0.218
2 1 1 2 3.000 4.343 -0.644
2 1 2 1 1.000 1.770 -0.579
2 1 2 2 2.000 1.017 0.974
2 2 1 1 11.000 8.620 0.811
2 2 1 2 4.000 5.156 -0.509
2 2 2 1 1.000 2.347 -0.879
2 2 2 2 2.000 1.413 0.494
end
gen x2 = (observed-estimated)^2/estimated
sum x2
gen per_x2 = x2/(16*r(mean))
sort D C B A
list A B C D observed estimated x2 per_x2, clean
     A  B  C  D  observed  estima~d        x2     per_x2
 1.  1  1  1  1       207   205.718  .0079892   .0009601
 2.  2  1  1  1        10     9.334  .0475205    .005711
 3.  1  2  1  1        13    12.299  .0399546   .0048017
 4.  2  2  1  1        11      8.62   .657123   .0789728
 5.  1  1  2  1         7     8.955  .4268034   .0512931
 6.  2  1  2  1         1      1.77  .3349717   .0402568
 7.  1  2  2  1         1     1.956  .4672474   .0561536
 8.  2  2  2  1         1     2.347  .7730758    .092908
 9.  1  1  1  2        46    47.412  .0420514   .0050537
10.  2  1  1  2         3     4.343  .4153002   .0499106
11.  1  2  1  2         4     5.118  .2442212   .0293504
12.  2  2  1  2         4     5.156  .2591808   .0311483
13.  1  1  2  2         5      2.45  2.654082   .3189666
14.  2  1  2  2         2     1.017  .9501368   .1141871
15.  1  2  2  2         2     1.091  .7573612   .0910194
16.  2  2  2  2         2     1.413  .2438563   .0293066
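The two computed columns are the contribution of each cell to the Pearson chi-square, (observed - estimated)^2 / estimated, and that contribution as a share of the total. For readers without Stata, here is a minimal Python sketch of the same arithmetic. It takes the observed and estimated frequencies from the LEM output, and the variable names are hypothetical.

```python
# Observed and LEM-estimated frequencies in the order of the FREQUENCIES
# table (pattern A B C D, with D changing fastest).
observed = [207, 46, 7, 5, 13, 4, 1, 2, 10, 3, 1, 2, 11, 4, 1, 2]
estimated = [205.718, 47.412, 8.955, 2.450, 12.299, 5.118, 1.956, 1.091,
             9.334, 4.343, 1.770, 1.017, 8.620, 5.156, 2.347, 1.413]

# Per-cell Pearson contribution and its share of the total.
contrib = [(o - e) ** 2 / e for o, e in zip(observed, estimated)]
total = sum(contrib)
share = [c / total for c in contrib]

for o, e, c, s in zip(observed, estimated, contrib, share):
    print(f"{o:4d} {e:9.3f} {c:10.7f} {s:9.7f}")
print(f"total X-squared = {total:.4f}")
```

The total agrees with the X-squared value in the STATISTICS block up to the rounding of the estimated frequencies. The rows print in LEM order, whereas the Stata listing is sorted by D C B A.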
Tables 3.5, 3.6 and 3.7 are omitted for now since LEM has no options for calculating these quantities.
Table 3.8 on page 39 using the academic cheating data with a single latent variable of two classes.
We first create a data file containing the latent classification probabilities and the modal class using the output option wla of LEM.
lat 1
man 4
dim 2 2 2 2 2
lab X A B C D
mod X
A|X
B|X
C|X
D|X
dat [207 46
7 5
13 4
1 2
10 3
1 2
11 4
1 2]
wla cheat_lca.dat
Next, we copy and paste the contents of the data file cheat_lca.dat into the Stata do-file editor and input them as a data set. Based on the output from the LEM run, we create a set of variables holding the conditional probabilities. From the conditional probabilities and the latent class probabilities, we then create columns 3 and 4 of Table 3.8.
clear
input a b c d observed p1 p2 class error
1 1 1 1 207.0000 0.9787 0.0213 1 0.0213
1 1 1 2 46.0000 0.9442 0.0558 1 0.0558
1 1 2 1 7.0000 0.8652 0.1348 1 0.1348
1 1 2 2 5.0000 0.7032 0.2968 1 0.2968
1 2 1 1 13.0000 0.4904 0.5096 2 0.4904
1 2 1 2 4.0000 0.2621 0.7379 2 0.2621
1 2 2 1 1.0000 0.1187 0.8813 2 0.1187
1 2 2 2 2.0000 0.0473 0.9527 2 0.0473
2 1 1 1 10.0000 0.3612 0.6388 2 0.3612
2 1 1 2 3.0000 0.1726 0.8274 2 0.1726
2 1 2 1 1.0000 0.0733 0.9267 2 0.0733
2 1 2 2 2.0000 0.0284 0.9716 2 0.0284
2 2 1 1 11.0000 0.0117 0.9883 2 0.0117
2 2 1 2 4.0000 0.0044 0.9956 2 0.0044
2 2 2 1 1.0000 0.0017 0.9983 2 0.0017
2 2 2 2 2.0000 0.0006 0.9994 2 0.0006
end
gen a11 = .9835
gen a12 = .4239
gen b11 = .9709
gen b12 = .4118
gen c11 = .9629
gen c12 = .7843
gen d11 = .8181
gen d12 = .6240
gen px1 = .8390
gen py1 = 1
gen py2 = 1
foreach var of varlist a b c d {
replace py1 = py1*`var'11 if `var'==1
replace py2 = py2*`var'12 if `var'==1
replace py1 = py1*(1-`var'11) if `var'==2
replace py2 = py2*(1-`var'12) if `var'==2
}
replace py1 = py1*px1
replace py2 = py2*(1-px1)
gen odds = p2/p1
sort d c b a
list a b c d observed py1 py2 class p2 odds, clean
     a  b  c  d  observed       py1       py2  class     p2       odds
 1.  1  1  1  1       207  .6311003  .0137544      1  .0213   .0217636
 2.  2  1  1  1        10  .0105879  .0186929      2  .6388   1.768549
 3.  1  2  1  1        13  .0189155  .0196463      2  .5096   1.039152
 4.  2  2  1  1        11  .0003173  .0267003      2  .9883   84.47009
 5.  1  1  2  1         7   .024316  .0037828      1  .1348   .1558021
 6.  2  1  2  1         1  .0004079   .005141      2  .9267   12.64257
 7.  1  2  2  1         1  .0007288  .0054032      2  .8813     7.4246
 8.  2  2  2  1         1  .0000122  .0073432      2  .9983   587.2353
 9.  1  1  1  2        46  .1403217  .0082879      1  .0558   .0590976
10.  2  1  1  2         3  .0023542  .0112637      2  .8274   4.793743
11.  1  2  1  2         4  .0042057  .0118382      2  .7379   2.815338
12.  2  2  1  2         4  .0000706  .0160886      2  .9956   226.2727
13.  1  1  2  2         5  .0054065  .0022794      1  .2968   .4220705
14.  2  1  2  2         2  .0000907  .0030978      2  .9716   34.21127
15.  1  2  2  2         2   .000162  .0032558      2  .9527   20.14165
16.  2  2  2  2         2  2.72e-06  .0044247      2  .9994   1665.667
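The columns py1 and py2 are the joint probabilities of a response pattern and each latent class. Dividing py2 by py1 + py2 gives the posterior probability p2 that LEM wrote out. The following Python sketch redoes this calculation from the estimated probabilities in the LEM output; the names joint and posterior_class2 are hypothetical.

```python
# P(item = 1 | class 1) and P(item = 1 | class 2), from the LEM output.
cond_p1 = {
    "a": (0.9835, 0.4239),
    "b": (0.9709, 0.4118),
    "c": (0.9629, 0.7843),
    "d": (0.8181, 0.6240),
}
class_prob = (0.8390, 0.1610)  # P(X = 1), P(X = 2)


def joint(pattern):
    """Return (py1, py2): P(pattern, X = 1) and P(pattern, X = 2)."""
    result = list(class_prob)
    for item, response in zip("abcd", pattern):
        for k in range(2):
            p = cond_p1[item][k]
            # Response 1 uses p, response 2 uses its complement.
            result[k] *= p if response == 1 else 1.0 - p
    return tuple(result)


def posterior_class2(pattern):
    """Posterior probability of class 2 given the response pattern."""
    py1, py2 = joint(pattern)
    return py2 / (py1 + py2)


for pattern in [(1, 1, 1, 1), (1, 2, 1, 1), (2, 2, 2, 2)]:
    py1, py2 = joint(pattern)
    print(pattern, round(py1, 7), round(py2, 7),
          round(posterior_class2(pattern), 4))
```

Small differences from the Stata listing in the last digits are expected, because the inputs are rounded to four decimals and Stata stores the generated variables in single precision.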
Table 3.9 is omitted since we do not have the individual data file.
Table 3.10 is omitted since LEM requires the data to be entered as response vectors, whereas this model requires the data to be entered as frequencies of scores.
Table 4.1 on page 50 using left-right clinical scale data.
Model I: Proctor
lat 1
man 3
dim 4 2 2 2
lab X A B C
mod X
A|X eq2
B|X eq2
C|X eq2
dat [170 0
6 0
73 1
254 69]
des [1 0 0 1 0 1 0 1 *A|X
1 0 1 0 0 1 0 1 *B|X
1 0 1 0 1 0 0 1 ] *C|X
*** FREQUENCIES ***
A B C observed estimated std. res.
1 1 1 170.000 169.437 0.043
1 1 2 0.000 1.478 -1.216
1 2 1 6.000 3.689 1.203
1 2 2 0.000 0.607 -0.779
2 1 1 73.000 72.887 0.013
2 1 2 1.000 1.208 -0.190
2 2 1 254.000 255.395 -0.087
2 2 2 69.000 68.298 0.085
Model II: Intrusion-Omission Error
lat 1
man 3
dim 4 2 2 2
lab X A B C
mod X
A|X eq2
B|X eq2
C|X eq2
dat [170 0
6 0
73 1
254 69]
des [1 0 0 2 0 2 0 2 *A|X
1 0 1 0 0 2 0 2 *B|X
1 0 1 0 1 0 0 2 ] *C|X
*** FREQUENCIES ***
A B C observed estimated std. res.
1 1 1 170.000 170.000 -0.000
1 1 2 0.000 0.021 -0.146
1 2 1 6.000 4.550 0.680
1 2 2 0.000 1.204 -1.097
2 1 1 73.000 73.000 0.000
2 1 2 1.000 1.204 -0.186
2 2 1 254.000 255.450 -0.091
2 2 2 69.000 67.571 0.174
Model III: Variable-Specific Error
Notice that we did not impose a constraint on the third variable, since the program did not converge in LEM.
lat 1
man 3
dim 4 2 2 2
lab X A B C
mod X
A|X eq2
B|X eq2
C|X eq2
dat [170 0
6 0
73 1
254 69]
des [1 0 0 1 0 1 0 1 *A|X
2 0 2 0 0 2 0 2 *B|X
3 0 3 0 3 0 0 3 ] *C|X
*** FREQUENCIES ***
A B C observed estimated std. res.
1 1 1 170.000 170.992 -0.076
1 1 2 0.000 0.000 -0.002
1 2 1 6.000 5.008 0.443
1 2 2 0.000 0.000 -0.010
2 1 1 73.000 73.000 -0.000
2 1 2 1.000 1.992 -0.703
2 2 1 254.000 254.000 -0.000
2 2 2 69.000 68.008 0.120
Model IV: Latent-Class Specific Error
As mentioned in the text, we have imposed a constraint on the model.
lat 1
man 3
dim 4 2 2 2
lab X A B C
mod X
A|X eq2
B|X eq2
C|X eq2
dat [170 0
6 0
73 1
254 69]
des [-1 0 0 2 0 3 0 -1 *A|X
-1 0 2 0 0 3 0 -1 *B|X
-1 0 2 0 3 0 0 -1] *C|X
sta A|X [0 1 .5 .5 .5 .5 1 0]
sta B|X [0 1 .5 .5 .5 .5 1 0]
sta C|X [0 1 .5 .5 .5 .5 1 0]
wla table4_1_out.dat
A B C observed estimated std. res.
1 1 1 170.000 170.000 0.000
1 1 2 0.000 0.014 -0.117
1 2 1 6.000 5.736 0.110
1 2 2 0.000 0.130 -0.360
2 1 1 73.000 73.012 -0.001
2 1 2 1.000 0.976 0.024
2 2 1 254.000 254.132 -0.008
2 2 2 69.000 69.000 -0.000
The last two columns of Table 4.1 come from the output data file table4_1_out.dat created by the LEM code above. Notice that the order of the latent classes depends on the software and is reversed here relative to the result in the book.
1 1 1 170.0000 0.0000 0.0050 0.0008 0.9943 4 0.0057
1 2 1 6.0000 0.0000 0.0019 0.9981 0.0000 3 0.0019
2 1 1 73.0000 0.0000 0.9216 0.0784 0.0000 2 0.0784
2 1 2 1.0000 0.0000 0.8674 0.1326 0.0000 2 0.1326
2 2 1 254.0000 0.0000 0.0033 0.9967 0.0000 3 0.0033
2 2 2 69.0000 0.9169 0.0002 0.0830 0.0000 1 0.0831
Table 4.2 on page 51 using the clinical scale data.
Model I: Proctor model
lat 1
man 3
dim 4 2 2 2
lab X A B C
mod X
A|X eq2
B|X eq2
C|X eq2
dat [170 0
6 0
73 1
254 69]
des [1 0 0 1 0 1 0 1 *A|X
1 0 1 0 0 1 0 1 *B|X
1 0 1 0 1 0 0 1 ] *C|X
*** STATISTICS ***
Number of iterations = 125
Converge criterion = 0.0000008523
Seed random values = 5398
X-squared = 3.5860 (0.3098)
L-squared = 5.4403 (0.1423)
Cressie-Read = 3.9194 (0.2703)
Dissimilarity index = 0.0064
Degrees of freedom = 3
Log-likelihood = -746.10161
Number of parameters = 4 (+1)
Sample size = 573.0
BIC(L-squared) = -13.6123
AIC(L-squared) = -0.5597
BIC(log-likelihood) = 1517.6068
AIC(log-likelihood) = 1500.2032
*** LATENT CLASS OUTPUT ***
X 1 X 2 X 3 X 4
0.3024 0.1240 0.4553 0.1184
A 1 0.9914 0.0086 0.0086 0.0086
A 2 0.0086 0.9914 0.9914 0.9914
B 1 0.9914 0.9914 0.0086 0.0086
B 2 0.0086 0.0086 0.9914 0.9914
C 1 0.9914 0.9914 0.9914 0.0086
C 2 0.0086 0.0086 0.0086 0.9914
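The Proctor model has a single error rate (0.0086 in the output above) shared by all items and classes, so its estimated frequencies follow directly from the class proportions. The following Python sketch rebuilds them from the rounded estimates. The ideal response pattern of each class is read off the LATENT CLASS OUTPUT, and the function name expected is hypothetical.

```python
from itertools import product

N = 573                                        # sample size
class_prob = [0.3024, 0.1240, 0.4553, 0.1184]  # P(X = 1) ... P(X = 4)
# Ideal (error-free) response pattern A B C of each latent class.
ideal = [(1, 1, 1), (2, 1, 1), (2, 2, 1), (2, 2, 2)]
err = 0.0086                                   # common response error rate


def expected(pattern):
    """Expected frequency of a response pattern under the Proctor model."""
    total = 0.0
    for pi, true_pattern in zip(class_prob, ideal):
        # Each item that deviates from the ideal pattern costs one error.
        mismatches = sum(o != t for o, t in zip(pattern, true_pattern))
        total += pi * err ** mismatches * (1 - err) ** (3 - mismatches)
    return N * total


for pattern in product((1, 2), repeat=3):
    print(*pattern, f"{expected(pattern):9.3f}")
```

The results agree with the estimated frequencies LEM reports for the Proctor model (for example 169.437 for pattern 1 1 1) up to the rounding of the inputs to four decimals.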
Model II: Intrusion-omission error model
lat 1
man 3
dim 4 2 2 2
lab X A B C
mod X
A|X eq2
B|X eq2
C|X eq2
dat [170 0
6 0
73 1
254 69]
des [1 0 0 2 0 2 0 2 *A|X
1 0 1 0 0 2 0 2 *B|X
1 0 1 0 1 0 0 2 ] *C|X
*** STATISTICS ***
Number of iterations = 88
Converge criterion = 0.0000009873
Seed random values = 3689
X-squared = 1.7600 (0.4148)
L-squared = 2.9444 (0.2294)
Cressie-Read = 1.9907 (0.3696)
Dissimilarity index = 0.0050
Degrees of freedom = 2
Log-likelihood = -744.85363
Number of parameters = 5 (+1)
Sample size = 573.0
BIC(L-squared) = -9.7574
AIC(L-squared) = -1.0556
BIC(log-likelihood) = 1521.4617
AIC(log-likelihood) = 1499.7073
*** LATENT CLASS OUTPUT ***
X 1 X 2 X 3 X 4
0.2944 0.1216 0.4597 0.1243
A 1 1.0000 0.0175 0.0175 0.0175
A 2 0.0000 0.9825 0.9825 0.9825
B 1 1.0000 1.0000 0.0175 0.0175
B 2 0.0000 0.0000 0.9825 0.9825
C 1 1.0000 1.0000 1.0000 0.0175
C 2 0.0000 0.0000 0.0000 0.9825
Model III: Variable-specific error model
lat 1
man 3
dim 4 2 2 2
lab X A B C
mod X
A|X eq2
B|X eq2
C|X eq2
dat [170 0
6 0
73 1
254 69]
des [1 0 0 1 0 1 0 1 *A|X
2 0 2 0 0 2 0 2 *B|X
-1 0 -1 0 -1 0 0 -1] *C|X
sta C|X [.9999 .0001 .9999 .0001 .9999 .0001 .0001 .9999]
*** STATISTICS ***
Number of iterations = 241
Converge criterion = 0.0000009855
Seed random values = 6016
X-squared = 0.7334 (0.6930)
L-squared = 0.8522 (0.6531)
Cressie-Read = 0.7666 (0.6816)
Dissimilarity index = 0.0035
Degrees of freedom = 2
Log-likelihood = -743.80752
Number of parameters = 5 (+1)
Sample size = 573.0
BIC(L-squared) = -11.8496
AIC(L-squared) = -3.1478
BIC(log-likelihood) = 1519.3695
AIC(log-likelihood) = 1497.6150
*** LATENT CLASS OUTPUT ***
X 1 X 2 X 3 X 4
0.3072 0.1179 0.4528 0.1221
A 1 1.0000 0.0000 0.0000 0.0000
A 2 0.0000 1.0000 1.0000 1.0000
B 1 0.9716 0.9716 0.0284 0.0284
B 2 0.0284 0.0284 0.9716 0.9716
C 1 0.9999 0.9999 0.9999 0.0001
C 2 0.0001 0.0001 0.0001 0.9999
Model IV: Latent-class-specific model
lat 1
man 3
dim 4 2 2 2
lab X A B C
mod X
A|X eq2
B|X eq2
C|X eq2
rec 573
dat table4_1_raw.dat
des [-1 0 0 2 0 3 0 -1 *A|X
-1 0 2 0 0 3 0 -1 *B|X
-1 0 2 0 3 0 0 -1] *C|X
sta A|X [0 1 .5 .5 .5 .5 1 0]
sta B|X [0 1 .5 .5 .5 .5 1 0]
sta C|X [0 1 .5 .5 .5 .5 1 0]
*** STATISTICS ***
Number of iterations = 30
Converge criterion = 0.0000006655
Seed random values = 5339
X-squared = 0.1559 (0.9250)
L-squared = 0.2989 (0.8612)
Cressie-Read = 0.1845 (0.9119)
Dissimilarity index = 0.0005
Degrees of freedom = 2
Log-likelihood = -743.53087
Number of parameters = 5 (+1)
Sample size = 573.0
BIC(L-squared) = -12.4029
AIC(L-squared) = -3.7011
BIC(log-likelihood) = 1518.8162
AIC(log-likelihood) = 1497.0617
*** LATENT CLASS OUTPUT ***
X 1 X 2 X 3 X 4
0.1104 0.1219 0.4727 0.2950
A 1 0.0000 0.0124 0.0221 1.0000
A 2 1.0000 0.9876 0.9779 0.0000
B 1 0.0000 0.9876 0.0221 1.0000
B 2 1.0000 0.0124 0.9779 0.0000
C 1 0.0000 0.9876 0.9779 1.0000
C 2 1.0000 0.0124 0.0221 0.0000
Table 4.3 on page 56 using Lazarsfeld-Stouffer Attitude data
Model I: One intrinsically unscalable class model
lat 1
man 4
dim 6 2 2 2 2
lab X A B C D
mod X
A|X eq2
B|X eq2
C|X eq2
D|X eq2
dat [75 3
42 10
55 8
45 16
69 16
60 25
96 52
199 229]
des [-1 0 0 -1 0 -1 0 -1 0 -1 0 0
-1 0 -1 0 0 -1 0 -1 0 -1 0 0
-1 0 -1 0 -1 0 0 -1 0 -1 0 0
-1 0 -1 0 -1 0 -1 0 0 -1 0 0]
sta A|X [1 0 0 1 0 1 0 1 0 1 .3 .7]
sta B|X [1 0 1 0 0 1 0 1 0 1 .3 .7]
sta C|X [1 0 1 0 1 0 0 1 0 1 .3 .7]
sta D|X [1 0 1 0 1 0 1 0 0 1 .3 .7]
*** STATISTICS ***
Number of iterations = 148
Converge criterion = 0.0000009812
Seed random values = 5371
X-squared = 26.0852 (0.0002)
L-squared = 26.5005 (0.0002)
Cressie-Read = 26.0954 (0.0002)
Dissimilarity index = 0.0412
Degrees of freedom = 6
Log-likelihood = -2357.21059
Number of parameters = 9 (+1)
Sample size = 1000.0
BIC(L-squared) = -14.9460
AIC(L-squared) = 14.5005
BIC(log-likelihood) = 4776.5910
AIC(log-likelihood) = 4732.4212
*** LATENT CLASS OUTPUT ***
X 1 X 2 X 3 X 4 X 5 X 6
0.0498 0.0112 0.0000 0.0789 0.1880 0.6721
A 1 1.0000 0.0000 0.0000 0.0000 0.0000 0.3039
A 2 0.0000 1.0000 1.0000 1.0000 1.0000 0.6961
B 1 1.0000 1.0000 0.0000 0.0000 0.0000 0.3556
B 2 0.0000 0.0000 1.0000 1.0000 1.0000 0.6444
C 1 1.0000 1.0000 1.0000 0.0000 0.0000 0.4657
C 2 0.0000 0.0000 0.0000 1.0000 1.0000 0.5343
D 1 1.0000 1.0000 1.0000 1.0000 0.0000 0.7456
D 2 0.0000 0.0000 0.0000 0.0000 1.0000 0.2544
Model II: Two intrinsically unscalable classes model
lat 1
man 4
dim 7 2 2 2 2
lab X A B C D
mod X
A|X eq2
B|X eq2
C|X eq2
D|X eq2
dat [75 3
42 10
55 8
45 16
69 16
60 25
96 52
199 229]
des [-1 0 0 -1 0 -1 0 -1 0 -1 0 0 0 0
-1 0 -1 0 0 -1 0 -1 0 -1 0 0 0 0
-1 0 -1 0 -1 0 0 -1 0 -1 0 0 0 0
-1 0 -1 0 -1 0 -1 0 0 -1 0 0 0 0]
sta A|X [1 0 0 1 0 1 0 1 0 1 .3 .7 .4 .6]
sta B|X [1 0 1 0 0 1 0 1 0 1 .3 .7 .4 .6]
sta C|X [1 0 1 0 1 0 0 1 0 1 .3 .7 .4 .6]
sta D|X [1 0 1 0 1 0 1 0 0 1 .3 .7 .4 .6]
*** STATISTICS ***
Number of iterations = 3400
Converge criterion = 0.0000010000
Seed random values = 4829
X-squared = 4.0302 (0.0447)
L-squared = 3.5974 (0.0579)
Cressie-Read = 3.8694 (0.0492)
Dissimilarity index = 0.0079
Degrees of freedom = 1
Log-likelihood = -2345.75903
Number of parameters = 14 (+1)
Sample size = 1000.0
BIC(L-squared) = -3.3104
AIC(L-squared) = 1.5974
BIC(log-likelihood) = 4788.2266
AIC(log-likelihood) = 4719.5181
*** LATENT CLASS OUTPUT ***
X 1 X 2 X 3 X 4 X 5 X 6 X 7
0.0213 0.0000 0.0146 0.1145 0.1419 0.3008 0.4069
A 1 1.0000 0.0000 0.0000 0.0000 0.0000 0.1565 0.4562
A 2 0.0000 1.0000 1.0000 1.0000 1.0000 0.8435 0.5438
B 1 1.0000 1.0000 0.0000 0.0000 0.0000 0.2355 0.5109
B 2 0.0000 0.0000 1.0000 1.0000 1.0000 0.7645 0.4891
C 1 1.0000 1.0000 1.0000 0.0000 0.0000 0.3571 0.5670
C 2 0.0000 0.0000 0.0000 1.0000 1.0000 0.6429 0.4330
D 1 1.0000 1.0000 1.0000 1.0000 0.0000 0.3108 0.9760
D 2 0.0000 0.0000 0.0000 0.0000 1.0000 0.6892 0.0240
Model III: intrusion-omission error model
lat 1
man 4
dim 5 2 2 2 2
lab X A B C D
mod X
A|X eq2
B|X eq2
C|X eq2
D|X eq2
dat [75 3
42 10
55 8
45 16
69 16
60 25
96 52
199 229]
des [1 0 0 2 0 2 0 2 0 2
1 0 1 0 0 2 0 2 0 2
1 0 1 0 1 0 0 2 0 2
1 0 1 0 1 0 1 0 0 2]
*** STATISTICS ***
Number of iterations = 109
Converge criterion = 0.0000009896
Seed random values = 2808
X-squared = 63.3048 (0.0000)
L-squared = 71.5053 (0.0000)
Cressie-Read = 65.0494 (0.0000)
Dissimilarity index = 0.0868
Degrees of freedom = 9
Log-likelihood = -2379.71298
Number of parameters = 6 (+1)
Sample size = 1000.0
BIC(L-squared) = 9.3355
AIC(L-squared) = 53.5053
BIC(log-likelihood) = 4800.8725
AIC(log-likelihood) = 4771.4260
*** LATENT CLASS OUTPUT ***
X 1 X 2 X 3 X 4 X 5
0.1927 0.0804 0.1274 0.3376 0.2619
A 1 0.7870 0.1279 0.1279 0.1279 0.1279
A 2 0.2130 0.8721 0.8721 0.8721 0.8721
B 1 0.7870 0.7870 0.1279 0.1279 0.1279
B 2 0.2130 0.2130 0.8721 0.8721 0.8721
C 1 0.7870 0.7870 0.7870 0.1279 0.1279
C 2 0.2130 0.2130 0.2130 0.8721 0.8721
D 1 0.7870 0.7870 0.7870 0.7870 0.1279
D 2 0.2130 0.2130 0.2130 0.2130 0.8721
Model IV: variable-specific error model
lat 1
man 4
dim 5 2 2 2 2
lab X A B C D
mod X
A|X eq2
B|X eq2
C|X eq2
D|X eq2
dat [75 3
42 10
55 8
45 16
69 16
60 25
96 52
199 229]
des [1 0 0 1 0 1 0 1 0 1
2 0 2 0 0 2 0 2 0 2
3 0 3 0 3 0 0 3 0 3
4 0 4 0 4 0 4 0 0 4]
*** STATISTICS ***
Number of iterations = 192
Converge criterion = 0.0000009553
Seed random values = 778
X-squared = 42.6358 (0.0000)
L-squared = 43.6218 (0.0000)
Cressie-Read = 42.8195 (0.0000)
Dissimilarity index = 0.0767
Degrees of freedom = 7
Log-likelihood = -2365.77124
Number of parameters = 8 (+1)
Sample size = 1000.0
BIC(L-squared) = -4.7325
AIC(L-squared) = 29.6218
BIC(log-likelihood) = 4786.8045
AIC(log-likelihood) = 4747.5425
*** LATENT CLASS OUTPUT ***
X 1 X 2 X 3 X 4 X 5
0.1585 0.0620 0.0673 0.3562 0.3560
A 1 0.8602 0.1398 0.1398 0.1398 0.1398
A 2 0.1398 0.8602 0.8602 0.8602 0.8602
B 1 0.8187 0.8187 0.1813 0.1813 0.1813
B 2 0.1813 0.1813 0.8187 0.8187 0.8187
C 1 0.7656 0.7656 0.7656 0.2344 0.2344
C 2 0.2344 0.2344 0.2344 0.7656 0.7656
D 1 0.9896 0.9896 0.9896 0.9896 0.0104
D 2 0.0104 0.0104 0.0104 0.0104 0.9896
Model V: Intrusion-omission error and one intrinsically unscalable class model
lat 1
man 4
dim 6 2 2 2 2
lab X A B C D
mod X
A|X eq2
B|X eq2
C|X eq2
D|X eq2
dat [75 3
42 10
55 8
45 16
69 16
60 25
96 52
199 229]
des [1 0 0 2 0 2 0 2 0 2 0 0
1 0 1 0 0 2 0 2 0 2 0 0
1 0 1 0 1 0 0 2 0 2 0 0
1 0 1 0 1 0 1 0 0 2 0 0]
*** STATISTICS ***
Number of iterations = 556
Converge criterion = 0.0000009916
Seed random values = 597
X-squared = 5.3657 (0.2518)
L-squared = 5.6425 (0.2275)
Cressie-Read = 5.4356 (0.2455)
Dissimilarity index = 0.0145
Degrees of freedom = 4
Log-likelihood = -2346.78156
Number of parameters = 11 (+1)
Sample size = 1000.0
BIC(L-squared) = -21.9886
AIC(L-squared) = -2.3575
BIC(log-likelihood) = 4769.5484
AIC(log-likelihood) = 4715.5631
*** LATENT CLASS OUTPUT ***
X 1 X 2 X 3 X 4 X 5 X 6
0.0197 0.0216 0.0660 0.2837 0.1252 0.4838
A 1 1.0000 0.4537 0.4537 0.4537 0.4537 0.0256
A 2 0.0000 0.5463 0.5463 0.5463 0.5463 0.9744
B 1 1.0000 1.0000 0.4537 0.4537 0.4537 0.0791
B 2 0.0000 0.0000 0.5463 0.5463 0.5463 0.9209
C 1 1.0000 1.0000 1.0000 0.4537 0.4537 0.1708
C 2 0.0000 0.0000 0.0000 0.5463 0.5463 0.8292
D 1 1.0000 1.0000 1.0000 1.0000 0.4537 0.3993
D 2 0.0000 0.0000 0.0000 0.0000 0.5463 0.6007
Model VI: Variable-specific error and one intrinsically unscalable class model
lat 1
man 4
dim 6 2 2 2 2
lab X A B C D
mod X
A|X eq2
B|X eq2
C|X eq2
D|X eq2
dat [75 3
42 10
55 8
45 16
69 16
60 25
96 52
199 229]
des [1 0 0 1 0 1 0 1 0 1 0 0
2 0 2 0 0 2 0 2 0 2 0 0
3 0 3 0 3 0 0 3 0 3 0 0
4 0 4 0 4 0 4 0 0 4 0 0]
*** STATISTICS ***
Number of iterations = 536
Converge criterion = 0.0000009061
Seed random values = 2114
X-squared = 1.5933 (0.4508)
L-squared = 1.6237 (0.4440)
Cressie-Read = 1.6020 (0.4489)
Dissimilarity index = 0.0084
Degrees of freedom = 2
Log-likelihood = -2344.77216
Number of parameters = 13 (+1)
Sample size = 1000.0
BIC(L-squared) = -12.1919
AIC(L-squared) = -2.3763
BIC(log-likelihood) = 4779.3451
AIC(log-likelihood) = 4715.5443
*** LATENT CLASS OUTPUT ***
X 1 X 2 X 3 X 4 X 5 X 6
0.1800 0.0513 0.0902 0.1839 0.1449 0.3497
A 1 0.7576 0.2424 0.2424 0.2424 0.2424 0.0103
A 2 0.2424 0.7576 0.7576 0.7576 0.7576 0.9897
B 1 0.6288 0.6288 0.3712 0.3712 0.3712 0.0000
B 2 0.3712 0.3712 0.6288 0.6288 0.6288 1.0000
C 1 0.6650 0.6650 0.6650 0.3350 0.3350 0.1356
C 2 0.3350 0.3350 0.3350 0.6650 0.6650 0.8644
D 1 1.0000 1.0000 1.0000 1.0000 0.0000 0.3879
D 2 0.0000 0.0000 0.0000 0.0000 1.0000 0.6121
Table 4.4 on page 58 using Model VI in the example above. Notice that, as explained on the previous page of the book, the table is based on Model VI rather than Model V. The estimated frequency column is part of the standard LEM output. We used the option wla to write the classification table to a text file.
lat 1
man 4
dim 6 2 2 2 2
lab X A B C D
mod X
A|X eq2
B|X eq2
C|X eq2
D|X eq2
dat [75 3
42 10
55 8
45 16
69 16
60 25
96 52
199 229]
des [1 0 0 1 0 1 0 1 0 1 0 0
2 0 2 0 0 2 0 2 0 2 0 0
3 0 3 0 3 0 0 3 0 3 0 0
4 0 4 0 4 0 4 0 0 4 0 0]
wla table4_4.out
*** FREQUENCIES ***
A B C D observed estimated std. res.
1 1 1 1 75.000 73.185 0.212
1 1 1 2 3.000 4.370 -0.655
1 1 2 1 42.000 45.070 -0.457
1 1 2 2 10.000 8.679 0.448
1 2 1 1 55.000 55.431 -0.058
1 2 1 2 8.000 7.708 0.105
1 2 2 1 45.000 42.958 0.312
1 2 2 2 16.000 16.594 -0.146
2 1 1 1 69.000 68.652 0.042
2 1 1 2 16.000 13.649 0.636
2 1 2 1 60.000 60.266 -0.034
2 1 2 2 25.000 27.113 -0.406
2 2 1 1 96.000 96.501 -0.051
2 2 1 2 52.000 51.849 0.021
2 2 2 1 199.000 198.937 0.004
2 2 2 2 229.000 229.037 -0.002
We now copy and paste the contents of the data file table4_4.out into the Stata do-file editor and input them as a data set. The modal posterior probability is the largest of the six possible class probabilities.
clear
input A B C D observed p1 p2 p3 p4 p5 p6 class error
1 1 1 1 75.0000 0.7796 0.0708 0.0737 0.0758 0.0000 0.0000 1 0.2204
1 1 1 2 3.0000 0.0002 0.0000 0.0000 0.0001 0.9997 0.0000 5 0.0003
1 1 2 1 42.0000 0.6373 0.0579 0.0603 0.2445 0.0000 0.0000 1 0.3627
1 1 2 2 10.0000 0.0000 0.0000 0.0000 0.0001 0.9999 0.0000 5 0.0001
1 2 1 1 55.0000 0.6064 0.0551 0.1653 0.1699 0.0000 0.0033 1 0.3936
1 2 1 2 8.0000 0.0001 0.0000 0.0000 0.0001 0.9620 0.0379 5 0.0380
1 2 2 1 45.0000 0.3939 0.0358 0.1074 0.4355 0.0000 0.0274 4 0.5645
1 2 2 2 16.0000 0.0000 0.0000 0.0000 0.0000 0.8877 0.1122 5 0.1123
2 1 1 1 69.0000 0.2660 0.2359 0.2456 0.2525 0.0000 0.0000 1 0.7340
2 1 1 2 16.0000 0.0000 0.0000 0.0000 0.0001 0.9999 0.0000 5 0.0001
2 1 2 1 60.0000 0.1526 0.1353 0.1408 0.5713 0.0000 0.0000 4 0.4287
2 1 2 2 25.0000 0.0000 0.0000 0.0000 0.0001 0.9999 0.0000 5 0.0001
2 2 1 1 96.0000 0.1115 0.0989 0.2966 0.3049 0.0000 0.1882 4 0.6951
2 2 1 2 52.0000 0.0000 0.0000 0.0000 0.0000 0.4468 0.5532 6 0.4468
2 2 2 1 199.0000 0.0272 0.0241 0.0724 0.2938 0.0000 0.5824 6 0.4176
2 2 2 2 229.0000 0.0000 0.0000 0.0000 0.0000 0.2009 0.7991 6 0.2009
end
egen postp=rmax(p1-p6)
sort D C B A
list A B C D observed class postp, clean
     A  B  C  D  observed  class   postp
 1.  1  1  1  1        75      1   .7796
 2.  2  1  1  1        69      1    .266
 3.  1  2  1  1        55      1   .6064
 4.  2  2  1  1        96      4   .3049
 5.  1  1  2  1        42      1   .6373
 6.  2  1  2  1        60      4   .5713
 7.  1  2  2  1        45      4   .4355
 8.  2  2  2  1       199      6   .5824
 9.  1  1  1  2         3      5   .9997
10.  2  1  1  2        16      5   .9999
11.  1  2  1  2         8      5    .962
12.  2  2  1  2        52      6   .5532
13.  1  1  2  2        10      5   .9999
14.  2  1  2  2        25      5   .9999
15.  1  2  2  2        16      5   .8877
16.  2  2  2  2       229      6   .7991
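The class and error columns written by wla follow from the six posterior probabilities. The modal class is the one with the largest posterior, and the classification error is one minus that maximum. The following Python sketch applies this rule to three rows of the data above; it covers only a subset of the table, and the function name modal_class is hypothetical.

```python
# Posterior probabilities p1..p6 for three response patterns (A B C D),
# copied from the table4_4.out data above.
posteriors = {
    (1, 1, 1, 1): [0.7796, 0.0708, 0.0737, 0.0758, 0.0000, 0.0000],
    (1, 2, 2, 1): [0.3939, 0.0358, 0.1074, 0.4355, 0.0000, 0.0274],
    (2, 2, 2, 2): [0.0000, 0.0000, 0.0000, 0.0000, 0.2009, 0.7991],
}


def modal_class(probs):
    """Return (class number, modal posterior, classification error)."""
    postp = max(probs)
    # Classes are numbered from 1, list positions from 0.
    return probs.index(postp) + 1, postp, 1.0 - postp


for pattern, probs in posteriors.items():
    cls, postp, error = modal_class(probs)
    print(*pattern, cls, f"{postp:.4f}", f"{error:.4f}")
```

Here max plays the role of the egen row maximum in the Stata code. The printed errors can differ from the wla file in the last digit because the posteriors are rounded.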
Result on page 60 using the Stouffer-Toby role conflict data of Table 4.6 on page 61. In this example, we show how to specify a model that reads a data set with individual records. You can download the data file by following the link here.
lat 1
man 4
dim 5 2 2 2 2
lab X A B C D
mod X
A|X eq2
B|X eq2
C|X eq2
D|X eq2
rec 216
dat table4_6_raw.dat
des [1 0 0 1 0 1 0 1 0 1
3 0 3 0 0 4 0 4 0 4
5 0 5 0 5 0 6 0 6 0
7 0 7 0 7 0 7 0 0 7]
*** STATISTICS ***
Number of iterations = 158
Converge criterion = 0.0000009505
Seed random values = 3566
X-squared = 0.8954 (0.9706)
L-squared = 0.9210 (0.9687)
Cressie-Read = 0.9032 (0.9700)
Dissimilarity index = 0.0194
Degrees of freedom = 5
Log-likelihood = -503.56819
Number of parameters = 10 (+1)
Sample size = 216.0
BIC(L-squared) = -25.9554
AIC(L-squared) = -9.0790
BIC(log-likelihood) = 1060.8892
AIC(log-likelihood) = 1027.1364
*** LATENT CLASS OUTPUT ***
X 1 X 2 X 3 X 4 X 5
0.2391 0.0179 0.1026 0.4390 0.2014
A 1 0.8639 0.1361 0.1361 0.1361 0.1361
A 2 0.1361 0.8639 0.8639 0.8639 0.8639
B 1 0.9477 0.9477 0.3638 0.3638 0.3638
B 2 0.0523 0.0523 0.6362 0.6362 0.6362
C 1 0.9398 0.9398 0.9398 0.2531 0.2531
C 2 0.0602 0.0602 0.0602 0.7469 0.7469
D 1 0.9884 0.9884 0.9884 0.9884 0.0116
D 2 0.0116 0.0116 0.0116 0.0116 0.9884
A model with one intrinsically unscalable class is fitted to the Stouffer-Toby data (page 61).
lat 1
man 4
dim 6 2 2 2 2
lab X A B C D
mod X
A|X eq2
B|X eq2
C|X eq2
D|X eq2
rec 216
dat table4_6_raw.dat
des [-1 0 0 -1 0 -1 0 -1 0 -1 0 0
-1 0 -1 0 0 -1 0 -1 0 -1 0 0
-1 0 -1 0 -1 0 0 -1 0 -1 0 0
-1 0 -1 0 -1 0 -1 0 0 -1 0 0]
sta A|X [1 0 0 1 0 1 0 1 0 1 .3 .7]
sta B|X [1 0 1 0 0 1 0 1 0 1 .3 .7]
sta C|X [1 0 1 0 1 0 0 1 0 1 .3 .7]
sta D|X [1 0 1 0 1 0 1 0 0 1 .3 .7]
*** STATISTICS ***
Number of iterations = 168
Converge criterion = 0.0000009558
Seed random values = 455
X-squared = 1.0053 (0.9854)
L-squared = 0.9885 (0.9860)
Cressie-Read = 0.9988 (0.9857)
Dissimilarity index = 0.0135
Degrees of freedom = 6
Log-likelihood = -503.60197
Number of parameters = 9 (+1)
Sample size = 216.0
BIC(L-squared) = -31.2631
AIC(L-squared) = -11.0115
BIC(log-likelihood) = 1055.5815
AIC(log-likelihood) = 1025.2039
*** LATENT CLASS OUTPUT ***
X 1 X 2 X 3 X 4 X 5 X 6
0.1771 0.0350 0.0257 0.0318 0.0484 0.6820
A 1 1.0000 0.0000 0.0000 0.0000 0.0000 0.1951
A 2 0.0000 1.0000 1.0000 1.0000 1.0000 0.8049
B 1 1.0000 1.0000 0.0000 0.0000 0.0000 0.4425
B 2 0.0000 0.0000 1.0000 1.0000 1.0000 0.5575
C 1 1.0000 1.0000 1.0000 0.0000 0.0000 0.3845
C 2 0.0000 0.0000 0.0000 1.0000 1.0000 0.6155
D 1 1.0000 1.0000 1.0000 1.0000 0.0000 0.7655
D 2 0.0000 0.0000 0.0000 0.0000 1.0000 0.2345
Latent markov model example -- to be done
Located latent class model example -- to be done
T-class mixture model example -- to be done
Table 5.1 on page 69 using IEA bus data
Model I: linear scale
lat 1
man 4
dim 5 2 2 2 2
lab X A B C D
mod X
A|X eq2
B|X eq2
C|X eq2
D|X eq2
dat [1138 13
75 15
502 9
198 23
1532 43
200 59
1354 37
852 309]
des [1 0 0 1 0 1 0 1 0 1 *A|X
2 0 2 0 0 2 0 2 0 2 *B|X
3 0 3 0 3 0 0 3 0 3 *C|X
4 0 4 0 4 0 4 0 0 4] *D|X
*** STATISTICS ***
Number of iterations = 105
Converge criterion = 0.0000009275
Seed random values = 2318
X-squared = 40.3671 (0.0000)
L-squared = 46.8495 (0.0000)
Cressie-Read = 42.2199 (0.0000)
Dissimilarity index = 0.0163
Degrees of freedom = 7
Log-likelihood = -12930.79076
Number of parameters = 8 (+1)
Sample size = 6359.0
BIC(L-squared) = -14.4539
AIC(L-squared) = 32.8495
BIC(log-likelihood) = 25931.6425
AIC(log-likelihood) = 25877.5815
*** FREQUENCIES ***
A B C D observed estimated std. res.
1 1 1 1 1138.000 1148.738 -0.317
1 1 1 2 13.000 23.308 -2.135
1 1 2 1 75.000 69.801 0.622
1 1 2 2 15.000 13.094 0.527
1 2 1 1 502.000 467.044 1.617
1 2 1 2 9.000 10.886 -0.572
1 2 2 1 198.000 182.668 1.134
1 2 2 2 23.000 57.460 -4.546
2 1 1 1 1532.000 1532.935 -0.024
2 1 1 2 43.000 32.269 1.889
2 1 2 1 200.000 220.747 -1.396
2 1 2 2 59.000 60.592 -0.205
2 2 1 1 1354.000 1376.752 -0.613
2 2 1 2 37.000 34.955 0.346
2 2 2 1 852.000 852.315 -0.011
2 2 2 2 309.000 275.435 2.022
To create the column next to the estimated frequencies, we again use Stata.
clear
input A B C D observed estimated std_res
1 1 1 1 1138.000 1148.738 -0.317
1 1 1 2 13.000 23.308 -2.135
1 1 2 1 75.000 69.801 0.622
1 1 2 2 15.000 13.094 0.527
1 2 1 1 502.000 467.044 1.617
1 2 1 2 9.000 10.886 -0.572
1 2 2 1 198.000 182.668 1.134
1 2 2 2 23.000 57.460 -4.546
2 1 1 1 1532.000 1532.935 -0.024
2 1 1 2 43.000 32.269 1.889
2 1 2 1 200.000 220.747 -1.396
2 1 2 2 59.000 60.592 -0.205
2 2 1 1 1354.000 1376.752 -0.613
2 2 1 2 37.000 34.955 0.346
2 2 2 1 852.000 852.315 -0.011
2 2 2 2 309.000 275.435 2.022
end
gen pearson = (observed - estimated)^2/estimated
egen ttl = sum(pearson)
sort D C B A
list, clean
     A  B  C  D  observed  estima~d  std_res    pearson       ttl
 1.  1  1  1  1      1138  1148.738    -.317   .1003757  40.36703
 2.  2  1  1  1      1532  1532.935    -.024   .0005704  40.36703
 3.  1  2  1  1       502   467.044    1.617   2.616288  40.36703
 4.  2  2  1  1      1354  1376.752    -.613   .3759947  40.36703
 5.  1  1  2  1        75    69.801     .622   .3872376  40.36703
 6.  2  1  2  1       200   220.747   -1.396   1.949914  40.36703
 7.  1  2  2  1       198   182.668    1.134   1.286872  40.36703
 8.  2  2  2  1       852   852.315    -.011   .0001164  40.36703
 9.  1  1  1  2        13    23.308   -2.135    4.55873  40.36703
10.  2  1  1  2        43    32.269    1.889   3.568575  40.36703
11.  1  2  1  2         9    10.886    -.572   .3267495  40.36703
12.  2  2  1  2        37    34.955     .346     .11964  40.36703
13.  1  1  2  2        15    13.094     .527   .2774428  40.36703
14.  2  1  2  2        59    60.592    -.205   .0418283  40.36703
15.  1  2  2  2        23     57.46   -4.546    20.6664  40.36703
16.  2  2  2  2       309   275.435    2.022   4.090292  40.36703
Model II: biform scale
lat 1
man 4
dim 6 2 2 2 2
lab X A B C D
mod X
A|X eq2
B|X eq2
C|X eq2
D|X eq2
dat [1138 13
75 15
502 9
198 23
1532 43
200 59
1354 37
852 309]
des [1 0 0 1 1 0 0 1 0 1 0 1 *A|X
2 0 2 0 0 2 0 2 0 2 0 2 *B|X
3 0 3 0 3 0 3 0 0 3 0 3 *C|X
4 0 4 0 4 0 4 0 4 0 0 4] *D|X
*** STATISTICS ***
Number of iterations = 178
Converge criterion = 0.0000009704
Seed random values = 2653
X-squared = 35.2821 (0.0000)
L-squared = 39.6006 (0.0000)
Cressie-Read = 36.5025 (0.0000)
Dissimilarity index = 0.0127
Degrees of freedom = 6
Log-likelihood = -12927.16632
Number of parameters = 9 (+1)
Sample size = 6359.0
BIC(L-squared) = -12.9452
AIC(L-squared) = 27.6006
BIC(log-likelihood) = 25933.1513
AIC(log-likelihood) = 25872.3326
*** FREQUENCIES ***
A B C D observed estimated std. res.
1 1 1 1 1138.000 1130.527 0.222
1 1 1 2 13.000 22.054 -1.928
1 1 2 1 75.000 73.252 0.204
1 1 2 2 15.000 10.609 1.348
1 2 1 1 502.000 503.105 -0.049
1 2 1 2 9.000 11.647 -0.775
1 2 2 1 198.000 169.048 2.227
1 2 2 2 23.000 52.757 -4.097
2 1 1 1 1532.000 1538.874 -0.175
2 1 1 2 43.000 31.549 2.039
2 1 2 1 200.000 213.575 -0.929
2 1 2 2 59.000 54.525 0.606
2 2 1 1 1354.000 1353.022 0.027
2 2 1 2 37.000 36.893 0.018
2 2 2 1 852.000 869.598 -0.597
2 2 2 2 309.000 287.966 1.240
We skip creating the column labeled "Discr II" next to the estimated frequencies, since it is computed the same way as shown for model I.
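For readers who want to check the skipped step, here is a minimal sketch in Python (not the Stata used above). It assumes that the "Discr II" column holds the per-cell Pearson contributions, by analogy with model I, and it copies the observed and estimated frequencies from the model II output. The variable names are illustrative only.

```python
# Observed and estimated frequencies copied from the model II output above,
# in the printed cell order (A B C D = 1111, 1112, ..., 2222).
observed = [1138, 13, 75, 15, 502, 9, 198, 23,
            1532, 43, 200, 59, 1354, 37, 852, 309]
estimated = [1130.527, 22.054, 73.252, 10.609,
             503.105, 11.647, 169.048, 52.757,
             1538.874, 31.549, 213.575, 54.525,
             1353.022, 36.893, 869.598, 287.966]

# Per-cell Pearson contribution: (observed - estimated)^2 / estimated,
# the same formula as the Stata `gen pearson` line used for model I.
pearson = [(o - e) ** 2 / e for o, e in zip(observed, estimated)]
total = sum(pearson)

for o, e, p in zip(observed, estimated, pearson):
    print(f"{o:8.0f} {e:10.3f} {p:10.4f}")
print(f"total = {total:.4f}")
```

The total should agree, up to rounding of the printed estimates, with the X-squared value reported in the model II statistics.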
Model III: augmented biform
lat 1
man 4
dim 7 2 2 2 2
lab X A B C D
mod X
A|X eq2
B|X eq2
C|X eq2
D|X eq2
dat [1138 13
75 15
502 9
198 23
1532 43
200 59
1354 37
852 309]
des [1 0 0 1 0 1 0 1 0 1 1 0 1 0 *A|X
2 0 2 0 0 2 0 2 0 2 0 2 0 2 *B|X
3 0 3 0 3 0 0 3 0 3 3 0 0 3 *C|X
4 0 4 0 4 0 4 0 0 4 4 0 4 0] *D|X
*** STATISTICS ***
Number of iterations = 174
Converge criterion = 0.0000009458
Seed random values = 5390
X-squared = 20.7938 (0.0009)
L-squared = 18.5343 (0.0023)
Cressie-Read = 19.8692 (0.0013)
Dissimilarity index = 0.0060
Degrees of freedom = 5
Log-likelihood = -12916.63316
Number of parameters = 10 (+1)
Sample size = 6359.0
BIC(L-squared) = -25.2539
AIC(L-squared) = 8.5343
BIC(log-likelihood) = 25920.8426
AIC(log-likelihood) = 25853.2663
*** FREQUENCIES ***
A B C D observed estimated std. res.
1 1 1 1 1138.000 1130.049 0.237
1 1 1 2 13.000 22.502 -2.003
1 1 2 1 75.000 74.036 0.112
1 1 2 2 15.000 6.614 3.261
1 2 1 1 502.000 500.219 0.080
1 2 1 2 9.000 10.783 -0.543
1 2 2 1 198.000 198.526 -0.037
1 2 2 2 23.000 30.282 -1.323
2 1 1 1 1532.000 1539.779 -0.198
2 1 1 2 43.000 32.363 1.870
2 1 2 1 200.000 208.673 -0.600
2 1 2 2 59.000 61.000 -0.256
2 2 1 1 1354.000 1354.613 -0.017
2 2 1 2 37.000 36.699 0.050
2 2 2 1 852.000 845.105 0.237
2 2 2 2 309.000 307.757 0.071
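The BIC(L-squared) and AIC(L-squared) lines in the outputs above appear to follow L-squared minus df times ln(N), and L-squared minus 2 times df. The sketch below, in Python, reproduces the printed values for models II and III under that assumption. The function name `ic_from_lsq` is hypothetical and not part of LEM.

```python
import math

def ic_from_lsq(l_squared, df, n):
    """Return (BIC, AIC) based on L-squared.

    Assumed formulas: BIC = L2 - df * ln(N), AIC = L2 - 2 * df.
    """
    bic = l_squared - df * math.log(n)
    aic = l_squared - 2 * df
    return bic, aic

# (label, L-squared, df) as printed above; the sample size is 6359 for both.
for label, l2, df in [("Model II", 39.6006, 6), ("Model III", 18.5343, 5)]:
    bic, aic = ic_from_lsq(l2, df, 6359)
    print(f"{label}: BIC = {bic:.4f}, AIC = {aic:.4f}")
```

Under both criteria the smaller value belongs to model III, which is consistent with its lower L-squared at a cost of one additional parameter.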
Table 5.2 on page 70 based on model III in the previous example.
lat 1
man 4
dim 7 2 2 2 2
lab X A B C D
mod X
A|X eq2
B|X eq2
C|X eq2
D|X eq2
dat [1138 13
75 15
502 9
198 23
1532 43
200 59
1354 37
852 309]
des [1 0 0 1 0 1 0 1 0 1 1 0 1 0 *A|X
2 0 2 0 0 2 0 2 0 2 0 2 0 2 *B|X
3 0 3 0 3 0 0 3 0 3 3 0 0 3 *C|X
4 0 4 0 4 0 4 0 0 4 4 0 4 0] *D|X
*** LATENT CLASS OUTPUT ***
X 1 X 2 X 3 X 4 X 5 X 6 X 7
0.2078 0.2678 0.2273 0.1695 0.0630 0.0409 0.0238
A 1 0.9170 0.0830 0.0830 0.0830 0.0830 0.9170 0.9170
A 2 0.0830 0.9170 0.9170 0.9170 0.9170 0.0830 0.0830
B 1 0.8365 0.8365 0.1635 0.1635 0.1635 0.1635 0.1635
B 2 0.1635 0.1635 0.8365 0.8365 0.8365 0.8365 0.8365
C 1 0.9670 0.9670 0.9670 0.0330 0.0330 0.9670 0.0330
C 2 0.0330 0.0330 0.0330 0.9670 0.9670 0.0330 0.9670
D 1 0.9806 0.9806 0.9806 0.9806 0.0194 0.9806 0.9806
D 2 0.0194 0.0194 0.0194 0.0194 0.9806 0.0194 0.0194
Table 5.3 on page 71 based on model III in the previous example. We use the option wla to write the latent classification and error probabilities to a file, and we show the contents of the file after the LEM code. The first five columns are the four manifest variables and the observed frequency of each response pattern. The next seven columns are the posterior probabilities of the seven latent classes. The classification in the following column is the class with the largest posterior probability, which is called the modal posterior probability. The last column is the error probability, which is 1 minus the modal posterior probability.
lat 1
man 4
dim 7 2 2 2 2
lab X A B C D
mod X
A|X eq2
B|X eq2
C|X eq2
D|X eq2
dat [1138 13
75 15
502 9
198 23
1532 43
200 59
1354 37
852 309]
des [1 0 0 1 0 1 0 1 0 1 1 0 1 0 *A|X
2 0 2 0 0 2 0 2 0 2 0 2 0 2 *B|X
3 0 3 0 3 0 0 3 0 3 3 0 0 3 *C|X
4 0 4 0 4 0 4 0 0 4 4 0 4 0] *D|X
wla table5_3.out
1 1 1 1 1138.0000 0.8505 0.0993 0.0165 0.0004 0.0000 0.0327 0.0007 1 0.1495
1 1 1 2   13.0000 0.8439 0.0985 0.0163 0.0004 0.0078 0.0324 0.0006 1 0.1561
1 1 2 1   75.0000 0.4432 0.0517 0.0086 0.1874 0.0014 0.0170 0.2907 1 0.5568
1 1 2 2   15.0000 0.0980 0.0114 0.0019 0.0415 0.7791 0.0038 0.0643 5 0.2209
1 2 1 1  502.0000 0.3755 0.0438 0.1903 0.0048 0.0000 0.3780 0.0075 6 0.6220
1 2 1 2    9.0000 0.3441 0.0402 0.1744 0.0044 0.0835 0.3465 0.0069 6 0.6535
1 2 2 1  198.0000 0.0323 0.0038 0.0164 0.3577 0.0026 0.0325 0.5547 7 0.4453
1 2 2 2   23.0000 0.0042 0.0005 0.0021 0.0463 0.8708 0.0042 0.0719 5 0.1292
2 1 1 1 1532.0000 0.0565 0.8044 0.1334 0.0034 0.0000 0.0022 0.0000 2 0.1956
2 1 1 2   43.0000 0.0531 0.7562 0.1254 0.0032 0.0600 0.0020 0.0000 2 0.2438
2 1 2 1  200.0000 0.0142 0.2026 0.0336 0.7342 0.0054 0.0005 0.0093 4 0.2658
2 1 2 2   59.0000 0.0010 0.0137 0.0023 0.0496 0.9328 0.0000 0.0006 5 0.0672
2 2 1 1 1354.0000 0.0126 0.1787 0.7760 0.0198 0.0001 0.0126 0.0003 3 0.2240
2 2 1 2   37.0000 0.0092 0.1303 0.5659 0.0144 0.2708 0.0092 0.0002 3 0.4341
2 2 2 1  852.0000 0.0007 0.0098 0.0425 0.9278 0.0068 0.0007 0.0118 4 0.0722
2 2 2 2  309.0000 0.0000 0.0005 0.0023 0.0503 0.9461 0.0000 0.0006 5 0.0539
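The classification columns in this file can be approximately reproduced from the Table 5.2 estimates. The sketch below, in Python, assumes the usual latent class relation: the posterior probability of a class is proportional to the class proportion times the product of the conditional response probabilities. The names `p_class`, `p_item1` and `classify` are illustrative, and the four-decimal inputs mean results may differ from the file in the last digit.

```python
# Class proportions P(X) and P(item = 1 | X) for the seven classes,
# copied from the Table 5.2 latent class output.
p_class = [0.2078, 0.2678, 0.2273, 0.1695, 0.0630, 0.0409, 0.0238]
p_item1 = {
    "A": [0.9170, 0.0830, 0.0830, 0.0830, 0.0830, 0.9170, 0.9170],
    "B": [0.8365, 0.8365, 0.1635, 0.1635, 0.1635, 0.1635, 0.1635],
    "C": [0.9670, 0.9670, 0.9670, 0.0330, 0.0330, 0.9670, 0.0330],
    "D": [0.9806, 0.9806, 0.9806, 0.9806, 0.0194, 0.9806, 0.9806],
}

def classify(pattern):
    """pattern maps item name to response (1 or 2).

    Returns (posterior probabilities, modal class numbered from 1,
    error probability = 1 - modal posterior probability).
    """
    joint = []
    for t, px in enumerate(p_class):
        prob = px
        for item, resp in pattern.items():
            p1 = p_item1[item][t]
            prob *= p1 if resp == 1 else 1.0 - p1
        joint.append(prob)
    total = sum(joint)
    posterior = [j / total for j in joint]
    modal = max(range(len(posterior)), key=posterior.__getitem__)
    return posterior, modal + 1, 1.0 - posterior[modal]

posterior, modal_class, error = classify({"A": 1, "B": 1, "C": 1, "D": 1})
print([round(p, 4) for p in posterior], modal_class, round(error, 4))
```

For patterns where two posterior probabilities are nearly tied, such as 1 2 1 1, the rounded inputs may not be precise enough to reproduce the modal class printed by LEM.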
Table 6.2 on page 76 using table6_1_g.dat, table6_1_f.dat and table6_1_m.dat.
Model I: Combined group analysis
lat 1
man 5
dim 2 2 2 2 2 2
lab X A B C D G
mod
G
A|X
B|X
C|X
D|X
rec 317
dat table6_1_g.dat
*** STATISTICS ***
Number of iterations = 441
Converge criterion = 0.0000009597
Seed random values = 1059
X-squared = 24.8712 (0.2528)
L-squared = 28.8872 (0.1167)
Cressie-Read = 25.4289 (0.2291)
Dissimilarity index = 0.0856
Degrees of freedom = 21
Log-likelihood = -653.10120
Number of parameters = 10 (+1)
Sample size = 317.0
BIC(L-squared) = -92.0497
AIC(L-squared) = -13.1128
BIC(log-likelihood) = 1363.7914
AIC(log-likelihood) = 1326.2024
*** (CONDITIONAL) PROBABILITIES ***
* P(X) *
1 0.8353 (0.0796)
2 0.1647 (0.0796)
* P(G) *
1 0.4322 (0.0278)
2 0.5678 (0.0278)
* P(A|X) *
1 | 1 0.9837 (0.0291)
2 | 1 0.0163 (0.0291)
1 | 2 0.4314 (0.1779)
2 | 2 0.5686 (0.1779)
* P(B|X) *
1 | 1 0.9761 (0.0312)
2 | 1 0.0239 (0.0312)
1 | 2 0.4128 (0.1754)
2 | 2 0.5872 (0.1754)
* P(C|X) *
1 | 1 0.9629 (0.0153)
2 | 1 0.0371 (0.0153)
1 | 2 0.7858 (0.0838)
2 | 2 0.2142 (0.0838)
* P(D|X) *
1 | 1 0.8174 (0.0264)
2 | 1 0.1826 (0.0264)
1 | 2 0.6236 (0.0996)
2 | 2 0.3764 (0.0996)
Model II: Analysis on female group
lat 1
man 4
dim 2 2 2 2 2
lab X A B C D
mod
X
A|X
B|X
C|X
D|X
rec 180
dat table6_1_f.dat
*** STATISTICS ***
Number of iterations = 360
Converge criterion = 0.0000009611
Seed random values = 5364
X-squared = 7.3025 (0.2938)
L-squared = 8.6605 (0.1936)
Cressie-Read = 7.4207 (0.2837)
Dissimilarity index = 0.0362
Degrees of freedom = 6
Log-likelihood = -273.20739
Number of parameters = 9 (+1)
Sample size = 180.0
BIC(L-squared) = -22.4972
AIC(L-squared) = -3.3395
BIC(log-likelihood) = 593.1514
AIC(log-likelihood) = 564.4148
*** (CONDITIONAL) PROBABILITIES ***
* P(X) *
1 0.8534 (0.1022)
2 0.1466 (0.1022)
* P(A|X) *
1 | 1 0.9802 (0.0421)
2 | 1 0.0198 (0.0421)
1 | 2 0.3570 (0.2975)
2 | 2 0.6430 (0.2975)
* P(B|X) *
1 | 1 0.9365 (0.0514)
2 | 1 0.0635 (0.0514)
1 | 2 0.3085 (0.2616)
2 | 2 0.6915 (0.2616)
* P(C|X) *
1 | 1 0.9410 (0.0214)
2 | 1 0.0590 (0.0214)
1 | 2 0.8126 (0.1053)
2 | 2 0.1874 (0.1053)
* P(D|X) *
1 | 1 0.7909 (0.0355)
2 | 1 0.2091 (0.0355)
1 | 2 0.6255 (0.1313)
2 | 2 0.3745 (0.1313)
Model III: Analysis on male group
lat 1
man 4
dim 2 2 2 2 2
lab X A B C D
mod
X
A|X
B|X
C|X
D|X
rec 137
dat table6_1_m.dat
*** STATISTICS ***
Number of iterations = 157
Converge criterion = 0.0000008803
Seed random values = 4014
X-squared = 5.5237 (0.4786)
L-squared = 6.3978 (0.3801)
Cressie-Read = 5.5771 (0.4722)
Dissimilarity index = 0.0284
Degrees of freedom = 6
Log-likelihood = -156.17714
Number of parameters = 9 (+1)
Sample size = 137.0
BIC(L-squared) = -23.1220
AIC(L-squared) = -5.6022
BIC(log-likelihood) = 356.6341
AIC(log-likelihood) = 330.3543
*** (CONDITIONAL) PROBABILITIES ***
* P(X) *
1 0.8545 (0.0595)
2 0.1455 (0.0595)
* P(A|X) *
1 | 1 0.9775 (0.0281)
2 | 1 0.0225 (0.0281)
1 | 2 0.4299 (0.1702)
2 | 2 0.5701 (0.1702)
* P(B|X) *
1 | 1 1.0000 (0.0000) *
2 | 1 0.0000 (0.0000) *
1 | 2 0.5486 (0.1943)
2 | 2 0.4514 (0.1943)
* P(C|X) *
1 | 1 0.9944 (0.0184)
2 | 1 0.0056 (0.0184)
1 | 2 0.6817 (0.1341)
2 | 2 0.3183 (0.1341)
* P(D|X) *
1 | 1 0.8553 (0.0364)
2 | 1 0.1447 (0.0364)
1 | 2 0.5457 (0.1425)
2 | 2 0.4543 (0.1425)
Table 6.4 on page 80 using spatial data, table6_3.dat, table6_3_f.dat and table6_3_m.dat.
Model I: analysis on male group
lat 1
man 3
dim 4 2 2 2
lab X A B C
mod
A|X eq2
B|X eq2
C|X eq2
rec 266
dat table6_3_m.dat
des [ 1 0 0 2 0 2 0 2
1 0 1 0 0 2 0 2
1 0 1 0 1 0 0 2]
*** STATISTICS ***
Number of iterations = 71
Converge criterion = 0.0000008951
Seed random values = 5672
X-squared = 2.8375 (0.2420)
L-squared = 4.4355 (0.1089)
Cressie-Read = 3.1347 (0.2086)
Dissimilarity index = 0.0132
Degrees of freedom = 2
Log-likelihood = -360.82348
Number of parameters = 5 (+1)
Sample size = 266.0
BIC(L-squared) = -6.7315
AIC(L-squared) = 0.4355
BIC(log-likelihood) = 749.5644
AIC(log-likelihood) = 731.6470
*** LATENT CLASS OUTPUT ***
X 1 X 2 X 3 X 4
0.3033 0.1579 0.4226 0.1162
A 1 1.0000 0.0294 0.0294 0.0294
A 2 0.0000 0.9706 0.9706 0.9706
B 1 1.0000 1.0000 0.0294 0.0294
B 2 0.0000 0.0000 0.9706 0.9706
C 1 1.0000 1.0000 1.0000 0.0294
C 2 0.0000 0.0000 0.0000 0.9706
Model II: analysis on female group
lat 1
man 3
dim 4 2 2 2
lab X A B C
mod
A|X eq2
B|X eq2
C|X eq2
rec 307
dat table6_3_f.dat
des [ 1 0 0 2 0 2 0 2
1 0 1 0 0 2 0 2
1 0 1 0 1 0 0 2]
*** STATISTICS ***
Number of iterations = 127
Converge criterion = -0.0000000092
Seed random values = 642
X-squared = 1.6669 (0.4345)
L-squared = 1.5951 (0.4504)
Cressie-Read = 1.5679 (0.4566)
Dissimilarity index = 0.0031
Degrees of freedom = 2
Log-likelihood = -378.81571
Number of parameters = 5 (+1)
Sample size = 307.0
BIC(L-squared) = -9.8585
AIC(L-squared) = -2.4049
BIC(log-likelihood) = 786.2657
AIC(log-likelihood) = 767.6314
*** LATENT CLASS OUTPUT ***
X 1 X 2 X 3 X 4
0.2858 0.0910 0.4917 0.1314
A 1 1.0000 0.0087 0.0087 0.0087
A 2 0.0000 0.9913 0.9913 0.9913
B 1 1.0000 1.0000 0.0087 0.0087
B 2 0.0000 0.0000 0.9913 0.9913
C 1 1.0000 1.0000 1.0000 0.0087
C 2 0.0000 0.0000 0.0000 0.9913
Model III: Combined analysis. The results here differ slightly from the published table, especially the p-value.
lat 1
man 4
dim 4 2 2 2 2
lab X A B C G
mod G eq2
A|X eq2
B|X eq2
C|X eq2
rec 573
dat table6_3.dat
des [-1 0
1 0 0 2 0 2 0 2
1 0 1 0 0 2 0 2
1 0 1 0 1 0 0 2]
sta G [.5 .5]
*** STATISTICS ***
Number of iterations = 87
Converge criterion = 0.0000009668
Seed random values = 3147
X-squared = 18.1788 (0.0520)
L-squared = 19.3957 (0.0355)
Cressie-Read = 18.2839 (0.0504)
Dissimilarity index = 0.0700
Degrees of freedom = 10
Log-likelihood = -1142.02696
Number of parameters = 5 (+1)
Sample size = 573.0
BIC(L-squared) = -44.1132
AIC(L-squared) = -0.6043
BIC(log-likelihood) = 2315.8084
AIC(log-likelihood) = 2294.0539
*** LATENT CLASS OUTPUT ***
X 1 X 2 X 3 X 4
0.2944 0.1216 0.4597 0.1243
A 1 1.0000 0.0175 0.0175 0.0175
A 2 0.0000 0.9825 0.9825 0.9825
B 1 1.0000 1.0000 0.0175 0.0175
B 2 0.0000 0.0000 0.9825 0.9825
C 1 1.0000 1.0000 1.0000 0.0175
C 2 0.0000 0.0000 0.0000 0.9825
G 1 0.5000 0.5000 0.5000 0.5000
G 2 0.5000 0.5000 0.5000 0.5000
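The p-values printed in parentheses after each fit statistic appear to be upper-tail chi-square probabilities at the reported degrees of freedom. The sketch below, in Python, checks this for the combined model. It is a minimal version that only handles even degrees of freedom, where the tail probability has a closed form; the function name `chi2_pvalue_even_df` is hypothetical.

```python
import math

def chi2_pvalue_even_df(stat, df):
    """Upper-tail chi-square probability, valid only for even df.

    For df = 2k the survival function has the closed form
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive even df")
    half = stat / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i          # (x/2)^i / i!, built up incrementally
        total += term
    return math.exp(-half) * total

# L-squared and X-squared of the combined model, both with 10 df.
print(round(chi2_pvalue_even_df(19.3957, 10), 4))
print(round(chi2_pvalue_even_df(18.1788, 10), 4))
```

Because the p-value depends only on the statistic and the degrees of freedom, a p-value that disagrees with a published table points to a difference in one of those two quantities.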