UCLA Academic Technology Services

Mplus Class Notes
Analyzing Data: Latent Class Analysis (with graphs)


1.0 A Just Identified Model

We will illustrate a simple Latent Class Analysis (LCA) and show how Mplus graphics can enhance its interpretation. We will see whether we can identify three classes based on the variables read, write, math, and science, and then see how well we can predict class membership from female, ses, and socst. This is a very artificial example, so please extrapolate to your own research interests.

Step 1. Run the LCA

Title: 
  Latent Class Analysis with Graphs
Data:
  File is hsb2.dat ;
Variable:
  Names are 
     id female race ses schtyp prog read write math science socst;
  Usevariables are
     read write math science female ses socst;
  classes = grp(3);
Analysis:
  type=mixture;
Model:
  %overall%
    grp#1 on female ses socst;
    grp#2 on female ses socst;
Plot:
   type is plot3;
   series is read (1) write (2) math (3) science (4);

And here is the output from the LCA. 

Mplus VERSION 3.0
MUTHEN & MUTHEN
04/29/2004   9:37 AM

INPUT INSTRUCTIONS

  Title:
    Latent Class Analysis with Graphs
  Data:
    File is hsb2.dat ;
  Variable:
    Names are
       id female race ses schtyp prog read write math science socst;
    Usevariables are
       read write math science female ses socst;
    classes = grp(3);
  Analysis:
    type=mixture;
  Model:
    %overall%
      grp#1 on female ses socst;
      grp#2 on female ses socst;
  Plot:
     type is plot3;
     series is read (1) write (2) math (3) science (4);

*** WARNING in Model command
  Variable is uncorrelated with all other variables:  READ
*** WARNING in Model command
  Variable is uncorrelated with all other variables:  WRITE
*** WARNING in Model command
  Variable is uncorrelated with all other variables:  MATH
*** WARNING in Model command
  Variable is uncorrelated with all other variables:  SCIENCE
*** WARNING in Model command
  All least one variable is uncorrelated with all other variables in the model.
  Check that this is what is intended.
   5 WARNING(S) FOUND IN THE INPUT INSTRUCTIONS

Latent Class Analysis with Graphs

SUMMARY OF ANALYSIS

Number of groups                                                 1
Number of observations                                         200

Number of dependent variables                                    4
Number of independent variables                                  3
Number of continuous latent variables                            0
Number of categorical latent variables                           1

Observed dependent variables

  Continuous
   READ        WRITE       MATH        SCIENCE

Observed independent variables
   FEMALE      SES         SOCST

Categorical latent variables
   GRP

Estimator                                                      MLR
Information matrix                                        OBSERVED
Optimization Specifications for the Quasi-Newton Algorithm for
Continuous Outcomes
  Maximum number of iterations                                1000
  Convergence criterion                                  0.100D-05
Optimization Specifications for the EM Algorithm
  Maximum number of iterations                                 500
  Convergence criteria
    Loglikelihood change                                 0.100D-06
    Relative loglikelihood change                        0.100D-06
    Derivative                                           0.100D-05
Optimization Specifications for the M step of the EM Algorithm for
Categorical Latent variables
  Number of M step iterations                                    1
  M step convergence criterion                           0.100D-05
  Basis for M step termination                           ITERATION
Optimization Specifications for the M step of the EM Algorithm for
Censored, Binary or Ordered Categorical (Ordinal), Unordered
Categorical (Nominal) and Count Outcomes
  Number of M step iterations                                    1
  M step convergence criterion                           0.100D-05
  Basis for M step termination                           ITERATION
  Maximum value for logit thresholds                            15
  Minimum value for logit thresholds                           -15
  Minimum expected cell size for chi-square              0.100D-01
Optimization algorithm                                         EMA
Random Starts Specifications
  Number of initial stage starts                                10
  Number of final stage starts                                   1
  Number of initial stage iterations                            10
  Initial stage convergence criterion                    0.100D+01
  Random starts scale                                    0.500D+01
  Random seed for generating random starts                       0

Input data file(s)
  hsb2.dat
Input data format  FREE

RANDOM STARTS RESULTS RANKED FROM THE BEST TO THE WORST LOGLIKELIHOOD VALUES

Initial stage loglikelihood values, seeds, and initial stage start numbers:

           -2705.691  unperturbed      0
           -2760.828  462953           7
           -2861.398  939021           8
           -2954.460  93468            3
           -2954.460  415931           10
           -2954.460  608496           4
           -2954.460  195873           6
           -2954.460  127215           9
           -2954.460  903420           5
           -2954.460  253358           2
           -2954.460  285380           1

Loglikelihood values at local maxima, seeds, and initial stage start numbers:

           -2703.289  unperturbed      0

THE MODEL ESTIMATION TERMINATED NORMALLY

TESTS OF MODEL FIT

Loglikelihood

          H0 Value                       -2703.289

Information Criteria

          Number of Free Parameters             24
          Akaike (AIC)                    5454.579
          Bayesian (BIC)                  5533.738
          Sample-Size Adjusted BIC        5457.704
            (n* = (n + 2) / 24)
          Entropy                            0.842

FINAL CLASS COUNTS AND PROPORTIONS FOR THE LATENT CLASSES
BASED ON THE ESTIMATED MODEL

    Latent
   Classes

       1         68.35048          0.34175
       2         88.44594          0.44223
       3         43.20358          0.21602


FINAL CLASS COUNTS AND PROPORTIONS FOR THE LATENT CLASS PATTERNS
BASED ON ESTIMATED POSTERIOR PROBABILITIES

    Latent
   Classes

       1         68.35048          0.34175
       2         88.44594          0.44223
       3         43.20358          0.21602

CLASSIFICATION OF INDIVIDUALS BASED ON THEIR MOST LIKELY LATENT CLASS MEMBERSHIP

Class Counts and Proportions

    Latent
   Classes

       1               68          0.34000
       2               89          0.44500
       3               43          0.21500

Average Latent Class Probabilities for Most Likely Latent Class Membership (Row)
by Latent Class (Column)

          1        2        3

   1   0.947    0.053    0.000
   2   0.045    0.914    0.042
   3   0.000    0.082    0.918

MODEL RESULTS

                   Estimates     S.E.  Est./S.E.
Latent Class 1

 Means
    READ              42.658    0.784     54.386
    WRITE             43.977    1.673     26.291
    MATH              44.216    1.032     42.859
    SCIENCE           42.081    1.536     27.402

 Variances
    READ              35.507    7.936      4.474
    WRITE             42.315    6.466      6.544
    MATH              34.903    5.016      6.959
    SCIENCE           41.706    5.629      7.409

Latent Class 2

 Means
    READ              53.203    1.721     30.913
    WRITE             55.080    1.443     38.162
    MATH              53.626    1.481     36.216
    SCIENCE           54.763    1.340     40.877

 Variances
    READ              35.507    7.936      4.474
    WRITE             42.315    6.466      6.544
    MATH              34.903    5.016      6.959
    SCIENCE           41.706    5.629      7.409

Latent Class 3

 Means
    READ              65.381    2.231     29.305
    WRITE             61.973    0.627     98.782
    MATH              63.972    1.625     39.369
    SCIENCE           61.341    1.197     51.234

 Variances
    READ              35.507    7.936      4.474
    WRITE             42.315    6.466      6.544
    MATH              34.903    5.016      6.959
    SCIENCE           41.706    5.629      7.409

Categorical Latent Variables

 GRP#1    ON
    FEMALE            -0.213    0.841     -0.254
    SES               -1.079    0.675     -1.597
    SOCST             -0.372    0.084     -4.427

 GRP#2    ON
    FEMALE            -0.213    0.647     -0.330
    SES               -0.707    0.611     -1.157
    SOCST             -0.254    0.073     -3.500

 Intercepts
    GRP#1             23.775    5.283      4.500
    GRP#2             17.583    4.760      3.694


QUALITY OF NUMERICAL RESULTS

     Condition Number for the Information Matrix              0.225E-06
       (ratio of smallest to largest eigenvalue)


PLOT INFORMATION

The following plots are available:

  Histograms (sample values, estimated values)
  Scatterplots (sample values, estimated values)
  Sample means
  Estimated means
  Sample and estimated means
  Adjusted estimated means
  Observed individual values
  Estimated individual values
  Estimated means and observed individual values
  Estimated means and estimated individual values
  Adjusted estimated means and observed individual values
  Adjusted estimated means and estimated individual values
  Mixture distributions
  Estimated probabilities for a categorical latent variable as a
    function of its covariates
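
The information criteria reported under TESTS OF MODEL FIT can be reproduced by hand from the loglikelihood, the number of free parameters, and the sample size. The Python sketch below is not part of the Mplus run. It uses the standard formulas for AIC and BIC together with the n* = (n + 2) / 24 adjustment printed in the output, and the variable names are our own. It also tallies the 24 free parameters, assuming (as the identical variances printed for each class suggest) that the variances are held equal across classes.

```python
import math

loglik = -2703.289   # H0 loglikelihood value from the output
n = 200              # number of observations from the output

# Free parameters: 4 means in each of 3 classes, 4 variances shared by
# all classes (an assumption based on the identical printed values),
# 3 slopes in each of the 2 logit equations, and 2 logit intercepts.
k = 4 * 3 + 4 + 3 * 2 + 2

aic = -2 * loglik + 2 * k
bic = -2 * loglik + k * math.log(n)

# Sample-size adjusted BIC replaces n with n* = (n + 2) / 24.
n_star = (n + 2) / 24
sabic = -2 * loglik + k * math.log(n_star)

print(f"free parameters = {k}")
print(f"AIC   = {aic:.3f}")
print(f"BIC   = {bic:.3f}")
print(f"SABIC = {sabic:.3f}")
```

Because the loglikelihood is printed to only three decimals, the values computed this way can differ from the Mplus output in the last digit.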

Step 2. View the Graphs

Graph Example 1.

From the Graph pulldown menu, choose View Graphs, then choose Estimated means. This displays a graph of the estimated means broken down by latent class membership. We can see that class 1 is a low-performing class, class 2 is a middle-performing class, and class 3 is a high-performing class.
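
The same ordering of the classes can be read directly from the MODEL RESULTS section of the output. As a rough check outside of Mplus, the Python sketch below copies the estimated class means from the output into a small table and confirms that class 1 is lowest and class 3 is highest on every variable. The dictionary layout and names are our own.

```python
# Estimated class means copied from MODEL RESULTS in the output.
means = {
    1: {"read": 42.658, "write": 43.977, "math": 44.216, "science": 42.081},
    2: {"read": 53.203, "write": 55.080, "math": 53.626, "science": 54.763},
    3: {"read": 65.381, "write": 61.973, "math": 63.972, "science": 61.341},
}
variables = ["read", "write", "math", "science"]

# Print one row per latent class, one column per test score.
print("class " + " ".join(f"{v:>8}" for v in variables))
for c in sorted(means):
    print(f"{c:>5} " + " ".join(f"{means[c][v]:8.3f}" for v in variables))

# Check that the classes are ordered 1 < 2 < 3 on every test score.
ordered = all(means[1][v] < means[2][v] < means[3][v] for v in variables)
print("class 1 < class 2 < class 3 on every variable:", ordered)
```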

Graph Example 2.

From the Graph pulldown menu, choose View Graphs, then choose Estimated probabilities for a categorical.... Then make ses the X axis variable. The variable ses was not significantly related to class membership, but the graph shows a small tendency for students to be more likely to be in the high-performing class (class 3), and less likely to be in the low-performing class (class 1), as ses increases.
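
One way to see why ses is described as not significant is to convert its coefficients into odds ratios with approximate 95% Wald intervals. The Python sketch below does this for the two ses slopes in the output, where class 3 (the last class) is the reference class. The estimate plus or minus 1.96 standard errors interval is a common approximation that we bring in for illustration; Mplus does not print it in this output, and the function name is our own.

```python
import math

# (estimate, standard error) for ses, copied from the output.
# Each logit compares the named class with class 3, the reference class.
ses_coefs = {
    "GRP#1": (-1.079, 0.675),
    "GRP#2": (-0.707, 0.611),
}

def wald_odds_ratio(est, se, z=1.96):
    """Return (odds ratio, lower, upper) for an approximate 95% Wald interval."""
    return math.exp(est), math.exp(est - z * se), math.exp(est + z * se)

results = {eq: wald_odds_ratio(est, se) for eq, (est, se) in ses_coefs.items()}

for eq, (odds_ratio, lower, upper) in results.items():
    covers_one = lower < 1.0 < upper
    print(f"{eq} on ses: OR = {odds_ratio:.3f}, "
          f"95% CI = ({lower:.3f}, {upper:.3f}), covers 1: {covers_one}")
```

Both odds ratios are below 1, which matches the tendency seen in the graph, but both intervals include 1, which is consistent with the nonsignificant Est./S.E. values of -1.597 and -1.157.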

Graph Example 3.

From the Graph pulldown menu, choose View Graphs, then choose Estimated probabilities for a categorical.... Then make socst the X axis variable. As the graph illustrates, socst was significantly related to class membership. As socst increases, the probability of being in class 1 (the low-performing class) decreases and the probability of being in class 3 (the high-performing class) increases. The probability of being in class 2 goes up and then tapers off as socst increases.
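
The shape of these curves follows from the multinomial logistic regression reported under Categorical Latent Variables. The Python sketch below computes the class probabilities from the printed intercepts and slopes, with the logit for class 3 (the reference class) fixed at zero. The values female = 0 and ses = 2 are hypothetical choices made only for illustration, and the function name is our own. Mplus may hold the other covariates at different values when it draws the graph, so the numbers need not match the graph exactly.

```python
import math

# Intercepts and slopes copied from the output. Class 3 is the
# reference class, so its logit is fixed at zero.
INTERCEPTS = {1: 23.775, 2: 17.583}
SLOPES = {
    1: {"female": -0.213, "ses": -1.079, "socst": -0.372},
    2: {"female": -0.213, "ses": -0.707, "socst": -0.254},
}

def class_probabilities(female, ses, socst):
    """Return {class: probability} from the multinomial logit model."""
    x = {"female": female, "ses": ses, "socst": socst}
    logits = {
        c: INTERCEPTS[c] + sum(SLOPES[c][v] * x[v] for v in x)
        for c in (1, 2)
    }
    logits[3] = 0.0
    denom = sum(math.exp(value) for value in logits.values())
    return {c: math.exp(value) / denom for c, value in logits.items()}

# Hypothetical covariate values: female = 0 and ses = 2 (illustration only).
grid = list(range(30, 71, 10))
curve = [class_probabilities(female=0, ses=2, socst=s) for s in grid]

for s, p in zip(grid, curve):
    print(f"socst = {s}: class 1 = {p[1]:.3f}, "
          f"class 2 = {p[2]:.3f}, class 3 = {p[3]:.3f}")
```

Under these assumed values, the class 1 probability falls and the class 3 probability rises across the grid, while the class 2 probability peaks at an intermediate value of socst.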


These pages are Copyrighted (c) by UCLA Academic Technology Services


The content of this web site should not be construed as an endorsement of any particular web site, book, or software product by the University of California