UCLA Academic Technology Services

Stata FAQ
How can I analyze a nested model using xtmixed?

Consider the following nested experiment: A study was conducted measuring the thickness of the oxide layer on silicon wafers. The wafers were produced on two different machines (source). Four lots of wafers were selected at random from each machine. From each lot three wafers were selected at random to be measured. Finally, on each wafer three positions were selected. So, we have position nested in wafer, wafer nested in lot which is nested in source. The primary concern of this experiment is to determine whether the two machines (source) differ in the thickness of their oxide layers.

Let's load the data and look at our sample.

use http://www.ats.ucla.edu/stat/stata/data/thickness, clear

list in 1/10

     +--------------------------------------------+
     | source   lot   wafer   position   thickn~s |
     |--------------------------------------------|
  1. |      1     1       1          1       2006 |
  2. |      1     1       1          2       1999 |
  3. |      1     1       1          3       2007 |
  4. |      1     1       2          1       1980 |
  5. |      1     1       2          2       1988 |
     |--------------------------------------------|
  6. |      1     1       2          3       1982 |
  7. |      1     1       3          1       2000 |
  8. |      1     1       3          2       1998 |
  9. |      1     1       3          3       2007 |
 10. |      1     2       1          1       1991 |
     +--------------------------------------------+ 

tabstat thickness, by(source) stat(n mean sd)

Summary for variables: thickness
     by categories of: source 

  source |         N      mean        sd
---------+------------------------------
       1 |        36  1995.111  7.531943
       2 |        36  2005.194  14.86668
---------+------------------------------
   Total |        72  2000.153  12.75518
----------------------------------------

Next, we need to create a variable that identifies each lot within its source. We will do this using the egen command with the group() function.

egen lotinsource = group(lot source), label

tab lotinsource

   group(lot |
    source) |      Freq.     Percent        Cum.
------------+-----------------------------------
        1 1 |          9       12.50       12.50
        1 2 |          9       12.50       25.00
        2 1 |          9       12.50       37.50
        2 2 |          9       12.50       50.00
        3 1 |          9       12.50       62.50
        3 2 |          9       12.50       75.00
        4 1 |          9       12.50       87.50
        4 2 |          9       12.50      100.00
------------+-----------------------------------
      Total |         72      100.00

From the labels in the table above it looks as if lot is crossed with source. This is not the case, since a lot drawn from source 1 is a different lot from one drawn from source 2, even when the two share the same lot number. Fortunately, lotinsource takes eight distinct values, so xtmixed treats them as eight separate lots. Here is one way to parameterize this model.

xtmixed thickness i.source || lotinsource: || wafer:, var

Performing EM optimization: 

Performing gradient-based optimization: 

Iteration 0:   log restricted-likelihood = -223.23893  
Iteration 1:   log restricted-likelihood = -223.23893  

Computing standard errors:

Mixed-effects REML regression                   Number of obs      =        72

-----------------------------------------------------------
                |   No. of       Observations per Group
 Group Variable |   Groups    Minimum    Average    Maximum
----------------+------------------------------------------
    lotinsource |        8          9        9.0          9
          wafer |       24          3        3.0          3
-----------------------------------------------------------

                                                Wald chi2(1)       =      1.53
Log restricted-likelihood = -223.23893          Prob > chi2        =    0.2167

------------------------------------------------------------------------------
   thickness |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    2.source |   10.08333   8.162245     1.24   0.217    -5.914373    26.08104
       _cons |   1995.111   5.771579   345.68   0.000     1983.799    2006.423
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
lotinsource: Identity        |
                  var(_cons) |   119.8926   77.07348      34.00901    422.6598
-----------------------------+------------------------------------------------
wafer: Identity              |
                  var(_cons) |   35.86577   14.18759      16.51834    77.87427
-----------------------------+------------------------------------------------
               var(Residual) |   12.56944   2.565726      8.424908    18.75282
------------------------------------------------------------------------------
LR test vs. linear regression:       chi2(2) =   104.69   Prob > chi2 = 0.0000

Note: LR test is conservative and provided only for reference.

Note that the test for differences in source is not significant (z = 1.24, p = 0.217). Also, note that the variable position does not appear in the model. That's because variability due to position is accounted for by the residual variance. In the output above, lots nested in source (lotinsource) have a variance of 119.89, wafers have a variance of 35.87, and positions (residual) have a variance of 12.57.
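
The standard error of the source effect can be rebuilt from these three variance components, which shows how each level of nesting contributes to it. The following is a minimal sketch, assuming the balanced layout described at the top of the page (4 lots, 12 wafers, and 36 positions per source); the scalar names are hypothetical and the estimates are typed in from the output above.

```stata
* Variance components copied from the xtmixed output above
scalar v_lot   = 119.8926
scalar v_wafer = 35.86577
scalar v_res   = 12.56944

* Each source mean averages 4 lots, 12 wafers, and 36 positions, so the
* variance of the difference between the two source means is
* 2 * (v_lot/4 + v_wafer/12 + v_res/36)
scalar se_diff = sqrt(2*(v_lot/4 + v_wafer/12 + v_res/36))

* Compare with the Std. Err. and z reported for 2.source
display "SE of 2.source = " se_diff
display "z for 2.source = " 10.08333/se_diff
```

The results should be close to the 8.162 and 1.24 reported for 2.source. Most of the uncertainty comes from the lot component, because only four lots were sampled per machine.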

There is an alternative way to parameterize this model that is somewhat more efficient.

xtmixed thickness i.source || lotinsource: || _all: R.wafer, var

Performing EM optimization: 

Performing gradient-based optimization: 

Iteration 0:   log restricted-likelihood = -223.23893  
Iteration 1:   log restricted-likelihood = -223.23893  

Computing standard errors:

Mixed-effects REML regression                   Number of obs      =        72

-----------------------------------------------------------
                |   No. of       Observations per Group
 Group Variable |   Groups    Minimum    Average    Maximum
----------------+------------------------------------------
    lotinsource |        8          9        9.0          9
           _all |        8          9        9.0          9
-----------------------------------------------------------

                                                Wald chi2(1)       =      1.53
Log restricted-likelihood = -223.23893          Prob > chi2        =    0.2167

------------------------------------------------------------------------------
   thickness |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    2.source |   10.08333   8.162245     1.24   0.217    -5.914373    26.08104
       _cons |   1995.111   5.771579   345.68   0.000     1983.799    2006.423
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
lotinsource: Identity        |
                  var(_cons) |   119.8926   77.07348      34.00901    422.6598
-----------------------------+------------------------------------------------
_all: Identity               |
                var(R.wafer) |   35.86577   14.18759      16.51834    77.87426
-----------------------------+------------------------------------------------
               var(Residual) |   12.56944   2.565726      8.424908    18.75282
------------------------------------------------------------------------------
LR test vs. linear regression:       chi2(2) =   104.69   Prob > chi2 = 0.0000

Note: LR test is conservative and provided only for reference.

All of the results are the same as in our first model; however, some of the labels for the variance components differ.

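This design is completely balanced, so the xtmixed results agree with those from the anova command: the F-ratio for source equals the Wald chi-square (1.53), and the residual mean square equals var(Residual) (12.57). The p-values differ (0.2629 versus 0.2167) because anova refers the statistic to an F distribution with 1 and 6 degrees of freedom, while xtmixed uses a large-sample chi-square reference.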

anova thickness source / lot|source wafer|lot|source

                           Number of obs =      72     R-squared     =  0.9478
                           Root MSE      = 3.54534     Adj R-squared =  0.9227

                  Source |  Partial SS    df       MS           F     Prob > F
        -----------------+----------------------------------------------------
                   Model |  10947.9861    23  475.999396      37.87     0.0000
                         |
                  source |    1830.125     1    1830.125       1.53     0.2629
              lot|source |  7195.19444     6  1199.19907   
        -----------------+----------------------------------------------------
        wafer|lot|source |  1922.66667    16  120.166667       9.56     0.0000
                         |
                Residual |  603.333333    48  12.5694444   
        -----------------+----------------------------------------------------
                   Total |  11551.3194    71   162.69464
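
Because the design is balanced, the variance components reported by xtmixed can also be recovered from the mean squares in this table. The sketch below assumes the usual expected mean squares for a balanced nested design (3 positions per wafer, 9 observations per lot); the scalar names are hypothetical and the mean squares are typed in from the table above.

```stata
* Mean squares copied from the anova table above
scalar ms_lot   = 1199.19907
scalar ms_wafer = 120.166667
scalar ms_res   = 12.5694444

* Method-of-moments estimates for a balanced nested design:
* 3 positions per wafer and 9 observations per lot
display "var(Residual) = " ms_res
display "var(wafer)    = " (ms_wafer - ms_res)/3
display "var(lot)      = " (ms_lot - ms_wafer)/9
```

These should come out at approximately 12.57, 35.87, and 119.89, in line with the xtmixed estimates.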

If we take the square root of the F-ratio for source, we get the same value as the z-test from xtmixed (1.24).

display sqrt(e(F_1))

1.2353634
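
The test statistics match, but the p-values reported by anova (0.2629) and xtmixed (0.2167) do not, because the two commands use different reference distributions. As a rough check, both p-values can be reproduced from the F-ratio alone. The scalar name f_src is hypothetical, and the sums of squares are typed in from the anova table.

```stata
* F-ratio for source: MS(source) / MS(lot|source), from the anova table
scalar f_src = 1830.125/1199.19907

* anova refers the ratio to an F distribution with 1 and 6 df
display "F(1,6) p-value = " Ftail(1, 6, f_src)

* xtmixed refers z = sqrt(F) to the standard normal (two-sided)
display "normal p-value = " 2*normal(-sqrt(f_src))
```

With only 6 denominator degrees of freedom, the F-based p-value is the more conservative of the two.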

These pages are Copyrighted (c) by UCLA Academic Technology Services