
How Do I Separate a Single Stata File into Multiple Data Files for Analysis in HLM Software?

HLM requires a separate data file for each level of a hierarchical model:
two files for a 2-level model (one for level-1 and one for level-2) and
three files for a 3-level model. This page shows examples of how to convert
a single Stata file into multiple data files for analysis in HLM.

From a single data file that contains both level-1 and level-2 variables, we
have to extract two files: one with all the level-1 variables of interest and
the other with all the level-2 variables of interest. The level-2 unit
identifier serves as the linking variable that ties the level-1 and level-2
data files together, so it has to exist in both files.

**Example 1**: Two data sets for 2-level modeling

We use HLM's example data set hsball.dta to demonstrate how to extract two data
sets from a data set that has both level-1 and level-2 variables. This data
set consists of student-level variables and school-level variables, so the
level-1 data set will be the student-level data set and the level-2 data set
will be the school-level data set. The linking variable is the school
identifier, called **id**. Both data sets should be sorted by school. The
command **unique** is used here to check that the variables we suspect to be
level-2 variables really are level-2 variables: if the number of unique
combinations of **id** and these variables is the same as the number of
schools, then none of them varies within a school. **unique** is a
user-written command; you can download it by following the link that appears
after typing **findit unique**. We use two pairs of the Stata commands
**preserve** and **restore** so that the original data set is brought back
after each file is saved. We use the command **collapse** to aggregate the
level-2 variables to the school level, one record per school, and the result
is the level-2 data set.
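If **unique** is not installed, the same check can be approximated with built-in commands. The following is only a sketch, assuming the variable names listed above and the data set location used on this page: it asserts that each candidate level-2 variable takes a single value within every school.

```stata
* Sketch: check that candidate level-2 variables are constant within school,
* using only built-in commands (an alternative to the user-written -unique-).
use http://www.ats.ucla.edu/stat/hlm/faq/hsball, clear

foreach v of varlist meanses size sector pracad disclim himinty {
    * Sort each school's records by the variable; if the first and last
    * values agree, the variable has a single value in that school.
    bysort id (`v'): assert `v'[1] == `v'[_N]
}
```

**assert** stops with an error at the first variable that varies within a school, so a silent run suggests that all of the listed variables are level-2 variables.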

    use http://www.ats.ucla.edu/stat/hlm/faq/hsball, clear
    sort id
    unique id meanses size sector pracad disclim himinty
    Number of unique values of school meanses size sector pracad disclim himinty is 160
    Number of records is 7185

    preserve
    drop size sector pracad disclim himinty meanses
    save hsb12_level1
    file hsb12_level1.dta saved
    restore

    preserve
    collapse (mean) meanses size sector pracad disclim himinty, by(id)
    save hsb12_level2
    file hsb12_level2.dta saved
    restore
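Before taking the two files to HLM, it may be worth confirming that they really do link on **id**. This is a minimal sketch, assuming that hsb12_level1.dta and hsb12_level2.dta were saved in the current working directory and that your Stata version supports the **merge m:1** syntax (Stata 11 or later).

```stata
* Sketch: confirm that the level-2 file has one record per school
* and that every level-1 record finds its school in the level-2 file.
use hsb12_level2, clear
isid id                      // error if id does not uniquely identify records

use hsb12_level1, clear
merge m:1 id using hsb12_level2
assert _merge == 3           // 3 = record matched in both files
drop _merge
```

The merged data are used only for the check and do not need to be saved.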

Click here for the entire do file.

**Example 2**: Three data sets for 3-level modeling

The data set used in this example is an HLM example data set (Chapter 8).
For the purpose of demonstration, we have combined the three separate data
sets into a single Stata data set called eg3all.dta. This data set consists
of 1721 students nested in 60 schools. The information on students has been
collected at multiple time points, so time points are nested in students and
students are nested in schools. The school-level variables are **size**,
**lowinc** and **mobility**. The student-level variables are **female**,
**black** and **hispanic**. The time-level variables are **year**,
**grade**, **math** and **retained**. The variable **year** is **grade**
shifted by a constant (year = grade - 1.5).
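The nesting structure described above can be checked directly in the combined file. The sketch below uses only built-in commands and the variable names listed above. It assumes that each value of **childid** is used in only one school.

```stata
* Sketch: verify the nesting structure of the combined 3-level file.
use http://www.ats.ucla.edu/stat/hlm/faq/eg3all, clear

* Each child should appear in exactly one school.
bysort childid (schoolid): assert schoolid[1] == schoolid[_N]

* School-level variables should not vary within a school.
foreach v of varlist size lowinc mobility {
    bysort schoolid (`v'): assert `v'[1] == `v'[_N]
}

* Student-level variables should not vary within a child.
foreach v of varlist female black hispanic {
    bysort childid (`v'): assert `v'[1] == `v'[_N]
}
```

If any **assert** fails, the corresponding variable belongs at a lower level than assumed, and the extraction steps would need to be adjusted.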

    use http://www.ats.ucla.edu/stat/hlm/faq/eg3all, clear
    unique childid
    Number of unique values of childid is 1721
    Number of records is 7230
    unique schoolid
    Number of unique values of schoolid is 60
    Number of records is 7230
    unique grade
    Number of unique values of grade is 6
    Number of records is 7230

    preserve /*extracting level-3 (school level) data*/
    drop childid female black hispanic year grade math retained
    duplicates drop
    Duplicates in terms of all variables
    (7170 observations deleted)
    sort schoolid
    save eg3all_level3
    file eg3all_level3.dta saved
    restore

    preserve /*extracting level-2 (student level) data*/
    drop year grade math retained
    duplicates drop
    Duplicates in terms of all variables
    (5509 observations deleted)
    sort schoolid childid
    count
    1721
    save eg3all_level2
    file eg3all_level2.dta saved
    restore

    preserve /*extracting level-1 (time level) data*/
    drop schoolid female black hispanic size lowinc mobility
    sort childid
    save eg3all_level1
    file eg3all_level1.dta saved
    restore
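As with the 2-level example, the three saved files can be checked for consistency before they are used in HLM. This is a sketch only, assuming the three files are in the current working directory, that **childid** is unique across schools, and that **merge m:1** is available (Stata 11 or later).

```stata
* Sketch: check identifiers and linkage across the three extracted files.
use eg3all_level3, clear
isid schoolid                // one record per school

use eg3all_level2, clear
isid childid                 // one record per student
* Every student should find a school in the level-3 file. Variables that
* exist in both files keep their values from the file in memory.
merge m:1 schoolid using eg3all_level3
assert _merge == 3
drop _merge

use eg3all_level1, clear
* Every time-level record should find its student in the level-2 file.
merge m:1 childid using eg3all_level2
assert _merge == 3
drop _merge
```

A run that finishes without error suggests that the identifiers line up across the three levels.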

Click here for the entire do file.
