### Stata FAQ: How do I separate a single Stata file into multiple data files for analysis in HLM software?

HLM FAQ: How do I convert a Stata file to HLM?

HLM requires a separate data file for each level of a hierarchical model: two files for a 2-level model (one for level-1 and one for level-2) and three files for a 3-level model. This page shows examples of how to convert a single Stata file into multiple data files for analysis in HLM.

When a single data file contains both level-1 and level-2 variables, we have to extract two files from it: one with all the level-1 variables of interest and one with all the level-2 variables of interest. The level-2 unit identifier serves as the linking variable that ties the level-1 and level-2 data files together, so it has to exist in both files.
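The role of the linking variable can be illustrated with a small made-up data set. The sketch below is not part of the HLM examples: the variables `id`, `score` and `size` and the temporary file `level2` are hypothetical names chosen for illustration. It splits the toy data into a level-2 part and a level-1 part, then uses `merge m:1` to confirm that every level-1 record finds a matching level-2 record through the shared identifier.

```stata
* Toy data (made up): 5 students (level-1) in 2 schools (level-2)
clear
input id score size
1 10 100
1 12 100
1 11 100
2 20 250
2 22 250
end

* Level-2 part: one row per school, saved to a temporary file
preserve
keep id size
duplicates drop
sort id
tempfile level2
save `level2'
restore

* Level-1 part: student rows plus the linking variable id
keep id score
sort id

* Every level-1 row should match exactly one level-2 row on id
merge m:1 id using `level2'
assert _merge == 3
```

If the linking variable were missing from either part, or if some school appeared in only one of them, the `merge` or the `assert` would stop with an error.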

Example 1:  Two data sets for 2-level modeling

We use HLM's example data set hsball.dta to demonstrate how to extract two data sets from a single data set that contains both level-1 and level-2 variables. This data set consists of student-level variables and school-level variables, so the level-1 data set will be the student-level data set and the level-2 data set will be the school-level data set. The linking variable is the school identifier, called id. Both the level-1 and the level-2 data sets should be sorted by this identifier.

The command "unique" is used here to check that the variables we suspect to be level-2 variables really are level-2 variables: if they are, the number of unique combinations of id and these variables equals the number of schools. You can download "unique" by following the link shown after typing "findit unique". We use two pairs of the Stata commands "preserve" and "restore" so that the original data set is recovered after each extraction. We use the command "collapse" to aggregate the level-2 variables to the school level. Because these variables are constant within a school, their mean is simply their value, and the result, with one row per school, is the level-2 data set.

use http://www.ats.ucla.edu/stat/hlm/faq/hsball, clear
sort id
unique  id meanses size sector pracad disclim himinty
Number of unique values of school meanses size sector pracad disclim himinty is  160
Number of records is  7185
preserve
drop size sector pracad disclim himinty meanses
save hsb12_level1
file hsb12_level1.dta saved
restore
preserve
collapse (mean)  meanses size sector pracad disclim himinty, by(id)

save hsb12_level2
file hsb12_level2.dta saved

restore
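As an alternative to the user-written "unique" command, the same check can be sketched with built-in commands only. The example below uses a small made-up data set (the variables `ses`, `size` and `sector` and their values are hypothetical, not taken from hsball.dta). It asserts that each candidate level-2 variable is constant within `id`, then counts the level-2 units.

```stata
* Toy data (made up): ses varies within school, size and sector do not
clear
input id ses size sector
1  0.5 100 0
1 -0.2 100 0
2  1.1 250 1
2  0.3 250 1
2  0.9 250 1
end

* A level-2 variable takes a single value within each id.
* Sorting by the variable within id makes the first and last
* values the minimum and maximum, so equality means constancy.
foreach v of varlist size sector {
    bysort id (`v'): assert `v'[1] == `v'[_N]
}

* Count the level-2 units: tag one row per id and count the tags
egen byte first = tag(id)
count if first
```

Running the same loop over a level-1 variable such as `ses` would make `assert` fail, which is the signal that the variable does not belong in the level-2 file.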

Example 2: Three data sets for 3-level modeling

The data set used in this example is an HLM example data set (Chapter 8). For the purpose of this demonstration, we have combined three separate data sets into a single Stata data set called eg3all.dta. This data set consists of 1721 students nested in 60 schools. The information on students was collected at multiple time points, so time points are nested in students and students are nested in schools. The school-level variables are size, lowinc and mobility. The student-level variables are female, black and hispanic. The time-level variables are year, grade, math and retained. The variable year is a shifted version of grade (year = grade - 1.5).

use http://www.ats.ucla.edu/stat/hlm/faq/eg3all, clear
unique childid
Number of unique values of childid is  1721
Number of records is  7230

unique schoolid
Number of unique values of schoolid is  60
Number of records is  7230

unique grade
Number of unique values of grade is  6
Number of records is  7230

preserve /*extracting level-3 (school level) data*/

drop  childid female black hispanic year grade math retained

duplicates drop

Duplicates in terms of all variables

(7170 observations deleted)

sort schoolid

save eg3all_level3
file eg3all_level3.dta saved

restore

preserve /*extracting level-2 (student level) data*/

drop year grade math retained

duplicates drop

Duplicates in terms of all variables

(5509 observations deleted)

sort schoolid childid

count
1721

save eg3all_level2
file eg3all_level2.dta saved
restore
preserve /*extracting level-1 (time level) data*/

drop schoolid female black hispanic size lowinc mobility

sort childid

save eg3all_level1
file eg3all_level1.dta saved

restore
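Before importing the files into HLM, it can be worth confirming that each extracted file has exactly one row per unit at its level. The following is a minimal sketch on a small made-up data set that mimics the structure of eg3all.dta (the values and the reduced variable list are hypothetical). It uses the built-in command `isid`, which stops with an error if the listed variables do not uniquely identify the observations.

```stata
* Toy data (made up): time points nested in children nested in schools
clear
input schoolid childid year math size female
1 11 0.5 2.1 300 1
1 11 1.5 2.9 300 1
1 12 0.5 1.8 300 0
2 21 0.5 2.5 450 0
2 21 1.5 3.0 450 0
end

* Level-3: one row per school
preserve
keep schoolid size
duplicates drop
isid schoolid
restore

* Level-2: one row per child, carrying the school identifier
preserve
keep schoolid childid female
duplicates drop
isid childid
restore

* Level-1: one row per child per time point
isid childid year
```

If a supposedly higher-level variable varied within its unit, `duplicates drop` would leave more than one row for that unit and `isid` would report the problem.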