. reshape
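There are two major varieties of data files: those that are long and those that are wide. A wide file has one observation per subject, with each repeated measurement stored in its own variable; a long file has one observation per subject per measurement occasion. An example of a wide file is shown in the table below.
id  grp  lrn95  lrn96  lrn97
 1    1      7     12     16
 2    1     13     14     15
 3    2     15     14     20
 4    2     21     27     24
 5    3      9      9     12
 6    3     16     17     18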
We would like to change this file to the long format that looks like this:
id  year  grp  lrn
 1    95    1    7
 1    96    1   12
 1    97    1   16
 2    95    1   13
 2    96    1   14
 2    97    1   15
 3    95    2   15
 3    96    2   14
 3    97    2   20
 4    95    2   21
 4    96    2   27
 4    97    2   24
 5    95    3    9
 5    96    3    9
 5    97    3   12
 6    95    3   16
 6    96    3   17
 6    97    3   18
. use http://www.ats.ucla.edu/stat/stata/notes/wide, clear
describe
list
reshape long lrn, i(id) j(year)
describe
list
table year, contents(n lrn mean lrn sd lrn)
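Here, we use the reshape long command with lrn as the stub name: it is the common prefix of lrn95, lrn96 and lrn97, and it becomes the name of the new variable that holds their values. The major index (i) variable is id, and the sub-index (j) variable, which receives the numeric suffixes 95, 96 and 97, is to be called year.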
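For readers who cannot load wide.dta, the same reshape can be tried on data typed in by hand. The following is a minimal sketch that assumes wide.dta holds exactly the six rows shown in the wide table above; the tabstat line is an addition of ours that gives a per-year summary similar to the table command.

```stata
clear
* enter the wide table by hand (values copied from the table above)
input id grp lrn95 lrn96 lrn97
1 1  7 12 16
2 1 13 14 15
3 2 15 14 20
4 2 21 27 24
5 3  9  9 12
6 3 16 17 18
end

* lrn is the stub; the suffixes 95, 96, 97 become the values of year
reshape long lrn, i(id) j(year)

* one block of three rows per id
list, sepby(id)

* count, mean and standard deviation of lrn within each year
tabstat lrn, by(year) statistics(n mean sd)
```

Because grp does not change within an id, Stata simply repeats its value on each of that subject's three rows.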
. reshape wide
describe
list
reshape long
describe
list
Once you have reshaped a dataset from wide to long, Stata remembers how it was done, and you can change back and forth without respecifying the index variables. In fact, once you save the file, Stata will be able to change from wide to long and back at any later time.
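The round trip can be sketched on a small typed-in dataset. This example uses only the first three subjects from the wide table above. The last command spells out the explicit form of reshape wide, which is what you would type for a long dataset that Stata has not reshaped before.

```stata
clear
* first three subjects of the wide example
input id grp lrn95 lrn96 lrn97
1 1  7 12 16
2 1 13 14 15
3 2 15 14 20
end

* the first reshape must name the stub and both index variables
reshape long lrn, i(id) j(year)

* afterwards the bare commands are enough
reshape wide
reshape long

* explicit equivalent of the bare reshape wide
reshape wide lrn, i(id) j(year)
describe
```

Going from long to wide requires that every variable other than lrn be constant within id, which holds for grp here.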
. use spf2, clear
describe
list
reshape long y, i(s) j(b)
describe
list
anova y a / s|a b a*b, repeated(b)
This second example occurs frequently in experimental design situations. Data files for repeated measures designs often have all of the responses for one subject on the same line. In fact, this is the same example that we saw in the long format in the Statistics Revisited unit. The i index is the main index that defines the separate observations. In the spf design, s, the subject id, is the main index. The j index defines the secondary level of the observations. In this case, b is the repeated factor and represents the secondary level of the observations.
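The roles of the two indexes can be checked directly after the reshape. The sketch below uses a tiny made-up dataset, not the contents of spf2.dta: the values, the number of subjects, and the three levels of b (y1 to y3) are all hypothetical and only mimic the variable names used above.

```stata
clear
* hypothetical data: 4 subjects (s), 2 groups (a), 3 repeated responses
input s a y1 y2 y3
1 1 3 4 7
2 1 6 5 8
3 2 1 2 5
4 2 2 3 6
end

* s is the main index, b the secondary (repeated) index
reshape long y, i(s) j(b)

* isid is silent when s and b together identify each observation
isid s b

* each subject appears in exactly one level of a
tabulate s a
list, sepby(s)
```

If isid reports an error, the combination of i and j does not uniquely identify the rows, and the long file is not in the form that the repeated measures anova expects.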
. use wide, clear
describe
list
reshape long lrn, i(id) j(year)
describe
list
The Stata Class Notes are available on the World Wide Web by visiting ...
http://www.ats.ucla.edu/stat/stata/notes/
The datasets wide.dta, spf.dta and spf2.dta can be loaded directly into Stata, over the
Internet, using the following commands:
use http://www.ats.ucla.edu/stat/stata/notes/wide
use http://www.ats.ucla.edu/stat/stata/notes/spf
use http://www.ats.ucla.edu/stat/stata/notes/spf2
These pages are Copyrighted (c) by UCLA Academic Technology Services