Stata Class Notes 3.0
Managing Data
1.0 Stata commands in this unit
| Command | Description |
| --- | --- |
| pwd | Show current directory (pwd = print working directory) |
| dir or ls | Show files in current directory |
| cd | Change directory |
| keep if | Keep observations if condition is met |
| keep | Keep variables (dropping others) |
| drop | Drop variables (keeping others) |
| append using | Append a data file to current file |
| sort | Sort observations |
| merge | Merge a data file with current file |
2.0 Demonstration and Explanation
Example 2.1 - Honors Thesis
Suppose we are undergraduates working on
our honors thesis and we wish to analyze only a subset of the hs1 data
file. We are studying "good readers," so we want to
focus on the students with a reading score of 60 or higher. The
following shows how to load the hs1 data file, make a separate
folder called honors, and save in it a copy of the data containing only the
students with reading scores of 60 or higher.
use hs1, clear
pwd
dir
ls
mkdir honors
cd honors
keep if read >= 60
describe
summarize read
save hsgoodread, replace
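One detail of keep if is worth checking on a small example. Stata treats a missing value (.) as larger than any number, so a condition such as read >= 60 also holds for students whose reading score is missing. The sketch below uses a made-up four-row data set in place of hs1 (the values are invented for illustration) and adds !missing(read) to the condition. It is one possible way to guard against this, not a change to the commands above.

```stata
* Toy data standing in for hs1; the values are invented for illustration.
clear
input id read
1 47
2 63
3 60
4 .
end

* Stata treats missing (.) as larger than any number, so this count
* includes the student whose score is missing.
count if read >= 60

* Keep only observed scores of 60 or higher.
keep if read >= 60 & !missing(read)

* Stop with an error if any remaining score violates the condition.
assert read >= 60 & !missing(read)
list
```

If hs1 has no missing reading scores, the extra condition changes nothing, but it makes the intent explicit.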
Example 2.1, continued - keeping variables
Further suppose that our
data file had a great many variables, say 2000, but we care about only
a handful of them: id, female, read and write. We can subset
our data file to keep just those variables as shown below.
keep id female read write
save hskept, replace
describe
list in 1/20
Example 2.1, continued - dropping variables
Instead of keeping
just a handful of variables, we might want to get rid of
just a handful of variables in our data file. Below we show how to
drop the variables ses and prog.
use hsgoodread, clear
drop ses prog
save hsdropped, replace
describe
list in 1/20
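The keep and drop commands are two routes to the same result: keep names the variables that stay, drop names the variables that go. The sketch below illustrates this on a made-up two-row data set with only six variables (the real file has more, and the values here are invented). It uses preserve and restore so that both routes start from the same data.

```stata
* Toy data with six variables; the values are invented for illustration.
clear
input id female read write ses prog
1 0 57 52 1 1
2 1 68 59 2 3
end

* Route 1: name the variables to keep.
preserve
keep id female read write
describe, simple
restore

* Route 2: name the variables to drop.
drop ses prog
describe, simple
```

In this toy data both describe commands list the same four variables. Which command is more convenient depends on whether the list of variables to keep or the list to drop is shorter.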
Example 2.2 - Master's Thesis
Now we have moved on to our master's
thesis. We have a folder called masters and we have been given a
file with the data for the males (called hsmale) and a file for the
females (called hsfemale). We need to append one file to the other,
stacking the observations, so we can analyze them together, as shown below.
cd ..
cd masters
dir
use hsmale, clear
tabulate female
append using hsfemale
save hsmasters, replace
tabulate female
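To see what append does to the observations, it can help to try it on two tiny files. The sketch below builds made-up male and female data sets (the values are invented), stores the female file as a temporary file, and appends it to the male data. It assumes a version of Stata in which append accepts the generate() option, which adds a variable marking where each observation came from. The names females and source are hypothetical.

```stata
* Toy female file, saved to a temporary file (values invented).
clear
input id female read
3 1 68
4 1 63
end
tempfile females
save `females'

* Toy male file, now the data in memory.
clear
input id female read
1 0 57
2 0 44
end
tabulate female

* Stack the female observations under the male observations.
* source is 0 for rows from memory and 1 for rows from the appended file.
append using `females', generate(source)
tabulate female source
```

The second tabulate should show every male observation with source 0 and every female observation with source 1, confirming that both files were stacked rather than matched.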
Example 2.3 - Dissertation
Now we are working on our dissertation
and, as with our master's thesis, we have been given two files. In this case, we
have a file that has all of the demographic information (called hsdem)
and a file with the test scores (called hstest), and we wish to merge
these files so that each student's demographics and test scores are in one
observation. Each file is sorted by id and saved before merging. We show how to do this below.
cd ..
cd diss
dir
use hsdem, clear
list
sort id
save hsdem, replace
use hstest, clear
list
sort id
save, replace
use hsdem, clear
merge id using hstest
list
tab _merge
save hsdiss, replace
cd ..
dir
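The merge command above uses the older form, which needs both files sorted by id. In Stata 11 and later, merge is normally written with the type of match stated, for example merge 1:1 id using hstest, and the files do not have to be sorted first. The sketch below tries that form on two made-up files standing in for hsdem and hstest (the values are invented, and the name test is hypothetical). One id appears in only one file on each side, so all three values of _merge show up.

```stata
* Toy test-score file, saved to a temporary file (values invented).
clear
input id read write
1 57 52
2 68 59
3 44 33
end
tempfile test
save `test'

* Toy demographic file, now the data in memory.
clear
input id female ses
1 0 1
2 1 2
4 1 3
end

* One-to-one merge on id; assumes Stata 11 or later.
merge 1:1 id using `test'

* _merge is 1 for ids only in memory, 2 for ids only in the
* using file, and 3 for ids found in both.
tabulate _merge
list
```

Here ids 1 and 2 are matched, id 4 has demographics but no scores, and id 3 has scores but no demographics. Checking tab _merge in this way is a quick test of whether the two files cover the same students.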
3.0 For More Information
UCLA researchers are invited to use our Statistical Consulting Services.
Others may consult our list of Other Resources for Statistical Computing Help.
These pages are Copyrighted (c) by UCLA Academic Technology Services