Sometimes two variables in a dataset convey the same information, except that one is numeric and the other is a string. For example, in the first dataset below, the numeric variable gender is coded 1/0, and the string variable female carries the same information in more explicit form. The numeric variable is easier to work with, but we may also want to keep the information in the string variable. In this case we can create value labels for the numeric variable based on the string variable. In Stata, the command labmask does this.
labmask is one of the commands in labutil, a suite of commands written by Nick J. Cox. You can download it by typing findit labutil and following the link to it (see How can I use the findit command to search for programs and get additional help? for more information about using findit).
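As an alternative to findit, the package can usually be installed directly from the command line. The lines below are a minimal sketch that assumes labutil is available from the SSC archive under that package name and that Stata has internet access.

```stata
* Assumption: labutil is available on SSC under this package name
ssc install labutil

* Confirm that labmask is now on the ado-path and open its help file
which labmask
help labmask
```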
Example 1: A simple example
clear
input gender str8 female
1 female
0 male
end
list
     +-----------------+
     | gender   female |
     |-----------------|
  1. |      1   female |
  2. |      0     male |
     +-----------------+
labmask gender, values(female)
describe
Contains data
  obs:             2
 vars:             2
 size:            32 (99.9% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
gender          float  %9.0g       gender
female          str8   %9s
-------------------------------------------------------------------------------
Sorted by:
     Note:  dataset has changed since last saved
list
     +-----------------+
     | gender   female |
     |-----------------|
  1. | female   female |
  2. |   male     male |
     +-----------------+
label list
gender:
           0 male
           1 female
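For comparison, the same labels could be attached by hand with Stata's built-in label commands. This is only a sketch of what labmask automates: the label name genderlbl is hypothetical, and the 0/1 mapping is typed manually rather than read from the string variable.

```stata
clear
input gender str8 female
1 female
0 male
end

* Hypothetical label name; the mapping is typed by hand to match the data
label define genderlbl 0 "male" 1 "female"

* Attach the value label to the numeric variable
label values gender genderlbl

* Inspect the result
label list genderlbl
list
```

With only two categories this is quick to type, but with many categories labmask avoids writing out each value and label pair.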
Example 2: Another example
How is labmask different from encode? Both commands attach value labels to a numeric version of a string variable. However, encode creates a new numeric variable and numbers the string values 1, 2, 3, ... in alphabetical order; it does not use the values of an existing numeric variable in the dataset. For example, the new variable cnum created by encode below has the value 1 for boston, since boston is first alphabetically, while labmask keeps the original codes in cityn (0, 2, 5 and 3) and simply labels them.
clear
input cityn str8 cityc
0 la
0 la
2 boston
2 boston
5 chicago
5 chicago
5 chicago
3 ny
3 ny
end
encode cityc, gen(cnum)
labmask cityn, values(cityc)
list
     +-----------------------------+
     |   cityn     cityc      cnum |
     |-----------------------------|
  1. |      la        la        la |
  2. |      la        la        la |
  3. |  boston    boston    boston |
  4. |  boston    boston    boston |
  5. | chicago   chicago   chicago |
     |-----------------------------|
  6. | chicago   chicago   chicago |
  7. | chicago   chicago   chicago |
  8. |      ny        ny        ny |
  9. |      ny        ny        ny |
     +-----------------------------+
list, nolab
     +------------------------+
     | cityn     cityc   cnum |
     |------------------------|
  1. |     0        la      3 |
  2. |     0        la      3 |
  3. |     2    boston      1 |
  4. |     2    boston      1 |
  5. |     5   chicago      2 |
     |------------------------|
  6. |     5   chicago      2 |
  7. |     5   chicago      2 |
  8. |     3        ny      4 |
  9. |     3        ny      4 |
     +------------------------+
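If labmask is not installed, encode can be made to follow a chosen coding by defining the value label first and passing it through the label() option. The sketch below assumes the mapping is known in advance and typed by hand; the names citylbl and cnum2 are hypothetical.

```stata
clear
input cityn str8 cityc
0 la
2 boston
5 chicago
3 ny
end

* Hypothetical label name; the codes are typed by hand to match cityn
label define citylbl 0 "la" 2 "boston" 3 "ny" 5 "chicago"

* encode uses an existing value label named in label(), so the codes
* follow citylbl instead of alphabetical order
encode cityc, generate(cnum2) label(citylbl)

* Stops with an error if any observation disagrees with cityn
assert cnum2 == cityn
list, nolabel
```

Unlike labmask, this approach creates a new variable and requires the codes to be written out, so it is mainly useful when the numeric variable does not already exist in the data.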
These pages are Copyrighted (c) by UCLA Academic Technology Services