Sometimes your dataset includes an identifying variable that is unnecessarily long and uninformative. For example, your ID variable may be a string of length 12 containing both letters and numbers (e.g., "77A34987BG34"). You may wish to create a new identifying variable that maps each value of the complicated ID variable onto an integer, starting at 1 and going up to the number of unique IDs in your dataset. The code below provides an example of how to do this, using a 10-digit numeric ID.
    /* Example data: id repeats across rows */
    data test;
      input id a b;
      cards;
    9385793487 0 0
    3598437987 1 0
    5987398759 1 0
    9593859853 0 1
    5987398759 0 0
    9385793487 0 0
    3598437987 0 1
    7892343344 1 1
    ;

    proc print data = test;
    run;

    Obs            id    a    b
      1    9385793487    0    0
      2    3598437987    1    0
      3    5987398759    1    0
      4    9593859853    0    1
      5    5987398759    0    0
      6    9385793487    0    0
      7    3598437987    0    1
      8    7892343344    1    1

    /* Sort by id so that rows with the same id are adjacent */
    proc sort data = test;
      by id;
    run;

    /* newid is retained across rows and increases by 1 at the first row of each id */
    data test2;
      set test;
      by id;
      retain newid 0;
      if first.id then newid = newid + 1;
    run;

    proc print data = test2;
    run;

    Obs            id    a    b    newid
      1    3598437987    1    0      1
      2    3598437987    0    1      1
      3    5987398759    1    0      2
      4    5987398759    0    0      2
      5    7892343344    1    1      3
      6    9385793487    0    0      4
      7    9385793487    0    0      4
      8    9593859853    0    1      5
Now our dataset has a short identifying variable, newid, that takes the same value on every row sharing the same original id.
These pages are Copyrighted (c) by UCLA Academic Technology Services