UCLA Academic Technology Services

SAS FAQ:
How can I recode my ID variable to be short and numeric?

Sometimes your dataset includes an identifying variable that is unnecessarily long and uninformative.  For example, your ID variable may be a string of length 12 containing both letters and numbers (e.g., "77A34987BG34").  You may wish to create a new identifying variable that maps each distinct value of the complicated ID onto an integer, starting at 1 and going up to the number of unique IDs in your dataset.  The code below provides an example of how to do this, using a 10-digit numeric ID.

data test;
  input id a b;
  cards;
9385793487 0 0
3598437987 1 0
5987398759 1 0
9593859853 0 1
5987398759 0 0
9385793487 0 0
3598437987 0 1
7892343344 1 1
;

proc print data = test;
run;

Obs        id        a    b

 1     9385793487    0    0
 2     3598437987    1    0
 3     5987398759    1    0
 4     9593859853    0    1
 5     5987398759    0    0
 6     9385793487    0    0
 7     3598437987    0    1
 8     7892343344    1    1


* BY-group processing in the next data step requires data sorted by id;
proc sort data = test;
  by id;
run;

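data test2;
  set test;
  by id;
  * newid keeps its value from one observation to the next, starting at 0;
  retain newid 0;
  * first.id is 1 on the first observation of each id group;
  if first.id then newid = newid + 1;
run;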

proc print data = test2; 
run;

Obs        id        a    b    newid

 1     3598437987    1    0      1
 2     3598437987    0    1      1
 3     5987398759    1    0      2
 4     5987398759    0    0      2
 5     7892343344    1    1      3
 6     9385793487    0    0      4
 7     9385793487    0    0      4
 8     9593859853    0    1      5

Now our dataset has a short, numeric identifying variable, newid, that takes one value per unique id.
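
The example above reads id as a number, but the same steps should carry over to a character ID such as the "77A34987BG34" mentioned at the top of the page.  The following is a minimal sketch under that assumption, not a definitive recipe.  The dataset names test_char and test_char2 and the second ID value "12B45678CD90" are made up for illustration.  It also uses the SAS sum statement (newid + 1), which retains newid and starts it at 0 automatically, in place of the explicit retain statement.

data test_char;
  * declare id as a 12-character string before reading it;
  length id $12;
  input id $ a b;
  datalines;
77A34987BG34 0 0
12B45678CD90 1 0
77A34987BG34 0 1
;

* sort so that rows sharing an id are adjacent;
proc sort data = test_char;
  by id;
run;

data test_char2;
  set test_char;
  by id;
  * sum statement: newid is retained and initialized to 0 implicitly;
  if first.id then newid + 1;
run;

proc print data = test_char2;
run;

Because character values sort in collating order, "12B45678CD90" comes first and receives newid 1, and both rows for "77A34987BG34" receive newid 2.  The numbering therefore depends on the sort order of the original IDs, not on the order in which they first appear in the raw data.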


These pages are Copyrighted (c) by UCLA Academic Technology Services

