SAS Learning Module

1. Introduction

This module illustrates how to create and use labels in SAS. Two kinds of items can be labeled: variables and values. Once created, these labels appear in the output of statistical procedures and reports that you produce from SAS. They are also displayed by some of the SAS/GRAPH procedures.

The program below reads the data and creates a temporary data file called auto.  All of the labeling shown in this module is applied to this data file.

DATA auto ;
  INPUT make $  mpg rep78 weight foreign ;
DATALINES;
AMC     22 3 2930 0
AMC     17 3 3350 0
AMC     22 . 2640 0
Audi    17 5 2830 1
Audi    23 3 2070 1
BMW     25 4 2650 1
Buick   20 3 3250 0
Buick   15 4 4080 0
Buick   18 3 3670 0
Buick   26 . 2230 0
Buick   20 3 3280 0
Buick   16 3 3880 0
Buick   19 3 3400 0
Cad.    14 3 4330 0
Cad.    14 2 3900 0
Cad.    21 3 4290 0
Chev.   29 3 2110 0
Chev.   16 4 3690 0
Chev.   22 3 3180 0
Chev.   22 2 3220 0
Chev.   24 2 2750 0
Chev.   19 3 3430 0
Datsun  23 4 2370 1
Datsun  35 5 2020 1
Datsun  24 4 2280 1
Datsun  21 4 2750 1
;
RUN;

PROC CONTENTS DATA=auto;
RUN;

The output of the proc contents is shown below.  This portion of the output shows that no labels are attached to the variables in this file.

-----Alphabetic List of Variables and Attributes-----

#    Variable    Type    Len    Pos
5    FOREIGN     Num       8     32
1    MAKE        Char      8      0
2    MPG         Num       8      8
3    REP78       Num       8     16
4    WEIGHT      Num       8     24

2. Creating variable labels

We use the label statement in the data step to assign labels to the variables.  You could also assign labels to variables in proc steps, but then the labels only exist for that step.  When labels are assigned in the data step they are available for all procedures that use that data set.
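The difference between the two placements can be sketched as follows. This is a minimal illustration, not part of the original module: the data set name auto_small and its three rows are assumptions made so that the example runs on its own. The label given in the proc means applies to that step only, so the proc contents that follows should list no label for mpg.

```sas
/* Hypothetical three-row subset of the auto data (assumed name: auto_small) */
DATA auto_small;
  INPUT make $ mpg rep78 weight foreign;
DATALINES;
AMC     22 3 2930 0
Audi    17 5 2830 1
BMW     25 4 2650 1
;
RUN;

/* LABEL inside a proc step: the label is used for this output only */
PROC MEANS DATA=auto_small;
  VAR mpg;
  LABEL mpg = "Miles Per Gallon";
RUN;

/* The stored data set is unchanged, so no label is listed for mpg here */
PROC CONTENTS DATA=auto_small;
RUN;
```

Placing the same LABEL statement in the data step instead would store the label with the data set, which is the approach taken in the rest of this section.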

The following program assigns variable labels to rep78, mpg and foreign.

DATA  auto2;
   SET auto;
   LABEL  rep78  ="1978 Repair Record"
          mpg    ="Miles Per Gallon"
          foreign="Where Car Was Made";
RUN;

PROC CONTENTS DATA=auto2;
RUN;


The output produced by the proc contents step shows that the labels were indeed assigned.  The relevant part of this output follows.

  -----Alphabetic List of Variables and Attributes-----

#    Variable    Type    Len    Pos    Label
5    FOREIGN     Num       8     32    Where Car Was Made
1    MAKE        Char      8      0
2    MPG         Num       8      8    Miles Per Gallon
3    REP78       Num       8     16    1978 Repair Record
4    WEIGHT      Num       8     24

These labels also appear in the output of other procedures, giving a fuller description of the variables involved.  This is demonstrated in the proc means below.

PROC MEANS DATA=auto2;
RUN;


The output produced by the proc means shows the labels in the column titled Label.  The relevant part of this output follows.

Variable  Label                N          Mean        Std Dev    Minimum
MPG       Miles Per Gallon    26    20.9230769      4.7575042         14
REP78     1978 Repair Record  24     3.2916667      0.8064504          2
WEIGHT                        26       3099.23    695.0794089       2020
FOREIGN   Where Car Was Made  26     0.2692308      0.4523443          0

3. Creating and using value labels

Labeling values is a two-step process.  First, you create the label formats with proc format, using a value statement for each format.  Next, you attach the label format to the variable with a format statement.  This format statement can be used in either proc or data steps.  An example of the proc format step that creates the value formats forgnf and $makef follows.

PROC FORMAT;
  VALUE  forgnf 0="domestic"
                1="foreign" ;
  VALUE  $makef "AMC"    ="American Motors"
                "Buick"  ="Buick (GM)"
                "Cad."   ="Cadillac (GM)"
                "Chev."  ="Chevrolet (GM)"
                "Datsun" ="Datsun (Nissan)";
RUN;

You may include any number of value statements to create label formats as needed.  Since make is a variable that contains character values, when you define the formats for it you have to precede the format name with a $ so the format name becomes $makef.  Additionally, for character variables the values of the variables must be enclosed in quotes. 

Now that the formats forgnf and $makef have been created, they must be linked to the variables, foreign and make.  This is accomplished by including a format statement in either a proc or a data step.  In the program below the format statement is used in a proc freq.

PROC FREQ DATA=auto2;
   FORMAT  foreign forgnf.
           make    $makef.;
   TABLES foreign make;
RUN;

Notice that the formats forgnf. and $makef. are each followed by a period in the format statement.  This is the way that SAS tells the difference between the name of a format and the name of a variable in a format statement. 
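The same kind of FORMAT statement can instead be placed in a data step, in which case the format stays attached to the variable for every later procedure. The following is a minimal sketch under stated assumptions: the data set name auto_fmt and its three rows are hypothetical and were chosen only so that the example runs on its own.

```sas
/* Define the value format first, as described above */
PROC FORMAT;
  VALUE forgnf 0="domestic"
               1="foreign";
RUN;

/* Hypothetical small data set (assumed name: auto_fmt);          */
/* the FORMAT statement in the data step attaches forgnf. to foreign */
DATA auto_fmt;
  INPUT make $ mpg rep78 weight foreign;
  FORMAT foreign forgnf.;
DATALINES;
AMC     22 3 2930 0
Audi    17 5 2830 1
BMW     25 4 2650 1
;
RUN;

/* No FORMAT statement is needed here; the attached format is used */
PROC FREQ DATA=auto_fmt;
  TABLES foreign;
RUN;
```

With this placement the frequency table should show domestic and foreign rather than 0 and 1, even though the proc freq step contains no FORMAT statement of its own.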

The output of the frequencies procedure for foreign displays the newly defined labels instead of the values of the variable.

                  Where Car Was Made

                                Cumulative  Cumulative
 FOREIGN   Frequency   Percent   Frequency    Percent
domestic         19      73.1          19       73.1
foreign           7      26.9          26      100.0 

The output of the frequencies procedure for make displays the newly defined labels instead of the values of the variable.  Values for which formats haven't been defined (Audi and BMW) appear in the table without modification.

                                       Cumulative  Cumulative
MAKE              Frequency   Percent   Frequency    Percent
American Motors          3      11.5           3       11.5
Audi                     2       7.7           5       19.2
BMW                      1       3.8           6       23.1
Buick (GM)               7      26.9          13       50.0
Cadillac (GM)            3      11.5          16       61.5
Chevrolet (GM)           6      23.1          22       84.6
Datsun (Nissan)          4      15.4          26      100.0

If you link formats to variables in a data step where a permanent file is created, then every time you use that file SAS expects to find the formats.  Thus you will have to supply the proc format code in each program that uses the file.  Since this can make each of your programs much longer than you might like, here is a tip for accomplishing this task without repeating the proc format code in every program.  Assuming that a small program containing only the proc format is stored in a file in a directory called myfiles on your C: drive, the following statement will bring that code into your current program:

%INCLUDE 'C:\myfiles\';

This should save time and make maintenance of your programs easier.  The remainder of your program would follow this statement.

4. Problems to look out for

5. For more information


The content of this web site should not be construed as an endorsement of any particular web site, book, or software product by the University of California.