Stata Learning Module
Labeling data

This module will show how to create labels for your data.  Stata allows you to label your data file (data label), to label the variables within your data file (variable labels), and to label the values for your variables (value labels).  Let's use a file called autolab that does not have any labels.

use http://www.ats.ucla.edu/stat/stata/modules/autolab.dta, clear 

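Let's use the describe command to verify that the variables in this file have no variable labels or value labels. (The output does show an existing data label, 1978 Automobile Data, which we will replace below.)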

describe 
Contains data from autolab.dta
 obs:            74                          1978 Automobile Data
 vars:            12                          23 Oct 2008 13:36
 size:         3,478 (99.9% of memory free)   (_dta has notes)
-------------------------------------------------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------------------------------------------------
make            str18  %-18s                  
price           int    %8.0gc                 
mpg             int    %8.0g                  
rep78           int    %8.0g                  
headroom        float  %6.1f                  
trunk           int    %8.0g                  
weight          int    %8.0gc                 
length          int    %8.0g                  
turn            int    %8.0g                  
displacement    int    %8.0g                  
gear_ratio      float  %6.2f                  
foreign         byte   %8.0g              
-------------------------------------------------------------------------------
Sorted by:   

Let's use the label data command to add a label describing the data file.  This label can be up to 80 characters long.

label data "This file contains auto data for the year 1978" 

The describe command shows that this label has been applied to the version that is currently in memory.

describe 
Contains data from autolab.dta
 obs:            74                          This file contains auto data for the year 1978
 vars:            12                          23 Oct 2008 13:36
 size:         3,478 (99.9% of memory free)   (_dta has notes)
-------------------------------------------------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------------------------------------------------
make            str18  %-18s                  
price           int    %8.0gc                 
mpg             int    %8.0g                  
rep78           int    %8.0g                  
headroom        float  %6.1f                  
trunk           int    %8.0g                  
weight          int    %8.0gc                 
length          int    %8.0g                  
turn            int    %8.0g                  
displacement    int    %8.0g                  
gear_ratio      float  %6.2f                  
foreign         byte   %8.0g
-------------------------------------------------------------------------------
Sorted by:   

Let's use the label variable command to assign labels to the variables rep78, price, mpg, and foreign.

label variable rep78   "the repair record from 1978" 
label variable price   "the price of the car in 1978" 
label variable mpg     "the miles per gallon for the car" 
label variable foreign "the origin of the car, foreign or domestic" 

The describe command shows these labels have been applied to the variables.

describe 
Contains data from autolab.dta
 obs:            74                          This file contains auto data for the year 1978
 vars:            12                          23 Oct 2008 13:36
 size:         3,478 (99.9% of memory free)   (_dta has notes)
-------------------------------------------------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------------------------------------------------
make            str18  %-18s                  
price           int    %8.0gc                 the price of the car in 1978
mpg             int    %8.0g                  the miles per gallon for the car
rep78           int    %8.0g                  the repair record from 1978
headroom        float  %6.1f                  
trunk           int    %8.0g                  
weight          int    %8.0gc                 
length          int    %8.0g                  
turn            int    %8.0g                  
displacement    int    %8.0g                  
gear_ratio      float  %6.2f                  
foreign         byte   %8.0g                  the origin of the car, foreign or domestic
-------------------------------------------------------------------------------
Sorted by:   

Let's make a value label called foreignl to label the values of the variable foreign. This is a two-step process: first you define the label, and then you assign the label to the variable.  The label define command below creates the value label called foreignl that associates 0 with domestic car and 1 with foreign car.

label define foreignl 0 "domestic car" 1 "foreign car" 

The label values command below associates the variable foreign with the label foreignl.

label values foreign foreignl  

If we use the describe command, we can see that the variable foreign has a value label called foreignl assigned to it.

describe 
Contains data from autolab.dta
 obs:            74                          This file contains auto data for the year 1978
 vars:            12                          23 Oct 2008 13:36
 size:         3,478 (99.9% of memory free)   (_dta has notes)
-------------------------------------------------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------------------------------------------------
make            str18  %-18s                  
price           int    %8.0gc                 the price of the car in 1978
mpg             int    %8.0g                  the miles per gallon for the car
rep78           int    %8.0g                  the repair record from 1978
headroom        float  %6.1f                  
trunk           int    %8.0g                  
weight          int    %8.0gc                 
length          int    %8.0g                  
turn            int    %8.0g                  
displacement    int    %8.0g                  
gear_ratio      float  %6.2f                  
foreign         byte   %12.0g      foreignl   the origin of the car, foreign or domestic
-------------------------------------------------------------------------------
Sorted by:   
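
As an additional check, here is a minimal sketch using the label list command, which is not part of the original module. It displays the mapping stored in a value label, so you can confirm that foreignl pairs 0 and 1 with the intended text.

```stata
* Show the value-to-text mapping stored in the value label foreignl
label list foreignl
```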

Now when we use the table foreign command, it shows the labels domestic car and foreign car instead of just 0 and 1.

table foreign 
-------------+-----------
the origin   |
of the car,  |
foreign or   |
domestic     |      Freq.
-------------+-----------
domestic car |         52
 foreign car |         22
-------------+----------- 
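
If you want to see the underlying numeric codes again, one option is the nolabel option of the tabulate command. The sketch below is an addition to the module, not part of the original. It assumes the value label foreignl has already been attached to foreign as shown above.

```stata
* One-way frequency table that displays the value labels
tabulate foreign
* Same table, but showing the underlying codes 0 and 1 instead of the labels
tabulate foreign, nolabel
```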

Value labels are used in other commands as well. For example, below we issue the ttest mpg , by(foreign) command, and the output labels the groups as domestic and foreign (instead of 0 and 1).

ttest mpg , by(foreign) 
Two-sample t test with equal variances

------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
domestic |      52    19.82692     .657777    4.743297    18.50638    21.14747
 foreign |      22    24.77273     1.40951    6.611187    21.84149    27.70396
---------+--------------------------------------------------------------------
combined |      74     21.2973    .6725511    5.785503     19.9569    22.63769
---------+--------------------------------------------------------------------
    diff |           -4.945804    1.362162               -7.661225   -2.230384
------------------------------------------------------------------------------
Degrees of freedom: 72

                Ho: mean(domestic) - mean(foreign) = diff = 0

     Ha: diff < 0               Ha: diff ~= 0              Ha: diff > 0
       t =  -3.6308                t =  -3.6308              t =  -3.6308
   P < t =   0.0003          P > |t| =   0.0005          P > t =   0.9997 

One very important note:  These labels are assigned to the data that is currently in memory.  To make these changes permanent, you need to save the data.  When you save the data, all of the labels (data labels, variable labels, value labels) will be saved with the data file.
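
A minimal sketch of that last step follows. The file name autolab_labeled is a hypothetical choice made for this example, not a name used by the module. Saving under a new name leaves the original file untouched.

```stata
* Save the labeled data under a new (hypothetical) file name
save autolab_labeled, replace
* Reload the saved file and confirm that the labels were stored with it
use autolab_labeled, clear
describe
```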

Summary

Assign a label to the data file currently in memory.

label data "1978 auto data"

Assign a label to the variable foreign.

label variable foreign "the origin of the car, foreign or domestic"

Create the value label foreignl and assign it to the variable foreign.

label define foreignl 0 "domestic car" 1 "foreign car"
label values foreign foreignl  
