The census uses numeric codes to represent the values of its variables, and you usually need the codebook to find out what each code means. Referring back to the codebook every time you run a PROC FREQ quickly becomes cumbersome. For example, the program in Example 1 runs a PROC FREQ on the variable sex. As Output 1 shows, it is unclear whether 0 represents males or females.
Example 1. PROC FREQ on Sex Without Formats
PROC FREQ DATA="c:\census\to90pump";
TABLES sex;
RUN;
Output 1. Output From PROC FREQ Without Formats
                             Cumulative  Cumulative
SEX     Frequency   Percent   Frequency     Percent
---------------------------------------------------
0            2995      49.6        2995        49.6
1            3047      50.4        6042       100.0
By contrast, Example 2 performs the PROC FREQ displaying the formatted values for sex. As you can see in Output 2, the values for Male and Female are clearly labeled.
Example 2. PROC FREQ on Sex With Formats
* this creates the formats;
%INCLUDE 'c:\census\pum90.format.sas';
* this illustrates how to use the formats;
PROC FREQ data="c:\census\to90pump";
TABLES sex;
FORMAT sex sex.;
RUN;
Output 2. Output From PROC FREQ with Formats
                             Cumulative  Cumulative
SEX     Frequency   Percent   Frequency     Percent
---------------------------------------------------
Male         2995      49.6        2995        49.6
Female       3047      50.4        6042       100.0
Two changes were made to Example 2 to display the formatted values. First, the line
%INCLUDE 'c:\census\pum90.format.sas';
was added, which reads in the ATS pre-defined formats for the PUMS 90 data. You can download that file here. Second, the line
FORMAT sex sex.;
was added to the PROC FREQ, which instructed SAS to format the variable "sex" according to the format "sex.".
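The contents of pum90.format.sas are not shown here, but a format such as "sex." is typically defined with PROC FORMAT. The following is only a minimal sketch of what such a definition might look like, with the labels inferred from Output 1 and Output 2 (0 is displayed as Male and 1 as Female); the actual ATS file may define it differently.

```sas
* Hypothetical sketch only, not the actual contents of pum90.format.sas;
* Labels are inferred from Output 1 and Output 2;
PROC FORMAT;
  VALUE sex
    0 = 'Male'
    1 = 'Female';
RUN;
```

Once a format like this has been defined in the session, the statement FORMAT sex sex.; makes PROC FREQ display the labels in place of the numeric codes.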
ATS has created pre-defined formats for many of the variables in the 1990 PUMS data files (us90pump, us90pumh, ca90pump, ca90pumh, to90pump, and to90pumh). The file format.list lists every variable that has a format, along with the name of that format.
In general, the format has the same name as the variable, with a trailing period (e.g. the format for "sex" is "sex."). However, SAS does not permit a format name to end with a number, so the format for a variable whose name ends with a number was given an "f" as a suffix (e.g. the format for "units1" is "units1f."). Sometimes the variable name had to be abbreviated to make room for the suffix (e.g. the format for "vacancy1" is "vcancy1f."), since "vacancy1f" would be nine characters and format names were limited to eight characters at the time.
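The naming rules above can be sketched with the two variables mentioned. This example assumes that "units1" and "vacancy1" are housing variables stored in the housing file to90pumh, and that the file sits in c:\census like the person file used earlier; neither assumption is confirmed here, so check format.list and your own file locations.

```sas
* Assumption: units1 and vacancy1 are in the housing file to90pumh;
%INCLUDE 'c:\census\pum90.format.sas';
PROC FREQ DATA="c:\census\to90pumh";
  TABLES units1 vacancy1;
  * Format names carry the f suffix, and vacancy1 uses the abbreviated name;
  FORMAT units1 units1f. vacancy1 vcancy1f.;
RUN;
```

A single FORMAT statement can pair several variables with their formats, as shown here.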
Originally revised: 15 Oct 96
These pages are Copyrighted (c) by UCLA Academic Technology Services