This article was originally published in Perspective, Volume 19, Number 4, 1995, pp. 21-28.
by Peter M. Saama, Ph.D.
The implementation of SAS in the AIX and MVS environments differs markedly. Consequently, SAS users migrating from the ES/9000 to the SP2/cluster need to be aware of the host-specific features of the production release of the SAS system under AIX.
Generally, a SAS program consists of two kinds of components, DATA steps and PROC (procedure) steps, which are the building blocks of all SAS programs. A SAS command file or program contains DATA steps, PROC steps, or both. The steps can appear in any order and in any number. SAS statements usually begin with a keyword and always end with a semicolon (;).
These features of SAS are the same across host systems. The most distinguishing host-specific feature is how SAS handles data in external files. This article discusses SAS LIBNAME and FILENAME statements needed to access external files on the SP2/cluster complex. Host-dependent features of SAS are presented, including use of environment variables.
In contrast to MVS, where a SAS library is a partitioned data set, a SAS library under AIX is a directory. Members of the library are stored as individual files in that directory, with file names ending in .ssdnn (for example, .ssd01 in the pathnames shown below).
Figure 1 shows a sketch of a hypothetical directory structure on AIX. Directories enable you to organize your files in a hierarchical structure. Each of the directories under the home directory (which can be referred to using a tilde, '~') is a valid SAS data library. For example, the SAS data library could be a directory called '~/sas/class' with the members 'winners', 'losers', and 'totals'. Their full pathnames would be:
~/sas/class/winners.ssd01
~/sas/class/losers.ssd01
~/sas/class/totals.ssd01
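The SAS LIBNAME statement is used to identify SAS data libraries to be accessed in a SAS session or in a SAS job. The general syntax of a LIBNAME statement on AIX is:
LIBNAME libref engine 'SAS-data-library';
where:
libref is the name by which the library is referenced in the rest of the SAS program.
engine is the optional name of the engine used to access the library.
'SAS-data-library' is the pathname of the directory that holds the library members.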
The V609 engine is the default on AIX and provides write access to the current form of a SAS data library (release 6.09) as well as read access to SAS data files created by earlier releases. If you omit an engine name on the LIBNAME statement, the SAS system looks at the extensions of the files in the given directory and determines the appropriate engine.
Figure 2 shows samples of SAS LIBNAME statements for the default engine. For syntax related to the other engines see SAS Companion for UNIX Environments: Language.
In example A, the LIBNAME statement associates the libref 'in1' with the current working directory (.). The current working directory contains the SAS library members you will be accessing or creating.
In example B, the LIBNAME statement associates the libref 'in2' with your home directory (~).
In example C, the LIBNAME statement associates the libref 'in3' with a directory named '~/sas/class'. The directory must exist on the file system. SAS will not create it for you.
In example D, the LIBNAME statement associates the libref 'in4' with the environment variable 'MYSASLIB'. In the default C shell, the environment variable 'MYSASLIB' is created by typing:
setenv MYSASLIB ~/sas/class
The equivalent syntax for the Korn shell is:
export MYSASLIB=~/sas/class
The environment variable can also be used as a reference name for a SAS library. This is useful if many of your SAS programs access library members which are in the same directory. Since no engine can be specified when you associate a libref with an environment variable, the SAS system assigns one when the library is accessed.
The following statement uses the environment variable 'MYSASLIB' as a libref to access the SAS library '~/sas/class'. The SAS library contains a member called 'winners' with the pathname '~/sas/class/winners.ssd01':
PROC PRINT DATA=MYSASLIB.winners;
As a general rule, the name of an environment variable that is used to reference a SAS library cannot include lowercase letters, and the value of the variable must be a directory. Environment variables with names longer than eight (8) characters are easy to create, but they can only be used on the LIBNAME statement rather than directly as librefs. We recommend that you assign variable names that do not exceed eight (8) characters in length.
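The four LIBNAME examples described above (A through D) might look like the following. This is a sketch reconstructed from the descriptions in the text, not a copy of Figure 2. In particular, the '$MYSASLIB' form in example D assumes that SAS expands a '$NAME' environment-variable reference inside a quoted pathname; check this against the SAS Companion for UNIX Environments before relying on it.

```sas
/* A: libref for the current working directory */
LIBNAME in1 '.';

/* B: libref for your home directory */
LIBNAME in2 '~';

/* C: libref for an existing directory; SAS will not create it */
LIBNAME in3 '~/sas/class';

/* D: libref for the directory named by the environment variable MYSASLIB.
   Assumption: $MYSASLIB is expanded inside the quoted pathname. */
LIBNAME in4 '$MYSASLIB';

/* Read the member 'winners' through the libref from example C */
PROC PRINT DATA=in3.winners;
RUN;
```

Because no engine is named, each statement leaves the choice of engine to SAS, as described earlier.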
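You must explicitly issue the FILENAME statement for external files, such as 'flat files' containing data. The general form of the FILENAME statement on AIX is:
FILENAME fileref device-type 'external-file';
where:
fileref is the name by which the external file is referenced in the rest of the SAS program.
device-type is the optional type of device on which the file resides.
'external-file' is the pathname of the external file.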
DISK is the default device type. Sample FILENAME statements for the default device type are shown in Figure 3. For syntax related to the other device types, see SAS Companion for UNIX Environments: Language.
In example A, the FILENAME statement associates the fileref 'indata1' with the file 'gpa.rawdata' stored in the current working directory (.).
In example B, the FILENAME statement associates the fileref 'indata2' with the file 'gpa.rawdata' stored in your home directory (~).
In example C, the FILENAME statement associates the fileref 'indata3' with the file 'gpa.rawdata' stored in an existing directory named '~/sas/class'.
In example D, the FILENAME statement associates the fileref 'indata4' with the environment variable 'MYRAWDAT'. In the default C shell, the environment variable 'MYRAWDAT' is created by typing:
setenv MYRAWDAT ~/sas/class/gpa.rawdata
The equivalent syntax for the Korn shell is:
export MYRAWDAT=~/sas/class/gpa.rawdata
The environment variable can also be used as a reference name for an external file in the DATA step of a SAS program. The following statements use the environment variable 'MYRAWDAT' as a fileref to access a file called 'gpa.rawdata' in the subdirectory '~/sas/class' in two ways:
INFILE MYRAWDAT;
FILE MYRAWDAT;
As a general rule, the name of an environment variable that is used to reference an external file cannot include lowercase letters, and the value of the variable must be a pathname. Environment variables with names longer than eight (8) characters can only be used on the FILENAME statement rather than directly as filerefs. We recommend that you assign variable names that do not exceed eight (8) characters in length.
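The four FILENAME examples described above (A through D) might be written as follows. Again, this is a sketch based on the descriptions in the text rather than the original Figure 3, and the '$MYRAWDAT' form in example D assumes that SAS expands a '$NAME' environment-variable reference inside a quoted pathname. The closing DATA step is an illustration added here: it copies each record of the file to the SAS log, which is a quick way to confirm that a fileref points where you expect.

```sas
/* A: file in the current working directory */
FILENAME indata1 './gpa.rawdata';

/* B: file in your home directory */
FILENAME indata2 '~/gpa.rawdata';

/* C: file in the existing directory ~/sas/class */
FILENAME indata3 '~/sas/class/gpa.rawdata';

/* D: file named by the environment variable MYRAWDAT.
   Assumption: $MYRAWDAT is expanded inside the quoted pathname. */
FILENAME indata4 '$MYRAWDAT';

/* Check a fileref: read each record and echo it to the SAS log */
DATA _NULL_;
  INFILE indata3;
  INPUT;
  PUT _INFILE_;
RUN;
```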
SAS data files (system files) are referenced with a one- or two-level name. The two-level name is of the form
libref.member-name
where libref refers to the SAS data library (directory) in which the data file resides and member-name refers to the particular member within that library. The one-level name is of the form
member-name (without a libref)
In this case, SAS stores the files in the temporary WORK library which is defined automatically by the SAS system at the beginning of each SAS session or job.
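The difference between the two forms can be sketched with the '~/sas/class' library used earlier. The member name 'combined' and the data set name 'scratch' are hypothetical names chosen for this illustration.

```sas
LIBNAME in3 '~/sas/class';

/* Two-level name: a permanent member stored in the directory ~/sas/class */
DATA in3.combined;
  SET in3.winners in3.losers;   /* concatenate two existing members */
RUN;

/* One-level name: stored in the temporary WORK library */
DATA scratch;
  SET in3.combined;
RUN;
```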
Once librefs and filerefs are defined, you can use them to access data libraries and external files. Be sure to issue the LIBNAME and FILENAME statements before the SAS statements that reference the file(s).
The LIBNAME and FILENAME statements used in Figure 4 show you three alternative methods for creating a SAS data file from a space-delimited file.
In example A, part 1 uses the SAS LIBNAME statement to assign the libref ('outgpa') to an AIX SAS data library, in this case your home directory. Remember that a SAS library in AIX is a directory which is used to store data members.
Part 2 of example A uses the SAS FILENAME statement to assign a fileref ('ingpa') to the space-delimited file 'gpa.rawdata'. The full path name is given (directory and filename).
Part 3 of example A creates a permanent SAS data file called '~/gpa.ssd01' from the external file '/local2/samples/sas/aix/gpa.rawdata'. The library reference name (libref) 'outgpa' is used as the first level of the two-level SAS file name 'outgpa.gpa'.
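The three parts of example A might be put together as follows. This is a sketch based on the description above, not the original Figure 4. The variable list on the INPUT statement ('id', 'sex', 'gpa') is hypothetical; replace it with one that corresponds to your data.

```sas
/* Part 1: libref for the library that will hold the new data file
   (here, your home directory) */
LIBNAME outgpa '~';

/* Part 2: fileref for the space-delimited raw data file */
FILENAME ingpa '/local2/samples/sas/aix/gpa.rawdata';

/* Part 3: create the permanent SAS data file ~/gpa.ssd01 */
DATA outgpa.gpa;
  INFILE ingpa;
  INPUT id sex $ gpa;   /* hypothetical variable list; list input
                           reads space-delimited values */
RUN;
```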
Another approach (shown in example B) uses a FILENAME statement to point a fileref at the directory rather than at a single file. In the INFILE statement you then specify the fileref followed by the individual filename in parentheses. The relevant syntax for the FILENAME statement is:
FILENAME ingpa '/local2/samples/sas/aix';
and the matching syntax for the INFILE statement is (see example B in Figure 4):
INFILE ingpa('gpa.rawdata');
This is especially useful when you have to refer to several files in one directory.
Alternatively, you can refer to the raw data file directly, by specifying the pathname for the file on the INFILE statement, as shown in the sample setup in example C.
INFILE '/local2/samples/sas/aix/gpa.rawdata';
No FILENAME statement is required with this method.
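Examples B and C might be sketched as follows, assuming the same 'outgpa' libref as in example A. Either DATA step on its own creates the same data file; the two are alternatives, not a sequence. As before, the INPUT variable list is hypothetical.

```sas
LIBNAME outgpa '~';

/* Example B: the fileref names a directory, and the INFILE statement
   picks one file within it */
FILENAME ingpa '/local2/samples/sas/aix';

DATA outgpa.gpa;
  INFILE ingpa('gpa.rawdata');
  INPUT id sex $ gpa;   /* hypothetical variable list */
RUN;

/* Example C: no fileref; the pathname is given on the INFILE statement */
DATA outgpa.gpa;
  INFILE '/local2/samples/sas/aix/gpa.rawdata';
  INPUT id sex $ gpa;   /* hypothetical variable list */
RUN;
```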
Once a libref is defined, you can use it to access a permanent SAS data library. The SAS statements in Figure 5 show the use of a libref in the PRINT procedure. They also demonstrate the use of a libref to update a SAS data library.
Part A uses the SAS LIBNAME statement to assign the libref ('ingpa') to an AIX SAS library.
Part B uses the libref ('ingpa') as the value of a PROC statement option to access a member ('gpa') of a SAS data library ('/local2/samples/sas/aix'). In the PRINT procedure of SAS, the data library is opened for read access only.
Part C creates a permanent SAS dataset '~/gpa.ssd01' from the external raw data file '/local2/samples/sas/aix/gpa.rawdata'. The SAS dataset is created by using the library reference name (libref) 'outgpa' as the first level of a two-level SAS file name 'outgpa.gpa'.
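Parts A and B above might look like the following. This is a sketch from the description, not the original Figure 5, and it assumes the library directory is '/local2/samples/sas/aix' and already contains the member 'gpa'.

```sas
/* Part A: libref for an existing SAS data library */
LIBNAME ingpa '/local2/samples/sas/aix';

/* Part B: print the member 'gpa'; the library is opened for reading only */
PROC PRINT DATA=ingpa.gpa;
RUN;
```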
Figure 1. Hypothetical directory structure on AIX
Figure 2. Samples of SAS LIBNAME statements on AIX
Figure 3. Samples of SAS FILENAME statements on AIX
Figure 4. Three alternative methods for creating a SAS data file from raw data
Notes: i. Change '/local2/samples/sas/aix' to the directory containing your raw data. Change 'gpa.rawdata' to the name of the file containing your data. ii. Specify a variable list that corresponds to your data.
Figure 5. Accessing permanent SAS data libraries
Engine and file types for libraries accessible to SAS on AIX
Device types and functions in the SAS FILENAME statement
2 Nov 95; Rev. 15 Dec 95