UCLA Academic Technology Services HomeServicesClassesContactJobs

SAS Learning Module
Using where with SAS procedures

1. Introduction

This program builds a SAS file called auto, which we will use to demonstrate the use of the where statement. (For information about creating SAS files from raw data, see the SAS Learning Module titled Inputting Raw Data into SAS.

DATA auto ;
  LENGTH make $ 20 ;
  INPUT make $ 1-17 price mpg rep78 hdroom trunk weight length turn
        displ gratio foreign ;
CARDS;
AMC Concord        4099 22 3 2.5 11 2930 186 40 121 3.58 0
AMC Pacer          4749 17 3 3.0 11 3350 173 40 258 2.53 0
AMC Spirit         3799 22 . 3.0 12 2640 168 35 121 3.08 0
Audi 5000          9690 17 5 3.0 15 2830 189 37 131 3.20 1
Audi Fox           6295 23 3 2.5 11 2070 174 36  97 3.70 1
BMW 320i           9735 25 4 2.5 12 2650 177 34 121 3.64 1
Buick Century      4816 20 3 4.5 16 3250 196 40 196 2.93 0
Buick Electra      7827 15 4 4.0 20 4080 222 43 350 2.41 0
Buick LeSabre      5788 18 3 4.0 21 3670 218 43 231 2.73 0
Buick Opel         4453 26 . 3.0 10 2230 170 34 304 2.87 0
Buick Regal        5189 20 3 2.0 16 3280 200 42 196 2.93 0
Buick Riviera     10372 16 3 3.5 17 3880 207 43 231 2.93 0
Buick Skylark      4082 19 3 3.5 13 3400 200 42 231 3.08 0
Cad. Deville      11385 14 3 4.0 20 4330 221 44 425 2.28 0
Cad. Eldorado     14500 14 2 3.5 16 3900 204 43 350 2.19 0
Cad. Seville      15906 21 3 3.0 13 4290 204 45 350 2.24 0
Chev. Chevette     3299 29 3 2.5  9 2110 163 34 231 2.93 0
Chev. Impala       5705 16 4 4.0 20 3690 212 43 250 2.56 0
Chev. Malibu       4504 22 3 3.5 17 3180 193 31 200 2.73 0
Chev. Monte Carlo  5104 22 2 2.0 16 3220 200 41 200 2.73 0
Chev. Monza        3667 24 2 2.0  7 2750 179 40 151 2.73 0
Chev. Nova         3955 19 3 3.5 13 3430 197 43 250 2.56 0
Datsun 200         6229 23 4 1.5  6 2370 170 35 119 3.89 1
Datsun 210         4589 35 5 2.0  8 2020 165 32  85 3.70 1
Datsun 510         5079 24 4 2.5  8 2280 170 34 119 3.54 1
Datsun 810         8129 21 4 2.5  8 2750 184 38 146 3.55 1
Dodge Colt         3984 30 5 2.0  8 2120 163 35  98 3.54 0
Dodge Diplomat     4010 18 2 4.0 17 3600 206 46 318 2.47 0
Dodge Magnum       5886 16 2 4.0 17 3600 206 46 318 2.47 0
Dodge St. Regis    6342 17 2 4.5 21 3740 220 46 225 2.94 0
Fiat Strada        4296 21 3 2.5 16 2130 161 36 105 3.37 1
Ford Fiesta        4389 28 4 1.5  9 1800 147 33  98 3.15 0
Ford Mustang       4187 21 3 2.0 10 2650 179 43 140 3.08 0
Honda Accord       5799 25 5 3.0 10 2240 172 36 107 3.05 1
Honda Civic        4499 28 4 2.5  5 1760 149 34  91 3.30 1
Linc. Continental 11497 12 3 3.5 22 4840 233 51 400 2.47 0
Linc. Mark V      13594 12 3 2.5 18 4720 230 48 400 2.47 0
Linc. Versailles  13466 14 3 3.5 15 3830 201 41 302 2.47 0
Mazda GLC          3995 30 4 3.5 11 1980 154 33  86 3.73 1
Merc. Bobcat       3829 22 4 3.0  9 2580 169 39 140 2.73 0
Merc. Cougar       5379 14 4 3.5 16 4060 221 48 302 2.75 0
Merc. Marquis      6165 15 3 3.5 23 3720 212 44 302 2.26 0
Merc. Monarch      4516 18 3 3.0 15 3370 198 41 250 2.43 0
Merc. XR-7         6303 14 4 3.0 16 4130 217 45 302 2.75 0
Merc. Zephyr       3291 20 3 3.5 17 2830 195 43 140 3.08 0
Olds 98            8814 21 4 4.0 20 4060 220 43 350 2.41 0
Olds Cutl Supr     5172 19 3 2.0 16 3310 198 42 231 2.93 0
Olds Cutlass       4733 19 3 4.5 16 3300 198 42 231 2.93 0
Olds Delta 88      4890 18 4 4.0 20 3690 218 42 231 2.73 0
Olds Omega         4181 19 3 4.5 14 3370 200 43 231 3.08 0
Olds Starfire      4195 24 1 2.0 10 2730 180 40 151 2.73 0
Olds Toronado     10371 16 3 3.5 17 4030 206 43 350 2.41 0
Peugeot 604       12990 14 . 3.5 14 3420 192 38 163 3.58 1
Plym. Arrow        4647 28 3 2.0 11 3260 170 37 156 3.05 0
Plym. Champ        4425 34 5 2.5 11 1800 157 37  86 2.97 0
Plym. Horizon      4482 25 3 4.0 17 2200 165 36 105 3.37 0
Plym. Sapporo      6486 26 . 1.5  8 2520 182 38 119 3.54 0
Plym. Volare       4060 18 2 5.0 16 3330 201 44 225 3.23 0
Pont. Catalina     5798 18 4 4.0 20 3700 214 42 231 2.73 0
Pont. Firebird     4934 18 1 1.5  7 3470 198 42 231 3.08 0
Pont. Grand Prix   5222 19 3 2.0 16 3210 201 45 231 2.93 0
Pont. Le Mans      4723 19 3 3.5 17 3200 199 40 231 2.93 0
Pont. Phoenix      4424 19 . 3.5 13 3420 203 43 231 3.08 0
Pont. Sunbird      4172 24 2 2.0  7 2690 179 41 151 2.73 0
Renault Le Car     3895 26 3 3.0 10 1830 142 34  79 3.72 1
Subaru             3798 35 5 2.5 11 2050 164 36  97 3.81 1
Toyota Celica      5899 18 5 2.5 14 2410 174 36 134 3.06 1
Toyota Corolla     3748 31 5 3.0  9 2200 165 35  97 3.21 1
Toyota Corona      5719 18 5 2.0 11 2670 175 36 134 3.05 1
Volvo 260         11995 17 5 2.5 14 3170 193 37 163 2.98 1
VW Dasher          7140 23 4 2.5 12 2160 172 36  97 3.74 1
VW Diesel          5397 41 5 3.0 15 2040 155 35  90 3.78 1
VW Rabbit          4697 25 4 3.0 15 1930 155 35  89 3.78 1
VW Scirocco        6850 25 4 2.0 16 1990 156 36  97 3.78 1
;
RUN; 

2. Basic use of the where statement

The where statement allows us to run procedures on a subset of records. For example, instead of printing all records in the file, the following program prints only cars where the value for rep78 is 3 or greater.

 PROC PRINT DATA=auto;
  WHERE (rep78 >= 3);
  VAR make rep78;
RUN;   

Here is the output from the proc print. Note that we have directed SAS to print only two variables: make and rep78.

  OBS    MAKE                 rep78      
   1    AMC Concord            3        
   2    AMC Pacer              3        
   4    Audi 5000              5        
   5    Audi Fox               3        
   6    BMW 320i               4        
   7    Buick Century          3        
   8    Buick Electra          4        
   9    Buick LeSabre          3        
  11    Buick Regal            3        
  12    Buick Riviera          3        
  13    Buick Skylark          3        
  14    Cad. Deville           3        
  16    Cad. Seville           3        
  17    Chev. Chevette         3        
  18    Chev. Impala           4        
  19    Chev. Malibu           3        
  22    Chev. Nova             3        
  23    Datsun 200             4        
  24    Datsun 210             5        
  25    Datsun 510             4        
  26    Datsun 810             4        
  27    Dodge Colt             5        
  31    Fiat Strada            3        
  32    Ford Fiesta            4        
  33    Ford Mustang           3        
  34    Honda Accord           5        
  35    Honda Civic            4        
  36    Linc. Continental      3        
  37    Linc. Mark V           3        
  38    Linc. Versailles       3        
  39    Mazda GLC              4        
  40    Merc. Bobcat           4        
  41    Merc. Cougar           4        
  42    Merc. Marquis          3        
  43    Merc. Monarch          3        
  44    Merc. XR-7             4        
  45    Merc. Zephyr           3        
  46    Olds 98                4        
  47    Olds Cutl Supr         3        
  48    Olds Cutlass           3        
  49    Olds Delta 88          4        
  50    Olds Omega             3        
  52    Olds Toronado          3        
  54    Plym. Arrow            3        
  55    Plym. Champ            5        
  56    Plym. Horizon          3        
  59    Pont. Catalina         4        
  61    Pont. Grand Prix       3        
  62    Pont. Le Mans          3        
  65    Renault Le Car         3        
  66    Subaru                 5        
  67    Toyota Celica          5        
  68    Toyota Corolla         5        
  69    Toyota Corona          5        
  70    Volvo 260              5        
  71    VW Dasher              4        
  72    VW Diesel              5        
  73    VW Rabbit              4        
  74    VW Scirocco            4        

Consider the following program which compares repair records for foreign and domestic cars by creating a table of repairs (rep78) for each separately.

PROC FREQ DATA=auto; 
  TABLES rep78*foreign ;
RUN;

TABLE OF rep78 BY FOREIGN      
  
       rep78        FOREIGN                   
  
     Frequency=                          
     Percent  =                          
     Row Pct  =                          
     Col Pct  =       0=       1=  Total 
     ============================        
            1 =      2 =      0 =      2 
              =   2.90 =   0.00 =   2.90 
              = 100.00 =   0.00 =        
              =   4.17 =   0.00 =        
     ============================        
            2 =      8 =      0 =      8 
              =  11.59 =   0.00 =  11.59 
              = 100.00 =   0.00 =        
              =  16.67 =   0.00 =        
     ============================        
            3 =     27 =      3 =     30 
              =  39.13 =   4.35 =  43.48 
              =  90.00 =  10.00 =        
              =  56.25 =  14.29 =        
     ============================        
            4 =      9 =      9 =     18 
              =  13.04 =  13.04 =  26.09 
              =  50.00 =  50.00 =        
              =  18.75 =  42.86 =        
     ============================        
            5 =      2 =      9 =     11 
              =   2.90 =  13.04 =  15.94 
              =  18.18 =  81.82 =        
              =   4.17 =  42.86 =        
     ============================        
     Total          48       21       69 
                 69.57    30.43   100.00 

Using the where statement, we restrict the analysis to only cars with a repair rating of 3 or more (rep78  >=  3):

PROC FREQ DATA=auto; 
  WHERE (rep78 >= 3);
  TABLES rep78*foreign ;
RUN; 

TABLE OF rep78 BY FOREIGN      
       rep78     FOREIGN                   
  
     Frequency=                          
     Percent  =                          
     Row Pct  =                          
     Col Pct  =       0=       1=  Total 
     ============================        
            3 =     27 =      3 =     30 
              =  45.76 =   5.08 =  50.85 
              =  90.00 =  10.00 =        
              =  71.05 =  14.29 =        
     ============================        
            4 =      9 =      9 =     18 
              =  15.25 =  15.25 =  30.51 
              =  50.00 =  50.00 =        
              =  23.68 =  42.86 =        
     ============================        
            5 =      2 =      9 =     11 
              =   3.39 =  15.25 =  18.64 
              =  18.18 =  81.82 =        
              =   5.26 =  42.86 =        
     ============================        
     Total          38       21       59 
                 64.41    35.59   100.00

The where statement works with most SAS procedures. The following program prints only records for which the car has a repair rating of 2 or less:

PROC PRINT DATA=auto;
  WHERE (rep78 <= 2);
  VAR make price rep78 ;
RUN; 
 
 OBS    MAKE                 price    rep78
   3    AMC Spirit            3799      .
  10    Buick Opel            4453      .
  15    Cad. Eldorado        14500      2
  20    Chev. Monte Carlo     5104      2
  21    Chev. Monza           3667      2
  28    Dodge Diplomat        4010      2
  29    Dodge Magnum          5886      2
  30    Dodge St. Regis       6342      2
  51    Olds Starfire         4195      1
  53    Peugeot 604          12990      .
  57    Plym. Sapporo         6486      .
  58    Plym. Volare          4060      2
  60    Pont. Firebird        4934      1
  63    Pont. Phoenix         4424      .
  64    Pont. Sunbird         4172      2 

3. Missing values and the where statement

In the example above, note that some of the records print a '.' instead of a value for rep78. These are records where rep78 is missing. SAS stores missing values for numeric variables as '.' and treats them as negative infinity, or the lowest number possible. To exclude missing values, modify the where statement as follows (the rep78 ^= . indicates rep78 is not equal to missing).

PROC PRINT DATA=auto;
  WHERE (rep78 <= 2) and (rep78 ^= .) ;
  VAR make price rep78 ;
RUN;   

Note that there are no missing values in the listing.

OBS    MAKE                 price    rep78
  15    Cad. Eldorado        14500      2
  20    Chev. Monte Carlo     5104      2
  21    Chev. Monza           3667      2
  28    Dodge Diplomat        4010      2
  29    Dodge Magnum          5886      2
  30    Dodge St. Regis       6342      2
  51    Olds Starfire         4195      1
  58    Plym. Volare          4060      2
  60    Pont. Firebird        4934      1
  64    Pont. Sunbird         4172      2

Similarly, this where statement yields the same result:

PROC PRINT DATA=auto; 
  WHERE (. < rep78 <= 2); 
  VAR make price rep78 ;
RUN;   

4. More complex where statements

This program generates summary statistics for price, but only for cars with repair histories of 1 or 2:

PROC MEANS DATA=auto;
  WHERE (rep78 = 1) OR (rep78 = 2) ;
  VAR price ;
RUN;   

Here is the output from the proc means. By default, proc means will generate the following statistics: mean, minimum and maximum values, standard deviation, and the number of non-missing values for the analysis variable (in this case price).

Analysis Variable : price
N          Mean       Std Dev       Minimum       Maximum
----------------------------------------------------------
10       5687.00       3216.38       3667.00      14500.00
----------------------------------------------------------

To see summary statistics for price for cars with repair histories of 3, 4 or 5, modify the where statement accordingly:

PROC MEANS DATA=auto;
  WHERE (rep78 = 3) or (rep78 = 4) or (rep78 = 5) ; 
  VAR price ;
RUN;   

Or:

PROC MEANS DATA=auto; 
  WHERE (3 <= rep78 <= 5) ;
  VAR price ;
RUN;  

Analysis Variable : price
   N          Mean       Std Dev       Minimum       Maximum
  ----------------------------------------------------------
  59       6223.85       2880.45       3291.00      15906.00
  ----------------------------------------------------------

The where statement also works with the in operator as follows:

PROC MEANS DATA=auto; 
  WHERE rep78 in (3,4,5); 
  VAR price ;
RUN; 

5. Problems to look out for

Be careful when using less than or less than or equal or not equal when you have missing data.  Be sure to separately exclude the missing cases if you want them excluded.

How to cite this page

Report an error on this page

UCLA Researchers are invited to our Statistical Consulting Services
We recommend others to our list of Other Resources for Statistical Computing Help
These pages are Copyrighted (c) by UCLA Academic Technology Services


The content of this web site should not be construed as an endorsement of any particular web site, book, or software product by the University of California.