SPSS Learning Module
Descriptive statistics

1. Introduction

This module demonstrates how to obtain basic descriptive statistics using SPSS. The output shown is what you would get by running the program in batch mode in SPSS 6.1 on an AIX machine. If you run the same program on a PC under Windows with SPSS 7.5 or higher, the tables will be formatted at publication quality; the output will look different but will contain the same information. We will use a data file with data on 26 automobiles: their make, price, mpg, repair record, and whether the car is foreign or domestic. The data file is presented below.

MAKE PRICE MPG REP78 FOREIGN
AMC    4099 22 3 0
AMC    4749 17 3 0
AMC    3799 22 3 0
Audi   9690 17 5 1
Audi   6295 23 3 1
BMW    9735 25 4 1
Buick  4816 20 3 0
Buick  7827 15 4 0
Buick  5788 18 3 0
Buick  4453 26 3 0
Buick  5189 20 3 0
Buick 10372 16 3 0
Buick  4082 19 3 0
Cad.  11385 14 3 0
Cad.  14500 14 2 0
Cad.  15906 21 3 0
Chev.  3299 29 3 0
Chev.  5705 16 4 0
Chev.  4504 22 3 0
Chev.  5104 22 2 0
Chev.  3667 24 2 0
Chev.  3955 19 3 0
Datsun 6229 23 4 1
Datsun 4589 35 5 1
Datsun 5079 24 4 1
Datsun 8129 21 4 1

The program below reads the data and creates a temporary SPSS .sav file. All of the descriptive statistics shown in this module are computed from this file. The list of variables on the DATA LIST command is make (A8) price * mpg * rep78 * foreign * . The (A8) following make indicates that make is a character variable with a width of 8. The * following each of the other variables means that they are numeric variables. These specifications are used with FREE, which indicates "free field" input.

DATA LIST FREE/
   make  (A8) price * mpg * rep78 * foreign * .
BEGIN DATA.
AMC    4099 22 3 0
AMC    4749 17 3 0
AMC    3799 22 3 0
Audi   9690 17 5 1
Audi   6295 23 3 1
BMW    9735 25 4 1
Buick  4816 20 3 0
Buick  7827 15 4 0
Buick  5788 18 3 0
Buick  4453 26 3 0
Buick  5189 20 3 0
Buick 10372 16 3 0
Buick  4082 19 3 0
Cad.  11385 14 3 0
Cad.  14500 14 2 0
Cad.  15906 21 3 0
Chev.  3299 29 3 0
Chev.  5705 16 4 0
Chev.  4504 22 3 0
Chev.  5104 22 2 0
Chev.  3667 24 2 0
Chev.  3955 19 3 0
Datsun 6229 23 4 1
Datsun 4589 35 5 1
Datsun 5079 24 4 1
Datsun 8129 21 4 1
END DATA.
EXECUTE.

LIST 
  /CASES=10.
EXECUTE.
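
If you want to reuse the data in a later session instead of re-reading the inline data each time, one option is to save the working file to disk and read it back with GET. This is a minimal sketch; the file name auto.sav is a hypothetical choice and is not part of the original program.

```spss
* Save the working data file to disk (auto.sav is a hypothetical file name).
SAVE OUTFILE='auto.sav'.

* In a later session, read the saved file back in and list the first 10 cases.
GET FILE='auto.sav'.
LIST
  /CASES=10.
```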

The output of the LIST command is shown below. Because of /CASES=10, only the first 10 cases are listed; you can compare them with the first 10 data lines in the program.

MAKE        PRICE      MPG    REP78  FOREIGN
AMC       4099.00    22.00     3.00      .00
AMC       4749.00    17.00     3.00      .00
AMC       3799.00    22.00     3.00      .00
Audi      9690.00    17.00     5.00     1.00
Audi      6295.00    23.00     3.00     1.00
BMW       9735.00    25.00     4.00     1.00
Buick     4816.00    20.00     3.00      .00
Buick     7827.00    15.00     4.00      .00
Buick     5788.00    18.00     3.00      .00
Buick     4453.00    26.00     3.00      .00

2. Using FREQUENCIES or CROSSTABS for counts

Both of these commands are used for obtaining information on the number of cases that have a certain characteristic.

FREQUENCIES

This command is used to obtain counts on a single variable's values.

CROSSTABS

This command is used to obtain counts for combinations of values of two or more variables, for example the number of foreign cars with good repair records and the number of domestic cars with poor repair records.

We can use FREQUENCIES to produce tables of counts for individual variables. Below, we use it to make frequency tables for make, rep78 and foreign. Because a command name can be abbreviated to as few as three characters, as long as the abbreviation is unique to that command, FREQUENCIES can be shortened; we use FREQ here. The VAR subcommand is on a separate line and preceded by a slash ( / ). Subcommands may also be placed on the same line as the command name. The first subcommand does not have to be preceded by a slash, but doing so is a good habit.

FREQ
  /VAR= make.

FREQ
  /VAR= rep78.

FREQ
  /VAR= foreign.
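
To illustrate the placement rules just described, the sketch below writes the first FREQUENCIES command in two other layouts that should be equivalent: with the subcommand on the same line as the command name, and without the optional slash before the first subcommand. VARIABLES is the full spelling of the VAR subcommand used above.

```spss
* Subcommand on the same line as the command name.
FREQ /VAR=make.

* Leading slash omitted and keywords spelled out in full.
FREQUENCIES VARIABLES=make.
```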

Here is the output produced by the FREQUENCIES commands above.

MAKE
                                                Valid     Cum
Value Label       Value  Frequency  Percent  Percent  Percent
                AMC              3     11.5     11.5     11.5
                Audi             2      7.7      7.7     19.2
                BMW              1      3.8      3.8     23.1
                Buick            7     26.9     26.9     50.0
                Cad.             3     11.5     11.5     61.5
                Chev.            6     23.1     23.1     84.6
                Datsun           4     15.4     15.4    100.0
                            -------  -------  -------
                   Total        26    100.0    100.0

Valid cases      26      Missing cases      0

REP78
                                                 Valid     Cum
Value Label         Value  Frequency  Percent  Percent  Percent
                      2.00         3     11.5     11.5     11.5
                      3.00        15     57.7     57.7     69.2
                      4.00         6     23.1     23.1     92.3
                      5.00         2      7.7      7.7    100.0
                             -------  -------  -------
                     Total        26    100.0    100.0

Valid cases      26      Missing cases      0

FOREIGN
                                                Valid     Cum
Value Label        Value  Frequency  Percent  Percent  Percent
                     .00        19     73.1     73.1     73.1
                    1.00         7     26.9     26.9    100.0
                           -------  -------  -------
                   Total        26    100.0    100.0

Valid cases      26      Missing cases      0

Instead of running three separate FREQUENCIES commands, we could have requested all three tables in one FREQUENCIES command, as illustrated below.

FREQ
  /VAR=  make rep78 foreign .

Let's use CROSSTABS to look at a cross tabulation of the repair history of the cars (rep78) for foreign and domestic cars (foreign).  The CROSSTABS command for this is shown below.

CROSSTABS
  /TABLES=rep78  BY foreign.

This is the output produced.

REP78  by  FOREIGN

                    FOREIGN      Page 1 of 1
            Count  |
                   |
                   |                    Row
                   |     .00|    1.00| Total
REP78      --------+--------+--------+
             2.00  |     3  |        |     3
                   |        |        |  11.5
                   +--------+--------+
             3.00  |    14  |     1  |    15
                   |        |        |  57.7
                   +--------+--------+
             4.00  |     2  |     4  |     6
                   |        |        |  23.1
                   +--------+--------+
             5.00  |        |     2  |     2
                   |        |        |   7.7
                   +--------+--------+
            Column      19        7       26
             Total    73.1     26.9    100.0

Number of Missing Observations:  0

We can also show cell percentages to provide more information. The COUNT, ROW, COLUMN and TOTAL specifications on the CELLS subcommand request the count along with the row percentages, column percentages and total percentages. Note that the specifications come after the = on the CELLS subcommand. Generally the form is "subcommand=specifications list". Subcommands are preceded by a / (slash).

CROSSTABS
  /TABLES=rep78  BY foreign
  /CELLS= COUNT ROW COLUMN TOTAL .

The output is shown below.

REP78  by  FOREIGN

                    FOREIGN      Page 1 of 1
            Count  |
           Row Pct |
           Col Pct |                    Row
           Tot Pct |     .00|    1.00| Total
REP78      --------+--------+--------+
             2.00  |     3  |        |     3
                   | 100.0  |        |  11.5
                   |  15.8  |        |
                   |  11.5  |        |
                   +--------+--------+
             3.00  |    14  |     1  |    15
                   |  93.3  |   6.7  |  57.7
                   |  73.7  |  14.3  |
                   |  53.8  |   3.8  |
                   +--------+--------+
             4.00  |     2  |     4  |     6
                   |  33.3  |  66.7  |  23.1
                   |  10.5  |  57.1  |
                   |   7.7  |  15.4  |
                   +--------+--------+
             5.00  |        |     2  |     2
                   |        | 100.0  |   7.7
                   |        |  28.6  |
                   |        |   7.7  |
                   +--------+--------+
            Column      19        7       26
             Total    73.1     26.9    100.0

Number of Missing Observations:  0

The order of the options does not matter.  We would have gotten the same output had we written the command like this...

CROSSTABS
  /TABLES=rep78  BY foreign
  /CELLS= TOTAL COUNT ROW COLUMN  .
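
To read one cell of the table above: for rep78 = 3.00 and foreign = .00 the count is 14, so the row percentage is 14/15 = 93.3, the column percentage is 14/19 = 73.7, and the total percentage is 14/26 = 53.8. If only one kind of percentage is needed, the CELLS list can be shortened. The sketch below, which assumes the same working file, requests counts and column percentages only.

```spss
* Counts and column percentages only.
* Each percentage is the cell count divided by its column total.
CROSSTABS
  /TABLES=rep78  BY foreign
  /CELLS= COUNT COLUMN .
```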

3. Using DESCRIPTIVES or MEANS for summary statistics

Both of these procedures are used for obtaining descriptive statistics like means and standard deviations.

DESCRIPTIVES

This command is used to obtain descriptive statistics on a single variable.

MEANS

This command is used to obtain descriptive statistics on a variable at different levels of another variable, for example the mean mpg separately for foreign cars and domestic cars.

To produce summary statistics, DESCRIPTIVES can be used.   Below, DESCRIPTIVES is used to get descriptive statistics for the variable mpg.

DESCRIPTIVES
  /VAR=mpg   .

The results of the DESCRIPTIVES are shown below.

Number of valid observations (listwise) =        26.00

                                                   Valid
Variable      Mean    Std Dev   Minimum   Maximum      N  Label
MPG          20.92       4.76     14.00     35.00     26
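
DESCRIPTIVES also accepts a list of variables, and a STATISTICS subcommand that chooses which statistics are printed. The following is a hedged sketch that assumes the same working file. The keywords name the four statistics that already appear in the default output above, so the example mainly shows where other requests would go.

```spss
* Summarize price and mpg in one command.
* MEAN, STDDEV, MIN and MAX are the statistics shown in the default output.
DESCRIPTIVES
  /VAR=price mpg
  /STATISTICS=MEAN STDDEV MIN MAX .
```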

Suppose we would like to get the summary statistics separately for foreign and domestic cars (indicated by the variable foreign).   We can use the MEANS command and list foreign after the keyword BY on the TABLES subcommand. The example below will produce separate results for the different values of foreign.


MEANS
  /TABLES=mpg  BY foreign  .

As you see below, the results are presented separately for the 7 foreign cars (foreign equals 1) and the 19 domestic cars (when foreign is 0).

                 - - Description of Subpopulations - -

Summaries of     MPG
By levels of     FOREIGN

Variable      Value  Label            Mean    Std Dev    Cases
For Entire Population              20.9231     4.7575       26
FOREIGN         .00                19.7895     4.0357       19
FOREIGN        1.00                24.0000     5.5076        7
  Total Cases = 26
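
The Label column in the output above is blank because no value labels have been defined for foreign. One way to make the output easier to read is to define them first with VALUE LABELS. This is a sketch; the label text 'Domestic' and 'Foreign' is our own wording, chosen to match the coding used in this module (0 for domestic, 1 for foreign).

```spss
* Attach descriptive labels to the two values of foreign.
VALUE LABELS foreign 0 'Domestic' 1 'Foreign'.

* Re-run the comparison so that the labels appear in the output.
MEANS
  /TABLES=mpg  BY foreign  .
```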

4. Using EXAMINE for detailed summary statistics

You can use EXAMINE to get more detailed summary statistics. I assume that beginners using SPSS syntax mode will most probably be connecting by telnet to a UNIX machine. Without access to X-Windows, you will likely be restricted to low-resolution graphics, so the output from the program below is presented as low-resolution graphics. If you are running in batch on UNIX, you should include the following command in your program.

SET HIGHRES=OFF .  

DO NOT include this command if you are running on the PC or have access to SPSS Dialog Box mode on UNIX.

EXAMINE
  /VARIABLES=mpg .

And here are the results of the EXAMINE.

     MPG
 Valid cases:        26.0   Missing cases:        .0   Percent missing:      .0

 Mean       20.9231  Std Err      .9330  Min   14.00  Skewness     .9355
 Median     21.0000  Variance   22.6338  Max   35.00  S E Skew     .4556
 5% Trim    20.6026  Std Dev     4.7575  Range 21.00  Kurtosis    1.7927
 95% CI for Mean (19.0015, 22.8447)      IQR    6.25  S E Kurt     .8865

 Frequency    Stem &  Leaf
      .00        1 t
     3.00        1 f  445
     4.00        1 s  6677
     3.00        1 .  899
     4.00        2 *  0011
     6.00        2 t  222233
     3.00        2 f  445
     1.00        2 s  6
     1.00        2 .  9
     1.00 Extremes    (35)

 Stem width:     10.00
 Each leaf:       1 case(s)

           |
           |     (O) Case: 24
           |
           |
           |
           |
           |
        30 +
           |    --+--
           |      |
           |      |
           |      |
           |      |
           |      |
           |      |
           |    +-+-+
           |    |   |
           |    | * |
        20 +    |   |
           |    |   |
           |    |   |
           |    +-+-+
           |      |
           |      |
           |      |
           |    --+--
           |
        10 +
           |
           |
           +------------------------------------------------

  Variable     MPG

 N of Cases      26.00

To obtain separate EXAMINE results for foreign and domestic cars, add BY foreign to the VARIABLES subcommand. The BY keyword works this way with some, but not all, SPSS commands.

EXAMINE
  /VARIABLES=mpg BY foreign.

As you see in the output below, you first get a complete set of output for overall mpg. This is followed by complete output for the cars where foreign is 0, and then another set of output for the cars where foreign is 1. Finally, you get side-by-side box plots, one for each level of foreign.

     MPG
 Valid cases:        26.0   Missing cases:        .0   Percent missing:      .0

 Mean     20.9231  Std Err    .9330  Min    14.00  Skewness     .9355
 Median   21.0000  Variance 22.6338  Max    35.00  S E Skew     .4556
 5% Trim  20.6026  Std Dev   4.7575  Range  21.00  Kurtosis    1.7927
 95% CI for Mean (19.0015, 22.8447)  IQR     6.25  S E Kurt     .8865

 Frequency    Stem &  Leaf
      .00        1 t
     3.00        1 f  445
     4.00        1 s  6677
     3.00        1 .  899
     4.00        2 *  0011
     6.00        2 t  222233
     3.00        2 f  445
     1.00        2 s  6
     1.00        2 .  9
     1.00 Extremes    (35)

 Stem width:     10.00
 Each leaf:       1 case(s)

           |
           |
           |     (O) Case: 24
           |
           |
           |
           |
           |
        30 +
           |    --+--
           |      |
           |      |
           |      |
           |      |
           |      |
           |      |
           |    +-+-+
           |    |   |
           |    | * |
        20 +    |   |
           |    |   |
           |    |   |
           |    +-+-+
           |      |
           |      |
           |      |
           |    --+--
           |
        10 +
           |
           |
           +--------------------------------------------------------------------

  Variable     MPG

 N of Cases      26.00

    Symbol Key:       *    - Median     (O)  - Outlier    (E)  - Extreme

     MPG
 By  FOREIGN        .00
 Valid cases:        19.0   Missing cases:        .0   Percent missing:      .0

 Mean     19.7895  Std Err    .9258  Min    14.00 Skewness    .4774
 Median   20.0000  Variance 16.2865  Max    29.00 S E Skew    .5238
 5% Trim  19.5994  Std Dev   4.0357  Range  15.00 Kurtosis    .0412
 95% CI for Mean (17.8443, 21.7346)  IQR     6.00 S E Kurt   1.0143

 Frequency    Stem &  Leaf

     2.00        1 *  44
     7.00        1 .  5667899
     8.00        2 *  00122224
     2.00        2 .  69

 Stem width:     10.00
 Each leaf:       1 case(s)

     MPG
 By  FOREIGN       1.00

 Valid cases:     7.0   Missing cases:   .0   Percent missing:  .0

 Mean     24.0000  Std Err   2.0817  Min   17.000  Skewness  1.3408
 Median   23.0000  Variance 30.3333  Max   35.000  S E Skew   .7937
 5% Trim  23.7778  Std Dev   5.5076  Range 18.000  Kurtosis  3.2861
 95% CI for Mean (18.9063, 29.0937)  IQR    4.000  S E Kurt  1.5875

 Frequency    Stem &  Leaf

     1.00 Extremes    (17)
     4.00        2 *  1334
     1.00        2 .  5
     1.00 Extremes    (35)

 Stem width:     10.00
 Each leaf:       1 case(s)

           |
           |                        (E) Case: 24
           |
           |
           |
           |
           |
        30 +
           |    --+--
M          |      |
P          |      |
G          |      |
           |      |                --+--
           |      |                +-+-+
           |      |                |   |
           |      |                | * |
           |    +-+-+              +-+-+
           |    |   |              --+--
        20 +    | * |
           |    |   |
           |    |   |
           |    |   |               (O) Case: 4
           |    +-+-+
           |      |
           |      |
           |    --+--
           |
        10 +
           |
           |
           +--------------------------------------------------------------------

  FOREIGN        .00                1.00

 N of Cases      19.00               7.00
   Symbol Key:       *    - Median     (O)  - Outlier    (E)  - Extreme
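
If the overall results are not needed, EXAMINE can be asked to skip them and to limit the plots. The sketch below assumes that the NOTOTAL subcommand and the PLOT=BOXPLOT specification are available in your version of SPSS. It should produce statistics for each level of foreign plus the side-by-side box plots, without repeating the output for all 26 cars or printing the stem-and-leaf displays.

```spss
* Per-group results only, with box plots as the only plot.
EXAMINE
  /VARIABLES=mpg BY foreign
  /PLOT=BOXPLOT
  /NOTOTAL .
```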

5. Problems to look out for

6. For more information


These pages are Copyrighted (c) by UCLA Academic Technology Services

