
How can SPSS help me document my data?

The **codebook** command was introduced in SPSS version 17. It
provides information about the variables in a dataset, such as the type,
variable labels, and value labels, as well as the number of cases in each level of
categorical variables and the means and standard deviations of continuous variables.
This information can be as important as the data themselves, because it helps to
give meaning to the data. Also, this information can help you distinguish
between two similar datasets.

The examples below will use the hsb1.sav dataset. Let's start by looking at the Variable View.

get file "D:\data\hsb1.sav".

You can access the **codebook** command via the point-and-click interface by
clicking on Analyze -> Reports -> Codebook.

Let's consider the syntax below. Although it may look complicated, only
the command itself is necessary. If you issue the **codebook** command
by itself, you will get the variable information for all of the variables in the
dataset; counts and percents for all categories of nominal and ordinal
variables; and means, standard deviations and quartiles for scale variables.
This may be more output than you want, so you may prefer to select which
variables and what information about them you would like to see. In the
example below, we have selected five variables from our dataset. In square
brackets ( [] ) after each variable name, we have indicated the measurement
level. Scale variables (also known as continuous variables) are indicated with an **s**,
ordinal variables with an **o**, and nominal variables with an **n**; ordinal and
nominal variables are both categorical. The measurement
level specified in the command may or may not match that shown in the Variable
View. For example, the variable **socst** has a nominal measurement in the
Variable View; however, in the **codebook** command below, we have
specified it as a scale variable. The type of measurement determines what will
be provided in the output for the variable: counts and percents for all
categories of nominal and ordinal variables; means, standard deviations and
quartiles for scale variables.
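
As a minimal sketch of the default behavior described above, the syntax below opens the data file and issues the **codebook** command by itself. It assumes the same file path used earlier on this page; adjust the path to wherever your copy of the dataset is stored.

```
* Open the data file, then document every variable using the defaults.
get file "D:\data\hsb1.sav".
codebook.
```

Because no variables or subcommands are specified, the output covers every variable in the dataset, which can be lengthy for large files.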

On the **varinfo** subcommand, we request some of the information that we see in
the Variable View. On the **fileinfo** subcommand, we request information on
the data file itself, such as the name of the data file, its location, the file
label, any documents attached to the data file and a count of the number of
cases in the dataset. On the **statistics** subcommand, we request the count
and percent, which gives the number of cases and percent of cases in each level
of nominal and ordinal variables. We also request the mean and standard
deviation of scale variables.

codebook ses [o] prgtype [n] write [s] science [s] socst [s]
  /varinfo position label type format measure valuelabels missing
  /fileinfo name location label documents casecount
  /statistics count percent mean stddev.

In the example below, we show how to get minimal output (by using the keyword
**none** on the **statistics** subcommand) and how to sort the output
alphabetically by variable name (by specifying **varorder = alpha** on the
**options** subcommand).

codebook ses prgtype science socst
  /varinfo label type valuelabels
  /options varorder = alpha
  /statistics none.

For more information about documenting data in SPSS, please visit
SPSS Learning Modules: Labeling and documenting data and
Statistical Consulting Seminars: Introduction to SPSS Syntax, Part 1 (section 13).