How can SPSS help me document my data?

The codebook command was introduced in SPSS version 17. It provides information about the variables in a dataset, such as the type, variable labels and value labels, as well as the number of cases in each level of categorical variables and the means and standard deviations of continuous variables. This information can be as important as the data themselves, because it gives meaning to the data. It can also help you distinguish between two similar datasets.

The examples below will use the hs1.sav dataset.  Let's start by looking at the Variable View.

get file "D:\data\hs1.sav".
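
If you would rather stay in syntax than switch to the Variable View tab, the display dictionary command prints much of the same information (labels, formats, measurement levels and value labels) to the Output window. This is only a minimal sketch; it assumes the get file command above ran successfully, and the two variables named in the second command are simply taken from the examples on this page.

```spss
* List dictionary information for every variable in the active dataset.
display dictionary.
* Restrict the listing to a couple of variables.
display dictionary /variables = ses socst.
```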

You can access the codebook command via the point-and-click interface by clicking on Analyze -> Reports -> Codebook.

Let's consider the syntax below. Although it may look complicated, only the command itself is necessary. If you issue the codebook command by itself, you will get the variable information for all of the variables in the dataset; counts and percents for all categories of nominal and ordinal variables; and means, standard deviations and quartiles for scale variables. This may be more output than you want, so you may prefer to select which variables and what information about them you would like to see.

In the example below, we have selected several variables from our dataset. In square brackets ( [] ) after a variable name, we can indicate the measurement level; a variable listed without brackets keeps the level defined in the dataset. Scale variables (AKA continuous variables) are indicated with an s, ordinal variables (ordered categories) with an o, and nominal variables (unordered categories) with an n. The measurement level specified in the command may or may not match that shown in the Variable View. For example, in the Variable View the variable socst has a nominal measurement level; however, in the codebook command below, we have specified it as a scale variable. The measurement level determines what will be provided in the output for the variable: counts and percents for all categories of nominal and ordinal variables; means, standard deviations and quartiles for scale variables.

On the varinfo subcommand, we request some of the information that we see in the Variable View.  On the fileinfo subcommand, we request information on the data file itself, such as the name of the data file, its location, the file label, any documents attached to the data file and a count of the number of cases in the dataset.  On the statistics subcommand, we request the count and percent, which gives the number of cases and percent of cases in each level of nominal and ordinal variables.  We also request the mean and standard deviation of scale variables.

codebook ses [o] prgtype write [s] science [s] socst [s]
 /varinfo position label type format measure valuelabels missing
 /fileinfo name location label documents casecount
 /statistics count percent mean stddev.
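
The label and documents keywords on the fileinfo subcommand can only report what has already been stored in the data file. As a sketch, assuming the file has no file label or documents yet, the commands below attach both and then request the file information. The label text and the document text are made up for illustration.

```spss
* Attach a file label to the active dataset (hypothetical text).
file label Example label for the codebook demonstration.
* Attach a document to the active dataset (hypothetical text).
add document "Variable and value labels checked before generating the codebook".
* Report the file-level information, plus the label of one variable.
codebook ses [o]
 /varinfo label
 /fileinfo name location label documents casecount
 /statistics none.
```

The file label and document are kept only if you save the data file afterward.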

In the example below, we show how to get minimal output (by using the keyword none on the statistics subcommand) and how to order the output alphabetically (by specifying varorder = alpha on the options subcommand).

codebook ses prgtype science socst
 /varinfo label type valuelabels
 /options varorder = alpha
 /statistics none.
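
The bracketed measurement levels used earlier apply, as far as we understand the command, only to the codebook run itself and do not change the level stored in the data file. If you would rather correct the stored level, for example for socst, which the Variable View shows as nominal, the variable level command is one option. This is a sketch and not part of the original examples.

```spss
* Store scale as the measurement level for socst in the data dictionary.
variable level socst (scale).
* The brackets are no longer needed for socst to get a mean and standard deviation.
codebook socst
 /varinfo label measure
 /statistics mean stddev.
```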

For more information about documenting data in SPSS, please visit SPSS Learning Modules: Labeling and documenting data and Statistical Consulting Seminars: Introduction to SPSS Syntax, Part 1 (section 13).

