
SPSS FAQ
What kinds of new variables can I make with the create command?

The create command has many functions that are useful for making new variables.  Below is a list of these functions.

Function name   Action
CSUM            Cumulative sum
DIFF            Difference
FFT             Fast Fourier transform
IFFT            Inverse fast Fourier transform
LAG             Lag
LEAD            Lead
MA              Centered moving averages
PMA             Prior moving averages
RMED            Running medians
SDIFF           Seasonal difference
T4253H          Smoothing
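
The moving average functions in this list are not demonstrated on this page, so here is a rough Python sketch of the idea behind them. This is not SPSS code. The helper names pma and ma, the made-up values, and the handling of the end points (left missing) are all assumptions made for illustration; SPSS's own treatment of end points and even spans may differ.

```python
from statistics import mean

def pma(series, span):
    """Prior moving average: mean of the `span` values before each case.
    Cases without `span` earlier values are left missing (None)."""
    return [
        mean(series[i - span:i]) if i >= span else None
        for i in range(len(series))
    ]

def ma(series, span):
    """Centered moving average for an odd `span`: mean of the case itself
    and (span - 1) / 2 neighbors on each side; end points are left missing."""
    half = span // 2
    n = len(series)
    return [
        mean(series[i - half:i + half + 1]) if half <= i < n - half else None
        for i in range(n)
    ]

values = [2, 4, 6, 8, 10]   # made-up values, not from hsb2

print(pma(values, 2))  # [None, None, 3, 5, 7]
print(ma(values, 3))   # [None, 4, 6, 8, None]
```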

Let's use the hsb2 data set to make new variables with some of these functions.  We will start by deleting the variables that we will not be using.  After making each new variable, we will use the list command to show the first few cases of the original and new variables.

delete variables female ses schtyp prog read write math science.

We will start with the function for cumulative sum. 

create v1 = csum(socst).
list socst v1
 /cases from 1 to 7.
    socst           v1 
 
    57.00        57.00 
    61.00       118.00 
    31.00       149.00 
    56.00       205.00 
    61.00       266.00 
    61.00       327.00 
    61.00       388.00 
 
Number of cases read:  7    Number of cases listed:  7
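
To make the arithmetic behind csum explicit, here is a small Python sketch (not SPSS code) that reproduces v1 from the seven socst values listed above. Each value of v1 is the sum of socst for the current case and all earlier cases.

```python
from itertools import accumulate

# First seven socst values, copied from the listing above.
socst = [57, 61, 31, 56, 61, 61, 61]

# Running total: each element is the sum of all values up to that case.
v1 = list(accumulate(socst))

print(v1)  # [57, 118, 149, 205, 266, 327, 388]
```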

The diff function creates a variable containing the differences between successive values of the original variable.  The order of differencing must be specified.  In this example, we will make two new variables.  The first, v2, is differenced once, so its first case is missing.  The second, v3, is differenced twice (it is the difference of the differences), so its first two cases are missing.

create v2 = diff(socst, 1)
 /v3 =diff(socst, 2).
list socst v2 v3
 /cases from 1 to 7.
    socst         v2         v3 
 
    57.00        .          . 
    61.00       4.00        . 
    31.00     -30.00     -34.00 
    56.00      25.00      55.00 
    61.00       5.00     -20.00 
    61.00        .00      -5.00 
    61.00        .00        .00 
 
Number of cases read:  7    Number of cases listed:  7
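
The following Python sketch (not SPSS code) shows one way to think about the order of differencing: differencing twice is the same as differencing the once-differenced series. The helper name diff and the use of None for a missing value are choices made for this illustration.

```python
def diff(series, order=1):
    """Difference a series `order` times; None marks a missing value."""
    out = list(series)
    for _ in range(order):
        # Each pass subtracts the previous case from the current case.
        out = [None] + [
            None if a is None or b is None else b - a
            for a, b in zip(out, out[1:])
        ]
    return out

# First seven socst values, copied from the listing above.
socst = [57, 61, 31, 56, 61, 61, 61]

v2 = diff(socst, 1)
v3 = diff(socst, 2)

print(v2)  # [None, 4, -30, 25, 5, 0, 0]
print(v3)  # [None, None, -34, 55, -20, -5, 0]
```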

The lag function can be used to make variables with lags of various lengths.  The order of the lag must be specified.  If multiple variables covering a range of lags are desired, the end points of the range can be specified.  In the first example, v4 contains the values of socst lagged by three cases.  In the second example, three new variables are made: v5 contains the once-lagged values of socst, v6 contains the twice-lagged values, and v7 contains the thrice-lagged values, so it is the same as v4.

create v4 = lag(socst, 3).
create v5 to v7 = lag(socst, 1, 3).
list socst v4 to v7
 /cases from 1 to 7.
    socst        v4        v5        v6        v7 
 
    57.00       .         .         .         . 
    61.00       .       57.00       .         . 
    31.00       .       61.00     57.00       . 
    56.00     57.00     31.00     61.00     57.00 
    61.00     61.00     56.00     31.00     61.00 
    61.00     31.00     61.00     56.00     31.00 
    61.00     56.00     61.00     61.00     56.00 
 
Number of cases read:  7    Number of cases listed:  7
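
As a cross-check on the listing above, here is a Python sketch (not SPSS code) of what lagging does: the values are shifted down by k cases and the first k cases become missing. The helper name lag and the dictionary of results are only for illustration.

```python
def lag(series, k):
    """Shift values down by k cases; the first k cases become missing (None)."""
    n = len(series)
    k = min(k, n)
    return [None] * k + list(series[:n - k])

# First seven socst values, copied from the listing above.
socst = [57, 61, 31, 56, 61, 61, 61]

v4 = lag(socst, 3)

# A range of lags, 1 through 3, stored under the names v5, v6 and v7.
lags = {f"v{4 + k}": lag(socst, k) for k in range(1, 4)}

print(v4)          # [None, None, None, 57, 61, 31, 56]
print(lags["v5"])  # [None, 57, 61, 31, 56, 61, 61]
print(lags["v6"])  # [None, None, 57, 61, 31, 56, 61]
print(lags["v7"] == v4)  # True
```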

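The lead function uses the same syntax as the lag function, but it takes values from later cases rather than earlier ones.  In this example, we use a lead of 2.  The last two values of v8 shown below (36 and 51) come from cases 8 and 9, which are not listed.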

create v8 = lead(socst, 2).
list socst v8
 /cases from 1 to 7.
    socst        v8 
 
    57.00     31.00 
    61.00     56.00 
    31.00     61.00 
    56.00     61.00 
    61.00     61.00 
    61.00     36.00 
    61.00     51.00 
 
Number of cases read:  7    Number of cases listed:  7
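
Here is a Python sketch (not SPSS code) of what a lead does: the values are shifted up by k cases and the last k cases become missing. The helper name lead is only for illustration. The eighth and ninth socst values (36 and 51) are not listed directly; they are inferred from the last two values of v8 in the output above.

```python
def lead(series, k):
    """Shift values up by k cases; the last k cases become missing (None)."""
    n = len(series)
    k = min(k, n)
    return list(series[k:]) + [None] * k

# Seven listed socst values plus cases 8 and 9 (36, 51), which are
# inferred from the v8 column above rather than listed directly.
socst = [57, 61, 31, 56, 61, 61, 61, 36, 51]

v8 = lead(socst, 2)

print(v8[:7])  # [31, 56, 61, 61, 61, 36, 51]
```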

The create command can be combined with the split file command, so that the functions operate within groups of cases.  The data must first be sorted by the grouping variable.  In the example below, the lag function is used.  As expected, v9 is missing for the first case within each level of the variable race.

sort cases by race.
split file by race.
create v9 = lag(socst, 1).
split file off.
list race socst v9
 /cases from 1 to 40.
    race     socst        v9 
 
     1.00     36.00       . 
     1.00     61.00     36.00 
     1.00     46.00     61.00 
     1.00     36.00     46.00 
     1.00     51.00     36.00 
     1.00     46.00     51.00 
     1.00     42.00     46.00 
     1.00     46.00     42.00 
     1.00     51.00     46.00 
     1.00     36.00     51.00 
     1.00     31.00     36.00 
     1.00     56.00     31.00 
     1.00     56.00     56.00 
     1.00     48.00     56.00 
     1.00     41.00     48.00 
     1.00     51.00     41.00 
     1.00     66.00     51.00 
     1.00     51.00     66.00 
     1.00     41.00     51.00 
     1.00     51.00     41.00 
     1.00     41.00     51.00 
     1.00     41.00     41.00 
     1.00     61.00     41.00 
     1.00     61.00     61.00 
     2.00     41.00       . 
     2.00     56.00     41.00 
     2.00     46.00     56.00 
     2.00     41.00     46.00 
     2.00     56.00     41.00 
     2.00     51.00     56.00 
     2.00     56.00     51.00 
     2.00     71.00     56.00 
     2.00     36.00     71.00 
     2.00     51.00     36.00 
     2.00     56.00     51.00 
     3.00     61.00       . 
     3.00     51.00     61.00 
     3.00     56.00     51.00 
     3.00     56.00     56.00 
     3.00     31.00     56.00 
 
Number of cases read:  40    Number of cases listed:  40
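
The effect of split file can be sketched in Python (not SPSS code) as applying the lag separately within each run of cases that share the same group value. The helper name lag_within_groups is hypothetical, and the data are a shortened example that uses only the first few socst values from each race group in the listing above.

```python
from itertools import groupby

def lag_within_groups(groups, values, k=1):
    """Lag `values` by k cases separately within each run of equal `groups`.
    Assumes the cases are already sorted by group."""
    out = []
    for _, run in groupby(zip(groups, values), key=lambda pair: pair[0]):
        vals = [v for _, v in run]
        kk = min(k, len(vals))
        # The first kk cases of every group have no earlier case to borrow from.
        out.extend([None] * kk + vals[:len(vals) - kk])
    return out

# Shortened example: first three cases of race 1, first two of race 2 and 3.
race = [1, 1, 1, 2, 2, 3, 3]
socst = [36, 61, 46, 41, 56, 61, 51]

v9 = lag_within_groups(race, socst)

print(v9)  # [None, 36, 61, None, 41, None, 61]
```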

When the create command makes a new variable, it also labels that variable.  This is very useful if you are making many new variables.


These pages are Copyrighted (c) by UCLA Academic Technology Services