UCLA Academic Technology Services

SPSS Class Notes
Managing Data


1.0 SPSS commands used in this unit

select if      keeps only the selected cases in the current data file
descriptives   procedure for obtaining means, standard deviations, etc.
save outfile   saves the current data file with a new name
display        displays requested file information
frequencies    calculates frequencies
add files      appends (stacks) data files (adds cases)
sort cases     sorts cases in the data file
match files    merges data files (adds variables)

2.0 Demonstration and explanation

In this unit we will illustrate methods for subsetting data (in other words, using only some of the cases), appending data (adding cases from another SPSS data file), and merging data (adding variables from another SPSS data file). 

2.1  Subsetting cases

Let's open the data file.

  • File
     Open
      Data
       choose c:\spss_data\hs1.sav
get file "c:\spss_data\hs1.sav".

Let's pretend that we are working on our honors thesis and that we want to study just "good readers", that is, those with reading scores of 60 or higher. Having opened the file, we will now "select cases" to keep only the students with reading scores of 60 or higher.

  • Data
     Select Cases
      click "If condition is satisfied" and then click the "If" button
       read >= 60
        Continue
          choose "Delete unselected cases"
* keeping cases for which students have a reading
* score of 60 or higher.
select if read >=60.
descriptives 
 /var=read.

Notice that the undesired cases have now been deleted.  Now we will save our data.

  • File
     Save as
      c:\spss_data\hsgoodread.sav
* saving the data file.
save outfile "c:\spss_data\hsgoodread.sav".
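
Because select if permanently removes the unselected cases from the working file, a non-destructive alternative can be useful when the subset is needed for only one analysis. The following is a minimal sketch, not part of the original class notes, that uses the temporary command so that the selection applies only to the next procedure. It assumes the same hs1.sav file and read variable used above.

```spss
* Sketch: analyze only good readers without deleting any cases.
get file "c:\spss_data\hs1.sav".

* temporary makes the following select if apply to the next procedure only.
temporary.
select if read >= 60.
descriptives
 /variables=read.

* the selection has lapsed, so this run uses all cases again.
descriptives
 /variables=read.
```

If the same subset is needed for several analyses in a row, the filter by command with an indicator variable is another option that leaves the data intact.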


2.2  Subsetting variables

Let's open the hs1 data file again.

  • File
     Open
      Data
       choose c:\spss_data\hs1.sav
get file "c:\spss_data\hs1.sav".

We want to keep just some of the variables: id, female, read, and write.  We keep these variables in the same procedure that we use to save the data file.  Note that you can also choose "keep all" instead of "drop all" if that is more helpful to you.

  • File
     Save as
      choose c:\spss_data\hskept.sav
       variables
        drop all
         click check boxes next to id, female, read, write
          Save
  • File
     Display Data File Information
      Working File
* pretend we have 2000 variables and we want to keep just
* some of the variables.  We want to keep just the variables
* id female read write.
save outfile = "c:\spss_data\hskept.sav" 
 /keep=id  female read write.
display names.
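
Note that /keep on save outfile affects only the file written to disk. The working file still contains every variable, so display names issued right after the save lists them all. As a hedged sketch (not from the original notes), one way to confirm what was actually saved is to reopen the new file. The second half shows an alternative under the same assumptions about file names: get file also accepts a /keep subcommand, which reads in only the listed variables.

```spss
* reopen the saved subset and list its variables.
get file "c:\spss_data\hskept.sav".
display names.

* alternative: read only the wanted variables straight from the full file.
get file "c:\spss_data\hs1.sav"
 /keep=id female read write.
display names.
```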

2.3  Appending

Let's suppose we are working on our master's thesis.  There are two files, one for the males (hsmale.sav) and one for the females (hsfemale.sav).  We would like to combine these files.  We will start by opening the file with the data for the males.

  • File
     Open
      Data
       c:\spss_data\hsmale.sav
* have one file with males, females in another file 
* and need to "append" the files.
get file "c:\spss_data\hsmale.sav".

A frequency table shows that the variable female (which indicates gender) is a constant.  This is what we would expect in a file with data only for males.

  • Analyze
     Descriptive Statistics
      Frequencies
       select female
freq 
 /var=female.

Now we can append the files.

  • Data
     Merge Files
      Add Cases
       An external SPSS Statistics data file
        choose c:\spss_data\hsfemale.sav
         Continue
add files 
 /file=* 
 /file="c:\spss_data\hsfemale.sav".

We will now save the data file with a new name.

  • File
     Save As
      c:\spss_data\hsmasters.sav
save outfile "c:\spss_data\hsmasters.sav".
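
After appending, it can be worth checking that both groups are present in the combined file. The sketch below is an illustration rather than part of the original notes: it repeats the append with an /in subcommand after each /file, which creates a 0/1 flag recording which file each case came from. The flag names frommale and fromfemale are hypothetical and chosen only for this example.

```spss
* Sketch: append the files and flag the source of each case.
* frommale and fromfemale are illustrative names, not variables in the data.
get file "c:\spss_data\hsmale.sav".
add files
 /file=*
 /in=frommale
 /file="c:\spss_data\hsfemale.sav"
 /in=fromfemale.

* female should no longer be a constant, and should line up with the flags.
frequencies
 /variables=female.
crosstabs
 /tables=female by frommale.
```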

2.4  Merging

Now let's suppose that we are working on our dissertation.  The data are in two files, one with the demographic information (hsdem.sav) and one with the test scores (hstest.sav).  We would like to match merge these files based on id.  Before we can match merge these files, we need to open each file, sort it on id, and then save the sorted file.

  • File
     Open
      Data
       c:\spss_data\hsdem.sav
  • Data
     Sort Cases
      choose id
  • File
     Save As
      c:\spss_data\hsdem.sav
* one file has demographic information, the other has 
* test scores and we want to "match merge" the files.
get file "c:\spss_data\hsdem.sav".

sort cases by id.
 
save outfile "c:\spss_data\hsdem.sav".

Now that we have sorted and saved the first file (hsdem.sav), we will do the same thing for the second file (hstest.sav).

  • File
     Open
      Data
       c:\spss_data\hstest.sav
  • Data
     Sort Cases
      choose id
  • File
     Save As
      c:\spss_data\hstest.sav
get file "c:\spss_data\hstest.sav".

sort cases by id.

save outfile "c:\spss_data\hstest.sav".

Finally, we will open the first file (hsdem.sav) and merge it with the second file (hstest.sav).  We will save the merged data file with the name hsdiss.sav.

  • File
     Open
      Data
       c:\spss_data\hsdem.sav
get file "c:\spss_data\hsdem.sav".

It is important that we merge the data sets by the same variable on which we sorted the two files.

  • Data
     Merge Files
      Add Variables
       An external SPSS Statistics data file
        choose c:\spss_data\hstest.sav
         Continue
          click "match cases on key variable"
           move id as key variable
            click "Indicate case as source variable" 
            (name it fromtest)
match files 
 /file=* 
 /in=fromdem 
 /file="c:\spss_data\hstest.sav" 
 /in=fromtest
 /by id.

Finally, we will save the data file with a new name.

  • File
     Save As
      c:\spss_data\hsdiss.sav
save outfile "c:\spss_data\hsdiss.sav".
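
Before relying on the merged file, it helps to check how many cases were found in both source files. The following sketch, offered as one possible check rather than a required step, uses the fromdem and fromtest flags created by the /in subcommands above. It assumes the merged data are still the working file.

```spss
* Sketch: run after the match files command above.
* cases with a 0 in either flag were found in only one of the two files.
crosstabs
 /tables=fromdem by fromtest.

* list the ids of any unmatched cases without deleting anything.
temporary.
select if (fromdem = 0 or fromtest = 0).
list variables=id fromdem fromtest.
```

If every case appears in the cell where both flags equal 1, each id was present in both hsdem.sav and hstest.sav.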

3.0 Syntax version

* working on honors thesis.
* want to make a subset just keeping those who have read >= 60.
get file "c:\spss_data\hs1.sav".

* keeping cases for which students have a reading score of 60 or higher.
select if read >=60.
descriptives 
 /var=read.
save outfile "c:\spss_data\hsgoodread.sav".

* pretend we have 2000 variables and we want to keep just some of the variables.
* we want to keep just the variables id female read write.
* reopen the full file first so the saved subset of variables covers all cases.
get file "c:\spss_data\hs1.sav".
save outfile = "c:\spss_data\hskept.sav" 
 /keep=id  female read write.
display names.
get file "c:\spss_data\hskept.sav".
display names.

* extra example not in point and click.
* we want to drop just the variables ses and prog.
get file "c:\spss_data\hsgoodread.sav".
save outfile "c:\spss_data\hsdropped.sav" 
 /drop=ses prog.
display names.
get file "c:\spss_data\hsdropped.sav".
display names.

* have one file with males, females in another file and need to "append" the files.
get file "c:\spss_data\hsmale.sav".
freq 
 /var=female.

add files 
 /file=* 
 /file="c:\spss_data\hsfemale.sav".
freq 
 /var=female.

save outfile "c:\spss_data\hsmasters.sav".

* one file has demographic information, the other has test scores and we want to "match merge" the files.
get file "c:\spss_data\hsdem.sav".
list cases from 1 to 10.

sort cases by id.
save outfile "c:\spss_data\hsdem.sav".

get file "c:\spss_data\hstest.sav".
list cases from 1 to 10.

sort cases by id.
save outfile "c:\spss_data\hstest.sav".

get file "c:\spss_data\hsdem.sav".
match files 
 /file=* 
 /in=fromdem 
 /file="c:\spss_data\hstest.sav" 
 /in=fromtest
 /by id.

list cases from 1 to 10.

list variables id fromdem fromtest.
crosstab 
 /tables=fromdem by fromtest.

save outfile "c:\spss_data\hsdiss.sav".

These pages are Copyrighted (c) by UCLA Academic Technology Services

