UCLA Academic Technology Services

SPSS Class Notes
Exploring Data


1.0 SPSS commands used in this unit

descriptives: procedure for obtaining means, standard deviations, etc.
compute: creates new numeric variables
filter: excludes certain cases from the analysis
use all: uses all cases in the data set
means: calculates means for different groups
examine: procedure for obtaining descriptive statistics
graph: general procedure for creating graphs
frequencies: calculates frequencies
crosstabs: calculates crosstabulations
correlations: calculates correlations

2.0 Demonstration and explanation

In this unit we will explore our data set.  By "explore", we mean computing some descriptive statistics on the variables that will be important to the analysis that we plan to run.  This exploration is very important, because it allows us to become familiar with our data.  Also, if there are any problems with the data, such as out-of-range values, we can discover them.

Let's begin by opening the data file.

  • File
      Open
        select the C: drive, the spss_data folder, and hs0.sav
* open the data file.
get file "c:\spss_data\hs0.sav".

We will begin by getting the descriptive statistics for some of the variables.

  • Analyze
      Descriptive Statistics
        Descriptives...
          select gender read write math science
* descriptives for some of the variables.
descriptives
  variables=gender read write math science.
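
If the main goal is to spot out-of-range values, the output can be narrowed to the statistics that reveal them. The following is a minimal sketch rather than part of the original class notes; it assumes you know the range each variable is expected to fall in.

* show only the minimum and maximum of each variable.
descriptives
  variables=gender read write math science
  /statistics=min max.

A minimum or maximum outside the expected range for a variable is a sign that the data need checking.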

Now we will do the same thing, but we will look only at the records for students who earned reading scores of 60 or above.

  • Data
      Select Cases...
        select "if condition is satisfied"
          if read >= 60
  • Analyze
      Descriptive Statistics
        Descriptives...
          select gender read write math science
* create a filter for reading scores 60 
* and above and recompute the 
* descriptive statistics.
compute f_read60=(read >=  60).
filter by f_read60.
execute.
descriptives
  variables=gender read write math science.
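
A filter stays in effect until it is turned off or replaced by another filter. As an alternative sketch (not part of the original class notes), the temporary command can be combined with select if so that the selection applies only to the next procedure.

* select cases for the next procedure only.
temporary.
select if (read >= 60).
descriptives
  variables=gender read write math science.

Without the temporary command, select if would permanently drop the other cases from the working data set, so the order of these two commands matters.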

For the next example, we will select a different set of cases to be analyzed.  We will begin by using all of the cases, and then provide the new selection criteria.

  • Data
      Select Cases...
        select "all cases"
  • Data
     Select Cases...
      select "if condition is satisfied" 
       if prgtype = "academic"
  • Analyze
     Descriptive Statistics
      Descriptives...
       select gender read write math science
* turn off the previous filter (with the 
* "filter off" command), restore all cases 
* (with the "use all" command), then create 
* a new filter and recompute the 
* descriptive statistics.
filter off.
use all.
compute f_acad=(prgtype="academic").
filter by f_acad.
execute.
descriptives
  variables=gender read write math science.
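
String comparisons such as prgtype="academic" are exact and case-sensitive, so a value spelled "Academic" would not be selected. A quick way to check the spellings that actually occur is to list the values of prgtype with all cases included. This is a small sketch, not part of the original class notes.

* list the values of prgtype using all cases.
filter off.
frequencies variables = prgtype.

Note that this turns the filter off; run "filter by f_acad." again to restore the selection.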

Instead of selecting cases based on the value of a variable, we will now select cases by their position in the data set.  As before, we will start by resetting the selection criteria to include all cases.  Next, we will specify the range of case numbers that we want included in the analysis.

  • Data
      Select Cases...
       select "all cases"
  • Data 
     Select Cases...
      select "based on time or case range"
       range 1 to 40
  • Analyze
     Descriptive Statistics
      Descriptives...
       select gender read write math science

 

* after removing the previous filter, 
* select the first 40 cases.
filter off.
use 1 thru 40.
execute.


descriptives
  variables=gender read write math science.
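
The same selection can also be expressed as a filter variable built from the system variable $casenum, which holds each case's sequence number. This is a sketch of one alternative, not the method used in the class notes; f_first40 is a hypothetical variable name that does not exist in hs0.sav.

* select the first 40 cases with a filter variable.
* f_first40 is a hypothetical name chosen for this example.
use all.
compute f_first40=($casenum <= 40).
filter by f_first40.
execute.
descriptives
  variables=gender read write math science.
filter off.

The final "filter off" returns to using all cases, so later commands are not affected.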

Now we are going to move on to some different types of analyses.  We will begin by using all of the cases in the data set.  Then we will compare the means of the variables read, write, math and science, broken down by prgtype.

  • Data
     Select Cases...
      select "all cases"
  • Analyze
     Compare Means
      Means...
       select read write math science as the dependent variables
       select prgtype as the independent variable
* compare means using all cases.
use all.
 
means tables = read write math science by prgtype.
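
By default, the means command reports the mean, the number of cases and the standard deviation for each group. As a hedged sketch that goes slightly beyond the class notes, the cells subcommand can request other statistics, which is useful when looking for unusual values within a group.

* request specific cell statistics for each program type.
means tables = read write math science by prgtype
 /cells = mean median min max count.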

We can do some basic graphics, such as stem and leaf plots, boxplots and histograms.

  • Analyze
     Descriptive Statistics 
      Explore...
       select write as the dependent variable
        click "plots..." button
         select "stem and leaf"
  • Graphs
     Legacy Dialogs
      Boxplot...
       select "simple" and "summaries for groups of cases"
        click on "define"
         select write as the variable and gender as the category axis
  • Graphs
     Legacy Dialogs
       Histogram...
        select write and check "Display normal curve" box
  • Analyze
     Descriptive Statistics
      Frequencies...
       select ses
        click on "Charts"
         select "histograms"
  • Analyze
     Descriptive Statistics
      Frequencies...
       select write
        click on "Charts"
         select "histograms"
* stem and leaf plot.
examine variables = write
 /plot stemleaf.

* boxplot.
examine variables = write by gender
 /plot = boxplot
 /statistics = none.



* histogram.
graph
 /histogram(normal) = write.
* histograms using the frequencies command.
frequencies variables = ses
 /histogram.


frequencies variables = write
 /histogram.
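
For a variable with many distinct values, such as write, the frequency table that accompanies the histogram can be long. The sketch below, which is not part of the original class notes, suppresses the table and overlays a normal curve on the histogram.

* histogram for write without the frequency table.
frequencies variables = write
 /format = notable
 /histogram = normal.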

Now we will look at some crosstabulations and correlations.

  • Analyze
     Descriptive Statistics
      Crosstabs...
       select prgtype for the rows and ses for the columns
        OK
  • Analyze
     Correlate
      Bivariate...
       select read write math science
  • Analyze
     Correlate
      Bivariate...
       select read write math science
        click on "Options..."
         select "Exclude cases listwise"
* crosstabs.
crosstabs
 /tables = prgtype by ses.

* correlations.
correlations
 /variables=read write math science.

* changing from pairwise to listwise 
* deletion of missing data.
correlations
 /variables=read write math science
 /missing=listwise.
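
The crosstabulation above shows only the counts in each cell. When the groups differ in size, percentages are often easier to compare. As a minimal sketch beyond the class notes, the cells subcommand can add row percentages to the table.

* crosstabs with counts and row percentages.
crosstabs
 /tables = prgtype by ses
 /cells = count row.

Here each row percentage is the share of students within one program type who fall into each ses category.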

Let's do some more graphics.  The graphical representation of a correlation is a scatterplot, so let's try a couple of those.

  • Graphs
     Legacy Dialogs
      Scatter/Dot...
       Simple Scatter
        click on "Define"
         select write for the y-axis and read for the x-axis
  • Graphs
     Legacy Dialogs
      Scatter/Dot...
       Matrix Scatter
        Define
         select read math science write as matrix variables
* scatterplot.
graph
 /scatterplot = read with write.
 
* scatterplot matrix.
graph
 /scatterplot(matrix) = read write math science.
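
A scatterplot can also distinguish groups of cases by marker. The following sketch is not part of the original class notes; it assumes that gender is a suitable grouping variable and uses the "by" keyword of the graph command to set the markers.

* scatterplot with markers set by gender.
graph
 /scatterplot(bivar) = read with write by gender.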

3.0 Syntax version

* opening the data file.
get file "c:\spss_data\hs0.sav".
* descriptives for some of the variables.
descriptives
  variables=gender read write math science.

* create a filter for reading scores 60 and above and
* recompute the descriptive statistics.
compute f_read60=(read >=  60).
filter by f_read60.
execute.

descriptives
  variables=gender read write math science.

* after turning off the previous filter (with the "filter off" command) and
* restoring all cases (with the "use all" command), create a new filter and
* recompute the descriptive statistics.
filter off.
use all.

compute f_acad=(prgtype="academic").
filter by f_acad.
execute.

descriptives
  variables=gender read write math science.

* after removing the previous filter, select the first 40 cases.
filter off.

use 1 thru 40.
execute.

descriptives
  variables=gender read write math science.

* compare means using all cases.
use all.

means tables = read write math science by prgtype.

* stem and leaf plot.
examine variables = write
 /plot stemleaf.

* boxplot.
examine variables = write by gender
 /plot = boxplot
 /statistics = none.

* histogram.
graph
 /histogram(normal) = write.

* histograms using the frequencies command.
frequencies variables = ses
 /histogram.

frequencies variables = write
 /histogram.

* crosstabs.
crosstabs
 /tables = prgtype by ses.

* correlations.
correlations
 /variables=read write math science.

* changing from pairwise to listwise deletion of missing data.
correlations
 /variables=read write math science
 /missing=listwise.

* scatterplot.
graph
 /scatterplot = read with write.

* SPSS does not provide code for including sunflowers on the graph.

* scatterplot matrix.
graph
 /scatterplot(matrix) = read write math science.

4.0 For more information


These pages are Copyrighted (c) by UCLA Academic Technology Services


The content of this web site should not be construed as an endorsement of any particular web site, book, or software product by the University of California.