UCLA Academic Technology Services

Statistica Class Notes
Analyzing Data

1.0 Demonstration and explanation

For this section we will be using the hs1.sta data set that we worked with in previous sections.

t-tests

This is the one-sample t-test, testing whether the sample of writing scores was drawn from a population with a mean of 50.
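Statistica is menu driven, so these notes contain no code. As a rough illustration of what the one-sample t-test computes, here is a minimal Python sketch using only the standard library. The function name `one_sample_t` and the scores are made up for illustration and are not taken from hs1.sta. The p-value, which comes from the t distribution with the returned degrees of freedom, is left out.

```python
import math
import statistics

def one_sample_t(sample, popmean):
    """Return (t statistic, degrees of freedom) for a one-sample t-test."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    t = (mean - popmean) / (sd / math.sqrt(n))
    return t, n - 1

# Hypothetical writing scores, NOT taken from hs1.sta
write = [52, 59, 33, 44, 52, 52, 59, 46, 57, 55]
t_stat, df = one_sample_t(write, 50)
print(f"t = {t_stat:.3f}, df = {df}")
```

A t statistic near zero means the sample mean is close to the hypothesized value of 50 relative to its standard error.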

This is the two-sample independent t-test with separate (unequal) variances.
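To make the "separate variances" idea concrete, the following is a minimal Python sketch of the Welch form of the two-sample t statistic, together with the Welch-Satterthwaite approximation to the degrees of freedom. The function name `welch_t` and both groups of scores are hypothetical; the notes do not say which variables were compared.

```python
import math
import statistics

def welch_t(a, b):
    """Return (t statistic, approximate df) without assuming equal variances."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)  # sample variances
    se2 = va / na + vb / nb  # squared standard error of the mean difference
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2)
    # Welch-Satterthwaite approximation; usually not a whole number
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

# Hypothetical scores for two independent groups, NOT taken from hs1.sta
group_a = [52, 59, 33, 44, 52, 57]
group_b = [46, 57, 55, 65, 60, 63, 61]
t_stat, df = welch_t(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {df:.2f}")
```

Because each group keeps its own variance estimate, the degrees of freedom shrink when the two variances or sample sizes are very different.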

This is the paired t-test, testing whether or not the mean of write equals the mean of science.
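A paired t-test is a one-sample t-test on the within-subject differences, tested against zero. The sketch below shows that reduction in Python. The function name `paired_t` and the scores are hypothetical stand-ins for write and science, not values from hs1.sta.

```python
import math
import statistics

def paired_t(x, y):
    """Return (t statistic, degrees of freedom) for paired observations."""
    diffs = [a - b for a, b in zip(x, y)]  # one difference per subject
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Hypothetical paired scores for the same students, NOT taken from hs1.sta
write = [52, 59, 33, 44, 52, 57, 60]
science = [47, 63, 58, 53, 53, 63, 53]
t_stat, df = paired_t(write, science)
print(f"t = {t_stat:.3f}, df = {df}")
```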

ANOVA

In this example we perform a one-way analysis of variance (ANOVA).
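As a rough guide to what the one-way ANOVA table contains, this minimal Python sketch computes the F ratio from the between-group and within-group sums of squares. The function name `one_way_anova_f` and the three groups are hypothetical; the p-value from the F distribution is omitted.

```python
import statistics

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the scores around their own group mean
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical scores in three groups, NOT taken from hs1.sta
groups = [[52, 59, 44, 46], [57, 55, 65, 60], [33, 44, 49, 41]]
f_stat, df1, df2 = one_way_anova_f(groups)
print(f"F({df1}, {df2}) = {f_stat:.3f}")
```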

In this example we perform a two-way analysis of variance (ANOVA).  The plot option creates plots of the means, which can be a great visual aid to understanding the data.

The Tukey test is used to test all of the pairwise comparisons among the levels of prog by ses.

Now we will do an analysis of covariance (ANCOVA). Note that the results are exactly the same as in the regression where write and science are regressed on math.

Regression

This is plain old OLS regression.
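For readers who want to see the arithmetic behind OLS, here is a minimal Python sketch for the single-predictor case. That restriction is an assumption made to keep the example short, since the notes do not list the predictors used. The function name `ols_simple` and the data are hypothetical.

```python
import statistics

def ols_simple(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = my - slope * mx  # the fitted line passes through the means
    return intercept, slope

# Hypothetical predictor and outcome scores, NOT taken from hs1.sta
math_score = [41, 53, 54, 47, 57, 51, 42, 45]
write_score = [52, 59, 33, 44, 52, 52, 59, 46]
a, b = ols_simple(math_score, write_score)
print(f"write = {a:.2f} + {b:.2f} * math")
```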

It is often very useful to look at the plot of standardized residuals versus standardized predicted values in order to look for outliers and to check for homogeneity of variance.  Ideally, no observations fall beyond the reference lines, which suggests that there are no outliers.  We would also like the points on the plot to be scattered randomly with no visible pattern, which suggests that the model has captured the systematic variance and that the spread of the residuals is roughly constant.
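The quantities on the two axes of this plot can be sketched in Python as follows. This is a simplified illustration that assumes one predictor, divides each residual by the residual standard error, and flags points beyond plus or minus 2. Both the scaling and the cutoff are assumptions, and Statistica may define them differently. The names and data are hypothetical.

```python
import math
import statistics

def standardized_points(x, y):
    """Fit y = a + b*x; return (standardized predicted, standardized residual) pairs."""
    mx, my = statistics.mean(x), statistics.mean(y)
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    intercept = my - slope * mx
    predicted = [intercept + slope * xi for xi in x]
    residuals = [yi - pi for yi, pi in zip(y, predicted)]
    # Residual standard error: sqrt(SSE / (n - 2)) with one predictor and an intercept
    rse = math.sqrt(sum(r ** 2 for r in residuals) / (len(x) - 2))
    mean_pred, sd_pred = statistics.mean(predicted), statistics.stdev(predicted)
    return [((p - mean_pred) / sd_pred, r / rse) for p, r in zip(predicted, residuals)]

CUTOFF = 2.0  # assumed position of the reference lines

# Hypothetical data, NOT taken from hs1.sta
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
points = standardized_points(x, y)
outliers = [pt for pt in points if abs(pt[1]) > CUTOFF]
print(f"{len(outliers)} point(s) beyond the reference lines")
```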

We will save the unstandardized residuals to a new data set.

The P-P plots command produces a normal probability plot.  It is a graphical method of checking whether the residuals from the regression are normally distributed.

The Q-Q plots command produces a normal quantile plot. It is another graphical method of checking whether the residuals are normally distributed. The normal quantile plot is more sensitive to deviations from normality in the tails of the distribution, whereas the normal probability plot is more sensitive to deviations near the mean of the distribution.
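The difference between the two plots is easier to see from the coordinates they display. The Python sketch below computes both sets of points for a list of residuals, using the plotting positions (i - 0.5) / n. That is one common convention and an assumption here, since packages differ. The function names and residuals are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def normal_pp_points(values):
    """Return (observed cumulative proportion, expected normal probability) pairs."""
    n = len(values)
    fitted = NormalDist(mean(values), stdev(values))
    return [((i - 0.5) / n, fitted.cdf(v))
            for i, v in enumerate(sorted(values), start=1)]

def normal_qq_points(values):
    """Return (theoretical standard normal quantile, observed value) pairs."""
    n = len(values)
    std_normal = NormalDist()
    return [(std_normal.inv_cdf((i - 0.5) / n), v)
            for i, v in enumerate(sorted(values), start=1)]

# Hypothetical residuals, NOT taken from hs1.sta
residuals = [-7.2, 3.1, 0.4, -1.8, 5.9, -0.6, 2.2, -2.0]
for theoretical, observed in normal_qq_points(residuals):
    print(f"{theoretical:7.3f} {observed:7.2f}")
```

In both plots, points lying close to a straight line are consistent with normally distributed residuals. The P-P plot works on the probability scale, which compresses the tails, while the Q-Q plot works on the scale of the data, which stretches them.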

Logistic regression

Logistic regression requires a dependent variable that is dichotomous (i.e., has only two values).  As we do not have such a variable in our data set, we will create one called honcomp (honors composition). This is purely for illustrative purposes!  First, we copy the variable write, and then we recode the copy, which we call honcomp.
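The copy-and-recode step amounts to the following, shown as a minimal Python sketch. The cutoff of 60 is a hypothetical value chosen for illustration, since the recoding rule is not stated here, and the scores are made up rather than taken from hs1.sta.

```python
HONORS_CUTOFF = 60  # hypothetical cutoff; the rule is not stated in these notes

def recode_dichotomous(values, cutoff):
    """Return 1 for values at or above the cutoff and 0 otherwise."""
    return [1 if v >= cutoff else 0 for v in values]

# Hypothetical writing scores, NOT taken from hs1.sta
write = [52, 59, 33, 65, 60, 62, 57]
honcomp = recode_dichotomous(write, HONORS_CUTOFF)  # write itself is left unchanged
print(honcomp)
```

Building a new list rather than overwriting the original mirrors the step of copying write before recoding it.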

Non-parametric tests

The Wilcoxon signed-rank test is the nonparametric analog of the paired t-test.
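As an illustration of how this test replaces the paired differences with ranks, here is a minimal Python sketch that computes the positive and negative rank sums. It drops zero differences and gives tied absolute differences their average rank, which is a common convention but an assumption here. The function names and data are hypothetical, and the p-value is omitted.

```python
def average_rank(value, pool):
    """Rank of value within pool, averaging over ties (1 = smallest)."""
    less = sum(1 for p in pool if p < value)
    equal = sum(1 for p in pool if p == value)
    return less + (equal + 1) / 2

def signed_rank_sums(x, y):
    """Return (sum of ranks of positive differences, sum for negative differences)."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # zero differences are dropped
    abs_diffs = [abs(d) for d in diffs]
    ranks = [average_rank(a, abs_diffs) for a in abs_diffs]
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus

# Hypothetical paired scores, NOT taken from hs1.sta
write = [52, 59, 33, 44, 52, 57, 60]
science = [47, 63, 58, 53, 53, 63, 53]
w_plus, w_minus = signed_rank_sums(write, science)
print(f"W+ = {w_plus}, W- = {w_minus}")
```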

The Mann-Whitney U test is the nonparametric analog of the independent two-sample t-test.
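The U statistic has a simple interpretation: it counts the pairs, one observation from each group, in which the first group's value is the larger one, with ties counting one half. A minimal Python sketch of that counting definition follows. The function name and both groups are hypothetical, and the p-value is omitted.

```python
def mann_whitney_u(a, b):
    """Return (U for group a, U for group b) by direct pair counting."""
    u_a = 0.0
    for x in a:
        for y in b:
            if x > y:
                u_a += 1.0
            elif x == y:
                u_a += 0.5  # a tie counts as half a win for each group
    u_b = len(a) * len(b) - u_a  # the two U values sum to n_a * n_b
    return u_a, u_b

# Hypothetical scores for two independent groups, NOT taken from hs1.sta
group_a = [52, 59, 33, 44, 52, 57]
group_b = [46, 57, 55, 65, 60, 63, 61]
u_a, u_b = mann_whitney_u(group_a, group_b)
print(f"U_a = {u_a}, U_b = {u_b}")
```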

The Kruskal-Wallis test is the nonparametric analog of the one-way ANOVA.
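To show how this test works on ranks rather than raw scores, here is a minimal Python sketch of the H statistic. It ranks all observations together and omits the correction for ties, which is a simplification. The function names and groups are hypothetical, and the p-value, usually taken from a chi-square distribution with the returned degrees of freedom, is left out.

```python
def average_rank(value, pool):
    """Rank of value within pool, averaging over ties (1 = smallest)."""
    less = sum(1 for p in pool if p < value)
    equal = sum(1 for p in pool if p == value)
    return less + (equal + 1) / 2

def kruskal_wallis_h(groups):
    """Return (H statistic without tie correction, degrees of freedom)."""
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    total = 0.0
    for g in groups:
        rank_sum = sum(average_rank(v, pooled) for v in g)  # ranks from the pooled data
        total += rank_sum ** 2 / len(g)
    h = 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)
    return h, len(groups) - 1

# Hypothetical scores in three groups, NOT taken from hs1.sta
groups = [[52, 59, 44, 46], [57, 55, 65, 60], [33, 43, 49, 41]]
h_stat, df = kruskal_wallis_h(groups)
print(f"H = {h_stat:.3f}, df = {df}")
```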

Note that you will have to scroll to the left in the results to see the output from the Kruskal-Wallis test, or you can use the folder system on the left of the output to move to the Kruskal-Wallis results.


These pages are Copyrighted (c) by UCLA Academic Technology Services

