Regression with Stata
Chapter 1
Self Assessment Answers

Question 1.
Make five graphs of api99: histogram, kdensity plot, boxplot, symmetry plot and normal quantile plot.

Answer 1.
First we use the elemapi2 data file.
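A minimal sketch of loading the data, assuming elemapi2.dta has already been saved in the current working directory (the file location is an assumption, not something stated on this page):

```stata
* Load the data file; "clear" drops any data currently in memory.
* Assumes elemapi2.dta is in the working directory.
use elemapi2, clear
```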

Below we make the plots mentioned in question 1.

Histogram
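One way to draw this plot, assuming elemapi2 is already in memory:

```stata
* Histogram of api99 with Stata's default bin choice
histogram api99
```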

kdensity plot
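A sketch of the kernel density estimate, again assuming elemapi2 is in memory:

```stata
* Kernel density estimate of api99 using the default kernel and bandwidth
kdensity api99
```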

boxplot
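With elemapi2 in memory, a boxplot can be requested along these lines:

```stata
* Box plot of api99
graph box api99
```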

symmetry plot
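Assuming the same data are loaded, the symmetry plot can be sketched as:

```stata
* Symmetry plot: distances above the median plotted against distances below it
symplot api99
```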

normal quantile plot
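A sketch of the normal quantile plot, assuming elemapi2 is in memory:

```stata
* Quantiles of api99 plotted against quantiles of a normal distribution
qnorm api99
```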

Question 2.
What is the correlation between api99 and meals?

Answer 2.
Below we use the corr command to get this correlation.
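The command would look roughly like this, assuming elemapi2 is in memory (corr is an abbreviation of correlate):

```stata
* Pearson correlation between api99 and meals
corr api99 meals
```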

Question 3.
Regress api99 on meals. What does the output tell you?

Answer 3.
Below we perform the regression predicting api99 from meals.
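A minimal sketch of this regression, assuming elemapi2 is in memory. In regress, the outcome variable is listed first and the predictor second:

```stata
* Simple linear regression of api99 (outcome) on meals (predictor)
regress api99 meals
```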

We see that the coefficient for meals has a t value of -43 and that it is statistically significant. The coefficient is -4.18 (let's round it to -4.2), so for every unit increase in meals, api99 is predicted to go down by about 4.2 points. In other words, for every one-percentage-point increase in children who receive free meals in a school, the api score for that school would be predicted to decrease by 4.2 points.
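To make this interpretation concrete, the stored coefficient can be inspected directly after the regression. This is an illustrative sketch, assuming elemapi2 is in memory. The 10-point change in meals is an arbitrary value chosen for the example:

```stata
* Refit the model without printing the output table
quietly regress api99 meals

* Slope for meals: predicted change in api99 per one-unit increase in meals
display _b[meals]

* Predicted change in api99 for a 10-point increase in meals (illustrative value)
display 10*_b[meals]
```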

Question 4.
Create and list the fitted (predicted) values.

Answer 4.
We can create the predicted values using the predict command, as shown below.
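A sketch of this step, assuming elemapi2 is in memory. The variable name fv is a hypothetical choice for this example. By default, predict after regress stores the linear prediction (fitted values):

```stata
* Fit the model first; predict uses the most recent estimation results
regress api99 meals

* Store the fitted values in a new variable (fv is a hypothetical name)
predict fv
```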

We can view the first 20 predicted and actual values for api99 like this.
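Assuming the predicted values were stored in a variable named fv (a hypothetical name for this example), the listing could be written as:

```stata
* Show actual and predicted api99 for observations 1 through 20
list api99 fv in 1/20
```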

Question 5.
Graph meals and api99 with and without the regression line.

Answer 5.
We can graph api99 by meals like this.
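A minimal sketch, assuming elemapi2 is in memory. The y variable is listed first and the x variable last:

```stata
* Scatterplot with api99 on the vertical axis and meals on the horizontal axis
scatter api99 meals
```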

We can show a graph of api99 by meals with a regression line using the scatter program (assuming you installed it as shown in chapter 1) like this.
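The exact command used on the original page is not shown here. As a substitute in current versions of Stata, the built-in twoway command can overlay a fitted line using lfit. This is a swapped-in technique and an assumption about what would give an equivalent graph:

```stata
* Overlay the scatterplot with the least-squares line of api99 on meals
twoway (scatter api99 meals) (lfit api99 meals)
```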

Question 6.
Look at the correlations among the variables api99 meals ell avg_ed using the corr and pwcorr commands. Explain how these commands are different. Make a scatterplot matrix for these variables and relate the correlation results to the scatterplot matrix.

Answer 6.
We first show the output using the corr command.
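Assuming elemapi2 is in memory, the command would look like this:

```stata
* Correlation matrix using only observations complete on all four variables
corr api99 meals ell avg_ed
```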

Now we use the pwcorr command.
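The pairwise version takes the same variable list (again assuming elemapi2 is in memory):

```stata
* Correlation matrix where each pair uses all observations valid for that pair
pwcorr api99 meals ell avg_ed
```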

It is hard to see the differences unless we add the obs option to pwcorr, which shows the number of observations used for each correlation.
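A sketch of the same command with the option added, assuming elemapi2 is in memory:

```stata
* obs prints the number of observations beneath each correlation
pwcorr api99 meals ell avg_ed, obs
```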

The corr command performs listwise deletion, so all of the correlations are based on the listwise n of 381. The pwcorr command performs pairwise deletion and bases each correlation on the number of valid observations for that pair. For example, api99 and meals have 400 valid pairs, but api99 and avg_ed have 381 valid pairs.
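The two sample sizes can be checked by counting non-missing cases directly. This is an illustrative sketch, assuming elemapi2 is in memory. missing() returns true if any of its arguments is missing:

```stata
* Listwise n: observations with no missing value on any of the four variables
count if !missing(api99, meals, ell, avg_ed)

* Pairwise n for one pair: observations valid on both api99 and avg_ed
count if !missing(api99, avg_ed)
```

The first count corresponds to the sample that corr uses. The second corresponds to the sample that pwcorr uses for that one pair.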

Below we show the scatterplot matrix for api99 meals ell avg_ed.
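One way to produce it, assuming elemapi2 is in memory. The half option is an optional choice made for this sketch, and it draws only the lower triangle of the matrix:

```stata
* Scatterplot matrix of the four variables; half omits the redundant upper triangle
graph matrix api99 meals ell avg_ed, half
```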

The scatterplot matrix is a visual representation of the correlations among the variables. Each scatterplot in the matrix shows one pair of variables, and the corresponding entry in the correlation matrix summarizes the linear relationship seen in that plot.

Question 7.
Perform a regression predicting api99 from meals ell avg_ed. Interpret the output.

Answer 7.
We can run this regression as shown below.
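A minimal sketch of the command, assuming elemapi2 is in memory. The outcome is listed first, followed by the three predictors:

```stata
* Multiple regression of api99 on meals, ell and avg_ed
regress api99 meals ell avg_ed
```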

The t-values for all of these predictors are significant, so each is useful in predicting api99. The coefficient for meals is -3.6 and indicates that, holding the other predictors constant, for every additional percentage point of children who receive free meals, the api score is predicted to be 3.6 points lower. The coefficient for ell is -.9, indicating that, holding the other predictors constant, for every percentage point increase in non-English speaking students, the api score for the school is predicted to be .9 points lower.


These pages are Copyrighted (c) by UCLA Academic Technology Services

