UCLA Academic Technology Services

Stata Paper Examples
Using Graphs Instead of Tables in Political Science
by Jonathan Kastellec and Eduardo Leoni

The abstract for this paper can be found here. The full text can be downloaded for free if you are connecting from a UCLA IP address. 

NOTE: The code for this page was generated in Stata 11.  The graphs from this paper have been replicated in form only.  We present them using a demonstration dataset, NOT the datasets used to generate the graphs in the paper. 

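The code shown on this page uses several user-written commands. Before running the code, you should install the ones it calls: spineplot, ciplot, vioplot, onewayplot, tab3way, estout (which also provides eststo and esttab), parmest (which also provides parmby), eclplot, dsconcat, and sencode, along with the lean1 graph scheme.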
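
As a minimal sketch, the loop below installs the user-written packages called on this page only when they are missing. It assumes every package is hosted on SSC under the name shown. The package list is read off the commands used in the code on this page, and the loop itself is illustrative rather than part of the original instructions.

```stata
* Illustrative helper loop (run from a do-file so that /// continuation works).
* Assumption: each package below is available from SSC under this name.
local pkgs spineplot ciplot vioplot onewayplot tab3way ///
           estout parmest eclplot dsconcat sencode
foreach pkg of local pkgs {
    * -which- returns a nonzero code when the command is not installed
    capture which `pkg'
    if _rc != 0 {
        ssc install `pkg'
    }
}
* The lean1 graph scheme is not covered by this loop and may need to be
* located separately; findit searches for a program by keyword, for example:
findit spineplot
```

If a package turns out not to be on SSC, findit followed by the command name is the fallback.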

For details on how to find and download Stata programs, see our FAQ page on the findit command. These examples use the hsbdemo.dta data file, which you can load from within Stata using the code below.

use http://www.ats.ucla.edu/stat/data/hsbdemo, clear

* table 1
label define yesno 1 "yes" 0 "no"
label values honors yesno
tabulate honors female, column

+-------------------+
| Key               |
|-------------------|
|     frequency     |
| column percentage |
+-------------------+

    honors |        female
   english |      male     female |     Total
-----------+----------------------+----------
        no |        73         74 |       147 
           |     80.22      67.89 |     73.50 
-----------+----------------------+----------
       yes |        18         35 |        53 
           |     19.78      32.11 |     26.50 
-----------+----------------------+----------
     Total |        91        109 |       200 
           |    100.00     100.00 |    100.00


* figure 2 on page 758
set scheme lean1
bysort honors female: gen freq = _N
spineplot female honors, text(freq, mlabsize(*1.4)) ///
           legend(off) yla(0.1 "Male" 0.9 "Female", noticks axis(2)) ///
           ytitle("By Female", axis(2)) xtitle(" ", axis(2)) ///
           barall(xscale(off) xlabel(, valuelabels) ///
           title("By being in honors class"))



* table 2
sum write read math science

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       write |       200      52.775    9.478586         31         67
        read |       200       52.23    10.25294         28         76
        math |       200      52.645    9.368448         33         75
     science |       200       51.85    9.900891         26         74

* figure 3
ciplot write read math science, hor



* table 3
sum  write read math science female honors awards

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
       write |       200      52.775    9.478586         31         67
        read |       200       52.23    10.25294         28         76
        math |       200      52.645    9.368448         33         75
     science |       200       51.85    9.900891         26         74
      female |       200        .545    .4992205          0          1
-------------+--------------------------------------------------------
      honors |       200        .265    .4424407          0          1
      awards |       200        1.67    1.818691          0          7


*figure 4, part 2
vioplot write read math science, hor


*figure 4, part 1
replace schtyp = 0 if schtyp == 2 
collapse (mean) female honors schtyp
onewayplot female honors schtyp, title("Means of Binary Variables") xtitle("Means")


* table 4
/* The Stata Journal (2003) 3, Number 3, pp. 245–269
Confidence intervals and p-values for delivery to the end user */
use http://www.ats.ucla.edu/stat/data/hsbdemo, clear
tab3way prog female ses, coltot

Table entries are cell frequencies
Missing categories ignored

--------------------------------------------------------------
          |                   ses and female                 
type of   | ----- low ----    --- middle ---    ---- high ----
program   |   male  female      male  female      male  female
----------+---------------------------------------------------
  general |      7       9        10      10         4       5
 academic |      4      15        22      22        21      21
 vocation |      4       8        15      16         4       3
    TOTAL |     15      32        47      48        29      29
--------------------------------------------------------------

gen one = 1
collapse (sum) one, by(prog female ses)
egen total = total(one), by(female ses)
gen prop = one/total


* figure 5
twoway (scatter prog prop if female==1) ///
       (scatter prog prop if female==0), by(ses, row(3)) ///
	    legend(order(1 "female" 2 "male") row(1)) scheme(lean1) ///
		ylab(1(1) 3, valuelabels)
graph display, ysize(6)
		

* table 5
use http://www.ats.ucla.edu/stat/data/hsbdemo, clear
	
regress write math female read

      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  3,   196) =   72.52
       Model |  9405.34864     3  3135.11621           Prob > F      =  0.0000
    Residual |  8473.52636   196  43.2322773           R-squared     =  0.5261
-------------+------------------------------           Adj R-squared =  0.5188
       Total |   17878.875   199   89.843593           Root MSE      =  6.5751

------------------------------------------------------------------------------
       write |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        math |   .3974826   .0664037     5.99   0.000      .266525    .5284401
      female |    5.44337   .9349987     5.82   0.000      3.59942    7.287319
        read |   .3252389   .0607348     5.36   0.000     .2054613    .4450166
       _cons |   11.89566   2.862845     4.16   0.000     6.249728     17.5416
------------------------------------------------------------------------------

estimates store m1, title(Model 1)
regress write math female read ses
estimates store m2, title(Full Model)
estout m1 m2, cells(b(star fmt(%8.3f)) se(par fmt(%8.3f))) ///
stats(r2 N, fmt(%6.4f %9.0g)) varlabels(_cons Constant) ///
varwidth(8) modelwidth(10) legend collabels(, none) ///
posthead("") prefoot("") postfoot("")

------------------------------------
                 m1            m2  
------------------------------------

math          0.397***      0.392***
            (0.066)       (0.067)  
female        5.443***      5.521***
            (0.935)       (0.943)  
read          0.325***      0.319***
            (0.061)       (0.062)  
ses                         0.486  
                          (0.683)  
Constant     11.896***     11.481***
            (2.863)       (2.925)  

------------------------------------
r2           0.5261        0.5273  
N               200           200  
------------------------------------

* p<0.05, ** p<0.01, *** p<0.001

clear
capture eststo clear
use http://www.ats.ucla.edu/stat/data/hsbdemo, clear
eststo, title(Model 1): regress write math female read

      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  3,   196) =   72.52
       Model |  9405.34864     3  3135.11621           Prob > F      =  0.0000
    Residual |  8473.52636   196  43.2322773           R-squared     =  0.5261
-------------+------------------------------           Adj R-squared =  0.5188
       Total |   17878.875   199   89.843593           Root MSE      =  6.5751

------------------------------------------------------------------------------
       write |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        math |   .3974826   .0664037     5.99   0.000      .266525    .5284401
      female |    5.44337   .9349987     5.82   0.000      3.59942    7.287319
        read |   .3252389   .0607348     5.36   0.000     .2054613    .4450166
       _cons |   11.89566   2.862845     4.16   0.000     6.249728     17.5416
------------------------------------------------------------------------------
(est1 stored)

eststo, title(Full Model): regress write math female read ses

      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  4,   195) =   54.38
       Model |   9427.3364     4   2356.8341           Prob > F      =  0.0000
    Residual |   8451.5386   195  43.3412236           R-squared     =  0.5273
-------------+------------------------------           Adj R-squared =  0.5176
       Total |   17878.875   199   89.843593           Root MSE      =  6.5834

------------------------------------------------------------------------------
       write |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        math |   .3922032   .0668992     5.86   0.000     .2602643    .5241422
      female |   5.521369   .9425592     5.86   0.000      3.66245    7.380288
        read |   .3185602     .06153     5.18   0.000     .1972105    .4399099
         ses |   .4862292   .6826549     0.71   0.477    -.8601055    1.832564
       _cons |   11.48071   2.925053     3.92   0.000     5.711913    17.24952
------------------------------------------------------------------------------
(est2 stored)

esttab, label nodepvar nonumber

----------------------------------------------------
                          Model 1      Full Model  
----------------------------------------------------
math score                  0.397***        0.392***
                           (5.99)          (5.86)  
female                      5.443***        5.521***
                           (5.82)          (5.86)  
reading score               0.325***        0.319***
                           (5.36)          (5.18)  
ses                                         0.486  
                                           (0.71)  
Constant                    11.90***        11.48***
                           (4.16)          (3.92)  
----------------------------------------------------
Observations                  200             200  
----------------------------------------------------
t statistics in parentheses
* p<0.05, ** p<0.01, *** p<0.001


* figure 6
regress write math female read


      Source |       SS       df       MS              Number of obs =     200
-------------+------------------------------           F(  3,   196) =   72.52
       Model |  9405.34864     3  3135.11621           Prob > F      =  0.0000
    Residual |  8473.52636   196  43.2322773           R-squared     =  0.5261
-------------+------------------------------           Adj R-squared =  0.5188
       Total |   17878.875   199   89.843593           Root MSE      =  6.5751

------------------------------------------------------------------------------
       write |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        math |   .3974826   .0664037     5.99   0.000      .266525    .5284401
      female |    5.44337   .9349987     5.82   0.000      3.59942    7.287319
        read |   .3252389   .0607348     5.36   0.000     .2054613    .4450166
       _cons |   11.89566   2.862845     4.16   0.000     6.249728     17.5416
------------------------------------------------------------------------------


parmest, label list(parm estimate min* max* p) saving(mypars, replace)


     +--------------------------------------------------------+
     |   parm    estimate       min95       max95           p |
     |--------------------------------------------------------|
  1. |   math   .39748258   .26652501   .52844015   1.009e-08 |
  2. | female   5.4433699   3.5994204   7.2873194   2.348e-08 |
  3. |   read   .32523894    .2054613   .44501657   2.381e-07 |
  4. |  _cons   11.895664    6.249728   17.541599   .00004852 |
     +--------------------------------------------------------+
file mypars.dta saved

use mypars, clear
encode parm, gen(parmn)
eclplot estimate min95 max95 parmn if parm!="_cons" , hori ylabel(2(1) 4) ///
text(3.5 7 "R-squared = .53" "Adj R-squared = .52" "n = 200", justification(left))  




* figure 7
use http://www.ats.ucla.edu/stat/data/hsbdemo, clear
tempfile tf1 tf2
parmby "reg write math read if female==0", ///
       lab saving(`tf1',replace) idn(1) ids(male)

Command: reg write math read if female==0

      Source |       SS       df       MS              Number of obs =      91
-------------+------------------------------           F(  2,    88) =   45.09
       Model |  4837.46939     2  2418.73469           Prob > F      =  0.0000
    Residual |  4720.20094    88   53.638647           R-squared     =  0.5061
-------------+------------------------------           Adj R-squared =  0.4949
       Total |  9557.67033    90  106.196337           Root MSE      =  7.3238

------------------------------------------------------------------------------
       write |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        math |   .3932125   .1006582     3.91   0.000     .1931754    .5932495
        read |   .4159189   .0925923     4.49   0.000     .2319113    .5999266
       _cons |   7.331651   4.603424     1.59   0.115    -1.816687    16.47999
------------------------------------------------------------------------------
file ‘tf1’.dta saved

	  
parmby "reg write math read if female==1", ///
       lab saving(`tf2',replace) idn(2) ids(female)

Command: reg write math read if female==1

      Source |       SS       df       MS              Number of obs =     109
-------------+------------------------------           F(  2,   106) =   52.10
       Model |  3541.83282     2  1770.91641           Prob > F      =  0.0000
    Residual |  3603.15801   106  33.9920567           R-squared     =  0.4957
-------------+------------------------------           Adj R-squared =  0.4862
       Total |  7144.99083   108  66.1573225           Root MSE      =  5.8303

------------------------------------------------------------------------------
       write |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        math |    .419655    .087195     4.81   0.000     .2467825    .5925275
        read |   .2306052   .0793334     2.91   0.004      .073319    .3878914
       _cons |    21.0731   3.370705     6.25   0.000     14.39034    27.75585
------------------------------------------------------------------------------
file ‘tf2’.dta saved

dsconcat `tf1' `tf2'

sencode idstr, gene(modtype)
sencode label, gene(predictor)
lab var modtype "Gender"
lab var predictor "Predictor"

replace predictor = predictor + .1 if modtype ==2

drop if parm=="_cons"

twoway (rcap min95 max95 predictor if modtype==1, hor) ///
	(scatter predictor estimate if modtype==1) ///
	(rcap min95 max95 predictor if modtype==2, hor) ///
	(scatter predictor estimate if modtype==2), ///
	yscale(range(0 3)) ylabel(1 2, valuelabel) ///
	legend(order(2 "Male" 4 "Female") row(1) ring(0)) scheme(lean1) ///
	xscale(range(-.4 .8)) xlabel(-.2(.2) .6) ///
	xline(0, lpattern(shortdash))
	  
	

Below is code for replicating graphs like Figure 6 when using a logistic model or when plotting marginal effects.

*logit model
use http://www.ats.ucla.edu/stat/data/hsbdemo, clear
logit honors math science read

Logistic regression                               Number of obs   =        200
                                                  LR chi2(3)      =      74.30
                                                  Prob > chi2     =     0.0000
Log likelihood = -78.493853                       Pseudo R2       =     0.3212

------------------------------------------------------------------------------
      honors |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        math |   .1156084   .0318287     3.63   0.000     .0532253    .1779915
     science |   .0305099   .0294797     1.03   0.301    -.0272692     .088289
        read |   .0626137   .0267384     2.34   0.019     .0102074      .11502
       _cons |  -12.53017   1.869501    -6.70   0.000    -16.19432   -8.866016
------------------------------------------------------------------------------


*margins, post
parmest, label list(parm estimate min* max* p) eform saving(mypars, replace)

     +---------------------------------------------------------+
     |    parm    estimate       min95       max95           p |
     |---------------------------------------------------------|
  1. |    math   1.1225562   1.0546672   1.1948152   .00028101 |
  2. | science   1.0309801   .97309923   1.0923038   .30069377 |
  3. |    read   1.0646155   1.0102597   1.1218959   .01919541 |
  4. |   _cons   3.616e-06   9.266e-08    .0001411   2.050e-11 |
     +---------------------------------------------------------+
file mypars.dta saved
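
The eform option asks parmest to report exponentiated estimates and confidence limits, so for this logit model the listing above contains odds ratios rather than raw coefficients. As a quick check, the sketch below exponentiates the coefficients printed in the logit output; the values should agree with the estimate column of the listing up to rounding.

```stata
* Odds ratio = exp(logit coefficient); coefficients copied from the output above
* math
display exp(.1156084)
* science
display exp(.0305099)
* read
display exp(.0626137)
```

The same relationship explains why the reference line in the plot that follows is drawn at 1 rather than 0: an odds ratio of 1 corresponds to a coefficient of 0.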

use mypars, clear
encode parm, gen(parmn)
eclplot estimate min95 max95 parmn if parm!="_cons" , ///
hori xline(1, lpattern(shortdash)) ytitle("") ylabel(2 3 4) xtitle(Odds Ratio)



* marginal effect
use http://www.ats.ucla.edu/stat/data/hsbdemo, clear
logit honors math science read


Logistic regression                               Number of obs   =        200
                                                  LR chi2(3)      =      74.30
                                                  Prob > chi2     =     0.0000
Log likelihood = -78.493853                       Pseudo R2       =     0.3212

------------------------------------------------------------------------------
      honors |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        math |   .1156084   .0318287     3.63   0.000     .0532253    .1779915
     science |   .0305099   .0294797     1.03   0.301    -.0272692     .088289
        read |   .0626137   .0267384     2.34   0.019     .0102074      .11502
       _cons |  -12.53017   1.869501    -6.70   0.000    -16.19432   -8.866016
------------------------------------------------------------------------------

margins, dydx(math science read) atmeans post

Conditional marginal effects                      Number of obs   =        200
Model VCE    : OIM

Expression   : Pr(honors), predict()
dy/dx w.r.t. : math science read
at           : math            =      52.645 (mean)
               science         =       51.85 (mean)
               read            =       52.23 (mean)

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        math |   .0162469   .0045199     3.59   0.000     .0073881    .0251057
     science |   .0042877   .0040702     1.05   0.292    -.0036897     .012265
        read |   .0087993    .003778     2.33   0.020     .0013945    .0162042
------------------------------------------------------------------------------
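
The conditional marginal effects above can be checked by hand. For a logit model, the derivative of Pr(honors) with respect to a continuous predictor is the coefficient times p(1-p), where p is the predicted probability evaluated at the predictor means. The sketch below uses the coefficients and means printed in the output above; the scalar names xb_mean and p_mean are hypothetical and chosen so they do not abbreviate any variable in hsbdemo.

```stata
* Linear predictor at the means (coefficients and means copied from the output)
scalar xb_mean = -12.53017 + .1156084*52.645 + .0305099*51.85 + .0626137*52.23
* Predicted Pr(honors) at the means
scalar p_mean = invlogit(scalar(xb_mean))
* dy/dx = coefficient * p * (1 - p)
display "math:    " .1156084*scalar(p_mean)*(1 - scalar(p_mean))
display "science: " .0305099*scalar(p_mean)*(1 - scalar(p_mean))
display "read:    " .0626137*scalar(p_mean)*(1 - scalar(p_mean))
```

The results should match the dy/dx column up to rounding of the copied coefficients. The scalar() wrapper is used because Stata resolves a bare name to a variable (including abbreviations) before a scalar.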

parmest, label list(parm estimate min* max* p) saving(mypars, replace)

     +----------------------------------------------------------+
     |    parm    estimate        min95       max95           p |
     |----------------------------------------------------------|
  1. |    math   .01624687    .00738807   .02510566   .00032497 |
  2. | science   .00428767   -.00368968   .01226502   .29213848 |
  3. |    read   .00879933    .00139449   .01620417   .01985544 |
     +----------------------------------------------------------+
file mypars.dta saved

use mypars, clear
encode parm, gen(parmn)
eclplot estimate min95 max95 parmn , ///
hori xline(0, lpattern(shortdash)) ylabel(1 2 3) ///
ytitle("") xtitle(Marginal Effect)


References

Kastellec, J.P. and Leoni, E.L. (2007). "Using Graphs Instead of Tables in Political Science." Perspectives on Politics, 5(4): 755-771. Cambridge University Press.

