SAS Library
An Introduction to Publication Quality
Graphics In SAS for Windows


This page was adapted from a page of the same name (dated August 13, 1998) created by Oliver Schabenberger. We thank Professor Schabenberger for permission to adapt and distribute this page via our web site.


Novice users of The SAS System frequently ask how to create publication quality graphics. Although the core of The SAS System contains several procedures for simple line-printer graphs, such as PROC PLOT and PROC CHART, the output of these procedures is not publication quality. The SAS/GRAPH module contains procedures for creating high quality graphics. While it is easy to use procedures such as GPLOT, GCHART, and G3D to produce a nice-looking graph on screen, it is another story to format these graphics for a specific output device so that they can be printed, copied to a word processor, exported as GIFs, etc.

Studying the SAS/GRAPH manuals (two volumes) is essential to learn all the ins and outs of this process. I have always felt that the most efficient way to get good graphics is to take a template or program written by someone else and modify it to suit your needs. This web page contains several example programs for producing publication quality graphics formatted for an HP laser printer. The graphs are displayed WYSIWYG in Windows and can be printed from the screen.

Contents

1. How does SAS/GRAPH differ from other graphics packages?
2. Setting GOPTIONS
3. Bivariate Plots
4. Histograms
5. Replaying Multiple Graphs Per Page


1. How does SAS/GRAPH differ from other graphics packages?

2. Setting GOPTIONS

2.1. Goptions for screen preview, formatted for HP LaserJet

goptions device=win
         targetdevice=HPLJ4SI gunit=in hby=0
         vsize=10.0 in hsize=8.0 in noborder
         noprompt display;

proc gdevice; run;

symbol1 color=black h=0.1 value=plus;

2.2. Landscaping graphs

goptions device=win
         targetdevice=HPLJ4SI gunit=in hby=0
         vsize=8.0 in hsize=10.0 in noborder
         noprompt display rotate=landscape;

2.3. Saving a graph in a specific graphics format

filename outfile 'C:\Research\Graphs\FirstGraph.cgm';
goptions gsfmode=replace gsfname=outfile device=cgmmwwc;

filename outfile 'C:\Research\Graphs\Gifs\FirstGif.gif';
goptions gsfmode=replace gsfname=outfile device=gif373;
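
The statements above only redirect the graphics stream; a graph is written to the file once a SAS/GRAPH procedure runs. The following is a minimal sketch of the whole round trip, not a definitive recipe. The file name SketchGif.gif is hypothetical, and SASHELP.CLASS is used only as a stand-in data set that ships with most SAS installations.

```sas
/* Sketch only: hypothetical output path, stand-in data set SASHELP.CLASS */
filename outfile 'C:\Research\Graphs\Gifs\SketchGif.gif';

/* Send graphics output to the file reference OUTFILE, */
/* overwriting the file if it already exists           */
goptions device=gif373 gsfname=outfile gsfmode=replace;

symbol1 color=black value=circle i=none;

proc gplot data=sashelp.class;
   plot weight*height = 1;
run; quit;

/* Release the file reference. RESET=GOPTIONS returns all graphics */
/* options to their defaults, so the screen preview GOPTIONS from  */
/* section 2.1 must be issued again afterwards.                    */
filename outfile clear;
goptions reset=goptions;
```

Resetting at the end is a design choice in this sketch: it prevents later graphs from silently overwriting the same file.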

3. Bivariate Plots

data power;
  input range analys $ trueD power05 cnt;
  datalines;
  50     CRF       0      0.04080     1
  50     CRF       1      0.08425     1
  50     CRF       2      0.17533     1
  50     CRF       3      0.34800     1
  50     CRF       4      0.54400     1
  50     ERF       0      0.15980     2
  50     ERF       1      0.20625     2
  50     ERF       2      0.27967     2
  50     ERF       3      0.43100     2
  50     ERF       4      0.58500     2
  50     SPD       0      0.05280     3
  50     SPD       1      0.07550     3
  50     SPD       2      0.21033     3
  50     SPD       3      0.39950     3
  50     SPD       4      0.59400     3
 100     CRF       0      0.04120     4
 100     CRF       1      0.10500     4
 100     CRF       2      0.21567     4
 100     CRF       3      0.39150     4
 100     CRF       4      0.60600     4
 100     ERF       0      0.06410     5
 100     ERF       1      0.13125     5
 100     ERF       2      0.24467     5
 100     ERF       3      0.41700     5
 100     ERF       4      0.61000     5
 100     SPD       0      0.05410     6
 100     SPD       1      0.06650     6
 100     SPD       2      0.25200     6
 100     SPD       3      0.48000     6
 100     SPD       4      0.69900     6
 200     CRF       0      0.04270     7
 200     CRF       1      0.15925     7
 200     CRF       2      0.31467     7
 200     CRF       3      0.50400     7
 200     CRF       4      0.68000     7
 200     ERF       0      0.05420     8
 200     ERF       1      0.16800     8
 200     ERF       2      0.33467     8
 200     ERF       3      0.52350     8
 200     ERF       4      0.69500     8
 200     SPD       0      0.05780     9
 200     SPD       1      0.07550     9
 200     SPD       2      0.34833     9
 200     SPD       3      0.62000     9
 200     SPD       4      0.80100     9
;
run;
proc sort data=power; by range analys trued; run;
 

3.1. Standard Bivariate Plot

goptions device=win targetdevice=HPLJ4SI gunit=in  hby=0
         vsize=10.0 in hsize=8.0 in noborder
         noprompt display;

/* Define the symbols to be used in the graph. Only one symbol type is used */
/* here. I= defines the interpolation method. To connect points use I=JOIN. */
/* I=NONE does not connect the dots. L=1 would be the line type, if the     */
/* data points were connected. V=CIRCLE defines the symbol to be used.      */
symbol1  color=black  height=0.15  i=none l=1  v=circle;

/* Define the first axis */
/* all measurements are in inches, since GUNIT=IN in GOPTIONS */
axis1 length = 5.5
      origin=(1.50, 2.00)
      width = 2                 /*     line width */
      offset=(0.2, 0.1)
      value=(c=black f=centb height=0.13)
      major=(c=black H=0.09 width=2)
      minor=(c=black H=0.05 width=2 n=1)
      label=(c=black H=0.12 in f=centb   'True mean difference');

/* The definition of the vertical axis. The A=90 option in the LABEL */
/* statement rotates the label 90 degrees */
axis2 length = 5.5 in
      origin=(1.50 in, 2.00 in) width=2
      order =(0 to 1 by 0.10)  /* the min, max and step size */
      offset=(0.2 in, 0.1 in)
      value=(c=black f=centb  height=0.13 in)
      major=(c=black H=0.09 width=2)
      minor=(c=black H=0.05 width=2 n=1)
      label=(c=black H=0.12 in f=centb a=90 'Power function');

/* eliminate existing titles and footnotes before creating the graph */
/* so they won't clutter the graph */
title; footnote;

/* Create the graph and display it. Assign AXIS1 defined above as the */
/* horizontal and AXIS2 as the vertical axis                          */
proc gplot data=power;
     note move=(1.5 in , 1.0 in) f=centb h=0.12 in
      "Figure 1.  Power function as a function of true mean differences";
     plot Power05*trued  = 1 / haxis=axis1 vaxis=axis2 ;
run; quit;

3.2. Varying Symbols based on a data set variable

plot power05*trued = 1  / ...;

/* define three different symbol types, one for each analysis */
symbol1  color=black  height=0.15  in i=none l=1  v=circle  ;
symbol2  color=black  height=0.15  in i=none l=1  v=plus    ;
symbol3  color=black  height=0.15  in i=none l=1  v=diamond ;

/* Call GPLOT but use only the data corresponding to range=100 */
/* Notice that the PLOT statement contains = CNT which instructs SAS to */
/* get the symbol assignment from the data set variable CNT.   */
proc gplot data=power(where=(range=100));
     note move=(1.5 in , 1.0 in) f=centb h=0.12 in
      "Figure 1.  Power function as a function of true mean differences";
     plot Power05*trued  = cnt/ haxis=axis1 vaxis=axis2 ;
run; quit;

 

3.3. Adding a legend

plot Power05*trued  = cnt/ haxis=axis1 vaxis=axis2 ;

plot Power05*trued  = cnt/ haxis=axis1 vaxis=axis2 nolegend;

legend1 across=1
        down  =3
        frame
        position=(top left inside)
        mode  =share
        value =(f=centb h=0.1 in c=black 'Correct Random Field'
                                         'Estimated Random Field'
                                         'Split Plot Design')
        label =(f=centb h=0.15 in c=black 'Type of Analysis:');

symbol1  color=black  height=0.15  in i=join l=1  v=circle  w=2;
symbol2  color=black  height=0.15  in i=join l=1  v=plus    w=2;
symbol3  color=black  height=0.15  in i=join l=1  v=diamond w=2;
proc gplot data=power(where=(range=100));
     note move=(1.5 in , 1.0 in) f=centb h=0.12 in
      "Figure 1.  Power function as a function of true mean differences";
     plot Power05*trued  = cnt/ haxis=axis1 vaxis=axis2 legend=legend1;
run; quit;

If you are not sure how to order the text descriptors in the VALUE= option of the LEGEND statement, leave them off the first time you run the graph. SAS will then use the internal values from the data set instead.
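
As a rough illustration of that first pass, the sketch below defines the same legend without any text descriptors in VALUE=, so the legend entries show the values of CNT as stored in the data set. It assumes the POWER data set, the SYMBOL statements, and the AXIS1 and AXIS2 definitions from above are still in effect.

```sas
/* Sketch: same legend as above, but VALUE= carries no text descriptors, */
/* so the legend shows the data values of CNT (4, 5, 6 for range=100)    */
legend1 across=1
        down  =3
        frame
        position=(top left inside)
        mode  =share
        value =(f=centb h=0.1 in c=black)
        label =(f=centb h=0.15 in c=black 'Type of Analysis:');

proc gplot data=power(where=(range=100));
     plot Power05*trued  = cnt/ haxis=axis1 vaxis=axis2 legend=legend1;
run; quit;
```

Once the order of the entries is visible, the descriptive strings can be added to VALUE= in that same order.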

3.4. Annotations with annotation data set

/* Create the annotation data set. Notice that it is based on the  */
/* actual data set POWER. For each analysis output one label.      */

data anno;
  length function color style $8 text $20;
  /* retain a few variables so you don't have to define */
  /* them for every observation.                        */
  retain color 'black' style 'swiss' hsys '3' xsys ysys '2' ;
  set power(where=(range=100)); by range analys;
  /* Output a label when hitting the last observation in  */
  /* a particular analysis group so the label will appear */
  /* at the right end of the graph                        */
  if last.analys then do;
     function='label';
     text = Analys; /* get the label text from the data set variable */
     angle=0;
     x = Trued; y = power05; /* position of label on graph */
     size=1.2;
     position = '6'; /* position of label relative to x and y */
                     /* '6' starts the text at (x,y), so it   */
                     /* extends to the right of the point     */
     output;   /* output the observation to the annotation data set */
  end;
run;
proc gplot data=power(where=(range=100));
     note move=(1.5 in , 1.0 in) f=centb h=0.12 in
      "Figure 1.  Power function as a function of true mean differences";
     plot Power05*trued  = cnt/ haxis=axis1 vaxis=axis2
                                annotate=anno nolegend;
run; quit;

The next example plots the SPD analysis but writes the actual data values over the symbols.

data anno;
  length function color style $8 text $20;
  retain color 'black' style 'swiss' hsys '3' xsys ysys '2' ;
  set power(where=( (range=100) and (analys='SPD')));
  function='label'; text = left(put(power05,4.2));
  angle=0; x = Trued; y = power05; size=1.2; position = '2'; output;
run;
proc gplot data=power(where=((range=100) and (analys='SPD')));
     note move=(1.5 in , 1.0 in) f=centb h=0.12 in
      "Figure 1.  Power function as a function of true mean differences."
          move=(1.5 in , 0.8 in) f=centb h=0.12 in
      "           Power values shown.";
     plot Power05*trued  = cnt/ haxis=axis1 vaxis=axis2
                                annotate=anno nolegend;
run; quit;

4. Histograms

4.1. Standard Histogram

/* Create the data by randomly drawing from the Chi-Square */
/* distribution 1000 times                                 */
data chi2;
  do i = 1 to 1000;
      x = 2*rangam(1234,3);
      output;
  end;
run;

/* define the (horizontal) axis */
axis1 length = 5.5 in
      origin=(1.50 in, 2.00 in) width=2
      offset=(0.2 in, 0.1 in)
      value=(c=black f=centb   height=0.13 in)
      label=(c=black H=0.12 in f=centb   'Chi2 Realization');

/* define the vertical axis */
axis2 length = 5.5 in
      origin=(1.50 in, 2.00 in) width=2
      offset=(0.2 in, 0.1 in)
      value=(c=black f=centb  height=0.13 in)
      major=(c=black H=0.09 width=2)
      minor=(c=black H=0.05 width=2 n=1)
      label=(c=black H=0.12 in f=centb a=90 'Percent');

proc gchart data=chi2;
   vbar x / type=percent   /* display frequency as percentages        */
            levels=10      /* number of levels for numeric variable x */
            raxis =axis2   /* axis definition for the response (vertical) */
            maxis =axis1;  /* axis definition for midpoints (horizontal) */
run; quit;


Since the variable X is numeric, you can define the class midpoints in three different ways:
a) Specify the number of levels to be created with the LEVELS= option, as above.
b) Define specific midpoints with the MIDPOINTS= option, e.g., MIDPOINTS=1 to 30 by 2.
c) Let SAS calculate the class midpoints automatically by specifying neither LEVELS= nor MIDPOINTS=.
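
Options b) and c) can be sketched as follows. This assumes the CHI2 data set and the AXIS1 and AXIS2 definitions from the example above are still in effect; the midpoint list is the one quoted in b) and is not tuned to the data.

```sas
/* b) explicit midpoints: bars centered at 1, 3, 5, ..., 29 */
proc gchart data=chi2;
   vbar x / type=percent
            midpoints=1 to 30 by 2
            raxis =axis2
            maxis =axis1;
run; quit;

/* c) neither LEVELS= nor MIDPOINTS=: SAS chooses the midpoints */
proc gchart data=chi2;
   vbar x / type=percent
            raxis =axis2
            maxis =axis1;
run; quit;
```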

4.2. Side-by-side (grouped) histograms

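data chi2;
  /* Without this, the first assignment fixes DIST at 6 characters */
  /* and 'N(6,12)' would be truncated to 'N(6,12'                  */
  length dist $7;
  do i = 1 to 2000;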
      if i <= 1000 then do; /* Chi(6) distribution */
        x = 2*rangam(1234,3); dist='Chi(6)';
      end; else do;         /* A Gaussian distribution with same mean and var. */
        x = 6 + Sqrt(12)*rannor(1234); dist='N(6,12)';
      end;
      output;
  end;
run;

/* define an axis for the groups */
axis3 length = 5.5 in
      origin=(1.50 in, 2.00 in) width=2
      offset=(0.2 in, 0.1 in)
      value=(c=black f=centb   height=0.13 in)
      label=(c=black H=0.12 in f=centb   'Distribution Type');

proc gchart data=chi2;
   vbar x / type=percent
            group=dist
            levels=10      /* number of levels for numeric variable x */
            raxis =axis2   /* axis for the response (vertical) */
            maxis =axis1   /* axis for the midpoint axis (horizontal) */
            gaxis =axis3;  /* axis for the grouping axis     */
run; quit;

4.3. Subdivided (subgrouped) histograms

data sales;
  input state $ year sales;
  datalines;
MI 1996 1211
MI 1997 1045
MI 1998 1829
OH 1996  781
OH 1997  654
OH 1998 1098
WI 1996 2319
WI 1997 1890
WI 1998 1200
VA 1997 1000
;
run;

axis2 length = 5.5 in
      origin=(1.50 in, 2.00 in) width=2
      value=(c=black f=centb  height=0.13 in)
      major=(c=black H=0.09 width=2)
      minor=(c=black H=0.05 width=2 n=1)
      label=(c=black H=0.12 in f=centb a=90 'Cumulative Sales in Thsd $');

axis1 length = 5.5 in
      origin=(1.50 in, 2.00 in) width=2
      order =(1996, 1997, 1998)
      value=(c=black f=centb   height=0.13 in)
      label=(c=black H=0.12 in f=centb   'Calendar Year');

legend2 across=1
        frame
        position=(top right inside)
        mode  = share
        value =(f=centb h=0.1 in c=black 'Michigan'
                                         'Ohio'
                                         'Virginia'
                                         'Wisconsin')
        label =(f=centb h=0.15 in c=black 'State:');

proc gchart data=sales;
   vbar year / discrete     /* interpret YEAR as a discrete variable */
               width=7      /* define the width of a bar */
               raxis=axis2
               maxis=axis1
               sumvar=sales    /* the variable you wish to sum */
               subgroup=state  /* the variable defining the subdivision */
               legend=legend2;
run; quit;

The patterns SAS uses to fill the bars may not be the ones you like most. To change the fill pattern use the PATTERN statement:

pattern1 c=black value=s;  /* solid black             */
pattern2 c=black value=e;  /* empty ==> white         */
pattern3 c=black value=L1; /* left hatched density 1  */
pattern4 c=black value=R3; /* right hatched density 3 */
proc gchart data=sales;
   vbar year / discrete
               width=7
               raxis=axis2
               maxis=axis1
               sumvar=sales
               subgroup=state
               legend=legend2;
run; quit;

5. Replaying Multiple Graphs Per Page

goptions nodisplay;

/* Delete the contents of the graphics catalog MyCat, if  */
/* the catalog exists. If the catalog does not exist, this */
/* will create an error message, but SAS will continue to */
/* process statements. So, no harm done.                  */
proc greplay igout=MyCat tc=tempcat nofs; delete _all_; quit;

goptions device=win targetdevice=HPLJ4SI gunit=in  hby=0
         vsize=10.0 in hsize=8.0 in noborder
         noprompt
         nodisplay;  /* turn automatic display of graphics off */

/* Now generate the graphs */

/* First generate plots of the power function for all ranges */
symbol1  color=black  height=0.15  in i=join l=1  v=circle w=2;

axis1 length = 5.5 in
      origin=(1.50 in, 2.00 in) width=2
      offset=(0.2 in, 0.1 in)
      value=(c=black f=centb   height=0.13 in)
      major=(c=black H=0.09 width=2)
      minor=(c=black H=0.05 width=2 n=1)
      label=(c=black H=0.12 in f=centb   'True mean difference');

axis2 length = 5.5 in
      origin=(1.50 in, 2.00 in) width=2
      order =(0 to 1 by 0.10)
      offset=(0.2 in, 0.1 in)
      value=(c=black f=centb  height=0.13 in)
      major=(c=black H=0.09 width=2)
      minor=(c=black H=0.05 width=2 n=1)
      label=(c=black H=0.12 in f=centb a=90 'Power function');

title; footnote;

data anno;
  length function color style $8 text $23;
  retain color 'black' style 'swiss' hsys '3' xsys ysys '2' ;
  set power(where=(range=100)); by range analys;
  if last.analys then do;
     function='label';
     if Analys='SPD' then text = 'Split-Plot Analysis';
     else if analys='ERF' then text='Estimated Random Field';
          else if analys='CRF' then text='Correct Random Field';
     angle=0; x = 0; y = 0.95; size=1.5; position = '6'; output;
  end;
run;

/* Notice the GOUT= option on the PROC GPLOT statement */
/* It places the graph into the graphics catalog from  */
/* where we pull it again when the graphs are replayed */
proc gplot data=power(where=(range=100)) gout=MyCat;
     plot Power05*trued  = 1 / haxis=axis1 vaxis=axis2 annotate=anno;
     by analys;
run; quit;

/* ----------------------------------------------- */
/* Now produce two histograms used earlier         */
/* ----------------------------------------------- */

pattern1 c=black value=X1; /* light cross-hatched */
data chi2;
  do i = 1 to 1000;
      x = 2*rangam(1234,3);
      output;
  end;
run;

axis1 length = 5.5 in
      origin=(1.50 in, 2.00 in) width=2
      offset=(0.2 in, 0.1 in)
      value=(c=black f=centb   height=0.13 in)
      label=(c=black H=0.12 in f=centb   'Chi2 Realization');

axis2 length = 5.5 in
      origin=(1.50 in, 2.00 in) width=2
      offset=(0.2 in, 0.1 in)
      value=(c=black f=centb  height=0.13 in)
      major=(c=black H=0.09 width=2)
      minor=(c=black H=0.05 width=2 n=1)
      label=(c=black H=0.12 in f=centb a=90 'Percent');

/* Chart is placed in the graphics catalog */
proc gchart data=chi2 gout=MyCat;
   vbar x / type=percent
            levels=10      /* number of levels for numeric variable x */
            raxis =axis2   /* axis for the response (vertical) */
            maxis =axis1;  /* midpoint axis (horizontal) */
run; quit;
 

data sales;
  input state $ year sales;
  datalines;
MI 1996 1211
MI 1997 1045
MI 1998 1829
OH 1996  781
OH 1997  654
OH 1998 1098
WI 1996 2319
WI 1997 1890
WI 1998 1200
VA 1997 1000
;
run;

axis2 length = 6.0 in
      origin=(1 in, 1.00 in) width=2
      value=(c=black f=centb  height=0.13 in)
      major=(c=black H=0.09 width=2)
      minor=(c=black H=0.05 width=2 n=1)
      label=(c=black H=0.12 in f=centb a=90 'Cumulative Sales in Thsd $');

axis1 length = 6.0 in
      origin=(1 in, 1.00 in) width=2
      order =(1996, 1997, 1998)
      value=(c=black f=centb   height=0.13 in)
      label=(c=black H=0.12 in f=centb   'Calendar Year');

legend2 across=1
        frame
        position=(top right inside)
        mode  = share
        value =(f=centb h=0.1 in c=black 'Michigan'
                                         'Ohio'
                                         'Virginia'
                                         'Wisconsin')
        label =(f=centb h=0.15 in c=black 'State:');

proc gchart data=sales gout=MyCat;
   vbar year / discrete
               width=7
               raxis=axis2
               maxis=axis1
               sumvar=sales
               subgroup=state
               legend=legend2;
run; quit;
 

/* Turn the displaying of graphs back on */
goptions display;

/* Run PROC GREPLAY. First define a template for placing   */
/* four graphs onto a page. The four sections are defined  */
/* in terms of their corner coordinates. The lower left    */
/* corner of the page has coordinate (0,0). LLX refers to the */
/* lower left x coordinate of a panel, URY to the upper right */
/* Y coordinate and so forth                               */
proc greplay igout=MyCat tc=tempcat nofs;
   TDef Four Des='Four plots and a caption'
   1/LLX=0   LLY=50 ULX=0  ULY=100 URX=50  URY=100  LRX=50  LRY=50
   2/LLX=50  LLY=50 ULX=50 ULY=100 URX=100 URY=100  LRX=100 LRY=50
   3/LLX=0   LLY=0  ULX=0  ULY=50  URX=50  URY=50   LRX=50  LRY=0
   4/LLX=50  LLY=0  ULX=50 ULY=50  URX=100 URY=50   LRX=100 LRY=0;

   /* replay the graphs in the template. Notice that the first */
   /* plot, chart, etc. is referenced as Gplot, Gchart and so  */
   /* forth. Gplot1 is the second graph in the catalog produced */
   /* by PROC GPLOT.                                            */
   /* The TREPLAY statement puts the first plot in the first panel */
   /* the first chart in the third panel and so forth.          */
   Template=Four;
   Treplay  1:Gplot    2:Gplot1   3:Gchart   4:GChart1;
quit;
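
The same corner-coordinate scheme extends to other layouts. As a hedged sketch, the template below (hypothetically named Two) splits the page into a left and a right panel. It also uses LIST IGOUT to print the catalog entries to the log, which helps confirm entry names such as GPLOT and GPLOT1 before replaying. It assumes the catalog MyCat has been filled as above.

```sas
/* Sketch: hypothetical two-panel template, graphs side by side */
proc greplay igout=MyCat tc=tempcat nofs;
   list igout;   /* list catalog entries to check their names */

   TDef Two Des='Two plots side by side'
   1/LLX=0   LLY=0  ULX=0  ULY=100 URX=50  URY=100  LRX=50  LRY=0
   2/LLX=50  LLY=0  ULX=50 ULY=100 URX=100 URY=100  LRX=100 LRY=0;

   Template=Two;
   Treplay  1:Gplot    2:Gplot1;
quit;
```

Because each panel here is half the page width but the full page height, the replayed graphs are scaled to fit and their proportions change accordingly.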

(quality of gif not even close to printed quality)


These pages are Copyrighted (c) by UCLA Academic Technology Services