UCLA Academic Technology Services

SAS FAQ
How do I make a histogram with percentage on top of each bar?

Consider this simple data file with a single variable called mynum. We are going to create a histogram of mynum with the percentage of observations printed on top of each bar.

data test;
  input mynum @@;
  cards;
  2.3 -.6
  3.3 3.5
  2.4 5.6
  7.8 2.4
  2.8 4.5
  6.3 1.2
  0.5  .8
  .9  1.2
  1.4 1.5
  2.3 2.5
  2.7 3.5
  3.1 4.6
  5.5 5.8
  5.3 7.6
  7.3 7.8
;
run;

Let's say the midpoints that we are going to use for our histogram are -.5, .5, 1.5, ..., 7.5, since the variable ranges between -1 and 8. The width of each bin is 1. In the data step below, a variable called ind is created as a group variable identifying the bin (bar) that each observation falls into; its value is the midpoint of that bin. Then we use proc freq with the ods output statement to write the frequencies and percentages to a data set called temp3.

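data temp2;
  set test;
  /* i runs over the bin midpoints; each bin covers [i-.5, i+.5) */
  do i = -.5 to 7.5 by 1;
    if i-.5 <= mynum < i+.5 then ind = i;
  end;
run;

/* one-way frequency table of ind, saved as the data set temp3 */
proc freq data=temp2;
  tables ind;
  ods output OneWayFreqs=temp3;
run;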
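
Because the bins here have width 1 and integer edges, the same group variable can also be computed without a loop. The following is a minimal sketch under that assumption only; the data set name temp2_alt is hypothetical and is not used elsewhere on this page.

```sas
data temp2_alt;              /* hypothetical name, for illustration only */
  set test;
  /* floor() gives the left edge of the width-1 bin; adding .5 gives its midpoint */
  ind = floor(mynum) + .5;
run;
```

For example, mynum = -.6 gives floor(-.6) + .5 = -.5, and mynum = 7.8 gives 7.5, matching the loop above. For other bin widths or non-integer edges the expression would need to be adjusted.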

Let's print the ODS output data set temp3 created by proc freq.

proc print data=temp3;
run;
                                               
                                                                     Cum         Cum
         Obs    Table    F_ind     ind    Frequency    Percent    Frequency    Percent

          1      ind     -0.5     -0.5           1       3.33            1       3.33
          2      ind      0.5      0.5           3      10.00            4      13.33
          3      ind      1.5      1.5           4      13.33            8      26.67
          4      ind      2.5      2.5           7      23.33           15      50.00
          5      ind      3.5      3.5           4      13.33           19      63.33
          6      ind      4.5      4.5           2       6.67           21      70.00
          7      ind      5.5      5.5           4      13.33           25      83.33
          8      ind      6.5      6.5           1       3.33           26      86.67
          9      ind      7.5      7.5           4      13.33           30     100.00
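
The Percent column is each bin's frequency divided by the total number of observations (30 here), times 100; for instance, the 2.5 bin has 7 / 30 * 100 = 23.33. As a quick check, a sketch such as the following recomputes it from the Frequency column. The column name pct_check is made up for this illustration.

```sas
proc sql;
  /* sum(Frequency) is the total count, remerged onto every row */
  select ind,
         Frequency,
         Percent,
         100 * Frequency / sum(Frequency) as pct_check format=6.2
    from temp3;
quit;
```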

We can now use this data set to create an annotate data set and use it with proc univariate to create the histogram with percentage on top of each bar. 

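data anno;
   set temp3;
   length function color text $8;

   function = 'label';   /* draw a text label */
   color    = 'blue';
   size     =  1;
   xsys     = '2';       /* x is given in data values */
   ysys     = '2';       /* y is given in data values (percent scale) */
   when     = 'a';       /* draw the label after (on top of) the graph */
   x = ind;              /* the x-coordinate for the text: the bin midpoint */
   y = percent + .5;     /* the y-coordinate: just above the top of the bar */
   text = left(put(percent, 5.2));   /* 5.2 leaves room for values such as 13.33 */
run;

proc univariate data=temp2 noprint;
   histogram mynum / anno=anno cfill=pink midpoints=-.5 to 7.5 by 1;
run;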

Using the same idea, we can also get a histogram with the cumulative percentage on each bar. This time we lower the y-coordinate of the label so that the cumulative percentage is printed inside the bar, just below its top.

data anno;
   set temp3;
   length function color text $8;

   function = 'label';
   color    = 'blue';
   size     =  1;
   xsys     = '2';
   ysys     = '2';
   when     = 'a';
   x = ind;
   y = percent - .5;     /* bar height is still percent; place the label just below the top */
   text = left(put(cumpercent, 6.2));   /* 6.2 leaves room for 100.00 */
run;

proc univariate data=temp2 noprint;
   histogram mynum / anno=anno cfill=pink midpoints=-.5 to 7.5 by 1;
run;
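
As an alternative to the annotate approach, and not part of the recipe above: if all that is needed is a labeled bar chart of the binned percentages, proc sgplot can plot temp3 directly and label each bar with the datalabel option. This is only a sketch, assuming a SAS release with ODS Graphics (SAS 9.2 or later). It treats ind as a categorical variable, so it does not compute the bins itself and would leave out any empty bin.

```sas
proc sgplot data=temp3;
  /* one bar per value of ind, with height Percent and the value printed on the bar */
  vbar ind / response=percent datalabel fillattrs=(color=pink);
  yaxis label='Percent';
run;
```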

These pages are Copyrighted (c) by UCLA Academic Technology Services