### Stata FAQ

How should I analyze percentile rank data?

I have to do an analysis of variance on some test scores that were given
to me as percentile scores. My question is, "How should I analyze percentile rank data?"

The problem, of course, is that percentile rank data are not normally
distributed. Percentile ranks are ordinal and usually form a rectangular (uniform) distribution. The easiest solution is to transform the percentile rank scores into z-scores (standard normal scores) using an inverse normal function. If the percentile ranks are uniformly distributed, the
z-scores will be approximately normally distributed with a mean of zero and a standard deviation of one. For percentile ranks from 1 to 99, the
z-scores will range from about -2.33 to 2.33. In Stata, the transformation would look like this:

**generate zscore = invnorm(pctrank/100)**
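The transformation only makes sense for percentile ranks strictly between 0 and 100, because the inverse normal function has no finite value at 0 or 1. Here is a minimal sketch of the transformation with a range guard. It assumes the variable is named pctrank, uses a handful of made-up values, and calls `invnormal()`, the name under which current Stata releases document the function that older releases called `invnorm()`:

```stata
* Sketch only: a few illustrative percentile ranks, not real test data
clear
input pctrank
1
25
50
75
99
end

* Transform only values in the 1-99 range; anything else stays missing
generate zscore = invnormal(pctrank/100) if inrange(pctrank, 1, 99)

* Count observations whose percentile rank could not be transformed
count if missing(zscore) & !missing(pctrank)

list pctrank zscore
```

If the count is nonzero, those percentile ranks need to be inspected before the analysis rather than silently dropped.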

Specialists in testing often transform percentile ranks into NCE (normal curve equivalent) scores.
NCEs are a type of standardized score with a mean of 50 and a standard deviation of 21.06. NCEs have a range of
1 to 99 and in many ways look a lot like percentile ranks: the two scales coincide at 1, 50, and 99. Here is how the
NCE transformation would look in Stata:

**generate nce = invnorm(pctrank/100)*21.06 + 50**
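Once the scores are on the NCE scale, the analysis of variance from the original question can be run on the transformed variable. The following is a rough sketch under stated assumptions: group is a hypothetical grouping variable, the nine observations are invented for illustration, and `invnormal()` is used as the current documented name for the older `invnorm()`:

```stata
* Sketch only: made-up data with a hypothetical three-level factor "group"
clear
input group pctrank
1 12
1 35
1 48
2 40
2 67
2 81
3 55
3 90
3 97
end

* Convert percentile ranks to NCE scores (mean 50, sd 21.06)
generate nce = invnormal(pctrank/100)*21.06 + 50

* One-way analysis of variance on the transformed scores
anova nce group
```

The same approach works with zscore as the response; the F test is unchanged because NCE is a linear function of the z-score.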

Here is a table that gives a range of percentile rank scores and their equivalent
z-scores and NCE scores:

| pctrank | zscore | nce |
|---:|---:|---:|
| 1 | -2.326348 | 1.007114 |
| 2 | -2.053749 | 6.748048 |
| 3 | -1.880794 | 10.39049 |
| 4 | -1.750686 | 13.13055 |
| 5 | -1.644854 | 15.35938 |
| 10 | -1.281552 | 23.01052 |
| 20 | -.8416212 | 32.27546 |
| 25 | -.6744897 | 35.79525 |
| 30 | -.5244005 | 38.95612 |
| 40 | -.2533471 | 44.66451 |
| 50 | 0 | 50 |
| 60 | .2533471 | 55.33549 |
| 70 | .5244005 | 61.04388 |
| 75 | .6744897 | 64.20476 |
| 80 | .8416212 | 67.72454 |
| 90 | 1.281552 | 76.98948 |
| 95 | 1.644854 | 84.64062 |
| 96 | 1.750686 | 86.86945 |
| 97 | 1.880794 | 89.60951 |
| 98 | 2.053749 | 93.25195 |
| 99 | 2.326348 | 98.99289 |
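
A table like this one can be regenerated in Stata by typing in the percentile ranks and applying both transformations. The sketch below is one way to do it, again assuming `invnormal()` as the current documented name for `invnorm()`; the displayed digits may differ slightly depending on storage type and display format:

```stata
* Enter the percentile ranks shown in the table
clear
input pctrank
1
2
3
4
5
10
20
25
30
40
50
60
70
75
80
90
95
96
97
98
99
end

* z-score and NCE equivalents of each percentile rank
generate zscore = invnormal(pctrank/100)
generate nce = zscore*21.06 + 50

list pctrank zscore nce
```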
