UCLA Academic Technology Services

Stata FAQ
How can I sort data within rows independently for each observation?

Sorting columns is easy in Stata, but sorting rows is a little trickier. Consider the data shown below, which contain variables v1 through v8 for ten observations. Let's say that these data are ratings by eight raters on ten subjects. Within each row the data are ordered by rater, but you would like them sorted from lowest to highest. Ultimately, after sorting, you want to create an overall score for each subject that is the average of the middle four values; that is, you want to discard the two lowest and two highest scores for each subject and then compute the average of the remaining four ratings.
clist, noobs

       id         v1         v2         v3         v4         v5         v6         v7         v8
        1         34         35         41         29         26         34         33         36
        2         39         39         44         26         42         37         37         42
        3         39         31         40         39         51         42         36         42
        4         31         36         46         39         46         50         31         40
        5         39         41         33         42         41         34         37         46
        6         44         44         39         34         46         28         46         43
        7         42         46         38         36         46         34         46         45
        8         34         49         39         42         56         42         39         42
        9         37         44         45         39         46         41         47         40
       10         44         44         40         40         31         47         37         43
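The page does not show how the data were loaded. To follow along, one option is to type the values in with input; this is only a sketch that copies the listing above, and the closing count and isid checks are additions rather than part of the original FAQ.

```stata
* Enter the example ratings by hand (values copied from the listing above)
clear
input id v1 v2 v3 v4 v5 v6 v7 v8
 1 34 35 41 29 26 34 33 36
 2 39 39 44 26 42 37 37 42
 3 39 31 40 39 51 42 36 42
 4 31 36 46 39 46 50 31 40
 5 39 41 33 42 41 34 37 46
 6 44 44 39 34 46 28 46 43
 7 42 46 38 36 46 34 46 45
 8 34 49 39 42 56 42 39 42
 9 37 44 45 39 46 41 47 40
10 44 44 40 40 31 47 37 43
end

* Sanity checks: ten subjects, and id uniquely identifies each row
count
isid id
```

The isid check matters because reshape needs the i() variable to identify each observation uniquely.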
You want to change the data so that they are sorted from lowest to highest within each row. You can do this using two reshape commands and one sort command: reshape the data to long form, sort the values within each id, number them in sorted order, and reshape back to wide form using that new numbering. Here are the commands.
reshape long v, i(id) j(var)
sort id v
by id: gen nv=_n
drop var
reshape wide v, i(id) j(nv)
Here is what the row-sorted data now look like. Note that the lowest score is on the left in variable v1 and the highest score is on the right in variable v8.
clist, noobs

       id         v1         v2         v3         v4         v5         v6         v7         v8
        1         26         29         33         34         34         35         36         41
        2         26         37         37         39         39         42         42         44
        3         31         36         39         39         40         42         42         51
        4         31         31         36         39         40         46         46         50
        5         33         34         37         39         41         41         42         46
        6         28         34         39         43         44         44         46         46
        7         34         36         38         42         45         46         46         46
        8         34         39         39         42         42         42         49         56
        9         37         39         40         41         44         45         46         47
       10         31         37         40         40         43         44         44         47
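Rather than checking the listing by eye, the result can be tested with assert. The sketch below assumes the first three subjects of the example data, repeats the row-sort commands, and then confirms that each value is no larger than the one to its right; the loop is an illustrative addition, not part of the original FAQ.

```stata
* First three subjects of the example data, enough to illustrate the check
clear
input id v1 v2 v3 v4 v5 v6 v7 v8
1 34 35 41 29 26 34 33 36
2 39 39 44 26 42 37 37 42
3 39 31 40 39 51 42 36 42
end

* Row sort, using the same commands as above
reshape long v, i(id) j(var)
sort id v
by id: gen nv = _n
drop var
reshape wide v, i(id) j(nv)

* Each value should be no larger than its right-hand neighbour;
* assert stops with an error if any row violates this
forvalues j = 1/7 {
    local k = `j' + 1
    assert v`j' <= v`k'
}
```

If the loop finishes silently, every row is in nondecreasing order.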
To get the average of the middle four ratings, we do not actually have to throw out any of the values; we can just use the egen rowmean() function to compute the mean of v3 through v6, as shown below.
egen mean_rate=rowmean(v3-v6)

clist mean_rate

     mean_rate
  1.        34
  2.     39.25
  3.        40
  4.     40.25
  5.      39.5
  6.      42.5
  7.     42.75
  8.     41.25
  9.      42.5
 10.     41.75
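If the ratings should stay in their original by-rater columns, the trimmed mean can also be computed while the data are in long form. This is a sketch of an alternative, not the method used above; it assumes the first three subjects of the example data, and the names rater and rank are hypothetical.

```stata
* First three subjects of the example data, in the original rater order
clear
input id v1 v2 v3 v4 v5 v6 v7 v8
1 34 35 41 29 26 34 33 36
2 39 39 44 26 42 37 37 42
3 39 31 40 39 51 42 36 42
end

* Long form: one row per subject-rater pair
reshape long v, i(id) j(rater)

* Rank the ratings within each subject from lowest (1) to highest (8)
bysort id (v): gen rank = _n

* Average only ranks 3 through 6; cond() returns missing for the other
* ranks, and egen's mean() ignores missing values
egen mean_rate = mean(cond(inrange(rank, 3, 6), v, .)), by(id)

* Back to one row per subject, with v1-v8 still ordered by rater
drop rank
reshape wide v, i(id) j(rater)
list id mean_rate, noobs
```

As a hand check, the middle four sorted ratings for subject 1 are 33, 34, 34 and 35, which average to 34. Ties in rank are broken arbitrarily, but tied ratings are equal, so the mean is unaffected.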
If you don't want to go to the trouble of doing the row sort manually, you can use Nick Cox's (2009) command rowsort (findit rowsort), whose latest incarnation was presented in a Stata Journal article.
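A sketch of how rowsort might be used on these data follows. It assumes the command has already been installed, and it assumes the generate() option with new variable names s1-s8 (hypothetical names); check help rowsort after installing to confirm the syntax of the version you have.

```stata
* Requires the user-written rowsort command (locate it with: findit rowsort)
* First three subjects of the example data
clear
input id v1 v2 v3 v4 v5 v6 v7 v8
1 34 35 41 29 26 34 33 36
2 39 39 44 26 42 37 37 42
3 39 31 40 39 51 42 36 42
end

* Write the row-sorted values to new variables s1-s8 (assumed syntax),
* leaving the original by-rater variables v1-v8 untouched
rowsort v1-v8, generate(s1-s8)

* Mean of the middle four sorted ratings
egen mean_rate = rowmean(s3-s6)
list id mean_rate, noobs
```

Compared with the two-reshape approach, this keeps both the original and the sorted ratings in the same dataset.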

References

Cox, N. J. 2009. Speaking Stata: Rowwise. Stata Journal 9: 137-157.


These pages are Copyrighted (c) by UCLA Academic Technology Services

