How to Sort Data Frames in R
One way to sort data in R is to first work out the order the elements would be in if they were sorted, and then use that order as an index. This sounds long-winded, but as you’ll see, this flexibility lets you write statements that read very naturally.
How to get the order
First, determine the element order needed to sort some.states$Population in ascending order. Do this using the order() function:
> order.pop <- order(some.states$Population)
> order.pop
 [1]  2  8  4  3  6  7  1 10  9  5
This means that, to sort the elements in ascending order, you take the second element first, then the eighth element, then the fourth element, and so on. Try it:
> some.states$Population[order.pop]
 [1]   365   579  2110  2212  2541  3100  3615  4931  8277 21198
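If you want to check this for yourself, here is a minimal sketch. It assumes that some.states is built from the state.region and state.x77 datasets that ship with R; the text above doesn't show how the data frame was created, so treat that first line as an assumption that happens to reproduce the output shown.

```r
# Assumption: some.states comes from R's bundled state datasets
# (the construction is not shown in the text above).
some.states <- data.frame(Region = state.region, state.x77)[1:10, 1:3]

# order() returns positions, not the values themselves
order.pop <- order(some.states$Population)

# Indexing with those positions gives the same values as sort()
sorted.pop <- some.states$Population[order.pop]
all(sorted.pop == sort(some.states$Population))  # TRUE
```

The point of keeping the positions rather than the sorted values is that you can reuse them to index something else, such as the rows of the whole data frame.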
How to sort a data frame in ascending order
You worked out the order that puts the elements of Population in ascending order, and you stored that result in order.pop. Now use order.pop to sort the data frame some.states in ascending order of population:
> some.states[order.pop, ]
               Region Population Income
Alaska           West        365   6315
Delaware        South        579   4809
Arkansas        South       2110   3378
....
Georgia         South       4931   4091
Florida         South       8277   4815
California       West      21198   5114
How to sort in decreasing order
Just like sort(), the order() function takes an argument called decreasing. For example, compare the default ascending order of some.states$Population with its decreasing order:
> order(some.states$Population)
 [1]  2  8  4  3  6  7  1 10  9  5
> order(some.states$Population, decreasing=TRUE)
 [1]  5  9 10  1  7  6  3  4  8  2
Just as before, you can sort the data frame some.states in decreasing order of population. Try it, but this time don’t assign the order to a temporary variable:
> some.states[order(some.states$Population, decreasing=TRUE), ]
               Region Population Income
California       West      21198   5114
Florida         South       8277   4815
Georgia         South       4931   4091
....
Arkansas        South       2110   3378
Delaware        South        579   4809
Alaska           West        365   6315
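The decreasing argument isn't the only way to get a descending order. The sketch below compares it with two alternatives you may see in other people's code: negating a numeric key, and reversing the ascending order with rev(). As before, the first line is an assumption about how some.states was built. The three results agree here because no two populations are equal; with tied values they can differ.

```r
# Assumption: some.states comes from R's bundled state datasets
some.states <- data.frame(Region = state.region, state.x77)[1:10, 1:3]

# Three ways to get a descending order of a numeric column
dec.order <- order(some.states$Population, decreasing = TRUE)
neg.order <- order(-some.states$Population)      # negate the numeric key
rev.order <- rev(order(some.states$Population))  # reverse the ascending order

# All three agree for this data, which has no tied populations
all(dec.order == neg.order)
all(dec.order == rev.order)
```

Negation works only for numeric keys, so decreasing=TRUE is the more general choice.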
How to sort on more than one column
You probably think that sorting is very straightforward, and you’re correct. Sorting on more than one column is almost as easy.
You can pass more than one vector as an argument to the order() function. If you do so, the result will be the equivalent of adding a secondary sorting key. In other words, the order will be determined by the first vector and any ties will then sort according to the second vector.
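To see the tie-breaking on its own, here is a small sketch with two made-up vectors. The names primary and secondary and their values are hypothetical and chosen only for illustration.

```r
# Hypothetical toy vectors, used only to illustrate tie-breaking
primary   <- c(2, 1, 2, 1)
secondary <- c(4, 3, 1, 2)

# One key: tied elements stay in their original relative order
idx.one <- order(primary)             # 2 4 1 3

# Two keys: ties in primary are broken by secondary
idx.two <- order(primary, secondary)  # 4 2 3 1
```

In the second call, elements 2 and 4 still come first because they share the smallest value of primary, but element 4 now precedes element 2 because its secondary value is smaller.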
Next, you get to sort some.states on more than one column, in this case Region and Population. If this sounds confusing, don’t worry; it really isn’t. Try it yourself. Calculate the order that sorts some.states by region and then by population, and use it to index the data frame:
> index <- with(some.states, order(Region, Population))
> some.states[index, ]
               Region Population Income
Connecticut Northeast       3100   5348
Delaware        South        579   4809
Arkansas        South       2110   3378
Alabama         South       3615   3624
Georgia         South       4931   4091
Florida         South       8277   4815
Alaska           West        365   6315
Arizona          West       2212   4530
Colorado         West       2541   4884
California       West      21198   5114
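You can also mix directions across keys. The sketch below sorts by Region in ascending order and, within each region, by Population in descending order, by negating the numeric Population column. The construction of some.states and the names index.mixed and mixed are assumptions for illustration, not part of the example above.

```r
# Assumption: some.states comes from R's bundled state datasets
some.states <- data.frame(Region = state.region, state.x77)[1:10, 1:3]

# Region ascending (by factor level), Population descending within each region.
# Negation works here because Population is numeric.
index.mixed <- with(some.states, order(Region, -Population))
mixed <- some.states[index.mixed, ]

# Row names show the resulting order of states
rownames(mixed)
```

Within the South, for example, Florida now comes first and Delaware last, the reverse of the result above.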