# How to Sort Data Frames in R

One way to sort data in R is to first work out the order the elements would be in if they were sorted, and then use that order to rearrange them. This sounds long-winded, but as you’ll see, this flexibility lets you write statements that read very naturally.

## How to get the order

First, determine the element order needed to sort some.states$Population in ascending order. Do this using the order() function:

```
> order.pop <- order(some.states$Population)
> order.pop
 [1]  2  8  4  3  6  7  1 10  9  5
```

This means that to sort the elements in ascending order, you first take the second element, then the eighth element, then the fourth element, and so on. Try it:

```
> some.states$Population[order.pop]
 [1]   365   579  2110  2212  2541  3100  3615  4931  8277
[10] 21198
```
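To reproduce the steps above, you need a copy of some.states, which this section uses but does not define. The sketch below assumes it is the first ten states of R's built-in state.x77 data combined with state.region; that construction is an assumption, chosen because it matches the values printed above. It also checks that indexing by the result of order() gives the same values as sort().

```r
# Assumption: some.states is built from the built-in datasets
# state.region and state.x77 (first ten rows, first three columns).
some.states <- data.frame(Region = state.region, state.x77)[1:10, 1:3]

pop <- some.states$Population

# order() returns positions, not values
order.pop <- order(pop)

# Indexing by those positions rearranges the values into sorted order
sorted.by.index <- pop[order.pop]

# The result matches what sort() returns directly
same.as.sort <- identical(sorted.by.index, sort(pop))
print(same.as.sort)
```

The advantage of keeping the positions rather than the sorted values is that the same positions can rearrange other objects of the same length, such as the rows of the data frame.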

## How to sort a data frame in ascending order

You calculated the order that sorts the elements of Population in ascending order, and you stored that result in order.pop. Now, use order.pop to sort the data frame some.states in ascending order of population:

```
> some.states[order.pop, ]
           Region Population Income
Alaska       West        365   6315
Delaware    South        579   4809
Arkansas    South       2110   3378
....
Georgia     South       4931   4091
Florida     South       8277   4815
California   West      21198   5114
```
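The row-indexing pattern above works on any data frame. Here is a minimal sketch on a small, made-up data frame (scores and its columns are hypothetical, not part of the article's data). It shows that the comma after the index selects whole rows, and that the original row labels travel with the rows unless you reset them.

```r
# Hypothetical toy data frame for illustration only
scores <- data.frame(
  name  = c("Ann", "Bob", "Cy"),
  score = c(72, 55, 90)
)

# Positions that put score in ascending order
idx <- order(scores$score)

# Row index before the comma, nothing after it: keep all columns
sorted <- scores[idx, ]

# The original row labels move with their rows
labels.before <- rownames(sorted)

# Optional: renumber the rows from 1 after sorting
rownames(sorted) <- NULL
labels.after <- rownames(sorted)

print(sorted)
```

Resetting the row names is a matter of taste. With named rows, such as the state names in some.states, you would normally leave them alone.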

## How to sort in decreasing order

Just like sort(), the order() function also takes an argument called decreasing. For example, to sort some.states in decreasing order of population:

```
> order(some.states$Population)
 [1]  2  8  4  3  6  7  1 10  9  5
> order(some.states$Population, decreasing=TRUE)
 [1]  5  9 10  1  7  6  3  4  8  2
```
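For numeric data, a common alternative to decreasing=TRUE is to negate the values before ordering them. The sketch below uses a short, made-up vector of distinct values (not the article's data) to show that the two approaches give the same positions in that case.

```r
# Hypothetical numeric values, all distinct
pop <- c(3615, 365, 2212, 2110, 21198)

# Largest first, using the decreasing argument
by.arg <- order(pop, decreasing = TRUE)

# Largest first, by ordering the negated values
by.neg <- order(-pop)

# Both approaches return the same positions here
same <- identical(by.arg, by.neg)
print(by.arg)
print(same)
```

Negation only works for numeric values, but it lets you flip the direction of one key without affecting any others.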

Just as before, you can sort the data frame some.states in decreasing order of population. Try it, but this time don’t assign the order to a temporary variable:

```
> some.states[order(some.states$Population, decreasing=TRUE), ]
           Region Population Income
California   West      21198   5114
Florida     South       8277   4815
Georgia     South       4931   4091
....
Arkansas    South       2110   3378
Delaware    South        579   4809
Alaska       West        365   6315
```

## How to sort on more than one column

You probably think that sorting is very straightforward, and you’re correct. Sorting on more than one column is almost as easy.

You can pass more than one vector as an argument to the order() function. If you do so, the result will be the equivalent of adding a secondary sorting key. In other words, the order will be determined by the first vector and any ties will then sort according to the second vector.
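The tie-breaking rule described above can be seen on a small, made-up example. The vectors region and pop below are hypothetical. The second call also negates the numeric key, one way to sort the first key ascending and the second descending. Note that a character key sorts alphabetically, whereas a factor key (such as one built from state.region) sorts by the order of its levels.

```r
# Hypothetical toy data: two regions, with ties on region
region <- c("South", "West", "South", "West", "South")
pop    <- c(3615, 365, 2110, 21198, 579)

# Primary key: region. Ties are broken by pop, ascending.
idx <- order(region, pop)

# Primary key: region. Ties are broken by pop, descending (negated).
idx.mixed <- order(region, -pop)

print(data.frame(region, pop)[idx, ])
print(data.frame(region, pop)[idx.mixed, ])
```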

Next, sort some.states on more than one column, in this case Region and Population. If this sounds confusing, don’t worry; it really isn’t. Try it yourself. First, calculate the order that sorts some.states by region and then by population:

```
> index <- with(some.states, order(Region, Population))
> some.states[index, ]
               Region Population Income
Connecticut Northeast       3100   5348
Delaware        South        579   4809
Arkansas        South       2110   3378
Alabama         South       3615   3624
Georgia         South       4931   4091
Florida         South       8277   4815
```