How to Work with Ordered Factors in R - dummies

By Andrie de Vries, Joris Meys

R has a special data type for ordinal data: the ordered factor, an extension of the factors that you’re already familiar with.

To create an ordered factor in R, you have two options:

  • Use the factor() function with the argument ordered=TRUE.

  • Use the ordered() function.

Say you want to represent the status of five projects. Each project has a status of low, medium, or high:

> status <- c("Lo", "Hi", "Med", "Med", "Hi")

Now create an ordered factor with this status data:

> ordered.status <- factor(status, levels=c("Lo", "Med", "Hi"), ordered=TRUE)
> ordered.status
[1] Lo  Hi  Med Med Hi
Levels: Lo < Med < Hi
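The two options listed earlier should give the same result. The sketch below builds the factor both ways and compares them; the names via.factor, via.ordered, and default.levels are made up for this illustration. It also shows what happens when the levels argument is left out: R then sorts the levels alphabetically, which is not the order you want here.

```r
status <- c("Lo", "Hi", "Med", "Med", "Hi")

# Option 1: factor() with ordered = TRUE
via.factor <- factor(status, levels = c("Lo", "Med", "Hi"), ordered = TRUE)

# Option 2: ordered() with the same levels
via.ordered <- ordered(status, levels = c("Lo", "Med", "Hi"))

# Both calls produce the same object
identical(via.factor, via.ordered)

# Without an explicit levels argument, the levels are sorted
# alphabetically, giving Hi < Lo < Med
default.levels <- levels(factor(status, ordered = TRUE))
default.levels
```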

You can tell an ordered factor from an ordinary factor by the presence of less-than signs (<) between the levels.
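If you'd rather check in code than read the printed levels, is.ordered() and class() give the same answer. This is a small sketch; plain.status is a name invented here for an ordinary factor with the same levels.

```r
status <- c("Lo", "Hi", "Med", "Med", "Hi")

# An ordinary factor and an ordered factor with the same levels
plain.status   <- factor(status, levels = c("Lo", "Med", "Hi"))
ordered.status <- factor(status, levels = c("Lo", "Med", "Hi"), ordered = TRUE)

is.ordered(plain.status)     # FALSE
is.ordered(ordered.status)   # TRUE

# An ordered factor carries an extra class in front of "factor"
class(ordered.status)        # "ordered" "factor"
```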

Ordered factors have a big practical advantage in R. Many R functions use the level order of a factor when they present results, so the output appears in the order that you expect. For example, compare the results of table(status) with table(ordered.status):

> table(status)
status
 Hi  Lo Med
  2   1   2

Notice that the results are ordered alphabetically, because status is a plain character vector. Performing the same function on the ordered factor yields results that are easier to interpret because they’re now sorted in the order Lo, Med, Hi:

> table(ordered.status)
ordered.status
 Lo Med  Hi
  1   2   2

R preserves the ordering information inherent in ordered factors. In Part V, you see how this becomes an essential tool to gain control over the appearance of bar charts.
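The preserved ordering also shows up outside of tables and plots. As a brief sketch using the same data, an ordered factor can be sorted, compared against one of its levels, and passed to max():

```r
status <- c("Lo", "Hi", "Med", "Med", "Hi")
ordered.status <- factor(status, levels = c("Lo", "Med", "Hi"), ordered = TRUE)

# Sorting follows the level order, not the alphabet
sort(ordered.status)        # Lo Med Med Hi Hi

# Comparisons against a level work element by element
at.least.med <- ordered.status >= "Med"
at.least.med                # FALSE TRUE TRUE TRUE TRUE

# The highest status present in the data
max(ordered.status)         # Hi
```

The same comparison on an ordinary (unordered) factor returns NA with a warning, because R has no ordering to compare against.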

Also, in statistical modeling, R applies the appropriate contrasts when you have factors or ordered factors in your model: by default, treatment contrasts for ordinary factors and polynomial contrasts for ordered factors.
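To see which contrasts R would use, you can inspect them with contrasts(). The sketch below assumes the default setting of options("contrasts") has not been changed in your session; plain.status is again a made-up name for the unordered version of the factor.

```r
status <- c("Lo", "Hi", "Med", "Med", "Hi")
plain.status   <- factor(status, levels = c("Lo", "Med", "Hi"))
ordered.status <- factor(status, levels = c("Lo", "Med", "Hi"), ordered = TRUE)

# The session-wide defaults for unordered and ordered factors
getOption("contrasts")

# Ordinary factor: treatment contrasts, with Lo as the baseline
# and one indicator column each for Med and Hi
contrasts(plain.status)

# Ordered factor: polynomial contrasts, with a linear (.L)
# and a quadratic (.Q) column
contrasts(ordered.status)
```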