How to Create a Factor in R

By Andrie de Vries, Joris Meys

To create a factor in R, you use the factor() function. The first three arguments of factor() warrant some exploration:

  • x: The input vector that you want to turn into a factor.

  • levels: An optional vector of the values that x might have taken. By default, the levels are the unique values of x, sorted lexicographically.

  • labels: Another optional vector that, by default, takes the same values as levels. You can use this argument to rename your levels.

The fact that you can supply both levels and labels to factor() can lead to confusion. Just remember that levels refers to the input values of x, while labels refers to the output values of the new factor.
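A minimal sketch of that distinction follows. The sizes vector and its labels are hypothetical and used only for illustration. Without a levels argument, the levels come out in sorted order; with levels and labels, you control both the order and the names that the factor displays.

```r
# Hypothetical input vector, used only for illustration
sizes <- c("s", "l", "m", "s")

# Default: the levels are the sorted unique values of the input
default_levels <- levels(factor(sizes))

# levels names the values as they appear in the input (and fixes their order);
# labels supplies the names the resulting factor shows instead
f <- factor(sizes,
            levels = c("s", "m", "l"),
            labels = c("Small", "Medium", "Large"))

default_levels    # levels in sorted order
levels(f)         # the output names, in the order given by levels
as.character(f)   # each input value replaced by its label
```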

Consider the following example of a vector consisting of compass directions:

> directions <- c("North", "East", "South", "South")

Notice that this vector contains the value “South” twice and lacks the value “West”. First, convert directions to a factor:

> factor(directions)
[1] North East South South
Levels: East North South

Notice that the levels of your new factor do not contain the value “West”, which is as expected. In practice, however, it makes sense to have all the possible compass directions as levels of your factor. To add the missing level, you specify the levels argument of factor():

> factor(directions, levels = c("North", "East", "South", "West"))
[1] North East South South
Levels: North East South West

As you can see, the values are still the same, but this time the levels also contain “West”.
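One place where the extra level shows up is in summaries. The sketch below, which assumes the same directions vector as above and uses the illustrative names f and counts, passes the factor to table(), which counts every level, including one that never occurs in the data.

```r
directions <- c("North", "East", "South", "South")
f <- factor(directions, levels = c("North", "East", "South", "West"))

# table() reports a count for every level, even an unused one
counts <- table(f)

counts["West"]   # the unused level is still listed, with a count of zero
nlevels(f)       # number of levels, regardless of how many are used
```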

Now imagine that you actually prefer to have abbreviated names for the levels. To do this, you make use of the labels argument:

> factor(directions, levels = c("North", "East", "South", "West"), labels = c("N", "E", "S", "W"))
[1] N E S S
Levels: N E S W
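
Labels are paired with levels by position, so the two vectors should list the directions in the same order. As a rough check, assuming the same directions vector and an illustrative variable name compass, you can store the factor and inspect its levels and underlying integer codes:

```r
directions <- c("North", "East", "South", "South")

compass <- factor(directions,
                  levels = c("North", "East", "South", "West"),
                  labels = c("N", "E", "S", "W"))

levels(compass)       # the abbreviated names, in the order of levels
as.integer(compass)   # position of each value within the levels
```

The integer codes depend only on the order of levels; labels change only the names that are displayed.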