How to Set the Contrasts for Your Data with R

By Andrie de Vries, Joris Meys

Before you can use R’s aov() function with your data, you’d better set the contrasts you’re going to use. Contrasts are very often forgotten about when doing ANOVA (analysis of variance), but they determine how the model coefficients are interpreted, so choosing them deliberately makes the output of aov() and its helper functions easier to read correctly.

What are those contrasts then? When you fit a model, R translates each factor into a set of numeric variables, one fewer than the number of levels of the factor. Say you have a factor with three levels. R creates two variables, and each level of the factor is represented by a combination of values of those variables. These values define how the coefficients of the model have to be interpreted.
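To make this concrete, here is a minimal sketch that asks R for the variables it would build from a three-level factor. The factor f and its values are made up for illustration, and the coding is fixed to contr.treatment (R's default for unordered factors) so the result doesn't depend on your session settings.

```r
# Hypothetical three-level factor with two observations per level
f <- factor(c('A', 'A', 'B', 'B', 'C', 'C'))

# Build the design matrix; the contrast function is named explicitly
mm <- model.matrix(~ f, contrasts.arg = list(f = 'contr.treatment'))

# One intercept column plus two variables that code the three levels
print(colnames(mm))   # "(Intercept)" "fB" "fC"
print(mm)
```

Each row of the matrix shows the combination of values that represents the level of that observation.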

By default, R uses treatment contrasts, as you can see when you check the relevant option like this:

> options('contrasts')
$contrasts
        unordered           ordered 
"contr.treatment"      "contr.poly" 

Here you see that R uses different contrasts for unordered and ordered factors. The values shown are the names of contrast functions. Each of these functions returns a matrix with the contrast values for each level of the factor. The default contrasts for a factor with three levels look like this:

> X <- factor(c('A','B','C'))
> contr.treatment(levels(X))
  B C
A 0 0
B 1 0
C 0 1

The two variables B and C are named that way because the variable B has a value of 1 if the factor level is B; otherwise, it has a value of 0. The same goes for C. Level A is represented by two zeros and is called the reference level. In a one-factor model, the intercept is the mean of A, and the coefficients for B and C are the differences between the means of those levels and the mean of A.
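You can check this interpretation on a small example. The sketch below uses a made-up data frame d (the names d, group, and y are purely illustrative) and sets treatment contrasts for the factor directly in the call to aov(), so it doesn't rely on the session options.

```r
# Hypothetical data: three groups with two observations each
d <- data.frame(
  group = factor(rep(c('A', 'B', 'C'), each = 2)),
  y     = c(1, 3, 5, 7, 9, 11)
)

# One-factor model with treatment contrasts for 'group'
fit <- aov(y ~ group, data = d, contrasts = list(group = 'contr.treatment'))

# Mean of y within each level of the factor
group_means <- tapply(d$y, d$group, mean)

print(group_means)  # means of A, B and C
print(coef(fit))    # intercept, then the differences B - A and C - A
```

The intercept matches the mean of level A, and adding the coefficient for B or C to the intercept gives back the mean of that level.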

You can change these contrasts using the same options() function, like this:

> options(contrasts=c('contr.sum','contr.poly'))
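Because options() changes the setting for the rest of the R session, it can be handy to keep the previous value so you can put it back afterward. A minimal sketch of that pattern, where the variable names before and old are just illustrative:

```r
# Remember the current setting for comparison
before <- getOption('contrasts')

# options() returns the previous values (invisibly), so store them
old <- options(contrasts = c('contr.sum', 'contr.poly'))

# The new setting is now active for every model you fit
print(getOption('contrasts'))

# Restore the earlier setting when you are done
options(old)
print(getOption('contrasts'))
```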

The contrast function contr.sum() gives sum-to-zero contrasts, where you compare every level to the overall mean (in a one-factor model, the intercept becomes the mean of the level means). You can get more information about these contrasts on the Help page ?contr.sum.
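To see what that means in practice, the following sketch prints the contrast matrix for three levels and refits a one-factor model with contr.sum. The data frame d and its values are invented for the example, and the contrasts are passed to aov() directly rather than through options().

```r
# Sum contrasts for a factor with three levels
cs <- contr.sum(3)
print(cs)
print(colSums(cs))  # every column sums to zero

# Hypothetical data: three groups with two observations each
d <- data.frame(
  group = factor(rep(c('A', 'B', 'C'), each = 2)),
  y     = c(1, 3, 5, 7, 9, 11)
)

fit_sum <- aov(y ~ group, data = d, contrasts = list(group = 'contr.sum'))

# Mean of the three level means
grand_mean <- mean(tapply(d$y, d$group, mean))

print(grand_mean)
print(coef(fit_sum))  # intercept, then deviations of A and B from that mean
```

The last level (C here) gets no coefficient of its own; its deviation from the mean is minus the sum of the other two.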