How to Perform Analyses of Variance with Two Kinds of Variables
What happens when you are performing analyses of variance (ANOVA) and you have a Between Groups variable and a Within Groups variable . . . at the same time? How can that happen?
Very easily. Here’s an example. Suppose you want to study the effects of presentation media on the reading speeds of fourth-graders. You randomly assign your fourth-graders (subjects) to read from either e-readers or books. Because each subject uses only one medium, that’s the Between Groups variable.
Let’s say you’re also interested in the effects of font. So you have each subject read text in each of these fonts: Haettenschweiler, Arial, and Calibri. Because each subject reads all the fonts, that’s the Within Groups variable. To keep the order of presentation from influencing the results, you randomly order the fonts for each subject.
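Here’s a minimal Python sketch of how that design might be set up. The function name `build_design`, the subject IDs, and the seed are illustrative assumptions, not part of the original study description. The sketch shuffles the subjects, alternates them between the two media so that the groups come out the same size, and shuffles the font order separately for each subject.

```python
import random

FONTS = ["Haettenschweiler", "Arial", "Calibri"]  # Within Groups variable
MEDIA = ["e-reader", "book"]                      # Between Groups variable


def build_design(subject_ids, seed=0):
    """Randomly assign each subject to one medium and to a random font order.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    ids = list(subject_ids)
    rng.shuffle(ids)  # random assignment to the Between Groups variable

    design = {}
    for position, subject in enumerate(ids):
        # Alternating through the shuffled list keeps group sizes equal
        # (when the number of subjects is a multiple of the number of media).
        medium = MEDIA[position % len(MEDIA)]
        order = FONTS[:]      # every subject reads every font ...
        rng.shuffle(order)    # ... in a subject-specific random order
        design[subject] = {"medium": medium, "font_order": order}
    return design


design = build_design(range(1, 9))  # eight made-up subject IDs
for subject in sorted(design):
    print(subject, design[subject]["medium"], design[subject]["font_order"])
```

Each subject appears in exactly one medium (Between Groups) but gets all three fonts (Within Groups), which is what makes this a mixed design.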
What would the ANOVA table look like? It’s categorized into a set of sources that make up Between Groups variability, and a set of sources that make up Within Groups variability.
Source | SS | df | MS | F |
---|---|---|---|---|
Between | SS_{Between} | df_{Between} | | |
A | SS_{A} | df_{A} | SS_{A}/df_{A} | MS_{A}/MS_{S/A} |
S/A | SS_{S/A} | df_{S/A} | SS_{S/A}/df_{S/A} | |
Within | SS_{Within} | df_{Within} | | |
B | SS_{B} | df_{B} | SS_{B}/df_{B} | MS_{B}/MS_{B X S/A} |
A X B | SS_{A X B} | df_{A X B} | SS_{A X B}/df_{A X B} | MS_{A X B}/MS_{B X S/A} |
B X S/A | SS_{B X S/A} | df_{B X S/A} | SS_{B X S/A}/df_{B X S/A} | |
Total | SS_{Total} | df_{Total} | | |
In the Between category, A is the name of the Between Groups variable. Read “S/A” as “S within A.” This just says that the people in one level of A are different from the people in the other levels of A.
In the Within category, B is the name of the Within Groups variable. A X B is the interaction of the two variables. B X S/A is something like the B variable interacting with subjects within A. As you can see, anything associated with B falls into the Within Groups category.
The first thing to note is the three F-ratios. The first one tests for differences among the levels of A, the second for differences among the levels of B, and the third for the interaction of the two. Notice also that the denominator for the first F-ratio is different from the ones for the other two. This happens more and more as ANOVAs increase in complexity.
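To make the two different denominators concrete, here’s a short Python sketch that turns SS and df values into the three F-ratios. The function name `f_ratios` and the numbers are made up purely for illustration; they aren’t results from any real study.

```python
def f_ratios(ss, df):
    """Compute the three F-ratios for the mixed design.

    ss and df are dicts keyed by source name. Hypothetical helper.
    """
    ms = {source: ss[source] / df[source] for source in ss}  # MS = SS / df
    return {
        "A": ms["A"] / ms["S/A"],              # Between Groups error term
        "B": ms["B"] / ms["B x S/A"],          # Within Groups error term
        "A x B": ms["A x B"] / ms["B x S/A"],  # same Within Groups error term
    }


# Illustrative values only (2 levels of A, 3 subjects per level, 3 levels of B).
ss = {"A": 10, "S/A": 20, "B": 30, "A x B": 6, "B x S/A": 24}
df = {"A": 1, "S/A": 4, "B": 2, "A x B": 2, "B x S/A": 8}

print(f_ratios(ss, df))
```

Notice that A is tested against MS_{S/A}, while B and A X B are both tested against MS_{B X S/A}.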
Next, it’s important to be aware of some relationships. At the top level:
SS_{Between} + SS_{Within} = SS_{Total}
df_{Between} + df_{Within} = df_{Total}
The Between component breaks down further:
SS_{A} + SS_{S/A} = SS_{Between}
df_{A} + df_{S/A} = df_{Between}
The Within component breaks down, too:
SS_{B} + SS_{A X B} + SS_{B X S/A} = SS_{Within}
df_{B} + df_{A X B} + df_{B X S/A} = df_{Within}
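If you’d like to see these relationships hold on actual numbers, the following Python sketch computes each SS directly from the data and then checks the sums. It assumes a balanced design (the same number of subjects at every level of A), and the function name `mixed_anova_ss`, the data layout, and the scores are all made up for illustration.

```python
import math


def mixed_anova_ss(data):
    """Sums of squares and df for a balanced mixed design.

    data[i][j][k] is the score for level i of A, subject j within that
    level, and level k of B. Hypothetical helper for illustration.
    """
    a, n, b = len(data), len(data[0]), len(data[0][0])
    cells = [(i, j, k) for i in range(a) for j in range(n) for k in range(b)]

    grand = sum(data[i][j][k] for i, j, k in cells) / (a * n * b)
    a_mean = [sum(sum(subj) for subj in data[i]) / (n * b) for i in range(a)]
    b_mean = [sum(data[i][j][k] for i in range(a) for j in range(n)) / (a * n)
              for k in range(b)]
    ab_mean = [[sum(data[i][j][k] for j in range(n)) / n for k in range(b)]
               for i in range(a)]
    s_mean = [[sum(data[i][j]) / b for j in range(n)] for i in range(a)]

    # Each SS is computed on its own (summing over every score), so the
    # additive relationships are checked rather than assumed.
    ss = {
        "Total": sum((data[i][j][k] - grand) ** 2 for i, j, k in cells),
        "Between": sum((s_mean[i][j] - grand) ** 2 for i, j, k in cells),
        "A": sum((a_mean[i] - grand) ** 2 for i, j, k in cells),
        "S/A": sum((s_mean[i][j] - a_mean[i]) ** 2 for i, j, k in cells),
        "Within": sum((data[i][j][k] - s_mean[i][j]) ** 2 for i, j, k in cells),
        "B": sum((b_mean[k] - grand) ** 2 for i, j, k in cells),
        "A x B": sum((ab_mean[i][k] - a_mean[i] - b_mean[k] + grand) ** 2
                     for i, j, k in cells),
        "B x S/A": sum((data[i][j][k] - ab_mean[i][k] - s_mean[i][j] + a_mean[i]) ** 2
                       for i, j, k in cells),
    }
    df = {
        "Total": a * n * b - 1,
        "Between": a * n - 1, "A": a - 1, "S/A": a * (n - 1),
        "Within": a * n * (b - 1), "B": b - 1,
        "A x B": (a - 1) * (b - 1), "B x S/A": a * (n - 1) * (b - 1),
    }
    return ss, df


# Made-up scores: 2 levels of A, 3 subjects per level, 3 levels of B.
scores = [
    [[10, 12, 14], [9, 13, 15], [11, 12, 16]],
    [[14, 15, 19], [13, 17, 18], [15, 16, 21]],
]
ss, df = mixed_anova_ss(scores)

assert math.isclose(ss["Between"] + ss["Within"], ss["Total"])
assert math.isclose(ss["A"] + ss["S/A"], ss["Between"])
assert math.isclose(ss["B"] + ss["A x B"] + ss["B x S/A"], ss["Within"])
assert df["Between"] + df["Within"] == df["Total"]
assert df["A"] + df["S/A"] == df["Between"]
assert df["B"] + df["A x B"] + df["B x S/A"] == df["Within"]
```

The df formulas in the sketch are the standard ones for a balanced design with a levels of A, n subjects per level, and b levels of B.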
Knowing these relationships helps complete the ANOVA table after Excel has gone through its paces.
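As a rough sketch of how that completion might work, the Python function below fills in whichever entries are missing by addition or subtraction. Which rows you actually have in hand depends on how you ran the analysis, so the starting values here are an assumption chosen only for illustration, and `complete_entries` is a hypothetical helper name. The same function works for SS values or for df values.

```python
def complete_entries(known):
    """Fill in missing SS (or df) entries from the additive relationships.

    known is a dict keyed by source name. Hypothetical helper.
    """
    values = dict(known)
    rules = [
        ("Total", ["Between", "Within"]),
        ("Between", ["A", "S/A"]),
        ("Within", ["B", "A x B", "B x S/A"]),
    ]
    changed = True
    while changed:
        changed = False
        for whole, parts in rules:
            missing = [name for name in [whole] + parts if name not in values]
            if len(missing) != 1:
                continue  # nothing to solve, or too many unknowns for now
            if missing[0] == whole:
                values[whole] = sum(values[p] for p in parts)
            else:
                others = sum(values[p] for p in parts if p != missing[0])
                values[missing[0]] = values[whole] - others
            changed = True
    return values


# Illustrative starting point: five SS values known, three still blank.
known_ss = {"Total": 90, "A": 10, "S/A": 20, "B": 30, "A x B": 6}
print(complete_entries(known_ss))
```

If a relationship has more than one blank, the function leaves it alone until another relationship supplies one of the missing pieces.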