How to Compare Two Data Samples with R’s T-Test
If you want to use R’s t.test() function to compare your data, you first have to check, among other things, whether both samples are normally distributed.
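One possible way to run that normality check is sketched below. It assumes the built-in beaver2 dataset used in the rest of this article and picks the Shapiro-Wilk test (shapiro.test()) as the check; that choice and the variable names temps_by_state and normality_p are illustrative, not something the t-test itself requires.

```r
# beaver2 ships with R's built-in datasets package.
# Split the temperatures by activity state (0 = inactive, 1 = active).
temps_by_state <- split(beaver2$temp, beaver2$activ)

# Run a Shapiro-Wilk test on each group (an assumed choice of check).
# A small p-value suggests the group deviates from a normal distribution.
normality <- lapply(temps_by_state, shapiro.test)

# Collect the two p-values into a named vector for easy reading.
normality_p <- sapply(normality, function(res) res$p.value)
print(normality_p)
```

A normal quantile plot made with qqnorm() and qqline() is a common visual complement to a formal test like this one.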
If you want to know whether the average body temperature of the beaver in the built-in beaver2 dataset differs between its active and inactive periods, you can find out with a single command:
> t.test(temp ~ activ, data=beaver2)

        Welch Two-Sample t-test

data:  temp by activ
t = -18.5479, df = 80.852, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.8927106 -0.7197342
sample estimates:
mean in group 0 mean in group 1
       37.09684        37.90306
The classic t-test assumes that the variances of both samples are approximately equal. By default, R's t.test() uses Welch's variation on the t-test, which corrects for unequal variances.
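To see the difference between the two variants, you can ask t.test() for the classic pooled-variance test by setting its var.equal argument to TRUE. The following is a minimal sketch on the beaver2 data; the object names welch and student are only illustrative.

```r
# Default behavior: Welch's t-test (var.equal = FALSE).
welch <- t.test(temp ~ activ, data = beaver2)

# Classic Student's t-test, which assumes equal variances in both groups.
student <- t.test(temp ~ activ, data = beaver2, var.equal = TRUE)

# Compare the degrees of freedom: the pooled test uses n1 + n2 - 2,
# whereas Welch's correction usually gives a smaller, fractional value.
c(welch = unname(welch$parameter), student = unname(student$parameter))
```

If the two results lead to different conclusions, that is a hint that the variances differ enough for the choice of test to matter.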
You get a whole lot of information here:
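The line that starts with t = gives you the test statistic (t for this test), the degrees of freedom (df), and the corresponding p-value. The very small p-value indicates that the means of both samples differ significantly.
The alternative hypothesis tells you what you can conclude if the p-value is lower than the limit for significance. By convention, many scientists reject the null hypothesis (here, that the difference in means is 0) in favor of the alternative hypothesis if the p-value is lower than 0.05.
The 95 percent confidence interval gives a range of plausible values for the difference between the means: if you repeated the experiment many times, about 95 percent of the intervals calculated this way would contain the true difference. In this case, the interval runs from about -0.89 to -0.72, so the mean temperature in group 0 is probably between 0.72 and 0.89 lower than in group 1.
The last line gives you the means of both samples. Group 0 contains the inactive periods, and group 1 contains the active periods.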
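The pieces of information listed above don't have to be read off the printed output. t.test() returns a list-like object, so you can store it and pull out each element by name. The sketch below assumes you save the result in a variable called result, which is just an illustrative name.

```r
# Store the test result instead of only printing it.
result <- t.test(temp ~ activ, data = beaver2)

result$statistic   # the test statistic t
result$parameter   # the degrees of freedom
result$p.value     # the p-value as a plain number
result$conf.int    # lower and upper limit of the confidence interval
result$estimate    # the means of both groups

# Check the result against the conventional 0.05 limit for significance.
result$p.value < 0.05
```

This is handy when you want to use the numbers in further calculations or report them in your own format.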
You read the formula temp ~ activ as “evaluate temp within groups determined by activ.” Alternatively, you can use two separate vectors for the samples you want to compare and pass both to the function, as in the following example:
> activetemp <- beaver2$temp[beaver2$activ==1]
> inactivetemp <- beaver2$temp[beaver2$activ==0]
> t.test(activetemp, inactivetemp)
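Note that the order of the samples matters for the sign of the result. With the formula, R subtracts the mean of group 1 from the mean of group 0, whereas the two-vector call above subtracts the second vector from the first. The following sketch compares both calls; the names by_vectors and by_formula are only illustrative.

```r
# Build the two samples as separate vectors.
activetemp <- beaver2$temp[beaver2$activ == 1]
inactivetemp <- beaver2$temp[beaver2$activ == 0]

# Same test, specified in two different ways.
by_vectors <- t.test(activetemp, inactivetemp)
by_formula <- t.test(temp ~ activ, data = beaver2)

# The t statistics have the same size but opposite signs ...
all.equal(unname(by_vectors$statistic), -unname(by_formula$statistic))

# ... and the p-value is the same either way.
all.equal(by_vectors$p.value, by_formula$p.value)
```

The confidence interval is mirrored in the same way, so check which group comes first before you interpret the direction of the difference.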