The Relationship between Confidence Intervals and Significance Testing


You can use confidence intervals (CIs) as an alternative to some of the usual significance tests. To assess significance using CIs, you first define a number that measures the amount of effect you're testing for. This effect size can be the difference between two means or two proportions, the ratio of two means, an odds ratio, a relative risk ratio, or a hazard ratio, among others.

The complete absence of any effect corresponds to a difference of 0, or a ratio of 1, so these are called the "no-effect" values.
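As a minimal sketch of what "no-effect" values look like in practice, the snippet below computes three common effect sizes from event counts in two groups. The function name `effect_sizes` and all the counts are made up for illustration. When the two groups are identical, the difference comes out as 0 and both ratios come out as 1.

```python
def effect_sizes(events_a, n_a, events_b, n_b):
    """Return three effect-size measures comparing group A with group B."""
    p_a = events_a / n_a          # proportion with the event in group A
    p_b = events_b / n_b          # proportion with the event in group B
    odds_a = p_a / (1 - p_a)
    odds_b = p_b / (1 - p_b)
    return {
        "risk_difference": p_a - p_b,   # no-effect value: 0
        "relative_risk": p_a / p_b,     # no-effect value: 1
        "odds_ratio": odds_a / odds_b,  # no-effect value: 1
    }

# Hypothetical counts: identical groups, so there is no effect at all.
no_effect = effect_sizes(20, 100, 20, 100)

# Hypothetical counts: group A has more events than group B.
some_effect = effect_sizes(30, 100, 20, 100)

print(no_effect)
print(some_effect)
```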

The following hold whenever the CI and the significance test are based on the same statistical method and assumptions:

If the 95 percent CI around the observed effect size includes the no-effect value (0 for differences, 1 for ratios), then the effect is not statistically significant (that is, a significance test for that effect will produce p > 0.05).
If the 95 percent CI around the observed effect size does not include the no-effect value, then the effect is statistically significant (that is, a significance test for that effect will produce p ≤ 0.05).
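The two rules above can be checked numerically. The sketch below assumes a large-sample normal (z) approximation for the difference between two means, which is only one of several ways to build such a CI; with small samples, a t-based method would normally be used instead. The function names and the summary statistics are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def diff_ci_and_p(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """95% CI and two-sided p value for mean_a - mean_b (normal approximation)."""
    diff = mean_a - mean_b
    se = sqrt(sd_a**2 / n_a + sd_b**2 / n_b)   # standard error of the difference
    z_crit = NormalDist().inv_cdf(0.975)       # about 1.96 for a 95% CI
    ci = (diff - z_crit * se, diff + z_crit * se)
    p = 2 * (1 - NormalDist().cdf(abs(diff) / se))
    return diff, ci, p

def ci_excludes(ci, no_effect_value=0.0):
    """True if the interval does not contain the no-effect value."""
    low, high = ci
    return not (low <= no_effect_value <= high)

# Hypothetical study 1: difference of 2 with a standard error of 1.
diff1, ci1, p1 = diff_ci_and_p(12.0, 10.0, 200, 10.0, 10.0, 200)

# Hypothetical study 2: difference of 1 with a standard error of 1.
diff2, ci2, p2 = diff_ci_and_p(11.0, 10.0, 200, 10.0, 10.0, 200)

for label, ci, p in (("study 1", ci1, p1), ("study 2", ci2, p2)):
    print(label, "CI:", ci, "p:", p, "CI excludes 0:", ci_excludes(ci))
```

In this sketch, study 1 has a CI that excludes 0 and a p value below 0.05, while study 2 has a CI that includes 0 and a p value above 0.05. The two agree because the interval and the test are built from the same estimate and the same standard error.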

The same kind of correspondence holds for other confidence levels and significance levels: a 90 percent CI corresponds to the 0.10 significance level, a 99 percent CI corresponds to the 0.01 significance level, and so on.
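To illustrate the correspondence across confidence levels, here is a small sketch that again assumes a normal approximation and a hypothetical effect estimate with a known standard error. For each confidence level, it compares "CI excludes 0" with "p is at or below 1 minus the confidence level". The names `confidence_interval` and `two_sided_p` are illustrative, not from any particular library.

```python
from statistics import NormalDist

def confidence_interval(estimate, se, confidence):
    """Normal-approximation CI for an estimate with standard error se."""
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return estimate - z_crit * se, estimate + z_crit * se

def two_sided_p(estimate, se, no_effect_value=0.0):
    """Two-sided p value for the hypothesis estimate == no_effect_value."""
    z = (estimate - no_effect_value) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

estimate, se = 2.2, 1.0            # hypothetical difference and standard error
p = two_sided_p(estimate, se)      # p is roughly 0.028 for z = 2.2

results = []
for confidence in (0.90, 0.95, 0.99):
    alpha = 1 - confidence
    low, high = confidence_interval(estimate, se, confidence)
    excludes_zero = not (low <= 0.0 <= high)
    significant = p <= alpha
    results.append((confidence, excludes_zero, significant))
    print(f"{confidence:.0%} CI: ({low:.2f}, {high:.2f})  "
          f"excludes 0: {excludes_zero}  p <= {alpha:.2f}: {significant}")
```

With these made-up numbers, the 90 and 95 percent intervals exclude 0 and the 99 percent interval does not, matching a p value that falls between 0.01 and 0.05.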

So you have two different, but related, ways to assess whether some effect is present: you can use significance tests, and you can use confidence intervals. Which one is better? The two methods are consistent with each other, but many people prefer the CI approach to the p-value approach. Why?

The p value is the result of the complex interplay between the observed effect size, the sample size, and the size of random fluctuations, all boiled down into a single number that doesn't tell you whether the effect was large or small, clinically important or negligible.
The CI around the mean effect clearly shows you the observed effect size, along with an indicator of how uncertain your knowledge of that effect size is. It tells you not only whether the effect is statistically significant, but also can give you an intuitive sense of whether the effect is clinically important.
The CI approach lends itself to a very simple and natural way of comparing two products for equivalence or noninferiority.
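The points above can be sketched with made-up numbers. The first part shows two hypothetical studies that yield the same p value even though their effect sizes differ by a factor of 40, which only the CIs reveal. The second part shows one common way a CI is used for noninferiority: assuming that higher values are better and that a margin has been chosen in advance, the new product is treated as noninferior if the lower CI bound for (new minus reference) lies above the negative margin. The margin, the function names, and all the data are assumptions for illustration, and the normal approximation is used throughout.

```python
from statistics import NormalDist

def ci_and_p(estimate, se, confidence=0.95):
    """Normal-approximation CI and two-sided p value (no-effect value 0)."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    ci = (estimate - z_crit * se, estimate + z_crit * se)
    p = 2 * (1 - NormalDist().cdf(abs(estimate) / se))
    return ci, p

def is_noninferior(ci, margin):
    """True if the lower bound of the CI for (new - reference) is above -margin."""
    low, _high = ci
    return low > -margin

# Part 1: same p value, very different effect sizes (hypothetical studies).
ci_small, p_small = ci_and_p(estimate=0.5, se=0.25)   # small effect, large study
ci_large, p_large = ci_and_p(estimate=20.0, se=10.0)  # large effect, small study
print("small effect:", ci_small, p_small)
print("large effect:", ci_large, p_large)

# Part 2: noninferiority check with a hypothetical margin of 4 units.
ci_new_vs_ref, p_new_vs_ref = ci_and_p(estimate=-1.0, se=1.0)
print("new - reference CI:", ci_new_vs_ref)
print("noninferior at margin 4:", is_noninferior(ci_new_vs_ref, margin=4.0))
```

In the second part, the CI includes 0, so the difference is not statistically significant, yet the whole interval sits above the negative margin. That is the kind of conclusion a p value alone cannot express.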

About This Article

About the book author:

John C. Pezzullo, PhD, has held faculty appointments in the departments of biomathematics and biostatistics, pharmacology, nursing, and internal medicine at Georgetown University. He is semi-retired and continues to teach biostatistics and clinical trial design online to Georgetown University students.