Confidence Interval Basics

By John Pezzullo

In biostatistics, it’s important to be comfortable with the basic concepts and terminology related to confidence intervals. This is an area where nuances of meaning can be tricky, and the right-sounding words can be used the wrong way.

Defining confidence intervals

Informally, a confidence interval (CI) indicates a range of values that’s likely to encompass the true value. More formally, the CI around your sample statistic is calculated in such a way that it has a specified chance of surrounding (or “containing”) the value of the corresponding population parameter.

Unlike the standard error (SE), which is usually written as a ± number immediately following your measured value (for example, a blood glucose measurement of 120 ± 3 mg/dL), the CI is usually written as a pair of numbers separated by a dash, like this: 114–126.

The two numbers that make up the lower and upper ends of the confidence interval are called the lower and upper confidence limits (CLs). Sometimes you see the abbreviations written with a subscript L or U, like this: CL_L or CL_U, indicating the lower and upper confidence limits, respectively.

Although SEs and CIs are both used as indicators of the precision of a numerical quantity, they differ in their focus (sample or population):

  • A standard error indicates how much your observed sample statistic may fluctuate if the same experiment is repeated a large number of times, so the SE focuses on the sample.

  • A confidence interval indicates the range that’s likely to contain the true population parameter, so the CI focuses on the population.

One important property of confidence intervals (and standard errors) is that they vary inversely with the square root of the sample size. For example, if you were to quadruple your sample size, it would cut the SE in half, and it would cut the width of the CI in half. This “square root law” is one of the most widely applicable rules in all of statistics.
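The square root law can be checked with a few lines of Python. This is a minimal sketch, assuming the statistic is a sample mean (so SE = SD / √n) and that the 95 percent CI is approximated as ±1.96 SE. The standard deviation of 15 mg/dL is a hypothetical value, chosen only so that a sample of 25 gives the SE of 3 used in the glucose example above.

```python
import math

def standard_error(sd: float, n: int) -> float:
    """Standard error of a sample mean: SD divided by the square root of n."""
    return sd / math.sqrt(n)

sd = 15.0  # hypothetical standard deviation in mg/dL (not from the article)

# Each step quadruples the sample size, so the SE and the CI width should halve.
for n in (25, 100, 400):
    se = standard_error(sd, n)
    width = 2 * 1.96 * se  # approximate width of a normal-based 95% CI
    print(f"n = {n:3d}   SE = {se:.2f}   95% CI width = {width:.2f}")
```

For small samples, a CI based on the Student t distribution uses a multiplier that also shrinks slightly as the sample grows, so the halving of the CI width is only approximate in that case.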

Understanding confidence levels

The probability that the confidence interval encompasses the true value is called the confidence level of the CI. You can calculate a CI for any confidence level you like, but the most commonly used value is 95 percent. Whenever you report a confidence interval, you must state the confidence level, like this: 95% CI = 114–126.

In general, higher confidence levels correspond to wider confidence intervals, and lower confidence level intervals are narrower. For example, the range 118–122 may have a 50 percent chance of containing the true population parameter within it; 115–125 may have a 90 percent chance of containing the truth, and 112–128 may have a 99 percent chance.
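The relationship between confidence level and interval width can be sketched as follows. The code assumes a normal-based CI (estimate ± z × SE) and reuses the glucose example of 120 with an SE of 3; the function name normal_ci is illustrative, not a standard library routine. Under those assumptions, the results round to roughly the ranges quoted above.

```python
from statistics import NormalDist

def normal_ci(estimate: float, se: float, level: float) -> tuple[float, float]:
    """Balanced two-sided CI under a normal approximation (an assumption here)."""
    # Put half of the failure probability in each tail.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate - z * se, estimate + z * se

for level in (0.50, 0.90, 0.95, 0.99):
    lower, upper = normal_ci(120, 3, level)
    print(f"{level:.0%} CI = {lower:.1f} to {upper:.1f}")
```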

The confidence level is sometimes abbreviated CL, just like the confidence limit, which can be confusing. Fortunately, the distinction is usually clear from the context in which the CL abbreviation appears.

Understanding balanced and unbalanced confidence intervals

Properly calculated 95 percent confidence intervals contain the true value 95 percent of the time and fail to contain the true value the other 5 percent of the time.
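A small simulation can illustrate what "95 percent of the time" means. This sketch makes several simplifying assumptions that are not in the text above: the data are normally distributed, the population standard deviation is known, and the true mean, SD, sample size, and number of trials are all hypothetical values. It repeats the experiment many times and counts how often the interval contains the true mean.

```python
import random
from statistics import NormalDist, mean

def coverage(true_mean=120.0, sd=15.0, n=25, trials=2000, level=0.95, seed=1):
    """Fraction of simulated CIs that contain the true mean."""
    rng = random.Random(seed)           # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(0.5 + level / 2)
    se = sd / n ** 0.5                  # SE from the (assumed known) population SD
    hits = 0
    for _ in range(trials):
        sample_mean = mean(rng.gauss(true_mean, sd) for _ in range(n))
        if sample_mean - z * se <= true_mean <= sample_mean + z * se:
            hits += 1
    return hits / trials

print(f"Observed coverage: {coverage():.3f}")
```

The observed fraction should land close to 0.95, with some run-to-run scatter because the number of trials is finite.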

Usually, 95 percent confidence limits are calculated to be balanced so that the 5 percent failures are split evenly — the true value is less than the lower confidence limit 2.5 percent of the time and greater than the upper confidence limit 2.5 percent of the time. This is called a two-sided, balanced CI.

But the confidence limits don’t have to be balanced. Sometimes the consequences of overestimating a value may be more severe than underestimating it, or vice versa. You can calculate an unbalanced, two-sided, 95 percent confidence interval that splits the 5 percent failures unevenly; for example, so that the true value is smaller than the lower confidence limit 4 percent of the time, and larger than the upper confidence limit 1 percent of the time.

Unbalanced confidence limits extend farther out from the estimated value on the side with the smaller percentage.
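The 4 percent / 1 percent split can be sketched in code. This assumes a normal-based interval and reuses the estimate of 120 with an SE of 3 from the glucose example; the function name and parameter names are hypothetical. Each limit gets its own z multiplier, chosen from the failure probability assigned to that side.

```python
from statistics import NormalDist

def unbalanced_ci(estimate, se, miss_below, miss_above):
    """Two-sided CI with unequal tail probabilities (normal approximation).

    miss_below: chance that the true value lies below the lower limit
    miss_above: chance that the true value lies above the upper limit
    """
    std_normal = NormalDist()
    lower = estimate - std_normal.inv_cdf(1 - miss_below) * se
    upper = estimate + std_normal.inv_cdf(1 - miss_above) * se
    return lower, upper

lower, upper = unbalanced_ci(120, 3, miss_below=0.04, miss_above=0.01)
print(f"Unbalanced 95% CI = {lower:.2f} to {upper:.2f}")
print(f"Distance below estimate: {120 - lower:.2f}")
print(f"Distance above estimate: {upper - 120:.2f}")
```

Under these assumptions, the upper limit (the 1 percent side) sits farther from 120 than the lower limit (the 4 percent side).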

In some situations, like noninferiority studies, you may want all the failures to be on one side; that is, you want a one-sided confidence limit. The interval is then bounded on one side only, and the other side extends to infinity. For example, you can have an observed value of 120 with a one-sided confidence interval that goes from minus infinity to +125, or another one-sided confidence interval that goes from 115 to plus infinity.
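A one-sided interval can be sketched the same way. The code below assumes a normal approximation, a 95 percent confidence level (the text above does not state the level), and the estimate of 120 with an SE of 3 from the glucose example; the function name and the side argument are illustrative. With those assumptions, the finite limits come out near 125 and 115.

```python
import math
from statistics import NormalDist

def one_sided_ci(estimate, se, level=0.95, side="upper"):
    """One-sided CI under a normal approximation.

    side="upper": interval runs from minus infinity to a finite upper limit
    side="lower": interval runs from a finite lower limit to plus infinity
    """
    # All of the failure probability goes into a single tail.
    z = NormalDist().inv_cdf(level)
    if side == "upper":
        return -math.inf, estimate + z * se
    return estimate - z * se, math.inf

print(one_sided_ci(120, 3, side="upper"))
print(one_sided_ci(120, 3, side="lower"))
```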