Statistical Analysis with R For Dummies
After you calculate the variance of a set of numbers, you have a value whose units are different from your original measurements. For example, if your original measurements are in inches, their variance is in square inches. This is because you square the deviations before you average them. So the variance in the five-score population in the preceding example is 6.8 square inches.

It might be hard to grasp what that means. Often, it's more intuitive if the variation statistic is in the same units as the original measurements. It's easy to turn variance into that kind of statistic. All you have to do is take the square root of the variance.

Like the variance, this square root is so important that it has a special name: standard deviation.

Population standard deviation

The standard deviation of a population is the square root of the population variance. The symbol for the population standard deviation is σ (lowercase sigma). Its formula is

$$\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (X - \bar{X})^2}{N}}$$

For this 5-score population of measurements (in inches):

50, 47, 52, 46, and 45

the population variance is 6.8 square inches, and the population standard deviation is 2.61 inches (rounded off).
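The arithmetic behind those two numbers can be sketched in R. This is a minimal by-hand version, not a built-in population function; the variable names (scores, deviations, pop_var, pop_sd) are chosen here for illustration.

```r
# Five-score population from the text, in inches
scores <- c(50, 47, 52, 46, 45)

N <- length(scores)                  # population size (5)
deviations <- scores - mean(scores)  # distance of each score from the mean (48)
pop_var <- sum(deviations^2) / N     # average squared deviation, in square inches
pop_sd <- sqrt(pop_var)              # square root brings the units back to inches

pop_var  # 6.8
pop_sd   # 2.607681, which rounds to 2.61
```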

Sample standard deviation

The standard deviation of a sample, an estimate of the standard deviation of a population, is the square root of the sample variance. Its symbol is s and its formula is

$$s = \sqrt{s^2} = \sqrt{\frac{\sum (X - \bar{X})^2}{N - 1}}$$

For this sample of measurements (in inches):

50, 47, 52, 46, and 45

the estimated population variance is 8.5 square inches (the sum of squared deviations, 34, divided by N - 1 = 4), and the estimated population standard deviation is 2.92 inches (rounded off).
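Treating the same five numbers as a sample changes only the divisor, from N to N - 1. A minimal by-hand sketch in R, with illustrative variable names (scores, samp_var, samp_sd) that are not part of the original text:

```r
# The same five measurements, now treated as a sample (inches)
scores <- c(50, 47, 52, 46, 45)

N <- length(scores)
sum_sq <- sum((scores - mean(scores))^2)  # sum of squared deviations (34)
samp_var <- sum_sq / (N - 1)              # divide by N - 1 instead of N
samp_sd <- sqrt(samp_var)                 # estimate of the population standard deviation

samp_var  # 8.5
samp_sd   # 2.915476, which rounds to 2.92
```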

Using R to compute standard deviation

As is the case with variance, using R to compute the standard deviation is easy: You use the sd() function. And like its variance counterpart, sd() calculates s, not σ:

> sd(heights)
[1] 2.915476
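One quick way to see that sd() works on the sample scale is to compare it with the square root of var(), R's sample variance function. A minimal sketch, assuming heights holds the five measurements listed earlier (its definition is not shown in this section):

```r
# Assumption: heights contains the five measurements used above (inches)
heights <- c(50, 47, 52, 46, 45)

sample_variance <- var(heights)         # sample variance, divisor N - 1
from_var <- sqrt(sample_variance)       # square root of the sample variance
from_sd <- sd(heights)                  # built-in sample standard deviation

sample_variance                         # 8.5
isTRUE(all.equal(from_var, from_sd))    # TRUE: both give s
```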

For σ (that is, treating the five numbers as a self-contained population), you have to multiply the sd() result by the square root of (N-1)/N:

> sd(heights)*(sqrt((length(heights)-1)/length(heights)))
[1] 2.607681
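The correction factor follows from the fact that the population and sample formulas differ only in their denominators, N versus N - 1. A short derivation, using X̄ for the mean of the scores (notation assumed here):

```latex
\begin{aligned}
\sigma &= \sqrt{\frac{\sum (X - \bar{X})^2}{N}} \\
       &= \sqrt{\frac{\sum (X - \bar{X})^2}{N - 1} \cdot \frac{N - 1}{N}} \\
       &= s \sqrt{\frac{N - 1}{N}}
\end{aligned}
```

For the five heights, N = 5 and s = 2.915476, so σ = 2.915476 × √(4/5) ≈ 2.607681, which matches the R result.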

Again, if you're going to use this one frequently, defining a function is a good idea:

sd.p <- function(x) {sd(x) * sqrt((length(x) - 1) / length(x))}

And here's how you use this function:

> sd.p(heights)
[1] 2.607681

About the book author:

Joseph Schmuller, PhD, has taught undergraduate and graduate statistics, and has 25 years of IT experience. The author of four editions of Statistical Analysis with Excel For Dummies and three editions of Teach Yourself UML in 24 Hours (SAMS), he has created online coursework for and is a former Editor in Chief of PC AI magazine. He is a Research Scholar at the University of North Florida.
