
How to Interpret Standard Deviation in a Statistical Data Set

Standard deviation can be difficult to interpret as a single number on its own. A small standard deviation means that the values in a statistical data set are, on average, close to the mean of the data set; a large standard deviation means that the values are, on average, farther away from the mean.

The standard deviation measures how concentrated the data are around the mean; the more concentrated, the smaller the standard deviation.
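To make the idea of concentration concrete, here is a minimal sketch in Python. The two data sets (`tight` and `wide`) are made up for illustration, and the helper name `summarize` is hypothetical. The sketch assumes the sample standard deviation (dividing by n − 1), which is what `statistics.stdev` computes.

```python
import statistics

# Two made-up data sets with the same mean (50) but different spread.
tight = [48, 49, 50, 51, 52]
wide = [10, 30, 50, 70, 90]


def summarize(values):
    """Return (mean, sample standard deviation) for a list of numbers."""
    return statistics.mean(values), statistics.stdev(values)


tight_mean, tight_sd = summarize(tight)
wide_mean, wide_sd = summarize(wide)

# Same center, very different spread around it.
print(f"tight: mean={tight_mean}, sd={tight_sd:.2f}")  # sd is about 1.58
print(f"wide:  mean={wide_mean}, sd={wide_sd:.2f}")    # sd is about 31.62
```

Both data sets are centered at 50, but the values in `wide` sit much farther from that center, so its standard deviation is about twenty times larger.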

A small standard deviation can be a goal in certain situations where the results are restricted, for example, in product manufacturing and quality control. If a particular type of car part has to be 2 centimeters in diameter to fit properly, the diameters coming out of the manufacturing process had better not have a very big standard deviation. A big standard deviation in this case would mean that lots of parts end up in the trash because they don’t fit right; either that or the cars will have problems down the road.

But in situations where you just observe and record data, a large standard deviation isn’t necessarily a bad thing; it just reflects a large amount of variation in the group that is being studied. For example, if you look at salaries for everyone in a certain company, including everyone from the student intern to the CEO, the standard deviation may be very large. On the other hand, if you narrow the group down by looking only at the student interns, the standard deviation is smaller, because the individuals within this group have salaries that are less variable. The second data set isn’t better; it’s just less variable.
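The company example can be sketched with a few lines of Python. The salary figures and group names below are invented purely for illustration; they don't come from any real company. The sketch again assumes the sample standard deviation.

```python
import statistics

# Hypothetical annual salaries in dollars (invented for illustration).
salaries = {
    "intern": [30_000, 32_000, 31_000, 29_000],
    "staff": [60_000, 75_000, 90_000],
    "executive": [250_000, 1_200_000],
}

# Flatten every group into one list, then pull out the interns alone.
everyone = [s for group in salaries.values() for s in group]
interns = salaries["intern"]

sd_everyone = statistics.stdev(everyone)
sd_interns = statistics.stdev(interns)

print(f"whole company: sd = ${sd_everyone:,.0f}")
print(f"interns only:  sd = ${sd_interns:,.0f}")
```

With these made-up numbers, the interns' standard deviation is a tiny fraction of the company-wide figure. Neither number is good or bad; each simply describes how variable its group is.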

As with the mean, the standard deviation is affected by outliers (after all, the formula for standard deviation includes the mean). Here’s an example: the salaries of the L.A. Lakers in the 2009–2010 season range from the highest, $23,034,375 (Kobe Bryant), down to $959,111 (Didier Ilunga-Mbenga and Josh Powell). Lots of variation, to be sure! The standard deviation of the salaries for this team turns out to be $6,567,405; it’s almost as large as the average. However, as you may guess, if you remove Kobe Bryant’s salary from the data set, the standard deviation decreases because the remaining salaries are more concentrated around the mean. The standard deviation becomes $4,671,508.
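The effect of a single outlier can be seen in a small sketch. The full Lakers payroll isn't listed here, so the numbers below are a hypothetical payroll in millions of dollars, chosen only to mimic one very large salary among smaller ones. Whether the figures quoted above use the sample or the population formula isn't stated; this sketch assumes the sample version.

```python
import statistics

# Hypothetical team payroll in millions of dollars (NOT the real Lakers data).
payroll = [1.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 23.0]

# Drop the single largest salary to see how much it drives the spread.
without_top = sorted(payroll)[:-1]

sd_all = statistics.stdev(payroll)
sd_trimmed = statistics.stdev(without_top)
mean_all = statistics.mean(payroll)
mean_trimmed = statistics.mean(without_top)

print(f"with top salary:    mean={mean_all:.2f}, sd={sd_all:.2f}")
print(f"without top salary: mean={mean_trimmed:.2f}, sd={sd_trimmed:.2f}")
```

In this made-up payroll, removing the one extreme value lowers both the mean and the standard deviation, the same pattern described for the team above.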

Here are some properties that can help you when interpreting a standard deviation:

  • The standard deviation can never be a negative number, due to the way it’s calculated and the fact that it measures a distance (distances are never negative numbers).

  • The smallest possible value for the standard deviation is 0, and that happens only in contrived situations where every single number in the data set is exactly the same (no deviation).

  • The standard deviation is affected by outliers (extremely low or extremely high numbers in the data set). That’s because the standard deviation is based on the distance from the mean. And remember, the mean is also affected by outliers.

  • The standard deviation has the same units as the original data.
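Several of the properties in the list above can be checked numerically. The following sketch uses made-up measurements; the variable names are illustrative only. It assumes the sample standard deviation.

```python
import statistics

# Every value identical: there is no deviation from the mean at all.
constant = [7, 7, 7, 7]

# Made-up part diameters, first in centimeters, then converted to millimeters.
diameters_cm = [1.9, 2.0, 2.1, 2.0]
diameters_mm = [d * 10 for d in diameters_cm]

sd_constant = statistics.stdev(constant)
sd_cm = statistics.stdev(diameters_cm)
sd_mm = statistics.stdev(diameters_mm)

print(f"all values equal: sd = {sd_constant}")
print(f"diameters: sd = {sd_cm:.4f} cm = {sd_mm:.4f} mm")
```

The standard deviation of the constant data set is 0, and converting the diameters from centimeters to millimeters multiplies the standard deviation by 10 as well, which is what it means for the standard deviation to carry the same units as the data.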

