Why the Statistical Mean and Median of a Histogram Often Have Different Centers
A histogram gives you a rough idea of where the “center” of the data lies. The word center is in quotes because many different statistics are used to designate the center. The two most common measures of center are the average (the mean) and the median.
To visualize the average (the mean), picture the data as people sitting in various places on a teeter-totter (aka seesaw). Your objective is to balance it. Because data don’t move around, assume the people stay where they are and you move the pivot point (which you can also think of as the hinge or fulcrum) anywhere you want. The mean is the place the pivot point has to be in order to balance the two sides of the teeter-totter.
The balancing point of the teeter-totter depends on how far each person sits from the pivot, not just on how many people sit on each side. One person sitting far out on one side can balance several people sitting close in on the other. So the mean is affected by the actual values of the data, not just by how many values fall on each side.
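The balancing idea can be checked with a few lines of Python. This is only a sketch: the five values are made up for illustration, and the names `values`, `pivot`, `left_pull`, and `right_pull` are hypothetical. It places the pivot at the mean and adds up the distances on each side.

```python
from statistics import mean

# Hypothetical data, chosen only for illustration.
values = [21, 23, 24, 30, 52]

# Put the pivot at the mean.
pivot = mean(values)

# Signed distance of each value from the pivot (negative = left of the pivot).
distances = [v - pivot for v in values]

# Total distance on each side of the pivot.
left_pull = -sum(d for d in distances if d < 0)
right_pull = sum(d for d in distances if d > 0)

# Number of values strictly on each side of the pivot.
left_count = sum(1 for d in distances if d < 0)
right_count = sum(1 for d in distances if d > 0)

print("pivot:", pivot)
print("total distance left/right:", left_pull, right_pull)
print("count left/right:", left_count, right_count)
```

With these values, three points sit to the left of the pivot and only one sits to the right, yet the total distances on the two sides are equal, which is what it means for the pivot to balance.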
The median is the place where you put the pivot point so you have an equal number of people on each side of the teeter-totter, regardless of how far from the pivot they sit. (Hence the teeter-totter may still be off balance.) So the median isn’t affected by the exact values of the data, just by their order within the data set.
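The mean is affected by outliers, values in the data set that are far away from the rest of the data, on the high end and/or the low end. The median, being the middle number, is largely unaffected by outliers: pushing an extreme value even further out does not change which value sits in the middle.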
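A small Python sketch makes the contrast concrete. The data are hypothetical, as are the variable names. The second list is the same as the first except that its largest value has been made far more extreme.

```python
from statistics import mean, median

# Hypothetical data, chosen only for illustration.
values = [21, 23, 24, 30, 52]
with_outlier = [21, 23, 24, 30, 520]  # same data, largest value made extreme

mean_before, median_before = mean(values), median(values)
mean_after, median_after = mean(with_outlier), median(with_outlier)

# The median splits the data into equal counts on each side.
below = sum(1 for v in with_outlier if v < median_after)
above = sum(1 for v in with_outlier if v > median_after)

print("mean before/after:  ", mean_before, mean_after)
print("median before/after:", median_before, median_after)
print("values below/above the median:", below, above)
```

In this example the mean moves a long way toward the outlier, while the median stays where it was because the middle value in sorted order has not changed.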