# How to Place Borderline Statistical Values in a Histogram

When you create a histogram, you need to divide the data set into separate groups. However, some statistical data may be right on the borderline between two groups. What do you do in these situations?

Take a look at the following table showing Best Actress Oscar Award winners between 1928 and 1935:

Ages of Best Actress Oscar Award Winners 1928–1935

| Year | Winner | Age | Movie |
|------|--------|-----|-------|
| 1928 | Janet Gaynor | 22 | Sunrise |
| 1929 | Mary Pickford | 37 | Coquette |
| 1930 | Norma Shearer | 30 | The Divorcee |
| 1931 | Marie Dressler | 62 | Min and Bill |
| 1932 | Helen Hayes | 32 | The Sin of Madelon Claudet |
| 1933 | Katharine Hepburn | 26 | Morning Glory |
| 1934 | Claudette Colbert | 31 | It Happened One Night |
| 1935 | Bette Davis | 27 | Dangerous |

Did you notice that one actress’s age lies right on a borderline? Norma Shearer was 30 years old in 1930 when she won the Oscar for The Divorcee. Now, say you group the ages in the histogram into 5-year intervals (20–25, 25–30, 30–35, and so on). Would you place her in the 25–30 age group (the lower bar) or the 30–35 age group (the upper bar)?

Either choice works: you can put all the borderline points into their respective lower bars or put all of them into their respective upper bars. The important thing is to pick a direction and apply it to every data point.
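To see how the choice of direction changes the bar heights, here is a minimal Python sketch that counts the ages from the table under both conventions. The function name `bin_counts` and the bin edges running from 20 to 65 are assumptions made for illustration; they are not part of the original example.

```python
from bisect import bisect_left, bisect_right

# Ages from the table, in year order (1928-1935).
ages = [22, 37, 30, 62, 32, 26, 31, 27]

# Assumed 5-year bin edges: 20, 25, ..., 65 (nine bars in total).
edges = list(range(20, 70, 5))


def bin_counts(values, edges, borderline="upper"):
    """Count values per bar.

    borderline="upper": a value equal to an edge goes into the bar
    that starts at that edge (bars are left inclusive).
    borderline="lower": it goes into the bar that ends at that edge
    (bars are right inclusive).
    Values that fall outside every bar are ignored.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        if borderline == "upper":
            i = bisect_right(edges, v) - 1
        else:
            i = bisect_left(edges, v) - 1
        if 0 <= i < len(counts):
            counts[i] += 1
    return counts


upper = bin_counts(ages, edges, borderline="upper")
lower = bin_counts(ages, edges, borderline="lower")
print(upper)  # [1, 2, 3, 1, 0, 0, 0, 0, 1]
print(lower)  # [1, 3, 2, 1, 0, 0, 0, 0, 1]
```

Only the 25–30 and 30–35 bars differ between the two results, because the age of 30 is the only value in this data set that sits exactly on an edge.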

This example follows the convention of putting all borderline values into their respective upper bars, which puts Norma Shearer’s age in the third bar, the 30–35 age group of the histogram. It is common practice to make the bar intervals left inclusive (that is, each bar includes its left endpoint but not its right), as this example does. Hence, this bar contains the age of 30, but not 35; an age of 35 would go into the 35–40 bar.
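The left-inclusive rule can also be written as a short calculation: floor division by the bar width assigns each age to the bar whose left endpoint it meets or exceeds. The sketch below prints a rough text version of the histogram under that rule. The helper name `left_inclusive_bin` and the starting edge of 20 are assumptions chosen for this illustration.

```python
# Ages from the table, in year order (1928-1935).
ages = [22, 37, 30, 62, 32, 26, 31, 27]


def left_inclusive_bin(age, start=20, width=5):
    """Return (left, right) for the bar [left, right) that contains age."""
    left = start + ((age - start) // width) * width
    return left, left + width


# Tally how many ages land in each bar.
counts = {}
for age in ages:
    bar = left_inclusive_bin(age)
    counts[bar] = counts.get(bar, 0) + 1

# Print one row per non-empty bar, using interval notation for the label.
for (left, right), n in sorted(counts.items()):
    print(f"[{left}, {right}): {'#' * n}")
```

In the label `[30, 35)`, the square bracket marks the endpoint that is included and the round bracket marks the one that is excluded, so 30 belongs to this bar and 35 does not.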