The Median in R: median()

By Joseph Schmuller

Median is a fancy name for a simple concept: It’s the middle value in a group of numbers. Arrange the numbers in order, and the median is the value below which half the scores fall and above which half the scores fall:

> sort(reading.speeds)

[1] 45 49 55 56 62 78

> sort(reading.speeds.new)

[1] 45 49 55 56 62 180

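Each group has an even number of scores, so there is no single middle value. In each case, the median is halfway between the two middle scores, 55 and 56, or 55.5.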

So it’s no big mystery how to use R to find the median:

> median(reading.speeds)

[1] 55.5

> median(reading.speeds.new)

[1] 55.5
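To see what median() is doing, here is a minimal sketch of the same calculation done by hand. The helper name manual_median is hypothetical (it is not part of R), and the two vectors are simply typed in from the sorted output shown earlier:

```r
# Values typed in from the sorted output shown earlier
reading.speeds <- c(45, 49, 55, 56, 62, 78)
reading.speeds.new <- c(45, 49, 55, 56, 62, 180)

# Hypothetical helper, for illustration only: the median by hand
manual_median <- function(x) {
  s <- sort(x)
  n <- length(s)
  if (n %% 2 == 1) {
    s[(n + 1) / 2]                  # odd count: the single middle value
  } else {
    mean(s[c(n / 2, n / 2 + 1)])    # even count: average the two middle values
  }
}

manual_median(reading.speeds)       # 55.5
manual_median(reading.speeds.new)   # 55.5
```

Both calls return the same values as median(), because only the two middle scores (55 and 56) enter the calculation.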

With larger data sets, you might encounter repeated scores. The median is still the middle value. For example, here are the horsepower values for 4-cylinder cars in Cars93:

> library(MASS)

> Horsepower.Four <- with(Cars93, Horsepower[Cylinders == 4])

> sort(Horsepower.Four)

[1] 63 74 81 81 82 82 85 90 90 92 92 92 92 92

[15] 93 96 100 100 100 102 103 105 110 110 110 110 110 110

[29] 110 114 115 124 127 128 130 130 130 134 135 138 140 140

[43] 140 141 150 155 160 164 208

You see quite a bit of duplication in these numbers, particularly around the middle. There are 49 scores, so the median is the 25th one in sorted order: 24 scores sit at or below it and 24 sit at or above it. Count through the sorted values and you’ll see that the 25th score is 110, which makes the median 110:

> median(Horsepower.Four)

[1] 110
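The counting argument can also be checked directly. The sketch below assumes nothing beyond the 49 values printed in the sorted output above, which are typed in by hand so that the example runs without loading Cars93:

```r
# The 49 values copied from the sorted output above
Horsepower.Four <- c(63, 74, 81, 81, 82, 82, 85, 90, 90, 92, 92, 92, 92, 92,
                     93, 96, 100, 100, 100, 102, 103, 105, 110, 110, 110, 110,
                     110, 110, 110, 114, 115, 124, 127, 128, 130, 130, 130,
                     134, 135, 138, 140, 140, 140, 141, 150, 155, 160, 164, 208)

n <- length(Horsepower.Four)       # 49 values, an odd count
middle <- (n + 1) / 2              # position 25
sort(Horsepower.Four)[middle]      # 110, the same value median() returns

# How the scores split around the median
sum(Horsepower.Four < 110)         # 22
sum(Horsepower.Four == 110)        # 7
sum(Horsepower.Four > 110)         # 20
```

Because 110 occupies positions 23 through 29 of the sorted values, the 25th position lands on 110 no matter how the ties are ordered.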