How to Extract Data from Plots in R

By Andrie de Vries, Joris Meys

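The hist() and boxplot() functions in R have another very useful feature: they return all the data R uses to draw the histogram or box plot, so you can use it in further calculations. Getting that information is as easy as assigning the output of the function to an object. The examples here assume that cars is a data frame with mpg and cyl columns, for example a copy of the built-in mtcars data set (cars <- mtcars); the cars data set that ships with R has neither column. With that in place, you get the information on the breaks, counts, and density in a histogram like this: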

> mpghist <- hist(cars$mpg)

Your histogram is still plotted, but you also create an object that contains a list with, among other things, the elements breaks, counts, and density. You can do exactly the same for a box plot: the resulting list contains, among other things, the elements stats and n. The stats element is a matrix with one column per box and five rows (lower whisker, lower hinge, median, upper hinge, and upper whisker), and n holds the number of cases in each category.
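To see what these returned objects contain, a minimal sketch like the following may help. It assumes that cars is simply a copy of the built-in mtcars data set, which is a stand-in and not necessarily how the data was prepared originally. It also passes plot = FALSE, an argument both functions accept, so nothing is drawn while you inspect the lists.

```r
# Assumption: cars is a stand-in copy of the built-in mtcars data set.
cars <- mtcars

# plot = FALSE returns the computed object without drawing anything.
mpghist <- hist(cars$mpg, plot = FALSE)
names(mpghist)     # element names, including breaks, counts, and density
mpghist$breaks     # bin boundaries
mpghist$counts     # number of observations in each bin

mpgbox <- boxplot(mpg ~ cyl, data = cars, plot = FALSE)
mpgbox$stats       # one column per box, five rows from lower to upper whisker
mpgbox$n           # number of cases in each category
mpgbox$names       # group labels, in the order the boxes are drawn
```

Because the bins cover the whole range of the data, the counts add up to the number of rows in the data frame, which is a quick sanity check on the extracted values.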

You could, of course, get all that information using other functions in R. Still, the returned object makes it easy to quickly add some extra information to a plot. For example, you can add the number of cases for each box to a box plot like this:

> mpgbox <- boxplot(mpg ~ cyl, data=cars)
> n <- nlevels(as.factor(cars$cyl))
> text(1:n, mpgbox$stats[1,],
+ paste('n =', mpgbox$n),
+ pos=1)

With this code, you add a text label under the lower whisker of each box. The x-coordinates 1 through n coincide with the middle of each box. You get the y-coordinates from the first row of the stats element in the mpgbox object, which holds the position of each lower whisker. The argument pos=1 in the text() function places the text below the coordinates. You can try playing around with it yourself.
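The same idea works for histograms. As one possible variation, the sketch below labels each bar with its count, using the mids and counts elements of the object returned by hist(). It again assumes cars is a copy of the built-in mtcars data set; the xpd = NA setting is an optional choice that lets the label on the tallest bar extend past the plot region instead of being clipped.

```r
# Assumption: cars is a stand-in copy of the built-in mtcars data set.
cars <- mtcars

# Draw the histogram and keep the object it returns.
mpghist <- hist(cars$mpg)

# mids holds the x-coordinate of each bar's center, counts its height.
# pos = 3 places each label just above its coordinates.
text(mpghist$mids, mpghist$counts,
     labels = mpghist$counts,
     pos = 3, xpd = NA)
```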