How ggplot2 Works in R - dummies

By Joseph Schmuller

ggplot2, Wickham’s implementation of Wilkinson’s grammar of graphics, provides an easy-to-learn structure for R graphics code.

A graph starts with the function ggplot(), which takes two arguments. The first argument is the source of the data. The second argument maps the data components of interest into components of the graph. That argument is a function called aes(), which stands for aesthetic mapping. Each argument to aes() is called an aesthetic.

For example, if you’re creating a histogram of Temp in the airquality data frame, you want Temp on the x-axis. The code looks like this:

ggplot(airquality, aes(x=Temp))

All that does is specify the foundation for the graph — the data source and the mapping. If you type that code into the Scripts window and press Ctrl+R, all you would have is a blank grid with Temp on the x-axis.

Well, what about the histogram? To add it to the foundation, you add another function that tells R to plot the histogram and take care of all the details. The function you add is called a geom function (geom is short for geometric object).

These geom functions come in a variety of types: ggplot2 supplies one for almost every graphing need, and provides the flexibility to work with special cases. For a histogram, the geom function is geom_histogram(). For a bar plot, it’s geom_bar(). For a scatter plot of points, it’s geom_point().

To add a geom to ggplot, you use a plus sign:

ggplot(airquality, aes(x=Temp)) +
  geom_histogram()
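
As a rough illustration of how the choice of geom changes the kind of graph, here is a small sketch that uses the same airquality data frame with each of the three geom functions mentioned earlier. The binwidth of 5, the use of factor(Month) for the bar plot, and the pairing of Wind with Temp for the points are assumptions made for this example only.

```r
library(ggplot2)

# Histogram: one continuous variable mapped to the x-axis
# (binwidth = 5 is an arbitrary choice for this sketch)
hist_plot <- ggplot(airquality, aes(x = Temp)) +
  geom_histogram(binwidth = 5)

# Bar plot: counts the rows in each month, with Month treated as a category
bar_plot <- ggplot(airquality, aes(x = factor(Month))) +
  geom_bar()

# Points: this geom needs both an x and a y aesthetic
point_plot <- ggplot(airquality, aes(x = Wind, y = Temp)) +
  geom_point()

# Draw each graph in turn
print(hist_plot)
print(bar_plot)
print(point_plot)
```

In each case the foundation has the same form. Only the aesthetics and the geom function after the plus sign change.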

That’s just about it, except for any finishing touches to the graph’s appearance. To modify the appearance of the geom, you add arguments to the geom function. To modify the background color scheme, you can add one or more theme functions. To add labels to the axes and a title to the graph, you add the function labs().

So, the overall structure for a ggplot graph is

ggplot(data_source, aes(map data components to graph components)) +
  geom_xxx(arguments to modify the appearance of the geom) +
  theme_xxx(arguments to change the overall appearance) +
  labs(add axis-labels and a title)
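
One way to fill in that template for the Temp histogram is sketched below. The binwidth, the colors, the choice of theme_bw(), and the label text are illustrative assumptions rather than values taken from the text above.

```r
library(ggplot2)

temp_hist <- ggplot(airquality, aes(x = Temp)) +
  # geom arguments change how the bars look
  geom_histogram(binwidth = 5, color = "black", fill = "grey80") +
  # a theme function changes the overall appearance
  theme_bw() +
  # labs() adds the axis labels and the title
  labs(x = "Temperature (degrees F)",
       y = "Frequency",
       title = "Daily Temperatures in New York, May to September 1973")

# Draw the finished graph
print(temp_hist)
```

Storing the graph in a variable (here the hypothetical name temp_hist) is optional. Typing the ggplot() expression by itself at the console draws the graph immediately.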

It’s like building a house: The ggplot() function is the foundation, the geom function is the house, the theme function is the landscaping, and labs() puts the address on the door. Additional functions are available for modifying the graph.

Still another way to look at ggplot (and more in line with mainstream thinking) is to imagine a graph as a set of layers. The ggplot() function provides the first layer, the geom function the next, and so on.
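
The layer view can be sketched by building the graph one step at a time and saving each stage. The variable names and the binwidth here are hypothetical. Inspecting the layers element of the plot object is a peek at how ggplot2 stores a graph, and it assumes that the installed version exposes that element this way.

```r
library(ggplot2)

# First step: the foundation, holding only the data source and the mapping
base_plot <- ggplot(airquality, aes(x = Temp))

# Next step: the geom, which draws the histogram on top of the foundation
with_geom <- base_plot + geom_histogram(binwidth = 5)

# Further steps: a theme and labels
finished <- with_geom +
  theme_bw() +
  labs(x = "Temperature (degrees F)", y = "Frequency")

# Count the geom layers stored in each object
length(base_plot$layers)
length(with_geom$layers)

# Draw the finished graph
print(finished)
```

Because each stage is an ordinary R object, you can reuse base_plot with a different geom without retyping the data source and the mapping.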