How to Define What Data to Use in a ggplot2 Layer in R

By Andrie de Vries, Joris Meys

The first element of a ggplot2 layer is the data. There is only one rule for supplying data to ggplot(): Your data must be in the form of a data frame. This is different from base graphics, which allow you to plot data held in vectors, matrices, and other structures.
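
As a minimal sketch of what that rule means in practice, the snippet below builds a small matrix and converts it to a data frame before any plotting. The object names m and df and the column names x and y are made up for illustration; they do not come from the original text.

```r
# A small numeric matrix with two named columns (illustrative values only)
m <- matrix(c(1, 2, 3, 4, 5, 6), ncol = 2,
            dimnames = list(NULL, c("x", "y")))

# A matrix is not a data frame
is.data.frame(m)

# Convert it so that it is in the form ggplot() expects
df <- as.data.frame(m)
is.data.frame(df)
```

After the conversion, the former matrix columns are available as the data frame columns x and y.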

A convenient example is the built-in dataset quakes, a data frame with information about earthquakes near Fiji.
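
Before plotting, it can help to confirm that quakes really is a data frame and to see which columns it offers. The following sketch uses only base R functions.

```r
# quakes ships with R in the datasets package, so no loading is needed
is.data.frame(quakes)

# Column names and types
str(quakes)

# The first few rows
head(quakes)
```

The columns lat and long hold the location of each event, and depth, mag, and stations describe it further.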

You specify both the data to use and the mapping from your data to your geom in the call to ggplot(). The ggplot() function takes two arguments:

  • data: a data frame with your data (for example, data=quakes).

  • ...: The dots argument indicates that any other argument you specify here gets passed on to downstream functions (that is, other functions that ggplot() happens to call). In the case of ggplot(), this means that anything you specify in this argument is available to the geoms and stats that you define later.

Because the dots argument is available to any geom or stat in your plot, it’s a convenient place to define the mapping between your data and the visual elements of your plot. This is where you typically specify that mapping, usually with the aes() function.
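
The sketch below puts the two pieces together, assuming the ggplot2 package is installed. The choice of long and lat for the axes and of geom_point() as the geom is an illustrative assumption, not something the text above prescribes.

```r
library(ggplot2)

# First argument: the data frame.
# Second argument: the mapping, built with aes(), from columns of
# quakes to the x and y positions (an illustrative choice of columns).
p <- ggplot(data = quakes, aes(x = long, y = lat)) +
  geom_point()  # the geom picks up the mapping defined in ggplot()

# Draw the plot
print(p)
```

Because the mapping is defined in ggplot() rather than in geom_point(), any further geom or stat added to p can use the same mapping without repeating it.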