This Cheat Sheet is a handy reference for Base R statistical functions, interactive applications, machine learning, databases, and images.
Base R statistical functions
Here’s a selection of statistical functions that come with the standard R installation. You’ll find many others in R packages.
Central Tendency and Variability
| Function | What it calculates |
| mean(x) | Mean of the numbers in vector x |
| median(x) | Median of the numbers in vector x |
| var(x) | Estimated variance of the population from which the numbers in vector x are sampled |
| sd(x) | Estimated standard deviation of the population from which the numbers in vector x are sampled |
| scale(x) | Standard scores (z-scores) for the numbers in vector x |
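As a quick illustration of the functions in this table, here is a minimal sketch on a small made-up vector. The values in x are assumptions chosen only so the results are easy to check by hand.

```r
# Hypothetical sample data, chosen only for illustration
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

mean(x)    # arithmetic mean: 5
median(x)  # middle value of the sorted numbers: 4.5
var(x)     # sample variance (divides by n - 1, not n)
sd(x)      # sample standard deviation, the square root of var(x)

# scale() returns a one-column matrix of z-scores;
# as.vector() drops the matrix attributes so you get a plain vector
z <- as.vector(scale(x))
z
```

Because var() and sd() divide by n - 1, they estimate the population values rather than describe the sample itself.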
Relative Standing
| Function | What it calculates |
| sort(x) | The numbers in vector x in increasing order |
| sort(x)[n] | The nth smallest number in vector x |
| rank(x) | Ranks of the numbers (in increasing order) in vector x |
| rank(-x) | Ranks of the numbers (in decreasing order) in vector x |
| rank(x, ties.method = "average") | Ranks of the numbers (in increasing order) in vector x, with tied numbers given the average of the ranks that the ties would have attained |
| rank(x, ties.method = "min") | Ranks of the numbers (in increasing order) in vector x, with tied numbers given the minimum of the ranks that the ties would have attained |
| rank(x, ties.method = "max") | Ranks of the numbers (in increasing order) in vector x, with tied numbers given the maximum of the ranks that the ties would have attained |
| quantile(x) | The 0th, 25th, 50th, 75th, and 100th percentiles (the quartiles, in other words) of the numbers in vector x. (That’s not a misprint: quantile(x) returns the quartiles of x.) |
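The tie-handling options are easiest to see side by side. The sketch below uses a made-up vector with one tie (the two 30s); the data are an assumption for illustration only.

```r
# Hypothetical data containing a tie (30 appears twice)
x <- c(10, 30, 20, 30, 50)

sort(x)      # 10 20 30 30 50
sort(x)[2]   # second-smallest value: 20

rank(x)                        # ties share the average rank: 1 3.5 2 3.5 5
rank(x, ties.method = "min")   # ties take the lowest rank:   1 3 2 3 5
rank(x, ties.method = "max")   # ties take the highest rank:  1 4 2 4 5
rank(-x)                       # rank 1 goes to the largest:  5 2.5 4 2.5 1

quantile(x)                        # 0%, 25%, 50%, 75%, and 100% points
quantile(x, probs = c(0.1, 0.9))   # other percentiles via the probs argument
```

Note that "average" is the default tie method, so rank(x) and rank(x, ties.method = "average") give the same result.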
t-tests
| Function | What it calculates |
| t.test(x, mu = n, alternative = "two.sided") | Two-tailed t-test that the mean of the numbers in vector x is different from n. |
| t.test(x, mu = n, alternative = "greater") | One-tailed t-test that the mean of the numbers in vector x is greater than n. |
| t.test(x, mu = n, alternative = "less") | One-tailed t-test that the mean of the numbers in vector x is less than n. |
| t.test(x, y, mu = 0, var.equal = TRUE, alternative = "two.sided") | Two-tailed t-test that the mean of the numbers in vector x is different from the mean of the numbers in vector y. The variances in the two vectors are assumed to be equal. |
| t.test(x, y, mu = 0, alternative = "two.sided", paired = TRUE) | Two-tailed t-test that the mean of the numbers in vector x is different from the mean of the numbers in vector y. The vectors represent matched samples. |
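Here is a short sketch of three of these calls on simulated data. The sample sizes, means, and the comparison value of 50 are assumptions made up for the example, so the resulting p-values carry no real-world meaning.

```r
set.seed(42)  # make the simulated data reproducible

# Hypothetical samples of 20 scores each
x <- rnorm(20, mean = 52, sd = 5)
y <- rnorm(20, mean = 50, sd = 5)

# One-sample, one-tailed: is the mean of x greater than 50?
one_sample <- t.test(x, mu = 50, alternative = "greater")

# Two independent samples, equal variances assumed (pooled variance)
two_sample <- t.test(x, y, mu = 0, var.equal = TRUE, alternative = "two.sided")

# Matched samples: each x[i] is paired with y[i]
paired <- t.test(x, y, mu = 0, alternative = "two.sided", paired = TRUE)

# Each result is a list; pull out individual pieces by name
one_sample$statistic   # the t value
one_sample$p.value     # the p-value
two_sample$parameter   # degrees of freedom: 20 + 20 - 2 = 38
paired$parameter       # degrees of freedom: 20 - 1 = 19
```

Printing any of the three objects (for example, typing two_sample at the console) shows the full test report, including the confidence interval.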
Analysis of Variance (ANOVA)
| Function | What it calculates |
| aov(y~x, data = d) | Single-factor ANOVA, with the numbers in vector y as the dependent variable and the elements of vector x as the levels of the independent variable. The data are in data frame d. |
| aov(y~x + Error(w/x), data = d) | Repeated Measures ANOVA, with the numbers in vector y as the dependent variable and the elements in vector x as the levels of an independent variable. Error(w/x) indicates that each element in vector w experiences all the levels of x. (In other words, x is a repeated measure.) The data are in data frame d. |
| aov(y~x*z, data = d) | Two-factor ANOVA, with the numbers in vector y as the dependent variable and the elements of vectors x and z as the levels of the two independent variables. The data are in data frame d. |
| aov(y~x*z + Error(w/z), data = d) | Mixed ANOVA, with the numbers in vector y as the dependent variable and the elements of vectors x and z as the levels of the two independent variables. Error(w/z) indicates that each element in vector w experiences all the levels of z. (In other words, z is a repeated measure.) The data are in data frame d. |
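The formulas in this table expect the data in long format: one row per observation, with factors for the independent variables. The sketch below builds a small simulated data frame (the column names w, x, and y follow the table; the values themselves are made up) and runs the single-factor and repeated measures versions.

```r
set.seed(1)  # reproducible simulated scores

# Hypothetical long-format data: 6 subjects, each measured under 3 conditions
d <- data.frame(
  w = factor(rep(1:6, each = 3)),                # subject ID
  x = factor(rep(c("a", "b", "c"), times = 6)),  # condition
  y = rnorm(18, mean = rep(c(10, 12, 15), times = 6))
)

# Single-factor ANOVA: treats all 18 scores as independent
one_way <- aov(y ~ x, data = d)
summary(one_way)

# Repeated measures ANOVA: Error(w/x) says every subject in w
# was measured at every level of x
repeated <- aov(y ~ x + Error(w/x), data = d)
summary(repeated)
```

Both variables in the Error() term should be factors. If w were left as a number, aov() would treat it as a numeric predictor rather than as a set of subjects.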
Correlation and regression
| Function | What it calculates |
| cor(x,y) | Correlation coefficient between the numbers in vector x and the numbers in vector y |
| cor.test(x,y) | Correlation coefficient between the numbers in vector x and the numbers in vector y, along with a t-test of the significance of the correlation coefficient. |
| lm(y~x, data = d) | Linear regression analysis with the numbers in vector y as the dependent variable and the numbers in vector x as the independent variable. Data are in data frame d. |
| coefficients(a) | Slope and intercept of linear regression model a. |
| confint(a) | Confidence intervals of the slope and intercept of linear regression model a. |
| lm(y~x+z, data = d) | Multiple regression analysis with the numbers in vector y as the dependent variable and the numbers in vectors x and z as the independent variables. Data are in data frame d. |
When you carry out an ANOVA or a regression analysis, store the result in an object, for example: a <- lm(y ~ x, data = d). Then, to see the tabled results, pass that object to the summary() function: summary(a)
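The store-then-summarize workflow can be sketched with the mtcars data set that ships with R. Renaming its wt and mpg columns to x and y is an assumption made only to match the notation in the table.

```r
# Borrow two columns from the built-in mtcars data and rename them
d <- data.frame(x = mtcars$wt, y = mtcars$mpg)

r <- cor(d$x, d$y)   # correlation coefficient (Pearson by default)
cor.test(d$x, d$y)   # the same coefficient plus a significance test

a <- lm(y ~ x, data = d)   # store the fitted model in an object

coefficients(a)   # intercept and slope
confint(a)        # 95% confidence intervals by default
summary(a)        # coefficient table, R-squared, F-test
```

With a single predictor, the R-squared reported by summary(a) equals the square of the correlation coefficient r.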
Interacting with a user
R provides the shiny package and the shinydashboard package for developing interactive applications. Here are selected functions from these packages.
Functions from the shiny package
| Function | What it does |
| shinyApp() | Ties a user interface and a server into a shiny application |
| fluidPage() | Creates a browser page that changes with the width of the browser |
| sliderInput() | Defines a slider and its input for a shiny user interface |
| plotOutput() | Reserves a shiny user interface area for a plot |
| renderPlot() | Draws the plot on a shiny user interface |
| textOutput() | Reserves a shiny user interface area for text |
| renderText() | Adds text to a shiny user interface |
| selectInput() | Creates a drop-down menu on a shiny user interface |
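To show how these pieces fit together, here is a minimal sketch of a complete application. It assumes the shiny package is installed; the input IDs ("n", "colour"), output IDs ("hist", "caption"), and labels are hypothetical names invented for the example.

```r
library(shiny)

# User interface: two inputs plus reserved areas for a plot and some text
ui <- fluidPage(
  sliderInput("n", "Number of observations", min = 10, max = 500, value = 100),
  selectInput("colour", "Bar colour",
              choices = c("steelblue", "tomato", "grey40")),
  plotOutput("hist"),
  textOutput("caption")
)

# Server: fills the reserved areas, re-running whenever an input changes
server <- function(input, output) {
  output$hist <- renderPlot({
    hist(rnorm(input$n), col = input$colour, main = "Random normal sample")
  })
  output$caption <- renderText({
    paste("Showing", input$n, "simulated values")
  })
}

# Tie the two halves together; this builds the app object without launching it
app <- shinyApp(ui = ui, server = server)
```

In an interactive session, printing app (or calling runApp(app)) opens the application. Each output ID in the user interface has to match the name assigned on the output object in the server.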
Functions from the shinydashboard package
| Function | What it creates for a shinydashboard page |
| dashboardPage() | The page |
| dashboardHeader() | Page header |
| dashboardSidebar() | Page sidebar |
| sidebarMenu() | A menu for a sidebar |
| menuItem() | An item for a menu |
| dashboardBody() | Page body |
| fluidRow() | A variable-width row inside the dashboard body |
| box() | A box inside a row |
| valueBoxOutput() | A reserved space for a value box |
| renderValueBox() | Reactive context for a value box |
| valueBox() | A value box |
| column() | A column within a fluid row |
| tabBox() | A box that holds a set of tabs |
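A dashboard nests these functions in a fixed order: page, then header, sidebar, and body. The following is a minimal sketch, assuming the shiny and shinydashboard packages are installed. The titles, the IDs "n_rows" and "hist", and the use of the built-in faithful data set are choices made for illustration.

```r
library(shiny)
library(shinydashboard)

ui <- dashboardPage(
  dashboardHeader(title = "Demo dashboard"),
  dashboardSidebar(
    sidebarMenu(
      menuItem("Overview", tabName = "overview")
    )
  ),
  dashboardBody(
    fluidRow(
      valueBoxOutput("n_rows"),                     # reserved space for a value box
      box(title = "Histogram", plotOutput("hist"))  # a box holding a plot
    )
  )
)

server <- function(input, output) {
  # renderValueBox() wraps the valueBox() that fills the reserved space
  output$n_rows <- renderValueBox({
    valueBox(nrow(faithful), "Rows in the faithful data set")
  })
  output$hist <- renderPlot({
    hist(faithful$waiting, main = "Waiting time between eruptions")
  })
}

# Builds the app object; print it or call runApp(app) to launch
app <- shinyApp(ui = ui, server = server)
```

This sketch has only one screen, so the menu item does not switch anything. A multi-page dashboard would typically place each page's content in the body under a matching tab name.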
Machine learning
R provides a number of packages and functions for machine learning. Here are some of them.
Machine learning packages and functions
| Package | Function | What it does |
| rattle | rattle() | Opens the Rattle graphical user interface |
| rpart | rpart() | Creates a decision tree |
| rpart.plot | prp() | Draws a decision tree |
| randomForest | randomForest() | Creates a random forest of decision trees |
| rattle | printRandomForests() | Prints the rules of a forest’s individual decision trees |
| e1071 | svm() | Trains a support vector machine |
| e1071 | predict() | Creates a vector of predicted classifications based on a support vector machine |
| kernlab | ksvm() | Trains a support vector machine |
| base R | kmeans() | Creates a k-means clustering analysis |
| nnet | nnet() | Creates a neural network with one hidden layer |
| NeuralNetTools | plotnet() | Draws a neural network |
| nnet | predict() | Creates a vector of predictions based on a neural network |
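Most of the functions in this table need an add-on package, but kmeans() runs on a standard installation. Here is a minimal sketch using the built-in iris data. The choice of three clusters and of nstart = 25 are assumptions for the example, not recommendations.

```r
set.seed(123)  # k-means starts from random centers, so fix the seed

# Use the four numeric measurements; leave out the Species label
features <- iris[, 1:4]

# Ask for 3 clusters; nstart = 25 tries 25 random starts and keeps the best
fit <- kmeans(features, centers = 3, nstart = 25)

fit$size      # number of flowers in each cluster
fit$centers   # mean of each measurement within each cluster

# Compare the clusters with the known species
table(cluster = fit$cluster, species = iris$Species)
```

The cluster numbers themselves are arbitrary labels, so read the cross-tabulation by looking at which species dominates each row.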
Databases
Created for statistical analysis, R has a wide array of packages and functions for dealing with large amounts of data. This selection is the tip of the iceberg’s tip.
Packages and functions for exploring databases
| Package | Function | What it does |
| didrooRFM | findRFM() | Performs a recency, frequency, monetary value (RFM) analysis on a database of retail transactions |
| vcd | assocstats() | Calculates statistics for tables of categorical data |
| vcd | assoc() | Creates a graphic that shows deviations from independence in a table of categorical data |
| tidyverse | glimpse() | Provides a partial view of a data frame with the columns appearing onscreen as rows |
| plotrix | std.error() | Calculates the standard error of the mean |
| dplyr | inner_join() | Joins two data frames, keeping only the rows whose key values appear in both |
| lubridate | wday() | Returns day of the week of a calendar date |
| lubridate | ymd() | Converts a year-month-day value (such as "2024-01-15") into an R Date object |
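Several of these functions are often used together when tidying transaction data. The sketch below assumes the dplyr and lubridate packages are installed (inner_join() and glimpse() are both available once dplyr is loaded). The customers and orders tables are made-up data.

```r
library(dplyr)
library(lubridate)

# Hypothetical lookup table and transaction table sharing an "id" column
customers <- data.frame(id = c(1, 2, 3), name = c("Ana", "Ben", "Cy"))
orders <- data.frame(
  id     = c(1, 1, 3, 4),
  date   = c("2024-01-15", "2024-02-03", "2024-02-10", "2024-03-01"),
  amount = c(20, 35, 12, 50)
)

# Keep only the rows whose id appears in both tables
joined <- inner_join(customers, orders, by = "id")

# Turn the text dates into Date objects, then look up the weekday
joined$date <- ymd(joined$date)
joined$weekday <- wday(joined$date, label = TRUE)

glimpse(joined)  # one line per column, with its type and first values

# Standard error of the mean, computed by hand in base R
se <- sd(joined$amount) / sqrt(nrow(joined))
se
```

Customer 2 has no orders and order id 4 has no customer, so neither appears in the joined result, which has three rows.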
Images
Here are some functions to help you get started using R to process images. They all live in the magick package.
Functions from the magick package
| Function | What it does |
| image_read() | Reads an image into R and turns it into a magick object |
| image_resize() | Resizes an image |
| image_rotate() | Rotates an image |
| image_flip() | Rotates an image on a horizontal axis |
| image_flop() | Rotates an image on a vertical axis |
| image_annotate() | Adds text to an image |
| image_background() | Sets the background for an image |
| image_composite() | Combines images |
| image_morph() | Makes one image appear to gradually become (morph into) another |
| image_animate() | Turns a set of frames into an animation, which plays in the RStudio Viewer window |
| image_apply() | Applies a function to every frame in an animated GIF |
| image_write() | Saves an image or animation to a file (for example, as a reusable GIF) |
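These functions each take an image and return a new one, so they chain naturally. The following is a minimal sketch, assuming the magick package is installed. It uses "logo:", a sample image built into ImageMagick, so no file or download is needed. The sizes, caption text, and output path are arbitrary choices.

```r
library(magick)

# Read the sample image that ships with ImageMagick
img <- image_read("logo:")

small    <- image_resize(img, "200x")   # 200 pixels wide, height scaled to match
turned   <- image_rotate(small, 90)     # rotate 90 degrees clockwise
mirrored <- image_flop(small)           # mirror left to right

# Add a caption near the bottom edge
labelled <- image_annotate(small, "Hello", size = 20,
                           gravity = "south", color = "red")

# Save the result to a temporary PNG file
out_file <- tempfile(fileext = ".png")
image_write(labelled, path = out_file, format = "png")

image_info(small)   # format, width, height, and other details
```

None of these calls changes img itself. Each one returns a modified copy, which is why the results are assigned to new names.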


