This Cheat Sheet is a handy reference for Base R statistical functions, interactive applications, machine learning, databases, and images.

## Base R statistical functions

Here’s a selection of statistical functions that come with the standard R installation. You’ll find many others in R packages.

**Central Tendency and Variability**

| Function | What it calculates |
| --- | --- |
| `mean(x)` | Mean of the numbers in vector x |
| `median(x)` | Median of the numbers in vector x |
| `var(x)` | Estimated variance of the population from which the numbers in vector x are sampled |
| `sd(x)` | Estimated standard deviation of the population from which the numbers in vector x are sampled |
| `scale(x)` | Standard scores (z-scores) for the numbers in vector x |
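
Here's a minimal sketch of these functions at work. The vector `x` is made-up data used only for illustration.

```r
# Hypothetical sample of eight scores (made-up data for illustration)
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

mean(x)    # arithmetic mean: 5
median(x)  # middle value: 4.5
var(x)     # sample variance, with n - 1 in the denominator
sd(x)      # sample standard deviation, the square root of var(x)

# scale() returns a one-column matrix; as.numeric() flattens it to a vector
z <- as.numeric(scale(x))
z
```

Because `scale()` centers and rescales the data, the z-scores in `z` have a mean of 0 and a standard deviation of 1.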

**Relative Standing**

| Function | What it calculates |
| --- | --- |
| `sort(x)` | The numbers in vector x in increasing order |
| `sort(x)[n]` | The nth smallest number in vector x |
| `rank(x)` | Ranks of the numbers (in increasing order) in vector x |
| `rank(-x)` | Ranks of the numbers (in decreasing order) in vector x |
| `rank(x, ties.method = "average")` | Ranks of the numbers (in increasing order) in vector x, with tied numbers given the average of the ranks that the ties would have attained |
| `rank(x, ties.method = "min")` | Ranks of the numbers (in increasing order) in vector x, with tied numbers given the minimum of the ranks that the ties would have attained |
| `rank(x, ties.method = "max")` | Ranks of the numbers (in increasing order) in vector x, with tied numbers given the maximum of the ranks that the ties would have attained |
| `quantile(x)` | The 0th, 25th, 50th, 75th, and 100th percentiles (the quartiles, in other words) of the numbers in vector x. (That’s not a misprint: quantile(x) returns the quartiles of x.) |
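
To see how the tie-handling options differ, here's a short sketch with a made-up vector that contains one tie (the two 20s).

```r
# Hypothetical data with one tie (the two 20s)
x <- c(30, 10, 20, 20, 40)

sort(x)       # 10 20 20 30 40
sort(x)[2]    # second-smallest value: 20

rank(x)                       # ties share the average rank: 4.0 1.0 2.5 2.5 5.0
rank(x, ties.method = "min")  # 4 1 2 2 5
rank(x, ties.method = "max")  # 4 1 3 3 5
rank(-x)                      # rank 1 goes to the largest value: 2.0 5.0 3.5 3.5 1.0

quantile(x)   # 0%, 25%, 50%, 75%, 100% -> 10 20 20 30 40
```

Note that `"average"` is the default, so `rank(x)` and `rank(x, ties.method = "average")` give the same result.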

**t-tests**

| Function | What it calculates |
| --- | --- |
| `t.test(x, mu = n, alternative = "two.sided")` | Two-tailed t-test that the mean of the numbers in vector x is different from n. |
| `t.test(x, mu = n, alternative = "greater")` | One-tailed t-test that the mean of the numbers in vector x is greater than n. |
| `t.test(x, mu = n, alternative = "less")` | One-tailed t-test that the mean of the numbers in vector x is less than n. |
| `t.test(x, y, mu = 0, var.equal = TRUE, alternative = "two.sided")` | Two-tailed t-test that the mean of the numbers in vector x is different from the mean of the numbers in vector y. The variances in the two vectors are assumed to be equal. |
| `t.test(x, y, mu = 0, alternative = "two.sided", paired = TRUE)` | Two-tailed t-test that the mean of the numbers in vector x is different from the mean of the numbers in vector y. The vectors represent matched samples. |
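
The following sketch runs several of these tests on made-up scores. The vectors `x` and `y` and the comparison value 100 are assumptions chosen for illustration. Each call returns an object whose pieces you can pull out by name.

```r
# Made-up scores for illustration only
x <- c(102, 98, 110, 105, 99, 107, 103, 101)
y <- c(97, 95, 104, 100, 96, 101, 99, 94)

# One-sample, two-tailed: is the mean of x different from 100?
one_sample <- t.test(x, mu = 100, alternative = "two.sided")

# One-sample, one-tailed: is the mean of x greater than 100?
one_tailed <- t.test(x, mu = 100, alternative = "greater")

# Independent samples, assuming equal variances
independent <- t.test(x, y, mu = 0, var.equal = TRUE, alternative = "two.sided")

# Matched samples: x[i] and y[i] come from the same case
paired <- t.test(x, y, mu = 0, alternative = "two.sided", paired = TRUE)

one_sample$statistic   # the t value
one_sample$p.value     # the p-value
independent$parameter  # degrees of freedom: 14
paired$parameter       # degrees of freedom: 7
```

If you leave out `var.equal = TRUE`, `t.test()` defaults to the Welch test, which doesn't assume equal variances.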

**Analysis of Variance (ANOVA)**

| Function | What it calculates |
| --- | --- |
| `aov(y ~ x, data = d)` | Single-factor ANOVA, with the numbers in vector y as the dependent variable and the elements of vector x as the levels of the independent variable. The data are in data frame d. |
| `aov(y ~ x + Error(w/x), data = d)` | Repeated Measures ANOVA, with the numbers in vector y as the dependent variable and the elements in vector x as the levels of an independent variable. Error(w/x) indicates that each element in vector w experiences all the levels of x. (In other words, x is a repeated measure.) The data are in data frame d. |
| `aov(y ~ x*z, data = d)` | Two-factor ANOVA, with the numbers in vector y as the dependent variable and the elements of vectors x and z as the levels of the two independent variables. The data are in data frame d. |
| `aov(y ~ x*z + Error(w/z), data = d)` | Mixed ANOVA, with the numbers in vector y as the dependent variable and the elements of vectors x and z as the levels of the two independent variables. Error(w/z) indicates that each element in vector w experiences all the levels of z. (In other words, z is a repeated measure.) The data are in data frame d. |
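
Here's a hedged sketch of the single-factor and repeated measures forms. The data frame `d` is made up for illustration. Its columns `x` and `w` are stored as factors so that `aov()` treats them as categories rather than numbers.

```r
# Made-up data: three conditions (x), four people (w), one score each (y)
d <- data.frame(
  y = c(5, 6, 7, 6, 8, 9, 7, 8, 10, 11, 12, 11),
  x = factor(rep(c("A", "B", "C"), each = 4)),
  w = factor(rep(1:4, times = 3))
)

# Single-factor ANOVA: ignores w and treats the three groups as independent
one_way <- aov(y ~ x, data = d)
summary(one_way)

# Repeated measures ANOVA: each person in w experiences every level of x
repeated <- aov(y ~ x + Error(w/x), data = d)
summary(repeated)
```

The two analyses use the same numbers but partition the variability differently: the repeated measures version removes person-to-person differences from the error term.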

**Correlation and regression**

| Function | What it calculates |
| --- | --- |
| `cor(x, y)` | Correlation coefficient between the numbers in vector x and the numbers in vector y |
| `cor.test(x, y)` | Correlation coefficient between the numbers in vector x and the numbers in vector y, along with a t-test of the significance of the correlation coefficient. |
| `lm(y ~ x, data = d)` | Linear regression analysis with the numbers in vector y as the dependent variable and the numbers in vector x as the independent variable. Data are in data frame d. |
| `coefficients(a)` | Slope and intercept of linear regression model a. |
| `confint(a)` | Confidence intervals of the slope and intercept of linear regression model a. |
| `lm(y ~ x + z, data = d)` | Multiple regression analysis with the numbers in vector y as the dependent variable and the numbers in vectors x and z as the independent variables. Data are in data frame d. |

When you carry out an ANOVA or a regression analysis, store the analysis in an object, for example: `a <- lm(y ~ x, data = d)`. Then, to see the tabled results, use the summary() function: `summary(a)`
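
The workflow of storing a model and then inspecting it can be sketched as follows. The data frame `d` holds made-up numbers that fall close to a straight line.

```r
# Made-up data for illustration
d <- data.frame(
  x = c(1, 2, 3, 4, 5, 6, 7, 8),
  y = c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1)
)

cor(d$x, d$y)       # correlation coefficient
cor.test(d$x, d$y)  # the same coefficient plus a significance test

a <- lm(y ~ x, data = d)  # store the fitted model
coefficients(a)           # intercept and slope
confint(a)                # 95% confidence intervals by default
summary(a)                # coefficient table, R-squared, F-test
```

Storing the model in `a` means you fit it once and can then pass it to as many follow-up functions as you like.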

## Interacting with a user

R provides the shiny package and the shinydashboard package for developing interactive applications. Here are selected functions from these packages.

**Functions from the shiny package**

| Function | What it does |
| --- | --- |
| `shinyApp()` | Ties a user interface and a server into a shiny application |
| `fluidPage()` | Creates a browser page that changes with the width of the browser |
| `sliderInput()` | Defines a slider and its input for a shiny user interface |
| `plotOutput()` | Reserves a shiny user interface area for a plot |
| `renderPlot()` | Draws the plot on a shiny user interface |
| `textOutput()` | Reserves a shiny user interface area for text |
| `renderText()` | Adds text to a shiny user interface |
| `selectInput()` | Creates a drop-down menu on a shiny user interface |
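
Here's one way these pieces might fit together in a small app. It's a sketch, and it assumes the shiny package is installed. The input and output IDs (`n`, `colour`, `histogram`, `caption`) are names invented for this example.

```r
library(shiny)

# User interface: a slider, a drop-down menu, a plot area, and a text area
ui <- fluidPage(
  sliderInput("n", "Number of observations:", min = 10, max = 500, value = 100),
  selectInput("colour", "Bar colour:", choices = c("grey", "steelblue", "tomato")),
  plotOutput("histogram"),
  textOutput("caption")
)

# Server: fills the reserved areas and redraws when an input changes
server <- function(input, output) {
  output$histogram <- renderPlot({
    hist(rnorm(input$n), col = input$colour, main = "Random normal values")
  })
  output$caption <- renderText({
    paste("Showing", input$n, "simulated values")
  })
}

app <- shinyApp(ui = ui, server = server)

# Launch only in an interactive session, so the script is safe to source
if (interactive()) runApp(app)
```

Each `*Output()` function in the user interface is matched by a `render*()` function in the server that uses the same ID.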

**Functions from the shinydashboard package**

| Function | What it creates for a shinydashboard page |
| --- | --- |
| `dashboardPage()` | The page |
| `dashboardHeader()` | Page header |
| `dashboardSidebar()` | Page sidebar |
| `sidebarMenu()` | A menu for a sidebar |
| `menuItem()` | An item for a menu |
| `dashboardBody()` | Page body |
| `fluidRow()` | A variable-width row inside the dashboard body |
| `box()` | A box inside a row |
| `valueBoxOutput()` | A reserved space for a value box |
| `renderValueBox()` | Reactive context for a value box |
| `valueBox()` | A value box |
| `column()` | A column within a fluid row |
| `tabBox()` | A box with tabbed content |

## Machine learning

R provides a number of packages and functions for machine learning. Here are some of them.

**Machine learning packages and functions**

| Package | Function | What it does |
| --- | --- | --- |
| rattle | `rattle()` | Opens the Rattle graphical user interface |
| rpart | `rpart()` | Creates a decision tree |
| rpart.plot | `prp()` | Draws a decision tree |
| randomForest | `randomForest()` | Creates a random forest of decision trees |
| rattle | `printRandomForests()` | Prints the rules of a forest’s individual decision trees |
| e1071 | `svm()` | Trains a support vector machine |
| e1071 | `predict()` | Creates a vector of predicted classifications based on a support vector machine |
| kernlab | `ksvm()` | Trains a support vector machine |
| base R | `kmeans()` | Creates a k-means clustering analysis |
| nnet | `nnet()` | Creates a neural network with one hidden layer |
| NeuralNetTools | `plotnet()` | Draws a neural network |
| nnet | `predict()` | Creates a vector of predictions based on a neural network |
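
Of the functions in this table, `kmeans()` needs no extra installation, so it makes a convenient first example. This sketch clusters the built-in `iris` data set. The choice of three clusters and of `nstart = 25` are assumptions made for illustration.

```r
set.seed(1)  # k-means starts from random centres, so fix the seed

# iris ships with R; use its four numeric columns
measurements <- iris[, 1:4]

clusters <- kmeans(measurements, centers = 3, nstart = 25)

clusters$size     # number of flowers in each cluster
clusters$centers  # mean of each measurement per cluster

# Compare the clusters with the known species
table(iris$Species, clusters$cluster)
```

The cluster numbers are arbitrary labels, so judge the cross-tabulation by how cleanly each species falls into a single column, not by which column it lands in.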

## Databases

Created for statistical analysis, R has a wide array of packages and functions for dealing with large amounts of data. This selection is the tip of the iceberg’s tip.

**Packages and functions for exploring databases**

| Package | Function | What it does |
| --- | --- | --- |
| didrooRFM | `findRFM()` | Performs a recency, frequency, money analysis on a database of retail transactions |
| vcd | `assocstats()` | Calculates statistics for tables of categorical data |
| vcd | `assoc()` | Creates a graphic that shows deviations from independence in a table of categorical data |
| tidyverse | `glimpse()` | Provides a partial view of a data frame with the columns appearing onscreen as rows |
| plotrix | `std.error()` | Calculates the standard error of the mean |
| dplyr | `inner_join()` | Joins two data frames, keeping only the rows with matching keys |
| lubridate | `wday()` | Returns day of the week of a calendar date |
| lubridate | `ymd()` | Returns a date in R date-format |

## Images

Here are some functions to help you get started using R to process images. They all live in the magick package.

**Functions from the magick package**

| Function | What it does |
| --- | --- |
| `image_read()` | Reads an image into R and turns it into a magick object |
| `image_resize()` | Resizes an image |
| `image_rotate()` | Rotates an image |
| `image_flip()` | Rotates an image on a horizontal axis |
| `image_flop()` | Rotates an image on a vertical axis |
| `image_annotate()` | Adds text to an image |
| `image_background()` | Sets the background for an image |
| `image_composite()` | Combines images |
| `image_morph()` | Makes one image appear to gradually become (morph into) another |
| `image_animate()` | Puts an animation into the RStudio Viewer window |
| `image_apply()` | Applies a function to every frame in an animated GIF |
| `image_write()` | Saves an animation as a reusable GIF |