# R Projects For Dummies Cheat Sheet

To complete any project using R, you work with functions that live in packages designed for specific areas. This cheat sheet provides some information about these functions.

## Interacting with Users with R Functions

Here’s a selection of statistical functions that come with the standard R installation. You’ll find many others in R packages. For developing interactive applications, R provides the `shiny` package and the `shinydashboard` package. The tables in this section cover the statistical functions:

| Function | What it Calculates |
| --- | --- |
| `mean(x)` | Mean of the numbers in vector x |
| `median(x)` | Median of the numbers in vector x |
| `var(x)` | Estimated variance of the population from which the numbers in vector x are sampled |
| `sd(x)` | Estimated standard deviation of the population from which the numbers in vector x are sampled |
| `scale(x)` | Standard scores (z-scores) for the numbers in vector x |
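
As a quick illustration of the functions in the table above, here is a minimal sketch in base R. The vector `x` is made-up sample data, not something from the book.

```r
# Hypothetical sample data, chosen only for illustration
x <- c(2, 4, 4, 5, 7, 9, 11)

m  <- mean(x)     # arithmetic mean
md <- median(x)   # middle value of the sorted numbers
v  <- var(x)      # sample variance (divides by n - 1)
s  <- sd(x)       # sample standard deviation, equal to sqrt(var(x))

# scale() returns a one-column matrix; as.vector() flattens it to plain z-scores
z <- as.vector(scale(x))

print(c(mean = m, median = md, variance = v, sd = s))
print(round(z, 2))
```

Because `var()` and `sd()` divide by n - 1, they estimate the population values from a sample, which matches the wording in the table.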

Relative Standing

| Function | What it Calculates |
| --- | --- |
| `sort(x)` | The numbers in vector x in increasing order |
| `sort(x)[n]` | The nth smallest number in vector x |
| `rank(x)` | Ranks of the numbers (in increasing order) in vector x |
| `rank(-x)` | Ranks of the numbers (in decreasing order) in vector x |
| `rank(x, ties.method = "average")` | Ranks of the numbers (in increasing order) in vector x, with tied numbers given the average of the ranks that the ties would have attained |
| `rank(x, ties.method = "min")` | Ranks of the numbers (in increasing order) in vector x, with tied numbers given the minimum of the ranks that the ties would have attained |
| `rank(x, ties.method = "max")` | Ranks of the numbers (in increasing order) in vector x, with tied numbers given the maximum of the ranks that the ties would have attained |
| `quantile(x)` | The 0th, 25th, 50th, 75th, and 100th percentiles (i.e., the quartiles) of the numbers in vector x. (That’s not a misprint: `quantile(x)` returns the quartiles of x.) |
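
The effect of the different `ties.method` settings is easiest to see on a small vector that contains a tie. The following sketch uses a made-up vector `x` in which the value 20 appears twice.

```r
# Hypothetical data with one tie (20 appears twice)
x <- c(30, 10, 20, 20, 50)

sorted      <- sort(x)                        # 10 20 20 30 50
second_low  <- sort(x)[2]                     # 20
rank_avg    <- rank(x)                        # 4.0 1.0 2.5 2.5 5.0 ("average" is the default)
rank_min    <- rank(x, ties.method = "min")   # 4 1 2 2 5
rank_max    <- rank(x, ties.method = "max")   # 4 1 3 3 5
rank_desc   <- rank(-x)                       # 2.0 5.0 3.5 3.5 1.0
quartiles   <- quantile(x)                    # named vector: 0%, 25%, 50%, 75%, 100%

print(rank_avg)
print(quartiles)
```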

T-tests

| Function | What it Calculates |
| --- | --- |
| `t.test(x, mu = n, alternative = "two.sided")` | Two-tailed t-test that the mean of the numbers in vector x is different from n |
| `t.test(x, mu = n, alternative = "greater")` | One-tailed t-test that the mean of the numbers in vector x is greater than n |
| `t.test(x, mu = n, alternative = "less")` | One-tailed t-test that the mean of the numbers in vector x is less than n |
| `t.test(x, y, mu = 0, var.equal = TRUE, alternative = "two.sided")` | Two-tailed t-test that the mean of the numbers in vector x is different from the mean of the numbers in vector y. The variances in the two vectors are assumed to be equal. |
| `t.test(x, y, mu = 0, alternative = "two.sided", paired = TRUE)` | Two-tailed t-test that the mean of the numbers in vector x is different from the mean of the numbers in vector y. The vectors represent matched samples. |
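
To show how these calls look in practice, here is a small sketch with two invented vectors, `x` and `y`, of equal length. Each call returns an object whose pieces (test statistic, degrees of freedom, p-value) you can pull out by name.

```r
# Hypothetical measurements; the values are made up for illustration
x <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.5)
y <- c(4.8, 4.7, 5.2, 5.9, 5.4, 5.0, 5.1)

# One-sample, one-tailed test: is the mean of x greater than 5?
one_sample <- t.test(x, mu = 5, alternative = "greater")

# Two independent samples, equal variances assumed (pooled t-test)
pooled <- t.test(x, y, mu = 0, var.equal = TRUE, alternative = "two.sided")

# Matched samples: each x[i] is paired with y[i]
paired <- t.test(x, y, mu = 0, alternative = "two.sided", paired = TRUE)

# Pull individual results out of the returned object
print(one_sample$statistic)   # t value
print(one_sample$parameter)   # degrees of freedom
print(one_sample$p.value)     # p-value
```

The degrees of freedom differ by design: n - 1 for the one-sample and paired tests, and n1 + n2 - 2 for the pooled test.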

Analysis of Variance (ANOVA)

| Function | What it Calculates |
| --- | --- |
| `aov(y~x, data = d)` | Single-factor ANOVA, with the numbers in vector y as the dependent variable and the elements of vector x as the levels of the independent variable. The data are in data frame d. |
| `aov(y~x + Error(w/x), data = d)` | Repeated Measures ANOVA, with the numbers in vector y as the dependent variable and the elements in vector x as the levels of an independent variable. `Error(w/x)` indicates that each element in vector w experiences all the levels of x (i.e., x is a repeated measure). The data are in data frame d. |
| `aov(y~x*z, data = d)` | Two-factor ANOVA, with the numbers in vector y as the dependent variable and the elements of vectors x and z as the levels of the two independent variables. The data are in data frame d. |
| `aov(y~x*z + Error(w/z), data = d)` | Mixed ANOVA, with the numbers in vector y as the dependent variable and the elements of vectors x and z as the levels of the two independent variables. `Error(w/z)` indicates that each element in vector w experiences all the levels of z (i.e., z is a repeated measure). The data are in data frame d. |
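
The formulas above can be tried on data sets that ship with R. This sketch uses the built-in `PlantGrowth` and `ToothGrowth` data frames for the single-factor and two-factor cases, and a small invented data frame `d` (columns `w`, `x`, and `y` named to match the table) for the repeated measures case.

```r
# Single-factor ANOVA: plant weight by treatment group (built-in data)
one_way <- aov(weight ~ group, data = PlantGrowth)

# Two-factor ANOVA: tooth length by supplement type and dose.
# dose is stored as a number, so convert it to a factor first.
two_way <- aov(len ~ supp * factor(dose), data = ToothGrowth)

# Repeated measures ANOVA on hypothetical data:
# four subjects (w), each measured under all three conditions (x)
d <- data.frame(
  w = factor(rep(1:4, times = 3)),
  x = factor(rep(c("a", "b", "c"), each = 4)),
  y = c(5, 6, 4, 7,  7, 8, 6, 9,  9, 9, 8, 11)
)
repeated <- aov(y ~ x + Error(w/x), data = d)

print(summary(one_way))
print(summary(two_way))
print(summary(repeated))
```

Note that the independent variables must be factors. If a grouping column is numeric, `aov()` treats it as a continuous predictor, which is why `dose` is wrapped in `factor()`.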

Correlation and Regression

| Function | What it Calculates |
| --- | --- |
| `cor(x, y)` | Correlation coefficient between the numbers in vector x and the numbers in vector y |
| `cor.test(x, y)` | Correlation coefficient between the numbers in vector x and the numbers in vector y, along with a t-test of the significance of the correlation coefficient |
| `lm(y~x, data = d)` | Linear regression analysis with the numbers in vector y as the dependent variable and the numbers in vector x as the independent variable. Data are in data frame d. |
| `coefficients(a)` | Slope and intercept of linear regression model a |
| `confint(a)` | Confidence intervals of the slope and intercept of linear regression model a |
| `lm(y~x+z, data = d)` | Multiple regression analysis with the numbers in vector y as the dependent variable and the numbers in vectors x and z as the independent variables. Data are in data frame d. |

When you carry out an ANOVA or a regression analysis, store the analysis in a list.

For example, `a <- lm(y~x, data = d)`.

Then, to see the tabled results, use the `summary()` function:

`summary(a)`
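
Here is a short sketch of that store-then-summarize workflow, using the built-in `mtcars` data frame (miles per gallon predicted from car weight) rather than any data from the book.

```r
# Correlation between weight and fuel economy
r <- cor(mtcars$wt, mtcars$mpg)
r_test <- cor.test(mtcars$wt, mtcars$mpg)

# Store the regression analysis in an object, then inspect it
a <- lm(mpg ~ wt, data = mtcars)

print(summary(a))        # tabled results: coefficients, R-squared, F-test
print(coefficients(a))   # intercept and slope
print(confint(a))        # 95% confidence intervals by default

# Multiple regression: add horsepower as a second predictor
a2 <- lm(mpg ~ wt + hp, data = mtcars)
print(coefficients(a2))
```

With a single predictor, the R-squared reported by `summary(a)` is the square of the correlation coefficient `r`.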

## Tackling Machine Learning with R

Machine Learning (ML) is a popular area. R provides a number of ML-related packages and functions. Here are some of them:

Machine Learning Packages and Functions

| Package | Function | What it does |
| --- | --- | --- |
| `rattle` | `rattle()` | Opens the Rattle graphical user interface |
| `rpart` | `rpart()` | Creates a decision tree |
| `rpart.plot` | `prp()` | Draws a decision tree |
| `randomForest` | `randomForest()` | Creates a random forest of decision trees |
| `rattle` | `printRandomForests()` | Prints the rules of a forest’s individual decision trees |
| `e1071` | `svm()` | Trains a support vector machine |
| `e1071` | `predict()` | Creates a vector of predicted classifications based on a support vector machine |
| `kernlab` | `ksvm()` | Trains a support vector machine |
| base R (`stats`) | `kmeans()` | Creates a k-means clustering analysis |
| `nnet` | `nnet()` | Creates a neural network with one hidden layer |
| `NeuralNetTools` | `plotnet()` | Draws a neural network |
| `nnet` | `predict()` | Creates a vector of predictions based on a neural network |
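
As a minimal sketch of two entries from this table, the following code runs a k-means clustering and trains a decision tree on the built-in `iris` data frame. It assumes the `rpart` package is available, which is normally the case because it ships with standard R distributions as a recommended package. The choice of three clusters and the seed value are arbitrary.

```r
library(rpart)

# k-means clustering on the four numeric iris measurements.
# set.seed() makes the random starting centers reproducible.
set.seed(42)
km <- kmeans(iris[, 1:4], centers = 3, nstart = 25)

# Compare the clusters found with the known species
print(table(cluster = km$cluster, species = iris$Species))

# Decision tree that classifies species from all other columns
tree <- rpart(Species ~ ., data = iris, method = "class")

# predict() with type = "class" returns a factor of predicted labels
pred <- predict(tree, newdata = iris, type = "class")
accuracy <- mean(pred == iris$Species)
print(accuracy)
```

The accuracy here is measured on the same data the tree was trained on, so it is optimistic. For a real project you would hold out a test set.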

## Working with Large(ish) Databases in R

Created for statistical analysis, R has a wide array of packages and functions for dealing with large amounts of data. This selection is the tip of the iceberg’s tip:

Packages and Functions for Exploring Databases

| Package | Function | What it does |
| --- | --- | --- |
| `didrooRFM` | `findRFM()` | Performs a Recency, Frequency, Money analysis on a database of retail transactions |
| `vcd` | `assocstats()` | Calculates statistics for tables of categorical data |
| `vcd` | `assoc()` | Creates a graphic that shows deviations from independence in a table of categorical data |
| `tidyverse` | `glimpse()` | Provides a partial view of a data frame with the columns appearing onscreen as rows |
| `plotrix` | `std.error()` | Calculates the standard error of the mean |
| `dplyr` | `inner_join()` | Joins data frames, keeping only the rows that have a match in both |
| `lubridate` | `wday()` | Returns day of the week of a calendar date |
| `lubridate` | `ymd()` | Converts a year-month-day value into R date format |
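
The date and join functions in this table often appear together when you prepare transaction data. The sketch below assumes the `dplyr` and `lubridate` packages are installed. The `orders` and `customers` data frames and their column names are invented for illustration.

```r
suppressPackageStartupMessages({
  library(dplyr)
  library(lubridate)
})

# Hypothetical retail data
orders <- data.frame(
  customer_id = c(1, 2, 2, 3),
  order_date  = c("2024-01-15", "2024-02-03", "2024-02-10", "2024-03-01"),
  amount      = c(25.00, 40.50, 12.75, 99.99)
)
customers <- data.frame(
  customer_id = c(1, 2, 4),
  name        = c("Ana", "Ben", "Cy")
)

# ymd() parses year-month-day text into R Date values
orders$order_date <- ymd(orders$order_date)

# wday() returns the day of the week; label = TRUE gives names instead of numbers
orders$weekday <- wday(orders$order_date, label = TRUE)

# inner_join() keeps only orders whose customer_id appears in both data frames
joined <- inner_join(orders, customers, by = "customer_id")

# glimpse() shows each column as a row, with its type and first values
glimpse(joined)
```

Customer 3 has an order but no entry in `customers`, and customer 4 has no orders, so neither appears in `joined`.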

## Manipulating Maps and Images with R

Here are some packages and functions to help you get started using R to draw maps and to process images.

Packages and Functions for Plotting Maps and for Processing Images

| Package | Function | What it does |
| --- | --- | --- |
| `ggplot2` (with map outlines from `maps`) | `map_data()` | Returns a data frame of latitudes and longitudes |
| `ggmap` | `geocode()` | Returns latitude and longitude of a place-name |
| `magick` | `image_read()` | Reads an image into R and turns it into a magick object |
| `magick` | `image_resize()` | Resizes an image |
| `magick` | `image_rotate()` | Rotates an image |
| `magick` | `image_flip()` | Rotates an image on a horizontal axis |
| `magick` | `image_flop()` | Rotates an image on a vertical axis |
| `magick` | `image_annotate()` | Adds text to an image |
| `magick` | `image_background()` | Sets the background for an image |
| `magick` | `image_composite()` | Combines images |
| `magick` | `image_morph()` | Makes one image appear to gradually become (morph into) another |
| `magick` | `image_animate()` | Puts an animation into the RStudio Viewer window |
| `magick` | `image_apply()` | Applies a function to every frame in an animated GIF |
| `magick` | `image_write()` | Saves an animation as a reusable GIF |
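
A few of the `magick` functions can be tried without any image file on disk. This sketch assumes the `magick` package is installed, and it uses `image_blank()` (a `magick` function not listed in the table) to create a plain 200 x 100 canvas in place of `image_read()`. The sizes, color, and output file are arbitrary choices.

```r
library(magick)

# Create a blank canvas instead of reading a file with image_read()
img <- image_blank(width = 200, height = 100, color = "white")

# Rotate by 90 degrees: width and height swap
rotated <- image_rotate(img, 90)

# Resize to 100 pixels wide; the height scales to keep the aspect ratio
smaller <- image_resize(img, "100x")

# Mirror the image top-to-bottom (flip) and left-to-right (flop)
flipped <- image_flip(img)
flopped <- image_flop(img)

# image_info() reports format, width, height, and more
print(image_info(rotated))
print(image_info(smaller))

# Save the result to a temporary PNG file
out <- tempfile(fileext = ".png")
image_write(smaller, path = out, format = "png")
```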