##### R Projects For Dummies

To complete any project using R, you work with functions that live in packages designed for specific areas. This cheat sheet provides some information about these functions.

## Interacting with users with R functions

R provides the `shiny` package and the `shinydashboard` package for developing interactive applications. The tables that follow cover a selection of statistical functions that come with the standard R installation. You’ll find many others in R packages.

| Function | What it Calculates |
| --- | --- |
| `mean(x)` | Mean of the numbers in vector x |
| `median(x)` | Median of the numbers in vector x |
| `var(x)` | Estimated variance of the population from which the numbers in vector x are sampled |
| `sd(x)` | Estimated standard deviation of the population from which the numbers in vector x are sampled |
| `scale(x)` | Standard scores (z-scores) for the numbers in vector x |
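To see these functions side by side, here is a minimal sketch. The vector `x` holds made-up numbers chosen only for illustration; any numeric vector would work the same way.

```r
# Hypothetical sample data (not from any real study)
x <- c(2, 4, 4, 5, 7, 9, 11)

m   <- mean(x)     # 6
med <- median(x)   # 5
v   <- var(x)      # 10 (sample variance, which divides by n - 1)
s   <- sd(x)       # square root of var(x)

# scale() returns a one-column matrix; as.vector() drops the matrix structure
z <- as.vector(scale(x))
z                  # z-scores: mean 0, standard deviation 1
```

Because `var()` and `sd()` divide by n - 1, they estimate population values from a sample rather than describe the sample itself.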

Relative Standing

| Function | What it Calculates |
| --- | --- |
| `sort(x)` | The numbers in vector x in increasing order |
| `sort(x)[n]` | The nth smallest number in vector x |
| `rank(x)` | Ranks of the numbers (in increasing order) in vector x |
| `rank(-x)` | Ranks of the numbers (in decreasing order) in vector x |
| `rank(x, ties.method = "average")` | Ranks of the numbers (in increasing order) in vector x, with tied numbers given the average of the ranks that the ties would have attained |
| `rank(x, ties.method = "min")` | Ranks of the numbers (in increasing order) in vector x, with tied numbers given the minimum of the ranks that the ties would have attained |
| `rank(x, ties.method = "max")` | Ranks of the numbers (in increasing order) in vector x, with tied numbers given the maximum of the ranks that the ties would have attained |
| `quantile(x)` | The 0th, 25th, 50th, 75th, and 100th percentiles (i.e., the quartiles) of the numbers in vector x. (That’s not a misprint: `quantile(x)` returns the quartiles of x.) |
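The effect of `ties.method` is easiest to see with a vector that contains a tie. The following sketch uses a small made-up vector, chosen so that the value 20 appears twice.

```r
# Hypothetical data with one tie (20 appears twice)
x <- c(30, 10, 20, 20, 50)

sorted       <- sort(x)        # 10 20 20 30 50
second_small <- sort(x)[2]     # 20

r_avg  <- rank(x)                        # 4 1 2.5 2.5 5 ("average" is the default)
r_desc <- rank(-x)                       # 2 5 3.5 3.5 1
r_min  <- rank(x, ties.method = "min")   # 4 1 2 2 5
r_max  <- rank(x, ties.method = "max")   # 4 1 3 3 5

q <- quantile(x)   # named vector: 0%, 25%, 50%, 75%, 100%
q
```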

T-tests

| Function | What it Calculates |
| --- | --- |
| `t.test(x, mu = n, alternative = "two.sided")` | Two-tailed t-test that the mean of the numbers in vector x is different from n |
| `t.test(x, mu = n, alternative = "greater")` | One-tailed t-test that the mean of the numbers in vector x is greater than n |
| `t.test(x, mu = n, alternative = "less")` | One-tailed t-test that the mean of the numbers in vector x is less than n |
| `t.test(x, y, mu = 0, var.equal = TRUE, alternative = "two.sided")` | Two-tailed t-test that the mean of the numbers in vector x is different from the mean of the numbers in vector y. The variances in the two vectors are assumed to be equal. |
| `t.test(x, y, mu = 0, alternative = "two.sided", paired = TRUE)` | Two-tailed t-test that the mean of the numbers in vector x is different from the mean of the numbers in vector y. The vectors represent matched samples. |
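As a rough illustration of these calls, the sketch below runs three of them on simulated data. The vectors, the seed, and the comparison value of 50 are all assumptions made for the example.

```r
# Simulated data for illustration only; the seed makes the run repeatable
set.seed(1)
x <- rnorm(20, mean = 52, sd = 5)
y <- rnorm(20, mean = 50, sd = 5)

# One-sample, one-tailed: is the mean of x greater than 50?
one_sample <- t.test(x, mu = 50, alternative = "greater")

# Two independent samples, equal variances assumed
two_sample <- t.test(x, y, mu = 0, var.equal = TRUE, alternative = "two.sided")

# Matched samples: x[i] and y[i] are treated as a pair
paired <- t.test(x, y, mu = 0, alternative = "two.sided", paired = TRUE)

# Each result is a list; pull out individual pieces by name
one_sample$statistic   # t value
one_sample$p.value     # p-value
two_sample$parameter   # degrees of freedom (n1 + n2 - 2 = 38)
paired$parameter       # degrees of freedom (number of pairs - 1 = 19)
```

Printing any of the three objects (for example, typing `two_sample`) shows the full test summary.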

Analysis of Variance (ANOVA)

| Function | What it Calculates |
| --- | --- |
| `aov(y~x, data = d)` | Single-factor ANOVA, with the numbers in vector y as the dependent variable and the elements of vector x as the levels of the independent variable. The data are in data frame d. |
| `aov(y~x + Error(w/x), data = d)` | Repeated Measures ANOVA, with the numbers in vector y as the dependent variable and the elements in vector x as the levels of an independent variable. `Error(w/x)` indicates that each element in vector w experiences all the levels of x (i.e., x is a repeated measure). The data are in data frame d. |
| `aov(y~x*z, data = d)` | Two-factor ANOVA, with the numbers in vector y as the dependent variable and the elements of vectors x and z as the levels of the two independent variables. The data are in data frame d. |
| `aov(y~x*z + Error(w/z), data = d)` | Mixed ANOVA, with the numbers in vector y as the dependent variable and the elements of vectors x and z as the levels of the two independent variables. `Error(w/z)` indicates that each element in vector w experiences all the levels of z (i.e., z is a repeated measure). The data are in data frame d. |
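The following sketch shows the single-factor and repeated-measures forms on a small simulated data frame. The column names match the table, but the data, the seed, and the design (six subjects measured at three levels of x) are assumptions made for illustration.

```r
# Simulated data: 6 hypothetical subjects (w), each measured at 3 levels of x
set.seed(42)
d <- data.frame(
  w = factor(rep(1:6, times = 3)),               # subject identifier
  x = factor(rep(c("a", "b", "c"), each = 6)),   # levels of the factor
  y = rnorm(18, mean = rep(c(10, 12, 15), each = 6))
)

# Single-factor ANOVA: treats all 18 scores as independent
one_way <- aov(y ~ x, data = d)
one_way_table <- summary(one_way)
one_way_table

# Repeated measures: the Error() term tells aov() that x varies within subject w
repeated <- aov(y ~ x + Error(w/x), data = d)
summary(repeated)
```

Note that `w` and `x` are stored as factors. If they were numeric, `aov()` would treat them as continuous predictors instead of as group labels.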

Correlation and Regression

| Function | What it Calculates |
| --- | --- |
| `cor(x, y)` | Correlation coefficient between the numbers in vector x and the numbers in vector y |
| `cor.test(x, y)` | Correlation coefficient between the numbers in vector x and the numbers in vector y, along with a t-test of the significance of the correlation coefficient |
| `lm(y~x, data = d)` | Linear regression analysis with the numbers in vector y as the dependent variable and the numbers in vector x as the independent variable. Data are in data frame d. |
| `coefficients(a)` | Slope and intercept of linear regression model a |
| `confint(a)` | Confidence intervals of the slope and intercept of linear regression model a |
| `lm(y~x+z, data = d)` | Multiple regression analysis with the numbers in vector y as the dependent variable and the numbers in vectors x and z as the independent variables. Data are in data frame d. |

When you carry out an ANOVA or a regression analysis, store the result in an object (R returns the fitted model as a list).

For example, `a <- lm(y~x, data = d)`.

Then, to see the tabled results, use the `summary()` function:

`summary(a)`
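Putting the correlation and regression functions together, a minimal sketch might look like the following. It borrows columns from R's built-in `mtcars` data set and renames them `x`, `z`, and `y` to match the tables. That choice of data is an assumption made purely for illustration.

```r
# Build a data frame d from the built-in mtcars data (illustrative choice)
d <- data.frame(
  x = mtcars$wt,    # car weight
  z = mtcars$hp,    # horsepower
  y = mtcars$mpg    # miles per gallon
)

r <- cor(d$x, d$y)   # correlation coefficient (negative here: heavier cars get lower mpg)
cor.test(d$x, d$y)   # the same correlation plus a significance test

a <- lm(y ~ x, data = d)   # simple regression, stored in a
slope_intercept <- coefficients(a)
intervals <- confint(a)    # 95% confidence intervals by default
summary(a)                 # the tabled results

b <- lm(y ~ x + z, data = d)   # multiple regression with two predictors
summary(b)
```

Because `a` is a list, you can also pull out parts of it directly, such as `a$residuals` or `a$fitted.values`.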

## Tackling machine learning with R

Machine Learning (ML) is a popular area. R provides a number of ML-related packages and functions. Here are some of them:

Machine Learning Packages and Functions

| Package | Function | What it does |
| --- | --- | --- |
| `rattle` | `rattle()` | Opens the Rattle graphical user interface (GUI) |
| `rpart` | `rpart()` | Creates a decision tree |
| `rpart.plot` | `prp()` | Draws a decision tree |
| `randomForest` | `randomForest()` | Creates a random forest of decision trees |
| `rattle` | `printRandomForests()` | Prints the rules of a forest’s individual decision trees |
| `e1071` | `svm()` | Trains a support vector machine |
| `e1071` | `predict()` | Creates a vector of predicted classifications based on a support vector machine |
| `kernlab` | `ksvm()` | Trains a support vector machine |
| `stats` (included with R) | `kmeans()` | Creates a k-means clustering analysis |
| `nnet` | `nnet()` | Creates a neural network with one hidden layer |
| `NeuralNetTools` | `plotnet()` | Draws a neural network |
| `nnet` | `predict()` | Creates a vector of predictions based on a neural network |
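Of the functions in this table, `kmeans()` is the one that needs no extra installation, so it makes a convenient first experiment. The sketch below clusters R's built-in `iris` measurements. The data set, the choice of three clusters, and the seed are assumptions made for illustration.

```r
# k-means starts from random centers, so fix the seed for a repeatable run
set.seed(123)

# Use only the four numeric columns; Species is held back for comparison
measurements <- iris[, 1:4]

# Ask for 3 clusters; nstart = 25 tries 25 random starts and keeps the best
clusters <- kmeans(measurements, centers = 3, nstart = 25)

clusters$size      # number of flowers assigned to each cluster
clusters$centers   # mean of each measurement within each cluster

# Cross-tabulate cluster membership against the known species
table(cluster = clusters$cluster, species = iris$Species)
```

The cluster numbers themselves are arbitrary labels, so the cross-tabulation is the quickest way to see how the clusters line up with the species.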

## Working with large(ish) databases in R

Created for statistical analysis, R has a wide array of packages and functions for dealing with large amounts of data. This selection is the tip of the iceberg’s tip:

Packages and Functions for Exploring Databases

| Package | Function | What it does |
| --- | --- | --- |
| `didrooRFM` | `findRFM()` | Performs a Recency, Frequency, Money analysis on a database of retail transactions |
| `vcd` | `assocstats()` | Calculates statistics for tables of categorical data |
| `vcd` | `assoc()` | Creates a graphic that shows deviations from independence in a table of categorical data |
| `tidyverse` | `glimpse()` | Provides a partial view of a data frame with the columns appearing onscreen as rows |
| `plotrix` | `std.error()` | Calculates the standard error of the mean |
| `dplyr` | `inner_join()` | Joins data frames |
| `lubridate` | `wday()` | Returns day of the week of a calendar date |
| `lubridate` | `ymd()` | Returns a date in R date-format |

## Manipulating maps and images with R

Here are some packages and functions to help you get started using R to draw maps and to process images.

Packages and Functions for Plotting Maps and for Processing Images

| Package | Function | What it does |
| --- | --- | --- |
| `ggplot2` | `map_data()` | Returns a data frame of latitudes and longitudes (built from map data in the `maps` package) |
| `ggmap` | `geocode()` | Returns latitude and longitude of a place-name |
| `magick` | `image_read()` | Reads an image into R and turns it into a magick object |
| `magick` | `image_resize()` | Resizes an image |
| `magick` | `image_rotate()` | Rotates an image |
| `magick` | `image_flip()` | Rotates an image on a horizontal axis |
| `magick` | `image_flop()` | Rotates an image on a vertical axis |
| `magick` | `image_annotate()` | Adds text to an image |
| `magick` | `image_background()` | Sets the background for an image |
| `magick` | `image_composite()` | Combines images |
| `magick` | `image_morph()` | Makes one image appear to gradually become (morph into) another |
| `magick` | `image_animate()` | Puts an animation into the RStudio Viewer window |
| `magick` | `image_apply()` | Applies a function to every frame in an animated GIF |
| `magick` | `image_write()` | Saves an animation as a reusable GIF |