
# Machine Learning For Dummies Cheat Sheet

Machine learning is a technology that you already use more often than you might think, and it has the potential to do even more tomorrow. Both R and Python make machine learning tasks easier than most people realize, because both languages come with a lot of built-in and extended support (through libraries, datasets, and other resources). With that in mind, this cheat sheet gives you quick access to the most commonly needed reminders for making your machine learning experience fast and easy.

## Choosing the Right Algorithm for Machine Learning

Machine learning involves the use of many different algorithms. This table gives you a quick summary of the strengths and weaknesses of various algorithms.

| Algorithm | Best at | Pros | Cons |
| --- | --- | --- | --- |
| Random Forest | Almost any machine learning problem; bioinformatics | Can work in parallel; seldom overfits; automatically handles missing values; no need to transform any variable; no need to tweak parameters; can be used by almost anyone with excellent results | Difficult to interpret; weaker on regression when estimating values at the extremities of the distribution of response values; biased in multiclass problems toward more frequent classes |
| Gradient Boosting | Almost any machine learning problem; search engines (solving the problem of learning to rank) | Can approximate most nonlinear functions; best-in-class predictor; automatically handles missing values; no need to transform any variable | Can overfit if run for too many iterations; sensitive to noisy data and outliers; doesn't work well without parameter tuning |
| Linear regression | Baseline predictions; econometric predictions; modelling marketing responses | Simple to understand and explain; seldom overfits; L1 and L2 regularization is effective in feature selection; fast to train; easy to train on big data thanks to its stochastic version | You have to work hard to make it fit nonlinear functions; can suffer from outliers |
| Support Vector Machines | Character recognition; image recognition; text classification | Automatic nonlinear feature creation; can approximate complex nonlinear functions; works only with a portion of the examples (the support vectors) | Difficult to interpret when applying nonlinear kernels; suffers from too many examples (beyond roughly 10,000 examples, training starts taking too long) |
| K-nearest Neighbors | Computer vision; multilabel tagging; recommender systems; spell-checking problems | Fast, lazy training; can naturally handle extreme multiclass problems (like tagging text) | Slow and cumbersome in the predicting phase; can fail to predict correctly due to the curse of dimensionality |
| AdaBoost | Face detection | Automatically handles missing values; no need to transform any variable; doesn't overfit easily; few parameters to tweak; can leverage many different weak learners | Sensitive to noisy data and outliers; never the best-in-class predictions |
| Naive Bayes | Face recognition; sentiment analysis; spam detection; text classification | Easy and fast to implement; doesn't require much memory and can be used for online learning; easy to understand; takes prior knowledge into account | Strong and unrealistic feature independence assumptions; fails at estimating rare occurrences; suffers from irrelevant features |
| Neural Networks | Image recognition; language recognition and translation; speech recognition; vision recognition | Can approximate any nonlinear function; robust to outliers | Very difficult to set up; difficult to tune because of the many parameters and the need to decide the architecture of the network; difficult to interpret; easy to overfit |
| Logistic regression | Ordering results by probability; modelling marketing responses | Simple to understand and explain; seldom overfits; L1 and L2 regularization is effective in feature selection; the best algorithm for predicting probabilities of an event; fast to train; easy to train on big data thanks to its stochastic version | You have to work hard to make it fit nonlinear functions; can suffer from outliers |
| SVD | Recommender systems | Can restructure data in a meaningful way | Difficult to understand why data has been restructured in a certain way |
| PCA | Removing collinearity; reducing dimensions of the dataset | Can reduce data dimensionality | Implies strong linear assumptions (components are weighted summations of features) |
| K-means | Segmentation | Fast in finding clusters; can detect outliers in multiple dimensions | Suffers from multicollinearity; clusters are spherical, so it can't detect groups of other shapes; unstable solutions that depend on initialization |
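
In practice, a summary table like this one is often a starting point, and a quick cross-validated comparison can help confirm which candidate suits your data. The following is a minimal sketch, assuming scikit-learn is installed; the synthetic dataset, the three candidate models, and the settings (such as `n_estimators=50` and five folds) are illustrative assumptions, not recommendations from the table.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic data stands in for a real problem (an assumption for illustration).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)

# Three candidates taken from the table; hyperparameters are arbitrary examples.
candidates = {
    "Logistic regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(n_estimators=50, random_state=0),
}

# Mean accuracy over 5 cross-validation folds for each candidate.
scores = {}
for name, model in candidates.items():
    fold_scores = cross_val_score(model, X, y, cv=5)
    scores[name] = fold_scores.mean()

# Print the candidates from best to worst mean accuracy.
for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

The ranking you get depends on the data, so treat the output as a sanity check on the table's guidance rather than a verdict on the algorithms themselves.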

## Getting the Right Library for Machine Learning

When working with R and Python for machine learning, you gain the benefit of not having to reinvent the wheel when it comes to algorithms. There is a library available to meet your specific needs; you just need to know which one to use. This table lists the libraries used for machine learning in both R and Python. When you want to perform any algorithm-related task, simply load the library needed for that task into your programming environment.

| Algorithm | Python implementation | R implementation |
| --- | --- | --- |
| AdaBoost | sklearn.ensemble.AdaBoostClassifier; sklearn.ensemble.AdaBoostRegressor | library(ada) : ada |
| Gradient Boosting | sklearn.ensemble.GradientBoostingClassifier; sklearn.ensemble.GradientBoostingRegressor | library(gbm) : gbm |
| K-means | sklearn.cluster.KMeans; sklearn.cluster.MiniBatchKMeans | library(stats) : kmeans |
| K-nearest Neighbors | sklearn.neighbors.KNeighborsClassifier; sklearn.neighbors.KNeighborsRegressor | library(class) : knn |
| Linear regression | sklearn.linear_model.LinearRegression; sklearn.linear_model.Ridge; sklearn.linear_model.Lasso; sklearn.linear_model.ElasticNet; sklearn.linear_model.SGDRegressor | library(stats) : lm; library(stats) : glm; library(MASS) : lm.ridge; library(lars) : lars; library(glmnet) : glmnet |
| Logistic regression | sklearn.linear_model.LogisticRegression; sklearn.linear_model.SGDClassifier | library(stats) : glm; library(glmnet) : glmnet |
| Naive Bayes | sklearn.naive_bayes.GaussianNB; sklearn.naive_bayes.MultinomialNB; sklearn.naive_bayes.BernoulliNB | library(klaR) : NaiveBayes; library(e1071) : naiveBayes |
| Neural Networks | sklearn.neural_network.BernoulliRBM; sklearn.neural_network.MLPClassifier and sklearn.neural_network.MLPRegressor (supervised neural networks, available from version 0.18 of Scikit-learn) | library(neuralnet) : neuralnet; library(AMORE) : train; library(nnet) : nnet |
| PCA | sklearn.decomposition.PCA | library(stats) : princomp; library(stats) : prcomp |
| Random Forest | sklearn.ensemble.RandomForestClassifier; sklearn.ensemble.RandomForestRegressor; sklearn.ensemble.ExtraTreesClassifier; sklearn.ensemble.ExtraTreesRegressor | library(randomForest) : randomForest |
| Support Vector Machines | sklearn.svm.SVC; sklearn.svm.LinearSVC; sklearn.svm.NuSVC; sklearn.svm.SVR; sklearn.svm.LinearSVR; sklearn.svm.NuSVR; sklearn.svm.OneClassSVM | library(e1071) : svm |
| SVD | sklearn.decomposition.TruncatedSVD; sklearn.decomposition.NMF | library(irlba) : irlba; library(svd) : svd |
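
On the Python side, each dotted name in the table corresponds to an import path, and the Scikit-learn classes share the same `fit`-based interface. The following is a minimal sketch, assuming scikit-learn is installed; the iris dataset and the parameter values (`n_components=2`, `n_clusters=3`, `n_neighbors=5`) are illustrative choices only.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

# A small built-in dataset, used here purely for illustration.
X, y = load_iris(return_X_y=True)

# sklearn.decomposition.PCA: reduce the four features to two components.
X_2d = PCA(n_components=2).fit_transform(X)

# sklearn.cluster.KMeans: group the reduced data into three clusters.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_2d)
labels = kmeans.labels_

# sklearn.neighbors.KNeighborsClassifier: supervised fit, then predict.
knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)
predicted = knn.predict(X[:3])

print(X_2d.shape, len(set(labels)), predicted)
```

Swapping in another class from the table usually means changing only the import line and the constructor call, because the `fit` and `predict` (or `fit_transform`) calls keep the same shape.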

## Locating the Algorithm You Need for Machine Learning

There are a number of different algorithms you can use for machine learning. However, finding the specific algorithm you want to know about can be difficult. This table provides you with the online location for information about the algorithms used in machine learning.