Choosing the Right Algorithm for Machine Learning

By John Paul Mueller, Luca Massaron

Part of Machine Learning For Dummies Cheat Sheet

Machine learning involves the use of many different algorithms. This quick reference summarizes what each algorithm is best at, along with its pros and cons, and follows each entry with a short, illustrative code sketch.

Random Forest

Best at:
- Almost any machine learning problem
- Bioinformatics

Pros:
- Can work in parallel
- Seldom overfits
- Automatically handles missing values
- No need to transform any variable
- No need to tweak parameters
- Can be used by almost anyone with excellent results

Cons:
- Difficult to interpret
- Weaker on regression when estimating values at the extremes of the distribution of response values
- Biased toward the more frequent classes in multiclass problems
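To see how little tweaking a random forest needs, here's a minimal sketch, assuming Python with scikit-learn and a synthetic dataset (the parameter values are illustrative, not recommendations):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    # Illustrative synthetic dataset standing in for a real problem
    X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

    # n_jobs=-1 grows the trees in parallel on all available cores
    model = RandomForestClassifier(n_estimators=100, n_jobs=-1, random_state=42)
    model.fit(X_train, y_train)
    print("Test accuracy:", model.score(X_test, y_test))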
Gradient Boosting

Best at:
- Almost any machine learning problem
- Search engines (solving the problem of learning to rank)

Pros:
- Can approximate most nonlinear functions
- Best-in-class predictor
- Automatically handles missing values
- No need to transform any variable

Cons:
- Can overfit if run for too many iterations
- Sensitive to noisy data and outliers
- Doesn't work well without parameter tuning
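Gradient boosting rewards careful tuning. The following minimal scikit-learn sketch uses synthetic data; the values of n_estimators, learning_rate, and max_depth are illustrative starting points you would normally tune:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

    # Too many boosting iterations can overfit, so treat n_estimators
    # and learning_rate as knobs to tune, not fixed settings
    model = GradientBoostingClassifier(n_estimators=200, learning_rate=0.1,
                                       max_depth=3, random_state=42)
    model.fit(X_train, y_train)
    print("Test accuracy:", model.score(X_test, y_test))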
Linear Regression

Best at:
- Baseline predictions
- Econometric predictions
- Modeling marketing responses

Pros:
- Simple to understand and explain
- Seldom overfits
- L1 and L2 regularization make it effective for feature selection
- Fast to train
- Easy to train on big data thanks to its stochastic version

Cons:
- You have to work hard to make it fit nonlinear functions
- Can suffer from outliers
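Here's a minimal sketch of both flavors, assuming Python with scikit-learn and synthetic regression data: the closed-form LinearRegression as a quick baseline, and SGDRegressor as the stochastic version that scales to big data.

    from sklearn.datasets import make_regression
    from sklearn.linear_model import LinearRegression, SGDRegressor
    from sklearn.model_selection import train_test_split

    X, y = make_regression(n_samples=1000, n_features=10, noise=10.0,
                           random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

    # Ordinary least squares: fast to train, easy to explain
    model = LinearRegression()
    model.fit(X_train, y_train)
    print("R^2 (baseline):", model.score(X_test, y_test))

    # The stochastic version learns by gradient descent, one example at
    # a time, which is what makes it practical on big data
    sgd = SGDRegressor(max_iter=1000, random_state=42)
    sgd.fit(X_train, y_train)
    print("R^2 (stochastic):", sgd.score(X_test, y_test))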
Support Vector Machines

Best at:
- Character recognition
- Image recognition
- Text classification

Pros:
- Automatic nonlinear feature creation
- Can approximate complex nonlinear functions
- Works with only a portion of the examples (the support vectors)

Cons:
- Difficult to interpret when applying nonlinear kernels
- Scales poorly with many examples; beyond roughly 10,000 examples it starts taking too long to train
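The digits dataset makes a handy small-scale character-recognition demo. A minimal sketch, assuming scikit-learn (the RBF kernel and default settings are illustrative choices):

    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

    # The RBF kernel creates nonlinear features automatically;
    # scaling matters because SVMs are sensitive to feature ranges
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf", gamma="scale"))
    model.fit(X_train, y_train)
    print("Test accuracy:", model.score(X_test, y_test))

    # Only these examples (the support vectors) define the boundary
    print("Support vectors per class:", model.named_steps["svc"].n_support_)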
K-Nearest Neighbors

Best at:
- Computer vision
- Multilabel tagging
- Recommender systems
- Spell-checking problems

Pros:
- Fast, lazy training
- Naturally handles extreme multiclass problems (such as tagging text)

Cons:
- Slow and cumbersome in the prediction phase
- Can fail to predict correctly because of the curse of dimensionality
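A minimal scikit-learn sketch on synthetic multiclass data (n_neighbors=5 is an illustrative choice) shows why the training phase is called lazy:

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier

    X, y = make_classification(n_samples=1000, n_features=10, n_informative=5,
                               n_classes=3, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

    # "Lazy" learning: fit() mostly just stores the training data;
    # the costly neighbor search happens at prediction time
    model = KNeighborsClassifier(n_neighbors=5)
    model.fit(X_train, y_train)
    print("Test accuracy:", model.score(X_test, y_test))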
AdaBoost

Best at:
- Face detection

Pros:
- Automatically handles missing values
- No need to transform any variable
- Doesn't overfit easily
- Few parameters to tweak
- Can leverage many different weak learners

Cons:
- Sensitive to noisy data and outliers
- Never produces best-in-class predictions
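A minimal scikit-learn sketch with synthetic data; by default AdaBoostClassifier boosts shallow decision trees (decision stumps), though you could swap in other weak learners:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

    # Few knobs to tweak: mainly the number of boosting rounds
    model = AdaBoostClassifier(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)
    print("Test accuracy:", model.score(X_test, y_test))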
Naive Bayes

Best at:
- Face recognition
- Sentiment analysis
- Spam detection
- Text classification

Pros:
- Easy and fast to implement; doesn't require much memory and can be used for online learning
- Easy to understand
- Takes prior knowledge into account

Cons:
- Makes strong, unrealistic feature-independence assumptions
- Fails at estimating rare occurrences
- Suffers from irrelevant features
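A minimal spam-detection sketch, assuming scikit-learn; the six-message corpus is made up purely to show the mechanics:

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB

    # Hypothetical toy corpus; 1 = spam, 0 = ham
    texts = ["win money now", "cheap pills offer", "meeting at noon",
             "lunch tomorrow?", "free money offer", "project status update"]
    labels = [1, 1, 0, 0, 1, 0]

    vectorizer = CountVectorizer()
    X = vectorizer.fit_transform(texts)

    # Fast to fit and light on memory; partial_fit() also supports
    # online learning on streaming data
    model = MultinomialNB()
    model.fit(X, labels)
    print(model.predict(vectorizer.transform(["free money offer now"])))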
Neural Networks

Best at:
- Image recognition
- Language recognition and translation
- Speech recognition
- Vision recognition

Pros:
- Can approximate any nonlinear function
- Robust to outliers

Cons:
- Very difficult to set up
- Difficult to tune because of the many parameters; you also have to decide the architecture of the network
- Difficult to interpret
- Easy to overfit
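As a small stand-in for the deep networks used in image and speech recognition, here's a minimal scikit-learn sketch of a single-hidden-layer perceptron on the digits dataset (the architecture and iteration count are illustrative assumptions):

    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

    # hidden_layer_sizes is the architecture decision; together with the
    # learning rate, iterations, and so on, it's what makes tuning hard
    model = make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=42),
    )
    model.fit(X_train, y_train)
    print("Test accuracy:", model.score(X_test, y_test))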
Logistic Regression

Best at:
- Ordering results by probability
- Modeling marketing responses

Pros:
- Simple to understand and explain
- Seldom overfits
- L1 and L2 regularization make it effective for feature selection
- The best algorithm for predicting the probability of an event
- Fast to train
- Easy to train on big data thanks to its stochastic version

Cons:
- You have to work hard to make it fit nonlinear functions
- Can suffer from outliers
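A minimal scikit-learn sketch with synthetic data; note that predict_proba() is what gives you the event probabilities used to order results:

    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

    # penalty="l2" is the default; penalty="l1" (with the liblinear or
    # saga solver) instead zeroes out weak features for feature selection
    model = LogisticRegression(penalty="l2", max_iter=1000)
    model.fit(X_train, y_train)

    # Probability estimates for each class, usable for ranking results
    print(model.predict_proba(X_test[:3]))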
SVD

Best at:
- Recommender systems

Pros:
- Can restructure data in a meaningful way

Cons:
- Difficult to understand why data has been restructured in a certain way
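A minimal sketch of the recommender idea, assuming scikit-learn's TruncatedSVD and a hypothetical four-user, four-item ratings matrix:

    import numpy as np
    from sklearn.decomposition import TruncatedSVD

    # Hypothetical user-by-item ratings (0 = not rated)
    ratings = np.array([[5, 4, 0, 1],
                        [4, 5, 1, 0],
                        [0, 1, 5, 4],
                        [1, 0, 4, 5]])

    # Two latent factors restructure users into "taste" dimensions;
    # interpreting what each factor means is the hard part
    svd = TruncatedSVD(n_components=2, random_state=42)
    user_factors = svd.fit_transform(ratings)
    print(user_factors)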
PCA

Best at:
- Removing collinearity
- Reducing the dimensions of the dataset

Pros:
- Can reduce data dimensionality

Cons:
- Implies strong linear assumptions (components are weighted summations of features)
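A minimal sketch, assuming scikit-learn, that builds deliberately collinear synthetic features and lets PCA compress them:

    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(42)
    base = rng.normal(size=(200, 2))
    # Add two features that are (nearly) linear copies of the first one
    X = np.column_stack([base, base[:, 0] * 2.0,
                         base[:, 0] + 0.1 * rng.normal(size=200)])

    # Each component is a weighted summation of the original features,
    # which is exactly the linear assumption mentioned above
    pca = PCA(n_components=2)
    X_reduced = pca.fit_transform(X)
    print("Explained variance ratio:", pca.explained_variance_ratio_)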
K-Means

Best at:
- Segmentation

Pros:
- Fast at finding clusters
- Can detect outliers in multiple dimensions

Cons:
- Suffers from multicollinearity
- Assumes spherical clusters, so it can't detect groups with other shapes
- Unstable solutions that depend on the initialization
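A minimal segmentation sketch, assuming scikit-learn and blob-shaped synthetic data (the roughly spherical clusters K-means expects); n_init reruns the algorithm from several initializations to soften the instability problem:

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs

    # Roughly spherical synthetic clusters, the shape K-means assumes
    X, _ = make_blobs(n_samples=300, centers=3, random_state=42)

    # n_init=10 runs K-means from ten random initializations and keeps
    # the best solution, mitigating the dependence on initialization
    model = KMeans(n_clusters=3, n_init=10, random_state=42)
    labels = model.fit_predict(X)
    print("Cluster sizes:", np.bincount(labels))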