
R Project for ML Concepts: Titanic

By: Joseph Schmuller

Updated: 04-10-2018

From The Book: R Projects For Dummies

A dataset that’s often used to illustrate ML concepts in R programming is the information about passengers on the Titanic’s disastrous voyage in 1912. The target variable is whether the passenger survived. You can use this data to create a decision tree.

The data resides in an R package called titanic. If it’s not already on the Packages tab, click Install. In the Install Packages dialog box, type titanic and click the Install button. After the package downloads, find it on the Packages tab and select its check box.
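If you prefer typing to clicking, the same install-and-load steps can be run from the R console. This is a minimal sketch that assumes you have an internet connection and a CRAN mirror configured; the `requireNamespace()` guard is a convenience added here, not part of the original steps.

```r
# Install the titanic package only if it is not already available,
# then attach it (the console counterpart of selecting its check box).
if (!requireNamespace("titanic", quietly = TRUE)) {
  install.packages("titanic")
}
library(titanic)
```

Attaching the package with `library()` has the same effect as selecting its check box on the Packages tab.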

In the titanic package, you’ll find titanic_train and titanic_test. Don’t be tempted to use one as the training set and the other as the test set for this particular application of Rattle. The titanic_test set doesn’t include the Survived variable, so it’s not usable for testing a decision tree the way I lay out the process here.
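One quick way to see the difference for yourself is to compare the column names of the two data frames. The sketch below assumes the `titanic` package is installed; the variable name `missing_from_test` is just an illustrative choice.

```r
library(titanic)

# Columns that appear in the training data but not in the test data.
missing_from_test <- setdiff(names(titanic_train), names(titanic_test))
print(missing_from_test)

# Check each data frame for the target variable directly.
print("Survived" %in% names(titanic_train))
print("Survived" %in% names(titanic_test))
```

If `Survived` shows up in `missing_from_test`, the test set has no outcomes to score predictions against.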

Instead, create the data frame like this:

library(titanic)
titanic.df <- titanic_train

Then use Rattle’s Data tab to read in the dataset. This image shows what the Data tab looks like after a few modifications.

The Rattle Data tab, after modifying the titanic.df dataset.

What are those modifications? First, a rule of thumb: If a variable is categoric, has a lot of unique values, and isn’t already classified as an Ident (identifier), click its Ignore radio button. Also, when first encountering this dataset, Rattle thinks Embarked is the target variable. Use the radio buttons to change Embarked to Categoric and to change Survived to Target.
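To apply the rule of thumb, it can help to count the unique values in each text column before you start clicking radio buttons. The following is a rough sketch, assuming the `titanic` package is installed; the helper names `is_text` and `unique_counts` are illustrative, and what counts as "a lot" of unique values is your call.

```r
library(titanic)
titanic.df <- titanic_train

# Flag the columns stored as text (character or factor).
is_text <- vapply(
  titanic.df,
  function(col) is.character(col) || is.factor(col),
  logical(1)
)

# Count the distinct values in each text column.
unique_counts <- vapply(
  titanic.df[is_text],
  function(col) length(unique(col)),
  integer(1)
)

# Columns near the top of this list are candidates for Ignore.
print(sort(unique_counts, decreasing = TRUE))
```

Columns whose counts approach the number of rows behave more like identifiers than categories, which is why they add little to a decision tree.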

Good luck!

About This Article

About the book author:

Joseph Schmuller, PhD, is a veteran of more than 25 years in Information Technology. He is the author of several books, including Statistical Analysis with R For Dummies and four editions of Statistical Analysis with Excel For Dummies. In addition, he has written numerous articles and created online coursework for Lynda.com.
