Using the Rattle Package with R

By Joseph Schmuller

R has numerous functions and packages that deal with machine learning (ML). Data science honcho Graham Williams has created Rattle, a graphical user interface (GUI) to many of these functions. You can use Rattle for certain ML projects.

Much of what Rattle does depends on a package called RGtk2, which lets R functions access GTK, the GUI toolkit originally built for the GNU Image Manipulation Program (GIMP). (GIMP is a widely used open source image editor.) So the first thing to do is download and install this package. On the Packages tab in RStudio, click Install. In the Install Packages dialog box, type RGtk2 and click Install. After the download finishes, find RGtk2 on the Packages tab and select its check box to load it.

Now do the same for Rattle: On the Packages tab, click Install. In the Install Packages dialog box, type rattle and click Install. When the download finishes, find Rattle on the Packages tab and click its check box.
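
If you prefer typing to clicking, the same installation can be sketched in the console. Treat this as one possible approach rather than the official procedure: missing_packages is a made-up helper name, and the sketch assumes that both packages are available from the repository your R installation is configured to use, which may not be true for RGtk2 on every R version or platform.

```r
# Hypothetical helper: which of the requested packages are not yet installed?
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Only touch the network and load packages in an interactive session.
if (interactive()) {
  to_install <- missing_packages(c("RGtk2", "rattle"))
  if (length(to_install) > 0) {
    install.packages(to_install)  # same job as the Install Packages dialog box
  }
  library(RGtk2)   # same job as selecting the RGtk2 check box
  library(rattle)  # same job as selecting the rattle check box
}
```

Selecting a package's check box on the Packages tab runs library() for you, so the last two lines do by hand what the check boxes do.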

In RStudio’s Script panel, type

rattle()

and then press Ctrl+R (or Ctrl+Enter) to run. The figure below shows the window that opens. The window might not be visible at first (it might have opened behind other windows, for example), so you might have to hunt around for it, but you’ll find it. Expand it to match the figure.

The Rattle window.

The main panel presents a welcome message and some info about Rattle. The menu bar at the top features Project (for starting, opening, and saving Rattle projects), Tools (a menu of choices that correspond to buttons and tabs), Settings (which deals with graphics), and Help.

The row below the menu bar holds icons, the most important of which is Execute. The idea is to look at each tab and make selections, and then click Execute to carry out those selections. (If you’re a Trekkie, think of clicking the Execute icon as Captain Picard saying “Make it so!”)

The next row holds the tabs. The first tab (on the left) is for Data. This tab presents the welcome message and, more importantly, allows you to choose the data source. The Explore tab is for — you guessed it — exploring data. The Test tab supplies two-sample statistical tests. If you have to transform data, the Transform tab is for you. The Cluster tab enables several kinds of cluster analysis, a type of unsupervised learning. The Associate tab sets you up with association analysis, which identifies relationships between variables. The Model tab provides several kinds of ML, including decision trees, support vector machines, and neural networks. The next tab allows you to Evaluate your ML creation. The Log tab tracks your interactions with Rattle as R script, which can be quite instructive if you’re trying to learn R.
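
To get a feel for what the tabs do behind the scenes, here is a rough hand-written sketch of a cluster, model, and evaluate sequence in plain R. It is not output copied from the Log tab. It assumes the built-in iris data set and the rpart package (a recommended package that ships with most R installations), and the variable names are made up for illustration.

```r
library(rpart)  # decision trees; usually installed alongside R

# Cluster tab analogue: k-means on the four numeric iris measurements.
set.seed(42)  # k-means starts from random centers, so fix the seed
clusters <- kmeans(iris[, 1:4], centers = 3)

# Model tab analogue: a classification tree that predicts Species.
tree <- rpart(Species ~ ., data = iris, method = "class")

# Evaluate tab analogue: confusion matrix of actual vs. predicted classes.
predicted <- predict(tree, iris, type = "class")
confusion <- table(actual = iris$Species, predicted = predicted)
print(confusion)
```

Scoring the tree on the same data that trained it flatters the model. A proper evaluation scores it on data held back from training.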

Remember that Rattle is a GUI to R functions for some complex analyses, and you can’t always know in advance what those functions are or which packages they live in. Accordingly, a frequent part of the interaction with Rattle is a dialog box that opens and says that you have to install a particular package, and asks whether you want to install it. Always click Yes.
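
The check-then-offer-to-install pattern behind that dialog box can be sketched in a few lines of R. This is an illustration of the general idea, not Rattle's actual code, and ensure_package is a hypothetical name.

```r
# Hypothetical helper: returns TRUE when `pkg` is usable, offering to
# install it first if it is missing (only when running interactively).
ensure_package <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(TRUE)
  }
  if (!interactive()) {
    return(FALSE)  # never install silently from a non-interactive script
  }
  answer <- readline(sprintf("Package '%s' is required. Install it? (y/n) ", pkg))
  if (tolower(trimws(answer)) == "y") {
    install.packages(pkg)
    return(requireNamespace(pkg, quietly = TRUE))
  }
  FALSE
}
```

Calling ensure_package("rpart") before fitting a decision tree, for example, would play the same role as clicking Yes in the dialog box.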