Looking at the Rattle Log for R Programming
The Log tab shows your interactions with Rattle as R code. Here’s a good example of working with the log.
In the hierarchical clustering analysis, click on Data Plot. You see a plot that looks very much like this.
To find the code that produced this plot, select the Log tab and scroll down until you find this:
plot(crs$dataset[, c(1:4)], col=cutree(crs$hclust,3))
Copy and paste that line into the RStudio Script panel and then press Ctrl+R to run it.
On the Plots tab, you see the same scatterplot matrix, but without the title. The plotting characters aren’t filled, and their border colors (black, red, and green) are the colors of the clusters to which Rattle has assigned them.
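The logged line depends on objects that Rattle creates in the crs environment. If you want to try it in a fresh R session, here is a minimal sketch that builds stand-ins for them. It assumes R's built-in iris data frame in place of the data set loaded into Rattle, and it assumes Euclidean distance with Ward linkage; the settings Rattle actually used may differ, so the cluster assignments may not match exactly.

```r
# Stand-in for the crs environment that Rattle normally creates.
# Assumption: the built-in iris data replaces the data set loaded in Rattle.
crs <- new.env()
crs$dataset <- iris

# Assumption: Euclidean distance and Ward linkage on the four numeric columns.
crs$hclust <- hclust(dist(crs$dataset[, 1:4]), method = "ward.D2")

# Cut the tree into three clusters; the result is one cluster number per row.
clusters <- cutree(crs$hclust, 3)

# Same call as in the log: cluster numbers 1, 2, 3 index the default palette.
plot(crs$dataset[, c(1:4)], col = clusters)
```

Because col receives the cluster numbers directly, each point is drawn in the palette color at position 1, 2, or 3.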
To make the matrix look more like the image above, change crs$dataset[, c(1:4)] to crs$dataset[, c(1:5)]. This change adds the fifth row and the fifth column to the matrix.
Add the argument lower.panel=NULL to eliminate everything below the main diagonal. Then add plot character arguments so that the code is
plot(crs$dataset[, c(1:5)], col=cutree(crs$hclust, 3), lower.panel=NULL, pch=21, cex=2, bg=c("black","grey","white")[iris.uci$species])
Now the border color of each character corresponds to its assigned cluster, and its fill color corresponds to its species. If you run this code, you see that in the scatterplots, some of the plot characters have red borders and are filled with gray and some red-border characters are filled with white. In the fifth column, all points in the rightmost group should have green borders, but some have red borders. What does all this tell us? That the clustering isn’t perfect! That is, the three clusters do not correspond exactly with the three species.
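The mismatch you see in the plot can also be counted. The following is a minimal sketch that cross-tabulates cluster membership against species. It assumes the built-in iris data frame (with its Species column) rather than the data set used above, and Ward linkage on Euclidean distances, so the counts may differ from what Rattle produces.

```r
# Assumption: built-in iris stands in for the data set; Ward linkage is a guess
# at the clustering settings, not necessarily what Rattle used.
hc <- hclust(dist(iris[, 1:4]), method = "ward.D2")
clusters <- cutree(hc, 3)

# Rows are clusters, columns are species; each cell counts the flowers
# that fall into that cluster/species combination.
agreement <- table(cluster = clusters, species = iris$Species)
print(agreement)

# A cluster that holds more than one species has more than one nonzero cell.
mixed <- rowSums(agreement > 0) > 1
print(mixed)
```

If the clusters matched the species perfectly, every row of the table would have exactly one nonzero cell and every element of mixed would be FALSE.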
Poking around in the Rattle log was a pretty good idea!
Rattle’s Evaluation tab has procedures for evaluating your ML creations.