How to Identify and Correct Dropped Dimensions in R

Every function in R expects your data to be in a specific format. That doesn't mean simply whether it's an integer, character, or factor, but also whether you supply a vector, a matrix, a data frame, or a list. Many functions can deal with multiple formats, but sometimes the result isn't what you expect at all.

For instance, R automatically tries to reduce the number of dimensions when subsetting a matrix, array, or data frame. If you want to calculate the row sums of the numeric variables in a data frame — for example, the built-in data frame sleep — you can write a little function like this:

rowsum.df <- function(x){

id <- sapply(x,is.numeric)

rowSums(x[, id])

}

If you try that out on two built-in data frames, pressure and sleep, you get a result for the first one but the following error message for the second:

> rowsum.df(sleep)

Error in rowSums(x[, id]) :

'x' must be an array of at least two dimensions

Because sleep contains only a single numeric variable, x[, id] returns a vector instead of a data frame, and that causes the error in rowSums().

You can solve this problem either by adding drop=FALSE or by using the list subsetting method x[i] instead.

  • Add a Comment
  • Print
  • Share
blog comments powered by Disqus
Advertisement

Inside Dummies.com