How to Traverse a List or Data Frame with R Apply Functions

When your data is in the form of a list, and you want to perform calculations on each element of that list in R, the appropriate apply function is lapply(). For example, to get the class of each element of iris, do the following:

> lapply(iris, class)
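The output of that call is not shown here, so the following is a minimal sketch for inspecting it yourself. It assumes only the built-in iris data set; the variable name col_classes is chosen for illustration and is not part of the original example.

```r
# iris ships with base R (datasets package), so nothing needs to be installed.
col_classes <- lapply(iris, class)

# lapply() always returns a list, with one element per column of the data frame.
is.list(col_classes)
length(col_classes)

# The list elements carry the column names, so they can be accessed by name.
names(col_classes)
col_classes$Species

# str() gives a compact overview of the whole result.
str(col_classes)
```

Because the result is a plain list, each element can hold a value of any length or type, which is why lapply() never needs to simplify anything.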

As you know, when you use sapply(), R attempts to simplify the results to a matrix or vector:

> sapply(iris, class)
Sepal.Length  Sepal.Width Petal.Length  Petal.Width      Species
   "numeric"    "numeric"    "numeric"    "numeric"     "factor"

Say you want to calculate the mean of each column of iris:

> sapply(iris, mean)
Sepal.Length  Sepal.Width Petal.Length  Petal.Width      Species
    5.843333     3.057333     3.758000     1.199333           NA
Warning message:
In mean.default(X[[5L]], ...) :
 argument is not numeric or logical: returning NA

This line of code has a problem: it produces a warning because Species is not a numeric column. To avoid the warning, you can write a small function inside sapply() that tests whether its argument is numeric. If it is, the function calculates the mean; otherwise, it simply returns NA.

The FUN argument of the apply family of functions can be any function, including your own custom functions. In fact, you can go one step further and define a function right inside the FUN argument of any apply-family call:

> sapply(iris, function(x) ifelse(is.numeric(x), mean(x), NA))
Sepal.Length  Sepal.Width Petal.Length  Petal.Width      Species
    5.843333     3.057333     3.758000     1.199333           NA

What’s happening here? You defined a function that takes a single argument x. If x is numeric, it returns mean(x); otherwise, it returns NA. Because sapply() traverses your list, each column, in turn, is passed to your function and evaluated.
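To make the traversal concrete, here is a rough sketch of the same calculation written as an explicit loop. It is only an illustration of the idea, not how sapply() is actually implemented; the names results, col, and x are chosen for this example.

```r
# Start with an empty numeric vector and fill it one column at a time.
results <- numeric(0)

for (col in names(iris)) {
  x <- iris[[col]]                       # the column handed to the function
  results[col] <- if (is.numeric(x)) mean(x) else NA
}

results

# The loop should agree with the sapply() call on the same data.
all.equal(results,
          sapply(iris, function(x) ifelse(is.numeric(x), mean(x), NA)))
```

The sapply() version expresses the same steps in one line and takes care of collecting and naming the results.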

When you define a nameless function like this inside another function, it’s called an anonymous function. Anonymous functions are useful when you want to calculate something fairly simple, but you don’t necessarily want to permanently store that function in your workspace.
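For comparison, the sketch below runs the same calculation once with a named function and once with an anonymous one. The name mean_if_numeric is hypothetical and used only for this illustration.

```r
# Named version: the function is stored in the workspace and can be reused.
mean_if_numeric <- function(x) ifelse(is.numeric(x), mean(x), NA)
named_result <- sapply(iris, mean_if_numeric)

# Anonymous version: the function exists only for the duration of this call.
anon_result <- sapply(iris, function(x) ifelse(is.numeric(x), mean(x), NA))

# Both approaches should produce the same named vector.
identical(named_result, anon_result)

# Only the named function is left behind in the workspace.
exists("mean_if_numeric")
```

A named function is the better choice once the same logic is needed in more than one place; for a one-off calculation, the anonymous form keeps the workspace uncluttered.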
