How to Use the apply() Function to Summarize Arrays in R - dummies

# How to Use the apply() Function to Summarize Arrays in R

If your data is in an array or matrix and you want to summarize it, R’s apply() function is the tool for the job. It traverses the array or matrix along one or more dimensions (for example, by row or by column) and applies a summarizing function to each slice.

The apply() function takes four arguments:

• X: This is your data — an array (or matrix).

• MARGIN: A numeric vector indicating the dimension or dimensions over which to traverse; for a matrix, 1 means rows and 2 means columns.

• FUN: The function to apply (for example, sum or mean).

• ... (dots): Any additional arguments that FUN requires; apply() passes them on to FUN.
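As a minimal sketch of how these arguments fit together, the following uses a small made-up matrix (the name `m` and its values are illustrative only) and passes `na.rm = TRUE` through the dots to `sum()`:

```r
# Hypothetical 2 x 3 matrix with one missing value (illustrative data only)
m <- matrix(c(1, 2, 3, 4, NA, 6), nrow = 2)

# MARGIN = 1: one result per row; na.rm = TRUE is handed on to sum() via ...
row_totals <- apply(m, 1, sum, na.rm = TRUE)

# MARGIN = 2: one result per column
col_totals <- apply(m, 2, sum, na.rm = TRUE)

print(row_totals)
print(col_totals)
```

Without `na.rm = TRUE`, any row or column containing the `NA` would sum to `NA`.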

To illustrate this, look at the built-in dataset Titanic. This is a four-dimensional table of the people aboard the Titanic, cross-classified by class (including crew), sex, age group, and whether they survived.

```
> str(Titanic)
Table [1:4, 1:2, 1:2, 1:2] 0 0 35 0 0 0 17 0 118 154 ...
- attr(*, "dimnames")=List of 4
..$ Class   : chr [1:4] "1st" "2nd" "3rd" "Crew"
..$ Sex     : chr [1:2] "Male" "Female"
..$ Age     : chr [1:2] "Child" "Adult"
..$ Survived: chr [1:2] "No" "Yes"
```

To find out how many people were in each class, you need to summarize Titanic over its first dimension, Class:

```
> apply(Titanic, 1, sum)
 1st  2nd  3rd Crew
 325  285  706  885
```
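When an array has named dimnames, as Titanic does, MARGIN can also be given as a dimension name rather than a position. A brief sketch (the variable names `by_name` and `by_index` are illustrative):

```r
# Titanic ships with base R (datasets package), so no extra packages are needed

# MARGIN given by dimension name instead of position
by_name <- apply(Titanic, "Class", sum)

# The same summary using the positional index of the Class dimension
by_index <- apply(Titanic, 1, sum)

print(by_name)

# Compare the two results; they should agree
print(all.equal(by_name, by_index))
```

Using the name can make the call easier to read, because you don't have to remember which position each dimension occupies.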

Similarly, to calculate the number of passengers in the different age groups, you need to apply the sum() function over the third dimension:

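```
> apply(Titanic, 3, sum)
Child Adult
  109  2092
```

To summarize over two dimensions at once, combine them with c(). For example, c(3, 4) gives the counts by age group and survival:

```
> apply(Titanic, c(3, 4), sum)
       Survived
Age       No Yes
  Child   52  57
  Adult 1438 654
```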
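When MARGIN names two dimensions, apply() returns a matrix rather than a vector, and that matrix can itself be summarized with apply(). A small sketch using Class and Survived (the variable names are illustrative):

```r
# Sum over Sex and Age, keeping Class (dimension 1) and Survived (dimension 4)
class_by_survival <- apply(Titanic, c(1, 4), sum)

# The result is a matrix with one row per class and one column per outcome
print(dim(class_by_survival))
print(class_by_survival)

# Summing its rows should recover the per-class totals
class_totals <- apply(class_by_survival, 1, sum)
print(class_totals)
```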