How to Recognize and Fix List Errors in R - dummies

Although lists help with keeping data together and come in very handy when you’re processing multiple datasets in R, they can cause some trouble as well.

First, you can easily forget that some function returns a list instead of a vector. For example, many programmers forget that strsplit() returns a list instead of a vector. So, if you want the second word from a sentence, the following code doesn’t return an error, but it doesn’t give you the right answer either:

> strsplit('this is a sentence',' ')[2]
[[1]]
NULL

In this example, strsplit() returns a list with one element, the vector with the words from the sentence:

> strsplit('this is a sentence',' ')
[[1]]
[1] "this"     "is"       "a"        "sentence"

To access this vector, you first have to select the element you want from the list using double brackets. Only then can you look for the second value using the vector indices, like this:

> strsplit('this is a sentence',' ')[[1]][2]
[1] "is"
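One way to catch this kind of mistake early is to check what a function returns before indexing into the result. The following is a minimal sketch using the same sentence; the variable names words and second are illustrative only.

```r
# strsplit() always returns a list, even for a single input string
words <- strsplit('this is a sentence', ' ')

is.list(words)   # TRUE: the result is a list, not a character vector
length(words)    # 1: one list element per input string

# Double brackets extract the character vector stored in the list;
# single brackets then pick the second word from that vector
second <- words[[1]][2]
print(second)    # [1] "is"
```

If is.list() returns TRUE, reach into the list with double brackets before applying vector indices.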

Even the indexing mechanism itself can cause errors of this kind. For example, suppose you have some customer names and you want to add a dot between each first and last name. So, first, you split them like this:

> customer <- c('Johan Delong','Marie Petit')
> namesplit <- strsplit(customer,' ')

You want to paste the two parts of the second name together with a dot in between, so you need to select the second element from the list. If you use single brackets, you get the following:

> paste(namesplit[2],collapse='.')
[1] "c(\"Marie\", \"Petit\")"
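Single brackets on a list return another list, and paste() converts that list to text rather than joining the names. The sketch below shows one possible fix, assuming the goal is the string Marie.Petit: double brackets return the character vector itself, so paste() can collapse its elements. The vapply() call at the end is an added illustration of applying the same step to every customer, and the names dotted and all_dotted are hypothetical.

```r
customer <- c('Johan Delong', 'Marie Petit')
namesplit <- strsplit(customer, ' ')

# Single brackets keep the list wrapper around the selected element
class(namesplit[2])      # "list"

# Double brackets return the character vector stored in the second element
class(namesplit[[2]])    # "character"

# paste() now collapses the two name parts with a dot between them
dotted <- paste(namesplit[[2]], collapse = '.')
print(dotted)            # [1] "Marie.Petit"

# The same step for every customer; vapply() checks that each result
# is a single character string
all_dotted <- vapply(namesplit, paste, character(1), collapse = '.')
print(all_dotted)        # "Johan.Delong" "Marie.Petit"
```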