How to Create Subsets of Your Data in R
Often the first task in data processing is to create subsets of your data in R for further analysis. You’re already familiar with the three subset operators:
$: The dollar-sign operator selects a single element of your data by name (and drops the dimensions of the returned object). When you use this operator with a data frame, the result is a single column, which is usually a vector; when you use it with a named list, you get that element.
[[: The double-square-brackets operator also returns a single element, but it offers you the flexibility of referring to the elements by position as well as by name. You use it for data frames and lists.
[: The single-square-brackets operator can return multiple elements of your data.
This summary is simplified.
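To make the difference between the three operators concrete, here is a minimal sketch. It assumes the built-in mtcars data frame and a small hypothetical list called info, neither of which appears in the text above; they are used only for illustration.

```r
# mtcars is a built-in data frame; it is an assumption made for this example
small_df <- head(mtcars, 3)

# `$` selects one column by name and returns it as a plain vector
mpg_dollar <- small_df$mpg
is.vector(mpg_dollar)               # TRUE

# `[[` selects one element either by name or by position
mpg_by_name <- small_df[["mpg"]]
mpg_by_pos  <- small_df[[1]]        # mpg is the first column of mtcars
identical(mpg_by_name, mpg_by_pos)  # TRUE

# `[` can return several columns and keeps the data frame structure
two_cols <- small_df[c("mpg", "cyl")]
class(two_cols)                     # "data.frame"
dim(two_cols)                       # 3 rows, 2 columns

# The same operators work on a named list (hypothetical example list)
info <- list(name = "islands", n = 48)
info$n          # the element itself
info[["name"]]  # the element itself
info["n"]       # still a list, of length 1
```

Note that the single-square-brackets result keeps the container type (a data frame or a list), whereas the other two operators hand back the element itself.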
When you use the single-square-brackets operator, you return multiple elements of your data. This means that you need a way of specifying exactly which elements you need.
In this section, you can try subsetting with the built-in dataset islands, a named numeric vector with 48 elements.

    > str(islands)
     Named num [1:48] 11506 5500 16988 2968 16 ...
     - attr(*, "names")= chr [1:48] "Africa" "Antarctica" "Asia" "Australia" ...

| Subset | Effect | Example |
|---|---|---|
| Blank | Returns all your data | islands[] |
| Positive numerical values | Extracts the elements at these locations | islands[c(8, 1, 1, 42)] |
| Negative numerical values | Extracts all but these elements; in other words, excludes them | islands[-(3:46)] |
| Logical values | A logical value of TRUE includes the element; FALSE excludes it | islands[islands < 20] |
| Text strings | Includes elements where the names match | islands[c("Madagascar", "Cuba")] |
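
The five kinds of subset in the table can be tried in one short session. The following is a sketch that reuses the table's own examples; the variable names are hypothetical and chosen only for readability.

```r
# islands is a built-in named numeric vector
# (areas in thousands of square miles)

all_islands <- islands[]                        # blank: returns everything
picked      <- islands[c(8, 1, 1, 42)]          # positive: these positions, repeats allowed
kept        <- islands[-(3:46)]                 # negative: everything except positions 3 to 46
small       <- islands[islands < 20]            # logical: keeps elements where the test is TRUE
by_name     <- islands[c("Madagascar", "Cuba")] # text: elements whose names match

length(all_islands)  # 48
length(picked)       # 4, with position 1 appearing twice
length(kept)         # 4, namely positions 1, 2, 47 and 48
names(by_name)       # "Madagascar" "Cuba"
all(small < 20)      # TRUE
```

The parentheses in -(3:46) matter: without them, -3:46 would be read as the sequence from -3 to 46, which mixes negative and positive indices and causes an error.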









