How to Extend Text Functionality with stringr in R
If you’ve worked at all with the text manipulation functions of R, you probably wonder why all these functions have such unmemorable names and seemingly diverse syntax. If so, you’re not alone.
To address this, Hadley Wickham wrote a package, available from CRAN, that simplifies and standardizes working with text in R. This package is called stringr, and you can install it by running install.packages("stringr") in the R console or by choosing Tools→Install Packages in RStudio.
Although you have to install a package only once, you have to load it with the library() function every time you start a new R session in which you plan to use the functions in that package.
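A minimal sketch of the install-once, load-every-session routine described above. It assumes a working internet connection and a configured CRAN mirror for the install step, which is commented out so that the script can be re-run safely:

```r
# Install once per machine or R library (uncomment on first use).
# install.packages("stringr")

# Load in every new R session before calling stringr functions.
library(stringr)

# Quick check that the package is attached and working.
n_chars <- str_length("stringr")
print(n_chars)
```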
Here are some of the advantages of using stringr rather than the standard R functions:
Function names and arguments are consistent and more descriptive. For example, all stringr functions have names starting with str_ (such as str_detect() and str_replace()).
stringr handles missing values (NA) and zero-length inputs more consistently.
stringr has a more consistent way of ensuring that input and output data are of the same type.
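The stringr equivalent for grepl() is str_detect(), which returns a logical vector; the equivalent for grep(), which returns the positions of matching elements, is str_which(); and the equivalent for gsub() is str_replace_all().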
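The following sketch puts a few base R calls side by side with their stringr counterparts. The vector fruit is made-up example data, and it includes an NA to illustrate the point about missing values. Note that the base functions take the pattern first, whereas stringr functions take the data first:

```r
library(stringr)

# Hypothetical example data, including a missing value.
fruit <- c("apple", "banana", NA, "cherry")

# Base R: the pattern comes first, the data second.
base_detect  <- grepl("an", fruit)       # logical vector; NA input gives FALSE
base_which   <- grep("an", fruit)        # integer positions of the matches
base_replace <- gsub("a", "A", fruit)    # replace every "a"

# stringr: the data comes first, the pattern second.
str_detected <- str_detect(fruit, "an")          # logical vector; NA stays NA
str_position <- str_which(fruit, "an")           # integer positions of the matches
str_replaced <- str_replace_all(fruit, "a", "A") # replace every "a"

print(base_detect)
print(str_detected)
print(str_replaced)
```

The third element shows the difference in missing-value handling: grepl() reports FALSE for the NA, whereas str_detect() propagates the NA to the result.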
As a starting point to explore stringr, you may find some of these functions useful:
str_detect(): Detects the presence or absence of a pattern in a string
str_extract(): Extracts the first piece of a string that matches a pattern
str_length(): Returns the length of a string (in characters)
str_locate(): Locates the position of the first occurrence of a pattern in a string
str_match(): Extracts the first match and its capture groups from a string, returned as a character matrix
str_replace(): Replaces the first occurrence of a matched pattern in a string
str_split(): Splits up a string into a variable number of pieces
str_sub(): Extracts substrings from a character vector
str_trim(): Trims white space from the start and end of a string
str_wrap(): Wraps strings into nicely formatted paragraphs
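As a rough illustration, the functions in this list can be tried out on a single string. The sentence stored in x is invented for the example, and the patterns are simple regular expressions chosen only for demonstration:

```r
library(stringr)

# Hypothetical example string with padding white space.
x <- "  The quick brown fox, born 2019-04-01, jumps.  "

trimmed <- str_trim(x)                      # remove leading/trailing white space
n       <- str_length(trimmed)              # number of characters
has_fox <- str_detect(trimmed, "fox")       # TRUE if the pattern occurs
year    <- str_extract(trimmed, "[0-9]{4}") # first run of four digits
pos     <- str_locate(trimmed, "quick")     # matrix with start and end columns

# Full match in column 1, capture groups (year, month) in columns 2 and 3.
parts   <- str_match(trimmed, "([0-9]{4})-([0-9]{2})")

first_o <- str_replace(trimmed, "o", "0")   # only the first "o" is replaced
pieces  <- str_split(trimmed, ", ")         # list with one character vector
start   <- str_sub(trimmed, 1, 3)           # characters 1 through 3
wrapped <- str_wrap(trimmed, width = 20)    # line breaks inserted at about 20 chars

print(pieces[[1]])
cat(wrapped, "\n")
```

Because str_split() can return a different number of pieces for each input string, it returns a list, so pieces[[1]] is used to reach the pieces of the first (and here only) string.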