How to Substitute Text in R

The sub() function (short for substitute) in R searches for a pattern in text and replaces it with replacement text. You use sub() to replace only the first occurrence of the pattern in each string, and you use its cousin gsub() to replace all occurrences. (The g in gsub() stands for global.)
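To make the difference between the two functions concrete, here is a minimal sketch. The sample sentence and the variable names are made up for illustration.

```r
# A sample string in which the pattern "fish" occurs twice
x <- "one fish, two fish"

# sub() replaces only the first match in the string
first_only <- sub("fish", "cat", x)
print(first_only)   # "one cat, two fish"

# gsub() replaces every match in the string
every_match <- gsub("fish", "cat", x)
print(every_match)  # "one cat, two cat"
```

When the pattern occurs at most once per string, the two functions give the same result.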

Suppose you have the sentence A wolf in cheap clothing, which is clearly a mistake. You can fix it with a gsub() substitution. The gsub() function takes three arguments: the pattern to find, the replacement text, and the text to modify:

> gsub("cheap", "sheep's", "A wolf in cheap clothing")
[1] "A wolf in sheep's clothing"

Another common type of problem that can be solved with text substitution is removing substrings. Removing substrings is the same as replacing the substring with empty text (that is, nothing at all).

Imagine a situation in which you have three file names in a vector: file_a.csv, file_b.csv, and file_c.csv. Your task is to extract the a, b, and c from those file names. You can do this in two steps: First, replace the pattern "file_" with nothing, and then replace the ".csv" with nothing. You’ll be left with your desired vector:

> x <- c("file_a.csv", "file_b.csv", "file_c.csv")
> y <- gsub("file_", "", x)
> y
[1] "a.csv" "b.csv" "c.csv"
> gsub(".csv", "", y, fixed = TRUE)  # fixed = TRUE matches ".csv" literally
[1] "a" "b" "c"
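
By default, sub() and gsub() treat the pattern as a regular expression, so an unescaped dot matches any single character, not only a literal period. The sketch below shows how that can remove more than intended, and two ways to avoid it: fixed = TRUE, or an escaped and anchored pattern. It also shows one possible way to do the whole extraction in a single step with a capture group. The file name my_csv_notes.csv and the variable names are hypothetical, chosen only to illustrate the pitfall.

```r
x <- c("file_a.csv", "file_b.csv", "file_c.csv")

# Hypothetical file name that contains "csv" twice
tricky <- "my_csv_notes.csv"

# As a regular expression, "." matches any character, so "_csv" is removed too
too_greedy <- gsub(".csv", "", tricky)
print(too_greedy)      # "my_notes"

# fixed = TRUE treats the pattern as plain text
literal <- gsub(".csv", "", tricky, fixed = TRUE)
print(literal)         # "my_csv_notes"

# Escaping the dot and anchoring with $ removes only a trailing ".csv"
anchored <- gsub("\\.csv$", "", tricky)
print(anchored)        # "my_csv_notes"

# One-step alternative: capture the part between "file_" and ".csv"
# and keep only that group (\\1) in the replacement
letters_only <- sub("^file_(.*)\\.csv$", "\\1", x)
print(letters_only)    # "a" "b" "c"
```

For the three file names in this example, all of these approaches give the same result; the difference only shows up when the text contains other characters followed by csv.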