How to Substitute Text in R

The sub() function (short for substitute) in R searches for a pattern in text and replaces it with replacement text. You use sub() to replace only the first occurrence of a pattern in each string, and you use its cousin gsub() to replace all occurrences. (The g in gsub() stands for global.)

Suppose you have the sentence A wolf in cheap clothing, which is clearly a mistake. You can fix it with a gsub() substitution. The gsub() function takes three main arguments: the pattern to find, the replacement text, and the text to modify:

> gsub("cheap", "sheep's", "A wolf in cheap clothing")
[1] "A wolf in sheep's clothing"
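The difference between sub() and gsub() only shows up when the pattern occurs more than once. The following is a minimal sketch using a made-up sentence (the variable names s, first_only, and all_matches are illustrative, not part of the original example):

```r
# Hypothetical sentence in which the pattern "cheap" occurs twice
s <- "A cheap wolf in cheap clothing"

# sub() replaces only the first match in the string
first_only <- sub("cheap", "sheep's", s)

# gsub() replaces every match in the string
all_matches <- gsub("cheap", "sheep's", s)

print(first_only)   # "A sheep's wolf in cheap clothing"
print(all_matches)  # "A sheep's wolf in sheep's clothing"
```

When the pattern occurs at most once per string, as in the wolf example, the two functions return the same result.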

Another common type of problem that can be solved with text substitution is removing substrings. Removing substrings is the same as replacing the substring with empty text (that is, nothing at all).

Imagine a situation in which you have three file names in a vector: file_a.csv, file_b.csv, and file_c.csv. Your task is to extract the a, b, and c from those file names. You can do this in two steps: First, replace the pattern "file_" with nothing, and then replace the ".csv" with nothing. You’ll be left with your desired vector:

> x <- c("file_a.csv", "file_b.csv", "file_c.csv")
> y <- gsub("file_", "", x)
> y
[1] "a.csv" "b.csv" "c.csv"
> # fixed = TRUE treats ".csv" as literal text rather than a regular expression
> gsub(".csv", "", y, fixed = TRUE)
[1] "a" "b" "c"
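One caveat: by default, sub() and gsub() interpret the pattern as a regular expression, so an unescaped dot matches any character, not only a literal period. The sketch below shows one possible way to do both removals in a single call with an anchored pattern, and illustrates the pitfall with a made-up file name (letters_only and tricky are hypothetical names introduced for this illustration):

```r
x <- c("file_a.csv", "file_b.csv", "file_c.csv")

# One step: remove the "file_" prefix and the ".csv" suffix together.
# ^ anchors the match at the start, $ at the end, and \\. is a literal dot.
letters_only <- gsub("^file_|\\.csv$", "", x)
print(letters_only)  # "a" "b" "c"

# Pitfall: an unescaped dot matches any character, so "_csv" also matches ".csv"
tricky <- "my_csv_data.csv"  # hypothetical file name
as_regex   <- gsub(".csv", "", tricky)                # "my_data"
as_literal <- gsub(".csv", "", tricky, fixed = TRUE)  # "my_csv_data"
print(as_regex)
print(as_literal)
```

Escaping the dot or passing fixed = TRUE makes no difference for the three file names above, but it avoids surprises when other characters happen to precede "csv".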