How to Create Character Vectors for Text Data in R

By Andrie de Vries, Joris Meys

Text in R is represented by character vectors. A character vector is — you guessed it! — a vector consisting of characters.

In computer programming, text is often referred to as a string. Here, the word text refers to a single element of a character vector, but be aware that the R Help files sometimes say strings and sometimes say text. They mean the same thing.


Take a look at how R uses character vectors to represent text. You assign some text to a character vector and then extract subsets of that data. You also get familiar with the powerful concept of named vectors: vectors in which each element has a name. Names are useful because you can then refer to elements by name as well as by position.
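As a quick preview of those ideas, the following sketch builds a small character vector, extracts elements by position, and then adds names so that an element can also be retrieved by name. The vector fruit, its contents, and its names are made up for illustration.

```r
# Hypothetical example data: a character vector with three elements
fruit <- c("apple", "banana", "cherry")

# Extract the first two elements by position
first_two <- fruit[1:2]

# Attach a name to each element, then extract one element by name
names(fruit) <- c("a", "b", "c")
by_name <- fruit["b"]

print(first_two)
print(by_name)
```

Extracting by name returns the element together with its name, so by_name is itself a named character vector of length 1.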

Assign a value to a character vector

You assign a value to a character vector by using the assignment operator (<-), the same way you do for all other variables. To test whether a variable is of class character, use the is.character() function:

> x <- "Hello world!"
> is.character(x)
[1] TRUE

Notice that x is a character vector of length 1. To find out how many characters the text contains, use nchar():

> length(x)
[1] 1
> nchar(x)
[1] 12

These results tell you that x has length 1 and that the single element in x has 12 characters.
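The difference between the number of elements and the number of characters is easy to check for yourself. Here is a minimal sketch that uses a made-up variable name, greeting, and adds an empty string as an extra edge case:

```r
greeting <- "Hello world!"

is.character(greeting)  # TRUE: greeting holds text
is.character(42)        # FALSE: 42 is numeric, not character

n_elements <- length(greeting)  # number of elements in the vector
n_chars <- nchar(greeting)      # number of characters in that element

# An empty string is still one element; it just has zero characters
empty_length <- length("")
empty_chars <- nchar("")
```

In short, length() counts elements and nchar() counts the characters inside each element.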

Create a character vector with more than one element

To create a character vector with more than one element, use the combine function, c():

> x <- c("Hello", "world!")
> length(x)
[1] 2
> nchar(x)
[1] 5 6

Notice that this time, R tells you that your vector has length 2 and that the first element has five characters and the second element has six characters.
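Because nchar() returns one count per element, you can pass its result to other functions. The sketch below uses hypothetical variable names (words, mixed) to total the characters across elements. It also shows what happens when you combine text and a number with c(): R converts the number to text, so the result is still a character vector.

```r
words <- c("Hello", "world!")

chars_per_word <- nchar(words)      # one count for each element
total_chars <- sum(chars_per_word)  # characters across all elements

# Combining text and a number produces a character vector
mixed <- c("a", 1)
mixed_is_character <- is.character(mixed)

print(chars_per_word)
print(total_chars)
print(mixed)
```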