
How to Create a Data Frame from Scratch in R

Converting a matrix to a data frame in R can’t give you a data frame with different types of values, because a matrix holds only one type. If you combine both numeric and character data in a matrix, for example, everything is converted to character.
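A quick way to see this coercion is to bind a numeric vector and a character vector into a matrix and inspect the result. This is only a minimal sketch; the names ages, first.names, and m are made up for illustration and don't appear elsewhere in this article.

```r
# Hypothetical example vectors (not part of the original example)
ages <- c(31, 45)
first.names <- c("Ann", "Bob")

# cbind() builds a matrix; because a matrix can hold only one type,
# the numbers are coerced to character strings
m <- cbind(ages, first.names)

print(m)
print(is.character(m))        # TRUE: the whole matrix is character
print(class(m[, "ages"]))     # "character", no longer numeric
```

Because the numbers are now stored as text, you would have to convert them back with as.numeric() before doing any arithmetic on them.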

You can construct a data frame from scratch, though, using the data.frame() function.

Make a data frame from vectors in R

So, let’s make a little data frame with the names, salaries, and starting dates of a few imaginary co-workers. First, you create three vectors that contain the necessary information like this:

> employee <- c('John Doe','Peter Gynn','Jolie Hope')
> salary <- c(21000, 23400, 26800)
> startdate <- as.Date(c('2010-11-1','2008-3-25','2007-3-14'))

Now you have three different vectors in your workspace:

  • A character vector called employee, containing the names

  • A numeric vector called salary, containing the yearly salaries

  • A date vector called startdate, containing the dates on which the contracts started
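If you want to confirm the types listed above, one way is to ask R for the class of each vector. The sketch below repeats the three assignments so that it runs on its own; the use of class() and format() here is just one possible check.

```r
employee <- c('John Doe','Peter Gynn','Jolie Hope')
salary <- c(21000, 23400, 26800)
startdate <- as.Date(c('2010-11-1','2008-3-25','2007-3-14'))

# class() reports the type of each vector
print(class(employee))    # "character"
print(class(salary))      # "numeric"
print(class(startdate))   # "Date"

# as.Date() accepts months and days written without a leading zero
# and stores them as proper dates in year-month-day form
print(format(startdate))
```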

Next, you combine the three vectors into a data frame using the following code:

> employ.data <- data.frame(employee, salary, startdate)

The result of this is a data frame, employ.data, with the following structure:

> str(employ.data)
'data.frame': 3 obs. of 3 variables:
 $ employee : Factor w/ 3 levels "John Doe","Jolie Hope",..: 1 3 2
 $ salary  : num 21000 23400 26800
 $ startdate: Date, format: "2010-11-01" "2008-03-25" ...

To combine a number of vectors into a data frame, you simply add all vectors as arguments to the data.frame() function, separated by commas. R creates a data frame whose variables have the same names as the vectors you used.
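The naming behavior can be checked with names(). The sketch below also shows that you can supply your own variable names by passing the vectors as name = value arguments; the names name, pay, and started are hypothetical and chosen only for this illustration.

```r
employee <- c('John Doe','Peter Gynn','Jolie Hope')
salary <- c(21000, 23400, 26800)
startdate <- as.Date(c('2010-11-1','2008-3-25','2007-3-14'))

# Variables take the names of the vectors passed in
employ.data <- data.frame(employee, salary, startdate)
print(names(employ.data))

# Hypothetical alternative: choose the variable names yourself
renamed <- data.frame(name = employee, pay = salary, started = startdate)
print(names(renamed))
```

All vectors should have the same length, so that each one fills one column with one value per row.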

Keep characters as characters in R

You may have noticed something odd when looking at the structure of employ.data. Whereas the vector employee is a character vector, R made the variable employee in the data frame a factor.

R versions before 4.0.0 do this by default (since R 4.0.0, character vectors stay character unless you ask otherwise). The data.frame() function has an argument that controls this behavior: stringsAsFactors. In the employ.data example, you can prevent the transformation of the employee variable to a factor by using the following code:

> employ.data <- data.frame(employee, salary, startdate, stringsAsFactors=FALSE)

If you look at the structure of the data frame now, you see that the variable employee is a character vector, as shown in the following output:

> str(employ.data)
'data.frame': 3 obs. of 3 variables:
 $ employee : chr "John Doe" "Peter Gynn" "Jolie Hope"
 $ salary  : num 21000 23400 26800
 $ startdate: Date, format: "2010-11-01" "2008-03-25" ...

In R versions before 4.0.0, character vectors are transformed to factors by default when you create a data frame from character vectors or convert a character matrix to a data frame; from R 4.0.0 on, the default is stringsAsFactors=FALSE. The conversion can be a nasty cause of errors in your code if you’re not aware of it. If you make it a habit to always specify the stringsAsFactors argument, your code behaves the same way in every version and you avoid a lot of frustration.
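To make the effect of the argument concrete, here is a small sketch that builds the data frame both ways and compares the resulting column classes, and then does the same conversion for a character matrix. The names as.chr, as.fct, char.matrix, and converted are made up for this illustration.

```r
employee <- c('John Doe','Peter Gynn','Jolie Hope')
salary <- c(21000, 23400, 26800)
startdate <- as.Date(c('2010-11-1','2008-3-25','2007-3-14'))

# Setting the argument explicitly gives the same result in any R version
as.chr <- data.frame(employee, salary, startdate, stringsAsFactors = FALSE)
as.fct <- data.frame(employee, salary, startdate, stringsAsFactors = TRUE)

print(sapply(as.chr, class))   # employee is "character"
print(sapply(as.fct, class))   # employee is "factor"

# The same argument applies when converting a character matrix
char.matrix <- matrix(c("a", "b", "c", "d"), ncol = 2)
converted <- as.data.frame(char.matrix, stringsAsFactors = FALSE)
print(sapply(converted, class))
```

Only the character vector is affected; the numeric and date columns keep their types in both data frames.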
