10 Tips on Working with Packages in R
One of the most attractive features of R is the large collection of third-party packages (collections of functions in a well-defined format) available for it. To get the most out of R, you need to understand where to find additional packages, how to download and install them, and how to use them.
Poking around the nooks and crannies of CRAN
The Comprehensive R Archive Network (CRAN) is a network of web servers around the world where you can find the R source code, R manuals and documentation, and contributed packages.
CRAN isn’t a single website; it’s a collection of web servers, each with an identical copy of all the information on CRAN, which is why each server is called a mirror. The idea is that you choose the mirror nearest to you, which reduces international or long-distance Internet traffic. You can find a list of CRAN mirrors on the CRAN website.
Regardless of which R interface you use, you can permanently save your preferred CRAN mirror (and other settings) in a special file called .Rprofile, located in the user’s home directory or the R startup directory. For example, to set the Imperial College, UK mirror as your default CRAN mirror, include this line in your .Rprofile:
options(repos = c(CRAN = "http://cran.ma.imperial.ac.uk/"))
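Before committing the setting to .Rprofile, you can try it out in a running session. The following is a minimal sketch; the variable names old and current_mirror are chosen only for illustration.

```r
# Set the CRAN mirror for this session only.
# options() invisibly returns the previous values, so they can be restored.
old <- options(repos = c(CRAN = "http://cran.ma.imperial.ac.uk/"))

# Check which mirror install.packages() would now use
current_mirror <- getOption("repos")[["CRAN"]]
print(current_mirror)

# Put the previous setting back
options(old)
```

Because the change is undone at the end, running this sketch leaves your session as it was.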
Finding interesting packages
At the beginning of 2015, there were more than 6,000 packages on CRAN. That means finding a package for your task at hand may seem difficult.
Fortunately, a handful of volunteer experts have collated some of the most widely used packages into curated lists. These lists are called CRAN task views. You can find task views for empirical finance, statistical genetics, machine learning, statistical learning, and many other fascinating topics.
Each package has its own web page on CRAN. On the web page for a package, you find a summary, information about the packages that are used, a link to the package website (if such a site exists), and other useful information.
Installing packages
To install a package, use the install.packages() function. This simple command downloads the package from a specified repository (by default, CRAN) and installs it on your machine:
> install.packages("fortunes")
Note that the argument to install.packages() is a character string. In other words, remember the quotes around the package name!
In RGui, as well as in RStudio, you find a menu command to do the same thing:
In RGui, choose Packages→Install package(s).
In RStudio, choose Tools→Install Packages. . . .
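In a script, you may prefer to install only what is missing rather than reinstalling every time. Here is one possible sketch; the helper missing_packages() and the vector wanted are hypothetical names introduced for this example, and the install step is guarded so that it runs only in an interactive session.

```r
# Return the subset of `pkgs` that is not installed in any library
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

wanted <- c("stats", "fortunes")
to_install <- missing_packages(wanted)
print(to_install)

# Contact the repository only when something is actually missing
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

Note that install.packages() accepts a character vector, so several packages can be installed in one call.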
Loading packages
To load a package, you use the library() or require() function. Both attach the package to your session; they differ in their return value and in what happens when the package is missing:
library(): Invisibly returns a character vector with the names of the attached packages, or stops with an error if the package is not on your machine.
require(): Returns TRUE if the package was successfully attached and FALSE if not.
The R documentation suggests that library() is the preferred way of loading packages in scripts, while require() is preferred inside functions and packages.
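The difference in behavior is easiest to see with a package that is not installed. The sketch below assumes that no package called noSuchPackage12345 exists on your machine; that name is made up for the example.

```r
pkg <- "noSuchPackage12345"  # hypothetical name, assumed not to be installed

# require() reports failure through its return value (plus a warning)
attached <- suppressWarnings(
  require(pkg, character.only = TRUE, quietly = TRUE)
)
print(attached)

# library() signals an error instead, which tryCatch() can intercept
result <- tryCatch(
  {
    library(pkg, character.only = TRUE)
    "attached"
  },
  error = function(e) "error"
)
print(result)
```

The character.only = TRUE argument tells both functions that the package name is stored in a variable rather than typed literally.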
So, after installing the package fortunes, you load it like this:
> library("fortunes")
Note that you don’t have to quote the name of the package in the argument of library(), but it is good practice to always quote the package name.
Although it is possible to unload a package within an R session by using the detach() function, in practice it usually is much easier to simply restart your R session.
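If you do want to try detach(), the following sketch shows the idea. It uses stats4, a package that ships with R, purely as a convenient example and assumes stats4 is not already attached in your session.

```r
library("stats4")                          # ships with R
was_attached <- "package:stats4" %in% search()

# Remove the package from the search path and try to unload its namespace
detach("package:stats4", unload = TRUE)
still_attached <- "package:stats4" %in% search()

print(c(before = was_attached, after = still_attached))
```

Unloading can fail when other loaded packages depend on the one you detach, which is one reason restarting the session is usually simpler.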
Reading the package manual and vignette
The package manual is a collection of all functions and other package documentation. You can access the manual in two ways. The first way is to use the help argument to the library() function:
> library(help = "fortunes")
The second way is to find the manual on the package website. If you point your browser window to the CRAN page for the fortunes package, you’ll notice a link to the manual toward the bottom of the page.
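The two approaches show different things: library(help = "fortunes") displays the package description and an index of its help topics inside R, whereas the link on CRAN opens the complete reference manual as a PDF document.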
Some package authors also write one or more vignettes, documents that illustrate how to use the package. A vignette typically shows some examples of how to use the functions and how to get started. The key thing is that a vignette illustrates how to use the package with R code and output, just like this book.
To read the vignette for the fortunes package, try the following:
> vignette("fortunes")
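If you don’t know what a package’s vignettes are called, you can ask R to list them first. This sketch uses grid, which ships with R, only as a convenient example; whether its vignettes are present can depend on how R was installed.

```r
# List the vignettes of an installed package
available <- vignette(package = "grid")
vignette_names <- available$results[, "Item"]
print(vignette_names)

# Open the first one by name; this launches a viewer, so do it interactively
if (interactive() && length(vignette_names) > 0) {
  print(vignette(vignette_names[1], package = "grid"))
}
```

Calling browseVignettes() without arguments opens a browser page that lists the vignettes of all installed packages.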
Updating packages
To ensure that you have the latest version of a package, use update.packages():
> update.packages()
This function connects to CRAN (by default) and checks whether there are updates for all the packages that you’ve installed on your machine. If there are, it asks you whether you want to update each package, and then downloads the code and installs the new version.
If you call update.packages(ask = FALSE), R updates all out-of-date packages in the current library location without prompting you. Also, you can tell update.packages() to look at a repository other than CRAN by changing the repos argument. If the repos argument is a file:// URL that points to a repository on your machine (or network), R installs the packages from that location.
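To see what would change before updating anything, you can query the repository with old.packages(). The wrapper preview_updates() below is a hypothetical helper written for this example, not part of R; it needs network access, so the call is guarded to run only interactively.

```r
# Summarise pending updates without installing anything.
# old.packages() returns NULL when every package is current.
preview_updates <- function(repos = getOption("repos")) {
  old <- old.packages(repos = repos)
  if (is.null(old)) {
    return(data.frame(Package = character(0),
                      Installed = character(0),
                      ReposVer = character(0)))
  }
  out <- as.data.frame(
    old[, c("Package", "Installed", "ReposVer"), drop = FALSE],
    stringsAsFactors = FALSE
  )
  rownames(out) <- NULL
  out
}

if (interactive()) {
  print(preview_updates())
  # update.packages(ask = FALSE)  # uncomment to install all pending updates
}
```

The Installed and ReposVer columns show the version on your machine next to the version offered by the repository.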
Both RGui and RStudio have menu options that allow you to update the packages:
In RGui, choose Packages→Update package(s).
In RStudio, choose Tools→Check for Package Updates. . . .
Both applications allow you to graphically select packages to update.
Forging ahead with R-Forge
Although not universally true, packages on CRAN tend to have some minimum level of maturity.
So, where do packages live that are in the development cycle? Quite often, they live at R-Forge. R-Forge gives developers a platform to develop and test their R packages. For example, R-Forge offers
A build and check system on Windows and Linux operating systems (Mac OSX is not supported)
Backup and administration
To install a project from R-Forge, you also use the install.packages() function, but you have to specify the repos argument. For example, to install the development version of the package data.table, try the following:
> install.packages("data.table", repos = "http://R-Forge.R-project.org")
Although R-Forge doesn’t have a build and check system for Mac OSX specifically, Mac users can install and use packages from R-Forge by installing the source package. You find more information in the FAQ for Mac.
Getting packages from GitHub
In recent years, many developers have started to use GitHub as a code development site. Although GitHub does not offer any of the R-specific features of CRAN or R-Forge, sometimes code is easier to share by using GitHub. So you may occasionally get instructions to install a package directly from GitHub.
On the Linux and Mac OSX operating systems, installing packages from GitHub is comparatively easy. On Windows, however, you must first install Rtools (a set of compilers and other tools to build packages from source). To install Rtools on a Windows machine, carefully follow the instructions on the Rtools page on CRAN.
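The text above doesn’t name a tool for the installation itself. One common route is the install_github() function from the remotes package (also available through devtools). The sketch below assumes that route; the wrapper install_from_github() is a hypothetical helper, and the repository tidyverse/ggplot2 is used only as an illustration.

```r
# Install a package from GitHub via the remotes package.
# `repo` has the form "owner/repository".
install_from_github <- function(repo) {
  if (!requireNamespace("remotes", quietly = TRUE)) {
    install.packages("remotes")
  }
  remotes::install_github(repo)
}

# Needs network access (and Rtools on Windows), so run it interactively
if (interactive()) {
  install_from_github("tidyverse/ggplot2")
}
```

Because the package is built from source, any compiled code in it requires the build tools described above.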
Conducting installations from Bioconductor
Bioconductor is a repository of R packages and other software that specializes in the analysis of genomic and related data.
Bioconductor has its own set of rules for developers and its own installation procedure. For example, to install a package from Bioconductor, you first have to source a script from its server:
> source("http://bioconductor.org/biocLite.R")
Then you can use the biocLite() function to install packages from Bioconductor. If you don’t supply an argument, you just install the necessary base packages from the Bioconductor project.
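The biocLite() route has since been retired; current Bioconductor documentation describes the BiocManager package from CRAN instead. A minimal sketch of that newer route follows. The wrapper install_from_bioconductor() is a hypothetical helper, and GenomicRanges is picked only as an example package.

```r
# Install Bioconductor packages through BiocManager (available on CRAN)
install_from_bioconductor <- function(pkgs) {
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  BiocManager::install(pkgs)
}

# Needs network access, so run it interactively
if (interactive()) {
  install_from_bioconductor("GenomicRanges")
}
```

Which approach works for you depends on your R version, so check the Bioconductor installation page if in doubt.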
Bioconductor makes extensive use of object-oriented programming with S4 classes.
Reading the R manual
The “R Installation and Administration” manual is a comprehensive guide to the installation and administration of R. Chapter 6 of this manual contains all the information you need about working with packages.