In some cases, you don’t have real values to calculate with. In fact, most real-life data sets have at least a few missing values. Also, some calculations have infinity as a result (such as dividing a nonzero number by zero) or can’t be carried out at all (such as taking the logarithm of a negative value). Luckily, R can deal with all these situations.

Using infinity

To start exploring infinity in R, see what happens when you try to divide by zero:

> 2 / 0
[1] Inf

R correctly tells you the result is Inf, or infinity. Negative infinity is shown as -Inf. You can use Inf just as you use a real number in calculations:

> 4 - Inf
[1] -Inf

To check whether a value is finite, use the functions is.finite() and is.infinite(). The first function returns TRUE if the number is finite; the second one returns TRUE if the number is infinite.
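
As a quick illustration of these two functions, the following sketch applies them to a small vector. The name values is made up for this example.

```r
# Hypothetical example vector mixing finite and infinite values
values <- c(-Inf, -1, 0, 2 / 0)

finite_flags   <- is.finite(values)    # TRUE only for -1 and 0
infinite_flags <- is.infinite(values)  # TRUE only for -Inf and 2 / 0

print(finite_flags)
print(infinite_flags)
```

Both functions are vectorized, so you get one TRUE or FALSE per element.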

R considers everything larger than the largest number a computer can hold to be infinity; on most machines, that’s approximately 1.8 × 10^308. This definition of infinity can lead to unexpected results, as shown in the following example:

> is.finite(10^(305:310))
[1]  TRUE  TRUE  TRUE  TRUE FALSE FALSE

What does this line of code mean now? See whether you understand the nesting and vectorization in this example. If you break up the line starting from the inner parentheses, it becomes comprehensible:

  • You know already that 305:310 gives you a vector, containing the integers from 305 to 310.

  • All operators are vectorized, so 10^(305:310) gives you a vector with the results of 10 to the power of 305, 306, 307, 308, 309, and 310.

  • That vector is given as an argument to is.finite(). This function tells you that the two last results — 10^309 and 10^310 — are infinite for R.
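
The three steps above can also be run one at a time. This is only a sketch: the variable names exponents, powers, and flags are invented for illustration, and .Machine$double.xmax is added to show the largest finite number on your machine.

```r
exponents <- 305:310          # integer vector 305, 306, ..., 310
powers    <- 10^exponents     # vectorized: one power of 10 per exponent
flags     <- is.finite(powers)

# Label each result with its exponent to see where R switches to Inf
names(flags) <- paste0("10^", exponents)
print(flags)

# The largest finite double on this machine, for comparison
print(.Machine$double.xmax)
```

Splitting the line this way lets you inspect each intermediate vector before it is passed on to the next function.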

Dealing with undefined outcomes

Your math teacher probably explained that if you divide any real number by infinity, you get zero. But what if you divide infinity by infinity?

> Inf / Inf
[1] NaN

Well, R tells you that the outcome is NaN. That result simply means Not a Number. This is R’s way of telling you that the outcome of that calculation is not defined.

The funny thing is that R actually considers NaN to be numeric, so you can use NaN in calculations. The outcome of those calculations is NaN as well, though, as you see here:

> NaN + 4
[1] NaN

You can test whether a calculation results in NaN by using the is.nan() function. Note that both is.finite() and is.infinite() return FALSE when you’re testing on a NaN value.
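
The following sketch tries these tests on a few undefined outcomes. The names undefined and neg_log are invented for this example, and suppressWarnings() is used only to hide the warning that log() gives for a negative value.

```r
# Calculations with no defined outcome give NaN
undefined <- c(Inf / Inf, 0 / 0, Inf - Inf)

print(is.nan(undefined))       # TRUE for every element
print(is.finite(undefined))    # FALSE: NaN is not finite ...
print(is.infinite(undefined))  # FALSE: ... and not infinite either

# log() of a negative value is NaN too; R also raises a warning,
# which is silenced here to keep the output clean
neg_log <- suppressWarnings(log(-1))
print(is.nan(neg_log))
```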

Dealing with missing values

One of the most common problems in statistics is incomplete data sets. To deal with missing values, R uses the reserved keyword NA, which stands for Not Available. NA is a valid value in its own right, so you can assign it to a variable:

> x <- NA

You have to take into account, however, that calculations with a value of NA also generally return NA as a result:

> x + 4
[1] NA
> log(x)
[1] NA
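
The word "generally" matters here. As a small sketch (not an exhaustive list), the lines below show that R returns NA when the answer depends on the missing value, but can still answer in the few cases where it doesn't:

```r
x <- NA

# Most operations propagate the missing value
print(x * 2)       # NA
print(x > 1)       # NA

# A few results do not depend on the unknown value, so R can answer
print(x & FALSE)   # FALSE, whatever x might have been
print(x | TRUE)    # TRUE
print(x^0)         # 1
```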

If you want to test whether a value is NA, you can use the is.na() function, as follows:

> is.na(x)
[1] TRUE

Note that is.na() also returns TRUE if the value is NaN. The functions is.finite(), is.infinite(), and is.nan() return FALSE for NA values.
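
To see the difference between is.na() and is.nan() side by side, here is a minimal sketch on a made-up vector. The names measurements and n_missing are hypothetical.

```r
# Hypothetical measurements with one missing and one undefined value
measurements <- c(3.2, NA, 0 / 0, 5.1)

print(is.na(measurements))    # TRUE for both NA and NaN
print(is.nan(measurements))   # TRUE for NaN only

# TRUE counts as 1, so sum() gives the number of flagged values
n_missing <- sum(is.na(measurements))
print(n_missing)
```

Counting with sum(is.na()) is a common first check on a data set before you start calculating.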

Calculating infinite, undefined, and missing values

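The following table provides an overview of results from the functions described above. You are unlikely to use any of these except for is.na(), which you may use quite a lot!

Function        Inf     -Inf    NaN     NA
is.finite()     FALSE   FALSE   FALSE   FALSE
is.infinite()   TRUE    TRUE    FALSE   FALSE
is.nan()        FALSE   FALSE   TRUE    FALSE
is.na()         FALSE   FALSE   TRUE    TRUE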
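
An overview like this can also be computed in R itself, which is a handy way to check the results on your own machine. The sketch below is one possible way to do it; the names values, checks, and overview are invented for this example.

```r
# The four special values, in the same order as the table columns
values <- c(Inf, -Inf, NaN, NA)

# The four test functions, stored in a named list
checks <- list(is.finite = is.finite, is.infinite = is.infinite,
               is.nan = is.nan, is.na = is.na)

# Apply every function to every value; t() puts functions in rows
overview <- t(sapply(checks, function(check) unname(check(values))))
colnames(overview) <- c("Inf", "-Inf", "NaN", "NA")

print(overview)
```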

About This Article

About the book authors:

Andrie de Vries is a leading R expert and Business Services Director for Revolution Analytics. With over 20 years of experience, he provides consulting and training services in the use of R. Joris Meys is a statistician, R programmer and R lecturer with the faculty of Bio-Engineering at the University of Ghent.
