By Andrie de Vries, Joris Meys

Every object you create at the top level of an R session ends up in one environment, which is called the global environment. The global environment is the universe of the R user where everything happens.

R gurus will tell you that this “universe” is actually contained in another “universe” and that one in yet another, and so on — but that “outer space” is a hostile environment suited only to daring coders without fear of breaking things. So, there’s no need to go there now.

Yet functions work with objects that you never created in the global environment. Inside a function, you use arguments such as x, mult, and FUN as if they’re objects, and you may create an object such as percent that you can’t find in the global environment after the function has run. So, what’s going on?
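A minimal sketch can make that observation concrete. The function add_percent_sketch() below is a hypothetical stand-in, not the authors' code; only the names x, mult, and percent are borrowed from the text. It assumes a fresh session with no global object called percent.

```r
# Hypothetical stand-in for a function that creates a local object
add_percent_sketch <- function(x, mult = 100) {
  percent <- round(x * mult, digits = 1)  # 'percent' exists only during the call
  paste0(percent, "%")
}

result <- add_percent_sketch(c(0.25, 0.5))

# The returned value is available after the call ...
print(result)

# ... but the local object 'percent' is not in the global environment
print(exists("percent", envir = globalenv(), inherits = FALSE))
```

Only what the function returns survives the call; everything else it created along the way is gone.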

Creating a test case

Let’s find out through a small example. First, create an object x and a small test() function like this:

x <- 1:5
test <- function(x){
  cat("This is x:", x, "\n")
  rm(x)
  cat("This is x after removing it:", x, "\n")
}

The test() function doesn’t do much. It takes an argument x, prints it to the console, removes it, and tries to print it again. You may think this function will fail, because x disappears after the line rm(x). But no, if you try this function it works just fine, as shown in the following example:

> test(5:1)
This is x: 5 4 3 2 1
This is x after removing it: 1 2 3 4 5

Even after removing x, R still can find another x that it can print. If you look a bit more closely, you see that the x printed in the second line is actually not the one you gave as an argument, but the x you created before in the global environment. How come?

Searching the path

If you use a function, the function first creates a temporary local environment. This local environment is nested within the global environment, which means that, from that local environment, you also can access any object from the global environment. As soon as the function ends, the local environment is destroyed together with all objects in it.

To be completely correct, the local environment is nested within the environment in which the function was defined, called the parent (or enclosing) environment, not within the environment the function happens to be called from. Because you defined test() in the global environment, either through a script or by using the command line, this parent environment happens to be the global environment.
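You can check this nesting yourself. The sketch below uses a hypothetical function show_envs() and assumes you run it at the top level of an R session, so that it is both defined in and called from the global environment.

```r
# Hypothetical helper that inspects the environment created for its own call
show_envs <- function() {
  local_env <- environment()  # the temporary local environment of this call
  list(
    local_is_global  = identical(local_env, globalenv()),
    parent_is_global = identical(parent.env(local_env), globalenv())
  )
}

res <- show_envs()
print(res)
```

Under that assumption, local_is_global should be FALSE and parent_is_global should be TRUE: the call gets its own environment, and that environment sits directly inside the global one.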

You can see a schematic illustration of how the test() function works below. The big rectangle represents the global environment, and the small rectangle represents the local environment of the test function. In the global environment, you assign the value 1:5 to the object x. In the function call, however, you assign the value 5:1 to the argument x. This argument becomes an object x in the local environment.

How R looks through global and local environments.


If R sees any object name — in this case, x — mentioned in any code in the function, it first searches the local environment. Because it finds an object x there, it uses that one for the first cat() statement. In the next line, R removes that object x. So, when R reaches the third line, it can’t find an object x in the local environment anymore. No problem.

R moves up the chain of enclosing environments and checks whether it can find an object named x in the global environment. Because it can find an x there, it uses that one in the second cat() statement.
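The same lookup can be made visible with exists(), whose inherits argument controls whether R looks only in the given environment or also continues into the enclosing ones. This is a sketch with a hypothetical function lookup_demo(); it is not part of the original example.

```r
x <- 1:5

# Hypothetical helper that reports where an object named x can be found
lookup_demo <- function(x) {
  here <- environment()  # the local environment of this call

  before <- exists("x", envir = here, inherits = FALSE)       # local x present
  rm(x)                                                       # remove the local x
  after_local <- exists("x", envir = here, inherits = FALSE)  # local x is gone
  after_chain <- exists("x", envir = here, inherits = TRUE)   # found further up

  c(before = before, after_local = after_local, after_chain = after_chain)
}

res <- lookup_demo(5:1)
print(res)  # before: TRUE, after_local: FALSE, after_chain: TRUE
```

After rm(x), the name x no longer exists locally, yet a search that is allowed to continue outward still finds the x defined outside the function.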

If you use rm() inside a function, rm() will, by default, delete only objects within the local environment of that function. This way, you can avoid running out of memory when you write functions that have to work on huge datasets. You can immediately remove big temporary objects instead of waiting for the function to do so at the end.
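As a rough illustration, the hypothetical function summarise_sketch() below builds a large temporary vector, removes it as soon as it is no longer needed, and leaves a global object of the same name untouched. The names and the vector size are assumptions chosen for the example.

```r
big <- "global value"  # a global object that shares its name with a local one

# Hypothetical function working with a large temporary object
summarise_sketch <- function(n) {
  big <- runif(n)      # large temporary vector, local to this call
  result <- mean(big)
  rm(big)              # removes only the local 'big'
  # ... further work that no longer needs 'big' could go here ...
  result
}

m <- summarise_sketch(1e6)
print(m)
print(big)  # the global 'big' is still there
```

Note that rm() only removes the binding; R's garbage collector reclaims the memory itself at some later point.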