By Stephen R. Davis

The heap is an amorphous block of memory that your C++ program can access as necessary. Learn why it exists and how to use it.

Just as it is possible to pass a pointer to a function, it is possible for a function to return a pointer. A function that returns the address of a double is declared as follows:

double* fn(void);

However, you must be very careful when returning a pointer. To understand the dangers, you must know something about variable scope.

Limited scope in C++

Scope is the range over which a variable is defined. Consider the following code snippet:

// the following variable is accessible to
// all functions and defined as long as the
// program is running (global scope)
int intGlobal;
// the following variable intChild is accessible
// only to the function and is defined only
// as long as C++ is executing child() or a
// function which child() calls (function scope)
void child(void)
{
    int intChild;
}
// the following variable intParent has function
// scope
void parent(void)
{
    int intParent = 0;
    child();
    int intLater = 0;
    intParent = intLater;
}
int main(int nArgs, char* pArgs[])
{
    parent();
}

This program fragment starts with the declaration of a variable intGlobal. This variable exists from the time the program begins executing until it terminates. You say that intGlobal “has program scope” (also called global scope). You also say that the variable “goes into scope” even before the function main() is called.

The function main() immediately invokes parent(). The first thing that the processor sees in parent() is the declaration of intParent. At that point, intParent goes into scope — that is, intParent is defined and available for the remainder of the function parent().

The second statement in parent() is the call to child(). Once again, the function child() declares a local variable, this time intChild. The scope of the variable intChild is limited to the function child(). Technically, intParent is not defined within the scope of child() because child() doesn’t have access to intParent; however, the variable intParent continues to exist while child() is executing.

When child() exits, the variable intChild goes out of scope. Not only is intChild no longer accessible, it no longer exists. (The memory occupied by intChild is returned to the general pool to be used for other things.)

As parent() continues executing, the variable intLater goes into scope at the declaration. At the point that parent() returns to main(), both intParent and intLater go out of scope.

Because intGlobal is declared globally in this example, it is available to all three functions and remains available for the life of the program.

Examining the scope problem in C++

The following code segment compiles without error but doesn’t work (don’t you just hate that?):

double* child(void)
{
    double dLocalVariable;
    return &dLocalVariable;
}
void parent(void)
{
    double* pdLocal;
    pdLocal  = child();
    *pdLocal = 1.0;
}

The problem with this function is that dLocalVariable is defined only within the scope of the function child(). Thus, by the time the memory address of dLocalVariable is returned from child(), it refers to a variable that no longer exists. The memory that dLocalVariable formerly occupied is probably being used for something else.

This error is very common because it can crop up in a number of ways. Unfortunately, this error does not cause the program to stop immediately. In fact, the program may work fine most of the time; that is, it continues to work as long as the memory formerly occupied by dLocalVariable is not reused immediately. Such intermittent problems are the most difficult ones to solve.

Providing a solution using the heap in C++

The scope problem originated because C++ took back the locally defined memory before the programmer was ready. What is needed is a block of memory controlled by the programmer. She can allocate the memory and put it back when she wants to — not because C++ thinks it’s a good idea. Such a block of memory is called the heap.

Heap memory is allocated using the new operator followed by the type of object to allocate. The new operator breaks off a chunk of heap memory big enough to hold the specified type of object and returns its address. For example, the following allocates a double from the heap:

double* child(void)
{
    double* pdLocalVariable = new double;
    return pdLocalVariable;
}

This function now works properly. Although the variable pdLocalVariable goes out of scope when the function child() returns, the memory to which pdLocalVariable refers does not. A memory location returned by new does not go out of scope until it is explicitly returned to the heap using the keyword delete, which is specifically designed for that purpose:

void parent(void)
{
    // child() returns the address of a block
    // of heap memory
    double* pdMyDouble = child();
    // store a value there
    *pdMyDouble = 1.1;
    // ...
    // now return the memory to the heap
    delete pdMyDouble;
    pdMyDouble = 0;
    // ...
}

Here parent() uses the pointer returned by child() to store a double value. When parent() is finished with the memory, it returns the memory to the heap with delete. The function then sets the pointer to 0 (nullptr in C++11 and later). This is not a requirement, but it is a very good idea.

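If the programmer mistakenly attempts to store something in *pdMyDouble after the delete, the zeroed pointer makes the mistake obvious: on most systems the program crashes immediately at the offending statement instead of silently corrupting memory that may already be in use for something else.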

You can use new to allocate arrays from the heap as well, but you must return the array to the heap using the delete[] operator:

int* nArray = new int[10];
nArray[0] = 0;
delete[] nArray;

Technically, new int[10] invokes the new[] operator, but it works the same way as new.