MCS-287 Notes for EOPL Section 6.2 (Spring 2000)

Conceptual issues

In Section 6.1, each variable had exactly one name. Now we will consider the possibility that a variable can have more than one name. This is called "aliasing." We will indicate aliasing in our diagrams simply by writing several names adjacent to a single variable, separating the names with commas.

Most languages only allow aliasing to occur as a result of invoking a procedure using call-by-reference. One noteworthy exception is C++, in which the programmer can introduce a new alias for an existing variable at any time, without needing to invoke a procedure. This allows all the same issues to arise as with call-by-reference, just as in Section 6.1 we were able to use ordinary assignment to see the same issues as with call-by-value.
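
As a small illustration of aliasing without a procedure call, here is a minimal C++ sketch. The function name alias_without_call and the variable names are made up for this example; the only language feature relied on is the standard reference declaration int& y = x.

```cpp
#include <cassert>

// Hypothetical demo: y is declared as a second name for the variable x,
// so there is one variable with two names, not two variables.
int alias_without_call() {
    int x = 3;
    int& y = x;        // y is an alias for x; no new variable is created
    y = 5;             // assigning through either name changes the one variable
    assert(x == 5);
    assert(&x == &y);  // both names denote the same location
    return x;
}
```

In a diagram this would be drawn as a single cell labeled "x, y".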

In call-by-reference, a parameter does not name a new variable (unlike call-by-value). Instead, it becomes a new alias for the caller's variable. This definition only works if the actual argument expression happens to be a variable; we'll see below how to modify the definition for other cases. The simple case suffices to understand the swap example at the top of page 187, as shown in the following diagram:


[Diagram: call-by-reference swap]
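
The same behavior can be tried out in C++, which supports both parameter-passing modes. The sketch below is not the code from page 187; swap_ref, swap_val, swap_demo, and the values 33 and 44 are names and numbers chosen for illustration.

```cpp
#include <cassert>

// Call-by-reference: x and y become aliases for the caller's variables.
void swap_ref(int& x, int& y) {
    int temp = x;
    x = y;
    y = temp;
}

// Call-by-value, for contrast: x and y are new variables holding copies.
void swap_val(int x, int y) {
    int temp = x;
    x = y;
    y = temp;
}

int swap_demo() {
    int a = 33, b = 44;
    swap_val(a, b);
    assert(a == 33 && b == 44);  // the caller's variables are unchanged
    swap_ref(a, b);
    assert(a == 44 && b == 33);  // the caller's variables are exchanged
    return a;
}
```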

This same simple definition of call-by-reference also suffices to understand what happens when arrays are passed to procedures. Remember, the parameter name becomes a new alias for the caller's variable. This is true in both the indirect and the direct array models. In the indirect model, the caller's variable is a single cell, which may refer to an array object. The parameter becomes an additional alias for the single cell. In the direct model, the caller's variable may be a multi-cell array variable. The parameter becomes an additional alias for that array variable. Using this understanding, we can re-evaluate the code from Figure 6.1.5, but this time using call-by-reference. If we use the indirect array model, we get the diagram shown in part (a) (the top part) of Figure 6.2.2 on page 192. However, if we use the direct array model, we do not get the diagram shown in part (b) of that figure; that diagram is in error. The correct diagram for Figure 6.2.2(b), showing call-by-reference with the direct array model, is as follows:


[Diagram: x,u: 5,6,4; v: 3,8 => x,u: 3,9,4; v: 3,8]
Call-by-reference with direct arrays: For consistency with Figure 6.2.2, but not with other diagrams in my notes, I've shown the numbers inside the cells.
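
To make the difference between the two array models concrete, here is a rough C++ sketch. It is not the code of Figure 6.1.5: the procedures p_indirect and p_direct, the type names, and the array contents are all assumptions made for illustration, and both arrays are given the same length to keep the copying step simple. The indirect model is imitated by a variable that holds a shared pointer to an array object, the direct model by a variable that is itself the array.

```cpp
#include <array>
#include <cassert>
#include <memory>

using Array3 = std::array<int, 3>;
using ArrayRef = std::shared_ptr<Array3>;  // a cell that refers to an array object

// Indirect model: x aliases the caller's single cell.
// Assigning to x makes that cell refer to a different array object.
void p_indirect(ArrayRef& x, const ArrayRef& other) {
    x = other;
    (*x)[1] = 9;   // writes into the array object now shared with other
}

// Direct model: x aliases the caller's whole array variable.
// Assigning to x copies the elements into the caller's array.
void p_direct(Array3& x, const Array3& other) {
    x = other;
    x[1] = 9;      // writes into the caller's array only
}

int array_models_demo() {
    ArrayRef u = std::make_shared<Array3>(Array3{5, 6, 4});
    ArrayRef v = std::make_shared<Array3>(Array3{3, 8, 1});
    p_indirect(u, v);
    assert(u == v);          // u's cell now refers to v's array object
    assert((*v)[1] == 9);    // so the write is visible through v as well

    Array3 du{5, 6, 4};
    Array3 dv{3, 8, 1};
    p_direct(du, dv);
    assert((du == Array3{3, 9, 1}));  // du received a copy, then was updated
    assert((dv == Array3{3, 8, 1}));  // dv is untouched
    return du[1];
}
```

In both versions the assignment to the parameter is visible to the caller, because the parameter is an alias; what differs is whether the caller's variable ends up sharing the other array or holding a copy of its contents.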

We can now return to the question of what happens if an actual procedure argument isn't a variable. (This would foul up our definition that the parameter becomes an additional alias for the caller's variable, since there isn't a caller's variable.)

One simple case is when the procedure argument is an expression referring to an array element, such as a[1]. In this case, we can make the procedure's parameter an alias for the individual cell within the array. To show this in our diagrams will require a new notation, to indicate that a name is a name for a particular cell within an array, rather than for the whole array. Consider first the indirect array model. If the procedure p has a parameter named x, then an invocation like p(a[1]) could be diagrammed as follows:


[Diagram: aliasing an element of an indirect array; x aliases a[1]]

The situation is similar with direct arrays; evaluating p(a[1]) results in x being an alias for element 1 of the array variable a, as shown below:

[Diagram: aliasing an element of a direct array; x aliases a[1]]
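
C++ arrays behave like the direct model here, so element aliasing can be sketched as follows. The body of p and the array contents are invented for the example; the notes only say that p has a parameter named x.

```cpp
#include <cassert>

// Hypothetical procedure p with a by-reference parameter x.
void p(int& x) {
    x = x + 100;
}

int element_alias_demo() {
    int a[3] = {10, 20, 30};
    p(a[1]);               // x aliases the single cell a[1], not the whole array
    assert(a[0] == 10);    // neighboring elements are unaffected
    assert(a[1] == 120);
    assert(a[2] == 30);
    return a[1];
}
```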

As an aside, some languages (such as Fortran, the classic call-by-reference language) take this one step further and allow a procedure's parameter to alias a subarray within a larger array of the caller. In our interpreter, we'll stick with aliasing either the whole array or an individual element.

Now we are left with the most perplexing case: what if the actual argument expression is something that in no way corresponds to a memory location? For example, it might be as simple as the constant 1, or it might be a procedure application such as +(2, 2). Some languages simply forbid passing an expression like this to a parameter that uses call-by-reference. This is typical in languages where individual parameters can be specified as either call-by-value or call-by-reference. In languages where all parameter passing is call-by-reference, it is more typical to simply create a new cell to hold the argument value, and make the parameter a name for this new cell. In other words, in this case, call-by-reference behaves identically to call-by-value. This is the approach EOPL takes.
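
Both approaches can be seen in a short C++ sketch. C++ is a language of the first kind: standard C++ refuses to bind an ordinary (non-const) reference parameter to an expression such as 2 + 2. The second approach, creating a new cell, is imitated by hand below. The names set_to_zero, fresh, and fresh_cell_demo are hypothetical.

```cpp
#include <cassert>

// Hypothetical by-reference procedure that assigns to its parameter.
void set_to_zero(int& x) {
    x = 0;
}

int fresh_cell_demo() {
    int n = 7;
    set_to_zero(n);          // the argument is a variable: x aliases n
    assert(n == 0);

    // set_to_zero(2 + 2);   // rejected by standard C++: 2 + 2 names no location

    // The "create a new cell" approach, done by hand: a fresh variable holds
    // the argument value, and the parameter aliases that fresh variable.
    int fresh = 2 + 2;
    set_to_zero(fresh);      // the assignment affects only the fresh cell
    assert(fresh == 0);
    return n;
}
```

Since nothing else names the fresh cell, the caller cannot observe the assignment, which is why this case is indistinguishable from call-by-value.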

Many real languages use call-by-reference, at least as an option. As mentioned above, Fortran uses call-by-reference for all parameter passing. Pascal allows each parameter to be marked as either by-value or by-reference. C++, as mentioned earlier, allows general alias creation, and in particular allows individual parameters to be by-reference, rather than the default of by-value. (C, by contrast, only allows call-by-value.) There are some interesting variations on the theme. For example, the rules of Fortran say that it is illegal to write a procedure invocation like

swap(a[1], a[f(b)])
unless you somehow know that f(b) will never evaluate to 1. (See the discussion on pages 188-189.) Fortran implementations are free to give surprising answers if you ever try to make two parameters be aliases of each other.
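
To see why aliasing between two parameters can be troublesome, consider the following C++ sketch. It is not the example from pages 188-189, and unlike Fortran, C++ permits the aliased call and gives it a definite meaning. The arithmetic swap swap_arith and the demo values are assumptions chosen to make the effect visible.

```cpp
#include <cassert>

// A swap that avoids a temporary by using arithmetic. It is correct only when
// x and y name two different variables.
void swap_arith(int& x, int& y) {
    x = x + y;
    y = x - y;
    x = x - y;
}

int aliased_swap_demo() {
    int a[2] = {3, 4};
    swap_arith(a[0], a[1]);    // distinct cells: works as expected
    assert(a[0] == 4 && a[1] == 3);

    int i = 1;
    swap_arith(a[1], a[i]);    // both parameters alias the same cell a[1]
    assert(a[1] == 0);         // the value 3 was lost, not swapped with itself
    return a[1];
}
```

A procedure written on the assumption that its parameters are distinct variables can silently misbehave when they are not, which is the kind of surprise the Fortran rule is meant to exclude.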

Implementation details

The implementation of call-by-reference can be considerably simplified, given my substitute Array ADT from the previous section. We can eliminate the ae record type (page 190) and instead say that an L-value or Denoted Value is always a Cell. (This is in the indirect array model. In the direct array model, it can also be an entire array.) This means that in place of Figure 6.2.1 (page 191), we can just use a new definition of eval-rand, keeping our old definitions of denoted->expressed and denoted-value-assign!. If we use the first eval-rand shown below with the other procedures from the indirect array model call-by-value interpreter, we get an indirect array model call-by-reference interpreter. If we use the second eval-rand below with the other procedures from the direct array model call-by-value interpreter, we get a direct array model call-by-reference interpreter.

The indirect array model eval-rand procedure for call-by-reference is as follows:

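(define eval-rand
  (lambda (rand env)
    (variant-case rand
      ;; A variable: pass the caller's own cell, so the parameter aliases it.
      (varref (var) (apply-env env var))
      ;; An array element: pass the element's cell within the array.
      (arrayref (array index)
        (array-cell (eval-array-exp array env) (eval-exp index env)))
      ;; Anything else: there is no caller's location, so allocate a fresh cell.
      (else (make-cell (eval-exp rand env))))))

and the direct array model one is as follows:
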
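(define eval-rand
  (lambda (rand env)
    (variant-case rand
      ;; A variable: pass the caller's denoted value (a cell or a whole array).
      (varref (var) (apply-env env var))
      ;; An array element: pass the element's cell within the array.
      (arrayref (array index)
        (array-cell (eval-array-exp array env) (eval-exp index env)))
      ;; Anything else: an array value is passed as is; other values get a fresh cell.
      (else
       (let ((val (eval-exp rand env)))
         (if (array? val)
             val
             (make-cell val)))))))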
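
For readers who would like to experiment outside the Scheme interpreter, here is a heavily simplified C++17 model of the indirect-model eval-rand. Everything in it is an assumption made for illustration: values are restricted to integers, a cell is a shared pointer to an int, an array is a vector of cells, and the types Env, VarRef, ArrayElem, and Literal are invented stand-ins for the interpreter's environment and abstract syntax.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <variant>
#include <vector>

// Simplified stand-ins (all hypothetical): a cell is a shared, mutable int,
// and an array is a vector of cells.
using Cell = std::shared_ptr<int>;
using Array = std::vector<Cell>;

struct Env {
    std::map<std::string, Cell> vars;     // variable name -> its cell
    std::map<std::string, Array> arrays;  // array name -> its element cells
};

// Three kinds of operand, mirroring the three clauses of eval-rand.
struct VarRef { std::string name; };
struct ArrayElem { std::string array; std::size_t index; };
struct Literal { int value; };
using Operand = std::variant<VarRef, ArrayElem, Literal>;

// Returns the cell that the procedure's parameter should name.
Cell eval_rand(const Operand& rand, Env& env) {
    if (const auto* v = std::get_if<VarRef>(&rand))
        return env.vars.at(v->name);                  // alias the caller's cell
    if (const auto* e = std::get_if<ArrayElem>(&rand))
        return env.arrays.at(e->array).at(e->index);  // alias the element's cell
    return std::make_shared<int>(std::get<Literal>(rand).value);  // fresh cell
}

int eval_rand_demo() {
    Env env;
    env.vars["n"] = std::make_shared<int>(1);
    env.arrays["a"] = Array{std::make_shared<int>(10), std::make_shared<int>(20)};

    *eval_rand(VarRef{"n"}, env) = 2;        // assignment through the parameter
    assert(*env.vars["n"] == 2);             // reaches the caller's variable

    *eval_rand(ArrayElem{"a", 1}, env) = 21;
    assert(*env.arrays["a"][1] == 21);       // reaches the array element

    Cell fresh = eval_rand(Literal{4}, env);
    *fresh = 5;                              // changes only the fresh cell
    assert(*env.vars["n"] == 2);
    return *fresh;
}
```

The point of the model is only that the first two cases hand back a cell the caller already owns, while the last case allocates a new one.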