CS 3723/3721
Programming Languages
|
For a compiled program, its static structure is the structure of the source program, how it is organized. The dynamic structure is the structure that evolves during run-time.
One uses the word binding to mean an association of a property with an entity in a programming language. All binding of storage locations to program names occur at compile-time. The bindings are fixed and unchanged throughout run-time.
This is the storage model used for the FORTRAN language up to 1977. It is a simple model, easy to set up, with very little to manage at run-time. In fact the main changes at run-time are with the parameters. In FORTRAN, parameters are always passed by reference, that is, the location (or reference) of the parameter is passed. FORTRAN passes arrays just the same as C or Java, but FORTRAN also passes simple variables by reference, rather than by value, as C and Java do. In a FORTRAN program, this is the main thing that changes during run-time, since a function may be passed different addresses for its parameters during different calls at run-time.
These features of FORTRAN can lead to a curious error: If one passes a constant, say 2, as a parameter, what is actually passed is a reference to the constant. Suppose inside the function, the formal parameter (call it x) is incremented, say with x = x + 1. Then this can result in a change of the constant itself, so that a = 2 would then actually give a the value of 3.
The static storage model allows simple allocation (at compile-time) and allows efficient execution (at run-time). So why don't we still use this method? Why not just use static storage?
Answers to "Why not use static storage?":
|
All three of C, Java, and C++ allow the declaration of static variables, so at run time, each of these languages has a static portion of storage for these variables.
Allocation on the stack can be as simple as modifying a single stacktop pointer, and deallocation is typically just as easy. Implementing functions using only static storage (as was done in FORTRAN) does not allow recursive functions, whereas the stack allocation supports recursion. This method has just a little extra run-time overhead compared with static allocation.
This method uses storage more efficiently, since local variables for all defined functions are allocated using static allocation, whereas the stack allocation only needs to allocate storage for currently executing functions. For example, if there are three functions A, B, and C, each with a 100-byte local array, then all three arrays must be allocated (300 bytes) in the static model, while only the arrays for executing functions are allocated in the stack model.
This stack allocation method was introduced in 1960 for Algol 60 by Peter Naur. (Of course he worked on it before then.) It might seem plausible that Naur developed Algol 60, and then hit on the clever way to implement function calls. The reality is more impressive though: Naur thought of the stack allocation method and then invented a language that would utilize this method. Since 1960 all "standard" compiled languages have used this method, including later dialects of Algol, Pascal and descendants, PL/I, C and C++, and Java.
Note that actual parameters (possibly expressions) are evaluated in the environment of the calling function. Then each resulting value of this evaluation is copied into the spot in the called function's activation record for the corresponding formal parameter (so that the formal parameter behaves like a local variable). Then all the additional information is copied into the activation record and the function is put into execution.
The Pascal language allowed the definition of a "helper" function to be buried inside the definition of the function it is helping. The provided a rudimentary version of the "object-oriented" approach, since the information in the helper function was hidden from view (so-called information hiding). Function definitions could be buried within other functions to an arbitrary depth of nesting, and this feature made Pascal implementation much more difficult than it otherwise would have been. (The stack needed extra pointers to access variables at different levels of nesting efficiently.) These mechanisms provided interesting examples for implementation in compiler construction classes, but in practice they were seldom used. The developers of C, and those of C++ and Java who followed, did not allow a function definition to be nested inside another function.
In each case, a desired amount of storage is allocated and a pointer to the storage is returned. Later it is possible to return the storage to the heap (sort of like recycling) to be used during other allocations. Pascal, C, and C++ have explicit deallocation as the norm, whereas automatic deallocation, called garbage collection, is possible. In Java automatic garbage collection is the norm, although explicit deallocation is possible.
Here are the different functions for allocating and deallocating in the different languages:
Language | Allocate | Deallocate | Automatic Garbage Collection |
---|---|---|---|
Pascal | new() | dispose() | No |
C | malloc() | free() | Not normally |
C++ | new | delete | Not normally |
Java | new | System.gc() | Yes |
(Note: In Pascal and C, the above are functions. In C++, new and delete are operators, although you can also use the C functions, assuming that you do not mix them with the C++ operators (causing a catastophe). In Java, System.gc() is a function, while new is a keyword, a part of the language used to allocate storage.)
Memory leaks are a serious problem in C, but they are even more of a problem in C++, because of the complexity of this language, particularly the complexity of hidden allocations and deallocations. Bjarne Stroustrup, the inventor of C++, had this to say in in his FAQ about C++:
Question: "How do I deal with memory leaks?" Answer: By writing code that doesn't have any. Clearly, if your code has new operations, delete operations, and pointer arithmetic all over the place, you are going to mess up somewhere and get leaks, stray pointers, etc. This is true independently of how conscientious you are with your allocations: eventually the complexity of the code will overcome the time and effort you can afford. It follows that successful techniques rely on hiding allocation and deallocation inside more manageable types. Good examples are the standard containers. They manage memory for their elements better than you could without disproportionate effort.
If systematic application of these techniques is not possible in your environment (you have to use code from elsewhere, part of your program was written by Neanderthals, etc.), be sure to use a memory leak detector as part of your standard development procedure, or plug in a garbage collector. |
Mistake 2: An active pointer to garbage. A much more serious mistake than the previous one is to deallocate storage when the program still has an active pointer to the storage. Mistake 1 just leads to programs that keep using more memory as they run; they have to be restarted every now and then. But Mistake 2 can lead to a program that crashes -- a potential catastrophe.