CSSE501 Object-Oriented Development Chapter 12: Implications of Substitution In this chapter we will investigate some of the implications of the principle of substitution in statically typed object-oriented programming languages. In particular, we will consider: The impact on memory management The meaning of assignment The distinction between testing for identity and testing for equality Idealization of is-a Relationship A TextWindow is-a Window Because TextWindow is subclassed from Window, all behavior associated with Windows is also manifest by instances of TextWindow. Therefore, a variable declared as maintaining an instance of Window should be able to hold a value of type TextWindow. Unfortunately, practical programming language implementation issues complicate this idealized picture. Memory Allocation - Stack and Heap Based Generally, programming languages use two different techniques for allocation of memory. Stack-based allocation. Amount of space required is determined at compile time, based on static types of variables. Memory allocation and release is tied to procedure entry/exit. Can be performed very efficiently. Heap-based allocation. Amount of space used can be determined at run-time, based upon dynamic considerations. Memory allocation and release is not tied to procedure entry/exit, and either must be handled by user or by a run-time library (garbage collection). Generally considered to be somewhat less efficient. The Problem with Substitution class Window { public: private: }; virtual void oops(); int height; int width; class TextWindow : public Window { public: virtual void oops(); private: char * contents; int cursorLocation; }; Window x; // how much space to set aside? TextWindow y; x = y; // what should happen here? Memory Strategies How much memory should be set aside for the variable x ? (Minimum Static Space Allocation) Allocate the amount of space necessary for the base class only. (C++) (Maximum Static Space Allocation) Allocate the amount of space for the largest subclass. (Dynamic Space Allocation) Allocate for x only the amount of space required to hold a pointer. (Smalltalk, Java) Minimum Static Space Allocation The language C++ uses the minimum static space allocation approach. This is very efficient, but leads to some subtle difficulties. What happens in the following assignment? Window x; TextWindow y; x = y; Assigning a Larger Value to a Smaller Box The Slicing Problem The problem is you are trying to take a large box and squeeze it into a small space. Clearly this won't work. Thus, the extra fields are simply sliced off. Question - Does this matter? Answer - Only if somebody notices. Solution - Design the language to make it difficult to notice. Rules for Member Function Binding in C++ The rules for deciding what member function to execute are complicated because of the slicing problem. With variables that are declared normally, the binding of member function name to function body is based on the static type of the argument (regardless whether the function is declared virtual or not). With variables that are declared as references or pointers, the binding of the member function name to function body is based on the dynamic type. Illustration void Window::oops() { printf("Window oops\n"); } void TextWindow::oops() { printf("TextWindow oops %d\n", cursorLocation); } TextWindow x; Window a; Window * b; TextWindow * c; a = x; a.oops(); // executes Window version c = &x; c->oops(); // executes TextWindow version Maximum Static Space Allocation A different approach would be to allocate the Maximum amount of space you would ever need. Would nicely solve the slicing problem. Would often allocate unused space. Maximum amount of space not known until all classes have been seen. For this reason, not used in practice. Dynamic Memory Allocation In the third approach, all objects are actually pointers. Only enough space for a pointer is allocated at compile time. Actual data storage is allocated on the heap at runtime. Used in Smalltalk, Object Pascal, and Objective-C, Java. Requires user to explicitly allocate new objects and, in some languages, explicitly free no longer used storage. May also lead to pointer semantics for assignment and equality testing. Meaning of Assignment? What does it mean when an instance of a class is assigned to another variable? class Box { public int value; } Box x = new Box(); x.value = 7; Box y = x; y.value = 12; // what is x.value? Two possibilities: Copy semantics. x and y are independent of each other, a change in one has no effect on the other. Pointer semantics. x and y refer to the same object, and hence a change in one will alter the other. Copy Semantics versus Pointer Semantics If a value is indirectly accessed through a pointer, when an assignment is performed (or equality test is made), is the quantity assigned simply the pointer or is it the actual value? Problems with Pointer Semantics If x is assigned to y and then changes are made to x, are these changes reflected in y? If x is explicitly freed, what happens if the user tries to access memory through y? In C++, programmer can make assignment (equality testing) mean anything they want. Object Pascal, Java uses pointer semantics, no built-in provision for copies. Smalltalk and Objective-C use pointer semantics, have several techniques for making copies. Equality and Identity A test for identity asks whether two variables refer to exactly the same object. A test for equality asks whether two variables refer to values that are equivalent. Of course, the meaning of equivalent is inheritently domain specific. Object-oriented languages allow the programmer to control the meaning of the equality test by allowing the redefinition of a standard method. (for example, equals in Java). Paradoxes of Equality, Part 1 But child classes cannot change the type signature of overridden methods. This means the argument must often be more general than one would like: class Object { public boolean equals (Object right) { ... } } class PlayingCard extends Object { public boolean equals (Object right) { ... // right must be object even if we are only ... // interested in comparing cards to cards } } Paradoxes of Equality, Part 2 Because equality is a message sent to the left argument, there is no guarantee that properties such as symmetry or transitivity are preserved: class Foo { boolean equals (Object right) { ... } } Foo a, b; if (a.equals(b)) // even if this is true if (b.equals(a)) // no guarantee that this is true Paradoxes of Equality, Part 3 And if you add inheritance into the mix, the possibilities for paradoxical behavior increase even more. class Parent { boolean equals (Object x) { ... } } class Child extends Parent { boolean equals (Object x) { ... } } Parent p; Child c; if (p.equals(c)) // will execute using the parent method if (c.equals(p)) // will execute using the child method Chapter Summary We have explored the implications that result from the inclusion of the principle of substitution in an object oriented programming language. Because values are not known until run time, you either have complex semantics (as in C++) or objects are dynamic (as in Java and most other languages). Because objects are dynamic, most object-oriented languages end up using a garbage collection system. Dynamic semantics naturally lean to pointer semantics for assignment Pointer semantics mean that equality and identity are two different concepts Since equality is domain specific, the programmer must be free to redefine the meaning as appropriate Because the programmer can redefine equality arbitrarily, there is no guarantee that semantics of equals is preserved.