(11.1)
Data Abstraction
– Problems with subprogram abstraction
– Encapsulation
– Data abstraction
– Language issues for ADTs
– Examples
» Ada
» C++
» Java
– Parameterized ADTs
(11.2)
Object-oriented programming
– Components of object-oriented programming languages
– Fundamental properties of the object-oriented model
– Relation to data abstraction
– Design issues for OOPL
– Examples
» Smalltalk 80
» C++
» Ada 95
» Java
– Comparisons
» C++ and Smalltalk
» C++ and Ada 95
» C++ and Java
– Implementation issues
(11.3)
No way to selectively provide visibility for subprograms
No convenient ways to collect subprograms together to perform a set of services
Program that uses subprogram ( client program ) must know details of all data structures used by subprogram
– client can “work around” services provided by subprogram
– hard to make client independent of implementation techniques for data structures
» discourages reuse
Difficult to build on and modify the services provided by subprogram
Many languages don’t provide for separately compiled subprograms
One solution
– a grouping of logically related subprograms that can be separately compiled
– called encapsulations
Examples of encapsulation mechanisms
– nested subprograms in some ALGOL-like languages
» Pascal
– FORTRAN 77 and C
» files containing one or more subprograms can be independently compiled
– FORTRAN 90, Modula-2, Modula-3, C++, Ada (and other contemporary languages)
» separately compilable modules
(11.4)
(11.5)
A better solution than just encapsulation
Can write programs that depend on abstract properties of a type, rather than implementation
Informally, an Abstract Data Type (ADT) is a [collection of] data structures and operations on those data structures
– example is floating point number
» can define variables of that type
» operations are predefined
» representation is hidden and can’t be manipulated except through built-in operations
ADT
– isolates programs from the representation
– maintains integrity of data structure by preventing direct manipulation
(11.6)
Formally, an ADT is a user-defined data type where
– the representation of and operations on objects of the type are defined in a single syntactic unit; also, other units can create objects of the type.
– the representation of objects of the type is hidden from the program units that use these objects, so the only operations possible are those provided in the type's definition.
Advantages of first restriction are same as those for encapsulation
– program organization
– modifiability (everything associated with a data structure is together)
– separate compilation
(11.7)
Advantage of second restriction is reliability
– by hiding the data representations, user code cannot directly access objects of the type
– user code cannot depend on the representation, allowing the representation to be changed without affecting user code
By this definition, built-in types are ADTs
– e.g., int type in C
» the representation is hidden
» operations are all built-in
» user programs can define objects of int type
User-defined abstract data types must have the same characteristics as built-in abstract data types
ADTs provide mechanisms to limit visibility
– public part indicates what can be seen (and used from) outside
» what is exported
– private part describes what will be hidden from clients
» made available to allow compiler to determine needed information
» C++ allows specified program units access to the private information
• friend functions and classes
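The public/private split and the friend escape hatch can be sketched in C++ as follows. This is only an illustration: the class IntStack, its capacity of 16, and the helper stored_count are hypothetical names that do not come from the slides.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical ADT: a fixed-capacity stack of ints.
class IntStack {
public:                                  // public part: what is exported
    bool empty() const { return count_ == 0; }
    bool push(int v) {                   // returns false when the stack is full
        if (count_ == kCapacity) return false;
        items_[count_++] = v;
        return true;
    }
    int top() const { assert(!empty()); return items_[count_ - 1]; }
    void pop() { assert(!empty()); --count_; }

    // A friend function is granted access to the private part.
    friend std::size_t stored_count(const IntStack& s);

private:                                 // private part: hidden from clients,
    static const std::size_t kCapacity = 16;  // but visible to the compiler
    int items_[kCapacity];
    std::size_t count_ = 0;
};

// Not a member, yet allowed to read count_ because it is a friend.
std::size_t stored_count(const IntStack& s) { return s.count_; }
```

Client code can only call empty, push, top, and pop; an expression such as s.count_ in client code is rejected by the compiler, so the array could later be swapped for a linked list without changing clients.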
(11.8)
(11.9)
Language requirements for data abstraction
– a syntactic unit in which to encapsulate the type definition.
– a method of making type names and subprogram headers visible to clients, while hiding actual definitions
» public/private
– some primitive operations must be built into the language processor (usually just assignment and comparisons for equality and inequality)
» some operations are commonly needed, but must be defined by the type designer
» e.g., iterators, constructors, destructors
Two ways to provide ADTs in a programming language
– as a type definition extended to include operations (C++)
» use directly to declare variables
– as a collection of objects and operations (Ada)
» may need to be instantiated before declaring variables
Language design issues
– encapsulate a single type, or something more?
– what types can be abstract?
– can abstract types be parameterized?
– how are imported types and operations qualified?
Simula-67 was first language to address this issue
– classes provided encapsulation, but no information hiding
(11.10)
Abstraction mechanism is the package
Each package has two pieces (can be in same or separate files)
– specification
» public part
» private part
– body
» implementation of all operations exported in public part
» may include other procedures, functions, type and variable declarations, which are hidden from clients
• all variables are static
» may provide initialization section
• executed when declaration involving package is elaborated
Any type can be exported
Operations on exported types may be restricted
– private (:=, =, /=, plus operations exported)
– limited private (only operations exported)
(11.11)
Evaluation
– exporting any type as private is good
» cost is recompilation of clients when the representation is changed
– can’t import specific entities from other packages
– good facilities for separate compilation
(11.12)
(11.13)
Based on C struct type and Simula 67 classes
Class is the encapsulation device
– all instances of a class share a single copy of the member functions
– each instance of a class has its own copy of the class data members
– instances can be static, stack-dynamic (semidynamic), or explicit heap-dynamic
Information Hiding
– private clause for hidden entities
– public clause for interface entities
– protected clause - for inheritance
(11.14)
Constructors
– functions to initialize the data members of instances
– may also allocate storage if part of the object is heap-dynamic
– can include parameters to provide parameterization of the objects
– implicitly called when an instance is created
» can be explicitly called
– name is the same as the class name
Destructors
– functions to clean up when an instance is destroyed; usually just to reclaim heap storage
– implicitly called when the object’s lifetime ends
» can be explicitly called
– name is the class name, preceded by a tilde (~)
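A minimal sketch of the constructor and destructor rules above, assuming a hypothetical class Buffer that owns heap storage. The static counter live_ is added purely so the implicit calls can be observed; it is not part of the slides.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class whose objects own a heap-dynamic array.
class Buffer {
public:
    // Constructor: same name as the class, takes a parameter,
    // allocates the heap-dynamic part of the object.
    explicit Buffer(std::size_t n) : data_(new int[n]()), size_(n) { ++live_; }

    // Destructor: class name preceded by a tilde, reclaims heap storage.
    ~Buffer() { delete[] data_; --live_; }

    // Copying is disabled to keep the ownership story simple.
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    std::size_t size() const { return size_; }
    static int live() { return live_; }   // number of Buffers currently alive

private:
    int* data_;
    std::size_t size_;
    static int live_;
};

int Buffer::live_ = 0;

// The constructor runs implicitly at the declaration of b, and the
// destructor runs implicitly when b goes out of scope at the closing brace.
int live_inside_scope() {
    Buffer b(100);
    return Buffer::live();
}
```

The return value of live_inside_scope is computed while b still exists; the count drops back once the function returns.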
(11.15)
Friend functions
– allow specified unrelated units or functions to access private members
Evaluation
– classes are similar to Ada packages for providing abstract data type
– difference is packages are encapsulations, whereas classes are types
(11.16)
Similar to C++ except
– all user-defined types are classes
– all objects are allocated from the heap and accessed through reference variables
– individual entities in classes (methods and variables) have access control modifiers (public, protected, or private), rather than C++ clauses
– functions can only be defined in classes
– Java has a second scoping mechanism, package scope, that is used instead of friends
» all entities in all classes in a package that don’t have access control modifiers are visible throughout the package
(11.17)
Ada generic packages may be parameterized with
– type of element stored in data structure
– operators among those elements
Must be instantiated before declaring variables
– instantiation of generic behaves like text substitution
– package BST_Integer is new binary_search_tree(INTEGER);
» like text of generic package substituted here, with parameters substituted
– EXCEPT references to non-local variables, etc. are resolved as if they occurred at the point where the generic was declared
With multiple instantiations, exported type names must be qualified by the instance name when declaring variables
– package BST_Real is new binary_search_tree(REAL);
– tree1: BST_Integer.bst;
– tree2: BST_Real.bst;
(11.18)
C++
– classes can be somewhat generic by writing parameterized constructor functions
stack (int size) {
  stk_ptr = new int [size];
  max_len = size - 1;
  top = -1;
}
stack stk(100);
– class itself may be parameterized as a templated class
– Java originally didn’t support generic abstract data types; generics were added in Java 5
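A templated class can be sketched as below. This is an illustrative version only: the names Stack, T, and MaxLen are assumptions, and a fixed array replaces the heap allocation used in the constructor example so the sketch stays short.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical generic stack: the element type and the maximum length
// are both template parameters, fixed at instantiation time.
template <typename T, std::size_t MaxLen>
class Stack {
public:
    bool empty() const { return top_ == 0; }

    // Returns false instead of overflowing when the stack is full.
    bool push(const T& v) {
        if (top_ == MaxLen) return false;
        items_[top_++] = v;
        return true;
    }

    // Removes and returns the most recently pushed element.
    T pop() {
        assert(!empty());
        return items_[--top_];
    }

private:
    T items_[MaxLen];
    std::size_t top_ = 0;   // index of the next free slot
};
```

Stack<int, 100> and Stack<double, 4> are distinct types generated from the same text, which parallels instantiating an Ada generic package twice.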
(11.19)
The problem with Abstract Data Types is that they are static
– can’t modify types or operations
» except for generics/templates
– means extra work to modify existing ADTs
Object-oriented programming (OOP) languages extend data abstraction ideas to
– allow hierarchies of abstractions
– make modifying existing abstractions for other uses very easy
Leads to new approach to programming
– identify real world objects of problem domain and processing required of them
– create simulations of those objects, processes, and the communication between them by modifying existing objects whenever possible
(11.20)
Two approaches to designing OOPL
– start from scratch (Smalltalk 1972!!)
» allows cleaner design
» better integration of object features
» no installed base
– modify an existing PL (C++, Ada 95)
» can build on body of existing code
» OO features usually not as smoothly integrated
» backward compatibility carries over warts from the initial language design
(11.21)
Object : encapsulated operations plus local variables that define an object’s state
– state is retained between executions
– objects send and receive messages
Messages : requests from sender to receiver to perform work
– can be parameterized
– in pure OOL are also objects
– return results
Methods : descriptions of operations to be done when a message is received
Classes : templates for objects, with methods and state variables
– objects are instantiations of classes
– classes are also objects (have instantiation method to create new objects)
Abstract Data Types
– encapsulation into a single syntactic unit that includes operations and variables
– also information hiding capabilities
Inheritance
– fundamental defining characteristic of OOPL
– classes are hierarchical
» subclass/superclass or parent/derived
» classes lower in the hierarchy inherit variables and methods of ancestor classes
» can redefine those, or add additional, or eliminate some
– single inheritance (tree structure) or multiple inheritance (acyclic graph)
» with single inheritance, can talk about a root class
(11.22)
(11.23)
Polymorphism
– special kind of dynamic binding
» message to method
– same message can be sent to different objects, and the object will respond properly
– similar to function overloading except
» overloading is static (known at compile time)
» polymorphism is dynamic (class of object known at run time)
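The contrast between overloading and polymorphism can be sketched in C++. The types Animal and Dog and the function describe are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <string>

struct Animal {
    virtual ~Animal() {}
    // Dynamically bound: the version run depends on the object's class.
    virtual std::string speak() const { return "..."; }
};

struct Dog : Animal {
    std::string speak() const override { return "woof"; }
};

// Overloading: the compiler picks one of these from the static
// (declared) type of the argument, before the program runs.
std::string describe(const Animal&) { return "animal"; }
std::string describe(const Dog&) { return "dog"; }
```

Given Dog d and const Animal& a = d, the call describe(a) selects the Animal overload because the choice is made at compile time, while a.speak() runs Dog's method because the choice is made at run time from the class of the object.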
(11.24)
Class == generic package
Object == instantiation of generic
– actually, closer to instance of exported type
Messages == calls to operations exported by ADT
Methods == bodies (code) for operations exported by ADT
EXCEPT
– data abstraction mechanism allows only one level of generic/instantiation
– OO model allows multiple levels of inheritance
– no dynamic binding of method invocation in ADTs
(11.25)
Exclusivity of objects
– everything is an object
» elegant and pure, but slow for primitive types
– add objects to complete typing system
» fast for primitive types, but confusing
– include an imperative style typing system for primitive types, but everything else is an object
» relatively fast, and less confusion
Are subclasses subtypes?
– does an “is a” relationship hold between parent and child classes?
(11.26)
Interface or implementation inheritance?
– if only interface of parent class is visible to subclass, interface inheritance
» may be inefficient
– if interface and implementation visible to subclass, implementation inheritance
Type checking and polymorphism
– if overriding methods must have the same parameter types and return type, checking may be static
– otherwise need dynamic type checking, which is slow and delays error detection
Single or multiple inheritance
– multiple is extremely convenient
– multiple also makes the language and implementation more complex, and is less efficient
(11.27)
Allocation and deallocation of objects
– if all objects are allocated from heap, references to them are uniform (as in Java)
– is deallocation explicit (heap-dynamic objects in C++) or implicit (Java)?
Should all binding of messages to methods be dynamic?
– if yes, inefficient
– if none are, great loss of flexibility
(11.28)
Smalltalk is the prototypical pure OOPL
All entities in a program are objects
– referenced by pointers
All computation is done by sending messages (perhaps parameterized by object names) to objects
– message invokes a method
– reply returns result to sender, or notifies that action has been done
Also incorporates graphical programming environment
– program editor
– compiler
– class library browser
» with associated classes
– also written in Smalltalk
» can be modified
Messages
– object to receive message
– message
» method to invoke
» possibly parameters
Unary messages
– specify only object and method
– firstAngle sin
» invokes sin method of firstAngle object
Binary messages
– infix order
– total / 100
» sends message / 100 to object total
» which invokes / method of total with parameter 100
(11.29)
Keyword messages
– indicate parameter values by specifying keywords
– keywords also identify the method
– firstArray at: 1 put: 5
» invokes at:put: method of firstArray with parameters 1 and 5
Message expressions
– messages may be combined in expressions
» unary have highest precedence, then binary, then keyword
» associate left to right
» order may be specified by parentheses
– messages may be cascaded
» ourPen home; up; goto: 500@500
» equivalent to ourPen home.
ourPen up.
ourPen goto: 500@500
(11.30)
(11.31)
Assignment
– object <- object
– index <- index + 5
Blocks
– unnamed objects specified by [ <expressions> ]
» expressions are separated by .
– evaluated when they are sent the value message
» always in the context of their definition
– may be assigned to variables
» foo <- [ ... ]
Logical loops
– blocks may contain conditions
– all blocks have whileTrue methods
– sends value to condition block
– evaluates body block if result is true
[ <logical condition> ] whileTrue:
[ <body of loop> ]
Iterative loops
– all integer objects have a timesRepeat method
12 timesRepeat: [ ... ]
– also have
» to:do:
» to:by:do:
– a block is the loop body
6 to: 10 do: [ ... ]
Selection
– true and false are also objects
– each has ifTrue:, ifFalse:, ifTrue:ifFalse:, and ifFalse:ifTrue: methods
total = 0 “returns true or false object”
ifTrue: [ ... ] “true object executes this; false ignores”
ifFalse: [ ... ] “false object executes this; true ignores”
(11.32)
(11.33)
Dynamic binding
– when a message arrives at an object, the class of which the object is an instance is searched for a corresponding method
– if not there, search superclass, etc.
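The lookup rule above (search the receiver's class, then its superclass, and so on) can be simulated in C++. This is a sketch of the idea only, not how any Smalltalk system is actually implemented; ClassDesc, send, and the restriction to methods taking and returning one int are simplifying assumptions.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical class descriptor: a method dictionary keyed by message
// name, plus a link to the superclass (nullptr for the root class).
struct ClassDesc {
    const ClassDesc* superclass = nullptr;
    std::map<std::string, std::function<int(int)>> methods;
};

// Bind a message to a method at run time: look in the receiver's class
// first, then walk up the superclass chain until a match is found.
int send(const ClassDesc& cls, const std::string& selector, int arg) {
    for (const ClassDesc* c = &cls; c != nullptr; c = c->superclass) {
        auto it = c->methods.find(selector);
        if (it != c->methods.end()) return it->second(arg);
    }
    // No class in the chain defines the method.
    throw std::runtime_error("message not understood: " + selector);
}
```

A subclass overrides a method simply by having its own entry under the same name, since its dictionary is searched before the parent's. The failure case surfaces only when the message is sent, which is why such errors are caught at run time.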
Only single inheritance
– every class is an offspring of the root class Object
Evaluation
– simple, consistent syntax
– relatively slow
» message passing overhead for all control constructs
» dynamic binding of message to method
– because of dynamic binding, type errors can be detected only at run time
(11.34)
Essentially all of variable declaration, types, and control structures are those of C
C++ classes represent an addition to type structure of C
Inheritance
– multiple inheritance allowed
– classes may be stand-alone
– three information hiding modes
» public: everyone may access
» private: no one else may access
» protected: class and subclasses may access
– when deriving a class from a base class, specify a protection mode
» public mode: public, protected, and private are retained in subclass
» private mode: inherited public and protected members become private in the subclass
• may reexport public members of base class
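The two derivation modes can be sketched as follows. The class names Base, PubDerived, and PrivDerived and their members are hypothetical.

```cpp
#include <cassert>

class Base {
public:
    int id() const { return 1; }
    int twice() const { return 2 * secret(); }
protected:
    int secret() const { return 21; }   // visible to Base and its subclasses
};

// Public derivation: public members of Base stay public in the subclass.
class PubDerived : public Base {
public:
    int peek() const { return secret(); }   // protected member is accessible here
};

// Private derivation: inherited members become private in the subclass,
// so clients of PrivDerived cannot call twice().
class PrivDerived : private Base {
public:
    using Base::id;                          // re-export one public member
    int peek() const { return secret(); }
};
```

With these definitions, PrivDerived q; q.id(); compiles because id was re-exported, while q.twice() is rejected by the compiler.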
(11.35)
Dynamic binding
– C++ member functions are statically bound unless the function definition is identified as virtual
– if virtual function name is called with a pointer or reference variable with the base class type, which member function to execute must be determined at run-time
– pure virtual functions are set to 0 in class header
» must be redefined in derived classes
– classes containing a pure virtual function can never be instantiated directly
» must be derived
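A short sketch of virtual, pure virtual, and statically bound member functions, using hypothetical classes Shape and Rect.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

class Shape {
public:
    virtual ~Shape() {}
    virtual double area() const = 0;               // pure virtual: set to 0
    std::string kind() const { return "shape"; }   // not virtual: statically bound
};

class Rect : public Shape {
public:
    Rect(double w, double h) : w_(w), h_(h) {}
    double area() const override { return w_ * h_; }   // required redefinition
    std::string kind() const { return "rect"; }        // hides Shape::kind, no override
private:
    double w_, h_;
};

// A class with a pure virtual function cannot be instantiated directly.
static_assert(std::is_abstract<Shape>::value, "Shape is abstract");

// The call through a base-class reference is bound at run time.
double area_of(const Shape& s) { return s.area(); }
```

Through a const Shape& that refers to a Rect, area() runs Rect's version, but kind() runs Shape's version because the function is not virtual and is chosen from the declared type.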
(11.36)
General characteristics
– all data are objects except the primitive types
– all primitive types have wrapper classes that store one data value
– all objects are heap-dynamic, referenced through reference variables, and most are explicitly allocated
Inheritance
– single inheritance only
» but implementing interface can provide some of the benefits of multiple inheritance
» an interface can include only method declarations and named constants (default and static methods were added in Java 8)
public class Clock extends Applet implements Runnable
– methods can be final (can’t be overridden)
(11.37)
Dynamic binding is the default
– except for final methods
Package provides additional encapsulation mechanism
– packages are a container for related classes
– entries defined without an access modifier (private, protected, public) have package scope
» visible throughout package but not outside
– similarly, protected entries are visible throughout package
(11.38)
Type extension builds on derived types with tagged types
– tag associated with type identifies particular type
Classes are packages with tagged types
package Object_Package is
   type Object is tagged private;
   procedure Draw (O: in out Object);
private
   type Object is tagged record
      X_Coord, Y_Coord: Real;
   end record;
end Object_Package;
(11.39)
Then may derive a new class by using new reserved word and modifying tagged type exported
Overloading defines new methods
with Object_Package; use Object_Package;
package Circle_Package is
   type Circle is new Object with record
      Radius: Real;
   end record;
   procedure Draw (C: in out Circle);
end Circle_Package;
(11.40)
Derived packages form tree of classes
Can refer to type and all types beneath it in tree by type’class
– Object’class
– Square’class
Then use these as parameters to procedures to provide dynamic binding of procedure invocation
procedure foo (OC: Object'Class) is
begin
   Area(OC); -- which Area determined at run time
end foo;
[Figure: tree of classes with nodes Object, Circle, Square, Ellipse, Rectangle]
Pure abstract base types are defined using the word abstract in type and subprogram definitions
package World is
   type Thing is abstract tagged null record;
   function Area(T: in Thing) return Real is abstract;
end World;
with World; use World;
package My_World is
   type Object is new Thing with record ... end record;
   function Area(O: in Object) return Real is ... end Area;
   type Circle is new Object with record ... end record;
   function Area(C: in Circle) return Real is ... end Area;
end My_World;
(11.41)
Inheritance
– C++ provides greater flexibility of access control
– C++ provides multiple inheritance
» good or bad?
Dynamic vs. static binding
– Smalltalk full dynamic binding with great flexibility
– C++ allows programmer to control binding time
» virtual functions, whose overriding versions must all return the same type
Control
– Smalltalk does everything through message passing
– C++ provides conventional control structures
(11.42)
(11.43)
Classes as types
– C++ classes are types
» all instances of a class are the same type, and one can legally access the instance variables of another
– Smalltalk classes are not types, and the language is essentially typeless
– C++ provides static type checking, Smalltalk does not
Efficiency
– C++ is substantially more efficient in run-time CPU and memory requirements
Elegance
– Smalltalk is consistent, fundamentally object-oriented
– C++ is a hybrid language in which compatibility with C was an essential design consideration
Ada 95 has more consistent type mechanism
– C++ has C type structure, plus classes
C++ provides cleaner multiple inheritance
C++ must make dynamic/static function invocation decision at time root class is defined
– must be virtual function
– Ada 95 allows that decision to be made at time derived class is defined
C++ allows dynamic binding only for pointers and reference types
Ada 95 doesn’t provide constructor and destructor functions
– initialization and cleanup must be explicitly invoked
(11.44)
(11.45)
Java more consistent with OO model
– all classes must descend from Object
No friend mechanism in Java
– packages provide cleaner alternative
Dynamic binding “normal” way of binding messages to methods
Java allows single inheritance only
– but interfaces provide some of the same capability as multiple inheritance
(11.46)
Store state of an object in a class instance record
– template known at compile time
– access instance variables by offset
– subclass CIR is built from the parent’s CIR layout, with the subclass’s local instance variables added after it
CIR also provides a mechanism for accessing code for dynamically bound methods
– CIR points to table ( virtual method table ) which contains pointers to code for each dynamically bound method
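The CIR and virtual method table can be imitated by hand in C++ to show the mechanism. This is a hand-rolled sketch, not what any particular compiler emits; PointCIR, Point3CIR, VTable, and the squared-norm methods are hypothetical.

```cpp
#include <cassert>

struct PointCIR;                                  // forward declaration
typedef double (*NormFn)(const PointCIR*);

// One slot per dynamically bound method.
struct VTable { NormFn sq_norm; };

// Parent CIR: pointer to the method table, then the instance variables,
// each at a fixed offset known at compile time.
struct PointCIR {
    const VTable* vptr;
    double x, y;
};

// Subclass CIR: the parent's layout comes first, then the new variables.
struct Point3CIR {
    PointCIR base;
    double z;
};

double point_sq_norm(const PointCIR* p) { return p->x * p->x + p->y * p->y; }

double point3_sq_norm(const PointCIR* p) {
    // Valid because base is the first member of Point3CIR.
    const Point3CIR* q = reinterpret_cast<const Point3CIR*>(p);
    return point_sq_norm(p) + q->z * q->z;
}

const VTable kPointVT = { point_sq_norm };
const VTable kPoint3VT = { point3_sq_norm };

// Dynamically bound call: follow the CIR's table pointer, then the slot.
double call_sq_norm(const PointCIR* p) { return p->vptr->sq_norm(p); }
```

Because the subclass record begins with the parent's layout, code written for the parent can use the same offsets on either kind of object, and the table pointer decides which method body runs.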