COEN 171 - lecture 12


COEN 171 - Data Abstraction and OOP


 Data Abstraction

– Problems with subprogram abstraction

– Encapsulation

– Data abstraction

– Language issues for ADTs

– Examples

» Ada

» C++

» Java

– Parameterized ADTs

 Object-oriented programming

– Components of object-oriented programming languages

– Fundamental properties of the object-oriented model

– Relation to data abstraction

– Design issues for OOPL

– Examples

» Smalltalk 80

» C++

» Ada 95

» Java

– Comparisons

» C++ and Smalltalk

» C++ and Ada 95

» C++ and Java

– Implementation issues


Subprogram Problems

 No way to selectively provide visibility for subprograms

 No convenient way to collect subprograms together to provide a set of services

 Program that uses subprogram (client program) must know details of all data structures used by subprogram

– client can “work around” services provided by subprogram

– hard to make client independent of implementation techniques for data structures

» discourages reuse

 Difficult to build on and modify the services provided by subprogram

 Many languages don’t provide for separately compiled subprograms


Encapsulation

 One solution

– a grouping of logically related subprograms that can be separately compiled

– called encapsulations

 Examples of encapsulation mechanisms

– nested subprograms in some ALGOL-like languages

» Pascal

– FORTRAN 77 and C

» files containing one or more subprograms can be independently compiled

– FORTRAN 90, Modula-2, Modula-3, C++, Ada (and other contemporary languages)

» separately compilable modules



Data Abstraction

 A better solution than just encapsulation

 Can write programs that depend on abstract properties of a type, rather than implementation

 Informally, an Abstract Data Type (ADT) is a [collection of] data structures and operations on those data structures

– example is floating point number

» can define variables of that type

» operations are predefined

» representation is hidden and can’t be manipulated except through built-in operations


– isolates programs from the representation

– maintains integrity of data structure by preventing direct manipulation


Data Abstraction (continued)

 Formally, an ADT is a user-defined data type where

– the representation of and operations on objects of the type are defined in a single syntactic unit; also, other units can create objects of the type.

– the representation of objects of the type is hidden from the program units that use these objects, so the only operations possible are those provided in the type's definition.

 Advantages of first restriction are same as those for encapsulation

– program organization

– modifiability (everything associated with a data structure is together)

– separate compilation


Data Abstraction (continued)

 Advantage of second restriction is reliability

– by hiding the data representations, user code cannot directly access objects of the type

– user code cannot depend on the representation, allowing the representation to be changed without affecting user code

 By this definition, built-in types are ADTs

– e.g., int type in C

» the representation is hidden

» operations are all built-in

» user programs can define objects of int type

 User-defined abstract data types must have the same characteristics as built-in abstract data types
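As a minimal sketch of the two restrictions above, the C++ class below defines a small set-of-integers ADT. The name IntSet, its operations, and the sorted-vector representation are assumptions made for illustration, not part of the lecture. Representation and operations sit in one syntactic unit, and clients can reach the data only through the exported operations.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical ADT: a set of ints. Clients see only the public operations.
class IntSet {
public:
    // Insert a value; duplicates are ignored.
    void insert(int value) {
        auto pos = std::lower_bound(items_.begin(), items_.end(), value);
        if (pos == items_.end() || *pos != value) {
            items_.insert(pos, value);
        }
    }

    // Report whether the value is in the set.
    bool contains(int value) const {
        return std::binary_search(items_.begin(), items_.end(), value);
    }

    // Number of distinct values stored.
    std::size_t size() const { return items_.size(); }

private:
    // Representation: a sorted vector. It could be swapped for a tree or a
    // hash table without changing any client code.
    std::vector<int> items_;
};
```

A client declares IntSet variables just as it declares int variables, and code such as s.items_ outside the class is rejected by the compiler.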

Data Abstraction (continued)

 ADTs provide mechanisms to limit visibility

– public part indicates what can be seen (and used) from outside

» what is exported

– private part describes what will be hidden from clients

» made available to allow compiler to determine needed information

» C++ allows specified program units access to the private information

• friend functions and classes
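The public part, the private part, and the friend escape hatch can be sketched in C++ as follows. Account and audit_total are hypothetical names chosen for this example only.

```cpp
// Hypothetical class used only for illustration.
class Account {
public:
    explicit Account(long cents) : balance_cents_(cents) {}

    // Exported operation: part of the public interface.
    void deposit(long cents) { balance_cents_ += cents; }

    // A named non-member function that is granted access to private members.
    friend long audit_total(const Account& a, const Account& b);

private:
    long balance_cents_;  // hidden from ordinary clients
};

// Not a member of Account, yet it may read balance_cents_ because it is a friend.
long audit_total(const Account& a, const Account& b) {
    return a.balance_cents_ + b.balance_cents_;
}
```

Any other non-member function that tried to read balance_cents_ would fail to compile, so the set of units that depend on the representation stays explicit in the class definition.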



Language Issues for ADTs

 Language requirements for data abstraction

– a syntactic unit in which to encapsulate the type definition.

– a method of making type names and subprogram headers visible to clients, while hiding actual definitions

» public/private

– some primitive operations must be built into the language processor (usually just assignment and comparisons for equality and inequality)

» some operations are commonly needed, but must be defined by the type designer

» e.g., iterators, constructors, destructors

 Can put ADTs in PL

– as a type definition extended to include operations (C++)

» use directly to declare variables

– as a collection of objects and operations (Ada)

» may need to be instantiated before declaring variables

Language Issues for ADTs (continued)

 Language design issues

– encapsulate a single type, or something more?

– what types can be abstract?

– can abstract types be parameterized?

– how are imported types and operations qualified?

 Simula 67 was the first language to address this issue

– classes provided encapsulation, but no information hiding


Data Abstraction in Ada

 Abstraction mechanism is the package

 Each package has two pieces (can be in same or separate files)

– specification

» public part

» private part

– body

» implementation of all operations exported in public part

» may include other procedures, functions, type and variable declarations, which are hidden from clients

• all variables are static

» may provide initialization section

• executed when declaration involving package is elaborated

 Any type can be exported

 Operations on exported types may be restricted

– private (:=, =, /=, plus operations exported)

– limited private (only operations exported)


Data Abstraction in Ada (continued)

 Evaluation

– exporting any type as private is good

» cost is recompilation of clients when the representation is changed

– can’t import specific entities from other packages

– good facilities for separate compilation



Data Abstraction in C++

 Based on C struct type and Simula 67 classes

 Class is the encapsulation device

– all instances of a class share a single copy of the member functions

– each instance of a class has its own copy of the class data members

– instances can be static, stack-dynamic (semidynamic), or explicit heap-dynamic

 Information Hiding

– private clause for hidden entities

– public clause for interface entities

– protected clause - for inheritance


Data Abstraction in C++ (continued)

 Constructors

– functions to initialize the data members of instances

– may also allocate storage if part of the object is heap-dynamic

– can include parameters to provide parameterization of the objects

– implicitly called when an instance is created

» can be explicitly called

– name is the same as the class name

 Destructors

– functions to clean up when an instance is destroyed; usually just to reclaim heap storage

– implicitly called when the object’s lifetime ends

» can be explicitly called

– name is the class name, preceded by a tilde (~)
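The constructor and destructor rules above can be illustrated with a small sketch. The Buffer class, its live_count counter, and the helper count_inside_scope are hypothetical and exist only to make the implicit calls observable.

```cpp
#include <cstddef>

// Hypothetical class: a fixed-capacity buffer whose storage is heap-dynamic.
class Buffer {
public:
    // Constructor: same name as the class; the parameter sizes this instance.
    explicit Buffer(std::size_t capacity)
        : data_(new int[capacity]()), capacity_(capacity) {
        ++live_count;
    }

    // Destructor: class name preceded by a tilde; reclaims the heap storage.
    ~Buffer() {
        delete[] data_;
        --live_count;
    }

    // Copying is disabled so two objects never own the same array.
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    std::size_t capacity() const { return capacity_; }

    // Number of Buffer objects currently alive (for demonstration only).
    static int live_count;

private:
    int* data_;
    std::size_t capacity_;
};

int Buffer::live_count = 0;

// Returns the live count observed while a local Buffer exists.
int count_inside_scope() {
    Buffer local(16);           // constructor is called implicitly here
    return Buffer::live_count;  // destructor runs implicitly as the scope ends
}
```

Under these assumptions, count_inside_scope() observes one live object, and the count is back to zero once the call returns, without any explicit call to either function.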


Data Abstraction in C++ (continued)

 Friend functions

– grant specified unrelated units or functions access to private members

 Evaluation

– classes are similar to Ada packages for providing abstract data types

– difference is packages are encapsulations, whereas classes are types


Data Abstraction in Java

 Similar to C++ except

– all user-defined types are classes

– all objects are allocated from the heap and accessed through reference variables

– individual entities in classes (methods and variables) have access control modifiers (public or private), rather than C++ clauses

– functions can only be defined in classes

– Java has a second scoping mechanism, package scope, that is used instead of friends

» all entities in all classes in a package that don’t have access control modifiers are visible throughout the package


Parameterized ADTs

 Ada generic packages may be parameterized with

– type of element stored in data structure

– operators among those elements

 Must be instantiated before declaring variables

– instantiation of generic behaves like text substitution

– package BST_Integer is new binary_search_tree(INTEGER)

» like text of generic package substituted here, with parameters substituted

– EXCEPT references to non-local variables, etc. are resolved as if they occurred at the point where the generic was declared

 With multiple instantiations, need to disambiguate when declaring variables of exported types

– package BST_Real is new binary_search_tree(REAL)

– tree1: BST_Integer.bst;

– tree2: BST_Real.bst;


Parameterized ADTs (continued)

 C++

– classes can be somewhat generic by writing parameterized constructor functions

    stack (int size) {
        stk_ptr = new int [size];
        max_len = size - 1;
        top = -1;
    }

    stack stk(100);

– class itself may be parameterized as a templated class

 Java

– early versions of Java didn’t support generic abstract data types (generics were added in Java 5.0)
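A templated class might look like the sketch below. It reuses the member names stk_ptr and max_len from the constructor on this slide. The class name Stack, the top_index member (renamed so it does not clash with the top() operation), and the choice of operations are assumptions for illustration.

```cpp
#include <cassert>

// Sketch of a stack parameterized by element type T.
template <typename T>
class Stack {
public:
    // The constructor parameter still sets the capacity of each instance.
    explicit Stack(int size)
        : stk_ptr(new T[size]), max_len(size - 1), top_index(-1) {}

    ~Stack() { delete[] stk_ptr; }

    // Copying is disabled so two stacks never share one array.
    Stack(const Stack&) = delete;
    Stack& operator=(const Stack&) = delete;

    bool empty() const { return top_index == -1; }
    bool full() const { return top_index == max_len; }

    void push(const T& value) {
        assert(!full());
        stk_ptr[++top_index] = value;
    }

    void pop() {
        assert(!empty());
        --top_index;
    }

    const T& top() const {
        assert(!empty());
        return stk_ptr[top_index];
    }

private:
    T* stk_ptr;     // heap-dynamic array of elements
    int max_len;    // index of the last usable slot
    int top_index;  // index of the current top, -1 when empty
};
```

A declaration such as Stack<int> stk(100); instantiates the template for int. Unlike an Ada generic package, no separate named instantiation is required before declaring variables.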


Object-Oriented Programming

 The problem with Abstract Data Types is that they are static

– can’t modify types or operations

» except for generics/templates

– means extra work to modify existing ADTs

 Object-oriented programming (OOP) languages extend data abstraction ideas to

– allow hierarchies of abstractions

– make modifying existing abstractions for other uses very easy

 Leads to new approach to programming

– identify real world objects of problem domain and processing required of them

– create simulations of those objects, processes, and the communication between them by modifying existing objects whenever possible


Object-Oriented Programming (continued)

 Two approaches to designing OOPL

– start from scratch (Smalltalk 1972!!)

» allows cleaner design

» better integration of object features

» no installed base

– modify an existing PL (C++, Ada 95)

» can build on body of existing code

» OO features usually not as smoothly integrated

» backward compatibility preserves warts from initial language design


OOPL Components

 Object : encapsulated operations plus local variables that define an object’s state

– state is retained between executions

– objects send and receive messages

 Messages : requests from sender to receiver to perform work

– can be parameterized

– in pure OOPL are also objects

– return results

 Methods : descriptions of operations to be done when a message is received

 Classes : templates for objects, with methods and state variables

– objects are instantiations of classes

– classes are also objects (have instantiation method to create new objects)

Fundamental Properties of OO Model

 Abstract Data Types

– encapsulation into a single syntactic unit that includes operations and variables

– also information hiding capabilities

 Inheritance

– fundamental defining characteristic of OOPL

– classes are hierarchical

» subclass/superclass or parent/derived

» classes lower in the hierarchy inherit variables and methods of ancestor classes

» can redefine those, or add additional, or eliminate some

– single inheritance (tree structure) or multiple inheritance (acyclic graph)

» with single inheritance, can talk about a root class
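A minimal C++ sketch of single inheritance follows. Employee and Manager are hypothetical classes. The subclass inherits name() unchanged, redefines monthly_pay(), and adds reports().

```cpp
#include <string>
#include <utility>

// Parent (superclass): hypothetical example type.
class Employee {
public:
    Employee(std::string name, int salary)
        : name_(std::move(name)), salary_(salary) {}
    virtual ~Employee() = default;

    const std::string& name() const { return name_; }         // inherited as is
    virtual int monthly_pay() const { return salary_ / 12; }  // may be redefined

private:
    std::string name_;

protected:
    int salary_;  // visible to subclasses, hidden from other clients
};

// Child (subclass): inherits name(), redefines monthly_pay(), adds reports().
class Manager : public Employee {
public:
    Manager(std::string name, int salary, int reports)
        : Employee(std::move(name), salary), reports_(reports) {}

    int monthly_pay() const override { return salary_ / 12 + 100 * reports_; }
    int reports() const { return reports_; }

private:
    int reports_;
};
```

The bonus of 100 per report is an arbitrary figure chosen for the example.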


Fundamental Properties of OO Model (continued)



 Polymorphism

– special kind of dynamic binding

» message to method

– same message can be sent to different objects, and the object will respond properly

– similar to function overloading except

» overloading is static (known at compile time)

» polymorphism is dynamic (class of object known at run time)
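The contrast between overloading and polymorphism can be sketched in C++. Animal, Dog, kind, and the two helper functions are hypothetical names. Both helpers look at the same Dog object through a reference whose declared type is Animal.

```cpp
#include <string>

struct Animal {
    virtual ~Animal() = default;
    virtual std::string speak() const { return "..."; }
};

struct Dog : Animal {
    std::string speak() const override { return "Woof"; }
};

// Overloading: the compiler picks one of these from the static (declared) type.
std::string kind(const Animal&) { return "animal"; }
std::string kind(const Dog&) { return "dog"; }

// Overload resolution sees only the declared type Animal.
std::string overload_choice() {
    Dog d;
    const Animal& a = d;
    return kind(a);
}

// The virtual call is bound at run time to the class of the object, Dog.
std::string dynamic_choice() {
    Dog d;
    const Animal& a = d;
    return a.speak();
}
```

Here overload_choice() yields "animal" because the choice was fixed at compile time, while dynamic_choice() yields "Woof" because the message is bound to a method when the call executes.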


Comparison with Data Abstraction

 Class == generic package

 Object == instantiation of generic

– actually, closer to instance of exported type

 Messages == calls to operations exported by ADT

 Methods == bodies (code) for operations exported by ADT


– data abstraction mechanism allows only one level of generic/instantiation

– OO model allows multiple levels of inheritance

– no dynamic binding of method invocation in ADTs


OOP Language Design Issues

 Exclusivity of objects

– everything is an object

» elegant and pure, but slow for primitive types

– add objects to a complete typing system

» fast for primitive types, but confusing

– include an imperative style typing system for primitive types, but everything else is an object

» relatively fast, and less confusion

 Are subclasses subtypes?

– does an “is a” relationship hold between parent and child classes?

OOP Language Design Issues (continued)


 Interface or implementation inheritance?

– if only interface of parent class is visible to subclass, interface inheritance

» may be inefficient

– if interface and implementation visible to subclass, implementation inheritance

 Type checking and polymorphism

– if overriding methods must have the same parameter types and return type, checking may be static

– otherwise, dynamic type checking is needed, which is slow and delays error detection

 Single or multiple inheritance

– multiple is extremely convenient

– multiple also makes the language and implementation more complex, and is less efficient

OOP Language Design Issues (continued)


 Allocation and deallocation of objects

– if all objects are allocated from heap, references to them are uniform (as in Java)

– is deallocation explicit (heap-dynamic objects in C++) or implicit (Java)?

 Should all binding of messages to methods be dynamic?

– if yes, inefficient

– if none are, great loss of flexibility


Smalltalk 80

 Smalltalk is the prototypical pure OOPL

 All entities in a program are objects

– referenced by pointers

 All computation is done by sending messages (perhaps parameterized by object names) to objects

– message invokes a method

– reply returns result to sender, or notifies that action has been done

 Also incorporates graphical programming environment

– program editor

– compiler

– class library browser

» with associated classes

– also written in Smalltalk

» can be modified

Smalltalk 80 (continued)

 Messages

– object to receive message

– message

» method to invoke

» possibly parameters

 Unary messages

– specify only object and method

– firstAngle sin

» invokes sin method of firstAngle object

 Binary messages

– infix order

– total / 100

» sends message / 100 to object total

» which invokes / method of total with parameter 100


Smalltalk 80 (continued)

 Keyword messages

– indicate parameter values by specifying keywords

– keywords also identify the method

– firstArray at: 1 put: 5

» invokes at:put: method of firstArray with parameters 1 and 5

 Message expressions

– messages may be combined in expressions

» unary have highest precedence, then binary, then keyword

» associate left to right

» order may be specified by parentheses

– messages may be cascaded

» ourPen home; up; goto: 500@500

» equivalent to

    ourPen home.
    ourPen up.
    ourPen goto: 500@500



Smalltalk 80 (continued)

 Assignment

– object <- object

– index <- index + 5

 Blocks

– unnamed objects specified by [ <expressions> ]

» expressions are separated by .

– evaluated when they are sent the value message

» always in the context of their definition

– may be assigned to variables

» foo <- [ ... ]

 Logical loops

– blocks may contain conditions

– all blocks have a whileTrue: method

– sends value to condition block

– evaluates body block if result is true

[ <logical condition> ] whileTrue:

[ <body of loop> ]

Smalltalk 80 (continued)

 Iterative loops

– all integer objects have a timesRepeat: method

    12 timesRepeat: [ ... ]

– also have

» to:do:

» to:by:do:

– a block is the loop body

    6 to: 10 do: [ ... ]

 Selection

– true and false are also objects

– each has ifTrue:, ifFalse:, ifTrue:ifFalse:, and ifFalse:ifTrue: methods

    total = 0               "returns true or false object"
        ifTrue: [ ... ]     "true object executes this; false ignores"
        ifFalse: [ ... ]    "false object executes this; true ignores"



Smalltalk 80 (continued)

 Dynamic binding

– when a message arrives at an object, the class of which the object is an instance is searched for a corresponding method

– if not there, search superclass, etc.

 Only single inheritance

– every class is an offspring of the root class Object

 Evaluation

– simple, consistent syntax

– relatively slow

» message passing overhead for all control constructs

» dynamic binding of message to method

– because of dynamic binding, type errors are detected only at run time
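The method search described under dynamic binding can be simulated in C++. This is only a model of the lookup, not how a Smalltalk system is implemented. ClassInfo, Instance, send, and the demo hierarchy in send_to_point are all hypothetical.

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

// A run-time "class": a method dictionary plus a link to its superclass.
struct ClassInfo {
    std::string name;
    const ClassInfo* superclass;  // nullptr for the root class
    std::map<std::string, std::function<std::string()>> methods;
};

// An object only records which class it is an instance of.
struct Instance {
    const ClassInfo* cls;
};

// Deliver a message: search the receiver's class, then each superclass in turn.
std::string send(const Instance& receiver, const std::string& selector) {
    for (const ClassInfo* c = receiver.cls; c != nullptr; c = c->superclass) {
        auto found = c->methods.find(selector);
        if (found != c->methods.end()) {
            return found->second();
        }
    }
    // No class in the chain understands the message: a run-time type error.
    throw std::runtime_error("doesNotUnderstand: " + selector);
}

// Demo hierarchy (hypothetical): Point is a subclass of the root class Object.
std::string send_to_point(const std::string& selector) {
    ClassInfo object{"Object", nullptr,
                     {{"printString", [] { return std::string("an Object"); }}}};
    ClassInfo point{"Point", &object,
                    {{"x", [] { return std::string("3"); }}}};
    Instance p{&point};
    return send(p, selector);
}
```

Sending x is answered by Point itself, printString is found one level up in Object, and an unknown selector fails only when the message is actually sent. That cost and that late failure are the two drawbacks listed in the evaluation.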



C++

 Essentially all variable declarations, types, and control structures are those of C

 C++ classes represent an addition to type structure of C

 Inheritance

– multiple inheritance allowed

– classes may be stand-alone

– three information hiding modes

» public: everyone may access

» private: no one else may access

» protected: class and subclasses may access

– when deriving a class from a base class, specify a protection mode

» public mode: public, protected, and private are retained in subclass

» private mode: inherited public and protected members become private in subclass

• may reexport public members of base class
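The two derivation modes can be compared in a short sketch. IntList, AuditedList, and IntStack are hypothetical classes. The using declaration is one way C++ lets a privately derived class re-export a public member of its base class.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical base class with a wide public interface.
class IntList {
public:
    void push_back(int v) { items_.push_back(v); }
    void push_front(int v) { items_.insert(items_.begin(), v); }
    int back() const { return items_.back(); }
    void pop_back() { items_.pop_back(); }
    std::size_t size() const { return items_.size(); }

private:
    std::vector<int> items_;  // private: not accessible even to subclasses
};

// Public derivation: IntList's public members stay public in the subclass.
class AuditedList : public IntList {
public:
    bool is_empty() const { return size() == 0; }
};

// Private derivation: inherited members become private, so clients cannot
// call push_front on a stack.
class IntStack : private IntList {
public:
    void push(int v) { push_back(v); }
    int top() const { return back(); }
    void pop() { pop_back(); }
    using IntList::size;  // re-export one public member of the base class
};
```

A client can call push_front on an AuditedList, but the same call on an IntStack does not compile, which keeps the stack discipline intact.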


C++ (continued)

 Dynamic binding

– C++ member functions are statically bound unless the function definition is identified as virtual

– if virtual function name is called with a pointer or reference variable with the base class type, which member function to execute must be determined at run-time

– pure virtual functions are set to 0 in class header

» must be redefined in derived classes

– classes containing a pure virtual function can never be instantiated directly

» must be derived
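A minimal sketch of virtual and pure virtual functions follows. Shape, Rectangle, RightTriangle, and total_area are hypothetical names chosen for the example.

```cpp
#include <memory>
#include <type_traits>
#include <vector>

// Abstract class: area() is pure virtual, written "= 0" in the class.
class Shape {
public:
    virtual ~Shape() = default;
    virtual double area() const = 0;
};

// A class with a pure virtual function cannot be instantiated directly.
static_assert(std::is_abstract<Shape>::value, "Shape is abstract");

class Rectangle : public Shape {
public:
    Rectangle(double w, double h) : w_(w), h_(h) {}
    double area() const override { return w_ * h_; }

private:
    double w_, h_;
};

class RightTriangle : public Shape {
public:
    RightTriangle(double base, double height) : base_(base), height_(height) {}
    double area() const override { return 0.5 * base_ * height_; }

private:
    double base_, height_;
};

// Each call goes through a pointer of base-class type, so which area() to
// execute is determined at run time.
double total_area(const std::vector<std::unique_ptr<Shape>>& shapes) {
    double sum = 0.0;
    for (const auto& s : shapes) {
        sum += s->area();
    }
    return sum;
}
```

If area() were not declared virtual, the call s->area() would be bound statically to the base-class version.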



Java

 General characteristics

– all data are objects except the primitive types

– all primitive types have wrapper classes that store one data value

– all objects are heap-dynamic, referenced through reference variables, and most are explicitly allocated

 Inheritance

– single inheritance only

» but implementing interface can provide some of the benefits of multiple inheritance

» an interface can include only method declarations and named constants

    public class Clock extends Applet implements Runnable

– methods can be final (can’t be overridden)


Java (continued)

 Dynamic binding is the default

– except for final methods

 Package provides additional encapsulation mechanism

– packages are a container for related classes

– entities defined without an access modifier (private, protected, public) have package scope

» visible throughout package but not outside

– similarly, protected entities are visible throughout package


Ada 95

 Type extension builds on derived types with tagged types

– tag associated with type identifies particular type

 Classes are packages with tagged types

    package Object_Package is
        type Object is tagged private;
        procedure Draw (O: in out Object);
    private
        type Object is tagged record
            X_Coord, Y_Coord: Real;
        end record;
    end Object_Package;


Ada 95 (continued)

 A new class may then be derived by using the reserved word new and extending the exported tagged type

 Overloading defines new methods

    with Object_Package; use Object_Package;
    package Circle_Package is
        type Circle is new Object with record
            Radius: Real;
        end record;
        procedure Draw (C: in out Circle);
    end Circle_Package;


Ada 95 (continued)

 Derived packages form tree of classes

 Can refer to type and all types beneath it in tree by type'Class

– Object'Class

– Square'Class

 These class-wide types are then used as parameters to procedures to provide dynamic binding of procedure invocation

    procedure foo (OC: Object'Class) is
    begin
        Area(OC);   -- which Area is determined at run time
    end foo;



[Figure: tree of derived classes, including nodes Square and Ellipse]


Ada 95 (continued)

 Pure abstract base types are defined using the word abstract in type and subprogram definitions

    package World is
        type Thing is abstract tagged null record;
        function Area(T: in Thing) return Real is abstract;
    end World;

    with World; use World;
    package My_World is
        type Object is new Thing with record ... end record;
        function Area(O: in Object) return Real is ... end Area;
        type Circle is new Object with record ... end record;
        function Area(C: in Circle) return Real is ... end Area;
    end My_World;


Comparing C++ and Smalltalk

 Inheritance

– C++ provides greater flexibility of access control

– C++ provides multiple inheritance

» good or bad?

 Dynamic vs. static binding

– Smalltalk full dynamic binding with great flexibility

– C++ allows programmer to control binding time

» virtual functions; all versions of a virtual function must have the same return type

 Control

– Smalltalk does everything through message passing

– C++ provides conventional control structures



Comparing C++ and Smalltalk (continued)

 Classes as types

– C++ classes are types

» all instances of a class are the same type, and one can legally access the instance variables of another

– Smalltalk classes are not types, and the language is essentially typeless

– C++ provides static type checking, Smalltalk does not

 Efficiency

– C++ is substantially more efficient in run-time CPU and memory requirements

 Elegance

– Smalltalk is consistent, fundamentally object-oriented

– C++ is a hybrid language in which compatibility with C was an essential design consideration

Comparing C++ and Ada 95

 Ada 95 has more consistent type mechanism

– C++ has C type structure, plus classes

 C++ provides cleaner multiple inheritance

 C++ must make dynamic/static function invocation decision at time root class is defined

– must be virtual function

– Ada 95 allows that decision to be made at time derived class is defined

 C++ allows dynamic binding only for pointers and reference types

 Ada 95 doesn’t provide constructor and destructor functions

– initialization and cleanup subprograms can be written, but must be explicitly invoked
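The point that C++ binds dynamically only through pointers and references can be seen in a short sketch. Base, Derived, and the three helper functions are hypothetical.

```cpp
#include <string>

struct Base {
    virtual ~Base() = default;
    virtual std::string who() const { return "Base"; }
};

struct Derived : Base {
    std::string who() const override { return "Derived"; }
};

// Pass by value: the parameter is a fresh Base object copied from the
// argument, so the Base version of who() runs.
std::string by_value(Base b) { return b.who(); }

// Pass by reference or pointer: the call is bound to the class of the
// object actually referred to.
std::string by_reference(const Base& b) { return b.who(); }
std::string by_pointer(const Base* b) { return b->who(); }
```

Given a Derived object d, by_value(d) reports "Base" while by_reference(d) and by_pointer(&d) report "Derived", even though who() is virtual in all three cases.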



Comparing C++ and Java

 Java more consistent with OO model

– all classes must descend from Object

 No friend mechanism in Java

– packages provide cleaner alternative

 Dynamic binding is the “normal” way of binding messages to methods in Java

 Java allows single inheritance only

– but interfaces provide some of the same capability as multiple inheritance


Implementing OO Constructs

 Store state of an object in a class instance record (CIR)

– template known at compile time

– access instance variables by offset

– subclass instantiates CIR from parent before populating local instance variables

 CIR also provides a mechanism for accessing code for dynamically bound methods

– CIR points to a table (virtual method table) that contains pointers to the code for each dynamically bound method
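The class instance record and its method table can be modeled by hand with plain structs and function pointers. This is a sketch of the idea only. Real compilers choose their own layouts, and every name below (VTable, ShapeCIR, RectCIR, call_area, and so on) is hypothetical.

```cpp
struct ShapeCIR;

// One table per class: pointers to the code for each dynamically bound method.
struct VTable {
    int (*area)(const ShapeCIR*);
    int (*sides)(const ShapeCIR*);
};

// CIR of the parent class: table pointer first, then instance variables at
// offsets known at compile time.
struct ShapeCIR {
    const VTable* vtable;
    int width;
};

// CIR of a subclass: starts with the parent's layout, then adds its own
// instance variables.
struct RectCIR {
    ShapeCIR base;
    int height;
};

int square_area(const ShapeCIR* self) { return self->width * self->width; }
int four_sides(const ShapeCIR*) { return 4; }

int rect_area(const ShapeCIR* self) {
    // Valid because a RectCIR begins with its ShapeCIR part.
    const RectCIR* r = reinterpret_cast<const RectCIR*>(self);
    return r->base.width * r->height;
}

const VTable square_vtable = {square_area, four_sides};
const VTable rect_vtable = {rect_area, four_sides};  // area overridden, sides inherited

// A dynamically bound call: fetch the code pointer through the CIR's table.
int call_area(const ShapeCIR* obj) { return obj->vtable->area(obj); }
```

The caller of call_area never names a concrete class. The table pointer stored in each record selects the code, which is why one extra indirection per call is the usual cost of dynamic binding.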
