Types and Data types

Harmonized Model Management Group
Recommendation Paper
ISO TC 211
Best Practices
Datatypes, Interfaces and Types
John R. Herring
US/HMMG
1 Introduction
Three of the most commonly used stereotypes in the ISO TC 211 Harmonized Model are
DataType, Interface, and Type. They represent, in different ways, the two sides of an object:
datatypes represent state, interfaces represent behavior, and types are an interface with a canonical
partial state structure. The definitions and descriptions culled from the UML specifications (both
UML 1.x and UML 2.0) are:
«DataType»
A data type is a type whose values have no identity (i.e., they are pure values). … In the
metamodel, a DataType defines a special kind of Classifier in which Operations are all pure
functions (i.e., they can return DataValues but they cannot change DataValues, because they
have no identity). … A Primitive defines a predefined DataType, without any relevant UML
substructure (i.e., it has no UML parts). A primitive datatype may have a logical algebra of
operations and constraints defined outside of UML.
«Interface»
[A] named set of operations that characterize the behavior of an element. … In the metamodel, an
Interface contains a set of Operations that together define a service offered by a Classifier
realizing the Interface. A Classifier may offer several services, which means that it may realize
several Interfaces, and several Classifiers may realize the same Interface. … Interfaces may not
have Attributes, Associations, or Methods. An Interface may participate in an Association provided
the Interface cannot see the Association; that is, a Classifier (other than an Interface) may have
an Association to an Interface that is navigable from the Classifier but not from the Interface. …
All [operations] defined in an Interface are public.
«Type»
Specifies a domain of objects together with the operations applicable to the objects, without
defining the physical implementation of those objects. A type may not contain any methods,
maintain its own thread of control, or be nested. However, it may have attributes and associations.
The associations of a Type are defined solely for the purpose of specifying the behavior of the
type's operations and do not represent the implementation of state data.
Thus, an interface defines purely functional behaviors, consisting of operations and their
signatures. A data type defines an immutable value, usually in a programming language data
structure consisting directly of primitives or other data types. A type (or abstract type) consists of
operations, along with a logical data structure that defines some of the behavioral elements of the
concrete classes that implement (realize) it. In the extreme case, the data members of a type can
be sufficient to create a generic type constructor (or initializer) and a generic type query interface
that should be implementable by all concrete classes that realize it. This logical design pattern
for the three kinds of classifiers is:
17 February 2016
Created by Dr. John R. Herring
Page 1 of 7
[Figure: UML class diagram showing «interface» X-interface; «type» X-type (State;
+getData() : X-dataType; +setData(in data : X-dataType) : bool); «datatype» X-dataType;
and the class X.]
Figure 1-1: Common Interface, Type, DataType and Object Pattern
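The pattern in Figure 1-1 can be sketched in code. The following Python fragment is only an illustration under stated assumptions: it assumes that the class X realizes both X-interface and X-type, the field names `label` and `amount` and the operation `describe` are hypothetical, and Python abstract base classes and frozen dataclasses merely stand in for the UML stereotypes.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class XData:
    """«datatype» X-dataType: an immutable value (field names are hypothetical)."""
    label: str
    amount: float


class XInterface(ABC):
    """«interface» X-interface: operations only, no state."""

    @abstractmethod
    def describe(self) -> str:
        """Hypothetical behavioral operation."""


class XType(ABC):
    """«type» X-type: operations plus a logical (not physical) state structure."""

    @abstractmethod
    def get_data(self) -> XData:
        """Return the logical state as a value."""

    @abstractmethod
    def set_data(self, data: XData) -> bool:
        """Replace the logical state; report success."""


class X(XInterface, XType):
    """Concrete class: realizes both, and stores its state however it likes."""

    def __init__(self, label: str, amount: float) -> None:
        # Physical storage (a dict) deliberately differs from the XData layout.
        self._store = {"label": label, "amount": amount}

    def describe(self) -> str:
        return f"{self._store['label']}={self._store['amount']}"

    def get_data(self) -> XData:
        return XData(self._store["label"], self._store["amount"])

    def set_data(self, data: XData) -> bool:
        self._store = {"label": data.label, "amount": data.amount}
        return True


a = X("a", 1.0)
b = X("b", 2.0)
ok = a.set_data(b.get_data())  # copy b's value into a; the objects stay distinct
print(ok, a.describe(), a is b)
```

The point of the sketch is that only `XData` crosses the boundary between objects; the internal dictionary is never exposed.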
NOTE – A Word on Equivalences of Models: In defining a mechanism for modeling, ISO TC
211 must be careful not to let the peculiarities of particular tools’ implementations or
interpretations of UML, or any UML, XML or JAVA constraints that are not based on inescapable
logic, bias the way in which models are built. For example, the models in the ISO 191xx
standards use multiple inheritance even though JAVA and XML do not support it, and most
self-proclaimed object experts claim (based on implementation difficulties usually tied to a
particular language or programming environment) that it has ‘insurmountable’ problems.
JAVA’s use of Interfaces and XML’s use of choice blocks are alternate, albeit partial,
solutions to its implementation, and bypass the problems usually attributed to the practice.
2 Data Type
Data types are inherently transient unless contained inside a persistent object, and essentially
abstract because most if not all object management solutions require some form of identity based
on logical or physical storage location.
The transient nature of datatypes is due to their lack of identity. Since they cannot be identified,
they cannot be stored unless they are placed in an identifiable container (an object of another
class). Since they cannot be identified, only their container can point to them, because their
container is the only object that knows where the value is stored (because of encapsulation). The
closest thing to a data type in programming is the value of a C-structure – a collection of named
primitives and other C-structures. The C-structure is a simple transient container, identified by its
memory address or variable name, which can hold the value of a data type.
Data types are abstract in the sense that they can essentially never be stored unless they are in a
container of some object, either as an attribute of an object class or as a member of a strong
aggregation role of that class. Since data types are not identifiable except by their value, that
value must have a container whose value is changeable and expressible as the data type. In most
programming languages, the only data types are the primitives built into the language, such as
Integer, Real, String, and Boolean. If we declare a variable to be of such a type, as in
“X : Integer” then only a transient local slot is created that can contain an integer value. Further,
if the routine is recursive (can call itself directly or indirectly) then each copy of the stack for that
routine will have a different value for the data type slot “X”. In essence, X is not an Integer but
an Integer container (transient at that). The closest we get to pure data types is the temporary
constants that are created in expressions as in “X = 1 + 3”. Literally, inside the machine, the 1
and the 3 are created in machine registers, a ‘4’ is created as the output of an arithmetic
computation, and then that value is used to modify the ‘object’ X.
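The distinction between a value and the slot that holds it can be illustrated with a small Python sketch. The class `Money` and the function `depths` are hypothetical names chosen for the example, and a frozen dataclass is only an approximation of a data type, since every Python object technically still has an identity.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Money:
    """Hypothetical data type: a pure value built from primitives."""
    amount: int
    currency: str


p = Money(4, "EUR")
q = Money(1 + 3, "EUR")
same_value = (p == q)          # compared by value, not by storage location

try:
    p.amount = 5               # a value cannot be changed in place
    mutated = True
except FrozenInstanceError:
    mutated = False

slot = p                       # 'slot' is a container that currently holds the value
slot = Money(5, "EUR")         # the container changes; the value Money(4, "EUR") does not


def depths(n: int) -> list:
    x = n                      # each recursive call gets its own transient slot 'x'
    return [x] if n == 0 else depths(n - 1) + [x]


print(same_value, mutated, p, depths(2))
```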
All objects have an associated data type, consisting of the information that is stored internally to
the object (except for the identity of the object, which is not considered part of the object’s
value). In most programming languages, expressions such as “A = B” transfer the data type
information from one object (B) to another (A), going through whatever casting operations
(transformations between alternative representations) that are needed. Using the pattern in Figure
1-1, this is equivalent to A.X-type::setData(B.X-type::getData()).
If the two sequences of expressions (A = B, B = A) and (B = A, A = B) always get you back to
where you started, then the casting operations from A to B and from B to A are (as a set)
‘idempotent’ (more precisely, mutually inverse: applying one after the other changes nothing),
and the two classes have equivalent data types. This sort of abstract constructor/initializer
process is key to ISO 19118: Encoding. For example, the XML produced by an encoder is
essentially the content of the type’s associated data type. The source and target systems are
assumed to have equivalent but not equal object types for the encoded XML. In short, the
following has to work:
System1::X --encode--> XML::X --decode--> System2::X
That does not say that the two object classes involved are the same, since their behavior for other
operations and their internal data structure can be radically different.
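A minimal sketch of such a round trip, in Python, is given below. It is not the ISO 19118 encoding rules: the classes `System1X` and `System2X`, the element names, and the functions `encode` and `decode` are all hypothetical, and the standard library module `xml.etree.ElementTree` is used only for convenience. The two classes keep different internal structures but expose an equivalent data type.

```python
import xml.etree.ElementTree as ET


class System1X:
    """Hypothetical class X in system 1: stores a coordinate pair as a tuple."""

    def __init__(self, x: float, y: float) -> None:
        self._xy = (float(x), float(y))

    def get_data(self) -> dict:
        return {"x": self._xy[0], "y": self._xy[1]}


class System2X:
    """Hypothetical class X in system 2: same data type, different internals."""

    def __init__(self, x: float, y: float) -> None:
        self._z = complex(x, y)

    def get_data(self) -> dict:
        return {"x": self._z.real, "y": self._z.imag}


def encode(obj) -> str:
    """Write the object's data type content (not its identity) as XML."""
    root = ET.Element("X")
    for name, value in obj.get_data().items():
        ET.SubElement(root, name).text = repr(value)
    return ET.tostring(root, encoding="unicode")


def decode(xml_text: str, cls):
    """Construct an object of the target system's class from the XML."""
    root = ET.fromstring(xml_text)
    data = {child.tag: float(child.text) for child in root}
    return cls(data["x"], data["y"])


source = System1X(1.5, -2.0)
xml_text = encode(source)
target = decode(xml_text, System2X)
print(xml_text)
print(target.get_data() == source.get_data())
```

Decoding the output of `encode(target)` back into `System1X` yields the same data again, which is the equivalence the text requires.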
Actually, they usually are not behaviorally different. A data type comes with a certain inherent
semantics, defined by its operations, attributes and constraints. For example, the Positive Integers
can be defined by an axiomatic set called Peano's Axioms, which combined with additional
axioms to describe subtraction give you the full arithmetic. Any representations that satisfy these
axioms are mathematically equivalent. The most common example in computer science is the
difference between ones-complement and twos-complement Integers, which are equivalent (in
their common domain) but incompatible representations of the Integers.
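The ones-complement and twos-complement example can be made concrete with a short Python sketch. The 8-bit word size and all function names are assumptions made for the illustration. Within the common domain of -127 to 127, both encodings give the same arithmetic, although the bit patterns for negative numbers differ.

```python
BITS = 8
MASK = (1 << BITS) - 1          # 0xFF
SIGN = 1 << (BITS - 1)          # 0x80


def to_twos(n: int) -> int:
    return n & MASK


def from_twos(b: int) -> int:
    return b - (1 << BITS) if b & SIGN else b


def to_ones(n: int) -> int:
    return n if n >= 0 else ~(-n) & MASK


def from_ones(b: int) -> int:
    return -(~b & MASK) if b & SIGN else b


def add_twos(a: int, b: int) -> int:
    return (a + b) & MASK


def add_ones(a: int, b: int) -> int:
    s = a + b
    return (s & MASK) + 1 if s > MASK else s   # end-around carry


# Same abstract value, different representations.
print(format(to_ones(-5), "08b"), format(to_twos(-5), "08b"))

# Same arithmetic once decoded.
print(from_ones(add_ones(to_ones(-5), to_ones(3))),
      from_twos(add_twos(to_twos(-5), to_twos(3))))
```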
Because of the lack of identity, a datatype cannot be in any but a strong aggregation. Because of
a UML limitation, the aggregation cannot be backwardly navigable. ISO TC 211 has made an
exception to this last rule. A datatype can have an outward pointing association if one of the
following two conditions is true:
1. The target of the association is another datatype and the association is a strong
aggregation
2. The target of the association is a well-known immutable object for which a universally
recognized identifier exists.
In case 1, the structure is equivalent to a member attribute of the data type. In case 2, the
structure is equivalent to the datatype having a member attribute whose value is the identity of
the target. For example, from ISO 19107, the datatype DirectPosition has a relation to SC_CRS.
This is equivalent to DirectPosition having an attribute of type CharacterString (or a namespace
enhanced name such as GenericName from 19103, or RS_Identifier from ISO 19115) to hold the
coordinate system identity. Technically this is a violation of UML rules, but essentially it does
not violate the intent. Figure 2-1 shows the two alternatives. The first is directly from ISO 19107,
and the second is the fully UML-compliant equivalent model. The diagrams are from a model
drawn in Enterprise Architect, which uses UML 1.4 notation not supported in Rational Rose.
This is not a very big issue, since the current ISO TC 211 method for using Rational Rose has
equivalent characteristics.
[Figure: two UML class diagrams. First alternative (from ISO 19107): «DataType» DirectPosition
(+coordinate: Sequence<Number>, +/dimension: Integer) with an association to the abstract class
SC_CRS (shown with RS_ReferenceSystem; attributes +kindCode: SC_KindCode, +remarks:
CharacterString), with roles +directPosition [0..*] and +coordinateReferenceSystem [0..1].
Second alternative: «DataType» DirectPosition2 (+coordinate: Number [1..* ordered],
+/dimension: Integer, +coordinateReferenceSystem: GenericName [0..1]).]
Figure 2-1 DirectPosition example: Use of Associations by datatypes
In general, if a datatype in the ISO 191xx documents has an association role named
“referenceToB” pointing to a type “B”, then it should be replaceable by an attribute
“referenceToB” of type CharacterString (or similar type) that will contain the identity of a
logically immutable instance of type “B”.
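The replacement of an association by an identifier-valued attribute can be sketched in Python as follows. This is an illustration only: a plain string stands in for GenericName, and the dictionary `CRS_REGISTRY` is a hypothetical stand-in for whatever registry resolves a well-known identifier to the immutable SC_CRS instance.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DirectPosition2:
    """Datatype holding only values: coordinates plus the identity of a CRS."""
    coordinate: Tuple[float, ...]
    coordinate_reference_system: Optional[str] = None  # identifier, not a pointer

    @property
    def dimension(self) -> int:
        # Derived attribute: determined entirely by the coordinate sequence.
        return len(self.coordinate)


# Hypothetical registry of well-known, immutable reference systems.
CRS_REGISTRY = {"EPSG:4326": "WGS 84"}

p = DirectPosition2((10.0, 20.0), "EPSG:4326")
q = DirectPosition2((10.0, 20.0), "EPSG:4326")
print(p == q, p.dimension, CRS_REGISTRY[p.coordinate_reference_system])
```

Because the attribute carries an identifier rather than a reference, two positions with the same content compare equal as values.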
3 Interface
The UML specification makes several statements about interfaces that describe their nature and
use. Some of them are:

• An interface is only a collection of operations with a name.
• It cannot be directly instantiated.
• The purpose of an interface is to collect a set of operations that constitute a coherent
service offered by classifiers.
• Interfaces provide a way to partition and characterize groups of operations.
• An interface does not imply any internal structure of the realizing classifier. For example,
it does not define which algorithm to use for realizing an operation.
• Several classifiers may realize the same interface.
• The relationship between interface and class is not necessarily one-to-one; a class may
offer several interfaces and one interface may be offered by more than one class.
• The same operation may be defined in multiple interfaces that a class supports; if their
specifications are identical, then there is no conflict; otherwise, the model is ill formed.
• Moreover, a class may contain additional operations besides those found in its interfaces.
• [A] classifier offering the interface must provide not only the operations declared in the
interface but also those declared in the ancestors of the interface.
Along with types, which are similar, interfaces give a free structure to define behavior without
the worry of creating contradictions by defining data structure. Since interfaces are just sets of
protocols for operations (no methods, no data), there are no logical problems with any form of
inheritance. A concrete class must implement every operation in all of the interfaces it realizes,
and so it must implement the union of the operation protocols defined by any interface it directly
realizes or is associated to transitively by some form of inheritance.
Interface classifiers can be used in operation protocols if the operation only depends on the
values that would be returned by the interface operations.
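These properties of interfaces can be sketched in Python, with abstract base classes standing in for «Interface» classifiers. All names here (`Named`, `Measurable`, `NamedMeasurable`, `Road`, `Incomplete`, `label`) are hypothetical. The sketch shows multiple inheritance among interfaces, a realizing class that must supply the union of the inherited operations, and an operation whose protocol depends only on an interface.

```python
from abc import ABC, abstractmethod


class Named(ABC):
    @abstractmethod
    def name(self) -> str:
        """Operation protocol only: no method body, no data."""


class Measurable(ABC):
    @abstractmethod
    def length(self) -> float:
        """Operation protocol only."""


class NamedMeasurable(Named, Measurable):
    """Interface inheriting from two interfaces; harmless, since there is no state."""


class Road(NamedMeasurable):
    """Realizing class: must provide the union of all inherited operations."""

    def __init__(self, name: str, length: float) -> None:
        self._name, self._length = name, length

    def name(self) -> str:
        return self._name

    def length(self) -> float:
        return self._length

    def surface(self) -> str:
        return "asphalt"      # additional operation beyond the interfaces


class Incomplete(NamedMeasurable):
    """Provides only part of the union, so it cannot be instantiated."""

    def name(self) -> str:
        return "incomplete"


def label(item: Named) -> str:
    # Depends only on values returned by the interface's operations.
    return item.name().upper()


print(label(Road("a1", 3.5)))
```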
4 Type
The type is a partial behavioral definition of an object, just as the data type is the structural
definition of an object. It is similar to an interface in the mechanism in which it is realized by
implementation classes. Types are abstract in that they can never be instantiated (they have no
methods and no directly defined internal data structure).
In general, the attributes and association roles associated to a type are abstract, and each
implementation of a type can be different. For example, if a type has an attribute “point” of type
DirectPosition, then the implementation class must have a way to get and set this value as if it
were an attribute. If the implied semantics of a type only require an attribute to be readable, the
attribute declaration should be prefaced by the stereotype «readonly» and the implementations
need only implement the get operation for that attribute. A similar mechanism is to mark the
attribute as ‘derived’, which adds the semantics that the attribute can be determined by the value
of other attributes (not necessarily of the type), so that its value can only be affected indirectly.
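The abstract nature of a type's attributes can be sketched with Python properties. The names `LocatedType`, `PolarLocated` and `distance_from_origin` are hypothetical, and the two-field `DirectPosition` is a simplification of the ISO 19107 datatype. The implementation class stores polar coordinates, yet `point` can be read and set as if it were an attribute, while the derived attribute offers only a get operation.

```python
import math
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class DirectPosition:
    """Simplified stand-in for the ISO 19107 datatype."""
    x: float
    y: float


class LocatedType(ABC):
    """Stands in for a «Type»: attributes are declared, not implemented."""

    @property
    @abstractmethod
    def point(self) -> DirectPosition:
        """Readable and settable in every implementation."""

    @property
    @abstractmethod
    def distance_from_origin(self) -> float:
        """Derived: determined by 'point', so only a get operation exists."""


class PolarLocated(LocatedType):
    """Implementation class whose physical attributes differ from the type's."""

    def __init__(self, r: float, theta: float) -> None:
        self._r, self._theta = r, theta

    @property
    def point(self) -> DirectPosition:
        return DirectPosition(self._r * math.cos(self._theta),
                              self._r * math.sin(self._theta))

    @point.setter
    def point(self, value: DirectPosition) -> None:
        self._r = math.hypot(value.x, value.y)
        self._theta = math.atan2(value.y, value.x)

    @property
    def distance_from_origin(self) -> float:
        return self._r


obj = PolarLocated(1.0, 0.0)
obj.point = DirectPosition(3.0, 4.0)   # changes the derived attribute indirectly
print(obj.distance_from_origin)
```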
The UML specification makes several statements about types that describe their nature and use.
Some of them are:
• [A type] specifies a domain of objects together with the operations applicable to the
objects, without defining the physical implementation of those objects.
• A type may not contain any methods, maintain its own thread of control, or be nested.
However, it may have attributes and associations.
• The associations of a Type are defined solely for the purpose of specifying the behavior
of the type's operations and do not represent the implementation of state data.
• Although an object may have at most one Implementation Class, it may conform to
multiple different Types.
• An Implementation class is said to realize a Type if it provides all of the operations
defined for the Type with the same behavior as specified for the Type’s operations.
• An Implementation Class may realize a number of different Types.
• [The] physical attributes and associations of the Implementation class do not have to be
the same as those of any Type it realizes and the Implementation Class may provide
methods for its operations in terms of its physical attributes and associations.
ISO/TS 19103, Clause 6.3 (Classes), states:
A class according to this Technical Specification is viewed as a specification and not as
an implementation. Attributes are considered abstract and do not have to be directly
implemented (i.e. as fields in a record or instance variables in an object). This is not in
conflict with the process of encoding as described in ISO 19118, as this describes an
external representation that does not have to be equivalent to the internal representation.
For each class defined according to this Technical Specification, the set of attributes
defined with this class, together with the sets of attributes of classes that are reachable
directly or indirectly via associations, shall be sufficient to fully support the
implementation of each operation defined for this particular class.
This clause essentially says the classes in all ISO TC 211 documents are to be considered as if
they were marked with the stereotype «Type». To ensure this behavior, most classes in the HM
should be «Type» stereotyped.
5 Rules
1. Interfaces must be marked with the stereotype «Interface».
2. «Interface» classifiers must not inherit or be inherited from anything other than another
«Interface» classifier. All others may ‘realize’ the «Interface» classifier.
3. «Interface» classifiers must not realize «Type» classifiers.
4. «Interface» classifiers must not be involved in any associations.
5. Data types must be marked with the stereotype «DataType».
6. «DataType» classifiers must not have identities.
7. «DataType» classifiers must not be the target of association roles that are by reference.
8. «DataType» classifiers must not be the source of association roles that are not targeted
for other «DataType» classifiers except when the target class’ instances are essentially
immutable (not subject to change) and all have well-known identifiers which could be
expressed as «DataType» classifiers, such as in the case of coordinate reference
systems, prime meridians, or standards authorities such as EPSG (European Petroleum
Survey Group), OGC, CEN, IEC or ISO.
9. «DataType» classifiers must not have attribute members other than those whose type is
another «DataType» classifier or built-in Primitive.
10. Primitives, which are «DataType» classifiers, must not be involved in any associations.
11. «DataType» classifiers cannot inherit from anything other than another «DataType»
classifier.
12. Types must be marked with the stereotype «Type».
13. «Type» classifiers must not inherit or be inherited from anything other than another
«Type» classifier. All others may ‘realize’ the «Type» classifier.
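Some of these rules lend themselves to mechanical checking. The following Python sketch is a hypothetical, much-simplified checker, not an HMMG tool: the `Classifier` record, the `check` function, the list of primitives and the sample model are all assumptions made for the illustration. It covers only the inheritance restrictions on «Interface», «DataType» and «Type» classifiers and the restriction on the attribute types of «DataType» classifiers.

```python
from dataclasses import dataclass, field
from typing import Dict, List

STEREOTYPES = {"Interface", "DataType", "Type"}
PRIMITIVES = {"Integer", "Real", "CharacterString", "Boolean"}


@dataclass
class Classifier:
    """Hypothetical, much-simplified model element."""
    name: str
    stereotype: str = ""                       # "" marks a plain class
    parents: List[str] = field(default_factory=list)
    attribute_types: List[str] = field(default_factory=list)


def check(model: Dict[str, Classifier]) -> List[str]:
    """Return one message per violation; assumes every parent is in the model."""
    problems = []
    for c in model.values():
        for parent_name in c.parents:
            parent = model[parent_name]
            if c.stereotype in STEREOTYPES and parent.stereotype != c.stereotype:
                # Stereotyped classifiers inherit only from their own kind.
                problems.append(f"{c.name}: «{c.stereotype}» inherits from {parent.name}")
            elif parent.stereotype in {"Interface", "Type"} and c.stereotype != parent.stereotype:
                # Other classifiers may realize, but not inherit from, these.
                problems.append(f"{c.name}: must realize, not inherit from, {parent.name}")
        if c.stereotype == "DataType":
            for t in c.attribute_types:
                target = model.get(t)
                if t not in PRIMITIVES and (target is None or target.stereotype != "DataType"):
                    problems.append(f"{c.name}: attribute of non-datatype {t}")
    return problems


# Hypothetical sample model with three deliberate violations.
model = {
    "DirectPosition": Classifier("DirectPosition", "DataType",
                                 attribute_types=["Real", "Integer"]),
    "GM_Object": Classifier("GM_Object", "Type"),
    "GM_Point": Classifier("GM_Point", "Type", parents=["GM_Object"]),
    "BadValue": Classifier("BadValue", "DataType", parents=["GM_Object"],
                           attribute_types=["GM_Point"]),
    "PointImpl": Classifier("PointImpl", "", parents=["GM_Point"]),
}
for message in check(model):
    print(message)
```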
6 Guidelines
1. Most classifiers in the HM should be «Type», «Interface» or «DataType» classifiers
depending on usage and complexity. This rule does not apply if the standard is at the
implementation level, in which case implementation-language-specific UML patterns
should be followed, such as the one defined for GML application schemas in ISO 19136.
2. Abstract classes should be «Type» classifiers.
3. Services should be «Interface» or «Type» classifiers. Services should be «Type»
classifiers only if they have publicly accessible state or relation information that is
essential to their operations.