Experiences of UML-to-GML Encoding

Roy Grønmo, Ida Solheim, David Skogan
SINTEF Telecom and Informatics
Forskningsveien 1, Pb 124 Blindern, N-0314 Oslo, Norway
{roy.gronmo | ida.solheim | david.skogan}@informatics.sintef.no
Abstract. This paper presents experiences gained from the
development of an automatic conversion from GI application schema
to an XML exchange format. The application schema is expressed in
the Unified Modelling Language (UML), and the chosen exchange
format is the Geography Markup Language (GML) specified by the
Open GIS Consortium (OGC). A set of conversion rules has been
identified and implemented in a tool that reads UML class diagrams
and writes corresponding GML code. A comprehensive cadastre
model has constituted the test case. The work has been performed as
part of the national GeNorway project.
1 Introduction
1.1 Two Prevailing Encoding Approaches for GI
In 2001, a controversial issue dominated the relationship between two
standardisation bodies for geographic information – Open GIS Consortium (OGC)
and ISO/TC 211 (ISO). The disagreement can be summed up in the following double
question: How should geographic information be encoded, and what should the
exchange format look like?
According to ISO, the data provider and the data receiver are supposed to agree on
a so-called application schema. An application schema is typically UML class
diagrams expressing the structure and content of the data to be exchanged. The
standard ISO 19109 Rules for application schema (ISO, 2001b) prescribes how to
make an application schema in UML. The standard ISO 19118 Encoding (ISO,
2001c) prescribes conversion rules for the translation from an application schema in
UML into a corresponding XML Schema (W3C, 2002). Thereby ISO has created an
XML format for encoding of geographic information.
On the other hand, OGC has developed another XML format for GI encoding,
called Geography Markup Language (GML) (OGC, 2001). GML plays a central role
in OGC’s successful Web Mapping testbeds and Web Services specifications, and is
being implemented in GI systems in several countries. GML is currently a competitor
to ISO’s XML format.
1.2 The GeNorway Project
SINTEF has been involved in specifying ISO standards as well as implementing
UML model-based tools in GI projects such as DISGIS (Grønmo et al., 2000) and
JNIP (Grønmo and Skogan, 2001). These efforts are continued within the project
GeNorway – Model-based infrastructure for living geospatial data in eNorway.
GeNorway is an ongoing two-year project funded by the Norwegian Research
Council and with the GIS vendor Norkart as project owner. The project will test the
practicability of selected standards from ISO/TC 211 in an implementation of a Web
Feature Server (WFS) according to OGC’s specification. Therefore, GeNorway has to
develop solutions conforming to both ISO standards and OpenGIS specifications.
Approach and preliminary results were presented at the ACM GIS 2001 in Atlanta
(Grønmo, 2001).
A WFS is a Web service with predefined XML requests and responses. GML shall
be used within WFS to represent the geographic models and instances to be
communicated between a client and a server. Important questions in GeNorway have
therefore been:
• Can GML be generated automatically from ISO-conformant UML models?
• If so, does the generated GML code prove to be as expressive, compact
and readable as hand-coded GML?
• If not, how and why does automatic GML generation fail?
Since neither UML nor GML is designed to match the other, UML-to-GML
encoding is not trivial. This paper will start the discussion by identifying some design
criteria for UML-to-GML encoding.
2 Design Criteria
Encoding a UML class diagram into an XML Schema can be done in a number of
different ways. This topic is well covered by David Carlson (Carlson, 2001), who
points out that the encoding strategy will vary depending on the problem scenario.
The GeNorway project has chosen a set of general design criteria for UML-to-GML
encoding. These design criteria will be the basis for evaluation of the UML-to-GML
conversion rules presented in the next section. For convenience, the term UML model
is used for a UML class diagram, the term GML is used for GML 2.0, and the term
ISO is used for ISO/TC 211. The design criteria are:
1. The UML models shall fulfil the rules specified by ISO 19103 Conceptual
   schema language (ISO, 2001a) and ISO 19109 Rules for application schema
   (ISO, 2001b).
2. The UML models shall be conceptual and neutral to implementation choices. The
   UML models shall not be modified to “fit” GML requirements. That means, the
   UML modeller shall not need to know anything about GML.
3. The generated GML schema shall be fully determined by the UML model. This
   eliminates the possibility of user configuration. The major advantage is that
   agreement on a UML application schema implies agreement on the
   to-be-generated GML application schema. A positive side effect is that the
   corresponding code generation tool will be easier to implement.
4. The generated GML schemas should exploit the constructs provided by XML
   Schema (inheritance, data types, xlinks, facets etc). This will make the GML
   schemas easy to read and understand.
5. The generated GML schemas and corresponding GML documents shall fulfil the
   GML specification.
Design criterion 1 is specific to ISO UML models. Criteria 2, 3 and 4 are all
general UML-to-XML design criteria. Criterion 5 is specific to GML.
3 UML-to-GML Conversion Rules
A conversion rule transforms schema constructs of one schema language into schema
constructs of another schema language. The UML-to-GML conversion rules must
support both general UML constructs and ISO-specific constructs. Some of these
constructs need nothing but trivial conversions, while others require non-trivial
conversions.
Road
    classification : CharacterString
    number : CharacterString
    linearGeometry : GM_Curve

Conversion

<complexType name="RoadType">
  <complexContent>
    <extension base="gml:AbstractFeatureType">
      <sequence>
        <element name="classification" type="string"/>
        <element name="number" type="string"/>
        <element name="linearGeometry"
                 type="gml:LineStringPropertyType"/>
      </sequence>
    </extension>
  </complexContent>
</complexType>

Figure 1: Converting an ISO-conformant UML class Road to a GML Schema RoadType
Figure 1 illustrates a trivial conversion from the UML constructs class, attribute
and attribute type. A UML class Road is converted to an XML Schema complexType.
The attributes within the UML class are converted to XML Schema elements within
the corresponding complexType. The attribute types are converted as shown in Table
1. This table shows examples of trivial conversions of ISO-specific constructs:
• from ISO basic types (given in ISO 19103 Conceptual schema language) to
  XML Schema basic types, and
• from ISO geometry types (given in ISO 19107 Spatial Schema) to GML
  geometry types.
Table 1: Left part: ISO basic types converted to XML
Right part: ISO Spatial Schema types converted to GML

Basic Type according     XML Schema      Spatial Type according     GML 2.0 Type
to ISO 19103             Basic Type      to ISO 19107
-----------------------  --------------  -------------------------  ----------------------
CharacterString          string          GM_Point                   PointPropertyType
Integer                  integer         GM_Curve                   LineStringPropertyType
Date                     date            GM_CompositeSurface        PolygonPropertyType
Boolean                  boolean
Real                     decimal
Tables 1 and 2 together present the complete set of conversion rules. Table 1 presents only
trivial conversion rules. Table 2 presents some trivial and some non-trivial conversion
rules. Non-trivial conversions are necessary for encoding inheritance, associations and
order of attributes and associations. The non-trivial conversions are discussed further
in the next section.
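The trivial conversion of Figure 1, combined with the type mappings of Table 1, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the GeNorway tool: the function names convert_type and feature_type and the list-of-pairs input are hypothetical, and the gml: prefix on the geometry types follows Figure 1.

```python
# Sketch of the trivial conversion in Figure 1 (not the GeNorway tool itself).
# The two mappings are copied from Table 1.
ISO_TO_XSD = {
    "CharacterString": "string",
    "Integer": "integer",
    "Date": "date",
    "Boolean": "boolean",
    "Real": "decimal",
}
ISO_TO_GML = {
    "GM_Point": "gml:PointPropertyType",
    "GM_Curve": "gml:LineStringPropertyType",
    "GM_CompositeSurface": "gml:PolygonPropertyType",
}


def convert_type(uml_type):
    """Map a UML attribute type to an XML Schema or GML 2.0 type name."""
    if uml_type in ISO_TO_XSD:
        return ISO_TO_XSD[uml_type]
    if uml_type in ISO_TO_GML:
        return ISO_TO_GML[uml_type]
    # Anything else is assumed to be a class defined in the same UML model.
    return uml_type + "Type"


def feature_type(class_name, attributes):
    """Render a plain feature class as a complexType, as in Figure 1.

    `attributes` is a list of (name, uml_type) pairs in the order to emit.
    """
    lines = [
        f'<complexType name="{class_name}Type">',
        "  <complexContent>",
        '    <extension base="gml:AbstractFeatureType">',
        "      <sequence>",
    ]
    for name, uml_type in attributes:
        lines.append(
            f'        <element name="{name}" type="{convert_type(uml_type)}"/>'
        )
    lines += [
        "      </sequence>",
        "    </extension>",
        "  </complexContent>",
        "</complexType>",
    ]
    return "\n".join(lines)


road = feature_type("Road", [
    ("classification", "CharacterString"),
    ("number", "CharacterString"),
    ("linearGeometry", "GM_Curve"),
])
print(road)
```

The attribute list is passed in an explicit order here; a UML model itself does not define such an order.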
Table 2: UML constructs converted to GML 2.0
(each UML construct is followed by its conversion to GML 2.0)

Package
    Packages are ignored.

Class
    Classes with stereotype Enumeration or CodeList are converted to:
        <simpleType name="UMLCLASSNAMEType">
          <restriction base="string">

    Classes that inherit from a superclass are converted to a complexType and
    an element declaration:
        <complexType name="UMLCLASSNAMEType">
          <complexContent>
            <extension base="UMLSUPERCLASSNAMEType">
              ...
              <attributeGroup ref="gml:AssociationAttributeGroup"/>
              ...
        <element name="UMLCLASSNAME" type="UMLCLASSNAMEType"
                 substitutionGroup="UMLSUPERCLASSNAME"/>

    Classes that do not inherit from other classes and have at least one
    navigable association to another class are converted to a GML collection
    type and an element declaration:
        <complexType name="UMLCLASSNAMEType">
          <complexContent>
            <extension base="gml:AbstractFeatureCollectionBaseType">
              ...
        <element name="UMLCLASSNAME" type="UMLCLASSNAMEType"
                 substitutionGroup="gml:_FeatureCollection"/>

    All other classes are converted to a GML feature type and an element
    declaration:
        <complexType name="UMLCLASSNAMEType">
          <complexContent>
            <extension base="gml:AbstractFeatureType">
              ...
        <element name="UMLCLASSNAME" type="UMLCLASSNAMEType"
                 substitutionGroup="gml:_Feature"/>

    All classes that are abstract are converted to an abstract XML element
    type.

Attribute
    Attributes within classes of stereotype Enumeration or CodeList are
    converted to <enumeration value="UMLATTRIBUTENAME"/> within the
    <restriction base="string"> element within the <simpleType> of the
    corresponding class.

    Attributes within all other classes are converted to
    <element name="UMLATTRIBUTENAME" .../> within the <sequence> of the
    <complexType> of the corresponding class.

Attribute type
    Attribute types are ignored for attributes within classes stereotyped as
    Enumeration or CodeList.

    Attribute types that are identified as ISO/TC 211 basic types or ISO/TC 211
    spatial types are converted according to Table 1.

    All other types are assumed to be user-defined types within the UML model
    as class names. The converted type will be UMLATTRIBUTETYPEType.
    (If these types are not defined as class names within the UML model, the
    GML Schema will not be a legal XML Schema.)

    The resulting type, CONVERTED_UMLATTRIBUTETYPE, is inserted as the value
    of the type attribute within the <element> of the corresponding attribute:
        <element name="UMLATTRIBUTENAME"
                 type="CONVERTED_UMLATTRIBUTETYPE"/>

Association
    Composition, aggregation and association are treated the same way.

    Navigable UML class associations are converted to explicit GML
    featureAssociation types and an element declaration (two-way
    associations result in two type and two element declarations):
        <complexType
            name="UMLCLASSNAME.ROLENAMEATTHEOTHERCLASSType">
          <complexContent>
            <restriction base="gml:FeatureAssociationType">
              <sequence minOccurs="0">
                <element ref="UMLOTHERCLASSNAME"/>
              </sequence>
              <attributeGroup ref="gml:AssociationAttributeGroup"/>
              ...
        <element
            name="UMLCLASSNAME.ROLENAMEATTHEOTHERCLASS"
            type="UMLCLASSNAME.ROLENAMEATTHEOTHERCLASSType"
            substitutionGroup="gml:featureMember"/>

    This explicit GML featureAssociation type will be part of the sequence of
    the UML class that has the navigable association:
        <complexType name="UMLCLASSNAMEType">
          <sequence>
            <element ref="UMLCLASSNAME.ROLENAMEATTHEOTHERCLASS">

    A navigable UML class association depends on an explicit role name
    on the “visible” side of the association (Figure 2).

Cardinality
    Attribute and association cardinalities are converted to values of the
    minOccurs and maxOccurs attributes within the corresponding <element>.
    Integer values are converted to themselves; * is converted to the XML value
    unbounded.

Inheritance
    UML class inheritance is converted to XML element type inheritance by
    <extension> elements:
        <complexType name="UMLSUBCLASSNAMEType">
          <complexContent>
            <extension base="UMLSUPERCLASSNAMEType">
    Multiple inheritance is not supported.

Other UML constructs (including operations)
    Ignored.
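The class rules of Table 2 amount to a small decision procedure: value lists become restricted strings, subclasses extend their superclass type, root classes with navigable associations become collections, and everything else becomes a feature. The following Python sketch illustrates that procedure. The UmlClass record and the function names are hypothetical and do not reflect the data model of the GeNorway tool; the RoadClass code list in the usage lines is an invented example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class UmlClass:
    # Hypothetical in-memory form of a UML class, for illustration only.
    name: str
    stereotype: Optional[str] = None
    superclass: Optional[str] = None
    attributes: List[str] = field(default_factory=list)
    navigable_roles: List[str] = field(default_factory=list)


def is_value_list(cls):
    """True for classes stereotyped Enumeration or CodeList."""
    return cls.stereotype in ("Enumeration", "CodeList")


def base_and_substitution(cls):
    """Return (extension base, substitutionGroup) following Table 2."""
    if cls.superclass is not None:
        return cls.superclass + "Type", cls.superclass
    if cls.navigable_roles:
        return "gml:AbstractFeatureCollectionBaseType", "gml:_FeatureCollection"
    return "gml:AbstractFeatureType", "gml:_Feature"


def enumeration_type(cls):
    """Render an Enumeration or CodeList class as a restricted string type."""
    values = "\n".join(
        f'    <enumeration value="{a}"/>' for a in cls.attributes
    )
    return (
        f'<simpleType name="{cls.name}Type">\n'
        '  <restriction base="string">\n'
        f"{values}\n"
        "  </restriction>\n"
        "</simpleType>"
    )


# Invented example classes; only the rule selection is of interest.
road_class = UmlClass("RoadClass", "CodeList", attributes=["motorway", "street"])
farm = UmlClass("Farm", superclass="RealEstate", navigable_roles=["farmField"])
print(enumeration_type(road_class))
print(base_and_substitution(farm))
```

Note that the superclass test comes first, so a subclass never receives a GML base type directly, whatever associations it has.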
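[Figure 2 (UML class diagram): the classes Country, City, Shop and Items, linked by navigable associations with multiplicities 0..*, 1 and 0..*. Each association carries the role name +visibleRoleSide on its "visible" side.]
Figure 2: A navigable UML class association must have a role name on the "visible"
side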
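The association and cardinality rules of Table 2 can be illustrated with a short Python sketch. It is written under assumptions: the function names are hypothetical, the closing tags that Table 2 elides are filled in so that the output is well-formed, and the placement of minOccurs and maxOccurs on the referencing element is one reading of the cardinality rule.

```python
def occurs(multiplicity):
    """Convert a UML multiplicity such as '1' or '0..*' to (minOccurs, maxOccurs)."""
    lower, _, upper = multiplicity.partition("..")
    if not upper:              # single value, e.g. "1" or "*"
        upper = lower
        if lower == "*":
            lower = "0"
    if upper == "*":
        upper = "unbounded"
    return lower, upper


def association_declarations(owner, role, target):
    """Global type and element for one navigable association end."""
    if not role:
        # Figure 2: the "visible" side must carry an explicit role name.
        raise ValueError("navigable association needs a role name")
    name = f"{owner}.{role}"
    return "\n".join([
        f'<complexType name="{name}Type">',
        "  <complexContent>",
        '    <restriction base="gml:FeatureAssociationType">',
        '      <sequence minOccurs="0">',
        f'        <element ref="{target}"/>',
        "      </sequence>",
        '      <attributeGroup ref="gml:AssociationAttributeGroup"/>',
        "    </restriction>",
        "  </complexContent>",
        "</complexType>",
        f'<element name="{name}" type="{name}Type"'
        ' substitutionGroup="gml:featureMember"/>',
    ])


def association_reference(owner, role, multiplicity):
    """Reference placed in the <sequence> of the owning class."""
    lo, hi = occurs(multiplicity)
    return f'<element ref="{owner}.{role}" minOccurs="{lo}" maxOccurs="{hi}"/>'


print(association_declarations("Farm", "farmField", "Field"))
print(association_reference("Farm", "farmField", "0..*"))
```

A two-way association would simply call these functions once for each navigable end.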
4 Problems Encountered
The UML-to-GML conversion rules have been implemented in a code generation
tool. This tool has been tested successfully with a comprehensive cadastre model
spanning about a hundred classes. The cadastre model has been made conformant to
the requirements of ISO 19103 Conceptual schema language and ISO 19109 Rules
for application schema. The model makes extensive use of UML inheritance,
associations and attributes, and should thus supply a good test case. When applying
the conversion rules to this model, we were able to satisfy the design criteria to a large
extent. However, some problems were encountered, of which the most important are
discussed in the following subsections.
4.1 Inheritance in GML
The two GML base types feature and featureCollection are incompatible
with a general inheritance hierarchy. This problem is explained by looking at a
concrete example shown in Figure 3. The example defines an inheritance structure in
ISO UML. The question is how to compose an inheritance structure in the GML
Schema.
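[Figure 3 (UML class diagram): RealEstate (owner : String) is the superclass of Farm (hasAnimals : Boolean) and Building (numberOfFloors : Integer). Farm has a navigable association with role name +farmField and multiplicity 0..* to Field (vegetation : String).]
Figure 3: How to determine proper GML base types for the GML types
corresponding to this inheritance hierarchy?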
According to GML’s rules, all the types containing other types shall inherit from
featureCollection. All other types shall inherit from feature. Thus GML’s rules dictate
that Farm shall inherit from featureCollection, while RealEstate, Building and Field
shall inherit from feature.
On the other hand, design criterion 4 requires that an XML Schema must maintain
the same inheritance structure as within the UML model, instead of copy-down of
attributes from supertypes. This fact implies that Farm and Building must inherit from
RealEstate, while RealEstate (and Field) may be assigned a proper GML predefined
supertype. The chosen GML supertype for RealEstate will indirectly be the supertype
also for all the subtypes Farm and Building. Generally speaking a GML supertype can
only be chosen for the root type in any inheritance hierarchy within the application
schema.
GML’s rules prescribe one XML Schema inheritance structure, and design
criterion 4 prescribes another XML Schema inheritance structure. The example above
shows that they are in conflict with each other.
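The conflict can be made concrete with a small Python sketch over the classes of Figure 3. The dictionary encoding of the model and the function names are hypothetical; the sketch only compares the base type that GML's rules demand with the base type each class reaches when the UML hierarchy is preserved.

```python
# Figure 3 written out by hand: class -> (superclass, contains other features?)
MODEL = {
    "RealEstate": (None, False),
    "Farm": ("RealEstate", True),      # Farm contains Field via farmField
    "Building": ("RealEstate", False),
    "Field": (None, False),
}


def gml_rule_base(name):
    """Base type demanded by GML's rules: containers are collections."""
    return "featureCollection" if MODEL[name][1] else "feature"


def root_of(name):
    """Follow superclass links up to the root of the hierarchy."""
    while MODEL[name][0] is not None:
        name = MODEL[name][0]
    return name


def inherited_base(name, root_choice):
    """Base type reached when the UML hierarchy is kept (design criterion 4)."""
    return root_choice[root_of(name)]


def conflicts(root_choice):
    """Classes whose two prescribed base types disagree."""
    return sorted(
        n for n in MODEL if gml_rule_base(n) != inherited_base(n, root_choice)
    )


# Whichever base type is chosen for the root RealEstate, some class conflicts.
print(conflicts({"RealEstate": "feature", "Field": "feature"}))
print(conflicts({"RealEstate": "featureCollection", "Field": "feature"}))
```

Neither choice for the root yields an empty list, which is the incompatibility described above.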
4.2 Multiple Inheritance in ISO UML
ISO UML allows multiple inheritance, whereas XML Schema does not. This
implies that UML attributes must be copied down to all subclasses. Such copying
violates design criterion 4 by not exploiting the XML Schema inheritance construct
and thus making the GML less readable. If the conversion rules do not support
multiple inheritance within UML, they violate design criterion 1.
4.3 Ordered Subelements in GML
There is no way to specify the order of UML attributes and associations, whereas
the corresponding subelements within a GML schema will have a specified order.
Thus, UML-to-GML encoding will supply subelements in an unpredictable order.
This fact may reduce the readability of the generated GML (design criterion 4) and
cause problems when regenerating GML from UML.
4.4 Global Declarations in GML
GML states that all element and type declarations must be defined globally within
each application schema. The UML associations are converted to GML
featureMember types and elements with names corresponding to the association role
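name. ISO UML prescribes that a UML association role name be unique within each
class, not necessarily within the application schema. Two classes may therefore use
the same role name, in which case the generated global declarations clash.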
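A role name that is unique within each class can still occur in several classes, and a global declaration named after the role alone would then be declared twice. The Python sketch below detects such cases and shows the class-name prefix used in Table 2 as one way to keep the global names distinct. The function names and the example associations (apart from Farm.farmField of Figure 3) are hypothetical.

```python
from collections import defaultdict


def clashing_roles(associations):
    """Return role names used by more than one class.

    `associations` is a list of (owner class, role name) pairs.
    """
    owners = defaultdict(set)
    for owner, role in associations:
        owners[role].add(owner)
    return {role: sorted(cs) for role, cs in owners.items() if len(cs) > 1}


def global_name(owner, role):
    """Prefix the role name with the owning class, as in Table 2."""
    return f"{owner}.{role}"


# The role name "owner" is an invented example used by two classes.
associations = [
    ("Farm", "farmField"),
    ("Farm", "owner"),
    ("Building", "owner"),
]
print(clashing_roles(associations))
print(sorted(global_name(c, r) for c, r in associations))
```

After prefixing, all three names are distinct, although the unprefixed role name owner occurs twice.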
4.5 Modelling of Value Domain Restrictions in ISO UML
ISO UML has no guidelines on how to model value domain restrictions, while
XML Schema has built-in support for facets. Facets are a powerful tool to restrict the
value domain of simple XML elements. Examples of such value domain restrictions
may be that a string type must have a length of eight characters, a string type must
start with an alphanumeric character, and the legal integer values are only the even
numbers. UML itself does not provide any construct that corresponds to facets, and
ISO UML has not defined how UML extension mechanisms can be used to support
this. This is a violation of design criterion 4 of exploiting the constructs available in
XML Schema.
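To indicate what a conversion rule for value domain restrictions could produce, the following Python sketch renders two of the string examples above as XML Schema facets (length and pattern). It assumes that the restriction is available as plain keyword arguments, since ISO UML defines no notation for it; the function name, the type name ParcelIdType and the regular expression are hypothetical.

```python
def restricted_string(type_name, length=None, pattern=None):
    """Render a string simpleType restricted by XML Schema facets."""
    facets = []
    if length is not None:
        facets.append(f'    <length value="{length}"/>')
    if pattern is not None:
        facets.append(f'    <pattern value="{pattern}"/>')
    return "\n".join(
        [f'<simpleType name="{type_name}">', '  <restriction base="string">']
        + facets
        + ["  </restriction>", "</simpleType>"]
    )


# Eight characters, the first of which is alphanumeric (invented example).
parcel_id = restricted_string("ParcelIdType", length=8, pattern="[A-Za-z0-9].*")
print(parcel_id)
```

The open question raised in this section is not how to write such facets, but where in an ISO UML model the values 8 and [A-Za-z0-9].* would be recorded.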
4.6 Complicated Definition of Associations in GML
Associations are modelled by GML with the use of explicit featureMember
elements (“feature-property” model). These featureMember elements will then
contain or refer to the elements that participate in the association. Carlson (2001) and
ISO 19118 define associations directly by contained subelements. This makes a
simpler encoding from UML, and the GML becomes more readable. Defining explicit
featureMember elements violates design criterion 4 by making the XML Schema
unnecessarily complex.
5 Proposed Changes to ISO UML and to GML
Changes to GML and ISO UML can solve almost all of the problems listed in the
previous section. This paper proposes the following changes to ISO UML:
• Exclude multiple inheritance. This change will solve problem 4.2. Multiple
  inheritance has often been a major source of complexity and errors (Shan et
  al., 1993, Swaine, 1989, Madsen, 1995). This is why languages such as XML
  Schema and Java have chosen not to provide unrestricted multiple inheritance.
  ISO 19103 (ISO, 2001a) states that “Multiple inheritance shall be used at a
  minimum, because it tends to increase model complexity.”
• Define a way to express value domain restrictions corresponding to XML
  Schema facets. This will solve problem 4.5. A possible approach may be use
  of the Object Constraint Language (OCL) (Warmer and Kleppe, 1999).
The proposed changes to GML are:
• Allow prefixing of associations by the corresponding ISO UML class name.
  This change will solve problem 4.4.
• Remove the featureCollection base type. Use the feature base type for all
  previously defined featureCollections. A feature containing subelements is
  implicitly a featureCollection, and there is no need to state this explicitly. This
  change will solve problem 4.1. The removal of featureCollection is no loss,
  because a featureCollection differs from a feature type only in that it
  contains a set of general subelements. This general possibility to contain
  subelements will be lost if we remove featureCollection, but the containment
  has to be refined anyway to ensure that only the “correct” subelements are
  contained. Once this containment is refined, the general containment relation
  adds nothing.
• Remove the “feature-property” model. Compensate by letting the subelements
  appear directly as part elements. This change will solve problem 4.6. The
  change will make the GML encoding agree with Carlson (2001).
The remaining problem to solve is the order of attributes (4.3). Design criterion 2
asserts that the UML models shall be conceptual and neutral to implementation
choices. Hence, the attribute order is irrelevant in UML, but it is relevant in XML
Schema. Regeneration of an XML Schema may therefore reorder the attributes and
thereby cause failure in data transfer. The problem may be overcome by some
extension to the code generation tool, either user intervention or automatic
interpretation of a “master” XML Schema containing the wanted attribute order.
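The second option, automatic interpretation of a “master” schema, can be sketched as follows in Python. The sketch assumes that the element names of the master schema have already been extracted into a list; the function name and the rule of appending unknown attributes alphabetically are hypothetical design choices, not part of the GeNorway tool.

```python
def order_like_master(attribute_names, master_order):
    """Order attributes as in the master schema; append new ones alphabetically."""
    position = {name: i for i, name in enumerate(master_order)}
    known = sorted(
        (n for n in attribute_names if n in position), key=position.get
    )
    new = sorted(n for n in attribute_names if n not in position)
    return known + new


# Element order taken from the RoadType schema of Figure 1.
master = ["classification", "number", "linearGeometry"]

# A UML model delivers its attributes as an unordered set;
# "speedLimit" is an invented attribute added after the master was written.
uml_attributes = {"linearGeometry", "speedLimit", "number", "classification"}
print(order_like_master(uml_attributes, master))
```

Because the result depends only on the master order and the attribute names, regenerating the schema from the same UML model gives the same element order each time.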
6 Related Work
Two other recent works addressing the topic of UML-to-GML encoding are
discussed below.
Patterns in GML (Galdos, 2002) is a draft submitted to OGC by Galdos Systems
Inc. This company has been central in defining GML. The draft describes the
intentions and models of GML and presents encoding rules from UML to GML.
However, there are no clear guidelines for modelling UML application schemas, and
there is no relation to ISO UML modelling guidelines. (The latter fact violates our
design criterion 1 of compliance with ISO standards.) The UML modelling presented
by Galdos (op.cit.) is characterised as follows:
• It introduces UML extensions for XML Schema constructions such as
  complexType, simpleType and restriction.
• It requires that feature types inherit from the GML predefined base types
  featureCollection and feature, and that this inheritance structure is modelled
  explicitly.
The above items make it obvious that UML is used in an implementation-dependent
way, and that a UML modeller must have good knowledge of GML
(violation of our design criterion 2).
UML Model and Encoding Rules of GML2 (Portele, 2002) is a discussion paper
(draft) submitted to OGC. This paper is intended to be input to the process of the ISO
New Work Item Proposal for GML (ISO, 2002b) where Portele is the appointed
leader. The paper shows how application schemas can be modelled according to ISO
UML. This part of the work coincides to some degree with the work presented in this
paper.
Portele (op.cit.) introduces UML classes that correspond to the GML elements
feature and featureCollection. This fact makes the UML model less accessible to
non-GML-experts and ties it to GML implementation (violation of our design criterion 2).
Neither of these two drafts is capable of separating the UML models from their
implementations. The principle of implementation-neutral UML models has a
considerable advantage when it comes to generating code to different
implementations. GML may be today’s choice, but tomorrow other needs may arise,
such as Java generation, CORBA IDL generation, service interface generation, and so
on. GML-oriented UML models not only confuse people that are not GML experts,
but also prevent usage of the same UML models for different purposes.
7 Conclusions and Future Work
In the GeNorway project we have elaborated a set of conversion rules and
developed an automatic tool for translating UML class models to GML Schemas.
The findings of this work can be summarised as follows:
• We have shown that it is possible to generate GML from conceptual and
  implementation-neutral UML models that comply with the modelling
  guidelines of ISO/TC 211. Thereby, we have applied the model-based
  approach of ISO 19118 Encoding and verified its practicability in the
  OGC/GML world.
• Our design criteria have to a large extent been satisfied. However, the work
  has disclosed some disadvantages of both ISO UML and GML when it comes
  to model conversion. Most of these can be eliminated by reasonable changes
  to ISO UML and GML.
• A valuable side effect of the UML-to-GML encoding tool is its inherent and
  automatic quality control of UML models. Thus, our test case consisting of a
  large and comprehensive cadastre model has been checked, corrected and
  refined as part of the encoding experiment.
It is important to point out that conceptual, implementation-neutral modelling does
not require any knowledge of GML (or other specific implementation technologies).
Consequently, when a new version of GML (or whatever implementation platform or
programming language) is adopted, there is no need to rewrite the model, only change
the appropriate conversion rules.
It is worth checking if the findings of GeNorway can be related to work performed
by the Object Management Group (OMG). There are two topics in OMG that are
particularly relevant: (1) OMG has launched the idea of a model-driven architecture
(MDA) (OMG, 2002a), which to a large extent coincides with ISO 19118 Encoding.
OMG’s MDA has attracted attention and aroused interest around the world and also
among OGC members. (2) XML Metadata Interchange (XMI) (OMG, 1999) is a
cross-domain OMG standard for conversion between UML and XML. The question
of using XMI for GI encoding deserves further investigation.
The model-based approach used by the GeNorway project is applicable to more
than data exchange formats. Also services can be defined in the same way. The
authors of this paper believe that Web services, e.g. OGC’s WFS and WMS, can have
their interfaces generated automatically from UML models. UML interface models, or
operations in UML class diagrams, will probably serve this purpose. UML-specified
interfaces should be translatable (“encodable”) into e.g. WFS specifications, or into
corresponding interface specifications written in e.g. the Web Services Description
Language (WSDL) (Ariba et al., 2001). This topic encourages further work.
8 References
Ariba, IBM and Microsoft (2001), Web Services Description Language (WSDL) 1.1,
W3C Note: www.w3.org/TR/wsdl
Carlson, D. (2001), Modeling XML Applications with UML - Practical e-Business
Applications, Addison Wesley.
Galdos (2002), Patterns in GML - Draft - www.galdosinc.com, 9th of January, 2002.
Grønmo, R. (2001), Supporting GI standards with a model-driven architecture. ACM
GIS 2001, Atlanta, USA. 2001.
Grønmo, R., Berre, A.-J., Solheim, I., Hoff, H. and Lantz, K. (2000), DISGIS: An
Interoperability Framework for GIS - Using the ISO/TC 211 Model-based
Approach. Global Spatial Data Infrastructure (GSDI) 4, Cape Town, South
Africa. 2000.
Grønmo, R. and Skogan, D. (2001), SINTEF Report: Joint Nordic test case using
ISO/TC 211 standards, STF40 A01010. 2001.
ISO (2001a), Draft Technical Specification 19103, Geographic information -
Conceptual schema language, ISO/TC 211 N 1082. 12th of July, 2001.
ISO (2001b), Final text of CD 19109, Geographic information - Rules for application
schema, ISO/TC 211 N 1127. 19th of July, 2001.
ISO (2001c), Final text of CD 19118 Geographic information - Encoding, ISO/TC
211 N 1136. 9th of August, 2001.
ISO (2002a), ISO/TC 211 Geographic information/Geomatics: www.isotc211.org
ISO (2002b), New work item proposal: Geographic information - Geography Markup
Language (GML), ISO/TC 211 N 1220. 8th of February, 2002.
Madsen, O. (1995), Open issues in object-oriented programming - A Scandinavian
perspective, Software Practice & Experience, 25: 3-43 Suppl. 4 DEC 30
1995.
OGC (2001), Geography Markup Language (GML) 2.0, OGC Recommendation
Paper, 01-029. February, 2001.
OGC (2002), Open GIS Consortium: www.opengis.org
OMG (1999), XML Metadata Interchange (XMI) Version 1.1, OMG Document
ad/99-10-02. October 25, 1999.
OMG (2002a), Object Management Group's Model Driven Architecture:
www.omg.org/mda
OMG (2002b), Unified Modelling Language: www.uml.org
Portele, C. (2002), UML Model and Encoding Rules of GML2 - Discussion Paper
(Draft), OpenGIS Project Document 02-005. 11th of January, 2002.
Shan, Y.-P., Cargill, T., Cox, B., Cook, W., Loomis, M. and Snyder, A. (1993), Is
Multiple Inheritance Essential to OOP, ACM SIGPLAN NOTICES, 28 (10):
360-363 OCT 1993.
Swaine, M. (1989), Is Multiple Inheritance Necessary, Dr. Dobbs Journal, 14 (3):
107-& MAR 1989.
W3C (2002), XML Schema: www.w3.org/XML/Schema
Warmer, J. B. and Kleppe, A. G. (1999), The Object Constraint Language: Precise
Modeling With Uml, Addison-Wesley Pub Co.