Property-Based Genericity: A Dynamic Approach

Loïc Denuzière
Technical Report n° 1003, June 2010
revision 2186
Property-based genericity is an object-oriented programming paradigm. It makes it possible to model, in a generic
way, some systems which are hard to represent using classical object-oriented programming. It was
introduced by the C++-oriented SCOOP paradigm used in Olena, an image processing library. The key
principle of this paradigm is to describe a class by the list of properties its instances
must have, rather than by its inheritance tree.
We will present this paradigm and show that it can be extended to languages other than C++ and
to fields other than image processing. We will then introduce an example implementation of property-based
genericity in Common Lisp, which takes advantage of the language’s dynamic capabilities and extensibility.
Keywords
Climb, Lisp, Olena, genericity, properties
Laboratoire de Recherche et Développement de l’Epita
14-16, rue Voltaire – F-94276 Le Kremlin-Bicêtre cedex – France
Tél. +33 1 53 14 59 47 – Fax. +33 1 53 14 59 22
denuziere@lrde.epita.fr – http://www.lrde.epita.fr/
Copying this document
Copyright © 2010 LRDE.
Permission is granted to copy, distribute and/or modify this document under the terms of
the GNU Free Documentation License, Version 1.2 or any later version published by the Free
Software Foundation; with the Invariant Sections being just “Copying this document”, no Front-Cover Texts, and no Back-Cover Texts.
A copy of the license is provided in the file COPYING.DOC.
Contents
1 Property-based genericity
1.1 Limits of classical object-oriented programming
1.2 The SCOOP approach
1.3 Definition of property-based genericity
2 Dynamic implementation
2.1 Lisp syntax for properties
2.2 Implementation
A Bibliography
Introduction
The image processing library Olena, which has been under development at the EPITA Research
and Development Laboratory (LRDE) for more than ten years, is well known for its high level of
genericity. A whole new programming paradigm, called Static C++ Object-Oriented Paradigm
(SCOOP), was developed in order to make this possible. While it makes extensive use of C++-specific
idioms, such as the Curiously Recurring Template Pattern (CRTP) (Coplien, 1995), SCOOP
defines a general set of concepts which enhance the polymorphic capabilities of object-oriented
programs.
In this paper, we first present the genericity issues that this paradigm addresses. Then, after
quickly describing the way SCOOP overcomes these issues, we will give a formal definition of
property-based genericity. Finally, as a proof of its independence from SCOOP and C++, we
will present a fully dynamic implementation in Common Lisp.
Acknowledgements
Thanks to Didier Verna and Christopher Chedeau for their help and support.
Chapter 1
Property-based genericity
1.1 Limits of classical object-oriented programming
Let’s consider an image processing library. In terms of genericity, it aims at the following features
(Ballas, 2008):
• Different types of images:
– Any-dimensional (1D, 2D, 3D, ...);
– Supported by different types of grids: rectangular, honeycomb, triangular, or even
irregular;
– Stored contiguously, by pieces, calculated on-the-fly, or any other data retrieval method;
– Fully loadable in memory or not;
– And many other properties possible;
• Different types of values stored in the images: RGB, grayscale, labels, symbolic data,
vectors, or any user-defined type;
• Different types of algorithms available for these images. The critical point is to be able to
write each generic algorithm once, and the library has to make it available to run on all
supported kinds of images.
Another feature that should be available is the possibility to write specialized versions of these
generic algorithms for images with a given property. We can take the example of the image
copying algorithm:
• the generic algorithm has to iterate over the sites of the image in order to copy the values
individually;
• contiguously-stored images can get a significant speedup by using system-provided primitives to batch-copy the data (e.g. memcpy in C).
When trying to do this using a classical object-oriented approach, many problems arise. We
will show them in C++, but they are not language-specific.
Considering that any type of image can contain any type of value, the first logical step is
to make the image type parameterized by the type of value, using what many languages call
generics or, in the case of C++, templates:
template <class Value>
class Image
{
public:
  Value get(Point p);
  // ...
};
Listing 1.1: Image values as template parameters
Then we have to deal with the problem of the multiple image properties. We can declare one
of these properties at a time using inheritance:
template <class Value>
class ContiguousImage : public Image<Value>
{
  std::vector<Value> _data;
  // ...
};
template <class Value>
class DisruptedImage : public Image<Value>
{
  // Implement disrupted storage ...
};
Listing 1.2: Inheritance for a single property
However, this solution is extremely limited. How can several image properties be combined
using inheritance? At first, multiple inheritance seems suitable for this purpose, but
it actually has several flaws:
• The great number of properties (Olena, for example, declares fifteen image properties,
with two to six values for each) quickly leads to a combinatorial explosion of the number
of classes that need to be declared.
• This design doesn’t allow for a number of essential checks. For example, each concrete
image class must be associated exactly once with each property: an image cannot be both
contiguous and disrupted, and it must have a storage property. With this design however,
one can declare an image class which inherits from several or none of them.
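The second flaw can be made concrete with a minimal sketch. The classes below are simplified, non-template stand-ins for those of Listing 1.2, and the names BrokenImage and UnspecifiedImage are hypothetical, introduced only for illustration. The point is that the compiler accepts both ill-formed designs without complaint.

```cpp
#include <type_traits>

// Simplified, non-template stand-ins for the classes of Listing 1.2.
struct Image {};
struct ContiguousImage : virtual Image {};
struct DisruptedImage : virtual Image {};

// Hypothetical class claiming both storage kinds at once: it compiles.
struct BrokenImage : ContiguousImage, DisruptedImage {};

// Hypothetical class claiming no storage kind at all: it compiles too.
struct UnspecifiedImage : Image {};

// Compile-time observations of the two unchecked situations.
constexpr bool both_storages =
    std::is_base_of<ContiguousImage, BrokenImage>::value &&
    std::is_base_of<DisruptedImage, BrokenImage>::value;
constexpr bool no_storage =
    !std::is_base_of<ContiguousImage, UnspecifiedImage>::value &&
    !std::is_base_of<DisruptedImage, UnspecifiedImage>::value;
```

Both constants evaluate to true, which shows that plain inheritance gives the library no hook to reject either declaration.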
1.2 The SCOOP approach
In order to address this kind of situation, the LRDE’s Olena team has developed and refined
SCOOP: the Static C++ Object-Oriented Paradigm (Géraud and Levillain, 2008). This paradigm
brings together a number of known C++ idioms, most of which we will not deal with here, to
perform a great number of checks at compile time.
The technique used by SCOOP to solve the genericity problem is based on type traits (Alexandrescu, 2001). Each image property is implemented as an inheritance tree of empty
classes. A traits class is then defined that declares, for each concrete image class, the appropriate value of every property, i.e., it typedefs all the properties.
// Each property is a namespace, and each property value is a class
namespace storage
{
  class any {};
  class contiguous : public any {};
  class disrupted : public any {};
}
// Define the traits class
template <class I> struct traits {};
// Define a concrete image class
template <class V>
class my_image_class : public image<V>
{
  // Implement class ...
};
// Specialize the traits class to declare the properties of the concrete class
// (a partial specialization, since my_image_class is itself a template)
template <class V>
struct traits<my_image_class<V> >
{
  typedef ::storage::contiguous storage;
  typedef ::grid::rectangular grid;
  // ...
};
Listing 1.3: Defining properties in SCOOP
Using this traits class, one can then declare generic algorithms that work on all kinds of images,
and specialized versions depending on given properties. All that is needed is a facade function
that explicitly dispatches according to the image’s traits.
// The generic algorithm
template <class I>
I copy_impl(const I& ima, storage::any)
{
  // Implement value-by-value copy
}
// The specialized (overloaded) algorithm for contiguous storage
template <class I>
I copy_impl(const I& ima, storage::contiguous)
{
  // Implement copy using memcpy()
}
// The facade function
template <class I>
I copy(const I& ima)
{
  return copy_impl(ima, typename traits<I>::storage());
}
Listing 1.4: Property-wise dispatch in SCOOP
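The two listings above can be gathered into one self-contained sketch. The image classes vector_image and chunked_image and the function copy_strategy are hypothetical names chosen for illustration, not part of Olena. Each implementation only returns a label instead of copying anything, so that the overload selected by the traits is observable.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace storage {
struct any {};
struct contiguous : any {};
struct disrupted : any {};
}  // namespace storage

// Primary template: specialized once per concrete image class.
template <class I> struct traits;

// Hypothetical concrete image classes (illustration only).
struct vector_image { std::vector<int> data; };
struct chunked_image { std::vector<std::vector<int> > chunks; };

template <> struct traits<vector_image> { typedef ::storage::contiguous storage; };
template <> struct traits<chunked_image> { typedef ::storage::disrupted storage; };

// Generic version: reached through the conversion of any tag to storage::any.
template <class I>
std::string copy_impl(const I&, storage::any) { return "value-by-value"; }

// Specialized version: an exact match on the tag wins overload resolution.
template <class I>
std::string copy_impl(const I&, storage::contiguous) { return "batch"; }

// Facade: builds the tag declared in the traits and lets overloading dispatch.
template <class I>
std::string copy_strategy(const I& ima) {
  return copy_impl(ima, typename traits<I>::storage());
}
```

Calling copy_strategy on a vector_image selects the batch version, while a chunked_image falls back to the generic one because storage::disrupted only converts to storage::any.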
Note that, although copy is a regular function (as opposed to a member
function), its behavior is quite similar to that of a (statically dispatched) method¹. Therefore, in the
rest of this report, we will refer to such functions and to member functions collectively as “methods”.
¹ The fact that this dispatch is static is due to performance-driven design choices in SCOOP. Such dispatch
may just as well be made dynamically, as we will see later.
1.3 Definition of property-based genericity
So far, we have only described a C++-specific technique that allows us to deal with genericity
problems caused by the great number of properties a class can have. We are now going to give a
general, language-independent definition of the property-based genericity paradigm, along with
a pseudo-UML notation.
Property-based genericity is an object-oriented programming paradigm. It introduces the
notions of property and property value, and a number of relations between them and the existing
notions of classes, inheritance and method dispatch.
Figure 1.1: Pseudo-UML for property
Figure 1.2: Pseudo-UML for property value
Characterization
A property P is always linked to one and only one abstract class C. An abstract class can be
characterized by any number of properties.
This relation is read: “P characterizes C” or “C is characterized by P”.
Example: Storage characterizes Image.
Figure 1.3: Pseudo-UML for abstract class characterization
1.3.1 Property valuation
A property value V is always linked to one and only one property P.
This relation is read: “V is a value of P” or “P can be V”.
Example: Storage can be Contiguous.
We see here that a property is generally referred to using a noun, while a property value is an
adjective.
Figure 1.4: Pseudo-UML for property valuation
1.3.2 Property subvaluation
Two values V1 and V2 of the same property P can be linked by a
subvaluation relation. As a consequence, the set of values of a property forms a forest.
This relation is read “V1 is a subvalue of V2”.
Example: In the Storage property, Piecewise is a subvalue of Disrupted.
When a value V1 of the property P is not a subvalue of any other value, we call it a direct
value of P.
Figure 1.5: Pseudo-UML for property subvaluation
1.3.3 Property fulfillment
Any class C1 inheriting from an abstract class C2 must be linked to exactly one value V for each property
P that characterizes C2. This relation is read “The P of C1 is V”.
Example: The Storage of ClassicalImage is Contiguous.
This relation propagates along subvaluation: if the P of C is V, and V is a subvalue
of V2, then the P of C is also V2.
Example: If the Storage of MyImage is Piecewise, then it is also Disrupted.
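As an illustration, the propagation of fulfillment along subvaluation can be sketched with the traits technique of Section 1.2. This is only one possible encoding, under the assumption that subvaluation is represented by class inheritance between value tags; my_image and storage_is are hypothetical names.

```cpp
#include <type_traits>

namespace storage {
struct any {};
struct contiguous : any {};
struct disrupted : any {};
struct piecewise : disrupted {};  // Piecewise is a subvalue of Disrupted
}  // namespace storage

template <class C> struct traits;

// Hypothetical concrete class whose Storage is Piecewise.
struct my_image {};
template <> struct traits<my_image> { typedef ::storage::piecewise storage; };

// "The Storage of C is V" holds when the declared value is V itself
// or a (possibly indirect) subvalue of V.
template <class C, class V>
struct storage_is : std::is_base_of<V, typename traits<C>::storage> {};
```

With this encoding, storage_is<my_image, storage::disrupted> is true although the class only declares storage::piecewise, and storage_is<my_image, storage::contiguous> is false.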
1.3.4 Operation polymorphism
Property-based genericity makes it possible to extend operation polymorphism². In addition to being specialized on the dynamic class of their argument(s), as in classical object-oriented code, methods
can also be specialized on the values of one or several properties.
² We include in the definition of “operation polymorphism” the static method binding used in SCOOP.
Figure 1.6: Pseudo-UML for property fulfillment
Example: The method “copy” of the class Image has one generic implementation, and one
specialized implementation for subclasses whose Storage is Contiguous.
Figure 1.7: Pseudo-UML for method dispatch over properties
Chapter 2
Dynamic implementation
As a proof of the viability of property-based genericity as a general-purpose, language-independent paradigm, we will present a Common Lisp extension which provides a fully dynamic implementation of the paradigm.
In most programming languages, the only possibility offered to extend the language is by
adding functions and classes. Common Lisp, thanks to the simplicity of its syntax, its powerful
macro system and the dynamic capabilities of the Common Lisp Object System (CLOS), allows
the library implementor to add whole new syntactic rules to the language. The syntax for the
interface of a Common Lisp library can be designed to match exactly its semantics: creating a
library is actually equivalent to creating a Common Lisp dialect.
In this chapter, we will place ourselves in the role of a Common Lisp library creator, that
is, a language extension designer. We will present the incremental creation of the syntax for a
Common Lisp extension which supports property-based genericity. Then, we will see how such
an extension is actually implemented.
2.1 Lisp syntax for properties
2.1.1 Properties
The first thing we need to be able to do is create a property. It seems logical to use a syntax
close to that of class inheritance, as both properties and subclasses express a refinement of the
definition of an abstract class.
(defproperty storage (image))
Listing 2.1: Syntax for property creation
So far, so good. Then, we need a syntax to create property values. A possibility is to use the
inheritance syntax again:
(define-property-value contiguous (storage))
Listing 2.2: Syntax for property value creation
But this syntax is semantically unsatisfactory. There is a fundamental difference between
abstract class characterization and property valuation:
• On one hand, an abstract class can exist without any property.
• On the other hand, properties are intrinsically dependent on the existence of their values.
A property makes no sense if no value is associated with it, since in that
case concrete classes cannot fulfill the property.
Therefore, we need to ensure that as soon as the user creates a property, they also list the
possible values for this property. This can be enforced by grouping together the creation of
properties and property values in a single expression:
(defproperty storage (image)
  (contiguous
   disrupted
   piecewise
   on-the-fly))
Listing 2.3: Syntax for property creation with values
Now, we can integrate the fact that the values of a property do not form a list, but rather a forest:
the direct values are the roots of the trees, and subvaluation is the parent relation.
The Lisp syntax is well adapted for tree representation (Seibel, 2003, chap. 13):
• nodes are represented as symbols;
• subtrees are represented as lists: the first element of the list is the internal node, and the
other elements are the children.
The macro defproperty becomes:
(defproperty storage (image)
  (contiguous
   (disrupted piecewise)
   on-the-fly))
Listing 2.4: Syntax for property creation with value arborescence
As a final refinement, we want to enable the user to only specify properties when they differ
from the most common value; that is, we want to provide a default value for properties. This
is actually what we can call an “optional” syntactic element. Adding such global options to a
“definition macro” is generally done using an association list (also called alist): each option is a
sublist whose first element designates the name of the option and whose remaining elements give its value.
This finally gives us the following syntax for defproperty:
(defproperty storage (image)
  (contiguous
   (disrupted piecewise)
   on-the-fly)
  (:default contiguous))
Listing 2.5: Syntax for property creation with value arborescence and default value
2.1.2 Concrete classes
The second part of the syntax extension is the property fulfillment by a concrete class. Let us
first study the standard class definition macro:
(defclass new-class (superclass1 superclass2)
((slot1)
(slot2))
(:documentation "A dummy class.")
(:metaclass standard-class))
This syntax is composed of four parts:
• the name of the class (new-class);
• the list of the direct superclasses (superclass1 and superclass2);
• the list of the slots (slot1 and slot2), which other languages generally call attributes or member
variables;
• a set of options (:documentation and :metaclass), with the same “alist” syntax we have used for defproperty.
With this composition, a consistent extension is to list the properties as an extra option in the
final part, which results in the following syntax:
(defclass my-image-class (image)
  ((slot1)
   (slot2))
  (:documentation "An example concrete class.")
  (:properties :storage contiguous
               :grid rectangular))
Listing 2.6: Extended syntax for class creation with property fulfillment
2.1.3 Dispatch
A distinctive feature of properties is that they allow methods to be dispatched according to more than
just the object’s class. In most languages, this is difficult to express, as the dispatch is implied by
the fact that a method belongs to a class. For example, SCOOP needs to use normal overloaded
functions and perform property-wise dispatch “by hand”, because member functions do not
permit any variation in the dispatch process.
In CLOS however, methods do not belong to classes. Instead, dispatch is performed by generic
functions, which can choose between several implementations (the so-called methods) according
to the type of their arguments.
;; Define a generic function
(defgeneric f (x))
;; Define methods for several classes
(defmethod f ((x a-class))
  (print "a-class"))
(defmethod f ((x another-class))
  (print "another-class"))
;; Dynamic dispatch
(defvar x (make-instance 'a-class))
(defvar y (make-instance 'another-class))
(f x) ; prints "a-class"
(f y) ; prints "another-class"
Listing 2.7: Example of dispatch in CLOS
This approach allows us to express property-wise dispatch in the same way as we express class-wise dispatch. With a simple syntactic extension which resembles the one we used in the
defclass option¹, we can express property-wise dispatch with a standard defmethod.
(defgeneric copy (ima))
;; Generic method
(defmethod copy ((ima image))
  #| Implement generic copy |#)
;; Specialized method for contiguous images
(defmethod copy ((ima image :storage contiguous))
  #| Implement batch copy |#)
Listing 2.8: Syntax for property-wise dispatch
2.1.4 Conclusion
As a demonstration of the conciseness of the syntax we just created, we present here a complete
code example for the creation of a property hierarchy, along with the corresponding pseudo-UML.
(defclass image ()
  ())
(defgeneric copy (ima))
(defproperty storage (image)
  (contiguous
   (disrupted piecewise)
   on-the-fly))
(defmethod copy ((ima image))
  (print "Image copied using default algorithm."))
(defmethod copy ((ima image :storage contiguous))
  (print "Contiguous image copied."))
(defclass my-image-class (image)
  ((data :type vector))
  (:properties :storage contiguous))
Listing 2.9: An example property hierarchy (Lisp)
Figure 2.1: An example property hierarchy (UML)
¹ This syntax (:key1 value1 :key2 value2) is actually widely used in Common Lisp as a lighter alternative
to alists, in cases where there is no risk of confusing keys and values (typically, when values are atoms). It is called
a property list, or plist for short.
2.2 Implementation
Now that we have seen what the user interface for properties looks like, let’s take a look at how
we implemented them.
2.2.1 Revisiting inheritance-based properties
In the first chapter, we said that inheritance and language-provided dispatch were ill-suited to
implement property-based genericity. This holds true in the context of a static language such as
C++, but the flexibility and dynamic capabilities offered by Common Lisp allow us to reconsider
this statement:
• Since we can create classes on the fly, the problem caused by the huge number of concrete
classes doesn’t exist anymore. Instead of creating a priori all the classes we might need, we only
need to automate the creation of the necessary class just before instantiating an image.
• The classes are not actually written by the user; they are generated by the library. This
generator code is able to perform all kinds of static checks, such as the existence and uniqueness
checks we mentioned in section 1.1, and to fail if they are not satisfied.
So, implementing properties using inheritance is possible in Common Lisp. Let’s see how we
do it.
• At the top of the hierarchy is the abstract class; in our running example, the class image.
• Then, the classes representing property values inherit from the associated property. This
inheritance can be indirect: if V2 is a subvalue of V1, then V2 is a direct subclass of V1,
which is itself a subclass of the abstract class.
• Finally, a concrete class directly inherits from all of its property values.
This whole hierarchy is described by the UML diagram in Figure 2.2.
One may argue that this class hierarchy does not respect the “C1 is a C2” semantics of the
inheritance relation. This is actually true; but the important point here is that inheritance is
only used as an internal means of achieving the correct dispatch. The semantics are those of
properties.
Figure 2.2: Class hierarchy for properties implementation
2.2.2 Linking with the user interface
Now, we have a user interface to describe properties, and a class diagram to implement it. What
is left to do is the glue so that the former generates code which implements the latter while
checking for its correctness.
Properties
To implement defproperty, we will use a very powerful Common Lisp feature called macros.
A macro is like a function which, instead of being called at runtime with its arguments already
evaluated, is called at compile time with its arguments unevaluated. Common Lisp macros differ
from C/C++ macros in the following aspects:
• the arguments are not passed to it as a verbatim text representation: they are Lisp objects
composed of the lists and atoms that form the abstract syntax tree (AST);
• instead of doing just text replacement and concatenation, the whole language is available
when processing a macro. Therefore, it can explore this AST, modify it and generate code
depending on its structure.
In our case, the macro defproperty will:
• check for the existence of the abstract class;
• browse (depth-first) the given property values and generate the corresponding defclasses;
• store the necessary information in a global hash table: using the name of the property as a
key, the stored data references the different values and the default value.
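The depth-first browsing step listed above can be sketched independently of Lisp. The following C++ fragment is only an illustration of the traversal, not the report's implementation: ValueNode, ClassDecl and classes_to_define are hypothetical names, and the sketch assumes that direct values take the abstract class itself as superclass.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// One node of the value forest: a value name and its subvalues.
struct ValueNode {
  std::string name;
  std::vector<ValueNode> children;
};

// A class to generate, as a (class name, superclass name) pair.
typedef std::pair<std::string, std::string> ClassDecl;

// Depth-first walk: each value becomes a class inheriting from its parent value.
void collect(const ValueNode& node, const std::string& parent,
             std::vector<ClassDecl>& out) {
  out.push_back(ClassDecl(node.name, parent));
  for (const ValueNode& child : node.children)
    collect(child, node.name, out);
}

// Assumption: direct values inherit from the abstract class itself.
std::vector<ClassDecl> classes_to_define(const std::string& abstract_class,
                                         const std::vector<ValueNode>& forest) {
  std::vector<ClassDecl> out;
  for (const ValueNode& root : forest)
    collect(root, abstract_class, out);
  return out;
}
```

Applied to the storage forest of Listing 2.4 with image as the abstract class, the walk yields four declarations, in which piecewise inherits from disrupted and the three direct values inherit from image.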
Concrete classes
While property declaration is a whole new syntax added to the language, property fulfillment by
a concrete class has been chosen to be an extension of an existing one. Therefore, it cannot be
done using a macro.
CLOS provides us with numerous possibilities to alter its internals, grouped in what is called the
Meta-Object Protocol (MOP). In particular, it calls a number of generic functions which indicate
how a defclass declaration is transformed into an actual class representation. The one which
processes the options passed to defclass is called ensure-class-using-class. This function
receives, among other arguments not useful here, the list of defclass options (as a plist) and
the list of superclasses.
What we need to do here is not to override the default behavior of ensure-class-using-class;
we only need to take the :properties option, create an abstract class which inherits from the
classes associated with these properties, and finally add it to the superclasses. For such a task,
CLOS provides us with an elegant system called secondary methods (auxiliary methods in standard CLOS terminology).
Secondary methods are, as their name suggests, additional methods which are not meant to
alter the semantics of the main (or primary) methods, but to add a check or, in our case, a
small behavior to the primary methods. A secondary method is declared by adding a qualifier keyword
between the method name and its arguments:
(defmethod my-function :before (arg1 arg2)
  (implement ...))
Listing 2.10: A secondary method
There are several types of secondary methods; the one which interests us is the :around
method. An around method is called before the primary methods, and is allowed to alter the
arguments before calling the built-in call-next-method which, in turn, calls the primary method.
The body of the around method is the following:
• Check if there is a :properties option; if not, do nothing and directly call call-next-method;
• Check if one of the superclasses has associated properties, and if all the provided properties
and values exist for this class; if not, signal an error;
• Determine, thanks to the global hash table, all the property values;
• Create an empty class which inherits from all the associated classes, or retrieve it if it
already exists;
• Add this class to the list of superclasses;
• Call call-next-method.
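The checking and class-retrieval steps above can be sketched in a language-neutral way. The C++ fragment below is an illustration under stated assumptions rather than the actual Lisp code: PropertyInfo, PropertyTable, resolve_properties and intern_class are hypothetical names, the table stands in for the global hash table filled by defproperty, and a generated class is represented by a plain integer identifier.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for one entry of the global hash table.
struct PropertyInfo {
  std::set<std::string> values;  // every declared value, at any depth
  std::string default_value;     // empty when no default was declared
};
typedef std::map<std::string, PropertyInfo> PropertyTable;  // property -> info
typedef std::map<std::string, std::string> Assignment;      // property -> value

// Existence check, then one value per declared property (uniqueness is
// guaranteed by the map: a property name can only be bound once).
Assignment resolve_properties(const PropertyTable& table,
                              const Assignment& requested) {
  for (const auto& kv : requested) {
    PropertyTable::const_iterator it = table.find(kv.first);
    if (it == table.end())
      throw std::invalid_argument("unknown property: " + kv.first);
    if (it->second.values.count(kv.second) == 0)
      throw std::invalid_argument("unknown value: " + kv.second);
  }
  Assignment resolved;
  for (const auto& kv : table) {
    Assignment::const_iterator it = requested.find(kv.first);
    if (it != requested.end())
      resolved[kv.first] = it->second;
    else if (!kv.second.default_value.empty())
      resolved[kv.first] = kv.second.default_value;
    else
      throw std::invalid_argument("missing property: " + kv.first);
  }
  return resolved;
}

// Create the class for an assignment, or retrieve it if it already exists.
int intern_class(std::map<Assignment, int>& cache, const Assignment& resolved) {
  std::map<Assignment, int>::iterator it = cache.find(resolved);
  if (it != cache.end()) return it->second;
  int id = static_cast<int>(cache.size());
  cache[resolved] = id;
  return id;
}
```

Keying the cache on the full assignment means two concrete classes that fulfill the same values share a single generated superclass, which is what keeps the number of classes proportional to the combinations actually used.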
Dispatch
The syntax extension we use to express property-wise dispatch resembles the one we use for
defclass. However, CLOS does not provide a way to extend defmethod the way we intend to.
This leaves us with little choice but to provide our own wrapper macro around defmethod which
we call defalgo.
For each argument, defalgo will check whether and how it is specialized. If it is unspecialized or class-specialized, it is left as is. If it is property-specialized, then the three atoms
(abstract-class :property value) are replaced with the name of the class associated with
the given value.
Conclusion
This report presented a programming paradigm which extends the capabilities of object-oriented
design, called property-based genericity. We showed that, although it originates from a very static
C++ context, this paradigm is well suited for use in a completely dynamic context.
The Common Lisp implementation is still quite basic, but it already shows some very interesting potential. Now that we have this working base, we will be able to experiment with some additions
to the paradigm, such as additional checks regarding the compatibility of certain properties (does
a given grid exist in dimension N?). To further prove its viability, we will also incorporate this
implementation into the Common Lisp Image Manipulation Bundle (Climb), the LRDE’s image
processing library project.
Appendix A
Bibliography
Alexandrescu, A. (2001). Modern C++ Design: Generic Programming and Design Patterns
Applied.
Ballas, N. (2008). Image taxonomy in Milena. Technical report, EPITA Research and Development Laboratory (LRDE).
Coplien, J. O. (1995). Curiously recurring template patterns. C++ Report, pages 24–27.
Géraud, T. and Levillain, R. (2008). Semantics-driven genericity: A sequel to the static C++
object-oriented programming paradigm (SCOOP 2). In 6th International Workshop on Multiparadigm Programming with Object-Oriented Languages.
Seibel, P. (2003). Practical Common Lisp.