Property-Based Genericity: A Dynamic Approach

Loïc Denuzière

Technical Report no 1003, June 2010, revision 2186

Property-based genericity is an object-oriented programming paradigm. It makes it possible to model, in a generic way, some systems which are hard to represent using classical object-oriented programming. It was introduced by the C++-oriented SCOOP paradigm used in Olena, an image processing library. The key principle of this paradigm is to describe a class by the list of properties its instances must have, rather than by its inheritance tree. We will present this paradigm and show that it can be extended to languages other than C++ and to fields other than image processing. We will then introduce an example implementation of property-based genericity in Common Lisp, which takes advantage of the language's dynamic capabilities and its extensibility.

Keywords: Climb, Lisp, Olena, genericity, properties

Laboratoire de Recherche et Développement de l’Epita
14-16, rue Voltaire – F-94276 Le Kremlin-Bicêtre cedex – France
Tél. +33 1 53 14 59 47 – Fax. +33 1 53 14 59 22
denuziere@lrde.epita.fr – http://www.lrde.epita.fr/

Copying this document

Copyright © 2010 LRDE.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with the Invariant Sections being just “Copying this document”, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is provided in the file COPYING.DOC.

Contents

1 Property-based genericity
    1.1 Limits of classical object-oriented programming
    1.2 The SCOOP approach
    1.3 Definition of property-based genericity
2 Dynamic implementation
    2.1 Lisp syntax for properties
    2.2 Implementation
A Bibliography

Introduction

The image processing library Olena, which has been under development at the EPITA Research and Development Laboratory (LRDE) for more than ten years, is well known for its high level of genericity. A whole new programming paradigm, called Static C++ Object-Oriented Paradigm (SCOOP), was developed in order to make this possible. While making extensive use of C++-specific idioms, such as the Curiously Recurring Template Pattern (CRTP) (Coplien, 1995), SCOOP defines a general set of concepts which enhance the polymorphic capabilities of object-oriented programs.

In this paper, we first present the genericity issues that this paradigm addresses. Then, after briefly describing the way SCOOP overcomes these issues, we give a formal definition of property-based genericity. Finally, as a proof of its independence from SCOOP and C++, we present a fully dynamic implementation in Common Lisp.

Acknowledgements

Thanks to Didier Verna and Christopher Chedeau for their help and support.
Chapter 1 Property-based genericity

1.1 Limits of classical object-oriented programming

Let’s consider an image processing library. In terms of genericity, it aims at the following features (Ballas, 2008):

• Different types of images:
    – Any-dimensional (1D, 2D, 3D...);
    – Supported by different types of grids: rectangular, honeycomb, triangular, or even irregular;
    – Stored contiguously, by pieces, calculated on-the-fly, or any other data retrieval method;
    – Fully loadable in memory or not;
    – And many other possible properties;
• Different types of values stored in the images: RGB, grayscale, labels, symbolic data, vectors, or any user-defined type;
• Different types of algorithms available for these images.

The critical point is to be able to write each generic algorithm once and have the library make it available on all supported kinds of images.

Another feature that should be available is the possibility to write specialized versions of these generic algorithms for images with a given property. Take the example of the image copying algorithm:

• the generic algorithm has to iterate over the sites of the image in order to copy the values individually;
• contiguously-stored images can get a significant speedup by using system-provided primitives to batch-copy the data (e.g. memcpy in C).

When trying to do this using a classical object-oriented approach, many problems arise. We will show them in C++, but they are not language-specific.

Considering that any type of image can contain any type of value, the first logical step is to make the image type parameterized by the type of value, using what many languages call generics or, in the case of C++, templates:

    template <class Value>
    class Image
    {
    public:
      Value get(Point p);
      // ...
    };

Listing 1.1: Image values as template parameters

Then we have to deal with the problem of the multiple image properties.
We can declare one of these properties at a time using inheritance:

    template <class Value>
    class ContiguousImage : public Image<Value>
    {
      std::vector<Value> _data;
      // ...
    };

    template <class Value>
    class DisruptedImage : public Image<Value>
    {
      // Implement disrupted storage...
    };

Listing 1.2: Inheritance for a single property

However, this solution is extremely limited. How can several image properties be combined using inheritance? At first, multiple inheritance seems suitable for this purpose, but it actually has several flaws:

• The great number of properties (Olena, for example, declares fifteen image properties, with two to six values for each) quickly leads to a combinatorial explosion of the number of classes that need to be declared.
• This design doesn’t allow for a number of essential checks. For example, each concrete image class must be associated exactly once with each property: an image cannot be both contiguous and disrupted, and it must have a storage property. With this design, however, one can declare an image class which inherits from several of them, or from none.

1.2 The SCOOP approach

In order to address this kind of situation, the LRDE’s Olena team has developed and refined SCOOP: the Static C++ Object-Oriented Paradigm (Géraud and Levillain, 2008). This paradigm brings together a number of known C++ idioms, most of which we will not deal with here, to perform a great number of checks at compile time.

The technique used by SCOOP to solve the genericity problem is based on type traits (Alexandrescu, 2001). Each image property is implemented as an inheritance tree of empty classes. A traits class is then defined that declares, for each concrete image class, the appropriate “value” of each property, i.e., that typedefs all the properties.
    // Each property is a namespace, and each property value is a class
    namespace storage
    {
      class any {};
      class contiguous : public any {};
      class disrupted : public any {};
    }

    // Define the traits class
    template <class I>
    struct traits {};

    // Define a concrete image class
    template <class V>
    class my_image_class : public image<V>
    {
      // Implement class...
    };

    // Specialize the traits class to declare the properties of the concrete class
    template <class V>
    struct traits< my_image_class<V> >
    {
      typedef storage::contiguous storage;
      typedef grid::rectangular grid;
      // ...
    };

Listing 1.3: Defining properties in SCOOP

Using this traits class, one can then declare generic algorithms that work on all kinds of images, and specialized versions depending on given properties. All that is needed is a facade function that explicitly dispatches according to the image’s traits.

    // The generic algorithm
    template <class I>
    I copy_impl(const I& ima, storage::any)
    {
      // Implement value-by-value copy
    }

    // The specialized (overloaded) algorithm for contiguous storage
    template <class I>
    I copy_impl(const I& ima, storage::contiguous)
    {
      // Implement copy using memcpy()
    }

    // The facade function
    template <class I>
    I copy(const I& ima)
    {
      return copy_impl(ima, typename traits<I>::storage());
    }

Listing 1.4: Property-wise dispatch in SCOOP

Note that although copy is a regular function (as opposed to a member function), its behavior is quite similar to that of a (statically dispatched) method¹. Therefore, in the rest of this report, we will refer to both collectively as “methods”.

¹ The fact that this dispatch is static is due to performance-driven design choices in SCOOP. Such dispatch may just as well be made dynamically, as we will see later.
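The fragments of Listings 1.3 and 1.4 can be assembled into a small self-contained program. The following is only a minimal sketch under stated assumptions, not Olena's actual code: the image classes `simple_image` and `sparse_image`, the member name `storage_type` (used instead of `storage` to avoid reusing the namespace name inside the class), and the `last_algorithm` variable that records which overload ran are all hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// The "storage" property: each value is an empty tag class.
namespace storage {
  struct any {};
  struct contiguous : any {};
  struct disrupted : any {};
}

// Traits class, specialized once per concrete image class.
template <class I> struct traits;

// Hypothetical concrete image whose values sit in one contiguous buffer.
template <class V>
struct simple_image {
  std::vector<V> data;
};

template <class V>
struct traits< simple_image<V> > {
  typedef storage::contiguous storage_type;
};

// Hypothetical concrete image that stores only some of its sites.
template <class V>
struct sparse_image {
  std::map<std::size_t, V> data;
};

template <class V>
struct traits< sparse_image<V> > {
  typedef storage::disrupted storage_type;
};

// Illustration only: remembers which implementation the last copy() used.
static std::string last_algorithm;

// Generic implementation: viable for every storage value.
template <class I>
I copy_impl(const I& ima, storage::any) {
  last_algorithm = "generic";
  return ima;
}

// Specialized implementation: preferred when the tag is storage::contiguous.
template <class I>
I copy_impl(const I& ima, storage::contiguous) {
  last_algorithm = "contiguous";
  return ima;
}

// Facade: builds the tag object from the traits and lets overload
// resolution pick the implementation at compile time.
template <class I>
I copy(const I& ima) {
  return copy_impl(ima, typename traits<I>::storage_type());
}
```

Calling `copy` on a `simple_image<int>` selects the contiguous overload, because an exact match on the tag type beats the derived-to-base conversion to `storage::any`. Calling it on a `sparse_image<int>` falls back to the generic overload, since `storage::disrupted` only converts to `storage::any`.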
1.3 Definition of property-based genericity

So far, we have only described a C++-specific technique that allows us to deal with genericity problems caused by the great number of properties a class can have. We are now going to give a general, language-independent definition of the property-based genericity paradigm, along with a pseudo-UML notation.

Property-based genericity is an object-oriented programming paradigm. It introduces the notions of property and property value, and a number of relations between them and the existing notions of classes, inheritance and method dispatch.

Figure 1.1: Pseudo-UML for property

Figure 1.2: Pseudo-UML for property value

Characterization

A property P is always linked to one and only one abstract class C. An abstract class can be characterized by any number of properties. This relation is read: “P characterizes C” or “C is characterized by P”.

Example: Storage characterizes Image.

Figure 1.3: Pseudo-UML for abstract class characterization

1.3.1 Property valuation

A property value V is always linked to one and only one property P. This relation is read: “V is a value of P” or “P can be V”.

Example: Storage can be Contiguous.

We see here that a property is generally referred to using a noun, while a property value is referred to using an adjective.

Figure 1.4: Pseudo-UML for property valuation

1.3.2 Property subvaluation

Two values V1 and V2 of the same property P can be linked by a subvaluation relation. As a consequence, the set of values of a property forms a forest. This relation is read “V1 is a subvalue of V2”.

Example: In the Storage property, Piecewise is a subvalue of Disrupted.

When a value V1 of the property P is not a subvalue of any other value, we call it a direct value of P.
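In a C++ setting such as the one of Section 1.2, subvaluation maps naturally onto inheritance between tag classes, so that “V1 is a subvalue of V2” can be tested at compile time. The sketch below assumes this encoding. The helper `is_subvalue` is a hypothetical name, not part of SCOOP, and the artificial root `any` follows the convention of Listing 1.3 rather than the definition above, where the values form a forest without a common root.

```cpp
#include <cassert>
#include <type_traits>

namespace storage {
  struct any {};                    // artificial root, as in Listing 1.3
  struct contiguous : any {};       // direct value of Storage
  struct disrupted : any {};        // direct value of Storage
  struct piecewise : disrupted {};  // Piecewise is a subvalue of Disrupted
}

// Hypothetical helper: V1 is a subvalue of V2 when V1 derives from V2
// (directly or through intermediate values) and is not V2 itself.
template <class V1, class V2>
struct is_subvalue
  : std::integral_constant<bool,
                           std::is_base_of<V2, V1>::value &&
                           !std::is_same<V1, V2>::value> {};

static_assert(is_subvalue<storage::piecewise, storage::disrupted>::value,
              "Piecewise must be a subvalue of Disrupted");
static_assert(!is_subvalue<storage::contiguous, storage::disrupted>::value,
              "Contiguous must not be a subvalue of Disrupted");
```

Because the checks are `static_assert`s, a wrongly declared value tree is rejected by the compiler instead of being discovered at run time, which is in the spirit of the static checks SCOOP aims for.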
Figure 1.5: Pseudo-UML for property subvaluation

1.3.3 Property fulfillment

Any class C1 inheriting from an abstract class C2 must be linked to one value V for each property P that characterizes C2. This relation is read “The P of C1 is V”.

Example: The Storage of ClassicalImage is Contiguous.

This relation is transitive with respect to subvaluation: if the P of C is V, and V is a subvalue of V2, then the P of C is V2.

Example: If the Storage of MyImage is Piecewise, then it is also Disrupted.

Figure 1.6: Pseudo-UML for property fulfillment

1.3.4 Operation polymorphism

Property-based genericity extends operation polymorphism². In addition to the specialization on the dynamic class of their argument(s), as in classical object-oriented code, methods can also be specialized on the values of one or several properties.

Example: The method “copy” of the class Image has one generic implementation, and one specialized implementation for subclasses whose Storage is Contiguous.

Figure 1.7: Pseudo-UML for method dispatch over properties

² We include in the definition of “operation polymorphism” the static method binding used in SCOOP.

Chapter 2 Dynamic implementation

As a proof of the viability of property-based genericity as a general-purpose, language-independent paradigm, we will present a Common Lisp extension which provides a fully dynamic implementation of the paradigm.

In most programming languages, the only way to extend the language is to add functions and classes. Common Lisp, thanks to the simplicity of its syntax, its powerful macro system and the dynamic capabilities of the Common Lisp Object System (CLOS), allows the library implementor to add whole new syntactic rules to the language. The syntax for the interface of a Common Lisp library can be designed to match its semantics exactly: creating a library is actually equivalent to creating a Common Lisp dialect.
In this chapter, we will place ourselves in the role of a Common Lisp library creator, that is, a language extension designer. We will present the incremental creation of the syntax for a Common Lisp extension which supports property-based genericity. Then, we will see how such an extension is actually implemented.

2.1 Lisp syntax for properties

2.1.1 Properties

The first thing we need to be able to do is create a property. It seems logical to use a syntax close to that of class inheritance, as both properties and subclasses express a refinement of the definition of an abstract class.

    (defproperty storage (image))

Listing 2.1: Syntax for property creation

So far, so good. Then, we need a syntax to create property values. One possibility is to use the inheritance syntax again:

    (define-property-value contiguous (storage))

Listing 2.2: Syntax for property value creation

But this syntax is semantically unsatisfactory. There is a fundamental difference between abstract class characterization and property valuation:

• On the one hand, an abstract class can exist without any property.
• On the other hand, properties are intrinsically dependent on the existence of their values. A property makes no sense if no value is associated with it, since in this case the concrete classes cannot fulfill the property.

Therefore, we need to ensure that as soon as the user creates a property, they also list the possible values for this property. This can be enforced by grouping together the creation of properties and property values in a single expression:

    (defproperty storage (image)
      (contiguous disrupted piecewise on-the-fly))

Listing 2.3: Syntax for property creation with values

Now, we can integrate the fact that the values of a property are not a list, but rather a forest: the direct values are the roots of the trees, and subvaluation is the parent relation.
The Lisp syntax is well adapted for tree representation (Seibel, 2003, chap. 13):

• nodes are represented as symbols;
• subtrees are represented as lists: the first element of the list is the internal node, and the other elements are the children.

The macro defproperty becomes:

    (defproperty storage (image)
      (contiguous (disrupted piecewise) on-the-fly))

Listing 2.4: Syntax for property creation with value arborescence

As a final refinement, we want to enable the user to only specify properties when they differ from the most common value; that is, we want to provide a default value for properties. This is what we can call an “optional” syntactic element. Adding such global options in a “definition macro” is generally done using an association list (also called alist): each option is a sublist, whose first element designates the name of the option and whose remaining elements are the value. This finally gives us the following syntax for defproperty:

    (defproperty storage (image)
      (contiguous (disrupted piecewise) on-the-fly)
      (:default contiguous))

Listing 2.5: Syntax for property creation with value arborescence and default value

2.1.2 Concrete classes

The second part of the syntax extension is the property fulfillment by a concrete class. Let us first study the standard class definition macro:

    (defclass new-class (superclass1 superclass2)
      ((slot1)
       (slot2))
      (:documentation "A dummy class.")
      (:metaclass standard-class))

This syntax is composed of four parts:

• the name of the class (new-class);
• the list of the direct superclasses (superclass1 and superclass2);
• the list of the slots, which other languages generally call attributes or member variables (slot1 and slot2);
• a set of options (:documentation and :metaclass), with the same “alist” syntax we have used for defproperty.
With this composition, a consistent extension is to list the properties as an extra option in the final part, which results in the following syntax:

    (defclass my-image-class (image)
      ((slot1)
       (slot2))
      (:documentation "An example concrete class.")
      (:properties :storage contiguous
                   :grid rectangular))

Listing 2.6: Extended syntax for class creation with property fulfillment

2.1.3 Dispatch

A distinctive feature of properties is that they allow methods to be dispatched according to more than just the object’s class. In most languages, this is difficult to express, as the dispatch is implied by the fact that a method belongs to a class. For example, SCOOP needs to use normal overloaded functions and perform property-wise dispatch “by hand”, because member functions do not permit any variation in the dispatch process.

In CLOS however, methods do not belong to classes. Instead, dispatch is performed by generic functions, which can choose between several implementations (the so-called methods) according to the type of their arguments.

    ;; Define a generic function
    (defgeneric f (x))

    ;; Define methods for several classes
    (defmethod f ((x a-class))
      (print "a-class"))
    (defmethod f ((x another-class))
      (print "another-class"))

    ;; Dynamic dispatch
    (defvar x (make-instance 'a-class))
    (defvar y (make-instance 'another-class))
    (f x) ; prints "a-class"
    (f y) ; prints "another-class"

Listing 2.7: Example of dispatch in CLOS

This approach allows us to express property-wise dispatch in the same way as we express class-wise dispatch. With a simple syntactic extension which resembles the one we used in the defclass option¹, we can express property-wise dispatch with a standard defmethod.
    (defgeneric copy (ima))

    ;; Generic method
    (defmethod copy ((ima image))
      #| Implement generic copy |#)

    ;; Specialized method for contiguous images
    (defmethod copy ((ima image :storage contiguous))
      #| Implement batch copy |#)

Listing 2.8: Syntax for property-wise dispatch

¹ This syntax (:key1 value1 :key2 value2) is widely used in Common Lisp as a lighter alternative to alists, in cases where there is no risk of confusing keys and values (typically, when values are atoms). It is called a property list, or plist for short.

2.1.4 Conclusion

As a demonstration of the conciseness of the syntax we just created, we present here a complete code example for the creation of a property hierarchy, along with the corresponding pseudo-UML.

    (defclass image () ())

    (defgeneric copy (ima))

    (defproperty storage (image)
      (contiguous (disrupted piecewise) on-the-fly))

    (defmethod copy ((ima image))
      (print "Image copied using default algorithm."))

    (defmethod copy ((ima image :storage contiguous))
      (print "Contiguous image copied."))

    (defclass my-image-class (image)
      ((data :type vector))
      (:properties :storage contiguous))

Listing 2.9: (Lisp) An example property hierarchy

Figure 2.1: (UML) An example property hierarchy

2.2 Implementation

Now that we have seen what the user interface for properties looks like, let’s take a look at how we implemented them.

2.2.1 Revisiting inheritance-based properties

In the first chapter, we said that inheritance and language-provided dispatch were ill-suited to implement property-based genericity. This holds true in the context of a static language such as C++, but the flexibility and dynamic capabilities offered by Common Lisp allow us to reconsider this statement:

• Since we can create classes on the fly, the problem caused by the huge number of concrete classes doesn’t exist anymore.
  Instead of creating a priori all the classes we might need, we only need to automate the creation of the necessary class just before instantiating an image.
• The classes are not actually written by the user; they are generated by the library. This generator code is able to perform all kinds of static checks, such as the existence and uniqueness checks we mentioned in section 1.1, and to fail if they are not satisfied.

So, implementing properties using inheritance is possible in Common Lisp. Let’s see how we do it.

• At the top of the hierarchy is the abstract class; in our running example, the class image.
• Then, the classes representing property values inherit from the associated property. This inheritance can be indirect: if V2 is a subvalue of V1, then V2 is a direct subclass of V1, which is itself a subclass of the abstract class.
• Finally, a concrete class directly inherits from all of its property values.

This whole hierarchy is described by the UML diagram in Figure 2.2.

One may argue that this class hierarchy does not respect the “C1 is a C2” semantics of the inheritance relation. This is true; but the important point here is that inheritance is only used as an internal means of achieving the correct dispatch. The semantics are those of properties.

Figure 2.2: Class hierarchy for properties implementation

2.2.2 Linking with the user interface

Now, we have a user interface to describe properties, and a class diagram to implement it. What is left to do is the glue, so that the former generates code which implements the latter while checking its correctness.

Properties

To implement defproperty, we will use a very powerful Common Lisp feature called macros. A macro is like a function which, instead of being called at runtime with its arguments already evaluated, is called at compile time with its arguments unevaluated.
Common Lisp macros differ from C/C++ macros in the following aspects:

• the arguments are not passed to the macro as a verbatim text representation: they are Lisp objects composed of the lists and atoms that form the abstract syntax tree (AST);
• instead of doing just text replacement and concatenation, the macro has the whole language available when it is processed. Therefore, it can explore this AST, modify it and generate code depending on its structure.

In our case, the macro defproperty will:

• check for the existence of the abstract class;
• browse (depth-first) the given property values and generate the corresponding defclasses;
• store the necessary information in a global hashtable: using the name of the property as a key, the stored data references the different values and the default value.

Concrete classes

While property declaration is a whole new syntax added to the language, property fulfillment by a concrete class was designed as an extension of an existing one. Therefore, it cannot be done using a macro.

CLOS provides numerous ways to alter its internals, grouped in what is called the Meta-Object Protocol (MOP). In particular, it calls a number of generic functions which indicate how a defclass declaration is transformed into an actual class representation. The one which processes the options passed to defclass is called ensure-class-using-class. This function receives, among other arguments not useful here, the list of defclass options (as a plist) and the list of superclasses.

What we need to do here is not to override the default behavior of ensure-class-using-class; we only need to take the :properties option, create an abstract class which inherits from the classes associated with these properties, and finally add it to the superclasses. For such a task, CLOS provides us with an elegant system called secondary methods (auxiliary methods, in CLOS terminology).
Secondary methods are, as their name suggests, additional methods which are not meant to alter the semantics of the main (or primary) methods, but to add a check or, in our case, a small behavior to the primary methods. A secondary method is declared by adding a keyword qualifier between the method name and its arguments:

    (defmethod my-function :before (arg1 arg2)
      (implement ...))

Listing 2.10: A secondary method

There are several types of secondary methods; the one which interests us is the :around method. An around method is called before the primary methods, and is allowed to alter the arguments before calling the built-in call-next-method which, in turn, calls the primary method.

The body of the around method is the following:

• Check if there is a :properties option; if not, do nothing and directly call call-next-method;
• Check if one of the superclasses has associated properties, and if all the provided properties and values exist for this class; if not, signal an error;
• Determine, thanks to the global hashtable, all the property values;
• Create an empty class which inherits from all the associated classes, or retrieve it if it already exists;
• Add this class to the list of superclasses;
• Call call-next-method.

Dispatch

The syntax extension we use to express property-wise dispatch resembles the one we use for defclass. However, CLOS does not provide a way to extend defmethod the way we intend to. This leaves us with little choice but to provide our own wrapper macro around defmethod, which we call defalgo.

For each argument, defalgo will check whether and how it is specialized. If it is unspecialized or class-specialized, it is left as is. If it is property-specialized, then the three atoms (abstract-class :property value) are replaced with the name of the class associated with the given value.

Conclusion

This report presented a programming paradigm which extends the capabilities of object-oriented design, called property-based genericity.
We showed that, while originating from a very static C++ context, this paradigm is well suited for use in a completely dynamic one.

The Common Lisp implementation is still quite basic, but it already shows promising potential. Now that we have this working base, we will be able to experiment with additions to the paradigm, such as additional checks regarding the compatibility of certain properties (does a given grid exist in dimension N?). To further prove its viability, we will also incorporate this implementation into the Common Lisp Image Manipulation Bundle (Climb), the LRDE’s image processing library project.

Appendix A Bibliography

Alexandrescu, A. (2001). Modern C++ Design: Generic Programming and Design Patterns Applied.

Ballas, N. (2008). Image taxonomy in Milena. Technical report, EPITA Research and Development Laboratory (LRDE).

Coplien, J. O. (1995). Curiously recurring template patterns. In C++ Report, pages 24–27.

Géraud, T. and Levillain, R. (2008). Semantics-driven genericity: A sequel to the static C++ object-oriented programming paradigm (SCOOP 2). In 6th International Workshop on Multiparadigm Programming with Object-Oriented Languages.

Seibel, P. (2003). Practical Common Lisp.