Semantic Integrity in Component Based Development

Martin Blom, Eivind J. Nordby
Karlstad University, Department of Computer Science, 651 88 Karlstad, Sweden
{Martin.Blom, Eivind.Nordby}@kau.se

Abstract: Due to the distributed nature of the growing field of Component Based Software Engineering (CBSE) and the increasing complexity of modern systems, the semantic aspects of software components must be well specified. In this literature survey, we have investigated how semantic integrity aspects are promoted and managed for components. The importance of semantic aspects and of preserving the semantic integrity is well known and well understood in academia for traditional object-oriented (OO) programming. Although component based programming has many similarities with OO programming, there are certain differences which may cause the semantic integrity to be managed differently. In component-based development, when components are downloaded to be used in systems that may or may not behave well, the responsibilities for the functionality and robustness are not as clear as in traditional programming.

1 Introduction

Component-based software development is a growing field in computer science. Different frameworks and standards for components have been developed. Most of them center on CORBA, COM and JavaBeans. Components are normally described to the environment through some Interface Definition Language (IDL). The syntactic aspects are usually well-defined and syntactic constructs that are not accessible in all target languages are not allowed. Especially in mixed environments, like the target environment for CORBA, this helps the syntactic integration of components. The semantic aspects of components, however, are not so well supported and are often entirely left out. In traditional object-oriented programming, the semantic aspects have been a topic of great interest for a long period of time. Different ways of describing and handling them have been proposed [20,15].
Since objects and classes have a number of properties in common with components, most of the ideas developed in traditional and object-oriented programming theory will be usable in component-based development as well. There are, however, a number of properties specific to or more pronounced for components:

- Components may be constructed by third-party vendors.
- Components may be used in varying application environments.
- Component source code may not be available for evaluation.
- Components may be distributed.
- Components may be used from within different programming environments.

These properties may or may not be present in a given component-oriented environment. Investigations have shown that most successful component projects have been done in-house [4], a fact that eliminates the first property listed above. Some common components, such as plug-ins, can only be used with certain applications in certain operating systems. This eliminates the second point and is another example of how difficult properties of component usage are simplified or removed. Even if difficulties may be avoided by reducing some of the problems above, the need to describe semantic properties properly and to preserve the semantic integrity is imperative in component-based development. By the semantic integrity of a software system, we mean that each part of the system respects the intended purpose of any other individual part of that software system. This is a prerequisite for building stable systems and requires that each part be clearly described and that the description be maintained as that part evolves. Contracts, as described by Meyer [19] and expressed through pre- and postconditions as introduced by Hoare [13], are important tools for this description. Therefore, we need to study whether contracts are used at all in current literature on components and, if so, how they are specified and applied.

In the next section, we will present and define some concepts used in the survey.
The survey itself is presented in Section 3 and discussed in Section 4. In Section 5, we draw some conclusions from the survey.

2 Common terms used in this review

In this section, we state our interpretation of some terms used in this review.

2.1 Semantic Integrity

When discussing programming languages and programming techniques, the term semantics is one of the most important ones. Informally, semantics is what a program means, as opposed to what the program looks like, which is called syntax [1]. Syntax is a number of rules defining possible legal ways to combine the words and symbols (tokens) in the language. Semantics is what the different combinations of these tokens actually perform in terms of logical operations.

By Semantic Integrity, we mean the preservation of the collective semantic properties of a program, a module or a system. If the semantic properties are violated, the system will enter unstable or inconsistent states, which will, eventually, lead to some kind of malfunction. The largest problem is perhaps not to achieve semantic integrity once it is defined, but rather to define it in the first place and to maintain it over time when the parts of the system evolve. Some languages such as Eiffel [18,19,21] have assertion statements to ensure that semantic properties of modules are not violated, but most programming languages leave this task to the programmer. In any case, only a subset of all useful preconditions can be tested automatically, so programmers should always be disciplined in their approach to semantics.

There are different approaches to semantic integrity. The approaches covered by this survey include contracts, other descriptions of observable behavior and testing. The most common approach is to express semantic integrity requirements using assertions for an entire system (invariants) or for parts of a system (pre- and postconditions).
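As an illustration only, the assertion-based approach described above can be sketched in Python. The class `BoundedStack` and its method names are hypothetical and are not taken from any of the surveyed papers; the sketch assumes that assertions are enabled at run time (Python skips them when started with the -O option).

```python
class BoundedStack:
    """Hypothetical bounded stack, used only to illustrate assertions."""

    def __init__(self, capacity):
        # Precondition: the caller must ask for a positive capacity.
        assert capacity > 0, "precondition: capacity must be positive"
        self._capacity = capacity
        self._items = []
        assert self._invariant()

    def _invariant(self):
        # Invariant: must hold whenever control is outside the object.
        return 0 <= len(self._items) <= self._capacity

    def push(self, item):
        # Precondition: the caller's obligation.
        assert len(self._items) < self._capacity, "precondition: stack not full"
        old_size = len(self._items)
        self._items.append(item)
        # Postcondition: the provider's obligation.
        assert len(self._items) == old_size + 1 and self._items[-1] == item
        assert self._invariant()

    def pop(self):
        # Precondition: the caller's obligation.
        assert len(self._items) > 0, "precondition: stack not empty"
        old_size = len(self._items)
        item = self._items.pop()
        # Postcondition: exactly one element has been removed.
        assert len(self._items) == old_size - 1
        assert self._invariant()
        return item
```

A client that respects the preconditions never triggers an assertion. A client that pushes onto a full stack is stopped at the point of the violation instead of leaving the object in an inconsistent state.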
2.2 Object and Component

An object has a number of properties (methods and data elements), some exported and some private to the object. A number of methods are usually exported, thus defining the operational interface to the object. Data elements are usually not exported since their usage would be difficult to control. Private methods and data elements are accessed through the exported methods defined in the public interface. The interface methods are accessed from other parts of the system and need a clear definition to preserve the semantic integrity of the object.

Although the notion of a software component is more than thirty years old [17], there is still no clear definition available of what a component is. The literature suggests many varying definitions. Everything ranging from dll-files to stand-alone applications is considered a component by some authors [6,7,25]. Some authors argue that a component needs to be executable on its own whereas others say that a component is simply one or more classes with a common interface. We will not try to define what a component is in this survey, but accept the definition used by the authors of each individual publication. One point that is clear, however, is that components have certain properties, as mentioned in the introduction, that make them more complicated to use and maintain than objects.

2.3 Interface

According to the online Wordsmyth dictionary [29], an interface is whatever is needed (“equipment or programs”) in order to “communicate between different systems or programs”. When used to define modules, classes and components, however, it usually means the syntactic definition of this interface. This is what is called a signature in C++, and includes the function name and the number and types of the arguments. In other languages, it may also include the return type, the exceptions thrown and a few other attributes. This is also the implication of a CORBA IDL Interface Declaration [16].
This kind of interface description contains only an intuitive level of semantic description, through the choice of names for the functions and the formal arguments, and gives only very limited support to semantic integrity considerations.

2.4 Contract

Contracts take part in the process of preserving the semantic integrity of a software module. A contract in programming terms is closely analogous to a contract in real life. It defines certain obligations and benefits for both parties to an agreement. It is usually directional, having a provider and a consumer. The consumer must ensure that certain assertions are true before it can use the service offered by the provider. These assertions are called preconditions and it may or may not be possible to express them in a programming language [19]. The service providing function benefits from a contract because it does not have to check that the precondition is satisfied inside the service function itself, but can concentrate on providing the service as efficiently and cleanly as possible. The provider is obligated to produce the expected service. The consumer then knows that if it did satisfy the precondition, it is guaranteed correct service or output, as specified in the so-called postcondition.

Pre- and postconditions were first introduced by Hoare [13] in 1972 in his Hoare triples, but have been popularized in later years by Bertrand Meyer and the Eiffel Language [19]. What Meyer introduced was not only the actual term contract, which encompasses both pre- and postconditions, but also a clear definition of the responsibilities of both the provider and the consumer of a module. The contract should be viewed as a specification of responsibilities. Some authors, including Meyer himself, argue that a contract should be machine-testable and formalized to the extent that no ambiguities remain. What is often omitted is the fact that not all contracts can be formalized or even expressed in a machine-testable way.
There are also certain preconditions that are too costly to actually test. One example is to find out if a networked database can be updated. Checking this might require performing the actual update up to its final step, merely to find out whether the update is legal, which is clearly not desirable. There are numerous examples of situations where it is not possible to machine-test a contract. In Eiffel, this kind of “assertion” is represented as a comment.

The term contract is sometimes used as a synonym for the interface of the component [14]. This seems to be more common in component literature than in literature about OO or structured programming. Throughout this survey, however, we will use the term contract in the traditional sense, i.e. as a pre- and postcondition definition.

3 Semantic Integrity and Components

This section contains the actual survey of how semantic integrity aspects for components are treated in the literature. The survey concentrates on scientific and engineering proceedings and communications. Even if many authors discuss what is covered by our term “semantic integrity”, the term itself is not normally used, except in relation to databases. So, in order to find the relevant papers we have searched some common research and engineering databases for combinations of keywords like semantic, integrity, semantic description, software, component, interface, interface definition and contract.

We have found that authors have very different approaches to the semantic aspects when discussing components. Depending on the focus of the individual paper, the semantic awareness in the publication varies. We have identified five levels of semantic awareness, and have used them to structure this presentation according to the predominant aspect of each publication. The awareness levels can be summarized by the keywords “No semantics”, “Intuitive semantics”, “Pragmatic semantics”, “Executable semantics” and “Formal semantics”.
The level “No semantics” encompasses the discussions that focus exclusively on the syntactic interface descriptions. This level covers IDL and similar interface descriptions. We have mostly included this level for the sake of completeness, since it is not concerned with semantic aspects at all.

By “Intuitive semantics” we mean the discussions that point out the importance of interfaces being semantically consistent when components are to be substituted or when components evolve, but without further specifying what that means. This level covers unstructured descriptions and comments about what functions should do. It also covers testing, which is normally based more on an exhaustive attempt to find a wrong answer than on a description of the intended semantics of a system or a module.

“Pragmatic semantics” means that the designers and engineers are highly aware of the semantic implications and requirements of their components, but they do not express these semantics in any particular syntax or formalism. They express the semantic conditions as contracts inserted as comments in the design documentation, the interface descriptions or in the code.

“Executable semantics” means that the semantic aspects and contracts are expressed in some kind of executable language. They can be tested at run time but not used to prove that the program is correct. This covers the Object Constraint Language (OCL) [23] developed for the Unified Modeling Language (UML) and also various assertion constructions found in some programming languages.

Finally, “Formal semantics” covers the area of formal methods used to prove a program's semantic properties. There is a lot of research in this area. Specification languages like VDM, Z and λ fall into this category.

These five levels of semantic awareness are presented in the following sections.
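To make the distinction between the levels more concrete, the following minimal Python sketch describes one and the same operation at the first four levels. The account example and all function names are hypothetical and serve only as an illustration; the formal level is not shown, since it would require a specification language and a proof system rather than program code.

```python
# "No semantics": only the signature is available to the client.
def withdraw_signature_only(balance: int, amount: int) -> int:
    return balance - amount


# "Intuitive semantics": an unstructured comment hints at the purpose.
def withdraw_intuitive(balance: int, amount: int) -> int:
    """Take some money out of the account."""
    return balance - amount


# "Pragmatic semantics": the contract is stated, but only as a comment.
def withdraw_pragmatic(balance: int, amount: int) -> int:
    """Return the balance after withdrawing amount.

    Precondition:  0 < amount <= balance  (the caller's responsibility)
    Postcondition: result == balance - amount and result >= 0
    """
    return balance - amount


# "Executable semantics": the same contract, checked at run time.
def withdraw_executable(balance: int, amount: int) -> int:
    assert 0 < amount <= balance, "precondition violated"
    result = balance - amount
    assert result == balance - amount and result >= 0, "postcondition violated"
    return result
```

In this sketch, a call such as withdraw_pragmatic(100, 130) breaks the documented contract but silently returns a negative balance, whereas the executable variant raises an AssertionError for the same arguments, provided assertions are enabled.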
3.1 No Semantics

It is quite common that an interface for a component is described through its signatures alone, and that the interpretation is left more or less to the user's intuition. This can be illustrated by an exchange of ideas by email arranged and published by the journal “Software - Concepts and tools”. The participants were asked to answer the question "What characterizes a (software) component?" [7]. In the article, different definitions for the interface of a component are proposed. Wolfgang Pree and Gustav Pomberger state that “the component (module) interface is described either textually by means of an interface description language (IDL) or visually/interactively using appropriate tools”. Kai Koskimies proposes “An interface is a collection of signatures of services belonging logically together”.

3.2 Intuitive Semantics

There is a general consensus in the software engineering community that an interface definition language, like CORBA’s IDL, COM’s interface or Java’s interface definition, is unable to capture the semantic aspects of a component. Most authors refer to the semantic integrity of components informally, in terms of “maintaining the interface” or similar. Sometimes the word interface means the syntactic definition of a component, sometimes it is understood to include even the semantic aspects.

In the email exchange referred to above, Clemens Szyperski defines “A software component is a unit … with contractually specified interfaces”. Frantisek Plasil also stresses the semantic aspects, but does not include them in the interface concept: “A component requires/provides interfaces (plus contract description)”. Manfred Broy stresses the same point in a reaction to Wolfgang Pree’s definition above: “Description must not only mean the syntactic interface”. Jürgen Henn does not distinguish between objects and components in this respect, and defines a component as being defined “using well defined interfaces and behavior”.
In a later comment during the same email exchange, Wolfgang Pree proposes that the distinction between syntactic and semantic aspects of a component could form an initial classification scheme. His conclusion opens up a wide area of research: “Unfortunately, almost no practically relevant concepts and tools are around so far to tackle the problems related to semantic issues.”

In the same journal issue, Adele Goldberg raises some questions about a reuse business model [11], focused on the economy of reuse. In this very informal paper, she asserts that maintenance cost can be kept low when changing one component in the system, since “the component change and testing is done once, and the per application cost is the re-incorporation of each application’s use of the component”. One condition for this is that “the reusers do not alter the service expectations of the component”. A successful reuse strategy requires trust in the components, and “trust is partially built by knowing what to expect from any asset stored in the library”. Reuse also requires “determining which services the component offers and which services the component requires from the application framework or from other components”. She does not specify what such a specification should contain, but it must be possible to detect whether backward compatibility is broken, in other words whether the client modules need to be changed in order to maintain the semantic integrity of the system. The question she asks, “When can evolutionary demands permit backward compatibility to be broken, if ever?”, deserves careful attention.

In a recent paper in Information and Software Technology, Hall argues that successful reuse of components requires a good knowledge of the environment and architecture where the components are to be used [12]. He also argues against an exaggerated belief in a massive reuse of general components.
An oft-repeated argument for quality gain from the reuse of components in general, he says, is that the components will have been very well tested. He then writes, with reference to [28]: “Weyuker, among many others, has very properly warned against this view – when reusing a component it will most surely be used differently, and prior testing may not predict reliability in its new context: The fate of the Ariane 5 space rocket is a salutatory warning”. Hall's conclusion is thus that we cannot automatically expect massive reuse of general components, because the environments vary. The remedy that Weyuker promotes in her paper is extensive testing of every new combination of components. In both cases we find that the semantic variations are described in general terms like changed conditions and different environments and architectures.

3.3 Pragmatic Semantics

Under this title we wanted to study papers where the major focus was on a conscious, disciplined approach to semantic questions. We believe that such an approach will contribute to an increased quality even without being too focused on the formalities or executable test mechanisms. Our research group has investigated a few software engineering projects and found that the semantic aspects are to a large extent based on intuitive reasoning [3]. We expect the same conclusion to hold even for CBSE. Therefore, we are working on a pragmatic method that will highlight the semantic aspects in software development, based on the contract ideas, without requiring the contracts to be expressed in any particular formalism.

To our surprise, we could not find any recent publication with this approach. We find, however, that Meyer supports this approach. In his book Object-Oriented Software Construction [20], he devotes considerable space to this kind of argument before he discusses how the reasoning can be supported by executable assertions. There are also voices against this approach.
The creators of OCL, for instance, claim that constraints described in natural language will always result in ambiguities [23]. Therefore, a formal language that remains easy to read and write is needed. In our case studies, we found that pre- and postconditions were used, but there was a lack of understanding of their implications, so the conditions set up did not make much sense [3]. An understanding is needed even when working with testable contracts.

Our conclusion so far is that a conscious and disciplined approach is better than a purely intuitive one. A pragmatic understanding may be a useful step on the way from the intuitive handling of semantics to the executable contracts and assertions discussed in the next subsection. A pragmatic understanding is also the only tool available for all the semantic aspects which cannot be machine-tested at all.

3.4 Executable Semantics

This section discusses semantic descriptions that can be executed and controlled by the computer itself during run-time. A common approach to increased semantic integrity in both object-oriented and structured programming communities is to add or use assertions in the programming language. These assertions are then used to express preconditions and postconditions in programming language terms and to include mechanisms for testing these conditions during runtime.

This approach is used even in component contexts. One example is Cicalese [9] in the paper Behavioral Specification of Distributed Software Component Interfaces. The paper presents Biscotti, an extension of Java that enhances Java remote method invocation interfaces with Eiffel-style preconditions, postconditions and invariants.

The Object Constraint Language (OCL), a specification language recently developed in the framework of UML [23], is another example. There is an expectation that OCL might play a role even in the component area to describe semantic aspects of components.
OCL was developed in the framework of UML, but is used in some of the papers referred to in this survey for defining contract semantics for components. OCL is a declarative language for expressing side-effect-free conditions. It is suitable for expressing, for instance, invariants, preconditions, postconditions and guards. These are elements used in contracts.

One interesting paper trying to define contracts on different levels is Beugnard et al.'s article Making Components Contract Aware [2]. The authors define four levels of “increasingly negotiable properties”. The first level is the syntactic level where usual interface definition languages and programming languages are used. This is our "No semantics" level discussed above. The second level is the behavioral level where pre- and postconditions are defined using Eiffel or similar languages. It corresponds to our "Executable semantics" level discussed here. The authors suggest how to implement the contracts in an interface description language. The third level is the synchronization level where tools like path expressions and service object synchronization can be utilized. The fourth and final level is the quality of service level where the authors suggest tools like TAO (The Adaptive Communication Environment ORB). Levels three and four are outside the scope of this survey. The four levels are presented in Figure 1.

Level 4 Quality-of-service Level
Level 3 Synchronization Level
Level 2 Behavioural Level
Level 1 Syntactic Level

Figure 1: The Four Contract Levels according to Beugnard et al.

Tom Digre writes about high-level business components to be used in complex applications with a high demand on short development times. The Business Object Component Architecture (Boca) [10] is presented as a solution to the need for reusable components in this area. Boca is the result of work in the Object Management Group (OMG) during 1996.
It is an architecture-based, contract-centered approach and defines an environment for business-oriented component development. While it is true that the implementation framework, for example CORBA, defines a simplifying technology abstraction above the messaging infrastructure, this framework is not sufficiently close to the domain abstraction. In Boca, higher level domain components are described in a component definition language (CDL) expressing the domain specific semantic requirements (called contracts) in an implementation independent manner. “The contract abstraction isolates the domain specification from the technology implementation, preserving the integrity and interoperability of enterprise objects across an evolving technical infrastructure”. The contractual specifications are supported by CDL, which is a declarative superset of OMG’s IDL. Components are specified in terms of externally visible interfaces and semantics, not the implementation. Domain-specific abstractions can be defined.

3.5 Formal Semantics

Most authors are positive towards the use of contracts for the specification of components, but some argue that contracts are not enough and that formal methods and formal reasoning are necessary. Researchers are therefore doing interesting work on contracts on a more formal level. This research yields results that may not be directly applicable to existing technologies. Formal methods do not necessarily pay off, as indicated by Pfleeger and Hatton [24], but they can sometimes be modified and made more pragmatic in order to increase the pay-off.

Büchi and Sekerinski [8] identify the problem with just using IDLs since these can only express syntactical issues. They advocate the use of contracts, but argue that pre- and postconditions must be defined formally. They state that we need formal contracts along with refinement and nondeterminism in order to guarantee full encapsulation of a component.
By encapsulation they mean that the component is used as a black box, described by formal contracts. They argue that encapsulation scales up better than conventional contracts (pre- and postconditions).

Wolfgang Weck [27] discusses contracts as being a crucial part of a component and argues that the contracts should be the only part necessary for component composition and inheritance. The main focus in his paper is not how to practically implement contracts in commercial and practical environments, but he has an interesting discussion on the separation of specification and implementation. He stresses the differences between combining components through their interfaces and actually binding the code which implements the interface.

An interesting and unusual approach to semantic quality for object components was taken by Nordhagen [22] in her recent doctoral thesis at the University of Oslo. Nordhagen calls herself “a practitioner who resorted to theory” in order to define the similarity of software. She defines an object component as a collection of collaborating objects. Her approach is not based on the contract concept, but on a mathematical definition of observable behavior. Two object components are defined to have the same behavior in a context if and only if they react identically to all possible input sequences in that context. Even nondeterministic behavior is taken into account. She defines an object-oriented calculus, which she calls Omicron. Based on Omicron, she then develops a number of pragmatic rules that should be observed in order for a substitution of an object component by another one to be safe. Many of the rules correspond to common design practice. However, some requirements traditionally considered to be necessary are not, and a few new, unexpected ones were detected. One of the new discoveries is that, if you want to refer to individual objects in the component, you have to make them explicitly visible.
Otherwise you cannot substitute the component with another one in a safe manner.

4 Summary and discussion

In this section we summarize and discuss the different semantic approaches. In an attempt to structure the great variety of approaches, we start by suggesting a three-dimensional taxonomy for components. Some new references are introduced during this discussion.

4.1 A Taxonomy for Components

During the survey, we have seen that the discussions about component based software engineering take place on different planes and in different dimensions. We have ended up identifying the three dimensions "Abstraction level", "Life cycle stage" and "Semantic awareness". The main focus of the survey has been the semantic awareness dimension, which was introduced with the survey in the previous section. We will now present the other two dimensions.

Some authors refer to a level of abstraction for components. When done explicitly, it is most often related to reuse and done in terms of high level or business level components. These components are normally specialized and are conceived as big and coarse. The semantic level varies between intuitive and executable. Most authors agree that this is not an area for general off-the-shelf components, but that each organization needs to fit its components into its own business level framework. The opposite would be low level, general components with small granularity and a larger potential for reuse. They can be exemplified by components for list and stack management, communication infrastructure and some general-purpose middleware. The technical mechanisms for this kind of component are provided by the CORBA, COM and JavaBeans standards, which are often referred to as “plumbing standards”. Some authors expect a market for off-the-shelf components in this area.

We also see that different stages of the component life cycle are discussed. Components are defined, they are potentially reused and they are maintained.
The definition stage is when the need for a component is identified and one decides to construct one. The definition of a component is typically expressed using some form of Interface Definition Language, like the CORBA IDL or Java Interfaces. In this stage, the semantics tend to focus on the intended application.

The reuse stage is when the engineers try to define components for reuse or try to find existing components satisfying their need for a specific purpose. Reuse can be of different kinds, for example copy and paste reuse, binary reuse and adaptation for reuse, all of which raise different semantic considerations. Reuse may be discussed in terms of application expectations. It is often concluded that components are seldom reused as they are, but that some kind of adaptation is needed [5]. This discussion is often conducted in connection with business level components. This stage raises semantic aspects of compatibility and substitutability.

The maintenance stage is when a component evolves over time. The semantic issues here concern backward compatibility and upgrade paths when new implementations or new functionality are introduced. When reasoning about component maintenance, informal terms like “presenting the same expectations” are often used [11].

Some authors have an explicit focus on some specific stage, others do not make that explicit. However, we have not found a good discussion of how the semantic properties of components can be described or maintained during all these three life cycle stages. What is sometimes seen is a focus on testing to assure semantic system integrity [28]. Of course, this approach does not contradict the usefulness of a thorough planning and investigation of the semantic aspects of a component prior to its use, which can serve to reduce the number of errors found during testing.

It is an exciting challenge to try to place the papers we have found along these three dimensions.
More than just being an academic exercise, we believe that such a classification can reveal some of the intentions behind the individual papers, and also contribute to making the ongoing communication and research even more fruitful. A tentative classification of what we conceive as each paper’s primary focus is shown in Tables 1 and 2. To make the tables readable, we have used the authors or, in the case of IDL and OCL, the name of the language, to represent the papers. Of course, the focus is often not explicit and may be difficult to identify, and many papers will fit in more than one box. The distinction between high-level and low-level components in particular can be controversial. Unless the focus of a paper is explicitly towards the high levels, we have classified it as a low-level paper.

Life cycle \ Semantics | None | Intuitive | Pragmatic | Executable | Formal
Maintenance | - | Goldberg | - | - | -
Reuse | - | Bosch, Goldberg, Hall, Weyuker | - | Beugnard, Digre | -
Definition | - | Bosch, Weyuker | - | Beugnard, Digre | -

Table 1: High abstraction level

Life cycle \ Semantics | None | Intuitive | Pragmatic | Executable | Formal
Maintenance | - | - | - | Hoare, Meyer, OCL | Nordhagen
Reuse | IDL | Bosch, Weyuker | Meyer | Beugnard, Cicalese, Hoare, Meyer, OCL, Szyperski | Büchi, Nordhagen, Pfleeger, Weck
Definition | IDL | Bosch, Weyuker | Meyer, Blom | Beugnard, Cicalese, Hoare, Holland, Meyer, OCL, Szyperski | Bosch, Büchi, Pfleeger, Truss, Weck

Table 2: Low abstraction level

4.2 Discussion of the Results

We have studied many approaches to semantic integrity, from the purely syntactical ones to the formal ones. Even if our primary classification is on semantic awareness, we have also tried to classify the authors in our three dimensional taxonomy. This is because all the dimensions are involved and affected by the view of the author of each individual paper, even when that is not explicitly stated, so they need to be included in the analysis and the conclusions.
There is a general consensus that the interface description languages do not support semantic descriptions as they stand, and that there is a need for semantic descriptions of components. The predominant approaches are the intuitive and the executable ones, mostly concentrated on initial definition and on reuse aspects. Maintenance aspects and the semantic issues raised by maintenance are rarely discussed, and when they are, it is in very intuitive terms.

The fact that the column “Pragmatic” is almost empty means that we have not been able to find papers discussing the semantic integrity problems for components in a pragmatic way. This approach tends to be rejected as not being sufficiently exact, since a free text description of conditions allows for misunderstandings between writer and reader. This observation surprises us, because not all kinds of semantics can be mechanically tested. We believe that a conscious engineering approach is of great value, even in the absence of automatic support tools. Those who reject the pragmatic semantics approach tend to advocate executable semantics instead.

The formal approach is valuable for security-critical applications. Büchi [8] argues that it will ease the process of formal reasoning about components and make it easier to compose components based on their contracts. The strong formalism does, however, restrict the application of the proposed ideas to environments where the overhead of formal methods can be tolerated. The formal approach is generally conceived as too costly for the majority of the component market.

Provided that our classification does justice to the papers, it shows, among other things, that only a few papers explicitly address the semantic problems related to component maintenance. However, this is an area of serious compatibility problems.
Adele Goldberg [11] states that component maintenance costs can probably be kept low “as long as the reusers do not alter the service expectations of the component”. We would add: as long as the revised component does not alter the expectations on and services provided to the reusers. This is a vital condition to avoid violations of a system’s semantic integrity. When this is not sufficiently well understood or respected, we see a drift in the expectations and services of a component, which will eventually result in hard-to-manage integration problems.

Semantic integrity rules to manage the maintenance problem exist. Based on Meyer’s contracts [20], it is possible to express semantic integrity rules for component reuse, substitution and maintenance. One issue, which has been raised frequently, is how to make sure that a new component revision is compatible with the old one. Another is the case where the interface has to be replaced by a new definition. The simple answer may be that a new revision is compatible with the old one when the new contract meets the requirements of the old one. This may appear obvious, but our experience is that it is not [3].

We find the Boca approach [10] promising for business level components. It explicitly addresses the semantic problems and presents a manageable solution to them. Our major concern is the high overhead, both in administration and in system complexity, needed to make it work. On the other hand, the target area of business systems normally already requires and implements system management routines, so it should be possible to implement the Boca approach in that kind of environment.

5 Conclusion

As we can see from the survey, there is a lot of activity going on, trying to define what a component is, both at the low technical abstraction level and at the higher business level. Most of the focus is on how a component could be defined and reused, but some authors focus on maintenance.
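The compatibility rule stated in Section 4.2, that a new revision is compatible with the old one when the new contract meets the requirements of the old one, can be illustrated with a small sketch. It assumes one common reading of that rule, following Meyer's subcontracting principle: the new revision must not demand more of its clients and must not deliver less. The `Revision` type, the `meets_old_contract` function and the integer square root example are hypothetical names introduced only for illustration, and the check is a sampled test, not a proof.

```python
from dataclasses import dataclass
from math import isqrt
from typing import Callable, Iterable


@dataclass(frozen=True)
class Revision:
    """A component revision: contract plus implementation (hypothetical type)."""
    pre: Callable[[int], bool]        # what the revision demands of its clients
    post: Callable[[int, int], bool]  # what it promises: post(argument, result)
    run: Callable[[int], int]         # the implementation itself


def meets_old_contract(old: Revision, new: Revision, samples: Iterable[int]) -> bool:
    """Sampled check, not a proof: for every sample the old contract admits,
    the new revision must accept the call and satisfy the old postcondition."""
    for x in samples:
        if not old.pre(x):
            continue      # old clients were never allowed to make this call
        if not new.pre(x):
            return False  # the new revision demands more than the old one
        if not old.post(x, new.run(x)):
            return False  # the new result breaks the old guarantee
    return True


def isqrt_post(x: int, r: int) -> bool:
    """Postcondition of an integer square root: r*r <= x < (r+1)*(r+1)."""
    return r * r <= x < (r + 1) * (r + 1)


# v1: the original revision, defined for non-negative input only.
v1 = Revision(pre=lambda x: x >= 0, post=isqrt_post, run=isqrt)
# v2: also accepts negative input (weaker precondition), same guarantee otherwise.
v2 = Revision(pre=lambda x: True,
              post=lambda x, r: r == 0 if x < 0 else isqrt_post(x, r),
              run=lambda x: 0 if x < 0 else isqrt(x))
# v3: rejects zero (stronger precondition), which breaks existing clients.
v3 = Revision(pre=lambda x: x > 0, post=isqrt_post, run=isqrt)

samples = range(-5, 50)
print(meets_old_contract(v1, v2, samples))  # True
print(meets_old_contract(v1, v3, samples))  # False
```

In this sketch, v2 can replace v1 because every call that was legal before is still legal and still yields a result the old postcondition accepts, whereas v3 cannot, since a client calling with 0 was within its rights under v1.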
We have not tried to draw any conclusion on what a component is. What we have tried to investigate is how the semantic aspects of components are treated in published papers. From what we can see, most of the papers do not focus strongly on the semantic aspects at all. For those that do, the focus is mostly on the intuitive and the executable levels. The intuitive level contains informal statements about expected services and the executable level contains explicit semantic definitions using executable statements.

The first and most obvious focus is on the syntactic component interface, and this is what most of the literature is about. The majority of contributions in the area do not specify the meaning of the term interface, but use it in some general sense, including all descriptive aspects of the component. Practically speaking, the implication of interface will almost always be the syntactic definition of the interface, since that is what is readily available today. The semantic aspects are not supported by the interface definitions and remain a general, unspecified cloud in the mind of most authors and readers.

Another focus is on explicit descriptions of the semantics of the components, including the pragmatic, executable and formal levels. The dominant author here is Bertrand Meyer. There are quite a few very formal contributions in the field, but the application of the theories is not always evident. We have found a few contributions on the less formal side. The Four Contract Levels of Beugnard and the Boca model of Digre are two of these. Although not formal in the mathematical sense of the word, they still aim to materialize the semantic descriptions. To achieve that, they need to extend the standard IDL. The pragmatic approach is very weakly represented in the survey.

The last semantic awareness level is the intuitive level. This is often present where semantics are but a secondary target for the paper.
Even papers that are not about semantics often include some aspects of semantic considerations. Mainly, these are in the form of “expected interface”, “unchanged services” or other similar phrases. This expresses a broad acceptance of the importance of semantic integrity considerations on an intuitive level in the software engineering community.

To conclude, what we find missing in the literature today is a review of which semantic integrity aspects are of importance in component development and how they can be described. The discussion is closely tied to tools that support and enforce certain guidelines. We would like to see a broadening of this discussion to include the pragmatic, disciplined approach as well, irrespective of whether or not there is a tool to support this discipline. The overall conclusion of this survey must be that the discussion of semantic integrity in component contexts has not yet reached very far beyond the syntactic definition. This is a vast and open research area.

6 References

[1] Aho, Sethi, Ullman, Compilers: Principles, Techniques and Tools, Addison-Wesley Publishing Company, 1986.
[2] Beugnard, Antoine, Jézéquel, Jean-Marc, Plouzeau, Noël and Watkins, Damien, Making Components Contract Aware, IEEE Computer, July 1999, pp. 37-45.
[3] Blom, Martin, Semantic Integrity in Program Development, Master’s Thesis, Department of Computer Science, Karlstad University, 1997.
[4] Bosch, J., Lecture in the doctoral course on Component Based Software Engineering, University of Västerås, January 2000.
[5] Bosch, J., Superimposition: a component adaptation technique, Information and Software Technology 41 (1999) 257-273, Elsevier Science B.V., ISSN: 0950-5849, http://www.libris.kb.se/elsevier/cgibin/reference?article=0950584999000075
[6] Brown, Alan W., Wallnau, Kurt C., The Current State of CBSE, IEEE Software, September/October 1998, pp. 37-46.
[7] Broy, Manfred et al., What characterizes a (software) component?, Software - Concepts & Tools 19 (1998) 1, 49-56, ISSN: 1432-2188, Springer Verlag, June 1998.
[8] Büchi, Martin and Sekerinski, Emil, Formal Methods for Component Software: The Refinement Calculus Perspective, Proceedings of the Second International Workshop on Component-Oriented Programming WCOP ’97, Turku Center for Computer Science, General Publication no. 5, 1997.
[9] Cicalese, C.D.T., Rotenstreich, S., Behavioral Specification of Distributed Software Component Interfaces, IEEE Computer, July 1999, pp. 46-53.
[10] Digre, T., Business Object Component Architecture, IEEE Software, September/October 1998, pp. 60-69.
[11] Goldberg, Adele, A reuse business model, Software - Concepts & Tools 19 (1998) 1, 11-13, ISSN: 1432-2188, Springer Verlag, June 1998.
[12] Hall, P.A.V., Architecture-driven component reuse, Information and Software Technology 41 (1999) 963-968, Elsevier Science B.V., ISSN: 0950-5849, http://www.libris.kb.se/elsevier/cgibin/reference?article=0950584999000713
[13] Hoare, C.A.R., Proof of Correctness of Data Representation, Acta Informatica, vol. 1, 1972, pp. 271-281.
[14] Holland, I. M., Specifying reusable components using contracts, in Proceedings of ECOOP 1992.
[15] iContract, The Java(tm) Design by Contract(tm) Tool, http://www.reliable-systems.com/tools/iContract/iContract.htm
[16] The Object Management Group, Inc. (OMG), The Common Object Request Broker: Architecture and Specification, Chapter 3: OMG IDL Syntax and Semantics, Minor revision 2.3.1, October 1999, http://www.omg.org/cgi-bin/doc?formal/99-10-07
[17] McIlroy, M.D., Mass-produced Software Components, in Software Engineering Concepts and Techniques, J.M. Buxton, P. Naur and B. Randell, Eds., Van Nostrand Reinhold, 1976, pp. 88-98.
[18] Meyer, Bertrand, Eiffel: Programming for Reusability and Extendability, SIGPLAN Notices, vol. 22, no. 2, February 1987, pp. 85-94.
[19] Meyer, Bertrand, Eiffel: a Language and Environment for Software Engineering, Journal of Systems and Software, 1988.
[20] Meyer, Bertrand, Object-oriented Software Construction, Prentice-Hall, 1988.
[21] Meyer, Bertrand, Eiffel: the Language, Prentice Hall, 1992.
[22] Nordhagen, E. K., divide et impera, A Computational Framework for Verifying Object Component Substitutability, Dr. Scient thesis, University of Oslo, November 1998.
[23] Object Constraint Language Specification version 1.1, 1 September 1997, can be downloaded from http://www-4.ibm.com/software/ad/standards/ocl.html
[24] Pfleeger, Shari Laurence and Hatton, Les, Investigating the Influence of Formal Methods, IEEE Computer, February 1997, pp. 33-43.
[25] Szyperski, Clemens, Component Software, Beyond Object-Oriented Programming, Addison-Wesley, 1997.
[26] Truss, J.K., Discrete Mathematics for Computer Scientists, Addison-Wesley Publishing Company, 1991.
[27] Weck, Wolfgang, Inheritance Using Contracts & Object Composition, Proceedings of the Second International Workshop on Component-Oriented Programming WCOP ’97, Turku Center for Computer Science, General Publication no. 5, 1997.
[28] Weyuker, E. J., Testing Component-Based Software: A Cautionary Tale, IEEE Software, September/October 1998, pp. 54-59.
[29] Wordsmyth, the educational dictionary-thesaurus, http://www.wordsmyth.net/