Performance Model Interchange Format: Semantic Validation Technical Report Daniel Garcı́a1 Catalina M. Lladó1 Connie U. Smith2 R. Puigjaner1 1. Dep. de Cien. Mat. i Inf. Universitat de les Illes Balears 07071, Palma de Mallorca, Spain. danielgcou@gmail.com, cllado@uib.es, putxi@uib.es 2. Performance Engineering Services. PO Box 2640 Santa Fe New Mexico, 87504-2640 USA www.perfeng.com May 2006 Abstract A Performance Model Interchange Format (PMIF) provides a mechanism whereby system model information may be transferred among queueing network model (QNM) based modeling tools. The PMIF allows diverse tools to exchange information and requires only that those tools provide importing/exporting mechanisms from/to the PMIF. The XML specification of the PMIF allows implementers to use widely available tools to parse the XML file, check the syntax, and simplify the translation to/from the XML format. Those tools, however, do not know the semantics of a QNM so they cannot check the XML to ensure that it contains a valid QNM. This paper presents the study of the validations needed to carry out such a semantic analysis, and the development of a semantic validation tool that can be used by any developer who wants to implement PMIF import/export mechanisms. Keywords. Interchange Format, Performance Models, Queueing Network Models, SPE, Semantic validation, Tool interoperability, XML. 1 Introduction A Performance Model Interchange Format (PMIF) is a common representation for system performance model data that can be used to move models 1 among modeling tools. The PMIF allows diverse tools to exchange information and requires only that the importing and exporting tools either support the PMIF or provide an interface that reads/writes model specifications from/to a file. Previous work defined a PMIF meta-model for system performance models that use a queueing network model (QNM) paradigm and an XML (Extensible Markup Language) [1] schema for the meta-model. A proof of concept was provided with a prototype implementation in which the QNM export was implemented in the SPE·ED [2] tool and the importing tool was Qnap [3]. A set of examples was provided for validation, see [4]. The XML specification of the PMIF allows implementers to use widely available tools to parse the XML file, check the syntax, simplify the translation to/from the XML format, and other common tasks. Those tools, however, do not know the semantics of a QNM so they cannot check the XML to ensure that it contains a valid QNM. For example, standard XML tools can check to confirm that arcs in the QNM model are connected to elements that have been specified in the model, but they cannot determine that those elements are actually Nodes (they could be Workloads). Moreover, they cannot determine other conditions that a QNM must have such as the existence of a valid path from a source node to a sink node. It is best to implement the semantic analysis of a PMIF model in one tool so that importing tools do not have to duplicate the error checking. This makes it easier to implement a PMIF import function. It is also helpful for testing a PMIF export function to confirm that it generates proper models. Moreover, the semantic analysis can be offered as a Web service so that it can be developed, installed, and maintained once for all users who wish to use this capability. Related work is covered in the next section. Then the remainder of the paper is organized as follows: in order to make the paper self-contained the PMIF XML schema is described in Section 3. The types of semantic validation are described in Section 4. Section 5 describes the design, and Section 6 the implementation of a tool for performing these semantic validations. Sections 7 presents a case study and Section 8 offers some conclusions. 2 Related Work Related work specifically on semantic validation of QNM has not been reported. Before PMIF, modelling tools either implemented error checking 2 internally, or in some cases, just prevented the error with their input language or Graphical User Interface (GUI). For example, a GUI could prevent a user from drawing a QNM model that does not have a path from the source node to the sink node. Some command line tools that use input/output files, as for example Qnap, produce error messages when the models are not semantically correct, for example if a class arrives to a node which does not have a corresponding service specified for this class. There has been some work on semantic validation of XML in general. The three primary approaches for specifying and checking semantic properties are: 1. Domain-specific custom programs 2. XSLT stylesheets 3. Constraint specification languages Each of these is discussed in the following paragraphs. Item 1, custom programs, hard code the validation functions. The primary disadvantages of this approach are that there is a fair amount of code required for handling the XML so the validation checks tend to get ”lost” in this other code, and that the code must be maintained. It is particularly problematic when many programs must duplicate the code. For example, we do not want all tools that import PMIF to have to check for semantic validity. Item 2, XSLT stylesheets, can also be used to check for semantic validity. They code rules in XSLT functions and produce a report on problems found using standard XML tools. For example, in earlier work [4, 5] we used XSLT to convert a PMIF file into a Qnap input file. Similar code could check the semantic validity at the same time. Item 3, constraints, are formal specifications either in the form of an assertion (true/false statement) or a conditional rule (if-then) in combination with assertions. Examples of constraint languages that might be applicable to QNM validation include: • RDF - Resource Description Framework for representing relationships and meaning of information for the ”Semantic Web” [6] • OCL - Object Constraint Language for specifying conditions that must hold in UML models [7] • Schematron - an assertion language based on XPATH [8] 3 • xlinkit - a constraint language that combines first order logic with XPATH [9] • CLiX - a commercial version of xlinkit with products for developing and applying the constraints [10] • XCML - eXtensible Constraint Markup Language for expressing both static and dynamic semantic constraints in both simple and composite forms [11]. Constraint specification approaches require a formal expression of the semantic properties that a valid PMIF must contain. General purpose software then tests XML documents against the constraints and reports errors when exceptions are found. CliX provides for consistency checks within a specified, distributed family of documents (and databases). XCML addresses a single document only, but adds expressiveness to the language. A formal specification of conditions that must hold for an XML document to be semantically valid is useful for those who wish to implement import and export functions. It is more precise than prose so it eliminates ambiguity. A variety of tools are available for testing documents against constraints. Other tools generate test cases based on constraints [10, 9]. This paper reports the semantic conditions that must hold for a PMIF to be valid. It also defines the order for checking the conditions. Both of these contributions are needed regardless of the implementation technology used. We have chosen to use custom programming code to implement the proof of concept because: • it is easier for us to check conditions that must hold in the model topology (such as open workloads must have a valid path from a source node to a sink node) with programming logic • it is easier for us to debug the code than the complex constraints that would be required • there currently is no ”standard” constraint language • we do not have a commercial XML product for processing constraints • we have defined some conditions to generate warnings rather than errors so we would need a constraint-checking tool that allows that possibility. 4 It is possible to use a constraint-based tool or an XSLT stylesheet for checking the ”easy” conditions. However, it does not seem worthwhile in this case because we construct internal data structures for the complex tests anyway, so we might as well use them to test the ”easy” conditions. This avoids the complexity of using multiple tools with tests divided between them. Note that we preclude the primary disadvantages of the custom-code approach described earlier because we create a single program or service that separates the semantic validation, and this code can be reused by any tool that wishes to import the PMIF. Nevertheless, there are some documentation advantages for specifying most of our semantic conditions with a constraint language even if we do not currently use commercial products to apply them. We are considering documenting them in the PMIF meta-model with OCL until XML semantic validation techniques are well established and many affordable or free tools are available to support them. 3 PMIF Overview PMIF was first defined using an EIA/CDIF (Electronic Industries Association/CASE Data Interchange Format) paradigm that calls for defining the information requirements for a QNM with a meta-model [12], that is, a model of the information that goes into constructing a QNM. A transfer format was then created from the meta-model and used to exchange information. The PMIF allows diverse tools to exchange information and requires only that the importing and exporting tools either support the PMIF or provide an interface that reads/writes model specifications from/to a file. A new version of the PMIF meta-model and its XML schema specification (PMIF 2.0) was presented in [4, 5]. In order to comprehend this paper, it is necessary to understand how the different elements of a QNM and their relationships are specified in the PMIF. Therefore, the PMIF meta-model is shown in Fig. 1. The diagram shows that a QueueingNetworkModel is composed of one or more Nodes, and one or more Workloads. A Server provides service for one or more Workloads. A Workload represents a collection of transactions or jobs that make similar ServiceRequests from Servers. There are two types of Workloads: OpenWorkload and ClosedWorkload. A ServiceRequest specifies the average TimeService, DemandService or WorkUnitService for each Workload that visits the Server. A TimeSer- 5 Figure 1: XML schema for PMIF 2.0 viceRequest specifies the average service time and number of visits. A DemandServiceRequest specifies the average service demand (service time x number of visits). A WorkUnitServiceRequest specifies the average number of visits requested by each Workload that visits a WorkUnitServer. Upon completion of the ServiceRequest, the Workload Transits to other Nodes with a specified probability. The attributes of the PMIF elements are shown in Table 11 . More detailed information and the PMIF schema specification can be found in [4, 5]. The schema itself is at www.perfeng.com/pmif/pmifschema.xsd. 4 PMIF Semantic Validation The PMIF XML schema enables PMIF users to syntactically validate their models. However, the models might be semantically incorrect. In this section the semantic validations we consider necessary are described. The validations are grouped into three categories as follows: 1. Error generation: some conditions cause the PMIF model to be incoherent. In fact, most queueing network modelling tools (for example QNAP [13]) would generate an error if these conditions are found. 1 the Transit sub-element shown is a sub-element of all Workload sub-elements and all ServiceRequest sub-elements 6 Element Node Arc Workload ServiceRequest Table 1: PMIF elements and its attributes Sub-element Attributes SourceNode Name SinkNode Name Server Name, Quantity, SchedulingPolicy Name, Quantity, SchedulingPolicy WorkUnitServer TimeUnits, ServiceTime None FromNode, ToNode, Description WorkloadName, ArrivalRate, TimeUnits OpenWorkload ArrivesAt, DepartsAt WorkloadName, TimeUnits ClosedWorkload NumberOfJobs, ThinkTime, ThinkDevice WorkloadName, ServerID, TimeUnits DemandServiceRequest ServiceDemand, NumberOfVisits WorkloadName, ServerID, TimeUnits TimeServiceRequest ServiceTime, NumberOfVisits WorkloadName, ServerID WorkUnitServiceRequest NumberOfVisits Transit To, Probability Therefore, this group of semantic validations generates an error because the model cannot be solved as specified. 2. Warnings: other situations may occur in which, even though the model is, in general, still correct it might not actually produce the intended model, or it might be specified in a way that can cause confusion. Two levels of warnings have been defined. Some validations will produce a significant warning while others will generate a non-significant one, depending on the importance of the problem. 3. Excluded: other validations are described, which were also considered but have not been included in the development of the tool. The reason for their exclusion is also explained. 4.1 Coherent Identifiers In XML schemas the ID attribute is used to specify a unique identifer, and the attribute IDREF is a reference to an ID. In the PMIF schema, two groups of entities have an ID attribute, Nodes and Workloads and the 7 Table 2: Coherent groups (and subgroups) of and attribute Element Sub-element Attribute FromNode Arc None ToNode ArrivesAt OpenWorkload DepartsAt Workload ClosedWorkload ThinkDevice DemandServerID ServiceRequest WorkloadName TimeServerID ServiceRequest ServiceRequest WorkloadName WorkUnitServerID ServiceRequest WorkloadName Transit To identifiers for each element ID Group Nodes Nodes Nodes Nodes Nodes Nodes Workloads Nodes Workloads Nodes Workloads Node Sub-ID Group All but SinkNode All but SourceNode SourceNode SinkNode Server Server Any Server Any WorkUnitServer Any see Table 3 IDREF’s are used to relate those to other entities. The XML schema (syntactic) validation checks that attributes of type IDREF have a value that has been specified as an ID. However, such an ID might be incorrectly used in the sense that every IDREF attribute of elements (or sub-elements) has some semantic restrictions on which elements or sub-elements may be referenced. For example, the sub-element OpenWorkload has an attribute of type IDREF which is ArrivesAt. This IDREF needs to be a reference to an element of type Node (it cannot be of type Workload) and moreover it needs to be of type SourceNode (sub-element of Node). Table 2 shows the valid groups and subgroups of identifiers for the attributes of the different elements and sub-elements. As shown in the table, the attribute To of the Transit element must reference a Node. Additionally, the Transit element is a sub-element of type Workload and of ServiceRequest, and for both of them it is always related to a specific Workload. Depending on whether this workload is open or closed the To attribute of the Transit element can only reference some sub-elements of Node. The allowable references are shown in Table 3. All the validations described in this section belong to the first group, so when problems are found, an error will be generated. 8 Table 3: Subgroups of identifiers valid for Transit elements depending on the associated workload Element Associated Workload Attribute ID Group Sub-ID Group Server OpenWorkload To Node WorkUnitServer Transit SinkNode Server ClosedWorkload To Node WorkUnitServer 4.2 Elements Specified but not Referenced Elements of type Node and Workload do not play a roll in a model unless they are referenced in another element. More specifically, Workload, Server (unless it is a ThinkDevice) and WorkUnitServer elements need to be referenced in some ServiceRequest element. SourceNode and SinkNode elements need to be referenced in an OpenWorkload element as ArrivesAt and DepartsAt attributes respectively. Otherwise a significant warning will be generated because the model may not be the one intended. 4.3 Duplicates The ID attribute of Node and Workload elements makes sure that those are unique. However, ServiceRequest and Transit elements might be specified more than once and the PMIF XML file would still be valid against the PMIF schema. Therefore the semantic validation needs to do the following: • For each couple Node and Workload only one ServiceRequest can be specified. • For each Workload element only one Transit element can be specified with the same value for the attribute To (i.e., transiting to the same node). • For each ServiceRequest element only one Transit element can be specified with the same value for the attribute To (i.e., transiting to the same node). An error is generated if these conditions do not hold because the intended behavior is ambiguous. 9 4.4 Multiple Servers The Quantity attribute specifies the number of parallel servers that a Node of type Server or WorkUnitServer has. The PMIF schema specification allows this value to be zero. Hence, the semantic validation needs to verify that a node with Quantity equals zero is not associated to any Workload in a ServiceRequest. Otherwise, an error will be generated. On the other hand, if the attribute Quantity of an element of type Node is equal to zero but this element is not associated to any Workload in a ServiceRequest a significant warning will be produced because the model can be solved but may not be the intended model. 4.5 Think Device When a Node is referenced as ThinkDevice in a ClosedWorkload element, this server cannot be referenced by any other ServiceRequest element. Moreover, such a Node needs to have a SchedulingPolicy that is IS (Infinite Server). A violation of these conditions results in an error. 4.6 Coherent Workload Chains As described in Section 3 a Workload can be a ClosedWorkload or an OpenWorkload. The XML schema validation makes sure that the attributes specified for each Workload sub-element are the appropriate ones for that subelement. However, the workloads specified as closed might not really form a closed chain and the ones specified as open workloads might not be really open. Firstly, each element containing elements of type Transit (i.e. ServiceRequest or Workload elements), needs to have the sum of the probabilities of its Transits greater than zero, otherwise, an error will be generated. We are not assuming that each of the probabilities needs to be different from zero due to the following: A queueing network modelling tool that internally represents the routing probabilities as a matrix would, in general, have many of the positions of the matrix with value 0. If such a tool automatically generates a PMIF XML file, it would probably generate many Transit elements with probability zero. Therefore, this case would only give a significant warning. Secondly, if a workload chain arrives to a specific Node, this Node needs to be referenced by a ServiceRequest that relates it to the specific Workload. Otherwise an error will be generated. On the other hand, the reverse case, i.e. if there exists a ServiceRequest that relates a Workload and a Server 10 but the Workload chain never gets to that Server, a significant warning is produced. The second part of this validation is different for open and closed workloads, but both result in an error when the following conditions do not hold. For the OpenWorkload case: • There exists a path from the SourceNode specified by the attribute ArrivesAt to the SinkNode specified by the attribute DepartsAt, which includes at least one intermediate node of type Server or WorkUnitserver. • A SinkNode can only be used in a workload chain if it is referenced by the attribute DepartsAt of the OpenWorkload specification. Finally, the validation for ClosedWorkloads implies the following: • There exists a loop going from the ThinkDevice node and back to it which includes at least one intermediate node of type Server or WorkUnitserver. 4.7 Time Units TimeUnits is an optional attribute that appears in many elements. Since it is optional, a model will be syntactically valid when this attribute appears in some elements but not in all of the elements. If none of the elements have TimeUnits specified it implies that they all have the same time unit whatever that is (some modelling tools, for example Qnap, use this paradigm). On the other hand, when this attribute is specified for some elements and not specified for others, a non-significant warning is given with a message that the default value for time units is seconds. Moreover, we also consider it useful to give a non significant warning if different values for the attribute TimeUnits are used throughout the model because it might not be correct. 4.8 Attribute Values Equals Zero The PMIF schema is defined allowing many attributes to have a value of zero which can be useful for some specific cases, such as temporarily ”disabling” a server to investigate the impact. However, in general, models will not have a zero value for the following attributes: ServiceTime for WorkUnitServer, ArrivalRate for OpenWorkload, NumberofJobs for ClosedWorkload, NumberofVisits for ServiceRequest, ServiceTime for TimeServiceRequest, ServiceDemand for DemandServiceRequest. A significant warning will be 11 given if any of these attributes have a zero value since the model might not be the one intended. 4.9 Routing Probability Equations (and Number of Visits) In the PMIF XML schema routing probabilities are required and number of visits are optional since the number of visits can always be calculated from the routing probabilities as shown in [4]. Many tools use visits, others use probabilities, so both are included to make the PMIF easier to import. Therefore when the number of visits are also specified in a model, their values need to be consistent with the routing probability values using the well-known routing probability equations (see for example [14]). An error is produced otherwise (referenced as b in Fig.2). Additionally, since the number of visits is an optional attribute, it can happen that for the same Workload, in some ServiceRequest this attribute is filled and in some others it is not. A significant warning will be given in that case (referenced as a in Fig.2). 4.10 First Come First Served Servers In a model with multiple workloads, First Come First Served (FCFS) servers need to have the same service time for all the workloads in order for that model to be solved analytically. However, such a model could be simulated. Therefore this validation will produce a significant warning. 4.11 Excluded Validations Routing probabilities for each Node and Workload should add to one. However, that does not need to be validated since many tools (for example Qnap and SPE·ED ) normalize the probabilities and therefore the values given by the user do not need to follow that rule. When the scheduling policy of a server is of type infinite server the quantity value is not normally used since, in general, it is not meaningful. A warning could have been produced in this case but some tools (as for example SPE·ED ) use the quantity value for special calculations. Another possible validation would be to check whether a model is trap free, that is, it does not contain a sub-chain that allows a client to get in and never get out again. We thought this would be a duty for the queueing network modelling tool to do. 12 5 Validation Design This section describes how the validations explained in the previous section will be performed. Due to their inter-related nature the validations cannot be carried out in a random order since the checking of most conditions only makes sense if other conditions have already been validated. Therefore, our validation process will have different phases that need to happen sequentially. Fig. 2 shows these phases and the validations performed in each of them (each validation is represented by the number of the subsection where that validation is described). Since one of the advantages of having the PMIF meta-model specified as an XML schema is that the PMIF files can easily be syntactically validated against that schema, the semantic validation starting point is an XML file that is syntactically valid. The validation dependencies are shown in Fig. 2, so that if there is an arrow that goes from validation a to validation b it means that validation b only makes sense if validation a has already been confirmed. Moreover, we have grouped the validations in three sets depending on whether they generate errors, significant warnings or non significant warnings (see Section 4). These groups are also distinguished in Fig. 2. It is not possible to conduct all validations in one pass of the PMIF XML file. Therefore, intermediate data structures are used to facilitate the PMIF validation. The PMIF document is parsed once and these structures are built. Sub-section 5.1 describes the data structures used to store the model specification. Similarly, the errors and warnings are stored in a second data structure so that they can be collected and inserted in the output at the appropriate location. The error structure is quite simple since only two actions are needed: to insert an error or warning for a specific line number and to retrieve the errors and/or warnings for a given line number. Thus, the functions that our PMIF validation application performs are: 1. PMIF file analysis and construction of model data structures. 2. Validation using data structure analysis and construction of error data structure. 3. Output generation with errors and warnings found. We would like to execute the validation application in three different ways, single invocation from a command line, using a graphical user interface, and through a Web service. Therefore the validation application is 13 Figure 2: Validation phases and dependencies (the number inside each circle corresponds to the subsection where the validation is described) 14 Figure 3: Graphical user interface independent from the interface and the different interfaces are developed separately. The Web service is developed similarly to the one described in [15]. The graphical user interface, shown in Fig 3, has a text area in which the PMIF XML file to be validated can be pasted or opened. It also has the File and Edit menus to allow the user to manipulate files (open, save, close) and edit them (copy, cut and paste). More importantly, it allows the user to specify the level to be used in the validation. Level 0 will produce no warnings, only errors. Level 1 will produce errors and significant warnings only, and Level 2 will do all. Finally, the validation result will be shown in the same text area where the input was and it will be the PMIF model (as in the input) with line number and the error/warning messages following the line where they have been found. 5.1 Data Structures In order to store in memory the model information gathered during the analysis of the PMIF document, the data structure proposed is composed of five lists: one list of Nodes, one of Arcs, one of Workloads and another one of ServiceRequests, so covering the four main elements of the PMIF schema. The fifth list is of Transits. Each Workload and each ServiceRequest 15 Figure 4: Data structures has its own list of Transits. Each element of these lists will contain all the information that its corresponding PMIF element has. The lists also include cross pointers that facilitate the semantic validation. For example, the ServiceRequest list has for each of its elements two pointers, one to the Node and another one to the Workload that it relates (instead of keeping the names of the workload and node inside the element). Fig. 4 shows the four main lists and how they are related through the pointers (the Transit list is omitted due to space constraints). The Nodes are the base element and do not reference any other element, the Arcs reference the From and To nodes that they relate, the Workloads reference the ThinkDevice if they are closed and the SourceNode and SinkNode if they are open, and finally the ServiceRequests reference the Node and Workload which they relate. The data structures could have been organized differently, for instance splitting the list into sublists depending on the type of the different subelements. A Workload list would then have two sub-lists, one for closed workloads and another one for open workloads. However, such a structure would only make the validation application more complicated and the subelements only need to be treated separately in a few cases. Hence we think that it is not worth it to build these sub-lists. 6 Implementation of the Semantic Validation Tool The PMIF semantic validation tool has been implemented in Java since it is a platform independent language and it provides Application Programming Interfaces (APIs) that facilitate the manipulation of XML documents. We have used the Java API for XML Processing (JAXP) [16] and as part of it the Simple API for XML (SAX) [17], described below. The Xerxes library [18] is also used with SAX to perform the syntactic validation of the 16 XML document against the PMIF schema. JAXP allows the use of any XML compatible parser: SAX or DOM [16] are two alternatives. Using DOM a memory representation of the XML document is automatically built. However, it would be difficult to use the DOM data structure and API to perform the semantic validations. In addition, DOM is less efficient than SAX and the extra features that DOM provides with respect to SAX are not needed for our implementation. Differently, SAX is an event-based parser, which means that the parser reads the XML document from the beginning to the end and, every time it recognizes an XML syntactical unit, it executes the appropriate method. It provides empty methods by default, and we inserted code to create the memory structures described in Section 5. The implemented classes have been grouped into 2 packages as follows: • Information, which contains the classes used to build the memory data structures. This package is described in subsection 6.1. • Modules, containing the classes that carry out the tasks related to the semantic validation itself, see subsection 6.2. 6.1 Information Package The implementation of the data structures described in Section 5 is done in Java. Classes and sub-classes have been implemented in correspondence to the elements and subelements of the PMIF XML schema described in Section 3 (for example, class Node with subclass Server, WorkUnitServer, SourceNode and SinkNode). The lists depicted in Fig. 4 are built using these classes. All the classes share the following features: • Attributes can be grouped into PMIF element attributes (same attributes that are specified for each of the elements in the PMIF schema) and attributes with information needed to facilitate the validation procedures. • They have an attribute named Linenumber that indicates its location in the PMIF XML file. • They include methods to: – insert, retrieve, test and print attribute values, – check if optional attributes have values filled in, – and facilitate the semantic validation. 17 • The methods for inserting attribute values that are of type IDREF in the XML PMIF schema, insert pointers to the corresponding ID, in order to create the structures depicted in Fig. 4. Another class also included in this package is the ErrorsAndWarnings which defines the structure to store the error and warnings messages that have to be generated in the output. Those are inserted in the structure while the validation is in process. The main element in the class is a hash table with the XML document line number as the key. The details for all the classes can be found in the API documentation available at http://dmi.uib.es/~cllado/pmif/validation. 6.2 Modules Package In Fig. 5 the processes needed to build the validation tool are shown, as well as the relationships between them. The boxes represent data and the circles represent procedures. The input and output files are represented in the bottom left corner and the part corresponding to the graphical user interface is drawn using a non-continuous line since it is an optional part of the execution (when the validation is invoked using the command line or through a Web service it does not need such an interface). The Main class shown in the figure will invoke the Analyzer class directly or through the Interface class (in case the graphical user interface is used). The class Analyzer will invoke several classes in the following order: • SchemaValidator : validates the input file syntactically against the PMIF XML schema. • SAXParser : parses the XML document and calls the methods of the class Handler depending on the XML structures found. These methods get model information and save it in the data structures represented by the class Data. • Validations: performs the validations described in Section 4 using the class Data, and saves the errors/warning information in the ErrorAndWarnings class. • GenerateResults: generates the output using the ErrorAndWarnings class. 18 Figure 5: Classes interaction for the validation process 7 Case Study This section describes the validation of a case study step by step. The example used is based on the ATM model from [12]. The PMIF XML file corresponding to this model can be found at http://dmi.uib.es/~cllado/pmif/ATM.xml. This file has been modified so that even though it is syntactically valid, it has many semantic errors, or inconsistences. In fact, most of the semantic validations described in Section 4 fail in one way or another when this example is checked. The modified ATM PMIF specification is shown in the Appendix, where the non-valid elements (or the ones that will produce warnings) are commented with the name of the subsection in which the related validation is described. As described in Section 5, the PMIF XML file is parsed once and intermediate data structures are built. Fig. 6 shows, partially, the resulting structures once this example is parsed, including all node and workload elements and 3 ServiceRequest elements (drawing all of them would have taken too much space). For clarity, only the most relevant attributes of each element are shown. Using these structures, the validations are carried out following the 19 Figure 6: Memory structures generated for the ATM case study 20 phases described in Fig. 2 in such a way that a phase only starts if the validation of the previous one has finished without errors (though it could have generated warnings), see Section 5. Given this example, the first phase does not validate, so the rest of them cannot be carried out. The output obtained is 1 error which is written in the output file as2 : Line 23. ERROR-02f: In the OpenWorkload test (23), the attribute ArrivestAt must be a SourceNode. When this error is corrected and the validation is executed again, the second phase can start and the output obtained is 2 errors and 3 warnings (phase 2 has given errors and so phase 3 does not start). The output file contains the following errors/warnings: • Line 7. ERROR-06b: The Server CPU (7) has 0 servers (attribute quantity 0) but it is associated with a DemandServiceRequest or a TimeServiceRequest or is a ThinkDevice of ClosedWorkload. • Line 12. WARNING-08c: The Server CPUNEW (12) is not referenced by any TimeServiceRequest or DemandServiceRequest or ThinkDevice of a ClosedWorkload. WARNING-06a: The Server CPUNEW (12) has 0 servers (attribute quantity 0) but it is not used. • Line 23. WARNING-08e: The OpenWorkload test (23) is not referenced by any ServiceRequest (23). • Line 45. ERROR-04a: This combination (CPU, Withdrawal) is a repetition of a previous ServiceRequest (38). The errors are corrected (not the warnings since they do not cause the interruption of the validation process) and the new errors/warnings found corresponding to phase 3 only, are: • Line 2. WARNING-09e: Different units in the TimeUnits are used. The used units are: Sec: WorkUnitServer (9) / OpenWorkload (22) / OpenWorkload (27) / OpenWorkload (31) / DemandServiceRequest (37) Ms: WorkUnitServer (8) • Line 22. WARNING-11b: The OpenWorkload test (22) has a value 0.0 in its ArrivalRate attribute. ERROR-13c: A ServiceRequest does not exist for CPU and test. ERROR-13e: A transition does not exist to the SinkNode for the OpenWorkload test. 2 Note that the line numbers shown in the text do not correspond with the lines of the example the way it is shown in the Appendix. 21 • Line 31. ERROR-13c: A ServiceRequest does not exist for CPU and Get-balance. ERROR-13e: A transition does not exist to the SinkNode for the OpenWorkload Get-balance. WARNING-13f: The WorkUnitServer ATM with ServiceRequest associated with OpenWorkload Get-balance, is unreachable by this Workload. WARNING-13f: The WorkUnitServer DISKS with ServiceRequest associated with OpenWorkload Get-balance, is unreachable by this Workload. Again, we correct the errors found so far and the validation results in 1 error corresponding to phase 4 execution only: Line 47. ERROR-14a: NumberOfVisits values are inconsistent with Probability values in the OpenWorkload Withdrawal (27) for this WorkUnitServiceRequest (47). The value for this NumberOfVisits must be 8 according to Transit elements. When this last error is corrected, in the subsequent execution of the validation, phase 5 is also carried out and the ATM PMIF file is semantically valid since there are no other errors. 8 Conclusion The Performance Model Interchange Format (PMIF) allows diverse tools to exchange QNM performance models and requires only that the importing and exporting tools either support the PMIF or provide an interface that reads/writes model specifications from/to a file. The XML specification of the PMIF allows implementers to use widely available tools to parse the XML file, check the syntax, simplify the translation to/from the XML format, and other common tasks. Those tools, however, do not know the semantics of a QNM so they cannot check the XML to ensure that it contains a valid QNM. This work describes a companion tool that checks the PMIF XML to ensure that it contains a semantically correct QNM. This tool makes it possible for modelling tools that import the PMIF to omit these error checking features. The tool is also helpful for testing a PMIF export function to confirm that it generates proper models. It is possible to use these features via a Web service, then it is not necessary to install or maintain the tool. PMIF validation updates can thus be handled once and reused by other tools. This approach is a general one that applies to PMIF models produced by many different types of tools, such as software modelling tools, analytic tools 22 and simulation tools. Some validation cannot be done because things invalid in one tool may be fine in a different tool. For example, Qnap requires a separate source node for each open workload, but other tools may only need one source node. Therefore, tools may need to supplement this validation with some additional model-checking steps. A constraint language or XSLT may be a good way for tools to supplement our tests with their own custom checks The number and type of validations are easily extensible so a tool provider could use what we have developed and upgrade as needed. Similarly, the classification of errors and warnings is also easily adapted. The tool can be downloaded from http://dmi.uib.es/~cllado/pmif/validation/. Our future work includes extending the PMIF to add elements and attributes typically found in QNM that may not be solved analytically, but may be simulated. It will be easy to add the semantic validation for these extensions. In addition, we are considering documenting the semantic conditions in the PMIF meta-model to better define the requirements for a PMIF XML file to be valid for both implementers and users of the PMIF. Acknowledgment The research carried out at the Universitat de les Illes Balears has been partially funded by the Ministerio de Educacion i Ciencia, on the TIC 200306293 project. References [1] W3C, “World Wide Web Consortium,” http://www.w3c.org/XML, 2003. [2] “SPEED, Software Performance Modeling Tool,” www.perfeng.com, 2006. [3] D. Potier and M. Veran, “Qnap2: A portable environment for queueing systems modelling,” in I International Conference on Modeling Techniques and Tools for Performance Analysis, D. Potier, Ed. North Holland, May 1985, pp. 25–63. [4] C. Smith and C. Llado, “Performance model interchange format (PMIF 2.0): XML definition and implementation,” in Proc. of the First In23 ternational Conference on the Quantitative Evaluation of Systems, September 2004, pp. 38–47. [5] ——, “Performance model interchange format (PMIF 2.0): XML definition and implementation. tecnical report,” www.perfeng.com/paperndx.htm, LS Computer Technology, Inc., Tech. Rep., April 2004. [6] W3C, “World wide web consortium,” http://www.w3c.org, 2001. [7] OMG, “UML 2.0 OCL specification, formal/04-30-2004,” 2004. [8] R. Jelliffe, “The schematron assertion language 1.5,” October 2000. [9] C. Nentwich and W. Emmerich, “Valid versus meaningful: Raising the level of semantic validation,” in Proceedings WWW2003. ACM, 2003. [10] C. Nentwich, “Clix - a validation rule language for xml,” http://www.w3.org/2004/12/rules-ws/paper/24/, 2004. [11] J. Hu and L. Tao, “An extensible constraint markup language: Specification, modeling, and processing,” in Proceedings XML 2004, 2004. [12] C. Smith and L. Williams, “A performance model inter-change format,” Journal of Systems and Software, vol. 49, no. 1, 1999. [13] Simulog, “Modline 2.0 qnap2 9.3: Reference manual,” 1996. [14] G. Bolch and et al., Queueing Networks and Markov Chains. Modeling and Performance Evaluation with Computer Science Applications. Wiley Interscience, 1998. [15] J. Rossello, C. Llado, R. Puigjaner, and C. Smith, “A web service for solving queueing networks models using PMIF.” in Proceedings of the Fith International Workshop of Software and Performance, July 2005, pp. 187–192. [16] “Java API for XML processing http://java.sun.com/webservices/jaxp/index.jsp, 2006. (JAXP),” [17] “Simple API for XML (SAX),” http://java.sun.com/j2se/1.4.2/docs/api /org/xml/sax/package-summary.html, 2003. [18] Apache, “Xerxes2 Java parser 2.7.1,” xerces2-j/, 2005. 24 http://xerces.apache.org/ A Case Study: ATM Modified <QueueingNetworkModel xmlns : x s i= ” h t t p : / /www. w3 . o r g / 2001/XMLSchema−i n s t a n c e ” x s i : noNamespaceSchemaLocation =”D: \PMIF\ pmifSchema . xsd ” Name=”ATM Modyfied ”> <Node> <SourceNode Name=” SourceNode ”/> <SinkNode Name=” SinkNode ”/> <!−− D −−> <S e r v e r Name=”CPU” Quantity=”0” S c h e d u l i n g P o l i c y=”PS”/> <!−− D −−> <WorkUnitServer Name=”ATM” Quantity=”1” S c h e d u l i n g P o l i c y=” IS ” TimeUnits=”ms” S e r v i c e T i m e=”1 ”/> <WorkUnitServer Name=”DISKS” Quantity=”1 ” S c h e d u l i n g P o l i c y=”FCFS” TimeUnits=” s e c ” S e r v i c e T i m e=” 0 . 0 5 ”/> <!−− G −−> <S e r v e r Name=”CPUNEW” Quantity=”0” S c h e d u l i n g P o l i c y=”PS”/> </Node> <Arc FromNode=” SourceNode ” ToNode=”CPU”/> <Arc FromNode=”CPU” ToNode=” SinkNode ”/> <Arc FromNode=”ATM” ToNode=”CPU”/> <Arc FromNode=”CPU” ToNode=”ATM”/> <Arc FromNode=”DISKS” ToNode=”CPU”/> <Arc FromNode=”CPU” ToNode=”DISKS”/> <Workload> <!−− A and H −−> <OpenWorkload WorkloadName=” t e s t ” A r r i v a l R a t e=”0” A r r i v e s A t=”CPU” DepartsAt=” SinkNode ” TimeUnits=” s e c ”> <T r a n s i t To=”CPU” P r o b a b i l i t y=”1 ”/> </OpenWorkload> <OpenWorkload WorkloadName=” Withdrawal ” A r r i v a l R a t e=”1” A r r i v e s A t=” SourceNode ” DepartsAt=” SinkNode ” TimeUnits=” s e c ”> <T r a n s i t To=”CPU” P r o b a b i l i t y=”1 ”/> </OpenWorkload> <OpenWorkload WorkloadName=” G e t b a l a n c e ” A r r i v a l R a t e=”1 ” A r r i v e s A t=” SourceNode ” DepartsAt=” SinkNode ” TimeUnits=” s e c ”> <T r a n s i t To=”CPU” P r o b a b i l i t y=”1 ”/> 25 </OpenWorkload> </Workload> <S e r v i c e R e q u e s t > <DemandServiceRequest WorkloadName=” Withdrawal ” S e r v e r I D=”CPU” ServiceDemand=” 0 . 0 0 6 3 ” TimeUnits=” s e c ” NumberOfVisits=” 20 ”> <T r a n s i t To=”ATM” P r o b a b i l i t y=” 0 . 5 5 ”/> <T r a n s i t To=”DISKS” P r o b a b i l i t y=” 0 . 4 ”/> <T r a n s i t To=” SinkNode ” P r o b a b i l i t y=” 0 . 0 5 ”/> </DemandServiceRequest> <!−− C −−> <DemandServiceRequest WorkloadName=” Withdrawal ” S e r v e r I D=”CPU” ServiceDemand=” 0 . 0 0 8 3 ” TimeUnits=” s e c ” NumberOfVisits=” 20 ”> <T r a n s i t To=”ATM” P r o b a b i l i t y=” 0 . 5 5 ”/> <T r a n s i t To=”DISKS” P r o b a b i l i t y=” 0 . 4 ”/> <T r a n s i t To=” SinkNode ” P r o b a b i l i t y=” 0 . 0 5 ”/> </DemandServiceRequest> <WorkUnitServiceRequest WorkloadName=” Withdrawal ” S e r v e r I D=”ATM” NumberOfVisits=” 11 ”> <T r a n s i t To=”CPU” P r o b a b i l i t y=”1 ”/> </WorkUnitServiceRequest> <!−− I . b −−> <WorkUnitServiceRequest WorkloadName=” Withdrawal ” S e r v e r I D=”DISKS” NumberOfVisits=”5”> <T r a n s i t To=”CPU” P r o b a b i l i t y=”1 ”/> </WorkUnitServiceRequest> <!−− F <DemandServiceRequest WorkloadName=” G e t b a l a n c e ” S e r v e r I D=”CPU” ServiceDemand=” 0 . 0 0 2 5 ” TimeUnits=” s e c ” NumberOfVisits=” 10 ”> <T r a n s i t To=”ATM” P r o b a b i l i t y=” 0 . 6 ”/> <T r a n s i t To=”DISKS” P r o b a b i l i t y=” 0 . 3 ”/> <T r a n s i t To=” SinkNode ” P r o b a b i l i t y=” 0 . 1 ”/> </DemandServiceRequest> −−> <WorkUnitServiceRequest WorkloadName=” G e t b a l a n c e ” S e r v e r I D=”ATM” NumberOfVisits=”6 ”> <T r a n s i t To=”CPU” P r o b a b i l i t y=”1 ”/> </WorkUnitServiceRequest> <WorkUnitServiceRequest WorkloadName=” G e t b a l a n c e ” S e r v e r I D=”DISKS” NumberOfVisits=”3”> <T r a n s i t To=”CPU” P r o b a b i l i t y=”1 ”/> </WorkUnitServiceRequest> </S e r v i c e R e q u e s t > </QueueingNetworkModel> 26