CLARA
CLAS12 RECONSTRUCTION AND ANALYSIS FRAMEWORK
User’s Manual

Thomas Jefferson National Accelerator Facility
12000 Jefferson Avenue, Newport News, Virginia, 23606

CLARA Users Manual

Document Control Sheet
Title: CLARA Users Manual
Version: 3.1
Issue: 2
Author: Vardan Gyurjyan
Date: 27 July 2012

Table of Contents

Preface
Who is this guide for?
Acknowledgements
Glossary of terms
Chapter 1 Physics Data Processing: A Contemporary Approach
    Data processing requirements
    Increasing physics output
    Choice of computing model and architecture
Chapter 2 The Framework
    Programming paradigm
    Data vs. algorithm
    Data classification
    Design architecture
    Services
        Service categories
        Service couplings
        Service compositions
    Service deployment
        Data processing environment, service container, and SaaS implementation
    Service registration and discovery
    Service granularity
    Service accessibility
    Service invocation
    Transient data envelope
    Service communication monitoring
        Exception propagation and reporting
    Cloud formation
        ClaRA batch deployment
Chapter 3 ClaRA SaaS In Nutshell
    All you need to convert a software application into a ClaRA service.
        It is perhaps best explained through an example.
Chapter 4 Service Developer’s Workshop
    How to install ClaRA?
        Environment Setup
        JToolBox
        Clara Platform
        What if I also need the source code?
    How do I start a platform?
        Environment Setup
        Platform Name Definition
        Running the Platform
    How to start an additional data processing environment?
    My first hello-world service
    How do I make my service available to the public?
    How do I check if my service is up and running?
    How do I request a service?
    How can I request a service based on a keyword in a description?
    Asynchronous service request
    Configuring a service
    Services that read/write EvIO files
    Application design or service composition
    ClaRA application debugging and communication logging service
Chapter 5 Application Designer
    Designing software applications
    Graphical interface
        Toolbar menus
        Clas12 PDP engine deployment and testing
Appendix A
    Clas12 Java coding standards
        Packages
        Variables
        Methods
        Naming Conventions
        Recommendations
Appendix B
    Object model design recommendations
        “Has a” versus “Is a” relationship
        More on using “is a” relationship
        Interface based design
        Contexts in which interfaces help
References

Preface

Software application development is a process of writing, testing, debugging, and maintaining the source code of particular algorithms. The source code is usually written in one of the high level programming languages (such as Java, C++, or Python). As a result, software applications represent a set of instructions that computers use to perform specific operations to achieve required behaviors. The process of designing a software application requires expertise in many different subjects including knowledge of the application domain, specialized algorithms, formal logic, and of course, syntactic and semantic knowledge of the chosen high level programming language. This is a description of a widely adopted traditional approach that is used to design and develop software applications.

So, what if a user is an expert in a specific application domain yet has limited knowledge and experience in software programming? Obviously this user cannot actively contribute to the process of developing domain specific software applications.
However, the same user can very efficiently design an application using available software building blocks: a process that does not require writing and compiling source code.

The ClaRA application design process differs from the traditional approach in two major ways. First, ClaRA application design is performed by linking together graphical icons (representing selected software building blocks, called services) on the design canvas of the ClaRA designer graphical interface; the design is then “compiled” into a coherent application ready for execution. The second and critical difference is that ClaRA applications execute according to the rules of data flow instead of the more traditional programming approach, where a sequential series of instructions (lines of code) is written to perform a required algorithm. In this respect ClaRA application design promotes data and data flow as the main concept behind any physics data processing algorithm. ClaRA service execution is data-driven and data-dependent. The flow of data between services in the application determines the execution order of services within the application. These differences may seem minor at first, but the impact is revolutionary because it allows the data paths between application parts (services) to be the application designer’s main focus. A ClaRA service has inputs, processes data, and produces output data. Once a given service receives valid data, that service executes its engine, produces output data, and passes that data to the next service in the dataflow path.

This guide shows how to develop ClaRA services (application building blocks) and compose data processing applications based on these services. In this guide, the term ClaRA service has a very specific meaning. It is similar to the SOA (Service Oriented Architecture) service definition and functionality.
A ClaRA service provides an independent software component that is combined with other ClaRA services to assemble a final physics data processing application. Let us elaborate on this definition, assuming that the reader is familiar with object-oriented programming. Consider the difference between objects and services. An object represents an entity in a specific domain, being nothing more than data combined with associated processing routines. A service, on the other hand, is an atomic, self-contained piece of software. The ClaRA framework is responsible for locating, loading into memory, and connecting/linking services at runtime. ClaRA provides a unique environment for an application designer to load services dynamically as needed, deploy services at a desired granularity, link services across the network and across multiple virtual machines, combine services written in different programming languages, etc. Service specific information, including meta-data, is available through the ClaRA service interface. This is a guide to a new approach for developing physics data processing applications using self-contained, compiled applications, called services, which interact with other services through a well-defined interface.

Who is this guide for?

This guide is designed to aid service programmers as well as physics data processing application designers. Chapter 2 describes technical and implementation details of the ClaRA framework. Chapters 2, 3 and 4 are for developers who are already familiar with the basic concepts of object-oriented programming and have experience in programming in one of the high level programming languages, such as Java, C++ or Python. Chapters 3 and 4 provide service programmers with concrete ClaRA service examples.
If you are an experienced programmer and are familiar with SOA, then Chapter 3 and the section titled “Clas12 PDP engine deployment and testing” in Chapter 5 will be enough to start developing ClaRA services. Physics data processing application designers not interested in service programming and framework implementation details can skip Chapters 2, 3 and 4. However, the interface definition description in Chapter 2 will be very helpful for designing physics data processing applications.

Acknowledgements

I would like to express my gratitude to those who helped me through this project; to all those who provided support, offered comments, and assisted in the editing, proofreading and design of this manual. I would like to extend special thanks to Carl Timmer for his invaluable support during the development of this project, as well as for proofreading and editing this manual. I would like to thank students Sebouh Paul and Sebastian Mancille, as well as research scientist Gagik Gavalian (ODU), for their implementation efforts. Their comments and suggestions helped in shaping the ClaRA framework. Without their efforts ClaRA would have remained only a theoretical concept. Finally, I would like to thank the entire Hall-B off-line group for their continuing support of the ClaRA project. It has been a pleasure collaborating with Dave Heddle (CNU), Jerry Gilfoyle (URich), Dennis Weygand, Veronique Ziegler, Mac Mestayer, Johann Goetz, Yelena Prok, Maurizio Ungaro, and others.

Glossary of terms

ClaRA cloud: A collection of servers hosted on the network to store, manage, and process experimental physics data.
Data Processing Environment (DPE): A Java or C++ process that provides a complete run-time environment for deployment, execution and management of ClaRA services.
Deploying a Service: The process of dynamically loading the shared object of a service engine and presenting it as a ClaRA service inside of a data processing cloud.
Linking Services: Each ClaRA service has an input and an output. Typically the services are "linked" together into a chain to design a specific application, starting from the "event source" (e.g. reading from a file) and ending at some "event sink" (e.g. writing to a file).
Orchestrator: A ClaRA based program designed to coordinate the execution of one or more services. This process usually runs outside of the DPE.
PDP: Physics data processing.
Platform: A master DPE where ClaRA cloud registration, discovery and administrative services are running.
SaaS: Software as a service (SOA component).
Service Container: ClaRA service naming convention, used to logically group services.
Service Engine: A compiled class (shared object, i.e. a class file in Java and a .so file in C++) that implements the ClaRA interface and is designed to provide certain data processing functionality.
SOA: Service oriented architecture.
Transient data envelope (or communication envelope): ClaRA message structure that contains data (required for service execution), as well as communication and service operational details.

Chapter 1 Physics Data Processing: A Contemporary Approach

Data processing requirements

Modern high energy and nuclear physics experiments require significant computing power to keep up with large data volumes. To achieve quality data analysis, intellectual input from diverse groups within a large collaboration must be brought together. Such analysis in a collaborative environment has historically involved a computing model based on self-contained, monolithic software applications running in batch-processing mode. This model, if not organized properly, can be inefficient in terms of deployment, maintenance, response to errors, update propagation, scalability and fault-tolerance. The CLAS offline group has experienced such problems during the fifteen years of operation of the CLAS on-line and off-line software.
Even though these challenges are common to all Physics Data Processing (PDP) applications, the small size of the CLAS offline group magnified their effect.

Experimental configurations have become more complex and compute capacity has expanded at a rate consistent with Moore’s Law. As a consequence, these computing applications have become much more complex, with significant interaction between diverse program components. This has led to computing systems so complex and intertwined that the programs have become difficult to maintain and extend.

In large collaborations it is difficult to enforce policies on computer hardware. For example, groups use whatever computing resources they have at their home institutions, which evolve as new hardware is added. Additional software and organizational effort must be put in place to update and rebuild software applications for new hardware or OS environments.

In order to improve productivity, it is essential to provide location-independent access to data, as well as flexibility of design, operation, maintenance and extension of physics data processing applications. These applications have a very long lifetime, and the ability to upgrade technologies is therefore essential. They must be organized in a way that easily permits the discarding of aged components and the inclusion of new ones without having to redesign entire software packages at each change. The addition of new modules and removal of unsatisfactory ones is a natural process of the evolution of applications over time. Experience shows that software evolution and diversification is important and results in more efficient and robust applications.

New generations of young physicists doing data analysis may or may not have the programming skills required for extending or modifying applications that were written using older technologies.
For example, Java is the main educational programming language in many universities today, but most of the data production software applications are written in C++ and some even in FORTRAN.

The offline software of the CLAS12 project aims at providing tools to the collaboration that allow design, simulation, and data analysis to proceed in an efficient, repeatable, and understandable way. The process should be designed to minimize errors and to allow crosschecks of results. As much as possible, software engineering related details should be hidden from collaborators, allowing them to concentrate on the physics.

Increasing physics output

There are well-known practices leading to improved software productivity and quality. These include software modularity, minimized coupling and dependencies between modules, simplicity and operational specialization of modules, technology abstraction (including high level programming languages), and most importantly, rapid prototyping, deployment and testing cycles. It is also important to take into account the qualifications of software contributors: the physicist best understands the physics process and algorithms, and the computer scientist/programmer has advanced skills in software programming. An environment that encourages collaboration, with code development responsibilities clearly separated and established, can increase the quality and number of physics data analysis contributions.

The CLAS12 software group has studied different PDP frameworks and has searched for contemporary approaches and computing architectures best suited to achieve the previously described goals. The CLAS12 framework design was inspired to a great extent by the GAUDI framework that was adopted by the LHCb experiment. The CLAS12 framework includes GAUDI’s data centricity, its clear separation between data and algorithms, and its data classifications.
However, GAUDI is based on an Object Oriented Architecture (OOA), requiring compilation into a self-contained, monolithic application that can only be scaled in batch or grid systems. This approach usually requires that a relatively large software group be involved in the development, maintenance and operation.

Choice of computing model and architecture

While researching emerging computing trends, the cloud-computing model caught our attention. This model promises to address our computing challenges. It has matured and many scientific organizations, including CERN, are moving in that direction. The cloud-computing model is based on a Service Oriented Architecture (SOA). SOA is a way of designing, developing, deploying and managing software systems characterized by coarse-grained services that represent reusable functionality. In SOA, service consumers compose applications or systems using the functionality provided by these services through a standard interface. SOA is not a technology; it is more like a blueprint for designing and developing computational environments. Services usually are loosely coupled, depending on each other minimally. Services encapsulate and hide technologies as well as programming details used inside a service.

Chapter 2 The Framework

Programming paradigm

The ClaRA framework uses a service-oriented architecture to enhance the efficiency, agility, and productivity of PDP processes. Services are the primary means through which physics data processing logic is implemented. PDP applications, developed using the ClaRA framework, consist of services, running in a context that is agnostic to the global data processing application logic. Services are loosely coupled and can participate in multiple algorithmic compositions. Legacy processes or applications can be presented as services and integrated into a PDP application. Simple services can be linked together and presented as one complex, composite service.
This framework provides a federation of services, so that service-based PDP applications can be united while maintaining their individual autonomy and self-governance. It is important to mention that ClaRA makes a clear separation between the service programmer and the PDP application designer. The physicist can be productive by designing and composing PDP applications using available, efficiently and professionally written services in the inventory without knowing service programming technical details. Services usually are long-lived and are maintained and operated by their owners on distributed ClaRA service containers. This approach gives an application designer the ability to modify PDP applications by incorporating different services in order to find optimal operational conditions, thus demonstrating the overall agility of the ClaRA framework.

Data vs. algorithm

Message passing is the most popular communication model for distributed computing. It is key for building SOA-based frameworks. This model is attractive because messaging does not emulate the syntax of programming language function calls (as CORBA and RPC do, for example). Instead, structured data messages are passed between distributed components (i.e. services). In this distributed communication model success largely depends on the clever design of the message structure: a communication envelope that describes not only transferred data but also communication and service operational details. For service communication to be truly useful, every party has to use the same vocabulary for expressing the communication details (i.e. a common message interface). The ClaRA framework provides developers with the means for interacting with services based on publish-subscribe (cMsg) message exchanges. But such explicit interactions, where a service invokes operations exported by the predefined interface of a well-known target service, are only one piece of the messaging puzzle.
To make this clear, consider a persistency service that converts ClaRA transient data into a ROOT tree. Using ClaRA tools one can link a charged particle tracking service to this persistency service for storing reconstruction results in a ROOT format. In this particular scenario, the persistency service (i.e. invocation target) is known in advance and the responsibilities between the requestor service and the provider service are defined in a service contract. But that same messaging strategy is far less suitable for indicating event occurrences, for example a file-not-found exception. In such situations, the developer of the service either doesn’t know who is interested in the event, or doesn’t want to hardcode the event handling logic in the service. Indeed, doing so would increase its complexity and reduce its reusability and maintainability. What ClaRA provides for such cases is a way to deliver event notification to services that register their interest in one or more events. This is possible due to the ClaRA message envelope design (service communication message structure) that contains event notification. ClaRA services are loosely coupled: there are no dependencies between services, because event-producing services typically invoke generic operations such as execute/notify (rather than target service specific algorithmic methods). Moreover, a service developer is unable to predict future customers (i.e. services that will be linked to it). Only the designer of the final physics data processing application (service composition) knows the event/data flow outline. Rather than contacting services directly, the implicit invocation mechanism only signals that output data is ready (an event has occurred); it does not say what needs to be done to that data (how to react to that event). This clearly improves the maintainability of the event-producing service, and it simplifies reengineering processes. ClaRA services can be considered as event handlers for one another.
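The implicit invocation idea described above can be illustrated with a small, self-contained Java sketch. It is not the cMsg or ClaRA API: the Bus class, the event type names and the payload strings are all hypothetical, chosen only to show a producer that signals events without knowing which services, if any, have registered interest in them.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch: none of these names come from the ClaRA or cMsg APIs.
class EventNotificationSketch {

    /** Minimal in-process stand-in for a publish-subscribe bus. */
    static class Bus {
        private final Map<String, List<Consumer<String>>> handlers = new HashMap<>();

        /** A service registers its interest in one event type. */
        void subscribe(String eventType, Consumer<String> handler) {
            handlers.computeIfAbsent(eventType, k -> new ArrayList<>()).add(handler);
        }

        /** The producer only signals that something happened; it never names a receiver. */
        void publish(String eventType, String payload) {
            List<Consumer<String>> interested =
                    handlers.getOrDefault(eventType, Collections.emptyList());
            for (Consumer<String> handler : interested) {
                handler.accept(payload);
            }
        }
    }

    /** Runs the scenario and returns a log of what the subscribers did. */
    static List<String> run() {
        Bus bus = new Bus();
        List<String> log = new ArrayList<>();

        // Two independent consumers; the producer below knows neither of them.
        bus.subscribe("data-ready", data -> log.add("persistency stored " + data));
        bus.subscribe("file-not-found", path -> log.add("monitor reported missing " + path));

        // A tracking service announces its output and an I/O problem.
        bus.publish("data-ready", "tracks#1");
        bus.publish("file-not-found", "run42.evio");
        // Nobody subscribed to this event type, so nothing happens.
        bus.publish("calibration-changed", "dc-t0");
        return log;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Adding a third subscriber, or replacing the monitor, requires no change to the code that publishes the events, which is the maintainability benefit the text attributes to implicit invocation.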
Since event handlers are external to other services, modifying the workflow of a handler does not require modification of any event-producing services.

Data classification

Even though the ClaRA service interface is data agnostic, the user needs to supply a specific type of data to a specific service. Service input meta-data is available through a description and within each message (transient data envelope) itself. So, the focus of an application designer who composes a PDP service-based application is not a traditional algorithm (i.e. a thread where one method calls another method), but rather a data flow. This is a clear paradigm shift from traditional software programming. In this approach, data and the modules that transform it are the tools for designing an application. Thus, a ClaRA application consists of services (encapsulating traditional software algorithms) communicating data among each other. One possible example would be a TrackFinder service that encapsulates a tracking algorithm (an engine, in ClaRA terminology) that requires hits as input data and produces tracks as output data.

ClaRA defines three basic categories of data:
a) Event data, representing actual physics raw events and subsequent alterations of them (for example reconstructed data, simulated data, DST, etc.),
b) Detector data, representing the experimental apparatus (slow control data, geometry and calibration data, magnetic field maps, etc.), and
c) Statistical data, representing a result of Event data processing (tuples, histograms, etc.).

An interesting design choice adopted by the GAUDI framework developers, to separate transient and persistent data types, was influential for ClaRA. The fact that most of the PDP application services are independent of the technology used for object persistency made this design choice a natural decision for ClaRA.
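As a rough illustration of the TrackFinder example and of the separation between transient and persistent data, consider the following Java sketch. All names here (Engine, Hit, Track, TrackFinder, CsvPersistency) are hypothetical and are not ClaRA types; the "tracking" is a toy computation, and CSV text merely stands in for a real persistency format such as a ROOT tree.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: interface and class names are illustrative, not ClaRA types.
class TransientDataSketch {

    /** Transient event data: plain in-memory objects. */
    static class Hit {
        final int layer;
        final double x;
        Hit(int layer, double x) { this.layer = layer; this.x = x; }
    }

    static class Track {
        final int nHits;
        final double meanX;
        Track(int nHits, double meanX) { this.nHits = nHits; this.meanX = meanX; }
    }

    /** Data-in, data-out contract shared by every engine in this sketch. */
    interface Engine<I, O> {
        O execute(I input);
    }

    /** Toy track finder: groups all hits into a single track candidate. */
    static class TrackFinder implements Engine<List<Hit>, List<Track>> {
        public List<Track> execute(List<Hit> hits) {
            List<Track> tracks = new ArrayList<>();
            if (hits.isEmpty()) {
                return tracks;
            }
            double sum = 0.0;
            for (Hit hit : hits) {
                sum += hit.x;
            }
            tracks.add(new Track(hits.size(), sum / hits.size()));
            return tracks;
        }
    }

    /** Persistency engine: the only place that knows the storage format (CSV here). */
    static class CsvPersistency implements Engine<List<Track>, String> {
        public String execute(List<Track> tracks) {
            StringBuilder out = new StringBuilder("nHits,meanX\n");
            for (Track t : tracks) {
                out.append(t.nHits).append(',').append(t.meanX).append('\n');
            }
            return out.toString();
        }
    }

    /** Hits go in, tracks come out, and only then is a persistent form produced. */
    static String run() {
        List<Hit> hits = new ArrayList<>();
        hits.add(new Hit(1, 1.0));
        hits.add(new Hit(2, 2.0));
        hits.add(new Hit(3, 3.0));
        List<Track> tracks = new TrackFinder().execute(hits);
        return new CsvPersistency().execute(tracks);
    }

    public static void main(String[] args) {
        System.out.print(run());
    }
}
```

In this sketch, swapping CsvPersistency for an engine that writes another format would leave TrackFinder untouched, because the tracking engine only ever sees transient objects.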
It is inevitable that over time persistency technologies will evolve, and we think that this choice will make ClaRA PDP applications independent of them. It is also important to mention that persistent and transient data processing have very different optimization criteria. For persistent data, the goals are to optimize I/O performance, eliminate duplications and inconsistencies, and reduce data size. For transient data, however, the primary objective is to optimize execution performance (even transient data duplication can be implemented if it helps to improve performance and ease of use).

Design architecture

This framework was designed based on a specific set of principles. The fundamental unit of ClaRA-based PDP application logic is the service. Services exist as independent software programs with a common interface defined by the framework. User classes, encapsulating specific algorithms and compliant with the required interface, can be presented as ClaRA services using the ClaRA Software-as-a-Service (SaaS) implementation.

Figure 1. ClaRA framework architecture

Each service has its own set of data processing functionalities. These functionalities or capabilities, suitable for invocation by other services, can be discovered via registration information available from the ClaRA platform registry services. One of the service design recommendations is to keep a small and simple code base, which will help future programmers to easily extend, modify, maintain and port services. Services must be agnostic to any external data processing logic. Services must be discoverable and able to take part in complex service compositions. Standardizing communication between services makes it easier to adapt a PDP application to changes in one of its components and simplifies data transfer security (for example by deploying a specialized access control service).

The ClaRA architecture consists of four layers (see Figure 1).
The first layer is the PDP service bus that provides an abstraction of the cMsg publish-subscribe messaging system. Every service or component from the event-processing layer communicates via this bus, which acts as a messaging tunnel between services. Such an approach has the advantage of reducing the number of point-to-point connections required to allow services to communicate in the distributed ClaRA cloud. The service layer houses the inventory of simple/entity and complex/composite services (linked service chains presented as a single service) used to build PDP applications. An administrative registration service stores information about every registered service in the service layer, including address, description and operational details. The orchestration of data analysis applications is accomplished with the help of an application controller, resident in the orchestration layer of the ClaRA architecture. Clients from the physics complex event processing (PCEP) layer are designed to subscribe to and analyse event data in real time in order to generate immediate insight and enable instant response to changing conditions in the PDP application. A software component from the PCEP layer can subscribe to data from different (parallel running) services and/or composite service chains. This way, by correlating multiple events, PCEP components can make high-level decisions concerning, for example, particle ids, triggers, etc.

Services

Physics data analysis logic is implemented as a service or as service compositions, designed in accordance with ClaRA service design principles.

Service categories

ClaRA specifies four types of services: entity, utility, task and orchestrated task. Entity services are highly reusable and generic. They are atomic enough to take part in different service compositions. Users find many self-contained and legacy software systems very useful. These systems can be presented as utility services.
The difference between an entity service and a utility service is size and complexity. We hope that in the future the utility service definition will be deprecated. Currently, legacy software applications are temporarily labeled as utility services until they can be categorized (after proper segmentation and modularization) as entity services. Task and orchestrated task services are both composite services, with the only difference being that task services are self-governed, while orchestrated services are aggregated services controlled by the software components from the orchestration layer of the framework.

Service couplings

Two coupling modes exist between services and service consumers: Contract-to-Functional and Consumer-to-Contract. Contract-to-Functional coupling is used between ClaRA services, making them bound to a contract according to which they must receive and send data. Each service itself can be a consumer. The second mode is one in which ClaRA services are coupled to consumers (other services, orchestrators, etc.) by Consumer-to-Contract coupling, which is defined as an agreement of a service to trigger service engine execution after receiving input data. Using only these two data-in-data-out coupling contracts, services are able to abstract and encapsulate service-programming details (programming languages, technologies, algorithmic solutions, etc.). Service functional information is obtained through meta-data available as part of the contract. Service quality information can be obtained from the ClaRA platform registration services.

Service compositions

A service composition is comprised of services that have been assembled to provide the functionality required to accomplish a specific data processing task. ClaRA distinguishes between two types of service compositions: primitive and complex. Primitive compositions use message exchange across two or more services. Complex compositions, however, require an orchestrator.
Because the framework's requirement for services is to be agnostic to any physics data processing logic, one service may be invoked by multiple data processing applications, each of which can involve that same service in a different composition. A collection of entity services can form the basis of a ClaRA service repository that can be independently administered within its own physical deployment environment. So, the ClaRA framework helps to build services, service compositions, and service inventories. The service-oriented approach of ClaRA changes the overall complexion of a PDP application. Because the majority of services delivered are reusable resources agnostic to analysis, they do not belong to any one application. By dissolving boundaries between applications, physics data production is increasingly represented by a growing body of services that exist within a continuously expanding service inventory.

Service deployment

ClaRA uses SaaS technology as a way of delivering on-demand, ready-made physics data processing solutions (“service engines” in ClaRA terminology) as ClaRA services. This approach eliminates the need to install and run these engines on a PDP application user’s computers, freeing the user from complex software and hardware management. The PDP application user uses a service, but does not control the operating system, hardware or network infrastructure on which it is running. The quality of the physics data-processing application (including syntactic and semantic qualities and performance) depends highly on the quality of its constituent services. It is, therefore, absolutely critical to test and validate an engine before deploying it as a ClaRA service. Physics data-processing engines must be validated with respect to workflow, thread-safety, integrity, reliability, scalability, availability, accuracy, testability and portability.
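As one illustration of the thread-safety validation mentioned above, the following is a minimal sketch of a test harness, not part of the ClaRA distribution. It assumes a hypothetical engine modelled as a plain java.util.function.Function (so it needs Java 8 or later, newer than the JDK versions this manual targets). The class name ThreadSafetyCheck and the method consistent are invented for the example. The harness first records sequential results and then checks that concurrent calls reproduce them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical harness (not a ClaRA class): compares the results an engine
// produces sequentially with the results it produces under concurrent calls.
final class ThreadSafetyCheck {
    private ThreadSafetyCheck() { }

    static boolean consistent(Function<Integer, Integer> engine, int n, int threads) {
        // Reference results, computed on a single thread.
        List<Integer> expected = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            expected.add(engine.apply(i));
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit the same inputs to a thread pool.
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                final int input = i;
                Callable<Integer> task = () -> engine.apply(input);
                futures.add(pool.submit(task));
            }
            // Any mismatch means the engine is not safe to share.
            for (int i = 0; i < n; i++) {
                if (!expected.get(i).equals(futures.get(i).get())) {
                    return false;
                }
            }
            return true;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("Engine failed under concurrency", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // A stateless engine is expected to pass the check.
        System.out.println(consistent(x -> x * x, 1000, 8));
    }
}
```

A passing run does not prove that an engine is thread safe, since race conditions may not show up in a given run. A failing run, however, is a clear signal that the engine should not be deployed as a shared service.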
Data processing environment, service container, and SaaS implementation

The highly distributed nature of ClaRA is largely due to traits of the ClaRA service container. A service container is the physical manifestation of an abstract service representation and provides the implementation of a ClaRA service interface. A service container is a thread within the ClaRA Data Processing Environment (DPE) that provides a complete run-time environment for software components. The DPE presents a shared memory that is used by service containers to communicate transient data between services within the same DPE. This prevents unnecessary copying of the data during service communications. Services in a DPE are grouped in multiple service containers.

Figure 2. ClaRA data processing environment houses multiple service containers. Service containers use DPE shared memory for transferring data between services within the local (DPE) environment.

The ClaRA service container allows the selective deployment of services exactly when and where you need them. In its simplest state, a service container is an operating system process that can be managed by the ClaRA framework. A service container is capable of managing multiple instances of user service engines. Several service containers can coexist within the same DPE, providing the logical grouping of services. Service containers may also be distributed across multiple machines for the purposes of scaling up to handle increased data volume. ClaRA administrative services start service containers in a specified DPE. They also monitor and track the functionality of service containers by subscribing to specific events from a service container, reporting the number of requests to a specific container, as well as notifying when a successful execution of a particular service (or its failure) has occurred.

Figure 3. ClaRA service container groups multiple service engines, and provides SaaS implementation.
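The bookkeeping described above (a container that groups several engines, dispatches requests to them, and keeps track of request counts and failures) can be pictured with a small sketch. This is not the ClaRA container implementation. The class ContainerSketch, its method names, and the use of plain strings as messages are assumptions made for illustration only.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical stand-in for a service container: it groups named engines,
// dispatches requests to them, and counts requests and failures.
final class ContainerSketch {
    private final Map<String, Function<String, String>> engines = new ConcurrentHashMap<>();
    private final AtomicInteger requests = new AtomicInteger();
    private final AtomicInteger failures = new AtomicInteger();

    // Registers an engine under a name chosen by the caller.
    void deploy(String engineName, Function<String, String> engine) {
        engines.put(engineName, engine);
    }

    // Entry and exit point: passes the input to the named engine and returns
    // its output, or null if the engine is unknown or throws.
    String dispatch(String engineName, String input) {
        requests.incrementAndGet();
        Function<String, String> engine = engines.get(engineName);
        if (engine == null) {
            failures.incrementAndGet();
            return null;
        }
        try {
            return engine.apply(input);
        } catch (RuntimeException e) {
            failures.incrementAndGet();
            return null;
        }
    }

    int requestCount() {
        return requests.get();
    }

    int failureCount() {
        return failures.get();
    }
}
```

In this sketch the counters are simply read by the caller, whereas the text above describes administrative services obtaining such information by subscribing to container events.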
A ClaRA service container provides the message flow in and out of a deployed service. It also handles a number of facilities, such as service lifecycle and data flow management. As illustrated in Figure 2, the service container manages an entry point and an exit point, which are used to dispatch a message (transient data envelope) to and from the service engine. In more complex cases, one input message can be directed into many remote service containers, each with its own routing information.

Service registration and discovery

The core of the ClaRA registration and discovery mechanism is the normative registry service with which ClaRA services and containers dynamically register. The normative service, which is started by the framework in the master DPE (platform), functions as a naming and directory service for the entire ClaRA cloud infrastructure. Services and service containers in the ClaRA registry are described using unique names, types and descriptions. The ClaRA naming convention defines the service container name as:

    DPE_host_IP_address/service_container_name

where the service_container_name is a string specified by the user. Likewise, the service name is constructed as:

    service_container_name/service_name

The description of a service is based on user-defined and/or commonly used high energy and nuclear physics data processing taxonomies. Querying the name, the type or a description defines the service discovery process. The service is advertised by its service information (see Figure 4) in the registry. By retrieving this service information, the user can discover services. Note that at the moment the service and/or service container discovery process is modest, and does not take into account service functional information.

    Service Container: Name, Host, Type, Start Time, Load
    Service:           Canonical Name, Status, Description, Author, Version, Language

Figure 4. Service and service-container registration information.
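The naming convention above can be illustrated with a short sketch. The helper class ClaraNames and its methods are hypothetical and are not part of the ClaRA API. The sketch also assumes that the container part of a service name is the full container name, including the DPE host address, which the text does not state explicitly.

```java
// Hypothetical helper (not a ClaRA class) that composes and splits names
// following the convention described in the text.
final class ClaraNames {
    private ClaraNames() { }

    // DPE_host_IP_address/service_container_name
    static String containerName(String dpeHostIp, String container) {
        return dpeHostIp + "/" + container;
    }

    // service_container_name/service_name, where the container name is
    // assumed to be the full name returned by containerName().
    static String serviceName(String containerName, String service) {
        return containerName + "/" + service;
    }

    // Splits a full service name into { DPE host, container, service }.
    static String[] split(String fullServiceName) {
        String[] parts = fullServiceName.split("/");
        if (parts.length != 3 || parts[0].isEmpty() || parts[1].isEmpty() || parts[2].isEmpty()) {
            throw new IllegalArgumentException(
                    "Expected host/container/service but got: " + fullServiceName);
        }
        return parts;
    }

    public static void main(String[] args) {
        String container = containerName("129.57.29.62", "tracking");
        System.out.println(serviceName(container, "BstCrossMaker"));
    }
}
```

Under these assumptions the name of a service also tells a client which DPE host and container to address, which is the property the registry relies on when it acts as a naming and directory service.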
ClaRA supports three service container types: Java, C++ and Python.

Service granularity

Service granularity describes the amount of physics data processing performed by a single request to a service. There is no single suggested size for all ClaRA services. To define the size of a service one should take into account the following (PDP application specific) design requirements:
- Service invocation/request frequencies
- Service network distribution
- The amount of data passed during the service interaction

In addition to the distribution and data transfer, it is important that the granularity of a service match the functional modularity of a PDP application (e.g. detector component specific reconstruction services used to build a particle identification application). One should also consider designing services with finer granularity in case there is functionality that is going to be cloned and/or changed over time (e.g. track fitting algorithms).

Service accessibility

Service accessibility describes the intended class of users of a service. ClaRA implements two types of service visibility: public and private. Public visibility means that all users within the ClaRA cloud infrastructure are able to discover and use the service. Private means that the service can be discovered, but will respond to specific clients (orchestrators and/or services) only.

Service invocation

ClaRA services are invoked using the most common SOA Request/Reply communication mechanism. Two separate implementations of this mechanism are supported: synchronous and asynchronous. In synchronous service communication, the requestor service sends a request to a service and waits. When the service has processed the request, it sends a reply. The requestor receives the reply and resumes its internal processing. The asynchronous communication mechanism is based on an event-based approach, most commonly known as publish/subscribe communication.
This type of communication is native to ClaRA, which uses the cMsg publish/subscribe middleware for message passing between services. In this mode, a requester defines an event or subject of interest and subscribes to this event. Next, the requester sends its request to the receiver service along with the subject to which the response must be returned. Whenever the service is ready it publishes the response to the requested subject.

Transient data envelope

Think of a ClaRA service as a combination of its interface (the public view of the service) and its algorithmic implementation (the private view of the service). A ClaRA service interface provides the following functionalities:
- hides the details of the implementation
- expresses the service’s functions
- provides parameters for the service operations

A ClaRA service is a software component that offers functionality on a semantic level by specifying its interface in a standardized way. A semantic level refers to a service that is self-descriptive in such a way that it can be consumed dynamically and loosely coupled by other ClaRA services with a consistent understanding of the communicated data. The major backbone of the ClaRA system is data. Data fed to services generate a data processing action. All data sent between services are required to be self-descriptive. A transient data envelope that contains service data is the main object passed between ClaRA services. The mutual understanding and acceptance of this object couples services. When we say ClaRA services are loosely coupled we mean that this transient data object is the one and only physical coupling between ClaRA services. In the ClaRA framework a special class that has implementations both in Java and C++ represents the transient data envelope. This class contains fields and methods to pack and retrieve transient data as well as describe data, service communication and service operational details.
Fields of the envelope are shown in Figure 5.

Figure 5. Transient data envelope and a service container with a single service engine

The meta-data segment of the envelope defines the type (mime-type), description, and measurement unit of the transient data object. The communication segment of the envelope is designed to inform the receiving service of the high level programming language, the version of the engine, and the engine execution status of the source service. Even though we consider only event level parallelization for ClaRA based PDP applications, we do not discard the possibility of having multi-tier services (services that are shared by multiple service based applications) in service compositions for building certain PDP applications. The requestID is designed to synchronize request/response pairs. The control segment of the envelope informs the receiver of the name of the service responsible for creating the data stored in the envelope (dataSource), the specific service name that the data is addressed to (dataDestination), as well as the name of the service that threw an exception during engine execution (exceptionSource). This segment of the transient envelope is used for controlling data flow between services of an application. Even though service engines can define additional data links, these control fields of the ClaRA transient envelope are designed to control the data flow within the application-defined service link diagram. Any exception thrown during the execution of the service engine will be passed on following the predefined data path of the PDP application, yet the control segment exceptionDestination field allows the exception data path to be specified in every communication. The fields doneBroadcast and dataBroadcast can be set by an external orchestrator of an application or by any service engine. These fields are used to store the name of a service.
This tells a service container to report the completion of a particular service engine execution. The done broadcast message envelope is designed not to have the data object in it. While done and data broadcasts are requested by the user, errors and warnings are broadcast by the service container itself whenever the service engine or the service container detects a specific alarm condition. It is important to mention that the name of the service (canonical name) defines the physical location of the service. The location information is important for designing PDP applications with location-optimized communications within the ClaRA network distributed cloud. The location information is also useful when querying sets of data generated in an area of interest (for example any orchestrator that subscribes to data from a specific service). Data versioning is a relatively new requirement within physics data processing, but is very useful for reporting purposes. It is a common means to track the services that processed data. This is useful within the system because service data processing algorithms and solutions are hidden from direct access. The engineControl field of the control segment of the transient data envelope is designed to control/configure the service engine at run time (for example to alter the configuration of the service specified for communication).

Service communication monitoring

Auditing and logging play an important role within the distributed ClaRA cloud. The anticipated complexity of PDP applications, scaled over multiple ClaRA data processing environments and multiple service containers, requires tracking and constant monitoring of service communications and in some cases data flows between services. Reliable service communications ensure that the data gets to its intended destination, thus assuring overall PDP application quality. As part of the framework’s administrative and management capabilities, ClaRA provides auditing and logging services.
These services are deployed within the ClaRA cloud master DPE, known as the platform. They can have multiple means for tracking service communications and data. System-level information about the health of the service itself and the flow of messages can be tracked and monitored. PDP application-level auditing, logging, and fault handling are accomplished through the transient data envelope metadata fields, namely the service execution status and the data description. The framework uses service data endpoints to deliver system-level errors, such as service engine thrown exceptions, as well as application-level errors (for example hot sector, detector noise, etc.).

Exception propagation and reporting

There is an underlying philosophy behind the way that communication tracking, system errors, and application faults are handled. In addition to the normal handling of the outgoing flow of transient data, additional destinations are available to the service for auditing the message and for reporting errors. The service container implementation uses special message subjects for reporting/tracking system errors and application fault events (see the paragraph titled “Transient data envelope”). Anyone interested in these events can subscribe to the specific message subject and receive notification on the occurrence of specific events. From the service implementation's point of view, in the case of an exception it simply creates a ClaRA transient data object with a proper description of the event and publishes it to a specific, predefined message subject. The ClaRA framework takes care of managing processes such as auditing, logging, and error reporting to all interested (subscribing) services and/or service orchestrators. This approach provides a separation between the implementation of the service and the details surrounding fault handling.
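The reporting pattern described above (an exception is turned into a message published to a predefined subject, and interested parties subscribe to that subject) can be sketched with a tiny in-memory publish/subscribe bus. This is a stand-in for illustration only. It does not use the cMsg API. The class ErrorReportingSketch, the subject name "error", and the use of plain strings instead of a transient data envelope are all assumptions.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical in-memory bus (not cMsg): services publish error reports to a
// predefined subject, and any number of subscribers receive them.
final class ErrorReportingSketch {
    // Assumed subject name; the subjects used by ClaRA are not given in the text.
    static final String ERROR_SUBJECT = "error";

    private final Map<String, List<Consumer<String>>> subscribers = new ConcurrentHashMap<>();

    void subscribe(String subject, Consumer<String> callback) {
        subscribers.computeIfAbsent(subject, k -> new CopyOnWriteArrayList<>()).add(callback);
    }

    void publish(String subject, String message) {
        List<Consumer<String>> callbacks =
                subscribers.getOrDefault(subject, Collections.emptyList());
        for (Consumer<String> callback : callbacks) {
            callback.accept(message);
        }
    }

    // Runs an engine; if it throws, a description of the event is published
    // to the error subject and null is returned to the caller.
    String run(String serviceName, Function<String, String> engine, String input) {
        try {
            return engine.apply(input);
        } catch (RuntimeException e) {
            publish(ERROR_SUBJECT, serviceName + ": " + e.getMessage());
            return null;
        }
    }
}
```

The engine itself contains no logging or fault handling code in this sketch. A logger or an orchestrator only has to subscribe to the error subject.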
The implementer of a service need only be concerned that the service has a place to put such information, whether it is information concerning the successful processing of good data, or the reporting of errors and bad data. Exception events can be handled at both the individual service level and the service orchestrator level. A PDP application may make use of different implementations of individual services over time. The tracking of a fault occurrence or the auditing of an individual message can be tied to the context of a PDP application’s independent orchestrator that oversees the exception status of the entire cloud deployment. For this purpose the ClaRA framework provides a normative service that subscribes to specific exception events and logs them in the ClaRA database (see Chapter 4, paragraph “ClaRA application debugging and communication logging service” for more details).

Cloud formation

Conceptually a ClaRA PDP application designer and/or user acquires physics data processing services from a ClaRA network distributed environment (i.e. cloud) and then designs and runs an application based on selected services.

Figure 6. ClaRA Cloud formation

Therefore, the ClaRA cloud offers users services to access PDP algorithms and applications, and persistent and/or transient data resources. Figure 6 shows the relationship between services and the data transfer modes between services in a ClaRA cloud environment. A ClaRA cloud consists of multiple data processing environments (see paragraph “Data processing environment, service container, and SaaS implementation”), each providing a complete run time environment for service deployment and operation. Each of the DPEs of the ClaRA cloud hosts at least one service container with at least one service. Scalability and flexibility are the most important features driving the emergence of Cloud computing. ClaRA services and DPEs can be scaled across geographical locations, software configurations and performances.
For data transfer efficiency reasons, transient data communication between service containers of the same language, within a DPE, is established through shared memory. Data that is sent across language barriers or across the network is transferred through pub-sub middleware (the cMsg publish-subscribe communication protocol).

ClaRA batch deployment

ClaRA native cloud deployment is incompatible with existing cluster batch queuing systems.

Figure 7. ClaRA batch job queuing system deployment

ClaRA alleviates the incompatibility of specific cloud computing applications deployed in a traditional queuing system by extending ClaRA’s cloud scheduler functionality. As a result the ClaRA cloud scheduler is capable of dynamically acquiring and using available computing nodes within a queuing system. After getting permission and access to cluster nodes, the cloud scheduler will start two processes (jobs, if you like): Java and C++ DPEs. Information about the newly started DPEs will be delivered to all PDP application orchestrators. This will trigger a new set of deployments of services used to compose a user-specific PDP application. This mechanism assumes that PDP application orchestrators are running on a dedicated computing node outside of a batch processing system (see Figure 7).

Chapter 3 ClaRA SaaS in a Nutshell

The ClaRA service model assumes that the software, as well as the solution itself, is provided as a complete service. This approach is referred to as Software as a Service (SaaS). A ClaRA service may be concisely described as a software application (service engine) that is deployed on a ClaRA DPE and can be accessed locally as well as globally over the Internet. With the exception of a user’s and other services’ interactions with a service engine, all the aspects of a service are abstracted away (including algorithmic solutions, composition, inheritance, technology, etc.).
As was mentioned, ClaRA SaaS supports multiple users and provides a shared data model through a single instance, known as a multi-tenancy model (i.e. services that are shared between multiple PDP applications). The use of the multi-tenancy model in the ClaRA SaaS implementation dictates the only requirement on service software: the software must be thread enabled or thread safe.

All you need to convert a software application into a ClaRA service

In order to present software as a ClaRA service we need two things:
- an understanding of the ClaRA transient data envelope
- an implementation of the ClaRA service interface (or inheritance from the ClaRA service abstract class)

The ClaRA transient data envelope is described in the previous chapter. The ClaRA framework provides the class representing transient data both in Java and C++. An object of this class is passed as a parameter to the ClaRA service interface methods. The table below describes the ClaRA service interface methods.

    Method signature                               Description
    public void configure(JioSerial data);         Service configuration
    public JioSerial execute(JioSerial data);      Service engine execution
    public JioSerial execute(JioSerial[] data);    Service engine execution
    public String getName();                       Returns service engine name
    public String getAuthor();                     Returns name of engine author
    public String getDescription();                Functional description of service engine
    public String getVersion();                    Returns version of engine
    public String getLanguage();                   Returns engine programming language
    public void destruct();                        Graceful withdrawal of service/engine

Table 3. ClaRA service interface. JioSerial is the class that represents the transient data format in Java.

The conversion is perhaps best explained through an example. Let us consider a class that has a method containing a sequence of statements that implements a specific physics data processing algorithm, for example the Clas12 central barrel tracker cross-list definition.
The code is presented below without a detailed explanation of the algorithm, since the goal of this exercise is to demonstrate the technical aspect of converting a class into a ClaRA service.

    import java.util.ArrayList;

    public class BTCrossListMaker {

        public EvioEvent processEvent(EvioEvent input) {
            EventBuilder builder = new EventBuilder(input);
            int BT_TAG = 1000;

            // get the single strip hits
            ArrayList<BSThit> hits = BSThit.getHits(input);

            // find the clusters from these hits
            BSTClusterFinder_ContHits gcf = new BSTClusterFinder_ContHits();
            ArrayList<BSTcluster> clusters = gcf.findClusters(hits);

            // create the line-segment objects
            ArrayList<BSTlinesegment> bstsegments = new ArrayList<BSTlinesegment>();
            for (BSTcluster thecluster : clusters) {
                BSTlinesegment theline = new BSTlinesegment(thecluster);
                bstsegments.add(theline);
            }

            // make the crosses
            BSTCrossMaker crsmk = new BSTCrossMaker();
            ArrayList<BSTcross> bstcrosses = crsmk.findCrosses(bstsegments);

            // make the list of crosses
            BTcrosslist btcrosslist = new BTcrosslist(bstcrosses, null,
                    Clas12Constants.nSVTregions, 0);

            // make the bank for the crosses
            try {
                BTEvioOutput.createBTcrosses(builder, btcrosslist, BT_TAG);
            } catch (EvioException e) {
                e.printStackTrace();
            }
            return builder.getEvent();
        }
    }

To make this class a service we need to present it as a ClaRA service engine. For that we need to edit the presented code and implement the following changes:
- change the class signature by implementing the ClaRA service interface
- physically implement all interface methods (see Table 3)

The service engine functionality is defined by the two overloaded execute interface methods. The configure method takes care of configuring the engine (in this particular case it is going to be an empty method). The comments (especially those in the execute method) describe the steps taken.
    public class BTCrossListMaker implements ICService {

        // The processEvent(EvioEvent) method of the original class is kept unchanged.

        @Override
        public void configure(JioSerial data) {
            // For this particular example we do not need configuration
        }

        @Override
        public JioSerial execute(JioSerial data) {
            // check the type of the received data
            MimeType mt = data.getMimeType();

            // reject the request if the data type is not EvIO
            if (mt != MimeType.EVIO_OBJECT) {
                JioSerial out = new JioSerial(CConstants.REJECT);
                String msg = String.format("Wrong input type: %s", mt);
                out.setStatus(CConstants.error);
                out.setDataDescription(msg);
                return out;
            }

            // now get the data from the transient data envelope and call
            // the processEvent method of the presented class
            EvioEvent result = processEvent(data.getData());

            // put the result data into the transient data envelope and return it
            JioSerial out = new JioSerial();
            out.setData(result, MimeType.EVIO_OBJECT);
            return out;
        }

        @Override
        public JioSerial execute(JioSerial[] data) {
            return null;
        }

        @Override
        public void destruct() {
            // do nothing
        }

        @Override
        public String getName() {
            return "BstCrossMaker";
        }

        @Override
        public String getAuthor() {
            return "Ziegler";
        }

        @Override
        public String getDescription() {
            return "Here can go an HTML description of this engine with complete description of accepted and returned EvIO bank structures";
        }

        @Override
        public String getVersion() {
            return "1.0";
        }

        @Override
        public String getLanguage() {
            return CConstants.LANG_JAVA;
        }
    }

This is all that is required to present a regular PDP class as a service engine. The only thing left is to compile and copy the class file (.so file in the case of C++) into the $CLARA_SERVICES directory. We have to make sure to copy all the auxiliary library files necessary for the proper functionality of the initial class into the $CLAS/lib directory.
In Chapter 5 we will discuss in more detail how to deploy and test this BTCrossListMaker service, utilizing standard EvIO persistent to EvIO transient data convertor services in a ClaRA application designed using the application designer graphical user interface.

Chapter 4 Service Developer’s Workshop

Welcome to the hands-on section of this guide. This section is for those users who would like to develop future ClaRA application building blocks, i.e. services. We will walk you through the steps you need to take to handle a particular scenario or use case. Examples provided in this chapter are in Java, yet syntactically C++ is not much different. To compile, deploy and run the provided examples you need a copy of the Java platform: JDK6 (Standard Development Kit, edition 6.0) or JDK7 (edition 7.0). Usually Java is already installed; if not, you can download the JDK software package from

    http://www.oracle.com/technetwork/java/javase/downloads/index.htm

One other software package that is necessary for checking out ClaRA from the Clas12 software repository is the Subversion version control system (SVN). If you do not already have SVN installed, install it from

    http://www.collab.net/downloads/subversion#tab-3

If you would like to offer suggestions for this section contact us at gurjyan@jlab.org.

How to install ClaRA?

Environment Setup

First make sure the CLARA_SERVICES environment variable is set and that it points to an existing directory.

    mkdir -p $HOME/clara/services
    export CLARA_SERVICES=$HOME/clara/services

or in tcsh:

    setenv CLARA_SERVICES $HOME/clara/services

The environment variables should be set in your .bashrc or .cshrc file.

JToolBox

Next you need to check out the JToolBox project and build it.

    svn co https://clas12svn.jlab.org/repos/clas12/JToolBox/trunk $HOME/clara/JToolBox
    cd $HOME/clara/JToolBox
    ant

Clara Platform

To install Clara you need to check out the latest release from the clas12 software svn repository.
    svn co https://clas12svn.jlab.org/repos/clas12/Clara/tags/2.1.3 $HOME/clara/Clara-2.1.3

The next step is to change directories to the newly checked out Clara distribution directory and run ant. However, it is important to make sure that the CLARA_SERVICES environment variable is set and is pointing to a local directory where the future service engine binary files as well as the CLAS12 software common Java libraries will be installed.

    cd $HOME/clara/Clara-2.1.3
    ant

What if I also need the source code?

The stable version of the project can be checked out from the ClaRA svn repository trunk directory, and an ant command will compile and install ClaRA locally.

    svn co https://clas12svn.jlab.org/repos/clas12/Clara/trunk Clara
    cd Clara
    ant

Note that the package requires Sun's Java Runtime Environment (JRE) release 1.6.x (or later), as well as Ant 1.8.x to compile.

How do I start a platform?

Environment Setup

Assuming you have the Clara platform checked out into $HOME/clara/Clara-2.1.3, you will need to update the PATH environment variable so that $HOME/clara/Clara-2.1.3/bin is added to the shell's executable search path.

    export PATH=$PATH:$HOME/clara/Clara-2.1.3/bin

or in tcsh:

    setenv PATH ${PATH}:${HOME}/clara/Clara-2.1.3/bin

Again, these environment variables should be included in your .bashrc or .cshrc file.

Platform Name Definition

Clara data processing environments will be assigned names based on the ClaRA naming convention. A ClaRA platform (i.e. the master DPE) will have a name composed of the IP address of the hosting node + “_platform” (e.g. 129.57.29.62_platform). Regular DPEs will get the name IP_address + “_admin” (e.g. 129.57.29.62_admin).

Running the Platform

A set of Unix shell scripts and Windows executable files is provided in the bin directory of the distribution to help start and operate the ClaRA framework environment.
You can start the platform with the following command:

    >clara-platform

Below is the console of a ClaRA platform that was started locally.

    >> **** cMsg server sucessfully started at Wed Sep 05 14:57:08 EDT 2012 **** <<
    **************************************************
    * Clara-2 Platform *
    **************************************************
    - Name = 129.57.70.17_platform
    - Host = 129.57.70.17
    - TCP port = 45000
    - UDP port = 45000
    - Start time = 2012/09/05 14:57:08
    **************************************************
    **************************************************
    * Clara-2 Data Processing Environment *
    **************************************************
    - Name = 129.57.70.17_admin
    - Host = 129.57.70.17
    - Start time = 2012/09/05 14:57:08
    - Platform host = 129.57.70.17
    - Platform name = 129.57.70.17_platform
    - Platform TCP port = 45000
    - Platform UDP port = 45000
    **************************************************
    129.57.70.17_admin registered with the platform at 2012/09/05 14:57:08

How to start an additional data processing environment?

A platform starts the master DPE (platform) that hosts the ClaRA normative services. These are the services responsible for user service registration and discovery as well as overall service runtime management and administration. By design only one DPE is allowed to run on a single computing node. Thus, additional DPEs must be started on a node other than the platform node. To start an additional DPE on a host, use the Unix shell script provided in the bin directory of the ClaRA distribution. The -host option of the script defines the host of the master DPE (platform). Below is the console of a DPE that was started using this script.
    >bin/clara-dpe -host ankaa.jlab.org
    >> **** cMsg server sucessfully started at Wed Jul 27 15:41:55 EDT 2012 **** <<
    **************************************************
    * CLARA-2 Data Processing Environment *
    **************************************************
    - Name = 129.57.29.104_admin
    - Host = 129.57.29.104
    - Start time = 2012/07/25 22:41:55
    - Platform host = 129.57.29.62
    - Platform name = 129.57.29.62_platform
    - Platform TCP port = 45000
    - Platform UDP port = 45000
    **************************************************
    129.57.29.104_admin registered with the platform at 2012/07/27 15:41:55

My first hello-world service

It has become customary when learning a new programming language or testing an unfamiliar programming environment to write a 'Hello world!' program. So, without breaking the tradition, let us create our first HelloWorldService that will serve the “Hello World!” string. To create a service we first need to inherit from the JService class and implement all the ClaRA interface methods.

    public class HelloWorldService extends JService {

According to the ClaRA terminology we are writing a hello-world service engine that is going to be deployed in one of the service containers of a data processing environment of a ClaRA cloud. So, in order to present our hello-world engine as a service we need to implement the abstract methods of the JService class. These methods interface the user code with the framework, which assures proper registration and execution of the user code as a service.
    @Override
    public String getName() {
        return "hello";
    }

    @Override
    public String getAuthor() {
        return "Gyurjyan";
    }

    @Override
    public String getDescription() {
        return "Hello World service";
    }

    @Override
    public String getVersion() {
        return "1.0";
    }

    @Override
    public String getLanguage() {
        return CConstants.LANG_JAVA;
    }

    // placeholder: the real implementation is shown further down
    @Override
    public JioSerial execute(JioSerial data) {
        return null;
    }

    @Override
    public void configure(JioSerial data) {
    }

    @Override
    public void destruct() {
    }

It is important to mention that the getName interface method defines the name of our hello-world service engine, which the framework uses to construct the canonical name of the service. The canonical name of the service is the name that is going to be used for further service registration, discovery, composition and execution. In the above example code the name of the hello-world service engine is set to “hello”. The rest of the interface methods describe the service, informing the framework and future service customers about the hello-world service engine author, version, language, etc. The operational details, i.e. the actual functionality of our hello-world service, are described in the execute method of the ClaRA interface.
    @Override
    public JioSerial execute(JioSerial data) {
        // output transient data object
        JioSerial out = new JioSerial();
        out.setLanguage(CConstants.LANG_JAVA);

        // check the input data mime-type
        if(data.getMimeType().type().equals(MimeType.STRING.type())){
            // get input data object from the JioSerial transient data
            String inputDataObject = data.getStringObject();

            // we do not care about the input data content
            // generate the output data
            out.setData("Hello World!");
            out.setDataDescription("response to "+inputDataObject);
            out.setStatus(CConstants.info);
        } else {
            // Reject with an execution status = error
            out.setData(CConstants.REJECT);
            out.setDataDescription("I can accept only strings");
            out.setStatus(CConstants.error);
        }
        return out;
    }

In order to make the hello-world code ClaRA “ready” we need to compile it and copy the compiled byte-code into the $CLARA_SERVICES directory.

    >javac -cp $CLARA_SERVICES/lib/clara.jar:$CLARA_SERVICES/lib/jtools1.0.jar:$CLARA_SERVICES/lib/cMsg-3.3.jar HelloWorldService.java -d $CLARA_SERVICES/

How do I make my service available to the public?

In the previous exercise we called our hello-world example a service, which is not a correct description of the created class. What we actually coded was the engine of the hello-world service. In order to offer the hello-world engine as a ClaRA service we need to deploy it inside a ClaRA service container. If you have read the entire manual up to this point, you already know that a ClaRA service container is the framework component that presents user engines as services. Containers are also used for logical grouping of services. In order to create a container, or use an existing container, for our hello-world engine deployment, we must communicate with the ClaRA framework administrative services.
To do that let us write our first orchestrator, which connects to the ClaRA framework and asks the framework normative services to create a service container and deploy the required engine in it. The code of our simple DeployService orchestrator is shown below.

    public class DeployService extends JOrchestrator {

As you see, user orchestrators must inherit from the JOrchestrator class. This is the class that provides public methods to communicate with the ClaRA framework. The constructor of the orchestrator simply calls the parent constructor, which establishes a connection between this orchestrator and the ClaRA framework (platform).

    /**
     * Constructor
     * Connects to the ClaRA platform
     *
     * @param name of this orchestrator
     */
    public DeployService(String name) {
        super(name);
    }

Remember that any ClaRA service or orchestrator must have a unique name. In this particular case the name is defined by the user as a parameter to the main method of the DeployService orchestrator.

    public static void main(String[] args){
        String oName       = args[0];
        String host        = args[1];
        String container   = args[2];
        String engineName  = args[3];
        String engineClass = args[4];

        // get textual representation of the host IP address
        host = CUtil.getIPAddress(host);

        // an instance of this class
        DeployService dso = new DeployService(oName);

        // print orchestrator info
        System.out.println(dso);

        // container canonical name
        String conCanName = host.trim()+"/"+container.trim();

        // service canonical name
        String serCanName = conCanName+"/"+engineName.trim();

        // get registration information from the platform normative services
        dso.updateRegistration();

The local variable oName is the user-specified name of this orchestrator. For the deployment of our hello-world engine as a service, we need to specify the host where the service will be running, the service container name, the name of the engine and finally the class name of the engine.
As you see from the code, this information is used to create both the service container and the service canonical names. As you already know, the ClaRA naming convention requires that a service container name be constructed as host/container_name and a service name be constructed as host/container_name/engine_name. Now you might ask, what is the name of the hello-world engine? The answer to this question is in the hello-world engine code, where you defined the name of the engine when you implemented the getName() ClaRA interface method. The rest of the code, where we actually communicate with the normative services and ask them to start the container and the service, is listed below.

        // create a container if it is not already registered
        if(!dso.isContainerRunning(conCanName)){
            if(!dso.startServiceContainer(host,container)){
                System.out.println("Error: Failed to create a service container on the host ="+host);
                System.exit(1);
            }
        }

        // wait until service container is fully registered
        CUtil.sleep(500);

        // start a service on the specified container if it is not already deployed
        if(!dso.isServiceRunning(serCanName)){
            if(!dso.startService(host,container,engineName,engineClass)){
                System.out.println("Error: Failed to create a service on the container = "+conCanName);
                System.exit(2);
            } else {
                System.out.println("Started "+serCanName+" at: "+ CUtil.getCurrentTime());
            }
        } else {
            System.out.println("Service "+serCanName+" exists on the platform.");
        }
        dso.exit();
    }
    }

Let us now compile and run the DeployService class. The compilation command below installs the compiled class file into the $CLARA_SERVICES directory.

    >javac -cp $CLARA_SERVICES/lib/clara.jar:$CLARA_SERVICES/lib/jtools1.0.jar:$CLARA_SERVICES/lib/cMsg-3.3.jar DeployService.java -d $CLARA_SERVICES/

Before execution make sure that the ClaRA platform is up and running.
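The naming convention used by DeployService (host/container_name for a container, host/container_name/engine_name for a service) can be sketched in isolation as follows. This is only an illustration: the CanonicalNames class and its methods are hypothetical helpers written for this manual page and are not part of the ClaRA API. The host address and container name are the ones used in the console examples of this chapter.

```java
// Hypothetical helper for illustration only; it is not part of the ClaRA API.
class CanonicalNames {

    // Container canonical name: host/container_name
    static String container(String host, String containerName) {
        return host.trim() + "/" + containerName.trim();
    }

    // Service canonical name: host/container_name/engine_name
    static String service(String host, String containerName, String engineName) {
        return container(host, containerName) + "/" + engineName.trim();
    }

    // Splits a service canonical name into {host, container_name, engine_name}.
    static String[] split(String serviceName) {
        String[] parts = serviceName.trim().split("/");
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                "expected host/container_name/engine_name but got: " + serviceName);
        }
        return parts;
    }

    public static void main(String[] args) {
        String name = service("192.168.1.132", "xContainer", "hello");
        System.out.println(name);
        System.out.println("engine = " + split(name)[2]);
    }
}
```

Keeping the composition in one place makes it harder for an orchestrator and the code that later looks the service up to disagree on the spelling of a canonical name.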
DeployService requires five parameters:

    - name of this orchestrator (arbitrary, for example xName)
    - host where the hello-world service will run
    - name of the service container (arbitrary, for example xContainer)
    - hello-world engine name (hello, as defined by the getName() method)
    - hello-world service class name (examples.service.HelloWorldService; take a look at the package statement of HelloWorldService.java)

The registration name of an orchestrator can be auto-generated using the generateName() method of the ClaRA utility package CUtil (see appendix B). This way you will avoid a name conflict that would result in the orchestrator registration being rejected. However, we do not recommend auto-generating the service container name, since it is part of the future service canonical name. After running the DeployService class you will get the following console:

    >java -cp "$CLARA_SERVICES/.:$CLARA_SERVICES/lib/*" examples.orchestrator.DeployService xName localhost xContainer hello examples.service.HelloWorldService
    **************************************************
    * ClaRA 2 client = xName
    **************************************************
    - Name = xName
    - Host = 192.168.1.132
    - Start time = 2012/08/06 19:51:20
    - UDL = cMsg://192.168.1.132:45000/cMsg/129.57.29.62_platform
    - Platform = 129.57.29.62_platform
    - Platform TCP port = 45000
    - Platform UDP port = 45000
    -Connected to host = 192.168.1.132
    **************************************************
    Started 192.168.1.132/xContainer/hello at: 2012/08/06 19:51:21

The result of the hello-world service deployment will be indicated on the platform console:

    >bin/clara-platform
    >> **** cMsg server successfully started at Mon Aug 06 19:28:11 EDT 2012 **** <<
    **************************************************
    * CLARA-2 Platform *
    **************************************************
    - Name = 192.168.1.132_platform
    - Host = 192.168.1.132
    - TCP port = 45000
    - UDP port = 45000
    - Start time = 2012/08/06 19:28:12
    **************************************************
    **************************************************
    * CLARA-2 Data Processing Environment *
    **************************************************
    - Name = 192.168.1.132_admin
    - Host = 192.168.1.132
    - Start time = 2012/08/06 19:28:12
    - Platform host = 192.168.1.132
    - Platform name = 192.168.1.132_platform
    - Platform TCP port = 45000
    - Platform UDP port = 45000
    **************************************************
    192.168.1.132_admin registered with the platform at 2012/08/06 19:28:12
    192.168.1.132/xContainer registered with the platform at 2012/08/06 19:51:20
    192.168.1.132/xContainer/Hello registered with the platform at 2012/08/06 19:51:21

How do I check if my service is up and running?

In the previous section we described the steps necessary to deploy a service engine as a service in a ClaRA platform. As proof of the successful deployment we checked the platform console printouts. However, it is likely that you will not have access to the platform console and, more importantly, you might want to check whether services with identical engines are already deployed before deploying a new service. For this reason let us see how we can get a list of the registered services on a ClaRA cloud that are using the same service engine. Take a look at this code.

    import java.util.ArrayList;

    public class ServicesByEngineName extends JOrchestrator{

        /**
         * Constructor
         * Connects to the ClaRA platform
         *
         * @param name of this orchestrator
         */
        public ServicesByEngineName(String name) {
            super(name);
        }

We repeat the standard beginning of the orchestrator class for the sake of completeness, so that the presented code can be directly replicated in your test projects. In the main method of this orchestrator we create an object of this class and request registration information from the registration and discovery services of the platform. Based on the registration information we obtain the list of all registered service canonical names on the cloud that are using the required engine.
To get the list of all registered services we use the engine name that was defined when the service developer overrode the getName() ClaRA interface method. Note that an engine name is case sensitive.

        public static void main(String[] args) {
            String oName = args[0];
            String engineName = args[1];

            // an instance of this class
            ServicesByEngineName rso = new ServicesByEngineName(oName);

            // get registration information from the platform normative services
            rso.updateRegistration();

            // get all registered service canonical names that have the required engine name
            ArrayList<String> serviceNames = rso.getServiceNamesByEngineName(engineName);

            if(serviceNames!=null && !serviceNames.isEmpty()){
                for(String sn:serviceNames){
                    // list service information
                    System.out.println();
                    rso.listServiceInformation(sn);
                }
            } else {
                System.out.println("No service with the specified engine name = "+engineName+" was found.");
            }
            rso.exit();
        }
    }

In the segment of the code above we iterate over all the service names that present the requested engine as a service on the cloud. For every service found we print short information, including the author and the version of the service. Below is a snapshot of the terminal that was used to compile and run the orchestrator presented above.

    >javac -cp $CLARA_SERVICES/lib/clara.jar:$CLARA_SERVICES/lib/jtools1.0.jar:$CLARA_SERVICES/lib/cMsg-3.3.jar ServicesByEngineName.java -d $CLARA_SERVICES/
    >java -cp "$CLARA_SERVICES/.:$CLARA_SERVICES/lib/*" examples.orchestrator.check.ServicesByEngineName xyz Hello

    Description: Hello World service
    Version : 1.0
    Author : Gyurjyan

How do I request a service?

Now that we have developed and deployed our first service, let us ask the hello-world service to greet us. As you have already guessed, we need to write an orchestrator that communicates with the ClaRA platform registration and discovery services, finds the hello-world service and sends it a request.
We will name our orchestrator ServiceByCanonicalName, since in this exercise we are planning to access a service by its canonical name. The listing of the ServiceByCanonicalName.java code is shown below.

    public class ServiceByCanonicalName extends JOrchestrator {

        /**
         * Constructor
         * Connects to the ClaRA platform
         *
         * @param name of this orchestrator
         */
        public ServiceByCanonicalName(String name) {
            super(name);
        }

        public static void main(String[] args) {
            String oName = args[0];
            String serviceName = args[1];

            // instance of this class
            ServiceByCanonicalName rs = new ServiceByCanonicalName(oName);

            // get registration information from the platform registration and discovery services
            rs.updateRegistration();

            // check the platform registration if the service in question is registered
            if(rs.isServiceRunning(serviceName)){
                // request the service
                // create a request transient data
                JioSerial dataRequest = new JioSerial();
                dataRequest.setLanguage(CConstants.LANG_JAVA);
                dataRequest.setData(oName);
                dataRequest.setMimeType(MimeType.STRING);

                // send the request
                JioSerial dataResponse = rs.syncRunService(serviceName, dataRequest,1000);

                // print the response
                System.out.println(dataResponse);
            } else {
                System.out.println("Service was not found");
            }
            rs.exit();
        }
    }

As usual, the first argument to the main method is the name of this orchestrator, required for the proper connection to the ClaRA platform. The second argument is the canonical name of the service that we are planning to access.
The first thing we do in the code is obtain the registration information from the registration and discovery services. Having this information, we proceed to check whether the service of interest is registered with the platform registration services. If it is, the service is up and running and ready to accept requests. For the request we create a transient input data object, specify the input data type as a string, and specify the programming language of the requester client (i.e. this orchestrator) that created the transient data. Defining the language of the transient data object creator is expected to be deprecated in coming releases, but for the 2.1.1 release we need it because of the transient data handling differences between Java and C++. As the input transient data content we send the registration name of the ServiceByCanonicalName orchestrator. Let us compile and run this orchestrator.

    >javac -cp $CLARA_SERVICES/lib/clara.jar:$CLARA_SERVICES/lib/jtools1.0.jar:$CLARA_SERVICES/lib/cMsg-3.3.jar ServiceByCanonicalName.java -d $CLARA_SERVICES/
    >java -cp "$CLARA_SERVICES/.:$CLARA_SERVICES/lib/*" examples.orchestrator.request.ServiceByCanonicalName xName 129.57.81.247/xContainer/hello

    mimeType = STRING
    dataUnit = undefined
    dataDescription = response to xName
    language = java
    version = 0.0
    status = Info
    control = undefined
    exceptionSink = undefined
    exceptionSource = undefined
    data = 2012/08/07 17:04:40: Hello

After the execution you will get a printout similar to the one shown above, listing the hello-world service output data details.

How can I request a service based on a keyword in a description?

By far the most common way to request a service is not through the canonical name but rather through a keyword describing the service or through the engine name. In this exercise we will demonstrate how we can find and request the hello-world service just by knowing that the description of the service contains the word “hello”.
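As a rough mental model, a lookup by description keyword is a filter over the registered service descriptions. The sketch below is not ClaRA code: DescriptionLookup, ServiceEntry and findByDescription are hypothetical names, the registry is an ordinary in-memory list, and the load figures and the second host address are made up. Two behaviours are assumptions of this sketch rather than documented facts: matching ignores case (so that “hello” matches “Hello World service”), and when several services match, the one with the smallest load is picked.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; none of these types belong to the ClaRA API.
class DescriptionLookup {

    // Stand-in for one registration record.
    static final class ServiceEntry {
        final String canonicalName;
        final String description;
        final double load; // assumed unit: requests per second

        ServiceEntry(String canonicalName, String description, double load) {
            this.canonicalName = canonicalName;
            this.description = description;
            this.load = load;
        }
    }

    // Returns the canonical name of the least loaded service whose description
    // contains the keyword (case-insensitive), or null if nothing matches.
    static String findByDescription(List<ServiceEntry> registry, String keyword) {
        String key = keyword.toLowerCase(Locale.ROOT);
        ServiceEntry best = null;
        for (ServiceEntry entry : registry) {
            boolean matches = entry.description.toLowerCase(Locale.ROOT).contains(key);
            if (matches && (best == null || entry.load < best.load)) {
                best = entry;
            }
        }
        return best == null ? null : best.canonicalName;
    }

    public static void main(String[] args) {
        // Made-up registry content for demonstration.
        List<ServiceEntry> registry = new ArrayList<>();
        registry.add(new ServiceEntry("192.168.1.132/xContainer/hello", "Hello World service", 2.0));
        registry.add(new ServiceEntry("192.168.1.140/xContainer/hello", "Hello World service", 0.5));
        registry.add(new ServiceEntry("192.168.1.132/xContainer/reader", "EvIO file reader", 0.1));

        System.out.println(findByDescription(registry, "hello"));
    }
}
```

With the made-up registry above, the lookup for “hello” selects the second entry because it matches the keyword and carries the smaller load.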
As usual we start by extending the JOrchestrator class (because all orchestrator classes are so similar, the class definition and the constructor of this particular orchestrator are not shown here). We pass two parameters to the main method of this class: a) the name of this orchestrator (arbitrary) and b) the description keyword (in this particular case the “hello” string). As always, the connection to the ClaRA framework platform is made when we call the parent (JOrchestrator) constructor. In the next step we ask the platform registration services to report all registered services.

    public static void main(String[] args) {
        String oName = args[0];
        String descKey = args[1];

        // an instance of this class
        ServiceByDescription rso = new ServiceByDescription(oName);

        // get registration information from the platform normative services
        rso.updateRegistration();

The method getServiceNameByDescription scans all registered service descriptions and returns the canonical name of a service that contains the required keyword in its description and that either is local to this orchestrator or is the remote service with the smallest load. The service load is defined as the number of requests to a service per second.

        // get the service canonical name that has a required description
        // keyword in the description.
        String serviceName = rso.getServiceNameByDescription(descKey);

Here is the rest of the code, where we request the found service by creating and sending a transient data object.
        if(!serviceName.equals(CConstants.udf)){
            System.out.println("Found a service = "+serviceName+"\n");

            // list service information
            rso.listServiceInformation(serviceName);

            // create a request transient data
            JioSerial dataRequest = new JioSerial();
            dataRequest.setLanguage(CConstants.LANG_JAVA);
            dataRequest.setData(oName);
            dataRequest.setMimeType(MimeType.STRING);

            // send the request
            System.out.println("\nRequesting the service\n");
            JioSerial dataResponse = rso.syncRunService(serviceName, dataRequest,1000);

            // print the response
            System.out.println(dataResponse);
        } else {
            System.out.println("Can not find a service with the specified description keyword.");
        }
        rso.exit();
    }

Below is a snapshot of the terminal that was used to compile and run the described orchestrator.

    >javac -cp $CLARA_SERVICES/lib/clara.jar:$CLARA_SERVICES/lib/jtools1.0.jar:$CLARA_SERVICES/lib/cMsg-3.3.jar ServiceByDescription.java -d $CLARA_SERVICES/
    >java -cp "$CLARA_SERVICES/.:$CLARA_SERVICES/lib/*" examples.orchestrator.request.ServiceByDescription xName hello

    Found a service = 192.168.1.132/xContainer/Hello

    Description: Hello World service
    Version : 1.0
    Author : Gyurjyan

    Requesting the service

    mimeType = STRING
    dataUnit = undefined
    dataDescription = response to xName
    language = java
    version = 0.0
    status = Info
    control = undefined
    exceptionSink = undefined
    exceptionSource = undefined
    data = 2012/08/07 22:22:50: Hello

Asynchronous service request

In the previous sections we demonstrated mainly synchronous service requests, where the entire action, including the handling of the service response, took place in the main thread of an orchestrator. However, a general design consideration and a key attribute of ClaRA is the loose coupling between components, which is achieved by means of asynchronous message passing between services.
So, let us walk you through an example of an orchestrator that sends an asynchronous request to the previously created hello-world service and receives the response in a callback.

    public class ServiceAsync extends JOrchestrator implements ICallBack {

        private String oName;
        private String serviceName;

        /**
         * Connects to the ClaRA platform
         * @param name of this orchestrator
         */
        public ServiceAsync(String name) {
            super(name);
        }

Note that the response from the requested service will arrive asynchronously in a callback of a JCallBack object (which we are going to create in the main method before sending a request to the service). The JCallBack object, in turn, will call the monitorCallBack method of the ICallBack interface, implemented by the orchestrator example above. As part of the ICallBack agreement we implement the monitorCallBack method, which simply prints the received (output) transient data object from the requested service.

        @Override
        public void monitorCallBack(JioSerial data) {
            // here we simply print the transient data
            System.out.println(data);

            // remove the monitor
            serviceMonitorOff(serviceName,oName,oName);

            // exit after you deal with the response
            exit();
        }

After printing the data on the screen, the orchestrator stops monitoring the service and gracefully exits (see the code snippet above). Now let us take a look at the main method of the class. Along with the usual steps of creating an object of the class and finding the service of interest by communicating with the platform registration and discovery services, we create a JCallBack object.
        public static void main(String[] args) {
            // instance of this class
            ServiceAsync rs = new ServiceAsync(args[0]);
            rs.oName = args[0];
            rs.serviceName = args[1];

            // get registration information from the platform registration and
            // discovery services
            rs.updateRegistration();

            // check the platform registration if the service in question is
            // registered
            if(rs.isServiceRunning(rs.serviceName)){
                JCallBack myCallBack = new JCallBack(rs);

                // monitor this service
                rs.serviceMonitorOn(rs.serviceName,rs.oName,rs.oName,myCallBack);

                // Now that the callback is in place, request the service
                // create a request transient data
                JioSerial dataRequest = new JioSerial();
                dataRequest.setLanguage(CConstants.LANG_JAVA);
                dataRequest.setData(rs.oName);
                dataRequest.setMimeType(MimeType.STRING);

                // async send the request
                rs.runService(rs.serviceName, dataRequest,1);
            } else {
                System.out.println("Service was not found");
            }
        }
    }

The JCallBack object is the one that will be notified whenever the monitored service (in this particular example the hello-world service) generates output data. Note that we use one of the overloaded constructors of JCallBack that ignores the sender, the subject and the type of the received message. After we set the monitor to watch the output of the hello-world service, we create the transient data and send it, requesting the hello-world service asynchronously.

Configuring a service

A ClaRA composite service typically consists of a set of entity services running on one or more machines that cooperate to provide a useful facility to an end-user or to another software system. A particularly critical aspect of PDP application deployment is the ability to configure the individual services that are part of an application. An example might be an event-service of a charged particle tracking application, which uses an input file as the front-end and has a back-end that stores reconstructed persistent data.
In this case we need to configure the event-service and define the required end points in terms of input/output file locations. In this tutorial we show how one can configure the previously discussed hello-world service. We will slightly modify our hello-world service and make it configurable. For that, let us modify the execute method so that it greets in a language defined at the service configuration stage. First we implement the configure ClaRA interface method. This method sets the private language indicator variable according to the language request that was obtained through the transient configuration data.

    // defines a language using which we say hello
    private int language = 0;

    @Override
    public void configure(JioSerial data) {
        // check the input data mime-type
        if(data.getMimeType().type().equals(MimeType.STRING.type())){
            // get data object
            String cd = data.getStringObject();
            if(cd.equalsIgnoreCase("Armenian")){
                language = 1;
            } else if (cd.equalsIgnoreCase("Italian")){
                language = 2;
            } else if (cd.equalsIgnoreCase("Russian")){
                language = 3;
            } else if (cd.equalsIgnoreCase("French")){
                language = 4;
            } else if (cd.equalsIgnoreCase("German")){
                language = 5;
            } else if (cd.equalsIgnoreCase("Greek")){
                language = 6;
            } else if (cd.equalsIgnoreCase("Hebrew")){
                language = 7;
            } else if (cd.equalsIgnoreCase("Japanese")){
                language = 8;
            } else if (cd.equalsIgnoreCase("Thai")){
                language = 9;
            }
        }
    }

Now, let us modify the previously mentioned execute method of the hello-world service so that it uses the language indicator to construct the greeting message. The switch statement and the case for the service request rejection are shown below.
    public JioSerial execute(JioSerial data) {
        // output transient data object
        JioSerial out = new JioSerial();
        out.setLanguage(CConstants.LANG_JAVA);
        out.setMimeType(MimeType.STRING);

        // check the input data mime-type
        if(data.getMimeType().type().equals(MimeType.STRING.type())){
            // get the data content
            String inputDataObject = data.getStringObject();

            // generate the output data
            switch(language){
                case 0:
                    out.setData(CUtil.getCurrentTime()+": Hello");
                    break;
                case 1:
                    out.setData(CUtil.getCurrentTime()+": Barev");
                    break;
                case 2:
                    out.setData(CUtil.getCurrentTime()+": Salve");
                    break;
                case 3:
                    out.setData(CUtil.getCurrentTime()+": Zdrastvui");
                    break;
                case 4:
                    out.setData(CUtil.getCurrentTime()+": Bonjour");
                    break;
                case 5:
                    out.setData(CUtil.getCurrentTime()+": Guten Tag");
                    break;
                case 6:
                    out.setData(CUtil.getCurrentTime()+": Gia'sou");
                    break;
                case 7:
                    out.setData(CUtil.getCurrentTime()+": Shalom");
                    break;
                case 8:
                    out.setData(CUtil.getCurrentTime()+": Konnichiwa");
                    break;
                case 9:
                    out.setData(CUtil.getCurrentTime()+": Sa-watdee");
                    break;
            }
            out.setDataDescription("response to " + inputDataObject);
            out.setStatus(CConstants.info);
        } else {
            // Reject with an execution status = error
            out.setData(CConstants.REJECT);
            out.setDataDescription("I can accept only strings");
            out.setStatus(CConstants.error);
        }
        return out;
    }

If you have been following along in this tutorial from the start, then you know that for any external communication with a service (including a configuration request) you need to write an orchestrator. An example of a service configuration orchestrator is shown below.

    public class ConfigureService extends JOrchestrator {

        public ConfigureService(String name) {
            super(name);
        }

Now let us write the main method of the ConfigureService orchestrator. As usual, we first get the registration information for the service of interest, making sure the service is properly deployed.
        public static void main(String[] args) {
            String oName = args[0];
            String service = args[1];
            String language = args[2];

            ConfigureService cs = new ConfigureService(oName);

            // get registration information from the platform normative services
            cs.updateRegistration();

            // check if the service is registered
            if(!cs.isServiceRunning(service)){
                System.out.println("Error: Can not find the registration information for the service = "+service);
                System.exit(1);
            }

            // create configuration data
            JioSerial cData = new JioSerial();
            cData.setLanguage(CConstants.LANG_JAVA);
            cData.setData(language);
            cData.setMimeType(MimeType.STRING);

            if(!cs.configureService(service,cData)){
                System.out.println("Error: Configure failed.");
            }
            cs.exit();
        }
    }

To make sure that the configuration and the following service execution are operationally correct and consistent, we first need to compile and deploy the hello-world service (refer to the previous paragraphs) and then compile and run the ConfigureService orchestrator. Here is how we compile the ConfigureService orchestrator.

    >javac -cp $CLARA_SERVICES/lib/clara.jar:$CLARA_SERVICES/lib/jtools1.0.jar:$CLARA_SERVICES/lib/cMsg-3.3.jar ConfigureService.java -d $CLARA_SERVICES/

Below is the snapshot of the terminal that was used to run the hello-world service configuration orchestrator. Note that this orchestrator uses a service canonical name to locate the service to be configured. As a reminder, the canonical name of a service is constructed as host/container-name/engine-name. The name of the deployed service can be seen in the platform console.

    >java -cp "$CLARA_SERVICES/.:$CLARA_SERVICES/lib/*" examples.orchestrator.ConfigureService oName 129.57.81.247/xContainer/Hello Armenian

As you can see, our orchestrator is not verbose.
To confirm that the hello-world service was configured properly, let us request the hello-world service and see whether it greets us in the language requested at the configuration stage (i.e. Armenian in this particular execution of the ConfigureService orchestrator; see above). In order to request the hello-world service we use the ServiceByCanonicalName orchestrator (see the “How do I request a service” paragraph).

    >java -cp "$CLARA_SERVICES/.:$CLARA_SERVICES/lib/*" examples.orchestrator.request.ServiceByCanonicalName xName 129.57.81.247/xContainer/hello

    date                 = 2012/08/15 11:38:44
    dataSource           = undefined
    dataDestination      = undefined
    dataDescription      = response to xName
    status               = Info
    mimeType             = STRING
    dataUnit             = undefined
    language             = java
    version              = 0.0
    control              = undefined
    exceptionSource      = undefined
    exceptionDestination = undefined
    data                 = 2012/08/15 11:38:44: Barev

If you get the word “Barev” in the data content of the transient data envelope from the hello-world service, then the configuration of the service was successful.

Services that read/write EvIO files

As we have learned, the ClaRA transient data envelope can package arbitrary data objects as well as primitive types and arrays of primitive types. The ClaRA framework is designed to be agnostic of the transient data object, with the exception of EvIO objects. EvIO is the JLab data acquisition (CODA) common event format, and it has been adopted as the Clas12 offline software standard data format. So it is a useful exercise to write a service that can read/write EvIO format files. One can anticipate that these services are going to be very similar and could be designed as one service with a special configuration option. Yet, let us write two separate services, one for reading and another one for writing. This will prepare us for the example in the following chapter, where we will learn how to link services together and compose an application. The code for the EvioFileReaderService class is shown below.
    public class EvioFileReaderService extends JService {

        private EvioReader reader;
        private String filename;

        // stores file IO error. no error = null
        private String openingError;

        @Override
        public void configure(JioSerial data) {
            try {
                filename = data.getStringObject();
                // close a previously opened file before opening a new one
                if (reader != null) {
                    reader.close();
                }
                reader = new EvioReader(filename);
                openingError = null;
            } catch (IOException e) {
                openingError = e.toString();
                e.printStackTrace();
            }
        }

The above code snippet shows the class definition with three object fields that store the EvIO reader object, the EvIO file name, and the file I/O operation status, as well as the implementation of the ClaRA configure interface method. The main action of this service is coded inside the execute method (shown below).

        @Override
        public JioSerial execute(JioSerial data) {
            // no reader: either configure was never called or the file could not be opened
            if (reader == null) {
                JioSerial out = new JioSerial(CConstants.REJECT);
                out.setStatus(CConstants.error);
                out.setLanguage(CConstants.LANG_JAVA);
                if (openingError == null) {
                    out.setDataDescription("file not specified");
                } else {
                    out.setDataDescription(openingError);
                }
                return out;
            }
            try {
                EvioEvent event = reader.parseNextEvent();
                // end of file: reject and report that there is nothing left to read
                if (event == null) {
                    JioSerial out = new JioSerial(CConstants.REJECT);
                    out.setStatus(CConstants.error);
                    out.setLanguage(CConstants.LANG_JAVA);
                    out.setDataDescription("no more events!");
                    return out;
                }
                JioSerial out = new JioSerial(event);
                out.setMimeType(MimeType.EVIO_OBJECT);
                out.setLanguage(CConstants.LANG_JAVA);
                out.setDataDescription("data from file " + filename);
                return out;
            } catch (EvioException e) {
                JioSerial out = new JioSerial(CConstants.REJECT);
                out.setStatus(CConstants.error);
                out.setLanguage(CConstants.LANG_JAVA);
                out.setDataDescription(e.toString());
                return out;
            }
        }

In the above code we reject the request if there was a problem opening the file and creating the EvioReader object, or if there are no more events to read. We also do this if an exception is thrown during the creation of the EvioEvent object.
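The execute method above builds the same rejection envelope in several places. One possible way to reduce that repetition is a small helper method, sketched below. To keep the sketch compilable without the ClaRA jars it uses a minimal stand-in class, Envelope, instead of JioSerial; Envelope, reject, nextEvent and the "REJECT"/"error" string values are assumptions made for this illustration only and merely mirror the setters used in the listing.

```java
import java.util.Iterator;

final class RejectSketch {

    // Stand-in for the transient data envelope; NOT the real JioSerial class.
    static final class Envelope {
        Object data;
        String status = "info";
        String language = "java";
        String description;
    }

    // Assumed placeholder values standing in for CConstants.REJECT and CConstants.error.
    static final String REJECT = "REJECT";
    static final String ERROR = "error";

    // Builds a rejection envelope whose description carries the reason.
    static Envelope reject(String reason) {
        Envelope out = new Envelope();
        out.data = REJECT;
        out.status = ERROR;
        out.description = reason;
        return out;
    }

    // Mirrors the control flow of execute(): every failure path returns early.
    // The iterator plays the role of the event reader; null means "no reader".
    static Envelope nextEvent(Iterator<String> events, String openingError) {
        if (events == null) {
            return reject(openingError == null ? "file not specified" : openingError);
        }
        if (!events.hasNext()) {
            return reject("no more events!");
        }
        Envelope out = new Envelope();
        out.data = events.next();
        out.description = "data from file";
        return out;
    }

    public static void main(String[] args) {
        Envelope e = nextEvent(null, null);
        System.out.println(e.status + ": " + e.description);
    }
}
```

With such a helper, each failure branch becomes a single return statement, which makes a forgotten return in one of the branches much harder to overlook.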
Note that the data description contains the string describing the reason for the rejection. The EvioEvent object, which is created as a result of successfully reading the EvIO file, is packed inside the transient data envelope and presented as the output of this service. The rest of the code of the service is shown below, where you see the usual implementation of the ClaRA interface methods: description, author, version, etc. of the service.

        @Override
        public JioSerial execute(JioSerial[] data) {
            return null;
        }

        @Override
        public void destruct() {
            // nothing to close if the service was never configured
            if (reader != null) {
                reader.close();
            }
        }

        @Override
        public String getName() {
            return "ReadEvio";
        }

        @Override
        public String getAuthor() {
            return "Sebouh Paul";
        }

        @Override
        public String getDescription() {
            return "Reads evio events from file and returns them";
        }

        @Override
        public String getVersion() {
            return "4.0";
        }

        @Override
        public String getLanguage() {
            return CConstants.LANG_JAVA;
        }
    }

As mentioned previously, the code of the EvIO writer service (EvioFileWriterService) is very similar to the EvioFileReaderService class presented above. It is shown below without explanation, mainly for the convenience of readers who might want to cut and paste the code into their preferred IDE for further tests and modifications.

    public class EvioFileWriterService extends JService {

        private EventWriter writer;
        private String filename;

        // stores file IO error. no error = null
        private String openingError;

        @Override
        public void configure(JioSerial data) {
            try {
                filename = data.getStringObject();
                // close a previously opened file before opening a new one
                if (writer != null) {
                    writer.close();
                }
                writer = new EventWriter(filename);
            } catch (IOException e) {
                openingError = e.toString();
                e.printStackTrace();
            } catch (EvioException e) {
                e.printStackTrace();
            }
        }

        @Override
        public JioSerial execute(JioSerial data) {
            // no writer: either configure was never called or the file could not be opened
            if (writer == null) {
                JioSerial out = new JioSerial(CConstants.REJECT);
                out.setStatus(CConstants.error);
                out.setLanguage(CConstants.LANG_JAVA);
                if (openingError == null) {
                    out.setDataDescription("file not specified");
                } else {
                    out.setDataDescription(openingError);
                }
                return out;
            }
            // accept EvIO objects only
            if (data.getMimeType() != MimeType.EVIO_OBJECT) {
                JioSerial out = new JioSerial(CConstants.REJECT);
                out.setDataDescription("expected mime type EVIO_OBJECT, received type " + data.getMimeType());
                out.setStatus(CConstants.error);
                System.out.println("error: wrong data type");
                return out;
            }
            try {
                EvioEvent event = (EvioEvent) data.getData();
                writer.writeEvent(event);
                JioSerial out = new JioSerial("done");
                out.setMimeType(MimeType.STRING);
                out.setLanguage(CConstants.LANG_JAVA);
                out.setDataDescription("data has successfully been written to file " + filename);
                return out;
            } catch (EvioException e) {
                JioSerial out = new JioSerial(CConstants.REJECT);
                out.setStatus(CConstants.error);
                out.setLanguage(CConstants.LANG_JAVA);
                out.setDataDescription(e.toString());
                return out;
            } catch (IOException e) {
                JioSerial out = new JioSerial(CConstants.REJECT);
                out.setStatus(CConstants.error);
                out.setLanguage(CConstants.LANG_JAVA);
                out.setDataDescription(e.toString());
                return out;
            }
        }

        @Override
        public JioSerial execute(JioSerial[] data) {
            return null;
        }

        @Override
        public void destruct() {
            // nothing to close if the service was never configured
            if (writer == null) {
                return;
            }
            try {
                writer.close();
            } catch (EvioException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        @Override
        public String getName() {
            return "WriteEvio";
        }

        @Override
        public String getAuthor() {
            return "Sebouh Paul";
        }

        @Override
        public String getDescription() {
            return "writes evio events to a file";
        }

        @Override
        public String getVersion() {
            return "4.0";
        }

        @Override
        public String getLanguage() {
            return CConstants.LANG_JAVA;
        }
    }

Refer to the previous paragraphs of this chapter for the steps used to compile and deploy these service engines as ClaRA services.

Application design or service composition

So far we have looked at single services in isolation, without talking about how to combine them. One of the most essential features of ClaRA is the ability to take standalone services (application building blocks) and compose an application from them. For the following exercise we will design an application consisting of two services. We will use the EvIO read and write services that were created in the previous tutorial section. The ClaRA application design process is nothing more than linking services together, which means that the output data of the first service is presented as the input data of the service linked to it. Let us now link the EvioFileReaderService to the EvioFileWriterService, creating a ClaRA application that copies the content of one EvIO format file to another file. We are not going to make EvioFileWriterService any more intelligent, so this application is purely a proof-of-concept exercise. At this point we assume that the mentioned services are already deployed in a ClaRA environment. Now let us write a simple orchestrator that will link these services together. Here is the code:

    public class LinkServices extends JOrchestrator {

        public LinkServices(String name) {
            super(name);
        }

In the main method we accept the canonical names of the services to be linked. After checking that these services are properly deployed, we link them.
        public static void main(String[] args) {
            String oName = args[0];
            String service1 = args[1];
            String service2 = args[2];

            // an instance of this class
            LinkServices lso = new LinkServices(oName);

            // get registration information from the platform normative services
            lso.updateRegistration();

            // check if service1 is registered
            if (!lso.isServiceRunning(service1)) {
                System.out.println("Error: Can not find the registration information for the service = " + service1);
                System.exit(1);
            }

            // check if service2 is registered
            if (!lso.isServiceRunning(service2)) {
                System.out.println("Error: Can not find the registration information for the service = " + service2);
                System.exit(2);
            }

            // link services
            if (!lso.linkServices(service1, service2)) {
                System.out.println("Error: linking services");
            }
            lso.exit();
        }
    }

At this point we suggest that the reader test the created ClaRA composition by writing an orchestrator that appropriately configures the EvioFileReader and EvioFileWriter services and starts the application.

ClaRA application debugging and communication logging service

The clara-platform Unix script, executed with a -log command line argument, starts the service communications logging normative service. This service logs the transient meta-data of all service communications in the ClaRA database. The parameter given to the -log option defines the minimum severity level of the service execution status that will be logged in the database. For example, the -log warning option tells the logging service to log messages having warning or error status. The -log option without a parameter logs all messages (info, warning and error) from every service and/or orchestrator actively running in a ClaRA DPE (data processing environment). Note that the clara-dpe Unix shell script also has the -log option. The table below describes the service_log table that is used to store the transient meta-data of service communications.
    Field                  Type          Null  Key  Default  Extra
    data                   varchar(100)  NO    PRI  NULL
    data_source            varchar(100)  YES        NULL
    data_destination       varchar(100)  YES        NULL
    data_description       varchar(100)  YES        NULL
    exec_status            varchar(100)  YES        NULL
    mime_type              varchar(100)  YES        NULL
    data_unit              varchar(100)  YES        NULL
    language               varchar(100)  YES        NULL
    version                varchar(100)  YES        NULL
    control                varchar(100)  YES        NULL
    exception_source       varchar(100)  YES        NULL
    exception_destination  varchar(100)  YES        NULL

Chapter 5 Application Designer

Designing software applications

In the previous chapters we focused on developing services (building blocks) for future physics data processing applications. In this chapter we will show how to build ClaRA applications from available services without writing a single line of code.

Graphical interface

The ClaRA designer user interface provides an intuitive experience of visual design and execution of service-based PDP applications. A snapshot of the graphical interface is shown below. It consists of three main parts: a) the main tool bar at the top, b) the engine/service tree representation pane on the left, and c) the multipurpose pane on the right, which hosts the tabbed pane presenting the Info panel, the service engine Deployer graphical interface and the Editor (an application designer drawing canvas).

Figure 6. ClaRA application designer.

This screenshot was taken on Mac OS X. Since Java has a pluggable look and feel, the look of the designer graphical interface on your platform might be different, but the panes and buttons will be in the same positions and will have exactly the same functionality. The shell script (Unix csh) clara-designer, located in the bin directory of the ClaRA distribution, starts the user interface.

    >bin/clara-designer

Note that the designer requires a ClaRA DPE to be running on the local node.
Any attempt to run the designer without an active DPE will result in a warning dialog message. If the ClaRA platform (master DPE) is running on a remote host and you wish to run the designer locally, you must start a local DPE first:

    > bin/clara-dpe

Toolbar menus

The toolbar contains three menus:

- File: controls access to user service engines, deployed services and stored PDP application design drawings. It also allows execution of third-party applications such as the EvIO data browser.
- Edit: contains menu items that help to graphically design and edit physics data processing applications.
- Control: deploys and runs service-based applications on a specified ClaRA DPE.

The File menu

The Open menu lets the user choose between visualizing:

- available service engines (menu item Engines)
- ClaRA deployed and active services (menu item Services)
- saved application design drawings (menu item Applications…)

It also runs an external data browser application (menu item Data Viewer).

Figure 7. Open menu.

Engines and services are visualized in a tree form in the left info pane of the graphical interface. Application design schemas are shown in the application design drawing canvas (see Figure 6). The Application… menu item opens a file chooser that helps the user navigate the file system and choose a file representing a specific ClaRA application design (illustrated above). By default, the file chooser displays the root of the apps directory, where ClaRA application design drawings are stored, in the file system location pointed to by the CLARA environmental variable. Files selected in the file chooser are visualized on the designer canvas and accessed through the Editor tab (see Figure 6). The Save As… menu item saves a designed ClaRA application representation (as shown in the Editor pane) into the $CLARA_SERVICES/apps directory.
Before the actual saving, the user is prompted to provide a name for the created/edited application. The Exit menu item gracefully exits the ClaRA application designer.

Service deployment

To deploy a user-developed PDP engine as a service, we first open the available engine classes and compiled shared object files (File-Open-Engines) in the engines tree representation pane of the designer (see Figure 8).

Figure 8. Service engines are shown on the left in a tree form. The section on the right shows the Deployer graphical interface of the ClaRA application designer.

Selecting a service engine from the tree forces the designer to switch to the Deployer tab, which provides a form for entering the required engine deployment parameters. The engine name text field is filled in automatically. The parameters that require user input are the host name and the container name for the future service. After providing all required parameters, click the Ok button to deploy the engine as a service on the required host and container of the ClaRA cloud. The result of the deployment operation is shown in the text area of the Deployer interface, indicating the date of deployment and a short summary of the operation. One can also request the newly deployed service to report its functional description. For that we first need to list all the services actively running on the ClaRA cloud. The File-Open-Services menu combination shows the topology of the ClaRA cloud services in a tree form (as shown in Figure 9). Selecting the service of interest from the tree initiates a request to the service to report its operational description, which is then shown in the Info tab of the designer.

Figure 9. Info tab of the designer shows the operational description of the selected service.
The Edit menu

The ClaRA designer Edit menu (see Figure 10) helps the user create physics data processing applications based on services that are actively deployed on the ClaRA cloud environment. The commands of the Edit menu mainly control the behavior of the Editor canvas of the interface. The Edit menu items are described below:

- Link (acceleration Ctrl^E): graphically links service-representing icons together (draws a line between them).
- Boxes: draws a box around a service icon (for visual clarity).
- Grid: draws an evenly spaced grid on the application designer canvas. Sub-menu items make the already active grid visible (Show, acceleration Ctrl^G) or invisible (Hide), and enable (On) or disable (Off) the grid. The Align command aligns service icons to the grid. Service-representing icons can be placed anywhere on the canvas; however, if the grid is enabled (Grid-On), the Align menu item snaps them back into neat alignment on the designer canvas.
- Zoom: Zoom in (In, acceleration Ctrl^I) helps the user take a close look at selected details of the PDP application design diagram. Zoom out (Out, acceleration Ctrl^O) helps the user see the big picture of a particular application design.
- Update (acceleration Ctrl^U): initiates communication with the ClaRA cloud administrative services to update the states of all services visualized on the designer canvas in terms of their deployment and interaction mechanisms (links).
- Delete Component (acceleration Ctrl^D): removes the selected component from the canvas after an additional confirmation dialog.
- Delete All: clears the entire designer canvas if the user confirms the action.

Figure 10. Edit menu

The Control menu

The Control menu commands (see Figure 11) interact directly (read/write) with the ClaRA cloud administrative services, as opposed to the Edit menu, which mostly acts on the ClaRA application designer canvas (with the exception of the Update menu action, which performs read-only access to the cloud).

Figure 11. Control menu

The Application menu accesses prebuilt PDP application orchestrators, both for deployment and for running purposes. These orchestrators are located in the bin directory under the location given by the $CLARA_SERVICES environmental variable. Orchestrators specific to service deployment are stored in the bin/deploy directory, and orchestrators that run ClaRA-based applications are stored in the bin/run directory. The Deploy and Run submenu items of the Application menu open a file chooser that helps the user navigate to and choose orchestrators located in the bin/deploy and bin/run directories respectively. The remaining operations of the Control menu are described as follows:

- Link Components (acceleration Ctrl^L): creates physical data links between the services of a graphically described PDP application. After this operation, the connections/links between services (previously drawn as black connecting lines) are drawn as green lines, indicating active data links between cloud-deployed ClaRA services.
- Remove Links: removes all active links between services shown on the designer canvas. As opposed to the Link menu action, this action removes links between services that are physically deployed on a ClaRA cloud.
- Remove Component: removes a component from the designer canvas and at the same time requests that the service remove its registration and exit from the cloud.

Clas12 PDP engine deployment and testing

The principal functionality of a ClaRA service is to take input data, process it and produce new output data.
So, by design, a ClaRA service deals with ClaRA transient data packed in a transient data envelope (see chapter 2), whether the data is coming from or going to persistent storage or another service. One of the important ClaRA design choices is the separation between persistent and transient data. This design choice is based on the realization that physics data processing applications that handle complex data structures can rapidly become unmaintainable if data access requires knowledge of the details of the persistent data structures used. Loose coupling of data storage and data usage was adopted to avoid being locked in to a single persistency technology, whether for experimental data storage or for data analysis and visualization. To simplify service engine development and testing, the framework provides convertor services between persistent data and the ClaRA transient data format. This allows service engine code to concentrate on expressing its data-handling logic rather than the details of how the data is retrieved or stored. The current Clas12 transient data format is chosen to be EvIO. Before testing any user service engines we need to deploy the provided convertor services. Figure 12 shows the deployment process of two persistency convertors (EvioToEvioReader: EvIO persistent to EvIO transient, and EvioToEvioWriter: EvIO transient to EvIO persistent) on a cloud.

Figure 12. EvIO persistent to transient data convertor services deployed.

Let us, for example, consider that we would like to debug and test the performance of the point-hit finder service engine of the Clas12 central tracker. For that we have to deploy the service engine as a service in the ClaRA platform. Figure 13 illustrates the process of deploying the PointHitFinder service engine in the CTTest container.

Figure 13. Central PointHitFinder service deployment

The final step is to design a test application based on the three services described and deployed above.
Before we graphically design the application, it is a good idea to directly ask each of the deployed services to report its functional description. This way we make sure, first, that the services are properly deployed and, second, that the functional description of each service is consistent with the expected functionality of that particular building block of our future application. To retrieve the description of each service of interest, we ask the normative registry services to return a list of all deployed services with the following sequence of actions: File-Open-Services.

Figure 14. Description of the EvioToEvioReader service

The description of the service selected from the tree (left pane of the interface) is shown in the Info panel (see Figure 14). Now we start the graphical design of the PointHitFinder service test application by selecting the Editor designer canvas. Next we drag and drop the three selected services, EvioToEvioReader, PointHitFinder, and EvioToEvioWriter (i.e. the building blocks of our application), onto the designer canvas. It is desirable to position the services on the canvas according to the intended data flow of the application. Here this means that the PointHitFinder service gets the transient data from the EvioToEvioReader service and sends the processed data to the EvioToEvioWriter service. Using the Edit-Link (acceleration Ctrl^E) menu-action sequence, we draw the links between services by holding down the left mouse button and moving from the data source service to the data destination service. At the destination service we release the left mouse button, which activates the link action confirmation dialog (shown below).

Figure 15. Link confirmation interface

After all the links are drawn we suggest saving this application for future use, using the File-Save As… (acceleration Ctrl^S) menu action sequence.
The graphical representation of the created ClaRA application is stored in the $CLARA_SERVICES/apps directory and can be accessed in the future with the File-Open-Application… menu action sequence.

Figure 16. Deployed PointHitFinder tester application based on three services (the green color of the links between services indicates active data links between the services)

The links drawn between services are solely graphical representations of the data flow of the application. In order to physically link the services together (deploy an application) we use the Control-Link Components (acceleration Ctrl^L) menu action sequence.

Figure 17. The generic-orchestrator is used to run applications that use EvIO as a transient data format

Once the designed application is deployed and active, we can use the generic orchestrator to test it. The menu sequence to use is Control-Application-Run (see Figure 17), which opens a file chooser browsing the $CLARA_SERVICES/bin/run directory.

Figure 18. User interface of the generic orchestrator that is used to run service compositions that use EvIO as transient data format

Choose the generic-orchestrator to open the graphical interface of the generic orchestrator (see Figure 18). It lets the user choose the input and output files for reading and writing persistent EvIO format files, as well as the number of events to process and the number of concurrently running application threads (the number of simultaneously running chains of services). Note that ClaRA supports event-level parallelization only. After execution of the ClaRA application is complete (shown by the progress bar of the generic orchestrator interface), the Info panel of the ClaRA designer GUI presents details of the application execution, including the average processing time per event as well as possible error and/or warning messages. The output data can be analyzed using the Data Viewer, accessed with the File-Open-Data Viewer menu action combination.
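The idea of event-level parallelization, with several identical chains of services each processing whole events independently, can be pictured with the plain Java sketch below. It uses only java.util.concurrent and does not call any ClaRA API. The method processEvent is a hypothetical stand-in for one reader, PointHitFinder, writer chain applied to a single event, and the nThreads argument plays the role of the "number of concurrently running application threads" setting of the generic orchestrator; how ClaRA actually schedules events is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class EventParallelSketch {

    // Hypothetical stand-in for one chain of services applied to one event.
    static String processEvent(int eventNumber) {
        return "event-" + eventNumber + ":processed";
    }

    // Processes nEvents events on nThreads worker threads, one whole event per task.
    // Results are collected in submission order, so the output order is deterministic.
    static List<String> run(int nEvents, int nThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (int i = 0; i < nEvents; i++) {
                final int n = i;
                pending.add(pool.submit(() -> processEvent(n)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : pending) {
                results.add(f.get());
            }
            return results;
        } catch (Exception e) {
            throw new RuntimeException("event processing failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> out = run(8, 4);
        System.out.println(out.size() + " events processed");
    }
}
```

Because each task handles one complete event and shares no state with the others, raising the thread count changes only how many events are in flight at once, not the result for any single event.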
Appendix A Clas12 Java coding standards

Packages

- Create a new package for each service project. All developed classes/interfaces that support the proper functioning of a service must be part of the package.
- Create a directory structure in accordance with Java package conventions.

Classes

- Place each class in a separate file.
- Begin each class with a comment including:
    o File name
    o Date
    o Author
    o Brief description (the rationale for constructing the class)
- Write all comments using /** … */ comments following javadoc conventions.
- Preface each class with a comment describing the purpose of the class.

Example:

    /**
     * JSA: Thomas Jefferson National Accelerator Facility
     * This software was developed under a United States Government
     * license, described in the NOTICE file included as part of this
     * distribution.
     * Copyright (c), Aug 13, 2010
     * <p>
     * This singleton class is a source for Clara platform system
     * parameters, including expid and cMsg parameters.
     * It parses the setup.xml file to get platform configuration parameters.
     * In case the file is not found it will assign default values to the
     * Clara platform system parameters.
     * @author Vardan Gyurjyan
     * @version 1.3.1
     */
    public class AConfig { … }

Variables

Authors are encouraged to use javadoc conventions to describe the nature, purpose, constraints, and usage of instance and static variables.

Example:

    /**
     * The current number of elements.
     * Must be non-negative, and less than or equal to capacity.
     */
    protected int count_;

Methods

Use javadoc conventions to describe nature, purpose, preconditions, effects, algorithmic notes, usage instructions, reminders, etc. Be as precise as reasonably possible in documenting the method. Along with the description of the method, always use the following javadoc tags:

    @param paramName description.
    @return description of return value
    @exception exceptionName description

Example:

    /**
     * Sends input data to the service, which will trigger core engine
     * execution
     * @param containerName Service container name
     * @param serviceName Service name
     * @param input Input data object
     * @param requestId Service request ID for multiple input
     *                  synchronization
     * @return false if failed
     */
    public boolean requestService(String containerName, String serviceName, Object input, int requestId) { … }

Naming Conventions

- packages: lowercase
- classes: camel notation, i.e. attached words with the first letter of each word capitalized. Example: CoarseTrackFinder
- exception classes: camel notation, where the class name ends with the word Exception. Example: TrackReconstructionException
- interfaces (pick one): camel notation, where the interface name
    o ends with the word Interface, OR
    o ends with Ifc, OR
    o starts with I
- constants (finals): UPPER_CASE_WITH_UNDERSCORES
- private or protected (pick one):
    o first word lowercase with a trailing underscore_, OR
    o prefix with this, OR
    o prefix with my
- static private or protected: first word lowercase with two trailing underscores__
- local variables: first word lowercase, internal words capitalized
- methods: first word lowercase, internal words capitalized

Code layout

- Number of spaces to indent.
- Left-brace ({) placement at end of line or beginning of next line.
- Maximum line length.
- Spill-over indentation for breaking up long lines.
- Declare all class variables in one place (by normal convention, at the top of the class).

Recommendations

- Avoid using inner classes.
- Limit class size to less than 500 lines of code (including comments).
- Be precise about what you are importing. Check that all declared imports are actually used.
- When reasonable, consider writing a main method for the class in each Java file. The main method should provide a simple unit test or demo.
- If others can implement your class functionality completely differently, define an interface, not an abstract class. Generally, use abstract classes only when they are partially abstract, i.e. they implement some functionality that must be shared across all subclasses.
- Internal data objects passed across services MUST implement the Serializable interface.
- Never declare instance variables as public. Instead provide access (get/set) methods to protected/private variables. Making variables public gives up control over the internal class structure.
- Minimize statics, except for static final constants.
- Prefer long to int, and double to float, to avoid arithmetic overflow and underflow.
- Avoid giving a variable the same name as one in a superclass.
- Prefer declaring arrays as Type[] arrayName rather than Type arrayName[].
- Write methods that perform only a single algorithmic operation. In particular, separate methods that change object state from those that just rely upon it.
- Prefer synchronized methods to synchronized blocks.
- Whenever reasonable, define a default (no-argument) constructor so objects can be created via Class.newInstance(). This allows classes of types unknown at compile time to be dynamically loaded and instantiated.
- Use the method equals() instead of the operator == when comparing objects. In particular, do not use == to compare Strings.
- Declare and initialize a new local variable rather than reusing (reassigning) an existing one whose value happens to no longer be used at that program point.
- Assign null to any reference variable that is no longer being used, to enable garbage collection.

Appendix B Object model design recommendations

Most likely the service engine developer will use an object-oriented programming paradigm to develop physics data processing engines. In this appendix we present a set of suggestions that help prevent common mistakes in object-oriented design and programming.
“Has a” versus “Is a” relationship

These two relationships are the basic mechanisms for developing the object model of a software program. However, the “is a” relationship, or inheritance, is applicable only in very specific contexts, and in the majority of physics data processing algorithms the appropriate choice is a “has a” relationship. In other words, the typical goal of extending the responsibilities of a certain class is better served by delegating work to other, more specialized objects. So, in order to choose the proper relationship, we suggest the following two recommendations:

- Use the “has a” relationship to extend the duties of a certain class by delegating work to other, more appropriate objects.
- Use the “is a” relationship if you need to extend the attributes and methods. Take into account, though, that the “is a” relationship weakens encapsulation within a class hierarchy, so it is a good idea to use this mechanism only when you really need this kind of relationship.

More on using “is a” relationship

Use this type of object-oriented relationship if the class you are developing:

- can be described as “is a special kind of”, not “is a role played by a”
- extends rather than overrides or nullifies its superclass
- does not extend or subclass a utility class (i.e. always keep methods providing useful functionality in a utility class and do not inherit from it)
- expresses a special kind of role that is a natural extension of the parent class

Interface based design

In object-oriented software programs, objects interact with other objects to complete a designed algorithm. An interface is a powerful concept in object-oriented programming that makes things pluggable. An interface holds a collection of method signatures (with neither a description nor any source code behind these method signatures). So, in effect, an interface defines a contract between interacting objects. Interface design is very potent, and unfortunately it is often used without thorough analysis of the software task under consideration.
Overuse of this concept will make an object model more complex and abstract.

Contexts in which interfaces help

If object connections (i.e., what do I know?) and object interactions (whom do I interact with?) are predefined and are not going to be changed or expanded, we do not need an interface-based design. However, if during the design process we claim that “we don’t care what kind of object we are interacting with as long as that object provides functionality in terms of a specific method signature”, then we need to design our object model utilizing interfaces. Interface-based design is used in a variety of scenarios, yet we would like to mention some common cases in which interfaces really help.

- Use interfaces to factor out common method signatures to bring a higher level of abstraction to an object model. In other words, if you have methods with similar signatures in multiple classes, you can consolidate these methods under one interface. This also helps to achieve an overall visual simplification of the code.
- Use interfaces if you would like to exercise a so-called proxy strategy. For example, say I am interested in the basic characteristics of a detector “hit”. Instead of knowing the object type that deals with the specific detector hit (for example, a DC hit), one can ask a Hit object (playing a proxy for all detector components) about the basic characteristics of the DC hit. Here we assume that we have common methods for manipulating detector hits and can factor them into an interface that is implemented by the DC hit class.
- Use interfaces to embrace future expansions of the object model. We can factor out method signatures and group them in an interface early, so that objects from different classes and authors can be gracefully accommodated in the CLAS12 software base in the future.

References

1. T. Erl, SOA: Principles of Service Design, Prentice Hall, 2007, ISBN 0-13-234482-3.
2. V. Gyurjyan et al., CLARA: A Contemporary Approach to Physics Data Processing, Journal of Physics: Conference Series 331, 032013 (2011).
3. C. Timmer et al., cMsg – A General Purpose, Publish-Subscribe, Interprocess Communication Implementation and Framework, Proceedings of the International Conference on Computing in High Energy and Nuclear Physics, Victoria, BC, Canada, 2007.
4. G. Barrand et al., GAUDI: A Software Architecture and Framework for Building HEP Data Processing Applications, Comp. Phys. Commun. 140, 44 (2001).
5. E. Wolin et al., EVIO – Lightweight Object-Oriented I/O Package, Proceedings of the IEEE NSS, Hawaii, US, 2007.