Francisco Hernandez, Purushotham Bangalore, Jeff Gray, and Kevin Reilly
Department of Computer and Information Sciences
University of Alabama at Birmingham
Birmingham, AL, USA

Overview
• Provide an abstract, high-level layer for modeling Grid workflows.
• Automate the specification of Grid workflows.
• Generate Globus-specific code from the graphical models with the help of the Java CoG Kit.
[Figure: architecture diagram. Labels: Workflow Model, Interpreter, RSL Creation, File Transfers, Job Management, Java CoG Kit API, Adapters, Globus Toolkit]

Outline
• Related Work
• Domain-Specific Modeling
• Meta-Model
• Modeling Process
• Interpreter
• Limitations
• Future Work
• Conclusions

Related Work (1)
• The idea of composing applications from reusable components is not new (e.g., Webflow, Unicore, DAGMan, Symphony, Triana).
• Workflows have gained increased attention as a way to compose a flow of tasks in a Grid environment (e.g., GridAnt).

Related Work (2)
• Amin et al. [1] propose a technology- and architecture-independent abstraction layer that provides interoperability across multiple Grid implementations, resulting in an Open Grid Computing Environment (OGCE).
• The concept is comparable to using meta-models that abstract the underlying Grid technologies, but it is realized at a lower level of abstraction.
1. Amin, K., Hategan, M., von Laszewski, G., and Zaluzec, N., “Abstracting the Grid,” Proceedings of the 12th Euromicro Conference on Parallel, Distributed and Network-Based Processing (PDP 2004), La Coruña, Spain, February 11-13, 2004.

Domain-Specific Modeling (1)
• Domain-specific modeling (DSM) is a technology that focuses on higher levels of abstraction in the problem space and avoids low-level details of the solution space, allowing a user to manipulate graphical models of the problem at hand.
• A special type of generator, called a model interpreter, translates the models into executable specifications that are used to synthesize software automatically.
• GME (the Generic Modeling Environment) is a domain-specific modeling environment that can be configured and adapted from meta-level specifications that describe the domain.

Domain-Specific Modeling (2)
• DSM has been useful in automating different kinds of applications in which the environment is dynamic and tightly integrated with the physical environment, including:
– embedded systems [1],
– automotive manufacturing [2],
– complex QoS applications [3].
1. Neema, S., Bapty, T., Gray, J., and Gokhale, A., “Generators for Synthesis of QoS Adaptation in Distributed Real-Time Embedded Systems,” First ACM SIGPLAN/SIGSOFT Conference on Generative Programming and Component Engineering (GPCE ’02), Springer-Verlag LNCS 2487, Pittsburgh, PA, October 6-8, 2002, pp. 236-251.
2. Long, E., Misra, A., and Sztipanovits, J., “Increasing Productivity at Saturn,” IEEE Computer, August 1998, pp. 35-43.
3. Bapty, T., Neema, S., and Gray, J., “Model-Integrated Computing for Composition of Complex QoS Applications Using the Generic Modeling Environment (GME),” OMG Workshop on Real-Time and Embedded Distributed Object Computing, Washington, DC, July 15-18, 2002.

Domain-Specific Modeling (3)
[Figure: modeling layers. Labels: General Meta-Meta-Model, Domain Meta-Model, Domain Models (Specific Instance), Interpreter 1, Interpreter 2, Application; arrows labeled Specify and Generate]

Meta-Model (1)
• Workflows describe the execution of complex applications built from individual application components.
• The basis of the meta-model is the way in which a user specifies a sequence of tasks in an application’s workflow.
[Figure: example task sequence: Upload a File, Execute a Job, Download a File]

Meta-Model (2)
[Figure: meta-model concepts. Labels: User Credential, Job Processing, File Transfer, Resource, Workflow, Task]
• The meta-model is based on experimental knowledge of the domain.
• Four aspects are needed to define the meta-model:
– Resources
– Transfer end-points
– Job specifications
– Workflows

Meta-Model (3)
[Figure: meta-model diagrams. Labels: Resources, Workflows]

Modeling Process (1)
hernandf authorizes the use of the remote hosts (cherokeeData and cherokeeCompute).
hernandf specifies the location of the user’s security credentials.
The location of the data file must be specified for each end-point in a file transfer.

Modeling Process (2)
The user initiates the execution of the application by first uploading the raw input file.
The output file is finally downloaded to the local host.
The generator creates an RSL string from the attributes specified by the user, in this case for the job HMM.

Interpreter
• The interpreter parses the model and generates the control code that manages the application execution.
• GME provides an API for traversing the internal representation of the models. A model interpreter uses this API to translate the models into an application that manages the execution of the workflow.
[Figure: interpreter pipeline. Labels: Workflow Model, GME API, Model in GME Domain (Models, Atoms, Connections), Translator, Model in Globus Domain (Jobs, File Transfers, etc.), Code Generation, API, Grid Application]

Example of generated code
    // create the RSL string
    GlobusRSL hmmRSL = new GlobusRSL();
    hmmRSL.setArg("HMM inHMMFile.txt outHMMFile.txt");
    hmmRSL.setEnvironmentVariables(
        "(INPUT_DIR=/lhome/hernandf) (OUTPUT_DIR=/lhome/hernandf)");
    hmmRSL.setStdOut("/lhome/hernandf/sttOutHMM.txt");
    hmmRSL.setNumProc(2);
    hmmRSL.setDir("/usr/bin");
    hmmRSL.setExec("java");

Limitations
• Work on the modeling environment is in its initial phase. Currently, the environment can handle only a limited set of sequential tasks.
• Scalability problems arise because specific code is generated for each workflow task.
• Not all Globus capabilities are currently supported by the meta-model.

Future Work (1)
• Address the scalability problem by generating a reusable workflow engine and the appropriate configurations from the graphical models.
• Modify the meta-model to support capabilities such as:
– Hierarchical workflows
– Task parallelism
– Checkpointing and error recovery
– Querying Grid information services

Future Work (2)
• Generate different output specifications:
– Grid Services
– GridAnt
– PyGlobus
– New version of the Java CoG Kit

Conclusions (1)
• The benefits of using domain-specific modeling techniques for creating Grid workflows are:
– Domain modeling removes the accidental complexities of creating workflows in a Grid by focusing on higher levels of abstraction in the problem space rather than the solution space.
– Modeling tools and their interpreters make it faster to change workflow details: it is easier to manipulate and change domain models than the associated code.
– Model-driven techniques can generate multiple artifacts from the same model, so different output representations can be produced from the same domain knowledge.

Conclusions (2)
• Using these techniques, a user manipulates graphical models that represent the different components of the Globus Toolkit. From these models the user generates the corresponding Java code that manages the execution of the workflow.
• This work is an attempt to abstract the Grid environment into a high-level layer such that the essence is not bound to a specific Grid environment.

Thank you
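
Supplementary sketch: RSL creation
The Modeling Process and generated-code slides state that the generator creates an RSL string from the job attributes entered in the model. The following is a minimal sketch of that step, not the interpreter's actual output and not the GlobusRSL class shown earlier. The class name RslSketch and the methods toRsl and hmmJob are hypothetical, and the attribute names (executable, directory, arguments, count, stdout) are assumed to follow common GRAM RSL naming. Environment variables are omitted for brevity.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: assembles an RSL-style job description from model attributes. */
public class RslSketch {

    /** Renders each attribute as "(name=value)" and prefixes the conjunction operator "&". */
    public static String toRsl(Map<String, String> attributes) {
        StringBuilder rsl = new StringBuilder("&");
        for (Map.Entry<String, String> attribute : attributes.entrySet()) {
            rsl.append('(')
               .append(attribute.getKey())
               .append('=')
               .append(attribute.getValue())
               .append(')');
        }
        return rsl.toString();
    }

    /** Attribute values taken from the HMM job on the generated-code slide. */
    public static Map<String, String> hmmJob() {
        // LinkedHashMap keeps insertion order, so the output is deterministic.
        Map<String, String> job = new LinkedHashMap<>();
        job.put("executable", "java");   // assumed RSL attribute names
        job.put("directory", "/usr/bin");
        job.put("arguments", "HMM inHMMFile.txt outHMMFile.txt");
        job.put("count", "2");
        job.put("stdout", "/lhome/hernandf/sttOutHMM.txt");
        return job;
    }

    public static void main(String[] args) {
        System.out.println(toRsl(hmmJob()));
    }
}
```

Keeping the attributes in a map rather than in generated setter calls hints at the reusable-engine direction listed under Future Work: the model would supply only the configuration, and a single shared routine would turn it into a job description.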