A Multi-agent Approach for Semantic Resource Allocation Jorge Ejarque, Raül Sirvent and Rosa M. Badia Grid Computing and Clusters Group - Barcelona Supercomputing Center (BSC) Artificial Intelligence Research Institute - Spanish National Research Council (IIIA-CSIC) IEEE CloudCom 2010 December 3, 2010 Indianapolis Outline Motivation Previous Work. – – Distributed Approach – Resource Allocation Ontology Centralized Semantic Scheduling Multi-agent distribution Implementation and Evaluation Conclusions and Future Work 2 Motivation SP has to allocate the user’s tasks in their available resources. Resources are heterogeneous and can be supplied by different providers. SP users specify requirements in different SLA terms which must be satisfied. Users and providers use their own terms. There is a need of semantic interoperability Clouds Service Users Grids Service Provider Clusters 3 Motivation SP Infrastructure can change – New resources can be added or removed from the SP infrastructure Resource can fail – Performance Degradation Resource Allocation must be adapted according to customer and provider preferences Agent technologies provide: – Adaptation to infrastructure changes – Distributed resource allocation according to several customer and provider preferences 4 Previous work Centralized approach Stores semantic descriptions of system elements Semantic Metadata Repository 3. Get descriptions Manage the Client Manager customer’s tasks requests Client Manager 0. Resource description registration 2. Schedule Job 4. Update scheduling 1. Get Resource Candidates Create Resource Manager executions environments, execute customers Resource Manager tasks 5. Execute Job Semantic Scheduler Allocates resources on the different customer’s tasks (semantic descriptions) 5 Resource Allocation Ontology Based on Grid Resource Ontology Extension to describe basic resource allocation concepts – Resources properties, Collections, Abstract task (resource requirements, time constraints) and Process (abstract task allocation), Actors, Providers, etc. 6 Semantic Scheduler Rule-based semantic scheduling 7 Decentralized motivation Scalability Reduce single point of failure Customer and providers has different interests and they should be taken into account in the allocation process. – Different rules depending on the customers and providers or resource types 8 Distributed Architecture Distributed approach Resource Allocation Rules Semantic Metadata Repository Resource Allocation Ontology Agent Platform Scheduler Jobs Res. Man ... Scheduler ... Client Manager Res. Agent Container Scheduler Scheduler Jobs Res. Man Resource Provisioning Private Clouds Job Agent Container Res. Agent Container Scheduler Jobs Res. Man ... Scheduler ... Client Manager Job Agent Container Scheduler Scheduler Jobs Res. Man Resource Provisioning ... ... Clusters Grids Public Clouds EC2, ... 9 System Agents Distributed resource allocation – Scheduling functionalities distributed across the system agents – Based on a negotiation between the system agents. (CNP) – Agents platforms can be distributed across multiple hosts Two parts – Client Management (Job Agents) Management of jobs Resource selection – Resource Provisioning (Resource Agents) Management of resources Resource allocation proposals Belief-Desire-Intention agent modeling – Define data, goals and plans for each agent 10 Job Agent Semantic annotation of tasks. Initiate the negotiation and select the best proposal for the customer interests. Adapt the task allocation to the system events. (Rescheduling) React to status changes (Fault tolerance) Belief (Status) Goal Plans Requested Get Resources Annotate task description Negotiate allocation Scheduled Find Better Allocation Negotiate allocation Running Monitor execution Check performance Update requirements Increase resources Suspended Recover Suspended Evaluate work done Update requirements Negotiate allocation Stopped Recover Stopped Negotiate allocation Non Scheduled Recover Non Scheduled Negotiate allocation Cancelled Cancel Job Remove allocation 11 Resource Agent Semantic annotation of resource descriptions Make proposal according to resource interests Create execution environments and execute tasks by means of a resource management interface React to resource events (Failures) Belief Goal Plans Scheduled Jobs Monitor Scheduled Jobs Check scheduled job Perform execution Running Jobs Monitor Running Jobs Check running jobs Notify status changes Status Failed Recover Failure Try local rescheduling Notify affected jobs agents Status Running Register Resource Annotate resource description 12 Distributed Resource Allocation Resource allocation negotiation sequence SELECT DISTINCT ?id ?host WHERE { ?host tech:isProvidedBy ?rm . ?rm biz:actor_name ?id . ?host tech:containsResource ?proc . ?proc rdf:type gro:Processing . ?proc gro:hasResourceProperty ?memory . ?memory rdf:type tech:MemoryCapacity . 1. Select ?memory tech:hasValue ?mem_cap . candidate resources [Evaluate_Deadline: FILTER (?mem_cap >= 1024ˆˆxsd:int) . (?job rdf:type gro:Execute_Task), 2. Send call for resource(?job allocation ?host tech:containsResource ?image . gro:hasActionState tech:requested), to selected resource (?job agents tech:hasDeadline ?deadline), ?image rdf:type tech:Image . [Earliest_Start_Date: ?image tech:containsResource gro:Software_XX } (?job gro:incarnatedToProcess ?incarnation), (?job rdf:type gro:Execute_Task), 3. Evaluate requirements, (?incarnation gro:hasPlannedEnd ?job_end), (?job gro:hasActionState tech:requested), discard imposible allocations lessThan(?deadline, ?job_end), (?job gro:incarnatedToProcess ?incarnationA), (resource full, license terms) -> (?incarnationA gro:hasPlannedStart ?stDateA), and apply other provider's rules drop(3,4)] (?job gro:incarnatedToProcess ?incarnationB), (?incarnationB gro:hasPlannedStart ?stDateB), [Used_Hosts_First: lessEqual(?stDateA, ?stDateB), 4. Send back proposed resource allocations (?job rdf:type gro:Execute_Task), -> (?job gro:hasActionState tech:requested), remove(4,5)] (?job gro:incarnatedToProcess ?incarnationA), 5. Apply customer selection rules (?incarnationA gro:usesResource ?candidateA), [Priority_AtoB: (?candidateA gro:containsResources ?HostA), (?job rdf:type gro:Execute_Task), 6. Accept selected proposal hasValue(?HostA gro:isUsedBy), (?job gro:hasActionState tech:requested), (?job gro:incarnatedToProcess ?incarnationB), (?job gro:incarnatedToProcess ?incarnationA), (?incarnationB gro:usesResource ?candidateB), (?incarnationA gro:usesResource ?candidateA), (?candidateB gro:containsResources ?HostB), (?candidateA gro:containsResources ?resourceA), noValue(?HostB, gro:isUsedBy) (?resourceA tech:isProvidedBy biz:ResourceProviderA), -> (?job gro:incarnatedToProcess ?incarnationB), drop(6,7)] (?incarnationB gro:usesResource ?candidateB), (?candidateB gro:containsResources ?resourceB), (?resourceB tech:isProvidedBy biz:ResourceProviderB) -> 13 remove(6,7)] Implementation System agents: Jadex BDI Engine Semantic Capabilities: Jena 2 Framework – RDF, OWL APIs – Jena Rule Engine (Scheduler Module) – SPARQL queries 14 Evaluation Qualitative evaluation – BDI Agents facilitate coordination of complex tasks (job and resource management) – Resource Allocation Classes mapped to different schemas EC2 – Instances -> Host Class – Instance Storage, Memory, ECU -> Storage, Processing Resource GLUE schema – ComputeElementType -> Host Class – CE properties -> Storage, Processing Resource properties Ganglia schema – HostType-> Host Class – Ganglia metrics -> Resource properties 15 Evaluation Centralized vs Decentralized allocation time 16 Evaluation Centralized vs. Distributed 17 Evaluation Different agent deployment configuration 18 Conclusions Resource allocation framework which combines several technologies – Semantics for unifying data sources – Multi-agents for adapting to a changing environment – Rule scheduling promising approach Negotiation combines customer and providers preferences – Different rule examples Evaluation – Annotations from different schemas – Agents introduce a negotiation overhead – Trade-off between number of resources per agent and agents per machine. 19 Future Work Improve performance reducing overheads Exploit the benefits of rule-based resource allocation – Dynamic scheduling policies with rules Introduce semantics in the resource management interface interoperability – Not only in the resource description 20 Thank you for your attention. Questions? jorge.ejarque@bsc.es