Towards Autonomic Grid Data Management with Virtualized Distributed File Systems

Ming Zhao, Jing Xu, Renato Figueiredo
Advanced Computing and Information Systems (ACIS) Laboratory
Electrical and Computer Engineering, University of Florida
{ming, jxu, renato}@acis.ufl.edu

Motivation
• Grid data provisioning
  - Wide-area latency/bandwidth, dynamism of resources, ...
  - How to provide data with application-tailored optimizations?
[Figure: jobs on compute servers access Grid data on remote data servers with different needs: highly reliable access to simulation data, aggressive prefetching of genome sequences, sparse access to virtual machine files]

Motivation
• Grid data management
  - Dynamic data access, resource sharing, changing resource availability, ...
  - How to manage data provisioning in dynamically changing environments?
[Figure: same data-provisioning scenarios as above]

Overview
• Goal
  - Efficient data management in heterogeneous, dynamic, and large-scale Grid environments
• Challenges
  - Application-tailored data provisioning
  - Data management in dynamically changing environments
• Contribution: autonomic Grid data management
  - Virtualized Grid-wide file systems for user-transparent Grid data access with application-tailored enhancements
  - Autonomic data management services for self-managing, goal-driven control over Grid file system sessions

Outline
• Background: Grid virtual file system and data management
• Architecture: autonomic data management services
• Evaluation: experiments and analysis
• Summary: related work, conclusion, and future work

Grid Virtual File Systems
• Grid Virtual File System (GVFS)
  - Distributed file system virtualization through user-level NFS proxies
  - Enhancements for Grid environments (disk caching, secure tunneling)
  - Dynamic, independent, application-tailored GVFS sessions
[Figure: GVFS sessions I and II each connect a client (C1, C2) to a server (S1, S2) through user-level proxies with disk caches, secure-tunneled across the Grid]
[Cluster Computing, Vol. 9/1, 2006]

Data Management Services
• WSRF-based management services for GVFS sessions
  - Data Scheduler Service (DSS): scheduling and customization of GVFS sessions
  - Data Replication Service (DRS): management of application data sets and their replicas
  - Server-side File System Service (FSS): management of server-side GVFS proxies
  - Client-side File System Service (FSS): management of client-side mount points, GVFS proxies, and disk caches
[Figure: the DSS and DRS coordinate the client- and server-side FSSs, which manage the proxies of GVFS sessions I and II]
[HPDC'2005]

Autonomic Services
• Each management service implements a monitor-analyze-plan-execute control loop [Russell, IBM Systems Journal, 2003]; a minimal sketch of such a loop follows below
  - Autonomic Data Scheduler Service: monitors performance and storage; plans session configurations; executes by reconfiguring sessions
  - Autonomic Data Replication Service: monitors reliability and storage; plans replication and load balancing; executes by replicating and reassigning data sets
  - Autonomic server-side and client-side File System Services: monitor performance and response; plan the primary-server choice; execute by redirecting sessions
[Figure: the autonomic DSS, DRS, and FSSs, each drawn with its monitor-analyze-plan-execute loop, around the proxies of GVFS sessions I and II]
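The deck describes each service's control loop only at this block-diagram level. The following is a minimal sketch of one monitor-analyze-plan-execute cycle, assuming simple threshold goals and callable sensors/actuators; the class, method, and parameter names are hypothetical stand-ins for the WSRF operations that read from and reconfigure GVFS proxies, not the actual service implementation.

```python
# A minimal MAPE-loop sketch; all names are illustrative assumptions.
from typing import Callable, Dict, List

class AutonomicService:
    def __init__(self,
                 sensors: Dict[str, Callable[[], float]],
                 goals: Dict[str, float],
                 actuators: Dict[str, Callable[[], None]]):
        self.sensors = sensors      # monitor: metric name -> reader
        self.goals = goals          # analyze: metric name -> threshold
        self.actuators = actuators  # execute: metric name -> corrective action

    def monitor(self) -> Dict[str, float]:
        # Gather metrics, e.g. response times reported by an FSS.
        return {name: read() for name, read in self.sensors.items()}

    def analyze(self, metrics: Dict[str, float]) -> List[str]:
        # A metric violates its goal when it exceeds its threshold.
        return [n for n, v in metrics.items() if v > self.goals[n]]

    def plan(self, violations: List[str]) -> List[Callable[[], None]]:
        # One corrective action per violated goal, e.g. redirect a session.
        return [self.actuators[n] for n in violations]

    def execute(self, actions: List[Callable[[], None]]) -> None:
        for act in actions:
            act()

    def run_once(self) -> None:
        self.execute(self.plan(self.analyze(self.monitor())))

# Example: redirect when the observed response time exceeds 0.5 s.
svc = AutonomicService(
    sensors={"response_time": lambda: 0.7},
    goals={"response_time": 0.5},
    actuators={"response_time": lambda: print("redirecting session")},
)
svc.run_once()  # prints "redirecting session"
```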
Autonomic Data Scheduler Service
• Manages concurrent GVFS sessions on shared resources
  - Considers both application performance and resource utilization policy
  - Session i's utility: U_i = Performance_i × Priority_i
  - Configures the sessions to maximize the total utility Σ_i U_i
• Example: disk cache management (see the allocation sketch below)
  - The cache hit rate is important in wide-area environments
  - Allocates client-side storage among the sessions' disk caches
  - Monitors storage usage via the client-side FSS
  - Configures the sessions' caches to maximize the global utility
  - Applies the configurations via the client-side FSS
[Figure: the autonomic DSS monitors performance and storage, plans configurations, and executes reconfiguration of the sessions' caches through the client-side FSS]
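To make the utility model concrete, here is a minimal sketch of dividing a fixed client-side storage budget among sessions' caches to maximize Σ_i U_i, assuming a predicted cache hit rate stands in for Performance_i (miss-rate prediction in the spirit of [8] is one way to obtain such a model). The greedy marginal-utility strategy, the toy hit-rate model, and all names are illustrative assumptions, not the DSS's actual algorithm.

```python
# Hypothetical sketch of utility-driven cache allocation for the DSS.
from typing import Callable, List

def allocate_cache(total_mb: int,
                   priorities: List[float],
                   hit_rate: Callable[[int, int], float],
                   step_mb: int = 64) -> List[int]:
    """Greedily grant each storage increment to the session whose utility
    U_i = Priority_i * hit_rate_i gains the most from it."""
    alloc = [0] * len(priorities)

    def utility(i: int, size_mb: int) -> float:
        return priorities[i] * hit_rate(i, size_mb)

    for _ in range(total_mb // step_mb):
        gains = [utility(i, alloc[i] + step_mb) - utility(i, alloc[i])
                 for i in range(len(priorities))]
        best = max(range(len(priorities)), key=lambda i: gains[i])
        alloc[best] += step_mb
    return alloc

# Toy model: hit rate grows linearly until the working set fits in cache.
working_set_mb = [2048, 512]

def model(i: int, size_mb: int) -> float:
    return min(1.0, size_mb / working_set_mb[i])

print(allocate_cache(4096, priorities=[2.0, 1.0], hit_rate=model))
# -> [3584, 512]: the low-priority session still receives enough cache
#    for its small working set, because its marginal utility is higher.
```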
Autonomic Data Replication Service
• Replication degree and placement (see the placement sketch below)
  - Increases reliability against server-side failures (crashes, network partitioning): keeps each session's data set d at Reliability_d > R_min
  - Reduces replication overhead: keeps the cost of creating replicas for data set d at Cost_d < C_max
  - Replaces replicas based on utility: U_d = Value_d × Reliability_d, maximizing Σ_d U_d
• Replica regeneration
  - Monitors server status via the server-side FSSs
  - Reconfigures a session's replicas after a failure
  - Generates new replicas via the server-side FSSs
[Figure: the autonomic DRS monitors reliability and storage, plans replication, and executes it by creating replicas through the server-side FSSs]

Autonomic File System Service
• Fail-over against server failures
  - Minimizes the impact of failures on the application
  - Non-interrupting session redirection:
    - Monitors server response time
    - Detects a failure when a request times out
    - Redirects the session to a backup server
    - Regenerates replicas via the DSS/DRS
• Primary server selection (see the forecasting sketch below)
  - Maximizes performance in a dynamically changing environment
  - Monitors each connection with an effective, low-overhead mechanism: small random writes to a hidden file on the mounted partition
  - Predicts performance with a simple but effective forecasting algorithm
  - Chooses the best connection and switches to it transparently
[Figure: the autonomic client-side FSS monitors performance and response, plans the primary-server choice, and executes redirection between servers S1 and S2]
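For the Data Replication Service above, here is a minimal sketch of choosing a replication degree and placement under the two constraints Reliability_d > R_min and Cost_d < C_max. It assumes independent per-server failure probabilities, so a replica set's reliability is one minus the product of its servers' failure probabilities; the cheapest-first greedy, the Server fields, and all names are illustrative assumptions, and the utility-based replacement across data sets is not modeled here.

```python
# Hypothetical sketch of the DRS's replication degree/placement decision.
from dataclasses import dataclass
from typing import List

@dataclass
class Server:
    name: str
    fail_prob: float  # assumed independent probability of server failure
    copy_cost: float  # cost of creating one replica on this server

def choose_replicas(servers: List[Server],
                    r_min: float, c_max: float) -> List[Server]:
    """Add the cheapest affordable replicas until the data set's
    reliability 1 - prod(fail_prob) exceeds R_min, keeping cost < C_max."""
    chosen: List[Server] = []
    cost, reliability = 0.0, 0.0
    for s in sorted(servers, key=lambda s: s.copy_cost):
        if reliability > r_min:
            break  # Reliability_d > R_min satisfied
        if cost + s.copy_cost >= c_max:
            continue  # would break the Cost_d < C_max budget
        chosen.append(s)
        cost += s.copy_cost
        fail_all = 1.0
        for c in chosen:
            fail_all *= c.fail_prob
        reliability = 1.0 - fail_all
    return chosen

pool = [Server("S1", 0.05, 1.0), Server("S2", 0.10, 0.5),
        Server("S3", 0.20, 0.2)]
print([s.name for s in choose_replicas(pool, r_min=0.99, c_max=2.0)])
# -> ['S3', 'S2', 'S1'] (reliability 0.999, cost 1.7)
```

Replica regeneration after a failure would amount to rerunning such a selection over the surviving servers, as the DRS does when it reconfigures a session's replicas.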
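For the primary-server selection above: the deck specifies probing each connection with small random writes to a hidden file on the mounted partition, and a "simple effective forecasting algorithm" that it does not name. The sketch below assumes an exponentially weighted moving average as the forecaster; the probe-file name, paths, and function names are illustrative, not GVFS's actual mechanism.

```python
# Hypothetical sketch of connection probing and primary-server selection.
import os
import random
import time
from typing import Dict, List

def probe(mount_point: str, size: int = 512) -> float:
    """Time one small random write to a hidden file on the mounted
    partition; over NFS this exercises the full client-server path."""
    path = os.path.join(mount_point, ".gvfs_probe")
    mode = "r+b" if os.path.exists(path) else "wb"
    start = time.monotonic()
    with open(path, mode) as f:
        f.seek(random.randrange(0, 64 * 1024))  # random offset in the file
        f.write(os.urandom(size))
        f.flush()
        os.fsync(f.fileno())  # force the write through to the server
    return time.monotonic() - start

def pick_primary(mounts: List[str],
                 forecast: Dict[str, float],
                 alpha: float = 0.3) -> str:
    """Update each connection's EWMA forecast with a fresh probe and
    return the mount point predicted to respond fastest."""
    for m in mounts:
        sample = probe(m)
        forecast[m] = alpha * sample + (1 - alpha) * forecast.get(m, sample)
    return min(mounts, key=lambda m: forecast[m])
```

In a GVFS session the mount points would be the NFS partitions exported through the candidate servers' proxies; switching the session to the selected server transparently is the client-side proxy's job.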
Experimental Setup
• File system clients and servers: VMware-based virtual machines
• Wide-area networks: NIST Net emulated links
• I/O component of Grid applications: the IOzone file system benchmark
• Grid data management: user-level NFS v3 based GVFS proxies; WSRF::Lite-based management services

Autonomic Session Redirection
• Scenario (I)
  - Data transfer under network fluctuation, e.g. as caused by parallel TCP transfers
• Setup
  - Client connected to two servers over independently emulated WAN links
  - Randomly applied latencies with values from a real wide-area measurement
[Figure: network RTT (ms) versus the number of parallel TCP connections; Plot: response time (s) over an 810 s run for Server 1, Server 2, and autonomic session redirection]
• Autonomic session redirection achieves the best performance by adapting to the changing network conditions

Autonomic Session Redirection
• Scenario (II)
  - Data transfer under server load variation
• Setup
  - Randomly generated background load applied to the data servers
[Plot: response time (s) over a 1260 s run for Server 1, Server 2, and autonomic session redirection]
• Autonomic session redirection achieves the best performance by adapting to the changing server load

Autonomic Data Replication
• Scenario
  - Data transfer and replication under dynamic server failures
• Setup
  - Replication degree: 2 (1 primary + 1 backup)
  - Failures randomly injected on the servers
  - Consecutive executions of the benchmark
• Case I: independent sessions (different inputs)
[Timeline: across four consecutive runs, each injected primary- or backup-server failure is followed by transparent redirection to the backup and creation of a new replica]
• Autonomic data replication provides transparent error detection and recovery

Autonomic Data Replication
• Same scenario and setup as Case I
• Case II: dependent sessions (same input, cache reuse)
[Timeline: ten consecutive runs under injected primary- and backup-server failures; redirected runs complete quickly thanks to cached data]
• The client-side disk cache further helps to minimize the impact of failures

Related Work
• Data management in Grid environments
  - GridFTP, GASS, Condor, Legion
• Middleware control over Grid data transfers
  - Batch-Aware Distributed File System [NSDI'04]
  - Self-optimizing, fault-tolerant bulk data transfer framework based on Condor and Stork [ISPDC'03]
• Replication and storage management
  - Wide-area data replication based on the Globus RFT [Grid'2005]
  - IBM autonomic storage manager

Summary
• Problem
  - Efficient data management in heterogeneous, dynamic, and large-scale Grid environments
• Solution
  - An autonomic data management system based on GVFS and self-managing, goal-driven services
• Future work
  - Autonomic session checkpointing and migration
  - Decentralized coordination of per-domain DSS/DRS

References
[1] M. Zhao, J. Zhang, and R. Figueiredo, "Distributed File System Virtualization Techniques Supporting On-Demand Virtual Machine Environments for Grid Computing", Cluster Computing, 9(1), 2006.
[2] M. Zhao, V. Chadha, and R. Figueiredo, "Supporting Application-Tailored Grid File System Sessions with WSRF-Based Services", HPDC-2005.
[3] J. Xu, S. Adabala, and J. A. B. Fortes, "Towards Autonomic Virtual Applications in the In-VIGO System", ICAC-2005.
[4] L. W. Russell et al., "Clockwork: A New Movement in Autonomic Systems", IBM Systems Journal, 42(1), 2003.
[5] J. Bent et al., "Explicit Control in a Batch-Aware Distributed File System", NSDI-2004.
[6] T. Kosar et al., "A Framework for Self-optimizing, Fault-tolerant, High Performance Bulk Data Transfers in a Heterogeneous Grid Environment", ISPDC-2003.
[7] A. Chervenak et al., "Wide Area Data Replication for Scientific Collaborations", Grid-2005.
[8] Y. Zhong, S. G. Dropsho, and C. Ding, "Miss Rate Prediction across All Program Inputs", PACT-2003.

WSRF::Lite, an implementation of the Web Services Resource Framework: http://www.sve.man.ac.uk/Research/AtoZ/ILCT
GVFS, a virtualized distributed file system for Grid environments: http://www.acis.ufl.edu/~ming/gvfs
In-VIGO, virtualization middleware for computational Grids: http://www.acis.ufl.edu/invigo

Acknowledgments
• In-VIGO team: http://invigo.acis.ufl.edu
• NSF Middleware Initiative
• NSF Research Resources
• IBM Shared University Research
• Intel ISTG R&D Council

Questions?