Architecting Fibre Channel HA Solutions
Rick Jooss
richard.jooss@netapp.com

Agenda
CFModes
Single System Image
Multipathing
Host Clustering
Storage System Backend HA
Q&A

NetApp Confidential -- Do Not Distribute

CFMODE – Cluster Failover Mode
What is CFMODE?
  – FCP setting
  – Determines the behavior of FC target ports, particularly during a CFO event
Why is there more than one CFMODE?
  – The original CFMODE (standby) did not work for all host types (HP-UX, AIX)
  – The original CFMODE did not work with the 270C because it has only a single FC port

Available Paths - Standby Mode
[Diagram: one host connected through Switch/Fabric 1 and Switch/Fabric 2 to an HA configuration of Controller 1 and Controller 2, each with target ports 0a, 0b, 0c, and 0d and its own LUNs. Solid blue lines are paths to the LUNs served by Controller 1; dashed purple lines are paths to the LUNs served by Controller 2.]

Path Access (Switch Failure) – Standby Mode
[Diagram: same topology and legend as the previous slide.]
Switch/Fabric 1 will experience a failure.
MP layer works around the failure.

Path Access (CFO event) - Standby Mode
[Diagram: same topology and legend as the previous slide.]
Controller 1 will experience a failure.
Controller 2 takes over all operations.

Path Access (CFO event) - Standby Mode
[Diagram: same topology, with the eight target ports (0a, 0b, 0c, and 0d on each controller) labeled WWN1 through WWN8.]
Solid blue lines are paths to the LUNs served by Controller 1; dashed purple lines are paths to the LUNs served by Controller 2.
Filer Head 1 (Controller 1) will experience a failure.
Controller 2 takes over all operations.
MP layer is not involved in switchover.

Available Paths - Partner Mode
[Diagram: one host connected through Switch/Fabric 1 and Switch/Fabric 2 to an HA configuration of Controller 1 and Controller 2, each with target ports 0a, 0b, 0c, and 0d and its own LUNs. Solid blue lines are paths to the LUNs served by Controller 1; dashed purple lines are paths to the LUNs served by Controller 2.]

Available Paths - Partner Mode – FAS3000 Default Configuration
[Diagram: same topology and legend, with only target ports 0c and 0d on each controller.]

Available Paths - Dual Fabric
[Diagram: same topology and legend, with the target ports on each controller labeled 0c_0 and 0c_2.]

What is the single system image cfmode?
Universal cfmode
  – Works on all HA storage systems
  – Works on all switches
Presents the HA configuration as a single target
All LUNs are visible on all controller ports
All hosts require multipathing software

Available Paths - Single System Image – Single Card
[Diagram: one host connected through Switch/Fabric 1 and Switch/Fabric 2 to an HA configuration of Controller 1 and Controller 2, each with target ports 0c and 0d and its own LUNs. Solid blue lines are paths to the LUNs served by Controller 1; dashed purple lines are paths to the LUNs served by Controller 2.]

Path Access (Switch Failure) - Single System Image – Single Card
[Diagram: same topology and legend as the previous slide.]
Switch/Fabric 1 will experience a failure.
MP layer works around the failure.

Path Access (CFO event) - Single System Image – Single Card
[Diagram: same topology and legend as the previous slide.]
Controller 1 will experience a failure.
Controller 2 takes over all operations.
MP layer works around the failure.

Available Paths - Single System Image – Single Port
[Diagram: same topology and legend, with a single target port, 0d, on each controller.]

Available Paths - Single System Image – Single Port
[Diagram: one host connected to port 0d on each controller, with both links labeled Loop Mode; no switches appear in the extracted text.]
Dashed purple lines are paths to the LUNs served by Controller 2.
Solid blue lines are
paths to the LUNs served by Controller 1.

Why SSI mode?
Works in all configurations
Makes us look more like other SAN vendors
Reduces port burn without using FC Loop
  – Fully redundant config requires only 1 “wire” per controller, instead of 2
Simpler wiring: no a/b port distinctions and no requirement to run the same cables from each controller to the same switch

Management changes
Unified LUN mapping address space across the HA configuration
  – The controller prevents LUN mapping conflicts by checking with the partner controller
If the controller interconnect is down, some operations are disabled by default
  – igroup add, lun map, lun online, igroup set ostype

SSI Roadmap
Introduced in ONTAP 7.1
Refer to the FCP host compatibility matrix for specific host support:
http://now.netapp.com/NOW/knowledge/docs/san/fcp_iscsi_config/index.shtml

Multipathing
Multipathing provides multiple paths from the host to the external storage device
Provides High-Availability
  – Protects against path failures
  – Ensures high availability of applications and data by eliminating single points of failure
Provides Improved Performance
  – Increases potential performance by utilizing multiple paths

Multipathing
[Diagram: one host connected through Switch/Fabric 1 and Switch/Fabric 2 to target ports 0c and 0d on each controller of the HA configuration (Controller 1 and Controller 2); each controller serves its own LUNs.]

A/P (active passive) policy – Single LUN
[Diagram: hosts connected through Switch/Fabric 1 and Switch/Fabric 2 to target ports 0c and 0d on Controller 1 and Controller 2; each controller serves its own LUNs.]

A/P (active passive) policy – No Round Robining
[Diagram: same topology as the previous slide, with the LUNs labeled LUN1 through LUN4.]
A/P (active passive) policy - Round Robining
[Diagram: hosts connected through Switch/Fabric 1 and Switch/Fabric 2 to target ports 0c and 0d on Controller 1 and Controller 2, with the LUNs labeled LUN1 through LUN4.]

A/P (active/passive)
Active/Passive Configuration
  – 1 active path to a single LUN
    • Performance to a LUN is limited by that path's capability (HBA, switch, target port)
  – Possible to round robin multiple LUNs across multiple paths
  – All other paths to the LUN are passive
  – On failover
    • Primary paths are tried first
    • Secondary paths are used if no primary paths are available

A/A (active active) policy (cfmode = standby)
[Diagram: hosts connected through Switch/Fabric 1 and Switch/Fabric 2 to target ports 0a, 0b, 0c, and 0d on Controller 1 and Controller 2; each controller serves its own LUNs.]

A/A (active/active)
Host accesses data from a single LUN across multiple paths simultaneously
  – Typically used for load balancing
    • Round Robin
    • Least Queue Depth
    • Weighted
  – On failure, I/Os are sent down the remaining available paths

A/A/A (asymmetric active active)
[Diagram: one host connected through Switch/Fabric 1 and Switch/Fabric 2 to target ports 0c and 0d on Controller 1 and Controller 2; each controller serves its own LUNs.]

A/A/A (asymmetric active active)
Distinguishes between primary and secondary paths
Does active/active across primary paths only
Only uses secondary paths when no primary paths are available

NetApp’s Multipathing Strategy
Two-pronged strategy
  – Support for “native” solutions
    • What most customers rightly feel best about
  – Support for a host- and storage-independent solution
    • VERITAS
    • Allows a common solution across various server as well as storage variants

Multipathing For Windows
Windows MPIO
  – Uses the Microsoft standard infrastructure
  – A/P Policy
  – Automatically chooses
    primary paths for failover before trying proxy ones
  – In standby, the LUNs are automatically round-robined across all paths

                       MPIO
  Partner/SSI cfmode   A/P
  Standby cfmode       A/P
  Dual Fabric cfmode   A/P

MultiPathing For Solaris

                       DMP 4.0   MPxIO
  Partner/SSI cfmode   A/A/A     A/P
  Standby cfmode       A/A       N/A
  Dual Fabric cfmode   A/P       A/P

MultiPathing For Solaris
VERITAS DMP 4.0
  – NetApp ASL 4.0
  – Supports A/P, A/A, & A/A/A (Active Passive Concurrent)
SUN Native MPxIO
  – Not supported with standby cfmode
  – Supports A/P
  – Can be A/A but requires manual failback
  – Manual configuration required
  – Round Robining of the LUNs possible
  – Sometimes called
    • Traffic Manager
    • Leadville Stack

MultiPathing For Linux
Qlogic
  – A/P Policy
  – Manually configured
  – Round Robining of LUNs is possible
DCM
  – Linux native solution

                       Qlogic   DM
  Partner/SSI cfmode   A/P      A/A/A
  Standby cfmode       A/P      A/A
  Dual Fabric cfmode   A/P      A/P

MultiPathing For AIX

                       DMP 4.0   SANpath   MPIO
  Partner/SSI cfmode   A/A/A     A/A/A     A/A/A
  Standby cfmode       N/A       N/A       N/A
  Dual Fabric cfmode   A/P       A/P       A/P

MultiPathing For AIX
SANpath
  – A/A/A
  – Automatically chooses primary paths for failover before trying proxy ones
  – Special policy for SCSI-2 reservation
  – Required for host clustering HACMP
  – Can only use A/P
VERITAS DMP 4.0
  – Only supports A/A/A
IBM MPIO
  – IBM native solution with NetApp PCM

Multipathing for HP-UX

                       PVLinks   DMP 3.5
  Partner/SSI cfmode   A/P       A/P
  Standby cfmode       N/A       N/A
  Dual Fabric cfmode   A/P       A/P

Multipathing for HP-UX
PVlinks/LVM
  – A/P policy
  – Single active path per LUN, user controlled
  – Ordering for remaining paths for failover
  – ntap_config_paths
    • NetApp script to define path ordering based on filer path types: primary, proxy
    • automatically round
      robin primary paths among all LUNs
  – Supports both FCP and iSCSI paths
VERITAS DMP 3.5
  – A/P Policy

Multipathing for VMware
VMware
  – A/P Policy
  – Manually configured
  – Round Robining of LUNs possible

                       VMware
  Partner/SSI cfmode   A/P
  Standby cfmode       A/P
  Dual Fabric cfmode   A/P

Multipathing for Netware
Novell
  – A/P Policy
  – Manually configured
  – Round Robining of LUNs possible

                       Novell
  Partner/SSI cfmode   A/P
  Standby cfmode       A/P
  Dual Fabric cfmode   A/P

Fibre Channel SAN Host Support

                                  Partner/SSI cfmode   Standby cfmode   Dual Fabric cfmode
  Windows “NTAP DSM”              A/P                  A/P              A/P
  Linux: Qlogic “Failover Mode”   A/P                  A/P              A/P
  VMware Multipathing             A/P                  A/P              A/P
  Solaris “DMP”                   A/A/A                A/A              A/P
  Solaris “MPxIO”                 A/P                  N/A              A/P
  AIX “SANpath”                   A/A/A                N/A              A/P
  HP-UX “PVLinks”                 A/P                  N/A              A/P
  Novell                          A/P                  A/P              A/P

Host Clustering & Storage
LUNs need to be made visible to the hosts simultaneously
Some host clustering solutions require SCSI reservations to avoid split brain
[Diagram: Host 1 and Host 2 connected through Switch/Fabric 1 and Switch/Fabric 2 to target ports 0a, 0b, 0c, and 0d on Controller 1 and Controller 2, with the Controller 1 active shelf(s), the Controller 2 active shelf(s), and the LUNs behind them.]

Host Clustering for Microsoft
Microsoft Cluster
  – SnapDrive is integrated to help configuration
  – WIN2K3 allows a single HBA for both boot device & shared storage
  – Cannot grow a LUN online in a cluster
    • SnapDrive's ability to grow a LUN very quickly minimizes the pain caused by this

Host Clustering for VERITAS
VCS
  – By default does not use I/O fencing to protect against split brain
  – I/O fencing requires SCSI-3 reservations
  – 7.0.3 will have SCSI-3 reservations that are compatible with VERITAS
  – Does not do
    failover on FC links

Host Clustering for HP-UX
ServiceGuard
  – 1 to 3 node clusters using SCSI-2 locks as arbitrator to avoid split brain
  – Does not do failover on dead FC links

Host Clustering for AIX
HACMP
  – Uses SCSI-2 locks as arbitrator to avoid split brain
    • “setsp –b2” to enable locks with SANpath
    • SCSI-2 locks and active/active are mutually exclusive

Fibre Channel SAN Host Support
[Table: columns are OS Vendor, HBA, Multipath, Host Cluster, Volume Mgr, and File System. The OS vendor column and the row alignment were lost in extraction. Surviving cell text, in extraction order: Native; SANpath; HACMP; LVM; JFS/2, Raw; Emulex; MPIO; MSCS; MMC; NTFS; ext3, ext2, Reiser; ext3, ext2, Reiser; QLogic; QLogic; Oracle 9i, 10g RAC; QLogic; QLogic; Oracle 9i, 10g RAC; Emulex; Veritas DMP; Veritas VCS; Veritas VxVM; Veritas VxFS; Native HP; PVLinks; Veritas DMP; MC ServiceGuard; Veritas VCS; LVM; Veritas VxVM; JFS/HFS, Raw; Veritas VxFS; Emulex; QLogic; VMware; MSCS; VirtualCenter (VMotion); VMware; VMFS 2.x, Raw; QLogic; QLogic; Novell Clusters; NSS; Shared Storage.]

Enables Dual Path HA
Protect against cable pulls or breaks
Protect against single HBA failure
Protect against storage controller (e.g. ESH2) hot swap
[Diagram: each protected failure point is marked with an X.]
Key Benefits
Full storage hardware redundancy in HA systems
Prevents cluster failover events due to many storage issues
Complements CFO for improved HA and resiliency
[Diagram: four back-end loops labeled Loop 1, Loop 2, Loop 3, and Loop 4.]

Switched Back-End
Dual Active Paths for HA Environments
  – Reduces the number of HA failovers
  – Improves overall HA performance
  – Data ONTAP tries to balance load across paths
SyncMirror
  – SyncMirror requires 100% disk overhead
  – Proper configuration survives all single failures

Q&A?
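
Appendix sketch: path selection under A/P, A/A, and A/A/A

The three multipathing policies described in the Multipathing slides differ only in which live paths carry I/O for a LUN. The Python below is a minimal illustrative sketch of that rule as the slides state it, not the behavior of any specific multipathing product. The names `Path`, `select_paths`, and `round_robin`, and the path labels in the example, are hypothetical and do not come from the presentation or from any NetApp or host API.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Iterator, List


@dataclass(frozen=True)
class Path:
    """One host-to-LUN path. All field names are hypothetical."""
    name: str        # illustrative label only
    primary: bool    # True: primary path; False: secondary (proxy) path
    up: bool = True  # False once the path has failed


def select_paths(policy: str, paths: List[Path]) -> List[Path]:
    """Return the paths that carry I/O for one LUN under the given policy."""
    live = [p for p in paths if p.up]
    primaries = [p for p in live if p.primary]
    # Primary paths are preferred; secondary paths are used only
    # when no primary path is still available.
    preferred = primaries or live
    if policy == "A/P":
        # Exactly one active path per LUN; every other path is passive.
        return preferred[:1]
    if policy == "A/A":
        # All live paths are used simultaneously.
        return live
    if policy == "A/A/A":
        # Active/active across primary paths only.
        return preferred
    raise ValueError(f"unknown policy: {policy!r}")


def round_robin(paths: List[Path]) -> Iterator[Path]:
    """Endless round-robin iterator over the selected paths."""
    if not paths:
        raise ValueError("no usable paths")
    return cycle(paths)


if __name__ == "__main__":
    example = [
        Path("primary-1", True),
        Path("primary-2", True),
        Path("proxy-1", False),
        Path("proxy-2", False),
    ]
    for policy in ("A/P", "A/A", "A/A/A"):
        print(policy, [p.name for p in select_paths(policy, example)])
```

Marking a path with `up=False` and calling `select_paths` again models a path failure: A/P and A/A/A fall back to the secondary paths only once every primary path is down, while A/A simply continues on whatever paths remain.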