Disk Storage: (E)CKD vs SAN
Frank McDaid, zExperts Ltd
frank.mcdaid@z-experts.co.uk
A presentation for SMUG, June 2012
© Copyright zExperts Ltd 2012

Introduction
• Up until now, the only operating system running on most z platforms has been z/OS (MVS).
  • z/OS can only use CKD disks.
• With the current strategy of running z/VM hosting Linux on the z platform, there is now the option to use either CKD DASD or SAN SCSI.
• The aim of this presentation is to compare the pros and cons of CKD DASD and SAN-attached disks in order to make an informed decision on the most suitable strategy going forward.

IBM disks, a brief history
• IBM designed the disk data architecture called Count Key Data (CKD) back in the 1960s
  • A very different disk format from the Fixed Block Architecture (FBA) that is more familiar to people working on UNIX, Linux, and MS Windows
• Modern mainframe disks from IBM, HDS, and others are all based on concepts introduced in StorageTek’s Iceberg project.
• The Iceberg project populated a cabinet full of FBA SCSI disks and put a controller in front of them that emulates CKD to the mainframe.
  • Up to this point, a CKD DASD was just one (quite large) disk platter.

StorageTek’s Iceberg
• A revolutionary disk for the mainframe market
  • Four years in development
  • Project cost $145 million
  • Commercially available from 1992 onwards
  • Not commercially successful until IBM agreed to resell it as the IBM RAMAC Virtual Array (RVA)
• First mainframe disk to use RAID
  • Originally Redundant Arrays of Inexpensive Disks
  • Renamed to Redundant Arrays of Independent Disks
• Iceberg linked together a series of cheap disks
  • By spreading data over several smaller disks and adding parity data, Iceberg protected the information it stored from loss and corruption.
• It also allowed dynamic rebuild to recover from a disk failure
  • undetectable by the end user of that data
  • failed components could be “hot swapped” and repopulated with no downtime

IBM’s FBA Devices
• IBM had introduced real FBA devices for the mainframe much earlier than 1992
  • the 3310 and the 3370
• Greatly disliked by the system programming community, who were used to CKD
  • mostly because of the granularity of addressing all the blocks on the disk
• IBM responded to the mainframe systems programmers’ moans and groans about FBA by …
  • … discontinuing FBA format disks for the mainframe
• FBA was the future, but IBM’s customers (primarily the MVS or z/OS community) wanted CKD

CKD devices today
• CKD disks are not sold nowadays
  • just FBA devices with controllers that do CKD emulation
• CKD “looking” disks are 'created' by smart storage controllers
  • essentially virtualised by microcode out of FBA disk arrays
• The Iceberg project led the way on that
  • it changed the way mainframe storage has been designed ever since
• Everything since the 3390 uses smart storage controllers that create "virtual" CKD devices out of real FBA ones
• This emulation does have an effect on price
  • mainframe storage is more expensive than most of the straight FBA/SAN/NAS type solutions available today
• For the most part, z/VM and z/OS (MVS) don't know they are not talking to 3390s.
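
To illustrate what building "virtual" CKD out of real FBA involves, here is a minimal Python sketch of locating an emulated 3390 track inside a fixed-block backing store. The 3390 geometry used (15 tracks per cylinder, 56,664 bytes per track) is standard. Everything else is an assumption made for illustration: the 512-byte backing block, the simple cylinder-major layout, and the names `track_index` and `first_block`. Real controllers use far more sophisticated placement (RAID striping, compression, log-structured arrays).

```python
import math

TRACKS_PER_CYLINDER = 15   # 3390 geometry
BYTES_PER_TRACK = 56_664   # 3390 track capacity
BLOCK_SIZE = 512           # hypothetical backing-store block size

# Whole blocks reserved per emulated track (assumed layout, rounded up).
BLOCKS_PER_TRACK = math.ceil(BYTES_PER_TRACK / BLOCK_SIZE)


def track_index(cylinder: int, head: int) -> int:
    """Convert a cylinder/head address into a linear track number."""
    if cylinder < 0 or not 0 <= head < TRACKS_PER_CYLINDER:
        raise ValueError(f"invalid track address: cyl={cylinder} head={head}")
    return cylinder * TRACKS_PER_CYLINDER + head


def first_block(cylinder: int, head: int) -> int:
    """Return the first backing-store block of the emulated track."""
    return track_index(cylinder, head) * BLOCKS_PER_TRACK
```

Under these assumptions each emulated track occupies a fixed run of blocks, so the host keeps issuing cylinder/head addresses while the backing store only ever sees block numbers.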

Disk Architecture - CKD
• Each physical disk record consists of a count field, an optional key field, and a ("user") data field
  • error correction/detection information is appended to each field
  • the fields are separated by gaps
• Recorded space is larger than required because of the gap separation
• The principle behind the architecture is that, since data record lengths can vary, every record has an associated count field which indicates the size of the key, if used, and the size of the data
• The count field holds the identification of the physical location in cylinder-head-record format, the length of the key, and the length of the data
• The key may be omitted or consist of a string of characters. Most often the key is omitted

Disk Architecture - ECKD
• ECKD refers to the set of Channel Command Words (CCWs) used with cached controllers for IBM DASD
• The new commands were introduced on the cached versions of the IBM 3880 controller and were extended on the 3990
• The ECKD channel commands provide improved performance for all protocols
  • the obsolete Bus & Tag interface
  • the ESCON (Enterprise Systems Connection) interface
  • the FICON (Fibre Connection) protocol
• ECKD allows the programmer to give the control unit information on intent, and to perform in a single channel program operations that would require multiple channel programs with CKD

Parallel Access Volumes (PAV)
• On the subject of improved performance, it’s worth mentioning Parallel Access Volumes (PAVs)
• PAV improves performance on the z platform by allowing a system to access disk volumes in parallel
• Accomplished by defining a ‘base address’ representing the real disk volume and aliases associated with that base address
  • aliases can be either static or dynamic
• With PAV, a real DASD volume is accessed through a base subchannel and one or more alias subchannels
• HyperPAV support complements the existing basic PAV support, potentially reducing the number of alias-device addresses needed for parallel I/O operations, since HyperPAV aliases are dynamically bound to a base device for each I/O operation instead of being bound statically like basic PAVs.

FBA
• In IBM's implementation, Fixed Block Architecture (FBA) describes a disk drive that stores data in blocks of fixed size
• Blocks are addressed by block number relative to the beginning of the file
• Various block sizes (512, 1024, 2048 and 4096 bytes) were featured on IBM’s FBA devices
• The term FBA has not often been used since the introduction of SCSI devices, which use a scheme called LBA (Logical Block Addressing)
• LBA is a linear addressing scheme; blocks are located by an integer index, with the first block being LBA 0, the second LBA 1, and so on
• In SANs where logical drives (LUNs) are composed via LUN virtualization and aggregation, the LBA addressing of individual disks is translated by a software layer to provide uniform LBA addressing for the entire storage device

z/VM & Linux Disks
• z/VM was doing storage virtualization long before storage virtualization became popular
  • using 'minidisks'
• A minidisk is a real CKD device, such as a 3390, subdivided into many smaller virtual disks
  • not the same thing as disk slices; nothing is written to a partition table on the disk
• Creating a minidisk means updating the z/VM directory
  • the place that defines all the attributes for all the virtual machines that will be running on that computer
• There you enter the start and end cylinder of each minidisk
• You have to be sure that your new minidisk does not overlay any other minidisks
• When one is working with numbers like 1113 (the cylinder count of a 3390 Model 1), the maths is pretty easy.
  • Minidisk 1 starts at cylinder 2 and ends at cylinder 10. Minidisk 2 starts at cylinder 11 and ends...
  • the trick is not to end up defining overlapping minidisks

FCP - SAN attached
• SAN disks can be attached to System z via FCP
• z/OS cannot use this SAN storage as it’s restricted to (E)CKD format disk
• z/VM can see the disks, but it cannot fully utilise them unless an emulation layer (EDEV) is configured
• Linux guests running under z/VM can fully utilise these disks as if they were running on any other platform
• It is possible, and not unusual, to have a hybrid storage architecture with a combination of CKD disks and SAN disks
• If z/VM and/or Linux do use SAN disks (e.g. for low cost), then they are NOT utilising System z’s huge I/O subsystem
  • With 256+ I/O channels, multipath devices, etc., the z Platform is a powerful box for doing I/O
• It can be argued that this is not really an issue, as benchmarking of CKD vs. SAN has shown that, with a correctly configured FCP and multipath architecture under Linux, the SAN devices can out-perform FICON-attached CKD devices.

NAS
• This “low cost” storage is accessed by UNIX, Linux or Windows on non-z platforms using TCP/IP based protocols: NFS, iSCSI or CIFS.
• The z Platform with an OSA (Open Systems Adapter) installed provides Ethernet connections.
• The OSA sits where a channel adapter used to, and provides a number of industry standard Ethernet connections, so that the z Platform is no longer isolated from the "Network is the Computer" world.
• With z/VM and Linux on the z platform, native TCP/IP requirements are met, and with NFS this cheaper storage can be utilised.
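
Returning to the minidisk slide: the "trick" of not defining overlapping minidisks is easy to automate. The following is a minimal Python sketch, assuming each minidisk on a volume is described by an inclusive start and end cylinder as on that slide. The `Minidisk` record, the `find_overlaps` function, and the user IDs and device numbers in the usage note are hypothetical; this does not parse a real z/VM directory.

```python
from typing import List, NamedTuple, Tuple


class Minidisk(NamedTuple):
    owner: str   # hypothetical virtual machine user ID
    vdev: str    # hypothetical virtual device number
    start: int   # first cylinder (inclusive)
    end: int     # last cylinder (inclusive)


def find_overlaps(minidisks: List[Minidisk]) -> List[Tuple[Minidisk, Minidisk]]:
    """Return every pair of minidisks on one volume whose extents overlap."""
    ordered = sorted(minidisks, key=lambda m: (m.start, m.end))
    overlaps = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            # The list is sorted by start cylinder, so once a later extent
            # starts beyond the end of `first`, no further extent can overlap.
            if second.start > first.end:
                break
            overlaps.append((first, second))
    return overlaps
```

For example, with minidisks covering cylinders 2-10 and 11-20 the function returns an empty list, while adding a third minidisk covering cylinders 18-30 reports a clash with the 11-20 extent. A check like this could be run before a directory change is put online.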

FICON/ECKD Characteristics
• 1:1 mapping host subchannel:DASD
• Serialization of I/Os per subchannel
• I/O request queue in Linux
• Disk blocks are 4KB
• High availability by FICON path groups
• Load balancing by FICON path groups
• Parallel Access Volumes

FCP/SCSI Characteristics
• Several I/Os can be issued against a LUN immediately
• Queuing in the FICON Express card and/or in the storage server
• Additional I/O request queue in Linux
• Disk blocks are 512 bytes
• High availability by Linux multipathing, type failover
• Load balancing by Linux multipathing, type multibus

(E)CKD vs SAN under z/VM / zLinux
• z/VM and Linux on System z can use either architecture, CKD or FCP/SAN
• This next section lists the advantages and disadvantages of each architecture when configuring z/VM and Linux.

Pros of (E)CKD over SAN (1/2)
• Easy manageability
  • using the existing traditional z/VM tools (backup, cloning, FlashCopy, etc.) or z/OS tools such as DFDSS
  • ECKD disk storage types are well supported under z/VM and are well integrated into the tooling that is supplied with z/VM
• Multipathing
  • Multiple connections between the storage unit and the host (multipathing) are supported without any problems or concerns, and are particularly effective for FICON and FICON Express systems

Pros of (E)CKD over SAN (2/2)
• Unified DR
  • With z/VM sharing the z Platform with z/OS, putting Linux systems on disk technology that z/OS understands means that both can be part of a GDPS DR configuration
• Hardware reuse
  • The DASD is usually already on the floor
• Accessibility of storage for multiple IBM operating systems
  • z/OS and z/VM understand CKD.
  • If demand for disk is greater or lesser in a particular environment, the CKD disk can be assigned wherever it is needed without worries about compatibility
• Channel-attached infrastructure
  • To back up z/VM using native tools, there would have to be a tape drive attached to z/VM.
  • Sharing the same DASD with z/OS means that z/OS can perform backups of z/VM DASD, removing the need to set up a z/VM tape infrastructure.
• Excellent performance instrumentation and reporting.

Cons of (E)CKD compared with SAN
• Limited size
  • CKD volumes are much smaller than most SAN volumes
  • LVM and MD RAID technologies allow creating larger logical volumes
• Extra expense for the CKD interfaces on storage servers
  • On most CKD storage devices, the interfaces needed to plug in to FICON cost anywhere between 400% and 500% of the cost of an FCP interface.
  • This skews the cost-per-megabyte for CKD disk dramatically
• "It's different."
  • If you're trying to convince people to move applications from other environments where they can request enormous LUNs and not have to worry about LVM or MD, it's one more thing to have to convince them to do, and the argument generates a fair amount of unnecessary resistance.
• There is limited z/VM and Linux support for DS8000 EAV volumes.

Pros of SAN over (E)CKD (1/2)
• Very large volumes without LVM
  • SAN disks can be of pretty much arbitrary size.
  • It's not unusual to have single volumes reaching 500GB each in the SAN environment.
• Lower cost infrastructure
  • The necessary mainframe adapters are the same price as FICON adapters (they are the same adapter with different microcode), but all the other interfaces that these adapters connect to (SAN switches, interfaces to the storage devices, etc.) are identical to the ones used for open systems.
  • These are usually much cheaper than the CKD interfaces, which often cost four to five times as much
• Volume format compatibility with open systems volumes inside the storage units
  • TSM understands backing up FCP / SAN attached storage on the open systems connected to the SAN.
  • This allows other systems to mount volumes created by Linux under z/VM, if some form of locking software is available on both systems.

Pros of SAN over (E)CKD (2/2)
• Re-use of existing open system resources
  • If there is extra capacity in the SAN, then attaching the mainframe allows it to use some of the over-provisioned space.
  • Since z/VM itself can also reside on FCP / SAN-only disk, there may be benefits without additional disk investment.
• Common storage management policies and procedures with open systems
  • Allocation of an FCP / SAN disk can be done in the same way as it is done for open systems
• When attached to the z Platform, the overhead of converting from block to ECKD and back to block-oriented in the CU (Control Unit) is avoided
• More I/Os can be executed in parallel, although ECKD volumes using Parallel Access Volumes (PAVs) can minimise this advantage
• More data can be moved in a single I/O command

Cons of SAN compared with (E)CKD (1/3)
• Mainframe communication
  • Very few mainframe tools understand it.
  • Most cloning and copy facilities are not available with SAN disk, and z/VM cannot invoke some specialised features of certain disk units
  • mostly for old political battle reasons inside and outside IBM, but the problem remains
• Configuration complexity
  • To z/VM and storage administrators without familiarity with the z Platform, it may appear complex to configure.
  • LUNs and WWPNs are not native or natural concepts to people who normally work with z/VM or z/OS.
• Dump tools
  • Most z/OS and z/VM volume dump tools cannot access FCP storage at all.
  • When z/VM is using SCSI volumes for its own use, the SAN devices emulate (EDEV) 9336 disks (and thus can be dumped with DDR), but pure FCP / SAN volumes are not accessible to CMS-based tools (CMS is z/VM’s equivalent to a Linux shell)

Cons of SAN compared with (E)CKD (2/3)
• Recovery in DR requires additional planning
  • probably won’t get all the applications and data in the same restore cycle; a separate data restoration plan is necessary
• No native support for FCP SAN-attached tape in VM
  • At least one channel-attached drive is required.
  • The alternative is to buy a commercial solution to back up both z/VM and Linux data, which will involve a cost, or to write an in-house solution
• Performance differences
  • SAN-attached disk using EDEV to emulate an FBA device to z/VM has a measurable performance overhead when used for z/VM CP functions.
• Multiple physical paths
  • Managing multiple physical paths to FCP/SAN disk is still a bit difficult for non-Linux systems, including z/VM.

Cons of SAN compared with (E)CKD (3/3)
• z/OS and FCP devices
  • z/OS doesn't understand FCP/SAN devices at all
  • z/VM and Linux are perfectly happy to run on FCP/SAN or emulated 9336 (EDEV) on FCP, but z/OS has to have (E)CKD
  • Until z/VM adds (E)CKD emulation on FCP/SAN disk, you have to have separate disks for z/OS
• Linux native (or running under z/VM) supports SAN-attached tape.

(E)CKD vs. SAN - Other considerations
• Backup / Recovery
• Management / Provisioning
• Disaster Recovery
• Data Administration

Backup and recovery
• The next few charts form a comparison of z/VM & Linux backup and recovery options

z/VM and Linux on CKD
• z/VM disks
  • z/VM data can be backed up using DFDSS from z/OS without tapes assigned to z/VM, as long as the z/VM disks are accessible
  • With tapes assigned to z/VM, a number of z/VM backup methodologies can be used, but there could be a software cost involved
• Linux disks
  • Linux disks can be backed up by Tivoli Storage Manager (TSM)
  • Linux disks can be backed up by DFDSS or a z/VM backup method, but the restoration of data is not granular

z/VM and Linux on SAN
• z/VM disks
  • z/VM data can NOT be backed up using DFDSS
  • Most z/OS and z/VM volume dump tools cannot access FCP storage
  • z/VM disk (not Linux) using EDEV (FBA emulation) can be dumped using tools like DDR if a channel-attached tape drive is available to z/VM
• Linux disks
  • Linux disks can be backed up by Tivoli Storage Manager (TSM)
  • Linux disks can NOT be backed up by DFDSS
  • FCP / SAN volumes are not accessible to CMS-based tools (CMS is z/VM’s equivalent to a Linux shell).

Management and Provisioning
ECKD - Steps to Add Devices to Linux on System z
1. Set up hardware and make all physical connections
2. Add ECKD I/O definitions to z/VM via IOCP deck
3. Create z/VM directory entries with assigned disk dedicated to the virtual machine
4. Log on to the new virtual machine user
5. Install Linux on System z in the new Linux guest virtual machine
6. Add additional disk to the Linux on System z virtual machine via directory entry or CP attach command
7. Bring Linux on System z devices online
8. Format Linux on System z DASD devices using dasdfmt
9. Partition devices using fdasd
10. Add to LVM or create a filesystem directly on the partitioned device

Management and Provisioning
SAN - Steps to Add FCP Devices to Linux on System z
1. Set up hardware and make all physical connections between the disk array, System z, and SAN switch
2. Set up zoning on the switch to the System z channel
3. On the disk array, define, map and mask FBA LUNs
4. Add I/O definitions to z/VM using IOCDS
5. NPIV-enable the channels on the HMC
6. Create z/VM directory entries with disk in the User Direct file
7. Install Linux on System z in the newly allocated Linux guest virtual machine
8. Add additional disk to the Linux on System z virtual machine via directory entry or CP attach command
   1. Vary devices online
   2. Associate WWPN with device address
   3. Associate LUN(s) with WWPN
9. Partition the Linux device, /dev/………
10. Add the Linux device (/dev/…..) to LVM and/or create filesystem

Disaster Recovery Scenario
GDPS xDR Solution
• Sites typically run a ‘Freeze and Go’ GDPS DR policy.
  • So there isn’t any Active / Active option using GDPS on the z Platform.
• If z/VM were aligned with the z/OS GDPS policy, there may be applications which are candidates for migrating to Linux under z/VM but which run Active / Active clusters, for instance.
• To give these applications the same availability, different architectural options would need to be considered

xDR Pros and Cons
• Advantages
  • z/OS currently has GDPS DR architecture in place
  • DR procedures tried and tested
• Disadvantages
  • Requires z/VM and Linux to exclusively use CKD DASD
  • Active / Active architectures not supported with current GDPS ‘Freeze and Go’ policy

Other Clustering DR solutions
• When porting applications from other platforms to Linux under z/VM, consideration must be given to the DR and failover architectures that are in place for that application.
• For the most part, these DR and failover solutions can be replicated on the z Platform, but maybe not with the same software
• Any solution would need to be evaluated on an application-by-application basis.

Other Clustering DR solutions
• Advantages
  • Provides application with same SLA
  • Application architecture retained
• Disadvantages
  • Greater planning involved
    • It may be the case that not all applications can be restored in the same restore cycle, and a separate data restoration plan is necessary for particular applications
  • May compromise any current GDPS xDR solution
  • New software / hardware may be required to architect the same solution as when the application was running on a non-z platform

Data Administration
• The administration of LUN-based data versus CKD-based data will be different
  • At most installations this should not be an issue, as there are storage personnel who are highly experienced in both disciplines
• The area of storage administration that needs to be addressed is the lack of knowledge of native z/VM-based data
  • Training is required, but this is the case regardless of whether z/VM data is hosted on CKD or SAN disks.

Conclusions 1/3
• Benchmarks performed on z/VM Linux systems to compare the various configuration options have shown that using native SAN disks under Linux gives better performance than CKD.
  • The performance improvement over CKD is minimal and would not make it the ultimate deciding factor on using SAN over CKD.
• Despite z/VM on the z Platform having huge I/O capabilities inherent in the z architecture, with the correct number of FCP channels and multipathing configured under Linux, the SAN can match and outperform CKD-attached storage.
• Using SAN storage for native z/VM utilising EDEV (emulated devices) gives good performance results, on a par with native SAN and CKD, but there is a price to pay in increased CPU.
  • The increase is not insignificant but should not be a blocker to using only SAN storage for both z/VM and Linux.

Conclusions 2/3
• If z/VM is going to be in an SSI configuration (z/VM clustering) or part of a GDPS xDR architecture, at least one CKD disk must be available to z/VM Linux.
• Apart from performance, the other considerations (covered previously) are (a) Backup / Recovery, (b) Management / Provisioning, (c) DR and (d) Data Administration.
• If the decision is made to use SAN storage, then there could be a cost outlay for extra FICON/FCP cards and channels to connect the z Platform to an available SAN.

Conclusions 3/3
• There’s a place for both types of storage.
• A good approach may be to allocate z/VM-specific disks on CKD and store application data on the type of disk that provides the best cost/performance trade-off.
• Applications that benefit from very large volumes (like databases) would be good candidates for FCP/SAN storage.