STORAGE ARCHITECTURE / GETTING STARTED: SAN SCHOOL 101
Marc Farley, President of Building Storage, Inc.
Author, Building Storage Networks

Agenda
• Lesson 1: Basics of SANs
• Lesson 2: The I/O path
• Lesson 3: Storage subsystems
• Lesson 4: RAID, volume management and virtualization
• Lesson 5: SAN network technology
• Lesson 6: File systems

Lesson #1: Basics of storage networking

Connecting
[Diagram: computer system, HBA or NIC, network switch/hub, router, bridge, concentrator, dish, VPN]
• Networking or bus technology
• Cables + connectors
• System adapters + network device drivers
• Network devices such as hubs, switches, routers
• Virtual networking
• Flow control
• Network security

Storing
[Diagram: host software (Volume Manager Software, Storage Device Drivers, Mirroring Software) using a Storage Command and Transfer Protocol (wiring, network transmission frame) to reach storage devices: Tape Drives, Disk Drives, RAID Subsystem]
• Device (target) command and control
• Drives, subsystems, device emulation
• Block storage address space manipulation (partition management): mirroring, RAID, striping, virtualization, concatenation

Filing
[Diagram: user/application views (C:\directory\file, database, object) mapped by filing to logical blocks in storage]
• Namespace presents data to end users and applications as files and directories (folders)
• Manages use of storage address spaces
• Metadata for identifying data: file name, owner, dates

Connecting, storing and filing as a complete storage system
[Diagram: filing and storing layered over wiring (connecting): a computer system with an HBA or NIC, cables and a network switch/hub connecting to a disk drive; the storing function can also run in an HBA driver]

NAS and SAN analysis
• NAS is filing over a network
• SAN is storing over a network
• NAS and SAN are independent technologies
• They can be implemented independently
• They can co-exist in the same environment
• They can both operate and provide services to the same users/applications

Protocol analysis for NAS and SAN
[Diagram: NAS stacks filing over the network (connecting); SAN stacks storing over the wiring (connecting)]

Integrated SAN/NAS environment
[Diagram: a NAS client uses filing to reach a NAS 'head' server (NAS server + SAN initiator); the "NAS head" system uses storing over the SAN to reach SAN target storage]

Common wiring with NAS and SAN
[Diagram: the NAS client, the NAS head system and the SAN storage target can all share the same wiring (connecting) layer]

Lesson #2: The I/O path

Host hardware path components
• Memory
• Processor
• Memory Bus
• System I/O Bus
• Storage Adapter (HBA)

Host software path components
• Application
• Operating System
• Filing System
• Cache Manager
• Volume Manager
• MultiPathing
• Device Driver

Network hardware path components
• Cabling: fiber optic, copper
• Switches, hubs, routers, bridges, gateways
• Port buffers, processors
• Backplane, bus, crossbar, mesh, memory

Network software path components
• Access and Security
• Fabric Services
• Routing
• Flow Control
• Virtual Networking

Subsystem path components
• Network Ports
• Access and Security
• Cache
• Resource Manager
• Internal Bus or Network

Device and media path components
• Disk drives
• Tape drives
• Tape media
• Solid state devices

The end-to-end I/O path picture
[Diagram: application, operating system, filing system, cache manager, volume manager, multipathing and device driver in host memory; processor, memory bus, system I/O bus and storage adapter (HBA); cabling and network systems with access and security, fabric services, routing, flow control and virtual networking; the subsystem's network port, access and security, cache, resource manager and internal bus or network; ending at disk drives and tape drives]

Lesson #3: Storage subsystems

Generic storage subsystem model
• Controller (logic + processors)
• Access control
• Network Ports
• Resource manager
• Cache Memory
• Internal Bus or Network
• Power
• Storage Resources
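To make the model above concrete, here is a minimal illustrative sketch in Python (not part of the course material; every class and attribute name is invented for the example). It shows a subsystem controller exporting virtual volumes, built from its physical storage resources by a resource manager, through network ports with simple access control:

```python
from dataclasses import dataclass, field

@dataclass
class PhysicalDevice:
    """A physical storage resource inside the subsystem (disk, tape, ...)."""
    name: str
    capacity_gb: int

@dataclass
class ExportedVolume:
    """Virtual storage presented to hosts, backed by physical devices."""
    name: str
    members: list  # PhysicalDevice objects combined by the resource manager

@dataclass
class StorageSubsystem:
    """Generic subsystem model: ports, access control, cache, resource manager."""
    network_ports: list = field(default_factory=list)   # e.g. ["S1", "S2"]
    access_control: dict = field(default_factory=dict)  # host -> allowed volumes
    cache_mb: int = 1024                                 # controller cache memory
    volumes: list = field(default_factory=list)         # exported virtual storage

# Example: two physical drives exported as one (mirrored) volume
d1 = PhysicalDevice("disk-1", 300)
d2 = PhysicalDevice("disk-2", 300)
box = StorageSubsystem(network_ports=["S1", "S2"])
box.volumes.append(ExportedVolume("vol-A", members=[d1, d2]))
box.access_control["host-1"] = ["vol-A"]

print(box.volumes[0].name, "backed by", [m.name for m in box.volumes[0].members])
```

The slides that follow fill in what a real controller does with these pieces: redundancy, provisioning, multipathing and caching.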
Redundancy for high availability
• Multiple hot swappable power supplies
• Hot swappable cooling fans
• Data redundancy via RAID
• Multi-path support: network ports to storage resources

Physical and virtual storage
[Diagram: the Subsystem Controller's Resource Manager (RAID, mirroring, etc.) presents many exported storage volumes built from a smaller number of physical storage devices, plus a Hot Spare Device]

SCSI communications architectures determine SAN operations
• SCSI communications are independent of connectivity
• SCSI initiators (HBAs) generate I/O activity
• They communicate with targets
• Targets have communications addresses
• Targets can have many storage resources
• Each resource is a single SCSI logical unit (LU) with a universal unique ID (UUID), sometimes referred to as a serial number
• An LU can be represented by multiple logical unit numbers (LUNs)
• Provisioning associates LUNs with LUs & subsystem ports
• A storage resource is not a LUN, it's an LU

Provisioning storage
[Diagram: subsystem ports S1-S4 export LUNs 0-3 that map onto SCSI LUs with UUIDs A-D; the same LU can appear as different LUNs on different ports, and the LUs are built from physical storage devices]

Controller functions

Multipathing
[Diagram: the same SCSI LU (UUID A) is reached as LUN X over Path 1 and a second path; multipathing software (MP SW) in the host merges the two presentations of LUN X into one device]

Caching
[Diagram: the Controller Cache Manager sits between the exported volumes and the physical storage]
• Read caches: 1. Recently Used, 2. Read Ahead
• Write caches: 1. Write Through (to disk), 2. Write Back (from cache)

Tape subsystems
[Diagram: a Tape Subsystem Controller managing multiple tape drives, tape slots and a robot]

Subsystem management (now with SMI-S)
• Management station with browser-based network management software over Ethernet/TCP/IP
• Out-of-band management port
• In-band management through the exported storage resource

Data redundancy
• Duplication: 2n
• Parity: n+1
• Difference: d(x) = f(x) - f(x-1), relative to f(x-1)

Duplication redundancy with mirroring
• A mirroring operator, in the host-based I/O path or within a subsystem, terminates each I/O and regenerates new I/Os down I/O Path A and I/O Path B
• Error recovery/notification

Duplication redundancy with remote copy
[Diagram: host writes to subsystem A are copied uni-directionally (writes only) to subsystem B]

Point-in-time snapshot
[Diagram: the subsystem maintains point-in-time snapshot copies (A, B, C) of the data the host is using]

Lesson #4: RAID, volume management and virtualization

RAID = parity redundancy
• Duplication: 2n
• Parity: n+1
• Difference: d(x) = f(x) - f(x-1), relative to f(x-1)

History of RAID
• Late 1980s R&D project at UC Berkeley: David Patterson, Garth Gibson
• Redundant array of inexpensive (independent) disks
• Striping without redundancy was not defined (RAID 0)
• Original goals were to reduce the cost and increase the capacity of large disk storage

Benefits of RAID
● Capacity scaling: combine multiple address spaces as a single virtual address space
● Performance through parallelism: spread I/Os over multiple disk spindles
● Reliability/availability with redundancy: disk mirroring (striping to 2 disks), parity RAID (striping to more than 2 disks)

Capacity scaling
[Diagram: a RAID controller (resource manager) combines storage extents 1-12 into a single exported RAID disk volume with one address space]

Performance
[Diagram: a RAID controller (microsecond performance) spreads I/Os 1-6 across multiple disk drives, which deliver millisecond performance because of rotational latency and seek time]

Parity redundancy
• RAID arrays use XOR for calculating parity

Operand 1   Operand 2   XOR Result
False       False       False
False       True        True
True        False       True
True        True        False

• XOR is the inverse of itself
• Apply XOR in the table above from right to left
• Apply XOR to any two columns to get the third
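This XOR property is exactly what parity RAID relies on: the parity block is the XOR of all data blocks in a stripe, so any single missing block can be regenerated by XORing the surviving members with the parity. A minimal sketch in Python (illustrative only; block contents and names are made up):

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equal-sized blocks (data members and/or parity)."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            out[i] ^= b
    return bytes(out)

# Three data members (M1..M3) in one stripe, plus the computed parity P
m1, m2, m3 = b"AAAA", b"BBBB", b"CDEF"
p = xor_blocks(m1, m2, m3)

# Member M2 is lost: rebuild it from the surviving members and the parity
rebuilt_m2 = xor_blocks(m1, m3, p)
assert rebuilt_m2 == m2  # XOR of any N-1 blocks recovers the Nth
```

The same calculation runs during reduced mode operations and during a parity rebuild, which the next slides describe.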
Reduced mode operations
• When a member is missing, data that is accessed must be reconstructed with XOR
• An array that is reconstructing data is said to be operating in reduced mode
• System performance during reduced mode operations can be significantly reduced
[Diagram: XOR {M1 & M2 & M3 & P}]

Parity rebuild
• The process of recreating data on a replacement member is called a parity rebuild
• Parity rebuilds are often scheduled for non-production hours because performance disruptions can be so severe
[Diagram: XOR {M1 & M2 & M3 & P}]

RAID 0+1, 10
Hybrid RAID: 0+1
[Diagram: a RAID controller with disk drives 1-5; mirrored pairs of striped members]

Volume management and virtualization
• Storing level functions
• Provide RAID-like functionality in host systems and SAN network systems
• Aggregation of storage resources for: scalability, availability, cost/efficiency, manageability

[Diagram: server system software stack: OS kernel, file system, volume manager (RAID & partition management; a device driver layer between the kernel and the storage I/O drivers), HBA drivers, HBAs]

Volume managers can use all available connections and resources and can span multiple SANs as well as SCSI and SAN resources
[Diagram: a volume manager builds virtual storage from a SCSI disk resource (through a SCSI HBA and SCSI bus) and from SAN disk resources (through HBA drivers, a SAN HBA, SAN cable and SAN switch)]

SAN storage virtualization
• RAID and partition management in SAN systems
• Two architectures: in-band virtualization (synchronous), out-of-band virtualization (asynchronous)

In-band virtualization
[Diagram: a SAN virtualization system (system(s), switch or router) sits in the I/O path and exports virtual storage built from disk subsystems]

Out-of-band virtualization
• Distributed volume management
• Virtualization agents are managed from a central virtualization management system in the SAN
[Diagram: virtualization agents in the hosts, a virtualization management system, and the disk subsystems underneath]

Lesson #5: SAN networks

Fibre Channel
• The first major SAN networking technology
• Very low latency
• High reliability
• Fiber optic cables
• Copper cables
• Extended distance
• 1, 2 or 4 Gb transmission speeds
• Strongly typed

Fibre Channel
• A Fibre Channel fabric presents a consistent interface and set of services across all switches in a network
• Hosts and subsystems all 'see' the same resources
[Diagram: several SAN storage target subsystems visible through the same fabric]

Fibre Channel port definitions
● FC ports are defined by their network role
● N-ports: end node ports connecting to fabrics
● L-ports: end node ports connecting to loops
● NL-ports: end node ports connecting to fabrics or loops
● F-ports: switch ports connecting to N-ports
● FL-ports: switch ports connecting to N-ports or NL-ports in a loop
● E-ports: switch ports connecting to other switch ports
● G-ports: generic switch ports that can be F, FL or E ports

Ethernet / TCP/IP SAN technologies
• Leveraging the installed base of Ethernet and TCP/IP networks
• iSCSI: native SAN over IP
• FC/IP: FC SAN extensions over IP

iSCSI
• Native storage I/O over TCP/IP
• New industry standard
• Locally over Gigabit Ethernet
• Remotely over ATM, SONET, 10Gb Ethernet
[Diagram: protocol stack: iSCSI / TCP / IP / MAC / PHY]

iSCSI equipment
• Storage NICs (HBAs)
• SCSI drivers
• Cables: copper and fiber
• Network systems: switches/routers, firewalls
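Since iSCSI carries SCSI commands over an ordinary TCP/IP connection, a target portal can be probed with nothing more than a TCP socket. A minimal Python sketch (illustrative; the portal address below is a placeholder, and 3260 is the standard iSCSI listening port; real I/O goes through an iSCSI initiator driver or storage NIC/HBA, not raw sockets):

```python
import socket

def portal_reachable(host: str, port: int = 3260, timeout: float = 3.0) -> bool:
    """Return True if an iSCSI target portal accepts TCP connections."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# 192.0.2.10 is a documentation/placeholder address; substitute a real portal
print(portal_reachable("192.0.2.10"))
```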
FC/IP
• Extending FC SANs over TCP/IP networks
• FCIP gateways operate as virtual E-port connections
• FCIP creates a single fabric where all resources appear to be local
[Diagram: one fabric: two SAN islands joined by FCIP gateways acting as E-ports across a TCP/IP LAN, MAN or WAN]

SAN switching & fabrics
• High-end SAN switches have latencies of 1 - 3 µsec
• Transaction processing requires the lowest latency; most other applications do not
• Transaction processing requires non-blocking switches: no internal delays preventing data transfers

Switches and directors
• Switches: 8 - 48 ports, redundant power supplies, single system supervisor
• Directors: 64+ ports, HA redundancy, dual system supervisors, live SW upgrades

SAN topologies
• Star: simplest, single hop
• Dual star: simple network + redundancy, single hop, independent or integrated fabric(s)

SAN topologies
• N-wide star: scalable, single hop, independent or integrated fabric(s)
• Core-edge: scalable, 1 - 3 hops, integrated fabric

SAN topologies
• Ring: scalable, integrated fabric, 1 to N÷2 hops
• Ring + star: scalable, integrated fabric, 1 to 3 hops

Lesson #6: File systems

File system functions
• Name space
• Access control
• Metadata
• Locking
• Address space management

Filing and storing
• Think of the storage address space as a sequence of storage locations (a flat address space)
[Diagram: a grid of storage locations numbered 1 through 90 representing the flat address space]

Superblocks
• Superblocks are known addresses used to find file system roots (and mount the file system)
[Diagram: superblock (SB) locations at known addresses within the flat address space]

Filing and scaling
• File systems must have a known and dependable address space
• The fine print in scalability: how does the filing function know about the new storing address space?
[Diagram: the filing function's map of block addresses alongside a storing address space before and after it grows]
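The question above is the crux of scaling a file system over growing storage: the filing layer allocates out of the address space it learned at mount time, so when the storing layer adds blocks, the file system has to be told (typically by an explicit grow/resize step) before it can use them. A minimal sketch in Python (all class and method names are invented for illustration):

```python
class FlatAddressSpace:
    """Storing layer: a flat run of block addresses 1..blocks."""
    def __init__(self, blocks: int):
        self.blocks = blocks

    def grow(self, extra: int) -> None:
        self.blocks += extra  # capacity exists now, but filing does not know yet

class SimpleFilingLayer:
    """Filing layer: allocates blocks only from the address space it knows about."""
    def __init__(self, storage: FlatAddressSpace):
        self.storage = storage
        self.known_blocks = storage.blocks  # snapshot taken at mount time
        self.allocated = set()

    def allocate(self) -> int:
        for addr in range(1, self.known_blocks + 1):
            if addr not in self.allocated:
                self.allocated.add(addr)
                return addr
        raise OSError("file system full, even though the storing layer may have grown")

    def resize(self) -> None:
        """Explicit step that tells filing about a grown storing address space."""
        self.known_blocks = self.storage.blocks

disk = FlatAddressSpace(blocks=4)
fs = SimpleFilingLayer(disk)
for _ in range(4):
    fs.allocate()      # use up addresses 1..4
disk.grow(4)           # storing address space doubles...
try:
    fs.allocate()      # ...but filing still sees only the original 4 blocks
except OSError as err:
    print(err)
fs.resize()            # tell the filing layer about the new space
print(fs.allocate())   # now succeeds: address 5
```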