DATA ANALYZING ROBOT YOUTH LIFEFORM: Can fly jets. Eats Cheetos. Says "Memowee." Runs on the x86 instruction set and RAM. Not sure? S. A. N. — EVERYTHING THEY TOLD ME ABOUT MY SAN IS A LIE!

SCSI
Small Computer System Interface (SCSI, /ˈskʌzi/ SKUZ-ee)[1] is a set of standards for physically connecting and transferring data between computers and peripheral devices. The SCSI standards define commands, protocols, and electrical and optical interfaces. SCSI is most commonly used for hard disks and tape drives, but it can connect a wide range of other devices, including scanners and CD drives, although not all controllers can handle all devices. The SCSI standard defines command sets for specific peripheral device types; the presence of "unknown" as one of these types means that in theory it can be used as an interface to almost any device, but the standard is highly pragmatic and addressed toward commercial requirements.

COMMON FLAVORS OF SCSI
SCSI-1
SCSI-2
Fast SCSI
Fast Wide SCSI
Ultra SCSI
Ultra Wide SCSI
Ultra2 SCSI
Ultra2 Wide SCSI
Ultra3 SCSI
Ultra-320 SCSI
Ultra-640 SCSI

FILE ACCESS PROTOCOLS (LAYER 7)
NFS (Network File System) – standard originally developed by Sun in 1984 for remote server file access, i.e. mounting a remote drive.
NFSv2 – 1989. Operated predominately on UDP. Only allowed 2 GB of a file to be read.
NFSv3 – 1995. Support for 64-bit file offsets (can handle files larger than 2 GB) and asynchronous writes to the server; Sun officially added TCP as a transport for data.
NFSv4 – 2000. Protocol support for clustered server deployments, influenced by AFS and CIFS; mandates strong security, with transport improvements.
NFSv4.1 – Adds support for parallelism in data distribution across servers.

SMB (Server Message Block) & CIFS (Common Internet File System) – originally developed by IBM. Microsoft picked up the protocol with its LAN Manager implementation for OS/2, together with 3Com.
1. Was integrated into Windows for Workgroups in 1992.
2. SMB morphed into CIFS, running over NetBIOS on TCP port 139 (later versions run directly over TCP port 445).
3. "Samba" is the implementation that lets clients connect to UNIX servers speaking SMB.
4. CIFS added NTLM and NTLMv2 authentication, support for locking, stronger encryption, and later Active Directory support.
5. Initially NetBIOS was used for transmission, but later versions became less and less chatty.
SEE ALSO: WebDAV for further file-sharing protocols.

DATA TRANSMISSION PROTOCOLS
PROTOCOL – ADVANTAGES / BROAD DEFINITION
Fibre Channel – Das über fast: multiple Gb/s and rising! A communication protocol used to move data to and from computer storage devices such as hard drives and tape drives. Topologies: Arbitrated Loop, Point to Point, and Switched Fabric.
FCoE (Fibre Channel over Ethernet) – Fibre Channel frames carried over an Ethernet medium, connecting to storage via the Fibre Channel protocol. Good for long distances and geo-located SANs.
SCSI – Small Computer System Interface; old-school standard. CPU independent.
iSCSI – IP-based SCSI. Cheap. Uses an iSCSI host adapter / RAID controller.
eSATA (External Serial ATA disk enclosures) – Attachment for external hard drives.
InfiniBand – Switched fabric link used in high-end data centers. Low network latency; a superset of the Virtual Interface Architecture (whatever that is?). Features include high throughput, low latency, quality of service and failover, and it is designed to be scalable.
SAS (Serial Attached SCSI) – Point-to-point replacement for the parallel SCSI of the '80s. Backwards compatible with SATA.

Remember: Fiber = fiber optics, Fibre = the protocol. Engineers get bored, so they have to make up silliness so they'll have something to whine about.

THE SAN: THE HBA
HBA (MISC HOST BUS ADAPTERS) — pictured: an external SCSI bus, a QLogic/Cisco SCSI HBA, and an ISA SCSI HBA circa the early '90s.
WHAT IS AN HBA? A host bus adapter (HBA) is a component that connects a host system (the computer) to other network and storage devices via an external port.
FOR THIS CONVERSATION ONLY: Today, the term host bus adapter (HBA) is most often used to refer to a Fibre Channel interface card. Fibre Channel HBAs are available for all major open systems, computer architectures, and buses, including PCI and SBus (obsolete today). Each HBA has a unique World Wide Name (WWN), which is similar to an Ethernet MAC address in that it uses an OUI assigned by the IEEE. However, WWNs are longer (8 bytes). There are two types of WWNs on an HBA: a node WWN (WWNN), which is shared by all ports on a host bus adapter, and a port WWN (WWPN), which is unique to each port.
There are HBA models of different speeds: 1 Gbit/s, 2 Gbit/s, 4 Gbit/s, 8 Gbit/s, 10 Gbit/s and 20 Gbit/s. The major Fibre Channel HBA manufacturers are QLogic and Emulex; as of mid-2009, these vendors shared approximately 90% of the market.[1][2] Other manufacturers include Agilent, ATTO, Brocade, and LSI. HBA is also sometimes interpreted as "High Bandwidth Adapter" in the case of Fibre Channel controllers.

OTHER TYPES OF HBA: eSATA, InfiniBand, ATA, SAS, SATA, SCSI. HBA = HARDWARE TRANSLATOR or DISK CONTROLLER.

MPIO DRIVERS & YOUR HBA: The Microsoft Multipath I/O (MPIO) framework provides support for multiple data paths to storage to improve fault tolerance of the connection to storage, and may in some cases provide greater aggregate throughput by using multiple paths at the same time.
• Dynamic load balancing
• Traffic shaping
• Automatic path management
• Dynamic reconfiguration
Included in the OS in Windows 2008; 2003 & 2000 required a download and configuration.
• Microsoft's full discussion is at http://technet.microsoft.com/en-us/library/ee619734(v=WS.10).aspx
• http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=9787
• http://download.microsoft.com/download/3/0/4/304083f1-11e7-44d9-92b92f3cdbf01048/mpio.doc

FIBER – IT'S GLASS. Lots of breaky stuff. Lots of connectors and types of cables out there for everything from industrial lasers to transatlantic cables.
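Circling back to MPIO for a moment: the round-robin load balancing and automatic path failover it provides can be illustrated with a toy sketch. This models only the idea, not Microsoft's actual framework, and the path names are made up:

```python
import itertools

# Toy sketch of MPIO-style path management (illustrative, not Microsoft's
# implementation): round-robin load balancing across healthy paths, with
# automatic failover when a path dies.
class MultiPath:
    def __init__(self, paths):
        self.paths = list(paths)          # e.g. ["hba0:portA", "hba1:portB"]
        self.healthy = set(self.paths)
        self._rr = itertools.cycle(self.paths)

    def fail(self, path):
        self.healthy.discard(path)        # path dropped; I/O continues on the rest

    def next_path(self):
        for _ in range(len(self.paths)):  # skip over failed paths
            p = next(self._rr)
            if p in self.healthy:
                return p
        raise IOError("all paths to storage are down")

mp = MultiPath(["hba0:portA", "hba1:portB"])
print(mp.next_path())   # hba0:portA
mp.fail("hba0:portA")
print(mp.next_path())   # hba1:portB — I/O keeps flowing on the surviving path
print(mp.next_path())   # hba1:portB again: only one path left
```

A real MPIO device-specific module also weighs queue depth and path state, but the survive-a-dead-path behavior is the point of the sketch.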
http://en.wikipedia.org/wiki/Optical_fiber_cable
http://en.wikipedia.org/wiki/Optical_fiber_connector

FIBER CONFIGURATIONS
Arbitrated Loop: Has to communicate with all nodes to pass data along between nodes. The end of the loop can be slow. Mid-2000s, now less often used. Think TOKEN RING.
Switched Fabric: Fibre Channel used in a switched environment, similar to a switched TCP/IP network. Data can travel between multiple endpoints through the switches. Used to truly create a Storage Area Network. HBAs can have redundant cabling.
Point to Point: Everything old is new again. Direct-attached storage devices like drive cabinets or RAM SANs. Nice because the device becomes dedicated to servers with high performance needs.

THE DISK (SUBSYSTEM, aka JBOD). YEAH, there are lots of disks. YEAH! JUST A BUNCH OF DISKS.
DISK SUBSYSTEMS BY LEVEL OF COMPLEXITY
1. JBOD: Just a Bunch Of Disks.
2. RAID arrays (SEE NEXT SLIDE).
3. Intelligent disk subsystems:
   1. Instant copies.
   2. Remote mirroring & consistency groups (iSCSI is a great candidate protocol) – remote copying of data either synchronously or asynchronously.
   3. LUN (Logical Unit Number) masking:
      1. Port based: the poor man's masking, limiting certain ports to specific servers.
      2. Full LUN masking: presenting certain disks to the server so that only that server has access to a specific stripe or set of disks.
Don't confuse LUN masking with ZONING – SEE SAN VOLUME CONTROLLER: Many devices and nodes can be attached to a SAN. When data is stored in a single cloud, or storage entity, it is important to control which hosts have access to specific devices. Zoning controls access from one node to another. Zoning lets you isolate a single server to a group of storage devices or a single storage device, or associate a grouping of multiple servers with one or more storage devices, as might be needed in a server cluster deployment.
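Conceptually, the zoning just described is a membership check: an initiator port can reach a target port only if some zone contains both. A toy sketch of soft (WWN-based) zoning — every WWPN here is made up:

```python
# Toy sketch of soft zoning: membership is by WWPN, and an initiator may only
# reach targets that share a zone with it. All WWNs below are invented.
zones = {
    "zone_sql":    {"10:00:00:05:1e:aa:bb:01",   # server HBA port (initiator)
                    "50:00:09:72:00:11:22:33"},  # storage array port (target)
    "zone_backup": {"10:00:00:05:1e:aa:bb:02",
                    "50:00:09:72:00:11:22:44"},
}

def can_reach(initiator_wwpn, target_wwpn):
    """True if any zone contains both ports."""
    return any(initiator_wwpn in z and target_wwpn in z for z in zones.values())

print(can_reach("10:00:00:05:1e:aa:bb:01", "50:00:09:72:00:11:22:33"))  # True
print(can_reach("10:00:00:05:1e:aa:bb:01", "50:00:09:72:00:11:22:44"))  # False: no shared zone
```

Hard zoning works the same way logically, but the switch enforces it per physical port instead of per WWN.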
Zoning is implemented at the hardware level (by using the capabilities of Fibre Channel switches) and can usually be done either on a port basis (hard zoning) or on a World Wide Name (WWN) basis (soft zoning). WWNs are 64-bit identifiers for devices or ports. http://technet.microsoft.com/en-us/library/cc758640(v=ws.10).aspx

R.A.I.D.
NEW RAID: Redundant Array of Independent Disks. OLD SCHOOL: Redundant Array of Inexpensive Disks.
A spanned volume is a formatted partition whose data is stored on more than one hard disk, yet appears as one volume.
RAID 0+1: An even number of drives; a second striped set mirrors a primary striped set. The array continues to operate with one or more drives failed in the same mirror set, but if drives fail on both sides of the mirror, the data on the RAID system is lost.
RAID 1+0 (a.k.a. RAID 10): Mirrored sets in a striped set (minimum four drives; even number of drives). Provides fault tolerance and improved performance, but increases complexity.
RAID 5+3: A striped set layered over parity-protected sets (some manufacturers label this as RAID 53).
My experience: SANs combine striping with hybrid RAID. Each manufacturer is different, and some older SANs use the more traditional RAID 1, 5, and 10. (Pictured: an HP SAN I used back in the Northeast.)

WHAT IS A SAN VOLUME CONTROLLER?
Indirection, or mapping from virtual LUN to physical LUN: Servers access the SVC as if it were a storage controller. The SCSI LUNs they see represent virtual disks (volumes) allocated in the SVC from a pool of storage made up of one or more managed disks (MDisks). A managed disk is simply a storage LUN provided by one of the storage controllers that the SVC is virtualizing.
Data migration: The SVC can move volumes from MDisk group to MDisk group whilst maintaining I/O access to the data. MDisk groups can be shrunk or expanded by removing or adding hardware LUNs, while maintaining I/O access to the data. Both features can be used for seamless hardware migration.
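The indirection just described can be sketched as a table of virtual extents pointing at MDisk extents: migration rewrites the pointers while the virtual layout the host sees stays fixed. This is a toy model, not IBM's implementation, and all the names are illustrative:

```python
# Minimal sketch of SVC-style block virtualization: a volume is a list of
# virtual extents, each pointing at (mdisk, physical extent). Illustrative only.
class Volume:
    def __init__(self, name, extents):
        self.name = name
        self.map = list(extents)          # virtual extent -> (mdisk, physical extent)

    def migrate(self, src_mdisk, dst_mdisk):
        """Move every extent off src_mdisk. The host's virtual addresses never
        change, so I/O can continue while the pointers are rewritten."""
        self.map = [(dst_mdisk, pe) if md == src_mdisk else (md, pe)
                    for md, pe in self.map]

vol = Volume("vdisk0", [("mdisk_old", 0), ("mdisk_old", 1), ("mdisk_b", 7)])
vol.migrate("mdisk_old", "mdisk_new")
print(vol.map)   # backing storage swapped; virtual layout unchanged
```

A real SVC also copies the extent contents and allocates free extents on the destination before flipping each pointer; the sketch shows only the indirection that makes that invisible to hosts.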
Migration from an old SVC model to the most recent model is also seamless and implies no copying of data.
Importing existing LUNs via a feature called Image Mode: "Image mode" is a one-to-one representation of an MDisk (managed LUN) that contains existing client data; such an MDisk can be seamlessly imported into or removed from an SVC cluster.
Fast-write cache: Writes from hosts are acknowledged once they have been committed into the SVC mirrored cache, but prior to being destaged to the underlying storage controllers. Data is protected by being replicated to the other node in an I/O group (node pair). Cache size is dependent on the model of SVC used. The fast-write cache is also used to increase performance in midrange storage configurations.
Auto tiering (Easy Tier): The SVC automatically selects the best storage hardware for each chunk of data, according to its access patterns. Cache-unfriendly "hot" data is dynamically moved to solid state drives (SSDs), whereas cache-friendly "hot" data and any "cold" data is moved to economical spinning disks. Makes use of HOT ZONES on disks by moving data to the outer (faster I/O) or inner (slower I/O) sections of spindles.
Solid state drive (SSD) support: The SVC can use any supported external SSD storage device or provide its own internal SSD slots, up to 32 per cluster. Easy Tiering is automatically active when mixing SSDs with spinning disks in hybrid MDisk groups.
Space-efficient features: LUN capacity is only used when new data is written to a LUN. Also known as Thin Provisioning.
Data blocks that equal zero are not physically allocated, unless previous non-zero data exists. Thin provisioning is typically combined with the FlashCopy features detailed below to provide space-efficient snapshots.
Virtual Disk Mirroring: Provides the ability to make two copies of a LUN, implicitly on different storage controllers.
Stretched Cluster, also called Split I/O group: A geographically distributed cluster layout leveraging the virtual disk mirroring feature across datacenters within 300 km of each other. A stretched cluster presents one logical storage layer over synchronous distances for increased high availability. Unlike classical mirroring, logical LUNs are writable on both sides (in tandem) at the same time, removing the need for "failover", "role switch", or "site switch". The feature can be combined with Live Partition Mobility or VMotion to avoid any data transport (storage mobility or storage VMotion). Each side's SVC nodes also have access to the other side's physical storage hardware, removing the need for data rebuilds in case of simple node failures.

LICENSED IBM SVC FEATURES:
Payment for the base license is per TB of MDisks or per number of physical disk drives in the underlying layer. There are some optional features, separately licensed per TB:[1]
Metro Mirror – synchronous remote replication. This allows a remote disaster recovery site at a distance of up to about 300 km.[5]
Global Mirror – asynchronous remote replication. This allows a remote disaster recovery site at a distance of thousands of kilometres. Each Global Mirror relationship can be configured for high latency / low bandwidth or for high latency / high bandwidth connectivity, the latter allowing a consistent recovery point objective (RPO) below 1 second.
FlashCopy (FC) – Used to create a disk snapshot for backup, or for application testing of a single volume. Snapshots require only the "delta" capacity unless created with fully provisioned target volumes.
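A minimal sketch of the copy-on-write idea behind these space-efficient snapshots, simplified to a per-grain dict rather than a real bitmap with a configurable grain size (all names are illustrative):

```python
class CowSnapshot:
    """Toy copy-on-write snapshot: the target stores only the grains that
    changed on the source after the snapshot was taken (the 'delta')."""
    def __init__(self, source):
        self.source = source              # list of grains (the live volume)
        self.saved = {}                   # grain index -> pre-snapshot contents

    def write(self, grain, data):
        if grain not in self.saved:       # first write since the snapshot:
            self.saved[grain] = self.source[grain]   # copy the old grain aside
        self.source[grain] = data         # then let the write through

    def read_snapshot(self, grain):
        # Unchanged grains are read straight from the source volume.
        return self.saved.get(grain, self.source[grain])

vol = ["A", "B", "C"]
snap = CowSnapshot(vol)
snap.write(1, "B2")
print(vol)                    # ['A', 'B2', 'C']: the live volume sees new data
print(snap.read_snapshot(1))  # B: the snapshot still sees the old data
print(len(snap.saved))        # 1: only the delta consumes space
```

This is why a snapshot costs almost nothing until the source starts changing; a fully provisioned target would instead copy every grain up front.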
FlashCopy comes in three flavours: Snapshot, Clone, and Backup volume. All are based on optimized copy-on-write technology, and may or may not remain linked to their source volume. One source volume can have up to 256 simultaneous targets. Targets can be made incremental, and cascaded, tree-like dependency structures can be constructed. Targets can be re-applied to their source or any other appropriate volume, including one of a different size (e.g. resetting any changes from a resize command). Copy-on-write is based on a bitmap with a configurable grain size, as opposed to a journal.[1]

SAN 2.0: SOLID STATE
Why should you care about NAND FLASH products? THE PLUS SIDE:
1. NAND is fun to say. (Stay away from ML-FLASH.)
2. It's fast.
3. Can be integrated into a SAN with a low form factor.
4. Did I say it's fast?
5. A single card can take up less space than a JBOD. Good for off-site disaster recovery servers.
6. It's more reliable than disks, and it's easier to tell when a catastrophic failure occurs.
7. Lower power consumption.
8. Bad parts of a drive can be sequestered and disabled when chips fail.
Why should you care about NAND FLASH products? THE DOWN SIDE:
1. Drives degrade as they write and erase data.
2. Data can leak out of memory sectors at a "bit level".
3. They're not cheap (but they're getting cheaper).
4. Cheap write controllers can reuse the same chips, causing unnecessary wear.
5. Not all external drives will interface with a specific switch; OEMs can get picky.

ioDrive/ioDrive Duo Reliability
• No moving parts = fewer points of failure.
• Internal N+1 redundancy with self-healing technology.
• 4 levels of data parity protection and/or verification – a 10^-30 probability of undetected bad data!
• 5 levels of data redundancy – a 10^-20 probability of uncorrectable data!
• 5 methods for ensuring NAND longevity and durability:
– 10 years for SLC (at 100% write duty cycle)
– 8 years for MLC (at 100% write duty cycle)
• HDD: 5-year component design life span, 350,000 to 600,000 hour MTBF.
• ioDrive: 1+ million hours MTBF.

FUSION-IO: Will that be one card or two? (Pictured: Texas Memory Systems, EMC, Fusion-io, and the Texas Memory Systems RamSan-70.) PCI Express slots differ by size: everything old is new again. Who remembers AGP, ISA, VESA? Who remembers the '80s?

Some Texas Memory Systems NAND Flash RamSan products:
RamSan-70: SLC Flash, 900 GB, 1.2M IOPS, 2.5 GB/s, full-height half-length PCIe x8 2.0.
RamSan-710/810: SLC/eMLC Flash, 5/10 TB, 400K/320K IOPS, 5/4 GB/s, 1U rackmount, 4x IB or FC ports.
RamSan-720/820: SLC/eMLC Flash, 12/24 TB, 500K/450K IOPS, 5/4 GB/s, 3U rackmount, 10x IB or FC ports.
RamSan-630: SLC Flash, 10 TB, 1M IOPS, 10 GB/s.

RamSan-710/810 internals: 1-2 interface modules; 4-20 Flash modules + 1 "Active Spare"; management control processor; redundant power supplies; 1U chassis; redundant fans; N+1 batteries.
RamSan-720/820 internals: paired interface modules, management modules, power modules, and switch/RAID controllers.

NAND FLASH: THE MICROSCOPIC VIEW (diagram: control gate, floating gate, voltage source, voltage drain)
1. Charges are held in the floating gate.
2. Cells are lined up in parallel.
3. When electrons are present in the floating gate, no current flows.
4. Erase blocks are 8 to 32 KB in size.
5. A write operation in any type of flash device can only be performed on an empty or erased unit, so in most cases a write operation must be preceded by an erase operation.
6. Multi-level cell flash can store more than one bit (rather than a single bit) in each memory cell, doubling the capacity of the memory.

ERASE OPERATIONS – "Fowler-Nordheim (F-N) Tunneling":
1. A voltage ranging between -9V and -12V is applied to the control gate.
2. A voltage of around 6V is applied to the source.
3.
The electrons in the floating gate are pulled off and transferred to the source by quantum tunneling (a tunnel current): electrons tunnel from the floating gate to the source and substrate. A floating gate that carries no charge is neutral, reads as a bit of 1, and has a state of "empty". (Diagram: -9V on the control gate, 6V on the source, electrons tunneling out via Fowler-Nordheim tunneling.)

WRITE OPERATIONS:
1. 7V is applied to the bit line (drain terminal); a bit of 0 is stored in the cell.
2. The channel is now turned on, so electrons can flow from the source to the drain; through the thin oxide layer, electrons move to the floating gate.
3. The source-drain current is sufficiently high to cause some high-energy electrons to jump through the insulating layer onto the floating gate, via a process called hot-electron injection.

READ OPERATIONS:
1. Apply a voltage of around 5V to the control gate.
2. Apply 1V to the drain.
The state of the memory cell is distinguished by the current flowing between the drain and the source. To read the data, a voltage is applied to the control gate, and the MOSFET channel (metal-oxide-semiconductor field-effect transistor — i.e. the gate) will be either conducting or insulating, based on the threshold voltage of the cell, which is in turn controlled by the charge on the floating gate. The current flow through the MOSFET channel is sensed and forms a binary code, reproducing the stored data. (MOSFETs with a floating gate are known as FGMOS.)

FG MOSFET Flash Problems and TMS Solutions. Each company handles this in their own way.
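One of the standard answers to flash's limited write-erase cycles is wear leveling: always erase the least-worn block so wear spreads evenly instead of hammering one block until it dies. A toy allocator illustrating the idea (this is not any vendor's actual algorithm):

```python
# Toy wear-leveling allocator (illustrative only): pick the block with the
# fewest erase cycles for the next erase/program operation.
def pick_block(erase_counts):
    """Return the index of the least-worn erasable block."""
    return min(range(len(erase_counts)), key=lambda i: erase_counts[i])

erase_counts = [5, 1, 3, 1]           # per-block erase cycles so far
for _ in range(4):                    # four more erase/program cycles
    b = pick_block(erase_counts)
    erase_counts[b] += 1
print(erase_counts)                   # [5, 3, 3, 3] — wear evens out on the cold blocks
```

Real controllers also move static data off low-wear blocks so those blocks become erasable at all; the sketch shows only the least-worn-first policy.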
Problem → Solution
Limited write-erase cycles → Wear leveling
Bit errors → ECC
Block/plane/device failures → Block remapping, RAID, Variable Stripe RAID™
Disturb errors → Voltage and timing adjustments (read, write, erase)
Erases need big blocks and take a long time → Overprovisioning

Solid State Solutions: DDR on the go versus FLASH
• 600,000 IOPS*
• Latency = 15 microseconds = 0.000015 s = 15 x 10^-6 s
• Internal NAND Flash for backup; IBM Chipkill for failed chips
• RAIDed RAM boards
• Redundant power
• One internal controller
• Certified with IBM fabrics/SVCs
The RamSan-420 uses RAID-protected Flash memory modules to rapidly back up the RAM-based data and ensure non-volatility for the system. In Active Backup mode, the RamSan-420 continuously backs up data to the internal redundant Flash modules without impacting system performance. The RamSan-420 can back up or restore the entire 256 GB of data in just three minutes. Texas Memory Systems' patented IO2 technology further improves system availability by making user- or application-requested data instantly accessible after the system is powered on.

Storage Capacity (Usable): 512 GB
LUNs: 1,024
Storage Medium: RAM
Maximum Bandwidth: 4,500 MB/s
Read IOPS: 600K; Read Latency: 15 microseconds
Write IOPS: 600K; Write Latency: 15 microseconds
Expansion Slots: 4
Fibre Channel Ports per Card: 2
Fibre Channel Speed per Port: 4 Gb/s
Management: browser, SNMP, SSH, Telnet
Typical Power Consumption: 650 W
Dimensions: 7" (4U) x 24"; Form Factor: 4U; Weight: 90 lbs

DDR-RAM BOARD — RAM SAN ADVANTAGES:
1. LUNs can be spanned across multiple devices, creating multipath processing and load balancing.
2. Load balancing enables a single application to fully benefit from the RamSan-440's technical capabilities.
Multipathing also enables path failover, so that an application's availability is not dependent on a single link.
3. Dual HBAs create 8 fiber ports. RamSan firmware has been tested and is compatible with nearly all host multipathing drivers. Texas Memory Systems has developed an OEM MPIO module for Microsoft Windows and an OEM MPIO module for IBM AIX 5.3.
http://www.ramsan.com/files/f000245.pdf
(Pictured: RamSan-400 internals.)

RULE: NEVER TRUST WHAT ANYONE SAYS AND ALWAYS KEEP YOUR LASER HANDY! – PARANOIA ROLEPLAYING GAME, WEST END GAMES.

OEM SERVER TOOLS ARE YOUR FRIEND! When starting a new job, make sure you know how to get into the server and access the OEM server tools. It makes you look smart for very low-hanging fruit. In a month they'll call you Scotty, because the other DBAs are scared to touch it.
Know how to access your HBA's CONFIGURATION TOOLS. In this case, search for the executable sansurfer.exe on the C:\ drive.
Look Ma', it's got diagnostics. Things to look for:
1. Pirates
2. Physical disk I/O rates
3. Cache utilization (write %, read %)
4. Write pending %
5. Storage controller utilization (disk side)
6. Bandwidth utilization
boooYEAH!
DELL's FUSION IO MANAGER for the Fusion-io Duo: I/O for a file copy session.

HOW DO I SMACK MY SAN SO IT KNOWS WHO'S BOSS? HOW DO I KNOW YOU'RE LYING?
Tools I like — and what to focus on:
• LATENCY
• BANDWIDTH
• IOPS
SQLIO: http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=20163
SQL IO SIMULATOR: http://support.microsoft.com/kb/231619
IOMETER: http://sourceforge.net/projects/iometer/ and http://www.iometer.org/doc/downloads.html
CrystalDiskMark: Haven't used it, but it's another alternative.
Performance Monitor (PERFMON): C:\Windows\System32\perfmon.exe
RESOURCE MONITOR: C:\Windows\System32\perfmon.exe /res

WHAT DO I MEASURE?
1. LATENCY
2.
IOPS: Input/output operations per second (sequential or random). IOPS * transfer size in bytes = bytes per second. Perfmon counter: (Writes/Second); its inverse is seconds per write.
Remember, not all LUNs are SAN disks! Distinguish local disks from SAN disks!

Per-drive counters:
\<Drive:>\Disk Reads/sec
\<Drive:>\Disk Writes/sec
\<Drive:>\Disk Read Bytes/sec
\<Drive:>\Disk Write Bytes/sec
\<Drive:>\Avg. Disk Bytes/Read
\<Drive:>\Avg. Disk Bytes/Write
\<Drive:>\Split IO/Sec

BRENT OZAR'S FAVORITE PERFMON COUNTERS FOR SANS (listed OBJECT first, then COUNTER):
Memory – Available MBytes
Paging File – % Usage
Physical Disk – Avg. Disk sec/Read
Physical Disk – Avg. Disk sec/Write
Physical Disk – Disk Reads/sec
Physical Disk – Disk Writes/sec
Processor – % Processor Time
SQLServer: Buffer Manager – Buffer cache hit ratio
SQLServer: Buffer Manager – Page life expectancy

DISK QUEUE LENGTH should be avoided as a metric on SANs. CPU / memory counters can be important for measuring general bottlenecks. For SAN troubleshooting, I like using Activity Monitor to get a qualitative health check on memory and CPU pressure.
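The IOPS-to-throughput arithmetic above, in code. The counter values are made-up samples, not measurements from any real system:

```python
# Converting between the perfmon counters above (sample numbers are invented):
disk_writes_per_sec = 4000            # PhysicalDisk: Disk Writes/sec (IOPS)
avg_bytes_per_write = 64 * 1024       # PhysicalDisk: Avg. Disk Bytes/Write

# IOPS * transfer size in bytes = bytes per second
throughput_mb_s = disk_writes_per_sec * avg_bytes_per_write / (1024 * 1024)
print(throughput_mb_s)                # 250.0 MB/s

# Avg. Disk sec/Write is already per-I/O latency: 0.005 s means 5 ms per write
avg_disk_sec_per_write = 0.005
print(avg_disk_sec_per_write * 1000)  # 5.0 ms
```

The same three quantities (IOPS, transfer size, latency) are what SQLIO and IOMeter report, so this conversion lets you sanity-check one tool against another.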
SQLServer: General Statistics – User Connections
SQLServer: Memory Manager – Memory Grants Pending
SQLServer: SQL Statistics – Batch Requests/sec
SQLServer: SQL Statistics – Compilations/sec
SQLServer: SQL Statistics – Recompilations/sec
System – Processor Queue Length

DMVs useful for checking for slowness. Remember, some stats are based on ticks — use stat * @@TIMETICKS for microseconds, or cumulative ticks (select cpu_ticks_in_ms, cpu_ticks from sys.dm_os_sys_info):
Tool | Monitors | Granularity
sys.dm_os_wait_stats | PAGEIOLATCH waits | SQL Server instance level
sys.dm_io_virtual_file_stats | Latency, number of I/Os | Hot database files
sys.dm_exec_query_stats | Number of reads (logical or physical) and writes (logical) | Query or batch stats
sys.dm_db_index_usage_stats | Number of I/Os and type of access (seek, scan, lookup, write) | Index or table hits
sys.dm_db_index_operational_stats | PAGEIOLATCH waits | Index or table
sys.dm_os_io_pending_ios | Pending I/O requests at any given point in time | File (per I/O request)
REMEMBER: STATS ARE CUMULATIVE. IF YOU REBOOT OR FAIL OVER, YOU RESET THEM.

How do I use SQLIO to measure performance?
1. SQLIO is completely command-line based.
2. Use >> in a .BAT file to save out stats.
3. Read the SQLIO.RTF file that comes with it. Very useful.
4. Edit the C:\Program Files (x86)\SQLIO\param.txt file to specify where your test files are located.
5. Configure your .BAT file to run, or run it all day to gather info.
6. Use SSIS or some app to import the data. Doing it by hand can take a while.

[options] may include any of the following:
-k<R|W> kind of IO (R=reads, W=writes)
-t<threads> number of threads
-s<secs> number of seconds to run
-d<drv_A><drv_B>.. use same filename on each drive letter given
-R<drv_A/0>,<drv_B/1>..
raw drive letters/number for I/O
-f<stripe factor> stripe size in blocks, random, or sequential
-p[I]<cpu affinity> cpu number for affinity (0 based) (I=ideal)
-a[R[I]]<cpu mask> cpu mask for (R=roundrobin (I=ideal)) affinity
-o<#outstanding> depth to use for completion routines
-b<io size(KB)> IO block size in KB
-i<#IOs/run> number of IOs per IO run
-m<[C|S]><#sub-blks> do multi blk IO (C=copy, S=scatter/gather)
-L<[S|P][i|]> latencies from (S=system, P=processor) timer
-B<[N|Y|H|S]> set buffering (N=none, Y=all, H=hdwr, S=sfwr)
-S<#blocks> start I/Os #blocks into file
-v1.1.1 I/Os runs use same blocks, as in version 1.1.1
-F<paramfile> read parameters from <paramfile>

CONFIGURING SQLIO FOR TESTING
Edit C:\Program Files (x86)\SQLIO\PARAM.TXT, adding or deleting lines as needed. The format is PATH | number of threads per file | MASK | size of file in MB:
c:\sqlio_test.dat 4 0x0 100
d:\sqlio_test.dat 4 0x0 100

CREATING A BAT FILE FOR TESTING
sqlio -kW -s10 -frandom -o8 -b8 -LS -Fparam.txt
sqlio -kW -s360 -frandom -o8 -b64 -LS -Fparam.txt
sqlio -kW -s360 -frandom -o8 -b128 -LS -Fparam.txt
sqlio -kW -s360 -frandom -o8 -b256 -LS -Fparam.txt
sqlio -kW -s360 -fsequential -o8 -b8 -LS -Fparam.txt
sqlio -kW -s360 -fsequential -o8 -b64 -LS -Fparam.txt
sqlio -kW -s360 -fsequential -o8 -b128 -LS -Fparam.txt
sqlio -kW -s360 -fsequential -o8 -b256 -LS -Fparam.txt
Reading a line left to right: write | 360 seconds | random or sequential | number of outstanding I/O requests | I/O block size | which param file to use.
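The sweep above can be generated rather than typed by hand. A sketch that emits the same flag pattern (the -s10 warm-up line from the example is left out; add it back if you want it):

```python
# Generate an SQLIO sweep like the .BAT file above: write tests, random then
# sequential, across the same four block sizes. Flags mirror the example.
kinds  = ["random", "sequential"]
blocks = [8, 64, 128, 256]            # -b: I/O size in KB

lines = []
for kind in kinds:
    for kb in blocks:
        lines.append(f"sqlio -kW -s360 -f{kind} -o8 -b{kb} -LS -Fparam.txt")

print(len(lines))   # 8 runs
print(lines[0])     # sqlio -kW -s360 -frandom -o8 -b8 -LS -Fparam.txt
```

Writing `"\n".join(lines)` to a .BAT file (and appending `>> somelog.txt` to each line) gives you the save-out-stats pattern the steps above recommend.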
You can also use -F<drive>:\testfile.dat

READING THE RESULTS:
sqlio -kR -s360 -fsequential -o8 -b8 -LS -Fparam.txt >> C:\SQLIOLOG.TXT

sqlio v1.5.SG
using system counter for latency timings, 2208056 counts per second
parameter file used: param.txt
file U:\testfile.dat with 10 threads (0-9) using mask 0x0 (0)
10 threads reading for 360 secs from file U:\testfile.dat
using 8KB sequential IOs
enabling multiple I/Os per thread with 8 outstanding
using specified size: 10000 MB for file: U:\testfile.dat
initialization done
CUMULATIVE DATA:
throughput metrics: IOs/sec: 32282.80 MBs/sec: 252.20
latency metrics: Min_Latency(ms): 0 Avg_Latency(ms): 2 Max_Latency(ms): 567
histogram:
ms: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24+
%: 58 15 8 5 3 2 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1

PARSING THE RESULTS:
PERL: http://sqlblog.com/blogs/linchi_shea/archive/2007/02/21/parse-the-sqlio-exe-output.aspx
POWERSHELL: http://sqlblog.com/blogs/jonathan_kehayias/archive/2010/05/25/parsing-sqlio-output-to-excel-charts-using-regex-in-powershell.aspx

IOMETER: Don't be your SAN administrator's "B@#$%^". BIG BUCKS, NO WHAMMIES!! Standard files in the directory.

LOW-HANGING FRUIT TROUBLESHOOTING 101, and how not to look like an idiot.
SAN-A-GEDDON! What do you do when your SAN's PRIMARY and SECONDARY CONTROLLERS go? Never trust the guy servicing your expensive hardware. Grill him properly: securely fasten his hands and legs to limit motion, inject sodium pentothal, and have your DR SOLUTION READY!
1. Create a temporary cluster in case of a "Mega Whoops", like when the tech copies the bad config from the bad card onto the good card.
2. Validate, and never believe it when they say it's only going to be 30 minutes.
3. If you have a DR solution, make sure it's up to date, and possibly perform a non-failover recovery plan to time the outage.
4. In e-commerce solutions, TIME = MUCHO DINERO! Time is money; outages are doubly so. Cost = hourly wage + lost revenue + loss of customer confidence.
5.
Remember, the guy working on your SAN is being sent out as a contractor for the bigger company, and that company is most likely the lowest bidder.
6. Always have a Mega Whoops plan.
7. Contact your local mafia don to find out what body-disposal options are available if and when the IBM tech goes "missing" after he brings your enterprise to a screeching halt.
8. Remember, 1:10,000 can happen to you!

I AM A SAN, I AM, AND I DON'T LIKE GREEN EGGS AND HAM.
QA: BLAME THE SAN WHEN IT'S THE SAN… CHECK YOUR SERVER TOOLS FIRST. The best way I've found to compare a disk on the server with a virtual disk is size. RESOURCE MONITOR IS YOUR FRIEND.
SOMETIMES YOUR TOOLS LIE!
• Remember: trust no one and always keep your laser handy!
DELL LIES… THEY ALWAYS LIE! THEY LIE! The TEMPDB is on the same HD span as your page file… NO ES MUY BUENO ("not good at all")! Always check that the guy who built it didn't make a mistake. Notice the drive says DELL — a dead giveaway that you're not on the SAN. You can confirm it with the location ID.

Further Reading:
1. http://www.brentozar.com/sql/sql-server-san-best-practices/
2. http://technet.microsoft.com/library/Cc966412
3. http://www.hds.com/assets/pdf/best-practices-for-microsoft-sql-server-on-hitachiuniversal-storage-platform-vm.pdf (HITACHI SANs)

SQL CAT (Customer Advisory Team) links:
1. http://blogs.msdn.com/cfs-file.ashx/__key/communityserver-componentspostattachments/00-09-45-2765/Mike_5F00_Ruthruff_5F00_SQLServer_5F00_on_5F00_SAN_5F00_SQLCAT.zip

Books:
Troppens, Ulf, Rainer Erkens, Wolfgang Mueller-Friedt, Rainer Wolafka, and Nils Haustein. "Storage Networks Explained: Basics and Application of Fibre Channel SAN, NAS, iSCSI, InfiniBand and FCoE", Second Edition. John Wiley & Sons, © 2009.
W. Curtis Preston. "Using SANs and NAS: Help for Storage Administrators". O'Reilly, © 2002.

THIS PRESENTATION IS BROUGHT TO YOU BY: An Angry Squirrel & DOCTOR STEEL