I/O Types and Usage in DoD
Henry Newman
Instrumental, Inc. / DoD HPCMP / DARPA HPCS
May 24, 2004
© Copyright 2004 Instrumental, Inc.

Evaluation Issues
- What are the scaling problems?

Facts About Performance (1)

System Feature     | 1977                     | 2004
CPU Performance    | CDC 7600, 25 MFLOPS      | Earth Simulator, 40 TFLOPS
Disk Technology    | CDC Cyber 819, 3,600 RPM | Seagate Cheetah, 15K RPM
Disk Density       | 80 MB                    | 146 GB
Disk Transfer Rate | 3 MB/sec half duplex     | 71.5 MB/sec avg. per disk; 200 MB/sec full duplex (RAID)
Disk Seek+Latency  | 24 ms                    | 6.0 ms write / 5.6 ms read

Facts About Performance (2)

Item                 | Times Increase
CPU                  | 1.6M
RPMs                 | 4.1
Density              | 1,814
Transfer Rate (disk) | 23.8
Transfer Rate (RAID) | 133
Seek+Latency (read)  | 4.3

Device Utilization
Disk utilization by request size (see the note on service-time arithmetic below):

Request Size (bytes) | Utilization, 15K RPM | Utilization, 10K RPM
1024                 | 0.19%                | 0.14%
4096                 | 0.76%                | 0.56%
8192                 | 1.51%                | 1.11%
16384                | 2.97%                | 2.20%
65536                | 10.92%               | 8.24%
131072               | 19.69%               | 15.22%
262144               | 32.89%               | 26.42%
524288               | 49.50%               | 41.80%

Tape Facts

Vendor     | Drive     | Media        | Year Introduced | Capacity (MB)    | Peak Transfer Rate (MB/sec, uncompressed) | Performance Increase
IBM        | 3420      | Reel-to-Reel | 1974            | 150              | 1.25 | 1.00
IBM        | 3480      | 3480         | 1984            | 200              | 3    | 2.40
IBM        | 3490      | 3480         | 1989            | 200              | 3    | 2.40
IBM        | 3490E     | 3480         | 1991            | 400              | 4.5  | 3.60
IBM        | 3490E     | 3490E        | 1992            | 800              | 4.5  | 3.60
IBM        | 3490      | 3490E        | 1995            | 800              | 6    | 4.80
IBM        | 3590      | 3590         | 1995            | 10,000           | 9    | 7.20
StorageTek | SD-3      | SD-3         | 1995            | 50,000           | 11   | 8.80
StorageTek | T9840A    | 9840         | 1998            | 20,000           | 10   | 8.00
IBM        | 3590E     | 3590E        | 1999            | 20,000           | 14   | 11.20
IBM        | 3590E     | 3590E        | 2000            | 40,000           | 14   | 11.20
StorageTek | T9940A    | 9940         | 2000            | 60,000           | 14   | 11.20
LTO        | LTO       | LTO          | 2000            | 100,000          | 14   | 11.20
Sony       | GY-8240FC | DTF-2        | 2000            | 200 GB/60 GB *** | 24   | 19.20
StorageTek | T9840B    | 9840         | 2001            | 400,000          | 20   | 16.00
StorageTek | T9940B    | 9940         | 2002            | 2,000,000        | 30   | 24.00
IBM        | 3590H     | 3590         | 2002            | 600,000          | 14   | 11.20
LTO        | LTO-II    | LTO          | 2003            | 2,000,000        | 35   | 28.00

File System Concerns
- Data fragmentation and allocation
- Metadata fragmentation and allocation
- Recovery from crash or metadata loss
- Performance that scales
- Support for >2 TB LUNs
- Failover

Fragmentation
- Fragmentation is becoming a performance problem as file systems grow
- No major technology enhancements have been seen in decades
  - Object Storage Device (OSD, the new T10 spec) will change this
- Fragmentation of metadata can have a dramatic impact on performance
  - Recently observed a 600x slowdown in access at a site

USG Types of Requirements
What is DoD currently using?
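A note on the Device Utilization figures shown earlier: they follow from simple service-time arithmetic, where utilization is the data-transfer time divided by the total time to service one request (transfer plus seek plus rotational latency). Below is a minimal sketch of that calculation in Python; the parameter values are illustrative assumptions taken loosely from the Facts About Performance table (roughly a 15K RPM drive), so the output will be close to, but not identical to, the chart's figures.

```python
# Sketch of the service-time arithmetic behind the Device Utilization chart.
# Utilization = time spent transferring data / total time servicing the request.
# Parameter values are assumptions for illustration, not the chart's exact inputs.

TRANSFER_RATE = 71.5e6      # bytes/sec: avg. sustained rate per disk (from the table)
SEEK_PLUS_LATENCY = 5.6e-3  # seconds: avg. read seek + rotational latency (15K RPM)

def utilization(request_bytes: int) -> float:
    """Fraction of the request's service time spent moving data."""
    transfer_time = request_bytes / TRANSFER_RATE
    return transfer_time / (transfer_time + SEEK_PLUS_LATENCY)

for size in (1024, 4096, 8192, 16384, 65536, 131072, 262144, 524288):
    print(f"{size:>7} bytes: {utilization(size):6.2%}")
```

The point of the chart falls out immediately: small requests leave the drive seeking almost all of the time, and only requests of several hundred KB push utilization toward 50%.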
Current Types of Requirements (1)
- Database
  - Used by most sites, big and small, for data reference, especially in the intelligence community
  - Not used much by the MSRCs
- Real-time data capture
  - A requirement in the intelligence community
- Application
  - Homogeneous shared file system access

Current Types of Requirements (2)
- Archival
  - Used by the MSRCs
  - Intelligence community
- Process flow
  - Used after real-time capture
  - Could be used by the MSRCs if a shared file system between the HPC and HSM systems were implemented

Database
- 4, 8, or 16 KB I/Os for indexes
  - Random
- 64 KB I/Os for log updates
  - Sequential
- Read and write
  - Up to 256 KB
- Just about everyone uses a database somewhere in their HPC systems
  - Although some don't have performance requirements

Real-Time Data Capture (1)
- Large block requirement
  - 4 MB-128 MB I/O requests
- Small block requirement
  - 1 KB-8 KB files, with millions of files per day
- Multiple threads
  - 4-8 threads to keep the devices busy with either type of I/O (a sketch of this pattern follows the Conclusion slide)

Real-Time Data Capture (2)
- Generally requires an HSM
- Usually needs hundreds of MB/sec, 7x24, to meet the requirements for capture
- Everything must run at rate
  - I/O bus
  - RAID devices
  - Switches
  - HBAs
- Limitations of tape bandwidth are pushed

Application
- Homogeneous shared file system access
  - Must be able to get the data from the nodes to a single file over Fibre Channel
- High-performance I/O from those nodes
  - Depends on the application, but given that GPFS peak is about 400 MB/sec, that seems to be the current requirement
- Support for a few hundred thousand files
  - Nowhere near the HSM requirements

Archival
- Large HSM systems
  - The MSRCs are a good example
- High-speed networks
- TCP/IP (ftp) data movement
- Future movement to shared file systems, which will make these look more like real-time capture requirements

Process Flow
- Applications and processes that are run in an assembly-line-like fashion
- Each step uses a machine or machines, sometimes specialized, to move the task along
- Data communication via a shared file system, with multithreaded large-block I/O requests from each of the hosts to various data sets

Current MSRC Requirements
- Homogeneous shared file system for applications running on the HPC system
- HSM support and access via TCP/IP
- Process flow should be supported for visualization
- Support for database, but no performance requirement

Conclusion
- The future for HPCS machines and most application environments will be shared file systems
- Shared file systems were pioneered in the real-time capture world
- Large file systems are seeing problems with fragmentation and scaling

© Copyright 2004 Instrumental, Inc.
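As a companion to the Real-Time Data Capture slides, here is a minimal sketch of the multi-threaded, large-block write pattern they describe: several worker threads issuing multi-MB requests to keep the devices busy. The file path, block size, block count, and thread count below are illustrative assumptions, not values from the briefing.

```python
# Sketch of the multi-threaded, large-block capture pattern described above.
# Each worker writes its own contiguous region of a shared file with large
# positioned writes, so the streams do not interleave.
# Path, block size, block count, and thread count are assumptions.

import os
from concurrent.futures import ThreadPoolExecutor

PATH = "/capture/stream.dat"   # hypothetical target on the shared file system
BLOCK_SIZE = 4 * 1024 * 1024   # 4 MB requests (slides cite 4 MB-128 MB)
BLOCKS_PER_THREAD = 256        # 1 GB per worker in this example
NUM_THREADS = 8                # slides cite 4-8 threads to keep devices busy

def writer(fd: int, thread_id: int) -> None:
    """Write this worker's region of the file using large pwrite() calls."""
    data = bytes(BLOCK_SIZE)                        # placeholder payload
    base = thread_id * BLOCKS_PER_THREAD * BLOCK_SIZE
    for i in range(BLOCKS_PER_THREAD):
        os.pwrite(fd, data, base + i * BLOCK_SIZE)  # offset-based, so the fd can be shared

fd = os.open(PATH, os.O_WRONLY | os.O_CREAT, 0o644)
try:
    with ThreadPoolExecutor(max_workers=NUM_THREADS) as pool:
        futures = [pool.submit(writer, fd, t) for t in range(NUM_THREADS)]
        for f in futures:
            f.result()  # surface any I/O errors from the workers
finally:
    os.close(fd)
```

In a real capture system each thread would receive data from the acquisition source rather than writing a zero-filled buffer, and the block size and thread count would be tuned so the I/O bus, RAID devices, switches, and HBAs all run at rate, as the slides require.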