Advances in Data Storage Technologies

Thomas M. Ruwart
University of Minnesota Digital Technology Center
Intelligent Storage Consortium
April 23, 2004
tmruwart@dtc.umn.edu

Overview
  - Opening thoughts
  - Main-stream technologies
    - Storage – where data resides when it is not being manipulated or moved around
    - Transports – how data is moved between other components and how components are physically connected together
    - Protocols – the “language” that components use to talk to each other
    - Software – the control of what happens throughout the system
  - Closing thoughts

Opening Thoughts
  - Technologies – The underlying components used to build products using a given architecture
  - Architectures – The way technologies are put together to solve a problem
  - Applications – define the scope of requirements and associated problems to be addressed

An orthogonal thought
  - There are many interesting “technologies” that can be incorporated into “products”
  - Products are what sells
  - This presentation describes
    - Past and current “products” and associated technologies
    - The evolution of various technologies and architectures that may or may not become products

Storage
  - Disk Drives
  - ATA Disk Technology
  - SCSI Disk Technology
  - Tape Technology
  - DVD (optical) Technology
  - MEMS
  - Solid State

Disk Drives in General
  Definition
    - “Winchester” disks have the rotating rigid media and read/write heads plus actuator enclosed in an air-tight case
  Current Status
    - Disk drives have been around since 1957 (IBM RAMAC 305)
    - Areal density is about 60 Gbit/in²
    - Current capacities are in the 300 GB range for 3.5-inch ATA-class drives
    - Rotation speeds are at 7200 RPM for ATA, 15000 RPM for SCSI
    - Form factors are 3.5-inch for desktop/enterprise, 2.5-inch for mobile, some 1-inch
    - Interfaces include ATA, SATA, Parallel SCSI, and FC

Disk Drives continued…
  Technology Evolution
    - Perpendicular recording
      - Will operate in the 100-200 Gbit/in² areal density range
      - 2 to 4 years out
    - New media types
      - Patterned media
      - Tilted perpendicular media
      - Self-organized media
    - Smaller form factors – move from 3.5-inch to 2.5-inch form factors
      - Lower power requirements
      - Higher manufacturing yields
      - Higher areal densities
      - Higher RPM drives == lower access latency
    - Serial interfaces
      - Serial ATA (SATA)
      - Serial Attached SCSI (SAS)
      - Fibre Channel (FC) – 4 and 10 Gbit/sec
    - Protocols
      - Object-based Storage Device

HDD Technologies for the future
  [Figure panels: Perpendicular recording; Patterned media, 33 nm bits (Bruce Terris, HGST); Heat assisted magnetic recording uses both laser and field to record (T. McDaniel, Seagate); Self-organized magnetic arrays (D. Weller, Seagate). Courtesy of INSIC and Tarnotek, Inc.]

Capacity Growth: Sustainable?
  [Figure: HDD Capacity vs. Time, 95 mm desktop, 7,200 rpm. HDD capacity (GB), 1 to 1000, against time, Jan-98 to Jan-09, for HGST, Maxtor, Seagate, and WDC, with Seagate’s plans and the INSIC 1 Tb/inch² areal density demo goal marked. (C) 2003 TarnoTek. Courtesy of INSIC and Tarnotek, Inc.]

Precipitous decline in $/GB
  [Figure: Cost per Gigabyte, 95 mm desktop, 7,200 rpm. $/GB, 0.01 to 100.00, against time, Jan-98 to Jan-09; series “All HDD” and “Early HDD”; exponential trend fit with R² = 0.9626; annotated decline of 44%/year. (C) 2003 TarnoTek. Courtesy of INSIC and Tarnotek, Inc.]

The Future of Hard Disk Drive Technology
  [Figure: Lab Demos: Possible HDD Areal Density Progression. Areal density (Gb/in²), 1.0 to 100000.0, against date, Jan-90 to Jan-17; laboratory demonstrations and highest density in products plotted against CAGR trend lines; the 1 Terabit per inch² goal is marked; labeled technologies are perpendicular recording, heat-assisted magnetic recording, patterned media, and self-organized arrays. Courtesy of INSIC and Tarnotek, Inc.]

Disk Arrays in General
  Definition
    - RAID – Redundant Array of Independent (Inexpensive) Disks
    - Aggregation of disk drives to operate as a large single storage device
    - Used to improve reliability, availability, serviceability, and performance
  Current Status
    - Disk arrays have been around since 1988
    - Interface is primarily 2 Gbit Fibre Channel
  Technology Evolution
    - Recent developments in MAID – Massive Arrays of Independent (Inexpensive) Disks
    - MAID takes advantage of smaller form factor disk drives – lots of them
    - Intended to address problems associated with multiple drive failures

ATA/IDE Disk Technology
  Definition
    - Inexpensive interface used for consumer-grade disk drives
    - ATA stands for AT Attachment, IDE stands for Integrated Drive Electronics. The two terms are used interchangeably
    - Primarily 3.5-inch and 2.5-inch form factors (5.25-inch for optical devices)
  Current Status
    - ATA disk drives are one half to one third the cost of equivalent capacity SCSI disks and can result in lower overall equipment costs for large deployments
    - ATA cost levels make it an attractive alternative or augmentation to tape
  Technology Evolution
    - Parallel ATA interface technology has been around for many years and has matured to the point where it is very commonplace.
    - ATA disk technology is as mature as SCSI drive technology but is a very different drive technology.
    - Serial ATA is the next evolutionary step in ATA technology and is now available in production quantities.
    - Serial ATA improves upon Parallel ATA performance, addressing, and other limitations while still maintaining the cost effectiveness of Parallel ATA.

ATA/IDE Disk Technology (cont.)
  Technology Availability
    - ATA disk interfaces are available on both standard Winchester hard disks and on CD and DVD devices.
    - ATA disk drives of various capacities and form factors are available from multiple manufacturers: Seagate, Maxtor, Western Digital, Hitachi Global Storage (formerly IBM Storage), Fujitsu, NEC, and Samsung
  Comments
    - ATA disk drives are NOT simply a SCSI disk drive with a different interface. ATA (consumer or personal storage) and SCSI (enterprise-class storage) are designed with very different goals in mind.
    - ATA disks are designed to minimize cost, maximize volume, and are intended to operate as single units. SCSI disks are designed to maximize performance and reliability as well as being able to operate in arrays.
    - The Winchester disk industry is moving toward a 2.5-inch form factor very similar to the disk drives in laptop computers. One significant implication is that there will be a much larger number of disk units (roughly four times as many) to manage in an overall installation. This move toward smaller form factors is motivated by higher areal densities and media yield.
  Reference Web Sites
    - Serial ATA – www.serialata.org

SCSI Disk Technology
  Definition
    - High-performance interface used for enterprise-class disk drives
    - SCSI stands for Small Computer System Interface
    - Primarily 3.5-inch and 2.5-inch form factors (5.25-inch for optical devices)
  Current Status
    - SCSI disk drives are principally used in disk arrays and in applications that require consistent high bandwidth, lower latency, and higher reliability than can be provided by ATA/IDE disk drives.
  Technology Evolution
    - SCSI interface technology has been around for almost 20 years and has matured to the point where it is the disk drive interface of choice for enterprise-class storage.
    - Serial Attached SCSI (SAS) is the next evolutionary step in SCSI technology and will be available in production quantities in late 2004 or early 2005.
    - SAS provides some electrical compatibility with SATA.
  Technology Availability
    - SAS has been announced by Seagate on the new 2.5-inch enterprise-class disk drive.
  Reference Web Sites
    - SCSI – www.t10.org
    - Fibre Channel – www.t11.org
    - iSCSI – www.ietf.org

Tape Technology
  Definition
    - Magnetic recording
    - Linear and helical scan
  Current Status
    - Used for backup and long-term data archival
    - High density, low cost, very durable
    - Potentially high latency, perhaps an issue
  Technology Evolution
    - Magnetic tape is well understood – it has been around since 1953
    - Magnetic tape follows disk density and performance curves
    - LTO (Linear Tape Open) is getting a lot of market share
      - Drive consortium of IBM, Seagate and HP
      - Tape manufacturers include Maxell, Fujifilm, TDK, Imation, Sony, Emtec (BASF), Verbatim, …etc.
      - www.lto.org
    - IBM 3490 tape technology is the gold standard in enterprise-class tape
  Comments
    - Technology refresh is a significant issue with large data archives

Tape Technology (cont.)
  LTO Roadmap (18 - 24 months between generations)

                      Generation 1   Generation 2   Generation 3   Generation 4
    Native GB         100            200            400            800
    Native MB/s       10-20          20-40          40-80          80-160
    Recording Method  RLL 1,7        PRML           PRML           PRML
    Media             MP             MP             MP             Thin Film
    Available         2000           2002           2004           2006

DVD
  Definition
    - Digital Versatile Disc
  Current Status
    - Long term data archival
    - High latency data
  Technology Evolution
    - Capacity
      - Now - 4.7 GB per side, 9.4 GB per disk double sided
      - Soon - 27 GB per side (blue-laser)
      - Ultimately - 50 GB per side
    - Transfer rates
      - Current write = 3.3 MB/sec, read = 10.8 MB/sec
      - Assume technology will advance (e.g., like CDRW)
  Technology Availability
    - Drives and robotics are common
  Comments
    - Long media life, reasonably durable
    - Exploits the consumer market for media

MEMS-based Storage Technology
  Definition
    - MEMS - Micro Electro Mechanical Systems
  Current Status
    - Still in the technology demonstration mode
  Technology Evolution
    - Research at CMU
  Technology Availability
    - Not available for another few years
  Reference website
    - www.pdl.cmu.edu

Solid State Storage Technology
  Definition
    - Non-volatile for permanent data storage
    - Read/write/re-write
  Current Status
    - Most popular types
      - CompactFlash (CF)
      - Smart Media (SM)
      - Secure Digital Memory Card (SD)
      - Memory Stick (Sony)
      - USB Memory devices
    - Slow write speeds, moderate read speeds
  Technology Evolution
    - Driven by the consumer electronics market
    - Continue to follow the density curves of integrated circuits
    - Factor of 10-100 times the price/MB of magnetic storage
  Technology Availability
    - Consumer products

Transports
  - Fibre Channel
  - Gigabit Ethernet
  - TCP Offload Engines (TOEs)
  - System Area Networks

Gigabit Ethernet
  Definition
    - Ethernet at speeds of 1,000 Mbit/sec (1GigE) to 10,000 Mbit/sec (10GigE)
  Current Status
    - Ethernet is the network transport of choice
    - 1GigE is the overwhelming favorite for high-speed LANs
    - 10GigE close behind
  Technology Evolution
    - IEEE 802.3 with 802.3ae defining the 10 Gbit/sec modifications (6/17/02)
  Technology Availability
    - Cisco Catalyst 6500 Serial 1550nm 10 Gigabit Ethernet Module
    - Intel® PRO/10GbE LR Server Adapter
    - Intel® TXN17401 Optical Transceiver
  Reference Web Sites
    - www.10gea.org/index.htm
  Comments
    - 10GigE products are currently shipping

Fibre Channel
  Definition
    - High speed data transport (physical, encoding, framing protocol)
    - Extensible to miles (100 km) with special GBICs and special 9-micron single-mode fiber
  Current Status
    - Currently the Storage Area Network interconnect of choice
  Technology Evolution
    - 1 and 2 Gbit shipping on disk drives and arrays
    - 4 Gbit may be shipping on disk drives in the near future
    - 10 Gbit/sec standard is mature and is intended for disk arrays
  Vendors/Products
    - Multiple
  General Reference Web Sites
    - Fibre Channel Industry Association (www.fibrechannel.org, www.t11.org)
    - Technical Committee T11 (www.t11.org)
  Comments
    - Synchronization of HBAs, switches and storage

TOE
  Definition
    - A TOE (TCP Offload Engine) is a chip or a board that handles the TCP protocol stack without utilizing host CPU resources
  Current Status
    - Reduces processing load on nodes that are connected via GigE and handle heavy traffic through this interface
    - Some TOEs have iSCSI protocol engines that accelerate the SCSI command protocol processing for iSCSI storage devices
  Technology Availability
    - Adaptec AIC-7211 1Gb ASIC with full TCP/IP offload
    - Lucent TA1000 manufactured by Intel

System Area Networks
  Definition
    - A network used to connect nodes together within a single system (computer room environment) that has the following operational characteristics
      - Very Low Latency (~1 µsec)
      - High Bandwidth (>1 Gigabit/sec)
      - Support for Atomic Operations
      - Remote DMA capability (RDMA)
      - Low Overhead
    - Allow the construction of NUMA-like systems with *standard* hardware
  Current Status
    - System Area Networks are being employed as the interconnect for compute and storage clusters

System Area Networks (cont.)
  Technology Evolution
    - Two companies, Myricom and Quadrics, have been producing system area networks for a number of years with some success.
    - Several other companies have more recently come up with products in the VIA (Virtual Interface Architecture) but these devices only support RDMA and no atomic operations.
    - The most widely accepted system area network technology is InfiniBand with more than 200 companies building various pieces of that technology.
    - Other companies such as AMD and Motorola have developed competing technologies, HyperTransport and Rapid I/O respectively, originally intended to be restricted to within the confines of a single “box” but have since defined connectors and cables to allow them to interconnect boxes as well.

System Area Networks (cont.)
  Technology Availability
    - InfiniBand is available from a variety of companies. It is best to go to the InfiniBand Trade Association website to see who these companies are and what categories their products fall under (e.g., switches, NICs, software, …etc.)
    - HyperTransport is slightly ahead of InfiniBand on the maturity curve but also has slightly different applicability.
    - Rapid I/O is on the same track as HyperTransport and is available from several vendors.
    - VIA hardware adapters are available in different forms from Qlogic, Emulex, Intel, and Troika Networks
    - Quadrics is a proprietary system area network and all the adapters and relevant software are available from Quadrics.
    - Myrinet hardware (NICs and switches) is available from Myricom; however, *all* the associated software, interface protocols, and APIs are publicly available.

System Area Networks (cont.)
  Comments
    - Evolving storage system architectures will incorporate System Area Networks
    - Storage “devices” will become peers on the System Area Network
  Reference Web Sites
    - InfiniBand – www.infinibandta.org
    - HyperTransport – www.hypertransport.org
    - Virtual Interface Architecture (VIA) – www.intel.com/design/servers/vi
    - Myrinet – www.myri.com
    - Quadrics – www.quadrics.com

Protocols
  - iSCSI
  - OSD
  - NFS/CIFS

iSCSI
  Definition
    - Internet Small Computer System Interface (iSCSI) protocol
    - Encapsulated SCSI over IP
  Current Status
    - Important protocol for block storage applications
    - Can be thought of as an inexpensive alternative to Fibre Channel
  Technology Evolution
    - Number of early release products
      - Some driver based
      - Some with specialized hardware (TCP offload)
    - Limited commercial success
    - Security and discovery remain a problem
    - Draft IETF iSCSI
      - www.ietf.org/internet-drafts/draft-ietf-ips-iscsi-14.pdf

iSCSI (cont.)
  Technology Availability
    - Adaptec ASA-7211 iSCSI Adapter (Mid 2002)
    - Intel Pro 1000 T IP
      - www.intel.com/network/connectivity/products/iscsi/index.htm?iid=ipp_home+netcon_iscsi&
  Reference Web Sites
    - iSCSI_network_storage.pdf
    - www.ietf.org/html.charters/ips-charter.html
  Comments
    - Replaces need for Fibre Channel, if accepted by customers and industry
    - Uphill battle - many camps for and against.

OSD – Object-based Storage Devices
  Definition
    - Object-based Storage Devices – A protocol for accessing data on storage devices
  Current Status
    - OSD can have a significant impact on helping to solve many of the issues that arise in building scalable, high performance storage systems
  Technology Evolution
    - First release of the T10 (SCSI) OSD standard specification has been submitted to the T10 committee
  Technology Availability
    - Nothing commercially available yet – some product from some companies is available
    - Many large storage companies looking into it
    - An OSD reference code is available at
      - http://www.sourceforge.net/projects/intel-iscsi

What is OSD?
  - Object-based Storage Devices – An Enabling Technology
  - Grew out of the Network Attached Secure Disks (NASD) project at CMU
  - A flexible and powerful protocol used to communicate with storage devices
  - Proposed as a protocol extension to the SCSI command set
  - Actively being pursued by the OSD Technical Working Group in the Storage Networking Industry Association (SNIA)
  - It is a natural step in the evolution of storage interface protocols
  - For some, however, it is very new and very different
  [Figure: timeline of storage interface protocols: ST506, SMD, SCSI, FC SCSI, and SCSI OSD (200X).]

What OSD is NOT
  - It is not intended or expected that the object abstraction be a complete file system
  - There is NO notion of
    - Naming
    - Hierarchical relationships
    - Streams
    - File system style ownership, access control
  - The omitted features are assumed still to be the responsibility of the OS file system

OSD System Architecture
  [Figure: DAS Architecture versus OSD Architecture. Labels in the DAS stack: I/O Application, File System User Component, File System Storage Component, SCSI, Block Storage Device. Labels in the OSD stack: I/O Application, File System User Component, SCSI, File System Storage Component, Block Storage Device.]

The General Application: Storage Architectures Today
  [Figure: three architectures, each with I/O applications and a file system connected to a storage device through an interconnect: Direct Attached Storage (blocks), Network Attached Storage (files), and Storage Area Network (blocks).]
  - Architecture defined by location of storage system & devices

File System Components
  - User File System Component
    - Hierarchy Management
    - Naming
    - User Access Control
    - Data Properties (Attributes)
  - File System Storage Component
    - Free space management
    - Storage allocation for data entities
    - Attribute Interpretation

How OSD works
  [Figure: labels are I/O Application, File System User Component, OSD Manager, Object Location, Security, Data Transfer, SCSI, File System Storage Component, and Block Storage Device.]

What problems are being solved?
  - Depends on the APPLICATION
  - Different people are trying to solve different problems for different reasons
    - Storage Device Utilization
    - Data Management
    - Cost
    - Reliability
    - Device Management
    - Performance
    - Security
    - Availability
    - Maintainability
    - Extensibility
  - Restate the question: What problems CAN be solved with OSD?

What CAN be addressed by OSD
  - Improved storage management
    - Self-managed, policy-driven storage (e.g., backup, recovery)
  - Improved device and data sharing
    - Shared devices and data across OS platforms
  - Improved storage performance
    - Hints, QoS, Differentiated Services
  - Improved scalability (and not just capacity)
    - Of performance and metadata (e.g., free block allocation)
  - Current block-based access protocols and associated file systems are 30 years old (that’s 210 in dog-years).
  - OSD has the potential to make a significant impact on the Extensibility of a Storage System Architecture

Intelligent Storage
  Definition
    - Assume an Object-based Storage Device
    - Storage Device is “aware” of the data objects it stores
    - An Intelligent storage device can manipulate its data objects and potentially the “contents” of the data object
  Current Status
    - Pre-competitive research
    - Several organizations involved
      - University of Minnesota DTC Intelligent Storage Consortium
      - CMU Parallel Data Lab
      - UC Santa Cruz Storage Research Center
      - UCSD Center for Magnetic Recording Research
  Technology Evolution
    - Intelligent Storage is a natural evolution of OSD.

Why Intelligent Storage?
  - From the storage device manufacturer and storage vendor’s perspective
    - More room to innovate and differentiate storage devices like disk drives
    - Increase margins – price storage devices based on capability not simply capacity
  - From the User’s perspective
    - Increase in capability more specific to the User’s application space
    - Easier to manage

Things that need to happen
  - Standards – OSD Interface to storage devices
  - Standards – Runtime / execution environment
  - Standards – API from Application to the Intelligent Storage System

NFS/CIFS – Network File System / Common Internet File System
  Definition
    - NFS and CIFS are file sharing protocols
  Current Status
    - NFS and CIFS are the current standard protocols used for file sharing
  Technology Evolution
    - NFS version 3 is the current generally available release.
    - NFS version 4 is under development at the University of Michigan and contains many enhancements that will allow it to exploit OSDs.
    - CIFS is the Microsoft answer to NFS – it essentially does the same thing as NFS
    - CIFS evolution is proprietary to Microsoft
  Technology Availability
    - NFS v3 is available for all OS and platforms
    - CIFS comes with all MS operating systems

Software
  - DAS/SAN/NAS
  - File systems
  - Heterogeneous Shared File Systems
  - Hierarchical Storage Management (HSM)
  - Storage Resource Management
  - Storage Management
  - Virtualization

DAS/NAS/SAN
  Definition
    - These are more architectural terms than technologies
    - DAS – Direct Attached Storage
    - NAS – Network Attached Storage
    - SAN – Storage Area Networks
  Current Status
    - DAS is the oldest and most common storage interconnect
    - NAS is a generic term for NFS or CIFS
    - SAN is the architectural interconnect used to physically share storage devices among several host computer systems
    - DAS and SAN imply block-based storage access protocols such as SCSI or ATA
    - NAS implies a file-based access protocol
  Technology Evolution
    - DAS has been around since the beginning of time
    - NAS became widely used with NFS in the mid 1980’s
    - SAN has only been around since about 1997 and is still somewhat quirky

File Systems
  Definition
    - There are several types of file systems
      - Local file systems that use Direct Attached Storage or dedicated storage on a SAN
      - Network File Systems that subscribe to the NFS or CIFS protocol
      - Shared File Systems that operate on a SAN and allow for concurrent read/write access to files with all other host computer systems on the SAN
    - All these file systems unless otherwise noted are “block-based” file systems – a 30-year-old technology
    - Block-based file systems require the host computer to manage the free space on the storage as well as the allocation of storage blocks to files
    - Block-based file systems have difficulty scaling, particularly in a truly shared environment
    - Object-based File Systems assume that free-space management and space allocation functions are delegated to the Object-based Storage Devices
    - Object-based File Systems scale far better than block-based file systems
    - Object-based File Systems enable the exploitation of Intelligent Storage Devices

Heterogeneous Shared File Systems
  Definition
    - Permits simultaneous file sharing among different (1) Operating Systems and (2) multiple computers at mount point level
  Current Status
    - Key foundation technologies that will ultimately support seamless, geographical data sharing
    - Allows a site to phase in (and out) different client and/or processing systems without affecting the data storage
    - Allows for the growth of data storage subsystems without affecting the client and/or processing systems
  Technology Evolution
    - Products on the market for several years
    - Still acclimating to 24x7 operational, ‘user’ heavy environments
    - None of the file systems fully support the full range of Operating Systems
    - Debate over centralized versus distributed metadata
    - Scalability remains a question
    - Failover sometimes difficult

Heterogeneous Shared File Systems (cont.)
  Technology Availability
    - Several proprietary, somewhat heterogeneous file systems
      - Tivoli’s SANergy™
      - ADIC’s CVFS, now called SNFS
      - SGI’s CxFS
    - GFS by Redhat (formerly Sistina), originally GPL’d but has since taken on a proprietary course
    - Lustre, currently under development as GPL
      - Funded by the DoE (ASCI Path Forward)
      - GPL’d solution in development and available from Cluster File Systems, Inc (Lustre)
  Reference Web Sites
    - Lustre – www.lustre.org
    - SNFS – www.adic.com
    - CxFS – www.sgi.com

Hierarchical Storage Management (HSM)
  Definition
    - Policy-based data migration between storage elements
  Current Status
    - Multiple storage tiers likely given mix of short latency/long archive requirements
  Technology Evolution
    - Traditional HSMs mature
    - SAN HSMs emerging
  Technology Availability
    - Numerous HSM vendors/products
    - SAN HSMs
      - ADIC’s StorNext SAN HSM now shipping
      - Tivoli’s SANergy™/Sun QFS/SAM-FS
  Comments
    - Most interesting and most relevant are emerging Disk-to-Disk systems and storage over IP oriented companies
      - Nexsan Technologies – www.nexsan.com
      - FalconStor Software – www.falconstor.com
    - HSM has been around for many years but has always had trouble getting traction

Storage Resource Management (SRM)
  Definition
    - Storage administration, capacity planning, monitoring, etc.
  Current Status
    - Movement towards centralized control of storage
  Technology Evolution
    - Evolving but disjointed
  Technology Availability
    - Multiple products by multiple vendors
      - McData’s SANavigator
      - AppIQ
  Comments
    - Getting the products to gel to provide a unified, cohesive view of storage
    - SMIS (a.k.a. Bluefin) is a new standard that is getting significant attention

What is SMIS?
  - The Problem
    - Too many management infrastructures!
      - Simple Network Management Protocol for networks
      - Desktop Management Interface for desktops
      - Common Management Information Protocol for telco
      - System Management BIOS for motherboard/BIOS vendors
      - Alert Standard Format for system alarms, …
    - Non-interoperable models, frameworks and policies
      - The model describes what you are managing.
      - The framework allows you to manage the model.
      - And policies say how you can manage the model.
    - We need a unified management infrastructure for the enterprise

So, What is SMIS?!
  - Based on CIM/WBEM and the SNIA Shared Storage Model
  - Provides common interface to SAN resources
  - Services include discovery, monitoring, configuration, security, capacity planning, …
  - Bluefin says how we manage CIM via WBEM
  - Solves the problem of multi-vendor SAN interoperability

SMIS – Storage Management
  Definition
    - SMIS is an object-oriented messaging interface that links distributed management applications (clients) with device management support (agents) to discover, manage, and control devices of any kind
    - A CIM/WBEM-based SAN management framework
    - A SNIA-based standard
  Current Status
    - Will significantly enhance the ability to manage the entire heterogeneous storage environment independent of hardware or software vendor or manufacturer
  Technology Evolution
    - Started 5 years ago in SNIA
    - Taken out of SNIA by the Partnership Development Process – a consortium of 17 companies
    - Rev 1 of the SMIS spec was brought back into SNIA June 2002 for review and approval by all the Technical Working Groups

SMIS – Storage Management (cont.)
  Technology Availability
    - Spec released to the public September 2002
    - Products that are SMIS-compliant are available from a limited number of companies
  Reference Web Sites
    - SNIA – www.snia.org

Storage Virtualization
  Definition
    - Unfortunately, there is no single definition – depends on the vendor
    - Generically, virtualization is an abstraction of physical data storage space
  Current Status
    - Sites are made up of many different vendors’ storage devices and some of the virtualization products allow the pooling of storage devices into a single, larger space for more efficient use of that space
  Technology Evolution
    - Virtualization was hyped during 2001 as a way to decrease the TCO of a storage system but it is becoming commonly believed that this is not the case
    - Debate over ‘in-band’ versus ‘out-of-band’ virtualization
  Technology Availability
    - Range from complete software to a mix of hardware and software products
    - Products from several vendors are currently available
      - StoreAge Networking Technologies
      - DataCore™ Software
  Comments
    - Some of the virtualization products are still only a few years old and have not had time to prove themselves as a success or failure

Cool Stuff Happening in the Storage Industry
  - Future of Data Storage Systems Workshop – April 27-29, UCSD, San Diego CA
  - Intelligent Storage Workshop, May 19-20, UMN/DTC, Minneapolis, MN
  - SNIA OSD TWG, meets monthly
  - StoreCloud, Supercomputing 2004, November 6-12, 2004, Pittsburgh, PA

Other References
  - www.dtc.umn.edu
  - www.insic.org
  - www.datarecoverygroup.com/articles/article3.htm
  - www.actionfront.com/ts_articles.asp
  - History of disk drives
    - www.research.ibm.com/about/past_history.shtml
    - www.research.ibm.com/journal/rd/443/thompson.html
    - www.i-t-s.com/corporate/disk_drive_history.html
    - www.startribune.com/stories/484/4734780.html
  - Future of Data Storage Technologies (NSF/NIST/DARPA project)
    - www.wtec.org/loyola/hdmem/toc.htm
    - www.eetimes.com/sys/news/OEG20030718S0038

Closing Thoughts
  - Hardware technologies are evolving
    - Areal density increasing
    - Form factors shrinking
    - Serial interfaces/transports are replacing parallel interfaces/transports
    - Newer, cooler storage technologies like MEMS are in process
  - Protocol and software technologies
    - Lagging the hardware evolution
    - Block-based access moving toward Object-based protocol in devices and protocols
    - Object-based File Systems are being developed
    - Traditional POSIX file system API is being challenged, reformed

Thank you!
University of Minnesota Digital Technology Center
Intelligent Storage Consortium
www.dtc.umn.edu

Software Issues
  - Operating Systems – Homogeneity and Heterogeneity
    - Between OS types
      - Windows
      - Unix in all flavors
      - Mac
    - Within OS types
      - Linux releases from a single vendor (e.g., RedHat)
      - Linux releases from different vendors (e.g., RedHat vs Suse)
      - Patches from many different Linux Value-Add providers
      - Windows 95/98/NT/2000/XP …etc.

Software Issues (cont.)
  - File Systems Incompatibilities
    - Concurrent support for multiple OS types (Heterogeneous)
      - SANergy™
      - CVFS
      - GFS
    - Striping efficiency of the Virtualization Engine(s)
    - Striping efficiency of the file system/Volume Manager
    - Name spaces
    - Security mechanisms
    - Meta data: Proprietary versus standard
    - Disk storage layout: Proprietary versus standard
  - Application porting issues
    - From one OS to another OS (e.g., Unix to Windows)
    - Software rot
      - Losing source code
      - Losing algorithms
      - Losing compilers

Software Issues (cont.)
  - Hard product functionality/operational limits
    - 1 TB LUN/file system limit for Solaris
      - Note this has secondary impacts on products such as CVFS (shared volume labeling)
    - Veritas file system
      - 1 TB file systems on Solaris
      - 2 TB file systems on HP-UX
    - QFS supports up to 252 LUNs, therefore 252 TB file systems
    - CVFS supports up to 1.84E19 files
    - Number of files in an HSM, etc.
  - Driver (NICs, HBAs, etc.) availability
    - No iSCSI driver for SGI IRIX™
    - SNIA approved drivers that support LUN discovery

Software Issues (cont.)
  - Linux
    - Open source is powerful but not without its problems
    - Most everything is kernel and/or distribution (e.g., RedHat, etc.) specific
  - Product incompatibility – Lots of examples
    - CVFS won’t run on a GFS patched kernel
  - Security/firewall
    - CVFS uses dynamically assigned ports for communication with the FSS (metadata)
    - SANergy uses NFS
  - Firmware and software upgrades
    - Impact on operations – not just computers
  - Tuning
    - Variables/parameters at all levels

Protocol Issues
  - Mixing protocols with interfaces – not all “endpoints” support all the possible combinations
    - FC over everything
    - TCP/IP over everything
    - SCSI over everything
  - Go with what works
    - Ethernet for networking
    - Fibre Channel for storage
  - Experiment with what will be the most likely winner
    - SCSI over Ethernet one way or another
  - Plan for “phasing out” old technologies
  - Plan on “phasing in” new technologies

Management Issues
  - There are many pieces in a system to manage
  - There is no single unified management tool – be wary of anyone who tries to sell you one
  - Even in the SAN space, no single management tool can manage all the SAN devices
  - Bluefin will help with this but it is still a ways out
  - Real-time monitoring and management is still a problem

Management Issues (cont.)
  - Failure management
    - Run under the assumption that there is ALWAYS something broken somewhere in the system
    - Complete architectural redundancy or allowance for degraded operation
      - Host, HBA, Switch(s), RAID controller, LUN
    - Interaction of all the components to effect a proper switchover
    - Disconnecting and shutting off failed components
      - SANergy
      - CVFS
      - GFS
    - Failback (restoration)
  - Performance management
    - Treat bandwidth, latency, and transaction rates as resources that need to be monitored and managed
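
The failure-management assumption above ("there is ALWAYS something broken somewhere") can be made concrete with a little arithmetic. A minimal sketch, assuming components fail independently and that each kind of component has some fixed chance of being down at any instant: the probability that at least one of N such components is broken is 1 - (1 - p)^N, which grows toward 1 as N grows. The component kinds follow the Host, HBA, Switch, RAID controller, LUN chain listed above; the counts, the unavailability figures, and the function name are all hypothetical values chosen for illustration, not measurements.

```python
def prob_something_broken(counts, unavailability):
    """Return P(at least one component is down).

    Assumes every component fails independently (a simplification) and that
    unavailability[kind] is the chance that one component of that kind is
    down at a given instant.
    """
    p_all_up = 1.0
    for kind, count in counts.items():
        # Probability that every component of this kind is up.
        p_all_up *= (1.0 - unavailability[kind]) ** count
    return 1.0 - p_all_up


# Hypothetical installation: both tables are invented for illustration.
counts = {"host": 64, "hba": 128, "switch": 8, "raid_controller": 16, "lun": 256}
unavailability = {
    "host": 0.001,
    "hba": 0.0005,
    "switch": 0.0005,
    "raid_controller": 0.001,
    "lun": 0.002,
}

if __name__ == "__main__":
    p = prob_something_broken(counts, unavailability)
    print(f"P(something is broken right now) = {p:.2f}")
```

Under these assumptions even small per-component failure rates add up across a few hundred components, which is one way to read the advice to design for degraded operation rather than for a fully healthy system.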