The difference between SAN, NAS, and DAS

While storage technologies have continued to advance in speed and capacity, practical implementation of storage has only recently become more flexible. Generally speaking, there are three types of storage implementations:

Direct attached storage (DAS)
Network attached storage (NAS)
Storage area network (SAN)

This course provides an overview of the most scalable and robust of the three: storage area networks. You will learn how a SAN is different from a DAS or NAS solution, and what it takes to build and maintain a SAN. When you are done, you will have a good understanding of why you might use a SAN as a solution to your organization's storage needs, and what resources you will need to install and maintain it.

To get started, this document gives you an overview of the three types of storage technologies (DAS, NAS, and SAN) so you can get a better understanding of how a SAN can be used in a networking environment. However, before we jump right into the specifics of how a SAN works, it's important that you have a good understanding of the different technologies associated with storage systems in general, including:

The interface technologies that hardware devices (like computer systems and storage drives) use to communicate with one another.
RAID levels that define how well protected your data are on a drive or collection of drives.
Disk drive characteristics that you should consider when purchasing drives for your storage system.

Let's begin here with a discussion of the different interface technologies that you might use to connect storage drives and other systems in a storage solution.

SCSI and IDE drive technologies

The small computer system interface (SCSI) has long been a favorite interface for storage because you can use it to attach a large number of drives to a single system. A SCSI chain allows multiple devices (which don't all have to be hard drives) to communicate with the main system, as shown in Figure 1-1.

Figure 1-1: SCSI drives can be connected in long chains to provide high-capacity storage systems.

Each bus on a SCSI card can handle 15 devices per chain. (A bus is the physical connection point on the card where you attach a device.) Some SCSI cards have multiple buses, which allow more than 15 devices to connect per card in the main system. Finally, SCSI supports command queuing, a feature that allows SCSI drives to manage up to 256 commands without the interaction of the SCSI drive's interface card. This allows the drive to operate more efficiently, without waiting for the interface card to process multiple commands for each drive.

Integrated Device Electronics (IDE) drives are similar to SCSI drives, but IDE drives limit the expansion of a system's storage capabilities. IDE chains can only include two devices per bus, and most systems support two IDE buses, which means the system can manage a total of four devices, as shown in Figure 1-2. Additionally, and perhaps more importantly for server storage, IDE does not perform command queuing.

Figure 1-2: IDE drives provide limited connectivity for low-end solutions.
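To make the command-queuing difference concrete, here is a minimal Python sketch. It is purely illustrative, not real SCSI protocol code: it compares the head travel of requests serviced strictly in arrival order (as a non-queuing IDE drive must) with the same requests reordered elevator-style (as a queuing drive can). The track numbers and the `elevator_order` helper are invented for this example.

```python
# Illustrative sketch only -- not real SCSI protocol code.

def head_travel(start: int, requests: list[int]) -> int:
    """Total distance the head moves when servicing requests in the given order."""
    travel, pos = 0, start
    for target in requests:
        travel += abs(target - pos)
        pos = target
    return travel

def elevator_order(start: int, requests: list[int]) -> list[int]:
    """SCAN-like reordering a queuing drive could apply: sweep up, then down."""
    upward = sorted(r for r in requests if r >= start)
    downward = sorted((r for r in requests if r < start), reverse=True)
    return upward + downward

pending = [950, 30, 510, 620, 80, 700]  # hypothetical track numbers, in arrival order
start = 500

print("FIFO (no queuing):   ", head_travel(start, pending))                         # 3120
print("Queued and reordered:", head_travel(start, elevator_order(start, pending)))  # 1370
```

Both schedules service the same six requests, but the reordered one does so with far less mechanical movement, which is where the efficiency gain shows up.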
Fibre Channel drive technology

While IDE and SCSI drive technologies have been around for quite some time, a more recent arrival on the storage connectivity scene is Fibre Channel. Fibre Channel provides a new physical topology that allows storage devices to connect using fiber optic cables, and a new addressing method that creates a fabric using a Fibre Channel switch, as shown in Figure 1-3.

Figure 1-3: Fibre Channel drives can be connected in large numbers, and at much greater distances than SCSI drives.

One Fibre Channel interface card can support up to 126 drives, and Fibre Channel has an extremely high data rate, starting at over 100 megabytes per second (MB/s) and reaching 200 MB/s in full-duplex mode. In addition, this data rate is expected to quadruple with the next enhancement to the Fibre Channel specification.

Fibre Channel drives may be much more expensive than IDE and SCSI drives of the same capacity. However, these drives can often withstand the rigors of data-center usage and high-availability access in an industrial-strength environment. They are often designed for much more strenuous usage than some of the lower-end drives of the same capacity.

RAID levels

A Redundant Array of Inexpensive Disks (RAID) is a collection of two or more disk drives, called a volume, that provides for multiple levels of storage protection. The protection offered by a RAID depends on the capabilities of the storage device's interface card and the type of data protection and performance an individual storage system requires. RAID protection is defined as a level between 0 and 5, where RAID 0 offers no protection and the higher levels offer different combinations of protection and performance.

RAID level 0 -- striped disk array without fault tolerance

RAID 0, shown in Figure 1-4, is the lowest level of storage protection and does not offer data protection. It is not fault tolerant, which means it isn't prepared to respond well if a drive or other device in the system fails. In RAID 0, if you lose one drive, you lose all the data on the entire RAID 0 volume.

Figure 1-4: RAID 0 stripes data over multiple drives.

However, RAID 0 can provide enhanced performance by splitting the data among a number of drives in an array (called striping), so that the system can read and write data in parallel among the drives in the array. This type of RAID may be used for applications where the data reside on a system only temporarily (such as pre-production video editing or image editing) and where performance is more important than data protection.

RAID level 1 -- mirroring/duplexing

Drive mirroring with RAID 1 provides a complete backup copy of each drive on another drive, as shown in Figure 1-5. When the system reads data, it can read from either disk, which provides enhanced read performance. However, the system still writes data at the same speed as a single non-RAID drive.

Figure 1-5: RAID 1 provides data mirroring protection.

The benefit of RAID 1 is that all data is completely contained on two disks, so if one should fail, the other drive has a complete copy of all of the data. Unfortunately, RAID 1 is expensive in terms of efficiency: you need two drives to provide the storage capacity of a single drive, so there is 100% overhead for managing this type of data protection. Even so, the cost of recreating lost data (or the overall cost of losing irreplaceable data) is usually much higher than the cost of managing two drives.
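To make the two layouts just described concrete, here is a short Python sketch. It is a byte-level toy, assuming a 4-byte stripe unit and a made-up payload, not how a real RAID controller (which works on fixed-size blocks in firmware or hardware) behaves.

```python
# Illustrative byte-level sketch of RAID 0 striping and RAID 1 mirroring.

def raid0_write(data: bytes, drives: int, stripe: int = 4) -> list[bytes]:
    """Split data into stripe-sized chunks dealt round-robin across the drives."""
    out = [bytearray() for _ in range(drives)]
    for i in range(0, len(data), stripe):
        out[(i // stripe) % drives] += data[i:i + stripe]
    return [bytes(d) for d in out]

def raid1_write(data: bytes) -> list[bytes]:
    """Write the same data to both drives; either copy is a complete backup."""
    return [data, data]

payload = b"ABCDEFGHIJKLMNOP"
print(raid0_write(payload, drives=2))  # [b'ABCDIJKL', b'EFGHMNOP']
print(raid1_write(payload))            # two identical copies
```

Notice that in the RAID 0 output, losing either drive destroys half of every file, which is exactly why RAID 0 offers performance but no protection.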
RAID 0+1 is a variation of the RAID 0 and RAID 1 environments in which a set of RAID 0 striped disks is mirrored to another set of disks in a RAID 1 configuration. This approach provides redundancy and fault tolerance, and allows for a volume size larger than a single disk can provide.

RAID level 3 -- multiple drives with parity drive

RAID 3 gives you the ability to stripe data across multiple drives and record error-correction data (called parity data) on a separate drive, so that if any drive in the set is lost, the data can be reconstructed on the fly. This allows a multiple-drive array to provide protection with only one extra drive, as shown in Figure 1-6.

Figure 1-6: RAID 3 offers high-speed data protection using a parity drive.

Even though RAID 3 records a separate set of parity data, write operations do not generally degrade drive performance, because the controller writes the parity to the separate parity disk as a separate write activity. This RAID level allows a more efficient use of drive space, since one parity drive can support a large number of data storage drives.

RAID level 5 -- independent drives with distributed parity blocks

RAID 5, shown in Figure 1-7, is probably the most common RAID level for data storage because it provides good data protection and good read/write performance. In RAID 5, data are striped across an array of drives, as are the parity data that the system uses to recover from a crash. RAID 5 offers good fault tolerance, and while not as fast as RAID 3, it provides good performance characteristics for almost all data storage applications. If you are choosing a RAID level for the best general data access characteristics, choose RAID 5.

Figure 1-7: RAID 5 offers striped parity over multiple drives.
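The parity data that RAID 3 and RAID 5 record is, in practice, a bitwise XOR across the data blocks in a stripe: XOR the surviving blocks with the parity block and the missing block falls out. Here is a minimal Python sketch of that idea; the block contents and the `xor_blocks` helper are invented for illustration, and real controllers do this on fixed-size blocks in hardware.

```python
from functools import reduce

def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR a list of equal-length byte blocks together."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

data = [b"AAAA", b"BBBB", b"CCCC"]  # blocks on three data drives
parity = xor_blocks(data)           # dedicated parity drive (RAID 3)
                                    # or rotated across drives (RAID 5)

# Drive 1 fails; rebuild its block from the survivors plus the parity block.
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == data[1]
print("rebuilt block:", rebuilt)    # b'BBBB'
```

The same XOR that produced the parity block regenerates any single lost block, which is why one extra drive is enough to survive one drive failure.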
other RAID implementations

While there are more RAID implementations that you may see in various vendor documents, most are variations of one of the standard RAID implementations. For example, RAID 6 adds another parity drive to a RAID 5 implementation so you can sustain multiple drive failures. RAID 10 is similar to RAID 0+1 but has somewhat better fault tolerance if you lose a single drive. RAID 53 combines the striping of RAID 0 with the parity drives of a RAID 3 implementation. Each of these RAID levels may provide your application with some additional performance or data availability, but more likely than not, you will be able to manage your data protection using one of the more common RAID implementations.

disk drive characteristics

Disk drives are a dime a dozen (well, not necessarily a dime). There are hundreds of drive vendors, and part of building a successful storage solution of any kind is choosing the appropriate drives. There are several factors you need to evaluate as you choose the drives you will use in any network storage solution.

Spindle speed indicates how quickly the raw data stored on a disk become available as the drive head passes over them. The faster the spindle speed, the faster the drive can read the data. However, if the drive's internal data transfer rate is not sufficient to keep up with the rate at which the drive can read data off the platters, it won't matter what the spindle speed is. Be sure you evaluate both of these together when you're checking out drives.

Drive head seek time is not as important when you are spooling large files off of the drive (such as large video files), but it is critical when it comes to databases and general file servers that need to deliver many smaller files or bits of data quickly. The faster the drive head can move into position, the faster the data can be read from the drive.
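Spindle speed and seek time combine into a rough access-time estimate: average rotational latency is half a revolution, which falls directly out of the RPM, and the vendor's quoted average seek time is added on top. The short Python sketch below runs those numbers; the RPM and seek-time pairs are typical catalog-style figures assumed for illustration, not measurements of any particular drive.

```python
# Back-of-the-envelope access-time math for comparing drives.

def avg_rotational_latency_ms(rpm: int) -> float:
    """Average rotational latency is half a revolution, in milliseconds."""
    return (60_000 / rpm) / 2

# (spindle speed, assumed average seek time in ms) -- illustrative figures only.
for rpm, seek_ms in [(5_400, 12.0), (7_200, 9.0), (10_000, 5.5), (15_000, 3.8)]:
    access = avg_rotational_latency_ms(rpm) + seek_ms
    print(f"{rpm:>6} RPM: ~{access:.1f} ms average access time")
```

Running this shows why a 15,000 RPM drive with fast seeks can answer a small random request in roughly a third of the time a 5,400 RPM drive needs, even before transfer rates enter the picture.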
Choosing an interface type can be a sensitive topic. Vendors will tell you that their interface type is the fastest. Some vendors have engineered IDE-based RAID arrays so that each drive is its own IDE channel with up to 133 MB/s of throughput, while SCSI vendors will tell you that their SCSI-3 based drives can move data at up to 320 MB/s. And the Fibre Channel vendors will point out the speed, reliability, and scalability benefits of their solutions.

So which one should you choose? Well, first, check your budget. Then check your application's needs. Then recheck your budget, because you will inevitably spend more than you expect to for the storage you are purchasing.

TIP: As a general rule, choose IDE or SCSI for small, departmental storage needs, select SCSI for mid- to large-scale storage needs, and choose Fibre Channel for larger, more demanding applications that will require multiple-host interaction and clustering.

on to storage configurations

This concludes the quick review of the basic technologies that affect storage in general. The remaining pages provide an overview of the three types of storage: DAS, NAS, and SAN.

The most traditional mechanism for connecting computers to storage devices is direct attached storage (DAS). Whether you use SCSI or IDE drives, RAID or Just a Bunch of Disks (JBOD), connecting them directly to a server platform is a common method of adding storage. For example, if you connect a Zip drive, CD-ROM burner, optical disk array, or any other storage device that holds data to a computer, you have a DAS setup, as shown in Figure 1-8.

Figure 1-8: Direct attached storage is simple but inflexible.

However, if you don't already have hands-on experience with a DAS approach to storage, you'll soon realize that managing storage resources that are directly attached to a server (or any other computer system) is not as flexible as you might like. Server software developers typically build only the most rudimentary storage management tools into their software. While these tools allow for volume management, most are not well equipped to manage changes in volumes as your needs change. And while you can certainly add more storage devices to a server, most often you will have to perform lengthy maintenance on the server in order to install and test the new drives. In addition, even after you have added the drives to your server, you may not be able to easily join the new devices with the current devices into a single, cohesive storage volume. While this situation may not present much of an issue for a small departmental file server, it may be extremely disruptive to a server handling a shipping database or a Web server that runs the company's Internet store.

WARNING: Direct attached storage is often much less expensive than network attached storage or a storage area network from an initial purchase standpoint. However, don't forget to figure in management costs and data backup strategies, which may bring the true solution cost up beyond that of a similarly equipped SAN solution.

DAS solutions often have scalability issues. Whether IDE or SCSI, cable lengths will pose problems when you are ready to expand a drive array, as shown in Figure 1-9.

Figure 1-9: Direct attached storage has distance limitations.

IDE cables won't reach more than a couple of feet, while SCSI cables can go as long as 20 meters or so when using low voltage differential (LVD) technology. For smaller departmental servers, this limitation is often not a barrier, since most systems will not contain more than a few drives, which can easily be contained within the computer's case. However, if you want to build a larger-scale system, or one with additional redundancy or clustering, you may not physically be able to put the system together due to the length limitations of these individual drive connections.

Network attached storage (NAS) expands on the concept of storage attached directly to a server or other system. In an NAS configuration, you attach a storage device to your network, and a server or other computer(s) can remotely access the storage device over the network, as shown in Figure 1-10. The storage device isn't tied to any one server or system, so it is accessible by several different systems at once and doesn't monopolize the resources of any single system.

Figure 1-10: A server or other computer can connect to an NAS device with a common Fast Ethernet or Gigabit Ethernet connection.

While the NAS concept is noble in its purpose, it is really best suited to departmental or small server-based environments where a limited capacity for expansion is acceptable and low cost is imperative. Many NAS devices have begun to provide expansion capabilities in their cabinets (the physical casing that holds the storage drives), although the expansion is often limited and the volume management rudimentary. NAS is well suited to smaller, departmental applications, as well as to organizations that do not have sufficient technical expertise to manage more complex data storage. Additionally, although NAS implementation is quick and often very easy, a large influx of data communications between the applications running on the server (such as a database) and the NAS device can easily overwhelm an organization's network, as the quick calculation below suggests. Clearly, NAS has a place in business environments, but there is a better solution for high-demand applications with scalability requirements.

TIP: Network attached storage has the advantage that it can be located anywhere you can place a network connection. This helps overcome the cabling restrictions common in DAS solutions.
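Here is a rough back-of-the-envelope Python sketch of why the network becomes the bottleneck: the wire itself caps what a NAS device can deliver. The Ethernet line rates are nominal, and the 40 MB/s per-client streaming figure is an assumption for illustration; real throughput would be lower still once protocol overhead is counted.

```python
# Rough sanity check: nominal Ethernet line rates versus storage traffic.

links = {"Fast Ethernet": 100e6, "Gigabit Ethernet": 1e9}  # bits per second
client_stream_mb_s = 40  # assumed sustained MB/s demanded by one busy client

for name, bits_per_s in links.items():
    mb_per_s = bits_per_s / 8 / 1e6          # convert line rate to MB/s
    clients = mb_per_s / client_stream_mb_s  # how many such clients fit
    print(f"{name}: ~{mb_per_s:.0f} MB/s ceiling, "
          f"~{clients:.1f} clients at {client_stream_mb_s} MB/s each")
```

Even Gigabit Ethernet tops out around 125 MB/s on paper, so a handful of demanding applications sharing one NAS link can saturate it well before the drives themselves run out of throughput.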
We've covered the basics of the two most common storage connectivity technologies, DAS and NAS, but now it's time to look at the technology that this course is all about: storage area networks (SANs). SAN technology combines the concepts of DAS and NAS to deliver the best of both worlds.

In a SAN configuration, a server acts as a gateway to a collection of storage devices. The server contains a Host Bus Adapter (HBA) that is similar in concept to a network card. The HBA communicates with the different drives in the collection and makes those drives available to the different computers on the network. Because the SAN uses a collection of storage devices that are all linked to a server with an HBA, as shown in Figure 1-11, you can add new storage devices to the storage pool whenever necessary. Once a device is part of the pool, it is accessible to any computer that has access to the pool.

Figure 1-11: Host Bus Adapters provide connectivity to the storage area network.

Visualizing the SAN as a separate network of storage devices connected to an organization's main network is a good way to understand the SAN concept. Each drive within the SAN is attached to the SAN-only network, so you can remove any drive without interrupting the other drives on the network. In fact, hot-swapping drives is part of the design of Fibre Channel, the technology on which most SANs are built.

The basic components of a SAN include:

An HBA (Host Bus Adapter)
Fiber optic patch cables
Fibre Channel disk drives
Fibre Channel switches

Each of these individual components plays a specific role in the larger SAN, and you will find out how they all fit together in Lesson 2. However, before you move on to the next lesson, be sure to stop by the course message board to check in with your instructor and fellow students.