ICT Standards and Guidelines
Segment 101 - Hardware Appendices (Version 1.0)
Prepared by

Table of Contents

1.0 Appendix A - Computers Background
    1.1 Computers Hierarchy and the Driving Technology
    1.2 The PC Microprocessor - an Industry Leader
2.0 Appendix B - The CPU on a Chip
    2.1 The CPU Decision Criteria
        2.1.1 The CPU Generations
        2.1.2 The PC CPU Manufacturers
        2.1.3 The 8086 Compatible Instructions (IA-32)
        2.1.4 The PC Computer Packagers
        2.1.5 Three or More Lines of CPU
        2.1.6 IA-32 and IA-64
        2.1.7 The RISC CPUs - Already 64-bit Architecture
        2.1.8 The Apple Niche and the PowerPC
        2.1.9 The CPU Evaluation
    2.2 The PC CPU Technology
        2.2.1 The Clock Frequency
        2.2.2 The Clock Doubling in the CPU
        2.2.3 Floating-Point Unit in the CPU Chip
        2.2.4 Multiple Pipelines: More Work per Clock Tick
        2.2.5 The CPU Cache RAM
        2.2.6 Areas of Development
        2.2.7 Voltages - Dual Voltage
        2.2.8 Chip Production
        2.2.9 Process Technology
    2.3 The Four CPU Architectures
3.0 Appendix C - The I/O Integrated Solutions
    3.1 Four Standard Solutions to Connect External Devices
        3.1.1 IDE
        3.1.2 USB
        3.1.3 FireWire
        3.1.4 SCSI
    3.2 IDE for Connecting Low-Cost Internal Devices
    3.3 USB - Universal Serial Bus
    3.4 FireWire (IEEE-1394)
    3.5 The Intelligent SCSI
    3.6 Evaluation
4.0 Appendix D - Shared Memory Processing
    4.1 Shared Memory Multiprocessing
    4.2 NUMA Architecture
    4.3 Bus-based Multiprocessing Architecture
    4.4 Hierarchy of Buses
    4.5 Dual-ported Memory Architecture
    4.6 Switch-based SMP Architecture
5.0 Appendix E - The Cluster Solution
    5.1 The Cluster Rack-mountable Solution
        5.1.1 The Cluster Benefits
        5.1.2 The Breakthrough of Clusters
6.0 Appendix F - Guidelines to Decide between DAS and SAN
    6.1 The External RAID DAS or SAN Decision Point
        6.1.1 What Does DAS Mean?
        6.1.2 What Does SAN Mean?
    6.2 The Key Decision Points
        6.2.1 What Are the Relative Locations of the Servers to the External RAID?
        6.2.2 What Is Storage Security?
        6.2.3 What Are the Numbers of Managed RAIDs?
        6.2.4 What Is Services Continuity?
        6.2.5 Number of Fibre Channel SAN Knowledge Administrators?
        6.2.6 Capital and Operating Budget Concerns?
    6.3 The Decision Table
7.0 Appendix G - RAID Levels & Types
    7.1 RAID Levels
    7.2 Types of RAID
        7.2.1 Software-Based RAID
        7.2.2 Host Bus-Based RAID
        7.2.3 External or Bridge-Based RAID
8.0 Appendix H - Standard Benchmarks

1.0 Appendix A - Computers Background

Computers have their roots more than 300 years back in history. Mathematicians and philosophers like Pascal, Leibniz, Babbage and Boole laid the foundation with their theoretical work. It was only in the second half of the last century that electronics was sufficiently developed to make practical use of their theories.

All computers, including modern microprocessors, have design roots that go back to the USA in the 1940s. We still use John von Neumann's (1903-57) computer design, which broke computer hardware down into five primary parts:

- CPU (Central Processing Unit)
- Input
- Output
- Working memory
- Permanent memory

Von Neumann was in fact the first to design a computer with a working memory (what we today call RAM).
If we apply his model to current PCs, the same five building blocks are still clearly recognizable.

1.1 Computers Hierarchy and the Driving Technology

John von Neumann's computer design went through several generations of computers. Historically, we can view this hierarchy of computers and their influence as follows:

- Supercomputers and Mainframes were the largest computers - million-dollar machines that could occupy more than one room. These computers defined the industry's technology standard at the time. Such computers were reserved for large laboratories and sites within public or private organizations.
- Minicomputers came afterwards and were smaller, yet powerful, machines. They typically served a network of simple terminals. The DEC VAX was a typical model, and we still have IBM's AS/400 today as an example of such minicomputers. Minicomputers led the computer industry for more than a decade and democratized access to computers for large sectors of the industry, but no domination was observed.
- Workstations are powerful user machines. They have the power to handle complex engineering applications. They use UNIX or sometimes the Windows NT operating system. Workstations are generally equipped with powerful RISC microprocessors like the Digital Alpha, PowerPC, SPARC, POWER or MIPS.
- The PCs are the smallest in this order: small, inexpensive, mass-produced computers, also equipped with a microprocessor that became the industry standard (x86) at the entry level of all computers. PCs work today with Windows or similar operating systems like Linux. They are used for standard, widely used applications.

The main feature of the last two solutions - workstations and PCs - is that they are microprocessor-based, which means that the whole CPU is designed on a single chip.

In comparison to 10 years ago, the processor scene has become drastically different. While in the period 1980-1990 proprietary processors, and in particular vector processors, were the driving force of the supercomputers of that period, today that role has been taken over by common off-the-shelf RISC processors and, the last to arrive, the PC microprocessor.

1.2 The PC Microprocessor - an Industry Leader

The PC came out in 1981. In less than 20 years, it has totally changed our means of communication. When the PC was introduced by IBM, it was just one of many different micro data processors. However, the PC caught on, and in 5-7 years it conquered the market. From being an IBM-compatible PC, it became the standard.

Early PCs were characterized by a number of features that were instrumental in creating the PC's success. The PC was from the start standardized and had an open architecture. It was well documented and had great possibilities for expansion. It was inexpensive, simple and robust (definitely not advanced).

The PC started as IBM's baby. It was their design, built around an Intel processor (the 8088) and fitted to Microsoft's simple operating system, MS-DOS. Since the design was well documented, other companies entered the market. They could produce functional copies (clones) of the central system software (BIOS), and the central ISA bus was not patented. Slowly, a myriad of companies developed, manufacturing IBM-compatible PCs and components for them. The clone was born. A clone is a copy of a machine. Some of the components (for example the hard disk) may be identical to the original; however, the clone carries another name (Compaq, Dell, etc.), or no name at all. This is the case with "the real clones."
Today, we differentiate between:

- Brand-name PCs from IBM, Compaq, AST, Toshiba, etc. - companies big enough to develop their own hardware components.
- Clones, which are built from standard components. Anyone can make a clone.
- Repackagers, companies that assemble standard components and sell the computer under their own brand. This strategy made Dell the leading PC seller today.

The point of this history is that PCs are leading the computing industry today. PCs are just as powerful as minicomputers and mainframes were not too many years ago, and a powerful PC can easily keep up with expensive workstations. Furthermore, the policy of all the actors in the computer industry on the server and high-end side of the market is determined, or at least highly influenced, by the development of the PC microprocessor.

2.0 Appendix B - The CPU on a Chip

2.1 The CPU Decision Criteria

In order to compare different CPUs, the following aspects should be considered.

2.1.1 The CPU Generations

Like all other hardware components, CPUs are continually undergoing further development. CPUs have for years doubled their performance about every 18 months (Moore's Law), and there are no indications that this trend will stop soon.

PCs are designed around different CPU generations. Intel is not the only company manufacturing CPUs, but it is by far the leading one. The following table shows the different CPU generations. They are predominantly Intel chips, but from the 5th generation onwards we see alternatives. The last column indicates the number of transistors integrated on the chip.

Generation               | PC CPUs                                   | Year    | Number of transistors
1st generation           | 8086 and 8088                             | 1978-81 | 29,000
2nd generation           | 80286                                     | 1984    | 134,000
3rd generation           | 80386DX and 80386SX                       | 1987-88 | 275,000
4th generation           | 80486SX, 80486DX, 80486DX2 and 80486DX4   | 1990-92 | 1,200,000
5th generation           | Pentium                                   | 1993-95 | 3,100,000
                         | Cyrix 6x86                                | 1996    | -
                         | AMD K5                                    | 1996    | -
                         | IDT WinChip C6                            | 1997    | 3,500,000
Improved 5th generation  | Pentium MMX                               | 1997    | 4,500,000
                         | IBM/Cyrix 6x86MX                          | 1997    | 6,000,000
                         | IDT WinChip 2 3D                          | 1998    | 6,000,000
6th generation           | Pentium Pro                               | 1995    | 5,500,000
                         | AMD K6                                    | 1997    | 8,800,000
                         | Pentium II                                | 1997    | 7,500,000
                         | AMD K6-2                                  | 1998    | 9,300,000
Improved 6th generation  | Mobile Pentium II                         | 1999    | 27,400,000
                         | Mobile Celeron                            | 1999    | 18,900,000
                         | Pentium III                               | 1999    | 9,300,000
                         | AMD K6-3                                  | 1999    | ?
                         | Pentium III CuMine                        | 1999    | 28,000,000
7th generation           | AMD original Athlon                       | 1999    | 22,000,000
                         | AMD Athlon Thunderbird                    | 2000    | 37,000,000
                         | Pentium 4                                 | 2001    | 42,000,000
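As a quick, hedged illustration of this growth, the short Python sketch below takes only the two endpoints of the table above (the 8086 and the Pentium 4) and computes the observed doubling period for transistor counts. The figures come straight from the table; the calculation itself is just a back-of-the-envelope check.

```python
import math

# Endpoints taken from the CPU-generations table above:
# 8086 (1978, ~29,000 transistors) and Pentium 4 (2001, ~42,000,000 transistors).
t0_year, t0_count = 1978, 29_000
t1_year, t1_count = 2001, 42_000_000

doublings = math.log2(t1_count / t0_count)            # how many times the count doubled
years_per_doubling = (t1_year - t0_year) / doublings

print(f"Doublings: {doublings:.1f}")
print(f"Observed doubling period: {years_per_doubling * 12:.0f} months")
```

The result is roughly 26 months, i.e. transistor counts in the table double about every two years; the 18-month figure quoted above refers to performance rather than transistor count.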
2.1.2 The PC CPU Manufacturers

In 2002, three major manufacturers of PC CPUs cover the market:

- Intel, the market leader
- AMD, which has been very successful with the K6 and Athlon CPUs
- VIA, a Taiwanese company that bought Cyrix and IDT

2.1.3 The 8086 Compatible Instructions (IA-32)

The biggest job for the CPU consists of decoding instructions and localizing data; the calculations themselves are less heavy work. Decoding consists of understanding the instructions that the user program sends to the CPU.

All PC CPUs are "8086 compatible." This means that programs communicate with the CPU using a specific family of instructions. These instructions, originally written for the Intel 8086 architecture (IA-32), became the blueprint for the "IBM compatible PC" concept. Since subsequent CPU generations had to be able to handle the same instructions without recompilation, it was necessary to keep the instruction sets compatible: each new CPU has to understand the same instructions. This backwards compatibility has been an industry standard ever since. All new processors must be able to handle the 8086 instruction format, and therefore new CPUs must spend considerable effort translating the 8086 instruction format into their internal instruction codes.

2.1.4 The PC Computer Packagers

Since the PC architecture was an open standard, many new companies, as well as established ones, took the opportunity to offer a packaged PC computer under their own brand. Packagers of PC computers are of two sorts:

- Well-established manufacturers with a well-known brand, a warranty and sometimes a maintenance policy, like Dell, Compaq, IBM or HP.
- Small, less well-known packagers who offer competitive prices, with of course some uncertainty about follow-up.

2.1.5 Three or More Lines of CPU

All CPU manufacturers, and consequently the important repackagers, today follow a complex strategy of simultaneously developing several lines of products to address different segments of the market. It takes several years for a CPU design to get to market. When the first models are commercialized, the clock rate is at its first threshold; it is then enhanced by 20 to 100% during the next two years of the lifetime of that design, corresponding to a given CPU generation. CPU manufacturers usually address three segments of the market. The following table gives some indications of the diverse lines of products for 2002.

Market segment      | Processor line  | CPU speed    | Bus speed / Front Side Bus speed | Number of CPUs in system
Entry-line desktop  | 6th generation  | 566-1200 MHz | 66/100 MHz                       | 1
Professional        | 7th generation  | 733-1400 MHz | 100/133 MHz or 400 MHz           | 1 or 2
Entry-level server  | 6th+ generation | 600-1200 MHz | 100/133 MHz or 400 MHz           | 4

2.1.6 IA-32 and IA-64

All PC CPU generations (1 through 7) are IA-32 architectures. Each of the major CPU manufacturers has already initiated the design of the next generation, which will be based on a 64-bit architecture:

- Intel's project name for IA-64 is Itanium
- AMD's project name is Sledgehammer
- VIA's project name is Joshua, a rename of the Cyrix MIII project before the merger

There is no doubt that the Itanium, Sledgehammer and the corresponding VIA project are going to be heavy processors. But they will not end up on many desktops: they are too expensive, and the design is 100% intended for the server market.

There have been questions about IA-32 performance on IA-64 systems. Current PC programs are 32-bit. A 64-bit processor has to emulate the 32-bit instructions to execute 32-bit programs like Windows. Emulation is costly; it takes power from the processor. This is the case with the Itanium: it has to translate each IA-32 instruction. In any case, all IA-64 CPUs will require a new 64-bit operating system. It is not clear that the large majority of PC users would need to move to IA-64. Part of the problem for IA-64 is that it has to fight IA-32's massive momentum.

2.1.7 The RISC CPUs - Already 64-bit Architecture

Many RISC CPU architectures exist on the market, and all of them have already gone through the technological enhancements that we now witness in the PC CPU arena. Modern RISC processors generally have a clock frequency that is lower than that of the Intel Pentium III or 4 processors or the corresponding AMD competitors.
However, they have a number of features that put them ahead in speed in floating-point-oriented applications:

- RISC processors are able to deliver two or more 64-bit floating-point results per clock cycle.
- They feature out-of-order instruction execution, which increases the number of instructions that can be processed per cycle.
- The bandwidth between the processor and memory, in case of a cache miss, is larger than on PC processors.

IA-64 is the last step for the PC CPU to overcome in order to reach the same level of openness to the high-end market, including high-end workstations and all levels of the server market.

Design techniques advocated by RISC supporters have proved their efficiency, among them:

- Large on-chip caches
- Multiple functional units and I/O ports
- Superscalar execution
- Instruction pre-fetch
- Deep execution pipelines

Credit goes to these techniques, not to the RISC blanket in which they were first implemented; they were also successful when later adopted in Intel's IA-32 designs. A more important factor to consider is that engineering is not just about building the biggest and fastest system; it is about making numerous technical and practical trade-offs that lead to the overall best solution.

Of some importance also is the fact that these RISC CPUs are commercialized by leading computer manufacturers such as IBM, Sun Microsystems, HP/DEC and Apple. Apart from Apple's PowerPC, RISC CPUs are mostly found in specialized or high-end personal workstations and in low-range, mid-range and high-end servers. These RISC CPUs still dominate the server market, where they are mostly UNIX based. These CPU designs are known as:

- SPARC, accessible through a license and commercialized by Sun and Fujitsu
- POWER3 and POWER4, designed and commercialized by IBM
- Alpha, originally designed by DEC, now owned by HP/Compaq
- HP PA-RISC, designed by HP, which will switch to the Itanium
- MIPS, by Silicon Graphics
- PowerPC, a Motorola design commercialized by Apple as OEM

There is no clear, evident roadmap for the Alpha, HP PA-RISC or MIPS CPU designs. HP and Compaq are planning to replace their own RISC systems with the Itanium over the next several years. On the other hand, Sun Microsystems with the SPARC architecture and IBM with the POWER4 architecture have aggressive roadmaps for their respective CPU designs towards the high-end workstation and server markets.

2.1.8 The Apple Niche and the PowerPC

Apple is a special case in the RISC CPU arena, since its original market was, and still is, in direct competition with the PC-based desktop. Apple controls some 5% of the desktop market, while the other 95% is PC based. Apple's new strategy is to enter the low-end server market. Apple has a strong appeal thanks to its software and application environment, and a solid implementation in specific application areas, with convinced advocates within the user community. It is not clear how Apple will keep pace in the long run with the narrow business of the PowerPC. Apple's operating system and application strategy also targets the UNIX community.

2.1.9 The CPU Evaluation

For CPU evaluation, the first step is to find out about the SPEC or LINPACK benchmark results and compare the different alternatives. In case the Agency has a special application or a particular workload, it can define its own benchmark and ask for it to be run on the proposed systems. Care should be taken to ensure that the tested system is similar to the one ordered, and a confirmation run should be planned before system acceptance. For more details about standard benchmarks, see Section 8.0.
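To make the comparison step concrete, here is a minimal sketch of how published benchmark results might be normalised against a reference system before ranking the alternatives. The scores and system names are hypothetical placeholders, not real SPEC or LINPACK figures.

```python
# Illustrative only: normalise each candidate's benchmark score against a
# reference system, then rank the candidates by that ratio.
candidates = {
    "System A": 410.0,   # hypothetical SPEC-style throughput scores
    "System B": 530.0,
    "System C": 470.0,
}
reference = 400.0        # score of the Agency's current (reference) system

ratios = {name: score / reference for name, score in candidates.items()}
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {ratio:.2f}x the reference system")
```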
2.2 The PC CPU Technology

When comparing different CPU technologies, we should look at the following.

2.2.1 The Clock Frequency

The first PC CPUs worked at a frequency of 4.77 MHz. Subsequently, clock frequencies rose to 16, 25, 50, 66, 90, 133 and 200 MHz, up to the best of today, which operate at almost 2000 MHz. Clock frequencies are still being increased; in a few years we will have CPUs operating at 3 GHz and more.

2.2.2 The Clock Doubling in the CPU

The problem with high clock frequencies is ensuring that the other electronic components keep up with the pace. It is rather simple to make data move very fast inside a chip, where the print tracks are microscopic. But outside the chip other problems appear: the other components must be able to keep up, and when the frequency gets too high, the circuit-board print tracks start acting as antennas and various forms of "radio noise" appear. In brief, making the rest of the hardware keep up with these high frequencies becomes expensive. The solution to this problem was to split the clock frequency in two:

- A high internal clock frequency, which governs the pace of the CPU.
- A lower external clock frequency, which governs the pace on the system bus.

2.2.3 Floating-Point Unit in the CPU Chip

The first CPUs could only work with integers. Therefore, it was necessary to add a mathematical co-processor (FPU, Floating-Point Unit) outside the chip. Later, with better integration, the FPU was built into the CPU.

2.2.4 Multiple Pipelines: More Work per Clock Tick

5th and 6th generation CPUs are able to execute more than one operation per clock tick, since they introduced multiple processing lines (pipelines) that work in parallel.

2.2.5 The CPU Cache RAM

To improve performance, the CPU must be supplied with data at a higher speed. Regular RAM cannot keep up with the CPU speed, so a special RAM type called cache is used as a buffer. To get higher performance from the CPU, the number of outgoing transactions must be minimized: performance increases with the quantity of data that can be kept inside the CPU. Therefore, the first cached CPUs were equipped with a built-in level 1 cache of 8 KB of RAM. Today, bigger and better CPU caches (L1 and L2) are a natural step in the development of new CPUs. The following table shows how the internal cache, i.e. cache integrated into the CPU and working at the full CPU clock speed, has evolved through different CPU models.

CPU                      | Cache size in the CPU
80486DX and DX2          | 8 KB L1
80486DX4                 | 16 KB L1
Pentium                  | 16 KB L1
Pentium Pro              | 16 KB L1 + 256 KB L2 (some 512 KB L2)
Pentium MMX              | 32 KB L1
AMD K6 and K6-2          | 64 KB L1
Pentium II and III       | 32 KB L1
Celeron                  | 32 KB L1 + 128 KB L2
Pentium III CuMine       | 32 KB L1 + 256 KB L2
AMD K6-3                 | 64 KB L1 + 256 KB L2
AMD K7 Athlon            | 128 KB L1
AMD Duron                | 128 KB L1 + 64 KB L2
AMD Athlon Thunderbird   | 128 KB L1 + 256 KB L2
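The effect of this buffering can be illustrated with a simple average-memory-access-time calculation. The hit rates and latencies below are assumed, round figures chosen only to show the principle; they are not measurements of any particular CPU.

```python
# Hypothetical latencies and hit rates: as long as most accesses hit in the
# on-chip caches, the average memory access time (AMAT) stays close to the
# fast cache latency rather than the slow RAM latency.
l1_hit_rate, l1_latency_ns = 0.90, 1.0    # assumed: 90% of accesses served by L1
l2_hit_rate, l2_latency_ns = 0.08, 5.0    # assumed: a further 8% served by L2
ram_latency_ns = 60.0                     # assumed main-memory latency

ram_rate = 1.0 - l1_hit_rate - l2_hit_rate
amat = (l1_hit_rate * l1_latency_ns
        + l2_hit_rate * l2_latency_ns
        + ram_rate * ram_latency_ns)
print(f"Average memory access time: {amat:.1f} ns")   # ~2.5 ns vs 60 ns for RAM alone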
2.2.6 Areas of Development

The following table lists some of the technologies that are under constant improvement in CPU design. Note that "internal" means inside the CPU, while "external" refers to features immediately outside the CPU, on the motherboard.

Development area               | Significance                                                                                   | Example
Internal clock frequency       | Speed of data processing inside the CPU.                                                       | 800 MHz
External clock frequency       | Speed of data transfer to and from the CPU via the system bus (or Front Side Bus).             | 133 MHz
Clock doubling                 | How many times faster the CPU works internally than externally.                                | 6.0 times (as above)
Internal data width            | How many data bits the CPU can process simultaneously.                                         | 32 bits
External data width            | How many data bits the CPU can receive simultaneously for processing.                          | 64 bits
Internal cache (Level 1 cache) | Larger and better L1 cache, a small fast RAM acting as a buffer between CPU and regular RAM.   | 64 KB
External cache (Level 2 cache) | Larger and better implemented L2 cache, placed on-die in the same chip as the CPU.             | 256 or 512 KB
Instruction set                | Can the instruction set be simplified, to speed up program processing? Or can it be improved?  | RISC code, more pipelines, MMX instructions, 3DNow! or SSE

2.2.7 Voltages - Dual Voltage

One of the most important CPU technologies is the continually thinner wiring inside the chip. With thinner wires and higher density, the CPU can operate at a lower voltage. That results in a smaller CPU that generates less heat and can operate at higher speeds. A step in this development is the design of dual-voltage chips:

- The interface to the I/O bus, which always requires 3.3 volts.
- The internal CPU parts, where it is advantageous to reduce the voltage as much as possible; this becomes possible with extremely thin wires in the CPU.

Socket 7 motherboards have a two-part voltage regulator to match the needs of the CPU. Here are some selected CPUs and their voltage requirements (reference year 2002):

CPU                    | Internal voltage | I/O voltage
Pentium MMX            | 2.8 Volt         | 3.3 Volt
AMD K6                 | 2.8/2.9 Volt     | 3.3 Volt
Cyrix 6X86MX           | 2.8 Volt         | 3.3 Volt
Pentium II "Klamath"   | 2.8 Volt         | 3.3 Volt
AMD K6-2               | 2.2 Volt         | 3.3 Volt
Pentium II and III     | 2.0 Volt         | 3.3 Volt
Pentium III "CuMine"   | 1.6 Volt         | 3.3 Volt

2.2.8 Chip Production

It takes a long time to manufacture a CPU: 5 to 50 million transistors must be placed on a tiny silicon die. It takes about 90 workdays of round-the-clock processing to produce a Pentium CPU. CPUs are manufactured on large wafers containing perhaps 140 to 150 CPUs, of which usually 110 to 120 perform perfectly; the rest are discarded. The wafers are burned, etched and treated in hour-long processes, layer by layer. A CPU contains up to 20 layers of silicon with millions of micro transistors.

2.2.9 Process Technology

CPUs are produced with CMOS technology using smaller and smaller "wires". The result is smaller dies (the little area inside the chip holding all the transistors) with more and more transistors. Power consumption decreases, and the clock frequency increases.

CPU                        | Process technology | Number of transistors          | Die size
486                        | 1.0 micron         | 1,200,000                      | 79 mm2
Intel Pentium              | 0.5 micron         | 3,100,000                      | 161 mm2
Cyrix 6X86                 | 0.5 micron         | 3,100,000                      | 392 mm2
Intel Pentium MMX          | 0.35 micron        | 5,500,000                      | 128 mm2
AMD K6                     | 0.25 micron        | 8,000,000                      | 68 mm2
Intel Pentium II           | 0.35 / 0.25 micron | 7,500,000                      | 131 mm2
Intel Celeron              | 0.25 micron        | 7,500,000                      | 131 / 155 mm2
Cyrix MII                  | 0.25 micron        | 6,500,000                      | 119 mm2
IDT WinChip 2 3D           | 0.25 micron        | 6,000,000                      | 88 mm2
AMD K6-2                   | 0.25 micron        | 9,300,000                      | 81 mm2
AMD K6-3                   | 0.25 micron        | ?                              | 118 mm2
AMD Athlon                 | 0.25 micron        | 22,000,000                     | 184 mm2
Intel Pentium III CuMine   | 0.18 micron        | 28,000,000                     | 106 mm2
AMD Athlon "Thunderbird"   | 0.18 micron        | 37,000,000 (22 mil. + 15 mil.) | 117 mm2
Intel Pentium 4            | 0.18 micron        | 42,000,000                     | 217 mm2
Intel Pentium 4 Northwood  | 0.13 micron        | 42,000,000                     | 116 mm2
Athlon T                   | 0.13 micron        | 37,000,000                     | 80 mm2
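To make the scaling trend in the table explicit, the snippet below computes transistors per square millimetre for a few of its rows; the figures are taken directly from the table and serve only to illustrate the density gain from finer process geometries.

```python
# Density figures derived from the process-technology table above.
dies = {
    "486 (1.0 micron)":                  (1_200_000, 79),
    "Pentium MMX (0.35 micron)":         (5_500_000, 128),
    "Pentium III CuMine (0.18 micron)":  (28_000_000, 106),
    "Pentium 4 Northwood (0.13 micron)": (42_000_000, 116),
}
for name, (transistors, die_mm2) in dies.items():
    print(f"{name}: {transistors / die_mm2:,.0f} transistors per mm2")
```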
2.3 The Four CPU Architectures

Four CPU architectures are leading the industry today:

- The Athlon series and the coming 64-bit Sledgehammer
- The Pentium 4 and the coming IA-64 Itanium
- The SPARC CPU with the UltraSPARC III (v9)
- The POWER4 architecture, with two CPU cores on a chip

These four architectures are shaping the computer server market for the next decade.

3.0 Appendix C - The I/O Integrated Solutions

It takes some time before a given technology gains overall agreement and becomes established. Understanding the merits and limitations of each I/O technology, and where each fits into the grand scheme of things, will make the Agency better able to determine what is most appropriate for its particular applications.

3.1 Four Standard Solutions to Connect External Devices

Four standard, industry-leading I/O technologies are recommended, each within a domain of applicability. Even though they sometimes overlap, they can certainly coexist in many situations. These four technologies enjoy overall agreement:

- IDE, for connecting low-cost internal devices
- USB, a low-cost connection for low-bandwidth devices
- FireWire (IEEE-1394), high-speed serial, ideal for time-based media applications
- SCSI, the I/O for maximum performance and broadest compatibility

3.1.1 IDE

IDE provides good, low-cost performance for many desktop applications such as word processing and spreadsheets, and for environments that do not place heavy demands on disk access. If a user needs additional performance, or better support for multiple devices, a SCSI card can easily be added. In a single-drive environment without particularly large amounts of data, IDE technology is a good, cost-effective solution and is generally recommended.

3.1.2 USB

USB is a good interface for connecting low-bandwidth devices. It is a poor interface for connecting faster devices, or devices that inherently involve large numbers of transactions. The types of peripherals USB is ideally suited for are keyboards, joysticks, mice, personal printers, Zip and floppy drives, and entry-level scanners. When connecting a fairly low-speed desktop device like a Zip drive to the computer, USB should also work very well. Its "universal" nature makes USB a must in any configuration.

3.1.3 FireWire

FireWire is very well suited for time-based media applications such as digital video. It also has some nice usability features: it is hot-pluggable and bus-powered, with enough bandwidth to support disk drives and other moderately fast devices. Unlike USB and IDE, it supports direct memory access (DMA), meaning that it does not require host intervention for every transaction. It is, however, not ideally suited to a RAID storage environment, because of the increasing device overhead that accompanies multiple device connections. When dealing with digital video, connecting a single fairly high-speed device like a hard drive, or when simple hot-plug ability is important, FireWire is recommended.

3.1.4 SCSI

SCSI is the preferred I/O interface for users demanding maximum performance, reliability and flexibility. The downside is that a separate host adapter is needed - a minor consideration in environments where productivity matters. There is also a perception that SCSI supports fewer devices than USB and FireWire. In reality, SCSI provides the best multiple-device support available. There is virtually no device overhead, it is CPU independent, and the number of devices supported (15 on a single-channel and 30 on a dual-channel card) is adequate in almost any environment. For maximum performance from disks, and for the most reliable technology that is not only fast but also compatible with the widest range of peripherals, SCSI is by far the most suitable I/O technology.
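As a rough summary of the four recommendations above, the sketch below encodes them as a rule-of-thumb helper. The function and its thresholds are illustrative assumptions, not part of any standard or vendor guideline.

```python
# A rough rule-of-thumb sketch of the recommendations above.
def suggest_interface(device_mb_per_s: float, device_count: int,
                      external: bool, time_based_media: bool) -> str:
    if time_based_media:
        return "FireWire (IEEE-1394)"            # DV and other isochronous streams
    if device_mb_per_s <= 1.5 and external:
        return "USB"                             # keyboards, mice, Zip drives, scanners
    if not external and device_count <= 4 and device_mb_per_s <= 66:
        return "IDE/EIDE"                        # low-cost internal disks and CD-ROMs
    return "SCSI"                                # maximum performance, many devices

print(suggest_interface(0.5, 1, external=True, time_based_media=False))   # USB
print(suggest_interface(40, 1, external=True, time_based_media=True))     # FireWire
print(suggest_interface(60, 8, external=True, time_based_media=False))    # SCSI
```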
3.2 IDE for Connecting Low-Cost Internal Devices

EIDE (Enhanced Integrated Device Electronics), an extension of IDE, is today the most common way of connecting internal storage devices such as CD-ROMs and hard drives. It was designed to provide good relative performance at the lowest possible cost. UltraDMA 66, the latest implementation of IDE technology, fulfils that promise. But in doing so, it suffers from intrinsic limitations:

- IDE has very limited direct memory access capabilities, a consequence of its cost-to-performance design, so, as with USB technology, every transaction requires CPU intervention.
- Unlike SCSI, IDE is not multi-threaded. It must execute each command and wait for the return data before moving on to the next command or process.
- IDE supports maximum data transfer rates of 66 Mbytes per second, which is not necessarily slow, but does not come close to the raw data transfer horsepower offered by SCSI.
- IDE technology has no provision for connecting external devices and is limited to four devices internally, so multiple-device support is extremely limited.

In spite of these shortcomings, IDE provides reasonably good, low-cost performance for many desktop environments. Lastly, IDE technology allows Apple and other PC vendors to offer systems at reduced prices, making IDE technology a market necessity.

3.3 USB - Universal Serial Bus

USB (Universal Serial Bus) is a low-cost connection for low-bandwidth devices. USB was developed as a serial-port replacement on the Mac and a parallel-port replacement on the PC. Its performance was as much as 40 times faster than the previous Mac serial ports (12 Mbits vs. 256 Kbits), and it offered the added benefits of hot-plugging and of supporting many different devices (up to 127) on the same USB bus. It also offered more robust power management. Furthermore, with a data transfer ceiling of 1.5 Mbytes per second (i.e. 12 Mbits), USB became a viable connectivity alternative for other relatively low-bandwidth devices such as non-networked printers, floppy drives, Zip drives and entry-level scanners. USB, however, does have its limitations:

- USB has a maximum bandwidth limit of 1.5 Mbytes per second. Bandwidth limits can be thought of in terms of pipe diameter: smaller pipes cannot support the same data flows that larger pipes can.
- USB is host centric. The same inherently simple design that allows USB to be low-cost also makes it host centric, meaning that the host CPU must initiate every transaction occurring between every device. The practical ramification is that, once more than a handful of devices are added to the USB bus, bus latencies increase, efficiency goes down, and real sustained throughput is only a fraction of the 1.5 Mbytes/second upper bound. Even if USB 2.0 were implemented (an extension of today's USB implementation that Intel has begun to talk about, offering significantly more bandwidth than USB today), the same CPU utilization issues would remain.
- USB is a hub architecture, meaning that each device must be connected to a hubbed port rather than daisy-chained from device to device. This adds cost when the number of devices to connect exceeds the number of ports on the host system, and each hub also counts as a device.

3.4 FireWire (IEEE-1394)

FireWire is a high-speed serial interface, ideal for time-based media applications. FireWire, Apple's trade name for the IEEE-1394 interface, was designed to connect higher-bandwidth peripherals. Like USB, FireWire is hot-pluggable, can support a number of devices simultaneously, and is a serial bus technology. That is where the similarities end: the topology is very different, as are the applications it is best suited for. Each device connection on a FireWire bus constitutes a separate node, and while you can theoretically connect 63 devices, practically speaking this is not possible. There are two reasons for this practical limitation:

- FireWire supports isochronous data transfers, like those used when transferring digital video and other time-based media. Isochronous data transfers differ from standard asynchronous transfers in that they guarantee the timely delivery of data between nodes (i.e. 30 seconds of streaming video plays out in exactly 30 seconds). This means that when an isochronous device such as a DV camcorder is connected to the FireWire bus, that device grabs the allocated portion of the bus necessary to guarantee the timely delivery of its data, say 10 Mbytes per second. The bus also automatically allocates 20% of the total 50 Mbytes/second bus bandwidth (i.e. 10 Mbytes/second) for serial command overhead, leaving only 30 Mbytes per second (in our example) of available bandwidth for other devices. Once the FireWire bus bandwidth has been fully allocated, the bus will not recognize additional devices, regardless of whether or not the connected devices are actually transporting data.
- A condition known as multiple-device overhead is created when many devices are introduced onto the bus. This condition begins to manifest itself with relatively few devices and continues to degrade performance as more devices are added.

The reality is that the very attributes that make FireWire self-configuring, hot-pluggable and so convenient to use are the same inherent attributes that cause increased latencies and add to the device overhead.
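The bandwidth arithmetic of the isochronous example above can be written out explicitly; the numbers are simply the ones used in that example.

```python
# The worked example from the FireWire discussion: a 50 Mbyte/s bus, 20%
# reserved for serial command overhead, and a DV camcorder granted a
# 10 Mbyte/s isochronous allocation.
bus_total_mb_s = 50.0
command_overhead = 0.20 * bus_total_mb_s      # 10 Mbytes/s reserved by the bus
camcorder_allocation = 10.0                   # isochronous slot grabbed by the device

remaining = bus_total_mb_s - command_overhead - camcorder_allocation
print(f"Bandwidth left for other devices: {remaining:.0f} Mbytes/s")   # 30 Mbytes/s
```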
3.5 The Intelligent SCSI

SCSI (pronounced "scuzzy") is the I/O for maximum performance and broadest compatibility. SCSI is a bus architecture that is inherently designed for performance. It is multi-threaded in nature, does not degrade as more devices are added to the bus, and executes commands more efficiently than other host interfaces thanks to a variety of features inherent to its design. This makes SCSI the best-suited architecture for high-speed disk storage, bar none. It also offers greater reliability and robustness for desktop classes of peripherals.

SCSI transactions are all memory mapped, meaning that data is transferred directly into memory without host CPU intervention. There is also a lot of intelligence built into the SCSI controller chip on the target device as well as on the host card, so error conditions can be handled independently of the computer's CPU. These are among the reasons SCSI is the least processor-intensive interface available.

SCSI has a feature known as disconnect/reconnect that does what the name implies. When a device is transporting data, it connects to the bus. When it is processing a request or sitting idle, it disconnects from the bus, allowing full bus bandwidth to the other devices. This is why increasing the number of devices does not degrade bus performance; on the contrary, adding more devices allows you to go faster.

SCSI is multi-threaded, meaning that it can issue multiple commands to multiple devices simultaneously, can begin another transaction (or command) while waiting for return data, and does not care in what order the commands complete. It is set up to handle multiple incoming replies efficiently using "scatter-gather" tables. IDE technology, in contrast, is single-threaded: it can only issue one command at a time and must wait for a reply before issuing the next command. This makes IDE technology very inefficient when dealing with more than a single device at a time.

Another feature inherent to SCSI is tagged command queuing. Suppose a request is made to a drive for a piece of data, then another request for another piece of data, and perhaps another, before any of the requests can be executed. Tagged command queuing enables the device to store these requests (up to 256 at a time), so that it can carry out the entire queue of commands in the most efficient sequence possible.

Ultra160 SCSI can support data transfers of up to 160 Mbytes per second. Even ignoring the fact that SCSI executes transactions more efficiently than any other interface technology, the bandwidth capacity alone is over 3 times that of FireWire and about two and a half times that of Ultra DMA66, the fastest implementation of IDE technology in production today. All together, it becomes apparent how the SCSI bus architecture is inherently designed for maximum performance and system availability. For applications depending on heavy disk I/O (graphics-intensive applications such as Photoshop, prepress, video, and server applications), Ultra160 offers greater performance than any other I/O technology.

But there is another aspect of SCSI that is equally appealing. SCSI has been a mainstream technology for nearly 15 years. During that time, it has undergone continual refinement. Today it is not only the highest-performance I/O technology, but also the most reliable. It supports a far wider breadth of peripherals than any other I/O technology, and it is completely backward compatible with earlier generations of the technology, making it almost universally compatible with nearly every class of desktop peripheral. SCSI reliability extends to the peripherals as well: SCSI hard drives, for example, offer the highest MTBF (Mean Time Between Failures) ratings available, making them the most reliable drives on the market.
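The raw bandwidth comparison quoted above can be checked directly from the figures given in this appendix; the snippet below just computes the ratios.

```python
# Raw interface bandwidths quoted in this appendix (Mbytes per second).
bandwidths = {"Ultra160 SCSI": 160.0, "FireWire": 50.0,
              "Ultra DMA66 (IDE)": 66.0, "USB 1.1": 1.5}
scsi = bandwidths["Ultra160 SCSI"]
for name, mb_s in bandwidths.items():
    if name != "Ultra160 SCSI":
        print(f"Ultra160 SCSI is {scsi / mb_s:.1f}x the raw bandwidth of {name}")
```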
3.6 Evaluation

When primarily connecting single devices, each interface will provide satisfactory results in most desktop computing environments. When these I/O technologies are pushed to their limits, however, either by the types or number of devices connected or by the specific application environment, they begin to fail in very predictable ways due to their inherent design limitations. If a project involves a heavy I/O workload, it is important to verify the I/O capability of the proposed system by defining a benchmark that is representative of the I/O requirements and asking for it to be run.

4.0 Appendix D - Shared Memory Processing

The computer industry is constantly challenged to make available more powerful, lower-cost, open-architecture, versatile and reliable computing platforms. The industry also needs to promote open architectures and make use of commercially available operating systems and application software. The SMP architecture enables computer manufacturers to respond to complex, mission-critical applications that require the highest levels of computing and processing power. It can also respond to Agencies' needs when their largest services have to support heavy workloads. SMP solutions were once only possible using proprietary RISC/UNIX-based systems; today we find x86 SMP solutions as well.

4.1 Shared Memory Multiprocessing

Multiprocessing (MP) can be confined to a single server platform (a node) or spread over a combination of nodes working concurrently. An SMP server allows multiple processors in a single server to work concurrently on multiple tasks. In a clustered architecture, the SMP platforms become the building blocks of the system: multiple SMP platforms are interconnected, usually in a proprietary fashion, to form a larger system (Figure 1 shows a typical SMP-based cluster architecture).

The textbook definitions and industry implementations of multiprocessing (MP) servers come in a wide variety of architectures and sizes, each optimized for a particular application. The most common MP architectures, however, are:

- NUMA architecture
- Bus-based SMP architecture (single-bus SMP and multi-bus SMP)
- Switch-based SMP architecture

4.2 NUMA Architecture

NUMA (Non-Uniform Memory Access) is one type of cluster architecture. The advantage of NUMA is its ability to use inexpensive standard high-volume (SHV) server building blocks to create a larger system. The architecture offers an ideal implementation for failover. However, without aggressive, low-latency, high-bandwidth interconnects or large node caches, the operating system and application software must be restructured to achieve optimal performance. Compared to SMP, NUMA becomes cost effective only with large numbers of processors. A typical NUMA architecture connects its nodes with a Scalable Coherent Interface (SCI).
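Why data placement matters so much on NUMA can be shown with a small effective-latency calculation. The local and remote latencies and the local-access fractions below are assumed, illustrative values, not measurements of any particular system.

```python
# Illustrative only: if most memory references stay on the local node, the
# effective latency stays close to the local latency; otherwise the remote
# penalty dominates, which is why software often has to be restructured.
local_ns, remote_ns = 100.0, 400.0        # assumed local vs. remote access latency

for local_fraction in (0.95, 0.70, 0.50):
    effective = local_fraction * local_ns + (1 - local_fraction) * remote_ns
    print(f"{local_fraction:.0%} local accesses -> effective latency {effective:.0f} ns")
```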
4.3 Bus-based Multiprocessing Architecture

There are several approaches to a bus-based SMP solution. Single-bus SMPs have traditionally been the most cost-effective way to build small shared-memory multiprocessing systems. For a long time, modern microprocessors have been designed to enable small-scale (four or fewer processors) bus-based SMPs. For a small number of processors, the bus-based SMP platform offers good latency and bandwidth. However, with the advent of faster and more powerful processors, bus bandwidth and scalability become a bottleneck. Multi-bus SMP systems are one way to address the performance and scalability requirements of a multiprocessing platform. The hierarchy of buses and multi-ported memory architectures are also commonly used MP architectures.

4.4 Hierarchy of Buses

Perhaps the most obvious multi-bus architecture is the hierarchy of buses. In this scheme, the processors are organized into single-bus nodes that are connected through hierarchical buses to form a single system. This method uses a system bus as a common bus and then attaches four-way subsystems to it, with a third-level cache to filter memory references before they reach the common bus (see Figure 3). While this architecture supports eight processors (and potentially more), it requires a high-cost cache and incurs greater memory latency due to its multi-level structure. In a four-CPU configuration, it could cost more and deliver lower performance than a standard 4-way platform.

4.5 Dual-ported Memory Architecture

The dual-ported memory architecture delivers significantly higher performance than the hierarchy of buses and eliminates the need for a third-level cache. The performance is achieved at a lower cost premium than a clustered 4-way system. A dual-ported memory design (see Figure 4) overcomes the problems inherent in a hierarchy-of-buses design: it bypasses the need for an expensive, performance-impeding level-3 cache, and its straightforward implementation offers only two system buses, which avoids the worst-case transaction of traversing three buses in a hierarchy of buses. In a dual-ported memory design, all CPUs connect to a common memory, but each system bus segment operates independently for maximum performance. The implementation issues to be solved in any such design include:

- Addressing the total system memory bandwidth
- Designing the topology of the I/O bridge connectivity
- Providing for cache coherency

4.6 Switch-based SMP Architecture

The switch-based architecture departs from the bus-based approach by implementing a massive central switch that connects processors and memory. While switch-based SMPs enable the construction of large shared-memory systems, the average latency, and the cost of the solution, increase dramatically with the number of processors. Today, switch-based SMPs represent a point solution to the scalability requirements of a specific market segment. A chipset designed with a particular number of processors in mind is not necessarily appropriate for a different number of processors: as the number of processors increases, the bandwidth through the switch must increase accordingly. The challenge for switch-based SMP solutions remains providing a balanced platform that is neither over-designed nor unable to scale. Figure 5 illustrates a switch-based SMP diagram.

5.0 Appendix E - The Cluster Solution

Clusters were at first collections of workstations or PCs connected by a local network. The adoption of clusters has been quite successful since the introduction of the first one in 1994 as a solution for a specific heavy workload. The attraction of clusters lies in the low cost of both hardware and software and in the control that builders and users have over their system's scalability. The cluster concept has since been extended to servers.

5.1 The Cluster Rack-mountable Solution

The interest in clusters can be seen, for instance, from the active IEEE Task Force on Cluster Computing (TFCC), which regularly issues a White Paper reviewing the current status of cluster computing.
As the cluster scene became relatively mature and an attractive market, traditional vendors (IBM, Sun, HP/Compaq, ...) as well as many start-up companies entered the field and now offer more or less ready, out-of-the-box (rack-mountable) cluster solutions for those who do not want to build their cluster from scratch. The cluster is often an ideal solution for industrial, governmental and educational Agencies requiring excellent price/performance for handling workloads that grow over time. It is also an excellent choice for applications such as Web serving and collaboration, or any situation requiring horizontal scaling.

5.1.1 The Cluster Benefits

Clusters offer the following benefits:

- Important increases in resource utilization (CPU, memory, software), which helps maximize productivity.
- High compute density per rack, at an affordable price.
- Bundled with all the software needed to operate and manage compute farms - a complete solution.
- Arrives fully assembled in the rack.
- Allows expansion at your own pace - from a few CPUs to a few thousand - and helps ensure binary compatibility with next-generation systems.
- Compatible with major OEM workload management software applications.

5.1.2 The Breakthrough of Clusters

In June 2002, a total of 80 clusters were listed in the TOP500 worldwide classification of high-end processing computers. Here are some highlights of this cluster penetration:

- The first PC cluster in the TOP500 classification (rank 35, with 825 GFlop/s) is based on AMD Athlons and installed at the IWR at the University of Heidelberg in Germany.
- A total of 42 Intel-based and 7 AMD-based PC clusters are in the TOP500.
- 31 of these Intel-based clusters are IBM Netfinity systems delivered by IBM. A substantial part of these are installed at industrial customers, especially in the oil industry.
- Including 5 Sun-based and 5 Alpha-based clusters and 21 HP AlphaServers, the number of clusters in the TOP500 grew impressively to 80 systems.
- 14 of these clusters are labelled as "self-made".

6.0 Appendix F - Guidelines to Decide between DAS and SAN

How should one decide between DAS-deployed and SAN-deployed external RAID?

6.1 The External RAID DAS or SAN Decision Point

The time has come to deploy external RAID for the application servers. The decision now is whether the external RAID will be directly attached to the servers (DAS) or networked storage (SAN). Although this appears to be a fairly simple decision, it has both budgetary and management repercussions. This section facilitates the decision process by focusing on the buying criteria and the implications of deploying DAS or SAN RAID. Key questions include:

- Will the external RAID be shared by more than two (2) servers within the next 24 months?
- Do the sharing servers include Microsoft Windows NT or Windows 2000?
- What are the relative locations of the server(s) and the external RAID storage?
- How important is storage security?
- How many RAIDs will be managed?
- What are the business continuity issues?
- How many Fibre Channel SAN knowledge workers are available?
- What are the budget concerns, capital or operating?

6.1.1 What Does DAS Mean?

DAS is the acronym for "Direct Attached Storage." It means that a server or servers are physically connected to the external RAID on non-shared ports. These ports can be any flavor of SCSI or Fibre Channel. Each port is accessible by only one server. A key point to remember is that DAS does not necessarily mean non-shared external RAID; it really means non-shared ports on the external RAID.

6.1.2 What Does SAN Mean?
SAN is the acronym for "Storage Area Network." It means that multiple servers can share both ports and storage on a switched network. It also means that a server, or multiple servers, can access multiple external RAIDs and other storage devices on that switched network.

6.2 The Key Decision Points

Is the external RAID going to be shared by more than two (2) servers within the next 24 months? If the answer is no, the decision favours DAS: SAN-attached RAID requires additional hardware, software and services that all add significant cost. If the answer is yes, the decision favours SAN: SANs are an extension of storage that allows multiple servers to access and share the same storage devices.

If the RAID is going to be shared, the next question is whether any of the servers sharing the RAID run Microsoft Windows NT, Windows 2000 or Windows XP. Windows servers are not very good citizens in a shared RAID environment. If a storage LUN (Logical Unit Number) is visible to its file system, each Windows server will assume that it owns this LUN and will attempt to format it. Each Windows server will override, take control of that LUN and overwrite the data. This causes significant havoc in a SAN environment: data would be irretrievably lost. UNIX servers do not exhibit these characteristics; the SAN fabric, or the external RAID, informs the UNIX servers about the LUNs they can access and the ones that belong to someone else, and the UNIX servers will not override them.

Managing these Windows shortcomings in a SAN requires deploying LUN masking in the RAID controller or HBA (host bus adapter), or hardware-enforced world wide name (WWN) zoning in the Fibre Channel switch. LUN masking prevents the individual Windows servers from seeing any LUNs other than the ones assigned to them. Fibre Channel switch hardware-enforced WWN zoning limits the view of the Windows servers to only the LUNs assigned to a specific port. This is not a perfect solution and is limited by the number of ports available on the RAID controller(s). All three Windows solutions increase capital and operating costs.
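For readers unfamiliar with LUN masking, here is a purely conceptual sketch of the idea: a table mapping each server's WWN to the LUNs it is allowed to see. Real RAID controllers and HBAs implement this in firmware; the WWNs, server roles and LUN labels below are invented for illustration.

```python
# Conceptual illustration only: which LUNs each server is allowed to see.
lun_masking = {
    "50:06:0b:00:00:c2:62:00": {"LUN0", "LUN1"},   # Windows server A (invented WWN)
    "50:06:0b:00:00:c2:62:01": {"LUN2"},           # Windows server B (invented WWN)
    "50:06:0b:00:00:c2:62:02": {"LUN3", "LUN4"},   # UNIX server (invented WWN)
}

def visible_luns(server_wwn):
    # A server that is not in the masking table sees nothing at all.
    return lun_masking.get(server_wwn, set())

print(visible_luns("50:06:0b:00:00:c2:62:00"))   # only LUN0 and LUN1 are exposed
```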
6.2.1 What Are the Relative Locations of the Servers to the External RAID?

If the server(s) that own the RAID storage are within 25 meters of that external storage, the decision favours DAS. If they are beyond 25 meters, the decision still favours DAS; however, the interface is more likely to be a SAN-ready technology such as Fibre Channel. Fibre Channel can connect servers directly to external RAID, point-to-point. Fibre Channel can reach up to 30 km using single-mode fibre and powerful laser transceivers, 10 km using single-mode fibre and standard laser transceivers, 3 km or less using multi-mode fibre and LED transceivers, and 30 meters using copper transceivers. (The transceiver is the interface between the Fibre Channel port and the media.) Wave division multiplexing (WDM) or dense wave division multiplexing can extend Fibre Channel up to 120 km over single-mode fibre. The key point to remember is that distance does not necessitate a SAN, even if SAN technologies are used.

6.2.2 What Is Storage Security?

How important is storage security to the Agency? If security is a paramount consideration, then DAS is the most secure answer. With DAS, servers cannot see the LUNs, or spoof the RAID subsystem controller into believing they are the master server (pretend they are another server), because they do not have physical access. SANs today do not prevent WWN spoofing, so security can be violated. The ANSI T11 Fibre Channel standards body has a working group to address this issue sometime in the future. If security is paramount, the decision today is DAS.

6.2.3 What Are the Numbers of Managed RAIDs?

The number of managed RAID storage units is a storage administrator productivity issue. Industry studies have found that SANs give storage administrators the ability to manage three to four times more RAID storage than DAS, and most total cost of ownership (TCO) analyses are based on those studies. If there are only a few RAIDs to manage, the productivity issue becomes moot. If there are several RAIDs, or if the unit growth is exponential, the decision favours SAN.

6.2.4 What Is Services Continuity?

Services continuity has always been an important issue for IT; since 9/11/2001, it has become even more important. The question is what type of service continuity practice will be used. If it is some form of data replication or remote mirroring across a metropolitan area network (MAN) or wide area network (WAN) to a remote RAID, the DAS-or-SAN answer depends on the technology and methodology of each external RAID: it must be a SAN only if the external RAID must replicate or mirror over Fibre Channel. If the RAID can use TCP/IP and Ethernet for replication and/or remote mirroring, DAS and SAN are equivalent.

If the business continuity practice is backup to tape, the issue becomes one of application-server and LAN bandwidth. If LAN bandwidth is limited, a SAN can provide LAN-free backups. If the application server(s) bandwidth is at issue, the answer becomes a storage or RAID appliance. There has been much talk about SAN-based, server-less backup; however, server-less backup today requires a storage or RAID appliance that works with either SANs or DAS. The real advantage of a SAN comes into play when there are multiple RAID units that need to be backed up or replicated; SANs will simplify the process and reduce the time.

6.2.5 Number of Fibre Channel SAN Knowledge Administrators?

Implementing and managing a Fibre Channel SAN requires specialized knowledge of Fibre Channel. The number of workers with that knowledge is few to none in most Agencies. This means that if a SAN is to be implemented, there will be additional costs for professional services and training.

6.2.6 Capital and Operating Budget Concerns?

SANs have definite, measurable benefits over DAS, as previously described. They also carry additional costs over DAS: additional capital (SAN hardware, SAN software, RAID hardware and RAID software) and additional operating budget (maintenance, training, and possibly more personnel). The key to deciding between DAS-deployed and SAN-deployed RAID is whether the measurable benefits of a SAN outweigh the additional costs.

6.3 The Decision Table

Deciding on external RAID deployment as DAS or SAN depends on how each of these questions is answered:

- Will the external RAID be shared by more than two (2) servers within the next 24 months?
  If more than two (2) servers need to share the RAID = SAN
  If two (2) or fewer servers need to share the RAID = DAS
- Do the sharing servers include Microsoft Windows NT or 2000?
  If any of the servers sharing the RAID are Microsoft Windows NT or 2000 = DAS or SAN with RAID LUN masking or FC switch hardware-enforced zoning
- What is the locality of the server(s) and the RAID storage?
  If locality is greater than 25 meters = either, with FC technology
  If locality is less than 25 meters = DAS
- What is the importance of storage security?
  If security is paramount = DAS
  If security is important = either
- What is the number of managed RAIDs?
  If the number of managed external RAIDs is greater than 3 = SAN
  If the number of managed external RAIDs is less than 3 = DAS
- What are the business continuity issues?
  If business continuity is required for multiple external RAIDs = SAN
  If business continuity is required for one external RAID = DAS
- What is the number of Fibre Channel SAN knowledge workers?
  If FC knowledge workers are few and budgets are tight = DAS
- What budget concerns exist, capital or operating?
  If SAN benefits outweigh the additional costs = SAN
  If SAN costs outweigh the additional benefits = DAS

These answers will not provide a perfect decision; they should, however, make it an easier one.
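A minimal sketch of how the decision table above could be turned into a screening aid is shown below. Weighting every question equally is an assumption made purely for illustration; the guideline itself does not prescribe any scoring.

```python
# A sketch that tallies the DAS and SAN answers from the decision table above.
def das_or_san(servers_sharing, distance_m, security_paramount,
               managed_raids, multi_raid_continuity,
               fc_skills_available, san_benefits_outweigh_costs):
    san_votes = sum([servers_sharing > 2,
                     managed_raids > 3,
                     multi_raid_continuity,
                     san_benefits_outweigh_costs])
    das_votes = sum([servers_sharing <= 2,
                     distance_m <= 25,          # short distance: DAS is simplest
                     security_paramount,
                     managed_raids < 3,
                     not fc_skills_available])
    # Beyond 25 m the table says "either, with FC technology", so distance
    # alone never votes for SAN here.
    return "SAN" if san_votes > das_votes else "DAS"

print(das_or_san(servers_sharing=6, distance_m=500, security_paramount=False,
                 managed_raids=8, multi_raid_continuity=True,
                 fc_skills_available=True, san_benefits_outweigh_costs=True))  # SAN
```

Like the table itself, the output should be read as an aid to the decision, not a substitute for it.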
7.0 Appendix G - RAID Levels & Types

7.1 RAID Levels

RAID can be implemented in a number of ways, referred to as RAID levels. Each level provides a different combination of performance and fault tolerance. Knowing the characteristics and capabilities of each level will enable you to recommend the optimum RAID solution to meet the agencies' performance and fault tolerance needs.

RAID Level 0: In a RAID 0 array, data is distributed, or striped, across the disks in the array. The array appears to the server as one large disk with a capacity approximately equal to the combined capacity of the physical disks. Since data can be written to multiple drives at the same time, this approach provides a significant performance increase. However, RAID 0 arrays do not store redundant data, so RAID 0 is not a true RAID implementation: if one disk fails, the entire array fails and all array data is lost.

RAID Level 1: In RAID 1 (commonly called a mirrored array) disks are paired, with both disks in a pair containing the same data. When data is written to a mirrored array, it is written twice - once to each disk in the pair. This causes a slight decrease in write performance. However, a mirrored array is highly reliable, because both disks in a pair must fail for the array to fail. The main disadvantage of a mirrored array is its cost: because every disk must have a twin, drives must be added in pairs to increase storage capacity.

RAID Level 2: RAID 2 arrays sector-stripe data across groups of drives, with some drives assigned to store ECC information. Because all disk drives today embed ECC information within each sector, RAID 2 offers no significant advantages over other RAID architectures.

RAID Level 3: RAID 3 uses parallel-access, block-level data striping with a dedicated parity drive. RAID 3 arrays contain redundant information in the form of parity data, which is calculated block-by-block for all user data. The user data is distributed across all but one of the disks in the array; the parity data is written exclusively to the parity disk. In the event of a disk failure, the data can be reconstructed from the corresponding data stripes on the remaining disks in the array.

RAID Level 4: RAID 4 uses independent-access, block-level data striping with a dedicated parity drive. RAID 4 is similar to RAID 3 in that redundancy is achieved through parity data, the user data is distributed across all but one of the disks, and the controller uses a single, dedicated parity drive for data protection. The main difference is that RAID 3 usually synchronizes writes to its disks, while RAID 4 can send data to its disks independently.

RAID Level 5: RAID 5 arrays contain redundant information in the form of parity data, which is calculated block-by-block for all user data. The parity information is distributed across the disks in the array and interspersed with the data. If one disk in the array fails, the data on the failed disk can be reconstructed from the parity data and user data on the remaining disks; two disks must fail before the entire array fails (a short worked example of this parity reconstruction follows the level descriptions). RAID 5 arrays provide an excellent combination of performance and data protection.

RAID Level 10: RAID 10 stripes data for improved performance, but also uses a mirrored set of disks for data redundancy. RAID 10 requires a minimum of four drives, only two of which are used for data storage. Drives must be added in pairs when increasing capacity.

RAID Level 50: RAID 50 arrays are made up of two or more RAID 5 arrays across which data is striped. RAID 50 arrays contain redundant information in the form of parity data. As in a RAID 5 array, the parity information is distributed across the disks in the array and interspersed with the data. If one disk in the array fails, the data on the failed disk can be reconstructed from the parity data and user data on the remaining disks. Two disks in one RAID 5 subset must fail before the entire array fails.
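To make the parity reconstruction described for RAID 5 concrete, the sketch below shows in Python how block-level XOR parity lets the data on a failed disk be rebuilt from the surviving disks. The four-disk layout and block values are invented for the example; real controllers perform this in hardware or firmware, with the parity position rotating from stripe to stripe.

```python
from functools import reduce

# Worked example of RAID 5-style XOR parity (invented 4-disk stripe).
# For one stripe, three disks hold data blocks and a fourth holds their XOR
# parity; in a real RAID 5 array the parity position rotates per stripe.

def xor_blocks(blocks):
    """XOR byte strings of equal length together."""
    return bytes(reduce(lambda a, b: a ^ b, byte_tuple) for byte_tuple in zip(*blocks))

# Data blocks for one stripe.
d0 = b"\x11\x22\x33\x44"
d1 = b"\xaa\xbb\xcc\xdd"
d2 = b"\x01\x02\x03\x04"
parity = xor_blocks([d0, d1, d2])

# Simulate losing the disk that held d1: its block is reconstructed by
# XOR-ing the surviving data blocks with the parity block.
rebuilt_d1 = xor_blocks([d0, d2, parity])
assert rebuilt_d1 == d1
print(rebuilt_d1.hex())  # aabbccdd
```

The same XOR relationship is why two simultaneous disk failures defeat a single RAID 5 set: with two unknown blocks per stripe, the parity equation can no longer be solved.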
7.2 Types of RAID

There are three main types of RAID implementations, and it is possible to run any RAID level with any implementation type.

7.2.1 Software Based RAID

Software-based RAID is RAID functionality provided by a Network Operating System (NOS) such as Windows NT or Novell NetWare. Microsoft Windows NT 4.0 Workstation offers data striping, also known as RAID 0. Windows NT 4.0 Server offers striping with parity, which is RAID 5. Novell NetWare has had mirroring (RAID 1) built in for years. Because this RAID functionality is built into the operating system, the user does not have to buy any special hardware or software to get RAID. Although this may seem like an advantage at first, it really is not, because NOS RAID is extremely limited.

7.2.2 Host Bus-based RAID

Host bus-based RAID uses a special RAID controller within the server. The controller is connected to the host computer via the system bus and controls the operation of the RAID subsystem. The biggest advantage of this type of disk array is speed, for two reasons. First, unlike software-based RAID, the hardware disk array controller has built-in logic that lets the card process RAID functions rather than making the host processor perform them. Second, unlike bridge-based RAID, which attaches to the host via SCSI or Fibre Channel, the bus-based controller is attached directly to the system bus, which generally has a faster data transfer rate than a single connection to a disk drive.

7.2.3 External or Bridge-Based RAID

External, or bridge-based, RAID uses a controller that resides outside the server. It connects to the server via a standard device interface such as SCSI or Fibre Channel. Bridge-based disk array controllers offer the highest fault tolerance because, in addition to offering RAID, they physically reside separate from the server; if the server goes down, the external RAID array can be moved to a new server. External RAID arrays do not need host device drivers and are therefore operating system independent.
The same array could be attached to an NT server, OS/2 server, or Mac server, as long as the server has the proper host interface available, such as an integrated SCSI controller. Bridge-based RAID controllers typically support many RAID levels, advanced RAID features, and advanced management options, just as bus-based controllers do.

With the last two RAID implementations, a question arises:

7.2.3.1 Where Does the RAID Controller Reside?

Bus-based RAID: The RAID intelligence is in the server, and all RAID functions happen in the server. The disk drives can be internal or external to the server.

External RAID: The RAID intelligence is outside the server, and all RAID functions happen in the external subsystem.

7.2.3.2 What is the Advantage of External RAID?

When a customer requires maximum data protection, external RAID is the solution. External RAID offers the highest level of fault tolerance because the RAID subsystem resides outside the host computer; should the host fail, data can be switched to a second host. In addition, most of the components of an external RAID storage system, including the disk drives, controllers, power supplies, cooling fans, and cables, are redundant and/or hot swappable. Specific characteristics make external RAID the right solution for many customers:

Redundancy: External RAID provides redundancy by incorporating duplicate components - fans, power supplies, and controllers - in the system. For example, two controllers cooperate to provide redundancy: if one controller fails, the remaining controller takes over for the failed one.

Scalability: As more digital devices come into use, the demand for storage is skyrocketing. With external RAID, the Agency can install the right amount of storage to meet current needs and can easily add storage capacity in the future. External RAID enclosures can support multiple drives, and enclosures can be added to support almost any storage requirement. With user-selectable single- and dual-bus configurations, the enclosure offers considerable flexibility, and users can scale storage easily by daisy-chaining additional subsystems.

Separate storage/server purchase decision: If Agencies want to include all disk storage inside the server cabinetry, they need to anticipate the number of drive bays required at the time of the server purchase. Since it is difficult to foresee future data storage needs, it is easy to under- or over-estimate the necessary number of bays. Underestimating storage requirements risks running out of storage space and/or purchasing a new server sooner than expected; overestimating runs the risk of spending unnecessary funds on unneeded cabinet space. An external storage subsystem eliminates the need to consider bay space at the time of the server purchase. Also, whenever a customer purchases a new server, it is much easier to migrate an external storage array to the new server than to migrate storage that is internal to the host.

Explanation: With an external RAID subsystem, the RAID controller resides outside the host computer. The host server operating system sees the external controller as a SCSI disk drive, and since SCSI disk drives do not need device drivers, the external RAID controller is operating system independent.

All of the characteristics of external RAID that make it highly flexible, reliable, and fault tolerant - redundancy, operating system independence, and scalability - make it an ideal storage solution for many Agencies.
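As a small planning aid tied to the capacity discussion above, the sketch below estimates usable capacity for a given number of equal-sized drives under a few common RAID levels. It is only a rough illustration under stated assumptions: the formulas reflect the general descriptions in section 7.1 (mirroring halves capacity, a single dedicated or distributed parity costs one drive per set) and ignore hot spares, metadata overhead, and vendor-specific behaviour.

```python
# Rough usable-capacity estimates for equal-sized drives.
# Illustrative only; ignores hot spares, metadata, and vendor specifics.

def usable_capacity_gb(level, drives, drive_gb):
    if level == "RAID 0":                 # striping, no redundancy
        return drives * drive_gb
    if level == "RAID 1":                 # mirrored pairs
        if drives % 2:
            raise ValueError("RAID 1 needs drives in pairs")
        return (drives // 2) * drive_gb
    if level in ("RAID 3", "RAID 4", "RAID 5"):  # one drive's worth of parity
        if drives < 3:
            raise ValueError("parity RAID needs at least 3 drives")
        return (drives - 1) * drive_gb
    if level == "RAID 10":                # striped mirrors
        if drives < 4 or drives % 2:
            raise ValueError("RAID 10 needs at least 4 drives, in pairs")
        return (drives // 2) * drive_gb
    raise ValueError("level not modelled: " + level)

if __name__ == "__main__":
    for lvl in ("RAID 0", "RAID 1", "RAID 5", "RAID 10"):
        print(lvl, usable_capacity_gb(lvl, 8, 146.0), "GB usable from 8 x 146 GB drives")
```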
8.0 Appendix H - Standard Benchmarks

The following SPEC and LINPACK benchmarks reflect the microprocessor, memory architecture, and compiler performance of the tested system:

SPECint2000—SPEC component-level benchmark that measures integer performance. The result is the geometric mean of the eight tests comprising the CINT2000 benchmark suite, all written in the C language. SPECint_base2000 is the result of the same CINT2000 tests with a maximum of four compiler flags that must be used in all eight tests.

SPECint_rate2000—Geometric average of the eight SPEC rates from the SPEC integer tests (CINT2000). SPECint_base_rate2000 is the result of the same CINT2000 tests with a maximum of four compiler flags that must be used in all eight tests.

SPECfp2000—SPEC component-level benchmark that measures floating-point performance. The result is the geometric mean of the ten tests, all written in FORTRAN, included in the CFP2000 benchmark suite. SPECfp_base2000 is the result of the same CFP2000 tests with a maximum of four compiler flags that must be used in all ten tests.

SPECfp_rate2000—Geometric average of the ten SPEC rates from the SPEC floating-point tests (CFP2000). SPECfp_base_rate2000 is the result of the same CFP2000 tests with a maximum of four compiler flags that must be used in all ten tests.

SPECweb96—Maximum number of Hypertext Transfer Protocol (HTTP) operations per second achieved on the SPECweb96 benchmark without significant degradation of response time. The Web server software is Zeus v.1.1 from Zeus Technology Ltd.

SPECweb99—Number of conforming, simultaneous connections the Web server can support using a predefined workload. The SPECweb99 test harness emulates clients sending the HTTP requests in the workload over slow Internet connections to the Web server. The Web server software is Zeus v.1.1 from Zeus Technology Ltd.

SPECweb99 SSL—Number of conforming, simultaneous SSL encryption/decryption connections the Web server can support using a predefined workload. The Web server software is Zeus from Zeus Technology Ltd.

LINPACK DP (Double Precision)—n = 100 is the array size. The results are measured in megaflops (MFLOPS).

LINPACK SP (Single Precision)—n = 100 is the array size. The results are measured in MFLOPS.

LINPACK TPP (Toward Peak Performance)—n = 1,000 is the array size. The results are measured in MFLOPS.

LINPACK HPC (Highly Parallel Computing)—Solves the largest system of linear equations possible. The results are measured in GFLOPS.

ECperf—Benchmark that measures the performance and scalability of a Java 2 Enterprise Edition (J2EE) application server.
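Since several of the SPEC composite metrics above are defined as geometric means of per-test ratios, the short Python sketch below shows how such a figure is computed from individual test results. The ratio values are invented purely for illustration and do not come from any published result.

```python
import math

# SPEC composite metrics are geometric means of per-test ratios
# (each test's performance relative to the reference machine).

def geometric_mean(ratios):
    """Geometric mean of a list of positive per-test ratios."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

# Invented ratios for an eight-test integer suite, for illustration only.
test_ratios = [4.1, 3.8, 5.2, 4.7, 3.9, 4.4, 5.0, 4.3]
print(round(geometric_mean(test_ratios), 1))
```

The geometric mean is used so that no single test dominates the composite: improving any one test by a given factor changes the overall figure by the same amount, whichever test it is.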