B.S. Anangpuria Institute of Technology & Management
Branch: CSE/IT (4th SEM)
Session-2010
Computer Architecture and Organization
CSE – 210-E
Unit-6: Introduction to Parallelism
Lecture – 23:
Vector Processors
Supercomputers
Memory Interleaving
Array Processors
  o SIMD array processor
  o Attached array processor
Submitted by: Mr. Manoj Kumar Saini

There are various types of processors, each suited to particular kinds of operations.

Vector Processors:
A vector processor is a processor with the ability to process vectors, and related data structures such as matrices and multidimensional arrays, much faster than conventional computers. Vector processors may also be pipelined.

Vector Processing Applications
• Problems that can be efficiently formulated in terms of vectors:
  – Long-range weather forecasting
  – Petroleum exploration
  – Seismic data analysis
  – Medical diagnosis
  – Aerodynamics and space flight simulations
  – Artificial intelligence and expert systems
  – Mapping the human genome
  – Image processing

Example:

      DO 20 I = 1, 100
   20 C(I) = B(I) + A(I)

Conventional computer:

      Initialize I = 1
   20 Read A(I)
      Read B(I)
      Store C(I) = A(I) + B(I)
      Increment I = I + 1
      If I ≤ 100 goto 20

Vector computer:

      C(1:100) = A(1:100) + B(1:100)

The conventional computer fetches and executes the loop body 100 times, whereas the vector computer expresses the whole addition as a single vector operation.

Supercomputers:
"Supercomputer" is a broad term for one of the fastest computers currently available. Such computers are typically used for number crunching, including scientific simulations, (animated) graphics, analysis of geological data (e.g. in petrochemical prospecting), structural analysis, computational fluid dynamics, physics, chemistry, electronic design, nuclear energy research and meteorology. Perhaps the best-known supercomputer manufacturer is Cray Research.
The chief difference between a supercomputer and a mainframe is that a supercomputer channels all its power into executing a few programs as fast as possible, whereas a mainframe uses its power to execute many programs concurrently.

A supercomputer is a computer that leads the world in processing capacity, particularly speed of calculation, at the time of its introduction. The first supercomputers were introduced in the 1960s, designed primarily by Seymour Cray at Control Data Corporation (CDC), which led the market into the 1970s until Cray left to form his own company, Cray Research, which then took over the market. In the 1980s a large number of smaller competitors entered the market, in parallel with the creation of the minicomputer market a decade earlier; many of them disappeared in the mid-1990s "supercomputer market crash". Today supercomputers are typically one-off custom designs produced by "traditional" companies such as IBM and HP, which purchased many of the 1980s companies to gain their experience.

Technologies developed for supercomputers include:

Vector processing: A vector processor, or array processor, is a CPU design whose instruction set includes instructions that perform mathematical operations on multiple data elements simultaneously. This is in contrast to a scalar processor, which handles one element at a time using multiple instructions. The vast majority of CPUs are scalar (or close to it). Vector processors were common in scientific computing, where they formed the basis of most supercomputers through the 1980s and into the 1990s, but general improvements in performance and processor design led to the near disappearance of the vector processor as a general-purpose CPU.

Liquid cooling: An uncommon practice is to submerge the computer's components in a thermally conductive liquid.
Personal computers that are cooled in this manner do not generally require any fans or pumps, and may be cooled exclusively by passive heat exchange between the computer's parts, the cooling fluid and the ambient air.

Non-Uniform Memory Access (NUMA): Non-Uniform Memory Access, or Non-Uniform Memory Architecture (NUMA), is a computer memory design used in multiprocessors, where the memory access time depends on the memory location relative to a processor. Under NUMA, a processor can access its own local memory faster than non-local memory, that is, memory local to another processor or memory shared between processors.

Striped disks (the first instance of what was later called RAID): In computer data storage, data striping is the segmentation of logically sequential data, such as a single file, so that segments can be assigned to multiple physical devices (usually disk drives in the case of RAID storage, or network interfaces in the case of Grid-oriented Storage) in a round-robin fashion and thus written concurrently.

Parallel filesystems: In computing, a file system (often also written as filesystem) is a method for storing and organizing computer files and the data they contain to make it easy to find and access them. File systems may use a data storage device such as a hard disk or CD-ROM and maintain the physical location of the files; they may provide access to data on a file server by acting as clients for a network protocol (e.g., NFS, SMB, or 9P clients); or they may be virtual and exist only as an access method for virtual data (e.g., procfs). A file system is distinguished from a directory service and a registry.

Memory Interleaving
Memory interleaving, also known as multiple memory modules with interleaving, is so called because several memory modules are combined into one memory: addresses are assigned across the different modules, and data is exchanged with each module in turn. Spreading consecutive addresses over separate modules allows accesses to different modules to overlap.
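As a minimal sketch of how addresses might be assigned across modules, the following assumes low-order interleaving, in which the low-order part of the address selects the module and the rest selects the word within it. The lecture does not specify a scheme; the module count of four and the names map_address and modules_touched are illustrative assumptions.

```python
# Hypothetical sketch of low-order memory interleaving.
# NUM_MODULES and both function names are assumptions, not from the lecture.
NUM_MODULES = 4


def map_address(address, num_modules=NUM_MODULES):
    """Return (module index, word offset inside that module) for an address."""
    module = address % num_modules    # low-order part selects the module
    offset = address // num_modules   # remaining part selects the word
    return module, offset


def modules_touched(start, count, num_modules=NUM_MODULES):
    """List the modules hit by `count` consecutive addresses from `start`."""
    return [map_address(a, num_modules)[0] for a in range(start, start + count)]


if __name__ == "__main__":
    for addr in range(8):
        print(addr, map_address(addr))
```

Under this assumed mapping, four consecutive addresses fall in four different modules, which is what lets a vector fetch keep every module busy at once.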
Array Processors:
An array processor is a processor that executes one instruction at a time, but applies it to an array or table of data at once rather than to single data elements.
• An array processor performs a single instruction in multiple execution units in the same clock cycle.
• The different execution units execute the same instruction, using the same set of vectors in the array.

Features of an array processor:
• Use of parallel execution units for processing different vectors of the arrays
• Use of memory interleaving, n memory address registers and n memory data registers in case of k pipelines, and use of vector register files

An array processor is a computer/processor whose architecture is especially designed for processing arrays (e.g. matrices) of numbers. The architecture includes a number of processors (say 64 by 64) working simultaneously, each handling one element of the array, so that a single operation can apply to all elements of the array in parallel. To obtain the same effect in a conventional processor, the operation must be applied to each element of the array sequentially, and consequently much more slowly.

An array processor may be built as a self-contained unit attached to a main computer via an I/O port or internal bus; alternatively, it may be a distributed array processor, where the processing elements are distributed throughout, and closely linked to, a section of the computer's memory.

Array processors are very powerful tools for handling problems with a high degree of parallelism. They do, however, demand a modified approach to programming. Converting conventional (sequential) programs to run on array processors is not a trivial task, and it is sometimes necessary to select different (parallel) algorithms to suit the parallel approach.

Array processors are implemented in two main ways:

SIMD array processors:
A SIMD array processor is a computer with multiple processing units operating in parallel.
The processing units are synchronized to perform the same operation under the control of a common control unit, thus providing a single instruction stream, multiple data stream (SIMD) organization.
• An array processor exploits data-level parallelism: for example, the multiplier unit pipelines operate in parallel, computing x[i] × y[i] in a number of parallel units.
• Its multifunctional units perform their actions simultaneously.

Fig: SIMD Array Processor

Attached array processors:
The various components of this structure are:
• General-purpose computer: Used for general processing
• Main memory: Memory attached to the general-purpose computer
• I/O interface: Connects the two processors
• Attached array processor: The array processor used for heavy computation
• Local memory: Memory attached to the array processor

• The attached array processor has an input-output interface to the general-purpose processor and another interface to a local memory.
• The local memory is connected to main memory.

Fig: Attached Array Processor
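The attached organization can be sketched in software as follows. This is a toy model under stated assumptions, not a description of any real machine: the class names Host and AttachedArrayProcessor, their methods, and the sample vectors are all hypothetical. The host copies operands from main memory into the array processor's local memory, the array processor applies one multiply operation to every pair x[i], y[i], and the result is copied back.

```python
# Hypothetical model of a host computer with an attached array processor.
# Every class and method name here is an illustrative assumption.


class AttachedArrayProcessor:
    def __init__(self):
        self.local_memory = {}  # local memory attached to the array processor

    def load(self, name, values):
        """Receive a vector from the host through the I/O interface."""
        self.local_memory[name] = list(values)

    def multiply(self, a, b, out):
        """Apply one multiply operation to every element pair."""
        x, y = self.local_memory[a], self.local_memory[b]
        # Real hardware would form all products in parallel units;
        # this comprehension only models the result, not the timing.
        self.local_memory[out] = [xi * yi for xi, yi in zip(x, y)]

    def read(self, name):
        """Return a vector to the host through the I/O interface."""
        return list(self.local_memory[name])


class Host:
    def __init__(self):
        self.main_memory = {}  # memory attached to the general-purpose computer
        self.array_processor = AttachedArrayProcessor()

    def offload_multiply(self, a, b, out):
        """Move operands to local memory, compute there, copy the result back."""
        ap = self.array_processor
        ap.load(a, self.main_memory[a])
        ap.load(b, self.main_memory[b])
        ap.multiply(a, b, out)
        self.main_memory[out] = ap.read(out)
        return self.main_memory[out]


host = Host()
host.main_memory["x"] = [1, 2, 3, 4]
host.main_memory["y"] = [10, 20, 30, 40]
print(host.offload_multiply("x", "y", "z"))  # [10, 40, 90, 160]
```

Keeping the operands in a separate local memory mirrors the component list above: the general-purpose computer only moves data across the interface, while the element-wise work happens on the attached side.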