New Concepts in Display Technology

John Adams and Robert Wallis
Stanford Technology Corporation
COMPUTER, August 1977

Introduction

As the number of image processing applications and users increases, so does the need for a low-cost image processing system. This paper describes an approach to the implementation of a solid-state image display which provides the nucleus for such a system. Combining a powerful display processor with modern random access memory refresh technology, the system features a unique "image array processor" in which entire images, instead of single pixels, may be manipulated arithmetically and displayed at video frame rates.

Built on a modest budget around an LSI-11 microprocessor (see Figure 1), this system can nevertheless provide complete image processing capabilities, using the display's refresh memories for temporary image storage (in lieu of magnetic disk devices). A simplified block diagram of the display processor, emphasizing the signal processing aspects, is shown in Figure 2. A more complete, hardware-oriented diagram is provided in Figure 3.

Under the control of the LSI-11, and using the capabilities of the display processor, many classical image processing algorithms may be implemented: piecewise linear intensity maps, D-log-E corrections, CRT power law (gamma) correction, logarithmic intensity mapping, exponential intensity mapping, etc. The elementary arithmetic operations (+, -, ×, /) between image channels may be performed at video rates using the look-up tables, combinational logic, and output function memories (see Figure 1). These basic arithmetic operations are useful in applications such as change detection and in developing spectral ratios. By making use of a display processor subsystem known as the "videometer" (described later), radiometric transformations which are a function of the image histogram can be readily computed and applied to the image at video rates.
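As an illustration of the table-driven intensity maps listed above, the following Python sketch builds a gamma-correction table and a logarithmic table and applies them to a small image. The function names, the 8-bit table size, the scaling choices, and the sample pixel values are assumptions made for this example; they are not taken from the system's software.

```python
import math

def gamma_table(gamma, levels=256):
    """Power-law map out = in ** (1 / gamma), scaled to the table range."""
    top = levels - 1
    return [round(top * (v / top) ** (1.0 / gamma)) for v in range(levels)]

def log_table(levels=256):
    """Logarithmic map scaled so the largest input maps to the largest output."""
    top = levels - 1
    return [round(top * math.log1p(v) / math.log1p(top)) for v in range(levels)]

def apply_table(table, image):
    """Replace every pixel by its table entry (one look-up per pixel)."""
    return [[table[p] for p in row] for row in image]

image = [[0, 64], [128, 255]]                    # hypothetical 2 x 2 test image
corrected = apply_table(gamma_table(2.2), image)  # CRT power-law correction
compressed = apply_table(log_table(), image)      # logarithmic intensity map
```

Because the map is a table, changing the transformation only requires reloading its entries; the per-pixel work stays a single look-up.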
Spatially variant radiometric corrections, such as the removal of vignetting or sensor shading, can be performed by generating an image mask and multiplying the mask against the degraded image. More complex operations such as spectral transforms, table look-up classifications, and even spatial convolutions can be done in seconds using the display processor and feedback loop. More sophisticated processing such as geometric correction, maximum likelihood classification, and edge detection can be implemented in the microprocessor by using the display's refresh memory as a randomly addressable peripheral memory device. This approach avoids the limitations inherent in slower mechanical disk drives.

Detailed system description

Input function memory. The input function memory is a 13-bit-in, 8-bit-out, high-speed RAM which imposes a programmable mapping on the image data being loaded into a selected refresh memory. In conventional systems, scaling of the data is normally performed by the CPU in order to prepare the data for display. Here, the input function memory relieves the CPU of this duty and so increases throughput.

Refresh memories. Each N-bit 512 x 512 image is stored as N separate "bit-planes" on N printed circuit boards. The number of bits per pixel (N) may be field configured to any value from 1 to 8 bits. The image data is continually read out in interlaced raster fashion in order to generate the required refresh signals for the display monitor. Up to 12 image channels may be configured in one display system.

Cursor and graphics generator. The trackball/cursor facility allows an analyst to designate individual points or irregular regions for such applications as control point picking and training set selection. The design of the cursor element is defined by a 64 x 64 RAM which is loaded by the host computer. The cursor element may be a cross, circle, square, etc.
Up to eight graphics planes may be used for overlaying binary images such as maps and polygonal masks generated with the aid of the trackball. One graphics bit plane is available for connection to a dedicated black and white monitor for the display of histograms, system status, table loadings, etc. In large-scale systems, this feature can keep the user informed of the status of the processing at all times.

The graphics planes are combined as shown in Figure 4 to form an 8-bit data stream which is processed in real time through the color assignment RAM. The color assignment RAM is a 256 x 16-bit random access memory, each location of which contains 5 bits for each of the red, green, and blue components of the resultant graphics image. The 16th bit is used to select either an additive or an overlay (replace) mode to be used when combining the graphics with the image data stream. The use of the color assignment RAM allows the user unprecedented flexibility in the assignment of color to the graphics data, including the ability to select a distinct color for every possible combination of overlapping graphics images. As shown in Figure 4, the cursor forms a pseudo graphics channel which replaces the eighth graphics plane in the input to the color assignment RAM. Since the cursor has the same flexibility in color assignment available to the graphics, it changes color as it intersects different graphics planes.

Figure 1. Low-cost digital image processing system.

Figure 2. Signal processing block diagram.

Look-up tables. Each image refresh memory is provided with three look-up tables to allow independent control of the red, green, and blue signals derived from the memory. The look-up tables consist of 8-bit-in, 9-bit-out, high-speed RAMs and are completely programmable. The 9-bit output allows the user to assign negative values in the look-up tables. (All negative numbers are stored in 2's complement format.)
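The signed 9-bit table entries described above can be illustrated with a short Python sketch. It encodes values in the range -256 to 255 as 9-bit 2's complement words and builds two tables: an identity table and a negating table of the kind that would let one channel be subtracted from another. The helper names and the two example tables are assumptions for illustration only.

```python
BITS = 9  # output width of a look-up table, as stated in the text

def to_twos_complement(value, bits=BITS):
    """Encode a signed integer as an unsigned word of the given width."""
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not low <= value <= high:
        raise ValueError(f"{value} does not fit in {bits} signed bits")
    return value & ((1 << bits) - 1)

def from_twos_complement(word, bits=BITS):
    """Decode an unsigned word of the given width back to a signed integer."""
    sign = 1 << (bits - 1)
    return (word ^ sign) - sign

# Hypothetical table loadings: 256 entries (8-bit input), 9-bit signed output.
identity = [to_twos_complement(v) for v in range(256)]
negate = [to_twos_complement(-v) for v in range(256)]
```

With eight input bits, every pixel value from 0 to 255 and its negative fit within the 9-bit signed range, which is why one extra output bit is enough to support subtraction.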
Each of the look-up tables can be disabled programmatically, resulting in the same effect as loading the entire table with zeros. This option is useful in performing operations which require the look-up tables to be loaded differently for the red, green, and blue signals.

Combining logic. The combinational logic consists of 2's complement adders capable of summing together, at video rates, up to 12 9-bit data streams emerging from the look-up tables. There are three such adders, one each for the red, green, and blue channels. The summation is performed in four stages, resulting in a 13-bit data stream at the output of the combinational logic. The first stage sums adjacent channels to form 10-bit data streams. For a fully configured 12-channel system, six sums are presented to the second adder stage, producing three 11-bit sums.

Min/max registers. The min/max registers examine the 13-bit data stream as it emerges from the combinational logic, and determine the dynamic range of the data by calculating the minimum and maximum data values. The computer can obtain the 13-bit min/max from any one of the red, green, or blue data streams in one frame.

Range registers. The range register is used to reduce the 13-bit data stream to a 10-bit data stream for input to the associated output function memory. The range register allows the user to subtract a constant from the 13-bit data stream and shift right up to three places to select the desired 10 bits of data to be processed through the output function memory.

Output function memories. The display processor contains three output function memories which transform the summed outputs of the look-up tables to generate the final red, green, and blue refresh signals. The output function memories are high-speed RAMs which accept 10 bits in and produce 10 bits out. The 10-bit resolution on the output is provided to enable the videometer subunit to obtain 10-bit histograms on the processed data.
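The interplay of the min/max registers and the range register described above can be sketched in Python. The sketch assumes that the measured minimum is used as the subtracted constant, that the smallest sufficient shift is chosen, and that out-of-range results are clamped; the text does not specify any of these policies, and the function names are hypothetical.

```python
OUT_BITS = 10   # input width of the output function memory
MAX_SHIFT = 3   # the range register can shift right up to three places

def choose_range(minimum, maximum):
    """Pick an offset and shift so that [minimum, maximum] fits in 10 bits."""
    span = maximum - minimum
    for shift in range(MAX_SHIFT + 1):
        if (span >> shift) < (1 << OUT_BITS):
            return minimum, shift
    raise ValueError("dynamic range too wide for the range register")

def range_register(value, offset, shift):
    """Subtract a constant, shift right, and clamp to 10 bits (assumed policy)."""
    reduced = (value - offset) >> shift
    return max(0, min((1 << OUT_BITS) - 1, reduced))

# Extremes of twelve summed 9-bit signed streams; both fit in 13 signed bits.
lowest, highest = 12 * -256, 12 * 255
offset, shift = choose_range(lowest, highest)
```

Even the widest possible 13-bit signed span (8191) shifted right three places gives 1023, so three shift positions are exactly enough to reach a 10-bit address.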
The added histogram resolution may be used to determine intensity mappings which are a function of the image histogram.

Videometer. The videometer, a subunit of the display processor, examines the data streams at the output of the output function memories and calculates the histograms of the processed image data for the red, green, or blue drive signals (see Figure 5). A unique feature of the videometer is its capability to restrict the histogram calculation to any subregion of an image, as specified by a binary mask in a graphics overlay channel. Therefore, if the analyst specifies the vertices of a polygon enclosing a region of interest in an image (possibly by use of the trackball-cursor subsystem), a corresponding binary mask can be generated. This mask can then be used to restrict the videometer's attention to the designated subarea.

Figure 3. Hardware block diagram.

Feedback loop. The various display processor transformations described thus far do not actually modify the image data stored in the refresh memory, but merely alter the way in which it is displayed (see Figure 1). The feedback loop allows the user to process the imagery through the look-up tables and the combinational logic, and save the results in a refresh memory. This feedback process is performed within one video frame (1/30 sec.). As shown in Figure 1, the data is returned to refresh memory via the input function memory, allowing the user to scale the results of the processing before storing them in refresh memory. In addition to providing for the retention of data which would otherwise be lost, this feedback loop enables the display processor to perform recursive procedures in which the output of the nth processing step becomes the input to the (n+1)st step. An example of how this feature can be used to perform spatial convolution will be discussed later.
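The videometer's mask-restricted histogram described above, and one way a histogram-dependent intensity mapping could be derived from it, can be sketched in Python as follows. The function names are hypothetical, and the cumulative-count equalization rule is a common textbook choice assumed here, not a method stated in the text.

```python
def masked_histogram(image, mask, bins=1024):
    """Count pixel values only where the binary mask is set."""
    counts = [0] * bins
    for image_row, mask_row in zip(image, mask):
        for value, inside in zip(image_row, mask_row):
            if inside:
                counts[value] += 1
    return counts

def equalization_table(counts, out_levels=1024):
    """Build a mapping from the cumulative histogram (assumed rule)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("mask selects no pixels")
    table, running = [], 0
    for count in counts:
        running += count
        table.append(running * (out_levels - 1) // total)
    return table

# Tiny hypothetical example with 4 grey levels and a 2 x 2 image.
image = [[0, 1], [2, 3]]
mask = [[1, 0], [0, 1]]
counts = masked_histogram(image, mask, bins=4)
table = equalization_table(counts, out_levels=4)
```

A table derived this way could then be loaded into an output function memory, so that the mapping reflects only the statistics of the region of interest.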
The feedback option also provides one 13-bit refresh channel which allows the retention of the full 13-bit result of the combinational logic (see Figure 2). This 13-bit channel may later be read back by the host computer.

Figure 4. Graphics overlay subsystem.

Figure 5. Videometer.

Analysis of processing capability

The display processor can be used to perform virtually any point transformation on multiband imagery. The drive signals to the monitor can be expressed as

$$D_i(S,L) = \mathrm{OFM}_i\left\langle \sum_{j=1}^{N} \mathrm{LUT}_{ij}\big(\mathrm{IMAGE}_j(S,L)\big) \right\rangle \qquad (1)$$

where
$D_i(S,L)$ is the value, at the sample-line coordinates $(S,L)$, of the image being written out by the ith drive signal to the monitor;
$i = 1,2,3$ is an index for the red, green, and blue data paths;
$j = 1,2,3,\ldots,N$ is an index for an individual refresh memory (N may be as high as 12);
$\mathrm{IMAGE}_j(S,L)$ is the value of the refresh memory image element at the sample-line coordinates $(S,L)$;
$\mathrm{LUT}_{ij}(*)$ is the transformation loaded in the ith (i=1,2,3) look-up table associated with the jth refresh memory; and
$\mathrm{OFM}_i(*)$ is the transformation loaded into the ith (i=1,2,3) output function memory.

Figure 6. Individual spectral components of ERTS (Landsat) image: (a) ERTS band 4, (b) ERTS band 5, (c) ERTS band 6, (d) ERTS band 7.

As evident from Equation 1, any of the standard radiometric transformations, such as scaling and histogram equalization, may be performed by loading the appropriate mapping in either the output function memories or look-up tables. More elaborate processing may be accomplished by using both the look-up tables and output function memories. Performing a ratio operation on a multiband image is one example.
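To make the structure of Equation 1 concrete, the following Python sketch evaluates one drive signal in software: each refresh memory is passed through its own look-up table, the results are summed, and the sum is passed through an output function. The names are hypothetical, and the output function is modeled as a plain Python callable standing in for the range register and output function memory. The example loading (an identity table, a negating table, and an absolute value) is an assumed illustration of simple change detection.

```python
def drive_signal(images, tables, output_function):
    """Evaluate D(S,L) = OFM( sum over j of LUT_j(IMAGE_j(S,L)) ) for one path."""
    lines, samples = len(images[0]), len(images[0][0])
    result = []
    for line in range(lines):
        row = []
        for sample in range(samples):
            total = sum(table[image[line][sample]]
                        for table, image in zip(tables, images))
            row.append(output_function(total))
        result.append(row)
    return result

identity = list(range(256))           # LUT for the first memory
negate = [-v for v in range(256)]     # LUT for the second memory

before = [[10, 200], [90, 90]]        # hypothetical image values
after = [[30, 150], [90, 95]]
change = drive_signal([before, after], [identity, negate], abs)
```

The hardware performs the same per-pixel computation for all three color paths in parallel while the image is being read out for display.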
By loading logarithms of different polarities into the look-up tables, and exponential functions (antilogs) into the output function memories, the following type of transformation can be performed:

$$D_i(S,L) = \mathrm{EXP}\left\langle \mathrm{LOG}\big[\mathrm{IMAGE}_{m_i}(S,L)\big] - \mathrm{LOG}\big[\mathrm{IMAGE}_{n_i}(S,L)\big] \right\rangle = \frac{\mathrm{IMAGE}_{m_i}(S,L)}{\mathrm{IMAGE}_{n_i}(S,L)} \qquad (2)$$

In other words, three ratios of various pairs of bands, one for each drive signal, can be produced by the processor at video rates. By using logarithmic functions of the same polarity in the transformation of Equation 2, products of images may be generated. By multiplying an image by a correction mask, this approach can be used for such space variant radiometric corrections as the removal of vignetting and camera shading.

A second type of manipulation commonly performed on multiband data can be expressed as the following linear transformation:

$$D_i(S,L) = \sum_{j=1}^{N} A_{i,j}\,\mathrm{IMAGE}_j(S,L) + B_i \qquad (3)$$

Special cases of the transform in Equation 3 include the Hadamard, Karhunen-Loeve, Slant, and Fourier transforms (in the spectral dimension). An example of this is provided in Figures 6 and 7, which show the four components of an ERTS (Landsat) image and the result of performing a coordinate rotation with the eigenvectors of the spectral covariance matrix. (In Equation 3, the rows of the matrix "A" would be the eigenvectors, and the components of the vector "B" would be zero.) This procedure, often referred to as a Karhunen-Loeve transformation, generates a new coordinate system in which the new image components are uncorrelated.

Figure 7. Principal components of ERTS image: (a) 1st, (b) 2nd, (c) 3rd, and (d) 4th principal component.

Although only individual bands are shown in Figure 7, the display processor is capable of processing three of the transform components in parallel, and displaying them as red, green, and blue.
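The ratio transformation of Equation 2 can be sketched in Python with quantized tables. The scale factor, the treatment of zero-valued pixels, and all names below are assumptions chosen so that the table entries fit the 9-bit signed range of the look-up tables; the output is left as a floating-point ratio, whereas a real output function memory would hold scaled integer entries.

```python
import math

SCALE = 255 / math.log(255)   # assumed scaling: LOG(255) maps to 255

def log_entry(value):
    """Scaled logarithm; zero is treated as one to avoid LOG(0) (assumption)."""
    return round(SCALE * math.log(max(value, 1)))

log_plus = [log_entry(v) for v in range(256)]   # table for the numerator band
log_minus = [-entry for entry in log_plus]      # opposite polarity, denominator

def antilog(total):
    """Output function: undo the scaled logarithm."""
    return math.exp(total / SCALE)

def ratio_pixel(numerator, denominator):
    """Ratio of two pixel values using only table look-ups and one addition."""
    return antilog(log_plus[numerator] + log_minus[denominator])
```

Rounding the logarithms to integers makes the result approximate: for example, `ratio_pixel(200, 100)` comes out close to, but not exactly, 2. Loading `log_plus` for both bands would give a product instead of a ratio.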
In mathematical terms, this can be expressed by considering Equation 3 with a non-square (3 row by N column) "A" matrix. By ordering the eigenvectors in the same sequence as the magnitudes of their eigenvalues, the three "principal components" in the uncorrelated space can be displayed as a red, green, and blue color composite. This type of transformation effects a dimensionality reduction, or bandwidth compression, on the data. This has been done with the four-band ERTS image whose individual spectral components are shown in Figure 6. A conventional color composite, made by deleting one band, and the principal component color composite are shown in Figure 8. More detail is visible in the Karhunen-Loeve image than in the raw composite. This transformation is actually being performed 30 times per second as the image data travels through the processor to generate the refresh signals. Thus, the implementation time is limited by the speed with which the eigenvectors can be computed and the tables loaded.

As a second special case, the transformation in Equation 3 provides the ability to perform virtually any color coordinate rotation on three-band data in real time.

Although the image manipulations in the above examples were all "point processes," the display hardware is capable of performing spatial processing as well. For example, spatial convolution algorithms which are typically used for edge enhancement may be performed at near-video rates by using the feedback loop and certain special registers which impose sample and line offsets in the image data. Consider a rather primitive two-dimensional filter employing vertical and horizontal differences:

$$\mathrm{IMAGE}'(S,L) = A_1\,\mathrm{IMAGE}(S,L) - A_2\,\mathrm{IMAGE}(S+1,L) - A_3\,\mathrm{IMAGE}(S,L-1) \qquad (4)$$

where $\mathrm{IMAGE}'(S,L)$ is the value of the processed output image at the sample-line coordinates $(S,L)$; $\mathrm{IMAGE}(S,L)$ is the value of the original input image at the sample-line coordinates $(S,L)$; and $A_1$, $A_2$, $A_3$ are arbitrary coefficients.
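For reference, the filter of Equation 4 can be evaluated directly in software. The Python sketch below is a plain per-pixel evaluation, not the hardware method; the function name is hypothetical, images are indexed as `image[line][sample]`, and neighbors beyond the image border are assumed to wrap around to the opposite edge.

```python
def difference_filter(image, a1, a2, a3):
    """IMAGE'(S,L) = a1*IMAGE(S,L) - a2*IMAGE(S+1,L) - a3*IMAGE(S,L-1)."""
    lines, samples = len(image), len(image[0])
    return [
        [
            a1 * image[line][sample]
            - a2 * image[line][(sample + 1) % samples]   # next sample, wrapped
            - a3 * image[(line - 1) % lines][sample]     # previous line, wrapped
            for sample in range(samples)
        ]
        for line in range(lines)
    ]

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]                      # hypothetical 3 x 3 test image
filtered = difference_filter(image, 2, 1, 1)
```

With coefficients 2, 1, 1 a constant image filters to zero everywhere, which is the expected behavior of a difference filter whose weights sum to zero.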
The above image filtering can be accomplished by recursively combining shifted versions of the original image. The necessary shifts are provided by certain registers which impose line and pixel offsets in the video data stream. A line shift can be induced in the data stream from a given refresh memory by use of the scroll register associated with that memory. An orthogonal shift in the horizontal (samples) direction can be introduced by a pixel shifting register which imposes a delay in the data stream at the input to the input function memory. From Figure 1, it is evident that in one pass through the feedback loop, an image resident in a given refresh memory can be scaled, shifted, and deposited in a second refresh memory. Mathematically, the feedback operation can be expressed as

$$\mathrm{IMAGE}'(S-\mathrm{PIX},\,L) = \mathrm{IFM}\left\langle \sum_{j=1}^{N} \mathrm{LUT}_{ij}\big(\mathrm{IMAGE}_j(S,\,L+\mathrm{SCROLL})\big) \right\rangle \qquad (5)$$

where
$\mathrm{IMAGE}'(S,L)$ is the value of the refresh memory image element at the sample-line coordinates $(S,L)$ for the processed output image;
$\mathrm{IMAGE}_j(S,L)$ is the value of the refresh memory image element at the sample-line coordinates $(S,L)$ for the original image in the jth refresh memory;
PIX is the delay imposed by the pixel shifting register;
SCROLL is the line offset imposed by the scroll register;
$\mathrm{IFM}(*)$ is the mapping programmed into the input function memory; and
$\mathrm{LUT}_{ij}(*)$ is the transformation loaded in the ith (i=1,2,3) look-up table associated with the jth refresh memory.

Figure 8. Conventional color composite, and Karhunen-Loeve color composite: (a) original image, ERTS bands 7, 5, 4 displayed as red, green, and blue; (b) principal component rotation of ERTS image, components 1, 2, and 3 as red, green, and blue.

The values (S-PIX) and (L+SCROLL) are 9-bit (modulo 512) quantities; therefore, edge wraparound takes place. The feedback loop image transformation of Equation 5 occurs in one video frame time (1/30 second). Thus, the simple convolution expressed in Equation 4 can be
decomposed into a series of simple shifts and adds, each of which could be done in one frame time. Each of the "passes" through the feedback loop can be considered as analogous to an "instruction" in an image manipulation language. The time required to execute each of these instructions is generally limited by the time required for the host CPU to set registers, load tables, etc. (usually a fraction of a second). For instance, the simple convolution of Equation 4 can be "programmed" as a series of three instructions, requiring two "scratchpad" memories (whose use is analogous to storage registers in conventional programming constructs). One straightforward approach would be

$$\mathrm{IMAGE}_1(S,L) = \mathrm{IMAGE}_0(S,L-1) \qquad (6)$$

$$\mathrm{IMAGE}_2(S-1,L) = \mathrm{IMAGE}_0(S,L) \qquad (7)$$

$$\mathrm{IMAGE}_3(S,L) = A_1\,\mathrm{IMAGE}_0(S,L) - A_3\,\mathrm{IMAGE}_1(S,L) - A_2\,\mathrm{IMAGE}_2(S,L) \qquad (8)$$

In Equations 6, 7, and 8, IMAGE0 refers to the original image, IMAGE1 and IMAGE2 refer to temporary images, and IMAGE3 is the result. Equation 6 indicates that a line-shifted version of the original image is copied into a scratchpad memory using the feedback loop and a scroll register. Equation 7 indicates the generation of a sample-shifted duplicate which utilizes the pixel shifting register. In Equation 8, the original and its shifted replicas are weighted and summed together to produce the convolved result; the weights and signs follow Equation 4, since IMAGE1 holds the line-shifted term and IMAGE2 the sample-shifted term.

Although this example uses a trivially simple filter, arbitrarily complex convolutions could be performed by accumulating partial sums (in place) in a recursive fashion. As with the programming of a conventional computer, the algorithms possible are limited only by the imagination and ingenuity of the programmer.

Acknowledgments

The authors would like to express their appreciation to Howard Roberts, John Murphy, Dick Sutton, and Mark Shoenaur, whose efforts in the design and implementation of this system made this paper possible, and to Dr.
Paul Scheibe and Glenn Peterson, who contributed many significant concepts in the early phases of the system design.

John Adams is manager of Computer Systems at Stanford Technology Corporation in Sunnyvale, Calif. His professional interests include the design of computer-based image processing systems, digital restoration and enhancement techniques, raising livestock, and small-scale farming. From 1966 to 1975 he was employed at ESL Corp. in Sunnyvale, where he was also involved with digital imagery. He received the BS degree in mathematics from California Polytechnic in 1969, and the MS degree, also in mathematics, from the University of Santa Clara in 1973.

Bob Wallis is a software engineer at Stanford Technology Corp. in Sunnyvale, Calif., where his responsibilities have been in the development of applications software for computer-based image processing systems. His present interests are digital image enhancement techniques, numerical analysis, colorimetry, and the design of digital signal processing equipment. Wallis received a BSEE from the University of Rochester in 1969, an MSEE from the University of Southern California in 1971, and a PhD in electrical engineering, also from USC, in 1975.