NEW DATA ON NOISE VISIBILITY AND ITS APPLICATION TO IMAGE TRANSMISSION

by

ULICK OLIVER MALONE

B.A., B.A.I., Trinity College Dublin (1975)

SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY

JANUARY 1977

Signature of Author: Department of Electrical Engineering and Computer Science, January 31, 1977
Certified by:
Accepted by: Chairman, Department Committee on Graduate Students

Submitted to the Department of Electrical Engineering and Computer Science on January 31, 1977 in partial fulfillment of the requirements for the Degree of Master of Science.

ABSTRACT

A series of noise visibility experiments have been undertaken. The results of these experiments are used to validate the form log(1+ab) of the functional transfer model of vision. Certain of the results are found to be incompatible with Stockham's visual model. A theoretical framework for image dependent companding is set up using the functional transfer model of vision. Examples are given which show that this technique is an improvement on the traditional approach to optimum companding. All experiments and applications were implemented using a general purpose computer based image processing facility.

Name and Title of Thesis Supervisor: Donald E. Troxel, Associate Professor of Electrical Engineering.

ACKNOWLEDGEMENTS

Many thanks are due to my wife Cathy for the encouragement she gave me during the year I worked on this project. I am very grateful for the guidance I received from my supervisor Professor Donald Troxel and for the many hours of assistance given me by Charles Lynn.

TABLE OF CONTENTS

ABSTRACT
ACKNOWLEDGEMENTS
CHAPTER 1. INTRODUCTION
CHAPTER 2. EXPERIMENTAL TECHNIQUES
CHAPTER 3. OPTIMUM COMPANDING
CHAPTER 4. PICTURE DEPENDENT COMPANDING
CHAPTER 5. THE INFLUENCE OF BACKGROUND ADAPTION ON NOISE VISIBILITY
APPENDIX 1
BIBLIOGRAPHY AND REFERENCES

CHAPTER 1

INTRODUCTION

The subject of noise visibility is of fundamental importance in image processing and transmission because very many of the techniques of image technology give rise to pictorial noise. As a result much effort has been devoted to the development of methods of reducing the detrimental effects of noise on picture quality. A good example is the quantization noise in PCM systems for pictures. This can lead to obvious discontinuities in the appearance of the pictures and false contours in areas of low detail. The visibility of such contours becomes irritating if a resolution of less than four bits per pel is attempted.

A variety of techniques have been developed for either eliminating contours or lowering their visibility. For example, Graham (1) found that the visibility of the contours could be reduced by applying certain filtering operations to the image before and after quantization. Quantization contours may be considered to be the result of the addition of highly structured, picture correlated noise to the image, and as has been shown in various studies (3, 4, 5) such noise is more visible than random white noise of the same amplitude. Roberts devised a scheme which takes advantage of these facts (6). In this scheme pseudo-random noise is added to the image before quantization and the same noise signal is subsequently subtracted from it. The resulting noise is pseudo-random noise of the same amplitude as the quantization noise, but of lower visibility.
Fairly acceptable pictures are produced by the Roberts scheme using only three bits per pel.

An understanding of the process of vision is essential to an understanding of noise visibility. The classical Weber fraction experiments (5, 6) have given rise to a simple but powerful visual model. In this model the output intensity at any point is considered to be some function of the intensity of the corresponding point in the input scene. This function v(b) defines the visual model and may be referred to as the visual transfer function. The results of the Weber fraction experiments have led to the conclusion that the visual transfer function is logarithmic. This information can be used to make predictions about noise visibility. For example, using the logarithmic model it can easily be shown that noise should be more visible in the dark tones than in the bright tones of a picture, and this, as is well known, is true.

As a more practical example, Hashizume (8) used this model to show how noise visibility may be made independent of intensity. This manipulation of noise visibility is referred to as companding and is achieved by performing a tone scale transformation on the picture before noise is added and then performing the inverse transformation after the noise is added. Hashizume used the functional transfer model of vision to show that the function v(b) is the companding function which achieves noise visibility independent of intensity. Since this equalization of the noise in a picture usually results in an overall decrease in its visibility, the logarithmic companding scheme is often used in combination with the Roberts technique for further improvement in image quality (reference 10 is a good example).

The results of the Weber fraction experiments and the work of Hashizume have left some doubt as to the exact form of v(b).
The Weber fraction experiments were mostly conducted in unusual conditions of dark adaption, so it is not clear that the results of these experiments apply to more comfortable viewing conditions such as office lighting. For this reason part of this thesis deals with a new noise visibility experiment, similar to the Weber fraction experiments, which not only provides valuable new data on noise visibility but also allows a derivation of the exact form of v(b). This experiment was conducted under comfortable lighting conditions with a view to obtaining a result for v(b) which would apply in practical situations. This new result for v(b) was also intended for comparison with Hashizume's postulate v(b) = k log(1+ab) which, though successful in companding applications, was not verified directly.

Optimum companding using v(b) has the property of causing noise visibility to be independent of intensity. This necessitates a decrease in noise in the dark tones but an increase of noise in the bright tones. So if a picture is nearly all bright, optimum companding can have the undesirable effect of increasing the overall noise in the picture. A major portion of this thesis deals with methods of overcoming this inability of optimum companding to match itself to the intensity distribution of the individual picture.

A variety of optical illusions exist which cannot be explained by the functional transfer model of vision. Mach bands (7, 14), simultaneous contrast and brightness constancy are the best known of these effects. All are examples of the output intensity from the vision system not being functionally related to the intensity of the corresponding point in the input scene, and hence of the breakdown of the functional transfer model. Most attempts at developing a model which explains these illusions have concluded that the appropriate model is a log stage followed by a linear shift invariant filter (7, 12, 13, 14). Stockham's visual model (14) has been particularly successful in dealing with illusions. It appears to be the best visual model to date. As such it has the potential of being very useful in the mathematical analysis of noise visibility, and also in the field of noise reduction where it could be used as a companding processor. Unfortunately little or no research appears to have been done in this area since Stockham's paper was published in 1972.

Experiments have been described in the literature (5) which demonstrate that the sensitivity of vision in a small area is decreased by increasing the contrast between the small area and its background. Part of the work of this thesis deals with an investigation of this phenomenon in which the variation of noise visibility was measured as a function of contrast. A second experiment was designed to determine under what conditions contrast influenced noise visibility. It was hoped that the results of these experiments would give an indication of whether this effect is of any relevance to practical image processing. The decrease of noise visibility as contrast increases is another example of an effect which the functional transfer model fails to explain, but it is not intuitively clear whether Stockham's visual model can account for it or not. For this reason an analysis of the compatibility of this effect with Stockham's model will be given in this report.

The fact that noise is more visible in blank fields than in areas of detail (5) raises the issue of the relationship between noise visibility and the spectra of the noise and picture. Greenwood (3) and Mitchell (4) have studied this relationship and found it to be quite complex. Greenwood found that both spectra influence the visibility of the noise. Mitchell's experiments indicated that noise is most effectively concealed in the details of a picture when both picture and noise have the same frequency content.
White noises with different probability distributions but equal variances have been found to have equal visibility (11), so it may be concluded that probability distribution is not an important factor in noise visibility.

A survey of the present knowledge of noise visibility has now been completed, and it may be concluded that the subject is very complex and not yet fully understood. The aim of this work has been to accumulate some new experimental data on noise visibility, investigate the implications of this data for visual models and their ability to predict noise visibility, and finally to use this new knowledge to improve on the traditional approach to companding.

CHAPTER 2

EXPERIMENTAL TECHNIQUES

2.1: The APED system.

This work was carried out using the APED image processing facility of the Cognitive Information Processing Group at the Research Laboratory of Electronics, M.I.T. This system is supervised by a custom designed real time multiprocessing operating system for a PDP-11/40 minicomputer. APED was designed to respond to a simple set of powerful user commands which may be entered into the system via a keyboard. The multiprocessing feature of APED allows it to perform a variety of tasks needed to keep the system in order concurrently with its real time servicing of user commands.

APED was designed to receive, transmit, display and process pictorial data. The basic data structure of APED is the picture file. A picture file is composed of lines, each of which is made up of a number of pels (picture elements). Each pel is internally represented as a binary number. For monochrome pictures this binary number is proportional to the intensity at a point of the picture being represented. A picture file is thus a two dimensional digital signal corresponding to the digitized samples of the intensities in the picture being represented.
Operator commands enable the user to input pictures to the system from the Associated Press news photo wire and a facsimile receiver device. Once received, a variety of processing operations may be carried out on the picture such as filtering, sharpening or enlarging. The processed picture may then be transmitted to its final destination, for example disk storage or the T.V. display.

2.2: Software for noise visibility experiments.

A variety of new APED commands were developed for this research. These included commands to add random noise to a picture, commands to generate noisy test patterns for the noise visibility experiments, and commands to reformat news photos for the purpose of testing applications. A picture format of 256 lines of 128 pels with 8 bit resolution was selected as the standard for these commands. A Tektronix 633 picture monitor was used for display purposes. Existing hardware was used to display the 256 x 128 pel pictures on this T.V., and produced a square picture of dimensions 28 cm x 28 cm. Thus the vertical spacing of pels was twice as close as the horizontal spacing.

Software development was carried out using the manufacturer's operating system DOS for the PDP-11. The new commands were programmed using assembly language, assembled, debugged, and then integrated into the APED system.

APED was found to be very suitable for implementing the experiments and applications of this research. Its great flexibility allowed all the new commands to be implemented in software and without any modifications to the existing hardware. This highlights the utility of general purpose systems such as APED in implementing a great variety of tasks with minimum effort.

2.3: The generation of pseudo-random numbers.

In this work pictorial noise was produced by adding a sequence of random numbers to the picture signal before displaying it. A subroutine was therefore required to produce a sequence of random numbers with acceptable statistical properties.
The required probabilistic behavior for this application is that the output of the random number generator should behave as a discrete random variable n with P.M.F.

  $p_n(n_0) = \frac{1}{2N+1}, \quad -N \le n_0 \le +N,$

where N, the noise amplitude, is the input parameter to the generator. It was decided to achieve this by first generating a random number from the range (0,1), multiplying this number by 2N + 1, truncating the result and then subtracting N. This equivalent procedure simplifies the problem to generating random numbers with uniform distribution on the range (0,1).

As a first attempt, the linear congruential algorithm as described by Knuth (17) was implemented and tested. This algorithm can be summarized as follows:

  $X_{n+1} = (a X_n + c) \bmod M$

where $X_{n+1}$ and $X_n$ are the (n+1)th and nth values of the random sequence. a and c were chosen according to the constraints laid down by Knuth. M was chosen to be $2^{15}$ since this allowed modulo arithmetic to be programmed with ease on the PDP-11. The seed $X_0$ was initialized with the computer clock time. The foregoing procedure guarantees that all $2^{15}$ possible values of $X_n$ are generated before the sequence starts to repeat. The periodicity of generators such as this one is the reason why they are referred to as pseudo-random rather than random number generators.

The 15 bit numbers produced by the Knuth algorithm may be considered to be 15 bit binary fractions selected with uniform probability from the range (0,1) and so may be used to obtain a random sequence of amplitude N using the calculation $\lfloor (2N+1) X_n \rfloor - N$. An example of the pictorial noise produced using this procedure is given in Fig. 2.1(a). The vertical stripe pattern indicates a high degree of correlation between every 128th number in the sequence. Attempts to eliminate the stripes by varying the values of a and c met with no success.
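
The procedure just described can be sketched in a few lines of Python. This is an illustration only: the thesis does not state the values of a and c that were used, so the constants below are hypothetical choices that merely satisfy the usual full-period conditions for a power-of-two modulus (c odd, a - 1 divisible by 4), and the function names are assumptions.

```python
M = 1 << 15  # modulus 2^15, as stated in the text

def lcg(seed, a=20077, c=12345):
    """Yield successive values of X_{n+1} = (a*X_n + c) mod 2^15.

    a and c are illustrative (hypothetical) constants, not the thesis values.
    """
    x = seed % M
    while True:
        x = (a * x + c) % M
        yield x

def noise_sample(x, amplitude):
    """Map a 15-bit value x, read as the fraction x / 2^15, to an integer in [-N, N].

    Implements: multiply by 2N+1, truncate, subtract N.
    """
    return ((2 * amplitude + 1) * x) // M - amplitude

def noise_field(seed, amplitude, lines=256, pels=128):
    """Fill a picture-sized field (256 lines of 128 pels) with noise samples."""
    gen = lcg(seed)
    return [[noise_sample(next(gen), amplitude) for _ in range(pels)]
            for _ in range(lines)]
```

Because a 256 x 128 field needs exactly 2^15 samples, one field consumes one full period of this generator.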

It may be concluded that for M = $2^{15}$ these undesirable vertical stripes are an inescapable result when using the linear congruential generator and a line size of 128. For this reason it was decided not to use the linear congruential method.

The problem of vertical stripe patterns in the pictorial noise was eliminated by using a pseudo-random number generator based on a feedback shift register. The particular logical configuration selected has already been described in a paper by Troxel (10), and corresponds to the following bit equations for an 18 bit register:

  b0 = b3 XOR b10      b8  = b0
  b1 = b4 XOR b11      b9  = b1
  b2 = b5 XOR b12      b10 = b2
  b3 = b6 XOR b13      b11 = b3
  b4 = b7 XOR b14      b12 = b4
  b5 = b8 XOR b15      b13 = b5
  b6 = b9 XOR b16      b14 = b6
  b7 = b10 XOR b17     b15 = b7
                       b16 = b8
                       b17 = b9

Repeated invocation of these feedback equations gives rise to a sequence of random 8 bit patterns in bits 10-17 of the register. The period of the random sequence is 15 2 register length

An example of the pictorial noise produced by the shift register random number generator is given in Fig. 2.1(b). Unlike Fig. 2.1(a), this photograph is free of undesirable patterns.

Fig. 2.1(a) Noise field using linear congruential generator, N = 16.
Fig. 2.1(b) Noise field using shift register generator, N = 16.

2.4: Statistical testing of the shift register pseudo-random number generator.

In order to test the statistical behavior of this generator the average and average squared values of its output for N = 6 were calculated for sequence lengths of 72. Each sequence was initialized using computer clock time. Since N = 6 was the largest amplitude and 72 was the smallest sequence length needed for quantitative results in this work, the statistical behavior for these values of the parameters represents the worst case which can arise. Using the model

  $p_n(n_0) = \frac{1}{13}, \quad -6 \le n_0 \le 6,$

for the output of the generator some relevant properties for a sequence length L = 72 are summarized in Fig. 2.2.

  Variable                                   Expectation              Variance
  $\frac{1}{72}\sum_{i=1}^{72} n_i$          0                        $\frac{N(N+1)}{3 \cdot 72} = \frac{14}{72} = 0.194$
  $\frac{1}{72}\sum_{i=1}^{72} n_i^2$        $\frac{N(N+1)}{3} = 14$  $\frac{1}{72}\left[\frac{N(N+1)(3N^2+3N-1)}{15} - \frac{N^2(N+1)^2}{9}\right] = 2.14$

Fig. 2.2 Expectation and variance of random variables based on a sequence of length L = 72 and amplitude N = 6.

The variances for the average and average squared values are 0.194 and 2.14 respectively, so that the standard deviations are less than 2 in both cases. The average value can therefore be expected to be in the range (-2,2), and the average squared value in the range (12,16) with high probability. The results of several tests are given in Fig. 2.3. The average and average squared values obtained are well within the expected range.

  $\frac{1}{72}\sum n_i$      $\frac{1}{72}\sum n_i^2$
  -.14                        13.14
  -.01                        14.21
   .35                        12.94
   .49                        13.51
   .21                        15.57

Fig. 2.3 Samples of the average and average squared values of a sequence of length 72 and amplitude 6.

In view of the fact that the shift register pseudo-random number generator does not appear to possess undesirable properties, it was selected for use in all the experiments and applications of this work.

2.5: The T.V. display.

The intensity to voltage response of the T.V. display is non linear, as shown in Fig. 2.4. Therefore it is necessary to transform the binary representation of intensities by the inverse of the function of Fig. 2.4 before converting to analog. This is done by loading 256 rounded samples of the inverse function into a RAM in the display electronics. Each pel is transformed by this table before being converted to the analog voltage used to display the pel. This procedure linearizes the intensity response of the T.V. at the expense of introducing a small amount of noise due to the finite precision representation of the transformation table.

Fig. 2.4 T.V. Transfer Characteristic (output in foot candles versus input integer intensity, 0 to 256).

The T.V. characteristic was found to vary very slightly with time, and also over the area of the screen.
These fluctuations were judged to be too small to have a significant influence on the noise visibility experiments.

The T.V. was found to have a dynamic range of 38 foot candles. 38 foot candles was represented internally by the integer 255, so that the internal unit of intensity is 38/255 = .15 foot candles. For convenience this unit (corresponding to the binary representation of intensity) is used throughout this report.

2.6: Control of the experimental environment.

A viewing distance of 125 cm was found to be comfortable by the experimental subjects. All the noise visibility experiments were conducted with the eyes of the subject at this distance from the T.V. screen.

Lighting conditions were controlled by curtaining off the T.V. area from external light sources. Inside the curtain comfortable lighting conditions were produced using a 75 W lamp. This produced an intensity of about .1 foot candles incident on the T.V. screen.

CHAPTER 3

OPTIMUM COMPANDING

3.1: A simple visual model.

In this section vision is modeled as the process of mapping the point (x,y) of intensity b in the input scene onto the point (x,y) of intensity v(b) in the output scene. This model is completely specified by a knowledge of the function v(b). It is a very simple model, and though it does not account for such visual phenomena as Mach bands and simultaneous contrast, it is the basis of our understanding of the sensitivity of the eye to detail as a function of intensity. Furthermore it allows the development of important applications such as companding.

3.2: Procedure for determining v(b).

The following approach has been taken to measuring v(b) in this research. A differential stimulation in the input scene of magnitude Δb is produced on a background of intensity b. The stimulation Δb is produced by adding noise to a blank field with intensity b. (This method of producing a differential stimulation will be discussed in more detail shortly.)
The output intensity from the human vision system is v(b) and the output stimulation is Δv. For each value of b there exists a critical value Δb_c of Δb such that the output stimulation Δv is at the threshold of visibility. By performing psychophysical experiments Δb_c can be measured as a function of b. Let this function be denoted J(b). For all values of b, the stimulation Δb_c = J(b) produces a constant, just detectable output stimulation Δv_c. Thus while Δb_c is a function of b, Δv_c is not and may therefore be assigned a constant value Δv_c = k. Since the ratio Δv/Δb is approximately the derivative of v(b) for Δb small, the following equation is a good approximation:

  $\frac{dv}{db} = \frac{\Delta v_c}{\Delta b_c} = \frac{k}{J(b)}$

Integrating both sides of the equation yields:

  $\int_0^b \frac{dv}{db}\,db = v(b) - v(0) = v(b) = k \int_0^b \frac{db}{J(b)}$

Thus the experimental determination of J(b) allows the calculation of v(b) using the result

  $v(b) = k \int_0^b \frac{db}{J(b)}$

3.3: Procedure for producing a differential stimulation using additive noise.

The classical experimental arrangement for producing a stimulation Δb has been as follows (5): an observer is exposed to a uniform field of light of intensity b with a small circular target in the center of intensity b + Δb. However, since the underlying theme of this research is noise visibility, additive noise was used to produce the stimulation Δb in the target.

The input scene was a square T.V. picture of dimension 28 cm x 28 cm (128 x 256 pels) at a distance of 125 cm from the observer. (A Tektronix 633 picture monitor was used for this purpose.) Standard lighting conditions typical of a comfortable viewing situation were produced by curtaining off the T.V. and observer from external light sources, and turning on a desk lamp (75 W) which caused a light intensity of about .1 foot candles incident to the T.V. screen. Keyboard commands to APED from the supervisor of the experiment enabled all the pels of the T.V. picture to be set to the same intensity b (for convenience, T.V. intensities refer to integers on the range 0-255 rather than the physical units to which they are proportional) except for a small square target of 72 pels (6 x 12 pels, or 1.4 cm square). The target was moveable to a random position in the central square area of dimension 14 cm x 14 cm by operator command, thus eliminating the tendency towards guessing by the observer which arises when using a stationary target. The intensity of each target pel was set to b + n, where n is an integer random number supplied with uniform likelihood from the set -N, -N+1, ..., -1, 0, 1, ..., N-1, N by the shift register pseudo-random number generator. The amplitude N was adjustable by operator command. Thus in this experiment the stimulation Δb was produced by adding discrete uniform random noise of amplitude N to a background of intensity b.

An objective measure of the visual stimulation Δb produced by this noise is now required. Since the output of the pseudo-random number generator used to produce the visual noise behaves as a random variable n with P.M.F.

  $p_n(n_0) = \frac{1}{2N+1}, \quad -N \le n_0 \le N,$

it seems likely that measures of the average deviation or dispersion of this random variable are suitable measures of its effective visual stimulation. This seems all the more acceptable when it is considered that the visual system detects the noise by observing the average amount of deviation of the noise from its mean over the area of the target. Three measures of deviation have been considered:

(a) the amplitude N;

(b) the standard deviation

  $\sigma = \sqrt{E(n^2)} = \sqrt{\frac{1}{2N+1}\sum_{n_0=-N}^{N} n_0^2} = \sqrt{\frac{N^2+N}{3}}$

and

(c) the mean deviation

  $E(|n|) = \frac{1}{2N+1}\sum_{n_0=-N}^{N} |n_0| = \frac{N^2+N}{2N+1}$

These 3 quantities are plotted in Fig. 3.1 as functions of N in the range of N of interest for this series of experiments. This plot shows that the mean and standard deviations are very nearly linearly dependent on N in this range, so that all three measures of deviation are identical in their characterization of n, apart from constants of proportionality. The fact that these three intuitively acceptable measures of the visual effectiveness of the noise are all proportional to N suggests that Δb may be taken as proportional to N for the purpose of estimating the derivative of v(b) as discussed earlier. Thus the threshold noise amplitudes N_c measured in this experiment are proportional to the Δb_c:

  $N_c(b) \propto J(b), \quad \text{and} \quad v(b) \propto \int_0^b \frac{db}{N_c(b)}$

Fig. 3.1 Measures of Deviation (the amplitude N, the standard deviation $\sqrt{(N^2+N)/3}$ and the mean deviation $(N^2+N)/(2N+1)$, plotted against N).

3.4: Conduct of the experiment to determine v(b).

Each human subject was first given the opportunity of learning to detect low noise amplitudes in the target. Once the subject's ability to detect the target had become consistent the formal experiment was begun. The T.V. intensity was initially set at b = 16, and the noise amplitude in the target at N = 1. N was then incremented by 1 as many times as was necessary for the target to become visible to the subject, at which time the current value of N was recorded as N_c(16). Each time N was incremented the target was also relocated to a random position. Relocating the target discourages guessing on the part of the subject. Relocation was achieved by using the shift register random number generator to generate random coordinates for the position of the target. These coordinates were displayed on a terminal, so that the supervisor of the experiment could ensure that what the subject thought he was seeing was in fact the target.

Having determined N_c(16), N_c(b) was then determined using the same procedure for b = 32, 48, 64, 80, 96, 112, 128, 144, 160, 176, 192, 208, 224, 240, and 248 to obtain a total of 16 values of N_c(b). In order to check the consistency of the subject's judgment, N_c(b) was redetermined for a few values of b chosen at random upon completion of the experiment. If the new value of N_c(b) differed from the original it was recorded along with the original.
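
The ascending procedure of Section 3.4 can be sketched as a short Python simulation. This is a hedged illustration rather than the APED implementation: the human subject is replaced by a purely hypothetical stand-in observer with an arbitrary threshold rule, the function names are assumptions, and the target placement simply keeps the 6 x 12 pel target inside the central half of the 128 x 256 pel picture.

```python
import random

# Background intensities used in the experiment (16 values, from the text).
BACKGROUNDS = list(range(16, 241, 16)) + [248]

def random_target_position(rng, width=128, height=256, tw=6, th=12):
    """Pick a top-left corner so the target lies within the central half of the picture."""
    x = rng.randrange(width // 4, 3 * width // 4 - tw + 1)
    y = rng.randrange(height // 4, 3 * height // 4 - th + 1)
    return x, y

def measure_threshold(b, subject_sees, rng, n_max=16):
    """Raise N in steps of 1, relocating the target each time, until it is reported visible."""
    for amplitude in range(1, n_max + 1):
        position = random_target_position(rng)  # relocation discourages guessing
        if subject_sees(b, amplitude, position):
            return amplitude  # recorded as N_c(b)
    return None  # target never reported within n_max

def stand_in_subject(b, amplitude, position):
    """Hypothetical observer with an arbitrary threshold rule; not experimental data."""
    return amplitude >= 1 + b // 100

rng = random.Random(0)
thresholds = {b: measure_threshold(b, stand_in_subject, rng) for b in BACKGROUNDS}
```

In the real experiment `subject_sees` is the subject's verbal report, checked by the supervisor against the displayed target coordinates.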

The experiment was carried out 5 times in standard lighting conditions (.1 foot candles incident on the T.V.) using 5 different subjects in order to be able to average the results and hence obtain a v(b) which reflects the average behavior of human vision. It was also carried out in total darkness using one subject, and under slightly varied lighting conditions using two other subjects in order to test the sensitivity of the experimental results to lighting.

3.5: Experimental results and calculation of v(b).

The results for the five controlled trials are presented in Appendix 1. The values of N_c reported ranged from 1 for b = 16 to 4 for b = 250. Thus there was insufficient resolution to obtain a continuous range of values for N_c. Instead the same value of N_c was usually reported for several consecutive values of b. So only three data points may be derived from the experimental data for the purpose of estimating v(b):

(1) the highest value of b for which N_c = 1;
(2) the highest value of b for which N_c = 2; and
(3) the highest value of b for which N_c = 3.

The highest value of b for which N_c = 1 is ≥ 48 and ≤ 64 (this agrees with the data from all 5 subjects). In the absence of further data it seems reasonable to use b = 56, the midpoint between 48 and 64, as the first data point.

There was less agreement among the subjects on the highest value of b such that N_c = 2. Transitions from N_c = 2 to N_c = 3 were reported between 160 and 176 by two of the subjects, between 96 and 112 by one subject, between 128 and 144 by one subject, and between 192 and 208 by one subject. The method chosen of determining the average behavior based on this data has been to take the arithmetic average of the midpoints of the above intervals:

  $\frac{1}{5}(136 + 168 + 168 + 200 + 104) = 155.2$

It is more difficult to decide on a representative value for the transition from N_c = 3 to N_c = 4.
Three of the subjects were able to locate the target with N = 3 for the maximum value of b tested, b = 250, though not with great consistency. The other two subjects reported transition from N_c = 3 to N_c = 4 in the intervals 208-224 and 224-240. One possible conclusion that may be drawn from this data is that the representative value should be placed in or about the maximum value tested, b = 250.

The three representative values of b selected, 56, 155.2, and 250, are the points at which the noise is just visible for amplitudes N = 1, 2, and 3. As shown in Fig. 3.2 these three data points are very close to the linear relationship .435 + .0101b. As explained earlier this implies that the function Δb_c(b) = J(b) is the same linear function multiplied by a constant:

  $J(b) = k_0 (.435 + .0101\,b)$

Letting m = .435 and n = .0101,

  $v(b) = \int_0^b \frac{k\,db}{k_0 (m + nb)} = K \log\left(1 + \frac{n}{m} b\right)$

where $K = \frac{k}{k_0 n}$. Substituting for m and n gives:

  $v(b) = K \log(1 + ab)$

where K is a constant, and a = .0232.

Fig. 3.2 Data Points and the Line .435 + .0101b (N_c versus b).

The values of m and n used to obtain this result are based on the allocation of the three data points of Fig. 3.2. Though these three points were selected as objectively as possible from the mass of experimental data it is clear that these values may not be considered to be highly accurate. However it does appear reasonable to assume that the three points are linearly, or very nearly linearly, related. Allowing the assumption of a linear relationship, it is worthwhile to investigate the potential error in the estimate of the parameter a = n/m of the final result for v(b). The differential of a is

  $da = \frac{1}{m}\,dn - \frac{n}{m^2}\,dm$

Allowing for an error of 10% in the values of m and n results in a maximum positive error in the value of a of approximately

  $\Delta a \approx \frac{1}{m}(.1n) - \frac{n}{m^2}(-.1m) = .2a$

Similarly the greatest negative error would be Δa = -.2a.
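
The arithmetic of the fit and of the first-order error estimate can be checked numerically. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, K is set to 1 for convenience, and the worst case is taken, as in the text, with the errors in m and n having opposite signs.

```python
import math

m, n = 0.435, 0.0101  # intercept and slope of the fitted line, from the text

def threshold_line(b):
    """Fitted threshold amplitude N_c(b) = m + n*b (proportional to J(b))."""
    return m + n * b

a = n / m  # parameter of v(b) = K log(1 + a*b); about 0.0232

def v(b, K=1.0):
    """Visual transfer function obtained by integrating 1/J(b); K is arbitrary here."""
    return K * math.log(1.0 + a * b)

def a_bounds(rel_err=0.10):
    """First-order bounds on a when m and n are each uncertain by rel_err.

    From da = dn/m - n*dm/m^2, the worst case is da = +/- 2*rel_err*a.
    """
    return a * (1 - 2 * rel_err), a * (1 + 2 * rel_err)

# The three representative data points should give N_c close to 1, 2 and 3.
fitted = [threshold_line(b) for b in (56, 155.2, 250)]
low, high = a_bounds()
```

Under these assumptions the bounds come out close to 0.0186 and 0.0279.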
Thus allowing for 10% variations in the values of m and n would restrict a to the interval (.018, .028).

The foregoing error analysis served to illustrate that the value a = .0232 may not be considered a precision value. However it is doubtful that it should be measured with any greater precision, since the average behavior of human vision is not itself a very precise idea. In the next section it will be shown that the value a = .02 is accurate enough for companding applications, and furthermore that this value is not critical (for example with a = .01 the effect of companding is indistinguishable from using a = .02). The real value of this experiment has been to derive the form log(1+ab), and obtain some idea of the value of the parameter a. It is doubtful that any practical purposes would be served by setting up a more precise experiment than this one.

For comparison purposes, the experiment was carried out with two subjects using slightly modified lighting conditions (a small amount of daylight was allowed instead of using the 75 W lamp). The same trends were observed in the experimental results as with the controlled lighting conditions. Apparently moderate changes in lighting are not critical. In contrast, when the experiment was carried out in darkness, a completely different set of results was obtained: a noise amplitude N = 1 was visible throughout most of the dynamic range of the T.V. The implication is that viewing the T.V. in total darkness radically alters its perception.

3.6: Companding.

Companding is the process of manipulating the visibility of additive noise in a picture by means of processing the picture both before and after the noise is added. The traditional approach to companding has been as shown in Fig. 3.3(a). Each pel intensity b is transformed by a companding function c(b) before the noise is added. After the noise is added the value c(b)+n is inverse transformed by the inverse of c(b) to obtain $c^{-1}(c(b)+n)$. This process alters the visibility properties of the additive noise. The traditional aim of companding has been to cause the visibility of noise to be independent of intensity, in contrast with the absence of companding when the noise is more visible in the dark regions than in the bright regions of a picture. The most important application of this is in conjunction with the Roberts technique of converting the pictorial contours due to intensity quantization to "snow" noise. As is well known the Roberts technique effectively adds uniformly distributed discrete random noise to the picture, so that companding may be used to manipulate the visibility of this noise. This combination of companding and the Roberts technique is of great value in image transmission applications where it is desired to transmit as few bits per pel as possible.

[Fig. 3.3(a). The companding process: b is transformed to c(b), noise n is added to give c(b)+n, and the inverse transform gives $c^{-1}(c(b)+n)$.]

3.7: Determination of an optimum companding function.

For simplicity of discussion, a companding function c(b) on the range $0 \le b \le 255$ will be considered. The 8 bit binary pels used in this research all have intensities within this range. For the purpose of utility the following two boundary conditions are also required:

$$c(0) = 0 \qquad (1)$$

and

$$c(255) = 255 \qquad (2)$$

Following the approach of Hashizume (8) the optimum companding function may now be defined:

Definition: The optimum companding function c(b) on the range $0 \le b \le 255$ is that function which obeys (1) and (2) above and also

$$\frac{d}{db}\left[v\left(c^{-1}(c(b)+n)\right) - v(b)\right] = 0 \qquad (3)$$

where v(b) is the visual transfer function and $c^{-1}(b)$ is the inverse of c(b).

As explained by Hashizume, (3) is equivalent to requiring that the apparent noise on a background b due to noise n using the standard companding arrangement be independent of b. It is a trivial matter to show that

$$c(b) = \frac{255\,v(b)}{v(255)}$$

is the optimum companding function. (1) and (2) obviously hold. To show that (3) holds let $A = 255/v(255)$ so that $c(b) = A\,v(b)$ and $c^{-1}(b) = v^{-1}[b/A]$. Then

$$c^{-1}(c(b)+n) = v^{-1}\left(v(b) + \frac{n}{A}\right)$$

and

$$v\left(c^{-1}(c(b)+n)\right) - v(b) = \frac{n}{A}$$

which is independent of b.

In the previous section v(b) was found to be k log(1+ab) where a ≈ .02. Therefore the optimum companding function is

$$c(b) = \frac{255 \log(1+ab)}{\log(1+a \cdot 255)}, \qquad a \approx .02.$$

This is in remarkable agreement with the work of Hashizume, who having postulated the above form and experimented with various values of a, found that companding using a = .01 resulted in noise which was apparently independent of intensity level. This research has now justified Hashizume's postulate of the form k log(1+ab), and furthermore has obtained a value of a in good agreement with Hashizume's. The estimate a ≈ .02 is probably better than Hashizume's estimate of .01, as he experimented only with the values a = 1, .1, .01, and .001, and did not test intermediate values. In any case, as can be seen in Fig. 3.3(b) there is only a slight variation of the function caused by changing a from .01 to .02. The difference is so slight that it has been found to be impossible to tell the difference between 2 pictures with additive noise using companding, one with a = .01 and the other with a = .02. So in fact the exact value of a is not critical in companding applications.

[Fig. 3.3(b). Graph of $c(b) = 255 \log(1+ab) / \log(1+a \cdot 255)$ for a = .02 and a = .01.]

3.8: Effects of the optimum companding function on pictorial noise.

A test pattern which spans the dynamic range of the T.V. has been created for the testing of companding functions. A photo of this test pattern together with an intensity map is given in Fig. 3.4. Fig. 3.5 shows the result of adding pseudo-random noise of amplitude N = 16 to the test pattern and illustrates the fact that additive noise is more visible in the dark regions than the bright regions. Fig. 3.6 shows the effect of the same noise using optimum companding (a = .02). This photo shows that optimum companding has been successful in causing noise visibility to be independent of intensity.

Figs. 3.7-3.9 show the same effects for two photographs. Figs. 3.7(a) and (b) are noise free pictures of a man and a submarine. Fig. 3.8 shows the same two pictures with additive noise of amplitude N = 16, and without companding. The submarine picture is predominantly dark, and so Fig. 3.8(b) has a high degree of noise visibility. This noise visibility is greatly reduced in Fig. 3.9(b) due to the use of optimum companding. In Fig. 3.9(a), the noise in the man's collar has been increased by optimum companding due to the fact that optimum companding increases noise visibility in the bright tones. (Note that many of the effects are distorted by the photographic transfer process, which causes tone scale modifications as well as black and white saturation. All of the photographs are therefore only an approximation of what was seen on the T.V. display. Many effects were more obvious on the T.V.)

[Fig. 3.4. Test pattern and intensity map (intensities 26, 51, 71, 102, 128, 153, 178, 204, 229).]
[Fig. 3.5. Test pattern with N = 16 and no companding.]
[Fig. 3.6. Test pattern with N = 16 and optimum companding.]
[Fig. 3.7. (a) Man. (b) Submarine.]
[Fig. 3.8. (a) N = 16, no companding. (b) N = 16, no companding.]
[Fig. 3.9. (a) N = 16, optimum companding. (b) N = 16, optimum companding.]

CHAPTER 4
PICTURE DEPENDENT COMPANDING

4.1: The intermediate brightness of a companding function.

In order to develop further insight into the process of companding, it is useful to define and calculate the intermediate brightness of a companding function:

Definition: The intermediate brightness $b_i$ of a companding function c(b) is that intensity level at which no change in noise visibility is caused by companding.

A method of calculating $b_i$ will now be developed.
If noise n is added to an intensity b the following is an expression for the apparent noise to noise ratio:

$$\frac{v(b+n) - v(b)}{n} \qquad (1)$$

where v(b) refers to the human visual transfer function. If companding is performed the apparent noise to noise ratio now becomes:

$$\frac{v[c^{-1}(c(b)+n)] - v(b)}{n} \qquad (2)$$

The limits of (1) and (2) as $n \to 0$ are in general unequal, but are equal for $b = b_i$. Observing that

$$v'(b) = \lim_{n \to 0} \frac{v(b+n) - v(b)}{n},$$

the definition of $b_i$ may now be formalized as follows: $b_i$ is the solution of

$$v'(b) = \lim_{n \to 0} \frac{v[c^{-1}(c(b)+n)] - v(b)}{n}.$$

Further simplification is possible as a result of the fact that the above limit is an indeterminate form of the type 0/0 so that L'Hospital's rule may be used to obtain:

$$\lim_{n \to 0} \frac{v[c^{-1}(c(b)+n)] - v(b)}{n} = \lim_{n \to 0} \frac{d}{dn}\, v[c^{-1}(c(b)+n)]$$

$$= \lim_{n \to 0} v'[c^{-1}(c(b)+n)] \cdot \frac{d}{dn}\, c^{-1}(c(b)+n)$$

using the chain rule,

$$= \lim_{n \to 0} \frac{v'[c^{-1}(c(b)+n)]}{c'(c^{-1}(c(b)+n))}$$

using the inverse function rule $\frac{d}{dx} c^{-1}(x) = \frac{1}{c'(c^{-1}(x))}$,

$$= \frac{v'(b)}{c'(b)}.$$

$b_i$ is thus the solution of $v'(b) = v'(b)/c'(b)$, or $c'(b) = 1$.

This result is not unexpected in view of the fact that the infinitesimal interval $(b, b+\Delta b)$ is neither expanded nor compressed by the application of c(b) if c'(b) = 1, and consequently no change in the noise visibility at level b is expected. However it is rather surprising that this result is completely independent of v(b). Different viewing conditions may be modelled by a set of different visual transfer functions, and the independence of $b_i$ of the form of v(b) therefore implies that the ineffectiveness of companding to modify the noise visibility at level $b_i$ holds true for all viewing conditions, for example different lighting conditions.
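The limit derived above can be checked numerically. The following is a minimal sketch, assuming the visual model v(b) = log(1 + ab) with a = .02 (the constant k is set to 1) and, purely as an example, a log compander with parameter .1. The function names are hypothetical and chosen only for this illustration.

```python
import math

A_VIS = 0.02   # parameter a of the assumed visual model
A_CMP = 0.1    # parameter of the example compander (an arbitrary choice)
K_CMP = 255.0 / math.log(1.0 + A_CMP * 255.0)  # scales c(255) to 255


def v(b):
    """Assumed visual transfer function with k = 1."""
    return math.log(1.0 + A_VIS * b)


def v_prime(b):
    return A_VIS / (1.0 + A_VIS * b)


def c(b):
    """Example companding function."""
    return K_CMP * math.log(1.0 + A_CMP * b)


def c_inv(y):
    """Exact inverse of c."""
    return (math.exp(y / K_CMP) - 1.0) / A_CMP


def c_prime(b):
    return K_CMP * A_CMP / (1.0 + A_CMP * b)


def ratio_with_companding(b, n):
    """Apparent noise to noise ratio, expression (2), for finite n."""
    return (v(c_inv(c(b) + n)) - v(b)) / n


for b in (20.0, 100.0, 200.0):
    finite = ratio_with_companding(b, 1e-4)
    limit = v_prime(b) / c_prime(b)
    print(f"b = {b:5.0f}  finite-n ratio = {finite:.6f}  v'/c' = {limit:.6f}")
```

For small n the finite-difference ratio agrees with v'(b)/c'(b) at every b tried, which is the content of the derivation.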
In order to determine the effect of companding for $c'(b) \ne 1$, recall that for infinitesimal noise n the apparent noise to noise ratio without companding is

$$v'(b) \qquad (3)$$

and the apparent noise to noise ratio with companding is

$$\frac{v'(b)}{c'(b)} \qquad (4)$$

The region in which apparent noise is increased by companding is clearly defined by (4) > (3) or c'(b) < 1. Similarly the region in which apparent noise is decreased is defined by (4) < (3) or c'(b) > 1. These results are again intuitively agreeable. An interval in which c'(b) < 1 is compressed by the application of c(b), so that additive noise will be expanded by the application of $c^{-1}(b)$. Similarly an interval in which c'(b) > 1 is expanded by c(b), so that additive noise will be compressed by $c^{-1}(b)$. Note that these results are again independent of v(b), so that they apply in any viewing situation which may be modelled by a visual transfer function.

Continuing with this analysis, it is possible to determine whether apparent noise increases or decreases as b increases when companding is used. It increases if

$$\frac{d}{db}\left[\frac{v'(b)}{c'(b)}\right] > 0$$

or if $c'(b)v''(b) > c''(b)v'(b)$, since c'(b) > 0. There is no variation of noise visibility if $c'(b)v''(b) = c''(b)v'(b)$ and noise visibility decreases as b increases if $c'(b)v''(b) < c''(b)v'(b)$. Unlike previous results these results depend on v(b). For the case v(b) = k log(1+ab), a ≈ .02,

$$v'(b) = \frac{ka}{1+ab} \quad \text{and} \quad v''(b) = \frac{-ka^2}{(1+ab)^2},$$

and the above may be restated as: the apparent noise increases with b if $-ac'(b) > (1+ab)c''(b)$; it is independent of b if $-ac'(b) = (1+ab)c''(b)$; and it decreases with b if $-ac'(b) < (1+ab)c''(b)$.

As an example of the foregoing noise analysis consider the set of companding functions

$$c(b) = k_1 \log(1+a_1 b), \qquad k_1 = \frac{255}{\log(1+a_1 \cdot 255)}.$$

$b_i$ is given by

$$c'(b) = \frac{k_1 a_1}{1+a_1 b} = 1 \quad \text{or} \quad b_i = k_1 - \frac{1}{a_1} = \frac{255}{\log(1+a_1 \cdot 255)} - \frac{1}{a_1}.$$

$b_i$ is plotted as a function of $a_1$ in Fig. 4.1. As $a_1$ increases $b_i$ decreases. For $b < b_i$, c'(b) > 1, so in this interval apparent noise is decreased by companding.
For $b > b_i$, c'(b) < 1 and in this interval apparent noise is increased by companding. Note that for optimum companding ($a_1$ = .02) $b_i$ = 91. For b < 91 the noise is decreased and for b > 91 the noise is increased. Thus the interval in which the noise is decreased is smaller than the interval in which it is increased. This suggests that optimum companding is not necessarily a good idea, especially if the picture is predominantly bright rather than dark.

To determine the variation of apparent noise with b, the first and second derivatives of c(b) are required:

$$c'(b) = \frac{k_1 a_1}{1+a_1 b}, \qquad c''(b) = \frac{-k_1 a_1^2}{(1+a_1 b)^2}.$$

Thus $-ac'(b) > (1+ab)c''(b)$ if

$$\frac{-a\,k_1 a_1}{1+a_1 b} > \frac{-(1+ab)\,k_1 a_1^2}{(1+a_1 b)^2}$$

or if $a_1 > a$. Thus the apparent noise increases with b if $a_1 > a$, it is independent of b if $a_1 = a$, and it decreases with b if $a_1 < a$.

All of the above results have been verified by implementing companding functions with a range of values of $a_1$ from .005 to .5, and studying their effectiveness at modifying noise visibility in T.V. test patterns. The functions and their inverses were implemented digitally by using a Fortran program to calculate the values of the functions using floating point arithmetic, round these values to integers and then generate tables of values of the functions in assembly language format, together with the necessary macro-instructions needed to enter the tables into APED. The assembly listings were then assembled and made available to APED on disk storage.

[Fig. 4.1. Graph of $b_i = 255/\log(1+a_1 \cdot 255) - 1/a_1$ against $a_1$, for $a_1$ from .001 to 1.0.]

Fig. 4.2 shows that the log compandor with $a_1$ = .1 is more effective at reducing noise visibility in the very dark tones than the optimum compandor. This is to be expected because $a_1$ = .1 gives $b_i$ = 68, so that for b < 68 the log compandor with $a_1$ = .1 should be better than optimum companding. Fig. 4.2(a) gives the intensity map of the test pattern used. Fig. 4.2(b) shows the result of optimum companding with N = 16, and Fig. 4.2(c) shows the result of log companding with $a_1$ = .1 and the same noise amplitude. In spite of the black saturation in those photos it is apparent that for $a_1$ = .1 the noise visibility in the very dark tones is lower than for optimum companding.

As a further demonstration, consider the bright tones test pattern whose intensity map is given in Fig. 4.3(a). The lowest intensity is 125 which is above the intermediate brightness 91 of the optimum compandor. Consequently optimum companding should increase the noise visibility of this test pattern. This can be verified by comparing Fig. 4.3(b) and Fig. 4.3(c). In a situation like this optimum companding is clearly undesirable. In Fig. 4.3(d) an exponential companding function has been used to produce a decrease in noise visibility. This type of function will be discussed in the next section.

[Fig. 4.2. (a) Intensity map of dark tones test pattern (intensities 5, 10, 15, 20, 25, 30, 35, 40, 45). (b) Optimum companding with N = 16. (c) Log companding with a = .1, N = 16.]
[Fig. 4.3. (a) Intensity map of bright tones test pattern (intensities 125, 140, 155, 170, 185, 200, 215, 230, 245). (b) Optimum companding, N = 16. (c) No companding, N = 16. (d) Exponential companding, a = .05, N = 16.]

As a further example of the use of the foregoing noise analysis, the effect of inverting the roles of companding function and inverse companding function will now be analyzed. The function $k_1 \log(1+a_1 b)$ will now be used as the inverse companding function $c^{-1}(b)$, and its inverse will be used as the companding function:

$$c(b) = \frac{1}{a_1}\left(\exp\left(\frac{b}{k_1}\right) - 1\right).$$

$b_i$ for this function is the solution of

$$c'(b) = \frac{1}{a_1 k_1}\exp\left(\frac{b}{k_1}\right) = 1$$

which gives

$$b_i = k_1 \log(a_1 k_1) = \frac{255}{\log(1+a_1 \cdot 255)} \cdot \log\left(\frac{255\,a_1}{\log(1+a_1 \cdot 255)}\right).$$

$b_i$ is plotted as a function of $a_1$ in Fig. 4.4. Note that as $a_1$ increases, $b_i$ increases. Since c'(b) > 1 for $b > b_i$, the apparent noise is decreased in this interval.
Also c'(b) < 1 for $b < b_i$ so that the apparent noise is increased in this interval. It will now be shown that the apparent noise decreases as b increases. To do this it is necessary to show that $-ac'(b) < (1+ab)c''(b)$.

[Fig. 4.4. Graph of $b_i = k_1 \log(a_1 k_1)$ against $a_1$, for $a_1$ from .001 to 1.0.]

Since

$$c'(b) = \frac{1}{a_1 k_1}\exp\left(\frac{b}{k_1}\right) \quad \text{and} \quad c''(b) = \frac{1}{a_1 k_1^2}\exp\left(\frac{b}{k_1}\right)$$

it is clear that the inequality holds if $-a < (1+ab)/k_1$. This of course is true regardless of the value of $a_1$, so that the companding function results in apparent noise which decreases as b increases for any positive $a_1$.

In summary, the companding function $\frac{1}{a_1}(\exp(b/k_1) - 1)$ results in apparent noise which decreases as b increases, it causes a reduction of apparent noise for $b > b_i = k_1 \log a_1 k_1$, and it causes an increase in noise for $b < b_i$. This type of companding would clearly be useful for pictures which have most of their area of intensity $> k_1 \log a_1 k_1$. For example, using $a_1$ = .05 gives $b_i$ = 154 so that noise should be reduced in most of the test pattern of Fig. 4.3(a) for this value of $a_1$. This may be verified by comparing Figs. 4.3(c) and 4.3(d). In contrast optimum companding results in apparent noise even greater than with no companding as can be seen by comparing Figs. 4.3(b) and 4.3(c). In this situation the "optimum" companding function is far from optimum in the sense of achieving reduction in noise visibility.

Earlier it was shown that a companding function $k_1 \log(1+a_1 b)$, $a_1 > a = .02$, is more effective at reducing noise visibility for $b < b_i = k_1 - 1/a_1$ than the optimum compandor. So a picture which has most of its area of intensity $< k_1 - 1/a_1$ would benefit more from the use of $c(b) = k_1 \log(1+a_1 b)$ than from optimum companding.

These two examples of situations in which optimum companding does not achieve the most reduction in noise visibility suggest the possibility of choosing picture dependent companding functions as an alternative.
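The intermediate brightnesses quoted in this section follow directly from the two closed forms, $b_i = k_1 - 1/a_1$ for the log compandor and $b_i = k_1 \log(a_1 k_1)$ for the exponential compandor. A small sketch, with function names that are assumptions made only for this illustration:

```python
import math


def k1(a1):
    """Scale factor that maps the range 0..255 onto itself."""
    return 255.0 / math.log(1.0 + a1 * 255.0)


def bi_log(a1):
    """Intermediate brightness of c(b) = k1*log(1 + a1*b).

    Obtained from c'(b) = k1*a1/(1 + a1*b) = 1.
    """
    return k1(a1) - 1.0 / a1


def bi_exp(a1):
    """Intermediate brightness of c(b) = (exp(b/k1) - 1)/a1.

    Obtained from c'(b) = exp(b/k1)/(a1*k1) = 1.
    """
    k = k1(a1)
    return k * math.log(a1 * k)


print(round(bi_log(0.02)))  # optimum compandor
print(round(bi_log(0.1)))   # log compandor of Fig. 4.2(c)
print(round(bi_exp(0.05)))  # exponential compandor of Fig. 4.3(d)
```

These evaluate to 91, 68 and 154 respectively, matching the values used in the discussion of Figs. 4.2 and 4.3.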
Optimum companding, though it does result in noise visibility which is independent of intensity, does not take advantage of the intensity distribution of the individual picture and the additional potential for noise reduction which may arise from this distribution.

4.2: Companding functions with two or more intermediate brightnesses.

So far companding functions with one intermediate brightness $b_i$ have been discussed. In one case the companding function decreased noise visibility for $b > b_i$ and in the other case noise visibility was decreased for $b < b_i$. Thus the former type of companding function is suitable for predominantly bright pictures and the latter type is suitable for predominantly dark pictures. By using APED's facility to compute and display the intensity histogram of a picture it has been found that many photographs are bimodal with peaks both in the dark and bright tones, with the midtones occupying a relatively small area of the picture. With this type of a picture, using a compandor $k_1 \log(1+a_1 b)$, $a_1 > a$, would reduce noise visibility in the dark tones at the expense of greatly increasing it in the bright tones. Similarly using a compandor $\frac{1}{a_1}(\exp(b/k_1) - 1)$ would reduce noise visibility in the bright tones at the expense of greatly increasing it in the dark tones. A new approach must be taken to reduce noise visibility both in the bright and dark tones simultaneously.

A compandor will now be analyzed which has this capability:

$$c(b) = \frac{1-d}{127.5^2}(b-127.5)^3 + d(b-127.5) + 127.5$$

As before, this function has been selected to accommodate the dynamic range of the T.V. so that c(0) = 0 and c(255) = 255.

$$c'(b) = \frac{3(1-d)}{127.5^2}(b-127.5)^2 + d,$$

so that $b_i$ is the solution of

$$\frac{3(1-d)}{127.5^2}(b-127.5)^2 + d = 1 \quad \text{or} \quad b = 127.5 \pm \frac{127.5}{\sqrt{3}}.$$

Thus there are 2 intermediate brightnesses, $b_{i1} = 54$ and $b_{i2} = 201$, at which noise visibility is unchanged by companding.
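The properties just stated for the cube compandor can be confirmed with a few lines of code. This is only an illustrative sketch; the function names are assumptions, and the thesis itself generated its tables with a Fortran program.

```python
import math

MID = 127.5  # centre of the 0..255 intensity range


def cube(b, d):
    """Cube companding function c(b) with midpoint slope d."""
    return (1.0 - d) / MID**2 * (b - MID) ** 3 + d * (b - MID) + MID


def cube_prime(b, d):
    """Derivative c'(b) of the cube companding function."""
    return 3.0 * (1.0 - d) / MID**2 * (b - MID) ** 2 + d


def intermediate_brightnesses():
    """Solutions of c'(b) = 1; note that d cancels out."""
    offset = MID / math.sqrt(3.0)
    return MID - offset, MID + offset


b_i1, b_i2 = intermediate_brightnesses()
print(round(b_i1), round(b_i2))

for d in (0.1, 0.3, 0.5):
    # End points are preserved and the slope is 1 at both b_i1 and b_i2.
    print(d, cube(0.0, d), cube(255.0, d),
          cube_prime(b_i1, d), cube_prime(b_i2, d))
```

Because the factor (1 - d) cancels when c'(b) = 1 is solved, the two intermediate brightnesses are the same for every d with d ≠ 1.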
Given the restriction 0 < c'(127.5) < 1 and the fact that c'(127.5) = d, it may be shown that c'(b) > 1 for $0 < b < b_{i1}$ and $b_{i2} < b < 255$ so that in these intervals noise visibility is reduced by companding. Also c'(b) < 1 for $b_{i1} < b < b_{i2}$ so that in this interval noise visibility is increased by companding.

c(b) has been plotted in Fig. 4.5 for d = .5, .3, and .1. These three functions and their inverses have been implemented digitally for the purpose of testing them as compandors. As before, a Fortran program was used to generate assembly language listings of the tables. The inverse functions were computed using linear interpolation on the values of the forward functions. Thus to compute $c^{-1}(m)$, a search was made of a table of values of c(b) until the two values were found such that c(b) < m < c(b+1). $c^{-1}(m)$ was then approximated by

$$c^{-1}(m) \approx b + \frac{m - c(b)}{c(b+1) - c(b)}.$$

The values computed for c(b) and $c^{-1}(b)$ using floating point arithmetic were rounded to integer values before being written onto disk storage in assembly language format, assembled and interfaced with APED.

[Fig. 4.5. Graphs of c(b) for d = .5, .3, and .1.]

It was found that insufficient reduction of noise visibility in the bright and dark tones was obtained using d = .5. An excessive increase in noise visibility was obtained in the midtones using d = .1. d = .3 was judged to be a good compromise. The result of using

$$c(b) = \frac{.7}{127.5^2}(b-127.5)^3 + .3(b-127.5) + 127.5$$

on the test pattern of Fig. 3.4 is shown in Fig. 4.6. The noise visibility is low in the very dark and the very bright tones, as predicted. A comparison of cube companding and optimum companding of the submarine picture may be made by inspecting Figs. 4.7(a) and 4.7(b). Optimum companding appears to have produced a lower noise visibility in the dark and midtones. Cube companding produced a lower noise visibility in the very bright tones, though this is not apparent in the photos due to white saturation.

[Fig. 4.6. Cube companding, d = .3, N = 16.]
[Fig. 4.7. (a) Cube companding, d = .3, N = 8. (b) Optimum companding, N = 16.]

The possibility of reducing noise visibility in the intervals 0 < b < 54 and 201 < b < 255 has now been successfully demonstrated. Pictures which have most of their area in these intervals can be expected to respond better to this type of companding than to optimum companding. However a disadvantage of the companding function tested is that though some of its properties may be varied by varying d, there is no method of varying the values of $b_{i1}$ and $b_{i2}$ which define the intervals in which noise visibility is reduced. Further research can be expected to uncover useful methods of designing companding functions which have specified values of $b_{i1}$ and $b_{i2}$.

As another example of a set of companding functions which have two intermediate brightnesses, consider the scheme obtained by inverting the roles of the compandor and inverse compandor used in the previous discussion. That is let

$$c^{-1}(b) = \frac{1-d}{127.5^2}(b-127.5)^3 + d(b-127.5) + 127.5$$

and let c(b) be the inverse of this. There is no simple algebraic expression for c(b), which is why linear interpolation was used to calculate it, as described earlier. The intermediate brightnesses of this function are the solutions of c'(b) = 1. Denoting $g(b) = c^{-1}(b)$, so that $c(b) = g^{-1}(b)$, this becomes:

$$\frac{d}{db}\,g^{-1}(b) = 1 \quad \text{or} \quad \frac{1}{g'(g^{-1}(b))} = 1 \quad \text{or} \quad g'(g^{-1}(b)) = 1.$$

Recall that the solutions of g'(x) = 1 are the intermediate brightnesses of g(x), which have already been calculated to be 54 and 201, so that the intermediate brightnesses of c(b) are the values of b which satisfy $g^{-1}(b) = 54$ or 201, or

$$b_{i1} = g(54) = c^{-1}(54) \quad \text{and} \quad b_{i2} = g(201) = c^{-1}(201).$$

The same 3 values of d as used in the previous discussion will again be considered: d = .1, .3 and .5. The intermediate brightnesses of c(b) for these values of d have been computed and are given in the table below to the nearest integer:

              b_i1 = c^{-1}(54)    b_i2 = c^{-1}(201)
    d = .1           98                  157
    d = .3           88                  167
    d = .5           79                  176

In this case the variation of d has a considerable effect on the values of $b_{i1}$ and $b_{i2}$, and it is apparent that increasing d has the effect of increasing $b_{i2} - b_{i1}$. c(b) has been plotted for the three values of d in Fig. 4.8. It is apparent that c'(b) > 1 for $b_{i1} < b < b_{i2}$, so that this is the interval in which noise visibility is reduced by companding. Furthermore c'(b) < 1 for $0 < b < b_{i1}$ and $b_{i2} < b < 255$ so that in these intervals noise visibility is increased by companding. So this companding function has the effect of reducing noise visibility in the midtones and increasing it in the bright and dark tones. Thus it is suitable for use with pictures which have most of their area in the midtones.

The choice of value for d is governed by the desire to make $b_{i2} - b_{i1}$ large by making d large, but also by the desire to keep d considerably smaller than 1, because for d close to 1, $c(b) \approx b$ and very little change in noise visibility occurs. .3 < d < .5 was judged to be a good range from which to choose d. Fig. 4.9(b) shows the result of inverse cube companding with d = .3 on the midtone test pattern of Fig. 4.9(a). $b_{i1}$ = 88 and $b_{i2}$ = 167 for d = .3 so that noise is reduced in most of the test pattern. The same test pattern appears more noisy when optimum companding is used (Fig. 4.9(c)). The building picture of Fig. 4.10 is a midtone picture, so that inverse cube companding with d = .5 reduces noise visibility slightly (compare Figs. 4.10(a) and 4.10(c)). In contrast optimum companding increases noise visibility considerably (compare Figs. 4.10(b) and (c)).

[Fig. 4.8. The inverse of the function of Fig. 4.5 for d = .5, .3, and .1.]
[Fig. 4.9. (a) Intensity map of midtone test pattern (intensities 65, 80, 95, 110, 125, 140, 155, 170, 185). (b) Inverse cube companding, d = .3, N = 8. (c) Optimum companding, N = 8.]
[Fig. 4.10. (a) Inverse cube companding, d = .5, N = 8. (b) Optimum companding, N = 8. (c) No companding, N = 8.]

4.3: The role of the visual model v(b).

Most of the results in this chapter were derived from the concept of a functional visual model v(b). A few of the results depended on a knowledge of the exact form of that function: v(b) = k log(1+ab), a ≈ .02. At first it may not have seemed very fruitful to define the psychophysical quantity of apparent intensity v, especially when it involved an unknown and perhaps meaningless constant k. Yet all of the results in this chapter are dependent on this model, and all have been verified experimentally. It is fortunate, though not unexpected, that v(b) cancelled out of most of the equations, and that k cancelled out of all of them. The only parameter of the model which is of relevance to any of the final results is a, and this fortunately is the parameter which was determinable by experiment. It may be concluded that the applications developed in this chapter (all of which have been experimentally verified) are justification for a visual model which might otherwise have been controversial.

4.4: Quantization noise due to finite precision representation of function values.

The functional transformations in this work were implemented as mappings of 8 bit integers onto 8 bit integers. For intervals in which the slope of the function is < 1 it is clear that more than one input integer may be mapped onto the same output integer, giving rise to a form of quantization noise. The amplitude of this quantization noise is found to be negligible in most cases. Slopes considerably < 1 are necessary for it to become significant, and should be avoided in companding applications.

The cascade of several transformation tables should be avoided where possible.
This is because the amplitude of the resultant quantization noise is increased at each stage. One transformation table equivalent to the cascade should be used in order to keep the quantization noise to a minimum.

4.5: Conclusion and suggestions for further research.

It has been pointed out that optimum companding increases apparent noise visibility for all b > 91, an interval which is well over half the dynamic range of the T.V. Though optimum companding does result in noise which is independent of intensity, the increase in noise for all intensities greater than 91 is unacceptable for many pictures.

The possibility of choosing picture dependent companding functions has been suggested. As examples an exponential function was shown to be useful for reducing noise in predominantly bright pictures; a cube function was shown to be useful for reducing noise in pictures which are almost devoid of midtones; and an inverse cube function was shown to be useful for reducing noise in pictures which are rich in the midtones. However a formal algorithm for generating adaptive companding functions has yet to be developed.

A possible approach is to use the integral of the picture's intensity histogram as the companding function. Each peak in the histogram gives rise to an interval of large slope in the integral, so that noise may be reduced in this interval by using the integral as compandor. Experimentation with this technique by the supervisor of this thesis has met with moderate success.

Another possible scheme might be an algorithm with the capability to determine one or more intermediate brightnesses $b_i$ and the intervals in which c'(b) should be greater or less than 1 based on an examination of the picture's histogram. The behavior of c'(b) having been specified, a second algorithm would then be required to determine a function c(b) with a derivative meeting the correct specifications.
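The histogram-integral suggestion made above can be illustrated with a minimal sketch. This is not the supervisor's implementation, which the text does not describe. It simply scales the cumulative histogram of an 8 bit picture to the range 0..255, and the function name and the synthetic bimodal picture are assumptions made for the example.

```python
def histogram_compander(pixels):
    """Return a 256-entry table c[b] built from the cumulative histogram.

    `pixels` is assumed to be a non-empty sequence of integers in 0..255.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    table = []
    running = 0
    for count in hist:
        running += count  # integral of the histogram up to this level
        table.append(round(255 * running / total))
    return table


# Synthetic bimodal picture: one peak in the dark tones, one in the bright.
pixels = list(range(10, 50)) * 20 + list(range(200, 240)) * 20
table = histogram_compander(pixels)

# Large slope across a histogram peak, zero slope across the empty midtones.
print(table[49] - table[10], table[150] - table[100])
```

The table is steep (slope well above 1) across each histogram peak and flat where the histogram is empty. A flat stretch has no unique inverse, so a practical scheme would need some additional rule for those levels, a point this sketch leaves open.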
In the context of image transmission, the implementation of an adaptive companding scheme would require that the transmitter send the inverse of the companding function to the receiver. This is equivalent to about one picture line which is only a small fraction of the total amount of data being sent. However the calculation of the companding function and its inverse would place a considerable computational load on the transmitter and this would be the main disadvantage of adaptive companding.

CHAPTER 5
THE INFLUENCE OF BACKGROUND ADAPTION ON NOISE VISIBILITY

5.1: Introduction.

It has already been pointed out that there are situations in which the functional transfer model of vision does not apply. Such a situation arises when a small area of one intensity is surrounded by a large area of contrasting intensity. As has been described in the literature (5), the sensitivity of vision to details in the small area is reduced as a result of adaption to the larger contrasting area. Two sets of experiments have been conducted to analyze the effect of this type of adaption on noise visibility.

5.2: The varying contrast experiment.

In this experiment random noise of amplitude N was added to a 72 pel square target of intensity b. The target was located in the center of the T.V. picture. The adaption effect was produced by setting the rest of the picture to a contrasting intensity $b_0$. Three values of $b_0$ were studied: $b_0$ = 64, 128, and 192. For each value of $b_0$, the noise visibility in the target was investigated for the following values of b: 32, 64, 96, 128, 160, 192, and 224. Fig. 5.1(a) shows a typical combination of values of b, $b_0$, and N. For each combination of b and $b_0$, each experimental subject was asked to select a value of the noise amplitude $N_c$ in the target which he felt represented a noise visibility just above the threshold of visibility.
Each subject was allowed to choose his own level of noise visibility $N_c$ subject to the conditions that this choice be close to the threshold of visibility, and that he felt confident of his ability to recognize the same level of visibility for different combinations of b and $b_0$. The subject was thus allowed to define a constant level of noise visibility $N_c$ of his own choice subject to the above constraints, and was then expected to decide which value of N produced this level of visibility for each combination of b and $b_0$. Though this criterion appears to be quite subjective it was judged to be sufficient to detect the trends produced by background adaption.

The experiment was carried out with three subjects in the usual lighting conditions, and the results are given in Fig. 5.2. The value of $b_0$ can be seen to have a strong effect on the results, so it may immediately be concluded that the functional visual model is not in operation. If it was, the values of $N_c$ reported by the subjects would increase linearly with b independently of $b_0$.

[Fig. 5.1. (a) Test pattern with 72 pel noisy target. (b) Test pattern with 288 pel noisy target.]
[Fig. 5.2. Results of varying contrast experiment: $N_c$ plotted against b for $b_0$ = 64, 128, and 192, for Subjects 1, 2, and 3.]

The results may be considered to be a combination of two factors: (1) the usual tendency for sensitivity to noise to decrease as b increases, and (2) the tendency for sensitivity to be decreased as the magnitude of the contrast $|b - b_0|$ increases. Thus for $b_0$ = 192, the values of $N_c$ reported first increase as b is increased from 32 because of decreasing sensitivity due to increasing b.
However as b approaches $b_0$ = 192, the decrease in contrast is more effective than the increase in b, and the values of $N_c$ decrease to a minimum at the point of zero contrast b = 192. Increasing b above 192 reintroduces contrast and the loss of sensitivity due to increasing b, so $N_c$ increases again. For $b_0$ = 128 similar behavior for $N_c$ is observed: it increases as b increases from 32, but there is a tendency for it to decrease as the point of zero contrast b = 128 is approached, after which it again increases as b increases above 128. For $b_0$ = 64, $N_c$ does not increase as b increases from 32 until after b has increased above 64. The three sets of results clearly illustrate that the magnitude of the contrast $|b - b_0|$ is a factor which decreases sensitivity.

5.3: The varying target area experiment.

It seems likely that the influence of the contrasting area $b_0$ on the target b depends on the target being small. Thus if the target area were to be increased, it might be expected that the effect of contrast on sensitivity would be diminished. An experiment has been conducted to test this assumption.

In this experiment the target area was varied. A computer command was used to vary the dimension d of the target, where d = 1/2 the width of the target in pels = 1/4 the height of the target in pels (recall that the vertical spacing of pels is twice as close as the horizontal spacing). The dimension d thus corresponds to an area of $8d^2$ pels. d was varied from 3 to 9, which varied the area of the target from 72 to 648 pels. As an example, the effect of increasing d from 3 to 6 can be seen in Fig. 5.1. The three experimental subjects were again asked to define their own level of noise visibility $N_c$ close to the threshold of visibility. With target intensity b = 128, the variation of $N_c$ with d was then measured for background intensities $b_0$ = 192 and 64.

The results of this experiment are given in Fig. 5.3. These results clearly show that sensitivity increases as d increases.
This may be interpreted as being due to the diminishing effect of the contrast $|b - b_0|$. As the target area increases, the intensity $b$ begins to maintain its own level of adaption, so that sensitivity in the target tends to become independent of what is outside of it.

Fig. 5.3. Results of the varying target area experiment. [Plots of $N_c$ against $d$, for $d$ from 3 to 9; Subjects 1, 2 and 3.]

For both $b_0 = 64$ and $b_0 = 192$ all three subjects found that sensitivity remained constant for $d > 6$. This value of $d$, which corresponds to a target edge of 2.8 cm, or an area of approximately 7.8 cm$^2$, can be considered to be the critical value above which sensitivity is not influenced by contrast. A target edge of 2.8 cm at the experimental viewing distance of 125 cm corresponds to an angle of vision of $\arctan(2.8/125) \approx 1^\circ\,18'$.

5.4: The effect of adaption on the success of companding schemes.

Since the functional visual transfer model does not account for the influence of contrast on sensitivity, the question arises as to whether companding, a technique based on this model, may result in undesirable effects due to contrast. In most viewing situations, an angle of vision of $1^\circ\,18'$ accounts for only a very small fraction of the scene being viewed. Consequently the photographer ensures that the objects and areas of interest in his photographs subtend much larger visual angles than this. As a result it is rare for such tiny regions of one intensity surrounded by a larger contrasting area to arise in conventional photography. Thus with most photographs the decreased sensitivity to noise in small areas due to contrast is not an important factor, and should not affect the success of companding schemes for typical photos. Pictures tend to be composed of several large areas of different average intensity.
As the eye scans the picture it adapts itself to the average intensity of each area in turn, so that in each area noise is perceived as predicted by the functional transfer model. The ability of the eye to adapt rapidly to a sequence of different intensities (5) is the reason why contrast is not an important factor in the perception of noise in most pictures, or in the success of companding schemes. It is only when an area subtends a visual angle of less than about $1^\circ\,18'$ that the eye cannot adapt to its intensity level, and this situation is rare in most photographs.

5.5: Stockham's visual model as a companding processor.

Underlying the formal development of optimum companding theory given in Chapter 3 is the idea that pictures should be transformed by a visual model before having noise added. The analysis of Chapter 3 gave objective support for the intuitive idea that noise should be less noticeable if added to the transformed picture than to the picture itself. Using similar reasoning it can be argued that a more perfect visual model would make a better compandor than the simple functional model. Stockham's visual model (14) is such a model. It consists of a log stage followed by a linear shift invariant filter V, as in Figure 5.4. This model may be considered superior to the functional transfer model because it successfully accounts for such optical illusions as simultaneous contrast and Mach bands, so it appears to have great potential as a companding processor. Indeed Stockham was aware of this, and in his original paper he gave an example of a photograph with additive noise reduction using his model as a compandor. For comparison, he also included the same photo using log companding, and the results using his own model appeared to be better. However it is unclear whether this was due to the virtues of his own model, or to the predictable bad properties of straightforward log companding.
As has been derived in Chapter 3, the ideal logarithmic compandor is $\log(1 + 0.02b)$, not the log function itself. So further research will be necessary to fully evaluate the companding potential of Stockham's model.

Fig. 5.4. Stockham's visual model. [Input picture, log stage, linear shift invariant filter V, output picture.]

Stockham (14) devotes considerable attention to the desirability of a realizable output guarantee in image processing. However the log and linear stages of his own model do not eliminate the possibility of negative light at its output, since $\log x$ is negative for $x < 1$. This possibility of negative light causes no problems with the final output in companding applications, since the final stage of processing is exponential and always gives an output > 0. It nevertheless seems questionable to allow negative light at all, even at an intermediate stage of processing, especially when the process being modelled is the physiological process of human vision. A possible solution to this problem with Stockham's model would be to replace the log stage with the $\log(1 + 0.02b)$ function of Chapter 3, which guarantees a positive output for any positive input. This procedure is justifiable by the strong evidence given in Chapter 3 that $\log(1 + 0.02b)$ represents the earliest process of human vision. Further research will be necessary to determine if this modification of Stockham's model is worthwhile.

5.6: The variation of sensitivity with adapting brightness: an effect which Stockham's model does not account for.

In this section it will be shown that Stockham's model does not account for the contrast phenomenon of section 5.2. Consider a picture composed of a background blank field $b_0$ and a small target $b$ with additive, uniformly distributed discrete random noise of amplitude $N$. This picture can be considered to be the sum of a signal and noise component, as in Fig. 5.5(a).
The noise component is zero in the background area and fluctuates between $-N$ and $+N$ in the target area. The notation $R(-N, N)$ is used to denote a random number from this range. Before applying Stockham's model to the picture it is convenient to consider it as a product of signal and noise components, as in Fig. 5.5(b). In the product representation the noise component is 1 in the background area and is random noise from the range $R(1 - N/b,\ 1 + N/b)$ in the target area (denoted $(1 - N/b,\ 1 + N/b)$ in the figure). The product representation is equivalent to the sum representation, since $b + n = b\,(1 + n/b)$ for any noise value $n$.

Application of the log stage of Stockham's model to the product of Fig. 5.5(b) results in the sum of the two signals of Fig. 5.5(c). The signal component of the sum is $\log b_0$ in the background and $\log b$ in the target. The noise component is 0 in the background, and is random noise logarithmically distributed on the range $(1 - N/b,\ 1 + N/b)$ in the target, that is, $\log R(1 - N/b,\ 1 + N/b)$. The principle of superposition now allows the linear filter V of Stockham's model to be applied individually to the two additive components of Fig. 5.5(c). This results in a final output in the form of the two additive components of Fig. 5.5(d). The noise component of the output depends only on the filter V, the target intensity $b$, and the noise amplitude $N$. Thus Stockham's model has failed to predict the dependence of the output noise on the background intensity $b_0$.

However this criticism of Stockham's visual model is not of major importance to its success in image processing because, as was discussed earlier, the dependence of sensitivity on contrast applies only to small viewing angles which rarely arise in practice. Stockham's model is still the most complete visual model to date, and is certainly worthy of further research and development, particularly in the area of companding.

Fig. 5.5(a). The input picture as a sum.

Fig. 5.5(b). The input picture as a product.

Fig. 5.5(c). The log of the input as a sum.
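The argument of section 5.6 can be checked numerically. The following is a minimal sketch in Python and is not part of the original experiments. It uses a one-dimensional row of pels as a stand-in for the picture, a fixed set of noise samples from $R(-N, N)$, and a three-tap kernel as a stand-in for the filter V. The kernel values, the picture width, the target position and all function names are assumptions made for illustration only; they are not Stockham's filter.

```python
import math


def make_picture(b0, b, noise, width=64, start=28):
    """Return (signal, noisy) rows: background b0, target b, noise added in the target only."""
    stop = start + len(noise)
    signal = [b if start <= i < stop else b0 for i in range(width)]
    noisy = [s + (noise[i - start] if start <= i < stop else 0)
             for i, s in enumerate(signal)]
    return signal, noisy


def lsi_filter(x, kernel):
    """Circular convolution of x with an odd-length kernel (a linear shift invariant filter)."""
    n, half = len(x), len(kernel) // 2
    return [sum(h * x[(i + half - k) % n] for k, h in enumerate(kernel))
            for i in range(n)]


def output_noise(b0, b, noise, kernel):
    """Noise component of the model output, in the sense of Fig. 5.5(d)."""
    signal, noisy = make_picture(b0, b, noise)
    # Product representation, Fig. 5.5(b): noisy = signal * factor,
    # with factor = 1 in the background and 1 + n/b in the target.
    factor = [p / s for p, s in zip(noisy, signal)]
    # Log stage, Fig. 5.5(c): log(noisy) = log(signal) + log(factor).
    out_total = lsi_filter([math.log(p) for p in noisy], kernel)
    out_signal = lsi_filter([math.log(s) for s in signal], kernel)
    out_noise = lsi_filter([math.log(f) for f in factor], kernel)
    # Superposition: filtering the sum equals the sum of the filtered parts.
    assert all(abs(t - (s + n)) < 1e-9
               for t, s, n in zip(out_total, out_signal, out_noise))
    return out_noise


N, b = 4, 128
noise = [3, -4, 1, 0, -2, 4, -1, 2]   # fixed samples from R(-N, N), hard-coded for repeatability
kernel = [-0.25, 1.5, -0.25]          # hypothetical stand-in for the filter V

noise_64 = output_noise(64, b, noise, kernel)
noise_192 = output_noise(192, b, noise, kernel)
max_diff = max(abs(x - y) for x, y in zip(noise_64, noise_192))
print("largest difference between the two noise outputs:", max_diff)
```

The two runs differ only in the background intensity $b_0$ (64 and 192), yet the filtered noise components are identical, because the noise factor is 1 everywhere in the background regardless of $b_0$. Any other linear shift invariant kernel could be substituted for the assumed one without changing this conclusion.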
Fig. 5.5(d). The output as a sum.

APPENDIX 1

  b      Subject 1   Subject 2   Subject 3   Subject 4   Subject 5
         Nc(b)       Nc(b)       Nc(b)       Nc(b)       Nc(b)

  16     1           1           1           1           1
  32     1           1           1           1           1
  48     1           1           1           1           1
  64     2           2           2           2           2
  80     2           2           2           2           2
  96     2           2           2           2           2
  112    2           2           2           2           3
  128    2           3,2         2           3,2         4,3
  144    2           3           2           3,2         3
  160    2           4,3         2           2           4,3
  176    3           4,3         2           3           3
  192    3           3           2           4,3         4,3
  208    3           3           3           3           3
  224    3           3           3           4           4
  240    4,3         4           3           4,3         4
  250    4,3         4           3           3           4

Note: Where two values are indicated, the second one was taken at the end of the experiment as a consistency check and is assumed to be the more reliable of the two.

BIBLIOGRAPHY AND REFERENCES

1. Graham, D. N., M.S. Thesis, M.I.T. Electrical Engineering Department, Cambridge, Mass., 1962.

2. Post, A. E., S.B. Thesis, M.I.T. Electrical Engineering Department, Cambridge, Mass., 1966.

3. Greenwood, R. E., Sc.D. Thesis, M.I.T. Electrical Engineering Department, Cambridge, Mass., 1970.

4. Mitchell, O. R., Jr., Ph.D. Thesis, M.I.T. Electrical Engineering Department, Cambridge, Mass., 1972.

5. Schreiber, W. F., "Picture Coding," Proc. of the IEEE, March 1967.

6. Roberts, L. G., "Picture Coding Using Pseudo-Random Noise," I.R.E. Transactions on Information Theory, Feb. 1962.

7. Cornsweet, T. N., Visual Perception. New York: Academic Press, 1970.

8. Hashizume, B., S.B. Thesis, M.I.T. Electrical Engineering Department, Cambridge, Mass., 1973.

9. Stevens, S. S., "The Psychophysiology of Vision," in Sensory Communications, M.I.T. Press, Cambridge, Mass., 1961.

10. Troxel, D. E., "Computer Editing of News Photos," IEEE Transactions on Systems, Man, and Cybernetics, Nov. 1975.

11. Huang and Hartmann, "Subjective Effect of Additive White Noise with Various Probability Distributions," M.I.T./R.L.E. Quarterly Progress Reports, No. 85, April 1967.

12. Lowry, E. M. and DePalma, J. J., J. Opt. Soc. Am., Vol. 51, 1961, p. 740.

13. Davidson, M. L., Ph.D. Thesis, University of California, Berkeley, 1966.

14. Stockham, T. G., Jr., "Image Processing in the Context of a Visual Model," Proc. of the IEEE, July 1972.

15. Wacks, K. P., S.M. Thesis, M.I.T. Electrical Engineering Department, Cambridge, Mass., 1970.

16. Stevens, S. S., Handbook of Experimental Psychology, New York, Wiley, 1951.

17. Knuth, The Art of Computer Programming, Vol. 2.