From: Proceedings of the Eleventh International FLAIRS Conference. Copyright © 1998, AAAI (www.aaai.org). All rights reserved. A Parallel ComputerPreattentive-Vision System for Learning and Extracting special Features in 2-D Images Anne M. Landraud-Lamole EquipeRAPID:Recognitionand Artificial Perceptionin ImagesfromData D~partementMath~mtiques-Informatique, Facult~ des Sciences 97159Pointe-~-Pitre, Guadeloupe,France (FWI) alamole@xtetguacom, fr Abstract A robust parallel multichannel filtering methodfor modellingpreattentive vision is presented in this paper. Ouraimis to achievean artificial perceptionsuchthat its pertbrmancerather than its workingbe as close as possible to that of humanvision. In establishing a rigorous theoretical basis tbr constructing original and efficient operators to be incorporated in a robust computervision system, we drew our inspiration from existing computer methods,whichmevttablyimply approximatefunctionning, and also fromthe experimentalresults on the mechanisms of humanvision, which a priori maybe considered as optimum. Our model implemtmts a "homothetic-filter bank"(IIFB) whereeach filter is selectively sensitive to frequency-and-orientation pair just like a visual single cell. This modelverifies both the Shannontheoremand whatis knownof the properties of humanvision. The proposed system is independent of rotations and scale changes. Moreover,an appropriate choice of the frequaney-filter function with its parameters allows us to provide tbr parallel prt~essingusing ca.~;adecomputations. being equal to 40°. It has been also established that the spatial-frequency bandwidthis between 0.6 and 2 octaves, the frequency, range being about 6 octaves (Movshen, Thompsonand Tolhurst 1978). The average value of the spatial frequencT bandwidth of single visual cells is slightly aboveone octave. Whereasthe angular sampling is relatively well-known, the frequency sampling is not yet cleared up and the changeover from I-D to 2-D models raises somequestions. In particular, we do not knowany convincing experiment describing the interdependence between spatial frequencies and orientations. It has been observedthat the sensitivity to frequency-and-orientation pairs does not imply any correlation between these two parameters. Since the construction of filters separable in frequencies and in orientations is not forbidden by any theoretical consideration, we have opted for that approach to build our modelof vision. Mathematical Biological Basis The perception faculty corresponds to the first stages of data processing b.v the visual system and corresponds to the so-called "preattentive vision" (Neisser 1967). In performing artificial vision s3.’stem, three important problems are to be solved: the rigorous sampling of the spatial frequencies, the scale related problemand parallel processing for real time applications. Experiences of b~olog~sts have shown thai the perception of some phenomenons, such as textures, seems to be made instantaneously - in parallel - by neurons localized in area17 of the visual cortex (Julesz 1975). That preattentive vision is handled by single visual cells. It is generally admittedthat these cells behavelike filtering channelsthat are selectively sensitive to frequency-and-orientation pairs and that the frequent sensitivi.ty is independent of the orientation sensitivity (Westheimer 1984). It has been shown that anisotropie single visual-cells sample orientations with a period of about ten degrees (Schiller, Finlay and Volman1976). The angular bandwidth of each channel is between 30° and 50°, the average bandwidth Landraud-Lamole Perception Ourapproachallows us to use, if necessary, different kinds of filters which could help lessen the computing complexi~’. Taking into account the research work done ~’ psy. chophysiologists, whohave shownthai the visual cells sensitivities to frequencies are independentof their angular sensitivities, we investigate separately the frequency, behaviour of our proposed filters and their angular properties. Wehave first emphasizedproperties of the I-D filters within the scopeof frequent’ processing. It has been shownthat the filter bandwidthto be considered will always be nearly proportional to its "preferential" (central) frequency and it was proposed to use real even functions corresponding to the real part of a Gabor function (Pollen and Ronner, 1983). Wehave shownthat this kind of filter belongs to the more general following model. Wecall "homothetic filter bank" (HFB)any family of filters ~(x)}, where i belongs to an interval I of nonnegative integers, having Fourier transforms l~(u), belongingto L that verify the followingrelations: ..~’-F: R---~Cand"vie]...~,eRandpiER:Fi(u)ciF(p,u).(I) Copyright ©1998, Amedczq Association for ArtificialIntelligence (www.aaai.org). All dghls reserved. 52 Model of Visual Wecall "profile of the filter bank"such a function F and wesay that the HFBis generatedby F. Reciprocally, let F(u) be a continuousfunction definedin R with values in C being the Fourier transform of a function f(x), application of R into C If F(u) has a maximum value at u ~u= and defines a bandpassas wide or lightly broader than u,~ then the I-WBengendered by F(u) can considered for modelling"all single visual cells. As a matterof fact, the filters j~(x) belongingto this I-IFBhave bandwidthesof about 1 octave. Subsequently,this set of filters is referedto as a "visualfilter bank". Mostapproachesusing multiclmnnel-filtering makeuse of energy measuresafter having filtered the imageby Gaborfunctions for it has beenshownthat those functions allow an excellent localization in both image and frequency spaces (Bovik, Clark and Geisler 1987). Our model contains a product of I-D functions which are frequency-and-orientation separable. Oursystem can be run with either Gaborfunctions or any other function providedthey, are well-fitted to frequencyfilters. Wecan prove that our modelallowsfor optimal characterization. Ourapproachconsists in characterizing imagesby energy measures taken as the output of a filter bank whose propertiesare the sameas those of single visual cells. Wehave shownthat the energy of a signal s(x) can expressedas a convolutionT(A)*G(A) at &= -logJp.~. In a "visual filter" bank, each filter Ft(u) of the HFBhas its ownfrequency selectivity (its bandwidth)and its own preferential frequency,noted up Assl~ming that uo is the preferential frequency, of the profile F. the following results can easily be inferred fromthe newexpressionof the frequency./l= log,(u) andfromthe signal theow: po = !/uo (2) k uk uoh (3) wherethe samplingperiod of the function e-~F(e’t) is AA=log,(h). This sampling period is calculated by applying the Shannontheorem. Thewholefilter bank, in the image space, should be {f,(x)}. k~Z where the functions fk(x) are the inverse Fourier transformsof the F~(u) functions. However.it is knownthat the maximum extent of the humanvisual system’ssensitivity field to spatial frequencies is of six octaves. Fromthese considerations, it can be deducedthat in modellingthe visual system,onlythe filters with a preferentialfrequency lying inside that six-octaverange haveto be kept so as to havea finite bankof filters. Thenext important point is to treat the orientations while describing the changeover from the I-D representationto the 2-Drepresentation. Assuming that an appropriatefunction G(~, 0) achievesa low-passfiltering in the plane (/l, 0). we can again apply the Shannon theorem,Theoutput energies of the filter bank are then interpreted as resulting from the 2-D sampling of the function T(3,0)*G(3,0). Taking into account notes by physiologists on the sensitivity beth to angles and to frequencies,weassumethat the filter functionG(A,0) is a separableone: (5) where~. correspondsto the preferential frequencyu~, 0j beingthe preferentialorientationof the filter Go(A,0). We choosethe samplingratio A)t of the ,~ values and, from A~weinfer the frequency-samplingratio Au exactly as donein the case of 1-Dfilters. Wealso applythe Shannon conditionfor selecting the O-sampling ratio as a function of the widthof the Fourier transformof Go(O).This is how weachieve a discrete representation in polar coordinates of the functionT. i.e. of the energyspectrumin the Fourier space, where the frequency axis is graduated with logarithmicvalues. Owingto that representation, anyscale change corresponds to a simple translation along the logarithmicfiequencyaxis andany shift of orientation is expressedin a translation alongthe O- orientation axis. As weaxe aimingat practical efficiency, wemakea few simplifications. Thefilters, constructedin the Fourier space, are madeto be implemented in the imagespace for convolution filtering. For that purpose, we have investigated a goodapproximationmakingit possible to invert the given equationsmoreeasily. First, weconsider the specific case whereorientation Oo =0. Thegeneral shapeof the filter beingconsU’ucted in the Fourierplaneis then: FA(u,v) (uoS~)"!~: F[u/(uohD.] [v/(uoh~)]. Fj (6 Theexpressionof sucha filter in cartesian coordinates is very simple. Moreover,this filter has the advantageous property to be separable in u and v coordinates, which makesthe computationof its inverse Fourier transform easier, this transformbeing a simple product of two onedimensionalfunctions of x and y. respectively, in the imageplane, instead of a convolution integral. On the other hand, the approximatefilter FA(u,v) is no longer separable in polar coordinRtesand does not keepconstant characteristics in frequencies or in orientations. Consequently, the frequency behaviour of the filter dependson the orientation and, conversely,its directional behaviourdependson the frequency. Becauseof this, we choose the shape of the filter so that its response is significant both in the neighbourhood of its preferential frequencyand near its preferential orientation, while it quickly decreases outside of that neighbourhood.Under these conditions, we admit that the so-obtained approximationis valid. Filters with orientation 0j will derive fromhorizontalonesthrougha simplerotation 0j. In a previous paper (Landraud-Lamole and Yum-Oh 1995), in order to makeuse of phaseinformation,wehave considered a classical I-D complexGabor function to represent the ith frequencyfilter. Buthere, weshowthat Computer Vision 58 in order to design a parallel vision system, it is more pertinent to use differencesof gaussianfunctions. A Parallel Vision System A differenceof gaussian("dog")functionswassuggested (Wilson and Bergen 1979) to simulate the response functionof symmetrical single ~asualcells: where~ stands for the parameter~rj of Eq. (7). According to the theory presented above, the set ff~(x)) of real functions of a real variable x. havingFourier transforms {Fffu)} is a homothetic filter bank generated by the followingprofile F: F(u) = exp(-2nguz) 2) - exp(-2gz du suchthat: f(x) = clexpf-x’.,(2crl"’ ~ :)]-c2exp[-x2/(2cr22)J (7) Thefilter f(x) is characterizedby four parameters:cz, ~, trz and ty:. Weshall see that these filters are well adapted to parallel computingand makeit possible to extract the most useful frequent’ properties of a given signal. TheFouriertransformof equation(7) is: 2) - [c2cr:(2~/"]exp(-2~o’z2u F~(u)= c~(o~ t~ c~ =(t~.) Onthe other hand,it is assumed that the filter doesnot respond to zero frequencies Indeed, in the case of a bandwidthof one octave centered at the frequencyu, the interval [u-2. 3u 2] never includes the zero frequency.In order to normalizegaussian functions that occur in each filter formulafex),the constantscz andc: are definedas: ¢t = c,/[crt(2r~ r’:] and c2 = c,/[cr2(2n)I~] (11) Thedeterminationof the standard deviationso-t andcr~ for each of the two gaussian functions forminga filter makesit possibleto infer the constant values c: andc2 by usingequation(11). Thechoiceof or: and~r: for eachfilter depends on its preferential frequency and on the size selected for its bandpass. A value of about one octave correspondsto the knownproperties of visual-cell bands. Forthe filterf(x), weset: u cr:,tTz constant (12) Thevaluea is the samefor all the filters consideredso that the bandwidth is constant,t~(u) is then writtenas: F,(u) c,[expt-2n’cr,’u’) exp(-,n2agcrf’u’)] (13) 54 Lendraud-Lamole (16) Wehave found that the value of the bandwidthlies between0.6 and 2 octavesfor: !.0625 _<a_<2 So as to makethe filter responsesindependentof the average grey-level and dependent only on frequencies present in the image. F,(O)0 is set. Constantsct and c: havethen to veri~’: (15) Let us recall that these filters are to be used for characterizing filtered imagesby energy measures. The filters are normalizedso that their responsesto a white noise are alwaysthe same.In this way,noneof the filters has a preponderant weight over the others. That normalization leads to the followingresult: (8) Theresponseof the filterf(x) to the zero frequency,i.e. to someconstant imagc,is: (14) (17) andthat the optimalvalueof at is about1.5. Wechoosethe value a =~ because it also allows us to implementa parallel visionsystem,as weshall see later. Oncethe frequencyfilters are created, webuild the 2-D separablefilters by. bringingin orientationswhichare to he sampledaccording to the conditions of the signal theory. Then at 0o=0, whenthe filter axis and the horizontalx-axis are identical, the generalshapeof the 2Dfilterf(x, y) in the spatial domain is definedas: f(x.y) ,. ¯ (2~-~.’~{(cr,,.t)-t exp[_x2/(2~z)] -(~,9exp[-x’/(2cr,:’)} (2~’t~2{(~)’t exp[-yz /(2~2)]} (18) Wehavechosenthe standard deviationsso that, for all values i belongingto the set I: ~ /~ .. a = constant and t~ /t~z = fl = constant. The constant fl dependson the numberof orientations being processed. Bygiving the value v~ to the constant t~ a constant spatial-frequent, bandwidthcan be achieved,its widthbeing slightly above one octave, accordingto visual systemproperties and to the signal theory. Moreover,this value v’2is the most suitable for computing convolutions,as recalled hereafter. Thefirst gaussianfunction,with standarddeviationt~t is: goi.~o(x) " (2tO"t/~{(~)’]exp[-x~/ (2~z)]} (19) Thefilter can then be expressedas: fo~.oo(x)=[go~.~(x) - g,,o~,~(x)]. gao~.~,(y) (20) Thesefilters, with an orientation 0o = 0, are notedly(x,y) orf~(x, y) indifferently changingexpression(20) into: f,~(x, y) = lg,n(x) -g~,.(x)]. Thefilters corresponding to anypreferential orientation 0j are deduced from filters with a zero preferentialorientation ~" executinga simplerotation of angle 0j and are written f~.q(x, y). Subsequently, the filter f~(x, y) is written as a differenceof twogaussianlowpass-filters: (21) with: f.~x, y) =fz.o,(x, y) "A,~(x. (22) Imagelo(x.y) Input ~’ *g(x) 0 .g(y) T ImageIo(x,y) filteredby1’1 ¯ g(x) ImageIo(x.y) filteredby.t’2 ~1[ *g(x) *g(y) by fm(x,y) :: Image*I, filtered I *[g(x)*g(x)] ,J by(fl*f~) ImageIo filtered by. (f2*f2) " I (2 steps) *[g(x)*g(x)l 0 ImageIo filtered byf,=,v,t2(x,y) ,[g~’),gt.v)l ’~ Inmge 1,, filtered byt"l*fl*fl*fl ,[g(x),g(x)l I I I Image I° filtered I .J ImageIo filtered byf2*f2*f2*f2 q by f~’2(x’y) I etc.+ Sketchof cascadefiltering Computer Vision 55 with: ]i.~(x, y) - g~(x), gp~(.~) (23) and: f2.=(x, y) :: g~o,(x). gpm(.g) (24) Three basic properties of the convolution integral make an interesting implementation of our filtering processes possible. First is the associativeness of the gaussian functions, whether f, g and h are 1-D or 2-D functions: /~(g,h). ¢f,g),h [f~(x)gKv).l*L~(x)g2~v)//f~(x)*.~(x).l.[gz(.v)*ge(y)J (26) The last is, the convolution of a gaussian function by itself is a similar gaussian function with a standard deviation multiplied b.v v~: (27) From these properties, the possibility of cascade filtering is infered. Considerexpressions (23) and (24) the filtersfi andAand let us take into account the filter separableness. A convolution by fi (or by J}) can be made in two steps: a convolution of the image lines ~" g~(x) followed by a convolution of the image columns by gpo~(x). Selecting the value v2 for the parametera is reD’ appropriate for we can then write: g,o~(x) ¯ g~ ~,(x) " g~(x)*goi(x). (28) Cascade convolulions can then be executed according to the diagram of the figure. Cascade filtering of the imagelo(X, .v) uses relalion ~28). In this figure, the arrows with a continuous line represent a convolution of image lines, while the arrows with a broken line correspond to a convolution of columns. The notation g(x) represents the function go~(x) while g(vJ - gp,~(y). All filterings are madein a given direction 0o 0. The highest preferential frequent’ umis inverscl.v proporlional to the lowest standard deviation crm. 56 Landraud.Lamole References Bovik, A. C., Clark, M., and G-eisler, W. S., 1987. Computational Texture Analysis using Localized Spatial Filtering. In Proceedings of the IEEE ComputerSociety WorkshopComputerVision, Miami Beach, Florida. (25) The two-dimensional convolution of separable functions is such that: go(x) *go(x) g,~,~x) This work has a numberof potential appli~tions, especially in medical imagery and in the analysis of satellite images. Julfsz, B., 1975. Experimentsin the Visual Perception of Texture. Scientific American232: 34-43. Landraud-Lamole, A. M., and Yum-Oh. S., 1995. Texture Segmentation using Local Phase Differences in Gabor Filtered Images. Lecture Notes in Computer Science. Image Analysis and Processing. SpringerVeflag 974: 447-452. Movshen,J. A., Thompson,I. D., and Tolhurst, D. J., 1978. Spatial and Temporal Contrast Sensitivity of Neurons in Areas 17 and 18 of the Cat’s Visual Cortex. Journal of Physiology 283: 101-120. Neisser, U., 1967. Cognitive P,~.wchology. NewYork: Appleton-Century-Crofts. Pollen, D. A., and Ronner, S. F., 1983. Visual Cortical Neuronsas Localized Spatial Frequenw,.’ Filters. IEEE Transactions on ~vstems, Manand C:vbernetics SMC-13, 5:907-916. Schiller, P., Finlay, B. And Volman, S., 1976. Quantitative Studies of Single-Cell Properties in Monk~" Striate Cortex - II. Orientation Specificity and Ocular Dominance.Journal of Neurophysioiogv 39: i 334-1351. Westheimer, G., 1984. Spatial Vision. Annual Review of P..vvchoiogr’35: 201-226. Wilson, H. R., and Bergen, J. R., 1979. A Four MechanismModel for Threshold Spatial Vision. i.7sion Research 19, 1: 19-32.