A Parallel Computer Preattentive-Vision System for Learning

advertisement
From: Proceedings of the Eleventh International FLAIRS Conference. Copyright © 1998, AAAI (www.aaai.org). All rights reserved.
A Parallel ComputerPreattentive-Vision System for Learning
and Extracting special Features in 2-D Images
Anne M. Landraud-Lamole
EquipeRAPID:Recognitionand Artificial Perceptionin ImagesfromData
D~partementMath~mtiques-Informatique,
Facult~ des Sciences
97159Pointe-~-Pitre, Guadeloupe,France (FWI)
alamole@xtetguacom,
fr
Abstract
A robust parallel multichannel filtering methodfor
modellingpreattentive vision is presented in this paper.
Ouraimis to achievean artificial perceptionsuchthat its
pertbrmancerather than its workingbe as close as possible
to that of humanvision. In establishing a rigorous
theoretical basis tbr constructing original and efficient
operators to be incorporated in a robust computervision
system, we drew our inspiration from existing computer
methods,whichmevttablyimply approximatefunctionning,
and also fromthe experimentalresults on the mechanisms
of humanvision, which a priori maybe considered as
optimum. Our model implemtmts a "homothetic-filter
bank"(IIFB) whereeach filter is selectively sensitive to
frequency-and-orientation
pair just like a visual single cell.
This modelverifies both the Shannontheoremand whatis
knownof the properties of humanvision. The proposed
system is independent of rotations and scale changes.
Moreover,an appropriate choice of the frequaney-filter
function with its parameters allows us to provide tbr
parallel prt~essingusing ca.~;adecomputations.
being equal to 40°. It has been also established that the
spatial-frequency bandwidthis between 0.6 and 2 octaves,
the frequency, range being about 6 octaves (Movshen,
Thompsonand Tolhurst 1978). The average value of the
spatial frequencT bandwidth of single visual cells is
slightly aboveone octave.
Whereasthe angular sampling is relatively well-known,
the frequency sampling is not yet cleared up and the
changeover from I-D to 2-D models raises somequestions.
In particular, we do not knowany convincing experiment
describing
the interdependence
between spatial
frequencies and orientations. It has been observedthat the
sensitivity to frequency-and-orientation pairs does not
imply any correlation between these two parameters. Since
the construction of filters separable in frequencies and in
orientations
is not forbidden by any theoretical
consideration, we have opted for that approach to build
our modelof vision.
Mathematical
Biological
Basis
The perception faculty corresponds to the first stages of
data processing b.v the visual system and corresponds to
the so-called "preattentive vision" (Neisser 1967). In
performing artificial
vision s3.’stem, three important
problems are to be solved: the rigorous sampling of the
spatial frequencies, the scale related problemand parallel
processing for real time applications. Experiences of
b~olog~sts have shown thai the perception of some
phenomenons, such as textures,
seems to be made
instantaneously - in parallel - by neurons localized in area17 of the visual cortex (Julesz 1975). That preattentive
vision is handled by single visual cells. It is generally
admittedthat these cells behavelike filtering channelsthat
are selectively sensitive to frequency-and-orientation pairs
and that the frequent sensitivi.ty is independent of the
orientation sensitivity (Westheimer 1984). It has been
shown that anisotropie single visual-cells
sample
orientations with a period of about ten degrees (Schiller,
Finlay and Volman1976). The angular bandwidth of each
channel is between 30° and 50°, the average bandwidth
Landraud-Lamole
Perception
Ourapproachallows us to use, if necessary, different kinds
of filters
which could help lessen the computing
complexi~’. Taking into account the research work done
~’ psy. chophysiologists, whohave shownthai the visual
cells sensitivities to frequencies are independentof their
angular sensitivities,
we investigate separately the
frequency, behaviour of our proposed filters and their
angular properties. Wehave first emphasizedproperties of
the I-D filters within the scopeof frequent’ processing. It
has been shownthat the filter bandwidthto be considered
will always be nearly proportional to its "preferential"
(central) frequency and it was proposed to use real even
functions corresponding to the real part of a Gabor
function (Pollen and Ronner, 1983). Wehave shownthat
this kind of filter belongs to the more general following
model. Wecall "homothetic filter bank" (HFB)any family
of filters ~(x)}, where i belongs to an interval I of nonnegative integers, having Fourier transforms l~(u),
belongingto L that verify the followingrelations:
..~’-F: R---~Cand"vie]...~,eRandpiER:Fi(u)ciF(p,u).(I)
Copyright
©1998,
Amedczq
Association
for ArtificialIntelligence
(www.aaai.org).
All dghls
reserved.
52
Model of Visual
Wecall "profile of the filter bank"such a function F
and wesay that the HFBis generatedby F. Reciprocally,
let F(u) be a continuousfunction definedin R with values
in C being the Fourier transform of a function f(x),
application of R into C If F(u) has a maximum
value at
u ~u= and defines a bandpassas wide or lightly broader
than u,~ then the I-WBengendered by F(u) can
considered for modelling"all single visual cells. As a
matterof fact, the filters j~(x) belongingto this I-IFBhave
bandwidthesof about 1 octave. Subsequently,this set of
filters is referedto as a "visualfilter bank".
Mostapproachesusing multiclmnnel-filtering makeuse
of energy measuresafter having filtered the imageby
Gaborfunctions for it has beenshownthat those functions
allow an excellent localization in both image and
frequency spaces (Bovik, Clark and Geisler 1987). Our
model contains a product of I-D functions which are
frequency-and-orientation separable. Oursystem can be
run with either Gaborfunctions or any other function
providedthey, are well-fitted to frequencyfilters. Wecan
prove that our modelallowsfor optimal characterization.
Ourapproachconsists in characterizing imagesby energy
measures taken as the output of a filter bank whose
propertiesare the sameas those of single visual cells.
Wehave shownthat the energy of a signal s(x) can
expressedas a convolutionT(A)*G(A)
at &= -logJp.~. In a
"visual filter" bank, each filter Ft(u) of the HFBhas its
ownfrequency selectivity (its bandwidth)and its own
preferential frequency,noted up Assl~ming that uo is the
preferential frequency, of the profile F. the following
results can easily be inferred fromthe newexpressionof
the frequency./l= log,(u) andfromthe signal theow:
po = !/uo
(2)
k
uk uoh
(3)
wherethe samplingperiod of the function e-~F(e’t) is
AA=log,(h). This sampling period is calculated by
applying the Shannontheorem. Thewholefilter bank, in
the image space, should be {f,(x)}. k~Z where the
functions fk(x) are the inverse Fourier transformsof the
F~(u) functions. However.it is knownthat the maximum
extent of the humanvisual system’ssensitivity field to
spatial frequencies is of six octaves. Fromthese
considerations, it can be deducedthat in modellingthe
visual system,onlythe filters with a preferentialfrequency
lying inside that six-octaverange haveto be kept so as to
havea finite bankof filters.
Thenext important point is to treat the orientations
while describing the changeover from the I-D
representationto the 2-Drepresentation. Assuming
that an
appropriatefunction G(~, 0) achievesa low-passfiltering
in the plane (/l, 0). we can again apply the Shannon
theorem,Theoutput energies of the filter bank are then
interpreted as resulting from the 2-D sampling of the
function T(3,0)*G(3,0). Taking into account notes by
physiologists on the sensitivity beth to angles and to
frequencies,weassumethat the filter functionG(A,0) is a
separableone:
(5)
where~. correspondsto the preferential frequencyu~, 0j
beingthe preferentialorientationof the filter Go(A,0). We
choosethe samplingratio A)t of the ,~ values and, from
A~weinfer the frequency-samplingratio Au exactly as
donein the case of 1-Dfilters. Wealso applythe Shannon
conditionfor selecting the O-sampling
ratio as a function
of the widthof the Fourier transformof Go(O).This is how
weachieve a discrete representation in polar coordinates
of the functionT. i.e. of the energyspectrumin the Fourier
space, where the frequency axis is graduated with
logarithmicvalues. Owingto that representation, anyscale
change corresponds to a simple translation along the
logarithmicfiequencyaxis andany shift of orientation is
expressedin a translation alongthe O- orientation axis.
As weaxe aimingat practical efficiency, wemakea few
simplifications. Thefilters, constructedin the Fourier
space, are madeto be implemented
in the imagespace for
convolution filtering.
For that purpose, we have
investigated a goodapproximationmakingit possible to
invert the given equationsmoreeasily. First, weconsider
the specific case whereorientation Oo =0. Thegeneral
shapeof the filter beingconsU’ucted
in the Fourierplaneis
then:
FA(u,v)
(uoS~)"!~: F[u/(uohD.]
[v/(uoh~)].
Fj (6
Theexpressionof sucha filter in cartesian coordinates
is very simple. Moreover,this filter has the advantageous
property to be separable in u and v coordinates, which
makesthe computationof its inverse Fourier transform
easier, this transformbeing a simple product of two onedimensionalfunctions of x and y. respectively, in the
imageplane, instead of a convolution integral. On the
other hand, the approximatefilter FA(u,v) is no longer
separable in polar coordinRtesand does not keepconstant
characteristics in frequencies or in orientations.
Consequently, the frequency behaviour of the filter
dependson the orientation and, conversely,its directional
behaviourdependson the frequency. Becauseof this, we
choose the shape of the filter so that its response is
significant both in the neighbourhood
of its preferential
frequencyand near its preferential orientation, while it
quickly decreases outside of that neighbourhood.Under
these conditions, we admit that the so-obtained
approximationis valid. Filters with orientation 0j will
derive fromhorizontalonesthrougha simplerotation 0j.
In a previous paper (Landraud-Lamole and Yum-Oh
1995), in order to makeuse of phaseinformation,wehave
considered a classical I-D complexGabor function to
represent the ith frequencyfilter. Buthere, weshowthat
Computer
Vision 58
in order to design a parallel vision system, it is more
pertinent to use differencesof gaussianfunctions.
A Parallel
Vision System
A differenceof gaussian("dog")functionswassuggested
(Wilson and Bergen 1979) to simulate the response
functionof symmetrical
single ~asualcells:
where~ stands for the parameter~rj of Eq. (7). According
to the theory presented above, the set ff~(x)) of real
functions of a real variable x. havingFourier transforms
{Fffu)} is a homothetic filter bank generated by the
followingprofile F:
F(u) = exp(-2nguz) 2)
- exp(-2gz du
suchthat:
f(x) = clexpf-x’.,(2crl"’ ~ :)]-c2exp[-x2/(2cr22)J
(7)
Thefilter f(x) is characterizedby four parameters:cz,
~, trz and ty:. Weshall see that these filters are well
adapted to parallel computingand makeit possible to
extract the most useful frequent’ properties of a given
signal. TheFouriertransformof equation(7) is:
2)
- [c2cr:(2~/"]exp(-2~o’z2u
F~(u)= c~(o~
t~
c~ =(t~.)
Onthe other hand,it is assumed
that the filter doesnot
respond to zero frequencies Indeed, in the case of a
bandwidthof one octave centered at the frequencyu, the
interval [u-2. 3u 2] never includes the zero frequency.In
order to normalizegaussian functions that occur in each
filter formulafex),the constantscz andc: are definedas:
¢t = c,/[crt(2r~ r’:] and c2 = c,/[cr2(2n)I~] (11)
Thedeterminationof the standard deviationso-t andcr~
for each of the two gaussian functions forminga filter
makesit possibleto infer the constant values c: andc2 by
usingequation(11). Thechoiceof or: and~r: for eachfilter
depends on its preferential frequency and on the size
selected for its bandpass. A value of about one octave
correspondsto the knownproperties of visual-cell bands.
Forthe filterf(x), weset:
u cr:,tTz constant
(12)
Thevaluea is the samefor all the filters consideredso
that the bandwidth
is constant,t~(u) is then writtenas:
F,(u) c,[expt-2n’cr,’u’) exp(-,n2agcrf’u’)] (13)
54
Lendraud-Lamole
(16)
Wehave found that the value of the bandwidthlies
between0.6 and 2 octavesfor:
!.0625 _<a_<2
So as to makethe filter responsesindependentof the
average grey-level and dependent only on frequencies
present in the image. F,(O)0 is set. Constantsct and c:
havethen to veri~’:
(15)
Let us recall that these filters are to be used for
characterizing filtered imagesby energy measures. The
filters are normalizedso that their responsesto a white
noise are alwaysthe same.In this way,noneof the filters
has a preponderant weight over the others. That
normalization
leads to the followingresult:
(8)
Theresponseof the filterf(x) to the zero frequency,i.e.
to someconstant imagc,is:
(14)
(17)
andthat the optimalvalueof at is about1.5. Wechoosethe
value a =~ because it also allows us to implementa
parallel visionsystem,as weshall see later.
Oncethe frequencyfilters are created, webuild the 2-D
separablefilters by. bringingin orientationswhichare to
he sampledaccording to the conditions of the signal
theory. Then at 0o=0, whenthe filter axis and the
horizontalx-axis are identical, the generalshapeof the 2Dfilterf(x, y) in the spatial domain
is definedas:
f(x.y)
,.
¯
(2~-~.’~{(cr,,.t)-t exp[_x2/(2~z)]
-(~,9exp[-x’/(2cr,:’)}
(2~’t~2{(~)’t exp[-yz /(2~2)]} (18)
Wehavechosenthe standard deviationsso that, for all
values i belongingto the set I: ~ /~ .. a = constant and
t~ /t~z = fl = constant. The constant fl dependson the
numberof orientations being processed. Bygiving the
value v~ to the constant t~ a constant spatial-frequent,
bandwidthcan be achieved,its widthbeing slightly above
one octave, accordingto visual systemproperties and to
the signal theory. Moreover,this value v’2is the most
suitable for computing
convolutions,as recalled hereafter.
Thefirst gaussianfunction,with standarddeviationt~t is:
goi.~o(x) " (2tO"t/~{(~)’]exp[-x~/ (2~z)]} (19)
Thefilter can then be expressedas:
fo~.oo(x)=[go~.~(x)
- g,,o~,~(x)]. gao~.~,(y) (20)
Thesefilters, with an orientation 0o = 0, are notedly(x,y)
orf~(x, y) indifferently changingexpression(20) into:
f,~(x, y) = lg,n(x) -g~,.(x)].
Thefilters corresponding
to anypreferential orientation
0j are deduced from filters with a zero preferentialorientation ~" executinga simplerotation of angle 0j and
are written f~.q(x, y). Subsequently,
the filter f~(x, y) is
written as a differenceof twogaussianlowpass-filters:
(21)
with:
f.~x, y) =fz.o,(x, y) "A,~(x.
(22)
Imagelo(x.y)
Input
~’
*g(x)
0
.g(y)
T
ImageIo(x,y)
filteredby1’1
¯ g(x)
ImageIo(x.y)
filteredby.t’2
~1[
*g(x)
*g(y)
by fm(x,y)
::
Image*I, filtered I
*[g(x)*g(x)]
,J
by(fl*f~)
ImageIo filtered
by. (f2*f2)
"
I
(2 steps)
*[g(x)*g(x)l
0
ImageIo filtered byf,=,v,t2(x,y)
,[g~’),gt.v)l
’~
Inmge
1,, filtered
byt"l*fl*fl*fl
,[g(x),g(x)l
I
I
I Image I° filtered
I
.J ImageIo filtered
byf2*f2*f2*f2
q
by f~’2(x’y) I
etc.+
Sketchof cascadefiltering
Computer
Vision 55
with:
]i.~(x, y) - g~(x), gp~(.~)
(23)
and:
f2.=(x, y) :: g~o,(x). gpm(.g)
(24)
Three basic properties of the convolution integral
make an interesting implementation of our filtering
processes possible. First is the associativeness of the
gaussian functions, whether f, g and h are 1-D or 2-D
functions:
/~(g,h). ¢f,g),h
[f~(x)gKv).l*L~(x)g2~v)//f~(x)*.~(x).l.[gz(.v)*ge(y)J (26)
The last is, the convolution of a gaussian function by
itself is a similar gaussian function with a standard
deviation multiplied b.v v~:
(27)
From these properties, the possibility of cascade
filtering is infered. Considerexpressions (23) and (24)
the filtersfi andAand let us take into account the filter
separableness. A convolution by fi (or by J}) can be made
in two steps: a convolution of the image lines ~" g~(x)
followed by a convolution of the image columns by
gpo~(x). Selecting the value v2 for the parametera is reD’
appropriate for we can then write:
g,o~(x) ¯ g~ ~,(x) " g~(x)*goi(x).
(28)
Cascade convolulions can then be executed according
to the diagram of the figure. Cascade filtering of the
imagelo(X, .v) uses relalion ~28). In this figure, the arrows
with a continuous line represent a convolution of image
lines, while the arrows with a broken line correspond to a
convolution of columns. The notation g(x) represents the
function go~(x) while g(vJ - gp,~(y). All filterings are
madein a given direction 0o 0. The highest preferential
frequent’ umis inverscl.v proporlional to the lowest
standard deviation crm.
56
Landraud.Lamole
References
Bovik, A. C., Clark, M., and G-eisler, W. S., 1987.
Computational Texture Analysis using Localized Spatial
Filtering. In Proceedings of the IEEE ComputerSociety
WorkshopComputerVision, Miami Beach, Florida.
(25)
The two-dimensional convolution of separable functions
is such that:
go(x) *go(x) g,~,~x)
This work has a numberof potential appli~tions,
especially in medical imagery and in the analysis of
satellite images.
Julfsz, B., 1975. Experimentsin the Visual Perception of
Texture. Scientific American232: 34-43.
Landraud-Lamole, A. M., and Yum-Oh. S., 1995.
Texture Segmentation using Local Phase Differences in
Gabor Filtered Images. Lecture Notes in Computer
Science. Image Analysis and Processing. SpringerVeflag 974: 447-452.
Movshen,J. A., Thompson,I. D., and Tolhurst, D. J.,
1978. Spatial and Temporal Contrast Sensitivity of
Neurons in Areas 17 and 18 of the Cat’s Visual Cortex.
Journal of Physiology 283: 101-120.
Neisser, U., 1967. Cognitive P,~.wchology. NewYork:
Appleton-Century-Crofts.
Pollen, D. A., and Ronner, S. F., 1983. Visual Cortical
Neuronsas Localized Spatial Frequenw,.’ Filters. IEEE
Transactions on ~vstems, Manand C:vbernetics SMC-13,
5:907-916.
Schiller,
P., Finlay, B. And Volman, S., 1976.
Quantitative Studies of Single-Cell Properties in Monk~"
Striate Cortex - II. Orientation Specificity and Ocular
Dominance.Journal of Neurophysioiogv 39: i 334-1351.
Westheimer, G., 1984. Spatial Vision. Annual Review of
P..vvchoiogr’35: 201-226.
Wilson, H. R., and Bergen, J. R., 1979. A Four
MechanismModel for Threshold Spatial Vision. i.7sion
Research 19, 1: 19-32.
Download