A HYBRID FORMALISM FOR REPRESENTATION ... Francisco de Assis Tavares Ferreira ...

advertisement
A HYBRID FORMALISM FOR REPRESENTATION AND INTERPRETATION OF IMAGE KNOWLEDGE
Francisco de Assis Tavares Ferreira da Silva
Instituto Nacional de Pesquisas Espaciais - INPE
Caixa Postal 515 - Sao Jose dos Campos - SP - Brazil
E-mail: inpenco§brfapesp.bitnet
ABSTRACT
A blackboard-base knowledge representation, adapted to maintain information about
images using method of frame, is proposed. This representation is to be used
development of artificial intelligence applications to help human operators in the
radar images and related informations. This project is part of the development
artificial intelligence applications in the image processing area.
KEY WORDS:
meteorological radar
as a basis for the
task of interpreting
of a laboratory for
Artificial Intelligence, Expert System, Image Interpretation, Radar, Remote Sensing
Application
1. INTRODUCTION
which are represented through links sillmilar to
the ones
used in semantic nets.
An analog
combination of representation methods can be seen
in the ERNEST system [15] where emphasis was
given to the semantic nets structures.
The radar
image representation, analysis and
interpretation process exhibits characteristics
which make it a good candidate for automatization
using artificial intelligence techniques [1]-[5]:
(i) the task demands specialized knowledge, (ii)
the specialists can describe the used methods,
(iii) the methods can be described with symbols,
(iv)
the
interpretation
demands
heuristic
solutions, and, finally, (v) the task is neither
trivial nor excessively difficult [6].
Using the proposed formalism it is possible to
represent from the original images pixels (in the
proposed experimental application a radar image)
to the symbolic interpretation of the physical
phenomena under analysis. The higher levels of
representation
also
integrate
foreign
information,
independent of
the image,
and
provenient from, say, a geographical data bank,
data
collection platforms, satellite
images,
profiles, meteorological balloons, avionics or
another radar.
The analysis and interpretation process involves
different knowledge levels, where the processing
varies
from
pixels
to
the
symbolic
representations
of
the
possible
image
interpretations of the many regions of the image,
associated to entities of the real world [7,8].
The higher levels of representation, that is, the
levels where the information is already symboliC
in
the sense of
being more abstract,
may
integrate information from elsewhere beyond the
information
extracted from the image
itself
[1,9] •
The paper is organized in the following way: in
section 2 the frame model is briefly presented;
in section 3 the language developed for the
frames management is
presented and how this
language can be used in the image interpretation
automatization process is commented; in section 4
the radar images characteristics and how the
images will be represented using the frame model
are discussed; in section 5 the Blackboard model
is briefly presented; in section 6 the radar
image interpretation process is presented showing
how it can be helped by the use of Knowledge
Sources acting on the proposed representation;
finally, in section 7, conclusions and future
research directions are presented.
The use of artificial intelligence methods and
techniques
may increase
the efficiency
and
productivity of the interpretation process. Note
that image interpretation being a dynamic process
and dependent of the available image type is
based if the system disposes of some ability to
learn [10, 11] •
2. THE FRAME MODEL
The present work proposes the development of a
formalism
to represent the knowledge associated
to the images and their possible interpretations,
as well as a set of methods which permit the
manipulation of these knowledges. To test the
applicability of this formalism the development
of a radar image interpretation aid system is
proposed.
Frames were introduced as a generalization of the
semantic nets in order to express the internal
structure of objects, mantaining the possibility
of representing properties' heritage in the same
way as the semantic nets [16].
The proposed formalism uses a knowledge base
structured according to the frames' method [12],
combined
with some
of
the semantic
nets'
characteristics
[13]. A review about
Hybrid
Knowledge
Representation
can
be
seen
in
Bittencourt [14]. The symbolic structure of the
objects of interest presented in a given image is
represented through frames structures, involving
meanings associated to a specific interpretation,
The
fundamental ideas
of this method
were
introduced by Marvin Minsky (1975) in his paper
"Framework to
Represent Knowledge" [12]. The
applications proposed by
Minsky for the new
method were scene analysis, visual perception
modelling and
natural language understanding;
however,
the
paper
proposes
neither
an
implementation
methodology
nor
a
formal
definition of the method. Since 1975, many
810
systems were implemented based on the frames idea
and
many formal
definitions were
proposed.
Usually a frame consists in a set of attributes
which,
through
their values,
describe
the
characteristics of the object described by the
frame. The values belonging to these attributes
may be other frames, generating a network of
dependencies between the frames. The frames are
also organized in a specialization hierarchy,
creating another
dependency dimension between
them.
The
attributes also
have
properties
regarding the type of values and restrictions on
the number of them which they can have. These
properties are called facets.
between one frame
and another represent the
effects of movement from one place to another.
For non visual types of frames, the differences
between the frames of a system may represent
actions, cause and effect relations, or changes
of the viewpoint. Different frames of a system
may share the same attribute values, and this is
the critical point that makes it possible to
coordinate
united information from
different
viewpoints.
Since
a frame is
proposed to represent
a
situation,
a
process of
correspondence
of
patterns tries to associate values to each frame
attribute,
consistent with these
attributes'
facets. The
correspondence process is partly
controlled by information associated to the frame
(which includes information about how to deal
with susprises) and partly by knowledge about the
objectives of the system in course. These are
important uses of the information obtained when a
correspondence process fails. This may be used to
select an alternative frame which better fits in
the situation.
The systems based on the method of frames are not
an homogeneous
set, however some fundamental
ideas are shared by them. One of these is the
concept of property heritage, which permits the
specification of an object class through the
declaration that this class is a subclass of
another one that has the property in question.
Heritage
can be a very
efficient inference
mechanism
in domains
which show a
natural
taxonomy of concepts.
3. MANAGEMENT OF THE FRAMES
Another idea common to the systems based on
frames is the expectative guided reasoning. A
frame contains attributes, which can be tipical
values, or a priori values, the so called default
values. When trying to instantiate a frame so
that it corresponds to a given situation, the
reasoning process should fill the values of the
frame attributes with the available information
in the situation description. The fact that the
reasoning process knows what to look for to
complete the necessary information, and in case
it is not available, which tentative values to
attribute to the empty
attributes may be a
fundamental factor for the efficiency recognition
of a complex situation.
The manipulation of the structure of frames which
constitutes the knowledge
base will be made
through a data manipulation language developed in
PROLOG language [17].
The characteristics and functionalities
language are presented in the following:
of
the
(i) Frames' hierarchies, in the form of "trees",
permitting the heritage of attribute values.
the
(ii)
Encapsulation of
the
frames and
relations between them, through primitives which
to be
permit the
entities of the language
manipulated.
Many of the method's representation power depends
on
this
inclusion
of
expectatives
and
assumptions. The default values
may be very
useful
in
the
representation
of
general
information, more usual cases and ways to make
generalizations.
(iii) Procedural
linking of PROLOG functions
defined externally to frames attributes. These
functions are activated when an attributed value
is desired to be found and it is not available.
The default values are freely associated to their
corresponding attributes,
such that they may
easily be substituted by new items which better
fit the current situation. In fact they can serve
as "variables" or special cases of "reasoning",
frequently permitting
the dismissal of logic
quantifiers.
(iv) Possibility of
the name of a frame.
an attribute's
value to
be
The primitives of the frame manipulation language
may be classified into the following groups:
Visualization
and
initialization
primitives,
definition primitives,
attribution primitives,
elimination primitives and search primitives. The
objectives
of each
group of primitives
is
presented in the following.
A third idea is the procedural link. Apart from
values,
an attribute
may
be
the
default
associated to a procedure which must be executed
are satisfied,
for
when
certain conditions
example: when the attribute is created; when its
value is read, changed or destroyed.
- Initialization and Visualization Primitives
The
link, by
heritage, through
attributes'
values,
or
an
interrelated
frames
set's
procedural
link
allows
certain
specific
inferences to be made efficiently which can be
used to control the changes in the focus of
attention and emphasis of the application.
Permit the initialization of the hierarchy top
frame and the system diverse control variables ,
beyond offering resources for visualization of
the frames, their attributes' values and the
heritage hierarchy.
- Definition Primitives
For the analysis of visual scenes, the different
frames of a system
describe the scene from
different points of view, and the transformations
811
4. REPRESENTATION OF RADAR IMAGES
Permit the definition of new frames in a certain
heritage hierarchy position, and the definition
of a given frame's new attributes.
Meteorologic radar images are composed of point
clusters (stains) which generate the radar map
(PPI: Plan Position Indicator, RHI: Range-Height
Indicator, CAPPI: Constant Altitude Plan Position
Indicator among others [20]). In this map, we
can identify geographic elements, which may be
subtracted,
and
meteorological
phenomena
(targets) like clouds, rain, wind, snow, hail or
others. The return signal intensity is shown in
many
discrete levels (digital image), called
slices, where the number of levels depends on
the calibration of the radar spectral band and on
the density of the detected element. Usually the
sampling
can
be
preprogrammed,
permitting
delineation
of areas and vertical
section
sampling
with
many
levels
of
intensity
quantification. A general and historical review
of
the
meteorological
radar
system
characteristics
is
presented in
"Radar
in
Meteorology" [21].
- Attribution Primitives
Permit an attribute value or a procedural link to
be associated to a given frame.
- Elimination Primitives
Permit elimination of attributes,
and associated functions.
their
values
- Search Primitives
These are the language central primitives, they
permit navigation inside the heritage hierarchy,
through
consultation
to
the
relations
of
descendency between the diverse frames, and the
recovery of attribute values.
The
representation
to
be adopted
in
the
implemention of the Knowledge Base will be based
on frame structures. A review of the clustering
process is presented in Mussio and Pawlina [22].
A general review of the computational models for
image representation is
presented in Argialas
and Harlow [4].
The recovery of an attribute is made through a
search, initially made in the starting frame. In
case the attribute or its value are defined in
this frame the search continues according to the
frame hierarchy. If the attribute is defined, but
not its value, and if there is an associated
function,
this function is called
with the
following
parameters:
(frame) ,
(attribute) ,
(result). "result" is a variable which should
return
the result. This result
becomes the
attribute value of the frame where the search is
and ends. A "type" parameter informs us how the
value was obtained and if it is a frame or not.
The possible types are:
Each
frame
of the
representation
proposed
contains attributes whose values can be stored
directly in the structure or determined through
procedures. An important characteristic of the
frame
structure
is
that frames lower
in
hierarchy can inherit values of
attributes from
frames higher in hierarchy.
Typically a frame corresponding to an image will
be
represented by a set of attributes. The
attribute of lowest level is the pixel, which is
a mapping from the image point coordinates to
its gray level. Other attribute values may be
obtained
through
preprocessing
routines:
filtering,
segmentation and
classification.
Other attributes will be
used at levels of
greater
abstraction, where it becomes possible
elements
of
the
image
to
identify
the
interpretation.
value-value directly obtained from attribute.
frame-value is a name of frame
from attribute.
function-value
function call
value
obtained directly
obtained
function-frame
value is a name
obtained through a function call.
through
of
a
frame
This
representation will permit the
use of
several heuristics
to program automatic alerts
for targets we may
want to observe. It should
also be useful in
the creation of heuristics
for
pre-selected target phenomena,
evolution
(historic behaviour) and automatized storing. The
general
architecture
of
the
proposed
representation is shown in figure 1.
The
proposed language should help
from the
acquisition of pattern and semantic relations
task to the image interpretation process itself.
Heuristics for
knowledge acquisition will be
implemented via a man-machine interface, which
through the proposed language permits frames to
be built containing the necessary knowledge for
the diverse interpretations. A review of this
process for knowledge acquisition is presented by
Oliveira [18] and an example of this process is
presented by Silva et ale [19].
The first level
corresponds to the analysed
region. The second level instances the class
which represents the possible vision types of
each cell under analysis (see figures 1, 2.a and
2.b). This description
is obtained from the
processing of the original image corresponding to
the pixel matrix (coordinates and gray levels),
generated by the radar control processor. The
next levels present the evolution of phenomena
and attributes like: intensity, width, height,
length, and area of each cell, beyond baricenter
and elliptic factor of each cell or subcell
extracted from the original image (see figures
2.d, 2.e and 2.f).
The
interpretation
process
uses
frame
manipulation methods, that is, a given inference
method [6] may use the primitives of the language
for creation, manipulation and dele tat ion of the
frames inside the interpretation process.
812
The information stored in the frames that form
the Blackboard (Figure 3) belong to different
levels
of
information.
The
first
level
corresponds to the original image representation
(coordinates
and gray levels), this information
is obtained directly from the
radar control
processor. The second level contains elements
retrieved from the original image used by the
Knowledge Sources
to validate the prospective
targets.
The third
level
corresponds
to
attributes obtained
through preprocessing and
context analysis.
These attributes are generated
by
the
respective
Knowledge
Sources.
The
following
levels
will
contain
intermediate
information generated and used by the Knowledge
Sources during the interpretation process.
Date: {iot, iilt, iilt>
i'i~e:
{int, int, intI
The intermediate
informations, linked to the
results
produced by the Knowledge
Sources, may
also
be
information specific levels.
The
intermediate result structure is linked to the
intermediate conclusions of the
classification
or interpretation tasks' successive improvement
process.
The
Blackboard
will
contain
also
frames
corresponding to
structures of events, recording
the whole processing. Some
examples of events
are modifications representing transformations or
movements which occur to the elements analysed.
The structure of frames forming the Knowledge
Base will be manipulated by
many Knowledge
Sources defined in a hierarchical way. These
Knowledge
Sources
are based in
different
processing
methods
according
to
their
specialities,
varying from
production
rules
to image processing algorithms.
A set of reference frames will be defined for
each standard
image element. These reference
interpreting new
frames can be useful when
images.
The flexibility of the frame structure permits
other informations, linking different images or
describing procedure
groups that act on the
images, to
be represented
in the Blackboard.
The interrelation between these different
types
of knowledge will be
generated, mantained and
modified by a set of Knowledge Sources, each one
specialized in a task.
5. THE BLACKBOARD MODEL
The basic structure of the Blackboard consists of
a data
structure called Knowledge
Base and
entities called Knowledge Sources [5]
and [23]-
[25].
The Knowledge Sources alter the
Knowledge
Base. There
is
no
centralized
flux control:
the
Knowledge
Sources
are
autonomous. The Knowledge Sources
are active
entities that
may
contain
from algorithms to rule
based
expert
systems,
even
another Blackboard.
The
informations
about
the
solution problem's solution state
are stored in the Knowledge Base
of
the
system.
There
the
Knowledge
Sources
make
alterations
which
take
the
system
incrementally
to
the
solution of the problem.
813
The Knowledge Sources answer opportunistically to
the alterations on the Blackboard. There is no
specific control element in the Blackboard model.
The model specifies a generic environment for the
problem's solution. The control may be in the
Knowledge
Sources,
Knowledge Base,
separate
modules or in some combination of these three.
The architecture proposed can be seen in
figure 3.
The principle of working
of
presented system can be observed through
interpretation process (next section).
The function of
the second Knowledge Source
(analysis KS) is to process the reference frames
to identify the analysed frame components, using
the information stored in the selected reference
frames.
The
third Knowledge Source
(evolution KS)
updates the history of the analysed elements and
eventually the reference frame information or
generates new frames of this kind.
the
the
the
The fourth Knowledge Source (coordination KS)
manages the different information
generated by
the other sources and the
schedule of tasks to
be performed.
This source signals indicating
what should be done with the analysed frame, for
example: ignore
it, store the image and the
frame,
fire some
alert signal, update
the
schedule, etc.
The coordination source also incorporates a set
of
control auxiliary modules to monitor the
alterations on the Blackboard information, and to
activate one or more Knowledge Sources according
to information about the
next tasks on the
schedule.
7. CONCLUSION
IMAGES
1
J r\. ,= F~ =H=F,~!"~,
E=: S=S=TF=;U=CI=U~i{=E'=="J
The Blackboard model has been used before in the
image processing area. Goodenough et al.
(1987)
introduced
an
aplication
based on
the
Blackboard model in the
area of Remote Sensing
[5]. Shihari et all. (1987) use this architecture
in the area of .post address identification [24].
Andress and Kak (1988) use
the technique in the
geometric image area [25]. Matsuyama (1987) uses
the model in an aplication towards aerial image
understanding [9].
1_---,--,1'
'===='
The knowledge representation proposed in this
work will be used initially inside a prototype
expert system for meteorological radar images
cataloging. Also the efficacy of the diverse
heuristics proposed by
the experts in these
image's
interpretation
for
meteorological
phenomena recognition will be investigated.
6. THE INTERPRETATION PROCESS
The interpretation process may be divided in two
basic
levels:
(i)
meteorological
targer
qualification, which is the
identification of
some image
element such as rain,
when the
qualification, or selective classification, will
be related
to the rain area
and density
information
extraction;
(ii)
phenomena
evolution, in the case of rain being information
like intensity, propagation speed and duration.
Our
perspective
is
to
also
explore
the
connectionist aspect through use of functional
neural networks as specific knowledge sources.
This project is part of a more ambicious project
regarding
the
design
of
an
artificial
intelligence tools based environment
for the
development
of
aplications
in
the
image
interpretation area.
The first Knowledge Source (numeric KS in figure
.3) is responsible for the quantification of the
image elements.
This project involving three departments at INPE
intends to
develop applications concerning not
only radar but also
satellite, aerial and
medical images. The knowledge
representation
proposed in
this paper will be used
as a
prototype
for
the representation
to
be
integrated in this environment.
This
source consists of
a set of numeric
procedures capable of
generating new attribute
values
for a
given image, given
available
attributes. Some
functions of this Knowledge
Source are: calculus of the number of elements;
quantification and storage of the sub cells
of
each element; calculus and storage of the average
intensity
level
of each
analysed
element;
calculus and storage of
the distance between
each
element. These
elements
may be
then
confirmed through context analysis: verification
of the other
elements in the scene, distance
between elements and analysis
through context
rules.
REFERENCES
[1] Quegan, So; Rye, A. J. ; Hendry, A.;
Skingley, J.; Oddy, C. J. Automatic
Interpretation Strategies for Synthetic Aperture
814
Radar Images. Philosophical Transactions Review
Soc. Lond., 324: 409-421, 1988.a
[16] Silva, F.A.T.F.; Bittencourt, G., Uma
de Conhecimento para a
Interpreta~ao de Imagens de Fenomenos
Meteorologicos. Simposio Brasileiro de
Inteligencia Artificial, 83-88, Novembro 1991.
Representa~ao
[2] Campbell S. Do; Olson S. H. Recognizing LowAltidude Wind Shear Hazards from Doppler Weather
Radar: An Artificial Inteligence Approach. Jornal
of Atmospheric and Oceanic Technology, 4(1): 518, March 1987.
[17] Sterling, L. and Shapiro, E., The Art of
Prolog. M.I.T. Press Series in Logic Programming,
Cambridge, MA, London, England, 1986.
[3] Moninger W. R. ARCHER: A Prototype Expert
System for Identifying Some Meteorological
Phenomena. Jornal of Atmospheric and Oceanic
Technology, 5(1): 144-148, February 1988.
[18] Oliveira, C. A., IDEAL, Uma Interface
Dialogica em Linguagem Natural para Sistemas
Especialistas. Tese de Doutorado em Computa~ao
Aplicada, INPE - 5151-TDL/424, Outubro 1990.
[4] Argialas, D.P.; Harlow, C. A. , Computational
Image Interpretation Models: An Overview and a
Perspective. Photogrammetric Engineering and
Remote Sensing, 56(6): 871-886, June 1990.
[19] Silva, J. D. S.; Oliveira, C. A.; Carvalho,
J. M., Intelligent Environment for Interpretation
Tasks of Remotely Sensed Images, paper to be
presented in the XVIII ISPRS, 1992.
[5] Goodenough, D.G.; Goldberg, M.; Plunkett, Go;
Zelek, J., An Expert System for Remote Sensing.
IEEE Trans. on Geoscience and Remote Sensing, Ge25(3), May 1987.
[20] Battan L. J., Radar Meteorology, The
University of Chicago Press, Chicago, Illinois,
1959.
[6] Silva, F.A.T.F.; Oliveira, C.A.; Bittencourt,
G., Uma Representa~ao de Conhecimentos para
Processamento e Interpreta~ao de Imagens. Jornada
EPUSP/IEEE em Computa~ao Visual, 165-176,
Dezembro 1990.
[21] Atlas D., Radar in Meteorology: Battan
Memorial and 40TH Anniversary Radar Meteorology
Conference, American Meteorological Society,
Boston, 1990.
[22] Mussio, Pe; Bonati, A.P., Modelling Rain
Patterns: Towards Automatic Interpretation of
Radar Images. Image and Vision Computing, 8(3),
199-210, August 1990.
[7] Tailor, A.; Cross, A.; Hogg, D.C.; Mason,
D.C., Knowledge-based Interpretation of Remotely
Sensed Images. Image and Vision Computing, 4(2):
67-83, May 1986.
[23] Erman, L.; Hayes-Roth, Fe; Lesser, V.;
Reddy, R., The Hearsay II Speech Understanding
System. Computing Surveys, 12(2): 213-253, 1980.
[8] Matsuyama, T., Expert Systems for Image
Processing: Knowledge-Based Composition of Image
Analysis Processes. Computer Vision, Graphics,
and Image Processing, 48:22-49, 1989.
[24] Srihari, S. N.; Wang, H.L.; Palumbo, P. W.
and Hull, J. J., Recongnizing Address Blocks on
Mail Pieces: Specialized Tools and ProblemSolving Architecture. AI Magazine, 8(4): 25-40,
1987.
[9] Mat suyama , T., Knowledge - Based Aerial
Image Understanding System and Expert Systems for
Image Processing. IEEE Trans. on Geoscience and
Remote Sensing, GE-25(3), May 1987.
[25] Andress, K.Me; Kak, A.C., Evidence
Accumulation & Flow of Control in a Hierarchical
Spatial Reasoning System. AI Magazine, 9(2):7591, 1988.
[10] Cohen, P.R.; Feigenbaum, E.A., The Handbook
of Artificial Intelligence vol. 3, HeurisTech
Press-Stanford, CA.; William Kaufmann, Inc.-Los
Altos, CA. 1982.
[11] Special Issue on Genetic Algorithms,
Machine Learning, 3(2/3), October 1988.
ACKNOWLEDGEMENT
I wish to thank Dr. Guilherme Bittencourt and Dr.
Carlos Alberto de Oliveira for their guidance and
encouragement, and all colleagues at INPE that
contributed to this work anyhow.
[12] Minsky, M. A Framework for Representing
Knowledge. In: The Psychology of Computer Vision,
P. Winston (ed), McGraw-Hill, New York, 1975.
[l3] Quillian, R., "Semantic Memory", in
Semantic Information Processing, M. Minsky (Ed.),
MIT Press, Cambridge, Mass., 1968.
[14] Bittencourt, G., An Architecture For Hybrid
Knowledge Representation, corresponding to the
doctorate program in Computer Science, University
of Karlsruhe, January, 1990.
[15] Niemann, H.; Sagerer G. F.; Schroder So;
Kummert. ERNEST: A Semantic Network System for
Pattern Understanding. IEEE Trans. Pattern Anal.
Mach. Intell, 12(9): 883-905, September 1990.
815
Download