Laborious Object Recognition

Dallenbach, K. M. (1951). A puzzle-picture with a new principle of concealment.
American Journal of Psychology, 64, 181-191.
Gray, C. M., Koenig, P., Engel, A. K., & Singer, W. (1989). Oscillatory responses in cat
visual cortex exhibit inter-columnar synchronization which reflects global stimulus
properties. Nature, 338, 334-337.
Hebb, D. O. (1949). The organization of behavior. Wiley.
Hubel, D. H., & Wiesel, T. N. (1968). Receptive fields and functional architecture of monkey
striate cortex. Journal of Physiology, 195, 215-243.
Hummel, J. E., & Biederman, I. (1992). Dynamic binding in a neural network for shape
recognition. Psychological Review, 99, 480-517.
Kolers, P. A., & Roediger, H. L. (1984). Procedures of mind. Journal of Verbal Learning and Verbal
Behavior, 23, 425-449.
Kreiter, A. K., & Singer, W. (1996). Stimulus-dependent synchronization of neuronal
responses in the visual cortex of awake macaque monkey. Journal of Neuroscience,
16, 2381-2396.
Logothetis, N. K., Pauls, J., & Poggio, T. (1995). Shape representation in the inferior temporal
cortex of monkeys. Current Biology, 5, 552-563.
McClelland, J. L., & Rumelhart, D. E. (1981). An interactive activation model of context
effects in letter perception: Part 1. An account of basic findings. Psychological
Review, 88, 375-407.
Moran, J., & Desimone, R. (1985). Selective attention gates visual processing in the
extrastriate cortex. Science, 229, 782-784.
Perrett, D. I., Smith, P. A. J., Potter, D. D., Mistlin, A. J., Head, A. D., & Jeeves, M. A.
(1984). Neurones responsive to faces in the temporal cortex: Studies of functional
organization, sensitivity to identity and relation to perception. Human Neurobiology, 3,
197-208.
Rodriguez, E., George, N., Lachaux, J.-P., Martinerie, J., Renault, B., & Varela, F. J.
(1999). Perception's shadow: Long-distance synchronization of human brain activity.
Nature, 397, 430-433.
Rzempoluck, E. J. (1998). EEG changes index camouflaged object identification: A pilot
study. Biological Psychology, 47, 181-191.
Singer, W. (1995). Development and plasticity of cortical processing architectures.
Science, 270, 758-764.
Tovee, M. J., Rolls, E. T., & Ramachandran, V. S. (1996). Rapid visual learning in
neurones of the primate temporal visual cortex. Neuroreport, 7, 2757-2760.
Yu, K., & Blake, R. (1992). Do recognizable figures enjoy an advantage in binocular
rivalry? Journal of Experimental Psychology: Human Perception and Performance, 18,
1158-1173.
Dolan, R. J., Fink, G. R., Rolls, E., Booth, M., Holmes, A., Frackowiak, R. S. J., &
Friston, K. J. (1997). How the brain learns to see objects and faces in an impoverished
context. Nature, 389, 596-599.
Perception is commonly divided into "bottom-up" and "top-down" processes. Bottom-up
processes begin with low-level perceptual features derived from a stimulus and compose
them into larger and larger units until a coherent perceptual interpretation of an entire
scene is constructed. Via top-down processes, an observer's expectations, knowledge, and
experience influence how the individual elements of a scene are interpreted. These two
types of processes are not mutually exclusive, and there are formal models that account
for how top-down and bottom-up processing can simultaneously influence each other
(McClelland & Rumelhart, 1981). Typically, expectations and stimulus information jointly
determine the perceptual interpretation given to an object. Still, one striking phenomenon
that demonstrates a contribution of experience-driven expectations to object perception is
the subjective difference between perceiving a degraded image of an object before and
after the true interpretation of the object has been revealed.
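
The mutual influence of stimulus information and expectation can be illustrated with a toy
settling network. The sketch below is only loosely inspired by interactive activation
models and is not McClelland and Rumelhart's model; the two candidate interpretations, the
four features, the weights, and all parameter values are hypothetical choices made for
illustration. The same degraded input is presented twice, once with no expectation and once
with a small expectation favoring the first interpretation.

```python
import numpy as np

def settle(feature_input, weights, prior, steps=50, rate=0.2):
    """Let a toy two-layer network settle under bottom-up and top-down influence.

    feature_input: external (stimulus) drive to each feature unit
    weights:       objects x features matrix of feature-to-object consistency
    prior:         expectation-driven drive to each object unit
    """
    features = np.zeros(weights.shape[1])
    objects = np.zeros(weights.shape[0])
    for _ in range(steps):
        # Bottom-up: active features excite the objects they are consistent with,
        # while each object is inhibited by the summed activity of its rivals.
        bottom_up = weights @ features
        rivals = objects.sum() - objects
        objects = np.clip(objects + rate * (bottom_up + prior - rivals - objects), 0.0, 1.0)
        # Top-down: active objects feed activation back to their own features.
        top_down = weights.T @ objects
        features = np.clip(features + rate * (feature_input + top_down - features), 0.0, 1.0)
    return objects, features

# Two hypothetical interpretations that share the two middle features.
weights = 0.5 * np.array([[1.0, 1.0, 1.0, 0.0],
                          [0.0, 1.0, 1.0, 1.0]])
degraded = np.array([0.0, 0.3, 0.3, 0.0])  # only the shared features are visible

naive_objects, naive_features = settle(degraded, weights, prior=np.zeros(2))
revealed_objects, revealed_features = settle(degraded, weights, prior=np.array([0.3, 0.0]))
print(naive_objects, revealed_objects)
```

With no expectation the two interpretations remain tied. With the expectation in place, the
favored interpretation wins the competition and also feeds activation back to a feature that
was never present in the input, which is one way to picture identical physical information
yielding different percepts.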
As originally described by Dallenbach (1951), when observers are shown degraded
images such as Figure X, they frequently cannot determine the object being represented,
even though the object occupies the major part of the image and is depicted from a
canonical perspective. When the object is pointed out to an observer, the observer
frequently has an "Aha" reaction in which the degraded image is readily interpreted.
Once the image has been interpreted, it is difficult for the observer to return to the
naive state of seeing it as a set of unorganized blotches. This phenomenon suggests a
powerful role of experience-driven expectations because the physical information contained
in a degraded image is the same before and after its interpretation has been revealed
(pre- and post-revelation). The subjective difference in perception of the degraded image
comes from perceptual learning that requires only a single presentation of the original,
undegraded image.
The subjectively different perceptual experiences associated with pre- and post-revelation
degraded images are reliably associated with differences in brain activity. Functional
neuroimaging has revealed that post-revelation images produce higher activity in parietal
and inferior temporal regions than do pre-revelation images (Dolan et al., 1997). The
inferior temporal area is known to be associated with object recognition (Moran &
Desimone, 1985), particularly the recognition of familiar objects (Logothetis, Pauls, &
Poggio, 1995). This neuroimaging evidence is consistent with single-cell recordings from
neurons in the inferior temporal region of macaque monkeys. Degraded face images produced
higher firing rates when they were presented after the original, undegraded face images
had been revealed than before revelation (Tovee, Rolls, & Ramachandran, 1996). Finally,
there is evidence from electroencephalogram (EEG) recordings in humans that one of the
differences in brain activity between images that are coherently interpreted and those
that are not is that the former elicit more synchronized neural activity at 34-40 Hz (in
the gamma frequency range). Rodriguez et al. (1999) showed their participants degraded
images of faces, and separately analyzed those trials where participants did and did not
perceive faces. Participants who interpreted upright degraded faces as depicting faces
showed greater synchronized neural activity between left parieto-occipital and
frontotemporal regions at 250 milliseconds after stimulus onset than did participants who
did not interpret inverted degraded faces as depicting faces.
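
One common way to quantify synchronization between two recording sites in a frequency band
is a phase-locking value: the magnitude of the average unit vector whose angle is the phase
difference between the two band-limited signals. The sketch below is a minimal illustration
on synthetic data, not the analysis used by Rodriguez et al. (1999); the sampling rate, the
37 Hz test rhythm, the noise level, the FFT-based filtering, and the averaging over time
within a single epoch are all assumptions made for illustration.

```python
import numpy as np

def band_analytic(signal, fs, low, high):
    """Band-limit a real signal and return its complex analytic form."""
    spectrum = np.fft.fft(signal)
    freqs = np.fft.fftfreq(len(signal), d=1.0 / fs)
    # Keep only positive frequencies inside the band; doubling them gives
    # the analytic signal of the band-passed waveform.
    mask = (freqs >= low) & (freqs <= high)
    return np.fft.ifft(2.0 * spectrum * mask)

def phase_locking_value(x, y, fs, low=34.0, high=40.0):
    """Return a value in [0, 1]; 1 means a constant phase difference in the band."""
    phase_x = np.angle(band_analytic(x, fs, low, high))
    phase_y = np.angle(band_analytic(y, fs, low, high))
    return float(np.abs(np.mean(np.exp(1j * (phase_x - phase_y)))))

fs = 500.0                                  # assumed sampling rate in Hz
t = np.arange(0, 10.0, 1.0 / fs)            # ten seconds of synthetic data
rng = np.random.default_rng(0)

# Two "sites" sharing a 37 Hz rhythm with a fixed phase lag, plus independent noise.
site_a = np.sin(2 * np.pi * 37 * t) + 0.5 * rng.standard_normal(t.size)
site_b = np.sin(2 * np.pi * 37 * t + 0.8) + 0.5 * rng.standard_normal(t.size)
# A third "site" containing noise only.
site_c = rng.standard_normal(t.size)

plv_coupled = phase_locking_value(site_a, site_b, fs)
plv_uncoupled = phase_locking_value(site_a, site_c, fs)
print(plv_coupled, plv_uncoupled)
```

The measure depends only on the consistency of the phase difference, not on its size or on
signal amplitude, so two sites can be strongly synchronized while firing with a fixed lag.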
These explorations of neural activity suggest two accounts for what occurs when an
image is given a meaningful interpretation. First, detectors in a particular region may
signal the interpretation of an object. Such an account is consistent with work suggesting
the existence of neurons that are selectively activated not only by simple stimulus features
such as lines moving at a particular orientation, but also by complex stimulus
configurations such as hands or faces (Hubel & Wiesel, 1968). The specificity of cells in
the inferotemporal regions is at least partially learned, given that it is especially
pronounced for familiar faces (Perrett et al., 1984). Second, a coherent interpretation of
an image may be the result of binding together neural activity caused by parts of the
image that come from the same object. By this account, coherent objects are represented
by dynamically forming assemblies of neurons (Hebb, 1949). One of the main candidates
for "labeling" neural activity that is to be bound together is the synchronization of
electrical discharges among neurons within an assembly (Gray et al., 1989; Singer, 1995).
The strength of response synchronization between neurons reflects perceptual constraints
such as the Gestalt laws of organization, including continuity, proximity, similarity,
collinearity, and common fate (Kreiter & Singer, 1996; Singer, 1995). However, given the
results of Rodriguez et al. (1999), top-down interpretability, as well as bottom-up
stimulus properties, may determine the synchronization of neural activity. One advantage of
representing objects by synchronized neural activity rather than by the firing rate of
individual neurons is that a complex scene can be decomposed into several objects, with
the neurons responding to different parts firing with different phases (Hummel &
Biederman, 1992).
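
The idea that phase can act as a label separating objects can be pictured with a very small
sketch. It is not the Hummel and Biederman (1992) model; the unit names, the phase values,
the tolerance, and the greedy grouping rule are hypothetical and chosen only to show that
units firing at nearly the same phase can be read out as one assembly while units at a
different phase form another.

```python
import math

def circular_distance(a, b):
    """Smallest absolute difference between two phases, in radians."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

def group_by_phase(unit_phases, tolerance=0.3):
    """Greedily assign each unit to the first assembly whose reference phase is close."""
    assemblies = []  # list of (reference_phase, member_names)
    for name, phase in unit_phases.items():
        for reference, members in assemblies:
            if circular_distance(phase, reference) <= tolerance:
                members.append(name)
                break
        else:
            # No existing assembly is close enough, so this unit starts a new one.
            assemblies.append((phase, [name]))
    return [members for _, members in assemblies]

# Hypothetical units responding to parts of two objects in one scene.
unit_phases = {
    "cow_ear": 0.05,
    "cow_muzzle": 6.25,   # close to 0.05 once phase wraps around 2*pi
    "cow_flank": 0.10,
    "fence_post": 3.10,
    "fence_rail": 3.20,
}
assemblies = group_by_phase(unit_phases)
print(assemblies)
```

Because all units can be active at once and are distinguished only by when they fire, the
same population can carry several object representations without their parts being confused.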
Ascending from neural-level considerations to a behavioral analysis, there are four
revealing properties associated with changing the subjective perception of a degraded
image by previously revealing its original form. First, once revealed, the correct
interpretation of a degraded image persists even if the image is presented only after a
delay (see our Experiment X). Pilot work in our laboratory suggests that even after delays
of two months, there is a strong influence of revelation on degraded image interpretation.
Second, once the degraded image has been revealed, it is hard to look at the degraded
image and not interpret it, or to give it an alternative interpretation. Although alternative
interpretations are frequent pre-revelation, they seem to be inhibited by the proper
interpretation. Third, providing an interpretation of an image by verbally presenting its
category (e.g., saying "look for a cow") is much less effective in changing subjective
organization than is showing either the original version of the image or a simplified
drawing of it (Dallenbach, 1951). Fourth, presenting the original version of an image
facilitates interpretation of the degraded image much more if the two are presented at
roughly the same time (see our Experiment X). The original picture is much more
effective if it is actively used to interpret the degraded picture, rather than simply
being a passive prime. Together, these properties suggest that exposure to an original
picture acts to prime the procedure of segmenting an image into objects and background. A
hallmark of a strong "aha" effect (i.e., a large difference between pre- and
post-revelation subjective experience) is that figure-ground segmentation cues in the image
conflict with the actual segmentations required to correctly interpret the object. For
example, in Figure X.... .
Simultaneous exposure to an original and degraded image allows people to tune their
figure-ground segmentation processes to create the correct segmentation of the degraded
picture. By stressing procedural priming, rather than semantic, strategic, or episodic
priming, we are claiming that the large difference between pre- and post-revelation
perception of degraded images stems from altering the segmentation routines that take
relatively unprocessed inputs and produce structured figure/ground organizations.
Open questions: when/how early? Is the pathway "greased" just as it is when an object
becomes familiar? That is, is a post-revealed object just like an object that has been
presented many times before?
Yu and Blake (1992): show some observers what the Dalmatian picture really depicts
(revelation). Result: the dog picture predominates more even before it is revealed.
Information about structural configuration is registered early: the object-superiority
effect (Pomerantz et al.; Weisstein, N., & Harris, C. S. (1974). Visual detection of line
segments: An object-superiority effect. Science, 186, 752-755).
Degraded object perception as a model of agnosia. Like agnosics, people can see the
degraded object fine, and could reconstruct it quite well. They just cannot combine the
parts into a coherent interpretation, which is exactly what apperceptive agnosics
complain of.
ACCESSION NUMBER
1996-09110-003
DOCUMENT TYPE
Journal-Article
TITLE
Identification of fragmented pictures under ascending versus fixed presentation in
young and elderly adults: Evidence for the
inhibition-deficit hypothesis.
AUTHOR
Lindfield,-Kimberly-C.; Wingfield,-Arthur; Bowles,-Nancy-L.
SOURCE
Aging-and-Cognition.1994 Dec; Vol 1(4): 282-291.
ISSN0928-9917
PUBLICATION YEAR
1994
ABSTRACT
Hypothesized that (1) older adults have deficient inhibitory processes and (2) poorer
performance in ascending than in fixed
presentations of fragmented stimuli is due to residual activation interference. 24 60-86 yr old volunteers and 24 17-22 yr old
college students were tested for the ability to identify degraded pictures that were
presented using either an ascending (AC) or
a fixed (FC) condition. In the AC, Ss identified the pictures at each level of
increasing completeness until correct identification
was achieved. In the FC, Ss identified degraded pictures that were presented once at
an intermediate level of visual
completeness. An ANOVA confirmed that accuracy was higher in the FC than the
AC, and that the main effect of age was not
significant. When Ss were equated on a pretest for performance on the AC, a
marginal trend was found for the elderly Ss only.
Additional evidence for reduced inhibitory processes in older Ss was seen in the Ss'
correct response latencies. Results are
interpreted as support for the inhibition-deficit hypothesis. ((c) 1997 APA/PsycINFO,
all rights reserved) .
TITLE
Perceptual/sensory information versus performance level as indicators of competitive
activation in an object identification task:
Evidence from aging.
AUTHOR
Lindfield,-Kimberly-C.; Wingfield,-Arthur
SOURCE
Brain-and-Cognition.1998 Jun; Vol 37(1): 24-27.
ISSN0278-2626
PUBLICATION YEAR
1998
ABSTRACT
Young and elderly adults were tested for the ability to identify degraded pictures
presented either in a series of incremental
steps with each step increasing the completeness of the visual information (ascending
condition) or in one single exposure
(fixed condition). The probability of correct identification in the fixed condition was
better than the ascending condition once
the amount of visual information shown reached a certain level of completeness. This
was the case for both age groups tested
even though the performance of older adults was lower than young adults. Findings
are consistent with the competitive
activation model of perceptual interference in picture identification (C. R. Luo and J.
G. Snodgrass, 1994). ((c) 1998
APA/PsycINFO, all rights reserved) .
TITLE
Cortical dynamics of three-dimensional figure-ground perception of two-dimensional
pictures.
AUTHOR
Grossberg,-Stephen
SOURCE
Psychological-Review.1997 Jul; Vol 104(3): 618-658.
ISSN0033-295X
PUBLICATION YEAR
1997
ABSTRACT
Develops the FACADE theory of 3-dimensional (3-D) vision and figure-ground
separation to explain data concerning how
2-dimensional pictures give rise to 3-D percepts of occluding and occluded objects,
and how geometrical and contrastive
properties of a picture cooperate or compete when forming the boundaries and
surface representations that subserve conscious
percepts. Spatially long-range cooperation and spatially short-range competition
work together to separate the boundaries of
occluding figures from their occluded neighbors, and this process is sensitive to
image T junctions at which occluded figures
contact occluding figures. These boundaries control the filling-in of color within
multiple depth-sensitive surface
representations. Feedback between surface and boundary representations strengthens
consistent boundaries while inhibiting
inconsistent ones. Both the boundary and the surface representations of occluded
objects may be amodally completed, while the
surface representations of unoccluded objects become visible through modal
completion. Functional roles for conscious modal
and amodal representations in object recognition, spatial attention, and reaching
behaviors are discussed. Model interactions are
interpreted in terms of visual, temporal, and parietal cortices. ((c) 1997
APA/PsycINFO, all rights reserved) .
TITLE
Do recognizable figures enjoy an advantage in binocular rivalry?
AUTHOR
Yu,-Karen; Blake,-Randolph
SOURCE
Journal-of-Experimental-Psychology:-Human-Perception-and-Performance.1992
Nov; Vol 18(4): 1158-1173.
ISSN0096-1523
PUBLICATION YEAR
1992
ABSTRACT
Five experiments examined whether recognizable stimuli predominate in binocular
rivalry. It was found that a face
predominated more than did a pattern equated for spatial frequency, luminance, and
contrast; an objective reaction time
(RT) procedure confirmed predominance of the face. The face was still liable to
fragmentation as stimulus size increased.
Observers tracked exclusive dominance of a picture of a camouflaged figure (a
Dalmatian dog) prior to and then
following discovery of the figure's presence; control observers received the same
protocol with a scrambled version of the
dog stimulus. Compared with control results, predominance of the dog picture was
higher even before observers knew of
the camouflaged figure. Inversion of the dog figure reduced its predominance.
Binocular rivalry is sensitive to
object-related, configural properties of a stimulus. ((c) 1997 APA/PsycINFO, all
rights reserved) .
TITLE
Recognition of computer-generated pictures on monochrome monitors.
AUTHOR
Baker,-Patti-R.; Belland,-John-C.; Cambre,-Marjorie-A.
SOURCE
Journal-of-Computer-Based-Instruction.1985 Fal; Vol 12(4): 104-107.
ISSN0098-597X
PUBLICATION YEAR
1985
ABSTRACT
Examined whether 64 2nd-4th graders could recognize computer-generated pictures
on monochrome monitors. Ss were
randomly assigned to 1 of 2 conditions. Ss in the 1st treatment were asked to identify
on a monochrome monitor a figure
that was initially presented in its original form and then as a redesigned, more
distinguishable figure. The redesigned
figure had greater figure-ground contrast because color substitutions were made that
used pixel patterns to provide contrast
in the monochromatic display. The order of picture presentation was reversed for Ss
in the 2nd treatment. Ss also
completed the Children's Embedded Figures Test to assess their field
independence-dependence. Results indicate that
regardless of grade or field independence-dependence characteristics, Ss were unable
to discern critical features of a color
graphic displayed on a monochromatic monitor unless it was designed to enhance
figure-ground separation. Implications for
the design of instructional software that incorporates microcomputer-generated
graphics are discussed. (14 ref) ((c) 1997
APA/PsycINFO, all rights reserved) .
TITLE
Evoked potential correlates of figure and ground.
AUTHOR
Landis,-Theodor; Lehmann,-D.; Mita,-T.; Skrandies,-W.
SOURCE
International-Journal-of-Psychophysiology.1984 Jun; Vol 1(4): 345-348.
ISSN0167-8760
PUBLICATION YEAR
1984
ABSTRACT
Brain potentials averaged during the viewing of an alternating positive and negative
"hidden man" puzzle picture were
averaged from 8 Ss before and after they learned to recognize the figure. After vs
before recognition, there was
significantly more evoked positivity at 64/96 msec latency and more negativity at
224/256 msec and 352-480 msec latency
over parietal areas during the viewing of the positive picture (recognizable as a face).
It is hypothesized that separate
physiological changes might reflect learned meaningfulness of the figure (which
entails increased attention) and figure
extraction from ground. (10 ref) ((c) 1997 APA/PsycINFO, all rights reserved) .
TITLE
Is visual image segmentation a bottom-up or an interactive process?
AUTHOR
Vecera,-Shaun-P.; Farah,-Martha-J.
SOURCE
Perception-and-Psychophysics.1997 Nov; Vol 59(8): 1280-1296.
ISSN0031-5117
PUBLICATION YEAR
1997
ABSTRACT
Visual image segmentation is the process by which the visual system groups features
that are part of a single shape. In Exps 1
and 2, Ss were presented with two overlapping shapes and were asked to determine
whether two probed locations were on the
same shape or on different shapes. The availability of top-down support was
manipulated by presenting either upright or
rotated letters. Ss were fastest to respond when the shapes corresponded to familiar
shapes--the upright letters. In Exp 3, a
variant of this segmentation task was used to rule out the possibility that Ss
performed same/different judgments after
segmentation and recognition of both letters. Exp 4 ruled out the possibility that the
advantage for upright letters was merely
due to faster recognition of upright letters relative to rotated letters. Results suggest
that the previous effects were not due to
faster recognition of upright letters; stimulus familiarity influenced segmentation.
The results are discussed in terms of an
interactive model of visual image segmentation. ((c) 1998 APA/PsycINFO, all rights
reserved) .
TITLE
Hidden figures are ever present.
AUTHOR
Mens,-Lucas-H.; Leeuwenberg,-Emanuel-L.
SOURCE
Journal-of-Experimental-Psychology:-Human-Perception-and-Performance.1988
Nov; Vol 14(4): 561-571.
ISSN0096-1523
PUBLICATION YEAR
1988
ABSTRACT
Preference judgments about alternative interpretations of unambiguous patterns can
be explained in terms of a rivalry
between a preferred and a second-best interpretation (cf. Leeuwenberg & Buffart,
1983). We tested whether this second-best
interpretation corresponds to a suppressed but concurrently present interpretation or
whether it merely reflects an
alternative view that happens to be preferred less often. Two patterns were present
immediately following each other with a
very short onset asynchrony: a complete pattern and one out of three possible
subpatterns of it, corresponding to the best,
the second best, or an odd interpretation of the complete pattern. Subjects indicated
which subpattern was presented by
choosing among the three subpatterns shown after each trial. The scores, corrected
for response-bias effects, indicated a
relative facilitation of the second-best interpretation, in agreement with its predicted
"hidden" presence. This result is more
in line with theories that capitalize on the quality of the finally selected
representation than with processing models aimed at
reaching one single solution as fast and as economically as possible. ((c) 1997
APA/PsycINFO, all rights reserved) .
TITLE
Spatial context in recognition.
AUTHOR
Bar,-Moshe; Ullman,-Shimon
SOURCE
Perception.1996; Vol 25(3): 343-352.
ISSN0301-0066
PUBLICATION YEAR
1996
ABSTRACT
Exps 1 and 2, with 18 graduate students, investigated the role of individual objects in
recognition of complete figures and the
influence of contextual information on identification of ambiguous objects.
Configurations of objects that were placed in
either proper or improper spatial relations were used, and response times and error
rates in a recognition task were measured.
Proper spatial relations among the objects of a scene decreased response times and
error rates in the recognition of individual
objects. Also, the presence of objects that had a unique interpretation improved the
identification of ambiguous objects in the
scene. Ambiguous objects were recognized faster and with fewer errors in the
presence of clearly recognized objects
compared with the same objects in isolation or in improper spatial relations.
Implications for the organization of recognition
memory are discussed. ((c) 1997 APA/PsycINFO, all rights reserved) .
TITLE
Visual schemas in neural networks for object recognition and scene analysis.
AUTHOR
Leow,-Wee-Kheng; Miikkulainen,-Risto
SOURCE
Connection-Science:-Journal-of-Neural-Computing,-Artificial-Intelligence-and-Cognitive-Research.1997 Jun; Vol 9(2):
161-200.
ISSN0954-0091
PUBLICATION YEAR
1997
ABSTRACT
VISOR is a large connectionist system that shows how visual schemas can be
learned, represented and used through mechanisms
natural to neural networks. Processing in VISOR is based on cooperation,
competition, and parallel bottom-up and top-down
activation of schema representations. VISOR is robust against noise and variations in
the inputs and parameters. It can indicate
the confidence of its analysis, pay attention to important minor differences, and use
context to recognize ambiguous objects.
Experiments also suggest that the representation and learning are stable, and behavior
is consistent with human processes such
as priming, perceptual reversal and circular reaction in learning. The schema
mechanisms of VISOR can serve as a starting
point for building robust high-level vision systems, and perhaps for schema-based
motor control and natural language
processing systems as well. ((c) 1997 APA/PsycINFO, all rights reserved)(journal
abstract) .
TITLE
Object recognition based on impulse restoration with use of the expectation-maximization algorithm.
AUTHOR
Abu-Naser,-Ahmad; Galatsanos,-Nikolas-P.; Wernick,-Miles-N.; Schonfeld,-Dan
SOURCE
Journal-of-the-Optical-Society-of-America.-A.1998 Sep; Vol 15(9): 2327-2340.
ISSN0740-3232
PUBLICATION YEAR
1998
ABSTRACT
It has recently been demonstrated that object recognition can be formulated as an
image-restoration problem. In this
approach, which the authors term impulse restoration, the objective is to restore a
delta function that indicates the detected
object's location. Solutions based on impulse restoration for the Gaussian-noise case
are developed, and a new iterative
approach, based on the expectation-maximization (EM) algorithm, that
simultaneously estimates the background statistics
and restores a delta function at the location of the template, is proposed. A Monte
Carlo study and
localization-receiver-operating-characteristics curves was used to evaluate the
performance of this approach quantitatively
and compare it with existing methods. Experimental results that demonstrate that
impulse restoration is a powerful
approach for detecting known objects in images severely degraded by noise are
presented. ((c) 1998 APA/PsycINFO, all
rights reserved)
TITLE
How the brain learns to see objects and faces in an impoverished context.
AUTHOR
Dolan,-R.-J.; Fink,-G.-R.; Rolls,-E.; Booth,-M.; Holmes,-A.; Frackowiak,-R.-S.-J.;
Friston,-K.-J.
SOURCE
Nature.1997 Oct; Vol 389(6651): 596-599.
ISSN0028-0836
PUBLICATION YEAR
1997
ABSTRACT
A degraded image of an object or face, which appears meaningless when seen for the
first time, is easily recognizable
after viewing an undegraded version of the same image. The neural mechanisms by
which this form of rapid perceptual
learning facilitates perception are not well understood. Psychological theory suggests
the involvement of systems for
processing stimulus attributes, spatial attention and feature binding, as well as those
involved in visual imagery. Here it is
investigated where and how this rapid perceptual learning is expressed in the human
brain by using functional
neuroimaging to measure brain activity during exposure to degraded images before
and after exposure to the
corresponding undegraded versions. Perceptual learning of faces or objects enhanced
the activity of inferior temporal
regions known to be involved in face and object recognition respectively. A strong
coupling was observed between the
temporal face area and the medial parietal cortex when faces were perceived. This
suggests that perceptual learning
involves direct interactions between areas involved in face recognition and those
involved in spatial attention, feature
binding and memory recall. ((c) 1998 APA/PsycINFO, all rights reserved)
TITLE
An optimal estimation approach to visual perception and learning.
AUTHOR
Rao,-Rajesh-P.-N.
SOURCE
Vision-Research.1999 Jun; Vol 39(11): 1963-1989.
ISSN0042-6989
PUBLICATION YEAR
1999
ABSTRACT
How does the visual system learn an internal model of the external environment?
How is this internal model used during
visual perception? How are occlusions and background clutter so effortlessly
discounted for when recognizing a familiar
object? How is a particular object of interest attended to and recognized in the
presence of other objects in the field of
view? In this paper, the author attempts to address these questions from the
perspective of Bayesian optimal estimation
theory. Using the concept of generative models and the statistical theory of Kalman
filtering, it is shown how static and
dynamic events occurring in the visual environment may be learned and recognized
given only the input images. The
author also describes an extension of the Kalman filter model that can handle
multiple objects in the field of view. The
resulting robust Kalman filter model demonstrates how certain forms of attention can
be viewed as an emergent property
of the interaction between top-down expectations and bottom-up signals.
Experimental results are provided to help
demonstrate the ability of such a model to perform robust segmentation and
recognition of objects and image sequences
in the presence of occlusions and clutter. ((c) 1999 APA/PsycINFO, all rights
reserved)
TITLE
Figure-ground organization and object recognition processes: An interactive account.
AUTHOR
Vecera,-Shaun-P.; O'Reilly,-Randall-C.
SOURCE
Journal-of-Experimental-Psychology:-Human-Perception-and-Performance.1998
Apr; Vol 24(2): 441-462.
ISSN0096-1523
PUBLICATION YEAR
1998
ABSTRACT
Traditional bottom-up models of visual processing assume that figure-ground
organization precedes object recognition.
This assumption seems logically necessary: How can object recognition occur before
a region is labeled as figure?
However, some behavioral studies find that familiar regions are more likely to be
labeled figure than less familiar regions,
a problematic finding for bottom-up models. An interactive account is proposed in
which figure-ground processes receive
top-down input from object representations in a hierarchical system. A graded,
interactive computational model is
presented that accounts for behavioral results in which familiarity effects are found.
The interactive model offers an
alternative conception of visual processing to bottom-up models. ((c) 1998
APA/PsycINFO, all rights reserved)(journal
abstract)
TITLE
Visual perception.
AUTHOR
Paap,-Kenneth-R.; Partridge,-Derek
BOOK SOURCE
McTear, Michael F.; et-al. (1988). Understanding cognitive science. Ellis Horwood
series in cognitive science. (pp.
69-101). Chichester, England UK: Ellis Horwood, Ltd; New York, NY, USA:
Halsted Press. 264 pp.SEE BOOK
ISBN0745801617 (hardcover, Ellis Horwood); 047021175X (hardcover, Halsted)
PUBLICATION YEAR
1988
ABSTRACT
(from the chapter) top-down effects and the modularity of mind /// object recognition
by basically bottom-up
processing /// lessons from natural vision / pragmatism leads to oversight and
hallucination / attentional control /// bringing
top-down information into play / the activation-verification model of word
recognition /// lessons from
neurophysiology: natural and artificial edge detection /// connectionist models ((c)
1997 APA/PsycINFO, all rights
reserved)
TITLE
Perception and knowledge.
AUTHOR
Rock,-Irvin
SOURCE
Acta-Psychologica.1985 May; Vol 59(1): 3-22.
ISSN0001-6918
PUBLICATION YEAR
1985
ABSTRACT
Argues that knowledge concerning the object, scene, or event in a conscious
propositional form generally does not affect
perception, while knowledge in the form of stored representations of past visual or
phylogenetic experience can affect
perception. Exceptions to the 1st generalization can occur if the stimulus is
ambiguous and can support a cued or suggested
interpretation or one in line with what is known to be present as well as it can support
the perception that occurs
spontaneously. Knowledge in the form of stored representations can affect perception
in various ways: It enables
recognition and interpretation to occur; it enables perceptual discrimination among
similar members of a category to
occur; it can lead to perceptual enrichment effects; it provides internal solutions that
can be accessed in cases in which
perceptual problem solving occurs; it provides rules or laws concerning geometrical
optics on the basis of which
phenomena such as perceptual constancy and the like can be achieved; and it can lead
to the recalibration of tactual or visual
sensation. However, before such top-down effects of past experience can occur,
bottom-up processes must achieve a
preliminary perception. That perception provides the bridge to the relevant stored
representations accessed on the basis of
similarity. (38 ref) ((c) 1997 APA/PsycINFO, all rights reserved)
TITLE
Bottom-up connectionist models of 'interaction'
AUTHOR
Norris,-Dennis
BOOK SOURCE
Altmann, Gerry T. M. (Ed); Shillcock, Richard (Ed); et-al. (1993). Cognitive models
of speech processing: The Second
Sperlonga Meeting. (pp. 211-234). Hove, England UK: Lawrence Erlbaum
Associates, Inc., Publishers. xiv, 531 pp.SEE
BOOK
ISBN0863773028 (hardcover)
PUBLICATION YEAR
1993
ABSTRACT
(from the book) Presents an exploration of some of the properties of simple
connectionist models of word recognition,
including a model of lexical access. Some of the properties of these models are
demonstrated as part of a discussion of
claims for 'top-down' activation in psychological models, notably the word
superiority effect, and J. Elman and J.
McClelland's (1988) demonstration of phoneme restoration in conjunction with the
compensation for coarticulation effect.
/// It is demonstrated that it is possible to obtain apparent top-down behaviour from a
connectionist model that contains only
feedforward and delay connections. These simulations force attention to the issues of
levels of description and processing,
and demonstrate that traditional box-and-arrow models may not trivially be
transformed into connectionist models without
a careful reappraisal of how the learning process effectively redistributes their
functional architecture. They also
demonstrate the difficulty of determining the precise generalizations that are acquired
by even small connectionist
networks. ((c) 1998 APA/PsycINFO, all rights reserved)
TITLE
Does lexical information influence the perceptual restoration of phonemes?
AUTHOR
Samuel,-Arthur-G.
SOURCE
Journal-of-Experimental-Psychology:-General.1996 Mar; Vol 125(1): 28-51.
ISSN0096-3445
PUBLICATION YEAR
1996
ABSTRACT
A critical issue in modeling speech perception is whether lexical representations can
affect lower level (e.g., phonemic)
processing. Phonemic restoration studies have provided support for such top-down
effects, but there have also been a
number of failures to find them. A methodology is introduced that provides good
approximations to the underlying
distributions of perceived intactness that are assumed in signal detection analyses of
restoration. This methodology provides
a sensitive means to determine the necessary conditions for lexical feedback to occur.
When these conditions are created, a
reliable lexical influence on phonemic perception results. The experiments thus show
that lexical activation does influence
lower level processing, and that these influences are fragile. The theoretical
implications of real but fragile lexical effects
are discussed. ((c) 1997 APA/PsycINFO, all rights reserved)(journal abstract)