Auditory scene analysis by echolocation in bats

Cynthia F. Moss a)
Department of Psychology, Program in Neuroscience and Cognitive Science, University of Maryland,
College Park, Maryland 20742
Annemarie Surlykke b)
Center for Sound Communication, Institute of Biology, Odense University, SDU, DK-5230 Odense M.,
Denmark
(Received 30 October 2000; revised 27 June 2001; accepted 5 July 2001)
Echolocating bats transmit ultrasonic vocalizations and use information contained in the reflected
sounds to analyze the auditory scene. Auditory scene analysis, a phenomenon that applies broadly
to all hearing vertebrates, involves the grouping and segregation of sounds to perceptually organize
information about auditory objects. The perceptual organization of sound is influenced by the
spectral and temporal characteristics of acoustic signals. In the case of the echolocating bat, its
active control over the timing, duration, intensity, and bandwidth of sonar transmissions directly
impacts its perception of the auditory objects that comprise the scene. Here, data are presented from
perceptual experiments, laboratory insect capture studies, and field recordings of sonar behavior of
different bat species, to illustrate principles of importance to auditory scene analysis by echolocation
in bats. In the perceptual experiments, FM bats (Eptesicus fuscus) learned to discriminate between
systematic and random delay sequences in echo playback sets. The results of these experiments
demonstrate that the FM bat can assemble information about echo delay changes over time, a
requirement for the analysis of a dynamic auditory scene. Laboratory insect capture experiments
examined the vocal production patterns of flying E. fuscus taking tethered insects in a large room.
In each trial, the bats consistently produced echolocation signal groups with a relatively stable
repetition rate (within 5%). Similar temporal patterning of sonar vocalizations was also observed in
the field recordings from E. fuscus, thus suggesting the importance of temporal control of vocal
production for perceptually guided behavior. It is hypothesized that a stable sonar signal production
rate facilitates the perceptual organization of echoes arriving from objects at different directions and
distances as the bat flies through a dynamic auditory scene. Field recordings of E. fuscus, Noctilio
albiventris, N. leporinus, Pipistrellus pipistrellus, and Cormura brevirostris revealed that spectral
adjustments in sonar signals may also be important to permit tracking of echoes in a complex
auditory scene. © 2001 Acoustical Society of America. [DOI: 10.1121/1.1398051]
PACS numbers: 43.80.Lb, 43.80.Ka, 43.66.Ba, 43.66.Lj, 43.66.Mk [WA]
I. INTRODUCTION
Auditory perception goes beyond the detection, discrimination, and localization of sound stimuli. It involves the
organization of acoustic events that allows the listener to
identify and track sound sources in the environment. For
example, at a concert, individuals in the audience may be
able to hear out separate instruments, or differentiate between music played from different sections of a symphony
orchestra. At the same time, each listener may also follow a
melody that is carried by many different sections of the orchestra together. In effect, the listener groups and segregates
sounds according to similarity or differences in pitch, timbre,
spatial location, and temporal patterning, to perceptually organize the acoustic information from the auditory scene. Auditory scene analysis thus allows the listener to make sense
of dynamic acoustic events in a complex environment (Bregman, 1990).
a) Author to whom correspondence should be addressed; electronic mail: cmoss@psyc.umd.edu
b) Electronic mail: ams@dou.dk

J. Acoust. Soc. Am. 110 (4), October 2001

Auditory scene analysis shares many perceptual phenomena with visual scene analysis, facilitating the application of Gestalt principles to its study. For example, both visual and auditory scene analyses involve the segregation and
grouping of information to develop a coherent representation
of the external sensory environment. The auditory and visual
systems both perform operations such as object recognition,
figure–ground segregation, and stimulus tracking.
The phenomenon of auditory stream segregation in humans appears to follow some fairly simple laws, namely the
effect depends on the frequency similarity of groups of tones
and the rate at which stimuli are presented. Similar laws also
apply to apparent motion in vision, a phenomenon in which
spatially separated lights that flash in sequence give rise to
an observer’s perception of a moving light. As with melodic
motion in auditory streaming, visual apparent motion depends on the similarities among the lights that flash and the
temporal separation of flashes (Körte, 1915).
Perceptual grouping and segregation of sounds plays an
important role in auditory scene analysis. In human psychoacoustic studies, a subject listening to tones that alternate
slowly between low and high frequencies (e.g., 300, 2000, 400, 2100, 500, 2200 Hz) reports hearing the individual
tones in the sequence. However, when the stimulus presentation rate is increased, the listener hears out streams of tones, one containing the low-frequency signals (300, 400, 500 Hz) and the other containing the high-frequency signals (2000, 2100, 2200 Hz). In some cases, the listener reports hearing these tone streams in alternation and in other cases reports two simultaneous auditory streams that differ in perceived amplitude (Bregman and Campbell, 1971). A schematic of these stimulus patterns is presented in Fig. 1(A).

FIG. 1. (A) Schematic illustration of alternating tone frequencies over time that give rise to the perception of two pitch streams (after Bregman, 1990). (B) Schematic illustration of alternating echo delays over time that are hypothesized to give rise in the bat to the perception of two sonar target distance streams.
Only recently have studies of animal perception examined the principles of auditory scene analysis in nonhuman
species. Experiments with European starlings (Braaten and Hulse, 1993; Hulse et al., 1997; Wisniewski and Hulse, 1997; MacDougall-Shackleton et al., 1998) and goldfish (Fay, 1998, 2000) have demonstrated that spectral and temporal features of acoustic patterns influence the perceptual organization of sound in these animals. Using conditioning and generalization procedures, these researchers provide empirical evidence that both fish and birds can segregate complex acoustic patterns into auditory streams.
While the concept of auditory scene analysis applies
broadly to all hearing animals, it is vividly illustrated in the
echolocating bat, a mammal that probes the environment
with sonar vocalizations and extracts spatial information
about the world from the features of returning echoes. Each
species of bat has a distinct repertoire of signals that it uses
for echolocation, and the features of these sounds determine
the echo information available to its sonar imaging system.
The sonar sounds used by bats fall broadly into two categories: narrow-band, constant-frequency (CF) signals and broadband, frequency-modulated (FM) signals. Bats using CF signals for echolocation typically forage in dense foliage and are specialized to use Doppler shifts in returning echoes for the perception of target movement (e.g., Schnitzler et al., 1983), aiding in the segregation of figure (insect target) and ground (vegetation). By contrast, many FM bats forage in the open or at the edge of forests, using shorter, broadband signals (Kalko and Schnitzler, 1998; Schnitzler and Kalko, 1998) that are well suited for target localization (e.g., Simmons, 1973, 1979; Moss and Schnitzler, 1989, 1995) and for separating figure and ground.
The acoustic parameters a bat uses to determine the features of an object are conveyed by the features of the echoes (Moss and Schnitzler, 1995). For example, a bat can discriminate the size of an object from the intensity of echoes (Simmons and Vernon, 1971), the velocity of a target from the Doppler shift of the echoes (Schnitzler, 1968), and the shape of an object from the spectrum of the echoes (Simmons et al., 1974). Differences in arrival time, intensity, and spectrum of echoes at the two ears encode the location of an object in azimuth (Blauert, 1997; Shimozowa et al., 1974; Simmons et al., 1983) and elevation (Batteau, 1967; Grinnell and Grinnell, 1965; Lawrence and Simmons, 1982; Wotton et al., 1995). The third dimension, the distance between the bat and an object (target range), is determined from the time delay between the outgoing sound and the returning echo (Hartridge, 1945; Simmons, 1973). Together, these cues provide the bat with information to form a three-dimensional (3D) representation of a target and its position in space.
While research over the last 30 years has elucidated the echo
features that bats use to localize and discriminate sonar targets, we know little about how these acoustic features are
perceptually organized in the representation of an auditory
scene. Here, we present a conceptual framework to advance
our understanding of the bat’s perceptual organization of
sound for the analysis of complex auditory scenes.
II. DETAILING THE PROBLEM: AUDITORY SCENE
ANALYSIS BY ECHOLOCATION
The analogy between auditory scene analysis in bats and
visual scene analysis in other animals is particularly strong,
because sonar targets are auditory objects, and the echolocating bat uses spatial information carried by sonar echoes to
identify and track these auditory objects. Many species of bat
use echolocation to hunt small insect prey in the dark. This is
a daunting perceptual task, given the acoustic environment in
which the bat must operate. The bat produces sonar vocalizations and processes information in returning echoes to detect, track, and intercept insects that may be only a few millimeters in diameter. To accomplish this task, the bat must
sort out its own sonar vocalizations from returning echoes. In
addition, it may encounter echoes from multiple targets 共several insects in a cluster, branches, walls, etc.兲, as well as
signals produced by other bats in close proximity. To successfully intercept the insect and avoid obstacles, the bat
must organize acoustic information collected from multiple sonar targets arriving from different directions and at different times. This problem is illustrated in Fig. 2(A).

FIG. 2. (A) Illustration of a bat pursuing a prey item in the vicinity of trees. Each panel represents a new slice in time, each separated by a fixed interval. In each panel, the bat's position relative to the trees and insect changes, as the bat pursues its prey. Below the drawing is a schematic that displays the relative timing of sonar pulses (solid bars) and target echoes (individual tree and insect echoes are represented by filled triangles, circles, and x's). Bottom: separate plots showing how the time delay between each sonar pulse and returning echo depends on the distance between the bat and the reflecting object, plotted separately for each of the objects. (B) A schematic that illustrates the changing echo delays for each of the reflecting objects. The right y-axis shows the stages of insect pursuit. B-I refers to buzz one and B-II refers to buzz two. As the bat flies closer to the trees and insect, the delays shorten. The echo amplitudes are arbitrarily set at fixed values that do not change with distance, but durations and signal intervals were taken from a pursuit sequence recorded in the field. Only when the bat has passed an object does the amplitude decrease rapidly to reflect the low SPL radiated in the backward direction. The rate of delay change over time appears for each of the reflecting objects as a distinct ridge with a particular slope. In this display, one can visually identify and track the returning echoes from the trees and insect over time.
Figure 2(A) (above) shows a bat pursuing an insect that
is flying in the vicinity of trees. This figure contains six
panels, each illustrating a slice in time, and the bat’s position
relative to the insect and trees changes over time. Below the
illustrated panels is a schematic showing the timing of
echolocation cries, one associated with each slice in time,
and the corresponding echoes. Each of the bat’s sonar vocalizations appears as a solid dark bar, and the echoes returning
from the objects in the environment appear as stippled bars.
Echoes from the insect and three trees are identified with
distinct patterns. When displayed as a string of sounds,
pulses, and echoes, it is difficult to identify patterns of delay
change over time that the bat must sort out. In addition, echo
returns from one sonar pulse may arrive after the production
of a second pulse, as illustrated by the numbers below that identify the echoes associated with each pulse and the resulting echoes. The plots below separate out the delays of echoes for each of the objects in the bat's environment as the distance decreases. In these plots, one can more readily see that
the absolute delays and rates of delay changes over time are
distinct for the insect and the three trees.
These data are presented in yet another format in Fig.
2(B), which displays an echo delay marker for each of the
objects in the environment, stacked over time. This figure
shows the reflecting objects as sloping ridges, whose placement along the axes and slopes depend on the bat’s relative
position and velocity with respect to the insect and trees.
This figure also illustrates the change in sound repetition rate as the bat approaches a target [not conveyed in Fig. 2(A)], and the search, approach, and terminal buzz phases (I and II) of the insect capture sequence (Griffin et al., 1960) are labeled on the right y axis. The duration of the signals is conveyed by the length of each marker. The amplitude of the
echoes is arbitrarily set at a fixed level that does not change
with target distance, but the figure does illustrate how echo
amplitude decreases rapidly after the bat flies past an object.
It is important to note how the increase in sound repetition rate (and the decrease in signal duration) in the terminal phase defines and "sharpens up" the ridges corresponding to
the insect and the three trees. Just as the layout of a graphic
display can aid the identification of patterns in echo delay
changes over time, we propose that the bat’s auditory system
perceptually organizes the signals arriving from multiple objects at different distances to create a readable display, which
then permits tracking and capture of insects in a dynamic and
complex acoustic environment.
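The slope of each delay ridge in the display described above follows directly from the geometry of the situation. The following sketch is our own illustration, not a computation from the paper: assuming a speed of sound in air of about 343 m/s, an echo from an object at distance d arrives at delay τ = 2d/c, so closing on the object at speed v shortens the delay at a rate of 2v/c.

```python
# Illustrative sketch (not from the paper): echo delay tau = 2*d/c for an
# object at distance d, so the ridge slope in a delay-vs-time display is
# d(tau)/dt = -2*v/c for an object approached at closing speed v.
C_AIR = 343.0  # assumed speed of sound in air, m/s

def echo_delay_ms(distance_m: float) -> float:
    """Two-way travel time, in ms, of an echo from an object at distance_m."""
    return 2.0 * distance_m / C_AIR * 1e3

def delay_slope_ms_per_s(closing_speed_mps: float) -> float:
    """Rate at which echo delay shortens (ms per second of flight)."""
    return -2.0 * closing_speed_mps / C_AIR * 1e3

# A bat 3 m from an insect hears its echo at ~17.5 ms; closing at 5 m/s,
# that delay shrinks by roughly 29 ms for every second of flight.
print(echo_delay_ms(3.0), delay_slope_ms_per_s(5.0))
```

Objects at different ranges and closing speeds thus trace ridges with distinct intercepts and slopes, which is what makes the display in Fig. 2(B) readable.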
Moreover, echolocation is an active system, i.e., the bat
transmits the very sound energy it uses to probe the environment. As a bat flies toward a target, changes in the repetition
rate, bandwidth, and duration of its sonar emissions occur,
and these dynamic vocal production patterns are used to divide the bat’s insect pursuit sequence into different phases:
search, approach, and terminal buzz [Griffin et al., 1960; Webster, 1963; see Fig. 2(B)]. Given that the perceptual organization of sound depends on the acoustic features of signals, changes in the bat's sonar vocalizations certainly impact its analysis of an auditory scene. While the active
modulation of sonar vocalizations in response to changing
echo information complicates our task of understanding
scene analysis by echolocation, the bat’s vocal production
patterns also provide us with a window to the acoustic information the bat is actively controlling as it maneuvers through
the environment.
How does the bat’s sonar system operate to coordinate
spatial acoustic information from a complex environment to
avoid obstacles and intercept prey? Here, we describe some
experimental data showing that the bat assembles acoustic
information over time, a requirement for spatial tracking of
moving targets and the analysis of auditory scenes. We also
discuss sonar vocalization patterns of bats operating in a dynamic auditory scene. These vocalization patterns provide
some insights to the bat’s active control for scene analysis by
echolocation.
III. CONCEPTUAL FRAMEWORK
As described above, humans listening to tone sequences
may experience perceptual streaming of the acoustic patterns
along the frequency axis. We propose here that echolocating
bats, animals with well-developed three-dimensional (3D)
spatial hearing, may stream acoustic patterns along the delay
axis. Certainly a variety of acoustic parameters contributes to
a bat’s perceptual organization of sound, but we choose to
focus here on the single parameter of echo delay. If the bat
streams echo information along the delay axis, this could
provide the perceptual foundation for sorting echoes arriving
from different objects at different absolute delays and rates
of delay change over time [see Fig. 1(B) and Fig. 2].
While the notion of the bat’s acoustic scene has been
considered in very general terms (e.g., Dear et al., 1993a; Simmons et al., 1995), a conceptual framework to identify
and study the principles that operate in the perceptual organization of sound by an echolocating animal has not been
previously formulated. In this paper, we take a small step
towards unraveling the complicated problem of scene analysis in biosonar. Here, we articulate our working hypotheses
and report on data that lay a foundation for understanding
how the perceptual and vocal-motor systems may operate to
support auditory scene analysis by the echolocating bat. In this context, we wish to caution that our data only touch upon the
many issues that bear on the problem of scene analysis
through active sonar.
IV. STUDY 1: PSYCHOPHYSICAL EXPERIMENTS
FM bats, such as Eptesicus fuscus, receive brief acoustic
flashes of a changing environment as they fly. In order for the
FM bat to organize acoustic information from sonar targets
in a dynamic scene, it must assemble echo information over
time. E. fuscus must therefore use working- or short-term
memory to track changing echo information about sonar targets.
Field and laboratory observations of the bat’s increasing
repetition rate during insect approach and capture might suggest that echolocating bats integrate acoustic information
over time; however, these data have not been taken as the
definitive demonstration of echo integration, because echo
delay-dependent changes in vocal production patterns do not
explicitly inform us about the animal’s perception of echo
sequences. To experimentally address this question, we have
conducted some perceptual experiments, and the data demonstrate that Eptesicus fuscus can indeed assemble information about changing echo delay to discriminate sonar targets.
These findings suggest that the bat’s perceptual system meets
the minimum requirements for analysis of a dynamic auditory scene.
A. Animals
Five FM bats of the species E. fuscus served as subjects
in the different perceptual tasks. The animals were collected
from private homes in Maryland during the summers of 1995
and 1996 and housed in a colony room at the University of
Maryland, College Park. The temperature in the colony
room was maintained at approximately 27 °C and the day/
night cycle was reversed, with lights out between 7:00 a.m.
and 7:00 p.m. Bats were given free access to water and
maintained at about 85% of ad lib body weight. Food was
available only as a reward during behavioral experiments,
which were carried out 6–7 days/week over a period of 18
months.
B. Apparatus and target simulation
Behavioral experiments took place in a large (5.6 × 6.4 × 2.5 m) carpeted room, whose walls and ceiling were lined with acoustic foam (Sonex) that reduced the amplitude of ultrasonic reverberation by a minimum of 20–30 dB below what would be present if the room surfaces were hard and smooth.
FIG. 3. (A) Schematic of playback apparatus used in the psychophysical experiments. The bat was trained to rest at the base of the Y-platform and emit sonar vocalizations into a 1/8-in. microphone. The bat's signals were amplified, bandpass filtered, digitized, electronically delayed, attenuated, low-pass filtered, and broadcast back to the bat through a custom loudspeaker. (B) Frequency response curve of the playback system. (C) Schematic illustration of the pattern of delay changes presented in the sequential (S) and random (R) echo playbacks (see the text).

Each bat was trained to rest at the base of an elevated Y-shaped platform (1.2 m above the floor) and to produce sonar sounds. The bat's sonar sounds were picked up by a vertically oriented 1/8-in. Bruel & Kjaer condenser microphone (model 4138) that was centered between the arms of the platform at a distance of 17 cm from the bat. The bat's
echolocation sounds were amplified, bandpass filtered at 20–99 kHz (Stewart filter, model VBF7), digitized with 12-bit accuracy at a rate of 496 kHz (controlled externally by a Stanford Research Systems function generator), electronically delayed (custom DSP board, SPB2-Signal Data, installed in a 486 computer), attenuated (PA-4, Tucker-Davis Technologies), low-pass filtered (Krohn-Hite), and broadcast back to the bat through a custom electrostatic loudspeaker (designed by Lee Miller, University of Odense, Denmark). The loudspeaker, positioned in front of the microphone (0.5 cm lower than the microphone grid) and 15 cm from the bat, was powered by a dc amplifier (Ultrasound Advice) and had a frequency response that was flat within 3 dB between 25 and 95 kHz [see Figs. 3(A),(B)]. Placement of the microphone behind the speaker eliminated feedback in the system, and the signals recorded were not distorted by the presence of the speaker (see Wadsworth and Moss, 2000).

The total gain of the system feeding into the DSP board was approximately 60 dB, to bring the peak–peak amplitude of most bat sonar sounds to a level just below the 12-bit limit of the processor for maximum signal-to-noise ratio (S:N). Digital attenuators (PA-4, Tucker-Davis Technologies) permitted adjustment of the playback level of the sounds returning to the bat's ears, which was set at approximately 80 dB SPL (rms) for all experiments.
Each sound produced by the bat resulted in a single
sonar signal playback that simulated an echo from a target
positioned directly in front of the bat, whose distance was
determined by an electronic delay controlled by the experimenter. The shortest echo delay used was 4.43 ms, corresponding to a target distance of 76 cm, and the longest echo
delay was 4.93 ms, corresponding to a target distance of 85
cm.
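The mapping between echo delay and simulated target distance can be checked with a short calculation. The sketch below is our own illustration, not part of the apparatus software; the speed of sound is derived from the 2.9 ms/m one-way travel figure given in the calibration procedure.

```python
# Sketch (ours, not the authors' code): convert a total echo delay into the
# simulated one-way target distance. Sound travels out and back, so
# distance = delay * c / 2, with c taken from the paper's 2.9 ms/m figure.
C_AIR = 1.0 / 2.9e-3  # speed of sound, m/s (~344.8 m/s)

def simulated_range_cm(total_delay_ms: float) -> float:
    """One-way target distance (cm) simulated by a total echo delay (ms)."""
    return total_delay_ms * 1e-3 * C_AIR / 2.0 * 100.0

# The shortest and longest playback delays bracket the simulated targets:
print(round(simulated_range_cm(4.43)), round(simulated_range_cm(4.93)))  # 76 85
```

These values reproduce the 76- and 85-cm simulated target distances stated above.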
Before each experiment, a calibration routine was run to test each of the components of the target simulator. The electrostatic loudspeaker broadcast a linear 1-ms, 10–100-kHz frequency-modulated sweep that was picked up by a condenser microphone (QMC) positioned on the test platform. The signal received by the microphone was amplified, filtered (Stewart VBF-7, 10–99 kHz), and delivered to the DSP board. The arrival time and power spectrum of the FM sound picked up by the microphone were measured and compared against standard values. Experimental data were collected only when the delay and power spectrum of the calibration signal matched the standard values, a 0.29-ms delay when the microphone was positioned 10 cm from the speaker (one-way travel delay, 2.9 ms/m) and a relatively flat spectrum [±3 dB at 25–95 kHz; see Fig. 3(B)].
C. Behavioral task: Sequential versus random echo
delay sequences
Each of the bats was trained in a two-alternative forced-choice experiment to discriminate between two stimulus sets of echo delays. Each stimulus set contained the same set of six echo delays, and the bat received only one echo playback for each sonar emission. In one echo set, the delays systematically increased or decreased (stimulus "S"), and in the other echo set, the sequence of delays was randomized (stimulus "R"). The echo stimulus set repeated until the bat made its response (see below). The step size and direction of echo delay change were manipulated. The delay step size of echoes in stimulus set S was unequal in some experiments to ensure that the bat was discriminating the pattern of echo delay change, and not simply between variable and fixed delay steps in echo sets S and R. A schematic of these echo sets is illustrated in Fig. 3(C). Table I lists the different echo delay conditions tested in these experiments.

TABLE I. Summary of stimulus conditions in the sequential versus random echo delay discrimination task. Values refer to electronic echo delays in ms. Acoustic travel time from the bat to the microphone and from the loudspeaker to the bat adds 0.93 ms to these values for the total echo playback delays presented in the sequential conditions.

Condition 1 (Decreasing): 4.00, 3.90, 3.80, 3.70, 3.60, 3.50
Condition 2 (Decreasing): 4.00, 3.95, 3.90, 3.85, 3.80, 3.75
Condition 3 (Increasing): 3.50, 3.60, 3.70, 3.80, 3.90, 4.00
Condition 4 (Increasing): 3.75, 3.80, 3.85, 3.90, 3.95, 4.00
Condition 5 (Increasing, unequal step sizes): 3.50, 3.55, 3.75, 3.82, 3.90, 4.00
The bat learned to report whether it perceived a systematic increase or decrease in target distance by crawling down the left arm of the platform to indicate an "S" (sequential) response, or if it perceived a random presentation of the same echo delays, by crawling down the right arm of the platform to indicate an "R" (random) response. The presentation of the random and sequential echo sets followed a pseudorandom schedule (Gellerman, 1933), and the bat's response (S or R) was recorded. For each correct response, the bat received a food reward (a piece of mealworm), and for each incorrect response, the bat experienced a 10–30-s time-out. No correction trials were introduced. Each test day included 25–50 trials per bat.
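The stimulus logic can be made concrete with a short sketch. This is our own illustration under stated assumptions, not the laboratory's actual control software: the "R" set contains the same six delays as the "S" set, shuffled so that the order is neither systematically increasing nor decreasing.

```python
import random

# Hypothetical sketch of the stimulus logic (not the lab's control code).
# Electronic delays in ms for a Condition-1-style "S" (sequential) set.
S_SET = [4.00, 3.90, 3.80, 3.70, 3.60, 3.50]

def make_random_set(delays, rng):
    """Build an 'R' set: the same six delays, shuffled until the order is
    neither systematically increasing nor decreasing."""
    r = list(delays)
    while r == sorted(r) or r == sorted(r, reverse=True):
        rng.shuffle(r)
    return r

rng = random.Random(1)
R_SET = make_random_set(S_SET, rng)
assert sorted(R_SET) == sorted(S_SET)  # same delays, different ordering
print(R_SET)
```

Because both sets contain identical delays, a bat that discriminates them must be responding to the pattern of delay change over emissions, not to any single delay value.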
FIG. 4. Performance of two individual bats (M-6 and G-6) trained to discriminate between the sequential and random playbacks. Breaks between the data points appear when experimental trials were not run over consecutive days. Each block shows data that come from a single condition, indicated by numbers 1–5 above the x-axis (see Table I). Criterion for successful discrimination was 75%-correct performance over a minimum of 3 days.

D. Results

1. Behavioral performance

The data from two bats that successfully completed testing under all conditions are plotted in Fig. 4. (Three out of the five bats trained in the task fell ill during the experiment, and their data sets are incomplete.) The conditions are separated into different panels and identified by numbers 1–5 (see Table I for stimulus details on each condition). Note that in some instances, the bat's discrimination of echo patterns in one condition transferred immediately to a new pattern, and performance remained at or above 75%-correct following the introduction of a new stimulus set. These data show that the bat can integrate echo sequences along the dimension of delay (Morris and Moss, 1995, 1996), and they lay the foundation for future studies that can directly probe the phenomenon of stream segregation along the delay axis in echolocating bats. Such experiments will determine whether the bat hears out delay streams from complex echo delay patterns.

V. STUDY 2: BEHAVIORAL STUDIES OF ECHOLOCATING BATS OPERATING IN A COMPLEX AUDITORY SCENE

FIG. 5. Schematic of setup for video and sound recordings of tethered insect captures by echolocating bats. Two video cameras were mounted in the room to permit 3D reconstruction of the bat's flight path. Video recordings were synchronized with audio recordings taken with ultrasonic microphones delivering signals to either a digital acquisition system (IoTech Wavebook) or a reel-to-reel high-speed recorder (Racal).

A complete understanding of auditory scene analysis should specifically address the acoustic problems that must be solved by the species under study. The use of biologically relevant stimuli in the study of auditory scene analysis has not been widely applied, but this approach has been adopted
by Hulse and colleagues working on the perceptual organization of sound in the European starling (Hulse et al., 1997; Wisniewski and Hulse, 1997). They studied the European
starling’s perception of conspecific song and found evidence
for stream segregation of biologically relevant acoustic signals by the starling. Similarly, in bats, one can study the
phenomenon of scene analysis in the context of biologically
relevant acoustic tasks. In particular, the bat’s pursuit and
capture of insect prey in a dynamic acoustic environment
requires analysis of the auditory scene using echoes of its
own sonar vocalizations. Thus, studies of vocal production
patterns in echolocating bats can provide insights to the animal's perception and control of the auditory scene (Wadsworth and Moss, 2000). Indeed, the echolocating bat's perception of the world builds upon sonar returns of its
vocalizations. Given that the repetition rate and frequency
separation of sounds affect stream segregation in human listeners, the vocal production patterns of the bat would be
expected to influence its perceptual organization of space
using sonar echoes. Below, we describe high-speed video
and sound recordings of the echolocating bat as it flies
through a dynamic auditory scene to intercept insect prey
tethered in open laboratory space and in the vicinity of echo
clutter-producing leaves and branches.
A. Animals
Seven FM bats of the species Eptesicus fuscus served as
subjects in the laboratory insect capture studies. Four of the
animals were collected from private homes in Maryland during the summers of 1997–1999 and three from a cave in
Ontario, Canada in the winter of 2000. Bats were housed in
the University of Maryland bat colony room under conditions comparable to those of the bats in the psychophysical
experiments. Food was available only as a reward during
behavioral experiments, which were carried out 4–5 days/
week over a period of 27 months.
B. Behavioral studies of echolocation behavior in
free flight
Big brown bats (Eptesicus fuscus) were trained to capture tethered whole mealworms (Tenebrio molitor) in a large flight room (6.4 × 7.3 × 2.5 m) lined with acoustical foam (Sonex). Their vocalization behaviors were studied under open-space and clutter conditions. In the open-space condition, no obstacles were in the vicinity (within 1 m) of the insect target; however, the walls, ceiling, and floor of the flight room prevented us from creating a truly open-space environment. In the clutter condition, the tethered mealworm was suspended 5–90 cm from leafy house plants placed in the calibrated space of the behavior room (see video processing methods below). The separation between target and clutter was manipulated between trials: close (5–15 cm), intermediate (20–30 cm), and far (>40 cm). A schematic of the setup for these experiments is presented in Fig. 5.
Experiments were carried out using only long-wavelength lighting (Plexiglas #2711; Reed Plastics and Bogen Filter #182) to eliminate the use of vision by the bat (Hope and Bhatnagar, 1979). Mealworms were suspended at a height of about 1.5 m above the floor by monofilament line (Trilene Ultra Thin, 0.1-mm diameter) within a 5.3-m target area in the center of the room. A mealworm was suspended at a randomly selected location within the target area, and then the bat was released in a random direction to orient to the target area and find the mealworm. So that the bat would not memorize the target area, the mealworm was suspended
outside the target area 50% of the time, and those trials were not recorded.

C. F. Moss and A. Surlykke: Auditory scene analysis in bats

Once each bat achieved a consistent capture
rate of nearly 100% in open-space conditions (typically within 2 weeks of introduction to the task), audio and video recordings of its capture behavior began.
1. Video recordings
Two gen-locked (frame-synchronized), high-speed video cameras (Kodak MotionCorder, 640×240 pixels, 240-Hz frame rate, and 1/240-s shutter speed) were positioned just below the ceiling in the corners of the flight room. A calibration frame (Peak Performance Technologies) was placed in the center of the room and filmed by both cameras prior to each recording session. The high-speed video cameras were used to record target position, bat flight path, and capture behavior. The resulting images were used to calculate the three-dimensional positions of the bat, target, and microphones.

The video buffer contained 1963 frames, allowing for recording of 8.18 s of data at 240 frames/s. Using an end-trigger on the video, we captured the behavior leading up to and just following successful and unsuccessful insect captures.
2. Audio recordings
Echolocation signals were recorded using two ultrasonic transducers (Ultrasound Advice) placed within the calibrated space. In some trials, the microphone output was amplified (Ultrasound Advice) and recorded on the direct channels of a high-speed tape recorder (Racal Store-4 at 30 in. per second). An FM channel of the tape recorder was used to record TTL sync pulses corresponding to the start of each video frame and gated to the end of video acquisition. In other trials, the microphone signals were amplified, bandpass filtered (10–99 kHz, 40-dB gain, Stewart VBF-7), and recorded digitally on 2 channels of an IoTech Wavebook 512 at a sample rate of 240 kHz/channel. The Wavebook, controlled by a Dell Inspiron 7000 laptop computer, had 16 Mbytes of random access memory (RAM) and was set to record the 8.18 s prior to the trigger; the trigger was set to simultaneously stop the audio and video acquisition. The experimenter triggered the system on each trial after the insect capture was attempted and/or accomplished. This approach allowed us to record 8.18 s of audio data that corresponded precisely to the video segment for a given trial.
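The end-trigger scheme described above, in which audio is written continuously into memory and then frozen when the experimenter triggers, amounts to a pretrigger ring buffer. A minimal Python sketch of the idea follows; the class and its interface are invented for this example, and only the 240-kHz sample rate and 8.18-s window come from the text.

```python
from collections import deque

FS = 240_000         # samples per second per channel (from the text)
PRETRIGGER_S = 8.18  # seconds of history retained before the trigger

class PretriggerBuffer:
    """Ring buffer that always holds the most recent seconds of signal."""
    def __init__(self, fs=FS, seconds=PRETRIGGER_S):
        self.buf = deque(maxlen=int(fs * seconds))

    def push(self, samples):
        self.buf.extend(samples)  # oldest samples fall off automatically

    def freeze(self):
        """Call on the end-of-trial trigger: returns the buffered history."""
        return list(self.buf)

# Tiny demonstration buffer: 10 samples/s, 1-s window.
buf = PretriggerBuffer(fs=10, seconds=1.0)
buf.push(range(25))   # stream in 25 samples
print(buf.freeze())   # only the last 10 survive: [15, 16, ..., 24]
```

The same first-in, first-out logic reappears in the field-recording setup described in Sec. VI B.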
C. Data analysis

1. Video processing methods

A commercial motion analysis system (Peak Performance Technologies, Motus) was used to digitize both camera views with a Miro DC-30 Plus interface. The Peak Motus system was also used to calculate the three-dimensional location of points marked in both camera views. Digitization was to 1/4-pixel resolution using magnification. The digitization procedure resulted in 2560×1920 lines of resolution. The video image spanned less than 6 m horizontally, so that 1/4-pixel resolution corresponded to approximately 0.4 mm. The accuracy of the system was within ±0.5% over a calibrated volume extending approximately 2.2×2.2 m across the room and 1.6 m vertically. The three-dimensional space calibration frame provided 25 control points for direct linear transformation (DLT) calibration. The calibration procedure produced a mean residual error of 1.0 cm in each coordinate for the 25 control points.

The video position data for the bat, the target, and the clutter-producing plants (when present) were entered in a database for each trial. The bat's position with respect to the microphones was also measured, and a correction factor for the sound travel time from the bat to the microphone was used to accurately record the vocalization times.

2. Audio processing methods

Recordings of the bat's sonar vocalizations in the laboratory insect capture studies were processed following two distinct methods. The sounds recorded on the Racal Store-4 tape recorder were played back at 1/4th the recording speed (7.5 in. per s) and digitized using a National Instruments AT-MIO-16-1 board with a sampling rate of 60 kHz per channel, resulting in an effective sampling rate of 240 kHz per channel. Custom software (LabVIEW) trimmed the digitized audio data to begin with the first and end with the last frame of each trial's video segment, and output files were exported to a signal-processing program (SONA-PC©). Using SONA-PC, a fast Fourier transform (FFT) was performed over 256 points per time step, with 16–20 points being replaced in each time step, and the signals were displayed spectrographically. The onset time, duration, and start and end frequencies of the first harmonic of the emissions were marked with a cursor on the display and entered in a database for further analysis. Data acquired digitally with the IoTech Wavebook were displayed as time waveforms and spectrograms (256-point FFT) using MATLAB. The onset time and duration of the signals were measured using the time waveforms, and frequency measurements were taken from the spectrograms. Audio and video data were merged in a single analysis file in order to associate vocal behavior with motor events. MATLAB animation permitted dynamic playback of the bat's position data and corresponding vocalizations, enabling detailed study of the bat's behavior under different task conditions (programming by Aaron Schurger). Examples can be found at http://batlab.umd.edu/insect_capture_trials.

D. Results
Three hundred and twenty trials were run in the open-space condition and 170 trials in the clutter condition. Bats successfully intercepted tethered insects at a rate of 95% in the open space and 80% in the presence of echo clutter-producing plants, when the target was 20–90 cm from the clutter. Detailed video and sound analyses were carried out for a subset of trials, 53 open-space trials and 107 clutter
trials. Three-dimensional plots of the bats’ flight paths in
representative trials from selected open-space and clutter
conditions are shown in Fig. 6. Each data point (open circle)
along the bat’s flight path indicates the occurrence of a sonar
vocalization. The asterisks in the clutter trials denote the position of the branches. The final target position is shown with a star, and in the case of moving target trials, the path of the insect is illustrated with a fine line.

FIG. 6. Data from selected insect capture trials. Flight paths of bats pursuing tethered insects under open-space (left) and cluttered (right) conditions. Each point (open circle) along the flight path denotes the occurrence of a sonar vocalization. The star shows the end position of the target, which in the upper two examples was stationary and in the lower two examples was moving. In the moving target trials, the path of the insect is illustrated by a thin line. Note that the sonar vocalizations occur at a higher repetition rate as the bat approaches the target. For the clutter trials shown, asterisks denote the position of the branches. The bat successfully intercepted the target in all four trials shown.

All trials presented in
Fig. 6 resulted in successful target capture. Consistent with earlier reports (e.g., Griffin, 1953, 1958; Cahlander et al., 1964; Webster, 1963), the interval between the bat's sonar vocalizations decreased as the bat approached the insect, but in general, the decrease was not continuous. In fact, most trials contained groups of sounds that occurred with relatively stable intervals, interrupted by a longer pause, and followed by another group of sounds, sometimes with the same stable interpulse interval (IPI). We will refer to the sound groups with stable intervals as sonar strobe groups.
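As a rough illustration of how such groups could be identified in pulse-timing data, the following sketch groups pulses whose interpulse intervals stay within 5% of the group's running mean. This is a hypothetical detector written for this example, not the analysis procedure used in the study; it assumes pulse onset times (in ms) have already been extracted.

```python
def strobe_groups(onsets_ms, tol=0.05, min_pulses=3):
    """Split pulse onset times into 'sonar strobe groups': runs of pulses
    whose interpulse intervals (IPIs) stay within `tol` (here 5%) of the
    group's running mean IPI. Simplified criterion, for illustration."""
    ipis = [b - a for a, b in zip(onsets_ms, onsets_ms[1:])]
    groups, current = [], [0]          # indices of pulses in the current run
    for i, ipi in enumerate(ipis):
        prior = [ipis[j] for j in current[:-1]] if len(current) > 1 else []
        mean = sum(prior) / len(prior) if prior else ipi
        if abs(ipi - mean) <= tol * mean:
            current.append(i + 1)      # pulse continues the run
        else:
            if len(current) >= min_pulses:
                groups.append([onsets_ms[k] for k in current])
            current = [i + 1]          # start a new candidate run
    if len(current) >= min_pulses:
        groups.append([onsets_ms[k] for k in current])
    return groups

print(strobe_groups([0, 17, 34, 51, 120, 137, 154]))
# → [[0, 17, 34, 51], [120, 137, 154]]  (two groups split by a 69-ms pause)
```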
Figure 7 shows the spectrograms of sounds produced by the bat as it approached and intercepted the tethered insect target in the same set of open-space (left panels) and clutter (right panels) trials presented in Fig. 6. Note in all of the trials that the bat's sonar sounds sometimes occurred in groups, interrupted by gaps. The groups of sounds contained signals with relatively stable intervals (less than 5% variance in the interval between sounds). This is displayed in Fig. 8, which plots the interpulse interval of successive sounds in
the same four trials relative to the time of insect capture. In
the example of the open-space moving target trial, the bat
extends the production of these sound strobe groups with
short IPIs over 1000 ms before contact, and the intervals
remain just below 20 ms over several successive strobe
groups. Processing these groups of sounds with relatively
fixed repetition rate might facilitate the bat’s analysis of sonar scenes and planning for target capture.
These sonar signal sequences, typical of those recorded in the lab (and under some conditions in the field; see below), include examples of several groups of sounds with stable intervals. Histograms summarizing the interpulse intervals and durations of these sonar sound groups in the open-space and cluttered conditions are presented in Fig. 9. Stable IPI groups with intervals smaller than 13 ms were excluded from this summary figure, as they comprise the terminal buzz (see Fig. 11). In both the open and cluttered environments, the strobe groups occurred most frequently with intervals characteristic of the late approach phase of insect pursuit: median = 16.5 ms under open-space
conditions and 17.4 ms under clutter conditions.

FIG. 7. Spectrograms of the sounds produced by the bats in the same trials as shown in Fig. 6. Each set of three panels comes from a single trial, with time wrapping from one panel to the next. Trials on the left come from open-space conditions and on the right from cluttered conditions. The upper panels display data taken when the target was stationary and the lower panels display data taken when the target was moving. Note that the sounds occur in stable IPI groups with a relatively consistent repetition rate, followed by a pause (e.g., clearly demonstrated in the middle panel of the open-space, moving target trial). Note that the panels show sonar sounds starting at approximately 1500 ms prior to target capture.

FIG. 8. Interpulse interval relative to target contact, plotted for the same four trials as shown in Figs. 6 and 7. The plots on the left show data from trials in open space and on the right from cluttered space. The upper plots show data from stationary target trials, and the lower plots show data from moving target trials. Note that the time scale on each panel shows sounds produced by the bat starting at 1500 ms prior to contact.

FIG. 9. (A) Distribution of the interpulse interval (IPI) of echolocation sounds produced in "stable IPI groups" under open-space (above) and cluttered (below) insect capture conditions. The median intervals for stable IPI groups were 17.4 ms for open space and 16.5 ms for cluttered trials. (B) Distribution of the duration over which the stable IPI groups occurred in open-space (above) and cluttered (below) insect capture conditions. The median duration of stable IPI groups in the open-space trials was 43.3 ms and in the cluttered trials was 37.6 ms.

The difference in distribution of strobe group intervals between open
space and clutter conditions was not statistically significant [Fig. 9(A)]. Across behavioral trials, stable IPI groups contained three, four, and sometimes more sounds. The median duration of the stable IPI groups (onset of the first sound to the onset of the last sound in the group) was 43.3 ms for the open-space condition and 37.6 ms for the cluttered condition [Fig. 9(B)]. The difference between the duration of strobe groups under the open-space and cluttered conditions was not statistically significant.
Figure 10 shows the incidence of sonar strobe groups for
individual trials, displayed separately for open-space and
clutter conditions. Individual trials analyzed are listed on the
y axis (53 for open-space and 107 for clutter), and time relative to target contact on the x axis. Data points mark each
sonar strobe group within a trial, and the position of each
data point along the x axis indicates its time of occurrence
relative to target contact (time zero). In both open-space and
clutter trials, there is a clustering of strobe groups 250–750
ms before target contact. At earlier times relative to target
contact, the incidence of sonar strobe groups is much lower
in open space than in clutter. It is noteworthy that the incidence of strobe groups is much higher after target contact in
the clutter condition than in the open space. In the clutter
condition, the bat must negotiate the branches after contact,
and this appears to influence the production of sonar strobe
groups following insect capture in these trials.
As the bat gets closer to a target, it carefully adjusts its
vocalizations to avoid overlap between sonar pulses and target echoes. In addition, the bat appears to adjust the interval
between sounds to allow time for reception of relevant target
echoes before producing the next sound. Just prior to capture, the bat produces sounds at a very high rate. This group
of sounds, produced at a rate higher than about 80 sounds/
second and reaching approximately 150/second just prior to
capture, is called the terminal buzz [see Fig. 2(B)]. In some trials, the interval between the bat's sounds dropped abruptly from about 15–18 ms down to 6–8 ms, omitting the buzz I component from the terminal sequence. The terminal sequences recorded in the laboratory always contained a buzz II component (intervals less than 8 ms), and summary data
on buzz II are presented in Fig. 11.
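The pulse-echo overlap constraint mentioned above can be made concrete with a small calculation. This is an illustrative sketch, not an analysis from the study: it assumes a speed of sound of 343 m/s and the simple criterion that a pulse must end before its own echo returns.

```python
C_AIR = 343.0  # assumed speed of sound in air, m/s

def echo_delay_ms(target_range_m):
    """Two-way travel time of a sonar pulse to a target and back, in ms."""
    return 2.0 * target_range_m / C_AIR * 1000.0

def overlap_free(pulse_duration_ms, target_range_m):
    """True if the pulse ends before its own echo returns."""
    return pulse_duration_ms < echo_delay_ms(target_range_m)

for r_m in (0.1, 0.5, 1.0):
    print(f"range {r_m:.1f} m -> echo delay {echo_delay_ms(r_m):.2f} ms")
```

At half a meter the echo returns after roughly 3 ms, so the pulses of the terminal buzz must be very short to remain overlap-free at close range.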
Figure 11(A) plots buzz onset and offset (relative to target contact) and total buzz II duration for insect capture sequences in open space (moving and stationary targets) and in the presence of clutter (far, intermediate, and close to the target). While moving insects were presented in both the open-space and clutter conditions, the data are excluded from the clutter summary in this plot, because physical limitations of our setup only allowed us to introduce moving targets at the intermediate and far target–clutter separations.
FIG. 10. Occurrence of sonar strobe groups relative to contact time, shown by filled circles. The time of each strobe group is displayed across individual trials and relative to target contact time (0 on the x axis). Data from open-space trials are shown in the upper plot and from cluttered space in the lower plot.

Buzz onset time differed across conditions (F = 5.37; p < 0.001), with the smooth-motion open-space condition yielding a significantly earlier onset time than the stationary open-space and clutter trials. The buzz onset time in the open-space moving target trials was earliest (mean = −265.3 ms,
SEM = 19.8 ms). The shortest target–clutter separation tested (5–15 cm) gave rise to the latest buzz onset time relative to contact (mean = −188.7 ms, SEM = 12.2 ms). The intermediate target–clutter separation (20–30 cm) and the far target–clutter separation (>40 cm) yielded similar buzz onset times (intermediate mean = −202.85 ms, SEM = 13.1 ms, and far mean = −195.8 ms, SEM = 10 ms), which did not differ statistically from the open-space stationary buzz onset time (mean = −213.9 ms, SEM = 4.7 ms).
The offset time of the buzz relative to target contact also differed significantly across conditions (F = 6.27; p < 0.001). It was earliest for the near-clutter condition (mean = −66.2 ms, SEM = 3.8 ms), and decreased systematically from the intermediate (−53.7 ms, SEM = 3.1 ms) to the far (−44.79 ms, SEM = 3.9 ms) clutter conditions, and again slightly for the open-space stationary condition (−38.08 ms, SEM = 2.9 ms). The mean open-space moving target trial buzz offset time relative to target contact was −54.6 ms
(SEM = 4.7 ms), similar to the mean for the intermediate target–clutter separation.
The mean duration of the buzz shortened systematically from the open-space moving target trials (mean = 210.8 ms, SEM = 17.2 ms) to the open-space stationary trials (mean = 175.8 ms, SEM = 4.8 ms) and across the far (151.06 ms, SEM = 10.2 ms), intermediate (149.2 ms, SEM = 13.5 ms), and near (122.5 ms, SEM = 14.6 ms) clutter–target trials. This overall change in buzz duration was statistically significant (F = 6.23, p < 0.001). Post hoc analyses show that the significant contrasts were between open-space motion and all three target–clutter conditions (Tukey comparisons, p < 0.05). By shortening the duration of the terminal buzz with target clutter, the bat shortened the time period when it was likely to experience a mixing of pulses and echoes from closely spaced sonar vocalizations.
Figure 11(B) shows that the target distance at which the bat initiated the terminal buzz was longer in open space than in clutter (F = 19.77, p < 0.001). In open space the mean distance was 61.55 cm (SEM = 2.75 cm) and 64.69 cm (SEM = 2.27 cm) for moving and stationary trials, respectively. As the target was placed closer to the clutter, the buzz onset distance became shorter, suggesting that the bat made its decision to intercept the target at closer range under these conditions. The mean target distances at buzz onset in cluttered space were 51.17 cm (SEM = 2.57 cm), 42.92 cm (SEM = 2.21 cm), and 33.77 cm (SEM = 3.0 cm) for the far, medium, and near conditions, respectively. Post hoc comparisons across conditions show that the buzz onset distance differed significantly across all five conditions (Tukey test, p < 0.01).
The difference in the bat's distance to the target when it terminated the buzz was not statistically significant across conditions (F = 2.36, p > 0.05); however, the distance traveled by the bat during the buzz was different across all conditions (F = 12.42, p < 0.001). Distances traveled over the course of the buzz were 86.64 cm (SEM = 7.78 cm) in the open-space moving condition, 65.77 cm (SEM = 3.94 cm) in the open-space stationary condition, 51.43 cm (SEM = 3.3 cm) in the far-clutter condition, 49.93 cm (SEM = 4.49 cm) in the medium-clutter condition, and 36.24 cm (SEM = 5.73 cm) in the near-clutter condition.
VI. FIELD STUDIES OF ECHOLOCATION BEHAVIOR
The complex and dynamic acoustic environment of the
echolocating bat requires perceptual organization of sound
stimuli to segregate and track signals from different objects
and other animals in the environment. Observations from
field recordings suggest that the bat actively controls the information it extracts about the auditory scene by modulating
the spectral and temporal characteristics of its sonar vocalizations. Here, we report on field recordings from several
species of echolocating bat, with the goal of using sonar
signal design and temporal patterning as a window to the
bat’s perceptual control of the spatial information sampled
from the environment.
FIG. 11. (A) The onset and offset (relative to target contact) and duration of the terminal buzz II produced by bats just prior to insect capture under open-space (moving and stationary) and stationary-target-in-clutter trials (near, 5–15 cm; medium, 20–30 cm; and far, >40-cm target–clutter separations). (Error bars show one standard error of the mean.) (B) The bat's distance to the target at the time it initiated and ended the terminal buzz across the same five conditions. Also shown is the distance traveled by the bat while producing the terminal buzz sequence in the five insect capture conditions.
A. Field sites
Echolocating bats inhabit both temperate and tropical environments, and sound recordings of foraging bats were taken at several different sites around the world. Noctilio albiventris and N. leporinus (Noctilionidae) and Cormura brevirostris (Emballonuridae) were recorded on Barro Colorado Island in Panama in October 1999, in collaboration with E. Kalko and M. E. Jensen. The Noctilio species flew low over a calm water surface in a small bay. They were using echolocation to capture fish swimming just below the water's surface and mealworms (set out by experimenters) floating on the water's surface. The Cormura brevirostris was recorded while it hunted aerial insects in a small clearing in the tropical rain forest covering the island.
The vocalizations of E. fuscus were recorded in the summers of 1999 and 2000 as the bats foraged for insect prey in Greenbelt, Maryland. At this field site, E. fuscus flew along the edge of a forest that abutted a meadow. Under moonlight, two bats were sighted that flew 3–5 m above the ground and 2–6 m from the forest edge, and on two occasions we recorded several bats simultaneously at this location. The echolocation behavior of Pipistrellus pipistrellus [45-kHz phonic type (Barratt et al., 1997)] was recorded in Montel (Midi-Pyrénées, southern France) in collaboration with Lee A. Miller and colleagues. The recording area was under and close to a small bridge over a creek with banks covered with shrubbery. Several pipistrelle bats hunted simultaneously in this area.
B. Recording methods
All recordings used microphones that operate in the ultrasonic range. In Panama, recordings were taken with a battery-operated 1/4-in. model 40BF G.R.A.S. microphone, amplified (40 dB) and high-pass filtered (13 kHz) through a G.R.A.S. 12AA microphone power supply. This system is linear (±1 dB from 15 to 100 kHz). In Maryland, recordings were taken with a 1/4-in. ACO Pacific microphone powered
and amplified with a Larson Davis battery-operated power supply. In all cases the microphone was mounted on the end of a thin 2-m rod. The microphone signals were continuously A/D converted on-line (12 bit, sampling rate 250 kHz) using a battery-operated IoTech Wavebook 512. The signals were stored in a ring buffer (first-in, first-out; FIFO). The Wavebook had 128 Mbytes of random access memory (RAM) and was set at 3–6 s pretriggering time and 1–2 s post-triggering time. The Wavebook was controlled by a laptop computer (AST Ascentia P series, IBM Thinkpad 600, or Dell Inspiron 7000). When we picked up strong signals on a bat detector (Pettersson Electronics D240), we began signal acquisition with the Wavebook, storing the 3–6 s preceding and the 1–2 s following the trigger. At all recording sites, the data storage system was battery operated, resulting in a low noise floor.
In France the pipistrelle recordings were taken with a linear array of three Brüel & Kjaer 1/4-in. microphones, each separated by 75 cm. The array was placed horizontally, approximately 1.5 m above the ground, with the microphones pointing forward and upward at an angle of 45 deg. The sounds were stored on a Racal instrumentation tape recorder running at 30 ips (flat frequency response up to 150 kHz), and later digitized for analysis using the IoTech Wavebook (see above).
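The paper does not spell out how distance was derived from the array (see Sec. VI C), but the geometry of a three-microphone linear array permits a time-difference-of-arrival (TDOA) estimate. The sketch below is purely illustrative: the 75-cm spacing comes from the text, while the 2D coordinates, the brute-force grid search, and all function names are assumptions of the example.

```python
import itertools
import math

C = 343.0  # assumed speed of sound, m/s
MICS = [(-0.75, 0.0), (0.0, 0.0), (0.75, 0.0)]  # linear array, 75-cm spacing

def tdoas(src):
    """Arrival-time differences of the outer mics relative to the center mic."""
    t = [math.dist(src, m) / C for m in MICS]
    return (t[0] - t[1], t[2] - t[1])

def locate(measured, step=0.05, extent=10.0):
    """Brute-force grid search for the 2D source position (in front of the
    array) whose predicted TDOAs best match the measured ones."""
    best, best_err = None, float("inf")
    xs = [i * step - extent / 2 for i in range(int(extent / step) + 1)]
    ys = [i * step for i in range(1, int(extent / step) + 1)]  # y > 0 only
    for x, y in itertools.product(xs, ys):
        p = tdoas((x, y))
        err = (p[0] - measured[0]) ** 2 + (p[1] - measured[1]) ** 2
        if err < best_err:
            best, best_err = (x, y), err
    return best

true_pos = (1.0, 4.0)
est = locate(tdoas(true_pos))
print(est)  # close to (1.0, 4.0) at the 5-cm grid resolution
```

In practice a closed-form hyperbolic solution or least-squares fit would replace the grid search, but the sketch shows why three microphones suffice to recover range in the near field.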
C. Results
The field recordings demonstrated that bats emit signals that, depending on species, allow them to exploit both time and frequency cues to segregate their own signals and echoes from a complicated background of noise and emissions from other bats. In general, bat diversity is much greater in the tropics, which is reflected in a broad representation of sonar signal characteristics from this part of the world. In temperate areas there are not nearly as many bat species; however, acoustically, temperate species may face comparable sonar challenges, since several individuals of the same species may hunt in close proximity in areas with a high insect density. Naturally, the similarity of signals is greatest within the same species, and this situation may thus create confusion for a bat trying to discriminate between echoes from its own emissions and those of its neighbors.
The recordings of the two Noctilio species (N. leporinus and N. albiventris) illustrate just how complex the acoustic environment of the bat can be. Generally, the density of bats on Barro Colorado Island, Panama was very high. Up to 25 Noctilio bats, most of which were N. leporinus, might fly in the same area simultaneously. Thus, we recorded many overlapping sonar signals, and the silent breaks were few and short. N. leporinus emits signals with most energy around 55 kHz (Schnitzler et al., 1994). Our recordings (Fig. 12) show peak energy around 58 kHz, whereas the dominant energy in the sonar signals of N. albiventris was around 74 kHz, close to the values (67–72 kHz) reported earlier (Kalko et al.,
1998).

FIG. 12. Two Noctilio species (N. leporinus and N. albiventris) hunting over a calm water surface. (A) is a multiflash photo with 50 ms between flashes, except between 1 and 2 and 6 and 7. There are at least five different individuals in the photo. Photo by Elisabeth K. V. Kalko. (B) is a spectrogram of the sound recording corresponding to the multiflash photo in (A), illustrating how complex the acoustic scene can be in a dense group of bats. The spectrograms show that the main frequency of the N. leporinus is around 58 kHz, whereas the N. albiventris emissions fall in two groups, either around 72 or 74 kHz.

Figure 12 also illustrates that the N. albiventris recorded simultaneously emit slightly different frequencies,
e.g., approximately 72 and 74 kHz.
Similar to the insect capture recordings of E. fuscus in the laboratory (see Sec. V above and Figs. 6, 7, and 8), vocal sequences recorded in the field included stable IPI sequences. Groups of sonar signals produced at a stable repetition rate in the field again suggest active vocal control to facilitate analysis of the auditory scene. An illustration of sonar IPI sets produced by E. fuscus hunting along a forest edge is shown in Fig. 13. When E. fuscus is in cruising flight, the interpulse intervals are stereotyped (Surlykke and Moss, 2000). In contrast, a stable pulse repetition pattern, followed by short breaks and another group of signals at a stable, higher repetition rate, clearly reveals active hunting, similar to the stable IPI groups described in our lab recordings of insect capture.
We took sound recordings of E. fuscus and P. pipistrellus hunting in close proximity to neighbors, and these sound sequences suggest active vocal control of the emitted dominant frequency to facilitate the identification and perceptual segregation of echoes from self-produced cries and those from neighbors. In the case of E. fuscus, a combination of the directionality of the sonar emissions (Hartley and Suthers, 1989) and of the microphones indicates that bats recorded
simultaneously were close in range.

FIG. 13. The interpulse intervals (IPI) of four field recordings of E. fuscus. (A) shows spectrograms of one recording sequence. (B) shows the IPIs of all four recordings, illustrating the typical changes in the regular pattern of emission. One pattern is short breaks appearing as sharp peaks on the IPI curves. The other typical pattern is short bursts of higher but almost constant repetition rates (stable IPI groups), resulting in troughs in the curves, e.g., the five cries between intervals 3 and 7 on the curve plotted with squares, or the five cries between intervals 17 and 21 on the curve plotted with triangles. The histogram of cry intervals (C) (10-ms bin width) shows that besides the "standard" value of around 140 ms and approximately twice that value (see Surlykke and Moss, 2000), there is a lower peak around 70 ms corresponding to the stable IPI groups present in the laboratory vocalization data.

In the case of P. pipistrellus's echolocation behavior, the linear array of microphones allowed for determination of the distance to the bats,
and the directionality of the microphones, combined with the
voice notes on the tape, ensured that the recordings chosen
for analysis were from bats in front of the array. Although
flight angle and speed were not measured, Doppler shifts
introduced by the bat’s flight velocity would be less than 0.7
kHz at 25 kHz, assuming a flight velocity of 10 m/s. It is
noteworthy that both species show evidence of frequency
shifts of several kHz when two bats are flying close together.
Thus, these frequency shifts are considerably larger than
those that could be explained by Doppler shifts introduced
by the bat's flight velocity (see Surlykke and Moss, 2000), and suggest a jamming avoidance response. Such frequency shifts could help segregate information relevant to different individuals (Fig. 14).
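The Doppler figure quoted above can be reproduced with the first-order moving-source approximation Δf ≈ f·v_radial/c. This is a sketch, not a calculation from the study: 343 m/s is an assumed speed of sound, and the actual shift depends on the angle between the flight path and the microphone.

```python
import math

C = 343.0  # assumed speed of sound in air, m/s

def doppler_shift_hz(f_emit_hz, v_ms, angle_deg=0.0):
    """First-order Doppler shift (f * v_radial / C) for a moving source;
    angle_deg is between the flight direction and the line to the
    microphone. Textbook approximation, valid for v much less than C."""
    v_radial = v_ms * math.cos(math.radians(angle_deg))
    return f_emit_hz * v_radial / C

# A 25-kHz cry from a bat flying at 10 m/s straight toward the microphone:
print(round(doppler_shift_hz(25_000, 10.0)))  # 729, i.e. about 0.7 kHz
```

Since the radial component shrinks with the cosine of the flight angle, 0.7 kHz is effectively an upper bound, well below the several-kHz shifts observed between neighboring bats.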
Finally, field recordings show that some bat species emit
signals with systematic changes in frequency from emission
to emission. Figure 15 shows a 500-ms recording of an Emballonurid bat, Cormura brevirostris, which emits triplets of
multiharmonic sounds in the search phase. The dominant
harmonic of the first signal is at 26 kHz, the second at 29 kHz, and the third at 33 kHz. This may be the bat's solution
in the frequency domain to recognizing its own signals, and
thus echoes, in areas with very high bat density and much
acoustic activity in the same frequency range.
VII. DISCUSSION
Echolocating bats probe the environment with sonar vocalizations and use information contained in the returning
echoes to maneuver through the environment and intercept
small flying insect prey in the dark. These extraordinary behaviors lead one to speculate that the echolocating bat’s representation of the environment obtained through sonar compares well with that of human spatial vision. Spatial vision
does not arise from passive reception of retinal images but instead builds upon transformational rules that support interpretation of the visual scene (Hoffman, 1998). Similarly, auditory scene analysis utilizes organizational rules that permit the perceptual grouping and segregation of sound sources in
the environment (Bregman, 1990). In the case of the echolocating bat, its vocal control over the acoustic environment allows the motor system to play directly into its analysis of the auditory scene. We hypothesize that active perceptual processes and active vocal control over echo information obtained from the environment operate in concert, permitting an elaborate representation of a dynamic auditory scene. Here, we have presented examples from perceptual and motor studies that illustrate the active processes that contribute to scene analysis by echolocation.

FIG. 14. (A) Signals from two E. fuscus recorded simultaneously. As long as both bats are recorded at high intensity while they are presumably close together (from 0 to around 2.5 s), they emit signals in which the main frequency differs by 4–5 kHz. When one of the bats starts veering away and the other closes in on the microphone (as revealed by the reduced intensity of the low-frequency bat and the increased intensity of the high-frequency bat), both bats adjust the frequency of their emissions to approximately the same value, approximately 26 kHz. (B) Same pattern for Pipistrellus pipistrellus. In this case an array of three microphones allowed for discrimination between the emissions of the two bats (filled squares and filled circles, respectively) and for determination of the approximate distance between the bats (line curve). In this case the emitted frequencies of both bats are scattered around 48 and 50 kHz, but when the bats start closing in on one another (around 1600 ms) they adjust their frequencies to be separated by approximately 2.5 kHz.
FIG. 15. In the search phase the tropical bat Cormura brevirostris (Wagner's sac-winged bat, a.k.a. the "do-re-mi bat") emits triplets of sonar sounds with main frequency increasing from 26 kHz to 29 kHz and ending at 33 kHz. This very distinct frequency pattern may help the bat to distinguish between its own emissions and those of the many other bats operating in the same frequency range. Background signals from other bats also appear in this figure; a single bat's signals are identified by the numbers 1, 2, 3. Another advantage of this frequency shift might be to reduce problems with overlap between emission and echo (see the text).
J. Acoust. Soc. Am., Vol. 110, No. 4, October 2001
Over the past several decades, echolocation research has yielded valuable data on sonar performance and acoustic information processing by many different species of bats (see, for example, Griffin, 1958; Busnel and Fish, 1980; Moss and Schnitzler, 1995); however, very little attention has been directed toward understanding how echo information may be interpreted by the bat's auditory system to build a perceptual representation of the auditory scene. The aims of this paper are to highlight the question of higher-level perception in sonar, to provide a conceptual framework for studying this problem, and to report on data that begin to address questions of scene analysis by echolocation in bats.
In this article, we have emphasized temporal processing of sonar signals for spatial analysis of the auditory scene. Certainly many distinct acoustic features contribute to the bat's perception of the world through sound (e.g., direction, spectrum, amplitude), but we have focused largely on the processing of echo delay for target distance estimation, because this signal parameter plays a central role in the bat's perceptually guided behavior through three-dimensional space. In particular, we have chosen to present data from echo discrimination experiments, laboratory insect capture studies, and field recordings to illustrate the role of temporal patterning in perception and action for scene analysis by echolocation in bats. The contribution of spectral signal characteristics to the bat's perceptual organization of sound was discussed briefly in the context of sonar behavior in the field (Figs. 14 and 15). A more detailed analysis of a wide range of acoustic parameters will be the focus of future work.
The perceptual experiments presented in this paper explicitly examined the fundamental issue of processing spatial information from a dynamic acoustic environment. Changes in the auditory scene, produced by movement of sonar targets and by the bat's own flight path, are sampled at a low duty cycle by species that use frequency-modulated (FM) echolocation cries, resulting in brief acoustic snapshots of the world. The bat's auditory system must therefore assemble information over time to build a representation of the dynamic auditory scene. While others (e.g., Hartley and Suthers, 1992; Roverud, 1993, 1994; Roverud and Rabitoy, 1994) have shown that decreasing echo delay can evoke distance-appropriate increases in sonar signal production in trained echolocating bats, these earlier behavioral experiments did not directly study the bat's perceptual discrimination of echo delay patterns that increase or decrease over time. Here, we have presented psychophysical data demonstrating that the perceptual system of the big brown bat, Eptesicus fuscus, supports the integration of acoustic information over time, permitting the discrimination of increasing or decreasing echo delay patterns. Our future experiments will attempt to determine whether the bat hears out streams of echoes from an echo-cluttered background, which would allow it to segregate and track moving targets against that background.
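At its simplest, the systematic-versus-random discrimination described above reduces to detecting a monotonic trend across successive echo delays. The following sketch is only our illustration of that logic; the function name and delay values are hypothetical, not the stimuli used in the experiments:

```python
def is_systematic(delays_ms) -> bool:
    """True if successive echo delays change monotonically
    (all increasing or all decreasing), as in a systematic playback set."""
    steps = [b - a for a, b in zip(delays_ms, delays_ms[1:])]
    return all(s > 0 for s in steps) or all(s < 0 for s in steps)

# A steadily approaching target yields systematically decreasing delays:
# is_systematic([9.0, 8.2, 7.4, 6.6]) is True, while a randomized
# ordering of the same delays, e.g. [9.0, 6.6, 8.2, 7.4], is not.
```

The point of the sketch is that classifying such sequences requires comparing echoes across time, not evaluating any single echo in isolation.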
Here, we have also presented data on active vocal control by echolocating bats, operating both in the laboratory
and in the field. Unlike the majority of hearing animals that
perceive their surroundings by listening passively to acoustic
signals generated in the environment, the echolocating bat
actively transmits and controls the acoustic information it
obtains about the auditory scene. The bat’s active control
over the acoustic information it receives from the environment prevents us from conducting the classical psychophysical experiments on auditory scene analysis with bats, like
those employed with passively listening subjects. However,
the active selection of sonar signal parameters under changing task conditions presents a valuable opportunity to attack
the problem of scene analysis from the motor side.
We propose that the bat's vocal production patterns during insect capture are consistent with the notion that it uses information over successive echoes to build a representation of the world that ultimately guides its behavior. If the bat were to process each echo as a discrete event and modify its behavior based on the spatial information contained in single echoes, insect capture in a dynamic environment would surely fail. Assuming a behavioral response latency of 100-150 ms, the position of the target with respect to the bat would have changed by the time the bat localized and responded to it, and the bat's motor adjustments would lag behind the movement of the insect. Furthermore, if the bat were to process echo information from each sound before making appropriate motor adjustments, the intervals between sounds should be greater than the shortest behavioral response latency of approximately 100 ms. However, intervals between the bat's echolocation sounds are typically less than 100 ms, except during the early search phase of insect pursuit or during cruising (Griffin, 1958), suggesting that the bat emits and processes signals in groups with shorter interpulse intervals. In this context, it is noteworthy that the flight path data often show the bat predicting its point of interception with a moving target (example shown in Fig. 6).
Assuming that the general principles of auditory scene analysis apply to the echolocating bat, we hypothesize that the task of organizing acoustic information from a dynamic environment is simplified if the bat probes sonar targets with vocalizations occurring at a relatively stable interval, which we refer to as stable IPI sound groups. In human listening experiments, the interval between successive sounds influences the tendency to hear out auditory streams (Bregman, 1990), and in the bat, the interval between signals in a group may influence the acoustic streams it segregates from objects at different distances and directions.
What other benefits might the bat derive from the production of stable IPI sound groups? Information about target position may be encoded more reliably in the central nervous system under conditions in which the sonar production rate is stable. The echolocating bat's auditory system contains a population of neurons that shows facilitated, delay-tuned responses to pairs of sounds simulating sonar emissions and echoes, a response property believed to be related to the encoding of target range (Suga and O'Neill, 1979). Echo-delay-tuned neurons, found in the auditory brainstem (Mittmann and Wenstrup, 1995; Dear and Suga, 1995; Valentine and Moss, 1997), thalamus (Olson and Musil, 1992), and cortex (O'Neill and Suga, 1982; Sullivan, 1982; Wong and Shannon, 1988; Dear et al., 1993b), show very weak responses to single FM sounds. However, these neurons respond vigorously to pairs of FM sounds (a simulated pulse and a weaker echo) separated by a delay. Typically, echo-delay-tuned neurons show facilitated responses to simulated pulse-echo pairs over a delay range of several milliseconds. The pulse-echo interval to which an echo-delay-tuned neuron shows the largest response is referred to as the best delay. For example, a neuron may respond maximally to a pulse-echo interval of 9 ms, which corresponds to a target range of 1.5 m. Pulse-echo pairs with intervals of 7 and 11 ms (simulating ranges between 1.1 and 1.9 m) may also evoke a response, but at a lower rate.
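The delay-to-range correspondence used here follows directly from the two-way travel time of sound. A minimal sketch, assuming a speed of sound of roughly 343 m/s in air (the function names are ours):

```python
SPEED_OF_SOUND = 343.0  # m/s in air at about 20 deg C (assumed)

def delay_to_range(echo_delay_s: float) -> float:
    """Target range implied by a pulse-echo delay (sound travels out and back)."""
    return SPEED_OF_SOUND * echo_delay_s / 2.0

def range_to_delay(range_m: float) -> float:
    """Pulse-echo delay produced by a target at the given range."""
    return 2.0 * range_m / SPEED_OF_SOUND

# A 9-ms pulse-echo delay corresponds to a target roughly 1.5 m away.
```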
Delay-tuned neurons respond phasically to acoustic stimuli simulating FM sonar signals, and typically fire no more than one or two action potentials for each pulse-echo stimulus presentation. In addition, a population of echo-delay-tuned neurons exhibits shifts in the best delay with the stimulus presentation rate (O'Neill and Suga, 1982; Tanaka and Wong, 1993; Wong et al., 1992). This shift in best delay may be several milliseconds, corresponding to a shift in best range of several centimeters. These two observations together, i.e., that delay-tuned neurons fire one or two action potentials per stimulus presentation at the best delay, and that the best delay may change with stimulus repetition rate, suggest that sonar strobe groups may serve to stabilize the neural representation of the distance of objects in an auditory scene. For example, the repeated production of sounds with a stable IPI for a sonar target at a given distance, e.g., 1.5 m from the bat, would consistently stimulate neurons responsive to the combination of a particular echo delay (e.g., 9 ms) and repetition rate (e.g., 60 Hz) over several successive signals, which would then increase the reliability of target distance estimation.
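The stabilizing effect of a constant repetition rate can be illustrated with a toy model of a delay-tuned neuron whose best delay drifts with stimulation rate, as described above. All parameters below (tuning width, shift per Hz, reference rate) are hypothetical placeholders, not measured values:

```python
import math

def best_delay_ms(rep_rate_hz, base_best_delay_ms=9.0,
                  shift_ms_per_hz=0.05, ref_rate_hz=60.0):
    """Hypothetical best delay that drifts linearly with stimulus repetition rate."""
    return base_best_delay_ms + shift_ms_per_hz * (rep_rate_hz - ref_rate_hz)

def response(echo_delay_ms, rep_rate_hz, tuning_width_ms=2.0):
    """Gaussian response of the model neuron around its rate-dependent best delay."""
    d = echo_delay_ms - best_delay_ms(rep_rate_hz)
    return math.exp(-((d / tuning_width_ms) ** 2))

# A target at fixed range (9-ms echoes) probed at a stable 60-Hz rate:
stable = [response(9.0, 60.0) for _ in range(5)]
# The same target probed with a jittered repetition rate:
jittered = [response(9.0, r) for r in (40.0, 55.0, 70.0, 85.0, 100.0)]
```

In this toy model, the stable-rate responses are identical across presentations, while jittered rates scatter the responses even though the target never moved; that is the sense in which a stable IPI could stabilize the neural range estimate.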
The stable IPI groups are interrupted by longer gaps, and thus the signals in each group occur in clusters, which may impact both perception and motor planning. With regard to perception, the grouping of signals would serve to increase the information carried by echoes in discrete blocks of time. It may be that the bat's sonar receiver can extract finer detail about the surroundings with the higher density of echoes that results from these groups. In this context, it is noteworthy that the bat increases its relative production of stable IPI groups shortly before target contact, and again just after capture, when a detailed assessment of the surroundings may be particularly important for planning the final attack and for reorienting after tucking the head in the tail membrane to take the prey (Fig. 10). The stable IPI groups were also recorded from actively hunting bats, but not from animals that were commuting from the roost to hunting grounds. All of this suggests to us that the stable IPI groups may be important for spatially segregating (streaming) the target (insect prey) and background. Moreover, we speculate that the spatial resolution of the auditory scene varies inversely with the interpulse intervals within stable IPI groups. Thus, shorter interpulse intervals, characteristic of the final stages of insect capture, may permit segregation of acoustic streams from closely spaced echo-producing auditory objects.
Naturally, the clustering of signals in groups, which results in greater echo density, also creates time periods when
the echo density is reduced. We speculate that the bat may
use the IPI gap time to update motor programs, using the
acoustic information carried by echoes in the IPI groups. The
median duration of the stable IPI groups was about 43 ms for
the open-space condition and 38 ms for the cluttered condition, which might approximate the time over which information is assembled for motor program updates.
It is well established that the echolocating bat's respiration cycle is correlated with its wingbeat cycle, and that the intervals between sonar vocalizations occur during inspiration. The coupling of wingbeat, respiration, and vocal production contributes to the temporal patterning of sonar signals (Schnitzler and Henson, 1980; Suthers et al., 1972; Wong and Waters, 2001), and indeed this coupling can be used, in part, to explain the sonar signal groups we report here. However, some bat species do not produce sonar signal groups, but instead show a relatively regular and continuous decrease in interpulse intervals (see, e.g., Schnitzler and Henson, 1980). The fact that some species produce stable IPI groups and others do not suggests differences in the coordination of vocal production and respiration across bat species, even among those that use FM signals to hunt insects in temperate climates (e.g., compare Eptesicus fuscus and Myotis lucifugus; Schnitzler and Henson, 1980). Such differences in the temporal patterning of vocal production naturally result in differences in the echo patterning received by the sonar systems of different species, since the vocal production pattern directly impacts the information available to the sonar receiver. We do not know why differences exist in the temporal patterning of sonar signals across species; however, we propose that the stable IPI groups produced by Eptesicus fuscus play into the analysis of the auditory scene in this species.
In our laboratory studies of insect capture behavior, a clear distinction in vocal production patterns across task conditions appeared in the duration of the terminal buzz. When the bats captured targets in the presence of clutter, they shortened the terminal buzz, as compared with the buzz produced under open, uncluttered space conditions (F=6.22, p<0.01). This result is not surprising when one considers that the bat typically waits to produce its next signal until it has received and processed echoes of interest. In other words, the bat avoids producing signals before all the relevant echoes have returned. In the terminal buzz the interval between signals is so short that there often is not time for all relevant echoes (from target and clutter) to return before the subsequent sound is produced. Thus, in the terminal buzz, the bat likely receives clutter echoes from one sonar signal after producing the next signal, and by shortening the terminal buzz under clutter conditions the bat reduces the time during which it experiences this mixing of echoes from different vocalizations.
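The overlap argument can be made concrete: a reflector's echo returns before the next emission only if its range is less than half the distance sound travels in one interpulse interval. A sketch, assuming a speed of sound of about 343 m/s; the 6-ms buzz interval used in the comment is an illustrative value, not a measurement from these trials:

```python
SPEED_OF_SOUND = 343.0  # m/s in air (assumed)

def max_unambiguous_range(interpulse_interval_s: float) -> float:
    """Farthest reflector whose echo returns before the next emission is produced."""
    return SPEED_OF_SOUND * interpulse_interval_s / 2.0

# With buzz intervals on the order of 6 ms, only echoes from within about 1 m
# return before the following sound; clutter farther away than this mixes
# across successive emissions.
```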
The onset of the buzz differed between stationary (open-space and clutter conditions) and moving target (open-space) trials. When the bat pursued moving prey, it typically followed the insect from behind, in the direction of the target's motion. Thus, the insect was moving away from the bat during the final stage of capture, and the bat's approach direction presumably resulted in the extended buzz duration in the moving target trials. The buzz onset times relative to target capture were very similar in the open-space stationary and the three different clutter conditions. The consistency of buzz onset time in clutter relative to stationary prey capture suggests that it reflects a motor planning process for target interception, rather than serving as an indicator of target detection or localization. Measures of the distance between the bat and target when it initiated the buzz show that the bat was closer to the target when it entered the final capture sequence in clutter than in open space. Moreover, the closer the target was positioned relative to the clutter, the shorter was the bat's distance to the prey when it began the terminal buzz. This is consistent with the observation that buzzes recorded in the field are shorter than those in the lab (Surlykke and Moss, 2000), since natural prey is unpredictable and motor planning must be delayed until the bat is closer to the point of capture.
Given the potential for mixing signals and echoes in the terminal buzz, one might then ask why the bat produces signals at such a high repetition rate. Furthermore, the bat has little time to react to any new information carried by echoes from the terminal buzz signals. However, the terminal buzz does allow the bat to sample information at a very high and stable repetition rate, providing details about the auditory scene that may be enhanced and sharpened, particularly if the information is assembled over time. This is conveyed in Fig. 2(B) by the clear ridges representing changing target distances during the buzz phase of insect pursuit. While the bat has little time to react to information about the target in the terminal buzz, it may use echoes from the terminal buzz to represent spatial information about the background that it still must negotiate following insect capture. The signal frequencies in the terminal buzz drop well below 20 kHz (Surlykke and Moss, 2000), and this drop in sound frequency serves to reduce the directionality of the bat's sonar signals. Using lower-frequency, more omnidirectional signals may thus help the bat in sampling background information that could ultimately influence its flight path following target capture. Thus, if one thinks of the buzz as a sampling unit, rather than as many closely spaced, discrete elements, one can begin to appreciate the rich information it may carry for the analysis of the auditory scene.
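The link between emission frequency and directionality can be illustrated with a simple circular-piston model of the sound source. Both the piston model and the 5-mm aperture radius below are our illustrative assumptions, not measurements of a bat's mouth; the -3 dB point of a piston beam falls near k*a*sin(theta) of about 1.62:

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air (assumed)

def half_power_beamwidth_deg(freq_hz: float,
                             aperture_radius_m: float = 0.005) -> float:
    """Approximate -3 dB half-beamwidth of a circular-piston source.

    Returns 90 degrees (effectively omnidirectional) when the aperture is
    small relative to the wavelength. Illustrative model only.
    """
    ka = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND * aperture_radius_m
    if ka <= 1.62:
        return 90.0
    return math.degrees(math.asin(1.62 / ka))

# For a fixed aperture, dropping the emission frequency from 50 kHz toward
# 20 kHz roughly triples the beamwidth in this model, consistent with the
# idea that low-frequency buzz signals sample a wider swath of background.
```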
Field data from several different species highlight the perceptual requirements and vocal-motor strategies used by echolocating bats to analyze the auditory scene. In particular, we have presented sound recordings from Noctilio that illustrate the complicated acoustic environment the bat encounters, raising questions about how the bat sorts a cacophony of signals and echoes in an environment where many animals are hunting together. While hunting in less crowded conditions, E. fuscus and P. pipistrellus sometimes encounter conspecifics in close proximity, and we presented examples here of adjustments in the frequency of sonar emissions produced by these two species in response to the signals of neighbors. Such adjustments in signal spectrum and sweep rate may allow an individual bat to separate echoes of its own signals from echoes of signals produced by conspecifics in close proximity. Finally, we presented the signals produced by a bat that samples the environment with multiharmonic signals with changing fundamental frequencies (C. brevirostris, dubbed the "do-re-mi bat"). We have no data that speak directly to this species' perception of the world by sonar, but we note that the vocal production patterns of C. brevirostris resemble the frequency hopping used in wireless communication systems. Frequency hopping minimizes interference from other sources by carving the transmission band into separate frequency channels that are active for short intervals (about 100 milliseconds) before shifting to a neighboring frequency. This allows more wireless access points to operate in the same area, and we speculate that C. brevirostris uses a similar strategy to minimize interference from neighboring echolocating bats.
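The frequency-hopping analogy can be sketched in a highly idealized form: transmitters cycling through a shared channel set with different phase offsets never occupy the same channel in the same time slot. The scheduling scheme below is our illustration, not an observed behavior of C. brevirostris; only the three channel frequencies are taken from Fig. 15:

```python
# The three "do-re-mi" frequencies, from Fig. 15 (kHz).
CHANNELS_KHZ = [26.0, 29.0, 33.0]

def hop(slot: int, phase: int) -> float:
    """Channel used in a given time slot by a transmitter with a fixed phase offset."""
    return CHANNELS_KHZ[(slot + phase) % len(CHANNELS_KHZ)]

# Two transmitters cycling the same channels with different phase offsets
# are never on the same frequency simultaneously, whereas two transmitters
# parked on a single fixed frequency would interfere in every slot.
```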
In conclusion, we have introduced here a conceptual framework for the study of auditory scene analysis by echolocation in bats. We assert that the principles of auditory scene analysis, which have been identified in humans (Bregman, 1990), can be applied broadly to understand the perceptual organization of sound in other animals, including the echolocating bat. In particular, we postulate that the bat's perceptual system organizes acoustic information from a complex and dynamic environment into echo streams, allowing it to track spatially distributed auditory objects (sonar targets) as it flies. Our hypothesis emphasizes the importance of the echolocating bat's vocal production patterns to its analysis of the auditory scene, as motor behaviors and perception operate in concert in the bat's active sensing system. It is our hope that the conceptual framework presented here will stimulate new research avenues for unraveling the details of auditory scene analysis in animal systems.
ACKNOWLEDGMENTS
We gratefully acknowledge the following agencies,
foundations, and institutions that have supported this work:
NIMH (No. R01-MH56366), NSF (No. IBN-9258255), Whitehall Foundation (No. S97-20) and the Institute for Advanced Study, Berlin (to C.F.M.), and the Danish National Research Foundation and the Danish Natural Science Research Council (to A.S.). In addition, we thank the following individuals
for their valued assistance: Kari Bohn for data collection,
analysis, and figure preparation, Hannah Gilkenson, Aaron
Schurger, and Willard Wilson for data collection, analysis,
and processing of the laboratory insect capture experiments,
Maryellen Morris for running the psychophysical experiments, and Amy Kryjak for data analysis and overseeing the
complicated management of the many projects reported here.
Carola Eschenbach and two anonymous reviewers provided
critical comments on an earlier version of this manuscript.
Discussions with Hans-Ulrich Schnitzler contributed to ideas
presented here. We also thank Elisabeth Kalko and Marianne
Jensen for their valued assistance in the field.
Barratt, E. M., Deaville, R., Burland, T. M., Bruford, M. W., Jones, G., Racey, P. A., and Wayne, R. K. (1997). "DNA answers the call of pipistrelle bat species," Nature (London) 387, 138-139.
Batteau, D. W. (1967). "The role of the pinna in human localization," Proc. R. Soc. London 1011, 158-180.
Blauert, J. (1997). Spatial Hearing: The Psychophysics of Human Sound Localization (MIT Press, Cambridge).
Braaten, R. F., and Hulse, S. H. (1993). "Perceptual organization of temporal patterns in European starlings (Sturnus vulgaris)," Percept. Psychophys. 54, 567-578.
Bregman, A. S. (1990). Auditory Scene Analysis (MIT Press, Cambridge, MA).
Bregman, A. S., and Campbell, J. (1971). "Primary auditory stream segregation and perception of order in rapid sequences of tones," J. Exp. Psychol. 89, 244-249.
Busnel, R. G., and Fish, J. F. (1980). Animal Sonar Systems (Plenum, New York).
Cahlander, D. A., McCue, J. J. G., and Webster, F. A. (1964). "The determination of distance by echolocation," Nature (London) 201, 544-546.
Dear, S. P., Simmons, J. A., and Fritz, J. (1993a). "A possible neuronal basis for representation of acoustic scenes in auditory cortex of the big brown bat," Nature (London) 364, 620-622.
Dear, S. P., Fritz, J., Haresign, T., Ferragamo, M., and Simmons, J. A. (1993b). "Tonotopic and functional organization in the auditory cortex of the big brown bat, Eptesicus fuscus," J. Neurophysiol. 70, 1988-2009.
Dear, S. P., and Suga, N. (1995). "Delay-tuned neurons in the midbrain of the big brown bat," J. Neurophysiol. 73, 1084-1100.
Fay, R. R. (1998). "Auditory stream segregation in goldfish (Carassius auratus)," Hear. Res. 120, 69-76.
Fay, R. R. (2000). "Spectral contrasts underlying auditory stream segregation in goldfish (Carassius auratus)," J. Assoc. Res. Otolaryngol., online publication.
Gellerman, L. W. (1933). "Chance orders of alternating stimuli in visual discrimination experiments," J. Gen. Psychol. 42, 206-208.
Griffin, D. R. (1953). "Bat sounds under natural conditions, with evidence for echolocation of insect prey," J. Exp. Zool. 123, 435-465.
Griffin, D. R. (1958). Listening in the Dark (Yale University Press, New Haven).
Griffin, D. R., Webster, F. A., and Michael, C. R. (1960). "The echolocation of flying insects by bats," Anim. Behav. 8, 141-154.
Grinnell, A. D., and Grinnell, V. S. (1965). "Neural correlates of vertical localization by echolocating bats," J. Physiol. (London) 181, 830-851.
Hartley, D. J. (1992). "Stabilization of perceived echo amplitudes in echolocating bats. II. The acoustic behavior of the big brown bat, Eptesicus fuscus, when tracking moving prey," J. Acoust. Soc. Am. 91, 1133-1149.
Hartley, D. J., and Suthers, R. A. (1989). "The sound emission pattern of the echolocating bat, Eptesicus fuscus," J. Acoust. Soc. Am. 85, 1348-1351.
Hartridge, H. (1945). "Acoustic control in the flight of bats," Nature (London) 156, 490-494.
Hoffman, D. D. (1998). Visual Intelligence: How We Create What We See (Norton, New York).
Hope, G. M., and Bhatnagar, K. P. (1979). "Electrical response of bat retina to spectral stimulation: Comparison of four microchiropteran species," Experientia 35, 1189-1191.
Hulse, S. H., MacDougall-Shackleton, S. A., and Wisniewski, A. B. (1997). "Auditory scene analysis by songbirds: Stream segregation of birdsong by European starlings (Sturnus vulgaris)," J. Comp. Psychol. 111(1), 3-13.
Julesz, B., and Hirsh, I. J. (1972). "Visual and auditory perception—An essay of comparison," in Human Communication: A Unified View, edited by E. E. David, Jr. and P. B. Denes (McGraw-Hill, New York).
Kalko, E. K. V., Schnitzler, H.-U., Kaipf, I., and Grinnell, A. D. (1988). "Echolocation and foraging behavior of the lesser bulldog bat, Noctilio albiventris: Preadaptation for piscivory?" Behav. Ecol. Sociobiol. 42, 305-319.
Kawasaki, M., Margoliash, D., and Suga, N. (1988). "Delay-tuned combination-sensitive neurons in the auditory cortex of the vocalizing mustached bat," J. Neurophysiol. 59, 623-635.
Körte, A. (1915). "Kinematoskopische Untersuchungen," Zeitschrift für Psychologie der Sinnesorgane 72, 193-296.
Lawrence, B. D., and Simmons, J. A. (1982). "Echolocation in bats: The external ear and perception of the vertical positions of targets," Science 218, 481-483.
MacDougall-Shackleton, S. A., Hulse, S. H., Gentner, T. Q., and White, W. (1998). "Auditory scene analysis by European starlings (Sturnus vulgaris): Perceptual segregation of tone sequences," J. Acoust. Soc. Am. 103, 3581-3587.
Mittmann, D. H., and Wenstrup, J. J. (1995). "Combination-sensitive neurons in the inferior colliculus," Hear. Res. 90(1-2), 185-191.
Morris, M. R., and Moss, C. F. (1995). "Perception of complex stimuli in the echolocating FM bat, Eptesicus fuscus," Tenth International Bat Research Conference, Boston, MA.
Morris, M., and Moss, C. F. (1996). Perceptual Organization of Sound for Spatially Guided Behavior in the Echolocating Bat (Society for Neuroscience Abstracts, Washington, D.C.), Vol. 22, p. 404.
Moss, C. F., and Schnitzler, H.-U. (1989). "Accuracy of target ranging in echolocating bats: Acoustic information processing," J. Comp. Physiol., A 165, 383-393.
Moss, C. F., and Schnitzler, H.-U. (1995). "Behavioral studies of auditory information processing," in Springer Handbook of Auditory Research: Hearing by Bats, edited by A. N. Popper and R. R. Fay (Springer, Berlin), pp. 87-145.
O'Neill, W. E., and Suga, N. (1982). "Encoding of target range and its representation in the auditory cortex of the mustached bat," J. Neurosci. 2(1), 17-31.
Olson, C. R., and Musil, S. Y. (1992). "Topographic organization of cortical and subcortical projections to posterior cingulate cortex in the cat—Evidence for somatic, ocular, and complex subregions," J. Comp. Neurol. 324(2), 237-260.
Roverud, R. C. (1993). "Neural computations for sound pattern recognition: Evidence for summation of an array of frequency filters in an echolocating bat," J. Neurosci. 13(6), 2306-2312.
Roverud, R. C. (1994). "Complex sound analysis in the lesser bulldog bat: Evidence for a mechanism for processing frequency elements of frequency modulated signals over restricted time intervals," J. Comp. Physiol., A 174, 559-565.
Roverud, R. C., and Rabitoy, E. R. (1994). "Complex sound analysis in the FM bat Eptesicus fuscus, correlated with structural parameters of frequency modulated signals," J. Comp. Physiol., A 174, 567-573.
Schnitzler, H.-U. (1968). "Die Ultraschall-Ortungslaute der Hufeisen-Fledermäuse (Chiroptera-Rhinolophidae) in verschiedenen Orientierungssituationen," Z. vergl. Physiol. 57, 376-408.
Schnitzler, H.-U., and Henson, O. W., Jr. (1980). "Performance of airborne animal sonar systems. I. Microchiroptera," in Animal Sonar Systems, edited by R. G. Busnel and J. F. Fish (Plenum, New York), pp. 109-181.
Schnitzler, H.-U., and Kalko, E. K. V. (1998). "How echolocating bats search and find food," in Bats: Phylogeny, Morphology, Echolocation and Conservation Biology, edited by T. H. Kunz and P. A. Racey (Smithsonian Inst. Press, Washington, D.C.), pp. 183-196.
Schnitzler, H.-U., Menne, D., Kober, R., and Heblich, K. (1983). "The acoustical image of fluttering insects in echolocating bats," in Neuroethology and Behavioral Physiology, edited by F. Huber and H. Markl (Springer, Heidelberg), pp. 235-249.
Schnitzler, H.-U., Kalko, E. K. V., Kaipf, I., and Grinnell, A. D. (1994). "Fishing and echolocation behavior of the greater bulldog bat, Noctilio leporinus, in the field," Behav. Ecol. Sociobiol. 35, 327-345.
Shimozawa, T., Suga, N., Hendler, P., and Schuetze, S. (1974). "Directional sensitivity of echolocation system in bats producing frequency-modulated signals," J. Exp. Biol. 60, 53-69.
Simmons, J. A. (1973). "The resolution of target range by echolocating bats," J. Acoust. Soc. Am. 54, 157-173.
Simmons, J. A. (1979). "Perception of echo phase information in bat sonar," Science 204, 1336-1338.
Simmons, J. A., Ferragamo, M. J., Saillant, P. A., Haresign, T., Wotton, J. M., Dear, S. P., and Lee, D. N. (1995). "Auditory dimensions of acoustic images in echolocation," in Springer Handbook of Auditory Research: Hearing by Bats, edited by A. N. Popper and R. R. Fay (Springer, Berlin), pp. 146-190.
Simmons, J. A., Lavender, W. A., Lavender, B. A., Doroshow, D. A., Kiefer, S. W., Livingston, R., Scallet, A. C., and Crowley, D. E. (1974). "Target structure and echo spectral discrimination by echolocating bats," Science 186, 1130-1132.
Simmons, J. A., Kick, S. A., Lawrence, B. D., Hale, C., Bard, C., and Escudie, B. (1983). "Acuity of horizontal angle discrimination by the echolocating bat, Eptesicus fuscus," J. Comp. Physiol., A 153, 321-330.
Simmons, J. A., and Vernon, J. A. (1971). "Echolocation: Discrimination of targets by the bat, Eptesicus fuscus," J. Exp. Zool. 176, 315-328.
Suga, N., and O'Neill, W. E. (1979). "Neural axis representing target range in the auditory cortex of the mustache bat," Science 206, 351-353.
Sullivan, W. E. (1982). "Neural representation of target distance in auditory cortex of the echolocating bat, Myotis lucifugus," J. Neurophysiol. 48, 1011-1032.
Surlykke, A., and Moss, C. F. (2000). "Echolocation behavior of the big brown bat, Eptesicus fuscus, in the field and the laboratory," J. Acoust. Soc. Am. 108, 2419-2429.
Suthers, R. A., Thomas, S. P., and Suthers, B. J. (1972). "Respiration, wingbeat and ultrasonic pulse emission in an echo-locating bat," J. Exp. Biol. 56, 37-48.
Tanaka, H., and Wong, D. (1993). "The influence of temporal pattern of stimulation on delay tuning of neurons in the auditory cortex of the FM bat, Myotis lucifugus," Hear. Res. 66, 58-66.
Valentine, D. E., and Moss, C. F. (1997). "Spatially selective auditory responses in the superior colliculus of the echolocating bat," J. Neurosci. 17, 1720-1753.
Wadsworth, J., and Moss, C. F. (2000). "Vocal control of acoustic information for sonar discriminations by the echolocating bat, Eptesicus fuscus," J. Acoust. Soc. Am. 107, 2265-2271.
Webster, F. A. (1963). "Active energy radiating systems: The bat and ultrasonic principles in acoustical control of airborne interceptions by bats," Proc. International Congress on Technology and Blindness, Vol. I.
Wisniewski, A. B., and Hulse, S. H. (1997). "Auditory scene analysis in European starlings (Sturnus vulgaris): Discrimination of song segments, their segregation from multiple and reversed conspecific songs, and evidence for conspecific song categorization," J. Comp. Psychol. 111, 337-350.
Wong, D., and Shannon, S. L. (1988). "Functional zones in the auditory cortex of the echolocating bat, Myotis lucifugus," Brain Res. 453, 349-352.
Wong, D., Maekawa, M., and Tanaka, H. (1992). "The effect of pulse repetition rate on the delay sensitivity of neurons in the auditory cortex of the FM bat, Myotis lucifugus," J. Comp. Physiol. 170, 393-402.
Wong, J. G., and Waters, D. A. (2001). "The synchronization of signal emission with wingbeat during the approach phase in soprano pipistrelles (Pipistrellus pygmaeus)," J. Exp. Biol. 204, 575-583.
Wotton, J. M., Haresign, T., and Simmons, J. A. (1995). "Spectral and temporal cues produced by the external ear of Eptesicus fuscus with respect to their relevance to sound localization," J. Acoust. Soc. Am. 98, 1423-1445.