The Identification of Auditory Objects - CSIS

Lecture 7:
Auditory Scene Analysis: I
Slide 1
Recap
Timbre
Spectral Components
Critical bandwidth
Envelope – Attack, Steady State and Release
Judgements of Timbre – Grey
Shepard Tones – Never ending ascent/descent
Sound Localisation
Interaural Time Difference
Interaural Intensity Difference
Haas or Precedence Effect
CS5611 - Psychoacoustics - Niall Griffith – Semester 1
Computer Science and Information Systems, University of Limerick
Slide 2
Auditory Scene Analysis: I
The Identification of Auditory Objects
The Perception of Temporal Patterns
Gestalt Psychology
Gestalt Principles
Proximity
Similarity
Common fate
Good Continuation
Disjoint Allocation (Belongingness)
Closure
Slide 3
The Identification of Auditory Objects
Thus far we have looked at the processes involved in the
perception of pitch, loudness and timbre. These are all cues to the
identity of things and processes.
Pitch, loudness and timbre are fundamental – but the auditory world
that we live in is highly complex and involves multiple sounds
originating from multiple sources occurring at one instant, and also
over time.
Our hearing has to be able to separate out these sounds and say this
is a car, that is a leaf rustling in the gutter, or a violin, or a piano,
etc.
Research indicates that the following cues are used to identify and
distinguish the sounds associated with events:
• Fundamental Frequency.
• Onset Timing: how synchronised sound sources are.
• Contrast with Prior sounds (i.e. changes in pitch, loudness and
timbre).
• Correlated Modulation or Change, e.g. in Amplitude or
Frequency. Many sounds have signatures that are reflected in
synchronised patterns of AM and FM.
• Location: sounds coming from two locations are unlikely to be
from one source, while sounds from one location are probably
from one source.
Slide 4
The Identification of Auditory Objects
This richness is useful.
Along any one dimension we only have a limited capacity to
categorise: roughly 5 to 6 categories.
If we want to have more categories we need to use more dimensions.
Fundamental Frequency
If two sounds share their fundamental frequency then they fuse into a
single percept.
Experiments with synthetic vowels showed that if sets of harmonics
are split into two groups that don’t share a fundamental then we hear
more than one pitch.
If one harmonic in a series is mistuned enough then it will stand out
and be heard as a separate pure tone.
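As a rough illustration of the mistuned-harmonic stimulus described above, the following Python sketch builds an equal-amplitude harmonic complex and shifts one harmonic away from its exact multiple of the fundamental. The function name, the 200 Hz fundamental, the choice of the 4th harmonic and the 8% mistuning are all assumptions made for this example; the slide only says the harmonic must be mistuned "enough".

```python
import math

def harmonic_complex(f0, n_harmonics, mistuned=None, mistune_pct=0.0,
                     sr=16000, dur=0.5):
    """Return (samples, component_frequencies) for an equal-amplitude complex.

    mistuned    -- harmonic number to shift (None leaves the series exact)
    mistune_pct -- shift applied to that harmonic, in percent (illustrative)
    """
    freqs = []
    for k in range(1, n_harmonics + 1):
        f = k * f0
        if k == mistuned:
            f *= 1.0 + mistune_pct / 100.0
        freqs.append(f)
    n = int(round(sr * dur))
    # Sum the sinusoids and normalise so the result stays within [-1, 1].
    samples = [sum(math.sin(2.0 * math.pi * f * i / sr) for f in freqs) / n_harmonics
               for i in range(n)]
    return samples, freqs

# Assumed example values: 200 Hz fundamental, 10 harmonics, 4th harmonic raised by 8%.
samples, freqs = harmonic_complex(200.0, 10, mistuned=4, mistune_pct=8.0)
```

Setting mistuned=None gives the fully harmonic reference tone, so the two versions can be compared by ear.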
Slide 5
The Identification of Auditory Objects
Onset Timing
Rasch (1978) played two tones: a masker Fm and a test tone Ft.
Ft was just higher than Fm, i.e. it was at the JND level. Rasch
observed three things when the tones were played more or less
synchronously:
• When played synchronously, if Ft is 20 dB (or more) less intense
than Fm it is not heard.
• If Ft is played slightly (say 30 ms) before Fm, then Ft can be up
to 60 dB less intense than the masker before it becomes inaudible.
• If Ft is started before the masker and then stopped, it seems to
continue.
Thus, sounds that are difficult to separate when played at exactly the
same time become separable if there is a slight asynchrony i.e. time
difference, between them.
Also, given proximal or overlapping stimuli the auditory system can be
fooled into thinking that something is occurring when it isn’t.
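To make the numbers above concrete, here is a minimal Python sketch that converts a level difference in dB to an amplitude ratio and mixes a test tone so that it starts 30 ms before a masker. The helper names and the tone frequencies (1000 Hz and 1010 Hz) are assumptions chosen for illustration; they are not taken from Rasch's experiment.

```python
import math

def db_to_amplitude_ratio(db):
    """Amplitude ratio for a level difference in dB (20 dB per factor of 10)."""
    return 10.0 ** (db / 20.0)

def tone(freq_hz, dur_s, sr, amp=1.0):
    """Plain sinusoid as a list of samples."""
    n = int(round(dur_s * sr))
    return [amp * math.sin(2.0 * math.pi * freq_hz * i / sr) for i in range(n)]

def mix_with_lead(test, masker, lead_s, sr):
    """Mix two tones so the test tone starts lead_s seconds before the masker."""
    lead = int(round(lead_s * sr))
    out = [0.0] * max(len(test), lead + len(masker))
    for i, x in enumerate(test):
        out[i] += x
    for i, x in enumerate(masker):
        out[lead + i] += x
    return out

SR = 16000
masker = tone(1000.0, 0.2, SR)                                  # Fm (assumed frequency)
test = tone(1010.0, 0.2, SR, amp=db_to_amplitude_ratio(-60.0))  # Ft, 60 dB below Fm
mixed = mix_with_lead(test, masker, 0.030, SR)                  # Ft leads by 30 ms
```

A 20 dB difference corresponds to an amplitude ratio of 0.1 and a 60 dB difference to 0.001, which shows how much weaker the leading test tone can be while remaining audible.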
Slide 6
The influence of Onset Timing on Auditory
Objects
Rasch also observed that if Fm and Ft are played synchronously, or up
to 30 ms asynchronously, but the rise-time (onset of energy) in Ft is
faster than for Fm, they are heard as separate but simultaneous.
Simultaneity in hearing is approx. < 15 ms, and pitch fusion is
approx. 60 ms, i.e. we don’t have to be consciously aware of the onset
differences to be able to use them to distinguish tones.
Example: ASA, Demonstration 21
In the example 4 tones have slightly different onset times and also
vary their rise time. This affects the clarity with which their order can
be heard. The order is either
M L H M or M H L M
This kind of effect is important in ensemble playing where the onset of
instruments is never completely synchronised.
Slide 7
The influence of Contrast with Prior
sounds on Auditory Objects
The auditory system uses change in a signal to identify and
distinguish sounds.
Steady state → leads to adaptation
Change → differences between stimuli stand out
Or, when change occurs adaptation is removed.
Experiments with spectral sounds and noise (Zwicker 1964)
Noise by itself is colourless
If noise is played after a spectral tone then
it sounds coloured – coloration is the
inverse of the spectral sound.
If a notched noise is played and then a
noise with a flat spectrum distribution it
sounds as if it has a spectral peak at the
same place as the notch.
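The notched-noise sequence above can be sketched in Python as follows. This is only an approximation made for illustration: the "noise" is built as a sum of random-phase sinusoids rather than filtered white noise, and the band edges, component spacing and durations are assumed values, not those used by Zwicker.

```python
import math
import random

def component_freqs(lo_hz, hi_hz, step_hz, notch=None):
    """Frequencies of the sinusoids in the noise; notch=(lo, hi) removes a band."""
    return [f for f in range(lo_hz, hi_hz + 1, step_hz)
            if notch is None or not (notch[0] <= f <= notch[1])]

def synthesise(freqs, sr=16000, dur=0.1, seed=0):
    """Sum equal-amplitude sinusoids with random starting phases."""
    rng = random.Random(seed)
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in freqs]
    n = int(round(sr * dur))
    return [sum(math.sin(2.0 * math.pi * f * i / sr + p)
                for f, p in zip(freqs, phases)) / len(freqs)
            for i in range(n)]

flat_freqs = component_freqs(100, 4000, 50)                          # flat spectrum
notched_freqs = component_freqs(100, 4000, 50, notch=(1000, 1500))   # assumed notch
adaptor = synthesise(notched_freqs)   # played first
probe = synthesise(flat_freqs)        # flat noise that follows
stimulus = adaptor + probe
```

In a listening test the adaptor would need to last much longer than the 0.1 s used here; the short duration just keeps the sketch quick to run.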
Experiments with complex tones
Generally we hear synthetically: we are unaware of the individual components.
If one of the harmonics is changed (it can be made higher or lower in
energy) it stands out; after a while our awareness of it fades.
Changes in sound are important for noticing the arrival of a new
source – e.g. you are lying on the grass listening to the wind when a
car drives up.
Slide 8
The influence of Coherent Changes in
Amplitude or Frequency on Auditory
Objects
Rasch (1975) showed that synchronous tones could be separated by
• Modulating their frequency
• Modulating their amplitude
FM reduces level loss of F0 against masker by 17 dB.
This also affects the fusion of sounds (McAdams 1982, 1989)
If a subset of the harmonics of a complex tone constructed from
random harmonics varies in AM or FM, these harmonics stand out like a
figure against a ground.
This is the Gestalt Principle of Common fate.
Also this affects whether we can hear complex tones analytically or
synthetically.
• Synchronised onset → synthetic
• Asynchronous onset → analytical
Slide 9
The influence of Location on Auditory
Objects
Our hearing assumes that sounds coming from different places are
probably coming from different sources
This also affects masking effects: the Masking Level Difference (MLD).
If sounds are in phase then masking is greater than if they are out of
phase
The auditory system seems to assume that if they are out of phase
they are from different sources because masking effects are
suppressed (Kubovy 1974).
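The in-phase and out-of-phase conditions can be illustrated with a small Python sketch. It presents the same noise to both ears and adds a tone that is either identical at the two ears or inverted in one of them. The function name, the 500 Hz signal and the amplitudes are assumptions for the example. The last line shows one simplified way of seeing why the antiphase case is easier: subtracting the ears cancels the shared noise and leaves the signal. This is an illustration only, not a claim about what the auditory system actually computes.

```python
import math
import random

def binaural_stimulus(signal_inverted_in_right, freq_hz=500.0, sr=16000,
                      dur=0.5, sig_amp=0.1, seed=1):
    """Return (left, right): identical noise in both ears plus a tone.

    If signal_inverted_in_right is True the tone is 180 degrees out of
    phase between the ears; otherwise it is identical at both ears.
    """
    rng = random.Random(seed)
    left, right = [], []
    for i in range(int(round(sr * dur))):
        noise = rng.uniform(-1.0, 1.0)
        s = sig_amp * math.sin(2.0 * math.pi * freq_hz * i / sr)
        left.append(noise + s)
        right.append(noise - s if signal_inverted_in_right else noise + s)
    return left, right

in_phase = binaural_stimulus(False)    # tone in phase at the two ears
out_of_phase = binaural_stimulus(True) # tone out of phase at the two ears

# Half the interaural difference: the shared noise cancels, the tone remains.
recovered = [(l - r) / 2.0 for l, r in zip(*out_of_phase)]
```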
The cues outlined above are all capable of distinguishing sounds. In
the world, patterns of change in these parameters are regular, i.e. consistent.
The auditory system has adapted to extract and organise these
regularities – this process is called Auditory Scene Analysis.
ASA is thought to follow Gestalt Principles of Organisation
Slide 10
Gestalt Psychology
Gestalt means pattern.
Gestalt Psychology originated in the 1920s (Köhler, Koffka &
Wertheimer)
The basic principles underlying Gestalt psychology are
The whole is greater than the sum of the parts.
The parts are defined by the whole as much as vice versa
Gestalt psychologists are best known for their work in vision – but
their principles are also applicable to auditory perception.
They argued that the brain’s ability to organise is innate to it and
originates in patterns of electrical activity in the brain (never proven).
They systematically developed a set of principles of perceptual
organisation that they thought determine how we assemble or
associate components in a perceptual field.
These principles are:
Bottom Up
1. Proximity
2. Similarity
3. Common Fate (Common Direction)
4. Good Continuation
Top Down
5. Disjoint Allocation (Belongingness)
6. Closure
Slide 11
Gestalt Principles
Two sets of Principles
Bottom Up (BU): Hard wired, Pre-attentive, Not Learned
Top Down (TD): Plastic, Schematic, Learned
It can be difficult to know exactly which effects are operating
Gestalt principles are rules of thumb that seem to have been
incorporated into the auditory system over evolution because they
give approximately right answers most of the time.
Bregman (1978) and Bregman and Pinker (1978) distinguished
between two concepts in auditory processing.
• Source: a physical entity producing sound
• Stream: a coherent set of ordered or simultaneous events that
indicates a source
In this view hearing is a process of auditory parsing
Simultaneous event = single source
Connected sequences indicate what is happening over time
Slide 12
Gestalt Principles of Perceptual
Organisation
Some Visual Analogies
Slide 13
Gestalt Principle of Proximity
In vision when elements in an image are close together they are
perceived to be together and separate from others that are further
away, even though they are similar, e.g.
In hearing, sounds occurring close together in time are clustered, e.g.
Slide 14
Gestalt Principle of Similarity
Two or more auditory events are grouped if they are similar in timbre,
pitch, loudness, or close in apparent location or time
Pure tones: closer in pitch → grouped
Van Noorden (1975): sequences of Complex Tones, but with
missing fundamental
Fundamentals in the same region but harmonics not: leads to fission
i.e. Different timbres but same pitch = unfused
Harmonics in the same region but fundamentals not: leads to fusion
i.e. Different pitches same timbre = fused
This is not clear-cut; it depends on individual differences.
Loud and Soft tones (Van Noorden)
If the difference in loudness is large enough they form different
streams, either of which can be attended to
Same dB → single stream at twice the tempo
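The tempo relationship in the last point can be sketched with a short Python example. It uses frequency separation as the similarity cue, although the same bookkeeping applies to loudness. All names and values (400 Hz low tone, 12 semitone separation, 100 ms onset interval, 600 Hz boundary) are assumptions for illustration. The code only lays out the event timings; whether a listener actually hears one stream or two depends on the separation and tempo.

```python
def semitones_up(freq_hz, semitones):
    """Frequency a given number of equal-tempered semitones above freq_hz."""
    return freq_hz * 2.0 ** (semitones / 12.0)

def alternating_sequence(low_hz, separation_semitones, onset_interval_s, n_tones):
    """List of (onset_time_s, frequency_hz) for a low/high alternating sequence."""
    high_hz = semitones_up(low_hz, separation_semitones)
    return [(i * onset_interval_s, low_hz if i % 2 == 0 else high_hz)
            for i in range(n_tones)]

def split_by_frequency(events, boundary_hz):
    """Partition events into a low and a high stream around boundary_hz."""
    low = [e for e in events if e[1] < boundary_hz]
    high = [e for e in events if e[1] >= boundary_hz]
    return low, high

events = alternating_sequence(400.0, 12, 0.1, 8)
low_stream, high_stream = split_by_frequency(events, 600.0)

# Heard as one stream, onsets are 0.1 s apart; within each separate
# stream they are 0.2 s apart, i.e. each stream runs at half the tempo.
single_interval = events[1][0] - events[0][0]
stream_interval = low_stream[1][0] - low_stream[0][0]
```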
Slide 15
Gestalt Principle of Common Fate.
Components in sound act together
They tend to start and finish together
They tend to change in pitch or intensity together
Therefore if we have a complex sound and the components are coordinated then they are fused
e.g. onset disparities, and AM and FM (tremolo & vibrato)
For example if the frequencies of harmonics 2, 4 and 8 are modulated (FM)
they separate from harmonics 3, 5, 6 and 7
Example: ASA, Demonstration 19, FM
Or if the frequency of the 1st harmonic is modulated (FM) at a
different rate it separates from harmonics 3, 4 and 5
Example: ASA, Demonstration 20, FM
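A stimulus of the first kind described above (harmonics 2, 4 and 8 sharing a frequency modulation while 3, 5, 6 and 7 stay steady) might be synthesised along the following lines. This is a sketch under assumed values: the 220 Hz fundamental, the 2% modulation depth and the 6 Hz vibrato rate are illustrative and are not the settings used in the ASA demonstrations.

```python
import math

def fm_partial_phase(freq_hz, t, depth, rate_hz):
    """Phase (radians) of a partial whose instantaneous frequency is
    freq_hz * (1 + depth * sin(2*pi*rate_hz*t)), starting from phase 0."""
    return (2.0 * math.pi * freq_hz * t
            + (freq_hz * depth / rate_hz) * (1.0 - math.cos(2.0 * math.pi * rate_hz * t)))

def common_fate_complex(f0=220.0, modulated=(2, 4, 8), steady=(3, 5, 6, 7),
                        depth=0.02, rate_hz=6.0, sr=16000, dur=0.5):
    """Harmonic complex in which one subset of harmonics shares the same FM."""
    count = len(modulated) + len(steady)
    out = []
    for i in range(int(round(sr * dur))):
        t = i / sr
        # Modulated harmonics all follow the same proportional frequency change.
        s = sum(math.sin(fm_partial_phase(k * f0, t, depth, rate_hz)) for k in modulated)
        # Steady harmonics keep a fixed frequency.
        s += sum(math.sin(2.0 * math.pi * k * f0 * t) for k in steady)
        out.append(s / count)
    return out

stimulus = common_fate_complex()
```

Because the modulation is proportional to each partial's frequency, the modulated harmonics keep their harmonic relationship to one another while moving together.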
Slide 16
Gestalt Principle of Good Continuation.
Natural sound sources tend to change gradually rather than abruptly in
frequency, intensity, location or timbre
Therefore: Abrupt change == new stream == new source
Sequences of low and high tones tend to split into streams; this can be
suppressed by putting glides in between (Bregman & Dannenbring
1973)
In speech if there are oscillations in frequency it gives the impression
that there are two speakers saying the one word
In music in general if a note is near in pitch to the one just before it
then it will be heard as the next note in the melody rather than a note
that is separate - higher or lower
Example ASA, Demonstration 12
Slide 17
Gestalt Principle of Disjoint Allocation
(Belongingness).
A component can only come from one source – i.e. hearing tries to
use each component only once (Bregman & Rudnicky 1975)
Say we have two tones at slightly different pitches and these can
either be heard in isolation or embedded in another series of tones thus
In isolation the order of AB or BA is easily judged.
The addition of tones (X’s) close to the pitch of AB acts as a distracter,
making it difficult to order AB (this is thought to be because we attend
more to the start and end of sequences).
But if more X’s are added, they form a stream that is separate from
AB and again the order of AB is easily judged.
This is not hard and fast: ambiguity is possible, and this shows that this
level of organisation is on the boundary between being pre-attentive and
attentive
It also shows how the addition of new elements changes the
perceptual organisation of the stimulus.
Example ASA, Demonstration 16
Slide 18
Gestalt Principle of Closure through
Apparent Continuity.
A source may be obscured or absent, but its percept continues.
e.g. FM radio: disturbance from the ignition of passing cars. We hear a click over
the sound whereas in fact the radio is producing only a click.
A pitched sound that is broken but the gap is filled by noise seems unbroken.
Example ASA, Demonstration 28
Similarly a glide that is broken but the gap is filled with noise seems unbroken.
Example ASA, Demonstration 29
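The broken-tone stimulus can be sketched in Python as below. The function builds a tone with a gap that is either left silent or filled with noise. All parameter values (1000 Hz tone, 200 ms gap, noise louder than the tone) are assumptions for illustration and are not taken from the ASA demonstrations. The noise is made louder than the tone on the assumption that it must be intense enough to have plausibly masked the tone.

```python
import math
import random

def gapped_tone(freq_hz=1000.0, sr=16000, dur=1.0, gap=(0.4, 0.6),
                fill_noise=False, tone_amp=0.2, noise_amp=1.0, seed=0):
    """Tone interrupted between gap[0] and gap[1] seconds.

    With fill_noise=False the gap is silent; with fill_noise=True it is
    filled with uniform random noise.
    """
    rng = random.Random(seed)
    g0 = int(round(gap[0] * sr))
    g1 = int(round(gap[1] * sr))
    out = []
    for i in range(int(round(sr * dur))):
        if g0 <= i < g1:
            out.append(noise_amp * rng.uniform(-1.0, 1.0) if fill_noise else 0.0)
        else:
            out.append(tone_amp * math.sin(2.0 * math.pi * freq_hz * i / sr))
    return out

silent_gap = gapped_tone(fill_noise=False)  # gap left silent
noise_gap = gapped_tone(fill_noise=True)    # gap filled with noise
```

The two versions are identical outside the gap, so any difference in what is heard comes from the noise alone.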