Articulation and the Vocal Tract Chapter 8

Articulation and the Vocal Tract
Chapter 3
Perry C. Hanavan, Au.D.
Audiologist
Oral Cavity
• Anatomy
• Lips
• Oral cavity
• Teeth
• Hard palate
• Soft palate
• Uvula
• Tonsils
• Tongue
Question
Which is not a part of the oral cavity?
A. Uvula
B. Soft palate
C. Pharynx
D. Lips
E. Hard palate
Lips
• Anatomy
– FAS
Question
Cupid's bow is:
A. Upper lip
B. Lower lip
C. Chin
D. Nose
E. A novel
Facial Muscles
• Anatomy
• Facial muscles
Question
Which muscle helps round the lips?
A. Orbicularis oculi
B. Orbicularis oris
C. Orbicularis oracle
D. Orbicularis occipital
Teeth
• Anatomy
Dental Occlusion
• Types
Hard Palate
• Anatomy
– Oral cavity
– Hard palate
Soft Palate
• Anatomy
Tongue
• Anatomy
Function of Soft Palate
• Velopharyngeal port
– nasality
Passavant’s Pad
• A prominence on the
posterior wall of the nasopharynx formed by
contraction of the superior
constrictor of the pharynx
during swallowing.
• Synonym:
Passavant's bar,
Passavant's pad,
Passavant's ridge.
Speech Production
• Phonemes (sound units of language)
– Consonants (s, z)
• Voiced (z)
• Unvoiced (s)
– Vowels (i)
– Diphthongs (oy)
Production of Consonants
• Place of production
– Where major constriction occurs in vocal tract
• Manner of production
– How consonant is produced
• Voicing (phonation)
– Voiced or unvoiced
Place of Production
• Bilabial
• Labiodental
• Dental
• Alveolar
• Palatal
• Velar
• Glottal
Manner of Production
• Stops
• Fricatives
• Affricates
• Nasals
• Semivowels
Alphabet Soup
1. 26 letters of alphabet
2. Only list consonants
3. Digraph consonants
4. Other consonants
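Consonant chart (manner of production, with place of production in parentheses)
Stops: p b (bilabial); t d (alveolar); k g (velar)
Fricatives: f v (labiodental); θ (interdental); s z (alveolar); h (glottal)
Affricates
Nasals: m (bilabial); n (alveolar); ŋ (velar)
Semivowels: w hw (bilabial); l r (alveolar); j (palatal)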
Male Vocal Tract
17.2 cm
Tube open at one end, closed at the other
¼ Wave Resonator
Vocal Tract Resonators (Filter)
• Tube open at both ends
– ½ wave resonator
• Tube closed at one end and open at the other
– ¼ wave resonator
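The two tube types can be compared with the standard idealized resonance formulas for a uniform tube. This is a sketch only: the symbols L (tube length), c (speed of sound) and n (resonance number) are introduced here and do not appear in the slides.

```latex
f_n^{\text{open-open}} = \frac{n\,c}{2L},
\qquad
f_n^{\text{closed-open}} = \frac{(2n-1)\,c}{4L},
\qquad n = 1, 2, 3, \dots
```

Under this idealization the closed-open (¼ wave) tube has resonances at odd multiples of its lowest resonance, while the open-open (½ wave) tube has resonances at all integer multiples.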
Vocal Tract (Filter)
• Approximately 17.2 cm for males
• 5/6 the length for females
• Children roughly half the length of adult male
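The effect of tract length on resonance can be sketched numerically by treating the vocal tract as a uniform tube closed at one end (¼ wave resonator). The speed of sound (34,400 cm/s) and the function name are assumptions made for illustration, and a real vocal tract is not a uniform tube, so the numbers are rough estimates for a neutral vowel only.

```python
# Assumed speed of sound in warm air, in cm/s (not given in the slides).
SPEED_OF_SOUND_CM_PER_S = 34400.0


def quarter_wave_resonances(length_cm, count=3, c=SPEED_OF_SOUND_CM_PER_S):
    """Resonances (Hz) of a uniform tube closed at one end, open at the other.

    Uses the idealized quarter-wave formula f_n = (2n - 1) * c / (4 * L).
    """
    return [(2 * n - 1) * c / (4.0 * length_cm) for n in range(1, count + 1)]


# Tract lengths taken from the slide: male 17.2 cm, female 5/6 of that,
# child roughly half of the adult male length.
MALE_CM = 17.2
tract_lengths_cm = {
    "adult male": MALE_CM,
    "adult female": MALE_CM * 5 / 6,
    "child": MALE_CM / 2,
}

for talker, length_cm in tract_lengths_cm.items():
    formants_hz = [round(f) for f in quarter_wave_resonances(length_cm)]
    print(f"{talker}: {length_cm:.1f} cm -> {formants_hz} Hz")
```

With these assumptions the adult male tube resonates near 500, 1500 and 2500 Hz, and shorter tracts shift every resonance upward in proportion.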
Assignment
Source-Filter Theory
Fo (source produced at vocal folds)
Formants (F1, F2, F3, …) created by vocal tract resonance
Source which is emphasized and not modulated by vocal tract resonance (F1, F2, F3, shown at left)
Formants with Tongue Position
More pictorials
Vowel Formants
Formants Created by Vowel
Shapes
Vowel Formant Data
Adult Male Talker Data
Adult Female Talker Data
Chart Vowel Formants
• Acoustics and Tongue Position
• Video Clip
Lip Rounding
Vowel Formants
Source-Filter Theory
Vowel Spectrograph
Diphthongs
Consonants
• Intensity (loudness cues)
• Frequency (pitch cues)
• Duration (length of sound cues)
Audiogram
Plotting Acoustic Data on Audiogram
Place of Production
Speech Production Patterns
Speech Production Patterns
Vowel Shapes
Spectrogram Dimensions
Spectrogram Dimensions
Transitions
Spectrogram Dimensions
Question
Which phoneme in the English language has the weakest energy?
A. /l/
B. /r/
C. /s/
D. /f/
E. /θ/
Consonant Data
Intensity Data
Vowel Formants
Sound Source
• For stops, fricatives and affricates, the sound source is produced in the
vocal tract
• For voiced stops, fricatives and affricates, there are two sound sources
– Periodic laryngeal source combined with
– Aperiodic vocal tract sound source
• Aperiodic sound is produced in two different manners:
– Sudden release of air pressure built up behind a closure (burst/transient)
• Stops/plosives
– Turbulence as air rushes through a narrow constriction
• Fricatives
Stop/Plosive
/p/ /b/
/t/ /d/
/k/ /g/
Manner of Production
• Stops produced with
– complete closure within the oral cavity,
– build up of pressure behind the closure and
– rapid release of closure with air rapidly expelled
Acoustic Events
Divided into five components
1. Occlusion
2. Transient
3. Frication
4. Aspiration
5. Transition
In practice, it is difficult to differentiate the transient
from the frication, thus, this complex is generally
referred to as the burst.
Acoustic Events
• Occlusion is the period during which the airflow is stopped and the
pressure increases. It is characterized by silence or the absence of
energy. Voiced stops may have low frequency (0-500 Hz) periodic energy
during this phase.
• Transient corresponds to the release of the closure. It is
characterized by a spike of intense energy on the spectrogram with
a duration of about 10 msec.
• Frication component is the result of high intra-oral pressure being
released through a narrow opening at the point of release.
• Aspiration phase is the result of the vocal tract opening even further
with turbulence through the glottis rather than the oral constriction.
Formants are often present during this phase.
• Transition is the component where formants are present and the oral
tract is moving to the position for the following vowel target.
• In practice it is difficult to differentiate the transient from the frication
so this complex is generally referred to as the burst.
ACOUSTIC CUES TO THE
VOICED/VOICELESS DISTINCTION
1. VOT
2. F1 of vowel following stop/plosive
3. Preceding vowel duration
4. Other cues
1. Voice Onset Time (VOT)
• Voiced and voiceless stops differ in the coordination
between supralaryngeal and laryngeal events
– Difference is referred to as differences in Voice Onset Time
(VOT)
• Voice onset time is the time that voicing begins relative
to consonant release
• In English, voiceless stops have large VOT values and
voiced stops have small or negative VOT values.
– Negative VOT occurs when periodicity begins before stop
release i.e. during closure
• English speakers hear a consonant as voiceless if VOT
is over 25 msec for bilabials, over 35 msec for alveolars
and over 40 msec for velars
• VOT values separating voiced from voiceless stops are
language specific
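The perceptual boundaries quoted above can be written as a small lookup. This is a minimal sketch assuming a single hard boundary per place of production; the function name and the example VOT values are hypothetical, and real listeners use additional cues beyond VOT.

```python
# Approximate English perceptual boundaries from the slide, in msec.
VOT_BOUNDARY_MSEC = {"bilabial": 25.0, "alveolar": 35.0, "velar": 40.0}


def perceived_voicing(place, vot_msec):
    """Label a stop as 'voiced' or 'voiceless' from its VOT alone.

    Negative VOT (voicing begins before the release) falls below every
    boundary and is therefore labelled 'voiced'.
    """
    boundary = VOT_BOUNDARY_MSEC[place]
    return "voiceless" if vot_msec > boundary else "voiced"


# Hypothetical measurements, for illustration only.
examples = [("bilabial", 60.0), ("alveolar", -20.0), ("velar", 30.0)]
for place, vot in examples:
    print(f"{place} stop, VOT {vot:+.0f} msec -> {perceived_voicing(place, vot)}")
```

Because the boundaries are language specific, the table would need different values (and possibly a third category) for languages such as Spanish, French or Thai.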
1. Voice Onset Time (VOT)
• Spanish and French make use of prevoiced stops (negative
VOT) and contrast these with positive VOT stops. English
does not recognize a difference between prevoiced and
voiceless unaspirated
• Thai speakers make a three way distinction for bilabials and
alveolars. Voiced, voiceless unaspirated, voiceless aspirated
• Values also change in context
• VOT separation decreases for stops produced in sentences
compared with initial stops produced in isolated words
• Stressed voiceless stops are produced with greater VOT values
than unstressed ones
• VOT increases when stops occur in stop-approximant
sequences
• VOT for unaspirated stops (/sC/ clusters) is close to VOT for
voiced stops in CV syllables
2. F1 for Following Vowel
• F1 provides important acoustic information about voicing
characteristics
• F1 is very low during complete closure.
• For voiced stops
– F1 rises very quickly from burst to vowel target formant position
– Rise steepest in open vowels (high F1), and flattest in close vowels (low F1)
• For voiceless stops
– Periodicity (voicing) occurs at least 30 msec later than voiced stops so less
of the formant will be pulse excited
– By the time pulse excitation begins, F1 has almost reached the vowel target
• On spectrograms, voiced stops are characterized by a voiced, rising F1
transition which is NOT present in voiceless stops because
– pulse excitation begins later in the transition for voiceless stops
– aspiration requires an open glottis which (due to the large resonating
sublaryngeal chamber) causes an attenuation of F1
• For VC syllables
– F1 should fall sharply into the closure for voiced stops
– Offset frequency should be higher for voiced than voiceless stops
3. Preceding Vowel Duration
• Duration of vowels before voiceless stops is
shorter than before voiced stops.
• 52-69% shorter vowel duration before voiceless
than voiced stops
– Examples: Pop vs. Bob
4. Other
• Voiced stops have voicing/periodicity during
closure when in intervocalic or postvocalic
position
• Duration of intervocalic closure provides an
additional cue to voicing
• Closure duration is greater for voiceless than voiced stops, e.g.
rapid vs. rabid
• Onset frequency of Fo is higher following voiceless
than voiced stops.
• Burst intensity of voiceless stops is greater than that of
voiced stops.
CHARACTERISTICS OF ENGLISH STOPS
IN CONTEXT
Aspiration
• When /p,t,k/ are followed by /r,l,w,j/, aspiration
manifests itself in the devoicing of the
approximants, e.g. "please", "try", "clean", "pew"
• In final position and in unstressed syllables
aspiration is weak
• When /s/ precedes /p,t,k/ initially, there is no
aspiration
Closure
• /b,d,g/ only fully voiced during closure when
occurring intervocalically
Release
• Generally, stops have a release stage in the form of
aspiration or as a following vowel. However, there
are instances where the release does not occur
• No audible release in final position: e.g. rope/robe
• No audible release in stop clusters: e.g. dropped,
locked, good boy
• Glottal reinforcement of final voiceless stops
• Nasal release: If a stop is followed by a homorganic nasal
in the following syllable, the release of air is usually via the
nasal cavity, e.g. topmost, submerge, cotton, not now, red
nose
• Lateral release: When the homorganic stops /t,d/ occur
before /l/ they are released laterally. The tip remains in
contact with the alveolar ridge but one or both of the sides
is lowered, allowing the air to escape, e.g. cattle, medal,
atlas
Place of Production
• Place of articulation for stops determined by
1. burst
2. transitions
Burst
• Burst is the combination of the transient and frication phases
– Provides information for place of production
– Frequency spectrum for alveolars and velars results from
resonance of cavity in front of tongue constriction
• Alveolars: front cavity is small and place of production
doesn't alter greatly under influence of different vowels
• Velars: front cavity shape varies greatly with different
vowels
• Three important parameters of burst that allow one to
differentiate the place of production of stops:
1. Energy level
2. Spectral centre of gravity (frequency location of main energy concentration)
3. Spectral variance (whether the spectrum lacks peaks or has multiple peaks)
1. Energy Level
• Alveolar stops have the most intense bursts
• Bilabials have weakest bursts
– Due to lack of resonance for bilabials as no front
cavity to amplify the sound
• Little difference between alveolar and velar
2. Center of Gravity
• Bilabials lack any main resonance in the 0-10kHz range as there is
no front cavity so characterized by gradually falling distribution of
energy throughout frequency range
• Alveolars - broad distribution of energy in the burst characterized by
prominence about 1.8 kHz and another rise between 2.5 -4.5 kHz
• Velar - compact concentration of energy in middle of spectrum which
varies according to F2 and F3 of following vowel
• Frequency position of energy for velars derives from the cavity in
front of tongue constriction
• Prevelar (before front vowels, e.g. /kip/, /gis/): compact energy
distributed around center frequency of about 3 kHz
• Postvelar (before back vowels, e.g. /ko:t/, /go:d/): compact energy
distributed around center frequency of about 1 kHz
• High frequency bursts = alveolar, 3 kHz to 4 kHz
• Low frequency bursts = bilabial, 350 Hz (but higher for front vowels)
• Bursts with energy slightly above the F2 for the following vowel =
velar, e.g. back vowels = low F2: 700 Hz, front vowels = high F2: 3 kHz
3. Formant Locus & Transitions
• The locus theory proposes that the place of articulatory closure for
each of the three places of articulation is relatively fixed regardless
of following vowel and that this articulatory invariance has its
acoustic correlate in the starting frequency of the second formant.
Even though the formants may not reach the actual locus position
they will still point to it.
• Once we know the locus frequency we should be able to predict the
slope of the second formant transition if we know the following vowel
formant frequencies.
• Therefore: because the locus for /b/ is low (720 Hz) and most vowels
have an F2 value greater than that, the transition will be rising in
/bV/ syllables.
• The locus for /d/ being at 1800Hz means that for central and back
vowels F2 will fall in /dV/ syllables but will be level or slightly rising in
/di,dI,de/.
• Only the alveolars can be considered to have a relatively stable
locus at around 1800 Hz. Cassidy and Harrington (1994) found that
the variability in F2 onset frequency is least for /d/ followed by /b/
then /g/.
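The locus prediction described on this slide can be sketched as a comparison between the locus frequency and the F2 of the following vowel. The locus values come from the slide; the function name, the 100 Hz "level" tolerance and the example vowel F2 values are hypothetical and chosen only for illustration.

```python
# F2 locus frequencies quoted on the slide, in Hz.
F2_LOCUS_HZ = {"b": 720.0, "d": 1800.0}


def predicted_f2_transition(stop, vowel_f2_hz, level_tolerance_hz=100.0):
    """Predict the direction of the F2 transition in a stop + vowel syllable.

    The transition runs from the stop's locus toward the vowel's F2 target.
    level_tolerance_hz is an assumed band within which the transition is
    treated as roughly level.
    """
    difference = vowel_f2_hz - F2_LOCUS_HZ[stop]
    if abs(difference) <= level_tolerance_hz:
        return "level"
    return "rising" if difference > 0 else "falling"


# Hypothetical vowel F2 targets: a front vowel and a back vowel.
for stop in ("b", "d"):
    for label, f2 in (("front vowel", 2300.0), ("back vowel", 900.0)):
        print(f"/{stop}/ + {label} (F2 {f2:.0f} Hz): "
              f"{predicted_f2_transition(stop, f2)}")
```

As the slides go on to note, this single-dimension view is too simple for bilabials and velars, whose F2 onset shifts substantially with coarticulation.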
3. Formant Locus & Transitions
• For bilabials and velars, there is not an invariant locus value, as
changing the following vowel will produce large changes in formant
frequency values
– For instance, for bilabials F2 and F3 will have rising transitions before front
vowels but F2 will be falling before back vowels
• When F3 information is included, a better picture emerges of how the
stops cluster
• F2/F3 plots show tendency of three clusters corresponding to
bilabial, alveolar and velar
– However there are examples of bilabials which are potentially confusable
with velars preceding back vowels
– If we examine the change in F2 relative to the change in F3 (the difference
between the formant value at onset and the value at the vowel target) then
these bilabials are well separated
• Cannot separate place of articulation on just one dimension such as
F2 locus.
– Several variables are required to give the whole picture
• Locus is not invariant as it changes substantially as a result of
coarticulation
Stops
Produced with a closure within the oral cavity, a
build up of pressure behind this closure and a
release of the closure allowing the air to be rapidly
expelled.
Acoustically these events can be divided into five
components:
1. Occlusion
2. Transient
3. Frication
4. Aspiration
5. Transition
Plosives
Manner of
Production
Fricatives
Fricatives
Fricative production involves two articulators
being brought together and held close
enough for the escaping air to become
turbulent creating an aperiodic (noise)
sound. May be voiced or unvoiced.
The closure phase of fricatives is
characterized by the continuant noisy
aperiodic component. The characteristics of
the noise are the result of the position of the
constriction, the shape of the orifice, and the
aerodynamic forces of the air stream.
Acoustic characteristics include:
High frequency hiss, long duration, weak to
moderate intensity
Unvoiced Fricatives
Sliding from /s/ to /ʃ/
Affricates
• Stop with a fricative release – but
palatal.
• Combination of stop and fricative
characteristics.
• Closure, burst followed by short
silence then frication.
• Affricates have shorter rise time
than fricatives.
– Rise time is the time from onset to
peak intensity of frication.
• Affricates distinguished from
fricatives by presence of closure
and by duration of noise which is
longer for fricatives.
• The shorter the duration of noise, the
shorter the silence necessary to
elicit an affricate response.
Affricates
Question
Which consonants are characterized by
antiresonances?
A. Plosives
B. Fricatives
C. Affricates
D. Nasals
E. Semivowels
Nasals
• Like the oral tract, nasal tract has
its own resonant frequencies or
formants.
• Most commonly reported nasal
formants occur at 300Hz, 1kHz,
2.2 kHz, 2.9kHz, 4kHz.
• Antiresonances enter whenever
there is a side branch in the main
acoustic pathway.
• An antiresonance or zero serves
to decrease the spectral energy at
specific frequencies by absorbing
the sound at or near the
antiresonant frequencies.
• These cumulatively have the effect of
reducing the total amplitude of the
sound generated.
Nasals
Formants / Antiformants
Nasal Antiformant
Approximants (Semivowels)
Approximants are consonants most
similar to vowels in their articulation
and hence their acoustic structure.
Articulation involves one articulator
approaching another but without the
tract becoming narrowed to such an
extent that turbulent airflow occurs.
Like vowels, approximants are:
• highly resonant
• produced with a relatively open vocal
tract
• characterized by identifiable formant
structures
• continuant sounds since there is no
occlusion or momentary stoppage of
the air stream
• non turbulent due to lack of
constriction
• voiced sounds
Semivowels
Question
Which of the following are glides?
A. l
B. r
C. j
D. w
E. A and B
F. C and D
l and r
Glides
VOT
Connected Speech
Place of Production Cues
Manner of Production