The University of South Florida
audiovisual phoneme database
v 1.0
Frisch, S.A., Stearns, A.M.,
Hardin, S.A., & Nikjeh, D.A.
University of South Florida
frisch@cas.usf.edu
This work supported by NIH-NIDCD R03 06164
1
Phoneme Database Project
• Recorded wordlist demonstrating all
English phonemes in initial, medial, and
final word position (if possible)
• Audiovisual recordings
– Acoustics
– Face video
– Ultrasound of tongue
– Flexible endoscopy of pharynx, larynx
2
Gratuitous Equipment Picture
3
Purpose
• Potential for multimedia tools in teaching
phonetics/speech science
• Students have ready access to multimedia
computers
• Freeware for acoustic analysis is
available
• Need for multimedia resources
appropriate for students’ needs
4
Methods – Recording Parameters
• Ultrasound
– Mid-sagittal image of tongue posture
– Probe in direct contact with jaw (no
compressible acoustically transparent
standoff)
– No head stabilization
• Digital video camera
– Aimed at angle to front of face
– Shows lip and jaw movement
5
Methods – Recording Parameters
• Flexible endoscopy
– Shows laryngeal setting (but cannot see
glottal cycle)
– Also shows pharyngeal articulation
• Audio recording captured as part of all
video recordings, used to synchronize
videos with one another
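One way a shared audio track could be used for alignment is cross-correlation: slide one track against the other and keep the shift at which they match best. The sketch below illustrates that idea only. The slides do not say how synchronization was actually done, and the names `estimate_lag` and `max_lag` are hypothetical.

```python
def estimate_lag(reference, other, max_lag):
    """Return the lag (in samples) at which `other` best matches `reference`.

    A positive lag means `other` starts later than `reference`,
    i.e. other[n + lag] lines up with reference[n].
    This is a brute-force cross-correlation, fine for short excerpts.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, r in enumerate(reference):
            m = n + lag
            if 0 <= m < len(other):
                score += r * other[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Toy example: the second track is the first one delayed by 3 samples.
track_a = [0, 0, 1, 3, -2, 1, 0, 0, 0, 0]
track_b = [0, 0, 0] + track_a[:-3]
print(estimate_lag(track_a, track_b, max_lag=5))  # prints 3
```

Dividing the returned lag by the audio sample rate gives the offset in seconds, which could then be used to trim or pad one video before superimposing it on another.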
6
Word List
• Each English phoneme in word initial,
word medial, and word final position
where allowed by English phonotactics
• Common words used wherever possible
• Some additional gaps in database due to
recording difficulties
• See handout for complete list
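The position-by-phoneme design, with gaps where English phonotactics rule out a position, can be pictured as a small lookup table. The sketch below is illustrative only: the example words are hypothetical and are not taken from the database's handout, and the function name `missing_positions` is an assumption. The two gaps shown reflect well-known restrictions: /h/ does not occur word-finally and /ŋ/ (written `ng` here) does not occur word-initially.

```python
POSITIONS = ("initial", "medial", "final")

# Hypothetical entries; None marks a position with no word.
wordlist = {
    "s": {"initial": "sun", "medial": "messy", "final": "bus"},
    "h": {"initial": "hat", "medial": "ahead", "final": None},
    "ng": {"initial": None, "medial": "singer", "final": "sing"},
}


def missing_positions(words):
    """List (phoneme, position) pairs that have no word in the table."""
    return [
        (phoneme, position)
        for phoneme, entry in words.items()
        for position in POSITIONS
        if entry.get(position) is None
    ]


print(missing_positions(wordlist))
```

A table like this makes it easy to separate expected phonotactic gaps from gaps caused by recording difficulties, for instance by keeping a second list of the positions that are disallowed in English.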
7
Procedure
• Each word was read clearly in isolation
• Considerable pause between each word,
with articulators moved back to “neutral”
position
8
Post-Processing
• Video recordings were superimposed to
create a single video file showing facial
video, endoscopy of larynx, and
ultrasound of tongue position
• Noise reduction applied to audio to
eliminate machine noise from recording
environment
9
Using the Database
• Recordings can be viewed with the freeware
WaveSurfer program
• Allows display of common acoustic phonetic
analysis windows in conjunction with video
image
• Cursor position in acoustic analysis window is
tied to the appropriate frame in the video image
• Download from
http://www.speech.kth.se/wavesurfer/
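The link between cursor position and video frame comes down to converting a time in seconds to a frame index. The sketch below shows that arithmetic under stated assumptions: it is not WaveSurfer's implementation, the function name `frame_for_time` is hypothetical, and the frame rate must be supplied by the caller because the slides do not state it.

```python
def frame_for_time(t_seconds, fps, n_frames=None):
    """Map a cursor time (seconds) to a zero-based video frame index.

    `fps` is the video frame rate. If `n_frames` is given, the result
    is clamped so a cursor past the end shows the last frame.
    """
    if t_seconds < 0:
        raise ValueError("cursor time must be non-negative")
    index = int(t_seconds * fps)  # truncate to the frame containing t
    if n_frames is not None:
        index = min(index, n_frames - 1)
    return index


# With an assumed 30 frames per second, a cursor at 0.5 s is in frame 15.
print(frame_for_time(0.5, fps=30.0))  # prints 15
```

Because video frames are far coarser than audio samples, many cursor positions in the spectrogram map to the same frame.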
10
Example 1 – okay
[Figure: annotated video frame for “okay”. Labels: front, tongue root, tongue blade, arytenoids, vocal folds]
11
Example 1 – okay
• Ultrasound shows tongue body raised
and tongue root pulled forward to
produce high-mid front vowel /e/
• Endoscope window shows arytenoid
cartilages are approximated and glottis is
closed for voicing
• Video clip shows lips pulled apart for
unrounded vowel production
• Etc…
12
Example 2 – voice
[Figure: spectrogram of “voice” with F2 labeled for the diphthong nucleus /ɔ/ and the off-glide /I/]
13
Example 2 – voice
• Sample image of diphthong off-glide /I/
• Cursor positioned on spectrogram at end of
diphthong
• Ultrasound shows visible tongue tip and body
raising, and advancement of the tongue root
• Face video shows lip spreading and jaw raising
• Endoscope shows approximation of the
arytenoids and vocal folds
14
Conclusion
• Ultrasound and other multimedia tools
have great potential to enhance teaching
and learning in phonetics/speech science
• Copies of the database, version 1.0, are
available in a compressed file archive on
CD-ROM
• Additional suggestions for improvements
or additions to the database are welcome
15
Just for fun
• “Tongue,” the music video, is included as
part of the database
16