The University of South Florida audiovisual phoneme database v 1.0
Frisch, S.A., Stearns, A.M., Hardin, S.A., & Nikjeh, D.A.
University of South Florida
frisch@cas.usf.edu
This work was supported by NIH-NIDCD R03 06164

Phoneme Database Project
• Recorded wordlist demonstrating all English phonemes in initial, medial, and final word position (if possible)
• Audiovisual recordings
  – Acoustics
  – Face video
  – Ultrasound of tongue
  – Flexible endoscopy of pharynx, larynx

Gratuitous Equipment Picture

Purpose
• Potential for multimedia tools in teaching phonetics/speech science
• Students have ready access to multimedia computers
• Freeware for acoustic analysis is available
• Need for multimedia resources appropriate for students’ needs

Methods – Recording Parameters
• Ultrasound
  – Mid-sagittal image of tongue posture
  – Probe in direct contact with jaw (no compressible, acoustically transparent standoff)
  – No head stabilization
• Digital video camera
  – Aimed at an angle to the front of the face
  – Shows lip and jaw movement
• Flexible endoscopy
  – Shows laryngeal setting (but cannot show the glottal cycle)
  – Also shows pharyngeal articulation
• Audio was captured as part of every video recording and used to synchronize the videos with one another

Word List
• Each English phoneme in word-initial, word-medial, and word-final position where allowed by English phonotactics
• Common words used wherever possible
• Some additional gaps in the database due to recording difficulties
• See handout for complete list

Procedure
• Each word was read clearly in isolation
• Considerable pause between words, with articulators moved back to a “neutral” position

Post-Processing
• Video recordings were superimposed to create a single video file showing facial video, endoscopy of the larynx, and ultrasound of tongue position
• Noise reduction was applied to the audio to eliminate machine noise from the recording environment

Using the Database
• Recordings can be viewed with freeware
Wavesurfer program
• Allows display of common acoustic phonetic analysis windows in conjunction with the video image
• Cursor position in the acoustic analysis window is tied to the corresponding frame in the video image
• Download from http://www.speech.kth.se/wavesurfer/

Example 1 – okay
(Figure labels: front, tongue root, tongue blade, arytenoids, vocal folds)
• Ultrasound shows tongue body raised and tongue root pulled forward to produce high front vowel /e/
• Endoscope window shows the arytenoid cartilages approximated and the glottis closed for voicing
• Video clip shows lips pulled apart for unrounded vowel production
• Etc…

Example 2 – voice
(Figure labels: // F2, /I/ F2)
• Sample image of diphthong off-glide /I/
• Cursor positioned on spectrogram at end of diphthong
• Ultrasound shows visible tongue tip and body raising, and advancement of the tongue root
• Face video shows lip spreading and jaw raising
• Endoscope shows approximation of the arytenoids and vocal folds

Conclusion
• Ultrasound and other multimedia tools have great potential to enhance teaching and learning in phonetics/speech science
• Copies of the database, version 1.0, are available in a compressed file archive on CD-ROM
• Suggestions for improvements or additions to the database are welcome

Just for fun
• “Tongue”, the music video, is included as part of the database
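
Note on audio-based synchronization
The slides state only that the audio captured with each video was used to align the videos with one another, not how this was done. One common approach is to cross-correlate the two audio tracks and take the best-scoring shift as the offset. The sketch below illustrates that idea under stated assumptions; it is not the authors' procedure. The function names, the 48 kHz sample rate, and the 30 fps frame rate are hypothetical values chosen for the example.

```python
def estimate_lag(reference, other, max_lag):
    """Return the shift, in samples, that best aligns `other` with `reference`.

    A positive result means the shared sound occurs later in `other`.
    Brute-force cross-correlation over lags in [-max_lag, max_lag].
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, r in enumerate(reference):
            j = i + lag
            if 0 <= j < len(other):
                score += r * other[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def lag_to_frames(lag_samples, sample_rate, fps):
    """Convert an audio lag in samples to the nearest whole video frame."""
    return round(lag_samples * fps / sample_rate)


# Toy signals: the same short burst, delayed by 3 samples in the second track.
face_audio = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]
ultrasound_audio = [0, 0, 0, 0, 0, 1, 2, 1, 0, 0]

lag = estimate_lag(face_audio, ultrasound_audio, max_lag=5)
print(lag)

# Assumed rates (not from the slides): 48 kHz audio, 30 fps video.
print(lag_to_frames(4800, sample_rate=48000, fps=30))
```

With real recordings the brute-force loop would be slow; an FFT-based correlation over a short stretch of audio is the usual substitute.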