VTAR: A Matlab-based computer program for vocal tract acoustic modeling Xinhui Zhou, Zhaoyan Zhang and Carol Espy-Wilson Speech Communication Lab, Institute for Systems Research (ISR), Department of Electrical and Computer Engineering http://www.isr.umd.edu/Labs/SCL FLUID AND WALL PROPERTY SETUP INTRODUCTION LATERAL SOUND /l/ A Matlab-based computer program for vocal tract acoustic response calculation (VTAR) [1] based on a frequency-domain vocal tract model. VTAR is able to model various complex sounds such as nasals, rhotics, and liquids. With input in the form of vocal tract cross-sectional area functions, VTAR calculates the vocal tract acoustic response function and the formant frequencies and bandwidths. The user-friendly interface allows directed data input for defined categories: vowels, nasals, nasalized sounds, consonant, laterals, and rhotics. The program also provides an interface for input and modification of arbitrary vocal tract geometry configurations, which is ideal for research applications. <Mode> Nasal <typeOfBranch> PharynxConnectingBranch <numOfBranches> 2 <numOfSections> 20 <secLen> 0.3968 0.3968 0.3968 0.3968 0.3968 0.3968 ... <secArea> 0.2600 0.2400 0.1700 0.2100 0.1500 0.3600 ... <typeOfBranch> OralTerminalBranch <numOfBranches> 0 <numOfSections> 21 <secLen> 0.3968 0.3968 0.3968 0.3968 0.3968 0.3968 ... <secArea> 2.3000 1.9300 1.7700 0.9600 0.8900 1.2200 ... <typeOfBranch> NasalCommonConnectingBranch <numOfBranches> 2 <numOfSections> 5 <secLen> 0.4087 0.4087 0.4087 0.4087 0.4087 … <secArea> 1.2600 2.7423 3.5236 3.8665 4.3199 … MODELING METHODS Volume velocity Transfer function VTAR • Frequency-domain formulation • Vocal tract modeled as concatenation of various modules (such as single tube, branching, and lateral channels). • For each module the input and output pressure and volume velocities are related by a transfer matrix. ⎡ p in ⎤ ⎡ p out ⎤ ⎡A ⎢U ⎥ = K ⎢U ⎥ = ⎢ C ⎣ ⎣ in ⎦ ⎣ out ⎦ VOWEL SOUND GENERIC AREA FUNCTION OF /r/ AND /l/ FROM MRI IMAGE B ⎤ ⎡ p out ⎤ D ⎥⎦ ⎢⎣U out ⎥⎦ 10 0 -10 B ⎤⎡ pl ⎤ D ⎥⎦ ⎢⎣U l ⎥⎦ 20 log10 U l / U g = 20 log10 (1/ CZ l + D ) <typeOfBranch> NasalRightTerminalBranch <numOfBranches> 0 <numOfSections> 20 <secLen> 0.4063 0.4063 0.4063 0.4063 0.4063 0.4063 ... <secArea> 1.6855 1.8696 1.4279 1.7885 1.9956 1.8020 ... SIMULATED SPECTRUM OF /l/ FROM VTAR AND REAL SPECTRUM FROM ACOUSTIC DATA • The coefficients of the matrices are calculated based on a transmission-line model and the chain matrix. • Includes losses due to flow viscosity, heat conduction, and vocaltract wall vibration. • Pressure and volume velocities at the glottis and lips can be related by a simple matrix and the vocal tract transfer function can be calculated. ⎡ pg ⎤ ⎡ pl ⎤ ⎡ A ⎢U ⎥ = K ⎢ ⎥ = ⎢ ⎣U l ⎦ ⎣ C ⎣ g⎦ <typeOfBranch> NasalLeftTerminalBranch <numOfBranches> 0 <numOfSections> 21 <secLen> 0.3750 0.3750 0.3750 0.3750 0.3750 0.3750 ... <secArea> 2.1411 1.7442 1.4429 1.6214 2.0820 2.5039 ... RHOTIC SOUND /r/ NASAL SOUND Power Spectrum (dB) Cross-sectional area functions ARBITRARY AREA FUNCTION CONFIGURATION FOR NASAL SOUND (Example format as follows) -20 -30 -40 -50 -60 -70 -80 MAIN FEATURES -90 0 1000 2000 3000 4000 5000 6000 Frequency (Hz) • Acoustic response calculation for different kinds of sound and different vocal tract configurations • Formant and bandwidth calculation from acoustic response • Different models for the same sound • Generic area function for different kinds of sounds • User-friendly interface for area function input and manipulation • Arbitrary area function input • Fluid and wall property setting • Save and load the setting information and calculation result • Marker to measure the data in plot FREE DOWNLOAD AVAILABLE SOON http://www.isr.umd.edu/labs/SCL/vtar REFERENCE 1. Zhang and Espy-Wilson, A vocal tract model of American English /l/, J. Acoust. Soc. Am. 115(3), pp1274-1280, March (2004) ACKNOWLEDGMENTS This work was supported by NIH Grant 1 R01 DC05250-01