VTAR: A Matlab-based computer program for vocal tract acoustic modeling

Xinhui Zhou, Zhaoyan Zhang and Carol Espy-Wilson
Speech Communication Lab, Institute for Systems Research (ISR), Department of Electrical and Computer Engineering
http://www.isr.umd.edu/Labs/SCL
FLUID AND WALL PROPERTY SETUP [figure panel]
LATERAL SOUND /l/ [figure panel]
INTRODUCTION
VTAR is a Matlab-based computer program for vocal tract acoustic
response calculation, based on a frequency-domain vocal tract model [1].
VTAR can model complex sounds such as nasals, rhotics, and liquids.
Given vocal tract cross-sectional area functions as input, it calculates
the vocal tract acoustic response function along with the formant
frequencies and bandwidths. The user-friendly interface supports guided
data input for predefined categories: vowels, nasals, nasalized sounds,
consonants, laterals, and rhotics. The program also provides an
interface for entering and modifying arbitrary vocal tract geometry
configurations, which suits research applications.
<Mode> Nasal
<typeOfBranch> PharynxConnectingBranch
<numOfBranches> 2
<numOfSections> 20
<secLen> 0.3968 0.3968 0.3968 0.3968 0.3968 0.3968 ...
<secArea> 0.2600 0.2400 0.1700 0.2100 0.1500 0.3600 ...
<typeOfBranch> OralTerminalBranch
<numOfBranches> 0
<numOfSections> 21
<secLen> 0.3968 0.3968 0.3968 0.3968 0.3968 0.3968 ...
<secArea> 2.3000 1.9300 1.7700 0.9600 0.8900 1.2200 ...
<typeOfBranch> NasalCommonConnectingBranch
<numOfBranches> 2
<numOfSections> 5
<secLen> 0.4087 0.4087 0.4087 0.4087 0.4087 …
<secArea> 1.2600 2.7423 3.5236 3.8665 4.3199 …
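The tagged listing above can be read with a few lines of code. The following is a minimal Python sketch, not VTAR's own loader: the function names `parse_vtar_config` and `check_branch` are hypothetical, and `SAMPLE` is a shortened, made-up two-branch file that only borrows the tag names and a few leading values from the listing. It assumes each `<typeOfBranch>` line opens a new branch and that the tags following it belong to that branch.

```python
SAMPLE = """\
<Mode> Nasal
<typeOfBranch> PharynxConnectingBranch
<numOfBranches> 2
<numOfSections> 3
<secLen> 0.3968 0.3968 0.3968
<secArea> 0.2600 0.2400 0.1700
<typeOfBranch> OralTerminalBranch
<numOfBranches> 0
<numOfSections> 2
<secLen> 0.3968 0.3968
<secArea> 2.3000 1.9300
"""  # hypothetical, shortened sample (not a complete VTAR file)


def parse_vtar_config(text):
    """Parse '<tag> value ...' lines into a mode string and a list of branch dicts."""
    mode = None
    branches = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line.startswith("<"):
            continue  # skip blank or unrecognized lines
        tag, _, rest = line[1:].partition(">")
        values = rest.split()
        if tag == "Mode":
            mode = values[0]
        elif tag == "typeOfBranch":
            # Assumption: every typeOfBranch line starts a new branch record.
            branches.append({"type": values[0]})
        elif tag in ("numOfBranches", "numOfSections"):
            branches[-1][tag] = int(values[0])
        elif tag in ("secLen", "secArea"):
            branches[-1][tag] = [float(v) for v in values]
    return mode, branches


def check_branch(branch):
    """True when both per-section lists have numOfSections entries."""
    n = branch["numOfSections"]
    return len(branch["secLen"]) == n and len(branch["secArea"]) == n


mode, branches = parse_vtar_config(SAMPLE)
for branch in branches:
    print(mode, branch["type"], branch["numOfSections"], check_branch(branch))
```

The consistency check is useful because each branch declares its section count separately from the length and area lists, so a truncated list can be caught before any acoustic calculation.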
MODELING METHODS
[Diagram labels: VTAR; Volume velocity; Transfer function]
• Frequency-domain formulation
• Vocal tract modeled as concatenation of various modules (such
as single tube, branching, and lateral channels).
• For each module the input and output pressure and volume
velocities are related by a transfer matrix.
\[
\begin{bmatrix} p_{in} \\ U_{in} \end{bmatrix}
= K \begin{bmatrix} p_{out} \\ U_{out} \end{bmatrix}
= \begin{bmatrix} A & B \\ C & D \end{bmatrix}
\begin{bmatrix} p_{out} \\ U_{out} \end{bmatrix}
\]
VOWEL SOUND
GENERIC AREA FUNCTION OF /r/ AND /l/ FROM MRI IMAGE
<typeOfBranch> NasalRightTerminalBranch
<numOfBranches> 0
<numOfSections> 20
<secLen> 0.4063 0.4063 0.4063 0.4063 0.4063 0.4063 ...
<secArea> 1.6855 1.8696 1.4279 1.7885 1.9956 1.8020 ...
SIMULATED SPECTRUM OF /l/ FROM VTAR AND REAL SPECTRUM FROM ACOUSTIC DATA
• The coefficients of the matrices are calculated based on a
transmission-line model and the chain matrix.
• Includes losses due to flow viscosity, heat conduction, and
vocal-tract wall vibration.
• Pressure and volume velocities at the glottis and lips can be
related by a simple matrix and the vocal tract transfer function
can be calculated.
\[
\begin{bmatrix} p_{g} \\ U_{g} \end{bmatrix}
= K \begin{bmatrix} p_{l} \\ U_{l} \end{bmatrix}
= \begin{bmatrix} A & B \\ C & D \end{bmatrix}
\begin{bmatrix} p_{l} \\ U_{l} \end{bmatrix}
\]
\[
20 \log_{10} \left| \frac{U_{l}}{U_{g}} \right|
= 20 \log_{10} \left| \frac{1}{C Z_{l} + D} \right|
\]
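The matrix relations above can be illustrated for a plain tube made of concatenated sections. The following is a minimal Python sketch under stated assumptions, not VTAR's implementation: it uses the lossless chain matrix of a uniform tube (VTAR additionally models viscous, thermal, and wall losses), the constants `RHO` and `C_SOUND` are assumed CGS values, and `z_lips` stands for the load impedance Z_l, set to zero (ideal open end) by default. All function names are hypothetical.

```python
import cmath
import math

RHO = 1.14e-3     # air density in g/cm^3 (assumed value)
C_SOUND = 3.5e4   # speed of sound in cm/s (assumed value)


def section_matrix(length, area, freq):
    """Chain matrix (A, B, C, D) of one lossless uniform tube section."""
    k = 2.0 * math.pi * freq / C_SOUND      # wavenumber
    z0 = RHO * C_SOUND / area               # characteristic impedance
    cos_kl = cmath.cos(k * length)
    sin_kl = cmath.sin(k * length)
    return (cos_kl, 1j * z0 * sin_kl, 1j * sin_kl / z0, cos_kl)


def chain(m1, m2):
    """2x2 matrix product m1 * m2 for matrices stored as (A, B, C, D)."""
    a1, b1, c1, d1 = m1
    a2, b2, c2, d2 = m2
    return (a1 * a2 + b1 * c2, a1 * b2 + b1 * d2,
            c1 * a2 + d1 * c2, c1 * b2 + d1 * d2)


def tract_matrix(sec_len, sec_area, freq):
    """Overall glottis-to-lips matrix K; sections are listed glottis first."""
    total = (1.0, 0.0, 0.0, 1.0)  # identity matrix
    for length, area in zip(sec_len, sec_area):
        total = chain(total, section_matrix(length, area, freq))
    return total


def gain_db(sec_len, sec_area, freq, z_lips=0.0):
    """20*log10|U_l / U_g| = 20*log10|1 / (C*Z_l + D)|."""
    _, _, c, d = tract_matrix(sec_len, sec_area, freq)
    return 20.0 * math.log10(abs(1.0 / (c * z_lips + d)))


# Uniform 17.5 cm tube split into ten sections (illustrative values only).
lengths = [1.75] * 10
areas = [3.0] * 10
for f in (250.0, 490.0, 750.0):
    print(f, round(gain_db(lengths, areas, f), 2))
```

With no losses and an ideal open end the response is unbounded exactly at a resonance, so the example avoids evaluating there; a lossy model or a nonzero `z_lips` keeps the peaks finite.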
<typeOfBranch> NasalLeftTerminalBranch
<numOfBranches> 0
<numOfSections> 21
<secLen> 0.3750 0.3750 0.3750 0.3750 0.3750 0.3750 ...
<secArea> 2.1411 1.7442 1.4429 1.6214 2.0820 2.5039 ...
RHOTIC SOUND /r/
NASAL SOUND
[Plot axes: Power Spectrum (dB) versus Frequency (Hz), 0 to 6000 Hz; diagram label: Cross-sectional area functions]
ARBITRARY AREA FUNCTION CONFIGURATION FOR NASAL SOUND (example format: the <Mode> Nasal listing and its <typeOfBranch> blocks)
MAIN FEATURES
• Acoustic response calculation for different kinds of sound and
different vocal tract configurations
• Formant and bandwidth calculation from acoustic response
• Different models for the same sound
• Generic area function for different kinds of sounds
• User-friendly interface for area function input and manipulation
• Arbitrary area function input
• Fluid and wall property setting
• Saving and loading of settings and calculation results
• Markers for measuring data in plots
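One common way to read formants and bandwidths off a sampled acoustic response is peak picking with half-power (about 3 dB) bandwidths. The poster does not say which method VTAR uses, so the following Python sketch is only an illustration of that generic technique; `find_formants`, `_crossing`, and the synthetic single-resonance response `resonance_db` are hypothetical names introduced here.

```python
import math


def find_formants(freqs, level_db):
    """Return (peak_frequency, bandwidth) pairs from a sampled dB response."""
    drop = 10.0 * math.log10(2.0)  # half-power point, about 3.01 dB
    results = []
    for i in range(1, len(level_db) - 1):
        if not (level_db[i - 1] < level_db[i] >= level_db[i + 1]):
            continue  # not a local maximum
        target = level_db[i] - drop
        lo = _crossing(freqs, level_db, i, -1, target)
        hi = _crossing(freqs, level_db, i, +1, target)
        bandwidth = hi - lo if lo is not None and hi is not None else None
        results.append((freqs[i], bandwidth))
    return results


def _crossing(freqs, level_db, start, step, target):
    """Walk away from a peak until the level falls to target; interpolate linearly."""
    j = start
    while 0 <= j + step < len(level_db):
        nxt = j + step
        if level_db[nxt] <= target:
            frac = (level_db[j] - target) / (level_db[j] - level_db[nxt])
            return freqs[j] + frac * (freqs[nxt] - freqs[j])
        j = nxt
    return None  # response never dropped by 3 dB on this side


def resonance_db(f, center=500.0, bw=50.0):
    """Synthetic second-order resonance used only to exercise the sketch."""
    x = f / center
    q = center / bw
    return -10.0 * math.log10((1.0 - x * x) ** 2 + (x / q) ** 2)


freqs = [float(f) for f in range(100, 1001)]   # 1 Hz grid
levels = [resonance_db(f) for f in freqs]
for peak, width in find_formants(freqs, levels):
    print(peak, width)
```

The sketch reports the grid frequency of each peak, so its resolution is limited by the frequency step, and closely spaced formants whose skirts never fall 3 dB below the peak get a bandwidth of `None` or a merged estimate.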
FREE DOWNLOAD AVAILABLE SOON
http://www.isr.umd.edu/labs/SCL/vtar
REFERENCE
1. Zhang, Z. and Espy-Wilson, C., "A vocal tract model of American
English /l/," J. Acoust. Soc. Am. 115(3), pp. 1274-1280, March 2004.
ACKNOWLEDGMENTS
This work was supported by NIH Grant 1 R01 DC05250-01.