The following article appeared in Journal of the Acoustical Society of America 88: 97–100 and may be found at http://scitation.aip.org/content/asa/journal/jasa/88/1/10.1121/1.399849. Copyright (1990) Acoustical Society of America. This article may be downloaded for personal use only. Any other use requires prior permission of the author and the Acoustical Society of America.

J. Acoust. Soc. Am. 88 (1), July 1990. 0001-4966/90/070097-04$00.80. © 1990 Acoustical Society of America. Redistribution subject to ASA license or copyright; see http://acousticalsociety.org/content/terms. Download to IP: 130.237.171.143 On: Wed, 19 Nov 2014 11:19:40

Analytical expressions for the tonotopic sensory scale

Hartmut Traunmüller
Institutionen för lingvistik, Stockholms universitet, S-106 91 Stockholm, Sweden

(Received 16 August 1989; accepted for publication 20 February 1990)

Accuracy and simplicity of analytical expressions for the relations between frequency and critical bandwidth as well as critical-band rate (in Bark) are assessed for the purpose of applications in speech perception research and in speech technology. The equivalent rectangular bandwidth (ERB) is seen as a measure of frequency resolution, while the classical critical-band rate is considered a measure of tonotopic position. For the conversion of frequency to critical-band rate, and vice versa, the invertible formula $z = [26.81/(1 + 1960/f)] - 0.53$ is proposed. Within the frequency range of the perceptually essential vowel formants (0.2–6.7 kHz), it agrees to within ±0.05 Bark with the Bark scale, originally published in the form of a table.

PACS numbers: 43.71.Cq, 43.72.Ar, 43.66.Fe

INTRODUCTION

Two processes are generally assumed to contribute to auditory frequency resolution. First, the hearing system is capable of performing an "oscillographic" analysis of the set of neural signals originating in the cochlea. This process is limited to frequencies that can be resolved in the pattern of neural responses. While single neurons are not likely to fire more frequently than 500 times per second even at high stimulus intensities, frequencies between 0.5 and 1.5 kHz can still be handled in the temporal domain, albeit less efficiently, on the basis of the signals from a large number of neurons. The capability and limitations of a frequency analysis in the temporal domain are demonstrated vividly by cochlear implant patients whose sole auditory input is an undifferentiated electrical stimulation of the auditory nerve.

The second process covers the whole auditory frequency range. Any sound entering a normally functioning cochlea is subject to a spectral analysis, resulting in a frequency-to-place transformation. The cochlea can be regarded as a bank of filters whose outputs are ordered tonotopically, with the filters closest to the base responding maximally to the highest frequencies. The tonotopic order is known to be maintained in the structure of the neural network at higher levels in the hearing system.

The "notch-noise method" has often been used in investigations of auditory frequency selectivity. It involves the determination of the detection threshold for a sinusoid, centered in a spectral notch of a noise, as a function of the width of the notch. On the basis of results obtained with this method, auditory frequency selectivity can be described in terms of the equivalent rectangular bandwidth (ERB) as a function of center frequency (Moore and Glasberg, 1983). Since the two processes mentioned above both contribute to the detection of the sinusoid, the ERB, or ERB rate, should not be taken as a measure of the tonotopic scale as such.

A quantity related to the ERB, though not identical with it, is the classical critical bandwidth (CB) (Zwicker et al., 1957). Measurement of the CB typically involves loudness summation experiments. Different summation rules have been found to hold for auditory stimuli, depending on whether their frequency components are separated by more or less than the CB. The CB and the ERB have been found to be proportional and equivalent for center frequencies above 500 Hz. For lower frequencies, there is a discrepancy, as shown in Fig. 1. In this range, the ERB decreases with decreasing center frequency, while the CB remains close to constant. The discrepancy can be explained by the reasonable assumption that the analysis within the temporal domain is irrelevant to loudness summation as long as loudness variations are not audible as such, while it contributes substantially to frequency resolution for f < 500 Hz. Consequently, the CB should not be taken as a measure of frequency resolution, but CB rate may be taken as a measure of the tonotopic sensory scale.

FIG. 1. Equivalent rectangular bandwidth, according to the formula $B = 6.23\times10^{-6} f^2 + 9.339\times10^{-2} f + 28.52$, given by Moore and Glasberg (1983) (curve), and critical bandwidth, according to Zwicker's (1961) table (marks), as a function of frequency.

In the familiar CB-rate scale (see Fig. 2), the CB has been chosen to serve as a natural unit of the tonotopic sensory scale. Standard values for the relation between frequency f and CB rate z have been proposed by Zwicker (1961) in the form of a table. The CB-rate scale has been applied extensively in research on psychoacoustics and speech perception. For most of these applications, it would be more convenient to have the relation between z and f specified in the form of an equation instead of a table. Several equations that approximate the tabulated values have also been published (Tjomov, 1971; Schroeder, 1977; Zwicker and Terhardt, 1980; Traunmüller, 1983). In the following, the error functions of these equations will be compared.

FIG. 2. Critical-band rate z as a function of frequency f. The plus sign (+) represents data from Zwicker (1961). The curve corresponds to Eq. (6).

Recent studies of speech sounds suggest that the tonotopic distances (CB-rate differences) between prominent peaks in their spectra are fundamental to the perception of their phonetic quality. More specifically, it has been suggested that the spectral peaks shaped by the formants and the fundamental have the same relative tonotopic locations in linguistically identical vowels uttered by speakers different in age and sex (Traunmüller, 1983, 1988; Syrdal and Gopal, 1986). While differences in speaker size appear to be reflected in a tonotopic translation of the spectral peaks, differences in vocal effort appear to be reflected in a linear tonotopic compression/expansion (Traunmüller, 1988). In order to test these hypotheses, both in theory and by means of speech synthesis, a convenient and accurate method of conversion from frequency to CB rate, and vice versa, is needed. Our requirements include that the function have a simple inverse and that it be accurate preferably to within ±0.05 Bark in the range of essential vowel formant frequencies of men, women, and children. This rigorous accuracy requirement prevents the introduction of any avoidable error in addition to that inherent in the table (Zwicker, 1961). However, it should be noticed that the absolute width of the critical band, and its definition, is irrelevant to the applications we have in mind, as long as the obtained scales remain proportional.

I. ANALYTICAL EXPRESSIONS

A. Expressions for critical-band rate

In rough approximation, the relation between f and z is linear for f < 500 Hz (z = f/100) and logarithmic for higher frequencies. Figure 3(a) shows the error functions of two logarithmic approximations to the CB scale. One of these, Eq. (1), has been suggested by Zwicker and Terhardt (1980). It gives values that agree with the tabulated ones to within ±0.25 Bark in the range 0.6 < f < 7.2 kHz. The other approximation, Eq. (2), satisfies our stricter standards of no more than ±0.05-Bark deviation at the cost of a reduction in the range of validity, to 1.0 < f < 3.6 kHz:

$$z = 14.2 \log_{10}(f/1000) + 8.7, \tag{1}$$

$$z = 6.578 \ln(f) - 36.99. \tag{2}$$

In these and in all the following equations, frequency f is to be expressed in Hz and CB rate z in CB units (Bark).

A mathematical function that is linear at one extreme and logarithmic at the other extreme, the sinus-hyperbolicus (hyperbolic sine) function, has been used by Tjomov (1971), Eq. (3), and by Schroeder (1977), Eq. (4), to calculate CB rate. The error functions of both equations are shown in Fig. 3(b).

$$f = 600 \sinh(z/6.7) + 20, \tag{3}$$

$$z = 6.7 \ln\left\{ \frac{f - 20}{600} + \left[ \left( \frac{f - 20}{600} \right)^{2} + 1 \right]^{1/2} \right\} \quad \text{(inverse)},$$

$$f = 650 \sinh(z/7), \tag{4}$$

$$z = 7 \ln\left\{ \frac{f}{650} + \left[ \left( \frac{f}{650} \right)^{2} + 1 \right]^{1/2} \right\} \quad \text{(inverse)}.$$

As compared with the tabulated values, Tjomov's equation (3) is accurate to within +0.03 to −0.28 Bark for f < 4.5 kHz and Schroeder's equation (4) to within ±0.13 Bark for f < 4.0 kHz. These equations are accurate enough for some applications in which frequency components above 4 kHz may be neglected, as they are in some systems of telephonic communication.

Approximations covering the whole auditory frequency range can be achieved in various ways by appropriate combinations of mathematical functions. For the most part, however, this yields equations that lack a simple inverse. The most accurate of the equations given by Zwicker and Terhardt (1980),

$$z = 13 \arctan(0.00076 f) + 3.5 \arctan\left[ (f/7500)^{2} \right], \tag{5}$$

is of this kind.
It agrees with the table to within +0.20 to −0.25 Bark over the whole range of auditory perception [see Fig. 3(c)]. The waviness of the error function tells us, however, that there is room for improvement. The equation also clearly falls short of our standards. If, e.g., we want to compare the tonotopic distances between two pairs of spectral peaks, we might obtain an error of up to 0.9 Bark.

FIG. 3. (a)–(d) Error functions of various approximations of the CB-rate scale. The error is defined as the difference between the calculated value and that in Zwicker's (1961) table. It is plotted in steps of 0.5 Bark for each frequency value in that table. (a) Logarithmic approximations: curve with marks, Eq. (1) [given by Zwicker and Terhardt (1980)]; curve without marks, Eq. (2). (b) Sinus-hyperbolicus approximations: lower curve, Eq. (3) [given by Tjomov (1971)]; upper curve, Eq. (4) [given by Schroeder (1977)]. (c) An overall approximation, Eq. (5), given by Zwicker and Terhardt (1980). (d) A logistic "growth-curve" approximation: lower curve with error scale at the left, Eq. (6) [given by Traunmüller (1983)]; upper curve, shown vertically displaced, with error scale at the right, Eq. (6) with corrections (7) and (8). [Each panel is plotted against CB rate z (table value), with frequency f (kHz) on the upper axis.]

An approximation that has a simple inverse and meets our standards is achieved by considering z to be related to log(f) by a logistic function, also known as "growth curve." Such an approximation, Eq. (6), has been proposed by Traunmüller (1983). Its error function is shown in Fig. 3(d):

$$z = \frac{26.81 f}{1960 + f} - 0.53, \tag{6}$$

$$f = \frac{1960 (z + 0.53)}{26.28 - z} \quad \text{(inverse)}.$$

The values obtained with Eq. (6) deviate from the tabulated ones by less than ±0.05 Bark for 0.2 < f < 6.7 kHz. At the low-frequency end of the scale, the deviation from the table (Zwicker, 1961) sums up to −0.53 Bark for f = 0 Hz (−0.26 Bark for f = 20 Hz). At least in part, this deviation is due to biased rounding of the bandwidth values in Zwicker's table. For frequencies below 400 Hz, the standard width of the critical band was set uniformly equal to 100 Hz. This appears to have been done in order to obtain the mnemonically simple relation z = f/100. The original bandwidth data (Zwicker et al., 1957) indicate B ≈ 90 Hz for the lower frequencies in that range. The values listed in the table for f < 100 Hz are particularly questionable because they can hardly be said to be based on any reliable experimental evidence. Equation (6) may represent the tonotopic scale well enough down to the lowest frequencies for which it can be determined experimentally. The deviation at the high-frequency end of the scale remains unaccounted for.

Calculating z with Eq. (6), close agreement with the table can be achieved over the whole auditory frequency range by added corrections, bending the error function straight at both ends of the scale, in the following way:

for calculated z < 2.0 Bark:

$$z' = z + 0.15 (2 - z), \tag{7}$$

for calculated z > 20.1 Bark:

$$z' = z + 0.22 (z - 20.1). \tag{8}$$

Since this is an easily inverted procedure, the calculation of f for a given z is not a problem. The error function obtained with these corrections is also shown in Fig. 3(d). The values calculated in this way agree with the table for f > 100 Hz to within ±0.05 Bark. Correction (7), however, simulates also the above-mentioned bias at low frequencies.

B. Expressions for critical bandwidth

Zwicker and Terhardt (1980) proposed the equation

$$B = 25 + 75 \left( 1 + 1.4\times10^{-6} f^{2} \right)^{0.69} \tag{9}$$

to calculate critical bandwidth B as a function of center frequency f. While Eq. (9) is very accurate, it cannot easily be integrated to obtain CB rate. The authors' equation for CB rate (5) is not compatible with Eq. (9). Proceeding from Eq. (6), critical bandwidths B can be calculated as

$$B = \frac{52548}{z^{2} - 52.56 z + 690.39} \tag{10}$$

for critical bands centered at z obtained by Eq. (6) without corrections. The values calculated by Eq. (10) agree with Zwicker's table to within ±6% for 0.27 < f < 5.8 kHz. Within that range, the error function is similar to that obtained by Eq. (9). The error functions of both equations are shown in Fig. 4.

FIG. 4. Error functions for critical bandwidth calculated with Eq. (9) (curve with marks) and Eq. (10) (curve without marks), as compared with Zwicker's (1961) table values (see also Fig. 1). [Plotted against CB rate z (table value), with frequency f (kHz) on the upper axis.]

ACKNOWLEDGMENT

The preparation of this paper has been supported by a grant from HSFR, the Swedish Council for Research in the Humanities and Social Sciences.

Moore, B. C. J., and Glasberg, B. R. (1983). "Suggested formulae for calculating auditory-filter bandwidths and excitation patterns," J. Acoust. Soc. Am. 74, 750–753.

Schroeder, M. R. (1977). "Recognition of complex acoustic signals," in Life Sciences Research Report 5 (Dahlem Konferenzen), edited by T. H. Bullock (Abakon Verlag, Berlin), pp. 323–328.

Syrdal, A. K., and Gopal, H. S. (1986). "A perceptual model of vowel recognition based on the auditory representation of American English vowels," J. Acoust. Soc. Am. 79, 1086–1100.

Tjomov, V. L. (1971). "A model to describe the results of psychoacoustical experiments on steady-state stimuli," in Analiz Rechevykh Signalov Chelovekom, edited by G. V. Gershuni (Nauka, Leningrad), pp. 36–49.

Traunmüller, H. (1983). "On vowels: Perception of spectral features, related aspects of production and sociophonetic dimensions," Ph.D. thesis, University of Stockholm.

Traunmüller, H. (1988). "Paralinguistic variation and invariance in the characteristic frequencies of vowels," Phonetica 45, 1–29.

Zwicker, E. (1961). "Subdivision of the audible frequency range into critical bands (Frequenzgruppen)," J. Acoust. Soc. Am. 33, 248.

Zwicker, E., Flottorp, G., and Stevens, S. S. (1957). "Critical bandwidth in loudness summation," J. Acoust. Soc. Am. 29, 548–557.

Zwicker, E., and Terhardt, E. (1980). "Analytical expressions for critical-band rate and critical bandwidth as a function of frequency," J. Acoust. Soc. Am. 68, 1523–1524.