Perception of aperiodicities in synthetically generated voices

James Hillenbrand a)
RIT Research Corporation and Department of Computer Science, Rochester Institute of Technology, 75 Highpower Road, Rochester, New York 14623-3435

(Received 2 December 1987; accepted for publication 15 February 1988)

The purpose of this study was to investigate univariate relationships between perceived dysphonia and variation in pitch perturbation, amplitude perturbation, and additive noise. A time-domain, pitch-synchronous synthesis technique was used to generate sustained vowels varying in each of the three acoustic dimensions. A panel of trained listeners provided direct magnitude estimates of roughness in the case of the stimuli varying in pitch and amplitude perturbation, and breathiness in the case of the stimuli varying in additive noise. Very strong relationships were found between perceived roughness and either pitch or amplitude perturbation. However, unlike results reported previously for nonspeech stimuli, the subjective quality associated with pitch perturbation was quite different from that associated with amplitude perturbation. Results also showed that perceived roughness was affected not only by the amount of perturbation, but also by the degree of correlation between adjacent pitch or amplitude values. A strong relationship was found between perceived breathiness and signal-to-noise ratio. Contrary to previous findings, there was no interaction between signal-to-noise ratio and the amount of high-frequency energy in the periodic component of the stimulus: Stimuli with similar signal-to-noise ratios received similar ratings, regardless of differences in the spectral slope of the periodic component.

PACS numbers: 43.70.Dn, 43.71.Gv, 43.72.Ja

a) Present address: Department of Speech Pathology and Audiology, Western Michigan University, Kalamazoo, MI 49008.

J. Acoust. Soc. Am. 83 (6), June 1988. 0001-4966/88/062361-11$00.80. © 1988 Acoustical Society of America.

INTRODUCTION

There is very good agreement among voice clinicians and voice scientists of the need for objective measurements that would serve to quantify both the type and severity of dysphonia that accompanies a wide range of laryngeal disorders (e.g., Aronson, 1980; Davis, 1981; Fritzell et al., 1977; Hammarberg et al., 1980; Hirano, 1981; Jensen, 1965). In an effort to achieve this goal, a number of investigators have made detailed acoustic measurements of a wide variety of acoustic properties associated with laryngeal disorders. Many of the acoustic techniques that have been investigated are based on the finding that dysphonic voices tend to show larger than normal deviations from perfect periodicity. As a number of investigators have pointed out, however, acoustic measurements are useful only if they can be related to specific diagnostic categories, or to meaningful perceptual dimensions. For example, Hammarberg et al. (1980) commented that "... acoustic measurements do not make sense on their own ... [but] must be related to perceptual characteristics in order to be clinically useful" (p. 32).

The task of relating acoustic properties to perceptual dimensions has not proven to be a simple one. The most common approach to this problem has involved the use of correlational techniques to determine relationships between the perceived degree and/or type of dysphonia and variation in specific acoustic dimensions (e.g., Kempster, 1984; Kempster and Kistler, 1984; Kojima et al., 1980; Murry et al., 1977; Prosek et al., 1984; Smith et al., 1978; Yanagihara, 1967; Yumoto et al., 1982). As will be discussed in greater detail below, the interpretability of these studies has been limited by the inability to control variation in specific acoustic dimensions in naturally occurring voices. Of particular importance are the inability to control the range of variation on particular acoustic dimensions and the degree of intercorrelation among individual acoustic properties.

The present study represents an attempt to address this problem by studying the perceptual characteristics of synthetically generated voices. The three parameters chosen for study were pitch perturbation, amplitude perturbation, and additive noise. The long-term goal of this work is to learn how these and other acoustic parameters combine to affect a listener's overall impression of vocal quality. As an initial step toward this goal, the present study was designed to examine the univariate relationships between variation in each of these acoustic dimensions and the perception of dysphonic vocal quality.

A. Pitch perturbation

Pitch perturbation, or "vocal jitter," is defined as cycle-to-cycle variation in voice fundamental frequency (F0). All human voices contain a certain amount of vocal jitter, and synthesis studies have shown that a minimum amount of jitter is required for a voice to sound natural (Gill, 1961; Holmes, 1962; Rozsypal and Millar, 1979; Schroeder, 1961). Jitter values in normal voices are generally less than about 1.0% (Hollien et al., 1973; Horii, 1979, 1982; Jacob, 1968; Kempster, 1984; Simon, 1927). Lieberman (1963) was the first to report that dysphonic voices tend to show unusually large cycle-to-cycle variations in F0. This basic finding has been confirmed in several studies using a wide variety of analysis and computational procedures (Davis, 1976; Deal and Emanuel, 1978; Hecker and Kruel, 1971; Kitajima et al., 1975; Koike, 1973; see Murry and Doherty, 1980, for a negative finding).

Jitter does not appear to be a factor controlling perceived roughness in naturally produced normal voices (Heiberger and Horii, 1982), but there is some evidence suggesting that jitter is correlated with perceived roughness in disordered voices. Weak- to moderate-strength correlations between pitch perturbation and perceived roughness were reported in studies of disordered speakers by Deal and Emanuel (1978), Lieberman (1963), and Takahashi and Koike (1975). Smith et al. (1978) reported a relatively weak (r = 0.55) nonsignificant correlation between jitter and perceived roughness in a group of esophageal speakers.

As mentioned previously, two important interpretive limitations of these studies using naturally produced voices concern the inability to control jitter values without affecting values on other dimensions and the inability to control the range of variation on any of the acoustic dimensions. The first of these problems is especially important because several acoustic measurement studies of disordered voices have reported significant intercorrelations among individual acoustic parameters such as pitch perturbation, amplitude perturbation, and additive noise (Davis, 1976; Deal and Emanuel, 1978; Heiberger and Horii, 1982; Horii, 1980; Kempster, 1984; Kempster and Kistler, 1984; Yumoto et al., 1984). Further, a recent methodological study suggests that the acoustic analysis techniques that have been used to measure perturbation are not always able to discriminate among various sources of aperiodicity (Hillenbrand, 1987).

Because of the difficulty in interpreting the results of perception studies using naturally produced voices, several studies have examined the perception of synthetically generated signals. Wendahl (1963, 1966a,b) synthesized sawtooth waves varying in jitter and mean F0. Results of paired comparison listening tests showed that roughness judgments correlated strongly with pitch perturbation. Wendahl's results also showed that, for a given jitter size, a signal with a low F0 tended to sound more rough than a signal with a high F0 (see, also, Coleman, 1969), a finding which has also been reported for naturally produced voices (Deal and Emanuel, 1978; Heiberger and Horii, 1982). Heiberger and Horii (1982) also reported a strong correlation between jitter and perceived roughness using synthesized triangular waves. To date, no study has examined the relationship between jitter and perceived roughness in synthetically generated voice signals.

B. Amplitude perturbation

Amplitude perturbation, or "vocal shimmer," is defined as cycle-to-cycle variation in voice amplitude. Shimmer values in normal voices are generally less than about 0.7 dB (Horii, 1980, 1982; Kempster, 1984; Robbins, 1981). Using a calculation method based on successive differences from a three-point moving average, Kitajima and Gould (1976) reported that amplitude perturbation values from a group of dysphonic subjects were significantly larger than those for a nondisordered control group. Similar findings were reported by Davis (1981) using slightly different calculation methods.

Relatively little is known about the relationship between amplitude perturbation and the perception of dysphonic vocal quality in naturally produced voices. Takahashi and Koike (1975) reported that "breathiness" ratings correlated with amplitude perturbation and "roughness" ratings correlated both with pitch perturbation (r = 0.55) and amplitude perturbation (r = 0.72). Deal and Emanuel (1978) made nonsequential measures of pitch and amplitude perturbation from a group of normal speakers who were asked to simulate rough voice quality and from a group of speakers with various laryngeal pathologies. Results showed significant correlations between listener ratings of roughness and measures of both pitch and amplitude perturbation. On the basis of multiple regression analyses, Deal and Emanuel concluded that, "... cyclic peak amplitude variability may provide a better index of perceived roughness than cyclic period variation ..." (p. 250). A reanalysis of these results by Nichols (1979) reached the same conclusion using partial correlation techniques.

Studies by Wendahl (1966a,b) and Heiberger and Horii (1982) reported strong relationships between amplitude perturbation and perceived roughness in synthetically generated nonspeech signals. Wendahl reported that the roughness percept resulting from the introduction of amplitude perturbation in sawtooth waveforms was very similar to that associated with pitch perturbation: "... it is interesting to note that the roughness generated by these different procedures [i.e., pitch and amplitude perturbation] results in such similar auditory experiences. Some highly trained listeners were able to distinguish between the two types of stimuli, but the writer, who has had years of listening experience with such stimuli, is able to discriminate between the program types only at the extremes of the continuum ..." (Wendahl, 1966b, p. 106).

Heiberger and Horii studied perceptual trading relations between pitch and amplitude perturbation by synthesizing triangular waves varying in jitter, shimmer, or both jitter and shimmer. The results suggested that the perceptual effects of jitter and shimmer are, in some sense, equivalent. For example, a stimulus with 2.0% jitter was judged to be approximately equivalent in roughness to a 1.0-dB shimmer stimulus. The results also suggested that the effects of jitter and shimmer are additive; for example, a stimulus containing both 2.0% jitter and 1.5-dB shimmer sounded more rough than either a 2.0% jitter stimulus or a 1.5-dB shimmer stimulus.

C. Additive noise

The term additive noise is generally used to refer to the acoustic by-product of turbulence generated at the glottis. A number of studies have reported that noise levels in dysphonic voices tend to be higher than those in normal voices and that noise measurements correlate with subjective ratings of dysphonia (Deal and Emanuel, 1978; Emanuel and Sansone, 1969; Kojima et al., 1980; Lively and Emanuel, 1970; Sansone and Emanuel, 1970; Yanagihara, 1967; Yumoto et al., 1982, 1984). A straightforward interpretation of the perceptual effects of additive noise is complicated by the presence of relatively strong intercorrelations among measures of perturbation and additive noise (Deal and Emanuel, 1978; Kempster, 1984; Kempster and Kistler, 1984; Yumoto et al., 1984) and by the presence of very strong measurement interactions among these variables (Hillenbrand, 1987).

Very little work has been done on the synthesis and perception of stimuli varying in additive noise. In a study that is described very briefly, Yanagihara (1967) mixed filtered and unfiltered naturally produced sustained vowels with various types of bandpass filtered noise. The signal-to-noise ratios were held constant and the primary purpose of the study was to determine the relationship between perceived dysphonia and the spectral properties of the periodic and aperiodic components of the stimuli. Yanagihara reported a strong relationship between the loss of high-frequency harmonics and perceived dysphonia: "Even if the relative intensity of the noise components and the harmonic components remain unchanged, the loss of high-frequency harmonics results in an increase of the degree of perceived dysphonia" (p. 538). Yanagihara's results are consistent with Froekjaer-Jensen and Prytz (1976), who reported an increase in high-frequency energy in long-term average spectrum measurements following treatment for voice disorders (see, also, Hammarberg et al., 1980; Fritzell et al., 1977; Gauffin and Sundberg, 1977).

The present study was designed to extend the work of Wendahl (1963, 1966a,b) and Heiberger and Horii (1982) in studying the relations between perturbation and perceived roughness in synthetically generated signals and to extend the work of Yanagihara (1967) in studying the perceptual effects of additive noise. The experiments on the perception of perturbation were designed primarily to address two limitations of previous research on jitter and shimmer synthesis. First, the present study was designed to examine roughness-perturbation relations in synthetically generated voices, rather than the nonspeech waveforms used in the studies by Wendahl and Heiberger and Horii. Second, unlike the previous perturbation synthesis studies, the present study used synthesis techniques that attempted to model the sequential properties of cycle-to-cycle pitch and amplitude change in naturally produced voices. The purpose of the additive noise experiments was to examine the relationship between noise level and perceived breathiness over a wide range of signal-to-noise ratios and to test for possible interactions between signal-to-noise ratio and energy levels in high-frequency harmonics over a broad range of signal-to-noise ratios.

I. EXPERIMENT 1

A. Methods

1. Synthesis technique

Stimuli for all the experiments described in this article were generated with a pitch-synchronous synthesis program called VSYN (Wilde and Martens, 1985; modeled after Wilde et al., 1986). The program was designed to generate sustained vowels differing in jitter, shimmer, additive noise, and mean F0. The first step in the synthesis process involved using Klatt's (1980) formant synthesizer to generate a single pitch pulse with formant frequency characteristics appropriate for whatever vowel quality is desired. Stimuli for the present study used a formant frequency pattern appropriate for the vowel [a] (F1 = 720, F2 = 1240, F3 = 2400, F4 = 3300, F5 = 3700). The pitch pulse was generated with a 40-kHz sample frequency, 12 bits of amplitude resolution, and consisted of 512 data points (12.8 ms). As shown in Fig. 1, sustained vowels were synthesized by stringing together the individual damped oscillations produced by the Klatt synthesizer. The F0 was controlled by adjusting the interval between the onset of one damped oscillation and the onset of the next oscillation. For fundamental periods that are less than 12.8 ms, the end of one damped oscillation will overlap with the beginning of the next. This effect was accounted for simply by adding the tail end of one damped oscillation into the beginning of the next. In Fig. 1, the onset-to-onset interval was fixed at 8 ms, producing a vowel with a constant F0 of 125 Hz. Pitch perturbation was controlled by introducing specific amounts of variability in the onset-to-onset intervals. Although not used in experiment 1, VSYN controls amplitude perturbation by scaling each pitch pulse individually to achieve the desired amount of amplitude variability. Additive noise can be controlled by the appropriate scaling and point-for-point addition of a separate noise signal.

Using a method such as this to synthesize stimuli that differ only in pitch perturbation is problematic since, as discussed in detail in a related article (Hillenbrand, 1987), amplitude perturbation is produced as a side effect of pitch perturbation. For the stimuli that were intended to differ only in pitch perturbation, this artifact was removed by a separate program that measured the intensity of individual pitch pulses and scaled all pitch pulses to the same rms value.

2. Random-number generation

A random-number generator of some type is needed to produce the sequence of F0 and/or pitch-pulse amplitude values that control the synthesizer. The random-number

FIG. 1. Pitch-synchronous, time-domain synthesis technique used by VSYN.
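The overlap-and-add scheme described under "Synthesis technique" can be illustrated with a short Python sketch. This is only a minimal illustration under stated assumptions, not VSYN itself: a single damped sinusoid stands in for the Klatt-generated pitch pulse, and the names `make_pulse`, `onset_samples`, and `overlap_add`, as well as the decay constant, are hypothetical choices made here. Only the 40-kHz sample rate, the 512-point pulse length, and the 8-ms example interval come from the text.

```python
import math

SAMPLE_RATE = 40_000   # Hz, as stated in the text
PULSE_LEN = 512        # samples (12.8 ms at 40 kHz), as stated in the text


def make_pulse(freq_hz=720.0, decay_per_s=400.0):
    """Stand-in pitch pulse: one damped sinusoid (an assumption, NOT a Klatt pulse)."""
    return [
        math.exp(-decay_per_s * n / SAMPLE_RATE)
        * math.sin(2.0 * math.pi * freq_hz * n / SAMPLE_RATE)
        for n in range(PULSE_LEN)
    ]


def onset_samples(periods_ms):
    """Convert onset-to-onset intervals (ms) into pulse onset positions (samples)."""
    onsets, t_ms = [], 0.0
    for period in periods_ms:
        onsets.append(round(t_ms * SAMPLE_RATE / 1000.0))
        t_ms += period
    return onsets


def overlap_add(pulse, onsets, gains=None):
    """Place one pulse at each onset; overlapping tails are summed sample by sample."""
    gains = gains if gains is not None else [1.0] * len(onsets)
    out = [0.0] * (onsets[-1] + len(pulse))
    for onset, gain in zip(onsets, gains):
        for i, sample in enumerate(pulse):
            out[onset + i] += gain * sample
    return out


# Constant 8-ms interval, i.e. F0 = 125 Hz, as in the Fig. 1 example.
steady = overlap_add(make_pulse(), onset_samples([8.0] * 10))

# Jitter: vary the onset-to-onset intervals. Shimmer would vary `gains` instead.
jittered = overlap_add(make_pulse(), onset_samples([8.0, 8.1, 7.9, 8.05, 7.95]))
```

The rms equalization of individual pitch pulses that the text describes for the jitter-only stimuli is not shown in this sketch.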
themselves to be experienced in the evaluation and treatment of voice disorders. These same subjects participated in experiments 2-4 as well. The order of presentation of the four experiments was counterbalanced across subjects, with one exception: All subjects participated in experiment 3 (comparison of pitch and amplitude perturbation) following their participation in experiment 1 (perception of pitch perturbation) and experiment 2 (perception of amplitude perturbation).

B. Results and discussion

Results are shown in Fig. 3, which plots normalized roughness magnitude as a function of percent jitter, pooled across all ten listeners. Direct magnitude estimates were rescaled separately for each subject so that the numbers ranged from 10-90. The smooth curves are third-order polynomials that were fit to the data. Ignoring for the moment the difference between the correlated and uncorrelated stimuli, it can be seen that there is a very strong relationship between pitch perturbation and perceived roughness. The compression of the function at the high end of the jitter continuum is consistent with Heiberger and Horii (1982), who reported that, "... beyond a certain point, relatively large increases in [jitter] did not result in similarly large increases in roughness level ..." (p. 321).
However,in Heibergerand Horii's nonspeech data,the changein slopeoccurredbetweenjitter valuesof 5.0% and 10.0%, much largerthan the valueof approximately2.0% foundin the presentstudy.Although this discrepancy might reflectdifferences in the perception of triangularwavesversusthe moreharmonicallyrich voice signalsusedin the presentstudy,thereare two other possibilities.The stimuliusedby Heibergerand Horii werehigher in mean Fo ( 165 vs 130 Hz usedin the presentstudy) and were presentedto subjectsover earphonesrather than a • , [] 90 G uncorrelatedcontinua were equatedfor mean jitter (the averageabsolutedifferencein fundamentalperiodbetween adjacentpitchpulses),but werenot necessarily matchedin terms of other calculation the correlated and uncorrelated continua, but with the stim- uli equatedfor Davis' ( 1981) "pitch perturbationquotient" (PPQ), which usesa five-pointmovingaverage.It can be seenthat thedifferences in perceivedroughness betweenthe correlatedand uncorrelatedsignalsare reducedsignificantly, althoughnot eliminatedentirely.For the data in Fig. 4, the averagedifferencein roughness magnitudebetweenthe correlatedand uncorrelatedsignalswas4.5%. Very similar resultswere found when the correlatedand uncorrelatedsig- nalswereequatedfor the standarddeviationof signedjitter CORRELATED PERIOD SEQUENCE ] UNCORRELATED 90 R 0 U - methods that have been used to representpitchperturbation.For example,stimulifrom the correlatedcontinuumgenerallyhad largervaluesof fundamental period standarddeviation (Deal and Emanuel, 1978)andlargermeanjitter valueswhencalculations were made from either a three-pointmovingaverage(Koike, 1973) or a five-pointmovingaverage(Davis, 1981). Figure4 showsthe roughness-perturbation functionfor _ o 70 tant to note, however, that stimuli on the correlated and CORRELRTED PERIOD SEQUENCES UNCORRELRTED' PERIO0 SEOUENCES R 80 U loudspeaker. 
Bothof thesedifferences wouldbeexpectedto makethe Heibergerand Horii stimulisoundlessroughthan the stimuli used in the presentstudy •Wendahl, 1963, 1966a,b;Coleman, 1969;Wilde et aL, 1986), which might havethe effectof movingthe entireroughness-perturbation functionto the right. The otherobviousfeatureof thedatain Fig. 3 is that the stimuligenerated fromthecorrelatedperiodsequences were perceivedasmoreroughthanthestimuligenerated fromthe uncorrelated periodsequences. The differences in roughness magnitudefor a givenjitter valueaveraged9.3% and were highly significant(t = 26.0, df= 29, p <0.01 ). It is impor- 80 PERIO0 SEOUENCES _ - G H N E S 70 _ 60 - S S0 R T T ! [ 30 30 N G 20 20 10 10 0 I 2 PERCENT 3 •t S 6 JITTER FIG. 3. Perceived roughness asa functionofjitter for correlatedanduncorrelatedperiodsequences. 2365 J. Acoust.Soc. Am., Vol. 63, No. 6, June 1988 0 50 1OO PPQ 1S0 2•0 •50 300 350 (MICROSECONDS) FIG. 4. Perceivedroughness asa functionofjitter for correlatedanduncorrelatedperiodsequences with stimuliequatedfor PPQ. James Hillenbrand:Perceptionof aperiodicities 2365 (i.e., the standarddeviationof the distributioncycle-to-cycle differencesin fundamentalperiod with the sign retained). Equatingthestimuliforeitherfundamental periodstandarddeviationor perturbationfactorincreased ratherthan decreased differences in roughness magnitudebetweenthe correlated and uncorrelated continua. When the stimuli wereequatedfor fundamentalperiodstandarddeviation,the averagedifferencein roughness magnitudebetweencorre- latedanduncorrelated signalswas20.4%;for theperturbation factor, the differencewas 29.4%. Koike's (1973) "rela- tiveaverage perturbation," whichusesa three-point moving average,producedresultsthat wereverysimilarto the mean jitterdataShown in Fig.3. It isalsointeresting tonotethat the uncorrelated signals hadsubstantially highervaluesof directionaljitter than the correlatedsignals(72.7% vs 40.6%). The factthat the correlatedsignalssounded more B. 
Results and discussion Thefunctionrelatingshimmerto perceived roughness is shownin Fig.5.Thesmooth curveisa second-order polynomial in the case of the correlated continuum and a fourth- orderpolynomialin thecaseof the uncorrelatedcontinuum. As wastruefor thepitch-perturbation data,thesignalsthat were producedfrom correlatedsequences soundedmore roughthansignalswith the samemeanperturbationvalues that wereproducedfrom uncorrelated sequences. For the datain Fig. 5, theaveragedifference in roughness magnitude betweencorrelatedand uncorrelated signalswith the same meanshimmervaluewas27.1%. This valueis nearlythree timeslargerthan the differencethat wasobservedbetween correlated anduncorrelated stimulifor thepitch-perturbation continua. Unlike the pitch-perturbation data, this discrepancy betweenthe correlatedand uncorrelated signalsdoesnot roughthanthe uncorrelated signalswouldseemto indicate seemto be relatedto the choiceof perturbationcalculation thatdirectional jitterdoes notplaya roleinroughness per- methods.In general,stimuli that were matchedfor mean ception. shimmer tendedtoshowverysimilarratingswhenperturbaIn general,our preliminaryconclusionfrom the comtionwasmeasured usingothercalculationmethods,suchas parisonbetweenthe correlatedand uncorrelated signalsis pitch-pulse amplitudestandarddeviation,amplitude-perthat the standarddeviationof signedjitter, or Davis' ( 1981) turbationquotient(theamplitudeanalogof PPQ), standard PPQ,showsa stronger relationship to perceived roughness deviationof signedshimmer,and Koike's (1973) relative than other methodsof representing pitch perturbation. averageperturbation. However, none of the calculation methods that were used eliminated thedifference in perceived roughness between the correlatedand uncorrelatedsignals. II. EXPERIMENT PERTURBATION 2: PERCEPTION OF AMPLITUDE A. Methods Oneimportantfindingof thisexperiment thatcannotbe observed in Fig. 5 concerns thesubjective qualityof thestimuli varyingin amplitudeperturbation. 
Recallthat previous synthesis researchwith nonspeech signalssuggested that amplitudeperturbation produced a sensation of roughness thatwasvirtuallyindistinguishable fromthat produced by pitchperturbation (Wendahl,1966b).Althoughsubjects in the presentexperimentwere'askedto rate the stimuli on VSYN wasusedto synthesize two 22-membershimmer continuausingmethods thatwereanalogous to thoseusedto createthejitter continuumin experiment1.As in the pitch perturbation experiment, onecontinuumwascreatedusing the modifiedI/f random-numbergeneratorand the other • [] was created using a standardwhite-noisegenerator.The stimulialongeachcontinuumvariedfrom 0.0-2.6 dB and 9o! -I CORRELATED AMPLITUDE SEQUENCES UNCORRELATED AMPLITUDE SEOUENCESJ I I• were spacedat 0. l-dB increments from 0.0-1.0 dB and at 0.2-dB increments from 1.0-2.6 dB. The decision to restrict the continuumto the rangebelow 2.6 dB was somewhat arbitraryandwasbasedon the increasingly unnaturalpereeptualqualityof thesynthesized signalsasshimmervalues approachedabout 2.0 dB. All stimuli were 1.0 s in duration and weresynthesized at 40 kHz, with a constantF0 of 130 I-Iz. Ag in the previougexperiment,the•timuli weregatedon and off with a 20-ms cosine function and all stimuli on the continuumwereequatedfor overallrmsintensity. Subjects consisted of thesametenlisteners whoparticipatedin experiment 1.Asin theprevious experiments, sub- • 70 A • 60 s so • 30 20 1o jectswereasked toratethestimulionthedegree ofperceived roughness. Eachof the44 stimuliwaspresented 16timesin pseudorandomorder. The first 132 trials were consideredto bepracticeandthesedatawerenotincludedin theanalysis. Methods usedfor stimuluspresentationwere identicalto experiment 1. 2366 J. Acoust.Soc. Am., Vol. 83, No. 6, June 1988 0.0 0.• 0.8 1.2 SHlMMER 1.6 I I I 2.0 2.• 2.8 (DB) FIG. 5. Perceivedroughness asa functionof shimmerfor correlatedand uncorrelated pitch-pulseamplitudesequences. 
roughness, this term is almost certainly not a good description of the perceptual quality of the stimuli varying in amplitude perturbation. Unlike the results for sawtooth waves reported by Wendahl, the perceptual quality of the stimuli varying in amplitude perturbation in the present study was quite different from that produced by pitch perturbation. When asked to provide verbal descriptions of the stimuli, subjects generally commented that the signals toward the high end of the shimmer continuum had an unnatural "popping" quality. For example, one subject commented that the stimuli sounded as though they were being played through a loudspeaker with a loose wire, and another compared the signals to speech played over a radio during an electrical storm. By contrast, stimuli toward the high end of the pitch-perturbation continuum are perceived as very rough; however, with the exception of the very high jitter values from the correlated continuum, the stimuli sounded as though they could have been produced by a talker with a severely disordered voice.

III. EXPERIMENT 3: COMPARISON OF PITCH AND AMPLITUDE PERTURBATION

A. Methods

1. Stimuli

The purpose of experiment 3 was to determine whether subjects could, in fact, differentiate between the effects of pitch and amplitude perturbation. The test stimuli consisted of nine signals each from the correlated pitch-perturbation continuum, the uncorrelated pitch-perturbation continuum, the correlated amplitude-perturbation continuum, and the uncorrelated amplitude-perturbation continuum. The nine stimulus values from each continuum were chosen in such a way that the spacing between stimuli was approximately even in perceptual terms, as determined by the roughness magnitude estimates. Each series of nine stimuli began with a stimulus having a perturbation value of zero. These four stimuli should have been identical and were included as a reliability check. The 36 stimuli were equated for overall rms intensity and presented over a loudspeaker using the procedures described previously.

2. Subjects and procedures

The ten subjects who had participated in experiments 1 and 2 served as listeners. Two separate identification tasks were run in counterbalanced order. One task used the correlated signals from each continuum, and the other used the uncorrelated signals. A very brief training session preceded each identification task. The training session consisted of two randomly ordered presentations of each of the 18 stimuli. Subjects were asked to press one of two keys on a terminal keyboard to indicate whether the stimulus was drawn from the jitter continuum or the shimmer continuum. Feedback was provided on each of the 36 trials. The testing sessions were identical except that feedback was not provided, and each stimulus was presented ten times in pseudorandom order.

FIG. 6. Percent correct identification of stimuli from the correlated and uncorrelated jitter continua and from the correlated and uncorrelated shimmer continua. The subjects' task was to judge whether the stimulus was drawn from one of the jitter continua or one of the shimmer continua.

B. Results and discussion

The results are presented in Fig. 6, which shows percent correct identification for each stimulus. With the obvious exception of stimuli with zero-perturbation values, subjects were generally able to determine whether the stimulus represented perturbations in pitch or amplitude. Identification performance improved at higher perturbation levels and was generally better for the correlated rather than uncorrelated stimuli. These results suggest that, contrary to the findings reported by Wendahl (1966b) for sawtooth waves, the subjective qualities produced by jitter and shimmer in synthetic vowels are quite different, except at very low levels of aperiodicity.

IV. EXPERIMENT 4: PERCEPTION OF ADDITIVE NOISE

The purpose of experiment 4 was to study the relationship between additive noise and perceived dysphonia and to determine how this relationship might be affected by the slope of the spectrum in the periodic component of the stimulus. Examination of the role of spectral slope was motivated by Yanagihara's (1967) finding that the loss of energy in high-frequency harmonics results in an increase in perceived dysphonia even when stimuli are matched for signal-to-noise ratio.

A. Methods

1. Stimuli

All stimuli for experiment 4 were synthesized with VSYN, which controls signal-to-noise ratio by the appropriate scaling and point-for-point addition of separate periodic and aperiodic components. A single aperiodic signal was generated with the Klatt (1980) synthesis program by passing the aspiration source through formant resonators that were set appropriately for [a] (F1 = 720, F2 = 1240, F3 = 2400, F4 = 3300, F5 = 3700). Except for differences in amplitude, the noise waveform was identical for all stimuli.

Several versions of the periodic component were generated using the Klatt synthesizer. Two methods were used to control spectral slope. In method 1, spectral slope was controlled at the glottal level by varying the "bandwidth of glottal resonance" (BGR) parameter in the Klatt synthesizer. This parameter controls the cutoff frequency of a low-pass filter that is used to shape the voice impulse train. The BGR parameter was set at 75, 150, and 300 Hz, producing the glottal source functions shown in Fig. 7. The glottal waveforms were passed through formant resonators appropriate for [a] and then mixed with the scaled noise described above. Three 13-step continua were synthesized that varied in signal-to-noise ratio in 3-dB steps from -10 to 26 dB (39 stimuli). All stimuli were 1.0 s in duration and were synthesized with a constant 130-Hz F0 and a 40-kHz sample frequency.

Method 2, which was more nearly analogous to Yanagihara's (1967) technique, used the same glottal waveform for all stimuli and controlled the spectral slope by adjusting formant amplitudes. With the synthesizer set in parallel mode, the resonator gains associated with F1-F6 were spaced at -3-, -5-, or -10-dB increments. For example, for the -10-dB signal, F1 gain was set to 66 dB, F2 gain was set to 56 dB, F3 gain was set to 46 dB, etc. The spectral characteristics of these stimuli are shown in Fig. 8. The periodic components that were generated with this method were mixed appropriately with the noise signal described above to produce three additional 13-step continua varying in signal-to-noise ratio from -10 to 26 dB.

2. Subjects and procedures

Listeners consisted of nine of the ten speech pathologists who participated in the other experiments. An additional subject meeting the same criteria was recruited to replace one speech pathologist who was not available at the time the experiment was run. Using the magnitude estimation task described above, subjects were asked to rate the stimuli on the degree of perceived breathiness. Subjects were run in two blocks of 624 trials in counterbalanced order. One block consisted of 16 pseudorandomly ordered presentations of the 39 stimuli created using method 1 to control spectral slope; a second block consisted of 624 presentations of the 39 stimuli created using method 2. For both blocks of trials, the first 117 stimulus presentations were considered to be practice trials and were not included in the data analysis.

B. Results and discussion

Functions relating signal-to-noise ratios to normalized breathiness ratings are shown in Fig. 9 for method 1 and Fig. 10 for method 2. The smooth curves are third-order polynomials. Not surprisingly, there is a very strong relationship between signal-to-noise ratio and listeners' perception of breathiness.
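The signal-to-noise manipulation described under Methods for experiment 4 (scaling a fixed noise waveform and adding it point for point to a periodic component) can be sketched as follows. This is a minimal illustration in Python, not the VSYN implementation. The function names `rms` and `mix_at_snr`, the definition of signal-to-noise ratio in terms of rms level, and the placeholder sine-wave and Gaussian-noise signals are all assumptions made for the example.

```python
import math
import random


def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def mix_at_snr(periodic, noise, snr_db):
    """Scale `noise` so the periodic-to-noise rms ratio equals `snr_db`,
    then add the two waveforms point for point.

    Assumption: signal-to-noise ratio is defined on rms levels; the text
    does not spell out the definition used by VSYN.
    """
    gain = rms(periodic) / (rms(noise) * 10 ** (snr_db / 20.0))
    scaled_noise = [gain * n for n in noise]
    mixed = [p + n for p, n in zip(periodic, scaled_noise)]
    return mixed, scaled_noise


# Placeholder waveforms (hypothetical stand-ins for the synthesized vowel
# and the formant-filtered aspiration noise): 0.1 s at 40 kHz, 130-Hz tone.
SAMPLE_RATE = 40000
F0 = 130.0
rng = random.Random(0)
periodic = [math.sin(2 * math.pi * F0 * i / SAMPLE_RATE) for i in range(4000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(4000)]

# 13-step continuum from -10 to 26 dB in 3-dB steps, as in the text.
snr_steps = list(range(-10, 27, 3))

for snr_db in snr_steps:
    mixed, scaled_noise = mix_at_snr(periodic, noise, snr_db)
    achieved = 20 * math.log10(rms(periodic) / rms(scaled_noise))
    print(f"target {snr_db:4d} dB  achieved {achieved:6.2f} dB")
```

Because the same noise waveform is reused at every step and only its gain changes, the sketch mirrors the statement that the noise was identical for all stimuli except for differences in amplitude.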
However, contrary to the results reported by Yanagihara (1967), the amount of high-frequency energy in the periodic component did not appear to play a role in controlling the degree of perceived dysphonia. In general, stimuli with similar signal-to-noise ratios tended to receive very similar breathiness ratings, regardless of differences in the spectral slope of the periodic component.

It is not clear why the present findings do not agree with those of Yanagihara (1967). Comparing the synthesis procedures used in the two studies is difficult since the methods used by Yanagihara are not described in great detail. One possibility that was considered is that the discrepancy is related to the fact that Yanagihara's subjects rated the stimuli on hoarseness, while subjects in the present study were asked to make breathiness judgments. To test this possibility, the experiment was rerun with four additional speech pathologists who were asked to rate the same set of stimuli on hoarseness. The results of that test were virtually identical to the data shown in Figs. 9 and 10.

FIG. 7. Time-domain (top) and frequency-domain (bottom) representations of glottal source functions varying in spectral slope. The function showing the highest rate of change in the time domain and the greatest amount of high-frequency energy was produced with a BGR value of 300 Hz; the function with the most gradual rate of change and the least amount of high-frequency energy was produced with a BGR of 75 Hz. The middle function in both panels was produced with a BGR of 150 Hz.

FIG. 8. Fourier spectra (1024 points) of the periodic components produced by controlling formant amplitudes with the synthesizer in parallel mode.

FIG. 9. Perceived breathiness as a function of signal-to-noise ratio. The parameter is the spectral slope of the periodic component, which was controlled by varying the cutoff frequency of a low-pass filter that shapes the glottal source function. This is the BGR parameter in the Klatt (1980) synthesis program.

FIG. 10. Perceived breathiness as a function of signal-to-noise ratio. The parameter is the spectral slope of the periodic component, which was controlled by varying parallel formant amplitudes. Parallel resonator gains for F1-F6 were spaced at intervals of -3, -5, or -10 dB.

V. GENERAL DISCUSSION

To summarize briefly, four experiments were run that examined univariate relationships between perceived dysphonia and variation in pitch perturbation, amplitude perturbation, and additive noise in synthetically generated sustained vowels. Among the results were: (1) Strong relationships were found between perceived roughness and variation in either pitch or amplitude perturbation; (2) stimuli that were generated from correlated pitch or amplitude sequences sounded more rough than stimuli generated from uncorrelated sequences, especially for stimuli varying in amplitude perturbation; (3) unlike findings reported for sawtooth waves varying in pitch and amplitude perturbation (Wendahl, 1966b), the percept associated with pitch perturbation was noticeably different from that associated with amplitude perturbation; (4) a strong relationship was found between additive noise and ratings of breathiness; and (5) contrary to Yanagihara's (1967) report, ratings of either breathiness or hoarseness were unaffected by the spectral slope of the periodic component.

Results of the listening tests using stimuli varying in pitch and amplitude perturbation are in general agreement with those reported for nonspeech waveforms (Heiberger and Horii, 1982; Wendahl, 1963, 1966a,b). One very important discrepancy concerns Wendahl's (1966b) report that the roughness percepts associated with pitch and amplitude perturbation were very similar to one another. Results from the present study suggest that these two percepts are quite easy to differentiate. Further, informal interviews with the speech pathologists who served as listeners suggested that jitter seems to more closely approximate the kind of roughness that is heard in naturally occurring disordered voices. Most listeners agreed that stimuli toward the high end of the shimmer continua had an unnatural quality. It might be noted that the unusual quality associated with the shimmer stimuli does not seem to be restricted to stimuli created with the time-domain synthesis method used by VSYN. Pilot work using stimuli generated with the Klatt (1980) synthesis program produced stimuli that sounded very similar to those generated by VSYN.

The unnatural quality of the stimuli varying in amplitude perturbation was not expected since previous correlational research with naturally produced voices suggested that amplitude perturbation was more strongly associated with perceived roughness than pitch perturbation (Deal and Emanuel, 1978; Nichols, 1979). It is important to note, however, that the amplitude-perturbation measures reported in the natural speech studies almost certainly reflected several different types of aperiodicity. Because of limitations in measurement techniques, measured values of parameters such as amplitude perturbation can reflect unknown combinations of amplitude perturbation, pitch perturbation, additive noise, and perhaps other sources of aperiodicity (Hillenbrand, 1987). The presence of these measurement artifacts makes it difficult to interpret perceptual results in terms of specific underlying acoustic events.

Another possibility that should be considered is that there is some important aspect of amplitude perturbation in naturally produced voices that was not modeled accurately by the synthesis techniques used in the present study. It is also possible that the amplitude-perturbation signals sound unnatural because amplitude variability is occurring against a background of perfect periodicity in other dimensions. More natural-sounding signals might be produced if amplitude perturbation were combined with other types of aperiodicity. Experiments that are just underway in our laboratory are designed to study the perceptual properties of sustained vowels that combine several sources of aperiodicity.

The experiments with stimuli varying in pitch and amplitude perturbation also showed that perceived roughness was affected not only by the amount of perturbation, but also by the degree of correlation among adjacent pitch or amplitude values. In general, stimuli that were generated from correlated sequences tended to sound significantly more rough than stimuli that were generated from uncorrelated sequences. This was especially true for the amplitude-perturbation stimuli, where the difference in perceived roughness between correlated and uncorrelated signals was very large. For the pitch-perturbation signals, there was some evidence that this effect may have been at least partly due to the choice of mean jitter as a way to calculate pitch perturbation. The difference between correlated and uncorrelated signals was reduced significantly when stimuli were equated for either PPQ (Davis, 1981) or the standard deviation of signed jitter. However, changing the calculation method did not entirely eliminate the difference in perceived roughness between correlated and uncorrelated pitch-perturbation signals, and this effect seemed to be largely unrelated to calculation methods for the amplitude-perturbation signals.

One implication of these findings is that acoustic measurement techniques might need to account for the sequential characteristics of pitch or amplitude change in addition to the degree of perturbation. Work along these lines has been reported by Koike (1973), who studied autocorrelation functions of voice amplitude sequences from normal speakers and patients with either laryngeal neoplasms or unilateral vocal cord paralysis. Koike reported that the presence of peaks in the autocorrelation function at lags of 3-12 periods could be useful in differentiating patients with neoplasms from the other two groups. The present findings suggest that the sequential characteristics of pitch and amplitude change might need to be incorporated in the development of a quantitative index of the severity of dysphonia. More research will be needed, however, to determine how the sequential properties of pitch and amplitude change in naturally produced voices can be quantified and how this information can be combined with more standard measures of perturbation.

The primary finding from the listening tests using stimuli varying in additive noise was the failure to observe an effect for the spectral slope of the periodic component. In general, stimuli with similar signal-to-noise ratios tended to receive very similar ratings of either breathiness or hoarseness. It is important to note that these findings do not indicate that spectral slope plays no role in judgments of voice quality. Although it was not tested formally, most listeners indicated that they were aware of the differences in "brightness" among the stimuli. These differences, however, did not appear to influence subjects' judgments along the two quality dimensions that were tested.

As indicated previously, the long-term goal of this work is to learn something about multivariate rather than simply univariate relationships between perceived dysphonia and variation in underlying acoustic parameters. Experiments that are currently underway using more complex synthetic stimuli have been designed to determine how acoustic parameters such as the ones examined in the present study combine to influence listener judgments of the overall severity of dysphonia, as well as the type of dysphonia. Another important issue that will need to be addressed in future research concerns the generalizability of results based on sustained vowels to continuous speech. Sustained vowels have been studied heavily because the measurement problems are more tractable and because the psychophysical characteristics are likely to be much simpler. However, there are a number of phenomena found only in continuous speech (e.g., pitch breaks and aperiodicities at transitions between voiced and unvoiced segments) that are likely to play a strong role in the perception of dysphonia. A major challenge for future research will be to determine how these dynamic characteristics interact with the kinds of aperiodicities that were examined in the present study.

ACKNOWLEDGMENTS

I am very grateful to Dale Metz and Robert Whitehead of the National Technical Institute for the Deaf for their technical advice, comments on previous drafts, and the generous contribution of their time in listening to the test signals and suggesting improvements to the procedures. I would like to thank Raymond Colton for his helpful comments on a previous draft, Thomas Ridley for his help with data analysis, data collection, and software development, and William Martens and Martin Wilde for their help in developing techniques for the generation of correlated random numbers. This research was supported by NIH Grant No. 7-R01-NS23703-01 to RIT Research Corporation.

Aronson, A. (1980). Clinical Voice Disorders: An Interdisciplinary Approach (Thieme-Stratton, New York).
Coleman, R. F. (1969). "Effects of median frequency levels upon the roughness of jittered stimuli," J. Speech Hear. Res. 12, 330-336.
Davis, S. B. (1976). "Computer evaluation of laryngeal pathology based on inverse filtering of speech," SCRL Monogr. 13, Speech Communication Research Laboratory, Santa Barbara, CA.
Davis, S. B. (1981). "Acoustical characteristics of normal and pathological voices," ASHA Rep. 11, 97-115.
Deal, R. E., and Emanuel, F. W. (1978). "Some waveform and spectral features of vowel roughness," J. Speech Hear. Res. 21, 250-264.
Emanuel, F. W., and Sansone, F. (1969). "Some spectral features of 'normal' and 'simulated rough' vowels," Folia Phoniatr. 21, 410-415.
Fritzell, B., Hammarberg, B., and Wedin, L. (1977). "Clinical applications of acoustic voice analysis, Part I: Background and perceptual factors," Speech Trans. Lab. Q. Prog. Stat. Rep. 2-3, 31-38.
Froekjaer-Jensen, B., and Prytz, S. (1976). "Registration of voice quality," Bruel and Kjaer Tech. Rev. 3, 3-17.
Gardner, M. (1978). "White and brown music, fractal curves, and one-over-f fluctuations," Sci. Am. 238, 16-31.
Gauffin, J., and Sundberg, J. (1977). "Clinical applications of acoustic voice analysis, Part II: Acoustic analysis, results, and discussion," Speech Trans. Lab. Q. Prog. Stat. Rep. 2-3, 39-43.
Gill, J. S. (1961). "Automatic extraction of the excitation function of speech with particular reference to the use of correlation methods," in Proceedings of the Third International Congress on Acoustics (Elsevier, Amsterdam), Vol. 1, pp. 217-220.
Hammarberg, B., Fritzell, B., Gauffin, J., Sundberg, J., and Wedin, L. (1980). "Perceptual and acoustic correlates of abnormal voice quality," Acta Otolaryngol. 90, 441-451.
Hecker, M., and Kruel, E. J. (1971). "Descriptions of the speech of patients with cancer of the vocal folds, Part I: Measures of fundamental frequency," J. Acoust. Soc. Am. 49, 1275-1282.
Heiberger, V. L., and Horii, Y. (1982). "Jitter and shimmer in sustained phonation," in Speech and Language: Advances in Basic Research and Practice, Vol. 7, edited by N. J. Lass (Academic, New York), pp. 299-332.
Hillenbrand, J. (1987). "A methodological study of perturbation and additive noise in synthetically generated voice signals," J. Speech Hear. Res. 30, 448-461.
Hirano, M. (1981). Clinical Examination of Voice (Springer, New York).
Hollien, H., Michel, J., and Doherty, E. T. (1973). "A method for analyzing vocal jitter in sustained phonation," J. Phon. 1, 85-91.
Holmes, J. N. (1962). "The effect of simulating natural larynx behavior on the quality of synthetic speech," Speech Communications Seminar, Speech Transmission Laboratory, Royal Institute of Technology, Stockholm, Sweden.
Horii, Y. (1979). "Fundamental frequency perturbation observed in sustained phonation," J. Speech Hear. Res. 22, 5-19.
Horii, Y. (1980). "Vocal shimmer in sustained phonation," J. Speech Hear. Res. 23, 202-209.
Horii, Y. (1982). "Jitter and shimmer differences among sustained vowel phonations," J. Speech Hear. Res. 25, 12-14.
Jacob, L. (1968). "A normative study of laryngeal jitter," Master's thesis, University of Kansas, Lawrence, KA, unpublished.
Jensen, J. P. (1965). "Adequacy of terminology for clinical judgment of voice quality deviation," Eye, Ear, Nose Throat Mon. 44, 77-82.
Kempster, G. B. (1984). "A multidimensional analysis of dysphonia in two dysphonic groups," Ph.D. thesis, Northwestern University, Evanston, IL, unpublished.
Kempster, G. B., and Kistler, D. J. (1984). "Perceptual dimensions of dysphonic voices," J. Acoust. Soc. Am. Suppl. 1 75, S8.
Kitajima, K., and Gould, W. J. (1976). "Vocal shimmer in sustained phonation of normal and pathologic voice," Ann. Otol. Rhinol. Laryngol. 85, 377-381.
Kitajima, K., Tanabe, M., and Isshiki, N. (1975). "Pitch perturbation in normal and pathologic voice," Stud. Phonol. 9, 25-32.
Klatt, D. H. (1980). "Software for a cascade/parallel formant synthesizer," J. Acoust. Soc. Am. 67, 971-995.
Koike, Y. (1973). "Application of some acoustic measures for the evaluation of laryngeal dysfunction," Stud. Phonol. 7, 17-23.
Kojima, H., Gould, W. J., Lambiase, A., and Isshiki, N. (1980). "Computer analysis of hoarseness," Acta Otolaryngol. 89, 547-554.
Lieberman, P. (1963). "Some acoustic measures of the fundamental periodicity of normal and pathologic larynges," J. Acoust. Soc. Am. 35, 344-353.
Lively, M. A., and Emanuel, F. W. (1970). "Spectral noise levels and roughness severity ratings for normal and simulated rough vowels produced by adult females," J. Speech Hear. Res. 13, 503-517.
Mandelbrot, B. (1983). The Fractal Geometry of Nature (Freeman, San Francisco).
Murry, T., and Doherty, E. T. (1980). "Selected acoustic characteristics of pathologic and normal speakers," J. Speech Hear. Res. 23, 361-369.
Murry, T., Singh, S., and Sargent, M. (1977). "Multidimensional classification of abnormal voice qualities," J. Acoust. Soc. Am. 61, 1630-1635.
Nichols, A. C. (1979). "Jitter and shimmer related to vocal roughness: A comment on the Deal and Emanuel study," J. Speech Hear. Res. 22, 670-671.
Prosek, R. A., Montgomery, A. A., Walden, B. E., and Hawkins, D. B. (1984). "Some relations between voice-quality judgments and derived acoustic measurements," J. Acoust. Soc. Am. Suppl. 1 75, S8.
Rozsypal, A. J., and Millar, B. F. (1979). "Perception of jitter and shimmer in synthetic vowels," J. Phon. 7, 343-355.
Robbins, J. (1981). "A comparative acoustic study of laryngeal speech, esophageal speech, and speech production after tracheo-esophageal puncture," Ph.D. thesis, Northwestern University, Evanston, IL, unpublished.
Sansone, F., and Emanuel, F. W. (1970). "Spectral noise levels and roughness severity ratings for normal and simulated rough vowels produced by adult males," J. Speech Hear. Res. 13, 489-502.
Schroeder, M. R. (1961). "Recent progress in speech coding at Bell Telephone Laboratories," in Proceedings of the Third International Congress on Acoustics (Elsevier, Amsterdam), Vol. 1, pp. 201-210.
Simon, C. (1927). "The variability of consecutive wavelengths in vocal and instrumental sounds," Psychol. Monogr. 36, 41-83.
Smith, B., Weinberg, B., Feth, L., and Horii, Y. (1978). "Vocal jitter and roughness characteristics of esophageal speech," J. Speech Hear. Res. 21, 240-249.
Takahashi, H., and Koike, Y. (1975). "Some perceptual dimensions and acoustical correlates of pathologic voices," Acta Otolaryngol. Suppl. 338, 1-24.
Wendahl, R. (1963). "Laryngeal analog synthesis of harsh vocal quality," Folia Phoniatr. 15, 241-250.
Wendahl, R. (1966a). "Some parameters of auditory roughness," Folia Phoniatr. 18, 26-32.
Wendahl, R. (1966b). "Laryngeal analog synthesis of jitter and shimmer: auditory parameters of harshness," Folia Phoniatr. 18, 98-108.
Wilde, M. D., and Martens, W. L. (1985). "VSYN: A computer program for synthesizing vocal signals varying in perturbation and signal-to-noise ratio," Northwestern University, Evanston, IL.
Wilde, M. D., Martens, W. L., Hillenbrand, J., and Jones, D. R. (1986). "Externalization mediates changes in the perceived roughness of sound signals with jittered fundamental frequencies," in Proceedings of the 1986 International Computer Music Conference, The Hague.
Yanagihara, N. (1967). "Significance of harmonic change and noise components in hoarseness," J. Speech Hear. Res. 10, 531-541.
Yumoto, E., Gould, W. J., and Baer, T. (1982). "Harmonics-to-noise ratio as an index of the degree of hoarseness," J. Acoust. Soc. Am. 71, 1544-1550.
Yumoto, E., Sasaki, Y., and Okamura, H. (1984). "Harmonics-to-noise ratio and psychophysical measurement of the degree of hoarseness," J. Speech Hear. Res. 27, 2-6.