Acoustic measurements on prosody using Praat
Bert Remijsen
Universiteit Leiden & University of Edinburgh

Overview
Brief motivation
Introduction to Praat scripting
Measurement of:
> Vowel quality
> Fundamental frequency
> Voice quality and intensity

Topics in relation to measurements:
> [Data collection and processing]
> How to measure it in Praat
> (Semi-)automating measurements
> Displaying the descriptive statistics
> Inferential statistics

Motivation
Why quantitative analysis of prosody?
> Quantitative results can be used to test hypotheses.
> Humans are bad at determining the acoustic cause of prosodic variation by ear. E.g.:
  - controversy on lexical stress
  - perception of pitch-accent
> Prosodic contrasts are often realized in terms of ‘packages’ of prosodic correlates. E.g.:
  - stress: duration, vowel quality, intensity
  - complementary quantity: duration, vowel quality
  - pitch-accent: fundamental frequency (f0), duration, etc.

Why quantitative analysis of prosody with Praat?
> Allows for measurement, manipulation, and representation of the full range of acoustic parameters.
> Relatively easy to (semi-)automate procedures by means of scripts.

How to write a Praat script?
A. Try to start out from an existing script
> For example, check on: http://uk.groups.yahoo.com/group/praat-users
> Praat scripts introduced in this presentation can be found at: http://www.ling.ed.ac.uk/~bert/praatscripts
B. Writing (part of) a script from scratch
> Do the steps by hand for one item
> Display them using Paste history
> Combine these steps with control structures, guided by the manual.

An annotated script:
Script: msr_duration.psc
Function: collects durations for the onset, nucleus and coda of a target word, for each file in a list. Automatic.
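As a rough illustration of what a duration script of this kind computes, the sketch below is written in Python rather than Praat. It is not the msr_duration.psc script itself. It assumes that the segmentation of one item is already available as (label, start, end) intervals in seconds. The labels "onset", "nucleus" and "coda", the function names, the item name and the boundary times are all hypothetical.

```python
from typing import Dict, List, Tuple

# One labeled interval: (label, start time in s, end time in s).
Interval = Tuple[str, float, float]

# Segment labels assumed for this sketch; a real TextGrid may use others.
TARGETS: Tuple[str, ...] = ("onset", "nucleus", "coda")


def interval_durations_ms(intervals: List[Interval],
                          targets: Tuple[str, ...] = TARGETS) -> Dict[str, float]:
    """Return the duration in milliseconds of each target interval.

    Intervals whose label is not in `targets` (e.g. the carrier
    sentence or silence) are ignored.
    """
    durations: Dict[str, float] = {}
    for label, start, end in intervals:
        if label in targets:
            if end < start:
                raise ValueError(f"interval {label!r} ends before it starts")
            durations[label] = round((end - start) * 1000.0, 1)
    return durations


def as_row(item_name: str, durations: Dict[str, float],
           targets: Tuple[str, ...] = TARGETS) -> str:
    """Format one tab-separated output line; missing segments become 'NA'."""
    cells = [str(durations.get(label, "NA")) for label in targets]
    return "\t".join([item_name] + cells)


# Hypothetical segmentation of one target word (times in seconds).
example: List[Interval] = [
    ("", 0.000, 0.250),
    ("onset", 0.250, 0.310),
    ("nucleus", 0.310, 0.450),
    ("coda", 0.450, 0.520),
    ("", 0.520, 0.900),
]

row = as_row("item_01", interval_durations_ms(example))
print(row)
```

Each call to as_row yields one line that could be appended to a results file, one line per item in the list. Missing segments are written as NA so that every row keeps the same number of columns for the statistics package.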
Common components:
> User interface (form … endform)
> Getting the input files (Read …)
> Finding point of measurement (using TextGrid)
> Measurements
> Writing output to file (e.g. fappendinfo)

The dataset:
> One long sound file, e.g. the whole recording session, with information on sections in the TextGrid.
> One item per file. If so, it is best to encode as much useful information as possible in the filename, in a structured way, preferably fixed-width. E.g.: d2_2_012_s_1
  - d2: dataset code
  - 2: speaker no.
  - 012: item no.
  - s: s(ingular) / p(lural) [S&R]
  - 1: repetition no.
Reasons:
> Saves work coding in the statistics package
> The fields in the name can be searched with a Praat script (using string pattern matching).

Script: openlist.psc
Function: opens specific objects associated with each item in a list.

Script: openlist_specificitem.psc
Function: searches on the item code, the third field in the name.

Measuring vowel quality

Vowel quality: Measurement in Praat
How to measure formants in Praat?
I. The point of measurement
II. An algorithm and a protocol
III. Semi-automating measurements

I. The point of measurement: possibilities
> Where F1 reaches its maximum
> Small domain centered on the temporal midpoint
> Averaged over (middle x% of) vowel.

II. An algorithm and a protocol
1. Produce a Formant object using the default algorithm (Burg) and parameters (5 formants below 5000 Hz [male] / 5500 Hz [female]).
2. Track using default values (male values = female values minus 10%).
3. Protocol for when the value is incorrect. E.g.: the weak F2 of high back vowels is often missed, so that F3 is reported as F2. Options:
  - Use LPC with more coefficients
  - Retrack with changed F1/F2 reference values
The strategy is to be fixed within a single study.

III. Semi-automating the measurements
> Formant measurements should be checked. A fully automated procedure is not an option.
> Instead: automate all the repetitive actions.

Script: msr&check_f1f2_indiv_interv.psc
Function: makes the measurement as proposed above. Point of measurement: midpoint of an interval; suitable for the analysis of monophthongs.

Script: msr&check_f1f2_indiv_point.psc
Function: makes the measurement as proposed above. Point of measurement: points on a point tier; suitable for the analysis of di/triphthongs.
> These scripts can easily be modified to process a batch in one go, still with a check.

Vowel quality: Scaling
The formant values, once collected, can be scaled in a number of ways:
1. Individual frequencies or frequency differences?
> Vowel height: F1 - F0 or F1
> Advancement: F2 - F1 or F2
2. F1 x F2, or other formants as well?
> F1 x F2
> F0 x F1 x F2 x F3
3. Acoustic / psycho-perceptual scale?
> hertz (Hz)
> Logarithmic (ST)
> mel
> ERB
> Bark
4. Cross-speaker comparisons?
> z-transformation (Lobanov)
> Gerstman
> Constant Log Interval Hypothesis

Ideal set-up for normalization (Adank 2003):
> Individual frequencies rather than Δ’s
> hertz (Hz) rather than a psycho-acoustic scale
> No need to consider F0 and F3
> Between-speaker variation: z-transformation

Vowel quality: Analysis / vowel plots
The formant values can best be interpreted in a vowel plot (F1 x F2). Characteristics of a good vowel plot:
> Inverted axes
> Over speakers (so z-transformed)
> Categories labeled using IPA
Praat can do it.

Example:
- The vowels of Dinka: /i, e, ɛ, a, ɔ, o, u/
- Ellipses encircle 1 standard deviation
- Separate ellipses for complementary quantity
- Values averaged over 2 repetitions of 36 items uttered by 5 speakers.
[Figure: vowel plot with one labeled ellipse per vowel and quantity category; x-axis: F2 (z-transformed)]

1. Create a TableOfReal, with, for each token:
> Praat code for the IPA label (e.g. ‘ɛ’ is ‘\ep’)
> z-transformed F1 and F2; sign inverted (I do this in SPSS)
> Header contains axis labels and no. of tokens

Example: formants_tor.txt:
    File type = "ooTextFile"
    Object class = "TableOfReal"
    numberOfColumns = 2
    columnLabels []:
    "F2 (z-transformed)" "F1 (z-transformed)"
    numberOfRows = 341
    row [1]: "i^C" -1.6595 1.2794
    row [2]: "i^C" -1.9973 1.2538
    …
    row [341]: "o^C" 0.6245 0.0380

2. Open the TableOfReal in Praat, and use either:
> Draw scatter plot: plots individual values; each token is marked by its (IPA) label.
or
> Draw sigma ellipses: draws ellipses, sized by the user in terms of standard deviations (sigma); the (IPA) label is plotted at the center.
Either way, plot with Garnish and Discriminant plane set to ‘no’.

3. In the Picture window, add marks on the x and y axes, inverting the inverted sign back to normal. For example:
    One mark left... -2 no yes no 2
This gives a y-axis mark labeled ‘2’ (in terms of z-scores) at -2 on the y-axis, without plotting ‘-2’.

Vowel quality: Analysis / inferential tests
Characteristic inferential test: ANOVA
> within-subjects
> multivariate (dependents zF1 and zF2)
> factor(s): vowel quality (and e.g. lexical stress / intonational accent / position in phrase / etc.).

Measuring fundamental frequency

F0: Overview
> Issues in measuring F0
> Scaling
> Descriptive stats

F0: Issues in measuring F0
I. For a detailed study of the realization of tonal contrasts, consonants in target words should be, from most suitable (+) to least suitable (-):
  + nasals
    liquids
    approximants, rhotics
    voiced fricatives
  - unvoiced fricatives, stops
BUT: other considerations may be more important, such as the availability of minimal-set data:
    Form         Tone              Gloss
    /ba1/        Low level         ‘to remain’
    /ba3/        High level        ‘ancestor’
    /ba121/      Rise-fall         ‘stiff’
    /ba12[p]/    Low Rise          ‘father’
    /ba41/       Extra High Fall   ‘to hit’
    /ba21/       Low Fall          ‘to blow’
    /ba31/       High Fall         ‘when’

II. F0 measurements need to be checked for octave jumps etc.
> Suggestion: use a semi-automated procedure

Script: lst2f0&check.psc
Function: automates all the repetitive actions involved in the checking of F0 tracks. It calculates the F0 track (Pitch object), plots it in the Picture window, gives the opportunity to fix errors if need be, and then writes the (fixed) Pitch object to a file. Batch processing using a file list.

III. The point of measurement: turning points can be determined
> by eye
> using mathematical modelling. MOMEL (Hirst & Espesser) is implemented in Praat. See also recent work by Grabe & Kochanski.

Script: momel_modif.psc
Function: Praat implementation of the MOMEL algorithm. (Original implementation in the MES signal processing package)

F0: Scaling
From physical F0 track to psycho-acoustic track.
1. Normalization for the logarithmic nature of pitch perception:
> hertz (Hz)
> semitone (ST)
> Equivalent Rectangular Bandwidth (ERB)
Latest news: semitone is best (Nolan 2003).
2. Normalization across speakers:
> No need to normalize for slope differences expressed in ERB or ST.
> Absolute values can be normalized using the z-transformation.

F0: Analysis / Plotting tracks
How to interpret the data, and communicate tendencies to others?
The problem:
> Averages of F0 measures expressed as numbers in Hz are hard to interpret. ST, ERB and z-scores are even harder to interpret.
> Visual illustrations by means of F0 tracks of individual cases fail to exploit the dataset.
The solution:
> Represent F0 visually across speakers, by means of tracks normalized for time.
> I.e.: the graph is used as a descriptive statistic (it reports averages).

Example 1:
- The 6 lexical tones of Matbat
- Normalized time
- Utterance-medial position, following low target
- Tracks averaged over 2 repetitions of 48 items uttered by 8 speakers (784 tokens).

Example 2:
- The 3 word-prosodic patterns of Papiamentu
- Normalized time
- Whole sentence represented
- Tracks averaged over 2 repetitions of 2 items uttered by 8 speakers (96 tokens).
[Figure: F0 tracks over the sentence positions SUBJ, COP, O1, V1, O2, V2, PREP; legend: word-acc. I, penult. stress; word-acc. II, penult. stress; word-acc. II, final stress]

Script: pp_show_series10.psc (example)
Function: on the basis of checked tracks, the script produces a text file with an F0 value for each of 8 points of measurement. Takes voicing at the edges into consideration.

Measuring overall / selective intensity (dB)

dB: Introduction
Variation in perceived voice quality (breathy, modal, creaky) correlates with the distribution of energy in the spectrum.
Functions include:
1. Utterance-level contrasts
Example: creaky voice correlates with low F0:
Q: The slugs ate the dahlias, didn’t they?
A: No, that’s not true / the rabbits ate the dahlias, not the slugs.
(Thanks to Mariko Sugahara for the example)
2. Word-level contrasts: on its own…
Dinka example, breathy vs. modal:
raal raall ltt ‘vein-sg.’ ‘vein-pl.’ lt ‘insult-sg.’ ‘insult-pl.’
… or as a package (register tone, e.g. Mon-Khmer languages, Chamic languages)

Variation in perceived loudness correlates with:
> the distribution of energy in the spectrum (spectral balance)
> overall intensity
Functions include:
> Lexical stress (cf. Sluijter & van Heuven 1996)
> Phrasal accent (cf. Heldner 2003)

In summary:
> Selective intensity marks distinctions in voice quality AND distinctions in loudness.
> Loudness contrasts may also correlate with overall intensity.
> It remains unclear whether / to what extent loudness and voice quality have separate correlates.

dB: Measuring overall intensity
> No need for checking. An automated procedure is possible, cf. measurement of duration.
> Important issue: controlling for variation in irrelevant factors in the course of the session.
> Relate the intensity of the target segment to the intensity of (part of) the carrier utterance.

dB: Measuring selective intensity
Abundance of possible measurements, including:
> H1-H2
> H1-A1, H1-A2, H1-A3
> Dynamic filter (Heldner)
> Average within a range (Sluijter & van Heuven)
See the thematic issue of JPhon 29:4 (2001).

Recommendations:
> For a detailed acoustic study: a measure of specific spectral properties is best (explanatory adequacy).
> In relation to a big corpus, Heldner’s filter-based measure seems best (relatively vowel-independent; easy to automate).
> Try out several / make your own variation.

How to:
> Point of measurement (cf. vowel quality)
> Semi-automating measurements using a script

Script: msr&check_spectr_indiv_interv.psc
Goal: semi-automated procedure for the measurement of H1, H2, A1, A2, A3. Extension of the vowel quality script.

Acknowledgements
> The organizers, for inviting me.
> Thanks to Patti Adank and Alice Turk, for discussions on measurements of vowel quality; and to Helen Hanson, for discussions on voice quality.
> The Netherlands Organization for Scientific Research (NWO), for funding my research by means of a postdoc grant to Vincent van Heuven.
Conference announcement: Between Stress and Tone
Topic: Typology of prosodic systems
When/where: 16-18 June 2005, Leiden
Abstracts due: 1 November 2004
Details: http://www.iias.nl/iias/agenda/best/