Building a Catalan diphone voice
Ariadna Font Llitjos
May 10, 2001

Defining the phoneset
• Most Catalan phones (34) plus 2 Spanish phones (th and jj)
  – Reason: all Catalan speakers also have the Spanish phones, and many borrowed Spanish words are in most Catalan speakers' lexicons
• Left out phones that need a much finer classification than the one made for English phones (beta, gamma, etc.)

Generating Diphone Schema
• Mostly the same as Spanish, but with the new set of phones
• Catalan has 8 vowels (without considering stress), whereas Spanish has only 5, so I had to add a level of vheight (high, mid-high, mid-low, low) (draw graph on the board)
• Mapping Catalan phones to a predefined set of phones
• Over-generative: the voice is better suited to pronouncing foreign or nonsense words that use phones of the language in combinations that are not legal in it

Mapping Catalan phones to a predefined set of phones
• Options: Spanish and English
• My choice: English
• Reasons:
  – English has more vowel phones, so it is more appropriate than Spanish
  – Spanish phones have already been mapped to English phones, so it is better to map the Catalan phones directly to English rather than indirectly

Generating and recording the prompts
• 1109 prompts (recorded on festvox0)
• Lots of room noise (typing, door, talking, etc.)
• Microphone not always in the same position
• Power, and even intonation and duration, varied throughout the recording process

Labeling nonsense words
• Automatically:
  – make_labs
  – make_diph_index
• Manually:
  – Find a set of diphones that are wrong and look them up in dic/afldiph.est
  – Edit and correct the corresponding file with emulabel
  – Rerun make_diph_index (etc.)
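
As a rough illustration of why the diphone schema described earlier is over-generative, the sketch below enumerates every ordered pair from a small phone list, whether or not the pair is legal in the language. The phone subset and the function name all_diphones are hypothetical, chosen only for illustration. The real schema is generated by the Festvox Scheme code from the full phoneset and does not necessarily list bare pairs like this.

```python
from itertools import product


def all_diphones(phones):
    """Return every ordered phone pair, including pairs the language never uses."""
    return [f"{left}-{right}" for left, right in product(phones, repeat=2)]


# Hypothetical subset of the phoneset, for illustration only.
phones = ["pau", "a", "o", "l", "s", "k"]
diphones = all_diphones(phones)

print(len(diphones))   # 6 phones give 6 * 6 = 36 ordered pairs
print(diphones[:4])
```

Under this naive scheme a set of n phones yields n × n ordered pairs, which is consistent in scale with a prompt list of around a thousand entries for a few dozen phones.
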
Extracting pitchmarks and LPC coefficients
• Automatically:
  – make_pm_wav (edit to modify the pitch range of the speaker)
  – find_powerfactors (reports the general power difference between files and calculates a table of power modifiers, one per file)
  – make_lpc

Testing phone synthesis
• (SayPhones '(pau o l a pau s o k l a r i a d n a pau))
  – Catalan voice
  – Spanish voice
  – English voice (modifying the phones)

Catalan voice is still quite bad
• Bad example
• But it does have a basic Spanish phone… and without it, it would sound like this
• And here is how kal_diphone sounds

Added tokenization
• To be able to say numbers in Catalan (followed the Spanish tokenizer)
• (Show file)

Added some lexical entries
• Letters of the alphabet, symbols, punctuation, some content words…

Phrasing, duration and intonation
• Not there yet
• Nor can I get it to SayText yet

Summary: building a diphone voice
• Define phoneset
• Generate diphone schema
• Generate prompts
• Record prompts
• Label prompts
• Extract pitchmarks and LPC coefficients
• Test phone synthesis
• Hand correct labels
• Add tokenizer
• Add lexicon
• Add prosody, durations and intonation
• Test and evaluate voice
• Package for distribution
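
The per-file power modifiers mentioned under pitchmark and LPC extraction can be illustrated with a small sketch. This is not the find_powerfactors script itself. It assumes that power is measured as RMS amplitude over a whole file and that each file is scaled toward the mean level; the function names and the sample data are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def power_factors(files):
    """Map each file name to the gain that brings its RMS to the mean RMS.

    Assumes no file is entirely silent (an RMS of zero would divide by zero).
    """
    levels = {name: rms(samples) for name, samples in files.items()}
    target = sum(levels.values()) / len(levels)
    return {name: target / level for name, level in levels.items()}


# Hypothetical sample data standing in for recorded prompts.
recordings = {
    "quiet_prompt": [0.1, -0.1, 0.1, -0.1],
    "loud_prompt": [0.3, -0.3, 0.3, -0.3],
}
factors = power_factors(recordings)
for name, factor in factors.items():
    print(f"{name}: {factor:.2f}")
```

A table like this matters for the recordings described above, where microphone position and speaking power drifted between sessions: quieter files get a factor above 1 and louder files a factor below 1.
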