Tools for Speech Analysis

How do we choose?
• What kind of data?
• Which task?

Data
• Speech content (noise, multiple voices, …)
• Data file
  – Sound / transcription / pitch contour
  – Sampling / quantization: 16k, 12k, 8k, 4k; 8 bit
  – Size: how much data?
  – Format
    • Sound: wav, wma, mp3, ogg, aiff, aifc, au, vox, raw, sd, CSL, Ogg/Vorbis, NIST/Sphere
    • Transcription types

What tasks do we want to perform?
• Visualization and editing:
  – Record, play, edit, mix, add effects
• Analysis:
  – Spectral, pitch, intensity
• Speech manipulation:
  – Filtering, mixing, adding effects, prosodic manipulation
• Annotation:
  – Segmentation, labeling
• Scripting:
  – Batch processing, communication with outside programs

Sample Tasks
• Create stimuli for an experiment (e.g., hybridization)
• Create a database for TTS
• Create a prosodic database
• Analyze a speech corpus from an experiment or from 'real' recordings
• Verify or correct an automatic segmentation or pitch track

No Unique Speech Tool
• No piece of software does everything
• There are usually many ways of doing the thing you want to do

Features to Look For
• Visualization/editing
• Analysis
• Speech manipulation
• Annotation
• Scripting
• Plotting
• Supported formats
• Platform/installation
• Evolution/community
• Accessibility
• Price

Possible Options
• Goldwave (audio editor)
• ESPS xwaves (routines + visualization)
• Praat (speech analysis)
• Wavesurfer (speech editor)
• Transcriber (annotation tool)
• Matlab (general-purpose software)
• OGI speech tools (routines + application development)
• … WinPitch, PitchWorks, Phonedit, CoolEdit, …
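
The sampling, quantization, and size questions from the Data slide can be answered from a file's header before choosing a tool. The following is a minimal Python sketch, assuming an uncompressed PCM WAV file and using only the standard-library wave module (which does not read mp3, ogg, or the other formats listed). The function name describe_wav and the file name in the usage note are hypothetical.

```python
import wave


def describe_wav(path):
    """Return basic recording parameters read from a PCM WAV header."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()      # samples per second per channel
        width = wav.getsampwidth()     # bytes per sample (1 = 8 bit, 2 = 16 bit)
        channels = wav.getnchannels()  # 1 = mono, 2 = stereo
        frames = wav.getnframes()      # number of sample frames in the file
    return {
        "sampling_rate_hz": rate,
        "bits_per_sample": 8 * width,
        "channels": channels,
        "duration_s": frames / rate,
        "data_bytes": frames * width * channels,
    }
```

Calling describe_wav("sample.wav") on a recording returns a dictionary with the sampling rate, bit depth, channel count, duration, and audio data size, which is enough to check whether a corpus is uniform before batch processing it.
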
Links
• www.goldwave.com
• www.speech.kth.se/software/#esps
• www.praat.org
• www.speech.kth.se/software/#wavesurfer
• www.cse.ogi.edu/toolkit
• www.mathworks.com (Matlab)
• www.lpl.univ-aix.fr/~sqlab/ (Phonedit)
• www.sciconrd.com/pworks.htm (PitchWorks)
• www.winpitch.com (WinPitch)
• www.adobe.com (CoolEdit > Audition)

Praat
• Developed by Paul Boersma and David Weenink at the Institute of Phonetic Sciences, University of Amsterdam
• General-purpose speech tool: editing, segmentation and labeling, prosodic manipulation

Praat
• Pros: designed for speech analysis (not only sound editing or spectrogram visualization), nice GUI, scripting, active development and community, prosodic manipulation
• Cons: limited scripting language, its own native format for transcription and pitch files

File Management
• Recording files and saving them
  – New menu
• Opening files
  – Read menu
• Long and short sound files
• Other file types
  – Write menu

Editing Options from Objects Window
• View
  – Navigation
• Spectrum: spectral slice, spectrogram
• Pitch: settings, pitch information
• Intensity: settings, intensity information
• Formant: display controls, information

Modifying the Data
• Stylizing the pitch contour:
  – From Praat objects, go to Manipulation
  – Edit (the new object)
  – Pitch → Stylize pitch (2 st)
  – Then …
• Modifying pitch
• Modifying duration

Annotation: TextGrids
• From objects
  – Annotate → To TextGrid
• Labeling
• Point vs. interval tiers
• NB: remember to select the interval or point in the waveform or spectrogram before trying to insert a label

Scripting
• Automatic, from history
  – Ctrl → New Praat script → Edit → Paste history
  – NB: you can run all or part of the script
• Writing scripts

Help
• Online help, FAQ, manual
• Links from http://www.praat.org
• Additional tutorials, scripts, resources, user groups

Files to Play With
• http://www.cs.columbia.edu/~julia/cs4706/sounds
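
The "2 st" in the pitch stylization step under Modifying the Data is a resolution in semitones. As a rough illustration of the unit only (this is not Praat's stylization algorithm), the sketch below converts a frequency ratio to semitones using the standard definition of 12 semitones per octave. The function names and the default 2 st threshold argument are assumptions chosen for this example.

```python
import math


def semitones_between(f_hz, ref_hz):
    """Signed distance from ref_hz to f_hz in semitones (12 per octave)."""
    return 12.0 * math.log2(f_hz / ref_hz)


def within_resolution(f_hz, ref_hz, resolution_st=2.0):
    """True if f_hz lies within resolution_st semitones of ref_hz."""
    return abs(semitones_between(f_hz, ref_hz)) <= resolution_st
```

With a 100 Hz reference, 2 st corresponds to roughly 112 Hz, so within_resolution(105.0, 100.0) holds while within_resolution(120.0, 100.0) does not. Differences smaller than the chosen resolution are the kind of detail a stylized contour is meant to smooth away.
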