LOT summer school
Ultrasound, phonetics, phonology: Articulation for Beginners!
James M Scobbie
CASL Research Centre

With special thanks to collaborators
Jane Stuart-Smith & Eleanor Lawson
Joanne Cleland & Zoe Roxburgh
Natasha Zharkova, Laura Black, Steve Cowen
Reenu Punnoose, Koen Sebreghts
Sonja Schaeffler & Ineke Mennen
Conny Heyde
Alan Wrench (aka Articulate Instruments Ltd) for AAA software and UTI hardware
Various funding – thank you to ESRC, EPSRC, QMU
June 2013

• Introduction to articulation
• Brief overview of techniques
• Ultrasound tongue imaging
• Playtime
• Technical issues and the nitty gritty of data
• Maybe a linguistic illustration – Malayalam liquids

Structure

Why study articulation?

• It underlies acoustic and visual elements of speech, but is only a means to an end
• It tells you about what speakers actually do
• It might be what the speaker intends to control
• Some bits of speech are silent
• It is interesting in its own right because it is a complex multichannel linguistic phenomenon
• Applications – clinical, military, HCI, L2
• Comparison with sign language and gesture
• Provides evidence for phonological patterns

Why study articulation?

• Silent articulations
  – Pre-speech, post-speech
  – Speech errors
  – Voiceless stops
  – Listening, turn taking
• Covert contrasts
  – In acquisition, therapy, L2 learning, sociolinguistics
• Covert errors
  – In acquisition, therapy, L2 learning
• Articulation / acoustics relationship in segmental and prosodic speech

Easy topics for research

Techniques for speech analysis?

• Flesh-point tracking
  – EMA Electromagnetic Articulography
  – Motion-capture
• Constriction tracking
  – EPG Electropalatograph
  – EGG / Laryngograph & transglottal illumination
• Parameter tracking / indirect analysis
  – Airflow
  – Intraoral pressure sensing
  – EMG Electromyography or muscular measures
  – Acoustic analysis
    • formants, pitch, voice quality, constriction types, VOT etc.
Quantitative articulatory approaches

• Video (including basic photos/still frames)
  – Regular video (25fps PAL or 30fps NTSC), often underlyingly a higher image / refresh rate (de-interlaceable)
  – From cheap camcorders to endoscopy for internal images
• X-ray stills and X-ray cinematography
• MRI Magnetic Resonance Imaging & CT scanning
  – Excellent resolution of superficial and deep features in 3D for static images
  – Bigger voxels, more grainy, more processing at faster frame rates, great prospects in next few years
• UTI Ultrasound Tongue Imaging
  – Regular video outputs or
  – Digital (“high speed”) cineloop outputs

Imaging

• EMA: gluing and sitting in the cube: 2h+ data
• UTI: probe fitting and the headset: “30m” data
  – Short sessions, outputs are image sequences, needs synchronisation, captures root

What hi-tech speakers go through

• Video playback in PowerPoint / media players doesn’t really convey the spatio-temporal nature of the data

Vid UTI: ECB08 spontaneous dialogue MRI: USC

• What do you do when you “say hello”…?
• Description…?

What about something easier?

• Same speaker?

MRI

• What do you do when you “say hello”…?

Video can tell you a lot

• Easy-to-get images are only 2 dimensional
  – The head and vocal tract are 3D objects
  – Which 2 dimensional plane do you want to study?
• Speaker & camera move relative to each other
  – False motion of articulators within the plane
  – Towards or away from the camera, changing scale
  – And rotations mean a different plane is shown
• Not many frames per second
  – Potential for smearing in time
  – Missing key events completely
  – Weak and/or variable synch with acoustics

Drawbacks are not unique

• To get data in more than one plane, let alone enough to make a 3D image that moves in time…
… means sacrifices
  – Lower spatial resolution
  – Lower temporal resolution

3D / 4D?
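The "not many frames per second" drawback noted above can be made concrete with a little arithmetic. The sketch below is an illustration only and is not taken from the slides: the function names and the 10 ms example event are assumptions, while the frame rates are ones quoted in this talk (25 and 30fps video, ~60fps de-interlaced video, ~120fps high-speed UTI). It treats every frame as an instantaneous snapshot, which ignores the time a real system takes to build up each image.

```python
def frame_interval_ms(fps):
    """Time between successive frames, in milliseconds."""
    return 1000.0 / fps


def chance_of_capture(event_ms, fps):
    """Probability that at least one frame falls inside an event.

    Assumes the event starts at a random time relative to the frame
    clock and that each frame is an instantaneous sample.
    """
    return min(1.0, event_ms / frame_interval_ms(fps))


if __name__ == "__main__":
    event_ms = 10.0  # hypothetical brief articulatory event
    for fps in (25, 30, 60, 120):
        print(f"{fps:>3} fps: one frame every {frame_interval_ms(fps):5.1f} ms, "
              f"chance of catching a {event_ms:.0f} ms event = "
              f"{chance_of_capture(event_ms, fps):.2f}")
```

Under these assumptions, an event shorter than the frame interval is missed in proportion to how much shorter it is, so a 10 ms event is caught only about a quarter of the time at 25fps but always at ~120fps.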
Ultrasound Tongue Imaging

• Ultrasound as a tongue imaging technique
• Relatively cheap, non-invasive and accessible
  – Fieldwork
  – Clinical diagnostics
  – Child language acquisition
  – Standard laboratory phonetics & phonology
• Real time visual biofeedback
  – Phonetics and linguistics teaching
  – Clinical intervention
  – L2 teaching and personal training

UTI applications

• Quick, portable, cheap, live/realtime, “comfyish”
• Synchronisation with audio, probe movement
• Applications
  – Clinic
  – Teaching
  – Piloting
  – Outreach
  – Fieldwork
  – Discourse
  – Infants
  – pT Research!

hand held & live video-mode

• Articulate Assistant Advanced (AAA)
• ~120fps hs-UTI: raw probe echo-location data is stored and re-imaged on the fly
  – Up to ~400fps available
• 135° Field of View
• ~60fps de-interlaced lip camera stored as uncompressed bitmap

QMU AAA multichannel lab set-up

MC suburb and the AAA multichannel system

• Data collection and analysis in a fast single dedicated software environment
  – Ultrasonix high speed UTI (no post-processing)
  – Various video UTI or camcorder systems
  – Same annotation & display software for EPG, EMA
• Custom-made multichannel synchronisation
  – Video via “synch-brightup” clapperboard on AD of video images, with built in batch-processing
    • De-interlacing from ~30 to ~60fps
    • Offset of clapper-frame to adjust for UTI creation (~20ms)
• Semi-automatic edge-detection
  – Smoothing & confidence-rating over 42 points

AAA

• Ultrasound gives rise to
  – Artefacts from parallel tongue & echo pulse beams
  – Missing data
    • Between scan lines, or beyond the scan area
    • Behind bones
    • Above a sublingual cavity (aka losing the tip)
  – Grainy data or poor resolution
    • Older speakers, dry mouth, beards, etc.
    • Tongue surface when it’s far from the probe
    • When tongue is parallel to the scanlines
• Problems with stabilisation & synchronisation
  – Technical solutions can lead to speaker fatigue

Some problems

• We only have mid-sagittal tongue curves
  – Not passive articulators
  – Not all the tongue surface
  – Not all the internal tongue tissue
  – Not lips
• But unlike EMA
  – We are not limited to 3 or 4 anterior points
• And unlike MRI
  – UTI is cheap, non-invasive, portable and quick
  – For small datasets analysis is quick… we can collect & trace 12 tokens of 5 vowels in half a day

With UTI…

Real time visual biofeedback
• Have a go…
• Front/back vowels
• Dutch & English /r/
• Dutch & English /s/ and English “sh”
• Swallowing

Annotation and analysis software
• AAA
• Basic overview, using some good data
• Annotation and filtering
• Drawing some splines

Playtime!
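Once splines have been drawn, the traced tongue curves can be converted to x/y coordinates for plotting or comparison. The following is a minimal sketch under assumptions that the slides do not state: that each traced curve is a list of radial distances measured along fan lines evenly spaced across the field of view, all radiating from a single probe origin. The function name, the millimetre units and the example radii are hypothetical, and nothing here reflects AAA's actual export format.

```python
import math


def fan_to_cartesian(radii_mm, fov_deg=135.0):
    """Convert radial distances along evenly spaced fan lines to (x, y).

    Assumption: the fan is centred on the vertical axis, the probe
    origin is at (0, 0), and y points up into the mouth. The first
    radius lies on the leftmost fan line, the last on the rightmost.
    """
    n = len(radii_mm)
    if n < 2:
        raise ValueError("need at least two fan lines")
    half = math.radians(fov_deg) / 2.0
    step = 2.0 * half / (n - 1)
    points = []
    for i, r in enumerate(radii_mm):
        angle = -half + i * step  # 0 rad = straight up from the probe
        points.append((r * math.sin(angle), r * math.cos(angle)))
    return points


if __name__ == "__main__":
    radii = [60.0] * 42  # made-up arc, 60 mm from the probe on every fan line
    curve = fan_to_cartesian(radii)
    print(curve[0], curve[-1])
```

Working in Cartesian coordinates makes it straightforward to overlay curves from different tokens, though comparisons are only meaningful if the probe stayed in the same place relative to the head.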