ESE250: Digital Audio Basics
Week 6, February 19, 2013
Human Psychoacoustics
ESE 250 S’13 DeHon Kadric Kod Wilson-Shah Week 6 – Psychoacoustics 1

Course Map

Where are we?
• Week 2: Received signal is sampled & quantized
    q = PCM[ r ]
• Week 3: Quantized signal is coded
    c = code[ q ]
• Week 4: Sampled signal first transformed into frequency domain
    Q = DFT[ q ]
• Week 5: Signal oversampled & low-pass filtered
    Q = LPF[ DFT(q+n) ]
• Week 6: Transformed signal analyzed
    Using human psychoacoustic models
• Week 7: Acoustically interesting signal is “perceptually coded”
    C = MP3[ Q ]
[Figure: processing pipeline with blocks Over Sample, DFT, LPF, Perceptual Coding, Store / Transmit, Decode, Produce and signals r(t), q+n, Q+N, Q, C, p(t)]
[Painter & Spanias. Proc. IEEE, 88(4):451–512, 2000]

The Physical Ear
[R. Munkong and B.-H. Juang. IEEE Sig. Proc. Mag., 25(3):98–117, 2008]
• External sound waves
    Guided by outer ear into auditory canal
• Excite inner ear
    Through mechanical linkage connecting ear drum to cochlea
• Initiates signal processing
    Frequency domain analysis
    Via analog computation
• Video: Cochlea
    What part of the cochlea vibrates for an 800 Hz square wave?

The Cognitive Ear
[R. Munkong and B.-H. Juang. IEEE Sig. Proc. Mag., 25(3):98–117, 2008]
• Modern psychoacoustics benefits greatly from
    o decades of neural recording
    o contemporary brain imaging technology

Power Spectrum Model of Hearing
B.C.J. Moore. Int. Rev. Neurobiol., 70:49–86, 2005.
• Rough picture (main content of today’s lecture):
    Critical bands: the auditory system contains a finite array of adaptively tunable, overlapping bandpass filters
    Frequency bins: humans process a signal’s component (against a noisy background) in the one filter with the closest center frequency
    Masking: certain signal components in a given band are “favored” and others are filtered out
• Established through decades of psychoacoustic experiments

Auditory Thresholds
• In the lab, you varied the frequency, amplitude and phase of signals
• What was the effect of each, if any, on the sound you heard?
    $s(t) = A \sin(2\pi f t + \phi)$, with amplitude $A$, frequency $f$, and phase $\phi$

Auditory Thresholds
• Harvey Fletcher (1940)
    Played pure tones varying
      o frequency, f [Hz]
      o intensity, I [dyn·cm⁻²], where 1 dyn·cm⁻² = 10⁻⁵ N·cm⁻² = 0.1 Pa
      o phase changes tend to be inaudible
    Large listener population
      o Young
      o Acute
• Recorded extreme thresholds
    faintest audible
    greatest tolerable
(http://www.et.byu.edu/)

Auditory Thresholds
[H. Fletcher. Rev. Mod. Phys., 12(1):47–65, 1940]
• Results: pain-free hearing range
    extends at most over 20 Hz – 20 kHz
    with sensitivity ≈ 2·10⁻⁴ · 0.1 Pa = 20 µPa

The decibel unit
• Define standard pressure: p0 = 0.0002 · 0.1 Pa = 20 µPa
• Threshold of human hearing
• Compute Sound Pressure Level as: $L_{SPL} = 20 \log_{10}(p/p_0)$ dB
• $L_{SPL}$ for p1 = 20 µPa, for p2 = 200 µPa, for p3 = 20 mPa?
• Compare to ambient sea-level pressure: 1 atmosphere = 10⁵ Pa
• Q: why use log-log scale?
• A1: dynamic range
• A2: “loudness” is a power function

The decibel unit – Hearing intensity
(http://www.dspguide.com/ch22/1.htm)

Let’s try to reproduce these results!
• We will listen to single sine tones starting at a frequency of 10 kHz, all the way up to 20 kHz, so each student can figure out their cut-off frequency
• Suggestions to improve this experiment?
(http://www.dspguide.com/ch22/1.htm)

Animal hearing ranges
• Dogs:
    Greater hearing range: 40 Hz to 60 kHz
    Ultrasonic dog whistles
• Mice:
    Large ears in comparison to their bodies
    Hearing range: 1 kHz to 70 kHz
    Can’t hear low-frequency noises
    Communicate with high frequency
    Distress call (40 kHz), alert of predator
[Pictures from Wikipedia]

Why Sinusoids?
• Why not some other harmonic series?
    … all sound is produced by vibrating masses …
    … Fourier’s analysis shows harmonic analysis could be based on an arbitrary smooth periodic fundamental
• Why does the animal receiver use sinusoids?
• Hamiltonian mechanics
    Simplest physical model of vibrating masses
    Coupled spring-mass-damper mechanics
    Produce sinusoidal harmonics
    [Figure: spring-mass-damper diagram labeled b, m, x, k]
• Video: Cochlea

Masking - Spatial
• Masking paradigms
    “Masker” masking “maskee”
• Tone Masking Noise
    o a pure tone of 80 dB SPL at 1 kHz just masks “critical band” noise of 56 dB SPL centered at 1 kHz
• Masker-to-maskee ratio
    o Constant for fixed relative frequency and varying amplitude
    o Changes with varying relative frequency
[Figure label: 1 “Bark” frequency interval]
[T. Painter and A. Spanias. Proc. IEEE, 88(4):451–512, 2000.]
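
A minimal sketch of the decibel arithmetic used on the decibel and Masking - Spatial slides, assuming the 20 µPa reference pressure defined earlier. The function names are hypothetical and chosen only for illustration, and treating the masker-to-maskee ratio as a plain difference of dB SPL levels is an assumption that follows from both levels sharing the same reference.

```python
import math

# Reference pressure p0 = 20 µPa (threshold of hearing, from the decibel slide).
P0_PA = 20e-6


def spl_db(pressure_pa: float) -> float:
    """Sound pressure level L_SPL = 20 log10(p / p0), in dB."""
    return 20.0 * math.log10(pressure_pa / P0_PA)


def pressure_from_spl(level_db: float) -> float:
    """Inverse of spl_db: pressure in Pa for a given level in dB SPL."""
    return P0_PA * 10.0 ** (level_db / 20.0)


def masker_to_maskee_ratio_db(masker_db: float, maskee_db: float) -> float:
    """Assumption: with both levels in dB SPL, the ratio is their difference."""
    return masker_db - maskee_db


if __name__ == "__main__":
    # The three pressures from the decibel slide exercise.
    for p in (20e-6, 200e-6, 20e-3):
        print(f"p = {p:g} Pa -> {spl_db(p):.1f} dB SPL")
    # Tone-masking-noise example: 80 dB SPL tone, 56 dB SPL critical-band noise.
    print(masker_to_maskee_ratio_db(80.0, 56.0), "dB")  # 24 dB
```

Working in dB turns the five-orders-of-magnitude pressure range into a span of about 100 dB, which is the "dynamic range" answer to the log-scale question.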
Masking
[H. Fletcher. Rev. Mod. Phys., 12(1):47–65, 1940]
• The first graph shows the masking pattern for a 200 Hz tone
    Mostly masks tones around 200 Hz, but also at harmonics
• The second graph shows the same plot for different frequencies, but only the fundamental part
    Notice that the band gets wider for increasing frequencies
• … a masker at the fundamental can somewhat mask maskees at the harmonics …
• … but the “spreading curve” is traditionally depicted over the fundamental only

Tone Masking Noise
• Are the following signals masked?
    200 Hz tone at 80 dB
    200 Hz tone at 40 dB
    300 Hz tone at 40 dB
    400 Hz tone at 40 dB
    700 Hz tone at 30 dB

Masking
[H. Fletcher. Rev. Mod. Phys., 12(1):47–65, 1940]
• Tone Masking Noise (Fig 12)
    value above quiet threshold such that a signal at the abscissa frequency can be heard in presence of
      o top: 200 Hz tone
      o bottom: various frequencies
• Noise Masking Tone (Fig 13)
    dots show pure tone magnitude (in dB) required to be audible above noise
      o of the magnitude on the middle curve
      o centered at that frequency
      o with bandwidth at least wider than the bars of Fig 12

Noise Masking Tone
• Are the following signals masked by the noise?
    200 Hz at 60 dB
    1 kHz at 60 dB

Noise Masking Tone
• Are the following signals masked by the noise?
    200 Hz at 60 dB
      o Yes!
    1 kHz at 60 dB
[Figure label: noise]

Noise Masking Tone
• Are the following signals masked by the noise?
    200 Hz at 60 dB
      o No!
    1 kHz at 60 dB

Noise Masking Tone
• Are the following signals masked by the noise?
    200 Hz at 60 dB
    1 kHz at 60 dB
      o No!
Masking - Temporal
• Temporal masking
    Masker effect persists for tenths of a second
    Masker effect is “acausal”
      o on timescales of ~2/100 s

Pitch JND
[H. Fletcher. Rev. Mod. Phys., 12(1):47–65, 1940]
• JND = “just noticeable difference”
    change in stimulus that “just” elicits perceptual notice
    where “just” means that smaller variations of the stimulus cannot be discerned
• What can you say about the JND:
    Below 1000 Hz?
      o roughly constant
      o ~ 3 Hz
    Above 1000 Hz?
      o roughly log-log linear: $\log \mathrm{JND}(f_2) - \log \mathrm{JND}(f_1) \approx n\,(\log f_2 - \log f_1)$
      o What is n? e.g. f1 = 2000, f2 = 4000: 6 = 10 – 4 ≈ n log10(2) ⟹ n ≈ 20
• Suggests that as frequency increases
    broader frequency bands are “assigned” to the same length of cochlear tissue
    Remember cochlea model

JND experiment
• The following audio files contain a single tone playing for 10 seconds.
• The sine starts at 200 Hz, then changes to a higher frequency (201, 202, 203, 205, 210). This change occurs after a number of “noises”: 1, 2, 3, 4, 5, 6, 7, 8 or 9.
• Can you notice when the change happens?

Critical Bands
• Decades of empirical study reveal that human audio frequency perception is quantized
• into < 30 “critical bands”
• of perceptually near-identical pitch classes
• corresponding to ~equal length bands of cochlear tissue (neurons)

Critical Bands: Evidence
[T. Painter and A. Spanias. Proc. IEEE, 88(4):451–512, 2000.]
    Tone masking Noise (Fig.
a & c)
      o noise audibility threshold for small-bandwidth noise
      o remains constant until the tone frequency locus falls away from the critical bandwidth
    Noise masking Tone (Fig. b & d)
      o same effect
      o with masker and maskee roles reversed

The Bark Scale
[E. Zwicker. J. Acoust. Soc. Am., 33(2):248, February 1961]
• “Bark” units:
    Uniform JND scale for frequency
    Maps frequency intervals into their respective critical band number
• Frequency-to-Bark function
    First principles vs. empirical modeling
    $B(f) = 13 \tan^{-1}(0.00076 f) + 3.5 \tan^{-1}\left((f/7500)^2\right)$

Compression opportunities
• Consider the following recording
• Any ways to improve the compression?

Compression opportunities
• Zooming in on a smaller portion
• Any ways to improve the compression?
[Figure: spectrum of the recording, level in dB (0 to 120) vs. frequency (193 to 208 Hz); successive builds annotate a region as “Masked” and add the note “JND: Could only represent integer frequency values”]

Next Week
• How can we use what we know about human perception to compress music?
    Frequency hearing range
    Masking
      o Temporal
      o Spatial
    JND
    Barks

Big Ideas
• Sound is a pressure wave that makes the cochlea vibrate with frequencies from ~20 Hz (at the tip) to ~20 kHz (at the base)
    This vibration is sinusoidal (physics)
    This is why sound harmonics are best represented as sinusoidal signals
• Masking
    Temporal: a masker tone can mask another tone that is present either right before or a little after the masker
    Spatial: a single tone can mask an entire frequency band (that contains the tone) if its intensity is high enough
• There are < 30 such bands (Bark scale), and they are wider for higher frequencies

Admin
• Lab 5 report due tomorrow
• On Thursday: Lab 6
    You will be designing your own experiments
      o To measure the range of frequencies you can hear
      o To perform spatial masking experiments

ESE250: Digital Audio Basics
End Week 6 Lecture
Human Psychoacoustics
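
Appendix sketch: the frequency-to-Bark function from the Bark Scale slide, written out so the "< 30 bands, wider at higher frequencies" idea can be checked numerically. This is an illustration under stated assumptions, not the course's reference code: the helper names `critical_band_index` and `bark_step` are hypothetical, and taking the integer part of B(f) as a band number is an assumption made here for convenience.

```python
import math


def hz_to_bark(f_hz: float) -> float:
    """B(f) = 13 atan(0.00076 f) + 3.5 atan((f / 7500)^2), as on the Bark Scale slide."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


def critical_band_index(f_hz: float) -> int:
    """Hypothetical helper: integer part of the Bark value, used as a band number."""
    return int(math.floor(hz_to_bark(f_hz)))


def bark_step(f_hz: float, delta_hz: float) -> float:
    """Hypothetical helper: how many Bark a step of delta_hz covers, starting at f_hz."""
    return hz_to_bark(f_hz + delta_hz) - hz_to_bark(f_hz)


if __name__ == "__main__":
    for f in (200.0, 1000.0, 5000.0, 20000.0):
        print(f"{f:7.0f} Hz -> {hz_to_bark(f):5.2f} Bark (band {critical_band_index(f)})")
    # The same 100 Hz step covers far less of the Bark axis at 5 kHz than at 200 Hz,
    # i.e. critical bands are wider (in Hz) at higher frequencies.
    print(bark_step(200.0, 100.0), bark_step(5000.0, 100.0))
```

Because B(f) stays below 25 across the 20 Hz to 20 kHz hearing range, the band count is consistent with the "< 30 critical bands" figure on the Critical Bands slide.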