Digital Media
Dr. Jim Rowan
ITEC 2110
Audio

What is audio?

First, some demos
• Can you hear this?
  – http://www.freemosquitoringtones.org/hearing_test/
  – “mosquito ring tone”
• Audio illusion “Creep”
  – http://www.youtube.com/watch?v=ugriWSmRxcM

The nature of sound
First, a video from ted.com
http://www.wimp.com/howsound/
• Three types we will discuss
  – 1) Environmental sound (sounds found in the environment)
• There are two special classes of audio
  – 2) Music
  – 3) Speech

The nature of sound
• Environmental sounds
  – Provide information about the surroundings that the human is currently in
• Music and Speech
  – Functionally and uniquely different from other sounds
  – Music
    • Carries a cultural status
    • Can be represented by non-sound: MIDI
    • Can be represented by a musical score
  – Speech
    • Linguistic content
    • Lends itself to special compression

And it’s complicated…
• Converting energy to vibrations and back
• Transported through some medium
  – Either air or some other compressible medium
• Consider speech
  – Starts as an electrical signal (brain & nerves)
  – Ends as an electrical signal (brain & nerves)
  – But…

No… it’s REALLY complicated..
http://en.wikipedia.org/wiki/Ear
– Starts as an electrical signal (brain & nerves) ==>
– Muscle movement (vocal cords)
  • Vibrates a column of air, sending out a series of compression waves in the air
– Compression waves cause the ear membrane (eardrum) to vibrate ==>
– Moves 3 tiny bones ==>
– Causes waves in the liquid in the inner ear ==>
– Bends tiny hair cells immersed in the liquid ==>
– When bent they fire ==>
– Sends electrical signals to the cerebral cortex
– Processed by the temporal cortex

Audio Illusions
• Play a 200 Hz pure tone
  – Softly at first
  – Gradually increase the volume
  – Most listeners will report that the tone drops in pitch as the volume increases
• Play a 2000 Hz pure tone
  – Softly at first
  – Gradually increase the volume
  – Most listeners will report that the tone rises in pitch as the volume increases

Why do you think…
• You can’t tell where some sounds come from (like some alarms, for instance)
• You only need one subwoofer when you need at least two speakers for everything else
• You can’t tell where sound is coming from underwater
• Two things running at nearly the same speed make a “beating” sound

Why do you think… (cont)
• With your eyes closed you can’t tell whether a sound is in front of you or behind you
• You hear sound that isn’t there (tinnitus)
• Phantom sounds
  – Heard… but not there
• Masking sounds
  – Not simply drowning them out
  – Can mask a sound that occurs before the masking sound actually starts

Why do you think… (cont)
• You can hear your name in a noisy room
  – Cocktail party effect
  – http://en.wikipedia.org/wiki/Cocktail_party_effect
  – Still very much a subject of research

Why? It’s complicated!
• http://en.wikipedia.org/wiki/Psychoacoustics
• Psychoacoustics
  – The study of human sound perception
  – The study of the psychological and physiological effects of sound

Why? It’s complicated!
• Sound is a physical phenomenon that is interpreted through the human perceptual system
  – Wavelength affects stereo hearing
    • What matters is the distance between your ears relative to the wavelength
  – Speed of sound affects stereo hearing
    • The faster the sound travels, the wider apart your ears need to be
  – You can tell where a sound comes from if
    • the wavelength is long enough and
    • the speed that sound travels is slow enough to allow the waves to arrive at your ears at different times

Processing Audio

Processing audio
• How can we look at sound?
• What do you want to see?
• Waveform displays
  – Summed amplitude of all frequencies & time
  – Amplitude & frequency components at one point in time
  – Amplitude & frequency & time

Summed amplitude across all frequencies & time
more examples of this form ==>

now for some other forms of audio display ==>

Amplitude & frequency components at one point in time
pipe organ audio

Amplitude & frequency & time
pipe organ audio

Summed amplitude & time
joe took father’s shoe bench out

Amplitude & frequency & time
Here… the amplitude (volume) is shown as increasingly darkening areas

Digitized audio
• As we have seen earlier this semester
  – Sample rate & quantization level
  – Reduction in sample rate is less noticeable than reducing the quantization level
• Jitter is a problem
  – Slight changes in timing cause problems
• Frequencies above 20 kHz?
  – Though they can’t be heard, they manifest themselves as aliases when reconstructed

Audio Dithering
Weird… add noise… get better sounding result
• Add random noise to the original signal before it is quantized
• This noise causes rapid transitioning between the few quantized levels
• Makes audio with few quantization levels seem more acceptable

Audio processing terms to know
• Clipping
  – …but you don’t know how high the amplitude will be before the performance is recorded
• Noise gate
  – has an amplitude threshold
• Notch filter
  – removes 60-cycle (60 Hz) hum
• Low pass filter
• High pass filter
• Time stretching (or shrinking… Limbaugh)
• Pitch alteration
• Envelope shaping (modifying attack)

One thing about humans…
• We can actively “filter out” what we don’t want to hear
  – remember the cocktail party effect?
• Over time we don’t hear the pops and snaps of a vinyl record
  – Have you ever recorded something that you thought would be good only to play it back and hear the air conditioner or traffic roaring in the background?
• A piece of software can’t do this…
  – …not yet anyway!
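
Returning to the Audio Dithering slide: the effect of adding noise before quantizing can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used by any particular audio tool. The function names, the step size of 1.0, the signal level of 0.3, and the choice of uniform noise one step wide are all assumptions made for this example.

```python
import math
import random

def quantize(x, step):
    """Round x to the nearest multiple of step (a uniform quantizer)."""
    return step * math.floor(x / step + 0.5)

def quantize_with_dither(x, step, rng):
    """Add uniform noise one quantization step wide, then quantize."""
    noise = rng.uniform(-step / 2, step / 2)
    return quantize(x + noise, step)

def demo(n=10_000, seed=0):
    """Quantize a constant quiet value n times, with and without dither."""
    rng = random.Random(seed)   # seeded so the run is repeatable
    step = 1.0                  # very coarse quantizer: levels at ..., -1, 0, 1, ...
    level = 0.3                 # a quiet value that falls between two levels
    plain = [quantize(level, step) for _ in range(n)]
    dithered = [quantize_with_dither(level, step, rng) for _ in range(n)]
    # Return the average output of each approach.
    return sum(plain) / n, sum(dithered) / n

plain_mean, dithered_mean = demo()
print("without dither:", plain_mean)
print("with dither:   ", dithered_mean)
```

Without dither, the quiet value always rounds to the same level, so the detail is lost completely. With dither, the output flips rapidly between the two neighboring levels, and its average lands close to the original value. The price is a small amount of added noise.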
Compressing sound files
• Take the opposite approach from the one you took with images
  – With images you can toss out the high frequencies
  – With audio you can’t… high frequency changes are highly significant

Compressing sound: Voice
• Remove silence
  – Similar to RLE
• Non-linear quantization
• “companding”
  – Quiet sounds are represented in greater detail than loud ones
    • Mu-law (North America and Japan)
    • A-law (Europe)
  – Fits a dynamic range that would otherwise require 12 bits into 8 bits
  – 4096 (2**12) ==> 256 (2**8)

Compressing sound: Voice
• Differential Pulse Code Modulation (DPCM)
  – Related to temporal (inter-frame) video compression
    • It predicts what the next sample will be
    • It sends the difference between the prediction and the actual sample rather than the absolute value
    • Not as effective for sound as it is for images
• Adaptive DPCM
  – Dynamically varies the sample step size
    • Large differences are encoded using large steps
    • Small differences are encoded using small steps

Sound compression that is based on perception
• The idea is to remove what doesn’t matter
• Based on the psycho-acoustic model
  – Threshold of hearing
    • Remove sounds too low to be heard
  – High and low frequencies not as important (for voice)

Questions?
• http://www.ted.com/talks/lang/eng/david_byrne_how_architecture_helped_music_evolve.html
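
As a follow-up to the companding bullets on the "Compressing sound: Voice" slide, here is a minimal Python sketch of why a logarithmic curve keeps more detail in quiet sounds. It assumes the commonly quoted continuous mu-law curve with mu = 255; the slides do not give a formula, and real telephone codecs use a piecewise-linear approximation rather than this exact curve. The helper names (`to_code`, `from_code`, `roundtrip_linear`, `roundtrip_mulaw`) are made up for this example.

```python
import math

MU = 255  # assumed value; the one usually quoted for 8-bit mu-law

def mulaw_compress(x, mu=MU):
    """Map x in [-1, 1] onto [-1, 1] along a logarithmic curve."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mulaw_expand(y, mu=MU):
    """Inverse of mulaw_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

def to_code(v, bits=8):
    """Uniformly quantize v in [-1, 1] to an integer code."""
    levels = (1 << bits) - 1
    return round((v + 1) / 2 * levels)

def from_code(code, bits=8):
    """Map an integer code back to a value in [-1, 1]."""
    levels = (1 << bits) - 1
    return code / levels * 2 - 1

def roundtrip_linear(x):
    """Plain 8-bit quantization: equal step size everywhere."""
    return from_code(to_code(x))

def roundtrip_mulaw(x):
    """Compress, quantize to 8 bits, then expand again."""
    return mulaw_expand(from_code(to_code(mulaw_compress(x))))

for x in (0.001, 0.9):  # one quiet sample, one loud sample
    err_linear = abs(roundtrip_linear(x) - x)
    err_mulaw = abs(roundtrip_mulaw(x) - x)
    print(f"x={x}: linear error={err_linear:.6f}, mu-law error={err_mulaw:.6f}")
```

With the same 8 bits, the companded version reproduces the quiet sample far more accurately than plain linear quantization, and pays for it with a larger error on the loud sample. That trade matches the slide's point that quiet sounds are represented in greater detail than loud ones.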