Digital Media Lecture 12: Additional Audio Georgia Gwinnett College

Digital Media
Lecture 12: Additional Audio
Georgia Gwinnett College
School of Science and Technology
Dr. Jim Rowan
Audio & Illusions

Can you hear this?



“mosquito ring tone”
http://www.freemosquitoringtones.org/hearing_test/
Audio illusion: “Creep”

http://www.youtube.com/watch?v=ugriWSmRxcM
The nature of sound
First, a video from ted.com
http://www.wimp.com/howsound/
Other related video #1
How to use visualizations of human speech and music to explain computation:
http://www.youtube.com/watch?v=mGc6clf_Wt4&feature=bf_prev&list=PL278ECDA0705DAF3DS
Other related video #2
David Byrne on how the venue shapes the form of the music performed:
http://www.ted.com/talks/lang/en/david_byrne_how_architecture_helped_music_evolve.html
The nature of sound

Three types we will discuss
– 1) Environmental sound
(sounds found in the environment)

and there are two special classes
of audio
– 2) Music
– 3) Speech
The nature of sound

Environmental sounds
– Provide information about the listener's current surroundings

Music and Speech
– Functionally different from other sounds
– Music
• Carries a cultural status
• Can be represented by non-sound: MIDI
• Can be represented by a musical score
– Speech
• Linguistic content
• Lends itself to special compression
And it’s complicated…
Converting energy to vibrations and back
Transported through some medium

– Either air or some other compressible
medium

Consider speech
– Starts as an electrical signal (brain &
nerves)
– Ends as an electrical signal (brain & nerves)
– But…
No… it’s REALLY complicated..
http://en.wikipedia.org/wiki/Ear
– Starts as an electrical signal (brain & nerves) ==>
– Muscle movement (vocal cords)
• Vibrates a column of air sending out a series of
compression waves in the air
– Compression waves cause ear membrane to vibrate
==>
– Moves 3 tiny bones ==>
– Causes waves in the liquid in the inner ear ==>
– Bends tiny hair cells immersed in the liquid ==>
– When bent they fire ==>
– Sends electrical signals to the cerebral cortex
– Processed by the temporal cortex
Audio Illusions
Audio creep…
 Play a 200 Hz pure tone

– Softly at first
– Gradually increase the volume
– Most listeners will report that the tone
drops in pitch as the volume increases

Play a 2000 Hz pure tone
– Softly at first
– Gradually increase the volume
– Most listeners will report that the tone rises
in pitch as the volume increases
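To try the experiment above, the two test tones can be generated as WAV files. This is a minimal sketch using only the Python standard library; the function names, the 5 second duration, the 44100 Hz sample rate and the output file names are illustrative assumptions, not part of the lecture.

```python
import math
import struct
import wave

def ramped_tone(freq_hz, seconds=5.0, rate=44100):
    """Return 16-bit samples of a pure tone whose volume rises linearly."""
    n = int(seconds * rate)
    samples = []
    for i in range(n):
        gain = i / (n - 1)  # 0.0 (silent) rising to 1.0 (full scale)
        value = gain * math.sin(2 * math.pi * freq_hz * i / rate)
        samples.append(int(round(value * 32767)))
    return samples

def write_wav(path, samples, rate=44100):
    """Write mono 16-bit PCM samples to a WAV file."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes = 16 bits per sample
        w.setframerate(rate)
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))

if __name__ == "__main__":
    # Hypothetical file names; play each file and listen for the pitch shift.
    write_wav("creep_200hz.wav", ramped_tone(200))
    write_wav("creep_2000hz.wav", ramped_tone(2000))
```

The frequency written into each file never changes, so any pitch change a listener reports comes from perception, not from the signal.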
Why do you think…




You can’t tell where some sounds come from
(like some alarms for instance)
You only need one subwoofer, but at least two speakers for everything else
You can’t tell where sound is coming from
underwater
Two things running at nearly the same speed make a
“beating” sound
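One common explanation for the last item is acoustic beating: two tones at slightly different frequencies add up to a single tone whose loudness pulses at the difference frequency. The sketch below checks this numerically with the identity sin a + sin b = 2 sin((a+b)/2) cos((a-b)/2). The function names and the 440 Hz / 444 Hz values are illustrative assumptions.

```python
import math

def beat_frequency(f1_hz, f2_hz):
    """Number of loudness pulses per second heard when two tones are mixed."""
    return abs(f1_hz - f2_hz)

def summed(f1_hz, f2_hz, t):
    """The two tones simply added together at time t (seconds)."""
    return math.sin(2 * math.pi * f1_hz * t) + math.sin(2 * math.pi * f2_hz * t)

def product_form(f1_hz, f2_hz, t):
    """Same signal written as a fast tone (average frequency)
    multiplied by a slow envelope (half the difference frequency)."""
    fast = math.sin(math.pi * (f1_hz + f2_hz) * t)
    slow = math.cos(math.pi * (f1_hz - f2_hz) * t)
    return 2 * fast * slow

if __name__ == "__main__":
    print(beat_frequency(440, 444), "beats per second")
```

The envelope passes through zero twice per cycle, which is why the audible beat rate is the full difference frequency and not half of it.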
Why do you think… (cont)
With your eyes closed you can’t tell whether a sound is in front of you or behind you
You hear sound that isn’t there (tinnitus)
Phantom sounds

– Heard… but not there

Masking sounds
– Not simply drowning them out
– Can mask a sound that occurs before the
masking sound actually starts
Why do you think… (cont)

You can hear your name in a noisy
room
– Cocktail party effect
– http://en.wikipedia.org/wiki/Cocktail_party_effect
– Still very much a subject of research
Why? It’s complicated!

http://en.wikipedia.org/wiki/Psychoacoustics

Psychoacoustics
– The study of human sound perception
– The study of the psychological and
physiological effects of sound
Why?
It’s complicated!

Sound is a physical phenomenon that is interpreted
through the human perceptual system
– Wavelength affects stereo hearing
• The distance between your ears related to the
wavelength
– Speed of sound affects stereo hearing
• The faster the sound travels, the wider apart your
ears need to be
– You can tell where a sound comes from if
• the wavelength is long enough and
• the speed that sound travels is slow enough to
allow the waves to arrive at your ears at different
times
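The arrival-time argument above can be put into rough numbers. This sketch assumes an ear spacing of 0.2 m and approximate speeds of sound of 343 m/s in air and 1480 m/s in water; those figures and the function name are assumptions for illustration, not values from the lecture.

```python
def max_arrival_delay_s(ear_spacing_m, speed_m_per_s):
    """Largest possible difference in arrival time between the two ears,
    which occurs when the sound comes directly from one side."""
    return ear_spacing_m / speed_m_per_s

EAR_SPACING_M = 0.2    # assumed distance between the ears
SPEED_AIR = 343.0      # approximate speed of sound in air, m/s
SPEED_WATER = 1480.0   # approximate speed of sound in water, m/s

if __name__ == "__main__":
    air = max_arrival_delay_s(EAR_SPACING_M, SPEED_AIR)
    water = max_arrival_delay_s(EAR_SPACING_M, SPEED_WATER)
    print("air:   %.0f microseconds" % (air * 1e6))
    print("water: %.0f microseconds" % (water * 1e6))
```

Under these assumptions the delay underwater is more than four times smaller than in air, which is consistent with the difficulty of locating sounds underwater.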

Processing Audio
Processing audio

How can we characterize sound?
– Amplitude
– Frequency
– Time

Waveform displays
– Summed amplitude of all frequencies &
time
– Amplitude & frequency components at one
point in time
– Amplitude & frequency & time
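The second kind of display, amplitude and frequency at one point in time, can be computed from a short block of samples with a discrete Fourier transform. The sketch below uses a deliberately slow, direct DFT so that it needs only the standard library; real tools would use an FFT. The function names and test frequencies are illustrative assumptions.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Return the amplitude found at each frequency bin of one block.

    Bin k corresponds to k * rate / len(samples) Hz. The 2/n scaling
    makes a sine that falls exactly on a bin report its own amplitude
    (it is not the right scaling for bin 0, the constant offset).
    """
    n = len(samples)
    mags = []
    for k in range(n // 2):
        total = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        mags.append(abs(total) * 2 / n)
    return mags

def tone(freq_hz, rate, n, amplitude=1.0):
    """n samples of a sine wave at freq_hz, sampled at rate Hz."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * t / rate)
            for t in range(n)]

if __name__ == "__main__":
    rate, n = 800, 80  # bins are 10 Hz apart
    mix = [a + 0.5 * b for a, b in zip(tone(100, rate, n), tone(250, rate, n))]
    for k, m in enumerate(dft_magnitudes(mix)):
        if m > 0.01:
            print("%d Hz: amplitude %.2f" % (k * rate // n, m))
```

Repeating this for successive blocks and stacking the results side by side gives the third kind of display: amplitude and frequency across time.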
Summed energy & time
[Figure: waveform display of the “Croak!” audio clip]
The sonogram, a snapshot of frequency
[Figure: sonogram of the “Croak!” audio clip]
Another way to show audio, frequency density across time
[Figure: frequency over time for the “Croak!” audio clip, Slim Pickens from Dr. Strangelove]
More examples…
– Pure sine wave G, E, C
– Bassoon playing the same notes
[Figures: summed energy & time, sonogram (frequency snapshot), and frequency over time for the notes G, C, E]
Digitized audio

As we have seen earlier this semester
– Sample rate & quantization level
– Reduction in sample rate is less noticeable
than reducing the quantization level

Jitter is a problem
– Slight changes in sample timing cause problems

Frequencies above 20 kHz?
– Though they can’t be heard, if they exceed half the
sample rate they show up as audible aliases when reconstructed
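Where an alias lands can be worked out from the sample rate alone: frequencies fold back around multiples of the sample rate into the range from 0 to half the sample rate. The sketch below shows that arithmetic; the function name and the example frequencies are illustrative assumptions.

```python
def alias_frequency(f_hz, sample_rate_hz):
    """Frequency a sampled tone at f_hz appears to have after reconstruction.

    Tones at or below half the sample rate come back unchanged;
    higher tones fold back into that range.
    """
    folded = f_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

if __name__ == "__main__":
    for f in (1000, 25000, 30000):
        print(f, "Hz is reconstructed as", alias_frequency(f, 44100), "Hz")
```

With a 44100 Hz sample rate, an inaudible 25000 Hz component comes back as a 19100 Hz tone, which is why such components are normally filtered out before sampling.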
Audio Dithering is Weird…
add noise…
get better sounding result?!?
Add random noise to the original signal
This noise causes rapid transitioning between the few quantized levels
Makes audio with few quantization levels seem more acceptable
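A small numerical experiment suggests why this works. Without dither, a value that sits between two quantization levels always snaps to the same level, so the detail is lost. With dither it flips between the two neighbouring levels, and the average of the output tracks the original value. The sketch assumes uniform noise one quantization step wide; the names, the seed and the test value 0.3 are illustrative assumptions.

```python
import random

def quantize(x, step):
    """Snap x to the nearest multiple of step."""
    return step * round(x / step)

def dithered_quantize(x, step, rng):
    """Add uniform noise one step wide, then quantize."""
    noise = rng.uniform(-step / 2, step / 2)
    return quantize(x + noise, step)

def average_output(x, step, trials=20000, seed=1):
    """Average of many dithered quantizations of the same input value."""
    rng = random.Random(seed)
    total = sum(dithered_quantize(x, step, rng) for _ in range(trials))
    return total / trials

if __name__ == "__main__":
    x, step = 0.3, 1.0
    print("without dither:", quantize(x, step))
    print("with dither, averaged:", round(average_output(x, step), 2))
```

The ear performs a similar averaging over short stretches of time, so the quantization error is heard as a steady low-level hiss instead of as distortion that follows the signal.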

Audio dithering
Audio processing
terms to know








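Clipping
– occurs when the signal amplitude exceeds the maximum level that can be represented
– avoided by setting the recording level low enough… but you don’t know how high the
amplitude will be before the performance is recorded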
Noise gate
– has an amplitude threshold
Notch filter
– removes a narrow band of frequencies, e.g. 60 Hz (“60 cycle”) mains hum
Low pass filter
High pass filter
Time stretching (or shrinking… Limbaugh)
Pitch alteration
Envelope shaping (modifying attack)
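Three of the terms above are simple enough to sketch in a few lines each. These are simplified, per-sample illustrations on samples in the range -1.0 to 1.0, not how a production audio editor implements them: a real noise gate, for example, follows the signal envelope and has attack and release times. All names and parameters are illustrative assumptions.

```python
def clip(samples, limit=1.0):
    """Hard clipping: anything beyond +/- limit is flattened to the limit."""
    return [max(-limit, min(limit, s)) for s in samples]

def noise_gate(samples, threshold):
    """Silence every sample whose amplitude is below the threshold."""
    return [s if abs(s) >= threshold else 0.0 for s in samples]

def low_pass(samples, alpha):
    """One-pole low pass filter: each output moves a fraction alpha
    (between 0 and 1) of the way toward the new input, smoothing out
    rapid (high frequency) changes."""
    out = []
    prev = 0.0
    for s in samples:
        prev = prev + alpha * (s - prev)
        out.append(prev)
    return out

if __name__ == "__main__":
    print(clip([0.5, 1.7, -2.0]))
    print(noise_gate([0.01, -0.5, 0.02, 0.3], threshold=0.05))
    print(low_pass([1.0, 1.0, 1.0], alpha=0.5))
```

Subtracting the low pass output from the input leaves the rapid changes behind, which gives a correspondingly simple high pass filter.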
Audio clipping
One thing about humans…

We can actively “filter out” what we
don’t want to hear
– remember the cocktail party effect?

Over time we don’t hear the pops and
snaps of a vinyl record
– Have you ever recorded something that you
thought would be good only to play it back
and hear the air conditioner or traffic
roaring in the background?

A piece of software can’t do this…
– …not yet anyway!
Compressing sound: Voice

Remove silence
– Similar to RLE
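The comparison with run-length encoding can be made concrete: each stretch of near-silent samples is replaced by a single count. This is a minimal sketch; the tuple format, function names and threshold are illustrative assumptions and do not correspond to any particular codec. It is lossy, because quiet samples below the threshold come back as exact zeros.

```python
def remove_silence(samples, threshold):
    """Replace each run of samples quieter than threshold with a count."""
    encoded = []
    run = 0
    for s in samples:
        if abs(s) < threshold:
            run += 1
        else:
            if run:
                encoded.append(("silence", run))
                run = 0
            encoded.append(("sample", s))
    if run:
        encoded.append(("silence", run))
    return encoded

def restore(encoded):
    """Rebuild the sample stream, filling silent runs with zeros."""
    out = []
    for kind, value in encoded:
        if kind == "silence":
            out.extend([0] * value)
        else:
            out.append(value)
    return out

if __name__ == "__main__":
    packed = remove_silence([0, 1, -1, 40, 0, 0, 0, -35, 2], threshold=3)
    print(packed)
    print(restore(packed))
```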

Non-linear quantization
• “companding”
– Quiet sounds are represented in greater detail
than loud ones
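One widely used companding curve is the mu-law function from telephony, shown here as an example of the idea; the lecture does not say which curve it has in mind. Samples are assumed to lie between -1.0 and 1.0, and mu = 255 is the value conventionally used for 8-bit telephone audio.

```python
import math

MU = 255  # conventional value for 8-bit telephony

def compress(x, mu=MU):
    """Map a sample in [-1, 1] onto a logarithmic scale in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def expand(y, mu=MU):
    """Inverse of compress: recover the original linear sample."""
    return math.copysign(((1 + mu) ** abs(y) - 1) / mu, y)

if __name__ == "__main__":
    for x in (0.01, 0.1, 0.5, 1.0):
        print("%.2f -> %.3f" % (x, compress(x)))
```

A quiet sample at 1% of full scale is mapped to more than 20% of the compressed range, so when the compressed value is quantized evenly, quiet sounds receive far more of the available levels than loud ones.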
Compressing sound: Voice

Differential Pulse Code Modulation (DPCM)
– Related to temporal (inter-frame) video compression
• It predicts what the next sample will be
• It sends that difference rather than the absolute
value
• Not as effective for sound as it is for images
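The predict-and-send-the-difference idea can be sketched with the simplest possible predictor: assume the next sample equals the previous one. Real DPCM coders use better predictors and quantize the differences; this sketch keeps the differences exact so the round trip is lossless. The function names and sample values are illustrative assumptions.

```python
def dpcm_encode(samples):
    """Send each sample as its difference from the predicted value,
    where the prediction is simply the previous sample."""
    prev = 0
    diffs = []
    for s in samples:
        diffs.append(s - prev)
        prev = s
    return diffs

def dpcm_decode(diffs):
    """Rebuild the samples by adding each difference to the running value."""
    prev = 0
    out = []
    for d in diffs:
        prev += d
        out.append(prev)
    return out

if __name__ == "__main__":
    samples = [100, 102, 105, 105, 103]
    diffs = dpcm_encode(samples)
    print(diffs)
    print(dpcm_decode(diffs))
```

After the first value the differences are small numbers, which can be stored in fewer bits than the absolute sample values.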

Adaptive DPCM (ADPCM)
– Dynamically varies the sample step size
• Large differences are encoded using large steps
• Small differences are encoded using small steps
Sound compression
that is based on perception
The idea is to remove what doesn’t matter
Based on the psycho-acoustic model

– Threshold of hearing
• Remove sounds too quiet to be heard
– High and low frequencies not as important
(for voice)