Speech Coding

CELLULAR COMMUNICATIONS
5. Speech Coding
Low Bit-rate Voice Coding





Voice is an analogue signal
It needs to be transformed into digital form (bits)
A speech signal is not random => it can be encoded using fewer bits than a random signal
If the bits representing 1 s of speech can be transferred over the wireless channel in 200 ms => 5 signals can be packed into the channel
For a handset, transmitting fewer bits also means longer battery life
Requirements for speech coding

Can distort speech a little (lossy) but should preserve acceptable quality
Shouldn't be too complex
 Use less power
 Use fewer circuits
 Reduce delay
Hierarchy of speech coders
Waveform Coders vs. VOCODERS

Waveform coders
 Approximate any acoustic signal
VOCODERS
 Based on prior knowledge of the signal
 Speech signals are very special signals
Speech signals

Not all levels of a speech signal are equally likely
 High probability of very low amplitudes
 Significant probability of very high amplitudes
 Monotonically decreasing probability of amplitudes between these two extremes
Speech is predictable
 The next value of a speech signal can be predicted with high probability and fair precision from past samples
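
The predictability claim above can be illustrated with a minimal sketch. It fits a two-tap linear predictor to a synthetic signal and compares the signal energy with the prediction-error energy. The test signal (a 200 Hz tone plus a little noise, sampled at 8 kHz), the predictor order, and all function names are assumptions chosen for illustration; they are not taken from the slides.

```python
import math
import random

def autocorr(x, lag):
    """Sum of x[n] * x[n - lag] over all valid n."""
    return sum(x[n] * x[n - lag] for n in range(lag, len(x)))

def fit_two_tap_predictor(x):
    """Least-squares fit of x[n] ~ a1*x[n-1] + a2*x[n-2] (2x2 normal equations)."""
    r0, r1, r2 = autocorr(x, 0), autocorr(x, 1), autocorr(x, 2)
    det = r0 * r0 - r1 * r1
    a1 = r1 * (r0 - r2) / det
    a2 = (r0 * r2 - r1 * r1) / det
    return a1, a2

def prediction_gain(x):
    """Ratio of signal energy to prediction-error energy."""
    a1, a2 = fit_two_tap_predictor(x)
    errors = [x[n] - (a1 * x[n - 1] + a2 * x[n - 2]) for n in range(2, len(x))]
    signal_energy = sum(v * v for v in x[2:])
    error_energy = sum(e * e for e in errors)
    return signal_energy / error_energy

def make_test_signal(num_samples=800, rate_hz=8000.0, tone_hz=200.0, seed=0):
    """Hypothetical stand-in for voiced speech: a tone plus weak noise."""
    rng = random.Random(seed)
    return [math.sin(2 * math.pi * tone_hz * n / rate_hz)
            + 0.01 * rng.uniform(-1.0, 1.0)
            for n in range(num_samples)]

if __name__ == "__main__":
    print("prediction gain: %.1f" % prediction_gain(make_test_signal()))
```

A gain well above 1 means the prediction error carries far less energy than the signal itself, so it can be coded with fewer bits. Real speech is less regular than this tone, so gains in practice are smaller.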
Speech in frequency domain


The power of high-frequency components is small
High-frequency components, when present, are very important for speech quality
Sampling and quantization



A speech signal is analog: it is defined at infinitely many time instants and can take infinitely many possible values
Sampling: measure the signal at discrete time instants (one per sampling interval)
Quantization: approximate the infinitely many possible values by a finite number of values (e.g. 8 bits = 256 levels)
Uniform quantization


Divide the range of all possible values into a finite number of equal intervals
Assign a single quantization value to all values within each interval
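
A minimal sketch of sampling followed by uniform quantization, assuming the signal is normalised to [-1, 1] and that each interval is represented by its midpoint. The function names, the 1 kHz test tone and the 8 kHz sampling rate are illustrative assumptions.

```python
import math

def uniform_quantize(x, bits, x_min=-1.0, x_max=1.0):
    """Return (interval index, reconstruction value) for a uniform quantizer.

    The range [x_min, x_max] is split into 2**bits equal intervals and every
    input inside an interval is represented by the interval midpoint.
    """
    levels = 2 ** bits
    step = (x_max - x_min) / levels
    index = int((x - x_min) / step)
    index = max(0, min(levels - 1, index))  # clamp out-of-range inputs
    return index, x_min + (index + 0.5) * step

def sample_tone(tone_hz=1000.0, rate_hz=8000.0, num_samples=64, amplitude=0.9):
    """Sample a sine wave once per sampling interval (1 / rate_hz seconds)."""
    return [amplitude * math.sin(2 * math.pi * tone_hz * n / rate_hz)
            for n in range(num_samples)]

def max_quantization_error(samples, bits):
    """Largest absolute difference between a sample and its quantized value."""
    return max(abs(s - uniform_quantize(s, bits)[1]) for s in samples)

if __name__ == "__main__":
    print(max_quantization_error(sample_tone(), bits=8))
```

For inputs inside the range, the error never exceeds half an interval width, which is 1/256 for 8 bits on [-1, 1].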
Non-uniform quantization



Divide the range of all possible values into a finite number of unequal but equally probable intervals
Logarithmic quantization: smaller intervals at low amplitudes
Different weight to low values
 Europe: A-law
 US: μ-law
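
Logarithmic quantization can be sketched with the continuous μ-law companding curve, F(x) = sgn(x) ln(1 + μ|x|) / ln(1 + μ) with μ = 255. This is a simplified illustration that assumes inputs normalised to [-1, 1]; it is not the segmented 8-bit encoding used in deployed telephony, and the function names are hypothetical.

```python
import math

MU = 255.0  # value used for mu-law in North American telephony

def mu_compress(x, mu=MU):
    """Map x in [-1, 1] to [-1, 1], stretching small amplitudes."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_expand(y, mu=MU):
    """Inverse of mu_compress."""
    return math.copysign(((1.0 + mu) ** abs(y) - 1.0) / mu, y)

def quantize_uniform(x, bits):
    """Uniform midpoint quantizer on [-1, 1]."""
    levels = 2 ** bits
    step = 2.0 / levels
    index = max(0, min(levels - 1, int((x + 1.0) / step)))
    return -1.0 + (index + 0.5) * step

def quantize_mu_law(x, bits, mu=MU):
    """Compress, quantize uniformly, then expand back."""
    return mu_expand(quantize_uniform(mu_compress(x, mu), bits), mu)

if __name__ == "__main__":
    quiet = 0.01
    print("uniform error:", abs(quantize_uniform(quiet, 8) - quiet))
    print("mu-law error: ", abs(quantize_mu_law(quiet, 8) - quiet))
```

With the same 8 bits, the companded quantizer has much finer intervals near zero, where speech amplitudes are most likely, at the cost of coarser intervals near full scale.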
Adaptive quantization

Adjust the quantizer (e.g. its step size) to the input signal power
Rate-Distortion Theorem



Shannon: there exists a mapping from source waveforms to code words such that, for a given distortion (error) D, R(D) bits per sample are sufficient to restore the signal with an average distortion arbitrarily close to D
R(D) is called the rate-distortion function (achievable lower bound)
Scalar quantization does not achieve this bound
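
As a concrete instance (an illustrative assumption, since the slide does not name a source model), the rate-distortion function of a memoryless Gaussian source with variance \sigma^2 under mean-squared-error distortion is:

```latex
R(D) =
\begin{cases}
\dfrac{1}{2}\log_2\dfrac{\sigma^2}{D}, & 0 < D \le \sigma^2 \\[1ex]
0, & D > \sigma^2
\end{cases}
\qquad\Longleftrightarrow\qquad
D(R) = \sigma^2\, 2^{-2R}
```

In this case each additional bit per sample lowers the achievable distortion by a factor of 4 (about 6 dB).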
Vector quantization




Encode a segment of the sampled analog signal (e.g. L samples) as a single vector
Use a codebook of n vectors
Partition the space of all possible L-dimensional sample vectors into regions of equal probability
Very efficient at very low rates (R = 0.5 bits per sample)
Learning codebook

LBG (Linde-Buzo-Gray): split areas (double the codebook)
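
A minimal sketch of codebook training by splitting, in the spirit of LBG: start from one codeword, split every codeword into two slightly perturbed copies, refine with nearest-neighbour and centroid steps, and repeat. The training data (correlated 2-D Gaussian pairs standing in for neighbouring samples), the additive perturbation eps, the fixed iteration count and all names are assumptions made for illustration.

```python
import random

def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def nearest(codebook, v):
    """Index of the codeword closest to v."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(codebook[i], v))

def centroid(vectors):
    return tuple(sum(column) / len(vectors) for column in zip(*vectors))

def refine(codebook, data, iterations=20):
    """Alternate nearest-codeword assignment and centroid update."""
    for _ in range(iterations):
        cells = [[] for _ in codebook]
        for v in data:
            cells[nearest(codebook, v)].append(v)
        # An empty cell keeps its previous codeword.
        codebook = [centroid(cell) if cell else old
                    for cell, old in zip(cells, codebook)]
    return codebook

def lbg(data, size, eps=0.01):
    """Grow a codebook by repeated splitting; size should be a power of two."""
    codebook = [centroid(data)]
    while len(codebook) < size:
        codebook = [tuple(c + sign * eps for c in codeword)
                    for codeword in codebook for sign in (1, -1)]
        codebook = refine(codebook, data)
    return codebook

def mean_distortion(codebook, data):
    """Average squared error when each vector is replaced by its codeword."""
    return sum(squared_distance(codebook[nearest(codebook, v)], v)
               for v in data) / len(data)

def make_pairs(count=400, seed=1):
    """Correlated 2-D vectors standing in for pairs of neighbouring samples."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(count):
        a = rng.gauss(0.0, 1.0)
        pairs.append((a, 0.9 * a + 0.3 * rng.gauss(0.0, 1.0)))
    return pairs

if __name__ == "__main__":
    data = make_pairs()
    for size in (1, 2, 4):
        print(size, mean_distortion(lbg(data, size), data))
```

A codebook of n vectors of dimension L costs log2(n) / L bits per sample, so the 4-entry, 2-dimensional codebook here corresponds to 1 bit per sample. Rates such as 0.5 bits per sample require longer vectors, for example 16 codewords for L = 8.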
Adaptive Differential Pulse Code Modulation

PCM
 Each sample is represented by its amplitude (8 bits)
 Standard telephony: 8,000 samples per second, 8 bits per sample = 64 kbps

DPCM
 Encode only the difference from the previous sample
 Smaller differences occur more often
 Use fewer bits (4 bits) to represent smaller differences but more bits (10 bits) to represent larger differences
DPCM and prediction
ADPCM


Use more complex prediction in the transmitter and receiver to estimate the next sample value
The transmitter sends only the difference between the estimate and the real value

Lossy codec: transmit approximate differences

Hopefully the difference will be small
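
The differential scheme above can be sketched as a simple DPCM loop. This is a deliberately reduced illustration: the predictor is just the previous reconstructed sample and the step size is fixed, whereas an ADPCM codec adapts both. The step size, the 4-bit code width, the test tone and all names are assumptions, not parameters of any standard.

```python
import math

def quantize_residual(difference, step, bits):
    """Uniform quantizer for the residual; returns a signed integer code."""
    max_code = 2 ** (bits - 1) - 1
    code = int(round(difference / step))
    return max(-max_code - 1, min(max_code, code))

def dpcm_encode(samples, step=0.02, bits=4):
    """Send only the quantized difference between each sample and its prediction."""
    codes, prediction = [], 0.0
    for s in samples:
        code = quantize_residual(s - prediction, step, bits)
        codes.append(code)
        # Predict from the reconstructed value so encoder and decoder stay in sync.
        prediction = prediction + code * step
    return codes

def dpcm_decode(codes, step=0.02):
    """Rebuild the signal by accumulating the received differences."""
    out, prediction = [], 0.0
    for code in codes:
        prediction = prediction + code * step
        out.append(prediction)
    return out

def make_tone(num_samples=400, rate_hz=8000.0, tone_hz=200.0, amplitude=0.5):
    """Hypothetical slowly varying test signal."""
    return [amplitude * math.sin(2 * math.pi * tone_hz * n / rate_hz)
            for n in range(num_samples)]

if __name__ == "__main__":
    tone = make_tone()
    decoded = dpcm_decode(dpcm_encode(tone))
    print(max(abs(s - d) for s, d in zip(tone, decoded)))
```

At 4 bits per sample and 8,000 samples per second this sketch would use 32 kbps, half the 64 kbps of 8-bit PCM. The reconstruction stays within half a step of the input only while the differences fit the 4-bit range; faster-changing signals would overload the fixed quantizer, which is what adaptation is meant to address.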
Frequency Domain Coding of Speech



Divide the speech signal into a set of frequency components
Quantize and encode each component separately
Control the number of bits (and hence the quality) allocated to each band
Sub-band coding

The human ear does not detect errors at all frequencies equally well
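
A minimal two-band sketch of the idea: split the signal into a low band and a high band, quantize each band with its own number of bits, and recombine. The Haar (sum and difference) filter pair is an assumption chosen because it reconstructs perfectly in a few lines; practical sub-band coders use longer filters and more bands. The bit split, band limits and names are likewise illustrative.

```python
import math

def analyze(x):
    """Haar analysis: pairwise averages (low band) and differences (high band)."""
    low = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]
    high = [(x[i] - x[i + 1]) / 2 for i in range(0, len(x) - 1, 2)]
    return low, high

def synthesize(low, high):
    """Haar synthesis: exact inverse of analyze for even-length input."""
    out = []
    for l, h in zip(low, high):
        out.extend((l + h, l - h))
    return out

def quantize(v, bits, limit):
    """Uniform midpoint quantizer on [-limit, limit]."""
    levels = 2 ** bits
    step = 2.0 * limit / levels
    index = max(0, min(levels - 1, int((v + limit) / step)))
    return -limit + (index + 0.5) * step

def subband_codec(x, low_bits=8, high_bits=4, low_limit=1.0, high_limit=0.25):
    """Quantize each band separately, then reconstruct the signal."""
    low, high = analyze(x)
    low_q = [quantize(v, low_bits, low_limit) for v in low]
    high_q = [quantize(v, high_bits, high_limit) for v in high]
    return synthesize(low_q, high_q)

def make_signal(num_samples=64, rate_hz=8000.0):
    """Hypothetical test signal: strong 300 Hz tone plus weak 3500 Hz tone."""
    return [0.6 * math.sin(2 * math.pi * 300.0 * n / rate_hz)
            + 0.05 * math.sin(2 * math.pi * 3500.0 * n / rate_hz)
            for n in range(num_samples)]

if __name__ == "__main__":
    x = make_signal()
    y = subband_codec(x)
    print(max(abs(a - b) for a, b in zip(x, y)))
```

Each band runs at half the input sample rate, so 8 and 4 bits per band sample average out to 6 bits per input sample. A real coder would choose the allocation from perceptual criteria rather than the fixed split assumed here.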
SBC
Vocoders





Model the speech signal generation process
The transmitter analyzes the voice signal according to an assumed model
The transmitter sends parameters derived from the analysis
The receiver synthesizes voice based on the received parameters
Vocoders are much more complex than waveform coders but achieve greater economy in bit rate
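
The analysis and synthesis steps above can be sketched with a linear-prediction model, assuming the vocal tract is an all-pole filter. The transmitter estimates the filter coefficients with the Levinson-Durbin recursion and would send only those (plus excitation parameters such as pitch and gain, omitted here); the receiver drives the same filter with its own excitation. The white-noise excitation, the order-2 model and all names are illustrative assumptions.

```python
import random

def autocorrelation(x, max_lag):
    """Autocorrelation sums r[0..max_lag] of a frame."""
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x)))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve for predictor coefficients a[1..order] in
    x[n] ~ a[1]*x[n-1] + ... + a[order]*x[n-order].
    Returns (coefficients, residual energy)."""
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / error
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        error *= 1.0 - k * k
    return a[1:], error

def analyze(frame, order):
    """Transmitter side: model parameters that would be sent."""
    return levinson_durbin(autocorrelation(frame, order), order)[0]

def synthesize(coeffs, excitation):
    """Receiver side: all-pole filter driven by an excitation sequence."""
    out = []
    for n, e in enumerate(excitation):
        past = sum(c * out[n - 1 - k]
                   for k, c in enumerate(coeffs) if n - 1 - k >= 0)
        out.append(e + past)
    return out

def make_frame(true_coeffs=(1.3, -0.6), num_samples=4000, seed=2):
    """Hypothetical 'speech' frame: noise passed through a known all-pole filter."""
    rng = random.Random(seed)
    excitation = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
    return synthesize(list(true_coeffs), excitation)

if __name__ == "__main__":
    print(analyze(make_frame(), order=2))
```

The estimated coefficients should land close to the (1.3, -0.6) used to generate the frame. Sending a handful of such parameters per frame instead of every sample is where the bit-rate saving comes from.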
Human Vocal Tract
Voice Generation Model
LPC
Advanced codecs

CELP
 Transmitter and receiver share a common pitch codebook
 Search for the most suitable pitch code
RELP
 Transmit model parameters
 Also transmit the residual (difference) signal
Mean Opinion Score Quality Rating
Codec MOS rating