International Journal of Engineering Trends and Technology (IJETT) – Volume 11 Number 10 - May 2014
Speech Compression for LPC Using HCDC
Sonali S. Gunjotikar, Prof. Mohan H. Nerkar
Department of Electronics & Telecommunication
Government College of Engineering, Aurangabad.
Abstract— In this paper, we apply a Hamming Correction Code
Compressor (HCDC) to the frame parameters of Linear Prediction
Coding (LPC). A database of speech from male and female speakers
of different ages is created. The speech signal is segmented into
20 ms frames, and linear prediction analysis of order 12 is carried
out on each frame using the autocorrelation method. The frame
parameters are the linear prediction coefficients, the gain, and the
excitation bits, which are the DCT residual of the signal frame and
consist of 40 coefficients. The HCDC compresses the signal at
different parities and is able to compress it by up to 80%.
Keywords— Speech Compression, Lossless Speech Compression,
Linear Prediction Coding, Hamming Correction Code
Compressor, Discrete Cosine Transform.
I. INTRODUCTION
Linear prediction has been the mainstay of speech communication
technology for several decades and has been applied to speech
coding since at least 1971 [1]. It relies on several characteristics
of speech that follow from the fact that speech is produced by the
human muscular system. The muscles shape sounds through their
movements, which are limited by a maximum speed. Because muscles
cannot move infinitely fast, human speech remains pseudo-stationary
for about 30 ms [2]. The action of the glottis in generating pitch
spikes is often shorter than 30 ms, so the pitch needs to be removed
from the speech signal first, leaving a much lower-energy signal
called the residual. Pseudo-stationarity implies that 240 samples at
an 8 kHz sampling rate, being similar, can be parameterized by 12
linear prediction coefficients. The linear prediction coefficients
are the polynomial coefficients of a digital filter that, when driven
by a suitable input signal, recreates the characteristics of the
original samples. The output may not be identical to the original in
the time domain but, most importantly, its frequency response will
match that of the original. LPC has been used successfully by itself
in speech coding: the very low bit rate US Federal Standard 1015
algorithm at 2.4 kbit/s, developed in 1975, is based on LPC. LPC has
been adopted widely in modern communication standards such as
G.72x, GSMx, H.323, SIP and CDMA. LPC is a lossy compression
technique that reduces the bit rate from the PCM rate of 64 kbps.
Research on LPC focuses on four main issues: rate reduction, quality
enhancement, delay, and complexity reduction. A key parameter of
LPC is the system order, which is the number of terms in the normal
equation for a signal frame. The system order varies with the
application: for speech recognition it reaches up to 22 [3], and
speech morphing uses an order of 18 [4]. The lowest order reported
is 10, for the experimental codec LPC10 [5]. The common order for
speech communication is 12. The system excitation may be a pitch
impulse train [6], the signal residue of a Discrete Cosine
Transform [7] or of a Fast Fourier Transform [8], or Vector
Quantizer (VQ) codes [8].
ISSN: 2231-5381
Each signal in LPC is segmented into frames for reasons of
stationarity. The size of a frame in bits varies according to the
type of LPC being used. The frame data is {LPCoef, G, V/UV, T},
where LPCoef represents the linear prediction coefficients, G
represents the gain, and V/UV states whether the frame is voiced or
unvoiced. A voiced frame is synthesized with an impulse train, and
an unvoiced frame is synthesized with white noise. The most common
form of frame representation is {LPCoef, G, residual}, where the
residual represents the excitation of the system. The Hamming
Correction Code Compressor reduces the transmission rate for
LPC-coded voice. The algorithm was applied over the whole frame
information, which includes the coefficients, the gain and the DCT
excitation coefficients. All of the frame information is targeted
at once, and we then evaluated the compression rates.
II. LINEAR PREDICTION CODING
The importance of linear prediction lies in its accuracy with
respect to the basic model of speech. LPC of speech is a technique
for estimating the basic speech parameters, namely pitch, formants,
spectra, and vocal tract area functions, and for representing speech
for low bit rate transmission or storage.
The basic idea behind LPC is to express the signal as the output
of a basic linear digital filter. The filter equation is obtained
from the autoregression of the signal, which represents the signal's
normal equation. The autoregression of the signal x[n] is given in
equation (1).
$$x[n] = -a_1 x[n-1] - a_2 x[n-2] - \dots - a_M x[n-M] + v[n] \qquad (1)$$
The autoregression expresses the power spectral density of the
signal. In terms of the normal equation, the power spectral density
is represented by a set of highly correlated lags i that are
linearly independent of each other over n. The term v[n] represents
the error in the equation, expressed as white noise. By definition
of the autoregression, the smaller this error term, the more
accurately the equation describes the signal. In order to have a
wide-sense stationary signal, the signal is divided into frames of
length l, so that the autocorrelation can be calculated for each
frame and the autoregression of the signal can be formed. Rewriting
equation (1) in its normal form, we obtain
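$$
\begin{aligned}
r[0] + a_1 r[1] + \dots + a_M r[M] &= \sigma_v^2 \\
r[1] + a_1 r[0] + \dots + a_M r[M-1] &= 0 \\
&\;\;\vdots \\
r[M] + a_1 r[M-1] + \dots + a_M r[0] &= 0
\end{aligned}
\qquad (2)
$$
where $r[i]$ is the autocorrelation of the frame at lag $i$, $a_1, \dots, a_M$ are the coefficients of equation (1), and $\sigma_v^2$ is the power of the error term $v[n]$.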
The solution of the linear system in equation (2) is detailed in [9]
using the Levinson-Durbin recursion. The outcome of this operation
is a vector of length M holding the coefficients of equation (1),
where M is known as the LPC order. These are the coefficients of the
synthesis filter; with a proper excitation of the filter we recover
$\hat{x}[n]$, which is very similar to the original signal. The
synthesis filter equation is
$$H(z) = \frac{1}{A(z)} = \frac{1}{1 + \sum_{k=1}^{M} a_k z^{-k}} \qquad (3)$$
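The Levinson-Durbin recursion named above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names `autocorrelation` and `levinson_durbin` are hypothetical, and the sign convention is assumed to be x[n] + a_1 x[n-1] + ... + a_M x[n-M] = v[n], which matches normal equations whose lower rows equal zero.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[i] of one frame for lags 0..max_lag."""
    n = len(frame)
    return [sum(frame[t] * frame[t - lag] for t in range(lag, n))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for the predictor coefficients.

    Assumed convention: x[n] + a[1] x[n-1] + ... + a[M] x[n-M] = v[n].
    Returns (a, err) with a[0] == 1 and err the prediction error power.
    A silent frame (r[0] == 0) would divide by zero and needs a guard
    in real use.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        # Correlation between the current prediction error and lag m.
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        k_m = -acc / err                      # reflection coefficient
        prev = a[:]
        for k in range(1, m):
            a[k] = prev[k] + k_m * prev[m - k]
        a[m] = k_m
        err *= (1.0 - k_m * k_m)              # updated error power
    return a, err
```

For the order used in this paper one would call `levinson_durbin(autocorrelation(frame, 12), 12)` on each frame.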
The excitation signal may be Vector Quantization (VQ) codes, the
Discrete Cosine Transform (DCT), or the Fast Fourier Transform
(FFT) of the frame [8], according to the type of LPC system being
used.
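To illustrate how an excitation drives the synthesis filter, the sketch below runs the all-pole recursion implied by H(z) = 1/A(z). The function name `synthesize` and the `gain` parameter are assumptions made for illustration; the paper does not state where the gain G enters the filter.

```python
def synthesize(a, excitation, gain=1.0):
    """All-pole synthesis: x_hat[n] = gain*e[n] - sum_k a[k] * x_hat[n-k].

    `a` is the coefficient list with a[0] == 1 (assumed convention:
    A(z) = 1 + a[1] z^-1 + ... + a[M] z^-M).
    """
    out = []
    for n, e in enumerate(excitation):
        acc = gain * e
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * out[n - k]
        out.append(acc)
    return out
```

Feeding a single impulse through a first-order filter with a = [1, -0.5] gives the decaying response 1, 0.5, 0.25, 0.125, which is the expected impulse response of 1/(1 - 0.5 z^-1).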
III. HAMMING CORRECTION CODE COMPRESSOR
The Hamming Correction Code Compressor is based on the Hamming
correction code. Assume we have a set of data bits
$\{d_1, d_2, d_3, d_4, \ldots\}$ whose Hamming parity is
$\{p_1, p_2, p_3, \ldots\}$. For a 7-bit word the number of parity
bits is 3, so we simply transmit or save the d bits only and
calculate the parity at reception or decompression. This means we
can express a 7-bit word with 4 bits, saving 3 bits.
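The idea of keeping only the data bits and recomputing the parity can be sketched as below. The parity rules are chosen so that the codewords agree with Table I (data bits first, parity bits last). The function names are hypothetical, and returning `None` for a block that is not a valid codeword is an assumption: the text does not describe how such blocks are handled.

```python
def hamming74_parity(d):
    """Parity bits (p1, p2, p3) for data bits d = (d1, d2, d3, d4)."""
    d1, d2, d3, d4 = d
    return (d1 ^ d3 ^ d4, d1 ^ d2 ^ d4, d1 ^ d2 ^ d3)


def compress_block(word7):
    """Return the 4 data bits if word7 is a valid codeword, else None."""
    data, parity = tuple(word7[:4]), tuple(word7[4:])
    if hamming74_parity(data) == parity:
        return data
    return None  # assumption: non-codewords are left to the caller


def decompress_block(data4):
    """Rebuild the 7-bit word by recomputing the parity bits."""
    return tuple(data4) + hamming74_parity(tuple(data4))
```

A block such as 0001110 is stored as 0001 and restored exactly, so the saving of 3 bits out of 7 applies only to blocks that happen to be valid codewords.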
A. Encoding and Decoding with Hamming Code
The "Hamming (7, 4)" code (which takes k = 4 bits in and
furnishes n = 7 bits out) is listed in the following table.
TABLE I.
Input   Codeword   Weight
0000    0000000    0
0001    0001110    3
0010    0010101    3
0011    0011011    4
0100    0100011    3
0101    0101101    4
0110    0110110    4
0111    0111000    3
1000    1000111    4
1001    1001001    3
1010    1010010    3
1011    1011100    4
1100    1100100    3
1101    1101010    4
1110    1110001    4
1111    1111111    7
The setup is shown in figure (1).
Fig. 1. Encoding and Decoding with Hamming Code
The channel is a memoryless binary symmetric channel with
crossover probability q < 0.5:
Pr(output = 0 | input = 1) = Pr(output = 1 | input = 0) = q (channel error)
Pr(output = 0 | input = 0) = Pr(output = 1 | input = 1) = 1 − q (correct output)
While encoding, first calculate the probability of receiving each
bit correctly, for all values of i, when the four bits are sent over
the channel raw; then repeat the calculation at the channel output
when the four bits are first coded.
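One way to read the comparison above is as the probability that all four data bits arrive correctly, with and without coding. The sketch below works under that reading, which is an assumption; the function names are hypothetical. For the raw case all four bits must cross the channel unflipped. For the coded case, a (7, 4) Hamming decoder recovers the data exactly when at most one of the seven bits is flipped.

```python
def p_raw_correct(q, k=4):
    """Probability that k uncoded bits all cross the channel unflipped."""
    return (1.0 - q) ** k


def p_coded_correct(q, n=7):
    """Probability of at most one flip in an n-bit Hamming codeword,
    which is when single-error correction recovers the data."""
    return (1.0 - q) ** n + n * q * (1.0 - q) ** (n - 1)
```

With q = 0.1 the raw probability is about 0.656 and the coded probability is about 0.850, which illustrates the benefit of coding on a noisy channel.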
1) Encoding a (7, 4) Hamming Code:
$$v = G \times u$$
u is the source sequence and G is the generator matrix of the code.
Check bits occupy positions that are powers of 2. The number of
parity check bits needed is given by the Hamming rule, a function of
the number of information bits transmitted. The matrix equation is
$$G = [P \;\; I_4]$$
where $I_4$ is the 4×4 identity matrix and P is the 4×3 parity
submatrix. v is the transmitted codeword.
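A minimal sketch of generator-matrix encoding follows. The rows of `P` are read off Table I, and the columns of `G` are ordered with the identity block first so that the output reproduces the table's codewords; the text's matrix equation lists P first, which differs only in column order. The names `P`, `G` and `encode` are illustrative.

```python
# Parity submatrix: row i holds the parity bits produced by data bit i alone,
# taken from the Table I codewords for inputs 1000, 0100, 0010, 0001.
P = [[1, 1, 1],
     [0, 1, 1],
     [1, 0, 1],
     [1, 1, 0]]

# Systematic generator matrix, identity block first (assumed ordering).
G = [[int(i == j) for j in range(4)] + P[i] for i in range(4)]


def encode(u):
    """Multiply the 4-bit source sequence u by G over GF(2)."""
    return [sum(u[i] * G[i][j] for i in range(4)) % 2 for j in range(7)]


def weight(v):
    """Hamming weight: the number of ones in the codeword."""
    return sum(v)
```

Encoding all 16 inputs this way gives a minimum nonzero weight of 3, consistent with the Weight column of Table I.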
2) Decoding (7, 4) Hamming Code:
In this special case of a linear code with a binary symmetric
channel, the decoding task can be represented as syndrome decoding.
$$s = H \times r$$
s is the syndrome, H is the parity check matrix, and r is the
received vector. The signal is then compressed according to the
parity, by dividing the window frame by the parity.
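Syndrome decoding for the code of Table I can be sketched as follows. The parity check matrix is built as H = [P^T | I_3], which is the standard counterpart of a systematic generator with the data bits first; that layout is an assumption drawn from the table, and the names are illustrative.

```python
P = [[1, 1, 1],
     [0, 1, 1],
     [1, 0, 1],
     [1, 1, 0]]

# H = [P^T | I_3]: three rows, seven columns.
H = [[P[j][i] for j in range(4)] + [int(i == k) for k in range(3)]
     for i in range(3)]


def syndrome(r):
    """Compute s = H r over GF(2) for a received 7-bit vector r."""
    return [sum(H[i][j] * r[j] for j in range(7)) % 2 for i in range(3)]


def correct(r):
    """Flip the single bit whose column of H equals the syndrome."""
    s = syndrome(r)
    fixed = list(r)
    if any(s):
        for j in range(7):
            if [H[i][j] for i in range(3)] == s:
                fixed[j] ^= 1
                break
    return fixed
```

A zero syndrome means the received word is a valid codeword; any nonzero syndrome matches exactly one column of H, which identifies the bit to flip.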
Compression rate
=
where k is the number of parity bits, k = 1, 2, 3, ..., 7, 8.
Thus, if the code is to be compressed 8 times, k is set to 8, and
the compression rate becomes {
The compression rate is calculated against the given parity on
every count. The overall compression is then calculated for each
sample and reported at the last iteration.
IV. METHODOLOGY AND EXPERIMENT DESIGN
The database is created by recording the voices of male and female
speakers of different ages. The experiment was carried out on
several data sets; the men's voice signals were labelled A, B, C, D,
E, F, G, H, I and J. Each sample signal was segmented into 20 ms
frames. For each signal we used 8-bit resolution at 8 kHz sampling.
For each frame we compute the LP coefficients up to order 12, the
first 40 DCT residual coefficients quantized with 4 bits each, and
the gain value.
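The segmentation step can be sketched as below. At 8 kHz, a 20 ms frame holds 160 samples. The function name `frame_signal` is hypothetical, and dropping a trailing partial frame is an assumption, since the text does not say how leftover samples are treated.

```python
def frame_signal(samples, sample_rate_hz=8000, frame_ms=20):
    """Split a signal into consecutive, non-overlapping frames.

    Assumption: a trailing partial frame is discarded.
    """
    frame_len = sample_rate_hz * frame_ms // 1000  # 160 for 20 ms at 8 kHz
    n_frames = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len]
            for i in range(n_frames)]
```

One second of audio at these settings yields 50 frames.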
V. RESULT
For all sample signals, compression is observed at different
parity values. From figure (3) we can say that the compression is
80% at parity 5.
TABLE II
LPC UNCOMPRESSED FRAME
Contents            Number of Values   Minimal Representation of bits   Total bits
LPC Coefficients    12                 8                                96
DCT Coefficients    40                 4                                160
Gain                1                  5                                5
Total Frame                                                             261
The transmission rate for 261 bit/frame is equal to 5.22 kbps.
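The frame budget of Table II and the resulting bit rate can be checked with a few lines of arithmetic. The names below are illustrative. The frame duration behind the quoted rate is not stated at this point in the text, so the sketch takes it as a parameter: 261 bits every 50 ms gives 5.22 kbps, while 261 bits every 20 ms (the frame length of Section IV) gives 13.05 kbps.

```python
# (number of values, bits per value) for each field of Table II.
FRAME_FIELDS = {
    "lpc_coefficients": (12, 8),
    "dct_coefficients": (40, 4),
    "gain": (1, 5),
}


def frame_bits(fields=FRAME_FIELDS):
    """Total number of bits in one uncompressed frame."""
    return sum(count * bits for count, bits in fields.values())


def rate_kbps(bits_per_frame, frame_ms):
    """Bits per millisecond is numerically equal to kilobits per second."""
    return bits_per_frame / frame_ms
```

`frame_bits()` returns 261, matching the table's total.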
Fig. 3: Compressed Waveform
For each frame with the data in Table II, we went through the steps
in Fig. 2 in order to evaluate the compression performance at
different parities and at different iterations. Table III shows some
cases achieved by HCDC on 10 different signals. Ten different
sentences of recorded men's voices went through the LPC extraction
technique, which determines the number of frames, gain, and error of
each signal.
TABLE III
Sr. No.   Sample Signal   Technique   Gain      No. of Frames   Error
1         A               LPC         21.1451   3892            1.48598e-005
2         B               LPC         9.60389   4954            -9.3945e-006
3         C               LPC         18.2793   4429            1.16041e-005
4         D               LPC         18.0501   4672            1.11497e-005
5         E               LPC         18.0768   5952            1.16014e-005
6         F               LPC         8.86326   3431            1.47836e-005
7         G               LPC         12.8114   3264            2.00031e-005
8         H               LPC         17.1401   5556            1.18287e-005
9         I               LPC         21.335    7156            1.49501e-005
10        J               LPC         19.1867   7348            1.41608e-005
VI. CONCLUSION
In this paper, we apply the Hamming Correction Code Compressor
(HCDC) to the frame parameters of Linear Prediction. These
parameters include the linear predictor coefficients, the gain, and
the excitation bits, which are the DCT residual of the signal frame
and consist of 40 coefficients, each quantized using 4 bits. For the
signals used in the experiments, the total number of bits in a frame
was 261, with a transmission rate of 5.22 kbps. For each sample in
the dataset of male and female speakers, we segmented the samples
into short frames of 20 ms for processing and performed compression
over these frames. The transmission rate was reduced by 80% on
average by using HCDC.
REFERENCES
[1] J. Markel and A. Gray, Linear Prediction of Speech. Springer-Verlag, 1976.
[2] I. McLoughlin, Applied Speech and Audio Processing.
[3] W. Chou, "Pattern Recognition in Speech and Language Processing",
CRC Press, Boca Raton, FL, 2002.
[4] A. Kain and M. W. Macon, "Design and Evaluation of a Voice
Conversion Algorithm Based on Spectral Envelope Mapping and Residual
Prediction", Center for Spoken Language Understanding (CSLU), Oregon
Graduate Institute, USA, 2000.
[5] B. W. Wah, "LSP-Based Multiple-Description Coding for Real-Time
Low Bit-Rate Voice over IP", IEEE Transactions on Multimedia, vol. 7,
no. 1, February 2005.
[6] A. V. Rao et al., "Pitch Adaptive Window for Improved Excitation
Coding in Low Rate CELP Coder", Transaction on Speech and Processing,
vol. 4, no. 6, IEEE, 2006.
[7] E. Lam, "Analysis of DCT Coefficient Distribution", IEEE Signal
Processing Letters, vol. 11, no. 2, IEEE, 2004.
[8] Series G: Transmission Systems and Media, Digital Systems and
Networks, Digital Terminal Equipments, Coding of Voice and Audio
Signals, "Coding of Speech at 8 kbit/s Using Conjugate-Structure
Algebraic-Code-Excited Linear Prediction", Telecommunication
Standardization Sector of ITU, 06/2012.
[9] Chu, "Speech Coding Algorithms", Wiley, New York, 2003.
http://www.ijettjournal.org