Digital Systems: Hardware Organization and Design

Speech & Audio Processing
Veton Këpuska, 15 April 2020

Speech & Audio Coding Examples

A Simple Speech Coder
- LPC-Based Analysis Structure
- [Block diagram. Blocks: Audio Input, Preemphasis, Windowing Analysis, Linear Prediction Analysis (AutoCorrelation, Levinson-Durbin), Analysis Filter, Quantization. Signals: Filter Coeffs, Residual.]

Windowing Analysis Stage
- N: length of the analysis window, 10-30 msec

Some Analysis Windows
- [Figure: some analysis windows]

MATLAB Useful Functions
- wintool: use "doc wintool" for more information
- window: use "doc window" for the list of supported windows
- Define your own window if needed, e.g. the sine window and the Vorbis window, for n = 0, ..., N-1:

$$w[n] = \sin\left(\frac{\pi\,(n + 0.5)}{N}\right) \quad \text{(sine window)}$$

$$w[n] = \sin\left(\frac{\pi}{2}\,\sin^{2}\left(\frac{\pi\,(n + 0.5)}{N}\right)\right) \quad \text{(Vorbis window)}$$

LPC Analysis Stage
- LPC method described in: Ch5-Analysis_&_Synthesis_of_PoleZero_Speech_Models.ppt
- Summary:
  - Perform autocorrelation
  - Solve the system of equations with the Levinson-Durbin method
- MATLAB help: doc lpc, etc.
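The windowing and LPC analysis stages summarized above can be sketched in a few lines. The sketch below is in Python using only the standard library, not in MATLAB, so it is not the lpc function itself. The function names (sine_window, vorbis_window, autocorrelation, levinson_durbin) and the synthetic two-tone frame are assumptions made for illustration; the predictor convention is s_hat[n] = sum over j of alpha_j * s[n-j].

```python
import math

def sine_window(N):
    """Sine window: w[n] = sin(pi*(n + 0.5)/N), n = 0..N-1."""
    return [math.sin(math.pi * (n + 0.5) / N) for n in range(N)]

def vorbis_window(N):
    """Vorbis window: w[n] = sin((pi/2) * sin^2(pi*(n + 0.5)/N))."""
    return [math.sin(0.5 * math.pi * math.sin(math.pi * (n + 0.5) / N) ** 2)
            for n in range(N)]

def autocorrelation(x, p):
    """Short-time autocorrelation r[k] = sum_n x[n]*x[n+k], k = 0..p."""
    return [sum(x[n] * x[n + k] for n in range(len(x) - k))
            for k in range(p + 1)]

def levinson_durbin(r, p):
    """Solve the normal equations by the Levinson-Durbin recursion.

    Returns (alpha, k, err): predictor coefficients alpha_1..alpha_p,
    reflection (PARCOR) coefficients k_1..k_p, and the final
    prediction error energy.
    """
    a = [0.0] * (p + 1)          # a[j] holds a_j of the current order; a[0] unused
    k = [0.0] * (p + 1)
    err = r[0]
    for i in range(1, p + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k[i] = acc / err
        new_a = a[:]
        new_a[i] = k[i]
        for j in range(1, i):
            new_a[j] = a[j] - k[i] * a[i - j]
        a = new_a
        err *= (1.0 - k[i] ** 2)
    return a[1:], k[1:], err

# Demo on a synthetic frame (a made-up two-tone signal, not speech data).
N, p = 240, 4                    # e.g. 30 msec at an assumed 8 kHz sampling rate
frame = [math.sin(0.2 * n) + 0.5 * math.sin(0.9 * n) for n in range(N)]
windowed = [w * s for w, s in zip(sine_window(N), frame)]
alpha, k, err = levinson_durbin(autocorrelation(windowed, p), p)
print("alpha:", alpha)
print("reflection coefficients:", k)
print("prediction error energy:", err)
```

Both windows satisfy w[n]^2 + w[n + N/2]^2 = 1, which is the property that makes them usable with overlapped transforms; any other window from the list above could be substituted before the autocorrelation step.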
Example of MATLAB Code

    function myLPCCodec(wavfile, N)
    % myLPCCodec  LPC analysis and lossless resynthesis of a wav file.
    %   wavfile - input MS wav file
    %   N       - LPC filter order
    [x, fs] = audioread(wavfile);    % wavread was removed from recent MATLAB releases
    % plot(x);
    % Playing original signal (soundsc returns immediately, so wait for playback)
    soundsc(x, fs);
    pause(size(x, 1)/fs);
    % Performing LPC analysis using the MATLAB lpc function:
    % a holds the coefficients of A(z), g is the prediction error variance
    [a, g] = lpc(x, N);
    % Filtering with the estimated coefficients to produce predicted samples
    est_x = filter([0 -a(2:end)], 1, x);
    % Error (residual) signal, equal to filter(a, 1, x); it is unnormalized,
    % so it plays the role of g*e[n] in the synthesis equation
    e = x - est_x;
    % Testing the quality of predicted samples
    soundsc(est_x, fs);
    pause(size(x, 1)/fs);
    % Synthesis stage with zero loss of information:
    % drive the all-pole filter H(z) = 1/A(z) with the residual
    syn_x = filter(1, a, e);
    soundsc(syn_x, fs);

[Diagram: g e[n] -> H(z) = 1/A(z) -> ŝ[n]]

$$H(z) = \frac{1}{A(z)}$$

$$\hat{s}[n] = \sum_{k=1}^{p} \alpha_k\,\hat{s}[n-k] + g\,e[n]$$

Analysis of Quantization Errors
- Use MATLAB functions to research the effects of quantization errors introduced by the precision of the arithmetic operations and by the representation of the filter and error signal:
  - Double (float64) representation (software emulation)
  - Float (float32) representation (software emulation)
  - Int (int32) representation (hardware emulation)
  - Short (int16) representation (hardware emulation)
- Useful MATLAB functions: fix, floor, round, ceil
- Example: truncation of sig to B bits:

    sig_hat = fix(sig*2^(B-1))/2^(B-1);

Quantization of Error Signal & Filter Coefficients
- ADPCM can be applied to the error signal.
- Filter coefficients in the direct filter form are found to be sensitive to quantization errors:
  - A small quantization error can have a large effect on the filter characteristics.
  - The issue is that the polynomial coefficients map nonlinearly to the poles of the filter (i.e., the roots of the polynomial).
- Alternate representations are possible that have significantly better tolerance to quantization error.
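The truncation example and the sensitivity of direct-form coefficients noted above can be tried out with a short experiment. The sketch below is in Python (standard library only) rather than MATLAB; quantize, snr_db and second_order_poles are hypothetical helper names, and the test sinusoid and the pole radius and angle are made-up values chosen only for illustration.

```python
import cmath
import math

def quantize(values, B, mode="fix"):
    """Keep B-1 fractional bits, mirroring fix(sig*2^(B-1))/2^(B-1)."""
    rounders = {
        "fix": math.trunc,       # toward zero, like MATLAB fix
        "floor": math.floor,
        "ceil": math.ceil,
        # MATLAB round sends halves away from zero; Python's round() does not
        "round": lambda v: math.copysign(math.floor(abs(v) + 0.5), v),
    }
    scale = 2 ** (B - 1)
    return [rounders[mode](v * scale) / scale for v in values]

def snr_db(sig, sig_hat):
    """Signal-to-quantization-noise ratio in dB."""
    noise = sum((s - q) ** 2 for s, q in zip(sig, sig_hat))
    power = sum(s * s for s in sig)
    return float("inf") if noise == 0 else 10.0 * math.log10(power / noise)

def second_order_poles(a1, a2):
    """Roots of z^2 + a1*z + a2, the poles of 1/(1 + a1 z^-1 + a2 z^-2)."""
    d = cmath.sqrt(a1 * a1 - 4.0 * a2)
    return (-a1 + d) / 2.0, (-a1 - d) / 2.0

# Effect of word length on a test sinusoid (assumed amplitude 0.9).
sig = [0.9 * math.sin(2 * math.pi * 0.01 * n) for n in range(1000)]
for B in (16, 8):
    print(B, "bits, SNR in dB:", snr_db(sig, quantize(sig, B)))

# Effect of coefficient truncation on the poles of a second-order section.
radius, theta = 0.98, 0.05       # assumed pole radius and angle (radians)
a1, a2 = -2.0 * radius * math.cos(theta), radius * radius
q1, q2 = quantize([a1, a2], 8)
print("poles before:", second_order_poles(a1, a2))
print("poles after :", second_order_poles(q1, q2))
```

Comparing the pole magnitudes before and after truncation shows how a small change in a1 and a2 moves poles that sit close to the unit circle; repeating the run with other values of B or other modes gives a feel for the trade-off.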
LPC Filter Representations
- As noted previously when the Levinson-Durbin algorithm was introduced, one alternate representation of the filter coefficients was also mentioned: PARCOR coefficients.
- LPC to PARCOR:

$$a_j^{(p)} = \alpha_j, \quad 1 \le j \le p, \qquad k_p = a_p^{(p)}$$

$$\text{for } i = p, p-1, \ldots, 1: \quad a_j^{(i-1)} = \frac{a_j^{(i)} + a_i^{(i)}\,a_{i-j}^{(i)}}{1 - k_i^{2}}, \quad 1 \le j \le i-1, \qquad k_{i-1} = a_{i-1}^{(i-1)}$$

PARCOR Filter Representation
- PARCOR to LPC:

$$\text{for } i = 1, 2, \ldots, p: \quad a_i^{(i)} = k_i, \qquad a_j^{(i)} = a_j^{(i-1)} - k_i\,a_{i-j}^{(i-1)}, \quad 1 \le j \le i-1$$

$$\alpha_j = a_j^{(p)}, \quad 1 \le j \le p$$

Line Spectral Frequency Representation
- It turns out that PARCOR coefficients can be represented with LSFs, which have significantly better properties.
- Note that:

$$H(z) = \frac{1}{A(z)}$$

- The PARCOR lattice structure of the LPC synthesis filter above:
- [Lattice diagram: Input to Output through the stages A_p, A_{p-1}, ..., A_0 with reflection coefficients k_p, k_{p-1}, ..., delay elements z^{-1} and backward signals B_p, B_{p-1}, ..., B_0; an additional stage with k_{p+1} = ∓1.]

Line Spectral Frequency Representation
- From the previous slide the following holds:

$$A_{p-1}(z) = A_p(z) + k_p\,B_{p-1}(z)$$

$$B_p(z) = z^{-1}\left[B_{p-1}(z) - k_p\,A_{p-1}(z)\right]$$

$$A_0(z) = 1, \qquad B_0(z) = z^{-1}, \qquad B_p(z) = z^{-(p+1)}\,A_p(z^{-1})$$

- From this realization of the filter the LSP representation is derived:

LSF Representation

$$k_{p+1} = +1: \quad P_{p+1}(z) = A_p(z) - B_p(z)$$

$$k_{p+1} = -1: \quad Q_{p+1}(z) = A_p(z) + B_p(z)$$

$$A_p(z) = \frac{1}{2}\left[P_{p+1}(z) + Q_{p+1}(z)\right]$$

LPC Synthesis Filter with LSF

$$H(z) = \frac{1}{A(z)} = \frac{1}{1 + \left[A(z) - 1\right]} = \frac{1}{1 + \frac{1}{2}\left\{\left[P_{p+1}(z) - 1\right] + \left[Q_{p+1}(z) - 1\right]\right\}}$$

A Simple Speech Coder
- LPC-Based Synthesis Structure
- [Block diagram. Blocks: Decoding, Synthesis Filter, Deemphasis, Audio Output. Signals: Residual, Filter Coeffs.]

Audio Coding
- Most audio coding standards use principles of psychoacoustics.
- Example of the basic structure of an MP3 encoder:
- [Block diagram. Blocks: Audio Input, Filterbank & Transform, Quantization, Bit-stream, Psychoacoustic Model.]

Basic Structure of Audio Coders
- Filterbank processing
- Psychoacoustic model
- Quantization

Filter Bank Analysis Synthesis

Filterbank Processing
- Splitting the full-band signal into several subbands:
  - Uniform subbands (FFT)
  - Critical bands (FFT followed by a non-linear transformation)
    - Reflect the human auditory apparatus.
    - Mel-scale and Bark-scale transformations (f in Hz):

$$\text{Mel} = 1127.01048\,\ln\left(1 + \frac{f}{700}\right)$$

$$\text{Bark} = 13\,\arctan(0.00076\,f) + 3.5\,\arctan\left(\left(\frac{f}{7500}\right)^{2}\right)$$

Mel-Scale
- [Figure: Mel scale]

Bark-Scale
- [Figure: Bark scale]

Analysis Structure of Filterbank
- h_k[n]: impulse response of the kth quadrature mirror filter
- N: number of channels, typically 32
- ↓: down-sampling
- MDCT: Modified Discrete Cosine Transform
- [Block diagram: the Audio Input feeds N parallel branches h_1[n], ..., h_k[n], ..., h_N[n]; each branch is followed by ↓ and MDCT; the MDCT outputs go to Quantization, which produces the Bit Stream.]

Synthesis Structure of Filterbank
- g_k[n]: impulse response of the kth inverse quadrature mirror filter
- N: number of channels, typically 32
- ↑: up-sampling
- IMDCT: Inverse Modified Discrete Cosine Transform
- [Block diagram: the Bit Stream goes to Decoding, which feeds N parallel branches; each branch applies IMDCT, ↑ and g_1[n], ..., g_k[n], ..., g_N[n]; the branch outputs form the Audio Output.]

Psycho-Acoustic Modeling

Psychoacoustic Model
- Computes the masking threshold according to human auditory perception.
- The masking threshold is used to quantize the Discrete Cosine Transform coefficients.
- Analysis is done in the frequency domain, represented by the DFT and computed by the FFT.

Threshold of Hearing
- Absolute threshold of audibly perceptible events in quiet conditions (no other sounds).
- Any signal below the threshold can be removed without effect on perception.
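The Mel-scale and Bark-scale transformations given under Filterbank Processing above can be evaluated directly. The sketch below is in Python (standard library only); hz_to_mel, mel_to_hz, hz_to_bark and fft_bin_to_bark_band are hypothetical helper names, the inverse Mel mapping is simply the algebraic inverse of the formula above, and the FFT size and sampling rate in the demo are assumed values.

```python
import math

def hz_to_mel(f_hz):
    """Mel = 1127.01048 * ln(1 + f/700)."""
    return 1127.01048 * math.log(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Algebraic inverse of hz_to_mel."""
    return 700.0 * (math.exp(mel / 1127.01048) - 1.0)

def hz_to_bark(f_hz):
    """Bark = 13*arctan(0.00076 f) + 3.5*arctan((f/7500)^2)."""
    return (13.0 * math.atan(0.00076 * f_hz)
            + 3.5 * math.atan((f_hz / 7500.0) ** 2))

def fft_bin_to_bark_band(k, n_fft, fs_hz):
    """Critical-band index (integer part of the Bark value) of FFT bin k."""
    return int(hz_to_bark(k * fs_hz / n_fft))

# Demo: a few frequencies on both scales.
for f in (100.0, 500.0, 1000.0, 4000.0, 10000.0):
    print(f, "Hz ->", hz_to_mel(f), "mel,", hz_to_bark(f), "Bark")

# Demo: grouping uniform FFT bins into critical bands (assumed 512-point FFT, 44.1 kHz).
n_fft, fs = 512, 44100.0
bands = [fft_bin_to_bark_band(k, n_fft, fs) for k in range(n_fft // 2 + 1)]
print("number of critical bands covered:", bands[-1] + 1)
```

Grouping bins by the integer part of the Bark value is one simple way to realize the "FFT followed by non-linear transformation" idea; other band-edge definitions are equally possible.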
Threshold of Hearing
- [Figure: threshold of hearing]

Frequency Masking
- Schröder spreading function
- Bark scale function:

$$z(f) = 13\,\arctan(0.00076\,f) + 3.5\,\arctan\left(\left(\frac{f}{7500}\right)^{2}\right)$$

$$\Delta z = z(f_{\text{maskee}}) - z(f_{\text{masker}})$$

$$10\,\log_{10} F(\Delta z) = 15.81 + 7.5\,(\Delta z + 0.474) - 17.5\,\left[1 + (\Delta z + 0.474)^{2}\right]^{1/2}$$

Masking Curve
- [Figure: masking curve]

Primary Tone 1 kHz
- [Figure]

Masked Tone 900 Hz
- [Figure]

Combined Sound 1 kHz + 0.9 kHz
- [Figure]

Combined 1 kHz + 0.9 kHz (-10 dB)
- [Figure]

Combined 1 kHz + 5 kHz (-10 dB)
- [Figure]
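The Schröder spreading function and the Bark-scale function from the Frequency Masking slide can be evaluated for the tone pairs used in the figures (a 1 kHz masker with maskees at 900 Hz and 5 kHz). The sketch below is in Python (standard library only); hz_to_bark, spreading_db and masking_spread_db are hypothetical helper names, and the code only evaluates the formulas, it does not model the level-dependent details of a complete psychoacoustic model.

```python
import math

def hz_to_bark(f_hz):
    """z(f) = 13*arctan(0.00076 f) + 3.5*arctan((f/7500)^2)."""
    return (13.0 * math.atan(0.00076 * f_hz)
            + 3.5 * math.atan((f_hz / 7500.0) ** 2))

def spreading_db(dz):
    """10*log10 F(dz) = 15.81 + 7.5(dz + 0.474) - 17.5*sqrt(1 + (dz + 0.474)^2)."""
    u = dz + 0.474
    return 15.81 + 7.5 * u - 17.5 * math.sqrt(1.0 + u * u)

def masking_spread_db(f_masker_hz, f_maskee_hz):
    """Spreading function value (dB) at the maskee, relative to the masker."""
    dz = hz_to_bark(f_maskee_hz) - hz_to_bark(f_masker_hz)
    return spreading_db(dz)

# Demo: 1 kHz masker with the two maskee frequencies from the figures.
for f_maskee in (900.0, 5000.0):
    print("maskee", f_maskee, "Hz:", masking_spread_db(1000.0, f_maskee), "dB")
```

The function is close to 0 dB at zero Bark distance and falls off at roughly 25 dB per Bark below the masker and 10 dB per Bark above it, which is why a 900 Hz tone sits much nearer the 1 kHz masking curve than a 5 kHz tone does.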
