Presentation on perceptual distortion

advertisement
Perceptual Distortion-Rate Optimization of
Long Term Prediction in MPEG AAC
Tejaswi Nanjundaswamy, Vinay Melkote, Emmanuel Ravelli and
Kenneth Rose
Signal Compression Lab
Department of ECE
UCSB
November 7, 2010
Outline
1
Introduction
2
Current approach for LTP parameter value selection
3
Proposed RD optimization of LTP parameter values
4
Results
Outline
1
Introduction
2
Current approach for LTP parameter value selection
3
Proposed RD optimization of LTP parameter values
4
Results
Audio Coding
Audio Signal
Audio Coding
Signal divided into overlapping frames
Audio Coding
Frame transformed
Audio Coding
Frame transformed
Audio Coding
Transform coecients split into bands
Audio Coding
Transform coecients split into bands
Audio Coding
Bands quantized and coded to generate bitstream
Audio Coding
Coding problem denition: Achieve minimum perceptual distortion at
a given rate
Perceptual?
Based on content, human brain can tolerate (or mask) variable amount
of noise in each band
Captured in distortion measure as Maximum Noise to Mask Ratio
(MNMR)
MNMR =
Quantization noise energy
∀ bands
Masking threshold
max
Masking threshold estimated via psycho-acoustic analysis of input
frame
Audio Coding
Coding problem denition: Achieve minimum perceptual distortion at
a given rate
Perceptual?
Based on content, human brain can tolerate (or mask) variable amount
of noise in each band
Captured in distortion measure as Maximum Noise to Mask Ratio
(MNMR)
MNMR =
Quantization noise energy
∀ bands
Masking threshold
max
Masking threshold estimated via psycho-acoustic analysis of input
frame
Audio Coding
Coding problem denition: Achieve minimum perceptual distortion at
a given rate
Perceptual?
Based on content, human brain can tolerate (or mask) variable amount
of noise in each band
Captured in distortion measure as Maximum Noise to Mask Ratio
(MNMR)
MNMR =
Quantization noise energy
∀ bands
Masking threshold
max
Masking threshold estimated via psycho-acoustic analysis of input
frame
Audio Coding
Coding problem denition: Achieve minimum perceptual distortion at
a given rate
Perceptual?
Based on content, human brain can tolerate (or mask) variable amount
of noise in each band
Captured in distortion measure as Maximum Noise to Mask Ratio
(MNMR)
MNMR =
Quantization noise energy
∀ bands
Masking threshold
max
Masking threshold estimated via psycho-acoustic analysis of input
frame
Audio Coding
Coding problem denition: Achieve minimum perceptual distortion at
a given rate
Perceptual?
Based on content, human brain can tolerate (or mask) variable amount
of noise in each band
Captured in distortion measure as Maximum Noise to Mask Ratio
(MNMR)
MNMR =
Quantization noise energy
∀ bands
Masking threshold
max
Masking threshold estimated via psycho-acoustic analysis of input
frame
Audio Coding
MDCT exploits redundancies within the current frame
Audio Coding
MDCT exploits redundancies within the current frame
Audio signal has a repeating pattern
Audio Coding
MDCT exploits redundancies within the current frame
Audio signal has a repeating pattern
Previously reconstructed data available at decoder
Audio Coding
MDCT exploits redundancies within the current frame
Audio signal has a repeating pattern
Previously reconstructed data available at decoder
Can we exploit this correlation?
MPEG AAC LTP
MPEG AAC uses the Long Term Prediction (LTP) tool
MPEG AAC LTP
MPEG AAC uses the Long Term Prediction (LTP) tool
Predicts current frame from history
MPEG AAC LTP
MPEG AAC uses the Long Term Prediction (LTP) tool
Predicts current frame from history
Reference position indicated via lag index
MPEG AAC LTP
MPEG AAC uses the Long Term Prediction (LTP) tool
Predicts current frame from history
Reference position indicated via lag index
Waveforms matched via gain factor
MPEG AAC LTP
MPEG AAC uses the Long Term Prediction (LTP) tool
Predicts current frame from history
Reference position indicated via lag index
Waveforms matched via gain factor
Transformed
MPEG AAC LTP
Transformed coecients split into bands
MPEG AAC LTP
Transformed coecients split into bands
Prediction residue generated
MPEG AAC LTP
Transformed coecients split into bands
Prediction residue generated
Compared with original
MPEG AAC LTP
Transformed coecients split into bands
Prediction residue generated
Compared with original
Per band LTP ag set
MPEG AAC LTP
Transformed coecients split into bands
Prediction residue generated
Compared with original
Per band LTP ag set
Per frame ag indicates if LTP is used at all in current frame
MPEG AAC LTP
The overall LTP parameter set includes
Lag index
Gain factor
Per band LTP ag
Per frame LTP ag
Outline
1
Introduction
2
Current approach for LTP parameter value selection
3
Proposed RD optimization of LTP parameter values
4
Results
Current approach
Lag and gain selected to minimize a mean squared prediction error cost
Current approach
Lag and gain selected to minimize a mean squared prediction error cost
The solution results in following choice of lag (L)
2K −1
∑ x [m ]z [m + 2 K − L 0 ]
L = arg max ms=0
L0 ∈[0,2K )
2K −1
∑
m=0
z 2 [m + 2K − L0 ]
Current approach
Lag and gain selected to minimize a mean squared prediction error cost
The solution results in following choice of lag (L)
2K −1
∑ x [m ]z [m + 2 K − L 0 ]
L = arg max ms=0
L0 ∈[0,2K )
2K −1
∑
m=0
z 2 [m + 2K − L0 ]
Current approach
Lag and gain selected to minimize a mean squared prediction error cost
The solution results in following choice of lag (L)
2K −1
∑ x [m ]z [m + 2 K − L 0 ]
L = arg max ms=0
L0 ∈[0,2K )
2K −1
∑
m=0
z 2 [m + 2K − L0 ]
Current approach
Lag and gain selected to minimize a mean squared prediction error cost
The solution results in following choice of lag (L)
2K −1
∑ x [m ]z [m + 2 K − L 0 ]
L = arg max ms=0
L0 ∈[0,2K )
2K −1
∑
m=0
z 2 [m + 2K − L0 ]
Current approach
Gain (G) calculated as
2K −1
G=
∑
x [m]z [m + 2K − L]
m=0
2K −1
∑
m=0
z 2 [m + 2K − L]
Gain further quantized to one of the 8 levels
Current approach
For each band, LTP ag chosen as
1, if Energy of prediction residue < Energy of original coefficients
0, otherwise
The per frame ag is set if heuristic bit savings due to LTP >
side-information rate of LTP
Current approach
For each band, LTP ag chosen as
1, if Energy of prediction residue < Energy of original coefficients
0, otherwise
The per frame ag is set if heuristic bit savings due to LTP >
side-information rate of LTP
Current approach
Given all LTP parameters, core AAC parameters are selected via a
two-loop search (TLS)
For every band, an inner loop nds quantization step size for a target
distortion criterion
The outer loop then nds Human code books that minimize the bits
to encode and if this doesn't meet the rate constraint for the frame,
the target distortion is changed and inner loop repeated
Current approach
Given all LTP parameters, core AAC parameters are selected via a
two-loop search (TLS)
For every band, an inner loop nds quantization step size for a target
distortion criterion
The outer loop then nds Human code books that minimize the bits
to encode and if this doesn't meet the rate constraint for the frame,
the target distortion is changed and inner loop repeated
Current approach
Given all LTP parameters, core AAC parameters are selected via a
two-loop search (TLS)
For every band, an inner loop nds quantization step size for a target
distortion criterion
The outer loop then nds Human code books that minimize the bits
to encode and if this doesn't meet the rate constraint for the frame,
the target distortion is changed and inner loop repeated
Motivation
Objective results for reference AAC coder without LTP
Single instrument file 1
Single instrument file 2
−10
AMNMR in dB
AMNMR in dB
−9
−10
−11
−12
−13
−12
−14
−16
nopred−TLS
nopred−TLS
−18
20
25
30
bitrate (kbps)
35
30
35
40
bitrate (kbps)
45
Motivation
Objective results for reference AAC coder without LTP
Motivation
Objective results for reference AAC coder without LTP
Motivation
Objective results for reference AAC coder without LTP
Single instrument file 1
Single instrument file 2
−10
AMNMR in dB
AMNMR in dB
−9
−10
−11
−12
−13
−12
−14
−16
nopred−TLS
nopred−TLS
−18
20
25
30
bitrate (kbps)
35
30
35
40
bitrate (kbps)
45
Motivation
Objective results for reference AAC coder with and without LTP
Single instrument file 1
Single instrument file 2
−10
AMNMR in dB
AMNMR in dB
−9
−10
−11
−12
−12
−14
−16
nopred−TLS
stdLTP−TLS
−13
nopred−TLS
stdLTP−TLS
−18
20
25
30
bitrate (kbps)
35
30
35
40
bitrate (kbps)
45
Motivation
We know that TLS is sub-optimal for core AAC parameters selection
Could this be the reason for the poor RD performance?
Replace TLS with RD optimal Trellis based core AAC parameters
selection [Aggarwal et al. 2006]
Motivation
We know that TLS is sub-optimal for core AAC parameters selection
Could this be the reason for the poor RD performance?
Replace TLS with RD optimal Trellis based core AAC parameters
selection [Aggarwal et al. 2006]
Motivation
Objective results for Trellis based AAC coder with and without LTP
Single instrument file 1
Single instrument file 2
−16
AMNMR in dB
AMNMR in dB
−16
−18
−20
−18
−20
−22
−24
nopred−Trel
stdLTP−Trel
−22
20
25
bitrate (kbps)
−26
30
nopred−Trel
stdLTP−Trel
30
35
bitrate (kbps)
40
Motivation
LTP tool is expected to give improvements for single instrument
harmonic les
Shortcomings attributed to the sub-optimal LTP parameter selection
RD optimal approach has to select all encoder parameters with the
objective of minimizing perceptual distortion for a given rate
Current approach clearly sub-optimal as LTP parameters selected to
minimize mean squared prediction error, and independent of core AAC
parameters
Lag and gain selection ignores eventual prediction switching o in
select bands
Time domain lag and gain selection eectively considers all transform
coecients
Lag and gain thus selected not the best when considering a reduced set
of coecients
The heuristically estimated bit savings due to LTP doesn't match
actual bit savings reected after the quantization and coding process
Motivation
LTP tool is expected to give improvements for single instrument
harmonic les
Shortcomings attributed to the sub-optimal LTP parameter selection
RD optimal approach has to select all encoder parameters with the
objective of minimizing perceptual distortion for a given rate
Current approach clearly sub-optimal as LTP parameters selected to
minimize mean squared prediction error, and independent of core AAC
parameters
Lag and gain selection ignores eventual prediction switching o in
select bands
Time domain lag and gain selection eectively considers all transform
coecients
Lag and gain thus selected not the best when considering a reduced set
of coecients
The heuristically estimated bit savings due to LTP doesn't match
actual bit savings reected after the quantization and coding process
Motivation
LTP tool is expected to give improvements for single instrument
harmonic les
Shortcomings attributed to the sub-optimal LTP parameter selection
RD optimal approach has to select all encoder parameters with the
objective of minimizing perceptual distortion for a given rate
Current approach clearly sub-optimal as LTP parameters selected to
minimize mean squared prediction error, and independent of core AAC
parameters
Lag and gain selection ignores eventual prediction switching o in
select bands
Time domain lag and gain selection eectively considers all transform
coecients
Lag and gain thus selected not the best when considering a reduced set
of coecients
The heuristically estimated bit savings due to LTP doesn't match
actual bit savings reected after the quantization and coding process
Motivation
LTP tool is expected to give improvements for single instrument
harmonic les
Shortcomings attributed to the sub-optimal LTP parameter selection
RD optimal approach has to select all encoder parameters with the
objective of minimizing perceptual distortion for a given rate
Current approach clearly sub-optimal as LTP parameters selected to
minimize mean squared prediction error, and independent of core AAC
parameters
Lag and gain selection ignores eventual prediction switching o in
select bands
Time domain lag and gain selection eectively considers all transform
coecients
Lag and gain thus selected not the best when considering a reduced set
of coecients
The heuristically estimated bit savings due to LTP doesn't match
actual bit savings reected after the quantization and coding process
Motivation
LTP tool is expected to give improvements for single instrument
harmonic les
Shortcomings attributed to the sub-optimal LTP parameter selection
RD optimal approach has to select all encoder parameters with the
objective of minimizing perceptual distortion for a given rate
Current approach clearly sub-optimal as LTP parameters selected to
minimize mean squared prediction error, and independent of core AAC
parameters
Lag and gain selection ignores eventual prediction switching o in
select bands
Time domain lag and gain selection eectively considers all transform
coecients
Lag and gain thus selected not the best when considering a reduced set
of coecients
The heuristically estimated bit savings due to LTP doesn't match
actual bit savings reected after the quantization and coding process
Extension
Trellis based approach RD optimal for core AAC parameter selection
Extension
All possible LTP parameter combinations formed
Each case RD evaluated via Trellis
Case which minimizes the distortion for a give rate forms nal choice
Computationally prohibitive as LTP adds signicantly more choices of
parameters for
gain (8)
lag (frame length)
per band LTP ags (2 power number of bands)
Extension
Trellis based approach RD optimal for core AAC parameter selection
Extension
All possible LTP parameter combinations formed
Each case RD evaluated via Trellis
Case which minimizes the distortion for a give rate forms nal choice
Computationally prohibitive as LTP adds signicantly more choices of
parameters for
gain (8)
lag (frame length)
per band LTP ags (2 power number of bands)
Extension
Trellis based approach RD optimal for core AAC parameter selection
Extension
All possible LTP parameter combinations formed
Each case RD evaluated via Trellis
Case which minimizes the distortion for a give rate forms nal choice
Computationally prohibitive as LTP adds signicantly more choices of
parameters for
gain (8)
lag (frame length)
per band LTP ags (2 power number of bands)
Outline
1
Introduction
2
Current approach for LTP parameter value selection
3
Proposed RD optimization of LTP parameter values
4
Results
Proposed approach
We achieve computational eciency by
Limiting LTP lag and gain parameter space by careful selection of
prediction survivors
Retains the simplicity of time domain lag and gain calculation
Limiting number of LTP parameter combinations also limits the
number of full RD evaluations
Full RD evaluation enables selection of encoder parameters aligned with
the end objective of minimizing perceptual distortion for a given rate
Trellis approach, which operates in frequency domain, for selecting the
band wise quantization and coding parameters, is extended to select
the per band LTP ags as well
Proposed approach
We achieve computational eciency by
Limiting LTP lag and gain parameter space by careful selection of
prediction survivors
Retains the simplicity of time domain lag and gain calculation
Limiting number of LTP parameter combinations also limits the
number of full RD evaluations
Full RD evaluation enables selection of encoder parameters aligned with
the end objective of minimizing perceptual distortion for a given rate
Trellis approach, which operates in frequency domain, for selecting the
band wise quantization and coding parameters, is extended to select
the per band LTP ags as well
Proposed approach
We achieve computational eciency by
Limiting LTP lag and gain parameter space by careful selection of
prediction survivors
Retains the simplicity of time domain lag and gain calculation
Limiting number of LTP parameter combinations also limits the
number of full RD evaluations
Full RD evaluation enables selection of encoder parameters aligned with
the end objective of minimizing perceptual distortion for a given rate
Trellis approach, which operates in frequency domain, for selecting the
band wise quantization and coding parameters, is extended to select
the per band LTP ags as well
Proposed approach
We achieve computational eciency by
Limiting LTP lag and gain parameter space by careful selection of
prediction survivors
Retains the simplicity of time domain lag and gain calculation
Limiting number of LTP parameter combinations also limits the
number of full RD evaluations
Full RD evaluation enables selection of encoder parameters aligned with
the end objective of minimizing perceptual distortion for a given rate
Trellis approach, which operates in frequency domain, for selecting the
band wise quantization and coding parameters, is extended to select
the per band LTP ags as well
Proposed approach
We achieve computational eciency by
Limiting LTP lag and gain parameter space by careful selection of
prediction survivors
Retains the simplicity of time domain lag and gain calculation
Limiting number of LTP parameter combinations also limits the
number of full RD evaluations
Full RD evaluation enables selection of encoder parameters aligned with
the end objective of minimizing perceptual distortion for a given rate
Trellis approach, which operates in frequency domain, for selecting the
band wise quantization and coding parameters, is extended to select
the per band LTP ags as well
Proposed approach
We achieve computational eciency by
Limiting LTP lag and gain parameter space by careful selection of
prediction survivors
Retains the simplicity of time domain lag and gain calculation
Limiting number of LTP parameter combinations also limits the
number of full RD evaluations
Full RD evaluation enables selection of encoder parameters aligned with
the end objective of minimizing perceptual distortion for a given rate
Trellis approach, which operates in frequency domain, for selecting the
band wise quantization and coding parameters, is extended to select
the per band LTP ags as well
Prediction survivors
P lag indices with the highest normalized cross-correlation are retained
2K −1
R [L] =
∑
x [m]z [m + 2K − L]
m
s=0
2K −1
∑
m=0
z 2 [m + 2K − L]
Prediction survivors
P lag indices with the highest normalized cross-correlation are retained
2K −1
R [L] =
∑
x [m]z [m + 2K − L]
m
s=0
2K −1
∑
m=0
z 2 [m + 2K − L]
Prediction survivors
P lag indices with the highest normalized cross-correlation are retained
2K −1
R [L] =
∑
x [m]z [m + 2K − L]
m
s=0
2K −1
∑
m=0
z 2 [m + 2K − L]
Prediction survivors
P lag indices with the highest normalized cross-correlation are retained
2K −1
R [L] =
∑
x [m]z [m + 2K − L]
m
s=0
2K −1
∑
m=0
z 2 [m + 2K − L]
Prediction survivors
P lag indices with the highest normalized cross-correlation are retained
2K −1
R [L] =
∑
x [m]z [m + 2K − L]
m
s=0
2K −1
∑
m=0
z 2 [m + 2K − L]
Prediction survivors
P lag indices with the highest normalized cross-correlation are retained
2K −1
R [L] =
∑
x [m]z [m + 2K − L]
m
s=0
2K −1
∑
m=0
z 2 [m + 2K − L]
Prediction survivors
Forming the P lag survivors
Prediction survivors
Gain value for each of these lags found
2K −1
∑
x [m]z [m + 2K − L]
G[L] = m2=K0−1
∑
m=0
z 2 [m + 2K − L]
Prediction survivors
Closest Q quantization levels to each gain value are retained
Prediction survivors
Closest Q quantization levels to each gain value are retained
Prediction survivors
Closest Q quantization levels to each gain value are retained
Prediction survivors
To form the overall PQ survivorsg
Transformation
For each survivor, prediction residue is calculated and transformedg
RD evaluation
Each of these are rate-distortion evaluated via Trellisg
Per frame LTP ag
To nd per frame ag, current frame is transformed and RD evaluated
Final selection
Parameters resulting in minimum distortion for the given rate chosen
Trellis optimization
Trellis with stages for each band l , states in each stage for every
combination of per band LTP ags, quantization and coding
parameter values
Trellis optimization
Each state associated with corresponding distortion and rate costs
Transition between states associated with costs to dierentially encode
quantization and coding parameters
Dynamic programming pursued to nd optimal path through trellis
This path corresponds to optimal set of per band parameters
Trellis optimization
Each state associated with corresponding distortion and rate costs
Transition between states associated with costs to dierentially encode
quantization and coding parameters
Dynamic programming pursued to nd optimal path through trellis
This path corresponds to optimal set of per band parameters
Trellis optimization
Each state associated with corresponding distortion and rate costs
Transition between states associated with costs to dierentially encode
quantization and coding parameters
Dynamic programming pursued to nd optimal path through trellis
This path corresponds to optimal set of per band parameters
Trellis optimization
Each state associated with corresponding distortion and rate costs
Transition between states associated with costs to dierentially encode
quantization and coding parameters
Dynamic programming pursued to nd optimal path through trellis
This path corresponds to optimal set of per band parameters
Low complexity variant
For low complexity Trellis replaced with Two Loop Search
Low complexity variant
For low complexity Trellis replaced with Two Loop Search
Outline
1
Introduction
2
Current approach for LTP parameter value selection
3
Proposed RD optimization of LTP parameter values
4
Results
Objective evaluation results
For Trellis based AAC coder with and without LTPQpg
Single instrument file 1
Single instrument file 2
−16
AMNMR in dB
AMNMR in dB
−16
−18
−20
−18
−20
−22
−24
nopred−Trel
stdLTP−Trel
−22
20
25
bitrate (kbps)
−26
30
nopred−Trel
stdLTP−Trel
30
35
bitrate (kbps)
40
Objective evaluation results
Along with results for the proposed coder (with P = 20 and Q = 6)
Single instrument file 1
Single instrument file 2
−16
AMNMR in dB
AMNMR in dB
−16
−18
−20
20
25
bitrate (kbps)
−20
−22
−24
nopred−Trel
stdLTP−Trel
propLTP−Trel
−22
−18
−26
30
nopred−Trel
stdLTP−Trel
propLTP−Trel
30
35
bitrate (kbps)
40
Objective evaluation results
Trellis based coders compared to the TLS based codersQpg
Single instrument file 1
Single instrument file 2
−10
−12
AMNMR in dB
AMNMR in dB
−10
−14
−16
nopred−TLS
stdLTP−TLS
nopred−Trel
stdLTP−Trel
propLTP−Trel
−18
−20
−22
20
25
30
bitrate (kbps)
−15
−20
−25
35
nopred−TLS
stdLTP−TLS
nopred−Trel
stdLTP−Trel
propLTP−Trel
30
35
40
bitrate (kbps)
45
Objective evaluation results
Along with results for proposed low complexity coder
Single instrument file 1
Single instrument file 2
−10
−12
AMNMR in dB
AMNMR in dB
−10
−14
−16
nopred−TLS
stdLTP−TLS
propLTP−TLS
nopred−Trel
stdLTP−Trel
propLTP−Trel
−18
−20
−22
20
25
30
bitrate (kbps)
−15
−20
−25
35
nopred−TLS
stdLTP−TLS
propLTP−TLS
nopred−Trel
stdLTP−Trel
propLTP−Trel
30
35
40
bitrate (kbps)
45
Objective evaluation results
For other lesQpg
Polyphonic file with dominant instrument
Single instrument file 3
−10
AMNMR in dB
AMNMR in dB
−10
−15
nopred−TLS
stdLTP−TLS
propLTP−TLS
nopred−Trel
stdLTP−Trel
propLTP−Trel
−20
−25
25
30
35
bitrate (kbps)
−15
−20
nopred−TLS
stdLTP−TLS
propLTP−TLS
nopred−Trel
stdLTP−Trel
propLTP−Trel
−25
40
45
25
30
35
bitrate (kbps)
40
45
Objective evaluation results
For other lesQpg
Speech file
Complex polyphonic file
−8
−8
AMNMR in dB
AMNMR in dB
−10
−12
−14
nopred−TLS
stdLTP−TLS
propLTP−TLS
nopred−Trel
stdLTP−Trel
propLTP−Trel
−16
−18
20
25
30
35
bitrate (kbps)
−10
−12
nopred−TLS
stdLTP−TLS
propLTP−TLS
nopred−Trel
stdLTP−Trel
propLTP−Trel
−14
−16
40
20
25
30
35
bitrate (kbps)
40
45
Subjective evaluation
MUSHRA listening tests for coders operating at 32 kbps
12 listeners score on a scale of 0 (bad) to 100 (excellent)
Plots show average MUSHRA scores and 95% condence interval
Subjective evaluation results
Single instrument file 1
100
80
Avg. MUSHRA score
80
Avg. MUSHRA score
Single instrument file 2
100
60
40
20
60
40
20
0
0
ref
rel
rel
anc
nop
T
red
pro
p
T
LTP
ref
l
anc
red
nop
Tre
rel
PT
pLT
pro
Subjective evaluation results
Single instrument file 3
100
100
80
Avg. MUSHRA score
Avg. MUSHRA score
80
Polyphonic file with dominant instrument
60
40
20
60
40
20
0
0
ref
rel
rel
anc
nop
T
red
pro
p
T
LTP
ref
l
anc
red
nop
Tre
rel
PT
pLT
pro
Subjective evaluation results
Speech file
100
80
Avg. MUSHRA score
80
Avg. MUSHRA score
Complex polyphonic file
100
60
40
20
60
40
20
0
0
ref
rel
rel
anc
nop
T
red
pro
p
T
LTP
ref
l
anc
red
nop
Tre
rel
PT
pLT
pro
Summary
Current approach for LTP parameter selection is sub-optimal
Joint selection of LTP and core AAC parameters which optimize
perceptual distortion-rate performance proposed
Low complexity two-loop search based variant also proposed
Subjective and objective evaluations show substantial improvements
We conclude that when rightly optimized LTP can be a potent tool
Summary
Current approach for LTP parameter selection is sub-optimal
Joint selection of LTP and core AAC parameters which optimize
perceptual distortion-rate performance proposed
Low complexity two-loop search based variant also proposed
Subjective and objective evaluations show substantial improvements
We conclude that when rightly optimized LTP can be a potent tool
Summary
Current approach for LTP parameter selection is sub-optimal
Joint selection of LTP and core AAC parameters which optimize
perceptual distortion-rate performance proposed
Low complexity two-loop search based variant also proposed
Subjective and objective evaluations show substantial improvements
We conclude that when rightly optimized LTP can be a potent tool
Summary
Current approach for LTP parameter selection is sub-optimal
Joint selection of LTP and core AAC parameters which optimize
perceptual distortion-rate performance proposed
Low complexity two-loop search based variant also proposed
Subjective and objective evaluations show substantial improvements
We conclude that when rightly optimized LTP can be a potent tool
Summary
Current approach for LTP parameter selection is sub-optimal
Joint selection of LTP and core AAC parameters which optimize
perceptual distortion-rate performance proposed
Low complexity two-loop search based variant also proposed
Subjective and objective evaluations show substantial improvements
We conclude that when rightly optimized LTP can be a potent tool
Thank you for your attention
Questions?
Download