Local Optimality in Tanner Codes

Guy Even, Nissim Halabi
June 2011
Error Correcting Codes for Memoryless Binary-Input Output-Symmetric Channels (1)

Coding chain: encoding maps a message to a codeword $c \in C \subseteq \{0,1\}^N$; the noisy channel outputs a noisy codeword $y \in \mathbb{R}^N$; channel decoding outputs an estimate $\hat{c} \in \{0,1\}^N$.

Memoryless Binary-Input Output-Symmetric (MBIOS) channel:
- Binary input, but the output is real-valued.
- Stochastic: characterized by the conditional probability function $P(y \mid c)$.
- Memoryless: errors are independent from bit to bit.
- Symmetric: errors affect '0's and '1's symmetrically.

Example: Binary Symmetric Channel (BSC): each bit $c_i$ is received correctly ($y_i = c_i$) with probability $1-p$ and flipped ($y_i = 1-c_i$) with probability $p$.
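As a concrete illustration, here is a minimal Python sketch of a BSC; the slide itself gives only the channel diagram, so the function name and the use of NumPy are our own:

```python
import numpy as np

def bsc(codeword, p, rng=None):
    """Pass a binary codeword through a BSC: each bit is flipped
    independently with crossover probability p."""
    rng = rng if rng is not None else np.random.default_rng()
    flips = rng.random(len(codeword)) < p
    return np.bitwise_xor(np.asarray(codeword), flips.astype(int))

# Example: transmit the all-zeros codeword of length 10 with p = 0.05.
y = bsc(np.zeros(10, dtype=int), p=0.05)
```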
Error Correcting Codes for Memoryless Binary-Input Output-Symmetric Channels (2)

Log-likelihood ratio (LLR) $\lambda_i$ for a received observation $y_i$:

$$\lambda_i(y_i) = \ln\!\left(\frac{P_{Y_i|X_i}(y_i \mid x_i = 0)}{P_{Y_i|X_i}(y_i \mid x_i = 1)}\right)$$

- $\lambda_i > 0$: $y_i$ is more likely to be '0'.
- $\lambda_i < 0$: $y_i$ is more likely to be '1'.

The mapping $y \mapsto \lambda$ is injective, so we replace the noisy codeword $y$ by the LLR vector $\lambda \in \mathbb{R}^N$.
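For the BSC, the LLR specializes to a constant magnitude with a data-dependent sign. A small sketch (the helper name is ours):

```python
import numpy as np

def bsc_llr(y, p):
    """LLR of a BSC output: lambda_i = ln(P(y_i|x_i=0) / P(y_i|x_i=1)).
    For the BSC this is +ln((1-p)/p) when y_i = 0 and -ln((1-p)/p) when
    y_i = 1, i.e., (1 - 2*y_i) * ln((1-p)/p)."""
    return (1 - 2 * np.asarray(y)) * np.log((1 - p) / p)
```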
Maximum-Likelihood (ML) Decoding

For any binary-input memoryless channel, ML decoding can be formulated as

$$\hat{x}^{\mathrm{ML}}(\lambda) = \arg\min_{x \in C} \langle \lambda, x \rangle,$$

or as a linear program over the convex hull of the code:

$$\hat{x}^{\mathrm{ML}}(\lambda) = \arg\min_{x \in C} \langle \lambda, x \rangle = \arg\min_{x \in \mathrm{conv}(C)} \langle \lambda, x \rangle.$$

However, $\mathrm{conv}(C) \subseteq [0,1]^N$ has no efficient representation in general.
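A brute-force rendering of the first formulation, for toy-sized codes only (the membership predicate is a stand-in for any code $C$):

```python
import itertools
import numpy as np

def ml_decode(llr, is_codeword):
    """Brute-force ML decoding: minimize <llr, x> over all codewords.
    Exponential in N -- only for tiny toy codes."""
    n = len(llr)
    best, best_cost = None, np.inf
    for bits in itertools.product([0, 1], repeat=n):
        x = np.array(bits)
        if is_codeword(x):
            cost = float(np.dot(llr, x))
            if cost < best_cost:
                best, best_cost = x, cost
    return best

# Example: the length-3 repetition code {000, 111}.
x_hat = ml_decode(np.array([0.9, -0.2, 0.4]),
                  lambda x: x.min() == x.max())
```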
Linear Programming (LP) Decoding

Linear programming (LP) decoding [Fel03, FWK05] relaxes the polytope $\mathrm{conv}(C)$ to a polytope $P$:

$$\hat{x}^{\mathrm{LP}}(\lambda) = \arg\min_{x \in P} \langle \lambda, x \rangle$$

$P$ satisfies:
(1) All codewords $x \in C$ are vertices of $P$.
(2) All new vertices are fractional (therefore, the new vertices are not in $C$).
(3) $P$ has an efficient representation.

Solving the LP over $\mathrm{conv}(C) \subseteq P \subseteq [0,1]^N$:
- $\hat{x}^{\mathrm{LP}}(\lambda)$ integral $\Rightarrow$ $\hat{x}^{\mathrm{LP}}(\lambda) = \hat{x}^{\mathrm{ML}}(\lambda)$: the LP decoder finds the ML codeword.
- $\hat{x}^{\mathrm{LP}}(\lambda)$ fractional $\Rightarrow$ the LP decoder fails.
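A sketch of Feldman's relaxation in the special case of parity checks, using the standard forbidden-set inequalities and SciPy's `linprog`. This is one concrete choice of $P$ for LDPC codes, not the construction used later for general Tanner codes:

```python
import itertools
import numpy as np
from scipy.optimize import linprog

def lp_decode(llr, checks, n):
    """LP decoding sketch for a parity-check code [Fel03]: minimize <llr, x>
    over the fundamental polytope, described by the inequalities
    sum_{i in S} x_i - sum_{i in N(j)\\S} x_i <= |S| - 1
    for every check j and every odd-sized subset S of its neighborhood."""
    A, b = [], []
    for nbrs in checks:                    # nbrs: variable indices of check j
        for k in range(1, len(nbrs) + 1, 2):          # odd |S|
            for S in itertools.combinations(nbrs, k):
                row = np.zeros(n)
                row[list(nbrs)] = -1.0
                row[list(S)] = 1.0
                A.append(row)
                b.append(len(S) - 1)
    res = linprog(c=llr, A_ub=np.array(A), b_ub=np.array(b),
                  bounds=[(0.0, 1.0)] * n, method="highs")
    return res.x   # integral => the ML codeword; fractional => failure

# Example: a single parity check over 3 bits, x1 + x2 + x3 = 0 (mod 2).
x = lp_decode(np.array([-1.0, 0.5, 0.7]), checks=[(0, 1, 2)], n=3)
```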
Tanner Codes [Tan81]

Factor graph representation of Tanner codes: a bipartite graph $G = (I \cup J, E)$ with variable nodes $I$ and local-code nodes $J$. Every local-code node $j$ is associated with a linear code $C^j$ of length $\deg_G(j)$.

Tanner code $C(G)$ and codewords $x$:

$$x \in C(G) \iff \forall j.\ x_{N(j)} \in \text{local code } C^j$$

Minimum local distance: $d^* = \min_j d_{\min}(\text{local code } C^j)$.

Extended local code $\bar{C}^j$: each local codeword is extended to $\{0,1\}^N$, with the bits outside $N(j)$ unconstrained.

[Figure: a Tanner graph with variable nodes $x_1,\dots,x_{10}$ and local-code nodes $C_1,\dots,C_5$.]

Example: expander codes [SS'96], where the Tanner graph is an expander and decoding uses a simple bit-flipping algorithm.
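The membership condition translates directly into a check; the data layout for the local codes below is our own:

```python
def in_tanner_code(x, local_codes):
    """Membership test for a Tanner code C(G): x is a codeword iff its
    restriction to each local-code node's neighborhood N(j) lies in the
    local code C_j.  `local_codes` is a list of (N(j), set of local words)."""
    return all(tuple(x[i] for i in nbrs) in words
               for nbrs, words in local_codes)

# Toy example: two local-code nodes, each a length-3 repetition code.
local = [((0, 1, 2), {(0, 0, 0), (1, 1, 1)}),
         ((2, 3, 4), {(0, 0, 0), (1, 1, 1)})]
assert in_tanner_code([1, 1, 1, 1, 1], local)
assert not in_tanner_code([1, 1, 1, 0, 0], local)
```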
LP Decoding of Tanner Codes

Maximum-likelihood (ML) decoding:

$$\hat{x}^{\mathrm{ML}}(\lambda) = \arg\min_{x \in \mathrm{conv}(C)} \langle \lambda, x \rangle, \quad \text{where } \mathrm{conv}(C) = \mathrm{conv}\Big(\bigcap_j \text{extended local code } \bar{C}^j\Big).$$

Linear programming (LP) decoding [following Fel03, FWK05]:

$$\hat{x}^{\mathrm{LP}}(\lambda) = \arg\min_{x \in P} \langle \lambda, x \rangle, \quad \text{where } P = \bigcap_j \mathrm{conv}\big(\text{extended local code } \bar{C}^j\big).$$

$P$ satisfies requirements (1)-(3) above.
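A sketch of this relaxation in extended form: for each local-code node $j$, the restriction of $x$ to $N(j)$ is forced to be a convex combination of the local codewords, which is an efficient representation whenever the local codes are small. The variable layout and helper name are ours:

```python
import numpy as np
from scipy.optimize import linprog

def lp_decode_tanner(llr, local_codes, n):
    """LP decoding sketch for Tanner codes: minimize <llr, x> over
    P = intersection over j of conv(extended local code), written with
    auxiliary convex-combination weights alpha_{j,w} per local codeword."""
    blocks = [(nbrs, sorted(words)) for nbrs, words in local_codes]
    m = n + sum(len(words) for _, words in blocks)   # x variables + alphas
    c = np.concatenate([np.asarray(llr, float), np.zeros(m - n)])
    A_eq, b_eq = [], []
    off = n
    for nbrs, words in blocks:
        # x_i - sum_w alpha_{j,w} * w_i = 0 for each position i in N(j)
        for pos, i in enumerate(nbrs):
            row = np.zeros(m)
            row[i] = 1.0
            for k, w in enumerate(words):
                row[off + k] = -w[pos]
            A_eq.append(row); b_eq.append(0.0)
        # sum_w alpha_{j,w} = 1
        row = np.zeros(m)
        row[off:off + len(words)] = 1.0
        A_eq.append(row); b_eq.append(1.0)
        off += len(words)
    res = linprog(c, A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                  bounds=[(0.0, 1.0)] * m, method="highs")
    return res.x[:n]   # integral => ML codeword; fractional => failure
```

For parity local codes this coincides with the relaxation sketched earlier; the number of auxiliary variables is $\sum_j |C^j|$.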
Why Study (Irregular) Tanner Codes?

- Tighter LP relaxation, because the local codes are not parity codes.
- Irregularity of the Tanner graph gives codes with better performance in the case of LDPC codes [Luby et al., 98].
- A regular low-density Tanner code with nontrivial local codes induces an irregular LDPC code.
- An irregular Tanner code with non-trivial local codes might yield improved designs.
Criteria of Interest

Let $\lambda \in \mathbb{R}^N$ denote an LLR vector received from the channel, and let $x \in C(G)$ denote a codeword. Consider the following questions:

- Is $x = \hat{x}^{\mathrm{ML}}(\lambda)$, and is the ML solution unique? (NP-hard)
- Is $x = \hat{x}^{\mathrm{LP}}(\lambda)$, and is the LP solution unique?

We would like an efficient test with one-sided error: it answers "definitely yes" or "maybe no". For example, an efficient test via local computations, i.e., a "Local Optimality" criterion.
Combinatorial Characterization of Local Optimality (1)

Let $x \in C(G) \subseteq \{0,1\}^N$ and $f \in [0,1]^N$. Following [Fel03], define the relative point $x \oplus f$ by $(x \oplus f)_i = |x_i - f_i|$. Consider a finite set $B \subseteq [0,1]^N$ of deviations.

Definition: A codeword $x \in C$ is locally optimal for $\lambda \in \mathbb{R}^N$ if for all vectors $b \in B$:

$$\langle \lambda, x \oplus b \rangle > \langle \lambda, x \rangle$$

Goal: find a set $B$ of deviations such that:
(1) $x \in \mathrm{LO}(\lambda) \Rightarrow x = \hat{x}^{\mathrm{ML}}(\lambda)$ and the ML solution is unique.
(2) $x \in \mathrm{LO}(\lambda) \Rightarrow x = \hat{x}^{\mathrm{LP}}(\lambda)$ and the LP solution is unique.
(3) $\Pr\big\{\forall b \in B.\ \langle \lambda, b \rangle > 0 \mid c = 0^N\big\} = 1 - o(1)$ (all-zeros assumption).

Consequently, $\mathrm{LO}(\lambda) \subseteq \{\hat{x}^{\mathrm{LP}}(\lambda)\ \text{integral}\} \subseteq \mathrm{ML}(\lambda)$, and

$$\Pr\{\text{LP decoding fails}\} \le \Pr\{0^N \text{ is not locally optimal} \mid c = 0^N\}.$$
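The definition translates directly into a check, assuming the (finite) deviation set $B$ is given explicitly:

```python
import numpy as np

def is_locally_optimal(x, llr, deviations):
    """Check local optimality: <llr, x (+) b> > <llr, x> for every b in B,
    where (x (+) b)_i = |x_i - b_i| is the relative point."""
    x = np.asarray(x, dtype=float)
    llr = np.asarray(llr, dtype=float)
    base = llr @ x
    return all(llr @ np.abs(x - np.asarray(b, dtype=float)) > base
               for b in deviations)
```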
Combinatorial Characterization of Local Optimality (2)

Suppose we have properties (1) and (2). Deviations $b$ with large $\mathrm{support}(b)$ help establish property (3), e.g., via Chernoff-like bounds.

If $B = C$, then $x \in \mathrm{LO}(\lambda) \iff x = \hat{x}^{\mathrm{ML}}(\lambda)$ and the ML solution is unique. However, such deviations $b$ have a GLOBAL structure, which makes the analysis of property (3) intractable.
Combinatorial Characterization of Local Optimality (3)

For analysis purposes, consider structures with a local nature:
- $B$ is a set of TREES [following KV'06], of height $< \tfrac{1}{2}\,\mathrm{girth}(G)$.
- Strengthen the analysis by introducing layer weights [following ADS'09], which yields better bounds on $\Pr\{\forall b \in B.\ \langle \lambda, b \rangle > 0 \mid c = 0^N\}$.

Finally, the restriction $\mathrm{height}(\text{subtrees}(G)) < \tfrac{1}{2}\,\mathrm{girth}(G) = O(\log N)$ is limiting, so we take path-prefix trees, whose height is not bounded by the girth!
Path-Prefix Tree

The path-prefix tree captures the notion of a computation tree. Consider a graph $G = (V,E)$ and a node $r \in V$:
- $\hat{V}$: the set of all backtrackless paths in $G$ emanating from node $r$ with length at most $h$.
- $T_r^h(G) = (\hat{V}, \hat{E})$: the path-prefix tree of $G$ rooted at node $r$ with height $h$, where each path is connected to its one-step extensions.

[Figure: an example path-prefix tree $T_r^4(G)$.]
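A small builder for $T_r^h(G)$, representing each tree vertex by the backtrackless path itself (the adjacency-list format is ours):

```python
def path_prefix_tree(adj, r, h):
    """Build the path-prefix tree T_r^h(G): vertices are backtrackless paths
    of length <= h starting at r; each path's parent is its prefix."""
    root = (r,)
    vertices, edges = [root], []
    frontier = [root]
    for _ in range(h):
        nxt = []
        for path in frontier:
            for u in adj[path[-1]]:
                if len(path) >= 2 and u == path[-2]:
                    continue                    # backtracking step: skip
                child = path + (u,)
                vertices.append(child)
                edges.append((path, child))
                nxt.append(child)
        frontier = nxt
    return vertices, edges

# Example: a 4-cycle; T_0^4 contains paths winding around the cycle.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
V, E = path_prefix_tree(adj, r=0, h=4)
```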
d-Tree

Consider a path-prefix tree $T_r^h(G)$ of a Tanner graph $G = (I \cup J, E)$.

A d-tree $T[r,h,d]$ is a subgraph of $T_r^h(G)$ such that:
- root $= r$,
- $\forall v \in T \cap I:\ \deg_T(v) = \deg_G(v)$ (variable nodes keep all their neighbors),
- $\forall c \in T \cap J:\ \deg_T(c) = d$ (local-code nodes keep exactly $d$ neighbors).

A 2-tree is a skinny tree, i.e., a minimal deviation. For $d > 2$ (3-trees, 4-trees, ...), a d-tree is not necessarily a valid configuration!
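A brute-force enumeration of d-trees for tiny examples, following the definition above; the representation and helper names are ours, and the enumeration is exponential in general:

```python
from itertools import combinations, product

def d_trees(adj, is_var, r, h, d):
    """Enumerate all d-trees T[r,h,d] inside the path-prefix tree T_r^h(G)
    of a Tanner graph.  Variable nodes keep all their children; local-code
    (check) nodes keep exactly d-1 children (plus their parent).  Each tree
    is returned as a frozenset of backtrackless paths."""
    def grow(path):
        if len(path) - 1 == h:                  # path already has h edges
            return [frozenset([path])]
        v = path[-1]
        children = [u for u in adj[v] if len(path) < 2 or u != path[-2]]
        picks = [children] if is_var(v) else combinations(children, d - 1)
        trees = []
        for pick in picks:
            for combo in product(*[grow(path + (u,)) for u in pick]):
                trees.append(frozenset([path]).union(*combo))
        return trees
    return grow((r,))
```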
Cost of a Projected Weighted Subtree

Consider layer weights $w : \{1,\dots,h\} \to \mathbb{R}$ and a subtree $T_{\hat{r}}$ of a path-prefix tree $T_r^{2h}(G)$.

Define a weight function $w_{T_{\hat{r}}} : \hat{V} \to \mathbb{R}$ for the subtree $T_{\hat{r}}$ induced by $w$: each node is weighted according to the layer weight of its level, suitably normalized.

Let $\pi_G(T_{\hat{r}}, w) \in \mathbb{R}^N$ denote the projection to the Tanner graph $G$, obtained by accumulating, for each variable node of $G$, the weights of all its copies in the subtree.

[Figure: a weighted subtree $T_{\hat{r}}$ and its projection $\pi_G(T_{\hat{r}}, w)$ onto the Tanner graph.]
Combinatorial Characterization of Local Optimality

Setting:
- $C(G) \subseteq \{0,1\}^N$: a Tanner code with minimum local distance $d^*$,
- $2 \le d \le d^*$,
- $w \in [0,1]^h \setminus \{0^h\}$.

Let $B_d^w$ denote the set of all vectors corresponding to projections to $G$ of $w$-weighted d-trees of height $2h$ rooted at variable nodes:

$$B_d^w = \Big\{ \pi_G(T, w) \;:\; T \text{ is a } w\text{-weighted d-tree of height } 2h \Big\}$$

Definition: A codeword $x$ is $(h, w, d)$-locally optimal for $\lambda \in \mathbb{R}^N$ if for all vectors $b \in B_d^w$:

$$\langle \lambda, x \oplus b \rangle > \langle \lambda, x \rangle$$
Thm: Local Opt ⇒ ML Opt / LP Opt

Theorem: If $x \in C(G)$ is $(h, w, d)$-locally optimal for $\lambda$, then:
(1) $x$ is the unique ML codeword for $\lambda$.
(2) $x$ is the unique optimal LP solution given $\lambda$.

Proof sketch, in two steps. STEP 1: Local Opt. ⇒ ML Opt.

Key structural lemma: every codeword $z$ equals a convex combination of projected $w$-weighted d-trees in $G$, up to a scaling factor $\alpha$.

For every codeword $x' \ne x$, let $z = x \oplus x'$. Then:

$$\langle \lambda, x \rangle < \mathbb{E}_b \langle \lambda, x \oplus b \rangle \quad \text{(local optimality of } x\text{)}$$
$$= \langle \lambda, x \oplus \alpha z \rangle \quad \text{(key structural lemma)}$$
$$= (1-\alpha)\,\langle \lambda, x \rangle + \alpha\,\langle \lambda, x' \rangle$$

Therefore $\langle \lambda, x' \rangle > \langle \lambda, x \rangle$, so $x$ is the unique ML codeword.
STEP 2: Local Opt. ⇒ LP Opt.

Local Opt. ⇒ ML Opt. is used as a "black box", together with a reduction using graph covers and graph-cover decoding [VK'05].
Interesting outcomes:

The characterization captures the event of LP decoding failure. For example:

Theorem: Fix $w \in \mathbb{R}_+^h$. Then, under the all-zeros assumption,

$$\Pr\{\text{LP decoding fails}\} \le \Pr\big\{\exists\ d\text{-tree } T.\ \langle \lambda, \pi_G(T, w) \rangle \le 0 \mid c = 0^N \big\}.$$

Work in progress: design an iterative message-passing decoding algorithm that computes an $(h, w, d)$-locally-optimal codeword after $h$ iterations. That is: if $x$ is a locally optimal codeword for $\lambda$, then the weighted message-passing algorithm finds $x$ after $h$ iterations, with a guarantee that $x$ is the ML codeword.
Summary so far: $\mathrm{LO}(\lambda) \subseteq \{\hat{x}^{\mathrm{LP}}(\lambda)\ \text{integral}\} \subseteq \mathrm{ML}(\lambda)$.

Goals achieved:
(1) $x \in \mathrm{LO}(\lambda) \Rightarrow x = \hat{x}^{\mathrm{ML}}(\lambda)$ and the ML solution is unique.
(2) $x \in \mathrm{LO}(\lambda) \Rightarrow x = \hat{x}^{\mathrm{LP}}(\lambda)$ and the LP solution is unique.

Left to show:
(3) $\Pr\{x \in \mathrm{LO}(\lambda)\} = 1 - o(1)$.
Bounds for LP Decoding Based on d-Trees
(analysis for regular factor graphs)

For $(d_L, d_R)$-regular Tanner codes with minimum local distance $d^*$:
- Form of the finite-length bounds: $\exists c > 1.\ \exists t.\ \forall\,\text{noise} < t$:
  $\Pr(\text{LP decoder success}) \ge 1 - \exp(-c^{\,\mathrm{girth}})$.
- If $\mathrm{girth} = \Theta(\log n)$, then $\Pr(\text{LP decoder success}) \ge 1 - \exp(-n^{\gamma})$, for some $0 < \gamma < 1$.
- As $n \to \infty$: $t$ is a lower bound on the threshold of LP decoding.

Results for the BSC using analysis of uniformly weighted d-trees:
- (3,6)-regular Tanner code with $d^* = 4$: $p_{LP} > 0.02$ (skinny-tree analysis, 2-trees [ADS'09]); $p_{LP} > 0.076$ (3-trees); $p_{LP} > 0.4512$ (4-trees).
- (4,8)-regular Tanner code with $d^* = 4$: $p_{LP} > 0.017$ (2-trees); $p_{LP} > 0.057$ (3-trees); $p_{LP} > 0.365$ (4-trees).
Message Passing Decoding for Irregular LDPC Codes

Dynamic programming on coupled computation trees of the Tanner graph.
- Input: channel observations for the variable nodes $v \in V$.
- Output: an assignment to the variable nodes $v \in V$ such that the assignment to $v$ agrees with the "tree" codeword that minimizes an objective function on the computation tree rooted at $v$.

Algorithm principle: messages are sent iteratively,
- variable nodes → check nodes,
- check nodes → variable nodes.

Goal: if a locally optimal codeword $x$ exists, then the message passing algorithm finds $x$.
ω-Weighted Min-Sum: Message Rules

Variable node → check node message (iteration $\ell$), with $\mu^{(-1)}_{c \to v} := 0$:

$$\mu^{(\ell)}_{v \to c} = \frac{\omega_{h-\ell}}{\deg_G(v)} \cdot \lambda_v + \frac{1}{\deg_G(v) - 1} \sum_{c' \in N(v) \setminus \{c\}} \mu^{(\ell-1)}_{c' \to v}$$

Check node → variable node message (iteration $\ell$):

$$\mu^{(\ell)}_{c \to v} = \Big( \prod_{v' \in N(c) \setminus \{v\}} \mathrm{sign}\big(\mu^{(\ell)}_{v' \to c}\big) \Big) \cdot \min_{v' \in N(c) \setminus \{v\}} \big|\mu^{(\ell)}_{v' \to c}\big|$$
ω-Weighted Min-Sum: Decision Rule

Decision at variable node $v$ in iteration $h$:

$$\mu_v = \frac{\lambda_v}{\deg_G(v)} + \frac{1}{\deg_G(v) - 1} \sum_{c \in N(v)} \mu^{(h-1)}_{c \to v}$$

$$\hat{x}_v = \begin{cases} 0 & \text{if } \mu_v \ge 0 \\ 1 & \text{otherwise} \end{cases}$$
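A sketch of the decoder implied by these rules. The exact placement of the layer weights and normalizations is our reading of the slides, so treat the indexing as an assumption; node degrees are assumed to be at least 2:

```python
import numpy as np

def weighted_min_sum(llr, var_nbrs, chk_nbrs, w, h):
    """Sketch of the omega-weighted min-sum decoder.  var_nbrs[v] and
    chk_nbrs[c] list each node's neighbors in the Tanner graph; w is the
    layer-weight vector (length h)."""
    mu_cv = {(c, v): 0.0 for c, nbrs in chk_nbrs.items() for v in nbrs}
    for l in range(h):
        # Variable -> check: weighted channel LLR plus incoming messages.
        mu_vc = {}
        for v, nbrs in var_nbrs.items():
            deg = len(nbrs)
            for c in nbrs:
                inc = sum(mu_cv[(c2, v)] for c2 in nbrs if c2 != c)
                mu_vc[(v, c)] = (w[h - 1 - l] / deg) * llr[v] + inc / (deg - 1)
        # Check -> variable: sign product times magnitude minimum.
        for c, nbrs in chk_nbrs.items():
            for v in nbrs:
                others = [mu_vc[(v2, c)] for v2 in nbrs if v2 != v]
                mu_cv[(c, v)] = (float(np.prod(np.sign(others)))
                                 * min(map(abs, others)))
    # Decision at iteration h.
    x_hat = {}
    for v, nbrs in var_nbrs.items():
        deg = len(nbrs)
        mu = llr[v] / deg + sum(mu_cv[(c, v)] for c in nbrs) / (deg - 1)
        x_hat[v] = 0 if mu >= 0 else 1
    return x_hat
```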
Thm: Local Opt. ⇒ ω-Weighted Min-Sum Opt.

Theorem: If $x \in C(G)$ is $(h, \omega, 2)$-locally optimal for $\lambda$, then:
(1) ω-weighted min-sum computes $x$ in $h$ iterations.
(2) $x$ is the unique ML codeword given $\lambda$.

The ω-weighted min-sum algorithm generalizes the attenuated max-product algorithm [Frey and Koetter '00] [Jian and Pfister '10].

Open question: probabilistic analysis of local optimality in the irregular case.
Local Optimality Certificates for LP Decoding

Previous results:
- Koetter and Vontobel '06: characterized LP solutions and provided a criterion for certifying the optimality of a codeword for LP decoding of regular LDPC codes. The characterization is based on the combinatorial structure of skinny trees of height $h$ in the factor graph ($h < \mathrm{girth}(G)$).
- Arora, Daskalakis and Steurer '09: a certificate for LP decoding of regular LDPC codes over the BSC, extended to MBIOS channels in [Halabi-Even '10]. The certificate characterization is based on weighted skinny trees of height $h$ ($h < \mathrm{girth}(G)$). Note: bounds on the word error probability of LP decoding are computed by analyzing the certificates. By introducing layer weights, the ADS certificate occurs with high probability for much larger noise rates.
- Vontobel '10: implies a certificate for LP decoding of Tanner codes based on weighted skinny subtrees of graph covers.
Local Optimality Certificates for LP Decoding – Summary of Main Techniques

- Girth: [KV06, ADS09, HE10]: $h < \mathrm{girth}(G)$. Current work: $h$ is unbounded; the characterization uses computation trees, with no dependence on the girth of the factor graph.
- Regularity: [KV06, ADS09, HE10]: regular factor graph. Current work: irregular factor graph; normalization factors are added according to node degrees.
- Check nodes / constraints: [KV06, ADS09, HE10]: parity code. Current work: linear codes, with a tighter relaxation for the generalized fundamental polytope.
- Deviations: [KV06, ADS09, HE10]: "skinny" deviations that locally satisfy the inner parity checks. Current work: "fat" deviations; certificates based on "fat" structures are likely to occur with high probability for larger noise rates (not necessarily a valid configuration!).
- LP solution analysis: [KV06, ADS09, HE10]: dual / primal LP analysis; polyhedral characterization analysis. Current work: reduction to ML via the characterization of graph-cover decoding.
Conclusions

Combinatorial, graph-theoretic, and algorithmic methods for analyzing the decoding of (modern) error correcting codes over any memoryless channel:
- A new local-optimality combinatorial certificate for LP decoding of irregular Tanner codes, i.e., Local Opt. ⇒ LP Opt. and ML Opt.
- The certificate is based on weighted "fat" subtrees of computation trees of height $h$: $h$ is not bounded by the girth, and the degree of local-code nodes in deviations is not limited to 2.
- Proof techniques: combinatorial decompositions and graph-cover arguments.
- An efficient dynamic-programming algorithm, running in $O(|E| \cdot h)$ time, for computing the local-optimality certificate of a codeword given an LLR vector.
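A hedged sketch of such a dynamic program, simplified to skinny trees ($d = 2$) and with the degree normalizations of the full construction omitted; the memoized state is (directed edge, depth), giving the claimed $O(|E| \cdot h)$ run time. Check degrees are assumed to be at least 2:

```python
from functools import lru_cache

def min_skinny_tree_cost(var_nbrs, chk_nbrs, llr, x, w):
    """Certificate sketch for d = 2: flipping LLR signs according to the
    codeword x reduces the local-optimality test to the question whether
    the minimum w-weighted cost of a skinny tree rooted at any variable
    node is strictly positive."""
    h = len(w)                                  # number of variable layers
    lam = {v: (-1) ** x[v] * llr[v] for v in var_nbrs}

    @lru_cache(maxsize=None)
    def from_var(v, parent_c, depth):
        # A variable node keeps ALL child checks; weight w[depth] on lam[v].
        cost = w[depth] * lam[v]
        if depth + 1 < h:
            cost += sum(from_chk(c, v, depth)
                        for c in var_nbrs[v] if c != parent_c)
        return cost

    @lru_cache(maxsize=None)
    def from_chk(c, parent_v, depth):
        # A check node keeps exactly one child variable (d - 1 = 1).
        return min(from_var(v2, c, depth + 1)
                   for v2 in chk_nbrs[c] if v2 != parent_v)

    return min(from_var(r, None, 0) for r in var_nbrs)

# x is certified locally optimal (hence the unique ML and LP solution)
# if the returned minimum is strictly positive.
```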
Future Work

Work in progress: design an iterative message-passing decoding algorithm for Tanner codes that computes an $(h, w, d)$-locally-optimal codeword after $h$ iterations. That is: if $x$ is a locally optimal codeword for $\lambda$, then the weighted message-passing algorithm computes $x$ in $h$ iterations, with a guarantee that $x$ is the ML codeword.

Asymptotic analysis of LP decoding and weighted min-sum decoding for ensembles of irregular Tanner codes: apply density evolution techniques together with the combinatorial characterization of local optimality.
Thank You!