Asynchronous Convolutional-Coded Physical

Asynchronous Convolutional-Coded
Physical-Layer Network Coding
Qing Yang, Student Member, IEEE, and Soung Chang Liew, Fellow, IEEE
Abstract—This paper investigates the decoding process of
asynchronous convolutional-coded physical-layer network coding
(PNC) systems. Specifically, we put forth a layered decoding
framework for convolutional-coded PNC consisting of three layers:
symbol realignment layer, codeword realignment layer, and joint
channel-decoding network coding (Jt-CNC) decoding layer. Our
framework can deal with phase asynchrony (phase offset) and
symbol arrival-time asynchrony (symbol misalignment) between
the signals simultaneously transmitted by multiple sources. A
salient feature of this framework is that it can handle both
fractional and integral symbol misalignments. For the decoding
layer, instead of Jt-CNC, previously proposed PNC decoding
algorithms (e.g., XOR-CD and reduced-state Viterbi algorithms)
can also be used with our framework to deal with general
symbol misalignments. Our Jt-CNC algorithm, based on belief
propagation (BP), is BER-optimal for synchronous PNC and near
optimal for asynchronous PNC. Extending beyond convolutional
codes, we further generalize the Jt-CNC decoding algorithm for
all cyclic codes. Our simulation shows that Jt-CNC outperforms
the previously proposed XOR-CD algorithm and reduced-state
Viterbi algorithm by 2 dB for synchronous PNC. For both phase-asynchronous and symbol-asynchronous PNC, Jt-CNC performs
better than the other two algorithms. Importantly, for real wireless
network experimentation, we implemented our decoding algorithm
in a PNC prototype built on the USRP software radio platform.
Our experiment shows that the proposed Jt-CNC decoder works
well in practice.
Keywords—Physical-layer network coding; convolutional codes;
symbol misalignment; phase offset; joint channel-decoding and
network coding; cyclic codes.
I. INTRODUCTION
This paper investigates the use of convolutional codes
in asynchronous physical-layer network coding (PNC)
systems to ensure reliable communication. In particular, we
focus on the decoding problem when simultaneous signals from
multiple transmitters arrive at a PNC receiver with asynchronies
between them.
PNC was first proposed in [1] as a way to exploit network
coding [2], [3] at the physical layer. In the simplest PNC
setup, two users exchange information via a relay in a two-way
relay network (TWRN). The two users transmit their messages
simultaneously to the relay; the relay then maps the overlapped
signals to a network-coded message and broadcasts it to the
two users; and each of the two users recovers the message
from the other user based on the network-coded message and
the knowledge of its own message. PNC can potentially boost
the throughput of TWRN by 100% compared with a traditional
relay system [1].
The authors are with the Department of Information Engineering, The
Chinese University of Hong Kong, Shatin, New Territories, Hong Kong (email:
{yq010, soung}@ie.cuhk.edu.hk).
This work is partially supported by AoE grant E-02/08 and the General Research Funds Project No. 414911, established under the University Grant Committee of the Hong Kong Special Administrative Region, China. This work is
also partially supported by the China 973 Program, Project No. 2012CB315904
and the China NSFC grant, Project No. 61271277.
Our paper focuses on PNC decoding as applied to TWRN.
To ensure reliable transmission, communication systems make
use of channel coding to protect the information from noise
and fading. In channel-coded PNC, the goal of the relay is
to decode the simultaneously received signals not into the
individual messages of the two users, but into a network-coded
message. This process is referred to as the channel-decoding
network coding (CNC) process in [4].
In addition to the issue of channel coding, in practice,
the signals from the two users may be asynchronous in that
there may be relative symbol arrival-time asynchrony (symbol
misalignment), phase asynchrony (phase offset), and other asynchronies between the two signals received at the relay. These
PNC systems are referred to as asynchronous PNC (APNC)
systems [5].
Both [4] and [5] assume the use of repeat accumulate (RA)
codes. Our current paper, on the other hand, focuses on the use
of convolutional codes. A main motivation is that convolutional
codes are commonly adopted in many communications systems
(e.g., the channel code in IEEE 802.11 is a convolutional code
[6]). Convolutional codes have been well studied and there are
many good designs for the encoding/decoding of convolutional
codes in the conventional communication setting. Given this
backdrop, whether these designs are still applicable to PNC, and
what additional considerations and modifications are needed for
PNC, are issues of utmost interest. This paper is an attempt to
address these issues.
Our main contributions are as follows:
• We put forth a layered decoding framework for asynchronous PNC systems. The proposed decoding framework can deal with synchronous PNC as well as asynchronous PNC with relative phase offset and general
symbol misalignment—by general symbol misalignment,
we mean that the arrival times of the two users’ signals at
the relay are offset by (τI + τF ) symbol durations, where
τI is an integral offset and τF is a fractional offset smaller
than one. With our framework, the previous decoding
algorithms can also be used to deal with asynchronous
PNC.
• We design a joint channel-decoding network coding (Jt-CNC) decoder for convolutional-coded PNC. The Jt-CNC
decoder, based on belief propagation (BP), is optimal in
terms of bit error rate (BER) performance. We analyze the
BER of our Jt-CNC decoder mathematically and derive
an approximate expression for the BER.
• We implement the Jt-CNC decoder in a real PNC system
built on the USRP software radio platform. Our experiment
shows that the Jt-CNC decoder works well over real
wireless channels.
• We propose an algorithm that can handle general symbol misalignment in cyclic-coded PNC, building on the
insight obtained from our study of convolutional-coded
PNC; that is, the algorithm is applicable to all cyclic
codes, not just convolutional codes.
The remainder of this paper is organized as follows. Section
II overviews related work. Section III describes the PNC system
model. Section IV puts forth our Jt-CNC framework, focusing
on synchronous PNC. Section V extends the Jt-CNC framework
to asynchronous PNC. We further show how the algorithmic
framework is applicable to the general cyclic-coded PNC in
the Appendix. Section VI presents simulations and experimental
results together with the BER analysis for the Jt-CNC decoder.
Section VII concludes this work.
II. RELATED WORK
A. Synchronous PNC with Convolutional Codes
The first implementation of TWRN based on the principle of
PNC was recently reported in [7], [8]. This system employs the
convolutional code defined in the 802.11 standard and adopts
the OFDM modulation to eliminate symbol misalignment [9].
In [7], [8], first the log-likelihood ratio (LLR) of the XORed
channel-coded bits is computed; then this soft information is fed
to a conventional Viterbi decoder. We refer to this decoding
strategy as the soft XOR and channel decoding (XOR-CD)
scheme [10]. Detailed explanation and interpretation of the
XOR-CD algorithm are given in Appendix III-A. The experiment shows that the use of XOR-CD on the convolutional-coded
PNC system, thanks to its simplicity, is feasible and practical.
The acronym XOR-CD refers to a two-step process: first,
prior to channel decoding and without considering the correlations among the received symbols due to the channel code,
we apply symbol-by-symbol PNC mapping on the received
symbols to obtain estimates on the successive XORed bits;
after that, we perform channel decoding on the XORed bits
to obtain the XORed source bits. The performance of XOR-CD is suboptimal because the PNC mapping in the first step
loses information [4]. Furthermore, only linear channel codes
can be correctly decoded in the second step. Jt-CNC, on the
other hand, performs channel decoding and network coding as
an integrated process rather than two disjoint steps. Jt-CNC
can be ML (maximum likelihood) optimal, depending on which
variations of Jt-CNC we use and whether the underlying PNC
system is synchronous or asynchronous.
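As a rough illustration of the first XOR-CD step, the sketch below computes a symbol-by-symbol log-likelihood ratio of the XORed coded bit for BPSK. It assumes the per-sample model y = h_a·x_a + h_b·x_b + w with complex Gaussian noise, a likelihood proportional to exp(−|·|²/(2σ²)), equiprobable bits, and the bit-to-symbol map 0 → +1, 1 → −1. The function name and the mapping are illustrative choices and are not taken from [7], [8].

```python
import math

def soft_xor_llr(y, h_a, h_b, sigma2):
    """LLR = log Pr(cA ^ cB = 0 | y) - log Pr(cA ^ cB = 1 | y) for one BPSK sample.

    Assumes y = h_a*x_a + h_b*x_b + w, equiprobable bits, and the
    (hypothetical) mapping bit 0 -> +1, bit 1 -> -1.
    """
    tiny = 1e-300  # guards against log(0) when every likelihood underflows

    def lik(x_a, x_b):
        d = abs(y - h_a * x_a - h_b * x_b) ** 2
        return math.exp(-d / (2.0 * sigma2))

    # The XOR is 0 when the two symbols are equal and 1 when they differ.
    p0 = lik(+1, +1) + lik(-1, -1)
    p1 = lik(+1, -1) + lik(-1, +1)
    return math.log(p0 + tiny) - math.log(p1 + tiny)
```

In an XOR-CD receiver, a sequence of such LLRs would then be passed to an ordinary single-user soft-input decoder. That is the point at which the correlation between the two users' codewords is no longer exploited.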
Within the class of Jt-CNC algorithms, for optimality, there
are two possible decoding targets: (i) ML XORed packet;
(ii) ML XORed bits. To draw an analogy, for the conventional single-user point-to-point communication, if convolutional codes are used, then the Viterbi algorithm [11] aims to
obtain the ML packet, while the BCJR [12] aims to obtain ML
bits. For PNC systems, the aim is to obtain the network-coded
packet or the network-coded bits instead.
A Jt-CNC algorithm for finding the ML XORed packet
was proposed in [13]. However, as will be discussed later,
finding the ML XORed packet requires exhaustive search that
could have prohibitively high complexity. Therefore, the log-max approximation is adopted in [13] and the ML algorithm
is simplified to (approximated with) a full-state Viterbi (FSV)
algorithm. The detailed explanation and interpretation of the
full-state Viterbi algorithm are given in Appendix III-B. The
term “full-state” comes from the fact that this algorithm combines the trellises of both end nodes to make a virtual decoder.
By searching the best path on the combined trellis with the
Viterbi algorithm, [13] tries to decode the ML pair of packets
of the two end nodes. To further reduce the complexity, [13]
simplifies the full-state Viterbi algorithm to a “reduced-state”
Viterbi algorithm. Reference [13], however, did not benchmark
its approximate algorithm against the optimal one. As we will
show later, the algorithm proposed by us in this paper can yield
better performance than that in [13].
In this paper, we aim to find the ML XORed bits within
the source packet rather than the overall ML XORed packet.
In Section IV we show that our algorithm is XOR bit-optimal
for synchronous PNC. Finding ML XORed bits turns out to
have much lower complexity than finding the ML XORed
packet. This is quite different from the conventional point-to-point communication system, in which the simple Viterbi
algorithm can be used to decode the ML packet, and in which
BCJR (slightly more complex than the Viterbi algorithm) can be
used to decode the ML bits. The XOR bit-optimal decoding for
synchronous PNC is investigated in our prior published work
[14].
B. Asynchronous PNC with Convolutional Codes
In asynchronous PNC systems, the signals from the two
end nodes may arrive at the relay with symbol misalignment
and relative phase offset [5]. To the best of our knowledge, no
prior Jt-CNC decoder for convolutional codes can deal
with integral-plus-fractional symbol misalignment. In [15], a
convolutional decoding scheme with an XOR-CD algorithm
was proposed to deal with integral symbol misalignment. As
pointed out in [15], symbol misalignment entangles the channel-coded bits of the trellises of the two encoders in a way that
ordinary Viterbi decoding, based on just one of the trellises,
is not applicable. Therefore the XOR-CD algorithm for synchronous PNC cannot be applied anymore in the presence of
integral symbol misalignment. Their solution is to rearrange the
transmit order of the channel-coded bits into blocks, and pad
Dmax zeros between adjacent blocks. The zero padding acts as
a guard interval between blocks that avoids the entanglement
of channel-coded bits and facilitates Viterbi decoding. However,
this scheme can only deal with integral symbol misalignment
of at most Dmax symbols. In addition, it incurs a code-rate loss
due to the zero padding between blocks.
C. Asynchronous PNC with Other Channel Codes
The use of LDPC codes in asynchronous PNC systems has
previously been considered. In [5], the authors designed a Jt-CNC decoder for the RA code that can deal with fractional
symbol misalignment (i.e., symbol misalignment that is less
than one symbol duration) and phase offset. Our decoding
framework adopts the over-sampling technique proposed in [5]
to address fractional symbol misalignment.
To deal with asynchrony in PNC, our decoding framework
consists of three layers: symbol-realignment layer, codeword-realignment layer, and joint channel-decoding network coding
(Jt-CNC) layer. The first two layers, symbol realignment and
codeword realignment, counter fractional and integral symbol
misalignments, respectively; the third layer, Jt-CNC, decodes
the ML XORed bits. Other decoding schemes (e.g., XOR-CD,
reduced-state Viterbi) can also be used in the third layer of the
framework. We further show that our decoding framework is
not only applicable when convolutional codes are adopted, it is
also applicable when general cyclic codes are used.
Besides convolutional codes, an important class of cyclic
codes is the cyclic LDPC. The Jt-CNC LDPC decoder proposed
in [5] was extended by [16] to deal with general asynchrony
using cyclic LDPC. However, the proposed decoder in [16]
discards the non-overlapped part of the received signal, losing
useful information that can potentially enhance performance.
Therefore, for the decoder in [16], the larger the symbol
misalignment, the worse the performance. By contrast, our
framework makes full use of the non-overlapped portion of
the signal so that a larger symbol misalignment can enhance
performance.
D. Two-Way Relay Network with Other Techniques
The goal of our Jt-CNC decoder proposed in this paper is
to decode the XOR of the end nodes’ messages in the uplink
phase. The relay then broadcasts the XORed message to both
end nodes in the downlink phase. Each end node XORs the
XORed message with its own message to obtain the other
end node's message. For two-way relay network systems, there
are rate-improving techniques other than PNC, such as
superposition coding and hierarchical modulation [17], [18].
Both superposition coding and hierarchical modulation are for
downlink. If they are to be used, the relay has to decode the
individual messages $U^A$ and $U^B$ instead of their XOR in the
uplink. For FSV and Jt-CNC, we can also get the individual
messages, and then use superposition coding or hierarchical
modulation rather than network coding (NC) for the downlink.
For both superposition coding and hierarchical modulation,
the self information at the two end nodes is not used to do
the decoding in the downlink. In other words, some available
information is not exploited. As a result, the overall achievable
rates (from an information-theoretic viewpoint) will not be as
good as those achieved by the NC schemes.
III. SYSTEM MODEL
We consider the application of PNC in a two-way relay
network (TWRN) as shown in Fig. 1. In this model, nodes
A and B exchange information with the help of relay node R.
We assume that all nodes are half-duplex and there is no direct
link between A and B.
With PNC, nodes A and B exchange one packet with each
other in two time slots. The first time slot corresponds to an uplink phase, in which node A and node B transmit their channel-coded packets simultaneously to relay R. The relay R then
constructs a network-coded packet based on the simultaneously
received signals from A and B. This operation is referred to
as the channel decoding network coding (CNC) process [10],
because the received signals are decoded into a network-coded
message rather than the individual messages from A and B.
The second time slot corresponds to a downlink phase, in
which relay R channel-encodes the network-coded message and
broadcasts it to both A and B. Upon receiving the network-coded packet, A (B) then attempts to recover the original packet
transmitted by B (A) in the uplink phase using self-information
[1]. This paper focuses on the design of the CNC algorithm in
the uplink phase; the issue in the downlink phase is similar to
that in conventional point-to-point transmission and does not
require special treatment [10].
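The two-slot exchange can be illustrated at the bit level. The toy sketch below hands the relay the two packets directly, which is a simplification: in PNC the relay never sees $U^A$ and $U^B$ separately and must estimate their XOR from the superimposed signal. The helper name `xor_bits` and the example packets are hypothetical.

```python
def xor_bits(a, b):
    """Bit-wise XOR of two equal-length bit lists."""
    return [x ^ y for x, y in zip(a, b)]

u_a = [1, 0, 1, 1]            # source packet of node A (example values)
u_b = [0, 0, 1, 0]            # source packet of node B (example values)

# Uplink: ideally the relay ends up with the network-coded packet U^R.
u_r = xor_bits(u_a, u_b)

# Downlink: each end node removes its own packet (self-information).
recovered_b_at_a = xor_bits(u_r, u_a)
recovered_a_at_b = xor_bits(u_r, u_b)
```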
As shown in Fig. 1, in the uplink phase, the source packets of
nodes A and B each go through a convolutional encoder, an
interleaver, and a modulator. Our framework can accommodate
all different types of convolutional codes. We adopt zero-tail
convolutional codes$^1$ in the main text of this paper. We denote
the source packets of node A and node B by two $K$-bit binary
sequences:
$$U^i = \left(u^i_1, u^i_2, \cdots, u^i_K\right), \quad i \in \{A, B\}, \tag{1}$$
1 The use of other convolutional codes (e.g., tail-biting [19] and recursive) is
discussed in the Appendix II.
Fig. 1. System model of two-way relay network operated with physical-layer
network coding.
where $u^i_k$ is the input bit of end node $i$'s source packet at time
$k$. The source packets are encoded into two $M$-bit channel-coded
binary sequences. We assume nodes A and B use the
same convolutional code with code rate $r$. In the following
presentation we choose $r = 1/R$, where $R$ is an integer, as
an example. For concreteness, let us consider $R = 3$. Thus,
$M = 3K$. The two channel-coded packets are
$$\begin{aligned}
C^i &= \left(c^i_1, c^i_2, \cdots, c^i_M\right) \\
    &= \left(\bar{c}^i_1, \bar{c}^i_2, \cdots, \bar{c}^i_K\right) \\
    &= \left(c^i_{1,1}, c^i_{1,2}, c^i_{1,3}, c^i_{2,1}, c^i_{2,2}, \cdots, c^i_{K,1}, c^i_{K,2}, c^i_{K,3}\right), \quad i \in \{A, B\},
\end{aligned} \tag{2}$$
where $c^i_{k,j}$ is the $j$th channel-coded bit of end node $i$'s channel-coded
packet at time $k$; the 3-bit tuple $\bar{c}^i_k = (c^i_{k,1}, c^i_{k,2}, c^i_{k,3})$
is the output of the convolutional encoder of node $i$ at time $k$.
Then, $C^A$ and $C^B$ are fed into their respective block interleavers
that realize the same permutation to produce
$$\tilde{C}^i = \left(c^i_{1,1}, c^i_{2,1}, \cdots, c^i_{K,1};\; c^i_{1,2}, \cdots, c^i_{K,2};\; c^i_{1,3}, \cdots, c^i_{K,3}\right), \quad i \in \{A, B\}. \tag{3}$$
Note that after the permutation, the $j$th coded bits of all the
source bits are grouped into a block. There are altogether three
blocks. Finally, $\tilde{C}^i$ are modulated to produce the two sequences
of $N$ complex symbols:
$$X^i = \left(x^i_1, x^i_2, \cdots, x^i_N\right), \quad i \in \{A, B\}. \tag{4}$$
Throughout this paper, we focus on BPSK and QPSK modulations;
our framework can be easily extended to higher order
constellations [20]–[22]. For BPSK, $N = 3K$ and $x^i_n \in \{1, -1\}$.
For QPSK, $N = 3K/2$ and $x^i_n \in \frac{1}{\sqrt{2}}\{1+j, -1+j, 1-j, -1-j\}$.
The complex symbol sequences $X^A$ and $X^B$ are shaped using
a pulse shaping function $p(t)$ with symbol duration $T$ and
transmitted. Without loss of generality, we assume $p(t)$ is the
rectangular pulse throughout this paper.
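The transmit chain of (2)-(4) can be sketched for BPSK as follows. The generator polynomials (octal 7, 7, 5, memory 2) and the bit-to-symbol map 0 → +1, 1 → −1 are assumptions made only for illustration, since the text does not fix a particular code or mapping. The tail bits are appended inside the encoder, so the number of trellis steps here is the number of information bits plus the memory.

```python
GENERATORS = (0b111, 0b111, 0b101)  # hypothetical rate-1/3 generators, constraint length 3
MEMORY = 2

def conv_encode(bits):
    """Zero-tail rate-1/3 encoding; returns one 3-bit tuple per trellis step."""
    state = 0
    out = []
    for u in list(bits) + [0] * MEMORY:      # tail bits return the encoder to state 0
        reg = (u << MEMORY) | state          # newest bit sits in the most significant position
        out.append(tuple(bin(reg & g).count("1") % 2 for g in GENERATORS))
        state = reg >> 1
    return out

def block_interleave(tuples):
    """Group the j-th coded bit of every step into block j, as in (3)."""
    return [t[j] for j in range(3) for t in tuples]

def bpsk(bits):
    """Assumed mapping: bit 0 -> +1, bit 1 -> -1."""
    return [1 - 2 * b for b in bits]

coded = conv_encode([1, 0, 1])
interleaved = block_interleave(coded)
symbols = bpsk(interleaved)
```

With three information bits and two tail bits the encoder runs for five steps, so `symbols` holds 15 BPSK symbols arranged as three blocks of five.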
Let us denote the channel coefficients of the channels from
node A and node B to relay R by $h^A$ and $h^B$, respectively.
Both $h^A$ and $h^B$ are complex numbers, whose phase difference
$\phi = \angle(h^B/h^A)$ is the relative phase offset between node A and
node B. We assume that the channel state information (CSI) $h^A$
and $h^B$ can be estimated at the relay R using preambles. Node A
and node B use different pseudo-noise (PN) sequences that have
good cross-correlation properties (e.g., Gold sequences) as their
preambles. Upon receiving the superposed packet, the relay
cross-correlates the received signal with node A's preamble to
estimate the channel coefficient $h^A$. Since node A and node B
use different PN sequences as preambles, the influence of node
B's signal is removed by the cross-correlation. The relay also
estimates $h^B$ using the same method.
The received complex baseband signal at the relay is
$$y^R(t) = \sum_{n=1}^{N} \left[ h^A x^A_n\, p(t - nT) + h^B x^B_n\, p(t - nT - \tau T) \right] + w^R(t), \tag{5}$$
where $\tau T$ is the symbol misalignment (i.e., the arrival time
of the signal of B lags the arrival time of the signal of A by
$\tau T$). The relay can estimate the symbol misalignment using the
two PN preambles. First, the relay cross-correlates the received
signal with node A's preamble to locate the first sample of A's
packet; it then cross-correlates the received signal with node
B's preamble to locate the first sample of B's packet. Finally,
the relay calculates their difference to estimate $\tau$. This method
works even if the end nodes' preambles are partially overlapped,
owing to the good cross-correlation property of the PN preambles.
The noise term $w^R(t)$ is assumed to be circularly symmetric complex
Gaussian with variance $\sigma^2$. We assume the symbol misalignment to consist
of two parts: an integral part $\tau_I \in \mathbb{N}$ and a fractional part
$\tau_F \in [0, 1)$, so that $\tau = \tau_I + \tau_F$.
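The preamble-based estimation described above can be sketched at symbol rate. This is a simplified illustration under several assumptions: random ±1 sequences stand in for Gold sequences, only the preambles are transmitted, the channel is noiseless, and only an integral offset is present (estimating $\tau_F$ would require oversampling, which is not shown). All names and numeric values are hypothetical.

```python
import cmath
import random

def correlate_at(rx, preamble, start):
    """Correlate rx[start:start+L] with a real +/-1 preamble of length L."""
    return sum(rx[start + i] * p for i, p in enumerate(preamble))

def locate_and_estimate(rx, preamble):
    """Return (index of the correlation peak, channel estimate at that peak)."""
    length = len(preamble)
    starts = range(len(rx) - length + 1)
    best = max(starts, key=lambda s: abs(correlate_at(rx, preamble, s)))
    return best, correlate_at(rx, preamble, best) / length

random.seed(1)
L = 127
pn_a = [random.choice((1, -1)) for _ in range(L)]   # stand-in for node A's PN preamble
pn_b = [random.choice((1, -1)) for _ in range(L)]   # stand-in for node B's PN preamble
h_a = 1.0 + 0.0j
h_b = 0.8 * cmath.exp(1j * cmath.pi / 3)
tau_i = 5                                           # B lags A by 5 symbols

# Superimpose the two preambles at the relay.
rx = [0j] * (L + 20)
for n in range(L):
    rx[n] += h_a * pn_a[n]
    rx[n + tau_i] += h_b * pn_b[n]

start_a, h_a_hat = locate_and_estimate(rx, pn_a)
start_b, h_b_hat = locate_and_estimate(rx, pn_b)
tau_hat = start_b - start_a                         # estimated integral misalignment
phi_hat = cmath.phase(h_b_hat / h_a_hat)            # estimated relative phase offset
```

The channel estimates carry a residual error from the nonzero cross-correlation between the two sequences, which is why sequences with good cross-correlation properties are preferred in practice.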
IV. SYNCHRONOUS CONVOLUTIONAL-CODED PNC
Let us first focus on synchronous convolutional-coded PNC,
where the signals of node A and node B are symbol-aligned
(τ = 0). Section V will discuss the asynchronous case. We
first derive the XOR packet-optimal Jt-CNC algorithm that
aims at finding the ML XORed source packet. We show that
such an XOR packet-optimal algorithm has prohibitively high
complexity. Then we introduce our XOR bit-optimal Jt-CNC
algorithm that finds the ML XORed bits, which has much lower
complexity.
A. XOR Packet-Optimal Decoding of Synchronous Convolutional-Coded PNC
In the case of synchronous convolutional-coded PNC, the
received baseband signal at relay R is obtained by setting
symbol misalignment $\tau$ to zero in (5):
$$y^R(t) = \sum_{n=1}^{N} \left[ h^A x^A_n\, p(t - nT) + h^B x^B_n\, p(t - nT) \right] + w^R(t). \tag{6}$$
After matched filtering [5], the received baseband samples at
relay R are
$$Y^R = \left(y^R_1, y^R_2, \cdots, y^R_N\right), \tag{7}$$
where
$$y^R_n = h^A x^A_n + h^B x^B_n + w^R_n. \tag{8}$$
The ML XORed source packet $\hat{U}^R = (\hat{u}^R_1, \hat{u}^R_2, \cdots, \hat{u}^R_K)$ (i.e., the
ML XOR of the source packets of node A and node B) is given
by
$$\hat{U}^R = \arg\max_{U^R} \sum_{U^A, U^B :\, U^A \oplus U^B = U^R} \exp\left\{ -\mathcal{M}\left(X^A, X^B\right) \right\}, \tag{9}$$
where $\oplus$ denotes the binary bit-wise XOR operator; $X^A$ and
$X^B$ are the convolutional-encoded and modulated baseband
signals of $U^A$ and $U^B$, respectively; and $\mathcal{M}(X^A, X^B)$ is the
distance metric defined as follows:
$$\mathcal{M}\left(X^A, X^B\right) = \sum_{n=1}^{N} \frac{\left| y^R_n - h^A x^A_n - h^B x^B_n \right|^2}{2\sigma^2} = \frac{\left\| Y^R - h^A X^A - h^B X^B \right\|_2^2}{2\sigma^2}. \tag{10}$$
For source packets $U^A$ and $U^B$ of length $K$, the functional
mapping from $U^A$ and $U^B$ to the XORed source packet $U^R$
can be expressed as
$$f_{\text{packet}} : \{0, 1\}^K \times \{0, 1\}^K \to \{0, 1\}^K. \tag{11}$$
The mapping in (11) is a $2^K$-to-1 mapping; that is, there are
$2^K$ possible $(U^A, U^B)$ that can produce a particular $U^R$. This
is where the complexity lies in (9). For each possible source
packet $U^R$, we need to examine $2^K$ possible combinations of
$(U^A, U^B)$, with each $(U^A, U^B)$ associated with one pair of
channel-coded signals $(X^A, X^B)$. The Viterbi algorithm is a
shortest-path algorithm that computes a single path in the trellis of
$(U^A, U^B)$, whereas each $U^R$ is associated with $2^K$ paths
in the trellis. There is no known exact computation method for
(9) except to exhaustively sum over the possible combinations
of $(U^A, U^B)$ for each $U^R$.
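To make the exhaustive sum in (9) concrete, the following sketch evaluates it by brute force for a very short packet. It assumes BPSK, a hypothetical rate-1/3 code (octal generators 7, 7, 5), the mapping 0 → +1, 1 → −1, and no interleaver; the tail bits are appended inside the encoder, so `k` counts information bits only. It is meant to show why the search is exponential, not as a practical decoder.

```python
import itertools
import math

GENERATORS = (0b111, 0b111, 0b101)   # hypothetical rate-1/3 code, memory 2
MEMORY = 2

def encode_bpsk(bits):
    """Zero-tail encode, then map 0 -> +1, 1 -> -1 (interleaver omitted)."""
    state, symbols = 0, []
    for u in list(bits) + [0] * MEMORY:
        reg = (u << MEMORY) | state
        symbols += [1 - 2 * (bin(reg & g).count("1") % 2) for g in GENERATORS]
        state = reg >> 1
    return symbols

def metric(y, x_a, x_b, h_a, h_b, sigma2):
    """Distance metric of (10)."""
    return sum(abs(yn - h_a * a - h_b * b) ** 2
               for yn, a, b in zip(y, x_a, x_b)) / (2.0 * sigma2)

def ml_xor_packet(y, k, h_a, h_b, sigma2):
    """Exhaustive evaluation of (9): 2^k candidates U^R, 2^k pairs for each."""
    packets = list(itertools.product((0, 1), repeat=k))
    score = {}
    for u_a in packets:
        x_a = encode_bpsk(u_a)
        for u_b in packets:
            u_r = tuple(a ^ b for a, b in zip(u_a, u_b))
            m = metric(y, x_a, encode_bpsk(u_b), h_a, h_b, sigma2)
            score[u_r] = score.get(u_r, 0.0) + math.exp(-m)
    return max(score, key=score.get)

# Noiseless example with hypothetical channel gains.
u_a, u_b = (1, 0, 1), (0, 1, 1)
h_a, h_b = 1.0, 0.7j
y = [h_a * a + h_b * b for a, b in zip(encode_bpsk(u_a), encode_bpsk(u_b))]
decoded = ml_xor_packet(y, 3, h_a, h_b, sigma2=0.5)
```

The double loop visits all $2^{2k}$ pairs, so even modest packet lengths are out of reach for this approach.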
We now consider the computational complexity of the
XOR packet-optimal decoding algorithm. For each possible
$(U^A, U^B)$, we need to sum over $N$ terms in (10) to compute
$\mathcal{M}(X^A, X^B)$. For a code of rate $r$ and $M$-QAM modulation,
$N = K/[r\log_2(M)]$. Computing each term in (10) takes six
complex operations, and the summation takes $(N - 1)$ operations.
Hence the complexity of one combination of $(U^A, U^B)$
is $(7K/[r\log_2(M)] - 1)$. Moreover, to find the maximum in (9),
$(2^K - 1)$ comparisons are needed. Given that there are $2^K$ possible
$U^R$, each associated with $2^K$ combinations of $(U^A, U^B)$, the overall
complexity is therefore $2^{2K}(7K/[r\log_2(M)] - 1) + 2^K - 1$. In
Big-O notation, the complexity is $O(K 2^{2K})$.
This is in sharp contrast with the regular point-to-point
communication system, in which the Viterbi algorithm for
finding the ML codeword has polynomial complexity only. For
PNC systems, the complexity of the XOR packet-optimal decoding
algorithm is exponential in packet length $K$.
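The operation count above is easy to evaluate numerically. The helper below simply codes the expression $2^{2K}(7N - 1) + 2^K - 1$ with $N = K/[r\log_2(M)]$; the function name and the parameter `r_inverse` (the integer $R = 1/r$) are hypothetical, and the integer division assumes $K \cdot R$ is a multiple of $\log_2(M)$.

```python
import math

def packet_optimal_ops(k, r_inverse, mod_order):
    """Operation count 2^(2K) * (7N - 1) + 2^K - 1, with N = K / (r * log2(M))."""
    bits_per_symbol = int(math.log2(mod_order))
    n = k * r_inverse // bits_per_symbol
    return 2 ** (2 * k) * (7 * n - 1) + 2 ** k - 1

ops_k4 = packet_optimal_ops(4, 3, 2)   # K = 4, r = 1/3, BPSK
ops_k8 = packet_optimal_ops(8, 3, 2)   # doubling K multiplies the count by roughly 2^8 * 2
```

For $K = 4$, $r = 1/3$, and BPSK this gives $N = 12$ and $256 \times 83 + 15 = 21263$ operations, which already hints at how quickly the count grows with $K$.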
B. XOR Bit-Optimal Decoding of Synchronous Convolutional-Coded PNC
To reduce complexity, we consider an XOR bit-optimal Jt-CNC decoder based on the framework of Belief Propagation
(BP) algorithms. The proposed decoder aims to find the ML
XORed source bit rather than the ML XORed source packet.
We give two important results: (i) the proposed Jt-CNC decoder
is optimal in terms of BER performance; and (ii) the complexity
is linear in packet length K.
Unlike finding ML XORed packets, for which the Viterbi
algorithm is of little use, the BP (similar to BCJR) algorithm
can find the ML XORed source bit without incurring exponential growth in complexity. We first explain the reason before
describing the BP algorithm in detail.
The $k$th ML XORed source bit $\hat{u}^R_k$, $k = 1, 2, \ldots, K$, is given
by
$$\hat{u}^R_k = \arg\max_{u^R_k} \sum_{u^A_k, u^B_k :\, u^A_k \oplus u^B_k = u^R_k} \Pr\left(u^A_k, u^B_k \mid Y^R, \mathcal{C}\right), \tag{12}$$
where $\Pr(u^A_k, u^B_k \mid Y^R, \mathcal{C})$ denotes the a posteriori probability
of $(u^A_k, u^B_k)$ given the received signal $Y^R$ and the codebook
$\mathcal{C}$. We can use the BP algorithm to calculate this probability.
Fortunately, finding the ML XORed bits in PNC systems has
much lower complexity, because the functional mapping from
$(u^A_k, u^B_k)$ to $u^R_k$ can be expressed as
$$f_{\text{bit}} : \{0, 1\} \times \{0, 1\} \to \{0, 1\}. \tag{13}$$
The mapping in (13) is a 2-to-1 mapping; hence for each
possible realization of the XOR bit $u^R_k = u^A_k \oplus u^B_k$, we need to
examine only two possible realizations of the pair of source
bits $(u^A_k, u^B_k)$. Importantly, the BP algorithm can compute
$\Pr(u^A_k, u^B_k \mid Y^R, \mathcal{C})$ easily, from which $\Pr(u^A_k \oplus u^B_k \mid Y^R, \mathcal{C})$ can
readily be obtained through the 2-to-1 mapping in (13).

Fig. 2. Tanner graph of the Jt-CNC decoder on which the BP algorithm operates:
$s_k$ is the joint state variable that captures the states of the convolutional
codes at the two users at time $k$; $\bar{u}_k$ is the pair of source bits of the two users
at time $k$; $\bar{c}_k$ is the group of channel-coded bits of the two users at time $k$;
$f_k$ is the factor node that represents the state transition function of the Jt-CNC
decoder; $\gamma(\bar{c}_k)$ is the likelihood function of $\bar{c}_k$.
We now explain the details of our BP algorithm that
implements the XOR bit-optimal Jt-CNC decoder. BP is a
general framework for generating inference-making algorithms
for graphical models, in which there are two kinds of nodes:
variable nodes and factor nodes. Each variable node represents a
variable, such as the state variable of the convolutional encoder;
each factor node indicates the relationship among all variable
nodes connected to it. For example, the state transition function
of a convolutional encoder is represented by a factor node. The
goal of BP is to compute the marginal probability distributions
$\Pr(u^A_k, u^B_k \mid Y^R, \mathcal{C})$ for all $k$. This goal is achieved by means of
a sum-product message-passing algorithm [23].
Fig. 2 shows the Tanner graph of our bit-optimal Jt-CNC
decoder. Unlike the conventional point-to-point convolutional
decoder for single-user systems with only one transmitter, the
Jt-CNC decoder combines the states and the trellises of both
transmitters A and B. Note here that node A and node B
can use different convolutional codes with the same code rate.
In Fig. 2, vector $S = (s_0, s_1, \cdots, s_K)$ represents the state
variables, where state $s_k$ combines the states of both end nodes;
vector $U = (\bar{u}_1, \bar{u}_2, \cdots, \bar{u}_K)$, where $\bar{u}_k = (u^A_k, u^B_k)$,
represents the "virtual" source packet consisting of the duples
of the two source packets from nodes A and B; similarly,
vector $C = (\bar{c}_1, \bar{c}_2, \cdots, \bar{c}_K)$, where $\bar{c}_k = (\bar{c}^A_k, \bar{c}^B_k)$ (as defined
in (2), $\bar{c}^i_k$ denotes the group of channel-coded bits of node $i$
at time $k$), represents the "virtual" channel-coded packet. The
behavior of the decoder is defined by the factor node functions
$f_k(s_{k-1}, \bar{u}_k, \bar{c}_k, s_k)$, which represent the state transition rule
of the trellis. For a trellis transition $e = (s_{k-1}, \bar{u}_k, \bar{c}_k, s_k)$,
$f_k(e) = 1$ if $e$ is a valid transition, and $f_k(e) = 0$ otherwise.
For example, if input $\bar{u}_k$ causes a state transition from $s_{k-1}$ to
$s_k$ and the output is $\bar{c}_k$, then $f_k(e) = 1$; on the other hand, if
input $\bar{u}_k$ causes a state transition from $s_{k-1}$ to a state not equal
to $s_k$ or the output is not $\bar{c}_k$, then $f_k(e) = 0$.

Fig. 3. The messages being passed around a factor node during the operation
of the sum-product algorithm.
The goal of the Jt-CNC decoder is to find the maximum
likelihood XOR bit $u^R_k$ through the a posteriori probability
(APP) $\Pr(\bar{u}_k \mid Y^R, \mathcal{C})$ by
$$\Pr\left(u^R_k \mid Y^R, \mathcal{C}\right) = \sum_{\bar{u}_k :\, u^A_k \oplus u^B_k = u^R_k} \Pr\left(\bar{u}_k \mid Y^R, \mathcal{C}\right), \tag{14}$$
where $\Pr(\bar{u}_k \mid Y^R, \mathcal{C})$ can be computed exactly by the sum-product
message-passing algorithm thanks to the tree structure of the Tanner
graph associated with convolutional codes
[24]. The sum-product algorithm, when applied to decode
convolutional codes, is the well-known BCJR algorithm [12].
The difference in our situation here is that instead of the
source bit from one source, we are decoding for the bit duple
$\bar{u}_k = (u^A_k, u^B_k)$ from the two sources.
We now explain our sum-product algorithm in detail. Fig. 3
depicts the messages being passed around a factor node within
the overall Tanner graph of Fig. 2. We follow the notation of
the original paper on the BCJR algorithm [12]. In the forward
direction, the message from $s_{k-1}$ to $f_k$ is denoted by $\alpha(s_{k-1})$,
and the message from $f_k$ to $s_k$ is denoted by $\alpha(s_k)$. In the
backward direction, the message from $s_k$ to $f_k$ is denoted by
$\beta(s_k)$, and the message from $f_k$ to $s_{k-1}$ is denoted by $\beta(s_{k-1})$.
Additionally, $\gamma(\bar{c}_k)$ denotes the message from $\bar{c}_k$ to $f_k$, and
$\delta(\bar{u}_k)$ denotes the message from $f_k$ to $\bar{u}_k$. Note that $\delta(\bar{u}_k)$ is
the APP $\Pr(\bar{u}_k \mid Y^R, \mathcal{C})$ and the goal here is to compute it.
Since the Tanner graph of the Jt-CNC decoder is cycle-free,
the operation of the sum-product algorithm consists of
two natural recursions according to the direction of message
flow in the graph: a forward recursion to compute $\alpha(s_k)$
as a function of $\alpha(s_{k-1})$ and $\gamma(\bar{c}_k)$; a backward recursion
to compute $\beta(s_{k-1})$ as a function of $\beta(s_k)$ and $\gamma(\bar{c}_k)$. The
calculation of $\Pr(\bar{u}_k \mid Y^R, \mathcal{C})$ can be divided into three steps:
initialization, forward/backward recursion, and termination. We
present these three steps in detail below.
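As a compact companion to the three steps, the sketch below runs a forward-backward pass over the joint trellis of two identical encoders and then applies (14) and (12). Several choices are assumptions for illustration only: the rate-1/3 generators (octal 7, 7, 5), BPSK with 0 → +1 and 1 → −1, no interleaver (the symbols of step $k$ occupy three consecutive samples), and per-step normalization of the messages. The final combination of $\alpha$, $\gamma$, and $\beta$ is the standard sum-product marginal, and the constant factor of the Gaussian likelihood is dropped because it cancels in the comparison.

```python
import math

GENERATORS = (0b111, 0b111, 0b101)   # hypothetical rate-1/3 code, memory 2
MEMORY = 2
N_STATES = 1 << MEMORY

def step(state, u):
    """One encoder transition: returns (next_state, three BPSK symbols)."""
    reg = (u << MEMORY) | state
    syms = tuple(1 - 2 * (bin(reg & g).count("1") % 2) for g in GENERATORS)
    return reg >> 1, syms

def encode(bits):
    state, out = 0, []
    for u in bits:
        state, syms = step(state, u)
        out.extend(syms)
    return out

def jt_cnc_xor_bits(y, n_steps, h_a, h_b, sigma2):
    """Forward-backward pass on the joint trellis; returns hard XOR-bit decisions."""
    joint_states = [(a, b) for a in range(N_STATES) for b in range(N_STATES)]
    branches = []                     # valid transitions (f_k = 1) with likelihoods
    for k in range(n_steps):
        row = []
        for (s_a, s_b) in joint_states:
            for u_a in (0, 1):
                for u_b in (0, 1):
                    n_a, x_a = step(s_a, u_a)
                    n_b, x_b = step(s_b, u_b)
                    d = sum(abs(y[3 * k + j] - h_a * x_a[j] - h_b * x_b[j]) ** 2
                            for j in range(3))
                    gamma = math.exp(-d / (2.0 * sigma2))
                    row.append(((s_a, s_b), (u_a, u_b), (n_a, n_b), gamma))
        branches.append(row)

    def normalized(msg):
        total = sum(msg.values())
        return {s: v / total for s, v in msg.items()}

    zero = (0, 0)                     # zero-tail codes start and end in the zero state
    alpha = [{s: float(s == zero) for s in joint_states}]
    for k in range(n_steps):          # forward recursion
        nxt = dict.fromkeys(joint_states, 0.0)
        for s, _, t, g in branches[k]:
            nxt[t] += alpha[k][s] * g
        alpha.append(normalized(nxt))

    beta = [None] * n_steps + [{s: float(s == zero) for s in joint_states}]
    for k in reversed(range(n_steps)):   # backward recursion
        prev = dict.fromkeys(joint_states, 0.0)
        for s, _, t, g in branches[k]:
            prev[s] += beta[k + 1][t] * g
        beta[k] = normalized(prev)

    decisions = []
    for k in range(n_steps):          # marginalize bit pairs onto their XOR
        p_xor = [0.0, 0.0]
        for s, (u_a, u_b), t, g in branches[k]:
            p_xor[u_a ^ u_b] += alpha[k][s] * g * beta[k + 1][t]
        decisions.append(0 if p_xor[0] >= p_xor[1] else 1)
    return decisions

# Noiseless example; the tail bits are appended explicitly.
u_a = [1, 0, 1, 1, 0] + [0] * MEMORY
u_b = [0, 1, 1, 0, 0] + [0] * MEMORY
h_a, h_b = 1.0, 0.7j
y = [h_a * a + h_b * b for a, b in zip(encode(u_a), encode(u_b))]
xor_hat = jt_cnc_xor_bits(y, len(u_a), h_a, h_b, sigma2=0.5)
```

The work per trellis step is fixed by the number of joint states and input pairs, so the total cost grows linearly with the number of steps, in contrast to the exhaustive packet-level search.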
Initialization As usual in a cycle-free Tanner graph, the sum-product
algorithm begins at the leaf nodes. Since a zero-tail
convolutional code is used, the initial and terminal states of each end
node's convolutional encoder are both the zero state. Therefore the
messages $\alpha(s_0)$ and $\beta(s_K)$ are initialized as
$$\alpha(s_0) = \begin{cases} 1 & \text{if } s_0 = 0 \\ 0 & \text{otherwise,} \end{cases} \tag{15a}$$
and
$$\beta(s_K) = \begin{cases} 1 & \text{if } s_K = 0 \\ 0 & \text{otherwise.} \end{cases} \tag{15b}$$
The message $\gamma(\bar{c}_k)$ is the likelihood function of $\bar{c}_k$
based on the evidence $Y^R$. For example, if the code rate
is 1/3 and BPSK modulation is used, then
$\bar{c}_k = (c^A_{k,1}, c^A_{k,2}, c^A_{k,3}, c^B_{k,1}, c^B_{k,2}, c^B_{k,3})$. These channel-coded bits $\bar{c}_k$ are
mapped to BPSK modulated symbols $(x^A_{3k-2}, x^A_{3k-1}, x^A_{3k})$ and
$(x^B_{3k-2}, x^B_{3k-1}, x^B_{3k})$ at node A and node B, respectively. Given
the overlapped signal $Y^R$ at the relay node, the likelihood of
$\bar{c}_k$ is calculated by
$$\gamma(\bar{c}_k) = \prod_{j=1}^{3} \frac{1}{2\pi\sigma^2} \exp\left\{ -\frac{\left| y^R_{3(k-1)+j} - h^A x^A_{3(k-1)+j} - h^B x^B_{3(k-1)+j} \right|^2}{2\sigma^2} \right\}. \tag{16}$$
Forward/backward recursion After initializing the messages
from leaf nodes, we can compute the messages $\alpha(s_k)$ and $\beta(s_k)$
recursively by following the message update rule below [24]:
$$\alpha(s_k) = \sum_{s_{k-1}, \bar{u}_k, \bar{c}_k} f_k(s_{k-1}, \bar{u}_k, \bar{c}_k, s_k)\, \alpha(s_{k-1})\, \gamma(\bar{c}_k), \tag{17a}$$
$$\beta(s_{k-1}) = \sum_{s_k, \bar{u}_k, \bar{c}_k} f_k(s_{k-1}, \bar{u}_k, \bar{c}_k, s_k)\, \beta(s_k)\, \gamma(\bar{c}_k). \tag{17b}$$

Fig. 4. Symbol misalignment in PNC: a general symbol misalignment consists
of an integral part $\tau_I = 2$ and a fractional part $\tau_F = 0.7$.

Fig. 5. Decoding framework for asynchronous convolutional-coded PNC
systems. This framework can deal with an integral-plus-fractional symbol
misalignment.

finding a single ML XORed source bit in (12) takes two
summations and one comparison, so finding all the ML XORed
source bits takes $3K$ operations beyond the operations by the
BP algorithm that computes $\Pr(u^A_k, u^B_k \mid Y^R, \mathcal{C})$, $k = 1, 2, \ldots, K$.
Therefore finding the ML XORed source bits of length-$K$ packets has an
overall complexity of $9K/[r\log_2(M)] + 6 \cdot 2^{2/r} K S^2 + 4 \cdot 2^{2/r} K S^2 + 3K$.
In Big-O notation, the complexity of the source
bit-optimal decoding algorithm is $O(K)$. Compared with the
XOR packet-optimal decoding algorithm, the XOR bit-optimal
algorithm has a much lower complexity and is therefore more
feasible in practice.

V. ASYNCHRONOUS CONVOLUTIONAL-CODED PNC
In this section, we present our three-layer decoding framework
for asynchronous convolutional-coded PNC. The asynchrony
causes unique challenges that the synchronous decoder
in Section IV cannot handle. As shown in Fig. 4, when the
signals of nodes A and B arrive at the relay at different times,
their symbols can be misaligned. The symbol misalignment
consists of two parts: an integral part $\tau_I$ and a fractional part
$\tau_F$. These two components impose different challenges: the
fractional symbol misalignment causes overlaps of adjacent
symbols, and as a result the symbol-boundary preserving sampling
as expressed in (8) is no longer valid; the integral symbol
misalignment entangles the channel-coded bits of nodes A and
B in such a way that the decoding scheme as proposed in
Section IV cannot be applied anymore.
To address these challenges, we add two layers to the Jt-CNC
decoder to construct an integrated framework, as illustrated in
Fig. 5. First, to address the fractional symbol misalignment,
the symbol-realignment layer uses a BP algorithm at the relay
to "realign" the soft information of the symbols. Second, the
codeword-realignment layer uses an interleaver/deinterleaver

Termination In the final step, the algorithm terminates with
the computation of $\delta(\bar{u}_k)$, which gives the APP of the source
bit pair $\bar{u}_k$
$\delta(\bar{u}_k) =$
fk (sk−1 , ūk , c̄k , sk )α (sk−1 ) γ (c̄k ) β (sk ) .
set-up to accommodate the integral symbol misalignment. As
a result, the three-layer decoding framework can deal with the
sk−1 ,sk ,c̄k
(18)
integral-plus-fractional symbol misalignment.
The summation in (18) is over different trellis transitions
e = (sk−1 , ūk , c̄k , sk ) with fixed ūk , such that fk (e) = 1 if e
A. Symbol-Realignment Layer: Addressing Fractional Symbol
is a valid transition, and fk (e) = 0 otherwise.
Misalignment
Let us consider the computing complexity of the XOR
bit-optimal decoding algorithm. The initialization takes
For simplicity, as in [5], we assume the use of rectangular
9K/[rlog2 (M )] complex operations in (16). The forpulse to carry the modulated signal in the analog domain. As
illustrated in Fig. 4, the fractional symbol misalignment τF
ward/backword recursions step take 6·22/r KS 2 operations,
where S is the number of decoder’s states. The termination
causes an inter-symbol interference between A’s and B’s signals
A
A
step for computing (18) takes 4·22/r KS 2 operations. Because
(e.g., xB
1 overlaps with part of x3 and part of x4 ). In [25],
γ(c̄k ) ∝
√
7
Pr x1A , x1B Y R
\o
Pr xNA , xNB Y R
\e
x1,0
Pr y1R x1,0
\e
x1,1
Pr y2R x1,1
x N ,N
x 2,1
Pr y3R x 2,1
Pr y2RN x N , N
x N 1, N
Pr y
R
2 N 1
x N 1, N
Fig. 6. Tanner graph of the symbol-realignment layer. The variable node xi,j
B
corresponds to the joint symbol (xA
i , xj ); nodes ψo and ψe are the factor
nodes that constrain the relationships among the variable nodes. The likelihood
R
R |xn,n ) are the evidences from
|xn,n−1 ) and Pr(y2n
probabilities Pr(y2n−1
observation Y R .
suboptimal sampling was assumed: with respect to Fig. 4, only
A
the overlapped part of xB
1 and x3 is sampled, and the useful
A
signal in the overlapped part of xB
1 and x4 is discarded. By
contrast, our method here is an optimal maximum-likelihood
(ML) oversampling method based on the BP algorithm. Specifically, the relay R performs integration (matched filtering) on
the overlapped symbols for a duration τF and a duration of
(1 − τF ) alternately to generate (2N + 1) samples. Our oversampling method makes full use of the overlapped signal and
is ML optimal.
Let us first ignore the integral part of symbol misalignment and only consider the fractional part (i.e., $\tau < 1$) in this subsection. When the integral part of symbol misalignment is not zero, the factor graphs follow different equations for non-overlapping parts of the received signal. Because the non-overlapping parts are "clear" signal, we use the traditional sampling technique for them. The soft information of the non-overlapping parts of the signal can be directly computed, so we only focus on the Tanner graph of the overlapped signal. Furthermore, let us assume $|h^A| = |h^B| = \sqrt{P}$, where $P$ is the transmission power of the end nodes. The design of this layer is similar to the decoding method of asynchronous non-channel-coded PNC expounded in [5], [26]. It is provided here for completeness and to illustrate how this layer is tied to the upper layer.
With over-sampling on the received signal $y^R(t)$, the total number of samples obtained per packet is $(2N + 1)$, where $N$ is the number of symbols per packet (for both users A and B). The relay uses the $(2N + 1)$ samples to compute the soft information $\Pr(x^A_n, x^B_n \mid Y^R)$, where instead of the expression in (7), $Y^R = (y^R_1, y^R_2, \cdots, y^R_{2N+1})$ consists of the $(2N + 1)$ samples. Thus, as far as the soft information fed to the upper layer is concerned, the fractional symbol misalignment is removed and the symbols are realigned. We emphasize that this realignment of soft information is a key step. Once that is done, the channel decoding algorithm for synchronous PNC as proposed in Section IV can be applied.
We can write the samples obtained at the relay R as follows (after normalization):

$$y^R_{2n-1} = x^A_n + x^B_{n-1} e^{j\phi} + w^R_{2n-1}, \tag{19a}$$

$$y^R_{2n} = x^A_n + x^B_n e^{j\phi} + w^R_{2n}, \tag{19b}$$

where $n = 1, 2, \ldots, N$, $x^B_0 = 0$, and $y^R_{2N+1} = x^B_N e^{j\phi} + w^R_{2N+1}$. The terms $w^R_{2n-1}$ and $w^R_{2n}$ are zero-mean complex Gaussian noise with variances $\sigma^2/(\tau_F P)$ and $\sigma^2/[(1 - \tau_F)P]$ per dimension, respectively [5].
We use a BP algorithm to compute the soft information $\Pr(x^A_n, x^B_n \mid Y^R)$ from the $(2N + 1)$ samples. The associated Tanner graph is shown in Fig. 6. In the Tanner graph, $x^{i,j} \triangleq (x^A_i, x^B_j)$ are the variable nodes; $\psi_o$ and $\psi_e$ are the compatibility functions associated with the factor nodes. The compatibility functions model the correlation between two adjacent symbols and are defined as

$$\psi_o\left(x^{n,n-1}, x^{n,n}\right) = \begin{cases} 1 & \text{if the values of } x^A_n \text{ in } x^{n,n-1} \text{ and } x^{n,n} \text{ are equal} \\ 0 & \text{otherwise,} \end{cases} \tag{20a}$$

$$\psi_e\left(x^{n,n}, x^{n+1,n}\right) = \begin{cases} 1 & \text{if the values of } x^B_n \text{ in } x^{n,n} \text{ and } x^{n+1,n} \text{ are equal} \\ 0 & \text{otherwise.} \end{cases} \tag{20b}$$
Note that the Tanner graph in Fig. 6 has a tree structure, hence the BP algorithm can find the "exact" a posteriori probability $\Pr(x^{n,n} \mid Y^R)$ for $n = 1, \ldots, N$. Furthermore, the solution can be found after only one iteration of the message-passing algorithm [24]. We now describe the message-passing algorithm in detail: $\alpha_L(x^{i,j})$ denotes the forward message from the factor node ($\psi_o$ or $\psi_e$) to variable node $x^{i,j}$, and $\alpha_R(x^{i,j})$ denotes the forward message from variable node $x^{i,j}$ to the factor node; $\beta_R(x^{i,j})$ denotes the backward message from the factor node to variable node $x^{i,j}$, and $\beta_L(x^{i,j})$ denotes the backward message from variable node $x^{i,j}$ to the factor node; $\gamma(x^{i,j})$ denotes the evidence of variable node $x^{i,j}$ from the observation. Let us first consider variable node $x^{n,n}$. The forward messages from left to right can be computed as

$$\alpha_L(x^{n,n}) = \sum_{x^{n,n-1}} \alpha_R(x^{n,n-1})\, \psi_o\left(x^{n,n-1}, x^{n,n}\right), \tag{21a}$$

$$\alpha_R(x^{n,n}) = \alpha_L(x^{n,n})\, \gamma(x^{n,n}). \tag{21b}$$

The backward messages from right to left can be computed as

$$\beta_R(x^{n,n}) = \sum_{x^{n+1,n}} \beta_L(x^{n+1,n})\, \psi_e\left(x^{n,n}, x^{n+1,n}\right), \tag{22a}$$

$$\beta_L(x^{n,n}) = \beta_R(x^{n,n})\, \gamma(x^{n,n}). \tag{22b}$$
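We compute the forward/backward messages of variable node $x^{n+1,n}$ in the same way, except that the factor nodes need to be modified accordingly. The likelihood probabilities $\Pr(y^R_{2n+1} \mid x^{n+1,n})$ and $\Pr(y^R_{2n} \mid x^{n,n})$ are the evidences $\gamma(x^{n+1,n})$ and $\gamma(x^{n,n})$ from observation $Y^R$. The computation of these evidences is given by

$$\gamma(x^{n+1,n}) = \Pr\left(y^R_{2n+1} \mid x^{n+1,n}\right) = \Pr\left(y^R_{2n+1} \mid x^A_{n+1}, x^B_n\right) = \frac{1}{2\pi\sigma^2/\tau_F} \exp\left\{ -\frac{\left| y^R_{2n+1} - x^A_{n+1} - x^B_n \right|^2}{2\sigma^2/\tau_F} \right\} \tag{23a}$$

and

$$\gamma(x^{n,n}) = \Pr\left(y^R_{2n} \mid x^{n,n}\right) = \Pr\left(y^R_{2n} \mid x^A_n, x^B_n\right) = \frac{1}{2\pi\sigma^2/(1 - \tau_F)} \exp\left\{ -\frac{\left| y^R_{2n} - x^A_n - x^B_n \right|^2}{2\sigma^2/(1 - \tau_F)} \right\}. \tag{23b}$$

Finally, the soft information $\Pr(x^A_n, x^B_n \mid Y^R)$ is computed by

$$\Pr\left(x^A_n, x^B_n \mid Y^R\right) = \alpha_L(x^{n,n})\, \beta_R(x^{n,n})\, \gamma(x^{n,n}). \tag{24}$$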
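The message passing in (21)-(24) can be illustrated with a minimal Python sketch. It is not the prototype implementation: it assumes real-valued BPSK symbols, unit amplitude, zero phase offset, and real Gaussian noise, and the names `gauss`, `normalize`, and `realign` are hypothetical. Samples are indexed from zero, so `y[2n]` is a "cross" sample (symbol of A overlapping the previous symbol of B) and `y[2n + 1]` is an aligned sample.

```python
import math

SYMS = (1.0, -1.0)  # BPSK alphabet (assumption: real-valued, unit amplitude)


def gauss(y, mean, var):
    """Real Gaussian likelihood of sample y (assumption: real noise, phi = 0)."""
    return math.exp(-(y - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def normalize(msg):
    """Scale a message so that its entries sum to one."""
    total = sum(msg.values())
    return {k: v / total for k, v in msg.items()}


def realign(y, sigma2, tau_f):
    """Return post[n][(a, b)] = Pr(x^A_n = a, x^B_n = b | Y^R) for n = 0..N-1.

    y holds 2N + 1 samples; y[2n] has noise variance sigma2 / tau_f and
    y[2n + 1] has noise variance sigma2 / (1 - tau_f), as in (19) and (23).
    """
    n_sym = (len(y) - 1) // 2
    var_x = sigma2 / tau_f          # cross samples
    var_a = sigma2 / (1.0 - tau_f)  # aligned samples
    alpha, beta = [None] * n_sym, [None] * n_sym

    # Forward pass: psi_o keeps x^A_n, psi_e keeps x^B_n (cf. (21a), (21b)).
    fwd_b = {0.0: 1.0}  # no B symbol before the first one (x^B_0 = 0)
    for n in range(n_sym):
        alpha[n] = normalize({a: sum(m * gauss(y[2 * n], a + b, var_x)
                                     for b, m in fwd_b.items()) for a in SYMS})
        fwd_b = normalize({b: sum(alpha[n][a] * gauss(y[2 * n + 1], a + b, var_a)
                                  for a in SYMS) for b in SYMS})

    # Backward pass (cf. (22a), (22b)).
    bwd_a = {0.0: 1.0}  # no A symbol after the last one
    for n in reversed(range(n_sym)):
        beta[n] = normalize({b: sum(m * gauss(y[2 * n + 2], a + b, var_x)
                                    for a, m in bwd_a.items()) for b in SYMS})
        bwd_a = normalize({a: sum(beta[n][b] * gauss(y[2 * n + 1], a + b, var_a)
                                  for b in SYMS) for a in SYMS})

    # Combine forward message, backward message, and evidence (cf. (24)).
    return [normalize({(a, b): alpha[n][a] * beta[n][b]
                       * gauss(y[2 * n + 1], a + b, var_a)
                       for a in SYMS for b in SYMS}) for n in range(n_sym)]
```

Because the graph is a chain, a single forward sweep and a single backward sweep suffice; the per-step normalization only guards against numerical underflow and does not change the posterior.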
The computation of forward messages in (21a) and (21b) takes $2NM^2(2M^2 - 1)$ complex operations and $2NM^2$ complex operations, respectively, where $N$ is the number of symbols per packet and $M$ is the modulation order. The same computation is needed for the backward messages in (22a) and (22b) as well. Computing the evidences in (23a) and (23b) takes $(2N + 1)7M^2$ complex operations. Equation (24) takes another $2NM^2$ complex operations. Therefore, the overall computational complexity of the symbol-realignment layer is $4NM^2(2M^2 - 1) + 4NM^2 + (2N + 1)7M^2 + 2NM^2$ complex operations. In Big-O notation, the complexity is $O(N)$. Since the complexity is linear in the packet size $N$, the symbol-realignment layer does not incur heavy overhead.
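As a quick arithmetic check of the operation count above, the following sketch evaluates the total for given $N$ and $M$. The function name `realignment_ops` is hypothetical; the formula is taken directly from the preceding paragraph.

```python
def realignment_ops(n_symbols, mod_order):
    """Complex-operation count of the symbol-realignment layer.

    Implements 4NM^2(2M^2 - 1) + 4NM^2 + (2N + 1)7M^2 + 2NM^2.
    """
    m2 = mod_order ** 2
    forward = 2 * n_symbols * m2 * (2 * m2 - 1) + 2 * n_symbols * m2  # (21a), (21b)
    backward = forward                                                # (22a), (22b)
    evidence = (2 * n_symbols + 1) * 7 * m2                           # (23a), (23b)
    posterior = 2 * n_symbols * m2                                    # (24)
    return forward + backward + evidence + posterior


# Example: N = 1000 symbols with M = 2 gives 192028 complex operations.
print(realignment_ops(1000, 2))
```

Doubling $N$ from 1000 to 2000 adds exactly 192000 operations in this example, which is the linear growth stated above.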
B. Codeword-Realignment Layer: Countering Integral Symbol
Misalignment
Since the fractional part of symbol misalignment has been removed in the symbol-realignment layer, we only consider the integral part of symbol misalignment in this subsection. Recall that in Section IV we used (16) to compute the message $\gamma(\bar{c}_k)$. Equation (16) requires the modulated symbols of end nodes A and B to be symbol-by-symbol aligned (i.e., $x^A_n$ must align with $x^B_n$). However, with integral symbol misalignment $\tau_I$, $x^A_n$ will be aligned with $x^B_{n-\tau_I}$; consequently, the algorithm proposed in Section IV becomes invalid.
The codeword-realignment layer solves this problem using a specially designed interleaver/deinterleaver at the end/relay nodes. At the end nodes, we use the same block interleaver with $R$ rows and $M/R$ columns, where $r = 1/R$ is the code rate and $M$ is the number of bits in the codeword. For interleaving, the channel-coded bits are filled into the interleaver column-wise and read out row-wise. Let $\Pi$ denote the interleave operation. The interleaving process is

$$\Pi\left(C^A\right) = \tilde{C}^A, \quad \Pi\left(C^B\right) = \tilde{C}^B. \tag{25}$$
The interleaved packets $\tilde{C}^A$ and $\tilde{C}^B$ are modulated and transmitted simultaneously to the relay. Upon receiving the overlapped signal (with symbol misalignment), the relay first deals with the fractional symbol misalignment with the algorithm proposed in Section V-A, leaving only the integral symbol misalignment $\tau_I$ after that. Then the relay wraps back the nonoverlapped signal at the tail to the head. The overlapped signal becomes $\tilde{C}^A + \tilde{C}^B_{(\tau_I)}$, where $\tilde{C}^B_{(\tau_I)}$ is the $\tau_I$-bit circular-shifted version of $\tilde{C}^B$. Finally, the relay uses the same block deinterleaver to deinterleave the overlapped signal as

$$\Pi^{-1}\left(\tilde{C}^A + \tilde{C}^B_{(\tau_I)}\right) = \Pi^{-1}\left(\tilde{C}^A\right) + \Pi^{-1}\left(\tilde{C}^B_{(\tau_I)}\right) = C^A + C^B_{(\tau_I R)}. \tag{26}$$
Let us consider an example with a convolutional code of code rate 1/3, BPSK modulation, and integral symbol misalignment of $\tau_I = 2$ (in this subsection, we only consider the integral part of $\tau$). As specified in Section III, the channel-coded packets of node A and node B are $C^A$ and $C^B$, respectively. Then the channel-coded packet is bit-interleaved with a block interleaver with 3 rows and $M/3$ columns. The interleaved packets $\tilde{C}^i$, $i \in \{A, B\}$, in (3) are BPSK modulated to produce the transmitted signal

$$X^i = \left(x^i_{1,1}, x^i_{2,1}, \cdots, x^i_{K,1}, x^i_{1,2}, \cdots, x^i_{K,2}, x^i_{1,3}, \cdots, x^i_{K,3}\right), \quad i \in \{A, B\}, \tag{27}$$

where $x^i_{k,j} = 1 - 2c^i_{k,j}$, and $c^i_{k,j}$ is the $j$th channel-coded bit of node $i$'s channel-coded packet at time $k$ as defined in (3). The received signal samples are the superposition of the following two sequences:

$$\begin{array}{cccccccc}
(x^A_{1,1}) & (x^A_{2,1}) & x^A_{3,1} & x^A_{4,1} & \cdots & x^A_{K,3} & & \\
 & & x^B_{1,1} & x^B_{2,1} & \cdots & x^B_{K-2,3} & (x^B_{K-1,3}) & (x^B_{K,3}).
\end{array} \tag{28}$$

TABLE I. Different techniques to make the convolutional code's initial state and terminal state the same.

Code             Technique
Non-recursive    zero tailing or tail biting
Recursive        zero tailing
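The relay first wraps back the nonoverlapped signals: $x^A_{1,1}$, $x^A_{2,1}$ at the head and $x^B_{K-1,3}$, $x^B_{K,3}$ at the tail (enclosed in brackets in (28) above). After the wrap-back, the realigned sequences look like

$$\begin{array}{ccccccc}
(x^A_{1,1}) & (x^A_{2,1}) & x^A_{3,1} & x^A_{4,1} & x^A_{5,1} & \cdots & x^A_{K,3} \\
(x^B_{K-1,3}) & (x^B_{K,3}) & x^B_{1,1} & x^B_{2,1} & x^B_{3,1} & \cdots & x^B_{K-2,3}.
\end{array} \tag{29}$$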
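Then the relay deinterleaves the signal in (29) to restore node A's transmission order. After deinterleaving, the received packet becomes the superposition of the following sequences:

$$\begin{array}{cccc}
\underbrace{\begin{array}{ccc} (x^A_{1,1}) & x^A_{1,2} & x^A_{1,3} \\ (x^B_{K-1,3}) & x^B_{K-1,1} & x^B_{K-1,2} \end{array}}_{\text{block-1}} &
\underbrace{\begin{array}{ccc} (x^A_{2,1}) & x^A_{2,2} & x^A_{2,3} \\ (x^B_{K,3}) & x^B_{K,1} & x^B_{K,2} \end{array}}_{\text{block-2}} &
\underbrace{\begin{array}{ccc} x^A_{3,1} & x^A_{3,2} & x^A_{3,3} \\ x^B_{1,1} & x^B_{1,2} & x^B_{1,3} \end{array}}_{\text{block-3}} &
\begin{array}{c} \cdots \\ \cdots \end{array}
\end{array} \tag{30}$$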
We group the received packet into $K$ blocks, with each block-$k$ containing the $R = 3$ coded symbols of the $k$th input. The signal in (30) is equivalent to the superposition of the modulated signals of $C^A$ and $C^B_{(6)}$, where, except for the first two blocks, $C^B_{(6)}$ is the 6-bit right circular-shifted version of $C^B$. The symbols of the first two blocks are out of order due to the larger-than-one symbol misalignment. However, the disorder does not hinder our decoding because we can still compute the likelihoods of the first and second blocks. For example, the likelihood of the first block can be computed as follows:

$$\begin{aligned}
\gamma\left(\bar{c}^A_1 \bar{c}^B_{K-1}\right) &= \gamma\left(x^A_{1,1} x^A_{1,2} x^A_{1,3} x^B_{K-1,1} x^B_{K-1,2} x^B_{K-1,3}\right) \\
&\propto \gamma\left(x^A_{1,1}\right) \gamma\left(x^B_{K-1,3}\right) \gamma\left(x^A_{1,2} x^B_{K-1,1}\right) \gamma\left(x^A_{1,3} x^B_{K-1,2}\right).
\end{aligned} \tag{31}$$
To compute the likelihood in (31), we first compute the likelihoods in the second line. We can obtain $\gamma(x^A_{1,1})$, $\gamma(x^B_{K-1,3})$, $\gamma(x^A_{1,2} x^B_{K-1,1})$, and $\gamma(x^A_{1,3} x^B_{K-1,2})$ from the symbol-realignment layer as shown in Fig. 6. After normalization, we get the likelihood of block-1, $\gamma(\bar{c}^A_1 \bar{c}^B_{K-1})$. We can compute the likelihood of block-2 in the same manner. The likelihood of block-$k$ when $k > \tau_I$ can be computed using (16).
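The interleave, wrap-back, and deinterleave steps of this example can be traced with a small Python sketch that moves symbol labels instead of signal samples. This is only an illustration under the stated example settings; the function names `interleave`, `deinterleave`, and `relay_alignment` are hypothetical.

```python
def interleave(coded, rows):
    """Block interleaver: fill a rows x K array column-wise, read it row-wise."""
    k = len(coded) // rows
    return [coded[c * rows + r] for r in range(rows) for c in range(k)]


def deinterleave(seq, rows):
    """Inverse of interleave: fill row-wise, read column-wise."""
    k = len(seq) // rows
    return [seq[r * k + c] for c in range(k) for r in range(rows)]


def relay_alignment(k_blocks, rows, tau_i):
    """Return, in node A's original order, which B symbol overlaps each A symbol.

    Symbols are labelled (node, k, j) for the j-th coded symbol of input k.
    Node B arrives tau_i symbols late; wrapping the nonoverlapped tail back to
    the head amounts to a right circular shift of B's interleaved sequence.
    """
    a = [("A", k, j) for k in range(1, k_blocks + 1) for j in range(1, rows + 1)]
    b = [("B", k, j) for k in range(1, k_blocks + 1) for j in range(1, rows + 1)]
    ta, tb = interleave(a, rows), interleave(b, rows)
    tb_wrapped = tb[-tau_i:] + tb[:-tau_i]
    return list(zip(deinterleave(ta, rows), deinterleave(tb_wrapped, rows)))


# With K = 6, R = 3 and tau_I = 2, block-1 pairs A's symbols (1,1), (1,2), (1,3)
# with B's symbols (5,3), (5,1), (5,2), matching the pattern of (30) for K - 1 = 5.
for pair in relay_alignment(6, 3, 2)[:3]:
    print(pair)
```

From block-3 onward every symbol of A's block $k$ overlaps the same-position symbol of B's block $k - \tau_I$, which is the block-level circular shift that the upper layer relies on.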
We then pass the likelihoods of the $K$ blocks to the upper layer, the Jt-CNC decoder, as the message $\gamma(\bar{c}_k)$. In Appendix I we prove that if a convolutional code $C$ has the same initial state and terminal state, then $C_{(\tau_I R)}$, the $\tau_I R$-bit circular-shifted version of $C$, can be decoded to $U_{(\tau_I)}$. To ensure that the initial state equals the terminal state, we use different techniques for different kinds of convolutional codes, as shown in Table I.
Fig. 7. BER performances of Jt-CNC, XOR-CD Viterbi (XOR-CDV), and full-state Viterbi (FSV) algorithms for PNC. The channel codes are (5, 7) and (13, 15, 17) convolutional codes. We use BPSK modulation and assume AWGN channel.

As a result, in the presence of integral symbol misalignment, our XOR bit-optimal decoding algorithm will output $U^R = U^A \oplus U^B_{(\tau_I)}$. However, node A (B) can still decode the information of node B (A). In the downlink phase, the relay broadcasts the XOR message $U^R$ together with the value of $\tau_I$ to both end nodes. Node A can first XOR $U^R$ with its own packet to obtain $U^B_{(\tau_I)}$, then left shift it by $\tau_I$ bits to restore $U^B$. Node B can first right shift its own packet to obtain $U^B_{(\tau_I)}$ and then XOR it with $U^R$ to obtain $U^A$.
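The downlink recovery at the two end nodes can be sketched in a few lines of Python. Packets are modelled as lists of bits and the shifts are circular, as in the text; the helper names `circ_right`, `circ_left`, and `xor` are hypothetical.

```python
def circ_right(bits, k):
    """Right circular shift of a bit list by k positions."""
    k %= len(bits)
    return bits[-k:] + bits[:-k]


def circ_left(bits, k):
    """Left circular shift of a bit list by k positions."""
    k %= len(bits)
    return bits[k:] + bits[:k]


def xor(a, b):
    """Bitwise XOR of two equal-length bit lists."""
    return [x ^ y for x, y in zip(a, b)]


tau_i = 2
u_a = [1, 0, 1, 1, 0, 0, 1, 0]
u_b = [0, 1, 1, 0, 1, 0, 0, 1]

# Relay output under integral misalignment: U^R = U^A xor U^B_(tau_I).
u_r = xor(u_a, circ_right(u_b, tau_i))

# Node A: XOR with its own packet, then undo the shift.
recovered_b = circ_left(xor(u_r, u_a), tau_i)
# Node B: shift its own packet the same way, then XOR.
recovered_a = xor(u_r, circ_right(u_b, tau_i))
```

Both nodes need only $\tau_I$ as side information, which is why the relay broadcasts it together with $U^R$.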
VI. NUMERICAL RESULTS
We evaluate the performance of the proposed PNC decoding framework under the AWGN channel by extensive simulations and analysis. First, we compare the BER performances of Jt-CNC, XOR-CD Viterbi, and full-state Viterbi algorithms in synchronous PNC under both AWGN channel and Rayleigh fading channel. Second, we investigate the effect of phase offset on our Jt-CNC decoder. Third, we present the performance of the three algorithms in the presence of symbol asynchrony. Furthermore, we implement the three algorithms in a real PNC prototype built on the USRP software radio platform, and test them in an indoor environment.
A. BER Performance Comparison and Analysis
1) AWGN Channel: We compare the BER performances of Jt-CNC, XOR-CD Viterbi (XOR-CDV), and full-state Viterbi (FSV) in synchronous PNC under AWGN channel. The XOR-CD Viterbi algorithm and full-state Viterbi algorithm were introduced in Section II and elaborated in Appendix III. In the simulations, we adopt convolutional codes of two different code rates: the rate-1/2 (5, 7) code and the rate-1/3 (13, 15, 17) code. We consider BPSK modulation, assuming AWGN channel.
We plot the BER curve of the full-state Viterbi algorithm as a benchmark for the reduced-state Viterbi algorithm in [13]. In our attempt to replicate the reduced-state Viterbi algorithm, we could not get the same simulation results as in [13] even though we followed the exact specification described in the paper.² Our simulation results are somewhat better than those presented in [13]. To avoid misrepresenting their results, here we just compare the results of full-state Viterbi with Jt-CNC.

² We believe that there are errors in equation (12) and Fig. 3 in [13]. We suspect that in [13], the SNR was not normalized correctly. Our attempt to contact the authors of [13] by email received no reply.

Fig. 8. Frame error rate (FER) performances of Jt-CNC, XOR-CD Viterbi (XOR-CDV), and full-state Viterbi (FSV) algorithms for synchronous PNC. The channel code used is (5, 7) convolutional code. We use BPSK modulation and assume AWGN channel.
As shown in Fig. 7, Jt-CNC has slightly better BER performance than FSV. In Section II-A and also in Appendix III-B, we explained that FSV is an approximation to the XOR packet-optimal decoding algorithm based on the log-max approximation. The approximation is shown in equation (III.51). During the simplification in (III.51), some possible combinations of $\{U^A, U^B\}$ yielding the same XOR packet $U^R$ are omitted, so FSV is not strictly an XOR packet-optimal decoding scheme, but an approximation to it. As such, there is no guarantee that it will outperform Jt-CNC even if the XOR-packet error rate is the performance metric. Meanwhile, when the XOR-bit error rate is the performance metric (as shown in Fig. 7), Jt-CNC will be better than FSV, since Jt-CNC targets bit optimality and is an exact bit-optimal algorithm for synchronous PNC. In [13], a performance gap of 2 dB was observed between the reduced-state Viterbi and the full-state Viterbi. If the gap between full-state Viterbi and reduced-state Viterbi is 2 dB, then the gap between Jt-CNC and reduced-state Viterbi is at least 2 dB.

Fig. 7 also shows that Jt-CNC outperforms XOR-CD Viterbi by 2 dB for both rate-1/2 and rate-1/3 convolutional codes. As described previously, XOR-CD loses information in the XOR mapping, hence this 2 dB gap is as expected.
2) Frame Error Rate: We compare the frame error rate (FER)
performances of Jt-CNC, XOR-CD Viterbi and full-state Viterbi
in synchronous PNC. As discussed in Section IV, Jt-CNC is
optimal in terms of BER, but may not be optimal in terms of
FER. As shown in Fig. 8, the FER performances of Jt-CNC
and FSV are quite close, and are 1 dB better than XOR-CDV.
We have also investigated the FER performances of the three
decoders in asynchronous PNC. The relative performance gaps
among the decoders as in synchronous PNC are also observed.
Furthermore, in terms of the dB gaps between the decoders,
there is no substantial difference between BER and FER results.
Henceforth, we will only present the BER results.
3) Fading Channel: We compare the BER performances of Jt-CNC, XOR-CDV, and FSV in synchronous PNC under fading channel. In the simulation, we assume block Rayleigh fading and white noise. As shown in Fig. 9, compared with the BER performances under AWGN channel, all three decoding algorithms experience 3 dB degradation under fading channel. The degradation is caused by the phase offset and unbalanced channel coefficients of the fading channel.

Fig. 9. BER performances of Jt-CNC, XOR-CD Viterbi (XOR-CDV), and full-state Viterbi (FSV) algorithms for PNC under fading channel. The channel code used is (5, 7) convolutional code. We use BPSK modulation and assume block Rayleigh fading channel.

4) BER Analysis: We now analyze the bit error rate (BER) of our Jt-CNC decoder under AWGN channel. It is difficult to obtain the closed-form expression of BER for the Jt-CNC decoder, which uses the BCJR decoding algorithm. However, the BER of the FSV (full-state Viterbi) algorithm can be a good approximation for the Jt-CNC algorithm for the following two reasons: first, the simulation in Section VI-A1 shows that the BER performances of Jt-CNC and FSV are quite close; second, Jt-CNC and FSV should perform nearly the same in the high SNR regime. Next we derive the approximative BER expression for Jt-CNC and FSV with BPSK modulation.

Let $BER_{XOR}$ denote the BER of the XORed source packet $U^R$, and $BER_{AB}$ denote the BER of the joint source packet $\{U^A, U^B\}$ of the two end nodes. It can be easily proved that $BER_{XOR} < BER_{AB}$, since an error of $U^R$ implies an error of $\{U^A, U^B\}$ but not vice versa (e.g., if $U^A$ and $U^B$ have one bit error in the same position, their XOR is still correct). There is no simple way to derive the closed-form expression of $BER_{AB}$, but we may approximate (upper bound) it using the union bound [27] as

$$BER_{AB} \approx \sum_{k=d}^{\infty} c_k P_k. \tag{32}$$

In (32), $d$ is the free distance of the FSV decoder, and $c_k$ is the sum of the numbers of bit errors over all paths of distance $k$ from the correct path. We compute $c_k$ by taking the derivative of the FSV decoder's transfer function [27]. $P_k$ is the pairwise error probability that an incorrect path with distance $k$ from the correct path is decoded. To compute $P_k$, let us consider an incorrect path $\{\dot{X}^A, \dot{X}^B\}$ merging with the correct path $\{X^A, X^B\}$ at a particular step, which has $k$ incorrect symbols and the remaining symbols correct. Such a path may be incorrectly chosen only if it has a smaller distance metric (as defined in (10)) than the correct path, i.e.,

$$\mathcal{M}\left(\dot{X}^A, \dot{X}^B\right) < \mathcal{M}\left(X^A, X^B\right). \tag{33}$$

Since the path $\{\dot{X}^A, \dot{X}^B\}$ and the path $\{X^A, X^B\}$ differ in exactly $k$ symbols, the pairwise error probability is

$$\begin{aligned}
P_k &= \Pr\left\{ \mathcal{M}\left(X^A, X^B\right) - \mathcal{M}\left(\dot{X}^A, \dot{X}^B\right) > 0 \right\} \\
&= \Pr\left\{ \sum_{i=1}^{k} \frac{\left| y^R_i - x^A_i - x^B_i \right|^2}{2\sigma^2} - \sum_{i=1}^{k} \frac{\left| y^R_i - \dot{x}^A_i - \dot{x}^B_i \right|^2}{2\sigma^2} > 0 \right\} \\
&= \Pr\left\{ \sum_{i=1}^{k} \left[ 2y^R_i\left(\dot{x}^A_i + \dot{x}^B_i - x^A_i - x^B_i\right) + \left(x^A_i + x^B_i\right)^2 - \left(\dot{x}^A_i + \dot{x}^B_i\right)^2 \right] > 0 \right\}.
\end{aligned} \tag{34}$$

Without loss of generality, we assume that the correct path $\{X^A, X^B\}$ corresponds to the all-zero source packet [28] (hence $x^A_i = x^B_i = +1$, $\forall i$), and that only one symbol error happens for each overlapped symbol $y^R_i$ (hence $\dot{x}^A_i = +1$, $\dot{x}^B_i = -1$ or $\dot{x}^A_i = -1$, $\dot{x}^B_i = +1$). Therefore

$$P_k = \Pr\left\{ \sum_{i=1}^{k} \left[ 2y^R_i (0 - 2) + 4 \right] > 0 \right\} = \Pr\left\{ \sum_{i=1}^{k} y^R_i < k \right\}. \tag{35}$$
Since the $y^R_i$ are independent Gaussian random variables of variance $\sigma^2$ and mean $(x^A_i + x^B_i)$, where $x^A_i$ and $x^B_i$ are the symbols actually transmitted by the end nodes, the sum $Z = \sum_{i=1}^{k} y^R_i$ is also Gaussian with mean $2k$ and variance $k\sigma^2$; hence

$$P_k = \Pr\{Z < k\} = 1 - Q\left(\frac{k - 2k}{\sqrt{k}\,\sigma}\right) = Q\left(\frac{\sqrt{k}}{\sigma}\right). \tag{36}$$

Consequently,

$$BER_{AB} \approx \sum_{k=d}^{\infty} c_k\, Q\left(\frac{\sqrt{k}}{\sigma}\right). \tag{37}$$
To give a concrete example, let us assume both end nodes use (5, 7) convolutional codes as in Section VI-A1. As illustrated in Section III-B, the trellis of the FSV decoder is the combination of node A's and node B's trellises. Therefore, the FSV decoder has a generator vector of

$$\begin{bmatrix} 5, 7, 0, 0 \\ 0, 0, 5, 7 \end{bmatrix}.$$

We compute the free distance $d$ and the coefficients $c_k$ of this FSV decoder using the method proposed in [29]. We get that $d = 5$ and substitute the values of $c_k$ into (37) to obtain

$$\begin{aligned}
BER_{AB} &\approx 2P_5 + 8P_6 + 24P_7 + 64P_8 + 160P_9 + \ldots \\
&= 2Q\left(\frac{\sqrt{5}}{\sigma}\right) + 8Q\left(\frac{\sqrt{6}}{\sigma}\right) + 24Q\left(\frac{\sqrt{7}}{\sigma}\right) + 64Q\left(\frac{\sqrt{8}}{\sigma}\right) + 160Q\left(\frac{\sqrt{9}}{\sigma}\right) + \ldots
\end{aligned} \tag{38}$$
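The truncated sum in (38) is straightforward to evaluate numerically. The sketch below uses only the five coefficients listed in (38) (terms beyond $k = 9$ are not given here and are therefore omitted) and takes the noise standard deviation $\sigma$ directly as a parameter, since the mapping from $E_b/N_0$ to $\sigma$ is not restated in this section. The names `q_func` and `ber_union_bound` are hypothetical.

```python
import math

# Coefficients c_k for k = 5..9 as listed in (38).
COEFFS = {5: 2, 6: 8, 7: 24, 8: 64, 9: 160}


def q_func(x):
    """Gaussian tail probability Q(x) = Pr{N(0, 1) > x}."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def ber_union_bound(sigma, coeffs=COEFFS):
    """Truncated approximation of (37): sum of c_k * Q(sqrt(k) / sigma)."""
    return sum(c * q_func(math.sqrt(k) / sigma) for k, c in coeffs.items())


# The approximation shrinks quickly as the noise standard deviation decreases.
for sigma in (0.8, 0.6, 0.4):
    print(sigma, ber_union_bound(sigma))
```

At large $\sigma$ the truncated sum can exceed one, which is the usual looseness of a union bound at low SNR; the expression is meaningful mainly in the high-SNR regime.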
We omit the higher order terms (when $k > 12$ the value of $P_k$ is negligible) in (38) and plot the approximative BER curve together with the simulation results from Section VI-A1. As shown in Fig. 10, the BER curves of Jt-CNC and FSV are close to the derived approximative BER curve in (38), and approach it in the high-SNR regime.

Fig. 10. The approximative BER curve derived in (38) and the simulated BER curves of Jt-CNC and full-state Viterbi (FSV) algorithms. The channel code used is (5, 7) convolutional code. We use BPSK modulation and assume AWGN channel.

Fig. 11. Effects of phase offset on Jt-CNC, XOR-CD Viterbi (XOR-CDV), and full-state Viterbi (FSV): (a) without random-phase precoding; (b) with random-phase precoding. QPSK modulation and the (13, 15, 17) convolutional code are used in the simulation. We assume the symbols are aligned and the relative phase offset is π/4. In (a), both nodes transmit their signals directly; in (b), node B precodes its transmit signal with a pseudo-random phase sequence.
B. Effects of Phase Offset
We next evaluate the effect of phase offset on Jt-CNC
assuming QPSK modulation (higher order QAM can also be
used)—note that phase offset does not present a challenge
to BPSK systems (see [4], [5]). First, we compare the BER
performances of the aforementioned three decoding algorithms
with phase offset φ=0 (phase synchronous) and φ=π/4 (worst
case for QPSK [5]).
As shown in Fig. 11a, when the phase offset is π/4, the
BER performances of Jt-CNC, FSV, and XOR-CDV degrade
by 3 dB, 3 dB, and 5 dB, respectively. The severe phase penalty
is due to the poor confidence of the messages as calculated in
(16) when the phase offset is π/4.
One method to improve the confidence is to make the phase offset random so that the symbols with small phase offset can help the symbols with large phase offset during the BP process. To improve our system's resilience against phase offset, we adopt random-phase precoding at the transmitter of one end node. Specifically, node B rotates the phase of its transmitted signal with a pseudo-random phase sequence $\Phi^B = (\phi^B_1, \cdots, \phi^B_N)$, where $\phi^B_n$ is randomly chosen from zero to π/4. We assume that this pseudo-random phase sequence is known at the relay so that it can incorporate this knowledge into the decoding process. As shown in Fig. 11b, with the random-phase precoding algorithm, the phase penalty is reduced to 1 dB, 1 dB, and 3 dB compared with the synchronous case for Jt-CNC, FSV, and XOR-CDV, respectively.
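A minimal Python sketch of random-phase precoding is given below. It rests on several assumptions that the text does not fix: the phases are drawn uniformly from $[0, \pi/4]$, node B and the relay share a seed so that the relay can regenerate the same sequence, and the relay uses the per-symbol effective offset $\phi + \phi^B_n$ when forming its likelihoods. The names `phase_sequence`, `precode`, and `effective_offsets` are hypothetical.

```python
import cmath
import math
import random


def phase_sequence(seed, n):
    """Pseudo-random phases in [0, pi/4] (assumption: uniform, seeded)."""
    rng = random.Random(seed)
    return [rng.uniform(0.0, math.pi / 4) for _ in range(n)]


def precode(symbols, phases):
    """Node B rotates each transmitted symbol by its pseudo-random phase."""
    return [x * cmath.exp(1j * p) for x, p in zip(symbols, phases)]


def effective_offsets(channel_phase, phases):
    """Per-symbol phase offset of B relative to A as seen by the relay."""
    return [channel_phase + p for p in phases]


SEED = 2024  # shared by node B and the relay (assumption)
qpsk_b = [(1 + 1j) / math.sqrt(2), (1 - 1j) / math.sqrt(2), (-1 + 1j) / math.sqrt(2)]

tx_b = precode(qpsk_b, phase_sequence(SEED, len(qpsk_b)))
relay_phases = effective_offsets(math.pi / 4, phase_sequence(SEED, len(qpsk_b)))
```

With a fixed channel offset of π/4, the effective offsets seen by the relay now vary from symbol to symbol instead of sitting at the worst case for the whole packet.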
C. Effects of Symbol Misalignment
A major advantage of the proposed decoding framework is
that it can handle general symbol misalignment with different
decoding algorithms. We evaluate the performance of Jt-CNC
under varying degrees of symbol misalignment and phase offset
(without random-phase precoding). In the simulation, both end
nodes transmit 1000-bit source packets (corresponding to 1500
QPSK symbols for channel code rate of 1/3).
From Fig. 12 we see that although the fractional symbol misalignment (the curve with τ = 0.5, φ = 0) degrades the BER performance by 0.5 dB, the integral symbol misalignment (the curve with τ = 100.5, φ = 0) improves the BER performance slightly. That is because when there are integral symbol misalignments, the head and tail of the signals are non-overlapping and thus yield cleaner information without mutual interference.

Fig. 12. BER performance of Jt-CNC decoder under general symbol misalignment, with (13, 15, 17) convolutional code and QPSK modulation.

Fig. 13. BER performances of Jt-CNC, XOR-CD Viterbi (XOR-CDV), and full-state Viterbi (FSV) decoding algorithms under general symbol misalignment (δ = 9.5, φ = 0), with (13, 15, 17) convolutional code and QPSK modulation.

Fig. 14. BER performances of Jt-CNC, FSV, and XOR-CDV in an indoor environment. We tested the three algorithms on a practical OFDM PNC prototype implemented on USRP N210. The PNC prototype adopts BPSK modulation and (5, 7) convolutional code.
We next compare the BER performances of the Jt-CNC, XOR-CD Viterbi, and full-state Viterbi algorithms under larger-than-one symbol misalignment. As shown in Fig. 13, when the symbol misalignment is δ = 9.5 and the phase offset is φ = 0, the BER performances degrade by 0.5 dB, 0.5 dB, and 3 dB for Jt-CNC, FSV, and XOR-CDV, respectively. Within the framework, both Jt-CNC and full-state Viterbi are robust to symbol misalignment, while XOR-CD Viterbi is quite sensitive to symbol misalignment.
D. Software Radio Experiment
To evaluate the proposed algorithm in a real communication
system, we implemented an OFDM PNC prototype built on
USRP N210. The three decoding algorithms are implemented
in the prototype. The PNC prototype adopts BPSK modulation
with 2 MHz bandwidth and 2.58 GHz carrier frequency. We
used the (5, 7) convolutional code and followed the frame
format design in [8]. We conducted our experiments in an
indoor office environment and evaluated the BER performances
of Jt-CNC, XOR-CDV, and full-state Viterbi algorithms under
different SNRs. In the experiment, we balanced the powers of
the end nodes and let both nodes transmit 100 packets to the
relay. Each packet consisted of 204 OFDM symbols (4 symbols
of preambles and 200 symbols of data). The PC used for this
experiment has 32 GB RAM and an Intel Core i7 processor.
The typical processing times to decode one packet for the three
decoding algorithms are: 0.5255 s for XOR-CDV; 2.0596 s for
FSV; 1.4946 s for Jt-CNC.
As shown in Fig. 14, the BER performances of Jt-CNC and
full-state Viterbi are nearly the same in real indoor environment.
Compared with the simulation results in Fig. 7, the BER performances of all three algorithms in the real system are degraded by 5 dB due to imperfections of the real system, such as imperfect channel estimation, carrier-frequency offsets, and frequency-selective channels.
VII. CONCLUSION
We have proposed a three-layer decoding framework for asynchronous convolutional-coded PNC systems. This framework can deal with general (integral plus fractional) symbol misalignment in convolutional-coded PNC systems. Furthermore, we design a Jt-CNC algorithm to achieve the BER-optimal decoding of convolutional code in synchronous PNC. Building on the study of convolutional codes, we further generalize the Jt-CNC decoding algorithm to all cyclic codes (in Appendix II), providing a new angle to counter symbol asynchrony. Simulation shows that our Jt-CNC algorithm outperforms the previous decoding algorithms (XOR-CD, reduced-state Viterbi) by 2 dB. For both phase-asynchronous and symbol-asynchronous PNC, our Jt-CNC algorithm outperforms the two previously proposed algorithms. Importantly, we have implemented the proposed Jt-CNC decoder in a real PNC prototype built on software radio platform. Our experiment shows that the Jt-CNC decoder works well in practice.

Fig. 15. Tanner graph of a convolutional code that has the same initial state $S_0$ and terminal state $S_K$. We merge the initial state and terminal state, hence the Tanner graph in Fig. 2 becomes a ring.
APPENDIX I
CORRECT DECODING OF CIRCULAR-SHIFTED SOURCE BITS
Theorem 1: Let C denote the codeword of a convolutional
code whose code rate is r = L/R, where L and R are
positive integers, and let U denote the corresponding source packet.
If the initial state and terminal state of this
convolutional code encoder are the same, then decoding
based on C(kR) , the kR-bit right circular-shifted version of C,
yields the kL-bit right circular-shifted version of U .
Proof: The encoding and decoding process of the convolutional code can be represented by the Tanner graph in Fig. 2.
Since the code has the same initial and terminal state, we can
merge the initial state and terminal state of the Tanner graph
as shown in Fig. 15. For a general convolutional code with
code rate L/R, the source message ūk is an L-bit tuple and
the coded message c̄k is an R-bit tuple.
Let C(kR) = (c̄K−k+1 , c̄K−k+2 , · · · , c̄K , c̄1 , · · · , c̄K−k ) be
the kR-bit right circular-shifted version of codeword C. To
decode C(kR) , the decoding algorithm starts with the first
tuple c̄K−k+1 and ends with the last tuple c̄K−k . Because
the Tanner graph has a ring structure, the decoder output is
U(kL) = (ūK−k+1 , ūK−k+2 , · · · , ūK , ū1 , · · · , ūK−k ), which is
the kL-bit right circular-shifted version of U .
Remark 1: Both zero-tail convolutional codes and tail-biting
convolutional codes have the property in Theorem 1, because
their initial state and terminal state are the same. For a recursive convolutional code, we can append tail bits to the input
packet to force the terminal state of the encoder to the zero state.
Recursive convolutional codes can then also be used with the
proposed Jt-CNC decoder.
APPENDIX II
HANDLE SYMBOL MISALIGNMENT IN TAIL-BITING
CONVOLUTIONAL CODES AND CYCLIC CODES
Theorem 2: For a tail-biting convolutional code with code
rate 1/R, R ∈ N+ , let U denote the source packet of the
channel-coded packet C, and C(kR) denote the kR-bit right
circular-shifted version of C. The source packet corresponding
to C(kR) is U(k) , the k-bit right circular-shifted version of U .
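Proof: Let m denote the memory length of this convolutional encoder. The generator matrix of the convolutional code
is
\[
G = \begin{bmatrix}
g_0 & g_1 & g_2 & \cdots & g_m & & & \\
 & g_0 & g_1 & \cdots & g_{m-1} & g_m & & \\
 & & \ddots & & & & \ddots & \\
 & & & g_0 & g_1 & g_2 & \cdots & g_m \\
g_m & & & & g_0 & g_1 & \cdots & g_{m-1} \\
g_{m-1} & g_m & & & & g_0 & \cdots & g_{m-2} \\
\vdots & & \ddots & & & & \ddots & \vdots \\
g_1 & g_2 & \cdots & g_m & & & & g_0
\end{bmatrix},
\tag{II.39}
\]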
where $g_b = [g_0\ g_1\ \cdots\ g_m]$ is the basis generator matrix of
the convolutional code; each entry $g_i$ is an R-bit vector
\[
g_i = \left[ g_i^{(1)}\ g_i^{(2)}\ \cdots\ g_i^{(R)} \right],
\tag{II.40}
\]
where $g_i^{(r)}$ is equal to 1 or 0, corresponding to whether the
ith stage of the shift register contributes (connects) to the rth
output. Therefore, the basis generator matrix $g_b$ can be regarded
as the “impulse response” of the convolutional encoder. For
example, the basis generator matrix of the (5, 7) convolutional code
shown in Fig. 16 is [11 01 11]. The encoding process is simply
\[
C = U G.
\tag{II.41}
\]
The right circular-shifted codeword can be represented by
\[
C_{(kR)} = U G_{(k)},
\tag{II.42}
\]
where $G_{(k)}$ is obtained by right circular-shifting the columns of G by kR
positions. Since G is a block-circulant matrix (each row is the previous row
circularly shifted right by R columns), we have
\[
C_{(kR)} = U G_{(k)} = U_{(k)} G.
\tag{II.43}
\]
Therefore the source packet of $C_{(kR)}$ is $U_{(k)}$, the k-bit right circular-shifted version of U.
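The identity in Theorem 2 can be checked numerically. The following is a minimal sketch, assuming the (5, 7) code with basis generator matrix [11 01 11] and an arbitrarily chosen packet length K = 8; the helper names `tail_biting_generator` and `encode` are hypothetical and are not taken from the authors' implementation.

```python
import numpy as np

def tail_biting_generator(gb, K):
    """Build the K x (K*R) block-circulant generator matrix of a
    rate-1/R tail-biting convolutional code.

    gb is the basis generator matrix given as m+1 blocks of R bits,
    e.g. [(1, 1), (0, 1), (1, 1)] for the (5, 7) code.
    """
    R = len(gb[0])
    G = np.zeros((K, K * R), dtype=np.uint8)
    for row in range(K):
        for i, block in enumerate(gb):
            col = ((row + i) % K) * R   # wrap around: tail-biting
            G[row, col:col + R] = block
    return G

def encode(U, G):
    """C = U G over GF(2)."""
    return (U @ G) % 2

gb = [(1, 1), (0, 1), (1, 1)]           # (5, 7) code, R = 2, m = 2
K, R = 8, 2                             # K is an illustrative choice
G = tail_biting_generator(gb, K)

rng = np.random.default_rng(0)
U = rng.integers(0, 2, size=K, dtype=np.uint8)
C = encode(U, G)

# Shifting the codeword right by k*R bits is the same as encoding the
# source packet shifted right by k bits, for every k.
for k in range(K):
    assert np.array_equal(np.roll(C, k * R), encode(np.roll(U, k), G))
```

The loop over k exercises every shift, including those for which the shifted codeword wraps around the packet boundary.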
Remark 2: Theorem 2 is also valid for the tail-biting convolutional code with a general code rate L/R, but the resulting
source packet will be $U_{(kL)}$, the kL-bit right circular-shifted
version of U. The proof is the same except that the entry of
the basis generator matrix $g_b$ is an L × R matrix:
\[
g_i = \begin{bmatrix}
g_{1,i}^{(1)} & g_{1,i}^{(2)} & \cdots & g_{1,i}^{(R)} \\
g_{2,i}^{(1)} & g_{2,i}^{(2)} & \cdots & g_{2,i}^{(R)} \\
\vdots & \vdots & & \vdots \\
g_{L,i}^{(1)} & g_{L,i}^{(2)} & \cdots & g_{L,i}^{(R)}
\end{bmatrix},
\tag{II.44}
\]
where $g_{l,i}^{(r)}$ is equal to 1 or 0, depending on whether the ith stage
of the shift register for the lth input contributes (connects) to
the rth output.
Fig. 16. Convolutional encoder of the (5, 7) convolutional code. $u_k$ is the input
source bit at time k; $c_{k,1}$ and $c_{k,2}$ are the first and second output bits of the
encoder at time k, respectively.

Theorem 2 also indicates that tail-biting convolutional codes
are quasi-cyclic with period R. Inspired by the quasi-cyclic
property of tail-biting convolutional codes [19], [30], [31],
we attempted to generalize the results to general quasi-cyclic
codes (as opposed to just convolutional codes). Unfortunately,
a general quasi-cyclic code³ may not have the property in
Theorem 2. However, building on the insight obtained from our
study of convolutional-coded PNC, we propose an algorithm
that can deal with general symbol misalignment when cyclic
codes are used (as opposed to quasi-cyclic codes). That is,
our asynchronous PNC decoding framework can incorporate
not just convolutional codes, but all cyclic codes. Since cyclic
codes have a period of one, we do not need the interleaver/deinterleaver here.

³ A recent paper [32] investigates the use of quasi-cyclic LDPC codes to
deal with the symbol misalignment in PNC without requiring the validity of
Theorem 2.

Let $\mathcal{C}(\cdot)$ and $\mathcal{C}^{-1}(\cdot)$ denote the encoding function and decoding function of a particular linear cyclic code (e.g., BCH code),
respectively. Then the encoding process in the end nodes is
$C^i = \mathcal{C}(U^i)$, $i \in \{A, B\}$. To ease presentation, we assume BPSK
modulation and a symbol misalignment of $\tau_I$. The received
signal at the relay is the overlap of the following two signals:
\[
\begin{array}{cccccccccc}
x^A_1 & \cdots & x^A_{\tau_I} & x^A_{\tau_I+1} & x^A_{\tau_I+2} & \cdots & x^A_N & & & \\
 & & & x^B_1 & x^B_2 & \cdots & x^B_{N-\tau_I} & x^B_{N-\tau_I+1} & \cdots & x^B_N
\end{array}
\tag{II.45}
\]
Upon receiving the overlapped signal, the relay first aligns
the last $\tau_I$ symbols with the first $\tau_I$ symbols to obtain a new
overlapped signal
\[
\begin{array}{cccccc}
x^A_1 & \cdots & x^A_{\tau_I} & x^A_{\tau_I+1} & \cdots & x^A_N \\
x^B_{N-\tau_I+1} & \cdots & x^B_N & x^B_1 & \cdots & x^B_{N-\tau_I}
\end{array}
\tag{II.46}
\]
The result in (II.46) is actually the signal $X^A + X^B_{(\tau_I)}$,
where $X^B_{(\tau_I)}$ is the $\tau_I$-symbol right circular-shifted version of
node B’s signal. Then the relay can map the signal of (II.46)
to $C^A \oplus C^B_{(\tau_I)}$, where $C^B_{(\tau_I)}$ is the $\tau_I$-bit right circular-shifted
version of $C^B$. Note that $C^B_{(\tau_I)}$ is also a valid codeword due
to the property of cyclic codes. We assume the source packet
corresponding to $C^B_{(\tau_I)}$ is $\tilde{U}^B$ such that $\tilde{U}^B = \mathcal{C}^{-1}(C^B_{(\tau_I)})$.
Because the XOR operator preserves the linearity of codes, the
relay first decodes the XORed packet by
\[
U^R = \mathcal{C}^{-1}\left(C^A \oplus C^B_{(\tau_I)}\right)
    = \mathcal{C}^{-1}\left(C^A\right) \oplus \mathcal{C}^{-1}\left(C^B_{(\tau_I)}\right)
    = U^A \oplus \tilde{U}^B,
\tag{II.47}
\]
and then broadcasts this packet to both the end nodes. After
decoding $U^R$, node A first XORs $U^R$ with its own information
$U^A$ to obtain $\tilde{U}^B$; then node A re-encodes $\tilde{U}^B$ to obtain
$C^B_{(\tau_I)} = \mathcal{C}(\tilde{U}^B)$, and left circular-shifts $C^B_{(\tau_I)}$ by $\tau_I$ bits to
obtain $C^B$; finally, from $C^B$ node A can decode $U^B$. Node B first right
circular-shifts its codeword $C^B$ to produce $C^B_{(\tau_I)}$ and decodes $C^B_{(\tau_I)}$ to
obtain $\tilde{U}^B$; then node B XORs $U^R$ with $\tilde{U}^B$ to obtain $U^A$.
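The cyclic-code procedure of this appendix (realignment at the relay, decoding of the XORed packet, and recovery at nodes A and B) can be sketched end to end on a toy example. The sketch below makes several assumptions that are not in the text: it uses the (7, 4) cyclic Hamming code with g(x) = 1 + x + x^3 in place of a BCH code, a noise-free channel with unit gains, a lookup table as a stand-in for the decoding function, and the BPSK mapping 0 → +1, 1 → −1. All function names are hypothetical.

```python
import itertools
import numpy as np

N, K_MSG = 7, 4                    # (7, 4) cyclic Hamming code (illustrative choice)
G_POLY = np.array([1, 1, 0, 1])    # g(x) = 1 + x + x^3, a divisor of x^7 + 1

def encode(u):
    """Cyclic encoding c(x) = u(x) g(x) over GF(2)."""
    return np.convolve(u, G_POLY) % 2

# Noise-free stand-in for the decoding function: codeword -> source bits.
CODEBOOK = {tuple(int(b) for b in encode(np.array(u))): np.array(u)
            for u in itertools.product((0, 1), repeat=K_MSG)}

def decode(c):
    return CODEBOOK[tuple(int(b) for b in c)]

def relay_realign(y, tau):
    """Fold the last tau samples onto the first tau, as in (II.45) -> (II.46)."""
    aligned = y[:N].copy()
    aligned[:tau] += y[N:]
    return aligned

rng = np.random.default_rng(0)
uA, uB = rng.integers(0, 2, K_MSG), rng.integers(0, 2, K_MSG)
cA, cB = encode(uA), encode(uB)
tau = 3                            # integral symbol misalignment

# Overlapped BPSK signal at the relay; node B arrives tau symbols late.
y = np.zeros(N + tau, dtype=int)
y[:N] += 1 - 2 * cA
y[tau:] += 1 - 2 * cB

# Relay: realign, map each symbol to an XOR bit, decode, broadcast uR.
c_xor = (relay_realign(y, tau) == 0).astype(int)  # the sum is 0 iff the two bits differ
uR = decode(c_xor)                                # uA XOR u~B, cf. (II.47)

# Node A: remove its own bits, re-encode, undo the circular shift, decode.
uB_tilde = uR ^ uA
uB_at_A = decode(np.roll(encode(uB_tilde), -tau))

# Node B: obtain u~B from its own shifted codeword, then XOR with uR.
uA_at_B = uR ^ decode(np.roll(cB, tau))

assert np.array_equal(uB_at_A, uB) and np.array_equal(uA_at_B, uA)
```

The lookup in `decode` would raise a `KeyError` if the XOR of a codeword and a shifted codeword were not itself a codeword, so the sketch also exercises the closure property that the procedure relies on.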
APPENDIX III
EXPLANATIONS OF XOR-CD AND FULL-STATE VITERBI
ALGORITHMS
In this appendix, we explain and provide interpretations for
the XOR-CD algorithm and the full-state Viterbi (FSV) algorithm.
To ease the presentation, we consider BPSK modulation and
synchronous PNC.
A. XOR-CD
As pointed out in Section II, XOR-CD refers to a two-step process: (i) symbol-by-symbol PNC mapping; (ii) channel
decoding. In the first step, the received symbol $y_n^R$ is mapped
to the XORed coded bit $c_n^R = c_n^A \oplus c_n^B$ for $n = 1, \ldots, N$. Upon
receiving the overlapped symbols, the demodulator at the relay
computes the likelihood
\[
\Pr\left(y_n^R \mid x_n^A, x_n^B\right)
  = \frac{1}{\sqrt{2\pi\sigma^2}}
    \exp\left(-\frac{\left(y_n^R - h^A x_n^A - h^B x_n^B\right)^2}{2\sigma^2}\right).
\tag{III.48}
\]
Then the likelihood in (III.48) is mapped to XORed coded
bits by
\[
\Pr\left(y_n^R \mid c_n^R\right)
  = \sum_{x_n^A, x_n^B :\, c_n^A \oplus c_n^B = c_n^R}
    \Pr\left(y_n^R \mid x_n^A, x_n^B\right).
\tag{III.49}
\]
After the mapping, we obtain $C^R = C^A \oplus C^B$, which is
fed into the channel decoder in the second step. Based on the
likelihood in (III.49), we can either make a hard decision on $c_n^R$, which
is called “hard” XOR-CD, or pass the probability directly
to the decoder, which is called “soft” XOR-CD (the variant used
in our main text).
In the second step, an ordinary point-to-point channel decoder can be used to decode the XORed source bits. Since
the convolutional code is a linear code and XOR is a linear
operator, the decoding process of $C^R$ is
\[
\mathcal{C}^{-1}(C^R) = \mathcal{C}^{-1}(C^A \oplus C^B)
  = \mathcal{C}^{-1}(C^A) \oplus \mathcal{C}^{-1}(C^B) = U^A \oplus U^B.
\tag{III.50}
\]
Two points are noteworthy: (i) not only convolutional codes but
any linear code can be used with XOR-CD; (ii) the symbols
of $X^A$ and $X^B$ must be symbol-by-symbol aligned, otherwise
(III.50) is invalid.
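The first step of XOR-CD, i.e., the mapping in (III.48) and (III.49), can be sketched as follows. This is an illustrative sketch only: it assumes real-valued channel gains, the BPSK mapping 0 → +1, 1 → −1, and a hypothetical function name `xor_likelihoods`. The channel decoder of the second step is not shown.

```python
import numpy as np

def xor_likelihoods(y, hA, hB, sigma2):
    """Return an (N, 2) array whose column b holds Pr(y_n | c_n^R = b).

    Implements (III.48) for each of the four symbol pairs and sums the
    pairs that share the same XOR bit, as in (III.49).
    """
    y = np.asarray(y, dtype=float)
    lik = np.zeros((y.size, 2))
    for cA in (0, 1):
        for cB in (0, 1):
            xA, xB = 1 - 2 * cA, 1 - 2 * cB   # assumed BPSK mapping
            p = (np.exp(-(y - hA * xA - hB * xB) ** 2 / (2 * sigma2))
                 / np.sqrt(2 * np.pi * sigma2))
            lik[:, cA ^ cB] += p
    return lik

# Three received symbols close to +2, 0 and -2 with unit channel gains.
lik = xor_likelihoods([2.0, 0.0, -2.0], hA=1.0, hB=1.0, sigma2=0.1)

hard_bits = lik.argmax(axis=1)                     # "hard" XOR-CD input
soft_probs = lik / lik.sum(axis=1, keepdims=True)  # "soft" XOR-CD input
```

Here `hard_bits` is what a hard-decision decoder would consume, while `soft_probs` (the likelihoods normalized under an assumed uniform prior on the XOR bit) is what a soft-input decoder would consume.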
B. Full-State Viterbi
In Section IV-A, we show that the complexity of finding
the ML XORed source packet is prohibitively high. The full-state
Viterbi algorithm was proposed in [13] to reduce the complexity.
Equation (9) is simplified using the log-max approximation to
\[
\hat{U}^R = \arg\max_{U^R} \log \sum_{U^A, U^B :\, U^A \oplus U^B = U^R}
              \exp\left\{-M\left(X^A, X^B\right)\right\}
          \approx \arg\min_{U^R} \min_{U^A, U^B :\, U^A \oplus U^B = U^R}
              M\left(X^A, X^B\right).
\tag{III.51}
\]
The computation of (III.51) consists of two steps. First, we
find the best pair of source packets $\hat{U}^A$ and $\hat{U}^B$ such that
\[
\left(\hat{U}^A, \hat{U}^B\right) = \arg\min_{U^A, U^B} M\left(X^A, X^B\right).
\tag{III.52}
\]
Computing (III.52) is equivalent to finding the shortest path on
the joint trellis of node A’s and node B’s encoders. The Viterbi
algorithm is a well-known algorithm for solving this problem.
Since the state space of the joint trellis is the combination
of node A’s and node B’s state spaces, we call this decoding
algorithm the “full-state” Viterbi algorithm. Second, we obtain
$\hat{U}^R$ by XORing $\hat{U}^A$ with $\hat{U}^B$ (i.e., $\hat{U}^R = \hat{U}^A \oplus \hat{U}^B$).
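The two steps above can be sketched for a small case. The sketch assumes that both end nodes use the zero-terminated (5, 7) code, that the metric M is the squared Euclidean distance between the received and hypothesized superimposed BPSK symbols, and that the channel gains are real, known, and unequal (so that the noise-free path is unique). The function names are hypothetical, and the survivor paths are stored as plain lists for readability rather than efficiency.

```python
import itertools
import numpy as np

def step(state, u):
    """One transition of the (5, 7) encoder; state = 2*s1 + s2, s1 being the newer bit."""
    s1, s2 = state >> 1, state & 1
    return (u << 1) | s1, (u ^ s2, u ^ s1 ^ s2)

def encode(bits):
    state, out = 0, []
    for u in bits:
        state, c = step(state, u)
        out.append(c)
    return np.array(out)               # shape (len(bits), 2)

def full_state_viterbi(y, hA, hB):
    """Shortest path on the joint trellis; both encoders start and end in state 0."""
    metric = {(0, 0): 0.0}             # joint state (sA, sB) -> best path metric
    paths = {(0, 0): []}               # joint state -> surviving (uA, uB) sequence
    for y_n in y:
        new_metric, new_paths = {}, {}
        for (sA, sB), m in metric.items():
            for uA, uB in itertools.product((0, 1), repeat=2):
                nA, cA = step(sA, uA)
                nB, cB = step(sB, uB)
                # Branch metric: squared distance to the hypothesized symbols.
                branch = sum((y_n[j] - hA * (1 - 2 * cA[j]) - hB * (1 - 2 * cB[j])) ** 2
                             for j in range(2))
                if m + branch < new_metric.get((nA, nB), float("inf")):
                    new_metric[(nA, nB)] = m + branch
                    new_paths[(nA, nB)] = paths[(sA, sB)] + [(uA, uB)]
        metric, paths = new_metric, new_paths
    best = paths[(0, 0)]               # zero-tail termination
    uA_hat = [a for a, _ in best]
    uB_hat = [b for _, b in best]
    return uA_hat, uB_hat, [a ^ b for a, b in best]

rng = np.random.default_rng(1)
uA = [int(b) for b in rng.integers(0, 2, 10)] + [0, 0]   # two tail zeros
uB = [int(b) for b in rng.integers(0, 2, 10)] + [0, 0]
hA, hB = 1.0, 0.8
# Noise-free superimposed symbols, so that the check below is deterministic.
y = hA * (1 - 2 * encode(uA)) + hB * (1 - 2 * encode(uB))

uA_hat, uB_hat, uR_hat = full_state_viterbi(y, hA, hB)
assert uR_hat == [a ^ b for a, b in zip(uA, uB)]
```

With two 4-state encoders the joint trellis has 16 states and four branches per state, which illustrates why the state space grows as the product of the two encoders' state spaces.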
REFERENCES
[1] S. Zhang, S. C. Liew, and P. P. Lam, “Hot topic: physical-layer network coding,” in Proceedings of Mobicom 2006. ACM, 2006, pp. 358–365.
[2] R. Ahlswede, N. Cai, S.-Y. Li, and R. W. Yeung, “Network information flow,” IEEE Trans. Inf. Theory, vol. 46, no. 4, pp. 1204–1216, 2000.
[3] S.-Y. Li, R. W. Yeung, and N. Cai, “Linear network coding,” IEEE Trans. Inf. Theory, vol. 49, no. 2, pp. 371–381, 2003.
[4] S. Zhang and S. C. Liew, “Channel coding and decoding in a relay system operated with physical-layer network coding,” IEEE J. Sel. Areas Commun., vol. 27, no. 5, pp. 788–796, 2009.
[5] L. Lu and S. C. Liew, “Asynchronous physical-layer network coding,” IEEE Trans. Wireless Commun., vol. 11, no. 2, pp. 819–831, 2012.
[6] IEEE-SA Standards Board, “Wireless LAN medium access control (MAC) and physical layer (PHY) specifications,” IEEE Std 802.11 part 11, 2003.
[7] L. Lu, T. Wang, S. C. Liew, and S. Zhang, “Implementation of physical-layer network coding,” Physical Communication, 2012.
[8] L. Lu, L. You, Q. Yang, T. Wang, M. Zhang, S. Zhang, and S. C. Liew, “Real-time implementation of physical-layer network coding,” in Proceedings of the 2nd Workshop on Software Radio Implementation Forum. ACM, 2013, pp. 71–76.
[9] F. Rossetto and M. Zorzi, “On the design of practical asynchronous physical layer network coding,” in IEEE 10th Workshop on SPAWC ’09. IEEE, 2009, pp. 469–473.
[10] S. C. Liew, S. Zhang, and L. Lu, “Physical-layer network coding: Tutorial, survey, and beyond,” Physical Communication, vol. 6, pp. 4–42, 2013.
[11] A. Viterbi, “Error bounds for convolutional codes and an asymptotically optimum decoding algorithm,” IEEE Trans. Inf. Theory, vol. 13, no. 2, pp. 260–269, 1967.
[12] L. Bahl, J. Cocke, F. Jelinek, and J. Raviv, “Optimal decoding of linear codes for minimizing symbol error rate,” IEEE Trans. Inf. Theory, vol. 20, no. 2, pp. 284–287, 1974.
[13] D. To and J. Choi, “Convolutional codes in two-way relay networks with physical-layer network coding,” IEEE Trans. Wireless Commun., vol. 9, no. 9, pp. 2724–2729, 2010.
[14] Q. Yang and S. C. Liew, “Optimal decoding of convolutional-coded physical-layer network coding,” accepted by WCNC 2014, 2014.
[15] D. Wang, S. Fu, and K. Lu, “Channel coding design to support asynchronous physical layer network coding,” in Proceedings of Globecom 2009. IEEE, 2009.
[16] X. Wu, C. Zhao, and X. You, “Joint LDPC and physical-layer network coding for asynchronous bi-directional relaying,” IEEE J. Sel. Areas Commun., vol. 31, no. 8, pp. 1446–1454, 2013.
[17] S. Vanka, S. Srinivasa, Z. Gong, P. Vizi, K. Stamatiou, and M. Haenggi, “Superposition coding strategies: Design and experimental evaluation,” IEEE Trans. Wireless Commun., vol. 11, no. 7, pp. 2628–2639, 2012.
[18] H. Jiang and P. A. Wilford, “A hierarchical modulation for upgrading digital broadcast systems,” IEEE Trans. Broadcast, vol. 51, no. 2, pp. 223–229, 2005.
[19] H. Ma and J. Wolf, “On tail biting convolutional codes,” IEEE Trans. Commun., vol. 34, no. 2, pp. 104–111, 1986.
[20] V. Namboodiri, K. Venugopal, and B. Rajan, “Physical layer network coding for two-way relaying with QAM,” IEEE Trans. Wireless Commun., vol. 12, no. 10, pp. 5074–5086, October 2013.
[21] H. J. Yang, Y. Choi, and J. Chun, “Modified high-order PAMs for binary coded physical-layer network coding,” IEEE Commun. Lett., vol. 14, no. 8, pp. 689–691, 2010.
[22] T. Koike-Akino, P. Popovski, and V. Tarokh, “Optimized constellations for two-way wireless relaying with physical network coding,” IEEE J. Sel. Areas Commun., vol. 27, no. 5, pp. 773–787, 2009.
[23] J. Pearl, Probabilistic reasoning in intelligent systems: networks of plausible inference. Morgan Kaufmann, 1988.
[24] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Trans. Inf. Theory, vol. 47, no. 2, pp. 498–519, 2001.
[25] S. Zhang, S.-C. Liew, and P. P. Lam, “On the synchronization of physical-layer network coding,” in Information Theory Workshop 2006. IEEE, 2006, pp. 404–408.
[26] L. Lu, S. C. Liew, and S. Zhang, “Optimal decoding algorithm for asynchronous physical-layer network coding,” in Proceedings of ICC 2011. IEEE, 2011.
[27] A. J. Viterbi, “Convolutional codes and their performance in communication systems,” IEEE Trans. Commun. Tech., vol. 19, no. 5, pp. 751–772, 1971.
[28] B. Sklar, Digital communications. Prentice Hall NJ, 2001, vol. 2.
[29] M. L. Cedervall and R. Johannesson, “A fast algorithm for computing distance spectrum of convolutional codes,” IEEE Trans. Inf. Theory, vol. 35, no. 6, pp. 1146–1159, 1989.
[30] M. Esmaeili, T. A. Gulliver, N. P. Secord, and S. A. Mahmoud, “A link between quasi-cyclic codes and convolutional codes,” IEEE Trans. Inf. Theory, vol. 44, no. 1, pp. 431–435, 1998.
[31] G. Solomon and H. Tilborg, “A connection between block and convolutional codes,” Journal on Applied Mathematics, vol. 37, no. 2, pp. 358–369, 1979.
[32] P.-C. Wang, Y.-C. Huang, and K. R. Narayanan, “Asynchronous compute-and-forward/integer-forcing with quasi-cyclic codes,” arXiv preprint arXiv:1312.4003, 2013.
Qing Yang received his B.Eng. degree in electronics
and information engineering from the Huazhong
University of Science and Technology, Wuhan, China,
in 2010. Since then he has been a Ph.D. student
at the Department of Information Engineering, The
Chinese University of Hong Kong. His research
interests include physical-layer network coding, multiuser MIMO and software-defined radio.
Soung Chang Liew received his S.B., S.M., E.E.,
and Ph.D. degrees from the Massachusetts Institute of
Technology. From 1984 to 1988, he was at the MIT
Laboratory for Information and Decision Systems,
where he investigated Fiber-Optic Communications
Networks. From March 1988 to July 1993, he was
at Bellcore (now Telcordia), New Jersey, where he
engaged in Broadband Network Research. He has
been a Professor at the Department of Information
Engineering, The Chinese University of Hong Kong
(CUHK), since 1993. Prof. Liew is currently the
Division Head of the Department of Information Engineering and a Co-Director
of the Institute of Network Coding at CUHK. He is also an Adjunct Professor
of Peking University and Southeast University, China.
Prof. Liew’s research interests include wireless networks, Internet protocols,
multimedia communications, and packet switch design. Prof. Liew’s research
group won the best paper awards in IEEE MASS 2004 and IEEE WLN
2004. Separately, TCP Veno, a version of TCP to improve its performance
over wireless networks proposed by Prof. Liew’s research group, has been
incorporated into a recent release of Linux OS. In addition, Prof. Liew initiated
and built the first inter-university ATM network testbed in Hong Kong in 1993.
More recently, Prof. Liew’s research group pioneered the concept of Physical-layer Network Coding (PNC). Publications of Prof. Liew can be found at
www.ie.cuhk.edu.hk/soung.