ELEC692n assignment #2

ELEC692n VLSI Signal Processing Architecture
assignment #2
(Due on Oct. 23)
1. Unfold the DFGs in Fig. 1 using unfolding factors 2 and 5.
[Figure not reproduced. Labels recovered from the figure: A, B, C, D, E; 3D, 10D, 7D, 16D.]
Fig. 1 The DFGs for problem 1.
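As a reminder of the mechanics, the standard unfolding rule replaces each edge U -> V carrying w delays by J edges U_i -> V_((i+w) mod J), each carrying floor((i+w)/J) delays. The following is a minimal Python sketch of that rule. The edge-list representation, the function name `unfold`, and the two-node example graph are illustrative assumptions; the example is not one of the DFGs in Fig. 1.

```python
def unfold(edges, J):
    """Unfold a DFG by factor J.

    edges: list of (src, dst, delays) tuples (an assumed representation).
    Returns the edges of the unfolded DFG; copy i of node U is named "U_i".
    """
    unfolded = []
    for src, dst, w in edges:
        for i in range(J):
            # Edge U_i -> V_((i + w) mod J) with floor((i + w) / J) delays.
            unfolded.append((f"{src}_{i}", f"{dst}_{(i + w) % J}", (i + w) // J))
    return unfolded


# Hypothetical example: A -> B with 0 delays, B -> A with 3 delays.
example = [("A", "B", 0), ("B", "A", 3)]
result = unfold(example, 2)
for edge in result:
    print(edge)
```

A useful sanity check on any hand-unfolded graph is that the J copies of an edge together carry exactly as many delays as the original edge.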
2. Our objective in this problem is to prove that the critical path of a J-unfolded DFG
is a monotonically non-decreasing function with respect to J. To show this, prove
that the critical path of a J-unfolded DFG is greater than or equal to the critical
path of the (J-1)-unfolded DFG.
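For experimenting with small cases before attempting the proof, the critical path can be computed as the longest computation time over all paths that contain no delays. The sketch below assumes a dictionary of node computation times and an edge list of (src, dst, delays) tuples; these names and the example values are hypothetical.

```python
from collections import defaultdict, deque


def critical_path(times, edges):
    """Longest total computation time over all zero-delay paths.

    times: dict node -> computation time; edges: list of (src, dst, delays).
    Assumes the zero-delay subgraph is acyclic (every loop has a delay).
    """
    succ = defaultdict(list)
    indeg = {v: 0 for v in times}
    for u, v, w in edges:
        if w == 0:
            succ[u].append(v)
            indeg[v] += 1
    longest = dict(times)  # longest zero-delay path ending at each node
    queue = deque(v for v in times if indeg[v] == 0)
    while queue:  # process nodes in topological order
        u = queue.popleft()
        for v in succ[u]:
            longest[v] = max(longest[v], longest[u] + times[v])
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    return max(longest.values())


# Hypothetical two-node loop: A (time 1) -> B (time 2), B -> A with 3 delays.
times = {"A": 1, "B": 2}
edges = [("A", "B", 0), ("B", "A", 3)]
cp = critical_path(times, edges)
print(cp)
```

Running this on unfolded versions of a small graph for increasing J gives concrete data points to compare against the claim, though it does not replace the proof.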
3. In this problem, we wish to show that changing the ordering of retiming and
unfolding is immaterial when we wish to minimize the critical path of the
unfolded DFG. Perform (a) and (b) for the DFG in Fig. 2:
a) Unfold the DFG with unfolding factor J=2, and then retime the unfolded DFG
to minimize the clock period.
b) Let the retiming function determined in part (a) of this problem be denoted $r'$.
Let the retiming function $r$ be
$r(A) = r'(A_0) + r'(A_1)$; $r(B) = r'(B_0) + r'(B_1)$;
$r(C) = r'(C_0) + r'(C_1)$; $r(D) = r'(D_0) + r'(D_1)$.
Show that retiming the DFG in Fig. 2 using $r$ results in a DFG that, when
unfolded, achieves the same critical path as the result of part (a) of this
problem. Do this by showing that if $D(U,V) > c$, then
$W(U,V) + r(V) - r(U) \ge J$ holds for all pairs of nodes $U$, $V$ in the retimed
version of the original DFG.
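[Figure not reproduced. Labels recovered from the figure: nodes A, B, C, D; computation times (20), (10), (20), (10); edge delay 3D.]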
Fig. 2 The DFG for problem 3.
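When checking a candidate retiming by hand, it can help to apply it mechanically. The sketch below uses the convention w_r(e) = w(e) + r(V) - r(U) for an edge U -> V and rejects any retiming that produces a negative edge delay. The function name `retime`, the data layout, and the example values are assumptions for illustration and are not the solution for Fig. 2.

```python
def retime(edges, r):
    """Apply w_r(e) = w(e) + r(V) - r(U) to every edge U -> V.

    edges: list of (src, dst, delays); r: dict node -> retiming value.
    Raises ValueError if any retimed edge would carry a negative delay count.
    """
    retimed = [(u, v, w + r[v] - r[u]) for u, v, w in edges]
    if any(w < 0 for _, _, w in retimed):
        raise ValueError("infeasible retiming: negative edge delay")
    return retimed


# Hypothetical example: a two-node loop with 3 delays on the B -> A edge.
edges = [("A", "B", 0), ("B", "A", 3)]
retimed = retime(edges, {"A": 0, "B": 1})
print(retimed)
```

Retiming moves delays around but never changes the number of delays in a loop, which is a quick consistency check on the result.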
4. Consider the DFG in Fig. 3. The numbers in parentheses are the computation times
of the nodes.
a) What is the iteration bound of this DFG? What is the actual iteration period?
b) Retime this DFG to minimize the iteration period. What is the actual iteration
period of the retimed DFG?
c) Unfold both the original DFG and the retimed DFG by a factor of 2. What are
their actual iteration periods?
d) Determine the minimum unfolding factor J such that the J-unfolded DFG
(unfolded from the original DFG) can be retimed so that the critical path of this
unfolded DFG is $J T_\infty$, where $T_\infty$ is the iteration bound of the original
DFG in Fig. 3. Unfold the DFG by this minimum unfolding factor and retime
the unfolded DFG so that its critical path is $J T_\infty$.
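[Figure not reproduced. Labels recovered from the figure: nodes A, B, C, D, E; computation times (8), (2), (10), (6), (4); delay labels D and 3D.]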
Fig. 3 The DFG for problem 4.
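The iteration bound is the maximum, over all loops in the DFG, of the loop computation time divided by the number of delays in the loop. For small graphs it can be found by brute-force loop enumeration, as in the sketch below. The representation, the function name `iteration_bound`, and the three-node example are hypothetical and do not correspond to Fig. 3; integer computation times are assumed so that exact fractions can be used.

```python
from fractions import Fraction


def iteration_bound(times, edges):
    """Max over all loops of (loop computation time) / (loop delay count).

    times: dict node -> integer computation time.
    edges: list of (src, dst, delays).
    Brute-force enumeration of simple loops; intended only for small graphs.
    """
    nodes = sorted(times)
    order = {v: k for k, v in enumerate(nodes)}
    out = {v: [] for v in nodes}
    for u, v, w in edges:
        out[u].append((v, w))

    best = Fraction(0)

    def dfs(start, node, t_sum, w_sum, visited):
        nonlocal best
        for nxt, w in out[node]:
            if nxt == start:
                # Closed a loop; a valid DFG has at least one delay per loop.
                best = max(best, Fraction(t_sum, w_sum + w))
            elif nxt not in visited and order[nxt] > order[start]:
                # Only visit nodes ordered after start so each loop is found once.
                dfs(start, nxt, t_sum + times[nxt], w_sum + w, visited | {nxt})

    for s in nodes:
        dfs(s, s, times[s], 0, {s})
    return best


# Hypothetical example with two loops: A-B (6/2) and A-B-C (9/2).
times = {"A": 2, "B": 4, "C": 3}
edges = [("A", "B", 1), ("B", "A", 1), ("B", "C", 0), ("C", "A", 1)]
bound = iteration_bound(times, edges)
print(bound)
```

Comparing this bound with the critical path of the same graph shows whether the actual iteration period already meets the bound or whether retiming or unfolding is needed.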
5. Consider the 6-tap FIR filter
$$y(n) = \sum_{i=0}^{5} h(i)\, x(n-i)$$
implemented using the data-broadcast form shown in Fig. 4. This filter is implemented
using folding factor 2 with folding sets
$S_0 = \{MA5, MA4\}$, $S_1 = \{MA3, MA2\}$, $S_2 = \{MA1, MA0\}$.
a) Design the folded architecture.
b) Construct a schedule corresponding to the folded architecture and verify that the
folded architecture generates the desired filter output samples.
[Figure not reproduced. Labels recovered from the figure: input x(n); coefficients h0 to h5; nodes MA0 to MA5; folding assignments (S0/0), (S0/1), (S1/0), (S1/1), (S2/0), (S2/1); delay elements D.]
Fig. 4 A 6-tap data-broadcast FIR filter for problem 5.
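For part (b), a reference model of the unfolded filter gives the output samples that the folded architecture's schedule should reproduce. The sketch below evaluates the convolution sum directly, assuming x(n) = 0 for n < 0. The function name `fir_reference` and the coefficient and input values are hypothetical, chosen only for illustration.

```python
def fir_reference(h, x):
    """Direct evaluation of y(n) = sum_i h(i) * x(n - i), with x(n) = 0 for n < 0."""
    y = []
    for n in range(len(x)):
        acc = 0
        for i, coeff in enumerate(h):
            if n - i >= 0:
                acc += coeff * x[n - i]
        y.append(acc)
    return y


# Hypothetical coefficients h(0)..h(5) and an impulse input.
h = [1, 2, 3, 4, 5, 6]
x = [1, 0, 0, 0, 0, 0, 0]
y = fir_reference(h, x)
print(y)
```

With an impulse input the output is simply the coefficient sequence, which makes it easy to see which tap contributes to each scheduled output.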
6. The goal of this problem is to fold the lattice filter shown in Fig. 5 using the folding
set description shown in the figure. Assume the multiply operations are mapped to
hardware multipliers pipelined by 2 stages, and assume the add operations are
mapped to 1-stage pipelined adders. The hardware architecture needs to be clocked
with a clock period of 1 u.t.
a) Systematically perform retiming for folding so that all folded edge delays are
nonnegative.
b) Fold the retimed DFG.
[Figure not reproduced. Folding assignments labeled in the figure: A1 (SA1/1), A2 (SA1/0), A3 (SA2/0), A4 (SA2/1), M1 (SM1/0), M2 (SM1/1), M3 (SM2/0), M4 (SM2/1), M5 (SM3/1). Other labels: IN, OUT, D.]
Fig. 5 The lattice filter used in problem 6.
SA1 = {A2, A1}, SA2 = {A3, A4}, SM1 = {M1, M2}, SM2 = {M3, M4}, SM3 = {∅, M5}
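The number of delays on a folded edge follows from the folding equation D_F(U -> V) = N w(e) - P_U + v - u, where N is the folding factor, w(e) is the delay count on the edge U -> V, P_U is the number of pipeline stages of the hardware unit executing U, and u and v are the folding orders of U and V. A minimal sketch follows; the function name `folded_delay` and the numeric values are hypothetical and are not taken from Fig. 5.

```python
def folded_delay(N, w, P_u, u, v):
    """Folding equation: delays on the folded edge for U -> V.

    N: folding factor; w: delays on the original edge;
    P_u: pipeline stages of the unit executing U; u, v: folding orders.
    """
    return N * w - P_u + v - u


# Hypothetical edge from a 2-stage multiplier (order 0) to an adder (order 1).
with_delay = folded_delay(N=2, w=1, P_u=2, u=0, v=1)
no_delay = folded_delay(N=2, w=0, P_u=2, u=0, v=1)
print(with_delay, no_delay)
```

A negative value, as in the second call, indicates an edge that must receive additional delays through retiming before the DFG can be folded.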
7. Dynamic programming (DP) has been used to solve problems in communications
and controls, artificial intelligence, and operations research. DP problems
have the property that the optimum solution from an initial iteration to iteration
$i+j$ must consist of the optimum solution from the initial iteration to iteration $i$, and
from iteration $i$ to iteration $i+j$. In signal processing, DP is frequently used in
Viterbi decoders in communication systems, and in speech recognition systems
based on hidden Markov models [15]. Consider the $N$-state DP problem given by
$$x_i(n+1) = \max_j \left[ x_j(n) + a_{ji}(n) \right], \quad i, j = 1, 2, \ldots, N, \tag{6.9}$$
where $x_i(n)$ is the value for state $i$ in the $n$-th iteration, and the variables $a_{ji}$ are
referred to as the trellis or path coefficients. The fundamental operation in (6.9) is
add-compare-select (ACS). An $N$-state DP problem requires $N^2$ ACS operations in
order to update the $N$ state values. The DFG for $N = 4$ is shown in Fig. 6, which
corresponds to a ring systolic structure. The coefficient $a_{ji}$ is stored in node $A_{ji}$,
and the DFG is wrapped around, with edge $i$ on the right connected to the edge on
the left with the same number.
(a) Write down the computations performed by all ACS units in the DFG in Fig. 6
during the $n$-th iteration, and verify that this DFG updates $x_1(n+1)$, $x_2(n+1)$,
$x_3(n+1)$, and $x_4(n+1)$ along the first, second, third, and fourth columns,
respectively.
(b) Fold the DFG using folding factor 4 and the following folding sets:
$S_1 = \{A_{41}, A_{31}, A_{21}, A_{11}\}$
$S_2 = \{A_{12}, A_{42}, A_{32}, A_{22}\}$
$S_3 = \{A_{23}, A_{13}, A_{43}, A_{33}\}$
$S_4 = \{A_{34}, A_{24}, A_{14}, A_{44}\}$.
(Hint: You should get a ring systolic structure.)
(c) Fold the DFG using folding factor 8 and the following folding sets:
$S_1 = \{A_{41}, A_{23}, A_{31}, A_{13}, A_{21}, A_{43}, A_{11}, A_{33}\}$
$S_2 = \{A_{12}, A_{34}, A_{42}, A_{24}, A_{32}, A_{14}, A_{22}, A_{44}\}$.
(d) Fold all the nodes in the DFG onto one processing element using folding factor 16
and the following folding set:
$S_1 = \{A_{41}, A_{12}, A_{23}, A_{34}, A_{31}, A_{42}, A_{13}, A_{24}, A_{21}, A_{32}, A_{43}, A_{14}, A_{11}, A_{22}, A_{33}, A_{44}\}$.
Fig. 6 The DFG for the DP used in Problem 7.
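As a reference for checking part (a) and the folded designs, one iteration of the recursion (6.9) can be written directly in software. The sketch below is a plain, unfolded model of the ACS update, not a model of the ring systolic structure. It uses 0-based indices, whereas the problem numbers states from 1, and the function name `acs_update` and the example values are hypothetical.

```python
def acs_update(x, a):
    """One DP iteration: x_i(n+1) = max over j of [x_j(n) + a_ji(n)].

    x: list of N state values at iteration n.
    a: N x N nested list where a[j][i] is the path coefficient a_ji.
    Performs N * N add-compare-select steps in total.
    """
    N = len(x)
    return [max(x[j] + a[j][i] for j in range(N)) for i in range(N)]


# Hypothetical 2-state example.
x_n = [0, 1]
a = [[1, 5],
     [2, 0]]
x_next = acs_update(x_n, a)
print(x_next)
```

Each output element corresponds to one column of ACS units in the DFG, so the per-column partial results written down in part (a) can be compared against this model.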