Viterbi Algorithm Advanced Architectures Lecture 15 Vladimir Stojanović 6.973 Communication System Design – Spring 2006 Massachusetts Institute of Technology Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. Radix 2 ACS 1x 0 Gnx0 = min ( G 0nx- 1 + λ0nx-01, G 1x n - 1 + λn - 1 ( 0 G1nx- 1 1 1 0 1x0 λ 0x0 n λn λ 0x0 n Gn0x- 1 Å λ 1x0 n Gn1x- 1 x0 dn Select Gn0 x- 1 n λ 0x0 n G x0 n λ1x0 n 0x1 λn Gnx1 λ 1nx1 Compare n-1 Gnx0 x0 G0x n-1 2-way ACS 0x1 λ n λ 1x1 n dn Gnx0 Gn1x- 1 2-way ACS dn G nx1 x1 Figure by MIT OpenCourseWare. Radix-2 trellis 2-way ACS Radix-2 ACS Unit Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. 6.973 Communication System Design 2 Radix-4 trellis n-2 n-1 n n-2 n-1 n n-2 n 0 0 0 0 0 0 0 0 1 1 1 2 1 1 2 1 2 2 2 4 4 2 4 2 3 3 3 6 5 3 6 3 4 4 4 1 2 4 1 4 5 5 5 3 3 5 3 5 6 6 6 5 6 6 5 6 7 7 7 7 7 7 7 7 Figure by MIT OpenCourseWare. 8-state Radix-2 trellis 4-state subtrellis 8-state Radix-4 trellis Radix - 2k complexity speed measures Radix 2 4 8 16 k 1 2 3 4 Ideal speedup 1 2 3 4 Complexity increase 1 2 4 8 Area efficiency 1 1 0.75 0.5 Figure by MIT OpenCourseWare. 3 Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. Radix-4 ACS x00 λ01 x00 λ10 x00 λ11 x00 λ00 n n n n x00 λ00 n Gn00- x2 Gnx 00 G 0n0-x2 Gn01- x2 Gnx 01 G 0n1-x2 Gn10- x2 Gnx10 Gn11- x2 Gnx11 G1n0-x2 Gn11-x2 G 0n 0- 2x Å x00 λ10 n Å 4-way ACS x01 λ01 x01 λ10 x01 λ11 x01 λ00 n n n n x00 λ01 n Select n Compare n-2 d nx 00 x00 λ11 n Å Gnx 00 4-way ACS G 0n 1- 2x x10 λ01 x10 λ10 x10 λ11 x10 λ00 n n n n G1n 0- 2x 4-way ACS x11 λ01 x11 λ10 x11 λ11 x11 λ00 n n n n G1n 1- 2x Radix-4 trellis 4-way ACS 4-way ACS d nx 00 Gnx00 d nx 01 Gnx01 d nx 10 Gnx10 d nx 11 Gnx11 Radix-4 ACS unit Figure by MIT OpenCourseWare. Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. 6.973 Communication System Design 4 Radix-4 ACS implementation Use ripple carry adders and comparators Take advantage of the ripple profile to hide the compare Delay 17% longer than 2-way ACS due to Image removed due to copyright restrictions. Increased adder fanout 4:1 mux instead of 2:1 mux Overall, results in 1.7x speedup compared to 2-way ACS Figure from from Black, P. J., and T. H. Meng. "A 140-Mb/s, 32-state, Radix-4 Viterbi Decoder." IEEE Journal of Solid-State Circuits 27 (1992): 1877-1885. Copyright 1992 IEEE. Used with permission. Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. 6.973 Communication System Design 5 Radix-4 placement and routing Paths given as Hamiltonian cycles (visit each node in the graph once) Images removed due to copyright restrictions. Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. 6.973 Communication System Design 6 Modulo arithmetic for ACS Viterbi algorithm inherently bounds the maximum dynamic range ∆max of state metrics ∆max ≤ λmaxlog2N (N-number of states, λmax maximum branch metric of the radix-2 trellis) Number theory Given two numbers a and b such that |a-b|< ∆ Comparison |a-b| can be evaluated as |a-b| mod 2∆ without ambiguity Hence state metrics can be updated and compared modulo 2∆max Choose state metric precision to implement modulo by ignoring the state metric overflow Required state metric precision equal to twice the maximum dynamic range of the updated state metrics Required number of bits is Γbits=ceil[ log2(∆max+kλmax) ] + 1 k – accounts for branch metric addition Example values (for the 32-state radix-4 decoder) k=2, λmax=14, ∆max=70, Γbits=8 Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. 6.973 Communication System Design 7 Branch metric unit Example (8-level soft input, R=1/2, K=6 (32 state) λ(S1S2)=|G1-S1|+|G2-S2| (G-received sample, S-expected sample) (λmax=14) 4 bits required for radix-2 branch metrics 5 bits for the radix-4 branch metrics Images removed due to copyright restrictions. Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. 6.973 Communication System Design 8 State-metric initialization Need to start from right state metrics for dynamic range bound to hold (and for modulo arithmetic to be valid) This is b/c there are constraints on the state metric values imposed by the trellis structure For example state-0 and state-1 have a common ancestor state one iteration back This constrains the state metrics to differ at most by the λmax Find the right initial metric through simulation (with all zero inputs) until steady state is reached Figure removed due to copyright restriction. Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. 6.973 Communication System Design 9 Decoder block diagram Decoder output 4 LIFO 4 Decisions Radix-16 trace-back unit 32 x 4 Radix-16 decisions Decision memory 32 x 4 Radix-16 decisions Radix-4 pretrace-back unit 32 x 2 Radix-4 decisions Radix-4 ACS array 16 x 5 Metrics Radix-4 branch metric unit 4x4 Metrics Radix-2 BMU 2x3 Metrics 4x4 Radix-2 BMU G 1, G 2 G 1, G 2 2x3 Decoder inputs Figure by MIT OpenCourseWare. Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. 6.973 Communication System Design 10 Decision memory organization Survivor paths always merge L=5K steps back Decode block D Survivor path merge block L Write block D Decision vector Trace-back { Write { Read region Write region Figure by MIT OpenCourseWare. Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. 6.973 Communication System Design 11 Advanced algorithmic transformations Sliding block Viterbi decoder (SBVD) Based on two important observations (µ+1=constraint length of the code) 1. Survivor paths merge L=5µ iterations back into the trellis 2. After K=5µ steps, state metrics independent on the initial value of state metrics Unknown state at time n can be decoded using only information from the block [n-K, n+L] Cannot store all the values in the memory Have to obtain them “on-the-fly” Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. 6.973 Communication System Design 12 SBVD implementation Can find shortest path by running forward or backward At step m Forward processing Backward processing 4 survivors 4 shortest paths Combined Smallest concatenated state metric Starting state for trace-back of the shortest path Figure from Black, P. J., and T. Y. Meng. "A 1-Gb/s, Four-state, Sliding Block Viterbi Decoder." IEEE Journal of Solid-State Circuits 32 (1997): 797-805. Copyright 1992 IEEE. Used with permission. Continue fw and backw operation Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. 6.973 Communication System Design 13 Forward vs. Forward-Backward Figure from Black, P. J., and T. Y. Meng. "A 1-Gb/s, Four-state, Sliding Block Viterbi Decoder." IEEE Journal of Solid-State Circuits 32 (1997): 797-805. Copyright 1992 IEEE. Used with permission. Can decode more than one state (M – states) Fw-Bw has reduced decoding delay and skew buffer memory Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. 6.973 Communication System Design 14 Continuous stream processing Figure from Black, P. J., and T. Y. Meng. "A 1-Gb/s, Four-state, Sliding Block Viterbi Decoder." IEEE Journal of Solid-State Circuits 32 (1997): 797-805. Copyright 1992 IEEE. Used with permission. Cut the incoming stream in overlapping chunks Process in parallel Outputs are non-overlapping Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. 6.973 Communication System Design 15 Systolic SBVD architecture Since block is bounded Can unroll and pipeline ACS Trace-back M=2L Has to be a multiple of L Figure from Black, P. J., and T. Y. Meng. "A 1-Gb/s, Four-state, Sliding Block Viterbi Decoder." IEEE Journal of Solid-State Circuits 32 (1997): 797-805. Copyright 1992 IEEE. Used with permission. Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. 6.973 Communication System Design 16 Example for L=2 Figure from Black, P. J., and T. Y. Meng. "A 1-Gb/s, Four-state, Sliding Block Viterbi Decoder." IEEE Journal of Solid-State Circuits 32 (1997): 797-805. Copyright 1992 IEEE. Used with permission. Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. 6.973 Communication System Design 17 ACS units Figures from Black, P. J., and T. Y. Meng. "A 1-Gb/s, Four-state, Sliding Block Viterbi Decoder." IEEE Journal of Solid-State Circuits 32 (1997): 797-805. Copyright 1992 IEEE. Used with permission. Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. 6.973 Communication System Design 18 A 64-state example Not fully parallel (8 radix-4 ACS units) 2 radix-4 butterflies in each cycle 8 cycles for 64 states radix-4 (i.e. two radix-2 steps) Images removed due to copyright restrictions. Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. 6.973 Communication System Design 19 Readings [1] A.P. Hekstra "An alternative to metric rescaling in Viterbi decoders," Communications, IEEE Transactions on vol. 37, no. 11, pp. 1220-1222, 1989. [2] P.J. Black and T.H. Meng "A 140-Mb/s, 32-state, radix-4 Viterbi decoder," Solid-State Circuits, IEEE Journal of vol. 27, no. 12, pp. 1877-1885, 1992. [3] P.J. Black and T.Y. Meng "A 1-Gb/s, four-state, sliding block Viterbi decoder," Solid-State Circuits, IEEE Journal of vol. 32, no. 6, pp. 797-805, 1997. [4] M. Anders, S. Mathew, R. Krishnamurthy and S. Borkar "A 64-state 2GHz 500Mbps 40mW Viterbi accelerator in 90nm CMOS," VLSI Circuits, 2004. Digest of Technical Papers. 2004 Symposium on, pp. 174-175, 2004. Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006. MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY]. 6.973 Communication System Design 20