Low Density Parity Check (LDPC) Code Implementation
  Matthew Pregara & Zachary Saigh
  Advisors: Dr. In Soo Ahn & Dr. Yufeng Lu
  Dept. of Electrical and Computer Eng.

Contents
  Background and Motivation
  Linear Block Coding Example
  Hard Decision Tanner Graph Decoding
  Constructing LDPC Codes
  Soft Decision Decoding
  Results
  Conclusion

Background
  ARQ: Automatic Repeat Request
    Detects errors and requests retransmission
    Example: Even or Odd Parity
  FEC: Forward Error Correction
    Detects AND corrects errors
    Examples: Linear Block Coding, Turbo Codes, Convolutional Codes

Why LDPC?
  Low decoding complexity
  Higher code rate
  Better error performance
  Industry standard for:
    802.11n Wi-Fi
    Digital Video Broadcasting
    WiMAX and 4G

Performance Comparison (taken from [1])

Project Goals
  Create an LDPC code system simulation with MATLAB/Simulink
  Implement a scaled-down LDPC system on an FPGA using Xilinx System Generator
  Complete a system performance comparison between the MATLAB/Simulink and FPGA implementations

Linear Block Coding
  Block codes are denoted by (n, k).
    k = message bits (message word)
    n = message bits + parity bits (coded bits)
    Number of parity bits: m = n - k
    Code rate: R = k/n
  Ex: (7,4) code
    4 message bits + 3 parity bits = 7 coded bits
    Code rate R = 4/7

Hamming Code Example
  Transmitter: m → G (encoder matrix) → u
  Channel:     u + e → r
  Receiver:    r → H' (decoder matrix) → S → Error Lookup Table → e
               r + e → u' → Remove parity bits → m'
  e  = error pattern
  m  = message bit word
  u  = code word
  r  = received code word with error
  S  = syndrome
  u' = corrected code word
  m' = received message word

Constructing Hamming Code
  Factor x^n + 1
  Populate the G matrix (k x n) with the shifted factor
  Take the reduced row echelon form of G to find the systematic G matrix
  The H matrix is obtained by manipulating the systematic G matrix

Encoding Example
                | 1 0 0 0 1 1 0 |
  [1 0 1 1]  ×  | 0 1 0 0 0 1 1 |  =  [1 0 1 1 p1 p2 p3]
                | 0 0 1 0 1 1 1 |
                | 0 0 0 1 1 0 1 |
  p1 = m1 ⊕ m3 ⊕ m4 = 1 ⊕ 1 ⊕ 1 = 1
  p2 = m1 ⊕ m2 ⊕ m3 = 1 ⊕ 0 ⊕ 1 = 0
  p3 = m2 ⊕ m3 ⊕ m4 = 0 ⊕ 1 ⊕ 1 = 0
  u = [1 0 1 1 1 0 0]

Decoding
  S = received code word × H^T
          m1 | 1 1 0 |
          m2 | 0 1 1 |
          m3 | 1 1 1 |
  H^T  =  m4 | 1 0 1 |
          p1 | 1 0 0 |
          p2 | 0 1 0 |
          p3 | 0 0 1 |
  S = [S1 S2 S3]
  S1 = m1 ⊕ m3 ⊕ m4 ⊕ p1
  S2 = m1 ⊕ m2 ⊕ m3 ⊕ p2
  S3 = m2 ⊕ m3 ⊕ m4 ⊕ p3

Correcting Errors
  In this case the 2nd bit is corrupted.
  Invert the corrupted bit according to the location found by the syndrome table.
  S  = [0 1 1]
  r  = [1 1 1 1 1 0 0]
  e  = [0 1 0 0 0 0 0]
  u' = [1 0 1 1 1 0 0]

Tanner Graph and Hard Decision Decoding
  (8,4) Example
  Check node connections:
    C1: V2, V4, V5, V8
    C2: V1, V2, V3, V6
    C3: V3, V6, V7, V8
    C4: V1, V4, V5, V7
         V1 V2 V3 V4 V5 V6 V7 V8
  H  = |  0  1  0  1  1  0  0  1 |  C1
       |  1  1  1  0  0  1  0  0 |  C2
       |  0  0  1  0  0  1  1  1 |  C3
       |  1  0  0  1  1  0  1  0 |  C4

Hard Decision Decoding
  Check node  Activities
  C1          Receive: V2 → 1   V4 → 1   V5 → 0   V8 → 1
              Send:    0 → V2   0 → V4   1 → V5   0 → V8
  C2          Receive: V1 → 1   V2 → 1   V3 → 0   V6 → 1
              Send:    0 → V1   0 → V2   1 → V3   0 → V6
  C3          Receive: V3 → 0   V6 → 1   V7 → 0   V8 → 1
              Send:    0 → V3   1 → V6   0 → V7   1 → V8
  C4          Receive: V1 → 1   V4 → 1   V5 → 0   V7 → 0
              Send:    1 → V1   1 → V4   0 → V5   0 → V7

Variable Node Decisions
  Variable node  y_i  Messages from check nodes  Decision
  V1             1    C2 → 0   C4 → 1            1
  V2             1    C1 → 0   C2 → 0            0
  V3             0    C2 → 1   C3 → 0            0
  V4             1    C1 → 0   C4 → 1            1
  V5             0    C1 → 1   C4 → 0            0
  V6             1    C2 → 0   C3 → 1            1
  V7             0    C3 → 0   C4 → 0            0
  V8             1    C1 → 0   C3 → 1            1

Differences of LDPC Code
  Construct the H matrix first
    H is sparsely populated with 1s
    Fewer edges → fewer computations
  Find G
    The systematic H and G matrices will not be sparse

Soft Decision Decoding
  Uses the Tanner graph representation with an iterative process
  No "hard-clipping" of the received code word
  2 dB performance gain over hard decision [2]

High Level LDPC System Block Diagram
  Transmitter: m → Encoder Matrix G → u
  Channel:     u + e → r
  Receiver:    r → LDPC Decoder → u' → Remove parity bits → m'
  e  = error pattern
  m  = message bit word
  u  = code word
  r  = received code word with error
  S  = syndrome
  u' = corrected code word
  m' = received message word

Soft Decision Decoder Diagram

Re-Calculate Each Edge
  [Animation: Tanner graph with variable nodes 1-10 and check nodes 1-5]
  Values at the variable nodes:
     1: -7.11158    2: -1.5320    3: -0.3289    4:  5.7602    5:  2.7111
     6:  0.4997     7: -5.1652    8:  1.5357    9: -5.0942   10:  1.2526
  The values -1.5320, -0.3289 and 5.7602 are passed one at a time through the Update Algorithm and SUM blocks.
  This updated value is sent back to variable node 1: 0.2096

Decoding Algorithm
  Phi function: difficult to implement on an FPGA
  Solutions:
    Find an approximation
    Construct a lookup table

Lookup Table Approach
  Note: all inputs are >= 0

Simulation Results

Simulink LDPC System

Encoder Comparison

Encoder with Xilinx blocks

Co-Simulation Results

Xilinx LDPC Decoder

Decoder Control Logic

Check Node Implementation

Conclusion
  MATLAB/Simulink simulation of the LDPC system has been completed.
  An efficient approximation of the decoding algorithm has been developed for hardware implementation.
  The Xilinx System Generator design for the decoder has been constructed.
  Comparison and verification of the MATLAB and Xilinx System Generator results have not been completed.
  FPGA implementation and a scaled-up system may not be completed.

References
  [1] Valenti, Matthew. Iterative Solutions Coded Modulation Library Theory of Operation. West Virginia University, 03 Oct. 2005. Web. 23 Oct. 2012. <www.wvu.edu>.
  [2] B. Sklar, Digital Communications: Fundamentals and Applications, second edition, Prentice-Hall, 2000.
  [3] Xilinx System Generator Manual, Xilinx Inc., 2011.

Questions?
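Worked example: (7,4) encoding and syndrome correction
  The Encoding Example, Decoding and Correcting Errors slides can be replayed with a short script. This is a minimal sketch in Python (not the project's MATLAB code). The parity equations and the rows of H^T come from the slides; the function names encode, syndrome and correct are hypothetical and chosen only for illustration.

```python
# Sketch of the (7,4) example. Parity equations and H^T rows follow the
# slides; every function and variable name here is illustrative only.

# Rows of H^T, one per received bit, in the order m1 m2 m3 m4 p1 p2 p3.
H_T = [
    (1, 1, 0),  # m1 takes part in S1 and S2
    (0, 1, 1),  # m2 takes part in S2 and S3
    (1, 1, 1),  # m3 takes part in S1, S2 and S3
    (1, 0, 1),  # m4 takes part in S1 and S3
    (1, 0, 0),  # p1
    (0, 1, 0),  # p2
    (0, 0, 1),  # p3
]


def encode(m):
    """Append the three parity bits to the 4-bit message word."""
    m1, m2, m3, m4 = m
    p1 = m1 ^ m3 ^ m4
    p2 = m1 ^ m2 ^ m3
    p3 = m2 ^ m3 ^ m4
    return [m1, m2, m3, m4, p1, p2, p3]


def syndrome(r):
    """Compute S = r x H^T over GF(2) by XOR-ing the rows selected by r."""
    s = [0, 0, 0]
    for bit, row in zip(r, H_T):
        if bit:
            s = [a ^ b for a, b in zip(s, row)]
    return tuple(s)


def correct(r):
    """Flip the single bit whose H^T row equals the syndrome, if any."""
    s = syndrome(r)
    u = list(r)
    if s != (0, 0, 0):
        u[H_T.index(s)] ^= 1  # acts as the error lookup table
    return u


u = encode([1, 0, 1, 1])  # code word from the Encoding Example slide
r = list(u)
r[1] ^= 1                 # corrupt the 2nd bit, as on the Correcting Errors slide
print(u, r, syndrome(r), correct(r))
```

  Because the seven rows of H^T are exactly the seven nonzero 3-bit patterns, every nonzero syndrome points at one bit position, which is why a list lookup is enough to stand in for the error lookup table.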
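Worked example: one hard-decision iteration on the (8,4) Tanner graph
  The check node activities and variable node decisions on the hard decision slides can be reproduced as follows. This is a sketch in Python under two assumptions: each check node sends a neighbour the bit that would satisfy even parity given the other three neighbours, and each variable node takes a majority vote over its received bit and its two check messages. Both rules match the values on the slides; the names CHECKS, check_to_variable and decide are hypothetical.

```python
# One round of hard-decision message passing for the (8,4) example.
# Connections come from the Tanner graph slide; names are illustrative.

CHECKS = {
    "C1": [2, 4, 5, 8],
    "C2": [1, 2, 3, 6],
    "C3": [3, 6, 7, 8],
    "C4": [1, 4, 5, 7],
}


def check_to_variable(y):
    """Return, for each variable node, the bits sent to it by its checks."""
    msgs = {v: [] for v in range(1, len(y) + 1)}
    for nodes in CHECKS.values():
        parity = 0
        for v in nodes:
            parity ^= y[v - 1]
        for v in nodes:
            # XOR of the other neighbours = bit that satisfies the check
            msgs[v].append(parity ^ y[v - 1])
    return msgs


def decide(y, msgs):
    """Majority vote over the received bit and the check messages."""
    decoded = []
    for v, bit in enumerate(y, start=1):
        votes = [bit] + msgs[v]
        decoded.append(1 if 2 * sum(votes) > len(votes) else 0)
    return decoded


y = [1, 1, 0, 1, 0, 1, 0, 1]  # y_i column of the Variable Node Decisions slide
msgs = check_to_variable(y)
decoded = decide(y, msgs)
print(msgs)
print(decoded)
```

  Every variable node in this graph has two check neighbours, so each vote has three ballots and cannot tie. A full decoder would repeat the two steps until all four checks are satisfied; here a single pass already corrects V2.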
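Worked example: check node update with the phi function
  The value 0.2096 on the Re-Calculate Each Edge slides is consistent with the standard sum-product check node rule built on phi(x) = -ln(tanh(x/2)). The sketch below, in Python, assumes that the "Update Algorithm" box applies phi and that the sign of the outgoing message is the product of the incoming signs; the slides do not state the sign convention, so treat that part as an assumption. The names phi, check_node_message and MIN_INPUT are hypothetical.

```python
import math

# Assumed floor that keeps phi() away from its singularity at 0.
MIN_INPUT = 1e-12


def phi(x):
    """phi(x) = -ln(tanh(x / 2)) for x >= 0; phi is its own inverse."""
    return -math.log(math.tanh(max(x, MIN_INPUT) / 2.0))


def check_node_message(incoming):
    """Combine the messages from all other edges of one check node.

    Magnitude: phi of the sum of phi(|L|).
    Sign: product of the incoming signs (assumed convention).
    """
    sign = 1.0
    total = 0.0
    for llr in incoming:
        if llr < 0:
            sign = -sign
        total += phi(abs(llr))
    return sign * phi(total)


# Values from variable nodes 2, 3 and 4 on the slide; the result is the
# message sent back to variable node 1.
msg_to_v1 = check_node_message([-1.5320, -0.3289, 5.7602])
print(round(msg_to_v1, 4))
```

  The only non-trivial operation is phi itself, which is why the Decoding Algorithm slide singles it out as the part that needs an approximation or a lookup table on the FPGA. Since the magnitudes are taken before phi is applied, a table only has to cover non-negative inputs.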
Appendix

Matrix Manipulation

Reducing Decoding Complexity
  Square_add function:
    y = max_star(L1,L2) - max_star(0, L1+L2);
  MAX* function:
    function y = max_star(L1, L2)
    if (L1==L2)
        % shortcut for equal inputs (note: this omits the log(2) correction term)
        y = L1;
        return;
    end;
    % exact MAX*: maximum plus a logarithmic correction term
    y = max(L1,L2) + log(1+exp(-abs(L1-L2)));
    end

Reducing Decoding Complexity
  This:
    y = max(L1,L2) + log(1+exp(-abs(L1-L2)));
  Becomes the approximated MAX* function:
    x = -abs(L1-L2);
    % x is never positive, so the original (x<2) test was always true
    if (x >= -2)
        y = max(L1,L2) + 3/8;
    else
        y = max(L1,L2);
    end;
  No costly log function

In Practice
  Using MATLAB's Code Profiler…
    MAX* function takes: 25.85 s of simulation
    Approx. MAX* function takes: 28.02 s of an equally sized simulation
    Difference of 2.17 s

Simulation Results
  (10,5) Code
  1000 codewords per datapoint, or 10,000 bits

Timeline
  [Gantt chart; dates shown run from Jan 28, 2013 to May 15, 2013]
  Xilinx System Generator Model
  MATLAB Implementation
  Performance Analysis
  FPGA Implementation
  Student Expo Abstract Due
  Design Scaling
  Student Expo: Apr 15, 2013 - Apr 19, 2013
  Project Presentations: May 9, 2013 - May 15, 2013

Division of Labor
  Zack
    Simulink Model
    Xilinx System Generator design of decoder
    Implementation of VHDL on FPGA
    Performance analysis of FPGA implementation
  Matt
    MATLAB Simulation
    Error Performance Analysis
    Xilinx System Generator design of encoder
    Performance analysis of MATLAB/Xilinx implementation
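Worked example: exact versus approximated MAX*
  The Square_add and MAX* listings in the appendix can be compared numerically. This is a sketch in Python (not the project's MATLAB code), using the same formulas and the same 3/8 correction. The function names mirror the appendix, while the star parameter and the test values are assumptions made for illustration. The shortcut for equal inputs in the MATLAB listing is left out here.

```python
import math


def max_star(l1, l2):
    """Exact MAX*: max plus the logarithmic correction term."""
    return max(l1, l2) + math.log1p(math.exp(-abs(l1 - l2)))


def max_star_approx(l1, l2):
    """Approximated MAX*: constant 3/8 correction when |L1 - L2| <= 2."""
    correction = 0.375 if abs(l1 - l2) <= 2 else 0.0
    return max(l1, l2) + correction


def square_add(l1, l2, star=max_star):
    """Square_add from the appendix, with a pluggable MAX* (assumed hook)."""
    return star(l1, l2) - star(0.0, l1 + l2)


# Fold the three values that feed check node 1 on the soft-decision slides.
inputs = (-1.5320, -0.3289, 5.7602)
exact = square_add(square_add(inputs[0], inputs[1]), inputs[2])
approx = square_add(
    square_add(inputs[0], inputs[1], max_star_approx),
    inputs[2],
    max_star_approx,
)
print(round(exact, 4), round(approx, 4))
```

  With the exact MAX* the folded result is about 0.2096, the value shown on the soft-decision slides. With the approximation it is about 0.3289, so the magnitude is overestimated for this particular set of inputs while the sign is preserved. The approximation needs only a comparison and an addition, which is the point of the "No costly log function" remark.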