FPGA Implementation of Reed-Solomon Decoder for IEEE 802.16 WiMAX Systems using Simulink-Sysgen Design Environment

Miljko Bobrek, Kenyon H. Clark, Austin P. Albright
Oak Ridge National Laboratory, Cognitive Radio Program
1 Bethel Valley Rd, Oak Ridge, TN 37831
bobrekm@ornl.gov

Abstract: This paper presents an FPGA implementation of a Reed-Solomon decoder for use in IEEE 802.16 WiMAX systems. The decoder is based on the RS(255,239) code, which is additionally shortened and punctured according to the WiMAX specifications. A Simulink model based on the System Generator (Sysgen) library of low-level Xilinx blocks was used for simulation and hardware implementation. Simulation results and hardware implementation performance are presented at the end.

Keywords: Reed-Solomon, syndrome, Berlekamp algorithm, Chien search, Forney algorithm

1. INTRODUCTION

Reed-Solomon (RS) error correcting codes are among the most popular codes used in communication systems, data storage systems, and bar codes. They are an integral part of many standards such as G.709, IEEE 802.16, DVB, IESS-308, CCSDS, RAID, and QR bar codes. Many implementations of RS codes have been published over the years targeting ASIC, FPGA, or DSP applications [1-6]. Most implementation-oriented RS papers address a specific application, usually related to a particular standard. Examples include a VLSI implementation of an RS decoder for HDTV [7] and an FPGA implementation for OTN G.709 [8].

In this paper, we present an FPGA-based, fast prototyping design environment for arbitrary RS code sizes, including shortened and punctured codes. The design methodology is based on Simulink® (Mathworks), which uses the System Generator® (Xilinx) library of logic blocks for seamless FPGA implementations. The main motivation for creating such a development platform is to be able to quickly design RS decoders for special purpose wireless links not covered by existing standards.
The development platform was tested by designing and implementing the RS(255,239) code for IEEE 802.16 WiMAX systems. A description of the Reed-Solomon encoding and decoding procedures is presented in Section 2. Section 3 presents the FPGA implementation issues and results.

2. REED-SOLOMON CODES

Reed-Solomon codes $RS(n,k)$ are linear-block, systematic, and cyclic codes defined over the Galois field $GF(p^m)$, with $k$ data words and $n-k$ parity words, each containing elements from the set $\{0, 1, \ldots, p^m - 1\}$. The most commonly used values for $p$ and $m$ are 2 and 8, thus producing a non-binary field with byte-size words. The error correcting capability of RS codes is defined by the following expression

$$2E + B \le n - k \qquad (1)$$

where $E$ is the number of errors and $B$ is the number of erasures. Codes that have their erasures located in the parity part of the code are called punctured codes.

The arithmetic in the Galois field is defined by the field generator polynomial $p(x)$. The code words are generated using the code generating polynomial defined as:

$$g(x) = \prod_{i=0}^{n-k-1} (x + \alpha^{i}) \qquad (2)$$

where $\alpha$ is a primitive root of the field generator polynomial.

RS encoder

The encoded message is generated as:

$$c(x) = m(x)\,x^{n-k} + b(x) \qquad (3)$$

where $m(x)$ is the message polynomial:

$$m(x) = m_{k-1}x^{k-1} + \cdots + m_1 x + m_0 \qquad (4)$$

and $b(x)$ is the parity polynomial of order $n-k-1$, calculated as the remainder of the polynomial division:

$$b(x) = m(x)\,x^{n-k} \bmod g(x) \qquad (5)$$

Commonly, the polynomial division in (5) is implemented in hardware using a feedback shift register as shown in Figure 1. In many applications the original $RS(n,k)$ code is shortened and punctured so that the new code $RS(n',k')$ has only $k'$ data words and the first $n'-k'$ parity words.

[Figure 1: feedback shift register with delay elements D and taps $g_0, g_1, \ldots, g_{n-k-1}$; input din, output dout]
Fig. 1 Reed-Solomon encoder.

RS decoder

The first step in the RS decoding process is the syndrome calculation. The syndrome polynomial is defined as:

$$S(x) = \sum_{i=0}^{n-k-1} S_i x^{i} \qquad (6)$$

The syndrome polynomial coefficients are calculated using:

$$S_i = \sum_{j=0}^{n-1} r_j \alpha^{ij}, \quad i = 0, 1, \ldots, n-k-1 \qquad (7)$$

where $r_j$ are the received message code words. If the code is shortened to $n'$ words, the syndrome calculation can be shortened as well, as shown below.

$$S_i = \sum_{j=0}^{n'-1} r_j \alpha^{ij}, \quad i = 0, 1, \ldots, n-k-1 \qquad (8)$$

If the code is also erased or punctured, the syndrome can be calculated using (8) after zero-words are inserted into the erased/punctured positions [2].

In the next step, the error/erasure locator polynomial coefficients are calculated by solving the following system of linear equations:

$$\sum_{j=0}^{\nu} \Lambda_j S_{i-j} = 0, \quad i = \nu, \ldots, n-k-1, \qquad \Lambda_0 = 1 \qquad (9)$$

where $\nu$ is the degree of the locator polynomial $\Lambda(x)$. Note that there are $(n-k)/2$ locator polynomial coefficients in error-only codes, and $n-k$ coefficients in erasure-only codes. The system (9) can be solved using the inversionless Berlekamp sequential algorithm [9] with the initial value for the locator polynomial $\Lambda(x)$ set to:

$$\Lambda^{(0)}(x) = \prod_{j=1}^{B} (1 + \mu_j x) \qquad (10)$$

where $\mu_j$ are the known erasure locations and $\Lambda_0 = 1$. In the case of error-only codes, the locator polynomial is initialized by $\Lambda^{(0)}(x) = 1$.

Once the error locator polynomial is known, the error evaluator polynomial is calculated by polynomial multiplication:

$$\Omega(x) = S(x)\,\Lambda(x) \bmod x^{n-k} \qquad (11)$$

The positions of the errors are located at the roots of the error locator polynomial, which are calculated by brute force using the Chien search. The error values $e_i$ are then calculated using the Forney algorithm. These two steps are formalized in the following formula:

$$e_i = \frac{\alpha^{i}\,\Omega(\alpha^{-i})}{\Lambda'(\alpha^{-i})} \quad \text{when } \Lambda(\alpha^{-i}) = 0, \quad i = 0, 1, \ldots, n-1 \qquad (12)$$

where $\Lambda'(x)$ is the formal derivative of $\Lambda(x)$. For the shortened and punctured codes $RS(n',k')$, the Chien search can be restricted to data locations only, as shown below.

$$e_i = \frac{\alpha^{i}\,\Omega(\alpha^{-i})}{\Lambda'(\alpha^{-i})} \quad \text{when } \Lambda(\alpha^{-i}) = 0, \quad i = n-k, \ldots, n-k+k'-1 \qquad (13)$$

This way the decoder latency can be significantly reduced, as $k'$ is much smaller than $n$ in many applications. At the end of the decoding process, the error values are XOR-ed with the erroneously received words to correct the errors. The block diagram of the decoder described above is shown in Figure 2.

[Figure 2: block diagram with blocks Syndrome, Delay Line, Berlekamp Algorithm, and Chien-Forney Algorithm; signals r(x), S(x), Λ(x), Ω(x), e(x), rd(x), m(x)]
Fig. 2 Reed-Solomon decoder.

3. FPGA IMPLEMENTATION

A Simulink® model based on the Sysgen® (Xilinx) library of logic blocks was implemented for the proposed decoder architecture.
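A bit-true software model is a common way to cross-check a hardware model of this kind. The following Python sketch is not taken from the Sysgen design; it only illustrates the shift-register division of (5) and the syndrome calculation of (7) and (8) from Section 2. It assumes the field generator polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11D) and primitive root α = 2 specified by IEEE 802.16, with generator roots α^0 through α^15. The names gf_mul, generator_poly, rs_encode, and syndromes are illustrative and do not correspond to blocks in the model.

```python
# Software reference sketch, NOT the authors' Sysgen model.
# Assumptions: GF(2^8), field polynomial 0x11D, primitive root alpha = 2,
# code generator roots alpha^0 .. alpha^(N-K-1).
N, K = 255, 239
PRIM_POLY = 0x11D

# Antilog (EXP) and log (LOG) tables; EXP is doubled to avoid a modulo.
EXP = [0] * 510
LOG = [0] * 256
x = 1
for i in range(255):
    EXP[i] = x
    LOG[x] = i
    x <<= 1
    if x & 0x100:
        x ^= PRIM_POLY
for i in range(255, 510):
    EXP[i] = EXP[i - 255]


def gf_mul(a, b):
    """Multiply two field elements using the log/antilog tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def generator_poly(nparity):
    """Coefficients of g(x) = prod (x + alpha^i), highest degree first."""
    g = [1]
    for i in range(nparity):
        nxt = g + [0]                      # g(x) * x
        for j, coef in enumerate(g):       # plus alpha^i * g(x)
            nxt[j + 1] ^= gf_mul(coef, EXP[i])
        g = nxt
    return g


def rs_encode(msg, nparity=N - K):
    """Systematic encoding: append the remainder of m(x) x^(n-k) / g(x).

    The loop mirrors a feedback shift register. A message shorter than K
    words gives a shortened code, since leading zero words leave the
    register unchanged.
    """
    g = generator_poly(nparity)
    reg = [0] * nparity
    for word in msg:
        fb = word ^ reg[0]                 # feedback word
        reg = reg[1:] + [0]                # shift the register
        if fb:
            for j in range(nparity):
                reg[j] ^= gf_mul(fb, g[j + 1])
    return list(msg) + reg


def syndromes(received, nparity=N - K):
    """S_i = r(alpha^i) by Horner's rule; received[0] is the highest power."""
    out = []
    for i in range(nparity):
        s = 0
        for word in received:
            s = gf_mul(s, EXP[i]) ^ word
        out.append(s)
    return out


# Usage: k' = 24 data words, as for code 1 in Table 1, before puncturing.
message = list(range(1, 25))
codeword = rs_encode(message)              # 24 data + 16 parity words
corrupted = list(codeword)
corrupted[5] ^= 0x5A                       # inject a single word error
print(any(syndromes(codeword)), any(syndromes(corrupted)))
```

In this sketch an error-free code word gives all-zero syndromes, while a corrupted word does not, which is the condition the Berlekamp stage acts on. Puncturing is not modeled here; following Section 2, the punctured parity positions would be filled with zero-words and treated as erasures.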
The top-level design of the model is shown in Figure 3. Inside the hierarchical top-level blocks, the design includes a mix of Sysgen library blocks and VHDL-based black boxes. The decoder was implemented on a Xilinx Virtex-6 hardware platform, and was tested as an integral part of a WiMAX baseband receiver. This particular application is based on the WiMAX 802.16 standard, which uses the RS(255,239) code with the field generator polynomial $p(x) = x^8 + x^4 + x^3 + x^2 + 1$ and the primitive root $\alpha = 2$. The code is shortened and punctured according to Table 1. The table shows six different code sizes and their corresponding error correction capabilities as well as the number of punctures.

[Figure 3: top-level Simulink model with blocks FIFO, Syndrome_calculator, Berlekamp_algorithm, Chien-Forney algorithm, and inv_ROM; inputs rst, data_in, data_rdy, rate_ID; outputs RS_Data_out, RS_Data_out_Valid]
Fig. 3 Simulink model for RS(255,239) decoder

Table 1

code   n'    k'    E    B
1      32    24    4    8
2      40    36    2    12
3      64    48    8    0
4      80    72    4    8
5      108   96    6    4
6      120   108   6    4

The decoder was tested as an integral part of the baseband receiver with the system clock running at 64 MHz. It had sufficient throughput for sustained decoding of WiMAX 802.16 downlink frames with an average data rate of a few tens of Mbits/s. The architecture from Figure 3 took 2131 flip-flops to implement on a Virtex-6 FPGA.

4. CONCLUSIONS

A Simulink model of a Reed-Solomon decoder was designed and implemented on an FPGA-based hardware platform. The decoder architecture can be easily adapted for arbitrary code sizes and error correction capabilities not covered by existing standards. A WiMAX 802.16 implementation of the decoder was used to test and verify the design. The decoder was tested in a real-life environment as an integral part of a WiMAX baseband receiver.

REFERENCES

[1] Y. Xu and T. Zhang, “Variable Shortened-and-Punctured Reed-Solomon Codes for Packet Loss Protection,” IEEE Transactions on Broadcasting, vol. 48, no. 3, pp. 237-245, September 2002.
[2] S-L. Shieh, S-G. Lee, and W-H. Sheen, “A Low-Latency Decoder for Punctured/Shortened Reed-Solomon Codes,” IEEE 16th International Symposium on Personal, Indoor and Mobile Radio Communications, September 2005.
[3] X. Zhang, “High-Speed VLSI Architecture for Low-Complexity Chase Soft-Decision Reed-Solomon Decoding,” Information Theory and Application Workshop, pp. 422-430, February 2009.
[4] M. Song, S. Kuo, and I. Lan, “A Low Complexity Design of Reed Solomon Code Algorithm for Advanced RAID Systems,” IEEE Transactions on Consumer Electronics, pp. 265-273, 2007.
[5] H. Lee, “High-Speed VLSI Architecture for Parallel Reed Solomon Decoder,” IEEE Transactions on VLSI Systems, pp. 288-294, April 2003.
[6] J. Sankaran, “Reed Solomon Decoder: TMS320C64x Implementation,” TI Application Report, December 2002.
[7] Y. F. Guo, Z. C. Lee, and Q. Wang, “An Area-Efficient Reed-Solomon Decoder for HDTV Channel Demodulation,” Embedded Systems and Applications (ESA), 2006.
[8] T. C. Barbosa, L. R. Moreno, T. S. Pimenta and L. H. C. Ferriera, “FPGA Implementation of Reed-Solomon CODEC for OTN G.709 Standard with Reduced Decoder Area,” 8th International Conference on Computing, Communications and Control Technologies, CCCT 2010, Orlando FL, April 2010.
[9] T-K. Truong, J. H. Jeng, and K-C. Hung, “Inversionless Decoding of Both Errors and Erasures of Reed-Solomon Code,” IEEE Transactions on Communications, vol. 46, no. 8, pp. 973-976, August 1998.