Modulation and Demodulation Techniques for FPGAs Ray Andraka P.E., president, Andraka Consulting Group, Inc. the 16 Arcadia Drive • North Kingstown, RI 02852-1666 • USA 401/884-7930 FAX 401/884-7950 copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 1 You can do Math in them things??? copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 2 Overview • Introduction • Digital demodulation for FPGAs • Filtering in FPGAs • Comparison to other technologies • Summary copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 3 Digital Communications • • • • • Historically only base-band processing High sample rates for down-converters Down-conversion traditionally analog Digital down-conversion with specialty chips FPGAs can compete copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 4 Why Digital? • Frequency agility • Repeatability • Cost copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 5 Digital Challenges • A to D converter • High sample rates • Arithmetic intensive copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 6 Conventional Demodulator √2f(t)ejωct Video Bandpass Filter 2f(t) cos(ωct) -ωc 0 ωc -jωct e Decimate by R Phase Split -ωc 0 =cos(ωct)-jsin(ωct) ωc -2ωc copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 0 7 Re-arranged Demodulator cos (ωct) * e-jωct Lowpass Filter √2 f(t) Decimate by R I Lowpass Filter √2 f(t) Q -jsin(ωct) -ωc 0 ωc -2ωc 0 -2ωc copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 0 8 Complex Mixer Complex Baseband Signal (Passband for Demodulator) Phase or frequency input Re[Sk] Re[Ak] Im[Sk] Im[Ak] sin(ωct) ωc cos(ωct) Complex Passband Signal (Baseband for Demodulator) Numerically Controlled Oscillator copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 9 Waveform Synthesis (NCOs) • Various Methods – – – – Look up table (LUT) Partial products Interpolation Algorithmic • Most methods use a frequency synthesizer Phase Angle to Wave Shape Conversion Waveform Out Frequency Synthesizer Phase Angle Modulation copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 10 Phase Accumulator Design Sample Clock • “Direct Digital Synthesis” • Essentially integrates phase increment • Increment value may be modulated – Frequency and PSK modulation • Binary Angular Measure (BAMs) – Most significant bit = π ∆ Phase (phase increment) copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 11 Waveform Synthesis by LUT • Phase resolution limited • Arbitrary waveshapes • Sampled waveshape must be band-limited • Complex requires 2 lookups • fosc= fs /4 special case Read Only Memory Data Waveform Out Addr Phase Angle (BAMs) copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 12 Using Symmetry to Extend LUT Phase Resolution remaining bits - + + + + - - Phase MSB’s 00 01 10 11 N 0 0 N N 0 0 N Count sequence 0 N N 0 0 N N 0 Q1 Sin LUT Q1 Sin LUT I Q MSB MSB-1 copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 13 Waveform Synthesizer plus Multiplier • Obvious Solution • Separate into functional parts Re[Sk] Im[Sk] • Treat each part independently sin(ωct) ωc cos(ωct) Numerically Controlled Oscillator copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 14 Look Up Table Modulator a Cos(φ) signal -4 -3 -2 -1 0 1 2 3 000 -4 -3 -2 -1 0 1 2 3 001 -2.8 -2.1 -1.4 -0.7 0 0.7 1.4 2.1 010 0 0 0 0 0 0 0 0 phase 011 100 2.8 -4 2.1 3 1.4 2 0.7 1 0 0 -0.7 -1 -1.4 -2 -2.1 -3 101 2.8 2.1 1.4 0.7 0 -0.7 -1.4 -2.1 copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 110 0 0 0 0 0 0 0 0 111 -2.8 -2.1 -1.4 -0.7 0 0.7 1.4 2.1 15 Partial Products Modulator A[1:0] A[3:2] A[5:4] A[7:6] n+8 6-LUT 6-LUT <<2 <<4 6-LUT 6-LUT <<2 Phase[3:0] copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 16 Distributed Arithmetic Modulator n+8 I[0] Q[0] 6-LUT I[1] Q[1] 6-LUT I[2] Q[2] 6-LUT I[3] Q[3] 6-LUT <<1 <<2 <<1 Phase[3:0] copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 17 Distributed Arithmetic Modulator (Serial Form) serial inputs <<1 I[3], I[2], I[1], I[0] Q[3], Q[2], Q[1], Q[0] n+8 6-LUT Phase[3:0] copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 18 CORDIC Modulator Iout = Iin cosφ - Qin sinφ Qout = Qin* cosφ + Iin sinφ Signal In Iin Qin CORDIC Rotator Iout Modulated Qout Signal Out Phase Accumulator copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 19 CORDIC Algorithm Explained • Coordinate rotation in a plane: Q x’ = xcos(φ) - ysin(φ) y’ = ycos(φ) + xsin(φ) • Rearranges to: x’ = cos(φ) [x - ytan(φ)] y’ = cos(φ) [y + xtan(φ)] sinφ φ I cosφ copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 20 CORDIC Structure x0 y0 >>0 >>0 ± >>1 >>1 >>3 xn ± >>3 ± const sign ± >>4 ± const sign ± yn const sign ± ± const sign >>2 ± >>4 ± ± ± const sign ± ± >>2 z0 ± zn copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 21 Digital Filtering • Many Constant Multipliers • Delay Queues • Products Summed • Advantages – No tolerance drift – Low cost – precise characteristic x[k] C0 Z-1 Z-1 Z-1 C1 Ci-2 Ci-1 y[k]=Σx[k-i]•Ci copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 22 Distributed Arithmetic Filter C0 Scaling Accum Shift Reg <<1 C1 Shift Reg C2 Shift Reg C3 Shift Reg Addr 0000 0001 0010 0011 D a ta 0 C0 C1 C0 + C1 1110 1111 C 1+ C 2+ C 3 C 0 + C 1+ C 2+ C 3 copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 23 Take Advantage of Symmetry X[k] • Real filters are symmetric • Add bits with like coef’s before filtering • Uses Serial Adders • Halves taps x[k]+x[k-6] SREG SREG x[k-1]+x[k-5] SREG SREG x[k-2]+x[k-4] SREG SREG x[k-3] SREG copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 24 Decimating FIR Filters • Low pass filter then discard samples • Keep only every 4th output Yn+0=akC0 +ak-1C1 +ak-2C2 + ak-3C3 +ak-4C4 +… Yn+4= ak+4C0 +ak+3C1 +ak+2C2 +ak+1C3 +akC4 +… Yn+8=ak+8C0 +ak+7C1 +ak+6C2 +ak+5C3 +ak + 4C4 +… • reduces to n parallel filters fed every nth sample • sub-filter results summed to get result copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 25 128 tap 8:1 Decimating FIR Filter a0,a8,a16... 16 Tap FIR (c0,c8…) a1,a9,a17... 16 Tap FIR (c1,c9…) a2,a10,a18... 16 Tap FIR (c2,c10…) 40 MHz Input 16 Tap FIR (c3,c11…) 16 Tap FIR (c4,c12…) 5 MHz Output 16 Tap FIR (c5,c13…) 16 Tap FIR (c6,c14…) a7,a15,a23... 16 Tap FIR (c7,c15…) copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 26 Decimating FIR Filter Reduction 40 MHz Input 5 MHz Load FIR filters without scaling accumulators a0,a8,a16... 16 Tap FIR (c0,c8…) a1,a9,a17... 16 Tap FIR (c1,c9…) a2,a10,a18... 16 Tap FIR (c2,c10…) Scaling Accum <<1 16 Tap FIR (c3,c11…) 16 Tap FIR (c4,c12…) 16 Tap FIR (c5,c13…) 16 Tap FIR (c6,c14…) a7,a15,a23... 5 MHz Output 16 Tap FIR (c7,c15…) copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 27 Multiplier-less Filtering • Boxcar or Moving Average filter • FIR with unity coefficients 0 • Nulls at Fs/N x[k] -1 Z -5 -10 -15 -20 -25 -30 -35 Z-1 Z-1 N Boxcar response (dB) with N=10 Fs/2 y[k]=Σ x[k-i] i=1 copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 28 Cascaded Integrator-Comb Filters • • • • Response same as M cascaded N*R boxcar filters High order multiplier-less interpolation or decimation Constant response relative to decimated sample rate Use small FIR to shape response + Z-1 + + Z-1 + + Z-1 + M Integrator section (poles) R↓ + Z-N - + Z-N - + Z-N - M Comb sections (zeros) copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 29 Frequency Sampling with CIC 0 2fc -5 10 Output from clean-up filter fc 15 20 25 30 35 F0 2*F0 3*F0 4*F0 40 F0 =FS /R CIC (M=1, N=1,R=10) copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 30 Half Band Filters • Special case of FIR filter • Nearly half of the coefficients are zero • Response is antisymmetric about Fs/4 • 15th order half-band filter is only 5 taps Ci = [ -15, 0, 84, 0, -296, 0, 1251, 2048, 1251, 0, -296, 0, 84, 0, -15] copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 31 Comparison to dedicated digital modulator chips • • • • • • Performance similar to dedicated chips Can be tailored to exact requirements Other logic can be integrated into chip Filtering in dedicated chips sometimes better Some development required FPGA generally cheaper than Digital Modulator Chips copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 32 Comparison to DSP Micros • • • • Higher performance than DSP micro Higher integration Cost is comparable considering performance Hardware vs. software development copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 33 Design Considerations • Floorplanning required for performance and density • Macro generator tools simplifying design task • Use instantiation in logic synthesis copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 34 Other Considerations • Transmitter – filter needed to mimimize ISI – Carrier can be coordinated with symbol rate • Receiver – filter for noise rejection, correction of channel distortion – Accurate phase reference for carrier needed – Timing normally recovered from signal – Coordinate IF, sample frequency, symbol frequency copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 35 Summary • Approach depends on performance, resolution and size • Competes with dedicated digital modulator chips • Can be tailored to exact requirements • Each design has potential for hardware shortcuts copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 36 Resources • FPGA Vendor Application Notes – Xilinx web page http://www.xilinx.com – DSP applications notes • Tools – DSP toolbox for Xilinx – Vendor DSP Macro Generators • Consulting services, Training, etc. – Andraka Consulting Group, Inc – 401/884-7930 – web page: http://users.ids.net/~randraka copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 37 References • M. Frerking, “Digital Signal Processing in Communication Systems”, Kluwer Academic Publishers, 1994 • R. Andraka, “A survey of CORDIC algorithms for FPGA based computers”, ACM, 1998 • E.B. Hogenauer, “An Economical Class of Digital Filters for Decimation and Interpolation”, IEEE Trans on ACSSP vol ASSP-29 no.2 April 1981 • A. Peled & B. Liu, “A New Hardware Realization of Digital Filters”, IEEE trans on ACSSP, vol. ASSP-22 no. 6, December 1974. ASSP and other transactions are a goldmine of hardware implementations copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 38 Example: North American TDMA Digital Cellular π/4 DQPSK Modem copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 39 Specifications • Differential phase encoding Q – ±π/4 or ±3π/4 phase offset • Non-coherent receiver • 28.6 kbps I π/4 DQPSK Constellation copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 40 Transmitter • Differential coding • Uses 2 bits per symbol • I and Q take values 0, ±1, ±1/√ √2 • Filter interpolates to upsample • Similar to QAM modulator example copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 41 Receiver • Non-coherent differential detection • Digital demodulation from IF to baseband • Equalizers left out of example • Nyquist filter is RRC w/ 35% excess BW • 48.6 kbps (24.3 kbaud) copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 42 π/4 DQPSK Demodulator Symbol Delay RRC Filter Modulated data RRC Filter cos(ωct) -sin(ωct) NCO I CMULT Slicer Recovered data Q Symbol Timing copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 43 Receiver • Sample baseband at 4x baud rate = 97.2Khz • Bit serial detection logic – 16 bit baseband I and Q = 16x bit clock • Sample IF using bit clock then decimate • IF center frequency = bit clock/4 = 16*B – simplifies mixer copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 44 Down-Converter and Filters • • • • • DAC sampled at bit rate IF at bit rate/4 64 Tap Matched NCO sequence is 1,-j,-1,j Filter Modulated Decimate 16:1 data 64 Tap Decimating filter reduces Matched to 16 parallel filters Filter • Use 4 tap filters cos(ωct) -sin(ωct) – Net filter is 64 taps NCO copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved Re[Ak] Im[Ak] 45 Reduced Down-Converter and Filters Sampled IF input 4 Tap FIR (C0,C16,C32,C48)*(1) 4 Tap FIR (C1,C17,C33,C49)*(-j) I Output 4 Tap FIR (C2,C18,C34,C50 )*(-1) 4 Tap FIR (C3,C19,C35,C51)*(j) 16 step sequencer (drives load enables) 4 Tap FIR (C4,C20,C36,C52)*(1) 4 Tap FIR (C5,C21,C37,C53)*(-j) 4 Tap FIR (C6,C22,C384,C54 )*(-1) 4 Tap FIR (C7,C23,C39,C55)*(j) Extended to 16 filters copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved To Q Output 46 Further Filter Reduction • Each 4 tap filter is 4 LUT and Scaling Acc • Move SA to end of adder tree – Each filter is a 4 LUT with delay queue • Mixer and 64 tap filter (I and Q) is 152 CLBs • Filter bit rate is a low 1.55 MHz • Present data LSB first – Allows serial LSB first output from filter copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 47 Detector • Differential Detection • x[k] • x*[k-1] yields sin(φk- φk-1 ),cos(φk- φk-1 ) • Implement CMULT with scaling accumulators • 4 Samples per symbol – Delay is 64 clocks fixed • Symbol timing selects which sample to output Symbol Delay I CMULT Slicer Recovered data Q Symbol Timing copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 48 Complex Multiply <<1 I Q Cos(φk- φk-1) =x∆k SREG SREG <<1 SREG sgn Slicer sgn LUT Recovered Di-bit Sin(φk- φk-1) = y∆k SREG copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 49 Symbol Timing • Error statistic controls state machine • e= x∆k2 + y∆k2 – proportional to magnitude squared – Use larger + 1/2 smaller approx • Compute for each sample • Compare to previous sample – if difference is less than a threshold no change to sample point – otherwise if current larger, advance sample point – or if previous larger, retard sample point • 29 CLBs copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 50 Modem Implementation • • • • 1.55 MHz master clock (slooowww) Entire design in 3/4 XCS30-3 (or 4013E-4) Device cost under $10 Performance compares favorably with heavily loaded TMS320C50 – – – – TI DSP can only implement 20 Tap filters, Tx and Rx not concurrent in TI DSP TI DSP design in/out is complex baseband, not IF TI DSP design can’t handle equalizer or viterbi decoder copyright 1998,1999,2000 Andraka Consulting Group, Inc. All Rights reserved 51