21.3: Adaptive Equalization and Data Recovery in a Dual-Mode (PAM2/4) Serial Link Transceiver

Vladimir Stojanovic1,2, Andrew Ho1, Bruno Garlepp1, Carl Werner1, Fred Chen1, Jason Wei1, Elad Alon1,2, Jared Zerbe1 and Mark Horowitz2
1 Rambus Inc., 2 Stanford University

Backplane channels
• Same backplane
• Loss is variable
• Custom fit
• Equalization
• Modulation
• Lots of different links
• Huge control overhead
• Nice to have link adapt itself to the channel
• Adapt both coefficients and modulation/equalization type
[Figure: attenuation [dB] vs. frequency [GHz] for four channels: 9" FR4; 26" FR4; 9" FR4, via stub; 26" FR4, via stub]

Backplane channel – variable in time
• Changes in time
• Temperature
• Humidity
• Slow but significant
• Need to re-adjust the equalizer
• Can do with slow, simple hardware
[Figure: channel response, dB vs. GHz]
G. Sheets, J. D'Ambrosia, "The Impact of Environmental Conditions on Channel Performance," DesignCon 2004

Outline
• Show adaptive and reconfigurable link
• Minimum hardware overhead
• Link equalization and modulation
• Dual-mode (PAM2/PAM4) link
• Making the link adaptive
• Results

Transmit equalization – headroom constraint
[Figure: Tx data driving a transmit filter with anticausal and causal taps under a peak power constraint; output current I_eq on outP/outN into a 50Ω channel. Plot: attenuation [dB] vs. frequency [GHz], unequalized and equalized]
• Amplitude of equalized signal depends on the channel
• Transmit DAC has limited voltage headroom
• Tx Eq attenuates the signal
• Best to combine with feedback equalization

Including feedback equalization
[Figure: pulse response, amplitude vs. symbol time, with the portion handled by feedback equalization marked]
• Feedback equalization (DFE)
• Subtracts error from input
• No attenuation
• Problem with DFE
• ISI must be causal
• Problem - latency in the decision circuit
• Receive latency + DAC settling < bit time
• Can increase allowable time by loop unrolling
• Receive next bit before the previous
  is resolved

One-tap DFE with loop unrolling
[Figure: pulse response with main cursor 1 and post-cursor α; received levels +1+α, +1-α, -1+α, -1-α; slicer thresholds +α and -α. Input x_n feeds two slicers clocked by dClk, producing d_n | d_{n-1} = 1 and d_n | d_{n-1} = 0; d_{n-1} selects between them]
• Instead of subtracting the error
• Move the slicer level to include the noise
• Slice for each possible level, since previous value unknown
K.K. Parhi, "High-Speed architectures for algorithms with quantizer loops," IEEE International Symposium on Circuits and Systems, May 1990

Pulse amplitude modulation
• PAM2 (Binary)
• 1 bit / symbol
• Symbol rate = bit rate
• PAM4
• 2 bits / symbol
• Symbol rate = bit rate/2
[Figure: PAM2 levels labeled 1, 0; PAM4 levels labeled 00, 01, 11, 10]

Hardware re-use: Dual-mode receiver
[Figure: input "in" feeds three slicers at thresholds thresh(+), 0 and thresh(-), clocked by dClk, producing lsb(+), msb and lsb(-); multiplexers controlled by "prDFE enable" select the outputs. Inset: each slicer is a pre-amp with offset (currents I + I_thresh/2 and I - I_thresh/2) followed by a clocked comparator]
• PAM4
• PAM2
Hardware re-use: Dual-mode receiver
[Figure: dual-mode receiver with slicers at thresh(+), 0 and thresh(-), outputs lsb(+), msb and lsb(-), and "prDFE enable" multiplexers, clocked by dClk]
• PAM2 with loop-unrolled DFE tap
• Leverage multi-level properties of signals in loop-unrolling
• Re-use PAM4 receiver hardware (slicers and CDR)

Partial response CDR
[Figure: eye with signal levels +1+α, +1-α, -1+α, -1-α; transitions labeled (0,1), (1,0), (1,1), (0,0); edge thresholds +α and -α]
• Four signal levels
• Offset edge samplers for transitions with ISI
• Otherwise timing error
• Need to filter edges – similar to PAM4

Dual-mode CDR
[Figure: edge samplers at thresh(+), 0 and thresh(-), clocked by eClk, producing edge_n(+), edge_n(0) and edge_n(-); edge filter driven by lsb_n(+), lsb_{n-1}(+); msb_n, msb_{n-1}, msb_{n-2}; lsb_n(-), lsb_{n-1}(-); PAM4 / PAM2 prDFE mode select; output: filtered early/late]
• PAM4
J. Zerbe et al, "Design, Equalization and Clock Recovery for a 2.5-10Gb/s 2-PAM/4-PAM Backplane Transceiver Cell," IEEE Journal Solid-State Circuits, Dec. 2003.
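The loop-unrolled one-tap DFE used for PAM2 mode can be illustrated with a small behavioural model. This is only a sketch under stated assumptions, not the transceiver's implementation: symbols are ±1, the channel is a main cursor of 1 plus one post-cursor tap α, noise is bounded and uniform, and all function names are hypothetical.

```python
import random


def make_samples(n_symbols, alpha, noise_peak, seed=0):
    """Return (symbols, samples): +/-1 symbols through an assumed channel with
    main cursor 1 and one post-cursor tap alpha, plus bounded uniform noise."""
    rng = random.Random(seed)
    symbols, samples = [], []
    prev = -1  # assumed symbol preceding the sequence
    for _ in range(n_symbols):
        d = rng.choice((-1, 1))
        x = d + alpha * prev + rng.uniform(-noise_peak, noise_peak)
        symbols.append(d)
        samples.append(x)
        prev = d
    return symbols, samples


def plain_slicer(samples):
    """Single slicer at zero: ignores the post-cursor ISI."""
    return [1 if x > 0 else -1 for x in samples]


def loop_unrolled_dfe(samples, alpha, first_prev=-1):
    """Slice every sample against both +alpha and -alpha, then pick the
    result that matches the previous decision once it is known."""
    prev = first_prev
    decisions = []
    for x in samples:
        if_prev_hi = 1 if x > +alpha else -1  # valid if d[n-1] = +1
        if_prev_lo = 1 if x > -alpha else -1  # valid if d[n-1] = -1
        d = if_prev_hi if prev == 1 else if_prev_lo
        decisions.append(d)
        prev = d
    return decisions


def count_errors(decisions, symbols):
    return sum(a != b for a, b in zip(decisions, symbols))


if __name__ == "__main__":
    symbols, samples = make_samples(2000, alpha=0.6, noise_peak=0.5)
    print("plain slicer errors: ",
          count_errors(plain_slicer(samples), symbols))
    print("loop-unrolled errors:",
          count_errors(loop_unrolled_dfe(samples, 0.6), symbols))
```

In hardware both comparisons happen in parallel and only the final selection waits on the previous bit, which is what relaxes the feedback timing. The sequential loop here models the decisions, not the timing.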
Dual-mode CDR
[Figure: edge samplers at thresh(+), 0 and thresh(-), clocked by eClk, producing edge_n(+), edge_n(0) and edge_n(-); edge filter driven by lsb_n(+), lsb_{n-1}(+); msb_n, msb_{n-1}, msb_{n-2}; lsb_n(-), lsb_{n-1}(-); PAM4 / PAM2 prDFE mode select; output: filtered early/late]
• PAM2 with loop-unrolled DFE tap
• Leverage multi-level properties of signals in loop-unrolling
• Re-use PAM4 receiver hardware (slicers and CDR)

Adaptation with minimum overhead
[Figure: Tx data through the channel to the receiver; data samplers (dClk) produce Rx data, edge samplers (eClk) feed the CDR, and an adaptive sampler (aClk) compares the input against dLev to produce the error; an adaptive macro generates thresholds and tap updates]
• Adaptive sampler
• Generates the error signal at reference level (dLev)
• Monitors the link
• Adjustable voltage and time reference
• On-chip sampling scope
• Can replace any other sampler - calibration

Dual-loop adaptive algorithm
• Data level reference loop
  $dLev_{n+1} = dLev_n - \Delta_{dLev}\,\mathrm{sign}(e_n), \quad d_n > 0$
• Equalizer loop
  $w^{k}_{n+1} = w^{k}_{n} + \Delta_{w}\,\mathrm{sign}(d_{n-k})\,\mathrm{sign}(e_n)$
• Scale the equalizer - output Tx constraint
[Figure: eye diagrams for the initial eye, mid-way equalized and equalized cases, with dLev moving from dLev_init through dLev_mid to dLev_end; initial peak-to-peak error marked; error taken as sign(e_n | d_n > 0)]

Partial response adaptation - start
• Extend data filter by one bit (msb_n, msb_{n-1})
(a) Update loops only on (msb_n, msb_{n-1}) = (1,1)
• Finds dLev(1,1) - "1+α"
(b) Update loops only on (msb_n, msb_{n-1}) = (0,1)
• Finds dLev(0,1) – "1-α"

Partial response adaptation - end
$2\alpha = dLev(1,1) - dLev(0,1)$
• Iterate α finding and equalization loops
• msb_n, msb_{n-1} filter tolerates one tap post-cursor ISI
Fully adaptive dual-mode link
[Figure: die photo with labeled blocks: Config Registers, CDR Logic, Phase Mixers, Receiver, Reflection Canceller, PLL, Transmitter, Backchannel RX, Backchannel TX]
• PAM2/PAM4
• 1-10Gb/s
• 0.13µm
• 40mW/Gb/s
• Reconfigurable dual-mode PAM2/PAM4 link
• Adaptive equalization
• Transmit and receive equalization
• DFE with loop unrolling

Dual-loop convergence – 4 tap example
PAM2, 5Gb/s, 4 taps Tx equalization
[Figure: dLev [mV] vs. number of updates; tap weight [mV] vs. number of updates for main tap, pre1, post1 and post2]
• Reference loop tracks the signal
• Once reference loop provides reliable error
• Equalizer taps converge quickly

Dual-loop convergence - stability
[Figure: dLev [mV] and tap weights [mV] vs. number of updates for three cases: dLev speed 1x - eq speed 1x; dLev speed 10x - eq speed 1x; dLev speed 1x - eq speed 10x]
• Hard to estimate analytically
• Experimental results show
• Both loops are stable within wide range 0.1 – 10x of relative speeds

Improvements with loop-unrolling
[Figure: on-chip scope plots, voltage vs. time [ps], for three cases: unequalized; fully transmit equalized; transmit equalized with one tap DFE. Color scale: log10(voltage probability distribution)]
• Signal as seen by the receiver (on-chip scope)

Conclusions
• PAM2 with one-tap DFE similar to PAM4
• Need both to cover all the channels
• Dual-mode link - front-end hardware re-use
• 1-10 Gb/s with 40 mW/Gb/s
• Adaptation with one extra sampler
• Dual-loop algorithm
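The dual-loop algorithm named in the conclusions can be explored with a small simulation of the two sign-based update rules. This is a minimal sketch under several assumptions that are not taken from the slides: the error is e_n = dLev - x_n and is only available when d_n > 0 (one adaptive sampler at +dLev), the main tap is held at 1 instead of rescaling the equalizer for the Tx peak constraint, only post-cursor taps adapt, and the channel taps, step sizes, Gaussian noise and function names are all hypothetical.

```python
import random


def sign(v):
    return 1 if v > 0 else -1


def combined_response(w, channel):
    """Convolve equalizer taps with the channel pulse response."""
    out = [0.0] * (len(w) + len(channel) - 1)
    for k, wk in enumerate(w):
        for j, hj in enumerate(channel):
            out[k + j] += wk * hj
    return out


def dual_loop_adapt(channel, n_post, n_symbols,
                    step_dlev, step_w, noise_sigma, seed=1):
    """Run the data-level loop and the equalizer loop together.

    Returns (dlev, w) where w[0] is the fixed main tap and w[1:] are the
    adapted post-cursor taps.
    """
    rng = random.Random(seed)
    w = [1.0] + [0.0] * n_post
    dlev = 0.0
    # history[i] holds d[n-i]; long enough for equalizer plus channel memory
    history = [-1] * (len(w) + len(channel) - 1)
    for _ in range(n_symbols):
        d = rng.choice((-1, 1))
        history = [d] + history[:-1]
        # Received sample: data through equalizer and channel, plus noise
        x = sum(w[k] * channel[j] * history[k + j]
                for k in range(len(w)) for j in range(len(channel)))
        x += rng.gauss(0.0, noise_sigma)
        if d > 0:  # data filter: update only on d_n > 0
            e = sign(dlev - x)
            # Data level reference loop
            dlev -= step_dlev * e
            # Equalizer loop: w_k += step * sign(d[n-k]) * sign(e_n)
            for k in range(1, len(w)):
                w[k] += step_w * history[k] * e
    return dlev, w


if __name__ == "__main__":
    channel = [1.0, 0.4, 0.15]  # assumed pulse response: main, post1, post2
    dlev, w = dual_loop_adapt(channel, n_post=2, n_symbols=40000,
                              step_dlev=0.002, step_w=0.001,
                              noise_sigma=0.1)
    print("dLev:", round(dlev, 3))
    print("taps:", [round(t, 3) for t in w])
    print("equalized response:",
          [round(c, 3) for c in combined_response(w, channel)])
```

Changing `step_dlev` relative to `step_w` gives a crude way to probe how the relative loop speeds interact in this toy model; it says nothing quantitative about the measured hardware.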