Model Predictive Control Equalization for High-Speed I/O Links

by

Amr A. Suleiman

Submitted to the Department of Electrical Engineering and Computer Science on May 15, 2013 in partial fulfillment of the requirements for the degree of Master of Science in Electrical Engineering at the Massachusetts Institute of Technology

June 2013

© 2013 Massachusetts Institute of Technology. All Rights Reserved.

Signature of Author: Department of Electrical Engineering and Computer Science, May 15, 2013

Certified by: Vladimir Stojanovic, Associate Professor of Electrical Engineering, Thesis Supervisor

Accepted by: Leslie A. Kolodziejski, Professor of Electrical Engineering, Chair of the Committee on Graduate Students

ABSTRACT

The demand for bandwidth in chip-to-chip communication has been increasing as the industry demands higher quantity and quality of information. Serial links provide a suitable architecture for this kind of transmission, because of speed, power and area limitations on parallel links. However, the data rate is limited by inter-symbol interference (ISI) and channel discontinuities. Equalization schemes have been developed to cope with these problems. These equalizers compensate for the loss of the transmission medium, equalize for reflections and extend the channel's maximum data rate. In this work, we formulate a new, nonlinear and time-variant, transmitter equalization method based on the Model Predictive Control (MPC) algorithm.
MPC is a class of control algorithms in which the current control action is obtained by solving, perhaps approximately, an online open-loop optimal control problem. One important advantage of MPC in a peak-power constrained link environment is its ability to cope with hard constraints on controls and states. Knowing the state of the channel enables a very fine nonlinear equalization. We utilize this flexibility to create various MPC formulations that control the entire eye mask, the receive signal dynamic range, and the required quantization. Our MPC equalization significantly outperforms traditional transmitter techniques such as linear feed-forward and Tomlinson-Harashima equalizers, and gets very close to the optimized decision-feedback equalization at lower transmitter resolutions. We also describe possible complexity reduction techniques that enable efficient implementation of our MPC algorithm in hardware.

Thesis Supervisor: Vladimir Stojanovic
Title: Associate Professor of Electrical Engineering

Acknowledgement

First of all, I give my great and total respect and thanks to my supervisor Prof. Vladimir Stojanovic. He always gave me his time and knowledge, his patience and tolerance, and of course his superior technical guidance. This really helped me to work hard on this topic and fully understand what I'm doing.

I would like to thank my group members and friends, Ranko Sredojevic, Chen Sun, and Michael Georgas for their great help, support and advice. And a special thanks to my friend Yu-Hsin Chen, without whom I would have spent days and nights solving the devastating hardware design problems that I faced.

I also wish to thank our project collaborators at UC Berkeley, UCLA, Oregon State University, and Intel. My special thanks to Prof. Elad Alon at UC Berkeley for his useful hints and reviews, and to Intel Research for funding the project.
I would also like to thank the Egyptian Ministry of Finance for the funding support through the ODGE endowed fellowship. This fellowship is a really great opportunity for Egyptians to pursue their graduate studies at MIT.

Last but not least, my deepest gratitude and love to my family: Mom, Dad and my two brothers. I faced some difficult circumstances throughout the last year, and despite the long physical distance between us, they were always there for me. I'm lucky to have a mom and dad like mine, may God bless them.

Coming to MIT, and especially the EECS department, from Egypt was a challenge for me. A challenge that started the minute I decided to apply to MIT, and which - of course - will continue through my PhD pursuit. I would like to thank everyone who believed in me, and always pushed me through difficulties to reach my dreams. A special thanks to my undergrad and grad advisor at Cairo University, Prof. Serag Habib.

Glossary

ASIC  Application Specific Integrated Circuit
BER   Bit Error Rate
DAC   Digital to Analog Converter
DFE   Decision-Feedback Equalizer
DRAM  Dynamic Random Access Memory
DS    Direct Search
DSL   Digital Subscriber Line
FFE   Feed Forward Equalizer
FIR   Finite Impulse Response
I/O   Input/Output
ISI   Inter Symbol Interference
LMS   Least-Mean-Square
LQG   Linear Quadratic Gaussian
LUT   Look-Up Table
MMSE  Minimum Mean-Squared-Error
MPC   Model Predictive Control
RAM   Random Access Memory
RHC   Receding Horizon Control
SSR   State Space Representation
TH    Tomlinson-Harashima
UI    Unit Interval
ZFE   Zero Force Equalizer

Contents

Acknowledgement
Glossary
1 Introduction
  1.1 Goal and Motivation
  1.2 Thesis Overview
2 Equalization in High Speed Links
  2.1 Eye Diagram Specifications
  2.2 Feed Forward Equalizer
  2.3 FFE+DFE
  2.4 Precoding Schemes
  2.5 Summary
3 Model Predictive Control
  3.1 History
  3.2 Problem Formulation
4 MPC Equalizer
  4.1 Equalizer Model
  4.2 Mathematical Formulation
  4.3 Channel output with MPC equalizer
  4.4 Two-eye channel output
  4.5 Evaluation results
  4.6 Full Eye-mask Control
  4.7 Implementation Issues
  4.8 Summary
5 Simplified MPC (Direct Search)
  5.1 Algorithm
  5.2 Evaluation results - with quantization
  5.3 Summary
6 Implementations
  6.1 RAM-DAC Equalizer
  6.2 Online Channel State Estimator
7 Conclusion and Future Work
A MATLAB Codes
  A.1 MATLAB CODE FOR eyeMax ALGORITHM
  A.2 MATLAB CODE FOR CONVEX MPC EQUALIZATION
Bibliography

List of Tables

4.1 Eye opening height values (in millivolts) for different test channels using different equalization techniques: Linear FFE, MPC algorithm with 1-eye & 2-eye, Linear FFE at transmitter and DFE at receiver with 1 & 3 taps, Tomlinson-Harashima precoding technique with equivalent 3- and 16-tap feedback. Test channel pulse responses are depicted in Fig. 4.6.
4.2 Eye opening width (in UI) for different test channels using different equalization techniques: Linear FFE (20-tap), MPC algorithm with 1-eye and 2-eye. Test channel pulse responses are depicted in Fig. 4.6.
5.1 Quantization effect on various equalization schemes, comparing from both eye height and eye width points of view.

List of Figures

2.1 (a) Frequency response of different channels within the same backplane: FR4 material, 20" and 32" trace length, middle (M) and bottom (B) routing layers. (b) An example of a dispersed received pulse due to a band-limited channel.
2.2 Eye diagram specifications: Upper plot shows the conventional eye diagram. Lower plot shows the sampled data output.
2.3 (a) Analog FIR implementation of transmit pre-emphasis FFE with peak-power limit. (b) The high pass characteristics of FFE flatten the overall equalized channel frequency response. For example, in this case of FR4 20" backplane channel with bottom routing layer, 5-tap FFE flattens the frequency response up to 4 GHz. This means a maximum data rate of 8 Gbps.
2.4 Eye height (upper plot) and eye width (lower plot) of channel output with linear FFE, comparing ZFE and eyeMax algorithms in tap-weight calculation. The channel is a 3-inch backplane test channel at 8 Gbps data rate; its pulse response is shown in the upper left corner of Fig. 4.6.
2.5 Block diagram of a 3-tap DFE. W1 through W3 are the first three post-cursor ISI tap coefficients, and T represents a one-unit delay.
2.6 (a) Tomlinson-Harashima precoding block diagram. (b) Eye diagram of the test channel (3-inch backplane at 8 Gbps, Fig. 4.6) equalized with TH precoder, with 3-tap feedback and N = 2. Although this is a PAM-2 transmission, the output has three eye openings because of wrapping-around of the TH control signal.
3.1 Principle of model predictive control. The plots show a sample of MPC controller output (v), and the plant output (y) tracking the defined trajectory (u) which represents optimization problem constraints. A simple block diagram defines signals on the controller and the plant.
4.1 Modeling the transmission channel with state-space representation (SSR).
4.2 MPC equalizer block diagram with input bit stream window example. A dynamic model for the channel must be known (the channel pulse response).
4.3 (a) Channel output using eyeMax linear 20-tap FFE. (b) Channel output using MPC equalization, with eye constraints EH = 0.24 and η = 1. Channel pulse response is shown in the upper left corner of Fig. 4.6.
4.4 Channel output using MPC equalizer with the same eye height as in Fig. 4.3, but with different residual ISI constraints: (a) η = 1 (unconstrained), small eye width of 0.27 UI. (b) η = 0.1, eye width = 0.59 UI. (c) η = 0, large eye width, 0.67 UI. Channel pulse response is shown in the upper left corner of Fig. 4.6.
4.5 Channel output with two eye openings, equalized by convex MPC technique without quantization (residual ISI is constrained). Test channel pulse response is shown in the upper left corner of Fig. 4.6.
4.6 Pulse responses of test channels used in equalization performance comparison. Every channel is sampled at the indicated data rate.
4.7 3-inch backplane test channel pulse response at 8 Gbps. The solid lines are samples at maximum eye height phase, while dashed and dotted lines are for early and late sampling phases respectively.
4.8 Eye diagram of MPC equalized channel (a) without eye-shaping constraints (b) with eye-shaping constraints. Eye width improvement is achieved by trading off the eye height.
5.1 Flow graph of the simplified algorithm for equalization (DS approach).
5.2 Simplified flow graph for two eye openings, using DS approach.
5.3 Quantization effect on eyeMax FFE, convex MPC and Direct Search equalizers for test channel. (a) One eye opening scheme. (b) Two eye openings scheme. Channel pulse response is shown in the upper left corner of Fig. 4.6.
6.1 Transmitter architecture using RAM-DAC equalization with input bits history.
6.2 Simulation results for uniqueness of MPC equalizer outputs. (a) Unquantized outputs. (b) 4-bit quantized outputs. (c) 6-bit quantized outputs.
6.3 Transmitter architecture using RAM-DAC equalization with channel state feedback.
6.4 3-inch backplane channel pulse response; the significant ISI taps are marked with arrows.
6.5 Block diagram of online channel state estimation architecture for DS MPC equalizer.
6.6 A 3-D plot of the eye height of the 3-inch backplane channel output versus the equalizer output resolution and the channel model resolution.
A.1 L, and E matrices used in the optimization function of the convex MPC solver.

Chapter 1 Introduction

1.1 Goal and Motivation

As CMOS technology scales down, modern systems like server and router backplanes, Ethernet and even processor-memory interfaces target data rates well into the multi-Gbps range.
While circuit technology scaling might handle such speeds, the performance of these chip I/O links is limited by the bandwidth of the wire communication channel. The dielectric and skin-effect induced losses introduce low-pass channel characteristics, resulting in inter-symbol interference (ISI) in the received data and complicating detection of the transmitted information. Additionally, impedance mismatches and/or discontinuities from chip and package parasitics, connectors and backplane vias increase the ISI and further limit the achievable data rates.

Tight power budgets and device speed limitations impose significant challenges to direct implementation of a number of well-known complex digital signal processing schemes developed for use in Ethernet, DSL and disk read channels [1]. Instead, mixed-signal transmit and receive equalization techniques have been widely used in high-speed links to compensate for the ISI. Moreover, in certain situations where many links are required on a centralized chip, such as router chips [2], non-retimed switches, etc., asymmetric link design is preferable since it reduces the link complexity and power at a convergence point [3]. Similarly, in processor-memory interfaces, asymmetric links are desirable due to a mismatch in transistor performance between the fast logic process and the slow DRAM process. In these asymmetric link situations, it is often desirable to centralize signal-processing and clocking complexity on either the receiver or the transmitter side.

In this work, we introduce a new nonlinear transmit equalization method for asymmetric links, which requires no additional filtering at the receiver and outperforms traditional transmit equalization and precoding techniques. The new technique makes use of an advanced process control algorithm called Model Predictive Control (MPC) [4, 5, 6].
We formulate the link equalization problem as an MPC algorithm, and build on the main algorithm concepts to develop variations that fit well within real high-speed link implementation constraints.

1.2 Thesis Overview

This thesis is organized as follows:
• Chapter 2 presents an overview of different equalization schemes in high speed links.
• Chapter 3 introduces Model Predictive Control (MPC) theory.
• Chapter 4 is devoted to the idea of using MPC to equalize high speed links.
• Chapter 5 presents the Direct Search (DS) approach as a simplified solver for the MPC equalization scheme.
• Chapter 6 proposes some implementation ideas for MPC equalizers.
• Chapter 7 presents the conclusion and future work.

Chapter 2 Equalization in High Speed Links

An ideal transmission channel could propagate all frequency components without any loss. However, in reality, all electrical mediums have finite signaling bandwidth, as shown in Fig. 2.1-a. This low-pass characteristic disperses a clean transmitted pulse in time at the receiver, as shown in Fig. 2.1-b. The long tail of the pulse results in ISI, which limits the transmission data rate. Additionally, impedance mismatches and/or discontinuities from chip and package parasitics, connectors and backplane vias increase the ISI and further limit the achievable data rates. Fortunately, many equalization schemes have been developed at the transmitter side and/or the receiver side to cope with these problems. These equalizers generally compensate for the loss of the transmission medium, equalize for reflections and extend the channel's maximum data rate.

Figure 2.1: (a) Frequency response of different channels within the same backplane: FR4 material, 20" and 32" trace length, middle (M) and bottom (B) routing layers. (b) An example of a dispersed received pulse due to a band-limited channel.

2.1 Eye Diagram Specifications

Before diving into different equalization schemes, we define the main metrics for performance measurement. In high-speed links, the eye diagram is often used as the proxy for judging the quality of signal transmission. The ultimate goal of equalization is to create a reasonable eye opening at the receive decision circuit to achieve reliable transmission with a low bit error rate (BER). As shown in the upper plot of Fig. 2.2, the main parameters of the eye diagram are: eye height EH, eye width EW and voltage variation (also called residual ISI). Data dependent timing jitter is another way to measure the eye width, which is usually normalized to one unit interval (UI) [7]. Increasing the eye opening means increasing EH and/or EW.

Figure 2.2: Eye diagram specifications: Upper plot shows the conventional eye diagram. Lower plot shows the sampled data output.

The optimization algorithm we develop in this work has the flexibility to tailor the whole eye mask, i.e. the gradual decrease in the eye height away from the sampling point. For simplicity, we first illustrate the optimizer performance on the sampled eye diagram at the point of maximum eye opening. Initially, the objective is set to maximize EH and minimize the residual ISI without any direct constraints on the eye width EW. The lower plot in Fig. 2.2 shows how we view the eye opening in the optimization problem. The plot here is the sampled channel output versus time, assuming that samples are taken exactly at the maximum eye opening. The clean gap between samples represents EH, while the fuzz above and below represents the residual ISI.
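The sampled view of the eye described above can be illustrated with a short Python sketch. It is not taken from the thesis: the function name `sampled_eye_metrics` and the sample values are hypothetical, and it assumes PAM-2 signaling with one sample per bit taken at the point of maximum eye opening.

```python
def sampled_eye_metrics(samples, bits):
    """Return (eye_height, isi_high, isi_low) for PAM-2 samples.

    samples: channel output, one sample per bit at the eye centre (assumed).
    bits:    transmitted bits as +1 / -1, aligned with samples.
    """
    highs = [y for y, b in zip(samples, bits) if b > 0]
    lows = [y for y, b in zip(samples, bits) if b < 0]
    if not highs or not lows:
        raise ValueError("need at least one sample of each logic level")
    eye_height = min(highs) - max(lows)   # clean gap between the two levels
    isi_high = max(highs) - min(highs)    # fuzz above the gap
    isi_low = max(lows) - min(lows)       # fuzz below the gap
    return eye_height, isi_high, isi_low


# Illustrative (made-up) sampled outputs for an alternating bit pattern.
samples = [0.20, -0.15, 0.12, -0.22, 0.18, -0.13]
bits = [+1, -1, +1, -1, +1, -1]
eye_height, isi_high, isi_low = sampled_eye_metrics(samples, bits)
print(eye_height, isi_high, isi_low)
```

A negative `eye_height` in this sketch would indicate a closed eye, since some sample of one logic level crosses a sample of the other.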
2.2 Feed Forward Equalizer

One way to overcome channel ISI is to flatten the frequency response of the channel, which is usually done by a transmit pre-emphasis finite-impulse response (FIR) filter as shown in Fig. 2.3-a. Design of linear transmitter Feed-Forward Equalizers (FFE) involves mainly the choice of the FIR filter length (i.e. number of taps, post- and/or pre-cursors), and the calculation of each tap weight value. As shown in Fig. 2.3-b, due to the power limitation, a linear FFE compensates the low-pass characteristics of the channel by attenuating low frequencies in order to get a flat frequency response up to the Nyquist rate; this is the so-called Zero-Forcing FFE (ZFE). This is preferable in low-noise conditions. However, as shown, doing so results in an attenuated flat response and an attenuated eye opening. The more FFE taps are used, the flatter but also the more attenuated the frequency response becomes, which is one of the drawbacks of ZFE.

Figure 2.3: (a) Analog FIR implementation of transmit pre-emphasis FFE with peak-power limit. (b) The high pass characteristics of FFE flatten the overall equalized channel frequency response. For example, in this case of FR4 20" backplane channel with bottom routing layer, 5-tap FFE flattens the frequency response up to 4 GHz. This means a maximum data rate of 8 Gbps.

Minimum mean-squared-error (MMSE) equalizers trade off noise enhancement for residual ISI and are often a result of using an adaptive least-mean-square (LMS) algorithm [8, 9]. In peak-power constrained high-speed link transmitters, the optimal technique (i.e. the one that also minimizes BER) is to use a linear-programming approach for the equalizer tap calculation that maximizes the eye opening at the receiver, similar to [10] (eyeMax). The tap-weight calculation problem can be formulated as a large linear program, where the objective of the optimization problem is to maximize the worst-case eye opening (l1-norm solution) [11]. Assuming the equalized channel pulse response is ch_eq, which is the result of a convolution between the channel pulse response (ch) and the equalizer under design (tap vector w, with length h), a mathematical representation of the problem is shown below:

$$ ch_{eq} = ch * w \tag{1} $$

$$ \text{maximize:}\quad ch_{eq}(T_q) - \sum_{k} \lvert ch_{eq}(k) \rvert, \qquad k = 0 : L-1 \ (\text{except } T_q \text{ and } T_{DFE}) \tag{2} $$

$$ \text{subject to:}\quad \sum_{k} \lvert w(k) \rvert \le 1, \qquad k = 0 : h-1 \tag{3} $$

In this formulation, L is the equalized channel length, T_q is the main tap of the equalized channel, and T_DFE represents the potential decision-feedback equalization (DFE) taps at the receiver [12]. MATLAB code for this linear program formulation is provided in the Appendix, section A.1. The peak-power limit constraint can be defined in different ways; it can be applied explicitly as in equation (3), or the filter taps (w_k) can be normalized after maximizing the objective function. Both techniques are shown in the Appendix, and both give the same performance. Besides, as will be shown in the next section, the existence of a DFE at the receiver gives the optimizer some degree of freedom while solving the optimization problem, and results in a larger eye opening.

Figure 2.4 shows the eye opening height EH and width EW versus the number of FFE taps. It is clear that eyeMax gives a larger eye height. However, ZFE always results in a wider horizontal eye opening. As we will see shortly, providing more eye opening constraints for samples around the main sampling point can robustify the optimization algorithm and result in a wider eye opening, i.e. full eye mask control, while still achieving EH values superior to the ZFE algorithm.
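To make equations (1) to (3) concrete, the following Python sketch evaluates the worst-case eye objective for a given tap vector. It does not solve the linear program (the thesis's own solver is the MATLAB code in Appendix A.1); the function names, the toy pulse response and the tap values are all hypothetical, and `dfe_taps` is assumed to hold absolute indices into the equalized response.

```python
def convolve(a, b):
    """Full discrete convolution of two sequences (equation (1))."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def worst_case_eye(ch, w, main_tap, dfe_taps=()):
    """Objective (2): main cursor minus the sum of |remaining taps|.

    Raises if w violates the peak-power limit (3), sum |w(k)| <= 1.
    """
    if sum(abs(x) for x in w) > 1.0 + 1e-12:
        raise ValueError("tap vector violates the peak-power limit")
    ch_eq = convolve(ch, w)
    skipped = {main_tap, *dfe_taps}  # taps excluded from the ISI sum
    isi = sum(abs(c) for k, c in enumerate(ch_eq) if k not in skipped)
    return ch_eq[main_tap] - isi


# Toy pulse response and hand-picked taps, for illustration only.
ch = [0.1, 0.6, 0.3, 0.1]
eye_raw = worst_case_eye(ch, [1.0], main_tap=1)
eye_ffe = worst_case_eye(ch, [-0.1, 0.7, -0.2], main_tap=2)
eye_ffe_dfe = worst_case_eye(ch, [-0.1, 0.7, -0.2], main_tap=2, dfe_taps=(3,))
print(eye_raw, eye_ffe, eye_ffe_dfe)
```

In this toy case the equalized eye is larger than the unequalized one, and excluding a post-cursor tap that a receiver DFE would cancel enlarges it further, which mirrors the extra degree of freedom mentioned above.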
Figure 2.4: Eye height (upper plot) and eye width (lower plot) of channel output with linear FFE, versus the number of taps, comparing ZFE and eyeMax algorithms in tap-weight calculation. The channel is a 3-inch backplane test channel at 8 Gbps data rate; its pulse response is shown in the upper left corner of Fig. 4.6.

2.3 FFE+DFE

A conventional equalization setup for links is to have a linear FFE at the transmitter and a DFE at the receiver. The transmitter FFE filter reshapes the channel pulse response such that pre-cursor ISI is mostly removed. The DFE uses the history of the received symbols to cancel the post-cursor ISI that is present in the channel. As shown in Fig. 2.5, given the channel pulse response and assuming correct initial received bits, we can subtract the residual signals from all the bits we have already seen, leaving only the signal from the bit of interest [13]. So, theoretically, all post-cursor ISI taps can be removed from the received signal, producing a clean eye opening.

Figure 2.5: Block diagram of a 3-tap DFE. W1 through W3 are the first three post-cursor ISI tap coefficients, and T represents a one-unit delay.

The DFE is inherently a nonlinear receiver. However, it can be analyzed using linear techniques, if one assumes all previous bit decisions are correct. At multi-Gbps high-speed link rates, closing the feedback loop with settling within one bit-time and with relatively high resolution can often be challenging, depending on process technology. Furthermore, wrong bit decisions can lead to catastrophic error propagation in the equalizer feedback loop.
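The decision-feedback idea can be sketched in a few lines of Python. This is an idealized illustration, not the thesis's implementation: the names `dfe_receive` and `channel_output` are hypothetical, the pulse response is made up, the main tap is assumed normalized to 1, and the feedback taps are assumed to match the post-cursor ISI exactly.

```python
def channel_output(bits, pulse):
    """Sampled channel output: y[k] = sum_j pulse[j] * bits[k - j]."""
    return [
        sum(pulse[j] * bits[k - j] for j in range(len(pulse)) if k - j >= 0)
        for k in range(len(bits))
    ]


def dfe_receive(received, post_cursor_taps):
    """Slice each sample after subtracting ISI rebuilt from past decisions."""
    decisions = []
    for y in received:
        # Pair W1 with the most recent decision, W2 with the one before, ...
        isi = sum(w * d for w, d in zip(post_cursor_taps, reversed(decisions)))
        decisions.append(1 if y - isi >= 0 else -1)
    return decisions


pulse = [1.0, 0.7, 0.5]            # main tap followed by two post-cursors
bits = [1, 1, -1, 1, -1, -1, 1, 1]
received = channel_output(bits, pulse)

plain = [1 if y >= 0 else -1 for y in received]   # slicer without feedback
with_dfe = dfe_receive(received, pulse[1:])
print(plain == bits, with_dfe == bits)
```

Because each decision feeds the next ISI estimate, a single wrong entry in `decisions` would corrupt the following corrections, which is the error-propagation risk noted above.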
To alleviate some of these issues, various transmitter precoding schemes were proposed and will be quickly summarized in the next section.

2.4 Precoding Schemes

The idea of precoding is that if the transmitter knows the channel information, it can send a coded signal so that ISI at the receiver side is greatly mitigated. By employing precoding, the DFE's error-propagation disadvantage can be avoided. However, as with the DFE, only post-cursor ISI can be removed. One of the well-known schemes, developed originally for dial-up modems, is Tomlinson-Harashima (TH) precoding [14, 15, 16]. TH precoding cancels the post-cursor ISI in the transmitter, where the past transmit symbols are known without possibility of errors. To avoid unstable outputs of the system, a modulo-N adder is used as shown in Fig. 2.6-a instead of a conventional adder, where N represents the transmitter output swing range. Although this technique is fairly complex, some recent implementations use TH precoding in high speed memory links, as in [17].

This modulo-N adder operates as a normal adder except that if the result of the summation is greater than +N/2, N is subtracted an integral number of times until the result is less than +N/2. If the result of the summation is less than -N/2, N is added an integral number of times until the result is greater than -N/2. Thus the output of the adder always has a magnitude within ±N/2, and, as a consequence, the system is always stable [14].

Figure 2.6: (a) Tomlinson-Harashima precoding block diagram. (b) Eye diagram of the test channel (3-inch backplane at 8 Gbps, Fig. 4.6) equalized with TH precoder, with 3-tap feedback and N = 2. Although this is a PAM-2 transmission, the output has three eye openings because of wrapping-around of the TH control signal.
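The modulo-N adder and the precoding loop around it can be sketched as follows. This Python example is illustrative only: `wrap_modulo` and `th_precode` are hypothetical names, the channel is assumed to have a main tap of 1 with made-up post-cursor taps, the PAM-2 symbols are assumed to be ±0.5 with N = 2, and the half-open interval [-N/2, N/2) is an implementation choice for handling the boundary.

```python
import math


def wrap_modulo(x, n):
    """Fold x into [-n/2, n/2) by adding or subtracting multiples of n."""
    return x - n * math.floor((x + n / 2) / n)


def th_precode(symbols, feedback_taps, n):
    """Pre-subtract post-cursor ISI of past transmitted values, then wrap."""
    sent = []
    for s in symbols:
        isi = sum(w * v for w, v in zip(feedback_taps, reversed(sent)))
        sent.append(wrap_modulo(s - isi, n))
    return sent


N = 2
taps = [0.7, 0.5]                          # assumed post-cursor ISI taps
symbols = [0.5, 0.5, -0.5, 0.5, -0.5, -0.5, 0.5, 0.5]
sent = th_precode(symbols, taps, N)

# Channel with main tap 1: y[k] = sent[k] + sum_j taps[j] * sent[k-1-j].
received = [
    sent[k] + sum(w * sent[k - 1 - j] for j, w in enumerate(taps) if k - 1 - j >= 0)
    for k in range(len(sent))
]
recovered = [wrap_modulo(y, N) for y in received]
print(sent, recovered)
```

Each received sample equals the intended symbol plus an integer multiple of N, so the same modulo operation at the receiver recovers the symbol. The extra multiples of N are what produce the additional eye openings mentioned for Fig. 2.6-b.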
This wrapping-around of the control signals produced by the TH transmitter introduces a power penalty in peak-power constrained transmitters, resulting in performance loss, i.e. a smaller eye opening at the receiver compared to DFE. This penalty decreases with an increased number of modulation levels, but is most severe in binary signaling, which is most frequently used in high-speed links for complexity reasons. Figure 2.6-b shows the output of the test channel driven by a TH precoder with 3-tap feedback. Increasing the number of feedback taps results in a larger opening of the three eyes. To decode the output into binary data, three slicers are needed to detect the three thresholds; a simple combinational circuit then outputs the received binary data according to the decision regions indicated in the figure. Although this precoding scheme is considered too complex to be implemented for multi-Gbps links, a recent architecture for TH has been presented in 22 nm process technology which achieves data rates up to 8 Gbps [17].

2.5 Summary

Different equalization schemes have been developed to overcome the channel ISI and increase the achievable data rates. For transmitter equalization, the ZFE linear equalizer flattens the channel frequency response by attenuating the low frequency components. By instead applying the linear-programming approach to the design of the linear FFE tap weights, we can get the maximum achievable eye opening using linear FIR filtering with peak-power limits. On the other hand, precoding techniques, like TH, take advantage of the channel state information to achieve a larger eye opening by estimating the ISI that the channel introduces. However, there is no direct control over the desired behavior of the received signal. This motivated us to develop a new equalization technique, in which this control can be exercised explicitly to freely shape the received eye opening, as well as the transmitter-side constraints.
Chapter 3 Model Predictive Control

Model Predictive Control is an advanced method of process control, which has been widely used in industrial processes since the 1980s. The interest of the processing industry in MPC can be traced back to a set of papers which appeared in the late 1970s. In 1978 Richalet et al. described successful applications of "Model Predictive Heuristic Control" [18], and in 1979 engineers from Shell outlined "Dynamic Matrix Control" (DMC) and reported applications to a fluid catalytic cracker [19, 20]. In both algorithms an explicit dynamic model of the plant is used to predict the effect of future actions of the manipulated variables on the output (thus the name "Model Predictive Control"). The future moves of the manipulated variables are determined by optimization with the objective of minimizing the predicted error subject to the operating constraints. The optimization is repeated at each sampling time based on updated information (measurements) from the plant [21].

3.1 History

The MPC concept has a long history. The connections between the closely related minimum time optimal control problem and Linear Programming were recognized first in [22]. The moving horizon approach, which is at the core of all MPC algorithms, was proposed in [23]. It became known as "Open Loop Optimal Feedback". The development of modern control concepts can be traced to the work of Kalman in the early 1960s [24, 25]. The solution of the problem Kalman presented was called the Linear Quadratic Gaussian (LQG) controller, where the optimum control input is computed using the information of the plant's current state. LQG theory soon became a standard approach to solve control problems in a wide range of application areas.
However, it has had little impact on control technology development in the process industries, basically because of the culture of the industrial process control community at the time, in which instrument technicians and control engineers either had no exposure to LQG concepts or regarded them as impractical.

This environment led to the development, in industry, of a more general model-based control methodology in which the dynamic optimization problem is solved online at each control execution. Process inputs are computed so as to optimize future plant behavior over a time interval known as the prediction horizon. Plant dynamics are described by an explicit process model. Any process model capable of predicting future output signals based on future input signals and initial values can be used. Process input and output constraints are included directly in the problem formulation so that future constraint violations are anticipated and prevented. This new methodology for industrial process modeling and control is what we now refer to as MPC technology [26].

3.2 Problem Formulation

Unlike conventional control, which uses pre-computed laws, MPC, which is also called receding horizon control (RHC), obtains the current control action by solving an online optimal control problem based on a dynamic model of the process under control and an initial state. As shown in Fig. 3.1, a horizon is defined (from k to k + HR) with a specific objective function and constraints. The optimization problem is solved to produce the optimum control actions (v) for the plant output to reach a reference value (d), tracking a pre-defined trajectory (u). Due to the overlap between horizons, only the first part of the calculated control signal is taken (control horizon = v(k : k + OV)), and the rest (which is called the prediction horizon = v(k + OV + 1 : k + HR)) can be ignored.
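The receding-horizon loop described above can be sketched in Python. This is a structural illustration under stated assumptions, not the thesis's solver: `receding_horizon`, `keep` and `toy_solver` are hypothetical names, `keep` stands for the number of leading control samples applied per solve, and the placeholder solver simply scales the reference instead of solving a constrained optimization problem.

```python
def receding_horizon(reference, horizon, keep, solve):
    """Slide a horizon over the reference; apply only the first `keep`
    samples of each open-loop plan and discard the rest."""
    if not 1 <= keep <= horizon:
        raise ValueError("keep must be between 1 and horizon")
    applied = []
    k = 0
    while k < len(reference):
        window = reference[k:k + horizon]
        # The history of applied controls acts as the initial state.
        plan = solve(window, tuple(applied))
        applied.extend(plan[:keep])   # control part: actually applied
        k += keep                     # remainder of the plan is ignored
    return applied[:len(reference)]


solve_calls = []


def toy_solver(window, history):
    """Placeholder for the online optimization (assumption, not an MPC solver)."""
    solve_calls.append(len(history))
    return [0.5 * r for r in window]


bits = [1, -1, -1, 1, 1, -1, 1]
controls = receding_horizon(bits, horizon=4, keep=2, solve=toy_solver)
print(controls, solve_calls)
```

In this sketch consecutive windows share `horizon - keep` samples, so each solve sees upcoming reference values that a later solve will revisit.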
[Figure 3.1 labels: reference (d), reference trajectory (u), predicted output (y), control output (v). HR: horizon. OV: overlap. The control horizon spans k to k+OV and the prediction horizon extends to k+HR.]

Figure 3.1: Principle of model predictive control. The plots show a sample of the MPC controller output (v) and the plant output (y) tracking the defined trajectory (u), which represents the optimization problem constraints. A simple block diagram defines the signals on the controller and the plant.

So, there are three main requirements for the MPC algorithm to be applied:
• A dynamic model for the process, which can be:
  - Impulse response.
  - Step response.
  - Transfer function.
  - State-space.
• History of the previous control actions, which typically defines the initial state for the control horizon in every iteration.
• Convex objective function and constraints.

In the next chapter, we will show how the MPC concept can be applied in the high-speed link context. Applying an online optimization solver under certain constraints at the transmitter is expected to improve the equalization performance compared to traditional equalization schemes under the same peak-power limits.

Chapter 4 MPC Equalizer

Signal equalization is, in general, a control process. The channel can be considered a plant to be controlled, and the transmitter should produce the right control sequence to meet some specific constraints.

4.1 Equalizer Model

Signal equalization can be translated into an MPC problem by defining the three points discussed in section 3.2. The plant model is the channel pulse response. Since we are modeling the channel as an FIR filter, we can use the State Space Representation (SSR) as shown in Fig. 4.1. In this representation, (h_n) is the channel pulse response, (x_n) is the channel state, (v) is the channel input, and (y) is the channel output [27]. Figure 4.2 shows a block diagram of the MPC equalizer.
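As a small illustration of the state-space view of an FIR channel, the following MATLAB sketch builds a shift-register model and cross-checks it against direct FIR filtering. The tap values in h and the input samples in v are made up for the example; the matrices follow the shift-register form of Fig. 4.1, in which the newest input enters the last state.

```matlab
% Shift-register state-space model of an FIR channel (cf. Fig. 4.1).
% Assumption: h and v below are made-up illustration values.
h = [0.5 0.3 -0.1 0.05];        % pulse response taps h_1 ... h_n (hypothetical)
n = numel(h);
A = diag(ones(n-1, 1), 1);      % shift matrix: x_i(t+1) = x_{i+1}(t)
B = [zeros(n-1, 1); 1];         % newest input enters the last state
C = fliplr(h);                  % y(t) = (h_n ... h_1) x(t)

v = [1 -1 1 1 -1 -1 1];         % channel input samples
x = zeros(n, 1);                % channel starts at rest
y = zeros(size(v));
for t = 1:numel(v)
    y(t) = C * x;               % output depends only on the stored history
    x = A * x + B * v(t);       % state update
end

% Cross-check: the same output follows from FIR filtering of the input
% delayed by one sample.
yRef = filter(h, 1, [0 v(1:end-1)]);
maxErr = max(abs(y - yRef));
```

After the loop, x holds the last n input samples (oldest first), which is the sense in which the channel state is the history of previous control outputs.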
The initial state of the control horizon is simply the current channel state at each solving iteration. As depicted in Fig. 3.1, and according to an FIR-based channel model SSR, the current channel state is simply the history of previous control output samples (v). The optimization objective function and the constraints are used to shape the received eye opening, as will be illustrated later. They can be defined freely, as long as the problem remains convex. The optimization problem is solved for every horizon, then the horizon slides along the input bits with some overlap, as illustrated in Fig. 4.2. This overlap gives the solver more information about previous and future operation of the process, leading to more robust optimization.

\[
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_{n-1} \\ x_n \end{pmatrix}(t+1) =
\begin{pmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & & & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
0 & 0 & 0 & \cdots & 0
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_{n-1} \\ x_n \end{pmatrix}(t) +
\begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix} v(t)
\]
\[
y(t) = \begin{pmatrix} h_n & h_{n-1} & h_{n-2} & \cdots & h_2 & h_1 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}(t)
\]

Figure 4.1: Modeling the transmission channel with state-space representation (SSR).

[Figure 4.2 labels: reference value (d) (bit stream), constraints (eye and residual ISI), bit stream (d), horizon (HR), overlap (OV), prediction horizon, sliding window.]

Figure 4.2: MPC equalizer block diagram with input bit stream window example. A dynamic model for the channel must be known (the channel pulse response).

4.2 Mathematical Formulation

As an example, the mathematical representation of a typical optimization problem for the MPC equalizer is shown below as a quadratic program, with more detailed MATLAB code described in the appendix, section A.2.

HR: horizon size.
v: MPC control output horizon.
d: input bit values, elements of {-1, +1}.
y: equalized channel output.
Q: real positive-semi-definite matrix.
ch: channel pulse response.

\[ \text{minimize:} \quad v^T Q v \tag{4} \]
\[ \text{subject to:} \quad \| v_{1:HR} \|_\infty \le 1 \tag{5} \]
\[ \operatorname{sign}(d_i)\,(y_i - E_H d_i) \ge 0, \quad i = 1:HR \tag{6} \]
\[ \operatorname{sign}(d_i)\,(y_i - \eta\, d_i) \le 0, \quad i = 1:HR \tag{7} \]
\[ \text{where:} \quad y = ch * v \tag{8} \]

In this formulation, the objective function is defined in (4) as the equalizer output energy.
The matrix (Q) is a weighting factor to calculate the output energy, as will be illustrated in section A.2. However, the equalizer performance is specified through the constraints, which define the shape of the channel output. In (5), the power limit is defined by means of the infinity norm of the control horizon v_{1:HR}. This constraint ensures that all MPC output values are at most one in absolute value. In (6) and (7), constraints are defined to have an eye height of at least E_H at the channel output, and a dynamic range of the residual ISI no larger than η. To ensure convexity of the problem, the sign of the input bit d_i is used to select the right inequality, instead of using absolute values. The eye height and residual ISI parameters are defined as shown in the lower plot of Fig. 2.2. No constraints are defined here for the reference trajectory of the channel output, which should control the eye width E_W. However, as we will show later, the entire eye mask can be specified easily with the MPC by adding some additional constraints.

4.3 Channel output with MPC equalizer

As an evaluation example, the MPC equalization and the eyeMax FFE are applied to a backplane channel with a length of 3 inches. Figure 4.3 shows the resulting channel output for the optimization problem formulation above. The eye opening constraint is met with a 65% increase in eye height over the eyeMax FFE. However, the residual ISI has a larger dynamic range on both sides than with the eyeMax linear FFE, due to the relaxed residual ISI constraint.

[Figure 4.3 plots: channel output amplitude versus #samples (×10^4); annotated eye heights of 140 mV in (a) and E_H = 240 mV in (b).]

Figure 4.3: (a) Channel output using eyeMax linear 20-tap FFE.
(b) Channel output using MPC equalization, with eye constraints E_H = 0.24 and η = 1. Channel pulse response is shown in the upper left corner of Fig. 4.6.

Interestingly, tightening the residual ISI constraint to a very small dynamic range does not degrade the eye height improvement. Figure 4.4 shows that the eye height constraint can remain unchanged for the MPC optimization over the same channel, with a significantly restricted dynamic range of the residual ISI. As mentioned earlier, this reduced dynamic range directly affects the eye width, as depicted in the lower plot in Fig. 4.4. The eye width values are comparable to those of the conventional ZFE with a large number of taps, and match the value of the 20-tap ZFE when the residual ISI constraint is set to a very small range, as in Fig. 4.4-c.

[Figure 4.4 plots: upper row, channel output amplitude versus #samples (×10^4); lower row, eye diagrams versus unit interval (UI); panels (a), (b), (c).]

Figure 4.4: Channel output using MPC equalizer with the same eye height as in Fig. 4.3, but with different residual ISI constraints: (a) η = 1 (unconstrained), small eye width of 0.27 UI. (b) η = 0.1, eye width = 0.59 UI. (c) η ≈ 0, large eye width, 0.67 UI. Channel pulse response is shown in the upper left corner of Fig. 4.6.

4.4 Two-eye channel output

The eye opening shape at the channel output can be changed as long as the optimization problem is still convex. So, another proposed manipulation is to give the optimizer more degrees of freedom by allowing two eye openings at the receiver, instead of one. The motivation behind this is similar to the main idea of the TH precoding, which reduces the output power penalty through modulo arithmetic that results in multiple eye openings at the receiver.
Giving the MPC solver a similar additional degree of freedom, under the same power limits, enables a significant increase in the eye opening values.

[Figure 4.5 plot: eye diagram, amplitude versus unit interval (UI), with decision regions labeled Logic "1", Logic "0", Logic "1"; annotated eye height 480 mV.]

Figure 4.5: Channel output with two eye openings, equalized by the convex MPC technique without quantization (residual ISI is constrained). Test channel pulse response is shown in the upper left corner of Fig. 4.6.

Figure 4.5 is a plot of the test channel output with modified constraints. Previously, the equalizer was trying to pull the channel output to a positive amplitude when the input is logic one and to a negative amplitude when the input is logic zero. With two eyes, the logic zero decision region at the channel output lies around zero amplitude, while the logic one decision region lies above some positive threshold and below the negative threshold of the same amplitude. The optimizer has the freedom to place control signals into the channel so that the signal that corresponds to logic one at the channel output can be either positive or negative, depending on the current channel state and ISI. This simple modification to the original MPC formulation achieves a 100% increase in eye height in this example, with a corresponding improvement in eye width (0.8 UI). However, the cost of this improvement is having two slicers at the receiver rather than one in order to detect these two eye openings.

4.5 Evaluation results

To have a fair comparison, various types of channels with different characteristics are used as test channels, with pulse responses shown in Fig. 4.6. Backplane channels are extracted from test systems designed in [28], and the active cable response is provided by Intel. These channels differ in the number of pre- and/or post-cursors, attenuation at the Nyquist frequency, reflections and pulse dispersion. Table 4.1 shows a performance comparison of the different equalization techniques.
Channels where the MPC performs really well have strong pre- and/or post-cursor ISI (dispersion in Channel 1 and Channel 2, respectively) as well as reflections (Channel 1). The active cable (lower right corner in Fig. 4.6) is an example of a channel with strong dispersion, where very small eye opening values are achieved. However, this high dispersion also makes it a good candidate for the MPC algorithm to achieve a very high improvement.

[Figure 4.6 panels, amplitude versus samples: 3 inches backplane @ 8Gbps (Channel 1); 20 inches backplane @ 6.85Gbps (Channel 2); 20 inches backplane @ 9Gbps (Channel 3); 1 meter active cable @ 64Gbps (Channel 4).]

Figure 4.6: Pulse responses of test channels used in equalization performance comparison. Every channel is sampled at the indicated data rate.

In Table 4.1, the MPC equalizer outperforms the linear FFE, as shown earlier. The results also show that the 2-eye MPC is better than the TH precoder up to a large number of feedback taps. It is important to mention that TH precoding is not capable of equalizing channels with pre-cursors (such as Channels 2, 3 and 4). However, it is clear from the results that the MPC equalizer can handle these types of channels. The 2-eye MPC outperforms all of the transmitter equalization techniques, making it the best candidate for asymmetric link design. Interestingly, the 2-eye MPC also outperforms the FFE ZFE + DFE setup on some channels. Performance of the MPC equalizer is generally worse than the FFE eyeMax + DFE setup, although the 2-eye MPC performance approaches this type of equalization.

To complete the comparison, Table 4.2 shows the eye width values for 1-eye MPC, 2-eye MPC, eyeMax and ZFE FFE. For all test channels, as expected, the eye width values increase from those of eyeMax FFE and ZFE FFE to those of 1-eye MPC and 2-eye MPC.
For Channel 4, the eye width values are relatively small due to high channel attenuation and small eye height values.

Technique                Channel 1   Channel 2   Channel 3   Channel 4
FFE eyeMax               142         214         100         6
FFE ZFE                  127         212         96          5
MPC 1-eye                240         230         135         12.4
MPC 2-eye                480         243         210         40
TH 3-tap                 205         0           0           0
TH 16-tap                450         0           0           0
FFE eyeMax + DFE 1-tap   548         297         280         77
FFE ZFE + DFE 1-tap      397         253         235         66
FFE eyeMax + DFE 3-tap   587         355         422         119
FFE ZFE + DFE 3-tap      408         253         319         72

Table 4.1: Eye opening height values (in millivolts) for different test channels using different equalization techniques: Linear FFE, MPC algorithm with 1-eye and 2-eye, Linear FFE at transmitter and DFE at receiver with 1 and 3 taps, Tomlinson-Harashima precoding technique with equivalent 3- and 16-tap feedback. Test channel pulse responses are depicted in Fig. 4.6.

Technique     Channel 1   Channel 2   Channel 3   Channel 4
FFE eyeMax    0.54        0.64        0.66        0.23
FFE ZFE       0.66        0.67        0.69        0.47
MPC 1-eye     0.67        0.69        0.77        0.46
MPC 2-eye     0.8         0.75        0.78        0.47

Table 4.2: Eye opening width (in UI) for different test channels using different equalization techniques: Linear FFE (20-tap), MPC algorithm with 1-eye and 2-eye. Test channel pulse responses are depicted in Fig. 4.6.

4.6 Full Eye-mask Control

In the previous MPC problem formulation, we only constrained the signal at the sampling point (maximum eye height point), which controlled the eye height and the residual ISI. The eye width was controlled indirectly by decreasing the dynamic range of the residual ISI at the sampling point. However, the MPC algorithm can also accommodate a much fuller control of the eye-mask shape by introducing some additional constraints.

[Figure 4.7 plot: pulse response amplitude versus samples; legend: early sampled pulse response, late sampled pulse response.]

Figure 4.7: 3-inch backplane test channel pulse response at 8 Gbps.
The solid lines are samples at the maximum eye height phase, while dashed and dotted lines are for early and late sampling phases respectively.

The idea is to get the channel pulse response sampled at more phases than just the maximum eye height phase. Figure 4.7 shows an example for our test channel pulse response with two extra sampling phases. The eye height and the residual ISI constraints for these two new phases can be added to the MPC formulation in section 4.2. Figure 4.8 shows eye opening shapes before and after adding the eye shaping constraints. The additional constraints trade off the eye height for the eye width. Although the eye height is reduced, it is still more than 60% larger than that of the eyeMax FFE. On the other hand, an eye width of 0.73 UI can be achieved, compared to 0.67 UI without constraining the eye width, both of which exceed the maximum achievable eye width of 0.66 UI with any linear FFE. Depending on the desired level of eye-mask control, it is straightforward to add additional constraints at different sampling phases.

[Figure 4.8 plots: eye diagrams, amplitude versus unit interval (UI); annotations: (a) E_H = 240 mV, E_W = 0.67 UI; (b) E_H = 230 mV, E_W = 0.73 UI.]

Figure 4.8: Eye diagram of MPC equalized channel (a) without eye-shaping constraints (b) with eye-shaping constraints. Eye width improvement is achieved by trading off the eye height.

4.7 Implementation Issues

Implementing most of the transmit equalization schemes requires a high-speed digital-to-analog converter (DAC) [29, 30, 31]. These DACs are often limited to 7-8 bits of resolution due to area and energy constraints. Lowering the resolution requirements for the transmit DAC is very beneficial; however, most equalization techniques (including the MPC) are very sensitive to quantization noise, since the control signals are quantized after the optimization is performed.
In addition to this, the online MPC implementation at high-speed rates is extremely challenging and power hungry, since it requires floating-point calculations. One alternative method is to pre-map all the offline MPC computations to a look-up table and do fast read-outs based on the incoming bits and the channel state information. Unfortunately, even with 7-8 bits of resolution on the output samples, the channel state space is huge, making this approach impractical. More details about this approach are presented in Chapter 6. Again, lower-resolution control signals would significantly reduce the channel state space and the size of the required look-up table, but they result in significant degradation of the MPC performance, since the quantization step cannot be directly included in the convex MPC formulation. This prompts us to look for alternative optimization methods that directly incorporate the quantization noise into the optimization algorithm, while preserving the MPC optimization framework (i.e. objectives and constraints). In doing so, we also want to address the computational complexity of the MPC algorithm and seek suitable approximations that can enable real-time operation at high sample rates.

4.8 Summary

Channel equalization can be mapped into an MPC problem. By defining the channel pulse response and the required constraints, a significant improvement in eye opening is achieved compared to conventional transmitter equalization schemes. Unlike other schemes, the eye opening parameters (height, width and residual ISI) can all be controlled explicitly by the optimization problem constraints. Besides, for PAM-2 transmission, multiple eye openings can be achieved at the receiver by manipulating the constraints, which gives more degrees of freedom for the optimization solver.
Simulation results show that this new precoding scheme performs better than TH precoding, and approaches the performance of the conventional FFE+DFE setup for some numbers of DFE feedback taps. However, like most transmitter equalizers, the proposed scheme requires a high-resolution DAC for implementation. Together with the complexity of the MPC algorithm, this makes hardware implementation very challenging for practical applications like high-speed links.

Chapter 5 Simplified MPC (Direct Search)

As discussed in Chapter 4, the motivation behind the simplified algorithm is to apply quantization inside the solver and enable low-resolution control signals, which decreases the cost of both the transmit DAC and the MPC implementation. Previously, the quantization could only be applied to the MPC output, because applying state quantization inside the optimization model leads to solution divergence due to the non-convexity of the quantized signals. However, just quantizing the MPC output leads to very poor performance at relatively low resolutions.

5.1 Algorithm

The modified approach is a simple Direct Search (DS) solver, where the optimum solution is found by sweeping through the entire solution space and selecting the absolute minimum value that meets the requirements. This sweep is fast in cases where the control signals are coarsely quantized and the horizon is small. Figure 5.1 shows a flow chart for this simplified method.

[Figure 5.1 legend: v: output sample. e: eye opening constraint. d: voltage noise constraint. y: channel output. Correction: a positive or negative value that is a function of solution convergence. offset: position in input bit stream. The flow loops until the end of the stream.]

Figure 5.1: Flow graph of the simplified algorithm for equalization (DS approach).

[Figure 5.2 flow chart steps: set bit = current bit, x0 = current channel state, v = 0; keep abs(v) ≤ 1; calculate (y); update v by the correction; test the constraints for bit = 1 and for bit = 0 (abs(y) < d0); increment offset by 1; repeat until the end of the stream. Legend: e: eye opening constraint. d: voltage noise constraint for logic (1). d0: voltage noise constraint for logic (0). y: channel output. Correction: a positive or negative value that is a function of solution convergence. offset: position in input bit stream.]

Figure 5.2: Simplified flow graph for two eye openings, using DS approach.

Additionally, the DS algorithm can also be extended to the 2-eye scheme, with the same eye opening values that are achieved by the convex MPC. The new constraints can be seen in the modified flow chart in Fig. 5.2. All eye shaping parameters that are used in the convex MPC can also be applied in this solver, for example controlling the eye height and the residual ISI.

Figure 5.3 shows the quantization effect on the eye height for both the convex and DS MPC, compared to the linear FFE. From the quantization point of view, the DS solver significantly outperforms the others, in both the one-eye and two-eye schemes. Even with very low resolutions (3-4 bits), DS can still achieve reasonable eye height values at the output, while the eye height of the quantized convex MPC and eyeMax FFE degrades very quickly. Since the quantization is handled inside DS, it can compensate for both ISI and quantization noise.

[Figure 5.3 plots: curves for convex MPC and DS MPC versus quantization resolution (in bits); panels (a) and (b).]

Figure 5.3: Quantization effect on eyeMax FFE, convex MPC and Direct Search equalizers for test channel. (a) One eye opening scheme. (b) Two eye openings scheme. Channel pulse response is shown in the upper left corner of Fig. 4.6.

5.2 Evaluation results - with quantization

Table 5.1a compares the performance of the convex MPC and the DS MPC algorithms to different equalization techniques with quantization at the transmitter and the receiver. All values are calculated for the Channel 1 test channel in Fig. 4.6.
The conventional FFE with 20 taps and the FFE+DFE setups with different numbers of DFE taps are shown, as well as the TH precoding. It is clear that with very low resolution (3-4 bits), using the DS MPC with two eye openings gives the best performance compared to other transmitter equalization techniques, which makes it the most suitable technique for asymmetric links. The convex MPC with two eyes comes next, then the DS MPC with one eye. In the FFE+DFE setup with quantization, the quantization resolution shown is for the FFE only. The quantization resolution for the DFE is always 6 bits, assuming higher resolutions can be achieved in the feedback loop without large power overhead, due to the smaller range of the feedback correction signal. It is interesting to note that the performance of the 2-eye DS MPC, in asymmetric links, is now close to the performance of the best FFE eyeMax + DFE scheme and beats the FFE ZFE + DFE with quantization.

Finally, Table 5.1b shows the eye width values for the different equalization techniques with very low resolution quantization. Again, the DS solver outperforms the other equalization techniques, resulting in a very reasonable eye width at such low resolutions. Generally, the MPC techniques have much better eye width than the TH precoding.

5.3 Summary

In this chapter we introduced Direct Search as a simpler variant of the MPC solver that still preserves all the peak-power and eye mask shaping constraints. Since this new solver takes signal quantization effects into account, it performs very well at low resolutions (3-4 bits) compared to the convex MPC. Interestingly, at such low resolutions, DS MPC also outperforms DFE for some numbers of feedback taps. The simple algorithm and low-resolution operation of DS MPC make it a good candidate for a reasonable hardware implementation for practical high-speed links, as will be seen in the next chapter.
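The sweep described in section 5.1 can be sketched as follows for a one-sample horizon and a single eye. This is a minimal illustration, not the flow chart of Fig. 5.1: the channel h, the level grid, the values of EH and eta, the reading of "absolute minimum value" as the smallest-magnitude feasible level, and the fallback branch are all assumptions made for the example.

```matlab
% Direct-search sketch: sweep every quantized level for one output sample.
% Assumptions (not from the source): toy channel h, 3-bit level grid,
% EH and eta values, and the fallback used when no level is feasible.
h      = [0.6 0.25 0.1];               % hypothetical pulse response, main tap h(1)
nbits  = 3;
levels = linspace(-1, 1, 2^nbits);     % candidate outputs, |v| <= 1 by construction
EH     = 0.2;                          % required eye height
eta    = 0.3;                          % allowed residual ISI range above EH
bits   = [1 -1 -1 1 1 1 -1 1];         % input bits in {-1,+1}

x = zeros(1, numel(h) - 1);            % channel state: previous outputs, newest first
v = zeros(size(bits));
y = zeros(size(bits));
for k = 1:numel(bits)
    isi    = h(2:end) * x.';           % ISI from the current channel state
    yCand  = h(1) * levels + isi;      % channel output for every candidate level
    margin = bits(k) * yCand - EH;     % >= 0 means the eye constraint holds
    ok     = (margin >= 0) & (margin <= eta);
    if any(ok)
        cand = levels(ok);
        [~, i] = min(abs(cand));       % smallest-magnitude feasible level
        v(k) = cand(i);
    else
        [~, i] = max(margin);          % hypothetical fallback: best available margin
        v(k) = levels(i);
    end
    y(k) = h(1) * v(k) + isi;
    x = [v(k), x(1:end-1)];            % shift the chosen sample into the state
end
eyeMargin = min(bits .* y) - EH;       % worst-case margin over the stream
```

Because the candidates are already quantized, the chosen sample needs no rounding afterwards, which is why quantization error cannot break a constraint that was satisfied during the search.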
Eye height (mV)          Continuous   4-bit quant   3-bit quant
FFE eyeMax               142          57            0
FFE ZFE                  127          60            0
MPC 1-eye                240          110           0
MPC 2-eye                480          348           172
DS 1-eye                 240          190           100
DS 2-eye                 480          448           350
FFE eyeMax + DFE 1-tap   555          452           284
FFE eyeMax + DFE 3-tap   591          475           343
FFE ZFE + DFE 1-tap      397          313           150
FFE ZFE + DFE 3-tap      413          350           153
TH 3-tap                 205          161           0
TH 16-tap                450          335           180

(a) Eye height values for different equalization techniques: FFE (eyeMax and ZFE), with and without DFE (1-tap and 3-tap), TH precoding (3-tap and 16-tap), the proposed convex MPC and DS approaches (1-eye and 2-eye). DFE quantization is 6 bits.

Eye width (UI)   Continuous   4-bit quant   3-bit quant
FFE eyeMax       0.54         0.31          0
FFE ZFE          0.66         0.39          0
MPC 1-eye        0.67         0.37          0
MPC 2-eye        0.80         0.67          0.41
DS 1-eye         0.67         0.53          0.33
DS 2-eye         0.80         0.68          0.49
TH 3-tap         0.49         0.46          0
TH 16-tap        0.60         0.50          0.35

(b) Eye width values for different equalization techniques: FFE (eyeMax and ZFE), TH precoding (3-tap and 16-tap), the proposed convex MPC and DS approaches (1-eye and 2-eye).

Table 5.1: Quantization effect on various equalization schemes, comparing from both eye height and eye width points of view.

Chapter 6 Implementations

MPC solvers can be very complex due to floating-point operations. Lowering the equalization resolution is a good way to reduce the complexity, but at the cost of performance. Some hardware tricks can be applied to simplify the circuits in order to meet the required high data rates. In this chapter, we discuss the possible directions for the implementation of the MPC equalizers.

6.1 RAM-DAC Equalizer

One possible system implementation uses a RAM-DAC architecture, where the MPC is mapped into a memory which stores all possible outputs. Two ideas for indexing the memory are proposed. The MPC algorithm depends directly on the current channel state, but to make the system completely open loop, the memory can be indexed by previous input bits (history bits), which should compensate for the channel state information. As shown in Fig.
6.1, a shift register of length (L) stores the input bits. Combined with the current input bit, a memory index of (L+1) bits is used.

[Figure 6.1 diagram label: Input.]

Figure 6.1: Transmitter architecture using RAM-DAC equalization with input bits history.

Although this method works theoretically, the simulation results show that it is impractical. In order to compensate for the channel state with the input history bits, a large number of bits (L) has to be taken. To see this, the same input bit pattern is applied to the MPC equalizer with different initial channel states. The output of the equalizer in each case is then compared for different numbers of history bits (L). Figure 6.2-a shows the difference between the equalizer outputs versus the number of history bits taken. The output difference approaches zero for a large number of bits. As an example, with a 4-bit quantized output, a history of more than 25 bits is needed to have a unique equalization solution, i.e. a zero difference as shown in Fig. 6.2-b, which means a very large RAM/LUT size of 2^26 words. Another example with 6-bit quantization in Fig. 6.2-c shows that increasing the output resolution requires more input history bits, with L larger than 43 bits in this case.

[Figure 6.2 plots: output difference versus number of history bits (L), from 0 to 80; panels (a), (b), (c).]

Figure 6.2: Simulation results for uniqueness of MPC equalizer outputs. (a) Un-quantized outputs. (b) 4-bit quantized outputs. (c) 6-bit quantized outputs.

The second proposed idea for memory indexing is to use the actual channel state in the feedback, as shown in Fig. 6.3. In this case, the RAM/LUT size is a function of the channel pulse response, which determines the channel state size, and the resolution of the equalization process.
In this context, the DS MPC technique is very promising since it gives reasonable performance down to 3-bit resolution, which significantly reduces the channel state space, and accordingly the RAM/LUT size requirements.

[Figure 6.3 blocks: RAM/LUT indexed by (N+1) bits, DAC output to channel, state feedback circuit selecting (N) bits out of (M) bits.]

Figure 6.3: Transmitter architecture using RAM-DAC equalization with channel state feedback.

In order to reduce the size of the RAM/LUT further, the state feedback circuit is responsible for selecting the important (N) bits out of the total (M) bits of the state vector. This selection is possible because the equalization process does not cover all possible state vector values. Besides, depending on the error resolution and the weight of each tap in the channel pulse response, some least significant bits, or even all bits of specific states, can be ignored without causing errors.

As an example, for the 3-inch test channel pulse response shown in Fig. 6.4, the state vector length is 20 values. Using a resolution of 3 bits, the expected size of the LUT is 2^60 words, which is not feasible. However, among the ISI taps, there is only one large-value tap plus five to six smaller taps. Due to quantization error, small-value taps can tolerate a smaller number of state quantization bits without affecting the performance. Simulations show that only 12 out of 60 state bits are enough to emulate the state behavior. These 12 state bits comprise 3 state bits associated with the main ISI tap, and the most significant bits of the states associated with the other ISI taps. Assuming a 1-bit input horizon, this selection leads to a reasonable LUT size of just 2^13 words, where the word size is the resolution of the output sample (in this case 3 bits).

[Figure 6.4 plot: pulse response amplitude versus samples (0 to 20), with the ISI taps marked.]

Figure 6.4: 3-inch backplane channel pulse response, the significant ISI taps are marked with arrows.
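The LUT sizes in the 3-inch channel example follow from counting index bits. As a worked check, using the numbers from the text (20 state values, 3-bit resolution, 12 selected state bits, 1 input bit, 3-bit output words); the storage total in the last line is derived here and is not stated in the source:

```latex
\begin{align*}
\text{full state index:}\quad & 20 \times 3 = 60 \text{ bits}
  \;\Rightarrow\; 2^{60} \approx 1.15 \times 10^{18} \text{ words} \\
\text{reduced index:}\quad & 12 + 1 = 13 \text{ bits}
  \;\Rightarrow\; 2^{13} = 8192 \text{ words} \\
\text{reduced storage:}\quad & 8192 \times 3 \text{ bits} = 24576 \text{ bits} = 3 \text{ KiB}
\end{align*}
```

Each state bit removed from the index halves the table, so dropping 48 of the 60 state bits shrinks it by a factor of 2^48.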
In the RAM-DAC architecture, serialization depends on the required data rate. At moderate data rates, the input could be serialized, and the RAM word would then be only one sample directly connected to the driver DAC. At higher data rates, the serialization could be done at the output with a multibit horizon scenario, where the RAM word contains multiple control samples.

6.2 Online Channel State Estimator

Another possibility is to directly implement the DS algorithm in hardware, in a very similar way to the TH implementation in [17], as depicted in Fig. 6.5. This implementation would incorporate a channel state estimator, which calculates the residual ISI of the channel state online every clock cycle. This estimator is essentially the feedback loop in the TH architecture, and can be implemented with an FIR filter mapping the channel pulse response. Taking advantage again of the DS solver with very low resolution (3-4 bits), all possible values for the channel output (7 values in the case of 3-bit resolution) can be calculated efficiently at high speeds, and added to the estimated channel ISI. Then the optimum value is chosen such that the corresponding channel output satisfies the eye opening constraints.

[Figure 6.5 blocks: ISI taps, ISI estimation, main tap calculations, eye shaping constraints; input bits in, output (v) out.]

Figure 6.5: Block diagram of online channel state estimation architecture for DS MPC equalizer.

The complexity, and accordingly the maximum throughput, of this proposed implementation are functions of the calculation resolution. As shown in Fig. 6.6, using more than 8-bit resolution for the channel model does not give any performance gain. For the equalizer output, though, it is preferable to use as few bits as possible to decrease the solution space. So, there is a trade-off between fine equalization and speed.
[Figure 6.6 axes: eye height, channel resolution (in bits), output resolution (in bits).]

Figure 6.6: A 3-D plot of the eye height of the 3-inch backplane channel output versus the equalizer output resolution and the channel model resolution.

Chapter 7 Conclusion and Future Work

In this work, a new nonlinear precoding transmitter equalization method was proposed using the Model Predictive Control algorithm. This class of control algorithms obtains the current control action by solving an online optimal control problem. Different aspects of the new method were analyzed and compared to existing equalization techniques such as linear feed-forward equalizers, Tomlinson-Harashima precoding and decision-feedback equalizers. The MPC non-linear time-variant transmit-side equalization significantly outperformed both the linear feed-forward transmit equalizers and Tomlinson-Harashima precoders, especially on hard channels with a lot of inter-symbol interference from dispersion and reflections. The Direct Search algorithm was also introduced as a design approach which incorporates quantization effects in the algorithm and gives good performance at low resolutions, enabling low-complexity implementation. The Direct Search algorithm outperforms all transmit equalization approaches and gets very close to the performance of the decision-feedback equalizers.

As future work, more effort will be directed to the implementation and hardware measurements of the MPC equalizers. Some other tricks can be applied for an efficient RAM-DAC implementation, like a reduced LUT, linear interpolation, and functional fitting. The online DS MPC architecture is also a promising candidate for low resolutions. However, DS implementations also need to address pre-cursor equalization, which is one of the areas where significant performance improvements can be gained over TH methods.
Appendix A MATLAB Codes

A.1 MATLAB CODE FOR eyeMax ALGORITHM

We can solve the optimization problem of the eyeMax algorithm for FFE tap calculation using the YALMIP toolbox [32]. This MATLAB code implements the exact formulation shown in section 2.2.

    function [ffe, val] = maxer(pulse, main, eqDeg, eqPre, DFEDeg)
    % ffe: calculated tap weights
    % val: the value of the objective function
    % pulse: symbol-spaced pulse response
    % main: required "main" tap
    % eqDeg: order of the filter
    % eqPre: delay of the filter (the number of pre-cursor taps)
    % DFEDeg: number of DFE taps at the receiver

    a = sdpvar(1, eqDeg);
    pulseEq = conv(pulse, a);   % pulse after equalization
    mainEq = main + eqPre;      % recalculate new main position

    % Setting peak-power limit:
    F = set(sum(abs(a)) < 1);

    % Defining the objective:
    objective = pulseEq(mainEq) - sum(abs(pulseEq)) + sum(abs(pulseEq(mainEq:mainEq+DFEDeg)));

    solvesdp(F, -objective, sdpsettings('solver', 'sedumi', 'verbose', 0, 'showprogress', 0));

    ffe = double(a);
    val = double(objective);

An alternative way to solve this problem is to handle the peak-power constraint through scaling after the optimization. As discussed previously, the result is the same in both cases. Here we show the CVX implementation with post-optimization scaling. CVX is a package for specifying and solving convex programs [33, 34]. In this code, the peak-power limit is enforced by normalizing the tap weights after solving the optimization problem. The MATLAB code of the function is shown below.

    function [ffe, val] = maxer_cvx(pulse, main, eqDeg, eqPre, DFEDeg)
    cvx_begin
        variable a(eqDeg);
        clear conv
        pulseEq = conv(pulse, a);   % pulse after equalization
        mainEq = main + eqPre;      % new main position
        tmp = pulseEq;
        tmp(mainEq:mainEq+DFEDeg) = 0;   % exclude the main tap and the DFE-covered taps
        objective = pulseEq(mainEq) - sum(abs(tmp));
        maximize(objective);
    cvx_end
    ffe = a/sum(abs(a));   % post-optimization scaling to the peak-power limit
    val = objective;

A.2 MATLAB CODE FOR CONVEX MPC EQUALIZATION

The YALMIP toolbox is used to solve the convex optimization problem of the MPC equalizer.
Before presenting the code, and to make the terms used in it familiar, we give an example of some of the variables that appear in the code. Assume h is the channel pulse response with length L, i.e. h = [h_1 h_2 ... h_L]. The channel output over a horizon of length HR is determined by the matrix L_p, which has a size of HR x HR. The effect of the channel state on the output, which is basically the ISI, is calculated using the matrix E, which has a size of HR x L. Both matrices are shown in Fig. A.1. The matrix Q is a real positive-semi-definite matrix used as a weighting factor for calculating the control output energy. It can be any positive-semi-definite matrix, e.g. an identity matrix. In our formulation, we used the Observability Gramian as the weighting matrix. This matrix is guaranteed to be positive-semi-definite as long as the channel model is observable [35].

$$
L_p = \begin{bmatrix}
h_1 & 0 & \cdots & 0 & 0 \\
h_2 & h_1 & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
h_{HR-1} & h_{HR-2} & \cdots & h_1 & 0 \\
h_{HR} & h_{HR-1} & \cdots & h_2 & h_1
\end{bmatrix},
\qquad
E = \begin{bmatrix}
h_2 & h_3 & \cdots & h_{L-1} & h_L & 0 \\
h_3 & h_4 & \cdots & h_L & 0 & 0 \\
\vdots & \vdots & & \vdots & \vdots & \vdots \\
h_{HR} & h_{HR+1} & \cdots & 0 & 0 & 0 \\
h_{HR+1} & h_{HR+2} & \cdots & 0 & 0 & 0
\end{bmatrix},
\qquad \text{where } HR \le L.
$$

Figure A.1: L_p and E matrices used in the optimization function of the convex MPC solver.

The MATLAB code of the function is shown below.

    function [v, flag] = SolveMPC(Q, Lp, E, x0, d, eta)
    % v: equalizer control output
    % flag: 0 if a solution was found, 1 otherwise
    % Q: quadratic matrix
    % Lp: channel input response
    % E: channel state response matrix (ISI calculation)
    % x0: channel current state
    % d: input bits horizon
    % eta: residual ISI constraint

    v = sdpvar(size(Q,2), 1);
    e0 = sign(d).*(E*x0 + Lp*v) - sign(d).*d;
    F = set(abs(v) < 1) + set(e0 > 0) + set(e0 < eta);
    obj = v'*Q*v;
    solvesdp(F, obj, sdpsettings('solver', 'sedumi'));   % YALMIP
    v = double(v);
    flag = 0;
    % any() reduces the element-wise tests to scalars, as required by ||
    if (any(double(e0) < 0) || any(isnan(v)))
        flag = 1;
    end

Bibliography

[1] V. Stojanovic and M. Horowitz. Modeling and analysis of high-speed links.
In Proceedings of the IEEE Custom Integrated Circuits Conference, pages 589-594, Sept. 2003.
[2] N. McKeown, M. Izzard, A. Mekkittikul, W. Ellersick, and M. Horowitz. Tiny Tera: a packet switch core. IEEE Micro, 17(1):26-33, Jan/Feb 1997.
[3] Kun-Yung Ken Chang, Shang-Tse Chuang, N. McKeown, and M. Horowitz. A 50 Gb/s 32x32 CMOS crossbar chip using asymmetric serial links. In Symposium on VLSI Circuits, Digest of Technical Papers, pages 19-22, 1999.
[4] M. Rau and D. Schroder. Model predictive control with nonlinear state space models. In 7th International Workshop on Advanced Motion Control, pages 136-141, 2002.
[5] H.-B. Kuntze, A. Jacubasch, U. Hirsch, J. Richalet, and C. Arber. On the application of a new method for fast and robust position control of industrial robots. In IEEE International Conference on Robotics and Automation, pages 1574-1580, vol. 3, Apr. 1988.
[6] L. Saint-Bauzel, V. Pasqui, B. Gas, and J.-L. Zarader. Pathological sit-to-stand predictive models for control of a rehabilitation robotic device. In The 16th IEEE International Symposium on Robot and Human Interactive Communication, pages 1173-1178, Aug. 2007.
[7] Yung-Shou Cheng and Ruey-Beei Wu. Direct eye diagram optimization for arbitrary transmission lines using FIR filter. IEEE Transactions on Components, Packaging and Manufacturing Technology, 1(8):1250-1258, Aug. 2011.
[8] S.U.H. Qureshi. Adaptive equalization. Proceedings of the IEEE, 73(9):1349-1387, Sept. 1985.
[9] Bernard Widrow and Marcian E. Hoff. Adaptive switching circuits. In Neurocomputing: Foundations of Research, pages 123-134. MIT Press, Cambridge, MA, USA, 1988.
[10] Byungsub Kim and V. Stojanovic. Equalized interconnects for on-chip networks: modeling and optimization framework. In IEEE/ACM International Conference on Computer-Aided Design, pages 552-559, Nov. 2007.
[11] J. Ren and M. Greenstreet. A unified optimization framework for equalization filter synthesis.
In Proceedings of Design Automation Conference, pages 638-643, June 2005.
[12] C.A. Belfiore and J.H. Park, Jr. Decision feedback equalization. Proceedings of the IEEE, 67(8):1143-1156, Aug. 1979.
[13] J.M. Cioffi, G.P. Dudevoir, M. Vedat Eyuboglu, and G.D. Forney, Jr. MMSE decision-feedback equalizers and coding. I. Equalization results. IEEE Transactions on Communications, 43(10):2582-2594, Oct. 1995.
[14] M. Tomlinson. New automatic equaliser employing modulo arithmetic. Electronics Letters, 7(5):138-139, 1971.
[15] H. Miyakawa and H. Harashima. Information transmission rate in matched transmission systems with peak transmitting power limitation. In Nat. Conf. Rec., Inst. Electron., Inform., Commun. Eng. of Japan, number 1269, page 1268, Aug. 1969.
[16] H. Harashima and H. Miyakawa. Matched-transmission technique for channels with intersymbol interference. IEEE Transactions on Communications, 20(4):774-780, Aug. 1972.
[17] Marcel Kossel, Thomas Toifl, Pier Andrea Francese, Matthias Brandli, Christian Menolfi, Peter Buchmann, Lukas Kull, Toke Meyer Andersen, and Thomas Morf. An 8Gb/s 1.5mW/Gb/s 8-tap 6b NRZ/PAM-4 Tomlinson-Harashima precoding transmitter for future memory-link applications in 22nm CMOS. In IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC), pages 408-409, 2013.
[18] J. Richalet, A. Rault, J.L. Testud, and J. Papon. Model predictive heuristic control: applications to industrial processes. Automatica, volume 14, pages 413-428, Sept. 1978.
[19] C.R. Cutler and B.L. Ramaker. Dynamic matrix control: a computer control algorithm. In AIChE National Mtg, 1979.
[20] D.M. Prett and R.D. Gillette. Optimization and constrained multivariable control of a catalytic cracking unit. In AIChE National Mtg, 1979.
[21] Carlos E. Garcia, David M. Prett, and Manfred Morari. Model Predictive Control: Theory and Practice, a Survey. Automatica, volume 25, pages 335-348, 1989.
[22] L.A. Zadeh and B.H. Whalen. On optimal control and linear programming.
In IRE Transactions on Automatic Control, volume 7, page 45, 1962.
[23] A.I. Propoi. Use of LP methods for synthesizing sampled data automatic systems. Automation and Remote Control, volume 24, page 837, 1963.
[24] R. E. Kalman. Contributions to the Theory of Optimal Control. In Bulletin de la Societe Mathematique de Mexicana, volume 5, pages 102-119, 1960.
[25] R. E. Kalman. A new approach to linear filtering and prediction problems. In Transactions of ASME, Journal of Basic Engineering, volume 87, pages 35-45, 1960.
[26] S. Joe Qin and Thomas A. Badgwell. A survey of industrial model predictive control technology. Control Engineering Practice, volume 11, pages 733-764, 2003.
[27] B. P. Lathi. Linear systems and signals. Berkeley-Cambridge Press, Carmichael, CA, 1992.
[28] W. Peters, E. Gong, C. Chen, and H. Kim. Improved HVM ATCA measurement data. http://www.ieee802.org/3/ap/public/channel-model/peters_m1_0605.zip, Jul 2005.
[29] Hayun Chung, A. Liu, and Gu-Yeon Wei. A 12.5-Gbps, 7-bit transmit DAC with 4-tap LUT-based equalization in 0.13 um CMOS. In IEEE Custom Integrated Circuits Conference, pages 563-566, Sept. 2008.
[30] R. Sredojevic and V. Stojanovic. Fully digital transmit equalizer with dynamic impedance modulation. IEEE Journal of Solid-State Circuits, 46(8):1857-1869, Aug. 2011.
[31] J. Savoj, A. Abbasfar, A. Amirkhany, M. Jeeradit, and B.W. Garlepp. A 12-GS/s Phase-Calibrated CMOS Digital-to-Analog Converter for Backplane Communications. IEEE Journal of Solid-State Circuits, 43(5):1207-1216, May 2008.
[32] J. Lofberg. YALMIP: a toolbox for modeling and optimization in MATLAB. In IEEE International Symposium on Computer Aided Control Systems Design, pages 284-289, Sept. 2004.
[33] CVX Research, Inc. CVX: MATLAB software for disciplined convex programming, version 2.0 beta. http://cvxr.com/cvx, Sep 2012.
[34] M. Grant and S. Boyd. Graph implementations for nonsmooth convex programs. In V. Blondel, S. Boyd, and H.
Kimura, editors, Recent Advances in Learning and Control, Lecture Notes in Control and Information Sciences, pages 95-110. Springer-Verlag Limited, 2008. http://stanford.edu/~boyd/graph_dcp.html.
[35] B. Moore. Principal component analysis in linear systems: Controllability, observability, and model reduction. IEEE Transactions on Automatic Control, 26(1):17-32, 1981.