DesignCon 2011

Worst-Case Patterns for High-Speed Simulation and Measurement

Masashi Shimanouchi, Altera Corporation
mshimano@altera.com

Mike Peng Li, Altera Corporation
mpli@altera.com

Daniel Chow, Altera Corporation
dchow@altera.com

Abstract

Design and validation of high-speed serial links at multi-Gbps rates require time-domain simulation and measurement. The pattern length for transistor-level simulation is limited to a few hundred bits by practical simulation time, while the pattern length for oscilloscope measurement is limited to a few hundred to a few thousand bits by the record length. This is where and why “killer” patterns are needed: patterns that are relatively short (less than a few hundred bits) yet induce the worst-case ISI.

Author(s) Biography

Masashi Shimanouchi is a senior member of technical staff at Altera Corporation. His work on high-speed serial links of FPGA products includes link system and component architecture, modeling, characterization, and development of link jitter and BER simulation tools, with expertise in signal processing, signal integrity, and jitter.

Dr. Mike Peng Li has been a Principal Architect and Distinguished Engineer at Altera Corporation since September 2007. He is a corporate expert and adviser on jitter, noise, signal integrity, high-speed link, SERDES and on-die instrumentation (ODI), electrical and optical signaling, silicon photonics, and optical FPGA. Dr. Li was the Chief Technology Officer (CTO) of Wavecrest Corporation from 2000 to 2007, where he led the technology roadmap, leadership, vision, and product developments.

Dr. Daniel Chow is a Principal Signal Integrity Engineer at Altera Corporation. His responsibilities include defining design, testing, and validation methodologies for signal integrity, power integrity, and jitter analysis in high-speed components. Specifically, he is responsible for developing Altera’s knowledge base on jitter-related issues. Dr. Chow received his Ph.D. from the University of California, Davis.

1. Introduction

To design and validate a high-speed serial link at multi-Gbps rates and above, three kinds of tools are all essential: time-domain circuit simulation at the transistor level, such as SPICE; behavioral simulation, such as IBIS-AMI; and jitter/noise measurement with sophisticated test and measurement instruments, such as broadband real-time or equivalent-time sampling oscilloscopes equipped with jitter extraction/decomposition software.

On the simulation side, the serial data pattern used in SPICE simulation is commonly limited to a few hundred bits because of the long simulation time (several hours to days if accuracy is not compromised). On the measurement side, data patterns a few hundred to a few thousand bits long are used with high-bandwidth real-time sampling oscilloscopes because of the record-length limitation of those instruments. Meanwhile, some communication standards [1] call for PRBS 2^31-1 (~2 billion bits) as the compliance pattern to excite and validate the worst inter-symbol interference (ISI). Simulating or capturing such a long pattern at a high-speed link output is prohibitive in practice.

It would be beneficial to develop a “killer” pattern that is much shorter than PRBS 2^n-1 in number of bits, yet approximates PRBS 2^n-1 in exciting worst-case ISI, especially for a linear time-invariant (LTI) system such as transmission lines on a PCB or a backplane. This is the focus of this paper. While a “killer” pattern makes accurate transistor-level SPICE simulation and full-waveform measurement for jitter and noise investigation practical, it does not simply replace more complicated patterns and/or PRBS 2^n-1 patterns. Rather, it gives us deeper insight into the ISI mechanism in channel loss and its equalization.
In section 2, we review the basics of the ISI mechanism in an LTI system and discuss worst-case ISI-inducing patterns. In section 3, we study the PRBS 2^n-1 pattern in detail and explain the guideline for designing “killer” patterns. In section 4, we discuss some applications of “killer” patterns. In section 5, we summarize and conclude.

2. Basics of Inter-Symbol-Interference in LTI System and Worst Case Patterns

2.1 LTI System Model of High Speed Serial Link

A block diagram of a high-speed serial link consisting of a transmitter, a channel, and a receiver is shown in Figure 1 (a), and the lower-level building blocks of the link’s physical layer are shown in Figure 1 (b). A large portion of the link, highlighted in Figure 1, can be approximated by a linear time-invariant model. Treating this portion of the link as an LTI system, we can study the ISI and the resulting jitter and noise with respect to the transmitted signal patterns using a rich set of mathematical tools such as convolution and the Fourier transform [2].

Figure 1 Building Blocks of High Speed Serial Link Considered as LTI System

2.2 Inter-Symbol-Interference Mechanism in LTI System

To study the ISI mechanism in a lossy channel, let’s consider its single bit response [3]. When a single bit goes into the channel, the output pulse broadens and spreads over many unit intervals (UIs), as illustrated in Figure 2. The amplitude of the response at each UI sampling instant is called a cursor, and the entire single bit response consists of pre-cursors, the main cursor, and post-cursors.

Figure 2 Single Bit Response and Pre/Post Cursors

Referring to Figure 3, let’s study how each single bit response (drawn in blue) is affected by the responses of the preceding bits. The bit immediately before the target bit is assumed to be zero so that the target bit is isolated. Because of the long-lasting tails of the preceding bits, the target bit signal level is raised.
The more of the preceding bits are ones, the higher the target bit signal level is pushed, which shifts the timing of its edges, i.e., causes jitter: the leading edge of the target bit comes earlier and the trailing edge comes later. Thus the preceding contiguous identical digits (CIDs; ones in this example) strongly affect the timing shift of the target edge. The terms “run length” and “CID” are usually used interchangeably. Note that the spread-out leading-edge transitions of the following bits also affect the target bit. The neighboring bits thus affect each other, hence the name inter-symbol interference (ISI).

Figure 3 Accumulation of Neighboring Bits Response

2.3 Inter-Symbol-Interference vs. Contiguous Identical Digits

To study quantitatively the relationship between ISI (measured here as the resulting Time Interval Error, TIE) and CID, a simulation was performed; its concept is illustrated in Figure 4. The pattern is 1010… + CIDs (1s) + 01 + flipped CIDs (0s) + 1010…, where the number of CIDs was varied, and ISI (TIE) was calculated as the deviation of the waveform’s state-change timing from the corresponding ideal edge timing.

Figure 4 Transition Edge Timing Deviation due to ISI

Different channels resulted in different ISI vs. CID characteristics. One channel exhibited a monotonic ISI vs. CID relationship, as shown in Figure 5 (a), while another exhibited a more complicated, non-monotonic relationship, as shown in Figure 5 (b). We analyze the mechanism behind these two distinct characteristics in the next section.

Figure 5 Two Types of ISI vs. CID Relationship

2.4 Single Bit Response and Worst Case Patterns

Figure 6 illustrates how the ISI from preceding bits accumulates at the target bit location, using the concept of the cursors of a single bit response.

Figure 6 ISI Accumulation from Preceding Bits

As discussed in section 2.2, the longer the CID is, the larger the overall ISI is in this example.
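The accumulation illustrated in Figure 6 can be written as a superposition sum. The notation here is an assumption introduced for illustration and is not taken from the figures: c_k denotes the cursor of the single bit response k UIs after the main cursor c_0, and b_n (0 or 1) denotes the bit transmitted in UI n. Considering post-cursors only, the channel output sampled at the target bit n and the ISI contributed by a run of L ones immediately before the target bit (with zeros before the run) would be:

```latex
y_n = c_0\, b_n + \underbrace{\sum_{k \ge 1} c_k\, b_{n-k}}_{\text{ISI from preceding bits}},
\qquad
\mathrm{ISI}(L) = \sum_{k=1}^{L} c_k .
```

Under this simplified model, ISI(L) grows with every additional bit of the run as long as all c_k share the same sign, which corresponds to the monotonic case of Figure 5 (a).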
If some of the cursors, however, have the opposite polarity, they cancel part of the ISI. In this case a longer CID does not simply result in larger ISI, which is the mechanism behind the non-monotonic ISI vs. CID relationship in Figure 5 (b). If the bits located at the opposite-polarity cursors are flipped, the new pattern causes larger ISI, as illustrated in Figure 7 (b); this is the worst ISI for a given CID. Though such bits are no longer literally “contiguous identical digits,” allow us to keep using the term CID here. This is the basis of the worst-case pattern segment.

Figure 7 Two Types of Cursors’ Polarities

3. Worst Case Pattern and Killer Pattern

3.1 Patterns Used for Tests

Various standards specify their own test patterns [1][4][5]. The patterns depend on the coding scheme, which constrains the maximum run length (CID) and determines the overhead (8B/10B vs. 64B/66B, etc.). They also depend on the purpose of each test. Some patterns are used to test transmitters, while others are used to test receivers. Some patterns are used to test random jitter (RJ), others to test deterministic jitter (DJ), and some to test overall jitter performance with DJ and RJ mixed.

3.2 Killer Pattern

Our interest in this paper is a pattern that is short, excites worst-case ISI, and provides insight into the ISI mechanism. We call such a pattern a “killer” pattern.

• Short pattern
As discussed in section 1, it is desirable for a pattern to be no longer than a few hundred to a few thousand bits for transistor-level simulation and/or real-time oscilloscope measurement. Even for behavioral simulation, the very short simulation time of a short pattern allows us to examine many scenarios and/or explore a large solution space.

• Exciting worst case ISI
While there may be several “worst” cases, we discuss two types. One is the pathological worst case, which provides the worst-case jitter bound for a given system though it may never occur in a link’s lifetime.
Another is the PRBS-equivalent worst case, which induces less severe ISI than the pathological worst case but is more likely to occur in practice.

• Providing insight into ISI mechanism
Designing a killer pattern involves analyzing and understanding the ISI mechanism in detail. We therefore gain insight into it through the killer pattern design process itself, as suggested by the preceding discussion and as shown in the rest of this paper.

3.3 PRBS Pattern

A Pseudo-Random Bit Sequence (PRBS) pattern is usually generated using a maximal-length Linear Feedback Shift Register (LFSR). A block diagram of a PRBS 2^15-1 generator is shown in Figure 8 as an example. For an n-bit LFSR to generate a pseudo-random sequence of 2^n-1 bits, particular combinations of feedback taps must be used.

Figure 8 PRBS2^15-1 Generator by LFSR

When a 2^N-1 bit pseudo-random sequence is generated by an N-bit maximal-length LFSR, each CID length occurs a specific number of times in the sequence. An example for PRBS 2^7-1 is shown in Table 1.

Table 1 Occurrence of CIDs in PRBS 2^7-1

When the single bit response spreads over more UIs than the CID segment under consideration, the states of the bits preceding that CID segment also affect the ISI. Because of this, more preceding bits need to be considered when studying the ISI of shorter CIDs. Thus the ISI for a given CID exhibits a certain distribution for a given channel and data rate. Example distributions with PRBS 2^10-1 are shown in Figure 9 in two slightly different views. The ISI distributions show that only a few particular patterns cause large or worst-case ISI; these are the patterns of interest in this paper.

Figure 9 ISI Histogram vs. CID with PRBS 2^10-1

3.4 Guideline to Design Killer Pattern

A killer pattern is designed for a given channel, data rate, and equalization, but it is not uniquely determined. One killer pattern may approximate a worst-case situation better than another.
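As an illustration of the LFSR generation and CID statistics described in section 3.3, the following minimal Python sketch generates one period of a PRBS and counts its runs. The function names and the tap choice (stages 7 and 6, a commonly used PRBS 2^7-1 configuration) are assumptions made for this example; the generator of Figure 8 and the contents of Table 1 are not reproduced here.

```python
from collections import Counter
from itertools import groupby


def prbs(n, taps, seed=1):
    """Return one period (2**n - 1 bits) from an n-bit Fibonacci LFSR.

    taps are 1-based register stages whose XOR forms the feedback bit.
    """
    mask = (1 << n) - 1
    state = seed & mask
    if state == 0:
        raise ValueError("seed must be non-zero")
    bits = []
    for _ in range(mask):
        fb = 0
        for t in taps:
            fb ^= (state >> (t - 1)) & 1
        # Shift the feedback bit in; it is also the output bit.
        state = ((state << 1) | fb) & mask
        bits.append(fb)
    return bits


def cyclic_runs(bits):
    """Count runs (CIDs) as (bit value, run length), treating bits as periodic."""
    # Rotate so the sequence starts at a transition; no run is then split.
    start = next(i for i in range(len(bits)) if bits[i] != bits[i - 1])
    rotated = bits[start:] + bits[:start]
    return Counter((b, len(list(g))) for b, g in groupby(rotated))


seq = prbs(7, taps=(7, 6))
runs = cyclic_runs(seq)
for (bit, length), count in sorted(runs.items()):
    print(f"{length} consecutive {bit}s: {count} occurrence(s)")
print("length:", len(seq), "ones:", sum(seq),
      "longest CID:", max(length for _, length in runs))
```

For this 7-bit example the sketch reports a 127-bit period whose longest run is 7 bits, with one more 1 than 0s, in line with the PRBS properties used in the guideline of section 3.4.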
Below is a guideline for designing a simple killer pattern with little effort when very high accuracy is not needed. There are four points to consider.

• Consideration-1 : DC Balance
To prevent an extreme bias voltage, the numbers of 1s and 0s are made equal. Note that although a PRBS pattern has one more 1 than 0s, the DC imbalance is negligibly small when N is large.

• Consideration-2 : Settling Time
Both physical measurement and computer simulation start from either a settled low level or a settled high level, which is an extremely DC-unbalanced state. If the pattern is DC balanced, the bias voltage level reaches equilibrium in a global sense some time after the start. System performance measures such as BER and ISI are usually of interest only after this point. An example of a pattern (1010… in this case) settling over many UIs is shown in Figure 10, where equilibrium is reached about 180 UI from the beginning.

Figure 10 Settling 10… Pattern over Time

• Consideration-3 : ISI vs. CID
As discussed in section 2, the longer the CID, the larger the ISI in general, but the actual relationship is more complicated and depends on the polarities of the cursors of the system’s single bit response. This detail needs to be taken into account in practical killer pattern design. Thus the “1...10” bit-segment and the “0…01” bit-segment, with finer structure in their CID parts if needed, become the core parts of a killer pattern.

• Consideration-4a : Pathological Pattern
By repeating the core part of the same polarity within the span of the settling time, the DC bias is driven away from equilibrium in a short time, which results in the pathologically worst situation. An example is shown in Figure 11.

Figure 11 Pathological Killer Pattern Response

• Consideration-4b : PRBS-equivalent Pattern
Two distinct characteristics of the PRBS pattern are taken into account.
One is that the maximum CID of a PRBS 2^N-1 pattern is N. The other is that, because of the way a PRBS pattern is generated by an LFSR, larger CIDs tend to cluster, as observed in Figure 12, which causes a large local deviation of the DC bias from its global equilibrium. This is why the worst-case ISI does not always occur right after the largest CID but somewhere near the cluster of large CIDs, even if all the cursors of the single bit response have the same polarity (monotonic ISI vs. CID). The simplest way to approximate this clustering is to repeat the core bit-segment of the same polarity two or three times. An example is shown in Figure 13.

Figure 12 PRBS 2^10-1 Pattern Response

Figure 13 PRBS-equivalent Pattern Response

3.5 Killer Pattern Length vs. CID

The lengths of the two types of killer patterns and of PRBS patterns are shown in Figure 14. A settling time of 200 UIs is assumed, and a 1010... bit-segment as long as the settling time is inserted between adjacent core bit-segments of different polarities. Note that the length of the killer patterns does not explode with increasing CID, which makes accurate transistor-level SPICE simulation and full-waveform measurement by real-time oscilloscope for jitter and noise investigation practical.

Figure 14 Pattern Length vs. CID (400 “padding” bits)

4. Applications of Killer Pattern

Four topics are discussed in this section. While they are not directly related to each other, we have found them very useful and important in our investigation of the relationship between ISI and signal bit patterns.

4.1 Channel ISI Measurement

To evaluate the correlation between a PRBS pattern and a PRBS-equivalent killer pattern of the same CID, the peak-to-peak ISI of the signal at the output of a lossy backplane channel was measured. The measurement setup is shown in Figure 15.

Figure 15 ISI Measurement Set Up for Backplane Output Signal

The measured ISIs are shown in Figure 16.
The difference between the PRBS pattern and the equivalent killer pattern is within +/-5%, which is good enough in many cases for simple killer patterns.

Figure 16 Correlation between PRBS Pattern and Killer Pattern

4.2 Equalization Effect on Channel Loss

Each killer pattern is generated from the single bit response of an LTI system. This means that the killer pattern for a high-speed serial link without equalization and the killer pattern for the same link with equalization are different. While needing different killer patterns for different link configurations is not very convenient, analyzing each killer pattern leads to a deeper understanding of how a lossy channel is equalized and of the consequences for the resulting ISI.

A channel output waveform with PRBS 2^7-1 without equalization is shown in Figure 17, along with the digital input pattern and the resulting time interval errors (TIEs) due to ISI at each state transition. Bit errors occurred (indicated by missing TIE line segments in the TIE vs. time plot) around single-bit locations surrounded by large CIDs, as expected from the discussion in sections 2 and 3.

Figure 17 Waveform and TIE with PRBS2^7-1 without TX PreEmphasis

The output waveform of the same channel with equalization by TX PreEmphasis, and the associated TIEs, are shown in Figure 18. No bit error occurred in this case. The bit locations of the worst-case TIEs due to ISI are different from the bit error locations without equalization. While it may be difficult to analyze the bit errors and/or worst TIEs by studying them with a PRBS pattern alone, one gains deeper insight by using the single bit response and a killer pattern based on it.
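To make the link between a single bit response and a killer pattern concrete, the following minimal Python sketch derives a core bit-segment from a set of post-cursors (section 2.4) and assembles a PRBS-equivalent pattern with 1010… padding (sections 3.4 and 3.5). The cursor values, the function names, the repetition count of 2, and the restriction to post-cursors are all assumptions chosen for illustration; this is a sketch of the idea, not the generation algorithm used for the results in this paper.

```python
def isi_at_target(post_cursors, preceding_bits):
    """ISI at the target sample; preceding_bits[k-1] is the bit k UIs earlier (0/1)."""
    return sum(c * b for c, b in zip(post_cursors, preceding_bits))


def worst_case_preceding_bits(post_cursors):
    """Send a 1 only where the cursor is positive, so no contribution cancels."""
    return [1 if c > 0 else 0 for c in post_cursors]


def core_segment(post_cursors):
    """'1...10'-type core in transmit order: stressed preceding bits, then a 0 target."""
    preceding = worst_case_preceding_bits(post_cursors)
    return preceding[::-1] + [0]


def killer_pattern(post_cursors, repeats=2, settling_ui=200):
    """Padding + repeated core + padding + repeated inverted core."""
    padding = [1, 0] * (settling_ui // 2)
    core = core_segment(post_cursors)
    inverted = [1 - b for b in core]   # opposite-polarity core keeps DC balance
    return padding + core * repeats + padding + inverted * repeats


# Hypothetical post-cursors (fractions of the main cursor); the third and
# fifth have opposite polarity, as in the non-monotonic case of section 2.4.
cursors = [0.30, 0.15, -0.05, 0.04, -0.02]
all_ones = isi_at_target(cursors, [1] * len(cursors))
worst = isi_at_target(cursors, worst_case_preceding_bits(cursors))
pattern = killer_pattern(cursors)

print("ISI with plain CID:", round(all_ones, 3), "with flipped bits:", round(worst, 3))
print("core:", core_segment(cursors), "pattern length:", len(pattern))
print("ones:", sum(pattern), "zeros:", len(pattern) - sum(pattern))
```

With these assumed cursors, the segment with bits flipped at the opposite-polarity cursors yields a larger ISI than the plain run of ones. Using the complemented core for the opposite polarity keeps the numbers of 1s and 0s equal, as required by Consideration-1, and the total length is the 400 padding bits plus the repeated cores.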
Figure 18 Waveform and TIE with PRBS2^7-1 with TX PreEmphasis

The single bit responses of this system with and without equalization are shown in Figure 19. Because the polarity characteristics of their pre- and post-cursors are very different, we expect bit errors or worst-case ISI to occur at very different bit locations.

Figure 19 Single Bit Response with/without TX PreEmphasis

The channel output waveforms and the associated TIEs with the same equalization for two types of killer patterns (KP_noPreE and KP_eh) are shown in Figure 20 and Figure 21. The killer pattern KP_noPreE was generated from the single bit response shown in Figure 19 (a), and the killer pattern KP_eh was generated from the single bit response shown in Figure 19 (b). The overall eye height and eye width with each pattern are summarized in Table 2. Note that the eye with the killer pattern that takes the equalization into account is worse than the eye with the killer pattern that does not, which is expected from the preceding discussion.

Figure 20 Waveform and TIE with TX PreEmphasis with Killer Pattern : KP_noPreE

Figure 21 Waveform and TIE with TX PreEmphasis with Killer Pattern : KP_eh

Table 2 Eye Width and Height with Three Patterns

4.3 Worst Case Eye Height and Eye Width

While the eye height with the KP_eh pattern is worse than the eye height with the PRBS pattern in Table 2, the eye width with the KP_eh pattern is better than the eye width with the PRBS pattern. This can be explained with the exaggerated illustration in Figure 22 as follows. Let’s model ISI as noise superimposed on a clean signal/eye. Depending on where on the eye the noise is superimposed, there are two extreme cases. In one, the noise is superimposed in the middle of the eye; in the other, it is superimposed in the middle of the state transitions. Note that the former reduces the eye height while the latter reduces the eye width.
As discussed in section 4.2 with Figure 19 (b), the killer pattern KP_eh was generated by considering the pre/post cursors that are offset from the main cursor by integer multiples of a UI. Since this corresponds to the noise-in-the-middle-of-the-eye scenario, the resulting eye with this pattern has the worst-case eye height.

Figure 22 Two Types of Stress on Eye

We can now design a killer pattern that stresses eye width more than eye height by considering the pre/post cursors that are offset from the main cursor by (n+0.5)*UI (n=+/-1,+/-2,…), as shown in Figure 23.

Figure 23 Single Bit Response with Half-UI-Shifted Cursors

Let’s call this new killer pattern KP_ew. The resulting waveform and the associated TIEs are shown in Figure 24, and the overall eye height and eye width with the three patterns (PRBS, KP_eh stressing eye height, and KP_ew stressing eye width) are summarized in Table 3.

Figure 24 Waveform and TIE with Eye-Width Stress Pattern

Table 3 Eye Width and Height with Three Patterns

4.4 Channel Loss and Signal Spectrum

Though the Fourier transform is the tool most frequently used to study the frequency content of a signal, it is not always the best one. Below we review the Fourier spectrum and another type of spectrum (the short-time Fourier spectrum) of a PRBS pattern and a killer pattern.

4.4.1 Spectrum by Fourier Transform

The Fourier spectra of the PRBS 2^7-1 signals at the source, at the channel output without TX PreEmphasis, and at the channel output with TX PreEmphasis are shown in Figure 25, along with the channel’s insertion loss (red curve).

Figure 25 Fourier Spectra with PRBS 2^7-1 Pattern

The Fourier spectra of the KP_eh signals at the source, at the channel output without TX PreEmphasis, and at the channel output with TX PreEmphasis are shown in Figure 26, along with the channel’s insertion loss (red curve).
Though KP_eh seems to have very high energy at around 6.25GHz, this is because many 1010… bit-segments are used in this pattern at 12.5Gbps to provide a neutral bias level, not because the 1010… bit-segment is an essential feature of the ISI vs. bit pattern relationship. As a way to address this issue, a modified Fourier transform is discussed in the next section.

Figure 26 Fourier Spectra with KP_eh Pattern

4.4.2 Spectrum by Short Time Fourier Transform

The issue with the Fourier transform for our application is that it yields the “average” spectrum of the entire signal, while the distinct characteristics of ISI vs. bit pattern are “localized,” as discussed in the preceding sections. Various signal processing methods have been developed to study signals in the frequency domain and the time domain simultaneously; they are collectively called joint time-frequency analysis/representation [6][7]. Among them, we chose the Short Time Fourier Transform (STFT) to study our signals in this paper because its interpretation is much more straightforward than that of other methods, which makes it a suitable starting point. The STFT is also called the Windowed Fourier Transform, and it is formulated by the equation below.

$$\mathrm{STFT}\{x(t)\} \equiv X(\omega, \tau) = \int_{-\infty}^{\infty} x(t)\, w(t - \tau)\, e^{-j\omega t}\, dt$$

The idea is that a segment of short time duration is extracted from the signal x(t) by a window function w(t), and the Fourier transform is applied to this short segment. By sliding the time window, the entire signal is analyzed, and the spectrum of each local portion of the signal is obtained. The magnitude of the STFT is called the spectrogram, which is often visualized with time on the abscissa, frequency on the ordinate, and magnitude as a color code.

The results of the joint time-frequency analysis of our signals are shown below. Figure 27 shows the spectrograms of the source signal and of the channel output signal without TX PreEmphasis, along with the time-domain waveforms.
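As a rough illustration of the windowed transform defined in this section, the following minimal Python sketch computes a discrete spectrogram of an ideal NRZ waveform. The Hann window, the window length of 8 UIs, the 4 samples per UI, the test bit pattern, and all function names are assumptions made for this example; they are not the analysis settings used for the figures in this paper.

```python
import cmath
import math


def nrz(bits, samples_per_ui):
    """Ideal NRZ waveform with levels -1/+1, samples_per_ui samples per bit."""
    return [2.0 * b - 1.0 for b in bits for _ in range(samples_per_ui)]


def spectrogram(x, win_len, hop):
    """|STFT| of x: Hann-windowed DFT magnitude per frame, bins 0..win_len/2."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / win_len) for n in range(win_len)]
    frames = []
    for start in range(0, len(x) - win_len + 1, hop):
        # Extract one local segment with the sliding window w(t - tau).
        seg = [x[start + n] * window[n] for n in range(win_len)]
        frames.append([
            abs(sum(seg[n] * cmath.exp(-2j * math.pi * k * n / win_len)
                    for n in range(win_len)))
            for k in range(win_len // 2 + 1)
        ])
    return frames


def peak_bin(frame):
    """Index of the strongest frequency bin in one frame."""
    return max(range(len(frame)), key=lambda k: frame[k])


SAMPLES_PER_UI = 4
WIN_LEN = 8 * SAMPLES_PER_UI                 # 8 UI window (assumed)
bits = [1, 0] * 8 + [1] * 16 + [0] * 16      # 1010... segment, then long CIDs
frames = spectrogram(nrz(bits, SAMPLES_PER_UI), WIN_LEN, hop=WIN_LEN)

for m, frame in enumerate(frames):
    # Bin k corresponds to k * SAMPLES_PER_UI / WIN_LEN cycles per UI.
    cycles_per_ui = peak_bin(frame) * SAMPLES_PER_UI / WIN_LEN
    print(f"frame {m}: strongest component at {cycles_per_ui} cycles/UI")
```

In this sketch the frames inside the 1010… segment peak at 0.5 cycles per UI (half the bit rate, i.e. 6.25GHz at 12.5Gbps), while the frames inside the long CIDs peak at DC. A single Fourier transform over the whole waveform would merge these localized features into one average spectrum.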
It is noted in the source signal that large CID portions exhibit strong low-frequency energy around 5~6ns, while short CID portions such as 1010… exhibit strong high-frequency energy at 6~6.25GHz around 4.5~5ns, 7ns, and 9ns. In the channel output signal, on the other hand, the high-frequency energy at 6~6.25GHz, which corresponds to CID=1, is significantly reduced around 4.5~5ns, 7ns, and 9ns.

Figure 27 Spectrogram of Source and Channel Output without TX PreE with PRBS 2^7-1

Figure 28 shows the spectrogram of the channel output signal with TX PreEmphasis, along with the time-domain waveform. The high-frequency energy around 4.5~5ns, 7ns, and 9ns is better maintained thanks to the channel equalization. This feature, however, is not exhibited very well for the CID=1 bit surrounded by large CIDs around 6ns, which is an important pattern from the viewpoint of worst-case ISI.

Figure 28 Spectrogram of Channel Output with TX PreE with PRBS 2^7-1

By using the KP_eh pattern instead of the PRBS pattern, the spectral features of high-frequency and low-frequency signals coexisting within a short time duration are exhibited better, including the effect of TX PreEmphasis on the high-frequency signal, as shown in Figure 29 and Figure 30. The coexistence of high-frequency energy and low-frequency energy is clearly observed in the 5~6ns segment.

Figure 29 Spectrogram of Source and Channel Output without TX PreE with KP_eh

Figure 30 Spectrogram of Channel Output with TX PreE with KP_eh

4.5 Limitations and Future Work

While it is possible to design a killer pattern that approximates the worst-case ISI very well both vertically and horizontally, that is, one that reproduces the inner eye of the eye diagram as shown in Figure 31, such a killer pattern is not unique, and it currently requires much more effort than designing a simpler killer pattern. Developing a more universal algorithm to automatically generate more accurate PRBS-equivalent killer patterns is our next step.
Figure 31 Eye Diagram Overlay of PRBS2^10-1 and Equivalent KP

Another item of future work is to improve the time resolution of the joint time-frequency analysis so that ISI-prone bit patterns can be detected even in a PRBS pattern. Though the coexistence of high-frequency energy and low-frequency energy is clearly observed in Figure 29 and Figure 30, their time resolution is not good. This limitation of the STFT is due to its fixed time window size, and it has been overcome or significantly alleviated in other fields by wavelet transforms and similar methods [7]. This will also be our next step.

5. Summary and Conclusion

The length of the patterns used for transistor-level simulation at best accuracy is limited to a few hundred to a few thousand bits because of the very long simulation time. The maximum length of the patterns measured by a real-time oscilloscope is limited to a few thousand bits because of the required memory size. On the other hand, some communication standards call for PRBS 2^31-1 (~2 billion bits). To fill the gap between these real-life constraints and the ever-increasing demand, we have studied the worst-case ISI mechanism in an LTI system and developed the killer pattern concept, a killer pattern generation algorithm, and a methodology for analyzing ISI vs. bit pattern.

We began by reviewing the basic ISI mechanism in an LTI system that approximates a high-speed serial link. We then studied the PRBS pattern in detail and proposed two worst-case ISI scenarios, the pathological worst case and the PRBS-equivalent worst case, together with distinct short pattern segments that induce them. By applying joint time-frequency analysis to our time-domain signals, we studied the spectra vs. bit pattern to gain insight into the time-varying frequency content of the signals, which cannot be done with the standard Fourier transform.

6. Acknowledgement

We thank our colleagues Kaiyu Ren for the backplane channel ISI measurement and Hsinho Wu for his suggestions and discussion on some topics.

7. References

[1] http://www.ieee802.org/3/ae/public/index.html
[2] R.E. Ziemer, W.H. Tranter, D.R. Fannin, Signals & Systems, Prentice Hall, 1996
[3] B.K. Casper, M. Haycock, R. Mooney, An Accurate and Efficient Analysis Method for Multi-Gb/s Chip-to-chip Signaling Schemes, IEEE Symposium On VLSI Circuit Design, 2002
[4] http://www.t10.org/ftp/t11/pub/t11.3/98w147r0.pdf
[5] http://www.ieee802.org/3/ae/public/jan01/taborek_2_0101.pdf
[6] Shie Qian, Dapang Chen, Joint Time-Frequency Analysis, Prentice Hall, 1996
[7] Barbara Burke Hubbard, The World According to Wavelets, A K Peters, Ltd., 1998