Understanding the ParBERT Generator/Analyzer modules
by Aldo Hoyt - November 30, 2004

Note: The description in this paper is only an attempt to provide a functional view into the “black box” and is not intended as an accurate description of the hardware architecture.

To simplify matters, we will deal with a single channel. Please refer to the following diagram as we discuss the module:

[Diagram of the module. SR = segment resolution, FM = frequency multiplier]

Behind each channel there is a memory bank, a sequencer and a multiplexer.

The multiplexer

We usually think of time division multiplexing as a means of combining many channels on one line for economy of long distance transmission. The ParBERT, however, takes advantage of another value of multiplexing: parallel-to-serial conversion.

Handling data at high rates presents a number of challenges that tend to raise the cost of equipment, degrade the quality of the signal and introduce errors. For this reason, it is preferable to process data at lower speeds whenever possible and only use high rates as needed. The ParBERT generator handles parallel data at lower rates throughout most of the system and then uses the parallel-to-serial conversion capability of the multiplexer to achieve the higher rates at the output.

To this end, the memory is arranged in multiple pages. It is the function of the sequencer to select the appropriate bits from these pages and present them as words[1] to the multiplexer. The words received from the sequencer are clocked out by the multiplexer at a rate determined by the system clock and the frequency multiplier (FM).

Output bit rate = System clock frequency * FM

The memory

Each channel stores its data in multiple memory pages. The E4832A modules provide 16 pages of memory for each channel while the E4861A modules provide 64 pages for each channel. Each page is 128k bits deep (the memory pages in the E4841A are only 64k bits deep).
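The relationship between pages, words and the serial output can be illustrated with a minimal Python sketch. It is only a functional model in the spirit of this paper: the function names are hypothetical, the 100MHz clock is an arbitrary illustrative value, and the order in which bits from different pages are interleaved is an assumption, not a description of the hardware.

```python
from fractions import Fraction


def output_bit_rate(system_clock_hz, fm):
    """Output bit rate = system clock frequency * FM."""
    return float(system_clock_hz * fm)


def words_from_pages(pages):
    """Form one word per step by taking one bit from each selected page.

    Assumption: page 0 supplies the first bit of each word.
    """
    return [list(bits) for bits in zip(*pages)]


def serialize(words):
    """Parallel-to-serial conversion: clock each word out bit by bit."""
    return [bit for word in words for bit in word]


# Two pages, three bits deep (illustrative data only).
pages = [[1, 0, 1],
         [0, 0, 1]]
words = words_from_pages(pages)     # [[1, 0], [0, 0], [1, 1]]
stream = serialize(words)           # [1, 0, 0, 0, 1, 1]

print(stream)
print(output_bit_rate(100e6, 4))               # 400000000.0
print(output_bit_rate(100e6, Fraction(1, 4)))  # 25000000.0
```

The sequencer only ever handles whole words at the lower rate; the full output rate appears only after `serialize`, which stands in for the multiplexer.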
Each memory page selected by the sequencer presents one bit at a time to the multiplexer.

The sequencer

What is called a “sequencer” in this paper determines which bits are output and in what order. One might picture the sequencer as a small “pick-and-place” mechanism that takes bits from the memory and places them on the “conveyor-belt” we call an output multiplexer. If the sequencer places one bit on the conveyor-belt each second, the bit-rate will be 1 bit per second. However, if it places 16 bits on the conveyor-belt each second (word size = 16), the bit-rate will be 16 bits per second.

__________
[1] Word: a collection of bits recognized as the information unit that the device will process in one clock cycle, sometimes referred to as a “machine word.” The size of a word is determined by the architecture of the device.

It is the sequencer that determines which memory pages will be used and in which order they will be accessed. The number of pages to be used in forming the output word is a function of segment resolution and frequency multiplier.

Word size = SR * FM

Possible word sizes depend on the hardware architecture of the data module:

For the E4832A: 1, 2, 4, 8, 16 bits
For the E4861A: 16, 32, 64 bits
For the E4861B: 1, 2, 4, 8, 16, 32, 64, 128 bits
For the E4867A: 256 bits
For the N487xA: 32, 64, 128, 256, 512 bits

These words are presented to the multiplexer at a rate that is a function of system clock frequency and segment resolution.

Sequencer rate = System clock frequency / SR (segment resolution)

Implications

The interdependency of variables causes the limitations at one point of the system to result in limitations elsewhere. Examples:

1. Word size in the E4832A is limited to the range of 1 to 16 bits. Consequently, no SR or FM can be selected which will cause word size to be smaller than 1 or greater than 16. For example: if a fractional frequency multiplier of ¼ is selected, segment resolution must be >= 4. With SR = 4, word size = 4 * ¼ = 1.

2. The sequencer clock cannot exceed 41.66MHz. Therefore any system clock frequency or SR that would attempt to make the sequencer go any faster than 41.66MHz is not allowed.

3. Word size also affects available memory. To get the maximum memory, we must use all the available memory pages. With a segment resolution of 4 and a frequency multiplier of 1, only four pages are used. In this case, memory depth is limited to 512k bits.

Effective memory depth = SR * FM * 128k

4. The maximum word size of the E4861A is 64 bits. Any frequency multiplier and segment resolution that combined would require a word size greater than 64 bits is not allowed.

5. Combining the E4832A and E4861A in one clock group further limits the system. Frequency and SR limitations of one module affect the range of the other. With only an E4832A, a system can go down to 1kHz[2]. However, when you add an E4861A to the same clock group, the minimum frequency is much higher. For example, with any of the following settings, the lowest frequency available is 5.21MHz:

SR[3] = 4, E4832A FM = 1/4 and E4861A FM = 16
SR[3] = 8, E4832A FM = 1/8 and E4861A FM = 8
SR[3] = 16, E4832A FM = 1/16 and E4861A FM = 4

The greatest range of frequency will be obtained by minimizing word size in the E4832A and maximizing word size in the E4861A. Thus, the lowest frequency in the E4832A cannot be less than 1/64th of the frequency of the E4861A. Conversely, the highest frequency of the E4861A cannot be more than 64 times the frequency of the E4832A. Since the lowest frequency in the E4861A is 333.334MHz, the lowest frequency of the E4832A when in the same clock group as the E4861A is 5.208MHz (333.334MHz divided by 64 = 5.208MHz).

As we have seen, when the E4832A is set to 5.2MHz, the highest rate in the E4861A is only 333.334MHz instead of 2.67GHz.
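The formulas and limits used in these examples can be checked with a short Python sketch. It simply encodes the relationships stated in this paper (word size, sequencer rate, effective memory depth, the 41.66MHz sequencer limit). The function names are hypothetical, and the 20.832MHz system clock is derived here as 4 * 5.208MHz for illustration; it is not a value quoted by the instrument.

```python
from fractions import Fraction

# Allowed word sizes per module, as listed in this paper.
WORD_SIZES = {
    "E4832A": (1, 2, 4, 8, 16),
    "E4861A": (16, 32, 64),
}
MAX_SEQUENCER_HZ = 41.66e6   # sequencer clock limit
PAGE_DEPTH_KBITS = 128       # depth of one memory page, in k bits


def word_size(sr, fm):
    """Word size = SR * FM."""
    return sr * Fraction(fm)


def sequencer_rate(system_clock_hz, sr):
    """Sequencer rate = system clock frequency / SR."""
    return system_clock_hz / sr


def effective_depth_kbits(sr, fm):
    """Effective memory depth = SR * FM * 128k, expressed in k bits."""
    return word_size(sr, fm) * PAGE_DEPTH_KBITS


def is_allowed(module, system_clock_hz, sr, fm):
    """True if the word size is legal for the module and the
    sequencer stays at or below its maximum rate."""
    return (word_size(sr, fm) in WORD_SIZES[module]
            and sequencer_rate(system_clock_hz, sr) <= MAX_SEQUENCER_HZ)


print(word_size(4, Fraction(1, 4)))     # 1
print(effective_depth_kbits(4, 1))      # 512

# The three mixed clock group settings: (SR, E4832A FM, E4861A FM).
settings = [(4, Fraction(1, 4), 16),
            (8, Fraction(1, 8), 8),
            (16, Fraction(1, 16), 4)]
for sr, fm_32, fm_61 in settings:
    ratio = word_size(sr, fm_61) / word_size(sr, fm_32)
    print(sr, word_size(sr, fm_32), word_size(sr, fm_61), ratio)
```

In all three settings the E4832A word size is 1 and the E4861A word size is 64, which is where the factor of 64 between the two output rates comes from.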
If the settings were changed to obtain an upper frequency of 2.7GHz, the lowest frequency you could get from the E4832A would be 41.667MHz.

__________
[2] Note: synchronization is not available at this rate. Minimum rate for auto synchronization is 20.83MHz.
[3] When combined in the same clock group, the segment resolution is the same for all modules. The frequency multiplier, however, can be different.

Automatic Synchronization

The 81250A offers two principal methods of automatic synchronization: auto bit synchronization and auto delay alignment. Understanding the general concepts of synchronization will help us define the necessary parameters.

Auto delay alignment

Given an initial estimated setting for the delay between the time when the generator outputs a bit and when the same bit is sampled by the analyzer, the analyzer adjusts its sampling point until it gets a BER that is below the threshold. The range of this adjustment is limited in order to conserve time. For the E4832A modules this range is ±50ns and for the E4861x modules it is ±10ns.

If the propagation delay from the generator to the analyzer is known with this level of accuracy, an initial delay can be set and auto delay alignment will work well. If it is not known, a one-time manual discovery process can yield this parameter. Once this delay is saved as a setting, auto delay alignment should be able to find the eye each time. If the propagation delay varies, as in the case of an elastic buffer, auto delay alignment will probably not be able to find the eye each time.

If generator and analyzer are in different clock groups, in addition to having a common clock reference, there must be a triggering mechanism that will guarantee that the start time of the two clock groups will always be related in the same way.

Auto bit synchronization

Auto bit synchronization only adjusts delay within a range of one UI (unit interval), in 1/8 UI increments, in order to find an acceptable sampling point.
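The two search ranges described so far can be put into numbers with a small Python sketch. It is a simplified model under stated assumptions: the function names are hypothetical, the alignment range is treated as a window centered on the initial delay setting, and the 333.334MHz bit rate is used only as an example.

```python
# Auto delay alignment search range around the initial delay, in seconds.
ALIGN_RANGE_S = {"E4832A": 50e-9, "E4861x": 10e-9}


def delay_alignment_in_range(module, initial_delay_s, true_delay_s):
    """True if the real propagation delay lies within the window that
    auto delay alignment searches around the initial setting."""
    return abs(true_delay_s - initial_delay_s) <= ALIGN_RANGE_S[module]


def bit_sync_delays(bit_rate_hz, steps_per_ui=8):
    """Candidate sample delays for auto bit synchronization:
    one UI covered in 1/8 UI increments."""
    ui = 1.0 / bit_rate_hz
    return [k * ui / steps_per_ui for k in range(steps_per_ui)]


# An initial estimate that is 30ns off is fine for the E4832A,
# but outside the narrower window of the E4861x.
print(delay_alignment_in_range("E4832A", 100e-9, 130e-9))  # True
print(delay_alignment_in_range("E4861x", 100e-9, 130e-9))  # False

# At 333.334MHz one UI is about 3ns, so each step is about 0.375ns.
for delay in bit_sync_delays(333.334e6):
    print(f"{delay * 1e9:.3f} ns")
```

The contrast between the two ranges is the point: auto delay alignment can absorb tens of nanoseconds of uncertainty, while auto bit synchronization only moves the sampling point within a single bit period.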
The pattern is matched by comparing two shift registers. Each analyzer channel has a couple of shift registers associated with it. A comparator between the two constantly monitors this relationship and reports when the received bits in one register are the same as the expected bits in the other register.

Pure PRBS/PRWS patterns

If the expected data is generated in the shift register, as in the case of Pure PRBS/PRWS patterns, this register is programmed with the appropriate loop-back paths for the desired polynomial. Loading the register with any valid portion of the PRBS bit sequence will cause the expected data shift register to be “seeded” and continue to generate the bit sequence according to the functionality of that particular polynomial.

When the analyzer is started, it forces the first n bits of the expected data polynomial to the value of their counterparts in the received data shift register. If this pattern is valid, all subsequent bits should also match, since they are created with the same polynomial.

Of course, there are situations when this may not prove to be true. One such case would be if the wrong polynomial were programmed into the register. Another, less obvious case would be if the sample point is close to the edge of the eye, resulting in uncertain data.

To handle the latter condition, after starting the analyzer, a bit error rate measurement is made and if it is above the BER threshold, the sample delay is adjusted by 1/8 UI and the received data register is “re-seeded.” This process is repeated until synchronization is achieved or the delay has been advanced through one UI.

After advancing the delay one UI, if the analyzer cannot sync, it is assumed that there is another error which can only be resolved by the observant test engineer. At this point the system quits trying to synchronize.

If the system fails to synchronize, any data received is assumed to be “noise” and therefore not reported.
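The seeding idea can be demonstrated with a small software model. The sketch below is an illustration only: it assumes a PRBS 2^7-1 sequence with the polynomial x^7 + x^6 + 1 (this paper does not name a specific polynomial), the function names are hypothetical, and a Python list stands in for each shift register.

```python
def prbs7(seed_bits, count):
    """Continue a PRBS7 sequence (x^7 + x^6 + 1) from 7 seed bits.

    seed_bits holds the 7 most recent bits, oldest first.
    """
    reg = list(seed_bits)
    out = []
    for _ in range(count):
        new = reg[0] ^ reg[1]      # feedback: s[k-7] XOR s[k-6]
        out.append(new)
        reg = reg[1:] + [new]      # shift the new bit in
    return out


def seed_and_count_errors(received, n=7, check=64):
    """Seed the expected register with the first n received bits,
    then count mismatches over the next `check` bits."""
    expected = prbs7(received[:n], check)
    actual = received[n:n + check]
    return sum(e != a for e, a in zip(expected, actual))


# Generator side: an arbitrary non-zero start state.
start = [1] * 7
sent = start + prbs7(start, 200)

# The analyzer starts listening at an arbitrary point in the stream.
received = sent[37:]
print(seed_and_count_errors(received))   # 0: any valid 7 bits seed correctly

# A single bad bit in the seed (sampling near the eye edge, for example)
# makes the expected sequence diverge from the received one.
bad = list(received)
bad[3] ^= 1
print(seed_and_count_errors(bad) > 0)    # True
```

This is why a corrupted seed shows up as a BER above the threshold: the error does not stay confined to one bit, because the expected register keeps generating from the wrong state.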
After a failed synchronization, the Actual BER display in the GUI remains blank and the total received bits as well as bits received in error are reported to be zero.

Memory-based patterns

In the case of memory-based patterns, auto bit synchronization is slightly different. While auto bit synchronization immediately starts shifting the expected register-based pattern through the register when run, it does not shift the expected memory-based pattern through the register until a 48-bit match is found. Thus, the memory-based pattern should start with a unique bit sequence at least 48 bits long. This unique 48-bit sequence is loaded into the shift register at run, but remains static until the comparator issues a start trigger.

This explains why auto bit synchronization is not possible in a single clock group when using memory-based patterns: the generator has to be “started” in order to generate the pattern, but the analyzer must not be “started” until the 48-bit unique sequence is found in the received data stream. Since “starting” is a function of the clock, separate clocks must be used for generator and analyzer.

The sampling point may be near the edge, introducing uncertainty and errors. So, if the 48-bit sequence is not matched within the time it takes to shift the entire pattern through the received data shift register, the delay will be incremented 1/8 UI and the search for a match will resume. This can be done as many as 8 times, shifting through an entire UI. At this point, failure to find a match is considered a sign of problems that must be resolved by the test engineer, and the system does not attempt to synchronize any longer.
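The search for the 48-bit start sequence can be modeled in a few lines of Python. This is a functional sketch only: the function names are hypothetical, `sample_at` is an assumed stand-in for capturing the received bits at a given delay step, and the real analyzer performs the comparison in hardware on a running bit stream rather than on a stored list.

```python
HEADER_LEN = 48   # length of the unique start sequence, in bits


def find_header(received, pattern):
    """Return the index where the pattern's first 48 bits occur in the
    received bits, or None if they are not found."""
    header = pattern[:HEADER_LEN]
    for i in range(len(received) - HEADER_LEN + 1):
        if received[i:i + HEADER_LEN] == header:
            return i
    return None


def auto_bit_sync(sample_at, pattern, steps=8):
    """Try each 1/8 UI delay step in turn.

    sample_at(step) is a hypothetical callable returning the bits
    captured with the sample delay set to step/8 UI.
    Returns (step, index) on success, or None after 8 failed attempts.
    """
    for step in range(steps):
        index = find_header(sample_at(step), pattern)
        if index is not None:
            return step, index
    return None


# A pattern whose first 48 bits (47 ones and a zero) occur nowhere else.
pattern = [1] * 47 + [0] + [1, 0] * 100

# The analyzer starts listening somewhere in the middle of the pattern.
clean = pattern[100:] + pattern + pattern[:50]
garbage = [0] * 300   # stands in for unusable data sampled near the eye edge


def sample_at(step):
    # Assumption for this demo: the first three delay steps are unusable.
    return clean if step >= 3 else garbage


print(auto_bit_sync(sample_at, pattern))              # (3, 148)
print(auto_bit_sync(lambda step: garbage, pattern))   # None
```

The second call shows the give-up behavior: after eight delay steps without a match, the model returns `None`, just as the analyzer stops trying and leaves the problem to the test engineer.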