Gyroscope Angular Rate Processing Across Asynchronous Clock Domains

by Christopher Scott Osborn

Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree of Master of Engineering in Computer Science and Electrical Engineering at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY

June 2002

© Christopher Scott Osborn, MMII. All rights reserved.

The author hereby grants to MIT permission to reproduce and distribute publicly paper and electronic copies of this thesis and to grant others the right to do so.

Author: Department of Electrical Engineering and Computer Science, May 23, 2002

Certified by: David J. McGorty, C.S. Draper Laboratory Principal Engineer, VI-A Company Thesis Supervisor

Certified by: Steven B. Leeb, Associate Professor of Electrical Engineering, M.I.T. Thesis Supervisor

Accepted by: Arthur C. Smith, Chairman, Department Committee on Graduate Theses

Abstract

The Draper Laboratory requires a robust means of digital signal resampling for use in a microelectromechanical inertial measurement system. Complications arise because the complete inertial measurement unit requires a stable global time reference for the purposes of axis rate decoupling and integration, yet each gyroscope must adhere to an independent local time reference that is neither controllable nor known in advance. The digital interface between the host system and its component gyroscopes is therefore completely asynchronous.
Asynchrony implies that sampling rate conversion is necessary, but also that the conversion ratio may be irrational or otherwise highly inconvenient. Traditional sampling rate conversion by rational fractions is thus impractical. This research examines several simple approaches to asynchronous resampling, then builds upon recent work to develop a flexible, efficient embedded resampling system. The new system utilizes a piecewise polynomial impulse response that may be designed to meet arbitrary frequency domain specifications using minimax linear programming methods. A detailed explanation of the design process is provided, along with a reference design implemented in VHDL.

VI-A Company Thesis Supervisor: David J. McGorty
Title: C.S. Draper Laboratory Principal Engineer

M.I.T. Thesis Supervisor: Steven B. Leeb
Title: Associate Professor of Electrical Engineering

Acknowledgments

Researching and preparing this thesis has been an extremely interesting and educational experience for me, and one that would not have been possible without the help of my advisors, Professor Steven Leeb at MIT and David McGorty at Draper. I would also like to thank the rest of the Draper contingent, especially Paul Ward, Nick Homer, Eric Hildebrandt, and Ochida Martinez. Their support, friendship, and wisdom have been much appreciated. Finally, it must be noted that without the balancing influence of my family and my friends, I would have descended into signal processing madness long ago. Thank you Mom, Dad, Jean, Seth, and Max. It's been fun.

This thesis was prepared at The Charles Stark Draper Laboratory, Inc., under Internal Company Sponsored Research Project Number 18520. Publication of this thesis does not constitute approval by Draper or the sponsoring agency of the findings or conclusions contained herein. It is published for the exchange and stimulation of ideas.

Christopher Osborn

Contents

1 Introduction
  1.1 Outline
  1.2 Previous Work
2 The HPG/MMIMU System
  2.1 Overview
  2.2 The MMIMU Interface
  2.3 Multiple Clock Domains
  2.4 Requirements for the HPG/MMIMU Interface Resampler
3 Rate Conversion Strategies
  3.1 The Resampling Operation
  3.2 Zero-Order Hold Resampling
    3.2.1 Improving Zero-Order Hold
  3.3 Linear Interpolation
  3.4 Higher Order Interpolation
4 Polynomial Impulse Response Filters
  4.1 Constructing the Impulse Response
  4.2 Frequency Domain Behavior
  4.3 Optimal Design
    4.3.1 Linear Programming
    4.3.2 Accounting for Phase
    4.3.3 Controlling the Time Domain
5 The Farrow Structure
  5.1 Derivation of the Farrow Structure
  5.2 Limitations of the Farrow Structure
6 The MMIMU Resampler
  6.1 Resampler Design for the HPG/MMIMU
  6.2 Implementation in Hardware
7 Conclusion
  7.1 Results
  7.2 Future Work
A Resampler VHDL Implementation

List of Figures

2-1 HPG instrument top level block diagram.
2-2 HPGD angular rate signal path.
2-3 CIC filter output spectrum for 125 Hz sinusoidal rate.
3-1 Ideal behavior of resampling operation.
3-2 Typical input and output waveforms for a zero-order hold resampler.
3-3 Aliased tones added by zero-order hold resampler; nominal F.
3-4 Impulse response of zero-order hold resampler.
3-5 Frequency response of zero-order hold resampler.
3-6 Aliased tones added by zero-order hold resampler; adjusted F.
3-7 Impulse response of linear interpolation resampler.
3-8 Frequency response of linear interpolation resampler.
3-9 Determination of linear interpolation output sample via convolution.
3-10 Convolution of linear interpolation impulse response with itself.
4-1 Basis functions for n = 0, 1, 2, and 3.
4-2 Shifting of impulse response to restore causality.
4-3 Impulse response with discontinuities at segment boundaries.
4-4 Impulse response with forced continuity of value.
4-5 Impulse response with forced continuity of slope.
5-1 Farrow structure signal flow diagram.
6-1 8-segment resampler impulse response.
6-2 8-segment resampler frequency response.
6-3 FIR antialiasing filter impulse response.
6-4 FIR antialiasing filter frequency response.
6-5 36-segment resampler impulse response.
6-6 36-segment resampler frequency response.
6-7 48-segment resampler impulse response.
6-8 48-segment resampler frequency response.
6-9 Simulated angular rate signal prior to resampling.
6-10 Spectrum of angular rate signal prior to resampling.
6-11 Simulated angular rate signal after resampling.
6-12 Spectrum of angular rate signal after resampling.

Chapter 1 Introduction

1.1 Outline

The Draper Laboratory Micromechanical Inertial Measurement Unit (MMIMU), a microelectromechanical sensor suite, makes use of an architecture that requires accurate asynchronous resampling of discrete-time signals. Asynchronous resampling is any operation by which one real-time sampled signal is converted into another representation of the same signal sampled with a different clock. In contrast with standard digital rate conversion, there are no restrictions on the frequency ratio or relative phase of the input and output sample clocks, and the conversion ratio is assumed to be irrational. Asynchronous resampling is therefore a fundamentally continuous-time operation, despite the discrete-time input and output signals.

This document first provides an overview of the MMIMU system and its components in Chapter 2, explaining the fundamental asynchrony present in the system and why it must be addressed. Chapter 3 then examines two simple means of resampling, zero-order hold and linear interpolation, finding both to be inadequate for the needs of the MMIMU. Chapter 4 develops a generalization of these two methods based on piecewise polynomial impulse responses, and a design procedure useful for generating the coefficients that describe them. Chapter 5 presents the Farrow structure, a means of implementing asynchronous resampling efficiently in digital logic. Chapter 6 returns at last to the MMIMU system, utilizing the methods from preceding chapters to design a suitable asynchronous resampler offering improved frequency response control and slightly reduced implementation complexity.
1.2 Previous Work

The basic concepts behind rate conversion of discrete-time signals follow directly from the Nyquist Sampling Theorem and are well understood as the foundations of discrete-time signal processing. The related problem of asynchronous resampling, on the other hand, has received much less attention. The reasons for this relative neglect are twofold. Asynchronous resampling does not fit comfortably into the conceptual frameworks of either discrete or continuous time, instead requiring elements of both. More important perhaps is the fact that it is only rarely necessary in practical systems. Most systems possess no inherent asynchrony, and may be designed so as to avoid the issue entirely.

One major exception is in the area of graphics processing, where there is a frequent need to resample image data for the purposes of display, transformation, or compositing. The common practice of storing images using frequency domain encodings mitigates this need to some extent by making resampling relatively straightforward. Nevertheless, it is not difficult to find software that performs time-domain asynchronous resampling. The less sophisticated examples are often easy to identify by their visibly noisy output. Higher quality versions typically use spline interpolation, an approach very similar to the polynomial impulse response filters that will be developed later in this thesis.

Uses of one-dimensional asynchronous resampling are comparatively few and far between, but will become much more plentiful in the near future. The most common examples are the so-called fractional delay filters, which may be considered a special case of asynchronous resampling. These filters, typically all-pass with a group delay of less than one sampling period, are useful for echo cancellation and similar applications. References on this subject include [3], [6], [7], and [8].
The general problem of asynchronous resampling becomes more important with each new generation of digital communications technology, where the permissible timing error in symbol sampling is decreasing steadily. Asynchronous resampling may be used to adjust symbol synchronization in digital receivers, decreasing timing error and improving channel bandwidth. Several approaches appear in [1], [2], [12], and [4]. This thesis draws primarily on related work in minimax resampling filter design that appears in [11] and [12].

Chapter 2 The HPG/MMIMU System

2.1 Overview

The Draper Laboratory High Performance Gyroscope project, or HPG, is an ambitious design intended to rival the performance of large conventional gyroscopes in a compact, low power microelectromechanical implementation. Although greater levels of integration are likely in the future, at present the instrument consists of three integrated circuits and measures angular rate about a single sense axis. The Draper Micromechanical Inertial Measurement Unit (MMIMU) uses three such instruments arranged orthogonally and combined with accelerometers, allowing measurement of rotational and translational motion in three dimensions.

The MMIMU and similar instruments are starting to see use in a variety of environments unsuited to large traditional gyroscopes, including small unmanned aircraft, guided artillery projectiles, and automotive systems. Such applications require inertial instruments to have high precision and large dynamic range while at the same time imposing severe size and power constraints. The HPG employs novel microelectromechanical (MEMS) technology and embedded digital signal processing to meet these requirements.

Although this document concerns itself primarily with portions of the HPG digital signal processing architecture, a basic understanding of the overall system will help to place later discussion in context.
The HPG is a complicated mixed-signal design that performs extensive signal processing in both the analog and digital domains. The HPG analog electronics occupy two of the three integrated circuits in the system. The first is the MEMS sensor itself and the other is a custom analog integrated circuit providing amplification, filtering, phase-locking, and digital to analog conversion functions. The fact that these chips are separate owes less to any natural abstraction barrier than to the limitations of MEMS fabrication processes, which are typically inadequate for precision analog designs.

The HPG digital section, referred to collectively as HPGD, occupies the third integrated circuit. HPGD is a synthesizable VHDL design presently implemented in a Xilinx field programmable gate array (FPGA). The digital electronics perform the majority of the HPG's signal processing activities, and also provide a digital interface to the larger MMIMU system.

At the core of the HPG analog section is a tuning fork gyroscope (TFG) sensor, an etched silicon structure consisting of small floating proof masses suspended on elastic silicon supports. The supports hold the proof masses motionless in one dimension, but flex to allow small motions in the remaining two dimensions. This structure, true to its name, is a mechanical oscillator that vibrates with a roughly constant natural frequency. The natural frequency, designated $F_o$, is a function of the structure's geometry and material properties and thus varies significantly over temperature and time, and from one TFG device to the next. $F_o$ may fall anywhere between 12 kHz and 25 kHz, with most devices around 13 kHz. Under static conditions, this oscillation is one dimensional, occurring entirely within the plane of the silicon die. However, when the instrument rotates about its sense axis, it experiences a Coriolis force that induces a perpendicular oscillation, out of the plane of the die.
The out-of-plane oscillation is also at a frequency of $F_o$, but 90 degrees behind the other in phase and with an amplitude proportional to the instrument's rate of angular motion. Capacitive sensors surrounding the tuning fork structure measure the out-of-plane motion, amplify it, and filter it to produce an analog signal $r_c(t)$ consisting of the angular rate modulated on a sinusoidal carrier at $F_o$. Modulation does not take the form of an explicit multiplication, but is inherent in the way that the TFG detects angular motion. The corresponding baseband angular rate signal, $r(t)$, does not exist as an analog quantity anywhere in the HPG system.

Figure 2-1: HPG instrument top level block diagram.

The natural oscillation of the tuning fork structure provides the fundamental time reference used throughout the HPG. Taking the mechanical oscillation as input, an analog phase-locked loop generates two digital clock signals, CLKFo and CLKFs, phase locked and at frequencies $F_o$ and $F_s = 256F_o$, respectively. CLKFs is the fastest clock in the HPG system, between 3 MHz and 6 MHz depending on the TFG. The analog section uses CLKFs to run a sigma-delta modulator (SDM), which produces a digital representation of the carrier band rate signal $r_c(t)$. An SDM is an oversampling analog to digital converter for which the relative density of ones and zeros in the single-bit output signal is proportional to the level of the analog input. Given a sigma-delta modulated signal, it is possible through low-pass filtering to generate a lower-bandwidth signal of greater resolution (more than one bit). The sigma-delta modulated rate signal and the clocks CLKFo and CLKFs are the primary outputs of the HPG analog section. Additional signals required for various compensation functions are also provided to the HPGD, but are of no great importance to this discussion.

A top-level view of the HPG instrument and its connection to the MMIMU appears in Figure 2-1. For simplicity, the figure focuses on the HPG rate channel, and omits a number of elements not directly related to the processing of angular rate. Chief among these are the motor channel, a feedback control loop responsible for maintaining the correct operation of the MEMS sensor, and the compensation variable channel, which processes environmental variables such as temperature for later use by the MMIMU processor.

Figure 2-2: HPGD angular rate signal path.

Once within the HPGD, the angular rate signal path is as shown in Figure 2-2. As mentioned above, it is necessary to low-pass filter and decimate the sigma-delta modulated rate signal to increase the resolution of each sample. However, the HPGD postpones this operation until after demodulation, which consists of multiplying the signal by a digitally synthesized sinusoid in phase with the original carrier. Performing demodulation first has a number of advantages. The most significant is the fact that individual samples of the sigma-delta modulated rate are single-bit two's complement numbers, which greatly simplifies the task of multiplication. The SDM samples can be either 1 or -1, so multiplication results in a sequence of samples that take either the value of the demodulating function or its negative. The result of low-pass filtering this sequence is the same as would result from filtering first, then demodulating. The only difference is the elimination of a potentially large and slow hardware multiplier unit.
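The multiplier-free demodulation described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the HPGD implementation (which is VHDL): the function name `demodulate_one_bit`, the 8-sample carrier, and the example bit pattern are all hypothetical.

```python
import math


def demodulate_one_bit(sdm_bits, carrier):
    """Multiply a +/-1 bit stream by carrier samples using sign selection only.

    sdm_bits: iterable of +1 / -1 values (hypothetical stand-in for SDM output).
    carrier:  iterable of demodulating-sinusoid samples, same length.
    """
    out = []
    for bit, c in zip(sdm_bits, carrier):
        if bit not in (1, -1):
            raise ValueError("SDM samples must be +1 or -1")
        # No multiplier needed: pass the carrier sample or its negative.
        out.append(c if bit == 1 else -c)
    return out


# Hypothetical example data: one carrier cycle of 8 samples and an arbitrary bit pattern.
carrier = [math.cos(2.0 * math.pi * k / 8) for k in range(8)]
bits = [1, -1, -1, 1, 1, 1, -1, -1]
demod = demodulate_one_bit(bits, carrier)
```

Each output is either a carrier sample or its negation, which in hardware would amount to a conditional sign change rather than a full multiplication.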
After demodulation, the rate signal undergoes low-pass filtering and decimation via the cascaded integrator comb (CIC) filter indicated in the figure. The CIC filter downsamples the angular rate signal by a factor of 1024, from $F_s$ to $F_r$, the rate channel sampling frequency. This block serves not only as a low-pass element and decimator, but as a precise self-tuning notch filter. The electrostatic motor mechanism responsible for maintaining the in-plane TFG oscillation adds unwanted harmonics to the angular rate signal. These harmonics are generally stronger in amplitude than the rate signal itself, and will easily obscure it if not properly filtered out. The CIC filter, with periodic nulls in its frequency response, provides the frequency selectivity necessary to remove the harmonics while preserving the rate signal. A complete treatment of CIC filter techniques appears in [5].

Figure 2-3: CIC filter output spectrum for 125 Hz sinusoidal rate.

Figure 2-3 demonstrates the results of the processing up to this stage, for the case where the rate signal is a 125 Hz test tone. The CIC output power spectrum shows no evidence of the motor harmonics, but does include another TFG artifact, a strong tone near 1 kHz. The task of removing this tone falls to the antialiasing filter in Figure 2-2, which also serves to bandlimit the rate signal in preparation for downsampling and output.

2.2 The MMIMU Interface

After demodulating and decimating the angular rate data, the HPGD passes it to the MMIMU processor, where it is combined with information from the other axes. The MMIMU processor compensates each signal for temperature and axis coupling effects, then uses them to maintain a running estimate of spatial position and orientation.
The decoupling and motion integration activities require that the data from all three axes be sampled uniformly, meaning that there is a common time base with equally spaced sampling instants and that the angular rate about each axis is known for each sampling instant. To enforce this timing requirement, the MMIMU processor provides a common sample strobe to all gyroscope axes. This clock, designated SSTB and with frequency $F_i$, is a precise crystal oscillator reference to which all of the tributary HPG axes must adhere.

Each HPG axis defines its own local clock domain, wherein every clock derives from the natural frequency of the tuning fork oscillator. Allowing the different HPG axes to provide sampled data based on their own internal clocks would force the MMIMU processor to reconcile data from sources with different sampling rates and phases. Considering the difficulty and computational complexity of processing non-uniformly sampled signals, defining a common sampling clock is a reasonable approach. However, it does not solve the underlying problem of asynchrony.

2.3 Multiple Clock Domains

The HPG/MMIMU architecture is one of a relatively small class of DSP applications distinguished by their use of multiple asynchronous clocks. Any digital system with asynchronous clocks must contend with a variety of potential problems, such as metastability and the possible violation of register setup and hold times. Handshaking synchronizers and other methods exist to solve such problems relatively easily. However, if the information moving from one clock domain to another is a real-time sampled signal as in the HPG/MMIMU system, more subtle difficulties arise. Although the HPG instrument provides its rate output as a discrete-time signal and the MMIMU processor demands a discrete-time input, the signals are quite different. They are sampled at different rates, the ratio of which may be irrational.
Only very rarely will an MMIMU sample strobe perfectly coincide with a sampling instant on the HPG side of the interface. The rest of the time, the MMIMU processor is requesting values of a discrete-time signal at non-integer indices, where it is undefined. Adhering strictly to the definition of discrete-time signals, the vast majority of angular rate samples transmitted across the interface should be of undefined value. This is mathematically self-consistent but not particularly useful, so it is necessary in some way to discard the constraints of discrete-time signal processing. While the HPG and MMIMU processor are both discrete-time systems unto themselves, the interface between them violates all assumptions of discrete time. The operation that moves the angular rate signal from the HPG to the MMIMU processor, indicated by the Resampler block in Figure 2-2, is thus inherently a continuous-time system.

This thesis seeks to better understand the problem of asynchronous resampling, and how best to solve it in the context of the HPG. It first examines a number of simple approaches to resampling and their potential shortcomings. By implementing and extending recent work ([3], [11], and [12]), it then develops a resampling strategy that better respects the continuous-time nature of the operation, while retaining the advantages of digital signal processing.

2.4 Requirements for the HPG/MMIMU Interface Resampler

The HPG/MMIMU resampler must transition the angular rate signal from the rate channel sampling frequency, $F_r$, to the interface sampling frequency, $F_i$. Implicit in this transition is a reduction in sampling frequency, although due to variation in $F_r$ the precise conversion ratio is variable and unknown. $F_i$ has a fixed value of 600 Hz, while $F_o$ may theoretically vary between 12 kHz and 25 kHz from one tuning fork gyroscope specimen to the next.
Because the vast majority of TFG sensors produced appear at the far low end of this range, the resampler design should assume a rate channel sampling frequency of $F_r = F_o/4 = 3.25$ kHz. The design should accommodate $F_r$ values at the higher end of the range with only a minor change of design parameters and FPGA reprogramming, but any single HPGD does not need to perform optimally across the entire range. At the $F_r$ value for which it is designed, the resampler should pass all signal components below 150 Hz and attenuate all frequencies above $F_i/2$ by at least 60 dB. If the resampler adds signal components, they must be at least 60 dB weaker than the strongest passband component. The resampler should have linear phase in the passband, and any delay it imposes on the signal should be minimized, provided that the frequency response requirements are met.

As part of the HPGD design, the resampler must be fully digital to permit implementation within the existing field programmable gate array. The FPGA is a 236,000 gate device, approximately one quarter of which is available for use by the resampler design. Several clock signals are available, both in the TFG and MMIMU clock domains. The fastest in the TFG domain is CLKFs, at 1024 times the rate channel sampling frequency. In addition to the SSTB signal, there is a faster MMIMU clock that is used to control the serial transmission of data over the MMIMU interface. It is designated BITCLK, and may operate at either 1 MHz or 10 MHz. The resampler may use it, if necessary, to operate finite state machines or other mechanisms that must be synchronous to the output sampling clock.
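The clock relationships stated in this chapter (a fast clock at 256 times the TFG natural frequency, and a rate channel at 1/1024 of the fast clock) can be checked with a short calculation. The sketch below is illustrative only; the function and constant names are hypothetical and do not come from the HPGD design.

```python
# Ratios taken from the text: the PLL multiplies the TFG frequency by 256,
# and the CIC filter decimates the fast clock rate by 1024.
PLL_MULTIPLIER = 256
CIC_DECIMATION = 1024


def derived_rates(f_o_hz):
    """Return (fast clock rate, rate channel sample rate) in Hz for a TFG frequency."""
    f_s = PLL_MULTIPLIER * f_o_hz
    f_r = f_s / CIC_DECIMATION  # equivalent to f_o_hz / 4
    return f_s, f_r


f_s_nominal, f_r_nominal = derived_rates(13000.0)   # nominal 13 kHz device
_, f_r_low = derived_rates(12000.0)                 # low end of the TFG range
_, f_r_high = derived_rates(25000.0)                # high end of the TFG range
ratio_nominal = f_r_nominal / 600.0                 # input/output ratio at the 600 Hz interface
```

For the nominal 13 kHz device this gives a fast clock of 3.328 MHz and a rate channel sample rate of 3250 Hz, so the nominal conversion ratio to the 600 Hz interface is roughly 5.42. Across the 12 kHz to 25 kHz range the rate channel sample rate spans 3000 Hz to 6250 Hz.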
The remainder of this thesis assumes the following parameter values, unless otherwise stated:

Quantity                       Symbol   Value
TFG natural frequency          $F_o$    13 kHz
TFG fast clock frequency       $F_s$    3.3 MHz
Rate channel sample rate       $F_r$    3250 Hz
MMIMU interface sample rate    $F_i$    600 Hz

Chapter 3 Rate Conversion Strategies

3.1 The Resampling Operation

Digital sampling rate conversion is typically considered in terms of upsampling and decimation operations, which either increase or decrease the sampling rate by an integer factor. These operations are relatively simple, both in theory and practice. In order to upsample a signal by a factor $R$, one needs only to insert $R - 1$ zero-valued samples between each pair of existing samples, then filter the resulting signal with a cutoff frequency of half the original sampling rate. The filter fills in the values of the new samples using sinc interpolation or some reasonable approximation thereto. Downsampling by $R$ is similar, consisting of an antialiasing filter followed by decimation, or the removal of samples such that only every $R$th sample remains. The filter, in this case, bandlimits the signal such that it satisfies the Nyquist criterion even at the lower sampling rate. It is possible to cascade any number of upsampling and downsampling operations to yield a final conversion ratio that is a rational fraction. A complete treatment of these concepts appears in [9]. In addition to conceptual simplicity, integer ratio conversion has the advantage of being computationally efficient. Polyphase structures and other optimizations exist to make even relatively inconvenient conversion ratios tractable.

This type of operation is not possible when the ratio between the input and output sampling rates is irrational. Consider a signal $x(n)$, sampled at rate $F_{in}$, which is to be resampled at a lower rate $F_{out}$ to produce $y(n)$. The ratio takes the form

\[ R = \frac{F_{in}}{F_{out}} = D + d \tag{3.1} \]

where $D$ is the integer portion and $d$ is the remaining fraction, which may be irrational [7]. Resampling is generally performed such that if $y(n_0) = x(n_0)$, then $y(n_0 + 1) = x(n_0 + R)$. However, this rule fails if $d \neq 0$, because $x(n)$ is undefined for fractional indices. The correct value for $y(n_0 + 1)$ is neither $x(n_0 + D)$ nor $x(n_0 + D + 1)$, but a point on the underlying continuous-time signal, located in the time interval between these two samples. Where in this interval it falls depends on the parameter $d$, usually termed the fractional interval. Figure 3-1 demonstrates these relationships.

Figure 3-1: Ideal behavior of resampling operation.

The task of the asynchronous resampling block is to convert the angular rate signal sampled at $F_r$ into a different representation of the same signal, sampled at $F_i$. Ideally, the resulting output samples should represent the values of the underlying analog signal at the requested time instants. This is not an unreasonable expectation, as we may assume that the underlying analog signal is properly bandlimited such that it contains only frequency components below $F_r/2$. It is therefore possible, according to the Nyquist Sampling Theorem [9], to perfectly reconstruct the underlying continuous-time signal. That done, it is simply a question of sampling the continuous signal at the appropriate times to produce the output.

Figure 3-2: Typical input and output waveforms for a zero-order hold resampler.

Unfortunately, perfect bandlimited reconstruction of the underlying signal requires ideal low-pass filtering, either in the analog or digital domain. The appropriate filter has a sinc-function impulse response of infinite support, so it is not realizable in practice.
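As a small worked illustration of the decomposition in Equation 3.1, the sketch below splits a conversion ratio into its integer part $D$ and fraction $d$, and lists where successive output samples would fall relative to the input sample grid. It uses the nominal 3250 Hz and 600 Hz rates as an exact rational stand-in purely so that the arithmetic is exact; in the real system the ratio is unknown and possibly irrational. The function names are hypothetical.

```python
from fractions import Fraction
import math


def split_ratio(f_in, f_out):
    """Return (R, D, d) with R = f_in / f_out = D + d and 0 <= d < 1."""
    r = Fraction(f_in, f_out)
    d_int = math.floor(r)
    return r, d_int, r - d_int


def output_positions(f_in, f_out, count):
    """For output samples k = 0..count-1, return (input index, fractional interval).

    Output k falls at input-grid position k * R; the integer part selects the
    preceding input sample and the remainder locates the point between samples.
    """
    r = Fraction(f_in, f_out)
    positions = []
    for k in range(count):
        t = k * r
        base = math.floor(t)
        positions.append((base, t - base))
    return positions


R, D, d = split_ratio(3250, 600)
first_outputs = output_positions(3250, 600, 4)
```

With these assumed rates, $R = 65/12$, so $D = 5$ and $d = 5/12$, and only the first output sample coincides with an input sample; the following ones land 5/12, 5/6, and 1/4 of the way between neighboring input samples.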
Any actual filter based on ideal bandlimited reconstruction will at best be a finite-support approximation to the ideal impulse response, typically generated through windowing, minimax iterative design, or some similar method.

Although ideal bandlimited interpolation using sinc functions is not practical in a real design, there are other ways to achieve a similar effect. Sinc interpolation does nothing more than find a continuous signal that passes through each of the discrete-time samples, with the condition that the signal does not contain any frequency components higher than half of the sampling rate. There is only one such signal that perfectly interpolates the input samples, but other interpolating functions, smooth or otherwise, may be adequate for any given problem.

3.2 Zero-Order Hold Resampling

The simplest means of resampling is zero-order hold interpolation. It is not strictly interpolation at all, because each output sample is a function of only a single input sample. A zero-order hold resampler is purely causal, and produces each output sample simply by picking off the most recent input sample. Figure 3-2 illustrates this behavior.

Zero-order hold resampling is extremely easy to implement in hardware. It requires little more than a register to hold the most recent input sample and a synchronization mechanism that allows the register contents to be read at any time without glitches. There is no explicit interpolation.

Figure 3-3: Aliased tones added by zero-order hold resampler; nominal F. (Horizontal axis: frequency in Hz.)

The results, unfortunately, are not very good. The error between the resampler output for a given sample instant and the underlying analog signal can be significant, as in Figure 3-2. Figure 3-3 demonstrates the effect of zero-order hold resampling from 5 kHz to 600 Hz, showing this same error as it appears in the frequency domain.
The applied angular rate was a 125 Hz sinusoid. Despite being bandlimited to 300 Hz prior to resampling, the resulting rate signal power spectrum includes strong aliased peaks.

Because asynchronous resampling by definition violates important assumptions of discrete time, analyzing it as a discrete-time system will generally fail. It may only be correctly analyzed as a continuous-time (CT) system. Thus, to understand the behavior of the zero-order hold resampler, consider the CT impulse-train interpretation of the analog rate signal $r(t)$:

\[ r_p(t) = \sum_{m=-\infty}^{\infty} r(mT_r)\,\delta(t - mT_r). \tag{3.2} \]

$r_p(t)$ results from multiplying the CT angular rate signal by a train of unit impulses arranged uniformly in time with a period of $T_r$. This operation generates an image of the $r(t)$ spectrum at every multiple of $F_r$. Assuming that $r(t)$ includes no frequency components above $F_r/2$, the images do not overlap. The zero-order hold resampling behavior is equivalent to filtering $r_p(t)$ with a causal rectangular impulse response, then sampling the resulting CT signal at the output sampling frequency $F_i$. The appropriate impulse response, as depicted in Figure 3-4, is

\[ h(t) = \begin{cases} 1, & 0 \le t < T_r \\ 0, & \text{otherwise.} \end{cases} \tag{3.3} \]

The corresponding frequency response is

\[ H(j\omega) = e^{-j\omega T_r/2}\,\frac{2\sin(\omega T_r/2)}{\omega}, \tag{3.4} \]

the magnitude of which appears in Figure 3-5.

Because the next step is to resample the output of this filter at the lower frequency $F_i$ (600 Hz for the MMIMU), the filter's effect on the spectral images of $r_p(t)$ is of great importance. Even if $r(t)$ is bandlimited to $F_r/2$ prior to resampling, $r_p(t)$ retains images at multiples of $F_r$, and energy in these images will fold down to lower frequencies upon sampling. It is necessary to attenuate the images to prevent aliasing while simultaneously preserving desirable signal components in the passband below $F_i/2$. The frequency response of the zero-order hold system shows respectable passband performance, falling only 0.1 dB from DC to the MMIMU interface bandwidth of 300 Hz.
However, it offers very little image attenuation for wideband signals. There are nulls in the response conveniently located at multiples of the input sampling frequency, but they are too narrow to adequately remove images of the HPG rate signal.

Figure 3-4: Impulse response of zero-order hold resampler.

Consider the example given earlier of zero-order hold resampling applied to a cosine rate test signal, $x(t) = \cos(\omega_0 t)$. Sampling $x(t)$ with an impulse train at the rate-channel sampling frequency results in a signal $x_p(t)$ with the following Fourier transform [10]:

\[ X_p(j\omega) = \sum_{k=-\infty}^{\infty} \bigl( \pi\delta(\omega - \omega_0 - 2\pi k F_r) + \pi\delta(\omega + \omega_0 - 2\pi k F_r) \bigr). \tag{3.5} \]

For the 125 Hz test signal used in Figure 3-3, $X_p(j\omega)$ will have components at ±125 Hz, $F_r$ ± 125 Hz, $-F_r$ ± 125 Hz, etc. Zero-order hold resampling begins with the convolution of $x_p(t)$ and Equation 3.3 to yield $x_{pz}(t)$. This filtering extends the width of each impulse to create a stairstep waveform, attenuating its component frequencies according to the response in Figure 3-5. The MMIMU interface then samples $x_{pz}(t)$ at the MMIMU interface rate $F_i$ to produce $x_i(t)$. $x_i(t)$ has the Fourier transform

\[ X_i(j\omega) = F_i \sum_{k=-\infty}^{\infty} X_{pz}\bigl(j(\omega - 2\pi k F_i)\bigr). \tag{3.6} \]

This equation demonstrates the aliasing that occurs upon resampling. For $F_r$ = 5000 Hz and $F_i$ = 600 Hz, there is an image at 4875 Hz which folds down to 75 Hz. The attenuation of the zero-order hold filter at 4875 Hz is only -31.8 dB, so the resulting unwanted tone is quite strong. The image at -5125 Hz, attenuated by -32.3 dB, appears at 275 Hz. Other images at higher frequencies also fold down and reinforce the aliases. The 75 Hz tone, for instance, is the sum of images from 4875 Hz, -10125 Hz, 19875 Hz, and many others.

Figure 3-5: Frequency response of zero-order hold resampler (magnitude versus frequency normalized by input sample rate).
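The folding arithmetic above can be reproduced with a short calculation. The following is a minimal sketch, assuming the zero-order hold magnitude of Equation 3.4 normalized to unity gain at DC; the function names `zoh_gain_db` and `alias_frequency` are hypothetical and chosen only for illustration.

```python
import math

def zoh_gain_db(f_hz, f_in_hz):
    """Zero-order hold magnitude (Equation 3.4) in dB, normalized to 0 dB at DC."""
    x = math.pi * f_hz / f_in_hz
    if x == 0.0:
        return 0.0
    return 20.0 * math.log10(abs(math.sin(x) / x))

def alias_frequency(f_hz, f_out_hz):
    """Frequency in [0, f_out/2] to which a tone at f_hz folds when sampled at f_out_hz."""
    f = abs(f_hz) % f_out_hz
    return min(f, f_out_hz - f)

F_R = 5000.0   # nominal input (rate channel) sampling frequency, Hz
F_I = 600.0    # output (MMIMU interface) sampling frequency, Hz
TONE = 125.0   # test tone, Hz

# List the images of the test tone around the first two multiples of F_R.
for k in (1, -1, 2, -2):
    for sign in (+1, -1):
        image = k * F_R + sign * TONE
        print(f"image {image:9.1f} Hz -> alias {alias_frequency(image, F_I):6.1f} Hz, "
              f"gain {zoh_gain_db(image, F_R):6.1f} dB")
```

Running the loop lists where each image lands in the 0 to 300 Hz band together with the attenuation it receives, which makes it easy to see that several images collect on the same alias frequency.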
For input and output sampling rates that share fewer common factors, the aliased tones appear at a wider array of passband frequencies. Changing $F_r$ to 5005 Hz yields the new power spectrum in Figure 3-6. This is more representative of the level of distortion that the HPG/MMIMU interface would experience in practice, given that $F_r$ is uncontrolled.

Figure 3-6: Aliased tones added by zero-order hold resampler; adjusted $F_r$ (power spectrum versus frequency in Hz, 0 to 300 Hz).

3.2.1 Improving Zero-Order Hold

Again examining Figure 3-2, it appears that the error for each output sample would be smaller if the value held was not simply the most recent input, but the average of the two most recent inputs. This is a simple variation on zero-order hold, achieved by lengthening the impulse response in Figure 3-4 to $2T_r$ and halving its value. The performance is indeed better, but not significantly. In the frequency domain, extending the impulse response by a factor $M$ is equivalent to compressing the response in Figure 3-5 by the same factor. As a result of the compression, the first image will coincide not with the first null, but with the $M$th null, which has marginally greater attenuation. Unfortunately, this approach brings rapidly diminishing returns, and only at the cost of accelerated rolloff in the passband.

Zero-order hold is clearly inadequate for direct use in resampling the angular rate signal, but it is important to note that there is nothing qualitatively wrong with it. Zero-order hold resampling fails here because the angular rate is a relatively wideband signal. If it were to vary more slowly with respect to the sampling rate, the time-domain error in Figure 3-2 would be less significant. This intuitive observation is consistent with the frequency domain analysis. The nulls in the response of the zero-order hold filter would more completely remove the unwanted images if the rate signal occupied a narrower band around DC.
To take advantage of this fact, it is necessary to upsample the angular rate signal prior to zero-order hold resampling. For instance, the zero-order hold filter offers image attenuation of at least -60 dB for all frequencies below $0.001F_r$. To realize this attenuation over the entire 300 Hz MMIMU interface bandwidth, $F_r$ needs to exceed 300 Hz × (1/0.001), or 300 kHz. It is a simple matter to increase $F_r$, requiring only zero insertion and interpolation filtering with an appropriate linear-phase FIR filter.

Unfortunately, while this approach is conceptually simple, it is unwieldy when rendered in hardware. The number of multiplication and addition operations for each FIR filter output sample is large. If the sample rate is low enough, these operations may be performed sequentially by a single multiplier and adder under the control of a finite state machine. This strategy is used in the existing rate channel FIR filter, and allows for a very hardware-efficient implementation. It does not extend well to higher sample rates, however, because the decreased time between output samples leaves less room for sequential processing. Even if very fast adders and multipliers are available, a clock significantly faster than the sampling rate is still required to operate the finite state machine. The rate channel FIR filter is clocked by the fastest clock available anywhere in the system and requires $K + 4$ cycles of that clock to produce each output sample, where $K$ is the number of filter coefficients. An FIR interpolation filter could therefore achieve an output sample rate of, at best, the clock frequency divided by $K + 4$. Adding multipliers and adders or increasing the clock frequency would help to circumvent this limitation, but neither of these approaches is feasible in the short term, as they require a larger FPGA and changes to the analog ASIC, respectively. Furthermore, additional FIR filtering adds latency to the signal path.
The overall rate channel latency should be kept to a minimum, so this is not an ideal solution. While effective, it is a brute-force approach that is not well suited to a small, low power system such as the MMIMU.

3.3 Linear Interpolation

Linear interpolation is a slightly more sophisticated means of resampling, wherein each output sample lies on the straight line connecting the two most recent input samples. The corresponding continuous-time impulse response is

\[ h(t) = \begin{cases} \dfrac{t}{T_r}, & 0 \le t < T_r \\ 2 - \dfrac{t}{T_r}, & T_r \le t < 2T_r \\ 0, & \text{otherwise,} \end{cases} \tag{3.7} \]

shown also in Figure 3-7.

Figure 3-7: Impulse response of linear interpolation resampler.

Linear interpolation offers dramatically better image attenuation than zero-order hold resampling. Specifically, its frequency response, in Figure 3-8, is the square of the zero-order hold response. It still lacks the -60 dB image attenuation desired for the HPG/MMIMU interface, but is a definite improvement.

Figure 3-8: Frequency response of linear interpolation resampler (magnitude versus frequency normalized by input sample rate).

In terms of implementation, linear interpolation is substantially more complex than zero-order hold. In some cases (the HPG included) it is possible to perform linear interpolation using only digital counters and shift registers, but in general both multiplication and addition are necessary. It is worthwhile to consider the implementation not as a process of constructing lines between input samples, but more directly, in terms of the impulse response and convolution.

Figure 3-9: Determination of linear interpolation output sample via convolution.

Figure 3-9 demonstrates the convolution of $r_p(t)$ with the linear interpolation impulse response to determine $r_i(t_0)$. The impulse response is causal and spans two $T_r$ intervals, so the new output sample is a linear combination of the two most recent input samples.
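The linear combination can be made concrete by sampling the triangular response of Equation 3.7 at the two instants where input impulses fall. The sketch below is illustrative only: it assumes the output is requested a fraction `d` (in [0, 1)) of an input period after the most recent input sample, and the function names are hypothetical.

```python
def triangle(t, t_r):
    """Causal triangular impulse response of Equation 3.7."""
    if 0.0 <= t < t_r:
        return t / t_r
    if t_r <= t < 2.0 * t_r:
        return 2.0 - t / t_r
    return 0.0

def linear_interp_output(newest, previous, d, t_r=1.0):
    """Output sample taken a fraction d of an input period after the newest input sample."""
    w_new = triangle(d * t_r, t_r)          # weight of the most recent sample: d
    w_prev = triangle(d * t_r + t_r, t_r)   # weight of the sample before it: 1 - d
    return w_new * newest + w_prev * previous

# With d = 0 the output equals the previous sample, so the causal
# interpolator carries a delay of one input period.
print(linear_interp_output(10.0, 0.0, 0.25))
```

The weights always sum to one, which is why the output lies on the straight line between the two samples.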
The output sample also depends on the fractional interval parameter $d$, shown in the diagram, which is the fraction of $T_r$ that has elapsed since the most recent input sample instant. Increasing $d$ as a continuous variable slides the impulse response in the positive time direction with respect to the input impulse train, performing continuous-time convolution. Because the output is also a sampled signal, however, it is not necessary to compute the entire CT output. Instead, the interpolator takes snapshots of the CT convolution by varying $d$ discontinuously. It evaluates the convolution only at the output sampling instants, thus generating a sequence of samples identical to what would result from true continuous-time filtering followed by asynchronous sampling.

In practice, filtering in this manner requires a means of measuring $d$ and a representation of the impulse response as a function of $d$ over each $T_r$ interval. In the case of the HPG, $d$ may be easily measured by counting the number of clock periods between an output sampling instant, indicated by SSTB, and the most recent input sampling instant. The triangular response in Figure 3-7 is an affine function of $d$ over each interval, so the required representation is also relatively simple.

3.4 Higher Order Interpolation

Figure 3-10: Convolution of linear interpolation impulse response with itself.

Considering the impressive performance gain made possible by the use of linear interpolation rather than zero-order hold, it is tempting to go a step further. The linear interpolation impulse response resulted from convolving the zero-order hold response with itself. Convolution of the linear interpolation response with itself to produce the response in Figure 3-10 would bring about another squaring of the frequency response. This would yield even better image attenuation, satisfying the requirements of the MMIMU interface.
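The effect of each squaring on image attenuation can be estimated numerically. The sketch below is an illustration under stated assumptions: responses are normalized to unity gain at DC, `order` is the power to which the zero-order hold magnitude is raised (1 for zero-order hold, 2 for linear interpolation, 4 for the response of Figure 3-10), and the worst case is searched over a ±300 Hz band around each of the first few multiples of a nominal 5000 Hz input rate. The function names are hypothetical.

```python
import math

def hold_power_gain_db(f_hz, f_in_hz, order):
    """Gain in dB (0 dB at DC) of the zero-order hold magnitude raised to `order`."""
    x = math.pi * f_hz / f_in_hz
    mag = 1.0 if x == 0.0 else abs(math.sin(x) / x)
    if mag == 0.0:
        return -math.inf
    return order * 20.0 * math.log10(mag)

def worst_image_gain_db(f_in_hz, bandwidth_hz, order, images=4, steps=300):
    """Largest gain found within +/- bandwidth of the first `images` multiples of f_in."""
    worst = -math.inf
    for k in range(1, images + 1):
        for i in range(steps + 1):
            offset = -bandwidth_hz + 2.0 * bandwidth_hz * i / steps
            worst = max(worst, hold_power_gain_db(k * f_in_hz + offset, f_in_hz, order))
    return worst

for order in (1, 2, 4):
    print(order, round(worst_image_gain_db(5000.0, 300.0, order), 1))
```

Under these assumptions the worst case sits at the lower edge of the first image band, and only the fourth-power response clears a -60 dB target across the full 300 Hz bandwidth.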
Furthermore, the process may be repeated ad infinitum, yielding ever longer, more complicated impulse responses with increasingly complete image removal. The implementation of such higher-order filters is similar to the linear interpolation scheme above, albeit with more than two intervals and more complex polynomial functions of $d$ to describe the response over each interval.

There is only one degree of freedom, so designing a resampling filter along these lines is extremely simple. For that same reason, it is also somewhat suspect from a performance standpoint. If there is any potential for fine control or efficiency in a resampling filter, this approach almost certainly fails to exploit it. It would be preferable to have a clearer insight into the construction of these resampling filters, and how they may be better tailored to a specific need. To this end, the next chapter takes a step back from the specific problem, to build a general characterization of this type of filter and develop an approach to design.

Chapter 4

Polynomial Impulse Response Filters

4.1 Constructing the Impulse Response

Zero-order hold and linear interpolation are two examples of a larger class of interpolating systems, referred to here as polynomial impulse response filters, which first appear in [11] and [12]. A filter of this type has an impulse response composed of $K$ consecutive intervals extending from $t = 0$ to $t = KT_r$, where $T_r$ is the input sampling period. Its value is given by a different polynomial function over each interval. For instance, the zero-order hold impulse response in Figure 3-4 spans only a single interval, from $t = 0$ to $t = T_r$. Its value over this interval is a particularly simple polynomial of order 0, with the sole coefficient equal to 1. The triangular impulse response characteristic of linear interpolation is slightly more complex. It spans two intervals, and its value is given by a different affine function of time over each.
Generalizing from these examples, consider the class of impulse responses defined as follows:

\[ h(t) = \sum_{k=0}^{K-1} s(k, t), \tag{4.1} \]

where $s(k, t)$ is a function restricted to be nonzero only for the $k$th interval, in the range $t \in [kT_r, (k+1)T_r)$. Equation 4.1 thus represents the complete impulse response as the concatenation of $K$ independently defined intervals. The function $s(k, t)$ is itself a weighted sum of shifted basis functions [11]:

\[ s(k, t) = \sum_{n=0}^{N} c_{k,n}\, f(n, t - kT_r, T_r). \tag{4.2} \]

The basis function $f(n, t, T_r)$ is the simple exponentiation $t^n$ for $|t| < 1$, shifted and scaled to span the range $t \in [0, T_r)$:

\[ f(n, t, T_r) = \begin{cases} \left(\dfrac{2t}{T_r} - 1\right)^{n}, & 0 \le t < T_r \\ 0, & \text{otherwise.} \end{cases} \tag{4.3} \]

Figure 4-1 plots $f(n, t, T_r)$ for $n$ = 0, 1, 2, and 3. Note that the plots show unshifted basis functions. The function $s(k, t)$ must shift them in time by $kT_r$, as apparent in Equation 4.2, such that they span the correct interval in the complete impulse response. Using these conventions, the zero-order hold system has parameters $K = 1$, $N = 0$, and $c_{0,0} = 1$. The linear interpolation impulse response, in Figure 3-7, spans two intervals and adds a linear component: $K = 2$, $N = 1$, and $c_{0,0} = 1/2$, $c_{0,1} = 1/2$, $c_{1,0} = 1/2$, $c_{1,1} = -1/2$.

The exact form used for the basis functions is to some extent arbitrary. Multiplying the basis function by a constant would alter the functions as they appear in Figure 4-1, but would not affect their utility, as any change could be offset by altering the coefficients $c_{k,n}$. However, other qualities, such as the fact that the axis of even and odd symmetry is in the center of the interval (at $t = T_r/2$), are of great importance. Without this symmetry, the set of possible impulse responses constructed from the basis functions would be limited unnecessarily. There are also numerous other ways to construct a filter impulse response, and the component pieces do not need to be polynomial functions.
The polynomial representation, however, brings two significant practical advantages not shared by most other approaches. One is ease and efficiency of implementation in hardware. Polynomial impulse response resampling filters may be realized using a Farrow structure [3], a highly efficient implementation well suited to the constraints of embedded processing. The other is ease and flexibility of design. When properly formulated, the design process is very similar to optimal minimax FIR design, offering both a high degree of control over the filter response and high efficiency in terms of performance versus complexity.

Figure 4-1: Basis functions for n = 0, 1, 2, and 3 (panels f(0,t), f(1,t), f(2,t), and f(3,t)).

4.2 Frequency Domain Behavior

The use of a piecewise polynomial formulation is advantageous because the complete impulse response function is a simple linear combination of fixed elements. By the linearity property of the continuous-time Fourier transform, this is also true of the corresponding frequency response. Combining Equations 4.1 and 4.2 yields

\[ h(t) = \sum_{k=0}^{K-1} \sum_{n=0}^{N} c_{k,n}\, f(n, t - kT_r, T_r). \tag{4.4} \]

Taking the Fourier transform of the basis function provides a very similar, and useful, expression for the frequency response of the complete filter:

\[ H(j\omega) = \sum_{k=0}^{K-1} \sum_{n=0}^{N} c_{k,n}\, F(n, k, j\omega, T_r). \tag{4.5} \]

The frequency response of the basis function is available through direct application of the Fourier analysis equation. Because the basis function is nonzero over only a limited range, the analysis equation becomes a well-behaved definite integral:

\[ F(n, k, j\omega, T_r) = \int_{-\infty}^{\infty} f(n, t - kT_r, T_r)\, e^{-j\omega t}\, dt = \int_{kT_r}^{(k+1)T_r} \left( \frac{2(t - kT_r)}{T_r} - 1 \right)^{n} e^{-j\omega t}\, dt. \tag{4.6} \]

Calculating the frequency response is easiest if done for fixed values of $n$.
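Equations 4.5 and 4.6 can also be evaluated numerically, which gives a convenient cross-check on any closed-form result. The following is a minimal sketch, assuming a simple midpoint-rule integration and a nested list `coeffs[k][n]` holding $c_{k,n}$; the function names and the step count are illustrative choices, not part of the reference design. It is applied to the zero-order hold and linear interpolation coefficient sets given in Section 4.1.

```python
import cmath
import math

def basis(n, t, t_r):
    """Basis function of Equation 4.3."""
    if 0.0 <= t < t_r:
        return (2.0 * t / t_r - 1.0) ** n
    return 0.0

def basis_response(n, k, w, t_r, steps=2000):
    """Equation 4.6 evaluated by the midpoint rule over [k*t_r, (k+1)*t_r)."""
    dt = t_r / steps
    total = 0j
    for i in range(steps):
        local_t = (i + 0.5) * dt                  # time measured from the segment start
        t = k * t_r + local_t
        total += basis(n, local_t, t_r) * cmath.exp(-1j * w * t) * dt
    return total

def filter_response(coeffs, w, t_r):
    """Equation 4.5: weighted sum of the basis function responses."""
    return sum(c * basis_response(n, k, w, t_r)
               for k, row in enumerate(coeffs)
               for n, c in enumerate(row))

def gain_db(coeffs, f_hz, t_r):
    """Magnitude at f_hz in dB relative to the DC gain."""
    dc = abs(filter_response(coeffs, 0.0, t_r))
    return 20.0 * math.log10(abs(filter_response(coeffs, 2.0 * math.pi * f_hz, t_r)) / dc)

T_R = 1.0 / 5000.0
ZERO_ORDER_HOLD = [[1.0]]                 # K = 1, N = 0
LINEAR = [[0.5, 0.5], [0.5, -0.5]]        # K = 2, N = 1

print(round(gain_db(ZERO_ORDER_HOLD, 4875.0, T_R), 1))
print(round(gain_db(LINEAR, 4875.0, T_R), 1))
```

The linear interpolation figure comes out at twice the zero-order hold figure in dB, consistent with its response being the square of the zero-order hold response.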
In practice, it is usually not productive to use polynomial orders higher than quadratic for describing impulse response sections, so it is sufficient to determine $F(n, k, j\omega, T_r)$ for $n$ = 0, 1, and 2:

\[ F(0, k, j\omega, T_r) = \frac{1}{j\omega} \left( e^{-j\omega T_r k} - e^{-j\omega T_r (k+1)} \right) \tag{4.7} \]

\[ F(1, k, j\omega, T_r) = \frac{j}{T_r \omega^2} \left( (T_r\omega + 2j)\, e^{-j\omega T_r k} + (T_r\omega - 2j)\, e^{-j\omega T_r (k+1)} \right) \tag{4.8} \]

\[ F(2, k, j\omega, T_r) = \frac{j}{T_r^2 \omega^3} \left( (-T_r^2\omega^2 - 4jT_r\omega + 8)\, e^{-j\omega T_r k} + (T_r^2\omega^2 - 4jT_r\omega - 8)\, e^{-j\omega T_r (k+1)} \right) \tag{4.9} \]

4.3 Optimal Design

4.3.1 Linear Programming

The mathematical descriptions in the previous section are important both for understanding piecewise polynomial filters in the frequency domain and, later, for implementing them in the time domain. That said, it is not immediately clear where one obtains the necessary coefficients $c_{k,n}$. Any given filter is completely defined by these coefficients, so there is nothing else to design, but the coefficients themselves represent a formidable design space. Fortunately, there is a large body of preexisting knowledge and methodology that applies very naturally to this problem.

The fact that the impulse and frequency responses are linear functions of a finite set of variables makes them candidates for design using linear programming techniques. The linear programming problem, as usually stated, seeks to determine values for the variables of a linear objective function such that the function is minimized. The objective function may be an expression of the error between a design and an ideal model, in which case linear programming seeks the design that minimizes that particular error metric.

This is the standard form for single-objective linear programming, but it is not quite sufficient for the filter design problem. In the case of Equations 4.4 and 4.5, the linear function of $c_{k,n}$ is the response at a single point in time or frequency, respectively. Each frequency under consideration in the design process constitutes a different objective function that must be optimized.
Therefore, single-objective linear programming can only design a filter to meet specifications at a single point, with the rest of the response uncontrolled. Ideally, the optimization should work over a continuum of frequencies. A discrete grid of frequencies is an acceptable approximation to this continuum, but a grid comprising only a single point is clearly inadequate.

Multiobjective linear programming, on the other hand, provides precisely the required function. It is similar to the more common single-objective linear programming, but as its name implies, it finds variable values that optimize an arbitrary number of objective functions. For the purposes of filter design, the most useful multiobjective optimization is minimax, or minimization of the maximum of the objective function values. The minimax problem is

\[ \text{Find } \vec{x} \text{ subject to } A\vec{x} = b \text{ such that } \max\bigl(O(\vec{x})\bigr) \text{ is minimized.} \tag{4.10} \]

The column vector $\vec{x}$ in this expression is the vector of independent variables. For piecewise polynomial filter design, $\vec{x}$ is any convenient one-dimensional arrangement of the coefficients $c_{k,n}$. $O(\vec{x})$ is a function that returns a vector of the objective functions evaluated for $\vec{x}$, the maximum element of which should be minimized. Most minimax solution algorithms allow the optimization to occur subject to constraints on $\vec{x}$, which must be in the form of a matrix $A$ and vector $b$ as in Equation 4.10. The ability to impose constraints is important, and will appear later in the process of filter design.

Multiobjective linear programming makes possible the design of piecewise polynomial resampling filters directly in the frequency domain. The process is very similar to the Remez exchange algorithm used in the design of discrete-time FIR filters, but with a few important differences. Both procedures begin by defining a frequency grid that covers all bands of interest, and it is in this first step that they differ most sharply.
An FIR filter is a purely discrete-time system that operates at a fixed sample rate. It is guaranteed that the filter will encounter no frequencies higher than half of that sampling rate, simply because the sequence it processes is not capable of encoding any higher frequencies. The frequency band of interest when designing such a filter covers, at most, DC to half of the sampling rate. In contrast, resampling filters are continuous-time systems and the bands of interest may extend to much higher frequencies. The imaging bands around multiples of the input sampling rate demand special consideration in any design, as they are the regions that may fold into the passband upon resampling. For this reason, the frequency grid used for resampler design typically extends from DC to three or four times the input sampling rate. The frequency grid should include enough uniformly spaced discrete frequencies to provide a faithful representation of the frequency response. More grid points will yield a more accurate minimax solution, but at a significant computational cost, because each additional frequency adds an objective function to the minimax problem.

Given the frequency grid, the remainder of the problem is similar in form to minimax FIR design. For each frequency grid point $\omega_i$, there is a target frequency response value $T(\omega_i)$ and an error weight $W(\omega_i)$. The objective function is the weighted error $E(\omega_i)$ between the actual and target responses:

\[ E(\omega_i) = W(\omega_i) \cdot \Bigl|\, |T(j\omega_i)| - |H(j\omega_i)| \,\Bigr|. \tag{4.11} \]

The minimax algorithm evaluates this error function for each $\omega_i$ in the frequency grid and adjusts the elements of $\vec{x}$ to minimize the maximum error. The result is a vector $\vec{x}$ whose elements are the $c_{k,n}$ coefficients describing a minimax optimal filter.

4.3.2 Accounting for Phase

It should be noted that the minimax objective function given in Equation 4.11 considers only the magnitude of the resampling filter frequency response.
There is no mention of the filter's phase characteristics, which are in fact completely uncontrolled. This is a significant weakness in the linear programming approach to resampler design. Although the frequency response magnitude of a piecewise polynomial filter is linear in the polynomial coefficients $c_{k,n}$, its phase angle at any given frequency is not. Nonlinearity precludes the use of additional objective functions measuring phase angle error. Adding objective functions would be the most flexible means of controlling phase, because it would allow phase and magnitude error to be weighted by relative importance. The objective functions could conceivably be used to specify linear phase in the passband while relaxing requirements for the rest of the spectrum.

The fact that phase angle resists optimization is unfortunate, but it is possible to circumvent the problem entirely by forcing the filter to have linear phase for all frequencies [11]. This is accomplished by temporarily permitting the filter to be noncausal and redefining the impulse response as follows:

\[ h'(t) = \sum_{k=0}^{K/2-1} \sum_{n=0}^{N} c'_{k,n} \left[ f(n, t - kT_r, T_r) + (-1)^n f(n, t + kT_r + T_r, T_r) \right]. \tag{4.12} \]

Whereas Equation 4.4 expresses $h(t)$ as a sum of $K$ nonoverlapping segments, $h'(t)$ is a sum of $K/2$ pairs of segments. Each pair consists of a segment extending from $kT_r$ to $(k + 1)T_r$ and its mirror image across the $t = 0$ axis, extending from $(-k - 1)T_r$ to $-kT_r$. The same $c'$ coefficients apply to both segments in a pair, but are negated for odd values of $n$ when applied to one of the segments. Negation compensates for the fact that the basis functions for odd $n$ exhibit odd symmetry. Pairs of segments, and by extension the impulse response as a whole, are thus constrained to have perfect even symmetry about $t = 0$. Symmetry of the impulse response ensures that the frequency response is purely real for all frequencies, so the filter has zero phase angle. It is now possible to define $H'(j\omega)$ as a sum of the frequency response of $K/2$ pairs of segments.
With zero phase inherent in the definition of the resampling filter, minimax design may proceed as before, but without problems resulting from uncontrolled phase. There are a few significant differences, however. Most importantly, the contribution of any single coefficient to the complete frequency response has changed (although it is still linear), and the minimax objective functions must be altered to reflect the change. Also, there are now half as many variables in the problem, because each $c'_{k,n}$ applies to two impulse response segments. This implies the further restriction that the impulse response must span an even number of segments. Once the minimax design of the coefficients is complete, the resulting impulse response may be shifted by $KT_r/2$ in the positive time direction to restore causality. The filter then has a linear phase characteristic rather than zero phase, but the magnitude of the frequency response remains unchanged. The shift also restores the impulse response to its original representation from Equation 4.4.

Figure 4-2: Shifting of impulse response to restore causality. (Top: zero-phase response $h'(t)$ spanning $-3T_r$ to $3T_r$, with coefficient sets $c'_{0,n}$, $c'_{1,n}$, $c'_{2,n}$ for positive time and $(-1)^n c'_{0,n}$, $(-1)^n c'_{1,n}$, $(-1)^n c'_{2,n}$ for negative time. Bottom: causal response $h(t)$ spanning 0 to $6T_r$, with coefficient sets $c_{0,n}$ through $c_{5,n}$.)

An example of this shifting appears in Figure 4-2 for a six-segment impulse response. The impulse response as designed by the minimax procedure (top) is described by three sets of $N + 1$ coefficients for positive time. Negated for odd $n$, these same coefficients describe the negative time half as well. Once shifted to restore causality (bottom), there are six sets of coefficients. The new coefficients $c_{k,n}$ are given by

\[ c_{k,n} = \begin{cases} (-1)^n\, c'_{K/2-k-1,\,n}, & 0 \le k \le K/2 - 1 \\ c'_{k-K/2,\,n}, & K/2 \le k < K. \end{cases} \tag{4.13} \]

The figure also illustrates the effect of the shift on the filter phase characteristic. The bottom impulse response is causal, but has a group delay of $3T_r$ for all frequencies.
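The index mapping of Equation 4.13 is compact enough to sketch directly. In the example below, `c_prime[k][n]` is assumed to hold $c'_{k,n}$ for $k = 0, \ldots, K/2 - 1$, and the function name `restore_causality` is hypothetical. As a check, the positive-time half of a zero-phase triangular response (one segment falling from 1 to 0) is mapped onto the causal linear interpolation coefficients listed in Section 4.1.

```python
def restore_causality(c_prime):
    """Apply Equation 4.13 to obtain the K causal coefficient sets from K/2 designed sets.

    c_prime[k][n] holds c'_{k,n}; the result[k][n] holds c_{k,n}.
    """
    half = len(c_prime)
    # Segments 0 .. K/2 - 1: mirrored negative-time half, negated for odd n.
    mirrored = [[(-1) ** n * c for n, c in enumerate(c_prime[half - k - 1])]
                for k in range(half)]
    # Segments K/2 .. K - 1: positive-time half, copied unchanged.
    shifted = [list(row) for row in c_prime]
    return mirrored + shifted

# Zero-phase triangle for positive time: 1/2 - (1/2) * (2t/T_r - 1) on [0, T_r).
print(restore_causality([[0.5, -0.5]]))
```

For a six-segment filter the same function takes three designed coefficient sets and returns six, ordered as in the bottom panel of Figure 4-2.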
4.3.3 Controlling the Time Domain

In its most basic form, the above minimax design process treats the coefficients $c_{k,n}$ as nearly independent variables. They possess an indirect dependence on one another by virtue of the fact that they collectively describe the filter being designed, but there is no explicit mathematical relationship. As such, there is nothing to guarantee that the impulse response will be continuous at its segment endpoints. The segments evolve almost independently, and the concatenation of multiple segments will generally be discontinuous. Figure 4-3 is an example of a linear phase impulse response ($N = 2$, $K = 10$) showing the typical discontinuities.

Figure 4-3: Impulse response with discontinuities at segment boundaries.

If desired, it is possible to control the time domain behavior of a piecewise polynomial resampling filter by imposing continuity conditions as part of the minimax design process. Minimax linear programming algorithms, as noted earlier, generally support constraints of the form $A\vec{x} = b$. The minimax algorithm will attempt to optimize the vector of coefficients $\vec{x}$ within the solution space of this system of linear equations. $A$ and $b$ are arbitrary, and may be designed so as to place constraints on the behavior of the impulse response at the segment endpoints. Each of the $c_{k,n}$ coefficients (of which $\vec{x}$ is composed) specifies the contribution of a single basis function over a single interval, with $N + 1$ basis functions contributing to each interval. In order to ensure that segment $q$ is continuous in value with segment $q + 1$, for instance, the contributions of all basis functions to the value at the right end of segment $q$ must equal the contributions to the left end of segment $q + 1$. A similar approach works to ensure continuity in any derivative, by taking into account the contributions of each basis function to that derivative at the segment endpoints.
To better understand this approach, consider again the basis functions in Figure 4-1. The basis functions attain a value of 1 at both endpoints for even $n$. For odd $n$, the right endpoint is at 1 while the left, due to the odd symmetry, is -1. In an impulse response for which $N = 2$, therefore, segments $q$ and $q + 1$ are continuous at their border provided that

\[ c_{q,0} + c_{q,1} + c_{q,2} - c_{q+1,0} + c_{q+1,1} - c_{q+1,2} = 0. \tag{4.14} \]

The left and right sides of this equation form one row of the matrix $A$ and one element of $b$, respectively. A similar equation could force equality between segments $q + 1$ and $q + 2$, and so on. If the complete filter includes three segments, two linear equations are necessary to force continuity of value at the segment junctions. Expressed as $A\vec{x} = b$, the equations are:

\[ \begin{bmatrix} 1 & 1 & 1 & -1 & 1 & -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 & 1 & -1 & 1 & -1 \end{bmatrix} \begin{bmatrix} c_{0,0} \\ c_{0,1} \\ c_{0,2} \\ c_{1,0} \\ c_{1,1} \\ c_{1,2} \\ c_{2,0} \\ c_{2,1} \\ c_{2,2} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}. \tag{4.15} \]

Figure 4-4: Impulse response with forced continuity of value.

Figure 4-4 shows the effect these zero-order continuity conditions have on the impulse response. Besides the continuity, all other design variables are unchanged from Figure 4-3. Although the segments now meet with no discontinuity in value, they are still discontinuous in slope. By noting the contribution of each basis function to the slope at the segment endpoints, similar equations are found to ensure continuity in the first derivative. The basis function of order $n$ has slope proportional to $n$ at its right endpoint and to $n(-1)^{n-1}$ at its left endpoint, so the equations are:

\[ \begin{bmatrix} 0 & 1 & 2 & 0 & -1 & 2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 2 & 0 & -1 & 2 \end{bmatrix} \begin{bmatrix} c_{0,0} \\ c_{0,1} \\ c_{0,2} \\ c_{1,0} \\ c_{1,1} \\ c_{1,2} \\ c_{2,0} \\ c_{2,1} \\ c_{2,2} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}. \tag{4.16} \]

Redesigning the filter with these additional conditions in place results in the impulse response in Figure 4-5. Although there is little reason to enforce them, continuing in this fashion will yield continuity conditions for any desired derivative.

Figure 4-5: Impulse response with forced continuity of slope.
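The rows of $A$ follow a regular pattern that can be generated for any number of segments, polynomial order, and derivative. The sketch below is an illustration under stated assumptions: coefficients are arranged as $c_{0,0}, \ldots, c_{0,N}, c_{1,0}, \ldots$, derivatives are taken with respect to the normalized variable $2t/T_r - 1$ (the common factor of $2/T_r$ per derivative cancels across the equality), and the function name `continuity_rows` is hypothetical. It relies on `math.perm`, available in Python 3.8 and later.

```python
import math

def continuity_rows(num_segments, order, derivative=0):
    """Rows of A (with b = 0) forcing continuity of the given derivative at every junction."""
    width = order + 1
    rows = []
    for q in range(num_segments - 1):
        row = [0] * (num_segments * width)
        for n in range(width):
            # n-th basis function differentiated `derivative` times: n!/(n - derivative)!
            # at the right endpoint, with an extra sign at the left endpoint.
            scale = math.perm(n, derivative)       # zero when derivative > n
            sign_left = (-1) ** (n - derivative) if n >= derivative else 0
            row[q * width + n] = scale                       # right end of segment q
            row[(q + 1) * width + n] = -scale * sign_left    # left end of segment q + 1
        rows.append(row)
    return rows

print(continuity_rows(3, 2, derivative=0))
print(continuity_rows(3, 2, derivative=1))
```

Stacking the rows for several derivative orders gives the full constraint matrix passed to the minimax solver. The linear interpolation coefficients of Section 4.1 satisfy the value-continuity row for $K = 2$, $N = 1$, as expected for a continuous triangle.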
Chapter 5

The Farrow Structure

In order for the polynomial resampling filter developed in Chapter 4 to be of any use to the HPG/MMIMU system, there must be a feasible FPGA-based hardware implementation available. The resampling filter is a continuous-time system, so any attempt to implement it using only discrete-time digital logic is immediately suspect. Fortunately, while it is true that a discrete-time system cannot perfectly represent a continuous-time system, in general it does not need to. This chapter presents the Farrow structure, a computational approach that represents continuous time as well as possible within the confines of digital logic. It achieves its purpose, continuous-time filtering, in a way that is more simulation than implementation, but the practical difference is small enough to be of no consequence to the overall instrument performance.

The Farrow structure was originally proposed as a variable fractional delay element to support echo cancellation in modems [3]. As such, it is ideally suited to the needs of the resampling filter, which must effectively delay its input signal by a fraction of a sample period in order to produce each output sample. It achieves this effect with low algorithmic complexity, putting it easily within reach of embedded FPGA implementations where finite state machines are the primary means of control. It is also reasonably efficient in terms of computational complexity, placing only minimal burden on power-sensitive applications such as the MMIMU.

5.1 Derivation of the Farrow Structure

Asynchronous resampling is most easily understood as a two-stage process, consisting of continuous-time filtering of the input impulse train followed by sampling at the output sample rate. It is not feasible to implement the resampler in this manner, however, because it requires computation of the filtered signal for all time.
As noted in Section 3.3, it is sufficient to compute the filtered signal only at the required output sampling instants. The implementation takes the form of direct convolution,

$$r_f(t) = r_p(t) * h(t), \tag{5.1}$$

where $r_p(t)$ is the continuous time impulse train interpretation of the angular rate signal, with an impulse period of $T_r$, and $r_f(t)$ is the new filtered rate signal. $h(t)$ is the filter impulse response developed in Chapter 4:

$$h(t) = \sum_{k=0}^{K-1} \sum_{n=0}^{N} c_{k,n}\, f(n, t - kT_r, T_r). \tag{5.2}$$

At each output sampling instant, the resampler receives the SSTB signal from the MMIMU processor and must determine the instantaneous value of $r_f(t)$. In the interest of simplicity, the following derivation assumes that the input sampling instant immediately preceding the SSTB occurs at t = 0. The SSTB signal therefore always occurs at $t = dT_r$, where $d \in [0, 1)$. Expressed in terms of the convolution, the corresponding sample of the filtered rate signal is

$$r_f(dT_r) = \int_{-\infty}^{\infty} r_p(\tau)\, h(dT_r - \tau)\, d\tau. \tag{5.3}$$

$r_p(t)$ is nonzero only at integer multiples of $T_r$, so the convolution integral becomes a summation:

$$r_f(dT_r) = \sum_{m=-\infty}^{\infty} r_p(mT_r)\, h(dT_r - mT_r). \tag{5.4}$$

After substituting the impulse response from Equation 5.2 into $r_f(dT_r)$ and reordering summations, we have

$$r_f(dT_r) = \sum_{m=-\infty}^{\infty} r_p(mT_r) \sum_{n=0}^{N} \sum_{k=0}^{K-1} c_{k,n}\, f(n, dT_r - mT_r - kT_r, T_r). \tag{5.5}$$

The basis function $f(n, t, T)$ (Equation 4.3) is defined to be nonzero only for $0 \le t < T$. Equation 5.5 includes the basis function evaluated for $t = dT_r - mT_r - kT_r$. Given the constraints on d, it is known that $0 \le dT_r < T_r$. The index variables k and m are restricted to integer values, so the inequality $0 \le dT_r - mT_r - kT_r < T_r$ only holds if m = -k. Given this constraint, it is possible to eliminate the first summation and rewrite $r_f(dT_r)$ as

$$r_f(dT_r) = \sum_{n=0}^{N} \sum_{k=0}^{K-1} r_p(-kT_r)\, c_{k,n}\, f(n, dT_r, T_r) = \sum_{n=0}^{N} \sum_{k=0}^{K-1} r_p(-kT_r)\, c_{k,n} \left( \frac{2dT_r}{T_r} - 1 \right)^{n}, \tag{5.6}$$

where the basis function has been expanded according to Equation 4.3.
At this point, it is convenient to perform a change of variables:

$$\mu = 2d - 1. \tag{5.7}$$

The use of $\mu$ in place of d will simplify the resulting Farrow structure by compensating for the time shift and compression in the basis function definition. With the change of variables, the output sample $r_f(dT_r)$ is now no longer dependent on $T_r$ and may be rewritten as a function of $\mu$:

$$r_f(\mu) = \sum_{n=0}^{N} \sum_{k=0}^{K-1} r_p(-kT_r)\, c_{k,n}\, \mu^{n}. \tag{5.8}$$

And, after one final rearrangement:

$$r_f(\mu) = \sum_{n=0}^{N} \mu^{n} \sum_{k=0}^{K-1} r_p(-kT_r)\, c_{k,n}. \tag{5.9}$$

This expression for $r_f(\mu)$ reveals the important features of the Farrow structure. The inner summation describes a finite impulse response discrete-time filter with K taps, where the filter coefficients are $c_{k,n}$ for some fixed n. The outer summation accumulates the outputs of N + 1 such FIR filters, each weighted by $\mu^{n}$. Figure 5-1 presents the Farrow structure as a signal flow diagram, showing the N + 1 FIR branches and the exponentiation of $\mu$ via repeated multiplication. The FIR branches operate at the sampling rate $F_r$ to match the input. Their outputs then become the coefficients of a polynomial in $\mu$. Whenever an output sample is required, the surrounding system determines the fraction d of the input sampling period that has elapsed since the last input sample, calculates $\mu$ using this value, and provides it to the Farrow structure. The Farrow structure then evaluates the polynomial to produce the desired output sample.

5.2 Limitations of the Farrow Structure

As noted above, the Farrow structure is not a continuous-time system. Although Chapter 4 treated the resampler input as a continuous-time impulse train for the purposes of analysis, the Farrow structure input is the actual discrete-time angular rate signal as it exists in the HPGD. The structure itself consists of purely discrete-time FIR filters. Nevertheless, the Farrow structure performs what is for all practical purposes a continuous-time operation.
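The equivalence between the direct convolution (Equation 5.4) and the Farrow form (Equation 5.9) can be checked numerically. The sketch below is a floating-point illustration only, not the fixed-point hardware. It assumes the basis function is (2t/T - 1)^n on 0 <= t < T, as implied by the expansion in Equation 5.6; the function names, the coefficient layout `coeffs[k][n]`, and the sample values are invented for the example.

```python
def basis(n, t, period):
    """Assumed basis function: (2t/T - 1)**n on 0 <= t < T, zero elsewhere."""
    if 0 <= t < period:
        return (2.0 * t / period - 1.0) ** n
    return 0.0


def impulse_response(coeffs, t, period):
    """h(t) as in Equation 5.2; coeffs[k][n] plays the role of c_{k,n}."""
    return sum(c * basis(n, t - k * period, period)
               for k, segment in enumerate(coeffs)
               for n, c in enumerate(segment))


def direct_output(samples, coeffs, d, period):
    """Equation 5.4 restricted to the K most recent inputs.

    samples[k] holds r_p(-k * T_r), so samples[0] is the newest input.
    """
    return sum(x * impulse_response(coeffs, d * period + k * period, period)
               for k, x in enumerate(samples))


def farrow_output(samples, coeffs, d):
    """Equation 5.9: N + 1 FIR branch outputs combined as a polynomial in mu."""
    mu = 2.0 * d - 1.0
    num_branches = len(coeffs[0])
    branch_outputs = [sum(x * coeffs[k][n] for k, x in enumerate(samples))
                      for n in range(num_branches)]
    result = 0.0
    for b in reversed(branch_outputs):   # Horner's rule, highest power first
        result = result * mu + b
    return result


# Made-up example: K = 3 segments, N = 2
coeffs = [[1.0, 0.5, -0.25], [2.0, -1.0, 0.75], [0.5, 0.25, 0.125]]
samples = [3.0, -1.0, 2.0]
d = 0.3
period = 1.0 / 3250.0
y_farrow = farrow_output(samples, coeffs, d)
y_direct = direct_output(samples, coeffs, d, period)
```

Note that `farrow_output` never uses the sample period, in line with the observation that the output no longer depends on $T_r$ after the change of variables.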
The Farrow structure fails to qualify as a true continuous-time system only because the parameter d, and by extension $\mu$, must be discrete in any real implementation. If d were truly a continuous variable and all multiplication and addition operations were capable of producing true continuous outputs, then the Farrow structure would be a continuous-time system without qualification.

Figure 5-1: Farrow structure signal flow diagram. (Input $r_p(mT_r)$; FIR branches $H_n(z) = \sum_{k=0}^{K-1} c_{k,n} z^{-k}$ for n = 0 through N; output $r_f(\mu)$.)

The finite resolution of d is the primary nonideality in a practical realization of the Farrow structure, and is of significant importance. In order for an implementation to measure d, there must exist a clock that is faster than the input sample rate by a known and constant factor. This factor determines the possible resolution with which d may be known by the resampling system. If there is a clock available at twice the input sampling frequency, for example, then d can take on values of either 0 or 0.5. A clock at four times the input sample rate permits d values of 0, 0.25, 0.5, and 0.75. Provided that the ratio of input and output sampling rates is irrational, the actual value of the fractional interval is indeed continuous. The problem lies in measuring it.

This limitation effectively brings the discussion full circle to zero-order hold resampling. If the fractional interval value that best describes the desired output sample is 0.6 but the resampler can only measure d with two bits of resolution, then d will round off to 0.5. The Farrow structure, provided with the corresponding $\mu$ value, will yield an output sample that is correct for a fractional interval of 0.5. The effect is that of sampling the filtered signal one tenth of a period too early.
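The effect of finite clock resolution on the measured fractional interval can be illustrated with a short sketch. It assumes the measurement behaves like a counter that advances once per fast-clock period, so the true interval is always rounded down; the function name and the `clock_ratio` parameter are hypothetical.

```python
import math


def measured_fraction(d, clock_ratio):
    """Fractional interval seen by a counter running clock_ratio times faster
    than the input sample clock. The true value d in [0, 1) is rounded down
    to the most recent fast-clock edge (an assumption of this sketch)."""
    return math.floor(d * clock_ratio) / clock_ratio


true_d = 0.6
coarse = measured_fraction(true_d, 4)      # two bits of resolution
fine = measured_fraction(true_d, 1024)     # ten bits of resolution
timing_error = true_d - coarse             # fraction of a period sampled too early
```

With a clock ratio of 4 the measured value is 0.5, reproducing the one-tenth-period error described above. In this model the error is always less than one fast-clock period.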
As in zero-order hold resampling, the output sample always takes the value of the underlying analog signal at a point slightly earlier than would be correct. The discrete time resampler output signal is the same as would result from continuous time filtering of the input impulse train followed by resampling at the rate of the clock used to measure d, then zero-order hold resampling at the output rate. The zero-order hold analysis employed in Chapter 3 thus applies to the Farrow structure as well.

Chapter 6 The MMIMU Resampler

The methods developed in Chapters 4 and 5 are all that is necessary to produce a resampler fitting the requirements of the HPG/MMIMU interface. This chapter first describes the design of the resampler block as originally envisioned in Figure 2-2. It then explores a more efficient solution that combines the functions of the antialiasing filter and resampler into a single component, offering improved performance and a significant reduction in design complexity.

6.1 Resampler Design for the HPG/MMIMU

The resampler forms one component of the HPGD rate channel, bridging the gap between the asynchronous TFG and MMIMU clock domains. It must accept both incoming samples from the antialiasing filter and requests for output samples from the MMIMU interface. The MMIMU interface requests samples via the sample strobe (SSTB) signal, and may do so at any time with respect to the resampler input. Upon receipt of the SSTB signal, the resampler uses a Farrow structure to evaluate the filtered angular rate signal at the requested sampling instant, then passes the new sample to the interface module for transmission to the MMIMU processor.

Design of the $c_{k,n}$ coefficients describing the resampling filter begins with identification of the required frequency response. The resampler input signal has already undergone antialiasing filtration, and it is safe to assume that it includes only components below the MMIMU interface bandwidth of 300 Hz.
The resampling filter must attenuate the images of the rate signal by at least 60 dB to meet the requirements given in Chapter 2. These images occur at all multiples of the rate channel sampling frequency $F_r$, or 3250 Hz for a typical 13 kHz gyroscope. The antialiasing filter alone provides nearly 6 dB of attenuation at 150 Hz, so resampler attenuation in the band between DC and 150 Hz should be kept to a minimum.

The minimax design process requires a frequency grid covering all bands of interest along with the desired frequency response and error weighting for each frequency in the grid. There is considerable flexibility in the choice of frequency grid, but in general it should extend from DC out to two or three times the input sample rate and include as many uniformly spaced frequency points as is practical. The other major design variable is the length K of the impulse response. As with discrete-time FIR filters, the impulse response length determines the possible frequency selectivity of the filter, and the accuracy with which its frequency response will match the ideal target. Choosing the impulse response length and the error weights is essentially an iterative process, as it is difficult to predict the values that will lead to the best filter.

The impulse response in Figure 6-1 has a length of eight segments, and was designed to meet the above requirements using a frequency grid extending from DC to $2.5F_r$. The circles in the figure represent segment boundaries. The bands of interest, along with their response magnitudes and error weights, are as follows:

Band              Range                 Desired Response   Error Weight
Passband          0 Hz to 150 Hz        1.0                1
1st image band    2950 Hz to 3550 Hz    0.0001             1
2nd image band    6200 Hz to 6800 Hz    0.0001             1

Any frequencies not in the passband or an image band are considered "don't care" bands, and given a weight of 0. The effect of minimax design is most evident in the resulting filter's frequency response, shown in Figure 6-2.
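The design specification above (frequency grid, desired response, and error weights) can be assembled programmatically before it is handed to a minimax solver. The following sketch only builds the specification and does not perform the optimization. The grid density of 3251 points and all function and variable names are assumptions made for illustration; the band edges follow from the 3250 Hz sampling frequency and the 300 Hz interface bandwidth given in the text.

```python
def design_grid(f_max, num_points, bands):
    """Return (frequency, desired, weight) triples on a uniform grid.

    bands is a list of (low_hz, high_hz, desired, weight). Any grid point
    outside every band is a "don't care" point with weight 0.
    """
    grid = []
    for i in range(num_points):
        f = f_max * i / (num_points - 1)
        desired, weight = 0.0, 0.0
        for low, high, band_desired, band_weight in bands:
            if low <= f <= high:
                desired, weight = band_desired, band_weight
        grid.append((f, desired, weight))
    return grid


F_R = 3250.0        # rate channel sampling frequency, Hz
BANDWIDTH = 300.0   # MMIMU interface bandwidth, Hz

# Passband, then one image band centered on each multiple of F_R
bands = [(0.0, 150.0, 1.0, 1.0)]
for image in (1, 2):
    bands.append((image * F_R - BANDWIDTH, image * F_R + BANDWIDTH, 1e-4, 1.0))

# Grid from DC to 2.5 * F_R with 2.5 Hz spacing (the spacing is an assumption)
grid = design_grid(2.5 * F_R, 3251, bands)
```

Each image band is simply the signal bandwidth mirrored about a multiple of the input sampling frequency, which is where the 2950 Hz to 3550 Hz and 6200 Hz to 6800 Hz ranges in the table come from.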
The response is very flat in the passband, then descends gradually to slightly below the required -60 dB attenuation in the first image band. The response is uncontrolled between the two image bands, and rises slightly before falling below -70 dB in the second image band. It then rises again, but does not exceed -60 dB. We may therefore be confident that the filter provides adequate attenuation for all higher image bands without needing to specifically account for them in the minimax frequency grid.

Figure 6-1: 8-segment resampler impulse response.

Figure 6-2: 8-segment resampler frequency response.

The 8-segment resampling filter effectively removes images of the angular rate signal while preserving the passband, and, once implemented via the Farrow structure, is appropriate for use in the HPG/MMIMU interface. However, there is significant room for improvement. The rate channel antialiasing filter bandlimits the rate signal to F/2, permitting the design of the resampling filter to consider only the narrow passband and images. Considering that the resampling filter also has a natural low pass characteristic, the combined FIR/resampler frequency response exceeds the requirements by a substantial margin over some or all of the relevant frequencies. The FIR filter as presently implemented appears in Figures 6-3 and 6-4. With 41 taps and over 6 ms of group delay, it is a significant component. If it could be designed in a way that reduces the required number of taps by taking advantage of the resampler frequency response, there might be a valuable savings in signal latency or hardware complexity.

While redesign of the FIR antialiasing filter is certainly possible, there is a similar approach that better exploits the available resources.
Given the flexibility of the piecewise polynomial filters and the frequency response control afforded by minimax design, there is in fact no reason why the FIR filter is needed at all. By incorporating both the antialiasing and resampling functions into a single filter, it is possible to control the combined frequency response directly, thereby eliminating any unnecessary complexity. In order to meet or exceed the performance of the existing FIR filter, the combined filter/resampler must have better than -60 dB attenuation above the 300 Hz cutoff frequency, and less than 6 dB of passband ripple between DC and 150 Hz. These requirements are satisfied by a polynomial filter having 36 segments designed using the following target response and weights on a frequency grid extending from DC to $2F_r$:

Band        Range                Desired Response   Error Weight
Passband    0 Hz to 150 Hz       1.0                1
Stopband    300 Hz to 2000 Hz    0.0001             300

Figure 6-3: FIR antialiasing filter impulse response.

Figure 6-4: FIR antialiasing filter frequency response.

The resulting filter is shown in Figures 6-5 and 6-6. As expected, combining the antialiasing and resampling functions into one frequency response has decreased the total complexity. The 36-segment filter has a group delay of 5.5 ms, a large improvement over the 7.4 ms total latency of the separate 41-tap FIR and 8-segment resampler.

Alternately, the same increased efficiency may be used to improve the frequency response achievable with a given impulse response length. The total impulse response length for the FIR and 8-segment resampler is $49T_r$. Redesigning the combined filter with K = 48 (not 49, which is odd) and appropriately adjusted weights yields the filter in Figures 6-7 and 6-8. With no degradation of stopband attenuation, the passband ripple is now only 0.6 dB.
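The latency figures quoted above can be reproduced with a short calculation. The calculation assumes, and this is an assumption rather than something stated explicitly in the text, that the FIR filter and both polynomial filters have symmetric (linear-phase) impulse responses, so that the group delay is half the impulse response duration. It also takes $F_r = 3250$ Hz, so that $T_r = 1/F_r$.

$$\begin{aligned}
\tau_{\text{FIR}} &= \frac{41 - 1}{2}\, T_r = \frac{20}{3250\ \text{Hz}} \approx 6.15\ \text{ms}, \\
\tau_{8} &= \frac{8}{2}\, T_r = \frac{4}{3250\ \text{Hz}} \approx 1.23\ \text{ms}, \\
\tau_{\text{FIR}} + \tau_{8} &\approx 7.4\ \text{ms}, \\
\tau_{36} &= \frac{36}{2}\, T_r = \frac{18}{3250\ \text{Hz}} \approx 5.5\ \text{ms}.
\end{aligned}$$

Under the same assumption, the 48-segment design would have a group delay of $24\,T_r \approx 7.4$ ms, comparable to the latency of the separate FIR and 8-segment resampler.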
In addition to improved performance, the consolidation of these two components allows the elimination of a large and complex hardware multiply/accumulate unit, greatly reducing the required FPGA resources.

Figure 6-5: 36-segment resampler impulse response.

Figure 6-6: 36-segment resampler frequency response.

Figure 6-7: 48-segment resampler impulse response.

Figure 6-8: 48-segment resampler frequency response.

6.2 Implementation in Hardware

The HPG/MMIMU interface resampler, as described above, has been implemented in hardware as a fully synthesizable VHDL entity. The code, comprising seven processes and approximately five hundred lines, is included in Appendix A. The entity includes inputs for the two TFG domain clocks (ClockFS and ClockRC in Appendix A), SSTB, and BITCLK, as well as 24-bit input and output channels and a pair of handshaking signals to support glitch-free transmission of the resampled data from one clock domain to another. Although the provided code represents the 36-segment combined filter/resampler designed above, implementing a shorter or otherwise different impulse response requires only a change of constant declarations.

As the interface between two clock domains, the resampler implementation naturally consists of two major subsystems, although there are also a number of less complex processes performing auxiliary functions. One of these major subsystems, the FIR evaluation finite state machine, operates entirely within the TFG clock domain. It accepts new input samples from the rate channel CIC filter and evaluates the N + 1 FIR filters sequentially, one tap at a time.
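The tap-by-tap evaluation performed by the FIR evaluation state machine can be sketched in software. The sketch below is a floating-point illustration that loosely mirrors the circular delay line in Appendix A, where the head index moves backward as samples arrive and the tap index moves forward with wraparound during evaluation. The class and function names are invented for this example, and the state machine sequencing and mutex handling are omitted.

```python
class DelayLine:
    """Circular buffer whose head index moves backward as samples arrive."""

    def __init__(self, length):
        self.samples = [0.0] * length
        self.head = 0

    def push(self, sample):
        """Store a new sample and return the index where it was written."""
        start = self.head
        self.samples[start] = sample
        self.head = (self.head - 1) % len(self.samples)
        return start


def evaluate_branch(line, start, branch_coeffs):
    """One multiply/accumulate per tap, beginning with the newest sample."""
    acc = 0.0
    tap = start
    for c in branch_coeffs:
        acc += c * line.samples[tap]
        tap = (tap + 1) % len(line.samples)   # tap index rolls over at the end
    return acc


# Three-tap example: after pushing 1, 2, 3 the newest sample is 3
line = DelayLine(3)
for x in (1.0, 2.0, 3.0):
    newest = line.push(x)
branch_output = evaluate_branch(line, newest, [1.0, 10.0, 100.0])
```

Because the buffer is circular, no samples are shifted when a new input arrives; only the head index changes. Evaluating all N + 1 branches amounts to calling `evaluate_branch` once per branch with the same starting index.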
The other subsystem, the resampling finite state machine, waits for an output sampling instant as indicated by the SSTB signal from the MMIMU processor. Upon receipt of the SSTB, it determines the parameter $\mu$ based on the fraction d of the input sampling period that has elapsed since the last input sample. It then evaluates the polynomial using the FIR filter outputs as coefficients, and provides the requested output sample for transmission over the MMIMU interface.

Another FSM counts elapsed fast clock (ClockFS) cycles since the arrival of the most recent input sample, allowing measurement of the fractional interval. There are 1024 fast clock periods per input sample period, so this count starts at -512 and climbs to 511 before wrapping around, thus eliminating the need to explicitly calculate $\mu$ from d.

As a further optimization, the two major subsystems share a single multiplier and adder unit. The two phases of the computation occur asynchronously, so sharing requires a mutual exclusion mechanism (mutex). One additional FSM supports this feature by managing access to the shared arithmetic logic. The FIR evaluation and resampling finite state machines each indicate their need for the arithmetic logic via mutex request signals and receive access on a first come first served basis.

In addition to allowing more efficient use of expensive hardware, the mutual exclusion mechanism plays a more critical role. The two main finite state machines represent the same type of producer/consumer relationship that frequently causes difficulties in multithreaded computer architectures. If they were to operate completely independently, the FIR evaluation finite state machine could potentially alter the polynomial coefficients while the resampling FSM is using them, resulting in an incorrect output sample. The mutex which regulates access to the arithmetic logic also serves to synchronize these two operations and prevent such an occurrence.
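The reason a counter running from -512 to 511 removes the need to compute $\mu$ from d can be shown with a brief sketch. It assumes the signed count is interpreted as a fixed-point value scaled by 512, which is one plausible reading of the implementation rather than a documented detail; all names below are hypothetical.

```python
CLOCKS_PER_SAMPLE = 1024   # fast clock periods per input sample period
HALF = CLOCKS_PER_SAMPLE // 2


def phase_count(elapsed_clocks):
    """Counter value after elapsed_clocks fast clock periods since the last
    input sample: starts at -512 and wraps after reaching 511."""
    return (elapsed_clocks % CLOCKS_PER_SAMPLE) - HALF


def mu_from_count(count):
    """Interpret the signed count as mu on [-1, 1) (assumed scaling of 1/512)."""
    return count / HALF


def mu_from_d(elapsed_clocks):
    """The explicit route: form d first, then apply mu = 2d - 1."""
    d = (elapsed_clocks % CLOCKS_PER_SAMPLE) / CLOCKS_PER_SAMPLE
    return 2.0 * d - 1.0


# The two routes agree for every possible count
agree = all(mu_from_count(phase_count(e)) == mu_from_d(e)
            for e in range(CLOCKS_PER_SAMPLE))
```

Offsetting the counter by half its range performs the subtraction in $\mu = 2d - 1$ implicitly, and the doubling is absorbed into the scaling of the count.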
With the new resampler, the complete HPGD design is quite compact. Despite improved image attenuation, the synthesized HPGD occupies 75% of the available FPGA logic, 1% less than the design it replaces. This compactness translates into reduced FPGA power consumption as well as room to add to the design, if necessary, without redesigning the HPG circuit board.

VHDL simulation of the HPG system with this new resampler shows excellent performance. Figure 6-9 presents a small sample of the CIC filter output, with both the desired 125 Hz angular rate and the unwanted 1 kHz TFG artifact clearly visible. Figure 6-10 is the power spectrum of the same signal. The Farrow structure resampler applies simulated continuous time filtering to this signal using the frequency response in Figure 6-6, then resamples the resulting signal at 600 Hz. The resulting signal appears in Figure 6-11. The 125 Hz test tone remains, but the 1 kHz artifact appears to have been removed. The power spectrum in Figure 6-12 confirms this observation. The 1 kHz artifact, which would have aliased into the passband upon resampling, has been pushed below -60 dB, and none of the spectral components generated by asynchronous resampling exceed -60 dB. Experiments with other test signals and additional stopband components show similar results, with the requirements met in all cases.

Figure 6-9: Simulated angular rate signal prior to resampling.

Figure 6-10: Spectrum of angular rate signal prior to resampling.

Figure 6-11: Simulated angular rate signal after resampling.
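The statement that the 1 kHz artifact would alias into the passband when resampled at 600 Hz can be verified with the standard frequency folding relation for real signals. The helper below is an illustrative sketch; its name is not taken from the thesis.

```python
def alias_frequency(f_hz, sample_rate_hz):
    """Apparent frequency of a real tone at f_hz after sampling at
    sample_rate_hz, folded into the range 0 to sample_rate_hz / 2."""
    folded = f_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


artifact_alias = alias_frequency(1000.0, 600.0)   # 1 kHz TFG artifact
tone_alias = alias_frequency(125.0, 600.0)        # 125 Hz test tone
```

The artifact folds to 200 Hz, inside the 300 Hz interface bandwidth, while the 125 Hz test tone is unaffected. This is why the artifact must be attenuated before resampling rather than filtered out afterward.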
Figure 6-12: Spectrum of angular rate signal after resampling.

Chapter 7 Conclusion

7.1 Results

This thesis is a response to the need for a digital signal processing mechanism capable of moving a real-time gyroscope angular rate signal between the Draper Laboratory High Performance Gyroscope instrument and its parent system, which use asynchronous clocks. Despite the discrete-time context, asynchronous resampling was shown to be a fundamentally continuous time operation, and failure to consider the implications of continuous time processing may result in significant corruption of the gyroscope signal. Initial attempts to address this need made use of zero-order hold and linear interpolation resampling methods. However, theoretical analysis and simulation demonstrated that neither of these provides a level of signal distortion low enough to satisfy the specific requirements of the HPG system.

In response to this performance shortfall, the theory describing zero-order hold and linear interpolators was generalized to include a large family of resampling filters. Every such filter has a continuous time impulse response consisting of an integer number of non-overlapping segments, each of which is described by a low-order polynomial. Due to the linearity property of the Fourier transform, the filter frequency response is at every point a linear combination of the coefficients describing its impulse response. This property allows the filter to be designed directly in the frequency domain using minimax optimization, a form of multiobjective linear programming. The filter may be made to approximate arbitrary frequency domain requirements with weighted error optimized in a minimax sense, in a process very similar to Remez-exchange FIR filter design. This minimax design process was employed in the creation of a simple resampling filter that successfully addresses the HPG requirements.
A more aggressive design was then proposed, which uses a sophisticated resampling filter to eliminate the need for additional antialiasing filtration elsewhere in the system, thus realizing a reduction in total system complexity and processing latency.

Using the definition of continuous time convolution as a starting point, it was shown how the piecewise polynomial formulation leads naturally to an efficient implementation structure. This so-called Farrow structure allows for the simulated evaluation of a continuous time resampling filter using discrete time digital logic. A synthesizable VHDL implementation was demonstrated via simulation, and shown to have the required performance in a form compatible with the existing HPG design.

7.2 Future Work

Although examples of the new resampling mechanism exist and perform well in simulation, the unavailability of actual gyroscope hardware and testing facilities at the time of this writing has precluded hardware verification. It is unlikely that the resampler itself will differ significantly from predicted performance, but there remains the possibility that testing will reveal inadequacies in the input signal characterization used to design it. In this case, it may be necessary to redesign the resampler coefficients as appropriate. The same is true if the natural frequency of any given gyroscope changes significantly, as is expected to occur as its design is refined.

It may also be worthwhile to further explore the time-domain properties of minimax filter design. As noted, it is necessary to explicitly constrain the time domain response to ensure continuity at the segment endpoints. However, the consequences of allowing discontinuity are not fully understood. It is possible that discontinuous impulse responses may possess useful properties not considered in this thesis.
70 Appendix A Resampler VHDL Implementation library ieee; use ieee.std-logic-1164.all; use ieee.std-logicarith.all; use ieee.stdlogic-unsigned.all; entity Resampler is port( ClockFS : in stdlogic; Fastest TFG domain clock : in stdjlogic; -TFG domain rate sample clock, ClockFS/1024 : in std-logic-vector(23 downto 0); -TFG domain data in : in std-logic; Fastest MMIMU domain clock : in std-logic; -MMIMU domain rate sample clock : out std-logicvector(23 downto 0); -MMIMU domain data out -- ClockRC DataIn BitClock SSTB DataOut DataOutValid : out std-logic; Asserted when MMIMU domain rate sample corresponding to most recent SSTB is ready : in stdlogic; -Acknowledgement signal for handshake protocol : in std.logic -- DataTaken Reset -- System reset signal end Resampler; 71 architecture ResamplerArch of Resampler is -------- Constant Declarations -------------------------------------- : integer := 2; constant kInterpOrder Order of polynomial interpolation (one less than number of --- branches) integer := 35; constant kFIROrder Order of individual FIR branches (one less than number of -taps) -- integer := DataIn'length; constant kSampleWidth -- Bit width of each input sample : integer := 24; constant kCoeffWidth -- Bit width of filter coefficients : integer := 24; constant kMACWidth Bit width multiply/accumulate operations and associated --- registers time := 1 ns; constant Tsd Short signal assignment delay to prevent simulation issues -- -------- Type Declarations ------------------------------------------ is signed(kSampleWidth-1 downto 0); subType SampleType := (others => '0'); : SampleType constant kZeroSample -- Data type of incoming signal samples, and zero value is signed(kCoeffWidth-1 downto 0); subType CoeffType := (others => '0'); : CoeffType constant kZeroCoeff -- Data type of filter coefficients, and zero value is signed(kMACWidth-1 downto 0); subType MACType := (others => '0'); : MACType constant kZeroMAC -- Data type used for arithmetic, and zero value for 
subType BranchIndexType is integer range 0 to kInterpOrder; -- Index into coefficient array, constrained to valid range is integer range 0 to kFIROrder; subType TapIndexType -- Index into coefficient array, constrained to valid range type DelayLineType -- is array(TapIndexType) of SampleType; Delay line buffer for input samples is array(TapIndexType) of CoeffType; type FIRBranchType -- Coefficient storage for one Farrow structure FIR branch 72 type CoeffArrayType is array(BranchIndexType) -- Aggregation of FIR branches of FIRBranchType; type PolynomCoeffsType is array(BranchIndexType) of MACType; -Temporary holder for FIR outputs -(the polynomial coefficients) -------type signal ---- Finite State Machines -----------------------------------SSTBHoldStateType is (Ready, Holding); SSTBHoldState SSTBHoldStateType := Ready; SSTB Hold FSM Captures SSTB signal in the MMIMU domain, and holds it until the TFG domain (the Resampling FSM) acknowledges receipt type FIRFSMStateType is (Idle, SecureALU, StartBranch, EvalBranch, EndBranchi, EndBranch2); signal FIRFSMState : FIRFSMStateType:= Idle; -- FIR Evaluation FSM -- Handles evaluation of FIR branches upon receipt of new -- input sample type ResampleStateType signal ResampleState is (Idle, ConfirmSSTB, AckSSTB, MultBranch, AddBranch, FinishEval, AssertValid, AwaitConfirm); ResampleStateType := Idle; -- Resampling FSM -- Captures fractional interval upon receipt of SSTB, the evaluates the final polynomial to produce an output sample -- -------- Signal Declarations --------------------------------------- signal PhaseCounter -- : integer range -512 to 511; Counter for tracking fractional interval signal FracInt : MACType := kZeroMAC; -- Fractional interval captured from PhaseCounter -- synthesis translate-off signal SampleInstActual : time; signal SampleInstIdeal : time; -- Non-synthesizing code for measuring sample time error -- synthesis translate-on signal DelayLine -- : DelayLineType := (others => kZeroSample); input sample 
delay line (circular buffer implementation) 73 signal PolynomCoeffs : PolynomCoeffsType -- buffer for FIR outputs (others => kZeroMAC); signal BranchIndex : BranchIndexType 0; -- index of FIR branch currently being processed signal SSTBHeld : std-logic '0'; signal SSTBAck : std-logic '0'; -- Handshake signals used by SSTB Hold and FIR Evaluation FSMs signal SynchedDataTaken : std-logic := '0'; -- TFG domain synchronized version of MMIMU interface handshake signal Data~utInt Internal version signal Data0utValidInt Internal version : SampleType := kZeroSample; of output sample port : std-logic := '0'; of output sample valid port signal P1NeedsALU : boolean := false; ALU mutex request signal for FIR Evaluation FSM signal P2NeedsALU : boolean false; ALU mutex request signal for Resampling FSM signal ALUmutex : integer range 0 to 2 := 0; ALU mutex state -0 if unheld -1 if FIR Evaluation FSM has ALU access -2 if Resampling FSM has ALU access signal signal signal signal PiMultInA : MACType P1MultInB : MACType P2MultInA : MACType P2MultInB : MACType Multiplier inputs used by each FSM signal MultResult : MACType Multiplier output : : : kZeroMAC; signal signal signal signal : : : : kZeroMAC; kZeroMAC; kZeroMAC; kZeroMAC; : kZeroMAC; PlAddInA : MACType P1AddInB : MACType P2AddInA : MACType P2AddInB : MACType Adder inputs used by each FSM signal AddResult MACType Adder output constant BranchO : : kZeroMAC; kZeroMAC; kZeroMAC; kZeroMAC; : FIRBranchType slv(x"FFE1A6"), slv(x"FFD16C"), slv(x"FFB8A2"), slv(x"FFA11E"), slv(x"FF947C"), slv(x"FF9DOA"), 74 slv(x"FFC782"), slv(x"017989"), slv(x"050216"), slv(x"08ADCC"), slv(x"09DD21"), slv(x"079CAA"), slv(x"O03BB1"), slv(x"OOAED2"), slv(x"FF9DOA"), slv(x"FFB8A2"), slv(x"001FC2"), slv(x"027D4B"), slv(x"0659EC"), slv(x"097452"), slv(x"097452"), slv(x"0659EC"), slv(x"027D43"), slv(x"001FC2"), slv(x"FF947C"), slv(x"FFD16C"), slv(x"OOAED2"), slv(x"03BOB1"), slv(x"079CAA"), sl(x"09DD21"), slv(x"08ADCC"), slv(x"050216"), slv(x"017989"), sly 
(x"FFC782"), slvWx"FFA11E"), slv(x"FFE1A6") constant Branchi slv(x"0006C6"), slv(x"FFF64D"), slv(x"002004"), slv(x"0073FE"), slv(x"OOABEC"), slv(x"007760"), slv(x"FFE330"), slv(x"FF6958"), slv(x"FF5D7F"), slv(x"FFA9A3"), slv(x"FFF40A"), slv(x"00OC5D"), FIRBranchType slv(x"FFF342"), slv(x"FFFE44"), slv(x"003960"), slv(x"008EA3"), slv(x"00A86E"), slv(x"004C6A"), slv(x"FFB396"), slv(x"FF5792"), slv(x"FF715D"), slv(x"FFC6AO"), slv(x"0001BC"), slv(x"OOOCBE"), slv(x"FFF3A3"), slv(x"OOOBF6"), slv(x"00565D"), slv(x"00A281"), slv(x"0096A8"), slv(x"001CDO"), slv(x"FF88AO"), slv(x"FF5414"), slv(x"FF8CO2"), slv(x"FFDFFC"), slv(x"0009B3"), slv(x"FFF93A") constant Branch2 slv(x"FFF5FE"), slv(x"000164"), slv(x"0005C5"), slv(x"00073A"), slv(x"OOOODF"), slv(x"FFF6A1"), slv(x"FFF44F"), slv(x"FFF9BB"), slv(x"0003D7"), slv(x"000796"), slv(x"000442"), slv(x"FFFFFO"), : FIRBranchType slv(x"000040"), slv(x"000297"), slv(x"0006E8"), slv(x"000619"), slv(x"FFFD62"), slv(x"FFF3E4"), slv(x"FFF3E4"), slv(x"FFFD62"), slv(x"000619"), slv(x"0006E8"), slv(x"000297"), slv(x"000040"), slv(x"FFFFFO"), slv(x"000442"), slv(x"000796"), slv(x"0003D7"), slv(x"FFF9BB"), slv(x"FFF44F"), slv(x"FFF6A1"), slv(x"OOOODF"), slv(x"00073A"), slv(x"0005C5"), slv(x"000164"), slv(x"FFF5FE") constant CoeffArray : CoeffArrayType : (BranchO, Branchi, Branch2); -- Coefficients for each FIR branch, declared individually 75 --- then combined into a 2D array. 
sly is a subroutine that converts hex strings into standard logic vectors begin DataOut <= std-logic.vector(DataOutInt) after Tsd; <= DataOutValidInt after Tsd; DataOutValid -- Mirror internal signals on output ports FIRFSM : process variable StartTap variable CurrTap variable CurrBranch : TapIndexType; : TapIndexType; : BranchIndexType; variable CurrCoeff variable DLHeadIndex : TapIndexType; : TapIndexType := 0; begin wait until ClockFS = '1'; if (Reset = '1') then Idle after Tsd; 0; 0; 0; FIRFSMState CurrCoeff CurrBranch DLHeadIndex else case FIRFSMState is when Idle => if (ClockRC = '1') then DelayLine(DLHeadIndex) <= SampleType(DataIn); -- Capture new TFG domain input sample DLHeadIndex; 0; StartTap CurrBranch -- Initialize tap and branch index variables if (DLHeadIndex DLHeadIndex else DLHeadIndex = 0) then kFIROrder; DLHeadIndex - 1; end if; -- Decrement delay line buffer head index P1NeedsALU 76 <= true after Tsd; FIRFSMState <= StartBranch after Tsd; -- Start evaluation of FIR structure end if; when StartBranch => -- Spin until ALU mutex is secured if (ALUmutex = 1) then PiMultInA <= kZeroMAC after Tsd; P1MultInB <= kZeroMAC after Tsd; PlAddInA <= kZeroMAC after Tsd; PlAddInB <= kZeroMAC after Tsd; CurrTap StartTap; CurrCoeff 0; FIRFSMState <= EvalBranch after Tsd; end if; when EvalBranch => -- Evaluate one FIR branch PlMultInA P1MultInB PlAddInA PlAddInB <= CoeffArray(CurrBranch)(CurrCoeff); <= DelayLine(CurrTap) after Tsd; <= MultResult after Tsd; <= AddResult after Tsd; if (CurrTap = DLHeadIndex) then -- We've finished the FIR branch FIRFSMState <= EndBranchi after Tsd; else CurrCoeff := CurrCoeff + 1; end if; -- Increment tap index if (CurrTap = kFIROrder) then -- tap index increments by rolling over CurrTap := 0; else tap index just increments CurrTap := CurrTap + 1; end if; -- when EndBranchi => <= MultResult after Tsd; <= AddResult after Tsd; <= EndBranch2; PlAddInA PlAddInB FIRFSMState 77 when EndBranch2 => PolyNomCoeffs(CurrBranch) <= AddResult; -- 
          -- Store final result of mult/acc in the
          -- polynomial coefficient register
          if (CurrBranch = kInterpOrder) then
            -- We've finished the whole structure
            P1NeedsALU  <= false after Tsd;
            FIRFSMState <= Idle after Tsd;
            -- Relinquish the ALU mutex and return to idle
          else
            -- Advance to next branch
            CurrBranch  := CurrBranch + 1;
            FIRFSMState <= StartBranch after Tsd;
          end if;

        when others =>
          P1NeedsALU  <= false after Tsd;
          FIRFSMState <= Idle;

      end case;
    end if;
  end process;

  ResampleFSM : process
    variable CurrBranch : BranchIndexType := 0;
  begin
    wait until ClockFS = '1';
    if (Reset = '1') then
      DataOutValidInt <= '0' after Tsd;
      DataOutInt      <= (others => '0') after Tsd;
      SSTBAck         <= '0' after Tsd;
      ResampleState   <= Idle after Tsd;
    else
      SynchedDataTaken <= DataTaken after Tsd;
      -- Synchronize the data taken signal in the TFG
      -- domain to avoid glitches
      case ResampleState is

        when Idle =>
          DataOutValidInt <= '0' after Tsd;
          if ((SSTBHeld = '1') or (SSTB = '1')) then
            -- synthesis translate_off
            SampleInstActual <= now;
            -- synthesis translate_on
            P2MultInA <= conv_signed(PhaseCounter, kMACWidth);
            -- put the fractional interval in P2MultInA
            -- so it is ready for multiplication later
            P2NeedsALU <= true after Tsd;
            -- request the ALU mutex
            ResampleState <= ConfirmSSTB after Tsd;
          end if;

        when ConfirmSSTB =>
          if (SSTBHeld = '1') then
            SSTBAck       <= '1' after Tsd;
            ResampleState <= AckSSTB after Tsd;
          end if;

        when AckSSTB =>
          -- spin until SSTB receipt is acknowledged and
          -- ALU mutex is secure
          if ((SSTBHeld = '0') and (ALUmutex = 2)) then
            SSTBAck    <= '0' after Tsd;
            CurrBranch := kInterpOrder;
            -- initialize the branch counter
            P2MultInB <= kZeroMAC after Tsd;
            -- zero the second multiplier input so we don't add
            -- anything on the first run through AddBranch
            ResampleState <= AddBranch after Tsd;
          end if;

        when AddBranch =>
          P2AddInA <= MultResult after Tsd;
          P2AddInB <= PolyNomCoeffs(CurrBranch);
          if (CurrBranch = 0) then
            ResampleState <= FinishEval after Tsd;
          else
            ResampleState <= MultBranch after Tsd;
          end if;

        when MultBranch =>
          P2MultInB     <= AddResult after Tsd;
          CurrBranch    := CurrBranch - 1;
          ResampleState <= AddBranch after Tsd;

        when FinishEval =>
          DataOutInt <= AddResult after Tsd;
          P2NeedsALU <= false after Tsd;
          -- relinquish ALU mutex
          ResampleState <= AssertValid after Tsd;

        when AssertValid =>
          DataOutValidInt <= '1' after Tsd;
          if (SynchedDataTaken = '1') then
            -- MMIMU interface signals taken, so
            -- deassert valid
            DataOutValidInt <= '0' after Tsd;
            ResampleState   <= AwaitConfirm after Tsd;
          end if;

        when AwaitConfirm =>
          DataOutValidInt <= '0' after Tsd;
          if (SynchedDataTaken = '0') then
            -- MMIMU interface confirmed receipt, so
            -- return to Idle
            ResampleState <= Idle;
          end if;

        when others =>
          ResampleState <= Idle;

      end case;
    end if;
  end process;

  SSTBHoldFSM : process
    -- Capture the sample strobe and hold it until the TFG
    -- domain acknowledges. This is probably not necessary
    -- for most clock frequency combinations, but important
    -- for a robust design
  begin
    wait until BitClock = '0';
    if (Reset = '1') then
      SSTBHeld      <= '0' after Tsd;
      SSTBHoldState <= Ready after Tsd;
    else
      case SSTBHoldState is

        when Ready =>
          if (SSTB = '1') then
            SSTBHeld      <= '1' after Tsd;
            SSTBHoldState <= Holding after Tsd;
          else
            SSTBHeld <= '0' after Tsd;
          end if;

        when Holding =>
          if (SSTBAck = '1') then
            SSTBHeld      <= '0' after Tsd;
            SSTBHoldState <= Ready after Tsd;
          else
            SSTBHeld <= '1' after Tsd;
          end if;

        when others =>
          SSTBHeld      <= '0' after Tsd;
          SSTBHoldState <= Ready after Tsd;

      end case;
    end if;
  end process;

  TrackPhase : process
  begin
    wait until ClockFS = '1';
    if (Reset = '1') then
      PhaseCounter <= -512 after Tsd;
    else
      if (ClockRC = '1') then
        PhaseCounter <= -512 after Tsd;
      else
        PhaseCounter <= PhaseCounter + 1 after Tsd;
      end if;
    end if;
  end process;

  MutexProcess : process (P1NeedsALU, P2NeedsALU)
    -- Manage access to the single ALU using a mutex token
  begin
    case ALUmutex is

      when 0 =>
        -- Mutex is available
        if (P1NeedsALU) then
          ALUmutex <= 1 after Tsd;
          -- give it to FIR FSM
        elsif (P2NeedsALU) then
          ALUmutex <= 2 after Tsd;
          -- give it to
          -- Resampling FSM
        end if;

      when 1 =>
        -- FIR FSM has mutex
        if (not P1NeedsALU) then
          if (P2NeedsALU) then
            ALUmutex <= 2 after Tsd;
            -- hand off mutex to Resample FSM state
          else
            ALUmutex <= 0 after Tsd;
            -- set mutex to free
          end if;
        end if;

      when 2 =>
        if (not P2NeedsALU) then
          if (P1NeedsALU) then
            ALUmutex <= 1 after Tsd;
            -- hand off mutex to FIR FSM state
          else
            ALUmutex <= 0 after Tsd;
            -- set mutex to free
          end if;
        end if;

    end case;
  end process;

  Adder : process (P1AddInA, P1AddInB, P2AddInA, P2AddInB, ALUmutex)
    variable AddInA  : MACType := kZeroMAC;
    variable AddInB  : MACType := kZeroMAC;
    variable TempSum : MACType := kZeroMAC;
  begin
    if (ALUmutex = 1) then
      AddInA := P1AddInA;
      AddInB := P1AddInB;
    else
      AddInA := P2AddInA;
      AddInB := P2AddInB;
    end if;
    AddResult <= AddInA + AddInB after Tsd;
  end process;

  Multiplier : process (P1MultInA, P1MultInB, P2MultInA, P2MultInB, ALUmutex)
    variable MultInA     : MACType := kZeroMAC;
    variable MultInB     : MACType := kZeroMAC;
    variable TempProduct : signed(2*kMACWidth-1 downto 0);
  begin
    if (ALUmutex = 1) then
      MultInA := P1MultInA;
      MultInB := P1MultInB;
    else
      MultInA := P2MultInA;
      MultInB := P2MultInB;
    end if;
    TempProduct := conv_signed(MultInA * MultInB, 2*kMACWidth);
    if (ALUmutex = 1) then
      MultResult <= MACType( TempProduct(2*kMACWidth-3 downto kMACWidth-2) ) after Tsd;
    else
      MultResult <= MACType( TempProduct(2*kMACWidth-16 downto kMACWidth-15) ) after Tsd;
    end if;
  end process;

end ResamplerArch;

Bibliography

[1] Djordje Babic, Jussi Vesma, and Markku Renfors. Decimation by irrational factor using CIC filter and linear interpolation. In Proc. Int. Conf. on Acoustics, Speech, and Signal Processing, pages 3677-3680, Salt Lake City, UT, USA, May 2001. IEEE.

[2] S. Cucchi, F. Desinan, G. Parladori, and G. Sicuranza. DSP implementation of arbitrary sampling frequency conversion for high quality sound application. In Int. Conf. on Acoustics, Speech, and Signal Processing, volume 5, pages 3609-3612, Toronto, Ont., Canada, April 1991.

[3] C. W. Farrow.
A continuously variable digital delay element. In Proc. Int. Conf. on Acoustics, Speech, and Signal Processing, pages 2641-2645, Espoo, Finland, June 1988. IEEE.

[4] Tim Hentschel and Gerhard Fettweis. Software radio receivers. In Francis Swarts, Pieter von Rooyan, Ian Opperman, and Michiel Lötter, editors, CDMA Techniques for Third Generation Mobile Systems, chapter 10, pages 257-283. Kluwer Academic Publishers, Boston, 1999.

[5] Eugene B. Hogenauer. An economical class of digital filters for decimation and interpolation. IEEE Transactions on Acoustics, Speech, and Signal Processing, ASSP-29(2):155-162, April 1981.

[6] Timo I. Laakso, Vesa Välimäki, and Jukka Henriksson. Tunable downsampling using fractional delay filters with applications to digital TV transmission. In Proc. Int. Conf. on Acoustics, Speech, and Signal Processing, pages 1304-1307, Detroit, MI, USA, May 1995. IEEE.

[7] Timo I. Laakso, Vesa Välimäki, Matti Karjalainen, and Unto K. Laine. Splitting the unit delay: Tools for fractional delay filter design. IEEE Signal Processing Magazine, pages 30-60, January 1996.

[8] Ging-Shing Liu and Che-Ho Wei. A new variable fractional sample delay filter with nonlinear interpolation. IEEE Transactions on Circuits and Systems - II, 39(2):123-126, February 1992.

[9] Alan V. Oppenheim and Ronald W. Schafer. Discrete Time Signal Processing. Prentice Hall, Upper Saddle River, New Jersey, second edition, 1999.

[10] Alan V. Oppenheim and Allan S. Willsky. Signals and Systems. Prentice Hall, Upper Saddle River, New Jersey, second edition, 1983.

[11] Jussi Vesma. Timing adjustment in digital receivers using interpolation. Master of science thesis, Tampere University of Technology, November 1995.

[12] Jussi Vesma. Optimization and Applications of Polynomial-Based Interpolation Filters. PhD thesis, Tampere University of Technology, May 1999.