Chapter 2: Transitioning to Fixed-Point

Introduction
  Fixed-Point Design: Graphics vs Text
The Fixed-Point Echo Canceller Model
  Always Run the Model First
  Side-by-Side: Floating and Fixed
Auto-scaling
  Where Auto-scaling Fits
  Auto-scaling: Step-By-Step
Accelerating Fixed-Point Simulations

Introduction

This section illustrates the process of taking a functioning floating-point design and transitioning it to fixed-point in Simulink. We use MATLAB’s fixed-point toolbox in a similar fashion in Chapter 8.

There is not just one way to convert a design to fixed-point. Experience does help. And even with experience there are still different approaches to working in fixed-point. Some engineers choose to design directly in fixed-point, skipping floating point altogether. In general, we recommend having a floating-point reference design to fall back on regardless of your experience level. If a floating-point design does not work, the cause is not likely to be a numerical problem like quantization or overflow.
With a fixed-point design, however, the source of the problem could be a design flaw, a fixed-point quantization problem, or a combination of both. It’s harder to tell and therefore harder to debug.

Fixed-Point Design: Graphics vs Text

The transition to fixed-point is rarely considered easy. It is intrinsically difficult at times no matter what environment and language you are using. Textual languages like C, assembler, and HDL don’t make it any easier. Let’s take a look at a very small example that should be simple to follow: a fixed-point multiply-accumulator (MAC) with 16-bit inputs, one 32-bit output, and an intermediate data type of 24 bits. Yes, some DSPs do have native 24-bit data types.

Here is a fixed-point MAC modeled graphically in Simulink. It’s easy to see the data types propagating through the blocks.

Now do the same thing textually. Here is the identical fixed-point accumulator modeled textually in C code. Which one do you find easier to comprehend?

Had this been a more complicated example, the differences between graphical and textual fixed-point would be even more extreme. At a glance it’s difficult to figure out what the fixed-point C code is doing since you can’t see any of the fixed-point data types. Remember that standard C has no native fixed-point arithmetic, only integer arithmetic. From integer arithmetic you have to emulate fixed-point behavior with the proper shifting, masking, and casting. Which is the 16-bit data type and where is the binary point? Where is the 24-bit data type? How about the 32-bit data type and its precision?

This example would have been even more difficult to follow in C had the effects of rounding and overflow behavior been included. The C code also does not show each variable’s declaration, which adds yet another dimension to the work when programming in text. In other words, I tried to keep this example simple and make the fixed-point C code compare favorably but couldn’t.

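
For readers who want something concrete to look at, the following is a minimal sketch of what such a MAC might look like in C. It is not the original listing. The formats (sfix16_En15 inputs, an sfix24_En23 product, an sfix32_En23 accumulator) and the names wrap24, mac_step, and mac_q15 are assumptions chosen for illustration. As in the discussion above, rounding and overflow handling are deliberately left out, and the code assumes that >> on a negative integer is an arithmetic shift, which holds on mainstream compilers but is implementation-defined in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed formats (illustrative, not taken from the original figure):
 *   inputs      : sfix16_En15  (16-bit signed, 15 fraction bits)
 *   product     : sfix24_En23  (24-bit signed, 23 fraction bits)
 *   accumulator : sfix32_En23  (32-bit signed, 23 fraction bits)
 */
#define IN_FRAC   15
#define PROD_FRAC 23

/* Emulate a 24-bit register: keep the low 24 bits and sign extend them. */
int32_t wrap24(int32_t v)
{
    uint32_t u = (uint32_t)v & 0x00FFFFFFu;            /* masking        */
    return (int32_t)(u ^ 0x00800000u) - 0x00800000;    /* sign extension */
}

/* One multiply-accumulate step. No rounding (the shift truncates) and no
 * saturation: the caller must keep the running sum inside 32 bits. */
int32_t mac_step(int32_t acc, int16_t a, int16_t b)
{
    int32_t prod_en30 = (int32_t)a * (int32_t)b;        /* En15 * En15 = En30 */
    int32_t prod_en23 = wrap24(prod_en30 >> (2 * IN_FRAC - PROD_FRAC));
    return acc + prod_en23;                             /* stays En23         */
}

/* Dot product of two sfix16_En15 vectors, result in sfix32_En23. */
int32_t mac_q15(const int16_t *x, const int16_t *h, int n)
{
    assert(x != NULL && h != NULL && n >= 0);
    int32_t acc = 0;
    for (int i = 0; i < n; ++i) {
        acc = mac_step(acc, x[i], h[i]);
    }
    return acc;
}
```

With inputs {0.5, -0.25} and coefficients {0.5, 0.5}, stored as {16384, -8192} and {16384, 16384}, the result is 1048576, which is 0.125 with 23 fraction bits. Note that none of these formats is visible in the arithmetic itself; they live only in the comments and in the shift count, which is the point the text is making.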
We think you will find that Simulink’s graphical approach is a natural fit for describing, simulating, and debugging fixed-point behavior. We will re-use the testbench from the earlier floating-point design to evaluate how well the fixed-point design performs by comparing the floating-point and fixed-point designs side by side. This is a very useful technique.

The Fixed-Point Echo Canceller Model

So, where do we start? That’s easy. We start with our working floating-point version of the design. Always start with something that works if you have it, and in this case we do. In this section we lay out the steps necessary to create ec_single_vs_fixed.mdl out of our existing floating-point model.

All of the files referred to in this workflow are accessed from an HTML example selector. You can open this example selector by running the included M-file run_exsel.m. The end result of our work shows up in ec_single_vs_fixed.mdl. This model is accessed via the second link, “Single vs Fixed”, in the example selector. The first thing you should do when you open this model is RUN IT!

Always Run the Model First

Before we dive in and really break down the different parts of the model, run it first. It’s an executable spec, so treat it as such by running it. Running the model first also aids in understanding what you are looking at when you do dive down into the subsystems.

Side-by-Side: Floating and Fixed

One nice thing about working in Simulink is how easy it is to take an existing, known working version of your design and transition it to the next, more evolved version. This is especially true when you are making the move from a floating-point design to a fixed-point one. You won’t merely overwrite the floating-point design with fixed-point parameters. Instead you will preserve the floating-point design and run a fixed-point version in parallel with it. This is the side-by-side approach.

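
The side-by-side idea is not specific to Simulink. As a rough textual analogy, the sketch below feeds one test input to a floating-point reference and to a fixed-point version of the same small FIR filter, then reports the largest difference between the two outputs. Everything here is assumed for illustration: the filter, the formats (sfix16_En14 data, sfix16_En15 coefficients), and the names to_fixed, to_single, fir_float, fir_fixed, and compare_side_by_side. It is not the echo canceller from this chapter.

```c
#include <assert.h>
#include <stdint.h>

#define DATA_FRAC   14   /* data format: sfix16_En14 (assumed)         */
#define COEF_FRAC   15   /* coefficient format: sfix16_En15 (assumed)  */
#define MAX_SAMPLES 64   /* buffer size for this sketch                */

/* float -> signed 16-bit fixed point: round to nearest, saturate. */
int16_t to_fixed(float v, int frac_bits)
{
    assert(frac_bits >= 0 && frac_bits <= 15);
    float s = v * (float)(1 << frac_bits);
    if (s >= 32767.0f)  return INT16_MAX;
    if (s <= -32768.0f) return INT16_MIN;
    return (int16_t)(s >= 0.0f ? s + 0.5f : s - 0.5f);
}

/* fixed point -> float. */
float to_single(int64_t q, int frac_bits)
{
    assert(frac_bits >= 0 && frac_bits < 62);
    return (float)q / (float)((int64_t)1 << frac_bits);
}

/* Reference: FIR filter in single precision. */
void fir_float(const float *x, int n, const float *h, int taps, float *y)
{
    for (int i = 0; i < n; ++i) {
        float acc = 0.0f;
        for (int k = 0; k < taps && k <= i; ++k) acc += h[k] * x[i - k];
        y[i] = acc;
    }
}

/* Candidate: same filter in fixed point, converting only at the boundary.
 * A 64-bit accumulator is used so this sketch cannot overflow. */
void fir_fixed(const float *x, int n, const float *h, int taps, float *y)
{
    for (int i = 0; i < n; ++i) {
        int64_t acc = 0;                                /* En29 */
        for (int k = 0; k < taps && k <= i; ++k) {
            int16_t xq = to_fixed(x[i - k], DATA_FRAC);
            int16_t hq = to_fixed(h[k], COEF_FRAC);
            acc += (int32_t)xq * (int32_t)hq;           /* En14 * En15 */
        }
        y[i] = to_single(acc, DATA_FRAC + COEF_FRAC);
    }
}

/* Run both versions on the same input; return the largest |difference|. */
float compare_side_by_side(const float *x, int n, const float *h, int taps)
{
    assert(n >= 0 && n <= MAX_SAMPLES && taps > 0);
    float y_ref[MAX_SAMPLES];
    float y_fix[MAX_SAMPLES];
    float worst = 0.0f;
    fir_float(x, n, h, taps, y_ref);
    fir_fixed(x, n, h, taps, y_fix);
    for (int i = 0; i < n; ++i) {
        float d = y_ref[i] - y_fix[i];
        if (d < 0.0f) d = -d;
        if (d > worst) worst = d;
    }
    return worst;
}
```

The design choice mirrors the one in the text: the reference is never modified, both versions see exactly the same stimulus, and the only thing inspected is the difference between them.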
The single-precision version of the echo canceller is in red (reference) and an initial guess at the fixed-point design is in green, i.e. unproven. The key point is that both versions use the same testbench.

Here is one set of steps you could follow to transition this design to fixed-point.

1. Save-As your working reference model to a different name so you don’t overwrite the working version. This is important. Don’t overlook the obvious.
2. Create a copy of the floating-point LMS block with a right-click drag and drop.
3. Change its settings to fixed-point using its mask dialog. Make best initial guesses as to what the word lengths and precisions should be. Start out conservative. Later on, we will investigate the Fixed-Point Tool, which can fine-tune your design.
4. Surround the LMS with interface blocks to the testbench. These blocks convert between the testbench domain and the algorithm’s data domain. The ToSingle and ToFixed interface blocks clearly separate the testbench domain from the implementation domain. In many cases the testbench may be in the continuous-time domain.
   This is what’s inside the ToFixed subsystem. The Convert block is part of the basic Simulink library under Signal Attributes. The Data Type Conversion block is set up to output a signed fixed-point data type with a word length of 16 bits and the binary point at bit 14 (14 fraction bits). This signal can take on values between -2 and just under +2.
5. Re-use the source blocks and the measurement blocks already present. The Matrix Concatenation blocks (in cyan/aqua color below) are handy if you want to overlay two sets of results.
6. The Goto and From blocks come in handy to avoid routing wires long distances across the model. We will investigate an alternative technique for “clean signal routing” using buses in a later section.
7. Run the model and see how the fixed-point implementation compares.
If it doesn’t compare favorably, you can adjust the word lengths, precisions, and rounding modes manually or use the auto-scaling capability.

In this figure we show how to architect your model for side-by-side comparisons of a floating-point implementation in red with a fixed-point one in green. Red mnemonically stands for reference, i.e. a proven working design. Green denotes something that is unproven, i.e. it’s “green”. Notice that we re-use most of the testbench. Only the interface blocks (ToFixed and ToSingle) and the fixed-point LMS block have been added.

Notice how the data types change from single precision to fixed-point and back to single precision again. The testbench stays in single precision since that part is not to be implemented. You can rotate a block using Ctrl-R in case you want the inputs to be on the right. By default the inputs are on the left side of a block.

The two Matrix Concatenation blocks are colored cyan (also called turquoise or aqua). This modeling approach allows us to compare the results of the floating-point and fixed-point simulations overlaid on the same Vector Scope.

The LMS block has its own fixed-point settings. At this point we just set the word lengths and binary points to some initial guesses. This is the dialog for the fixed-point implementation of the LMS subsystem. Everything is set to use Q15 except for the accumulator.

Run this model as-is with Nearest rounding. Observe the way the taps converge. The floating-point and fixed-point implementations agree well when the rounding mode is Nearest. By default these scopes lack a zoom capability, but you can add zoom using a technique introduced in Appendix B in Chapter 1.

Next run the model with the rounding set to Floor, which is basically “no rounding”, also called truncation. The results are drastically different. In a feedback system like the LMS algorithm, small offsets can have a big cumulative effect over time.

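
To see why truncation can hurt so much once results are fed back and accumulated, consider the toy experiment sketched below. It is not the LMS model from this chapter. It simply accumulates a long run of small updates that alternate in sign, so the true sum is zero. The function names, the round_mode_t type, and the Q15 format are assumptions for illustration, and the code relies on >> being an arithmetic shift for negative values.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { ROUND_FLOOR, ROUND_NEAREST } round_mode_t;

/* Q15 x Q15 -> Q15 product with a selectable rounding mode.
 * Floor: plain arithmetic right shift (rounds toward minus infinity).
 * Nearest: add half an LSB before the shift. */
int32_t q15_mul(int16_t a, int16_t b, round_mode_t mode)
{
    int32_t p = (int32_t)a * (int32_t)b;      /* 30 fraction bits */
    if (mode == ROUND_NEAREST) {
        p += 1 << 14;                         /* half of 2^15     */
    }
    return p >> 15;
}

/* Accumulate n small updates that alternate in sign:
 * +step*gain, -step*gain, +step*gain, ...
 * With exact arithmetic and an even n the sum is zero. */
int32_t accumulate_updates(int16_t step, int16_t gain, int n, round_mode_t mode)
{
    assert(n >= 0 && step > INT16_MIN);
    int32_t w = 0;
    for (int i = 0; i < n; ++i) {
        int16_t e = (i % 2 == 0) ? step : (int16_t)-step;
        w += q15_mul(e, gain, mode);
    }
    return w;
}
```

With step = 100, gain = 100, and n = 1000, every positive product truncates to 0 while every negative product truncates to -1, so Floor ends at -500 instead of 0. Nearest rounds both to 0 and ends at 0. Each individual error is at most one LSB, but with Floor the errors all point the same way and the accumulator drifts steadily.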
The green line represents the fixed-point taps when the rounding mode is set to Floor. Performance is not acceptable in this case.

Auto-scaling

This section covers how to use the autoscaling feature of the Fixed-Point Tool. In brief it works like this: you pick the word lengths, and the tool picks the binary point.

Where Auto-scaling Fits

It’s important to keep a few things in mind about the autoscaling tool before getting started with it. Many engineers who are new to fixed-point are quick to assume the Fixed-Point Tool has powers beyond what it actually has, hence this warning.

First, the transition from a floating-point design to a fixed-point design can never be completely automated. Choosing the word lengths, type of rounding, and overflow behavior are still manual steps left to the designer. Rounding can be a make-or-break affair for many fixed-point designs, especially those employing feedback, as we saw earlier. The LMS algorithm has feedback. IIR filters and lattice filters also have feedback. The Fixed-Point Tool does not alter the rounding mode you have selected, nor does it make recommendations for how to set it.

Second, the autoscaling tool works empirically, not analytically or formally. This means the autoscaled results are a function of the inputs you provide, i.e. the results are data-dependent. The autoscaling tool does not analyze the model using some formal procedure to decide where best to place the binary points. The peak-to-peak swings in the output of any system depend on the nature of the particular input sequence. Your fixed-point design may work fine for one set of inputs and then fail on another that you didn’t envision. Thus the engineer should always be actively engaged in the fixed-point design no matter how good the auto-scaling tool is.

Third, the autoscaling tool works only on the existing floating-point design.
Many times there are architectural changes you can make to your floating-point design that will significantly impact the efficiency of the fixed-point design. That being said, the autoscaling tool can provide an excellent starting point from which the engineer can fine-tune.

Auto-scaling: Step-By-Step

Here is one use case of the auto-scaling tool. These steps take only about 5 minutes once you understand the process. That’s a significant time savings over doing it manually.

1. Create a subsystem out of the fixed-point portion of the model. Grab the 3 subsystems, ToSingle, LMS Canceller (fixed), and ToFixed, to create one new subsystem. It should look something like this after you create the new subsystem.
2. Turn the data types display ON. Select Format/Port&Signal Displays/Port Data Types.

1. Select Tools/Fixed Point Settings.
2. Select the subsystem you just created. Here it’s named EC fixed (for autoscaling).
3. Set the Logging mode to “Minimums, maximums, and overflows”.
4. Set Data type override to “True Singles” since our reference design is implemented in single-precision arithmetic. Then hit Apply.
5. Do an update diagram (Ctrl-D) on your model to verify the data type override took effect. Now everything is a single in spite of the fact that the Convert blocks say to output a fixdt(1,16,14). We have overridden the Convert block. This is possible if you haven’t checked the “Lock output scaling against changes by the autoscaling tool” checkbox. Be aware of that.
6. Run the model for a while and hit stop. The Fixed-Point Tool has its own “separate but equal” run, stop, and pause buttons.
7. The Fixed-Point Tool collects minimum and maximum information for every signal in your subsystem of interest and displays it in the middle section of the Fixed-Point Tool. But remember the data was collected using a single-precision implementation. So even if everything is working now, you aren’t out of the woods yet.
8. The columns SimMin and SimMax are the simulation minimums and maximums that occurred over the duration of the latest run.
9. Enter a “Percent Safety Margin”, say 100% for starters. We can trim it back later.
10. Hit “Propose fraction lengths”. The tool computes the most appropriate binary point for each signal given the last simulation run’s minimums and maximums and the percent safety margin. The column Proposed FL shows the proposed fraction lengths.
11. If you want to try a simulation run with these new fixed-point settings, hit “Apply accepted fraction lengths”.
12. Change “Data type override” back to “Use local settings”. Do another Update Diagram by hitting Ctrl-D with the model as the active window. You should see the new auto-scaled fixed-point settings in effect.
13. Run the model again and see how well the auto-scaled results do. If you want to be more aggressive, repeat steps 9 through 13 with a reduced safety margin.
14. If you want to be more aggressive yet, you could manually reduce the word lengths for the LMS filter.

For another model where auto-scaling was applied, see ec_fixed_slp_sb_autoscaled.mdl. In this model you’ll see scalings like 21 fraction bits in a 16-bit number, sfix16_En21. This is a completely valid fixed-point description. It may not sound intuitive that you can have more fraction bits than total bits, but you can! All 16 stored bits simply sit to the right of the binary point, so the type covers roughly -2^-6 to +2^-6 (about ±0.0156) with a resolution of 2^-21.

Accelerating Fixed-Point Simulations

In general there are four main things that slow down Simulink simulations.

1. Having displays turned on. Any scope or display will slow down the simulation. It’s best to place displays inside enabled subsystems. This will be covered in a later chapter.
2. Running sample-based simulations. In general sample-based processing is substantially slower than frame-based processing. This is also discussed in a later chapter.
3. Simulating using a fixed-step solver with too small a step size.
Sometimes you can get away with a coarser step size and still achieve acceptably close simulation accuracy. Other times you may need to employ a variable-step solver and adjust the error tolerances appropriately. Variable-step solvers are relevant when there is a continuous-time aspect to your model. Solver selection is not discussed in this workflow since all simulations are fixed-step discrete-time.
4. Running fixed-point simulations. This is what we are addressing in this section.

Fixed-point simulations run slower than their floating-point equivalents. No two ways about it. This is true because fixed-point simulations have more work to do and more to keep track of. With fixed-point, you have to shift, mask, round, and conditionally saturate on every math operation. That requires more cycles than floating-point math, where none of the above is relevant. To make matters worse, there is a certain degree of interpretation that goes on at run time with fixed-point math. The fixed-point parameters are not hard-coded in the model. Generally they are set up via a dialog which the user can change between runs. Such a scheme is flexible, but you take a hit in simulation speed. Tradeoffs, tradeoffs. Your Simulink model is configured to use any set of fixed-point parameters instead of being optimized for a given set.

There is a way to improve upon this situation, however. The solution involves code generation via the Accelerator. Starting in R2007b the Accelerator became a standard part of Simulink. Before that it was a Simulink add-on. It’s recommended that you review the documentation on Simulink’s different run modes if you plan on taking full advantage of them.

This is the Simulink documentation on how the 3 different run modes differ.

Close to the play button you will find a pulldown listing the different run modes. The first selection, Normal, is the most commonly used. Normal mode is analogous to Debug mode in other development environments.
It provides maximum flexibility at the cost of reduced efficiency, but in general it’s the right tradeoff to make when simulating. The second selection is the Accelerator.

Let’s say you have a fixed-point simulation you’d like to speed up. Here is how you would use the Accelerator.

1. Select Accelerator from the run-mode menu.
2. Hit the Play button.

The first time you run in Accelerator mode the simulation won’t start immediately. First it will compile and link most of the model to a DLL. Once that process is complete, it will run, and it should run faster than it did in Normal mode. Speed-up factors vary greatly depending on the complexity of the model. The more fixed-point you have in a model, the more speed-up you will get.

Remember that displays may be your simulation’s limiting factor. To learn just how much of a speed-up you are getting with the Accelerator, it’s best to turn your displays off. The display for the LMS weights has been placed inside an enabled subsystem that can be turned on or off via a user switch. Displays are the number one culprit responsible for slowing simulations down.
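
As a loose analogy for the run-time interpretation cost described in this section, the sketch below contrasts two fixed-point multiplies. The first reads its word length, shift, rounding mode, and overflow mode from a parameter struct on every call. The second has one set of choices baked in. This is only an illustration of the general idea; it is not a description of how Simulink or the Accelerator is implemented, and the names fx_params_t, fx_mul_generic, and q15_mul_sat are hypothetical. The code assumes >> is an arithmetic shift for negative values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Settings a user could change between runs (hypothetical layout). */
typedef struct {
    int word_bits;      /* output word length, 2..32                  */
    int shift;          /* right shift applied to the raw product     */
    int round_nearest;  /* 0 = floor (truncate), 1 = nearest          */
    int saturate;       /* 0 = wrap on overflow, 1 = saturate         */
} fx_params_t;

/* Generic multiply: every setting is examined at run time, every call. */
int32_t fx_mul_generic(int32_t a, int32_t b, const fx_params_t *p)
{
    assert(p != NULL && p->word_bits >= 2 && p->word_bits <= 32);
    assert(p->shift >= 0 && p->shift <= 31);

    int64_t prod = (int64_t)a * (int64_t)b;
    if (p->round_nearest && p->shift > 0) {
        prod += (int64_t)1 << (p->shift - 1);       /* round  */
    }
    prod >>= p->shift;                              /* shift  */

    int64_t hi = ((int64_t)1 << (p->word_bits - 1)) - 1;
    int64_t lo = -hi - 1;
    if (p->saturate) {                              /* saturate */
        if (prod > hi) prod = hi;
        if (prod < lo) prod = lo;
    } else {                                        /* mask and sign extend */
        uint64_t sign = (uint64_t)1 << (p->word_bits - 1);
        uint64_t mask = (sign << 1) - 1;
        uint64_t u = (uint64_t)prod & mask;
        prod = (int64_t)(u ^ sign) - (int64_t)sign;
    }
    return (int32_t)prod;
}

/* Specialized multiply: Q15 x Q15 -> Q15, nearest rounding, saturation.
 * All decisions were made ahead of time, so only the arithmetic remains. */
int16_t q15_mul_sat(int16_t a, int16_t b)
{
    int32_t prod = ((int32_t)a * (int32_t)b + (1 << 14)) >> 15;
    if (prod > INT16_MAX) prod = INT16_MAX;   /* only -1 * -1 overflows */
    return (int16_t)prod;
}
```

Calling fx_mul_generic with the settings {16, 15, 1, 1} gives the same answers as q15_mul_sat, but it pays for several extra tests and a wider intermediate on every operation. Committing to one configuration and compiling it is the kind of saving a code-generation step can offer.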