Engineer To Engineer Note EE-66
Notes on using Analog Devices’ DSP, audio, & video components from the Computer Products Division
Phone: (800) ANALOG-D or (617) 461-3881, FAX: (617) 461-3010, EMAIL: dsp.support@analog.com

Using Memory Overlays

In order to reduce DSP system costs, many applications use DSPs with smaller amounts of on-chip memory, placing much of the program code and data off chip. In order to run these applications efficiently, memory overlays are used. This note discusses the concept of memory overlays and how they are used with Analog Devices 32-bit SHARC DSPs. The following topics and examples are discussed:

• “The Concept of Memory Overlays”
• “VisualDSP Overlay Support”
• “Basic Overlay Example”
• “Pre-Load Overlay Manager Example”

All of the code segments used in the following discussion are parts of the two example programs that appear at the end of this note.

The Concept of Memory Overlays

Memory overlays provide support for applications whose program instructions and data do not all fit in the internal memory of the processor. In such a case, program instructions and data are partitioned and stored in external memory until they are required for program execution. The partitions are referred to as memory overlays, and the routines that call and execute them are referred to as overlay managers.

Overlays are a “many to one” memory mapping system. Several overlays “live” (or are stored) in unique locations in external memory, but they “run” (or execute) in a common location in internal memory. Throughout this note, the storage location of an overlay is referred to as its “live” location, and the internal location where its instructions are executed is referred to as its “run” (run time) space.

Figure 1 demonstrates the concept of memory overlays. In Figure 1 there are two memory spaces: internal and external. The external memory is partitioned into five overlays. The internal memory contains the main program, an overlay manager function and two segments reserved for execution of overlay program instructions.

Figure 1. Memory Overlays

In this example, Overlays 1 and 2 share the same run time location within internal memory. Overlays 3, 4 and 5 also share a common run time memory. If FUNC_B is required, the overlay manager loads Overlay 2 into the location within internal memory where Overlay 2 is designated to run. If FUNC_D is required, the overlay manager loads Overlay 3 into its designated run time memory.

The overlay manager is a user defined function responsible for ensuring that a required symbol (function or data) within an overlay is in the run time memory when it is needed. The transfer occurs using the direct memory access (DMA) capability of the SHARC processor. The overlay manager may also handle more advanced functionality such as checking if the requested overlay is already in run time memory, executing another function while loading an overlay, and tracking recursive overlay function calls.

VisualDSP Overlay Support

The overlay support provided by the 32-bit tools includes the following:

• specification of the live and run location of each overlay
• the generation of constants
• the redirection of overlay function calls to a jump table
• the overlay manager.

The overlay support is provided by the linker and is partially designed by the user in the linker description file (LDF). The user specifies which overlays share run time memory and which memory segments establish the live and run space. Listing 1 shows the section of an LDF defining two overlays.

    .pm_code
    {
        OVERLAY_INPUT
        {
            OVERLAY_OUTPUT(OVLY_one.ovl)
            INPUT_SECTIONS(FUNC_A.doj(pm_code))
        }>ovl_code

        OVERLAY_INPUT
        {
            OVERLAY_OUTPUT(OVLY_two.ovl)
            INPUT_SECTIONS(FUNC_B.doj(pm_code) FUNC_C.doj(pm_code))
        }>ovl_code
    }>pm_code

Listing 1. Overlay Declaration in LDF

The overlay declaration in Listing 1 configures two overlays to share a common run time memory space. The first overlay, OVLY_one, contains FUNC_A and lives somewhere in memory segment ovl_code. The second overlay, OVLY_two, contains functions FUNC_B and FUNC_C. The second overlay also lives in memory segment ovl_code. The common run time location shared by overlays OVLY_one and OVLY_two is within the memory segment pm_code.

The LDF provides the linker with direction on how to configure the overlays as well as the information necessary for the overlay manager routine to load the overlays. The information provided by the linker includes the following constants:

    _ov_startaddress_N
    _ov_endaddress_N
    _ov_size_N
    _ov_word_run_size_N
    _ov_word_live_size_N
    _ov_runtimestartaddress_N

    where N = the Overlay ID

Listing 2. Linker Generated Overlay Constants

Each overlay has a word size and an address which the overlay manager uses to determine where the overlay resides and where it is executed.

The overlay live and run word sizes are different if the internal memory and external memory widths are different. For example, the instruction word width of the SHARC DSP is 48 bits. A system containing 32-bit wide external memory requires data packing to store an overlay containing instructions. The overlay live word size (number of words in the overlay) is based on the number of 32-bit words required to pack all of the 48-bit instructions.

Figure 2. Example Overlay Run and Live Sizes

Figure 2 shows the difference between overlay live and run size.

• Overlays 1 and 2 are instruction overlays, with a run word width of 48 bits.
• Because external memory is 32 bits wide, their live word width is 32 bits.
• Overlay 1 contains one function with 16 instructions; Overlay 2 contains two functions with a total of 40 instructions.
• The live word sizes for overlays 1 and 2 are 24 and 60 words respectively.
• The run word sizes for overlays 1 and 2 are 16 and 40 words respectively.

The following code shows the value of all constants generated by the linker for the example in Figure 2:

    _ov_startaddress_1 = 0x20000
    _ov_endaddress_1 = 0x20017
    _ov_word_run_size_1 = 0x10
    _ov_word_live_size_1 = 0x18
    _ov_runtimestartaddress_1 = 0x8800

Listing 3. Linker Generated Constants

Along with providing constants, the linker redirects overlay symbol references within your code to the overlay manager routine. This redirection is accomplished using a procedure linkage table (PLIT). The PLIT is essentially a jump table that executes user defined code and then jumps to the overlay manager. The linker replaces an overlay symbol reference (function call) with a jump to a location in the PLIT.

PLIT code is defined within the linker description file (LDF) by the programmer. This code prepares the overlay manager to handle the overlay containing the referenced symbol. The code generally initializes registers to contain the overlay ID and the referenced symbol's run time address.

The following is an example call instruction to an overlay function:

    R0 = DM(I0,M3);
    R1 = R0 * R2;
    CALL FUNC_A;            /* Call to function in overlay */
    DM(I3,M3) = R1;

If FUNC_A is in an overlay, the linker replaces the function call with the following instruction:

    R0 = DM(I0,M3);
    R1 = R0 * R2;
    CALL .plt_FUNC_A;       /* Call to PLIT entry */
    DM(I3,M3) = R1;

The .plt_FUNC_A is the entry in the PLIT containing your defined instructions. These instructions prepare the VisualDSP environment for the overlay manager to load the overlay containing FUNC_A. The instructions executed in the PLIT are specified within the LDF.

Listing 4 is an example PLIT definition from an LDF file. In the example the register R0 is set to the value of the overlay ID that contains the referenced symbol, and register R1 is set to the run time address of the referenced symbol. (PLIT_SYMBOL_OVERLAYID and PLIT_SYMBOL_ADDRESS are linker key words.) The last instruction branches to the overlay manager. The overlay manager uses the initialized registers to determine which overlay to load, and where to jump to execute the overlay function called.

    PLIT
    {
        R0 = PLIT_SYMBOL_OVERLAYID;
        R1 = PLIT_SYMBOL_ADDRESS;
        JUMP _OverlayManager;
    }

Listing 4. PLIT Definition in LDF

The linker expands the PLIT definition into individual entries in a table. An entry is created for each overlay symbol as shown in Figure 3, which shows the redirected function calls and the PLIT table for overlays 1 and 2 of the example. For each entry the linker replaces the generic assembly instructions with specific instructions (where applicable). For example, the first entry in the PLIT shown in Figure 3 is for overlay symbol FUNC_A. The linker replaces the constant name PLIT_SYMBOL_OVERLAYID with the ID of the overlay containing FUNC_A. The linker also replaces the constant name PLIT_SYMBOL_ADDRESS with the run time address of FUNC_A.

When the overlay manager subroutine is called via the jump instruction of the PLIT table, R0 contains the referenced function’s overlay ID, and R1 contains the referenced function’s run time address. The overlay manager subroutine uses the overlay ID and run time address to load and execute the referenced function.

    Overlay 1: FUNC_A
    Overlay 2: FUNC_B, FUNC_C

    Internal Memory

    Main:
        call .plt_FUNC_A
        . . .
        call .plt_FUNC_C
        call .plt_FUNC_B
        . .

    Plit_table:
    .plt_FUNC_A:  r0=0x00001; r1=0x22000; jump OverlayManager;
    .plt_FUNC_B:  r0=0x00002; r1=0x22000; jump OverlayManager;
    .plt_FUNC_C:  r0=0x00002; r1=0x23000; jump OverlayManager;

Figure 3. Expanded PLIT Table.

Overlay Manager

The overlay manager is a user defined routine that is responsible for loading a referenced overlay function or data buffer into internal memory (run time space). This is done with the aid of the linker generated constants and the PLIT commands. The linker generated constants tell the overlay manager the address of the live overlay, where the overlay resides for execution, and the number of words in the overlay. The PLIT commands tell the overlay manager such information as which overlay is required and the run time address of the referenced symbol.

The main objective of overlay managers is to transfer overlays to their run time location when required. However, overlay managers may also be required to:

• set up a stack to store register values (in some cases stacks may be corrupted by the overlay)
• check if a referenced symbol has already been transferred into its run time space as a result of a previous reference (if the overlay is already in internal memory, the overlay transfer is bypassed and execution of the overlay routine can begin immediately)
• load an overlay while executing a function from a second overlay (or a non overlay function).

You may need your overlay manager to perform other specialized tasks to satisfy the special needs of a given application.

Overlay Manager Examples (ADSP-21065L)

The following sections discuss two different overlay managers. The first always executes an overlay function immediately after the overlay has been transferred; it only transfers the overlay at the time execution is requested. In this case, the core is idle while the transfer occurs. The second overlay manager is designed to load one overlay while executing a function from a previously loaded overlay. This reduces the core idle time that occurs while the core waits for the overlay function to load for execution.

The two example overlay managers are demonstrated on the same application. The application is a 256 point radix-2 FFT. The FFT implementations of the two examples are the same except for the mapping of functions to overlays. The main FFT routine is composed of four functions: fft_first_2_stages, fft_middle_stages, fft_next_to_last and fft_last_stage. The first example consists of two overlays, each containing two functions, and the second example consists of four overlays, each containing one function. The allocation of functions to overlays and the overlay manager for each example are explained.

Basic Overlay Example

This example has two overlays, each of which contains two functions. Overlay 1 contains the functions fft_first_2_stages and fft_last_stage. Overlay 2 contains the functions fft_middle_stages and fft_next_to_last. In this example the overlay manager:

1. creates and maintains a stack for the registers it uses
2. determines if the referenced function is in internal memory
3. sets up a DMA transfer
4. flushes the cache and executes the referenced function

Several code segments for the LDF and the Overlay Manager are displayed and explained in the text. Listing 5 is the code segment from the LDF that represents the overlay definitions of Example 1.
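As an aside before the listings, the decision at the heart of steps 2 and 3 can be pictured with a small host-side C model. It is only a sketch under stated assumptions: the names ov_manager, ov_request and ov_replay_example1 are hypothetical and do not appear in the example sources, the run sizes are borrowed from Figure 2 purely for illustration, and the "transfer" is a counter rather than a DMA. The helper live_words_48_in_32 reproduces the 48-bit to 32-bit packing arithmetic described earlier, in which two instructions occupy three external words.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side model only. Every name here is hypothetical and is not
   part of the EE-66 sources or of the VisualDSP tools. */

enum { NUM_OVERLAYS = 2, NO_OVERLAY = 0 };

/* 48-bit instructions packed into 32-bit external words:
   2 instructions fill 3 external words, so live = ceil(run * 3 / 2). */
static unsigned live_words_48_in_32(unsigned run_words)
{
    return (run_words * 3u + 1u) / 2u;
}

typedef struct {
    unsigned loaded_id;        /* plays the role of ov_id_loaded        */
    unsigned transfers;        /* number of simulated overlay transfers */
    unsigned last_live_count;  /* external word count of last transfer  */
} ov_manager;

/* Run sizes borrowed from Figure 2 (16 and 40 instructions);
   they are placeholders, not the sizes of the FFT overlays. */
static const unsigned run_words[NUM_OVERLAYS] = { 16u, 40u };

/* Steps 2 and 3: bypass the transfer when the requested overlay is
   already resident; otherwise "transfer" it and record its ID.
   Returns 1 if a transfer was needed, 0 if it was bypassed. */
static int ov_request(ov_manager *m, unsigned overlay_id)
{
    assert(overlay_id >= 1u && overlay_id <= (unsigned)NUM_OVERLAYS);
    if (m->loaded_id == overlay_id)
        return 0;
    m->last_live_count = live_words_48_in_32(run_words[overlay_id - 1u]);
    m->loaded_id = overlay_id;
    m->transfers++;
    return 1;
}

/* Replays the four FFT calls of this example. The functions live in
   overlays 1, 2, 2 and 1, in call order. Returns the transfer count. */
static unsigned ov_replay_example1(void)
{
    static const unsigned ids[4] = { 1u, 2u, 2u, 1u };
    ov_manager m = { NO_OVERLAY, 0u, 0u };
    size_t i;

    for (i = 0; i < sizeof ids / sizeof ids[0]; i++)
        (void)ov_request(&m, ids[i]);
    return m.transfers;
}
```

Replaying the FFT call order through this model yields three transfers, because the third call finds its overlay already resident. The packing helper gives 24 and 60 live words for 16 and 40 instructions, which matches Figure 2.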
    OVERLAY_INPUT
    {
        ALGORITHM(ALL_FIT)
        OVERLAY_OUTPUT(fft_one.ovl)
        INPUT_SECTIONS( Fft_1st_last.doj(pm_code) )
        PACKING(12 B1 B2 B3 B4 B0 B11 B12 B5 B6 B0 B7 B8 B9 B10 B0)
    } >ovl_code     // Overlay to live in section ovl_code

    OVERLAY_INPUT
    {
        ALGORITHM(ALL_FIT)
        OVERLAY_OUTPUT(fft_two.ovl)
        INPUT_SECTIONS( Fft_mid.doj(pm_code) )
        PACKING(12 B1 B2 B3 B4 B0 B11 B12 B5 B6 B0 B7 B8 B9 B10 B0)
    } >ovl_code     // Overlay to live in section ovl_code

Listing 5. FFT Overlay Example 1

Two overlays are defined: fft_one.ovl and fft_two.ovl. Both overlays live in the segment ovl_code defined in the memory section of the LDF and run in section pm_code. All instructions and data defined in segments named pm_code within the file Fft_1st_last.doj are part of overlay fft_one.ovl. All instructions and data defined in segments named pm_code within the file Fft_mid.doj are part of overlay fft_two.ovl. The result is two functions within each overlay.

The first and the last functions called are in overlay fft_one. The two middle functions called are in overlay fft_two. When the first function is referenced during code execution, fft_one, overlay id=1, is transferred to internal memory. When the second function is referenced, fft_two, overlay id=2, is transferred to internal memory. Since the third function is in overlay fft_two, when it is referenced the overlay manager recognizes that it is already in internal memory and an overlay transfer does not occur. Finally, when the last function is referenced, overlay fft_one, overlay id=1, is again transferred to internal memory for execution.

The following code segment calls the four functions of the FFT:

    fftrad2:
        call fft_first_2_stages (db);
        call fft_middle_stages (db);
        call fft_next_to_last (db);
        call fft_last_stage (db);
    wait:
        idle;
        jump wait;

The linker replaces the overlay function calls with calls to the appropriate entry in the procedure linkage table (PLIT). For this example only three instructions are placed in each entry of the PLIT as shown below:

    PLIT
    {
        R0 = PLIT_SYMBOL_OVERLAYID;
        R1 = PLIT_SYMBOL_ADDRESS;
        JUMP _OverlayManager;
    }

Register R0 contains the ID of the overlay that contains the referenced symbol, and register R1 contains the run time address of the referenced symbol. The final instruction jumps the program counter (PC) to the overlay manager routine. The overlay manager routine uses the overlay ID in conjunction with the overlay constants generated by the linker to transfer the proper overlay into internal memory. Once the transfer is complete, the overlay manager sends the PC to the address of the referenced symbol stored in R1.

The linker generates the following constants used by the overlay manager:

    .EXTERN _ov_word_run_size_1;
    .EXTERN _ov_word_run_size_2;
    .EXTERN _ov_word_live_size_1;
    .EXTERN _ov_word_live_size_2;
    .EXTERN _ov_startaddress_1;
    .EXTERN _ov_startaddress_2;
    .EXTERN _ov_runtimestartaddress_1;
    .EXTERN _ov_runtimestartaddress_2;

These constants supply the overlay manager with:

1. the size of the overlays, using both run time word sizes and live word sizes,
2. the starting address of the live space,
3. the starting address of the run space.

The overlay manager code places the constants in arrays as shown below. The arrays are referenced by using the overlay ID as the index to the array. The index or ID is stored in a modify (m#) register and the beginning address of the array is stored in the index (i#) register.

    .VAR liveAddresses[2] = _ov_startaddress_1,
                            _ov_startaddress_2;
    .VAR runAddresses[2]  = _ov_runtimestartaddress_1,
                            _ov_runtimestartaddress_2;
    .VAR runWordSize[2]   = _ov_word_size_run_1,
                            _ov_word_size_run_2;
    .VAR liveWordSize[2]  = _ov_word_size_live_1,
                            _ov_word_size_live_2;

Before preparing the DMA, the overlay manager stores the values contained in each register it uses onto a runtime stack. The stack stores the values of all data registers, address generator registers and any other registers required by the overlay manager. The overlay manager also stores the ID of the overlay currently in internal memory. When an overlay is transferred to internal memory, the overlay manager stores the overlay ID in internal memory in the buffer labeled ov_id_loaded. Before another overlay is transferred, the overlay manager compares the required overlay ID with that stored in buffer ov_id_loaded. If they are equal, the required overlay is already in internal memory and a transfer is not required. The PC is sent to the proper location to execute the referenced function. If they are not equal, the value in ov_id_loaded is updated and the overlay is transferred.

The following segment of the overlay manager function creates the runtime stack, stores the overlay ID in a modify register and checks the overlay ID stored in ov_id_loaded:

    /* _overlayID has been defined as R0. R0 is set in the PLIT of LDF.  */
    /* Set up DMA transfer to internal memory through the external port. */

    /* Store values of registers used by the overlay manager in to the */
    /* software stack.                                                  */
    dm(ov_stack)=i8;
    dm(ov_stack+1)=m8;
    dm(ov_stack+2)=l8;
    dm(ov_stack+3)=r2;

    /* Use the overlay id as an index (must subtract one) */
    R0=R0-1;                    /* Overlay ID -1 */
    m8=R0;                      /* Offset into the arrays containing  */
                                /* linker defined overlay constants.  */
    r2=dm(ov_id_loaded);
    r0=r0-r2;
    if EQ jump continue;
    dm(ov_id_loaded)=m8;

    r0=i0;  dm(ov_stack+4)=r0;
    r0=m0;  dm(ov_stack+5)=r0;
    r0=l0;  dm(ov_stack+6)=r0;

The overlay manager uses the value of the linker generated constants to set up the DMA transfer as shown in the following code segment of the overlay manager function. The constants are in arrays as previously described. The index registers I8 and I0 point to the first location of the arrays. The overlay ID is stored in the modify registers M8 and M0. The index and modify registers together in DAG instructions read the appropriate elements from the arrays.

    /* Get overlay run and live addresses from memory and use to */
    /* set up the master mode DMA.                               */
    i8 = runAddresses;
    i0 = liveAddresses;

    r0=0;                       /* Disable DMA */
    dm(DMAC0) = r0;

    /* Set DMA external pointer to overlay live address */
    r0=dm(m0,i0);
    dm(EIEP0)=r0;

    /* Set DMA internal pointer to overlay run address */
    r0=pm(m8,i8);
    dm(IIEP0)=r0;

    i0=runWordSize;             /* Number of words stored in internal memory */
    /* Most likely the word size will be 48 bits for instructions. */

    /* Set DMA external modifier */
    r0=1;
    dm(EMEP0)=r0;

    i8=liveWordSize;            /* Number of words stored in external memory */
    /* Most likely the word size will be 32 or 16 */
    /* bits for external storage.                 */

    /* Set DMA internal modify to 1 */
    dm(IMEP0)=r0;

    /* Set DMA internal count to Overlay run size. */
    r0=dm(m0,i0);
    dm(CEP0)=r0;

    /* Set DMA external count to Overlay live size. */
    r0=pm(m8,i8);
    dm(ECEP0)=r0;

    /* DMA enabled, instruction word, Master, 48-32 packing */
    r0=0x2e1;
    dm(DMAC0)=r0;

On completion of the transfer the overlay manager restores register values from the runtime stack, flushes the cache and then jumps the PC to the run time location of the referenced function. It is very important to flush the cache before jumping to the referenced function because when code is replaced or modified, incorrect code execution may occur if the cache is not flushed. If the program sequencer searches the cache for an instruction and an instruction from the previous overlay is in the cache, the cached instruction may be executed instead of the expected cache miss occurring.

In summary, the overlay manager routine does the following:

1. maintains a runtime stack for registers being used by the overlay manager
2. compares the requested overlay’s ID with that of the previously loaded overlay (stored in buffer ov_id_loaded)
3. sets up the DMA transfer of the overlay (if it is not already in internal memory)
4. jumps the PC to the run time location of the referenced function.

These are the basic tasks that are performed by an overlay manager. More sophisticated overlay managers may be required for individual applications.

Pre-Load Overlay Manager Example

The second example incorporates the ability to transfer one overlay to internal memory while the core executes a function from another overlay. Instead of the core sitting idle while the overlay DMA transfer occurs, the core enables the DMA and then begins executing another function.

This example uses the concept of overlay function loading and executing. A function load is a request to load the overlay function into internal memory but not execute the function. A function execution is a request to execute an overlay function that may or may not be in internal memory at the time of the execution request. If the function is not in internal memory, a transfer must occur before execution.

There are several circumstances under which an overlay transfer can be in progress while the core is executing another task. Each circumstance can be labeled as deterministic or non-deterministic. A deterministic circumstance is one where you know exactly when an overlay function is required for execution.
A non-deterministic circumstance is one where you cannot predict when an overlay function is required for execution. For example, a deterministic application may consist of linear flow code except for function calls. A non-deterministic example is an application with calls to overlay functions within an interrupt service routine where the interrupt occurs randomly.

The following example contains deterministic overlay function calls. The times of the overlay function execution requests are known, as is the number of cycles required to transfer an overlay. Therefore, an overlay function load request can be placed such that the transfer is complete by the time the execution request is made. The next overlay transfer (from a load request) can be enabled by the core, and the core can then execute the instructions leading up to the function execution request.

Since the linker handles all overlay symbol references in the same way (a jump to the PLIT table and then to the overlay manager), it is up to the overlay manager to distinguish between a symbol reference requesting the load of an overlay function and a symbol reference requesting the execution of an overlay function. In the example, the overlay manager uses a buffer in memory as a flag to indicate whether the function call (symbol reference) is a load or an execute request.

The overlay manager first determines if the referenced symbol is in internal memory. If not, it sets up the DMA transfer. If the flag is set for execution, the core waits for the transfer to complete (if necessary) and then executes the overlay function. If the flag is set for load, the core returns to the instructions immediately following the location of the function load reference.

Every overlay function call requires initializing the load/execute flag buffer. In the example, the function calls are delayed branch calls. The two slots in the delayed branch contain instructions to initialize the flag buffer.
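The load/execute decision described above can be sketched as a small host-side C model. This is only an illustration under stated assumptions: ov_decide, ov_decision and the PREFETCH_* names are hypothetical and do not appear in the example sources. The flag encoding (1 for a load, 0 for an execute) is the one this example uses for its prefetch buffer. The transfer_pending input stands for a transfer started by an earlier load request that has not yet completed.

```c
#include <assert.h>

/* Host-side model only. All names are hypothetical; the real overlay
   manager is the assembly routine discussed in this note. */

enum {
    PREFETCH_EXECUTE = 0,   /* flag value for an execution request */
    PREFETCH_LOAD    = 1    /* flag value for a load request       */
};

typedef struct {
    int start_dma;     /* 1: a new overlay transfer must be set up        */
    int wait_for_dma;  /* 1: the core must wait for a transfer to finish  */
    int jump_to_func;  /* 1: execute the function, 0: return to caller    */
} ov_decision;

/* in_internal_memory: the overlay is already in, or already on its way
                       to, its run time location
   transfer_pending:   an earlier transfer of this overlay is still
                       in progress
   prefetch_flag:      PREFETCH_LOAD or PREFETCH_EXECUTE */
static ov_decision ov_decide(int in_internal_memory,
                             int transfer_pending,
                             int prefetch_flag)
{
    ov_decision d;

    assert(prefetch_flag == PREFETCH_LOAD ||
           prefetch_flag == PREFETCH_EXECUTE);

    /* First question: is the referenced symbol in internal memory?
       If not, a DMA transfer is set up. */
    d.start_dma = !in_internal_memory;

    if (prefetch_flag == PREFETCH_LOAD) {
        /* Load request: start the transfer if needed, then return to
           the instructions following the load reference. */
        d.wait_for_dma = 0;
        d.jump_to_func = 0;
    } else {
        /* Execute request: wait only if a transfer is outstanding,
           then run the overlay function. */
        d.wait_for_dma = d.start_dma || transfer_pending;
        d.jump_to_func = 1;
    }
    return d;
}
```

In this model a load request never stalls the core, and an execute request stalls it only when a transfer has just been started or is still in progress. The second case is the one the pre-load scheme tries to avoid by issuing the load early enough.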
Register R0 is set to the value that is placed in the flag buffer, and the value in R0 is stored in memory; 1 indicates a load and 0 indicates an execution call. At each overlay function call the load buffer must be updated. The following code is from the main FFT subroutine. Each of the four function calls is an execution call, so the pre-fetch (load) buffer is set to zero. The flag buffer in memory is read by the overlay manager to determine if the function call is a load or an execute.

    call fft_first_2_stages (db);
    r0=0;
    dm(prefetch) = r0;

    call fft_middle_stages (db);
    r0=0;
    dm(prefetch) = r0;

    call fft_next_to_last (db);
    r0=0;
    dm(prefetch) = r0;

    call fft_last_stage (db);
    r0=0;
    dm(prefetch) = r0;

The next set of instructions represents a load function call.

    call fft_middle_stages (db);    /* This function call pre loads               */
    r0=1;                           /* the function into the overlay run memory.  */
    dm(prefetch) = r0;              /* Set prefetch flag to 1 to indicate a load. */

The implementation executes the first function and transfers the second function and so on. In this implementation each function resides in a unique overlay and requires reserving two run time locations; while one overlay is loading into one run time location, a second overlay function is executing in another run time location.

The following code segment allocates the functions to overlays and forces two run time locations.

    OVERLAY_INPUT
    {
        ALGORITHM(ALL_FIT)
        OVERLAY_OUTPUT(fft_one.ovl)
        INPUT_SECTIONS( Fft_ovl.doj(pm_code) )
        PACKING( 12 B1 B2 B3 B4 B0 B11 B12 B5 B6 B0 B7 B8 B9 B10 B0)
    } >ovl_code     // Overlay to live in section ovl_code

    OVERLAY_INPUT
    {
        ALGORITHM(ALL_FIT)
        OVERLAY_OUTPUT(fft_three.ovl)
        INPUT_SECTIONS( Fft_ovl.doj(pm1_code) )
        PACKING( 12 B1 B2 B3 B4 B0 B11 B12 B5 B6 B0 B7 B8 B9 B10 B0)
    } >ovl_code     // Overlay to live in section ovl_code

    INPUT_SECTIONS(ovly_mgr.doj(pm_code))

    OVERLAY_INPUT
    {
        ALGORITHM(ALL_FIT)
        OVERLAY_OUTPUT(fft_two.ovl)
        INPUT_SECTIONS( Fft_ovl.doj(pm3_code) )
        PACKING( 12 B1 B2 B3 B4 B0 B11 B12 B5 B6 B0 B7 B8 B9 B10 B0)
    } >ovl_code     // Overlay to live in section ovl_code

    OVERLAY_INPUT
    {
        ALGORITHM(ALL_FIT)
        OVERLAY_OUTPUT(fft_last.ovl)
        INPUT_SECTIONS( Fft_ovl.doj(pm2_code) )
        PACKING( 12 B1 B2 B3 B4 B0 B11 B12 B5 B6 B0 B7 B8 B9 B10 B0)
    } >ovl_code     // Overlay to live in section ovl_code

The first and third overlays share one run time location and the second and fourth overlays share the second run time location. By placing an input section between overlay declarations, multiple run time locations are allocated.

The overlay manager requires modification from that of example one. Additional instructions are included to determine if the function call is a load or an execution call. If the function call is a load, the overlay manager initiates the DMA transfer, then jumps the PC back to the location where the call was made. If the call is an execution call, the overlay manager determines if the overlay is currently in internal memory. If so, the PC jumps to the run time location of the called function. If the overlay is not in internal memory, a DMA transfer is initiated and the core waits for the transfer to complete.

The complete overlay manager function is shown in the Overlay Example 2 at the end of this note. The overlay manager pushes the appropriate registers on the runtime stack. It checks to see if the requested overlay is currently in internal memory. If not, it sets up the DMA transfer. It then checks to see if the function call is a load or an execution call. If it is a load, it begins the transfer and returns the PC back to the instruction following the call. If it is an execution call, the core is idle until the transfer completes (if the transfer was necessary) and then jumps the PC to the run time location of the function.

The overlay managers in these examples are used universally. Specific applications may require some modifications. These modifications may allow for the elimination of some instructions. For instance, if your application allows for the free use of registers, you may not need a runtime stack.

The following examples are intended to help explain the capabilities of overlays so you can get started on your own applications quickly.

Overlay Example 1

Linker Description File

ARCHITECTURE(ADSP-21065L)
//
// ADSP-21065L Memory Map:
// ------------------------------------------------
// Internal memory  0x0000 0000 to 0x0007 ffff
// ------------------------------------------------
//          0x0000 0000 to 0x0000 00ff IOP Regs
//          0x0000 0100 to 0x0000 01ff IOP Regs of Processor ID 001
//          0x0000 0200 to 0x0000 02ff IOP Regs of Processor ID 002
//          0x0000 0300 to 0x0000 07ff Reserved
// Block 0  0x0000 8000 to 0x0000 9fff Normal Word (32/48) Addresses
//          (0x0000 8000 to 0x0000 97ff) 48-bit words
//          (0x0000 8000 to 0x0000 9fff) 32-bit words
//          0x0000 A000 to 0x0000 Bfff Reserved
// Block 1  0x0000 C000 to 0x0000 Dfff Normal Word (32/48) Addresses
//          (0x0000 C000 to 0x0000 Cfff) 48-bit words
//          (0x0000 C000 to 0x0000 Dfff) 32-bit words
//          0x0000 E000 to 0x0000 ffff Reserved
// Block 0  0x0001 0000 to 0x0001 3fff Short Word (16) Addresses
//          0x0001 4000 to 0x0001 7fff Reserved
// Block 1  0x0001 8000 to 0x0001 Bfff Short Word (16) Addresses
//          0x0001 C000 to 0x0001 ffff Reserved
// ------------------------------------------------
// External memory  0x0002 0000 to 0x03ff ffff
// ------------------------------------------------
// External Bank 0  0x0002 0000 to 0x00ff ffff
// External Bank 1  0x0100 0000 to 0x01ff ffff
// External Bank 2  0x0200 0000 to 0x02ff ffff
// External Bank 3  0x0300 0000 to 0x03ff ffff
//
// This architecture file allocates:
// Internal
// External

SEARCH_DIR( "$ADI_DSP\21k\lib" )
$LIBRARIES = lib060.dlb ;
$OBJECTS = $COMMAND_LINE_OBJECTS ;
//MAP(fft_ovly.map)

// Memory architecture description for FFT example on a 21062.
//
// 256 48-bit words for interrupt vector table (reset vector location).
// 1792 48-bit words of program memory for code storage.
// 2k 32-bit words of internal program memory for data storage.
// 2k 32-bit words of second segment of internal program memory for data storage.
// 4k 32-bit words of internal data memory for data storage.
// 4k 32-bit words of second segment of internal data memory for data storage.
// 4k 48-bit words of external memory for program overlay

MEMORY
{
    isr_tabl { TYPE(PM RAM) START(0x00008000) END(0x000080FF) WIDTH(48) }
    pm_code  { TYPE(PM RAM) START(0x00008100) END(0x000087ff) WIDTH(48) }
    pm_data  { TYPE(PM RAM) START(0x00009000) END(0x000097ff) WIDTH(32) }
    pm_idat  { TYPE(PM RAM) START(0x00009800) END(0x00009fff) WIDTH(32) }
    dm_data  { TYPE(DM RAM) START(0x0000C000) END(0x0000Cfff) WIDTH(32) }
    dm_rdat  { TYPE(DM RAM) START(0x0000D000) END(0x0000Dfff) WIDTH(32) }
    ovl_code { TYPE(DM RAM) START(0x00020000) END(0x00020FFF) WIDTH(32) }
} // End MEMORY

// The global PLIT to be used whenever a PROCESSOR or OVERLAY specific PLIT
// description is not provided. The PLIT initializes a register to the overlay id and the
// runtime address of the symbol called. Be sure the registers used in the PLIT
// do not contain values which cannot be overwritten.
PLIT
{
    R0 = PLIT_SYMBOL_OVERLAYID;
    R1 = PLIT_SYMBOL_ADDRESS;
    JUMP _OverlayManager;
}

Processor Fft
{
    LINK_AGAINST($COMMAND_LINE_LINK_AGAINST)   // Link against all doj files on command line
    OUTPUT( $COMMAND_LINE_OUTPUT_FILE )

    SECTIONS
    {
        .isr_tabl
        {
            INPUT_SECTIONS( Fftrad2n.doj(isr_tabl) )   // ISR table placement
        } >isr_tabl

        .pm_code
        {
            INPUT_SECTIONS(Fftrad2n.doj(pm_code) ovly_mgr_65L.doj(pm_code) $LIBRARIES(pm_code))

            // Declare which functions reside in which overlay
            // The overlays have been split up into either different
            // segments if in the same file or different files.
            // The overlays declared in this section, pm_code, will run
            // in pm_code.
            OVERLAY_INPUT
            {
                ALGORITHM(ALL_FIT)
                OVERLAY_OUTPUT(fft_one.ovl)
                INPUT_SECTIONS( Fft_1st_last.doj(pm_code) )
                PACKING(12 B1 B2 B3 B4 B0 B11 B12 B5 B6 B0 B7 B8 B9 B10 B0)
            } >ovl_code   // Overlay to live in section ovl_code

            OVERLAY_INPUT
            {
                ALGORITHM(ALL_FIT)
                OVERLAY_OUTPUT(fft_two.ovl)
                INPUT_SECTIONS( Fft_mid.doj(pm_code) )
                PACKING(12 B1 B2 B3 B4 B0 B11 B12 B5 B6 B0 B7 B8 B9 B10 B0)
            } >ovl_code   // Overlay to live in section ovl_code
        } >pm_code

        // PLIT code is to reside and run in pm_code section
        .PLIT {} >pm_code

        // assign remaining (non overlay) code segments and data segments to
        // memory segments.
        .pm_data
        {
            INPUT_SECTIONS( Fftrad2n.doj(pm_data) $LIBRARIES(pm_data))
        } >pm_data

        .pm_idat
        {
            INPUT_SECTIONS( Fftrad2n.doj(pm_idat) $LIBRARIES(pm_idat))
        } >pm_idat

        .dm_data
        {
            INPUT_SECTIONS(Fftrad2n.doj(dm_data) ovly_mgr_65L.doj(dm_data) $LIBRARIES(dm_data))
        } >dm_data

        .dm_rdat
        {
            INPUT_SECTIONS( Fftrad2n.doj(dm_rdat) $LIBRARIES(dm_rdat))
        } >dm_rdat
    } // End SECTIONS
} // End FFT

Overlay Manager Function

/* The OVLY_MGR.ASM file is the overlay manager. When a symbol    */
/* residing in overlay is referenced, the overlay manager loads   */
/* the overlay code and begins execution. A DMA transfer is       */
/* performed to load in the memory overlay. (This overlay manager */
/* checks to see if the overlay is already in internal            */
/* memory, this option will be added later.)                      */

#include "def21060.h"
#include "def21065L.h"
#define ADSP21065L

.SEGMENT/DM dm_data;

/* The following constants are defined by the linker.      */
/* These constants contain the word size, live location    */
/* and run location of the overlay functions.              */
.EXTERN _ov_word_run_size_1;
.EXTERN _ov_word_run_size_2;
.EXTERN _ov_word_live_size_1;
.EXTERN _ov_word_live_size_2;
.EXTERN _ov_startaddress_1;
.EXTERN _ov_startaddress_2;
.EXTERN _ov_runtimestartaddress_1;
.EXTERN _ov_runtimestartaddress_2;

/* Placing the linker constants in an array so the overlay */
/* manager can use the appropriate constant based on the   */
/* overlay id.
*/
.VAR liveAddresses[2] = _ov_startaddress_1, _ov_startaddress_2;
.VAR runAddresses[2] = _ov_runtimestartaddress_1, _ov_runtimestartaddress_2;
.VAR runWordSize[2] = _ov_word_run_size_1, _ov_word_run_size_2;
.VAR liveWordSize[2] = _ov_word_live_size_1, _ov_word_live_size_2;

/* Software stack to temporarily store registers corrupted by overlay manager */
/* and an initial value stored in memory used as an indicator of which overlay */
/* is currently in internal memory. */
.VAR ov_stack[10];
.VAR ov_id_loaded = -1;
.ENDSEG;

/************************************************************************/
/* Overlay Manager Function */
.SEGMENT/PM pm_code;

_OverlayManager:
.GLOBAL _OverlayManager;

/* _overlayID has been defined as R0. R0 is set in the PLIT of LDF.    */
/* Set up DMA transfer to internal memory through the external port.  */
/* Store values of registers used by the overlay manager in to the    */
/* software stack.                                                    */
    dm(ov_stack)=i8;
    dm(ov_stack+1)=m8;
    dm(ov_stack+2)=l8;
    dm(ov_stack+3)=r2;

/* Use the overlay id as an index (must subtract one) */
    R0=R0-1;                 /* Overlay ID -1 */
    m8=R0;                   /* Offset into the arrays containing linker */
                             /* defined overlay constants. */
    r2=dm(ov_id_loaded);
    r0=r0-r2;
    if EQ jump continue;     /* Requested overlay is already loaded */
    dm(ov_id_loaded)=m8;

    r0=i0; dm(ov_stack+4)=r0;
    r0=m0; dm(ov_stack+5)=r0;
    r0=l0; dm(ov_stack+6)=r0;
    l8=0;
    l0=0;
    m0=m8;                   /* Overlay ID - 1 */

/* Get overlay run and live addresses from memory and use to */
/* set up the master mode DMA.
*/
    i8 = runAddresses;
    i0 = liveAddresses;

    r0=0;                    /* Disable DMA */
    dm(DMAC0) = r0;

/* Set DMA external pointer to overlay live address */
    r0=dm(m0,i0);
    dm(EIEP0)=r0;

/* Set DMA internal pointer to overlay run address */
    r0=pm(m8,i8);
    dm(IIEP0)=r0;

    i0=runWordSize;          /* Number of words stored in internal memory */
                             /* Most likely the word size will be 48 bits */
                             /* for instructions. */

/* Set DMA external modifier */
    r0=1;
    dm(EMEP0)=r0;

    i8=liveWordSize;         /* Number of words stored in external memory  */
                             /* Most likely the word size will be 32 or 16 */
                             /* bits for external storage. */

/* Set DMA internal modify to 1 */
    dm(IMEP0)=r0;

/* Set DMA internal count to Overlay run size. */
    r0=dm(m0,i0);
    dm(CEP0)=r0;

/* Set DMA external count to Overlay live size. */
    r0=pm(m8,i8);
    dm(ECEP0)=r0;

/* DMA enabled, instruction word, Master, 48-32 packing */
    r0=0x2e1;
    dm(DMAC0)=r0;

/* Enable DMA interrupt */
    bit set mode1 IRPTEN;
    bit set imask EP0I;

/* Restore register values from stack */
    r0=dm(ov_stack+6); l0=r0;
    r0=dm(ov_stack+5); m0=r0;
    r0=dm(ov_stack+4); i0=r0;

/* Wait for DMA to complete */
dma1_wait:
    idle;

continue:
    r2=dm(ov_stack+3);
    l8=dm(ov_stack+2);
    i8=r1;
    m8=0;
    r1=dm(ov_stack+1);
    r0=dm(ov_stack);

/* Flush the cache. If an instruction in previous overlay */
/* had been cached, it may be executed instead of the     */
/* current overlays instruction. (If pm transfers align.) */
    flush cache;

/* Jump to the location of the function to be executed.
*/
    jump (m8,i8) (db);
    i8=r0;
    m8=r1;
.ENDSEG;
/********************************************************************/

Overlay Example 2

Linker Description File

ARCHITECTURE(ADSP-21065L)
SEARCH_DIR( "$ADI_DSP\21k\lib" )
$LIBRARIES = lib060.dlb ;
$OBJECTS = $COMMAND_LINE_OBJECTS ;
//MAP(fft_ovly.map)

MEMORY
{
    INCLUDE("065L_mem.h")
} // End MEMORY

// The global PLIT to be used whenever a PROCESSOR or OVERLAY specific PLIT
// description is not provided. The PLIT initializes a register to the overlay
// id and the overlay runtime address of the symbol called. Be sure the registers
// used in the PLIT do not contain values which cannot be overwritten.
PLIT
{
    R0 = PLIT_SYMBOL_OVERLAYID;
    R1 = PLIT_SYMBOL_ADDRESS;
    JUMP _OverlayManager;
}

PROCESSOR FFT
{
    LINK_AGAINST( $COMMAND_LINE_LINK_AGAINST)   // Link against all doj files on command line
    OUTPUT( $COMMAND_LINE_OUTPUT_FILE )

    SECTIONS
    {
        .isr_tabl
        {
            INPUT_SECTIONS( Fftrad2n.doj(isr_tabl) )   // ISR table placement
        } >isr_tabl

        .pm_code
        {
            // Declare which functions reside in which overlay
            // The overlays have been split up into either different
            // segments if in the same file or different files.
            // The overlays declared in this section, pm_code, will run
            // in pm_code.
            INPUT_SECTIONS(Fftrad2n.doj(pm_code) $LIBRARIES(pm_code))

            OVERLAY_INPUT
            {
                ALGORITHM(ALL_FIT)
                OVERLAY_OUTPUT(fft_one.ovl)
                INPUT_SECTIONS( Fft_ovl.doj(pm_code) )
                PACKING( 12 B1 B2 B3 B4 B0 B11 B12 B5 B6 B0 B7 B8 B9 B10 B0)
            } >ovl_code   // Overlay to live in section ovl_code

            OVERLAY_INPUT
            {
                ALGORITHM(ALL_FIT)
                OVERLAY_OUTPUT(fft_three.ovl)
                INPUT_SECTIONS( Fft_ovl.doj(pm1_code) )
                PACKING( 12 B1 B2 B3 B4 B0 B11 B12 B5 B6 B0 B7 B8 B9 B10 B0)
            } >ovl_code   // Overlay to live in section ovl_code

            INPUT_SECTIONS(ovly_mgr.doj(pm_code))

            OVERLAY_INPUT
            {
                ALGORITHM(ALL_FIT)
                OVERLAY_OUTPUT(fft_two.ovl)
                INPUT_SECTIONS( Fft_ovl.doj(pm3_code) )
                PACKING( 12 B1 B2 B3 B4 B0 B11 B12 B5 B6 B0 B7 B8 B9 B10 B0)
            } >ovl_code   // Overlay to live in section ovl_code

            OVERLAY_INPUT
            {
                ALGORITHM(ALL_FIT)
                OVERLAY_OUTPUT(fft_last.ovl)
                INPUT_SECTIONS( Fft_ovl.doj(pm2_code) )
                PACKING( 12 B1 B2 B3 B4 B0 B11 B12 B5 B6 B0 B7 B8 B9 B10 B0)
            } >ovl_code   // Overlay to live in section ovl_code
        } >pm_code

        // PLIT code is to reside and run in pm_code section
        .PLIT {} > pm_code

        // assign remaining (non overlay) code segments and data segments to
        // memory segments.
        .pm_data
        {
            INPUT_SECTIONS( Fftrad2n.doj(pm_data) $LIBRARIES(pm_data))
        } >pm_data

        .pm_idat
        {
            INPUT_SECTIONS( Fftrad2n.doj(pm_idat) $LIBRARIES(pm_idat))
        } >pm_idat

        .dm_data
        {
            INPUT_SECTIONS( Fftrad2n.doj(dm_data) ovly_mgr.doj(dm_data) $LIBRARIES(dm_data))
        } >dm_data

        .dm_rdat
        {
            INPUT_SECTIONS( Fftrad2n.doj(dm_rdat) $LIBRARIES(dm_rdat))
        } >dm_rdat
    } // End SECTIONS
} // End FFT

Memory Description (from include file)

//
// ADSP-21065L Memory Map:
// ------------------------------------------------
// Internal memory  0x0000 0000 to 0x0007 ffff
// ------------------------------------------------
//          0x0000 0000 to 0x0000 00ff IOP Regs
//          0x0000 0100 to 0x0000 01ff IOP Regs of Processor ID 001
//          0x0000 0200 to 0x0000 02ff IOP Regs of Processor ID 002
//          0x0000 0300 to 0x0000 07ff Reserved
// Block 0  0x0000 8000 to 0x0000 9fff Normal Word (32/48) Addresses
//          (0x0000 8000 to 0x0000 97ff) 48-bit words
//          (0x0000 8000 to 0x0000 9fff) 32-bit words
//          0x0000 A000 to 0x0000 Bfff Reserved
// Block 1  0x0000 C000 to 0x0000 Dfff Normal Word (32/48) Addresses
//          (0x0000 C000 to 0x0000 Cfff) 48-bit words
//          (0x0000 C000 to 0x0000 Dfff) 32-bit words
//          0x0000 E000 to 0x0000 ffff Reserved
// Block 0  0x0001 0000 to 0x0001 3fff Short Word (16) Addresses
//          0x0001 4000 to 0x0001 7fff Reserved
// Block 1  0x0001 8000 to 0x0001 Bfff Short Word (16) Addresses
//          0x0001 C000 to 0x0001 ffff Reserved
// ------------------------------------------------
// External memory  0x0002 0000 to 0x03ff ffff
// ------------------------------------------------
// External Bank 0 0x0002 0000 to 0x00ff ffff
// External Bank 1 0x0100 0000 to 0x01ff ffff
// External Bank 2 0x0200 0000 to 0x02ff ffff
// External Bank 3 0x0300 0000 to 0x03ff ffff
//
// This architecture file allocates:
// Internal
// External
//
// Memory architecture description for FFT example on a 21062.
// 256 48-bit words for interrupt vector table (reset vector location).
// 1792 48-bit words of program memory for code storage.
// 2k 32-bit words of internal program memory for data storage.
// 2k 32-bit words of second segment of internal program memory for data storage.
// 4k 32-bit words of internal data memory for data storage.
// 4k 32-bit words of second segment of internal data memory for data storage.
// 4k 48-bit words of external memory for program overlay

isr_tabl { TYPE(PM RAM) START(0x00008000) END(0x000080FF) WIDTH(48) }
pm_code  { TYPE(PM RAM) START(0x00008100) END(0x000087ff) WIDTH(48) }
pm_data  { TYPE(PM RAM) START(0x00009000) END(0x000097ff) WIDTH(32) }
pm_idat  { TYPE(PM RAM) START(0x00009800) END(0x00009fff) WIDTH(32) }
dm_data  { TYPE(DM RAM) START(0x0000C000) END(0x0000Cfff) WIDTH(32) }
dm_rdat  { TYPE(DM RAM) START(0x0000D000) END(0x0000Dfff) WIDTH(32) }
ovl_code { TYPE(DM RAM) START(0x00020000) END(0x00020FFF) WIDTH(32) }

Overlay Manager Function

/* The OVLY_MGR.ASM file is the overlay manager. When a symbol    */
/* residing in overlay is referenced, the overlay manager loads   */
/* the overlay code and begins execution. A DMA transfer          */
/* is performed to load in the memory overlay. (This overlay manager */
/* does check to see if the overlay is already in internal        */
/* memory. If it is the DMA is not performed)                     */

#include "def21060.h"
#include "def21065L.h"
#include "fft_ovly.h"

.SEGMENT/DM dm_data;

/* The following constants are defined by the linker.   */
/* These constants contain the word size, live location */
/* and run location of the overlay functions.
*/
.EXTERN _ov_word_run_size_1;
.EXTERN _ov_word_run_size_2;
.EXTERN _ov_word_run_size_3;
.EXTERN _ov_word_run_size_4;
.EXTERN _ov_word_live_size_1;
.EXTERN _ov_word_live_size_2;
.EXTERN _ov_word_live_size_3;
.EXTERN _ov_word_live_size_4;
.EXTERN _ov_startaddress_1;
.EXTERN _ov_startaddress_2;
.EXTERN _ov_startaddress_3;
.EXTERN _ov_startaddress_4;
.EXTERN _ov_runtimestartaddress_1;
.EXTERN _ov_runtimestartaddress_2;
.EXTERN _ov_runtimestartaddress_3;
.EXTERN _ov_runtimestartaddress_4;

/* Placing the linker constants in an array so the overlay */
/* manager can use the appropriate constant based on the   */
/* overlay id. */
.VAR liveAddresses[4] = _ov_startaddress_1, _ov_startaddress_2,
                        _ov_startaddress_3, _ov_startaddress_4;
.VAR runAddresses[4] = _ov_runtimestartaddress_1, _ov_runtimestartaddress_2,
                       _ov_runtimestartaddress_3, _ov_runtimestartaddress_4;
.VAR runWordSize[4] = _ov_word_run_size_1, _ov_word_run_size_2,
                      _ov_word_run_size_3, _ov_word_run_size_4;
.VAR liveWordSize[4] = _ov_word_live_size_1, _ov_word_live_size_2,
                       _ov_word_live_size_3, _ov_word_live_size_4;

.VAR prefetch=0;      /* Used to indicate if the overlay symbol call is merely to load in the instructions */
.GLOBAL prefetch;     /* or to actually load in and execute the instructions (prefetch vs. execute). */
.VAR ov_stack[10];    /* Software stack to store registers used by overlay manager */
.ENDSEG;

/************************************************************************/
/* Overlay Manager Function */
.SEGMENT/PM pm_code;

_OverlayManager:
.GLOBAL _OverlayManager;

/* _overlayID has been defined as R0.
R0 is set in the PLIT of LDF. */
/* R1 contains the run time location of the overlay symbol reference */
/* and is initialized in the PLIT jump table. Set up DMA transfer to */
/* internal memory through the external port. */
/* Store values of registers used by the overlay manager in the software stack. */
    dm(ov_stack)=i8;
    dm(ov_stack+1)=m8;
    dm(ov_stack+2)=l8;
    dm(ov_stack+3)=r2;

    m8=R0;                   /* Use the overlay id as an index offset into */
                             /* the arrays containing linker defined       */
                             /* overlay constants (declared above).        */

    i8 = runAddresses-1;     /* Address of array of run locations for each overlay */
                             /* The -1 is necessary because the array index range  */
                             /* begins with 0 but the overlay id's begin with 1.   */
                             /* The overlay ids are used as indexes into the array */

    i8=pm(m8,i8);            /* The first word in the live and run location of each */
    px=pm(0,i8);             /* overlay is the overlay id. These instructions read  */
    r2=px1;                  /* the first word in the run location of the overlay.  */
                             /* If overlay has already been loaded, the first       */
                             /* location will contain the corresponding overlay id. */

    r0=r0-r2;                        /* Test to see if overlay resides in internal memory */
    if EQ jump skipped_DMA_setup;    /* If so, jump over the DMA setup. */

    i8 = runAddresses-1;
    dm(ov_stack+4)=i0;       /* Store i0, m0 and l0 on stack to free up for use. */
    dm(ov_stack+5)=m0;
    dm(ov_stack+6)=l0;
    l8=0;
    l0=0;
    m0=m8;                   /* Overlay ID (indexes for runAddresses and liveAddresses arrays). */

/* Get overlay run and live addresses from memory and use to */
/* set up the master mode DMA. */
    i0 = liveAddresses-1;    /* Pointer to array containing the live location */
                             /* of the overlay.
(-1 due to overlay id used as */
                             /* index.) */
    r0=0;                    /* Disable DMA */
    dm(DMAC0) = r0;

/* Set DMA external pointer to overlay live address */
    r0=dm(m0,i0);
    dm(EIEP0)=r0;

/* Set DMA internal pointer to overlay run address */
    r0=pm(m8,i8);
    dm(IIEP0)=r0;

    i0=runWordSize-1;        /* Pointer to array containing word size of  */
                             /* the word in the overlay run space. Most   */
                             /* likely the word size will be 48 bits for  */
                             /* instructions. */

/* Set DMA external modifier */
    r0=1;
    dm(EMEP0)=r0;

    i8=liveWordSize-1;       /* Size of words stored in external memory    */
                             /* Most likely the word size will be 32 or 16 */
                             /* bits for external storage. */

/* Set DMA internal modify to 1 */
    dm(IMEP0)=r0;

/* Set DMA internal count to Overlay run size. */
    r0=dm(m0,i0);
    dm(CEP0)=r0;

/* Set DMA external count to Overlay live size. */
    r0=pm(m8,i8);
    dm(ECEP0)=r0;

/* DMA enabled, instruction word, Master, 48-32 packing */
    r0=0x2e1;
    dm(DMAC0)=r0;

/* The DMA has now begun. Time to restore register values, possibly wait for */
/* DMA to complete, and jump to the proper location to execute instructions. */

/* Enable DMA interrupt */
    bit set mode1 IRPTEN;
    bit set imask EP0I;

/* Restore register values from stack */
    l0=dm(ov_stack+6);
    m0=dm(ov_stack+5);
    i0=dm(ov_stack+4);

skipped_DMA_setup:
    i8=prefetch;                          /* Get address location of token indicating */
                                          /* if the function is a load or an execute. */
    if EQ jump dont_wait_for_DMA (db);    /* Flag is still set from overlay id test.  */
                                          /* If id was already loaded in internal     */
                                          /* (run) space, don't wait for DMA.         */
    r2=pm(0,i8);                          /* Read token word, prefetch, and           */
    r2 = pass r2;                         /* Determine if call was for a load         */
                                          /* or execute.
*/
    if NE jump dont_wait_for_DMA;

/* Wait for DMA to complete */
dma1_wait:
    idle;

/* Restore register values from the stack */
dont_wait_for_DMA:
    r2=dm(ov_stack+3);
    l8=dm(ov_stack+2);
    i8=r1;                   /* Set i8 to symbol address, this is where PC will jump */
    m8=0;                    /* to in order to execute the overlay function called.  */
    r1=dm(ov_stack+1);
    r0=dm(ov_stack);

/* Flush the cache. If an instruction in previous overlay */
/* had been cached, it may be executed instead of the     */
/* current overlays instruction. (If pm transfers align.) */
    flush cache;

/* Jump to the location of the function to be executed if */
/* the load token is a zero, else return to where call    */
/* was made to load the function.                         */
    if EQ jump (m8,i8) (db);
    i8=r0;
    m8=r1;
    rts;                     /* Executes only if it was a load */
.ENDSEG;
/************************************************************************/

FFT Calling Routine

{___________________________________________________________________________

FFTRAD2.ASM    ADSP-2106x Radix-2 DIT Complex Fast Fourier Transform

Calculates a radix-2 FFT. The FFT length (N) must be a power of 2 and a
minimum of 32 points. Input data is not destroyed during the course of this
routine. The input and output arrays are normal ordered. Both the real and
the imaginary data arrays to be processed should be in DM. The resulting real
data from the fft is placed into DM and the imaginary data is placed in PM.
The real twiddle factors are in an N/2 long Cosine table stored in PM, and
the imaginary twiddle factors are in an N/2 long Sine table stored in DM.
The twiddle factors are generated by the program TWIDRAD2.
To implement an inverse FFT, one only has to (1) swap the real and imaginary
of the incoming data, (2) take the forward FFT, (3) swap the real and
imaginary of the outgoing data, and (4) scale the data by 1/N.

Calling Information:
   pm(cosine[N/2]) - real twiddle factors from TWIDRAD2 program
   dm(sine[N/2])   - imaginary twiddle factors from TWIDRAD2 program
   dm(redata[N])   - real input array
   pm(imdata[N])   - imaginary input array

(Note: Because the bit reversed address mode is used with the arrays
redata and imdata, they must start at addresses that are integer
multiples of the length (N) of the transform (i.e. 0,N,2N,3N,...).
This is accomplished by specifying two segments starting at those
addresses in the architecture file and placing the variables alone
in their respective segments. These addresses must also be reflected
in the preprocessor variables ORE and OIM in bit reversed format.)
Results:
   dm(refft[N]) - real fft data output
   pm(imfft[N]) - imaginary fft data output

   First 2 Stages    - 8 cycles per 2 butterflies
   Middle Stages     - 4 cycles per butterfly
   2nd to Last Stage - 9 cycles per 2 butterflies
   Last Stage        - 5 cycles per butterfly group
__________________________________________________________________________}

{ Include for symbolic definition of system register bits }
#include "def21060.h"
#include "fft_ovly.h"

.SEGMENT/DM dm_data;
.VAR sine[N/2]= "ts2.dat";        { imag twiddle factors, from TWIDRAD2 }
.VAR refft[N];                    { real result }
.GLOBAL sine;
.GLOBAL refft;
.ENDSEG;

.SEGMENT/DM dm_rdat;              { This segment is an integer multiple of N }
.VAR redata[N]= "inreal.dat";     { input real array }
.GLOBAL redata;
.ENDSEG;

.SEGMENT/PM pm_idat;              { This segment is an integer multiple of N }
.VAR imdata[N]= "inimag.dat";     { input imaginary array }
.GLOBAL imdata;
.ENDSEG;

.SEGMENT/PM pm_data;
.VAR cosine[N/2]= "tc2.dat";      { real twiddle factors, from TWIDRAD2 }
.VAR imfft[N];                    { imag result }
.GLOBAL cosine;
.GLOBAL imfft;
.ENDSEG;

/********************************************************************/
/* Interrupt Vector table */
.SEGMENT/PM isr_tabl;

/* The loader begins with the interrupts up to and including the low */
              NOP;NOP;NOP;NOP;            /* Reserved interrupt */
___lib_RSTI:  NOP;
              jump fftrad2 (db);          /* Begin loader */
              r0=0x20108421;
              DM(WAIT)=r0;
              NOP;NOP;NOP;NOP;            /* Reserved interrupt */

/* Vector for status stack/loop stack overflow or PC stack full: */
___lib_SOVFI: RTI;RTI;RTI;RTI;

/* Vector for high priority timer interrupt: */
___lib_TMZHI: rti;rti;rti;rti;

/* Vectors for external interrupts: */
___lib_VIRPTI: RTI;RTI;RTI;RTI;
___lib_IRQ2I:  RTI;RTI;RTI;RTI;
___lib_IRQ1I:  RTI;RTI;RTI;RTI;
___lib_IRQ0I:  RTI;RTI;RTI;RTI;
               NOP;NOP;NOP;NOP;           /* Reserved interrupt
*/
/* Vectors for Serial port DMA channels: */
___lib_SPR0I: RTI;RTI;RTI;RTI;
___lib_SPR1I: RTI;RTI;RTI;RTI;
___lib_SPT0I: RTI;RTI;RTI;RTI;
___lib_SPT1I: RTI;RTI;RTI;RTI;

/* Vectors for link port DMA channels: */
___lib_LP2I: RTI;RTI;RTI;RTI;
___lib_LP3I: RTI;RTI;RTI;RTI;

/* Vectors for External port DMA channels: */
___lib_EP0I: RTI;RTI;RTI;RTI;
___lib_EP1I: RTI;RTI;RTI;RTI;
___lib_EP2I: RTI;RTI;RTI;RTI;
___lib_EP3I: RTI;RTI;RTI;RTI;

/* Vector for Link service request */
___lib_LSRQ: RTI;RTI;RTI;RTI;

/* Vector for DAG1 buffer 7 circular buffer overflow */
___lib_CB7I: RTI;RTI;RTI;RTI;

/* Vector for DAG2 buffer 15 circular buffer overflow */
___lib_CB15I: RTI;RTI;RTI;RTI;

/* Vector for lower priority timer interrupt */
___lib_TMZLI: RTI;RTI;RTI;RTI;
.ENDSEG;
/********************************************************************/

.SEGMENT/PM pm_code;
.extern prefetch;

fftrad2:
    call fft_first_2_stages (db);
    r0=0;
    dm(prefetch) = r0;

    call fft_middle_stages (db);
    r0=0;
    dm(prefetch) = r0;

    call fft_next_to_last (db);
    r0=0;
    dm(prefetch) = r0;

    call fft_last_stage (db);
    r0=0;
    dm(prefetch) = r0;

wait:
    idle;
    jump wait;
    nop;
    nop;
.ENDSEG;

FFT Subroutines

#include "def21060.h"
#include "fft_ovly.h"

.SEGMENT/PM pm_code;
.EXTERN refft;
.EXTERN imfft;
.EXTERN sine;
.EXTERN cosine;
.EXTERN prefetch;

/* The following code will be the first of 4 overlay functions.  */
/* This function uses bit reversal on both DAGs. It calculates   */
/* the first two stages of the butterfly. This function will     */
/* live in external memory in section ovl_code and run in        */
/* internal memory under section pm_code.                        */
fft_first_2_stages:
.GLOBAL fft_first_2_stages;
.EXTERN fft_middle_stages;

    call fft_middle_stages (db);    /* This function call is intended to pre load */
    r0=1;                           /* the function into the overlay run memory.  */
    dm(prefetch) = r0;              /* Set prefetch flag to 1 to indicate a load. */

    bit set mode1 BR0 | BR8;        { enable bit reverse of i0 }
    b8=OIM;                         { Points to input imaginary array }
    l8=0;
    m8=BRMODIFY8;                   { Modifier for bitreverse counter}
    b0=ORE;                         { Points to input real array }
    l0=0;
    m0=BRMODIFY0;                   { Modifier for bitreverse counter}

{Now do bitrev real within first two stages}
    b2=refft;
    l2=N;
    m1=1;                           { This loop increments forward +1}
    b10=imfft;
    l10=N;
    m14=1;

{Do the first two stages (actually a radix-4 FFT stage)}
    f0=dm(i0,m0), f1=pm(i8,m8);
    f2=dm(i0,m0), f3=pm(i8,m8);
    f0=f0+f2, f2=f0-f2, f4=dm(i0,m0), f5=pm(i8,m8);
    f1=f1+f3, f3=f1-f3, f6=dm(i0,m0), f7=pm(i8,m8);
    f4=f6+f4, f6=f6-f4;
    f5=f5+f7, f7=f5-f7;
    f8=f0+f4, f9=f0-f4;
    f10=f1+f5, f11=f1-f5;

    lcntr=N/4, do FSTAGE until lce;    { do N/4 simple radix-4 butterflies }
        f12=f2+f7, f13=f2-f7, f0=dm(i0,m0), f1=pm(i8,m8);
        f14=f3+f6, f15=f3-f6, f2=dm(i0,m0), f3=pm(i8,m8);
        f0=f0+f2, f2=f0-f2, f4=dm(i0,m0), f5=pm(i8,m8);
        f1=f1+f3, f3=f1-f3, f6=dm(i0,m0), f7=pm(i8,m8);
        f4=f6+f4, f6=f6-f4, dm(i2,m1)=f8, pm(i10,m14)=f10;
        f5=f5+f7, f7=f5-f7, dm(i2,m1)=f12, pm(i10,m14)=f14;
        f8=f0+f4, f9=f0-f4, dm(i2,m1)=f9, pm(i10,m14)=f11;
FSTAGE: f10=f1+f5, f11=f1-f5, dm(i2,m1)=f13, pm(i10,m14)=f15;

    bit clr mode1 BR0 | BR8;        {finished with bitreversal}
    rts;
    nop;
    nop;
.ENDSEG;

/* This file contains the function which executes the
middle */
/* butterflies of the fft. This function is an overlay which          */
/* will reside in external memory and run in pm_code.                 */

.SEGMENT/PM pm3_code;

fft_middle_stages:
.GLOBAL fft_middle_stages;
.EXTERN refft;
.EXTERN imfft;
.EXTERN sine;
.EXTERN cosine;
.EXTERN prefetch;
.EXTERN fft_next_to_last;

    call fft_next_to_last (db);  /* This function call is intended to preload  */
    r0=1;                        /* the function into the overlay run memory.  */
    dm(prefetch) = r0;           /* Set prefetch flag to 1 to indicate a load. */

{middle stages loop}
    bit clr mode1 BR0 | BR8;     {finished with bitreversal}
    b0=refft;   l0=N;
    b8=imfft;   l8=N;
    b1=sine;    l1=N/2;
    b9=cosine;  l9=N/2;
    b11=imfft;  l11=N;
    m0=-BFLY8;
    m1=-N/8;
    m2=-BFLY8-1;
    m9=-N/8;
    m11=-1;
    r2=2;
    r3=-BFLY8;                   {initializes m0,10 - incr for butterfly branches}
    r5=BFLY8;                    {counts # butterflies per a group}
    r9=(-2*BFLY8)-1;             {initializes m12 - wrap around to next grp + 1}
    r10=-2*BFLY8;                {initializes m8 - incr between groups}
    r13=-BFLY8-1;                {initializes m2,13 - wrap to bgn of 1st group}
    r15=N/8;                     {# OF GROUPS IN THIRD STAGE}
    f1=dm(i1,m1), f7=pm(i9,m9);  {set pointers to tables to 1st coeff.}

    lcntr=STAGES-4, do end_stage until lce;  {# OF STAGES TO BE HANDLED = LOG2N-4}
        m8=r10;
        m10=r3;
        m12=r9;
        i0=refft+N-1;
        i2=refft+N-1;
        i8=imfft+N-1;
        i10=imfft+N-1;
        i11=imfft+N-1;
        r15=r15-r2, m13=r13;     {CALCULATE # OF CORE BFLIES/GROUP IN THIS STAGE}

        f0=dm(i1,m1), f7=pm(i8,m8);
        f12=f0*f7, f6=dm(i0,m0), f1=pm(i9,m9);
        f8=f1*f6, modify(i11,m10);
        f11=f1*f7, f7=pm(i8,m8);
        f14=f0*f6, f12=f8+f12, f8=dm(i0,m0);
        f12=f0*f7, f13=f8+f12, f10=f8-f12, f6=dm(i0,m0);

{Each iteration does another set of butterflies in each group}
        lcntr=r5, do end_group until lce;    {# OF BUTTERFLIES/GROUP IN THIS STAGE}

{core butterfly loop}
            lcntr=r15, do end_bfly until lce;    {Do a butterfly in each group - 2}
                f8=f1*f6, f14=f11-f14, dm(i2,m0)=f10, f9=pm(i11,m8);
                f11=f1*f7, f3=f9+f14, f9=f9-f14, dm(i2,m0)=f13, f7=pm(i8,m8);
                f14=f0*f6, f12=f8+f12, f8=dm(i0,m0), pm(i10,m10)=f9;
end_bfly:       f12=f0*f7, f13=f8+f12, f10=f8-f12, f6=dm(i0,m0), pm(i10,m10)=f3;

{finish up last bttrfly and set up for next stage}
            f8=f1*f6, f14=f11-f14, dm(i2,m0)=f10, f9=pm(i11,m8);
            f11=f1*f7, f4=f9+f14, f9=f9-f14, dm(i2,m0)=f13, f14=pm(i8,m11);
            f14=f0*f6, f12=f8+f12, f8=dm(i0,m2), pm(i10,m10)=f9;
            f13=f8+f12, f10=f8-f12, f0=dm(i1,m1), f7=pm(i8,m8);    {dm:sin}
            f14=f11-f14, dm(i2,m0)=f10, f9=pm(i11,m12);

{start on next butterfly in each group}
            f12=f0*f7, f3=f9+f14, f9=f9-f14, f6=dm(i0,m0), f1=pm(i9,m9);    {pm:cos}
            f8=f1*f6, dm(i2,m2)=f13, pm(i10,m10)=f4;
            f11=f1*f7, pm(i10,m10)=f9;
            f14=f0*f6, f12=f8+f12, f8=dm(i0,m0), f7=pm(i8,m8);
end_group:  f12=f0*f7, f13=f8+f12, f10=f8-f12, f6=dm(i0,m0), pm(i10,m13)=f3;

        r4=r15+r2, i1=b1;                    {PREPARE R4 FOR #OF BFLIES CALC}
        r15=ashift r4 by -1;                 {# OF BFLIES/GRP IN NEXT STAGE}
        r4=-r15, i9=b9;
        m1=r4;                               {update inc for sin & cos}
        m9=r4;
        r5=ashift r5 by 1, f1=dm(i1,m1);     {update # bttrfly in a grp}
        r3=-r5;                              {inc for bttrfly branch}
        r13=r3-1, m0=r3;                     {wrap to 1st grp}
        r10=ashift r3 by 1, f7=pm(i9,m9);    {inc between grps}
end_stage:  r9=r10-1, m2=r13;                {wrap to grp +1}

    rts;
    nop;
    nop;

.ENDSEG;

/* The file fft_last.asm contains the functions executing the last    */
/* two stages of the fft. Each function is an overlay. The            */
/* functions are placed in separate overlays by declaring them in     */
/* different segments. Function fft_next_to_last is declared in       */
/* code segment pm1_code and function fft_last_stage is declared in   */
/* code segment pm2_code. They will both live, according to LDF,      */
/* in memory segment ovl_code and run in memory segment pm_code.      */

.SEGMENT/PM pm1_code;

.EXTERN refft;
.EXTERN imfft;
.EXTERN sine;
.EXTERN cosine;
.EXTERN prefetch;

fft_next_to_last:
.GLOBAL fft_next_to_last;
.EXTERN fft_last_stage;

    call fft_last_stage (db);    /* This function call is intended to preload  */
    r0=1;                        /* the function into the overlay run memory.  */
    dm(prefetch) = r0;           /* Set prefetch flag to 1 to indicate a load. */

{_________ next to last stage__________}
    m1=-2;                       {modifier to sine table pntr}
    m8=r10;                      {incr between groups}
    m9=-2;                       {modifier to cosine table pntr}
    m10=r3;                      {incr between bttrfly branches}
    m12=r9;                      {wrap around to next grp + 1}
    m13=r13;                     {wrap to bgn of 1st group}

    i0=refft+N-1;
    i1=sine+(N/2)-2;             {pntr to 1st sine coeff}
    i2=refft+N-1;
    i8=imfft+N-1;
    i9=cosine+(N/2)-2;           {pntr to 1st cosine coeff}
    i10=imfft+N-1;
    i11=imfft+N-1;

    f0=dm(i1,m1), f7=pm(i8,m8);
    f12=f0*f7, f6=dm(i0,m0), f1=pm(i9,m9);
    f8=f1*f6, modify(i11,m10);
    f11=f1*f7, f7=pm(i8,m12);
    f14=f0*f6, f12=f8+f12, f8=dm(i0,m0);
    f12=f0*f7, f13=f8+f12, f10=f8-f12, f6=dm(i0,m0);

{Do the N/4 butterflies in the two groups of this stage}
    lcntr=N/4, do end_group2 until lce;
        f8=f1*f6, f14=f11-f14, dm(i2,m0)=f10, f9=pm(i11,m8);
        f11=f1*f7, f3=f9+f14, f9=f9-f14, dm(i2,m0)=f13, f1=pm(i9,m9);
        f14=f0*f6, f12=f8+f12, f8=dm(i0,m2), pm(i10,m10)=f9;
        f13=f8+f12, f10=f8-f12, f0=dm(i1,m1), f7=pm(i8,m8);
        f12=f0*f7, f14=f11-f14, f6=dm(i0,m0), f9=pm(i11,m12);
        f8=f1*f6, f3=f9+f14, f9=f9-f14, dm(i2,m0)=f10, pm(i10,m10)=f3;
        f11=f1*f7, dm(i2,m2)=f13, pm(i10,m10)=f9;
        f14=f0*f6, f12=f8+f12, f8=dm(i0,m0), f7=pm(i8,m12);
end_group2: f12=f0*f7, f13=f8+f12, f10=f8-f12, f6=dm(i0,m0), pm(i10,m13)=f3;

    rts;
    nop;
    nop;

.ENDSEG;

.SEGMENT/PM pm2_code;

fft_last_stage:
.GLOBAL fft_last_stage;

{ The last stage }
    m0=-N/2;
    m2=-N/2-1;
    m10=m0;
    m13=m2;
    i0=refft+N-1;
    i1=sine+(N/2)-1;             {pntr to 1st sine coeff}
    i2=refft+N-1;
    i8=imfft+N-1;
    i9=cosine+(N/2)-1;           {pntr to 1st cosine coeff}
    i10=imfft+N-1;
    i11=imfft+N-1;
    m1=-1;                       {modifiers to coeff tables}
    m9=-1;

{start first bttrfly}
    f0=dm(i1,m1), f7=pm(i8,m11);
    f12=f0*f7, f6=dm(i0,m0), f1=pm(i9,m9);
    f8=f1*f6, modify(i11,m10);
    f11=f1*f7;
    f14=f0*f6, f12=f8+f12, f8=dm(i0,m2), f9=pm(i11,m11);

{do N/2 butterflies in the last stage}
    lcntr=N/2, do last_stage until lce;
        f13=f8+f12, f10=f8-f12, f0=dm(i1,m1), f7=pm(i8,m11);
        f12=f0*f7, f14=f11-f14, f6=dm(i0,m0), f1=pm(i9,m9);
        f8=f1*f6, f3=f9+f14, f15=f9-f14, dm(i2,m0)=f10, f9=pm(i11,m11);
        f11=f1*f7, dm(i2,m2)=f13, pm(i10,m10)=f15;
last_stage: f14=f0*f6, f12=f8+f12, f8=dm(i0,m2), pm(i10,m13)=f3;

    rts;                         {finished}
    nop;
    nop;

{_______________________________________________________________________}

.ENDSEG;
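The pre-load calls at the top of fft_middle_stages and fft_next_to_last depend on a handshake with the overlay manager. Because the call is a delayed branch (db), the two instructions that follow it set the prefetch flag before the manager runs, and the listing's comments describe the flag as indicating a load. The Python model below is only a sketch of that handshake under stated assumptions, not the note's overlay manager. OverlayManager, its methods, and the run-slot numbering are hypothetical names introduced for illustration. The DMA transfer is reduced to a counter, the load is synchronous, and the assumption that the manager clears the flag itself is not taken from the listings.

```python
# Hypothetical model of the pre-load handshake; these names are not from the note's listings.

class OverlayManager:
    def __init__(self, overlays):
        # overlays maps an overlay ID to (run_slot, function); the function
        # stands in for the code that executes from run memory.
        self.overlays = overlays
        self.resident = {}   # run_slot -> ID of the overlay currently loaded there
        self.prefetch = 0    # plays the role of the dm(prefetch) flag
        self.loads = 0       # number of simulated live-to-run transfers

    def _load(self, ov_id):
        # Stand-in for the DMA transfer; skipped when the overlay is already resident.
        slot, _ = self.overlays[ov_id]
        if self.resident.get(slot) != ov_id:
            self.resident[slot] = ov_id
            self.loads += 1

    def call(self, ov_id, *args):
        self._load(ov_id)
        if self.prefetch:
            # Pre-load request: clear the flag (an assumption) and return
            # without running the overlay.
            self.prefetch = 0
            return None
        _, func = self.overlays[ov_id]
        return func(*args)


# Two overlays in different run slots, so one can be loaded while the other runs.
mgr = OverlayManager({
    1: (0, lambda: "middle stages"),
    2: (1, lambda: "next to last stage"),
})

first = mgr.call(1)      # ordinary call: load, then execute
mgr.prefetch = 1         # the role of r0=1; dm(prefetch)=r0; in the delay slots
preload = mgr.call(2)    # load only, nothing executes
second = mgr.call(2)     # already resident: executes without a second load
```

In this model the second call to overlay 2 finds it resident and skips the transfer, which is the benefit the pre-load is meant to deliver. On the DSP the transfer would proceed by DMA while the calling overlay keeps executing.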