TEACHING REAL-TIME DSP USING DIGITAL SIGNAL PROCESSORS AT THE UNIVERSITY OF LEICESTER

F. S. Schlindwein and N. B. Jones
Department of Engineering, University of Leicester, Leicester, LE1 7RH, England.

There is an urgent need to develop efficient methods to promote the understanding of Digital Signal Processor (DSP) chips in the education of Electrical Engineering students. Such devices can be up to 70 times faster than standard microprocessors for some tasks, making the real-time implementation of some rather complex signal processing algorithms possible using cheap, but very flexible, hardware. These devices are therefore becoming important in many branches of Engineering and students need to be exposed to them at undergraduate level.

The Engineering Department of the University of Leicester found that a lecture combined with practical work in a single laboratory session, in all lasting one day, is a very effective way to introduce the subject. The activities cover the particular architecture of the DSP chip and that of the development card used, programming in Assembly, understanding the software for controlling the interface of the DSP to the host processor (PC compatible) and the treatment of interrupts. Although we have three different DSP development cards, our teaching is based on the TMS320C25 DSP chip, because it is probably the most popular of the fixed-point DSPs and it is the one with the largest software base of applications. It is also very cheap!

The laboratory session consists of a succession of supervised tasks building on a given example to help our students understand and implement techniques for developing real-time digital signal processing algorithms. These include writing programs in Assembly, loading and debugging them, dealing with interrupt service routines for sampling analogue signals, and implementing a simple moving average FIR filter to run in real time.
The laboratory session is held immediately after a lecture on the Digital Signal Processor architecture and addressing modes, with some tuition on Assembly language. The laboratory session is divided into four exercises. In the first one, the students are given a working program which they have to understand, load, single-step and test using the debugging software supplied, and then modify directly in the program memory of the DSP. In the second exercise the students are required to change the sampling frequency in the source code, reassemble the program, re-load it, and then observe the effects of the different sampling frequencies, including the aliasing phenomenon. Exercise three requires them to write a second version of the first example, but using the internal timer of the TMS320C25 chip to generate the sampling rate; and finally, exercise four deals with the multiply-accumulate instruction, which is the basis of most digital signal processing algorithms. A digital FIR filter is implemented. At the end of one day we have had students programming a DSP chip they have never studied before!

Here we give some background, then detail the equipment used and finally present part of the material handed out to the students for the laboratory session.

© 1995 The Institution of Electrical Engineers. Printed and published by the IEE, Savoy Place, London WC2R 0BL, UK.

Authorized licensed use limited to: University of Leicester. Downloaded on November 4, 2008 at 09:38 from IEEE Xplore. Restrictions apply.

1. Background

The field of Digital Signal Processing has grown and reached maturity in the last decade, from a theoretical subject into a practical and profitable area. We are exposed to all sorts of consumer goods that use DSP techniques, from the CD player for music to the cellular telephone.
This maturity is due both to the understanding of the theory and, mainly, to the fast technological development of low-cost digital hardware, going from MSI and LSI non-programmable chips to microprocessors and Digital Signal Processor chips.

There is an urgent need to develop efficient methods to promote the understanding of Digital Signal Processor (DSP) chips in the education of Electrical Engineering students. Such devices can be up to 70 times faster than standard microprocessors for some tasks, making the real-time implementation of some rather complex signal processing algorithms possible using cheap, but very flexible, hardware. These devices are therefore becoming important in many branches of engineering.

Digital Signal Processing can be taught while avoiding the 'tedious hardware and software projects', as in the approach advocated by Kamas and Lee (1), where all the exercises are done using a software package capable of graphical output, or it can be taught together with some computer architecture, circuit design and programming. Because we are an Engineering Department training Electrical and Electronic engineers, who need to implement DSP techniques in the real world, we do both. This paper describes the second approach, where students use a commercial hardware base (Digital Signal Processor + memory + A/D and D/A converters) and are invited to produce Assembly code that works in real time, performing a real-world application that includes A/D conversion, digital processing and D/A conversion, with the results being observable on an oscilloscope.

There is a rich variety of texts and manuals on installing a digital signal processing laboratory and using it for teaching (2-9). This text presents a somewhat simpler, certainly shorter, but very effective example of how to do it.
We have found that a lecture combined with practical work in a single laboratory session, in all lasting one day, is a very effective way to teach the subject (2). In our approach the activities cover the particular architecture of the DSP chip and that of the development card used, programming in Assembly, software for controlling the interface of the DSP to the host processor (PC compatible) and the treatment of interrupts. Although we have three different DSP development cards in the Department, our teaching is based on the TMS320C25 DSP chip, because it is probably the most popular of the fixed-point DSPs and the one with the largest software base of applications. It is also very cheap.

We offer this combination of special lecture and laboratory session to both our undergraduate and our post-graduate students, and we assume that the students are familiar with the fundamentals of DSP (they have either taken a DSP course or are taking one concurrently). Further, we assume that they are familiar with the general ideas of Assembly programming, although not with the Assembly language of the target digital signal processor chip. No prior knowledge of DSP chips is assumed. All our undergraduates do Assembly programming in the laboratory over a six-week period, during which they are required to complete six different tasks dealing with various peripherals, using in-house-built hardware and a microcomputer based on the MC68000.

2. Program for the day and hardware, software and reference material requirements

The program for the day includes firstly a lecture on the architecture of the TMS320C25 digital signal processor, and then a sequence of laboratory tasks:
(a). Exploration of a given example program by running it under the monitor in single steps,
(b). Modification of the given example program,
(c). Development of the algorithm to be used for the filter (previous DSP lectures),
(d). Understanding the Assembly code for the implementation of the filter algorithm,
(e). Merging the given filter routine with the sampling routine, making the filter run in real time and evaluating its performance.

The hardware, software and reference material requirements for the lab session are:
(a). IBM compatible Personal Computer with a hard disk (1 Mb of hard disk free),
(b). Loughborough Sound Images (LSI) DSP System board based on the TMS320C25 chip,
(c). Texas Instruments TMS320C2x Assembler and linker (XASM25, LINKER),
(d). LSI Monitor program to debug software loaded into the DSP board (MON25),
(e). A simple text editor such as MS EDIT (provided) or equivalent,
(f). TMS320C2x User's Guide,
(g). Manual for the TMS320C25 LSI processor board,
(h). Oscilloscope,
(i). Signal generator.

We are currently modifying the laboratory session to use the (much) cheaper DSP starter kits DSK 'C5x, which sell for £81 in the UK. The LSI System board used previously costs more than ten times that!

3. Description of the laboratory session

The laboratory session is divided into four exercises. In the first one, the students are given a working program which they have to understand, load, single-step through using the debugging software (MON25) supplied by LSI, and then modify directly in the program memory of the DSP board. In the second exercise the students are required to change the sampling frequency in the source code, reassemble the program, re-load it, and then observe the aliasing phenomenon. Exercise three requires them to write a second version of the first example, but using the internal timer to generate the interrupts; and finally, exercise four deals with the multiply-accumulate instruction, which is the basis of most digital signal processing algorithms, and a practical filter is implemented.
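The aliasing that students observe in the second exercise can be predicted before going to the bench. The following is a minimal Python sketch, not part of the original laboratory notes; the function name `apparent_frequency` is ours, and the 10 kHz sampling rate and the 100 Hz, 5.1 kHz and 5 kHz inputs are the values quoted in Exercise 2 of the notes. It assumes an ideal sampler with no anti-aliasing filter.

```python
def apparent_frequency(f_in_hz, f_s_hz):
    """Frequency in the range 0..fs/2 at which a sampled sinusoid appears.

    Assumes ideal sampling at f_s_hz with no anti-aliasing filter.
    """
    f = f_in_hz % f_s_hz          # fold into one sampling interval
    return min(f, f_s_hz - f)     # reflect about fs/2


if __name__ == "__main__":
    F_S = 10_000.0                # sampling rate used in Exercise 2
    for f_in in (100.0, 5_100.0, 5_000.0):
        print(f_in, "Hz appears at", apparent_frequency(f_in, F_S), "Hz")
```

An input placed exactly at half the sampling rate is a special case: the amplitude seen at the output depends on the phase between the sinusoid and the sampling clock, which this frequency-only sketch does not model.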
The next part of this paper is a reduced version of the laboratory session notes given to the students.

Exercise 1: An example of how to use the external timer to generate interrupts and to start A/D conversions is given by the INTECHO main program, produced by LSI (10), which:
- programs the timer with a number to generate interrupts at 44 kHz,
- programs the interrupt mask, enabling INT1 (11),
- puts a branch to its Interrupt Service Routine in address 4 (the INT1 address),
- enables interrupts.
In the ISR a value is read from the A/D and echoed to the D/A.

In this exercise the students are invited to start the LSI monitor, fill program memory with NOPs to make it easier to debug and locate the program, load program memory with the INTECHO example program, check the program using the DISassembly instruction, switch on the oscilloscope and the signal generator and adjust it to produce a sinusoidal waveform of frequency around 100 Hz and of amplitude 10 V, and finally to run the program and observe the output of the D/A on the oscilloscope. They are then invited to check the program memory for the interrupt 1 (INT1) branch to the ISR (program memory address 4), and then to change the sampling frequency to a very low value and observe the behaviour of the program (large amplitude steps from the A/D and D/A converters).

    INTECHO  32020 FAMILY MACRO ASSEMBLER  PC 1.1 86.036  09:12:24 06-24-87

            IDT   'INTECHO'
    *********************
    *      INTECHO      *
    *********************
    * Author:   T.J.Wheatley
    * Address:  Loughborough Sound Images Ltd.
    * Tel:      (0509) 231843
    * Date:     21 April 1986
    *
    * PROGRAM TO PROVIDE INTERRUPT DRIVEN ECHO FROM ADC TO DAC
    *
    * THE FOLLOWING LINKS MUST BE INSERTED BEFORE RUNNING THE CODE
    *     LK15b
    *     LK16a
    *     LK17a
    *
    * DEFINE ADDRESS CONSTANTS
    *
    PAGE0   EQU   0          Page 0 of data mem for mem-mapped regs
    IMR     EQU   >4         Address of Mask Register in Page Zero
    TEMP    EQU   >63        Word >63 of B2 will be temporary store
    VAL     EQU   >64        Word >64 of B2 will hold output value
    TIM     EQU   1          Port 1 is the external timer address
    *                        Use LK15b to generate processor interrupts.
    ADC     EQU   2          Port 2 is the ADC address when using
    *                        interval timer clocking (use LK17a)
    DAC     EQU   2          Port 2 is the DAC address when using
    *                        interval timer. Put link at LK16a.
    *
    * DEFINE DATA CONSTANTS
    *
    TIMVAL  EQU   >FF8E      Timer value for clocking at about 44 kHz
    IMASK   EQU   >FFC2      Interrupt mask to enable INT1 only
    *
    * SET ISR VECTOR
    *
            AORG  >4         Address of the INT1 vector
            B     ISR        Branch to interrupt service routine
    *
    * START OF MAIN PROGRAM
    *
            AORG  >400       Start of available program area
            LDPK  PAGE0      Page pointer set to 0
            LRLK  >0,TIMVAL  Timer value in AR0
            SAR   >0,TEMP    Store in data memory temporarily
            OUT   TEMP,TIM   Output value to timer port
            LRLK  >1,IMASK   Load interrupt mask into AR1
            SAR   >1,IMR     Put it into Mask Register
            EINT             Enable interrupts
    LOOP    B     LOOP       Wait for interrupts to arrive
    *
    * THIS IS THE END OF THE MAIN PROGRAM
    *
    * THIS IS THE START OF THE INTERRUPT SERVICE ROUTINE
    *
            AORG  >410       Address of ISR
    ISR     IN    VAL,ADC    Store the ADC value in data mem
            OUT   VAL,DAC    Load data mem value to DAC
            EINT             Re-enable the interrupts
            RET              Return to main program
            END

    NO ERRORS, NO WARNINGS

Figure 1 - The initial example from LSI, 'INTECHO.ASM', demonstrates the uses of the external timer, the Interrupt Mask Register, the A/D and D/A converters, and external Interrupt Service Routines (used with written permission of the author).

Exercise 2: Modify the sampling frequency of INTECHO to 10 kHz and apply a sinusoid with a frequency of 100 Hz. Observe it on the oscilloscope. Now change the frequency of the sinusoid to 5.1 kHz on the signal generator and estimate the frequency of the sinusoid from the oscilloscope. What value did you get? Change the frequency of the signal generator to 5 kHz and observe the oscilloscope. Any unexpected behaviour? Explain.

Exercise 3: Now modify the INTECHO.ASM program into a new version (INTECH1.ASM) which uses the INTERNAL timer to generate the interrupts.
To do this you have to remember that:
- You want to enable TINT and not INT1 in the Interrupt Mask Register,
- The internal timer is memory mapped and the PERIOD REGISTER has data memory address 3,
- The internal timer counts down, not up,
- The TINT vector is different from the INT1 vector.

After editing your modifications (MS EDIT is provided), assemble the program:

    XASM25 INTECH1

This produces INTECH1.MPO. Run 'MON25' again, load your assembled executable file (INTECH1.MPO) and try it out. You don't have to link this one! The TI assembler (XASM25) produces executable code if you are not using external subroutines.

An important feature of MON25 which helps to debug programs is SS (single-step execution). Try it:

    PC 500<cr>
    DIS 500 514<cr>
    SS 1        (SS 1 shows all registers after the execution; SS shows only some of the registers)
    <space>     (run next instruction)
    <space>     ... (and next ...)
    <cr>        (enter key to get out of the SS environment)

Exercise 4: Implement the simple FIR filter

    y(n) = a0 x(n) + a1 x(n-1) + a2 x(n-2) + a3 x(n-3)

with the weights all equal to 0.25 (moving average, low-pass), using the following data memory positions:

    XN  (300h)  ;current input sample x(n)
    X1  (301h)  ;previous input sample x(n-1)
    X2  (302h)  ;sample x(n-2)
    X3  (303h)  ;sample x(n-3)
    YN  (304h)  ;output value, y(n)
    A0  (305h)  ;coefficient for current sample, x(n)
    A1  (306h)  ;coefficient for previous sample, x(n-1)
    A2  (307h)  ;coefficient for sample x(n-2)
    A3  (308h)  ;coefficient for sample x(n-3)

Figure 2 - The mapping of the positions in data memory to be used for the weights A0-A3, the input samples, X, and the output of the filter, YN.
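The behaviour expected from this filter can be checked numerically before it is coded in Assembly. The sketch below is ours and not part of the original notes: it evaluates the magnitude response of the four-tap moving average defined above, assuming floating-point coefficients and the 44 kHz sampling rate used elsewhere in these notes. The names `fir_magnitude`, `COEFFS` and `F_S` are illustrative only.

```python
import cmath
import math


def fir_magnitude(coeffs, f_hz, f_s_hz):
    """Return |H(f)| for y(n) = sum over k of coeffs[k] * x(n-k)."""
    w = 2.0 * math.pi * f_hz / f_s_hz              # normalised angular frequency
    h = sum(a * cmath.exp(-1j * w * k) for k, a in enumerate(coeffs))
    return abs(h)


COEFFS = [0.25, 0.25, 0.25, 0.25]   # a0..a3 from Exercise 4
F_S = 44_000.0                      # assumed sampling rate in Hz

if __name__ == "__main__":
    for f in (0.0, 5_500.0, 11_000.0, 16_500.0, 22_000.0):
        print(f"{f:8.0f} Hz   |H| = {fir_magnitude(COEFFS, f, F_S):.3f}")
```

With four equal weights the response has zeros at one quarter and one half of the sampling frequency, so on the bench the output sinusoid should vanish near those two frequencies when the signal generator is swept.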
Obs: In Q15 notation 0.25 is written as 8192 or 2000h, because all values (assumed to be within the interval -1 to +1) are normalised to fit the 16 bits available. Therefore +1 is represented as 7FFFh and -1 as 8000h, and all integer values are interpreted as fractions whose actual value is the number stored divided by the scaling factor (32k for Q15 notation). Got that?

See if you understand the LTD instruction. It does the following:
1) PC = PC + 1 (program counter is incremented)
2) T register is loaded with the value in data memory
3) contents of the data memory are copied into the next memory position
4) ACC = ACC + P register

A brief explanation of a few other instructions follows:

SPM 1 sets Product Mode to 1 (see section 3.8). This means that all products are left-shifted one position, correcting the Q30 result held in the 32 bits of P to Q31 notation. This is done when using the Q15 and Q31 notations to make the best possible use of the 16-bit values available to represent fractions. Try to multiply +1 by +1 in Q15 notation. You get 7FFFh * 7FFFh, and when you do it you will see that you get two zeroes as the most significant bits, i.e., the sign bit gets duplicated. If you try to read this number in Q31 notation it will be +0.5 and not the +1 we would like to see there. In other words, Q15 * Q15 = Q30 and not Q31. And because we are to save only the 16 most significant bits of the result (SACH), we use SPM 1 to give us the extra shift left. Alternatively, we could have done it with no shifts on the P register, but one shift when saving: SACH YN,1.
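The Q15 arithmetic described above can be tried out on a PC. The following Python sketch is ours and only models the arithmetic, not the registers of the TMS320C25; the helper names `to_q15` and `q15_multiply` are hypothetical. It shows 0.25 encoded as 2000h and the effect of the extra left shift on the product of 7FFFh by 7FFFh.

```python
def to_q15(x):
    """Encode a fraction as a 16-bit Q15 integer, saturating to -32768..32767."""
    return max(-32768, min(32767, int(round(x * 32768))))


def q15_multiply(a, b, shift):
    """Multiply two Q15 integers and return the upper 16 bits of the product.

    shift=0 keeps the raw Q30 product; shift=1 applies the single left
    shift that the text attributes to SPM 1 (or to SACH YN,1).
    """
    product = (a * b) << shift          # Q30, or Q31 after one left shift
    return (product >> 16) & 0xFFFF     # upper 16 bits, as SACH would store


if __name__ == "__main__":
    one = to_q15(1.0)                   # saturates to 7FFFh
    print(hex(to_q15(0.25)))            # 0.25 in Q15
    print(hex(q15_multiply(one, one, 0)))   # read as Q15: about +0.5
    print(hex(q15_multiply(one, one, 1)))   # read as Q15: just under +1
```

Without the shift the stored word is 3FFFh, which reads as roughly +0.5 in Q15; with the shift it is 7FFEh, which is as close to +1 as the product of two 7FFFh values can get.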
The remaining instructions used in Figure 3 are:

    LT <dma>   = loads T with the contents of the data memory addressed by <dma>
    MPY <dma>  = multiplies T by the contents of the data memory addressed by <dma>
    APAC       = adds P to ACC

Use the code given in Figure 3 as a starting point (X0 in the code is the current-sample location, labelled XN in Figure 2):

            SPM   1        ;to correct for the multiplications (Q15*Q15 = Q30)
            LDPK  6        ;data memory base-address = 300h
    FIR     ZAC            ;initialise Accumulator to zero (ZERO ACC)
            LT    X3       ;T = x(n-3)
            MPY   A3       ;P = a3 x(n-3)
            LTD   X2       ;LTD = accumulate previous P into ACC so that ACC = a3 x(n-3),
                           ;load T with X2, and DMOV X2 to X3
            MPY   A2       ;ACC = a3 x(n-3), P = a2 x(n-2)
            LTD   X1       ;ACC = a2 x(n-2) + a3 x(n-3), T = X1, X1 moved to X2
            MPY   A1       ;P = a1 x(n-1)
            LTD   X0       ;ACC = a1 x(n-1) + a2 x(n-2) + a3 x(n-3), T = X0, X0 'aged'
            MPY   A0       ;P = a0 x(n)
            APAC           ;ACC = a0 x(n) + a1 x(n-1) + a2 x(n-2) + a3 x(n-3)
            SACH  YN       ;Save result
            OUT   YN,DAC   ;Show it through the D/A converter
            IN    X0,ADC   ;New input value. Notice how all others 'aged' by falling 1 place
                           ;in memory
            B     FIR

Figure 3 - Code to be used as a starting point for the FIR filter of order 3.

Make the sampling frequency equal to 44 kHz and check the zeroes of the filter by sweeping the frequency of the sinusoid.

4. Conclusion

The combination of the normal course lectures (on digital signal processing, microprocessors and Assembly language programming) with a special lecture on the architecture and Assembly language of a particular DSP chip and a laboratory session has been used very successfully to teach real-time digital signal processing to our undergraduates. The laboratory session consists of a succession of supervised tasks building on a given example to help students understand and implement techniques for developing real-time digital signal processing algorithms. In the fourth and final exercise a digital FIR filter is implemented and run in real time.
At the end of one day we have had students programming a DSP chip they have never studied before!

5. Acknowledgements

The authors would like to acknowledge the very important support the Department have always had from Craig Marven from Texas Instruments, James Bryant and Jeff Channell from Analog Devices, and Tim Wheatley, David Quarmby and David Walsh from Loughborough Sound Images. We are also very grateful to Andy Wiuby and Pop Sharma, who set up the PC network for this course, and to Dr. Naim Dahnoun, who helped run the first laboratory session.

6. References

1. Kamas, A. and Lee, E.A., Digital Signal Processing experiments using a personal computer with software provided, Prentice Hall, 1989.
2. Lee, E.A., Programmable DSP architectures - Part 1, IEEE ASSP Mag., N.4, pp.4-19, 1988.
3. Lee, E.A., Programmable DSP architectures - Part 2, IEEE ASSP Mag., N.5, pp.4-14, 1989.
4. Texas Instruments, TMS320C25 DSP Design Workshop, 1989.
5. Hutchins, B.A. and Parks, T.W., A digital signal processing laboratory using the TMS320C25, Prentice Hall, 1990.
6. Schlindwein, F.S. and Evans, D.H., Digital Signal Processors - Description of a laboratory for Digital Signal Processing equipped with DSP chips for applications in Biomedical Engineering (in Portuguese), Revista Brasileira de Engenharia, vol.7, N.1, pp.79-85, 1990.
7. Ingle, V.K. and Proakis, J.G., Digital signal processing laboratory using the ADSP-2101 microcomputer, Prentice Hall, 1991.
8. Richardson, J. and Bore, C., The use of modern digital signal processing laboratory equipment in an engineering undergraduate lecture course, The Teaching of Electronic Engineering Degree Courses, Hull, 1992.
9. Dahnoun, N. and Schlindwein, F.S., Introducing undergraduate students to the use of digital signal processors, International Journal of Electrical Engineering Education, vol.31, N.1, pp.66-83, Manchester University Press, 1994.
10. Loughborough Sound Images, LSI TMS320C25 System Board Manual, 1989.
11. Texas Instruments, TMS320C2x User's Guide, revision B, 1990.