DSP Lecture 01
Chapter 1: Introduction

Learning Objectives
• Why process signals digitally?
• Definition of a real-time application.
• Why use Digital Signal Processing processors?
• What are the typical DSP algorithms?
• Parameters to consider when choosing a DSP processor.
• Programmable vs ASIC DSP.
• Texas Instruments' TMS320 family.

Present Day Applications
DSP: Technology Enabler
• Wireless / Cellular: voice-band audio, RF codecs, voltage regulation.
• Consumer Audio: stereo A/D, D/A, PLL, mixers.
• HDD: PRML read channel, MR pre-amp, servo control, SCSI transceivers.
• Automotive: digital radio, A/D/A, active suspension, voltage regulation.
• Multimedia: stereo audio, imaging, graphics palette, voltage regulation.
• DTAD: speech synthesizer, mixed-signal processor.

Why go digital?
• Digital signal processing techniques are now so powerful that it is sometimes extremely difficult, if not impossible, for analogue signal processing to achieve similar performance.
• Examples:
  – FIR filter with linear phase.
  – Adaptive filters.
• Analogue signal processing is achieved by using analogue components such as:
  – Resistors.
  – Capacitors.
  – Inductors.
• The inherent tolerances associated with these components, together with temperature, voltage changes and mechanical vibrations, can dramatically affect the effectiveness of the analogue circuitry.
• With DSP it is easy to:
  – Change applications.
  – Correct applications.
  – Update applications.
• Additionally, DSP reduces:
  – Noise susceptibility.
  – Chip count.
  – Development time.
  – Cost.
  – Power consumption.

Why NOT go digital?
• Very high-frequency signals may not be processable digitally, for two reasons:
  – Analogue-to-digital converters (ADCs) cannot sample fast enough.
  – The application can be too complex to be performed in real-time.
Real-time processing
• DSP processors have to perform tasks in real-time, so how do we define real-time?
• The definition of real-time depends on the application.
• Example: a 100-tap FIR filter is performed in real-time if the DSP can perform and complete the following operation between two samples:

    $y(n) = \sum_{k=0}^{99} a_k \, x(n-k)$

[Figure: the sample time between samples n and n+1, divided into processing time and waiting time]

• We can say that we have a real-time application if:
  – Waiting Time ≥ 0

Why do we need DSP processors?
• Why not use a General Purpose Processor (GPP) such as a Pentium instead of a DSP processor?
  – What is the power consumption of a Pentium and a DSP processor?
  – What is the cost of a Pentium and a DSP processor?
• Use a DSP processor when the following are required:
  – Cost saving.
  – Smaller size.
  – Low power consumption.
  – Processing of many “high” frequency signals in real-time.
• Use a GPP processor when the following are required:
  – Large memory.
  – Advanced operating systems.

What are the typical DSP algorithms?
• The Sum of Products (SOP) is the key element in most DSP algorithms:

Algorithm                          Equation
Finite Impulse Response filter     $y(n) = \sum_{k=0}^{M} a_k \, x(n-k)$
Infinite Impulse Response filter   $y(n) = \sum_{k=0}^{M} a_k \, x(n-k) + \sum_{k=1}^{N} b_k \, y(n-k)$
Convolution                        $y(n) = \sum_{k=0}^{N} x(k) \, h(n-k)$
Discrete Fourier Transform         $X(k) = \sum_{n=0}^{N-1} x(n) \exp[-j(2\pi/N)nk]$
Discrete Cosine Transform          $F(u) = \sum_{x=0}^{N-1} c(u) \, f(x) \cos\left[\frac{\pi}{2N} u(2x+1)\right]$

What Problem Are We Trying To Solve?

[Figure: digital sampling of an analogue signal: x → ADC → DSP → DAC → Y]

Most DSP algorithms can be expressed with a MAC:

    $Y = \sum_{i=1}^{count} a_i \cdot x_i$

    /* C arrays are 0-based, so i runs from 0 to count-1 to cover all count terms */
    for (i = 0; i < count; i++) {
        sum += m[i] * n[i];
    }

What does it take to do this fast … and easily?
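As a rough illustration of the sum of products and the real-time condition described above, the following C sketch computes one FIR output sample from a delay line and checks the waiting time in CPU cycles. The function names (`fir_push`, `fir_sample`, `waiting_cycles`) and the cycle counts are hypothetical and chosen only for this example. It is a plain, unoptimised sketch, not the way a 'C6000 compiler or library implements a filter.

```c
#include <assert.h>
#include <stddef.h>

/* Shift a new input sample into the delay line so that
 * history[0] = x(n), history[1] = x(n-1), and so on.
 * (Hypothetical helper; a real implementation would normally
 * use a circular buffer instead of moving every element.) */
void fir_push(float *history, size_t num_taps, float new_sample)
{
    assert(history != NULL && num_taps > 0);
    for (size_t k = num_taps - 1; k > 0; k--) {
        history[k] = history[k - 1];
    }
    history[0] = new_sample;
}

/* One output sample: y(n) = sum over k of a_k * x(n-k).
 * Each loop iteration is one multiply-accumulate (MAC). */
float fir_sample(const float *coeffs, const float *history, size_t num_taps)
{
    float sum = 0.0f;

    assert(coeffs != NULL && history != NULL);
    for (size_t k = 0; k < num_taps; k++) {
        sum += coeffs[k] * history[k];
    }
    return sum;
}

/* Waiting time expressed in CPU cycles: the cycles available between
 * two samples minus the cycles spent processing. The application is
 * real-time when the result is greater than or equal to zero. */
long waiting_cycles(long clock_hz, long sample_rate_hz, long processing_cycles)
{
    assert(sample_rate_hz > 0);
    long cycles_per_sample = clock_hz / sample_rate_hz;
    return cycles_per_sample - processing_cycles;
}
```

For example, with a 150 MHz clock and an assumed 48 kHz sample rate there are 3125 cycles between samples. If the filter took, say, 100 cycles, `waiting_cycles(150000000L, 48000L, 100L)` would return 3025, so the waiting time is positive and the application is real-time.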
Fast MAC using only C

Multiply-Accumulate (MAC) in natural C code:

    for (i = 0; i < count; i++) {
        sum += m[i] * n[i];
    }

• Fastest execution of MACs
  – The 'C6x roadmap ... from 200 to 2400 MMACs.
• Ease of C programming
  – Even using natural C, the 'C6000 architecture can perform 2 to 4 MACs per cycle.
  – Compiler generates 80-100% efficient code.

How does the 'C6000 achieve such performance from C?

'C6000 Architecture: Built for Speed

[Figure: 'C6000 CPU core: memory; register files A0 .. A15 .. A31 and B0 .. B15 .. B31; functional units .D1, .D2, .M1, .M2, .L1, .L2, .S1, .S2; controller/decoder]

'C6000 compiler excels at natural C:
• While dual-MAC speeds math-intensive algorithms, the flexibility of 8 independent functional units allows the compiler to quickly perform other types of processing.
• All 'C6000 instructions are conditional, allowing efficient hardware pipelining.
• Instruction set and CPU hardware orthogonality allow the compiler to achieve 80-100% efficiency.

Fastest MAC using Natural C

    float mac(float *m, float *n, int count)
    {
        int i;
        float sum = 0;

        for (i = 0; i < count; i++) {
            sum += m[i] * n[i];
        }
        …

Piped loop kernel (assembly), using all eight functional units in parallel:

    ;** --------------------------------------------------*
    LOOP: ; PIPED LOOP KERNEL
               LDDW    .D1     A4++,A7:A6
    ||         LDDW    .D2     B4++,B7:B6
    ||         MPYSP   .M1X    A6,B6,A5
    ||         MPYSP   .M2X    A7,B7,B5
    ||         ADDSP   .L1     A5,A8,A8
    ||         ADDSP   .L2     B5,B8,B8
    || [A1]    B       .S2     LOOP
    || [A1]    SUB     .S1     A1,1,A1
    ;** --------------------------------------------------*

'C6000 System Block Diagram

[Figure: CPU (functional units .D1, .D2, .M1, .M2, .L1, .L2, .S1, .S2), internal buses and external memory]

Looking at the internal buses ...
'C6000 Internal Buses

Bus                  Width
Program Addr         x32
Program Data         x256
Data Addr - T1       x32
Data Data - T1       x32/64
Data Addr - T2       x32
Data Data - T2       x32/64
DMA Addr - Read
DMA Data - Read
DMA Addr - Write
DMA Data - Write

[Figure: internal buses linking internal memory, external memory, peripherals and the DMA with the PC, A regs and B regs]

'C6000 System Block Diagram

[Figure: CPU (register sets A and B; functional units .D1, .D2, .M1, .M2, .L1, .L2, .S1, .S2), internal buses, internal memory and external memory]

Next, the internal memory ...

'C6711 Memory

Address      Contents
0000_0000    64KB internal (program / data, Level 2)
0180_0000    On-chip peripherals
8000_0000    External space 0 (128MB)
9000_0000    External space 1 (128MB)
A000_0000    External space 2 (128MB)
B000_0000    External space 3 (128MB)
FFFF_FFFF    End of address space

[Figure: CPU with a 4K program cache and a 4K data cache, connected through the cache logic to the 64K program / data memory (Level 2)]

'C6000 System Block Diagram

[Figure: CPU (register sets A and B; functional units .D1, .D2, .M1, .M2, .L1, .L2, .S1, .S2), internal buses, internal memory, external memory and peripherals]

Looking at each peripheral ...

Hardware vs. Microcode multiplication
• DSP processors are optimised to perform multiplication and addition operations.
• Multiplication and addition are done in hardware and in one cycle.
• Example: 4-bit multiply (unsigned).

Hardware (one cycle):

        1011
      x 1110
    --------
    10011010

Microcode (one partial product per cycle, then the sum):

        1011
      x 1110
    --------
        0000      Cycle 1
       1011.      Cycle 2
      1011..      Cycle 3
     1011...      Cycle 4
    --------
    10011010      Cycle 5

Parameters to consider when choosing a DSP processor

Parameter                                    TMS320C6211 (@150MHz)                  TMS320C6711 (@150MHz)
Arithmetic format                            32-bit                                 32-bit
Extended floating point                      N/A                                    64-bit
Extended arithmetic                          40-bit                                 40-bit
Performance (peak)                           1200 MIPS                              1200 MFLOPS
Number of hardware multipliers               2 (16 x 16-bit) with 32-bit result     2 (32 x 32-bit) with 32 or 64-bit result
Number of registers                          32                                     32
Internal L1 program memory cache             32 Kbit (4 KB)                         32 Kbit (4 KB)
Internal L1 data memory cache                32 Kbit (4 KB)                         32 Kbit (4 KB)
Internal L2 cache                            512 Kbit (64 KB)                       512 Kbit (64 KB)
I/O bandwidth: serial ports (number/speed)   2 x 75 Mbps                            2 x 75 Mbps
DMA channels                                 16                                     16
Multiprocessor support                       Not inherent                           Not inherent
Supply voltage                               3.3V I/O, 1.8V core                    3.3V I/O, 1.8V core
Power management                             Yes                                    Yes
On-chip timers (number/width)                2 x 32-bit                             2 x 32-bit
Cost                                         US$ 21.54                              US$ 21.54
Package                                      256-pin BGA                            256-pin BGA
External memory interface controller         Yes                                    Yes
JTAG                                         Yes                                    Yes

C6711 Datasheet: \Links\TMS320C6711.pdf
C6211 Datasheet: \Links\TMS320C6211.pdf

Floating vs. Fixed point processors
• Applications which require the following need a floating-point processor:
  – High precision.
  – Wide dynamic range.
  – High signal-to-noise ratio.
  – Ease of use.
• Drawbacks of floating-point processors:
  – Higher power consumption.
  – Can be more expensive.
  – Can be slower than fixed-point counterparts and larger in size.
• It is the application that dictates which device and platform to use in order to achieve optimum performance at a low cost.
• For educational purposes, use the floating-point device (C6711) as it can support both fixed and floating point operations.

General Purpose DSP vs. DSP in ASIC
• Application Specific Integrated Circuits (ASICs) are semiconductors designed for dedicated functions.
• The advantages and disadvantages of using ASICs are listed below:
  Advantages:
  – High throughput
  – Lower silicon area
  – Lower power consumption
  – Improved reliability
  – Reduction in system noise
  – Low overall system cost
  Disadvantages:
  – High investment cost
  – Less flexibility
  – Long time from design to market

[Figure: General-purpose DSP market in 2003]

System Considerations
• Performance
• Interfacing
• Power
• Size
• Ease-of-use
  – Programming
  – Interfacing
  – Debugging
• Cost
  – Device cost
  – System cost
  – Development cost
  – Time to market
• Integration
  – Memory
  – Peripherals

Texas Instruments' TMS320 family
• Different families and sub-families exist to support different markets.

C2000: Lowest Cost
  Control systems: motor control, storage, digital control systems.
C5000: Efficiency
  Best MIPS per Watt / Dollar / Size: wireless phones, Internet audio players, digital still cameras, modems, telephony, VoIP.
C6000: Performance & Best Ease-of-Use
  Multi-channel and multi-function applications: comm infrastructure, wireless base-stations, DSL, imaging, multi-media servers, video.

Texas Instruments' TMS320 family

TMS320C64x: The C64x fixed-point DSPs offer the industry's highest level of performance to address the demands of the digital age. At clock rates of up to 1 GHz, C64x DSPs can process information at rates up to 8000 MIPS with costs as low as $19.95. In addition to a high clock rate, C64x DSPs can do more work each cycle with built-in extensions. These extensions include new instructions to accelerate performance in key application areas such as digital communications infrastructure and video and image processing.
TMS320C62x: These first-generation fixed-point DSPs represent breakthrough technology that enables new equipment and energizes existing implementations for multi-channel, multi-function applications, such as wireless base stations, remote access servers (RAS), digital subscriber loop (xDSL) systems, personalized home security systems, advanced imaging/biometrics, industrial scanners, precision instrumentation and multi-channel telephony systems.

TMS320C67x: For designers of high-precision applications, C67x floating-point DSPs offer the speed, precision, power savings and dynamic range to meet a wide variety of design needs. These dynamic DSPs are the ideal solution for demanding applications like audio, medical imaging, instrumentation and automotive.

C6000 Roadmap

[Figure: C6000 roadmap, performance against time, with object code software compatibility across generations. 1st generation: C6201, C6202, C6203, C6204, C6205, C6211, C6701, C6711, C6712, C6713. 2nd generation: C6411, C6412, C6414, C6415, C6416, DM642. Further labels: C64x™ DSP 1.1 GHz, multi-core. Legend: C62x/C64x/DM642 fixed point; C67x floating point.]

'C6000 Floating-Point

[Figure: floating-point roadmap over time. Labels: C30, C31, C32, C33, 150 MFLOPS; C6711, C6712, 900 MFLOPS, 600 MFLOPS; C6701, 1 GFLOPS; C67x, 3 GFLOPS and beyond.]

TI Floating-Point Innovation

TI Floating Point - A History of Firsts:

Device (year)      First
'C30 (1987)        First commercially-successful floating-point DSP
'C40 (1991)        First floating-point DSP with multiprocessing support
'C32 (1995)        First $10 floating-point DSP
'C6701 (1998)      First 1-GFLOPS DSP
'C33 (1999)        First $5 floating-point DSP
'C6711 (1999)      First 2-level cache floating-point DSP
'C6712 (2000)      First to offer 600 MFLOPS for under $10

Useful Links
• Selection Guide:
  – \Links\DSP Selection Guide.pdf
  – \Links\DSP Selection Guide.pdf (3Q 2004)
  – \Links\DSP Selection Guide.pdf (4Q 2004)

Looking for Literature on DSP?
• “A Simple Approach to Digital Signal Processing” by Craig Marven and Gillian Ewers; ISBN 0-4711-5243-9
• “DSP Primer (Primer Series)” by C. Britton Rorabaugh; ISBN 0-0705-4004-7
• “Understanding Digital Signal Processing” by Richard G. Lyons; Prentice Hall; 2nd edition (March 15, 2004); ISBN 0-1310-8989-7
• “DSP First: A Multimedia Approach” by James H. McClellan, Ronald W. Schafer, and Mark A. Yoder; ISBN 0-1324-3171-8

Looking for Books on 'C6000 DSP?
• “Digital Signal Processing Implementation using the TMS320C6000™ DSP Platform” by Naim Dahnoun; ISBN 0-201-61916-4
• “C6x-Based Digital Signal Processing” by Nasser Kehtarnavaz and Burc Simsek; ISBN 0-13-088310-7
• “Real-Time Digital Signal Processing: Based on the TMS320C6000” by Nasser Kehtarnavaz; Newnes; Book & CD-ROM (July 14, 2004); ISBN 0-7506-7830-5
• “Digital Signal Processing and Applications with the C6713 and C6416 DSK (Topics in Digital Signal Processing)” by Rulph Chassaing; Wiley-Interscience; Book & CD-ROM (December 3, 2004); ISBN 0-4716-9007-4
• “Real-Time Digital Signal Processing from Matlab to C with the TMS320C6x DSK” by Thad B. Welch, Cameron Wright, and Michael Morrow; Book & CD-ROM (2006); ISBN 0-8493-7382-4

Chapter 1 Introduction - End -