Mixed-Signal Systems-on-Chip: Programmable Architectures and Design Tools

Alex Doboli, PhD, Associate Professor
Varun Subramanian and Anurag Umbarkar, PhD students
Department of Electrical and Computer Engineering
State University of New York, Stony Brook, NY 11794
Email: adoboli@ece.sunysb.edu

Requirements of Modern Systems
Signal sensing/data acquisition – data processing & data storing – actuation – data communication (networking)

The “Wish List”
• A new development paradigm that:
  – Offers much more efficient design methodologies
  – Supports high levels of abstraction (single thread and parallel)
  – Exploits design reusability & retargeting
  – Frees designers from low-level implementation details
  – Produces interoperable and portable designs
  – Supports hardware reconfigurability (analog & digital)
  – Provides extensive and extensible third-party device libraries
  – Is self-documenting
  – Is scalable and extensible

Research and Education Challenges
– Analog and mixed-signal (AMS) design is “art” and less “science”
  • Theory? Methodology? How to invent application-specific topologies & circuits?
– Architectures?
– Few CAD tools (mostly simulators: SPICE etc.)
  • Transistor is not a switch! (modeling)
  • Design usually at a low level (transistors, layout)
– Who will develop the architectures & tools?
– Who designs AMS systems? EE/CE/CS/all?

Some CAD-Related Issues
• Higher level of abstraction for development
  – What behavioral descriptions are synthesizable? DAEs, TFs, SFGs?
  – How do you correctly mix together continuous-time and discrete-time descriptions?
  – What standard specification notations? VHDL-AMS, Verilog-A, MATLAB/SIMULINK, UML?
• Frees designers from low-level implementation details
  – How do you synthesize a set of DAEs? (application-specific topologies)
  – Does the design work? (circuit modeling)
  – CAD tools for transistor sizing and layout (Neolinear, Cadence)
• Supports reconfigurability
  – Reconfigurable AMS architectures, reconfigurable ADCs, filters

Sound Localization Data Flow
(figure labels: C code; customized digital; customized analog)

Hardware Cost
• PSoC chip: $4.81
• Board (Radio Shack): $0.80
• Power jack: $0.40
• Regulator: $0.50
• 2x connectors: $2.00, 2 capacitors
• Total: $8.51 ($13.51 with LCD)
• Temperature sensing:
  • 1 thermistor: $2.26 + 1 resistor/node
• Sound-based localization: 7 resistors: $0.99 each, 2 capacitors: $1.49 each, 2 microphones: $2.89 each
• 1 radio module: $9.75, LCD module: $5
• Total: $38.95 (basic node plus the sound-based localization parts, radio module and LCD module)

Hardware Cost (cont’d)
(figure labels: basic node; sound-based localization node; sound-based localization node & wireless communication module)

Wireless Network Data Flow
(figure labels: RESULT; sound input; UART communication; radio communication; display result)

Mesh Type Wired Network with UART links

Presentation Outline
• PSoC: embedded mixed-signal architecture (PSoC 1)
• Insight into the main architectural modules and design flow
• Lab 1: Basics of PSoC programming (PSoC 3)
• Lab 2: More advanced programming (PSoC 3)
• Demo 3: Analog and mixed-signal frontends (PSoC 1)
• Lab 4: Networked applications (PSoC 1 and PSoC 3)
• Available educational material
• Conclusions

Requirements of Modern Systems
• A: Integrated mixed-domain electronic systems:
  – Integrated signal acquisition, processing, control & actuation
  – Co-design of analog, digital, firmware, and software
  – Sensing front-ends
  – Challenges: background on analog, digital, high-level programming, firmware, compilers, and OS

Requirements of Modern Systems (cont’d)
• B: Reconfigurable systems:
  – PSoC: digital, analog, I/O ports, interconnect, supply voltages & clock frequencies
  – Customize architecture to produce performance-optimized implementations
    • accuracy, speed, real-time constraints, energy/power consumption, cost, reliability, etc.
  – Challenges:
    • complex performance and cost trade-offs in analog, digital and software
    • circuit non-idealities and nonlinearities
    • difficult system integration and testing
• C: Networked systems:
  – Efficient data transfer among networked embedded systems
  – In PSoC: SPI, UART, I2C, PCI, USB, wireless, etc.

Requirements of Modern Systems (cont’d)
• Mixed-domain embedded systems involve broad knowledge:
  – Industry feedback suggests that this knowledge is acquired primarily through industry experience over a long period of time
  – Currently provided by unrelated EE, CE & CS disciplines
  – Difficult to understand the cross-disciplinary issues that arise in designing modern embedded applications
  – Curriculum must jointly study the related topics in analog and digital design as well as software
    • Analog signal sampling & time-optimized vector multiplication (ADC: not only analog but also code optimization)

Requirements of Modern Systems (cont’d)
Advantages and limitations of high-level design paradigms:
  – High-level design flows offer short time-to-market & low costs
  – Penalty on design performance
    • e.g., analog and mixed-signal circuits, high-performance digital circuits
  – Students must understand
    • the nature of performance trade-offs
    • nonidealities of electronic circuits, which can invalidate a high-level design

Main Concepts
• Integrated presentation of analog, digital & software:
  – Complete flow: analog signals at I/O ports, analog signal conditioning, filtering, ADC, interrupt service routines, digital signal processing, firmware, software routines for the control application, and actuation by PWMs and power transistors

Main Concepts (cont’d)
• Identification & formulation of inter-domain performance trade-offs:
  – Different facets of the cost-accuracy-speed-energy/power consumption trade-offs that manifest in mixed-signal design
    • signal bandwidth – conversion accuracy – software latency
    • cost – speed (hardware-software partitioning for speed)
    • flexibility – speed (addressing modes & data mapping to memory)

PSoC Mixed-Signal Array (PSoC 1)
• Application-specific customization of the reconfigurable SoC:
  • Trade-offs guide performance-optimized implementation
  • Developed and tested using a reconfigurable platform
  • Based on PSoC
©Alex Doboli 2008

Programmable Embedded Mixed-Signal Architectures
• Programmable PSoC mixed-signal SoC (PSoC 1)
• Main architectural features:
  – Hardware programmability
    • Programmable analog blocks
    • Programmable digital blocks
    • Programmable interconnect
    • Programmable I/Os
    • Programmable clocks
    • Selectable power supply
  – Integration as an SoC

PSoC 1: Mixed-Signal Architecture
• Analog blocks are programmable
  – Programmable functionality (control registers)
  – Programmable inputs & outputs
• Analog blocks of two types
  – Continuous-time blocks
  – Switched capacitor blocks (type C and type D)
• Connected to programmable I/O ports
• Programmable interconnect
  – Three kinds of programmable interconnect

PSoC 1: Mixed-Signal Architecture (cont’d)

PSoC 1 Programmable SC Blocks

Programmable Interconnect

PSoC Analog Blocks

Design Flow for Customization

Sound Localization Data Flow
(figure labels: HARDWARE; SOFTWARE)
Halupka, D., Mathai, J., Aarabi, P., and Sheikholeslami, A., “Robust sound localization in 0.18 μm CMOS”, IEEE Transactions on Signal Processing, 53(6), June 2005.
Lab 1 (Basics: Programming of PSoC 3)
• C & assembly programming using PSoC 3
• Programmable I/Os to read/write data
• Programmable clocks
• Debugging and troubleshooting
  • Debugger or using LCD displays

TDOA Estimation and Triangulation

Sound Localization Data Flow
(figure labels: HARDWARE; SOFTWARE)

Hanning Window
• Equation: w(n) = 0.5 (1 − cos(2πn / (N − 1)))

Fast Fourier Transform (FFT)
• Radix-2 DIT FFT (shown as a figure on the slide)

Phase Calculation
• The atan function provides a range from –π/2 to π/2
• The following method is used to extend the range to 0 to 2π:
  – Phase = θ + offset, where θ = atan(|y| / |x|) when the offset is 0 or π, and θ = atan(|x| / |y|) when the offset is π/2 or 3π/2
• The offset value is chosen from the signs of x and y:
  – If x is positive and y is positive, offset = 0
  – If x is negative and y is positive, offset = π/2
  – If x is negative and y is negative, offset = π
  – If x is positive and y is negative, offset = 3π/2

Maximum Likelihood
• Estimating Time Delay Of Arrival (TDOA) using Generalized Cross Correlation (GCC) with Phase Transform (PHAT) weighting factor
• Equation implemented using C (assembly and hardware): (shown as an image on the slide)
• Angle of arrival in radians = asin(τ·v / d), where τ is the estimated time delay, v is the speed of sound, and d is the distance between the microphones

Compiling a project using PSoC Creator
• Place and configure all hardware modules from the component catalog on the TopDesign.cysch window
• Write the C program in main.c
• Build the project
• Program the PSoC 3 module using the MiniProg

Debugging
• Click on the execute code button or press F5
• Red circles in the code indicate breakpoints, and the yellow arrow indicates the point at which the code is currently executing
• You can also watch the variables, memory and registers change as you step through the code

Debugging (2)
• Click on any of the buttons in the area highlighted in blue to execute, stop, step through or step over the code

Programmable I/Os
• Input and output ports can be manipulated in C or assembly using Special Function Registers (SFRs)
• The SFRPRTxSEL register is first modified to select the port pins that you intend to change, e.g. SFRPRT0SEL = 0x01 lets you change pin 0_0
• The SFRPRTxDR register is then used to write to an output pin, and SFRPRTxPS is used to read from an input pin, e.g. ‘SFRPRT0DR = 0x01’ sets pin 0_0 if it is an output pin, and ‘value = SFRPRT0PS’ reads the input pins of port 0 into the variable value

Programmable Clocks

Lab 2 (More Advanced Apps)
• Optimization by writing mixed C & assembly programs
• Optimization using hardware-software co-design
• Interfacing PSoCs using UART
• Receiving data using interrupts

Hanning Window code

Assembly Routine in PSoC3

Hardware Synthesis using Verilog
• In the Components tab, add a symbol and all of its input and output ports. Right-click on the window and select Generate Verilog to add the Verilog code
• Place the symbol on TopDesign.cysch in the same way as any other hardware module
• Connect Control Registers to the input pins and Status Registers to the output pins of the hardware module
• Use the API Control_Reg_Write to write data into the hardware module and Status_Reg_Read to read data from the hardware module

Hardware Synthesis using Verilog (2)

Hardware Synthesis using Verilog (3)

Placement of User Defined Hardware Module

Configuration of Hardware Module

Timing analysis of Bit Reversal Algorithm

  FFT Implementation        Time (μs)   Improvement w.r.t. C Implementation
  C implementation          48.5        -
  Hardware implementation   42.7        11.96%
  Assembly implementation   22.25       54.12%

Interfacing PSoCs using UART
• UART transmit operation
• UART receive operation

Configuring the UART module in PSoC3
• Place the UART module and set the I/O pins
• In the main function, use the appropriate APIs to start and operate the module
• Data is received using interrupts

Setting the RX Interrupt in PSoC3
• UART module and its interrupt configuration

Setting the RX Interrupt in PSoC3 (2)
• In the main function, use the appropriate APIs to start and enable the interrupt module and set the interrupt priorities
• Enable global interrupts
• Write the ISR in the CY_ISR (Interrupt) function in the interrupt.c file
• The RX interrupt pin goes high when an interrupt occurs, and the ISR executes if all interrupts are enabled

Configure UART Module and its Interrupts

Configure UART Module and its Interrupts (2)

UART Receive ISR

Demo 3 (Analog Frontend – PSoC 1)
• Building analog frontends using PSoC 1
• Signal conditioning
• Filters
• ΔΣ (delta-sigma) analog-to-digital converters
• Connecting to digital and software

Sound Localization Data Flow
(figure labels: HARDWARE; SOFTWARE)

Microphone Circuitry

PSoC Digital Blocks

PSoC Analog Blocks

PGA Reconfiguration

  Symbolic Name   Gain    Value
  PGA_G48_0       48.00   0Ch
  PGA_G24_0       24.00   1Ch
  PGA_G16_0       16.00   08h
  PGA_G8_00       8.00    18h
  PGA_G5_33       5.33    28h
  PGA_G4_00       4.00    38h
  PGA_G3_20       3.20    48h
  PGA_G2_67       2.67    58h
  PGA_G2_27       2.27    68h
  PGA_G2_00       2.00    78h
  PGA_G1_78       1.78    88h
  PGA_G1_60       1.60    98h
  PGA_G1_46       1.46    A8h
  PGA_G1_33       1.33    B8h
  PGA_G1_23       1.23    C8h
  PGA_G1_14       1.14    D8h
  PGA_G1_06       1.06    E8h
  PGA_G1_00       1.00    F8h
  PGA_G0_93       0.93    E0h
  PGA_G0_87       0.87    D0h
  PGA_G0_81       0.81    C0h
  PGA_G0_75       0.75    B0h
  PGA_G0_68       0.68    A0h
  PGA_G0_62       0.62    90h
  PGA_G0_56       0.56    80h
  PGA_G0_50       0.50    70h
  PGA_G0_43       0.43    60h
  PGA_G0_37       0.37    50h
  PGA_G0_31       0.31    40h
  PGA_G0_25       0.25    30h
  PGA_G0_18       0.18    20h
  PGA_G0_12       0.12    10h
  PGA_G0_06       0.06    00h

  void main(void)
  {
      PGA_Start();
      PGA_SetGain(0x0C);   // Gain 48 (PGA_G48_0)
      function1();
      PGA_SetGain(0x1C);   // Gain 24 (PGA_G24_0)
      function2();
      PGA_Stop();
  }

Filter Reconfiguration

  Corner Frequency (Hz)   C1 Value   C2 Value   C3 Value   C4 Value
  1000                    1          1          2          31
  2000                    1          1          7          16
  3000                    2          2          8          22
  4000                    3          3          9          26
  5000                    4          4          11         28
  6000                    5          5          12         30
  7000                    6          6          13         31
  8000                    6          6          16         28
  9000                    7          7          17         29
  10000                   8          8          18         31

  void main(void)
  {
      LPF2_Start();

      // 10 kHz corner frequency
      LPF2_SetC1(8);
      LPF2_SetC2(8);
      LPF2_SetC3(18);
      LPF2_SetC4(31);
      function1();

      // 5 kHz corner frequency
      LPF2_SetC1(4);
      LPF2_SetC2(4);
      LPF2_SetC3(11);
      LPF2_SetC4(28);
      function2();

      LPF2_Stop();
  }

Filter Reconfiguration Results
(each case is shown on the slides as sample plots and as a Hann-windowed FFT)
• Input: 1 kHz tone, SNR 20 dB
  – Fc 10 kHz: Noise RMS = 16.17
  – Fc 2 kHz: Noise RMS = 14.17
• Input: 1 kHz tone, SNR 0 dB
  – Fc 10 kHz: Noise RMS = 29.13
  – Fc 2 kHz: Noise RMS = 21.97
• Input: speech, SNR 20 dB
  – Fc 10 kHz: RMS = 70.63
  – Fc 3.5 kHz: RMS = 52.01
• Input: speech, SNR 0 dB
  – Fc 10 kHz: RMS = 41.63
  – Fc 3.5 kHz: RMS = 52.91

Lab 4 (PSoC Network)
• Interfacing PSoCs using the Cypress Wireless Module
• Writing SPI drivers to link a PSoC with the Wireless module
• Star type wireless network with Cypress Radio Modules
• Mesh type wired network with UART links
• Middleware routines for the mesh type network

Interfacing PSoCs using the CYWM6935 Radio Module

Interfacing PSoCs with the Radio Module using SPI
• SPI read sequence
• SPI write sequence

Interfacing with Radio Module

Star Type Wireless Network with Cypress Radio Modules

Network Data Flow
(figure labels: RESULT; sound input; UART communication; radio communication; display result)

PSoC Designer

Interfacing PSoCs using the CYWM6935 Radio Module

Mesh Type Wired Network with UART links

PSoC Network and Middleware Principle

Middleware Routines
• Routines to manage data structures of the nodes
  – Define regions and their associated parameters (Target Point, Paths, Path Probabilities, Precision of Data Acquisition, Sampling Rate)
  – Define Events and Actuation Procedures
  – Routines to Start and Reset the Network
• Routines to define Data Packets that communicate sensed/aggregated data within the network along the paths
• Routines to define Event Packets that detect events when they occur
• Routines to define Data Pool Packets that communicate collected data to the server

Developed Teaching Material
• Complete lecture material for a one-semester course:
  – Senior undergraduate and first-year graduate students in ECE
  – To be published as a textbook
  – Also available at www.cypress.com
  – For an evaluation copy, please contact cuap@cypress.com
• Related laboratory material for a one-semester, three-hour lab
• Student design projects:
  – Embedded controllers, encryption system (IDEA), temperature log systems, monitoring systems, telephone log systems, and audio/video display
• Research papers in peer-reviewed conference proceedings and journals (reconfigurable ΔΣ ADC):
  – SOCC 2006, DATE 2007, AHS 2008, IEEE Transactions on CAD 2007

Developed Teaching Material (cont’d)
• Goal: Teach fundamental, theoretical concepts and practical skills on designing and building embedded systems:
  1. Integrated presentation of analog, hardware, software & networking
  2. Performance & cost optimized through trade-off analysis
  3. Implementing embedded systems on reconfigurable platforms
• Specifics:
  1. High-level specifications express abstract data flow for signal acquisition and conversion, control procedures & actuation
  2. Examples in C language
  3. Comprehensive treatment of design trade-offs, trade-off formulation & analysis:
     • Speed, cost, power consumption, and precision
     • Different design solutions for each targeted performance
  4.
     Emphasis on performance modeling

Developed Teaching Material (cont’d)
• Applications based on SoC:
  – Implementing new functionality (e.g., data encryption)
  – Developing new interfacing capabilities (AMS frontends)
  – Improving performance by customizing the reconfigurable analog and digital hardware of the SoC
• Prerequisites: computer programming, digital design and analog circuits
• Expected outcomes: students learn to
  – Utilize a combination of analog and digital modules and develop software drivers for interfacing new devices
  – Develop system-level designs including specification, profiling, debugging & trade-off exploration/optimization
  – Customize the reconfigurable architecture for implementing new functionality and obtaining better performance

Developed Teaching Material (cont’d)
• Challenge:
  – Proper presentation “interface” between EE, CE & CS topics
  – Common ground for students with different backgrounds
  – Liaison for topics covered in more traditional courses
• Following abstraction levels:
  – Analog: macromodels with nonidealities and nonlinearities
  – Digital: FSM and basic blocks (registers, adders)
  – Microcontroller: instruction set architecture level
  – Software: three-layer structure – assembly, API routines, C
• Advantages:
  – Theoretical signal representation, sampling, quantization, precision, feedback, model composition, modularity, hierarchy
  – Electronic issues for speed-precision-cost-power trade-offs
  – Impact of circuit nonidealities on operation and performance

Lecture Material
Chapter 1. Introduction (1 week)
• Types of embedded applications, including a simple mixed-signal embedded design as an illustrative example
• Importance of performance requirements for design
• Summary of mixed-signal embedded architectures
• Top-down design flow (Design refinement. Performance modeling. Testing)
Chapters 2 & 3. Mixed-signal embedded SoC architectures (3 weeks)
• Mixed-signal SoC architectures
• Microcontroller core. Instruction set
• Memory system
• Interrupts
• I/Os. ISR and drivers

Textbook
Introduction to Mixed-Signal, Embedded Design, Alex N. Doboli and Edward H. Currie (available online at www.cypress.com/cua/; to be published by Springer Verlag)

Lecture Material (cont’d)
Chapters 4 & 5. Digital subsystem (3 weeks)
• Principles of RTL (Register Transfer Level) design
• Basic digital building blocks (timers, counters, CRC generator)
• Dynamic reconfiguration
• Developing simple application-specific co-processors using reconfigurable architectures
Chapters 6 & 7. Analog building blocks (2 weeks)
• Basics of switched capacitor analog circuits
• Presentation of basic building blocks (ideal op amps, comparators, gain stages, integrators)

Lecture Material (cont’d)
Chapters 8 & 9. Analog filters and converters (3 weeks)
• Filter characteristics
• Filter types
• Circuit non-idealities and their impact on filter performance
• Mapping filters to building blocks
• Analog-to-digital converters (ADC characteristics. ΔΣ ADC)
• Digital-to-analog converters
Chapter 10. System-level trade-off analysis (2 weeks)
• System performance modeling
• Trade-offs
• Trade-off analysis (cost – speed – power/energy consumption – number of pins)
• System optimization

Lab Material
• 12 lab sessions of 3 hours each
• Underlying theme: constructing a temperature-compensated fan controller
• Motivation:
  – Typical example of the projects an engineer will design when employed in industry
  – Incorporates analog, digital and software design
  – Introduces some basic control theory concepts
  – Requires the use of several different types of digital serial communication protocols

Lab Material (Topics)
• Lab 1: CPU and General Purpose I/O
• Lab 2: Interrupts
• Lab 3: Pulse Width Modulation and Global Outputs
• Lab 4: Three Wire Fans, Tachometers and Global Inputs
• Lab 5: Integrating Speed Controller
• Lab 6: I2C Serial Interface
• Lab 7: Analog Grounds and DACs
• Lab 8: Comparators
• Lab 9: Delta Sigma Modulation
• Lab 10: Measuring Temperature
• Lab 11: Filters
• Lab 12: Temperature Compensated Fan Controller with I2C Interface

Assessment Methods
Outcome (a): Ability to apply knowledge of math, science and engineering.
  Performance Criteria (PCs):
    • Building signal/data-flow descriptions for complex systems
    • Developing large-scale FSMs & logic equations optimized for speed and cost
  Assessment Procedure:
    • Three course projects that focus on developing an understanding of embedded systems
Outcome (c): Ability to design a system, component, or process to meet desired needs within realistic constraints.
  Performance Criteria (PCs):
    • Understanding performance trade-offs across mixed-domain modules
    • Designing efficient modules incorporating analog & digital hardware, drivers in assembly language and procedures in C
  Assessment Procedure:
    • Identify the main modules for the system performance and optimize the modules
    • Develop a module that requires extensive data processing and optimize the speed of the module
    • Develop three alternative design solutions and characterize their performance and cost
Outcome (g): Ability to communicate effectively.
  Performance Criteria (PCs):
    • Improving technical writing skills
    • Improving presentation skills
  Assessment Procedure:
    • Writing project reports
    • Presenting the projects
    • Writing lab reports
    • Extra credit for App Note

Assessment Methods
• Course Project 1 (sample):
  – Develop the routines in PSoC assembly language for a newspaper vending machine controller
  – Demo & written report describing the design, debugging and testing procedures, and the related experiments:
    – Presentation of the testing and debugging procedure to verify the correctness of the implementation
    – Development of a procedure that guarantees that the design is correct for as many situations as possible
    – Possibilities to improve the performance and cost of the design

Assessment Methods
• Course Project 2 (sample):
  – Develop and implement encryption & decryption (IDEA) optimized for execution speed
  – Specification of the application in C
  – Profiling (number of clock cycles & memory bytes)
  – Alternative implementations based on PSoC’s digital hardware blocks
  – Testing and simulation (test bench, simulation)

  Vector size   C code   C w. MAC   Assembly   Assembly w. MAC
  16            8958     6043       2861       390
  64            45177    23659      11932      1580
  256           -        -          52268      6188

Assessment Methods
• Course Project 3 (sample):
  – Develop and implement a temperature logging system
    • Monitors and records the temperature measured using a temperature sensor; computes the average temperature and standard deviation, and the average temperature and standard deviation over the last x minutes
    • Temperature values and statistics are transmitted through UART to a computer and stored in a file
  – Cost - Quality: Determine the computational resources required for temperature sampling, data processing, and data communication. How do these values differ for different values of x?
  – Real-time: If the temperature rises above temperature T1, then the system has to shut down immediately. How fast is “immediately” in the design?
  – Cost - Speed: Provide a design solution that maximizes the number of samples that are read and processed per second.
  – Submission: implementation and a report (solutions to all of the requested design issues, design details, motivations, experimental data, improvements)

Assessment Methods
• Course Project 3 (cont’d)

Conclusions
• Complete course material for a one-semester course:
  – Senior undergraduate and first-year graduate students in ECE
  – To be published as a textbook
  – Also available at www.cypress.com
  – For an evaluation copy, please contact cuap@cypress.com
• Goal: Teach fundamental, theoretical concepts and practical skills on designing and building embedded systems:
  1. Integrated presentation of analog, digital & software
  2. Performance & cost optimized through trade-off analysis
  3. Implementing on PSoC reconfigurable platform