Embedded systems
2018 2019
IV B. Tech I Semester (JNTUA-R15)
Mr. M. Jagadeesh Babu, Associate Professor
Unit 1 :Introduction to Embedded System
Embedded system introduction
Host and Target Concept
Embedded Applications
Features and Architecture considerations for Embedded systems: ROM, RAM, Timers
Data and Address Bus concept
Embedded Processor and their types
Memory types (Student seminar)
Overview of design process of embedded systems
Programming languages and tools for embedded design
Unit -II Embedded Processor Architecture
CISC Vs RISC design philosophy
Von-Neumann Vs Harvard architecture
Introduction to ARM architecture and Cortex-M series
Introduction to the TM4C family viz. TM4C123x & TM4C129x and
its targeted applications.
TM4C block diagram
Address space
on-chip peripherals (analog and digital) Register sets
Addressing modes and instruction set basics.
Unit III : Overview of Microcontroller and Embedded Systems
Embedded hardware and various building blocks
Processor Selection for an Embedded System
Interfacing Processor, Memories and I/O Devices
I/O Devices and I/O interfacing concepts
Timer and Counting Devices,
Serial Communication and Advanced I/O,
Buses between the Networked Multiple Devices
Embedded System Design
Co-design Issues in System Development Process
Design Cycle in the Development Phase for an Embedded System
Uses of Target System or its Emulator and In-Circuit Emulator (ICE)
Use of Software Tools for Development of an Embedded System
Design metrics of embedded systems - low power, high Performance,
UNIT-IV : Microcontroller fundamentals for basic programming
I/O pin multiplexing
pull up/down registers
GPIO control
Memory Mapped Peripherals,
programming System registers
Watchdog Timer
Need of low power for embedded systems
System clocks and Control
Hibernation Module on TM4C
Active Vs Standby current consumption
Introduction to Interrupts, Interrupt Vector Table, Interrupt
Basic Timer
Real Time clock (RTC)
Motion Control Peripherals : PWM
Module and Quadrature Encoder Interfacing (QEI)
Unit-V : Embedded communications protocols and Internet of Things
Synchronous/Asynchronous interfaces (UART, SPI, I2C, USB)
Communication Basics and Baud Rate Concepts
Interfacing digital and analog external device
Implementing and programming UART, SPI and I2C, SPI interface
using TM4C
Case Study: Tiva based embedded system application using the
interface protocols for communication
IoT overview and architecture
Overview of wireless sensor networks and design examples
Adding Wi-Fi capability to the Microcontroller
Embedded Wi-Fi
Building IOT Applications using CC3100 user API
Case study : Tiva Based Embedded Networking Applications :
Introduction to Embedded System
1.1 Embedded System
An embedded system is a combination of hardware and software with some attached
peripherals to perform a specific task or a narrow range of tasks with restricted resources. It is
an electronic system that is not directly programmed by the user, unlike a personal computer.
An embedded system is a device that incorporates a computer within its implementation,
primarily as a means to simplify the system design and to provide flexibility; the user of
the device is often not even aware that a computer is present. It is a microcontroller-based, software-driven, reliable, real-time control system that may be autonomous, human-interactive or network-interactive,
operating on diverse physical variables and in diverse environments, and sold in a competitive
and cost-conscious market. Generally, an embedded system is a subsystem in a larger system
and it is application specific. The generic block diagram of an embedded system is shown in
Figure 1.1. Every embedded system consists of certain input devices such as keyboards,
switches and sensors; output devices such as displays, buzzers and actuators; and a processor along
with a control program embedded in the off-chip or on-chip memory and, in many cases, a real-time operating
system (RTOS).
Figure 1.1: Block Diagram of a Generic Embedded System.
An embedded system exhibits different characteristics such as: Single functionality, No reprogrammability, Security, Reliability, Dependability, Robustness and Efficiency in terms of
cost, weight, energy, size, and speed. Designing the system to meet these characteristics is very
important in the success of the final product.
An embedded system is a specialized computer system that is part of a larger system or machine. Typically, an
embedded system is housed on a single microprocessor board with the programs stored in
ROM. Virtually all appliances that have a digital interface (watches, microwaves, VCRs,
cars) utilize embedded systems. Some embedded systems include an operating system, but
many are so specialized that the entire logic can be implemented as a single program.
Embedded systems programming is the development of programs intended to be part of a larger
operating system or, in a somewhat different usage, to be incorporated on a microprocessor
that can then be included as part of a variety of hardware devices. Several other definitions are:
A combination of computer hardware and software, and perhaps additional mechanical
or other parts, designed to perform a dedicated function. In some cases, embedded
systems are part of a larger system or product, as in the case of an antilock braking
system in a car. Contrast with general-purpose computer.
A specialized computer system which is dedicated to a specific task. Embedded systems
range in size from a single processing board to systems with operating systems (e.g.
Linux, Windows® NT Embedded). Examples of embedded systems are medical
equipment and manufacturing equipment.
A computer system that is a component of a larger machine or system. Embedded
systems can respond to events in real time. Most digital appliances, such as watches or
cars, utilize an embedded system.
Hardware and software that forms a component of some larger system and is expected
to function without human intervention. Typically an embedded system consists of a
single-board microcomputer with software in ROM, which starts running a dedicated
application as soon as power is turned on and does not stop until power is turned off.
An embedded system is some combination of computer hardware and software, either
fixed in capability or programmable, that is specifically designed for a particular kind
of application device. Industrial machines, automobiles, medical equipment, cameras,
household appliances, airplanes, vending machines, and toys (as well as the more
obvious cellular phone and PDA) are among the myriad possible hosts of an embedded system.
A phrase that refers to a device that contains computer logic on a chip inside it. Such
equipment is electrical or battery powered. The chip controls one or more functions of
the equipment, such as remembering how long it has been since the device last received
An embedded system is a special-purpose computer system, which is completely
encapsulated by the device it controls. An embedded system has specific requirements
and performs pre-defined tasks, unlike a general-purpose personal computer.
1.1.2 Characteristics of an Embedded System:
The important characteristics of an embedded system are
Speed (bytes/sec) : Should be high speed
Power (watts) : Low power dissipation
Size and weight : As far as possible small in size and low weight
Accuracy (% error) : Must be very accurate
Adaptability: High adaptability and accessibility.
Reliability: Must be reliable over a long period of time.
So, an embedded system must perform its operations at high speed so that it can be
readily used for real-time applications; its power consumption must be very low; the
size of the system should be as small as possible; and the readings must be accurate with
minimum error. The system must also be easily adaptable to different situations.
1.1.3 Categories of Embedded systems:
Embedded systems can be classified into the following 4 categories based on their
functional and performance requirements.
1. Stand-alone embedded systems
2. Real time embedded systems
a) Hard real time E.S
b) Soft real time E.S
3. Networked embedded systems
4. Mobile embedded systems
Based on the performance of the microcontroller, they are also classified as:
1. Small scale embedded systems
2. Medium scale embedded systems
3. Large scale (sophisticated) embedded systems
Stand-alone Embedded systems:
A stand-alone embedded system works by itself. It is a self-contained device which
does not require any host system like a computer. It takes either digital or analog inputs from
its input ports, calibrates, converts, and processes the data, and outputs the resulting data to its
attached output device, which either displays data, or controls and drives the attached devices.
Temperature measurement systems, Video game consoles, MP3 players, digital
cameras, and microwave ovens are the examples for this category.
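The calibrate, convert and process steps of a stand-alone temperature measurement system can be sketched in C as below. This is only an illustrative sketch: the 12-bit ADC, the 3300 mV reference, the 10 mV per degree Celsius sensor and all function names are assumptions made for the example, not details of any particular device.

```c
#include <stdint.h>

/* Assumed hardware (hypothetical, for illustration only):
   12-bit ADC, 3300 mV reference, sensor output of 10 mV per degree Celsius. */
#define ADC_MAX_COUNT   4095u
#define VREF_MILLIVOLT  3300u
#define MV_PER_DEG_C    10

/* Convert step: raw ADC count -> millivolts at the input pin. */
static uint32_t adc_to_millivolt(uint16_t raw)
{
    if (raw > ADC_MAX_COUNT) {
        raw = ADC_MAX_COUNT;          /* clamp out-of-range readings */
    }
    return ((uint32_t)raw * VREF_MILLIVOLT) / ADC_MAX_COUNT;
}

/* Calibrate and process step: apply a calibration offset (in mV) and
   return the temperature in tenths of a degree Celsius. */
static int32_t millivolt_to_decidegree(uint32_t mv, int32_t offset_mv)
{
    int32_t corrected = (int32_t)mv + offset_mv;
    return (corrected * 10) / MV_PER_DEG_C;
}

/* Full pipeline: the value returned here is what would be sent to the
   attached output device (display, relay driver, and so on). */
int32_t read_temperature_decidegree(uint16_t raw, int32_t offset_mv)
{
    return millivolt_to_decidegree(adc_to_millivolt(raw), offset_mv);
}
```

For example, under these assumptions a raw count of 310 with no calibration offset corresponds to 249 mV, that is 24.9 degrees Celsius. Integer arithmetic is used because small microcontrollers often have no floating-point unit.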
Real-time embedded systems:
An embedded system which gives the required output within a specified time, or which
strictly follows the deadlines for completion of a task, is known as a real-time system, i.e.
a real-time system satisfies the time constraints in addition to being functionally correct.
There are two types of real-time systems: (i) soft real-time systems and (ii) hard real-time
systems.
Soft Real-Time system: A real-time system in which the violation of time constraints
causes only degraded quality, while the system can continue to operate, is known
as a soft real-time system. In soft real-time systems, the design focus is to offer a
guaranteed bandwidth to each real-time task and to distribute the resources among the tasks.
Ex: A microwave oven, washing machine, TV remote etc.
Hard Real-Time system:
A real-time system in which the violation of time
constraints causes critical failure, loss of life, property damage or catastrophe
is known as a hard real-time system.
These systems usually interact directly with physical hardware instead of through a human
being. The hardware and software of hard real-time systems must allow a worst-case execution time
(WCET) analysis that guarantees that execution is completed within a strict deadline. The chip
selection and RTOS selection become important factors in hard real-time system design.
Ex: A missed deadline in a missile control embedded system, a delayed alarm during a gas leakage,
a car airbag control system, a delayed response in a pacemaker, a failure in RADAR functioning.
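The difference between the two types can be sketched as a deadline check performed when a task completes. This is a minimal sketch under stated assumptions: the type names, the action names and the free-running microsecond counter that supplies the timestamps are all hypothetical and chosen only for illustration.

```c
#include <stdint.h>

typedef enum { TASK_SOFT, TASK_HARD } task_class_t;
typedef enum { ACTION_NONE, ACTION_DEGRADE, ACTION_FAIL_SAFE } action_t;

/* Decide what to do after a task has finished.
   start_us and end_us are assumed to come from a free-running 32-bit
   microsecond counter; unsigned subtraction gives the correct elapsed
   time even if the counter wrapped around once in between. */
action_t check_deadline(task_class_t cls, uint32_t start_us,
                        uint32_t end_us, uint32_t deadline_us)
{
    uint32_t elapsed_us = end_us - start_us;

    if (elapsed_us <= deadline_us) {
        return ACTION_NONE;           /* time constraint satisfied */
    }
    /* Deadline missed: a soft task only loses quality,
       a hard task must be treated as a system failure. */
    return (cls == TASK_HARD) ? ACTION_FAIL_SAFE : ACTION_DEGRADE;
}
```

A check of this kind only detects a missed deadline at run time. For a hard real-time system the design must still show, through WCET analysis, that the miss cannot occur in the first place.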
Networked embedded systems:
Networked embedded systems are connected to a network through network interfaces to
access resources. The connected network can be a Local Area Network (LAN), a Wide
Area Network (WAN), or the Internet. The connection can be either wired or wireless.
The networked embedded system is the fastest growing area in embedded systems
applications. The embedded web server is such a system where all embedded devices are
connected to a web server and can be accessed and controlled by any web browser.
Ex: A home security system is an example of a LAN networked embedded system where all
sensors (e.g. motion detectors, light sensors, or smoke sensors) are wired and running on the
TCP/IP protocol.
Mobile Embedded systems:
The portable embedded devices like mobile and cellular phones, digital cameras, MP3
players and PDAs (Personal Digital Assistants) are examples of mobile embedded systems. The
basic limitation of these devices is the limited memory and other resources.
Based on the performance of the microcontroller, embedded systems are also classified into (i) small
scale embedded systems, (ii) medium scale embedded systems and (iii) large scale embedded
systems.
1.1.4 Classifications of Embedded systems
1. Small Scale Embedded Systems: These systems are designed with a single 8- or 16-bit microcontroller; they have little hardware and software complexity and involve
board-level design. They may even be battery operated. When developing embedded
software for these, an editor, assembler and cross assembler, specific to the
microcontroller or processor used, are the main programming tools. Usually, C is used
for developing these systems. C programs are compiled into assembly, and the
executable codes are then appropriately located in the system memory. The software
has to fit within the memory available and keep in view the need to limit power
dissipation when the system is running continuously.
2. Medium Scale Embedded Systems: These systems are usually designed with a single
or few 16- or 32-bit microcontrollers or DSPs or Reduced Instruction Set Computers
(RISCs). These have both hardware and software complexities. For complex software
design, there are the following programming tools: RTOS, Source code engineering
tool, Simulator, Debugger and Integrated Development Environment (IDE). Software
tools also provide the solutions to the hardware complexities. An assembler is of little
use as a programming tool. These systems may also employ the readily available ASSPs
and IPs (explained later) for various functions, for example bus interfacing,
encrypting, deciphering, discrete cosine transformation and inverse transformation,
TCP/IP protocol stacking and network connecting functions.
3. Sophisticated Embedded Systems: Sophisticated embedded systems have enormous
hardware and software complexities and may need scalable processors or configurable
processors and programmable logic arrays. They are used for cutting edge applications
that need hardware and software co-design and integration in the final system; however,
they are constrained by the processing speeds available in their hardware units. Certain
software functions such as encryption and deciphering algorithms, discrete cosine
transformation and inverse transformation algorithms, TCP/IP protocol stacking and
network driver functions are implemented in the hardware to obtain additional speeds
by saving time. Some of the functions of the hardware resources in the system are also
implemented by the software. Development tools for these systems may not be readily
available at a reasonable cost or may not be available at all. In some cases, a compiler
or retargetable compiler might have to be developed for these.
1.2 Host and Target machine
An embedded system is a special-purpose system in which the computer is completely
encapsulated by the device it controls. Each embedded system has unique characteristics. The
components and functions of hardware and software can be different for each system.
Nowadays, embedded software is used in all the electronic devices such as watches, cellular
phones etc. This embedded software is similar to general programming. But the embedded
hardware is unique. The method of communication between interfaces can vary from processor
to processor. It leads to more complexity of software. Engineers need to be aware of the
software developing process and tools. There are a lot of things that software development
tools can do automatically when the target platform is well defined. This automation is possible
because the tools can exploit features of the hardware and operating system on which your
program will execute. Embedded software development tools can rarely make assumptions
about the target platform. Hence the user has to provide some explicit instructions of the system
to the tools. Figure 1.2 shows how an embedded system is developed using a host and a target machine.
Figure 1.2 Embedded system using host and target machine
Host machine:
The application program is developed on the host computer. The host computer is also called
the development platform. It is a general-purpose computer with a higher-capability processor,
more memory and a range of input and output devices. The compiler, assembler, linker,
and locator run on the host computer rather than on the embedded system itself. Freely available
tools (for which even the source code is free) are extremely popular with embedded software
developers because they support many of the most popular embedded processors. The host contains
the development tools needed to create the output binary image. Once a program has been written,
compiled, assembled and linked, it is moved to the target platform.
Program Development Tool Kit
1. Program development tool kit or IDE
2. Assembly mnemonics or C++ or Java or Visual
C++, entered using the keyboard of the host system (PC).
3. Using GUIs for allowing the entry, addition, deletion, insertion and appending of previously
written lines or files, and merging records and files at specific positions.
4. Create source file that stores the edited file.
5. File given an appropriate name by the programmer
6. Can use previously created files
7. Can also integrate the various source files.
8. Can save different versions of the source files.
9. Compiler, cross compiler, assembler,
Target machine
The output binary image is executed on the target hardware platform. It consists of two entities
- the target hardware (processor) and runtime environment (OS). It is needed only for final
1. The target system differs from the final system.
2. The target system interfaces with the computer and also works as a standalone system.
3. In the target system, the codes might be downloaded repeatedly during development.
4. A copy of the target system is made that later functions as the embedded system.
5. The designer later simply copies it into the final system or product.
6. The final system may employ ROM in place of flash, EEPROM or EPROM in the embedded system.
1. Philips LPC 21xx development board
2. MSP 430 development board
3. TIVA TM4Cxxx development board
1.3 Embedded Applications
Embedded systems used in various applications are listed in Table 1.1. It shows that embedded
systems have rapidly emerged as an important computing discipline because of the technology
convergence in computers, consumer electronics, communications, entertainment etc. Further,
new applications in medical electronics, mobile communications etc. are continuously
evolving and are being added to fulfil the ever-growing requirements of users.
Table 1.1 Embedded System: Application
Home Appliances: Dishwasher, washing machine, microwave, set-top box, security system, HVAC system, DVD, answering machine, garden sprinkler systems etc.
Office Automation: Fax, copy machine, smart phone system, modem, scanner, printers.
Face recognition, finger recognition, eye recognition, building security system, airport security system, and alarm system.
Smart board, smart room, OCR, calculator, smart
Signal generator, signal processor, power supplier, Process instrumentation,
Router, hub, cellular phone, IP phone, web camera
Fuel injection controller, anti-locking brake system, air-bag system, GPS, cruise control.
MP3, video game, Mind Storm, smart toy.
Navigation system, automatic landing system, flight attitude controller, space explorer, space robotics.
Industrial automation: Assembly line, data collection system, monitoring systems on pressure, voltage, current, temperature, hazard detecting system, industrial robot.
PDA, iPhone, palmtop, data organizer.
CT scanner, ECG, EEG, EMG, MRI, Glucose monitor, blood pressure monitor, medical diagnostic device.
Banking & Finance: ATM, smart vendor machine, cash register, Share
Elevators, tread mill, smart card, security door etc.
1.4 Features of an Embedded System
Embedded systems are used not only in everyday
products but also as wholly or partially indispensable components in many high-end
applications such as military, scientific research and telecommunication systems. They range in size from
hand-held cell phones to components of nuclear missiles. Irrespective of its size, an embedded system necessarily
consists of some hardware and software designed to work on that hardware. So
embedded systems are called products of hardware and software co-design. Features of
the different hardware and software units of embedded systems are explained in the following sections.
1.4.1 Hardware features of standalone embedded systems
A standalone embedded system includes different types of processors, a power supply unit, clock,
reset circuit and memories, which are considered to be the most essential hardware components of
standalone embedded systems. A brief discussion of the important features of these components
is given below.
Different types of processors used
Processor: A processor is the heart of the embedded system. It is responsible for the execution of
instructions and for controlling the flow of data to and from the processor. The designer should have proper
knowledge of the efficiency of the different types of processors and should
select the appropriate processor as per the requirement. The different types of processors available can
be categorized into four broad categories: (1) General purpose processor (GPP), (2) Application
specific system processor (ASSP), (3) Multiprocessor system and (4) GPP core or ASIP core.
A GPP has the usage advantages over other processors because of
a). Having predefined known instruction set resulting fast system development.
b). Board and I/O Interfaces designed for GPP can be used for different system changing the
c) Ready availability of computer facilities in high level language along with compiler and
debugger, resulting in fast development of a new system.
a) General Purpose Processor (GPP) may be any one of Microprocessor, Microcontroller,
Embedded processor, Digital Signal Processor (DSP) and Media processor.
A microprocessor is a single VLSI chip that has a CPU and may also have caches, a floating-point
arithmetic unit, and pipelining and superscalar units; the latter units may be present for faster
processing. RAM is externally connected to the CPU. A microcontroller is also a single-chip
VLSI unit, with limited computational capability, that keeps all functional units/components
inside the chip. An embedded processor may be a microprocessor or a microcontroller that is
designed specially to achieve fast context switching resulting in lower latency,
atomic ALU operations with no shared-data problem, and a RISC core for fast and precise
calculations. ARM family processors, Intel i960 etc. belong to this class. A DSP as a GPP
is a single-chip VLSI unit having the computational capabilities of a microprocessor and one or more multiply
and accumulate (MAC) units. A DSP is an essential unit of an embedded system that needs
very long instruction word (VLIW) processing capabilities. It very efficiently processes
Single Instruction Multiple Data (SIMD) operations, the Discrete Cosine Transformation (DCT) and the
Inverse Discrete Cosine Transformation (IDCT). DCT and IDCT are most useful in
algorithms for signal analysis, coding, filtering, noise cancellation, echo elimination etc.
b) Application Specific System Processor (ASSP): An ASSP is dedicated to faster processing
and is useful for applications like real-time video processing, which involves a lot of
processing before transmission. It may also include some features of an RTOS. An ASSP provides
a hardwired solution for most of its time-consuming tasks. For example, the ASSP chip i2Chip
has TCP, UDP, IP, ARP and Ethernet 10/100 MAC (Media Access Control) hardwired
logic included in it. In practice, an ASSP is used as an additional processing unit for
running the application-specific tasks in place of processing using embedded software.
c) Multiprocessor System: As an embedded algorithm has to work within a strict deadline,
sometimes it may not be possible to meet the deadline with a single processor. In real-time
video processing, the number of MAC operations required may be more than is possible
with one DSP unit. In such a case an embedded system may go for two or more processors.
A similar requirement may arise in modern cell phones, which have to perform a number
of tasks. Multiprocessors are used for
different tasks that have to be performed concurrently. The operations of all processors are
synchronized to obtain optimum performance.
d) GPP or ASIP core: A GPP core or ASIP core is integrated into either an Application
Specific Integrated Circuit (ASIC), a VLSI circuit or an FPGA (Field Programmable Gate Array)
core integrated with processor units. A more recent innovation in this area is the System on
Chip (SOC). An SOC may be embedded with multiple processors, memories, multiple
standard source solutions called IP (Intellectual Property) cores and other logic and analog
units. It may also have a network protocol embedded in it. It can embed DSP applications
and an FPGA core.
For a number of applications a GPP core may not be a suitable solution.
Various security applications, smart cards, video games, mobile Internet, Gbps transceivers,
Gbps LANs and missile systems need a special processing unit on a VLSI design circuit to
function as a processor. These units are called Application Specific Instruction Processors
(ASIP). Sometimes an application needs both a configurable processor (FPGA or ASIP) and
a non-configurable processor (DSP, microprocessor or microcontroller)
on a chip. Such designs are generally important in killer applications
(applications that are useful to millions of people) such as HDTV, cell phones etc.
Power supply unit
Generally an embedded system has its own power supply unit. Four voltage ranges, (i) 5.0V
± 0.25V, (ii) 3.3V ± 0.3V, (iii) 2.0V ± 0.2V and (iv) 1.5V ± 0.2V, are used for the operation of different
units. Additionally, a 12V ± 0.2V supply is needed for a flash or EEPROM and for RS232 serial
interfaces. The supply of voltage to the chip depends on the number of pins provided in the chip,
which generally come in supply and ground pairs. A processor may have more than two pins
for Vdd and Vss, which are responsible for the distribution of power to, and the reduction of interference in,
all the sections. The supply should separately power the (a) external I/O driving ports, (b) timers
and (c) clock and reset circuits. The clock and reset circuits should be specially designed to be
free from radio frequency interference. Some systems do not have their own power supply and are
either connected to an external power supply or use a charge pump for the necessary power
supply. Examples of the first type are a Network Interface Card (NIC) and a graphics
accelerator, which do not have their own power supply and are connected to the PC power-supply
line. In the second type, a charge pump brings power from a non-supply line. It consists of a
diode in series followed by a charging capacitor. The diode gets its forward bias input
from an external signal, say the RTS (Request to Send) signal in the case of a mouse used with a
computer. The charge pump inside the mouse stores charge in the inactive state and dissipates
power when the mouse is used.
An embedded system has to perform tasks continuously from power-up to power-off
and may even be kept on continuously. Real-time operating systems (RTOS) use Wait and Stop
instructions and disable certain units when they are not needed. This is very important for
saving power during program execution. Performing tasks at a reduced clock rate is also a
way to control power dissipation. Software performance analysis during the design phase
can also include power dissipation considerations. A good design must balance the
conflicting needs of low power dissipation and fast, efficient program execution.
Clock Oscillator
The function of the oscillator circuit is to provide an accurate and stable periodic clock
signal to the processor. The processor needs a clock oscillator because the clock controls the various
clocking requirements of the CPU. The clocking requirements are the system timers and the CPU
machine cycles. The machine cycle includes (i) fetching code and data from memory, (ii)
decoding and execution and (iii) transferring results to memory. The clock controls the
time for executing an instruction. The clock circuit uses either a crystal (external to the
processor), a ceramic resonator (internally associated with the processor) or an external
IC attached to the processor. (a) The crystal resonator gives the highest stability in
frequency with temperature and drift in the circuit. The crystal is used in association with an
appropriate resistance in parallel and a pair of series capacitances at both pins. The crystal
is kept as near as feasible to the two pins of the processor. (b) The internal ceramic
resonator, if available in a processor, saves the use of the external crystal and gives a
reasonably, though not very highly, stable frequency. (c) The external IC-based clock oscillator has
a significantly higher power dissipation compared to the internal processor resonator. It
provides a higher driving capability, which might be needed when the various
circuits of an embedded system are driven concurrently, e.g. in multiprocessor-based systems.
Real time clock or timer units
A timer suitably configured as the system clock is sometimes referred to as the RTC (Real Time
Clock). The RTC is used by the scheduler for real-time programming. A hardware timer is a counter
that is incremented at a fixed rate when the system clock pulses. There are several different
types of timers available. A timer/counter can perform several different tasks. The CPU
uses the timer to keep track of time accurately. The timer can generate a stream of pulses
or a single pulse at different frequencies. It can be used to start and stop tasks at desired
times. A COP (computer operating properly) or watchdog timer checks for runaway code
execution. The hardware implementation of watchdog timers varies considerably between
different processors. In general, watchdog timers must be turned on once within the first
few cycles after reset and then reset periodically by software. Some watchdog timers can be
programmed for different time-out delays. The reset sequence is sometimes as simple as a
specialized instruction or as complex as sending a sequence of bytes to a port. Watchdog
timers either reset the processor or execute an interrupt when they time out.
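The watchdog behaviour described above can be modelled in software as a down-counter that the application must reload before it reaches zero. This is a simulation sketch only, not the register interface of any real microcontroller; the structure and function names (watchdog_t, wdt_enable, wdt_kick, wdt_tick) are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Software model of a watchdog timer (illustrative, not real hardware). */
typedef struct {
    uint32_t timeout_ticks;  /* programmed time-out delay in clock ticks */
    uint32_t counter;        /* ticks remaining before the time-out      */
    bool     enabled;        /* set once, shortly after reset            */
    bool     expired;        /* latched when the time-out has occurred   */
} watchdog_t;

/* Turn the watchdog on with a chosen time-out delay. */
void wdt_enable(watchdog_t *w, uint32_t timeout_ticks)
{
    w->timeout_ticks = timeout_ticks;
    w->counter = timeout_ticks;
    w->enabled = true;
    w->expired = false;
}

/* Periodic software reset ("kick"): a healthy program calls this regularly. */
void wdt_kick(watchdog_t *w)
{
    if (w->enabled && !w->expired) {
        w->counter = w->timeout_ticks;
    }
}

/* Called once per clock tick. Returns true once the watchdog has timed
   out; real hardware would then reset the processor or raise an interrupt. */
bool wdt_tick(watchdog_t *w)
{
    if (!w->enabled) {
        return false;
    }
    if (w->expired) {
        return true;
    }
    if (w->counter > 0u) {
        w->counter--;
    }
    if (w->counter == 0u) {
        w->expired = true;
    }
    return w->expired;
}
```

In this model, a program that hangs stops calling wdt_kick, so the counter runs down and wdt_tick reports the time-out. Once expired, the model stays expired until wdt_enable is called again, which mimics the system restart.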
More than one timer using the RTC may be needed for various timing and counting
needs. There may be hardware and software implementations of timers. At least one
hardware timer device is a must in a system; it is used as the system clock. The hardware
timer gets its input from a clock-out signal from the processor and activates the system clock
as per the number of ticks present at the hardware timer. The number of hardware timers present
is generally limited.
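Since a hardware timer simply counts clock pulses, the count needed for a given period follows from the clock frequency. The sketch below shows this arithmetic. The function names, the prescaler parameter and the 16 MHz clock used in the usage note are assumptions for illustration and do not describe a specific timer peripheral.

```c
#include <stdint.h>

/* Number of timer counts needed for a period of period_us microseconds
   when the timer is clocked at clock_hz divided by prescaler.
   counts = (clock_hz / prescaler) * (period_us / 1,000,000) */
uint32_t timer_counts_for_period(uint32_t clock_hz, uint32_t prescaler,
                                 uint32_t period_us)
{
    if (prescaler == 0u) {
        return 0u;                       /* invalid configuration */
    }
    uint64_t counts = ((uint64_t)clock_hz * period_us)
                    / ((uint64_t)prescaler * 1000000u);
    return (counts > UINT32_MAX) ? UINT32_MAX : (uint32_t)counts;
}

/* Returns 1 if a counter that is width_bits wide can count that many
   ticks before it overflows, otherwise 0. */
int timer_fits(uint32_t counts, unsigned width_bits)
{
    if (counts == 0u || width_bits == 0u || width_bits > 32u) {
        return 0;
    }
    if (width_bits == 32u) {
        return 1;
    }
    return counts <= (UINT32_C(1) << width_bits);
}
```

For example, with an assumed 16 MHz clock and no prescaling, a 1 ms period needs 16000 counts, which fits a 16-bit timer. A 10 ms period needs 160000 counts, which does not fit, so a prescaler or a wider timer would be required.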
A software timer is software that executes and increases or decreases a count variable
(count value) on an interrupt from a timer output or on a real-time clock interrupt. A software
timer can also generate an interrupt on overflow of the count value or on reaching the final value of the count
variable. Software timers are used as virtual timing devices. There are a number of control
bits and time-out status flags in each timer device. A timer device, when given count inputs
in place of clock pulses, performs as a counting device.
Interrupt Handlers
A system possesses a number of devices and the system processor has to control and
handle the requirements of devices by running appropriate Interrupt Service Routine (ISR)
for each. An interrupt handling mechanism must exist in each system to handle interrupt
from various processes in the system. An interrupt is an event that suspends regular
program operation while the event is serviced by another program. Interrupts increase the
response speed to external events. Different microcontrollers have different interrupt
sources which can include external, timer and serial port interrupts. When an interrupt is
received the current operation is suspended, the interrupt is identified and the controller
jumps (vectors) to an interrupt service routine. There are two sources of interrupt: hardware
and software. Hardware interrupts include a signal to a pin, timer overflow, and serial port
interrupts. Software interrupts are commands given by the programmer. There are two
different interrupt types: maskable and non-maskable. A maskable interrupt can be
disabled and enabled while non-maskable interrupts cannot be disabled and are therefore
always enabled. Most 8-bit microcontrollers use vectored arbitration interrupts. Vectored
arbitration means that when a specific interrupt occurs the interrupt handler automatically
branches to an address associated with that interrupt. The servicing of interrupts in general
is dictated by the status of the GIE (Global Interrupt Enable) bit. GIE is cleared when an
interrupt occurs, and all further interrupts are held pending until it is set again.
Reset circuit and Watchdog timer
On reset, the processor starts execution from a default starting address; execution also starts from
this address when the system is powered up. The reset circuit activates for a fixed period (a few
clock cycles) and then deactivates to let the program proceed from the default beginning
address. Once the reset that follows processor activation is deactivated, the program
executes from the start-up address. Reset can be activated by an external reset circuit that
activates on power up, by a software instruction, or by a programmed timer known as a
watchdog timer. A watchdog timer is a timing device that resets the system after a predefined
timeout. This time is usually configured, and the watchdog timer activated, within the first
few clock cycles after power up. It has many applications. In many embedded systems
reset by a watchdog timer is essential because it rescues the system from
program hangs. After the restart the program can function normally.
Memories
An embedded system makes use of different types of memory based on their features. These are
shown in the following chart and may be briefly described by their function:
(i) Internal RAM for registers, temporary data and stack.
(ii) Internal ROM/PROM/EPROM for the application program.
(iii) External RAM for temporary data and stack.
(iv) Internal cache, available in some microcontrollers and microprocessors.
(v) EEPROM or flash memory for saving results.
(vi) External ROM or PROM for embedding software, used in non-microcontroller based systems.
(vii) RAM memory buffers at ports.
(viii) Caches for superscalar processors.
Figure 1.4: Various forms of system memory
Different types of memory devices in varying sizes are available for use as per
requirement. These are:
(1) Masked ROM, EPROM or flash, which stores the embedded
software (ROM image). Masked ROM is for bulk manufacturing.
(2) EPROM or EEPROM, which is used in the testing and design stages.
(3) EEPROM (5V form), which is used to store
the results during the system program run time. It is erased byte by byte and written during
the system run. It is useful for storing modifiable bytes, for example run time system status,
time and date. Flash is very useful when a processed image or voice is to be stored, or when a
data set or system configuration data is to be stored which can be upgraded as and when
required. In a flash, new images can be stored after compressing and processing, and the
old one is erased from a sector in a single erase operation. In boot block flash an OTP
(one-time programmable) sector is reserved that is written once only, at the time of first boot. It stores the boot program and
initial data or permanent system configuration data. This OTP sector can be used to store the
ROM image.
(4) RAM, which is mostly used in SRAM form in a system. Advanced systems use
RAM in the form of DRAM, SDRAM or RDRAM.
(5) Parameterised distributed RAM, which
is used when I/O devices and subunits require a memory buffer.
(6) Separate blocks of RAM, which are used by subunits like a MAC
that operate at high speed.
1.4.18 Input / Output units and buses
The system gets input from physical devices such as keypads/keyboards, sensors and
transducer circuits. It gets the values by read operations at the port address. The system
has output ports through which it sends output bytes to the real world. It sends the values
to an output by a write operation at the port address. For some devices a port may be
used as both an input and an output port. One example is a mobile phone, which sends as
well as receives signals. There are two types of I/O ports: (i) parallel ports and (ii) serial
ports. In a serial port, the system gets a serial stream of bits at an input and sends the signal as
bits, for example through a modem. A serial port facilitates long distance communications and
interconnections. A serial port may be a serial UART, a serial synchronous port or another serial
interfacing port. A system may get inputs from multiple channels or may have to send
output to multiple channels. A multiplexer selects one of several input channels and passes
it on to the system. A demultiplexer takes an output from the system and routes it to
a selected output channel. A system might have to be connected to a number of other devices and
systems. For networking systems there are different types of buses, e.g. I2C, CAN and USB.
For automatic control and signal processing applications, a system must provide the
necessary interfacing circuits and software for a Digital to Analog Conversion (DAC) unit
and an Analog to Digital Conversion (ADC) unit. A DAC operation can be done with the help of
a combination of a PWM (Pulse Width Modulation) unit in the microcontroller and an external
integrator chip. ADC operations are needed in systems for voice processing,
instrumentation, data acquisition and automatic control.
Data and Address Bus concept
We refine the high level functional diagram to illustrate a typical bus configuration
comprising the address, data and control lines
Address bus and data bus:
According to computer architecture, a bus is defined as a system that transfers data
between hardware components of a computer or between two separate computers.
Initially, buses were made up of electrical wires, but now the term bus is used more
broadly to identify any physical subsystem that provides the same functionality as the
earlier electrical buses.
Computer buses can be parallel or serial and can be connected as multi drop, daisy chain
or by switched hubs.
System bus is a single bus that helps all major components of a computer to
communicate with each other.
It is made up of an address bus, data bus and a control bus. The data bus carries the
data to be stored, while address bus carries the location to where it should be stored.
Address Bus
Address bus is a part of the computer system bus that is dedicated for specifying a
physical address.
When the computer processor needs to read or write from or to the memory, it uses the
address bus to specify the physical address of the individual memory block it needs to
access (the actual data is sent along the data bus).
More correctly, when the processor wants to write some data to the memory, it will
assert the write signal, set the write address on the address bus and put the data on to
the data bus.
Similarly, when the processor wants to read some data residing in the memory, it will
assert the read signal and set the read address on the address bus.
After receiving this signal, the memory controller will get the data from the specific
memory block (after checking the address bus to get the read address) and then it will
place the data of the memory block on to the data bus.
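The write and read sequences above can be sketched as a small simulation. This is only an illustrative model, not any particular processor's bus protocol: the class name SystemBus, the "READ"/"WRITE" control values and the 256-byte memory size are all assumptions made for the example.

```python
class SystemBus:
    """Toy system bus: address, data and control lines shared with one memory."""

    def __init__(self, memory_size):
        self.memory = [0] * memory_size  # one byte per memory block
        self.address = 0                 # address bus: driven only by the processor
        self.data = 0                    # data bus: driven by processor or memory
        self.control = None              # control line: "READ" or "WRITE" (assumed names)

    def cpu_write(self, address, value):
        # Processor asserts the write signal, sets the write address
        # and puts the data on to the data bus.
        self.control, self.address, self.data = "WRITE", address, value
        self._memory_controller()

    def cpu_read(self, address):
        # Processor asserts the read signal and sets the read address.
        self.control, self.address = "READ", address
        self._memory_controller()
        return self.data  # value the memory controller placed on the data bus

    def _memory_controller(self):
        # The memory controller checks the address bus and the control signal.
        if self.control == "WRITE":
            self.memory[self.address] = self.data & 0xFF
        elif self.control == "READ":
            self.data = self.memory[self.address]


bus = SystemBus(memory_size=256)
bus.cpu_write(0x10, 0xAB)
value = bus.cpu_read(0x10)
print(hex(value))
```

Note that only the processor ever assigns to the address attribute, while the data attribute is written from both sides.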
The size of the memory that can be addressed by the system determines the width of
the address bus and vice versa. For example, if the width of the address bus is 32 bits, the system
can address 2^32 memory blocks (that is equal to 4GB of memory space, given that one block
holds 1 byte of data).
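The relation between address bus width and addressable memory can be checked with a few lines of arithmetic. The function name addressable_bytes and the 16-bit and 20-bit cases are illustrative choices; only the 32-bit case comes from the text.

```python
def addressable_bytes(address_bits, bytes_per_block=1):
    """Memory reachable with the given address bus width."""
    return (2 ** address_bits) * bytes_per_block


for bits in (16, 20, 32):
    size = addressable_bytes(bits)
    print(f"{bits}-bit address bus: {size} bytes")
```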
Data Bus
A data bus simply carries data. Internal buses carry information within the processor,
while external buses carry data between the processor and the memory.
Typically, the same data bus is used for both read/write operations. When it is a write
operation, the processor will put the data (to be written) on to the data bus.
When it is a read operation, the memory controller will get the data from the specific
memory block and put it on to the data bus.
What is the difference between Address Bus and Data Bus?
Data bus is bidirectional, while address bus is unidirectional. That means data travels
in both directions but the addresses will travel in only one direction.
The reason for this is that unlike the data, the address is always specified by the
processor. The width of the data bus is determined by the size of the individual memory
block, while the width of the address bus is determined by the size of the memory that
should be addressed by the system.
Embedded Processor and their types
1.7 Memory Types
Data memory types:
1. Random Access Memory which can be read & written
Static & Dynamic RAM
2. Read Only Memory which retains data
Programmable Logic:
1. Programmable Arrays
2. Complex Programmable Devices
CPLD, FPGA technology
Summary of Characteristics
There are many kinds of RAM and new ones are invented all the time. One of the aims is to
make RAM access as fast as possible in order to keep up with the increasing speed of CPUs.
SRAM (Static RAM) is the fastest form of RAM but also the most expensive. Due to its cost
it is not used as main memory but rather for cache memory. Each bit requires a 6-transistor
cell.
DRAM (Dynamic RAM) is not as fast as SRAM but is cheaper and is used for main memory.
Each bit uses a single capacitor and single transistor circuit. Since capacitors lose their
charge, DRAM needs to be refreshed every few milliseconds. The memory system does this
transparently. There are many implementations of DRAM; two well-known ones are SDRAM and
DDR SDRAM.
SDRAM (Synchronous DRAM) is a form of DRAM that is synchronised with the clock of
the system bus, also called the front-side bus (FSB). As an example, if the
system bus operates at 167 MHz over an 8-byte (64-bit) data bus, then an SDRAM module
could transfer 167 x 8 ~ 1.3 GB/sec.
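The transfer rate above is simply clock frequency times bus width. A minimal sketch of that arithmetic follows; the function name and the transfers_per_clock parameter are assumptions added so the same formula can also cover memories that move data more than once per clock cycle.

```python
def peak_bandwidth(clock_hz, bus_width_bytes, transfers_per_clock=1):
    """Peak transfer rate in bytes per second."""
    return clock_hz * bus_width_bytes * transfers_per_clock


sdram = peak_bandwidth(167_000_000, 8)
print(f"SDRAM at 167 MHz, 8-byte bus: {sdram / 1e9:.2f} GB/s")
```

This is a peak figure; it ignores refresh, row activation and other overheads that lower the sustained rate.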
DDR SDRAM (Double-Data Rate SDRAM) is an optimisation of SDRAM that allows data to
be transferred on both the rising edge and falling edge of a clock signal, effectively doubling
the amount of data that can be transferred in a period of time. For example a PC-3200 DDR-SDRAM
module operating at 200 MHz can transfer 200 x 8 x 2 ~ 3.2 GB/sec over an 8-byte
(64-bit) data bus.
Static RAM (SRAM)
Static Random Access Memory
Static: Data value is retained as long as VDD is present.
Random access: any location can be read or written directly (not only sequential addresses)
SRAM can be built using either: D-type latch or 6-transistor CMOS RAM cell
D-type Latch: Used for building CPU registers, etc. Derived from inverted S-R flip-flop
[Figure: inverted S-R flip-flop and D-type latch with their truth tables; the hold rows read "No Change"]
When the Enable line is zero (En=0)
/S = /R = 1 and the inverting SR flip-flop retains its previous value.
When the enable line is high (En=1)
The value of data line D is latched into the flip-flop.
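The latch behaviour described above can be sketched at gate level. The wiring below (four NAND gates, with /R derived from /S) is an assumption made for illustration, since the original circuit figure is not reproduced here; the names nand and DLatch are likewise hypothetical.

```python
def nand(a, b):
    """2-input NAND gate on 0/1 values."""
    return 0 if (a and b) else 1


class DLatch:
    """D-type latch built from an inverted S-R flip-flop (assumed 4-NAND wiring)."""

    def __init__(self):
        self.q, self.q_bar = 0, 1

    def update(self, d, en):
        s_bar = nand(d, en)      # /S is low only when D = 1 and En = 1
        r_bar = nand(s_bar, en)  # /R is low only when D = 0 and En = 1
        # Evaluate the cross-coupled NAND pair twice so the outputs settle.
        for _ in range(2):
            self.q = nand(s_bar, self.q_bar)
            self.q_bar = nand(r_bar, self.q)
        return self.q


latch = DLatch()
print(latch.update(d=1, en=1))  # En = 1: D is latched
print(latch.update(d=0, en=0))  # En = 0: /S = /R = 1, previous value retained
print(latch.update(d=0, en=1))  # En = 1: new D is latched
```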
Each bit would need 16 transistors (one NAND gate = 4 transistors).
For large SRAM modules this is not very efficient:
a 1-MB SRAM holds 8 Mbit, which would need 128 million transistors.
6-Transistor Cell (Cross Coupled Inverter)
For larger SRAM modules the D-type latch circuit is not very efficient
because the transistor count per bit is too high.
Read operation:
Bit lines are charged high.
Enable line WL is pulled high, switching access transistors M5 and M6 on.
If the value stored in /Q is 0, the value is accessed through access transistor M5 on /BL.
If the value stored in Q is 1, the charged value of bit line BL is pulled up to VDD.
Write operation:
Apply the value to be stored to bit lines BL and /BL.
Enable line WL is triggered and the input value is latched into the storage cell.
Bit line drivers must be stronger than the SRAM transistor cell to override the previous value.
While the enable line is held low, the inverters retain the previous value. A tri-state WE
line on the bit lines could be used to drive them into a specific state.
Transistor count per bit is only 6 + (line drivers & sense logic).
Addressed SRAM
Can view RAM as N-bit by M-word black box:
N input lines
N output lines
A address lines (2^A = M)
WE write enable line
Single SRAM Bit
[Figure: single SRAM bit with inputs Data IN (DI), Write (W) and Address (A), and output Data OUT]
When A = 0,
Latch Enable is off.
Data cannot be written into the D-type latch
DOUT = 0.
When A = 1
Latch is Enabled
If W = 1 (Data-Write)
Data at DIN can be written into the D-type latch
Output gate is enabled
IF W = 0
New value on DIN is not stored.
Output gate is enabled.
Not very efficient since 1-bit address line can access 2 memory locations.
This memory is 1-bit X 1-word RAM
Stores one 1-bit data value
1-bit X 2-word SRAM
[Figure: two 1-bit memory cells sharing the W and data lines, with a common Data Out]
When address bit A1 = 0
Cell1 is disabled and Cell0 is enabled
If W = 1 : Value of DIN is written to cell0
If W = 0 : Data out is Cell0 OR 0
When address bit A1 = 1
Cell0 is disabled and Cell1 is enabled
If W = 1 : Value of DIN is written to cell1
If W = 0 : Data out is Cell1 OR 0
Only 1 cell can be active at one time
Output line is always driven by one cell
Important for shared bus
4-bit X 16-word SRAM
[Figure: address decoder driving row lines a0 to a15, with Chip Select (CS) and W routed to all cells]
When CS = 1 AND A4 A3 A2 A1 = 0000
Address decoder decodes A4-A1 to
1000000000000000 (a0 = 1, a1-a15 = 0)
Data at DI1 DI2 DI3 DI4 is written to address 0 when W = 1
If W = 0, No new data is stored and address0 drives the output bus
Contents of memory address 0 appear at output
Address decoder maps input address bits to row control signals
Should only set one bit for every possible input
2^A states where A is the number of address lines
The CS (chip select) line allows the memory to be doubled with only one inverter [+ OR gates].
Tri-State Outputs:
In previous examples, one location is enabled during each operation which can drive
the output bus.
If RAM is on shared bus, the RAM cannot be allowed to drive the bus at all times
Must have method of removing RAM from bus
Solution is to use Tri-State logic
[Figure: two RAM blocks, each with inputs DI0 . . . DI3 and outputs DO0 . . . DO3, sharing one data bus]
Outputs from each cell are tri-state outputs.
When not active the outputs are in high impedance.
Can either use the CS line to control when the outputs are high impedance (Hi-Z), or use a separate output enable (OE) line to control the output
Allows both other RAM cells and other devices to control data bus
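The 4-bit X 16-word SRAM, its one-hot address decoder, chip select and tri-state outputs can be modelled together in a short sketch. The names Sram4x16, decode and shared_bus are hypothetical, and representing the high-impedance state by None is an assumption made for the example.

```python
HIGH_Z = None  # assumption: None stands for a high-impedance (undriven) output


def decode(address, lines=16):
    """One-hot address decoder: exactly one row line is set for each input."""
    return [1 if i == address else 0 for i in range(lines)]


class Sram4x16:
    """4-bit X 16-word SRAM with chip select and tri-state outputs."""

    def __init__(self):
        self.words = [0] * 16

    def access(self, cs, address, w, data_in=0):
        if not cs:
            return HIGH_Z              # chip not selected: bus is released
        row = decode(address).index(1)  # the single active row line
        if w:
            self.words[row] = data_in & 0xF  # store the 4 data input bits
        return self.words[row]          # selected word drives the output


def shared_bus(outputs):
    """Value seen on a bus shared by several tri-state devices."""
    drivers = [value for value in outputs if value is not HIGH_Z]
    if len(drivers) > 1:
        raise ValueError("bus contention: more than one device is driving the bus")
    return drivers[0] if drivers else HIGH_Z


ram0, ram1 = Sram4x16(), Sram4x16()
ram0.access(cs=1, address=0, w=1, data_in=0b1010)
ram1.access(cs=1, address=0, w=1, data_in=0b0101)
bus = shared_bus([ram0.access(cs=0, address=0, w=0),
                  ram1.access(cs=1, address=0, w=0)])
print(bin(bus))
```

Because ram0 is deselected, only ram1 drives the bus; selecting both at once raises an error, which is the contention that tri-state logic is meant to avoid.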
1.7.2 Dynamic RAM (DRAM)
SRAM requires a number of transistors per bit
Difficult to cost-effectively scale for larger memories
DRAM utilises MOSFET capacitance to store data bit
Transistor cost per bit is approx. 1
[Figure: DRAM cell with row select line and data I/O line]
SiO2 insulates gate and substrate
Creating dielectric capacitor between gate and substrate
Data bit is stored in this capacitance
Each bit now requires only 1 MOSFET.
However, the charge stored in the cell dissipates over time and must be recharged
periodically to avoid corruption
DRAM Refresh
Must read data bit and write value back to cell.
JEDEC standardises DRAM row refreshes at least every 64 ms.
All bits in row must be refreshed.
Dedicated hardware controls DRAM refresh
Refresh is transparent to user
Above 64 Kbits, DRAM is more economical than SRAM logic,
even with refresh.
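As a worked example of the 64 ms refresh requirement, the sketch below computes how often one row must be refreshed and what fraction of time refresh occupies. The row count of 8192 and the 50 ns per-row refresh time are assumed values chosen for illustration, not figures from this text.

```python
def refresh_interval_us(rows, retention_ms=64):
    """Average time between row refreshes so every row is refreshed in time."""
    return retention_ms * 1000 / rows


def refresh_overhead(rows, row_refresh_ns, retention_ms=64):
    """Fraction of time the DRAM is busy refreshing."""
    return (rows * row_refresh_ns) / (retention_ms * 1_000_000)


rows = 8192            # assumed number of rows in the device
row_refresh_ns = 50    # assumed time to refresh one row
print(refresh_interval_us(rows), "microseconds between row refreshes")
print(f"{refresh_overhead(rows, row_refresh_ns):.2%} of time spent on refresh")
```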
[Figure: DRAM cell write operation and read operation, showing row select and data I/O]
1.7.2.1 DRAM Organization
Matrix stores N 1-bit words
N is determined by the number of address lines available
Each matrix is parallelised to create word size memories
e.g. 8 parallel 4Kx1-bit DRAM matrices create a 4K x 8-bit RAM module
[Figure: an 8x8 array forms a 64 x 1 dynamic RAM; labels include Column Address (CAS)]
The row and column select logic consist of address decoders.
8 rows and 8 columns need 3 address bits each.
Above block is 64x1-bit DRAM
The diagram omits it, but the matrix has 1 data I/O line.
Row and Column address control which bit is active
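The row and column addressing of the 64x1-bit matrix can be sketched as follows. Treating the upper 3 address bits as the row and the lower 3 as the column is an assumption for the example, as are the names split_address and Dram64x1; refresh is not modelled.

```python
def split_address(address, col_bits=3):
    """Split a cell address into (row, column) for the decoders."""
    row = address >> col_bits                # upper bits select the row
    col = address & ((1 << col_bits) - 1)    # lower bits select the column
    return row, col


class Dram64x1:
    """8x8 matrix of 1-bit cells with a single data I/O line."""

    def __init__(self):
        self.cells = [[0] * 8 for _ in range(8)]

    def write(self, address, bit):
        row, col = split_address(address)
        self.cells[row][col] = bit & 1

    def read(self, address):
        row, col = split_address(address)
        return self.cells[row][col]


dram = Dram64x1()
dram.write(0b101011, 1)
print(split_address(0b101011), dram.read(0b101011))
```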
In addition to RAM, there is also a range of other semiconductor memories that retain their
contents when the power supply is switched off.
ROM (Read Only Memory) is a form of semiconductor memory that can be written to once, typically
at the time of manufacture. ROM was used to store the start-up program (so called firmware)
that a computer executes when powered on, although it has now fallen out of favour compared with more
flexible memories that support occasional writes. ROM is still used in systems with fixed
functionalities, e.g. controllers in cars, household appliances etc.
PROM (Programmable ROM) is like ROM but allows end-users to write their own programs
and data. It requires special PROM writing equipment. Note: users can only write once to
PROM.
EPROM (Erasable PROM). With EPROM we can erase (using strong ultra-violet light) the
contents of the chip and rewrite it with new contents, typically several thousand times. It is
commonly used to store start-up firmware; PCs call this firmware the BIOS (Basic I/O System).
Other systems use Open Firmware. Intel-based
Macs use EFI (Extensible Firmware Interface).
EEPROM (Electrically Erasable PROM). As the name implies the contents of EEPROMs are
erased electrically. EEPROMs are also limited in the number of erase-writes that can be
performed (e.g., 100,000) but support updates (erase-writes) to individual bytes, whereas
EPROM updates the whole memory and only supports around 10,000 erase-write cycles.
FLASH memory is a cheaper form of EEPROM where updates (erase-writes) can only be
performed on blocks of memory, not on individual bytes. Flash memories are found in USB
sticks and flash cards and typically range in size from 32MB to 2GB. The number of erase/write
cycles to a block is typically several hundred thousand before the block can no longer be
written.
Characteristics of the various memory types
[Table: columns include max erase cycles, cost and speed (e.g. "fast to read, slow to write")]
1.8 Overview of design process of embedded systems
Figure 1.3 shows a high level flow through the development process and identifies the major
elements of the development life cycle.
Figure 1.3: Embedded system life cycle
The traditional design approach has been to traverse the two sides of the accompanying diagram
separately, that is,
Design the hardware components.
Design the software components.
Bring the two together.
Spend time testing and debugging the system.
The major areas of the design process are
Ensuring a sound software and hardware specification.
Formulating the architecture for the system to be designed.
Partitioning the h/w and s/w.
Providing an iterative approach to the design of h/w and s/w
1.8.1 Requirements
Informal descriptions gathered from the customer are known as requirements. The
requirements are refined into a specification to begin the design of the system architecture.
Requirements can be functional or non-functional. Functional requirements
specify the output as a function of the input. Non-functional requirements include performance, cost,
physical size, weight and power consumption. Performance may be a combination of soft
performance metrics, such as the approximate time to perform a user-level function, and hard
deadlines by which a particular operation must be completed. Cost includes the
manufacturing cost, nonrecurring engineering (NRE) cost and other costs of designing the system.
Physical size and weight are the physical aspects of the final system. These can vary greatly
depending upon the application. Power consumption can be specified in the requirements
stage in terms of battery life.
1.8.2 Specification
The requirements gathered are refined into a specification. The specification serves as the contract
between the customers and the architects. It is essential for creating working systems
with a minimum of designer effort. It must be specific, understandable and accurately reflect
the customer's requirements.
Considering the example of the GPS system, the specification would include details for several
components:
Data received from the GPS satellite constellation
Map data
User interface
Operations that must be performed to satisfy customer requests
Background actions
1.8.3 Architecture Design
The specification describes only the functions of the system. How the system implements those
functions is described by the architecture. The architecture is a plan for the overall structure of the system.
It will be used later to design the components. The architecture is illustrated using block
diagrams as shown below.
This block diagram (figure 3) is an initial architecture that is based neither on hardware alone nor
on software alone but on a combination of both. It describes a GPS navigation
system in which the GPS receiver provides the current position, the destination is taken from the user, the digital
map from source to destination is found in the database and the result is displayed by the renderer. The
system block diagram may be refined into two block diagrams, one for hardware and one for software.
1.8.3.1 Hardware block diagram:
The hardware consists of one central CPU surrounded by memory and I/O devices. We have chosen
to use two memories: a frame buffer for the pixels to be displayed and a separate
program/data memory for general use by the CPU. The GPS receiver is used to get the GPS
coordinates, and the panel I/O is used to get the destination from the user.
1.8.3.2 Software block diagram
The software block diagram closely follows the system block diagram. We have added a timer
to control when we read the buttons on the user interface and render data onto the screen.
To have a truly complete architectural description, we require more details, such as where
units in the software block diagram will be executed in the hardware block diagram and when
the operations will be performed in time.
Architectural descriptions must be designed to satisfy the functional and non-functional
requirements. Not only must all the required functions be present, but we must meet cost, speed,
power and other non-functional constraints. Starting out with a system architecture and refining
that to hardware and software architectures is one good way to ensure that we meet all
specifications. We can concentrate on the functional elements in the system block diagram, and
then consider the non-functional constraints when creating the hardware and software
architectures.
How do we know that our hardware and software architectures in fact meet constraints on
speed, cost, and so on?
Estimate the properties of the components in the block diagrams (example: the search and
rendering functions in the moving map system).
Accurate estimation derives in part from experience, both general design experience and
experience particular to the application.
All the non-functional constraints are estimated. If the decisions are based on bad data,
the consequences will show up only during the final phases of design.
1.8.4 Hardware and Software components
The architectural description tells us what components we need. The component design effort
builds those components in conformance to the architecture and specification. The components
in general include both hardware and software modules. Some of the components will be
ready-made (example: CPU, memory chips).
In the moving map, the GPS receiver is a predesigned standard hardware component. The topographic
software is a standard software module which uses standard routines to access the database.
The printed circuit board is a component which needs to be designed. Lots of custom
programming is also required.
When creating these embedded software modules, ensure the system runs properly in real time
and that it does not take up more memory space than allowed. The power consumption of the
moving map software example is particularly important. You may need to be very careful about
how you read and write memory to minimize power. For example, memory transactions must
be carefully planned to avoid reading the same data several times, since memory accesses are a
major source of power consumption.
1.8.5 System integration
After the components are built, they are integrated. Bugs are typically found during the system
integration. Good planning can help us to find the bugs quickly. By debugging a few modules
at a time, simple bugs can be uncovered. By fixing the simple bugs early, more complex or
obscure bugs can be uncovered. System integration is difficult because it usually uncovers
problems. The debugging facilities for embedded systems are usually much more limited than
those of desktop systems. Careful attention is needed to insert appropriate debugging facilities
during design, which can help to ease system integration problems.
1.9 Programming languages and tools for embedded design
Software is the most important aspect of an embedded system; the hardware performs
its tasks as per software instructions. It is effectively the brain of the system. An embedded system
processor and the system need software that is specific to a given application of that system.
The processor of the system processes coded instructions and data. In the final stage these are
placed in the memory (ROM) for all the tasks that have to be executed. Assembly as well as
high level languages like C, C++ and Java are used for software development.
The challenges in designing and implementing embedded software come from reliability,
performance and cost. The reliability expectation brings greater responsibility to eliminate bugs
and to tolerate faults, as many embedded systems have to run 24 hours a day, 7 days a week and 365 days
a year. Sometimes rebooting is not possible, so good programming and thorough testing are a must
for embedded software development. Performance issues may come from different
considerations; for example, multitasking and scheduling can considerably affect
performance. At the same time, systems using sensors depend on how accurately the sensor value
is converted into a real world value. Input/output devices may affect speed, complexity and cost.
For better performance it may sometimes be necessary to program directly in assembly in place of
a high level language. Embedded consumer products are produced in large volumes, so it is important to keep
the production cost minimal, and no modification is performed once production has started.
1.9.1 Creation of ROM image
In the final stage the processed codes and instructions are placed in ROM; this is called creation
of the ROM image. All tasks are executed from there. The creation of the
ROM image from assembly and from a high level language is described below. There are
different stages in converting an assembly language program into a machine implementable
software file and then finally obtaining the ROM image file. These steps are shown in
Figure 1.9.1.
In the assembling step the assembler translates the assembly software into machine codes. Next, in the
linking phase, the linker links a number of codes with other assembled codes. Certain codes
have fixed beginning addresses. Linking produces the final binary file by linking all these.
The linked file on a computer is commonly known as an .exe file. In the third phase relocation
of codes is done: a program called the loader places the codes in physical memory. The loader finds an
appropriate position in RAM from which the program is ready to run. Finally, in the locating phase the ROM image is
permanently placed at the actually available addresses of the ROM. In an embedded system there is
only one program, so the designer has to define the available addresses to load and create files for
permanent location. The locator places the I/O task and hardware device driver codes at
unchanged addresses, as the port addresses of these are fixed. In the last phase the device programmer takes
the ROM image and burns it into the PROM or EPROM.
Figure 1.9.1 : Process of converting assembly language program into ROM image
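The locating step can be illustrated with a minimal sketch that assigns absolute addresses to code sections from a memory allocation map. The function name locate, the section names and sizes, and the ROM/RAM address ranges are all hypothetical values chosen for the example; a real locator works from linker output and a target-specific map.

```python
def locate(sections, memory_map):
    """Assign an absolute start address to each section.

    sections:   list of (name, size_in_bytes, region) tuples
    memory_map: dict of region name -> (start_address, length)
    """
    next_free = {region: start for region, (start, _) in memory_map.items()}
    placed = {}
    for name, size, region in sections:
        start, length = memory_map[region]
        address = next_free[region]
        if address + size > start + length:
            raise MemoryError(f"section {name} does not fit in {region}")
        placed[name] = address               # fixed address in the final image
        next_free[region] = address + size   # next section follows directly
    return placed


# Hypothetical memory allocation map and section list.
memory_map = {"ROM": (0x0000, 0x8000), "RAM": (0x8000, 0x2000)}
sections = [
    ("startup", 0x100, "ROM"),
    ("code", 0x1200, "ROM"),
    ("consts", 0x80, "ROM"),
    ("data", 0x200, "RAM"),
]
for name, address in locate(sections, memory_map).items():
    print(f"{name:8s} at {address:#06x}")
```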
In converting a high level language like C to a ROM image file, the compiler first
generates the object codes. The compiler assembles the codes as per the processor instruction set, and then
code optimization is carried out by the code optimizer. Optimization is carried out before linking.
After compilation the linker links the codes, including various standard codes like printf and scanf and
device driver codes. After linking, the subsequent steps for creating the ROM image are the same as
explained for assembly language.
Figure 1.9.2: Process of converting C program into ROM image
A comparative view of build and load process of desktop and embedded application can be
depicted with following figures 1.9.3 and 1.9.4
Figure 1.9.3: The build and load process for desktop application program
Fig 1.9.4: The build and load process for embedded application program
1.9.2 Software for embedded systems: device driver, multiple tasks, RTOS
There may be a number of physical devices attached to an embedded system. A device
driver is the program needed to drive such a device. A driver uses the hardware status flags and
control register. It performs three functions: (a) initializing the device by placing appropriate bits in the
control register, (b) calling the interrupt service routine (ISR) on setting of the status flag, and (c) resetting the
status flag after interrupt service. Device driver coding is done using operating system
functions such that the underlying hardware is hidden. The device management software module
provides codes for detecting the presence of devices. In designing the software for this category
two types of devices are considered: physical and virtual. Physical devices include keyboards,
printers, display matrices etc. A virtual device could be a file which may be used for reading and
writing a stream of bytes. The operating system has modules for insertion of both the device driver
and the device management module.
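The three driver functions listed above can be sketched against a simulated device. Everything here is hypothetical: the PortDevice class, its register layout and the ENABLE and RX_READY bit values are invented for illustration, and the ISR is called directly rather than by real interrupt hardware.

```python
class PortDevice:
    """Simulated input device with control, status and data registers."""

    ENABLE = 0x01    # control register bit (assumed layout)
    RX_READY = 0x01  # status flag bit (assumed layout)

    def __init__(self):
        self.control = 0
        self.status = 0
        self.data = 0

    def receive(self, byte):
        """Hardware side: a byte arrives and the status flag is set."""
        if self.control & self.ENABLE:
            self.data = byte
            self.status |= self.RX_READY


class Driver:
    def __init__(self, device):
        self.device = device
        self.buffer = []

    def init(self):
        # (a) Initialize by placing appropriate bits in the control register.
        self.device.control |= PortDevice.ENABLE

    def isr(self):
        # (b) Interrupt service routine, run when the status flag is set.
        if self.device.status & PortDevice.RX_READY:
            self.buffer.append(self.device.data)
            # (c) Reset the status flag after interrupt service.
            self.device.status &= ~PortDevice.RX_READY


device = PortDevice()
driver = Driver(device)
driver.init()
device.receive(0x41)
driver.isr()
print(driver.buffer, device.status)
```

Application code would read only driver.buffer, so the register details stay hidden behind the driver.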
Sometimes an embedded system has to control multiple devices and schedule multiple
functions (tasks). To implement this, the embedded system must have a multitasking operating
system above the application level, which is generally a Real Time Operating System (RTOS). In a
multitasking OS each process (task) has its own memory allocation, and a task has
one or more procedures for a specific job [12]. A task may share memory (data) with
other tasks. The processor may process different tasks separately or concurrently. An OS or RTOS
has a kernel which is responsible for scheduling the transition of a task from the ready state to the
running state. The kernel may select a task for processing, based on its priority value, out of many
ready state tasks. On an ISR call the kernel may temporarily halt a running task, allow another
task to run, and resume the halted task after completion of the new task.
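Priority-based selection and preemption by the kernel can be sketched with a small ready queue. This is a toy model under stated assumptions: the Kernel class and its method names are hypothetical, a larger number means a higher priority, and tasks are just names with no code to execute.

```python
import heapq


class Kernel:
    """Toy kernel: runs the highest-priority ready task, preempting if needed."""

    def __init__(self):
        self._ready = []     # heap of (-priority, arrival order, task name)
        self._seq = 0
        self.running = None  # (priority, name) of the running task, or None

    def _push(self, priority, name):
        heapq.heappush(self._ready, (-priority, self._seq, name))
        self._seq += 1

    def make_ready(self, name, priority):
        self._push(priority, name)
        self._dispatch()

    def finish_running(self):
        self.running = None
        self._dispatch()

    def _dispatch(self):
        if not self._ready:
            return
        top_priority = -self._ready[0][0]
        if self.running is not None and top_priority <= self.running[0]:
            return                      # running task keeps the processor
        if self.running is not None:
            self._push(*self.running)   # preempted task returns to ready state
        _, _, name = heapq.heappop(self._ready)
        self.running = (top_priority, name)  # ready state -> running state


kernel = Kernel()
kernel.make_ready("logger", priority=1)
kernel.make_ready("display", priority=2)   # preempts logger
kernel.make_ready("sensor", priority=3)    # preempts display
print(kernel.running)
kernel.finish_running()                    # display resumes
print(kernel.running)
```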
An embedded system in a multitasking environment does not always require an RTOS.
An RTOS is required in a multitasking environment when real time constraints become a must
(i.e. a task has to be completed within a defined deadline). The main functions of an RTOS include real
time task scheduling, interrupt latency control, time allocation and de-allocation to attain
efficiency, predictable timing behaviour, priority management and time slicing of processes.
An RTOS may be hard real time or soft real time. Hard real time strictly adheres to the
task schedule, whereas in soft real time the precedence and sequence of tasks are defined.
1.9.3 Tools for designing embedded software
Different software tools for assembly language programming, high level language
programming, RTOS, debugging and integration can be summarized as given below.
Editor: It enables users to write code in high level as well as assembly language on a computer.
Features like addition, deletion, copy and insertion are available for easy writing.
It saves the content in a file with a user defined or default extension. The user can modify
saved files as and when required.
Compiler: It takes the whole high level source code as input and converts it to machine
readable object code. It may include functions, library routines etc. during compilation.
Interpreter: It converts high level code to machine readable form line by line. Like a compiler,
it may also include functions, library routines etc. during conversion.
Assembler: It is used for conversion of assembly language programs to executable binary files.
It creates the list file, which has addresses, source code and hexadecimal object codes. It is
processor specific.
Cross assembler: A cross assembler runs on the processor of the PC used in system development and
assembles the assembly code of the target processor. It provides
the object codes for the target processor. These will be the final codes used for the developed
system.
Simulator: It is the program which can simulate all the functions of an embedded system
circuit including additional memory and peripherals. It is independent of a particular target
system.
RTOS: Explained above.
Stethoscope: This program is used to keep track of dynamic changes in program variables and
parameters. It can demonstrate the sequences of multiple processes, tasks and threads that execute
and keeps the entire time history.
Trace scope: It traces the changes in a module over time. A list of actions to
be initiated at desired times is also prepared accordingly.
Integrated Development Environment (IDE): The total software and hardware environment,
consisting of simulator, compiler, assembler, cross assembler, logic analyser, EPROM/EEPROM,
application codes and burners, defines the integrated development environment of the system.
Locator: The locator program uses the cross-assembler output and a memory allocation map and
provides the locator program output.
Unit -II
Embedded Processor Architecture
2.1 CISC Vs RISC design philosophy
The architectural designs of CPU are RISC (Reduced
instruction set computing) and CISC (Complex instruction set computing). CISC has the ability
to execute addressing modes or multi-step operations within one instruction. It is the design
of the CPU where one instruction performs many low-level operations, for example a load from
memory, an arithmetic operation and a memory store. RISC is a CPU design strategy
based on the insight that a simplified instruction set gives higher performance when combined
with a microprocessor architecture that can execute each instruction in only a few
clock cycles.
Fig 2.1: CISC Vs RISC
We discuss the RISC and CISC architectures with suitable diagrams.
1. Intel hardware is an example of a Complex Instruction Set Computer (CISC).
2. Apple hardware is an example of a Reduced Instruction Set Computer (RISC).
What is RISC and CISC Architectures?
Instruction Set Architecture
Instruction set can be defined as the communication interface between the processor and the
programmer. Every processor has its own instruction set implemented in the hardware to execute
instructions such as move, add or multiply data in a definite way. Programmers can either use any
high level language such as C, C++, Java etc. or assembly language to write the program. Accordingly,
a compiler or assembler can be used to translate the program into machine understandable
language following the processor instruction set. There are two classic architectures of instruction
set implementation, the complex instruction set computer (CISC) and the reduced instruction set
computer (RISC). Each has its own advantages and disadvantages. The CISC architecture has more
complexity in the hardware itself while RISC architecture offers more complexity to the software.
The features of each architecture are summarized as below.
Features of Complex Instruction Set Computer (CISC):
Most of the instructions are complex in type.
Instructions require multiple clock cycles for execution.
More addressing modes are available in the instruction set.
Fewer working registers and more frequent memory access.
Load and Store operations are incorporated in instructions.
High code density is achieved because of availability of multifunctional instructions.
Pipeline implementation is difficult.
More complexity is given to the hardware design.
Features of Reduced Instruction Set Computer (RISC):
Most of the instructions are simple in nature.
Most of the instructions are executed in a single clock cycle.
The addressing modes available are fewer than in case of CISC.
Instruction set has separate Load/Store architecture.
Higher number of working registers so less frequent memory access.
Most of the data transfer happens from register to register.
Large code size compared to CISC architecture.
Performance of RISC architecture is generally better than CISC architecture.
Pipeline implementation is easier compared to CISC.
More complexity is offered to the compiler design.
Memory Block
The memory block consists of program and data memory. ROM is used as the program memory
and RAM is used as the data memory. There are two memory architectures: Harvard and
Von-Neumann. In Harvard architecture, the program and data memories are segregated with
separate address and data bus drawn to each. So there can be parallel access to both and
performance of the system can be improved at the cost of hardware complexity. On the other
hand, the Von-Neumann architecture has one unified memory used for both program and data.
The system is comparatively slower, but the design implementation is simple and cost effective
for an embedded system. Various ROM and RAM devices are used in embedded systems based
on the applications.
ARM Architecture
ARM cores are designed specifically for embedded systems. The needs of embedded systems
can be satisfied only if features of RISC and CISC are considered together for processor design.
So ARM architecture is not a pure RISC architecture. It has a blend of both RISC and CISC
features.
Table 1.1. ARM Architecture Features and Benefits
Feature: Benefit to embedded system
High performance: Ensures the system has a fast response
Low power consumption: Makes the system more energy efficient
Low silicon area: Reduces the size and also consumes less power
High code density: Helps embedded system to have less memory
Load/store architecture: Used to load data from the memory to the ARM CPU register or store
data from the CPU register to the memory; enables the memory access when required
Register bank with large number of working registers: Required to perform most of the
operations within the CPU and provides faster context switch in multitasking applications
A Basic architecture of the ARM7core
ARM 7, the basic architecture of ARM series of cores, is introduced here in this section. A brief
introduction about each functional block of the architecture of ARM7 core shown in Figure 1.2 is
presented below.
The Register Bank has sixteen general purpose registers (R0-R15) and a current program status
register (CPSR) which are accessible by user applications. In addition to that, it has twenty
banked registers specifically used for different operating modes of ARM core. These
are invisible to user applications. The register bank has two read ports to read operand1 and
operand2 and one write port to write back the result of operation to any register specified in
the instruction. It has an additional bidirectional port to update the program counter with
address register and incrementer. Address register content is incremented at every sequential
byte access by the incrementer but the program counter is incremented by four in ARM state of
the core or is incremented by 2 in Thumb state of the core at every instruction access. ARM and
Thumb states of the core are discussed in section 1.3. Address register is directly connected to
the address bus.
• The barrel shifter can shift or rotate operand 2 by a specified number of bits prior
to arithmetic or logic operations.
• The 32 bit ALU performs the arithmetic and logic functions.
• The data in and data out registers hold the input and output data from and to
the memory.
• The instruction decoder and associated control logic generate appropriate
control signals for the data path after decoding the fetched instruction.
• The MAC unit is to multiply two register operands and accumulate with another
register holding the partial sum of the products.
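The barrel shifter and MAC unit described above can be modelled on a PC with a few lines of C. This is only an illustrative sketch, not a description of the actual hardware: the names `shift_kind_t`, `barrel_shift` and `mac32` are made up for this example, and the shift amount is assumed to be limited to 0..31.

```c
#include <stdint.h>

/* Shift kinds that can be applied to operand 2 before it reaches the ALU.
   The enum and function names are illustrative, not ARM-defined. */
typedef enum { SHIFT_LSL, SHIFT_LSR, SHIFT_ASR, SHIFT_ROR } shift_kind_t;

/* Model of the barrel shifter for shift amounts 0..31
   (an amount of 0 returns the value unchanged). */
uint32_t barrel_shift(uint32_t value, shift_kind_t kind, unsigned amount)
{
    amount &= 31u;
    if (amount == 0u) {
        return value;
    }
    switch (kind) {
    case SHIFT_LSL: /* logical shift left, zeros enter from the right */
        return value << amount;
    case SHIFT_LSR: /* logical shift right, zeros enter from the left */
        return value >> amount;
    case SHIFT_ASR: /* arithmetic shift right, the sign bit is replicated */
        if (value & 0x80000000u) {
            return (value >> amount) | (uint32_t)~(0xFFFFFFFFu >> amount);
        }
        return value >> amount;
    case SHIFT_ROR: /* rotate right, bits leaving on the right re-enter on the left */
        return (value >> amount) | (value << (32u - amount));
    }
    return value;
}

/* Model of the MAC unit: multiply two operands and add the running
   partial sum; the result wraps to 32 bits. */
uint32_t mac32(uint32_t partial_sum, uint32_t a, uint32_t b)
{
    return partial_sum + a * b;
}
```

For example, `barrel_shift(1, SHIFT_LSL, 3)` corresponds to the operand-2 expression `R2, LSL #3` when R2 holds 1, and `mac32(10, 3, 4)` adds the product 3 x 4 to a partial sum of 10.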
The encoded instruction byte of the program saved in the code memory is fetched through the data
bus and first enters into the data-in register of the ARM architecture from where it is delivered to
the instruction decoder. After the instruction is decoded, appropriate control signals are generated
for the data path. The required registers are activated in the register bank and the operands flow out
from two read ports of register bank to the ALU: operand1 through A-bus and operand2 through
B-bus after preprocessing at barrel shifter. The result of operation at ALU is written back to the result
register through a write port at register bank. For Load/Store instructions, after decoding the
instruction, the data memory address is first calculated at ALU as specified in the instruction and the
pointer register is updated at the register bank. The address in the pointer register is given to the
address register to access the memory and transfer data. If it is a load multiple or store multiple
instruction, the core does not halt before completing the required number of data transfers unless it
is a reset exception.
Migration to Cortex Series
In the path of architectural evolution, ARM has contributed many versions of IP cores to the
embedded computing world. ARM pioneered embedded products are excelling in every visible
spectrum. Since its inception, ARM has migrated over a long meaningful road map starting from v4T
ARM7TDMI to v7 Cortex series of architectures achieving many strong milestones in between. It is
currently the new era of feature rich ARM Cortex series architectures truly empowering the
embedded computing world.
ARM architecture evolution
Fig 1.13. Performance and capability graph of Classic ARM and Cortex application processors
ARM architecture has been improved a lot in the road map from classic ARM to ARM Cortex. Fig 1.7
and Fig 1.17 depict the performance and capability comparison of classic ARM with embedded Cortex
and application Cortex series of processors. Even though ARM had earlier versions of products
i.e., v1, v2, v3 and v4, the classic group of ARM starts with v4T. The classic group is divided into four
basic families called ARM7, ARM9, ARM10 and ARM11.
• ARM7 has three-stage (fetch, decode, execute) pipeline, Von-Neumann architecture
where both instructions and data use the same bus. It executes v4T instruction set. T
stands for Thumb.
• ARM9 has five-stage (fetch, decode, execute, memory, write) pipeline with higher
performance, Harvard architecture with separate instruction and data bus. ARM9
executes v4T and v5TE instruction sets. E stands for enhanced instructions.
• ARM10 has six-stage (fetch, issue, decode, execute, memory, write) pipeline with
optional vector floating point unit and delivers high floating point performance.
ARM10 executes v5TE instruction sets.
Microcontroller profile (Cortex-M)
The Cortex M series covers the v6-M architecture (Cortex M0, M0+ and M1) and the v7-M architecture
(Cortex M3, M4 and other successors). This series of architectures, developed for the deeply embedded
microcontroller profile, offers lowest gate count and so smallest silicon area. These are flexible and
powerful designs with completely predictable and deterministic interrupt handling capabilities by
introducing the nested vector interrupt controller (NVIC). The small instruction sets support high
code density and simplified software development. Developers are able to achieve 32-bit
performance at 8-bit price. The very low gate count of Cortex M0 facilitates its deployment in analog
and mixed mode devices. Due to further demanding applications requiring even better energy
efficiency, Cortex M0+ was designed with two stage pipeline and achieved high performance with
very low dynamic power consumption, reduced branch shadow and reduced number of flash
memory accesses. Cortex M1 was designed for implementation in FPGA. It is functionally a subset of
Cortex M3 and runs ARM v6 instruction set with OS extension options. It has 32-bit AHB lite bus
interface, separate tightly coupled memory interface and JTAG interface to facilitate debug options.
It has three stage pipeline implementation and configurable NVIC for reducing interrupt latency.
Introduction to TIVA Microcontrollers
In this text book, TIVA platforms and launch pads are used to develop various embedded
applications. So in this section two TIVA series microcontrollers are introduced.
TIVA TM4C123GH6PM Microcontroller
The microcontroller block diagram shown in Fig 1.20 and Fig 1.21 has six functional units: the
Cortex M4F core, on-chip memory, analog block, serial interface, motion control and system
TM4C123GH6PM microcontroller has 32 bit ARM Cortex M4 CPU core with 80
MHz clock rate.
Memory protection unit provides protected operating system functionality and
floating point unit supports IEEE single precision operations.
JTAG/SWD/ETM for serial wire debug and trace.
Nested vector interrupt controller (NVIC) reduces interrupt response latency.
System control block (SCB) holds the system configuration information.
The microcontroller has a set of memory integrated in it: 256 KB flash memory, 32
KB SRAM, 2 KB EEPROM and ROM loaded with TIVA software library and
Serial communications peripherals such as: 2 CAN controllers, full speed USB
controller, 8 UARTs, 4 I2C modules and 4 Synchronous serial interface modules.
On chip voltage regulator, two analog comparators and two 12 channel 12-bit analog
to digital converters with a sample rate of 1 million samples per second are the analog
functions built into the device.
Two quadrature encoders with index module and two PWM modules are the advanced
motion control functions integrated into the device that facilitate wheel and motor
control.
Various system functions integrated into the device are: Direct Memory Access
controller, clock and reset circuitry with 16 MHz precision oscillator, six 32-bit
timers, six 64-bit timers, twelve 32/64 bit capture compare PWM, battery backed
hibernation module and RTC hibernation module, 2 watchdog timers and 43 GPIOs.
Few Applications:
Building automation system
Lighting control system
Data acquisition system
Motion control
IoT and Sensor networks.
TIVA TM4C129CNCZAD Microcontroller
TM4C129CNCZAD microcontroller has 32 bit ARM Cortex M4F CPU core with
120 MHz clock rate.
Memory protection unit provides a privileged mode for protected operating system
functionality and floating point unit supports IEEE 754 compliant single precision
JTAG/SWD/ETM for serial wire debug and trace.
Nested vector interrupt controller (NVIC) reduces interrupt response latency and high
performance interrupt handling for time critical applications.
The microcontroller has a set of memory integrated in it: 1MB flash memory, 256 KB
SRAM, 6 KB EEPROM and ROM loaded with TIVAware, software library and
Serial communications peripherals such as: 2 CAN controllers, full speed and high
speed USB controller, 8 UARTs, 10 I2C modules and 4 Synchronous serial interface
On chip voltage regulator, three analog comparators and two 12 channel 12-bit analog
to digital converter with sample rate 2 million samples per second and temperature
sensor are the analog functions in built to the device.
One quadrature encoder and one PWM module with 8 PWM outputs are the
advanced motion control functions integrated into the device that facilitate wheel and
motor controls.
Various system functions integrated into the device are: Micro Direct Memory
Access controller, clock and reset circuitry with 16 MHz precision oscillator, eight
32-bit timers, low power battery backed hibernation module and RTC hibernation
module, 2 watchdog timers and 140 GPIOs.
Cyclic Redundancy Check (CRC) computation module is used for message transfer
and safety system checks. CRC module can be used in combination with AES and
DES modules.
Advanced Encryption Standard (AES) and Data Encryption Standard (DES)
accelerator module provides hardware accelerated data encryption and decryption
Secure Hash Algorithm/ Message Digest Algorithm (SHA/MD5) provides hardware
accelerated hash functions for secured data applications.
Registers are for temporary data storage within processor architecture. As shown in Fig 1.1, ARM
processor has sixteen general purpose registers, R0-R15 and a current program status
register (CPSR) defined for user mode of operation. Each of these registers is of 32-bits. Out of these
registers, R13, R14 and R15 have special purposes
R13: Used as the stack pointer that holds the address of the top of the stack in the current processor
R14: Used as the link register that saves the content of program counter in control transfer due to
the occurrence of exceptions or using the branch instructions in the program.
R15: Used as the program counter that points to the next instruction to be executed. In ARM state,
all instructions are of 32-bits (four bytes) for which, PC is always aligned to a word boundary. This
means that the least significant two bits of the PC are always zero. The PC can also be halfword
(16 bit) aligned for Thumb state (16 bit instructions) or byte aligned for Jazelle state (8-bit instructions)
supported by different versions of ARM architecture
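The alignment rule for R15 can be checked with simple bit masking. The sketch below is a host-side illustration only; `core_state_t`, `pc_alignment_bytes`, `pc_is_aligned` and `next_pc` are hypothetical names chosen for this example, and only the classic 16-bit Thumb encoding is assumed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Execution states mentioned in the text (names are illustrative). */
typedef enum { STATE_ARM, STATE_THUMB, STATE_JAZELLE } core_state_t;

/* Required PC alignment in bytes: word, halfword or byte. */
uint32_t pc_alignment_bytes(core_state_t state)
{
    switch (state) {
    case STATE_ARM:     return 4u; /* 32-bit instructions */
    case STATE_THUMB:   return 2u; /* 16-bit instructions */
    case STATE_JAZELLE: return 1u; /* 8-bit instructions  */
    }
    return 4u;
}

/* The PC is aligned when the low bits below the alignment are all zero.
   In ARM state this means the two least significant bits are zero. */
bool pc_is_aligned(uint32_t pc, core_state_t state)
{
    return (pc & (pc_alignment_bytes(state) - 1u)) == 0u;
}

/* Sequential instruction fetch: the PC advances by 4 in ARM state
   and by 2 in Thumb state. */
uint32_t next_pc(uint32_t pc, bool thumb_state)
{
    return pc + (thumb_state ? 2u : 4u);
}
```

With these definitions the address 0x8002 is a valid Thumb instruction address but not a valid ARM one.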
Current Program Status Register (CPSR)
CPSR, a 32-bit status register, holds the current state of the ARM core. As shown in Fig 1.4, the
register is divided into four different fields: flags, status, extension and control, each of 8 bits. The
flag field has the bit specification for four condition flags, N, Z, C and V, and is used for arithmetic and
logic instructions.
N (Negative flag): 1 indicates negative result from ALU.
Z (Zero flag): 1 indicates zero result from ALU.
C (Carry flag): 1 indicates ALU operation generated carry.
V (Overflow flag): 1 indicates ALU operation overflowed.
Most of the ARM instructions are conditionally executed. Based on the status of these condition
flags, condition codes are used along with instruction mnemonics to control whether or not the
instruction will be executed. Status and extension fields are reserved for future usage. In the control
field, the least significant five bits are used to save the modes of operation of ARM core. Processor
mode can be changed by directly modifying these control bits. The most significant three bits I, F and
T have significance as below:
• I: 1 indicates IRQ is disabled; 0 indicates IRQ is enabled.
• F: 1 indicates FIQ is disabled; 0 indicates FIQ is enabled.
• T: 1 indicates the Thumb state is active; 0 indicates ARM state is active.
These are processor specific features.
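The CPSR layout can be made concrete by decoding a 32-bit value in C. This is a minimal sketch that assumes the classic ARM bit positions (N, Z, C, V in bits 31 to 28; I, F, T in bits 7 to 5; mode in bits 4 to 0). The struct and function names are invented for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions of the CPSR fields (classic ARM layout). */
#define CPSR_N         (1u << 31)
#define CPSR_Z         (1u << 30)
#define CPSR_C         (1u << 29)
#define CPSR_V         (1u << 28)
#define CPSR_I         (1u << 7)
#define CPSR_F         (1u << 6)
#define CPSR_T         (1u << 5)
#define CPSR_MODE_MASK 0x1Fu

/* Decoded view of the register; the type name is illustrative. */
typedef struct {
    bool negative, zero, carry, overflow;  /* condition flags        */
    bool irq_disabled, fiq_disabled;       /* interrupt mask bits    */
    bool thumb_state;                      /* T bit                  */
    uint8_t mode;                          /* five processor mode bits */
} cpsr_fields_t;

cpsr_fields_t cpsr_decode(uint32_t cpsr)
{
    cpsr_fields_t f;
    f.negative     = (cpsr & CPSR_N) != 0u;
    f.zero         = (cpsr & CPSR_Z) != 0u;
    f.carry        = (cpsr & CPSR_C) != 0u;
    f.overflow     = (cpsr & CPSR_V) != 0u;
    f.irq_disabled = (cpsr & CPSR_I) != 0u;
    f.fiq_disabled = (cpsr & CPSR_F) != 0u;
    f.thumb_state  = (cpsr & CPSR_T) != 0u;
    f.mode         = (uint8_t)(cpsr & CPSR_MODE_MASK);
    return f;
}

/* Condition check of the kind used by an instruction with the EQ suffix:
   it executes only when the Z flag is set. */
bool condition_eq_passes(uint32_t cpsr)
{
    return (cpsr & CPSR_Z) != 0u;
}
```

As a usage example, decoding the value 0x600000D3 gives Z = 1 and C = 1, both interrupts disabled, ARM state, and mode bits 0x13.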
Addressing modes
Addressing mode is the way of addressing data or operand in the instruction. Every processor
instruction set offers different addressing modes to determine the address of operands. Some
fundamental addressing modes used by most of the processors are: register addressing, immediate
addressing, direct addressing and register indirect addressing. In register addressing mode, the
operand is held in a register which is specified in the instruction. In immediate addressing mode, the
operand is held in the instruction. In direct addressing mode, the operand resides in the memory
whose address is specified in the instruction. Similarly in register indirect addressing mode, the
operand is held in the memory whose address resides in a register that is specified in the instruction.
ARM Addressing modes:
• Register Addressing: The operands are in the registers.
MOV R1, R2 // move content of R2 to R1
SUB R0, R1, R2 // subtract content of R2 from R1 and move the result to R0
• Relative Addressing: Address of the memory directly specified in the instruction (encoded as an
offset relative to the program counter).
B subroutine1 // branch to subroutine1
BEQ LOOP // branch to LOOP if previous instruction sets the zero flag, i.e., Z = 1
• Immediate Addressing: Operand2 is an immediate value.
SUB R0, R0, #1 // Save (R0 - 1) to R0
MOV R0, #0xFF00 // Put 0xFF00 to R0
• Register Indirect Addressing: Address of the memory location that holds the operand is
in a register.
LDR R1, [R2] // Load R1 with the data pointed to by register R2
LDR R3, [R2] // Data processing instructions act only on registers, so load the data pointed to by R2 first
ADD R0, R1, R3 // then add R1 to the loaded data and put the result into R0
• Register Offset Addressing: Operand2 is in a register with some offset calculation.
MOV R0, R2, LSL #3 // (R2 << 3), then move to R0
AND R0, R1, R2, LSR R3 // (R2 >> R3), logically AND with R1 and move result to R0
• Register based with Offset Addressing: Effective memory address has to be calculated from a
base address and an offset. Offset can be an immediate offset, register offset or scaled
register offset.
• Pre-Indexed Addressing
LDR R2, [R3, #0x0F] // Immediate offset: take value in R3, add 0x0F, use it as address and
load data from that address to R2
STR R1, [R0, -R2] // Register offset: use (R0-R2) as address of the memory and store
data of R1 to that address
LDR R3, [R1, R2, LSR #1] // Scaled register offset: use (R1 + (R2>>1)) as address and load
the data from that address to R3
• Pre-Indexed with write back, also called auto-indexing with pre-indexed addressing. The ! symbol
indicates that the instruction saves the calculated address in the base address register.
LDR R0, [R1, #4]! // Immediate offset: use (R1+4) as address, load the data from that
address to R0 and update R1 to (R1+4)
STR R1, [R2, R0]! // Register offset: use (R2+R0) as address and store the data from R1
to that address. Update R2 to (R2+R0)
STR R3, [R1, R2, LSL #4]! // Scaled register offset: use (R1 + (R2<<4)) as address and
store the data from R3 to that address. Update R1 to (R1 + (R2<<4))
• Post-Indexed, also called auto-indexing with post-indexed addressing.
LDR R0, [R1], #4 // Immediate offset: load the data pointed to by R1 to R0 and then update R1 to (R1+4)
STR R1, [R3], R4 // Register offset: store the data in R1 to the memory location pointed
to by R3 and then update R3 to (R3+R4)
LDR R2, [R0], -R3, LSR #4 // Scaled register offset: load the data from the address
pointed to by R0 to R2 and then update R0 to (R0 - (R3>>4))
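The difference between pre-indexed, pre-indexed with write back, and post-indexed addressing comes down to which address is used for the memory access and what is left in the base register afterwards. The following C sketch models only that address arithmetic; `index_result_t` and the three function names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Outcome of one load/store address calculation (illustrative type). */
typedef struct {
    uint32_t access_addr; /* address placed on the address bus      */
    uint32_t new_base;    /* base register value after the transfer */
} index_result_t;

/* [Rn, offset]  : access at base + offset, base register unchanged. */
index_result_t pre_indexed(uint32_t base, int32_t offset)
{
    index_result_t r = { base + (uint32_t)offset, base };
    return r;
}

/* [Rn, offset]! : access at base + offset, then write that address back. */
index_result_t pre_indexed_writeback(uint32_t base, int32_t offset)
{
    index_result_t r = { base + (uint32_t)offset, base + (uint32_t)offset };
    return r;
}

/* [Rn], offset  : access at base, then update base to base + offset. */
index_result_t post_indexed(uint32_t base, int32_t offset)
{
    index_result_t r = { base, base + (uint32_t)offset };
    return r;
}
```

For a scaled register offset, the offset argument is simply the shifted register value. For example, `[R1, R2, LSL #4]!` with R1 = 0x1000 and R2 = 2 corresponds to `pre_indexed_writeback(0x1000, 2 << 4)`.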
ARM Instruction Set
In any processor architecture, an instruction includes an opcode that specifies the operation to
perform, such as add contents of two registers or move data from a register to memory etc, with
specified operands, which may specify registers, memory locations, or immediate data. Instruction
set of a processor gives information about the instructions, addressing modes and the timing
requirement for the execution of each instruction. The instruction set is always specified by the
processor designer. Every processor implements its instruction set in the architecture. ARM Ltd
being the processor core designer and not the silicon manufacturer, it defines the instruction set to
be implemented by the chip manufacturers.
ARM architecture has two instruction sets: the ARM instruction set and the Thumb instruction set. In
ARM instruction set, all instructions are 32 bits wide and are aligned at 4-byte boundaries in
memory. On the other hand, in Thumb instruction set, all instructions are 16 bits wide and are
aligned at even or two-byte boundaries in memory.
The important features of the ARM and Thumb instruction set are:
o Most of the instructions are executed in one cycle.
o Load/Store architecture for accessing data from external memory with powerful
auto-indexing addressing modes.
o Inclusion of load and store multiple register instructions.
o 3-address instructions: two source operand registers and the result register are all
distinctly specified.
o Data processing instructions act only on registers.
o Every instruction can be conditionally executed which improves the performance
and code density by reducing the number of branch instructions.
o The ability to execute a barrel shift operation and an ALU operation of a single
complex instruction in a single clock cycle.
o Inclusion of advanced DSP instructions in the ARM instruction set for the multiply
and accumulate (MAC) unit replaces the need of separate digital signal processor.
o Implementation of coprocessor instruction set with extension of the programming
o The Thumb instruction set is 16-bit compressed representation of the ARM
instructions that provides high code density.
ARM Instructions can be categorized into following broad classes:
1. Data movement instructions
2. Data Processing Instructions
o Arithmetic/logic Instructions
o Barrel shifting instructions
o Comparison Instructions
o Multiply Instructions
3. Branch Instructions
4. Load and store Instructions
o Load and Store register instruction
o Load and Store multiple register instructions
o Stack instructions
o Swap register and memory content
5. Program Status register Instructions
o Set the values of the conditional code flag
o Set the values of the interrupt enable bit
o Set the processor mode
6. Exception generating Instructions
o Software Interrupt Instruction
o Software Break Point instruction
Overview of Microcontroller and Embedded Systems
3.1 Embedded hardware and various building blocks:-
Fig. 1 Components of Embedded system hardware
Fig. 2 Various Building blocks of embedded system
3.2 Processor Selection for an Embedded System:-
3.2.1. Microcontroller Selection:
3.3 Interfacing Processor, Memories and I/O Devices:-
3.4. Timer & Counting Devices:- Most embedded systems need a timing device.
Timing Device:
Counting Device:
Timer cum Counting Device:
Uses of Timer Devices:
States in a Timer:
Ten Forms of a Timer:
Variables for control bits and status in a software timer:
3.5. Serial Communication and advanced I/O:-
I/O Types & Examples:
Serial Bus Communication Protocols:-
3.6 Buses between the Networked multiple Devices:-
3.7 Embedded System Design and Co-Design Issues in System Development Process:-
3.8 Design Cycle in the Development Phase for an Embedded System:-
3.9. Uses of Target System or its Emulator and In-Circuit Emulator:-
3.10 Use of software tools for Development of an Embedded System:-
3.11 Design Metrics of Embedded Systems:-
This unit covers the I/O pin configurations of the TM4C123 microcontrollers. The regular function of a
pin is to perform parallel I/O. Most of the pins have an alternative function. Joint Test Action Group
(JTAG) is a standard test access port used to program and debug the microcontroller board. Each
microcontroller uses five port pins for the JTAG interface.
I/O pins on Tiva microcontrollers have a wide range of alternative functions:
UART: Universal asynchronous receiver/transmitter
SSI: Synchronous serial interface
I2C: Inter-integrated circuit
Timer: Periodic interrupts, input capture, and output compare
PWM: Pulse width modulation
ADC: Analog to digital converter, measure analog signals
Analog Comparator: Compare two analog signals
QEI: Quadrature encoder interface
USB: Universal serial bus
Ethernet: High-speed network
CAN: Controller area network
The UART can be used for serial communication between computers. It is asynchronous and allows for
simultaneous communication in both directions.
The SSI is alternately called serial peripheral interface (SPI). It is used to interface medium-speed I/O
I2C is a simple I/O bus that we will use to interface low speed peripheral devices. Input capture and output
compare will be used to create periodic interrupts and measure period, pulse width, phase, and frequency.
PWM outputs will be used to apply variable power to motor interfaces. In a typical motor controller, input
capture measures rotational speed, and PWM controls power. A PWM output can also be used to create a
The ADC will be used to measure the amplitude of analog signals and will be important in data acquisition
systems. The analog comparator takes two analog inputs and produces a digital output depending on which
analog input is greater.
The QEI can be used to interface a brushless DC motor. USB is a high-speed serial communication
The Ethernet port can be used to bridge the microcontroller to the Internet or a local area network.
The CAN creates a high-speed communication channel between microcontrollers and is commonly found
in automotive and other distributed control applications.
4.1 Tiva TM4C123 LaunchPad I/O pins
Pins on the TM4C family can be assigned to as many as eight different I/O functions. Pins can be
configured for digital I/O, analog input, timer I/O, or serial I/O. For example PA0 can be digital I/O or
serial input. There are two buses used for I/O. The digital I/O ports are connected to both the advanced
peripheral bus and the advanced high-performance bus. Because of the multiple buses, the microcontroller
can perform I/O bus cycles simultaneously with instruction fetches from flash ROM. The
TM4C123GH6PM adds up to 16 PWM outputs. There are 43 I/O lines. There are twelve ADC inputs;
each ADC can convert up to 1M samples per second. Table 4.1 lists the regular and alternate names of
the port pins.
Figure : I/O port pins for the TM4C123GH6PM microcontrollers.
Each pin has one configuration bit in the GPIOAMSEL register. We set this bit to connect the port pin to
the ADC or analog comparator. For digital functions, each pin also has four bits in the GPIOPCTL register,
which we set to specify the alternative function for that pin (0 means regular I/O port). Not every pin can
be connected to every alternative function.
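The AMSEL bit and the 4-bit PCTL field for a pin can be computed with ordinary shift and mask operations. The sketch below runs on a PC against a plain struct that stands in for one port's registers; `gpio_mux_regs_t`, `gpio_select_function` and `gpio_select_analog` are hypothetical names, and on real hardware the same arithmetic would be applied to the memory-mapped GPIOAMSEL and GPIOPCTL registers of the port. Which function number is valid for which pin must still be taken from the pin table.

```c
#include <stdint.h>

/* Stand-in for the two mux-related registers of one GPIO port.
   This struct is an assumption made so the sketch can run on a PC. */
typedef struct {
    uint32_t amsel; /* one bit per pin: 1 connects the pin to the analog block */
    uint32_t pctl;  /* four bits per pin: 0 means regular I/O port             */
} gpio_mux_regs_t;

/* Select a digital function (0..15) for pin 0..7 of the port. */
void gpio_select_function(gpio_mux_regs_t *port, unsigned pin, uint32_t function)
{
    unsigned shift = 4u * pin;                      /* start of the pin's 4-bit field */
    port->pctl = (port->pctl & ~(0xFu << shift))    /* clear the old field            */
               | ((function & 0xFu) << shift);      /* insert the new function number */
    port->amsel &= ~(1u << pin);                    /* digital use: analog path off   */
}

/* Connect pin 0..7 to the ADC or analog comparator. */
void gpio_select_analog(gpio_mux_regs_t *port, unsigned pin)
{
    port->amsel |= (1u << pin);
}
```

Only the field belonging to the chosen pin is modified; the settings of the other seven pins on the port are preserved.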
Pins PC3 – PC0 were left off Table 4.1 because these four pins are reserved for the JTAG debugger and
should not be used for regular I/O. Notice, most alternate function modules (e.g., U0Rx) only exist on one
pin (PA0). While other functions could be mapped to two or three pins (e.g., CAN0Rx could be mapped
to one of the following: PB4, PE4, or PF0.)
The microcontroller board provides an integrated In-Circuit Debug Interface (ICDI), which allows
programming and debugging of the onboard TM4C123 microcontroller. One USB cable is used by the
debugger (ICDI), and the other USB allows the user to develop USB applications (device). The user can
select board power to come from either the debugger (ICDI) or the USB device (device) by setting the
Power selection switch.
Pins PA1 – PA0 create a serial port, which is linked through the debugger cable to the PC. The serial link
is a physical UART as seen by the TM4C and mapped to a virtual COM port on the PC. The USB device
interface uses PD4 and PD5. The JTAG debugger requires pins PC3 – PC0. The LaunchPad connects PB6
to PD0, and PB7 to PD1. If you wish to use both PB6 and PD0 you will need to remove the R9 resistor.
Similarly, to use both PB7 and PD1 remove the R10 resistor.
Figure: Tiva LaunchPad based on the TM4C123GH6PM.
The Tiva LaunchPad evaluation board has two switches and one 3-color LED. See Figure 4.3. The
switches are negative logic and will require activation of the internal pull-up resistors. In particular, you
will set bits 0 and 4 in GPIO_PORTF_PUR_R register. The LED interfaces on PF3 – PF1 are positive
logic. To use the LED, make the PF3 – PF1 pins an output. To activate the red color, output a one to PF1.
The blue color is on PF2, and the green color is controlled by PF3. The 0-Ω resistors (R1, R2, R11, R12,
R13, R25, and R29) can be removed to disconnect the corresponding pin from the external hardware.
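The bit patterns for the LaunchPad switches and LED follow directly from the pin assignments given above. The sketch below only computes and interprets those patterns on a PC; the macro and function names are made up for this example, and on the board the values would be written to or read from the Port F registers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Port F bit masks taken from the pin assignments in the text. */
#define PF0_SWITCH  0x01u  /* switch on PF0, negative logic */
#define PF1_RED     0x02u  /* red LED, positive logic       */
#define PF2_BLUE    0x04u  /* blue LED, positive logic      */
#define PF3_GREEN   0x08u  /* green LED, positive logic     */
#define PF4_SWITCH  0x10u  /* switch on PF4, negative logic */

/* Value for the pull-up register: bits 0 and 4 set. */
#define PORTF_PUR_SWITCHES (PF0_SWITCH | PF4_SWITCH)
/* Value for the direction register: PF3 - PF1 are outputs. */
#define PORTF_DIR_LEDS     (PF1_RED | PF2_BLUE | PF3_GREEN)

/* Negative logic: a pressed switch pulls its pin low, so the bit reads 0. */
bool switch_pressed(uint32_t port_f_data, uint32_t switch_mask)
{
    return (port_f_data & switch_mask) == 0u;
}

/* Positive logic: writing a one turns the corresponding color on. */
uint32_t led_bits(bool red, bool blue, bool green)
{
    return (red ? PF1_RED : 0u) | (blue ? PF2_BLUE : 0u) | (green ? PF3_GREEN : 0u);
}
```

For example, `led_bits(true, false, false)` gives 0x02, the value that turns on only the red color.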
The LaunchPad has four 10-pin connectors, labeled as J1 J2 J3 J4 in Figures 4.2 and 4.4, to which you
can attach your external signals. The top side of these connectors has male pins, and the bottom side has
female sockets.
Figure 4.3. Switch and LED interfaces on the Tiva LaunchPad Evaluation Board. The zero ohm resistors can be removed so the
corresponding pin can be used for its regular purpose.
4.2 GPIO
GPIO stands for General Purpose Input/Output, a module capable of receiving and transmitting
signals. GPIO pins work with digital signals, but the pins can be multiplexed with other
peripheral functions (ADC, SSI, UART, etc).
Tiva GPIOs
The TM4C123GH6PM has 6 GPIO blocks, each with its own GPIO port (Port A, Port B, Port C, Port D,
Port E, Port F).
• Up to 43 GPIOs, depending on configuration
• Highly flexible pin muxing allows use as GPIO or one of several peripheral functions
• 5-V-tolerant in input configuration
• Ports A-G accessed through the Advanced Peripheral Bus (APB)
• Fast toggle capable of a change every clock cycle for ports on AHB, every two clock cycles
for ports on APB
• Programmable control for GPIO interrupt
o Interrupt generation masking
o Edge-triggered on rising, falling, or both
o Level-sensitive on High or Low values
• Bit masking in both read and write operations through address lines
• Can be used to initiate an ADC sample sequence or a μDMA transfer
• Pin state can be retained during Hibernation mode
• Pins configured as digital inputs are Schmitt-triggered
• Programmable control for GPIO pad configuration
o Weak pull-up or pull-down resistors
o 2-mA, 4-mA, and 8-mA pad drive for digital communication; up to four pads can sink
18-mA for high-current applications
o Slew rate control for 8-mA pad drive
o Open drain enables
o Digital input enables
Note that PD4, PD5, PB0 and PB1 aren't 5V tolerant and are maxed at a 3.6V input.
Each GPIO port has 8 pins, which should make a total of 48 pins, but some of those are internal and
can't be used, so the maximum is 43. The LaunchPads usually have fewer since some are
not physically available. The TM4C123 LaunchPad has just 37 GPIO pins.
Alternate functions
The GPIO allows digital inputs or outputs and also allows alternate functions. The alternate functions
can be analog readings by muxing the ADC to a pin, or UART communication by making the right
Very Important GPIO Pins With Special Considerations
Some pins are locked to a certain configuration and can only be used if you unlock them. You need to
unlock the port through the GPIOLOCK register and then commit the change by setting the pin's bit in
the GPIOCR register. If you use TivaWare this should work, just choose the right base:
TM4C123 GPIO Programming
The TI LaunchPad uses the TM4C123GH6PM microcontroller, which has 256K bytes (256KB) of
on-chip Flash memory for code, 32KB of on-chip SRAM for data, and a large number of on-chip
The ARM Cortex-M4 has 4GB (Giga bytes) of memory space. It uses memory mapped I/O, which
means that the I/O peripheral ports are mapped into the 4GB memory space.
Region                Allocated size   Allocated address
Flash                 256 KB           0x0000.0000 to 0x0003.FFFF
SRAM                  32 KB            0x2000.0000 to 0x2000.7FFF
All the peripherals                    0x4000.0000 to 0x400F.FFFF
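The address ranges listed above can be turned into a small lookup. The following is a minimal sketch, assuming only the three ranges given here; the type and function names are illustrative and are not part of any TI library.

```c
#include <stdint.h>

/* Illustrative region names; only the ranges quoted in the text are covered. */
typedef enum {
    REGION_FLASH,
    REGION_SRAM,
    REGION_PERIPHERAL,
    REGION_OTHER
} region_t;

/* Report which part of the memory map a 32-bit address falls into. */
region_t classify_address(uint32_t addr)
{
    if (addr <= 0x0003FFFFu) {
        return REGION_FLASH;        /* 0x0000.0000 to 0x0003.FFFF */
    }
    if (addr >= 0x20000000u && addr <= 0x20007FFFu) {
        return REGION_SRAM;         /* 0x2000.0000 to 0x2000.7FFF */
    }
    if (addr >= 0x40000000u && addr <= 0x400FFFFFu) {
        return REGION_PERIPHERAL;   /* 0x4000.0000 to 0x400F.FFFF */
    }
    return REGION_OTHER;            /* anything not listed above  */
}
```

For instance, the Port F base address 0x4002.5000 falls in the peripheral range, while 0x2000.8000 lies just past the end of the SRAM.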
The General Purpose I/O ports (GPIO) on the TM4C123GXL LaunchPad are designated port A to port F.
The address range assigned to each GPIO port is shown as follows:
Port A: 0x4000.4000 to 0x4000.4FFF
Port B: 0x4000.5000 to 0x4000.5FFF
Port C: 0x4000.6000 to 0x4000.6FFF
Port D: 0x4000.7000 to 0x4000.7FFF
Port E: 0x4002.4000 to 0x4002.4FFF
Port F: 0x4002.5000 to 0x4002.5FFF
A 4 KB block of memory space is assigned to each GPIO port. The reason is that each port has a
large number of special function registers associated with it, and furthermore the GPIO Data Register
supports bit-specific addressing, which allows collective access to 1 to 8 bits in a data port.
To initialize an I/O port for general use seven steps need to be performed.
1. Activate the clock for the port in the Run Mode Clock Gating Control Register 2 (RCGC2).
2. Unlock the port (LOCK = 0x4C4F434B). This step is only needed for pins PC0 to PC3, PD7 and PF0 on the TM4C123GXL LaunchPad.
3. Disable the analog function of the pin in the Analog Mode Select register (AMSEL), because we
want to use the pin for digital I/O. If the pin were connected to the ADC or analog comparator, its
corresponding bit in AMSEL would have to be set to 1. In our case, the pin is used as digital I/O, so its
corresponding bit must be cleared to 0.
4. Clear bits in the port control register (PCTL) to select regular digital function. Each GPIO pin
needs four bits in its corresponding PCTL register. Not every pin can be configured to every
alternative function. Figure 2.2 shows which pin can be used as what kind of alternate functions.
5. Set its direction register (DIR). A DIR bit of 0 means input, and 1 means output.
6. Clear bits in the alternate Function Select register (AFSEL).
7. Enable digital port in the Digital Enable register (DEN).
We need to add a short delay between activating the clock and setting the port registers.
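The seven steps can be sketched in C as follows. This is a host-side illustration, not device code: the gpio_port_t structure is a hypothetical stand-in whose fields are ordinary variables (on the real chip the registers sit at fixed offsets from the port base), and the clock-gating register is passed in as a pointer. The function name and parameters are assumptions made for this example.

```c
#include <stdint.h>

#define GPIO_LOCK_KEY 0x4C4F434Bu   /* unlock value quoted in step 2 */

/* Hypothetical stand-in for one port's configuration registers. */
typedef struct {
    volatile uint32_t lock, cr, amsel, pctl, dir, afsel, den;
} gpio_port_t;

/* Configure the pins selected by `pins` (bit n = pin n) as digital I/O.
 * Bits set in `outputs` become outputs; the other selected pins are inputs. */
void gpio_init_digital(volatile uint32_t *clock_gate, uint32_t port_bit,
                       gpio_port_t *port, uint8_t pins, uint8_t outputs,
                       int needs_unlock)
{
    *clock_gate |= port_bit;                    /* 1. enable the port clock  */
    for (volatile int i = 0; i < 4; i++) { }    /*    short settling delay   */

    if (needs_unlock) {                         /* 2. unlock, then commit    */
        port->lock = GPIO_LOCK_KEY;
        port->cr  |= pins;
    }
    port->amsel &= ~(uint32_t)pins;             /* 3. analog function off    */
    for (int n = 0; n < 8; n++) {               /* 4. clear 4-bit PCTL field */
        if (pins & (1u << n)) {
            port->pctl &= ~(0xFu << (4 * n));
        }
    }
    port->dir = (port->dir & ~(uint32_t)pins)   /* 5. 0 = input, 1 = output  */
              | (uint32_t)(outputs & pins);
    port->afsel &= ~(uint32_t)pins;             /* 6. regular GPIO function  */
    port->den   |= pins;                        /* 7. digital enable         */
}
```

Calling it with pins = 0x0E and outputs = 0x0E configures pins 1 to 3 as digital outputs and leaves the other pins of the port untouched.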
Figure: registers used to configure GPIO
Figure: – PMCx bits in the GPIOPCTL register on the TM4C specify alternate functions. PD4 and PD5 are
hardwired to the USB device. PA0 and PA1 are hardwired to the serial port
The GPIO Data Register is located at offset 0x000 from the base address of its port. As
mentioned before, the data register supports bit-specific addressing. For a write to change a pin,
the corresponding bit in the mask, taken from address bus bits [9:2], must be set. Otherwise, the
bit value remains unchanged by the write.
For example, writing to address 0x40004038 changes only bits 1, 2 and 3 of port A. The base
address of port A is 0x40004000, so the offset is 0x038 (0000111000 in binary). Address bits [9:2]
are therefore 00001110, which selects data bits 3, 2 and 1.
The following table helps you calculate the offset address for the bits of a port that you want to access.
If we want to access bit   Offset constant
7                          0x200
6                          0x100
5                          0x080
4                          0x040
3                          0x020
2                          0x010
1                          0x008
0                          0x004
To read and write all 8 bits of a port, we sum all 8 offset constants,
which gives the offset address 0x3FC (001111111100 in binary).
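The offset calculation amounts to shifting the 8-bit pin mask left by two places, so that it lands on address bits [9:2]. A minimal sketch follows; the function name is an assumption made for this example.

```c
#include <stdint.h>

/* Address of the masked view of a port's data register.
 * pin_mask has bit n set for every pin n that the access should touch. */
uint32_t gpio_masked_data_addr(uint32_t port_base, uint8_t pin_mask)
{
    return port_base + ((uint32_t)pin_mask << 2);
}
```

With the port A base 0x40004000, a mask of 0x0E (bits 1, 2 and 3) gives 0x40004038, and a mask of 0xFF gives 0x400043FC, matching the values in the text.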
4.3 Peripheral and Memory Address
A 32-bit processor can have 4 GB (= 2^32 bytes) of address space. How this address space is
divided between memory and peripherals depends on the architecture of the CPU.
Peripheral Addressing
There are two complementary methods of addressing I/O devices for input and output between CPU and
peripheral. These are known as memory mapped I/O (MMIO) and port mapped I/O (PMIO).
In MMIO, the same address bus is used to address both memory and peripheral devices. The address bus of
the CPU is shared between the peripheral devices and memory devices attached to the CPU. Thus, any
address accessed by the CPU may denote a location in memory or a register of an attached peripheral.
In these architectures, the same CPU instructions used for memory access can also be used for I/O access.
In PMIO, peripheral devices have an address space separate from that of general memory devices. This is
accomplished in most architectures by providing a separate address bus dedicated to the peripheral devices
attached to the CPU. In these CPUs, the instruction set includes separate instructions to perform I/O.
A TM4C123GH6PM chip employs MMIO, which means that the peripherals are mapped into the 32-bit
address space.
4.4 Memory Mapped Peripherals
A TM4C123GH6PM chip contains 256 KB of Flash memory and 32 KB of SRAM. Table 5 shows
the memory map of a TM4C123GH6PM chip with addresses.
Flash Memory
Flash memory is organized as multiple 1 KB blocks that can be individually written and
erased. Flash memory is used to store program code. Constant data used in a program can also be
stored in this memory. Lookup tables, which many designs use to improve performance, are
stored in this memory as well.
Table: Memory Mapping in TM4C123GH6PM Chip
The on-chip SRAM starts at address 0x2000.0000 of the device memory map. To reduce the number of
read-modify-write (RMW) operations, ARM provides a technique called bit-banding. It aliases
each bit of the SRAM and peripheral regions to its own word address, so that an individual bit can be
accessed in a single atomic operation. For SRAM, the bit-band base is located at address
0x2200.0000. The bit-band alias is computed according to the following formula:
bit-band alias = bit-band base + (byte offset × 32) + (bit number × 4)    (2.1)
Note: Bit-banding is a technique for reading and modifying individual bits in a register or memory
word. It lets a read-modify-write sequence be completed as a single atomic access.
The region of memory that holds the bits to be modified is known as the bit-band region, and the
region of memory onto which the device maps those bits is known as the bit-band alias.
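Formula (2.1) can be written as a short C function. This is a sketch for the SRAM region only, using the two base addresses given in the text; the macro and function names are assumptions made for this example.

```c
#include <stdint.h>

#define SRAM_BASE          0x20000000u  /* start of on-chip SRAM (bit-band region) */
#define SRAM_BITBAND_BASE  0x22000000u  /* start of the SRAM bit-band alias        */

/* Alias word address for bit `bit` (0 to 7) of the SRAM byte at `byte_addr`. */
uint32_t sram_bitband_alias(uint32_t byte_addr, unsigned bit)
{
    uint32_t byte_offset = byte_addr - SRAM_BASE;
    return SRAM_BITBAND_BASE + byte_offset * 32u + bit * 4u;
}
```

As a worked case, bit 3 of the byte at 0x2000.1000 has byte offset 0x1000, so its alias is 0x2200.0000 + 0x1000 × 32 + 3 × 4 = 0x2202.000C.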
The SRAM is implemented using two 32-bit wide SRAM banks (separate SRAM arrays). The banks are
partitioned so that one bank contains all even words (the even bank) and the other contains all
odd words (the odd bank). A write access that is followed immediately by a read access to the same
bank incurs a stall of a single clock cycle.
Internal ROM
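The internal ROM of the TM4C123GH6PM device is located at address 0x0100.0000 of the device
memory map. The ROM contains:
- TivaWare boot loader and vector table
- TivaWare Peripheral Driver Library (DriverLib) release for product-specific peripherals and interfaces
- Advanced Encryption Standard (AES) cryptography tables
- Cyclic Redundancy Check (CRC) error detection functionality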
The boot loader is used as an initial program loader (when the Flash memory is empty) as well as an
application-initiated firmware upgrade mechanism (by calling back to the boot loader). The Peripheral
Driver Library, APIs in ROM can be called by applications, reducing flash memory requirements and
freeing the Flash memory to be used for other purposes (such as additional features in the application).
The Advanced Encryption Standard (AES) is a publicly defined encryption standard used by the U.S.
Government, and Cyclic Redundancy Check (CRC) is a technique to validate whether a block of data has the
same contents as when previously checked.
All peripheral devices, timers, and ADCs are mapped as MMIO in the address space 0x40000000 to
0x400FFFFF. Since the number of supported peripherals differs among ICs of the ARM families, the
upper limit of 0x400FFFFF varies from device to device.
Memory Layout in Tiva™ LaunchPad
To observe the memory layout of the TM4C123GH6PM, users can run an experiment on the board with the
simple program outlined below. The program lights the green LED.
Fig : Flowchart to glow onboard LED
Pseudo code:
Start: Set clock (division| PLL| 16 Mhz| main OSC)
Configure the pins (Pin 1, 2, 3)
Output: Toggle the led (Pin1, 2, 3)
Delay generation (in nanoseconds)
Run infinite
Once this code is compiled, under workspace, if we expand <the project>/Debug, we can see the
memory map file.
4.5 Watchdog Timer
Every CPU has a system clock that paces instruction execution. On each step, the processor fetches
the instruction addressed by the program counter from the flash memory of the microcontroller and
executes it. Instructions are executed sequentially. A remotely installed system may nevertheless
freeze or run into an unplanned situation that traps it in an infinite loop. In such situations, a
system reset or the execution of an interrupt subroutine remains the only option. A watchdog timer
provides a solution to this.
A watchdog timer is a counter that times out (a counter lapse) when it reaches a certain count. Under normal
operation, the program running on the system periodically resets the watchdog timer. When the system
enters an infinite loop or stops responding, it fails to reset the watchdog timer. In due time, the watchdog
timer times out. This timeout triggers a reset signal to the system or calls an interrupt
service routine (ISR).
Fig : Operation of Watchdog Timer
The TM4C123GH6PM microcontroller has two Watchdog Timer modules. Watchdog Timer 0 is clocked by the
system clock, and Watchdog Timer 1 is clocked by the PIOSC and therefore
requires synchronizers.
Features of Watchdog Timer in TM4C123GH6PM controller:
- 32-bit down counter with a programmable load register
- Lock register protection from runaway software
- User-enabled stalling when the microcontroller asserts the CPU halt flag during debug
The watchdog timer can be configured to generate an interrupt to the controller on its first time out, and
to generate a reset signal on its second time-out. Once the watchdog timer has been configured, the lock
register can be written to prevent the timer configuration from being inadvertently altered.
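The interrupt-then-reset behaviour described above can be modelled in a few lines of C. This is a simplified software model for running on a PC, assuming a plain down counter that reloads on time-out; it does not use the actual watchdog registers, and all names are illustrative.

```c
#include <stdint.h>

typedef enum { WDT_OK, WDT_INTERRUPT, WDT_RESET } wdt_event_t;

/* Hypothetical model of a watchdog, not the hardware register layout. */
typedef struct {
    uint32_t load;       /* value reloaded into the counter         */
    uint32_t count;      /* current value of the down counter       */
    int      timed_out;  /* set once the first time-out has occurred */
} wdt_t;

void wdt_start(wdt_t *w, uint32_t load)
{
    w->load = load;
    w->count = load;
    w->timed_out = 0;
}

/* Called by healthy software to show that it is still running. */
void wdt_service(wdt_t *w)
{
    w->count = w->load;
    w->timed_out = 0;
}

/* Advance the model by one watchdog clock tick. */
wdt_event_t wdt_tick(wdt_t *w)
{
    if (w->count > 0) {
        w->count--;
    }
    if (w->count != 0) {
        return WDT_OK;
    }
    w->count = w->load;              /* reload after a time-out */
    if (!w->timed_out) {
        w->timed_out = 1;
        return WDT_INTERRUPT;        /* first time-out          */
    }
    return WDT_RESET;                /* second time-out         */
}
```

If the application calls wdt_service() more often than every `load` ticks, wdt_tick() keeps returning WDT_OK. If it stops doing so, the first time-out is reported as an interrupt and the next one as a reset.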
4.6 Low Power Microcontroller
Need for Low Power Microcontroller
It is imperative for an embedded design to keep its power consumption low. Most embedded systems
and devices run on battery. Power demands are increasing rapidly, but battery capacity cannot keep
pace. Therefore, a microcontroller that inherently consumes very little power is always
desirable. However, embedded systems engineers usually need to trade off power against
performance, because lower power consumption generally comes at the cost of performance.
Let us consider an example where we are to design a system to monitor water level in a tank. When the
water level reduces below a particular level, water should be pumped in. There are many ways to go
about this design.
Hibernation Module on Tiva™ Microcontrollers
This module manages the removal and restoration of power to the microcontroller and its associated peripherals.
This provides a means for reducing system power consumption. When the processor and peripherals are
idle, power can be completely removed, leaving only the Hibernation module powered.
Fig : Block diagram of Hibernation module
To achieve this, the Hibernation (HiB) Module is added with following features:
(i) A Real-Time Clock (RTC) to be used for wake events
(ii) A battery-backed SRAM for storing and restoring processor state. The SRAM consists of 16 32-bit words.
The RTC consists of a 32-bit seconds counter and a 15-bit sub-seconds counter. It also has an add-in trim capability
for precision control over time. The microcontroller has a dedicated pin for waking using an external signal.
The RTC and the SRAM are operational only if there is a valid battery voltage. There is a VDD3ON mode,
which retains the GPIO pin state during hibernation of the device.
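To illustrate how the two RTC counters combine into one time value, here is a minimal sketch that converts a seconds count and a 15-bit sub-seconds count to milliseconds. It assumes that one sub-seconds count equals 1/32768 of a second (a 32.768 kHz clock), and the function name is hypothetical.

```c
#include <stdint.h>

/* Convert an RTC reading to milliseconds.
 * Assumption: each sub-seconds count is 1/32768 s. */
uint64_t rtc_to_milliseconds(uint32_t seconds, uint16_t subseconds)
{
    uint32_t ss = subseconds & 0x7FFFu;              /* keep the 15 valid bits */
    return (uint64_t)seconds * 1000u + (ss * 1000u) / 32768u;
}
```

A sub-seconds value of 16384 is half of 32768, so a reading of 1 second and 16384 counts converts to 1500 ms.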
Thus, in the lowest power mode, we are actually shutting off the power to the device. Under such
circumstances, it is safe to assume that on wake-up we are actually coming out of reset. This mode nevertheless
allows the device to keep the GPIO pins in their state without resetting them. A power control
mechanism is used to shut down the part. In the TM4C123GH6PM we have an on-chip power controller which
controls power for the CPU only. There is also a pin output from the microcontroller which is used for
system power control.
It should be duly noted that in TIVA Launchpad, the battery voltage is directly connected to the
processor voltage and it is always valid. But in a custom design with TM4C123GH6PM microcontroller
running on a battery, if the battery voltage is not valid, it will not go into hibernation mode.
The Hibernation module of TM4C123GH6PM provides two mechanisms for power control:
Table : Power Modes of Tiva
The second mechanism controls the power to the microcontroller with a control signal (HIB)
that signals an external voltage regulator to turn on or off.
The Hibernation module power source is determined dynamically. The supply voltage of the Hibernation
module is the larger of the main voltage source (VDD) or the battery voltage source (VBAT).
Hibernate mode can be entered in one of two ways:
- The user initiates hibernation by setting the HIBREQ bit in the Hibernation Control (HIBCTL) register
- Power is arbitrarily removed from VDD while a valid VBAT is applied
Power Modes
There are six power modes in which TM4C123GH6PM operates as shown in the below table. They are
Run, Sleep, Deep Sleep, Hibernate with VDD3ON, Hibernate with RTC, and Hibernate without RTC. To
understand all these modes and compare them, it is necessary to analyze them under a condition. Let us
consider that the device is operating at 40 MHz system clock with PLL.
Programming Hibernation Module
This code can be compiled and executed on a TIVA Launchpad. When this code executes, the GREEN
LED glows continuously. We can observe that after 4s, the system automatically goes into sleep and the
LED stops glowing. When SW2 (switch on the right hand bottom corner of the Launchpad) is pressed, it
triggers a wake event and the GREEN LED starts glowing again. Now, after 4s, the system goes to sleep
again. This shows that, the wakeup process is the same as powering up. When the code starts, we can
determine that the processor woke from hibernation and restore the processor state from the memory.
Fig : Flowchart for programming hibernation module
4.7 Interrupts
The reader is aware that a microprocessor is connected to several input and output devices. It is important
at this point for us to know how a microprocessor manages these devices efficiently.
Introduction to Interrupts and Polling
A microprocessor executes instructions sequentially. Alongside, it is also connected to several devices.
Dataflow between these devices and the microprocessor has to be managed effectively. There are two
ways it is done in a microprocessor: either by using interrupts or by using polling.
Polling is a simple method of I/O access. In this method, the microcontroller continuously probes whether
the device requires attention, i.e. if there is data to be exchanged. A polling function or subroutine is called
repeatedly while a program is being executed. When the status of the device being polled responds to the
interrogation, a data exchange is initiated. The polling subroutine consumes processing time from the
presently executing task. This is very inefficient because I/O devices do not always need
attention from the microprocessor, yet the microprocessor wastes valuable processing time
polling them unnecessarily.
Optimizing for low power in embedded MCU designs
Low-power embedded design is motivated by the need to run applications for as long as possible while
consuming minimum power. In a battery-powered system, this need is magnified. Furthermore, low
power implies lower cost of operation and smaller battery size to make applications more mobile. When
energy comes at a premium, as it does with today’s green initiatives, ensuring that an embedded design
consumes as little energy as possible is important even for wall-powered applications.
Designing power-efficient applications also ensures less overhead to manage thermal dissipation, and
heat generation is controlled at the source by optimizing the power consumed. Given these advantages,
embedded systems engineers can no longer ignore the problem of optimizing power. This article will
focus on the major factors contributing to power consumption in an embedded system by analyzing the
various power modes which most microcontrollers offer. Then we will analyze a real-life example of an
embedded application in terms of power consumption and how its efficiency can be maximized.
MCU Power Consumption
There are several points to be aware of when selecting an MCU or external components. The overall
power consumption of an MCU is defined by its power consumption in different modes, typically active
and standby (which includes sleep, hibernate, etc.), and taking into account the power consumed to
transition from one mode to another.
Active power consumption by an MCU is the power consumed when the MCU is running. As almost all
controllers are based upon CMOS logic, power is consumed primarily during the switching of
transistors. As a starting point, we will analyze the power consumption of a CMOS inverter, which is
the basic building block of any CMOS design.
Figure: CMOS inverter
CMOS circuits dissipate power by charging the various load capacitances whenever they are switched.
When considering internal architectures, this is mainly the gate capacitance but there are drain and source
capacitances too. Power is dissipated across the PMOS transistor while the load capacitor is being charged
and across the NMOS when the load capacitor is being discharged. The instantaneous power dissipation across
the PMOS is:
$P_{PMOS,i} = i_L (V_{dd} - V_o)$
After substituting the value of $i_L = C_L \, dV_o/dt$:
$P_{PMOS,i} = C_L (V_{dd} - V_o) \, dV_o/dt$
The total energy dissipated across the PMOS to switch the output from low to high can be found by
integrating this power while the load capacitor charges from 0 V to $V_{dd}$:
$E_{PMOS} = \int_0^{V_{dd}} C_L (V_{dd} - V_o) \, dV_o = \tfrac{1}{2} C_L V_{dd}^2$
Similarly, to switch the output from high to low, the total energy dissipated across the NMOS is:
$E_{NMOS} = \tfrac{1}{2} C_L V_{dd}^2$
For one switching cycle, then, the energy dissipated is:
$E_{Total} = E_{NMOS} + E_{PMOS} = C_L V_{dd}^2$
If we define the average power in terms of the switching frequency ($f$), we get:
$P = f C_L V_{dd}^2$
From this equation we can see that power consumption depends upon the switching frequency, load
capacitance, and supply voltage. Load capacitance is determined by the technology parameters and the
design layout, and is therefore beyond the control of the embedded system designer. However, the other
two factors – switching frequency and supply voltage – are factors a system designer can modify to
impact power efficiency for a given microcontroller. Of course, the value of these parameters is also
heavily dependent on the application of the design.
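The dependence on frequency and supply voltage is easy to explore numerically. The sketch below evaluates the average dynamic power, frequency × load capacitance × supply voltage squared. The function name and the example values (16 MHz, 10 pF, 3.3 V) are assumptions chosen only for illustration.

```c
/* Average dynamic power in watts:
 * switching frequency (Hz) x load capacitance (F) x supply voltage (V) squared. */
double dynamic_power_w(double freq_hz, double load_farads, double vdd_volts)
{
    return freq_hz * load_farads * vdd_volts * vdd_volts;
}
```

With these example values the result is about 1.74 mW. Halving the frequency halves the power, while lowering the supply from 3.3 V to 1.8 V cuts it to roughly 30 percent, because the voltage enters as a square.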
However, modern controllers run at an internal regulated voltage irrespective of the input voltage on the
supply pins. There are controllers available in market that can be operated from 0.5 V to 5.5 V, but the
internal core runs at a fixed regulated voltage such as 1.8V, no matter what the supply voltage is.
Therefore, this parameter is not as important in the case of modern controllers as it was in the past.
However, it is good to keep the supply voltage to the minimum requirement for regulators or near the
voltage where the regulator is bypassed.
This leaves system designers with just one parameter available for affecting power control: switching
frequency. Hence, in the active mode the minimum required operating speed for the MCU should be
calculated and higher clock speeds should be avoided.
Standby Power
The other major factor which determines battery life is the standby power consumption of an embedded
system. Most applications can spend significant periods of time in standby mode. In these systems, the
major contributor to total system power consumption is the standby current rather than the active
current. Standby current is the sum of leakage current, current consumed by power management circuits,
clocking systems, power regulators, RTC, IOs, interrupt controllers, and so on. It varies from controller
to controller, based upon the particular features and peripherals supported in standby mode.
Finally, the power consumed while transitioning from low power mode to active mode should not be
overlooked. Devices may end up wasting a significant amount of power while transitioning between
these two modes.
Based on these power modes, an MCU’s average power consumption is:
MCU average power consumption = (Active Power + Sleep Power + Transition Power) / Total Time, where:
Active Power = Time for which MCU is active × Active Current
Sleep Power = Time for which MCU is in sleep × Sleep Current
Transition Power = Power consumed while making the transition from sleep to active mode
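The same weighting can be expressed in terms of current, which is how data sheets usually quote these figures. The sketch below computes a time-weighted average current. It assumes that the transition cost is supplied as a charge in mA·s and that the transition time itself is negligible. The function and parameter names are illustrative.

```c
/* Time-weighted average current in mA over one active/sleep period.
 * transition_charge_mas is the extra charge (mA x s) spent switching modes;
 * the time taken by the transition is assumed to be negligible. */
double average_current_ma(double t_active_s, double i_active_ma,
                          double t_sleep_s, double i_sleep_ma,
                          double transition_charge_mas)
{
    double total_s = t_active_s + t_sleep_s;
    if (total_s <= 0.0) {
        return 0.0;   /* no time elapsed: report zero rather than divide by zero */
    }
    return (t_active_s * i_active_ma
            + t_sleep_s * i_sleep_ma
            + transition_charge_mas) / total_s;
}
```

For example, an MCU that is active for 1 s at 10 mA and asleep for 99 s at 0.01 mA averages about 0.11 mA, which shows how strongly the duty cycle dominates the result.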
The amount of time the system remains in active and standby mode is application-dependent. Some
applications may need to have the MCU running all the time while some may need to have it running
only occasionally. There are MCUs available on the market that come with power-down modes other
than sleep, such as hibernate, deep sleep, or shut down, in which power consumption can be on the order
of tens of nA. System designers need to look at the power consumption of the mode in which the system
has to operate for the majority of time to ensure that the overall design is as power efficient as it can be.
If we look deeper, there are some vital tradeoffs that must be considered. For some applications, it could
prove beneficial for the system to run at a higher speed so it can finish its job faster and return to a low
power mode. Other systems may do better running at a slower speed to keep active power consumption
low. Here, the system designer has to analyze the best case for the application considering the current at
different operating speeds, the time it takes to come out of low power mode, the current consumption in
low power mode, and the frequency with which the system needs to switch between active and sleep modes.
Peripheral Power
MCU power consumption is only one factor when considering system power consumption. Some
engineers tend to concentrate too much on the MCU and ignore the power consumed by external
peripherals. If the objective is to optimize the power consumption of the entire embedded system
solution, one cannot afford to do this.
Consider a simple temperature measurement system for home use (Figure 2).
Figure : Temperature monitoring system
This system has one ADC to measure the sensor voltage, one DAC to generate a reference, one LCD
module to display data, and one MCU to process the data. Power should be saved beginning at the
individual block level. If the power consumption is calculated for this system, it will be given by:
Total power consumption = MCU power consumption + ADC power consumption + DAC power
consumption + LCD power consumption
For this system, the sample rate need not be very high since temperature does not change rapidly. Power
consumption can be kept to a minimum by switching on the ADC and DAC only when required and
optimizing the time ratio of MCU active and standby modes. If this system is made up of discrete
components, it can get quite challenging to coordinate. Power consumption for discrete component-based architectures will look similar to that shown in Figure 3.
Figure: Power consumption system using discrete-based designs
In Figure 3, standby current is contributed to by MCU standby current and the active current of the ADC,
DAC and LCD. To save ADC power consumption, the ADC may have an option by which the MCU can
stop ADC conversion before it goes to sleep. However, there will still be some standby current for ADC
and similarly for the DAC. Alternatively, the system could be implemented using a system-on-chip (SoC)
architecture where all of the peripherals are integrated onto a single chip along with the ability to control
the power of each individual peripheral. Power consumption of such a system will look similar to that
shown in Figure 4 and can lead to a dramatic reduction in power consumption compared to a discrete
Figure: Power consumption of SoC-based solution vs. discrete solution
When designing any system, we should use what is needed rather than simply using what is available.
Choosing a faster or more sophisticated component than is needed results in higher cost and lower
power efficiency. For example, a 20-bit ADC running at 1 Msps is clearly more than is needed for a
temperature measurement application. In addition, the ADC needs a high-frequency operating clock to
sample at this rate.
Advancements in SoC technology allow developers to access a wide range of on-chip peripherals such
as filters, ADCs, DACs, Op-amps, and programmable analog and digital blocks. For example, PSoC
devices from Cypress Semiconductors have a wide operating frequency range with programmable clock
sources for different blocks, including the MCU, and support numerous power management modes.
These modes range from active mode, where all the features on the device are available, to hibernate
modes, where current can be as low as 100 nA while retaining the contents of configuration registers
and RAM.
As complex as SoC architectures are, they represent almost the complete system, making it more
straightforward to compute power consumption. For example, if the system is doing nothing, then the
standby current of the whole system can be as low as 100 nA. Since peripherals and the MCU can be
switched on or off individually, only the appropriate blocks need to resume operation after the next
wake up event. This is one of the key benefits an SoC can have from a system point of view.
In some systems, for some periods the only hardware functions needed do not require an MCU, such as
when generating a waveform using a DAC. This task can be completed by the DMA (Direct Memory
Access) and DAC without the MCU, and so the MCU can be switched off. SoCs enable developers to
design ultra-low-power embedded systems that are also cost and space efficient, with the added
advantage of fast time to market.
A system’s average power calculation in a SoC-based system becomes more complex since along with
the average MCU current, we need to consider the operating state of each individual peripheral on the
chip. Average system current is:
Battery Life
Battery life is a critical specification for any battery-powered application. Battery ratings are given in
units of mAh (milliampere-hours): a battery rated at X mAh can supply X mA of current for one hour. If we know the
average current, we can calculate battery life:
Battery Life = Battery rating/ Iavg
This equation gives the battery life in hours if Iavg is given in mA.
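As a quick numerical check of this relation, the following sketch divides a battery rating in mAh by an average current in mA. The function name and the example figures (a 220 mAh cell and a 0.11 mA average current) are assumptions used only for illustration.

```c
/* Estimated battery life in hours from the rating (mAh) and average current (mA).
 * Returns 0.0 when the current is not positive, since the ratio is then undefined. */
double battery_life_hours(double rating_mah, double i_avg_ma)
{
    if (i_avg_ma <= 0.0) {
        return 0.0;
    }
    return rating_mah / i_avg_ma;
}
```

With these example figures the estimate is about 2000 hours, or roughly 83 days. Real batteries deliver less than this ideal value because of self-discharge and the dependence of capacity on temperature and load.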
Most consumers care about power consumption in both wall-powered and battery-powered devices. In
today’s competitive market, designing a product that consumes higher power or costs more than
competitive products can result in reduced market success. When optimizing power consumption is a
major criterion, designers should look at critical parameters such as choosing the appropriate
components and making sure they are not overrated for the desired end application, as well as making
sure the system does not operate at higher speeds than required. In addition, developers will want to
seriously consider how long the system spends in active and standby modes and the relative power
consumption in each.
In the interrupt method, however, whenever a device requires attention from the microprocessor, it signals
the microprocessor. This signal is called an interrupt signal or sometimes an interrupt request (IRQ). Every IRQ
is associated with a subroutine that needs to be executed within the microprocessor. This subroutine is
called the interrupt service routine (ISR) or sometimes the interrupt handler. The microprocessor halts current
program execution and attends to the IRQ by executing the ISR. Once execution of the ISR completes, the
microprocessor resumes the halted task.
The current state of the microprocessor must be saved before it attends the IRQ in order to be able to
continue from where it was before the interrupt. To achieve this, the contents of all of its internal registers,
both general purpose and special registers, are required to be saved to a memory section called the stack.
On completion of the interrupt call, these register contents will be reinstated from the stack. This allows
the microprocessor to resume its originally halted task.
There are two types of interrupts namely software driven interrupts (SWI) and hardware driven interrupts
(HWI). SWIs are generated from within a currently executing program. They are triggered by the interrupt
opcode. A SWI will call a subroutine that allows a program to access certain lower level service. HWIs
are signals from a device to the microprocessor. The device sets an interrupt line in the control bus high.
Microprocessors have two types of hardware interrupts, namely the non-maskable interrupt (NMI) and the
interrupt request (INTR). An NMI has a very high priority and demands immediate execution. There
is no option to ignore an NMI. The NMI is used exclusively for events that are regarded as having a higher
priority or catastrophic consequences for the system operation. For example, an NMI can be initiated due to an
interruption of the power supply, a memory fault or the pressing of the reset button. An INTR may be generated
by a number of different devices, all of which are connected to the single INTR control line. An INTR
may or may not be attended to by the microprocessor. If the microprocessor is attending to an interrupt, then
no further interrupts, other than an NMI, will be entertained until the current interrupt has been completed.
A control signal is used by the microprocessor to acknowledge an INTR. This control signal is called
ACK or sometimes INTA.
Interrupt vector table
As discussed in the previous section, when an interrupt occurs the microprocessor runs an associated
ISR. The IRQ is an input signal to the microprocessor. When a microprocessor receives an IRQ, it pushes the
PC register onto the stack and loads the address of the ISR into the PC register. This makes the microprocessor
execute the ISR. The ISRs corresponding to every interrupt are part of the executable
program, which is loaded into the memory of the device. It becomes
easier to manage the ISRs if there is a lookup table in which the addresses of all ISRs are listed. This
lookup table is called the interrupt vector table. Table 2.9 shows an interrupt vector table for an ARM Cortex-M
microcontroller. The ARM Cortex-M vector table has room for up to 256 entries. Of these, some are hardware or
peripheral generated IRQs and some are software generated. The first 16 entries, INT0 to
INT15, are called the predefined interrupts. In ARM Cortex-M microcontrollers, interrupts are managed by
an on-chip module called the Nested Vectored Interrupt Controller (NVIC), which uses the vector table to locate each ISR.
NVIC is an on-chip interrupt controller for ARM Cortex-M series microcontrollers. No other ARM series
has this on-chip NVIC. This means that interrupt handling is fundamentally different in ARM Cortex-M
microcontrollers compared to other ARM microcontrollers.
Table : Interrupt Vector Table for ARM Cortex M4
Predefined Interrupts(INT0-INT15)
All ARM devices have a RESET pin which is asserted on device power-up or in case of a warm reset. This
exception is a special exception and has the highest priority. On the assertion of the Reset signal, execution
stops immediately. When the Reset signal is released, execution starts from the address provided by the Reset
entry in the vector table, which is stored at 0x0000.0004. Hence, to run a program on Reset, it is necessary to place the
start address of the program at memory address 0x0000.0004.
In the ARM microcontroller, some pins are associated with hardware interrupts. They are often called
IRQs (interrupt requests) and NMI (non-maskable interrupt). An IRQ can be controlled by software masking
and unmasking. Unlike an IRQ, the NMI cannot be masked by software. This is why it is named the non-maskable
interrupt. As shown in Table 2.9, "INT 02" in ARM Cortex-M is used only for the NMI. On activation of the
NMI, the microcontroller loads the handler address stored at memory location 0x0000.0008 into the program counter.
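The two addresses quoted above follow a simple rule: each vector table entry is 4 bytes wide, so the entry for exception number n sits at byte offset 4 × n. The sketch below shows this rule together with a host-side model of a table lookup. The table here is an ordinary C array and the dispatch is done in software, whereas on the device the hardware performs the lookup. All names are hypothetical.

```c
#include <stdint.h>

/* Byte offset of an exception's entry in the vector table (4 bytes per entry). */
uint32_t vector_offset(uint32_t exception_number)
{
    return exception_number * 4u;
}

typedef void (*handler_t)(void);

/* Demonstration handler that only counts how often it has run. */
int demo_nmi_count = 0;
void demo_nmi_handler(void)
{
    demo_nmi_count++;
}

/* Software model of the lookup: call the handler registered for the exception.
 * Returns 0 on success, or -1 if the number is out of range or has no handler. */
int dispatch(const handler_t *table, uint32_t n_entries, uint32_t exception_number)
{
    if (exception_number >= n_entries || table[exception_number] == 0) {
        return -1;
    }
    table[exception_number]();
    return 0;
}
```

Here vector_offset(1) gives 0x04 for Reset and vector_offset(2) gives 0x08 for the NMI, matching the addresses in the text.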
Hard Fault
A hard fault covers all classes of fault for which the corresponding fault handler cannot be activated. This may
be a result of the fault handler being disabled or masked.
Memory Management Fault
It is caused by a memory protection unit violation. The violation can be caused by attempting to write into
a read-only memory, or by an instruction fetch from a non-executable region of
memory. In an ARM microcontroller with an on-chip MMU, the page fault can also be mapped into the
memory management fault.
Bus Fault
A bus fault is an exception that arises due to a memory-related fault for an instruction or data memory
transaction, such as a pre-fetch fault or a memory access fault. This fault can be enabled or disabled.
Usage Fault
A usage fault is an exception that occurs due to a fault associated with instruction execution. This includes an
undefined instruction, an illegal unaligned access, an invalid state on instruction execution, or an error on exception
return. An unaligned address on a word or half-word memory access, or a
division by zero, can also cause a usage fault.
A supervisor call (SVC) is an exception that is activated by the SVC instruction. In an operating system,
applications can use SVC instructions to contact OS kernel functions and device drivers. This is a software
interrupt since it was raised from software, and not from a Hardware or peripheral exception.
PendSV is a pendable, interrupt-driven request for system-level service. PendSV is used for
context switching when no other exception is active. The Interrupt Control and State (INTCTRL)
register is used to trigger PendSV. PendSV is an interrupt and can wait until the NVIC has time to service
it while other, more urgent, higher-priority interrupts are being taken care of.
A SysTick exception is generated by the system timer when it reaches zero and is enabled to generate an
interrupt. The software can also produce a SysTick exception using the Interrupt Control and State
(INTCTRL) register.
User Interrupts
A user interrupt is an exception signalled either by a peripheral or by a software request and fed through the
NVIC according to its priority. All interrupts are asynchronous to instruction execution. In the system,
peripherals use interrupts to communicate with the processor. An ISR can also be triggered as a result of
an event at a peripheral device, such as a timer time-out or the completion of an analog-to-digital
converter (ADC) conversion. Each peripheral device has a group of special function registers that must
be used to access the device for configuration. For a given peripheral interrupt to take effect, the interrupt
for that peripheral must be enabled.
4.8 Timers
Timers are basic constituents of most microcontrollers. Today, just about every microcontroller comes
with one or more built-in timers. These are extremely useful to the embedded programmer - perhaps
second in usefulness only to GPIO. A timer can be described as counter hardware that can usually
be configured to count either regular clock pulses or irregular event pulses. In the first case it acts as
a timer, in the second as a counter.
Sometimes, timers are also termed “hardware timers” to distinguish them from software timers.
Software timers are pieces of software that perform a timing function.
The TM4C123GH6PM General-Purpose Timer Module (GPTM) contains six 16/32-bit GPTM blocks
and six 32/64-bit Wide GPTM blocks. These programmable timers can be used to count or time external
events that drive the Timer input pins. Timers can also be used to trigger μDMA transfers, to trigger
analog-to-digital conversions (ADC) when a time-out occurs in periodic and one-shot modes.
The GPT Module is one timing resource available on the Tiva™ C Series microcontrollers. Other timer
resources include the System Timer (SysTick) and the PWM timer in the PWM modules.
The General-Purpose Timer Module (GPTM) blocks provide the following functional options:
16/32-bit operating modes:
16- or 32-bit programmable one-shot timer
16- or 32-bit programmable periodic timer
16-bit general-purpose timer with an 8-bit prescaler
32-bit Real-Time Clock (RTC) when using an external 32.768-KHz clock as the input
16-bit input-edge count- or time-capture modes with an 8-bit prescaler
16-bit PWM mode with an 8-bit prescaler and software-programmable output inversion of
the PWM signal
32/64-bit operating modes:
32- or 64-bit programmable one-shot timer
32- or 64-bit programmable periodic timer
32-bit general-purpose timer with a 16-bit prescaler
64-bit Real-Time Clock (RTC) when using an external 32.768-KHz clock as the input
32-bit input-edge count- or time-capture modes with a 16-bit prescaler
32-bit PWM mode with a 16-bit prescaler and software-programmable output inversion of
the PWM signal
Count up or down
Twelve 16/32-bit Capture Compare PWM pins (CCP)
Twelve 32/64-bit Capture Compare PWM pins (CCP)
Daisy chaining of timer modules to allow a single timer to initiate multiple timing events
Timer synchronization allows selected timers to start counting on the same clock cycle
ADC event trigger
User-enabled stalling when the microcontroller asserts CPU Halt flag during debug (excluding RTC mode)
Ability to determine the elapsed time between the assertion of the timer interrupt and entry into the
interrupt service routine
Efficient transfers using Micro Direct Memory Access Controller (μDMA)
1. Dedicated channel for each timer
2. Burst request generated on timer interrupt
Fig : GPTM block diagram
The below table lists the external signals of the GP Timer module and describes the function of each.
Table : General purpose Timer signals
Basic Timers/Counters
A standard timer will comprise a pre-scaler, an N-bit timer/counter register, one or more N-bit capture
and compare registers. Usually N is 8, 16 or 32 bits. Along with these, there will also be registers for
control and status units responsible to configure and monitor the timer.
To count incoming pulses, an up-counter is used as the fundamental hardware. A counter becomes a
timer when the incoming pulses arrive at a fixed, known frequency. Also note that the size in
bits of a timer is not necessarily related to the size in bits of the CPU architecture. An 8-bit
microcontroller can have 16-bit timers (in fact most do), and a 32-bit microcontroller can have 16-bit
timers (and some do).
The pre-scaler takes the basic timer clock frequency as an input and divides it by some value, chosen
according to the application's requirements, before feeding it to the timer. The divisor is set by configuring
the pre-scaler register(s). This configuration might be limited to a few fixed values (powers of 2), or may
allow any integer from 1 to 2^m, where m is the number of pre-scaler bits.
The pre-scaler is used to set the clock rate of the timer as desired. This provides a trade-off between
resolution (a high clock rate gives better resolution) and range (a high clock rate causes quicker overflow).
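The trade-off between resolution and range can be made concrete with a little arithmetic. The following is a minimal sketch, not tied to any particular device: the helper names are hypothetical, and the 16 MHz clock used in the example is an assumed value chosen only because it divides evenly.

```c
#include <assert.h>
#include <stdint.h>

/* Tick period in nanoseconds for a timer fed by clock_hz through a
   divide-by-prescale stage. */
static uint64_t tick_period_ns(uint32_t clock_hz, uint32_t prescale)
{
    assert(clock_hz != 0u && prescale != 0u);
    return (1000000000ull * prescale) / clock_hz;
}

/* Time in microseconds for an n_bits-wide up-counter to overflow,
   i.e. to count 2^n_bits ticks starting from zero. */
static uint64_t overflow_time_us(uint32_t clock_hz, uint32_t prescale,
                                 unsigned n_bits)
{
    assert(clock_hz != 0u && prescale != 0u);
    assert(n_bits >= 1u && n_bits <= 32u);
    uint64_t ticks = 1ull << n_bits;
    return (ticks * 1000000ull * prescale) / clock_hz;
}
```

With an assumed 16 MHz clock, a 16-bit timer with no division overflows after 4096 µs, while dividing by 16 gives a coarser 1000 ns tick but stretches the range to 65536 µs.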
Timer Register
The timer register is an N-bit up-counter that software can read and write to obtain or set the current
count value, and that can be stopped or reset. As discussed, the
timer is driven by the pre-scaler output. The regular pulses which drive the timer, irrespective of their
source, are often called “ticks”. A timer therefore does not inherently time in
seconds or milliseconds; it times in ticks. This gives us the flexibility to control the rate of these
ticks through the hardware and software configuration. We may configure the tick to some
human-friendly value such as 1 millisecond or 1 microsecond, or any other value the design specifies.
Capture Registers
A capture register is hardware that is automatically loaded with the current counter value upon
the occurrence of some event, usually a change on an input pin. The capture register is therefore used to
capture a “snapshot” of the timer at the instant when the event occurs. A capture event can also be
configured to produce an interrupt, and the Interrupt Service Routine (ISR) can save or otherwise use the
just-captured timer snapshot.
Because the capture occurs in hardware, the snapshot value does not suffer from the latency that would arise
if the capture were done in software. Capture registers can be used to time intervals between pulses or input
signals, or to determine the high and low times of input signals.
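Timing an interval from two capture snapshots comes down to a subtraction. The sketch below assumes a free-running 16-bit counter and at most one overflow between the two captures; the function names and the 1 MHz tick rate in the example are illustrative assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Ticks elapsed between two snapshots of a free-running 16-bit counter.
   The result is reduced modulo 2^16 by the cast, so a single counter
   overflow between the two snapshots is handled automatically. */
static uint16_t capture_interval_ticks(uint16_t first, uint16_t second)
{
    return (uint16_t)(second - first);
}

/* Convert a measured period in ticks into a frequency in hertz. */
static uint32_t ticks_to_hz(uint16_t period_ticks, uint32_t tick_hz)
{
    assert(period_ticks != 0u);
    return tick_hz / period_ticks;
}
```

For example, snapshots of 65500 and 64 are 100 ticks apart even though the counter wrapped in between; at an assumed 1 MHz tick rate that is a 10 kHz input.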
Compare/Match Registers
Compare or match registers hold a value against which the current timer value is continuously compared,
and an event is triggered when the two values match.
1. If the timer/counter is configured as a timer, we can generate events at known and precise
times. Events can be output pin changes and/or interrupts and/or timer resets.
2. If the timer/counter is configured as a counter, the compare registers can generate events
based on preset counts being reached.
For instance, the compare registers can be used to generate a timer “tick”, a fixed timer interrupt used for
system software timing. For example, if a 2 ms tick is desired, and the timer is configured with a 0.5 µs
clock, setting a compare register to 4000 will cause a compare event after 2 ms (4000 × 0.5 µs). If we set the
compare event to generate an interrupt as well as to reset the timer to 0, the result will be an endless stream of
2 ms interrupts.
Another notable use of a compare register is to generate a pulse with variable width. Set an output
high/low when the timer is at 0, configure the compare register with the value of the pulse width, and on the
compare event set the output low/high. We may use a second compare register with a larger value to set
the pulse interval by resetting the timer on compare.
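The 2 ms tick example can be checked with a short sketch. The helper names are hypothetical, and the loop is a simple software model of a timer that resets to 0 on a compare match, not a description of any specific timer peripheral.

```c
#include <assert.h>
#include <stdint.h>

/* Compare value needed for a given interval, for a timer ticking at tick_hz. */
static uint32_t compare_value(uint32_t interval_us, uint32_t tick_hz)
{
    assert(tick_hz != 0u);
    return (uint32_t)(((uint64_t)interval_us * tick_hz) / 1000000u);
}

/* Model a timer that counts up on every tick, raises a compare event when it
   reaches `compare`, and is reset to 0 by that event. Returns how many
   compare events occur in total_ticks ticks. */
static unsigned count_compare_events(uint32_t compare, uint32_t total_ticks)
{
    assert(compare != 0u);
    uint32_t counter = 0u;
    unsigned events = 0u;
    for (uint32_t t = 0u; t < total_ticks; t++) {
        counter++;
        if (counter == compare) {
            events++;       /* this is where the tick interrupt would fire */
            counter = 0u;   /* reset on compare restarts the interval */
        }
    }
    return events;
}
```

A 0.5 µs clock is 2 MHz, so `compare_value(2000, 2000000)` reproduces the value 4000 from the text, and 12000 ticks of the modelled timer produce three compare events.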
Real Time Clock(RTC)
An RTC is a computer clock that keeps track of the current time. RTCs are present in almost every
electronic device which needs to maintain accurate time. The term RTC is used to avoid
confusion with ordinary hardware clocks, which are merely signals that govern digital electronics and
do not count time in human units.
Benefits of using RTC:
1. Low power consumption
2. Frees the main system for time-critical tasks
3. Increases accuracy if compared to other methods
A GPS receiver can cut down its startup time by comparing the current time as per its RTC, with the
moment of last valid signal. If it has been less than a few hours, then the previous ephemeris is still
usable.
Because RTCs can have an alternative power source, they can continue to keep time while the primary
power source is unavailable. This alternate source may be a lithium battery or a supercapacitor.
Fig: Real Time Clock with external power source
Fig: Frequency Based Measurement System
Timing Generation and Measurement
In various microprocessor systems, it is desirable to use frequency to make measurements, rather
than the digital output of an ADC. Motivations for using frequency measurement include:
1. In systems with ground offsets, signals can be capacitively coupled or optically isolated to
reduce ground loops and other damaging effects.
2. Noise introduced during analog transmission may be eliminated by transmitting a logic-level
frequency signal instead.
3. Measuring frequency instead of analog values may allow a simpler microprocessor to
be used, since an ADC is not required.
In most cases today, an analog (physical quantity) input, such as temperature, can be converted to a time-based
signal that can be measured with a microprocessor.
Microprocessor with Capture Capability
In a microprocessor with capture capability, the sensor output can be connected to a microprocessor input
for pulse capture. The block diagram below describes one such capture system. Here, a
16-bit register captures the value of a free-running 16-bit counter when the input signal changes from
the low state to the high state. At the same instant, a short pulse is triggered to reset the counter. In the
illustration shown in Fig 3.4 below, the first period of the input is 90µs and the second period is 100µs.
The counter here will count up 90 counts for the first period and 100 counts for the second period, which
corresponds to a counter clock of 1 MHz (one count per microsecond).
Microprocessor without Capture Capability
To perform similar measurements on microprocessors without capture
capability, we need to let a counter run freely and connect the frequency signal to an interrupt input.
Here, the counter can be an external or internal part that is clocked from a derivative of the
microprocessor clock. When an interrupt occurs, software reads and resets the counter.
Due to variable interrupt latency, this method is somewhat less accurate than the capture method. If the
system latency must not be affected by other interrupts, and the microprocessor has
a non-maskable interrupt input, then that input should be used for the frequency input.
Alternatively, the frequency input can be connected to the input of a timer, with the timer programmed to
increment on an external clock. The microprocessor then reads the timer value on a periodic basis
to get the number of counts that occurred in the measurement period. Interrupt latency issues can be
reduced by connecting a period-based signal to a counter that runs on the microprocessor clock but counts
only when the input is high.
The counter will
Count up while the input is high
Hold the count while the input is low
The processor can read the count while the input is low. As long as it reads before the input goes high again,
the count will be accurate.
Measuring Period Based Inputs with Free Running Counter
Fig: Period based input Read with free running counter
Fig: Period based input Read with counter that increments only while gate input is HIGH
(Gate connected to Period Based Input)
4.9 Pulse Width Modulation
Pulse width modulation (PWM) is a simple but powerful technique of using a rectangular digital
waveform to control an analog variable or simply controlling analog circuits with a microprocessor's
digital outputs. PWM is employed in a wide variety of applications, from measurement &
communications to power control and conversion.
The TM4C123GH6PM PWM module provides a great deal of flexibility and can generate simple PWM
signals, such as those required by a simple charge pump as well as paired PWM signals with deadband
delays, such as those required by a half-H bridge driver. Three generator blocks can also generate the
full six channels of gate controls required by a 3-phase inverter bridge.
Each PWM generator block has the following features:
1. One fault-condition handling input to quickly provide low-latency shutdown and prevent damage to
the motor being controlled
2. One 16-bit counter
o Runs in Down or Up/Down mode
o Output frequency controlled by a 16-bit load value
o Load value updates can be synchronized
3. Two PWM comparators
o Comparator value updates can be synchronized
o Produces output signals on match
4. PWM signal generator
o Output PWM signal is constructed based on actions taken as a result of the counter and
PWM comparator output signals
o Produces two independent PWM signals
5. Dead-band generator
o Produces two PWM signals with programmable dead-band delays suitable for driving a
half-H bridge
o Can be bypassed, leaving input PWM signals unmodified
6. Can initiate an ADC sample sequence
The control block determines the polarity of the PWM signals and which signals are passed through to
the pins. The output of the PWM generation blocks are managed by the output control block before
being passed to the device pins.
Fig 3.14. PWM Module Block Diagram
Block Diagram
TM4C123GH6PM controller contains two PWM modules, each with four generator blocks that generate
eight independent PWM signals or four paired PWM signals with dead-band delays inserted.
Fig 3.15. PWM Generator Block Diagram
Functional Description
Clock Configuration
The PWM has two clock source options:
The System Clock
A pre divided System Clock
The clock source is selected by programming the USEPWMDIV bit in the Run-Mode Clock
Configuration (RCC) register. The PWMDIV bit field specifies the divisor of the system clock that is
used to create the PWM Clock.
PWM Timer
The timer in each PWM generator runs in one of two modes: Count-Down mode or Count-Up/Down
mode. In Count-Down mode, the timer counts from the load value to zero, goes back to the load value,
and continues counting down. In Count-Up/Down mode, the timer counts from zero up to the load
value, back down to zero, back up to the load value, and so on. Generally, Count-Down mode is used
for generating left- or right-aligned PWM signals, while the Count-Up/Down mode is used for
generating center-aligned PWM signals.
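The load value for a desired PWM frequency follows directly from the counting sequences just described. This is a sketch of the arithmetic only: the helper names are hypothetical, and the 20 MHz system clock, divide-by-2 and 25 kHz output used in the example are assumed figures. In Count-Down mode the counter passes through load + 1 values per period; in Count-Up/Down mode one period takes 2 × load clocks.

```c
#include <assert.h>
#include <stdint.h>

/* PWM clock obtained from the system clock and the selected divisor. */
static uint32_t pwm_clock_hz(uint32_t sysclk_hz, uint32_t divisor)
{
    assert(divisor != 0u);
    return sysclk_hz / divisor;
}

/* Count-Down mode: the counter runs load, load-1, ..., 0 and then reloads,
   so one period is (load + 1) PWM clocks. */
static uint32_t load_count_down(uint32_t pwm_clk_hz, uint32_t pwm_freq_hz)
{
    assert(pwm_freq_hz != 0u && pwm_clk_hz >= pwm_freq_hz);
    return pwm_clk_hz / pwm_freq_hz - 1u;
}

/* Count-Up/Down mode: the counter runs 0 up to load and back down to 0,
   so one period is (2 * load) PWM clocks. */
static uint32_t load_up_down(uint32_t pwm_clk_hz, uint32_t pwm_freq_hz)
{
    assert(pwm_freq_hz != 0u);
    return pwm_clk_hz / (2u * pwm_freq_hz);
}
```

With the assumed figures, a 10 MHz PWM clock needs a load value of 399 in Count-Down mode or 200 in Count-Up/Down mode for a 25 kHz output. Either result must fit in the 16-bit load value, which limits how low the output frequency can go for a given PWM clock.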
The timers output three signals that are used in the PWM generation process: the direction signal (this is
always Low in Count-Down mode, but alternates between low and high in Count-Up/Down mode), a
single-clock-cycle-width High pulse when the counter is zero, and a single-clock-cycle-width High
pulse when the counter is equal to the load value. Note that in Count-Down mode, the zero pulse is
immediately followed by the load pulse. In the figures in this chapter, these signals are labelled "dir,"
"zero," and "load."
PWM Comparators
Each PWM generator has two comparators that monitor the value of the counter. When either comparator
matches the counter, it outputs a single-clock-cycle-width High pulse, labeled "cmpA" or "cmpB" in
the figures in this chapter. When in Count-Up/Down mode, these comparators match both when counting
up and when counting down, and thus are qualified by the counter direction signal. These qualified pulses
are used in the PWM generation process. If either comparator match value is greater than the counter load
value, then that comparator never outputs a High pulse.
Figure (a): PWM Count- Down Mode
Figure(b): PWM Count- Up/ Down Mode
PWM Signal Generator
Each PWM generator takes the load, zero, cmpA, and cmpB pulses (qualified by the dir signal) and
generates two internal PWM signals, pwmA and pwmB. In Count-Down mode, there are four events that
can affect these signals: zero, load, match A down, and match B down. In Count-Up/Down mode, there
are six events that can affect these signals: zero, load, match A down, match A up, match B down, and
match B up. The match A or match B events are ignored when they coincide with the zero or load events.
If the match A and match B events coincide, the first signal, pwmA, is generated based only on the match
A event, and the second signal, pwmB, is generated based only on the match B event.
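The way events turn into a waveform can be modelled in a few lines. The sketch below covers Count-Down mode only and assumes one particular, common configuration: pwmA is driven High on the load event and Low on the match A down event. That choice of actions, and the function name, are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Model one Count-Down period and return how many PWM clocks pwmA is High.
   Assumed actions: load event drives pwmA High, match A down drives it Low.
   A match that coincides with the load or zero event is ignored. */
static uint32_t pwm_high_clocks_count_down(uint16_t load, uint16_t cmp_a)
{
    uint32_t high = 0u;
    int out = 0;
    for (int32_t c = load; c >= 0; c--) {
        if (c == load) {
            out = 1;                        /* load event */
        } else if (c == cmp_a && c != 0) {
            out = 0;                        /* match A down event */
        }
        if (out) {
            high++;
        }
    }
    return high;
}
```

With a load value of 399 and a comparator value of 299 the output is High for 100 of the 400 clocks in the period, a 25% duty cycle. If the comparator value is larger than the load value, no match ever occurs and the output stays High for the whole period.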
Dead-Band Generator
The pwmA and pwmB signals produced by each PWM generator are passed to the dead-band generator.
If the dead-band generator is disabled, the PWM signals simply pass through to the pwmA' and pwmB'
signals unmodified. If the dead-band generator is enabled, the pwmB signal is lost and two PWM signals
are generated based on the pwmA
signal. The first output PWM signal, pwmA', is the pwmA signal with the rising edge delayed by a
programmable amount. The second output PWM signal, pwmB', is the inversion of the pwmA signal with
a programmable delay added between the falling edge of the pwmA signal and the rising edge of the
pwmB' signal.
The resulting signals are a pair of active high signals where one is always high, except for a programmable
amount of time at transitions where both are low. These signals are therefore suitable for driving a half-H
bridge, with the dead-band delays preventing shoot-through current from damaging the power electronics.
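The dead-band behaviour can be illustrated with a sample-based model. In this sketch the input pwmA is an array of 0/1 samples, the delays are expressed in samples rather than clock cycles, and the function name is hypothetical; it is meant to show the shape of the outputs, not the hardware implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Derive pwmA' and pwmB' from sampled pwmA.
   a_out goes High only after pwmA has been High for more than rise_delay
   samples; b_out goes High only after pwmA has been Low for more than
   fall_delay samples. The two outputs are therefore never High together. */
static void dead_band(const uint8_t *pwm_a, size_t n,
                      unsigned rise_delay, unsigned fall_delay,
                      uint8_t *a_out, uint8_t *b_out)
{
    assert(pwm_a != 0 && a_out != 0 && b_out != 0);
    unsigned high_run = 0u;  /* consecutive High samples seen so far */
    unsigned low_run = 0u;   /* consecutive Low samples seen so far */
    for (size_t i = 0; i < n; i++) {
        if (pwm_a[i]) {
            high_run++;
            low_run = 0u;
        } else {
            low_run++;
            high_run = 0u;
        }
        a_out[i] = (uint8_t)(high_run > rise_delay);
        b_out[i] = (uint8_t)(low_run > fall_delay);
    }
}
```

For the input 1,1,1,1,0,0,0,0 with both delays set to one sample, pwmA' is 0,1,1,1,0,0,0,0 and pwmB' is 0,0,0,0,0,1,1,1: both outputs are Low for one sample at each transition, which is the gap that protects a half-H bridge from shoot-through.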
Fig: QEI Input Signal Logic
4.10 Quadrature Encoder Interface (QEI)
A quadrature encoder, also known as a 2-channel incremental encoder, converts linear displacement into
a pulse signal. By monitoring both the number of pulses and the relative phase of the two signals, you
can track the position, direction of rotation, and speed. In addition, a third channel, or index signal, can
be used to reset the position counter.
A classic quadrature encoder consists of a slotted wheel attached to the shaft of the motor and a
detector module that senses the movement of the slots in the wheel.
Interfacing QEI using Tiva TM4C123GH6PM
The TM4C123GH6PM microcontroller includes two quadrature encoder interface (QEI) modules. Each
QEI module interprets the code produced by a quadrature encoder wheel to integrate position over time
and determine direction of rotation. In addition, it can capture a running estimate of the velocity of the
encoder wheel.
The TM4C123GH6PM microcontroller includes two QEI modules providing control of two motors at
the same time with the following features:
Velocity capture using a built-in timer
Input frequency on the QEI inputs as high as 1/4 of the processor frequency (for
example, 12.5 MHz for a 50-MHz system)
Interrupt generation on:
o Index pulse
o Velocity-timer expiration
o Direction change
o Quadrature error detection
Functional Description
The QEI module interprets the two-bit gray code produced by a quadrature encoder wheel to integrate
position over time and determine direction of rotation. In addition, it can capture a running estimate of the
velocity of the encoder wheel. The position integrator and velocity capture can be independently enabled,
though the position integrator must be enabled before the velocity capture can be enabled. The two phase
signals, PhAn and PhBn, can be swapped before being interpreted by the QEI module.
Fig : QEI Block Diagram
The QEI module input signals have a digital noise filter on them that can be enabled to prevent spurious
operation. The noise filter requires that the inputs be stable for a specified number of consecutive clock
cycles before updating the edge detector. The filter is enabled by the FILTEN bit in the QEI Control
(QEICTL) register. The frequency of the input update is programmable using the FILTCNT bit field in
the QEICTL register.
The QEI module supports two modes of signal operation:
Quadrature phase mode: the encoder produces two clocks that are 90 degrees out of
phase, and the edge relationship is used to determine the direction of rotation.
Clock/direction mode: the encoder produces a clock signal to indicate steps and a
direction signal to indicate the direction of rotation.
The mode is determined by the SIGMODE bit of the QEICTL register.
When the QEI module is set to use the quadrature phase mode (SIGMODE bit is clear), the capture
mode for the position integrator can be set to update the position counter on every edge of the PhA
signal or to update on every edge of both PhA and PhB. Updating the position counter on every PhA
and PhB edge provides more positional resolution at the cost of less range in the positional counter.
When edges on PhA lead edges on PhB, the position counter is incremented. When edges on PhB lead
edges on PhA, the position counter is decremented. When a rising and falling edge pair is seen on one
of the phases without any edges on the other, the direction of rotation has changed.
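The counting rule above can be sketched as a small software decoder. This is a host-side model that updates on every edge of both PhA and PhB; the type and function names are hypothetical, and the QEI hardware performs the equivalent work without software involvement. The lookup table is indexed by the previous and current (PhA, PhB) pair.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint8_t prev;      /* previous state, encoded as (PhA << 1) | PhB */
    int32_t position;  /* running position count */
} quad_decoder_t;

/* Feed one sample of the two phase signals into the decoder.
   When PhA leads PhB the states follow 00 -> 10 -> 11 -> 01 -> 00 and each
   step adds 1; the opposite order subtracts 1. No change, or an invalid
   jump in which both signals change at once, leaves the position as is. */
static void quad_update(quad_decoder_t *d, unsigned pha, unsigned phb)
{
    static const int8_t delta[16] = {
         0, -1, +1,  0,   /* previous state 00 */
        +1,  0,  0, -1,   /* previous state 01 */
        -1,  0,  0, +1,   /* previous state 10 */
         0, +1, -1,  0    /* previous state 11 */
    };
    assert(d != 0);
    uint8_t cur = (uint8_t)(((pha & 1u) << 1) | (phb & 1u));
    d->position += delta[(d->prev << 2) | cur];
    d->prev = cur;
}
```

Feeding the sequence (1,0), (1,1), (0,1), (0,0) into a decoder that starts at state 00 raises the position by four, one count per edge; replaying the states in the opposite order counts back down.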
The positional counter is automatically reset on one of two conditions: sensing the index pulse or
reaching the maximum position value. The reset mode is determined by the RESMODE bit of the QEICTL register.
When RESMODE is set, the positional counter is reset when the index pulse is sensed. This
mode limits the positional counter to the values [0: N-1], where N is the number of phase
edges in a full revolution of the encoder wheel. The QEI Maximum Position (QEIMAXPOS)
register must be programmed with N-1 so that the reverse direction from position 0 can move
the position counter to N-1. In this mode, the position register contains the absolute position of
the encoder relative to the index (or home) position once an index pulse has been seen.
When RESMODE is clear, the positional counter is constrained to the range [0: M], where M is the
programmable maximum value. The index pulse is ignored by the positional counter in this
mode.
Velocity capture uses a configurable timer and a count register. The timer counts the
number of phase edges (using the same configuration as for the position integrator) in a given
time period.
The edge count from the previous time period is available to the controller via the QEI Velocity
(QEISPEED) register, while the edge count for the current time period is being accumulated in the QEI
Velocity Counter (QEICOUNT) register. As soon as the current time period is complete, the total number
of edges counted in that time period is made available in the QEISPEED register (overwriting the
previous value), the QEICOUNT register is cleared, and counting commences on a new time period. The
number of edges counted in a given time period is directly proportional to the velocity of the encoder.
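Because the edge count is proportional to velocity, converting it to revolutions per minute is a single scaling step. The sketch below is generic arithmetic rather than a device-specific formula: the function name is hypothetical, and the encoder resolution and 10 ms time period used in the example are assumed values.

```c
#include <assert.h>
#include <stdint.h>

/* Convert the number of edges counted in one time period into rpm.
   edges_per_rev is the number of counted edges in one full revolution,
   which depends on the encoder and on whether one or both phases are used. */
static uint32_t encoder_rpm(uint32_t edges_counted, uint32_t edges_per_rev,
                            uint32_t period_ms)
{
    assert(edges_per_rev != 0u && period_ms != 0u);
    return (uint32_t)(((uint64_t)edges_counted * 60000u) /
                      ((uint64_t)edges_per_rev * period_ms));
}
```

For an assumed encoder giving 4096 counted edges per revolution, 2048 edges in a 10 ms period is half a revolution every 10 ms, or 3000 rpm.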
Case Study: TIVA based embedded system application using ADC & PWM
This case study covers PWM-based speed control of a DC motor using a potentiometer.
In this study all the sensors are initialized and then synchronized with the synchronization clock pulse.
The sensor used here is a potentiometer connected to the ADC of the Tiva C Series LaunchPad, and
the motor is connected to a PWM pin of the LaunchPad as shown in the diagram below. The value read
from the potentiometer sets the duty cycle of the PWM signal that drives the motor, so the duty cycle
changes as the potentiometer is rotated. In this way the speed
of the motor is controlled by turning the potentiometer.
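The core of this application is the mapping from an ADC reading to a PWM pulse width. The following is a minimal sketch of that mapping only, assuming a 12-bit ADC result (0 to 4095) and a PWM period expressed in timer ticks; the function names are hypothetical and no peripheral registers are touched.

```c
#include <assert.h>
#include <stdint.h>

#define ADC_MAX 4095u  /* full-scale reading of an assumed 12-bit ADC */

/* Scale an ADC reading linearly to a pulse width in PWM ticks, so that
   0 gives no pulse and full scale gives a pulse as long as the period. */
static uint32_t adc_to_pulse_ticks(uint16_t adc, uint32_t period_ticks)
{
    assert(adc <= ADC_MAX);
    return (uint32_t)(((uint64_t)adc * period_ticks) / ADC_MAX);
}

/* Duty cycle in whole percent for a given pulse width and period. */
static uint32_t duty_percent(uint32_t pulse_ticks, uint32_t period_ticks)
{
    assert(period_ticks != 0u);
    return (pulse_ticks * 100u) / period_ticks;
}
```

With a 400-tick period, a mid-scale reading of 2048 maps to a 200-tick pulse, which is a 50% duty cycle and therefore roughly half of the supply voltage on average at the motor.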
Fig 3.19. Schematic for motor control using TIVA
Fig 3.20. Flowchart for DC motor control using PWM
Unit-V :
Embedded communications protocols and Internet of Things
Many serial communication interfaces compete for use in embedded systems. The right serial interface for your
system depends on several key factors. In this article I will describe seven of the most common serial interfaces,
to help you decide which bus is right for your next project.
Why serial?
There are many different reasons to use a serial interface. One of the most common is the need to interface with
a PC, during development and/or in the field. Most, if not all PCs have some sort of serial bus interface available
to connect peripherals. For embedded systems that must interface with a general-purpose computer, a serial
interface is often easier to use than the ISA or PCI expansion bus.
A benefit of serial communications is low pin counts. Serial communications can be performed with just one I/O
pin, compared to eight or more for parallel communications. Many common embedded system peripherals, such
as analog-to-digital and digital-to-analog converters, LCDs, and temperature sensors, support serial interfaces.
Serial buses can also provide for inter-processor communication-a network, if you will. This allows large tasks
that would normally require larger processors to be tackled with several inexpensive smaller processors. Serial
interfaces allow processors to communicate without the need for shared memory and semaphores, and the
problems they can create.
This isn't to say that parallel buses have no use. For operational fetches, address and data buses, and other
microprogram control, parallel buses have always been the clear winner. "Memory-mapping" peripherals has
been a technique commonly used for systems with address and data buses. This technique allows parallel access
to off-chip peripherals. However, many 8-bit microcontrollers (let alone 8-pin devices) have no external
address/data bus available, so memory-mapping is not an option for those designs.
Before we get into the individual interface details, we should define several terms:
On an asynchronous bus, data is sent without a timing clock. A synchronous bus sends data with a timing
clock.
Full-duplex means data can be sent and received simultaneously. Half-duplex is when data can be sent or
received, but not at the same time.
Master/slave describes a bus where one device is the master and others are slaves. Master/slave buses are
usually synchronous, as the master often supplies the timing clock for data being sent along in both
directions.
A multi-master bus is a master/slave bus that may have more than one master. These buses must have an
arbitration scheme that can settle conflicts when more than one master wants to control the bus at the same
time.
Point-to-point or peer interfaces are where two devices have a peer relation to each other; there are no
masters or slaves. Peer interfaces are most often asynchronous.
The term multi-drop describes an interface in which there are several receivers and one transmitter.
Multi-point describes a bus in which there are more than two peer transceivers. This is different from a
multi-drop interface as it allows bidirectional communication over the same set of wires.
TIA/EIA-232-F (typically referred to as RS-232) is a common interface that can be found on almost every
personal computer. RS-232 is a complete standard, not only including electrical characteristics, but
physical and mechanical characteristics as well, such as connection hardware, pin-outs, and signal names.
A point-to-point interface, RS-232 is capable of moderate distances at speeds up to 20Kbps. While not
specifically called out in the specification, speeds of greater than 115.2Kbps are possible, provided that
connections are short and proper grounding is used. Cable lengths of 30 feet are common, and cables of
over 200 feet can be attained with low-capacitance cable.
An RS-232 bus is an unbalanced bus capable of full-duplex communication between two
receiver/transmitter pairs, named data terminal equipment (DTE) and data communication equipment
(DCE). Each one has a transmit signal that is connected to the receive signal on the other end. As such,
there is a pin difference between the two sides. (Your PC is a DTE, while the connected peripheral is
a DCE.)
Each transmitter sends data by varying the voltage on the line. A voltage higher than +3V is a binary zero,
while a voltage less than -3V is a binary one. Between these voltages, the value is undefined. To convert
from logic levels (0 and 5V) to these levels and back, an RS-232 conversion IC, such as the 1488, 1489, or
ubiquitous MAX232, can be used.
Typical RS-232 communication consists of a start bit, data bits, parity bits (if any), and stop bit(s). When
communicating with PCs, the typical format is eight data bits, no parity, and one stop bit (8N1). Seven data
bits, even parity, and one stop bit (7E1) is also common. A start bit is often a zero and a stop bit is often a
one, as shown in Figure 1. The official specification does not delineate any communications protocol,
including the use of start/stop bits.
Figure 1: RS-232
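The 8N1 and 7E1 formats mentioned above can be sketched in code. The helper names are hypothetical; the data bits are placed least-significant bit first, which is the usual UART convention, and the bits are logic values (0 or 1), not RS-232 line voltages.

```c
#include <assert.h>
#include <stdint.h>

/* Fill bits[0..9] with an 8N1 frame: a start bit (0), eight data bits sent
   least-significant bit first, and one stop bit (1). */
static void uart_frame_8n1(uint8_t data, uint8_t bits[10])
{
    assert(bits != 0);
    bits[0] = 0u;
    for (unsigned i = 0u; i < 8u; i++) {
        bits[1u + i] = (uint8_t)((data >> i) & 1u);
    }
    bits[9] = 1u;
}

/* Even-parity bit for a 7-bit character: chosen so that the data bits plus
   the parity bit contain an even number of ones (as used by 7E1). */
static uint8_t even_parity_bit(uint8_t data7)
{
    uint8_t ones = 0u;
    for (unsigned i = 0u; i < 7u; i++) {
        ones = (uint8_t)(ones + ((data7 >> i) & 1u));
    }
    return (uint8_t)(ones & 1u);
}
```

For the character 'A' (0x41) the 8N1 frame is 0, 1,0,0,0,0,0,1,0, 1. Since every byte costs ten bit times in this format, the usable byte rate is one tenth of the bit rate.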
Many embedded systems that use the RS-232 bus either interface with PCs or PC peripherals such as
modems. Other systems use RS-232 so that bus traffic can be monitored easily with an inexpensive
protocol analyzer or a PC equipped with two serial ports.
Almost every microcontroller vendor has products that include hardware support for RS-232, called
Universal Asynchronous Receiver Transmitters (UARTs). UARTs are often interrupt-driven and capable
of speeds up to 115.2Kbps with little software overhead, although this varies by architecture.
RS-422 and RS-485
TIA/EIA-422-B (typically referred to as RS-422) and TIA/EIA-485-A (typically referred to as RS-485) are
balanced, twisted-pair interfaces capable of speeds up to 10Mbps and distances up to 4,000 feet. Being
differential buses, each uses signals from 1.5V to 6V to transmit the data. (With a differential, balanced
bus, noise immunity is increased over a comparable single-ended, unbalanced bus such as RS-232.)
The RS-422 interface is a multi-drop interface, giving unidirectional communication over a pair of wires
from one transmitter to several receivers, up to 10 unit loads (UL). If the devices receiving the data wish to
communicate back to the transmitter, the designer must use a separate, dedicated bus between each
receiver and the transmitter. (Using this return bus will allow full-duplex transmissions.) For that reason,
RS-422 is seldom used between more than two nodes.
The RS-485 interface, on the other hand, provides bidirectional communication over one pair of wires between
several transceivers. The specification states that the bus can include up to 32 UL worth of transceivers.
Many manufacturers produce fractional-UL transceivers, thereby increasing the maximum number of
devices to well over 100.
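The unit-load arithmetic is simple enough to check by hand, but a short sketch makes the scaling explicit. The 1/4 and 1/8 UL figures are illustrative examples of fractional-UL parts, not values taken from the text.

```python
from fractions import Fraction


def max_transceivers(unit_load: Fraction, bus_budget: int = 32) -> int:
    """Number of identical transceivers that fit in the bus's UL budget."""
    return int(bus_budget // unit_load)


full_ul = max_transceivers(Fraction(1))         # standard 1 UL parts
quarter_ul = max_transceivers(Fraction(1, 4))   # assumed 1/4 UL parts
eighth_ul = max_transceivers(Fraction(1, 8))    # assumed 1/8 UL parts
```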
The RS-422 and RS-485 interfaces often use the same start bit/data/stop bit format of RS-232. In fact,
several converters exist to go from RS-232 to RS-485 and back. Do keep in mind, however, that RS-232 is
a full-duplex interface, while RS-485 is half-duplex.
Several microcontroller manufacturers provide built-in UARTs that boast special RS-485 abilities.
I2C
The Inter-Integrated Circuit bus (I2C) is a patented interface developed by Philips Semiconductors. (In
order for an IC manufacturer to implement the I2C bus in hardware, they must obtain licensing from
The I2C bus is a half-duplex, synchronous, multi-master bus requiring only two signal wires: data (SDA)
and clock (SCL). These lines are pulled high via pull-up resistors and controlled by the hardware via open-drain drivers, giving a wired-AND interface.
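A minimal model of the wired-AND behavior, assuming a hypothetical OpenDrainLine class: the line reads high only when no device is pulling it low, because an open-drain driver can sink the line but never drive it high.

```python
class OpenDrainLine:
    """One bus wire with a pull-up resistor and open-drain drivers."""

    def __init__(self):
        self._pulling_low = set()

    def pull_low(self, device: str) -> None:
        self._pulling_low.add(device)       # driver transistor switched on

    def release(self, device: str) -> None:
        self._pulling_low.discard(device)   # driver off, pull-up takes over

    def read(self) -> int:
        # Wired-AND: high only if every device has released the line.
        return 0 if self._pulling_low else 1


sda = OpenDrainLine()
sda.pull_low("master")
sda.pull_low("slave")
sda.release("master")
level_while_slave_holds = sda.read()   # still low: the slave is holding it
```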
I2C uses an addressable communications protocol that allows the master to communicate with individual
slaves using a 7-bit or 10-bit address. Each device has an address that is assigned by Philips to the
manufacturer of the device. In addition, several special addresses exist, including a "general call" address
(which addresses every device on the bus) and a high-speed initiation address.
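As a sketch of 7-bit addressing: the first byte on the bus carries the address in its upper seven bits. The read/write flag in the lowest bit and the all-zero general call address come from the I2C specification rather than from the text above, and the function names are hypothetical.

```python
GENERAL_CALL_ADDRESS = 0x00   # addresses every device on the bus


def address_byte(addr7: int, read: bool) -> int:
    """Build the first byte of a transfer: 7-bit address plus R/W bit."""
    if not 0 <= addr7 <= 0x7F:
        raise ValueError("7-bit address must be in 0..127")
    return (addr7 << 1) | (1 if read else 0)   # R/W: 1 = read, 0 = write


def is_general_call(first_byte: int) -> bool:
    """True when the first byte is the general call (address 0, write)."""
    return first_byte == (GENERAL_CALL_ADDRESS << 1)
```

For example, a device at the illustrative address 0x50 would see 0xA0 for a write and 0xA1 for a read.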
During communication with slave devices, the master generates all clock signals for both communication
to and from the slave. Each communication begins with the master generating a start condition, followed by an 8-bit
data word and an acknowledge bit, and ends with a stop condition or a repeated start. Each data bit transition
takes place while SCL is low, except for the start and stop conditions. The start condition is a high-to-low
transition of the SDA line while the SCL line is high. A stop condition is a low-to-high transition of the
SDA line while the SCL line is high (see Figure 2). The acknowledge bit is generated by the receiver of the
message by pulling the SDA line low while the master releases the line and allows it to float high. If the
master reads the acknowledge bit as high, it should consider the last communication word not received and
take appropriate action, including possibly resending the data.
Figure 2: I2C
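The start, data, acknowledge and stop rules can be checked with a small waveform sketch. It lists (SCL, SDA) samples for a one-byte write and then confirms that SDA changes while SCL is high only at the start and stop conditions. The function names are hypothetical, and MSB-first bit order is taken from the I2C specification, not from the text.

```python
def i2c_write_waveform(byte: int, ack: bool = True) -> list:
    """Return (SCL, SDA) samples for START, one byte, the ACK slot and STOP."""
    wave = [(1, 1)]                    # bus idle: both lines pulled high
    wave.append((1, 0))                # START: SDA falls while SCL is high
    wave.append((0, 0))                # SCL low so that data may change
    bits = [(byte >> i) & 1 for i in range(7, -1, -1)]   # MSB first
    bits.append(0 if ack else 1)       # ninth clock: receiver pulls SDA low
    for bit in bits:
        wave.append((0, bit))          # SDA changes only while SCL is low
        wave.append((1, bit))          # SCL high: the bit is valid
        wave.append((0, bit))          # SCL back low
    wave.append((0, 0))                # SDA held low ahead of STOP
    wave.append((1, 0))                # SCL rises with SDA still low
    wave.append((1, 1))                # STOP: SDA rises while SCL is high
    return wave


def sda_edges_while_scl_high(wave: list) -> list:
    """Name every SDA transition that happens while SCL stays high."""
    edges = []
    for (scl0, sda0), (scl1, sda1) in zip(wave, wave[1:]):
        if scl0 == 1 and scl1 == 1 and sda0 != sda1:
            edges.append("START" if sda1 == 0 else "STOP")
    return edges
```

If the ACK slot reads high instead of low, the master would treat the byte as not received, as described above.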
I2C has a rather interesting feature called clock stretching, which is done when the slave device is unable to
process the bit and wishes for more time. When this happens, the slave pulls the SCL line low. Since the
signal behaves as a wired-AND, when the master releases the SCL line while the slave is "stretching" the
clock, the master should notice that the line stays low. Upon seeing this, the master waits until the slave
has processed the data bit and released the line. Once released by the slave, the SCL line floats back high,
signaling to the master to send the next data bit.
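Clock stretching can be sketched from the master's side as a wait loop: release SCL, then read it back until it is actually high. The tick-based timing and the timeout are assumptions added for the illustration, since the text does not say how long a master should wait.

```python
def master_wait_for_scl(slave_busy_ticks: int, timeout: int = 100) -> int:
    """Release SCL and wait until the line really goes high.

    Returns the number of ticks the master spent waiting.
    """
    master_releases = True
    waited = 0
    while True:
        slave_releases = waited >= slave_busy_ticks   # slave holds SCL low while busy
        scl = 1 if (master_releases and slave_releases) else 0   # wired-AND
        if scl == 1:
            return waited                             # safe to send the next bit
        waited += 1
        if waited > timeout:
            raise TimeoutError("SCL stuck low")       # assumed recovery policy
```

A slave that never stretches returns immediately, so the loop costs nothing on a well-behaved bus.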
The I2C bus has three speeds: slow (under 100Kbps), fast (400Kbps), and high-speed (3.4Mbps), each
downward compatible. Philips has specified a recommended wiring arrangement should the signals need to
leave the circuit board.
I2C bus distances are often limited to on-board communications, although I have heard of developers using
I2C successfully over distances of 50 feet! The true limit to I2C distances is the bit-rate and capacitance of
the bus. As such, for off-board communications, I2C is practically limited to under 10 feet for moderate
For more details on I2C, read David and Roee Kalinsky's "Beginner's Corner: I2C" (August 2001).
SPI
The Serial Peripheral Interface (SPI) is a synchronous serial bus developed by Motorola and present on
many of their microcontrollers.
The SPI bus consists of four signals: master out slave in (MOSI), master in slave out (MISO), serial clock
(SCK), and active-low slave select (/SS). As a multi-master/slave protocol, communications between the
master and selected slave use the unidirectional MISO and MOSI lines, to achieve data rates over 1Mbps
in full duplex mode. The data is clocked simultaneously into the slave and master based on SCK pulses the
master supplies. The SPI protocol allows for four different clocking types, based on the polarity and phase
of the SCK signal. It is important to ensure that these are compatible between master and slave.
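The simultaneous clocking into master and slave behaves like two shift registers connected in a ring. A minimal sketch follows, assuming MSB-first shifting and the conventional numbering of the four clocking types as modes 0 to 3; neither detail is stated in the text, and the function names are hypothetical.

```python
def spi_mode(cpol: int, cpha: int) -> int:
    """Conventional mode number for a clock polarity/phase pair."""
    return (cpol << 1) | cpha


def spi_exchange(master_byte: int, slave_byte: int, bits: int = 8) -> tuple:
    """Clock `bits` SCK pulses; return (master received, slave received)."""
    mask = (1 << bits) - 1
    m, s = master_byte & mask, slave_byte & mask
    for _ in range(bits):
        mosi = (m >> (bits - 1)) & 1    # master drives its top bit on MOSI
        miso = (s >> (bits - 1)) & 1    # slave drives its top bit on MISO
        m = ((m << 1) & mask) | miso    # master shifts in what the slave sent
        s = ((s << 1) & mask) | mosi    # slave shifts in what the master sent
    return m, s


master_got, slave_got = spi_exchange(0xA5, 0x3C)
```

After eight clocks the two bytes have swapped places, which is why every SPI write is also a read.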
In addition to the 1Mbps data rate, another advantage to SPI is that if only one slave device is used, the /SS line
can be pulled low and the /SS signal does not have to be generated by the master. (This capability is,
however, dependent on the phase selection of the SCK.)
A disadvantage to SPI is the requirement to have separate /SS lines for each slave. Provided that extra I/O
pins are available, or extra board space for a demultiplexer IC, this is not a problem. But for small, low-pin-count microcontrollers, a multi-slave SPI interface might not be a viable solution.
For more detail on SPI, read David and Roee Kalinsky's "Beginner's Corner: Serial Peripheral Interface"
(February 2002).
Microwire
Microwire is a three-wire synchronous interface developed by National Semiconductor and present on
their COP8 processor family.
Similar to SPI, Microwire is a master/slave bus, with serial data out of the master (SO), and serial data in
to the master (SI), and signal clock (SK). These correspond to SPI's MOSI, MISO, and SCK, respectively.
There is also a chip select signal, which acts similarly to SPI's /SS. A full-duplex bus, Microwire is capable
of speeds of 625Kbps and faster (capacitance permitting).
Microwire devices from National come with different protocols, based on their data needs. Unlike SPI,
which is based on an 8-bit byte, Microwire permits variable length data, and also specifies a "continuous"
bitstream mode.
Microwire has the same advantages and disadvantages as SPI with respect to multiple slaves, which
require multiple chip select lines. In some instances, an SPI device will work on a Microwire bus, as will a
Microwire device work on an SPI bus, although this must be reviewed on a per-device basis.
Both SPI and Microwire are generally limited to on-board communications and traces of no longer than 6
inches, although longer distances (up to 10 feet) can be achieved given proper capacitance and lower bit
1-Wire
Dallas Semiconductor's 1-Wire bus is an asynchronous, master/slave bus with no protocol for multi-master. Like the I2C bus, 1-Wire is half-duplex, using an open-drain topology on a single wire for
bidirectional data transfer. However, the 1-Wire bus also allows the data wire to transfer power to the slave
devices, although this is somewhat limited. Though limited to a maximum speed of 16Kbps, bus length can
be upwards of 1,000 feet, given the proper pull-up resistor.
For more detail on the 1-Wire bus, read H. Michael Willey's "One Cheap Network Topology" (January,
Bit banging
Should you not have hardware support for any of the above, it is possible to use general-purpose I/O pins.
The act of software controlling a serial communication is often referred to as "bit banging," as the software
is truly "banging away" at the adopted "serial port."
Bit banging requires the software to be cognizant of the exact timing required for each bit, for it must
toggle an output line for every bit change (as well as monitor the receive pin for incoming data, if such
interface is full-duplex). Luckily for embedded developers, quite a few bit-banging routines are available
on the Internet for every serial bus described here, and for use in almost every microcontroller architecture.
In fact, several microcontroller manufacturers have developed and published their own such routines.
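To give a feel for what such a routine does, here is a sketch of a bit-banged asynchronous transmitter for one 8N1 character. The set_pin callback stands in for whatever GPIO write the target provides and is purely hypothetical. On a real microcontroller the delay would come from a timer or a calibrated loop, since time.sleep is far too imprecise for serial timing.

```python
import time


def bitbang_uart_tx(byte: int, set_pin, baud: int = 9600,
                    delay=time.sleep) -> None:
    """Send one 8N1 character by toggling an output pin from software."""
    bit_time = 1.0 / baud
    # Start bit (0), eight data bits LSB first, stop bit (1).
    bits = [0] + [(byte >> i) & 1 for i in range(8)] + [1]
    for bit in bits:
        set_pin(bit)       # drive the output line to this bit's level
        delay(bit_time)    # hold it for one bit period


# Usage with stand-ins for the pin and the delay, so nothing real is toggled.
levels, waits = [], []
bitbang_uart_tx(0x55, levels.append, baud=9600, delay=waits.append)
```

Passing the pin write and the delay in as parameters keeps the timing logic testable away from the hardware.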
Catching the right bus
As you can see, there is a multitude of serial communication buses to choose from. (And we didn't even
discuss wireless, networks, Firewire, and USB protocols.) Your choice in a serial bus should not only meet
the needs of the product today, but also be available as well as viable for the life of the product. I hope this
has helped you decide which serial interface is proper for your current embedded design.
Table 1 Protocol comparison
Not including ground.
Faster speeds available but not specified.
Dependent on capacitance of the wiring.
Software handshaking. Hardware handshaking requires additional
Device count given in unit loads (UL). More devices are possible if fractional-UL
Unidirectional communication only. Additional pins needed for each bidirectional
Limitation based on bus capacitance and bit
Additional pins needed for every slave if slave count is more than
What is Communication?
Before we move on to serial communication, let’s discuss a bit about communication in general. In
simple terms, communication is an exchange of ideas between two individuals. Ideas can be anything
and in any form – they could be written/spoken words, in the form of media like audio/video, or if you like
sci-fi, then it can also be in the form of telepathy! ;)
But what does communication between two microcontrollers mean? It’s simple! An exchange of data
(bits)! There are many protocols for communication (which will be discussed later) but all of them
are based on either serial communication or parallel communication.
Why do we need Communication?
Let’s take an example. As kids, we all must have played with those remote controlled toy cars and
airplanes. It was pretty fun and fascinating at that time. I am sure that most of us at that time didn’t try
to figure out how it was possible! How could the remote control device in your hand control the car or
the aeroplane? Well, of course, the device in your hand sends some data, which is received by the
car/aeroplane. There is a microcontroller onboard the toy, which interprets the signals and acts
accordingly. Correct! So far so good, but now it doesn’t end here. As grown ups, there are a few more
questions which should arise! Like how does the device send the signal? From where is the signal being
sent? What is actually being sent? Who receives it? How is it processed?
Let’s take another example. This one’s a more common example. You have a file in your mobile and
you would like to share it with your friend who is sitting next to you. How would you do it – Bluetooth,
IR, NFC, LAN or email? Most people would use Bluetooth. IR is obsolete, NFC is still in
developmental phase and isn’t available in most devices, LAN needs a WiFi/LAN network whereas
email requires an active Internet connection. The same questions can be put forth here as well – how is
it sent, from where is it sent and to where, what is being sent and how is it processed?!
Well, this is why communication is required! And to answer all those questions, several communication
protocols have been developed! Now let’s discuss a little about serial and parallel communication.
Serial Communication
Serial Transfer
In Telecommunication and Computer Science, serial communication is the process of sending/receiving
data one bit at a time. It is like you are firing bullets from a machine gun at a target… that’s one
bullet at a time! ;)
Parallel Communication
Parallel Transfer
Parallel communication is the process of sending/receiving multiple data bits at a time through parallel
channels. It is like you are firing using a shotgun to a target – where multiple bullets are fired from the
same gun at a time! ;)
Serial vs Parallel Communication
Now let’s have a quick look at the differences between the two types of communications.
Serial Communication                          Parallel Communication
1. One data bit is transceived at a time      1. Multiple data bits are transceived at a time
2. Slower                                     2. Faster
3. Fewer cables required to transmit data     3. Higher number of cables required
Serial vs Parallel
So these were the basic differences between serial and parallel communication. From the above
differences, one would obviously think that parallel communication is far better than serial
communication. But wait, these are just the basic differences. Before we proceed further, we need to be
acquainted with a few terminologies:
1. Bit Rate: It is the number of bits that are transmitted (sent/received) per unit time.
2. Clock Skew: In a parallel circuit, clock skew is the difference in the arrival time of the clock signal at two
sequentially adjacent registers. To explain it further, let us take the machine gun example again. When, say,
5 people are firing at the same time, there is bound to be a time difference in the arrival of the bullet
from the first shooter and that from the second shooter and so on. This time difference is what we call
clock skew. This is better illustrated in the picture below: there is a time lag in the data bits through
different channels of the same bus. Clock skew is inevitable due to differences in the physical conditions
of the channels, like temperature, resistance, path length, etc.
3. Crosstalk: Phenomenon by which a signal transmitted on one channel of a transmission bus creates an
undesired effect in another channel. Undesired capacitive, inductive, or conductive coupling is usually
what is called crosstalk, from one circuit, part of a circuit, or channel, to another. It can be seen from
the following diagram that clock skew and crosstalk are inevitable.
Major Factors Limiting Parallel Communication
Before the development of high-speed serial technologies, the choice of parallel links over serial links
was driven by these factors:
1. Speed: Superficially, the speed of a parallel link is equal to bit rate*number of channels. In
practice, clock skew reduces the speed of every link to the slowest of all of the links.
2. Cable length: Crosstalk creates interference between the parallel lines, and the effect only magnifies
with the length of the communication link. This limits the length of the communication cable that can
be used.
These two are the major factors, which limit the use of parallel communication.
Advantages of Serial over Parallel
Although a serial link may seem inferior to a parallel one, since it can transmit less data per clock cycle,
it is often the case that serial links can be clocked considerably faster than parallel links in order to
achieve a higher data rate. A number of factors allow serial to be clocked at a higher rate:
Clock skew between different channels is not an issue (for un-clocked asynchronous serial
communication links).
A serial connection requires fewer interconnecting cables (e.g. wires/fibers) and hence occupies less
space. The extra space allows for better isolation of the channel from its surroundings.
Crosstalk is much less of an issue, because there are fewer conductors in proximity.
In many cases, serial is a better option because it is cheaper to implement. Many ICs have serial
interfaces, as opposed to parallel ones, so that they have fewer pins and are therefore less expensive. It
is because of these factors that serial communication is preferred over parallel communication.
How is Data sent Serially?
Since we already know what registers and data bits are, we will now talk in these terms only.
If not, I would recommend you to first take a detour and go through the introduction of this post by
When a particular data set is in the microcontroller, it is in parallel form, and any bit can be accessed
irrespective of its bit number. When this data set is transferred into the output buffer to be transmitted,
it is still in parallel form. This output buffer converts the data into serial form (PISO,
Parallel In Serial Out), MSB (Most Significant Bit) first or LSB (Least Significant Bit) first,
according to the protocol. The data is then transmitted serially.
When this data is received by another microcontroller in its receiver buffer, the receiver buffer converts
it back into parallel data (SIPO) (Serial In Parallel Out) for further processing. The following diagram
should make it clear.
Data Transfer in Serial Communication
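Here is a small sketch of the two conversions, with hypothetical function names. It also shows why both ends must agree on the bit order: read the same bits in the opposite order and you get a different byte.

```python
def piso(value: int, width: int = 8, msb_first: bool = True) -> list:
    """Parallel In, Serial Out: turn a register value into a bit sequence."""
    order = range(width - 1, -1, -1) if msb_first else range(width)
    return [(value >> i) & 1 for i in order]


def sipo(bits: list, msb_first: bool = True) -> int:
    """Serial In, Parallel Out: rebuild the register value from the bits."""
    value = 0
    if msb_first:
        for bit in bits:
            value = (value << 1) | bit      # each new bit enters at the bottom
    else:
        for position, bit in enumerate(bits):
            value |= bit << position        # first bit received is bit 0
    return value


sent = piso(0xB4)                           # transmitter sends MSB first
good = sipo(sent)                           # receiver agrees on the order
bad = sipo(sent, msb_first=False)           # receiver assumes LSB first
```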
This is how serial communication works! But it is not as simple as it looks. There is a catch in it, which
we will discuss a little later in the same post. For now, let’s discuss the two modes of serial data transfer
– synchronous and asynchronous.
Serial Transmission Modes
Serial data can be transferred in two modes – asynchronous and synchronous.
Asynchronous Data Transfer
Data Transfer is called Asynchronous when data bits are not “synchronized” with a clock line, i.e. there
is no clock line at all!
Let’s take an analogy. Imagine you are playing a game with your friend where you have to throw colored
balls (let’s say we have only two colors – red (R) and yellow (Y)). Let’s assume you have unlimited
number of balls. You have to throw a combination of these colored balls to your friend. So you start
throwing the balls. You throw R, then R, then Y, then R again and so on. So you start your sequence
RRYR… and then you end your round and start another round. How will your buddy on the other side
know that you have finished sending him first round of balls and that you are already sending him the
second round of balls?? He/she will be completely lost! How nice it would be if you both sit together
and fix a protocol that each round consists of 8 balls! After every 8 balls, you will throw two R balls to
ensure that your friend has caught up with you, and then you again start your second round of 8 balls.
This is what we call asynchronous data transfer.
Asynchronous data transfer has a protocol, which is usually as follows:
The first bit is always the START bit (which signifies the start of communication on the serial line),
followed by DATA bits (usually 8-bits), followed by a STOP bit (which signals the end of data packet).
There may be a Parity bit just before the STOP bit. The Parity bit was earlier used for error checking,
but is seldom used these days.
The START bit is always low (0) while the STOP bit is always high (1).
The following diagram explains it.
Asynchronous Data Transfer Timing Diagram
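To see how the START and STOP bits let a receiver find its place in the stream, here is a sketch of a decoder. It assumes the line idles high, one sample per bit period, and data bits sent LSB first; the function name is hypothetical.

```python
def decode_stream(line_bits: list, data_bits: int = 8) -> tuple:
    """Extract characters from a sampled line; return (chars, framing_errors)."""
    chars, errors = [], 0
    i = 0
    while i < len(line_bits):
        if line_bits[i] == 1:            # idle level: keep waiting for START
            i += 1
            continue
        frame = line_bits[i + 1:i + 2 + data_bits]   # data bits plus STOP bit
        if len(frame) < data_bits + 1:
            break                        # the stream ended in mid-frame
        data, stop = frame[:-1], frame[-1]
        if stop == 1:
            chars.append(sum(bit << n for n, bit in enumerate(data)))
        else:
            errors += 1                  # framing error: STOP bit was not high
        i += data_bits + 2               # skip START, data and STOP
    return chars, errors


stream = ([1, 1]
          + [0, 1, 0, 0, 0, 0, 0, 1, 0, 1]    # 'A' (65)
          + [1]
          + [0, 0, 1, 0, 0, 0, 0, 1, 0, 1])   # 'B' (66)
received, framing_errors = decode_stream(stream)
```

The idle bits between the two characters can be any length, which is exactly what "asynchronous" buys you.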
Synchronous Data Transfer
Synchronous data transfer is when the data bits are “synchronized” with a clock pulse.
We will take the same analogy as before. You are still playing the throw-ball game, but this time, you
have set a timer in your watch such that it beeps every minute. You will not throw a ball unless you
hear a beep from your watch. As soon as you hear a beep from your watch, you and your friend, both
know that you are going to throw a ball to her. Both of you can keep a track of time using this; say you
start a new round after every 8 beeps. Isn’t it a much better approach? This approach is what we
call synchronous data transfer.
The concept for synchronous data transfer is simple, and as follows:
The basic principle is that data bit sampling (or in other words, say, ‘recording’) is done with respect to
clock pulses, as you can see in the timing diagrams.
Since data is sampled depending upon clock pulses, and since the clock sources are very reliable,
there is much less error in synchronous as compared to asynchronous.
Synchronous Data Transfer Timing Diagram
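A tiny sketch of clocked sampling: the receiver ignores the data line except at the instant the clock goes from low to high. Sampling on the rising edge is an assumption here; some protocols use the falling edge instead.

```python
def sample_on_rising_edges(clock: list, data: list) -> list:
    """Return the data values captured at each low-to-high clock transition."""
    samples = []
    for previous, current, value in zip(clock, clock[1:], data[1:]):
        if previous == 0 and current == 1:
            samples.append(value)       # the 'beep': record the data line now
    return samples


clock = [0, 1, 0, 1, 0, 1, 0, 1]
data = [1, 1, 0, 0, 1, 1, 1, 1]
captured = sample_on_rising_edges(clock, data)
```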
Serial Communication Terminologies
Now it’s time to learn about some new words, which we will use frequently in the next few posts. There
are many terminologies, or ‘keywords’ associated with serial communication. We will discuss all of
them one by one:
1. MSB/LSB: these stand for Most Significant Bit and Least Significant Bit. You can refer to
Mayank’s post for more information on MSB and LSB. Since data is transferred bit-by-bit in serial
communication, one needs to know which bit is sent out first: MSB or LSB.
2. Simplex Communication: In this mode of serial communication, data can only be transferred from
transmitter to receiver and not vice versa.
3. Half Duplex Communication: this means that data transmission can occur in only one direction at a
time, i.e. either from master to slave, or slave to master, but not both.
4. Full Duplex Communication: full duplex communication means that data can be transmitted from the
master to the slave, and from slave to the master at the same time!
Types of Transmission
5. Baud Rate: according to Wikipedia, baud is synonymous with symbols per second or pulses per second.
It is the unit of symbol rate, also known as baud or modulation rate. However, though technically
incorrect, in the case of modem manufacturers baud commonly refers to bits per second.
Importance of Baud Rate
For two microcontrollers to communicate serially they should have the same baud rate, else serial
communication won’t work. This is because when you set a baud rate, you direct the microcontroller to
transmit/receive the data at that particular rate. So if you set different baud rates, then the receiver might
miss the bits the transmitter is sending (because it is configured to receive data and process it at
a different speed!)
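You can see the problem with a quick simulation. In this sketch the transmitter sends an 8N1 character and the receiver samples the middle of each bit period it expects. The function name, the 8N1 format and the integer timing model are assumptions made for the illustration.

```python
def receive(byte: int, tx_baud: int, rx_baud: int) -> tuple:
    """Return (value the receiver reads, True if START/STOP looked valid)."""
    frame = [0] + [(byte >> i) & 1 for i in range(8)] + [1]   # 8N1, LSB first
    sampled = []
    for n in range(10):
        # Receiver samples at time (n + 0.5) / rx_baud; find which transmitted
        # bit is on the line at that moment, using exact integer arithmetic.
        index = ((2 * n + 1) * tx_baud) // (2 * rx_baud)
        sampled.append(frame[index] if index < len(frame) else 1)  # idle is high
    start, data, stop = sampled[0], sampled[1:9], sampled[9]
    value = sum(bit << i for i, bit in enumerate(data))
    return value, (start == 0 and stop == 1)


matched = receive(0x55, tx_baud=9600, rx_baud=9600)
mismatched = receive(0x55, tx_baud=9600, rx_baud=19200)
```

With matching rates the byte comes through intact. With the receiver running twice as fast, it reads each transmitted bit twice, gets a different value, and finds a low where the STOP bit should be.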
Different baud rates are available for use. The most common ones are 2400, 4800, 9600, 19200, 38400
etc. You should not pick an arbitrary baud rate; use one of the standard values like
2400, 4800, etc. Please note that on a UART link each symbol carries one bit, so the baud rate is
numerically equal to the bit rate in bps (bits per second).
The Catch in Serial Communication
Now it’s all clear to you. You have data. You decide how to send your data
(synchronous/asynchronous). You send your data by following proper protocols. The transmitter
converts your parallel data to serial, sends it across the channel, then the receiver converts your serial
data to parallel. Bingo! But that’s not sufficient for a proper serial communication. There are two things
which still need to be taken care of:
1. Baud Rate: Unless the baud rate of both the transmitter and receiver are the same, serial communication
cannot work. The reason is specified in the previous section.
2. Address: If you are trying to send multiple data together over the same channel and/or you are sharing
the same channel space with other users sending their own data, then you need to take care to properly
address your data. We won’t discuss it in this post, but we will surely cover it in one of
our upcoming posts.
If you take care of these two factors, your serial communication will be established perfectly and your
data will go through properly. These are the two main reasons for an unsuccessful serial link.
UART and USART
UART stands for Universal Asynchronous Receiver Transmitter, whereas USART stands
for Universal Synchronous Asynchronous Receiver Transmitter. They are basically just a piece of
computer hardware that converts parallel data into serial data. The only difference between them is that
UART supports only asynchronous mode, whereas USART supports both asynchronous and
synchronous modes. Unlike Ethernet, Firewire etc., there is no specific port for UART/USART. They
are commonly used in conjunction with protocols like RS-232, RS-485 etc. (we have specific ports for
these two!).
In synchronous transmission, the clock data is recovered separately from the data stream and no
start/stop bits are used. This improves the efficiency of transmission on suitable channels since more of
the bits sent are usable data and not character framing.
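The efficiency gain is easy to put in numbers. This sketch compares framed asynchronous characters with a synchronous stream that carries no start/stop bits. It ignores any synchronization overhead a real synchronous protocol may add, and the function names are hypothetical.

```python
def framing_efficiency(data_bits: int = 8, start_bits: int = 1,
                       parity_bits: int = 0, stop_bits: int = 1) -> float:
    """Fraction of the bits on the wire that are usable data."""
    total = start_bits + data_bits + parity_bits + stop_bits
    return data_bits / total


def payload_rate(bit_rate: int, data_bits: int = 8, start_bits: int = 1,
                 parity_bits: int = 0, stop_bits: int = 1) -> float:
    """Usable data bits per second for a given line bit rate."""
    total = start_bits + data_bits + parity_bits + stop_bits
    return bit_rate * data_bits / total


async_8n1 = framing_efficiency()                          # 8 of 10 bits
async_7e1 = framing_efficiency(data_bits=7, parity_bits=1)   # 7 of 10 bits
sync_raw = framing_efficiency(start_bits=0, stop_bits=0)  # no character framing
bytes_per_second = payload_rate(115200) / 8               # 8N1 at 115.2 kbps
```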
The USART has the following components:
A clock generator, usually a multiple of the bit rate to allow sampling in the middle of a bit period
Input and output shift registers
Transmit/receive control
Read/write control logic
Transmit/receive buffers (optional)
Parallel data bus buffer (optional)
First-in, first-out (FIFO) buffer memory (optional)
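The role of the clock generator can be sketched as follows. The receiver runs its clock at a multiple of the bit rate (16x is a common choice and an assumption here), finds the falling edge of the start bit, and then reads each bit half a bit period later, in the middle of the bit. The function names are hypothetical.

```python
def oversample(frame_bits: list, factor: int = 16) -> list:
    """Expand each line bit into `factor` receiver-clock samples."""
    return [bit for bit in frame_bits for _ in range(factor)]


def recover_bits(samples: list, n_bits: int, factor: int = 16) -> list:
    """Locate the START edge, then sample the middle of every bit period."""
    start = samples.index(0)            # first low sample marks the START bit
    middle = start + factor // 2        # half a bit period after the edge
    return [samples[middle + n * factor] for n in range(n_bits)]


frame = [0, 1, 0, 1, 1, 0, 0, 1, 0, 1]          # start, 8 data bits, stop
line = [1] * 5 + oversample(frame)              # some idle time, then the frame
recovered = recover_bits(line, n_bits=len(frame))
```

Sampling mid-bit leaves the most margin on either side, so small differences between the two clocks do not push the sample point into a neighboring bit.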
Serial Communication Protocols
A variety of communication protocols have been developed based on serial communication in the past
few decades. Some of them are:
1. SPI – Serial Peripheral Interface: It is a three-wire based communication system. One wire each for
Master to slave and Vice-versa, and one for clock pulses. There is an additional SS (Slave Select) line,
which is mostly used when we want to send/receive data between multiple ICs.
2. I2C – Inter-Integrated Circuit: Pronounced eye-two-see or eye-square-see, this is a synchronous,
addressable bus rather than a form of USART. Transfers run at up to 400 kHz in fast mode and up to
3.4 MHz in high-speed mode. The I2C bus has two wires
– one for clock, and the other is the data line, which is bi-directional – this being the reason it is also
sometimes (not always – there are a few conditions) called Two Wire Interface (TWI). It was invented
by Philips.
3. FireWire – Developed by Apple, it is a high-speed bus capable of audio/video transmission. The
bus contains a number of wires depending upon the port, which can be either a 4-pin one, or a 6-pin
one, or a 9-pin one.
4. Ethernet: Used mostly in LAN connections, the bus consists of 8 lines, or 4 Tx/Rx pairs.
5. Universal serial bus (USB): This is the most popular of all. It is used for virtually all types of connections.
The bus has 4 lines: VCC, Ground, Data+, and Data-.
USB Pins
6. RS-232 – Recommended Standard 232: The RS-232 is typically connected using a DB9 connector,
which has 9 pins, out of which 5 are input, 3 are output, and one is Ground. You can still find this so-called “Serial” port in some old PCs. In our upcoming posts, we will mainly discuss RS232 and
USART of AVR microcontrollers.