CPU registers

ARM Core Architecture
Common ARM Cortex Core
In the case of ARM-based microcontrollers, a
company named ARM Holdings designs the
core and licenses it to manufacturers like ST
(or NXP, Apple, Samsung, Qualcomm, HP, etc.).
Result:
The same CPU (core) but different Peripherals
Why CPU Registers?
Load parameters into CPU registers, execute the operation and store the result in memory.
User’s view of CORTEX M3 CPU Registers
• General Purpose registers R0 to R12 are used to
store data and addresses.
• Stack Pointer (SP) controls stack memory processes
such as PUSH and POP
• Link Register (LR) is used to store the return
program counter value when a subroutine is
called
• Program Counter (PC) stores address of the
current instruction
• Program Status Register (PSR) stores flags
(i.e. single bits) that represent the current
status of CPU
ARM organization (simplified)
Load & Store Architecture
ALL Data operations performed on CPU Registers only
Instruction Pipeline
Stalled Pipeline
To speed up execution, (partially) unroll the loops:
do {
    ++count;
    ++count;
} while (count < 21);
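The effect of unrolling can be checked by counting how often the loop condition (a compare plus a conditional branch) is evaluated. The following is a minimal sketch, assuming the counter starts at 0; the struct and function names are hypothetical and not part of the slides.

```c
#include <assert.h>

/* Final counter value and number of times the loop condition was tested. */
struct loop_stats {
    int count;
    int tests;
};

/* Rolled loop: one increment per condition test. */
static struct loop_stats run_rolled(int start, int limit)
{
    struct loop_stats s = { start, 0 };
    do {
        ++s.count;
        ++s.tests;          /* one compare + conditional branch */
    } while (s.count < limit);
    return s;
}

/* Loop unrolled by two: two increments per condition test. */
static struct loop_stats run_unrolled2(int start, int limit)
{
    struct loop_stats s = { start, 0 };
    do {
        ++s.count;
        ++s.count;
        ++s.tests;          /* one compare + conditional branch */
    } while (s.count < limit);
    return s;
}
```

With start 0 and limit 21, the rolled loop tests the condition 21 times and the unrolled loop only 11 times. Note that the unrolled version finishes with the counter at 22 rather than 21, because 21 is not a multiple of the unroll factor; a real unrolled loop has to handle such a remainder separately.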
Memory Map
CPU registers are accessed by their name (e.g. R1, LR).
Other registers, including memories, are accessed by address values.
Addresses are 32 bit unsigned integers and form a linear 4 GB (2^32 bytes)
address space called the Memory Map.
Not all addresses are implemented!
Storage                           Accessed Memory (STM32)   Location
Code                              Flash ROM                 0x00000000 - 0x1FFFFFFF
Data                              SRAM                      0x20000000 - 0x3FFFFFFF
Peripheral Data & Configuration   On-chip Hardware          0x40000000 - 0x5FFFFFFF
External                          Ext Memory                0x60000000 - 0x9FFFFFFF
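The address ranges above can be turned into a small lookup. This is a sketch only; the enum and function names are hypothetical, and addresses above 0x9FFFFFFF are lumped together as "other" because the table does not describe them.

```c
#include <assert.h>
#include <stdint.h>

/* Regions of the memory map, named after the rows of the table. */
enum mem_region {
    REGION_CODE,        /* Flash ROM        0x00000000 - 0x1FFFFFFF */
    REGION_SRAM,        /* SRAM             0x20000000 - 0x3FFFFFFF */
    REGION_PERIPHERAL,  /* On-chip hardware 0x40000000 - 0x5FFFFFFF */
    REGION_EXTERNAL,    /* External memory  0x60000000 - 0x9FFFFFFF */
    REGION_OTHER        /* not covered by the table */
};

/* Classify a 32 bit address by the region it falls into. */
static enum mem_region classify_address(uint32_t addr)
{
    if (addr <= 0x1FFFFFFFu) return REGION_CODE;
    if (addr <= 0x3FFFFFFFu) return REGION_SRAM;
    if (addr <= 0x5FFFFFFFu) return REGION_PERIPHERAL;
    if (addr <= 0x9FFFFFFFu) return REGION_EXTERNAL;
    return REGION_OTHER;
}
```

Falling into a region does not mean the address is implemented: a given chip populates only part of each range.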
Microcontroller Programming Paradigm
1. Decide what peripheral you want to use
2. Look in datasheet for the registers to enable and
configure
3. Set bits in the registers to make peripheral behave
the way you want
4. GOTO 1
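Step 3 usually comes down to three operations on a memory-mapped register: set bits, clear bits, and write a multi-bit field. The sketch below assumes a hypothetical control register layout (an enable bit at position 0 and a 2 bit mode field at position 4); real names and bit positions must be taken from the datasheet. The helpers take a pointer so they can be tried on an ordinary variable as well as on a real register address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register layout, for illustration only. */
#define CTRL_ENABLE    (1u << 0)                /* peripheral enable bit */
#define CTRL_MODE_POS  4u                       /* position of mode field */
#define CTRL_MODE_MSK  (0x3u << CTRL_MODE_POS)  /* 2 bit mode field */

/* Set the bits given in mask, leaving all other bits unchanged. */
static void reg_set_bits(volatile uint32_t *reg, uint32_t mask)
{
    *reg |= mask;
}

/* Clear the bits given in mask, leaving all other bits unchanged. */
static void reg_clear_bits(volatile uint32_t *reg, uint32_t mask)
{
    *reg &= ~mask;
}

/* Replace a multi-bit field: clear it first, then insert the new value. */
static void reg_write_field(volatile uint32_t *reg, uint32_t mask,
                            uint32_t pos, uint32_t value)
{
    *reg = (*reg & ~mask) | ((value << pos) & mask);
}
```

On hardware the pointer would be a fixed address in the peripheral region cast to `volatile uint32_t *`; `volatile` keeps the compiler from optimizing the accesses away.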
Instruction Set Architecture - ISA
A microprocessor’s ISA provides the programmer’s overview of
the:
• data types used
• type of machine instructions
• different addressing modes and memory access
• CPU registers and their role
• accessing peripherals
• operation of interrupts
x86 ISA example
The x86 ISA processors will all run the SAME user code.
Because the PC architecture will be implemented in hardware in different ways,
the processor’s performance (such as execution speed and power consumption)
will differ widely.
• High-end performance processors (e.g. Intel Core i7) will have specific hardware
components to perform many common operations, several fast memory caches,
fast data and address buses, several parallel CPUs etc.
• Medium-end performance processors (e.g. Intel Core 2) will have microprogrammed components that are slower by an order of magnitude.
• Low-end performance processors (e.g. Intel Atom) will have to perform some CPU
operations (e.g. memory access and arithmetic) using software routines rather
than in hardware.
CORTEX-M Machine Instructions (Thumb-2)
• Instruction length can be either 16 or 32 bits
• Instructions are fetched from Flash Memory or SRAM
• Instruction memory alignment is on a Half Word (16 bits)
• Reduced Instruction Set Computer (RISC)
There are about 100 instructions
Most instructions offer the option of conditional execution
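Because instructions are half word aligned and either 16 or 32 bits long, the CPU can tell the length from the first half word alone. In the Thumb-2 encoding a first half word whose top five bits are 11101, 11110 or 11111 starts a 32 bit instruction; anything else is a complete 16 bit instruction. A minimal sketch of that rule (the function name is hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/* Return the length in bytes (2 or 4) of the Thumb-2 instruction
 * that starts with the given half word. */
static int thumb_instruction_length(uint16_t first_halfword)
{
    unsigned top5 = first_halfword >> 11;   /* bits 15..11 */

    if (top5 == 0x1Du || top5 == 0x1Eu || top5 == 0x1Fu) {
        return 4;   /* first half of a 32 bit instruction */
    }
    return 2;       /* complete 16 bit instruction */
}
```

For example, 0xE7FE (a 16 bit branch to itself) gives 2, while 0xF000 (the first half of a 32 bit branch-with-link) gives 4.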
Machine Instruction Types
Instruction Type         Mnemonic Example   Description          Frequency
Data Movement            MOV                R ← R                80%
                         STR                SRAM ← R
                         LDR                R ← SRAM
Arithmetic & Logic Ops   ADD                R ← Rs1 + Rs2        10%
Flow Control             B                  Branch to address    10%
Anatomy of Assembler Commands
Instructions:
translated to binary machine code by Assembler
<label>   opcode   <dest, src1, .....>    ; comment
start     ldr      R2, # 0x3456789A       ; R2 ← 0x3456789A
          ldr      R1, x                  ; R1 ← x
          adds     R0, R1, R2             ; R0 ← R1 + R2
          str      R0, y                  ; value in R0 stored in memory address y
Directives:
provide Assembler with information e.g. values of symbols and code/data address placement:
<label>   directive   parameter      ; comment
x         equ         0x20000004     ; x ≡ 0x20000004
y         equ         0x20000008     ; y ≡ 0x20000008
x         dcb         0xdeadbeef     ; stores value 0xdeadbeef in memory address x
y         db                         ; reserves 4 bytes in SRAM starting at y
NOTES:
• Labels ALWAYS represent memory addresses. They can be symbolic names or numbers.
• The opcode is a user mnemonic for the binary coding of the instruction type.
• The number of parameters varies between 0 and 3, depending on the instruction type.
• Directives direct the assembler. They do not translate to machine code!
• Some (pseudo) instructions convert to a sequence of machine instructions.
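The four-instruction sequence above (load, load, add, store) can be modelled in C to make the load & store pattern explicit. This is a sketch under stated assumptions: SRAM is simulated by a small array (its size is arbitrary), `ldr R1, x` is read as "load the word stored at address x", and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SRAM_BASE   0x20000000u
#define SRAM_WORDS  16u                 /* arbitrary size for the simulation */

static uint32_t sram[SRAM_WORDS];       /* simulated SRAM */

/* Load a 32 bit word from a word-aligned simulated SRAM address. */
static uint32_t load_word(uint32_t addr)
{
    assert(addr >= SRAM_BASE && addr % 4u == 0u);
    assert((addr - SRAM_BASE) / 4u < SRAM_WORDS);
    return sram[(addr - SRAM_BASE) / 4u];
}

/* Store a 32 bit word to a word-aligned simulated SRAM address. */
static void store_word(uint32_t addr, uint32_t value)
{
    assert(addr >= SRAM_BASE && addr % 4u == 0u);
    assert((addr - SRAM_BASE) / 4u < SRAM_WORDS);
    sram[(addr - SRAM_BASE) / 4u] = value;
}

/* Model of: ldr R2,#const / ldr R1,x / adds R0,R1,R2 / str R0,y */
static uint32_t run_sequence(void)
{
    const uint32_t x = 0x20000004u;     /* x equ 0x20000004 */
    const uint32_t y = 0x20000008u;     /* y equ 0x20000008 */

    store_word(x, 0xdeadbeefu);         /* data placed at x by the directive */

    uint32_t r2 = 0x3456789Au;          /* R2 <- constant */
    uint32_t r1 = load_word(x);         /* R1 <- memory at x */
    uint32_t r0 = r1 + r2;              /* R0 <- R1 + R2 (wraps modulo 2^32) */
    store_word(y, r0);                  /* memory at y <- R0 */

    return load_word(y);
}
```

All arithmetic happens on the local variables standing in for registers; memory is touched only by the explicit load and store calls.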
Addressing Memory
Depending on what information is encoded (included) in the machine code:
• Immediate: Constant Data
e.g. MOV R2, #0x12
• Register: CPU Register
e.g. MOV R2, R1
• Register Indirect: Memory Address in Register + constant offset
e.g. LDR R2, [R1, #0x25]
Note:
THUMB 2 branch instructions use PC indirect (relative) addressing mode with respect to
the current value of PC
e.g.
loop   b loop   =>   b PC+1
The true offset is 0x00 but the bit #0 has been set by assembler for THUMB2 execution.
Loading Registers with Constants
ONLY limited support for Immediate Addressing for small and special constants
EXAMPLE:
MOV R1, #0xEF          ; OK: small enough constant to fit into 16 bit machine code format
MOV R1, #0xDEADBEEF    ; ??: may not be possible for all 32 bit constants
LDR R1, =0xDEADBEEF    ; OK: pseudo-instruction LDR R1,=const will generate PC relative
                       ;     addressing instruction format with reference to the
                       ;     constant stored at a nearby ROM location
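Which constants count as "small and special" can be made concrete. The sketch below assumes the Thumb-2 "modified immediate" rule from the ARMv7-M architecture manual: a 32 bit MOV immediate must be either one byte repeated in a fixed pattern, or an 8 bit value with its top bit set shifted left by 1 to 24 bits. It deliberately ignores other ways an assembler may encode a constant (for example the 16 bit immediate form MOVW or substituting MVN), and the function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return true if v fits the Thumb-2 modified immediate format
 * (simplified check, see the assumptions in the text). */
static bool fits_thumb2_modified_imm(uint32_t v)
{
    uint32_t lo = v & 0xFFu;            /* byte in bits 7..0  */
    uint32_t hi = (v >> 8) & 0xFFu;     /* byte in bits 15..8 */

    /* Byte replication patterns. */
    if (v == lo) return true;                           /* 0x000000XY */
    if (v == (lo | (lo << 16))) return true;            /* 0x00XY00XY */
    if (v == ((hi << 8) | (hi << 24))) return true;     /* 0xXY00XY00 */
    if (v == lo * 0x01010101u) return true;             /* 0xXYXYXYXY */

    /* 8 bit value with its top bit set, shifted left by 1..24 bits. */
    for (unsigned shift = 1; shift <= 24; ++shift) {
        uint32_t window = v >> shift;
        if ((window << shift) == v && window >= 0x80u && window <= 0xFFu) {
            return true;
        }
    }
    return false;
}
```

Under these assumptions 0xEF is accepted and 0xDEADBEEF is rejected, which is why the LDR R1,=const pseudo-instruction falls back to loading the value from a nearby ROM location.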
Example: Using Pseudo Instruction
Machine code generated (on the right) for the LDR pseudo-instruction (left)
       LDR R1,=0xDEADBEEF     →     LDR R1, [PC, #4]
loop   add R1, R1, #1
       B loop                 →     B PC-2
                                    DC32 0xDEADBEEF
Notes:
• PC stands for the current value of the Program Counter, which always points at the address of the current instruction.
• The value PC + #4 in the LDR instruction points to the numerical constant 0xDEADBEEF placed in the ROM
• The - 2 offset in the branch instruction is 0xFE in 2’s complement but because the bit #0 is also set for
THUMB2 mode the actual machine code parameter is 0xFD .
Addressing Memory- Example
           ORG   0x00000204       ; Directive:
                                  ; Set the Assembler Memory Counter to 0x00000204
           MOV   R0, #my_const    ; Immediate Address: R0 <= 0x000000FF
           LDR   R1, =MY_LOOP     ; Pseudo-Instruction using PC relative addressing
                                  ; value 0x10004000 loaded into R1
           MOV   R2, my_data      ; Direct address: 0xDEADBEEF loaded into R2
           ADD   R3, R0, R2       ; Register addressing: R3 <= R0 + R2
           JNE   [R1]             ; Register Relative: Jump on Non-Zero to 0x10004000
           STR   R2, [R1 + 0x4]   ; Register Relative with Offset:
                                  ; store R2 at Memory Address 0x10004004
; Data Definitions
MY_LOOP    EQU   0x10004000       ; Directive: Define label (text replacement)
my_const   EQU   0x000000FF
           ORG   0x20004000       ; Directive: Set the Assembler Memory Counter
my_data    DC32  0xDEADBEEF       ; Directive:
                                  ; Reserve and initialize 4 bytes in RAM at current
                                  ; memory counter value
my_result  DC32                   ; Assembler Directive:
                                  ; Reserve 32 bits in RAM at current memory counter value
           END                    ; Directive: End of Source File
Source Code
R1 - internal Register stores value of variable counter
PC - Program Counter Register
In Little Endian Storage
Change of Flow Control
The code needs a comparison and a jump instruction.
Question:
How many jumps altogether? Answ: 22
How many comparisons? Answ: 22
Conditional Branch Instruction
BLT.N
Branch If Less Than, OFFSET = 0xFC (-4 in 8 bit 2’s complement)
PC <- PC + OFFSET
or
PC <- 0x10C - 4 = 0x108
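The arithmetic behind this branch is a sign extension followed by an addition. The sketch below follows the simplified model of the slide, where the 8 bit offset is a byte count added directly to PC; real Thumb encodings store branch offsets in half words, a detail left out here. Function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Interpret an 8 bit field as a signed 2's complement number. */
static int32_t sign_extend8(uint8_t field)
{
    return (field & 0x80u) ? (int32_t)field - 256 : (int32_t)field;
}

/* New PC after a taken PC-relative branch (simplified model:
 * the offset is a signed byte count added to the current PC). */
static uint32_t branch_target(uint32_t pc, uint8_t offset_field)
{
    return pc + (uint32_t)sign_extend8(offset_field);
}
```

With the values from the slide, sign_extend8(0xFC) is -4 and branch_target(0x10C, 0xFC) is 0x108.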
Code Optimization
Original:
while (counter < 21) {
    ++counter;
}
Optimized:
do {
    ++counter;
} while (counter < 21);
The optimized code is faster because the loop has one less instruction (no need for
an unconditional jump instruction).
Jumps slow down execution because they break the pipeline.
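The difference between the two loop forms can be counted by writing them with explicit jumps, mirroring how they might be laid out in machine code. This is a sketch under assumptions: the counter starts at 0, the while loop is laid out as test, conditional exit, body, unconditional jump back, and the do-while loop as body, test, conditional jump back. A compiler may choose a different layout; the names are hypothetical.

```c
#include <assert.h>

/* Final counter value, comparisons executed and jumps actually taken. */
struct flow_stats {
    int counter;
    int comparisons;
    int jumps_taken;
};

/* while (counter < limit) { ++counter; } */
static struct flow_stats run_while(int limit)
{
    struct flow_stats s = { 0, 0, 0 };
top:
    ++s.comparisons;                    /* compare counter with limit */
    if (!(s.counter < limit)) {
        ++s.jumps_taken;                /* conditional jump out of the loop */
        goto done;
    }
    ++s.counter;
    ++s.jumps_taken;                    /* unconditional jump back to the test */
    goto top;
done:
    return s;
}

/* do { ++counter; } while (counter < limit); */
static struct flow_stats run_do_while(int limit)
{
    struct flow_stats s = { 0, 0, 0 };
top:
    ++s.counter;
    ++s.comparisons;                    /* compare counter with limit */
    if (s.counter < limit) {
        ++s.jumps_taken;                /* conditional jump back to the body */
        goto top;
    }
    return s;
}
```

For a limit of 21 the while form executes 22 comparisons and takes 22 jumps, matching the answers given earlier, while the do-while form executes 21 comparisons and takes 20 jumps. Both leave the counter at 21.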