‘C’ for Microcontrollers,
Just Being Efficient
Lloyd Moore, President
Lloyd@CyberData-Robotics.com
www.CyberData-Robotics.com
Agenda
- Microcontroller Resources
- Knowing Your Environment
- Memory Usage
- Code Structure
- Interrupts
- Math Tricks
- Optimization

Disclaimer
- Some microcontroller techniques necessarily trade one benefit for another, typically gaining lower resource usage at the cost of maintainability
- The point of this presentation is to show various techniques that can be used as needed
- Use these suggestions when necessary
- Feel free to suggest better solutions as we go along
Microcontroller Resources
- EVERYTHING resides on one die inside one package: RAM, Flash, Processor, I/O
- Cost is a MAJOR design consideration
  - Typical costs are $0.25 to $25 each (in quantities of 1000s)
- RAM: 16 BYTES to 32K Bytes typical
- Flash/ROM: 384 BYTES to 256K Bytes
- Clock Speed: 4 MHz to 80 MHz typical
  - Much lower for battery saving modes (32 kHz)
- Bus is 8, 16, or 32 bits wide (just like the old days)
Other Considerations
- Specialized resources often present
  - Counters, UART, USB PHY, LCD Controller
  - May have hardware centric API, or just raw registers!
- No floating point hardware
  - May have other math hardware (MAC, CRC)
- Portability inside families a big concern
  - Across families, not so much
- Typically no operating system present
- No protected memory / MMU
  - Do have specialized memory segments
Power Consumption
- Microcontrollers typically used in battery operated devices
- Power requirements can be EXTREMELY tight
  - Energy harvesting applications
  - Long term battery installations (remote controls, hard to reach devices, etc.)
- EVERY instruction executed consumes power, even if you have the time!
Know Your Environment
- Traditionally we ignore hardware details
- Need to tailor code to hardware available
  - Specialized hardware MUCH more efficient
- Compilers typically have extensions
  - Interrupt: specifies code as being an ISR
  - Memory model: may handle banked memory and/or simultaneous access banks
  - Multiple data pointers / address generators
- Debugger may use some resources
Memory Usage
- Use ‘const’ to put data into program memory
- Alignment / padding issues
  - Typically NOT an issue, non-aligned access ok
- Avoid dynamic memory allocation
  - Takes extra space and processing time
  - Memory fragmentation a big issue
- Use and reuse static buffers
  - Reduces variable passing overhead
  - Allows for smaller / faster code due to reduced indirections
  - Does bring back overwrite bugs if not done carefully
- Use the appropriate variable type
  - Don’t use int and double for everything!!
  - Affects processing time as well as storage
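As a rough illustration of the memory points above, the sketch below keeps a table in ‘const’ storage, reuses one static buffer instead of allocating, and sticks to 8-bit types where they fit. All names (kSquares, g_scratch, square_small, format_channel) are made up for this example, and whether ‘const’ data really lands in program memory depends on the compiler and target.

```c
#include <stdint.h>
#include <string.h>

/* const table: many embedded toolchains place this in program memory
   (an assumption; some targets need an extra keyword to do so). */
static const uint8_t kSquares[8] = { 0, 1, 4, 9, 16, 25, 36, 49 };

/* One statically allocated scratch buffer, reused by every caller.
   No heap and no fragmentation, but old contents are overwritten. */
#define SCRATCH_SIZE 16u
static char g_scratch[SCRATCH_SIZE];

/* Returns n squared for n in 0..7, or 0 if n is out of range. */
uint8_t square_small(uint8_t n)
{
    return (n < 8u) ? kSquares[n] : 0u;
}

/* Writes "ch=D" (D = last decimal digit of channel) into the shared buffer. */
const char *format_channel(uint8_t channel)
{
    memset(g_scratch, 0, SCRATCH_SIZE);
    g_scratch[0] = 'c';
    g_scratch[1] = 'h';
    g_scratch[2] = '=';
    g_scratch[3] = (char)('0' + (channel % 10u));
    return g_scratch;
}
```

The shared buffer is exactly where the overwrite bugs mentioned above come from: a second call to format_channel() changes the text a previous caller may still be pointing at.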
Char vs. Int Increment on 8051

    char cX;
    cX++;

    000A  900000    MOV    DPTR,#cX
    000D  E0        MOVX   A,@DPTR
    000E  04        INC    A
    000F  F0        MOVX   @DPTR,A

    6 Bytes of Flash
    4 instructions

    int iX;
    iX++;

    0000  900000    MOV    DPTR,#iX
    0003  E4        CLR    A
    0004  75F001    MOV    B,#01H
    0007  120000    LCALL  ?C?IILDX

    10 Bytes of Flash + subroutine overhead
    Many more than 4 instructions once the LCALL is followed
Code Structure
- Count down instead of up
  - Saves a subtraction on all processors
  - DJNZ style instruction on some processors
- Pointers vs. array notation
  - Generally better using pointers
- Bit Shifting
  - May not always generate what you think
  - May or may not have barrel shifter hardware
  - May or may not have logical vs. arithmetic shifts
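A minimal sketch of the first two points, assuming a simple byte-summing routine (sum_bytes is a made-up name): the counter runs down to zero and the data is walked with a pointer rather than indexed. Whether this actually produces smaller code depends on the compiler, so check the listing.

```c
#include <stdint.h>

/* Sums 'count' bytes. Counting down lets the loop test be a plain
   "is it zero yet?" check, which maps well onto DJNZ style instructions.
   The pointer is advanced instead of recomputing data[i] each pass. */
uint16_t sum_bytes(const uint8_t *data, uint8_t count)
{
    uint16_t total = 0;
    while (count != 0u) {
        total = (uint16_t)(total + *data++);
        count--;
    }
    return total;
}
```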
Shifting Example

    cX = cX << 3;

    0006  33        RLC    A
    0007  33        RLC    A
    0008  33        RLC    A
    0009  54F8      ANL    A,#0F8H

    cA = 3;
    cX = cX << cA;

    000B  900000    MOV    DPTR,#cA
    000E  E0        MOVX   A,@DPTR
    000F  FE        MOV    R6,A
    0010  EF        MOV    A,R7
    0011  A806      MOV    R0,AR6
    0013  08        INC    R0
    0014  8002      SJMP   ?C0005
    0016          ?C0004:
    0016  C3        CLR    C
    0017  33        RLC    A
    0018          ?C0005:
    0018  D8FC      DJNZ   R0,?C0004

- Constants turn into separate statements
- Variables turn into loops
- Both of these can be one instruction with a barrel shifter
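The C side of the shifting example can be sketched as below; the function names are invented for illustration. Using unsigned types keeps right shifts logical (zero fill), since right-shifting a negative signed value is implementation-defined in C.

```c
#include <stdint.h>

/* Shift by a compile-time constant: without a barrel shifter this
   typically becomes a fixed run of single-bit shifts or rotates. */
uint8_t shift_left_3(uint8_t x)
{
    return (uint8_t)(x << 3);
}

/* Shift by a run-time amount: typically compiled as a loop.
   Amounts of 8 or more are defined here to give 0. */
uint8_t shift_left_n(uint8_t x, uint8_t n)
{
    return (n < 8u) ? (uint8_t)(x << n) : 0u;
}

/* Unsigned operand, so the vacated top bit is always filled with 0. */
uint8_t logical_shift_right_1(uint8_t x)
{
    return (uint8_t)(x >> 1);
}
```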
More Code Structure
- Actual parameters typically passed in registers if available
  - Keep function parameters to fewer than 3
  - May also be passed on stack or special parameter area
  - May be more efficient to pass pointer to struct
- Global variables
  - While generally frowned upon for most code, can be very helpful here
  - Typically ends up being a direct access
- Read assembly code for critical areas
- Know which optimizations are present
  - Small compilers do not always have common optimizations
  - Inline, loop unrolling, loop invariant, pointer conversion
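One way to keep the parameter count low, sketched under the assumption of a hypothetical motor command (the struct, its fields, and command_checksum are all invented for this example), is to group the values and pass a single pointer:

```c
#include <stdint.h>

/* Hypothetical settings grouped so one pointer replaces four arguments. */
typedef struct {
    uint8_t  channel;
    uint8_t  direction;   /* 0 = forward, 1 = reverse */
    uint16_t speed;
    uint16_t ramp_ms;
} MotorCommand;

/* Stand-in for real work: returns a simple sum of the fields.
   Only one pointer has to fit in the parameter registers. */
uint16_t command_checksum(const MotorCommand *cmd)
{
    return (uint16_t)(cmd->channel + cmd->direction + cmd->speed + cmd->ramp_ms);
}
```

Whether the pointer form wins depends on how expensive pointer access is on the target, which is another reason to read the generated assembly.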
Indexed Array vs Pointer on M8C

    ucMode = g_Channels[uc_Channel].ucMode;

    01DC  52FC      mov    A,[X-4]
    01DE  5300      mov    [__r1],A
    01E0  5000      mov    A,0
    01E2  08        push   A
    01E3  5100      mov    A,[__r1]
    01E5  08        push   A
    01E6  5000      mov    A,0
    01E8  08        push   A
    01E9  5007      mov    A,7
    01EB  08        push   A
    01EC  7C0000    xcall  __mul16
    01EF  38FC      add    SP,-4
    01F1  5F0000    mov    [__r1],[__rX]
    01F4  5F0000    mov    [__r0],[__rY]
    01F7  060000    add    [__r1],<_g_Channels
    01FA  0E0000    adc    [__r0],>_g_Channels
    01FD  3E00      mvi    A,[__r1]
    01FF  5403      mov    [X+3],A

    ucMode = pChannel->ucMode;

    01ED  5201      mov    A,[X+1]
    01EF  5300      mov    [__r1],A
    01F1  3E00      mvi    A,[__r1]
    01F3  5405      mov    [X+5],A

- Does the same thing
- Saves 29 bytes of memory AND a call to a 16 bit multiplication routine!
- Pointer version will be at least 4x faster to execute as well, maybe 10x
- Most compilers are not this bad, but you do find some!
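In source form, the pointer version amounts to computing the element address once and reusing it. The sketch below assumes a 7-byte channel record (suggested by the multiply by 7 in the indexed listing); every field other than ucMode, and both function names, are invented for illustration.

```c
#include <stdint.h>

#define NUM_CHANNELS 4u

/* Assumed 7-byte record; only ucMode appears in the original example. */
typedef struct {
    uint8_t ucMode;
    uint8_t ucLevel;
    uint8_t ucFlags;
    uint8_t aucReserved[4];
} Channel;

static Channel g_Channels[NUM_CHANNELS];

/* The index-to-address calculation happens once, then each field
   access goes through the pointer. */
void reset_channel(uint8_t uc_Channel, uint8_t ucMode)
{
    Channel *pChannel = &g_Channels[uc_Channel];
    pChannel->ucMode  = ucMode;
    pChannel->ucLevel = 0u;
    pChannel->ucFlags = 0u;
}

uint8_t channel_mode(uint8_t uc_Channel)
{
    return g_Channels[uc_Channel].ucMode;
}
```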
Interrupts
- Generally implemented as individual hardware vectors with a small amount of program memory at the location
- ISR is what you get: no OS, no threads, no IST (interrupt service thread)
  - Can use a flag with main loop to get IST behavior for less time critical code
- Also very common to use interrupts to simulate threads
  - Interrupt itself takes the place of the WaitFor_XXX or signal
  - Follows very naturally for hardware tasks and timers
- Generally an “interrupt” keyword or statement is provided by the compiler
Interrupt Example

    //volatile: written by the ISR, read by the main loop
    static volatile unsigned char g_TimerTriggered;

    void main()
    {
        ConfigureTimer0();
        g_TimerTriggered = 0;
        GlobalEnableInterrupt();
        while(1)
        {
            if(g_TimerTriggered)
            {
                //Could also disable the timer interrupt here
                //to avoid a race condition resetting g_TimerTriggered
                g_TimerTriggered = 0;
                DoTimerTask();
            }
            //Can put optional sleep here, interrupts can wake up processor
        }
    }

    //Keil C51 style syntax: interrupt number 1, using register bank 2
    void Timer0ISR(void) interrupt 1 using 2
    {
        g_TimerTriggered = 1;
        //Can put other small, quick work here
    }
Switch Statement Implementation
- Switch statements can be implemented in various ways
  - Sequential compares
  - In line table look up for case block
  - Special function with look up table
- Specific implementation can also vary based on the case clauses
  - Clean sequence (1, 2, 3, 4, 5)
  - Gaps in sequence (1, 10, 30, 255)
  - Ordering of sequence (5, 4, 1, 2, 3)
- Knowing which method gets implemented is critical to optimizing!
Switch Statement Example

    switch(cA)
    {
        case 0:
            cX = 4;
            break;
        case 1:
            cX = 10;
            break;
        case 2:
            cX = 30;
            break;
        default:
            cX = 0;
            break;
    }

    0006  900000    MOV    DPTR,#cA
    0009  E0        MOVX   A,@DPTR
    000A  FF        MOV    R7,A
    000B  EF        MOV    A,R7
    000C  120000    LCALL  ?C?CCASE
    000F  0000      DW     ?C0003
    0011  00        DB     00H
    0012  0000      DW     ?C0002
    0014  01        DB     01H
    0015  0000      DW     ?C0004
    0017  02        DB     02H
    0018  0000      DW     00H
    001A  0000      DW     ?C0005
    001C          ?C0002:
    001C  900000    MOV    DPTR,#cX
    001F  7404      MOV    A,#04H
    0021  F0        MOVX   @DPTR,A
    0022  8015      SJMP   ?C0006
    ...More blocks follow for each case
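When every case only assigns a constant, as in this example, a hand-written lookup table is one possible alternative to the switch. This is a sketch, not what the compiler generates; kValueForCase and value_for are made-up names.

```c
#include <stdint.h>

/* Same mapping as the switch above: 0 -> 4, 1 -> 10, 2 -> 30. */
static const uint8_t kValueForCase[3] = { 4u, 10u, 30u };

/* Range check plays the role of the default clause (result 0). */
uint8_t value_for(uint8_t cA)
{
    return (cA < 3u) ? kValueForCase[cA] : 0u;
}
```

This only works for a clean, gap-free sequence of case values; with gaps the table grows or needs a search, so compare the listings before committing to it.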
Bit Variables
- Some processors have special memory areas and op-codes for single bit storage
- Saves overhead of masking operations
- Some compilers key off bit field notation, some need a keyword (frequently ‘bit’)

    struct {
        unsigned int foo : 1;
    } flags;

    unsigned int my_bit : 1;

    bit my_bit;
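For comparison, here is a hedged sketch of the same idea in portable C, with a standard bit field next to the manual masking that dedicated bit instructions avoid. The type, macro, and function names are invented for this example.

```c
#include <stdint.h>

/* Bit-field version: the compiler generates any masking needed. */
typedef struct {
    unsigned int ready : 1;
    unsigned int error : 1;
} StatusBits;

/* Manual version: one byte of flags with explicit masks. */
#define FLAG_READY 0x01u
#define FLAG_ERROR 0x02u

uint8_t set_flag(uint8_t flags, uint8_t mask)
{
    return (uint8_t)(flags | mask);
}

uint8_t clear_flag(uint8_t flags, uint8_t mask)
{
    return (uint8_t)(flags & (uint8_t)~mask);
}

/* Returns 1 if any bit in mask is set, otherwise 0. */
uint8_t test_flag(uint8_t flags, uint8_t mask)
{
    return (uint8_t)((flags & mask) != 0u);
}
```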
Math Tricks
- Floating point math VERY expensive on microcontrollers
  - No hardware support
  - Typically 32 bits for float, 64 bits for double
  - Support provided by a BIG library
- Can use fixed point math in many cases
  - Basically the same as integer math, however move the decimal point inside the integer
  - Binary number is really:
      2^7 + 2^6 + … + 2^2 + 2^1 + 2^0
  - To make a fixed point number just adjust the exponents:
      2^6 + 2^5 + … + 2^1 + 2^0 + 2^-1
      Note: 2^-1 = 0.5
  - Assume 8 bit value: Range = [0, 255]
  - Assume one binary decimal point
      XXXXXXX.X
      Range is now [0, 127.5]
  - All the internal math stays the same so long as only fixed point numbers with the same binary point location are used together! (Products and quotients additionally need a rescaling shift.)
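The 8 bit, one fractional bit format described above can be sketched as follows. The type and helper names are assumptions made for this example. Addition is ordinary integer addition, while multiplication produces two fractional bits and has to shift one back out.

```c
#include <stdint.h>

/* Unsigned 8-bit fixed point with 1 fractional bit (XXXXXXX.X):
   stored value = real value * 2, range 0 to 127.5 in steps of 0.5. */
typedef uint8_t fix7_1;

#define FIX_FRAC_BITS 1u

/* whole must be 0..127 to fit. */
fix7_1 fix_from_int(uint8_t whole)
{
    return (fix7_1)(whole << FIX_FRAC_BITS);
}

/* Drops the fractional bit (truncates toward zero). */
uint8_t fix_to_int(fix7_1 v)
{
    return (uint8_t)(v >> FIX_FRAC_BITS);
}

/* Same binary point on both operands: plain integer add.
   The caller is responsible for avoiding overflow. */
fix7_1 fix_add(fix7_1 a, fix7_1 b)
{
    return (fix7_1)(a + b);
}

/* The raw product has 2 fractional bits; shift one out to get back
   to the XXXXXXX.X format. The result is truncated, not rounded. */
fix7_1 fix_mul(fix7_1 a, fix7_1 b)
{
    uint16_t wide = (uint16_t)((uint16_t)a * b);
    return (fix7_1)(wide >> FIX_FRAC_BITS);
}
```

For example, 2.5 is stored as 5 and 1.5 as 3. Their sum is stored as 8 (4.0), and their product is stored as 7 (3.5, truncated from the exact 3.75).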
More Math Tricks
- You may not have multiply and/or divide ops!
- Decomposing operations can help
  - X * 5 = X * 4 + X
  - (X * 4) can become 2 shift left operations
- Formulas should also be restructured for math available:
  - Y = ax^2 + bx + c : 1 Pow or Mult, 2 Mult, 2 Add
  - Y = x (ax + b) + c : 2 Mult, 2 Add
- Lookup tables can be great for limited domain problems
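Each of the three tricks above can be written out in a few lines. The function names are invented, and the bit-count table is just one example of a limited domain problem, not something from the original slides.

```c
#include <stdint.h>

/* x * 5 as (x * 4) + x, with the multiply by 4 written as a shift. */
uint16_t times5(uint16_t x)
{
    return (uint16_t)((x << 2) + x);
}

/* a*x^2 + b*x + c restructured as x*(a*x + b) + c:
   two multiplies and two adds. */
int32_t quadratic(int32_t a, int32_t b, int32_t c, int32_t x)
{
    return x * (a * x + b) + c;
}

/* Limited domain lookup: number of set bits in a 4-bit value. */
static const uint8_t kBitsInNibble[16] = {
    0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4
};

/* Counts set bits in a byte with two table lookups and one add. */
uint8_t count_bits(uint8_t v)
{
    return (uint8_t)(kBitsInNibble[v & 0x0Fu] + kBitsInNibble[v >> 4]);
}
```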
Optimization
- Step 0: Before coding anything think about risk points and prototype unknowns!!!
- Step 1: Get it working!!
  - Fast but wrong is of no use to anyone
  - Optimization will typically reduce readability
- Step 2: Profile to know where to optimize
  - Usually only one or two routines are critical
  - You need to have specific performance metrics to target
Optimization
- Step 3: Let the tools do as much as they can
  - Turn off debugging!
  - Select the correct memory model
  - Select the correct optimization level
- Step 4: Do it manually
  - Read the generated code! Might be able to make a simple code or structure change.
  - Last: think about assembly coding

Summary
- Microcontroller hardware is much simpler than most of us are used to
- Be familiar with the hardware in your microcontroller
- Be familiar with your compiler options and how the compiler translates your code
- For time or space critical code look at the assembly listing from time to time

Questions?