Code Optimization of PSoC® 1 Project when
using ImageCraft Compiler
AN60486
Author: Archana Yarlagadda
Associated Project: No
Associated Part Families: CY8C29x66, CY8C28xxx, CY8C27x43, CY8C24x94, CY8C24x23A,
CY8C24x33, CY8C23x33, CY8C21x34, CY8C21x23
Software Version: PSoC Designer™ 5.0
Associated Application Notes: AN2017, AN2218, AN2129
Application Note Abstract
This application note shows the basic guidelines of optimizing code using the ImageCraft compiler with PSoC Designer while
developing PSoC® 1 projects.
Introduction
This application note shows some methods for code
optimization for a project in PSoC Designer (PD) using the
ImageCraft compiler.
The Build tab in the Output status window reports the
amount of ROM and RAM used by the project. A portion of
the PSoC Designer build window showing the memory
usage appears in Figure 1.
Figure 1. Build Message showing ROM and RAM Usage
Several code optimizing methods to reduce the use of
ROM (also called Flash or code space) are shown in this
application note.
Function Calls in Interrupt Service Routines (ISR)
When using the ImageCraft compiler, function calls in
Interrupt Service Routines (ISRs) use more ROM. The
active state of the processor’s registers might change
when a function is called from an ISR. Consequently, these
registers need to be saved and restored for a proper jump
to and return from the ISR’s function call. The ImageCraft
C compiler uses up to 15 virtual registers to store
temporary data on the stack. They are r0, r1, r2, r3, r4,
r5, r6, r7, r8, r9, r10, r11, rX, rY, and rZ, and can be found
in the .mp file.
When a C function is called from an ISR, the compiler
saves (pushes onto the stack) and restores (pops from the
stack) all the virtual registers, since the registers used by
the called function are unknown to the ISR.
PSoC chips with RAM greater than 256 bytes use a
paging system. The programming model that handles the
paging scheme is called the Large Memory Model (LMM).
A single page model is known as the Small Memory Model
(SMM). In the case of larger PSoC chips with LMM, 4
page pointers are also saved and restored along with the
virtual registers. They are named CUR_PP, IDX_PP,
MVW_PP, and MVR_PP. For more information about the
LMM, refer to “Design Aids - Large Memory Model
Programming for PSoC” AN2218.
Consider the following simple example of inline code and
a function call from the ISR. Since the assignment in Code
1 can be done without use of any virtual registers, none
are saved.
July 27, 2010
Code 1
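BYTE bVar1;
/* Register SleepTimerHandler as a C interrupt handler. */
#pragma interrupt_handler SleepTimerHandler
void SleepTimerHandler(void)
{
    bVar1 = 1;  /* plain assignment: no virtual registers needed */
}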
If, however, the same functionality is implemented using
a function call as shown in Code 2, then all 15 virtual
registers and 4 page pointers must be saved and
restored. Each register requires the following additional
overhead code: MOV [2 bytes] + PUSH [1 byte] + POP
[1 byte] + MOV [2 bytes], for a total of 6 bytes per register.
With 19 registers, the Code 2 method takes an additional
114 bytes of code. There is also the additional call and
return code, which are comparatively negligible.
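The arithmetic above can be written out as a short sketch. The constants are taken from the text; the function name is hypothetical, and the Small Memory Model figure (15 registers, no page pointers) is an inference from the description rather than a number reported in this note.

```c
/* Sketch: estimate the extra ROM spent saving and restoring registers
 * when an ISR written in C calls a function. */
#define VIRTUAL_REGS   15u  /* r0..r11, rX, rY, rZ */
#define PAGE_POINTERS   4u  /* CUR_PP, IDX_PP, MVW_PP, MVR_PP (LMM only) */
#define BYTES_PER_REG   6u  /* MOV (2) + PUSH (1) + POP (1) + MOV (2) */

/* Hypothetical helper: pass non-zero for the Large Memory Model. */
unsigned int isr_call_overhead_bytes(unsigned int large_memory_model)
{
    unsigned int regs = VIRTUAL_REGS;

    if (large_memory_model)
    {
        regs += PAGE_POINTERS;  /* page pointers are saved only in the LMM */
    }
    return regs * BYTES_PER_REG;
}
```

For the LMM case this gives (15 + 4) * 6 = 114 bytes, matching the figure quoted above.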
Document No. 001-60486 Rev. **
Function calls from ISRs written in C should be avoided
to help optimize code size.
Figure 3. Map File in PSoC Designer
Code 2
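BYTE bVar1;
void TestFunc(void)
{
    bVar1 = 1;
}
/* Register SleepTimerHandler as a C interrupt handler. */
#pragma interrupt_handler SleepTimerHandler
void SleepTimerHandler(void)
{
    /* The call forces a save and restore of all virtual registers. */
    TestFunc();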
}
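One common way to keep function calls out of an ISR is to have the handler set a flag and let the main loop do the work. The following is a minimal sketch of that pattern, not a method described in this note: `bSleepTick` and `PollSleepTick` are hypothetical names, and the `BYTE` typedef stands in for the PSoC Designer type so that the fragment compiles on its own.

```c
typedef unsigned char BYTE;   /* stand-in for the PSoC Designer BYTE type */

volatile BYTE bSleepTick;     /* hypothetical flag set by the ISR */
BYTE bVar1;

/* On the target this handler would be registered with
 *   #pragma interrupt_handler SleepTimerHandler
 * Its body is a plain assignment, so the ISR makes no function call. */
void SleepTimerHandler(void)
{
    bSleepTick = 1;
}

void TestFunc(void)
{
    bVar1 = 1;
}

/* Called repeatedly from the main loop; runs TestFunc outside the ISR. */
void PollSleepTick(void)
{
    if (bSleepTick)
    {
        bSleepTick = 0;
        TestFunc();
    }
}
```

The trade-off is latency: the work runs the next time the main loop polls the flag rather than inside the interrupt.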
Relocatable Start Code Address
When a C program is compiled, the ImageCraft C compiler
converts the C files into assembly files. The assembler
then converts these into relocatable object files, which are
then mapped by the linker to obtain the executable .hex
file. The PSoC Designer IDE gives an option to
specify the address where the code has to be placed in
the .hex file, and thus relocated. This option is under
Project > Settings > Linker, and the popup window is
shown in Figure 2.
The map file shows the start and end address of different
areas. The boot section is shown by “TOP”, and the code
area is shown by “lit” in the .mp file shown in Figure 4. The
“Relocatable code start address” can be set to the End
address of the “TOP” area for the most efficient use of
space. For example, the “Relocatable code start address”
is set to 0x150 in Figure 2, and the end of the “TOP”
section in the map file was at 0xD1. Thus, by setting the right
value 127 (0x150 – 0xD1 = 0x7F) bytes were saved.
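The saving is simply the gap between the configured start address and the end of the boot area. A small sketch of that calculation follows; the function name is hypothetical and the addresses in the usage note are the ones quoted above.

```c
/* Sketch: ROM bytes recovered by moving the relocatable code start
 * address down to the end of the boot ("TOP") area. */
unsigned int relocation_bytes_saved(unsigned int current_start,
                                    unsigned int top_end)
{
    /* A start address below the end of the boot code is not usable,
     * so report no saving in that case. */
    if (current_start < top_end)
    {
        return 0u;
    }
    return current_start - top_end;
}
```

With a start address of 0x150 and a “TOP” end address of 0xD1, the function returns 0x7F, which is 127 bytes.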
Figure 2. Relocatable Start Code Address Selection
Figure 4. “TOP” and “lit” Area in .mp File
Boot code is not included in this area and is placed at the
start of memory. For efficient use of memory, the
relocatable code address should be placed immediately
following the end of boot code. The default value is set to
a higher value than required, to support all the devices
and can be changed using the above tool. The end of the
boot code can be found by looking into the map file (.mp)
that can be opened through PD as shown in Figure 3.
If the start address is set to a value lower than that
required for boot code, then the compiler reports an error
notifying the user that the code space already contains a
value. For example, if the value is set to 0xD0 and the boot
code ends at 0xD1, then the message “!E
psocconfigtbl.asm(112): Code address 0:0xd0 already
contains a value” is displayed during build.
Sublimation and Condensation
Condensation
PSoC Designer provides code compression tools that can
be set under Project > Settings > Compiler. The selection
window is shown in Figure 5.
When the Condensation option is chosen, a subroutine is
formed for segments of code that are repeated in a
project. Therefore rather than inline repetition of code, a
jump to a subroutine is added in the executable. A simple
piece of code, see Code 3, was repeated four times in a
test project to verify the Condensation option. With the
condensation option, 193 bytes were saved as shown in
Figure 7.
Figure 5. Sublimation and Condensation Selection
Code 3
iTest1 = 1;
iTest2 = iTest1 + 1;
iTest3 = iTest1 + 2;
iTest4 = iTest1 + 3;
iTest5 = iTest1 + 4;
Figure 7. ROM Saved with Condensation
Sublimation
When the sublimation option is chosen, PSoC Designer
deletes the unused user module APIs and thus saves
space.
In a test project, PGA and PWM user modules were
placed and started. The first compilation was with no
Sublimation and the second was with Sublimation. As
shown in Figure 6, 88 bytes of memory were saved for
these modules due to elimination of unused user module
APIs.
If a program has no repeated code and the Condensation
option is chosen, the following note is displayed in the
Build message: “program code in 'text' area too small for
worthwhile code compression”.
Treat const as ROM vs. const as RAM
This option can be accessed from Project > Settings >
Compiler, as shown in Figure 5. This is not a code
optimization technique for ROM. The “Treat const as
ROM” option handles constants in compliance with
standard C. The “Treat const as RAM” option is for
backward compatibility with previous versions of the
ImageCraft compiler. The “Treat const as ROM” selection
uses less RAM, and the ROM usage remains the same
with either selection.
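As an illustration, a constant table is the kind of data this setting affects. The sketch below assumes “Treat const as ROM” is selected, in which case the table is expected to occupy Flash only; the table contents and the names `abSegment` and `GetSegmentPattern` are hypothetical, and the `BYTE` typedef stands in for the PSoC Designer type.

```c
typedef unsigned char BYTE;   /* stand-in for the PSoC Designer BYTE type */

/* Hypothetical 7-segment patterns for the digits 0 to 9. Declared const,
 * so with "Treat const as ROM" the table should not consume RAM. */
const BYTE abSegment[10] =
{
    0x3F, 0x06, 0x5B, 0x4F, 0x66, 0x6D, 0x7D, 0x07, 0x7F, 0x6F
};

/* Returns the pattern for a digit, or 0 (all segments off) if out of range. */
BYTE GetSegmentPattern(BYTE bDigit)
{
    return (bDigit < 10) ? abSegment[bDigit] : 0;
}
```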
Figure 6. ROM Saved with Sublimation
In some cases, all the APIs are used in the project. In this
event, the following note is displayed in the Build
message: “no dead symbol found”.
Configuration Initialization Type
During startup of PSoC, all the initialization values (for
example, gain settings for PGAs, routing, and more) are
written into the configuration registers. This can be done
through two methods: “Loop” and “Direct Write”, as shown
in Figure 8. In the “Loop” selection, a table is created with
the register addresses and initial values. A function is used
to traverse the table and load the values into the
respective addresses. In the case of “Direct Write”, the
assignment is done through MOV instructions for each
register.
Direct Access vs. Index Addressing
“Direct access” addressing, as used for global or static
variables, is more efficient with the SMM. “Indexed
addressing”, as used for local variables, is more efficient
with the LMM. This is because in the LMM, the page
pointer is set every time a global or static variable is
accessed. Thus, when multiple variables are accessed in
the LMM, it is more ROM efficient to use local variables.
Figure 8. Configuration Initialization Selection
Optimization in Firmware
There are many general coding optimization techniques
that are not IDE, platform, or chip specific. This section
presents several of them.
Use of Unsigned Integers
When integer arithmetic is used in a program, it adds math
library functions into the code space as required.
Depending on the size and type of the variables used (8,
16 or 32 bit, signed or unsigned), different functions are
added to the code. The details about the byte usage of
these functions can be found in the “Libraries user guide”
in the PSoC Designer documentation folder
(Help > Documentation).
The difference in code between the “Loop” and “Direct
Write” configuration initialization selections can be
observed in the configuration file (PSoCConfigTBL.asm)
as shown in Figure 9. The compiler makes this change in
code based on the selection. The user does not need to
change anything in the configuration file.
The math functions for the M8C processor in PSoC are for
unsigned variables by default. When other variable types
are used, additional functions handle the
conversion and value checking. Thus, it is recommended
to use unsigned integers when possible. For example, the
use of unsigned integers instead of signed integers as
loop variables will help optimize the memory usage.
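A minimal sketch of this recommendation follows. The function name and buffer are hypothetical, and the `BYTE` typedef stands in for the PSoC Designer type; the point is only that the loop counter and the accumulator are both unsigned.

```c
typedef unsigned char BYTE;   /* stand-in for the PSoC Designer BYTE type */

/* Sums the first bCount entries of a buffer. */
unsigned int SumBuffer(const BYTE *pBuf, BYTE bCount)
{
    BYTE i;                   /* unsigned 8-bit loop variable */
    unsigned int wSum = 0;    /* unsigned accumulator */

    for (i = 0; i < bCount; i++)
    {
        wSum += pBuf[i];
    }
    return wSum;
}
```

How many bytes this saves compared with a signed counter depends on the compiler version and memory model, so it is worth comparing the ROM figures in the Build message for both variants.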
Shift and Add in Place of Multiply or Divide
Figure 9. Code Difference between Loop and Direct Write
Some math library functions can be kept out of the code
space. For unsigned integers, one technique is to use a
bitwise shift and add in place of a multiply or divide. For
unsigned integers, a single bitwise shift right is equivalent
to division by 2, and a shift left is equivalent to
multiplication by 2. By using shift and add, as shown in
Code 5, the multiplication and division functions can be
avoided in a few cases.
The “Loop” and “Direct Write” configuration initialization
methods differ in memory usage in two respects. First, the
Loop method occupies a fixed amount of memory for the
traverse function, which is not required in the Direct Write
method. Second, the Loop method uses two bytes of
ROM per register, whereas the Direct Write method uses
3 bytes per register for the MOV instruction. As a result,
the “Loop” selection optimizes code size when the number
of registers is greater than the size of the traverse
function in bytes (94). In
programs that have multiple user modules, the Loop
method is usually recommended.
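The break-even point follows from the per-register costs: the Loop method costs 94 + 2n bytes and Direct Write costs 3n bytes for n registers, so Loop is smaller once n exceeds 94. The sketch below encodes that comparison; the function names are hypothetical and the constants are the ones quoted above.

```c
#define LOOP_FIXED_BYTES      94u  /* size of the table traverse function */
#define LOOP_BYTES_PER_REG     2u  /* table entry: address + value */
#define DIRECT_BYTES_PER_REG   3u  /* one MOV instruction per register */

unsigned int loop_init_bytes(unsigned int registers)
{
    return LOOP_FIXED_BYTES + LOOP_BYTES_PER_REG * registers;
}

unsigned int direct_write_bytes(unsigned int registers)
{
    return DIRECT_BYTES_PER_REG * registers;
}

/* Returns 1 when the Loop method uses less ROM than Direct Write. */
int loop_is_smaller(unsigned int registers)
{
    return loop_init_bytes(registers) < direct_write_bytes(registers);
}
```

At exactly 94 registers both methods cost 282 bytes; from 95 registers onward the Loop method is smaller.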
In the following two similar pieces of code, the Code 4
implementation uses 50 bytes more than Code 5. This is
due to the addition of the “__mul16” function to the code.
Code 4
unsigned int iTest1, iTest2;
void main(void)
{
    iTest1 = iTest2 * 3;  /* pulls in the 16-bit multiply routine */
}
Code 5
unsigned int iTest1, iTest2;
void main(void)
{
    iTest1 = (iTest2 << 1) + iTest2;  /* 2*x + x = 3*x, no multiply call */
}
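The same idea extends to other constant factors. The following sketch shows two further cases for unsigned values; the function names are hypothetical, and whether the compiler actually omits its multiply or divide routine for a given expression should be confirmed from the ROM usage in the Build message.

```c
/* x * 10 decomposed as x * 8 + x * 2. */
unsigned int MulBy10(unsigned int x)
{
    return (x << 3) + (x << 1);
}

/* x / 8 for unsigned x; the remainder is discarded, as with integer division. */
unsigned int DivBy8(unsigned int x)
{
    return x >> 3;
}
```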
Avoiding Floating Point Math
Floating-point math should be avoided when possible
because of the overhead of the libraries. Any time a
floating-point operation is used, utility functions such as
rounding, normalization, and checking special conditions
are added to the code on top of the floating-point parent
function. The byte usage of the floating-point functions
is provided below for an estimate of memory usage. The
byte sizes differ based on the small or large memory model
and the version of PSoC Designer used. The complete
details of the floating-point libraries are also given in the
“Libraries user guide”, which can be accessed through
Help > Documentation > Libraries user guide.
Comparisons (*_fpcmp): 78 bytes
Addition (*_fpadd): 250 bytes
Subtraction (*_fpsub + *_fpadd): 9 + 250 = 259 bytes
Multiplication (*_fpmul + i_mulu8_block_util): 292 + 29 = 321 bytes
Division (*_fpdiv): 221 bytes
Floating point utility functions (*_util): 180 bytes
The floating-point utility functions (180 bytes) are common
to all the functions except the comparison functions. Thus
the total memory usage of a floating-point function is
obtained by adding the byte size of the utility functions to
that of the parent floating-point function. For example, the
total memory usage for the floating-point addition function
is 250 + 180 = 430 bytes.
The floating-point math functions use the integer math
libraries as the base. The floating-point math libraries use
more code space than the integer math libraries. In place
of using floating-point math, the variables can sometimes
be scaled up so that integer math can be used.
For example, in the following two pieces of code, the
Code 6 method uses 492 bytes more than Code 7.
Code 6
int iTest2, iTest3;
float fTest1;
void main(void)
{
    fTest1 = iTest2 * 2.42;
    if (fTest1 > 7.5)
    {
        iTest3 = 2;
    }
    else
    {
        iTest3 = 1;
    }
}
Code 7
int iTest1, iTest2, iTest3;
void main(void)
{
    /* Same comparison as Code 6 with both constants scaled by 100. */
    iTest1 = iTest2 * 242;
    if (iTest1 > 750)
    {
        iTest3 = 2;
    }
    else
    {
        iTest3 = 1;
    }
}
In some instances, a look up table can be used in place of
either floating-point or integer arithmetic to save
code space.
Look up Table in place of Calculation
The use of a formula for calculation can include multiple
integer or floating-point math library functions in the
code space. Instead of using a formula, a Look Up Table
(LUT) method can be used to obtain results and save code
space. There are multiple tradeoffs, such as speed and
accuracy, along with code space in choosing one over
the other. The choice is based on the type of application.
For example, the project given in “Thermistor-Based
Thermometer, PSoC Style” AN2017, offers an option for
floating point and LUT method implementation. The use of
a LUT in place of floating point math in this project saves
1920 bytes of memory.
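As a small illustration of the LUT approach (unrelated to the thermistor project), the sketch below replaces a run-time multiplication with a precomputed table of squares. The table and the names `abSquareLut` and `SquareLut` are hypothetical, and the `BYTE` typedef stands in for the PSoC Designer type.

```c
typedef unsigned char BYTE;   /* stand-in for the PSoC Designer BYTE type */

/* Squares of 0 to 15, computed ahead of time so that no multiply
 * routine is needed at run time. */
static const BYTE abSquareLut[16] =
{
    0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225
};

/* Returns the square of the low four bits of bIndex. The mask keeps the
 * access inside the table, so inputs above 15 wrap around. */
BYTE SquareLut(BYTE bIndex)
{
    return abSquareLut[bIndex & 0x0F];
}
```

The table itself costs 16 bytes of ROM, so a LUT pays off only when it is smaller than the library code it replaces or when speed matters more than size.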
Array Indexing vs. Pointer
Embedded system platforms implement array-indexing
and pointer access differently. Whether the variables are
local or global can also change the memory usage.
For example, in the following two similar pieces of code,
Code 8 uses three bytes more than Code 9. When the
variables are configured as local instead of global, Code 8
uses two bytes less than Code 9.
Code 8
BYTE bVar1;
BYTE array[10];
void main(void)
{
    bVar1 = 0;
    while (array[bVar1] != 0)
    {
        bVar1++;
    }
}
Code 9
BYTE *ptr;
BYTE array[10];
void main(void)
{
    ptr = array;
    while (*ptr != 0)
    {
        ptr++;
    }
}
The number of bytes saved for the simple example is only
a few bytes. When the code is part of a structure or other
user-defined variable, the difference in the type of access
will lead to a large variation in memory usage.
For example, consider the following two code examples.
Code 10 uses 60 bytes more than Code 11. Thus a careful
observation of the type of access method being used
(array-index vs. pointer) is important for code optimization.
There are a number of variations in the type of access and
variable types. Providing every combination of array-index
and pointer access comparison is beyond the scope of this
application note.
Code 10
typedef struct
{
    int iData;
    BYTE bData;
}sData;
typedef struct
{
    sData myArray[10];
}sArray;
sArray myTest;
sData* myPtr;
void main(void)
{
    myPtr = myTest.myArray;
    myPtr->iData = 100;
    myPtr->bData = 10;
    myPtr++;
    myPtr->iData = 200;
    myPtr->bData = 20;
    myPtr++;
    myPtr->iData = 200;
    myPtr->bData = 20;
}
Code 11
typedef struct
{
    int iData;
    BYTE bData;
}sData;
typedef struct
{
    sData myArray[10];
}sArray;
sArray myTest;
sData* myPtr;
void main(void)
{
    myTest.myArray[1].iData = 100;
    myTest.myArray[1].bData = 10;
    myTest.myArray[2].iData = 200;
    myTest.myArray[2].bData = 20;
    myTest.myArray[3].iData = 300;
    myTest.myArray[3].bData = 30;
}
Part of Code in Assembly
Writing a program in assembly will avoid compiler
interpretations and allow complete optimization by the
user. Though writing an entire program in assembly is
tedious and cumbersome, converting a part of code into
assembly level language can optimize code size and
performance. For more information refer to “Interfacing
Assembly and C Source Files” AN2129.
IF-ELSE vs. Switch
For a decision on a single-byte variable (BYTE), the
ImageCraft compiler produces more efficient code from
an if-else construct than from a switch construct.
Compared with if-else, the switch statement uses 9 bytes
more, plus 5 bytes more per case. For example, a
four-case switch statement with a default clause as shown
in Code 12 will use (9 + 5 * 4) = 29 more bytes than
the equivalent Code 13.
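The overhead rule can be expressed as a one-line estimate. The function name below is hypothetical, and the constants are the ones quoted above for a switch on a BYTE variable.

```c
/* Sketch: extra ROM bytes a BYTE switch uses over the equivalent
 * if-else chain, using the 9 + 5 per case rule from the text. */
unsigned int switch_extra_bytes(unsigned int cases)
{
    return 9u + 5u * cases;
}
```

For four cases this gives 9 + 5 * 4 = 29 bytes.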
Code 12
BYTE bTest1, bTest2;
void main(void)
{
    switch (bTest1)
    {
        case 4:
        {
            bTest2 = 1;
            break;
        }
        case 3:
        {
            bTest2 = 2;
            break;
        }
        case 2:
        {
            bTest2 = 3;
            break;
        }
        default:
        {
            bTest2 = 4;
        }
    }
}
Code 13
BYTE bTest1, bTest2;
void main(void)
{
    if (bTest1 == 4)
    {
        bTest2 = 1;
    }
    else if (bTest1 == 3)
    {
        bTest2 = 2;
    }
    else if (bTest1 == 2)
    {
        bTest2 = 3;
    }
    else
    {
        bTest2 = 4;
    }
}
When the switch statement is for a two-byte variable
(WORD), the resulting code size is nearly identical for
either the switch or the if-else implementation.
Initializing Global Variables
Global variables are initialized to zero by default.
Explicitly reinitializing global variables to zero adds
code. Thus, for code optimization, global
variables should not be explicitly set to zero.
Conclusion
This application note discusses the basic methods of code
optimization. Some of these are specific to the ImageCraft
compiler, and a few are general. There are many more
general optimization techniques to be explored beyond
what has been given here.
About the Author
Name: Archana Yarlagadda
Title: Applications Engineer
Background: Applications Engineer at Cypress with focus on PSoC.
Masters in Analog VLSI from University of Tennessee, Knoxville
Contact: yara@cypress.com
Document History
Document Title: Code Optimization of PSoC® 1 Project when using ImageCraft Compiler
Document Number: 001-60486
Revision: **
ECN: 2994124
Orig. of Change: YARA
Submission Date: 07/27/2010
Description of Change: New Application Note.
PSoC is a registered trademark of Cypress Semiconductor Corp. "Programmable System-on-Chip," PSoC Designer, and PSoC Express are
trademarks of Cypress Semiconductor Corp. All other trademarks or registered trademarks referenced herein are the property of their
respective owners.
Cypress Semiconductor
198 Champion Court
San Jose, CA 95134-1709
Phone: 408-943-2600
Fax: 408-943-4730
http://www.cypress.com/
© Cypress Semiconductor Corporation, 2010. The information contained herein is subject to change without notice. Cypress Semiconductor
Corporation assumes no responsibility for the use of any circuitry other than circuitry embodied in a Cypress product. Nor does it convey or imply any
license under patent or other rights. Cypress products are not warranted nor intended to be used for medical, life support, life saving, critical control or
safety applications, unless pursuant to an express written agreement with Cypress. Furthermore, Cypress does not authorize its products for use as
critical components in life-support systems where a malfunction or failure may reasonably be expected to result in significant injury to the user. The
inclusion of Cypress products in life-support systems application implies that the manufacturer assumes all risk of such use and in doing so indemnifies
Cypress against all charges.
This Source Code (software and/or firmware) is owned by Cypress Semiconductor Corporation (Cypress) and is protected by and subject to worldwide
patent protection (United States and foreign), United States copyright laws and international treaty provisions. Cypress hereby grants to licensee a
personal, non-exclusive, non-transferable license to copy, use, modify, create derivative works of, and compile the Cypress Source Code and derivative
works for the sole purpose of creating custom software and or firmware in support of licensee product to be used only in conjunction with a Cypress
integrated circuit as specified in the applicable agreement. Any reproduction, modification, translation, compilation, or representation of this Source
Code except as specified above is prohibited without the express written permission of Cypress.
Disclaimer: CYPRESS MAKES NO WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, WITH REGARD TO THIS MATERIAL, INCLUDING, BUT
NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. Cypress reserves the
right to make changes without further notice to the materials described herein. Cypress does not assume any liability arising out of the application or
use of any product or circuit described herein. Cypress does not authorize its products for use as critical components in life-support systems where a
malfunction or failure may reasonably be expected to result in significant injury to the user. The inclusion of Cypress’ product in a life-support systems
application implies that the manufacturer assumes all risk of such use and in doing so indemnifies Cypress against all charges.
Use may be limited by and subject to the applicable Cypress software license agreement.