Transformation of a synthesizable subset of ANSI C
code into behavioral SystemC code
Piotr Dziurzanski, Vladimir Beletskyy
Faculty of Computer Science & Information Systems, Technical University of Szczecin,
ul. Zolnierska 49, 71-210 Szczecin, Poland
e-mail: pdziurzanski@wi.ps.pl, bielecki@man.szczecin.pl
Abstract:
This paper gives a preliminary description of a system, currently under development,
for translating code written in ANSI C into behavioral SystemC code. The
limitations on the translatable ANSI C constructs are described, and
implementation details are highlighted.
Key words:
High-level synthesis, synthesizable subset, ANSI C, SystemC
1. INTRODUCTION
Different hardware description languages (HDLs) are used as input to
behavioral synthesis. The most commonly used are VHDL and Verilog, but since
designers often write system-level models in programming languages,
the use of software languages is becoming increasingly popular. Software
languages make SW/HW cosynthesis easier, which accelerates the
design process and improves the flexibility of software/hardware migration.
Moreover, estimating system performance and verifying functional
correctness are easier, as software languages offer fast simulation and a sufficient
amount of legacy code and libraries that facilitate the task of system modelling.
To implement parts of a design modelled in C/C++ in hardware using
synthesis tools, designers must translate these parts into a synthesizable subset of an
HDL, which is then synthesized into a logic netlist.
The dominance of ANSI C/C++ among software languages has led to a
large number of HDLs based on these languages, for example SystemC, Cynapps,
Accellera, and SpecC. Choosing such an HDL makes rewriting C/C++ code into an
equivalent HDL description less time-consuming and less error-prone, which results in
a shorter time to market and higher quality [4].
In this paper, we analyze the transformation of ANSI C code into SystemC
code. SystemC is open and is supported by Synopsys, Cadence, Mentor Graphics, Xilinx
and other vendors.
2. PRINCIPLES OF SYSTEMC
In this section, we introduce basic SystemC concepts and nomenclature. In
SystemC, a modelled system is composed of modules, each with one or more
processes that specify combinational or sequential logic. Processes define the parallel
behavior of a module and are executed concurrently, whereas the code within a
process is executed sequentially. Each process is declared as a C++ member function
of a module class and is then registered as a process in the module constructor.
Besides processes, a module contains ports, internal signals, internal data variables, and
member functions. It may also include other modules for hierarchical design.
There exist three types of SystemC processes. Since we aim at creating
synthesizable models, we use the only synthesizable type of SystemC process,
the SC_METHOD process, which is either level-sensitive or edge-sensitive with
respect to a set of signals called its sensitivity list. The module constructor is
defined with the C++ macro SC_CTOR. In its body, processes are registered and their
sensitivity lists are declared. A sensitivity list is defined with the sensitive( ),
sensitive_pos( ), and sensitive_neg( ) functions or with the sensitive, sensitive_pos,
and sensitive_neg streams.
As in other HDLs, processes in SystemC communicate with their
environment through ports, whereas communication between processes can use internal
signals or internal variables. However, using internal variables for this purpose is
not advised, because during simulation the processes are executed in an unspecified
order, which can lead to nondeterminism. Ports are declared with the template classes
sc_in<port_type>, sc_out<port_type>, and sc_inout<port_type>, according to their
direction, and signals are declared with the template class sc_signal<port_type>.
Internal variables are declared as in ANSI C.
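The ordering hazard mentioned above can be illustrated without the SystemC library. The following is a minimal sketch in plain C++, assuming a simplified two-phase model in which a written value becomes visible only after an explicit update step. The `Signal` class and the `run_with_variables` and `run_with_signals` helpers are hypothetical names for this illustration and are not the SystemC API.

```cpp
#include <functional>
#include <vector>

// Simplified stand-in for a signal: writes are buffered and only become
// visible after update(). This is an illustrative model, not sc_signal.
template <typename T>
class Signal {
public:
    explicit Signal(T init) : current_(init), next_(init) {}
    T read() const { return current_; }
    void write(T v) { next_ = v; }
    void update() { current_ = next_; }
private:
    T current_;
    T next_;
};

// Two "processes" exchange values through plain variables: the result
// depends on which process happens to run first.
std::vector<int> run_with_variables(bool first_then_second) {
    int x = 1, y = 2;
    std::function<void()> p1 = [&] { x = y; };
    std::function<void()> p2 = [&] { y = x; };
    if (first_then_second) { p1(); p2(); } else { p2(); p1(); }
    return {x, y};
}

// The same processes communicating through signals: both read the old
// values, so the execution order no longer matters.
std::vector<int> run_with_signals(bool first_then_second) {
    Signal<int> x(1), y(2);
    std::function<void()> p1 = [&] { x.write(y.read()); };
    std::function<void()> p2 = [&] { y.write(x.read()); };
    if (first_then_second) { p1(); p2(); } else { p2(); p1(); }
    x.update();
    y.update();
    return {x.read(), y.read()};
}
```

Under this model, the two calls to `run_with_variables` return different pairs depending on the order, whereas `run_with_signals` returns the swapped pair for either order.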
3. ANSI C TO SYSTEMC TRANSLATION
Because SystemC is a C++ class library and ANSI C is, for the most part, a subset
of C++, a one-to-one translation between the two is possible in most cases. The
vast majority of ANSI C statements are supported by SystemC, but some of them are
nonsynthesizable. In the next section, synthesizable and nonsynthesizable constructs
are enumerated.
An ANSI C program is executed sequentially, whereas in SystemC processes run in
parallel. Thus, one of the crucial points of transforming ANSI C code into SystemC
code is to establish groups of ANSI C functions that should be executed in a
single SystemC process. The three basic approaches to this problem are as follows.

(a)
int a;
void add() {
    a++;
}
int main() {
    add();
    return 1;
}

(b)
SC_MODULE(ex1) {
    sc_in<bool> start;
    sc_out<int> output;
    int a;
    void add() {
        a++;
    }
    void main() {
        add();
        output=1;
    }
    SC_CTOR(ex1) {
        SC_METHOD(main);
        sensitive_pos(start);
    }
};

(c)
SC_MODULE(ex1) {
    sc_in<bool> start;
    sc_out<int> output;
    sc_signal<int> a;
    sc_signal<bool> CS;
    void add() {
        a=a+1;
    }
    void main() {
        CS=false;
        CS=true;
        output=1;
    }
    SC_CTOR(ex1) {
        SC_METHOD(main);
        sensitive_pos(start);
        SC_METHOD(add);
        sensitive_pos(CS);
    }
};

(d) [Block diagram: input start, process main, output output]
(e) [Block diagram: input start, processes main and add, control signal CS, signal a, output output]

Fig. 1. The ANSI C program (a), corresponding SystemC realizations (b,c) and block
diagrams of the processes (d,e) (Example 1)

1. Treating the whole ANSI C program as a single SystemC process,
2. Treating each ANSI C function as a separate SystemC process,
3. Forming partitions from ANSI C functions and then implementing each
   partition as a separate SystemC process (a hybrid approach).
In approach 1, the whole program is executed serially. This approach is sensible
if the ANSI C functions share many variables or if the functions themselves are so
simple that the time spent on synchronization would outweigh the benefits of
parallelization. The entry point of the ANSI C program (usually a function named
main) is then declared as the only process. This approach is quite straightforward and
makes data dependency analysis unnecessary. Consequently, there is no need to
add control buses for synchronization.
This approach, however, forfeits all the benefits of possible parallelization,
since a single process is executed serially: a function that calls another one
has to wait until the called function finishes.

(a)
int a;
void add() {
    a++;
}
int main() {
    add();
    return a;
}

(b)
SC_MODULE(ex1) {
    sc_in<bool> start;
    sc_out<int> output;
    int a;
    void add() {
        a++;
    }
    void main() {
        add();
        output=a;
    }
    SC_CTOR(ex1) {
        SC_METHOD(main);
        sensitive_pos(start);
    }
};

(c)
SC_MODULE(ex2) {
    sc_in<bool> start;
    sc_out<int> output;
    sc_signal<int> a;
    sc_signal<bool> CS;
    sc_signal<bool> RDY;
    void add() {
        a=a+1;
        RDY=true;
    }
    void main() {
        CS=false;
        RDY=false;
        CS=true;
        while(RDY==false);
        output=a;
    }
    SC_CTOR(ex2) {
        SC_METHOD(main);
        sensitive_pos(start);
        SC_METHOD(add);
        sensitive_pos(CS);
    }
};

(d) [Block diagram: input start, process main, output output]
(e) [Block diagram: input start, processes main and add, control signal CS, signals a and RDY, output output]

Fig. 2. The ANSI C program (a), corresponding SystemC realizations (b,c) and
block diagrams of the processes (d,e) (Example 2)

Approach 2 is worth considering when the functions are loosely coupled,
i.e., when there is not much communication between processes.
One complication of this approach is the need to implement
blocking actions before accesses to shared variables. These actions can be
implemented as wait statements, which are left when synchronization signals
from other modules are set. Obviously, these signals can complicate the
implementation to the point where it becomes unacceptable because of the large size
of the resulting realization.
Approach 3 leads to the best results, but the function partitioning problem
is computationally expensive. One data structure that can help to
optimize this stage is a dependency graph [5].
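As one way to picture approach 3, the sketch below groups functions that access a common variable into the same partition, so that only functions without shared data end up in separate processes. It is a deliberately simple heuristic (connected components over a variable-sharing relation), not the partitioning method of the described system. The `FunctionInfo` structure, the `partition_functions` helper, and the premise that the variable sets have already been extracted from the C source are all assumptions of this illustration.

```cpp
#include <map>
#include <numeric>
#include <set>
#include <string>
#include <vector>

// Hypothetical summary of one C function: its name and the shared
// variables it reads or writes (assumed to be extracted beforehand).
struct FunctionInfo {
    std::string name;
    std::set<std::string> variables;
};

// Union-find "find" with path halving.
static int find_root(std::vector<int>& parent, int i) {
    while (parent[i] != i) {
        parent[i] = parent[parent[i]];
        i = parent[i];
    }
    return i;
}

// Returns, for each function, the index of the partition it belongs to.
// Functions that share at least one variable (directly or transitively)
// get the same index; indices are assigned in order of first appearance,
// starting from 0.
std::vector<int> partition_functions(const std::vector<FunctionInfo>& funcs) {
    const int count = static_cast<int>(funcs.size());
    std::vector<int> parent(funcs.size());
    std::iota(parent.begin(), parent.end(), 0);

    // Merge every function with the first function seen for each variable.
    std::map<std::string, int> first_user;
    for (int f = 0; f < count; ++f) {
        for (const std::string& var : funcs[f].variables) {
            auto it = first_user.find(var);
            if (it == first_user.end()) {
                first_user[var] = f;
            } else {
                parent[find_root(parent, f)] = find_root(parent, it->second);
            }
        }
    }

    // Renumber the roots as consecutive partition indices.
    std::map<int, int> index_of_root;
    std::vector<int> result(funcs.size());
    for (int f = 0; f < count; ++f) {
        const int root = find_root(parent, f);
        auto it = index_of_root.find(root);
        if (it == index_of_root.end()) {
            const int next_index = static_cast<int>(index_of_root.size());
            it = index_of_root.emplace(root, next_index).first;
        }
        result[f] = it->second;
    }
    return result;
}
```

For the program of Fig. 1a, where only add accesses a, the two functions fall into different partitions; for Fig. 2a, where both main and add use a, they fall into the same one.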
Example 1. Let us consider the ANSI C code given in Fig. 1a, where the
function add does not share any variables with the function main. Consequently, the
main function does not have to wait until add finishes. As there is no data
dependence between the functions main and add, they can be executed in parallel
and, consequently, realized in one SystemC module as two processes. As main
executes add, a control connection between the corresponding processes is
necessary. Fig. 1c and e depict, respectively, the realization in which a positive
edge activates the add function and the block diagram of the processes.
If we use approach 1, we obtain a single process and gain no benefit from
possible parallelization (Fig. 1b and d).
Example 2. Since in the code presented in Fig. 2a both functions, main and
add, share the variable a, the execution of main has to wait until add finishes its
execution. To synchronize the execution of the processes, a signal that is set
when add finishes is added. Fig. 2c gives the realization with a positive edge
activating the add function and with the synchronization signal RDY. The block
diagram of the processes is depicted in Fig. 2e.

(a)
int main() {
    int i;
    char a[100];
    char b[100];
    int n=100;
    for(i=0;i<100;i++)
        b[i]=a[i];
    return 1;
}

(b)
SC_MODULE(ex2) {
    sc_in<bool> start;
    sc_out<int> output;
    sc_signal<int> a[100], b[100];
    sc_signal<bool> CS1, CS2, CS3;
    sc_signal<bool> RDY1, RDY2, RDY3;
    void loop_body(int From, int To) {
        int i;
        for(i=From; i<To; i++)
            b[i]=a[i];
    }
    void main() {
        RDY1=false; RDY2=false; RDY3=false;
        CS1=false; CS1=true;
        CS2=false; CS2=true;
        CS3=false; CS3=true;
        while(!(RDY1 && RDY2 && RDY3));
        output=1;
    }
    void start1(){loop_body(0,33); RDY1=true;}
    void start2(){loop_body(33,66); RDY2=true;}
    void start3(){loop_body(66,100); RDY3=true;}
    SC_CTOR(ex2) {
        SC_METHOD(main);
        sensitive_pos(start);
        SC_METHOD(start1);
        sensitive_pos(CS1);
        SC_METHOD(start2);
        sensitive_pos(CS2);
        SC_METHOD(start3);
        sensitive_pos(CS3);
    }
};

(c) [Block diagram: input start, process main, output output, and three loop processes, each connected to main by a CS and an RDY signal]

Fig. 3. The ANSI C program (a), corresponding SystemC realization (b) and block diagram
of the processes (c) (Example 3)

Bool Datatype                                       Integer Datatypes
Struct                                              Enumeration Datatype
Integer, Character, Enumeration Constants           Arrays
Postfix Incrementation (++, --)                     Casts
Unary Operators (+,-)                               One's Complement Operator (~)
Logical Negation Operator (!)                       Multiplicative Operators (*,/,%)
Additive Operators (+,-)                            Shift Operators (<<,>>)
Relational Operators                                Equality Operators
Bitwise AND, XOR, OR Operators                      Logical AND, OR Operators
Conditional Operator (?:)                           Assignment Expressions
Comma Operator (,)                                  Constant Expressions
Declarations                                        Init Declarations
Storage Class Specifiers (extern, static, typedef)  Type Specifier const
Array Declarators                                   Function Declarators
Labeled Statements                                  Compound Statement (block)
Selection Statements (if, switch)                   Iteration Statements (while, do, for)
Jump Statements (goto, continue, break, return)     Function Definitions
File Inclusion (#include)                           Conditional Compilation
Function Overloading                                Operator Overloading
Operator sizeof
Tab. 1. Synthesizable ANSI C Constructs

Floating Datatypes                                  Pointers
File Datatype                                       Void Datatype
Union Datatype                                      Global Variables
Volatile Qualifier                                  Storage Classes auto, register
Address Operator (&)                                Indirection Operator (*)
Floating Constants                                  Pointer Declarations
Standard Library Functions                          Dynamic Memory Allocation
Recursions                                          Operator ->
Tab. 2. Nonsynthesizable ANSI C Constructs

Example 2 can also be implemented with approach 1, which leads to a single
SystemC process, given in Fig. 2b and d.
To benefit more from the transformation, statements inside a function can
also be parallelized. The next example shows the
parallelization of a for loop.
Example 3. The code presented in Fig. 3a contains a for loop with no
dependencies among its iterations. The loop can therefore be split into several
processes that run in parallel. Fig. 3b and c depict the realization and the block
diagram with three processes (only the synchronization wires are shown).
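The iteration ranges passed to loop_body in Fig. 3b can be derived mechanically. The sketch below, in plain C++ without SystemC, splits n independent iterations into a given number of contiguous half-open ranges. The helper name `split_iterations` and the rounding rule (integer division of k*n by the number of parts) are assumptions, chosen so that n=100 and three parts reproduce the bounds used in the figure.

```cpp
#include <utility>
#include <vector>

// Splits iterations 0..n-1 into `parts` contiguous half-open ranges
// [from, to). Boundary k is computed as k * n / parts, so range sizes
// differ by at most one iteration.
std::vector<std::pair<int, int>> split_iterations(int n, int parts) {
    std::vector<std::pair<int, int>> ranges;
    if (n < 0 || parts <= 0) {
        return ranges;  // nothing sensible to split
    }
    for (int k = 0; k < parts; ++k) {
        // 64-bit arithmetic avoids overflow of k * n for large n.
        const int from = static_cast<int>(static_cast<long long>(k) * n / parts);
        const int to = static_cast<int>(static_cast<long long>(k + 1) * n / parts);
        ranges.emplace_back(from, to);
    }
    return ranges;
}
```

For example, `split_iterations(100, 3)` yields the ranges [0,33), [33,66) and [66,100), matching the calls made in start1, start2 and start3.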
4. ANSI C CONSTRUCTS FOR BEHAVIORAL SYSTEMC SYNTHESIS
For a designer synthesizing hardware from C code, the most useful tool would
be a synthesizer that accepts the full ANSI C standard described in [3]. This task,
however, turns out to be particularly difficult because of such constructs as dynamic
memory allocation, function calls, recursion, jumps, type casts, and pointers [1],
[4].
In our implementation, we established synthesizable and nonsynthesizable
subsets of the ANSI C constructs, as given in Table 1 and Table 2, respectively.
Although the arbitrary control flow caused by jump statements complicates the
scheduling of operations, jump statements have been included in the synthesizable subset.
Array types can be synthesized as long as each element is of a synthesizable data
type.
Constructs that have no hardware meaning, such as file operations, are not
synthesizable and thus should be avoided.
Floating-point types are not synthesizable because a straightforward
implementation results in hardware that requires an enormous amount of
resources, beyond what contemporary technology offers. However, the method
described in [2], which derives a fixed-point implementation from a floating-point
description, is under consideration.
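To make the fixed-point alternative concrete, the following plain C++ sketch represents real values as 16-bit integers with 8 fractional bits (often written Q8.8). The format, the helper names and the rounding choices are assumptions made for this illustration only; it is not the method of [2].

```cpp
#include <cmath>
#include <cstdint>

// Q8.8 fixed point: value = raw / 256. The format is an assumption chosen
// for illustration only.
constexpr int kFractionBits = 8;
constexpr std::int32_t kOne = 1 << kFractionBits;

// Converts a real number to Q8.8, rounding to the nearest representable
// value. The caller is assumed to pass values within about [-128, 128).
std::int16_t to_fixed(double x) {
    return static_cast<std::int16_t>(std::lround(x * kOne));
}

double to_double(std::int16_t q) {
    return static_cast<double>(q) / kOne;
}

// Addition needs no rescaling because both operands share the same scale.
// Overflow is not handled in this sketch.
std::int16_t fixed_add(std::int16_t a, std::int16_t b) {
    return static_cast<std::int16_t>(a + b);
}

// The 32-bit product carries 16 fractional bits, so it is divided by 256
// to return to 8 fractional bits (integer division truncates toward zero).
std::int16_t fixed_mul(std::int16_t a, std::int16_t b) {
    const std::int32_t wide = static_cast<std::int32_t>(a) * static_cast<std::int32_t>(b);
    return static_cast<std::int16_t>(wide / kOne);
}
```

With these definitions, 1.5 × 2.25 is computed as 384 × 576 / 256 = 864, which converts back to exactly 3.375; values that are not multiples of 1/256 incur a rounding error.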
Dynamic memory allocation and recursion are not synthesizable, as the amount
of required memory is unknown at the synthesis stage. Therefore, the synthesis
of C code involving dynamic memory allocation would require access to an
operating system running in software or the generation of hardware allocators [6].
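One common way to sidestep the unknown memory requirement in hand-written code is to replace malloc/free with a pool whose capacity is fixed at compile time. The plain C++ sketch below illustrates that idea only. It is not the hardware allocator generation of [6], and the `Pool` type, the `kInvalidCell` constant, and the use of indices instead of pointers are assumptions of this example.

```cpp
#include <array>
#include <cstddef>

// Marker returned when no cell is available (hypothetical convention).
constexpr int kInvalidCell = -1;

// Fixed-capacity pool of int cells addressed by index. Because Capacity
// is a compile-time constant, the total storage is known statically.
template <std::size_t Capacity>
class Pool {
public:
    Pool() {
        // Chain all cells into a free list: cell i points to cell i + 1.
        for (std::size_t i = 0; i < Capacity; ++i) {
            next_free_[i] = (i + 1 < Capacity) ? static_cast<int>(i + 1) : kInvalidCell;
        }
        head_ = (Capacity > 0) ? 0 : kInvalidCell;
    }

    // Returns the index of a free cell, or kInvalidCell if the pool is full.
    int allocate() {
        if (head_ == kInvalidCell) {
            return kInvalidCell;
        }
        const int cell = head_;
        head_ = next_free_[static_cast<std::size_t>(cell)];
        return cell;
    }

    // Returns a previously allocated cell to the free list.
    // The caller must not release the same cell twice.
    void release(int cell) {
        next_free_[static_cast<std::size_t>(cell)] = head_;
        head_ = cell;
    }

    // Access to the payload of an allocated cell.
    int& value(int cell) { return values_[static_cast<std::size_t>(cell)]; }

private:
    std::array<int, Capacity> values_{};
    std::array<int, Capacity> next_free_{};
    int head_ = kInvalidCell;
};
```

A request that exceeds the capacity is reported through the return value instead of growing the storage, which keeps the memory bound visible at translation time.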
Pointers are especially difficult to synthesize, as they have many different uses,
such as complex memory management operations, referencing data structures,
referencing functions, and passing parameters by reference. SpC, an interesting approach
to synthesizing pointers and malloc/free statements, is described in [6]. However,
dynamic memory allocation still requires a lot of research before it can
be synthesized at a satisfactory level, so the synthesis of pointers is not included in
the majority of available systems (BACH C, COWARE, OCAPI, Synopsys
COCENTRIC, and NEC CYBER).
5. CONCLUSIONS AND FUTURE WORK
In this paper, we have described a system under development for translating
code written in ANSI C into behavioral SystemC code. A method for running
functions in parallel has been described, and the synthesizable and nonsynthesizable
ANSI C subsets have been given. For the nonsynthesizable constructs, we have briefly
justified why they are difficult or impossible to synthesize.
In our future work, we are going to develop methods to synthesize ANSI C code
with OpenMP pragmas, which define the parallelization of the code.
6. REFERENCES
[1] ‘Describing Synthesizable RTL in SystemC’, Version 1.2, Synopsys, November 2002,
www.synopsys.com
[2] H. Keding, M. Willems, M. Coors, H. Meyr, ‘FRIDGE: a fixed-point design and
simulation environment’, In Proceedings of the Design, Automation
and Test in Europe, Paris, France, 1998, pp. 429-435
[3] B. Kernighan, D. Ritchie, ‘The C Programming Language’, Prentice Hall Software
Series, Englewood Cliffs, NJ, 1988
[4] S. Y. Liao, ‘Towards a new standard for system-level design’, In Proceedings of the
Eighth International Workshop on Hardware/Software Codesign, San Diego, CA, USA,
2000
[5] G. De Micheli, ‘Synthesis and Optimization of Digital Circuits’, McGraw-Hill,
Hightstown, NJ, 1994
[6] L. Semeria, K. Sato, G. De Micheli, ‘Synthesis of hardware models in C with pointers and
complex data structures’, IEEE Transactions on Very Large Scale
Integration Systems, vol. 9, no. 6, 2001, pp. 743-756
[7] ‘SystemC Version 2.0 User's Guide’, www.systemc.org, 2002