A Floating Point Divider for Complex Numbers in the NIOS II

Presented by John-Marc Desmarais
Authors: Philipp Digeser, Marco Tubolino, Martin Klemm, Daniel Shapiro and Miodrag Bolic
Email: {dshap092, mbolic}@site.uottawa.ca
CARG 2010
Overview
• Floating point division
• Instruction Set Extensions (ISE)
• NIOS II processor
• Instruction hardware
• Software interface
• Experiment
• Conclusion
carg.site.uottawa.ca
Floating Point Division
Unlike real multiplication or real division, mathematical
operations for complex numbers are usually provided by slow
software. Consider complex division:
(a + bi) / (c + di) = [(ac + bd) + (bc - ad)i] / (c^2 + d^2)
Evaluated in software, this is slow.
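As a point of reference, the software path can be sketched in C as follows. This is a minimal illustration, not the authors' benchmark code: the `cfloat` type and the `cdiv_soft` function are hypothetical names, and the arithmetic is the textbook formula for (a + bi)/(c + di). It makes the cost visible: six multiplications, three additions or subtractions, and two divisions per result, all in floating point. The sketch does not guard against a zero denominator or intermediate overflow.

```c
/* Textbook complex division in single precision.
 * cfloat and cdiv_soft are illustrative names, not from the slides. */
typedef struct {
    float re;  /* real part */
    float im;  /* imaginary part */
} cfloat;

static cfloat cdiv_soft(cfloat num, cfloat den)
{
    /* c^2 + d^2: two multiplications and one addition */
    float mag2 = den.re * den.re + den.im * den.im;
    cfloat q;

    /* real part: (ac + bd) / (c^2 + d^2) */
    q.re = (num.re * den.re + num.im * den.im) / mag2;
    /* imaginary part: (bc - ad) / (c^2 + d^2) */
    q.im = (num.im * den.re - num.re * den.im) / mag2;
    return q;
}
```

For example, dividing 1 + 2i by 3 + 4i with this function gives 0.44 + 0.08i.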
• Fast complex dividers are necessary to drive
an increasing number of applications such as
signal processing systems for image and audio
manipulation, GPS, and multi-antenna
systems.
• Example: STSDAS offers math libraries for
image analysis, including
stsdas.analysis.fourier.carith, which is used to
multiply or divide two complex images [1].
[1] http://stsdas.stsci.edu/cgibin/gethelp.cgi?carith.hlp
Instruction Set Extensions
Instruction set extension, as the
name implies, involves adding
custom instructions to a processor’s
instruction set.
Many commercial processors allow these
internal custom instructions to be added:
1. Tensilica Xtensa (VLIW)
2. Altera NIOS II
3. Xilinx MicroBlaze
4. MIPS CorExtend
In recent years there has been much research
into the automatic identification of
Instruction Set Extensions.
These automated efforts vary in their
approach. Some work at the level of C
functions, identifying the hotspot
functions of the program. Others work at a
lower level, on the program's data flow
and control flow graphs.
Figure: ISE flow for a data flow graph with inputs x, y, z and operators +, /, >>. Steps: Modify ISA; Add Custom Hardware; Modify Compiler, ASM & LD; Regenerate Custom Program.
• An ISE candidate has limited I/O access to the
register file.
Possible Remedies:
1. Multiport Register File
2. Register File Replication
3. Shadow Registers
4. Multicycle Reads (Altera’s NIOS II)
5. Dedicated Data Links (MicroBlaze)
Solution (Pozzi05):
We use multicycle reads/writes from/to
the register bank in order to squeeze
several operands into the two-input, one-output register file.
• The instruction width also poses an I/O barrier.
Field:  opcode  rs     rt     rd     shamt  funct
Bits:   31-26   25-21  20-16  15-11  10-6   5-0
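To make the I/O barrier concrete, the sketch below packs the six fields named above (a 6-bit opcode, 5-bit rs, rt, rd and shamt, and a 6-bit funct, which is the 32-bit MIPS R-type layout) into one instruction word. The helper names `pack_rtype` and `get_field` are hypothetical. With only three 5-bit register fields, a single instruction can name at most two source registers and one destination, so an operation with four inputs and two outputs cannot be expressed in one instruction word.

```c
#include <stdint.h>

/* Pack a 32-bit R-type word: opcode[31:26] rs[25:21] rt[20:16]
 * rd[15:11] shamt[10:6] funct[5:0]. Helper names are illustrative. */
static uint32_t pack_rtype(uint32_t opcode, uint32_t rs, uint32_t rt,
                           uint32_t rd, uint32_t shamt, uint32_t funct)
{
    return ((opcode & 0x3Fu) << 26) |
           ((rs     & 0x1Fu) << 21) |
           ((rt     & 0x1Fu) << 16) |
           ((rd     & 0x1Fu) << 11) |
           ((shamt  & 0x1Fu) <<  6) |
            (funct  & 0x3Fu);
}

/* Extract bits [hi:lo] of an instruction word.
 * Intended for fields narrower than 32 bits. */
static uint32_t get_field(uint32_t word, unsigned hi, unsigned lo)
{
    uint32_t width = hi - lo + 1u;
    return (word >> lo) & ((UINT32_C(1) << width) - 1u);
}
```

For example, `pack_rtype(0, 1, 2, 3, 0, 0x20)` yields `0x00221820`, and `get_field` on that word recovers 1, 2 and 3 from the rs, rt and rd positions.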
NIOS II Processor
Generic custom instruction datapath
Our custom logic block
Instruction Hardware
[Figures: instruction hardware operation, axis in cycles]
• These figures show that a sequence of three calls to the custom
instruction yields one complex operation with four inputs and two outputs.
Operation when n=0 above, n=1 at right.
Software Interface
The hardware designed
for complex division is
easy to use from inline
assembly or from C/C++
code.
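One way such a call sequence could look in C is sketched here. It is not the authors' code. `__builtin_custom_fnff` is the GCC builtin for a Nios II custom instruction that takes two floats and returns a float; everything else is an assumption made for illustration: the three opcode-extension indices, the order of the calls, the `CDIV_CI` macro, the `complex_div` wrapper, and the use of the `__nios2__` macro to detect the target. The slides mention n=0 and n=1, but the exact protocol is not recoverable from them. On a host machine a small software stand-in replaces the hardware so the sketch can still run.

```c
/* Hypothetical opcode-extension indices (n). Real values depend on how
 * the custom instruction was added to the system. */
enum { CDIV_LOAD_NUM = 0, CDIV_DIV_RE = 1, CDIV_GET_IM = 2 };

#if defined(__nios2__)
/* On the target: one custom instruction call, two float inputs, one float output. */
#define CDIV_CI(n, x, y) __builtin_custom_fnff((n), (x), (y))
#else
/* Host-side stand-in for the hardware (assumed behaviour, not the real core). */
static float cdiv_model(int n, float x, float y)
{
    static float a, b, im;   /* state held between calls */
    float den, re;

    switch (n) {
    case CDIV_LOAD_NUM:      /* latch numerator a + bi */
        a = x;
        b = y;
        return 0.0f;
    case CDIV_DIV_RE:        /* x + yi is the denominator; return real part */
        den = x * x + y * y;
        re = (a * x + b * y) / den;
        im = (b * x - a * y) / den;
        return re;
    default:                 /* CDIV_GET_IM: return stored imaginary part */
        return im;
    }
}
#define CDIV_CI(n, x, y) cdiv_model((n), (x), (y))
#endif

/* Three two-input, one-output calls give a four-input, two-output operation:
 * (a + bi) / (c + di) = *re + (*im)i. */
static void complex_div(float a, float b, float c, float d,
                        float *re, float *im)
{
    (void)CDIV_CI(CDIV_LOAD_NUM, a, b);
    *re = CDIV_CI(CDIV_DIV_RE, c, d);
    *im = CDIV_CI(CDIV_GET_IM, 0.0f, 0.0f);
}
```

With the host stand-in, `complex_div(1.0f, 2.0f, 3.0f, 4.0f, &re, &im)` leaves 0.44 in `re` and 0.08 in `im`. Keeping the hardware access behind one macro means the calling code stays the same whether the custom instruction or the software model is used.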
Experiment
We used a NIOS II processor and a PLL as the starting point for the design.
Conclusion
Applications can be accelerated with
instruction set extensions, and complex
division is one case where there is a
tangible benefit.
• We designed a complex divider instruction
set extension for the NIOS II.
• This instruction was able to accelerate the
execution of code that uses complex division.
• In the future we would like to implement
additional complex operations, and publish
the core on OPENCORES.org.