p-2-4

advertisement
Introduction to CBEA SDK
Veselin Dikov
Moscow-Bavarian Joint Advanced Student School
19-29. March 2006
Moscow, Russia
Moscow-Bavarian Joint Advanced Student School
19-29 March 2006, Moscow, Russia
Outline
●
●
●
●
●
Getting started
SPU Language Extensions
SPE Library
SDK Libraries
Remote Procedure Calls
Moscow-Bavarian Joint Advanced Student School
19-29 March 2006, Moscow, Russia
Getting started
● Two executables formats: ppu vs. spu
 Two different compilers: ppu-gcc and spu-gcc
 Check the format of an executable
# file <executable>
● Makefile headers
 Reside in SDK root directory
make.header
make.footer
make.env
Moscow-Bavarian Joint Advanced Student School
19-29 March 2006, Moscow, Russia
Getting started
● Makefile - ppu
# Subdirectories ###############
DIRS := spu
# Target #######################
PROGRAM_ppu:= simple
● Makefile - spu
# Target #######################
PROGRAMS_spu := simple_spu
# created embedded library
LIBRARY_embed:= lib_simple_spu.a
# Local Defines ################
# Local Defines
################################
IMPORTS = $(SDKLIB_spu)/libc.a
IMPORTS := spu/lib_simple_spu.a \
-lspe
# imports the embedded simple_spu
# library allows consolidation of
# spu program into ppe binary
# make.footer
################################
include ../../make.footer
# make.footer ##################
include ../make.footer
# make.footer is in the top of
# the SDK
# make.footer is in the top of
# the SDK
Moscow-Bavarian Joint Advanced Student School
19-29 March 2006, Moscow, Russia
Outline
●
●
●
●
●
Getting started
SPU Language Extensions
SPE Library
SDK Libraries
Remote Procedure Calls
Moscow-Bavarian Joint Advanced Student School
19-29 March 2006, Moscow, Russia
SPU Language Extensions
●
●
●
●
●
Provides SPU functionality
Extended RISC instruction set
Extended Data types: vector
C/C++ intrinsic routines
Memory Flow Control (MFC)
Moscow-Bavarian Joint Advanced Student School
19-29 March 2006, Moscow, Russia
SIMD Vectorization
● vector - Reserved as a C/C++ keyword
● 128 bits of size
 16 bytes, or
 4 32-bit words, or
 etc.
● Single instruction acts on the entire vector
vector unsigned int vec =
(vector unsigned int)(1,2,3,4);
vector unsigned int v_ones =
(vector unsigned int)(1);
vector unsigned int vdest =
spu_add(vec, v_ones);
Moscow-Bavarian Joint Advanced Student School
19-29 March 2006, Moscow, Russia
SIMD Vectorization
● Overridden C/C++ operators
 sizeof() (=16 for any vector)
 operator=
 operator&
● Various intrinsic functions for vectors
 Arithmetical – spu_add, spu_mult, etc.
 Logical – spu_and, spu_nor, etc.
 Comparison – spu_cmpeq
 Scalar – spu_extract, spu_insert, spu_promote
 Etc.
Moscow-Bavarian Joint Advanced Student School
19-29 March 2006, Moscow, Russia
MFC routines
● Local Store (LS) vs. Effective Address (EA)
● Data transport via DMA
 Up to 16,384 bytes on DMA message
 Asynchronous
 Part of the instruction set, C/C++ intrinsics
mfc_get(&data,addr+16384*i,16384,20,0,0);
LS address
EA address
● What is __attribute__ ((aligned (128)));?
Moscow-Bavarian Joint Advanced Student School
19-29 March 2006, Moscow, Russia
MFC example
● ppu main file
int main() {
…
/* the array needs to be aligned on
a 128-byte cache line */
data = (int *) malloc(127 + DATASIZE*sizeof(int));
while (((int) data) & 0x7f) ++data;
…
spe_create_thread(gid,&sample_spu,data,NULL,-1,0);
Moscow-Bavarian Joint Advanced Student School
19-29 March 2006, Moscow, Russia
MFC example
● ppu main file
int main() {
…
/* the array needs to be aligned on
a 128-byte cache line */
data = (int *)malloc_align(DATASIZE*sizeof(int),7);
…
spe_create_thread(gid,&sample_spu,data,NULL,-1,0);
Moscow-Bavarian Joint Advanced Student School
19-29 March 2006, Moscow, Russia
MFC example
● spu main file
int databuf [DATASIZE]__attribute__ ((aligned (128)));
int main(void * arg) {
int *data = &databuf[0];
…
mfc_get(data,arg,DATASIZE*sizeof(int),20,0,0);
mfc_write_tag_mask(1<<20);
mfc_read_tag_status_all();
Moscow-Bavarian Joint Advanced Student School
19-29 March 2006, Moscow, Russia
Outline
●
●
●
●
●
Getting started
SPU Language Extensions
SPE Library
SDK Libraries
Remote Procedure Calls
Moscow-Bavarian Joint Advanced Student School
19-29 March 2006, Moscow, Russia
● Provides PPU functionality
● Two sets of functions
 Thread management – POSIX like
 MFC access functions – access to mailboxes
● Header: <libspe.h>
Moscow-Bavarian Joint Advanced Student School
19-29 March 2006, Moscow, Russia
Thread management
● Threads are organized in thread groups
Thread scheduling:
•SCHED_RR
•SCHED_FIFO
•SCHED_OTHER
Priority:
•RR and FIFO – 1 to 99
•OTHER – 0 to 99
Thread
Group 0
Th0 Th1 Th2
ThN
Thread
Group 1
Th0 Th1 Th2
ThN
Moscow-Bavarian Joint Advanced Student School
19-29 March 2006, Moscow, Russia
Thread management
● Threads are organized in thread groups
spe_gid_t gid; // group handle
speid_t speids[1]; // thread handle
// Create an SPE group
gid = spe_create_group (SCHED_OTHER, 0, 1);
if (gid == NULL) exit(-1); // failed
// allocate the SPE task
speids[0] = spe_create_thread(gid,&sample_spu,0,0,-1,0);
if (speids[0] == NULL) exit (-1);
// wait for the single SPE to complete
spe_wait(speids[0], &status[0], 0);
Moscow-Bavarian Joint Advanced Student School
19-29 March 2006, Moscow, Russia
Thread management
● spe_get_affinity, spe_set_affinity
 8-bit mask specifying SPE units where to start the
thread
● spe_get_ls
 Gets access to SPU’s local store
 Dirty!!
 Not supported by all operating systems
 Used for RPC communication
● spe_get/set_context, spe_get_ps
Moscow-Bavarian Joint Advanced Student School
19-29 March 2006, Moscow, Russia
Thread management
● Two ways of communicating with threads
 via signals
 via mailboxes
● POSIX-like Signals
 spe_get_event - check for signals from a thread
 spe_kill – send signal to a thread
 spe_write_signal – writes a signal through MFC
● MFC mailboxes
 spe_read_out_mbox, spe_write_in_mbox
 spe_stat_in_mbox, spe_stat_out_mbox,
spe_stat_out_intr_mbox
Moscow-Bavarian Joint Advanced Student School
19-29 March 2006, Moscow, Russia
Outline
●
●
●
●
●
Getting started
SPU Language Extensions
SPE Library
SDK Libraries
Remote Procedure Calls
Moscow-Bavarian Joint Advanced Student School
19-29 March 2006, Moscow, Russia
SDK Libraries
●The SDK comes with various applied libraries
Library name
Short Description
PPE
SPE
C Library
standard C99 functionality. POSIX.1 functions.
x
x
Audio Resample
Library
audio resampling functionality for PPE and SPE
x
x
Curves and
Surfaces Library
quadratic and cubic Bezier curves. Biquadric and bicubic
Bezier surfaces, and curved point-normal triangles.
x
x
FFT Library
1-D FFT and kernel functions for 2-D FFT
x
x
Game Math
Library
math routines applicable to game performance needs
x
x
Image Library
routines for processing images - convolutions and
histograms
x
x
Large Matrix
Library
basic linear algebra routines on large vectors and
matrices
Math Library
general purpose math routines tuned to exploit SIMD
x
x
x
Moscow-Bavarian Joint Advanced Student School
19-29 March 2006, Moscow, Russia
SDK Libraries
●The SDK comes with various applied libraries
Library name
Short Description
PPE
SPE
Matrix Library
routines for operations on 4x4 Matrices and quaternions
x
x
Misc Library
set of general purpose routines that don’t logically fit within
any other
x
x
Multi-Precision
Math Library
operations on unsigned integer numbers with large number
of bits
Noise
LibraryPPE
1-,2-, 3-, 4-D noise, Lattice and non-lattice noise,
Turbulance
x
x
Oscillator
Libraries
definition of sound sources
x
x
Simulation
Library
functionality related to the Full-Simulator
-
-
Sync Library
synchronization primitives, like atomic operations, mutex
x
x
Vector Library
a set of general purpose routines that operate on vectors.
x
x
x
Moscow-Bavarian Joint Advanced Student School
19-29 March 2006, Moscow, Russia
Example
typedef union {
vector float fv;
vector unsigned int uiv;
unsigned int ui[4];
float f[4];
} v128;
typedef vector float floatvec;
● Solving Ax  y
…
v128 A[4] = {(floatvec)(0.9501,0.8913,0.8214,0.9218),
(floatvec)(0.2311,0.7621,0.4447,0.7382),
from LargeMatrix library.
(floatvec)(0.6068,0.4565,0.6154,0.1763),
from Matrix (floatvec)(0.4860,0.0185,0.7919,0.4057)};
Solves:
v128
invA[4];
library
y = A*x + y
v128 x, y;
x.fv = (floatvec)(1, 5, 6, 7);
y.fv = (floatvec)(0);
inverse_matrix4x4(&invA[0].fv, &A[0].fv);
madd_vector_matrix(4,4,&invA[0].fv,4,(float *)&x,(float *)&y);
Moscow-Bavarian Joint Advanced Student School
19-29 March 2006, Moscow, Russia
Outline
●
●
●
●
●
Getting started
SPU Language Extensions
SPE Library
SDK Libraries
Remote Procedure Calls
Moscow-Bavarian Joint Advanced Student School
19-29 March 2006, Moscow, Russia
Remote Procedure Calls
● Function-Offload Model
 SPE threads as services
 PPE communicates with them thought RPC
calls
● Remote Procedure Calls
 stubs
Moscow-Bavarian Joint Advanced Student School
19-29 March 2006, Moscow, Russia
Remote Procedure Calls
● Preparing stubs
 Interface Description Language (IDL)
 idl files
interface add
{
import "../stub.h";
const int ARRAY_SIZE = 1000;
[sync] idl_id_t do_inv ([in] int array_size,
[in, size_is(array_size)] int array_a[],
[out, size_is(array_size)] int array_res[]);
…
}
 idl compiler
# idl -p ppe_sample.c -s spe_sample.c sample.idl
Moscow-Bavarian Joint Advanced Student School
19-29 March 2006, Moscow, Russia
Remote Procedure Calls
● Preparing stubs
Moscow-Bavarian Joint Advanced Student School
19-29 March 2006, Moscow, Russia
Example
● .idl file
interface add
{
import "../stub.h";
const int ARRAY_SIZE = 1000;
[sync] idl_id_t do_add ([in] int array_size,
[in, size_is(array_size)] int array_a[],
[in, size_is(array_size)] int array_b[],
[out, size_is(array_size)] int array_c[]);
[sync] idl_id_t do_sub ([in] int array_size,
[in, size_is(array_size)] int array_a[],
[in, size_is(array_size)] int array_b[],
[out, size_is(array_size)] int array_c[]);
}
Moscow-Bavarian Joint Advanced Student School
19-29 March 2006, Moscow, Russia
Example
● spe do_add.c file (user file)
#include "../stub.h"
idl_id_t do_add (
int array_size, int array_a[],
int array_b[], int array_c[])
{
int i;
for (i = 0; i < array_size; i++) {
array_c[i] = array_a[i] + array_b[i];
}
return 0;
}
idl_id_t do_sub (
int array_size, int array_a[],
int array_b[], int array_c[])
{
...
return 0;
}
Moscow-Bavarian Joint Advanced Student School
19-29 March 2006, Moscow, Russia
Example
● ppe main.c file (user file)
#include "../stub.h"
int array_a[ARRAY_SIZE] __attribute ((aligned(128)));
int array_b[ARRAY_SIZE] __attribute ((aligned(128)));
int array_add[ARRAY_SIZE] __attribute ((aligned(128)));
int main()
{
...
do_add (ARRAY_SIZE, array_a, array_b, array_add);
...
}
Moscow-Bavarian Joint Advanced Student School
19-29 March 2006, Moscow, Russia
Conclusion
● Powerful SDK
 vector data type
 Extensive Matrix routines
 Extensive Signal Processing routines
● Game-industry driven
 But applicable for scientific work as well
Moscow-Bavarian Joint Advanced Student School
19-29 March 2006, Moscow, Russia
References
Thank you for your attention!
●
●
●
●
●
CBEA-Tutorial.pdf, SDK documentation
idl.pdf, SDK documentation
libraries_SDK.pdf, SDK Documentation
libspe_v1.0.pdf, SDK Documentation
SPU_language_extensions_v21.pdf
 Sony online resources
 http://cell.scei.co.jp/pdf/SPU_language_extensions_v21.pdf, 15.03.2006
Moscow-Bavarian Joint Advanced Student School
19-29 March 2006, Moscow, Russia
Download