X10-Tutorial-Rice

advertisement
X10 Tutorial
IBM Research
X10 Tutorial
http://x10.sf.net
1
Vijay Saraswat
(based on tutorial co-authored with Vivek Sarkar,
Christoph von Praun, Nate Nystrom, Igor Peshansky)
August 2008
This material is based upon work
supported by the Defense Advanced
IBM Research
Research Projects Agency under its
Agreement No. HR0011-07-9-0002.
© 2007 IBM Corporation
X10 Tutorial
Acknowledgements



IBM Research

2
X10 Core Team (IBM)
– Ganesh Bikshandi, Sreedhar Kodali, Nathaniel
Nystrom, Igor Peshansky, Vijay Saraswat, Pradeep
Varma, Sayantan Sur, Olivier Tardieu, Krishna
Venkat, Tong Wen, Jose Castanos, Ankur Narang,
Komondoor Raghavan
X10 Tools
Recent Publications
1.
2.
3.
– Philippe Charles, Robert Fuhrer
4.
Emeritus
5.
6.
7.
– Kemal Ebcioglu, Christian Grothoff, Vincent Cave,
Lex Spoon, Christoph von Praun, Rajkishore Barik,
Chris Donawa, Allan Kielstra
8.
Research colleagues
– Vivek Sarkar, Rice U
– Satish Chandra,Guojing Cong
– Ras Bodik, Guang Gao, Radha Jagadeesan, Jens
Palsberg, Rodric Rabbah, Jan Vitek
– Vinod Tipparaju, Jarek Nieplocha (PNNL)
– Kathy Yelick, Dan Bonachea (Berkeley)
– Several others at IBM
9.
10.
11.
12.
“Solving large, irregular graph problems using adaptive work-stealing”, to
appear in ICPP 2008.
“Constrained types for OO languages”, to appear in OOPSLA 2008.
“Type Inference for Locality Analysis of Distributed Data Structures”, PPoPP
2008.
“Deadlock-free scheduling of X10 Computations with bounded resources”,
SPAA 2007
“A Theory of Memory Models”, PPoPP 2007.
“May-Happen-in-Parallel Analysis of X10 Programs”, PPoPP 2007.
“An annotation and compiler plug-in system for X10”, IBM Technical Report,
Feb 2007.
“Experiences with an SMP Implementation for X10 based on the Java
Concurrency Utilities” Workshop on Programming Models for Ubiquitous
Parallelism (PMUP), September 2006.
"An Experiment in Measuring the Productivity of Three Parallel Programming
Languages”, P-PHEC workshop, February 2006.
"X10: An Object-Oriented Approach to Non-Uniform Cluster Computing",
OOPSLA conference, October 2005.
"Concurrent Clustered Programming", CONCUR conference, August 2005.
"X10: an Experimental Language for High Productivity Programming of
Scalable Systems", P-PHEC workshop, February 2005.
Tutorials



TiC 2006, PACT 2006, OOPSLA 2006, PPoPP 2007, SC 2007
Graduate course on X10 at U Pisa (07/07)
Graduate course at Waseda U (Tokyo, 04/08)
© 2007 IBM Corporation
X10 Tutorial
Acknowledgements (contd.)
 X10 is an open source project (Eclipse
Public License).
 Reference implementation in Java, runs
on any Java 5 VM.
– Windows/Intel, Linux/Intel
– AIX/PPC, Linux/PPC
– Runs on multiprocessors
IBM Research
 X10Flash project --- cluster
implementation of X10 under
development at IBM
3
– Translation of X10 to C + LAPI
– Will not be discussed in today’s
tutorial
– Contact: Vijay Saraswat,
vsaraswa@us.ibm.com
 Website: http://x10.sf.net
 Website contains
–
–
–
–
–
–
Language specification
Tutorial material
Presentations
Download instructions
Copies of some papers
Pointers to mailing list
 This material is based upon work
supported in part by the Defense
Advanced Research Projects Agency
under its Agreement No. HR0011-07-90002.
© 2007 IBM Corporation
X10 Tutorial
Outline
1.
2.
3.
IBM Research
4.
4
5. Clocks
What is X10?
• creation, registration, next,
•
background, status
resume, drop,
Basic X10 (single place)
ClockUseException
•
async, finish, atomic
6. Basic serial constructs that
•
future, force
differ from Java
Basic X10 (arrays & loops)
• const, nullable, extern
•
points, rectangular regions,
7. Advanced topics
arrays
• Value types, conditional
•
for, foreach
atomic sections (when),
Scalable X10 (multiple places)
general regions &
•
places, distributions, distributed
distributions
arrays, ateach,
• Refer to language spec for
BadPlaceException
details
© 2007 IBM Corporation
X10 Tutorial
IBM Research
What is X10?
5

X10 is a new experimental language developed in the IBM
PERCS project as part of the DARPA program on High
Productivity Computing Systems (HPCS)

X10’s goal is to provide a new parallel programming model
and its embodiment in a high level language that:
1. is more productive than current models,
2. can support higher levels of abstraction better than current
models, and
3. can exploit the multiple levels of parallelism and non-uniform
data access that are critical for obtaining scalable
performance in current and future HPC systems
© 2007 IBM Corporation
X10 Tutorial
The X10 RoadMap
DARPA milestones
Internal milestone
MS5
MS6
MS7
MS8
MS9
MS10
Libraries, APIs, Tools, user trials
Trial at Rice U
X10DT
enhancements
Concur. refactorings
SSCA#2, UTS
Other Apps (TBD)
More refactorings
Language Definition
v1.7 spec
v2.0 spec
IBM Research
JVM Implementation
6
v1.7 impl
Initial debugger release
v2.0 impl
X10 Flash C/C++ Implementation
v1.7 impl Rel 1
v1.7 impl Rel 2
v2.0 impl
Port to PERCS h/w
Design of X10 Implementation for PERCS
Multi-process debugger
Advanced
debugger features
P
u
b
li
c
R
e
l
e
a
s
e
o
f
X
1
0
2
.
0
s
y
s
t
e
m
© 2007 IBM Corporation
X10 Tutorial
Current X10 Environment:
Unoptimized Single-process Implementation
X10 source program --- must contain a class
named Foo with a public static void main(String[]
args) method
Foo.x10
x10c Foo.x10
x10c
Foo.class
X10 compiler --- translates Foo.x10 to Foo.java,
uses javac to generate Foo.class from Foo.java
Foo.java
IBM Research
x10 Foo.x10
7
X10 Virtual Machine
(JVM + J2SE libraries +
X10 libraries +
X10 Multithreaded Runtime)
X10 Program Output
X10 program translated into Java --// #line pseudocomment in Foo.java
specifies source line mapping in Foo.x10
X10 extern
interface
External DLL’s
X10 Abstract Performance Metrics
(event counts, distribution efficiency)
Caveat: this is a prototype implementation with many limitations. Please be patient!
© 2007 IBM Corporation
X10 Tutorial
IBM Research
Eclipse environment for X10 – X10DT
8
© 2007 IBM Corporation
X10 Tutorial
Future X10 Environment
Parallelism
through
Domain Specific
Unordered Languages (Streaming,
Tasks
…)
X10 application programs
IBM Research
X10 Libraries
9
Implicit parallelism,
Implicit data distributions
X10 places --- abstraction
of explicit control & data
distribution
X10 Deployment
Mapping of places to nodes
in HPC Parallel Environment
HPC Runtime Environment
Primitive constructs for
parallelism, communication,
and synchronization
(Parallel Environment, MPI, LAPI, …)
HPC Parallel System
Target system for parallel
application
© 2007 IBM Corporation
X10 Tutorial
The current architectural landscape
SMP Node
SMP Node
PEs,
PEs,
...
PEs,
PEs,
...
Memory
...
Memory
Interconnect
Power6 Clusters
Blue Gene
IBM Research
I/O
gateway
nodes
10

(100’s of such
cluster nodes)
“Scalable Unit” Cluster Interconnect Switch/Fabric
Multi-core w/ accelerators (IXP 2850)
Road Runner: Cell-accelerated Opteron
© 2007 IBM Corporation
X10 Tutorial
X10 vs. Java

X10 is an extended subset of Java
– Base language = Java 1.4
• Java 5 features (generics, metadata, etc.) are currently not supported in
X10
– Notable features removed from Java
• Concurrency --- threads, synchronized, etc.
• Java arrays – replaced by X10 arrays
IBM Research
– Notable features added to Java
11
• Concurrency – async, finish, atomic, future, force, foreach, ateach, clocks
• Distribution --- points, distributions
• X10 arrays --- multidimensional distributed arrays, array reductions, array
initializers,
• Serial constructs --- nullable, const, extern, value types

X10 supports both OO and non-OO programming paradigms
© 2007 IBM Corporation
X10 Tutorial
x10.lang standard library
Java package with “built in” classes that provide support for selected X10
constructs

IBM Research






12
Standard types
– boolean, byte, char, double, float, int, long, short, String
x10.lang.Object -- root class for all instances of X10 objects
x10.lang.clock --- clock instances & clock operations
x10.lang.dist --- distribution instances & distribution operations
x10.lang.place --- place instances & place operations
x10.lang.point --- point instances & point operations
x10.lang.region --- region instances & region operations
All X10 programs implicitly import the x10.lang.* package, so the x10.lang
prefix can be omitted when referring to members of x10.lang.* classes

e.g., place.MAX_PLACES, dist.factory.block([0:100,0:100]), …
Similarly, all X10 programs also implicitly import the java.lang.* package


e.g., X10 programs can use Math.min() and Math.max() from java.lang
In case of conflict (e.g. Integer), user must import the desired one explicitly, e.g.
import java.lang.Integer;
© 2007 IBM Corporation
X10 Tutorial
Calling foreign functions from X10 programs
 Java methods
– Can be called directly from X10 programs
– Java class will be loaded automatically as part of X10
program execution
– Basic rule: don’t call any method that can perform
wait/notify or related thread operations
• Calling synchronized methods is okay
IBM Research
 C functions
13
– Need to use extern declaration in X10, and perform a
System.loadLibrary() call
© 2007 IBM Corporation
X10 Tutorial
Outline
1.
2.
3.
IBM Research
4.
14
5. Clocks
What is X10?
• creation, registration, next,
•
background, status
resume, drop,
Basic X10 (single place)
ClockUseException
•
async, finish, atomic
6. Basic serial constructs that
•
future, force
differ from Java
Basic X10 (arrays & loops)
• const, nullable, extern
•
points, rectangular regions,
7. Advanced topics
arrays
• Value types, conditional
•
for, foreach
atomic sections (when),
Scalable X10 (multiple places)
general regions &
•
places, distributions, distributed
distributions
arrays, ateach,
• Refer to language spec for
BadPlaceException
details
© 2007 IBM Corporation
X10 Tutorial
IBM Research
Basic X10 (Single Place)
15
 async = construct used to execute
a statement in parallel as a new
activity
 atomic = construct used to
coordinate accesses to shared
heap by multiple activities
 finish = construct used to check for
global termination of statement and
all the activities that it has created
 future = construct used to evaluate
an expression in parallel as a new
activity
 force = construct used to check for
termination of future
Core constructs used for intra-place (shared memory) parallel programming
© 2007 IBM Corporation
X10 Tutorial
async
IBM Research
Stmt ::= async Stmt
16
async S
 Creates a new child activity
that executes statement S
 Returns immediately
 S may reference final
variables in enclosing blocks
 Activities cannot be named
 Activity cannot be aborted or
cancelled
cf Cilk’s spawn
void run() {
if (r < 2) return;
final Fib f1 = new Fib(r-1),
f2 = new Fib(r-2);
finish {
async f1.run();
f2.run();
}
result = f1.r + f2.r;
}
See also TutAsync.x10
© 2007 IBM Corporation
X10 Tutorial
finish
finish S
 Execute S, but wait until all
(transitively) spawned asyncs have
terminated.
Stmt ::= finish Stmt
cf Cilk’s sync
void run() {
IBM Research
if (r < 2) return;
17
Rooted exception model
 Trap all exceptions thrown by
spawned activities.
 Throw an (aggregate) exception if
any spawned async terminates
abruptly.
 implicit finish at main activity
final Fib f1 = new Fib(r-1),
f2 = new Fib(r-2);
finish {
async f1.run();
f2.run();
}
result = f1.r + f2.r;
finish is useful for expressing
“synchronous” operations on
(local or) remote data.
}
© 2007 IBM Corporation
X10 Tutorial
IBM Research
Granularity of concurrency
18
 Too many tasks can reduce
performance.
– Overhead for task creation,
scheduling, interference,
cache effects.
 Too few tasks may result in
under-utilization.
– Idle workers
 In some cases, workload can
be divided equally among all
tasks statically
– E.g. array-based
computations
 Some problems may need
dynamic load-balancing
– Work-stealing in X10 1.7.
 Identify tasks that can be
performed in parallel
– Use async to specify them
 Identify point in code by
which tasks must be finished
– Use finish
 If tasks will read/write shared
variables during execution,
ensure atomicity of updates
– Use atomic
 In principle, create k*P tasks,
where P is the number of
processors available.
© 2007 IBM Corporation
X10 Tutorial
Example programs
 Fibonacci
 NQueens
 Tree
IBM Research
 How would you parallelize these computations?
19
© 2007 IBM Corporation
X10 Tutorial
Termination
Local termination:
Statement s terminates locally when activity has completed all its
computation with respect to s.
IBM Research
Global termination:
Local termination + activities that have been spawned by s
terminated globally (recursive definition)
20
 main function is root activity
 program terminates iff root activity terminates.
(implicit finish at root activity)
 ‘daemon threads’ (child outlives root activity) not
allowed in X10
© 2007 IBM Corporation
X10 Tutorial
Termination (Example)
IBM Research
termination
start local global
21
public void main (String[] args) {
...
finish {
async {
for () {
async {...
}
}
finish async {...
}
...
}
} // finish
}
© 2007 IBM Corporation
X10 Tutorial
IBM Research
Rooted computation X10
22
public void main (String[] args) {
...
finish {
async {
Root-of hierarchy
for () {
async {...
root activity
}
}
finish async {...
}
...
}
...
} // finish
}
ancestor
relation
© 2007 IBM Corporation
X10 Tutorial
IBM Research
Rooted exception model
23
public void main (String[] args) {
...
finish {
root-of relation
async {
for () {
async {...
}
}
finish async {...
}
...
...
}
} // finish
exception flow along
}
root-of relation
Propagation along the lexical scoping:
Exceptions that are not caught inside an activity are propagated
to the nearest suspended ancestor in the root-of relation.
© 2007 IBM Corporation
X10 Tutorial
Example: rooted exception model (async)
IBM Research
int result = 0;
try {
finish {
ateach (point [i]:dist.factory.unique()) {
throw new Exception (“Exception from “+here.id)
}
throw new Exception();
result = 42;
} // finish
} catch (x10.lang.MultipleExceptions me) {
System.out.print(me);
}
assert (result == 42); // always true
24
 no exceptions are ‘thrown on the floor’
 exceptions are propagated across activity and place
boundaries
© 2007 IBM Corporation
X10 Tutorial
Behavioral annotations
nonblocking
On any input store, a nonblocking method can continue execution or
terminate. (dual: blocking, default: nonblocking)
recursively nonblocking
Nonblocking, and every spawned activity is recursively nonblocking.
IBM Research
local
A local method guarantees that its execution will only access variables
that are local to the place of the current activity.
(dual: remote, default: local)
25
sequential
Method does not create concurrent activities.
In other words, method does not use async, foreach, ateach.
(dual: parallel, default: parallel)
Sequential and nonblocking imply recursively nonblocking.
© 2007 IBM Corporation
X10 Tutorial
atomic
 Atomic blocks are conceptually
executed in a single step while
other activities are suspended:
isolation and atomicity.
IBM Research
 An atomic block ...
26
– must be nonblocking
– must not create concurrent
activities (sequential)
– must not access remote data
(local)
Stmt ::= atomic Statement
MethodModifier ::= atomic
// target defined in lexically
// enclosing scope.
atomic boolean CAS(Object old,
Object new) {
if (target.equals(old)) {
target = new;
return true;
}
return false;
}
// push data onto concurrent
// list-stack
Node node = new Node(data);
atomic {
node.next = head;
head = node;
}
© 2007 IBM Corporation
X10 Tutorial
Statically checked conditions on atomic blocks
An atomic block must...be local, sequential, nonblocking:
IBM Research
 ...not include blocking operations
– no await, no when, no calls to blocking methods
 ... not include access to data at remote places
– no ateach, no future, only calls to local methods
 ... not spawn other activities
– no async, no foreach, only calls to sequential methods
27
© 2007 IBM Corporation
X10 Tutorial
Exceptions in atomic blocks
IBM Research
 Atomicity guarantee only for successful execution.
– Exceptions should be caught inside atomic block
– Explicit undo in the catch handler
28
boolean move(Collection s, Collection d, Object o) {
atomic {
if (!s.remove(o)) {
return false; // object not found
} else {
try {
d.add(o);
} catch (RuntimeException e) {
s.add(o); // explicit undo
throw e; // exception
}
return true; // move succeeded
}
}
}
 (Uncaught) exceptions propagate across the atomic block boundary;
atomic terminates on normal or abrupt termination of its block.
© 2007 IBM Corporation
X10 Tutorial
Data races with async / foreach
final double arr[R] =
…; // global array
IBM Research
class ReduceOp {
double accu = 0.0;
double sum ( double[.] arr ) {
finish foreach (point p: arr) {
atomic accu += arr[p];
}
concurrent conflicting
return accu;
access to shared variable:
}
data race
29
X10 guideline for avoiding data races:
 access shared variables inside an atomic block
 combine ateach and foreach with finish
 declare data to be read-only where possible (final or value type)
© 2007 IBM Corporation
X10 Tutorial
NQueens: Searching in parallel
 NQueens.x10
IBM Research
 How should this be parallelized?
30
© 2007 IBM Corporation
X10 Tutorial
future
Expr ::= future PlaceExpSingleListopt {Expr }
IBM Research
future (P) S
 Creates a new child activity at
place P, that executes statement
S;
 Returns immediately.
 S may reference final variables
in enclosing blocks.
31
future vs. async
 Return result from
asynchronous computation
 Tolerate latency of remote
access.
Considering addition of a delayed
future: needs run() to be called
before it is activated
// global dist. array
final double a[D] = …;
final int idx = …;
future<double> fd =
future (a.distribution[idx])
{
// executed at a[idx]’s
// place
a[idx];
};
future type
 no subtype relation between T
and future<T>
© 2007 IBM Corporation
X10 Tutorial
future example
public class TutFuture1 {
static int fib (final int n) {
if ( n <= 0 ) return 0;
if ( n == 1 ) return 1;
future<int> x = future { fib(n-1) };
int y = fib(n-2);
return x.force() + y;
}
IBM Research
public static void main(String[] args) {
System.out.println("fib(10) = " + fib(10));
}
32
}
 Divide and conquer: recursive calls execute concurrently.
© 2007 IBM Corporation
X10 Tutorial
Example: rooted exception model (future)
IBM Research
double div (final double divisor)
future<double> f = future { return 42.0 / divisor; }
double result;
try {
result = f.force();
} catch (ArithmeticException e) {
result = 0.0;
}
return result;
}
33
 Exception is propagated when the future is forced.
© 2007 IBM Corporation
X10 Tutorial
Futures can deadlock
nullable<future<int>> f1=null;
nullable<future<int>> f2=null;
void main(String[] args) {
f1 = future(here){a1()};
f2 = future(here){a2()};
f1.force();
}
IBM Research
cyclic wait condition
34
int a1() {
nullable<future<int>> tmp=null;
do {
tmp=f2;
} while (tmp == null);
return tmp.force();
}
int a2() {
nullable<future<int>> tmp=null;
do {
tmp=f1;
} while (tmp == null);
return tmp.force();
}
X10 guidelines to avoid deadlock:
 avoid futures as shared variables
 force called by same activity that created body of future, or a
descendent.
© 2007 IBM Corporation
X10 Tutorial
IBM Research
Outline
35
5. Clocks
1. What is X10?
• creation, registration, next,
• background, status
resume, drop,
2. Basic X10 (single place)
ClockUseException
• async, finish, atomic
6. Basic serial constructs that
• future, force
differ from Java
• const, nullable, extern
3. Basic X10 (arrays & loops)
• points, rectangular regions, 7. Advanced topics
• Value types, conditional
arrays
atomic sections (when),
• for, foreach
general regions &
4. Scalable X10 (multiple places)
distributions
• places, distributions,
• Refer to language spec for
distributed arrays, ateach,
details
BadPlaceException
© 2007 IBM Corporation
X10 Tutorial
point
A point is an element of an n-dimensional Cartesian
space (n>=1) with integer-valued coordinates e.g., [5], [1, 2], …
– Dimensions are numbered from 0 to n-1
– n is also referred to as the rank of the point
A point variable can hold values of different ranks e.g.,
– point p; p = [1]; … p = [2,3]; …
Operations
– p1.rank
IBM Research
• returns rank of point p1
36
– p1[i]
• returns element (i mod p1.rank) if i < 0 or i >= p1.rank
– p1.lt(p2), p1.le(p2), p1.gt(p2), p1.ge(p2)
• returns true iff p1 is lexicographically <, <=, >, or >= p2
• only defined when p1.rank and p1.rank are equal
© 2007 IBM Corporation
X10 Tutorial
Syntax extensions for points
 Implicit syntax for points:
point p = [1,2] 
point p = point.factory(1,2)
IBM Research
 Exploded variable declarations for points:
point p [i,j]
// final int i,j
37
 Typical uses :
– region R = [0:M-1,0:N-1];
– for (point p [i, j] : R) { ... }
– for (point [i, j] : R) { ... }
– point sum (point [i,j], point [k, l])
{ return [i+k, j+l]; }
– int [.] iarr = new int [R] (point [i,j]) { return i; }
© 2007 IBM Corporation
X10 Tutorial
IBM Research
Example: point (TutPoint1)
38
public class TutPoint {
public static void main(String[] args) {
point p1 = [1,2,3,4,5];
point p2 = [1,2];
point p3 = [2,1];
System.out.println("p1 = " + p1 +
" ; p1.rank = " + p1.rank +
" ; p1[2] = " + p1[2]);
System.out.println("p2 = " + p2 +
" ; p3 = " + p3 + " ; p2.lt(p3) = " +
p2.lt(p3));
}
}
Console output:
p1 = [1,2,3,4,5] ; p1.rank = 5 ; p1[2] = 3
p2 = [1,2] ; p3 = [2,1] ; p2.lt(p3) = true
© 2007 IBM Corporation
X10 Tutorial
Rectangular regions
A rectangular region is the set of points contained in a rectangular subspace
A region variable can hold values of different ranks e.g.,
– region R; R = [0:10]; … R = [-100:100, -100:100]; … R = [0:-1]; …
IBM Research
Operations
39
–
–
–
–
–
–
–
–
–
–
–
–
–
R.rank ::= # dimensions in region;
R.size() ::= # points in region
R.contains(P) ::= predicate if region R contains point P
R.contains(S) ::= predicate if region R contains region S
R.equal(S) ::= true if region R equals region S
R.rank(i) ::= projection of region R on dimension i (a one-dimensional region)
R.rank(i).low() ::= lower bound of ith dimension of region R
R.rank(i).high() ::= upper bound of ith dimension of region R
R.ordinal(P) ::= ordinal value of point P in region R
R.coord(N) ::= point in region R with ordinal value = N
R1 && R2 ::= region intersection (will be rectangular if R1 and R2 are rectangular)
R1 || R2 ::= union of regions R1 and R2 (may not be rectangular)
R1 – R2 ::= region difference (may not be rectangular)
© 2007 IBM Corporation
X10 Tutorial
Example: region (TutRegion1)
IBM Research
public class TutRegion {
public static void main(String[] args) {
region R1 = [1:10, -100:100];
System.out.println("R1 = " + R1 + " ; R1.rank = " +
R1.rank + " ; R1.size() = " + R1.size() + " ;
R1.ordinal([10,100]) = " + R1.ordinal([10,100]));
region R2 = [1:10,90:100];
System.out.println("R2 = " + R2 + " ; R1.contains(R2) =
" + R1.contains(R2) + " ; R2.rank(1).low() = " +
R2.rank(1).low() + " ; R2.coord(0) = " + R2.coord(0));
}
}
40
Console output:
R1 = {1:10,-100:100} ; R1.rank = 2 ; R1.size() = 2010 ;
R1.ordinal([10,100]) = 2009
R2 = {1:10,90:100} ; R1.contains(R2) = true ;
R2.rank(1).low() = 90 ; R2.coord(0) = [1,90]
© 2007 IBM Corporation
X10 Tutorial
Syntax extensions for regions
Region constructors
int hi, lo;
region r = hi;
 region r = region.factory.region(0, hi)
region r = [low:hi];
IBM Research
 region r = region.factory.region(lo, hi)
41
region r1, r2; // 1-dim regions
region r = [r1, r2];
 region r = region.factory.region(r1, r2);
// 2-dim region
© 2007 IBM Corporation
X10 Tutorial
X10 arrays


Java arrays are one-dimensional and local
– e.g., array args in main(String[] args)
– Multi-dimensional arrays are represented as “arrays of arrays” in
Java
X10 has true multi-dimensional arrays (as Fortran) that can be
distributed (as in UPC, Co-Array Fortran, ZPL, Chapel, etc.)
IBM Research
Array
–
–
Array
–
42
declaration
T [.] A declares an X10 array with element type T
An array variable can refer to arrays with different rank
allocation
new T [ R ] creates a local rectangular X10 array with
rectangular region R as the index domain and T as the element
(range) type
– e.g., int[.] A = new int[ [0:N+1, 0:N+1] ];
Array initialization
– elaborate on a slide that follows...
© 2007 IBM Corporation
X10 Tutorial
Array declaration syntax: [] vs [.]
IBM Research
General arrays: <Type>[.]
– one or multidimensional arrays
– can be distributed
– arbitrary region
43
Special case (“rail”): <Type>[]
– 1 dimensional
– 0-based, rectangular array
– not distributed
– can be used in place of general arrays
– supports compile-time optimization
© 2007 IBM Corporation
X10 Tutorial
Simple array operations
A.rank ::= # dimensions in array
A.region ::= index region (domain) of array
A.distribution ::= distribution of array A
A[P] ::= element at point P, where P belongs to A.region
A | R ::= restriction of array onto region R
– Useful for extracting subarrays
IBM Research





44
© 2007 IBM Corporation
X10 Tutorial
IBM Research
Aggregate array operations
45
 A.sum(), A.max() ::= sum/max of elements in array
 A1 <op> A2
– returns result of applying a pointwise op on array
elements, when A1.region = A2. region
– <op> can include +, -, *, and /
 A1 || A2 ::= disjoint union of arrays A1 and A2
(A1.region and A2.region must be disjoint)
 A1.overlay(A2)
– returns an array with region, A1.region || A2.region, with
element value A2[P] for all points P in A2.region and A1[P]
otherwise.
© 2007 IBM Corporation
X10 Tutorial
Example: arrays (TutArray1)
IBM Research
public class TutArray1 {
public static void main(String[] args) {
int[.] A = new int[ [1:10,1:10] ]
(point [i,j]) { return i+j;} ;
System.out.println("A.rank = " + A.rank +
" ; A.region = " + A.region);
int[.] B = A | [1:5,1:5];
System.out.println("B.max() = " + B.max());
}
array copy
}
46
Console output:
A.rank = 2 ; A.region = {1:10,1:10}
B.max() = 10
© 2007 IBM Corporation
X10 Tutorial
Initialization of mutable arrays
Mutable array with nullable references to mutable objects:
nullable<RefType> [.] farr = new RefType[R];
// init with null value
Mutable array with references to mutable objects:
RefType [.] farr = new RefType [R];
// compile-time error, init required
RefType [.] farr = new RefType [R] (point[i]) { return RefType(here, i); }
Execution of initializer is implicitly parallel / distributed (pointwise operation):
IBM Research
That hold ‘reference to value objects’ (value object can be inlined by imp.)
47
int [.] iarr = new int[N] ; // init with default value, 0
int [.] iarr = new int[.] {1, 2, 3, 4}; // Java style
ValType [.] V = new ValType[N] (point[i])
{ return ValType(i);}; // explicit init
© 2007 IBM Corporation
X10 Tutorial
Initialization of value arrays
Initialization of value arrays requires an initializer.
Value array of reference to mutable objects:
RefType value [.] farr = new value RefType [N];
// compile-time error, init required
RefType value [.] farr = new value RefType [N] (point[i])
{ return new Foo(); }
Value array of ‘reference to value objects’ (value object can be inlined)
IBM Research
int value [.] iarr = new value int[.] {1, 2, 3, 4};
// Java style init
48
ValType value [.] iarr = new value ValType[N] (point[i])
{ return ValType(i); };
// explicit init
© 2007 IBM Corporation
X10 Tutorial
foreach
foreach ( FormalParam: Expr ) Stmt
foreach (point p: R) S
 Creates |R| async statements in parallel at current place.
foreach (point p:R) S
for (point p: R)
async { S }
IBM Research
 Termination of all (recursively created) activities can be ensured
with finish.
49
 finish foreach is a convenient way to achieve master-slave
fork/join parallelism (OpenMP programming model)
© 2007 IBM Corporation
X10 Tutorial
Summing up elements of an array in parallel
 See ArraySum.x10
IBM Research
 When P is small, local sums can be summed up
serially.
50
© 2007 IBM Corporation
X10 Tutorial
Iterative Averaging with a 1-D stencil
IBM Research
 Problem Statement
– Initialize a 1-D array, A[0:n+1] with A[0:n] = 0 & A[n+1] = n+1
– A[0] = 0 and A[n+1] = n+1 are fixed boundary conditions
– Iteratively compute new values for A[1:n] by averaging the values of the two
neighboring elements
– Terminate when the sum of element changes is less than a given threshold
 Acknowledgment
– Example and codes for C + MPI, UPC, CAF versions were provided by Steve
Deitz and Brad Chamberlain from Cray, {deitz,bradc}@cray.com
51
51
© 2007 IBM Corporation
X10 Tutorial
1-D stencil: parallelism within a place
 See Stencil1D.
 Uses x10.util.dist.Distribution.block(R, P) to blockdivide a region into P regions.
IBM Research
 Exercise: Parallelize Pascal’s triangle.
52
© 2007 IBM Corporation
X10 Tutorial
IBM Research
Outline
53
1. What is X10?
• background, status
2. Basic X10 (single place)
• async, finish, atomic
• future, force
3. Basic X10 (arrays & loops)
• points, rectangular regions,
arrays
• for, foreach
4. Scalable X10 (multiple places)
• places, distributions,
distributed arrays, ateach,
BadPlaceException
5.
6.
7.
Clocks
•
creation, registration, next,
resume, drop,
ClockUseException
Basic serial constructs that differ
from Java
•
const, nullable, extern
Advanced topics
•
Value types, conditional atomic
sections (when), general
regions & distributions
•
Refer to language spec for
details
© 2007 IBM Corporation
X10 Tutorial
Limitations of using a Single Place
Immutable Data (I)
-- final variables,
value type
instances

Activities
IBM Research
Locally
Synchronous
(coherent access
to intra-place
shared heap)
54
...
Activity
Stacks (S)
Storage classes:
 Immutable Data (I)
 Shared Heap (H)
 Activity Stacks (S)
Place 0
Shared Heap (H)
Largest deployment granularity for a
single place is a single SMP
– Smallest granularity can be a
single CPU or even a single
hardware thread
 Single SMP is inadequate for
solving problems with large memory
and compute requirements
 X10 solution: incorporate multiple
places as a core foundation of the
X10 programming model
 Enable deployment on large-scale
clustered machines, with integrated
support for intra-place parallelism
© 2007 IBM Corporation
X10 Tutorial
What is Partitioned Global Address Space (PGAS)?
IBM Research
Process/Thread
55
Address Space
Message passing
Shared Memory
PGAS
MPI
OpenMP
UPC, CAF, X10
 Computation is performed in
multiple places.
 A place contains data that can be
operated on remotely.
 Data lives in the place it was
created, for its lifetime.
 A datum in one place may
reference a datum in another place.
 Data-structures (e.g. arrays) may
be distributed across many places.
 Places may have different
computational properties (e.g. PPE,
SPE, …).
A place expresses locality.
© 2007 IBM Corporation
X10 Tutorial
IBM Research
What is Asynchronous PGAS?
56
 Asynchrony
– Simple explicitly concurrent
model for the user: async (p) S
runs statement S “in parallel” at
place p
– Controlled through finish, and
local (conditional) atomic
 Used for active messaging
(remote asyncs), DMAs, finegrained concurrency,
fork/join concurrency, doall/do-across parallelism
– SPMD is a special case
Concurrency is made explicit and programmable.
© 2007 IBM Corporation
X10 Tutorial
Places in X10






place.MAX_PLACES = total number of places (runtime constant)
place.places = value array of all places in an X10
place.factory.place(i) = place corresponding to index i
here = place in which current activity is executing
<place-expr>.toString() returns a string of the form “place(id=99)”
<place-expr>.id returns the id of the place
IBM Research
X10 programs defines mapping from X10
objects to X10 places, and abstract
performance metrics on places
57
X10 Data Structures
X10 Places
Future X10 deployment system will define
mapping from X10 places to processes, and
processes to nodes.
Processes
Nodes
© 2007 IBM Corporation
X10 Tutorial
IBM Research
Specifying number of places in X10DT
58
© 2007 IBM Corporation
X10 Tutorial
IBM Research
Locality rule
59
 An activity executing an
 Activity may access remote
atomic block may access
data synchronously (outside
only local data.
an atomic block).
 Locality of data determined
– Write (update) access is as
through types.
if performed in a
finish/async.
– Data of type Foo! is
potentially remote.
– Read access is as if
performed in future/force.
– Data of type Foo!p is at place
p.
 Programmer may always use
explicit asyncs to improve
– Data of type Foo is local.
locality of computation.
 Data may be cast to local
type. Failure throws
Current implementation limitation
BadPlaceException.
Place types not yet supported.
Atomic block does not check array
accesses are place local.
© 2007 IBM Corporation
X10 Tutorial
Distributions in X10
A distribution maps every point in a region to a place.
IBM Research
Creating distributions (x10.lang.dist):
– dist D1 = dist.factory.constant(R, here); // local distribution
– maps region R to here
– dist D2 = dist.factory.block(R); // blocked distribution
– dist D3 = dist.factory.cyclic(R); // cyclic distribution
– dist D4 = dist.factory.unique(); // identity map on
[0:MAX_PLACES-1]
60
© 2007 IBM Corporation
X10 Tutorial
async and future with explicit place specifier
async (P) S
 Creates new activity to execute statement S at place P
 async S is equivalent to async (here) S
future (P) { E }
 Create new activity to evaluate expression E at place P
 future { E } is equivalent to future (here) { E }
IBM Research
Note that here in a child activity for an async/future computation will refer to
the place P at which the child activity is executing, not the place where
the parent activity is executing
61
Specify the destination place for async/future activities so as to obey the
Locality rule e.g.,
async (O.location) O.x = 1;
future<int> F = future (A.distribution[i]) { A[i] } ;
© 2007 IBM Corporation
X10 Tutorial
Inter-place communication using async and future
Question: how to assign A[i] = B[j], when A[i] and B[j] may be
in different places?
Answer #1: Use nested async:
IBM Research
finish async ( B.distribution[j] ) {
final int bb = B[j];
async ( A.distribution[i] ) A[i] = bb;
}
62
Answer #2: Use future-force and an async:
final int b = future (B.distribution[j])
{ B[j] }.force();
finish async ( A.distribution[i] ) A[i] = b;
© 2007 IBM Corporation
X10 Tutorial
ateach (distributed parallel iteration)
ateach ( FormalParam: Expr ) Stmt
ateach (point p:D) S
 Creates |D| async statements in parallel at place specified by
distribution.
ateach (point p:D) S
for (point p:D.region)
async (D[p]) { S }
IBM Research
 Termination of all (recursively created) activities with finish.
 ateach is a convenient construct for writing parallel matrix code
that is independent of the underlying distribution, e.g.,
63
ateach ( point p : A.distribution )
A[p] = f(B[p], C[p], D[p]) ;
 SPMD computation:
finish ateach( point[i] : dist.factory.unique() ) S
© 2007 IBM Corporation
X10 Tutorial
Example: ateach (TutAteach1)
public class TutAteach1 {
public static void main(String args[]) {
finish ateach (point p: dist.factory.unique()) {
System.out.println("Hello from " + here.id);
}
} // main()
}
IBM Research
Console output:
64
Hello
Hello
Hello
Hello
from
from
from
from
unique distribution: maps point i in
region [0 : place.MAX_PLACES-1]
to place place.factory.place(i).
1
0
3
4
© 2007 IBM Corporation
X10 Tutorial
AllDistribution
 See AllDistributionP2P.x10
IBM Research
 See AllDistributionP2PRemoteAccess.x10
65
© 2007 IBM Corporation
X10 Tutorial
IBM Research
Butterfly communication
66
© 2007 IBM Corporation
X10 Tutorial
IBM Research
Outline
67
1. What is X10?
• background, status
2. Basic X10 (single place)
• async, finish, atomic
• future, force
3. Basic X10 (arrays & loops)
• points, rectangular regions,
arrays
• for, foreach
4. Scalable X10 (multiple places)
• places, distributions,
distributed arrays, ateach,
BadPlaceException
5.
6.
7.
Clocks
•
creation, registration, next,
resume, drop,
ClockUseException
Basic serial constructs that differ
from Java
•
const, nullable, extern
Advanced topics
•
Value types, conditional atomic
sections (when), general
regions & distributions
•
Refer to language spec for
details
© 2007 IBM Corporation
X10 Tutorial
Clocks: Motivation


Activity coordination using finish and force() is accomplished by
checking for activity termination
But in many cases activities have a producer-consumer relationship and
a “barrier”-like coordination is needed without waiting for activity
termination
– The activities involved may be in the same place or in different places
 Design clocks to offer determinate and deadlock-free coordination
between a dynamically varying number of activities.
Phase 0
IBM Research
Phase 1
68
...
Activity 0
Activity 1
Activity 2
...
© 2007 IBM Corporation
X10 Tutorial
Clocks (1/2)
clock c = clock.factory.clock();
 Allocate a clock, register current activity with it. Phase 0 of c starts.
async(…) clocked (c1,c2,…) S
ateach(…) clocked (c1,c2,…) S
foreach(…) clocked (c1,c2,…) S
 Create async activities registered on clocks c1, c2, …
IBM Research
c.resume();
 Nonblocking operation that signals completion of work by current
activity for this phase of clock c
69
next;
 Barrier --- suspend until all clocks that the current activity is registered
with can advance. c.resume() is first performed for each such clock, if
needed.
 Next can be viewed like a “finish” of all computations under way in the
current phase of the clock
© 2007 IBM Corporation
X10 Tutorial
Clocks (2/2)
c.drop();
 Unregister with c. A terminating activity will implicitly drop all clocks that
it is registered on.
c.registered()
 Return true iff current activity is registered on clock c
 c.dropped() returns the opposite of c.registered()
IBM Research
ClockUseException
 Thrown if an activity attempts to transmit or operate on a clock that it is
not registered on
 Or if an activity attempts to transmit a clock in the scope of a finish
70
© 2007 IBM Corporation
X10 Tutorial
Semantics
Static semantics
– An activity may operate only on those clocks it is registered with.
– In finish S,S may not contain any (top-level) clocked asyncs.
IBM Research
Dynamic semantics
– A clock c can advance only when all its registered activities have
executed c.resume().
– An activity may not pass-on clocks on which it is not live to subactivities.
– An activity is deregistered from a clock when it terminates
71
© 2007 IBM Corporation
X10 Tutorial
IBM Research
Example (TutClock1.x10)
72
finish async {
final clock c = clock.factory.clock();
foreach (point[i]: [1:N]) clocked (c) {
parent transmits clock
while ( true ) {
to child
int old_A_i = A[i];
int new_A_i = Math.min(A[i],B[i]);
if ( i > 1 )
new_A_i = Math.min(new_A_i,B[i-1]);
if ( i < N )
new_A_i = Math.min(new_A_i,B[i+1]);
A[i] = new_A_i;
next;
int old_B_i = B[i];
int new_B_i = Math.min(B[i],A[i]);
if ( i > 1 )
new_B_i = Math.min(new_B_i,A[i-1]);
if ( i < N )
new_B_i = Math.min(new_B_i,A[i+1]);
B[i] = new_B_i;
next;
if ( old_A_i == new_A_i && old_B_i == new_B_i )
break;
exiting from while loop
} // while
terminates activity for
} // foreach
} // finish async
iteration i, and automatically
deregisters activity from clock
© 2007 IBM Corporation
X10 Tutorial
Example (TutClock1.x10)
hierarchical static dynamic concurrency
concurrency model
model
...
...
IBM Research
...
73
activities
(foreach, finish)
clock phases
...
© 2007 IBM Corporation
X10 Tutorial
Clock safety
 An activity may be registered on one or more clocks
 Clock c can advance only when all activities registered
with the clock have executed c.resume().
IBM Research
Runtime invariant: Clock operations are guaranteed to
be deadlock-free and determinate.
74
© 2007 IBM Corporation
X10 Tutorial
Deadlock freedom
IBM Research
 Central theorem of X10:
– Arbitrary programs with
async, atomic, finish (and
clocks) are deadlock-free.
75
 Key intuition:
– atomic is deadlock-free.
– finish has a tree-like
structure.
– clocks are made to satisfy
conditions which ensure treelike structure.
– Hence no cycles in wait-for
graph.
 Where is this useful?
– Whenever synchronization
pattern of a program is
independent of the data read
by the program
– True for a large majority of
HPC codes.
– (Usually not true of reactive
programs.)
© 2007 IBM Corporation
X10 Tutorial
Examples using clocks
 See AllReductionBarrier.
 See Streams, specifically Sieve of Eratosthenes, Fib.
IBM Research
 Exercise: Try Pascal’s triangle.
76
© 2007 IBM Corporation
X10 Tutorial
IBM Research
Outline
77
1. What is X10?
• background, status
2. Basic X10 (single place)
• async, finish, atomic
• future, force
3. Basic X10 (arrays & loops)
• points, rectangular regions,
arrays
• for, foreach
4. Scalable X10 (multiple places)
• places, distributions,
distributed arrays, ateach,
BadPlaceException
5.
6.
7.
Clocks
•
creation, registration, next,
resume, drop,
ClockUseException
Basic serial constructs that differ
from Java
•
const, nullable, extern
Advanced topics
•
Value types, conditional atomic
sections (when), general
regions & distributions
•
Refer to language spec for
details
© 2007 IBM Corporation
X10 Tutorial
when
Stmt ::= WhenStmt
WhenStmt ::= when ( Expr ) Stmt |
WhenStmt or (Expr) Stmt
 when (E) S
– Activity suspends until a state in which
the guard E is true.
– In that state, S is executed atomically
and in isolation.
class OneBuffer {
nullable<Object> datum = null;
boolean filled = false;
void send(Object v) {
when ( ! filled ) {
datum = v;
filled = true;
}
}
 Guard E
– boolean expression
IBM Research
– must be nonblocking
– must not create concurrent activities
(sequential)
– must not access remote data (local)
– must not have side-effects (const)
78
 await (E)
– syntactic shortcut for when (E) ;
}
Object receive() {
when ( filled ) {
Object v = datum;
datum = null;
filled = false;
return v;
}
}
© 2007 IBM Corporation
X10 Tutorial
Static semantics of guard for when / await
IBM Research
 boolean field
 boolean expression with field access or constant values
79
class BufferBuffer {
..
void send(Object v) {
when (size() < MAX_SIZE)
{
datum = v;
filled = true;
}
}
...
}
compile-time error
© 2007 IBM Corporation
X10 Tutorial
Semaphores
class Semaphore {
private boolean taken;
void p() {
when (!taken)
taken = true;
}
atomic void v() {
taken = false;
}
IBM Research
}
80
© 2007 IBM Corporation
X10 Tutorial
Value types : immutable instances
IBM Research
value class
– Can only extend value class
or x10.lang.Object.
– All fields are implicitly final
– Can only be extended by
value classes.
– May contain fields with
reference type.
– May be implemented by
reference or copy.
81
Values are equal (==) if their
fields are equal, recursively.
public value complex {
double im, re;
public complex(double im,
double re) {
this.im = im;
this.re = re;
}
public complex add(complex a)
{
return new complex(im+a.im,
re+a.re);
} …
}
© 2007 IBM Corporation
X10 Tutorial
Nullable Types
 Unlike Java, reference types
in X10 do not contain the
value null by default.
– A method invocation on a
variable of a reference type
cannot throw an NPE.
 Nullable may be applied to
value types as well.
– nullable<int> : value is null or
an int.
IBM Research
 The nullable type constructor
can be used to add null to a
type:
82
– nullable<Foo> : values of this
type are references to Foo or
null.
– nullable<nullable<Foo>> is
the same as nullable<Foo>
© 2007 IBM Corporation
X10 Tutorial
Dependent types
IBM Research
 Classes and interfaces may
define properties
– final instance fields.
83
 Types may contain a where
clause: Foo(:c)
– c is a condition, a conjunction
of equalities.
– c may reference final
variables in environment or
properties of Foo.
– c may reference special
variable self, of type Foo.
 Dependent types may be
used everywhere where
types are used.
– Values may be cast to
dependent types.
– Code is generated to check
values of properties at
runtime.
 Examples
– foo(:place==p)
– region(:rank==2)
– dist(:region==r)
© 2007 IBM Corporation
IBM Research: Software Technology
Programming Technologies
X10 Summary
84
© 2005 IBM Corporation
X10 Tutorial
IBM Research
X10 v1.5 Language Summary
85
 async [(Place)] [clocked(c…)] Stm
– Run Stm asynchronously at Place
 finish Stm
– Execute Stm, wait for all asyncs to terminate
 Region
– Collection of index points, e.g.
region r = [1:N,1:M];
 foreach ( point P : Reg) Stm
– Run Stm asynchronously for each point in
region
 Distribution
– Mapping from region to places, e.g.
•
dist d = dist.factory.block(r);
 ateach ( point P : Dist) Stm
– Run Stm asynchronously for each point in dist,
in its place.
 new T
– Allocate object at this place (here)
 atomic Stm
– Execute Stm atomically
 next
– suspend till all clocks that the current activity is
registered with can advance
– Clocks are a generalization of barriers and MPI
communicators
 future [(Place)] [clocked(c…)] Expr
– Compute Expr asynchronously at Place
 F. force()
– Block until future F has been computed
 atomic Stm
– Execute Stm atomically
 extern
– Lightweight interface to native code
Deadlock safety: any X10 program written with async, atomic,
finish, foreach, ateach, and clocks can never deadlock
© 2007 IBM Corporation
X10 Tutorial
x10.lang standard library
Java package with “built in” classes that provide support for selected X10
constructs

IBM Research






86
Standard types
– boolean, byte, char, double, float, int, long, short, String
x10.lang.Object -- root class for all instances of X10 objects
x10.lang.clock --- clock instances & clock operations
x10.lang.dist --- distribution instances & distribution operations
x10.lang.place --- place instances & place operations
x10.lang.point --- point instances & point operations
x10.lang.region --- region instances & region operations
All X10 programs implicitly import the x10.lang.* package, so the x10.lang
prefix can be omitted when referring to members of x10.lang.* classes

e.g., place.MAX_PLACES, dist.factory.block([0:100,0:100]), …
Similarly, all X10 programs also implicitly import the java.lang.* package


e.g., X10 programs can use Math.min() and Math.max() from java.lang
In case of conflict (e.g. Integer), user must import the desired one explicitly, e.g.
import java.lang.Integer;
© 2007 IBM Corporation
X10 Tutorial
Rectangular regions
A rectangular region is the set of points contained in a rectangular subspace
A region variable can hold values of different ranks e.g.,
– region R; R = [0:10]; … R = [-100:100, -100:100]; … R = [0:-1]; …
IBM Research
Operations
87
–
–
–
–
–
–
–
–
–
–
–
–
–
R.rank ::= # dimensions in region;
R.size() ::= # points in region
R.contains(P) ::= predicate if region R contains point P
R.contains(S) ::= predicate if region R contains region S
R.equal(S) ::= true if region R equals region S
R.rank(i) ::= projection of region R on dimension i (a one-dimensional region)
R.rank(i).low() ::= lower bound of ith dimension of region R
R.rank(i).high() ::= upper bound of ith dimension of region R
R.ordinal(P) ::= ordinal value of point P in region R
R.coord(N) ::= point in region R with ordinal value = N
R1 && R2 ::= region intersection (will be rectangular if R1 and R2 are rectangular)
R1 || R2 ::= union of regions R1 and R2 (may not be rectangular)
R1 – R2 ::= region difference (may not be rectangular)
© 2007 IBM Corporation
X10 Tutorial
X10 Cheat Sheet: Regions & Distributions
IBM Research
Region:
Expr : Expr
[ Range, …, Range ]
Multidimensional Region
Region && Region
Region || Region
Region – Region
difference
BuiltinRegion
Dist:
Region -> Place
-- Constant
-- 1-D region
distribution
-Distribution | Place
-- Restriction
Distribution | Region
-- Restriction
-- Intersection
Distribution || Distribution
-- Union
-- Union Distribution – Distribution
-- Set
-- Set difference
Distribution.overlay ( Distribution )
BuiltinDistribution
88
88
© 2007 IBM Corporation
X10 Tutorial
X10 runtime parameters
-NUMBER_OF_LOCAL_PLACES=int
The number of places (default = 1)
-INIT_THREADS_PER_PLACE=int
Initial number of Java threads per single place(default =2)
-ABSTRACT_EXECUTION_STATS=false Dump out parallel execution statistics
-ABSTRACT_EXECUTION_TIMES=false If dumping statistics also dump out unblocked exec times
-BIND_THREADS=false
Use platform-specific calls to bind Java threads to CPUs.
-BIND_THREADS_DIAGNOSTICS=false Print diagnostics related to platform-specific calls to
bind Java threads to CPUs.
-BAD_PLACE_RUNTIME_CHECK=false Perform runtime place checks
The name of the main class
-OPTIMIZE_FOREACH=false
Experimental: Enable runtime loop optimizations.
-PRELOAD_CLASSES=false
Pre-load all classes on recursively startup
-PRELOAD_STRINGS=false
If pre-loading classes, also pre-load all strings
-LOAD=null
Load specified shared library.
IBM Research
-MAIN_CLASS_NAME=null
89
© 2007 IBM Corporation
X10 Tutorial
IBM Research
X10 preferences
90
© 2007 IBM Corporation
X10 Tutorial
IBM Research
Specifying X10 runtime parameters
91
© 2007 IBM Corporation
Download