Contemporary Languages in Parallel Computing
Raymond Hummel
Current Languages
Standard Languages
• Distributed Memory Architectures
  ◦ MPI
• Shared Memory Architectures
  ◦ OpenMP
  ◦ pthreads
• Graphics Processing Units
  ◦ CUDA
  ◦ OpenCL
Use in Academia
• Journal articles referencing parallel languages and libraries:
  ◦ MPI – 863
  ◦ CUDA – 539
  ◦ OpenMP – 391
  ◦ OpenCL – 195
  ◦ POSIX – 124
MPI
• Stands for: Message Passing Interface
• Pros
  ◦ Extremely Scalable
    ▪ Remains the dominant model for high-performance computing today
    ▪ Can be used to tie implementations in other languages together
  ◦ Portable
    ▪ Can be run on almost all OS/hardware combinations
    ▪ Bindings exist for multiple languages, from Fortran to Python
  ◦ Can Harness a Multitude of Hardware Setups
    ▪ MPI programs can run on both distributed-memory and shared-memory systems
MPI
• Cons
  ◦ Complicated Software
    ▪ Requires the programmer to wrap their head around all aspects of parallel execution
    ▪ A single program must handle the behavior of every process
  ◦ Complicated Hardware
    ▪ Building and maintaining a cluster isn’t easy
  ◦ Complicated Setup
    ▪ Jobs have to be launched with mpirun or mpiexec
    ▪ Compiling requires the mpicc wrapper to link the MPI libraries
MPI
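A minimal send/receive sketch in C, assuming two ranks launched with mpirun -np 2; the payload value is illustrative:

#include <mpi.h>
#include <stdio.h>

/* Rank 0 sends an integer to rank 1, which prints it.
 * Build with mpicc; run with at least two processes:
 *   mpirun -np 2 ./a.out                              */
int main(int argc, char *argv[])
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        int payload = 42;   /* illustrative data */
        MPI_Send(&payload, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        int payload;
        MPI_Recv(&payload, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received %d from rank 0\n", payload);
    }

    MPI_Finalize();
    return 0;
}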
OpenMP
• Stands for: Open Multi-Processing
• Pros
  ◦ Incremental Parallelization
    ▪ Parallelize just that pesky triple for-loop
  ◦ Portable
    ▪ Does require compiler support, but all major compilers already support it
  ◦ Simple Software
    ▪ Include the library, add a preprocessor directive, compile with a special flag
OpenMP
• Cons
  ◦ Limited Use-Case
    ▪ Constrained to shared-memory architectures
    ▪ 63% of survey participants from http://goparallel.sourceforge.net were focused on development for individual desktops and servers
  ◦ Scalability Limited by Memory Architecture
    ▪ Memory bandwidth is not scaling at the same rate as computation speed
OpenMP
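A minimal sketch of the incremental approach: one pragma parallelizes the loop, assuming a compiler flag such as gcc -fopenmp:

#include <omp.h>
#include <stdio.h>

#define N 1000000

int main(void)
{
    static double a[N], b[N], c[N];
    double sum = 0.0;

    /* The directive splits the iterations across threads;
     * the reduction clause safely combines the partial sums. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++) {
        c[i] = a[i] + b[i];
        sum += c[i];
    }

    printf("sum = %f, threads available = %d\n",
           sum, omp_get_max_threads());
    return 0;
}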
POSIX Threads
• Stands for: Portable Operating System Interface Threads
• Pros
  ◦ Fairly Portable
    ▪ Native support in UNIX operating systems
    ▪ Versions exist for Windows as well
  ◦ Fine-Grained Control
    ▪ Can control the mapping of threads to processors
POSIX Threads
• Cons
  ◦ All-or-Nothing
    ▪ Software written with pthreads can’t be used on systems that don’t support it
    ▪ A major rewrite of the main function is required
  ◦ Complicated Software
    ▪ Manual thread management
  ◦ Limited Use-Case
POSIX Threads
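A minimal sketch of manual thread management with pthreads (compile with -pthread); the worker function and thread count are illustrative:

#include <pthread.h>
#include <stdio.h>

#define NUM_THREADS 4

/* Worker: each thread receives its ID through the void* argument. */
static void *worker(void *arg)
{
    long id = (long)arg;
    printf("thread %ld running\n", id);
    return NULL;
}

int main(void)
{
    pthread_t threads[NUM_THREADS];

    /* Creation, argument passing, and joining are all manual. */
    for (long i = 0; i < NUM_THREADS; i++)
        pthread_create(&threads[i], NULL, worker, (void *)i);

    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);

    return 0;
}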
CUDA
• Stands for: Compute Unified Device Architecture
• Pros
  ◦ Manufacturer Support
    ▪ NVIDIA is actively encouraging CUDA development
    ▪ Provides lots of shiny tools for developers
  ◦ Low-Level Hardware Access
    ▪ Because cross-platform portability isn’t a priority, NVIDIA can expose low-level details
CUDA
• Cons
  ◦ Limited Use-Case
    ▪ GPU computing requires massive data parallelism
  ◦ Only Compatible with NVIDIA Hardware
CUDA
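A minimal CUDA C vector-add sketch, assuming unified memory (cudaMallocManaged) to keep the host code short; build with nvcc:

#include <cstdio>

/* Kernel: each GPU thread adds one pair of elements. */
__global__ void vecAdd(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        c[i] = a[i] + b[i];
}

int main()
{
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);
    float *a, *b, *c;

    /* Unified memory keeps the sketch short; explicit
     * cudaMalloc/cudaMemcpy is the traditional route.  */
    cudaMallocManaged(&a, bytes);
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);
    for (int i = 0; i < n; i++) { a[i] = 1.0f; b[i] = 2.0f; }

    /* Launch enough 256-thread blocks to cover all n elements. */
    vecAdd<<<(n + 255) / 256, 256>>>(a, b, c, n);
    cudaDeviceSynchronize();

    printf("c[0] = %f\n", c[0]);   /* expect 3.0 */
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}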
OpenCL
• Stands for: Open Computing Language
• Pros
  ◦ Portability
    ▪ Works on all major operating systems
  ◦ Heterogeneous Platform
    ▪ Works on CPUs, GPUs, APUs, FPGAs, coprocessors, etc.
  ◦ Works with All Major Manufacturers
    ▪ AMD, Intel, NVIDIA, Qualcomm, ARM, and more
OpenCL
• Cons
  ◦ Complicated Software
    ▪ Manual everything (see the sketch below)
  ◦ Special Tuning Required
    ▪ Because it cannot assume anything about the hardware it will run on, the programmer has to tell it the best way to do things
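A minimal OpenCL vector-add sketch illustrating the "manual everything" point; error checking and cleanup are omitted for brevity, and the default platform and device are assumed:

#include <CL/cl.h>
#include <stdio.h>

/* Kernel source: the only part that runs on the device. */
static const char *src =
    "__kernel void vadd(__global const float *a,\n"
    "                   __global const float *b,\n"
    "                   __global float *c) {\n"
    "    int i = get_global_id(0);\n"
    "    c[i] = a[i] + b[i];\n"
    "}\n";

int main(void)
{
    enum { N = 1024 };
    float a[N], b[N], c[N];
    for (int i = 0; i < N; i++) { a[i] = 1.0f; b[i] = 2.0f; }

    /* Platform, device, context, queue, program, kernel, and
     * buffers are all explicit steps the programmer must write. */
    cl_platform_id plat;  clGetPlatformIDs(1, &plat, NULL);
    cl_device_id   dev;   clGetDeviceIDs(plat, CL_DEVICE_TYPE_DEFAULT,
                                         1, &dev, NULL);
    cl_context ctx = clCreateContext(NULL, 1, &dev, NULL, NULL, NULL);
    cl_command_queue q = clCreateCommandQueue(ctx, dev, 0, NULL);

    cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, NULL);
    clBuildProgram(prog, 1, &dev, NULL, NULL, NULL);
    cl_kernel k = clCreateKernel(prog, "vadd", NULL);

    cl_mem da = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
                               sizeof a, a, NULL);
    cl_mem db = clCreateBuffer(ctx, CL_MEM_READ_ONLY | CL_MEM_COPY_HOST_PTR,
                               sizeof b, b, NULL);
    cl_mem dc = clCreateBuffer(ctx, CL_MEM_WRITE_ONLY, sizeof c, NULL, NULL);

    clSetKernelArg(k, 0, sizeof da, &da);
    clSetKernelArg(k, 1, sizeof db, &db);
    clSetKernelArg(k, 2, sizeof dc, &dc);

    size_t global = N;
    clEnqueueNDRangeKernel(q, k, 1, NULL, &global, NULL, 0, NULL, NULL);
    clEnqueueReadBuffer(q, dc, CL_TRUE, 0, sizeof c, c, 0, NULL, NULL);

    printf("c[0] = %f\n", c[0]);   /* expect 3.0 */
    return 0;
}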
Non-Standard Languages
• CILK
• OpenACC
• C++ AMP
CILK
• Language first developed at MIT
• Based on C; commercial improvements extend it to C++
• Championed by Intel
• Operates on the theory that the programmer should identify the parallelism, then let the runtime divide the work between processing elements
• Has only 5 keywords: cilk, spawn, sync, inlet, abort (see the sketch below)
• The Cilk Plus implementation was merged into version 4.9 of the GNU C and C++ compilers
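A sketch of the classic fib example in the original MIT Cilk-5 syntax, assuming the cilkc compiler (Cilk Plus renames the keywords to cilk_spawn and cilk_sync):

#include <stdlib.h>
#include <stdio.h>

/* "cilk" marks a Cilk procedure; "spawn" forks a child that may run
 * in parallel; "sync" waits for all children spawned so far.        */
cilk int fib(int n)
{
    if (n < 2) return n;
    else {
        int x, y;
        x = spawn fib(n - 1);   /* may execute on another worker */
        y = spawn fib(n - 2);
        sync;                   /* join before using x and y     */
        return x + y;
    }
}

cilk int main(int argc, char *argv[])
{
    int n, result;
    n = (argc > 1) ? atoi(argv[1]) : 30;
    result = spawn fib(n);
    sync;
    printf("fib(%d) = %d\n", n, result);
    return 0;
}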
OpenACC
• Stands for: Open ACCelerators
• Not currently supported by major compilers
• Aims to function like OpenMP, but for heterogeneous CPU/GPU systems
• NVIDIA’s answer to OpenCL
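A minimal SAXPY sketch showing the OpenMP-like directive style, assuming an OpenACC-capable compiler such as PGI’s pgcc with the -acc flag:

#include <stdio.h>

#define N 1000000

int main(void)
{
    static float x[N], y[N];
    const float a = 2.0f;

    for (int i = 0; i < N; i++) { x[i] = 1.0f; y[i] = 3.0f; }

    /* One directive asks the compiler to offload the loop to an
     * accelerator and to manage the data movement for x and y.  */
    #pragma acc parallel loop copyin(x) copy(y)
    for (int i = 0; i < N; i++)
        y[i] = a * x[i] + y[i];

    printf("y[0] = %f\n", y[0]);   /* expect 5.0 */
    return 0;
}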
C++ AMP
• Stands for: C++ Accelerated Massive Parallelism
• A library implemented on DirectX 11, and an open specification from Microsoft
• Visual Studio 2012 and up provide debugging and profiling support
• Works on any hardware that has DirectX 11 drivers
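A minimal C++ AMP sketch that squares a vector on whatever DirectX 11 device is available; it should compile under Visual Studio 2012 and up:

#include <amp.h>
#include <vector>
#include <iostream>

int main()
{
    std::vector<float> v(1024, 3.0f);

    // array_view wraps host data; copies to and from the
    // device are handled implicitly.
    concurrency::array_view<float, 1> av(static_cast<int>(v.size()), v);

    // restrict(amp) marks the lambda as compilable for the GPU.
    concurrency::parallel_for_each(av.extent,
        [=](concurrency::index<1> i) restrict(amp) {
            av[i] = av[i] * av[i];
        });

    av.synchronize();   // copy results back to the host vector
    std::cout << "v[0] = " << v[0] << '\n';   // expect 9
    return 0;
}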
Future Languages
Developing Languages
• D
• Rust
• Harlan
D
• Performance of Compiled Languages
• Memory Safety
• Expressiveness of Dynamic Languages
• Includes a Concurrency-Aware Type System
• Nearing Maturity
Rust
• Designed for the creation of large client-server programs on the Internet
• Safety
• Memory Layout
• Concurrency
• Major changes still occurring
Harlan
• Experimental Language
• Based on Scheme
• Designed to take care of the boilerplate in GPU programming
• Could be expanded to include automatic scheduling for both CPU and GPU, depending on available resources
Questions?