Math Libraries for Windows HPC
Author: Dr. Edward Stewart, IMSL Product Manager, Visual Numerics, Inc.
Published: February 15, 2016
Abstract
This paper discusses the current state of mathematical libraries available for Microsoft® Windows®, with a
focus on High Performance Computing (HPC) use and specifically Windows® HPC Server 2008. Software
analyzed will be limited to versions designed for 64-bit x86 architectures (specifically Intel EM64T and
AMD Opteron) as this is the required platform for Windows HPC Server 2008. An overview of both open
source and commercial software options will be provided with a focus on uses in distributed computing.
This paper is intended for software developers familiar with HPC. A high level of expertise in mathematics
or details of distributed computing techniques like MPI is not required. Familiarity with Windows tools like
Visual Studio® 2008 and the Windows HPC Job Scheduler for Windows HPC Server 2008 would be
helpful to follow the examples, but is not necessary.
Starting with an overview of Math Library uses in industry and research, coverage will progress from open
source linear algebra tools, to vendor-supplied libraries, to broad commercial libraries. Special libraries for
distributed computing will also be discussed. Finally, two examples will be presented that describe
developing HPC applications on Windows using math libraries. The first is a distributed application
utilizing MPI and the IMSL® Fortran library. The second is a parameter sweep type distributed application
utilizing the IMSL C# Library for .NET Applications.
Contents
Usage of Math Libraries
Linear Algebra Libraries
Vendor Libraries
Distributed Computing Libraries
Broad Math Libraries
.NET Math Libraries
Java Math Libraries
SMP Parallelization
Distributed Math Libraries on Windows
    Architecture Overview
    MPI and the IMSL Fortran Numerical Library
    Example: IMSL Fortran on Windows HPC Server 2008 with MS-MPI
    Installing the IMSL Fortran Library on Windows
    Creating the Project
    Running the Example
    From the Command Line
    Other IMSL Libraries with MPI
Parameter Sweep Distributed Applications
    Architecture Overview
    Example: Parameter Sweep Distributed Application in C#
    Installing the IMSL C# Numerical Library
    Creating the Project
    Running the Example
Summary
Appendix A
Feedback
    More Information and Downloads
Contributors and Acknowledgements
Contributors from Visual Numerics for this document include Dr. Edward Stewart, IMSL Product Manager;
Greg Holling, Principal Consulting Engineer; and Ryan Wagner, Technical Support Engineer.
An Overview of Math Libraries for Windows
Usage of Math Libraries
For computer programmers, calling pre-written subroutines to do complex calculations dates back to early
computing history. With minimal effort, any developer can write a function that multiplies two matrices, but
these same developers would not want to re-write that function for every new program that requires it.
Further, with good theory and practice, one can optimize practically any algorithm to run several times
faster, though it would typically take several hours to days of work to match the performance of a highly
optimized algorithm. For example, compare a naïve matrix multiplication to a modern algorithm that uses
blocking to ensure efficient use of cache memory. At this point, it makes sense for the original
developer, who has a much larger problem to solve, to call on a pre-written function to perform this matrix
multiplication quickly and efficiently so that attention can be focused on the higher level application. The
first experience many developers have with such functions is the Numerical Recipes reference books.
Positioned as educational material on “The Art of Scientific Computing,” the series contains complete
commented source code for hundreds of algorithms. While there is controversy around some of the
algorithms themselves, Numerical Recipes has exposed many programmers to the fundamental concept
of utilizing pre-written functions instead of starting from scratch. Numerical Recipes is available today as
source code in several languages at http://www.nr.com/. Using Numerical Recipes on Windows® is as
easy as copying the source into your application and compiling the code.
Historically, during the 1950s and 1960s the United States had what can be called a “software crisis”
partly because of the space race and cold war, but also partly because of the rapid advances in
computing hardware. At that time, one had to be a mathematician to program a computer. Even after
Fortran was developed in the late 1950s, the complexities of the hardware and algorithms required a great deal of
mathematical knowledge. The software crisis arose because there were far more problems needing
solving than capable mathematicians (and therefore programmers), because programs were difficult and
time consuming to write, and because the software produced was very error prone. Commercial libraries
arrived in the early 1970s as a solution to the crisis and to address the issue of consistency of numerical
results across various computing platforms. Primarily these tools meant that programmers did not have to
reinvent the wheel to program well-known algorithms, but it also meant the resulting programs were
easier to read and follow. Improved reliability was (and still is for many users) another key reason to
justify using a library, as one can rely on a tested, proven collection of shared knowledge instead of an
individual mathematician’s particular view of the solution.
Scientific computing, and the use of math libraries, was traditionally limited to research labs and
engineering disciplines. In recent decades, this niche computing market has blossomed across a variety
of industries. While research institutes and universities are still the largest users of math libraries,
especially in the High Performance Computing (HPC) arena, industries like financial services and
biotechnology are increasingly turning to math libraries as well. Even the business analytics arena around
business intelligence and data mining is starting to leverage the existing tools. From bond pricing and
portfolio optimization to exotic instrument evaluations and exchange rate analysis, the financial services
industry has a wide variety of requirements for complex mathematical algorithms. Similarly, the biology
disciplines have aligned with statisticians to analyze experimental procedures which produce hundreds of
thousands of results. With this expanded industry use, and use in new environments like Microsoft®
Windows®, use of math libraries has grown significantly.
Linear Algebra Libraries
The core area of the math library market implements linear algebra algorithms. More specialized
functions, such as numerical optimization and time series forecasting, are often invoked explicitly by
users. In contrast, linear algebra functions are often used as key background components for solving a
wide variety of problems. Eigenanalysis, matrix inversion, and other linear calculations are essential
components in nearly every statistical analysis in use today, including regression, factor analysis,
discriminant analysis, and so on. The most basic suite of such algorithms is the BLAS (Basic Linear Algebra
Subprograms) libraries for basic vector and matrix operations. BLAS are divided into three separate
levels. Level 1 contains functions that operate on two vectors; Level 2, on a vector and a matrix; and
Level 3, on two matrices. In particular, scalar dot products would fall into Level 1, multiplication of a matrix
by a vector into Level 2, and matrix-matrix multiplication into Level 3. BLAS implementations will be
discussed later in this paper as parts of other broader libraries. One of the first such libraries to build on
the BLAS foundations is the LINPACK library written in the 1970s. LINPACK has now been superseded
by LAPACK (the Linear Algebra PACKage) which solves a wide variety of problems including solving
linear systems of equations, least-squares problems, eigensystems (superseding the earlier EISPACK),
various decompositions and others.
LAPACK is a free library available as source code from the NetLib repository at
http://www.netlib.org/lapack/. The source code is Fortran77, but LAPACK95 is also available in Fortran95.
Prebuilt binaries are available for some environments including 32-bit Windows compiled specifically for
the Pentium Pro architecture. To use LAPACK on Windows® HPC Server 2008 as a 64-bit library, a
developer needs access to a Fortran compiler and will need to rebuild the library. Unfortunately, this is not
always straightforward. For best performance, LAPACK should be used with a BLAS package optimized
for specific hardware. NetLib supplies the ATLAS (Automatically Tuned Linear Algebra Software,
http://www.netlib.org/atlas/) package for this purpose. APIs for ATLAS are available in both C and
Fortran77. ATLAS is included in many source software packages that extend functionality well beyond
linear algebra, but require BLAS functionality. As with LAPACK, Windows HPC Server 2008 developers
will want to rebuild the library from source for optimal performance.
GotoBLAS is a BLAS library available from the Texas Advanced Computing Center in binary and source form.
GotoBLAS claims to be the fastest available implementation of BLAS. The speedup in the library is
obtained by optimizing Translation Look-aside Buffer (TLB) table misses, with less relative emphasis on
optimizing the usage of L1 and L2 cache. GotoBLAS is available in Fortran source form, and it has been
ported to a number of high-performance computing platforms. “Goto” is the name of the developer of the
BLAS routines, Kazushige Goto. The GotoBLAS is being ported to the 64-bit Windows platform by Dr.
Goto with the same excellent performance as the Linux version. Contact the Texas Advanced Computing
Center (http://www.tacc.utexas.edu) to obtain access to GotoBLAS.
Many developers would rather not build libraries from source code; they require pre-built binaries that are
already optimized for their development platform. Optimized vendor-supplied libraries have been
marketed for this purpose.
Vendor Libraries
Most hardware vendors recognize the need for optimized linear algebra routines for their platforms. Since
they are most familiar with the details for their particular platforms, these vendors are in the best position
to build and optimize such libraries. For the Windows HPC Server 2008 platform, the two libraries that fall
into this category are the Intel Math Kernel Library (MKL) and the AMD Core Math Library (ACML). Both
of these include complete implementations of all three levels of BLAS, LAPACK, Fast Fourier Transforms
(FFTs), and Random Number Generators. The latest versions include excellent scaling for multi-core
hardware and take advantage of low-level processor extensions like SSE, SSE2 and SSE3. Vendors will
often tune particularly popular routines for optimal performance on their latest processor offerings. For
example, ACML now includes explicit support for the AMD Barcelona processor in the Level 3 BLAS
functions SGEMM and DGEMM (single- and double- precision matrix multiplication). Similarly, Intel has
optimizations for the new quad-core Xeon processor 5300 series.
To achieve high performance on Shared Memory Parallel (SMP) systems such as today’s multi-core
processors and multi-CPU systems, these libraries leverage OpenMP. To allow the libraries to take full
advantage of available hardware, the environment variable OMP_NUM_THREADS is set by the user at
runtime to the number of threads. Alternatively, if the developer wants to parallelize code at a higher level,
these libraries are available in single-threaded, but thread-safe versions, so they can be used in explicitly
threaded applications.
Both MKL and ACML are commercial libraries available from various resellers. As low-level libraries, they
include flexible distribution policies and are relatively inexpensive.
Distributed Computing Libraries
The linear algebra and vendor libraries discussed so far focus on high performance algorithms for single
machines. The HPC space and Windows HPC Server 2008 also target distributed computing where a
collection of separate computers, or nodes, are connected together with a high-speed interconnect and
usable as a single unit. This style of cluster computing often makes use of the Message Passing Interface
(MPI) for parallel programming. A number of libraries have been extended to utilize MPI and allow
developers to solve large linear algebra problems on these much larger systems without having to
explicitly parallelize the algorithms themselves.
The ScaLAPACK (Scalable LAPACK, http://www.netlib.org/scalapack/) library is one such
implementation, again hosted by NetLib. This library is a subset of the LAPACK functions that have been
redesigned for distributed systems. The overall ScaLAPACK project is actually made up of four
components:

- ScaLAPACK for dense and band matrix systems
- PARPACK for sparse eigenvalue systems
- CAPSS for sparse direct systems
- ParPre for sparse iterative solvers
Using ScaLAPACK is quite a bit more complex than using the LAPACK equivalent. Data must be stored
in a block-cyclic decomposition, and familiarity with MPI within the program is required. For the BLAS
component, PBLAS (Parallel Basic Linear Algebra Subprograms) are utilized along with BLACS (Basic
Linear Algebra Communication Subprograms) for communication tasks.
Pre-built binaries are available for some systems, but many users may again turn to the supported and
optimized vendor options like the Intel Math Kernel Library (MKL). In previous versions, Intel supported a
separate version of MKL called Intel Cluster MKL for distributed systems. With version 10.0, MKL and
Cluster MKL have been merged into a single product. Thus MKL now includes BLACS and ScaLAPACK
implementations optimized for Intel hardware in distributed architectures. For users who require
fundamental distributed linear algebra functions on Windows HPC Server 2008, the Intel MKL is a good
choice to avoid all the issues of building such complex libraries from source.
Broad Math Libraries
The libraries mentioned thus far are largely focused on linear algebra. While linear algebra makes up a
core piece of any numerical library offering, it is only a subset of the algorithms many of today’s
developers require. Since there are many mathematical tools on the market today, the focus of the
following discussion is narrowed to callable libraries with native language programming interfaces. This
narrowing of focus excludes desktop math tools such as Matlab, PV-WAVE, SAS, Sage, Mathematica,
and dozens of others. While some of these packages offer solutions for distributed systems and almost all
are available for Windows, a developer writing in standard languages like C, Fortran or any .NET
language and seeking high performance solutions will usually opt not to wrap function calls into a desktop
analytics package.
Reducing the scope as such, we are left with a much smaller list of broad math libraries. The GNU
Scientific Library (GSL) is one example. The list of areas covered by GSL is extensive and well beyond
basic linear algebra. Topics like root finding, numerical optimization, statistical analysis, differential
equations, and curve fitting are covered but are just a short sample. GSL is an open source library
available under the GNU General Public License (GPL). Limitations of the GPL aside, GSL is challenging
to use on a Windows system. Compiled binaries are available as part of the Cygwin environment for
Windows which mimics a Linux-like environment. While it might seem appealing to migrate HPC
applications to Windows via the Cygwin environment, it is unfortunately challenging to integrate Cygwin
with the standard Windows development environment. Specifically, mixing Windows development tools
with Cygwin libraries is difficult at best. As such, unless all development tools and resources for a project
exist for the Cygwin environment, GSL is not usually a viable option for the Windows HPC Server 2008
environment. Finally, GSL functions are not MPI-enabled and so distributed calculations across a cluster
environment would require significant development effort.
The remaining broad math libraries available for Windows are commercial libraries with a long history of
supporting the Microsoft environment. Both the Numerical Algorithm Group (NAG) Library and Visual
Numerics’ IMSL Numerical Library are available in C and Fortran versions for Microsoft Windows 64-bit
systems. Both of these libraries support the Intel Visual Fortran compiler, Intel C++ compiler and the
Microsoft® Visual C++® compiler. As with GSL, the coverage is very wide and goes well beyond linear
algebra. Further, all versions of these products will link in vendor-supplied BLAS like MKL or ACML for the
best performance for linear algebra functions (either directly utilized through the higher level interface or
internally as BLAS). The NAG and IMSL Numerical Libraries are commercial products with solid
documentation and available technical support. Both have been on the market since the early 1970s and
have continued to evolve over the decades.
For distributed computing environments NAG requires two Fortran libraries, the NAG SMP Library (for
shared memory systems) and the NAG Parallel Library. As of this writing, neither of these libraries is
available for the Windows operating system. The IMSL Fortran Library is a single product that contains
some SMP-enabled functions and MPI-enabled routines. The MPI functions were expanded in version 6.0
to include a wide variety of ScaLAPACK functions along with utility functions that make distributed
computing much more accessible for developers who are not MPI experts. This version of the IMSL
Fortran Library is available for 64-bit Windows systems and will be the focus of an example in the
following section.
.NET Math Libraries
With a focus on HPC, much of the discussion has been around libraries implemented in Fortran, and to a
lesser extent C or C++. Since the focus of this paper is on Windows HPC Server 2008, .NET languages
may also play a significant role for many developers. These libraries are almost all written in C#. Some
are pure C# like the open source Math.NET project while others include native code for higher
performance like the NMath product from CenterSpace Software. NMath started out as a C# wrapper for
the Intel MKL library, but has expanded into statistical functions and other areas. The Extreme
Optimization Numerical Libraries for .NET focus on an object-oriented interface and cover a similarly
wide variety of functionality. The IMSL C# Numerical Library comes in two formats as of version 5.0. One
version is written in pure C# for developers who require purely managed code implementations and a
second integrates MKL for low-level BLAS functions to boost performance for developers whose projects
do not require pure managed code, but still want a .NET interface. The IMSL C# Numerical Library also
includes charting classes with a programmatic interface to allow an easy path to visualization of results
using a single tool.
For programming numerical applications, F# is becoming a very popular option (see
http://research.microsoft.com/fsharp/fsharp.aspx). This general purpose language includes many features
that make complex programming tasks easier as a combination of procedural, object oriented and
functional programming language elements. With full integration into the .NET Framework, solid
performance, and an interactive scripting environment, there are many advantages to this novel platform.
Not only is F# now integrated into Visual Studio® 2008 with access to the full .NET class library, but
third-party libraries like those mentioned above are also fully supported. A quick example of
calling the IMSL C# Library from the F# Interactive Console is shown in Figure 1.
MSR F# Interactive, (c) Microsoft Corporation, All Rights Reserved
F# Version 1.9.4.17, compiling for .NET Framework Version v2.0.50727
> #r "c:\\program files\\vni\\imsl\\imslcs500\\bin\\imslcs.dll";;
--> Referenced 'c:\program files\vni\imsl\imslcs500\bin\imslcs.dll'
> open Imsl.Math;;
> let g = Sfun.Gamma(0.5);;
val g : float
Binding session to 'c:\program files\vni\imsl\imslcs500\bin\imslcs.dll'...
> g;;
val it : float = 1.772453851
>
Figure 1. A sample interactive F# session calling a .NET library.
Java Math Libraries
Java is another platform option along the lines of the .NET Framework. Java is more focused on cross-platform application development, while .NET is more focused on effective development within the
Windows environment. Java applications and tools will generally work well in heterogeneous Windows
and Unix/Linux environments, for example. Calling C or C++ libraries from Java is an option for some
developers, but when cross-platform solutions are required a pure Java library becomes a requirement.
Java has a very large open source community and some numerical libraries are available in this form.
The Java Matrix Package (JAMA) covers the basics of linear algebra, while the Colt Project expands on
the theme to cover a broader range of algorithms for scientific and technical computing in Java. The
project seeing the most active development with a very ambitious set of future goals is Commons-Math
under the Apache Commons hierarchy of open source projects. The commercial offerings of Java
numerical libraries are not as wide as the other platforms. While there are some industry-specific
commercial tools (especially for financial services and biotech), the only broad commercial math library
for Java is the Visual Numerics JMSL Numerical Library.
SMP Parallelization
With multi-core and many-core hardware becoming commonplace, shared memory parallelization is
becoming more popular. The most common interface for SMP parallelization in the mathematical and
scientific programming communities is OpenMP. By adding OpenMP directives into applications,
supported compilers will take care of many of the details of parallelizing the code. Many mathematical
algorithms involve repeated looping over data and very good performance gains can be realized by
parallelizing large outer loops in these algorithms. Since Visual Studio 2005, the Microsoft Visual C++
compiler has supported OpenMP.
In addition to OpenMP and with a focus on the .NET platform, Parallel Extensions to the .NET Framework 3.5
is currently available as a Community Technology Preview (see
http://www.microsoft.com/downloads/details.aspx). While this is an early release for testing purposes
only, the implementation holds a lot of promise for developers who want to multi-thread their .NET
applications without managing all the details of the thread pool themselves. This release of the Parallel
Extensions is essentially an updated System.Threading library with additional constructs like
“Parallel.For” and “Parallel.ForEach”. This syntax is intuitive for anyone familiar with OpenMP, and thus
adding SMP threading to .NET code with this library is expected to be straightforward. The extension also
includes a System.Threading.Tasks namespace providing support for imperative task parallelism. This
namespace includes the expected Task objects as well as a TaskManager class representing the
schedule that executes the tasks. Finally, Parallel LINQ (PLINQ) is included in this work that allows LINQ
developers to leverage parallel hardware with minimal impact to the existing programming model.
Visual Numerics is currently investigating the details of this programming model with hopes to integrate
the functionality inside a future release of the IMSL C# Numerical Library, providing enhanced
performance with no code changes required for existing users. While existing IMSL classes can be
leveraged inside parallel blocks, the best performance gain is clearly to add parallelism within the
multifaceted math algorithms of the library.
Distributed Math Libraries on Windows
Architecture Overview
In this section, we will provide an example of distributed calculations utilizing Microsoft® Message Passing
Interface, MS-MPI. The IMSL Fortran Library will be the library of choice due to its support for this
Windows environment and also the ease of use for MPI applications. Consider the typical network
topology for Windows HPC Server 2008 shown in Figure 2. The fine black lines and arrows indicate the
flow of data in a network where the Head Node will spawn a job that utilizes the parallelization features
included in the IMSL Fortran Library. In this case all of the necessary programming tools are installed on
the Head Node; these include Visual Studio 2008, Intel Visual Fortran 10 and IMSL Fortran Numerical
Library 6. The example codes are compiled and linked on the head node and then executed using the
Windows HPC Job Manager. The runtime components are located in a shared folder on the head node
that each compute node can access. Information is distributed to each compute node, which then
performs its set of calculations, returning the results when work is complete.
Figure 2. A typical cluster network topology for an MS-MPI application.
MPI and the IMSL Fortran Numerical Library
The IMSL Fortran Numerical Library is the primary offering from Visual Numerics for distributed
applications and a good choice for the Windows platform given the review above. The library contains
many functions that are MPI-enabled to solve large problems on clustered hardware. The functions are
centered on linear algebra problems, but also extend into the realm of optimization. Furthermore, with
version 6 of the IMSL Fortran Library, the existing API for many routines has been extended to leverage
ScaLAPACK behind the scenes. This feature allows new users of MPI and ScaLAPACK to use the
familiar IMSL Fortran Library interface instead of learning all the intricate details of MPI.
While the MPI interface of the IMSL Fortran Library is based on MPICH2, the MS-MPI implementation
included with Windows HPC Server 2008 is compatible and the best choice when using the Microsoft
tools such as the Job Manager.
There are only two requirements to develop with the IMSL Fortran Library on Windows operating
systems: 1) a license for the IMSL Fortran Library and 2) a supported compiler. As of 2008, the
supported compilers are the Intel Visual Fortran compiler Version 10, Absoft Pro Fortran Version 10.1 and
the Portland Group’s PGI Fortran compiler version 7.1-5. One benefit of these “Visual Fortran” compilers
is their integration with Visual Studio. Fortran developers who struggle with command-line tools and
sparse editor options will find that using Visual Studio for Fortran development brings a significant
improvement in productivity. The example in
this section will use the Intel Visual Fortran compiler.
Using MS-MPI with the IMSL Fortran Library is straightforward. By default, the batch script for compiling
Windows Fortran applications makes reference to the MPICH2 binary library file mpi.lib. To execute
using MS-MPI, the link option must be changed to pick up msmpi.lib instead. At runtime, the Compute
Cluster Scheduler and Job Manager will be used to define and execute MPI applications bound for the
cluster. The graphical user interface, combined with the tool’s knowledge of other users’ scheduled jobs,
makes for a pleasant experience. If certain tasks require specific resources, the tool will wait for them to become
available before attempting to execute the distributed task.
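As a sketch, the relink described above might look like the following. The command lines are illustrative assumptions, not the exact contents of the IMSL-supplied batch script; actual library names and link variables depend on the IMSL and compiler installation:

```shell
REM Hypothetical build lines; %LINK_FNL% stands in for the IMSL link
REM variables set up by the installation, and file names may differ.

REM Default build resolves MPI symbols from the MPICH2 import library:
ifort example.f90 %LINK_FNL% mpi.lib

REM For MS-MPI on Windows HPC Server 2008, link msmpi.lib instead:
ifort example.f90 %LINK_FNL% msmpi.lib
```

The source code itself is unchanged; only the import library on the link line selects which MPI runtime the executable binds to.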
There are two primary methods the IMSL Fortran Library has to distribute problems: the ScaLAPACK API
technique and the Box Data Type technique. In the ScaLAPACK API technique, an IMSL function that
references a ScaLAPACK function is used. Instead of having to manually configure the program with MPI
functions such as MPI_BCAST or MPI_COMM_WORLD, the IMSL Fortran Library provides several
utilities to configure MPI and ScaLAPACK, easing the burden on the developer. This technique is
particularly helpful for programmers new to MPI-based distributed computing. An example from the IMSL
Library documentation follows in Figure 3, where the IMSL Library interfaces and utilities are used in place
of traditional MPI code to leverage ScaLAPACK. Behind the scenes, the ScaLAPACK function DGESVD
(which computes the singular value decomposition of a double-precision rectangular matrix) is referenced
through the call to the IMSL subroutine LSVRR.
      USE MPI_SETUP_INT
      USE IMSL_LIBRARIES
      USE SCALAPACK_SUPPORT
      IMPLICIT NONE
      INCLUDE 'mpif.h'
!                                  Declare variables
      INTEGER    KBASIS, LDA, LDQR, NCA, NRA, DESCA(9), DESCU(9), &
                 DESCV(9), MXLDV, MXCOLV, NSZ, MXLDU, MXCOLU
      INTEGER    INFO, MXCOL, MXLDA, LDU, LDV, IPATH, IRANK
      REAL       TOL, AMACH
      REAL, ALLOCATABLE :: A(:,:), U(:,:), V(:,:), S(:)
      REAL, ALLOCATABLE :: A0(:,:), U0(:,:), V0(:,:), S0(:)
      PARAMETER  (NRA=6, NCA=4, LDA=NRA, LDU=NRA, LDV=NCA)
      NSZ = MIN(NRA,NCA)
!                                  Set up for MPI
      MP_NPROCS = MP_SETUP()
      IF (MP_RANK .EQ. 0) THEN
          ALLOCATE (A(LDA,NCA), U(LDU,NCA), V(LDV,NCA), S(NCA))
!                                  Set values for A
          A(1,:) = (/ 1.0, 2.0, 1.0, 4.0/)
          A(2,:) = (/ 3.0, 2.0, 1.0, 3.0/)
          A(3,:) = (/ 4.0, 3.0, 1.0, 4.0/)
          A(4,:) = (/ 2.0, 1.0, 3.0, 1.0/)
          A(5,:) = (/ 1.0, 5.0, 2.0, 2.0/)
          A(6,:) = (/ 1.0, 2.0, 2.0, 3.0/)
      ENDIF
!                                  Set up a 1D processor grid and define
!                                  its context ID, MP_ICTXT
      CALL SCALAPACK_SETUP(NRA, NCA, .TRUE., .TRUE.)
!                                  Get the array descriptor entities MXLDA,
!                                  MXCOL, MXLDU, MXCOLU, MXLDV, AND MXCOLV
      CALL SCALAPACK_GETDIM(NRA, NCA, MP_MB, MP_NB, MXLDA, MXCOL)
      CALL SCALAPACK_GETDIM(NRA, NSZ, MP_MB, MP_NB, MXLDU, MXCOLU)
      CALL SCALAPACK_GETDIM(NSZ, NCA, MP_MB, MP_NB, MXLDV, MXCOLV)
!                                  Set up the array descriptors
      CALL DESCINIT(DESCA, NRA, NCA, MP_MB, MP_NB, 0, 0, MP_ICTXT, &
                    MXLDA, INFO)
      CALL DESCINIT(DESCU, NRA, NSZ, MP_MB, MP_NB, 0, 0, MP_ICTXT, &
                    MXLDU, INFO)
      CALL DESCINIT(DESCV, NSZ, NCA, MP_MB, MP_NB, 0, 0, MP_ICTXT, &
                    MXLDV, INFO)
!                                  Allocate space for the local arrays
      ALLOCATE (A0(MXLDA,MXCOL), U0(MXLDU,MXCOLU), &
                V0(MXLDV,MXCOLV), S(NCA))
!                                  Map input array to the processor grid
      CALL SCALAPACK_MAP(A, DESCA, A0)
!                                  Compute all singular vectors
      IPATH = 11
      TOL = AMACH(4)
      TOL = 10. * TOL
      CALL LSVRR (A0, IPATH, S, TOL=TOL, IRANK=IRANK, U=U0, V=V0)
!                                  Unmap the results from the distributed
!                                  array back to a non-distributed array.
!                                  After the unmap, only Rank=0 has the full
!                                  array.
      CALL SCALAPACK_UNMAP(U0, DESCU, U)
      CALL SCALAPACK_UNMAP(V0, DESCV, V)
!                                  Print results.
!                                  Only Rank=0 has the solution.
      IF (MP_RANK .EQ. 0) THEN
          CALL WRRRN ('U', U, NRA, NCA)
          CALL WRRRN ('S', S, 1, NCA, 1)
          CALL WRRRN ('V', V)
      ENDIF
!                                  Exit ScaLAPACK usage
      CALL SCALAPACK_EXIT(MP_ICTXT)
!                                  Shut down MPI
      MP_NPROCS = MP_SETUP('FINAL')
      END
Figure 3. Fortran code example showing the IMSL Fortran Library tools used to leverage ScaLAPACK.
This example computes the singular value decomposition of a 6 x 4 matrix A. The matrices U and V
containing the left and right singular vectors, respectively, and the diagonal of S, containing the singular
values, are printed using the utility function WRRRN. More information about this example can be found
in the online documentation.
Notice that array allocation and several different utility routines are still required, but working from this
example is significantly more straightforward for a developer new to MPI than the equivalent non-IMSL
ScaLAPACK version would be. In comparison, calling ScaLAPACK directly would require several calls to
BLACS_* routines and many other functions that are not needed when using the MP_SETUP and
SCALAPACK_SETUP convenience routines provided by the IMSL Fortran Library. If the MP_SETUP and
SCALAPACK_SETUP calls are removed from the above example, the call to LSVRR is executed on a
single computer (not distributed) using the equivalent LAPACK function instead.
In the Box Data Type technique, multiple independent two-dimensional linear algebra problems are
stacked as planes in a three-dimensional "box". Individual planes are then distributed among nodes on
the cluster for calculation, and the results are returned to the head node. This technique has shown
super-linear scaling for very large problems. Many other parallelized functions are available through
overloaded operators.
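The Box Data Type idea can be sketched outside of Fortran in a few lines. The following Python sketch is an illustration of the concept only, not the IMSL API; the function names are hypothetical and plain Gaussian elimination stands in for the library's solvers. It treats each plane of the box as an independent linear system, which is exactly why the planes can be farmed out to separate cluster nodes:

```python
def solve_plane(A, b):
    """Solve one n x n system A x = b by Gaussian elimination with
    partial pivoting (plain Python lists, no external libraries)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / M[r][r]
    return x

def solve_box(planes):
    """A "box" is a list of independent (A, b) planes. Because no plane
    depends on another, this loop is the natural unit of distribution:
    each iteration could run on a different cluster node."""
    return [solve_plane(A, b) for A, b in planes]
```

In the IMSL library the same plane-by-plane independence is what lets the `.ix.` operator dispatch work across MPI ranks.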
Example: IMSL Fortran on Windows HPC Server 2008 with MS-MPI
This section will provide a detailed walkthrough to build and execute a distributed calculation using MPI.
As mentioned above, the Intel Visual Fortran Compiler will be used through the Visual Studio 2008
interface. A subsection at the end will describe the steps to build and run the project from the command
line interface as well. The example is a basic one using the Box Data Type technique and the
overloaded operator .ix., which computes the product of the inverse of matrix A and a vector or matrix
B, that is, x = A^(-1) B.
Installing the IMSL Fortran Library on Windows
To obtain the IMSL Fortran Numerical Library, you can download an evaluation copy from the Visual
Numerics website at http://www.vni.com/forms/fortran_download_choice.php. Select the option for
“x86_64, Windows XP 64, Intel Fortran Compiler 10.0”; an evaluation CD can also be requested to be
mailed. You will need a valid license key to execute the example, which can be acquired by contacting an
Account Manager at Visual Numerics by email. If you have downloaded the product, first unzip the
archive named fnl60_p10484.zip for version 6.0 of the IMSL Fortran Library. The CD contents are the
same as this archive, as shown in Figure 4. In either case, start the setup procedure by running the
setup.exe application.
Figure 4. Installation files for the IMSL Fortran Library.
Run the setup.exe application to start the installation.
Running the setup program will initialize the installation procedure. Select the library to install (there is
likely to be only one option) and click Next > as illustrated in Figure 5.
Figure 5. Select the appropriate product to install for the platform.
You will immediately be presented with the option to update system environment variables. These make
using the product easier, and it is recommended that you accept this option. The only case where it
should be declined is when running multiple versions of the IMSL Fortran Library on the same system.
This option is shown in Figure 6.
Figure 6. It is recommended to let the setup application update environment variables.
The InstallShield Wizard starts next; click Next > to continue. To install the product, you must agree to the
Visual Numerics, Inc. Software License Agreement by clicking Yes on the following screen. If you are a
current customer and have a License Number, enter it on the following screen, otherwise enter 999999
and click Next > to continue. The Installation Location must be specified next; to use the default of
“C:\Program Files (x86)\VNI\” click Next > to proceed. The installer is finally ready to copy files; progress
is monitored as the files are copied. Once complete, a final dialog will let you click Finish to complete the
installation and close the Setup Wizard. The following montage of screenshots, Figure 7 collectively,
should help guide you through the process.
Figure 7. Screenshots showing the steps to install the IMSL Fortran Numerical Library.
Creating the Project
Now that all the products have been installed, the next step is to create a new Fortran console Project in
Visual Studio 2008 as shown in Figure 8 and Figure 9.
Figure 8. Creating a new Project in Visual Studio 2008.
Figure 9. Selecting an Intel Fortran project in Visual Studio 2008.
This will set up a default blank solution. Create a new source file named Source1.F90 as follows: Right-click on Source Files in the Solution Explorer and select Add -> New Item, and in the Add New Item
dialog, Source should be highlighted with a default filename of Source1.F90. Select Add to create the
new file; see Figure 10.
Figure 10. Add a new Fortran source file in Visual Studio 2008.
The next task is to change the Project Property settings to build for the x64 architecture by selecting Build
-> Configuration Manager. In this dialog, select the Active Solution Platform drop-down menu and select
<New…> and then set the Type to x64 as shown in Figure 11.
Figure 11. Modifying the Solution Platform to x64 for 64-bit Windows environments.
Close open dialogs and double-click to open Source1.F90 in the Solution Explorer. Here, paste in the
following code from an example in the IMSL Fortran documentation for the .ix. operator shown in Figure
12. Notice the array size is n x n x k for this box data type problem, which translates to k planes of an
n x n matrix. Configured in this way, the problem separates nicely into k planes, which are distributed
across the network using MPI.
      use rand_int
      use norm_int
      use operation_x
      use operation_ix
      use mpi_setup_int
      implicit none
      integer, parameter :: n=32, k=4
      real(kind(1e0)) :: one=1e0
      real(kind(1e0)), dimension(n,n,k) :: A, b, x, err(k)

      call erset(0,1,0)
!                                  Setup for MPI.
      MP_NPROCS=MP_SETUP()
!                                  Generate random matrices for A and b:
      IF (MP_RANK == 0) THEN
          A = rand(A); b=rand(b)
      END IF
!                                  Compute the box solution matrix of Ax = b.
      x = A .ix. b
!                                  Check the results.
      err = norm(b - (A .x. x))/(norm(A)*norm(x)+norm(b))
      if (ALL(err <= sqrt(epsilon(one))) .and. MP_RANK == 0) &
          write (*,*) 'Example for .ix. is correct.'
!                                  See to any error messages and quit MPI.
      MP_NPROCS=MP_SETUP('Final')
      end
Figure 12. Fortran source code for the example using MPI and the .ix. operator.
In this example, two random matrices are created and the inverse computed using the .ix. operator,
which is MPI-enabled and will distribute the work across the cluster. The result is checked using the .x.
operator and the norm function. More information about this example and the use of overloaded
operators can be found in the online documentation.
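The acceptance test in Figure 12 is a standard scaled-residual check: err = ||b - A.x|| / (||A||·||x|| + ||b||), accepted when it falls below the square root of machine epsilon. The same criterion can be reproduced in any language; here is a rough Python sketch (an illustration only: the helper names are hypothetical, Frobenius norms are used, and the epsilon is that of Python's double precision rather than the single precision of the Fortran example):

```python
import math
import sys

def frob(M):
    """Frobenius norm of a matrix stored as a list of rows."""
    return math.sqrt(sum(v * v for row in M for v in row))

def matmul(A, B):
    """Plain triple-loop matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def residual_ok(A, x, b):
    """Scaled residual test mirroring the check in Figure 12:
    accept x when ||b - A x|| / (||A|| ||x|| + ||b||) <= sqrt(eps)."""
    Ax = matmul(A, x)
    r = [[b[i][j] - Ax[i][j] for j in range(len(b[0]))]
         for i in range(len(b))]
    err = frob(r) / (frob(A) * frob(x) + frob(b))
    return err <= math.sqrt(sys.float_info.epsilon)
```

The denominator scales the residual by the size of the data, so the test is insensitive to the overall magnitude of A and b.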
The IMSL Fortran library has not yet been added to the project as a reference, so the next step is to
update the Project Properties again to add the Include and Library directories. Select Project -> Console1
Properties to open the next dialog. Under the section Configuration Properties -> Fortran, several
directories need to be added under Additional Include Directories. Click in the blank area, click the
drop-down button and select <Edit...> to bring up a dialog for editing the list. The folders include:
- C:\Program Files\Microsoft HPC Pack 2008 SDK\Include
- C:\Program Files (x86)\VNI\imsl\fnl600\Intel64\include\dll
- C:\Program Files (x86)\Intel\Compiler\Fortran\10.1.021\em64t\Include
Note that the paths may be different on different systems depending on where the various products and
tools were installed. Click OK to close the dialog. Figure 13 illustrates this step:
Figure 13. Configuring the additional Include directories in Visual Studio 2008.
Next, select the Language item and click "Process OpenMP Directives" to add the /Qopenmp flag to
the compiler options as shown in Figure 14. This is not specifically required for the example here, but
your own code may include OpenMP directives, and this switch must be turned on in that case.
Figure 14. Adding the /Qopenmp command line compiler option in Visual Studio 2008.
Click Apply to set this change. Next, a few items need to be added under the Linker section. Close the
Fortran part of the tree under Configuration Properties, open the Linker options and select Input. Under
Additional Dependencies, we need to add the following items:
imsl.lib imslsuperlu.lib imslhpc_s.lib imslp_err.lib mkl_scalapack.lib
mkl_blacs_mpich2.lib mkl_em64t.lib libguide.lib msmpi.lib msmpifec.lib lmgr.lib
kernel32.lib user32.lib netapi32.lib advapi32.lib gdi32.lib comdlg32.lib comctl32.lib
wsock32.lib libcrvs.lib libFNPload.lib libsb.lib
There is not a pop-up dialog, such as in the Include Directories section, to enter this list. It is necessary to
enter them in the area one after another with spaces between (or paste the list from this document).
These add references to several IMSL Fortran Library components and also MS-MPI files. Use Figure 15
as a guide.
Figure 15. Adding dependent libraries to the project in Visual Studio 2008.
Of course, the project needs path information for these files as well. Under Linker -> General, add the
following paths, as above for the Include Directories, under the Additional Library Directories section:
- C:\Program Files\Microsoft HPC Pack 2008 SDK\lib\amd64
- C:\Program Files (x86)\VNI\imsl\fnl600\Intel64\lib
- C:\Program Files (x86)\Intel\Compiler\Fortran\10.1.021\em64t\Lib
- C:\Program Files\Microsoft SDKs\Windows\v6.0A\Lib\x64
Figure 16 illustrates this step.
Figure 16. Adding additional Library paths in Visual Studio 2008.
Close all the open dialogs, and we are finally ready to compile the solution. Select Build -> Build Solution
from the main Visual Studio menu. Hopefully everything builds properly and you see the friendly “Build
succeeded” message in the bottom information area. If not, check the source code and configuration
settings for typos and missing pieces.
Running the Example
After the project is built, browse to the output directory (typically Console1\x64\Release) and locate the
executable console1.exe. This file must be copied to a network share that is visible to all the nodes on
the cluster. For this example, the head node is named “clusterbot” and by convention distributed
applications are placed in a shared directory named “tasks”. The full path is
\\clusterbot\tasks\VNI\console1.exe.
To execute this code, open the Windows HPC Job Manager by browsing Start -> Programs -> Microsoft®
HPC Pack -> HPC Job Manager. To submit a new job, select Actions -> Job Submission -> New Job as
shown in Figure 17.
Figure 17. Selecting a New Job in the Windows HPC Job Manager.
This will open the Create New Job dialog box where all the details are entered. Name the Job something
descriptive like “IMSL Example” and then select the Task List option from the left hand navigation pane
and click Add. In the Command Line field, enter the full path to the console1.exe discussed above along
with a leading “mpiexec” entry. The mpiexec program is required to run MPI applications. The working
directories should also be entered as valid shared paths on the network. Also configure the Minimum and
Maximum resources as applicable to your configuration. Please refer to Figure 18.
Figure 18. Configuring the Task Details in the Windows HPC Job Manager.
Save these entries and the summary should appear similar to Figure 19.
Figure 19. The summary of the job for the MPI example.
Click Submit and the Job will be added to the queue. In the listing of All Jobs in the Windows HPC Job
Manager, this job will appear after it has been submitted. At first its State will be Running, but it will
soon change to Finished as shown in Figure 20.
Figure 20. The status of the submitted job in the Windows HPC Job Manager.
From the Command Line
Many developers continue to be very comfortable at the command line; therefore this short section will
walk through the above example from the command line point of view. When using the compiler tools
from the command line, it is best practice to start with the command line window supplied with the Intel
compiler. This window presets environment variables and compiler settings; it can be accessed by using
the shortcut Start -> Programs -> Intel Software Development Tools -> Intel Fortran Compiler 10.1 ->
Visual Fortran Build Environment. To use IMSL Fortran in this setting, the next step is to run the
fnlsetup.bat startup script. The command session to this point may look like the following:
Intel(R) Visual Fortran Compiler for applications running on Intel(R) 64, Version
10.1.021
Copyright (C) 1985-2008 Intel Corporation. All rights reserved.
Setting environment for using Microsoft Visual Studio 2008 x64 cross tools.
C:\>"c:\Program Files (x86)\vni\imsl\fnl600\Intel64\bin\fnlsetup.bat"
Setting environment for IMSL Fortran Library - Intel64
C:\>
Several important and useful environment variables will be set at this stage. Since the standard MPI
environment for the IMSL Fortran product is MPICH2, you may need to adjust the LINK_MPI_HPC
environment variable to match the following:
SET LINK_MPI_HPC=imsl.lib imslsuperlu.lib imslhpc_s.lib imslp_err.lib
mkl_scalapack.lib mkl_blacs_mpich2.lib mkl_em64t.lib libguide.lib msmpi.lib
msmpifec.lib lmgr.lib kernel32.lib user32.lib netapi32.lib advapi32.lib gdi32.lib
comdlg32.lib comctl32.lib wsock32.lib libcrvs.lib libFNPload.lib libsb.lib /link
/force:multiple
You may also need to add the path to msmpi.lib and msmpifec.lib (typically C:\Program Files\Microsoft
HPC Pack 2008 SDK\Lib\amd64) to the LIB environment variable as well. To compile the source code,
issue the following command utilizing the configured environment variables:
%MPIF90% %MPIFLAGS% Source1.F90 %LINK_MPI_HPC%
This will build the executable Source1.exe that again should be copied to a shared location on the
network. Using the shared directory location described above, the command to submit the job is as
follows:
C:\>job submit /StdOut:\\clusterbot\tasks\vni\stdout.txt /numnodes:4 mpiexec
\\clusterbot\tasks\vni\source1.exe
Job has been submitted. ID: 1.
Note that a job submitted through the command line in this manner will still appear in the Windows HPC
Job Manager GUI interface. All of the options that can be set in the graphical interface are also available
from the command line. The results of the submission are the same, and the two methods work together
when some developers prefer one over the other on the same cluster. Also, the command line switches
passed to "job submit" override those passed to mpiexec; think of mpiexec as another parameter of
the specific job. The interaction between the GUI and the command line is visible in Figure 21, where the
command is issued in the console area and the job appears with its current status in the Windows HPC
Job Manager.
Figure 21. Submitting an MPI job through the command line is equivalent to using the graphical interface.
Other IMSL Libraries with MPI
Other versions of the IMSL Numerical Libraries, besides the IMSL Fortran Library, can be used in MPI
settings as well. For MPI developers writing C/C++ applications, it may be easier to call a C library than
to interface with a Fortran library. While no components of the IMSL C Library themselves utilize MPI to
distribute calculations, the library can be integrated into parallel applications that require advanced
analytical calculations at each node. The IMSL C Numerical Library is thread safe, however, which
enables developers to write shared memory parallel applications built on the library.
For .NET developers, the Indiana University group led by Doug Gregor has made MPI bindings
available for managed code. The MPI.NET package can be used with the IMSL C# Numerical Library for
.NET Applications, akin to the IMSL C Library mentioned previously. As a managed code library written in
pure C#, the IMSL C# Library integrates easily with .NET tools like MPI.NET.
Parameter Sweep Distributed Applications
Architecture Overview
In this section, we will provide an example of a Parameter Sweep application where the same code is
executed on each compute node but with different input data. MS-MPI is not used in this example as the
nodes require no communication with each other. Instead, the Windows HPC Job Manager is used to
configure a Parameter Sweep job that indicates what program is to be executed, what input parameters
are to be used, and where output is to be collected. The runtime components are located in a shared
folder on the head node that each compute node can see. When the job begins to run, information is
distributed to each compute node, where the program runs independently of the other compute nodes.
Consider the typical network topology for Windows HPC Server 2008 shown in Figure 22 in contrast to
the one described in Figure 2. Here distribution of the code is managed by the Head Node and its tools
rather than MS-MPI. Calculations are spawned on individual nodes where access to dependencies like
the IMSL Libraries is required by each node, typically using a shared network resource. This network
resource could also be used to collect output from each node or instance of the distributed application,
but for this basic example output is piped to standard out and collected after the simulation completes.
Figure 22. A typical cluster network topology for a Parameter Sweep application.
Example: Parameter Sweep Distributed Application in C#
The Windows HPC Server 2008 suite of tools allows developers to easily create parameter sweeps where
the same code is executed in parallel on different nodes of the cluster. In a typical case, a single-threaded
application is written to perform some calculation that can be repeated hundreds or thousands of times by
variation of input parameters. The Windows HPC Job Manager has command line and graphical user
interfaces to define the tasks to be distributed. Any of the IMSL Numerical Libraries could be included as
a component of the code to be distributed. The following example focuses on the IMSL C# Numerical
Library and the distribution of a Monte Carlo simulation .NET application across a cluster.
In this example, a small application was created to run a simulation based on a random seed provided via
the command line when defining the distributed tasks. Since a single simulation runs very quickly, each
individual task performs a number of simulations. Subsequently the results are aggregated together in a
file for this simple example. The rest of this section walks through all of the steps necessary to create and
execute a Parameter Sweep application written in C#. Of course any language could be used, not even
limited to the .NET family, but this example will leverage Visual Studio and C#; the code is fairly
straightforward and should not be challenging to port to other languages.
Installing the IMSL C# Numerical Library
To obtain the IMSL C# Numerical Library, you can request an evaluation copy from the Visual Numerics
website at http://www.vni.com/forms/cSharp_registrationForm.php and an evaluation CD will be mailed to
you. Alternatively, you can contact an Account Manager at Visual Numerics and request a secure FTP
download. You will need a valid license key to execute the example, which can be acquired by contacting
an Account Manager at Visual Numerics by email. If you have downloaded the product, first unzip the
archive named p10408.zip. Note that the part number may be updated as new versions are released.
The CD contents are the same as this archive. In either case the file listing should look like Figure 23.
Start the setup procedure by running the Setup.exe application.
Figure 23. Install files for the IMSL C# Numerical Library.
Run the Setup.exe application to start the installation.
The IMSL C# Library can be used in 32-bit and 64-bit environments for .NET 1.1 or .NET 2.0 and greater.
For the 64-bit environment of Windows HPC Server 2008, select the third option, “IMSL C# for .NET 2.0
and above, 64-bit FlexLM DLL” as seen in Figure 24. The contents of the library are the same for each
version, but this version is built specifically for .NET 2.0 or greater linking in a 64-bit DLL for the license
manager.
Figure 24. Select the 64-bit version for Windows HPC Server 2008.
You should now see the initial welcome screen for the Setup Wizard. Click Next > to continue the
installation. Next, you will need to accept the Visual Numerics, Inc. End-User License Agreement by
selecting “I accept” and clicking Next >. Enter a User Name and Organization and click Next > again. If
you are a current customer and have a License Number, enter it on the following screen; otherwise enter
999999 and click Next > to continue. The Installation Location must be specified next; to use the default
of “C:\Program Files (x86)\VNI\” click Next > to proceed. The installer is finally ready to copy files; click
Install to begin this step. Progress is monitored as the files are copied. Once complete, a final dialog will
let you click Finish to complete the installation and close the Setup Wizard. Click Close on the initial setup
dialog to end the procedure. The following montage of screenshots, collectively Figure 25, should help
guide you through the process:
Figure 25. Screenshots showing the steps to install the IMSL C# Numerical Library.
The product can now be found in the installation folder. To install your license key, browse to C:\Program
Files (x86)\VNI\imsl\license and create or paste the license file as indicated by the information supplied
with the license key. You can find the full product documentation under the C:\Program Files
(x86)\VNI\imsl\imslcs500\manual folder along with a gallery of demonstration applications in
C:\Program Files (x86)\VNI\imsl\imslcs500\gallery. All of the assemblies and shared libraries can be
found in C:\Program Files (x86)\VNI\imsl\imslcs500\bin. Note that for all these paths, the folder name
"imslcs500" is specific to the 5.0 version of the IMSL C# Library; future versions will have updated
version numbers in the folder name. The ImslCS.dll is the primary assembly, which is a pure managed code library.
For higher performance, Visual Numerics also supplies a version of the library that uses the native C++
Intel Math Kernel Library (MKL) for BLAS functions and this is named ImslCS_mkl.dll. More information
about these files and the product can be found in the ReadMe.html file located at C:\Program Files
(x86)\VNI\imsl\imslcs500\ReadMe.html.
Creating the Project
To get started, create a new Project that holds a C# Console Application in Visual Studio as shown in
Figure 26. Any name is fine, but the default ConsoleApplication1 is used in the example.
Figure 26. Creating a new C# Console Application in Visual Studio 2008.
You will be presented with a standard C# class template with a Namespace and Class that has an empty
Main method. The next step is to integrate the IMSL C# Numerical Library into the project by adding a
reference to the assembly. The reference can be added by right-clicking on References in the Solution
Explorer and selecting “Add Reference…” or by choosing Project -> Add Reference on the Visual Studio
menu bar. This will spawn the Add Reference dialog box shown in Figure 27. Browse to the ImslCS.dll
assembly or enter its path in the File Name area; the default path is
C:\Program Files (x86)\VNI\imsl\imslcs500\bin\ImslCS.dll.
Figure 27. Adding a reference to the IMSL C# assembly in Visual Studio 2008.
Confirm the assembly is available by entering using Imsl.Math in the source code. The Visual Studio
auto-complete feature should display suggestions after the dot is typed. Additionally, whenever a class is
referenced, all of the available methods and properties are displayed; selecting a method or constructor
will show all of the required parameters. This convenient feature of Visual Studio 2008 is shown in Figure
28, Figure 29 and Figure 30.
Figure 28. Code completion at the Namespace level in Visual Studio 2008.
Figure 29. Code completion showing available methods for an Imsl.Stat.Random object instance in Visual Studio 2008.
Figure 30. Code completion showing required parameters for a method in Visual Studio 2008.
The next step is to enter the source code, shown in Figure 31 but without the hardcoded
variance-covariance matrix. That matrix would take over 40 pages to print in this document, so it is
summarized here in a #region block; please refer to Appendix A for details on obtaining the dataset.
It is a dense 100 x 100 matrix of double values with the main diagonal containing the variance of each
of the 100 assets to be modeled; the off-diagonal element a(i,j) is the covariance of the i-th and j-th
assets.
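Before reading the listing, it may help to see the core numerical step in isolation. Correlated normal draws are produced by applying the Cholesky factor L of the covariance matrix (covar = L·Lᵀ) to a vector of independent standard normals, which is what the Imsl.Math.Cholesky and Imsl.Stat.Random classes do together in the code below. A minimal pure-Python sketch of the idea (an illustration only; the function names here are hypothetical and the tiny 2 x 2 covariance matrix is made up for the example):

```python
import math
import random

def cholesky(C):
    """Lower-triangular L with C = L L^T, for a symmetric
    positive definite matrix C stored as a list of rows."""
    n = len(C)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(C[i][i] - s)
            else:
                L[i][j] = (C[i][j] - s) / L[j][j]
    return L

def multivariate_normal(L, rng):
    """One correlated draw: r = L z, where z is iid standard normal."""
    z = [rng.gauss(0.0, 1.0) for _ in L]
    return [sum(L[i][k] * z[k] for k in range(i + 1))
            for i in range(len(L))]

# The parameter sweep passes a different seed to each task.
rng = random.Random(12345)
covar = [[0.04, 0.01], [0.01, 0.09]]   # made-up 2 x 2 example
r = multivariate_normal(cholesky(covar), rng)
```

The C# example factors the 100 x 100 matrix once in the constructor and then reuses the factor for every sample, which is exactly the pattern above scaled up.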
using System;
using Imsl.Stat;
using Imsl.Math;

namespace Simulate
{
    class Compute
    {
        private Cholesky chol;
        private double[] bins, portfolioValues;
        private int nVariables;
        private int nSamples = 5000;

        public Compute(int seed)
        {
            [covar data]
            nVariables = covar.GetLength(0);
            chol = new Cholesky(covar);
            portfolioValues = new double[nVariables];
            for (int i = 0; i < nVariables; i++)
            {
                portfolioValues[i] = 200;
            }
            RunMonteCarlo(seed);
        }

        public void RunMonteCarlo(int seed)
        {
            int nBins = 50;
            double max = 0.01;
            Imsl.Stat.MersenneTwister mt = new MersenneTwister(seed);
            Imsl.Stat.Random random = new Imsl.Stat.Random(mt);
            double center = Portfolio(new double[nVariables]);
            bins = new double[nBins];
            double dx = 2.0 * max / nBins;
            double[] x = new double[nBins];
            for (int k = 0; k < nBins; k++)
            {
                x[k] = -max + (k + 0.5) * dx;
            }
            // This would typically be a threaded loop
            // but in this serial version, we just work
            // on a set of single samples.
            for (int i = 0; i < nSamples; i++)
            {
                double[] r = random.NextMultivariateNormal(
                    nVariables, chol);
                double val = Portfolio(r);
                double t = (val - center) / center;
                int j = (int)System.Math.Round((t + max - 0.5 * dx) / dx);
                Console.Out.WriteLine(j);
            }
        }

        double Portfolio(double[] returns)
        {
            double sum = 0.0;
            for (int k = 0; k < returns.Length; k++)
            {
                sum += portfolioValues[k] * (1.0 + returns[k]);
            }
            return sum;
        }

        /// <summary>
        /// The main entry point for the application.
        /// One argument is expected, the integer seed
        /// for the random number generator.
        /// </summary>
        static void Main(string[] args)
        {
            int seed;
            try
            {
                seed = Convert.ToInt32(args[0]);
            }
            catch (Exception)
            {
                System.Random r = new System.Random();
                seed = r.Next();
            }
            new Compute(seed);
        }
    }
}
Figure 31. C# source code for the Monte Carlo model to be run as a parameter sweep.
The Main method expects a random seed as an argument; some error checking is included for testing
purposes so that the program will still run if the argument is not provided. The seed is used by the
Imsl.Stat.MersenneTwister class to create the set of random numbers for this instance of the application.
The data used for the simulation is the Cholesky factorization of the variance-covariance matrix,
computed using the Imsl.Math.Cholesky class. The simulation is rather simple, with no weighting of the
assets, and the output for the nSamples results is written to the console output. The Parameter Sweep
configuration will drop these in a central location for easy post-run analysis. Build the application by
selecting Build -> Build Solution in Visual Studio; hopefully "Build succeeded" appears in the status area.
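Each task therefore emits one integer bin index per sample. The mapping from a relative portfolio return t in [-max, max] to its bin is the same arithmetic used in RunMonteCarlo, restated here as a one-function Python sketch for clarity (the function name is made up; the parameter defaults match the C# constants):

```python
def bin_index(t, max_dev=0.01, n_bins=50):
    """Map a relative return t in [-max_dev, max_dev] to a bin index,
    mirroring the rounding done in RunMonteCarlo. Bin k is centered at
    x[k] = -max_dev + (k + 0.5) * dx where dx = 2 * max_dev / n_bins,
    so a value exactly at a bin center maps back to k."""
    dx = 2.0 * max_dev / n_bins          # bin width
    return round((t + max_dev - 0.5 * dx) / dx)
```

Writing bin indices instead of raw returns keeps the per-task output small and makes the later aggregation step a simple counting exercise.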
Running the Example
The binaries required in the deployment include the executable just built (ConsoleApplication1.exe) and
the assemblies associated with the IMSL C# Library (ImslCS.dll and LicenseFlexLM.dll). Copy these
three files to a shared directory visible to all nodes of the cluster. This example uses
\\clusterbot\tasks\VNI. Finally note that the environment variable LM_LICENSE_FILE must be configured
for each node pointing to a valid license file. The pieces are in place, so next it is time to define the job.
Open the Windows HPC Job Manager (Start -> Programs -> Microsoft HPC Pack -> Windows HPC Job
Manager) and select Actions -> Job Submission -> Parametric Sweep Job as shown in Figure 32.
Figure 32. Submitting a new Parametric Sweep Job using the Windows HPC Job Manager.
This opens the Submit Parametric Sweep Job dialog window. For this example, run 50 tasks across 8
cores for a total of 250,000 simulations (as each individual task performs 5,000). Therefore, set the
End Value index to 50 and modify the Command Line entry to point to the executable built above. The
options should look similar to Figure 33.
Math Libraries for Windows HPC
31
Figure 33. Task details for a Parametric Sweep Job in the Windows HPC Job Manager.
Click Submit and the job will run as defined. Note, however, that with this quick method the job will only
run on a single core. To spread it out across the nodes, select the finished job in the Windows HPC Job
Manager window and click View Job under Job Actions. Examine the Task List and you will find that the
Requested Resources value is just "1-1 Cores" (or Sockets or Nodes, depending on the default resource
type configured). To expand the job to run on all resources, click Save Job As and save the job description
as an XML file, for example paramsweep.xml. Click Cancel to close the dialog after saving. Now select
Create New Job From Description File under the Job Submission menu and open the XML file just saved.
Under Job Details, set the Minimum and Maximum resources as appropriate for the cluster. With four
dual-core nodes in this cluster, both can be set to 8, as shown in Figure 34.
Figure 34. Configuring the minimum and maximum resources for a job in the Job Details menu.
Next, select Task List in the left-hand navigation menu and notice the "1-1" under Required Resources,
Number of Cores. Set this to match the values defined just above; see Figure 35.
Figure 35. Updating the Required Resources for a parameter sweep job in the Windows HPC Job Manager.
Now save the updated XML file with the Save Job As button, or click Submit to run the job
across all the resources of the cluster. The output of this example is 50 files in the working directory
with names like "24.out". To view the results in a meaningful way, a separate program can be written that
reads in each file and bins the values into a histogram. This job ran the task 50 times across eight cores
for a total of 250,000 simulations, producing a very smooth distribution of results. The output is
presented in Figure 36.
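As a sketch of that post-run step, the following program reads every *.out file from the sweep's working directory and bins the values into a fixed-width histogram. The bin range (-4 to 4) and bin count (20) are arbitrary choices for illustration; adjust them to the scale of your simulation results.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Sketch of a post-run analysis tool: read every "*.out" file produced
// by the parameter sweep and bin the values into a text histogram.
public class HistogramSketch
{
    // Count values into 'bins' equal-width buckets spanning [min, max].
    public static int[] Bin(IEnumerable<double> values, double min, double max, int bins)
    {
        int[] counts = new int[bins];
        double width = (max - min) / bins;
        foreach (double v in values)
        {
            int b = (int)((v - min) / width);
            if (b < 0) b = 0;             // clamp outliers below the range
            if (b >= bins) b = bins - 1;  // clamp the max value and outliers above
            counts[b]++;
        }
        return counts;
    }

    public static void Main(string[] args)
    {
        // Working directory of the sweep; defaults to the current directory.
        string dir = args.Length > 0 ? args[0] : ".";
        var values = new List<double>();
        foreach (string file in Directory.GetFiles(dir, "*.out"))
            foreach (string line in File.ReadAllLines(file))
            {
                double v;
                if (double.TryParse(line, out v))
                    values.Add(v);
            }
        int[] counts = Bin(values, -4.0, 4.0, 20);
        for (int i = 0; i < counts.Length; i++)
            Console.WriteLine("bin {0,2}: {1}", i, counts[i]);
    }
}
```

Piping the bin counts into a spreadsheet or plotting tool then gives a histogram like the one in Figure 36.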
Figure 36. Monte Carlo simulation results from a distributed IMSL C# Numerical Library application.
Summary
This document provided an overview of the math libraries available for the Windows platform, with a
specific focus on developers writing distributed applications for Windows HPC Server 2008. A distributed
example using the MS-MPI implementation and the IMSL Fortran Numerical Library demonstrated
development with the Intel Fortran Compiler, Visual Studio 2008, and the Windows HPC Job Manager.
Finally, a parameter sweep example presented code written in C# leveraging the IMSL C# Numerical
Library for .NET Applications; again, Visual Studio 2008 and the Windows HPC Job Manager were the
primary tools.
Appendix A
The raw data used for the Parameter Sweep example is too long to list inline with the source code in
Figure 31. For the sake of completeness, the full data array is available online in a thread at the Visual
Numerics Forum so that a reader can make use of the example code; a ZIP file is attached to the thread
and is available for download. The definition of the covar variable should be placed where [covar data]
is indicated in the source code listing.
Feedback
Did you find problems with this tutorial? Do you have suggestions that would improve it? Send us your
feedback or report a bug on the HPC developer forum.
More Information and Downloads
Informational URL for the IMSL Libraries: http://www.vni.com/products/imsl/index.php
Download link: http://www.vni.com/downloads/index.php
This document was developed prior to the product’s release to manufacturing, and as such, we cannot guarantee that all details included herein will be
exactly as what is found in the shipping product.
The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication.
Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft
cannot guarantee the accuracy of any information presented after the date of publication.
This White Paper is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED, OR STATUTORY, AS TO THE
INFORMATION IN THIS DOCUMENT.
Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may
be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying,
recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.
Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document.
Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these
patents, trademarks, copyrights, or other intellectual property.
© 2008 Microsoft Corporation. All rights reserved.
Microsoft, Visual C++, Visual Studio, Windows, and the Windows logo are trademarks of the Microsoft group of companies.
All other trademarks are property of their respective owners.