High Throughput Computing Week: Introduction to the Digipede Network

©Copyright Digipede Technologies, LLC.
Digipede and the Digipede Network are trademarks of Digipede Technologies, LLC. Microsoft, Excel, Visual Basic, and
Visual Studio are registered trademarks of Microsoft Corporation in the United States and/or other countries. All other
trademarks are the property of their respective owners.
Digipede Network™
Session 1 Training Guide
Table of Contents
Preface
  Intended Audience
  How to Contact Us
  Conventions Used in this Guide
Introduction
  Benefits of the Digipede Network
  Digipede Network System Overview
    How the Infrastructure Works
      Digipede Server
      Digipede Agent
    How to Create and Submit a Job
      Digipede Workbench
      Digipede Framework SDK
      PowerShell Scripting
      jobsubmit.exe
    Software Requirements
      .NET 2.0
      IIS
      Microsoft SQL Server
  Summary
Digipede Software Components
  Digipede Server
    What Gets Installed?
      DigipedeControl
      DigipedeTransfer
      DigipedeWS
      Database
      Services
      Event Log
    Advanced Topics
      Digipede Licensing
      Server Licenses
      Agent Processor Licenses
  Summary
  Digipede Agent
    What Gets Installed?
      Local Storage
      Event Log
    Advanced Topics
      Silent installation for large-scale deployments
    Summary
Digipede Control Basics
  Pools
  Compute Resources
    Agent Availability
    Agent Base Priority
  Agent Administration
    Check-in Frequency
  Users
    System Roles
      System Administrator
High Throughput Computing Digipede Training.doc
November, 2007
      System Monitor
  Job Template Administration
  Job Administration
    Job Control
    Troubleshooting
  Best Practices
    Pool Configuration
    Job Template Management
  Summary
Job Templates, Jobs, and Tasks
  File Definitions and Parameters
    File Definition
      Relevance
      File Transfer
      Location
    Parameter
  Summary
  Job Template
    File Definition
    Version
    Application Control
      Command line
      Standard Out/Standard Error
    .NET APIs
      Executive
      Worker
    COM
      IComWorker
    Job Defaults
    Summary
  Job
    File Definitions
    Parameters
    Settings
    Summary
  Task
    File Definitions
    Parameters
    Result Files
    Summary
Digipede Workbench
  Wizards
    Job Template Wizard
    Job Wizard
    Parameters in Workbench
  Designers
    Job Template Designer
    Job Designer
  Job Tracking
  Saving Job Templates and Jobs
Digipede Control
  Job Template Page
  Jobs Page
  Task Page
Hello World Walkthrough
  Defining the Job Template and Initial Job
Minen Walkthrough
References
Glossary
Preface
Distributed computing has moved from academic research to
commercial reality. Organizations today can use existing
compute resources to improve the scalability and speed of
their most demanding applications. Choosing the right
platform is the key to distributed computing success. Most or
all of a business’s servers, workstations, and software run on
Microsoft Windows. Their developers use Microsoft's Visual
Studio .NET software development tools. The Digipede
Network, built entirely on the Microsoft .NET platform, is the
answer.
As a seasoned Microsoft software developer, Digipede
understands the needs of customers using Microsoft
technologies. Other distributed computing solutions focus on
UNIX and Linux, requiring lengthy implementation, steep
learning curves, and a heavy IT burden. In contrast, the
Digipede Network is radically easier to buy, install, learn, and
use. With its familiar Windows user interface, the Digipede
Network allows users to become productive immediately.
Unlike competing solutions, no complex scripting, major
modification of existing applications, or on-site implementation
help is necessary.
How to Contact Us
Address:
Digipede Technologies
3640 Grand Avenue
Suite 206
Oakland, CA 94610
Phone:
(510) 834-3645
Community Forums:
http://support.digipede.net/community/
Website:
http://www.digipede.net/
The Digipede Network delivers the benefits of distributed
computing at any scale. Whether a small department with five
computers or a corporation with thousands of servers,
desktops, and cluster nodes, everyone can benefit. The
Digipede Network can be downloaded, installed, and
configured in less than an hour, and you’ll be on your way to
improved productivity and application performance. It's that
simple.
Intended Audience
This document serves as an introduction to the Digipede
Network and is a companion guide for Session 1 of the
Digipede Training Series. Session 1 training is intended for
those who will install and administer the Digipede Network.
We have tested and verified the information provided in this
book; however, you might find that features have changed or
been added. Please let us know of any errors you find, as well
as any suggestions for future editions.
Conventions Used in this Guide
The following formatting conventions are used in this guide:
Type of term                                      Convention
Reference to another document                     “Quotes”
Reference to another section in this document     Bold
Reference to a table or figure in this document   Bold
Element in user interface                         Bold
File names and paths                              Italics
Command lines, code examples                      Courier font
We'll use the terms grid computing and distributed computing
interchangeably. Some people prefer one over the other, but
we think of them as meaning the same thing: using many
computers together to get work done faster.
Introduction
The Digipede Network is a distributed computing solution that
delivers dramatically improved performance for real-world
business applications. By utilizing the power of distributed
computing, enterprises and developers achieve better speed,
scalability, and reliability for their applications. Built entirely on
the .NET platform, the Digipede Network is radically easier
to buy, install, learn, and use than other grid computing
solutions. It includes the Digipede Framework, with which
developers can build scalable, high-performance, distributed
applications in the familiar Visual Studio environment. It can
also be used with existing software, without re-linking or
recompiling.
Here are a few examples of how users can reap immediate
benefits:
1. Command-line Applications - No Recompiling
Required. Any command-line application that does not
require user interactions can be distributed as is. The
Digipede Network delivers dramatically increased
performance on key applications - with no code
modification.
2. Enterprise Software – Scale-Out the Middle Tier. If your
enterprise applications could scale better, so could your
business. When your software scales to meet your needs,
you can handle more growth, take on larger jobs, and keep
your teams more productive. Many enterprise applications
are constrained by middle-tier scalability issues; the
Digipede Network is designed to eliminate these
bottlenecks.
3. Service-Oriented Applications - Don't Let Your Users
Wait. As more applications are made available as
services, either as part of a Service-Oriented
Architecture or through any web interface, the need for
standardized techniques for scaling those
applications to handle variable usage becomes
increasingly important.
The Digipede Network features patent-pending technology to
provide automatic CPU load-balancing and guaranteed quality
of service, making it an ideal solution for scaling web services
or any SOA. Used in combination with Excel Services and
SharePoint Server, the Digipede Network puts the power of
grid computing behind Office 2007 servers, creating a solution
that delivers the new capabilities of Office 2007 with the
scalability you need.
Benefits of the Digipede Network
Scales out applications and processes for higher performance:
  - Distributes application load across Windows desktops,
    servers, and clusters.
  - Scales from five nodes to thousands.
  - Delivers order-of-magnitude increase in speed and
    throughput.
  - Capacity on demand.
Increases productivity:
  - Shorter runtimes mean less waiting, more productive
    work.
  - Increased use of idle resources raises IT efficiency.
  - Run multiple jobs on your grid - simultaneously.
  - Submit jobs from any networked computer.
  - Flexible and powerful APIs enable your developers to
    grid-enable applications quickly.
  - Enable developers to focus on business requirements
    instead of building an in-house grid or distribution
    platform.
Quote: "Installation was straightforward, and the Digipede
Framework SDK made grid-enabling our applications far simpler
than we'd anticipated. We demonstrated near-linear scalability
on a critical application with just a few lines of code, and we
got far better management, monitoring, and flexibility than our
own tools offered." - actual Digipede customer
First commercial grid computing solution based entirely
on .NET:
  - Integrates with Visual Studio .NET for developer
    productivity.
  - Integrates with Windows security for consistency with
    current practices.
  - Uses Web services for ease of implementation.
  - Development community with forums and sample
    code.
Relies on a scalable grid computing platform that:
  - Guarantees quality of service.
  - Guarantees task completion.
  - Integrates data transfer.
  - Integrates with your current Windows security.
  - Provides job monitoring functions.
  - Provides automatic CPU load balancing.
  - Supports smart caching.
Quote: "With the Digipede Network we've been able to handle
ten times the load on our Web application with no decrease in
quality of service to our users...and we saved about $100,000
in hardware and software licensing costs when compared to
alternate solutions." - actual Digipede customer
Low total cost of ownership:
  - Radically easier than other grid systems - you can
    install and administer it yourself.
  - Uses standard server and desktop hardware.
Digipede Network System Overview
The Digipede Network comprises the following
components:
Infrastructure
  - The Digipede Server manages the workflow through
    the system.
  - Digipede Agents™ manage each of the individual
    desktops, servers, or cluster nodes and the tasks that
    run on them.
Administration
  - Digipede Control™, a website that resides on the same
    machine as the Digipede Server and provides the
    administrative user interface for the system.
Job Creation/Submission
  - The Digipede Workbench, an easy-to-use Windows
    application through which users can define and run
    jobs.
  - The Digipede Framework SDK™, a programming API
    that developers can use to programmatically create
    and submit jobs.
How the Infrastructure Works
The Digipede Server and the Digipede Agents make up the
grid infrastructure. To join the grid, a compute resource must
be able to connect to the Digipede Server (via HTTP) and
must have a Digipede Agent installed on it.
Once the grid infrastructure has been set up, a user can
submit a job from any machine on the network. The user
machine does not have to be a part of the grid; it must simply
be able to talk to the Digipede Server. For users coming from
a cluster background, this can be a new concept. Generally,
jobs submitted on a cluster must be submitted from the head
node. This is not the case with the Digipede Network where
any computer on your network can submit a job to the grid.
Note: If an agent or a compute
resource goes down while
working, the server will
automatically reassign that task
to another agent. The server
itself can have a separate
failover ready to take over if it
goes down. Guaranteed quality
of service, with no single point of
failure.
A job is a collection of tasks that is submitted to the Digipede
Network as one unit of work. A user creates a job and submits
the job to the Digipede Server. Once the Digipede Server
receives the job, the job information is placed into a prioritized
queue of work. As each Digipede Agent checks in with the
Digipede Server, it looks at the job queue to see if there are
any tasks available that it can run. When a Digipede Agent
identifies a task that it can execute, it takes the task, runs it,
and returns the results to the Digipede Server. The Digipede
Server then returns the task results back to the user.
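The flow described above can be pictured with a short sketch. The
Python below is not Digipede code and does not use the Digipede
Framework SDK; the Server class, its method names, and the numeric
priority scale (lower number served first) are all assumptions made
for illustration. It shows a prioritized queue from which agents pull
tasks at check-in and to which they report results.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count


@dataclass(order=True)
class QueuedTask:
    priority: int                       # assumption: lower number is served first
    seq: int                            # tie-breaker: submission order
    job_id: str = field(compare=False)
    payload: str = field(compare=False)


class Server:
    """Hypothetical stand-in for a server holding a prioritized work queue."""

    def __init__(self):
        self._queue = []
        self._seq = count()
        self.results = {}               # job_id -> list of task results

    def submit_job(self, job_id, payloads, priority=5):
        # A job is a collection of tasks submitted as one unit of work.
        for payload in payloads:
            heapq.heappush(
                self._queue, QueuedTask(priority, next(self._seq), job_id, payload)
            )

    def check_in(self, can_run):
        """An agent checks in and takes the first queued task it can run."""
        skipped, taken = [], None
        while self._queue:
            task = heapq.heappop(self._queue)
            if can_run(task):
                taken = task
                break
            skipped.append(task)
        for task in skipped:            # leave declined tasks for other agents
            heapq.heappush(self._queue, task)
        return taken

    def report(self, task, result):
        self.results.setdefault(task.job_id, []).append(result)


server = Server()
server.submit_job("low", ["a", "b"], priority=9)
server.submit_job("urgent", ["c"], priority=1)

# One agent that accepts every task: pull, run, return the result.
while (task := server.check_in(lambda t: True)) is not None:
    server.report(task, task.payload.upper())

print(server.results)
```

Because the agent supplies the can_run decision, the server in this
sketch never needs to know anything about the agent's hardware.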
A task executing on a compute node can access any
networked resource that the Digipede Agent has the right to
use. This includes file shares, databases, and even the
Internet.
On the surface the flow of job requests through the Digipede
Network is very simple, but there is a lot going on under the
covers to manage the requests and to ensure optimal use of
the grid resources.
Digipede Server
Many grid computing solutions have a job scheduler that
assigns tasks to specific compute resources. This requires
that the job scheduler keep track of each compute resource's
specific information, such as availability, hardware, and
installed software. This approach does not scale well, because
the server is forced to track and actively monitor each of the
compute resources.
The Digipede Server instead hands the task assignment
decision off to the Digipede Agents, because each Digipede
Agent knows about the compute resource that it is installed
on. The Digipede Server makes sure that tasks are completed,
keeps track of the jobs and their status, and stores job
information.
Note: The Digipede Server
consists of a Windows service
and a web service. The user
interface is provided entirely
through a browser-based
component (see Digipede
Control below).
Often jobs have files that are associated with them such as
data files or execution files. The Digipede Server supports
moving these files through the system to the compute
resource where they are needed. This frees the administrator
and user from the need to pre-load software and data on the
compute resources.
The Digipede Server can support a large grid because the
work assignment decisions are actually made by the Digipede
Agent.
Guaranteed job completion. The Digipede Server passively
monitors all work on the Digipede Network. If a Digipede
Agent is unable to finish a task, the Digipede Server puts that
task back in the task queue so that another Digipede Agent
can execute it. This is how the Digipede Server guarantees
that a job completes.
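A minimal sketch of the requeue idea follows. The guide does not say
how the server detects that an agent cannot finish a task, so the
timeout used here is an assumption, as are the TaskTracker class and
all of its method names.

```python
from collections import deque


class TaskTracker:
    """Hypothetical bookkeeping for pending, in-flight, and finished tasks."""

    def __init__(self, tasks, timeout=3):
        self.pending = deque(tasks)
        self.in_flight = {}             # task -> (agent name, tick when assigned)
        self.done = set()
        self.timeout = timeout          # assumption: silence longer than this means failure

    def assign(self, agent, now):
        if not self.pending:
            return None
        task = self.pending.popleft()
        self.in_flight[task] = (agent, now)
        return task

    def complete(self, task):
        if self.in_flight.pop(task, None) is not None:
            self.done.add(task)

    def reap(self, now):
        """Put tasks whose agent went silent back in the queue."""
        requeued = []
        for task, (_agent, started) in list(self.in_flight.items()):
            if now - started > self.timeout:
                del self.in_flight[task]
                self.pending.append(task)
                requeued.append(task)
        return requeued


tracker = TaskTracker(["t1", "t2"])
lost = tracker.assign("agent-A", now=0)     # agent-A takes t1, then disappears
ok = tracker.assign("agent-B", now=0)       # agent-B takes t2
tracker.complete(ok)

requeued = tracker.reap(now=5)              # t1 has been silent too long
retry = tracker.assign("agent-B", now=5)    # another agent picks it up
tracker.complete(retry)
```

Every task ends up in done even though one agent never reported back,
which is the property the paragraph above describes.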
Digipede Agent
The Digipede Agent decides which tasks it can execute based
on its hardware, software, and availability. This is called a pull
system, and there are many benefits to this approach:
Automatic CPU load-balancing. Each Digipede Agent takes a
task when it is available.
Compute node resource information is always up to date. The
Digipede Agent collects compute resource configuration
information each time it is started. For example, if RAM is
added to the compute resource, the Digipede Agent
automatically collects the new information when the machine
reboots. Knowing the amount of available RAM is important
because jobs can be defined with specific hardware
requirements. The Digipede Agent uses the most recent
system data to decide which tasks it can take. This eliminates
the need to notify the Digipede Server about hardware
upgrades.
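As an illustration only, the eligibility decision might look like the
sketch below. NodeInfo, Requirements, can_take, and the field names are
hypothetical; the guide says only that jobs can carry hardware
requirements and that the agent compares them against freshly
collected system data.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class NodeInfo:
    """What an agent might collect about its own node at startup."""
    ram_mb: int
    cpu_count: int
    installed: frozenset


@dataclass(frozen=True)
class Requirements:
    """Hypothetical requirements attached to a job."""
    min_ram_mb: int = 0
    min_cpus: int = 1
    software: frozenset = frozenset()


def can_take(node, req):
    # The agent, not the server, decides whether the node qualifies.
    return (
        node.ram_mb >= req.min_ram_mb
        and node.cpu_count >= req.min_cpus
        and req.software <= node.installed
    )


job = Requirements(min_ram_mb=4096, software=frozenset({"dotnet-2.0"}))
node = NodeInfo(ram_mb=2048, cpu_count=2, installed=frozenset({"dotnet-2.0"}))
before = can_take(node, job)            # not enough RAM yet

# After a RAM upgrade and reboot, the agent collects the new figure.
upgraded = replace(node, ram_mb=8192)
after = can_take(upgraded, job)
```

Nothing on the server side changes between the two checks; only the
agent's own view of its node is refreshed.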
Note: Only one Agent is installed
on a compute node—even if the
compute node has more than
one processor. However, the
agent can manage multiple
processes simultaneously in
order to take advantage of multiprocessor systems.
Software components can be cached on the compute node.
The Digipede Network supports the caching of files to reduce
bandwidth utilization. The Digipede Agent knows what
software it has installed and cached on the compute node. So,
if the user has a specific job that is submitted regularly, the
required execution and data files can be cached on the
compute resource so that they are only moved once. This
reduces the amount of bandwidth used to run common jobs
and eliminates the time needed to move the common files.
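The effect of caching can be sketched as follows. This is an assumed
design, not the Digipede implementation: the FileCache class, the
choice to key the cache on a file name plus version string, and the
fetch callback are all hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


class FileCache:
    """Hypothetical per-node cache that counts how often a file is transferred."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)
        self.transfers = 0

    def get(self, name, version, fetch):
        # Assumption: a file is identified by its name and a version string.
        key = hashlib.sha256(f"{name}@{version}".encode()).hexdigest()
        path = self.root / key
        if not path.exists():
            path.write_bytes(fetch())   # only now is the file moved to the node
            self.transfers += 1
        return path


with tempfile.TemporaryDirectory() as tmp:
    cache = FileCache(tmp)
    fetch = lambda: b"binary contents"  # stand-in for a real transfer

    first = cache.get("model.exe", "1.0", fetch)
    second = cache.get("model.exe", "1.0", fetch)   # served from the cache
    third = cache.get("model.exe", "1.1", fetch)    # new version, new transfer
    transfers = cache.transfers
```

Running the same job twice costs one transfer; only a changed version
triggers another.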
When the agent has identified a job in the queue that it is
eligible to work on, it executes the following steps:
1. It identifies any files that it needs to get in order to work on
the job. Files can arrive at the agent in one of three ways: they
can be streamed directly through the Digipede Network,
they can be copied from a file share, or they can be
fetched via HTTP.
2. It notifies the server that it is taking a task (or several
tasks) from the job, and asks for those tasks. The job itself
determines how many tasks an agent can take
simultaneously—it may be permitted to run more than one
task simultaneously.
3. It receives the tasks from the server (and the server notes
which tasks it was assigned).
4. It executes the tasks according to the type of job
(command-line, COM or .NET).
5. As each task completes, it notifies the server of the
completion (returning any appropriate results), and asks
for more work (when appropriate).
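The steps above can be sketched as a simple pull loop. This is a simplified model under stated assumptions, not the agent's actual implementation: the Server class, the run_agent function, and the max_tasks_per_agent setting are hypothetical names, and step 1 (fetching files) is assumed to have already succeeded.

```python
from collections import deque
from typing import Callable, Deque, Dict, List


class Server:
    """Hypothetical stand-in for the server's task bookkeeping."""

    def __init__(self, tasks: List[int], max_tasks_per_agent: int) -> None:
        self.pending: Deque[int] = deque(tasks)
        self.max_tasks_per_agent = max_tasks_per_agent   # determined by the job
        self.assigned: Dict[int, str] = {}               # task -> agent name
        self.results: Dict[int, int] = {}

    def take_tasks(self, agent: str) -> List[int]:
        # Steps 2 and 3: the agent asks for tasks; the server notes the assignment.
        batch: List[int] = []
        while self.pending and len(batch) < self.max_tasks_per_agent:
            task = self.pending.popleft()
            self.assigned[task] = agent
            batch.append(task)
        return batch

    def complete(self, task: int, result: int) -> None:
        # Step 5: the agent reports completion and returns the result.
        self.results[task] = result


def run_agent(name: str, server: Server, work: Callable[[int], int]) -> int:
    """Pull tasks until none remain; returns how many tasks this agent ran."""
    done = 0
    while True:
        batch = server.take_tasks(name)
        if not batch:
            break        # nothing left; a real agent would check in again later
        for task in batch:
            server.complete(task, work(task))   # step 4, then step 5
            done += 1
    return done


server = Server(tasks=list(range(5)), max_tasks_per_agent=2)
ran = run_agent("agent-1", server, work=lambda n: n * n)
print(ran, server.results)
```

The point of the sketch is that the agent, not the server, initiates every exchange: the server only records which tasks were handed out and which results came back.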
How to Create and Submit a Job
The Digipede Network provides several ways for a user to
create and submit a job. Traditional grid computing supports
job submission through a scripting language. The Digipede
Network expands job submission to new levels by providing
tools that make it easier for a user to create and submit jobs.
Flexibility is the key here. Some
users are comfortable with
programming languages, and
some are not. By providing a
comprehensive SDK and an
easy-to-use user interface, the
Digipede Network brings the
power of grid computing to more
people than ever.
Digipede Workbench
The Digipede Workbench is designed to replace scripting. A
GUI application, the Digipede Workbench provides wizards
that walk the user through the process of job creation. Jobs
can be submitted and monitored right from the UI.
Digipede Framework SDK
The Digipede Framework SDK is a set of libraries and
development tools that can be used to programmatically
manage the Digipede Network. Using the Digipede Framework
SDK, programmers can add the power of grid computing to
their own applications and some types of third party
applications.
PowerShell Scripting
Digipede has released a PowerShell snap-in that allows
complete use of the Digipede Framework—job submission,
monitoring, and control—from within a scripting environment.
While PowerShell alone allows complete access to the
Digipede Framework, the snap-in was designed to make
management tasks even easier by providing cmdlets for many
common tasks.
jobsubmit.exe
Jobsubmit.exe is a command-line application that can be used
to submit a Job to the Digipede Network and is often used to
submit jobs for batch processing. Jobsubmit sends XML files
representing jobs; these files can be created with the Digipede
Workbench, programmatically (serialized from the Digipede
Framework), or hand encoded.
Software Requirements
The Digipede Network is designed to make grid computing
easy and accessible. To accomplish these objectives the
Digipede Network is able to move files, guarantee job
completion, accurately report status, and allow job submission
from any computer on the network.
The Digipede Network takes advantage of Microsoft
technologies and as a result requires that certain Microsoft
technologies and software be installed on the machines the
Digipede Network runs on.
.NET 2.0
The Digipede Network is built using .NET 2.0 and takes
advantage of many of the advanced capabilities provided
by .NET. .NET 2.0 is required by the Digipede Server,
Digipede Agent, and the Digipede Workbench.
IIS
Microsoft’s Internet Information Services (IIS) is used by
several parts of the Digipede Network. Digipede Control is the
web-based administration tool, DigipedeTransfer provides
HTTP-based file transport, and DigipedeWS provides web
services for the entire Digipede Network. All of these
components are installed with the Digipede Server. IIS does
not need to be installed on the compute nodes themselves;
only the Digipede Server requires IIS.
While the Digipede Network
itself takes advantage of .NET
2.0, the applications distributed
by the Digipede Network do not
need to use .NET—they can be
unmanaged command-line
executables, COM servers, or
either .NET 1.1 or .NET 2.0
applications.
Microsoft SQL Server
Microsoft SQL Server is used by the Digipede Network to
store jobs, job templates, and configuration information. SQL
Server is required by the Digipede Server. If the administrator
does not have access to a SQL Server installation, the
Digipede Server will install and use a SQL Server Express
database.
Summary
The Digipede Network is an easy-to-use yet powerful
distributed computing tool. With many job submission tools to
choose from, the Digipede Network makes grid computing an
accessible and cost-effective way to improve application
performance and scalability.
Digipede Software Components
Digipede Server
The Digipede Server is the communication hub for the
Digipede Network and is the first Digipede Network
component to be installed. It is recommended that the
Digipede Server be installed on a dedicated server machine;
however, this is not required.
For step-by-step instructions on
installing the Digipede Server,
please see the “Digipede
Network Installation Guide”.
What Gets Installed?
The Digipede Server requires both IIS and SQL Server to work
properly. IIS is required because the Digipede Network installs
websites for administration and communication. SQL Server is
required because the Digipede Network uses a database to
store job and configuration information.
DigipedeControl
A multi-page website that provides the administrative user
interface for the Digipede Network. Users can submit and
monitor jobs via Digipede Control, but most non-administrator
users prefer using Digipede Workbench.
Having a browser-based
administrative tool means that the
administrator can monitor and
control the Digipede Network from
any machine in the enterprise.
DigipedeTransfer
A website that transports files via the HTTP (or HTTPS)
protocol. If the network architecture does not permit the use of
shares for file copying, you can use DigipedeTransfer to serve
files. You can also use this as a destination for results files.
Note: Installation of
DigipedeTransfer is optional.
DigipedeTransfer is only required if
you are using HTTP for file
transport.
Digipede Transfer can also be installed separately from the
Digipede Server by simply running the Digipede Server setup
application and choosing a Custom installation.
DigipedeWS
Provides web services to the Digipede Agents, Digipede
Workbench, and any other applications that submit and
monitor jobs on the Digipede Network.
Database
SQL Server is required because the Digipede Network creates
a database called DigipedeDB. DigipedeDB stores all job, job
template, and configuration information for the Digipede
Network.
The SQL Server instance can be installed on the same
computer as the Digipede Server or on a different one. If an
administrator is expecting to install a large and active grid,
then for optimal performance it is recommended that SQL
Server and the Digipede Server be installed on different
machines.
If you are going to install a failover
Digipede Server, you must install
SQL Server (and the Digipede
database) on a different machine
than the Digipede Server. Both the
primary and secondary Digipede
Servers will be configured to run
from that database.
If the administrator does not have access to a SQL Server
installation, the Digipede Server will install and use SQL
Server Express. SQL Server Express is free and perfect for a
small grid installation.
Services
The primary functionality of the Digipede Server is provided by
the Digipede Network Service. This program is a Windows
service, running as the local system account. It starts
automatically on start-up, and will run whether or not any
users are logged in to the local machine.
Event Log
The installation creates a Digipede Event Log that can be
viewed through the Windows Event Viewer administrative tool.
This event log is useful in administering, setting up, and
troubleshooting the Digipede Network.
Advanced Topics
Digipede Licensing
You will need both server and agent-processor licenses. You
can use the Digipede License Manager on your Digipede
Server to manage your license, add additional agent
processor licenses, and to activate your license online.
Server Licenses
The Digipede Server will not run without a valid license file.
After you download your installation of the Digipede Network,
you will receive a license file from Digipede. This file must be
installed on your server for the Digipede Server, Web Service,
and Digipede Control to run. If you try to start the services or
view the website without a valid license, you will receive an
error message indicating that the license is invalid. If you feel
you have received this message in error, contact Digipede at
www.digipede.net/support.
Agent Processor Licenses
Unlike the Digipede Server, Digipede Agents do not need
license files. However, each Digipede Server license indicates
the number of agent-processor licenses that have been
purchased. You can install the agent on as many machines
as you like, but only licensed agents will be permitted to
perform work on the network.
Digipede Agents are licensed per
processor (not per core). A dual
core, single processor machine
only takes one license. But a dual
processor machine takes two
licenses.
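As a quick worked example of the per-processor rule, the sketch below counts the agent-processor licenses a set of machines would consume. The licenses_needed function and the sample machine list are made up for illustration; the third entry simply applies the same rule to a machine not mentioned in this guide.

```python
from typing import Iterable, Tuple


def licenses_needed(machines: Iterable[Tuple[int, int]]) -> int:
    """Each machine is (processors, cores_per_processor); cores are ignored."""
    return sum(processors for processors, _cores in machines)


grid = [
    (1, 2),   # dual-core, single-processor machine: 1 license
    (2, 1),   # dual-processor machine: 2 licenses
    (2, 4),   # two quad-core processors: still 2 licenses
]
print(licenses_needed(grid))   # 5
```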
Summary
The Digipede Server is more than a single Windows service.
Using standard, stable, and well documented Microsoft
solutions, the Digipede Server is able to provide
communication, storage, and administration services for the
Digipede Network.
Digipede Agent
Once the Digipede Server has been installed, you can begin
installing the Digipede Agent on to the compute resources in
your enterprise. The Digipede Server must be running and
accessible from the compute resource in order for the
installation to succeed. The Digipede Agent must be able to
connect to the Digipede Server and register.
For step-by-step instructions
on installing the Digipede
Agent, please see the
“Digipede Network Installation
Guide”.
A Digipede Agent can run on either a shared or a dedicated
compute resource, and may be installed on as many compute
nodes as you like—the Digipede Server license controls how
many of those agents actually perform work on jobs.
The Digipede Agent can be installed on the same machine as
the Digipede Server. However, this configuration is not
recommended for installations of the Digipede Network
Professional Edition with large numbers of agents.
What Gets Installed?
On a desktop machine, the Digipede Agent requires Windows
XP or higher; on a server machine, Windows 2000, SP4 or
higher. What this means for the average company is that the
Digipede Agent can be installed on any Windows compute
resource on the network.
After installation, any
subsequent configuration of the
agent happens through
Digipede Control. You don't
need to go to your compute
nodes to administer them.
Installation of the Digipede Agent is very simple and there are
multiple ways to start the installation. A Digipede Agent
installation file is installed with the Digipede Server and can be
found at:
C:\Inetpub\wwwroot\Digipede\DigipedeControl\Install\Agent\setup.exe
The installer is accessible from Digipede Control’s home page
via a hyperlink. An administrator can open Digipede Control
from the target compute resource and click the Digipede Agent
download hyperlink to start the install. The installation file
could also be copied to a file share or a disk.
It is also possible to install to multiple compute resources
simultaneously using silent installation. This functionality is
only available in Digipede Network Professional Edition. See
Advanced Topics in this section for more details.
The Digipede Agent is made up of multiple components. The
three main components are:
NISvc.exe - the Digipede Agent Service. NISvc.exe is a
Windows service that starts automatically at start-up and logs
on as the Local System account.
For greater security, install the Digipede Agent service as
a specific user account. Typically this is a local or domain
account with limited privileges. When you use a specific
account, you can use additional features, such as disk
quota and limited directory access, for enhanced security.
NICore.exe - starts and monitors processes, and handles
communication with the Digipede Server.
NIUser.exe - provides the System Tray user interface.
The Agent has a very minimal user
interface; an administrator
performs most configuration from
within Digipede Control.
However, a person using a
computer with the Digipede
Agent on it can always disable
the Agent; this prevents the
Agent from degrading
performance on a shared
resource.
Local Storage
The Digipede Agent executes tasks it selects from the
Digipede Server. To execute a task the Digipede Agent often
needs supporting files. These files may be execution or data
files and they need to be stored on the compute resource. The
Digipede Agent installer creates the directory C:\Documents
and Settings\All Users\Application Data\Digipede\Agent to
store data files.
Event Log
The Digipede Agent installer creates an Event Log that is
viewable through the Windows Event Viewer administrative
tool. This event log is useful in administering and
troubleshooting your installation.
Advanced Topics
Silent installation for large-scale deployments
If you plan a large-scale deployment of Digipede Agents for
the Digipede Network Professional Edition, you can use a
“silent” installation. Silent installations do not require any user
or administrator interaction. Contact Digipede for information
on this functionality.
Summary
Where the Digipede Server is the communication hub for the
Digipede Network, the Digipede Agent is the workhorse. Install
a Digipede Agent onto each compute resource you want to
add to the Digipede Network and seamlessly grow your grid.
Digipede Control Basics
Digipede Control is the Digipede Network’s administration tool.
As a thin client it can be accessed from any machine on the
network that has access rights to the web server where it is
installed. Digipede Control is automatically installed with the
Digipede Server.
Pools
A pool is a collection of compute resources. Pools allow an
administrator to partition the grid into smaller computational
groups. This allows the administrator to control where jobs are
run on the grid and also to control users' access rights
to those machines. Compute resources can belong to more
than one pool.
Administrators and users can both
use Digipede Control. However,
they will have different
experiences. Administrators have
menus and abilities that other
users do not.
Each installation of the Digipede Network contains a Master
Pool which contains every compute node available on the grid.
An administrator can create as many pools as he needs and
may segment the grid for security, technology, or business
reasons.
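Pool membership behaves like overlapping sets, which the following sketch makes concrete. The pool names, machine names, and the pools_for helper are invented for the example and do not correspond to any Digipede API.

```python
from typing import Dict, Set

# Hypothetical pool layout; the pool and machine names are made up.
master_pool: Set[str] = {"node-a", "node-b", "node-c", "node-d"}
pools: Dict[str, Set[str]] = {
    "Excel": {"node-a", "node-b"},
    "Finance": {"node-b", "node-c"},
}


def pools_for(node: str) -> Set[str]:
    """Names of the pools a compute resource belongs to (Master Pool included)."""
    found = {name for name, members in pools.items() if node in members}
    if node in master_pool:
        found.add("Master")
    return found


# The Master Pool contains every compute node, so every pool is a subset of it.
assert all(members <= master_pool for members in pools.values())
print(sorted(pools_for("node-b")))   # ['Excel', 'Finance', 'Master']
```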
Compute Resources
Digipede Control provides an administrator with the ability to
configure the Digipede Agent installed on each compute
resource. As the Digipede Agent takes work it affects both the
compute resource that it is running on and the network. Based
on network and business needs an administrator can
configure each Digipede Agent to maximize availability and at
the same time reduce negative effects.
Tip: Nearly every screen in
Digipede Control has sorting and
filtering. To filter the items being
displayed, use the Find tab. To
re-sort the items, click the column
headers.
Administrators can configure a specific Digipede Agent by
opening the Compute Resource page in Digipede Control
and selecting the compute resource.
Agent Availability
The Agent Availability option specifies whether the agent is
available Always (subject to the Peak Time schedule) or Only
when idle (either the screen saver is active or no one is
logged in to the machine). By default the option is set to
Always.
An administrator may want to set this flag to Only when idle if
the Digipede Agent is installed on a desktop computer that is
sometimes used by a person. For example, the administrator
may configure a desktop compute resource that is in use
during business hours to Only when idle so that the business
user has complete use of the processor when he needs it, but
the machine is available to the grid when the user is not there.
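The two availability settings can be summarized as a small decision function. This is a sketch of the rule as described here, not the agent's real logic: the Availability enum, the may_take_work function, and the peak_time_allows flag are hypothetical, and the sketch assumes the Peak Time schedule is consulted only for the Always setting because that is the only case this guide mentions.

```python
from enum import Enum


class Availability(Enum):
    ALWAYS = "Always"
    ONLY_WHEN_IDLE = "Only when idle"


def may_take_work(setting: Availability, screen_saver_active: bool,
                  user_logged_in: bool, peak_time_allows: bool = True) -> bool:
    """Hypothetical decision helper; not the agent's actual logic."""
    if setting is Availability.ALWAYS:
        # "Always" is still subject to the Peak Time schedule.
        return peak_time_allows
    # "Only when idle": the screen saver is active or no one is logged in.
    return screen_saver_active or not user_logged_in


# A desktop in use during business hours, set to "Only when idle":
print(may_take_work(Availability.ONLY_WHEN_IDLE,
                    screen_saver_active=False, user_logged_in=True))   # False
# The same desktop after the user has logged out:
print(may_take_work(Availability.ONLY_WHEN_IDLE,
                    screen_saver_active=False, user_logged_in=False))  # True
```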
Many users make agents Always
Available even on shared
resources, but they set the Base
Priority (see below) to Low. For
most applications, users won't
even notice when their computer
is working on a Digipede job.
Agent Base Priority
Agent Base Priority specifies the priority at which the
Digipede Agent runs processes on the compute resource.
These priorities (which are defined by the operating system)
are: Low (sometimes called Idle), Below Normal, Normal,
High, and Real Time.
The Windows operating system is a multitasking operating
system, so it constantly switches between the currently
running processes. Setting the base priority determines how
often the operating system lets the processes started by the
Digipede Agent have access to the CPU. Digipede
recommends setting Agent Base Priority to Low on shared
resources and High on dedicated resources. While Real Time
is available, selecting Real Time could interfere with the
operating system and is not recommended.
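For readers who want to relate these names to the operating system, the sketch below pairs each one with the corresponding Win32 priority-class constant. The numeric values are the standard Win32 definitions; the pairing of Digipede Control's labels with those constants is an assumption based on the names, and recommended_priority is a hypothetical helper that encodes the recommendation above.

```python
# Win32 priority-class values. Pairing them with the labels used in
# Digipede Control is an assumption based on the names alone.
PRIORITY_CLASSES = {
    "Low": 0x00000040,           # IDLE_PRIORITY_CLASS
    "Below Normal": 0x00004000,  # BELOW_NORMAL_PRIORITY_CLASS
    "Normal": 0x00000020,        # NORMAL_PRIORITY_CLASS
    "High": 0x00000080,          # HIGH_PRIORITY_CLASS
    "Real Time": 0x00000100,     # REALTIME_PRIORITY_CLASS
}


def recommended_priority(shared_resource: bool) -> str:
    """Recommendation from this guide: Low on shared machines, High on dedicated."""
    return "Low" if shared_resource else "High"


name = recommended_priority(shared_resource=True)
print(name, hex(PRIORITY_CLASSES[name]))   # Low 0x40
```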
Agent Administration
Check-in Frequency
The Digipede Agent periodically checks in with the Digipede
Server to see if there is any work available for it. A Digipede
Agent that has just completed a task immediately checks to
see if there is any more work available. However, if there is no
work available then the Digipede Agent will wait for the
specified period of time. After the check-in time has passed,
the Digipede Agent pings the Digipede Server to see if there is
any work.
The best measure for the proper
check-in frequency is the number of
agents checking in per second.
Take the total number of agents
on your system, and divide it by
your check-in interval in seconds. You
can use this number as a guide
for tuning your installation
properly.
In the default configuration, a Digipede Agent checks in every
5 seconds; the time is configurable using the Administration
Settings page in Digipede Control. Finding the appropriate
check-in frequency depends on several factors. The most
important factor is the number of agents on the system. A
system with 100 agents set to check in every 5 seconds
averages 20 agents checking in per second. A system with
1000 agents checking in at that frequency would have 200
agents hitting the server every second. This puts a load on the
server and on the network. The more powerful the machine
that the Digipede Server is installed on, the more agent check-ins it will be able to process.
However, there are benefits to having a short check-in time.
Because the Digipede Network uses an agent-based pull system, an available agent will not begin working on a job until
it checks in; if you have a check-in frequency of 10 minutes, it
could be 10 minutes before all of your agents are working on
your job. If you have jobs that have a critical need for rapid
computation, having a short check-in frequency ensures that
your agents will be working very soon after a job is submitted.
Choosing the appropriate check-in frequency is one of the
most important decisions to make when configuring the
Digipede Network. Take into account the nature of the work,
the speed of the network, the scalability of the Digipede
Server, and the number of machines on the grid.
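The trade-off can be put into numbers with a short calculation. The function names below are illustrative; the figures are the ones used in this section (100 and 1000 agents checking in every 5 seconds, and a 10-minute check-in interval).

```python
def checkins_per_second(agents: int, interval_seconds: float) -> float:
    """Average number of idle agents contacting the server each second."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return agents / interval_seconds


def worst_case_start_delay(interval_seconds: float) -> float:
    """An idle agent may wait up to one full interval before it sees a new job."""
    return interval_seconds


print(checkins_per_second(100, 5))     # 20.0
print(checkins_per_second(1000, 5))    # 200.0
print(worst_case_start_delay(600))     # 600 seconds, i.e. 10 minutes
```

A shorter interval lowers the worst-case start delay but raises the server load in direct proportion, which is why both numbers are worth computing before changing the setting.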
Users
Users are people or processes that have the right to access
the Digipede Network. The Digipede Network Team Edition
supports up to five users, while the Digipede Network
Professional Edition supports an unlimited number of users.
System Roles
Roles are used to grant rights to users. Identifying users and
defining roles is one layer of security available to the
administrator.
System Administrator
A System Administrator has full access rights and can:
• access and use Administration pages;
• install agents, delete job templates, enable and disable
agent licenses on compute resources, perform
database administration, register external resources,
administer pools, and administer users;
• submit jobs on the system.
System Monitor
A System Monitor has limited access rights and can change
only information that is tied directly to him. A System
Monitor can:
• submit jobs on the system;
• delete his own job templates;
• change his own user profile.
Tip: Most users on your system
do not need to be Administrators.
Monitors can submit and control
their own jobs, and that's all most
users need to do!
Job Template Administration
A job template tells the Digipede Network what files need to be
on a compute resource to run a job, where to get those files,
how to install them, how to execute the job, and how to
communicate with the executable. Every job submitted to the
Digipede Network has an associated job template. The files
specified by a job template reside on a compute resource until
the job template is deleted from the system. An administrator
can use the Job Template Administration page to delete old
and unused job templates from the system.
Deleting a job template from Digipede Control instructs every
agent to delete all associated files from its cache.
Tip: For every Job Template in
the system, there may be many
files on each of the compute
resources. In addition to taking
up space on your hard disks,
having hundreds or thousands
of these can slow the
performance of the agents. If
your users intend to use the Job
Template again, they should
keep it in the system. But if they
are done with it, it should be
deleted after use.
Job Administration
Job Control
Users and administrators have the ability to monitor and
control jobs using the Jobs page in Digipede Control. By
navigating to the Jobs page, a user can view the progress of
jobs running on the system. When a running job is selected,
an administrator (or the user who submitted that job) can
pause the job by pressing the Pause button. No agents take
tasks from a paused job (although they continue working on
any tasks currently in progress). A paused job can be resumed
using the same button.
Similarly, a job can be aborted by clicking the Abort button.
No more tasks for that job will be assigned, and any agents
working on tasks for that job will stop working as soon as they
check in.
Troubleshooting
Error messages for a job (for example, if agents are unable to
download files) can be found on the Status tab. Error
messages for particular tasks (or task assignments) can be
found on the Tasks page; select the job that had an error, and
then select the Tasks link. Select the task that failed and click
the Task Assignment link to see errors, standard error,
standard output, and the command line (when appropriate). To
view a list of all Task Assignments for a Job, click the Job
Task Assignments link on the Tasks page.
Best Practices
Pool Configuration
To maintain fine control over which users have access to
hardware resources, we recommend enabling Enforce Pool
Roles on every pool, including the Master Pool.
Never give submission rights to any user on the Master Pool.
Tip: Pools can also be used to
ensure that certain jobs run on
certain machines. For example, if
some of your distributed jobs
require Excel on the nodes, you
could create a pool that consists
only of machines that have Excel
on them.
Job Template Management
Files defined by a job template are moved to the compute
resource for a task to use. Every job is associated with a job
template and for commonly run jobs it is recommended that a
user reuse the job template. There are two reasons to do this:
• The job template provides a common high-level definition for
the job that may include files and system requirements.
• Files that are defined in the job template can be cached on the
compute resource for reuse.
Tip: In the default configuration,
web service calls are limited to 4
MB. With the ability to stream
files and objects, it is easy for
Digipede job submissions to
become larger than this. Be
There is a flag in each job template called DiscardAfterUse.
Generally a user sets the DiscardAfterUse flag to true if the
job is only going to be submitted once, resulting in a complete
cleanup of the files on the compute resources when the job
finishes. However, if the job is run often the user can set the
DiscardAfterUse flag to false and leave the files on the
compute resource for later use.
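The effect of the DiscardAfterUse flag on a compute resource can be modeled in a few lines. The NodeFiles class, its methods, and the template and file names are hypothetical; only the behavior (clean up when the flag is true, keep files when it is false, remove them when the job template is deleted) comes from this guide.

```python
from typing import Dict, Set


class NodeFiles:
    """Hypothetical record of job template files held on one compute resource."""

    def __init__(self) -> None:
        self.by_template: Dict[str, Set[str]] = {}

    def job_finished(self, template: str, files: Set[str],
                     discard_after_use: bool) -> None:
        if discard_after_use:
            # One-off job: clean up everything when the job finishes.
            self.by_template.pop(template, None)
        else:
            # Recurring job: leave the files in place for the next run.
            self.by_template.setdefault(template, set()).update(files)

    def template_deleted(self, template: str) -> None:
        # Deleting the job template removes its cached files.
        self.by_template.pop(template, None)


node = NodeFiles()
node.job_finished("nightly-risk", {"risk.exe", "rates.dat"}, discard_after_use=False)
print(sorted(node.by_template["nightly-risk"]))   # ['rates.dat', 'risk.exe']
node.template_deleted("nightly-risk")
print(node.by_template)                            # {}
```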
Summary
With Digipede Control, an administrator can configure the
Digipede Network for his specific business and technology
requirements.
Job Templates, Jobs, and Tasks
There are several tools for job submission provided by the
Digipede Network. While each tool is designed to address a
specific type of job submission, the objects required for a job
submission are standard.
There are three important concepts to understand regarding
work submitted to the Digipede Network: the job template, job,
and task. The relationship between these objects is important;
it is shown in the figure below. A task is an atomic unit of
work, that is, work that gets executed on one machine. A job is a
collection of one or more similar tasks. A job template
describes the files necessary to work on a job, along with how
to execute those files. A job template is designed to be
reusable and can be associated with more than one job.
[Figure: relationship between job templates, jobs, and tasks]
Job templates, jobs, and tasks are all configurable and have
associated properties that can be used to define a specific job
submission. In some respects, these objects are hierarchical:
1. A property set in the job template is inherited by any job
that uses the job template.
2. A property set in the job is inherited by all tasks defined for
that job.
3. Many properties can be overridden. When this is the case,
the property value in the more granular object is used. For
example, if a shared property is set in both the job
template and the job, the setting in the job is used.
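The inheritance and override rules above resemble a layered lookup, which Python's standard ChainMap models directly. The property names and values below are invented for the example, and the sketch treats every property as overridable, which this guide says is true of many properties but not necessarily all.

```python
from collections import ChainMap

# Property names and values below are invented for the example.
job_template = {"Timeout": 600, "Priority": "Normal", "Executable": "blast.exe"}
job = {"Timeout": 120}
task = {"Priority": "High"}

# ChainMap searches the first mapping first, so the most granular object wins.
effective = ChainMap(task, job, job_template)

print(effective["Executable"])  # blast.exe (inherited from the job template)
print(effective["Timeout"])     # 120 (the job overrides the job template)
print(effective["Priority"])    # High (the task overrides the job template)
```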
The task contains the detailed specification of the work that
will occur on a particular job on a particular computer. The
tasks for a particular job differ from each other in three
respects: each can have unique files, parameters, and
serialized data.
File Definitions and Parameters
The job template defines all of the files and parameters that
will be used for jobs. Files and parameters may vary from task
to task, or may be the same for all tasks in the job. Job
templates, jobs, and tasks each have a collection of files and
parameters associated with them.
File Definition
Using the Digipede Network to move files means that the user
doesn’t have to pre-install software or data on the compute
resources, or figure out how to get result files back to the
client machine. The user simply creates a File Definition for
each file that will be moved and the Digipede Network does
the rest.
Relevance
Files moved by the Digipede Network can apply to the
job template, the job, or a task. Files apply to the:
1. job template when they are needed by every task and
job that uses that job template (for example, an
executable, DLL, or configuration file).
2. job when they are different for each job but are the
same for each task (for example, a document you are
searching).
3. task when they are different for each task (for example,
if you are searching for 1000 different strings in a
genome and each string is in its own file, those files
are task files).
Each file definition in a job template has a relevance; the
relevance indicates whether that file belongs to the job
template, the job, or the task. A relevance of JobTemplate
indicates that the file belongs to the job template. A relevance
of JobPlaceholder indicates that the file belongs to the job; in
this case, the file is not fully specified until job submission and
every job submitted must fully specify the file. A relevance of
InputPlaceholder indicates that the file belongs to the task; in
this case, the file is not fully specified until job submission, and
every task must fully specify the file.
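The placeholder rules can be expressed as a simple validation pass. In the sketch below, the three relevance values are the ones named above; the Relevance enum, the missing_files function, and the file names are hypothetical and only illustrate which submissions would be incomplete.

```python
from enum import Enum
from typing import Dict, List


class Relevance(Enum):
    JOB_TEMPLATE = "JobTemplate"
    JOB_PLACEHOLDER = "JobPlaceholder"
    INPUT_PLACEHOLDER = "InputPlaceholder"


def missing_files(definitions: Dict[str, Relevance],
                  job_files: Dict[str, str],
                  tasks: List[Dict[str, str]]) -> List[str]:
    """Report placeholders that a submission has not fully specified."""
    problems = []
    for name, relevance in definitions.items():
        if relevance is Relevance.JOB_PLACEHOLDER and name not in job_files:
            problems.append(f"job: {name}")
        elif relevance is Relevance.INPUT_PLACEHOLDER:
            for index, task_files in enumerate(tasks):
                if name not in task_files:
                    problems.append(f"task {index}: {name}")
    return problems


definitions = {
    "search.exe": Relevance.JOB_TEMPLATE,      # same for every job and task
    "genome": Relevance.JOB_PLACEHOLDER,       # supplied once per job
    "pattern": Relevance.INPUT_PLACEHOLDER,    # supplied by every task
}
job_files = {"genome": "genome.fa"}
tasks = [{"pattern": "p0.txt"}, {}]            # the second task omits its file
print(missing_files(definitions, job_files, tasks))   # ['task 1: pattern']
```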
File Transfer
When creating a file definition, the user must decide how the
file will be moved. The two transfer methods employed by the
Digipede Network are streamed and hosted. A streamed file
moves from the client machine through the Digipede Network
to the Digipede Server; it is then streamed to the compute
resource. The advantage of streaming a file is that it does not
have to be hosted on a machine that is reachable by the
agents; the Digipede Network will move it automatically. On
the other hand, a hosted file must be located on a machine
that is accessible to the agents. When the job is run, the
agents will copy the file directly from the host machine to the
compute resource using a specified protocol. The advantage
of using hosted files is that the file is only moved once (directly
from the hosting machine to the compute resource).
The hosted transfer type currently supports three protocols:
SMB, HTTP, and HTTPS. With the SMB protocol, files are
transferred to or from a Windows Share. With both HTTP and
HTTPS, files are transferred using the HTTP protocol (over
SSL in the case of HTTPS).
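To make the distinction concrete, the sketch below classifies a hosted-file location by protocol and counts how many times a file crosses the network under each transfer method. Both functions are hypothetical, and the sketch assumes that SMB locations are written as UNC paths (starting with two backslashes) and that HTTP locations are written as URLs; the server and host names are placeholders.

```python
from urllib.parse import urlparse


def hosted_protocol(location: str) -> str:
    """Classify a hosted-file location as SMB, HTTP, or HTTPS."""
    if location.startswith("\\\\"):
        return "SMB"                       # assumed: UNC path to a Windows share
    scheme = urlparse(location).scheme.lower()
    if scheme in ("http", "https"):
        return scheme.upper()
    raise ValueError(f"unsupported location: {location!r}")


def network_hops(streamed: bool) -> int:
    """Streamed: client to server to node. Hosted: host to node."""
    return 2 if streamed else 1


print(hosted_protocol(r"\\fileserver\jobs\input.dat"))         # SMB
print(hosted_protocol("https://example.com/jobs/input.dat"))   # HTTPS
print(network_hops(streamed=True), network_hops(streamed=False))   # 2 1
```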
In addition to being able to download files via HTTP, the
Digipede Agent can upload result files via HTTP using the
Digipede Transfer website. The AcceptsFiles.aspx program
installed with the Digipede Transfer website allows agents to
"push" files to a file server. Digipede Transfer can be installed
on any machine with IIS, not just the Digipede Server.
Location
When creating a file definition for a hosted file, the user must
indicate where the file can be found on the network. Each job
template has remote locations associated with it; the remote
locations indicate network paths and transfer protocols for that
network location. Each file definition for hosted files must
specify which remote location it will be moved to or from.
Parameter
A parameter is a name-value pair used to define a command-line parameter or to define a variable for a job or task. If you
include the name of the parameter in the command line of a
distributed application (for example, blast.exe $(PARAM1)), the
Digipede Agent will replace the parameter reference with the proper
value when it calls the command line.
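The substitution can be imitated with a regular expression. This is a minimal sketch of the idea, not the agent's implementation: the expand function and the parameter value are made up, and the sketch assumes parameter names consist of letters, digits, and underscores.

```python
import re
from typing import Dict

# Matches $(NAME), where NAME is assumed to be letters, digits, or underscores.
_PLACEHOLDER = re.compile(r"\$\((\w+)\)")


def expand(command_line: str, parameters: Dict[str, str]) -> str:
    """Replace each $(NAME) with its value; unknown names raise KeyError."""
    return _PLACEHOLDER.sub(lambda m: parameters[m.group(1)], command_line)


print(expand("blast.exe $(PARAM1)", {"PARAM1": "query42.fa"}))
# blast.exe query42.fa
```

Raising an error for an undefined name mirrors the purpose of placeholder parameters: a submission that fails to supply a required value should be caught rather than run with a blank argument.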
Parameters can be defined for a job template, job, or task. It is
also possible to create a placeholder Parameter for a job
template that requires a corresponding Parameter definition
for a job and task object. This ensures that any job using the
job template defines the required Parameter value.
Summary
The Digipede Network uses job templates, jobs, and tasks to
define the distributed work. The file definition, parameter,
and setting objects provide the details on how and where the
job and tasks are to be executed.
Job Template
A job template contains reusable information about a specific
type of job. It is, in essence, a template for a job. The job
template defines what common files are needed to execute a
job, how a job should be started, and if there are any compute
resource requirements. It also defines the parameters and file
definitions that must be completed in order to submit the job. A
job template is designed to be reusable—many jobs can be
submitted against the same job template.
Tip: If a job template is going to
be reused, Digipede recommends
giving it an easily identifiable
name.
There are several advantages to reusing job templates. One is
that a job template can serve as a repository for default job
settings. This ensures that the basics for a specific type of job
are already set up and ready for the user.
Another advantage is that common files defined in a job
template can be cached on the compute resources. Caching
files reduces the amount of bandwidth required to execute a
job and gives the job a performance boost because those files
do not have to be copied in order to start working.
File Definition
Files in the job template can be cached on the compute
resource for future use and are then available to any job using
the job template. Caching files reduces network bandwidth
utilization as well as the time it takes to execute a job.
The user can set the Discard After Use flag to control file
caching. If this property is set to false, the Digipede Network
leaves the files on the compute resource. The cached files
reside on the compute resources until the job template is
deleted using Digipede Control.
Version
Once a job template has been submitted to the Digipede
Network, it cannot be changed. This functionality ensures that
a job template cannot be modified while a running job is using
it, and ensures that a user always knows exactly which files
his job is running against. When you need to change a job
template you must create a new version and make changes to
that, then submit future jobs using the new job template.
An advantage to using a new version for an existing job
template, instead of a modified copy, is that files cached by an
earlier version of the job template are available to the later
version. This eliminates the need to redistribute the already
cached files.
Application Control
The Application Control tells the Digipede Agent how to control
an executable. It includes the command line, the API, and
information on stopping or suspending the job.
The distributed application can be started in several ways:
1. Command line to start the executable or script directly;
2. .NET object (An object created by a grid-enabled .NET
application);
3. COM server (An object created by a grid-enabled COM
application).
Command line
The default Application Control start type is command line. A
Digipede Agent can start any command line application, batch
process, or script with both dynamic and static input
parameters. The user defines the command line and the
Digipede Network inserts any dynamic parameters before
starting the command line.
Standard Out/Standard Error
By default, command line processes write message text to
standard out (stdout) and error text to standard error (stderr).
These two text buffers may contain important information that
the user would like to retrieve. By default, the Application
Control returns standard error text and ignores standard
output text. If the user would like to see all the text produced
by the executed command line process, then standard output
should be set to true.
Digipede recommends setting standard error to true so that
any errors that occur during command line execution are
returned to the user, who can then determine what went
wrong.
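As a rough illustration of the two streams, the following Python sketch runs a child process and keeps or discards each stream according to two flags. The defaults mirror the behavior described above (standard error kept, standard output ignored). The function name run_task and its flag names are hypothetical and are not Digipede settings.

```python
import subprocess
import sys


def run_task(argv, save_stdout=False, save_stderr=True):
    """Run a command line and return (exit_code, stdout_text, stderr_text).

    A stream that is not saved is discarded and reported as None.
    """
    result = subprocess.run(
        argv,
        stdout=subprocess.PIPE if save_stdout else subprocess.DEVNULL,
        stderr=subprocess.PIPE if save_stderr else subprocess.DEVNULL,
        text=True,
    )
    return result.returncode, result.stdout, result.stderr


# Stand-in for a distributed executable: writes one line to each stream.
child = [sys.executable, "-c",
         "import sys; print('message'); print('problem', file=sys.stderr)"]

code, out, err = run_task(child)                    # defaults: stderr only
print(code, out, err.strip())
code, out, err = run_task(child, save_stdout=True)  # keep both streams
print(code, out.strip(), err.strip())
```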
.NET APIs
Digipede provides two different .NET APIs, the Executive and
the Worker. They can be used independently or in conjunction
with each other.
Executive
With the Executive design pattern the job template, job, and
tasks are created on the client machine, but the
associated .NET objects are created on the compute resource.
Because the .NET objects are created remotely, Executive
applications can be started from the Digipede Workbench.
Additionally, an Executive stays active until the job finishes.
This differs from the command line application, which is
associated with a task and closes once the task is completed.
Using an Executive allows the developer to share information
on the compute resource between tasks, such as database
connections, as well as eliminating the task-based application
start up time.
Worker
The Worker design pattern is the most common pattern used
for grid-enabling applications. With the Worker pattern the job
template, job, tasks, and all associated .NET objects are
created on the client machine.
The Worker pattern supports the distributed .NET object’s
class definition being defined in either the application itself or
in a dynamic-link library (DLL). Putting the distributed class
definitions into a DLL allows the developer to grid-enable
applications with graphical user interfaces (GUI) and can
significantly reduce the footprint on the compute resources.
COM
The Digipede Framework SDK supports the grid-enablement
of COM applications. Grid-enabled COM applications create
the job template, job, and tasks on the client machine with the
distributed COM objects being created on the compute
resource.
IComWorker
The IComWorker pattern is used to grid-enable a COM application.
The IComWorker interface must be added to any COM Server
class you create for distribution on the Digipede Network. The
Digipede Agent then uses the IComWorker interface to start
the work on the compute resource.
Job Defaults
Settings define requirements and rules that the Digipede
Agent uses to determine whether it can run a task, and if so,
how to run it. Unless overridden by the associated job, these
job template settings are the default values used to define
basic job requirements and execution rules.
Summary
The job template is a reusable Digipede object that defines
common files, execution requirements, and execution rules for
jobs that use the job template.
Job
A job contains the details for a specific run of a job template,
and it contains one or more tasks. A job definition can also
define job specific hardware and software requirements, job
level files, and execution parameters.
File Definitions
The file definitions created for a job are specific to that job. If a
job template has any file definitions with a relevance of
JobPlaceholder, the jobs submitted against that job template
must have file definitions for those files. Because these files
are specific to an instance of a job, the files are not cached on
the compute resource but are deleted when the job completes.
A job file definition has the same file transfer locations as a job
template file definition.
Parameters
Similar to file definitions, if a job template has any parameters
with relevance of JobPlaceholder, the jobs submitted against
that job template must specify the values for those
parameters.
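The placeholder rule amounts to a completeness check at submission time. The Python sketch below shows one way to express it, assuming a template is represented as a mapping from parameter name to relevance. Only the JobPlaceholder relevance comes from the text; the function name, the sample parameter names, and the "JobTemplate" relevance string are hypothetical.

```python
def missing_placeholders(template_parameters, job_parameters):
    """Return names of JobPlaceholder parameters the job has not defined.

    template_parameters maps a parameter name to its relevance string;
    job_parameters maps a parameter name to the value supplied by the job.
    """
    return sorted(
        name
        for name, relevance in template_parameters.items()
        if relevance == "JobPlaceholder" and name not in job_parameters
    )


# Hypothetical template: two placeholders and one template-level value.
template = {"Database": "JobPlaceholder", "Seed": "JobPlaceholder",
            "Mode": "JobTemplate"}
print(missing_placeholders(template, {"Database": "nr"}))
```

An empty result means the job supplies every value the job template requires.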
Settings
A job inherits the settings defined in the job defaults of the
associated job template. To override a job template setting,
simply change the setting in the job.
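The inheritance rule behaves like a two-level lookup: the job is consulted first, then the job template defaults. A minimal Python sketch of that idea follows. The setting names and values are hypothetical and are not taken from the Digipede object model.

```python
from collections import ChainMap

# Hypothetical setting names and values, for illustration only.
job_template_defaults = {"Priority": 5, "SaveStandardError": True,
                         "SaveStandardOutput": False}
job_overrides = {"SaveStandardOutput": True}

# ChainMap searches job_overrides first, then falls back to the defaults.
effective = ChainMap(job_overrides, job_template_defaults)

for name in sorted(effective):
    print(name, "=", effective[name])
```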
Summary
A job is a specific run of a job template and can use the
default job template settings or be uniquely configured. A job
also contains and defines the tasks that are executed on the
compute resources.
Task
A task is an atomic piece of work that will be executed on a
single compute resource. Each job has one or more tasks and
these tasks must be able to be executed in parallel. A task can
be a call to a command-line application or a script, a .NET
object, or a COM Server.
File Definitions
Some applications require different data files for each task.
For any job template that has file definitions with relevance of
InputPlaceholder, the tasks in jobs submitted against the
template must have file definitions to specify those files. When
a task completes, these files are deleted; these files are never
cached.
Tip: Digipede Workbench's job
wizard automatically groups files
by their filename (modulo
extension). For example, if you
specify task files input001.inf
and input001.dat, Workbench
would create one task with two
input files. If your input files are
not named in this convention,
you will have to create your
tasks manually in the designer.
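The grouping rule in the tip can be sketched in a few lines of Python. This is an illustration of grouping by base name only, not Workbench's implementation; the function name group_task_files is hypothetical.

```python
from collections import defaultdict
from pathlib import PurePath


def group_task_files(file_names):
    """Group file names that share a base name (extension ignored).

    Returns a dict mapping each base name to the files of one task,
    in first-seen order.
    """
    tasks = defaultdict(list)
    for name in file_names:
        tasks[PurePath(name).stem].append(name)
    return dict(tasks)


files = ["input001.inf", "input001.dat", "input002.inf", "input002.dat"]
for base, members in group_task_files(files).items():
    print(base, members)
```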
Parameters
If the job template specifies that tasks have unique
parameters, each task in that job must specify values for the
parameters.
Result Files
Tasks may specify result files; these are files that will be
moved from the compute resources to a specified location
after each task completes.
Summary
Tasks define the atomic units of work for a job and these units
of work need to be able to be executed in parallel.
Digipede Workbench
Traditional grid computing solutions require a user to create
jobs using a scripting language, sometimes a proprietary
language and sometimes a general-purpose scripting language
such as Perl. Digipede recognizes that requiring scripting is a
major barrier to grid adoption and created the Digipede
Workbench to simplify this arduous task. The Digipede
Workbench is a Windows application designed to make it easy
for a user to create, submit, and monitor jobs.
For detailed information about the Digipede Workbench, see the
“Workbench User Guide,” which is installed with the Digipede
Workbench.
Wizards
With Digipede Workbench a user creates a job using Job
Wizard. Job Wizard is made up of pages that walk the user
through the job creation process. The user provides the file,
parameter, command, and setting information. Once the job
has been created, it can be automatically submitted when the
Job Wizard closes or later by loading the job into the Designer
and starting the job.
Job Template Wizard
After you select New Job (either from the File menu or by
clicking the New Job button), Workbench will ask if you would
like to use an existing job template. If you answer "No,"
Workbench opens the Job Template wizard.
The Job Template wizard walks you through the process of
creating a job template. It automatically creates a job template,
remote locations, file definitions, and parameters. Based on
the task files and parameters you specify, it also creates a job
to submit.
The wizard does not force you to specify the location and
relevance of each file or parameter manually. Rather, it uses
natural language questions (e.g., "Will the Digipede Agent
install common files for this job?") and interprets the results to
create file definitions with the appropriate relevance. It allows
the user to browse to locations on the network and
automatically creates the correct remote locations.
Tip: If you select the Yes,
cache the template and
common files button on the last
page of the wizard, the job
template will be stored in the
system and common files will be
cached on the compute
resources. It will then be easier
to submit jobs against this job
template in subsequent
submissions, because you won't
need to define common files, file
definitions, or remote locations.
Job Wizard
After you select New Job (either from the File menu or by
clicking the New Job button), Workbench asks if you would
like to use an existing job template. If you answer "Yes,"
Workbench opens the Job Wizard. The Job Wizard
is a shortened version of the Job Template Wizard. Rather
than forcing you to define all of the file definitions and remote
locations, it simply asks you to provide details for any file
definitions or parameters with InputPlaceholder relevance.
Parameters in Workbench
Workbench can automatically pre-populate the values for
parameters. There are four different ways it can do this
population:
• Literal: A constant. Literal parameters can be specified as
job-relevant (specified only once for the entire job). If a
Literal parameter is not job-relevant, you can change it for
each task.
• Range: A range of numbers that varies for each task. If
you specify a Range parameter (along with input files and
parameters from files), you indirectly set the number of
tasks in a job. For example, a range from 10 to 10000
stepping by 10 creates 1000 tasks: the first would have
PARAM1 = 10, the second would have PARAM1 = 20,
etc., all the way up to PARAM1 = 10000. If you have more
than one Range parameter, the cross product of the sets
they generate determines the number of tasks.
• Random: A randomized number from within a range that
you specify. Random parameters can be either real or
whole numbers.
• Stored in a File: Each line in a particular file is a set of
parameters for your tasks. The Digipede Network can read
parameters from a file. When the job is submitted, the user
can specify a file in which the parameters are located.
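The task counts produced by Range parameters can be checked with a small Python sketch. It assumes whole-number ranges with a positive step and an inclusive upper bound, as in the example above. The function names and the second parameter, PARAM2, are hypothetical.

```python
from itertools import product


def range_values(start, stop, step):
    """Values of a Range parameter, with the upper bound included."""
    return list(range(start, stop + 1, step))


def tasks_from_ranges(ranges):
    """One parameter dict per task: the cross product of all ranges.

    ranges maps a parameter name to a (start, stop, step) tuple.
    """
    names = list(ranges)
    value_sets = [range_values(*ranges[name]) for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_sets)]


single = tasks_from_ranges({"PARAM1": (10, 10000, 10)})
print(len(single), single[0], single[1], single[-1])

# PARAM2 is a hypothetical second Range parameter.
double = tasks_from_ranges({"PARAM1": (10, 10000, 10), "PARAM2": (1, 3, 1)})
print(len(double))
```

With a second range of three values, the cross product triples the number of tasks, which is worth keeping in mind before adding Range parameters to a job.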
Designers
Workbench's designer pages give the user full access to all of
the information contained in jobs and job templates. After job
templates and jobs have been created in the wizards, the user
can view and edit them using the designer pages.
If you prefer to work in the designers, you can create a blank
job template by using the File->New->Blank Job Template
option.
Job Template Designer
After a job template has been created, it can be opened in the
Job Template Designer, where the user can specify or change
aspects of the job template. All the specifications the user
made in the Job Template Wizard are displayed in the Job
Template Designer when the user chooses the Job Template
Definition view.
A job template definition can be changed until it has been
submitted. To make changes to a submitted job template, a
user must either create a new version of the job template or
make a copy of the job template and change the copy.
Job Designer
Once a job has been created, it can be opened in the Job
Designer, where the user can view, specify, or change
properties of the job. All the specifications made in the Job
Wizard are displayed in the Job Designer when the user
chooses the Job Definition view. If the specifications are
changed here, the changes become the new definition for that
job. Unlike a job template, a job can be modified after it has
been submitted. The user can then resubmit the changed job.
Job Tracking
The Job Tracking Page is a Digipede Workbench tool that
allows a user to monitor and find jobs on the Digipede Server.
The user can search for jobs within a specified time frame
and/or with specific statuses. To view jobs, select the
appropriate time range and statuses (e.g., All running jobs
submitted today) and click the Find button. Workbench will
query the Digipede Server for appropriate jobs and list them.
Double-click a job (or select it and click Monitor) to get
detailed information about that job in a job window. If the job is
in an active state (anything except Aborted, Completed, or
Failed), the job window will actively monitor the progress of
the job. If you select the Get history when monitoring jobs
checkbox, Workbench will download the job history (all task
assignment information, including which tasks ran where,
standard error, etc.).
Tip: Check the Only my Jobs
box to limit the jobs listed to
your jobs. Also, if you check the
Autorefresh box, Workbench will
refresh this page every minute.
The Job Tracking Page provides much of the same
information as the Jobs page in Digipede Control but allows
the user to stay in the application where he is building his jobs.
Saving Job Templates and Jobs
Job templates and jobs can be saved to XML files. These XML
files can be submitted to the Digipede Server by Digipede
Control, can be opened in another Digipede Workbench, and
can even be hand-edited. To save a job template or job to
XML, simply make sure that its window is active, and then
select Save As from the File menu.
Although the files contain standard XML, by convention the
following extensions are used for the files. If the XML file
contains a job template, it receives the DNAX extension. If it
contains a job, it receives DNJX. If the file contains a full "job
submission," that is, a job template and a job, its extension is
DNSX.
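The extension convention can be written down as a small lookup table, shown here as a Python sketch. The matching is case-insensitive by assumption, and the function name and sample file names are hypothetical.

```python
# Conventional extensions described above, keyed by what the XML contains.
EXTENSIONS = {
    "job template": ".dnax",
    "job": ".dnjx",
    "job submission": ".dnsx",  # a job template together with a job
}


def describe(file_name):
    """Return what a saved file is expected to contain, or None."""
    lowered = file_name.lower()
    for contents, extension in EXTENSIONS.items():
        if lowered.endswith(extension):
            return contents
    return None


print(describe("MonteCarloPi.dnax"))
print(describe("HelloWorld.DNSX"))
```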
Digipede Control
Digipede Control is the Digipede Network’s administration tool.
Using Digipede Control, a user can view the status and history
of submitted job templates, jobs, and tasks.
Each job template, job, and task is assigned an ID. This ID
can be used to identify associated job template, job, and task
objects. Optionally, the user can assign names to job
templates and jobs to make these associations easier to identify.
Job Template Page
As you can see in Figure 8, with Digipede Control a user can
view all the currently defined job templates. The Job
Template page (Administration->Administer Job Templates)
contains a list of available job templates. To see details about
a particular job template simply select the job template from
the list. The Information tab (in the top half of the page) then
displays detailed information about the selected job template.
To view the details in a particular job template, select that
template and click the View XML for this Template link in the
Information tab. Figure 9 shows the contents of the
MonteCarloPi job template. The MonteCarloPi job template is
a job template created from the WorkerLibraryForms sample
supplied with the Digipede Framework SDK. Binary
information (i.e., streamed files) is omitted from the XML file.
Jobs Page
As with job templates, Digipede Control can display a
list of jobs in the system. Click the Jobs link to view the Jobs
page. The lower half of the page displays a list of jobs, and the
tabs on the top half of the page show details about the
selected job. By default, jobs are listed most-recent-first;
however, clicking on any column (ID, Job Name, Priority, Time
Started, Progress, Status, and Last Result Time) will re-sort
the list by that column. If you would like to filter the job list or
find a particular job, use the Find tab.
To learn more about the individual tasks in a job, select the job
in the job list and then select the Task hyperlink above the job
list. This takes you to the Task page.
Task Page
The Task page shows the status of the tasks for one job. This
page is often used to check the status of the tasks in an
executing job, to see what machines were used for a job, or to
gather information on a failed job.
When a Digipede Agent claims a task the task is labeled
"Assigned". When an Agent executes a task, the task is
labeled "Running". When the Agent results are returned to the
Digipede Server, the task is labeled "Completed".
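The three labels form a simple forward progression, sketched below in Python. Only the statuses named above are modeled; any other statuses are left out of this sketch, and the names TaskStatus, NEXT_STATUS, and advance are hypothetical.

```python
from enum import Enum


class TaskStatus(Enum):
    ASSIGNED = "Assigned"    # an Agent has claimed the task
    RUNNING = "Running"      # the Agent is executing the task
    COMPLETED = "Completed"  # results were returned to the Digipede Server


# Forward-only progression taken from the description above.
NEXT_STATUS = {
    TaskStatus.ASSIGNED: TaskStatus.RUNNING,
    TaskStatus.RUNNING: TaskStatus.COMPLETED,
}


def advance(status):
    """Return the next status, or raise if the task is already finished."""
    if status not in NEXT_STATUS:
        raise ValueError(f"{status.value} is a final status in this sketch")
    return NEXT_STATUS[status]


status = TaskStatus.ASSIGNED
while status in NEXT_STATUS:
    print(status.value, "->", advance(status).value)
    status = advance(status)
```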
Hello World Walkthrough
This walkthrough will introduce you to the Digipede
Workbench by having you define and run a job. The
executable you will distribute is HelloWorld.exe, a simple
command-line program that writes to standard output. This
HelloWorld.exe can optionally take a command line
argument—if you give it an argument, it will echo that
argument in its standard output.
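For readers who want to picture the program, here is a Python stand-in with the same described behavior: it writes to standard output and echoes an optional argument. The exact text printed by the real HelloWorld.exe is not given in this guide, so the wording below is an assumption.

```python
import sys


def hello(argv):
    """Build the line to print; echo the first argument if one was given."""
    if len(argv) > 1:
        return f"Hello World {argv[1]}"
    return "Hello World"


if __name__ == "__main__":
    print(hello(sys.argv))
```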
In this exercise, you will define and submit a job and job
template. Subsequently, you will submit another job against
that job template.
Defining the Job Template and Initial Job
1. If you haven’t installed Digipede Workbench, do so. To
install Workbench, log in to Digipede Control by
opening a browser and navigating to HTCServer/DigipedeControl and entering your username
and password. Your username is your machine name
(e.g., LABPC01) and your password is the same as
your username. Click on the Digipede Workbench link
and follow the installation instructions.
2. Start Digipede Workbench by selecting Start->All
Programs->Digipede->Workbench from your start
menu. Workbench will prompt you for your credentials.
Select Digipede Network Authentication and enter your
username and password again; use the same
username and password you used for Digipede
Control.
3. Start a new job by clicking the “New Job” link in the
Common Tasks pane of the Start Page.
4. When prompted with the “Would you like to use an
existing Job Template” dialog, answer “No.”
5. Enter a name and description for your job template and
click Next.
6. “Hello World” does require a file to be moved. Select
the appropriate protocol: if your files are accessible via
file share, choose “Share;” if your files are accessible
on a web server, choose “HTTP.”
7. The executable is a “Common” file (also known as a
Job Template file). Select “Yes” to common files and
click Next.
8. Click the add button, then browse to and select the
HelloWorld.exe file. The file is located in the \\HTCServer\SharedFiles folder.
9. This job does not require Job files or Task files. Select
“No” to those screens and click Next on each.
10. While this job does not require any command line
parameters, we’ll define one parameter for this job.
We’ll define a “Range” parameter in order to create
multiple tasks for this job. Select “Yes” and click Next.
11. Add one Range parameter. Enter a Name for your
parameter (e.g. “NumTasks”) and define the range to
go from 1 to 10 step 1. Check the “Can override at Job
submission” box. Then, click OK, then Next.
12. This application doesn’t return result files nor does it
use the Digipede API, so select No for the next two
screens.
13. This application needs one command line parameter: a
Task ID. Modify the command line by clicking “Edit.”
14. Add the Task ID to the command line by clicking where
you would like the parameter to go (right after
“HelloWorld.exe”), then double-clicking “Task ID” (or, if
you’d rather, any other variable). Your command line
should look like this: HelloWorld.exe $(TaskID)
15. Click “No” to notifications and click Next.
16. Because this application generates standard output,
click the Advanced Options button and ensure that
“Save Standard Output” is selected. Click OK.
17. Your job template and job are ready to submit. To
submit immediately, ensure that the “Run Job on
Finish” checkbox is selected and click Finish.
However, you may wish to familiarize yourself with the
details of the job and job template before submitting
them. If you want to see the details before submitting,
uncheck the “Run Job on Finish” checkbox and click
Finish.
MinEn Walkthrough
This walkthrough demonstrates using the Digipede Network to
distribute the calculations of the Minimum Energy (MinEn)
executable. MinEn contains one distributable step: for each
input file, the Update_File executable must be run in order to
generate the results for that set of inputs.
Digipede Workbench can be used to quickly and easily
distribute the work of many Update_File calls.
Before beginning this walkthrough, ensure that you have run
make_runs.exe in order to create input files. Also, you must
create a file share that is “world writable,” or you must utilize
Digipede’s HTTP file transfer.
1. Start Digipede Workbench by selecting Start->All
Programs->Digipede->Workbench from your start
menu.
2. Start a new job by clicking the “New Job” link in the
Common Tasks pane of the Start Page.
3. When prompted with the “Would you like to use an
existing Job Template” dialog, answer “No.”
4. Enter a name and description for your job template and
click Next.
5. Update_File has two common files:
Update_File.exe and cygwin1.dll. Answer “Yes” to
common files, then browse the file share and select
these two files.
6. There are no job files for MinEn, so select “No” and
click “Next.”
7. Each task in MinEn requires an input file. Select “Yes”
and click “Next.”
8. Browse to a file share and select one or more input
files.
9. The tasks do not have parameters – click “No” and
click “Next.”
10. MinEn will produce one result file – select “Yes” and
browse to the file share where files should be returned.
For this exercise, browse to:
htc-server/digipedetransfer/AcceptsFiles.aspx
11. Next, give the output file a name (that can be used on
the command line) and give an expression that can be
used for a file name. In this case, use the expression
$N(INFile).out. The $(VariableName) syntax indicates
that the value of another variable (in this case, the
input file name) will be used. The N indicates that the
extension should be stripped off of the file, and the
“.out” adds the .out extension.
For a complete definition of the command line
expression syntax, see the Digipede Workbench
documentation.
12. This application does not use the Digipede API, so
select “My application does NOT use the Digipede API”
and click “Next.”
13. Update_File takes two arguments – the input file and
the output file. Because these will be different for each
task, you should use a variable for each of them.
The wizard will provide a list of the files for this job.
14. We do not have an SMTP server set up for this server,
so select “No” to notifications and click “Next.”
15. Click Finish.
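The output file expression from step 11 can be tried out with a small Python sketch. It handles only the two forms used in this walkthrough, $(Name) and $N(Name), and is not the Workbench expression evaluator; the function name evaluate and the sample input file name are hypothetical.

```python
import os
import re

# $(Name) inserts a value; $N(Name) inserts it with the extension removed.
# Only these two forms, the ones used in this walkthrough, are handled.
_REFERENCE = re.compile(r"\$(N?)\((\w+)\)")


def evaluate(expression, variables):
    """Expand $(Name) and $N(Name) references in `expression`."""
    def _expand(match):
        strip_extension, name = match.group(1), match.group(2)
        value = variables[name]
        if strip_extension:
            value = os.path.splitext(value)[0]
        return value

    return _REFERENCE.sub(_expand, expression)


# Hypothetical input file name for one task.
task_variables = {"INFile": "run0001.in"}
print(evaluate("$N(INFile).out", task_variables))
print(evaluate("$(INFile)", task_variables))
```

Each task therefore gets a result file named after its own input file, so results from different tasks do not overwrite one another.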
Glossary
AGENTS
THE SOFTWARE THAT RUNS ON INDIVIDUAL COMPUTE RESOURCES.
AGENTS MANAGE THE EXECUTION OF THE DISTRIBUTED
APPLICATION.
APPLICATION/DATA SERVER
A SERVER THAT PROVIDES DATA AND APPLICATIONS TO THE
AGENTS. IT CAN BE, BUT DOES NOT NEED TO BE, THE SAME SERVER
THAT HOLDS THE DIGIPEDE SERVER. NO DIGIPEDE SOFTWARE
NEEDS TO BE INSTALLED ON AN APPLICATION/DATA SERVER.
BATCHABLE APPLICATION
A COMMAND-LINE APPLICATION THAT DOES NOT REQUIRE ANY
USER INTERACTION. THESE APPLICATIONS ARE CALLED
BATCHABLE BECAUSE THEY CAN BE RUN FROM A BATCH PROCESS.
COMPUTE RESOURCES
COMPUTERS THAT ARE MADE AVAILABLE ON THE DIGIPEDE
NETWORK. THESE COMPUTE RESOURCES MAY BE DEDICATED OR
SHARED. DEDICATED COMPUTE RESOURCES ARE USED
EXCLUSIVELY FOR JOBS RUN ON THE DIGIPEDE NETWORK. SHARED
COMPUTE RESOURCES MAY ALSO BE USED FOR OTHER PURPOSES.
DATA TRANSFER
THE PROCESS BY WHICH ALL DATA REQUIRED FOR A SPECIFIC
TASK ARE TRANSFERRED FROM A DATA RESOURCE TO A COMPUTE
RESOURCE.
DEDICATED COMPUTE RESOURCES
COMPUTE RESOURCES THAT ARE USED EXCLUSIVELY FOR JOBS
RUN ON THE DIGIPEDE NETWORK (FOR EXAMPLE, CLUSTER NODES
IN A CLUSTER USED EXCLUSIVELY FOR SUCH APPLICATIONS).
DIGIPEDE AGENT
THE SOFTWARE COMPONENT THAT MANAGES THE COMPUTE
RESOURCE FOR THE DIGIPEDE NETWORK. THIS IS A SMALL,
UNOBTRUSIVE PROGRAM THAT DOES NOT REQUIRE ANY
INTERACTION WITH ANY USER OF THE COMPUTE RESOURCE.
DIGIPEDE CONTROL
THE ADMINISTRATIVE COMPONENT OF THE DIGIPEDE NETWORK.
DIGIPEDE CONTROL IS A WEBSITE (USUALLY HOSTED ON THE SAME
COMPUTER AS THE DIGIPEDE SERVER) THROUGH WHICH AN
ADMINISTRATOR CAN MONITOR AND RUN THE DIGIPEDE NETWORK.
DIGIPEDE SERVER
THE SERVER SOFTWARE THAT MANAGES JOBS AND ALL
COMMUNICATION WITH THE DIGIPEDE AGENT SOFTWARE.
DIGIPEDE TRANSFER
A WEBSITE THAT TRANSPORTS FILES VIA THE HTTP (OR HTTPS)
PROTOCOL. IF YOUR NETWORK ARCHITECTURE DOES NOT PERMIT
THE USE OF SHARES FOR FILE COPYING, YOU CAN USE DIGIPEDE
TRANSFER TO SERVE FILES. A PROGRAM IN DIGIPEDE TRANSFER
CALLED ACCEPTSFILES.ASPX CAN RECEIVE FILES VIA HTTP. YOU
CAN USE THIS AS A DESTINATION FOR YOUR RESULTS FILES. YOU
CAN ALSO INSTALL DIGIPEDE TRANSFER ON ANY MACHINE IN YOUR
ORGANIZATION WHERE YOU WOULD LIKE TO HOST FILES.
DIGIPEDE WORKBENCH
THE SOFTWARE COMPONENT THAT DEFINES AND RUNS JOBS ON
THE DIGIPEDE NETWORK. THIS WINDOWS SMART CLIENT CAN
START AND MONITOR JOBS, AND CAN RUN ON ANY MACHINE IN AN
ORGANIZATION.
DISTRIBUTED APPLICATION
THE APPLICATION THAT THE DIGIPEDE AGENT MANAGES FOR
EXECUTION ON COMPUTE RESOURCES. THIS APPLICATION CAN BE
WRITTEN TO COMMUNICATE DIRECTLY WITH THE DIGIPEDE
NETWORK USING THE DIGIPEDE API, OR IT CAN BE A STAND-ALONE
COMMAND-LINE EXECUTABLE.
EXTERNAL RESOURCE
ANY RESOURCE (E.G., A FILE SERVER, DATABASE, OR SOFTWARE
LICENSE) USED BY APPLICATIONS ON THE DIGIPEDE NETWORK.
THE DIGIPEDE NETWORK CAN APPLY LIMITS TO ENSURE THAT
EXTERNAL RESOURCES ARE NOT OVERUSED OR OVERTAXED.
JOB
A TASK TO RUN ON THE DIGIPEDE NETWORK; A SINGLE, SPECIFIC
SUBMISSION OF A JOB TEMPLATE. OFTEN A JOB IS COMPOSED OF
MULTIPLE TASKS.
JOB TEMPLATE
THE INFORMATION NECESSARY TO COMPLETE A JOB. A JOB
TEMPLATE TELLS THE DIGIPEDE NETWORK WHAT FILES NEED TO BE
ON A COMPUTE RESOURCE TO RUN A JOB, WHERE TO GET THOSE
FILES, HOW TO INSTALL THEM, HOW TO EXECUTE THE JOB, AND
HOW TO COMMUNICATE WITH THE EXECUTABLE. JOB TEMPLATES
RESIDE ON A COMPUTE RESOURCE UNTIL THE JOB IS DELETED
FROM THE SYSTEM.
MASTER APPLICATION
AN APPLICATION THAT COMMUNICATES WITH THE DIGIPEDE
SERVER USING WEB SERVICES OR THE DIGIPEDE API IN ORDER TO
START, MONITOR, OR CONTROL JOBS.
MASTER POOL
THE MASTER POOL IS THE COLLECTION OF ALL THE COMPUTE
RESOURCES.
POOL
A COLLECTION OF COMPUTE RESOURCES ON WHICH JOBS ARE
RUN.
SHARED COMPUTE RESOURCES
COMPUTE RESOURCES THAT ARE USED FOR OTHER PURPOSES, IN
ADDITION TO RUNNING JOBS ON THE DIGIPEDE NETWORK (FOR
EXAMPLE, DESKTOPS WITH ONE OR MORE INTERACTIVE USERS).
TASK
THE PART OF A JOB THAT IS RUN ON AN INDIVIDUAL COMPUTE
RESOURCE. MOST JOBS ARE COMPOSED OF MANY TASKS. OFTEN,
THE DIGIPEDE AGENT MUST COPY FILES TO A COMPUTE RESOURCE
IN ORDER TO RUN A PARTICULAR TASK. THE AGENT DELETES
THESE FILES AFTER COMPLETING THE TASK.