Set Matching:
A Discovery Mechanism for Resource Ensembles
University of Chicago, UW-Madison
July 31, 2002
1 Introduction
2 ClassAds and Matchmaking
  2.1 ClassAd Specification
  2.2 Advertising Protocol
  2.3 ClassAd Evaluation Algorithm
  2.4 Matchmaking Algorithm
3 Motivations for Set Matching
4 Set-Extended ClassAds and Set-Extended Matchmaking
  4.1 Set-extended ClassAds Syntax
    4.1.1 Attribute Accessor
    4.1.2 Aggregation Functions
    4.1.3 SetSize Function
    4.1.4 List Operations
  4.2 Advertising Protocol
  4.3 Evaluation Algorithm
  4.4 Set-Extended Matchmaking Algorithm Framework
    4.4.1 The Basic Framework
    4.4.2 A Greedy Heuristic
    4.4.3 Controlling the Behavior of the Greedy Search
5 An Example of Set Matching
6 Miscellaneous
References
1 Introduction
Resource discovery is a fundamental and general operation in computing systems, and a wide
variety of approaches have been proposed for representing resource requirements and resources,
and for locating resources that meet requirements [1-5].
The approach that we consider (and extend) here, the ClassAds/Matchmaking formalism [6, 7], is
motivated by the observation that resource discovery can usefully be viewed as a symmetric
operation, in which a resource request and a resource description are evaluated with respect to
each other. Thus, a match succeeds only if the resource meets the requirements established in
the request and the request meets the requirements established by the resource. While pioneered
in the Condor high throughput computing system, the ClassAds/Matchmaking formalism has been
applied in a wide variety of domains.
We propose here extensions to the ClassAds/Matchmaking formalism that allow for the matching
of a request and a set of offers. These extensions are motivated, initially, by the demands of
distributed computing scenarios in which an application may have aggregate requirements (e.g.,
for memory, compute power, bandwidth) that may be satisfied by sets of resources. However, we
believe that they have general utility.
In defining these extensions, we must address a number of questions: What is a set, and how
should resources be organized into sets? How should the characteristics of a set be described?
How should a match between a set and a set request be defined? How can a request be matched
against sets efficiently when there are a large number of candidate sets?
This document first presents the new challenges to the Condor resource selection mechanism
caused by applications that need multiple resources to run. In order to deal with these challenges,
we then introduce an extension of the ClassAd language and its matchmaking mechanism.
2 ClassAd and Matchmaking
The ClassAd/Matchmaking formalism comprises four principal components [6]:
1. The ClassAd specification, which defines a language for expressing attributes of an entity
and any constraints placed on a matching entity, and a semantics of evaluating these
constraints;
2. The advertising protocol, which defines basic conventions regarding what a matchmaker
expects to find in a ClassAd if this ClassAd is to be included in the matchmaking process,
how the matchmaker expects to receive the ClassAd from the advertiser, and how it
returns the matchmaking result to the advertiser;
3. The evaluation algorithm, which defines how the contents of two ClassAds determine the
outcome of the matchmaking process; and
4. The matchmaking algorithm, which defines how the contents of one ClassAd are
matched with multiple other ClassAds.
2.1 ClassAd Specification
The ClassAd language [7] is a simple expression-based language. The central construct of the
language is the ClassAd (Classified Advertisement), which is a record-like structure composed of
a finite number of distinctly named expressions (Figure 1). We call the name of an expression an
attribute, and the value of an attribute is the result of evaluating the expression.
[name_0 = expr_0; name_1 = expr_1; name_2 = expr_2; …; name_n = expr_n]
Figure 1. ClassAd structure
In the ClassAd language, attribute expressions can be simple constants, built-in functions,
attribute references, or logical/arithmetic combinations of these. An attribute reference in an
expression takes the form “<classad>.<attr>”, which refers to the value of attribute <attr> in
ClassAd <classad>, or “<attr>”, which refers to the value of attribute <attr> in the same ClassAd.
The operators are essentially those of the C language, with certain operators excluded (e.g.,
pointer operators).
Thus, a rich set of arithmetic, logic, bit-wise and comparison operators are defined. The set of
supported operators and their relative precedence are summarized in [7].
The ClassAd language differentiates between expressions and values: Expressions are evaluable
language constructs obtained by parsing valid expression syntax, whereas values are the results of
evaluating expressions. The language has a rich set of types and values, which includes
traditional values (numeric, string, boolean), non-traditional values (list), and some esoteric
values, such as undefined and error. A list is a finite sequence of expressions. Undefined is
generated when an attribute reference cannot be resolved, and error is generated when there is
a type error. The ClassAd language employs dynamic typing (or latent typing), so only values (and
not expressions) have types.
ClassAds are used as attribute lists by entities to describe their characteristics, constraints and
preferences. Figure 2 shows a ClassAd that describes a Resource Request and two ClassAds that
describe two computation resources.
Request = [
    owner = "chliu";
    requirements = other.type == "machine"
                   && other.cpuspeed > 500M
                   && other.memorysize > 100M;
    rank = other.memorysize + other.cpuspeed
]
ResourceA = [
    name = "foo"; type = "machine";
    cpuspeed = 800M; memorysize = 512M;
    requirements = member(other.owner, {"chliu", "lyang"})
]
ResourceB = [
    name = "bar"; type = "machine";
    cpuspeed = 700M; memorysize = 256M;
    requirements = true
]
Figure 2. Three ClassAds that describe a request and two resources, respectively
2.2 Advertising Protocol
Advertisers specify the properties of their offers or requests as attribute/value pairs encoded
according to the ClassAd specification. The advertising protocol describes the mapping of
attribute names and values of a ClassAd to the properties of the advertised entity. For example,
in Figure 2, the protocol chooses the attribute name “memorysize” to describe the memory size of
computation resources, and an expression such as “memorysize = 128M” means that the memory size
of the resource is 128M bytes. The advertising protocol also requires advertisers to specify the
constraints on the matched entity in a boolean expression named “requirements” and their
preferences regarding the matched entity in a scalar expression named “rank”.
An entity uses the advertising protocol to communicate a ClassAd to a matchmaker for
evaluation. The result of such a request is either another ClassAd for which matching has
succeeded, or failure.
2.3 ClassAd Evaluation Operation
The ClassAd evaluation operation is applied to a pair of ClassAds. The evaluation function
distinguishes between two kinds of attributes:
o Property attributes, such as “cpuspeed” and “memorysize” in ClassAds ResourceA
and ResourceB (Figure 2), have semantics understood by the two entities that
participate in a match.
o Two control attributes, requirements and rank, are used by the customer to control
the evaluation and matchmaking algorithms.
Two ClassAds match if the expressions named requirements in both ClassAds evaluate to true
and the expression named rank evaluates to a numerical value representing the quality of the
match. If the requirements expression in either of the two ClassAds evaluates to false, error, or
undefined, the two ClassAds do not match.
To perform the match, the matchmaker evaluates expressions in an environment that allows each
ClassAd to access attributes of the other. The matchmaker defines an expression “other” that
evaluates to the other ClassAd in the context of the ClassAd that contains the reference; thus
“other.attribute-name” refers to the value of an attribute in the other ClassAd [8]. For
example, in Figure 2, the sub-expression “other.memorysize” in ClassAd Request is equal
to 512M when ClassAd Request is evaluated with ClassAd ResourceA, and is equal to 256M
when ClassAd Request is evaluated with ClassAd ResourceB. Based on the definition of
match, both ResourceA and ResourceB match Request.
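The symmetric evaluation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Condor implementation: ClassAds are modeled as dictionaries, expressions as callables taking (me, other), the "M" suffix is assumed to mean 10^6, and an unresolved reference or type error is simply treated as "no match". The rank sums memorysize and cpuspeed, the attribute names that the resource ClassAds advertise.

```python
M = 10**6  # assumed meaning of the "M" suffix in Figure 2


def requirements_hold(ad, other):
    """True only if ad's requirements evaluate to exactly True against other."""
    req = ad.get("requirements")  # None stands in for the value undefined
    try:
        value = req(ad, other) if callable(req) else req
    except (KeyError, TypeError):
        return False  # unresolved reference or type error: treated as no match
    return value is True


def match(a, b):
    """Symmetric match: the requirements of both ClassAds must evaluate to true."""
    return requirements_hold(a, b) and requirements_hold(b, a)


request = {
    "owner": "chliu",
    "requirements": lambda me, other: (other["type"] == "machine"
                                       and other["cpuspeed"] > 500 * M
                                       and other["memorysize"] > 100 * M),
    "rank": lambda me, other: other["memorysize"] + other["cpuspeed"],
}
resource_a = {
    "name": "foo", "type": "machine", "cpuspeed": 800 * M, "memorysize": 512 * M,
    "requirements": lambda me, other: other["owner"] in {"chliu", "lyang"},
}
resource_b = {
    "name": "bar", "type": "machine", "cpuspeed": 700 * M, "memorysize": 256 * M,
    "requirements": True,
}

for resource in (resource_a, resource_b):
    if match(request, resource):
        # prints "foo 1312" and then "bar 956" (rank expressed in units of M)
        print(resource["name"], request["rank"](request, resource) // M)
```

Both resources match, and the rank expression orders them by the sum of memory size and CPU speed.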
2.4 Matchmaking Algorithm
The ClassAd evaluation algorithm is typically used in the context of a matchmaking algorithm
that determines, within an environment containing many ClassAds, which pairs of ClassAds to
evaluate and in which order. Many different matchmaking algorithms can be defined; we describe
here the Condor matchmaking algorithm used within the Condor system [5]. As illustrated in
Figure 3, this works as follows. A distinction is made between ClassAds representing available
resources and ClassAds representing requests for resources. Both are sent to a Condor
matchmaking engine. This engine maintains a set of unmatched resource ClassAds and, for each
incoming request ClassAd, evaluates that request against every available resource ClassAd. If one
or more matches are found, a highest ranking match is selected, removed from the unmatched
resource ClassAd set, and returned to the requestor. Otherwise, the requestor is notified that the
request failed.
[Figure 3 diagram: requests and resources send Request ClassAds and Resource ClassAds to the
Condor matchmaking engine, which evaluates each Request ClassAd against the Resource ClassAds
and returns a match, or fail.]
Figure 3: The Condor matchmaking algorithm
As an example of how the Condor matchmaking algorithm works, consider the ClassAds in
Figure 2. Here, Request represents a request while ResourceA and ResourceB represent
resources. Both ResourceA and ResourceB match Request, and ResourceA is preferred over
ResourceB because it has more memory (512M versus 256M) and a faster CPU (800M versus
700M), and thus a higher rank.
The simple description provided here suggests that the Condor matchmaking algorithm takes, for
each request, time proportional to the number of available resources. In practice, indexing
techniques can be used to reduce this cost.
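As a rough sketch of the loop just described, the following Python fragment evaluates one request against every unmatched resource, selects a highest-ranking match, and removes it from the pool. The functions matches and rank are hypothetical stand-ins for ClassAd evaluation, the attribute names min_cpuspeed and min_memorysize are invented for the illustration, and the resource named "slow" is an invented third entry.

```python
M = 10**6  # assumed meaning of the "M" suffix


def condor_match(request, pool, matches, rank):
    """Return a highest-ranking matching resource and remove it from pool, else None."""
    candidates = [r for r in pool if matches(request, r)]  # one evaluation per resource
    if not candidates:
        return None  # the requestor is notified that the request failed
    best = max(candidates, key=lambda r: rank(request, r))
    pool.remove(best)  # a matched resource leaves the unmatched set
    return best


def matches(request, resource):
    """Hypothetical stand-in for evaluating the two requirements expressions."""
    return (resource["type"] == "machine"
            and resource["cpuspeed"] > request["min_cpuspeed"]
            and resource["memorysize"] > request["min_memorysize"])


def rank(request, resource):
    """Hypothetical stand-in for the request's rank expression."""
    return resource["memorysize"] + resource["cpuspeed"]


pool = [
    {"name": "foo", "type": "machine", "cpuspeed": 800 * M, "memorysize": 512 * M},
    {"name": "bar", "type": "machine", "cpuspeed": 700 * M, "memorysize": 256 * M},
    {"name": "slow", "type": "machine", "cpuspeed": 300 * M, "memorysize": 64 * M},
]
request = {"min_cpuspeed": 500 * M, "min_memorysize": 100 * M}

first = condor_match(request, pool, matches, rank)
second = condor_match(request, pool, matches, rank)
third = condor_match(request, pool, matches, rank)
print(first["name"], second["name"], third)  # prints: foo bar None
```

Each call scans the whole pool once, which is the linear cost per request noted above.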
3 Motivations for Set Matching
The power of the ClassAd/matchmaking formalism derives from its simple yet general notation
and mechanisms for describing entities and evaluating mutual constraints between a pair of
entities. However, in many practical situations, we find that we want to deal with constraints
involving multiple entities. For example:
1. An application requires both a 1 GHz computer and 10 GB of attached storage.
2. An application requires a 1 GHz computer connected by a network with at least 10 MB/s
capacity to a storage system that contains a file F.
3. An application consists of a number of jobs, each of which requires a machine and a
license to run the application. Licenses are limited in quantity, and each license is only
valid on some subset of machines. Thus the workstation and license resources required by
each job are inter-dependent.
4. An application may require a site that has available computing power and cached copies
of at least three of four specified datasets.
5. A parallel application may require a collection of computers with a specified aggregate
memory, aggregate computing power, and minimum interconnection bandwidth. (Or,
alternatively, that meet a minimum application performance level as specified by a
performance model.)
In previous work, Rajesh Raman extended the ClassAd/matchmaking formalism with gang match
technology to support multilateral matchmaking. Gang match uses a single ClassAd to represent
the requirements of multiple interdependent matches by defining an ordered list of labeled ports,
each of which is a nested ClassAd describing a request for a “sub-match.” Gang match associates
a label with every port, so that every port can access the attributes of other ports. A
multilateral match occurs by docking the individual ports of distinct advertisements, thus forming
tree-shaped “gangs” of linked ClassAds. Figure 4 shows an advertisement for a graphics rendering
service. The advertisement consists of three ports, of which the first behaves as the parent link.
The next two ports request a workstation and a license for the rendering application, respectively.
[ Ports = {
    [
      label = request;
      type = "render_server";
      requirements = request.Type == "render_client"
                     && request.Owner != "rival";
      rank = 0
    ],
    [
      label = cpu;
      imageSize = 27.2M;
      executable = "do_render";
      stdIn = request.sceneFile;
      stdOut = request.outputFile;
      requirements = cpu.Arch == "INTEL"
                     && cpu.OpSys == "LINUX"
                     && cpu.VirtualMemory > ImageSize;
      rank = cpu.Memory
    ],
    [
      label = license;
      requirements = license.App == "do_render";
      rank = 0
    ]
  }
]
Figure 4. Example of gang match
However, gang match still cannot express constraints that apply to sets of resources, such as
examples four and five above. We require more general mechanisms that allow us to deal with
sets. In order to deal with these challenges, we introduce Set Matching technology that includes a
Set-Extended ClassAd language for describing constraints on sets of entities, and a Set-Extended
Matchmaking mechanism for identifying entity sets that match a request.
4 Set-Extended ClassAd and Set-Extended Matchmaking
We now introduce the extensions that we define to the ClassAd/matchmaker framework to
support the use of sets. As indicated in Table 1 and illustrated in Figure 5, these extensions work
basically as follows. At the syntactic level, we extend the ClassAd language to include syntax,
operators, and functions for representing and manipulating sets. At the operational level, we
define a new set-extended variant of the Condor matchmaking algorithm. Like the Condor
matchmaking algorithm, this algorithm maintains a collection of resource ClassAds and evaluates
incoming request ClassAds one by one. However, rather than match each request with every
resource ClassAd, it invokes a set constructor to create a collection of set ClassAds representing
combinations of resource ClassAds. As we describe below, this set constructor does not generate
all possible combinations, as that would in general be computationally infeasible. Instead, it
explores the space of possible combinations by using a search strategy whose behavior can be
controlled via a set of matchmaking control attributes (Section 4.4.3) provided in the request
ClassAd.
Table 1: Comparison of Condor ClassAd with Set-Extended ClassAd.
| Component | ClassAd | Set-Extended ClassAd |
|---|---|---|
| Syntax | ClassAd. | ClassAd plus additional set/list-oriented operators and functions. |
| Matchmaking Protocol | Send ClassAd; receive result. | Same. |
| Evaluation Algorithm | Evaluate a pair of ClassAds subject to control attribute requirements: return true if requirements expressions in both return true, false otherwise. | Same. |
| Matchmaking Algorithm (various) | E.g., Condor matchmaking algorithm: evaluate each request ClassAd with all current resource ClassAds and return ClassAd for which evaluation is successful and that has highest value computed for control attribute rank, or fail. | E.g., set-extended Condor matchmaking algorithm: operate a set constructor (under control of set-constructor-specific control attributes) and then apply standard Condor matchmaking to evaluate request with respect to ClassAds representing the sets created by the set constructor. |
In this discussion, we use the term element to refer to the atomic unit of a set; a set is thus a
collection of elements.
[Figure 5 diagram: requests and resources feed the set-extended Condor matchmaking engine. A set
constructor combines Resource ClassAds 1 to 4 into candidate sets such as {Res2}, {Res1,Res2},
and {Res1,Res3}; the Request ClassAd is evaluated against these sets, and the engine returns a
match, or fail.]
Figure 5: The set-extended Condor matchmaking engine
4.1 Set-extended ClassAd Syntax
The set-extended ClassAd language uses the ClassAd list syntax to represent a set. A ClassAd list
is constructed with the list construction operator as illustrated below.
{ expr0, expr1, ..., exprn }
A list expression evaluates to a list value, which can later be used as an array in subscript
expressions. Thus, for example, the following syntax represents a set (list) of CPU names:
{"cirque.ucsd.edu", "dralion.ucsd.edu", "cmajor.cs.uiuc.edu"}
while the following syntax represents a set (list) of ClassAds, each of which (in this case)
describes a computational resource:
Set1= {
[ hostname="cirque.ucsd.edu"; cpuspeed=501M; memory=249M;
ID="cirque_ucsd_edu";
bandwidth=[ dralion_ucsd_edu= 56K; cmajor_cs_uiuc_edu= 6K];
requirements=RegExp(other.user, "grads*")
],
[ hostname="dralion.ucsd.edu"; cpuspeed=451M; memory=251M;
ID="dralion_ucsd_edu";
bandwidth=[cirque_ucsd_edu= 56K; cmajor_cs_uiuc_edu= 6K];
requirements=RegExp(other.user, "grads*")
],
[ hostname="cmajor.cs.uiuc.edu"; cpuspeed=256M; memory=128M;
ID="cmajor_cs_uiuc_edu";
bandwidth=[ dralion_ucsd_edu= 6K; cirque_ucsd_edu= 6K];
requirements=RegExp(other.user, "grads*")
]
}
Figure 6. A resource set described by a ClassAd list
Table 2: Set-extended ClassAd extensions to ClassAd syntax.
| Operator(s) | Syntax | Example |
|---|---|---|
| Attribute reference | ".", "[]" | {[val=1],[val=2]}.val => {1, 2}; [a=1; b=3]["a"] => 1 |
| Aggregation functions | Max, Min, Avg, Sum | Max({1,2,3}) => 3 |
| List functions | Allcompare, Anycompare, RegExp, Sublist, Inlist, Size | Allcompare(">", {1,2,3}, 2) => false |
| List operators | "+", "-", "*", "/" | {1, 2, 3} + {4, 5, 6} => {5, 7, 9}; {1, 2, 3} * 2 => {2, 4, 6} |
In order to describe the aggregation characteristics of a set, the set-extended ClassAd language
also extends ClassAd as noted in Table 2 and discussed in the following.
4.1.1 Attribute Reference
An operator “.” is defined for accessing attribute values of a ClassAd list, and an operator “[]” is
defined for accessing an attribute value when the attribute name is given by a variable.
expr.attr: This variant first evaluates the expression expr, which must evaluate to a
ClassAd or a ClassAd list. If this expression evaluates to undefined, the value of the entire
reference is undefined. Otherwise, if the value is not a ClassAd or a ClassAd list, the value of the
reference is error. If expr evaluates to a ClassAd, expr.attr is as defined in [7];
if expr evaluates to a ClassAd list lv, then expr.attr = {
lv[i].attr | i = 0, …, Size(lv)-1 }.
expr1[expr2]: This variant first evaluates the expression expr1, which must evaluate to a
ClassAd, and the expression expr2, which must evaluate to a string. If either of these two
expressions evaluates to undefined, the value of the entire reference is undefined. Otherwise, if
the value of expr1 is not a ClassAd or the value of expr2 is not a string, the value of the
reference is error. If the value of expr2 is a string attr, then expr1[expr2] = expr1.attr.
In the preceding example (Figure 6):
o Set1.hostname returns a string list {"cirque.ucsd.edu", "dralion.ucsd.edu",
"cmajor.cs.uiuc.edu"}.
o Set1[0].bandwidth["dralion_ucsd_edu"] returns 56K.
o Set1.hostname["dralion_ucsd_edu"] returns error, as the value of Set1.hostname
is not a ClassAd.
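The two reference operators can be mimicked in Python as a minimal sketch. The assumptions here are ours: a ClassAd is a dict, a ClassAd list is a list of dicts, UNDEFINED and ERROR are sentinel objects standing in for the ClassAd values undefined and error, and "K" is taken to mean 10^3. The function names dot and subscript are hypothetical.

```python
class _Special:
    """Sentinel type for the ClassAd values undefined and error."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return self.name


UNDEFINED = _Special("undefined")
ERROR = _Special("error")
K = 10**3  # assumed meaning of the "K" suffix


def dot(value, attr):
    """expr.attr for a ClassAd (dict) or a ClassAd list (list of dicts)."""
    if value is UNDEFINED:
        return UNDEFINED
    if isinstance(value, dict):
        return value.get(attr, UNDEFINED)
    if isinstance(value, list) and all(isinstance(v, dict) for v in value):
        return [v.get(attr, UNDEFINED) for v in value]
    return ERROR


def subscript(value, key):
    """expr1[expr2], where expr1 must be a ClassAd and expr2 a string."""
    if value is UNDEFINED or key is UNDEFINED:
        return UNDEFINED
    if not isinstance(value, dict) or not isinstance(key, str):
        return ERROR
    return dot(value, key)


set1 = [
    {"hostname": "cirque.ucsd.edu",
     "bandwidth": {"dralion_ucsd_edu": 56 * K, "cmajor_cs_uiuc_edu": 6 * K}},
    {"hostname": "dralion.ucsd.edu",
     "bandwidth": {"cirque_ucsd_edu": 56 * K, "cmajor_cs_uiuc_edu": 6 * K}},
    {"hostname": "cmajor.cs.uiuc.edu",
     "bandwidth": {"dralion_ucsd_edu": 6 * K, "cirque_ucsd_edu": 6 * K}},
]

hostnames = dot(set1, "hostname")                                   # Set1.hostname
bw = subscript(dot(set1[0], "bandwidth"), "dralion_ucsd_edu")       # Set1[0].bandwidth[...]
bad = subscript(hostnames, "dralion_ucsd_edu")                      # Set1.hostname[...]
print(hostnames, bw, bad)
```

Note that set1[0] above is ordinary Python list indexing; only the string-keyed subscript corresponds to the “[]” operator defined in this section.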
4.1.2 Aggregation Functions
Four aggregation functions, Max, Min, Avg and Sum, are provided to specify aggregate
properties of a numeric value list:
o Max(value-list) returns the maximum value in the value-list if the value-list is a list
of numeric values, or error otherwise.
o Min(value-list) returns the minimum value in the value-list if the value-list is a list
of numeric values, or error otherwise.
o Sum(value-list) returns the sum of the values in the value-list if the value-list is a list
of numeric values, or error otherwise.
o Avg(value-list) returns the average of the values in the value-list if the value-list is a
list of numeric values, or error otherwise.
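A minimal Python sketch of the four aggregation functions follows. The sentinel ERROR stands in for the ClassAd value error; treating an empty list as error is our assumption, since the text does not say what an empty list should yield. The function names agg_max, agg_min, agg_sum, and agg_avg are hypothetical.

```python
ERROR = object()  # stand-in for the ClassAd value error


def _is_numeric_list(values):
    """True for a non-empty list whose items are all int or float (bool excluded)."""
    return (isinstance(values, list) and len(values) > 0
            and all(isinstance(v, (int, float)) and not isinstance(v, bool)
                    for v in values))


def agg_max(values):
    return max(values) if _is_numeric_list(values) else ERROR


def agg_min(values):
    return min(values) if _is_numeric_list(values) else ERROR


def agg_sum(values):
    return sum(values) if _is_numeric_list(values) else ERROR


def agg_avg(values):
    return sum(values) / len(values) if _is_numeric_list(values) else ERROR


print(agg_max([1, 2, 3]), agg_min([1, 2, 3]), agg_sum([1, 2, 3]), agg_avg([1, 2, 3]))
# prints: 3 1 6 2.0
```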
4.1.3 List Functions
List functions Size, Inlist, Sublist, Allcompare, Anycompare, and
RegExp are supplied to specify properties of the elements of a list:
o Size(V) returns the number of elements in list V, or error if applied to anything
other than a list.
o Inlist(S, V) returns true if S is an element of list V, or false otherwise.
o Sublist(V1, V2) returns true if, for each element S of V1, Inlist(S, V2)
returns true, or false otherwise.
o Allcompare(logicOp, value-list, V) returns true if and only if, for every value W
in value-list, W logicOp V returns true. Otherwise, it returns false.
o Anycompare(logicOp, value-list, V) returns true if and only if there is a value
W in value-list such that W logicOp V returns true. Otherwise, it returns false.
o RegExp(s, exp) returns true if the regular expression exp matches part or all of string
s. Otherwise, it returns false. We also allow s and exp to be lists. If s is a string list, the
function returns true if and only if, for every string x in the list s, RegExp(x, exp)
returns true. If exp is a list of regular expressions, the function returns true if and only if
there is an element y in list exp such that RegExp(s, y) returns true. If both are lists, the
function returns true if and only if, for every scalar value x in s, there is a regular
expression y in exp such that RegExp(x, y) returns true.
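The quantifier structure of these functions (all, any, and "for every x there is a y") can be made explicit with a short Python sketch. It assumes Python's re.search as the regular-expression engine, which may differ from the dialect used by the ClassAd library, and the patterns in the usage lines are written in that dialect for illustration.

```python
import operator
import re

ERROR = object()  # stand-in for the ClassAd value error

_COMPARE = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
            ">=": operator.ge, "==": operator.eq, "!=": operator.ne}


def size(v):
    return len(v) if isinstance(v, list) else ERROR


def inlist(s, v):
    return s in v


def sublist(v1, v2):
    return all(inlist(s, v2) for s in v1)


def allcompare(logic_op, values, v):
    """True iff W logic_op v holds for every W in values."""
    return all(_COMPARE[logic_op](w, v) for w in values)


def anycompare(logic_op, values, v):
    """True iff W logic_op v holds for at least one W in values."""
    return any(_COMPARE[logic_op](w, v) for w in values)


def regexp(s, exp):
    """Every subject string must be matched by at least one pattern."""
    subjects = s if isinstance(s, list) else [s]
    patterns = exp if isinstance(exp, list) else [exp]
    return all(any(re.search(p, x) is not None for p in patterns)
               for x in subjects)


domains = [r"\.ucsd\.edu$", r"\.cs\.uiuc\.edu$"]
print(allcompare(">", [1, 2, 3], 2))                                   # False
print(anycompare(">", [1, 2, 3], 2))                                   # True
print(regexp(["cirque.ucsd.edu", "cmajor.cs.uiuc.edu"], domains))      # True
```

Treating a scalar as a one-element list lets a single expression cover all four scalar/list combinations described for RegExp.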
4.1.4 List Operators
List operators “+”, “-”, “*”, “/” are carried out element-by-element on a list. The
arithmetic operators “+”, “-”, “*”, “/” are defined as follows:
List1 op List2 = { List1[i] op List2[i] | i = 0 … size(List1)-1 }
Scalar op List2 = { Scalar op List2[i] | i = 0 … size(List2)-1 }
List1 op Scalar = { List1[i] op Scalar | i = 0 … size(List1)-1 }
In this definition, op is an arithmetic operator. List1 and List2 must have the same number of
elements; otherwise, the result is error. A scalar can be combined with a list of
any length.
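The three cases of the definition map directly onto a small Python function. This is a sketch only: list_op is a hypothetical name, ERROR stands in for the ClassAd value error, and "/" is mapped to Python's true division, which may not match ClassAd integer-division rules.

```python
import operator

ERROR = object()  # stand-in for the ClassAd value error

_ARITH = {"+": operator.add, "-": operator.sub,
          "*": operator.mul, "/": operator.truediv}


def list_op(op, left, right):
    """Apply an arithmetic operator element-by-element, broadcasting scalars."""
    f = _ARITH[op]
    left_is_list, right_is_list = isinstance(left, list), isinstance(right, list)
    if left_is_list and right_is_list:
        if len(left) != len(right):
            return ERROR  # lists of different sizes cannot be combined
        return [f(a, b) for a, b in zip(left, right)]
    if left_is_list:
        return [f(a, right) for a in left]   # List1 op Scalar
    if right_is_list:
        return [f(left, b) for b in right]   # Scalar op List2
    return f(left, right)                    # plain scalar arithmetic


print(list_op("+", [1, 2, 3], [4, 5, 6]))  # [5, 7, 9]
print(list_op("*", [1, 2, 3], 2))          # [2, 4, 6]
print(list_op("-", 10, [1, 2]))            # [9, 8]
```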
4.2 Advertising Protocol
The advertising protocol used within the set-extended ClassAd system is identical to that used in
the standard ClassAd/matchmaker framework.
4.3 Evaluation Algorithm
The evaluation algorithm used within the set-extended ClassAd system is identical to that used in
the standard ClassAd/matchmaker framework. However, in general we will be evaluating a
request ClassAd containing set operations with respect to a set (list) of ClassAds. The request
ClassAd and a set of ClassAds match if the expressions named requirements in all of the ClassAds
evaluate to true and the expression named rank evaluates to a numerical value representing the
quality of the match. If the requirements expression in any of those ClassAds evaluates to false,
error, or undefined, they do not match.
For example, Figure 7 shows a possible request ClassAd. Here other is an expression that
evaluates to the matching ClassAd set, described by a ClassAd list. By the definition of the list
operator “.”, other.<attr> evaluates to a list. Notice how the requirements attribute uses the
RegExp function to request a resource set that is located in domain “ucsd.edu” or
“cs.uiuc.edu”; the Sum function to specify that total memory must be more than 1 GB; and the
Allcompare function to specify that each element in the resource set is faster than 200 MHz.
Furthermore, the rank attribute defines a (very simple) performance model, expressing rank in
terms of the product of the speed of the slowest processor and the number of processors, as might
be appropriate for an application with little communication and no ability to load balance. Note
that the resource set Set1 as listed in Figure 6 satisfies the RegExp and Allcompare constraints,
but its total memory is only 628M (249M + 251M + 128M); by the definition of match, Set1
therefore matches this request only if the memory bound is lowered below that total.
[ user = "grads27";
  Domains = {"*.ucsd.edu", "*.cs.uiuc.edu"};
  Requirements = Sum(other.memory) > 1G
                 && Allcompare(">", other.cpuspeed, 200M)
                 && RegExp(other.hostname, Domains);
  effectivePower = Min(other.cpuspeed) * Size(other);
  Rank = effectivePower ]
Figure 7. Request for resource set
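To make the evaluation concrete, the following Python sketch evaluates each clause of the request against the three resources listed in Figure 6. Several things here are assumptions rather than part of the ClassAd system: "M" and "G" are taken as 10^6 and 10^9, the Domains patterns are interpreted as shell-style globs (via fnmatch) because a leading "*" is not a valid regular expression in Python, and the memory bound is taken to be 1G as stated in the text.

```python
import re
from fnmatch import fnmatchcase

M, G = 10**6, 10**9  # assumed meanings of the "M" and "G" suffixes

set1 = [
    {"hostname": "cirque.ucsd.edu", "cpuspeed": 501 * M, "memory": 249 * M},
    {"hostname": "dralion.ucsd.edu", "cpuspeed": 451 * M, "memory": 251 * M},
    {"hostname": "cmajor.cs.uiuc.edu", "cpuspeed": 256 * M, "memory": 128 * M},
]
request = {"user": "grads27", "domains": ["*.ucsd.edu", "*.cs.uiuc.edu"]}


def evaluate(request, resources):
    """Evaluate each clause of the set request and compute its rank."""
    memory = [r["memory"] for r in resources]      # other.memory
    speed = [r["cpuspeed"] for r in resources]     # other.cpuspeed
    hosts = [r["hostname"] for r in resources]     # other.hostname
    clauses = {
        "sum_memory": sum(memory) > 1 * G,
        "all_cpuspeed": all(s > 200 * M for s in speed),
        "hostname_domains": all(any(fnmatchcase(h, d) for d in request["domains"])
                                for h in hosts),
        # each resource requires RegExp(other.user, "grads*")
        "resource_requirements": all(re.search("grads*", request["user"]) is not None
                                     for _ in resources),
    }
    rank = min(speed) * len(resources)  # Min(other.cpuspeed) * Size(other)
    return clauses, rank


clauses, rank = evaluate(request, set1)
print(clauses, rank // M)
```

Under these assumptions the cpuspeed, hostname, and per-resource clauses hold and the rank is 768M, but the memory clause is false because the three memories sum to 628M.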
4.4 Set-Extended Matchmaking Algorithm Framework
Recall that in the standard ClassAd/matchmaker framework, a variety of different matchmaking
algorithms can be defined that specify how the evaluation algorithm is invoked in a particular
setting. For example, the Condor matchmaking algorithm described in Section 2.4 evaluates each
request ClassAd with respect to every known resource ClassAd, subject to the control of two
control attributes, requirements and rank.
Our set-extended framework also admits a variety of matchmaking algorithms. We describe
here a matchmaking algorithm framework intended for use in a similar context to the Condor
matchmaking algorithm, i.e., when incoming requests must be evaluated with respect to known
resources. The key difference is that, as illustrated in Figure 5, our framework introduces a set
constructor that creates a collection of candidate sets for a particular request ClassAd. The
standard Condor matchmaking algorithm is then applied to evaluate the request ClassAd with
respect to these sets. We refer to a matchmaking algorithm framework because, as we describe
below, the structure that we describe can be used to instantiate a variety of different set-extended
matchmaking algorithms. In the following, we first describe our basic framework, then describe
one particular instantiation of that framework, and finally discuss control attributes that can be
used to control the operation of that instantiation.
4.4.1 The Basic Framework
In order to avoid the generation of candidate sets that cannot possibly meet request requirements,
the set constructor proceeds in two phases. In a first filtering phase, it removes individual entities
from consideration based on the request ClassAd’s requirements expression. In the
requirements expression, there are two kinds of constraints on a set: one kind
applies to every element of the set; the other applies to the aggregate characteristics of the set.
For example, in Figure 7, the sub-expression “Allcompare(">", other.cpuspeed, 200M)” indicates
that every machine in the resource set should be faster than 200 MHz, and the sub-expression
“Sum(other.memory) > 1G” indicates that the total memory size of the resource set must exceed
1G. We therefore use the per-element constraints to filter unmatched elements from consideration.
A RegExp/Sublist expression can also be used in this phase, as discussed above.
Following the filtering phase, the set constructor proceeds to the set generation phase. Ideally, we
might like the set constructor to generate all possible combinations of (filtered) ClassAds, so as to
ensure that the matchmaking algorithm finds the optimal match. However, in many situations the
number of sets generated would be infeasibly large, and so the set constructor instead uses a
search heuristic to generate a smaller set of candidate sets. It is the nature of this heuristic that
defines a specific instantiation of our set-extended matchmaking framework.
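The filtering phase, and the reason exhaustive set generation does not scale, can be illustrated with a short Python sketch. The helper filter_pool and the host "slowbox.example.org" are hypothetical; the per-element constraint is the cpuspeed bound from the Allcompare sub-expression, with "M" assumed to mean 10^6.

```python
M = 10**6  # assumed meaning of the "M" suffix


def filter_pool(pool, element_constraints):
    """Keep only elements that satisfy every per-element constraint."""
    return [e for e in pool if all(check(e) for check in element_constraints)]


pool = [
    {"hostname": "cirque.ucsd.edu", "cpuspeed": 501 * M},
    {"hostname": "dralion.ucsd.edu", "cpuspeed": 451 * M},
    {"hostname": "cmajor.cs.uiuc.edu", "cpuspeed": 256 * M},
    {"hostname": "slowbox.example.org", "cpuspeed": 150 * M},  # hypothetical host
]

# Per-element form of Allcompare(">", other.cpuspeed, 200M)
element_constraints = [lambda e: e["cpuspeed"] > 200 * M]

filtered = filter_pool(pool, element_constraints)
# Number of non-empty subsets an exhaustive set constructor would have to rank
candidate_set_count = 2 ** len(filtered) - 1
print([e["hostname"] for e in filtered], candidate_set_count)
```

With n filtered elements there are 2^n - 1 non-empty candidate sets (7 here, but over a million for n = 20), which is why a search heuristic is used instead of enumeration.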
4.4.2 A Greedy Heuristic
We describe one possible heuristic here. (This happens to be the heuristic that we have
implemented in our set-extended ClassAd/matchmaker prototype.) In its basic form, this greedy
heuristic proceeds as described in Figure 8. In narrative form, the algorithm repeatedly removes
the “best” element remaining in the element pool (with “best” being determined by the rank of the
resulting set formed) and constructs a new “candidate set” by adding this element
(SetConstructor in Figure 8 is an algorithm that organizes a list of elements into a set; see
Section 4.4.3 for details). If this “candidate set” satisfies the request’s requirements and has a
higher rank than the “best set” so far, the “candidate set” becomes the new “best set”. The process
stops when the element pool is exhausted or, if the ‘express’ search option (see Section 4.4.3) is
specified by the request, as soon as a matching set is found. The algorithm returns the “best set”
that satisfies the user’s request, or failure if no such set is found. As with other greedy
algorithms, this algorithm is not guaranteed to find the best solution if one exists. The
set-matching problem can be modeled as an optimization problem under some constraints. It is
known that this problem is NP-complete in some situations. Hence it is difficult to find a general
algorithm that solves this problem efficiently, especially when the number of elements is large.
Our work provides an efficient algorithm with complexity O(N^2), taking rank computation as the
basic operation.
CandidateSet = NULL;
BestSet = NULL;
LastRank = -Infinity;
while (elementPool is not empty)
{
    // pick the element whose addition yields the highest-ranked candidate set
    Next = X : X in elementPool && for all Y in elementPool,
           rank(SetConstructor(CandidateSet + X)) >=
           rank(SetConstructor(CandidateSet + Y));
    elementPool = elementPool - Next;
    CandidateSet = SetConstructor(CandidateSet + Next);
    Rank = rank(CandidateSet);
    if (requirements(CandidateSet) == true && Rank > LastRank)
    {
        BestSet = CandidateSet;
        LastRank = Rank;
        if (IsExpressSearch)
            return BestSet;
    }
}
if (BestSet == NULL) return failure;
else return BestSet;
Figure 8: A greedy heuristic for use in set-extended matchmaking
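A runnable Python rendering of the heuristic in Figure 8 might look as follows. It is a sketch under assumptions of our own: requirements and rank are plain callables on a list of element dicts, the set constructor defaults to building a list and is assumed to return a list, and the four elements and the helper needs_memory are invented for the illustration (speeds and memories in units of M).

```python
def greedy_set_match(pool, requirements, rank, set_constructor=list, express=False):
    """Greedy search for a high-ranking set that satisfies requirements, else None."""
    pool = list(pool)  # work on a copy so the caller's pool is untouched
    candidate = []
    best, best_rank = None, float("-inf")
    while pool:
        # element whose addition gives the highest-ranked candidate set
        nxt = max(pool, key=lambda x: rank(set_constructor(candidate + [x])))
        pool.remove(nxt)
        candidate = set_constructor(candidate + [nxt])
        current_rank = rank(candidate)
        if requirements(candidate) is True and current_rank > best_rank:
            best, best_rank = candidate, current_rank
            if express:
                return best  # 'express' search: stop at the first matching set
    return best


def needs_memory(limit):
    """Hypothetical aggregate constraint: total memory must exceed limit."""
    return lambda s: sum(e["memory"] for e in s) > limit


def effective_power(s):
    """Rank model of Figure 7: slowest processor speed times number of processors."""
    return min(e["cpuspeed"] for e in s) * len(s)


pool = [
    {"name": "a", "cpuspeed": 500, "memory": 256},
    {"name": "b", "cpuspeed": 450, "memory": 256},
    {"name": "c", "cpuspeed": 250, "memory": 128},
    {"name": "d", "cpuspeed": 100, "memory": 512},
]

result = greedy_set_match(pool, needs_memory(600), effective_power)
print([e["name"] for e in result])  # ['a', 'b', 'c']
```

Each pass of the loop ranks one candidate set per remaining element, so a pool of N elements costs N + (N-1) + … + 1 rank evaluations for the selection step, which is the O(N^2) behavior noted above.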
4.4.3 Controlling the Behavior of the Greedy Search
In many practical situations, the user may have insights into the nature of the environment
that can be used to guide the set generation process. In the ClassAd/matchmaker framework,
the natural method for enabling user guidance is via control attributes. In our work to date,
we have experimented with the use of four such control attributes: CandidateSet,
SetConstructor, Groupby, and SearchOption. We describe each of these attributes in
turn.
The attribute CandidateSet, if present, is interpreted as specifying the candidate sets that the
matchmaker should consider, thus overriding the standard set generation algorithm altogether.
For example, if we specify CandidateSet={{r1, r2, r3}, {r2, r5, r6}}, the matchmaker will
consider only two resource sets: one that includes the three resources r1, r2, and r3, and another
that includes the three resources r2, r5, and r6.
The attribute Groupby, if present, is interpreted as specifying the name of another attribute that
should be used to decide which elements belong to the same set. For example, if we set
“Groupby=domain” in the request, offers with the same value for attribute domain belong to the
same set. In Figure 9, if the request ClassAd specifies “Groupby=domain”, the matchmaker will
consider three candidate sets, each of which includes the resources from one domain.
Element Pool
[ hostname="torc1.cs.utk.edu"; domain="cs.utk.edu"; ... ]
[ hostname="cirque.ucsd.edu"; domain="ucsd.edu"; ... ]
[ hostname="amajor.cs.uiuc.edu"; domain="uiuc.edu"; ... ]
[ hostname="torc2.cs.utk.edu"; domain="cs.utk.edu"; ... ]
[ hostname="dralion.ucsd.edu"; domain="ucsd.edu"; ... ]
[ hostname="bmajor.cs.uiuc.edu"; domain="uiuc.edu"; ... ]
[ hostname="torc3.cs.utk.edu"; domain="cs.utk.edu"; ... ]
[ hostname="cmajor.cs.uiuc.edu"; domain="uiuc.edu"; ... ]
Matchmaker (Groupby=domain)
Elements for Set1
[ hostname="torc1.cs.utk.edu"; domain="cs.utk.edu"; ... ]
[ hostname="torc2.cs.utk.edu"; domain="cs.utk.edu"; ... ]
[ hostname="torc3.cs.utk.edu"; domain="cs.utk.edu"; ... ]
Elements for Set2
[ hostname="cirque.ucsd.edu"; domain="ucsd.edu"; ... ]
[ hostname="dralion.ucsd.edu"; domain="ucsd.edu"; ... ]
Elements for Set3
[ hostname="amajor.cs.uiuc.edu"; domain="uiuc.edu"; ... ]
[ hostname="bmajor.cs.uiuc.edu"; domain="uiuc.edu"; ... ]
[ hostname="cmajor.cs.uiuc.edu"; domain="uiuc.edu"; ... ]
Figure 9. Example of Groupby
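The grouping step shown in Figure 9 can be illustrated with a short Python sketch. It assumes that element ClassAds are represented as dictionaries and that offers lacking the grouping attribute are simply skipped; the function name group_by is hypothetical, and both choices are assumptions made for the example rather than documented matchmaker behavior.

```python
from collections import defaultdict

def group_by(element_pool, attribute):
    """Partition offers into candidate sets by the value of one attribute."""
    groups = defaultdict(list)
    for ad in element_pool:
        if attribute in ad:            # assumption: skip ads without the attribute
            groups[ad[attribute]].append(ad)
    return dict(groups)

# The element pool of Figure 9, reduced to the two attributes shown there.
pool = [
    {"hostname": "torc1.cs.utk.edu", "domain": "cs.utk.edu"},
    {"hostname": "cirque.ucsd.edu", "domain": "ucsd.edu"},
    {"hostname": "amajor.cs.uiuc.edu", "domain": "uiuc.edu"},
    {"hostname": "torc2.cs.utk.edu", "domain": "cs.utk.edu"},
    {"hostname": "dralion.ucsd.edu", "domain": "ucsd.edu"},
    {"hostname": "bmajor.cs.uiuc.edu", "domain": "uiuc.edu"},
    {"hostname": "torc3.cs.utk.edu", "domain": "cs.utk.edu"},
    {"hostname": "cmajor.cs.uiuc.edu", "domain": "uiuc.edu"},
]

candidate_sets = group_by(pool, "domain")
for domain, members in candidate_sets.items():
    print(domain, [ad["hostname"] for ad in members])
```

Each value in candidate_sets corresponds to one of the three element groups in the figure.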
In general, a set is a collection of elements that stand in a particular relationship to one
another. For example, in the resource selection scenario, a set describes a candidate resource
selection scheme, which should include not only the resources but also the topology of the
virtual machine formed by these resources. Thus we need to add attributes to every element to
describe this relationship, rather than just putting all elements in a list. The attribute
SetConstructor, if present, is interpreted as specifying a function that should be used to
organize elements into a set. If this attribute is not specified in the request ClassAd, the
default action of the matchmaker is to construct a ClassAd list that simply includes all element
ClassAds. In the resource selection example above, the set construction function is the mapper
function that decides the topology of the virtual machine.
Figure 10 shows how a mapper function organizes three resources into a resource set that
describes a virtual machine with a one-dimensional topology. In this set, the attributes
“location”, “LBandwidth”, and “RBandwidth”, which describe the topology of the virtual machine,
are added by the mapper function.
Elements for set
    [ hostname="dralion.ucsd.edu";
      bandwidth = [ cirque_ucsd_edu = 56K; cmajor_cs_uiuc_edu = 6K ];
      ... ]
    [ hostname="cirque.ucsd.edu";
      bandwidth = [ dralion_ucsd_edu = 56K; cmajor_cs_uiuc_edu = 6K ];
      ... ]
    [ hostname="cmajor.cs.uiuc.edu";
      bandwidth = [ dralion_ucsd_edu = 6K; cirque_ucsd_edu = 6K ];
      ... ]
SetConstructor (Mapper)
    [
    Set2 = {
      [ hostname="cirque.ucsd.edu";
        bandwidth = [ dralion_ucsd_edu = 56K; cmajor_cs_uiuc_edu = 6K ];
        requirement = RegExp(other.user, "grads*"); ...;
        location=1; LBandwidth=10M; RBandwidth=56K ],
      [ hostname="dralion.ucsd.edu";
        bandwidth = [ cirque_ucsd_edu = 56K; cmajor_cs_uiuc_edu = 6K ];
        requirement = RegExp(other.user, "grads*"); ...;
        location=2; LBandwidth=56K; RBandwidth=6K ],
      [ hostname="cmajor.cs.uiuc.edu"; cpuspeed = 265M; memory = 128M;
        bandwidth = [ dralion_ucsd_edu = 6K; cirque_ucsd_edu = 6K ];
        requirement = RegExp(other.user, "grads*"); ...;
        location=3; LBandwidth=6K; RBandwidth=10M ] }
    ]
Figure 10. An example of set construction
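To make the role of the set constructor concrete, the following Python sketch reproduces the annotations of Figure 10 for a one-dimensional chain. It is a simplified illustration under several assumptions: the function name chain_mapper is hypothetical, the element order is taken as given (a real mapper would also choose the ordering), bandwidth values are kept as the strings shown in the figure, and the value used for the two open ends of the chain (edge_bandwidth, "10M" in the figure) is passed in as a parameter.

```python
def bandwidth_key(hostname):
    """Attribute name used in the figure: dots replaced by underscores."""
    return hostname.replace(".", "_")

def chain_mapper(elements, edge_bandwidth="10M"):
    """Annotate copies of the elements with their place in a 1-D chain."""
    annotated_set = []
    for i, ad in enumerate(elements):
        annotated = dict(ad)                       # leave the original ad untouched
        annotated["location"] = i + 1
        if i > 0:
            left = bandwidth_key(elements[i - 1]["hostname"])
            annotated["LBandwidth"] = ad["bandwidth"][left]
        else:
            annotated["LBandwidth"] = edge_bandwidth   # no left neighbor
        if i + 1 < len(elements):
            right = bandwidth_key(elements[i + 1]["hostname"])
            annotated["RBandwidth"] = ad["bandwidth"][right]
        else:
            annotated["RBandwidth"] = edge_bandwidth   # no right neighbor
        annotated_set.append(annotated)
    return annotated_set

# The three resources of Figure 10, in the order used for Set2.
elements = [
    {"hostname": "cirque.ucsd.edu",
     "bandwidth": {"dralion_ucsd_edu": "56K", "cmajor_cs_uiuc_edu": "6K"}},
    {"hostname": "dralion.ucsd.edu",
     "bandwidth": {"cirque_ucsd_edu": "56K", "cmajor_cs_uiuc_edu": "6K"}},
    {"hostname": "cmajor.cs.uiuc.edu",
     "bandwidth": {"dralion_ucsd_edu": "6K", "cirque_ucsd_edu": "6K"}},
]

set2 = chain_mapper(elements)
```

The request's rank and requirements expressions can then refer to location, LBandwidth, and RBandwidth as ordinary attributes of each element in the constructed set.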
Finally, the attribute SearchOption, if present, is interpreted as specifying the search method
used when looking for a good set. If its value is “express”, the matchmaker returns the first set
it finds that satisfies the user’s request. If its value is “power”, the matchmaker continues
searching and returns the best set among all the candidate sets it considers. The “power” search
generally finds a better set, whereas the “express” search runs faster. By default, the
matchmaker performs the “power” search.
5 An Example of Set Matching
In [9], we built a Resource Selection Service based on the set matching technology and validated
our design with a computational astrophysics application, Cactus.
The Cactus application simulates the 3D scalar field produced by two orbiting sources.
When the scalar field is large, it is impossible to run this application on a single site because
of limited memory size and computation capability. To achieve higher performance, the
application can decompose the 3D scalar field into several smaller blocks and allocate these
blocks and the related computation to multiple sites. The computations on different blocks then
run in parallel, and the execution time of the application is determined by the resource that
finishes its computation last. For this application, we need to express a request such as “Want a
set of resources with enough memory capacity to keep the data in memory”. If multiple resource
sets fulfill the resource constraints, we pick the best one based on the user’s criterion, for
example, the shortest execution time. The request must therefore also specify the application’s
performance on a resource set and the user’s criterion for ranking candidate resource sets. This
request is described in the extended ClassAd language as follows.
[
1. iter=100; x=100; y=100; z=100;
2. cactus=370; cactusC=254; startup=30;
3. computetime = x*y*other.alpha/other.cpuspeed*cactus;
4. comtime = ( other.RLatency + y*x*cactusC/other.RBandwidth
               + other.LLatency + y*x*cactusC/other.LBandwidth );
5. exectime = (computetime+comtime)*iter+startup;
6. SetConstructor = [type="dll"; libname="cactus"; func="mapper"];
7. requirements = Sum(other.memorysize) >= (1.757 + 0.0000138*z*x*y)
                  && RegExp(other.hostname, domains);
8. domains = { "*.cs.utk.edu", "*.ucsd.edu" };
9. rank = Min(1/exectime)
]
Figure 11. Resource requests for Cactus application
Lines 1–5 are the job description, including the problem size and the Cactus performance model
[10]. Line 5 models the execution time of each subtask on a machine. Line 6 gives the name and
location of the mapping algorithm used for the application, which the matchmaker uses to
construct a set from several resources. The mapping algorithm chooses a mapping scheme and adds
this information to the resource ClassAds. In this example, LLatency and LBandwidth describe the
network connection between a resource and its left neighbor in the virtual machine; RLatency and
RBandwidth describe the network connection between a resource and its right neighbor in the
virtual machine; and alpha is the workload allocated to the resource. Line 7 gives the resource
constraints: the total memory capacity of the resource set must be large enough to keep the
computation in memory, as described by a formula in the problem size, and resources must be
selected from machines in the “cs.utk.edu” or “ucsd.edu” domains, which are listed in Line 8.
Line 9 specifies that the reciprocal of the execution time of the application is used as the
criterion to rank candidate resource sets. Because the execution time of the application is
determined by the subtask that finishes last, the rank of a resource set is the minimum, over its
subtasks, of the reciprocal of the subtask execution time. If multiple resource sets fulfill the
requirements, the resource set on which the application achieves the smallest execution time has
the highest rank.
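As a worked illustration of how the request in Figure 11 is evaluated against one candidate set, the following Python sketch transcribes the expressions of Lines 3–9. The job parameters are copied from the request; everything else is an assumption made for the example: the resource attribute values are made up, units are left unspecified as in the request, the patterns in domains are interpreted as shell-style wildcards, and the hostname test is assumed to apply to every element of the set.

```python
from fnmatch import fnmatchcase

# Job parameters from Lines 1-2 and Line 8 of the request.
ITER, X, Y, Z = 100, 100, 100, 100
CACTUS, CACTUS_C, STARTUP = 370, 254, 30
DOMAINS = ["*.cs.utk.edu", "*.ucsd.edu"]

def exec_time(r):
    """Lines 3-5: modeled execution time of the subtask on resource r."""
    compute = X * Y * r["alpha"] / r["cpuspeed"] * CACTUS
    comm = (r["RLatency"] + Y * X * CACTUS_C / r["RBandwidth"]
            + r["LLatency"] + Y * X * CACTUS_C / r["LBandwidth"])
    return (compute + comm) * ITER + STARTUP

def requirements(resource_set):
    """Line 7: aggregate memory constraint plus the domain restriction."""
    memory_needed = 1.757 + 0.0000138 * Z * X * Y
    enough_memory = sum(r["memorysize"] for r in resource_set) >= memory_needed
    in_domains = all(any(fnmatchcase(r["hostname"], p) for p in DOMAINS)
                     for r in resource_set)
    return enough_memory and in_domains

def rank(resource_set):
    """Line 9: the slowest subtask determines the rank of the set."""
    return min(1 / exec_time(r) for r in resource_set)

# Hypothetical resource set as it might look after the mapper has run.
candidate = [
    {"hostname": "torc1.cs.utk.edu", "memorysize": 8, "cpuspeed": 500,
     "alpha": 0.5, "LLatency": 0, "RLatency": 0,
     "LBandwidth": 2540000, "RBandwidth": 2540000},
    {"hostname": "dralion.ucsd.edu", "memorysize": 8, "cpuspeed": 250,
     "alpha": 0.5, "LLatency": 0, "RLatency": 0,
     "LBandwidth": 2540000, "RBandwidth": 2540000},
]

if requirements(candidate):
    print("rank of candidate set:", rank(candidate))
```

Because the second resource has half the CPU speed for the same share of work, its subtask takes longer and therefore sets the rank of the whole set.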
The offers we used are the resources in the GrADS test bed. The resource selector queries the GIS
[ref] to get resource information and creates an ad for every resource. It then uses the set
matching technology to select the best resource set for this application.
References
1. Berman, F. and R. Wolski. The AppLeS project: A Status Report. in Proceedings
   of the 8th NEC Research Symposium. 1997. Berlin, Germany.
2. Petitet, A., S. Blackford, and J. Dongarra, Numerical Libraries And The Grid: The
   GrADS Experiments With ScaLAPACK. 2001, University of Tennessee.
3. Dail, H., A Modular Framework for Adaptive Scheduling in Grid Application
   Development Environments, in Computer Science. 2002, University of California:
   San Diego.
4. Dail, H., et al. Application-Aware Scheduling of a Magnetohydrodynamics
   Application in the Legion Metasystem. in Proceedings of the 9th Heterogeneous
   Computing Workshop. 2000. Cancun, Mexico.
5. Litzkow, M., M. Livny, and M. Mutka. Condor - A Hunter of Idle Workstations.
   in Proceedings of the 8th International Conference on Distributed Computing
   Systems. 1988.
6. Raman, R., M. Livny, and M. Solomon. Matchmaking: Distributed Resource
   Management for High Throughput Computing. in Proceedings of The Seventh
   IEEE International Symposium on High Performance Distributed Computing.
   1998. Chicago, IL.
7. Raman, R., Matchmaking Frameworks for Distributed Resource Management, in
   Computer Science. 2000, University of Wisconsin: Madison.
8. Raman, R., ClassAds Programming Tutorial (C++). 2000.
9. Liu, C., et al. Design and Evaluation of a Resource Selection Framework. in
   HPDC-11. 2002. Edinburgh, Scotland.
10. Ripeanu, M., A. Iamnitchi, and I. Foster, Performance Predictions for a
    Numerical Relativity Package in Grid Environments. International Journal of
    High Performance Computing Applications, 2001. 15.