Program Synthesis for Combinatorial Optimisation Problems Position Statement *

From: AAAI Technical Report SS-02-05. Compilation copyright © 2002, AAAI (www.aaai.org). All rights reserved.
ProgramSynthesis for CombinatorialOptimisation Problems
Position Statement*
Pierre Flener
Departmentof Information Technology
UppsalaUniversity, Box337, S - 751 05 Uppsala, Sweden
pierref@csd.uu.se ASTRA
group: http://www.csd.uu.se/,~pierref/astra/
Abstract
Ahigh-levelabstract-datatype-based
constraint modelling
languageopensthe doorto an automatable
empiricaldetermination-- by a synthesiser-- of howto ’best’ represent
the decisionvariablesof a combinatorial
optimisation
problem,basedon(real-life) traininginstancesof the problem.
the extremecasewhereno suchtraining instancesare provided,sucha synthesiserwouldsimplybe non-deterministic.
Afirst-orderrelationalcalculusis a goodcandidate
for sucha
language,
as it givesrise to verynaturalandeasy-to-maintain
modelsof combinatorial
optimisationproblems.
Introduction
Combinatorialoptimisation problemsare increasingly ubiquitous and crucial in industry. Indeed, staying competitive
in the global NewEconomy
requires the efficient modelling
and solving of such problems,whoseinstances are getting
larger and harder. Examplesare productionplanningsubject
to customerdemandand resource availability so that sales
are maximised,
and air traffic control subject to safety protocols so that flight times are minimised.Appropriatevalues
for the decision variables haveto be foundwithin their domains,subject to someconstraints, such that someoptional
objective function on these variables takes an optimalvalue.
In recent years, modellinglanguagesbasedon somelogic
with sets and relations have gained popularity in formal
methods,witness the B [1] and z [10] specification languages, the ALLOY
[6] object modellinglanguage, and the
Object Constraint Language(OCL)of UML.In database
modelling, this had been long advocated, most notably
via entity-relation-attribute (ERA)diagrams. Weexamine
whetherconstraint modellingcan benefit fromthese ideas.
Sets and set expressions recently started appearing as
modelling devices in someconstraint programminglanguages,with set variables often implemented
by the set intervai representation[5]. In the absenceof such an explicit
set concept, modellersusuallyrepresent a set variable as an
array of 0/1 integer variables, indexedby the domainof the
*Theseideas are the result of manyfruitful discussionswith
Brahim
Hnichand Zeynep
KJzlltan.Thisresearchis partly funded
undergrant 221-99-369
of VR,the Swedish
ResearchCouncil,and
undergrant IG2001-67
of STINT,
the SwedishFoundation
for InternationalCooperation
in ResearchandHigherEducation.
41
set. In termsof propagation,
the set interval representationis
equivalent to the 0/1 representation, whichconsumesmore
memory
but is able to supportmoreset expressionsand constraints. Bothrepresentations
are restrictedto finite sets.
Relations have not received muchattention yet in constraint progranuninglanguages,except the particular case
of a total function,via arrays. Indeed,a total functionf can
be representedas a 1-d array of variables over the range of
f, indexedby its domain,or as a 2-d array of 0/1 variables,
indexedby the domainand range of f, or evenwith someredundancy,
as long as channellingconstraints relate the parts
of the redundantrepresentation. Otherthan retrieving the
(unique)imageundera total function of a domainelement,
there has beenno supportfor relational expressions.
Weclaim that a high-level constraint modellinglanguage
withabstract datatypes(for sets, relations, and sequences)
opensthe door to an automatableempiricaldetermination
by a synthesiser -- of howto ’best’ represent the decision
variables of a combinatorialoptimisationproblem,basedon
(real-life) training instances thereof. In the extremecase
whereno such training instances are provided, such a synthesiser wouldsimplybe non-deterministic.Asuitable firstorder relational calculusis a goodcandidatefor sucha language,as it gives rise to verynatural and easy-to-maintain
modelsof combinatorialoptimisation problems.
Wehere ignore the issue of howto parameterisea solver,
say by providinga suitable labelling heuristic, towardsthe
solving of the modelledproblem. For non-expert or lazy
modellers,this task canalso be left to synthesisers[7,8]. We
thus here only aimat techniquesthat find the ’best’ model
for a givensolver, underits defaultsettings.
Relational Modelling with ESRA
DesignDecisions. In constraint satisfaction, muchmore
effort has beendirectedat efficiently solvingthe constraints
than at facilitating their modelling.Constraintprogramming
languagesreflect this, as their controlstructuresandvariable
representationoptions are usuallyquite low-level.
Thekeydesigndecisionsfor our constraint modellinglanguage-- called ESRA
-- are as follows. Wewant to capture common
modellingidioms in abstract datatypes, especially for relations, so as to designa truly high-levellanguage. Computational
completenessis not aimedat, as long
as the notationis useful for elegantly modellinga large hum-
ber of combinatorialoptimisationproblems.We(currently)
do not supportprocedures,and henceno procedurecalls and
no recursion. Similarly, we focus on finite domains, and
support only boundedquantification. In order to maximally
sugar the tint-order-logic nature of the language,weadopt a
’lower-128ASCII’syntax, unlike the I6TEX-requiring
syntax
of z, as well as a JAVA-style
declaration of the universally
quantified variables. For reasonsof space, wehere only introducethe conceptsof ESRA
that are actually illustrated in
this paper. Also, we can "only" give an informal semantics.
Thereader maymonitorwww.csd.uu.se/,,~pierref/astrafor a
completedescriptionof the full language.
Modellingthe Data. Aprimitivetype is either a finite enumerationof newconstantidentifiers, or a finite rangeof integers, indicated by its lowerand upperbounds.Theonlypredefined primitive types are the ranges nat: and int, which
are 0 :maxint and -maxint : nmxint:, respectively, with
maxint being the maximum
representable integer.
Relations are declared using the # relation typeconstructor. Considerthe relation type A m: n # p : q B.
ThenA and B mustbe primitive types, designating the two
participants of any relation of this type, with Abeingcalled
the domainand B the range of such a relation. The second and third argumentsof # are multiplicities, with the
followingsemantics: for every elementof A, there are betweenmand n elements of B, and for every element of B,
there are betweenp and q elementsof A in such a relation.
Wethus (currently)restrict the focus to binaryrelations, betweenprimitive types only. For partial and total functions,
m: n is 0 : 1 and 1 : 1, respectively.For injections, surjections, and bijections, p: q is 0 : 1, 1 :xp~txint, and1 : 1,
respectively. Ratherthan elevating functionsand their particular casesto first-class conceptswitha specific syntax,we
prefer keepingthe notation lean and leave their specialised
handlingto the synthesiser. This has the further advantage
that only the multiplicities needto be changedduringmodel
maintenance,say whena function becomesa relation.
(Arraysof) instance-datavariables are declaredin a JAVAstyle stronglytypedsyntax. All instance data are read in at
run-timefroma data file. Decisionvariable declarationsfollow the samesyntax, but are precededby the var keyword.
Theusageof arrays of decision variables, thoughpossible,
is sometimesdiscouraged, as they mayamountto a premature commitment
to a low-level representation of what essentiaily are relations. Dueto the (current) restrictions
relations, arrays are not a redundantfeature. All declarations
denote universally quantified variables, with the instancedata ones expectedto be groundat solving-timeand the decision onesexpectedto still be variablesthen.
Modelling the Cost Function and the Constraints. Expressions are constructedin the usual way.Theusual arithmeticoperatorsare available, suchas card for the cardinalit), of a set expression,ord for the positionof an identifier
in an enumeration,and sumfor the sumof a bounded(and
possibly filtered) numberof numericexpressions. Let R be
a relation of type Aat: n # 1~ : q B. For any element(or
42
subse0 a of A, the navigation expression a. R designates
the relational imageof a, that is the possibly emptyset of
all elementsin B that are related by R to (any elementin)
a. If m: n is 1 : 1, then a. R simplydesignatesthe (unique)
elementof B that is related to elementa of A. Therelation
expression,-,R designatesthe transposerelation of R, which
is thus of type B p: q # at: n A. The elementsof a rela- ¯
tion are representedas a#b pairs.
First-orderlogic formulasare also constructedin the usual
way.Atomsare built fromexpressionswith the usual predicates, suchas the infix in for set or relation membership
and
the infix ’<=’ for the ’<_’ inequality betweennumericexpressions. Formulasare built fromatomswiththe usual connectivesand quantifiers, such as not for negation,the infix
’&’ and ’=>’ for conjunctionand implication, and £ora3.1
and exists for bounded(and possibly filtered) universal
andexistential quantification.Theusualtyping, association,
and precedencerules apply.
Thecost function is a numericexpressionthat has to be
either minimisedor maximised.Theconstraints on the decision variablesare a conjunctionof formulas.
The Warehouse Location Problem
A companyconsiders opening warehouses on somecandidate locations to supply its existing stores. Eachcandidate
warehousehas the same maintenancecost, and the supply
cost to a store dependson the warehouse.Eachstore must
be supplied by exactly one warehouse(C1). Eachcandidate
warehousehas a capacity designating the maximum
number
of stores it can supply(Ca). Theobjective is to determine
which warehousesto open, and which of these warehouses
should supply the various stores, such that the sumof the
maintenanceand supply costs is minimised.In moremathematicalterms,the soughtsupplyrelationshipis a total function fromthe set of stores into the set of warehouses,
and the
set of warehouses
to be openedis the rangeof that function.
This problemwas first modelledas a constraint program
in the reference manualof 1LOG
SOLVER
4.0 (in 1997), and
then modelledin OPL[11]. There, the sought total function is modelledby a l-d array of variables representingthe
(unique)warehouse
that supplieseach store, therebycapturing the constraint Cx. The set of warehousesto be opened
is modelledin a redundantway(becauseit Wouldsuffice to
retrieve the range of that function), namelyas a 1-d array
owof 0/1 variables, such that OW[w] is 1 iff warehousew
is opened.A channelling constraint is then necessary, expressing that a warehousethat is actually supplying some
store mustbe opened.The cost function and constraint C2
can only be expressed in a low-level way, namelyby reinterpreting the Booleansof OW
andthe truth values of local
constraints as numericweights. Thingsbecomeeven more
awkwardif we non-redundantlymodelthe supply function,
namelyjust by the 1-d array of variables representing the
warehousethat supplies each store. Onthe instance data
we tried, this modelis actually an order of magnitudemore
efficient Coyall measures)than the publishedone, but it is
muchless readable. This showsthat redundancyelimination
maypay off in performance,but it mayjust as well be re-
enumWomen, Men
nat MaintCost
nat RankW[Women,Men],RankM[Men,Women]
enum Warehouses, Stores
var Women 1:1 # 1:1 Men Marriage // bij.
nat Capacity[Warehouses],
solve (
SupplyCost[Stores,Warehouses]
vat Stores 1:1 # nat Warehouses Supply // Cl forall(w#m, p#o in Marriage)
RankW[w,o] < RankW[w,m] // stability 1
minimise
=> RankM[o,p]< RankM[o,w]
sum(s#w in Supply) SupplyCost[s,w]
& RankM[m,p] < RankM[m,w] // stability 2
+ card(Stores.Supply)* MaintCost
=> RankW[p,o] < RankW[p,m] } }
subject to {
// C2
forall(w in Warehouses)
Rgure3:Thcoriginal Stable Marriageproblcm
card(w.’Supply)<= Capacity[w]
Pigurel:TheWarehouseLocationpmbiem
dundancyintroduction. But this is hard to guess, as human
intuition maybe weakhere.
Figure 1 shows an ESRAmodel of the problem. The
soughtsupplyrelationship is modelledas a relation and constrained to be a total functionfromthe stores into the warehouses, thereby capturing constraint G1. The elegance of
the cost function reflects the freedomfromrepresentation
choices, with the navigation expression Stores. Supply
retrieving the set of warehousesthat are to be opened.The
only constraint gracefully captures G2,using the navigation
expressionw. -,~supply to retrieve the set of stores that
warehousew supplies. Fromthis model, lower-level models can be syntbesised, includingthe ones discussedabove.
constraint is expressedfor both functions, requiring every
personto be identical to the spouseof their spouse.
A second modelwouldnon-redundantly modelthe marriages, namelyby a single total function, that is a l-d army
of variables representingthe wifeof each man,say. Toenforce the bijectivenessof this function,all variablesare constrainedto be different. Thismodelis probablyless efficient,
andthis has beenthe case withthe instancedata wetried.
Athird modelwouldmodelthe marriages in a 2-d array
Marriage of 0/I integer variables, indexed by the women
and men, so that Marriage [w, m] is 1 iff womanw is
marriedto manm. Twobijectiveness constraints are necessary to enforcethat every personhas exactly one spouse,so
that there is exactly one i in each row and in each column.
This modelis probablyless efficient than the secondone,
and this has beenthe case with the instancedata wetried.
Figure 3 showsan ESR^modelof the problem.The marriages are modelledas a relation over the sets of women
and
men,suchthat it is a bijection. Fromthis model,lower-level
modelscan be synthesised, including the ones above, using
the variouswaysof representingrelations, andexploitinginsights gainedfromthoroughstudies of bijections [9,12].
The Stable Marriage Problem
Original Version. Consider a dating agency where an
equal numbern of womenand menhave signed up and are
willing to marryany opposite-sexperson of the group.They
have rankedall possible spousesby decreasingpreference.
Figure 2 has sampleinstance data, wherea lowerrank means
a higher preference.For instance, Halis Nat’s first choice,
but it is Evewhois Hal’s first choice. Theobjective is to
match up the womenand mensuch that all marriages are
stable. A marriageis stable if, wheneverspouses prefers
someother partner, this partner prefers her/his spouseto s.
So s maybe unhappy,but s/he is boundto stay with her/his
spouse. In moremathematicalterms, the sought marriages
form a bijection betweenthe sets of women
and men.
This problemwasfirst modelledas a constraint program
in the reference manualof ILOGSOLVER
4.0 (in 1997), and
then modelledin OPLin a significantly simpler way[11].
Themarriagesare modelledin a redundantway, via two 1-d
arrays of variables representingthe (unique)husbandof each
woman
and the (unique) wife of each man,respectively.
channellingconstraint is necessaryto ensurethat bothtotal
functionsare the inverse of eachother, that is to achievea
bijectiun. To achieve better propagation, this channelling
ModelMaintenance.Relations and their particular cases
(partial functions,total functions,injections, surjections,bijections, and so on) are a single, powerfulconceptfor elegantly modellingmanyaspects of combinatorial optimisation problems.Also, there are not too manydifferent, and
evenstandard,waysof representingrelations and relational
expressions.Therefore,weadvocatethat the synthesiser can
actually makea (systematic)empiricalevaluation of candidate representations,using (real-life) training instances
the problem.In the absenceof suchtraining instances, such
a synthesiser wouldsimplybe non-deterministic.Also, theoretical studies such as [12] should be madefor particular
casesof relations in order to obtainrules stating whena representation is advisableand whennot, thereby reducingthe
volumeof such empiricalstudies by synthesisers.
Modelmaintenanceat the high ESR^level reduces to
adaptingto the newproblemand re-synthesising,as all representation(and thus solving)issues are left to the synthesiser. At lowerlevels, modelmaintenanceis quite tedious,
as the early if not uninformedrepresentation choices have
to be taken into accountand as the lower-level notation is
more awkward.Worse, a representation change, a redundancyelimination, or a redundancyintroduction (such as
Hal
Nat 1
Eve 2
Pat 3
Jim
2
3
2
Bob
3
1
1
Hal
Jim
Bob
Nat
3
3
3
Eve
1
1
2
Pat
2
2
1
Figure 2: Rankingsof the women
for the men(left),
rankings of the menfor the women
(fight)
and
43
enumWomen, Men
nat RankW[Women,Men],RankM[Men,Women]
var Women 1:3 # O:l Men Marriage
solve {
foral1(w#m in Marriage)
forall(oin Men)
// stability
RankW[w,o] < RankW[w,m] =>
exists(p in Women: p#o in Marriage)
RankM[o,p] < RankM[o,w]
& forall(pin Women)
// stability
RankM[m,p] < RankM[m,w] =>
forall(o in Men: p#o in Marriage)
RankW[p,o] < RankW[p,m] } }
Rgure4:TbePolyandricStable
Marriagepm~em
a modelintegration or the addition of impliedconstraints)
may"haveto" be operated,becauseit is unlikelythat, for the
consideredtraining instancesor in general,the ’best’ representationis the samefor bijectionsas for full relations, say.
Suchre-synthesis is also necessarywhenthe distribution
of instances on whichthe modelis deployedbecomesdifferent from the training distribution used whenthe modelwas
formulated.But the modellermaybe unwillingor unable to
do this experimentation
for finding the ’best’ model,or s/he
maybe unawareof insights gained froma general empirical
study, suchas on howto ’best’ modelbijections [9].
strongly-typed(full) first-order logic witharrays (andwith
the look of an imperativelanguage),while dispensingwith
such hard-to-properly-implement
and rarely-necessary (for
constraint modelling)’luxuries’ as recursion and unbounded
quantification. As shown,ESRAeven goes beyondthem, by
advocatingan abstract viewof relations.
Current and Future Work. Our ESRAlanguage is an extension of a streamlined(significan0 subset of OPL. A prototype ESRA-tO-OPL
synthesiser [4] has been implemented
by SimonWrang.The semantics of ESRAwill be given in
an implementation-independent
way, in twolayers. Indeed,
somefeatures of ESRA
ale just syntactic sugar for combinations of (a few)kernel features, hencewewill provide
operational semantics(by rewrite roles) for the non-kernel
features, and a set-oriented denotationalsemanticsfor the
kernel features. Wecan then tacHe the joint consideration of the modellingand the solver parameterisation. The
synthesiser will also benefit from our workon symmetryreducing/breakingconstraints [3]. A graphical languagecan
be developedfor the variable modelling,includingthe multiplicity constraintsonrelations, so that onlythe cost function
andthe other constraints needto be textually expressed.
References
1. J.-R. Abrial. The B-Book:Assigning Programsto Meanings. Cambridge
University Press, 1996.
2. K.R. Apt, J. Bmnekreef,V. Partington, and A. Schaerf.
Polyandry Version. Imagine a country where the law alAnimperative languagethat supports declarative prolows women
to marry up to 3 men,but menmaymarry only
gramming.ACMTOPLA$
20(5): 1014--1066, 1998.
1 woman.Alsoconsider that all women
whosigned up at the
3.
P.
Flener,
A.
Frisch,
B.
Hnich,Z. K~fltan, I. Mignel,J.
agencyneedto marry. The sought marriagesnowform a full
Pearson,
and
T.
Walsh.
Symmetryin matrix models. In
relation betweenthe sets of women
and men.A polyandric
CP’OIWorkshopon Symmetryin Constraints.
marriageis stable if, wheneverspouses prefers someother
partner, this partner/s married,and s/he prefers all her/his
high-level
4. R Flener, B. Hnich,and Z. l(azfltan. Compiling
spousesto s. If at least as manymenas women
havesigned
type constructors in constraint programming.
In Proc. of
up at the agency,the problemremainsa decision problem.
PADL’OI.LNCS
1990. Springer-Verlag, 2001.
Figure 4 shows an ~SRAmodel of this new problem.
5. C. Gervet.Interval propagationto reasonabout sets: DefThemultiplicities werechanged,andthe stability constraints
inition and implementation
of a practical language.Conwererephrasedto reflect the newdefinition. ~Thesamestastraints
1(3):191-244,
1997.
bility constraints could actually also havebeenused in the
A lightweight object modelling no6. D. Jackson. ALLOY:
modelof Figure 3, because the newdefmition of stability
tation. ACMTOPLAS,
forthcoming.
implies the original one in its context.) The modelmaintenancewas indeed unburdenedby representation issues.
7. Z. Klzlltan, P. Flener, and B. Hnich. Towardsinferring
labelling heuristics for CSPapplicationdomains.In Proc.
Conclusion
of Kl’O1. LNA12174.Springer-Verlag, 2001.
Related Work.This research owesa lot to previous work
8. S. Minton.Automatically
configuringconstraint satisfacon relational modellingin formal methodsand on ERA-style
tion programs.Constraints 1(1-2):7-43, 1996.
semantic data modelling, especially to the ALLOY
object
9. B.M. Smith. Modelling a permutation problem. RR18,
modellinglanguage[6], whichitself gainedmuchfromthe 7.
Univ. of Leeds(UK),Schoolof ComputerStudies, 2000.
specification notation [10] (and learned from UML/OCL
how
10. J.M. Spivey. The z Notation: A ReferenceManual(secnot to do it). Contraryto ERAmodelling,we do not distinondedition). Prentice-Hall,1992.
gnish betweenattributes and relations.
In constraint programming,OPL[I1] stands out as a
11. R Van Hentenryck. The OPLOptimi~tion Programming
medium-level constraint modelling language, and ALMA
Language.The MITPress, 1999.
[2] is alSObecominga very powerful notation, on top of
12. T. Walsh. Permutation problems and channelling conMODULA-2.Our ESRA
language shares with them the quest
straints. TR26 at www.dcs.st-and.ac.uld,-~apes,
2001.
for a practical declarative modellinglanguagebased on a
44