The Omega Test: programming a fast algorithm for William Dept. of Computer Science Univ. and integer dependence analysis * Pugh and Institute of Maryland, practical for Advanced College Park, Computer MD Studies 20742 pugh@cs .nrad. edu, (301)-405-2705 Abstract equalities The Omega test is an integer programming algorithm can determine whether a dependence exists between The that two gramming j array references, and if so, under what conditions. Conventional wisdom holds that integer programming techniques are far too expensive to be used for dependence analysis, except as a method of last resort for situations that cannot be decided by simpler methods. We present evidence that suggests this wisdom is wrong, and that the Omega test is refer to write 1 to = i A[i, 100 to j+l] be present evidence describe is an ties the array data dependence LYZ89, LY90, focused on approximate fast but only occurring) for the Omega Data whether *This cases. In may that results other they accurately report spurious are there an integer is supported also found quired to indicate scan grant to a set $01.50 the of the the that suggests this is wrong. which an wisdom determines arbitrary describe set of average the more equali- that suggest required vectors subscripts the test Omega twice and Omega by for an workstation. by the than We there linear time direction required We whether experiments the time At Omega the loop test to time re- bounds. is suitable a more a number ynique This for use in detailed keys, programming 7 that methods polynomial determining Omega of program- test such produce analysis as loop the is process (e.g., often incorpo- as comput- substantial speed for each (dependence validity must problem, time the in Omega structured see [BC86, vectors test as other ha. 10W- a decision this [WO182] of an exponenti.d In way, may re- number vectors complex to short-cut GKT91]). We which Treated are used transformations To be competitive, be able the complexity. directions of certain and complexity. situations yes or no. direction interchange). inethod time answer test vectors analysis many are accurate, simply a dependence of direction worst-case in dependence such the methods extension integer details, that worst-case tests of linear to an is a NP-Complete has exponential Section to determine level, new with in practice. test deciding combines elimination of implementation constraint in test constraints variable Dependence CCR-8908900. dependence for situations methods. array the quire resort compilers. problem: actuaI programming for special-case the equality order of last is rarely that (polynomial) has 4 @ 1991 ACM 0-89791-459-7/91/0004 that integer to be used 500 ,usecs on a 12 MIPS eliminating show dependence. to time values is performed. that determine Conceptually, ming. to be all to Fourier-Motzkin (commonly report the the simpler, to a problem would for W0189, work at to z and <loo holds We is less than production approximate equivalent solution by NSF this certain situations, problems exists Ban88, of read all programs, test pair analyze for deciding are guaranteed in dependency work but AK87, Much methods exact are conservative: dependence) BC86, refer pro- example, <loo test, solution almost Integer MHL91]. compute special methods [A1183, GKT91, that inequalities. that, We j] of methods the this variables j’ expensive by Omega integer and too decided improvements study time In loop and as a method cannot ing has been extensive the i’ wisdom except Omega There at the are far analysis, do = A[lOO, of and Conventional techniques do 100 values below. j+l=j’ rates j the variables array = shown programming. as an integer a = 1O(I analysis step in a parallelizing compiler (as well as many other software tools) is data dependence analysis for arrays: deciding if two references to an array can refer to the same element and if so, under what conditions. This information is used to determine allowable program transformations and optimizations. For example, we can decide that in the following program, no location of the array is both read and written. Therefore, the writes can be done in any order or in parallel. i problem of integer be formulated I<i<j A fundamental for would l<i’<j’ Introduction for a form problem is performed loop competitive with approximate algorithms used in practice and suitable for use in production compilers. The Omega test is based on an extension of FourierMotzkin variable elimination to integer programming, and has worst-case exponential time complexity. However, we show that for many situations in which other (polynomial) methods are accurate, the Omega test has low order polynomial time complexity. The Omega test can be used to simplify integer programming problems, rather than just deciding them. This has many applications, including accurately and efficiently computing dependence direction and distance vectors. 1 and inequalities, above a dependence this Section enumeration 4, we show how the Omega gramming test this in hand, that precisely and distance rectly in deciding 5.1. 2 The Omega Omega tion to an arbitrary test to a set use ~O<,<n variables 2.1 being this the ing di- the in Sec- Omega a,x, the the zo c) is = and of constraint tightening lutions, and (not includ- not integer integer do not we compute al, evenly the . . . . an. constant a. a., greatest the it a problem easier to Given a new solutions teger the The if and Of problem UO not the integer without solutions, rational so- P no that has and equality inequality con- constraints, pro- if the original problem in the process we might decide regardless of the no integer better simpler exists the constraint into all other ~ solve S@(fIk)(at result in all substitution this mod rn)zt constraints. In the original produces: la~ I = m – 1, this ~ Since all ((a, is equal to – (a, m~dm)) + m(a, m~dm))z, = O terms are now divisible by m, normalizing the produces: + ~ (([at/m+ $ + (a, m~dm))z, . ,Cv—{k} In the ficient original of a i; the plicative need same For of perform a unit the as the all absolute factor only before constraint. the of z k. changes inequality GCD additional if there Since coefficient inequali- of constraints solutions test [Banerjee88] constraints. approach eliminate We = O is only course, has integer somewhat which all the problem Generalized lowing + the this ‘lakl~ that at other most I/m step value value variables, of coefficient absolute absolute value this and suited solving constraint. that for our a,z, needs, for be added ~,=v be we found appropriate may equality a j # O such by toward more equalities the can However, in- used the to fol- since it is situations in later. = O, we first check IaJ I = 1. If so, we eliminate Xj and substitute the example, each of the of the this 1/2, Since O(log(max~et, and by a multi- m > _{O} we can coef- original substitution coefficient + appears consider result the 3, we la, I ) ) times eliminate the constraints: 7x + 12y + 312 3z+5rJ+14z has had constraints. eliminate substitute con- we replace determine equality eliminate solutions. inequality and constraint, For involving we first integer –sign(ak). teV—{k} constraints a problem ducing = com- constraint tightens but m constraint. Equality straints, ma — divide solutions. 2.2 then m)z, ~k xk = ‘Si@I(ak)??ZCI constraint, by g (i.e, term rational produce making mod constraint: tcV—{k} any We then is an inequality in the P has – a, the coef- produce is an equality divide constraint dividing P may thus integer but to produce here coefficients when If a problem produce we ). floors a and b b) – b fi con- are integers rational constraint not variable a mod (a mod standard A normalized by g. If the constraint floor lao/g] ak for constraint If the the Taking a new then else constraints coefficients with a constraint, g does <b/2 and 1 I O < i s n}). any coefficients described g of the unsatisfiable. with that constraint our coefficients). coefficients we take the algorithms the as follows: and set of indices V = {i that of the = > 0 as our tightening) divisor we scale divisor straint We create Note test simplify we define normalized. all with (k # O) and solu- ineqnalities, To c). – (i.e., a constraint To normalize all the am~db=ifamodb ‘Iaklma+ (the non-integer To and <Z<n a, x, z we assume in which m~d be quickly is an integer to as ~1 ~o<,<n been common coefficients, ficients that used is 1. ao) ties. input ai~t (and If we are given mon there equalities (such paper, variable value :Gv we use V–to-denote has in one greatest m = Iak I + 1. We define -=z( manipulated manipulating straint described al~or;thms), Normalizing Throughout are can are whether = O and and of the absolute depen- be vectors The our a,z, represen~a~ions, k be the index has the smallest transformations, distance as ~l<t<n (and can let let that test equalities presentation information set of linear (such possible of program determines of linear coefficient With a set of constraints all Otherwise, pro- them. techniques as a problem. inequalities the These integer deciding produce This and it. to simplify just describes validity direction The referred efficiently the from than concisely vectors. or standard computed tion rather we can dency be modified can problems, The methods Figure of example, these straints, term not these constraint no contradictions constraints so we constraints. ity resolve 17 = 7 constraints as shown in 1. In this ing above = know there A contradiction with or an equality divisible and by the zero arose are during no integer arises if we encounter with and the solutions conto the an equal- a non-zero a constant of the coefficients solv- inequality exist coefficients constraint GCD there constant term of the that variables. is This substitution resulting x=–8cr-4y -z-l constraints –7a–2y+3z=3 –24cr-7y+ y=cr+3p 18=–20–1 Figure llz= 10 –3a–2/3+z=l z=3cr+26+l – 21p 2ff+b =-1 + 112 = 10 first to in key observation P attempt of equatity that We do not and lower lower constraints 2.3 Inequality been equality the process can we report (such constraint the than one has solutions. elimination ply an a pair bound equality of made 6), on xk cases If neither and makes have very 2.3.1 use slip special two O, where a~ > x~ and at the all inequalities each psi; example, ~z=v so- moduced of upp;r if ~,cv a~x, P’ ~ O is a; bound a,z, ~ O is a upper bound contains: have There to is staudard the This can if and only there to P. important is Motzkin related solutions solutions if Fourier- directly of integer be very integer P’ More lack of integer of P’ not solution P. [DE73]. hand, existence to to P’ the rules Tightening the in verifying that P’ solutions. is an integer solution for Xk to more if and variable elim- and that arising true, can there in a new we report involves practice, problem problem disprove the is not confirm integer of solutions solutions but not tests of the are decisive, fact the that crack the we then only between integer the two upper bound jla~ bound between Our is to must ~t=v second The a,x, first is an upper ~ O and ~,cv a~x, of these is a lower bound. Resolving a multiple (j+ the is greater exist l)la~a~ a j I and yet and there — la~ I and since jla~a~ the is the the bound is lower of bound Therefore, and lower between this, the that is a multiple I + “k. upper distance such the lower bound than upper upper of Iak a~ I between must between the the and the upper (~ +l)laka~l If where them. there a~ 1. Since distance exist case as possible, of a~ it is at least the a multiple the bound is upper and of “k a~ must them. adaptation produce of Fourier-Motzkln a second inequality in combining each to produce 0 > a~. exist is less than than lower solutions the apart bound, laka~ I – la~ I – “k. a test tests as far doesn’t lower maximum in forms. inequalities the and is a multiple exact apply there upper of [a~a~ I between Consider of I“k a~ I between ak it is at most we have an integer is a multiple are a multiple greater we ap- if there bounds. bounds Since has existence exact, that not has in- method guarantees lower no this that original integer are only lower P that does pair a new variable problem P“, not which involve of upper and x~ lower elimination contains and the bounds every result of on Xk in P inequality ~ bound these of x~ produces: If there integer is an integer value an integer When are the only through by –aj and ak respectively gives: of z~ solution the same, if there as an ezact Multiplying of integer an- details Consider on existence x + 2y ~ 5, the variable, Fourier-Motzkln have through The in terms elimination we by to variable does constraints. rednndant prove if the test that of these can and solution constraints eliminations. that For problem problem out we constraints, moblem elimination reports This to cannot the of the if it one produces test and exists. most cases only If this that, of the it if and an adaptation that z L. and above) on z~. is a rational a rational of tight constraint Fourier-Motzkln most emxcti solutions most with attempt In solutions. solution involve bound There a than 3X + 2zJ ~ equality If the we apply in is at solutions. solutions devised and are If we find efficiently z + 2y ~ O and involves variable, integer teger the in- (e. g., has no solutions. pairs that 0). if we find dealing two another more contradictory integer [DE73] integer problem appropriate (e. g., given ination integer disprove is redundant). problem it the 3X + 2y constraints constraint that elim- constraints one constraints for for eliminate If the equality to see if anv 3X + 5y s Therefore, methods checking other that as 6 ~ with to our W bile all check contradict equality constraints. them revert directly with inequalities replace once first 3X + 5y ~ 2 and deal inequality first We constraints contradiction, also is used eliminated. constraints We variable constraints following have to (as shown on Zk in P, the The of Fourier-Motzkln do this by checking if there is an integer P’ that contains all inequalities to a new problem P. by combining I of elimination We lutions solution –31cl I 1: Example is the ination. set of integer there aO ~ O (where 6 points we know solution there for the lower ao is greater P. upper is an l’” if the to be than zero and P’ 1, elimination constraints in P“). to coefficients of z~ are The inequalities P“ if and is referred bound coefficients only P’ to This to be exact. Xk produce by solution to the bound to be exact for described solution If all is maranteed coefficients P“, that be an integer is an integer is also guaranteed non-unit will eiirninatzors. elimination to extends to P. of z~ are —1. or all the solution that having of the In these form cases, test takes advantage have to consider both P’ the Omega What if integer ger P’ integer has solutions? solution solutions Then to of the fact we know it is tightly P, that P’ we don’t and P“. but P“ that if nestled of upper and lower bounds. Let coefficient of x k in any inequality. teger solution exists, there must ~,ev a,z, >0 on X, such that does there not have is an inte- between some pair m be the most negative We know that if an inexist some lower bound p,! 2: Eliminations P’ and P“ We produce integer each lower For ~ in the a new range problem equality bound O to Pj ~, ~v a, z, ~ O on z~ in turn. [(lmakl–lml containing –ak)/lmlJ, the of P and the constraint: akzk = E j+ negative coefficient consider P Since ,cV–{k} teger If we find an integer solution to some lower bound just considered = Pj, report we the we change in P the to ~ to We range not find the lower 7x ~ 9y – 8. The This 9y – 8 + j the integer means for all disprove O < to range so we of in- lower bound, 11x > 3 – 13y +j = ~ [“’-~~-”] = any PJ and we have on z, so we report the existence next PA most have solutions, Pj j for we J in the on to the consider solutions bounds to we move j in = —11. unable P, 3 – 13y. all in Figure 2. Since P’ has we consider each lower does not, z is in example = 5. None of these have 7X ~ 9y – 8 + 6 to P. we are still for as shown the lower bound P A 7x solutions 11x problem. performed P“ of O S j < [*J add a restriction —a, x, exist ence of integer solutions to original If we don’t find solutions for any Pj, but bound in turn. We first consider we produce constraints solutions unnormalized comb mat mn 98>0 190y> 15 62~0 175 ~ 190y upper bound 121x <231 – 143y 77X ~ 99y + 66 49c < 63y + 42 77x < 147– 91y lower bound (33143y)+ 100< 121z (21 – 91y)+ 60< 77x (63y - 56)+36 <490 (99y - 8S) + 60< 77x Figure We consider unnormalized combination 1982 (1 190y + 45 ~ o 98~0 235 ~ 190y upper bound 121z <231 – 143y 77X < 99y + 66 49z < 63y +42 77X < 147– 91y lower bound 33– 143y < 121C 21 – 91y < 77X 63y – 56< 49c 99y – 88< 77X that 9. We do exhausted P has no integer solutions. The and steps expensive and check if we can now solutions to P. If so, solutions to P. Otherwise, bound on Z,$. If we have we report that 2.3.2 disprove we report existence there we move exhausted no solution Choosing the that on the of integer are to the lower no integer next lower bounds ficult to are performed arise variable to eliminate to choose straints bounds. If we are forced we choose a variable sible. We expect arise in the used, we illustrate very hard for if any, of typical (and an example perform test show the and as close inexact to zero as poswill programs. to problem. of) performed force by the the techniques the Omega Consider the Omega test to to arise the type 2.4 Implement and time. We report Equalities rithms only represent Once decide are no exact to eliminate smaller. Rearranging bounds on T give.: eliminations z since we the coefficients the inequalities can perform. bounds We ables of x are (slightly) to be upper and lower or no occur Omega test is gained several ideas on an unbounded involving variables algorithmic our running here. represented as vectors so that the no rational all the equality upper variables of algo- number bounds. constraints that We refer Performing variable have simply to variables. remain. We repeat lower such vari- Motzkin deletes this from no Fourier- it. We delete all constraints and then check if that has variables sit- in practice. improved integers; variables. unbounded these to be used. any constraints unbounded no unbounded 2 time While expect with is crafted with needs for only take experience we used are eliminated we check have test test elimination til 7 to deal scheme with can Details inequalities as unbounded additional that of those need of a 12 MIPS coefficients. not more substantially Omega we have of the for dealing until Omega The at ion a problem, There and many in practice. at ion some test we do to wait that on on problems value of nightmares tricks although be dif- implementation Omega methods have the ideas coefficients. inequalities absolute on better our to this all. the frequently will about test work them situations milliseconds possibility, In implementing the limitations 4.5 are possible: to the nightmares eliminations only complicated designed Also, process, 3 constraints, A decision reductions, takes expect in practice. in this is a frightening uations nightmare steps designed a small non-exact coefficients few, Omega To demonstrate on with that analysis An 2.3.3 to this was We do not nightmares variables perform an exact elimination if possible, and the variable that minimizes the number of conof upper and lower resulting from the combination try appear example to perform Worse proportional We example this frequently test workstation exists. which to resolve. Omega in this However, steps the on Xk, performed expensive. all the involving produced process nn- Next, keys we and normalize straints that were have tag two only if both negative is the the they ing of the of the keys and in a hash compute straint keys hash them into any a table for must keys of where new keys them we examine eliminate. If perform the straints from the data constraints and much key. are and use a as the This we would produced P“ differ Integer the P’ in their in creating programming types (adding and For conditional and same inequality as the value example, 4 imme- Simplification we deleting two then this the if we like dependence we can section, the the it constants in methods [LT88, loop program share protected HP90] bounds 5; and previous invariant variablen information to decide following about that the there may are we can handle the above problem would program (involving generate the the following variables be of the remainder accommodate operations, something variables in terms of in (possibly the that does not simplified n): integer appear to way, as being P into V such values. one describe that For problem there example, {O the values process ~ a ~ problem variables a set including new in ~’ values a, the Omega test more will along for the if asked {a = the simplification problems. ~ a ~ 6b} while process For example, protecting to 8 may {a a produces: ~ a} a=6rv} {l<a; a=6a-1} {2~a; a=6a-2} {3~a; a=6rr-3} produce reducing a the than with values to simplify 10tJ + 25c; produce be results Instead, not variables), For example, problem The V “. of the appropriate of ~’. can described. of programming protecting Also, integer and V a produces just the terms {20 division c in V that simplification {O~a; integer this programming V in those test used simpli@ variables if In a ~ 13} z 3; a = 5ff}. 2+n=i’ can only than 1<2,2’<n also test protecting for calculating integer while {5b We Omega integer a set programming while given variables the loop- z, t’ and the Omega of the integer 5b} an deczdes problem. When of to P with complicated are of V from symbolic constants by adding them as additional to the integer programming problem. For exam- programming not results program: suggested, values results more methods have involving the Actually, little = a[i] authors The simply {2~a~5}. [WT90] tondo a[i+nl variables. a < adapt as input a designation solution to sirnplzjication. is given possible test programming how for and problems 6 < Ifil fori=l test P simplifying allows to an integer we describe Omega Since [LC90]. no Problems to be used or more analysis to be able in the use Integer 2, the Omega in Section is an integer had but x, of is a solution allow new add elimination. term, there con- we copy and As described problems. max functions even constraint of e. variable into P“) these assignments of n, we would no flow ple, the we elimination, elimination and constant symbolic and ‘odm to see After which Otherwise, dependence handle min a if —ma+c+~a, subscripts us to properly As add allows we return exact by Fourier-Motzkin only work in we define n all the value of e. Similarly, *=1 constraint constraints, to decide an (for this, constraints C+b’ () ‘= oppos- variable. and problem). involved Nonlinear and value not hash we check one solutions, in place current To handle inequality we enter or tight than perform structures of the some the ,=1 variables we can not problem the elimination constraints 3 integer. add Programming Next, P’ a and con- diately. to m is a positive variable ,=1 opposing normalized, no multi-variable have e ‘ivm ,=1 are We the have constraints, more if we found system ‘+5”’ () constraint. involve expression ss by record- computing constraint an as op- assigned. (hash Assume can be expressed of the this time for contradictory per to to assign are of normalizing normalization, the it easy on their only negation constraints constraints time process table opposing based if coefficients method that ‘= are el previously hash recognized. is a Keys Constraint the Our redundant, constraints know the makes As in constant In the if which keys. us to check pairs into on previously in a program in the ez if and refer been appears they of a constraint expected keys based so that keys, constraint We constraints. to be unique). is designed ing e2. have keys if and term. key in e] are the in constraint key as an index guaranteed the in constant table constraint constant variables time variables of a constraint opposing a hash equal and variables to constraints last hash to con- a constraint of the their key of assign do this the key have in only since coefficients positive, of the coefficients assigned modified the and We constraint only and constraints them. constraints negation posing on differ coefficients to been The based constraint; the keys normalized. unique all constraint the multiple problem 4.1 All Complicated the Omega tem of constraints than one, the constraints, ones (nor Three of the as simple. problem it does redundant Changes required quick If the current ables, check to a sys- any Using This simplification poses. If I VI is greater involve contain 5 redundant 5.1 contradictory simple, other is not are: problem P involves if there the only are integer protected simplification. P as one One problem that they When performing an inexact Fourier-Motzkin elimina- pilers and know the simplify P“ and all the Pj for all lower bounds (not stopping when an integer solution is first veriif simplifying a system fied). This could be expensive many inexact this occur in practice will dependency ● never tion on a protected The always) that the ables. We tected One way make 3L ables All If g >1, constraint variables In order must our for of the protected to eliminate a new only is 1 (after this sub- (Section 2.2) variable sible constraint. variable a, add were recursively the For we report substitutions explore those. ing the current the gcd of the i for non-protected problem protected = max(-j k we also variables report made while all be treated In the as our applications found simplification than producing multiple and and as much infer a sindistance for zero deter- If coupled the choose dis- problem one two or three or positive), for the and protected posand following result. The l< disadvantage variables (that with found for wildcards simplification, to be more i,j) = O do to -1 do . . . k,i+j) to: Ai+AI <10 o< Aj <10 l< Aj+Az+Ak Ai + 2A] <20 (Redundant) Al=o is that We should first sign(Aj”) we have to solv- as wildcards) have direction is completely dist antes -10)-i . . . = a(l, wildcards has additional ,-10) = max(-j, the simplify with problem dependence 5). forl=Oto5do normalization). of the approach described above, we could refuse to perform inexact reductions while performing simone plification. The advantage of this is that we only report the simplified than involving simplify (negative, (such pair for the orig- As a modification simplified greater dependence subproblems than vectors forj=Oto20do transform problem. Simplification 4.3 the information set of constraints sign we to m)z, will a simplification, involving the variable a(l, When we way directions is unprotected. Otherwise, the example array or who’s unprotected, generate for Any We distance a better constraints constraints process. and signs more simplified variable. ap&opriate variable). direction dependence from distance the be two der)endence dependence dependence, is uncoupled this variable vari- the the prob- dependence is always use the We scan by uncoupled repeat (for between of the may than dist ante efficiently that mined with describes as possible variables unprotected the . a, mod new constraint so that in These anal- enumeration for the to the system we can gle dependence a dependence exists value in dependence a dependence information variables. protected x( te v the pro- programming (along the conditions it accurately vectors. dependence decision both variable down simplified contained determine t ante integer loop to define problem Alternatively, of a pro- log. techniques or eliminates The describ- surrounding this dependence a new shared dependence as when (as introduce the to and the conflict- the of loops the if any constraints simplify to to short-cut we take each Mur71] to determine calls need GKT91]). method, in we is com- [W0182] in which to be competitive, determining and tools, iterations number be able see [BC86, is typically vari- performed involving ga = this references). ysis method to elimination L is the and dist antes; an assume protected methods standard simplifies Eliminating gcd (we only substitutions only we create Given g be the in the constraint: inal O, let in a substitution a substitution that equalities. involves result If g = 1, we use our to find (where the In vector [K MC72, is to variables. we vector occur. vectors describe variables. standard are recorded involve if dist ante reads/writes then where elimination pur- methods methods. direction ing distance is normalized). will variable. stitutions = constraint use our exact involves a,z, constraint This process ● change a situation several to us. analysis structuring between equality us to program relation lem elimina- require for distance decision dependence dependence In from variables. non-protected O, the If g = could in an certain ~,c” of the constraint. ● performed constraint variable This elimination protecting so simple coefficients ● have not not equality variable. a non-exact could were Fourier-Motzkin used occurred dependence the believe arising be and ‘yes/no” other references. perform perform We do not for the problems some only data example, analysis. We we eliminations. have that ing cedure tion, involves can some direction with are direction ● technique Dependence data vari- of P and if solutions describe vectors test are changes simplification We pairs). changes so, report may not to the Omega The simplification a proble~ IV I variables. involving simplified any from does is s~mplify although 4.2 ● results test unprotect Al, and = 1. Considering the consider sign(Aj) we l< useful results. 9 Az<lO; l< sign(Aj) = O gives: AZ+Ak = O and We would then and simplify vector unprotect Ai the problem, of (=, (since we know obtaining —9 s sign(Ai) Ak, 2000 = 1) I I I t o t% or a direction A <, *, =). Returning to consideration of sign(Aj) = 1 produces: 1000 –95Ais9 –9 s Ak –9<A2+Ak –18 ~ Ai+2Ak Recursively analyzing produces (<, direction <,*, =). testing, 5.2 Summarizing requiring BK89, to obtain 1890 that HK90, an accurate that might be We do this the array summary the assignment convex tions only Details The two It for and those Omega be used regions; exact test to easily if one the into by [BK89, to determine able test test to could and affect then not did The Omega bounds of test when integer can are our interchanging programming appropriate non-rectangular loops. to perform this is The use described 6 [WO191]. We handle and symbolic tion vectors normally 1, 3, 4, and to all include several versions contrived grams. different source constant the LU The subscripts tool to several of our own the pro- psecs 209 psecs 485 psecs 143 ~secs 220 provide we decided that it would analysis of array pairs a (4) and a(5)) array be that are 10 groups it represents dusty individual in Figure pair. to copy by diagonal 4 x copying pair, the required build loop triplely indices. “worst-case code We exam- (excluding array pairs from all point is the timing to time time the and to scan constraints to the All spread 2 x are plotted time (the were time points. to analyze dependence graph, subscripts describe randomly time. direction that on a time = 8 x copying required appropriate array result overlapping copying time programs in a somewhat times out at analysis is the total the results the resnlts vs. copying the problem). dependence the deep, 3. Each are drawn calculate time all real-world four a good To present +1/2psec lines analysis almost FORTRAN independent fashion, graph of analysis time just (CHOLSKY) arrays). on array that nested of 3 conpled deck ~secs benchmark loops and of index add and NAS complicated involving that results The spe- of the more are shown time, make psecs GMTRY program to time 329 ~secs a single The 95%tile 946 perturbed more required 420 analyzing required pair Shown psecs confident machine log/log tiny, test benchmark we do not (e.g., for array a 12 CPU. 269 psecs arrays Our pairs benchmark: time 469 same array 3100, R2000 per input of the loop. 295 psecs subscripted tested benchmark with average (an reads between NAS loop vari- performed a Decstation time NASA the BTRIx third for two a MIPS on the MXM treatment decomposition, and NAS Since arrays, them. vectors distributed as several arrays. for index to include files 6 of direction NAS algorithms, as well 2 and bounds of direc- this NASA tool sets applied the decomposition, of wavefront use of index treatment We 7 of tiny in loop exact compressed tiny). tiny examples), misleading have the by in Wolfe’s compute to the 5 and test max expressions and Cholesky Programs extensive and (aa opposed (which cial min Omega constants, generated programs suite the on on scan as induction one common test in the code, ple” implemented results FORTRAN feel Performance have Omega are avoiding were dependence at least cases both dependence between based is substantially by [A191]. We input #7:vPENTA loop Ths substitution CHOLSKY #5: to determine forward or to such compute Program #1: required not programs could time those (thus and the below (such regions the array) workstation Bounds be used here; subscripts optimizations share timed analyze to convert an MIPS #4: Loop reported the is a dependence that #3: Determining figures and We did of be merged. 5.3 Performance the recognition We affected regions those 150 Test scanning time location test of another. merge be used approximate 100 (~sec) 3: Omega Standard dependence when Omega analysis by hand. solu- be changed. the in while bounds). to have is a subset Omega HK90]) the by limited will work, region Omega regions used more use the is not actually detected an affected included not so variables gives array Figure represented. be With to can are accurately check however, affected that and the problem time W_ I con- constants, summary 75 copy vari- appropriate of the The loop expressions, symbolic 50 complex I problem all protecting simplified can how as described subscript “-’‘1 ~~, t statement. and adding while locations’ intersect. is unclear and /’ 100 of an array programming index locations The locations I por- Omega assignment array the of the as strides regions can bounds, statement. such of the an integer each problem, polyhedron. can a single this indices the call loop summary to characterize test constants, accurate References the symbolic for to analyze. a procedure up 250 seen by by time (psecs) and example use affected the (<,=,*,=) difficult We ables for Ai be affected by setting Simplifying of may for straints sigu IJT91]. variables on. psecs we need involving and the =), most Array analysis, of an array [Tri85, for of (<,>,*, is the “A.”-” - anrdysis possibilities example In interprocedural tions the vectors This in our 500 (Redundant) vectors excluding and the loop the and the bounds dependence be- the array tween Across ing break-down about 1/2 7 pairs a range of for the test how time programs, time was was spent we found spent the by the dealing follow- Omega test: wit h inequality con- lyzed to remove effective each and any redundant normally). array pair constraints Based on was classified simple Any case that dist antes. the (this is not simplified section, The time constraints, as follows: apply the takes time. coupled dependence time, A case coefficient 10; Ai the but (for inequality at least < 10} constraints one constraint example, + 2Aj is the {O ~ AJ – the define Normalizing last a constraint makes (the because < Motzkin th~ pass time the we might we a substitution O(rnrLp) At the in the before worst-case required unbound. to eliminate least one variable last. and checking constraints for requires is only directly O(mn) expected, not exworst- is used). subproblems variable size of the takes except bound that performing constraints hashing Producing + Aj become the fact time, value substitutions of passes or redundant time case, and worst-case absolute the variables in each contradictory has a non-unit ~ 10; O ~ Ai that is eliminated largest from number of constraints 2.2 to eliminate log ICI) log [Cl constraint, unbound p where all the variables where region arises perform for of variables. in Section the of the bounds are accurate. number number O(mn with cost can eliminate O(nm) the the is on parts time algorithms by the methods the Eliminating does not involve time denote constraint This to pected convex taken constraint. cost- the bounds polynomial polynomial C is coefficient have time describe we use m to denote equality where regular A case where dependence dist antes are coupled, but afl inequality constraints have unit coefficients (for example, {Ai > O;Ai + Aj > O}). convex other bounds generaf then n to denote one results, the set of constraints describdistances for each array pair were ana- some and where In this elimination. To analyze our ing the dependence test, cases time describe Omega straints, about 1/4 of the time was spent on dealing with equality constraints, and I/4 of the time was spent examining simplified constraints to construct direction vectors. None of our test cases required inexact Fourier-Motzkin variable Polynomial We first resulting elimination subproblems takes from time Fourier- proportional to the produced. non-regular). complex A csse non-convex the region. one shown the where We only below lower inequality and bound encounted another of the constraints define two such one identical i loop 7.1 a During cases, except Special lOdo = a(j) end for flow/anti above are 4; –7 {Ai= test speed improvement. loop array dependence building < Aj [MHL91] found in the The < except the produce produce a hash 2-4 times adding or that this cases a dependence for the the larger problem dependence the have performance the pairs, faster overall time not for this it at be was cases percentage applied resolve in just exact They in Residue” over exactly by elimi- eliminations a process found 1/4 algorithm [Sho81] each constraint In a set x,. this variable property. On n2 + n constraints of [M HL91] of the unique Residue algorithm. found that in of the 1/4 Thus, Perfect the Club Banerjee’s using a dependence analysis than the Ome~a test would alnot Loop Hennessy encountered and could Generalized cases, Residue cases of with form test can constraints will take and be O(n3 ) time by the Loop Lam could their can eliminating be solved in is exact there after algorithm encounted in x, > elimination Hennessy Residue unique be applied [MHL91] be applied study of the benchmark. Maydan, for Maydan, can is of the form n variables, this that the Con- unique of constraints, Omega imagine be time. applied preserves most that speed could Per which (a higher can number Fourier-Motzkin they suggests the the builds of the performing ) worst-case property, the This test “Loop twice problem. increase and a set of constraints analyz- 1/3 can pairs. was in test to resolve that to variables O(mn2 Variable com- separately). redudant effort less than Omega has additional applied, Benchmark typically much code not any has problem be [MHL91] the pairs the ‘Single for many spent is difficult in much and problem spent of the However, problem was Thus, Test” that know can considered variables. constraint perform the Club iust those cases where zj+c, x,>c, ore> no overall subscripts problem. than We array iff [MHL91] to see if any other encountered. The to savings little do to test were checks with we to be applicable unbound takes the caching significant of scanning be added of computing be about cost problems. to copy cases that need Perfect test contradictory applies cases that Omega constraints for not “Acyclic those for memorization could therefore may problem. to significant cost would and of building the majority of array gorithm significantly lead the hit cost a dependence required use Memorization not or even to improve 9} do This if duplicate example { –4 < [MHL91] and build the cost resulting and found and the copying as large the trying time to the pairs nearly ing cases that bounds 2-4 times Lam However, would simple found satisfy the the in a contradiction), (SVPC) nating a problem, typical, We < 10; Ai a cache for Omega that Az + Aj and test. verifying cost for distances performance. Omega and distances =0}. better copying the the Hennessy to obtain key all 9;Aj Maydan, to the dependence AZ – Aj, ~ moduced straint” or involved checking putation. a(i-j) The (and solu~ions forj=Oto4do endf are If not is 2. not fori=ito normalization, variables that cases they tests GCD found could Lam found that be determined that be appIied tests. their 91% of the by constant Of the SVPC, in 86% remaining Acyclic of the cases tests and 9?Z0 of or Loop unique cases. for depen- improvements. The 11 Delta test [GKT91] works by searching dence distances propagating that that possible to easily cisely. In the mial test also into treats the pendence analysis without problem methods tests) can be solved In their study the of the can could RiCEPS, be tight any Since the Omega combination Acyclic and test time of the test, the that Perfect, SPEC Loop solve Delta test. benchmarks 97% requiring do equality Variable Residue be able than any effectively use one Per test and in be solved Constraint and the to solve more of them alone. Related work on Exact algorithm test modified the test, we Delta Constraint-Matrix how test efficiently Lu and algorithm it works Chen for appears can that to use of the terminate la~], sim- not dependence an analysis. expensive integer their [Tri85] resenting affected Triolet (22 used found array than array have The been Power combines the tightening, in interprocedural using test Banerjee’s and take no special tion except action Fourier-Motzkln for Motzkin when the is If the sample use of branch the existence need to [WT90] that due ination, where to Maydan, the simpler they Ancourt tests and Irigoin of have Both and should be used describe If so, then or P’ = ~ and to P’ of P’, so that solutions. and they is These has appropriate for to use for In we is exact,. with respect p.sezdo-linear solutions con- iff P constraints determining the P ~ P’, redundant pseudo-linear are difficult from reduction. elimination integer problem. are redundant If so, since add ~ is results. differs is an exact they Since a separate constraints not la~a~ I – better in P that then P’ a: ~ P difference P“. gives ~ the P the to be at least ~ P. t~at in produce a problem use force P“ P’ if P’ they – ak + 1 for using generate constraint loop bounds. determining has appear the How- existence solutions. they recent report dependence The methods is not (based used clear that the whole the work fully described, described out integer in is not but many programming high ana- [A191]). the basic in Section of simply is fast a very to in constraints that and 5.1, [A191] cases, efficient. techniques cost a production in their implementation a noticeable amount mentions is used described elimination parallelization project to that inccurs PIPS PIPS pseudo-linear point using analysis sinces the elimination the not variable also state that does not take suggest are They state on similar how handled. on variable appears ceptable Hen- [IJT91] Fourier-Motzkin dependence (that for is system). ac- They dependence testing of time compared with process. or disprove not found and the Tseng suggest variable instead in the use of 9 Source A C language available elim- rectory situations to be accurate. [A191] to verify consider they – Iajl check constraints They elimina- [MHL91] Fourier-Motzkin a it is solutions. also see if those P. Fourier-Motzkin They methods they a sample so- Wolfe Lam that the are constraint Maydan, to verify (they far). Hennessy expense are known thus by integral, methods solutions this by are over but attempting the constraints they lyze It [WT90] conservative. of the other to determine is not bound of integer implement and solution and test, being used There iterate inequalities, bounds P“, actually to framework and- Tseng an inexact as possibly that elimination, except laka~l of P’sand some that rep- variable elimination. performing result elimination the GCD variable nessy and Lam [M HL91] if none use They use back substitution lution. for use in dependence and when lower than check words, A expensive methods by Wolfe Generalized Fourier- to flag be Fourier-Motzkin described described their introducing problem. loops They P“ and to do not of integer regions). of on analysis. to simpler data bv the of integer P’. to our respect ever, rep- our upper they useful comfor producing an inexact to conservative integer method techniques techniques implementations elimination ysis. longer affected SeveraJ regions Fourier-Motzkln to 28 times resenting Fourier-Motzkin haneasily test. by a set of linear performing straints piler. Triolet for could performance into existence as opposed If programming However, they elimination to use them the the known clear for use in a production any inexact how equaJ other The and describe described is similar to [LC90] Their key differences techniques GCD space constraints programming. special Generalized for more exactly in practice. describe prohibitively fail an iter- is as follows: although useful They makes integer not any useful between Dependence [Wa188] for The constraints, handle problem any test, problems describe constraints with Constraint-Matrix plex ours. work over inequalities. of Analysis The their in Section for iterating of linear with pseudo-linear Rather, 8 do disprove (effecby not unclear percent the They Thev to as MIV Omega that that a set an inte- described algorithm. or approx- by the problem Single it should efficiently can any and or project, concept bounds by overlaps work dling When polynomial expect our They de- tests. tive) loop described use Banerjee’s achieves by the time without the as a sin- refer found in- Since Therefore, they between in polyno- algorithms they solved has significant the to simplify, (the constraints, be solved to what EISPACK, work problem test. in polynomial and cases MIV that resorting tests, it automatically Delta to exponential (i.e., LINPACK of of the resorting imate and problem, effects space converting analysis 4) so as to determine pre- ation constraints. dependence programming propagation and equality (and equality eliminate problem determine MIV it efficiently variable ger programming it distances of Motzkin then of making can use of solving constraints constraints test gle integer test algorithm determine inequality equality the dependence the and intent their without will determined, the other by a combination tightening Omega easily with where distance time) be determine cases a dependence Omega can information Fourier- 12 for code availability implementation anonymous pub/omega. of f tp from the Omega f tp. test is freely cs. umd. edu in di- 10 Conclusions Conservative cious dependence for forming the SIMD Also, lems that analysis loop have that only adequate FORTRAN program used to determine efficiently direction and distance easier transformation made bounds Omega prob- problems and build For example, loop program it can in work manner other uses. and that for deter- [A191], and considers Thanks to everyone who Z. Li and P. Yew. Some results on exact data deIn D. Gelernter, A. Nicolau, and pendence anrdysis. D. Padua, editors, Languages and Compders for Parallel Computing. The MIT Press, 1990. [LYZ89] Z. Li, P. Yew, and C. Zhu. Data dependence analysis on multi-dimensional array references. In Proceedings International Conference on Superoj the 1989 ACM computing, June 1989. [MHL91] D. E. Maydan, J. L. Hennessy, and M. S. Lam. Efficient and exact data dependence analysis. In A Ckf Conference on PTogTamming Language SIGPLAN’91 Design and Implementation, 1991. [Mur71] Y. Muraoka. Parallelism EzposuTe and Ezplottatton in Programs. PhD thesis, Dept. of Computer Science, University of Illinois at Urbana-Champaign, February 1971. [Pug91] William Pugh. Uniform methods for loop optimization. In 1991 International Conjer-ence on Supercomputing, Cologne, Germany, June 1991. [Sh081] R. Shostak. Deciding linear inequalities loop residues. Journal oj the A CM, October 1981. [Tri85] R. Triolet. structuring of Computer Champaign, [wa188] D. Wallace. Dependence of multi-dimensional array references. In Proceedings oj the Second Intemtational Conference on Supercomputing, St. Male, France, July 1988. [WC1182] M. loop [Pug91]. gave and me the feedback anonymous on this work, who [A191] Corirme Ancourt and Pranqois Irigoin. Scanning hedra with do loops. In PPOPP ’91, 1991. poly- [AK87] J. R. Allen and K. Kennedy. Automatic translation of Fortran programs to vector form. ACM Transactions on Programming Languages and Systems, 9(4):491– 542, October 1987. detailed Wolfe comments. References [AH83] Dependence Analysis jor Subscripted J. R. Allen. Variables and Its Application to Program Transjormatio ns. PhD thesis, Rice University, April 1983. [Ban88] U. Banerjee. ing. Kluwer [BC86] M. Burke and R. Cytron. InterProcedural dependence analysis and parallelization. In em Proceedings of the SIGPLAN ’86 Symposium on Compiler Construction, Palo Alto, CA, Jtiy 1986. [BK89] V. Balasundaram and K. Kennedy. A technique for summarizing data access and its use in parallelism enhancing transformations. In SIGPLA N Conference on Pvogmmming Language Design and Implementation, ’89, June 1989. [DE73] [GKT91] G .B. Dantzig rmd B.C. Eaves. Fourier- Motzkin nation and its dual. Journal of Combinator-ial 14:288-297,1973. (A), G. Goff, Kenn Kennedy, tical dependence testing. je,ence on Programming mentation, [HK90] Dependence Analysts jor SupercomputAcademic Publishers, Boston, MA, 1988. University 1982. elirnTheory Kennedy. Experience of array side effects. with inIn Super- 13 Optimizing Supercompilers jor PhD thesis, Dept. of Computer of Illinois at Urbana-Champaign, Super- Science, October [W0189] Michael Wolfe. Optimtmng Supercompilers jor computers. Pitman Publishing, London, 1989. [wcd91] Michael Wolfe. The tiny loop restructuring research tool. In Proc of 1991 International Conference on Parallel P~ocess8ng, 1991. [WT90] Michael Wolfe and Chau-Wen Tseng. The power test for data dependence. Te&lcal Report CS/E 90-015, Oregon Graduate Institute, August 1990. 1991. Paul Havlak and Ken terprocedural analysis computing ’90, 1990. em. by computing 28(4):769--779, InterProcedural analysis for program rewith Parafrase. CSRD Rpt. 538, Dept. Science, University of Illinois at UrbanaDecember 1985. J. Wolfe. comput and Chau- Wen Tseng. PracIn ACM SIGPLAN’91 ConLanguage Design and Imple- An [LY90] we referee provided Michael Trio- A. Lichnewsky and F. Thomasset. Introducing symbolic problem solving techniques in the dependence testing phases of a vectorizer. In Proceedings o-f the Second International Conference on Supercomputing, St. Male, France, July 1988. Acknowledgements especially R6mi [LT88] it can be analysis and L. Lu and M. Chen. Subdomain dependence test for massive parallelism. In Proceedings o.f Supercomputing ‘9o, New York, NY, November 1990. dependence be used interchange use of it in a uniform about for several Jouvelot, Au- [LC90] of prob- how Pierre Computing, D. Ku&, Y. Muraoka, and S. Chen. On the number of operations simultaneously executable in FORTRANIEEE like programs and their resulting speedup. Transactions on Computers, 1972. tools. discussed as well demands I.tigoin, for Parallel [KMC72] is a encountered programming information vectors, test PranSois and Compilers Symbolic deparallelizing Workshop on Iet. Semanticaf interprocedural parallelizatiorc overview of the pips project. In ICS ’91, 1991. dependence also for the We have after extensive transformations for but of integer to describe tools. loop 11 [IJT91] in typical data transformation simplification concept. the performing code, is an exciting have task. analysis M. Haghighat and C. Polychronopoulos. pendence analysis for high performance compilers. In Proceedings oj the Third Languages gust 1990. such encountered us that for lems mining present than convinced sophisticated much demanding transformations interchange method is not Performing It Trans- use of massively more undergone difficult be effica- efficient is a much more practical in vectorizing more make may compilers. FORTRAN. studies and methods vectorizing have and substantially Our fast computers skewing dusty-deck of so as to programs as look analysis demands programs parallel [HP90] Super-