Solving Large-Scale Fuzzy and Possibilistic Optimization Problems

Weldon A. Lodwick, K. Dave Jamison, and Katherine A. Bachman
University of Colorado at Denver, Department of Mathematics - Campus Box 170
P.O. Box 173364, Denver, Colorado 80217-3364 USA
weldon.lodwick@cudenver.edu
Telephone: 303-556-8462; Fax: 303-556-8550

Abstract

The semantic and algorithmic differences between fuzzy and possibilistic optimization methods are presented in the context of three methods for solving large fuzzy and possibilistic optimization problems. In particular, an optimization problem in radiation therapy with various orders of complexity, 1,000-55,000 constraints, possessing (i) soft constraints, (ii) fuzzy right-hand side values, and (iii) possibilistic right-hand side values, is used to illustrate the semantics and to test the performance of the three fuzzy and possibilistic optimization methods. We focus on the uncertainty in the right-hand side, which arises, in the context of the radiation therapy problem, from the fact that minimal/maximal radiation tolerances are target values rather than fixed real numbers. The results indicate that fuzzy/possibilistic optimization is a natural way to model various types of optimization under uncertainty problems and that large optimization problems can be solved efficiently.

Keywords: Fuzzy optimization, possibilistic optimization, surprise functions.

1. Introduction

Many of the hardest optimization problems are those that contain uncertainty, because the meanings of inequalities and optima must be defined in the context of the problem in question. Moreover, the complexity of uncertain optimization is formidable. Our research focuses on three approaches to fuzzy and possibilistic uncertainty optimization: (1) the fuzzy optimization of Tanaka, Okuda, and Asai [13] and Zimmermann [15], (2) the uncertainty/fuzzy optimization based on the surprise functions of Neumaier [12] and [11], and (3) the possibilistic optimization of Jamison and Lodwick [6]. The purpose of this research is to demonstrate that the use of fuzzy/possibilistic optimization to solve large-scale optimization problems is not only tractable but the most direct way to model problems with embedded fuzzy and possibilistic uncertainty. To this end we present the three approaches mentioned above to solve a large-scale optimization problem where the uncertainty lies in the right-hand side values. We assume that the reader is familiar with fuzzy set theory, possibility theory, and linear and nonlinear programming. What is novel is that we solve very large fuzzy/possibilistic optimization problems, perhaps the largest reported application to date. Secondly, we test a novel way to solve optimization under uncertainty (see [12]) and extend what has been done in [6] and [11]. Thirdly, we describe the differences between fuzzy and possibilistic optimization and illustrate these in examples and applications.

This paper is organized as follows. The first, introductory, section contains the discussion of the general problem of optimization under uncertainty and the application that we consider. The second section deals specifically with fuzzy and possibilistic optimization and the algorithms that will be used to solve the radiation therapy of tumors problem. The third section contains the exposition of the numerical experiments and their results. Conclusions are found in section four.

There is often confusion about fuzzy and possibilistic optimization. Fuzzy and possibilistic entities have different meanings/semantics.
Fuzzy and possibilistic quantities model different entities, and the associated solution methods are different, as we shall see below. Fuzzy entities, as is well known, are sets with non-sharp boundaries in which there is a transition between elements that belong and elements that do not belong to the set. Possibilistic entities are obtained from sets that are classical (crisp) sets, but the evidence associated with whether a particular element belongs to the (crisp) set or not is incomplete or hard to obtain. Decision making in the presence of fuzzy/possibilistic entities takes the following generic form (we use a tilde to denote a fuzzy set and a hat for a possibility distribution).

1. Fuzzy Decision Making: If the decisions are of the form "x is F̃ and G̃", that is, X̃ = {x ∈ F̃ and G̃}, find the optimal decision in X̃, that is,

   sup_x {F̃(x) ∩ G̃(x)}.

   Note that the decision space X is a crisp set.

2. Possibilistic Decision Making: If the decisions are of the form "ŷ is F and G", that is, Ŷ = {ŷ ∈ F and G}, find the optimal decision in Ŷ, that is,

   sup_x EA{U[F(x), G(x)]} = sup_x ∫_0^1 U{F(x(α)), G(x(α))} dα,

   where U{F(x), G(x)} represents the utility of the outcomes F(x) and G(x). For our method, we use the "expected average" (EA), whose definition and properties are found in [5].

Very simply, fuzzy decision making selects from a set of crisp elements while possibilistic decision making selects from a set of distributions. The underlying sets associated with fuzzy decision making are fuzzy, and one forms the decision space of crisp elements from operations (''and'' in the case of optimization, that is, constraints) on these fuzzy sets. The underlying sets associated with possibilistic decision making are crisp, and one forms the decision space of distributions from operations on crisp sets. Possibilistic distributions encapsulate the best estimate of the value of an entity given the available information. Fuzzy membership function values describe the degree to which an entity is that value. A possibility of one means that the value has the highest possibility, according to the distribution, of being the value of the entity. If the fuzzy membership value is one, then it is definitely the value. Thus the nature of decision making in the presence of fuzzy/possibilistic uncertainties is quite different in semantics and in optimization procedures, since fuzzy optimization optimizes over sets of numbers and possibilistic optimization optimizes over sets of distributions. A small numerical sketch of the fuzzy (sup-min) case follows.
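As an illustration of the fuzzy decision-making form above, the following sketch (Python) finds the crisp decision x that maximizes the minimum of two memberships by a simple grid search. The sup-min form is from the text; the specific triangular membership functions and the grid are illustrative assumptions, not data from the paper.

```python
import numpy as np

def tri(x, a, m, b):
    """Triangular membership function with support [a, b] and peak at m."""
    return np.maximum(np.minimum((x - a) / (m - a), (b - x) / (b - m)), 0.0)

# Hypothetical fuzzy sets F~ ("x should be about 4") and G~ ("x should be about 6")
F = lambda x: tri(x, 2.0, 4.0, 6.0)
G = lambda x: tri(x, 4.0, 6.0, 8.0)

# Fuzzy decision: sup_x min{F(x), G(x)} over a grid of crisp candidate decisions
xs = np.linspace(0.0, 10.0, 2001)
membership = np.minimum(F(xs), G(xs))   # "and" = intersection = pointwise min
best = np.argmax(membership)

print(f"best x = {xs[best]:.3f}, attained membership = {membership[best]:.3f}")
# Expected output: x = 5.0 with membership 0.5, where the two triangles cross.
```

The key point the sketch makes is that the optimization ranges over crisp candidate values x; the fuzziness enters only through the membership functions being intersected.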
This study considers three cases: (1) soft (fuzzy) inequalities, where the crisp right-hand side value is an aspiration, with the associated fuzzy optimization (flexible programming) methods of Tanaka and Zimmermann used to optimize; (2) fuzzy right-hand side values, where the inequality is crisp but the right-hand side value is fuzzy, with the associated fuzzy optimization using surprise functions; and (3) possibilistic right-hand side values, where the inequality is crisp but the right-hand side value is a possibilistic real number, with associated possibilistic optimization methods that use penalties on the weighted expected average of the constraint violations. Soft constraints mean that the crisp right-hand side value is a target. Fuzzy right-hand side values mean that the various values of the right-hand side, as defined by the fuzzy set, have different preferences as measured by the α-level. Possibilistic right-hand side values mean that the entity described by the right-hand side exists, but the research and empirical evidence do not support a single real-valued number; rather, they support a distribution of possible values with varying degrees of belief.

When an inequality means ''come as close as possible'' to the crisp right-hand side value, the inequality is called soft and it has non-sharp boundaries. Typically, the soft inequalities are satisfied to the highest degree by maximizing the common α-level attained by all of them simultaneously, that is, by maximizing the minimum degree of satisfaction. While the original method was to obtain this highest common level for all constraints simultaneously, there is no reason why one could not maximize a weighted sum of α-levels, one for each soft constraint, where different weights mean different levels of importance of attaining the target. In addition, if certain constraints were required to attain at least a minimum level, this could be obtained by adjusting the target or the α-levels associated with the particular constraint(s). This generalization is the essence of the surprise approach. Regardless, soft constraints are handled by flexible programming methods.

When the right-hand side values are fuzzy numbers, that is, the values described on the right-hand side have non-sharp boundaries, one tries to attain the highest level of feasibility in aggregate, as a sum of the α-levels of the constraints. To do this, surprise functions are used that penalize constraint violations dynamically within the range of tolerances specified by the membership function: the preferred values, with membership closest to one, are not penalized, and the least preferred values (nearest the outside of the support) are infinitely penalized. In between, of course, the penalties lie between zero and infinity. (A small illustrative sketch of such a surprise penalty follows this discussion; the precise form is given in Section 2.)

When the right-hand side values are represented by a possibility distribution, it means that the evidence at hand supports the values to the degrees given by the distribution. The entity described by the right-hand side exists. However, for whatever reason, the evidence as to what specific crisp value the entity attains is incomplete, and the distribution describes the best information available as to its value, measured by the level of confidence. In this case, something akin to recourse models in stochastic optimization, robust optimization, or mean/variance optimization is used, where constraint violations are allowed at a penalty and the objective cost includes the expected average value of the penalty.
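As a small illustration of the surprise idea for a fuzzy right-hand side, the sketch below (Python) uses the surprise form s(ξ) = (μ(ξ)^{-1} − 1)² given in Section 2 to show that the penalty is zero at the fully preferred value and grows without bound toward the edge of the support. The triangular tolerance and the sample dose values are illustrative assumptions.

```python
import numpy as np

def tri_membership(xi, a, m, b):
    """Triangular fuzzy right-hand side: support [a, b], fully preferred value m."""
    return max(min((xi - a) / (m - a), (b - xi) / (b - m)), 0.0)

def surprise(xi, a, m, b):
    """Surprise s(xi) = (mu(xi)^-1 - 1)^2: zero where mu = 1, infinite outside the support."""
    mu = tri_membership(xi, a, m, b)
    if mu == 0.0:
        return np.inf
    return (1.0 / mu - 1.0) ** 2

# Hypothetical tolerance "about 60, between 55 and 65" for one constraint row
for xi in [55.0, 57.5, 60.0, 62.5, 64.9, 66.0]:
    print(f"value {xi:5.1f} -> surprise {surprise(xi, 55.0, 60.0, 65.0):10.3f}")
```

Minimizing the sum of such surprises over all constraints trades off the constraints smoothly, rather than forcing a single common α-level on all of them.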
The Application Used in the Numerical Experiments

The problem of treating tumors with beams of particles is called the radiation therapy problem (RTP). Beams of particles, usually photons or electrons, are oriented at various angles and with varying intensities to deposit dose (energy per unit mass) in the tumor. The idea is to deposit as much dose as possible in the tumor while sparing normal tissue. The process begins with the patient's computed tomography (CT) scan. Each image in the CT is examined to identify and contour the tumor and normal structures. The image is subsequently vectorized. Likewise, candidate beams are discretized into beamlets, where each beamlet is the width of a CT pixel. A pixel is the mathematical entity or structure (a square in the two-dimensional case and a cube in three dimensions) that is used to represent a unit area or volume of the body at a particular location. For this study, we restrict ourselves to two-dimensional problems, so that the analysis is done on a series of two-dimensional images that cover the tumor.

In our experiments, we used only one image per tumor. Two tumors were considered: head and prostate. For each, head and prostate, the experiment consists of four pixel resolutions (64x64, 128x128, 256x256, and 512x512) and one set of 10 equally spaced angles. Since we constrain the dosage at each pixel, the complexity of the problem ranges from a maximum of 64^2 (= 4,096) up to 512^2 (= 262,144) potential constraints. However, since not all pixels are in the path of the radiation beams that hit the tumor, we set a priori the delivered dosages at these pixels to zero and remove them from our analysis. This corresponds to blocking the beam, which is always done in practice. The identification of a set of beam angles and weights that provide a lethal dose to the tumor cells while sparing healthy tissue, with a resulting dose distribution acceptable to and approved by the attendant oncologist, is called a treatment plan.

A discretized dose transfer matrix A^T (representing how one unit of radiation intensity in each beamlet is deposited in the pixels; for historical reasons we use a transpose to emphasize its origin as the discrete version of the inverse Radon transform), called here the attenuation matrix and specific to the patient's geometry, is formed, where the columns of A^T correspond to the beamlets and the rows represent pixels. A component of a column of A^T is non-zero if the corresponding beamlet intersects a pixel, in which case it is the positive fraction of the area of the intersection of the pixel with the beamlet (otherwise it is zero). The variables are vectors x that represent the beamlet intensities. There are a variety of ways of treating this problem without uncertainty. Pixels may be constrained individually or grouped into one constraint. Under idealized assumptions (see [2] or [4]), the problem without uncertainty is the deterministic linear program (LP) in standard form:

min z = c^T x
subject to: A^T x ≤ b
            0 ≤ x ≤ U.

In the RTP literature there is no agreement on what the objective function should be. For example, one finds the following objective functions: minimize total radiation, maximize minimum tumor dosage, minimize radiation to critical structure(s), minimize the probability of healthy tissue complication, maximize the probability of delivering a tumoricidal dose, or minimize maximum critical structure dose. We minimize total radiation dosage as our objective function in the applications. Typically, oncologists consider the RTP as one of coming as close as possible to values specified by the radiation oncologist, and this is the approach we use. The basic RTP translated into a mathematical programming problem is:

min z = c^T x
subject to:
  body dosage:            Bx ≤ b_body
  critical tissue dosage: C_i x ≤ c_i,  i = 1, ..., N
  min tumor dosage:       Tx ≥ t_min
  max tumor dosage:       Tx ≤ t_max
  0 ≤ x ≤ U,

where the rows of B are body pixels, the rows of C_i are critical tissue pixels, and the rows of T are tumor pixels, obtained from re-ordering the rows of the attenuation matrix associated with the patient's CT. Let

b = (b_body; c_1; ...; c_N; -t_min; t_max)  and  A = (B; C_1; ...; C_N; -T; T),

the blocks stacked row-wise in the same order; then the RTP is the LP

min z = c^T x
subject to: Ax ≤ b
            0 ≤ x ≤ U.
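For concreteness, the following sketch sets up and solves a toy instance of this crisp LP with scipy.optimize.linprog. The tiny attenuation-style matrix, dose tolerances, and bounds are invented for illustration and stand in for the patient-specific A and b described above.

```python
import numpy as np
from scipy.optimize import linprog

# Toy data: 4 pixels (2 body, 1 critical, 1 tumor) and 3 beamlets.
# Entries are fractions of each pixel's area covered by each beamlet (invented).
B = np.array([[0.2, 0.0, 0.3],    # body pixel 1
              [0.1, 0.4, 0.0]])   # body pixel 2
C = np.array([[0.0, 0.5, 0.2]])   # critical tissue pixel
T = np.array([[0.6, 0.3, 0.7]])   # tumor pixel

b_body, c_crit = np.array([30.0, 30.0]), np.array([20.0])
t_min, t_max = np.array([60.0]), np.array([70.0])

# Stack as A x <= b, writing Tx >= t_min as -Tx <= -t_min.
A = np.vstack([B, C, -T, T])
b = np.concatenate([b_body, c_crit, -t_min, t_max])

# Minimize total radiation intensity c^T x with beamlet bounds 0 <= x <= U.
c = np.ones(3)
U = 200.0
res = linprog(c, A_ub=A, b_ub=b, bounds=[(0.0, U)] * 3, method="highs")

print("beamlet intensities:", np.round(res.x, 2))
print("delivered tumor dose:", (T @ res.x)[0])
```

The stacking of B, C_i, -T, and T into a single inequality system mirrors the construction of A and b in the text; only the numbers are hypothetical.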
From this LP we obtain the following three optimization problems, corresponding to the flexible LP, surprise, and possibilistic approaches:

1.  min z = c^T x
    subject to: Ax ≲ b  (soft constraint)
                0 ≤ x ≤ U  (hard constraint)

2.  min z = c^T x
    subject to: Ax ≤ b̃  (fuzzy number right-hand side)
                0 ≤ x ≤ U  (hard constraint)

3.  min z = c^T x
    subject to: Ax − b̂ ≤ 0  (possibility distribution)
                0 ≤ x ≤ U  (hard constraint)

2. Fuzzy and Possibilistic Optimization

The two types of fuzzy optimization, (1) and (2) above, are distinct.

2.1 Fuzzy Optimization with Soft Constraints

The first approach (Tanaka and Zimmermann) is the oldest and its semantics are those of soft constraints, a relaxation of the meaning of ''less than or equal to.'' These two researchers independently implemented the ideas of [1] in the context of mathematical programming problems. The idea of [1] was that the constraint inequalities (and the objective function) are targets or goals. Thus, in the context of fuzzy uncertainty, the membership functions (resulting fuzzy sets) of all constraints (and the objective) are intersected. The fuzzy intersection is the ''and'' operation which, for membership functions, is the minimum. The optimization problem is thus to maximize the resulting function obtained by intersecting all constraint functions (and the objective function). Our formulation is:

min z = c^T x − α
subject to: α d_i + A_i x ≤ b_i + d_i  for each constraint row i
            0 ≤ x ≤ U,  0 ≤ α ≤ 1,

where d_i > 0 is the relaxation of the i-th right-hand side constraint. Here we have modified the Tanaka [13] and Zimmermann [15] approach and kept the original objective (minimization of total radiation in the context of the RTP).

2.2 Fuzzy Optimization - Fuzzy Right-Hand Side

The second approach (see [11] and [12]) is one in which the right-hand side values are fuzzy sets. Its translation to a mathematical programming problem is as follows. Each fuzzy constraint (Ax)_i ≤ b̃_i is translated into a fuzzy equality constraint (Ax)_i = ξ̃_i, where the membership function μ_i(ξ) of ξ̃_i is the possibility Pos_i(b̃_i ≥ ξ). Each membership function is translated into a surprise function by

s_i(ξ) = (μ_i(ξ)^{-1} − 1)^2,

and the contributions of all constraints are added to give the total surprise ∑_i s_i((Ax)_i), so that we have

min ∑_i s_i((Ax)_i)
subject to: 0 ≤ x ≤ U.

For triangular and trapezoidal fuzzy numbers, the surprise function is quadratic, smooth and convex. Hence, the optimization problem is tractable with standard optimization software, even for very large problems.

2.3 Possibilistic Optimization

From the point of view of [6], the RTP that incorporates the possibilistic right-hand sides is of the form Ax − b̂ ≤ 0, where the left-hand side is a possibilistic outcome and thus a possibilistic distribution. This means that the constraint set is a set of distributions, so one must take into account all the possible distributions. To do this, the RTP is translated into the following mathematical programming problem:

min c^T x + p_B EA{max(0, Bx − b̂_body)} + ∑_{i=1}^{N} p_{C_i} EA{max(0, C_i x − ĉ_i)} + p_T EA{|Tx − t̂|}
subject to: 0 ≤ x ≤ U,

where the p's are the penalty weights. Here the maximum function is replaced by the smooth approximation

max{0, x} → (1/2)(√(x² + ε) + x),  ε > 0 small,

and ''EA'' is the expected average given by

EA(f̂) = (1/2) ∫_0^1 { f̂⁺(α) + f̂⁻(α) } dα.

As it turns out (we prove this elsewhere), the integrals associated with the penalized method above have a closed-form functional expression for all trapezoidal and triangular possibilistic real numbers (they do not have to be numerically integrated), and this is what is used in the experiments.
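To make the possibilistic penalty concrete, the sketch below (Python, scipy.optimize.minimize) evaluates EA{max(0, Bx − b̂)} for triangular possibilistic right-hand sides using the smoothed max and the α-level endpoints, and minimizes the penalized objective on a toy geometry. All data values and penalty weights are invented, the EA integral is approximated on an α-grid rather than by the paper's closed-form expressions, and the treatment of the |Tx − t̂| term as two one-sided violations is an illustrative simplification.

```python
import numpy as np
from scipy.optimize import minimize

def smooth_max0(t, eps=1e-6):
    """Smooth approximation (1/2)(sqrt(t^2 + eps) + t) of max{0, t}."""
    return 0.5 * (np.sqrt(t * t + eps) + t)

def tri_alpha_cut(lo, peak, hi, alphas):
    """alpha-cut endpoints [b^-(alpha), b^+(alpha)] of a triangular possibilistic number."""
    return lo + alphas * (peak - lo), hi - alphas * (hi - peak)

def ea_violation(y, lo, peak, hi, alphas):
    """EA{max(0, y - b_hat)}, approximated on a uniform alpha-grid (paper uses closed forms)."""
    b_minus, b_plus = tri_alpha_cut(lo, peak, hi, alphas)
    integrand = 0.5 * (smooth_max0(y - b_plus) + smooth_max0(y - b_minus))
    return integrand.mean()   # ~ integral over [0, 1] on a uniform grid

# Toy data (invented): 2 body pixels, 1 tumor pixel, 3 beamlets, triangular tolerances.
B = np.array([[0.2, 0.0, 0.3], [0.1, 0.4, 0.0]])
T = np.array([[0.6, 0.3, 0.7]])
body_tol = [(25.0, 30.0, 35.0)] * 2          # (lo, peak, hi) for each body pixel
tumor_target = (58.0, 60.0, 62.0)            # possibilistic tumor dose target
p_body, p_tumor = 10.0, 100.0                # penalty weights (assumed)
alphas = np.linspace(0.0, 1.0, 101)

def objective(x):
    cost = x.sum()                           # total radiation c^T x with c = 1
    body = sum(ea_violation((B @ x)[i], *body_tol[i], alphas) for i in range(2))
    td = (T @ x)[0]
    lo, peak, hi = tumor_target
    # |Tx - t_hat| penalized via the two one-sided expected-average violations.
    tumor = ea_violation(td, lo, peak, hi, alphas) \
          + ea_violation(-td, -hi, -peak, -lo, alphas)
    return cost + p_body * body + p_tumor * tumor

res = minimize(objective, x0=np.full(3, 10.0), bounds=[(0.0, 200.0)] * 3)
print("beamlet intensities:", np.round(res.x, 2), " tumor dose:", round((T @ res.x)[0], 2))
```

In the actual experiments, the closed-form expressions for these expected-average integrals replace the numerical α-grid, which is part of what makes the large instances tractable.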
3. Numerical Experiments

Two sets of experiments were performed, a head tumor and a prostate tumor, each with a set of 10 angles. The complexity for the head, measured in number of constraints, was: (a) 64x64 - 977 constraints, (b) 128x128 - 3,669 constraints, (c) 256x256 - 14,159 constraints and (d) 512x512 - 55,720 constraints. The complexity for the prostate, measured in number of constraints, was: (a) 64x64 - 788 constraints, (b) 128x128 - 3,061 constraints, (c) 256x256 - 11,598 constraints and (d) 512x512 - 45,804 constraints. The execution times, measured in seconds of run-time, for the prostate example are listed below; the head example is similar. It is noted that we used un-optimized code in MATLAB and the MATLAB optimization toolbox.

TABLE 1: Prostate tumor run-times (seconds)

Resolution   Jamison & Lodwick   Zimmermann   Surprise
64x64                      113           34          1
128x128                  2,915          734          4
256x256                 46,386        3,919         48
512x512                not run    no memory      4,144

4. Conclusions

It is clear that large fuzzy optimization problems can be solved efficiently. One would expect a quadrupling of the time from one resolution to the next, at least for the linear programming approach of Zimmermann. There is no such consistent pattern of increase in the times. In fact, there is a jump in the time for the surprise algorithm of roughly ninety times from the 256x256 to the 512x512 problem. This is most likely due to memory allocation. We emphasize that the Jamison and Lodwick and surprise approaches are nonlinear programs, while the Zimmermann approach is a linear programming problem. Moreover, run-times depend on how the resources of the computer are being used at the time of execution. Thus, the times are significant only in relative terms, as measures of orders of magnitude. Regardless, it is clear that the surprise approach is orders of magnitude faster, and the quality of the solutions it obtains is superior for the radiation therapy problem. In fact, the quality of the solutions for all methods was very good, especially for the surprise method. It is, of course, clear that one would use a mathematical programming system like GAMS or TOMS for actual production code.

5. References

[1] R.E. Bellman and L.A. Zadeh, ''Decision-making in a fuzzy environment,'' Management Science, Serial B, 17:141-164, 1970.
[2] Y. Censor, M. Altschuler, and W. Powlis, ''A computational solution of the inverse problem in radiation therapy treatment planning,'' Applied Mathematics and Computation, 25:57-87, 1988.
[3] A.M. Cormack, ''Some early radiotherapy optimization work,'' International Journal of Imaging Systems and Technology, 6:2-5, 1995.
[4] A.M. Cormack and E.T. Quinto, ''The mathematics and physics of radiation dose planning using x-rays,'' Contemporary Mathematics, 113:41-55, 1990.
[5] K.D. Jamison and W.A. Lodwick, ''Minimizing unconstrained fuzzy functions,'' Fuzzy Sets and Systems, 103:457-464, 1999.
[6] K.D. Jamison and W.A. Lodwick, ''Fuzzy linear programming using penalty method,'' Fuzzy Sets and Systems, 119:97-110, 2001.
[7] K.D. Jamison and W.A. Lodwick, ''The construction of consistent possibility and necessity measures,'' Fuzzy Sets and Systems, 132(1):1-10, November 2002.
[8] W.A. Lodwick and K.D. Jamison, ''A computational method for fuzzy optimization,'' Chapter 19 in B. Ayyub and M. Gupta (editors), Uncertainty Analysis in Engineering and Sciences: Fuzzy Logic, Statistics, and Neural Network Approach, Kluwer Academic Publishers, 1997.
[9] W.A. Lodwick and K.D. Jamison, ''Interval methods and fuzzy optimization,'' International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 5(3):239-249, 1997.
[10] W.A. Lodwick, S. McCourt, F. Newman and S. Humphries, ''Optimization methods for radiation therapy plans,'' in C. Borgers and F. Natterer (editors), IMA Volumes in Mathematics and its Applications, Computational Radiology and Imaging: Therapy and Diagnosis, Springer-Verlag, 1998.
[11] W.A. Lodwick, A. Neumaier, and F. Newman, ''Optimization under uncertainty: methods and applications in radiation therapy,'' Proceedings of the 10th IEEE International Conference on Fuzzy Systems, 2001, 3:1219-1222.
[12] A. Neumaier, ''Fuzzy modeling in terms of surprise,'' Fuzzy Sets and Systems, 5(1):21-38, April 1, 2003.
[13] H. Tanaka, T. Okuda, and K. Asai, ''On fuzzy mathematical programming,'' Journal of Cybernetics, 3:37-46, 1974.
[14] L.A. Zadeh, ''Fuzzy sets as a basis for a theory of possibility,'' Fuzzy Sets and Systems, 1:3-28, 1978.
[15] H. Zimmermann, ''Description and optimization of fuzzy systems,'' International Journal of General Systems, 2:209-215, 1976.