Fitting Piecewise Linear Functions Using Particle Swarm Optimization

Pavee Siriruk 1,*

1 Department of Industrial Engineering, Suranaree University of Technology, Nakhon Ratchasima, Thailand. E-mail: pavee@g.sut.ac.th
* Corresponding author

Abstract

The problem of determining a piecewise linear model for two-dimensional data is encountered by researchers in countless fields of scientific study; examples arise in the analysis of two-dimensional digital curves, in reliability, and in applied mathematics. In solving such problems, we are typically constrained by a lack of prior knowledge of the shape of the curve, which makes fitting a piecewise linear curve to a given set of data points a useful technique. Moreover, any two-dimensional continuous curve can be approximated arbitrarily closely by a piecewise linear function. In fitting a piecewise linear model, the number of segments and the knot locations may be unknown. Several techniques can be employed, such as genetic algorithms and least squares error minimization, but not all of them can guarantee convergence to a near-optimal solution. In this paper, we introduce a method that employs particle swarm optimization as the primary model fitting tool. We approximate a curve by optimizing the number of segments as well as the knot locations, with fixed initial and final points. Experimental results revealing the performance of the proposed algorithm are also presented.

Keywords: Optimization, Particle Swarm Optimization, Piecewise Linear Approximation

Introduction

One problem that invariably arises in determining the functional relationship between certain input and output variables is fitting a piecewise linear curve to a given set of data points. Such a fit is of great use when prior knowledge of the shape of the curve to be fitted does not exist (Kundu and Ubhaya, 2001). As a result, the problem has received considerable attention from researchers in various application fields, including engineering, chemometrics, and materials science (Dunham, 1986; Pittman and Murthy, 2000). Alternatively, the same problem can be viewed as approximating an arbitrary two-dimensional continuous curve with a piecewise linear curve. Examples of study areas in which this problem is encountered are shape analysis (Dunham, 1986) and the approximation of the expected cost function during generator outages (Siriruk and Valenzuela, 2011).

A large number of techniques have been proposed for constructing piecewise linear curves; however, not all of them can guarantee convergence to a near-optimal solution. Cantoni (1971) proposed an algorithm that determines the slopes and breakpoints minimizing the integral of the weighted squared error over the approximation interval. Later, Pittman and Murthy (2000) introduced a method in which genetic algorithms are employed to optimize the number and location of the pieces.
Kundu and Ubhaya (2001) proposed an algorithm that optimizes a continuous piecewise linear fit to a given set of data points by minimizing a weighted least squares error.

Particle swarm optimization (PSO) was first introduced by James Kennedy and Russ Eberhart in 1995. PSO is a simple algorithm that appears to be effective for optimizing a wide range of functions (Kennedy and Eberhart, 1995). In this work, we introduce a new method that employs particle swarm optimization as the primary model fitting tool. For piecewise linear functions, we optimize the number of segments as well as the knot locations in the model.

The Problem Statement

We assume a set of data points

    D = {(x_i, y_i) : 1 ≤ i ≤ N},  (x_i, y_i) ∈ ℝ²,  1 ≤ N ≤ ∞,

where the values (x_i, y_i) are related through an unknown function f such that y_i = f(x_i). It is also assumed that x_1 < x_2 < ... < x_N. Our objective is to fit a model f̂(x) to the given set of data points. The following additional assumptions apply to the model f̂(x).

•   f̂(x) is an h*-piecewise linear function, where h* is a positive integer representing the number of segments.
•   Let {L_1, L_2, ..., L_h*} denote the approximated slopes of the curve f̂(x), where the successive line segments L_1, L_2, ..., L_h* of the curve are listed from left to right.
•   Let E_j and E_{j+1} be the left and right end points of L_j (the knot locations), where E_j = (u_j, v_j), 1 ≤ j ≤ h* + 1, with u_1 = x_1 and u_{h*+1} = x_N.

The objective of this paper is to employ PSO to fit a function whose curve shape is unknown by minimizing the sum of squared errors (SSE) between the original data points and the data points generated by f̂(x).

Algorithm Description

This section describes how PSO can be used to fit an optimal piecewise linear function. One iteration of the algorithm consists of choosing knot locations using PSO, constructing the model f̂(x), and calculating the evaluation function. The process continues until the stopping conditions are met.

If the starting point (x_1, y_1) is not the origin (0, 0), we shift the starting point to the origin and shift the other data points accordingly. The following pseudo-code describes the shift:

    FOR i = 1 TO N
        x_i = x_i − x_1
        y_i = y_i − f(x_1)
    END FOR

Notice that the new data ranges for x and y are [0, x_N − x_1] and [0, f(x_N) − f(x_1)], respectively. Once the starting point is at the origin, we select a set of knot locations (E_1, E_2, ..., E_{h*+1}) from which the approximated slope of each segment is calculated. Since it is assumed that E_1 = x_1 and E_{h*+1} = x_N, the PSO described below only needs to choose the knot locations E_2 to E_h*, where E_1 < E_2 < ... < E_{h*+1}.
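For concreteness, the shift can be implemented in a few lines. The sketch below is our illustration rather than part of the original algorithm listing; the array names xs and ys are assumptions, and the data are assumed to be sorted by x so that xs[0] = x_1 and ys[0] = f(x_1).

    import numpy as np

    def shift_to_origin(xs, ys):
        # Translate every point so that the first data point (x_1, y_1)
        # lands at the origin, as in the pseudo-code above.
        return xs - xs[0], ys - ys[0]

After the call xs, ys = shift_to_origin(xs, ys), the data occupy the shifted ranges noted above.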
Particle Swarm Optimization

PSO is an iterative process that evaluates the solutions represented by the particle positions and adjusts the particle velocities based on prior knowledge. Each particle i maintains its current position, A_i = (a_{i,1}, a_{i,2}, ..., a_{i,h*+1}), which is a set of knot locations. Each particle also maintains a velocity, V_i = (v_{i,1}, v_{i,2}, ..., v_{i,h*+1}), and the best position it has found so far, P_i = (p_{i,1}, p_{i,2}, ..., p_{i,h*+1}). A population of S particles is initialized by randomly generating the vectors A and V. The evaluation function is then used to evaluate the fitness of each particle. When a particle is moved, its new position and velocity are calculated from A_i, V_i, P_i, and the best P vector in the neighborhood, denoted G_i = (g_{i,1}, g_{i,2}, ..., g_{i,h*+1}). The new velocity vector is calculated using (1):

    v_{i,j} = K[v_{i,j} + c_1 · rnd() · (p_{i,j} − a_{i,j}) + c_2 · rnd() · (g_{i,j} − a_{i,j})]
    for j = 1, 2, ..., h* + 1,                                          (1)

where K = 2 / |2 − φ − √(φ² − 4φ)|, φ = c_1 + c_2, and φ > 4.

Note that the values of c_1 and c_2 are both set to 2.05, which are generally accepted values (Carlisle and Dozier, 2001). The new A_i vector is then calculated as

    a_{i,j} = a_{i,j} + v_{i,j}.                                        (2)

After this computation, the new A_i vector must be sorted in ascending order (a_{i,1} < a_{i,2} < ... < a_{i,h*+1}).
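A minimal sketch of the update in (1) and (2) follows. It is our illustration, not the author's code; the names pos, vel, pbest, gbest, and rng are assumptions, holding one particle's knot locations, velocity, personal best, neighborhood best, and a NumPy random generator, respectively. With c_1 = c_2 = 2.05, we have φ = 4.1 and a constriction factor K ≈ 0.7298.

    import numpy as np

    C1 = C2 = 2.05                 # generally accepted values (Carlisle and Dozier, 2001)
    PHI = C1 + C2                  # phi = c_1 + c_2 = 4.1 > 4
    K = 2.0 / abs(2.0 - PHI - np.sqrt(PHI**2 - 4.0 * PHI))   # constriction factor, ~0.7298

    def pso_step(pos, vel, pbest, gbest, rng):
        # One application of (1) and (2) to a single particle whose
        # arrays all have shape (h* + 1,).
        r1 = rng.random(pos.shape)     # rnd(), drawn independently per coordinate
        r2 = rng.random(pos.shape)
        vel = K * (vel + C1 * r1 * (pbest - pos) + C2 * r2 * (gbest - pos))   # eq. (1)
        pos = np.sort(pos + vel)       # eq. (2), then keep the knots in ascending order
        return pos, vel

Here rng would be, for example, np.random.default_rng().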
Constructing f̂(x)

Once the knot locations are selected, the process of fitting a piecewise linear curve to the set of data points can begin. The main steps in constructing f̂(x) are described below.

1) The first breakpoint is set at the first knot location (E_1).
2) A new breakpoint is obtained by moving to the next knot location.
3) All data points between the current breakpoint and the previous breakpoint are used to calculate the approximated slope using linear regression.
4) Steps 2) and 3) are repeated, starting from the last breakpoint found so far.
5) The new model f̂(x) is obtained once the last knot location has been reached.

All slopes in the new model connect to each other, as depicted in Figure 1. A set of approximated slopes, {L_1, L_2, ..., L_h*}, is obtained by applying the above procedure. The new model f̂(x) can be written as

    f̂(x_i) = f̂(x_{i−1}) + L_k · (x_i − x_{i−1})
    for E_k ≤ x_i ≤ E_{k+1} and i ≠ 1,                                  (3)

where k = 1, ..., h* and f̂(x_1) = f(x_1). Note that the value of x_i in (3) is not the value shifted to the origin but the initial x_i value (i.e., prior to shifting).

The Evaluation Function

As mentioned earlier, our objective is to find the new model f̂(x) that minimizes the SSE, computed as

    SSE = Σ_{i=1}^{N} [f(x_i) − f̂(x_i)]².                              (4)

The algorithm input is therefore a set of knot locations (E_1, E_2, ..., E_{h*+1}); after computing the slopes and constructing f̂(x), the algorithm output is the sum of squared errors computed by (4).

The process of choosing knot locations, constructing the model f̂(x), and calculating the evaluation function continues until the stopping conditions are met. We then choose the P vector with the lowest fitness value.
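The construction and evaluation can be sketched compactly as below. This is our reading of steps 1)-5) together with (3) and (4), not the author's implementation; in particular, we assume each segment's slope is fitted by least squares with the segment's left end pinned to the previous segment's right end, so that successive segments connect as in Figure 1.

    import numpy as np

    def evaluate(knots, xs, ys):
        # knots: ascending knot x-locations (E_1, ..., E_{h*+1}), with
        # knots[0] = xs[0] and knots[-1] = xs[-1]; xs, ys: the data points.
        f_hat = np.empty_like(ys, dtype=float)
        y_left = ys[0]                          # f_hat(x_1) = f(x_1)
        for k in range(len(knots) - 1):
            mask = (xs >= knots[k]) & (xs <= knots[k + 1])   # points on segment k
            dx = xs[mask] - knots[k]
            dy = ys[mask] - y_left
            # Least squares slope through the fixed left end point:
            # L_k = sum(dx*dy) / sum(dx*dx).
            L_k = dx.dot(dy) / dx.dot(dx) if dx.dot(dx) > 0 else 0.0
            f_hat[mask] = y_left + L_k * dx     # eq. (3) on this segment
            y_left = y_left + L_k * (knots[k + 1] - knots[k])   # right end feeds the next segment
        return np.sum((ys - f_hat) ** 2)        # SSE, eq. (4)

Under these assumptions, this routine plays the role of the fitness function: it maps a particle's knot vector to the SSE of the resulting piecewise linear model.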
Numerical Examples

Methodology

In an effort to standardize the results across datasets, a standard set of parameters is used. The population sizes considered are

    S ∈ {2, 4, 8, 16, 32, 64, 128}.

All of the datasets are run with each of these population sizes, although the best-performing size varies from dataset to dataset. The algorithm runs until the number of function evaluations shown in Tables 1 and 2 is exceeded.

Data

The proposed algorithm is applied to two datasets in order to test its efficiency.

Dataset 1

Dataset 1 is borrowed from Pittman and Murthy (2000). It is generated from a piecewise linear function with unequally spaced knots, of which the middle piece is considerably longer than either of the end pieces. The dataset includes normally distributed noise ε; in other words, dataset 1 can be expressed as y = f(x) + ε. Thus, the shape of this curve cannot be predicted. The parameters and generating function of dataset 1 are described in Table 1.

Dataset 2

Dataset 2 is borrowed from Siriruk and Valenzuela (2011). It is a recursive function that acts as a piecewise linear function with a large number of slopes. However, dataset 2 does not include random noise; unlike dataset 1, it yields an exact set of data points. We can therefore test the PSO algorithm on approximating a curve that has no variation but does have a large number of slopes. The parameters and the generating function of dataset 2 are described in Table 2, the latter of which uses the data in Table 3.

Results

For dataset 1, our PSO algorithm found the best solution when S = 16. This population size returned an SSE of 445.6074 with 20,000 function evaluations. The fitted curve is depicted in Figure 2.

For dataset 2, the best solution was found with population size S = 4 and number of segments h* = 6 (Table 5). With 4,000 function evaluations, the SSE was 78.33. As Figure 3 shows, PSO provided a reasonable curve when the number of slopes (h*) was 6: the number of slopes can be reduced to 6 with only a small SSE remaining, and the curve given by PSO differs little from the original curve.

Conclusions

In this paper, we have employed particle swarm optimization to fit a function of unknown curve shape by minimizing the sum of squared errors. The proposed method has shown a good fit on a function for which we have no prior knowledge of the shape of the curve. Moreover, the algorithm performs well in approximating any two-dimensional continuous curve with a piecewise linear curve, with accuracy mostly intact.

References

Cantoni, A. (1971). Optimal curve fitting with piecewise linear functions. IEEE Trans. Comput., C-20(1):59-67.

Carlisle, A. and Dozier, G. (2001). An off-the-shelf PSO. Proceedings of the 2001 Workshop on Particle Swarm Optimization; April 2001; Indianapolis, Indiana, USA, p. 1-6.

Dunham, J.G. (1986). Optimum uniform piecewise linear approximation of planar curves. IEEE Trans. Pattern Anal. Mach. Intell., PAMI-8(1):67-75.

Kennedy, J. and Eberhart, R. (1995). Particle swarm optimization. Proceedings of the IEEE International Conference on Neural Networks; November 27-December 1, 1995; Perth, Australia, p. 1942-1948.

Kundu, S. and Ubhaya, V.A. (2001). Fitting a least squares piecewise linear continuous curve in two dimensions. Comput. Math. Appl., 41(7-8):1033-1041.

Pittman, J. and Murthy, C.A. (2000). Fitting optimal piecewise linear functions using genetic algorithms. IEEE Trans. Pattern Anal. Mach. Intell., 22(7):701-718.

Siriruk, P. and Valenzuela, J. (2011). Cournot equilibrium considering unit outages and fuel cost uncertainty. IEEE Trans. Power Syst., 26(2):747-754.

Figure 1. An outline of the evaluation function

Figure 2. Results of dataset 1 (data and fitted model f̂(x))

Figure 3. Results of dataset 2 (original f(x) and fitted model f̂(x))

Table 1. Experimental dataset 1

Parameters:
    N = 791
    h* = 3
    # function evaluations = 20,000
    Noise: Nor(0, 1)

Generating function:
    f(x) = 3.88x + 10.44,    −3.0 ≤ x ≤ −1.29
         = −1.74x + 3.14,    −1.29 ≤ x ≤ 3.70
         = 3.77x − 17.25,    3.70 ≤ x ≤ 4.90

Table 2. Experimental dataset 2

Parameters:
    N = 321
    h* ∈ {4, 5, 6}
    # function evaluations = 4,000
    Noise: none

Generating function (using the data in Table 3): for j = 1, 2, ..., 7,
    f_j(x) = a_j c_j x + b_j f_{j+1}(x),                                 for 0 ≤ x ≤ x_jmax
           = a_j {c_j x_jmax + f_{j+1}(x − x_jmax)} + b_j f_{j+1}(x),    for x > x_jmax
and for j = 8,
    f_8(x) = c_8 x.

Table 3. Data for the generating function in dataset 2

    j     a_j     b_j     c_j     x_jmax
    1     0.99    0.01    1       50
    2     0.99    0.01    1       50
    3     0.9     0.1     8       20
    4     0.98    0.02    14      76
    5     0.96    0.04    14.8    100
    6     0.98    0.02    14.8    12
    7     0.98    0.02    14.8    12
    8     -       -       100     ∞

Table 5. Results of dataset 2

    h*    S      SSE
    4     128    479
    5     16     178.35
    6     4      78.33