Appendix S3. Topology Problem: GA model

The concept of the genetic algorithm (GA) was first established by Holland (1975), inspired by evolution theory. GAs operate on the principle of “survival of the fittest” and are robust search methods that reproduce the mechanisms of natural selection and population genetics according to the biological processes of survival and adaptation (Goldberg, 1989). The GA process starts with a randomly generated population coded in binary, integer, or real numbers. Each population is composed of a set of chromosomes (or strings), each representing a solution to the problem and having a pre-specified number of genes equal to the number of decision variables of the problem being solved. Each gene represents one decision variable and is composed of a pre-specified number of bits depending on the coding used. The next step is to calculate the fitness value of each chromosome from the objective function of the problem. A new population is then reproduced from the individuals of the old population in such a way that the fitter chromosomes have a greater chance of surviving into the new population. The populations are updated in this way until some convergence criterion is met. A standard GA requires the definition of the chromosomes, the coding, the fitness function, and the reproduction mechanism, the last of which is represented by three operators: selection, crossover, and mutation (Goldberg, 2000).

Coding and Chromosomes

The topology problem defined by Equations 9 and 10 is solved here using two GA models with different codings and, therefore, different chromosome structures. In the first model, referred to as bcGA, a trial solution is represented by a chromosome whose number of genes equals the number of all potential well locations (n); each gene is a single binary bit, giving a chromosome length of n. Each trial solution represents a possible operational well configuration, in which the operating well locations are those whose corresponding bit is one.
This representation, however, does not automatically satisfy the topological constraint of Equation 10, which requires that the number of operating wells, that is, the number of bits containing 1, be equal to m. This point is addressed later in the discussion of the fitness function evaluation. In the second model, referred to as icGA, a chromosome with a length equal to the number of operating wells (m) is used, with each gene containing an integer in the range [1, n] that represents the location of the corresponding operational well. This representation has the advantage of explicitly satisfying the constraint of Equation 10. Two typical chromosomes of binary and integer type, representing the same solution of three operating wells out of ten potential wells, are shown in Figure S1.

(a) 0 1 0 0 1 0 1 0 0 0
(b) 2 5 7

Figure S1. Representation of (a) a string of the bcGA-LP-LP model and (b) a string of the icGA-LP-LP model for an arbitrary problem with n = 10 and m = 3.

Fitness Function

The fitness function measures how good a trial solution is with respect to the problem objective. For the problem under consideration, the lower the operational cost of a trial solution, the better and, therefore, the fitter the solution is. The operational cost of a trial solution is given by the objective function of Equation 9, in which the pumping cost and the corresponding water level drawdown are obtained by solving the operational problem with the two-stage LP-LP approach. It should be noted that the well drilling and pump installation costs are neglected in comparison to the pumping cost. The fitness function of the proposed GA can, therefore, simply be defined as:

$$\text{fitness} = \frac{1}{f} \tag{18}$$

where f is the optimal objective function value of the operational problem, provided that the trial GA solution conforms to the problem constraint defined by Equation 10.
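The two codings and the fitness of Equation 18 can be illustrated with a minimal Python sketch. The function names are hypothetical, and `toy_cost` is only a stand-in for the optimal cost f that the source obtains from the two-stage LP-LP solution; it is not the authors' model.

```python
from typing import Callable, List, Sequence


def decode_binary(chromosome: Sequence[int]) -> List[int]:
    """bcGA coding: return the 1-based well locations whose bit equals 1."""
    return [i + 1 for i, bit in enumerate(chromosome) if bit == 1]


def decode_integer(chromosome: Sequence[int]) -> List[int]:
    """icGA coding: each gene already stores a well location in [1, n]."""
    return list(chromosome)


def fitness_eq18(wells: Sequence[int],
                 operational_cost: Callable[[Sequence[int]], float]) -> float:
    """Equation 18: fitness = 1 / f, with f the cost of the well configuration."""
    return 1.0 / operational_cost(wells)


def toy_cost(wells: Sequence[int]) -> float:
    """Hypothetical placeholder for the LP-LP optimal cost (assumption)."""
    return float(sum(wells))


# The two chromosomes of Figure S1 (n = 10, m = 3).
binary_chromosome = [0, 1, 0, 0, 1, 0, 1, 0, 0, 0]
integer_chromosome = [2, 5, 7]

wells_from_binary = decode_binary(binary_chromosome)
wells_from_integer = decode_integer(integer_chromosome)
```

Both chromosomes of Figure S1 decode to the same set of operating wells, so they receive the same fitness under any cost function that depends only on the well configuration.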
When integer coding with the corresponding chromosome structure is used, the constraint of Equation 10 is identically satisfied and Equation 18 can be used directly to calculate the fitness of each trial solution. With binary coding, however, there is no guarantee that a trial solution, whether generated randomly at the start of the computation or reproduced by the GA operators during the evolution process, will satisfy the problem constraint. To force the search towards the feasible region of the search space, a penalty method is used in which infeasible solutions, represented by chromosomes whose number of 1s is not equal to m, are penalized as follows:

$$\text{fitness} = \frac{1}{\text{penal} \times \left| m - \text{sum} \right| + f} \tag{19}$$

where sum is the number of 1s in the chromosome and penal is a positive number large enough that any infeasible solution has a greater penalized cost than any feasible solution. The proper value of the penalty factor is often found by trial and error.

Selection

Of the GA operators, selection is the one most responsible for realizing the concept of “survival of the fittest”. In the selection process, some individuals of the current population are selected as parents of the next generation. The selection is carried out in such a way that fitter individuals have a greater chance of progressing to the next generation as parents. Different selection methods have been proposed and used in GAs. The “roulette wheel” was the first selection method proposed, but it was later found to be prone to premature convergence. Here, binary tournament selection is used, which is reported to be the most efficient. In this method, two chromosomes are randomly selected and the fitter of the two is chosen as one of the parents. The process is repeated until the required number of parents, equal to the population size, has been generated.

Crossover

Crossover is used to reproduce children from the selected parents and to form the next generation of individuals.
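Before turning to the details of crossover, the penalized fitness and the binary tournament of the Selection section can be sketched in Python. This is a minimal illustration under stated assumptions: Equation 19 is read as fitness = 1 / (penal × |m − sum| + f), the cost function is passed in as a hypothetical callable standing in for the LP-LP solution, and the two tournament candidates are drawn without replacement, which the source does not specify.

```python
import random
from typing import Callable, List, Sequence


def penalized_fitness(chromosome: Sequence[int], m: int,
                      cost: Callable[[Sequence[int]], float],
                      penal: float) -> float:
    """Assumed reading of Equation 19: 1 / (penal * |m - sum| + f)."""
    ones = sum(chromosome)  # number of 1s in the binary chromosome
    return 1.0 / (penal * abs(m - ones) + cost(chromosome))


def binary_tournament(population: Sequence[Sequence[int]],
                      fitnesses: Sequence[float],
                      rng: random.Random) -> List[List[int]]:
    """Select as many parents as there are individuals in the population.

    Each parent is the fitter of two distinct, randomly drawn chromosomes
    (drawing without replacement is an assumption of this sketch).
    """
    parents = []
    for _ in range(len(population)):
        a, b = rng.sample(range(len(population)), 2)
        winner = a if fitnesses[a] >= fitnesses[b] else b
        parents.append(list(population[winner]))
    return parents
```

With a feasible chromosome the penalty term vanishes and the expression reduces to Equation 18. Because the two candidates are distinct, the least fit individual can never win a tournament in this sketch.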
Different types of crossover have been proposed and used, such as one-point, two-point, and scattered crossover. Crossover helps the GA to find good genes and propagate them into the next generation. Here, one-point crossover is used with both codings. With either binary or integer coding, one-point crossover selects a random position in the chromosome and exchanges the parents' genes after that position to form the children.

Mutation

Mutation is used to avoid premature convergence of the GA and to escape local optima by randomly changing gene values. Mutation helps the GA to search areas that are not represented by the initial solutions or by those created by the selection and crossover operators. Here, a uniform bitwise mutation is used in which each bit is mutated with a pre-defined probability. In bcGA, with binary coding, mutation is carried out simply by flipping the bit value, while in icGA, with integer coding, the gene value is replaced by a number selected randomly from the potential gene values.

Elitism

To ensure that the information of the best chromosome is not lost during the evolution process, elitism is often used, in which the best chromosome of the population is copied directly into the next generation after all the above-mentioned operators have been applied. The flowchart of the proposed GA-LP-LP model is depicted in Figure S2.

[Flowchart: creation of the initial population; iterative H-LP and Q-LP stages for each well configuration, starting from random pumping rates; fitness evaluation with a penalty for violating the well-number constraint; reproduction by selection, crossover, mutation, and elitism; termination check.]

Figure S2. Flowchart of the proposed GA-LP-LP model.
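The crossover, mutation, and elitism operators described in this appendix can be sketched in Python as follows. This is an illustrative sketch, not the authors' implementation: all function names are hypothetical, and replacing the worst individual of the new generation with the elite chromosome is an assumption, since the source only states that the best chromosome is copied into the next generation.

```python
import random
from typing import List, Sequence, Tuple


def one_point_crossover(p1: Sequence[int], p2: Sequence[int],
                        rng: random.Random) -> Tuple[List[int], List[int]]:
    """Swap the genes after a random cut point (chromosome length >= 2)."""
    point = rng.randrange(1, len(p1))
    child1 = list(p1[:point]) + list(p2[point:])
    child2 = list(p2[:point]) + list(p1[point:])
    return child1, child2


def mutate_binary(chromosome: Sequence[int], p_mut: float,
                  rng: random.Random) -> List[int]:
    """bcGA: flip each bit independently with probability p_mut."""
    return [1 - g if rng.random() < p_mut else g for g in chromosome]


def mutate_integer(chromosome: Sequence[int], p_mut: float, n: int,
                   rng: random.Random) -> List[int]:
    """icGA: with probability p_mut, replace a gene by a random value in [1, n]."""
    return [rng.randint(1, n) if rng.random() < p_mut else g for g in chromosome]


def apply_elitism(new_population: List[List[int]], new_fitnesses: List[float],
                  best: Sequence[int], best_fitness: float) -> None:
    """Copy the previous best chromosome over the worst new individual.

    Overwriting the worst individual is an assumption of this sketch.
    """
    worst = min(range(len(new_population)), key=new_fitnesses.__getitem__)
    new_population[worst] = list(best)
    new_fitnesses[worst] = best_fitness
```

Note that, with integer coding, random crossover and mutation of this kind can place the same well location in two genes; the source does not say how such cases are handled, so the sketch leaves them unchanged.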