CMS-UTEP-May30Report-KY06-003

Department of Defense
Hybrid Optimization Approach for
Solving Parameter Estimation Problems
Demonstration of High-Performance Computing
Reservoir Simulation Problems
Project EQM-KY6-003
Automated Parameter Estimation and
Sensitivity Analysis
Deliverable 5
PI: Hector Klie (UT-Austin)
E-mail: klie@ices.utexas.edu
Voice: (512) 475-8634
Fax: (512) 232-2445
Team Members:
Leticia Velázquez (UTEP)
Miguel Argáez (UTEP)
Carlos Quintero (UTEP)
Mary F. Wheeler (UT-Austin)
Abstract
We present the numerical performance of a hybrid optimization approach for solving automated
parameter estimation problems based on the coupling of the Simultaneous Perturbation
Stochastic Approximation (SPSA) and the Newton-Krylov Interior Point (NKIP) methods via
the generation of metamodels. The approach is tested on a suite of different problems that
illustrate its potential for addressing large-scale EQM inverse problems.
Keywords. SPSA, Newton-Krylov, interior point methods, global and local optimization,
metamodels, parameter estimation, inverse problems, groundwater flow.
1 Introduction
In this report we present the numerical performance of a hybrid optimization approach for
solving automated parameter estimation problems based on the coupling of the Simultaneous
Perturbation Stochastic Approximation (SPSA) and the globalized Newton-Krylov Interior-Point
(NKIP) algorithm [1] described in previous reports of this project [7, 8]. In brief,
the approach generates a surrogate model (also known as a metamodel or proxy) based on the
best approximate solution yielded by SPSA. We filter the corresponding sequence of points with a
user-specified radius. The set of points is then enriched with additional sample points, which
allows us to create an interpolated response surface in a neighborhood of the optimal solution.
Furthermore, this procedure provides first- and second-order derivative information.
At this stage, the NKIP algorithm can be applied to find a refined optimal solution. We
implement the hybrid optimization algorithm on four relatively small test cases, and
demonstrate its capabilities for performing parameter estimation on a two-phase flow
problem of interest to the PET-DoD using an HPC platform.
2 Hybrid Optimization
Finding a global optimal solution is a challenging task in many DoD environmental
applications that either estimate subsurface parameters or find an optimal management of
resources/operations on contaminated sites as stated in the first two deliverables of this EQM
Project [7, 8]. Many of these problems are computationally demanding and, therefore, have
motivated the need to develop novel optimization approaches. Hence, we have proposed a suite
of different optimization problems to evaluate the capabilities of combining both SPSA and
NKIP in searching for an optimal solution. The list of problems ranges from analytical
expressions to PDE-based problems. The Hybrid Optimization Scheme uses SPSA as a global
strategy and NKIP as a local strategy.
SPSA is an algorithm that does not depend on derivative information, and it is able to
find a good approximation to the solution using few function values. Its disadvantage is that
a good approximation, once found, may not satisfy some conditions and constraints
associated with the problem. NKIP's principal advantages are its fast rate of convergence to a
local optimum and its suitability for solving large-scale problems. It
finds local solutions that meet certain conditions and constraints that SPSA's solutions may
not necessarily satisfy. For our purposes, NKIP has the disadvantage of being a local strategy,
so its solution may not necessarily be the global solution. It also requires first- and
second-order derivative information, which can be expensive or even impossible to evaluate
(as occurs in most EQM applications, which mostly rely on legacy code). Our objective is to
combine these two strategies, taking advantage of the strengths of each. Figure 1 shows our hybrid
optimization scheme.
Figure 1: Hybrid Optimization Scheme.
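As a rough illustration of the derivative-free global step, the following Python sketch implements a basic SPSA iteration that uses two function evaluations per step. The function name `spsa_minimize`, the gain constants (`a`, `c`, `A`, `alpha`, `gamma`), and the Bernoulli +/-1 perturbation are assumptions made for this sketch; they are not taken from the project's implementation.

```python
import numpy as np

def spsa_minimize(f, x0, n_iter=200, a=0.1, c=0.1, A=10.0,
                  alpha=0.602, gamma=0.101, seed=0):
    """Minimal SPSA sketch. All gain constants are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    history = []  # every evaluated point and its function value
    for k in range(n_iter):
        a_k = a / (k + 1 + A) ** alpha   # step-size gain
        c_k = c / (k + 1) ** gamma       # perturbation size
        delta = rng.choice([-1.0, 1.0], size=x.shape)  # simultaneous perturbation
        x_plus, x_minus = x + c_k * delta, x - c_k * delta
        f_plus, f_minus = f(x_plus), f(x_minus)        # the only 2 evaluations
        history.append((x_plus, f_plus))
        history.append((x_minus, f_minus))
        # Stochastic gradient estimate built from the two evaluations.
        g_hat = (f_plus - f_minus) / (2.0 * c_k * delta)
        x = x - a_k * g_hat
    return x, history
```

The returned `history` plays the role of the sequence of search points that is later filtered to build the surrogate model; a call such as `spsa_minimize(f, x0, n_iter=2000)` would be repeated from several random starting points.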
3 Metamodeling Strategy
Most problems require experiments and/or simulations to evaluate the objective and constraint
functions in terms of the design variables. Optimization strategies frequently require thousands
or even millions of evaluations, and the associated cost and time requirements often render
this infeasible.
We can overcome the computational cost of the optimization by constructing analytical
expressions, known as surrogate models, that locally approximate the original simulation or
forward model with a limited amount of data. This approach is very appealing when the
simulation or forward model is based on the solution of a set of differential equations consisting
of hundreds of thousands to millions of elements and a significant number of time steps.
The accuracy of the surrogate model depends strongly on the number and location
of samples (as is generally the case with any approximating function).
The most popular surrogate models are provided by polynomial response surfaces,
kriging, artificial neural networks, and radial basis functions (see, e.g., [5]). In particular, the
hybrid optimization approach seems to provide more reliable results when radial basis
functions are used to create the surrogate model.
We find the surrogate model $f_s(x)$ using an interpolation method with the data $(x_k, f(x_k))$,
$k = 1,\dots,p$, provided by SPSA. In our test cases, we optimize the surrogate function

\[ f_s(x) = \sum_{j=1}^{m} w_j h_j(x), \]

where the multiquadric basis functions are given by

\[ h_j(x) = \sqrt{1 + \sum_{i=1}^{n} \frac{(x_i - c_{ij})^2}{r_{ij}^2}} \]

with real parameters $c_{ij}$, $r_{ij}$, $w_j$, for $i = 1,\dots,n$ and $j = 1,\dots,m$. We also consider the Gaussian basis functions defined by

\[ h_j(x) = \exp\left( -\sum_{i=1}^{n} \frac{(x_i - c_{ij})^2}{r_{ij}^2} \right). \]
A radial basis function (RBF) is typically parameterized by two sets of parameters: the
center c, which defines its position, and a second set of parameters that determines the shape
(width or form) of the RBF. In the case of a one-dimensional Gaussian function, this second set
of parameters is given by the standard deviation $\sqrt{r_{ij}^2/2}$. We use the Gaussian RBF to create
the surrogate model for Pinter's problem, Rastrigin's problem, and the small reservoir problem. The multiquadric
RBF is used for the Chandrasekhar and Bratu problems. Figure 2 shows the metamodel obtained for
Pinter's function.
Figure 2. Metamodel representation for Pinter's function (d=2).
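To make the construction concrete, the sketch below fits an interpolating RBF surrogate with either Gaussian or multiquadric basis functions. It assumes that the centers coincide with the sampled points (so m = p) and that the weights are obtained by solving the square interpolation system; the names `rbf_design` and `fit_rbf` and the choice of a single shared radius in the usage note are illustrative, not part of the report.

```python
import numpy as np

def rbf_design(X, C, r, kind="gaussian"):
    """Return H with H[k, j] = h_j(x_k).

    X: (p, n) evaluation points; C: (m, n) centers, so C[j, i] = c_ij;
    r: radii r_ij, given as a scalar or anything broadcastable to (m, n).
    """
    X = np.atleast_2d(np.asarray(X, dtype=float))
    C = np.asarray(C, dtype=float)
    r = np.broadcast_to(np.asarray(r, dtype=float), C.shape)
    # s[k, j] = sum_i (x_ki - c_ij)^2 / r_ij^2
    s = (((X[:, None, :] - C[None, :, :]) / r[None, :, :]) ** 2).sum(axis=2)
    if kind == "gaussian":
        return np.exp(-s)
    if kind == "multiquadric":
        return np.sqrt(1.0 + s)
    raise ValueError(f"unknown basis: {kind}")

def fit_rbf(X, y, r, kind="gaussian"):
    """Interpolate the data (x_k, y_k); assumes centers = data points."""
    C = np.array(X, dtype=float)
    H = rbf_design(C, C, r, kind)
    w = np.linalg.solve(H, np.asarray(y, dtype=float))  # weights w_j
    return lambda x: rbf_design(x, C, r, kind) @ w
```

For example, `fs = fit_rbf(X, y, r=0.7)` returns a callable surrogate that reproduces `y` at the sample points `X`. Because the basis functions are smooth, first and second derivatives of the surrogate can be derived analytically from the same weights.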
4 Numerical Experiments
We consider four cases: (1) Pinter's problem, (2) Rastrigin's problem, (3) the Chandrasekhar integral
equation (see [3, 6]), and (4) the nonlinear elliptic Bratu equation [4, 8, 9]. All of these cases
were presented in detail in the previous report [8]. For each case, we use five different
random initial points to start SPSA. We show an example for Pinter's problem in Figure 3.
Figure 3. Solution to Pinter's problem using SPSA with 4 different initial points
(indicated with a white circle).
Next, we filter the data produced by the SPSA search for the lowest function value and
construct a surrogate model using additional sampling. In this way, derivative information can
be obtained from the interpolating function that represents the surrogate model. This information is used
by the NKIP algorithm to find an optimal solution of the surrogate. This solution is then evaluated
with the original model (function) and compared. If the solution satisfies a given tolerance, we claim it is
the global solution. If it does not, but the solution is better than SPSA's solution, then
the point is added to the data to construct a new surrogate model. Otherwise, we restart SPSA using this
solution as a new initial point. The process is repeated until there is sufficient agreement
between the model and metamodel values at the optimal solution.
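The decision logic of this cycle can be sketched as a short driver routine. This is a skeleton under stated assumptions: `global_search`, `build_surrogate`, and `local_solve` are hypothetical callables standing in for SPSA (plus filtering), the RBF interpolation, and NKIP, respectively, and the acceptance test is taken to be the absolute difference between the model and metamodel values. The cap `max_cycles` is added only to guarantee termination.

```python
def hybrid_optimize(f, x0, global_search, build_surrogate, local_solve,
                    tol=1e-3, max_cycles=20):
    """Skeleton of the global/surrogate/local cycle (placeholder callables).

    global_search(f, x_start) -> (best point, list of (x, f(x)) pairs)
    build_surrogate(data)     -> callable metamodel
    local_solve(surrogate, x) -> minimizer of the metamodel near x
    """
    x_best, data = global_search(f, x0)
    f_best = f(x_best)
    for _ in range(max_cycles):
        surrogate = build_surrogate(data)
        x_loc = local_solve(surrogate, x_best)
        f_loc = f(x_loc)  # evaluate the original model at the local solution
        if abs(f_loc - surrogate(x_loc)) <= tol:
            return x_loc, f_loc, True  # model and metamodel agree
        if f_loc < f_best:
            # Better than the current best: enrich the data, rebuild surrogate.
            data = list(data) + [(x_loc, f_loc)]
            x_best, f_best = x_loc, f_loc
        else:
            # No improvement: restart the global search from this point.
            x_best, data = global_search(f, x_loc)
            f_best = f(x_best)
    return x_best, f_best, False  # agreement not reached within max_cycles
```

The third return value indicates whether the agreement criterion was met, which lets a caller distinguish an accepted solution from a best-effort one.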
In Figures 4-9, we plot the results obtained for the four test cases in terms of residual
reduction versus number of iterations. The blue and red lines represent the iterations performed by
SPSA and NKIP, respectively. Here the blue line represents the best SPSA search
from one random starting point. Each iteration of SPSA requires two evaluations of the
original function (simulator), while NKIP requires one function evaluation and the corresponding
first derivative of the surrogate model per iteration.
Each plot shows that the hybrid optimization approach worked efficiently, i.e., the
coupling was able to refine the solution. As an example, the function value obtained for
Pinter's problem using the hybrid approach is -1.926142, which is lower than the one reported
by running SPSA or NKIP alone (see [8]).
Figure 4. Convergence history for minimizing Pinter's function (n=2).
Figure 5. Convergence history for minimizing Rastrigin's function (n=50).
Figure 6. Convergence history of the hybrid optimization scheme for the Chandrasekhar
function for c=.9.
Figure 7. Iteration report of the hybrid optimization scheme for the Chandrasekhar function for
c = .9.
Figure 8. Convergence history of the hybrid optimization scheme for the Chandrasekhar function for
c = .999999.
Figure 9. Convergence history for the Bratu problem.
5 Two-Phase Flow Parameter Estimation
We experimented with a small-scale and a large-scale parameter estimation problem using sensor
pressure data. The small case was included to show that, despite having full information on the fluid
flow pressure field, the inverse problem is still highly ill-posed and has multiple
local minima. This also illustrates how challenging parameter estimation can be in realistic
groundwater flow scenarios. Therefore, strategies to regularize and reparameterize the inverse
problem need to be employed.
Our goal is to estimate the permeability based on pressure measurements. We formulate
the parameter estimation as a nonlinear least squares problem:

\[ \min f(x) = \frac{1}{2} (x - x^*)^T W (x - x^*), \]

where $x \in \mathbb{R}^n$, $x^*$ is the vector of true pressure values, and $W$ is a positive diagonal matrix.
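A minimal sketch of this objective follows. It stores only the diagonal of W, and it reads x as the vector of simulated pressures that is compared with the true values x*; that reading, and the function name `weighted_misfit`, are assumptions of the sketch.

```python
import numpy as np

def weighted_misfit(x, x_star, w):
    """Return 0.5 * (x - x*)^T W (x - x*) with W = diag(w), w > 0."""
    d = np.asarray(x, dtype=float) - np.asarray(x_star, dtype=float)
    w = np.asarray(w, dtype=float)
    if np.any(w <= 0):
        raise ValueError("W must be a positive diagonal matrix")
    return 0.5 * float(d @ (w * d))  # avoids forming the full matrix W
```

For instance, `weighted_misfit([1, 2], [0, 0], [2, 1])` evaluates 0.5 * (2*1 + 1*4) = 3.0.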
5.1 Small and Sequential Case
We run a single-phase simulation for a given permeability field K of size 10x10 and some given
default parameters. The simulation returns the pressure head field at 100 different pressure
observation points. We run the test case on a Windows-based laptop using Matlab 7.1.
We use 5 random starting points and allow 2000 iterations for SPSA. We obtain 10167 search
points, which we filter down to 105 points for creating a surrogate model.
Next, we use NKIP to find a local solution in 103 iterations; the optimal function value
obtained is 0.00353. Figures 10 and 11 show the permeability and pressures calculated with the
hybrid approach versus the true data for the small test case.
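The report does not spell out the filtering rule, so the following sketch shows only one plausible reading: keep the search points that lie within a user-specified Euclidean radius of the best point found by SPSA. The function name `filter_by_radius` and the distance criterion are assumptions.

```python
import numpy as np

def filter_by_radius(points, values, radius):
    """Keep the points within `radius` of the best (lowest-value) point."""
    points = np.asarray(points, dtype=float)
    values = np.asarray(values, dtype=float)
    best = np.argmin(values)                       # best SPSA point
    dist = np.linalg.norm(points - points[best], axis=1)
    keep = dist <= radius
    return points[keep], values[keep]
```

The retained points, together with any additional samples, would then be passed to the interpolation step that builds the surrogate model.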
Figure 10. Permeability: true field (top-left), initial input for SPSA (top-right), estimated output
with only SPSA (bottom-left), and estimated output with the hybrid optimization approach
(bottom-right).
Figure 11. Pressure field: true field (top-left), initial input to SPSA (top-right), estimated output
with SPSA (bottom-left), and estimated output with the hybrid optimization scheme (bottom-right).
5.2 Using HPC
We run the hybrid optimization algorithm in conjunction with the simulator framework IPARS
(Integrated Parallel Accurate Reservoir Simulator) [10, 12] on a Linux-based multicore network
of workstations. The problem consists of finding a permeability field that involves 2000
parameters. The field is parameterized by the singular value decomposition (SVD) [11], reducing
the original parameter space to only 20 parameters (each parameter represents a scale resolution level). We
choose several initial points by adding a percentage of noise to the true singular values:
5%, 10%, and 15%.
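The idea of the SVD parameterization can be sketched as follows: the singular vectors of a reference field are kept fixed, and only the leading singular values are treated as unknowns. The 40x50 field shape used in the usage note, the uniform multiplicative noise model, and the function names are assumptions chosen for illustration; the report states only that 2000 parameters are reduced to 20.

```python
import numpy as np

def svd_parameterize(K, n_keep=20):
    """Split a 2-D field K into fixed bases (U, Vt) and n_keep singular values."""
    U, s, Vt = np.linalg.svd(K, full_matrices=False)
    return U[:, :n_keep], s[:n_keep], Vt[:n_keep, :]

def reconstruct(U, s, Vt):
    """Rebuild the field from the reduced parameters s."""
    return (U * s) @ Vt

def perturb(s, noise, rng):
    """Relative noise on the singular values, e.g. noise=0.05 for 5%.

    Uniform multiplicative noise is an assumption of this sketch.
    """
    return s * (1.0 + noise * rng.uniform(-1.0, 1.0, size=s.shape))
```

With a field `K` of shape (40, 50), `U, s, Vt = svd_parameterize(K)` yields 20 parameters, and `reconstruct(U, perturb(s, 0.10, rng), Vt)` produces a noisy initial field for the optimizer.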
Tables 1-3 summarize the results obtained for the best 3 test cases, and Figures 13-26
show the corresponding permeability fields obtained.
As we can observe, the SPSA algorithm does a good job of obtaining an estimate that is
further improved by the hybrid approach. As the noise level increased, both SPSA and
NKIP had more difficulty reproducing the true permeability field. However, the
estimation is accurate in all cases, with final hybrid objective function values of 0.478, 0.747,
and 1.465 for 5%, 10%, and 15% noise, respectively. The metamodels were constructed on the SVD
parameterization. Note that the parameterization criterion was also effective in this case.
Table 1. Hybrid Optimization Parameters.
Noise                                                       5%
Number of runs by SPSA                                      30
Total iterations SPSA                                       3645
Function evaluations SPSA (2 x total iterations SPSA)       7290
Initial objective function value                            8 to 10
Final objective function value given by SPSA                0.568 to 2.5
Best SPSA iteration - initial objective function value      9.03
Best SPSA iteration - final objective function value        0.568
Number of function values used to create surrogate model    156
Iterations NKIP                                             94
Objective function hybrid                                   0.478
% gain by hybrid approach                                   16
Figure 12. True permeability field.
Figure 13. Initial permeability field with 5% noise.
Figure 14. Solution by SPSA.
Figure 15. Hybrid scheme final solution (improved).
Figure 16. Convergence history of HPC problem with 5% noise.
Table 2. Hybrid Optimization Scheme Parameters.
Noise                                                       10%
Number of runs by SPSA                                      20
Total iterations SPSA                                       2248
Function evaluations SPSA (2 x total iterations SPSA)       4496
Initial objective function value                            18.72 to 53.9
Final objective function value given by SPSA                0.858 to 6.785
Best SPSA iteration - initial objective function value      19.67
Best SPSA iteration - final objective function value        0.858
Number of function values used to create surrogate model    81
Iterations NKIP                                             67
Objective function hybrid                                   0.747
% gain by hybrid approach                                   13
Figure 17. True Permeability Field.
Figure 18. Initial Permeability Field with 10% Noise.
Figure 19. Solution by SPSA.
Figure 20. Hybrid optimization scheme final solution (Improved).
Figure 21. Convergence history of HPC problem with 10% noise.
Table 3. Hybrid Optimization Scheme Parameters.
Noise                                                       15%
Number of runs by SPSA                                      20
Total iterations SPSA                                       1647
Function evaluations SPSA (2 x total iterations SPSA)       3294
Initial objective function value                            29.46 to 120.56
Final objective function value given by SPSA                1.8 to 15.89
Best SPSA iteration - initial objective function value      31.35
Best SPSA iteration - final objective function value        1.541
Number of function values used to create surrogate model    57
Iterations NKIP                                             110
Objective function hybrid                                   1.465
% gain by hybrid approach                                   5
Figure 22. True permeability field.
Figure 23. Initial permeability field with 15% noise.
Figure 24. Solution given by SPSA.
Figure 25. Hybrid optimization scheme final solution (improved).
Figure 26. Convergence history of HPC problem with 15% noise.
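The "% gain by hybrid approach" entries in Tables 1-3 appear consistent with the relative reduction of the objective function from the best SPSA value to the hybrid value. The formula below is inferred from the tabulated numbers rather than stated in the report, and the helper name `percent_gain` is hypothetical.

```python
def percent_gain(f_spsa, f_hybrid):
    """Relative reduction (in %) of the objective from SPSA to the hybrid scheme."""
    return 100.0 * (f_spsa - f_hybrid) / f_spsa

# (best SPSA final value, hybrid value) taken from Tables 1-3
results = {"5%": (0.568, 0.478), "10%": (0.858, 0.747), "15%": (1.541, 1.465)}

for noise, (f_spsa, f_hybrid) in results.items():
    print(f"noise {noise}: gain = {percent_gain(f_spsa, f_hybrid):.1f}%")
```

Rounded to the nearest integer, these values give 16, 13, and 5, matching the tables.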
6 Conclusions
We combine the SPSA and NKIP strategies into a hybrid scheme that exploits the best of these two
approaches for a given problem in order to achieve maximum efficiency and robustness. It is
clear that a broad exploration of the parameter space is needed to increase the chance
of finding a global optimum when using SPSA. On the other hand, this search has to be
performed in a controlled way to keep the number of function evaluations within reasonable
computational bounds. In this report, we show how this can be achieved by generating
metamodels (or proxies) and adjusting the degree of exploration as the iterations proceed. We
present promising numerical results that show the efficiency of the hybrid scheme. Further
experiments should be conducted using high-performance computing to exploit the hybrid scheme
for solving very large-scale problems. As we showed, the hybrid scheme is trivially parallel, since
different initial guesses for the optimization can be deployed independently. Nevertheless, further
levels of parallelism could be exploited when computing the stochastic gradient in SPSA and
evaluating the function itself (e.g., using a parallel groundwater simulator).
The results of this project open new development avenues to be pursued further. These avenues
are mainly related to the ill-posedness associated with the parameter estimation process in EQM
problems. We mention some of them: (1) regularization; (2) parameterization; (3)
preconditioning; and (4) uncertainty and sensitivity. The first two seek to reduce the parameter
space and yet produce better estimates. Preconditioning aims at improving
efficiency and has a regularization effect that is traditionally hard to measure or control.
Uncertainty and sensitivity assessment provide the means to determine reliability bounds for the
estimation.
Fortunately, the metamodel framework offers interesting possibilities for developing the
above ideas. We consider parameterization (or reparameterization) a natural way to improve
both accuracy and performance in large-scale estimation. Some of the authors have initiated part of
that effort [2, 11], and it is already planned for future DoD PET developments.
Acknowledgements
The authors thank DoD for the support given with the grant DoD-PET Project EQM-KY6-003.
References
[1] Miguel Argáez and R.A. Tapia. On the global convergence of a modified augmented
Lagrangian linesearch interior-point Newton method for nonlinear programming. J. Optim.
Theory Appl., 114:1-25, 2002.
[2] R. Banchs, H. Klie, A. Rodriguez, and M.F. Wheeler. A neural stochastic optimization
framework for oil parameter estimation. In Intelligent Data Engineering and Automated Learning
(IDEAL), Lecture Notes in Computer Science, pages 147-154, Burgos, Spain, Sept. 20-23 2006.
[3] S. Chandrasekhar. Radiative Transfer. Dover, New York, 1960.
[4] R. Glowinski, H.B. Keller, and L. Reinhart. Continuation-conjugate gradient methods for the
least squares solution of nonlinear boundary value problems. SIAM J. Sci. Stat. Comput., 4:793-833, 1985.
[5] V. Kecman. Learning and Soft Computing: Support Vector Machines, Neural Networks and
Fuzzy Logic Machines. The MIT Press, 2001.
[6] C.T. Kelley. Iterative methods for linear and nonlinear equations. In Frontiers in Applied
Mathematics. SIAM, Philadelphia, 1995.
[7] H. Klie, L. Velázquez, M. Argáez, C. Quintero, and M.F. Wheeler. Project EQM-KY6-003:
Automated Parameter Estimation and Sensitivity Analysis. Deliverable 1. Technical report,
Department of Defense, Environmental Quality Modeling and Simulation, Aug. 31, 2006.
[8] H. Klie, L. Velázquez, M. Argáez, C. Quintero, and M.F. Wheeler. Project EQM-KY6-003:
Automated Parameter Estimation and Sensitivity Analysis. Deliverable 2. Technical report,
Department of Defense, Environmental Quality Modeling and Simulation, Dec. 20, 2006.
[9] J.J. Moré. A collection of nonlinear problems. In E.L. Allgower and K. Georg, editors,
Lectures in Applied Mathematics, Vol. 26, pages 723-762. American Mathematical Society, 1990.
[10] M. Parashar, J. A. Wheeler, G. Pope, K. Wang, and P. Wang. A new generation EOS
compositional reservoir simulator. Part II: Framework and multiprocessing. In Fourteenth SPE
Symposium on Reservoir Simulation, Dallas, Texas, pages 31-38, June 1997.
[11] A. Rodriguez, H. Klie, S.G. Thomas, and M.F. Wheeler. A multiscale and metamodel
simulation-based method for history matching. In 10th European Conference on the Mathematics
of Oil Recovery (ECMOR). EAGE, Sept. 4-7, 2006.
[12] P. Wang, I. Yotov, M. F. Wheeler, T. Arbogast, C. N. Dawson, M. Parashar, and K.
Sepehrnoori. A new generation EOS compositional reservoir simulator. Part I: Formulation and
Discretization. In Fourteenth SPE Symposium on Reservoir Simulation, Dallas, Texas, pages 55-64. Society of Petroleum Engineers, June 1997.