Tutorial: Optimal Learning in the Laboratory Sciences
Working with nonlinear belief models
December 10, 2014
Warren B. Powell, Kris Reyes, Si Chen
Princeton University
http://www.castlelab.princeton.edu

Lecture outline
Nonlinear belief models

Knowledge Gradient with Discrete Priors

The knowledge gradient can be hard to compute:

    x^{KG,n} = \arg\max_x \, \mathbb{E}\Big[ \max_y F\big(y, K^{n+1}(x)\big) - \max_y F\big(y, K^n\big) \Big]

The expectation is hard to compute when the belief model is nonlinear. The belief model is often nonlinear, such as a kinetic model for fluid dynamics. This has motivated research into how to handle these problems.

Proposal: assume a finite number of truths (discrete priors), e.g. L = 3 candidate truths. Each utility curve depends on kinetic parameters, e.g. θ1, θ2, θ3. We maintain a weight on each candidate to represent how likely it is to be the truth; e.g. p1 = p2 = p3 = 1/3 means the candidates are equally likely.

The weights on the candidate truths are therefore also weights on the choice of kinetic parameters.

Estimation: the estimate is a weighted sum of all the candidate truths.

There are many possible candidate truths, and for each candidate truth the measurements are noisy.

Suppose we make a measurement. The weights are updated upon observation: candidates whose predictions are close to the observation become more likely.
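The weight update just described is a standard Bayesian reweighting of the discrete candidates. The sketch below assumes scalar measurements with additive Gaussian noise; the function name and arguments are illustrative, not taken from the tutorial.

```python
import numpy as np

def update_weights(weights, candidate_values, observation, noise_std):
    """Bayesian update of the probabilities on candidate truths.

    weights          : prior probabilities p_l on each candidate truth
    candidate_values : f_l(x), the value each candidate predicts at the
                       measured point x
    observation      : the noisy measurement y at x
    noise_std        : assumed standard deviation of the measurement noise
    """
    weights = np.asarray(weights, dtype=float)
    candidate_values = np.asarray(candidate_values, dtype=float)
    # Gaussian likelihood of the observation under each candidate truth
    likelihood = np.exp(-0.5 * ((observation - candidate_values) / noise_std) ** 2)
    posterior = weights * likelihood
    return posterior / posterior.sum()  # renormalize so the weights sum to 1

# Three equally likely candidates predict different values at the measured
# point; an observation near the second candidate's prediction shifts
# weight toward that candidate.
p = update_weights([1/3, 1/3, 1/3], [2.0, 5.0, 8.0], observation=5.2, noise_std=1.0)
```

Note that only the relative likelihoods matter: the renormalization step keeps the weights a proper probability distribution after every observation.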
Candidates whose predictions are far from the observation become less likely.

The estimate is then updated using our observation.

Average Marginal Value of Information

The best estimate is the maximum utility value. The marginal value of information is the difference between the best estimate after the experiment and the best estimate before the experiment. The average marginal value of information averages this difference across all candidate truths and observation noise.

KGDP makes decisions by maximizing the average marginal value of information. After several observations, the weights can tell us which candidate is likely the truth.

Candidate Truths (2D)

The beliefs on the parameters produce a family of surfaces.

Before any measurements: do we explore? The KG "road map" shows us where we learn the most. Or do we exploit? The prior estimate shows the region where we think we will get the best results (but we might be wrong). This is the classic exploration vs. exploitation problem.

[Figures: side-by-side KG "road map" and prior/posterior estimates before any measurements and after 1, 2, 5, 10, and 20 measurements; axes are oil droplet diameter (5-10 nm) vs. inner water droplet diameter (0.3-1 nm).]

After 20 measurements, the posterior estimate closely matches the truth.

Kinetic parameter estimation

Besides learning where the optimal utility is, the KG policy can help learn the kinetic parameters.
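The KGDP decision rule sketched above can be approximated by Monte Carlo: for each candidate experiment, average the improvement in the best estimate over which candidate is the truth and over measurement noise, then measure where this average is largest. This is a sketch under stated assumptions (tabulated designs, Gaussian noise, sampled expectation), not the authors' implementation; all names are illustrative.

```python
import numpy as np

def kgdp_choose(weights, F, noise_std, rng, n_noise=50):
    """Pick the next measurement by the KGDP rule.

    weights : current probabilities p_l on the L candidate truths
    F       : L x M array; F[l, x] is candidate l's utility at design x
    Returns the design maximizing the average marginal value of
    information, plus the vector of KG values (the "road map").
    """
    weights = np.asarray(weights, dtype=float)
    F = np.asarray(F, dtype=float)
    L, M = F.shape
    best_now = (weights @ F).max()      # best estimate before the experiment
    kg = np.zeros(M)
    for x in range(M):
        for l in range(L):              # average over which candidate is true...
            for _ in range(n_noise):    # ...and over the measurement noise
                y = F[l, x] + noise_std * rng.standard_normal()
                like = np.exp(-0.5 * ((y - F[:, x]) / noise_std) ** 2)
                post = weights * like
                post /= post.sum()
                # best estimate after the (simulated) experiment
                kg[x] += weights[l] * ((post @ F).max() - best_now)
        kg[x] /= n_noise
    return int(np.argmax(kg)), kg

# Three candidate truths over three candidate designs, uniform prior.
rng = np.random.default_rng(0)
F = np.array([[1.0, 2.0, 3.0],
              [3.0, 2.0, 1.0],
              [2.0, 3.0, 1.0]])
x_star, kg = kgdp_choose([1/3, 1/3, 1/3], F, noise_std=0.5, rng=rng)
```

The vector `kg` plays the role of the KG "road map" in the slides: plotted over the design space, it shows where an experiment teaches us the most.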
The distribution on candidate truths induces a distribution on their respective parameters. A uniform prior distribution over the candidate truths translates to a random sample from a uniform distribution for each individual parameter.

[Figures: prior (uniform) probability distributions over the candidate truths and over individual kinetic parameters, and the corresponding distributions after 20 measurements.]

After 20 measurements, the most probable prefactor/energy-barrier combinations come in pairs: low prefactor with low barrier, and high prefactor with high barrier. These pairs yield similar rates at room temperature, so KG is really learning these rates.

After 50 measurements, the distribution of belief about the parameter vectors translates into distributions of belief about individual parameters such as k_ripe and k_coalesce.

Collaboration with McAlpine Group

Opportunity Cost

The percentage opportunity cost is the difference between the estimated and true optimum values, expressed relative to the true optimum value.

Rate Error

The rate error (log scale) is the difference between the estimated rate and the true optimal rate.
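One common reading of these two evaluation metrics can be sketched as follows, assuming the estimated and true utilities are tabulated over the same set of candidate designs; the function names and this particular formulation are illustrative assumptions, not definitions from the tutorial.

```python
import numpy as np

def opportunity_cost(estimate, truth):
    """Percentage opportunity cost of acting on the estimate.

    estimate, truth : utility values over the same candidate designs.
    Measures the true utility lost by implementing the design that looks
    best under the estimate, relative to the true optimum.
    """
    estimate = np.asarray(estimate, dtype=float)
    truth = np.asarray(truth, dtype=float)
    x_hat = np.argmax(estimate)            # design we would implement
    return 100.0 * (truth.max() - truth[x_hat]) / truth.max()

def rate_error(estimated_rate, true_rate):
    """Log-scale rate error between the estimated and true optimal rates."""
    return abs(np.log10(estimated_rate) - np.log10(true_rate))

# The estimate's favorite design (index 1) has true value 3 vs. a true
# optimum of 4, so a quarter of the achievable utility is lost.
oc = opportunity_cost([1.0, 5.0, 2.0], [4.0, 3.0, 2.0])
err = rate_error(10.0, 1.0)
```

A perfect estimate gives zero opportunity cost even if its values are wrong everywhere, as long as it ranks the best design first; this is why the slides track opportunity cost rather than raw estimation error.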