Bioassay 4PL Simulator
A learning tool
USP Statistics Expert Committee
22Aug2011

Outline
• Objectives
• <1034> Principles illustrated
• Using the simulator
• Some technical details

Objectives
• Illustrate Principles in <1034> Analysis of Biological Assays, section 3.4. Nonlinear Models for Quantitative Responses
• Readily available software (MS Excel)
• Tool for exploring and visualizing bioassay data
• Simulate the process for relative potency estimation

Not the Objective
• Analysis of real bioassay data
• Illustrate experimental design principles (blocking, randomization, balance, etc.)
• Recommend specific statistical or numerical procedures
• Recommend a dilution protocol
• Recommend a specific sampling plan. The following is chosen purely for illustration:
  – Four 96-well plates
  – 3 unknown samples + 1 reference
  – 12 two-fold dilutions/sample
  – 2 replicates/sample/dilution/plate

<1034> Principles Illustrated
• Determination and application of appropriate transformation (but not weighting)
• Outlier detection (but not resolution)
• System suitability assessment
• Sample suitability assessment
• Relative potency and confidence interval for 4PL nonlinear bioassays
• Combination of estimates

Determination and application of appropriate transformation
• The simulated data assume a log transformation (no weighting), so LN(readout) is the response.
• Informal scatter plots illustrate the effect of transformation on variance homogeneity.
• Measures of spread that take the log transformation into account (StdDev, GSD, %GCV, %CV)
• Log transformation of the concentration scale simplifies relative potency estimation (no need for Fieller’s theorem)

Outlier detection
• Random outliers are present with probability and magnitude controlled by the user.
• Potential outliers are identified by scatter plots of Studentized residuals obtained from a model that does not assume sample similarity but takes into account variance heterogeneity.
• The effect of outliers on the analysis can be simulated.
• Outlier removal/accommodation is not illustrated; all data are used in the analysis.
• In real practice, exclude only data known to result from technical problems (e.g., contaminated wells).

System suitability assessment
• Both within- and between-plate measures of spread are obtained.
• Full model does not assume sample similarity.
• System suitability metrics illustrated
  – 95% confidence upper bound on measures of spread
  – 90% confidence intervals on reference (and test sample) parameter estimates
  – Correlation of parameter estimates to detect collinearity and to check for an adequate dilution range
  – Scatter plots of Studentized residuals to recognize outlier patterns
  – Scatter plots of raw and transformed readout to look for hook effect (hook effect not simulated)
• Setting of system suitability limits is not illustrated. The impact of limit choice on the suitability failure rate can be simulated.
• Lack of fit test is not illustrated.

Sample suitability assessment
• Sample suitability metrics illustrated
  – Scatter plots to examine sample variance and hook effect
  – 90% confidence interval estimates for all sample parameters
  – 90% confidence interval estimates for parameter differences from reference
• Sample suitability metrics not illustrated
  – Parallelism sum of squares (discussed in <1032>)
  – Functional transformation of parameters (e.g., asymptote divided by the difference between asymptotes)
• Setting of sample suitability limits is not illustrated. The impact of limit choice on the suitability failure rate can be simulated.

Relative Potency Estimation
• Two sample (test and reference) estimation illustrated.
• Reduced model (assuming similarity) used.
• Log transformation of the concentration scale simplifies relative potency estimation (no need for Fieller’s theorem)
• 95% confidence interval for relative potency provided.
  – Based on total variance (sum of within and between plate variance components)
  – Accounts for covariance between sample and reference estimates of EC50
  – Satterthwaite approximation used for degrees of freedom

Combination of estimates
• If each plate is regarded as a separate assay then the simulator illustrates a combination by analyzing all plates together using a nested model to accommodate between plate variance.
• The proper variance for the estimate of relative potency is managed using an approximation (linear Taylor series) to the variance covariance matrix of parameters.
• Multiple relative potency estimates (i.e., repeat runs of 4 plates) can be simulated and the methods in <1034>, section 4.1 applied.

Using the Simulator
• Download the version that matches your version of MS Excel:
  – MS Excel 2000 & 2003
  – MS Excel 2007
  – MS Excel 2010
  – Has not been tested in other environments
• Requires manual installation of Solver Add-In for least squares minimization
  – Normally installed with Excel, but must be loaded on first use
  – More information about Solver is available in Excel Help (search “Solver”), or from Solver Help at www.solver.com
• User must enable macros on opening the simulator
• Problems or suggestions: David.LeBlond@abbott.com

Loading Solver: MS Excel 2000 & 2003
• Tools/ Add-Ins/
• Check “Solver Add-in”/ OK
  – If you get a message, click “Yes”.
  – If “Solver Add-In” is not listed run setup again and make sure to select the option to install Solver.
• When opening the simulator a warning message will appear.
  – Click “Yes” (2000) or “Enable Macros” (2003)

Loading Solver: MS Excel 2007
• Click the Microsoft Office Button (top left), and then click Excel Options (bottom of window).
• Click Add-Ins, and then in the Manage box, select Excel Add-ins.
• Click Go.
• In the Add-Ins available box, select the Solver Add-in check box, and then click OK.
• Tip: If Solver Add-in is not listed in the Add-Ins available box, click Browse to locate the add-in.
• If you get prompted that the Solver Add-in is not currently installed on your computer, click Yes to install it.
• After you load the Solver Add-in, the Solver icon should appear in the Analysis group on the Data tab.
• When opening the simulator in Excel, a security warning “Macros have been disabled” may appear under the ribbon menu. Click “Options…”, then select “Enable this Content” / OK.

Loading Solver: MS Excel 2010
• Click the File tab.
• Click Options, and then click the Add-Ins category.
• Near the bottom of the Excel Options dialog box, make sure that Excel Add-ins is selected in the Manage box, and then click Go.
• In the Add-Ins dialog box, select the check box for Solver Add-in, and then click OK.
• If Excel displays a message that states it can't run this add-in and prompts you to install it, click Yes to install the add-in.
• On the Data tab, note that an Analysis group has been added. This group contains command buttons for Data Analysis and for Solver.
• When opening the simulator in Excel, a security warning “Macros have been disabled” may appear under the ribbon menu. Click “Enable this Content”.

Using the Simulator (cont.)
• A 21 step process… just follow instructions in first sheet
  – Simulate new bioassay results (steps 1-4)
  – Examine effect of log transformation (steps 5-6)
  – Obtain starting estimates for model parameters (step 7)
  – Fit Full 4PL Model (step 8)
  – Perform System and Sample Suitability Checks (steps 9-16)
  – Fit Reduced 4PL Model, assuming similarity (step 17)
  – Examine the final relative potency estimate (steps 18-21)
• May examine cell formulae and macro code, if desired.
• User must not make any changes to cells unless specifically instructed.

Technical Details

Data Simulation Model
• Readouts for each sample are generated from the following model

$$\ln(\mathrm{Readout}_{plate,well}) = D + \frac{A - D}{1 + \left(\dfrac{\text{Rel. Conc.}}{EC_{50}}\right)^{B}} + \delta_{plate} + \varepsilon_{well(plate)} + \mathrm{outlier}$$

$$\delta_{plate} \sim N\left(0, \sigma^2_{btwn}\right), \qquad \varepsilon_{well(plate)} \sim N\left(0, \sigma^2_{wthn}\right), \qquad \mathrm{outlier} = I(\mathrm{outlier}) \cdot F \cdot \sqrt{\sigma^2_{btwn} + \sigma^2_{wthn}}$$

• Data for four 96-well plates are simulated.
• Twelve 2-fold serial dilutions of 4 samples (A, B, C, R) in duplicate for each plate.
• Fixed effect parameters are: exp(A), B, log2(EC50), and exp(D)
• Random normal plate deviations added.
• Random normal well (within plate) deviations added.
• Random occurrence of outliers, with specified probability, and magnitude specified by F.

Overall Data Analysis Approach
The MS Excel Add-In Solver is used to find point estimates of the fixed nonlinear model parameters exp(A), B, log2(EC50), and exp(D). The solution is obtained using Solver’s search algorithm, which minimizes the sum of squared errors, i.e., the deviations of the fitted model from the observed individual LN(readout) values.

Once the Solver search terminates, these deviations are analyzed outside of Solver to obtain estimates of the 2 random parameters: the within and between plate variances. This latter analysis is based on a simple one way random effects ANOVA.

The Jacobian (J, a matrix of partial derivatives of the model with respect to all parameters at the optimal solution) is generated in Excel. Excel matrix functions are then used to obtain the unscaled variance covariance (VCV) matrix of model parameters as (J’J)^-1. This provides the parameter correlations. A final variance covariance matrix is obtained by multiplying the unscaled VCV matrix by the total variance. This, together with the Satterthwaite approximation, provides confidence intervals for variances, fixed parameters and parameter differences (i.e., from reference sample). The Graybill and Wang modified large sample confidence interval approach is used to obtain the confidence interval for the total variance.

Good references to these methods are:
1. Applied Linear Statistical Models, Neter et al., 4th ed., Irwin, pp. 545 & 972
2.
Confidence Intervals on Variance Components, Burdick & Graybill, Dekker, p. 63
3. Links to additional information about Solver:
   http://peltiertech.com/Excel/SolverVBA.html
   http://support.microsoft.com/kb/82890

Solver Minimization Algorithm
• The standard version of Solver bundled with Excel employs an optimization algorithm (GRG2) for smooth nonlinear problems based on one developed by L. Lasdon and A. Waren.
• The simulator uses this algorithm as a tool to find the fixed nonlinear model parameters that minimize the sum of squared deviations of the model from the simulated ln(Readout) data.
• Lasdon, L.S.; Waren, A.D.; Jain, A.; and Ratner, M. 1978, "Design and testing of a generalized reduced gradient code for nonlinear programming," ACM Transactions on Mathematical Software, Vol. 4, No. 1, pp. 34-49.
• Lasdon, L.S. and Smith, S. 1992, "Solving large sparse nonlinear programs using GRG", ORSA Journal on Computing, Vol. 4, No. 1, pp. 2-15.
• See also: http://www.utexas.edu/courses/lasdon/design3.htm

Initial Parameter Guesses
• Required by Solver to start least squares search
• exp(A0) = average readout at highest dilution
• exp(D0) = average readout at lowest dilution
• log2(EC50)0 = -6, log2(rel conc) near middle of dilution range
• B0 is obtained from the change in ln(readout) between log2(rel conc) = -6 and -5:

$$B_0 = \frac{1}{\ln 2}\,\ln\!\left(\frac{A_0 - D_0 - 2\,\mathrm{slope}}{A_0 - D_0 + 2\,\mathrm{slope}}\right), \qquad \text{where } \mathrm{slope} = \ln\!\left(\frac{\text{ave readout at } \log_2(\text{rel conc}) = -5}{\text{ave readout at } \log_2(\text{rel conc}) = -6}\right)$$

Measures of Spread
The following measures of spread are used:

$$\sigma_{Total} = \sqrt{\sigma^2_{within\ plate} + \sigma^2_{between\ plate}}$$

$$GSD = \exp(\sigma), \qquad \%GCV = 100\,(GSD - 1), \qquad \%CV_{readout} = 100\,\sqrt{\exp(\sigma^2) - 1}$$

These are described more fully in the appendix to USP <1033>, Biological Assay Validation

Matrix of partial derivatives
• The matrix of partial derivatives is J, the “Jacobian”, needed for approximate confidence interval and parameter correlation estimates
• Evaluated for each parameter/sample/dilution/replicate at the parameter estimates found by Solver

$$\frac{\partial y}{\partial \exp(A)} = \frac{1}{\exp(A)} \cdot \frac{1}{1 + 2^{B(x - \log_2 EC_{50})}}$$

$$\frac{\partial y}{\partial \exp(D)} = \frac{1}{\exp(D)} \left(1 - \frac{1}{1 + 2^{B(x - \log_2 EC_{50})}}\right)$$

$$\frac{\partial y}{\partial B} = -\frac{(A - D)\,\ln 2\,(x - \log_2 EC_{50})\,2^{B(x - \log_2 EC_{50})}}{\left(1 + 2^{B(x - \log_2 EC_{50})}\right)^{2}}$$

$$\frac{\partial y}{\partial \log_2 EC_{50}} = \frac{(A - D)\,\ln 2\;B\;2^{B(x - \log_2 EC_{50})}}{\left(1 + 2^{B(x - \log_2 EC_{50})}\right)^{2}}$$

• Where y = ln(Readout) and x = log2(Relative Concentration)
• Approximate parameter variance covariance matrix:

$$\left(\sigma^2_{between\ plate} + \sigma^2_{within\ plate}\right)\left(J'J\right)^{-1}$$
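
The partial derivatives and the approximate variance covariance matrix on this slide can be sketched outside Excel. The Python/NumPy code below is a minimal illustration, not the simulator's macro code: the function names, the parameter values, and the variance components are hypothetical, and the model is assumed to be y = D + (A - D) / (1 + 2^(B(x - log2 EC50))) with parameters ordered as exp(A), B, log2(EC50), exp(D). A central finite difference is included only as a check on the analytic derivatives.

```python
import numpy as np

LN2 = np.log(2.0)

def four_pl(theta, x):
    """ln(readout) for theta = (exp(A), B, log2(EC50), exp(D)); x = log2(rel conc)."""
    exp_a, b, c, exp_d = theta
    a, d = np.log(exp_a), np.log(exp_d)
    u = 2.0 ** (b * (x - c))
    return d + (a - d) / (1.0 + u)

def jacobian(theta, x):
    """Analytic partial derivatives of y, one row per observation."""
    exp_a, b, c, exp_d = theta
    a, d = np.log(exp_a), np.log(exp_d)
    u = 2.0 ** (b * (x - c))
    w = 1.0 / (1.0 + u)
    dy_dexp_a = w / exp_a
    dy_db = -(a - d) * LN2 * (x - c) * u * w**2
    dy_dc = (a - d) * LN2 * b * u * w**2
    dy_dexp_d = (1.0 - w) / exp_d
    return np.column_stack([dy_dexp_a, dy_db, dy_dc, dy_dexp_d])

def numeric_jacobian(theta, x, rel_step=1e-6):
    """Central finite differences, used only to check the analytic Jacobian."""
    theta = np.asarray(theta, dtype=float)
    cols = []
    for j in range(theta.size):
        h = rel_step * max(1.0, abs(theta[j]))
        up, dn = theta.copy(), theta.copy()
        up[j] += h
        dn[j] -= h
        cols.append((four_pl(up, x) - four_pl(dn, x)) / (2.0 * h))
    return np.column_stack(cols)

def parameter_vcv(J, var_within, var_between):
    """Approximate VCV = total variance * (J'J)^-1, plus parameter correlations."""
    unscaled = np.linalg.inv(J.T @ J)
    vcv = (var_within + var_between) * unscaled
    sd = np.sqrt(np.diag(unscaled))
    corr = unscaled / np.outer(sd, sd)
    return vcv, corr

# Hypothetical design: 12 two-fold dilutions in duplicate, log2(rel conc) = 0..-11.
x = np.repeat(-np.arange(12.0), 2)
# Hypothetical parameter estimates and variance components (assumptions, not simulator output).
theta = np.array([100.0, 1.2, -6.0, 2000.0])
J = jacobian(theta, x)
vcv, corr = parameter_vcv(J, var_within=0.01, var_between=0.005)
max_abs_err = np.max(np.abs(J - numeric_jacobian(theta, x)))
```

Large off-diagonal entries in `corr` would point to collinearity among the parameter estimates or an inadequate dilution range, which is the use of the parameter correlations described under system suitability.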