Combined use of Design of Experiment (DoE) and Process Automation for the Efficient Optimization of New Synthetic Transformations

Federica Stazi, Ph.D. Thesis
Università dell’Insubria, Dipartimento di Chimica, Via Valleggio no 11, 22100 Como (Italy), www.uninsubria.it
R&D Chemistry Research Centre, Via Lorenzini no 8, 20139 Milano (Italy), www.boehringer-ingelheim.it
Literature meeting, May 2nd 2005

Reasons for DoE at the Chemistry Research Centre

Boehringer Ingelheim Pharma KG, Biberach, Germany, and CRC Milan, Italy
• Drug Development (pre-dev. candidates, intermediates, metabolites, process impurities…): Target Oriented Synthesis
• Drug Discovery (building blocks, test compounds…): Diversity Oriented Synthesis

Target Oriented Synthesis (TOS) and DoE

[Scheme: route to target 3 via intermediates 1 and 2. Reagents shown: NaH, DMF; Ag2O, CH3CN (DoE-driven search for optimal conditions); 1. H2, Pd/C, THF, 2. 1,1'-carbonyldi(1,2,4-triazole), 3. LiOH.]

Diversity-Oriented Synthesis (DOS) and DoE

[Scheme: aldehyde building blocks. DoE-driven search for optimal conditions, then: same starting material and rxn conditions with different RX; same R’X and rxn conditions with different starting materials.]

The DoE Concept: Basic Principles

[Diagram: a System turns Inputs (starting materials) into Outputs y (products); it is acted on by controllable factors x1, x2, …, xp and uncontrollable factors z1, z2, …, zq.]

OFAT (One Factor at A Time) Approach

OFAT results in a set of experiments in which only one factor is varied at a time.
[Figure: SM → P, factors A, B, C varied one after the other.]
• incomplete picture of the overall process
• factor interactions are not revealed
• number of experiments not fixed
• not possible to perform experiments in parallel

DoE (Design of Experiment) Approach

DoE results in a set of pre-planned experiments in which factors are varied at the same time.
[Figure: 2-level factorial design for SM → P. Cube whose 8 corners are experiments 1-8, with the experimental matrix of (+)/(-) settings of factors A, B, C.]
• precise estimation of factor effects
• factor interactions are revealed
• mathematical model of the chemical process based on statistical analysis
• possibility to perform experiments in parallel

Statistical Background and DoE Tools

• DOE Simplified: Practical Tools for Effective Experimentation, Mark J. Anderson, Patrick J. Whitcomb, Productivity Press, 2000
• Design and Optimization in Organic Synthesis, R. Carlson, Elsevier Science, 1997
• Design and Analysis of Experiments, 5th Edition, D.C. Montgomery, Wiley, 2000
• + Chemical Journals

Statistical Background and DoE Tools: Examples

S.V. Ley et al., Organic Process Research & Development, 2002, 6, 823
[Scheme: amino acid tert-butyl ester hydrochloride + fluoronitroarene; DIEA, EtOH, reflux.]
Pre DoE: 40%. Post DoE: 91%. 5 different R groups, yields: 81-96%.
4 F, Res IV, 8 exps + 2 centres:
  A.A. (equiv)        <1;2>
  PS-DIEA (equiv)     <2;4>
  Rxn time (hours)    <14;24>
  Conc. (volumes)     <25;50>

S.V. Ley et al., Synlett, 2000, 11, 1603
[Scheme: carboxylic acid + amine to amide; DCC, solvent.]
Post DoE: 97% (R: Et). 8 different R groups, 10 different R’ groups: 80 cpds, hit rate 95%.
5 F, Res IV, 16 exps + 4 centres:
  PS-DCC (equiv)      <1;3>
  Conc. (volumes)     <40;160>
  Rxn time (hours)    <0.5;4>
  Solvent T1          <-1;+1>
  Solvent T2          <-1;+1>
4 F, Res IV, 8 exps + 1 centre:
  PS-DCC (equiv)      <1;3>
  Conc. (volumes)     <40;160>
  Amine (equiv)       <1;2>

Statistical Background and DoE Tools

Hardware: Advantage Series 2050 (Argonaut); SK233 React Array Workstation (Anachem); Carousel (Radleys)
Software: Design Expert 6.0.4 by Stat-Ease; MODDE 7.0.0 by Umetrics

[Diagram of the automated workstation: syringes, needle, reaction rack, reagent and solvent racks, HPLC with UV/Vis detector, PC for reaction control and HPLC control.]

The Sequential Workflow of DoE

1. Synthetic problem
2. Planning the experiment:
   • state experimental objectives
   • choice of factors, levels and response variable
   • choice of experimental design
3. Performing the experiment
4. Data analysis and modeling
5. Interpretation and confirmations
6. Reiteration

Putting the Theory into Practice
Step 1. Defining the Synthetic Problem: a Problematic Glucuronidation

[Scheme: metabolism of Flibanserin (1. cytochrome P450; 2. UDPG transferases) to its glucuronide, and disconnection of the glucuronide into a protected glucuronyl donor (leaving group X) and a phenol. Label shown: BIMC-0576.]

Putting the Theory into Practice
Step 1. Defining the Synthetic Problem: O-Glucuronidation Background

[Scheme: enzymatic glucuronidation by UDPG transferases with UDP-glucuronic acid; chemical glucuronidation of R'OH with a protected methyl glucuronate donor and an additive.]
R' = alkyl, phenyl; R = Me, i-Pr, t-Bu; X = leaving group (Br, trichloroacetimidate)
For a review, see: Stachulski, A. V.; Jenkins, N. J. Nat. Prod. Rep. 1998, 173.

Putting the Theory into Practice
Step 1. Defining the Synthetic Problem: O-Glucuronidation Background

[Scheme: side reactions of the glucuronyl donor with R'OH: transacylation (R'OCOR), orthoester formation, C1-elimination, C4-elimination.]
R' = alkyl, phenyl; R = Me, i-Pr, t-Bu; X = leaving group (Br, trichloroacetimidate)
For a review, see: Stachulski, A. V.; Jenkins, N. J. Nat. Prod. Rep. 1998, 173.

Putting the Theory into Practice
Step 1. Defining the Synthetic Problem: A New Strategy

[Scheme: new route to the glucuronide. Reagents shown: MeSO2(CH2)2OH + NaH, DMSO (1 h 0 °C, 1 h 45 °C); Koenigs-Knorr conditions with the bromo sugar; then 1. H2, Pd/C, 2. 1,1'-carbonyldi(1,2,4-triazole), 3. LiOH.]
• Typical Koenigs-Knorr cond.: 3% yield (Ag2O, mol sieves, 18 h, CH3CN, R = Ac or R = Piv)
• Modified Koenigs-Knorr cond.: 25% yield (Ag2O, mol sieves, 18 h, CH3CN + TMEDA 10 eq, R = Piv)

Step 2. Planning the Experiment

Find the best starting point: small-scale parallel reagent screening (10 mg scale).
[Bar chart "Amine vs. “Ag”": yields with Ag2O and Ag2CO3 for the amines DIPEA, TMEDA, DMEDA, DIPEDA, HMTTA; bar labels 40.6, 38.6, 29.3, 26.1, 24.8, 15.2 and four bars at 0.]
Influence of:
• amine complexing ability (a)
• amine basicity
• silver source
pKa: DIPEA 11.0, TMEDA 9.1, DMEDA 10.3, DIPEDA 10.4, HMTTA 9.2
• HMTTA works best.
• The silver source does not significantly influence yields.
(a) Meyerstein et al. J. Am. Chem. Soc. 1995, 117, 8353-8361

Step 2. Planning the Experiment: Statement of the Problem

State experimental objectives: which type of design?
• Process screening: which variables are most influential?
• Process optimization: how do the relevant variables influence the response?
• Process robustness testing: do small changes in uncontrolled variables influence the response?

Step 2. Planning the Experiment: Selection of Factors

Choice of factors and factor levels: use of process knowledge + team work
[Scheme: reaction components considered as candidate factors: “Ag” source (Ag2CO3, Ag2O), Br-sugar, ...]
• Define design factors, held-constant factors and allowed-to-vary factors
• Factors can be either quantitative (time, stoichiometry) or qualitative (“Ag” type)

Step 2. Planning the Experiment

7 factors to be investigated in a screening factorial design

Variables Considered and Levels Used in the Factorial Design
name  factor                   units   (-)    0      (+)
A     pre-complexation time    min     0      30     60
B     reaction time            h       2      4      6
C     amount of Ag2CO3         equiv   1.5    2.6    3.8
D     amount of HMTTA          equiv   1.5    7.1    12.6
E     amount of Br-sugar       equiv   1.5    2.2    3
F     4Å molecular sieves      mg      0      50     100
G     amount of solvent        mL      0.5    1      1.5

Relative importance of factor interactions (FI): 2-FI > 3-FI >> 4/7-FI
• A complete investigation of 7 factors over 2 levels requires $2^7 = 128$ exps
• 128 parameters are estimable: 1 constant term, 7 linear terms, 21 2-FI, 35 3-FI, 64 4/7-FI

Step 2. Planning the Experiment: Full vs. Fractional Factorial Designs

Fractional factorials exploit the redundancy of full factorials to reduce the no. of exps.
[Table: no. of experiments of full vs. fractional factorial designs for 2 to 9 factors; full factorial values shown: 4, 8, 16, 32, 64, 128, 256.]
7 factors can also be studied in only a fraction of the original full factorial design.

Step 2. Planning the Experiment: Final Output of Pre-Experimental Plan

7 factors to be investigated in a $2^{7-4}$ Resolution III design: 8 exps + 3 center points (50 mg scale)

Experimental matrix (coded factor settings, matching the run table of Step 3):
exp   A   B   C   D   E   F   G
1     -   -   -   +   +   +   -
2     +   -   -   -   -   +   +
3     -   +   -   -   +   -   +
4     +   +   -   +   -   -   -
5     -   -   +   +   -   -   +
6     +   -   +   -   +   -   -
7     -   +   +   -   -   +   -
8     +   +   +   +   +   +   +
9     0   0   0   0   0   0   0
10    0   0   0   0   0   0   0
11    0   0   0   0   0   0   0

Center points (exps 9-11):
• for curvature detection
• for calculation of pure error

Step 3. Performing the Experiment

run   A     B    C     D      E     F     G     Prod. yield (%)
1     0     2    1.5   12.6   3     100   0.5   14.7
2     60    2    1.5   1.5    1.5   100   1.5   19.5
3     0     6    1.5   1.5    3     0     1.5   24.4
4     60    6    1.5   12.6   1.5   0     0.5   11.2
5     0     2    3.8   12.6   1.5   0     1.5   34.2
6     60    2    3.8   1.5    3     0     0.5   83.2
7     0     6    3.8   1.5    1.5   100   0.5   56.5
8     60    6    3.8   12.6   3     100   1.5   55.4
9     30    4    2.6   7.1    2.2   50    1     50.2
10    30    4    2.6   7.1    2.2   50    1     43.2
11    30    4    2.6   7.1    2.2   50    1     50.5

• Monitor and record values of uncontrolled factors
• Use randomization to reduce the influence of nuisance factors
• If possible, operate in parallel, since we rely on a previous experimental plan
• Perform a scoping study: check (- - -) vs. (+ + +) and reproducibility

Step 4. Data Analysis and Modeling: ANOVA Testing (Analysis of Variance)

Source         Sum of Squares   Df   Mean Square   F Value   P Value
Model          3.32             7    0.47          60.30     0.0164
A              0.051            1    0.051         6.46      0.062
B              0.005369         1    0.005369      0.68      0.4957
C (Ag2CO3)     2.54             1    2.54          322.50    0.0031
D (HMTTA)      0.38             1    0.38          47.91     0.0202
E (Br-sugar)   0.33             1    0.33          41.76     0.0231
F              0.001572         1    0.001572      0.20      0.6986
G              0.020            1    0.020         2.56      0.2505
Curvature      0.53             1    0.53          67.21     0.0146
Pure Error     0.016            2    0.007869
Cor Total      3.87             10

Step 5. Interpretation and Confirmation

After stepwise removal of the insignificant terms we obtain the definitive linear model:

Source        Sum of Squares   Df   Mean Square   F Value   P Value
Model         3.29             4    0.82          96.10     <0.0001
A             0.051            1    0.051         5.93      0.0059
C             2.54             1    2.54          296.13    <0.0001
D             0.38             1    0.38          43.99     0.0012
E             0.33             1    0.33          38.35     0.0016
Residual      0.0043           5    0.0085
Lack of fit   0.0027           3    0.0090        1.15      0.4967
Pure Error    0.016            2    0.0078
Curvature     0.53             1    0.53          61.71     0.0005
Cor Total     3.87             10

Is this linear model adequately modeling the response?
$y = \beta_0 + \beta_1 A + \beta_2 C - \beta_3 D + \beta_4 E + \varepsilon$

Step 6. Reiteration: Altering Factor Ranges

The contour plot directs us outside the investigated region → modify the factor ranges to explore a better experimental region.

name  factor                   old (-)   old (+)   new (-)   new (+)
A     pre-complexation time    0         60        60
B     reaction time            2         6         2
C     amount of Ag2CO3         1.5       3.8       3.3       5.5
D     amount of HMTTA          1.5       12.6      0.7       2.5
E     amount of Br-sugar       1.5       3         2.0       2.5
F     4Å molecular sieves      0         100       0
G     amount of solvent        0.5       1.5       0.5

Response Surface Modelling (RSM): an Overview

Different options when the linear model is not adequate. Many are extensions of the 2-level factorial design (2-level FD).

Characteristics                   Box-Behnken   CCD         CCF     3-level FD
Factor levels                     3             5           3       3
Number of experiments             12+3          14+3        14+3    27+3
Geometry of the explored space    spherical     spherical   cubic   cubic

Optimizing Glucuronidation Yield using CCD: Performing the Experiment

A = equiv of HMTTA, B = equiv of Ag2CO3, C = equiv of Br-sugar

exp   point type   A     B     C      Product yield   Residual SM
1     fact         0.7   3.7   2.1    82.8            11.4
2     fact         2.1   3.7   2.1    63.5            11.2
3     fact         0.7   5.1   2.1    81.6            5
4     fact         2.1   5.1   2.1    69.9            15.9
5     fact         0.7   3.7   2.4    87.5            12.4
6     fact         2.1   3.7   2.4    72.8            20.8
7     fact         0.7   5.1   2.4    85.1            4.8
8     fact         2.1   5.1   2.4    74.3            15.4
9     axial        0.2   4.4   2.25   62.1            3.6
10    axial        2.5   4.4   2.25   70.2            19.0
11    axial        1.4   3.3   2.25   73.9            21.6
12    axial        1.4   5.5   2.25   77.0            9.1
13    axial        1.4   4.4   2.0    71.9            14.4
14    axial        1.4   4.4   2.5    79.6            12.1
15    center       1.4   4.4   2.25   71.8            12.0
16    center       1.4   4.4   2.25   79.8            14.0
17    center       1.4   4.4   2.25   77.6            13.9
18    center       1.4   4.4   2.25   77.2            14.2
19    center       1.4   4.4   2.25   76.6            12.8
20    center       1.4   4.4   2.25   78.4            13.3

[Figure: cube showing the factorial, axial and center points.]
20 exps (100 mg scale)

Optimizing Glucuronidation Yield using CCD: Data Analysis and Model Building

source        SS        Df   MS        F        P
model         686.640   6    114.441   26.222   0.0001
A             293.242   1    293.242   67.190   0.0001
B             6.64122   1    6.641     1.522    0.2330
C             117.042   1    117.042   26.818   0.0001
A2            18.6096   1    18.610    4.264    0.0540
AB            16.5025   1    16.502    3.782    0.0680
A3            121.418   1    121.418   27.820   0.0001
residual      78.559    18   4.364
lack-of-fit   41.219    13   3.171     0.424    0.901
pure error    37.340    5    7.468
cor total     765.202   24

S = 2.0891, R2 = 0.897, R2adj = 0.863, PRESS = 146.590

Definitive coded model:
$\text{yield} = 76.91 - 9.58\,A + 0.70\,B + 2.57\,C - 0.75\,A^2 + 1.44\,AB + 2.51\,A^3$

Optimizing Glucuronidation Yield using CCD: Empirical Model Interrogation

Program optimization tools indicate the best conditions found and the confidence intervals.

Factor   Name       Level   Low Level   High Level
A        HMTTA      0.70    0.2         2.5
B        Ag2CO3     3.76    3.3         5.5
C        Br-sugar   2.42    2.0         2.5

           Prediction   SE Mean   95% CI low   95% CI high
P yield    86.5         1.34      83.71        89.33

Model validation
Qty phenol   in situ yield   isolated yield
1 g          86.0            80.6
1 g          87.2            81.0
3.5 g        85.7            80.0

Optimizing Glucuronidation Yield Using CCD: Conclusion

[Scheme: Koenigs-Knorr glucuronidation of the phenol with the pivaloyl-protected bromo sugar.]
Initial conditions: Ag2O 2.7 eq, Br-sugar 1 eq, mol sieves, 18 h, CH3CN. Isolated yield 3%.
Optimized conditions: Ag2CO3 3.76 eq, Br-sugar 2.4 eq, HMTTA 0.7 eq, 1 h, CH3CN. In situ yield 86.0%, isolated yield 80.5%.
Workflow: reagents screening (10 exp) → DoE factorial screening (11 exp) → DoE CCD optimization (20 exp)

Mechanistic Modelling: the Manifold Actions of HMTTA

[Scheme: complexation of Ag+ by the starting material and by HMTTA (Ag2O >> Ag2CO3); the HMTTA-bound Ag+ is labelled "active!".]
[Plot: % complex (SM-Ag+) vs. time (h), 0-25 h, for 3 + Ag2O + HMTTA, 3 + Ag2CO3 + HMTTA, 3 + Ag2O, 3 + Ag2CO3.]
[Plot: % complex (SM-Ag+) at 2 h vs. equiv of HMTTA (0-1.6), with Ag2CO3 and no Br-sugar; regions labelled "Ag+ dissolution / activation" and "Ag+ competitive complexation".]

Mechanistic Modelling: the Manifold Actions of HMTTA

Positive effects of HMTTA:
• competitive ligand for SM complexation
• activator of Ag+
Negative effect of HMTTA:
• excess favours the formation of an unwanted side product: HMTTA acts as a base (pKa = 9.23, 8.47, 5.36, 1.68) on the Br-sugar (-HBr), with consistent depletion of Br-sugar
[Scheme: elimination of HBr from the pivaloyl-protected bromo sugar.]

The postulated irreversible binding of starting material (SM) to Ag+ ions is indeed operative.
The presence of the tetramine additive (HMTTA) influences the complexation equilibria.
The relationship between complexation of SM and concentration of HMTTA is non-linear.
[Plot annotations: SM complexation > Ag+ activation; max Ag+ activation; max competitive binding to Ag+.]
F. Stazi, G. Palmisano, M. Turconi, S. Clini, and M. Santagostino, J. Org. Chem. 2004, 69, 1097-1103.

Scope and Limitation of the Methodology

[Scheme: glucuronidation (OGluc) of a series of phenols, including nitroaryl substrates and hydroxy- and methoxy-substituted benzaldehydes, with the bromo sugar donors + HMTTA 0.2-0.7 eq. Each product carries two values: % isolated yield under optimized conditions and % isolated yield under classical Koenigs-Knorr conditions. Values shown, in order of appearance: 71%, 20%, 0%, 80%, 3%, 79%, 0%, 88%, 54%, 0%, 0%, 86%, 0%, 85%, 74% (mix), 15%, 80%, 65%, 30%.]

Other Applications

Pd-Catalysed Cyanation of Aryl Bromides at Room Temperature
F. Stazi, G. Palmisano, M. Turconi, M. Santagostino, Tetrahedron Letters, 46 (2005) 1815-1818.
[Scheme: aryl bromide (R) → aryl nitrile (R).]
Initial conditions: 1% Pd2(dba)3·CHCl3, 2% [(tBu)3PH]BF4, Zn(CN)2 1.2 eq, wet DMF, 50 °C. 30% yield.
Modified conditions: 0.5% Pd2(dba)3·CHCl3, 1.4% [(tBu)3PH]BF4, Zn(CN)2 1.1 eq, NMP (0.1% water content), 5% Zn powder, RT. 75-98% yield.

Regioselective Alkylation of 3,4-Dihydroxybenzaldehyde (Unpublished Results)
[Scheme: alkylation of 3,4-dihydroxybenzaldehyde with RX, giving three products.]
Initial conditions: RX 1 eq, NaH 2.7 eq, DMF, 0-5 °C. Product yields: 30%, 15%, 5%.
Modified conditions: RX 1.5 eq, NaH 2.5 eq, KI 0.05 eq, TBAI 0.05 eq, DMF, 25 °C, optimized work-up. Product yields: 65%, 15%, 0%.
Modified conditions with different RX: 40-80%.

Summary and Conclusions

• DoE results in a set of experiments in which factors are varied at the same time, in an organized and systematic approach.
• A mathematical regression model is generated. This model is empirical and valid only within the studied factor range.
• A better understanding and control of the process are gained by interacting with the model.
• Non-statistical knowledge of the problem is needed for choosing factors and their levels, interpreting the results ...
“Using statistics is no substitute for thinking about the problem.” (Design and Analysis of Experiments, D.C. Montgomery)

Suggestion

If you find DoE applied to chemistry problems boring …
• Using DoE to Spend Less Time in The Traffic
• Screening Ingredients (for Homemade Bread) Most Efficiently with Two-Level Design of Experiment
• Applied DoE to Microwave Popcorn
and more and more…
By Mark J. Anderson, consultant, Stat-Ease, Inc., Minneapolis, MN

Acknowledgment

Prof. Giovanni Palmisano, Università dell’Insubria, Dipartimento di Chimica
Dr. Marco Santagostino, Boehringer-Ingelheim R&D Chemistry Research Centre
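
As a supplementary illustration of the screening design in Steps 2 to 4, the sketch below builds an 8-run, 7-factor two-level fractional factorial and estimates main effects from the eight factorial yields listed in Step 3. The generators D = AB, E = AC, F = BC, G = ABC are an assumption: they are not stated on the slides but were inferred from the Step 3 run table, whose high/low pattern they reproduce. The function and variable names are hypothetical. The effects are computed on raw yields, whereas the sums of squares in Step 4 are evidently on a different response scale, so only the signs and the ranking of the effects are comparable with the ANOVA.

```python
from itertools import product

# Assumed generators, inferred from the Step 3 run table (not stated on the slides).
GENERATORS = {"D": "AB", "E": "AC", "F": "BC", "G": "ABC"}

# Product yields (%) of the 8 factorial runs in Step 3 (center points omitted).
YIELDS = [14.7, 19.5, 24.4, 11.2, 34.2, 83.2, 56.5, 55.4]


def fractional_factorial_2_7_4():
    """Return the 8 coded runs (-1/+1) of the design as a list of dicts."""
    runs = []
    # product() varies its last element fastest; unpacking as (c, b, a)
    # makes A alternate fastest, as in the Step 3 run order.
    for c, b, a in product((-1, 1), repeat=3):
        run = {"A": a, "B": b, "C": c}
        for name, word in GENERATORS.items():
            level = 1
            for letter in word:
                level *= run[letter]  # product of the parent columns
            run[name] = level
        runs.append(run)
    return runs


def main_effects(runs, responses):
    """Main effect per factor: mean response at (+) minus mean at (-)."""
    effects = {}
    for name in runs[0]:
        high = [y for r, y in zip(runs, responses) if r[name] == 1]
        low = [y for r, y in zip(runs, responses) if r[name] == -1]
        effects[name] = sum(high) / len(high) - sum(low) / len(low)
    return effects


if __name__ == "__main__":
    design = fractional_factorial_2_7_4()
    effects = main_effects(design, YIELDS)
    # Print factors from largest to smallest absolute effect.
    for name in sorted(effects, key=lambda k: -abs(effects[k])):
        print(f"{name}: {effects[name]:+.2f}")
```

On these data the largest raw-yield effect is that of C (amount of Ag2CO3), followed by D (HMTTA, negative), E (Br-sugar) and A, which is the same set of terms retained in the linear model of Step 5. In a Resolution III design each main effect is aliased with two-factor interactions, so such estimates are a screening aid rather than a final model.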