In memory of Rich Green, 1947 - 2001
An Outstanding Medicinal Chemist and Colleague

“Making Lead Discovery less complex?”
Mike Hann, Andrew Leach & Gavin Harper
Computational Chemistry and Informatics Unit
GlaxoSmithKline Medicines Research Centre
Gunnels Wood Rd, Stevenage SG1 2NY
email: MMH1203@gsk.com

Subtitle: Molecular Recognition versus the gambling game that we play in using HTS and libraries to discover new leads

Libraries: have they been successful at revolutionising the drug discovery business?
Despite some successes, it is clear that the high throughput synthesis of libraries and the HTS paradigms have not delivered the results that were initially anticipated.
Why?
– immaturity of the technology
– the inability to make the right types of molecules with the technology
– lack of understanding of what the right types of molecule to make actually are (drug likeness, Lipinski, etc.)

An additional reason, exemplified by a very simple model of Molecular Recognition
– Define a linear pattern of +’s and -’s to represent the recognition features of a binding site
– Vary the length/complexity of a linear binding site as +’s and -’s
– Vary the length/complexity of a linear ligand up to that of the binding site
– Calculate probabilities of number of matches as ligand complexity varies

Example for binding site of 9 features:
Feature position:        1 2 3 4 5 6 7 8 9
Binding site features:   - - + - + - - + +
Ligand mode 1 / mode 2:  + + + -   (positions under the binding site not recoverable)

Probabilities of ligands of varying complexity (i.e. number of features) matching a binding site of complexity 12
As the ligand/receptor match becomes more complex, the probability of any given molecule matching falls to zero, i.e. there are many more ways of getting it wrong than right!
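The counting behind this model can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' own code: it assumes a ligand feature matches a site feature when the two signs are opposite (under uniformly random features the probabilities would be the same if a match meant equal signs), that every ligand of a given length is equally likely, and that a ligand may be placed at any contiguous position along the site. It uses the 9-feature example site rather than the complexity-12 site of the plot, and the names count_modes and match_distribution are hypothetical.

```python
from itertools import product

def count_modes(site, ligand):
    """Count the alignments at which every ligand feature is opposite
    in sign to the site feature it sits on (assumed matching rule)."""
    k = len(ligand)
    return sum(
        all(l != s for l, s in zip(ligand, site[i:i + k]))
        for i in range(len(site) - k + 1)
    )

def match_distribution(site, k):
    """Fraction of all 2**k ligands of length k that match the site
    in exactly m ways, returned as {m: fraction}."""
    counts = {}
    for ligand in product("+-", repeat=k):
        m = count_modes(site, ligand)
        counts[m] = counts.get(m, 0) + 1
    total = 2 ** k
    return {m: c / total for m, c in sorted(counts.items())}

SITE = "--+-+--++"  # the 9-feature binding site from the example

for k in range(2, len(SITE) + 1):
    dist = match_distribution(SITE, k)
    p_any = 1.0 - dist.get(0, 0.0)   # at least one matching mode
    p_unique = dist.get(1, 0.0)      # exactly one matching mode
    print(f"ligand length {k}: P(any match) = {p_any:.4f}, "
          f"P(unique mode) = {p_unique:.4f}")
```

For a ligand as long as the site there is only one placement and only one of the 2**9 = 512 possible ligands fits it, which is the "many more ways of getting it wrong than right" point in numbers.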
[Figure: probability (0 to 1) against complexity of ligand (2 to 12). Curves: match any, 1 match, 2 matches, 3 matches, 4 matches, 5 matches, 6 matches, 7 matches, 8 matches, 9 matches, 10 matches, 11 matches.]

The effect of potency
[Figure: two panels, probability (0 to 1) against ligand complexity (2 to 12). Curves: probability of measuring binding; probability of matching just one way; probability of useful event (unique mode).]
P (useful event) = P(measure binding) x P(ligand matches)

[Figure: probability of useful event (unique mode) against ligand complexity (2 to 12), annotated as follows.]
– Too simple. Low probability of measuring affinity even if there is a unique mode
– Optimal. But where is it for any given system?
– Too complex. Low probability of finding lead even if it has high affinity

Limitations of the model
– Linear representation of complex events
– No chance for mismatches, i.e. harsh model
– No flexibility
– Only + and - considered
But the characteristics of any model will be the same
[Figure: probability of useful event against ligand complexity (2 to 12), as above.]

Real data to support this hypothesis!!
Leads vs Drugs
Data taken from W.
Sneader’s book “Drug Prototypes and their exploitation”
– Converted to Daylight Database and then profiled with ADEPT
– 480 drug case histories in the following plots
[Figure: ADEPT plots. Legend: Sneader Lead, Sneader Drug, WDI.]

Change in MW on going from Lead to Drug for 470 drugs
[Figure: change in MW on going from Sneader lead to drug (-300 to 400) against MW of Sneader drugs (0 to 800). Average MW increase = 42.]

ADEPT plots for WDI & a variety of GW libraries
[Figure: six panels, each labelled WDI.]
Molecules in libraries are still even more complex than WDI drugs, let alone Sneader Leads

In terms of numbers
Average property values for the Sneader lead set, average change on going to Sneader drug set and percentage change.

Property    Av (leads)   Change    %
# arom      1.3          0.2**     15
ClogP       1.9          0.5**     26
CMR         7.6          1.0**     14.5
# HBA       2.2          0.3**     14
# HBD       0.85         -0.05+    (4)
# heavy     19.0         3.0**     16
MW          272          42.0**    15
MV          289          38.0**    13
# Rot B     3.5          0.9**     23

AstraZeneca data similar, using hand picked data from literature
– AZ increases typically even larger (because of data picking?)

Catch 22 problem
[Figure: probability against ligand complexity (2 to 12).]
– We are dealing with probabilities, so increasing the number of samples assayed will increase the number of hits (=HTS).
– We have been increasing the number of samples by making big libraries (=combichem)
– And to make big libraries you have to have many points of diversity
– Which leads to greater complexity
– Which decreases the probability of a given molecule being a hit
Catch 21

Concentration as the escape route
Screen less complex molecules to find more hits
– Less potent but higher chance of getting on to the success landscape
– Opportunity for medicinal chemists to then optimise by adding back complexity and properties
Need for it to be the right sort of molecules
– the Mulbits (Multiple Bits) approach
– Mulbits are molecules of MW < 150 and highly soluble.
– Screen at up to 1 mM

Extreme example from 5 years ago - Thrombin:
– Screen preselected (in silico) basic mulbits in a Proflavin displacement assay known to be specific for the P1 pocket.

Thrombin Mulbit to “drug”
[Structure: 2-amino imidazole.]
2-Amino imidazole (5 mM), as the sulphate, showed 30% displacement of Proflavin (18 µM) from Thrombin (10 µM) (cf. Benzamidine, which at 5 mM shows 70% displacement under similar conditions).
Absorbance at 466 nm relative to that at 444 nm was used as the measure of the amount of Proflavin displaced.
[Structure: elaborated inhibitor.] Thrombin IC50 = 4 µM (15 min pre-incubation; for assay conditions see reference 23)

Related literature examples of Mulbits type methods
– Needles method in use at Roche: Boehm, H-J.; et al. Novel Inhibitors of DNA Gyrase: 3D Structure Based Biased Needle Screening, Hit Validation by Biophysical Methods, and 3D Guided Optimization. A Promising Alternative to Random Screening. J. Med. Chem., 2000, 43(14), 2664-2674.
– SAR by NMR method in use at Abbott: Hajduk, P. J.; Meadows, R. P.; Fesik, S. W. Discovering high-affinity ligands for proteins. Science, 1997, 278(5337), 497-499.
– Ellman method at Sunesis: Maly, D. J.; Choong, I. C.; Ellman, J. A. Combinatorial target-guided ligand assembly: identification of potent subtype-selective c-Src inhibitors. Proc. Natl. Acad. Sci. U. S. A., 2000, 97(6), 2419-2424.

In conclusion
Lipinski etc. does not go far enough in directing us to leads.
We have provided a model which explains why.
“Everything should be made as simple as possible but no simpler.” Einstein
– Simple is a relative, not absolute, term: where is that optimal peak in the plot for each target?
– Simple does not mean easy!!
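The question of where the optimal peak lies can be explored with a sketch of the relation P (useful event) = P(measure binding) x P(ligand matches) used in the plots. The following Python is an illustration only. The matching term reuses the +/- model (opposite signs assumed to match, all ligands of a length equally likely) on the 9-feature example site. The detection term is a pure assumption: the deck does not give a functional form for P(measure binding), so a logistic curve in ligand complexity is used as a stand-in, and its midpoint and steepness parameters are hypothetical. All function names are hypothetical too.

```python
from itertools import product
from math import exp

SITE = "--+-+--++"  # 9-feature example binding site

def p_unique_match(site, k):
    """Fraction of the 2**k ligands of length k that match the site
    at exactly one alignment (opposite signs assumed to match)."""
    hits = 0
    for ligand in product("+-", repeat=k):
        modes = sum(
            all(l != s for l, s in zip(ligand, site[i:i + k]))
            for i in range(len(site) - k + 1)
        )
        hits += (modes == 1)
    return hits / 2 ** k

def p_measure_binding(k, midpoint=5.0, steepness=1.5):
    """ASSUMPTION: detectability rises logistically with ligand
    complexity k; midpoint and steepness are illustrative values."""
    return 1.0 / (1.0 + exp(-steepness * (k - midpoint)))

def p_useful_event(site, k, midpoint=5.0, steepness=1.5):
    """P(useful event) = P(measure binding) x P(ligand matches uniquely)."""
    return p_measure_binding(k, midpoint, steepness) * p_unique_match(site, k)

curve = {k: p_useful_event(SITE, k) for k in range(2, len(SITE) + 1)}
best_k = max(curve, key=curve.get)
for k, p in curve.items():
    print(f"ligand complexity {k}: P(useful event) = {p:.4f}")
print("peak at ligand complexity", best_k)
```

Shifting midpoint mimics a change in assay sensitivity or screening concentration: a lower midpoint means weaker, simpler ligands become detectable, which is the escape route argued for in the Mulbits slides.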
Thanks: Rich Green, Giampa Bravi, Andy Brewster, Robin Carr, Miles Congreve, Darren Green, Brian Evans, Albert Jaxa-Chamiec, Duncan Judd, Xiao Lewell, Mika Lindvall, Steve McKeown, Adrian Pipe, Nigel Ramsden, Derek Reynolds, Barry Ross, Nigel Watson, Steve Watson, Malcolm Weir, John Bradshaw, Colin Grey, Vipal Patel, Sue Bethell, Charlie Nichols, Chun-wa Chun and Terry Haley