Table S1: Physicochemical descriptors used

| Descriptor | Index | Description |
|---|---|---|
| Num_Rings | 1 | Number of rings in the molecule |
| Num_AromaticRings | 2 | Number of aromatic rings |
| Num_AliphaticRings | 3 | Number of non-aromatic rings (2 - 1) |
| Num_RingAssemblies | 4 | Number of fragments after removing non-ring bonds |
| Num_Chains | 5 | Unbranched chains needed to cover all the non-ring bonds in the molecule |
| Num_ChainAssemblies | 6 | Number of fragments after removing rings |
| Num_BridgeBonds | 7 | Bonds in bridgehead systems (any rings that share more than one bond in common) |
| Num_TerminalRotomers | 8 | A non-terminal sp3 atom connected to three terminal atoms of the same type, or a non-terminal sp2 atom connected to two terminal atoms of the same type |
| Heavy_Atom_Bonds | 9 | Bonds between heavy atoms |
| Hydrogen_Atom_Bonds | 10 | Bonds to hydrogen atoms |
| SingleBonds_Frac | 11 | Fraction of total bonds that are single |
| DoubleBonds_Frac | 12 | Fraction of total bonds that are double |
| TripleBonds_Frac | 13 | Fraction of total bonds that are triple |
| BridgeBonds_Frac | 14 | Fraction of total bonds that are bridge bonds |
| RingBonds_Frac | 15 | Fraction of total bonds in ring systems |
| Aliphatic_Ringbonds_Frac | 16 | Fraction of total bonds in a non-aromatic ring |
| AromaticBonds_Frac | 17 | Fraction of total bonds in an aromatic ring |
| StereoBonds_Frac | 18 | Fraction of total bonds that are marked CisBondStereo, TransBondStereo, or UnknownBondStereo |
| RotatableBonds_Frac | 19 | Fraction of total bonds that are single bonds between heavy atoms that are both not in a ring and not terminal (special case: amide C-N bonds are not rotatable) |
| Num_PositiveAtoms | 20 | Number of atoms with a positive charge |
| Num_NegativeAtoms | 21 | Number of atoms with a negative charge |
| Num_H_Acceptors | 22 | Heteroatoms (O, N, S, or P) with one or more lone pairs, excluding atoms with positive formal charges, amide and pyrrole-type N, and aromatic O and S atoms in heterocyclic rings |
| Num_H_Donors | 23 | Heteroatoms (O, N, S, or P) with one or more attached H atoms |
| Num_Halogens | 24 | Sum of F, Cl, I and Br |
| Num_SP3_Carbons | 25 | Number of SP3 hybridized carbon atoms |
| Num_SP2_Carbons | 26 | Number of SP2 hybridized carbon atoms |
| Num_SP_Carbons | 27 | Number of SP hybridized carbon atoms |
| SP3_Carbon_Fraction | 28 | Fraction of total carbons that is SP3 hybridized |
| SP2_Carbons_Fraction | 29 | Fraction of total carbons that is SP2 hybridized |
| SP_Carbon_Fraction | 30 | Fraction of total carbons that is SP hybridized |
| Carbon_fraction | 31 | Fraction of atoms that is C |
| Hydrogen_fraction | 32 | Fraction of atoms that is H |
| Nitrogen_fraction | 33 | Fraction of atoms that is N |
| Oxygen_fraction | 34 | Fraction of atoms that is O |
| Heteroatom_fraction | 35 | Fraction of atoms that is O, N, S, or P |
| Sulphur_fraction | 36 | Fraction of atoms that is S |
| Phosphorus_fraction | 37 | Fraction of atoms that is P |
| Halogen_fraction | 38 | Fraction of atoms that is F, Cl, I, or Br |
| OtherAtom_fraction | 39 | Fraction of atoms that is not one of the above |
| StereoAtom_Fraction | 40 | Fraction of atoms marked EvenAtomStereo, OddAtomStereo, or UnknownAtomStereo |
| PositiveAtom_Fraction | 41 | Fraction of atoms with a positive charge |
| NegativeAtom_Fraction | 42 | Fraction of atoms with a negative charge |
| H_Acceptors_Fraction | 43 | Fraction of atoms that is a hydrogen bond acceptor |
| H_Donors_fraction | 44 | Fraction of atoms that is a hydrogen bond donor |
| Molecular_Weight | 45 | Molecular weight (calculated in Molsoft ICM [76]) |
| LogP | 46 | Average value of ACD LogP, Accelrys AlogP and Molsoft ICM logP [76,77,79] |
| LogD | 47 | Average value of ACD LogD, Accelrys LogD at pH 7.4 [77,79] |
| Solubility | 48 | Average value of Accelrys solubility and Molsoft solubility [76,77] |
| FormalCharge | 49 | Formal charge of the molecule at pH 7.4 (ionized in Molsoft ICM [76]) |
| Molecular_Volume | 50 | Molecular volume (calculated in Molsoft ICM [76]) |
| Molecular_SurfaceArea | 51 | Molecular surface area (2D method) |
| Molecular_PolarSurfaceArea | 52 | Polar surface area (Molsoft ICM [76]) |
| Molecular_PolarSurfaceArea_Fraction | 53 | Fraction of total surface area that is polar |
| Molecular_SASA | 54 | Solvent accessible surface area |
| Molecular_PolarSASA_Fraction | 55 | Fraction of SASA that is polar |
| Lipinski_Pass | 56 | N_Count + O_Count <= 10, MW <= 500, H_donors <= 5, AlogP <= 5 |
| Rigidity_Index | 57 | (AromaticBonds_Frac + (1 - RotatableBonds_Frac) + Aliphatic_Ringbonds_Frac + (1 - SingleBonds_Frac) + DoubleBonds_Frac + TripleBonds_Frac + BridgeBonds_Frac) / 7 |
| drugLikeness | 58 | Drug-likeness (calculated in Molsoft ICM [76]) |
| Small_Molecule | 59 | Binary flag for molecule class as calculated in ChEMBL |
| Biological | 60 | Binary flag for molecule class as calculated in ChEMBL |
| Peptide | 61 | Binary flag for molecule class as calculated in ChEMBL |
| Organic_Non_Peptidic | 62 | Binary flag for molecule class as calculated in ChEMBL |
| Inorganic | 63 | Binary flag for molecule class as calculated in ChEMBL |
| Cmp_Acid | 64 | Binary flag for ACD molecular species |
| Cmp_Base | 65 | Binary flag for ACD molecular species |
| Cmp_Neutral | 66 | Binary flag for ACD molecular species |
| Cmp_Zwitterion | 67 | Binary flag for ACD molecular species |
| Cmp_Species_Undefined | 68 | ACD molecular species unknown / cannot be calculated |
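
Two of the descriptors in Table S1 (Lipinski_Pass, index 56, and Rigidity_Index, index 57) are defined by explicit formulas rather than by a software package, so they can be illustrated directly. The following is a minimal sketch, not the authors' implementation. The class `BondFractions` and the functions `rigidity_index` and `lipinski_pass` are hypothetical names introduced here. The sketch assumes that the lost operator in the extracted formula is a minus sign, giving (1 - SingleBonds_Frac), and that Lipinski_Pass requires all four listed conditions to hold at once; the table lists the conditions but does not state how they are combined.

```python
from dataclasses import dataclass


@dataclass
class BondFractions:
    """Bond-fraction descriptors from Table S1 (hypothetical container)."""
    aromatic: float        # AromaticBonds_Frac (index 17)
    rotatable: float       # RotatableBonds_Frac (index 19)
    aliphatic_ring: float  # Aliphatic_Ringbonds_Frac (index 16)
    single: float          # SingleBonds_Frac (index 11)
    double: float          # DoubleBonds_Frac (index 12)
    triple: float          # TripleBonds_Frac (index 13)
    bridge: float          # BridgeBonds_Frac (index 14)


def rigidity_index(f: BondFractions) -> float:
    """Rigidity_Index (index 57): mean of the seven terms given in Table S1."""
    terms = [
        f.aromatic,
        1.0 - f.rotatable,
        f.aliphatic_ring,
        1.0 - f.single,  # assumption: the source formula reads (1 - SingleBonds_Frac)
        f.double,
        f.triple,
        f.bridge,
    ]
    return sum(terms) / 7


def lipinski_pass(n_count: int, o_count: int, mol_weight: float,
                  h_donors: int, alogp: float) -> bool:
    """Lipinski_Pass (index 56).

    Assumption: the flag is set only when all four conditions hold.
    """
    return (
        n_count + o_count <= 10
        and mol_weight <= 500
        and h_donors <= 5
        and alogp <= 5
    )


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the data set.
    example = BondFractions(aromatic=0.5, rotatable=0.0, aliphatic_ring=0.0,
                            single=0.5, double=0.0, triple=0.0, bridge=0.0)
    print(rigidity_index(example))
    print(lipinski_pass(n_count=2, o_count=3, mol_weight=350.0,
                        h_donors=2, alogp=3.1))
```

Because every term lies between 0 and 1, the index computed this way also lies between 0 and 1, with larger values indicating a more rigid bond pattern under the table's definition.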