EMBL-EBI
Chemistry & the PDB
MSDchem
Primary Developer: Dimitris Dimitropoulos
The chemical database
MSDchem ligand dictionary
• A complete, clean, up-to-date collection of all the chemical species and small molecules in the PDB
• A ligand in MSDchem is a complete, distinct stereoisomer of a chemical compound, defined by:
  - Atoms and element types
  - Bonds and bond orders
  - Stereo configuration of atoms and bonds in cases of stereoisomers (R/S – E/Z)
• Atom names and coordinates are not fundamental properties
Role in the MSD database
• An integral component in the core of the MSD database
• Relational reference from entities where a molecule or atom name is used in the PDB (protein residues and atoms)
It is not possible for a coordinate line such as:
HETATM 4342 C2 PLA 86 14.227 11.195 -8.256 1.00 67.95 C
to be loaded if the “PLA” ligand is not defined or does not include a “C2” atom.
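The referential check described above can be sketched in a few lines of Python. This is only an illustration, not the MSD loader: the function name `check_hetatm` and the atom names listed for PLA are made up, and the line is split on whitespace because the slide's example is not column-aligned (real PDB files use fixed columns).

```python
# Toy ligand dictionary: ligand code -> set of atom names.
# The atom names listed for PLA are illustrative, not the real MSDchem entry.
LIGAND_ATOMS = {
    "PLA": {"C1", "C2", "C3", "O1"},
}


def check_hetatm(line, ligand_atoms=LIGAND_ATOMS):
    """Return None if the record could be loaded, else a reason string."""
    fields = line.split()
    if len(fields) < 4 or fields[0] not in ("ATOM", "HETATM"):
        return "not a coordinate record"
    atom_name, ligand_code = fields[2], fields[3]
    # The ligand must exist in the dictionary ...
    if ligand_code not in ligand_atoms:
        return f"ligand {ligand_code!r} is not defined"
    # ... and must contain an atom with this name.
    if atom_name not in ligand_atoms[ligand_code]:
        return f"ligand {ligand_code!r} has no atom {atom_name!r}"
    return None


line = "HETATM 4342 C2 PLA 86 14.227 11.195 -8.256 1.00 67.95 C"
problem = check_hetatm(line)  # None: PLA is defined and has a C2 atom
```

In a relational database the same rule would normally be enforced by a foreign key rather than by application code.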
Chemistry and PDB
• Eliminate chemical inconsistencies from new PDB entries
  - The structure and derived properties of a ligand apply automatically to the residues and bound molecules that reference it
  - The basic structure is carefully determined during curation, and a rich set of derived attributes is calculated for each ligand
  - Graph isomorphism is applied to check the consistency of the PDB, taking stereo-configuration into account
• Old legacy PDB entries are chemically “corrected” when loaded into the MSD database
  - In thousands of cases errors are identified and corrected, most often involving inconsistent naming or a different stereo-configuration
• Exchanged in cooperation with RCSB and the wwPDB
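As a rough illustration of a stereo-aware graph isomorphism check, the sketch below matches a dictionary ligand against the atoms found in an entry. It is a naive backtracking search, not the MSD implementation. Stereo-configuration is simplified to an R/S label stored on each atom, and the atom names (for a thioalanine-like skeleton) are hypothetical.

```python
def find_mapping(atoms_a, bonds_a, atoms_b, bonds_b):
    """Return a dict mapping atom names of A onto B, or None if no match.

    atoms: {name: (element, stereo_label_or_None)}
    bonds: {frozenset({name1, name2}): bond_order}
    """
    if len(atoms_a) != len(atoms_b) or len(bonds_a) != len(bonds_b):
        return None
    order = list(atoms_a)

    def extend(mapping):
        if len(mapping) == len(order):
            return dict(mapping)
        a = order[len(mapping)]
        used = set(mapping.values())
        for b in atoms_b:
            # Element and stereo label must agree.
            if b in used or atoms_a[a] != atoms_b[b]:
                continue
            # Bonds (or their absence) to atoms mapped so far must agree.
            if all(bonds_a.get(frozenset((a, p))) == bonds_b.get(frozenset((b, q)))
                   for p, q in mapping.items()):
                mapping[a] = b
                found = extend(mapping)
                if found is not None:
                    return found
                del mapping[a]
        return None

    return extend({})


def make_bonds(*triples):
    return {frozenset((x, y)): bond_order for x, y, bond_order in triples}


# Dictionary ligand (hypothetical atom names), stereocentre CA labelled S.
dict_atoms = {"CB": ("C", None), "CA": ("C", "S"), "N": ("N", None),
              "C": ("C", None), "O": ("O", None), "S": ("S", None)}
dict_bonds = make_bonds(("CB", "CA", 1), ("CA", "N", 1), ("CA", "C", 1),
                        ("C", "O", 1), ("C", "S", 2))

# Same molecule as found in an entry, but with inconsistent atom names.
entry_atoms = {"C1": ("C", None), "C2": ("C", "S"), "N1": ("N", None),
               "C3": ("C", None), "O1": ("O", None), "S1": ("S", None)}
entry_bonds = make_bonds(("C1", "C2", 1), ("C2", "N1", 1), ("C2", "C3", 1),
                         ("C3", "O1", 1), ("C3", "S1", 2))

renamed = find_mapping(dict_atoms, dict_bonds, entry_atoms, entry_bonds)

# Same skeleton with the opposite configuration at the stereocentre.
flipped_atoms = dict(entry_atoms, C2=("C", "R"))
flipped = find_mapping(dict_atoms, dict_bonds, flipped_atoms, entry_bonds)
```

Here `renamed` gives the mapping needed to correct the atom names, while `flipped` is `None` because the entry is a different stereoisomer. The search is exponential in the worst case; a production system would use a dedicated graph-matching algorithm.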
More than just the PDB codes
In MSDchem, all variants are modelled as separate, inter-related ligands and the appropriate one is referenced, whereas in the PDB:
• No distinction is made between ribo- and deoxyribonucleotides (all are identified with the same residue names, i.e. A, C, G, T, U, I)
• Modified nucleic acids are given as +A etc., regardless of the modification
• No distinction is made between different topological variants (12 different variants of HIS can be found in the PDB)
Derived information
External scientific software (CACTVS, VEGA, CORINA, ACD-labs, CCP4, OELIB), together with in-house development, has been used to derive:
• Stereochemistry (R/S – E/Z)
  - DCF: C4' R, C3' S, C1' R
  - DCM: C4' S, C3' R, C1' S
• SMILES and detailed GIFs
• Systematic IUPAC names
Example: THIOALANINE (ALT)
  - SMILES: CC(N)C(O)=S (non-stereo), C[C@H](N)C(O)=S (stereo)
  - Systematic name: (2S)-2-aminopropanethioic O-acid
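Once R/S descriptors are stored per atom, two ligands with the same skeleton can be compared directly. The sketch below uses the DCF and DCM descriptors from the slide; the function name and the wording of the result categories are assumptions made for illustration, and the comparison only makes sense when both ligands have the same connectivity and atom naming.

```python
def compare_stereo(a, b):
    """Compare two {atom_name: 'R' or 'S'} descriptor tables."""
    if a.keys() != b.keys():
        return "not comparable"
    differing = [atom for atom in a if a[atom] != b[atom]]
    if not differing:
        return "same configuration"
    if len(differing) == len(a):
        return "all centres inverted"
    return "some centres inverted"


DCF = {"C4'": "R", "C3'": "S", "C1'": "R"}
DCM = {"C4'": "S", "C3'": "R", "C1'": "S"}

relation = compare_stereo(DCF, DCM)
```

For DCF and DCM every listed centre differs, so `relation` is `"all centres inverted"`, which is why the two need separate dictionary entries.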
Search options
• By ligand code
• By ligand name or synonym
• By formula or formula range
• By non-stereo substructure
• By non-stereo superstructure
• By exact stereo or non-stereo structure
• By fingerprint similarity
• By fragment expression
JME molecule editor (screenshot callouts):
• Activate JME molecule editor
• Clear structure
• Delete atom
• Change atom type after drawing bonds
• Click when complete
The JME editor generates a SMILES string that is entered into the search.
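The fingerprint similarity option listed above can be illustrated with the Tanimoto coefficient, a common choice for comparing structural fingerprints. The slides do not say which measure or fingerprint MSDchem uses, so the measure, the ligand codes, and the bit positions below are all assumptions.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of set-bit indices."""
    a, b = set(bits_a), set(bits_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def most_similar(query, library, threshold=0.5):
    """Return (score, code) pairs at or above the threshold, best first."""
    scored = ((tanimoto(query, fp), code) for code, fp in library.items())
    return sorted((hit for hit in scored if hit[0] >= threshold), reverse=True)


# Hypothetical fingerprints: ligand code -> indices of the bits that are set.
library = {
    "AAA": {1, 2, 3, 8},
    "BBB": {2, 3, 4},
    "CCC": {9, 10},
}
hits = most_similar({1, 2, 3}, library)
```

With these toy fingerprints, AAA scores 3/4 and BBB scores 2/4, while CCC shares no bits with the query and is filtered out.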
Search for ligand structures containing 3-chloro-phenol
Results (screenshot callouts):
• Click to get details for EAA
• Get PDB entries and bound molecule instances containing 3-chloro-phenol
EAA details (3-chloro-phenol substructure):
• Viewing and saving options
• Get the PDB entries that include EAA
• Get the bound molecule instances and site interaction details
PDB residue KWT
<chemComp>
<code>KWT</code>
<name>(1S,6BR,9AS,11R,11BR)-9A,11B-DIMETHYL-1-[(METHYLOXY)METHYL]-3,6,9-TRIOXO-1,6,6B,7,8,9,9A,10,11,11B-DECAHYDRO-3H-FURO[4,3,2-DE]INDENO[4,5-H][2]BENZOPYRAN-11-YL ACETATE</name>
<nAtomsAll>55</nAtomsAll>
<nAtomsNh>31</nAtomsNh>
<overallCharge>0</overallCharge>
<stereoSmiles>COC[C@H]1OC(=O)c2coc3C(=O)C4=C([C@@H](C[C@@]5(C)[C@H]4CCC5=O)OC(C)=O)[C@]1(C)c23</stereoSmiles>
<systematicName>(1S,6bR,9aS,11R,11bR)-1-(methoxymethyl)-9a,11b-dimethyl-3,6,9-trioxo-1,6,6b,7,8,9,9a,10,11,11b-decahydro-3H-furo[4,3,2-de]indeno[4,5-h]isochromen-11-yl acetate</systematicName>
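A record like this can be read with Python's standard XML parser. The sketch below is a minimal example: the excerpt on the slide has no closing tag, so `</chemComp>` is added here to make it parse, the long name fields are left out for brevity, and `nAtomsNh` is assumed to mean the number of non-hydrogen atoms.

```python
import xml.etree.ElementTree as ET

# Excerpt of the KWT record, closed with </chemComp> so that it parses.
RECORD = """<chemComp>
<code>KWT</code>
<nAtomsAll>55</nAtomsAll>
<nAtomsNh>31</nAtomsNh>
<overallCharge>0</overallCharge>
<stereoSmiles>COC[C@H]1OC(=O)c2coc3C(=O)C4=C([C@@H](C[C@@]5(C)[C@H]4CCC5=O)OC(C)=O)[C@]1(C)c23</stereoSmiles>
</chemComp>"""

INT_FIELDS = {"nAtomsAll", "nAtomsNh", "overallCharge"}


def parse_chem_comp(xml_text):
    """Turn a <chemComp> element into a flat dict, converting counts to int."""
    root = ET.fromstring(xml_text)
    record = {}
    for child in root:
        text = (child.text or "").strip()
        record[child.tag] = int(text) if child.tag in INT_FIELDS else text
    return record


kwt = parse_chem_comp(RECORD)
# Assumption: nAtomsNh counts non-hydrogen atoms, so the difference is hydrogens.
hydrogens = kwt["nAtomsAll"] - kwt["nAtomsNh"]
```

Under that assumption the record describes 31 heavy atoms and 24 hydrogens.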
Formula-fragment expression search
Fragment expression
Example: search for ligands with furan rings but without any saturated carbon rings (cyclobutane, cyclopropane, cyclohexane)
Formula expression
Example: search for ligands with more than 10 oxygens and no nitrogens or sulphurs
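The formula expression example can be mimicked with a small filter over element counts. The slides do not show the actual expression syntax used by MSDchem, so the `minimum`/`maximum` arguments and the formulas below are purely illustrative.

```python
import re


def parse_formula(formula):
    """Parse a formula such as 'C23 H24 O8' into {'C': 23, 'H': 24, 'O': 8}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


def matches(formula, minimum=None, maximum=None):
    """True if every element count lies within the given bounds."""
    counts = parse_formula(formula)
    for element, low in (minimum or {}).items():
        if counts.get(element, 0) < low:
            return False
    for element, high in (maximum or {}).items():
        if counts.get(element, 0) > high:
            return False
    return True


# "More than 10 oxygens and no nitrogens or sulphurs"
query = {"minimum": {"O": 11}, "maximum": {"N": 0, "S": 0}}

# Illustrative formulas only.
formulas = ["C23 H24 O8", "C24 H42 O21", "C10 H16 N5 O13 P3"]
selected = [f for f in formulas if matches(f, **query)]
```

Only the second formula passes: the first has too few oxygens and the third contains nitrogen.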