Supplementary Methods
Adaptive Smith-Waterman Residue Match Seeding for
Protein Structural Alignment
Christopher M. Topham1-3,*, Mickaël Rouquier1-3, Nathalie Tarat1-3
and Isabelle André1-3,*
1 Université de Toulouse; INSA, UPS, INP; LISBP, 135 Avenue de Rangueil, F-31077 Toulouse, France
2 CNRS, UMR5504, F-31400 Toulouse, France
3 INRA, UMR792 Ingénierie des Systèmes Biologiques et des Procédés, F-31400 Toulouse, France
*Authors for correspondence: e-mail christopher.topham@insa-toulouse.fr; isabelle.andre@insa-toulouse.fr
Running title: Protein structural alignment
Supplementary method S1:
The PANORAMA protein structure analysis program is integrated with a database of amino
acid, nucleotide, sugar, lipid and small molecule residue library files, derived from atom
connectivity data in the re-mediated version 3 RCSB PDB (Henrick et al. 2008) chemical
component dictionary (http://www.wwpdb.org/ccd.html). The library files contain additional
data defining molecular geometry at central heavy atom positions and chemical property
annotations, including hydrogen bond donor/acceptor atom status, and aromatic ring atom
assignments, obtained by the application of Hückel’s 4n + 2 π-electron rule.
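The 4n + 2 test itself reduces to simple arithmetic, sketched below. How PANORAMA counts the π-electrons contributed by each ring atom is not described here, so the count is taken as an input; the function name is hypothetical.

```python
def satisfies_huckel(pi_electrons: int) -> bool:
    """Return True if the pi-electron count equals 4n + 2 for an integer n >= 0.

    The count itself is assumed to have been derived elsewhere (for example
    from the ring atoms' hybridisation and lone pairs); that step is not
    part of this sketch.
    """
    return pi_electrons >= 2 and (pi_electrons - 2) % 4 == 0
```

A six-membered ring with six π-electrons (n = 1) passes the test, whereas a ring with four π-electrons does not.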
PANORAMA first searches for inter-residue covalent bonds in macromolecular co-ordinate
data sets using generous distance and valence bond angle range cut-offs. Individual molecules
are then identified as assemblies of covalently-bonded atoms, and classed according to type
(protein, nucleic acid, polysaccharide etc.). Small molecule ligands are operationally defined
as molecules with ≥ 6 and ≤ 100 atoms. Once the complete bond connectivity has been
determined, the molecular geometry of atoms at residue-junctions is reset, and the number of
attached hydrogen atoms updated accordingly.
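Grouping atoms into molecules from a bond list is a connected-components problem. The following is a minimal sketch using a union-find structure, not the PANORAMA implementation: atom indices, the bond list format and both function names are assumptions, and the source does not say whether hydrogen atoms count towards the 6 to 100 atom ligand limits.

```python
from collections import defaultdict

def find_molecules(n_atoms, bonds):
    """Group atom indices 0..n_atoms-1 into covalently connected molecules.

    bonds is an iterable of (i, j) index pairs (assumed input format).
    """
    parent = list(range(n_atoms))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in bonds:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(list)
    for i in range(n_atoms):
        groups[find(i)].append(i)
    return sorted(groups.values())

def is_small_molecule_ligand(atom_count):
    """Operational size window for small molecule ligands given in the text."""
    return 6 <= atom_count <= 100
```

For example, five atoms with bonds (0, 1), (1, 2) and (3, 4) yield two molecules, [0, 1, 2] and [3, 4], neither of which falls inside the ligand size window.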
Secondary structural analysis is performed using an implementation of the Kabsch and Sander
(1983) algorithm. Main-chain residue conformation is described by four classes: (α, 3₁₀, or π)
helix, β-strand, +ve φ dihedral angle, or random coil. Di-sulphide bonded cysteine residues
are identified as having Sγ–Sγ inter-atomic separation distances of < 3.0 Å (Kabsch and
Sander, 1983), and stereochemical geometries compatible with the Sowdhamini et al. (1989)
A, B or C quality grades. Control checking for the proximity of co-ordinating metals,
incompatible with disulphide bonding, is also carried out. Di-sulphide bridge partner
ambiguities in poorer quality structures are resolved by systematic searching of all feasible
arrangements within a given network using a summed empirical disulphide geometry quality
score.
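The distance screen and the exhaustive resolution of partner ambiguities can be sketched as below. This is an illustration under stated assumptions: the input dictionaries and function names are hypothetical, the stereochemical grading and metal proximity checks are omitted, and the empirical quality score is treated as a supplied number for which higher is assumed to be better (the source does not give its form or sign).

```python
import math
from itertools import combinations

def disulphide_candidates(sg_coords, cutoff=3.0):
    """Return cysteine pairs whose S-gamma atoms lie closer than cutoff (in Å).

    sg_coords maps a residue identifier to the (x, y, z) of its SG atom.
    """
    pairs = []
    for (res_a, xyz_a), (res_b, xyz_b) in combinations(sorted(sg_coords.items()), 2):
        if math.dist(xyz_a, xyz_b) < cutoff:
            pairs.append((res_a, res_b))
    return pairs

def best_pairing(scored_pairs):
    """Search all arrangements in which each cysteine has at most one partner.

    scored_pairs maps (res_a, res_b) to a quality score (assumed: higher is
    better). Returns (summed score, list of chosen pairs).
    """
    items = sorted(scored_pairs.items())
    best = (0.0, [])

    def extend(start, used, chosen, total):
        nonlocal best
        if total > best[0]:
            best = (total, list(chosen))
        for k in range(start, len(items)):
            (a, b), score = items[k]
            if a in used or b in used:
                continue  # a cysteine cannot join two bridges
            chosen.append((a, b))
            extend(k + 1, used | {a, b}, chosen, total + score)
            chosen.pop()

    extend(0, frozenset(), [], 0.0)
    return best
```

The search is exponential in the number of candidate pairs, which is acceptable here because ambiguous disulphide networks involve only a handful of cysteines.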
Atomic solvent-accessible surface areas are calculated using the Shrake and Rupley (1973)
method with a probe size of 1.4 Å, and a test-point density of approximately 2000 points per
atom. The van der Waals radii of Chothia (1976) are used for atoms in the 20 standard amino
acid residues and for chemically equivalent atoms in other residue types. The van der Waals
radius for phosphorus is taken from Lesk (1991), and radii for all other elements from Flower
(1997). Hydrogen atoms are not included in the accessibility calculations. Relative side-chain
solvent accessibilities are computed with respect to summed side-chain atom accessibilities in
extended conformation Ala-X-Ala tri-peptides, geometry-optimised using GAMESS (Schmidt
et al., 1993) at a high level of quantum chemical theory (B3LYP/6-311++G(d,p)) in a conductor-like polarized continuum model (CPCM; Miertuš et al., 1981; Barone and Cossi,
1998; Cossi et al., 2003) to account for solvent electrostatics (C.M. Topham and J.C. Smith,
unpublished results). Side-chains with relative solvent accessibilities of ≤ 7% are classed as
inaccessible.
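A compact sketch of the Shrake and Rupley test-point procedure follows. It is not the PANORAMA code: the golden-spiral point distribution, the function names and the input format are assumptions, and the tri-peptide reference areas are taken as inputs because the underlying calculations are unpublished. The 1.4 Å probe, the 2000 test points per atom and the 7% threshold come from the text.

```python
import math

def sphere_points(n):
    """Approximately uniform points on the unit sphere (golden-spiral layout)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n
        ring = math.sqrt(1.0 - z * z)
        theta = golden_angle * i
        points.append((ring * math.cos(theta), ring * math.sin(theta), z))
    return points

def shrake_rupley(coords, radii, probe=1.4, n_points=2000):
    """Return the solvent-accessible surface area (Å^2) of each atom."""
    unit = sphere_points(n_points)
    expanded = [r + probe for r in radii]  # van der Waals radius plus probe
    areas = []
    for i, (ci, Ri) in enumerate(zip(coords, expanded)):
        # Only atoms whose expanded spheres overlap can bury test points.
        neighbours = [
            (cj, Rj * Rj)
            for j, (cj, Rj) in enumerate(zip(coords, expanded))
            if j != i and math.dist(ci, cj) < Ri + Rj
        ]
        exposed = 0
        for ux, uy, uz in unit:
            px, py, pz = ci[0] + Ri * ux, ci[1] + Ri * uy, ci[2] + Ri * uz
            buried = any(
                (px - cj[0]) ** 2 + (py - cj[1]) ** 2 + (pz - cj[2]) ** 2 < Rj2
                for cj, Rj2 in neighbours
            )
            if not buried:
                exposed += 1
        areas.append(4.0 * math.pi * Ri * Ri * exposed / n_points)
    return areas

def is_inaccessible(side_chain_area, reference_area, threshold_percent=7.0):
    """Classify a side-chain by its accessibility relative to the reference."""
    return 100.0 * side_chain_area / reference_area <= threshold_percent
```

An isolated atom recovers the full area of its probe-expanded sphere, and each overlapping neighbour removes the test points that fall inside its own expanded sphere.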
Hydrogen bonds are assigned to donor-acceptor atom pairs separated by ≤ 3.9 Å (or 4.0 Å for
hydrogen bonds involving sulphur atoms), subject to a hydrogen-acceptor separation distance
of ≤ 2.5 Å. Hydrogen atoms are positioned at donor atom centres using idealised internal coordinate parameter values (McDonald and Thornton, 1994). The placement of single
hydrogen atoms at tetrahedral centres, histidine protonation state assignment, and the
resolution of ambiguous asparagine and glutamine carbamoyl group and histidine imidazole
ring flip-states is achieved through optimisation of summed interaction energies in local
hydrogen bond networks using a genetic algorithm. The additive energy function is composed
of three terms: (i) a 6-4 direction-dependent hydrogen bond function (Goodford, 1985),
employing Autodock 3 minimum-energy separation and well-depth energy parameters
(Morris et al., 1998), (ii) a shallow energy well Lennard-Jones 12-6 function, with collision
diameter parameters of 2.47 Å or 3.60 Å to prevent the creation of unfavourable hydrogen-hydrogen or hydrogen-metal close contacts, respectively, and (iii) the protonation-state and
flip-state energy penalties of Hooft et al. (1996).
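The distance criteria for hydrogen bond assignment and the form of the 12-6 close-contact term can be illustrated as follows. This sketch assumes that the collision diameter is the zero-crossing distance σ of the standard Lennard-Jones form; the well depth is not stated in the text, so epsilon is a placeholder argument. The direction-dependent 6-4 term, the penalty terms and the genetic algorithm are not reproduced, and the function names are hypothetical.

```python
import math

def is_hydrogen_bond(donor, hydrogen, acceptor, involves_sulphur=False):
    """Apply the two distance cut-offs (in Å) described in the text.

    Each argument is an (x, y, z) tuple; the hydrogen position is assumed
    to have been built beforehand from idealised geometry.
    """
    max_donor_acceptor = 4.0 if involves_sulphur else 3.9
    return (math.dist(donor, acceptor) <= max_donor_acceptor
            and math.dist(hydrogen, acceptor) <= 2.5)

def lennard_jones_12_6(r, sigma, epsilon):
    """Standard 12-6 energy; zero at r == sigma, minimum of -epsilon."""
    s6 = (sigma / r) ** 6
    return 4.0 * epsilon * (s6 * s6 - s6)
```

With sigma set to 2.47 Å, any hydrogen-hydrogen separation shorter than that distance gives a positive (unfavourable) energy, which is how the term discourages close contacts during optimisation.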
Ligand-contacting protein residues are identified using a variant of the occluded molecular
surface analysis method (Pattabiraman et al., 1995), based on vector projection normal to the
contact surface component of the molecular surface (Richards, 1977) of the residue in the
absence of the surrounding protein environment. Contact surface unit normal vectors are
determined for residue atoms in isolated tri-peptides centred on the residue of interest (or for
half-cystine, in the isolated branched-chain tetra-peptide containing the di-sulphide bonded
partner residue) using the Shrake and Rupley (1973) algorithm and the same parameter
variables as for the solvent accessibility calculations described above. Atoms of the occluded
surface, to which neighbouring atoms in the tri-peptide do not contribute, are represented as
their van der Waals spheres. The maximum vector outward projection distance from the
contact surface is set to 2.8 Å, approximating the diameter of a water molecule. Protein
residues with ligand atom contributions to the occluded contact surface of at least one main-chain or side-chain atom are considered to be in contact with the ligand. Metal ion co-ordinating protein residues are identified independently as residues possessing one (or more)
unprotonated oxygen, nitrogen or sulphur atom(s) within a separation distance of ≤ 2.5 Å of
the metal ion.
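The metal co-ordination criterion is a straightforward distance filter, sketched here under assumed inputs: each protein atom is supplied as a (residue identifier, element symbol, protonation flag, co-ordinates) tuple, and the function name is hypothetical.

```python
import math

def metal_coordinating_residues(metal_xyz, atoms, cutoff=2.5):
    """Return residues with an unprotonated O, N or S atom within cutoff Å.

    atoms is an iterable of (residue_id, element, is_protonated, (x, y, z)).
    """
    hits = set()
    for residue_id, element, is_protonated, xyz in atoms:
        if (element in {"O", "N", "S"}
                and not is_protonated
                and math.dist(metal_xyz, xyz) <= cutoff):
            hits.add(residue_id)
    return sorted(hits)
```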
Protein residue aromatic environments are quantified using the occluded contact surface
analysis method described above. Residues with >2% of summed side-chain, main-chain
carbonyl or –NH group atom occluded contact surface areas composed of aromatic atoms are
considered to be in an aromatic environment.
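The aromatic environment test amounts to an area fraction compared with the 2% threshold. A minimal sketch, assuming the occluded contact surface contributions for the relevant atom group have already been computed and are supplied as (area, is_aromatic) pairs; this input format and the function name are hypothetical.

```python
def in_aromatic_environment(contributions, threshold_percent=2.0):
    """Return True if aromatic atoms supply more than threshold_percent
    of the summed occluded contact surface area.

    contributions is an iterable of (area, is_aromatic) pairs.
    """
    contributions = list(contributions)
    total = sum(area for area, _ in contributions)
    if total <= 0.0:
        return False  # no occluded contact surface at all
    aromatic = sum(area for area, is_aromatic in contributions if is_aromatic)
    return 100.0 * aromatic / total > threshold_percent
```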
REFERENCES:
Barone, V. and Cossi, M. (1998) Quantum calculation of molecular energies and energy
gradients in solution by a conductor solvent model. J. Phys. Chem. A, 102, 1995-2001.
Chothia, C. (1976) The nature of the accessible and buried surfaces in proteins. J. Mol. Biol.,
105, 1-14.
Cossi, M., Rega, N., Scalmani, G. and Barone, V. (2003) Energies, structures, and electronic
properties of molecules in solution with the C-PCM solvation model. J. Comp. Chem., 24,
669-681.
Flower, D.R. (1997) SERF: a program for accessible surface area calculations. J. Mol.
Graphics Mod., 15, 238-244.
Goodford, P.J. (1985) A computational procedure for determining energetically favourable
sites on biologically important molecules. J. Med. Chem., 28, 849-857.
Henrick, K., Feng, Z., Bluhm, W.F., Dimitropoulos, D., Doreleijers, J.F., Dutta, S., Flippen-Anderson, J.L., Ionides, J., Kamada, C., Krissinel, E., Lawson, C.L., Markley, J.L.,
Nakamura, H., Newman, R., Shimizu, Y., Swaminathan, J., Velankar, S., Ory, J., Ulrich,
E.L., Vranken, W., Westbrook, J., Yamashita, R., Yang, H., Young, J., Yousufuddin, M. and
Berman, H.M. (2008) Remediation of the protein data bank archive. Nucleic Acids Res., 36,
D426-D433.
Hooft, R.W.W., Sander, C. and Vriend, G. (1996) Positioning hydrogen atoms by optimising
hydrogen-bond networks in protein structures. Proteins: Struct. Funct. Genet., 26, 363-376.
Kabsch, W. and Sander, C. (1983) Dictionary of protein secondary structure: pattern
recognition of hydrogen-bonded and geometrical features. Biopolymers, 22, 2577-2637.
Lesk, A.M. (1991) Protein architecture: a practical approach. Oxford University Press, Oxford,
New York, Tokyo, p49.
McDonald, I.K. and Thornton, J.M. (1994) Satisfying hydrogen bonding potential in proteins.
J. Mol. Biol., 238, 777-793.
Miertuš, S., Scrocco, E. and Tomasi, J. (1981) Electrostatic interaction of a solute with a
continuum. A direct utilization of ab initio molecular potentials for the provision of solvent
effects. Chem. Phys., 55, 117-129.
Morris, G.M., Goodsell, D.S., Halliday, R.S., Huey, R., Hart, W.E., Belew, R.K. and Olson,
A.J. (1998) Automated docking using a Lamarckian genetic algorithm and an empirical
binding free energy function. J. Comp. Chem., 19, 1639-1662.
Pattabiraman, N., Ward, K.B. and Fleming, P.J. (1995) Occluded molecular surface: analysis
of protein packing. J. Mol. Recognition, 8, 334-344.
Richards, F.M. (1977) Areas, volumes, packing and protein structure. Annu. Rev. Biophys.
Bioeng., 6, 151-176.
Shrake, A. and Rupley, J.A. (1973) Environment and exposure to solvent of protein atoms.
Lysozyme and insulin. J. Mol. Biol., 79, 351-371.
Schmidt, M.W., Baldridge, K.K., Boatz, J.A., Elbert, S.T., Gordon, M.S., Jensen, J.H.,
Koseki, S., Matsunaga, N., Nguyen, K.A., Su, S., Windus, T.L., Dupuis, M. and Montgomery,
J.A. (1993) General Atomic and Molecular Electronic Structure System. J. Comput. Chem.,
14, 1347-1363.
Sowdhamini, R., Srinivasan, N., Shoichet, B., Santi, D.V., Ramakrishnan, C. and Balaram, P.
(1989) Stereochemical modeling of disulfide bridges. Criteria for introduction into proteins by
site-directed mutagenesis. Prot. Eng., 3, 95-103.