Slide 1

THE AXIOM
Databanks + New tools = New insights
SADIC: Simple Atom Depth Index Calculator
protein fold barcoding
CATH – ADAPT…
SADIC: a new tool to analyze atom depth
Digging inside objects to discover their origins
Birth of the Earth
protein folding
atom depth 2D
atom depth calculated as the distance from:
the closest external water*
the closest dot of the water accessible surface*
the closest surface exposed atom*
HEWL 4lzt
* Chakravarty S, Varadarajan R. Residue depth: a novel parameter for the analysis of protein structure and stability. Structure Fold Des. 1999 7:723-732.
* Pintar A, Carugo O, Pongor S. Atom depth as a descriptor of the protein interior. Biophys J. 2003 84:2553-2561.
atom depth 2D
3D
Calculation of exposed volumes
Daniele Varrazzo, Andrea Bernini, Ottavia Spiga, Arianna Ciutti, Stefano Chiellini, Vincenzo Venditti, Luisa Bracci and Neri Niccolai. Three-dimensional Computation of Atomic Depth in Complex Molecular Structures. Bioinformatics 2005 21:2856-2860.
HEWL 4lzt
atom depth 3D
Calculation of exposed volumes
Depth index:
D_{i,r} = 2 V_{i,r} / V_{0,r}
where V_{i,r} is the exposed volume of a sphere of radius r centered on atom i of the molecule, and V_{0,r} is the exposed volume of the same sphere when centered on an isolated atom.
The sphere radius r should be the largest value for which V_{i,r} = 0 for the most buried atom.
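The depth index defined above can be illustrated numerically. The Python sketch below is not the SADIC implementation: it estimates exposed volumes by Monte Carlo sampling, treats every atom as a sphere with one assumed radius (`ATOM_RADIUS`), and the function names `exposed_fraction` and `depth_index` are made up for this example.

```python
import math
import random

ATOM_RADIUS = 1.8  # assumed uniform atomic radius in Å (illustrative, not from the slides)


def exposed_fraction(center, atoms, r, n_samples, seed=0):
    """Estimate the fraction of a sphere (radius r, centred on `center`)
    that is not covered by any atom in `atoms`."""
    rng = random.Random(seed)
    exposed = 0
    for _ in range(n_samples):
        # Rejection sampling: draw points in the bounding cube, keep those inside the sphere.
        while True:
            dx, dy, dz = (rng.uniform(-r, r) for _ in range(3))
            if dx * dx + dy * dy + dz * dz <= r * r:
                break
        point = (center[0] + dx, center[1] + dy, center[2] + dz)
        if all(math.dist(point, a) > ATOM_RADIUS for a in atoms):
            exposed += 1
    return exposed / n_samples


def depth_index(i, atoms, r, n_samples=5000):
    """Depth index 2 * V_i / V_0 for atom i.

    The sphere volume cancels in the ratio, so exposed fractions are enough.
    Requires r > ATOM_RADIUS, otherwise the isolated-atom reference is zero."""
    v_i = exposed_fraction(atoms[i], atoms, r, n_samples)
    v_0 = exposed_fraction(atoms[i], [atoms[i]], r, n_samples)
    return 2.0 * v_i / v_0
```

With this construction an isolated atom gets the maximum index of 2, and an atom whose sphere lies entirely inside the molecule gets 0, which matches the role of r described above.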
HEWL 4lzt
[Plot: Di,r (0.0 to 2.0) versus r [Å] (4.0 to 24.0)]
atom depth 3D vs 2D
Thr 47 α carbon Di,9 = 1.59
Ile 58 α carbon Di,9 = 0.13
Trp 28 α carbon Di,9 = 0.03
HEWL 4lzt
3D atom depth analysis
from PDB ID
1UBQ
Di
http://www.sbl.unisi.it/prococoa/
SBL Bioinformatics Projects
SADIC-related projects:
1. fold-dependent aa compositions of protein cores;
2. towards i-SADIC.
----------------------------------------------------
Projects unrelated to SADIC:
1. systematic analysis of PPI
Di analysis of protein atoms
defining structural layers
in protein 3D structures
each structural layer includes
atoms with similar Di values
fast and accurate analysis of
aa content of structural layers
Di analysis of protein atoms
3 VTR (chitinolytic enzyme 572 aa)
Ln    Di           color
L6    > 1.2        red
L5    1.0 – 1.2    orange
L4    0.8 – 1.0    yellow
L3    0.6 – 0.8    green
L2    0.4 – 0.6    blue
L1    0.2 – 0.4    indigo
L0    < 0.2        violet
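The layer scheme above amounts to a simple binning of Di values. A minimal Python sketch follows; the name `layer_of` is hypothetical, and the treatment of values that fall exactly on a boundary (assigned here to the upper layer) is an assumption, since the slide does not say which side is inclusive.

```python
import bisect

# Upper edges of layers L0..L5 as listed on the slide; L6 is everything above 1.2.
LAYER_EDGES = [0.2, 0.4, 0.6, 0.8, 1.0, 1.2]
LAYER_COLORS = ["violet", "indigo", "blue", "green", "yellow", "orange", "red"]


def layer_of(di):
    """Return (layer name, color) for a depth index value.

    Assumption: a value exactly on an edge goes to the upper layer."""
    n = bisect.bisect_right(LAYER_EDGES, di)
    return f"L{n}", LAYER_COLORS[n]


# The two HEWL alpha carbons quoted earlier (Di,9 = 1.59 and 0.13):
print(layer_of(1.59))  # ('L6', 'red')
print(layer_of(0.13))  # ('L0', 'violet')
```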
3D atom depth analysis
from PDB ID 1UBQ
http://www.sbl.unisi.it/prococoa/
Per-atom Di values for residues K63, L43 and E24 (the slide marks Dimax for each residue):
K63: N 0.19, CA 0.30, C 0.25, O 0.23, CB 0.50, CG 0.68, CD 0.91, CE 1.11, NZ 1.29
L43: N 0.10, CA 0.05, C 0.11, O 0.18, CB 0.02, CG 0.02, CD1 0.02, CD2 0.00
E24: N 0.38, CA 0.52, C 0.50, O 0.52, CB 0.76, CG 0.95, CD 1.17, OE1 1.24, OE2 1.24
Dimax analysis of protein residues
defining aa occupancy in
protein structural layers
each structural layer includes
residues with similar Dimax values
fast and accurate analysis of aa
distribution in protein structures
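As an illustration of the residue-level analysis, the sketch below reduces per-atom Di values to one Dimax per residue. It assumes that Dimax is the largest Di among the atoms of a residue (the slides do not spell the definition out); the function name `dimax_per_residue` is hypothetical, and the sample numbers are the side-chain values read from the 1UBQ slide for K63 and L43.

```python
def dimax_per_residue(atom_depths):
    """Collapse (residue_id, atom_name, Di) records into {residue_id: Dimax}.

    Assumption: Dimax is the maximum Di over all atoms of the residue."""
    dimax = {}
    for residue_id, _atom_name, di in atom_depths:
        dimax[residue_id] = max(di, dimax.get(residue_id, di))
    return dimax


records = [
    ("K63", "CD", 0.91), ("K63", "CE", 1.11), ("K63", "NZ", 1.29),
    ("L43", "O", 0.18), ("L43", "CB", 0.02), ("L43", "CD2", 0.00),
]
print(dimax_per_residue(records))  # {'K63': 1.29, 'L43': 0.18}
```

Under this assumption a residue is placed in the layer of its most exposed atom, so a long side chain reaching the surface moves the whole residue outward.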
Dimax analysis of protein singles
quite a few proteins like to stay single
(at least in the crystalline state)
Bioinformatiha 2, Florence, 18 October
a database of protein singles
Experimental Method: X-RAY (79,770)
Chain Type: Protein (74,456)
Only 1 chain in asym. unit: (28,803)
Oligomeric state: 1 (21,193)
Number of Entities: 1 (3,517)
Homologue Removal @ 95% identity (2,410)
DOOPS: 2,410 proteins in the dataset
4,657,574 atoms
589,383 residues
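The selection steps listed above can be read as a chain of filters over PDB entries. The following Python sketch shows that reading only; the `Entry` record and its field names are hypothetical, the entry IDs are placeholders, and the final homologue removal at 95% identity is not reproduced because it needs sequence clustering.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    # Hypothetical record layout, used only for this illustration.
    pdb_id: str
    method: str
    chain_type: str
    chains_in_asym_unit: int
    oligomeric_state: int
    n_entities: int


def is_single(entry):
    """True if the entry passes every filter listed on the slide
    (homologue removal at 95% identity excluded)."""
    return (
        entry.method == "X-RAY"
        and entry.chain_type == "Protein"
        and entry.chains_in_asym_unit == 1
        and entry.oligomeric_state == 1
        and entry.n_entities == 1
    )


def select_singles(entries):
    return [e for e in entries if is_single(e)]


toy_entries = [
    Entry("AAAA", "X-RAY", "Protein", 1, 1, 1),
    Entry("BBBB", "NMR", "Protein", 1, 1, 1),
    Entry("CCCC", "X-RAY", "Protein", 2, 2, 1),
]
print([e.pdb_id for e in select_singles(toy_entries)])  # ['AAAA']
```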
Swiss-Prot: 540,958 proteins in the dataset (192 million amino acids)
Dimax analysis of protein cores
DOOPS: 2,410 proteins; 4,657,574 atoms; 589,383 residues
calculation of % amino acid content in L0
the first quantitative analysis of a large array of protein cores!
core aa if Dimax < 0.2
~20 % of total
molecular volume
ΣDOOPS aa(L0) = 106,088
(from 2410 proteins)
aa              % in L0
Alanine         11.51
Cysteine        2.63
Aspartate       1.77
Glutamate       1.20
Phenylalanine   6.36
Glycine         10.81
Histidine       1.32
Isoleucine      11.74
Lysine          0.58
Leucine         16.27
Methionine      2.49
Asparagine      1.70
Proline         2.45
Glutamine       1.21
Arginine        0.83
Serine          4.85
Threonine       4.65
Valine          13.7
Tryptophan      1.43
Tyrosine        2.50
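The percentages above follow from counting residues whose Dimax falls in L0. A minimal Python sketch of that counting step is given below; the function name `core_composition` and the toy input are hypothetical, and only the threshold Dimax < 0.2 is taken from the slides.

```python
from collections import Counter


def core_composition(residues, threshold=0.2):
    """Percentage of each amino acid among core residues.

    `residues` is an iterable of (aa_name, dimax); a residue counts as core
    when dimax < threshold (0.2 is the L0 limit quoted on the slide)."""
    core = Counter(aa for aa, dimax in residues if dimax < threshold)
    total = sum(core.values())
    if total == 0:
        return {}
    return {aa: 100.0 * count / total for aa, count in core.items()}


toy = [("LEU", 0.05), ("LEU", 0.10), ("ALA", 0.15), ("VAL", 0.19), ("LYS", 1.29)]
print(core_composition(toy))  # {'LEU': 50.0, 'ALA': 25.0, 'VAL': 25.0}
```

Pooling the core residues of all proteins before taking percentages, as done here, weights large proteins more heavily than averaging per-protein percentages would; which of the two the slides use is not stated.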
Di analysis of protein cores :
folding clues from aa core composition?
Class                Architectures   Topology   Homologous superfamily   Domains
1 (mainly α)         5               386        875                      37,038
2 (mainly β)         20              229        520                      43,881
3 (α & β)            14              594        1,113                    90,029
4 (few sec. str.)    1               104        118                      2,588
Total                40              1,313      2,626                    173,536
Di analysis of protein cores :
folding clues from aa core composition?
DOOPS + CATH
selected Architectures with ≥ 10 PDB files
Architecture   # Proteins   (mono domain)
1.10           213          (84)
1.20           84           (40)
1.25           19           (17)
1.50           10           (3)
2.10           17           (13)
2.30           57           (37)
2.40           94           (73)
2.60           134          (110)
2.80           12           (12)
3.10           84           (73)
3.20           52           (44)
3.30           139          (106)
3.40           218          (203)
3.60           10           (8)
3.90           49           (49)
total          1,190        (872)
Towards protein folding barcodes
CATH-ADAPT: CATH - atom depth assisted protein tomography
% L0    1.10   1.20   1.25   1.50   2.10   2.30   2.40   2.60   2.80   3.10   3.20   3.30   3.40   3.60   3.90  overall
ALA    13.28  10.32  21.46  12.74   9.26  10.05   8.43   9.32    5.5  10.69  10.08  12.58  11.88  14.95  12.01    11.51
ARG      0.6   1.28   0.24   1.39      0   0.64   1.72   0.75      0   0.55   1.11   1.75    0.3   0.47   0.95     0.83
ASN     0.67   2.62   0.73   2.77   1.85   2.04   1.77   1.36      0    2.1    2.9   0.96   1.52    2.8    2.1     1.70
ASP     1.61   2.62   0.24   2.91   1.23   1.27   2.03   1.79      0    2.1    2.9   3.02   1.77   2.34   0.95     1.77
CYS     3.35   2.99   5.37   0.83  22.84   2.04   1.46   4.42   0.92   2.83    2.1   1.49   1.86    1.4   3.05     2.63
GLN      0.6    1.5   0.24   1.11   1.23   1.15   1.81   1.69      0   0.46   1.56   2.15   0.99    1.4   1.33     1.21
GLU     1.48   1.44   0.73   1.52      0   1.15   1.19   1.04      0   0.91   2.59   2.41   1.08   0.93   0.67     1.20
GLY     8.05   8.72   9.76  13.85  16.05   9.92   16.2  10.82   9.17   8.78  11.81  11.35  12.64  13.08   9.91    10.81
HIS     1.01    1.6   2.44   1.11   0.62   0.76   0.79   0.56      0   2.65   1.96   3.02   1.91   0.47   2.48     1.32
ILE    12.68   9.95  10.73   8.59   6.79  13.61  10.68  10.78  13.76   12.8  11.77  12.53  11.53   7.01  11.34    11.74
LEU    23.88  18.34  22.44  11.77   8.02  17.18  12.97  13.98  33.94  16.54   11.9  14.33  14.22  15.42  13.63    16.27
LYS     0.67   0.91      0   1.11      0   0.38   0.49   0.56      0   0.09   0.62   1.36   0.55      0   0.67     0.58
MET     2.62   4.17   1.71   4.99      0    2.8   2.65   3.15   1.83   2.93   2.76   2.41   2.39   3.27   1.91     2.49
PHE     6.44   6.79   2.93   4.57   4.32   7.12   7.06   6.73   15.6   7.22   4.95   6.18   6.07   4.21   6.01     6.36
PRO     1.34   2.46   3.41   2.63   3.09   3.31      3   2.78      0   3.29    2.9   1.84   2.25    1.4   1.81     2.45
SER     3.49   4.55   3.66   5.96   3.09   5.34   5.56   5.13   2.75   2.83   5.35   4.43   4.23   6.07   5.34     4.85
THR     2.28   4.81   4.15    7.2   5.56   3.31   5.12   4.47   0.92    3.2   5.22   4.25   4.94   5.14   5.91     4.65
TRP     1.01   1.55      0   2.77    3.7   0.38   1.63   2.78   2.75   2.19   1.52   0.66   1.26   0.47    2.1     1.43
TYR     2.62   3.69   0.24   4.57   2.47   1.27   2.69   4.38   0.92   3.29   3.12   1.58   2.32      0   2.29     2.50
VAL    12.34   9.68   9.51   7.62   9.88  16.28  12.75  13.51  11.93  14.53  12.88   11.7  16.29  19.16  15.54     13.7
Legend: aa % average value (av); av + σ; av + 2σ; av - σ; av - 2σ
Residue labels: Ala, Cys, Leu, Phe, Val
Architecture labels: alpha horseshoe, ribbon, trefoil, four layer sandwich
PDB IDs: 3CKC(A02), 1RG8(A00), 2IMH(A01), 1UZK(A01)
Di of 173,536 CATH domains: 28 h 5 min (average computation time 1.72 s/domain)
Calculations performed on a 6-core 990X CPU based computer
Towards protein folding barcodes
Putting the protein universe in order
towards i-SADIC
(implemented SADIC)
H/D exchange rate profiles
2D atom depth
dnwi: distance of atom i from the nearest water molecule
or
3D atom depth
Di,9: atom depth index with a probe of radius 9 Å
data from Pedersen TG, Thomsen NK, Andersen KV, Madsen JC, Poulsen FM. Determination of the rate constants k1 and k2 of the Linderstrom-Lang
model for protein amide hydrogen exchange. A study of the individual amides in hen egg-white lysozyme. J Mol Biol. 1993 230(2):651-660.
H/D exchange rate profiles
iSADIC atom depth
iDi,9 = aDi,9 + bASAi
cDi,9 + dDnwi
3D atom depth
Di,9: atom depth index with a probe of radius 9 Å
protein-protein interface analysis
biological vs crystallographic interfaces
ARG: N, CA, C, O, CB, CG, CD, NE, CZ, NH1, NH2, H, HA, HB2, HB3, HG2, HG3, HD2, HD3, HE, HH11, HH12, HH21, HH22
vs
LYS: N, CA, C, O, CB, CG, CD, CE, NZ, H, HA, HB2, HB3, HG2, HG3, HD2, HD3, HE2, HE3, HZ1, HZ2, HZ3