Supplementary data for
Aggregation and evolution of multi-domain proteins: the importance of
sequence identity
Caroline F. Wright, Sarah A. Teichmann, Jane Clarke and Christopher M.
Dobson
The supplementary data contains the following supplementary information:
Figure S1: Structural features of titin.
Figure S2: Aggregation kinetics for monomeric TI I27 at various concentrations.
Supplementary to Figure 1 in the manuscript.
Figure S3: Co-aggregation of different domains with TI I27. Supplementary to
Figure 2 in the manuscript.
Figure S4: Details of the analysis of sequence identity of Ig and fnIII domains in the
human genome. Supplementary to Figure 3 in the manuscript. The figure legend
gives full details of the analysis of homologous Ig and fnIII domains in the human
genome.
Table S1: Sequence identity and aggregation kinetics of monomeric proteins.
Supplementary to Figure 2 in the manuscript. The footnotes include a detailed
description of the kinetic analysis.
Table S2: Domains in proteins of genomes from the three kingdoms of life. The
footnotes give details of the analysis of whole genomes.
Table S3: Length distribution of repeats of tandem homologous domains in human.
Table S4: The largest repeat superfamilies in the human genome.
Table S5: Sequence identities in the I-band of titin.
References: To support the supplementary material
Supplementary Figure S1.
Figure S1. Structural features of titin. (a) The 27th Ig domain from titin (ref. 1). The strands are labelled A, A’, B, C, D, E, F and G, from the N- to the C-terminus. The two sheets pack onto each other to form a β-sandwich structure with Greek key
topology. (b) Schematic representation of part of a multi-modular protein, such
as titin; the domains are labelled from the N-terminus.
Supplementary Figure S2.
Figure S2. Aggregation kinetics for monomeric TI I27 at various
concentrations. Supplementary to Figure 1 in the manuscript. As the
concentration of monomer increases the lag decreases and the elongation
rate increases.
Supplementary Figure S3.
Figure S3. Co-aggregation of different domains with TI I27. Each monomeric
protein was added to a solution of TI I27, to reach a final concentration of
each domain of 1 mg/ml (i.e. a total protein concentration of 2 mg/ml). The
extent of light scattering was then monitored, using a Synergy Bio-Tek plate
reader, as a function of the time of incubation in a buffered 28 % TFE solution
at 25 ˚C. The aggregation kinetics (black circles) were found to fall between
two limiting rates, I27 aggregating alone at either 1 mg/ml (lower dashed red
line) or 2 mg/ml (upper dashed red line). The aggregation time course of each
protein alone at 2 mg/ml is also shown (solid black line). The protein
domains used, and their sequence identities (ID) to wild-type TI I27, are (a) PI3-SH3, 0 % ID, (b) TNfn3, 8 % ID, (c) TI I28, 28 % ID, (d) TI I32, 42 % ID, (e)
TI I31, 56 % ID, (f) TI I27 (F14L), 99 % ID. (The data shown are averages of
at least 5 measurements, except for TI I31 where only 2 measurements were
possible due to very low yields of protein in the expression system.) (Panels
(a), (c) and (d) are reproduced in the manuscript.)
Supplementary Figure S4.
Figure S4. Details of the analysis of sequence identity of Ig and fnIII domains
in the human genome (see associated Figure 3 in the manuscript).
Ig and fnIII domains in the human genome (ENSEMBL v. 15.33) were
assigned by hidden Markov models in the SUPERFAMILY database (ref. 4), version
1.61. These two superfamilies are amongst the largest in the human genome,
with 2545 Ig domains and 1098 fnIII domains. The fnIII superfamily is
comparable to the Ig superfamily in both size and function (extracellular domains
predominantly involved in cell adhesion). However, fnIII domains do not
contain disulphide bridges. The domains in the SUPERFAMILY database are as
defined in the SCOP database (ref. 5).
All the sequence regions assigned as Ig or fnIII domains in the human
genome were compared to each other by pairwise sequence comparisons
using FASTA (ref. 6). Three types of homologous domain pairs amongst the
proteins in an organism were identified (as illustrated above). 1) Paralogous
domains: domains in different proteins within the same organism (here, human). 2)
Adjacent domains: domains within 30 residues of each other within the same
protein. (The 30-residue cut-off is used since very few independent domains
are less than 30 residues long.) 3) Non-adjacent domains: domains within the
same protein that are >30 residues apart.
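The three categories above can be expressed as a small classification routine. The following Python sketch is illustrative only: the `Domain` record, its field names and the way the gap between two domains is measured (residues between the end of one domain and the start of the next) are assumptions made for this example and are not taken from the SUPERFAMILY database.

```python
from dataclasses import dataclass

ADJACENCY_CUTOFF = 30  # residues; the cut-off quoted in the text


@dataclass(frozen=True)
class Domain:
    protein_id: str  # identifier of the parent protein (hypothetical field)
    start: int       # first residue of the domain, 1-based
    end: int         # last residue of the domain, inclusive


def classify_pair(a: Domain, b: Domain) -> str:
    """Return 'paralogous', 'adjacent' or 'non-adjacent' for two domains."""
    if a.protein_id != b.protein_id:
        return "paralogous"
    first, second = sorted((a, b), key=lambda d: d.start)
    # Assumed definition: residues separating the two assigned regions.
    gap = max(0, second.start - first.end - 1)
    return "adjacent" if gap <= ADJACENCY_CUTOFF else "non-adjacent"


print(classify_pair(Domain("P1", 1, 90), Domain("P1", 95, 180)))
```

With this definition, two domains separated by an intervening domain are automatically classed as non-adjacent, because the intervening region is longer than 30 residues.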
Since the Ig superfamily encompasses very distantly related proteins, many
domain pairs do not produce significant alignments based on pairwise amino
acid sequence comparisons. Statistical scores were not used since the data
sets of sequences compared were small, so in order to ensure that only
meaningful sequence identities were obtained, any alignments shorter than
thirty residues and/or with less than thirty percent sequence identity were
considered ‘unmatched’ (ref. 7). In Figure 3a in the manuscript the identities are
binned in a category of more than thirty percent identity.
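The 'unmatched' criterion can be written as a simple filter. This is a minimal Python sketch of the rule as stated above; the function and constant names are hypothetical, and it assumes that an alignment of exactly thirty residues or exactly thirty percent identity counts as matched.

```python
MIN_ALIGNMENT_LENGTH = 30    # residues
MIN_PERCENT_IDENTITY = 30.0  # percent


def is_matched(alignment_length: int, percent_identity: float) -> bool:
    """True if a pairwise alignment passes both the length and identity cut-offs."""
    return (alignment_length >= MIN_ALIGNMENT_LENGTH
            and percent_identity >= MIN_PERCENT_IDENTITY)


# A short alignment is rejected even if its identity is high.
print(is_matched(25, 60.0), is_matched(85, 42.0))
```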
Of all the paralogous domain pairs in the human genome, only 2% of the
~3 × 10⁶ paralogous Ig pairs and 3% of the ~6 × 10⁵ fnIII domain pairs have
≥30% sequence identity. Adjacent domains are on average more similar than
the domains on different proteins, which is probably because they are likely to
be exposed to the same sorts of functional constraints on their sequences and
to have evolved by internal gene duplication of each other. Despite these
similar evolutionary constraints, only 27% of the 1165 Ig and 29% of the 673
fnIII adjacent pairs have >30% sequence identity.
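As a rough consistency check on the pair counts quoted here, the number of possible pairs among n domains is n(n − 1)/2. The Python sketch below assumes that every assigned domain is compared with every other, and that the paralogous pairs are what remains once the adjacent and non-adjacent (same-protein) pairs quoted in this legend are subtracted; both assumptions are ours and are not stated explicitly in the analysis.

```python
from math import comb


def total_pairs(n_domains: int) -> int:
    """Number of unordered pairs among n_domains domains."""
    return comb(n_domains, 2)


ig_total = total_pairs(2545)   # all Ig pairs
fn3_total = total_pairs(1098)  # all fnIII pairs

# Same-protein pairs quoted in the legend: adjacent + non-adjacent.
ig_paralogous = ig_total - (1165 + 4712)
fn3_paralogous = fn3_total - (673 + 4047)

print(ig_total, ig_paralogous)    # both of the order of 3 x 10^6
print(fn3_total, fn3_paralogous)  # both of the order of 6 x 10^5
```

Under these assumptions the same-protein pairs are a very small fraction of the total, so the orders of magnitude agree with the values given in the text.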
Domains within the same polypeptide chain that are not adjacent are plotted
in the third section of Figure 3a. The vast majority of these domains are
themselves adjacent to other Ig or fnIII domains. Thus any selective pressure
for sequence divergence from their neighbours will be reflected in their
sequence similarity to domains further away in the proteins. Nevertheless, the
percentages of pairs with greater than 30% identity are actually somewhat
greater amongst these domain pairs than amongst the adjacent pairs: 35% of
the 4712 Ig and 53% of the 4047 fnIII pairs. The difference is greater for the
fnIII superfamily than for the Ig superfamily and could be due to a greater pressure for
sequence divergence amongst adjacent fnIII domains in the absence of
disulphide bonding; the presence of such bonds and the restrictions they
place on conformational transitions are likely to inhibit aggregation in the Ig
superfamily.
Although the percentage of adjacent domain pairs with 30% or greater
sequence identities is 27% and 29% of the Ig and fnIII adjacent domains
respectively, it is important to appreciate that the sequence identity
distributions of the adjacent domain pairs (and indeed all groups of domain
pairs) fall off roughly as a power law (Figure 3b). This means that, for
instance, only about 8% of the adjacent Ig domains and 12% of the fnIII
domains have more than 40% sequence identity.
Table S1. Sequence identity and aggregation kinetics of monomeric proteins

Protein          % ID   k (s⁻¹)     Lag (s)     k+I27 (s⁻¹)   Norm. k+I27     Lag+I27 (s)
PI3-SH3          0      < 0.0001    > 10 000    0.0004        -0.1 (± 0.2)    498
TNfn3            8      < 0.0001    > 10 000    0.0002        -0.2 (± 0.2)    588
TI I28           28     0.0001      3000        0.0005        0.0 (± 0.2)     510
TI I32           42     0.0005      1500        0.0011        0.5 (± 0.2)     527
TI I31           56     0.0006      300         0.0014        0.7 (± 0.2)     243
TI I27 (F14L)    99     0.0004      960         0.0017        1.0 (± 0.2)     342
TI I27 (wt)      100    0.0018      334         0.0018        1.0             334

Co-aggregation was assessed by measuring the aggregation rate of a 1 mg/ml solution of TI
I27 in the presence of 1 mg/ml of a second protein whose aggregation rate is otherwise slow
under the conditions used here. If the aggregation rate is increased in the mixed protein
solution, it indicates that the effective concentration of the TI I27 is increased as a result of
the ability of the second protein to co-aggregate with TI I27.
The kinetics were analysed with Prism (GraphPad) using a simple single exponential function,
with terms to account for the lag phase as well as linear and quadratic drift over long
timescales (see below). k, single exponential aggregation rate constant; k+I27, single
exponential aggregation rate constant when co-incubated with TI I27; Lag, length of the lag
phase; Lag+I27, mean length of the lag phase when co-incubated with TI I27; Norm. k+I27,
normalised co-aggregation rate constants, from zero to one (aggregation at the same rate as
TI I27 alone at 1 mg/ml or 2 mg/ml respectively). Errors are calculated from the standard
deviation of the aggregation kinetics fitted to repeated measurements. The error on all
aggregation rates is approximately 15 %; the error on all lag times is approximately 20 %. All
data are given for a total protein concentration of 2 mg/ml, which is either all one protein or 1
mg/ml of each protein with 1 mg/ml of TI I27 for the co-aggregation experiments. Aggregation
of wild-type TI I27 at 1 mg/ml has an elongation rate of 0.0004 s⁻¹ and a lag phase of 540 s.
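One plausible reading of the normalisation described above is a linear rescaling between the two limiting rates of TI I27 alone. The Python sketch below assumes that form; the function name is hypothetical and the exact formula used to produce Table S1 is not stated in these footnotes. Because the rate constants in the table are rounded, values recomputed in this way can differ from the tabulated Norm. k+I27 by about 0.1, which is within the quoted error of ± 0.2.

```python
K_I27_1MG = 0.0004  # s^-1, wild-type TI I27 alone at 1 mg/ml (footnote above)
K_I27_2MG = 0.0018  # s^-1, wild-type TI I27 alone at 2 mg/ml (Table S1)


def normalised_rate(k_mixed: float) -> float:
    """Rescale a co-aggregation rate so that 0 = I27 at 1 mg/ml and 1 = I27 at 2 mg/ml."""
    return (k_mixed - K_I27_1MG) / (K_I27_2MG - K_I27_1MG)


# Rounded k+I27 values from Table S1 for TI I32 and TI I31.
print(round(normalised_rate(0.0011), 1), round(normalised_rate(0.0014), 1))
```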
Details of the fitting procedure: Because of the short lag time for the aggregation of TI I27,
it is not possible to fit the kinetics with a sigmoidal curve of the type used, for example, in
fitting IAPP aggregation kinetics (ref. 8). Instead, data were analysed using a phenomenological
equation:
$$\mathrm{OD}_{400} = \mathrm{IF}\left(t < t_0,\; P,\; P + (T - P)\left\{1 - \exp[-k_A (t - t_0)]\right\} + at + bt^2\right)$$
where $t$ is the time (s), $t_0$ is the lag time, $P$ is the initial light scattering intensity plateau during
the lag time, $T$ is the final (top) light scattering plateau, $k_A$ is the elongation rate and $a$ and $b$
are terms to account for non-linearity in the data after the initial plateau due to fragmentation
and association. The IF function forces the data to be fitted to a flat, straight line until the end
of the lag phase, at which point the data are fitted to a single exponential function. The
program Prism (GraphPad Software) was used to analyse all aggregation data. Since the
kinetic data for the aggregation of AcP with 25 % TFE fit well to an equation containing just a
single exponential function (ref. 9), the fit was validated by manually fitting a smaller portion of the
data (after the visible lag time) to a simple single exponential function. The data fit was not
improved by using double, triple or quadruple exponential functions. (Note that where the lag
time < 90 s it was not possible to assign a value to the lag time because of the 60 s
experimental dead-time.)
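For readers who wish to reproduce the fit outside Prism, the phenomenological model can be sketched in Python as below. This is an illustration under stated assumptions, not the code used for the analysis: the function name is hypothetical, and the drift terms are read as applying only after the lag phase, since the text describes the lag phase as a flat line.

```python
import math


def od400_model(t, t0, P, T, kA, a=0.0, b=0.0):
    """Light scattering at time t (s): flat lag phase, then single exponential plus drift."""
    if t < t0:
        return P  # flat plateau during the lag phase
    growth = (T - P) * (1.0 - math.exp(-kA * (t - t0)))
    return P + growth + a * t + b * t * t  # a, b: linear and quadratic drift


# Example with the wild-type TI I27 parameters from Table S1 (P and T are made-up values).
for t in (100, 334, 1000, 5000):
    print(t, round(od400_model(t, t0=334, P=0.05, T=1.0, kA=0.0018), 3))
```

A function of this form can be passed to a general least-squares routine to estimate the lag time and the elongation rate from a measured time course.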
In order to determine the value of the final aggregation plateau (i.e. the light scattering at t =
∞), the OD400 was plotted versus 1/t for a series of times after the end of the initial exponential
phase; linear extrapolation to the y-intercept then gives the final plateau point, OD∞.
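The extrapolation to infinite time amounts to a straight-line fit of OD400 against 1/t, with the intercept taken as the plateau. A minimal Python sketch using ordinary least squares is given below; the function name and the synthetic data are illustrative assumptions, and the original analysis may have performed the extrapolation differently (for example graphically or within Prism).

```python
def extrapolate_plateau(times, od_values):
    """Fit OD against 1/t by least squares and return the intercept (OD at t = infinity)."""
    xs = [1.0 / t for t in times]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(od_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, od_values))
    slope = sxy / sxx
    return mean_y - slope * mean_x


# Synthetic late-time data that follow OD = 1.2 - 50/t exactly.
times = [2000.0, 4000.0, 8000.0]
od = [1.2 - 50.0 / t for t in times]
print(round(extrapolate_plateau(times, od), 6))
```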
Analysis of the data using the simpler procedure of comparing the half-times for the
aggregation reactions, rather than by explicit analysis of the lag and kinetic phases as
described here, gives very similar results and the conclusions are unchanged.
Table S2. Domains in proteins of genomes from the three kingdoms of life.

                                    % proteins        % proteins with   % multi-domain    Longest array
                                    with structural   assignments       proteins with     of tandem
                                    domain            that are          tandem            homologous
Genome                              assignments       multi-domain      homologous        domains
                                                                        domains
Vertebrates
  Human                             59                77                27                45
  Mouse                             63                72                26                32
  Fugu (fish)                       44                56                24                45
Invertebrates
  Worm (C. elegans)                 51                73                16                44
  Fly (D. melanogaster)             54                80                19                32
Unicellular eukaryotes
  Budding yeast (S. cerevisiae)     47                77                13                10
  Fission yeast (S. pombe)          56                74                14                8
  Encephalitozoon cuniculi          50                65                15                6
Plant
  A. thaliana                       55                76                13                8
Bacteria
  E. coli                           57                55                13                16
  Y. pestis                         54                54                14                25
  B. subtilis                       54                52                14                6
  M. pneumoniae                     52                59                12                3
Archaea
  A. fulgidus                       57                50                16                5
  M. jannaschii                     57                48                15                5
  T. acidophilum                    60                51                14                5

The genomes and structural domain assignments are taken from the SUPERFAMILY
database version 1.61 (ref. 4). The parameters for adjacency between domains, and for unassigned
regions equivalent to a domain, are also taken directly from this database. The fraction of
assigned proteins that have two or more domains is between two thirds and three quarters for
most eukaryotes, and closer to one half for prokaryotes. Multi-cellular animals are enriched in
proteins with repeats of two or more tandem homologous domains. (Homology is defined as
domains that belong to the same superfamily, as described in the SCOP database (ref. 5).) These
proteins include muscle proteins such as titin, as well as proteins involved in multi-cellularity such as
extracellular matrix proteins, cell adhesion proteins and cell signalling molecules. Thus the
fraction of multi-domain proteins with tandem homologous domains is approximately one
quarter in vertebrates, and slightly lower in invertebrates. The proteins with the longest
arrays of tandem homologous domains are also found in vertebrates and invertebrates.
Proteins with >30 tandem repeats include muscle proteins such as titin and extracellular
matrix proteins, which all contain long arrays of Ig and fnIII domains. Interestingly, the
proteins in bacteria, such as E. coli and Y. pestis, with tandem repeats of >20 domains are
cell adhesion proteins used for host invasion in pathogenic bacteria.
Table S3. Length distribution of repeats of tandem homologous domains in
human.

Number of tandem       % of proteins with tandem
homologous domains     homologous domains
2                      41
3                      32
4                      10
5                      5
6                      3
7                      4
8                      1
9                      1
10                     1
11                     0.5
12                     0.5
13                     0.32
14                     0.12
15                     0.12
16                     0.12
18                     0.07
19                     0.05
20                     0.02
22                     0.02
24                     0.02
25                     0.02
28                     0.02
29                     0.07
31                     0.02
45                     0.02

About one quarter of the multi-domain proteins in the human genome contain tandem repeats
of two or more homologous domains. About three quarters of these proteins have either two
or three homologous domains in a consecutive array, as shown in this table. Only about three
percent contain ten or more domains, with up to forty-five homologous domains as a
maximum in this data set of human proteins.
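The summary figures in this footnote can be recomputed directly from the percentages in Table S3. The short Python sketch below simply sums the tabulated values; the variable names are ours.

```python
# Percentage of proteins (with tandem homologous domains) by array length, from Table S3.
percent_by_length = {
    2: 41, 3: 32, 4: 10, 5: 5, 6: 3, 7: 4, 8: 1, 9: 1, 10: 1,
    11: 0.5, 12: 0.5, 13: 0.32, 14: 0.12, 15: 0.12, 16: 0.12,
    18: 0.07, 19: 0.05, 20: 0.02, 22: 0.02, 24: 0.02, 25: 0.02,
    28: 0.02, 29: 0.07, 31: 0.02, 45: 0.02,
}

two_or_three = percent_by_length[2] + percent_by_length[3]
ten_or_more = sum(p for n, p in percent_by_length.items() if n >= 10)
total = sum(percent_by_length.values())

print(two_or_three)           # about three quarters
print(round(ten_or_more, 2))  # about three percent
print(round(total, 2))        # close to 100, allowing for rounding in the table
```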
Table S4. The largest repeat superfamilies in the human genome

                                                        Number of        Number of
Superfamily name in SCOP                                adjacent pairs   non-adjacent pairs
TPR-like                                                40               40
WW domain                                               42               37
Kringle-like                                            45               35
Concanavalin A-like lectins/glucanases                  49               23
Spermadhesin, CUB domain                                50               377
Homeodomain-like                                        43               46
Eukaryotic type KH-domain (eKH-domain)                  59               141
Ankyrin repeat                                          61               59
Integrin A (or I) domain                                62               234
SH3-domain                                              65               80
C2 domain (Calcium/lipid-binding domain, CaLB)          68               95
C-type lectin-like                                      73               136
PDZ domain-like                                         98               370
Actin-like ATPase domain                                120              23
Scavenger receptor cysteine-rich (SRCR) domain          153              525
Glucocorticoid receptor-like (DNA-binding domain)       154              125
RNA-binding domain, RBD                                 218              126
P-loop containing nucleotide triphosphate hydrolases    239              180
Complement control module/SCR domain                    267              79
LDL receptor-like module                                276              940
Spectrin repeat                                         363              3677
Cadherin                                                671              2901
Fibronectin type III                                    573              4047
EGF/Laminin                                             1082             8374
Immunoglobulin                                          1165             4712
C2H2 and C2HC zinc fingers                              5934             23296

There are 25 superfamilies in the human genome that have forty or more adjacent pairs of
homologous domains and thirty or more non-adjacent pairs within the same protein
sequence, using the domain assignments in SUPERFAMILY v. 1.61 (ref. 4).
If a superfamily has small numbers of non-adjacent pairs, it means that most of the
homologous domains in the same sequence consist of a pair of adjacent domains. Larger
numbers of non-adjacent domain pairs are associated with long arrays of tandem
homologous domains.
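The link between array length and the number of non-adjacent pairs can be made explicit. For an idealised array of n consecutive homologous domains, there are n − 1 adjacent pairs and (n − 1)(n − 2)/2 non-adjacent pairs. The Python sketch below assumes such an uninterrupted array, with every neighbouring pair meeting the adjacency criterion; real proteins may contain several shorter arrays instead.

```python
def pairs_in_array(n: int) -> tuple[int, int]:
    """Return (adjacent, non_adjacent) pair counts for an array of n tandem domains."""
    adjacent = n - 1
    non_adjacent = (n - 1) * (n - 2) // 2  # all pairs minus the adjacent ones
    return adjacent, non_adjacent


for n in (2, 3, 10, 45):
    print(n, pairs_in_array(n))
```

A protein with only two homologous domains therefore contributes one adjacent pair and no non-adjacent pairs, whereas the non-adjacent count grows roughly as the square of the array length.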
Table S5: Sequence identities in the I-band of human cardiac titin
domain I-   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36  37  38  39  40
        1 100  17  17  23  19  16  22  23  19  24  25  24  20  24  29  25  27  22  27  23  14  23  15  19  24  18  17  18  19  15  16  27  24  16  17  21  22  21  21  18
        2  17 100  29  27  26  24  29  22  34  28  32  37  25  29  27  22  19  24  27  31  22  20  20  29  27  34  26  29  26  30  23  30  32  35  27  24  29  25  33  26
        3  18  30 100  30  29  28  26  25  28  30  29  30  19  18  25  21  24  25  19  25  17  26  25  26  25  28  22  24  21  28  21  28  37  30  30  28  24  28  29  22
        4  24  27  30 100  26  30  25  22  26  26  29  32  26  25  24  24  20  25  22  22  25  19  16  25  23  29  22  31  16  29  16  26  33  25  27  19  20  30  23  20
        5  20  27  29  27 100  26  28  29  31  26  28  29  30  21  33  20  19  24  20  22  21  20  21  37  23  27  21  34  23  26  21  29  33  32  28  26  28  28  30  23
        6  17  25  28  30  26 100  28  36  37  31  28  29  26  21  31  22  22  27  26  25  20  21  26  33  28  28  22  26  30  34  29  31  31  35  26  26  27  19  30  26
        7  24  30  26  26  28  28 100  28  26  29  24  34  28  24  30  29  22  25  26  29  26  24  30  31  28  33  31  28  21  35  27  31  35  35  27  25  30  26  35  26
        8  24  22  24  22  29  36  28 100  27  37  29  24  27  23  21  24  19  24  29  21  26  22  19  31  28  23  22  24  29  20  26  23  29  30  23  27  29  22  33  26
        9  20  36  28  27  31  37  26  27 100  39  35  35  31  20  30  24  19  20  26  33  18  27  25  31  28  31  24  21  28  31  22  34  31  26  22  26  24  24  26  34
       10  26  29  30  27  26  31  29  37  39 100  37  34  37  24  26  29  24  22  35  28  21  30  24  35  26  31  29  30  25  28  29  27  37  30  29  24  27  31  28  24
       11  27  34  29  29  28  28  24  29  35  37 100  37  33  24  30  24  20  26  28  27  17  24  19  27  24  31  21  30  27  25  25  28  29  27  26  18  33  34  28  26
       12  26  38  30  33  29  29  34  25  35  34  37 100  43  26  28  24  27  19  31  31  21  26  29  30  27  28  28  36  33  28  29  30  29  29  30  27  24  31  31  24
       13  21  25  19  26  30  25  27  26  31  36  32  42 100  20  25  21  15  18  26  25  23  16  19  21  15  23  20  24  19  21  25  21  26  24  21  13  21  29  23  24
       14  24  29  17  24  20  20  22  22  19  22  22  24  19 100  34  32  33  31  28  23  17  24  23  26  19  20  23  26  20  23  20  28  29  27  19  22  21  15  26  16
       15  30  27  24  24  32  30  29  20  29  25  29  27  25  34 100  27  24  31  27  25  23  20  18  26  27  26  19  27  22  27  19  28  29  22  19  23  30  24  24  23
       16  26  21  20  23  19  21  28  23  22  28  22  22  20  32  27 100  30  29  30  22  19  19  21  19  19  26  24  23  15  23  16  21  28  19  21  23  16  20  27  19
       17  27  19  22  19  18  21  21  18  18  22  19  25  15  33  23  29 100  22  26  23  19  17  16  24  20  19  22  17  20  19  14  21  24  22  19  21  21  20  17  16
       18  23  24  24  25  24  26  24  24  19  22  25  18  17  31  31  29  23 100  23  18  18  23  18  26  15  27  14  24  18  26  17  23  26  23  22  23  24  23  20  23
       19  28  27  18  21  19  24  24  28  24  33  27  30  26  28  27  30  27  22 100  29  17  24  22  24  22  28  23  19  19  22  21  24  28  28  22  23  21  26  23  23
       20  24  31  24  22  22  24  28  20  31  27  26  30  25  24  25  23  24  18  29 100  20  24  27  29  25  28  29  22  17  26  17  27  31  28  25  27  24  22  30  24
       21  14  22  16  25  20  19  25  25  17  20  16  20  23  17  23  19  19  18  17  20 100  17  16  23  20  25  17  15  14  24  13  23  28  25  27  22  20  23  20  17
       22  24  21  25  19  20  21  23  22  26  30  23  25  16  25  21  20  18  23  25  24  18 100  22  30  24  30  19  30  24  29  20  23  30  30  26  25  26  24  27  22
       23  16  21  25  17  21  26  30  19  25  24  19  29  19  25  19  22  17  19  24  28  17  22 100  25  30  31  47  19  29  28  48  46  28  31  22  29  19  20  22  19
       24  20  31  26  26  38  33  32  32  32  35  27  31  22  27  27  20  26  27  26  31  24  31  25 100  32  32  28  41  28  32  27  33  38  43  28  32  30  28  40  22
       25  26  28  25  24  24  28  28  28  28  26  24  27  16  20  28  20  21  16  24  26  21  25  30  31 100  25  31  22  52  25  30  40  36  35  25  30  28  24  35  22
       26  19  36  28  30  27  28  33  24  32  32  32  28  24  22  27  27  20  28  30  30  26  31  32  32  25 100  24  27  24  61  23  30  42  40  39  32  33  32  34  19
       27  18  27  22  22  21  22  31  22  24  29  21  28  20  25  20  26  24  15  25  30  18  19  47  28  31  24 100  25  25  20  57  42  29  29  25  28  21  22  26  20
       28  19  30  24  31  35  26  28  25  21  30  30  36  25  27  28  25  18  25  20  22  16  30  19  40  22  27  25 100  28  31  22  24  36  36  31  22  26  31  34  21
       29  20  27  21  17  24  30  21  29  28  25  27  33  19  21  22  16  21  19  20  18  15  25  29  28  52  24  25  28 100  22  30  34  26  31  18  24  26  30  28  24
       30  16  31  28  29  26  34  35  20  31  28  25  28  21  25  28  25  20  27  24  27  25  29  28  31  25  61  20  31  22 100  21  27  40  38  34  29  24  28  31  16
       31  17  24  21  17  21  29  27  26  22  29  25  29  26  21  20  17  15  18  22  18  13  20  48  27  30  22  57  22  30  21 100  39  28  27  25  22  25  21  25  19
       32  29  31  28  27  29  31  31  24  34  27  28  30  21  29  29  22  22  24  26  28  24  24  46  33  40  29  42  24  34  27  39 100  37  33  22  29  28  27  34  29
       33  26  34  37  34  34  31  35  29  31  37  29  29  27  30  30  29  26  27  29  33  29  30  28  37  36  42  29  36  26  40  28  37 100  51  37  28  35  27  45  26
       34  17  37  30  26  33  35  35  30  26  30  27  29  25  28  22  20  24  24  29  29  26  30  31  43  35  39  29  36  31  38  27  33  51 100  34  36  35  26  45  21
       35  18  28  30  28  28  26  27  24  22  29  26  30  21  20  20  22  20  22  24  26  28  27  22  28  25  38  25  31  18  34  25  22  37  34 100  29  29  20  27  16
       36  22  25  28  19  26  26  25  27  26  24  18  27  13  24  24  25  22  24  25  28  22  26  29  31  30  31  28  22  24  29  22  29  28  36  29 100  22  21  30  26
       37  23  29  23  19  27  26  29  28  23  26  31  23  20  22  30  16  22  24  22  24  20  26  18  28  27  31  20  25  25  23  24  27  33  33  28  22 100  31  34  20
       38  22  26  28  30  28  19  26  22  24  31  34  31  29  16  25  21  21  24  27  22  24  25  20  28  24  31  22  31  30  28  21  27  27  26  20  21  33 100  33  25
       39  22  35  29  24  30  30  35  34  26  28  28  31  24  27  25  28  18  21  25  31  21  28  22  39  35  34  26  34  28  31  25  34  45  45  27  30  36  33 100  31
       40  19  27  22  20  24  26  26  26  34  24  26  24  25  17  24  20  17  24  25  25  18  22  19  21  22  19  20  21  24  16  19  29  26  21  16  26  21  25  31 100

Pairwise sequence identities of Ig domains in the I-band of human cardiac titin. The alignment is taken from reference 1.
Only two adjacent domain pairs have > 40% sequence identity (coloured in yellow). Non-adjacent domain pairs with > 40% sequence identity are shaded in
grey.
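Since the colouring of the original table is not preserved in plain text, the adjacent pairs above a given threshold can be recovered by scanning the first off-diagonal of the identity matrix. The Python sketch below illustrates this on a three-domain excerpt of Table S5 (domains I-32 to I-34). The function name is hypothetical, and taking the larger of the two off-diagonal entries is our assumption, made because the tabulated matrix is not perfectly symmetric.

```python
def adjacent_pairs_above(matrix, first_label, threshold=40):
    """List (domain_i, domain_i+1, identity) for neighbouring domains above threshold."""
    hits = []
    for i in range(len(matrix) - 1):
        # Use the larger of the two entries for the pair (assumption, see text).
        identity = max(matrix[i][i + 1], matrix[i + 1][i])
        if identity > threshold:
            hits.append((first_label + i, first_label + i + 1, identity))
    return hits


# Excerpt of Table S5: rows and columns for domains I-32, I-33 and I-34.
excerpt = [
    [100, 37, 33],
    [37, 100, 51],
    [33, 51, 100],
]

print(adjacent_pairs_above(excerpt, first_label=32))
```

Applied to this excerpt, the scan reports the I-33/I-34 pair (51% identity) and not the I-32/I-33 pair (37%).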
References.
1. Improta, S., Politou, A. S. & Pastore, A. Immunoglobulin-like modules
from titin I-band: extensible components of muscle elasticity. Structure
4, 323-337 (1996).
2. Oosawa, F. & Asakura, S. Thermodynamics of the polymerization of
protein (Academic Press, London, 1975).
3. Fernandez, C. O. et al. NMR of alpha-synuclein-polyamine complexes
elucidates the mechanism and kinetics of induced aggregation. EMBO
J. 23, 2039-2046 (2004).
4. Gough, J. & Chothia, C. SUPERFAMILY: HMMs representing all
proteins of known structure. SCOP sequence searches, alignments
and genome assignments. Nucleic Acids Res. 30, 268-272 (2002).
5. Murzin, A. G., Brenner, S. E., Hubbard, T. & Chothia, C. SCOP - a
Structural Classification of Proteins database for the investigation of
sequences and structures. J. Mol. Biol. 247, 536-540 (1995).
6. Pearson, W. R. & Lipman, D. J. Improved tools for biological sequence
comparison. Proc. Natl. Acad. Sci. USA 85, 2444-2448 (1988).
7. Sander, C. & Schneider, R. Database of homology-derived protein
structures and the structural meaning of sequence alignment. Proteins
Struct. Funct. Genet. 9, 56-68 (1991).
8. Padrick, S. B. & Miranker, A. D. Islet amyloid: phase partitioning and
secondary nucleation are central to the mechanism of fibrillogenesis.
Biochemistry 41, 4694-4703 (2002).
9. Chiti, F. et al. Mutational analysis of the propensity for amyloid
formation by a globular protein. EMBO J. 19, 1441-1449 (2000).