Supplementary Material 1

Supplementary Material 1:
UniProt: The UniProt (Universal Protein Resource) database (http://www.uniprot.org/) is a free,
comprehensive online resource of protein sequences that are fully classified and accurately
annotated [1]. It is developed by the UniProt Consortium, which comprises groups from the European
Bioinformatics Institute (EBI), the SIB Swiss Institute of Bioinformatics (SIB) and the Protein
Information Resource (PIR). The database is updated every month and the sequences are submitted
in FASTA format.
PDBsum: PDBsum is a graphical database (http://www.ebi.ac.uk/pdbsum/) that provides pictorial
information on protein structures in both 2D and 3D formats [2]. It also reports protein chains,
ligands, protein-protein interaction diagrams and the numbers of helices, beta turns and gamma
turns, among other features. Moreover, it provides wiring diagrams and topology diagrams of the
query protein, as well as information about protein-protein interfaces and residue-residue
interactions. It is developed by the European Bioinformatics Institute (EBI). More information
about the server can be accessed from
http://www.ebi.ac.uk/thornton-srv/databases/cgibin/pdbsum/GetPage.pl?pdbcode=n/a&template=doc_about.html.
ProtParam: ProtParam is a web application that calculates physico-chemical properties from an
amino acid sequence [3]. The website can be accessed from http://web.expasy.org/protparam/. The
properties reported are molecular weight, theoretical pI, amino acid composition, atomic
composition, extinction coefficient, estimated half-life, instability index, aliphatic index and
grand average of hydropathicity (GRAVY). The protein under investigation can be specified as an
accession number or as a raw sequence. The documentation of the various parameters can be accessed
through http://web.expasy.org/protparam/protparam-doc.html.
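Three of the sequence-derived properties listed above can be illustrated with a short sketch. This is not ProtParam's own code: it assumes the commonly used Kyte-Doolittle hydropathy scale for GRAVY and the usual aliphatic index formula (Ala + 2.9 x Val + 3.9 x (Ile + Leu), in mole percent), and the example sequence is hypothetical.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def composition_percent(sequence):
    """Percentage of each standard amino acid in the sequence."""
    seq = sequence.upper()
    counts = Counter(seq)
    return {aa: 100.0 * counts.get(aa, 0) / len(seq) for aa in KYTE_DOOLITTLE}


def gravy(sequence):
    """Grand average of hydropathicity: mean hydropathy per residue."""
    seq = sequence.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(sequence):
    """Aliphatic index from the mole percent of Ala, Val, Ile and Leu."""
    pct = composition_percent(sequence)
    return pct["A"] + 2.9 * pct["V"] + 3.9 * (pct["I"] + pct["L"])


# Hypothetical peptide, used only to show the calls.
example = "MKTLLVAGGELD"
print(composition_percent(example)["L"], gravy(example), aliphatic_index(example))
```

The composition function returns percentages of the same kind as the per-species amino acid frequencies listed under Figures 12 and 13.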
Pfam: Pfam is a comprehensive database (http://pfam.xfam.org/) of protein domains and families
developed by the Wellcome Trust Sanger Institute, UK; the University of Helsinki, Finland; the
University of Oxford, UK; the Stockholm Bioinformatics Centre, Sweden; and the Janelia Farm
Research Campus, USA [4]. The current release notes can be accessed from
ftp://ftp.sanger.ac.uk/pub/databases/Pfam/current_release/relnotes.txt. It uses UniProt as its
reference sequence database. Pfam relies on methods such as jackhmmer [5] and Hidden Markov
Models [6] to produce its results. Domain and family identification is done semi-automatically
based on expert knowledge, sequence similarity, other protein family databases and the ability of
HMM profiles to correctly identify and align the sequences. The data are constantly shared with
other databases such as Structural Classification of Proteins (SCOP) [7] and the CATH protein
structure classification [8].
InterProScan: InterProScan (http://www.ebi.ac.uk/interpro/) is another resource that contains
broad information about protein domains and families [9]. The results obtained from Pfam were
cross-checked and compared with the results of InterProScan. Sequences (amino acid or nucleic
acid) submitted to InterProScan are matched against the signatures from several different
databases. Sequences are submitted in FASTA format. More information can be accessed from
http://www.ebi.ac.uk/interpro/about.html.
Supplementary Material 2:
I-TASSER: Iterative Threading ASSEmbly Refinement (I-TASSER)
(http://zhanglab.ccmb.med.umich.edu/I-TASSER/) is an algorithm for predicting three-dimensional
protein structure from amino acid sequences [10]. It identifies structure templates from the
Protein Data Bank by fold recognition. Full-length structure models are built by excising
continuous fragments from the threading alignments and reassembling them using replica-exchange
Monte Carlo simulations [11]. Users submit amino acid sequences as input. More information about
I-TASSER can be accessed from http://zhanglab.ccmb.med.umich.edu/I-TASSER/about.html. The
I-TASSER server was ranked the No. 1 server in the Critical Assessment of Techniques for Protein
Structure Prediction (CASP) experiments 7, 8, 9 and 10. I-TASSER also uses LOMETS V3.0 for
protein structure prediction; its documentation can be accessed from
http://zhanglab.ccmb.med.umich.edu/LOMETS/readme.txt. The best model for energy minimization was
chosen based on the maximum C-score and the maximum number of decoys. The C-score is a confidence
score for estimating the quality of the models predicted by I-TASSER. It is calculated from the
significance of the threading template alignments and the convergence parameters of the structure
assembly simulations. The C-score typically lies in the range [-5, 2], where a higher value
signifies a model with higher confidence and vice versa. A higher cluster density means that the
structure occurs more often in the simulation trajectory and therefore signifies a better quality
model.
Discovery Studio: Discovery Studio is a client-server software suite developed and distributed
by Accelrys (http://accelrys.com/products/discovery-studio/). It is a well-known collection of
algorithms for computational chemistry, computational biology, cheminformatics, molecular
simulations and quantum mechanics. It incorporates programs such as CHARMM [12], MODELLER [13],
DELPHI [14] and ZDOCK [15]. All thirty peptides were prepared using the Build and Edit Protein
module of Discovery Studio 3.1, in which the build action was used to create and grow chains of
amino acids as desired. The generated peptides were minimized with the CHARMM force field using
a spherical cutoff for electrostatics and the Smart Minimizer algorithm with a maximum of 200
steps.
Swiss-PDB Viewer: Swiss-PDB Viewer is an application that helps to analyze H-bonds, angles and
distances between atoms in proteins and to perform energy minimization; it can be accessed from
http://spdbv.vital-it.ch/. The three-dimensional models generated by the I-TASSER web application
were further subjected to energy minimization using the steepest descent technique to eliminate
bad contacts between protein atoms. Swiss-PDB Viewer uses the GROMOS 43B1 force field [16], which
is mainly used to repair distorted geometries by removing internal constraints. The energy
minimization preferences were set to 1000 steps of steepest descent with a cutoff value of
0.500 Å. The delta E cutoff was maintained at 0.030 kJ/mol and the force acting on any atom was
set to the default value of 10.000. The energy minimization module in the Tools tab was used to
start the process. The minimized model was selected for molecular dynamics simulation studies.
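The stopping rules described above (a maximum number of steps and a delta E cutoff) can be illustrated with a toy steepest descent loop. This is only a sketch of the general technique, not Swiss-PDB Viewer's implementation: the harmonic "bond" energy, its force constant, the starting coordinate and the step size are all made-up values.

```python
def steepest_descent(energy, gradient, x0, step=0.0005,
                     max_steps=1000, delta_e_cutoff=0.030):
    """Minimize energy(x) by repeatedly moving against the gradient.

    Stops after max_steps iterations, or earlier once the energy drop
    between two accepted steps falls below delta_e_cutoff.
    Returns (coordinates, energy, number of iterations used).
    """
    x = list(x0)
    e_old = energy(x)
    for n in range(1, max_steps + 1):
        g = gradient(x)
        x_new = [xi - step * gi for xi, gi in zip(x, g)]
        e_new = energy(x_new)
        if e_new > e_old:
            # The step overshot the minimum: halve it and try again.
            step *= 0.5
            continue
        converged = (e_old - e_new) < delta_e_cutoff
        x, e_old = x_new, e_new
        if converged:
            return x, e_old, n
    return x, e_old, max_steps


# Toy system: one harmonic "bond" with an illustrative force constant
# and equilibrium length (not taken from any force field).
K, D0 = 1000.0, 1.0


def bond_energy(x):
    return 0.5 * K * (x[0] - D0) ** 2


def bond_gradient(x):
    return [K * (x[0] - D0)]


coords, final_energy, n_iter = steepest_descent(bond_energy, bond_gradient, [1.5])
print(coords, final_energy, n_iter)
```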
GROMACS: The GROMACS 4.5.4 package [17] with the Amber99sb-ILDN force field [18] was used to
examine the stability of the modeled proteins. The protein models were solvated with the SPC/E
water model in a triclinic box extending 0.9 nm from the molecule to the edge of the box.
Periodic boundary conditions were applied in all directions and the total charge was adjusted to
zero. A maximum of 50,000 energy minimization steps was carried out for the protein models
using the steepest descent algorithm with a tolerance of 1000 kJ mol-1 nm-1. Subsequently, 50,000
steps of the conjugate gradient algorithm were used to minimize the protein models with the same
tolerance of 1000 kJ mol-1 nm-1. The solvated and minimized systems were considered reasonable
in terms of geometry and solvent orientation and were used for the further simulation steps.
All bond angles were constrained with the LINCS algorithm [19], while the SETTLE algorithm
[20] was used to constrain the geometry of the water molecules. The temperature was maintained
at 300 K by the V-rescale weak coupling method, while the Parrinello-Rahman method [21] was used
to maintain the pressure of the system at 1 atm. Position-restrained (PR) MD in both the NVT
(constant number of particles, volume and temperature) and NPT (constant number of particles,
pressure and temperature) ensembles was carried out for 100 ps. This pre-equilibrated system was
then used in the 3000 ps (3 ns) production molecular dynamics simulation (MDS) with a time step
of 2 fs. Structural coordinates were saved every 2 ps and analyzed using the analysis tools in
the GROMACS package. The lowest potential energy conformations were selected from the 3 ns MDS
trajectory for further Protein-Protein Interaction as well as Protein-Peptide Interaction
studies. The refined models were validated using the Structural Analysis and Verification Server
(SAVES). The above-mentioned protocol was also used for molecular dynamics simulation studies of
the Protein-Peptide-Protein complexes, which also showed the stability of the designed peptides.
More information can be accessed from http://www.gromacs.org/.
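The run lengths quoted above translate into step counts as follows. This is a small arithmetic sketch: the helper function is hypothetical, the frame count assumes the starting frame is also written, and the printed fragment shows only the four standard .mdp options that follow directly from these numbers, not the full parameter file used in the study.

```python
def md_schedule(total_ps, dt_fs, save_every_ps):
    """Return (number of steps, steps between saved frames, frames saved)."""
    total_fs = total_ps * 1000
    save_every_fs = save_every_ps * 1000
    n_steps = total_fs // dt_fs
    save_interval = save_every_fs // dt_fs
    # +1 assumes the frame at step 0 is written as well.
    n_frames = n_steps // save_interval + 1
    return n_steps, save_interval, n_frames


# Production run: 3000 ps with a 2 fs time step, saved every 2 ps.
n_steps, save_interval, n_frames = md_schedule(3000, 2, 2)
# Each 100 ps position-restrained equilibration phase (NVT or NPT).
equil_steps, _, _ = md_schedule(100, 2, 2)

mdp_fragment = "\n".join([
    "integrator = md",
    f"dt         = {2 / 1000}",      # time step in ps
    f"nsteps     = {n_steps}",
    f"nstxout    = {save_interval}",
])
print(mdp_fragment)
print(n_frames, equil_steps)
```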
SAVES: The Structural Analysis and Verification Server (SAVES) is a protein structure validation
server (http://nihserver.mbi.ucla.edu/SAVES/). It combines several programs, such as PROCHECK
[22], ERRAT [23] and VERIFY_3D [24], to assess a structure. PROCHECK checks the stereochemical
quality and overall geometry of a protein structure. ERRAT analyzes the non-bonded interactions
between different atom types, while VERIFY_3D examines the compatibility of a 3D model with its
amino acid sequence. The proteins under investigation were submitted as pdb files for structure
validation and verification. The parameters and working of the SAVES server can be accessed from
http://nihserver.mbi.ucla.edu/SAVES/Info.php.
Supplementary Material 3:
HADDOCK: HADDOCK is a user-friendly and popular web server that has been used extensively
for biomolecular docking [25]. It is one of the few docking platforms that explicitly take
flexibility into account in both the side chains and the backbone of the proteins. It uses both
NMR and non-NMR experimental information to guide the docking process. The HADDOCK web server
(http://haddock.science.uu.nl/services/HADDOCK/haddock.php) offers different levels of service
to users, which can be accessed through a simple registration process. The pdb files of the
proteins under consideration were submitted as input and the domain region was provided as the
active site residues. The default parameters of the web server were used for both
protein-protein and protein-peptide docking; they can be accessed from
http://haddock.science.uu.nl/services/HADDOCK/settings.html.
LIGPLOT: LIGPLOT is a program that generates schematic diagrams of protein-ligand and
protein-protein interactions [26]; it can be accessed from
https://www.ebi.ac.uk/thornton-srv/software/LIGPLOT/. Hydrogen bond formation between the proteins
under investigation was analysed using the DIMPLOT module of LIGPLOT. The maximum H-A distance
for hydrogen bond formation was kept at 2.70 Å and the maximum D-A distance at 3.35 Å, where
H = hydrogen, A = acceptor and D = donor.
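The two distance criteria above can be written as a simple geometric test. The sketch below only illustrates the cutoffs and is not DIMPLOT's code; the function name and the coordinates are hypothetical.

```python
import math

MAX_H_A = 2.70  # maximum hydrogen-acceptor distance (Å)
MAX_D_A = 3.35  # maximum donor-acceptor distance (Å)


def is_hydrogen_bond(donor, hydrogen, acceptor):
    """True if both distance criteria are met for (x, y, z) coordinates."""
    h_a = math.dist(hydrogen, acceptor)
    d_a = math.dist(donor, acceptor)
    return h_a <= MAX_H_A and d_a <= MAX_D_A


# Made-up coordinates in Å, placed on one axis for readability.
donor = (0.0, 0.0, 0.0)
hydrogen = (1.0, 0.0, 0.0)
print(is_hydrogen_bond(donor, hydrogen, (2.9, 0.0, 0.0)))  # H-A 1.9, D-A 2.9
print(is_hydrogen_bond(donor, hydrogen, (3.5, 0.0, 0.0)))  # D-A 3.5 exceeds 3.35
```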
PROPKA: The PROPKA web server (http://propka.ki.ku.dk/) helps to estimate the pKa values of amino
acids as they exist within proteins. PROPKA 3.1 was also used to check the stability of the
protein-peptide complex [27]. The pdb structure file was provided as input and the results were
calculated using the default values of the server.
PISA: The Protein Interfaces, Surfaces and Assemblies (PISA) web server
(http://www.ebi.ac.uk/pdbe/pisa/pistart.html) was used for salt bridge analysis [28]. The
number of salt bridges was then used to assess the likely stability of the interface. PISA uses a
distance cutoff of 4 Å for salt bridge formation. The pdb structure files of the proteins under
consideration were submitted as input. Further details can be found at
http://www.ebi.ac.uk/msd-srv/prot_int/pistart.html. The European Bioinformatics Institute (EBI)
is responsible for maintaining the web server.
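A distance-based salt bridge search of this kind can be sketched as follows. This is not PISA's algorithm: the choice of charged side-chain atoms (Asp OD1/OD2, Glu OE1/OE2, Arg NE/NH1/NH2, Lys NZ) is a common convention assumed here for illustration, and the atom records and coordinates are made up.

```python
import math

SALT_BRIDGE_CUTOFF = 4.0  # Å, the distance quoted in the text

# Assumed charged side-chain atoms per residue type (illustrative convention).
ACIDIC_ATOMS = {"ASP": ("OD1", "OD2"), "GLU": ("OE1", "OE2")}
BASIC_ATOMS = {"ARG": ("NE", "NH1", "NH2"), "LYS": ("NZ",)}


def find_salt_bridges(atoms, cutoff=SALT_BRIDGE_CUTOFF):
    """List inter-chain acid-base atom pairs within the cutoff.

    Each atom is (chain, residue name, residue number, atom name, (x, y, z)).
    """
    acids = [a for a in atoms if a[3] in ACIDIC_ATOMS.get(a[1], ())]
    bases = [a for a in atoms if a[3] in BASIC_ATOMS.get(a[1], ())]
    bridges = []
    for acid in acids:
        for base in bases:
            if acid[0] == base[0]:
                continue  # only pairs across the interface
            d = math.dist(acid[4], base[4])
            if d <= cutoff:
                bridges.append((f"{acid[0]}:{acid[1]} {acid[2]}",
                                f"{base[0]}:{base[1]} {base[2]}",
                                round(d, 2)))
    return bridges


# Hypothetical atoms; the coordinates are not from any real structure.
atoms = [
    ("A", "ASP", 1, "OD1", (0.0, 0.0, 0.0)),
    ("B", "LYS", 2, "NZ", (2.7, 0.0, 0.0)),
    ("B", "ARG", 3, "NH1", (10.0, 0.0, 0.0)),
    ("A", "LYS", 4, "NZ", (1.0, 1.0, 0.0)),
]
print(find_salt_bridges(atoms))
```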
DrugScorePPI: DrugScorePPI is a knowledge-based web server for computational alanine scanning
in protein-protein interfaces [29]. It uses a QSAR approach with respect to experimental
binding free energy differences between wild-type proteins and Ala mutants for protein-protein
complex formation. The server automatically scans for the interface residues of the given
biomolecular complex and can be accessed from http://cpclab.uni-duesseldorf.de/dsppi/.
It first calculates ΔGWT (wild type), then mutates one of the interface residues to alanine and
calculates ΔGMUT (mutant), from which ΔΔG (the change in binding free energy) is obtained by
subtracting ΔGWT from ΔGMUT. This procedure is repeated until ΔΔG has been calculated for all
interface residues. The input was provided in the form of a pdb file.
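The ΔΔG bookkeeping described above amounts to one subtraction per interface residue. The sketch below shows only that arithmetic, with made-up energies and residue labels; it does not reproduce DrugScorePPI's scoring function.

```python
def alanine_scan_ddg(dg_wild_type, dg_mutants):
    """Return ΔΔG = ΔG(mutant) - ΔG(wild type) for each mutated residue."""
    return {residue: dg_mut - dg_wild_type
            for residue, dg_mut in dg_mutants.items()}


# Hypothetical binding free energies in kcal/mol (not results of the study).
dg_wt = -10.0
dg_mut = {"residue_1": -8.5, "residue_2": -10.0, "residue_3": -9.0}

ddg = alanine_scan_ddg(dg_wt, dg_mut)
# Residues sorted from the largest loss of binding energy to the smallest.
ranked = sorted(ddg, key=ddg.get, reverse=True)
print(ddg, ranked)
```

A positive ΔΔG under this sign convention means the alanine mutant binds less favourably than the wild type.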
References
1. Bairoch, A., et al., The Universal Protein Resource (UniProt). Nucleic Acids Res, 2005.
33(Database issue): p. D154-9.
2. Laskowski, R.A., V.V. Chistyakov, and J.M. Thornton, PDBsum more: new summaries and
analyses of the known 3D structures of proteins and nucleic acids. Nucleic Acids Research, 2005.
33(suppl 1): p. D266-D268.
3. Gasteiger, E., et al., Protein identification and analysis tools on the ExPASy server, in The
proteomics protocols handbook. 2005, Springer. p. 571-607.
4. Finn, R.D., et al., Pfam: clans, web tools and services. Nucleic Acids Research, 2006.
34(Database issue): p. D247-51.
5. Johnson, L.S., S.R. Eddy, and E. Portugaly, Hidden Markov model speed heuristic and iterative
HMM search procedure. BMC bioinformatics, 2010. 11(1): p. 431.
6. Baum, L.E., et al., A maximization technique occurring in the statistical analysis of
probabilistic functions of Markov chains. The annals of mathematical statistics, 1970: p. 164-171.
7. Murzin, A.G., et al., SCOP: a structural classification of proteins database for the
investigation of sequences and structures. Journal of molecular biology, 1995. 247(4): p. 536-540.
8. Pearl, F.M.G., et al., The CATH database: an extended protein family resource for structural
and functional genomics. Nucleic acids research, 2003. 31(1): p. 452-455.
9. Quevillon, E., et al., InterProScan: protein domains identifier. Nucleic Acids Res, 2005.
33(Web Server issue): p. W116-20.
10. Roy, A., A. Kucukural, and Y. Zhang, I-TASSER: a unified platform for automated protein
structure and function prediction. Nature protocols, 2010. 5(4): p. 725-738.
11. Andrieu, C., et al., An introduction to MCMC for machine learning. Machine learning, 2003.
50(1-2): p. 5-43.
12. Brooks, B.R., et al., CHARMM: the biomolecular simulation program. Journal of computational
chemistry, 2009. 30(10): p. 1545-1614.
13. Eswar, N., et al., Comparative protein structure modeling using Modeller. Current protocols in
bioinformatics, 2006: p. 5.6.1-5.6.30.
14. Rocchia, W., E. Alexov, and B. Honig, Extending the applicability of the nonlinear
Poisson-Boltzmann equation: Multiple dielectric constants and multivalent ions. The Journal of
Physical Chemistry B, 2001. 105(28): p. 6507-6514.
15. Chen, R., L. Li, and Z. Weng, ZDOCK: An initial-stage protein-docking algorithm. Proteins:
Structure, Function, and Bioinformatics, 2003. 52(1): p. 80-87.
16. Scott, W.R., et al., The GROMOS biomolecular simulation program package. The Journal of
Physical Chemistry A, 1999. 103(19): p. 3596-3607.
17. Pronk, S., et al., GROMACS 4.5: a high-throughput and highly parallel open source molecular
simulation toolkit. Bioinformatics, 2013. 29(7): p. 845-854.
18. Lindorff-Larsen, K., et al., Improved side-chain torsion potentials for the Amber ff99SB
protein force field. Proteins: Structure, Function, and Bioinformatics, 2010. 78(8): p. 1950-1958.
19. Hess, B., et al., LINCS: a linear constraint solver for molecular simulations. Journal of
computational chemistry, 1997. 18(12): p. 1463-1472.
20. Miyamoto, S. and P.A. Kollman, SETTLE: an analytical version of the SHAKE and RATTLE
algorithm for rigid water models. Journal of computational chemistry, 1992. 13(8): p. 952-962.
21. Martoňák, R., A. Laio, and M. Parrinello, Predicting crystal structures: the Parrinello-Rahman
method revisited. Physical review letters, 2003. 90(7): p. 075503.
22. Laskowski, R.A., et al., PROCHECK: a program to check the stereochemical quality of protein
structures. Journal of applied crystallography, 1993. 26(2): p. 283-291.
23. Colovos, C. and T.O. Yeates, Verification of protein structures: patterns of nonbonded atomic
interactions. Protein Science, 1993. 2(9): p. 1511-1519.
24. Bowie, J.U., R. Luthy, and D. Eisenberg, A method to identify protein sequences that fold into
a known three-dimensional structure. Science, 1991. 253(5016): p. 164-170.
25. de Vries, S.J., M. van Dijk, and A.M. Bonvin, The HADDOCK web server for data-driven
biomolecular docking. Nature Protocols, 2010. 5(5): p. 883-97.
26. Laskowski, R.A. and M.B. Swindells, LigPlot+: Multiple Ligand-Protein Interaction Diagrams for
Drug Discovery. Journal of Chemical Information and Modeling, 2011. 51(10): p. 2778-2786.
27. Li, H., A.D. Robertson, and J.H. Jensen, Very fast empirical prediction and rationalization of
protein pKa values. Proteins: Structure, Function, and Bioinformatics, 2005. 61(4): p. 704-721.
28. Krissinel, E. and K. Henrick, Inference of macromolecular assemblies from crystalline state.
Journal of molecular biology, 2007. 372(3): p. 774-797.
29. Kruger, D.M. and H. Gohlke, DrugScorePPI webserver: fast and accurate in silico alanine
scanning for scoring protein-protein interactions. Nucleic Acids Research, 2010. 38(Web Server
issue): p. W480-6.
Supplementary Material 4:
TABLES:
Table 6: Domain function screening for the proteins under consideration
Tools in use: Pfam, InterProScan
Proteins:
AtPOT1b
POT1 domain (13-143)
Telo_bind domain (13-143)
AtTRB1
Myb DNA binding domain (5-55)
Linker Histone family (123-178)
SANT/Myb domain (5-55)
Histone H1/H5 domain (123-182)
AtTRB2
Myb DNA binding domain (5-55)
Histone H1/H5 domain (125-182)
AtTRB3
Myb DNA binding domain (5-55)
SANT/Myb domain (5-55)
SANT/Myb domain (5-55)
Histone H1/H5 domain (125-182)
Histone H1/H5 domain (122-180)
Table 7: List of top ten templates used by I-TASSER for three-dimensional (3D) structure
prediction
Protein Name    Templates
POT1b           2i0qA, 1jb7A, 1xjvA, 3kjpA
TRB1            2osxA, 2lsoA, 1hstA, 4fsxA, 2juhA, 1h89A, 1hstA, 1h88C, 3hfwA
TRB2            4fxgB, 2lsoA, 1hstA, 4fsxA, 2juhA, 1x58A, 1h88C, 4fxgB
TRB3            4fxgB, 1hstA, 1zrtD, 2juhA, 1x58A, 1h88C, 4fxgB, 2lsoA
Table 8: I-TASSER scores to identify the best model generated
Protein Name   I-TASSER Model   C-score   No. of decoys   Cluster density
POT1b          Model 1*         -0.31     2089            0.1410
POT1b          Model 2          -2.20     314             0.0212
POT1b          Model 3          -1.00     1049            0.0708
POT1b          Model 4          -3.27     108             0.0073
POT1b          Model 5          -3.69     71              0.0048
TRB1           Model 1*         -3.24     703             0.0242
TRB1           Model 2          -3.37     621             0.0214
TRB1           Model 3          -3.66     461             0.0159
TRB1           Model 4          -3.98     335             0.0115
TRB1           Model 5          -4.19     272             0.0094
TRB2           Model 1*         -2.44     2315            0.0533
TRB2           Model 2          -4.67     251             0.0058
TRB2           Model 3          -4.73     235             0.0054
TRB2           Model 4          -4.90     199             0.0046
TRB2           Model 5          -5.00     175             0.0040
TRB3           Model 1*         -2.20     2336            0.0696
TRB3           Model 2          -3.54     611             0.0182
TRB3           Model 3          -4.69     193             0.0058
TRB3           Model 4          -4.82     169             0.0050
TRB3           Model 5          -4.86     162             0.0048
* represents the best models generated by the I-TASSER server.
The number of decoys for the best models ranged from 703 to 2336, as shown in Table 2. The
template modelling score (TM-score) was used to assess the structural similarity between the
models and the templates. The TM-scores of the best models were 0.67±0.13, 0.35±0.12, 0.43±0.14
and 0.45±0.15 for the proteins AtPOT1b, AtTRB1, AtTRB2 and AtTRB3, respectively. The number of
decoys is directly proportional to the cluster density: the more decoys, the higher the density
value, which indirectly reflects the stability of the structure. Models with higher C-scores also
had more decoys, which further supports the choice of the best structure. Based on this
reasoning, the best models were identified and taken forward for energy minimization.
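The selection rule can be checked against the Table 8 values with a few lines of code. The sketch below uses the C-scores, decoy counts and cluster densities exactly as listed in Table 8; the function names are hypothetical and the proportionality check is only a rough consistency test.

```python
# (model, C-score, number of decoys, cluster density), copied from Table 8.
TABLE_8 = {
    "POT1b": [("Model 1", -0.31, 2089, 0.1410), ("Model 2", -2.20, 314, 0.0212),
              ("Model 3", -1.00, 1049, 0.0708), ("Model 4", -3.27, 108, 0.0073),
              ("Model 5", -3.69, 71, 0.0048)],
    "TRB1": [("Model 1", -3.24, 703, 0.0242), ("Model 2", -3.37, 621, 0.0214),
             ("Model 3", -3.66, 461, 0.0159), ("Model 4", -3.98, 335, 0.0115),
             ("Model 5", -4.19, 272, 0.0094)],
    "TRB2": [("Model 1", -2.44, 2315, 0.0533), ("Model 2", -4.67, 251, 0.0058),
             ("Model 3", -4.73, 235, 0.0054), ("Model 4", -4.90, 199, 0.0046),
             ("Model 5", -5.00, 175, 0.0040)],
    "TRB3": [("Model 1", -2.20, 2336, 0.0696), ("Model 2", -3.54, 611, 0.0182),
             ("Model 3", -4.69, 193, 0.0058), ("Model 4", -4.82, 169, 0.0050),
             ("Model 5", -4.86, 162, 0.0048)],
}


def best_model(models):
    """Pick the model with the highest C-score."""
    return max(models, key=lambda m: m[1])


def decoy_density_ratios(models):
    """Decoys divided by cluster density; near-constant if proportional."""
    return [decoys / density for _, _, decoys, density in models]


for protein, models in TABLE_8.items():
    ratios = decoy_density_ratios(models)
    spread = (max(ratios) - min(ratios)) / min(ratios)
    print(protein, best_model(models)[0], f"ratio spread {spread:.1%}")
```

For every protein the highest C-score model is also the one with the most decoys, and within each protein the decoy-to-density ratio varies only slightly.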
Table 9: ProMotif results for all proteins from PDBsum server
Feature                            AtPOT1b   AtTRB1   AtTRB2   AtTRB3
No. of sheets                      7         None     1        1
No. of beta hairpins               7         None     1        1
No. of psi loops                   1         None     None     None
No. of beta bulges                 5         None     None     1
No. of strands                     19        None     2        2
No. of helices                     7         15       13       13
No. of helix-helix interactions    1         27       18       12
No. of beta turns                  60        33       28       31
No. of gamma turns                 10        11       8        7
Table 10: Salt bridge interactions of the three complexes as detected through PISA
A: AtPOT1b and AtTRB1
AtPOT1b       Dist. (Å)   AtTRB1
A:ARG 13      2.77        B:ASP 122
A:ARG 282     3.72        B:ASP 149
A:ARG 282     3.03        B:ASP 149
A:GLU 149     3.9         B:ARG 120
A:GLU 149     2.99        B:ARG 120
A:ASP 116     2.71        B:ARG 159
A:ASP 116     3.26        B:ARG 159
A:ASP 116     3.67        B:ARG 159
A:ASP 116     2.62        B:ARG 159
A:GLU 141     2.78        B:LYS 164
A:GLU 141     2.71        B:LYS 164
A:GLU 130     2.59        B:LYS 166
A:ASP 50      2.6         B:LYS 173
A:ASP 16      2.67        B:LYS 176
A:ASP 16      2.76        B:LYS 176
B: AtPOT1b and AtTRB2
AtPOT1b       Dist. (Å)   AtTRB2
A:LYS 10      2.88        B:GLU 153
A:LYS 10      2.79        B:GLU 254
A:ASP 8       3.81        B:ARG 293
A:ASP 8       3.77        B:ARG 293
A:ASP 8       2.67        B:ARG 293
A:ASP 16      2.67        B:ARG 163
A:ASP 16      3.56        B:ARG 163
A:ASP 16      3.34        B:ARG 163
A:ASP 16      2.67        B:ARG 163
A:GLU 141     2.6         B:LYS 127
C: AtPOT1b and AtTRB3
AtPOT1b       Dist. (Å)   AtTRB3
A:ASP 16      3.64        B:LYS 135
A:ASP 16      2.65        B:ARG 136
A:ASP 16      3.85        B:ARG 136
A:ASP 50      2.65        B:LYS 135
A:ASP 50      2.82        B:LYS 135
A:GLU 149     2.73        B:ARG 161
A:GLU 149     3.55        B:ARG 161
A:GLU 149     3.5         B:ARG 161
A:GLU 149     2.7         B:ARG 161
Table 11: Propensity of important interacting residues between AtPOT1b and AtTRB1-3
AtPOT1b (Chain A) (total no. of amino acids)   AtTRB1-3 (Chain B) (total no. of amino acids)
6 Arginine                                     3 Aspartic acid / 1 Glycine / 1 Threonine / 1 Lysine
6 Asparagine                                   2 Arginine / 1 Asparagine / 1 Leucine / 1 Aspartic acid / 1 Serine
10 Glutamic acid                               4 Arginine / 2 Lysine / 1 Serine / 1 Tryptophan / 1 Tyrosine / 1 Glutamine
4 Serine                                       1 Serine / 1 Aspartic acid / 1 Arginine / 1 Asparagine
2 Tryptophan                                   1 Threonine / 1 Aspartic acid
14 Aspartic acid                               5 Lysine / 7 Arginine / 2 Asparagine
Numbers denote the number of times the amino acids are involved in hydrogen bond formation.
FIGURE LEGENDS:
Figure 8: Two- and three-dimensional structures of the best I-TASSER models after 3 ns simulation.
2D figures were generated with the PDBsum server, while 3D figures were prepared using Chimera
software. A, E represent AtPOT1b; B, F represent AtTRB1; C, G represent AtTRB2; D, H represent
AtTRB3. Oval dashed lines mark the specific regions of the proteins that interact with each other
for all three proteins, while the positions of the interacting residues can be inferred from the
2D figures.
Figure 9 A-D: Ramachandran plots of the four proteins as depicted by the PROCHECK server.
AtPOT1b, AtTRB1, AtTRB2 and AtTRB3 are represented by Figures A, B, C and D, respectively. Most
favored regions are colored red, additional allowed regions yellow, generously allowed regions
light yellow and disallowed regions white.
Figure 10 A: DIMPLOT result of AtPOT1b-AtTRB1 interaction.
Blue labels represent interacting residues of AtPOT1b while red represents AtTRB1.
Figure 10 B: DIMPLOT result of AtPOT1b-AtTRB2 interaction.
Blue labels represent interacting residues of AtPOT1b while red represents AtTRB2.
Figure 10 C: DIMPLOT result of AtPOT1b-AtTRB3 interaction.
Blue labels represent interacting residues of AtPOT1b while red represents AtTRB3.
Figure 11 A-B: PROPKA results of AtPOT1b with peptide and without peptide.
‘A’ represents AtPOT1b without peptide while ‘B’ represents AtPOT1b with peptide. Unbound AtPOT1b
requires a very high energy for stability (23.5 kcal mol-1), whereas peptide-bound AtPOT1b
requires a very low energy for stability (-44.4 kcal mol-1), suggesting that the binding of the
peptide to AtPOT1b is highly stable.
Figure 12: Distribution of amino acids of Protection of Telomeres 1 (POT1) protein in different
organisms.
Amino acid frequency of Protection of Telomeres 1 (POT1) protein in different organisms suggests
abundant presence of Leucine in all organisms.
Figure 13: Distribution of amino acids of Telomerase Reverse Transcriptase (TERT) protein in
different organisms.
Amino acid frequency of Telomerase Reverse Transcriptase (TERT) protein in different organisms shows
abundant presence of Leucine in all organisms.
FIGURES:
Figure 8:
Figure 9:
Figure 10 A:
Figure 10 B:
Figure 10 C:
Figure 11:
Figure 12:
Species Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Total
Arabidopsis_thaliana 4.185 3.5242 5.5066 6.6079 5.9471 4.4053 2.4229 6.8282 6.3877 9.0308 3.0837 4.185 4.6256 2.8634 6.8282 7.9295 4.185 6.8282 1.9824 2.6432 454
Arabidopsis_lyrata 4.4444 3.5556 4.8889 6.2222 5.5556 4.4444 2.8889 6.8889 6.2222 9.3333 2.8889 4.4444 4.4444 3.3333 6 7.7778 4.6667 7.3333 2 2.6667 450
Olimarabidopsis_pumila 4.3796 3.4063 5.1095 5.3528 5.8394 5.1095 2.6764 6.8127 6.326 9.7324 2.4331 3.6496 4.8662 3.4063 6.5693 8.2725 3.8929 7.5426 1.7032 2.9197 411
Lepidium_alyssoides 4.3796 3.4063 5.3528 5.5961 5.8394 4.8662 2.4331 6.8127 6.0827 9.7324 2.4331 4.1363 4.8662 3.4063 6.8127 7.7859 3.8929 7.5426 1.7032 2.9197 411
Brassica_oleracea 4.8889 3.1111 6.2222 5.3333 7.1111 4.6667 2.2222 6.6667 6 8.8889 2.2222 4.6667 5.3333 2.8889 6.2222 8.4444 4.6667 6.4444 1.7778 2.2222 450
Neslia_paniculata 4.6569 3.6765 4.1667 6.1275 5.1471 4.4118 2.9412 6.6176 6.1275 10.539 2.6961 4.4118 4.902 3.4314 6.1275 7.598 4.4118 7.8431 1.4706 2.6961 408
Boechera_platysperma 4.3902 3.1707 4.3902 6.3415 6.0976 5.3659 2.9268 7.0732 6.8293 9.7561 2.1951 3.9024 4.878 3.4146 6.3415 7.561 4.3902 6.8293 1.4634 2.6829 410
Cardaminopsis_arenosa 4.8469 3.3163 4.0816 5.8673 5.8673 5.3571 2.8061 6.3776 6.8878 11.48 2.2959 3.8265 4.5918 2.8061 5.8673 8.9286 3.8265 7.1429 1.2755 2.551 392
Arabidopsis_neglecta 5.102 2.8061 4.3367 5.6122 6.1224 5.3571 3.0612 6.3776 6.8878 11.48 2.2959 3.8265 4.5918 2.8061 6.1224 8.4184 3.5714 7.398 1.2755 2.551 392
Turritis_glabra 4.6036 3.5806 4.6036 5.6266 5.8824 5.6266 3.0691 6.6496 6.9054 10.742 2.046 3.3248 4.8593 2.3018 6.3939 7.1611 4.3478 8.1841 1.2788 2.8133 391
Cardamine_pulchella 3.8929 3.4063 4.6229 7.2993 5.8394 5.8394 2.4331 5.8394 6.0827 10.706 1.9465 4.1363 5.1095 3.4063 6.326 7.2993 5.1095 7.056 1.2165 2.4331 411
Pachycladon_stellatum 4.6341 3.4146 5.122 6.0976 6.8293 5.6098 2.9268 7.3171 7.0732 10 1.4634 2.9268 5.3659 3.1707 5.3659 6.5854 4.6341 6.5854 1.9512 2.9268 410
Lepidium_draba 4.6154 3.3333 4.8718 5.641 6.9231 5.641 3.3333 6.6667 6.6667 9.7436 2.0513 2.5641 4.6154 2.8205 6.1538 8.7179 4.359 7.1795 1.5385 2.5641 390
Matthiola_integrifolia 5.102 3.5714 5.102 5.6122 7.1429 4.8469 3.3163 6.8878 7.1429 9.949 1.5306 3.3163 4.5918 3.3163 6.3776 7.6531 4.5918 6.3776 1.5306 2.0408 392
Euclidium_syriacum 4.6036 3.3248 5.8824 4.8593 7.4169 5.1151 3.3248 6.3939 7.6726 8.6957 2.046 4.6036 4.6036 2.8133 5.3708 7.1611 5.3708 7.6726 1.5345 1.5345 391
Avg. 4.5757 3.375 4.9651 5.89 6.2307 5.0949 2.8395 6.6851 6.6039 9.9627 2.2554 3.878 4.8191 3.0829 6.1983 7.8209 4.3972 7.1881 1.5901 2.5475 410.87
Figure 13:
Species Ala Cys Asp Glu Phe Gly His Ile Lys Leu Met Asn Pro Gln Arg Ser Thr Val Trp Tyr Total
Homo_sapiens 8.7456 2.5618 3.0035 3.9753 4.1519 6.6254 3.0035 2.0318 3.5336 12.986 1.0601 1.8551 7.6855 4.1519 11.042 6.6254 5.1237 7.7739 1.5901 2.4735 1132
Mus_musculus 5.8824 3.1194 3.2086 3.4759 4.902 4.8128 2.9412 3.1194 4.6346 13.369 2.139 2.7629 6.0606 5.4367 8.6453 8.7344 5.3476 6.8627 1.426 3.1194 1122
Rattus_norvegicus 6.4889 3.0222 3.1111 3.2889 4.9778 5.6889 2.8444 2.8444 5.1556 13.067 1.9556 2.6667 6.4 5.1556 7.9111 8.6222 5.4222 7.0222 1.4222 2.9333 1125
Arabidopsis_thaliana 2.7605 3.2057 4.9866 4.3633 4.8085 4.0071 3.2057 5.4319 8.1033 10.864 1.6919 5.0757 4.1852 4.1852 7.3909 10.062 3.9181 6.5895 1.6028 3.5619 1123
Oryza_sativa 4.9245 4.7657 4.448 3.4948 4.2097 4.6863 3.2566 5.8777 7.3074 9.2931 1.9063 5.4011 3.6537 3.4154 7.1485 11.676 3.5743 5.56 1.1914 4.2097 1259
Tetrahymena_thermophila 1.5219 1.4324 4.3868 6.0877 7.0725 2.7753 0.8953 9.3107 11.638 10.027 1.6115 9.8478 2.1486 9.3107 2.6858 5.6401 3.6705 4.2077 0.6267 5.103 1117
Bos_taurus 10.756 2.7556 2.9333 3.5556 4.1778 8.2667 2.6667 1.3333 2.8444 13.6 0.8889 1.8667 7.7333 4.5333 12.089 5.6889 3.7333 7.2889 1.1556 2.1333 1125
Canis_familiaris 10.508 3.0276 2.7605 3.3838 4.0962 6.9457 3.0276 2.0481 3.1167 13.802 1.1576 2.1371 7.6581 4.3633 10.686 6.0552 4.3633 6.9457 1.2467 2.6714 1123
Oxytricha_trifallax 3.6219 1.8551 3.9753 6.0954 7.9505 3.1802 1.4134 7.6855 12.014 8.8339 3.0035 9.4523 2.3852 6.3604 3.4452 5.3004 4.1519 4.2403 0.7951 4.2403 1132
Avg. 6.1221 2.8856 3.6557 4.1821 5.1375 5.2154 2.5931 4.4258 6.4925 11.727 1.7157 4.572 5.3032 5.1862 7.8865 7.6526 4.3576 6.2683 1.2283 3.3925 1139.8