Try these prediction tools with the following sequence (Sequence 2, Lecture 4 from the
website):
MLDQQTINIIKATVPVLKEHGVTITTTFYKNLFAKHPEVRPLFDMGRQESLEQPKALAMT
VLAAAQNIENLPAILPAVKKIAVKHCQAGVAAAHYPIVGQELLGAIKEVLGDAATDDILD
AWGKAYGVIADVFIQVEADLYAQAVE
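Most of the servers used below accept raw sequence or FASTA input. The following is a minimal Python sketch for joining the three lines above into one string, checking its length, and printing it as a FASTA record. The record name `Sequence_2` and the helper `to_fasta` are placeholders chosen for this illustration, not part of the exercise.

```python
# Sequence lines copied from the exercise text above.
lines = [
    "MLDQQTINIIKATVPVLKEHGVTITTTFYKNLFAKHPEVRPLFDMGRQESLEQPKALAMT",
    "VLAAAQNIENLPAILPAVKKIAVKHCQAGVAAAHYPIVGQELLGAIKEVLGDAATDDILD",
    "AWGKAYGVIADVFIQVEADLYAQAVE",
]
sequence = "".join(lines)


def to_fasta(name, seq, width=60):
    """Return seq as a FASTA record wrapped at `width` residues per line.

    `name` is a free-text label; it is an assumption of this sketch.
    """
    body = "\n".join(seq[i:i + width] for i in range(0, len(seq), width))
    return f">{name}\n{body}\n"


print(len(sequence))  # number of residues in the joined sequence
print(to_fasta("Sequence_2", sequence), end="")
```

The printed length can be compared with the sequence length reported by TMHMM and HMMTOP in section 2 as a quick check that nothing was lost when pasting.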
Try to think about the following questions:
1. What sort of secondary structure does this protein have?
2. Is it likely to cross the cell membrane?
3. What sort of tertiary structure is predicted?
4. Are there any proteins of known structure with similar sequence?
5. If so, what functions do they have?
1. Predict secondary structure
a. PSIPred results (version 2.3):
Conf: Confidence (0=low, 9=high)
Pred: Predicted secondary structure (H=helix, E=strand, C=coil)
AA: Target sequence
Conf: 998889999999999997348899999999999858357612352353446789999999
Pred: CCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCHHHHHHCCCCCHHHHHHHHHHHHH
AA:   MLDQQTINIIKATVPVLKEHGVTITTTFYKNLFAKHPEVRPLFDMGRQESLEQPKALAMT
              10        20        30        40        50        60

Conf: 999999741478899999999999988089766789999999999998626136989999
Pred: HHHHHHHHHCHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHCCCCCCHHHHH
AA:   VLAAAQNIENLPAILPAVKKIAVKHCQAGVAAAHYPIVGQELLGAIKEVLGDAATDDILD
              70        80        90       100       110       120

Conf: 99999999999999999999976329
Pred: HHHHHHHHHHHHHHHHHHHHHHHHCC
AA:   AWGKAYGVIADVFIQVEADLYAQAVE
             130       140
Helical regions: 4-35, 37-42, 48-69, 71-87, 92-109, 116-144
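The helical regions listed above can be read off the Pred lines by finding every unbroken run of H. The following is a minimal Python sketch of that step. The function name `runs` is chosen here for illustration, and the Pred strings are copied from the PSIPred output above and joined end to end.

```python
from itertools import groupby


def runs(pred, state="H"):
    """Return 1-based inclusive (start, end) ranges where pred equals state."""
    ranges = []
    pos = 1  # residue number of the first symbol in the current run
    for symbol, group in groupby(pred):
        length = len(list(group))
        if symbol == state:
            ranges.append((pos, pos + length - 1))
        pos += length
    return ranges


# Pred lines from the PSIPred output, concatenated in order.
psipred = (
    "CCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCHHHHHHCCCCCHHHHHHHHHHHHH"
    "HHHHHHHHHCHHHHHHHHHHHHHHHHHCCCCHHHHHHHHHHHHHHHHHHCCCCCCHHHHH"
    "HHHHHHHHHHHHHHHHHHHHHHHHCC"
)

print(", ".join(f"{start}-{end}" for start, end in runs(psipred)))
```

The same function works for the NNPredict string, and passing `state="E"` lists predicted strands instead of helices.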
The graphical version can be seen below:
b. NNPredict results:
Secondary structure prediction (H = helix, E = strand, - = no prediction):
------EEEE-----HH-----EEEEHHHHH----------------------HHHHHHH
HHHHHHHHH----HHHHHHHHHHHHHHHHHH--------HHHHHHHHHHH-----HHHHH
HHH-HHEEEHEHEHHHHHHHHHHH--
Main helical regions: 54-69, 74-91, 100-110, 116-123, 134-144
Conclusions: This protein appears to be made up mainly of alpha helices. PSIPred reports
high confidence for most of the residues it assigns to helices. The results from PSIPred and
NNPredict broadly agree, although those from PSIPred are probably slightly more reliable.
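One way to put a number on "broadly agree" is to compare the two prediction strings position by position. The sketch below counts residues called helix by both methods and divides by the residues called helix by either. This overlap ratio is an illustrative choice made here, not a measure reported by PSIPred or NNPredict, and the function name `helix_agreement` is hypothetical.

```python
def helix_agreement(pred_a, pred_b):
    """Compare two equal-length secondary structure strings.

    Returns (both, either, overlap): residues called 'H' by both methods,
    residues called 'H' by at least one, and both / either
    (0.0 if neither string contains a helix).
    """
    if len(pred_a) != len(pred_b):
        raise ValueError("predictions must have the same length")
    pairs = list(zip(pred_a, pred_b))
    both = sum(a == "H" and b == "H" for a, b in pairs)
    either = sum(a == "H" or b == "H" for a, b in pairs)
    return both, either, (both / either if either else 0.0)


# Toy example: 3 positions are helix in both strings, 4 in at least one.
print(helix_agreement("CCHHHHCC", "--HHH-EE"))  # (3, 4, 0.75)
```

To apply it to this exercise, pass the joined PSIPred Pred string and the joined NNPredict string. Any symbol other than H (C, E, or -) is simply treated as "not helix".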
2. Predict transmembrane regions
a. TMPred results:
The sequence positions in brackets denominate the core region.
Only scores above 500 are considered significant.
Inside to outside helices: 1 found
    from         to       score  center
  56 ( 58)   74 ( 74)      175      66
Outside to inside helices: 0 found
b. TMHMM results:
# Sequence Length: 146
# Sequence Number of predicted TMHs: 0
If the whole sequence is labelled as inside or outside, the prediction is that it contains no
membrane helices. It is probably not wise to interpret it as a prediction of location. The
prediction gives the most probable location and orientation of transmembrane helices in the
sequence.
c. HMMTOP results:
Length: 146
N-terminus: OUT
Number of transmembrane helices: 0
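Conclusions: TMPred's only candidate helix (residues 56-74) scores 175, well below its
significance threshold of 500, and TMHMM and HMMTOP both predict no transmembrane helices.
Since all three programs agree in failing to predict any significant transmembrane region,
there is no evidence to suggest that this protein is located within the cell membrane.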
3. Predict tertiary structure
a. SwissModel
SwissModel has predicted the following structure for the protein:
This model is based on the similarity of the query sequence to the following sequences that
are in the PDB database:
Sequence identity of templates with target:
4vhbA.pdb: 98.85 % identity
2vhbB.pdb: 100 % identity
2vhbA.pdb: 98.85 % identity
4vhbB.pdb: 100 % identity
3vhbA.pdb: 100 % identity
1vhbB.pdb: 100 % identity
3vhbB.pdb: 100 % identity
1vhbA.pdb: 100 % identity
1cqxA.pdb: 50.7 % identity
1cqxB.pdb: 50.7 % identity
1gvhA.pdb: 50.85 % identity
1oj6B.pdb: 26.4 % identity
1oj6C.pdb: 26.4 % identity
1oj6A.pdb: 26.4 % identity
1oj6D.pdb: 26.4 % identity
1q1fA.pdb: 25.3 % identity
1w92A.pdb: 25.3 % identity
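The percentages above depend on how identity is counted over the target-template alignment. The output shown does not state SwissModel's exact convention, so the Python sketch below assumes one common definition: identical residues divided by aligned columns in which neither sequence has a gap. The function name `percent_identity` and the short aligned strings are made up for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap.

    Assumes both inputs come from the same pairwise alignment and therefore
    have equal length. Returns 0.0 if no gap-free column exists.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical alignment: 7 gap-free columns, 6 of them identical.
print(round(percent_identity("MLDQ-TIN", "MLEQQTIN"), 2))  # 85.71
```

Other conventions divide by the length of the shorter sequence or by the full alignment length including gaps, so values from different tools are not always directly comparable.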
Searching PDB with the ID numbers of the templates used in the modelling reveals the
following results:
1vhb: Bacterial Dimeric Hemoglobin From Vitreoscilla Stercoraria
2vhb: Azide Adduct Of The Bacterial Hemoglobin From Vitreoscilla Stercoraria
3vhb: Imidazole Adduct Of The Bacterial Hemoglobin From Vitreoscilla Sp.
4vhb: Thiocyanate Adduct Of The Bacterial Hemoglobin From Vitreoscilla Sp.
This shows that the templates with the highest identity to the query (98.85-100 %) are
structures of bacterial haemoglobin from Vitreoscilla. Our query sequence may therefore have
a function similar to that of haemoglobin, i.e. that of oxygen transport.
b. EsyPred results:
“A 3D model of your protein has been built using the 3D structure 2GDM chain ' ' as template.
This template shares 24.7% identities with your query sequence (using the ALIGN program)
The target-template alignment is provided in attachment in the prot_29715354025928.ali file.
The 3D model of your protein is provided in attachment in the prot_29715354025928.pdb file.”
Searching PDB for 2GDM reveals that the model is based on the structure of leghemoglobin,
which is also involved in oxygen transport. Despite the much lower level of identity, the
model still appears to be similar to that generated by SwissModel.