Protein Structure Prediction (10 points total)

Problem I
Definitions
Provide a BRIEF description for the terms listed below:
Smith-Waterman
BLOSUM62
UPGMA
Parsimony
ddA (di-deoxyA)
TBLASTX
Shannon entropy
Pseudoknot
Pseudocount
rotamer library
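As a rough illustration of two of the terms above (Shannon entropy and pseudocount), the sketch below computes the entropy of a single alignment column. The function name column_entropy, the DNA alphabet, and the uniform pseudocount scheme are assumptions made for this example and are not part of the problem.

```python
import math
from collections import Counter


def column_entropy(column, alphabet="ACGT", pseudocount=0.0):
    """Shannon entropy (in bits) of one alignment column.

    Illustrative assumptions: symbols outside `alphabet` (gaps, for
    example) are ignored, and the same pseudocount is added to every
    symbol in the alphabet.
    """
    counts = Counter(column)
    total = sum(counts[s] for s in alphabet) + pseudocount * len(alphabet)
    if total == 0:
        raise ValueError("column contains no symbols from the alphabet")
    entropy = 0.0
    for symbol in alphabet:
        p = (counts[symbol] + pseudocount) / total
        if p > 0:  # treat 0 * log(0) as 0
            entropy -= p * math.log2(p)
    return entropy
```

A fully conserved column such as "AAAA" has zero entropy without pseudocounts, and a column with all four bases equally represented has the maximum of 2 bits. Adding a pseudocount raises the entropy of the conserved column because unobserved symbols no longer have zero probability.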
Problem II
Secondary structure analysis
In secondary structure analysis, both the Chou-Fasman
algorithm and the Garnier-Osguthorpe-Robson (GOR) methods
are inherently statistical in nature. However, the
Chou-Fasman method is sometimes described as having some
“physical principles” contained within it, while the GOR is
sometimes described as an “information theory”-style
approach.
a. Describe the basis for the Chou-Fasman method and
explain why some describe it as having some “physical
principles” within it.
b. Describe the basis for the GOR secondary structure
predictions and explain why some refer to it as an
information-theory style approach.
c. JPRED is a consensus-based approach to secondary
structure predictions. Explain what this “consensus-based”
term means and explain why this approach gives the highest
overall accuracy in predicting secondary structure of
proteins.
Problem III
Protein Structure Prediction
Here is information about target T0140 from CASP5 (the same
as that given to assessment participants).
CASP5 Target T0140
1. Protein Name: 1b11
2. Organism Name: Synthetic protein
3. Number of amino acids (approx): 103
4. Accession number:
5. Sequence Database:
6. Amino acid sequence:
MRGSHHHHHHGSRLQSGKMTGIVKWFNADKGFGFITPDDGSKDVFVHFSAGSSGAAVRG
NPQQGDRVEGKIKSITDFGIFIGLDGGIDGLVHLSDISWAQAEA
7. Additional Information
1b11 is a synthetic protein constructed by non-homologous
recombination. The N-terminal part derives from cold
shock protein A (CspA), while the C-terminal segment
comes from the E. coli 30S ribosomal subunit protein
S1. (Riechmann L, Winter G. Novel folded protein
domains generated by combinatorial shuffling of
polypeptide segments. Proc Natl Acad Sci U S A. 2000
Aug 29;97(18):10068-73.)
8. Crystallization conditions: include MES pH5.6
The protein is a tetramer under native conditions, but
after denaturation, elutes at approximately the
molecular weight of dimer on gel filtration.
9. X-ray structure: yes
10. Current state of the experimental work: Completed
11. Interpretable map?: yes
12. Estimated date of chain tracing completion: June
13. Estimated date of public release of structure:
September
14. Name: unavailable until after public release of
structure
Here is the abstract from the Riechmann & Winter article
describing how target T0140 was constructed:
It has been proposed that the architecture of protein
domains has evolved by the combinatorial assembly and/or
exchange of smaller polypeptide segments. To investigate
this proposal, we fused DNA encoding the N-terminal half
of a beta-barrel domain (from cold shock protein CspA)
with fragmented genomic Escherichia coli DNA and cloned
the repertoire of chimeric polypeptides for display on
filamentous bacteriophage. Phage displaying folded
polypeptides were selected by proteolysis; in most cases
the protease-resistant chimeric polypeptides comprised
genomic segments in their natural reading frames.
Although the genomic segments appeared to have no
sequence homologies with CspA, one of the originating
proteins had the same fold as CspA, but another had a
different fold. Four of the chimeric proteins were
expressed as soluble polypeptides; they formed monomers
and exhibited cooperative unfolding. Indeed, one of the
chimeric proteins contained a set of very slowly
exchanging amides and proved more stable than CspA
itself. These results indicate that native-like proteins
can be generated directly by combinatorial segment
assembly from nonhomologous proteins, with implications
for theories of the evolution of new protein folds, as
well as providing a means of creating novel domains and
architectures in vitro.
a. Describe the three general strategies used for
structure prediction in CASP and when each is appropriate.
b. Outline how you would go about predicting the structure
of target T0140, justifying your choice of methods.
Describe at least 5 sequential steps you would take.
c. Describe two specific challenges where improvements are
needed in protein structure prediction today.
Problem IV
Protein Structure Modeling Approaches
For many approaches to protein structure analysis there are
two key parts to the problem: (1) how to search the
relevant sequence or structure space, and (2) how to
evaluate, or score, which sequence or structure is best.
a. Give examples of three distinct problems that we
studied in the protein structure part of the class. For
each, describe a search algorithm and a scoring function
that can be used in combination to address it.
b. Search methods sometimes constrain the energy functions
that can be used, and vice versa. Give an example of a
scoring function and a search method that can NOT be used
together, and describe why they are incompatible.
Problem V
Modeling of simple reactions
Consider the reversible reaction A ⇌ A*,
in which the forward rate constant is k1 and the reverse
rate constant is k-1.
a. On the rate balance plot below, draw lines indicating
the forward reaction and the reverse reaction (and label
them accordingly). Indicate the steady state points on the
graph. Is this system bistable or monostable?
[Rate-balance plot: Rate (y-axis) versus [A*]/([A]+[A*]) (x-axis)]
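One way to check a hand-drawn plot for part (a) is to compute the two rate curves numerically. The sketch below assumes the reaction is the interconversion of A and A* with first-order mass-action kinetics and writes x for the fraction [A*]/([A]+[A*]); the function names and the example rate constants are hypothetical.

```python
def forward_rate(x, k1):
    """Forward rate A -> A*, proportional to the fraction still in form A."""
    return k1 * (1.0 - x)


def reverse_rate(x, k_minus1):
    """Reverse rate A* -> A, proportional to the fraction in form A*."""
    return k_minus1 * x


def steady_state(k1, k_minus1):
    """Fraction x at which the forward and reverse rates are equal."""
    return k1 / (k1 + k_minus1)


# Example with assumed constants k1 = 1 and k-1 = 3.
x_ss = steady_state(1.0, 3.0)
```

Under these assumptions both curves are straight lines, one falling and one rising with x, so they can cross at only one point.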
b. Consider simple linear feedback; that is, A* feeds
back and catalyzes the conversion of A to A*. On the
rate-balance plot below, draw a curve representing this
feedback.
[Rate-balance plot: Rate (y-axis)]
c. Now draw a rate balance plot that includes results from
both simple linear feedback and the reverse reaction. For
simplicity, assume the forward reaction without feedback is
negligible. Indicate the two equilibrium states. Are they
both stable – that is, can simple linear positive feedback
as illustrated here generate a bistable system? Explain
why in a sentence or two.
[Rate-balance plot: Rate (y-axis) versus [A*]/([A]+[A*]) (x-axis)]
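A sketch for part (c) can be checked in the same way. It assumes the feedback-driven forward rate is k_fb * x * (1 - x), where x is the fraction [A*]/([A]+[A*]), that the reverse rate is k-1 * x, and that the uncatalyzed forward reaction is negligible as stated. The name k_fb and the helper functions are hypothetical. A steady state is treated as stable when the net rate has a negative slope there, so that small perturbations decay.

```python
def net_rate(x, k_fb, k_minus1):
    """Net rate of change of x: feedback-driven forward minus reverse."""
    return k_fb * x * (1.0 - x) - k_minus1 * x


def net_rate_slope(x, k_fb, k_minus1):
    """Derivative of net_rate with respect to x."""
    return k_fb * (1.0 - 2.0 * x) - k_minus1


def classify_steady_states(k_fb, k_minus1):
    """Return (x, is_stable) pairs for steady states with 0 <= x <= 1."""
    states = [0.0]
    if k_fb > k_minus1:
        # Second crossing of the forward and reverse curves.
        states.append(1.0 - k_minus1 / k_fb)
    return [(x, net_rate_slope(x, k_fb, k_minus1) < 0) for x in states]


# Example with assumed constants k_fb = 4 and k-1 = 1.
example = classify_steady_states(4.0, 1.0)
```

With these assumed constants the off state at x = 0 is classified as unstable and the state at x = 0.75 as stable, which is one way to test an answer about whether both steady states are stable.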