Boot Camp
January 2014
Exploring the Structure of Fluorescent Proteins
Based on queries performed at www.rcsb.org in Dec. 2013, PDB entries that closely match
each of the Fluorescent Proteins (FPs) are:
1. mCherry – PDB ID 2h5q
2. mOrange – PDB ID 2h5o
3. mCitrine – PDB ID 3dq7
4. mCerulean – PDB ID 2wso
5. msfGFP – PDB ID 2b3p
In the following section all instructions are provided using the PDB entry 1gfl. You can
substitute this with the PDB entry that is the closest match to the FP assigned to your
group.
1. Download the coordinate and structure factor files from the PDB.
Open the web page www.rcsb.org, type in the PDB ID of interest (e.g. 1gfl) in the top search
box and click on search. This should open the structure summary page for the PDB entry
1gfl.
From the top right corner of the page, download the coordinate and structure factor files
to your local computer (either as text files, or as compressed files that you then uncompress).
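The download in this step can also be scripted. The sketch below only builds the file addresses; it does not fetch anything. The host name and file-name patterns are assumptions (they are not given in this handout), so check them against the download links on the structure summary page before relying on them.

```python
# Closest-match PDB entries, copied from the list at the top of this handout.
CLOSEST_PDB_ENTRY = {
    "mCherry": "2h5q",
    "mOrange": "2h5o",
    "mCitrine": "3dq7",
    "mCerulean": "2wso",
    "msfGFP": "2b3p",
}

# Assumption: this host and the "<id>.pdb" / "<id>-sf.cif" patterns are not
# stated in the handout; verify them on the RCSB PDB website.
DOWNLOAD_BASE = "https://files.rcsb.org/download"


def entry_for(protein, default="1gfl"):
    """Return the closest-match PDB ID for an FP, or the worked example 1gfl."""
    return CLOSEST_PDB_ENTRY.get(protein, default)


def download_urls(pdb_id):
    """Return (coordinate_url, structure_factor_url) for a 4-character PDB ID."""
    pdb_id = pdb_id.strip().lower()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError("expected a 4-character PDB ID, got %r" % pdb_id)
    return (f"{DOWNLOAD_BASE}/{pdb_id}.pdb", f"{DOWNLOAD_BASE}/{pdb_id}-sf.cif")
```

For example, `download_urls(entry_for("mCherry"))` gives the two addresses for entry 2h5q.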
2. Compute the electron density map
For this we will use the sf-tool server at RCSB PDB (http://sf-tool.rcsb.org/). Load the
coordinate and structure factor files as follows:
Now select the option for checking model (coordinates) against the structure factor file as
shown below and click on the Run button.
Once the calculation is complete the report page will show statistics calculated by the tool
and the electron density map.
From the mmCIF file link you can download the map itself and a residue-by-residue report
on how well the coordinates of each residue match the map (measured by a parameter called
the real-space R factor). Residues/ligands with poor matches are summarized in the
TABLE.
Download the 2Fo-Fc map from above and save it on your computer for visualization using
Chimera.
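To make the real-space R factor mentioned in this step more concrete, here is a minimal sketch of one commonly used definition: the sum of absolute differences between observed and calculated density, divided by the sum of their absolute sums, taken over the grid points around a residue. This definition is an assumption here; the handout does not say exactly how the sf-tool server computes its value, and the density lists are toy inputs.

```python
def real_space_r(rho_obs, rho_calc):
    """Real-space R factor over matching grid points (lower is a better fit).

    Assumed definition: sum(|obs - calc|) / sum(|obs + calc|).
    """
    if len(rho_obs) != len(rho_calc):
        raise ValueError("density lists must have the same length")
    numerator = sum(abs(o - c) for o, c in zip(rho_obs, rho_calc))
    denominator = sum(abs(o + c) for o, c in zip(rho_obs, rho_calc))
    if denominator == 0:
        raise ValueError("no density in the selected region")
    return numerator / denominator
```

A value of 0 means the model density reproduces the map exactly at the sampled points; larger values indicate a poorer match.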
3. Visualize the coordinates and corresponding electron density maps
Load the coordinates of the PDB entry (1gfl) in Chimera using Menu File… Fetch by ID…
then type the PDB ID in the box and click on Fetch. Make sure that the radio button next to
the database PDB is selected.
Once the file is loaded, hide the ribbons and show the coordinates in the all atom view by
clicking on Menu Action… Ribbons… Hide and Action… Atoms/Bonds… Show.
Load the electron density map by clicking on Menu File… Open… map file name. When
the map is loaded into the structure display window, a new Volume Viewer window opens.
In the Volume Viewer window, move the vertical marker in the histogram of data values to
select a suitable contour level. The ideal contour level is between 1 and 1.5 sigma. Because
Chimera does not normalize the maps, you have to determine the contour level for each map
visually or by moving the slider to approximate the Level value listed here:
PDB entry   1.0 sigma level   1.5 sigma level   Chromophore residue #
1gfl        0.28893           0.43339           S65, Y66, G67
2b3p        0.43079           0.64618           (CRO)66
2h5o        0.44597           0.66895           (CRO)66
2h5q        0.37432           0.56147           (CH6)66
2wso        0.46722           0.70083           (CRF)66
3dq7        0.39649           0.59473           (CR2)66
Change the Contour step to 1 and style to mesh. You may choose to change the color of
the map by clicking on the Color box and selecting a color of your choice.
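The Level values listed above can be used to estimate a level for any contour between 1 and 1.5 sigma. The sketch below assumes the level scales linearly with sigma (that is, the map mean is close to zero). This assumption is not stated in the handout, but the listed 1.5 sigma levels are 1.5 times the 1.0 sigma levels to within rounding.

```python
# 1.0 sigma Level values copied from the list in this handout.
ONE_SIGMA_LEVEL = {
    "1gfl": 0.28893,
    "2b3p": 0.43079,
    "2h5o": 0.44597,
    "2h5q": 0.37432,
    "2wso": 0.46722,
    "3dq7": 0.39649,
}


def contour_level(pdb_id, n_sigma):
    """Estimate the Volume Viewer Level for a contour at n_sigma.

    Assumption: the level is proportional to sigma (map mean near zero).
    """
    return ONE_SIGMA_LEVEL[pdb_id.lower()] * n_sigma
```

For example, `contour_level("1gfl", 1.2)` gives an intermediate level to type into the Level box instead of moving the slider.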
Focus on the chromophore in chain A of 1gfl to see how well the coordinate model fits the
electron density map. Click on Menu Favorites… Sequence… Chain A… Show to launch a
new window with the sequence of residues in chain A. Use Shift-drag to select residues
65-67 (representing the chromophore).
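The menu steps in this section have rough equivalents on the Chimera command line. The sketch below only assembles the command text so it can be pasted in or saved as a command file. It assumes a fresh session in which the structure becomes model #0 and the map becomes model #1, and the map file name in the usage note is a hypothetical placeholder. Check the command spellings against the Chimera documentation for your version.

```python
def chimera_view_commands(pdb_id, map_path, level, chain="A", first=65, last=67):
    """Return Chimera commands mirroring the menu steps in this section.

    Assumes a fresh session: structure is model #0, map is model #1.
    """
    return [
        f"open {pdb_id}",                                # fetch the PDB entry
        "~ribbon",                                       # hide ribbons
        "display",                                       # show all atoms
        f"open {map_path}",                              # load the 2Fo-Fc map
        f"volume #1 style mesh step 1 level {level}",    # mesh contour at the chosen level
        f"select #0:{first}-{last}.{chain}",             # chromophore residues
        "focus sel",                                     # zoom in on the selection
    ]
```

For example, `"\n".join(chimera_view_commands("1gfl", "1gfl_2fofc.map", 0.28893))` produces a short script for the 1.0 sigma view of the chromophore in chain A.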
Q: Do you think that the chromophore in chain A of this structure agrees well with the
electron density map? What do you think about the agreement of map and model of
residues neighboring the chromophore?
Focus on another region closer to the surface of the protein structure you are exploring.
Q: What do you think about the coordinate model – electron density map fit for the
residues you explore?
4. Visualize and compare the structures of all of the above FPs
Load the structure of 1gfl (using the fetch option as described above).
Now one by one, load the PDB entries that closely match the Fluorescent protein assigned
to the various teams as follows: Menu File… Fetch by ID… and type the PDB ID (e.g. 2h5o)
in the box.
Superpose the 2 structures by clicking on Menu Tools… Structure Comparison…
Matchmaker. This brings up the structure alignment window:
On the left side of the new window, under Reference Structure, highlight 1gfl by clicking it
once, then select the other PDB ID (e.g. 2h5o) in the right hand section (structure to match).
Now press Apply or OK.
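The fetch and Matchmaker steps can also be written as Chimera commands. This sketch builds the command list for superposing several entries onto the reference. It assumes models are numbered in the order they are opened, starting at #0 in a fresh session, and that the Matchmaker defaults are acceptable.

```python
def superpose_commands(reference, others):
    """Return Chimera commands that open each entry and match it to the reference.

    Assumes a fresh session where the reference becomes model #0.
    """
    commands = [f"open {reference}"]
    for model_number, pdb_id in enumerate(others, start=1):
        commands.append(f"open {pdb_id}")
        # mmaker <reference model> <model to match>
        commands.append(f"mmaker #0 #{model_number}")
    return commands
```

Calling `superpose_commands("1gfl", ["2h5q", "2h5o", "3dq7", "2wso", "2b3p"])` lists the commands for all six structures used in this exercise.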
Repeat the fetch and Matchmaker steps for each of the remaining PDB entries, then view the
6 superposed structures.
Q: How well does the overall beta-barrel in these structures overlap?
Q: How do the chromophores in these structures overlap in this superposition?
View the structure-based sequence alignment of these structures by clicking on Menu
Tools… Structure Comparison… Match -> Align. A new window will open called Create
Alignment From Superposition. The superposed chains should already be selected; if not,
click on them to select them, then click OK to compute the sequence alignment.
Q: Based on the sequence alignment can you identify an absolutely conserved Arg and
an absolutely conserved Glu, both located in the core of the beta barrel? (Hint: Select
each residue in the sequence alignment by click-shift-drag and display its side chain using
Menu Action… Atoms/Bonds… Show.)
Q: Comment on the relationship of these conserved residues and the chromophores in
these proteins.
5. Explore the sequence and structural neighbors of the PDB entry (that you are
exploring) and find the most distant neighbor.
Open the structure summary page for the PDB entry that you are exploring by typing the
PDB ID in the top search box on the RCSB PDB website (www.rcsb.org).
Click on the Sequence Similarity tab.
Click open any of the clusters to see what types of proteins are closely related to the GFP
protein by sequence.
Q: What type of proteins do you see here? Name any three from the 30% cluster.
Note that there is a dramatic difference in the number of chains in the 30% cluster
compared to the 40% cluster. All proteins in the 40% cluster are clearly related in
sequence and structure. Those in the 30% cluster may or may not have the same structure
and function. (To learn more about this read the article by Sander and Schneider, 1991,
Proteins: Structure, Function and Genetics 9:56-68).
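The 30% and 40% thresholds refer to sequence identity. As a minimal sketch of what that number means, the function below computes percent identity from two already-aligned sequences, counting only columns where neither sequence has a gap. The choice of denominator is an assumption; the clustering used by the RCSB PDB may define identity differently.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a.upper() == b.upper():
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared
```

With the toy inputs `"ACDE"` and `"ACDF"`, three of four columns match, so the result is 75.0.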
For structure-based comparisons in the PDB, the top ranking protein in the 40% sequence
cluster is considered to be a representative of the PDB entry that you are working with (e.g.
1gfl). Click on the 3D Similarity tab to see the structures in the PDB that are structurally
similar to the PDB entry that you are exploring.
Click on the structure comparison results to see the PDB entries that are structurally
related to the PDB entry you are exploring (e.g. 1gfl). Click on the title of the rmsd column
to sort by the rmsd values of matched structures. The structure with the highest rmsd value
is probably the most distant relative.
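If you copy the comparison results into a list, the sorting step can be sketched as below. The record layout (`entry` and `rmsd` keys) is hypothetical and the values in the usage note are placeholders, not real results. Note that rmsd alone is only a rough guide to relatedness, since it also depends on how many residues were matched.

```python
def sort_by_rmsd(hits, descending=True):
    """Return hits ordered by rmsd (largest first by default)."""
    return sorted(hits, key=lambda hit: hit["rmsd"], reverse=descending)


def most_distant(hits):
    """Return the hit with the highest rmsd, the probable most distant relative."""
    if not hits:
        raise ValueError("no hits to compare")
    return max(hits, key=lambda hit: hit["rmsd"])
```

For example, with placeholder records `[{"entry": "xxxx.A", "rmsd": 1.2}, {"entry": "yyyy.B", "rmsd": 3.4}]`, `most_distant` returns the `yyyy.B` record.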
Q: What is the most distant relative of the PDB entry you are exploring? List the PDB
ID/Chain ID and name of the protein.
6. What other protein(s) may be related to your protein of interest? Draw a
phylogenetic tree to describe its/their relationship with the fluorescent
proteins being studied here.
Go to the PDB entry of any one protein that you identified as a distant structural relative of
the protein of your interest (above) and get the FASTA sequence of the relevant chain ID.
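A FASTA file for a PDB entry may contain more than one record, so it helps to split it before picking the chain you need. The sketch below is a generic FASTA reader; it makes no assumption about the header layout beyond the leading `>`, and the sequences in the usage note are toy strings.

```python
def parse_fasta(text):
    """Split FASTA text into a list of (header, sequence) tuples."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequence may be wrapped over several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records
```

For example, `parse_fasta(">toy_1\nMSKG\nEELF\n")` returns `[("toy_1", "MSKGEELF")]`; you can then choose the record whose header names the relevant chain.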
Run a PSI-Blast on this sequence (at
http://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastp&PAGE_TYPE=BlastSearch&LINK_LOC=blasthome)
to identify homologous proteins/domains. Paste the FASTA sequence
here and select the options shown below:
Remember to change the Database to UniProtKB/SwissProt and select the algorithm
PSI-BLAST. In the algorithm parameters, get as many sequences as possible by setting
Max target sequences to 20000.
Click on the Blast button to start the search.
In the Results page, look for proteins that are not orthologs (same protein as your query
but from different organisms) but are related to the query sequence. Repeat the PSI-Blast
(up to 3 times) to see if other proteins show up in the results.
After the 3rd iteration look for sequences that have a sufficiently low E value but low
sequence identity.
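If you export or copy the hit list, the screening described here can be sketched as a simple filter. The handout does not give numeric cutoffs, so the default thresholds below are assumptions to adjust, and the record layout (`name`, `evalue`, `identity`) is hypothetical.

```python
def remote_homolog_candidates(hits, max_evalue=1e-3, max_identity=30.0):
    """Keep hits with a low E value but low percent identity to the query.

    The default cutoffs are illustrative assumptions, not values from the handout.
    """
    return [
        hit
        for hit in hits
        if hit["evalue"] <= max_evalue and hit["identity"] <= max_identity
    ]
```

Hits that pass both tests are statistically significant matches that are nonetheless distant in sequence, which is the kind of relative this step is looking for.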
Q: What protein(s)/domains did you find using this search?
Save the FASTA format sequence of this protein/domain.
To make a phylogenetic tree go to the website:
http://www.cbrg.ethz.ch/services/PhylogeneticTree and paste the sequences of all the
fluorescent proteins that you and the other groups have been working with, together with
the sequences of the farthest structural/sequence relatives (identified in 5 and 6 above).
For each sequence include the tags <E><SEQ> and </SEQ></E> at the beginning and end of
the sequences respectively. Remember to list tags for all the sequences (in the order that
you upload the sequences) and select both Distance and Parsimony modes of calculation.
All other options on the page can be retained at the default value. Click on the submit
button to generate the phylogeny tree as follows:
Q: Comment on the evolutionary relationship of these proteins.
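The sequence tagging described in step 6 is easy to get wrong by hand, so here is a minimal sketch that wraps each sequence in the `<E><SEQ>` and `</SEQ></E>` tags. Removing whitespace from each sequence and putting one tagged sequence per line are assumptions; the handout only specifies the tags themselves.

```python
def tag_sequences(sequences):
    """Wrap each sequence in <E><SEQ> ... </SEQ></E>, one tagged sequence per line."""
    tagged = []
    for sequence in sequences:
        cleaned = "".join(sequence.split())  # drop spaces and line breaks
        tagged.append(f"<E><SEQ>{cleaned}</SEQ></E>")
    return "\n".join(tagged)
```

For example, `tag_sequences(["MSKG EELF", "MVSK"])` returns two tagged lines ready to paste into the input box (toy sequences shown).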