Module 2: T-COFFEE & Module 8: Horizontal Gene Transfer

T-COFFEE Multiple Alignments
of Orthologous Sequences
WebLogo
Horizontal Gene Transfer
(Phylogenetic Trees)
Overview
• T-COFFEE
– Tree-based Consistency Objective Function
for alignment Evaluation
• Focuses on orthologous gene sequences
• Used to generate multiple sequence alignments
• WebLogo
• Constructed from multiple sequence alignment
• Phylogenetic Trees
• Used to determine if your gene is derived from
horizontal gene transfer
“Click”
Enter ortholog sequences
into query box –
Where do I get these?
RECALL: What are orthologs?
• Homologs
– Orthologs
• Genes that diverged through speciation (the appearance of a new species)
– Typically the same function in different organisms
– Paralogs
• Genes duplicated within a species
– Perform slightly different tasks in cell
» Can develop new capabilities
» Can become pseudogene if functionality lost but sequence similarity retained
Insert Figure 8-41 from Microbiology – An Evolving Science
© 2009 W.W. Norton & Company, Inc.
Where do I find orthologs?
Scroll
down
Under Homolog Selection,
choose “Paralogs/Orthologs”
from drop-down menu
Scroll down to table containing list of orthologs
Add the top 5 orthologs to Gene Cart
Notice orthologous
genes are from
different organisms
The genes are ranked
by ascending E-values
Select the genes by clicking these boxes
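The E-value ranking can be sketched in Python. This is a minimal illustration only: the hit list, Gene OIDs, and E-values below are invented placeholders, not values from the ortholog table.

```python
# Hypothetical ortholog hits: (gene_oid, organism, e_value).
# All OIDs and E-values are invented for illustration only.
hits = [
    ("1000000003", "Rhodopirellula baltica", 3e-40),
    ("1000000001", "Blastopirellula marina", 1e-80),
    ("1000000006", "Escherichia coli K12", 2e-05),
    ("1000000002", "Planctomyces maris", 5e-62),
    ("1000000005", "Verrucomicrobium spinosum", 4e-18),
    ("1000000004", "Gemmata obscuriglobus", 7e-33),
    ("1000000007", "Clostridium perfringens", 9e-03),
]


def top_orthologs(hits, n=5):
    """Return the n hits with the smallest E-values (best matches first)."""
    return sorted(hits, key=lambda hit: hit[2])[:n]


top5 = top_orthologs(hits)
for oid, organism, e_value in top5:
    print(oid, organism, e_value)
```

A smaller E-value means a match is less likely to have arisen by chance, so ascending order puts the best candidates at the top of the list.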
Scroll down to bottom of page
“Click”
Only 5 genes were selected, why are
6 genes shown in the Gene Cart?
One of the genes shown is your
ASSIGNED gene
(the one you are annotating)
Generate amino acid sequences for
orthologs in FASTA format
Scroll down to “Export Genes”
Select “FASTA
Amino Acid
format”
“Click”
Amino Acid sequences
in FASTA format for all
6 genes will appear
Scroll down
Scroll down
Your assigned gene is
located at the bottom
of this list
(inspect Gene OID number)
Copy / paste all 5 ortholog sequences
into your notebook for this module
EXCLUDE your gene, which
should already be in your notebook
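Separating the 5 orthologs from your assigned gene can be sketched as follows. This is a minimal sketch under stated assumptions: the Gene OID is assumed to be the first token of each FASTA header, and all OIDs and sequences shown are placeholders.

```python
def parse_fasta(text):
    """Split multi-FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def gene_oid(header):
    """Assumption: the Gene OID is the first whitespace-delimited token."""
    return header.split()[0]


# Hypothetical export; OIDs and sequences are placeholders.
exported = """>2000000001 hypothetical protein [Blastopirellula marina]
MKTAYIAKQR
QISFVKSHFS
>2000000002 hypothetical protein [Planctomyces maris]
MKTAYLAKQR
>2000000099 assigned gene [Planctomyces limnophilus]
MKSAYIAKQR
"""
ASSIGNED_OID = "2000000099"  # hypothetical OID of the gene you are annotating

orthologs = [
    (header, seq)
    for header, seq in parse_fasta(exported)
    if gene_oid(header) != ASSIGNED_OID
]
print(len(orthologs), "ortholog records kept")
```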
Recording results in your notebook
Add heading and box
The amino acid sequences in FASTA
format for the top 5 orthologs
Return to T-COFFEE database
STEP 1: Copy / paste the
amino acid sequence in
FASTA format for your
assigned gene into the
query box for T-COFFEE
T-COFFEE database entries
STEP 2: Copy / paste the
amino acid sequences in
FASTA format for the top 5
orthologs into the same
query box as your gene
Separate individual
sequences by a
hard return
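STEP 1 and STEP 2 amount to building one block of FASTA text, assigned gene first, with each record starting on its own line. A minimal sketch, assuming placeholder headers and sequences (the helper name to_fasta is hypothetical):

```python
def to_fasta(records, width=60):
    """Format (header, sequence) pairs as FASTA text, one record per block."""
    blocks = []
    for header, sequence in records:
        wrapped = [sequence[i:i + width] for i in range(0, len(sequence), width)]
        blocks.append("\n".join([">" + header] + wrapped))
    # A newline (hard return) separates each record from the next.
    return "\n".join(blocks) + "\n"


# Placeholder records; real headers and sequences come from your notebook.
assigned = [("2000000099 assigned_gene", "MKSAYIAKQR")]
top5 = [
    ("2000000001 ortholog_1", "MKTAYIAKQR"),
    ("2000000002 ortholog_2", "MKTAYLAKQR"),
    ("2000000003 ortholog_3", "MRTAYIAKQR"),
    ("2000000004 ortholog_4", "MKTAFIAKQR"),
    ("2000000005 ortholog_5", "MKTAYIAKHR"),
]

query = to_fasta(assigned + top5)
print(query)
```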
“Click”
Wait a few moments . . .
T-COFFEE Results
Select “Start JalView” to
examine the multiple
sequence alignment of
the ortholog sequences
Alignment inspection using JalView
Select “Percentage Identity”
under “Colour” menu
Reminder:
Light Blue = Low Frequency
Dark Blue = High Frequency
Compare to consensus sequence
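The idea behind the colouring can be sketched as a per-column calculation: find the most common residue in each column and the percentage of sequences that share it. This is a simplified illustration with a toy alignment; JalView's exact thresholds and gap handling may differ.

```python
from collections import Counter


def column_identity(alignment):
    """For each column, return (consensus_residue, percent_agreeing)."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    result = []
    for i in range(length):
        column = [row[i] for row in alignment]
        residues = [c for c in column if c != "-"]  # ignore gaps when voting
        if not residues:
            result.append(("-", 0.0))
            continue
        residue, count = Counter(residues).most_common(1)[0]
        # Percentage is taken over all sequences, gaps included.
        result.append((residue, 100.0 * count / len(column)))
    return result


# Toy alignment (placeholder sequences, not real orthologs).
alignment = ["MKT-A", "MKTLA", "MRT-A", "MKSLA"]
identity = column_identity(alignment)
for position, (residue, percent) in enumerate(identity, start=1):
    print(position, residue, percent)
```

Columns with a high percentage correspond to the darker blue positions; columns with a low percentage correspond to the lighter ones.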
Return to T-COFFEE Results
Copy / paste
this alignment
into your lab
notebook
Recording results in your notebook
Identify
organism in
alignment by
Gene OID
T-COFFEE complete
On to WebLogo
“Click”
“Right Click” and open
in IE tab (not Firefox)
Copy/paste multiple
sequence alignment
Scroll down
1- Select “amino acid”
as sequence type
2- Select box for
multiline logo
“Click”
WebLogo Results
Zoom in
In IE, save
picture as .png
file for upload
to notebook
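The height of each stack in a sequence logo reflects how conserved that position is, measured in bits. A minimal sketch of that calculation for amino acids, using a toy alignment; it omits the small-sample correction that logo tools may also apply, so the values are illustrative only.

```python
import math
from collections import Counter


def information_content(alignment, alphabet_size=20):
    """Return per-column information content in bits (no small-sample correction)."""
    max_bits = math.log2(alphabet_size)  # about 4.32 bits for 20 amino acids
    bits = []
    for i in range(len(alignment[0])):
        residues = [row[i] for row in alignment if row[i] != "-"]
        if not residues:
            bits.append(0.0)
            continue
        total = len(residues)
        entropy = -sum(
            (n / total) * math.log2(n / total)
            for n in Counter(residues).values()
        )
        bits.append(max_bits - entropy)
    return bits


# Toy alignment (placeholder sequences).
alignment = ["MKT", "MKS", "MRT", "MKT"]
bits = information_content(alignment)
print([round(b, 3) for b in bits])
```

A fully conserved column reaches the maximum height; a column with mixed residues is shorter.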
Recording results in your notebook
WHAT ARE OUR GOALS?
1. Build a phylogenetic tree
2. Determine if assigned genes are
derived from horizontal gene transfer
Phylogenetic
tree of Bacteria
showing
established &
candidate phyla
(organismal
phylogeny)
Domain
Phylum
Class
Order
Family
Genus
Species
Insert Figure 1 from Handelsman (2004)
Microbiol. Mol. Biol. Rev. 68: 669-685.
Three bacterial phyla closely related to
Planctomycetes by 23S rRNA analysis
(organismal phylogeny)
Insert Figure 4A from Pilhofer et al. (2008)
Characterization and Evolution of Cell Division and Cell Wall Synthesis
Genes in the Bacterial Phyla Verrucomicrobia, Lentisphaerae, Chlamydiae,
and Planctomycetes and Phylogenetic Comparison with rRNA Genes.
J Bacteriology 190: 3192-3202.
16S rRNA gene supports the
monophyletic grouping Planctomycetales
(organismal phylogeny by rDNA analysis)
Closest phylogenetic relatives
of P. limnophilus (same family)
http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=1
How do we build a phylogenetic tree?
• include P. limnophilus gene (first module)
• include the top 5 orthologs (second module)
• include genes from organisms closely related to P. limnophilus (i.e., same family)
• include genes from organisms less closely related to P. limnophilus (i.e., from phyla Verrucomicrobia, Chlamydiae, and Lentisphaerae)
• include genes from organisms that are distantly related to P. limnophilus
Recall: We want to include genes from organisms more closely
related to P. limnophilus AND genes from organisms that are
less closely related to P. limnophilus.
So…depending on the organisms in your top 5 orthologs,
there are 2 paths you can take:
Select top 5 orthologs
PATH 1: If top 5 are closely related to P. limnophilus, choose 5-10 less closely related organisms
PATH 2: If top 5 are less closely related to P. limnophilus, choose 5-10 more closely related organisms
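The path decision can be sketched as a simple majority rule. This is a minimal sketch, assuming you have looked up the family of each top-5 organism yourself; the function name and the family labels in the example (including "Planctomycetaceae" for P. limnophilus) are assumptions for illustration.

```python
def choose_path(ortholog_families, reference_family):
    """Pick PATH 1 if most top orthologs share the reference family, else PATH 2."""
    same = sum(1 for family in ortholog_families if family == reference_family)
    if same > len(ortholog_families) / 2:
        return "PATH 1: choose 5-10 less closely related organisms"
    return "PATH 2: choose 5-10 more closely related organisms"


# Assumed family labels for illustration; verify yours in the NCBI Taxonomy Browser.
REFERENCE_FAMILY = "Planctomycetaceae"

mostly_close = ["Planctomycetaceae"] * 4 + ["Bacillaceae"]
mostly_distant = [
    "Bacillaceae", "Clostridiaceae", "Bacillaceae",
    "Planctomycetaceae", "Clostridiaceae",
]

print(choose_path(mostly_close, REFERENCE_FAMILY))
print(choose_path(mostly_distant, REFERENCE_FAMILY))
```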
Building a phylogenetic tree
• EXAMPLE: Organisms closely related to P. limnophilus (i.e., same family)
http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=1
Building a phylogenetic tree
• EXAMPLE: Organisms less closely related to P. limnophilus (i.e., from phyla Verrucomicrobia, Chlamydiae, and Lentisphaerae)
Insert Figure 4A from Pilhofer et al. (2008)
Characterization and Evolution of Cell Division and Cell Wall Synthesis
Genes in the Bacterial Phyla Verrucomicrobia, Lentisphaerae, Chlamydiae,
and Planctomycetes and Phylogenetic Comparison with rRNA Genes.
J Bacteriology 190: 3192-3202.
Inspect your top 5 orthologs:
Which path?
Example: PATH #1 – Most are in same family as P. limnophilus,
so choose 5-10 sequences from less closely related organisms
Where do I find sequences?
Scroll
down
Under Homolog Selection,
choose “Paralogs/Orthologs”
from drop-down menu
Scroll through the ortholog list and select some genes from
less closely related as well as some distantly related organisms
Once 5-10 orthologs
are selected, add them
to your gene cart
Generate amino acid sequences for
orthologs in FASTA format
Scroll down to “Export Genes”
Select “FASTA
Amino Acid
format”
“Click”
Amino Acid sequences in FASTA format
Remember: Your
assigned gene is at the
bottom of the list
Scroll down
Recording results in your notebook
Create another box in your lab
notebook, and copy/paste
ONLY the 5-10 NEW
ortholog FASTA sequences
(i.e., exclude those already
in first & second module)
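Keeping only the NEW orthologs can be sketched as a filter on Gene OID. This is a minimal sketch; the OIDs and sequences are placeholders and the function name is hypothetical.

```python
def new_records(cart_records, already_recorded_oids):
    """Return cart records whose Gene OID is not already in the notebook."""
    seen = set(already_recorded_oids)
    fresh = []
    for oid, sequence in cart_records:
        if oid not in seen:
            fresh.append((oid, sequence))
            seen.add(oid)  # also drops duplicates within the cart itself
    return fresh


# Placeholder OIDs: the assigned gene plus the top 5 orthologs already recorded.
recorded = ["2000000099", "2000000001", "2000000002",
            "2000000003", "2000000004", "2000000005"]

# Placeholder gene cart export: (gene_oid, sequence).
cart = [
    ("2000000099", "MKSAYIAKQR"),
    ("2000000001", "MKTAYIAKQR"),
    ("2000000011", "MKTGYIAKQR"),
    ("2000000012", "MRTGYLAKQR"),
    ("2000000011", "MKTGYIAKQR"),
]

fresh = new_records(cart, recorded)
print([oid for oid, _ in fresh])
```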
Recording results in your notebook
What if your top 5 orthologs are
distantly related to P. limnophilus?
Example: PATH #2 – Most are
not in the same phylum as
P. limnophilus, so choose 5-10
sequences from more closely
related organisms
Scroll through the ortholog list and select some genes from
closely related as well as other distantly related organisms
Once 5-10 orthologs
are selected, add them
to your gene cart
Copy / paste
FASTA format
protein sequences
into notebook
Use Phylogeny.fr site to create
a phylogenetic tree
“Click”
Creating a phylogenetic tree
Select
“A la Carte”
from menu
1- Select “T-Coffee” for
multiple alignment
2- Leave other settings
as default
Scroll down
Scroll down
“Click”
Your P. limnophilus gene
Your top 5 orthologs
5-10 new orthologs
Copy/paste sequences
in query box.
Scroll down & select
“submit”
Results of phylogenetic analysis
Download and save as .png for
upload to notebook
Recording results in your notebook
How do I interpret the tree results?
Possible scenarios resulting from
construction of a phylogenetic tree
Scenario 1 (tree leaves: P. limnophilus, Blastopirellula, Carboxydothermus, Bacillus):
No HGT since P. limnophilus and Blastopirellula are in the same family and are clustered together (i.e., gene phylogeny matches organismal phylogeny).
Scenario 2 (tree leaves: P. limnophilus, Carboxydothermus, P. maris, Pirellula):
Possible HGT since P. limnophilus and Carboxydothermus are very distantly related yet clustered together (i.e., gene phylogeny does NOT match organismal phylogeny).
Scenario 3 (tree leaves: Bacillus, Clostridium, Carboxydothermus, P. limnophilus, P. maris, Blastopirellula; branch labels: "Clustered", "Not clustered"):
Maybe HGT, but unsure because there is also an unresolved or multifurcating branch.
Interpreting your phylogenetic tree
• If your Planctomyces limnophilus gene is clustered with that from an organism in the P. limnophilus family → probably not horizontal gene transfer
• If your Planctomyces limnophilus gene is clustered with that from an organism that is NOT in the P. limnophilus family → may be horizontal gene transfer
• If your Planctomyces limnophilus gene is clustered with more than one organism in the tree (multifurcating branch) → unresolved phylogeny
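The three rules above can be sketched as a small function over a tree written as nested tuples. This is a simplified sketch, not a replacement for inspecting the tree: the topologies and the same-family set below are illustrative assumptions, and "clustered with" is taken to mean the sister branch of your gene.

```python
def leaves(node):
    """Return all leaf names under a node (a leaf is a plain string)."""
    if isinstance(node, str):
        return [node]
    found = []
    for child in node:
        found.extend(leaves(child))
    return found


def parent_clade(node, target):
    """Return the internal node that has target as a direct child, or None."""
    if isinstance(node, str):
        return None
    if target in node:
        return node
    for child in node:
        hit = parent_clade(child, target)
        if hit is not None:
            return hit
    return None


def interpret(tree, target, same_family):
    """Apply the three interpretation rules to the branch holding target."""
    clade = parent_clade(tree, target)
    if clade is None:
        raise ValueError("target not found in tree")
    if len(clade) > 2:
        return "unresolved phylogeny"  # multifurcating branch
    sister = next(child for child in clade if child != target)
    if any(name in same_family for name in leaves(sister)):
        return "probably not horizontal gene transfer"
    return "may be horizontal gene transfer"


# Assumed same-family organisms and illustrative topologies.
SAME_FAMILY = {"P. maris", "Blastopirellula", "Pirellula"}
tree1 = (("P. limnophilus", "Blastopirellula"), ("Carboxydothermus", "Bacillus"))
tree2 = (("P. limnophilus", "Carboxydothermus"), ("P. maris", "Pirellula"))
tree3 = (("Bacillus", "Clostridium"),
         ("Carboxydothermus", "P. limnophilus", "P. maris", "Blastopirellula"))

for tree in (tree1, tree2, tree3):
    print(interpret(tree, "P. limnophilus", SAME_FAMILY))
```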
In the example below, is the gene derived by HGT? Why or why not?
Planctomyces limnophilus
Blastopirellula marina
Planctomyces maris
Pedosphaera parvula (Ellin514)
Lentisphaera araneosa
Verrucomicrobium spinosum
Sorangium cellulosum
Escherichia coli K12
Rhodopirellula baltica
Thiobacillus denitrificans
Moorella thermoacetica
Brucella canis
Hydrogenivirga sp.
Clostridium perfringens
Gemmata obscuriglobus
Recording results in your notebook
In the “Interpretation” box of your lab notebook, answer:
• Is there evidence of horizontal gene transfer?
• What organisms does your gene cluster with? The same family? Or the three more closely related phyla (Verrucomicrobia, Chlamydiae, Lentisphaerae)?
• Do the gene and genome GC content differ by more than 5%?
• Do the neighborhoods for the top 5 orthologs look similar to or different from that of your gene in P. limnophilus?
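The GC content check can be sketched as follows. This is a minimal sketch: "5%" is assumed to mean 5 percentage points, and the DNA string and genome GC value are placeholders, not real P. limnophilus data.

```python
def gc_percent(dna):
    """Return the percentage of G and C among the A/C/G/T bases in dna."""
    counted = [base for base in dna.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G, or T bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


def gc_differs(gene_gc, genome_gc, threshold=5.0):
    """True if gene and genome GC content differ by more than threshold points."""
    return abs(gene_gc - genome_gc) > threshold


gene_gc = gc_percent("ATGCGCGCAT")  # placeholder gene sequence
genome_gc = 53.0                    # hypothetical genome GC value

print(gene_gc, gc_differs(gene_gc, genome_gc))
```

A large difference between gene and genome GC content is one more line of evidence to weigh alongside the tree, not proof of horizontal gene transfer on its own.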