IMG User Scenario October 21, 2010

IMG/M User Tasks List with Answers

A. Using Genome Browser and Metagenome Details

1. How many metagenomes are there in IMG/M?

Answer: currently 151, but the number changes with every update.

Explanation: start at http://img.jgi.doe.gov/m and look at the “IMG/M Genomes” table of general statistics. The number of metagenomes is in the category “Microbiomes”.

2. How many bins are there in the Acid Mine Drainage metagenome?

Answer: 5 (according to the method “tetra”).

Explanation: on the IMG/M top page, click on the number “22” corresponding to the count of metagenomes sequenced by the JGI. This brings up the list of metagenomes sequenced by the JGI, which includes “Acid Mine Drainage”. Clicking on its name takes you to the “Microbiome Details” page for the Acid Mine Drainage metagenome. At the bottom of the “Microbiome Information” panel there is a category “Bins (of scaffolds)”, which displays the binning method (“tetra”) and a list of 5 bins found by Tetra in this metagenome. Note that more than one binning method has been applied to some metagenomes; in that case you would see several “Method” headings and a list of bins found by each method.

3. To which phylum do the proteins in the Acid Mine Drainage metagenome have the highest number of hits?

Answer: Euryarchaeota (but the majority belongs to the category “Unassigned”).

Explanation: on the “Microbiome Details” page follow the link “Distribution of genes binned by BLAST percent identities”. This brings up a histogram of the distribution of best BLAST hits to genomes from different phyla. The highest number of hits is to representatives of Euryarchaeota; the Acid Mine Drainage community includes 2 Ferroplasmas and a Thermoplasmatales archaeon (see the bins on the previous page).
On the other hand, the majority of hits from the bacterial constituents of this community end up in the category “Unassigned”, since Leptospirillum spp. belong to the phylum Nitrospirae, the only sequenced representative of which (Thermodesulfovibrio yellowstonii) is too distant to give good hits.

4. How many scaffolds in the Acid Mine Drainage metagenome have GC content below 30%?

Answer: 2.

Explanation: go back to the previous page (“Microbiome Details”). At the bottom of the page is the “Scaffold Search” tool, which allows searching for scaffolds by name, length, GC content and read depth. Select the “GC (0.00-1.00)” filter, enter the values 0 and 0.3 and click “Go”.

5. Using Phylogenetic Profiler on the Microbiome Details page, find how many genes with COG hits are present in the Methylococcaceae bin of the Lake Washington (combined v2) metagenome, but not in the genome of Methylococcus capsulatus. Use the default Phylogenetic Profiler settings for similarity cutoffs.

Answer: 75.

Explanation: go to the Phylogenetic Profiler section on the Microbiome Details page of the Lake Washington (combined v2) metagenome and click on “Phylogenetic Profiler”. Select the Methylococcaceae bin as your query and Methylococcus capsulatus as your reference organism (“Without Homologs In”). You can use the browser “Find” function to get to them faster. Keep the default settings for similarity cutoffs and press “Go”. On the “Phylogenetic Profiler Results” page go to “Summary Statistics”.

B. Using Phylogenetic Distribution of Genes and Scaffold Cart

1. How many genes in the Sludge/US, Phrap Assembly metagenome have the best hit to Alphaproteobacteria with percent identity between 60 and 90%?

Answer: 1330.

Explanation: type “Sludge/US, Phrap” in the “Quick Genome Search” and click “Go”. Note that this is not a keyword search, so “Sludge US Phrap” won’t retrieve anything.
Click on the name of the metagenome on the “Genome Search Results” page, which takes you to the “Microbiome Details” page. Follow the link “Distribution of genes binned by BLAST percent identities” below the “Microbiome Information” panel. On the “Distribution of Best Blast Hits” histogram find Alphaproteobacteria and the column corresponding to hits with percent identity between 60 and 90%. If you want to compare cumulative counts (e.g., hits below 60% identity or hits above 30% identity), add the counts in the corresponding columns. The counts are linked to lists of genes with hits in a given phylum or class, so the hits are split by percent identity interval rather than into cumulative lists below or above a certain percent identity; this keeps the lists non-redundant.

2. How many genes in the Sludge US Phrap metagenome with the best hit to Alphaproteobacteria at 60-90% identity belong to the COG Functional Category “Amino acid transport and metabolism”?

Answer: 170.

Explanation: click on the count of genes with best hits to Alphaproteobacteria at 60-90% identity (1330). The list of results is sorted by gene_oid. Re-sort the list of genes by clicking on “COG Functional Cat.” above the table. The count of genes belonging to each COG Functional Category is shown next to its name. Note that the count of genes in the top right corner (1437) differs from the count in the previous table (1330), because proteins identified as fusions can belong to more than one COG and some COGs belong to multiple COG Functional Categories.

3. Which family of Betaproteobacteria has the highest number of best hits from the genes in the Sludge US Phrap metagenome (cumulative above 30% identity)?

Answer: Rhodocyclaceae.

Explanation: go back to the “Distribution of Best Blast Hits” histogram and click on “Betaproteobacteria”.
This brings up the same histogram of the distribution of best BLAST hits, only at the family level. Rhodocyclaceae have more best hits than the second-highest family, Comamonadaceae, in the 30-60% and 60-90% identity categories and slightly fewer hits in the >90% category. The sum of the 3 categories is highest for Rhodocyclaceae.

4. Which archaeal genome has the most hits with >90% identity from the metagenomes of human gut subject 7 and human gut subject 8?

Answer: Methanobrevibacter smithii.

Explanation: from the “Microbiome Details” page for each metagenome (human gut subject 7 and human gut subject 8) go to “Phylogenetic Distribution of Genes” and “Distribution by BLAST percent identities”. The second column in the histogram table displays the domain (A for Archaea, B for Bacteria); of the 2 archaeal phyla (Crenarchaeota and Euryarchaeota), Euryarchaeota have the most hits with >90% identity (1539 for human gut subject 7 and 1508 for human gut subject 8). Clicking on the phylum name “Euryarchaeota” displays the distribution of hits to the families of this phylum; the family Methanobacteriaceae has the most hits with >90% identity. Clicking on the name “Methanobacteriaceae” brings up the table with the species from this family. Methanobrevibacter smithii has the most hits with >90% identity from the metagenomes of both human gut subject 7 (1533 hits) and human gut subject 8 (1504).

5. What are the functions of genes in the region between 270 kb and 330 kb of isolate Methanobrevibacter smithii ATCC 35061 that are missing from Methanobrevibacter smithii in the human gut subject 7 metagenome? Are these genes also absent from Methanobrevibacter smithii in the human gut subject 8 metagenome?
Answer: ribosomal protein L15, a peptide/nickel ABC transporter, Ni-Fe hydrogenase, UDP-glucose dehydrogenase, sugar kinase, 2-oxoisovalerate:ferredoxin oxidoreductase, glutamyl-tRNA(Gln) amidotransferase and several hypothetical proteins. These genes are also absent from the human gut subject 8 metagenome.

Explanation: for the metagenome of human gut subject 7, go down the levels of the table displaying the distribution of BLAST hits to the family level (“Family Methanobacteriaceae”) and click on “Methanobrevibacter smithii”. In the “Protein Recruitment Plot” table click on “Normal” or “Larger” for “All Scaffolds”. This plot displays proteins from the metagenome with BLASTp hits to proteins in the selected genome, so the gaps on the genome that have no hits from the metagenome can be reviewed. The region between 280 and 330 kb is one of the largest gaps. Go back to the “Methanobrevibacter smithii” recruitment plot page. Go to “Reference Genome Context View”, then to “Methanobrevibacter smithii ATCC 35061” (bottom of the page), and select the range “240001-320000” on the scaffold “Methanobrevibacter smithii ATCC 35061:NC_009515” (which you could have identified from the protein recruitment plot). Mouse over the genes to see their functions. They include ribosomal protein L15, a peptide/nickel ABC transporter, Ni-Fe hydrogenase, UDP-glucose dehydrogenase, sugar kinase, 2-oxoisovalerate:ferredoxin oxidoreductase, glutamyl-tRNA(Gln) amidotransferase and several hypothetical proteins.

Go to the metagenome of human gut subject 8 and go down the levels of the “Distribution of BLAST hits by percent identity” table to Methanobrevibacter smithii. Go to “Protein Recruitment Plot” (“Normal” or “Larger” for “All Scaffolds”). You can change the resolution by selecting “View Range” 184376..366679. The region between 280 and 330 kb also has a gap in BLAST hits of the metagenome to the Methanobrevibacter smithii genome.
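
The gaps that the recruitment plot shows visually can be cross-checked with a few lines of code, provided you have the coordinates of the reference genes that received hits. The Python sketch below is only an illustration: the input format (a list of 1-based, inclusive start/end pairs), the function name uncovered_regions and the example coordinates are assumptions made here, not something IMG/M exports and not the actual coordinates for Methanobrevibacter smithii.

```python
def uncovered_regions(hits, genome_length, min_gap=1):
    """Return (start, end) stretches of the reference with no hits.

    hits: iterable of (start, end) pairs, 1-based and inclusive.
    min_gap: report only gaps at least this many bases long.
    """
    gaps = []
    cursor = 1  # first position not yet known to be covered
    for start, end in sorted(hits):
        if start - cursor >= min_gap:
            gaps.append((cursor, start - 1))
        cursor = max(cursor, end + 1)
    # Tail of the reference after the last hit.
    if genome_length - cursor + 1 >= min_gap:
        gaps.append((cursor, genome_length))
    return gaps


# Made-up coordinates on a 1000-base reference (illustration only).
example_hits = [(1, 100), (90, 250), (400, 500), (800, 1000)]
example_gaps = uncovered_regions(example_hits, genome_length=1000, min_gap=100)
```

Raising min_gap restricts the report to large gaps such as the one discussed above; overlapping hits are handled by advancing the cursor only forward.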
You can verify the absence of the proteins by going gene by gene through this fragment of the Methanobrevibacter smithii chromosome and running “IMG Genome BLAST” from the corresponding gene pages. Select “Human Gut Community Subject 7” and “Human Gut Community Subject 8” in the list of genomes. Set “Min. percent identity” to 90% to avoid seeing hits from organisms other than Methanobrevibacter smithii in the metagenomic datasets.

6. What is the range of GC content of contigs assigned to the bin “Accumulibacter” (binning method PhyloPythia) in the Sludge US Phrap metagenome? What is the range of read depths for this bin?

Answer: 58-69% GC, read depth 1.27-17.52.

Explanation: go to the “Microbiome Details” page of the Sludge US Phrap metagenome and find the list of bins in the “Microbiome Information” section. The “Accumulibacter” bin (binning method PhyloPythia) has 180 contigs; click on this count. Retrieve all the CDSs in this bin (4301; don’t forget to change your preferences to increase the size of the gene list to at least 5000!). Add all these genes to Gene Cart. Select all the genes and add the corresponding scaffolds to Scaffold Cart. Now your Scaffold Cart contains 180 contigs. Select all contigs (the button is below the table) and go to the Histogram section (even lower). Select GC content as an option and click on the “Show Histogram” button. The histogram displays the range of GC content of the contigs in the “Accumulibacter” bin. Go back to the previous page, select “Read Depth” as an option and click “Show Histogram”. The histogram of read depths will be retrieved. Both parameters can be used to estimate the quality of binning, since both the coverage and the GC content of the same microbial population are expected to fall in a relatively narrow range. The range of GC content for the “Accumulibacter” bin is similar to what we see in most bacterial genomes, but the read depth is quite variable. Outliers can be checked to eliminate binning and assembly errors.

C. Using Find Functions

1. Which Pfams describe carbohydrate-binding modules (CBM)?

Answer: pfam00553, pfam00686, pfam00734, pfam00942, pfam01607, pfam02013, pfam02018, pfam02839, pfam02922, pfam03370, pfam03422, pfam03423, pfam03424, pfam03425, pfam03426, pfam03427, pfam06204, pfam08305, pfam09212, pfam09478, pfam10633.

Explanation: go to the “Find Functions” tab. On the “Search Terms and Pathways” page select “Pfam” as a filter and search with the keyword “CBM”. 21 Pfams are retrieved. Note that if you have selected a subset of genomes, the search will be performed on the selected genomes only.

2. Are there any CBM-containing genes in human gut subject 7 and subject 8?

Answer: yes, 8 Pfams (pfam00553, pfam02018, pfam02839, pfam02922, pfam03422, pfam06204, pfam08305, pfam10633), 46 genes.

Explanation: go back to the “Search Terms and Pathways” page. Use the same keyword and filter (“CBM”, “Pfam”), but select 2 metagenomes, “Human Gut Community Subject 7” and “Human Gut Community Subject 8”, in the scroll-down menu below (hold the Ctrl key to select multiple genomes or metagenomes). Only 8 Pfams are retrieved now. Add up the gene counts to get the total. You can find out which families are present in which metagenome by adding these Pfams to the function cart and comparing the phylogenetic profiles of the two metagenomes.

3. Do the A. phosphatis bins in the Sludge/Australian, Phrap Assembly and Sludge/US, Jazz Assembly metagenomes have all COGs assigned to the “Histidine biosynthesis” pathway? Do they have a complete pathway?

Answer: no, but the only missing COG is an alternative implementation of histidinol phosphatase, so the pathway is likely to be present.

Explanation: go to the “Find Genomes” tab and on the “Genome Browser” page press the “Clear All” button. Find the “Sludge/Australian, Phrap Assembly” and “Sludge/US, Jazz Assembly” metagenomes, select them and save the selections.
Go to the “Find Functions” tab and click on the “COG” tab in this panel; this brings up a list of all COG Functional Categories and COG Pathways. Find “Histidine biosynthesis” under the “Amino acid transport and metabolism” Category ([E]) and click on it. This brings up the “COG Pathway Details” page listing all COGs classified to this pathway. Select all of them and click the “Add Selected to Function Cart” button. Go to the “Function Profile” option on the “Function Cart” page and select the “A. phosphatis” bins in the “Sludge/Australian, Phrap Assembly” and “Sludge/US, Jazz Assembly” metagenomes in the scroll-down menu (hold Ctrl to select multiple bins/genomes/metagenomes). Click on the “View Functions vs Genomes” button. The table displays counts of genes assigned to each COG; the only COG with no genes is an alternative histidinol phosphatase (COG1387).

4. Representatives of which phyla are likely to be present in the metagenome of human gut community subject 7 when COG0200 (Ribosomal protein L15) is used?

Answer: Firmicutes (Lactobacillus), Proteobacteria (Desulfovibrio), Actinobacteria (Bifidobacteria).

Explanation: go to the “Find Functions” tab and to “Phylogenetic Marker COGs”. Select “Human Gut Community Subject 7” from the list of metagenomes and click “Go”. Select “COG0200 Ribosomal protein L15” from the list of COGs and click “Go”. Change the gene selections if necessary (e.g., unselect the genes from multiple strains of the same species) and click the “Run Multalin” button at the bottom of the page. On the Multalin tree find the red entries corresponding to the genes from the metagenome and check the names of the organisms of their closest neighbors. You can find the names of the phyla to which they belong by clicking on the corresponding genes, which takes you to the “Gene Detail” page.
On this page click on the genome name in the “Genome” field of the “Gene Information” panel; this link brings up the corresponding “Organism Details” page, which displays the full lineage of the organism (“Lineage”). The Multalin tree is a hierarchical clustering tree rather than a phylogenetic tree. To calculate a phylogenetic tree using your method of choice (neighbor-joining, UPGMA, maximum likelihood, etc.), go to the previous page with the list of genes and click “Add Selected to Gene Cart”; from the Gene Cart you can export their amino acid sequences and use any of the available alignment and tree tools. Alternatively, you can use the alignment generated by Multalin, which is provided at the bottom of the page with the tree.

D. Using Find Genes

1. Which domains are associated with carbohydrate binding module family 6 (CBM_6, pfam03422) in the Soil microbial communities from Minnesota farm metagenome?

Answer: pfam00942 (CBM_3), pfam00041 (fn3), pfam08305 (NPCBM), pfam00801 (PKD).

Explanation: go to the “Gene Search” tab in “Find Genes”. On the “Gene Search” page use the keyword “pfam03422”, set the filter to “Pfam Domain Search (list)” and select “Soil: Diversa Silage” in the scroll-down menu below. This search retrieves a list of genes with this particular Pfam (or a combination of Pfams if several comma-separated Pfams are listed) and displays all other Pfams found in the same genes. 3 genes in the soil metagenome have CBM_6 in combination with other domains; two of the Pfams associated with CBM_6 are other CBMs.

E. Using Compare Genomes

1. Using Abundance Profile Search, find how many Pfams are at least twice as abundant in the metagenome of human gut community subject 7 as in human gut community subject 8 using frequency normalization. Which Pfam has the highest frequency in the human gut community subject 7 metagenome? Is it more abundant than in the human gut community subject 8 metagenome by frequency? By raw counts?
Answer: 557; pfam00005 (ABC_tran); yes, although not by much; no.

Explanation: first, make sure that you have these metagenomes selected: go to the “Find Genomes” tab and on the “Genome Browser” page click “Select All” and save the selections. Then go to the “Compare Genomes” tab and to “Abundance Profiles”. Follow the link “Abundance Profile Search”; on this page select “Pfam” as the functional classification, “frequency” as the normalization method and set “More Abundant Cutoff” to 2. Select “Human Gut Community Subject 7” as your query genome (“Find Functions In”) and “Human Gut Community Subject 8” as your reference genome (“More Abundant Than In”), then click “Go” at the bottom of the page. To find which Pfam has the highest frequency in Human Gut Community Subject 7, change “More Abundant Cutoff” on the previous page to 1. This brings up all 1457 Pfams found in the Human Gut Community Subject 7 and 8 metagenomes. Sort the table in order of decreasing counts in human gut community subject 7 by clicking on the header of the column “Human Gut Community Subject 7”. Pfam00005 (ABC_tran) has the highest frequency in the human gut community subject 7 metagenome; it has a slightly higher frequency in subject 7 than in subject 8 (21263 vs 20302), but fewer genes were assigned to this Pfam in subject 7 than in subject 8 (451 vs 542). The frequency takes into account not only the number of genes, but also the size of the metagenome.

2. Using Function Comparisons, find which COG is most overrepresented in the metagenome of human gut community subject 7 as compared to human gut community subject 8. Is this result statistically significant? Which COGs are significantly overrepresented in human gut community subject 7 as compared to human gut community subject 8? Use D-score as the statistic and gene count.

Answer: COG0629 (ssDNA-binding protein); not statistically significant.
There are no COGs in the human gut subject 7 metagenome that are significantly overrepresented as compared to the human gut subject 8 metagenome.

Explanation: go to the “Compare Genomes” tab, then to “Abundance Profiles” and follow the “Function Comparisons” link. Select “D-score” as “Output”, the “Human Gut Community Subject 7” metagenome as the query and “Human Gut Community Subject 8” as the reference genome. Select “COG” for the functional comparison, and “Gene count” and “Show only rows with at least one non-zero gene count” as your output options. IMG doesn’t have read depth data for these metagenomes, so the “Estimated gene copies” option is irrelevant. Sort the resulting table in order of decreasing D-score by clicking on the header of the “Human Gut 8 (R)” column. The higher the D-score, the more overrepresented the corresponding protein family is in the query metagenome; the lower the D-score, the more overrepresented it is in the reference metagenome.

The COG with the highest D-score is COG0629 (single-stranded DNA-binding protein); however, this result is not statistically significant, since the corresponding cells are not colored (cells corresponding to protein families with statistically significant overrepresentation are colored yellow or pink). Two criteria determine whether a result is treated as significant: 1) since an approximate statistical test is used, D-scores of protein families with fewer than 5 members in each metagenome are unreliable and are considered statistically insignificant, no matter how high or low the D-score is; and 2) the p-value of the protein family must satisfy the p-value cutoff, which is based on a false discovery rate of 0.05 and also depends on the number of hypotheses tested (the number of protein families being analyzed, such as COGs, Pfams, etc.; see the link to an explanation of the D-score on the query page).
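
To make the two criteria concrete, here is a minimal Python sketch. It rests on stated assumptions rather than on IMG's actual implementation: the text only says that the cutoff is based on a false discovery rate of 0.05 and on the number of hypotheses, so the Benjamini-Hochberg step-up procedure is used here as one common way to do that; “fewer than 5 members in each metagenome” is read literally as fewer than 5 in both; and the function name, the row layout and the family names are hypothetical.

```python
def significant_families(rows, fdr=0.05, min_members=5):
    """Return ids of families passing both criteria.

    rows: list of (family_id, query_count, reference_count, p_value).
    """
    m = len(rows)  # number of hypotheses tested
    ranked = sorted(rows, key=lambda row: row[3])
    # Criterion 2 (assumed Benjamini-Hochberg): find the largest rank i
    # whose p-value satisfies p_(i) <= fdr * i / m.
    last_passing = 0
    for i, row in enumerate(ranked, start=1):
        if row[3] <= fdr * i / m:
            last_passing = i
    passing = ranked[:last_passing]
    # Criterion 1: discard families that are too small in both metagenomes,
    # however extreme their score.
    return [
        family_id
        for family_id, query_count, ref_count, _ in passing
        if not (query_count < min_members and ref_count < min_members)
    ]


# Made-up families and p-values (illustration only).
example_rows = [
    ("famA", 50, 10, 0.0001),
    ("famB", 3, 2, 0.0002),
    ("famC", 40, 38, 0.04),
    ("famD", 20, 25, 0.5),
]
example_significant = significant_families(example_rows)
```

In this made-up example famB has a small p-value but is dropped for having too few members, and famC has a p-value below 0.05 yet fails the rank-dependent cutoff, which is why an uncorrected p-value alone does not decide significance.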
In the case of COG0629 the first criterion is satisfied, but the p-value is too high to be significant after false discovery rate correction. To find the statistically significant results, go back to the query page and select “Show only rows with significant hits” as your output option. The only COGs that differ significantly between the two metagenomes are COG0642, COG2972 and COG4753, but all of them are more abundant in human gut subject 8 than in human gut subject 7. Therefore no COGs are significantly overrepresented in human gut subject 7 as compared to human gut subject 8.

3. Using Function Category Comparisons, find which COG Pathways are overrepresented with p-value less than 1.0e-01 in the human gut subject 7 metagenome as compared to the human gut subject 8 metagenome. How could these results be interpreted with respect to the overrepresentation of individual COGs in the same metagenome?

Answer: “Aminoacyl-tRNA synthetases and alternate systems for amino acid activation” and “Basal replication machinery”. The COG Pathway “Basal replication machinery” includes COG0629, single-stranded DNA-binding protein, which was found to be overrepresented (although not statistically significantly) in the previous test (see the answer to question E2). Overrepresentation of this single family could skew the results of the Function Category Comparison.

Explanation: go to “Compare Genomes”, “Abundance Profiles”, “Function Category Comparisons”. Select “Human gut community subject 7” as your query metagenome and “Human gut community subject 8” as your reference metagenome, select COG Pathways as your “Function Category” and D-rank as the output option. In the results table click on the header of the column “Human Gut 8 (R)” to sort the table by this column in descending order. Two COG Pathways have positive D-rank (i.e.
they have higher abundance in the query genome) with p-value less than 1.0e-01: “Aminoacyl-tRNA synthetases and alternate systems for amino acid activation” and “Basal replication machinery”. If you click on the gene count corresponding to “Basal replication machinery” in the column “Human Gut 7 Gene Count (Q)” and scroll through the list, you will see multiple occurrences of “Single-stranded DNA-binding protein” corresponding to COG0629 (annotations of the human gut subject 7 and human gut subject 8 metagenomes are based on COGs). This COG was identified as the one with the highest D-score in the previous test of COGs most overrepresented in the human gut subject 7 metagenome as compared to human gut subject 8 (see the answer to the previous question). Although the D-rank test has been designed to avoid such situations, it is still possible that one highly overrepresented protein family skews the results of “Function Category Comparisons”, and this possibility should be taken into account.

4. Using Genome Clustering, find which of the 5 mouse gut community metagenomes are the closest according to their COG frequency distributions. According to their Pfam frequency distributions?

Answer: Mouse Gut Community ob1 and Mouse Gut Community lean3 by COG, and ob1 and ob2 by Pfam profiles.

Explanation: in the “Compare Genomes” tab go to “Genome Clustering”. Select the 5 mouse gut community metagenomes: “Mouse Gut Community lean1”, “Mouse Gut Community lean2”, “Mouse Gut Community lean3”, “Mouse Gut Community ob1” and “Mouse Gut Community ob2”. Select “COG” for the functional profile and “Hierarchical Clustering” for the clustering method. Repeat the same using “Pfam” for the functional profile.

5. Using Phylogenetic Distribution, find which phyla are underrepresented in the metagenomes of the 2 obese mice as compared to the 3 lean mice using “Gene count” and BLAST hits with identity above 60%. Which phyla are overrepresented?
Answer: Mouse Gut Community ob1 and Mouse Gut Community ob2 have fewer genes with best hits to Bacteroidetes and more hits to Firmicutes.

Explanation: in the “Compare Genomes” tab go to “Phylogenetic Distribution”, and then to “Metagenomes Phylogenetic Distribution”. Select “Percent Identity” 60+, “Gene count” and “Show percentages” as the Display option. These metagenomes are completely unassembled, so the “Estimated gene copies” option is irrelevant. Select the 5 mouse gut community metagenomes: “Mouse Gut Community lean1”, “Mouse Gut Community lean2”, “Mouse Gut Community lean3”, “Mouse Gut Community ob1” and “Mouse Gut Community ob2”. You will get a table with counts of genes in all 5 metagenomes with best hits to isolate genomes from various phyla/classes (similar to the Phylogenetic Distribution tool). You can filter the table by deselecting the phyla/classes with very low counts in all metagenomes and clicking the “Filter” button. The 2 obese mice have a lower percentage of genes with hits to Bacteroidetes than any lean mouse.

F. Using SNP BLAST and SNP VISTA

1. Using SNP BLAST and SNP VISTA, find whether there are any populations within the Leptospirillum sp. Group II bin of the Acid Mine Drainage metagenome.

Answer: no, there are no populations of Leptospirillum sp. Group II that could be distinguished through single nucleotide polymorphisms (however, there could be populations with large-scale genome rearrangements).

Explanation: go to the Microbiome Details page of the “Acid Mine Drainage” metagenome (using “Find Genomes”, “Quick Genome Search” or the list of metagenomes from the IMG/M Home page) and scroll down to the list of bins. Click on the scaffold count associated with the Leptospirillum sp. Group II bin. Click on the nucleotide range of any scaffold (preferably a larger one, since shorter scaffolds are more likely to be misclassified by the binning tools). Click on any gene on the graphical representation of this scaffold.
On the gene page, in the section “IMG Sequence Search”, click on “SNP BLAST”, which performs BLASTn of the assembled sequence of the scaffold against the database of sequence reads. You may use the sequence of one gene only or extend the range by adding sequence upstream and/or downstream; then click on “Run BLAST”. SNP BLAST produces simple text output, which can be analyzed to see whether there are any groups of reads differing from the consensus sequence of the scaffold. In the case of the Leptospirillum sp. Group II bin no such variation can be found (only single reads deviating from the consensus, likely due to sequencing artifacts), so there are no populations of this organism detectable by single nucleotide polymorphisms. You can view a graphical presentation of the SNP BLAST results by clicking on the “SNP VISTA” button (it may not work in some browsers).

2. Is there any evidence of recombination between populations within the Ferroplasma acidarmanus type I bin of the Acid Mine Drainage metagenome?

Answer: yes, there is evidence of recombination between populations of Ferroplasma acidarmanus.

Explanation: go to the Microbiome Details page of the Acid Mine Drainage metagenome and click on the count of scaffolds associated with the “Ferroplasma acidarmanus type I” bin. Click on the coordinate range for any scaffold and then on any gene on the graphical view of this scaffold. Go to the “IMG Sequence Search” section and click on the “SNP BLAST” link, then on the “Run BLAST” link. As shown in the screenshot above, there are two populations of Ferroplasma that can be distinguished by single nucleotide polymorphisms; one of them is represented by the consensus scaffold amd_scaffold_35, the other by amd_scaffold_1. However, some reads are intermediate between the two consensus sequences (e.g., XYG47968.g1, XYG53549.b1 and XYG68254.b1), indicating a possibility of recombination between the two populations.
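
The distinction drawn in section F between a group of reads that share a difference from the consensus and single deviating reads can be sketched in Python as follows. This is a simplified illustration and not how SNP BLAST or SNP VISTA work internally: it assumes ungapped read alignments given as (0-based start, sequence) pairs, whereas real BLASTn alignments can contain gaps, and the function names, the min_reads threshold and the toy sequences are all hypothetical.

```python
from collections import Counter


def variant_support(consensus, reads):
    """Count, per consensus position, the non-consensus bases seen in reads.

    reads: list of (start, sequence), ungapped, start is 0-based.
    """
    support = {}
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if pos < len(consensus) and base != consensus[pos]:
                support.setdefault(pos, Counter())[base] += 1
    return support


def recurrent_variants(consensus, reads, min_reads=2):
    """Keep positions where one variant base is seen in at least min_reads reads."""
    return {
        pos: dict(counts)
        for pos, counts in variant_support(consensus, reads).items()
        if max(counts.values()) >= min_reads
    }


# Toy data: two reads share a T at position 4, one read has a lone C at position 2.
toy_consensus = "ACGTACGTAC"
toy_reads = [(0, "ACGTACGT"), (2, "GTTCGTAC"), (1, "CGTTCGTA"), (0, "ACCTAC")]
toy_recurrent = recurrent_variants(toy_consensus, toy_reads, min_reads=2)
```

In the toy data the variant shared by two reads is kept, while the one seen in a single read is dropped, mirroring the reasoning that isolated deviations are more likely sequencing artifacts than evidence of a distinct population.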