tpj12173-sup-0001-FigS1-S16

A draft genome sequence of the pulse crop chickpea (Cicer arietinum L.)
SUPPORTING FIGURES
Figure S1: Fragment distribution of the de novo assembly of ICC4958. Number of fragments
covering different percentiles of the de novo assembly plotted against different length percentiles.
Figure S2: Read depth at assembled bases of chickpea ICC4958 (based on 454/Roche read
alignment). Frequency of 454/Roche reads at the assembled bases (x-axis) plotted against the
number of bases (y-axis). The Poisson-shaped distribution showing a peak at 15 denotes the
average 15X throughput of the assembled reads. The x-axis and y-axis in the figure have been
limited to 1001 and 1.0 × 10^8, respectively.
Figure S3. GC content distribution in the genome sequence of chickpea as
compared to other plant species. The x-axis represents GC content percentage and
y-axis represents fraction of bins (bin size of 500 bp in sliding non-overlapping
window).
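The binning described for Figure S3 can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, a trailing partial window is dropped, and ambiguous bases such as N are assumed to count as non-GC.

```python
from collections import Counter


def gc_percent_per_bin(sequence, bin_size=500):
    """Return the GC percentage of each full, non-overlapping window.

    Assumption: any base other than G or C (including N) counts as non-GC,
    and a trailing window shorter than bin_size is ignored.
    """
    seq = sequence.upper()
    values = []
    for start in range(0, len(seq) - bin_size + 1, bin_size):
        window = seq[start:start + bin_size]
        gc = window.count("G") + window.count("C")
        values.append(100.0 * gc / bin_size)
    return values


def fraction_of_bins(gc_values):
    """Map each rounded GC percentage to the fraction of bins having it."""
    if not gc_values:
        return {}
    counts = Counter(round(v) for v in gc_values)
    total = len(gc_values)
    return {gc: n / total for gc, n in sorted(counts.items())}
```

Plotting the keys of `fraction_of_bins` on the x-axis against its values on the y-axis gives a curve of the kind the caption describes.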
Figure S4. Top 20 GO terms represented in the chickpea geneset. GO terms
were assigned using the Blast2GO pipeline.
Figure S5. Top 20 PFAM domains represented in the chickpea geneset.
PF00069 Protein kinase domain
PF07714 Protein tyrosine kinase
PF00067 Cytochrome P450
PF00249 Myb-like DNA-binding domain
PF00076 RNA recognition motif
PF12854 PPR repeat
PF12678 RING-H2 zinc finger
PF00847 AP2 domain
PF00501 AMP-binding enzyme
PF00010 Helix-loop-helix DNA-binding domain
PF03171 2OG-Fe(II) oxygenase superfamily
PF00106 short chain dehydrogenase
PF12697 Alpha/beta hydrolase family
PF00083 Sugar (and other) transporter
PF00072 Response regulator receiver domain
PF00201 UDP-glucoronosyl and UDP-glucosyl transferase
PF00400 WD domain, G-beta repeat
PF03401 Tripartite tricarboxylate transporter family receptor
PF00005 ABC transporter
PF00270 DEAD/DEAH box helicase
Figure S6. Strategy for the identification of lineage-specific genes in chickpea
genome. The genes that showed significant hits with non-Fabaceae plant species are
in dotted boxes. “Yes” represents a significant hit, and “No” represents no significant
hit in BLAST searches as per the given criteria (E ≤ 1e-5). The genes identified as
candidate chickpea-specific (CS) and legume-specific (LS) are highlighted in gray
boxes.
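The decision logic behind Figure S6 might look roughly like the sketch below. Only the E-value cutoff (1e-5) and the CS/LS labels come from the caption. The flowchart itself is not reproduced here, so the order of checks, the `best_evalues` input layout, the group names `legume` and `non_legume`, and the `conserved` label are all assumptions made for illustration.

```python
E_CUTOFF = 1e-5  # significance threshold stated in the caption


def classify_gene(best_evalues, cutoff=E_CUTOFF):
    """Classify one chickpea gene from its best BLAST E-values.

    best_evalues is a hypothetical dict with optional keys "legume" and
    "non_legume", each holding the best E-value against that group of
    species, or None when there was no hit at all.
    """
    def has_hit(group):
        evalue = best_evalues.get(group)
        return evalue is not None and evalue <= cutoff

    if has_hit("non_legume"):
        return "conserved"  # significant hit outside Fabaceae
    if has_hit("legume"):
        return "LS"         # candidate legume-specific
    return "CS"             # candidate chickpea-specific
```

For example, `classify_gene({"legume": 1e-20})` returns `"LS"`, while a gene with no significant hit in either group is labelled `"CS"`.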
Figure S7. Top 10 GO terms represented in the genes included in chickpea-specific
gene families. The distribution of the top ten GO terms in the genes included
in gene families unique to chickpea (CS), conserved in legumes (chickpea, soybean,
M. truncatula and pigeonpea; LS) and conserved in the five plant species (chickpea,
soybean, M. truncatula, pigeonpea and grapevine; all) is shown. The p-value
for the enrichment of these GO terms in gene families unique to chickpea was 0.001
or lower. Asterisks indicate GO terms enriched with a p-value of 1E-10 or lower.
Figure S8. Gene distribution in different transcription factor families in
chickpea, other sequenced legumes and Arabidopsis.
Figure S9. Phylogenetic analysis of chickpea and Medicago genes belonging to
CC-NBS-LRR (a) and Leghaemoglobin (b) families. Medicago, chickpea and
soybean genes are shown in red, green and blue, respectively. Bootstrap values are
shown next to the branches. Medicago and soybean show a clear expansion in
these families. Chickpea genes form distinct clusters suggesting diversification.
Figure S10. Ks distribution analysis of paralogous chickpea gene pairs to
determine the genome duplication event. The number of paralog pairs within each Ks
range (bin size of 0.05) is shown. The peak observed at 0.7 corresponds to the
duplication event in legume genomes.
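The Ks binning behind Figure S10 can be illustrated with a short sketch. It assumes Ks values for paralogous pairs have already been estimated elsewhere; the function names are hypothetical, and only the bin size of 0.05 is taken from the caption.

```python
import math
from collections import Counter


def ks_histogram(ks_values, bin_size=0.05):
    """Count gene pairs per Ks bin, keyed by the lower edge of each bin."""
    counts = Counter()
    for ks in ks_values:
        # The small epsilon guards against floating-point error at bin edges.
        index = int(math.floor(ks / bin_size + 1e-9))
        counts[index] += 1
    return {round(i * bin_size, 10): n for i, n in sorted(counts.items())}


def peak_bin(histogram):
    """Return the lower edge of the bin holding the most gene pairs."""
    return max(histogram, key=histogram.get)
```

Applied to a real set of paralog Ks values, `peak_bin` would report the bin corresponding to the peak the caption places at 0.7.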
Figure S11. The whole genome dot-plot was generated between chickpea linkage
groups (x-axis) and Medicago truncatula chromosome arms (y-axis). An asterisk
before a chromosome number indicates reverse complement. Order and orientation of
chromosomes are rearranged so that the synteny observed is easier to visualize.
Syntenic blocks are formed by red or blue dots representing best hits across any two
chromosomes in the same or opposite direction, respectively. A total of 12406 hits
were observed, out of which 9673 hits were in syntenic blocks. The syntenic blocks
are shown in green circles.
Figure S12. Microsynteny of chickpea (Ca) LG 5 with M. truncatula (Mt)
chromosome 3. Chickpea gene models are mapped on both the pseudomolecules to
show gene order. The upper panel shows overall synteny with local rearrangements.
The microsynteny presented in the lower panel shows conserved gene order between
the two genomes.
Figure S13. The whole genome dot-plot was generated between chickpea linkage
groups (x-axis) and Glycine max chromosome arms (y-axis). An asterisk before a
chromosome number indicates reverse complement. Order and orientation of
chromosomes are rearranged so that the synteny observed is easier to visualize.
Syntenic blocks are formed by red or blue dots representing best hits across any two
chromosomes in the same or opposite direction, respectively. A total of 10387 hits
were observed, out of which 4842 hits were in syntenic blocks. Duplicated syntenic
blocks within green circles refer to recent whole genome duplication in the Glycine
max genome.
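The hit counts reported for Figures S11 and S13 can be turned into syntenic fractions for a quick comparison. The helper name below is hypothetical; the four counts are the ones stated in the two captions.

```python
def syntenic_percent(hits_in_blocks, total_hits):
    """Percentage of best hits that fall inside syntenic blocks."""
    if total_hits <= 0:
        raise ValueError("total_hits must be positive")
    return 100.0 * hits_in_blocks / total_hits


# Counts taken from the captions of Figures S11 and S13.
medicago = syntenic_percent(9673, 12406)
soybean = syntenic_percent(4842, 10387)

print(f"Chickpea vs. Medicago truncatula: {medicago:.1f}% of hits in syntenic blocks")
print(f"Chickpea vs. Glycine max: {soybean:.1f}% of hits in syntenic blocks")
```

This works out to about 78.0% for Medicago truncatula and about 46.6% for Glycine max.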
Figure S14. Scatter plot showing distribution of Ka/Ks (ω) with respect to Ks
between gene pairs present in the collinear blocks of chickpea and Medicago. The
gene pairs are distributed in four clusters according to their Ks values. Average Ka/Ks
values of the clusters decrease with increasing Ks. Clusters with average Ks ≥ 1.5 are attributed
to the pan-eudicot palaeopolyploidization, indicating that genes in the other cluster with higher
ω are under purifying selection.
Figure S15. Ka/Ks distribution analysis of chickpea gene pairs. Distribution of
the ratio of non-synonymous to synonymous substitution rates within the chickpea gene
families of size 2-6. The number of gene pairs within each Ka/Ks bin from 0.2 to 2.0 (bin
size of 0.1) is shown.
Figure S16. Distribution of various GOSlim categories (level 2) in chickpea gene
pairs with Ka/Ks >1.