SUPPORTING INFORMATION

Supporting Text. Experimental Methods and URL used

Supporting Tables. Tables S1-S22

Table S1. Sequencing data generated for chickpea genotypes.
a) Sequence data generated for C. arietinum ICC4958. i) Sequence generated using 454/Roche pyrosequencing platform (WGS, whole genome shotgun; MP, mate-pair). ii) Paired-end sequence data generated using Illumina/Solexa platform.
b) Sequence data generated for other chickpea genotypes (454/Roche pyrosequencing platform)
Table S2. Statistics of draft assembly
Table S3. Anchoring of scaffolds to linkage groups
Table S4. Estimation of chickpea genome length based on read alignment
Table S5. Estimated heterozygosity in ICC4958 draft genome
Table S6. Transcriptome coverage in the assembled chickpea genome
Table S7. Repeat content in the assembled chickpea draft genome
Table S8. Comparative analysis of microsatellite sequences in chickpea draft genome with those in other legumes
Table S9. Statistics of protein coding gene prediction
Table S10. Assessment of gene prediction using CEGMA pipeline
Table S11. Experimental evidence for the predicted protein-coding genes
Table S12. Statistics of protein coding genes from different plant species
Table S13. Functional annotation of the predicted protein-coding genes
Table S14. Features of lineage-specific genes in chickpea
Table S15. Transcription factor/regulator families in the chickpea draft genome
Table S16. Comparison of R-gene family in chickpea draft genome with other sequenced plant genomes
Table S17. Comparison of nodulation-associated gene families in chickpea draft genome with other sequenced plant genomes
Table S18. Comparison of number of genes associated with carotenoid and flavonoid metabolism in chickpea draft genome with other sequenced plant genomes
Table S19. Non-coding RNA genes in the chickpea draft genome
Table S20. Summary of RNA-seq data generated from different tissues/treatments to study gene expression
Table S21. Summary of tissue-preferential and stress-responsive gene expression results based on RNA-seq data
Table S22. GO terms enriched in the chickpea genes expressed in tissue-specific manner

Supporting Figures. Figures S1-S16

Figure S1. Fragment distribution of the de novo assembly of ICC4958. Number of fragments covering different percentiles of the de novo assembly plotted against different length percentiles.

Figure S2. Read depth at assembled bases of chickpea ICC4958 (based on 454/Roche read alignment). Frequency of 454/Roche reads at the assembled bases (x-axis) plotted against the number of bases (y-axis). The Poisson-shaped distribution showing a peak at 15 denotes the average 15X throughput of the assembled reads. The x-axis and y-axis in the figure have been limited to 1001 and 1.0 × 10^8, respectively.

Figure S3. GC content distribution in the genome sequence of chickpea as compared to other plant species. The x-axis represents GC content percentage and the y-axis represents the fraction of bins (bin size of 500 bp in sliding non-overlapping window).

Figure S4. Top 20 GO terms represented in chickpea geneset. GO terms were assigned using Blast2Go pipeline.

Figure S5. Top 20 PFAM domains represented in the chickpea geneset. PF00069 Protein kinase domain; PF07714 Protein tyrosine kinase; PF00067 Cytochrome P450; PF00249 Myb-like DNA-binding domain; PF00076 RNA recognition motif; PF12854 PPR repeat; PF12678 RING-H2 zinc finger; PF00847 AP2 domain; PF00501 AMP-binding enzyme; PF00010 Helix-loop-helix DNA-binding domain; PF03171 2OG-Fe(II) oxygenase superfamily; PF00106 short chain dehydrogenase; PF12697 Alpha/beta hydrolase family; PF00083 Sugar (and other) transporter; PF00072 Response regulator receiver domain; PF00201 UDP-glucoronosyl and UDP-glucosyl transferase; PF00400 WD domain, G-beta repeat; PF03401 Tripartite tricarboxylate transporter family receptor; PF00005 ABC transporter; PF00270 DEAD/DEAH box helicase.

Figure S6. Strategy for the identification of lineage-specific genes in chickpea genome. The genes that showed significant hits with non-Fabaceae plant species are in dotted boxes. “Yes” represents a significant hit, and “No” represents no significant hit in BLAST searches as per the given criteria (E ≤ 1e-5). The genes identified as candidate chickpea-specific (CS) and legume-specific (LS) are highlighted in gray boxes.

Figure S7. Top 10 GO terms represented in the genes included in chickpea-specific gene families. The distribution of the top ten GO terms is shown for the genes included in gene families unique to chickpea (CS), conserved in legumes (chickpea, soybean, M. truncatula and pigeonpea; LS) and conserved in the five plant species (chickpea, soybean, M. truncatula, pigeonpea and grapevine; all). The p-value for the enrichment of these GO terms in gene families unique to chickpea was at least 0.001. Asterisks indicate the GO terms that were enriched with a p-value of at least 1E-10.

Figure S8. Gene distribution in different transcription factor families in chickpea, other sequenced legumes and Arabidopsis.

Figure S9. Phylogenetic analysis of chickpea and Medicago genes belonging to CC-NBS-LRR (a) and Leghaemoglobin (b) families. Medicago, chickpea and soybean genes are shown in red, green and blue, respectively. Bootstrap values are mentioned next to the branches. Medicago and soybean show a clear expansion in these families. Chickpea genes form distinct clusters, suggesting diversification.

Figure S10. Ks distribution analysis of paralogous chickpea gene pairs to determine the genome duplication event. The number of paralog pairs within each Ks range (bin size of 0.05) is shown. The peak observed at 0.7 corresponds to the duplication event in legume genomes.

Figure S11. The whole genome dot-plot was generated between chickpea linkage groups (x-axis) and Medicago truncatula chromosome arms (y-axis). An asterisk before a chromosome number indicates reverse complement. Order and orientation of chromosomes are rearranged so that the synteny observed is easier to visualize. Syntenic blocks are formed by red or blue dots representing best hits across any two chromosomes in the same or opposite direction, respectively. A total of 12406 hits were observed, out of which 9673 hits were in syntenic blocks. The syntenic blocks are shown in green circles.

Figure S12. Microsynteny of chickpea (Ca) LG 5 with M. truncatula (Mt) chromosome 3. Chickpea gene models are mapped on both the pseudomolecules to show gene order. The upper panel shows overall synteny with local rearrangements. The microsynteny presented in the lower panel shows conserved gene order between the two genomes.

Figure S13. The whole genome dot-plot was generated between chickpea linkage groups (x-axis) and Glycine max chromosome arms (y-axis). An asterisk before a chromosome number indicates reverse complement. Order and orientation of chromosomes are rearranged so that the synteny observed is easier to visualize. Syntenic blocks are formed by red or blue dots representing best hits across any two chromosomes in the same or opposite direction, respectively. A total of 10387 hits were observed, out of which 4842 hits were in syntenic blocks. Duplicated syntenic blocks within green circles refer to recent whole genome duplication in the Glycine max genome.

Figure S14. Scatter plot showing distribution of Ka/Ks (ω) with respect to Ks between gene pairs present in the collinear blocks of chickpea and Medicago. The gene pairs are distributed in four clusters according to their Ks values. Average Ka/Ks values of the clusters decrease with Ks. Clusters with average Ks ≥ 1.5 are attributed to pan-eudicot palaeopolyploidization, indicating genes in the other cluster with higher ω are under purifying selection.

Figure S15. Ka/Ks distribution analysis of chickpea gene pairs. Distribution of the ratio of non-synonymous vs. synonymous substitution rates within the chickpea gene families of size 2-6. The number of gene pairs within the Ka/Ks range 0.2 to 2.0 (bin size of 0.1) is shown.

Figure S16. Distribution of various GOSlim categories (level 2) in chickpea gene pairs with Ka/Ks > 1.

Supporting Data 1-13. SNP and SSR marker resources

Supporting Data 1. SSR primer: ICC4958
Supporting Data 2. Polymorphic SSR primer: PI489777 vs. JG62
Supporting Data 3. Polymorphic SSR primer: PI489777 vs. ICCV2
Supporting Data 4. Polymorphic SSR primer: ICC4958 vs. PI489777
Supporting Data 5. Polymorphic SSR primer: ICC4958 vs. ICCV2
Supporting Data 6. Polymorphic SSR primer: ICC4958 vs. JG62
Supporting Data 7. Polymorphic SSR primer: ICCV2 vs. JG62
Supporting Data 8. SNP primer: ICC4958 vs. PI489777
Supporting Data 9. SNP primer: ICC4958 vs. ICCV2
Supporting Data 10. SNP primer: ICC4958 vs. JG62
Supporting Data 11. SNP primer: PI489777 vs. ICCV
Supporting Data 12. SNP primer: PI489777 vs. JG62
Supporting Data 13. SNP primer: ICCV2 vs. JG62