Figure Legends for Supplemental Information Figures for the manuscript 2003-1213814.

Figure SI-1. BAC clone selection process. Seed clones were initially selected either randomly from the BAC library or from contigs in initial FPC maps (constructed at the BCCA-GSC). BAC end sequences (produced at TIGR) and FPC fingerprint maps were generated concurrently with BAC skim sequencing and eBAC assembly. Data from all sources fed back into later rounds of selecting walking and gap-filling clones to produce a tiling path.

Figure SI-2. Coverage of finished BACs based on depth of BAC bait coverage. Finished sequences for twelve BACs, each on a different chromosome, were compared with enriched-BAC assemblies for varying depths of BAC-read coverage (dashed lines). On average (bold line), as the depth of BAC coverage (sum of trimmed read length divided by the fingerprint-estimated BAC size) approaches 2, the contigs from the enriched-BAC assembly approach 100 percent coverage of the finished sequence.

Figure SI-3. Partitioning of fingerprint and YAC map contigs across the rat sequence assembly. Contigs were anchored to the assembly using BAC end and in silico mapping methods. Contigs that localized to multiple regions on the assembly are joined by colored curves, anchored at the middle of the contig regions. The contigs themselves are suppressed for clarity (see panel c for an example of a detailed region). The density of segmental duplications, calculated as a windowed sum over adjacent 1 Mb regions, is shown by blue bars oriented towards the center of the circle. The scale for the segmental duplication is logarithmic, with decades at 0, 1, 10 and 100 kb marked by concentric grey circles.
a. Red curves join segmented BAC fingerprint contig regions that have been visually inspected and determined to overlap in the fingerprint map. Grey curves join contig segment pairs where at least one member of the pair is located on chrUn, which contains unanchored parts of the assembly.
b. YAC contigs are anchored to the assembly by way of hybridizations to BACs with sequence coordinates. Light grey lines link regions that (i) are anchored by hybridization to a single BAC that is associated with <80% of the contig YACs or (ii) are anchored by <20% of the contig BACs. Either condition leads to spurious contig segmentation on the assembly that is unlikely to reflect actual inconsistencies between the YAC map and the sequence assembly. Red lines link the remaining region pairs, for which segmentation evidence is robust.
c. The fingerprint map contig structure is shown in detail for the 80-200 Mb region of chromosome 2. The bottom contig track (green) represents contigs in the manually edited map. Sequence information was used to merge contigs and increase the contiguity of the map. The upper contig track (blue) shows the contig layout in the merged map. Contig 3019 (contig 2040 in the merged map) maps to two disjoint regions by sequence. This split is not corroborated by the contig structure in the fingerprint map. The histogram below the contig tracks shows the relative number of fingerprint map clones with sequence coordinate annotations in windows of 250 kb.

Figure SI-4. Coverage of BAC Clones Selected for Sequencing from the Fingerprint Map. Summary of the sequence clone selection process during the 44 rounds of selection, shown by the total size of selected clones, the coverage provided by the clones, and the number of coverage gaps. Solid glyphs correspond to the sequence clone selections; hollow glyphs correspond to simulated random sets. The total size and coverage plots show statistics compiled from selections made from the fingerprint map only, representing 16,299 clones. The gap count was computed using all selected clones. The random sets contained the same number of simulated selections as their corresponding experimental equivalents.

Figure SI-5. Correlation of SINE locations in rat and mouse.
Correlation of the lineage-specific SINE densities in 14,243 100 kb windows in rat and the orthologous regions in mouse. For each of the SINE families B1, B2 and ID, we constructed consensus sequences for multiple rat-specific, mouse-specific, and ancestral subfamilies to optimize the distinction between lineage-specific and shared repeats. To further minimize the number of elements falsely labeled as lineage-specific, which would exaggerate the correlation, the maximum divergence of a copy from a consensus sequence was set below 9%, i.e., below the neutral substitution level since the speciation in both species. (SINEs observed at identical sites in rat and mouse, and therefore predating the speciation, were actually 12% or more diverged from the available consensus sequences, due to CpG content and the still incomplete resolution of subfamily structure.) For the same windows, the densities of lineage-specific LTR elements or L1 copies in rat and mouse showed no correlation.

Figure SI-6. Dot-plots showing positions of aligning segments in rat-rat, rat-mouse, and rat-human comparisons, for a 10 Mb region of rat Chromosome 10 and the orthologous regions in mouse and human. All known interspersed repeats and tandem repeats were masked prior to the alignment using BLASTZ (default parameters). The extensive off-diagonal lines depict the prevalent medium-length duplications. A putative gene cluster in the center of this region, which in humans contains the genes TREM5 and CMRF35, was excluded from the calculations of the frequency of medium-length duplications.

Figure SI-7. Sequence conservation at AG acceptor splice sites in aligned introns.

Figure SI-8. Sequence conservation at GT donor splice sites in aligned introns.
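
Illustrative note on the windowed comparison in Figure SI-5. The sketch below shows one way per-window element counts could be computed and compared between two species. It is not the authors' pipeline: the function names, the toy coordinates, and the choice of the Pearson coefficient are assumptions made for illustration, and the mapping of rat windows to orthologous mouse regions is assumed to have been done already.

```python
from math import sqrt

WINDOW_BP = 100_000  # window size stated in the Figure SI-5 legend


def window_counts(starts, region_length, window=WINDOW_BP):
    """Count elements per fixed-size window from 0-based start coordinates."""
    n_windows = (region_length + window - 1) // window  # ceiling division
    counts = [0] * n_windows
    for s in starts:
        if 0 <= s < region_length:  # ignore coordinates outside the region
            counts[s // window] += 1
    return counts


def pearson(xs, ys):
    """Pearson correlation coefficient (an assumed choice of statistic)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with >= 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Toy data, invented for illustration only.
rat_starts = [5_000, 20_000, 150_000, 160_000, 170_000, 250_000]
rat = window_counts(rat_starts, 300_000)  # three 100 kb windows
mouse = [4, 6, 2]  # hypothetical counts in the orthologous mouse regions
r = pearson(rat, mouse)
print(rat, r)
```

In this toy case the mouse counts are exactly twice the rat counts, so the coefficient is 1. Real data would involve thousands of windows and an explicit orthology map between the two assemblies.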