Captions for Supplemental Tables

Supplemental Table 1. A table of all allelic variation in the immunoglobulin heavy chain (IgH) locus. We reconstructed the specific allele present in each rat line and assembled all alleles for all IgH genes in a single table. Column A identifies the gene segment or family name (using the IMGT nomenclature for rat immunoglobulin heavy chain gene families/segments). Column B provides the specific gene name. Column C indicates whether the allele information in that row corresponds to the Brown Norway rat reference genome sequence (BN) or to the SHR-A3 and SHR-B2 genome sequences that we assembled to the reference. Column D provides an allele count that designates the reference allele as allele 1 and identifies any additional alleles for each gene found in SHR-A3, SHR-B2, or both. Column E provides a cumulative tally of the new IgH alleles (303 in total). Column F provides the sequence of every allele, with variant bases shown in color.

Supplemental Table 2. This table provides a complete description of the functional state of each IgH gene in SHR-A3 and SHR-B2, along with the amino acids affected by sequence changes. Columns A and B are as in Supplemental Table 1. Columns C, D and E indicate, for BN, SHR-A3 and SHR-B2 respectively, whether the sequence for each IgH gene encodes a functional gene, an open reading frame or a null segment. Column F indicates whether the allele information in that row corresponds to the Brown Norway rat reference genome sequence (BN) or to the SHR-A3 and SHR-B2 genome sequences that we assembled to the reference. Column G provides a count of non-synonymous alleles that designates the reference allele as allele 1 and identifies any additional non-synonymous alleles for each gene found in SHR-A3, SHR-B2, or both. Column H provides a cumulative tally of the new IgH alleles that have amino acid sequence variation (128 in total). Column I indicates the alleles that are non-synonymous between SHR-A3 and SHR-B2 (98 in total). Column J provides the amino acid sequence encoded by each allele, with variant residues shown in color.

Supplemental Table 3. This table integrates allelic information with genomic position and includes a color-coded haplotype map of the IgH locus. Columns A through E are as in Supplemental Table 2. Column F indicates whether the sequences of each gene segment are inherited identical by descent and, if so, which strains share identity. For simplicity, SHR-A3 and SHR-B2 are abbreviated to A3 and B2, and Brown Norway to BN, in this column. IBD indicates identity by descent for the two or three strains preceding the abbreviation. Diff indicates strains that have different alleles that are not inherited identical by descent. Columns G and H provide a color-coded haplotype map of SHR-A3 (Column G) and SHR-B2 (Column H) that represents the ancestral state of each IgH gene segment. A segment that is IBD with BN is colored brown. A unique SHR-A3 allele is colored red. A unique SHR-B2 allele is colored green. An allele that differs from BN, is present in both SHR-A3 and SHR-B2, and is IBD between them is colored blue. Column I indicates whether the coding sequence is present on the forward (+) or reverse (-) strand of the genome. Column J provides the position of the gene in the rat genome assembly 3.4.

Supplemental Figure 2 indicates that genome sequence read coverage was variable across this region of the genome. This may reflect sequence duplication and deletion occurring in this highly segmented region. Since variation is detected by alignment to the reference, it is possible that some regions of this part of the genome are duplicated in SHR-A3 and/or SHR-B2. Duplications can undergo subsequent sequence divergence. During alignment to the reference, multiple duplicated segments that have diverged through single nucleotide polymorphism may be aligned to a single sequence in the reference genome that reflects the original state of that segment prior to duplication. Column K provides information that may reflect the occurrence of this phenomenon. For example, if a sequence has been duplicated 4 times in SHR-A3 or SHR-B2 compared to the reference sequence, all 4 duplicated segments may align to a single segment of the reference genome, and this may be reflected in sequence coverage greater than the genome-wide average of ~50X that we obtained, as indicated in Supplemental Figure 2. Depending on its evolutionary path, each duplicated segment may or may not contain variants that are present in its related duplicated segments. For example, in row 37 we found that 149 of 204 aligned sequence reads in SHR-A3 matched the reference sequence at a given location, whereas 55 of 204 reads carried a variant. This may reflect segmental duplication in which subsequent polymorphism has affected only one of the duplicated segments. In the same region, SHR-B2 varied from the reference sequence and the variation was present in all reads. Next-generation short-read genome sequencing cannot adequately resolve segmental duplications that are followed by subsequent polymorphism. However, we note our observations here so that they may be examined more closely as sequencing technology advances and longer-read assemblies become available that can resolve highly segmentally duplicated sequences.

Columns L and M repeat the base sequence and amino acid sequence allele information also available in Supplemental Tables 1 and 2.
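The read counts quoted for row 37 of Column K can be turned into two simple summary numbers: the fraction of reads carrying the variant, and the local depth relative to the genome-wide average. The Python sketch below is only an illustration of that arithmetic. The function names are hypothetical, the 204 aligned reads are treated as the local depth, and ~50X is taken as the baseline from the caption; none of this is the authors' analysis pipeline.

```python
def variant_read_fraction(ref_reads: int, variant_reads: int) -> float:
    """Fraction of aligned reads at a site that carry the variant base."""
    total = ref_reads + variant_reads
    if total == 0:
        raise ValueError("no reads aligned at this site")
    return variant_reads / total


def coverage_ratio(local_depth: int, genome_average: float = 50.0) -> float:
    """Local read depth relative to the genome-wide average (assumed ~50X)."""
    return local_depth / genome_average


# Row 37, SHR-A3: 149 reference reads and 55 variant reads (204 in total).
frac = variant_read_fraction(ref_reads=149, variant_reads=55)
ratio = coverage_ratio(local_depth=149 + 55)

print(f"variant read fraction: {frac:.3f}")
print(f"depth relative to genome average: {ratio:.2f}")
```

A depth ratio well above 1 together with a variant fraction well below 1 is the pattern the caption describes as possibly reflecting a duplicated segment in which only one copy has acquired the polymorphism. These numbers alone cannot establish copy number, which is why the caption defers to longer-read assemblies.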
