Supplementary Information for Bentley et al.
Methods
Landmark maps. Landmark maps (~15 markers/Mb) were constructed either by whole-genome
radiation hybrid mapping (refs 9, 11, 12; chromosomes 1, 6, 9, 10, 13 and 20) or by building contigs of overlapping
YAC clones (chromosomes 22 and X) as described previously (ref. 13). Publicly available markers included
polymorphic microsatellites and other short tandem repeats from the genetic maps (refs 14, 15), gene-based
markers from GeneMap99 (ref. 9; www.ncbi.nlm.nih.gov/GeneMap), and anonymous genomic markers from
the Genome DataBase (www.gdb.org) and the Whitehead Institute (ref. 12). These were supplemented with
new markers derived from single-pass sequences of plasmid subclones made using DNA from flow-sorted
chromosomes 1, 6, 13, 20, X, and a pool of 9-12 (these four chromosomes are not readily
separated by flow sorting from normal human cell lines) (ref. 16). Each batch of purified chromosomes was
tested by using an aliquot for FISH with chromosome specific repeat probes, and batches of <95%
purity were rejected. PCR assays for marker sequences were tested for their ability to amplify DNA of
a monochromosomal hybrid line containing the appropriate chromosome (assignment panel 2; Coriell
Cell Repositories, Camden, NJ). In addition to GeneMap99, integrated RH maps were constructed for
chromosomes 1, 6 and 20 (www.sanger.ac.uk/HGP/Rhmap/maps.shtml).
Bacterial clone mapping. Clones were isolated by screening the human PAC (RPCI 1-6) and BAC
(RPCI 11 & 13, and Caltech A-D) libraries. Clones were restriction fingerprinted using either
fluorescently labelled HindIII/Sau3AI digests resolved on ABI377 sequencing machines
(chromosomes 1, 6, 20 and X; ref. 17) or analysis of HindIII digests on agarose gels (chromosomes 9, 10
and 13; ref. 18). Data were collected using IMAGE; contigs were assembled using the program FPC and
ordered taking account of the landmark data (ref. 19) before import into chromosome-specific implementations
of ACEDB. For integration of tiling path clones into the HumanMap database, fingerprints of the
selected clones were generated either experimentally, or where possible, “in silico”, by restriction
analysis of finished sequence and regeneration of a fingerprint pattern (ref. 19). Contig extension and joining
(see www.sanger.ac.uk/HGP/methods for details) were done initially using new markers generated
from the BAC or PAC ends next to (or near) gaps using (i) the vectorette PCR technique adapted for
bacterial clones; (ii) BAC and PAC end-sequences generated in house; (iii) publicly available BAC
end-sequences; and (iv) genomic sequence. Markers were used to screen clones in existing contigs to
identify joins directly and, where necessary, to screen for new clones using a wide range of available
genomic libraries. In a few cases, YACs were used for gap closure and sequencing.
Clones selected for the tiling path were re-purified to single colonies and DNA preparations were tested
by re-fingerprinting, and in some cases FISH analysis of metaphase chromosome spreads, before
genomic sequencing. The results of the FISH analysis contributed to integration of the clone map with
the cytogenetic map, as described in an accompanying paper (ref. 20). Paralogous sequences, which are
known to occur, for example, in pericentromeric regions, can present difficulties in mapping. A strategy
for obtaining additional information on the map position of a clone in these regions was to determine
the chromosome assignment of multiple STSs designed from available sequence of each BAC or PAC,
by PCR analysis using chromosome assignment panel 2 (Coriell). Taking all STS data for each clone, it
was possible to determine a unique chromosome assignment in most cases. These clones were selected
for sequencing.
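The "in silico" fingerprinting step mentioned above (restriction analysis of finished sequence to regenerate a fingerprint pattern) can be illustrated with a minimal sketch. It assumes a linear insert sequence, a complete HindIII digest (recognition site AAGCTT, cut after the first A), and ignores vector sequence and gel resolution limits. The function name and the toy sequence are hypothetical; this is not the procedure implemented in IMAGE or FPC.

```python
import re
from typing import List

HINDIII_SITE = "AAGCTT"  # HindIII recognition sequence (cuts A^AGCTT)
CUT_OFFSET = 1           # the cut falls after the first base of the site


def in_silico_digest(sequence: str,
                     site: str = HINDIII_SITE,
                     cut_offset: int = CUT_OFFSET) -> List[int]:
    """Return fragment sizes (bp), largest first, for a complete digest
    of a linear sequence. Illustrative only: vector and gel effects ignored."""
    seq = sequence.upper()
    # A zero-width lookahead finds every occurrence of the site.
    cuts = [m.start() + cut_offset for m in re.finditer("(?=" + site + ")", seq)]
    boundaries = [0] + cuts + [len(seq)]
    sizes = [b - a for a, b in zip(boundaries, boundaries[1:]) if b > a]
    return sorted(sizes, reverse=True)


# Toy example (hypothetical sequence with two HindIII sites)
toy = "GGGG" + "AAGCTT" + "CCCCCCCC" + "AAGCTT" + "TT"
print(in_silico_digest(toy))  # [14, 7, 5]
```

The sorted fragment sizes play the role of the band pattern that would be compared with experimentally derived fingerprints.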
Analysis of gaps. We found that many of the remaining gaps between contigs lie in regions with
GC content higher than the average of 41%. Marked examples are on chromosome 9 (11 of the 21
gaps are in 9q34 near the telomere, which is 50% GC) and chromosome 20 (3 of the 4 gaps are near
20q-tel, which is 55% GC). As each chromosome map neared completion, the sizes of the remaining
gaps were measured by two-colour DNA fibre FISH analysis using clones taken from the contigs on
either side of each gap. Gap sizes of up to 300 kb were estimated as a fraction of total clone length.
For larger gaps, the same clones were used as probes for FISH analysis of interphase chromosomes.
Estimates of distance between clones in the range of 0.3-1.0 Mb were made using control clones of
known separation as a reference. FISH to metaphase chromosomes was used for gaps within the range
2-10 Mb. Coincident signals were deemed to represent a maximum separation of 2 Mb (calibrated on
the basis of a parallel analysis using clones selected from chromosome 22, spaced according to the
finished sequence; ref. 21).
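The fibre FISH estimate for small gaps ("as a fraction of total clone length") amounts to a simple proportion. The sketch below assumes that the gap and the flanking clone signal are measured in the same arbitrary units along one fibre, and that the clone length in kb is known; the function and parameter names are hypothetical and the numbers are made up for illustration.

```python
def estimate_gap_kb(gap_signal: float,
                    clone_signal: float,
                    clone_length_kb: float) -> float:
    """Scale a measured gap by a flanking clone of known length.

    gap_signal and clone_signal are lengths measured on the same fibre,
    in the same units (assumption); clone_length_kb is the clone's size.
    """
    if clone_signal <= 0 or clone_length_kb <= 0:
        raise ValueError("clone signal and clone length must be positive")
    if gap_signal < 0:
        raise ValueError("gap signal cannot be negative")
    return gap_signal / clone_signal * clone_length_kb


# Made-up measurement: the gap is half as long as a 150 kb clone signal
print(estimate_gap_kb(5.0, 10.0, 150.0))  # 75.0
```

In practice several fibres would presumably be measured and the estimates combined; how that was done is not stated in the text.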
Detection of euchromatic boundaries. Analysis of telomeric boundaries was based on either
identification of sub-telomeric repeats in the genomic sequence, or integration into the map of bacterial
clones made from YAC clones containing captured human telomeres (ref. 22). Telomeric boundaries have
been found so far within contigs for chromosome arms 1q, 6p, 6q, 9p, 9q, 10p, 13q, 20p and 20q.
Centromeric boundaries were defined by the presence of a block of satellite repeats in the contigs;
these have been found on 1q, 6q, 10p, 10q, 20p, 20q, and Xp to date.
Coverage estimates. For each chromosome in table 1, an estimate is given for the total length
(according to Morton; ref. 1). For the extent of the euchromatic portion (i.e. total minus heterochromatin), 5
Mb was allowed for each centromeric satellite region (approximately equal to the measured size of this
region on chromosome 10, ref. 10; see text) except for Xcen, which has been estimated to be 3 Mb (ref. 23). In
addition, chromosomes 1, 9 and the acrocentric chromosome 13 contain extended blocks of
heterochromatin, which have been estimated as 20 Mb (1q), 17 Mb (9q), and 16 Mb (13p),
respectively.
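The euchromatic extent described above is the total length minus the centromeric allowance and any extended heterochromatin block. A minimal sketch follows, using only the allowances given in the text; it assumes the heterochromatin blocks are subtracted in addition to the centromeric allowance, which is one reading of "in addition". The total length passed in is a placeholder, not a value from Morton, and the function name is hypothetical.

```python
CENTROMERE_DEFAULT_MB = 5.0                  # allowance per centromeric satellite region
CENTROMERE_MB = {"X": 3.0}                   # Xcen estimate given in the text
HETEROCHROMATIN_MB = {"1": 20.0,             # 1q block
                      "9": 17.0,             # 9q block
                      "13": 16.0}            # 13p block


def euchromatic_extent_mb(chromosome: str, total_length_mb: float) -> float:
    """Total length minus centromeric and heterochromatic allowances (Mb)."""
    centromere = CENTROMERE_MB.get(chromosome, CENTROMERE_DEFAULT_MB)
    heterochromatin = HETEROCHROMATIN_MB.get(chromosome, 0.0)
    return total_length_mb - centromere - heterochromatin


# Placeholder total of 100 Mb, used purely for illustration
for chrom in ("9", "20", "X"):
    print(chrom, euchromatic_extent_mb(chrom, 100.0))
```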
Chromosomes X and 22 were shared with other groups: markers delimiting the respective boundaries
were DMD and DXS532 (X) and D22S536 and ACR (22), and physical map coverage estimates are for
the regions defined; GeneMap calculations for 22 used the total published sequence (ref. 3) and are provided
for comparison.
For analysis of GeneMap99 marker coverage, a set of unique gene-based and genetic markers was
derived for each chromosome. Markers were matched to the available draft genome sequence using
BLAST (ref. 24) with RepeatMasker (v. 04/21/99; Smit, A. F. A. & Green, P.;
http://ftp.genome.washington.edu/RM/RepeatMasker.html) followed by ePCR (ref. 25); all unmatched
markers were re-checked using BLAST without RepeatMasker, and the combined total constitutes the
“in sequence” value for each chromosome in table 1. Additional markers known to be present in the
map on the basis of experimental data were added to this total to derive the “in map” values (table 1).
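The bookkeeping behind the "in sequence" and "in map" values can be expressed as set operations. The sketch below assumes each search stage yields a set of matched marker identifiers; the function name, argument names and marker names are hypothetical, and no actual BLAST or ePCR run is performed.

```python
from typing import Dict, Iterable


def marker_coverage(all_markers: Iterable[str],
                    masked_hits: Iterable[str],
                    unmasked_hits: Iterable[str],
                    experimental_hits: Iterable[str]) -> Dict[str, int]:
    """Count markers found 'in sequence' and 'in map'.

    masked_hits: matched by BLAST with RepeatMasker followed by ePCR.
    unmasked_hits: previously unmatched markers recovered by BLAST
                   without RepeatMasker.
    experimental_hits: markers placed in the map by experimental data.
    """
    universe = set(all_markers)
    in_sequence = (set(masked_hits) | set(unmasked_hits)) & universe
    in_map = (in_sequence | set(experimental_hits)) & universe
    return {"total": len(universe),
            "in_sequence": len(in_sequence),
            "in_map": len(in_map)}


# Hypothetical marker names
counts = marker_coverage(
    all_markers={"m1", "m2", "m3", "m4", "m5"},
    masked_hits={"m1", "m2"},
    unmasked_hits={"m3"},
    experimental_hits={"m3", "m4"},
)
print(counts)  # {'total': 5, 'in_sequence': 3, 'in_map': 4}
```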
Estimation of the physical length of clone contigs was based on the total number of fingerprint bands in
the consensus map for each chromosome. An average conversion factor (kb/fingerprint band) was
calculated for BACs or PACs analysed by each fingerprinting method, by comparing the known length
(in kb) of a set of clones with full sequence available, with the number of fingerprint bands observed
experimentally for each clone. For chromosomes 9, 10 and 13 (analysed using HindIII fingerprints),
analysis of 277 clones provided a conversion factor of 4.4 kb/band. For chromosomes 1, 6, 20 and X
(analysed using HindIII/Sau3AI fingerprints), conversion factors of 4.4 kb/band (from 63 BACs) and
3.63 kb/band (from 913 PACs) were obtained. The extent of the map for each chromosome was then
calculated taking into account the proportion of BACs or PACs which contribute to the map in each
chromosome. We observed that the higher average insert size of the BACs (171 kb), compared to the
PACs (115 kb), resulted in the BAC fingerprints being significantly denser than those of the PACs,
with more comigrating bands (doublets, triplets etc.) which are each scored once by the IMAGE
software. The underscoring of BACs compared to PACs is reflected in the different conversion factors.
Coverage of the tiling path in clone overlaps was obtained from the fingerprint data: on average, 25% of
the coverage of the contigs is present in overlapping clones.
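The conversion from fingerprint bands to physical length described above can be sketched as follows. Two assumptions are made that the text does not specify: the conversion factor is computed as a pooled ratio (total kb divided by total bands), and the BAC and PAC factors are blended linearly according to the proportion of each clone type in the map. Function names and the example inputs are hypothetical; only the 4.4 and 3.63 kb/band factors come from the text.

```python
from typing import Iterable, Tuple


def conversion_factor(clones: Iterable[Tuple[float, int]]) -> float:
    """Pooled kb-per-band factor from (sequence_length_kb, band_count) pairs."""
    clone_list = list(clones)
    total_kb = sum(length for length, _ in clone_list)
    total_bands = sum(bands for _, bands in clone_list)
    if total_bands == 0:
        raise ValueError("no fingerprint bands supplied")
    return total_kb / total_bands


def map_extent_kb(consensus_bands: int,
                  bac_fraction: float,
                  bac_kb_per_band: float = 4.4,
                  pac_kb_per_band: float = 3.63) -> float:
    """Estimate contig length from consensus bands, weighting the BAC and
    PAC factors by the fraction of BACs in the map (linear blend assumed)."""
    if not 0.0 <= bac_fraction <= 1.0:
        raise ValueError("bac_fraction must lie between 0 and 1")
    blended = bac_fraction * bac_kb_per_band + (1.0 - bac_fraction) * pac_kb_per_band
    return consensus_bands * blended


# Made-up clones: (length in kb, observed bands)
print(conversion_factor([(100.0, 25), (120.0, 30)]))  # 4.0
# Made-up map: 1,000 consensus bands, half BACs and half PACs
print(round(map_extent_kb(1000, 0.5), 1))
```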
Map displays. Images were extracted from ACEDB as .gif files. From the overview page, detailed
mapping data, including the tiling path, complete contigs and selected markers, can be accessed by
viewing a series of intervals of ~2 Mb (chromosomes 1, 6, 9, 10, 13, 20 and X), or 1 Mb (chromosome 22
sequence map). Tiling path clones are colour-coded to denote status of analysis and laboratory of
origin. All sequence accession identifiers are indicated, and the “mouse-over” facility enables viewing
of the library address of each clone. A search facility allows access to information about individual
clones, sequence accessions or markers. (See www.nature.com; future updates will be at
www.sanger.ac.uk).
1. Morton, N. E. Parameters of the human genome. Proc Natl Acad Sci U S A 88, 7474-6 (1991).
2. Hattori, M. et al. The DNA sequence of human chromosome 21. The chromosome 21 mapping and sequencing consortium. Nature 405, 311-9 (2000).
3. Dunham, I. et al. The DNA sequence of human chromosome 22. Nature 402, 489-95 (1999).
4. Page et al. A physical map of the human Y chromosome. Nature (this issue).
5. Kucherlapati et al. A high-resolution map of human chromosome 12. Nature (this issue).
6. Weissenbach et al. (2001).
7. McPherson et al. Nature (this issue).
8. The International Human Genome Consortium. Initial sequencing and analysis of the human genome. Nature (this issue).
9. Deloukas, P. et al. A physical map of 30,000 human genes. Science 282, 744-6 (1998).
10. Jackson, M. S., See, C. G., Mulligan, L. M. & Lauffart, B. F. A 9.75-Mb map across the centromere of human chromosome 10. Genomics 33, 258-70 (1996).
11. Walter, M. A., Spillett, D. J., Thomas, P., Weissenbach, J. & Goodfellow, P. N. A method for constructing radiation hybrid maps of whole genomes. Nat Genet 7, 22-8 (1994).
12. Hudson, T. J. et al. An STS-based map of the human genome. Science 270, 1945-54 (1995).
13. Collins, J. E. et al. A high-density YAC contig map of human chromosome 22. Nature 377, 367-79 (1995).
14. Dib, C. et al. A comprehensive genetic map of the human genome based on 5,264 microsatellites. Nature 380, 152-4 (1996).
15. Murray, J. C. et al. A comprehensive human linkage map with centimorgan density. Cooperative Human Linkage Center (CHLC). Science 265, 2049-54 (1994).
16. Mungall, A. J. et al. From long range mapping to sequence-ready contigs on human chromosome 6. DNA Seq 8, 151-4 (1997).
17. Gregory, S. G., Howell, G. R. & Bentley, D. R. Genome mapping by fluorescent fingerprinting. Genome Res 7, 1162-8 (1997).
18. Marra, M. A. et al. High throughput fingerprint analysis of large-insert clones. Genome Res 7, 1072-84 (1997).
19. Soderlund, C., Humphray, S., Dunham, A. & French, L. Contigs Built with Fingerprints, Markers, and FPC V4.7. Genome Res 10, 1772-1787 (2000).
20. Trask., (2001).
21. Leversha, M. A., Dunham, I. & Carter, N. P. A molecular cytogenetic clone resource for chromosome 22. Chromosome Res 7, 571-3 (1999).
22. Knight, S. J. et al. An optimized set of human telomere clones for studying telomere integrity and architecture. Am J Hum Genet 67, 320-32 (2000).
23. Mahtani, M. M. & Willard, H. F. Physical and genetic mapping of the human X chromosome centromere: repression of recombination. Genome Res 8, 100-10 (1998).
24. Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. Basic local alignment search tool. J Mol Biol 215, 403-10 (1990).
25. Schuler, G. D. Electronic PCR: bridging the gap between genome mapping and genome sequencing. Trends Biotechnol 16, 456-9 (1998).