The Plant Journal (1996) 9(5), 755-765

TECHNICAL ADVANCE

Detailed description of four YAC contigs representing 17 Mb of chromosome 4 of Arabidopsis thaliana ecotype Columbia

Renate Schmidt†, Joanne West, Gerda Cnops‡, Karina Love, Alma Balestrazzi§ and Caroline Dean*
Department of Molecular Genetics, John Innes Centre, Norwich Research Park, Colney, Norwich NR4 7UH, UK

Received 13 November 1995; revised 6 February 1996; accepted 21 February 1996.
*For correspondence (fax +44 1603 505725; e-mail arabidopsis@bbsrc.ac.uk).
†Present address: Max-Delbrück-Laboratorium in der Max-Planck-Gesellschaft, Carl-von-Linné-Weg 10, 50829 Cologne, Germany.
‡Present address: Laboratorium Genetika, Universiteit Gent, Ledeganckstraat 35, 9000 Gent, Belgium.
§Present address: Dipartimento di Genetica e Microbiologia, Via Abbiategrasso 207, 27100 Pavia, Italy.

Summary

The detailed arrangement of 563 YAC clones comprising four contigs covering approximately 17 Mbp of chromosome 4 is presented. YAC clones were positioned relative to each other and to markers by taking into account marker and end fragment hybridization data and the sizes of all YAC clones. This analysis made it possible to estimate physical distances between the majority of chromosome 4 markers. It also identified a relatively large number of YAC clones containing chimaeric inserts. The YAC contig map of the Columbia ecotype presents an important resource for map-based cloning experiments, rapid mapping of DNA sequences and large-scale genomic sequencing programs.

Introduction

Arabidopsis thaliana is an important model organism for the analysis of complex plant processes using molecular genetic techniques (Meyerowitz and Somerville, 1994). Many laboratories are currently pursuing map-based cloning strategies to isolate Arabidopsis genes. This experimental approach would greatly benefit from the availability of a complete physical map of the Arabidopsis genome. The first attempt to produce a physical map by fingerprinting cosmid clones, paralleling the Caenorhabditis elegans genome project (Coulson et al., 1986), resulted in 750 contigs with an average size of 120 kb (Hauge et al., 1991). When yeast artificial chromosome (YAC) libraries became available (Grill and Somerville, 1991; Ward and Jen, 1990) an international collaboration was set up with the aim of generating a contig map of the whole Arabidopsis genome based on YAC clones. The initial experiments used 125 RFLP markers to identify and position 296 YAC clones, representing approximately 30% of the Arabidopsis genome (Hwang et al., 1991). Since then many more RFLP markers have become available and PCR markers have been developed (Bell and Ecker, 1994; Konieczny and Ausubel, 1993; Reiter et al., 1992). Currently, more than 100 DNA markers have been mapped to Arabidopsis chromosome 4.

Four Arabidopsis YAC libraries made from the Columbia ecotype, CIC (Creusot et al., 1995), EG (Grill and Somerville, 1991), EW (Ward and Jen, 1990) and yUP (Ecker, 1990), are available for physical mapping experiments, representing in total at least 10 genome equivalents. The plant DNA insert sizes differ between the libraries, with the average being 160 kb in the EG and EW libraries, 250 kb in the yUP library and 420 kb in the CIC library. The frequency of repetitive sequences (Creusot et al., 1995; Schmidt et al., 1994; Dunn and Ecker, unpublished results) also varies between the libraries.

We recently published a tiling path of YAC clones covering more than 90% of the genetic map of chromosome 4. Mapping of the rDNA locus, the repeated sequences flanking the centromere and 77 genetically mapped markers on the YAC clones allowed us to integrate the cytogenetic, genetic and physical maps of chromosome 4 (Schmidt et al., 1995). To establish the tiling path, 158 probes were used. We have extended this physical mapping and report the results for a total of 263 probes.
Furthermore, we indicate known chimaeric clones and present all the YAC clones hybridizing to the markers rather than only the YAC clones which link two or more markers. The emphasis of the work presented here is on the relative positioning of YAC clones and markers, allowing the physical distances between markers to be estimated and greatly increasing the usefulness of the YAC contigs in map-based cloning experiments.

Results

Southern blot analysis of YAC clones

The YAC tiling path for Arabidopsis chromosome 4 (Schmidt et al., 1995; World Wide Web at URL: http://nasc.nott.ac.uk/JIC-contigs/JIC-contigs.html) shows the order of 158 probes along the chromosome and all YAC clones which link two or more probes. However, it displays neither the different physical sizes of the YAC clones nor the distances between particular markers. To achieve a representation which also fulfils these criteria, additional data had to be determined for the YAC clones. First, the insert sizes of all YAC clones which have been mapped on to chromosome 4 were determined using pulsed field gel electrophoresis (PFGE). The sizes of the YAC inserts in the CIC library were already available (Creusot et al., 1995). Second, Southern blot analyses were used to position YAC clones relative to each other and to probes.

The Southern blots contained all YAC clones identified as carrying chromosome 4 DNA, digested with EcoRI/BamHI. This restriction digest ensured that for the EG, yUP and the CIC clones the insert DNA was removed completely from the vector sequences. For EW YAC clones the vector sequences could not be removed entirely, as these clones were constructed using sheared DNA and the cloning site was destroyed in the cloning process.
The Southern blot analysis showed that a particular single-copy marker hybridized to common restriction fragments in all the YAC clones it had hybridized to in the colony hybridization experiments, demonstrating overlap between the different YAC inserts. When all the EcoRI/BamHI restriction fragments of a particular marker were found in a YAC clone it was concluded that the marker was fully contained within that clone. Some of the YAC clones contained only a subset of the restriction fragments of the marker, especially when the relatively large cosmid or Lambda DNA markers were used as probes. This indicated that these particular YAC clones did not span the marker completely but ended within it. Sequences adjacent to the YAC vector sequences in EW YAC clones were found on a junction EcoRI/BamHI restriction fragment, which differed in size from the corresponding fragment in the probe.

Knowing whether a particular marker is fully contained within a YAC clone can give important information about the order of markers which map physically very close to each other. For example, markers AGL19 and pCIT-d23 could only be placed unambiguously relative to markers BIO206 and m210 in contig III because pCIT-d23 is not fully contained within YAC clone CIC9G5 (compare Figure 1).

The Southern blot analysis also detected aberrations in YAC clones. In a few cases YAC clones (e.g. CIC5D2/E2, CIC9H6) were found not to contain an entire marker despite the fact that they spanned the two flanking markers. We interpret this as an indication of small deletions. Also, in some cases one of the hybridizing EcoRI/BamHI restriction fragments varied in size in different YAC clones. This is most likely caused by chimaeric inserts in the YAC clones. Examples of this kind were found in YAC clones EG5D4 and yUP6B4, which contained aberrantly sized restriction fragments hybridizing to markers m210 and mi128, respectively (see Table 2).
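The way fragment patterns are interpreted above can be illustrated with a short Python sketch. It is not the procedure used in this study, only a minimal restatement of the logic: the function name, the category labels and the fragment sizes are hypothetical, and real gel data would need a size tolerance instead of the exact matching assumed here.

```python
def classify_marker_in_clone(marker_fragments, clone_fragments):
    """Compare the EcoRI/BamHI fragments of a marker with those it detects in one YAC clone.

    Both arguments are collections of fragment sizes in bp (hypothetical values).
    Exact size matching is an assumption made to keep the sketch short.
    """
    marker = set(marker_fragments)
    clone = set(clone_fragments)
    shared = marker & clone   # fragments seen in both marker and clone
    extra = clone - marker    # hybridizing fragments of unexpected size

    if not clone:
        return "absent"
    if extra:
        # Junction fragment (EW clones), deletion or chimaeric insert.
        return "aberrant"
    if shared == marker:
        return "fully contained"
    # Only a subset of the marker fragments: the clone ends within the marker.
    return "partial"


if __name__ == "__main__":
    marker = {4200, 1800, 900}  # hypothetical fragment sizes of one marker
    print(classify_marker_in_clone(marker, {4200, 1800, 900}))
    print(classify_marker_in_clone(marker, {4200, 1800}))
    print(classify_marker_in_clone(marker, {4200, 2500}))
```

An "aberrant" result does not say which explanation applies; as in the text, that has to be decided from the library of origin and from additional hybridization data.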
A considerable proportion of the DNA sequences used to probe the YAC libraries were cosmid or Lambda clones. Since some of the YAC clones could potentially contain only a very small part of the marker, it could not be ruled out that some YAC clones corresponding to a particular marker were not detected in the colony hybridization experiments. To ensure that all linking clones between markers had been identified, a Southern blot analysis of all clones mapping to a genomic region was carried out using all markers in this region as probes. This test was very important to avoid the non-detection of particular clones due to experimental shortcomings, which could have resulted in the false ordering of markers in particular genomic regions.

Figure 1. YAC contigs covering chromosome 4.
The arrangement of the YAC clones is consistent with hybridization to markers (shown at the top of each contig) and a limited number of chromosome walking experiments. The lines crossing the YAC clones represent the approximate location of the markers/end fragments within the clones. The approximate size of a marker is given by the thickness of the line. For markers mapping to the same genomic location, the lines are shown in different shades of grey and bars indicate the extent of each marker. The sizes of all the YAC clones are drawn to scale; if a clone has been shown to contain several YACs, the size of this clone is indicated by multiple boxes. However, it has not been determined if all the YACs of a particular clone hybridize to a given marker and/or repetitive sequence. For known chimaeric clones the non-contiguous sequences are indicated as dark grey boxes. For those clones for which marker hybridization data and the physical size of the clone are inconsistent, the putative chimaeric part of the clone is shown by light grey shading. The location and the extent of the chimaeric sequences in a clone are consistent with the marker and end fragment hybridization data, but can vary from those shown in the figure. All YAC end fragments that have been analysed are indicated in the figure, left-end fragments as ellipses and right-end fragments as triangles. Only end fragments which have been fully integrated into the YAC contigs are represented in the same way as markers. Chimaeric end fragments are shown in black. For markers mapping to multiple loci, only the YAC clones corresponding to the particular chromosome 4 locus are represented in the figure. Where marker order is ambiguous, markers are either shown at identical positions or an arrow indicates that the order of markers could be reversed.

The Southern blot analysis also allowed the unambiguous integration of multilocus markers into the map. For markers mapping to multiple loci in the Arabidopsis genome (e.g. UM177, UM415, BIO206, g4564) distinct Southern hybridization patterns for the different loci could be established. A locus was integrated into the YAC contig map if some of the YAC clones representing that locus had previously been anchored on to the chromosome 4 map by a single-copy marker. For example, marker BIO206 showed hybridization to two different sets of restriction fragments, one represented by clones EW8C7, yUP15D11, yUP16F9, CIC4A7 and CIC11B4 and the other one by EW5C9, CIC3G3, CIC3H5, CIC9G5, EG10C4, CIC8B4, EW15B12, EW22E4 and yUP4D12. All clones corresponding to the first pattern had previously been shown to hybridize to LD and/or GA1, hence this BIO206 locus maps between LD and GA1 (Contig I, Figure 1). A number of clones which revealed the second hybridization pattern had been found to hybridize to m518 and/or AGL19, thus placing the second BIO206 locus between these markers in Contig III (Figure 1).
Since clones EW5C9, EG10C4, EW15B12, EW22E4 and yUP14D12 showed common restriction fragments when compared with CIC3G3, CIC3H5 and CIC9G5, these clones could also be incorporated into the contig, although they did not hybridize to any of the markers flanking BIO206.

Previously 132 markers and a repetitive sequence had been used to establish YAC contigs for chromosome 4 (Schmidt et al., 1995). Here, we have incorporated an additional eight markers into the map (AtPLC1: Yamamoto et al., 1995; B31: B. Osborne and B. Baker (Plant Gene Expression Center, Albany); I/dSPM312: M. Aarts and A. Pereira (CPRO, Wageningen); JAG9: J. Glover, A. Chaudhury and E. Dennis (CSIRO, Canberra); PDS: Wetzel et al., 1994; phyD, phyE: Clack et al., 1994; r808-b: S. Naito (Hokkaido University, Sapporo)). Some of these new markers identified YAC clones which had not been mapped to chromosome 4 before. Furthermore, the incorporation of these eight additional markers reduced the total number of contigs on chromosome 4 generated solely by marker hybridizations from 14 (Schmidt et al., 1995) to 12 (Contig Ia: BIO217-mi122; Contig Ib: g3843-mi87; Contig II: HY4-nga8; Contig IIIa: RPS18C-pCITf3; Contig IIIb: H2761-mi198; Contig IIIc: PRL1-PHYE; Contig IIId: CH42-g17340; Contig IVa: PRHA; Contig IVb: g8300; Contig IVc: AtKC1/mi431-AP2; Contig IVd: ELI3-ve031; Contig IVe: g3713-DHS1/mi369). Five of these contigs were established by one or two markers (contigs II, IIIa, IVa, IVb), while contigs Ib, IIIb and IIId spanned over 20 markers each.

The vast majority of the markers (137 out of 140) were used to screen all four Columbia YAC libraries. These 137 markers, representing 140 chromosome 4 loci, identified between two and 19 YAC clones per locus, with an average of 8.5 YAC clones. None of the four YAC libraries used yielded YAC clones for every marker tested, demonstrating the need to use multiple libraries.
For example, markers PHYE, PRL1, PHYD, g3713, g3265, DHS1 and mi369 did not detect YAC clones in the CIC library.

YAC end fragment analysis

Chromosome walking experiments were employed to join the remaining 12 YAC contigs, reducing the number of YAC contigs to four (Schmidt et al., 1995). In addition to the eight end fragments presented on the tiling path, 116 end fragments have been produced in the course of generating the YAC contig map for chromosome 4, all of which are indicated on Figure 1. End fragments which either joined adjacent contigs established by the marker hybridizations, or which provided an additional link in an area with sparse YAC coverage, have been analysed using Southern blots of all YAC clones mapping to a particular area to ensure that all linking YAC clones have been identified. These end fragments are represented in Figure 1 in the same way as the markers. The majority of end fragments have only been tested on a subset of YAC clones mapping to a particular genomic region, preventing an unambiguous positioning of the end fragment relative to all YAC clones in this area. However, even the partial information is useful in generating the YAC contig map and, more importantly, the end fragment analysis reveals chimaerism of YAC clones (see below).

The cosmid clones which have been isolated using YACs or end fragments (g14587, CC5P13, CC6N7, CC7J19, CC10M20, CC12J20, CC15D15, CC15017, CC16N19, CC2012, CC27P11, CC28C17, CC34A17, CC44C20, CC44H2, CC50H10, CC50K21; Schmidt et al., 1995) were particularly useful for contig generation. They assessed a bigger genomic region than most end fragments in the Southern blot analysis (end fragments produced by inverse polymerase chain reaction (IPCR) were often smaller than 300 bp), hence the positioning of YAC clones relative to each other and to the cosmids was more accurately assessed.
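Contig formation from marker and end fragment hybridizations, as described in the two sections above, can be viewed as finding connected groups of probes and clones. The following Python sketch illustrates that view with a union-find structure; this is a technique chosen here for illustration, not a description of how the contigs in this study were assembled. The function name and all probe and clone names are hypothetical.

```python
from collections import defaultdict


def build_contigs(hybridizations):
    """Group probes into contigs.

    hybridizations maps a probe name (marker or end fragment) to the YAC
    clones it hybridizes to. Probes that share a clone, directly or through
    a chain of overlapping clones, end up in the same contig.
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    for probe, clones in hybridizations.items():
        find(("probe", probe))  # register probes that detect no clone
        for clone in clones:
            union(("probe", probe), ("clone", clone))

    groups = defaultdict(list)
    for kind, name in list(parent):
        if kind == "probe":
            groups[find((kind, name))].append(name)
    return sorted(sorted(names) for names in groups.values())


if __name__ == "__main__":
    # Hypothetical data: three markers forming two contigs.
    data = {
        "markerA": ["yac1", "yac2"],
        "markerB": ["yac2", "yac3"],
        "markerC": ["yac4"],
    }
    print(len(build_contigs(data)), "contigs from marker hybridization")
    # An end fragment of yac3 that also detects yac4 joins the two contigs.
    data["yac3_end"] = ["yac3", "yac4"]
    print(len(build_contigs(data)), "contig after chromosome walking")
```

The sketch ignores chimaeric clones, which would create false links; in practice such clones have to be identified and excluded before contigs are joined.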
Alignment of YAC clones relative to markers and end fragments in the YAC contigs

Southern blot analysis of YAC clones using markers or end fragments as probes, in combination with the size information, allowed us to position the YAC clones relative to each other and to the markers in the YAC contigs. The arrangement of the YAC clones was drawn out using the following rules:
(i) where a YAC clone spans two markers, the physical distance between those markers is the size of the YAC insert or smaller;
(ii) where a YAC clone lies between two markers but does not contain any of the restriction fragments of these markers, the distance between these markers is bigger than the size of the YAC insert;
(iii) the order of markers given is consistent with all Southern hybridization data, especially where YAC clones end within a particular marker;
(iv) unless additional data prove otherwise, the clones are positioned relative to the markers under the assumption that none of the clones is chimaeric. If this was not feasible, an arrangement was chosen which required the fewest chimaeric clones possible, and putatively chimaeric clones are indicated.

Figure 1 shows an arrangement of all 563 YAC clones mapping to chromosome 4 which is consistent with the results of all marker and end fragment hybridizations and the sizes of all YAC inserts. This representation of the YAC clones provides important information on regions with sparse YAC coverage. Although multiple YAC coverage has been achieved for the majority of the chromosome, we have reported that some of the links are spanned by a single YAC clone (Schmidt et al., 1995). For example, CIC12G2 is the only clone which links contigs IVc and IVd. However, Figure 1 shows that multiple clones extend into the interval between CC12J20 and CC5P13 (e.g. yUP10B9, yUP10B10, yUP15H2 and yUP10H11). End fragments from such clones could prove useful to establish additional links.
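Rules (i) and (ii) of the alignment procedure above translate into simple bounds on the distance between two markers. The Python sketch below shows this under simplifying assumptions that are not part of the original analysis: markers are treated as points, insert sizes are taken as exact, and the function name and the sizes in the example are hypothetical.

```python
def distance_bounds(spanning_sizes_kb, internal_sizes_kb):
    """Bound the distance between two markers from YAC insert sizes (kb).

    spanning_sizes_kb: inserts of clones that contain both markers (rule i).
    internal_sizes_kb: inserts of clones lying between the markers without
                       containing either of them (rule ii).
    Returns (lower, upper, consistent); upper is None if no clone spans
    both markers.
    """
    # Rule (i): the distance is at most the size of the smallest spanning clone.
    upper = min(spanning_sizes_kb) if spanning_sizes_kb else None
    # Rule (ii): the distance exceeds the size of the largest internal clone.
    lower = max(internal_sizes_kb) if internal_sizes_kb else 0
    # Contradictory bounds point to a chimaeric or rearranged clone (rule iv).
    consistent = upper is None or lower < upper
    return lower, upper, consistent


if __name__ == "__main__":
    # Hypothetical interval: three spanning clones and one internal clone.
    print(distance_bounds([420, 250, 160], [90]))
    # A small spanning clone that conflicts with a larger internal clone.
    print(distance_bounds([160], [200]))
```

The example also shows why small clones matter for accuracy: the upper bound is set by the smallest clone spanning both markers, so intervals covered only by large inserts remain poorly constrained.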
Redundant YAC coverage for the majority of genomic intervals and a high density of markers in most areas of chromosome 4 ensured that the markers could be placed within the YAC contigs with high accuracy. This allowed the determination of physical distances between adjacent markers and the extent of the YAC contigs. From the low degree of flexibility in the positioning of the YAC clones, we estimate that the error on the majority of distances is less than 10%. A few genomic intervals, however, are only spanned by one or several large insert YAC clones. The lack of small YAC clones linking the markers flanking these intervals prevents an accurate estimate of the physical distances in these cases (e.g. intervals HY4-nga8, RJS-mi422, mi123-02213-2, PRHA-g8300). The four YAC contigs shown in Figure 1 cover approximately 17 Mb.

Chimaeric YAC clones

The YAC clones forming the chromosome 4 map were analysed for chimaerism using various criteria. For all YAC libraries, clones carrying chloroplast DNA, rDNA sequences and the 180 bp repeated DNA sequence have been identified (Creusot et al., 1995; Schmidt et al., 1994; Dunn and Ecker, personal communication). The co-ordinates of these clones have been compared with those of the YAC clones mapping to chromosome 4. Southern blot analysis using the repetitive sequences as probes verified that 74 YAC clones, representing 68 independent clones, carried chromosome 4-specific sequences in addition to unlinked repetitive sequences (Table 1). In some cases it was shown that end fragments of the clones corresponded to the repeated DNA sequences (EG2D2LE, yUP14B12RE: rDNA sequences; EG1B10RE, EG15C10LE, yUP8A5LE: chloroplast DNA sequences). Thirteen YAC clones hybridized to two or three different locations on chromosome 4, as shown by the marker hybridization results, and thus must be chimaeric (Table 2). End fragment analysis revealed that at least 22 out of 106 YAC clones tested were carrying non-contiguous sequences (Table 3 and see above).
Interestingly, several YAC clones comprise multiple unlinked sequences (e.g. EG5D4, yUP6B4, EG15C10, EG17C8/C9). Putative chimaeric clones were also detected when the YAC clones were aligned with the markers in order to form the contigs. For some clones the size ascertained in the PFGE analysis was not consistent with the marker hybridization data and the distance between the markers, as established by other YAC clones in the area; these clones could only be integrated into the contigs based on the assumption that the clones were chimaeric. In some cases it was shown that the clones were indeed chimaeric, as they also hybridized to unlinked markers (e.g. yUP20H1, yUP19E11; Schmidt et al., unpublished results).

Discussion

The YAC contig map for chromosome 4 has been produced using genetically mapped markers and YAC end fragments as probes, guaranteeing that the YAC contig map is of direct use for map-based cloning experiments. Four different YAC libraries were used to generate the YAC contigs. On average 8.5 YAC clones were isolated per marker, resulting in a highly redundant YAC coverage for most areas of the chromosome. Use of YAC libraries with different average insert sizes has a number of advantages. The big insert YACs were advantageous for linking genomic regions with sparse marker cover, while in areas of high marker density small YAC clones were extremely useful for determining marker order. The integration of small YAC clones into the physical map will be an important resource in the fine-mapping and subcloning stages of map-based cloning experiments since they cover smaller genomic regions.

YAC clones carrying single-copy sequences derived from chromosome 4 and unlinked repetitive sequences were found in three of the four YAC libraries. Twenty-eight percent and 14% of the EG and yUP YAC clones, respectively, belonged to this class of chimaeric clones, while only 2% of the CIC YAC clones showed this problem.
Table 1. Chimaeric YAC clones carrying single-copy nuclear sequences as well as repetitive DNA

YAC co-ordinate  Hybridization to repetitive DNA sequences  Location on chromosome 4
CIC5C2           Chloroplast DNA                            Contig III: UM415-KG32
CIC8B4           Chloroplast DNA                            Contig III: BIO206-m210
EG1B10           Chloroplast DNA                            Contig III: COP9-g10086
EG1B11           Chloroplast DNA / rDNA                     Contig III: H2761
EG2A8            Chloroplast DNA                            Contig III: mi232 / I/dSpm64-RLK5
EG2D2            rDNA                                       Contig III: EG15C8RE
EG2D11           rDNA                                       Contig IV: AtKC1 / mi431
EG2G4/H4/H6      Chloroplast DNA                            Contig IV: yUP24C6LE-g3088
EG3A11           Chloroplast DNA                            Contig III: CC28C17
EG3F12           Chloroplast DNA / rDNA                     Contig I: m506
EG4G9            Chloroplast DNA                            Contig III: CC114 / g4564
EG5C1            Chloroplast DNA                            Contig IV: g15064
EG5F2            Chloroplast DNA / rDNA                     Contig IV: g15064
EG5G2            Chloroplast DNA                            Contig III: CC28C17
EG7B11           Chloroplast DNA                            Contig IV: yUP7A3LE-I/dSpm76
EG7C4            Chloroplast DNA                            Contig IV: r808-b
EG7G2/H2         Chloroplast DNA / rDNA                     Contig IV: PRHA
EG7G6            rDNA                                       Contig III: B31
EG8E6            Chloroplast DNA / rDNA                     Contig III: ve030-KG32
EG8F7            Chloroplast DNA / rDNA                     Contig III: ve030-KG32
EG8G8            Chloroplast DNA                            Contig IV: g8300
EG10C12          Chloroplast DNA                            Contig III: m557-g3883
EG10F2           Chloroplast DNA / rDNA                     Contig III: AG-g19838
EG10G6/H6        Chloroplast DNA                            Contig I: Ms2 / I/dSPM312-ve023 / GT148
EG11G8           rDNA                                       Contig III: m557-261a
EG11H6           Chloroplast DNA                            Contig I: BIO219
EG13C7           Chloroplast DNA                            Contig I: BIO217-CIC10C8RE
EG13G5           Chloroplast DNA / rDNA                     Contig III: KG32
EG14G5           rDNA                                       Contig III: g4513-g17340
EG15C10          Chloroplast DNA / rDNA                     Contig III: m326/455 / ve024
EG15E1           Chloroplast DNA / rDNA                     Contig III: CC2012-g4539
EG15H3           Chloroplast DNA / rDNA                     Contig I: CC50K21
EG17A11          Chloroplast DNA / rDNA                     Contig III: SEP2B-CC128
EG17C8/C9        HindIII repeat sequence                    Contig IV: pCITd104; UM415/555
EG17G3/H3        rDNA                                       Contig I: g3843
EG18A2           Chloroplast DNA / rDNA                     Contig III: ve030-KG32
EG18A12          Chloroplast DNA / rDNA                     Contig III: EW14E4LE
EG18B4           rDNA                                       Contig III: m557-CC10
EG18G1           Chloroplast DNA                            Contig IV: AtGI~
EG19D12          Chloroplast DNA / rDNA                     Contig IV: AtH1
EG19E10          rDNA                                       Contig III: KG32
EG19E11          rDNA                                       Contig II: HY4
EG19E8           Chloroplast DNA                            Contig III: ve030-KG32
EG19H3           Chloroplast DNA / rDNA                     Contig I: ABP-m448
EG23A3           Chloroplast DNA / rDNA                     Contig IV: pCITd99-pCITd76
yUP2B9           rDNA                                       Contig III: CC28C17
yUP3G10          rDNA                                       Contig III: m518
yUP5G9           rDNA                                       Contig III: AG-pCITd71
yUP6D1           Chloroplast DNA                            Contig IV: yUP24C6LE-yUP6D1LE
yUP6F11          Chloroplast DNA                            Contig III: mi465
yUP7A6           Chloroplast DNA                            Contig III: mi330-Athsf1
yUP7B10          Chloroplast DNA                            Contig III: mi198-EG15C8LE
yUP7H4           Chloroplast DNA                            Contig III: B31
yUP8A5           Chloroplast DNA                            Contig IV: CC44H2-pCITd104
yUP8E11          Chloroplast DNA                            Contig III: CC250 / PG11-DD1
yUP9B4           rDNA                                       Contig III: CIC1H1LE
yUP9E4           rDNA                                       Contig III: g2620-COP9
yUP10D3          Chloroplast DNA                            Contig III: COP9-g10086
yUP14B12         rDNA                                       Contig III: EW19E8LE
yUP15B7          rDNA                                       Contig III: CIC1H1LE
yUP16G2          Chloroplast DNA                            Contig III: yUP1F4LE-mi465
yUP17B8          rDNA                                       Contig III: 02213-2-g4513
yUP18G9          Chloroplast DNA                            Contig III: EW9C3LE-CC250 / PG11
yUP20A2          rDNA                                       Contig III: m326/455 / ve024-m580
yUP20C1          Chloroplast DNA                            Contig III: mi422-UM415/555
yUP20H3          rDNA                                       Contig IV: g8300
yUP21D11         rDNA                                       Contig III: CC10M20-CH42
yUP24B7          Chloroplast DNA                            Contig III: CC250 / PG11-DD1

Table 2. Chimaeric YAC clones carrying non-contiguous single-copy sequences of chromosome 4

YAC co-ordinate  Locations on chromosome 4
EG3A2            Contig III: AGP66 / Contig III: mi232 / I/dSpm64-g17340
EG5D4            Contig III: m210-g4108 / Contig III: m326/455
EG17C8/C9        Contig IV: pCITd104 / Contig IV: UM415/555
EG23G8           Contig III: TG1C8 / Contig III: JGB9-LM117
EG23G9           Contig III: TG1C8 / Contig III: JGB9-LM117 / Contig III: UM415/555
EG24F9           Contig III: g4539 / Contig III: CC10M20
EW8E11           Contig I: mi51-g8802 / Contig III: yUP17B7LE
yUP4G1           Contig III: m518 / Contig III: g3883-261a
yUP6B4           Contig III: mi128-g6837 / Contig III: ve030-KG32 / Contig IV: g8300
yUP10C1          Contig I: g2616 / Contig III: EW11E9LE-EW9C3LE
yUP19H3          Contig I: g8802-g6844 / Contig I: BIO206-mi233
yUP20H8          Contig III: yUP13C7RE-yUP3F7RE / Contig III: EW11E9LE-CC250 / PG11

Table 3. Chimaeric end fragments

YAC co-ordinate  Location of clone on chromosome 4                Unlinked end fragment
CIC5C3           Contig I: mi87                                   RE
EG1B10           Contig III: COP9-g10086                          RE
EG1E3            Contig IV: yUP7A3LE-m214                         LE
EG2D2            Contig III: EG15C8RE                             LE
EG4G7            Contig III: yUP13C7RE                            RE
EG5D4            Contig III: m210-g4108; m326/455                 LE
EG6H10           Contig IV: pCITd76-RNA-Polymerase II LS          LE
EG7A4            Contig IV: CC5P13 / CC7J19-g2486 / CC127         RE
EG11B7           Contig III: yUP13C7RE                            RE
EG11F9           Contig III: CC250 / PG11                         RE
EG11H6           Contig I: BIO219                                 RE
EG15C10          Contig III: m326/455 / ve024                     LE
EG17G9           Contig III: g4108                                LE
EG23D9/E10       Contig III: g14587-PRL1                          LE
EW13F9           Contig III: mi260-yUP3F7RE                       LE
yUP7G10          Contig I: I/dSpm27-g6844                         RE
yUP8A5           Contig IV: CC44H2-pCITd104                       LE
yUP10C1          Contig III: EW11E9LE-EW9C3LE                     LE
yUP14B12         Contig III: EW19E8LE                             RE
yUP15D11         Contig I: LD-GA1                                 LE
yUP17F1          Contig I: CC50K21-I/dSpm41G                      LE
yUP20A4          Contig III: Athsf1                               LE

The limited analysis for chimaeric clones on chromosome 4 as described in this paper revealed that at least 2-3% of the EW and CIC YAC clones are chimaeric, while for the yUP (>21%) and the EG library (>35%) the percentage of chimaeric clones is much higher. Despite the presence of chimaeric clones the map is reliable, since most intervals are covered by a minimum of two YAC clones.

The size of the four YAC contigs comprising more than 90% of chromosome 4 is approximately 17 Mb, while the nucleolar organizing region carrying the tandemly repeated rDNA units covers approximately 3.5 Mb (Copenhaver and Pikaard, personal communication). The gaps between YAC contigs I, II and III amount to 6.6 cM while the gap between YAC contigs III and IV is 2.3 cM (Schmidt et al., 1995). Based on the average kb/cM ratio and taking into account that the YAC contigs extend beyond the markers which have been genetically mapped, we can estimate the size of the gaps to be greater than 1 Mb.
Thus the total size of chromosome 4 is greater than 21.5 Mb. Measurements of the synaptonemal complex length of chromosome 4 have shown that this chromosome comprises 16.8% of the genome (Albini, 1994). In the nucleolar organizing region the synaptonemal complex is interrupted and it is not clear how much of this region contributes to the entire length of the chromosome established by this method. If we assume that the nucleolar organizing region does not contribute to the synaptonemal complex length at all or in a much reduced way, then the minimum size of the nuclear genome of Arabidopsis would be 107 Mb. The information presented in this paper (the detailed arrangement of the YAC clones in the YAC contigs, the sizes of the YACs and known chimaeric YAC clones) is available on the World Wide Web (WWW) at both URL: http://genome-www.stanford.edu/Arabidopsis/ JIC-contigs.html and URL:http://nasc.nott.ac.uk/JIC-contigs/JIC-contigs.html. All available markers (i.e. all but agp66, PETC and H2761) have been deposited at the Arabidopsis Biological Resource Centre at Ohio. The majority of the YAC end probes were derived using IPCR and were not cloned. However, they can easily be regenerated using the protocols described in the Experimental procedures section. The density of RFLP markers mapped on the genetic and physical maps of chromosome 4 now means that any new mutation can be mapped to a small interval both genetically and physically. These intervals vary in size along the chromosome, most of them are between 50 and 500 kb. Given the availability of cosmid libraries built in Agrobacterium binary vectors (Olszewski et al., 1988) and YAC contigs on A r a b i d o p s i s c h r o m o s o m e 4 763 The YAC libraries were maintained as described previously (Schmidt et al., 1992, 1994).The preparation of yeast colony filters, the probe labelling, and the hybridization and washing conditions were identical to the ones outlined before (Schmidt et aL, 1992). 
For markers cloned in vectors with pYAC-homology the plant DNA fragments were separated from the vector sequences prior to their use as probes in colony hybridization experiments. For all markers cloned in Lambda-vectors, the cosmid vectors Lorist or pLAFR3, the complete clones were digested with a restriction enzyme (four bp recognition sequence) and subsequently labelled. were defined as those sequences which were adjacent to the left arm of pYAC4 and its derivatives, while the right ends are neighbouring the right arm. IPCR can be used to isolate both ends of YAC DNA inserts. Yeast genomic DNA (0.5-1 ilg) was digested with Alul, EcoRV and Hincll. Left ends were isolated with Alul and EcoRV and right ends with Alul and Hincll. Use of these enzymes guaranteed the isolation of suitable end fragments more than 50% of the time. After an ethanol precipitation the fragments were ligated at 4°C under dilute conditions to promote circle formation. Heat inactivation of the ligase was followed by an ethanol precipitation. Samples for the left end YAC circles were then digested with Nhel while right end YAC circles were linearized with Sspl. After phenol extraction, samples were passed over a Sepharose CL 6B spin column (in TE buffer). The resulting DNA solutions were used in the PCR reactions. The PCR reactions contained 10-50 ng of DNA, 10 mM Tris-HCI (pH 8.3), 50 mM KCI, 2 mM MgCI2, 0.01% gelatin, 0.005% Tween 20, 0.005% NP40, 0.1 mM dATP, 0.1 mM dCTP,0.1 mM dGTP,0.1 mM d'l-lP, 0.2 I~M of each of the appropriate primers and 1.25 U of Taq DNA polymerase. The reaction volume was 100 111.Thirty-five cycles of 1 min at 94°C, 1 min at 60°C and 2 rain at 72°C were followed by an additional 10-min incubation at 72°C. 
The sequences of the PCR oligonucleotides used were: Yeast genomic DNA preparation for restriction enzyme digestion and YAC end-fragment isolation Left-end (outer nest): D71 5'-TCCTGCTCGCTI'CGCTACTT-3' C78 GCGATGCTGTCGGAATGGAC Right end (outer nest): C69 CTGGGAAGTGAATGGAGACATA C70 AGGAGTCGCATAAGGGAGAG. the relatively facile in p/anta transformation procedure for Arabidopsis (Bechtold et al., 1993), the isolation of genes from chromosome 4 using map-based cloning should now not be the limiting factor in the analysis of complex plant biological processes. Experimental procedures Yeast colony hybridizations Yeast colonies were removed from agar plates and resuspended in 400 Id TE/SDS (10 mM Tris-HCI, pH 8.0, 1 mM EDTA, 0.1% SDS). An equal volume of phenol/chloroform/isoamylalcohol (25:24:1, v:v:v) was added, the preparations were mixed carefully and subsequently incubated for 20-30 min at 65°C. The preparations were again mixed thoroughly. After centrifugation, the supernatants were re-extracted with phenol/chloroform/isoamylalcohol. Forty microlitres of 3 M NaAc (pH 5.4) were added to the preparations before they were precipitated with ethanol. The DNA pellets were resuspended in 50 p.I of TE (10 mM Tris-HCI, 1 mM EDTA, pH 8.0) and incubated overnight at 4°C. After centrifugation the supernatants were extracted with an equal volume of phenol/ chloroform/isoamylalcohol. Five microlitres of 3 M NaAc (pH 5.4) were added to the preparations and they were precipitated with ethanol. The DNA pellets were resuspended in TE and used for restriction enzyme digestion. For Southern blot analysis, the yeast genomic DNA was digested with EcoRI/BamHl. Gel transfer to Hybond-N and hybridization conditions were according to manufacturer's instructions (Amersham) with the modifications described previously (Schmidt eta/., 1994). Sizing of YACs To size YACs, intact yeast chromosomal DNA was isolated and separated by PFGE using concatemers of Lambda-DNA as a size standard. 
Southern blots of the gels were hybridised using pYAC vector as probe (Schmidt et al., 1994).
YAC end-fragment isolation
Isolation of YAC end fragments was carried out by IPCR or plasmid-rescue. The left arm of the pYAC4 vector carries Trp1, ARS1 and CEN4 sequences as well as an origin of replication and an antibiotic resistance gene functional in Escherichia coli, while the right arm harbours the Ura3 sequences. Left ends of YAC inserts
PCR products derived from AluI circles could be reamplified with inner nest primers (D72/C77 for left end fragments and C72/C71 for right end fragments). PCR products derived from EcoRV circles had to be reamplified with C78 and D72 (or D71) and PCR products derived from HincII circles could only be reamplified with C70 and C72 (or C69), since oligonucleotides C77 and C71 are homologous to vector sequences which are absent from the EcoRV and the HincII circles, respectively.
Left end (inner nest): D72 5'-CACTATCGACTACGCGATCA-3'; C77 5'-GTGATAAACTACCGCATTAAAGC-3'
Right end (inner nest): C72 5'-CGAGTCGAACGCCCGATCTC-3'; C71 5'-AGAGCCTTCAACCCAGTCAG-3'
Details of the circles and PCR products generated have been described elsewhere (Schmidt and Dean, 1996; Schmidt et al., 1992).
Insert fragments adjacent to the left arm of the YAC vector could also be isolated by plasmid-rescue. Yeast genomic DNA (1 µg) was digested with XhoI or NdeI. The DNA was extracted with phenol/chloroform/isoamyl alcohol and then precipitated with ethanol. The ligation of the fragments was carried out under dilute conditions at 4°C to promote circularization. After heat inactivation the DNA was precipitated with ethanol. Electroporation of aliquots of the ligated DNA into competent DH5α E. coli cells was carried out using a Bio-Rad electroporator according to the manufacturer's instructions.
Immediately after the electroporation 1 ml of growth medium was added and the cells were grown for 1 h at 37°C before they were spread on an agar plate containing ampicillin (50 µg ml⁻¹). Three clones were characterized from each of the transformations.
To use the end fragments produced by IPCR or left end rescue in colony hybridization experiments, all vector sequences had to be removed. IPCR fragments and plasmid-rescue products were cut with the enzyme which was used to digest the yeast DNA prior to self-ligation (IPCR: e.g. AluI, HincII, EcoRV; plasmid-rescue: XhoI, NdeI). Furthermore, products derived from the EG YAC clones had to be cut with BamHI (cloning site in pYAC41), while products from CIC and yUP YAC clones had to be digested with EcoRI (cloning site in pYAC4). Since the clones of the EW library do not contain a restored YAC vector cloning site, a variety of diagnostic digests were used on the plasmid-rescue derived clones to identify an enzyme which was suitable to isolate at least part of the YAC insert-specific fragment. IPCR products from EW YAC clones were cut with HhaI (left-end products) and Sau3A (right-end products), since these restriction enzymes have recognition sequences very close to the original cloning site in pYAC3.
Screening of cosmid libraries with whole YACs or end fragments as probes
A gridded cosmid library carrying approximately 25 kb inserts of Columbia ecotype DNA in pLAFR3 (Lister and Dean, unpublished results) was plated at high density on selective medium and grown overnight at 37°C. Up to 10 colony lifts were taken from each agar plate. Colony filters were treated as described (Sambrook et al., 1989). Hybridization and washing conditions were the same as those used for Southern hybridization experiments; however, the length of the washes was reduced when complete YACs were used as probes. YACs were gel-purified using the protocol described by Bancroft et al. (1992).
DNA probes
The repetitive DNA sequences (25S-18S rDNA, 5S-rDNA, chloroplast DNA, pAL1; Martinez-Zapater et al., 1986) and most marker DNA sequences used as probes to screen the YAC libraries have been described previously (Schmidt et al., 1994, 1995). Several end fragments provided by other laboratories were also incorporated in the YAC contig map: CIC1H1LE (B. Dietrich and J. Dangl, University of North Carolina, Chapel Hill), EG21C9LE, EW22B3LE, yUP3E9LE (B. Staskawicz, University of California, Berkeley), EW14G1LE, EW9C3LE, EW11E9LE, yUP10B5LE, yUP11F11LE, yUP17B7LE (Bent et al., 1994) and yUP17E10LE (Pepper et al., 1994).
Acknowledgements
We acknowledge the generosity of the following people in sending unpublished markers: B. Osborne, B. Baker (Plant Gene Expression Center, Albany), B. Dietrich, J. Dangl (University of North Carolina, Chapel Hill), B. Staskawicz (University of California, Berkeley), M. Aarts, A. Pereira (CPRO, Wageningen), J. Glover, A. Chaudhury and E. Dennis (CSIRO, Canberra) and S. Naito (Hokkaido University, Sapporo). We thank Z. Lenehan (John Innes Centre, Norwich) for help with the YAC mapping and D. Bouchez (INRA, Versailles), P. Dunn and J. Ecker (University of Pennsylvania, Philadelphia) for providing data on the YAC clones prior to publication. We also thank David Flanders (Stanford AtDB) for construction of the Web page and Mary Anderson (Nottingham Arabidopsis Stock Centre) for setting up a mirror site. This work was supported by grants from the European Community (BIOT-CT 90-0207) and the BBSRC (208/PG0608 and 208/PG01525) to C.D. and EC training fellowships to A.B., G.C. and R.S.
References
Albini, S.M. (1994) A karyotype of the Arabidopsis thaliana genome derived from synaptonemal complex analysis at prophase I of meiosis. Plant J. 5, 665-672.
Bancroft, I., Westphal, L., Schmidt, R. and Dean, C. (1992) PFGE-resolved RFLP analysis and long range restriction mapping of the DNA of Arabidopsis thaliana using whole YAC clones as probes. Nucl. Acids Res. 20, 6201-6207.
Bechtold, N., Ellis, J. and Pelletier, G. (1993) In planta Agrobacterium-mediated gene transfer by infiltration of adult Arabidopsis thaliana plants. C.R. Acad. Sci. Paris, 316, 1194-1199.
Bell, C.J. and Ecker, J.R. (1994) Assignment of 30 microsatellite loci to the linkage map of Arabidopsis. Genomics, 19, 137-144.
Bent, A.F., Kunkel, B.N., Dahlbeck, D., Brown, K.L., Schmidt, R., Giraudat, J., Leung, J. and Staskawicz, B.J. (1994) RPS2 of Arabidopsis thaliana: A leucine-rich repeat class of plant disease resistance genes. Science, 265, 1856-1860.
Clack, T., Mathews, S. and Sharrock, R.A. (1994) The phytochrome apoprotein family in Arabidopsis is encoded by five genes: the sequences and expression of PHYD and PHYE. Plant Mol. Biol. 25, 413-427.
Coulson, A., Sulston, J., Brenner, S. and Karn, J. (1986) Toward a physical map of the genome of the nematode Caenorhabditis elegans. Proc. Natl Acad. Sci. USA, 83, 7821-7825.
Creusot, F., Fouilloux, E., Dron, M. et al. (1995) The CIC library: a large insert YAC library for genome mapping in Arabidopsis thaliana. Plant J. 8, 763-770.
Ecker, J.R. (1990) PFGE and YAC analysis of the Arabidopsis genome. Methods, 1, 186-194.
Grill, E. and Somerville, C. (1991) Construction and characterization of a yeast artificial chromosome library of Arabidopsis which is suitable for chromosome walking. Mol. Gen. Genet. 226, 484-490.
Hauge, B.M., Hanley, S., Giraudat, J. and Goodman, H.M. (1991) Mapping the Arabidopsis genome. In Molecular Biology of Plant Development (Jenkins, G.I. and Schuch, W., eds). Cambridge: The Company of Biologists, pp. 45-56.
Hwang, I., Kohchi, T., Hauge, B.M. et al. (1991) Identification and map position of YAC clones comprising one-third of the Arabidopsis genome. Plant J. 1, 367-374.
Konieczny, A.
and Ausubel, F.M. (1993) A procedure for quick mapping of Arabidopsis mutants using ecotype specific markers. Plant J. 4, 403-410.
Martinez-Zapater, J.M., Estelle, M.A. and Somerville, C.R. (1986) A highly repeated DNA sequence in Arabidopsis thaliana. Mol. Gen. Genet. 204, 417-423.
Meyerowitz, E.M. and Somerville, C.R. (eds) (1994) Arabidopsis. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press.
Olszewski, N.E., Martin, F.B. and Ausubel, F.M. (1988) Specialised binary vectors for plant transformation: expression of the AHAS gene in Nicotiana tabacum. Nucl. Acids Res. 16, 10765-10782.
Pepper, A., Delaney, T., Washburn, T., Poole, D. and Chory, J. (1994) DET1, a negative regulator of light-mediated development and gene expression in Arabidopsis, encodes a novel nuclear-localized protein. Cell, 78, 109-116.
Reiter, R.S., Williams, J.G.K., Feldmann, K.A., Rafalski, J.A., Tingey, S.V. and Scolnik, P.A. (1992) Global and local genome mapping in Arabidopsis thaliana by using recombinant inbred lines and random amplified polymorphic DNAs. Proc. Natl Acad. Sci. USA, 89, 1477-1481.
Sambrook, J., Fritsch, E.F. and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press.
Schmidt, R. and Dean, C. (1996) Hybridization analysis of YAC clones. In Methods Mol. Cell. Biol., in press.
Schmidt, R., Cnops, G., Bancroft, I. and Dean, C. (1992) Construction of an overlapping YAC library of the Arabidopsis thaliana genome. Aust. J. Plant Physiol. 19, 341-351.
Schmidt, R., Putterill, J., West, J., Cnops, G., Robson, F., Coupland, G. and Dean, C. (1994) Analysis of clones carrying repeated DNA sequences in two YAC libraries of Arabidopsis thaliana DNA. Plant J. 5, 735-744.
Schmidt, R., West, J., Love, K., Lenehan, Z., Lister, C., Thompson, H., Bouchez, D. and Dean, C. (1995) Physical map and organization of Arabidopsis thaliana chromosome 4. Science, 270, 480-483.
Ward, E.R. and Jen, G.C. (1990) Isolation of single-copy-sequence clones from a yeast artificial chromosome library of randomly-sheared Arabidopsis thaliana DNA. Plant Mol. Biol. 14, 561-568.
Wetzel, C.M., Jiang, C.-Z., Meehan, L.J., Voytas, D.F. and Rodermel, S.R. (1994) Nuclear-organelle interactions: the immutans variegation mutant of Arabidopsis is plastid autonomous and impaired in carotenoid biosynthesis. Plant J. 6, 161-175.
Yamamoto, Y.T., Conkling, M.A., Sussex, I.M. and Irish, V.F. (1995) An Arabidopsis cDNA related to animal phosphoinositide-specific phospholipase C genes. Plant Physiol. 107, 1027-1030.