The Zon Lab Guide to Positional Cloning in the Zebrafish

Introduction
Mapping Strains
Families and Genetic Markers
Crosses for Line Maintenance and Mapping
Choosing your grandparents and parents for better resolution mapping
Microsatellite Markers - Agarose Scorable
Mapping Genes
Low Resolution Mapping
Half-tetrad analysis
Scanning for Linkage
Preparation of the DNA
Low Resolution Scanning
Intermediate Resolution Mapping
Fish Husbandry
Allele Specific Oligonucleotide Hybridization
Tempo
High Resolution Mapping
AFLP
Three-allele systems vs. four-allele systems
Collecting Mutant Embryos
Chromosomal Walking
Screening the PAC or BAC Libraries
Care of BAC and PAC Libraries
Addendum I (Protocol for choosing oligos for hybridization)
Addendum II (BAC/PAC Filter Hybridization with oligos)
PCR screening of the pooled PAC and BAC libraries
New BAC Libraries
Walking and Establishing Contigs by sequencing PACs

Introduction. The process of positional cloning involves unique issues for each organism. Success is usually based on experience. Because my laboratory has now used positional cloning to isolate more than 10 genes, we have accumulated significant experience in the field. We have put this experience into a guide that we hope will be helpful to the zebrafish community. This is not a manuscript, and the text at this point is not referenced. It is simply a guide for doing positional cloning in the zebrafish.

Mapping Strains. Many of the problems in positional cloning arise from genetic polymorphisms within the different strains of zebrafish. We have typically utilized the AB or TU (Tübingen) strains for mutagenesis screens in my laboratory. A mutant should be maintained on the laboratory strain of zebrafish on which it was created. These and all other widely available zebrafish strains are not entirely inbred. Genetic polymorphisms may be present within a given family of fish. One cannot assume that a family of zebrafish is as isogenic as inbred strains of mice.
Therefore it is important to prepare grandparent and parent DNA by tail clipping. This will help examine polymorphisms that subsequently become important in the genetic mapping of a mutant gene. There are two strains of zebrafish, WIK and SJD, that are polymorphic in respect to AB and TU and therefore can be used for genetic mapping. Of note, the strains can be interchanged; i.e. mutations can be created on wik or SJD, and mapping done with AB. For our mutations on the AB background, a heterozygote carrier is mated to WIK and a mapping family is generated. Families and Genetic Markers. Once map crosses have been created and heterozygote mapping pairs have been identified from these tanks, it is important to tail clip and store DNA from these mapping pairs and from the parents of these fish (example: the AB het and wik wild-type that created the map cross (the grandparents) and the AB/wik heterozygotes (the parents)). This parent and grandparent DNA is helpful in analyzing polymorphisms in subsequent mapping. Once flanking markers have been found, it is helpful to test the parents and the grandparents to determine which families are polymorphic and segregate the marker in an easily interpretable manner. In other words, it is best to collect embryos with the best allele systems for mapping. Once this has been determined, the mapping heterozygotes that are polymorphic can be selectively used to create additional backcrosses that will also be polymorphic for the flanking markers. Crosses for Line Maintenance and Mapping: For the purposes of this discussion, mutagenesis was performed with AB, and the polymorphic strain to be used for mapping is WIK. Also assume the mutation of interest is embryonic lethal, and lines must be maintained as heterozygotes. 
Definitions:
Incross: sibling intercross
Outcross: AB(mut)/AB heterozygote X AB wild-type
Mapcross: AB(mut)/AB heterozygote X WIK wild-type
Backcross: AB(mut)/WIK heterozygote X WIK wild-type

For long-term line maintenance, we keep the mutation on the same strain in which mutagenesis was originally performed (in this case AB) so as not to jeopardize future mapping efforts. One could perform either an AB(mut)/AB heterozygote incross or an outcross. Outcrosses are generally preferable because they help dilute out other recessive mutations acquired during the ENU mutagenesis that are not linked to the phenotype of interest. Using mapcrosses or backcrosses for line maintenance is problematic because recombination can result in loss of distant polymorphisms that could be critical for future low-resolution mapping.

For mapping, polymorphic hybrid strains must be created. We perform a mapcross. The offspring, half of which will be heterozygotes, are raised. Heterozygotes are identified by multiple random incrosses, and identified heterozygote pairs are set aside for ongoing incross embryo collection. If not enough embryos can be collected in that generation and close flanking markers are not available, we go back to the newest generation of the pure original strain [AB(mut)/AB] and perform another mapcross to repeat the process. We do NOT raise embryos from an AB(mut)/WIK incross because recombination in this generation renders the next generation of embryos useless for mapping.

If, however, close markers flanking the mutation are available, a backcross can be performed. Particularly if the markers are agarose scorable, there is the potential to identify large numbers of heterozygotes by tail clipping instead of mating. Using a high throughput PCR format, we have identified up to 50 heterozygote pairs in a week.
It is critical to tail clip the parents used in the backcross as well as the grandparents used for the mapcross so the allele system can be accurately followed in the next generation. To facilitate this process, it is useful for a lab to have several (about 10) tail clipped wild-type fish isolated (in this example WIK). Before the backcross is performed, the tail clipped WIK wild-types are screened along with tail clip DNA from the AB(mut)/WIK heterozygote to establish which wild-type has the best allele system for following the mutation. Those wild-types are then used for the backcross.

When identifying heterozygotes by tail clip DNA, it is important to remember that there will be a defined error rate due to recombination. The magnitude of the error depends on the distance between the marker and the mutation. Recombination rates in males are about 10 fold lower than in females, so if flanking markers are still somewhat far from the mutation, one should consider doing a backcross in which the AB(mut)/WIK is male and the WIK wild-type is female.

Choosing your grandparents and parents for better high-resolution mapping. Usually this occurs after low-resolution mapping. For high-resolution mapping purposes we segregate individual WIK fish and genotype them to define the allele system that offers the best advantage for our high-resolution mapping.

Microsatellite Markers - Agarose Scorable. We have recently undertaken a large-scale approach to evaluate microsatellites available from the Fishman laboratory for their ability to be scored on an agarose gel. We have uncovered numerous markers and will soon publish a list of these markers to help in positional cloning. Through this analysis we realized that SJD fish are mostly isogenic; however, there are regions with polymorphisms within the strain. SJD allows easier mapping than WIK; however, the strain is very difficult to use, and cannot be propagated in our laboratory.
We have asked Steve Johnson to send us males and utilized these males by sacrificing them and in vitro fertilizing heterozygotes for creating mapping families. The wik strain works well, but is very polymorphic between individual fish (not inbred). We utilize WIK to do most of our mapping. We are in the process of developing the microsatellites for use on a capillary system, such as the ABI 3700 system. This should be very useful for high throughput mapping using different strains. MAPPING GENES Low Resolution Mapping. There are two preferred methods for low resolution mapping of a gene to a particular chromosome. The first method makes use of scanning microsatellite markers throughout the genome. The second method, called half-tetrad analysis, makes use of early pressure treated embryos to evaluate the mutated chromosome. Half tetrad analysis. Although the zebrafish is a diploid organism, haploids can live several days, and maternally homozygous diploid fish can be produced by applying early pressure (EP) to inhibit the second meiotic division after fertilization with UV-inactivated sperm. Analogous to the creation of the maternal diploids, diploid androgenotes can be obtained by UV inactivation of the egg, subsequent fertilization by normal sperm, and application of EP to inhibit the second meiotic division. The ability to create gynogenetic diploids allows the rapid assignment of a gene to a particular chromosome while obtaining information about its distance from the centromere. In order to genetically localize a mutation, female offspring of a mutant x AB (wild-type) outcross are squeezed and fertilized with sperm from a genetically unrelated (wild-type) male. F1 heterozygous mut/+ females are identified by random matings between the mut.AB F1 offspring. Obligate carriers should be detected at approximately a 25% rate. F1 mut/+ females are subsequently squeezed, and gynogenetically diploid embryos derived from them. 
The mutant-to-centromere distance is then estimated from the equation: distance in cM = 50 x (1 - (2 x number of mutants / total number of embryos)). Because the second meiotic division has been inhibited in creating gynogenetic diploids, the region of the chromosome between the centromere and the mutation cannot have recombined. Markers proximal to the mutation (so-called centromeric markers), when polymorphic between the background and wild-type strains, will necessarily be homozygous for the background strain in mutants and wild-type in unaffected embryos. Centromeric markers have been defined for all 25 zebrafish chromosomes, and thus chromosomal localization and distance from the centromere can be rapidly assigned.

In our early experience we used half-tetrad analysis. As we developed robotics in the laboratory, it became easier for us to develop the scanning method. Recently, with the advent of many polymorphic markers that are agarose scorable, we favor the scanning method.

Scanning for linkage. The first step in mapping a recessive mutation to a chromosome is the generation of mapping hybrids (e.g. AB*/WIK, where the asterisk denotes the mutant allele). To make this mapping cross, a heterozygous AB carrier (AB*/AB) is mated to a wild-type WIK fish and the resulting F1 generation is raised. Practically, we generate 2-4 map cross families with different WIK founders to ensure an informative allele system is obtained (as the WIK strain is not completely inbred). To identify heterozygous F1 individuals, these AB/WIK hybrids are mated to each other (in-crossed) and their clutches scored for the mutant phenotype. Once a pair of heterozygous hybrids is found, they are mated and their wild-type and mutant progeny are collected. In addition, we tail clip the mapping hybrids to obtain DNA for analysis of the alleles carried. 40 mutants and 40 wild-types are used for the initial low-resolution mapping. DNA is prepared from all 80 individuals to give a stock of 50 microliters.
Two mutant pools of 20 and two wild-type pools of 20 are made. To make the pools, 8 microliters of the stock DNA is taken from each of the 20 individuals (mutant embryos for the mutant pool or wild-type embryos for the wild-type pool) to give 160 microliters. The volume is then made up to 1.2 mL with water. The microsatellite markers are then used to scan the genome for linkage to the phenotype in bulk segregant analysis.

Preparation of the DNA. If the embryos have not already hatched, they need to be dechorionated before freezing and prepping. Single embryos are placed individually into wells of a 96 well plate. Any excess buffer should then be removed. The embryos can be stored either dry or in methanol, and should be kept at -20 °C. When prepping embryos, plates should be kept on ice unless otherwise noted. To prep the embryos, remove all methanol from the wells. All of the following incubation steps can be carried out in a PCR machine.
1. Add 50 µL of lysis buffer (recipe follows) to each well and incubate at 98 °C for 10 minutes to lyse cells. Quench on ice or with a 4 °C hold in the PCR machine.
2. Add 5 µL of Proteinase K (10 mg/mL stock) to remove proteins.
3. Incubate at 55 °C for at least 2 hours. This incubation can also go overnight. The longer the incubation, the cleaner the DNA tends to be.
4. Incubate at 98 °C for 10 minutes to destroy Proteinase K. Quench on ice (or with a 4 °C hold in the PCR machine).
5. Spin down lysed embryo debris at 4,000 rpm for 10 minutes.
6. Draw off supernatant into a clean 96 well plate.
7. Dilute as necessary.

Embryo Lysis Buffer: 1X PCR buffer made to 0.3% Tween 20 (10% stock) and 0.3% NP40 (10% stock).
For 10 mL:
10 mL PCR buffer (see below)
300 µL NP40, 10% stock
300 µL Tween 20, 10% stock

PCR Buffer: 10 mM Tris-HCl pH 8.3, 50 mM KCl
For 50 mL:
500 µL 1 M Tris, pH 8.3 (autoclaved)
2.5 mL 1 M KCl
47 mL sterile ddH2O

Low resolution scanning.
Bulk segregant analysis uses 239 agarose scorable microsatellites that are typed on a set of DNA samples from wild-type and mutant embryos. Each set contains two pools of twenty embryos each from wild-types and from mutants. PCR products are run on 3% agarose gels at 200 volts for two hours to separate bands. Most polymorphisms encountered thus far have been subtle, so it is better to err on the side of running the gels longer. Linkage is assumed when a band present in the wild-type pools is absent in the mutant pools. The mutant band(s) may also show a size shift when compared with the wild-types, which may also indicate linkage.

It is critical that the 80 embryos used at this stage, as well as the individuals used for the intermediate mapping (see below), come from the same family. Introducing individuals with a different set of alleles might lead to false positives: one might assume linkage to a particular chromosome based on the pattern obtained with the additional family. Occasionally we have not been able to map a mutation in one family and another family needs to be tried. Very frequently we find apparent linkage to three or so chromosomes, but only one of these is real. To determine which microsatellite is truly linked, each positive marker must be tested on individuals. By testing individuals, chromosomal linkage is confirmed and the distance between markers can be evaluated.

Intermediate Resolution Mapping. The purpose of intermediate resolution mapping is to position the gene between flanking markers that are scorable on an agarose gel. This allows us to do High Resolution Mapping with 1500 embryos with relative ease. This has not always been possible, but it is a goal that is worth striving for. We will subsequently collect 8 wild-types and 88 mutants for intermediate resolution mapping. It is desirable to have flanking markers that are <10 cM apart.
We scan microsatellites on the chromosome by ordering roughly six microsatellites on the chromosome arm. If these microsatellites are not polymorphic, we test another six markers until the mutation is linked to microsatellites on the chromosome arm. Based on recombination mapping, it is possible to define the flanking microsatellite markers. Markers that are far away from the mutation should yield more recombinants than markers that are close to the mutation. Markers that are on opposite sides of the mutation should give different sets of recombinants, and markers that are on the same side of the mutation should share recombinants. Using this, we can narrow down the region until we have markers that are 10 cM (or less) apart. When we are able to define microsatellite markers that are polymorphic on our agarose gel and are close enough to use as flanking markers, we set up mapping crosses with fish that carry this allele system. The flanking markers are then used in the High Resolution Mapping.

Fish Husbandry. The number of tanks needed to map a mutant varies with the sex ratio in the tanks and with the ease of scoring the phenotype. We will typically generate 8 mapcross or backcross tanks for a genetic mapping. Ultimately we will destroy most of these fish, but the goal is to have at least 5 pairs of fish with an advantageous allele system so that genetic mapping can go very quickly.

Allele Specific Oligonucleotide Hybridization. Polymorphisms between mapping strains for markers recovered from a chromosomal walk, such as the ends of BAC, PAC, and YAC clones, can be converted into an allele specific oligonucleotide (ASO) hybridization assay for your meiotic map. The ASO assay (ref: Wood WI, et al. PNAS 1985, 82:1585-1588; Farr CJ, et al. PNAS 1988, 85:1629-1633) is capable of detecting single nucleotide polymorphic differences between mapping strains.
The basis for the assay is PCR amplification from genomic DNA of the corresponding BAC, PAC or YAC end, dotting the PCR product onto nylon membrane, differential hybridization with a 5'-labeled 19-mer containing the single nucleotide difference present in your mapping strains, and autoradiography of the washed nylon filter. All hybridizations of the 5'-labeled ASO primers are performed in tetramethylammonium (TMA) buffer (Farr CJ et al., above) at 45 °C, and washed at 55-56 °C in TMA for any 19-mer, irrespective of GC content.

To use this assay, one needs to know the sequences of the corresponding BAC, PAC or YAC end from your two mapping strains in order to design ASO primers. One can use the PCR primers to amplify the fragment from your homozygous mutant and homozygous wild-type embryos (these have been ascertained by other z-markers). The PCR fragments are then directly sequenced or subcloned to determine their single nucleotide differences, which are used to design your ASO 19-mers.

The PCR fragments are immobilized onto nylon membrane in duplicate. One membrane is hybridized with the ASO primer for one strain, and the duplicate membrane is hybridized with the second ASO in a separate container. After hybridization at 42 °C for a minimum of 4 hrs, the filters are washed in 2X SSC + 0.1% SDS at room temperature for 20 minutes, then in TMA wash buffer at 55-56 °C for specificity. The filters are then exposed for autoradiography.

Once the ASO has been shown to work in your mapping cross, you can genotype large numbers of embryos in a high-throughput manner. Since the dot blot manifold can accommodate 96 samples, you can array 96 PCR products from individual embryos per filter. Thus ~1000 embryos could easily be genotyped with 10-11 filters in one experiment using this assay.

Supplies needed for the ASO assay:
a. ASO: design an oligonucleotide 19 nucleotides long with the single nucleotide difference between your mapping strains located at the center of the 19-mer to maximize the Tm difference.
b. Dot/Slot blot manifold (Schleicher & Schuell or Biorad): facilitates application of PCR products onto nylon membrane.
c. Tetramethylammonium chloride (TMA) solution: the key component of the ASO assay, which equalizes the Tm differences due to the GC content of the 19-mer, so that duplex stability is a function of oligo length and of exact match to the target DNA. A premade 5 M stock solution is available from Sigma (Cat. #T3411).
d. [γ-32P]ATP of 3000-6000 Ci/mmol specific activity.

Tempo. While one is working out the flanking polymorphisms and genotyping, it is important to continue to collect mutant embryos. We find that the number of embryos required for positional cloning is between 1500 and 2000. Assuming an interval of 600 kb per centimorgan (cM) and a meiotic frequency of 1/embryo (this is true for haploid individuals; diploids have an average of 1.3 to 1.5 meioses/individual: one from the mother and 0.3 to 0.5 from the father), this will give a resolution of close to 30 kb/meiosis. This resolution allows positioning of the mutant gene on a BAC or PAC clone.

The characterization of markers between your flanking polymorphic markers is important. The number of recombinants obtained with each of the flanking markers in the Intermediate Resolution Mapping is taken as a guide to estimate the position of the mutated gene (the flanking markers should be placed on the Radiation Hybrid (RH) map if this information is not already available). The RH map contains a multitude of expressed sequence tags (ESTs), which can be used as markers on the walk toward the mutated gene. Typically we will pick three to four ESTs in proximity to the estimated gene location, and first check whether the 3' UTRs of these ESTs are polymorphic in our mapcross. Frequently the primers used to map the ESTs are indicated on the WashU Website.
If this is not the case, make sure to order SSCP primers outside of the coding region. In our experience one out of four ESTs is polymorphic in a given mapcross. If you do not find a polymorphism with the chosen ESTs, you can either test more ESTs for polymorphism, or you can use the EST primers to isolate PACs. Sequencing the ends of PACs will increase the likelihood of identifying polymorphic markers. Once polymorphic markers are identified, the panel of recombinants identified at each of the flanking markers will be tested.

Example: 30 recombinants were identified with the left flanking marker, and 35 with the right flanking marker. If the denominator is 1500 embryos, the distance is estimated to be 2 cM from the left flanking marker and 2.3 cM from the right flanking marker (too far to initiate a walk). Two ESTs are identified in the estimated region of the mutated gene. Both are polymorphic. Testing the panels of recombinants from either side with these markers, the following picture arises:

                                 Flanking                         Flanking
                                 marker Left   EST1  GENE  EST2   marker Right
Recombinants (left panel)  ->        30          4          0         0
Recombinants (right panel) <-         0          0         16        35

The mutated gene is therefore situated between EST1 and EST2. The estimated distance from the closer marker, EST1, is approximately 0.27 cM (4/1500). This distance is sufficiently small to initiate a PAC walk.

High Resolution Mapping. We have traditionally collected between 1500 and 2000 mutant embryos in an effort to positionally clone. In high resolution mapping, these mutant embryos, arrayed in a 384 well format, are tested with the flanking markers found in the low and intermediate stages of mapping. It is critical that every recombination event is scored in this step, which is made possible by the use of a two-allele system. If a three or four-allele system is used for mapping, some recombination events will be missed.
Therefore, it is recommended that the families used for collecting the embryos be chosen according to the most useful allele system as well as the correct polymorphic mapping strain. Furthermore, it is advantageous to limit the number of families used to collect the mutant embryos. Although collecting embryos from only two or three pairs of fish may lengthen the time needed to reach the target number of embryos, it will simplify further steps in positional cloning. AFLP. AFLP can be used to isolate markers very close to a gene. As the map has become more dense with markers, we have not used this technique as much. See Appendix for details. Three-allele systems vs. four-allele systems. Typing map cross parents is essential to evade the allele traps encountered by this type of bulk segregant analysis. Once flanking markers have been found map cross parents must be tested for the proper allele segregation. Three-allele systems are quite common in the AB/Wik crosses used in our lab. Often we see that the flanking markers do not segregate the same between different families. This becomes a problem with the high resolution scan that includes individuals from all map crosses. One of the wild-type alleles sometimes has the tendency to migrate the same as the mutant alleles in the agarose gel, so when a high resolution scan is performed recombinants are missed because they look like a mutant embryo. A simple way to avoid these bad-allele systems is to type all the parents and grandparents before collecting all 1500 embryos. While the low and intermediate stages of mapping are being performed embryos can be collected. Just be prepared to remove those crosses from the mapping collection that have bad-allele migration. Four-allele systems can be just as confusing if not fully investigated before the high resolution phase. Generally, four alleles can be tracked with ease. However, problems can arise when an AB allele segregates with the mutant Wik allele. 
Heterozygotes would be counted as homozygous mutants and not recombinants. This is the same situation seen with three-allele systems.

[Figure: gel schematics of a good three-allele system, a bad three-allele system, and a good four-allele system, showing the AB and WIK alleles carried by the grandparents (GP1, GP2), the parents (P1 to P4) and the mutants.]

Collecting Mutant Embryos. Upon the conclusion of high resolution mapping, a number of recombinants have been identified. These recombinants are re-arrayed on a new plate to create the recombinant panel. During this process, it is advisable to continue collecting mutant embryos from the mapping strain beyond the initial 2000 since more embryos may be required later in the process. The recombinant panel is now used in positional cloning, with the number of recombination events decreasing as the mutation is neared.

Chromosomal Walking. After collecting 1500 to 2000 mutant embryos one is ready to do chromosomal walking. You should begin with flanking microsatellite markers that are agarose scorable and linked to your gene. The walk will start from an internal marker between two flanking polymorphic microsatellites. The internal marker is a 24-mer oligonucleotide sequence from either the 3' UTR of an EST or from a PAC clone end, which was screened with one of the microsatellites or ESTs. Using an internal marker, orientation of the walk relative to the gene is established by studying meiotic recombination in F2 embryos. Using the marker that is closest to the gene, a large-insert genomic library is rescreened, the new fragment "pulled" is sequenced from both ends, and new oligonucleotides are designed and tested on the F2 meiotic recombinants. These markers are used to generate new markers (probes) from the ends of the clones, which are used to rescreen one of the large-insert genomic libraries, and thereby a "walk" is established.
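The bookkeeping behind orienting a walk can be sketched in a few lines of Python. This is a minimal sketch and not part of our protocol: the function names and marker names are hypothetical, the recombinant counts are illustrative, and each mutant embryo is treated as one meiosis (the same simplification used for the 30/1500 = 2 cM estimate in the Tempo section).

```python
def distance_cm(recombinants, meioses):
    """Estimate genetic distance in cM as percent recombination."""
    if meioses <= 0:
        raise ValueError("meioses must be positive")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    return 100.0 * recombinants / meioses


def closest_marker(recombinants_by_marker):
    """Return the name of the marker with the fewest recombinants."""
    return min(recombinants_by_marker, key=recombinants_by_marker.get)


def walk_approaches_gene(counts_in_walk_order):
    """True if recombinant counts never rise from one walk step to the next."""
    pairs = zip(counts_in_walk_order, counts_in_walk_order[1:])
    return all(later <= earlier for earlier, later in pairs)


if __name__ == "__main__":
    meioses = 1500  # mutant embryos scored; assumption: one meiosis per embryo
    # Hypothetical markers listed in the order they were isolated on the walk.
    walk = {"marker_1": 10, "marker_2": 4, "clone_end_a": 3, "clone_end_b": 2}
    for name, count in walk.items():
        print(f"{name}: {count} recombinants, {distance_cm(count, meioses):.2f} cM")
    print("closest marker:", closest_marker(walk))
    print("approaching gene:", walk_approaches_gene(list(walk.values())))
```

The monotonic check only makes sense while all markers lie on the same side of the gene and are scored on the same recombinant panel; once a new marker shows zero recombinants on that panel, the other flank's panel has to be consulted to see whether the gene has been crossed.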
When starting a chromosomal walk, choosing a narrower genetic interval will facilitate positionally cloning the disrupted gene of interest.

[Figure: first steps in a chromosomal walk. A fragment from the large-insert genomic library carries marker #1 (10 recombinants) and marker #2 (4 recombinants). A second probe, screened with marker #2, pulls a fragment from the rescreened large-insert genomic library with 3 and 2 recombinants, indicating the direction of the walk toward the gene.]

Screening the PAC or BAC Libraries. There are two methods for screening the PAC and BAC libraries. The first method is to use a hybridization strategy. We have used the OLIGO 6 prediction program to make two 30-mers (see addendum I). The oligos are individually hybridized to filters. The availability of "double-positives" (positive with both oligos) is an advantage for isolating true positives. We find that oligo hybridization (addendum II) is very reproducible and can isolate about 5 or more clones per hybridization. An alternative strategy is to use overgo oligonucleotides. Our laboratory is not yet invested in this strategy.

The PAC library was originally isolated from AB zebrafish red cells. The PAC library has an insert size of roughly 100 kb and encompasses 5x coverage, or 250 384-well plates. The BAC library is made from a single AB fish and has clones that are only 82 kb on average. Both libraries are assembled in a similar format. We have both libraries available in-house, which may not be the case for other investigators. The BAC and PAC libraries are available from Incyte Genomics and the RZPD. Filters can be obtained from all those sources at a price.

Care of BAC and PAC Libraries. Condensation and cross contamination are sources of problems for maintenance of the glycerol stocks of the PAC and BAC libraries. To reduce cross contamination and maintain viability of the cultures, the 384 well plates are handled carefully to limit defrosting.
Plates are removed from -80 °C storage and allowed to warm at room temperature for approximately 5 minutes. A sterile pipet tip is used to remove a chip of the frozen culture, which is then streaked on an agar plate or used to inoculate a broth culture. The PAC clones are kanamycin resistant (25 µg/ml) and the BAC clones are chloramphenicol resistant (34 µg/ml). The inner sides of the lids are wiped off with ethanol if necessary and the plates are returned to storage. This is less of a concern when plastic sealing film (Marsh AB-0580) is used on plates that are accessed frequently.

Addendum I (Protocol for choosing oligos for hybridization)
1) Be certain there is no polylinker in your sequence. You can do this by a) recognition, b) blasting at http://www.ncbi.nlm.nih.gov/blast/blast.cgi?Jform=1 vs the vector database, or c) placing the sequence in either Oligo 6 or DNA Strider 1.2 or 1.3 and seeing if you have a bunch of restriction sites piled on one another.
2) Take the vectorless sequence and go to the NCBI "blast site" to ensure you do not have a repeat in your sequence: http://www.ncbi.nlm.nih.gov/blast/blast.cgi?Jform=1 and do the following blasts:
a) blastn vs nr: compares your DNA sequence against the non-redundant GenBank sequences. It can find high-copy repeats, e.g. SINE/LINE/mermaid. Anything that shows homology over a significant area should be noted and no 30mer made to that area.
b) blastx vs nr: compares your translated DNA sequence in all reading frames against the non-redundant protein sequences. It MAY find repeats (e.g. LTRs of retroviruses) but MAY also find that you have an exon at the end of your PAC/BAC/genomic clone. It may bum you out that you missed a gene.
c) blastn vs dbest: MOST IMPORTANT blast. This one will find MOST repeats. What you are looking for is any number of "hits" with ZF (especially) sequences from the EST project.
3) Either write down the base numbers of the sequences you find to be repetitive (find this best) OR remove them manually--making 30mers to them may earn you Bruce's ire by toasting 2 sets of expensive blots. 4) Go to Oligo 6.x-a) open a new file and paste your sequence, "accept the sequence" using the upper left "accept/discard" line or save it to move on to the next screen. b) Go to the top pull-down menus and open "Search", select "for primers and probes" c) Place an "x" in the box labeled "hybridization primers". d) Click "parameters" and at the top pick "high" . e) Under nucleotide length type in "30", then hit ok. f) IF you want to remove some 5' or 3' sequence from consideration because it is vector or repeat you can click on the "search ranges" button and give the range to search. g) At the main "for primers and probe" box, hit "ok". h) You will get a "search status" box with numbers flying by which eventually stops. i) click "ok" and you'll see a list of the acceptable primers. 5) You should get a list of acceptable primers which do not have a hairpin loop--making primers with significant hairpins is the best way to get no signal in a 30mer hyb, but MAY form dimers. Note dimer formation is not as important when designing them for hybing than when using them for PCR. a) click on the "sort by Tm" button and single click on the first one. b) Go to the Melting Temperature" plot screen by just clicking on it and choose that primer as either the upper or lower primer. c) I check for dimer formation anyway by going to the Analyze pull-down menu and then choosing [lower or upper] primer. I then pick the oligos with LEAST negative (closest to zero) deltaG dimer formation (MOre negative the deltaG is, more stable the dimer is). d) If you find they are all really negative (>6 or so) I look manually for 30mers which can be shown by someoone who has done it before. 
**Despite this, 30mers with very negative deltaG of dimer formation (-16 in one instance when nothing else was available) still worked well.

Addendum II (BAC/PAC Filter Hybridization with oligos)

For hybridizations using oligos, we use two 30mers and probe a separate set of filters with each. Positives are then identified by aligning the films and choosing the clones that appear on both.

Day 1

• (hyb oven with roller bottles) Wet the filters in 2X SSC, sandwiching them in between nylon mesh equal in size to the filters. After all mesh and filters are stacked (we have done 9 in one bottle), roll them up and place in the bottle. Be aware that the orientation of the rolls in the bottle versus the direction the bottle rotates is important: if it is incorrect your filters will tightly curl up.
• Pour out all of the 2X SSC and add pre-hyb solution, 20 ml per set in a roller bottle.

pre-hyb solution            in 20 ml
6X SSC                      6 ml 20X SSC
0.5% SDS                    500 µl 20% SDS
700 µg/ml Salmon Sperm DNA  2.8 ml 10 mg/ml ssDNA
5X Denhardt's soln.         2 ml 50X Denhardt's
                            H2O to 20 ml

Oligo Hybridization

• Pre-hyb at 60˚C for ~2-4 hours (may go overnight).
• While pre-hybing, kinase the oligo(s):

500 ng of oligo   1 µl
10X PNK Buffer    2 µl
γ-32P ATP         5 µl
PNK               1 µl

  - flick tube to mix, pulse down in microcentrifuge and incubate at 37˚C for at least 1 hour (may leave overnight)
  - for each oligo, spin down a G-25 Sephadex column 2X to remove column buffer (2 min at 2000)
  - add 30 µl of H2O to each oligo (final volume of 50 µl)
  - add oligo to column, spin at 2000 for 5 min and collect eluent
  - take 1 µl, put it into a scintillation vial and get a count (cpm should be 500,000 cpm/ml of hyb for good hybridization). A quick and dirty test is to simply hold the entire tube in front of a geiger counter: if the needle goes off scale at the 100X scale, the oligo is sufficiently labeled.
• Dump out all of the pre-hyb solution and add hyb solution.

hyb solution                in 20 ml
6X SSC                      6 ml 20X SSC
0.5% SDS                    500 µl 20% SDS
100 µg/ml calf thymus DNA   400 µl ctDNA (mg/ml stock)
                            H2O to 20 ml

• Get hybridization solution to temperature, then add probe. Hybridize overnight.

Day 2

• Remove hyb solution (put down extra diapers as it splatters and gets everything hot).
• To wash (Oligo): 3X 15 min in 5X SSC/0.1% SDS at RT. The initial wash is done with the filters in the bottle. Subsequent washes are done in a glass tray. If necessary, wash 1X at 42˚C.
• Blot each filter on Whatman paper to remove excess liquid.
• Cover each filter in Saran wrap, leaving 1 cm+ overhang on all sides, and fold the excess around to the side without writing.
• Place orienting stickers on the overhanging Saran wrap, not on the filter, as the grids run all of the way to the edge.
• Expose film with the writing side away from the film and intensifier. Fold the bottom right side of the film for orientation.
• Expose overnight at -80˚C.
• Develop normally.

PCR screening of the pooled PAC and BAC libraries.

In addition to screening the PAC and BAC libraries by hybridizing to filters, we have found it useful to screen a pooled library by PCR. It is often helpful to use two separate sets of 20mer oligos to screen the pools in order to confirm positives and reduce the number of false positives. There are 270 plates (384 well) in the PAC library. All of the wells from each plate are combined, giving 270 plate pools. The 33 superpools are created by pooling either 8 (superpools 1-27) or 9 (superpools 28-33) plate pools. The first step is to PCR screen the 33 superpools. It is good to include a positive control (such as zebrafish genomic DNA) and a negative control (either random plasmid or water). Good positives should be strong bands that preferably amplify with more than one primer pair. Once a superpool positive has been found, the next step is to screen the plate pools and row/column DNA. There are six 384-well plates that are divided into 33 sections corresponding to the 33 superpools. Each section contains the 8 or 9 plate pools that correspond to the superpool, plus the row and column DNA.
There are 16 row DNA pools (for rows A-P) and 24 column DNA pools. For example, one well will contain the pooled row A wells from all 8 or 9 plates in the superpool. The 48 or 49 wells in the section that corresponds to the positive superpool should be screened by PCR. This section should be diluted 1:30 in TE before use. This PCR should yield three positive wells: one which corresponds to a plate pool (giving you the plate number), one which corresponds to a row pool (giving you the row letter) and one which corresponds to a column pool (giving you the column number). An example is shown below:

     1         2           3           4          5           6
A    plate 1   pp1 row A   pp1 row I   column 1   column 9    column 17
B    plate 2   pp1 row B   pp1 row J   column 2   column 10   column 18
C    plate 3   pp1 row C   pp1 row K   column 3   column 11   column 19
D    plate 4   pp1 row D   pp1 row L   column 4   column 12   column 20
E    plate 5   pp1 row E   pp1 row M   column 5   column 13   column 21
F    plate 6   pp1 row F   pp1 row N   column 6   column 14   column 22
G    plate 7   pp1 row G   pp1 row O   column 7   column 15   column 23
H    plate 8   pp1 row H   pp1 row P   column 8   column 16   column 24

This shows the upper left section of a 384 well plate containing the plate pools and row/column pools corresponding to superpool 1. The clone identified by the plate, row and column positives can then be retrieved from the PAC library and streaked out on LB/Kan plates. It is good to confirm that this is an overlapping clone by direct PCR of the clone or by sequencing with a sequence-specific primer.

New BAC Libraries.

A new BAC (bacterial artificial chromosome) library is being prepared by Dr. Chris Amemiya's laboratory. The library was created by isolation of DNA from blood pooled from 20 Tübingen fish, 10 male and 10 female. The BAC vector backbone, pBACe3.6, is far smaller than the PAC vector, which is an advantage when shotgun sequencing a single large-insert genomic clone (more clones will represent your insert and not vector).
In initial test ligations, the average insert size is ~150 kb, with a large number of 200 kb clones represented in the library. The eventual goal is to create 15X coverage of the zebrafish genome. Other libraries are being constructed by P. deJong and by R. Plasternack. Of note, these libraries will likely be the backbone of the Sanger Center's Zebrafish Genome sequencing project.

Walking and Establishing Contigs by sequencing PACs

A situation may occur where taking the next step in a walk becomes impossible because of a lack of polymorphic SSCP or SSLP markers. Assume you have isolated PAC 1, were able to sequence both ends, found polymorphic SSCP markers on both ends, and could orient your walk by identifying recombinants on both ends:

               T7                                          Sp6
               ______________________________________________ PAC 1
Recombinants:  10                                          12
<----------------------- gene

Given the above data you decide to walk to the left (T7 end). This is the correct decision, as this end has fewer recombinants and is therefore closer to the gene than the Sp6 end.

Step 1: You now want to isolate PACs with the primers that worked on the T7 end. You obtain PACs # 2, 3, 4, 5, and 6:

     T7                                 Sp6
     ______________________________________ PAC 1
     ______________________________________ PAC 2
     ______________________________________ PAC 3
___________________________________________ PAC 4
     ______________________________________ PAC 5
     ______________________________________ PAC 6

Step 2: You are able to sequence the ends of all five PACs (although in reality you might not have obtained five PACs, and it is not always possible to obtain reliable sequence from all ends). Unfortunately none of the primers give you reliable SSCP polymorphisms in your cross (although this is quite rare if you really sequenced all 10 ends).

Step 3: A way out in this scenario is to take the forward (or reverse) primers from the T7 and Sp6 ends of PAC 2 (or 3, 4, 5 or 6), radiolabel them at the 5’ end and use them as priming oligos to sequence PAC 1.
Only one reaction will work (let’s say the Sp6 end of PAC 2 sequenced off PAC 1). You do not want to walk from the Sp6 end of PAC 2, because it points in the wrong direction!

          T7                                        Sp6
          ___________________________________________ PAC 1
T7                                   Sp6
______________________________________ PAC 2

You now know that the T7 end of PAC 2 points toward your gene (although you were unable to determine this with your recombinants).

Step 4: From this point you can proceed in two ways:

a) Sequence the remaining PACs (3, 4, 5, and 6) with the T7 end of PAC 2. Positive results here indicate that the sequenced PAC extends further toward the gene than PAC 2 (in our example this would be true for PAC 4). Negative results indicate that PAC 2 extends further toward the gene than the sequenced (negative) PACs (in our example PACs 3, 5, and 6). You decide to use PAC 4 for further walking. In order to determine the end of PAC 4 that is closer to the gene, use both ends and sequence off PAC 2 (or any other PAC). The end that gives you a negative result is the one that is closer to the gene. Continue as under 4b).

b) Use the end that is closer to the gene to isolate another set of PACs (see Step 1 above). Sequence the ends and attempt to generate polymorphic SSCP markers. If this is not possible, go to Step 2, etc. Depending on your estimated distance from your gene (see the recombinants and denominator, not shown, on PAC 1), you could decide to take a couple of steps before attempting to re-estimate the distance to your gene by testing your recombinants on polymorphic SSCP markers from the PAC ends.

NB.:

a) Instead of sequencing PAC 1 using PAC 2 primers as described above, you can also attempt to carry out PCR using the primers derived from the ends of PAC 2 and DNA from PAC 1 as template. This works in many cases, but it requires stringent controls (positive and negative controls), as PCR is not as specific as sequencing.
b) If a number of sequencing reactions are done simultaneously, you may consider running only the “G” (or A or T or C) lane, as knowing the exact sequence is not as important as ascertaining that the reaction worked.

AFLP

If there are no CAs or ESTs within 1 cM of a mutation, then AFLP is the best method for generating closer markers that can be used to initiate a walk to the gene of interest. In brief, amplified fragment length polymorphism (AFLP) is a robust and efficient method for selectively PCR amplifying genomic DNA restriction fragment length polymorphisms that exist between strains of zebrafish. We have had great success using AFLP to find markers that map within 0.3 cM of several mutant loci, but this process can take several months. Thus, AFLP is used to identify markers close to a mutation of interest only after first testing the relevant markers that have previously been placed on maps of the zebrafish genome. Note that AFLP requires specially prepared DNA samples that are cleaner than those we usually use for high throughput mapping. The methods we use for AFLP are available on the Zon lab web page and in the attached methods.
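
The distances quoted in this guide (1 cM, 0.3 cM) and the decision to walk from the PAC end with fewer recombinants both rest on simple recombinant arithmetic. The sketch below is only an illustration, not part of the lab's protocol. It assumes the usual small-distance approximation that map distance in cM is about 100 x recombinants / meioses scored. The function names and the denominator of 2000 meioses are hypothetical; the guide gives the recombinant counts for PAC 1 (10 and 12) but does not show the denominator.

```python
from typing import Mapping


def genetic_distance_cm(recombinants: int, meioses: int) -> float:
    """Approximate map distance in cM as the percentage of recombinant meioses.

    Assumption: distances are small enough that double crossovers can be ignored.
    """
    if meioses <= 0:
        raise ValueError("meioses must be positive")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must be between 0 and meioses")
    return 100.0 * recombinants / meioses


def meioses_per_recombinant(distance_cm: float) -> float:
    """Average number of meioses scored per expected recombinant at a given distance."""
    if distance_cm <= 0:
        raise ValueError("distance_cm must be positive")
    return 100.0 / distance_cm


def closer_end(recombinants_by_end: Mapping[str, int]) -> str:
    """Return the clone end with the fewest recombinants, i.e. the end nearer the gene."""
    return min(recombinants_by_end, key=lambda end: recombinants_by_end[end])


# Recombinant counts from the PAC 1 example; the denominator is a made-up value.
HYPOTHETICAL_MEIOSES = 2000
pac1_ends = {"T7": 10, "Sp6": 12}

for end, count in pac1_ends.items():
    distance = genetic_distance_cm(count, HYPOTHETICAL_MEIOSES)
    print(f"{end} end: {count} recombinants, about {distance:.2f} cM from the gene")

print("Walk from the", closer_end(pac1_ends), "end")
print("Meioses per expected recombinant at 0.3 cM:", round(meioses_per_recombinant(0.3)))
```

With these numbers the T7 end comes out closer to the gene, matching the decision in the walking example. The second function shows why tight markers need large panels: at 0.3 cM only about one meiosis in several hundred is recombinant.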