INVESTIGATION OF THE LEVELS OF SPERM-BINDING GENE EXPRESSION IN FROGS FROM THE GENERA XENOPUS AND LEPIDOBATRACHUS

Kyle Steven Jones
B.S., San Diego State University, 2007

THESIS

Submitted in partial satisfaction of the requirements for the degree of

MASTER OF SCIENCE
in
BIOLOGICAL SCIENCES (Molecular and Cellular Biology)
at
CALIFORNIA STATE UNIVERSITY, SACRAMENTO

SPRING 2011

A Thesis by Kyle Steven Jones

Approved by:
Thomas R. Peavy, Ph.D., Committee Chair
Nicholas Ewing, Ph.D., Second Reader
Enid T. Gonzalez, Ph.D., Third Reader

Student: Kyle Steven Jones

I certify that this student has met the requirements for format contained in the University format manual, and that this thesis is suitable for shelving in the Library and credit is to be awarded for the thesis.

Susanne Lindgren, Ph.D., Graduate Coordinator
Department of Biological Sciences

Abstract

Pre-zygotic blocks to fertilization create scenarios where two members of an interbreeding population are unable to conceive and produce offspring, resulting in reproductive isolation. This inability to naturally produce offspring could result from the failure of sperm to bind to the exterior of the egg. The mammalian oocyte is surrounded by an extracellular matrix termed the zona pellucida (vitelline envelope in amphibians) that houses glycoproteins (ZPC) that bind sperm in a species-specific manner to prevent hybridization.
Zebrafish have experienced numerous ZPC gene duplications, creating multiple copies within the genome, albeit non-functional with regard to sperm binding. On the other hand, mammals have only one ZPC gene copy, which codes for and functions as the primary sperm-binding protein. Evolutionarily between these two groups of taxa lie amphibians, in which ZPC is also a sperm-binding protein. Multiple gene copies have been found to be expressed in three species of frogs: Xenopus laevis, Xenopus borealis, and Lepidobatrachus laevis. The discovery of multiple ZPC genes has led to further questions as to why multiple copies are expressed and to what degree. Since determining the function of each ZPC gene would require considerable effort (purified proteins and sperm binding assays), the focus of these studies is on the expression level of each gene because it is likely to be informative as to the functional role the translated protein plays in fertilization. One possibility is that a single ZPC copy is expressed in higher amounts, likely rendering it the functional sperm-binding gene, while the remaining gene copies play structural roles within the vitelline envelope. Alternatively, all ZPC genes may be able to bind sperm, and expression of these ZPCs in the same egg would cause competition for sperm binding. Competition may have led to the evolution of binding affinity differences where the gene product with the greatest affinity for sperm will have a decided advantage for binding. However, the situation could be a combination of the first two scenarios, where multiple genes expressed in high amounts perform sperm binding while the remaining low-expressing gene copies perform alternative roles. My hypothesis is that Xenopus and Lepidobatrachus ZPC genes are unequally expressed within their respective ovaries and that orthologous genes show a similar pattern of expression.

Levels of gene expression were determined from ovary cDNA libraries using quantitative PCR (qPCR), which measures the expression of each ZPC gene as compared to a reference gene. The reference gene, in this case GAPDH, is used as a baseline to which all qPCR expression data are normalized. Using qPCR, the ZPC gene expression profiles were determined for individuals from two closely related species, X. laevis and X. borealis, as well as the distant relative L. laevis, to investigate whether some gene copies are expressed in higher amounts relative to others. Through qPCR and ANOVA statistical hypothesis tests, ZPC genes were found to be unequally expressed in Xenopus and Lepidobatrachus. Furthermore, phylogenetic evidence suggests X. laevis and X. borealis ZPC orthologs have maintained a relatively similar level of expression after their ancestral split. Genes with comparable expression levels may share a common role, with predominantly expressed genes performing as functional sperm-binding proteins while low expressers may have lost their ability to bind sperm. While precise functional roles of each ZPC cannot be assigned from this work, the discovery of unequal expression provides preliminary information for future experimental design aimed at elucidating the functions of the different ZPC genes.

Thomas R. Peavy, Ph.D., Committee Chair

ACKNOWLEDGMENTS

Several individuals have played integral roles leading to the completion of this work. First, I would like to express my sincere thanks to my thesis advisor and mentor Dr. Tom Peavy. Since I first arrived in Sacramento, he has challenged me in both research and within the classroom by providing instruction and guidance, but still allowing me to earn the education I set out to attain. He has invested countless hours educating me not only on how to navigate my way through the graduate program, but how to be successful moving forward. Additionally, Dr. Enid Gonzalez and Dr.
Nick Ewing deserve a great deal of thanks for their participation on my advisory committee. Membership on the committee requires sacrifice and commitment on their end to provide me with feedback and direction on how to achieve the best possible outcome in the lab and inside the classroom. I would also like to thank the faculty and staff of Sacramento State, including Dr. Jamie Kneitel for lending me some of his wisdom on statistics.

This type of graduate program, where completion primarily depends on the amount of effort put forth, can at times seem like a never-ending quest. Shari has been by my side and always given me the proper perspective and support throughout my participation in this program, and I will always appreciate her for that. Additionally, I thank all of my life-long friends from Springtown who have maintained strong support of me in each of my educational journeys. Finally, I would like to recognize my parents for all their hard work to provide the tools and place me in a position to be successful in school, work, and life. For their enduring support, I am forever grateful.

TABLE OF CONTENTS

Acknowledgments
List of Tables
List of Figures
Chapter
1. INTRODUCTION
   Hypothesis
   Objectives
   Rationale for Experimental Design
2. METHODS
3. RESULTS
   Cloning and Sequencing GAPDH cDNAs
   qPCR Primer Design
   Optimization of Primer Annealing Temperatures
   Mixed Plasmid PCR Controls
   Primer Efficiency
   qPCR Assays to Determine Gene Expression
   ZPC Phylogenetic Analysis
4. DISCUSSION
References

LIST OF TABLES

Table 1. Summary of primers used to clone GAPDH from L. laevis and X. borealis
Table 2. Pairwise comparison of ZPC cDNA sequences
Table 3. Summary of ZPC and GAPDH qPCR primers for L. laevis, X. laevis, and X. borealis
Table 4. qPCR primer efficiency summary
Table 5. L. laevis ZPC and GAPDH CT values
Table 6. X. laevis ZPC and GAPDH CT values
Table 7. X. borealis ZPC and GAPDH CT values

LIST OF FIGURES

Figure 1. Steps leading to fertilization
Figure 2. Structure of the ZPC gene
Figure 3. Predicted 3D model of the frog Lepidobatrachus laevis ZP domain
Figure 4. MHC and ZP3/ZPC structure
Figure 5. 2D immunoblot of X. laevis and L. laevis vitelline envelopes (VE)
Figure 6. Quantitative PCR master mix set-up
Figure 7. GAPDH protein sequence alignment and primer design
Figure 8. Degenerate PCR of the L. laevis GAPDH cDNA
Figure 9. Sequence alignment of the middle portion of the L. laevis and X. laevis GAPDH cDNAs
Figure 10. Cloning the 3’ end of the L. laevis GAPDH cDNA sequence
Figure 11. Cloning the 5’ end of the L. laevis GAPDH cDNA sequence
Figure 12. Degenerate PCR of the X. borealis GAPDH cDNA
Figure 13. Sequence alignment of the middle portion of the X. borealis and X. laevis GAPDH cDNAs
Figure 14. Cloning the 3’ end of the X. borealis GAPDH cDNA sequence
Figure 15. Cloning the 5’ end of the X. borealis GAPDH cDNA sequence
Figure 16. Sequence alignment of the X. borealis and X. laevis GAPDH cDNAs
Figure 17. Sequence alignment of L. laevis ZPC cDNAs
Figure 18. Phylogenetic tree of frog ZPC cDNAs
Figure 19. Sequence alignment of X. laevis ZPC cDNAs
Figure 20. Sequence alignment of X. borealis ZPC cDNAs
Figure 21. L. laevis ZPC and GAPDH qPCR primer optimization
Figure 22. X. laevis ZPC and GAPDH qPCR primer optimization
Figure 23. X. borealis ZPC and GAPDH qPCR primer optimization
Figure 24. Mixed plasmid control PCR
Figure 25. Representative qPCR primer efficiency plot
Figure 26. L. laevis ZPC expression levels using qPCR
Figure 27. X. laevis ZPC expression levels using qPCR
Figure 28. X. borealis ZPC expression levels using qPCR
Figure 29. ZPC maximum likelihood phylogenetic tree
Figure 30. ZPC neighbor-joining phylogenetic tree

Chapter 1

INTRODUCTION

Fertilization involves the fusion of haploid male and female gametes, which leads to the formation of a new diploid organism. This fundamental process of generating new individuals is crucial to maintaining populations, preserving evolution, and avoiding species extinction. In order to regulate this interaction, mammalian eggs are surrounded by a thick extracellular matrix called the zona pellucida, while this same structure is termed the vitelline envelope in amphibians and the chorion in fish.
Glycoproteins found in these extracellular matrices serve as regulators of fertilization. In particular, one glycoprotein termed ZPC serves as the initial sperm-binding molecule, which binds to a complementary receptor found on the surface of sperm. This sperm-binding interaction resembles a “lock and key” mechanism to ensure species-specificity and prevent hybridization with gametes from other species. Mutations that disrupt this lock and key mechanism would generate instances where the gametes can no longer recognize and bind to each other, thus causing reproductive incompatibility between two individuals.

Infertility is a heterogeneous group of disorders that prevent otherwise healthy individuals from conceiving and producing offspring. Some factors contributing to failed reproductive attempts in males include germ cell arrest, sperm autoimmunity, and reduced sperm quality [1]. The reduction in quality not only describes the sperm’s inability to reach the egg, but also translates to sperm that are unable to bind with ZPC, the sperm-binding receptor present within the zona pellucida, and to form the lock and key relationship that triggers subsequent steps toward the generation of a new organism.

There is evidence to suggest that male and female reproductive proteins are co-evolving with each other such that receptors and ligands acquire mutations that maintain their compatibility for their binding interaction and thus enable fertilization [2]. However, mutations that occur in these reproductive proteins in individuals from other populations of the same species that do not have an opportunity to breed with each other may lead to instances of infertility (prezygotic block to fertilization) due to the disruption of the lock and key interaction. As a result, this prezygotic block to fertilization due to failure of sperm binding may lead to speciation.
Therefore, a more complete understanding of the early stages where sperm interact with glycoproteins at the surface of the zona pellucida should provide insight into the evolution of reproductive proteins and build upon current ideas of infertility and speciation.

Our current understanding is that species-specific binding of sperm to the egg triggers a series of carefully orchestrated events resulting in the formation of a new organism. The zona pellucida is composed of the glycoproteins ZPA, ZPB, and ZPC, but sperm must first bind to ZPC before subsequent steps to fertilization can transpire (Figure 1). Once a single sperm has successfully bound to ZPC present within the exterior coating of the egg, a vesicle at the tip of the sperm releases its contents through an exocytotic event called the acrosome reaction. This sperm receptor-ZPC binding interaction has been shown to require enough points of contact to create a high affinity bond before the acrosome reaction is possible [3,4]. The released acrosomal contents, consisting of proteases and glycosidases, facilitate enzymatic hydrolysis of the extracellular matrix, thereby enabling sperm to penetrate the thick barrier and gain access to the egg's plasma membrane. Unfortunately, the identity of the sperm binding receptor for ZPC is unclear at this point.

Figure 1. Steps leading to fertilization. The step-by-step mechanism of fertilization in the mouse begins with (1) the sperm locating the egg; (2) then species-specific binding of the sperm receptor to ZPC (circled); (3) followed by the triggering of the acrosome reaction; (4) subsequent penetration through the zona pellucida; and (5) finally fusion of the sperm with the egg plasma membrane [5].

The role of ZPC during fertilization is not complete after one sperm penetrates into the egg. Upon sperm fusion, the egg undergoes a series of events termed the block to polyspermy, which prevents subsequent sperm from penetrating the egg.
This block includes a fast but temporary depolarization of the membrane, which immediately and electrically prevents additional sperm from fusing [6,7]. A second block, slower but more permanent, entails an exocytotic release of the egg's cortical granules (vesicles that reside just below the plasma membrane surface), which contain a glycosidase and a protease that eliminate the sperm binding activity of ZPC [8]. Thus, prevention of subsequent sperm binding prevents polyspermic fertilization, which is very important in most species since there are deleterious consequences of having more than one sperm fuse with the egg pronucleus, such as abnormal development or embryo termination [8,9].

The focus of my studies is to further understand the role of the ZPC glycoprotein with respect to fertilization and speciation in vertebrates, since homologues of ZPC are found in the egg envelope from all vertebrates examined to date (e.g. amphibian vitelline envelope). Much of our knowledge about the structure and function of the zona pellucida genes, in particular ZPC, comes from studies in mice, but it is unclear how much is translatable to other species. For example, the sperm-binding capability of ZPC was demonstrated in mice using sperm binding assays to covalently-linked purified preparations of zona pellucida glycoproteins [10]. Furthermore, the location where acrosome-intact sperm bind on the ZPC glycoprotein was shown to be towards the C-terminus of ZPC when portions of the glycoprotein were tested [11,12]. These studies have been extended into humans by using baculovirus-expressed truncations of human ZPC in sperm binding assays, which demonstrated that the final 156 amino acids of the C-terminal region were the key region [13]. As for non-mammalian species, it has been shown that the ZPC glycoprotein found in the vitelline envelope of the frog Xenopus laevis also serves as the sperm binding protein [14,15,16].
However, fish ZPC genes do not serve a functional role in sperm binding since sperm bypass the comparable zona pellucida-like structure, termed the chorion, by entering through a small pore called the micropyle [17,18].

Although the sperm-binding location is of prime importance, ZPC proteins possess other regions that are essential for the proper folding, secretion, and incorporation of ZPC into the growing extracellular matrix during oogenesis (Figure 2). Specifically, the N-terminus of mouse ZPC contains a 22 amino acid signal peptide that targets the protein for the secretory pathway, whereas the X. laevis ZPC signal peptide is only slightly shorter at 21 amino acids [19,14]. Before secretion into the extracellular space, the signal peptide leads ZPC to the rough endoplasmic reticulum and Golgi apparatus, where it is post-translationally modified. After passage through the Golgi, glycosylated ZPC is tethered internally to the plasma membrane vesicle by hydrophobic residues comprising a transmembrane domain [14]. Mutations in the transmembrane domain do not inhibit the secretion of ZPC, but they completely abolish the assembly of the zona pellucida, making this region critical for incorporation of ZPC into the growing matrix [20].

Figure 2. Structure of the ZPC gene. The 5’ end codes for the signal peptide (white dots) followed by the ZP domain (diagonal stripes). The putative sperm binding activity (xxxxx) occurs after the ZP domain upstream of a furin cleavage site (checkers). Hydrophobic amino acids are coded for towards the 3’ end that comprises the transmembrane domain (horizontal stripes).

Upon fusion with the plasma membrane during secretion, ZPC becomes exposed to the egg’s exterior.
The tether connecting the transmembrane domain with the remaining portions of ZPC is cleaved upstream of the transmembrane domain (on the extracellular side) at the furin-like cleavage site (mouse Lys371 – Arg372) by the calcium-dependent serine endoprotease furin [21,14]. The cleavage from the transmembrane domain releases ZPC from the membrane, allowing for incorporation into the fibrillar structure through subunit interactions at the ZP domain [22].

The ZP domain is approximately 260 amino acids in length; it functions in the polymerization of the growing extracellular matrix and orients ZPC in a position where it is capable of binding sperm. Eight cysteine residues, highly conserved throughout evolution, form four disulfide bonds within the ZP domain which are critical in maintaining the structure and assembly of the zona pellucida (Figure 3). There are other proteins that possess a ZP domain, such as α-tectorin and uromodulin. These ZP domain containing proteins also make up extracellular matrices in other tissues (e.g. ear membranes or kidney tubule lining, respectively) but do not play a role in fertilization. For example, α-tectorin forms an extracellular matrix within the ear to aid in hearing [23,24]. The presence and location of cysteine residues within the ZP domain are exceedingly important, as demonstrated by point mutations that alter the cysteines within the α-tectorin ZP domain, resulting in the inability to form a matrix and subsequent hearing loss. In essence, the loss of ZP domain disulfide bonds alters the structure, leading to loss of function [24]. With respect to ZPC, mutations that alter the three-dimensional structure of the ZP domain, such as those affecting disulfide bonds, are likely to inhibit the assembly of the zona pellucida and contribute to instances of infertility. As for our current model of the structure of ZPC, only 102 amino acids from the N-terminal region of the ZP domain from mouse ZPC have ever been crystallized and structurally determined, so the exact folding pattern and spatial orientation of each ZPC residue is not known [25].

Figure 3. Predicted 3D model of the frog Lepidobatrachus laevis ZP domain. 102 amino acids of the predicted N-terminal region of the ZP domain from the frog L. laevis ZPC (Llzpc.4) modeled with Swiss-Model using the mouse ZPC crystal structure (PDB: 3D4G) as a template (resolution of 2.3 Å). Disulfide bonds are predicted to form between Cys138 and Cys158 and between Cys106 and Cys199 in the enlarged image. Arrows point toward the C-terminus.

In addition, glycosylation of ZPC during the secretion process is also critical for its sperm-binding activity. A combination of N- and O-linked oligosaccharides is added to the ZPC polypeptide within the Golgi, and it is thought that the O-linked sugars added near the C-terminal region are the most critical for species-specific sperm-binding activity [26]. In particular, terminal N-acetylglucosamine and fucose residues seem to be of the most importance since ZPC sperm-binding activity can be eliminated when these sugars are enzymatically removed [27]. As briefly mentioned with respect to the block to polyspermy, cortical granule glycosidases are released after a single sperm fuses with the egg, and it is the removal of these critical terminal sugars from their respective oligosaccharides that causes this loss of sperm binding [16]. In addition, there is an evolutionarily conserved N-linked site in all species examined (X. laevis amino acid site 113) comprised of simple mannose structures which may be important for the structure or function of the molecule [16]. Mutations at amino acid sites that alter glycosylation patterns would also affect sperm-binding activity.
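
As a rough illustration of how candidate N-linked sites, such as the conserved site noted above, might be located in a protein sequence, the sketch below scans for the general N-X-S/T consensus sequon (X being any residue except proline). The sequon rule comes from general glycobiology rather than from this work, the function name and the peptide are invented for illustration, and the presence of a sequon does not show that a site is actually glycosylated.

```python
import re

# Consensus sequon for N-linked glycosylation: Asn-X-Ser/Thr, X != Pro.
# This rule is an assumption drawn from general glycobiology, not from the thesis.
# A lookahead is used so that overlapping sequons are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_n_linked_sequons(protein: str) -> list[int]:
    """Return 1-based positions of Asn residues that begin an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


# Hypothetical toy peptide, NOT a real ZPC sequence.
toy_peptide = "MKNGSAANPTLLNNTS"
print(find_n_linked_sequons(toy_peptide))  # [3, 13, 14]
```

In the toy peptide the asparagine at position 8 is skipped because it is followed by proline. A substitution that creates or destroys such a motif is one simple way a mutation could alter a glycosylation pattern.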

With respect to mutations, there is evidence that mammalian ZPC sperm-binding proteins on the surface of the egg are experiencing amino acid substitutions at accelerated rates, especially in the C-terminal region where ZPC interacts with sperm [28]. This pattern of evolution, termed positive Darwinian selection, is detected by comparing the rate of nonsynonymous substitutions (amino acid replacing) to the rate of synonymous substitutions (silent sites); positive selection is inferred when the nonsynonymous rate exceeds the synonymous rate. In most proteins, amino acid sites do not usually exhibit such high rates of substitution but are instead subject to purifying selection (nonsynonymous rate < synonymous rate), where mutations that alter amino acids are removed from populations by selective pressures that preserve the original function of the protein. Many other sites do not experience these extremes (positive and negative selection) but are neutral instead (nonsynonymous rate = synonymous rate). Substitutions at these sites have neither beneficial nor detrimental effects. However, ZPC genes do show signs of high rates of amino acid substitutions in the sperm-binding region (positive selection), suggesting that changes in this region are beneficial to the reproductive success or survival of the species [28].

It is thought that interbreeding individuals in a population will co-evolve so as to have compatible mutations for successful fertilization, whereas different mutations are likely to be selected for in individuals from other populations, which could lead to incompatibility of the lock and key binding interaction (infertility) between populations. This mating incompatibility between individuals from separate populations could then lead to speciation. Speciation, the splitting of lineages, is an event that is often thought of as beneficial due to the preservation of mutations that may have enabled individuals to adapt to their local environment.
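
The three selection regimes described above can be summarized with a small sketch that compares a nonsynonymous rate to a synonymous rate. The function name, the example rates, and the tolerance used to call a site "neutral" are assumptions made purely for illustration; real analyses rely on statistical tests rather than a fixed cutoff.

```python
def classify_selection(dn: float, ds: float, tol: float = 0.05) -> str:
    """Classify a selection regime from nonsynonymous (dn) and synonymous (ds) rates.

    The tolerance `tol` around a ratio of 1 is an arbitrary illustrative
    choice, not a statistical test.
    """
    if ds <= 0:
        raise ValueError("synonymous rate must be positive")
    ratio = dn / ds
    if ratio > 1 + tol:
        return "positive"   # nonsynonymous rate > synonymous rate
    if ratio < 1 - tol:
        return "purifying"  # nonsynonymous rate < synonymous rate
    return "neutral"        # rates approximately equal


# Hypothetical rates, not measurements from any ZPC gene.
print(classify_selection(0.30, 0.10))  # positive
print(classify_selection(0.02, 0.20))  # purifying
print(classify_selection(0.10, 0.10))  # neutral
```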

In addition to ZPC, there are other mammalian proteins that have been found to be under the influence of positive selection that are recognition proteins (i.e. receptor-ligand interactions). The class I major histocompatibility complex (MHC, Figure 4) genes encode cell-surface glycoproteins present on all nucleated cells that bind non-self antigens to induce an immune response. Positive selection has been identified at sites within exons 2 and 3, which code for two variable α-domains of the extracellular antigen recognition site that bind foreign proteins and present them to cytotoxic T lymphocytes. The increase in variability resulting from rapid amino acid changes in the antigen recognition site enhances the possibility that foreign antigens will be recognized by the MHC molecules. Alterations of the amino acid sequence of the α-domains in the antigen recognition site of MHC proteins can provide a greater diversity of antigens to which they can bind and so may experience positive selection. Positive selection can be strong since the pathogens they encounter are also undergoing strong positive selection for mutations that enable them to evade the host [29,30].

Figure 4. MHC and ZP3/ZPC structure. Regions of mammalian MHC and ZP3/ZPC believed to be under positive Darwinian selection are indicated by black spheres (MHC) and black circles (ZP3/ZPC). The 3D structure shows an antigen bound to the antigen recognition site of MHC, which is thought to be the region experiencing rapid evolution. The primary structure diagram of ZP3/ZPC (the folded structure is mostly unknown) also indicates putative areas of selection, including the sperm-recognition site and regions of the N-terminus [28].

Furthermore, this phenomenon of rapidly evolving reproductive proteins has not only been observed in mammals, but has also been detected in lower organisms.
Although ZPC is not present in invertebrates, reproductive proteins from these taxa have been shown to be experiencing positive selection. Bindin, an acrosomal surface protein isolated from the sea urchin Strongylocentrotus purpuratus, enables sperm to penetrate through the jelly layers of the egg and bind species-specifically to proteins embedded within the vitelline envelope [31,32]. Substantial research using sea urchins as a model suggests that bindin sequences are highly divergent between closely related species and that this divergence appears to be due to positive selection. During intraspecific mating, the concentration of sperm needed for successful fertilization varies between males, suggesting that bindin has variable affinities for different receptor sequences within their population [33]. In addition to sea urchins, co-evolution between the male and female reproductive proteins in species of abalone has also been observed to be driven by positive selection [34,35,36].

Similar to the evolution of mammalian ZPCs, recent evidence from Dr. Peavy's laboratory suggests that amino acid sites within the C-terminal region of frog ZPC genes are also experiencing high rates of amino acid substitutions (positive selection). Not only is this occurring in the model organism X. laevis, but also in the close relative X. borealis, indicating that positive selection may affect the sperm-binding region of ZPC in frogs as well as in mammals. Moreover, work to assess the molecular evolution of ZPC genes in amphibians yielded the discovery that multiple copies of the ZPC gene were expressed within frog ovaries. In short, after cloning and sequencing ZPC genes from individual ovary cDNA libraries, five genes were found in X. borealis, four in X. laevis, and six in Lepidobatrachus laevis [37, Peavy unpublished].
The detection of multiple ZPC genes expressed within the ovaries of the frog species examined has led to questions as to whether each gene is expressed in equal amounts and whether each ZPC gene product has sperm-binding activity.

Interestingly, multiple ZPC genes have also been found to be expressed in the eggs of teleost fish, which include medaka, carp, and zebrafish. Since the ZPC gene has not been found to exist in invertebrate species such as the sea urchin and abalone, ZPC genes seem to have evolved after the split of vertebrate lineages. As for the model organism zebrafish, evidence points to upward of five different ZPC gene copies arranged in tandem repeats due to gene duplication events that are expressed specifically in the early-growing oocyte [38,17]. Furthermore, if duplications occurred before the split that gave rise to amphibians, fish and frogs would share common ZPC genes (orthologues and paralogues). However, as mentioned previously, fish ZPC genes do not serve a functional role in sperm binding since sperm bypass the chorion entirely by entering through a small pore called the micropyle [17,18]. Thus, the same selective forces do not act on the ZPC gene with regard to sperm binding activity in fish [39]. Since it has been shown that X. laevis ZPC does bind to sperm, sperm binding activity likely emerged after the lineage leading to amphibians split off from fish.

A shift in reproductive strategy from external to internal fertilization may have led to “birth-and-death” ZP gene evolution in the lineage leading to mammals. After the divergence from birds (300 million years ago), two ZP genes (ZPAX and ZPD) were silenced in mammals, possibly due to the accumulation of mutations from a relaxation of their functional constraints.
Furthermore, gene death is seemingly observed in the mouse, where ZPB exists in the genome as a nonfunctional pseudogene copy following the split from the rat lineage, and a different gene termed ZP1 has seemingly taken its place in the zona pellucida [40]. In contrast to the larger number of gene copies in fish and frogs, mammals express only one functional ZPC gene copy, which is highly homologous among mammalian species. Ancient mammalian copies may have been deleted over evolutionary time or may still exist within genomes as pseudogenes, relics of their ancestry. The "death" of a gene by becoming a pseudogene is due to one of the following events: mutations that cause a stop codon or reading frame shift in the coding region, or the accumulation of mutations in promoter regions that silences a once-expressed gene [40,41]. This mechanism of “gene death” has been extensively studied in MHC genes, and evidence suggests that ZPCs have followed a pattern of progressive gene loss in the lineage leading to mammals [30,42,43,44]. If this is the case, some ZPC gene copies expressed in amphibians may have lost the ability to bind sperm and may be heading down the path toward silencing. Additionally, amphibians and mammals require species-specific sperm-egg binding interactions at the initiation of fertilization, creating the need for high-affinity binding interactions, which seemingly has eliminated the need for a micropyle. The shift to sperm binding during the evolution from fish to amphibians would likely have placed selective pressure on sperm not only to be the first to reach the egg but also to bind tightly enough to ZPC proteins to initiate the acrosome reaction [2]. In the case of sperm-binding genes, sperm competition and sexual conflict may be forces driving positive selection, divergence, and possibly silencing of reproductive genes.
Males and females have opposing strategies when it comes to fertilization: males allocate large numbers of sperm to increase the chances of conception, while females seek to limit the rate of fertilization to avoid deleterious effects such as polyspermy. The success of each sperm depends on its ability to bind complementarily to ZPC with high enough affinity to induce the acrosome reaction, creating competition between sperm for the optimal binding sequence. However, subtle alterations to the ZPC protein structure may prove sufficient to slow the fertilization rate, forcing sperm to compete with one another. This creates a scenario whereby the sperm receptor co-evolves with mutations found in ZPC. Modes of sexual conflict, where fitness in one sex is decreased while fitness improves in the other, may explain the need for constant co-evolution (driven by positive selection) between the sexes [45]. By keeping sperm one step behind in the evolutionary race to fertilization, females are able to slow the process and minimize the deleterious effects of polyspermy.

Since it has been shown in mice and X. laevis that ZPC serves as the primary sperm-binding molecule, the expression of multiple ZPC gene products in frogs raises the question of which ones can serve as functional sperm-binding molecules [11,12,27]. One possibility is that there is only one functional copy while the remaining copies provide structural support to the vitelline envelope [46]. This functional copy would likely be expressed in the highest amount to ensure sperm binding. Alternatively, more than one ZPC gene may be able to bind to sperm, and the expression of multiple ZPC genes in the same egg would cause competition for sperm binding. Molecular evolution (positive selection) and competition may have led to binding affinity differences where the gene product with the greatest affinity for sperm will have a decided advantage for binding.
Competition among sperm would select for complementary mutations as they co-evolve, which could decrease the number of sperm that can actually bind the egg at one time. This could be beneficial to the egg since many sperm reach the egg vitelline envelope at the same time, and this rate-limiting binding step could help prevent polyspermic fertilization. In this case, it is likely that all ZPC genes would be expressed in relatively equal amounts to function as competitors. However, the scenario may be a combination in which more than one functional ZPC gene is expressed while the others have become non-functional with respect to sperm binding and potentially serve alternative roles, such as structural ones. Although it is possible that ZPC genes serving an alternative structural role are expressed in high amounts so as to add strength and stability to the vitelline envelope, this seems less likely since plenty of other ZP glycoproteins comprise the matrix, as mentioned previously. Furthermore, research examining the duplicated midbrain homeobox-1 and -2 (mbx1 and mbx2) genes in zebrafish provides evidence that the highly expressed gene (mbx1) is translated into a protein that performs the original function, whereas the low expresser (mbx2) displays a diminished or alternative role [46]. In any case, one of the first steps toward answering these functional questions is to determine the expression levels of the various ZPC gene products found in the eggs, which is the goal of this study.

Although there is substantial research and evidence elucidating the role of ZPC in anuran fertilization, the expression level of the different ZPC genes in individual frogs is not known. Xenopus is a prime organism for these studies since it is a commonly used vertebrate model for molecular, developmental, and fertilization research. As for a comparison, it would be best to study a frog species closely related to X. laevis (i.e. X.
borealis) and a more distantly related one (i.e. Lepidobatrachus laevis) so as to assess the short- and long-term evolutionary patterns. By examining ZPC expression in closely and distantly related frog species, comparisons can be made to determine whether or not orthologous ZPC genes show similar patterns of expression throughout the evolution of these vertebrate species.

X. borealis was chosen as the closely related species since it last shared a common ancestor with X. laevis approximately 10 mya [47,48,49]. Interestingly, the lineage leading to Xenopus frogs experienced a whole genome duplication event approximately 30 million years ago [47,48,49]. Since X. borealis was found to have 5 ZPC genes expressed in its ovary, there has to be at least one other gene locus, since a single locus can provide only 4 allelic copies (4n). Evidence suggests that duplicated genes (paralogs) are often subject to purifying selection soon after genome duplication and that gene duplicates may acquire mutations in regulatory regions which affect expression [50,18]. Our hypothesis is that the ZPC gene that functions as the primary sperm-binding protein in the closely related X. laevis and X. borealis should be expressed to a similar degree in both species and in higher amounts than its paralogs.

As for a distant relative to Xenopus, the South American frog Lepidobatrachus laevis from the Leptodactylidae family was chosen. Within the order Anura, frog species are classified as advanced or primitive and grouped into the suborders Neobatrachia and Archaeobatrachia, respectively. X. laevis and X. borealis are examples of the small percentage of frogs that belong to Archaeobatrachia, the more primitive classification. The vast majority (96%) of species, including L. laevis, are members of Neobatrachia. Based on molecular and fossil data, Xenopus and Lepidobatrachus last shared a common ancestor about 110-120 million years ago, making L. laevis a distant relative of Xenopus species [51,48].
As mentioned previously, studies from Dr. Peavy's lab have shown that a single L. laevis female expressed 6 different ZPC genes, yet this species is only diploid. In addition to the existence of multiple gene copies, data from 2D immunoblots suggest that the amount of translated ZPC protein is variable within the L. laevis vitelline envelope (Figure 5, Peavy unpublished results). Although the amount of translated protein does not always correlate with the level of gene expression, these immunoblotting data suggest that ZPC protein levels do indeed vary.

Figure 5. 2D immunoblot of X. laevis and L. laevis vitelline envelopes (VE). Isoelectric point is indicated on the top horizontal axis and molecular weights down the vertical axis. (a) Immunoblot of X. laevis VE using antisera to the entire X. laevis VE glycoproteins. (b) Immunoblot of X. laevis VE using X. laevis ZPC antisera. (c) Immunoblot of L. laevis VEs using antisera to the entire L. laevis VE. (d) Immunoblot of L. laevis VE using X. laevis ZPC antisera.

Hypothesis

This has led me to the following hypothesis statement: Xenopus laevis, Xenopus borealis and Lepidobatrachus laevis ZPC genes are unequally expressed within their respective ovaries, and orthologous ZPC genes have a similar pattern of expression.

Objectives

• Clone and sequence the GAPDH cDNA for use as a reference gene in qPCR
• Design qPCR primers to known ZPC sequences as well as to GAPDH
• Optimize conditions and parameters for qPCR
• Perform qPCR and analyze results to determine gene expression patterns

Rationale for Experimental Design

To support and build upon data suggesting unequal ZPC protein translation, the expression level of each X. laevis, X. borealis and L. laevis ZPC gene will be determined by quantitative PCR (qPCR). Most studies aiming to quantify gene expression turn to
qPCR due to its sensitivity and level of accuracy. Quantitative PCR measures gene expression by monitoring the generation of PCR products at each successive cycle. What sets this technology apart from a traditional PCR approach (an end-point assay) is the use of fluorescent reporters that detect the formation of products in early cycles, before the reaction has a chance to become variable with respect to the total amount of product made. The cycle at which the fluorescence intensity rises appreciably above background (the threshold level) is termed the CT value. The CT value approximates when the reaction enters the exponential phase of amplification. In addition, CT values reflect the amount of starting material (mRNA or cDNA in this case): the more template present, the fewer cycles are needed to reach the threshold [52].

Expression will be determined by first comparing ZPC CT values to the CT of a housekeeping gene. In this case, glyceraldehyde 3-phosphate dehydrogenase (GAPDH) was chosen to normalize ZPC expression data because it is a commonly used reference gene and is expressed in the early growing frog oocyte [53]. Since the sequence of GAPDH is known for X. laevis, it can be used to design primers for the amplification, cloning, and sequencing of the corresponding cDNA in X. borealis and L. laevis. Primers designed to amplify ZPC and GAPDH will then be used in qPCR reactions to determine the CT value for each gene. Once obtained, the CT values can be used to determine the expression level of each ZPC gene relative to the reference gene and compared to see whether similar patterns of expression are found.

Chapter 2

METHODS

Sequencing of ZPC cDNAs from Frog Ovary cDNA Libraries

Dr. Peavy prepared ovary cDNA libraries from single female individuals of the frog species Lepidobatrachus laevis, Xenopus laevis, and X. borealis. Ovary tissue was snap frozen in liquid nitrogen and subsequently processed for RNA purification (Qiagen RNA purification kit).
Ovary cDNA libraries were constructed using the Marathon cDNA Amplification Kit (Clontech). In addition, another L. laevis ovary cDNA library was made using the Lambda Uni-ZAP XR vector kit (Agilent Technologies, Santa Clara, CA). Degenerate primers designed to the conserved ZP domain of the ZPC genes were used to PCR amplify the targeted middle cDNA portion encoding the ZP domain. Subsequently, the 5' and 3' cDNA ends were PCR amplified using specific primers designed from the known sequence of the conserved domain in combination with the Marathon adaptor AP1 primer [37, Peavy unpublished]. PCR products were cloned into the pGEM-T Easy (Promega) vector, transformations were performed (using competent XL1-Blue E. coli cells and LB-ampicillin plates), and colonies were selected for plasmid purification (Qiagen Plasmid Mini Kit). After an EcoRI restriction digest was performed to identify clones with appropriately sized inserts, the PCR product inserts were sequenced commercially (Davis Sequencing, Davis, CA; Sequetech, Mountain View, CA). ZPC sequence information was edited to produce full length sequences using the Lasergene software package (DNASTAR Inc, Madison, WI). Glycerol stocks were made for all ZPC clones and stored at -80°C for future use.

PCR and Sequencing of Glyceraldehyde 3-phosphate dehydrogenase

Degenerate primers were designed based on the GAPDH amino acid sequences of an evolutionarily diverse group of organisms, aligned by the web-based multiple sequence alignment program CLUSTALW (http://www.ebi.ac.uk/Tools/msa/clustalw2/) and reformatted using the pretty printing BOXSHADE program (http://www.ch.embnet.org/software/BOX_form.html). The following taxa were used for the alignment: Mouse 1 (XP_001480018.1), Mouse 2 (NP_032110.1), Human (NP_002037), Chicken (NP_989636), Zebrafish (NP_001108586), X. laevis (NP_001080567.1), and X. tropicalis (NP_001004949.1).
Primers were designed to evolutionarily conserved regions to increase the chances that orthologous GAPDH sequences would be amplified from the X. borealis and L. laevis ovary cDNA libraries. One forward and two reverse degenerate primers were designed in order to perform nested PCR to reduce the amplification of nonspecific products. Custom-made degenerate primers (Sigma, Woodlands, TX) were diluted to 10 µM working aliquots. PCR reactions were performed using an aliquot of the L. laevis or X. borealis ovary cDNA library in combination with an internal degenerate primer (GAPDH.dR1) and the appropriate vector primer (T3 or AP1). The Advantage PCR Polymerase kit (Clontech, Mountain View, CA) was used for amplification since this enzyme mix has proofreading capability to reduce the nucleotide incorporation error rate. PCR was performed using a MyCycler thermocycler (Bio-Rad, Hercules, CA) under the following conditions: initial denature cycle at 95°C for 5 minutes; 35 cycles at 94°C for 45 seconds, 50°C for 45 seconds, and 72°C for 45 seconds; and a final hold cycle at 4°C. A negative control without template DNA was also included. PCR products were analyzed by loading ten microliters into 1.5% agarose-TAE gels, separating them electrophoretically, staining with ethidium bromide, and visualizing with an AlphaImager 2200 gel documentation system (Alpha Innotech, Santa Clara, CA).

For nested PCR, the initial PCR reaction was diluted 1:500 and served as the starting template for the subsequent PCR amplification. In addition, a set of nested internal degenerate primers (GAPDH.dF1 and GAPDH.dR2) was used in combination with the vector primers in a PCR reaction similar to that described above, except that a gradient of annealing temperatures was used (50.0°C, 52.9°C, and 56.2°C). PCR products were electrophoretically analyzed as described.
PCR products were subsequently purified using the QIAquick PCR Purification Kit (Qiagen), ligated into pGEM-T Easy, and transformed into E. coli. After purifying plasmid DNA and analyzing restriction digests, appropriately sized PCR products were commercially sequenced. This middle section of the putative GAPDH sequences was then used in BLAST searches (http://blast.ncbi.nlm.nih.gov/Blast.cgi) of the sequence databases to determine whether it was indeed the GAPDH sequence from each frog species. Primers were subsequently designed for amplification of the 5' and 3' ends of the L. laevis and X. borealis GAPDH cDNAs based on the sequences obtained above, using the Primer Select software program within the Lasergene software package. Once again, a nested PCR approach was undertaken using specific internal primers and the vector ends. Appropriately sized PCR products were PCR purified if they were the only bands observed on the gel. However, if there were multiple bands, PCR products were isolated by cutting out gel slices from 1.4% low melt agarose-TAE gels (Agarose II, ISC BioExpress) and extracting the DNA using the QIAquick Gel Extraction Kit (Qiagen). PCR products were subsequently ligated, transformed, sequenced and edited as described.

qPCR Primer Design

All primers used in qPCR assays were designed using the Primer Select software. Each ZPC sequence was initially imported into the SeqBuilder software program (Lasergene) and converted to a format compatible with the Primer Select program. Forward and reverse primer sets were designed based on parameters determined for optimal qPCR as defined by Logan et al. [54]. Primer sets were designed with a theoretical melting temperature (Tm) of 60°C (+/- 2°C), a GC content of approximately 50%, and a PCR product length of 100-150 base pairs.
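Two of the design parameters above (GC content and product length) lend themselves to a simple automated screen. The following is a minimal sketch only, not the Primer Select procedure used in this study: the function names, the 10-percentage-point tolerance around 50% GC, and the example primer are all assumptions introduced for illustration. Melting temperature is deliberately left out because its value depends on the calculation method used by the design software.

```python
def gc_fraction(primer: str) -> float:
    """Return the fraction of G and C bases in a primer sequence."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def passes_design_screen(primer: str, amplicon_length: int,
                         gc_target: float = 0.50,
                         gc_tolerance: float = 0.10) -> bool:
    """Check GC content (about 50%) and product length (100-150 bp).

    gc_tolerance is an assumed reading of "approximately 50%".
    """
    gc_ok = abs(gc_fraction(primer) - gc_target) <= gc_tolerance
    length_ok = 100 <= amplicon_length <= 150
    return gc_ok and length_ok


# Hypothetical 25-base primer, not a sequence from this study.
example_primer = "ATGCATGCATGCATGCATGCATGCA"
print(round(gc_fraction(example_primer), 2))
print(passes_design_screen(example_primer, amplicon_length=120))
```

A screen of this kind only narrows the candidate list; specificity to an individual ZPC clone still has to be judged against the sequence alignment.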
The length of the primers ranged from 25-36 bases to ensure that the primers would specifically target and amplify a 100-150 base pair product of either ZPC or GAPDH from a cDNA library. Only two or three of the last five bases at the 3' end of each primer were designed to be a guanine or cytosine so as to prevent artificial GC clamping and amplification of non-specific products. The CLUSTALW and BOXSHADE programs were used to visualize the alignment of the ZPC cDNA sequences so that regions specific to each clone could be targeted for primer design. Potential primer sets designed through Primer Select were then evaluated with respect to their location within the CLUSTALW alignments. A total of 15 ZPC primer sets were designed for qPCR assays, which included 6 for L. laevis, 5 for X. borealis, and 4 for X. laevis. Additionally, a GAPDH primer set was designed for each species.

qPCR Primer Annealing Temperature Optimization

The optimal annealing temperature (TA) for each primer set was determined using the MyCycler thermocycler, which has the capability to perform gradient annealing temperature PCR. The template DNA for each primer set was plasmid DNA purified from E. coli transformants that contained the appropriate cDNA insert clone. Each ZPC and GAPDH plasmid DNA was quantified using the PicoGreen fluorescent assay (Invitrogen, Carlsbad, CA) and a Turner Biosystems 380 Fluorometer (Fisher Scientific, Pittsburgh, PA). Plasmid DNA was diluted to a working concentration of 2.5 ng/µl, which was then used to generate serial dilutions for template DNA in the PCR reactions. The GoTaq Flexi DNA Polymerase kit (Promega) was used for PCR reactions. PCR reactions were assembled in a PCR clean hood, which included UV irradiation of all components except primers and DNA template. A master mix was prepared using all the components except plasmid DNA and then aliquoted to thin walled PCR tubes.
In addition, a negative control without template DNA was included ("no template") so as to determine whether primer dimers were being formed during the reaction and whether there was template contamination. Gradient PCR was performed using the following conditions: initial denature cycle at 95°C for 5 minutes; 25 cycles at 94°C for 45 seconds, gradient annealing temperature for 45 seconds, and 72°C for 45 seconds; and a final hold cycle at 4°C. PCR products were then electrophoretically analyzed using 2% agarose-TAE gels and a DNA size standard, ExACT Gene 100bp ladder (Fisher Scientific).

Mixed Plasmid Experiments

Mixed plasmid experiments were performed to verify that each primer set is specific for the template it was designed for and does not amplify any of the other ZPC templates found in a particular cDNA library. Each PCR reaction was performed as above, but instead of using a gradient of annealing temperatures, the temperature deemed optimal was used for annealing (i.e. amplification was robust at this temperature, but the next higher temperature resulted in a dramatic reduction of product). Each experiment contained a positive control which utilized one plasmid ZPC cDNA template, its specific primer set, and the annealing temperature optimal for amplifying its specific product. The mixed plasmid PCR reaction was then set up to include all the other plasmid ZPC cDNA templates (without the positive control plasmid template), the specific primer set for the control plasmid, and the same optimized annealing temperature. A "no template" negative control was also included as described before. PCR reactions were analyzed on 2% agarose gels as described above.

iQ5 Multicolor Real-Time Detection System Calibration

It is essential that the iQ5 Multicolor Real-Time Detection System (Bio-Rad) be calibrated for the type of vessel, volume, and sealing methodology prior to performing qPCR. The first calibration is the mask alignment.
To align the mask, 10x External Well Factor Solution (Bio-Rad) was diluted to 1x with ddH2O, and then 25 µl (the chosen volume) was pipetted into the vessel of choice, either a hard-shell 96-well plate (Bio-Rad) or clear 0.2 mL 8-tube strips (Bio-Rad), depending on the needs of the experiment. For the 96-well plate, all wells were loaded with 25 µl of 1x External Well Factor Solution and sealed with Microseal ‘B’ Film (Bio-Rad). For the 0.2 mL 8-tube strips, 25 µl of 1x External Well Factor Solution was pipetted into each tube and sealed with corresponding flat cap strips (Bio-Rad). Within the iQ5 Optical System Software, the steps were followed to align, optimize, and save the mask settings. It is also essential that the background level of fluorescence from the vessel be accounted for (background calibration) by using empty vessels that have been sealed in the same manner as before. Once the mask and background had been calibrated, the persistent well factor was calibrated to account for well-to-well variation in fluorescence detection. The same vessels containing the 25 µl of 1x External Well Factor Solution were returned to the heating block and the persistent well factors were collected. Calibrations were valid for six months according to the manufacturer's instructions. The "Pure Dye Calibration" did not need to be done since the fluorescent DNA binding dye SYBR Green was used for detection.

Primer Efficiency

The ability of each primer set to efficiently amplify the target was tested using the iQ5 Multicolor Real-Time Detection System (Bio-Rad). Master mixes were prepared using the iQ SYBR Green Supermix (Bio-Rad), which contains 2x reaction buffer, dNTPs, iTaq DNA polymerase, MgCl2, SYBR Green I, fluorescein, and stabilizers. In addition, brand-new Pipetman pipettes (Gilson) were used for delivering volumes to PCR vessels to increase accuracy and prevent contamination from previous PCR reactions.
Master mixes were prepared as described above in a clean PCR hood, and primers and template DNA were added afterward. Plasmid DNA previously used for primer optimization protocols served as the starting template for primer efficiency assays. The 2.5 ng/µl working solution of plasmid DNA (containing cDNA inserts) was serially diluted 10-fold five times, creating a gradient of concentrations ranging from 2.5 ng/µl to 0.000025 ng/µl. Only 1 µl of plasmid template from the dilutions ranging from 0.025 ng/µl to 0.000025 ng/µl was used as starting template for the primer efficiency qPCR reactions. A "no template" negative control was also included for each set of reactions. Strip tubes were then capped, centrifuged, and placed on the optical reaction block of the iQ5 Multicolor Real-Time Detection System. The following qPCR protocol was used for each primer efficiency assay: initial denature cycle at 95°C for 5 minutes; 30 cycles at 94°C for 15 seconds and the optimal annealing temperature for 30 seconds; and lastly a melting curve cycle from 55-95°C using 81 steps, with each step lasting 15 seconds. The efficiency of each primer set was calculated by the iQ5 software from a standard curve set up for each dilution series assay. Acceptable primer efficiency has been established to be between 90% and 110%. Melt curves were also analyzed to examine whether there was any evidence of primer dimer amplification.

Gene Expression Level Assays using qPCR

Quantitative PCR was performed using the same iQ5 Multicolor Real-Time Detection System as above. Master mix without primers was exposed to UV irradiation in the PCR hood for 15 minutes as before. A single master mix is prepared for the assessment of ZPC and GAPDH as outlined in Figure 6. The SYBR Green Super Mix contains the following components: ddH2O, 2x reaction buffer, dNTPs, iTaq DNA polymerase, 6mM MgCl2, SYBR Green I, fluorescein, and stabilizers.
Half of the total volume (74.75 µl) is transferred to one tube where primers for the specific ZPC cDNA are added, whereas the second half is added to a tube with the GAPDH primer set. Then, 25 µl is pipetted into wells of the 96-well plate in triplicate. Separate master mixes lacking ovary cDNA are prepared to serve as the "no template" negative controls. Plates were then sealed with Microseal ‘B’ Film, centrifuged, and placed on the optical reaction block of the iQ5 Multicolor Real-Time Detection System. qPCR protocols were set up using the iQ5 Optical System Software. The same qPCR protocol was used for each ZPC or GAPDH gene being assayed, based on the optimal annealing temperature utilized in the primer efficiency assays.

Figure 6. Quantitative PCR master mix set-up. To minimize technical variability between reactions, master mixes were used for each qPCR assay. Volumes of qPCR reagents are listed next to each PCR tube. Volumes next to arrows indicate the amount of reaction material that was transferred to either a new PCR tube or the wells of a qPCR plate. [Diagram: master mix tube (Super Mix 81.25 µl, ddH2O 61.75 µl, ovary cDNA 6.5 µl); 74.75 µl transferred to a GAPDH tube (forward primer 3.5 µl, reverse primer 3.5 µl) and 74.75 µl to a ZPC tube (forward primer 3.5 µl, reverse primer 3.5 µl); 25 µl transferred from each tube to each well.]

Data Analysis

Upon completion of each assay, the data were analyzed in the Data Analysis Module of the iQ5 Optical System Software. The PCR Quantification (PCR Quant) tab displays the amplification curves and is used to set the analysis conditions for each trial. The CT value threshold was manually set at 200 relative fluorescence units (RFU) for each species and gene being measured. Once the threshold was set, the specific CT value for each reaction was recorded into a spreadsheet for further analyses of relative gene expression. Since reactions were performed in triplicate, CT values were averaged for each trial. Two additional trials, each with triplicate samples, were also performed.
GAPDH qPCR assays were performed concurrently with each ZPC gene trial to serve as a plate-to-plate control, ensuring that equal volumes of cDNA were incorporated into each trial. In order to determine the relative ZPC gene expression differences, average ZPC CT values were normalized to the average GAPDH CT values. GAPDH served as the baseline to which all ZPC CT values were compared within a particular species. CT values were normalized to GAPDH by calculating ΔCT, where ΔCT = CT(ZPC) - CT(GAPDH). Once the ZPC CT values were normalized to GAPDH, each normalized ZPC gene was compared to the others found within the particular frog species by calculating the Δ(ΔCT) value, where Δ(ΔCT) = ΔCT(ZPC.n) - ΔCT(ZPC.lowest). For this calculation, the lowest expressed gene (ΔCT(ZPC.lowest)) was used as the baseline to which all other ZPC genes were compared. The relative fold difference was then calculated using the following equation: fold difference = 2^(-Δ(ΔCT)). Because a lower CT indicates more starting template, a more highly expressed gene has a smaller ΔCT than the lowest expressed gene, so its Δ(ΔCT) is negative and its fold difference is greater than 1. For example, a ZPC gene with a Δ(ΔCT) of -2 is expressed 4 times as much as the lowest expressed ZPC gene.

In addition, melt curve data were viewed through the Melt Curve/Peak setting within the Data Analysis software. The RFU (relative fluorescence unit) data collected during the melt curve cycle were plotted against temperature to create a graphical display of the data. This melt curve was then used to derive a melt peak chart, which depicts the rate of change in fluorescence with respect to temperature (a rapid decrease in fluorescence signal indicates that a specific product melted). The number of different sized products can then be estimated by the number of individual peaks on the melt peak chart.

Statistical Analysis

Statistical tests were conducted to determine whether expression levels differed significantly between the measured genes. Using Microsoft Excel, the existence of variable expression was established through ANOVA using ΔCT values from triplicate trials.
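The ΔCT and Δ(ΔCT) normalization described under Data Analysis can be sketched in a few lines. This is an illustrative sketch rather than the spreadsheet actually used: the function name and the CT values are made up, and the fold difference is computed with the conventional negative exponent, 2^(-Δ(ΔCT)), which is assumed here so that a more highly expressed gene yields a value greater than 1.

```python
from statistics import mean


def relative_fold_differences(zpc_ct, gapdh_ct):
    """Return fold expression of each ZPC gene relative to the lowest expresser.

    zpc_ct: dict mapping gene name -> list of replicate CT values
    gapdh_ct: list of replicate GAPDH CT values from the same cDNA
    """
    gapdh_mean = mean(gapdh_ct)
    # dCT = CT(ZPC) - CT(GAPDH); a larger dCT means lower expression.
    delta_ct = {gene: mean(cts) - gapdh_mean for gene, cts in zpc_ct.items()}
    # The lowest expressed gene has the largest dCT and serves as the baseline.
    baseline = max(delta_ct.values())
    # ddCT = dCT(gene) - dCT(lowest); fold difference = 2 ** (-ddCT).
    return {gene: 2 ** -(dct - baseline) for gene, dct in delta_ct.items()}


# Made-up CT values for illustration only (not data from this study).
zpc_ct = {"ZPC.1": [20.0, 20.5, 19.5], "ZPC.2": [22.0, 22.5, 21.5]}
gapdh_ct = [18.0, 18.0, 18.0]
for gene, fold in relative_fold_differences(zpc_ct, gapdh_ct).items():
    print(gene, round(fold, 2))
```

In this made-up example the two genes differ by two cycles after normalization, so the first is reported at a 4-fold difference and the baseline gene at 1. The calculation assumes amplification efficiencies close to 100% for every primer set, which is what the primer efficiency assays are meant to confirm.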
Significant differences in expression between individual genes were determined by performing t-tests on all pairwise combinations using the Bonferroni correction. The Bonferroni-corrected alpha level is obtained by dividing 0.05 by the total number of pairwise comparisons. For example, L. laevis expresses six ZPC genes, making 15 total pairwise comparisons. The null hypothesis of equal expression is rejected if p-values derived from t-tests are less than 0.0033 (0.05/15 = 0.0033).

Phylogenetic Analyses

Phylogenetic and molecular evolutionary analyses of the ZPC gene family were conducted using MEGA software version 4 [55]. ZPC cDNA sequences from L. laevis, X. laevis, and X. borealis were imported into the MEGA program and globally aligned by CLUSTALW. Phylogenetic trees were constructed showing evolutionary relationships within and across species using the maximum likelihood and neighbor-joining methods. A bootstrap analysis was performed using 1000 replicates to resample the data and gain an estimate of the level of confidence in the branching order of the resulting tree. Bootstrap values are the percentage of tree replicates that show the same branching pattern as the consensus tree [56].

Chapter 3

RESULTS

Cloning and Sequencing GAPDH cDNAs

Design of GAPDH Degenerate Primers

Degenerate primers were designed based on a GAPDH protein multiple sequence alignment of species ranging from zebrafish to human, including the frog species X. laevis and X. tropicalis (Figure 7). The diversity of the organisms included in the alignment increases the likelihood that the conserved sequences observed in the alignment will also be conserved in the two targeted frog species, L. laevis and X. borealis. Degenerate primers were designed to evolutionarily conserved amino acid regions indicated by the black shading in Figure 7.
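Because degenerate primers are designed from amino acid sequence, each one is really a pool of related oligonucleotides, and the size of that pool (its fold degeneracy) is the product of the number of bases allowed at each position. The sketch below illustrates the arithmetic using the standard IUPAC ambiguity codes; the function name is an assumption, and the example primer is a hypothetical back-translation of the peptide KVGVNG, not one of the primers used in this study.

```python
# Number of bases represented by each IUPAC nucleotide code.
IUPAC_CHOICES = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def fold_degeneracy(primer: str) -> int:
    """Multiply the number of possible bases at every primer position."""
    fold = 1
    for base in primer.upper():
        fold *= IUPAC_CHOICES[base]
    return fold


# Hypothetical primer: AAR GTN GGN GTN AAY GG (Lys-Val-Gly-Val-Asn-Gly).
hypothetical_primer = "AARGTNGGNGTNAAYGG"
print(fold_degeneracy(hypothetical_primer))  # 2 * 4 * 4 * 4 * 2 = 256
```

Keeping the last few 3' positions free of ambiguity codes, as in this example, does not change the calculation but reduces the chance of mispriming.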
One forward and two reverse degenerate primers were designed so that they could be used in combination with vector primers, depending on which cDNA library kit was used for cDNA construction. The primers GAPDH.dF1 (forward) and GAPDH.dR1 (reverse) both have a fold degeneracy of 256, meaning that 1 in 256 primer molecules in the pool is expected to base pair perfectly with the corresponding GAPDH cDNA in the library (Table 1). The third degenerate primer was 384-fold degenerate, making the chance of perfect base pairing lower but still within a reasonable range. The primers were designed to be 20-24 bases in length and to have little or no degeneracy at their 3' ends to decrease the chance of non-target amplification.

Mouse1          1 --MVKVGVNGFGRIGRLVTRAAICSGKVEIVAINDPFIDLNYMVYMFQYDSTHGKFNSTV
Mouse2          1 --MVKVGVNGFGRIGRLVTRAAICSGKVEIVAINDPFIDLNYMVYMFQYDSTHGKFNGTV
Human           1 MGKVKVGVNGFGRIGRLVTRAAFNSGKVDIVAINDPFIDLNYMVYMFQYDSTHGKFHGTV
Chicken         1 --MVKVGVNGFGRIGRLVTRAAVLSGKVQVVAINDPFIDLNYMVYMFKYDSTHGHFKGTV
Zebrafish       1 --MVKVGINGFGRIGRLVTRAAFLTKKVEIVAINDPFIDLDYMVYMFQYDSTHGKYKGEV
X.laevis        1 --MVKVGINGFGCIGRLVTRAAFDSGKVQVVAINDPFIDLDYMVYMFKYDSTHGRFKGTV
X.tropicalis    1 ------------------------------------------MAYMFKYDSTHGRFKGTV

Mouse1         59 KAENGKLVINGKPITIFQERDPANIKWGEASAEYVVESTGVFTTMEKARAHLKGGAKRVI
Mouse2         59 KAENGKLVINGKPITIFQERDPTNIKWGEAGAEYVVESTGVFTTMEKAGAHLKGGAKRVI
Human          61 KAENGKLVINGNPITIFQERDPSKIKWGDAGAEYVVESTGVFTTMEKAGAHLQGGAKRVI
Chicken        59 KAENGKLVINGHAITIFQERDPSNIKWADAGAEYVVESTGVFTTMEKAGAHLKGGAKRVI
Zebrafish      59 KAEGGKLVIDGHAITVYSERDPANIKWGDAGATYVVESTGVFTTIEKASAHIKGGAKRVI
X.laevis       59 KAENGKLIINDQVITVFQERDPSSIKWGDAGAVYVVESTGVFTTTEKASLHLKGGAKRVV
X.tropicalis   19 CVENGKLVINGKAVTVFQERDPSNIKWGDAGAVYVVESTGVFTTKEKAGLHLKGGAKRVI

                  GAPDH.dF1 -------->
Mouse1        119 ISAPSADAPMFVMGVNHEKYDNSLKIFSNASCTTNCLAPVAKVIHDNFGIVEGLMTTVHA
Mouse2        119 ISAPSADAPMFVMGVNHEKYDNSLKIVSNASCTTNCLAPLAKVIHDNFGIVEGLMTTVHA
Human         121 ISAPSADAPMFVMGVNHEKYDNSLKIISNASCTTNCLAPLAKVIHDNFGIVEGLMTTVHA
Chicken       119 ISAPSADAPMFVMGVNHEKYDKSLKIVSNASCTTNCLAPLAKVIHDNFGIVEGLMTTVHA
Zebrafish     119 ISAPSADAPMFVMGVNHEKYDNSLTVVSNASCTTNCLAPLAKVINDNFVIVEGLMSTVHA
X.laevis      119 ISAPSADAPMFVVGVNHEKYENSLKVVSNASCTTNCLAPLAKVINDNFGIVEGLMTTVHA
X.tropicalis   79 ISAPSADAPMFVVGVNHDKYDNSLHVVSNASCTTNCLAPLAKVINDNFGILEGLMTTVHA

Mouse1        179 ITATQKTVDGPSGKLWRDGRGAAQNIIPASTGAAKAVGKVIPELNWKLTGMAFRVPTPNV
Mouse2        179 ITATQKTVDGPSGKLWRDGRGAAQNIIPASTGAAKAVGKVIPELNGKLTGMAFRVPTPNV
Human         181 ITATQKTVDGPSGKLWRDGRGALQNIIPASTGAAKAVGKVIPELNGKLTGMAFRVPTANV
Chicken       179 ITATQKTVDGPSGKLWRDGRGAAQNIIPASTGAAKAVGKVIPELNGKLTGMAFRVPTPNV
Zebrafish     179 ITATQKTVDGPSGKLWRDGRGASQNIIPASTGAAKAVGKVIPELNGKLTGMAFRVPTPNV
X.laevis      179 FTATQKTVDGPSGKLWRDGRGAGQNIIPASTGAAKAVGKVIPELNGKITGMAFRVPTPNV
X.tropicalis  139 FTATQKTVDGPSGKLWRDGRGAGQNIIPASTGAAKAVGKVIPELNGKLTGMAFRVPTPNV

                  GAPDH.dR2 <-------
Mouse1        239 FVLDLTCRLEKPAKYDDIKKVVKQASEGPLKGILGYTEDQVVSCYFNSNSHSSTFDARAG
Mouse2        239 SVVDLTCRLEKPAKYDDIKKVVKQASEGPLKGILGYTEDQVVSCDFNSNSHSSTFDAGAG
Human         241 SVVDLTCRLEKPAKYDDIKKVVKQASEGPLKGILGYTEHQVVSSDFNSDTHSSTFDAGAG
Chicken       239 SVVDLTCRLEKPAKYDDIKRVVKAAADGPLKGILGYTEDQVVSCDFNGDSHSSTFDAGAG
Zebrafish     239 SVVDLTVRLEKPAKYDEIKKVVKAAADGPMKGILGYTEHQVVSTDFNGDCRSSIFDAGAG
X.laevis      239 SVVDLTCRLQKPAKYDDIKAAIKTASEGPMKGILGYTQDQVVSTDFNGDTHSSIFDADAG
X.tropicalis  199 SVVDLTCRLSKPAKYDDIKAAIKTAAHGPMKGILEYTEDQVVSTDFNGDTHSSIFDAGAG

                  GAPDH.dR2 <--------
Mouse1        299 IALNDNFVKLISWYDNEYGYSNRVVDLMAYMASKE
Mouse2        299 IALNDNFVKLISWYDNEYGYSNRVVDLMAYMASKE
Human         301 IALNDHFVKLISWYDNEFGYSNRVVDLMAHMASKE
Chicken       299 IALNDHFVKLVSWYDNEFGYSNRVVDLMVHMASKE
Zebrafish     299 IALNDHFVKLVTWYDNEFGYSNRVCDLMAHMASKE
X.laevis      299 IALNENFVKLVSWYDNECGYSNRVVDLVCHMASKE
X.tropicalis  259 IALNDNFVKLVSWYDNECGYSHRVVDLMCHMASKE

Figure 7. GAPDH protein sequence alignment and primer design. GAPDH protein sequences from taxa ranging from zebrafish to human were aligned by CLUSTALW and imported into BOXSHADE. Black shading indicates sequence identity while grey shading shows conservative substitutions. No shading (white) indicates sequence differences. Degenerate primer locations are represented by arrows above the sequences.

Table 1. Summary of primers used to clone GAPDH from L. laevis and X. borealis. GAPDH degenerate primers used to clone a middle portion of the corresponding cDNA were designed by hand to evolutionarily conserved regions. Gene specific GAPDH PCR primers targeting the 5' and 3' cDNA ends of L. laevis and X. borealis were designed using Primer Select software after sequencing of the middle portion, and were used in combination with their vector primers flanking the cDNA cloning sites.

Cloning of L. laevis GAPDH

For the first round of the L. laevis nested PCR strategy, the GAPDH.dR1 primer was used in combination with the T3 vector primer (Figure 8a). The expected size for the PCR product was 950 bp, which was observed in the stained agarose gel in addition to several other bands of approximately 175 bp and 750 bp. This initial PCR was performed at a relatively low annealing temperature, or low stringency (50°C), to ensure amplification, which explains the additional PCR products. A nested PCR strategy was then used to amplify an expected 403 bp product using the two internal degenerate primers GAPDH.dF1 and GAPDH.dR2. A 1/500 dilution of the initial PCR reaction containing the 950 bp amplicon served as the starting template for this nested PCR. In addition, a gradient was used for the annealing step (50°C, 52.9°C, 56.2°C) to optimize the annealing temperature and decrease non-target amplification. The stained gel of the nested PCR revealed lone bands at the 400 bp ladder mark for each of the annealing temperatures (Figure 8b). The nested PCR reactions for L. laevis GAPDH were pooled together and then ligated into the pGEM-T Easy vector, transformed into competent E. coli cells, and plated on LB agar plates containing ampicillin.
Selected colony transformants were then used to generate overnight cultures for spin column plasmid purification. After digestion with EcoRI restriction enzyme, five clones were found to contain the appropriately sized 400 bp cDNA insert (Figure 8c). One clone was subsequently sent for commercial sequencing using the SP6 vector primer. A BLAST search of the resulting sequence returned the X. laevis GAPDH cDNA sequence as the best alignment. The sequenced 403 bp L. laevis cDNA aligned to the middle portion of the X. laevis GAPDH cDNA, as intended, with 83% identity (Figure 9).

Figure 8. Degenerate PCR of the L. laevis GAPDH cDNA. (a) First round PCR using the GAPDH.dR1 primer in combination with the T3 vector primer. (b) Second round nested PCR with internal GAPDH.dF1 and GAPDH.dR2. (c) EcoRI restriction enzyme digests of cloned nested PCR products. L = ladder; - = negative control; TA = annealing temperature; letters (a, b, c, etc.) = clones.
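The percent identity values reported for the pairwise cDNA alignments can be illustrated with a short Python sketch. The convention used here, identical columns divided by the number of columns in which neither sequence has a gap, is an assumption: the text does not state which denominator was used, and other conventions give slightly different values. The function name percent_identity is hypothetical, and the two example strings are the first 60 alignment columns of the X. laevis and L. laevis GAPDH cDNAs shown in Figure 9.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns in which neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are left out of the denominator
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# First 60 columns of the Figure 9 alignment (X. laevis above, L. laevis below).
xl_block = "AGCAGATGCCCCCATGTTTGTAGTTGGCGTGAACCATGAGAAATATGAGAACTCTCTTAA"
ll_block = "-GCTGATGCCCCGATGTTTGTTGTTGGTGTCAACCATGAAAGCTATGACAACTCACTGAA"
block_identity = percent_identity(xl_block, ll_block)
print(f"identity over this block: {block_identity:.1f}%")
```

A single 60-column block is only a local estimate; the 83% figure quoted in the text refers to the full length of the aligned region.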
XlGAPDH    1 AGCAGATGCCCCCATGTTTGTAGTTGGCGTGAACCATGAGAAATATGAGAACTCTCTTAA
LlGAPDH    1 -GCTGATGCCCCGATGTTTGTTGTTGGTGTCAACCATGAAAGCTATGACAACTCACTGAA

XlGAPDH   61 AGTTGTTAGCAATGCTTCCTGCACTACAAACTGTCTGGCTCCTCTCGCAAAGGTCATCAA
LlGAPDH   60 GGTTATCAGCAATGCCTCATGCACCACCAACTGCCTTGCTCCTCTTGCAAAGGTCATCCA

XlGAPDH  121 CGACAACTTTGGCATTGTTGAGGGACTCATGACAACAGTCCATGCTTTCACTGCCACCCA
LlGAPDH  120 TGACAACTTTGGCATTGTAGAGGCCCTGATGACCACAGTCCATGCTTACACCGCTACCCA

XlGAPDH  181 GAAGACAGTGGATGGCCCATCAGGCAAGCTGTGGAGAGATGGCAGAGGTGCAGGTCAGAA
LlGAPDH  180 GAAGACCGTGGACGGACCATCTGGAAAGATGTGGCGTGATGGCAGAGGTGCAGGCCAGAA

XlGAPDH  241 CATTATTCCCGCCTCAACTGGTGCAGCAAAGGCTGTCGGAAAAGTTATCCCTGAGCTGAA
LlGAPDH  240 CATCATCCCAGCATCTACTGGTGCTGCTAAGGCTGTGGGCAAAGTCATCCCAGCCCTGAA

XlGAPDH  301 CGGAAAAATAACCGGAATGGCTTTCCGTGTCCCCACCCCAAATGTGTCCGTCGTGGATCT
LlGAPDH  300 TGGAAAGTGCACTGGTATGGCCCTCAGAGTTCCCACTCCCAATGTGTCAGTCGTTGACTT

XlGAPDH  361 GACCTGCCGCCTGCAGAAGCCGGCCAAGTACGATGACATCAAGG
LlGAPDH  360 GACTGCCCGTCTGGAGAAACCAGCCAAGTACGACGATATCAAGA

Figure 9. Sequence alignment of the middle portion of the L. laevis and X. laevis GAPDH cDNAs. Approximately 400 bases of the cloned and sequenced L. laevis GAPDH cDNA were aligned by CLUSTALW to the published sequence for the X. laevis GAPDH cDNA (NM_001087098) and imported into the BOXSHADE program. Black shading indicates sequence identity while grey and no shading (white) indicate sequence differences. The sequences aligned with 83% identity.

This L. laevis GAPDH sequence was then used to design additional primers for the amplification of the corresponding 5' and 3' cDNA ends (Table 1). To amplify the 3' end, LlGAPDH.F1 was used with the T7 vector primer to generate a 650 bp product (Figure 10a). An aliquot of this initial PCR was then used in a nested PCR using the internal LlGAPDH.F2 primer and the T7 vector primer, and a gradient was used for the annealing step (58.8-65.0°C). In addition to the expected 600 bp product, products of several other sizes were evident in the stained gel (Figure 10b).
In order to separate the 600 bp product from the mixture for cloning purposes, an aliquot of the nested PCR was electrophoresed on a low melt agarose gel and the agarose plug corresponding to the 600 bp region was removed. The agarose plug was melted at 50°C and subsequently ligated into the pGEM-T Easy vector and transformed into E. coli as before. Plasmid purification of overnight bacterial cultures was performed, and EcoRI-digested plasmids revealed the correctly sized 600 bp products in the stained gel (Figure 10c). Several clones were sent for commercial sequencing. BLAST searches of the cloned sequences revealed that they aligned to the 3' end of the X. laevis GAPDH cDNA with 78% sequence identity.

For the 5' cDNA end, the primer combination of LlGAPDH.R1 and T3 was used to amplify an expected 800 bp GAPDH product; however, multiple non-target products were also amplified (Figure 11a). An aliquot of this initial PCR was then used as template for a nested PCR using the internal primer LlGAPDH.R2 and the T3 vector primer. This nested PCR resulted in an intensely staining band at the 700 bp ladder mark, which corresponded to the anticipated GAPDH size, with minor bands noted at 500 bp and 250 bp (Figure 11b). An aliquot of the nested PCR was used for ligation and transformation. Transformant plasmid DNA was purified as before and the EcoRI digests revealed several clones with the expected 700 bp product (Figure 11c). Several clones were sent for commercial sequencing. Interestingly, the BLAST searches of the resulting sequences indicated they were from the bacterial species Bacillus pumilus. Additional clones were subsequently sequenced, but each one aligned to the bacterial species. Although the 5' end of the L. laevis GAPDH sequence was not obtained, it was deemed unnecessary to pursue for the design of qPCR primers since two-thirds of the sequence (66%) had been determined.

Figure 10. Cloning of the 3' cDNA end of L. laevis GAPDH cDNA sequence. (a) Initial PCR amplification of the 3' end using the gene specific primer LlGAPDH.F1 in combination with the T7 vector primer. (b) Nested PCR of first round PCR products using the gene specific primer LlGAPDH.F2 in combination with T7 vector primer. Labels below lane numbers indicate the gradient annealing temperatures used during amplification (58.8, 60.8, 63.7, and 65.0°C). (c) EcoRI restriction enzyme digest of cloned nested PCR products. L = ladder; - = negative control; TA = annealing temperature; letters (a, b) = clones.

Figure 11. Cloning the 5' end of the L. laevis GAPDH cDNA sequence. (a) Initial PCR amplification using the gene specific LlGAPDH.R1 primer with T3 vector primer. (b) Nested PCR using first round PCR products and internal LlGAPDH.R2 primer with T3 vector primer. Annealing temperatures (48.7, 49.9, and 51.7°C) are listed below lanes; no template control is in lane 5. (c) EcoRI restriction enzyme digests of cloned nested PCR products. L = ladder; TA = annealing temperature; letters (a, b, c, etc.) = clones.

Cloning of X. borealis GAPDH

For the first round of the X. borealis nested PCR strategy, the degenerate primer GAPDH.dR1 was used in combination with the AP1 vector primer (Figure 12a). The expected size for the PCR product was 970 bp, which was observed in the stained agarose gel in addition to several other bands of approximately 600 bp and 500 bp. This initial PCR was run at low stringency (50°C) to ensure amplification, which explains the additional PCR products.
A nested PCR strategy was then used to amplify an expected 600 bp product using GAPDH.dR1 and the internal degenerate primer GAPDH.dF1. A 1/500 dilution of the initial PCR reaction containing the 970 bp amplicon served as the starting template for this nested PCR. In addition, a gradient was used for the annealing step (50°C, 52.9°C, 56.2°C) to optimize the annealing temperature and decrease non-target amplification. The stained gel of the nested PCR revealed lone bands at the 600 bp ladder mark for each of the annealing temperatures (Figure 12b). The nested PCR reactions for X. borealis GAPDH were pooled together, ligated, and transformed, and plasmids from colony transformants were purified for analysis by EcoRI digestion. Four clones were found to contain the appropriately sized 600 bp cDNA insert (Figure 12c). One clone was subsequently sent for commercial sequencing using the SP6 vector primer. A BLAST search of the resulting sequence returned the X. laevis GAPDH cDNA sequence as the best match. The 586 bp X. borealis cDNA sequence aligned to the middle portion of the X. laevis GAPDH cDNA, as intended, with 89% identity (Figure 13).

Figure 12. Degenerate PCR of the X. borealis GAPDH cDNA. (a) First round PCR using the GAPDH.dR1 primer in combination with the AP1 vector primer. (b) Second round nested PCR with internal GAPDH.dF1 and GAPDH.dR2. (c) EcoRI restriction enzyme digests of cloned nested PCR products. L = ladder; - = negative control; TA = annealing temperature; letters (f, g, h, etc.) = clones.

This X. borealis GAPDH sequence was then used to design additional primers for the amplification of the corresponding 5' and 3' cDNA ends (Table 1).
To amplify the 3' end, XbGAPDH.F1 was used with the AP1 vector primer to generate a 600 bp product (Figure 14a). An aliquot of this initial PCR was then used in a nested PCR using the internal XbGAPDH.F2 primer and the AP1 vector primer to generate a product 550 bp in length (Figure 14b). A small amount of the 800 bp product persisted, but the discrepancy in the amounts was so great that the cloning vector was thought likely to ligate the smaller GAPDH product preferentially. PCR products were subsequently ligated and transformed, and plasmids from colony transformants were purified for analysis by EcoRI digestion (Figure 14c). Several positive clones were sent for commercial sequencing. BLAST searches of the cloned sequences revealed that they aligned to the 3' end of the X. laevis GAPDH cDNA with 86% sequence identity.

XlGAPDH    1 AGCAGATGCCCCCATGTTTGTAGTTGGCGTGAACCATGAGAAATATGAGAACTCTCTTAA
XbGAPDH    1 TGCTGACGCACCAATGTTCGTTGTTGGAGTGAACCATGACAAATATGACAACTCTCTTAC

XlGAPDH   61 AGTTGTTAGCAATGCTTCCTGCACTACAAACTGTCTGGCTCCTCTCGCAAAGGTCATCAA
XbGAPDH   61 AGTTGTGAGCAATGCATCCTGCACAACAAACTGCTTGGCTCCTCTTGCAAAGGTCATAAA

XlGAPDH  121 CGACAACTTTGGCATTGTTGAGGGACTCATGACAACAGTCCATGCTTTCACTGCCACCCA
XbGAPDH  121 CGACAATTTTGGCATTGTTGAGGGACTAATGACAACTGTCCATGCTTACACTGCTACCCA

XlGAPDH  181 GAAGACAGTGGATGGCCCATCAGGCAAGCTGTGGAGAGATGGCAGAGGTGCAGGTCAGAA
XbGAPDH  181 GAAGACTGTGGATGGCCCATCAGGGAAGCTGTGGAGAGATGGAAGAGGTGCTGGTCAGAA

XlGAPDH  241 CATTATTCCCGCCTCAACTGGTGCAGCAAAGGCTGTCGGAAAAGTTATCCCTGAGCTGAA
XbGAPDH  241 CATCATCCCCGCCTCCACTGGTGCAGCAAAGGCTGTAGGAAAGGTTATCCCTGAGCTGAA

XlGAPDH  301 CGGAAAAATAACCGGAATGGCTTTCCGTGTCCCCACCCCAAATGTGTCCGTCGTGGATCT
XbGAPDH  301 TGGCAAACTCACAGGAATGGCTTTCCGTGTCCCAGTCCCTAATGTGCCCGTTGTGGATCT

XlGAPDH  361 GACCTGCCGCCTGCAGAAGCCGGCCAAGTACGATGACATCAAGGCCGCCATTAAGACTGC
XbGAPDH  361 GACCTGCCGCCTGGAGAAGCCTGCAAAGTACAGTGATATCAAGGCTGCGGTTAAGGCTGC

XlGAPDH  421 ATCAGAGGGCCCAATGAAGGGAATCCTGGGATACACACAAGACCAGGTTGTCTCCACTGA
XbGAPDH  421 GTCCGAGGGACCAATGAAGGGAATCCTGCAATACACTGAAGACCAGGTTGTCTCCACTGA

XlGAPDH  481 CTTCAATGGTGACACTCACTCCTCCATCTTTGATGCTGATGCTGGAATTGCCCTGAATGA
XbGAPDH  481 CTTCAATGGCTGCACTCATTCCTCCATCTTTGATGCTGATGCTGGAATTGCACTGAATGA

XlGAPDH  541 AAACTTTGTGAAACTGGTTTCCTGGTATGATAATGAATGCGGCTAC
XbGAPDH  541 AAACTTTGTGAAGCTGGTTTCCTGGTAGGACAACGAATGCGGGTAT

Figure 13. Sequence alignment of the middle portion of the X. borealis and X. laevis GAPDH cDNAs. Approximately 600 bases of the cloned and sequenced X. borealis GAPDH cDNA were aligned by CLUSTALW to the published sequence for the X. laevis GAPDH cDNA (NM_001087098) and imported into the BOXSHADE program. Black shading indicates sequence identity while grey and no shading (white) indicate sequence differences. The sequences aligned with 89% sequence identity.

Figure 14. Cloning the 3' end of the X. borealis GAPDH cDNA sequence. (a) Initial PCR amplification using the gene specific primer XbGAPDH.F1 in combination with the AP1 vector primer. (b) Nested PCR using first round PCR products and the gene specific primer XbGAPDH.F2 in combination with AP1 vector primer. Labels below lane numbers indicate the gradient of annealing temperatures used during amplification. (c) EcoRI restriction enzyme digests of cloned nested PCR products. L = ladder; - = negative control; TA = annealing temperature; letters (a, b, c) = clones.

For the 5' cDNA end, the primer combination of XbGAPDH.R1 and AP1 was used to amplify an expected 800 bp GAPDH product; however, multiple non-target products were also amplified (Figure 15a). An aliquot of this initial PCR was then used as template for a nested PCR using the internal primer XbGAPDH.R2 and the AP1 vector primer, with a temperature gradient (54.9°C, 56.6°C, and 57.8°C) to limit the generation of non-target products.
This nested PCR resulted in an intensely staining band at the 700 bp ladder mark, corresponding to the anticipated GAPDH size, with minor bands noted at 450 bp and 350 bp (Figure 15b). An aliquot of the nested PCR was used for ligation and transformation into E. coli. Plasmid DNA was purified as before and the EcoRI digests revealed two clones with the expected 700 bp product (Figure 15c). The clones were sent for commercial sequencing. BLAST searches of the cloned sequences revealed that they aligned to the 5' end of the X. laevis GAPDH cDNA with 91% sequence identity. Thus, the full length X. borealis GAPDH cDNA sequence (1185 bases) was assembled from the 5', middle, and 3' sequenced regions and aligned to X. laevis (Figure 16).

Figure 15. Cloning the 5' end of the X. borealis GAPDH cDNA sequence. (a) Initial PCR amplification using the gene specific primer XbGAPDH.R1 in combination with the AP1 vector primer. (b) Nested PCR using first round PCR products and internal XbGAPDH.R2 primer and AP1 vector primer. (c) EcoRI restriction enzyme digests of cloned nested PCR products. L = ladder; - = negative control; TA = annealing temperature; letters (h, i, j, etc.) = clones.

XlGAPDH     1 -----------------------CAGACTTCAGAGG--GGTTGATAA---ACCAAATCAA
XbGAPDH     1 TCCATCCTAATACGACTCACTATAGGGCTCGAGCGGCCGCCCGGGCAGGTACCAAACCAA

XlGAPDH    33 GTGACTACAAAAATGGTGAAGGTTGGAATTAACGGATTTGGCTGTATTGGGCGCCTGGTG
XbGAPDH    61 GGGACCACAAAAATGGCAAAGGTTGGAATCAATGGATTTGGCCGCATTGGGCGCCTGGTG

XlGAPDH    93 ACCCGCGCTGCCTTTGATAGCGGCAAAGTTCAAGTCGTTGCTATCAATGACCCCTTCATC
XbGAPDH   121 ACCCGCGCTGCCTTTGATAGCGGCAAAGTTCAAGTCGTCGCCATCAATGACCCCTTCATC

XlGAPDH   153 GACTTGGACTACATGGTGTACATGTTCAAGTATGACTCCACCCACGGCCGCTTTAAGGGA
XbGAPDH   181 GACTTGGACTATATGGTTTACATGTTCAAGTATGACTCCACCCACGGCTGCTTTAAGGGA

XlGAPDH   213 ACCGTTAAGGCTGAGAATGGCAAGCTGATCATCAATGACCAAGTCATCACCGTCTTCCAG
XbGAPDH   241 ACAGTGAAGGCTGAGAATGGAAAGCTGGTCATCAATGGGCATGAAATCACCGTTTTCCAG

XlGAPDH   273 GAGCGTGACCCCTCCAGCATTAAGTGGGGAGATGCTGGTGCCGTGTATGTGGTGGAATCT
XbGAPDH   301 GAGCGTGATCCCTCAAACATTAAGTGGGGCGATGCTGGTGCCATGTATGTGGTGGAATCT

XlGAPDH   333 ACTGGAGTCTTCACAACCACAGAGAAGGCCTCTCTGCACTTGAAGGGAGGTGCCAAGCGT
XbGAPDH   361 ACTGGAGTCTTCACAACCAAAGACAAGGCCTCTATGCACTTGAAGGGAGGAGCCAAGCGT

XlGAPDH   393 GTCGTTATCTCCGCCCCCTCAGCAGATGCCCCCATGTTTGTAGTTGGCGTGAACCATGAG
XbGAPDH   421 GTCATCATCTCCGCCCCCTCAGCAGATGCCCCCATGTTTGTTGTTGGAGTGAACCATGAC

XlGAPDH   453 AAATATGAGAACTCTCTTAAAGTTGTTAGCAATGCTTCCTGCACTACAAACTGTCTGGCT
XbGAPDH   481 AAATATGACAACTCTCTTACAGTTGTGAGCAATGCATCCTGCACAACAAACTGCTTGGCT

XlGAPDH   513 CCTCTCGCAAAGGTCATCAACGACAACTTTGGCATTGTTGAGGGACTCATGACAACAGTC
XbGAPDH   541 CCTCTTGCAAAGGTCATAAACGACAATTTTGGCATTGTTGAGGGACTAATGACAACTGTC

XlGAPDH   573 CATGCTTTCACTGCCACCCAGAAGACAGTGGATGGCCCATCAGGCAAGCTGTGGAGAGAT
XbGAPDH   601 CATGCTTACACTGCTACCCAGAAGACTGTGGATGGCCCATCAGGGAAGCTGTGGAGAGAT

XlGAPDH   633 GGCAGAGGTGCAGGTCAGAACATTATTCCCGCCTCAACTGGTGCAGCAAAGGCTGTCGGA
XbGAPDH   661 GGAAGAGGTGCTGGTCAGAACATCATCCCCGCCTCCACTGGTGCACCACAGGCTGTAAGA

XlGAPDH   693 AAAGTTATCCCTGAGCTGAACGGAAAAATAACCGGAATGGCTTTCCGTGTCCCCACCCCA
XbGAPDH   721 AAGGTTATCCCTGAACTGAATGGCAAACTCACAGGAATGGCTTTCCGTGTCCCATTCCCT

XlGAPDH   753 AATGTGTCCGTCGTGGATCTGACCTGCCGCCTGCAGAAGCCGGCCAAGTACGATGACATC
XbGAPDH   781 AATGTGTCCGTTGTGGATCTGACCTGCCGCCTGGAGAATCCTGCAAAGTACAGTGATATC

XlGAPDH   813 AAGGCCGCCATTAAGACTGCATCAGAGGGCCCAATGAAGGGAATCCTGGGATACACACAA
XbGAPDH   841 AAGGCTGCGGTTAAGGCTGCGTCCGAGGGACCAATGAACGGAATCCTGCAATACACTGAA

XlGAPDH   873 GACCAGGTTGTCTCCACTGACTTCAATGGTGACACTCACTCCTCCATCTTTGATGCTGAT
XbGAPDH   901 GACCAGGTTGTCTCCACTGACTTCAATGGCTGCACTCATTCCTCCATCTTTGATGCTGAT

XlGAPDH   933 GCTGGAATTGCCCTGAATGAAAACTTTGTGAAACTGGTTTCCTGGTATGATAATGAATGC
XbGAPDH   961 GCTGGAATTGCACTGAATGAAAACTTTGTGAAGCTGGTTTCCTGGTATGATAACGAATGC

XlGAPDH   993 GGCTACAGCAACCGTGTTGTGGATCTTGTGTGTCACATGGCATCTAAGGAATAAGCACTT
XbGAPDH  1021 GGCTACAGCCACCGTGTTGTGGATCTTATGTGTCACATGGCATCTCAGGAATAAACACCT

XlGAPDH  1053 GTCACCT-GTCAACCCCTCTTCT--CACTGAAGGGGTCCAGAGTCGCCCATCCTGCTAGT
XbGAPDH  1081 GTCA----ATCAACCCCTCTTCT--CTCTGAAGGA--CCATAGTCAACCCATCTACTACT

XlGAPDH  1110 CTGTC------ACTGTTTCTGTGTTCCTAAATAAAACCATGATGAAACATT
XbGAPDH  1133 CTGTCTGTGTCGCTGTTTCTGTGTTACTAAATAAAACAATGATGAAACAGCA

Figure 16. Sequence alignment of the X. borealis and X. laevis GAPDH cDNAs. The full length X. borealis GAPDH cDNA was aligned with X. laevis GAPDH cDNA (NM_001087098) by CLUSTALW and imported into BOXSHADE. Sequences aligned with 91% sequence identity.

qPCR Primer Design

Each primer to be used in future qPCR assays was designed as close to optimal parameters as possible, given the sequence constraints imposed by the need to target regions unique to each clone. All primers were designed with a melting temperature (Tm) of 60°C according to Primer Select software (DNASTAR, Madison, WI). Primer lengths ranged from 20 to 36 bases so as to be close to this Tm, and primers were centered on regions that were divergent from other clones. Each primer was designed with the goal of having the last base at its 3' end be either a cytosine or a guanine, for stability, to increase the chances that Taq polymerase will amplify the specific product. However, care was taken to limit the last five bases at the primer's 3' end to only two or three cytosine or guanine bases to avoid non-specific GC clamping that could lead to amplification of non-target products. The optimal amplicon size (PCR product) for each of the designed primer pairs was in the range of 100-150 bases. In addition, the amplicon was to contain approximately 50% GC content since melting curves will be performed at the end of qPCR runs. All designed qPCR primer pairs were then tested using conventional PCR (endpoint assay) to assess their performance in the next phase of experimentation.
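The sequence-based design rules described above (primer length, a G or C at the 3' end, a limited number of G/C bases in the last five positions, amplicon length, and amplicon GC content) can be expressed as simple checks. The Python sketch below is an illustration only: the function names, the 10 percentage point tolerance around 50% GC, and the example sequences are assumptions that do not come from the text, and the Tm criterion is left out because the algorithm used by Primer Select is not described here.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def check_primer(primer: str) -> dict:
    """Apply the primer design rules described in the text to one primer."""
    p = primer.upper()
    tail = p[-5:]  # last five bases at the 3' end
    tail_gc = tail.count("G") + tail.count("C")
    return {
        "length_20_to_36": 20 <= len(p) <= 36,
        "three_prime_base_is_gc": p[-1] in "GC",
        # The text aims for two or three G/C bases; only the upper limit is enforced here.
        "tail_gc_at_most_3": tail_gc <= 3,
    }


def check_amplicon(amplicon: str, gc_tolerance: float = 0.10) -> dict:
    """Apply the amplicon rules; gc_tolerance is an assumed allowance around 50% GC."""
    return {
        "length_100_to_150": 100 <= len(amplicon) <= 150,
        "gc_near_50_percent": abs(gc_fraction(amplicon) - 0.5) <= gc_tolerance,
    }


# Hypothetical sequences for illustration; not primers or amplicons from this study.
good_primer = "ACGTACGTACGTACGTATAC"
clamped_primer = "ACGTACGTACGTACGGCGCA"
toy_amplicon = "ACGT" * 30  # 120 bases, 50% GC

primer_report = check_primer(good_primer)
clamped_report = check_primer(clamped_primer)
amplicon_report = check_amplicon(toy_amplicon)
```

Checks of this kind only screen candidates on sequence composition; specificity for a single ZPC gene copy still depends on placing the primers in regions that differ between clones, as described in the following section.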
Lepidobatrachus laevis qPCR primers

In order to design specific qPCR primers for targeted genes, it was essential to find regions that were unique to particular cDNAs. For the L. laevis ZPC genes, the six full length ZPC cDNAs that were previously sequenced were globally aligned using the CLUSTALW software program and then imported into the shading program BOXSHADE to reveal sequence differences (Figure 17). The aligned order (Llzpc.3, Llzpc.6, Llzpc.1, Llzpc.2, Llzpc.5, and Llzpc.4) demonstrates visually which sequences are more related to each other since sequences listed first in the alignment have the highest level of similarity to each other followed by the more divergent sequences. To further assess the relationships of these ZPC cDNAs, a distance-based gene tree was generated (Figure 18a), and a comparison of their pairwise nucleotide identities was performed (Table 2a). These analyses revealed that the highest level of identity exists between Llzpc.3 and Llzpc.6 at 90% whereas the lowest level of identity found was for
Llzpc.3 Llzpc.6 Llzpc.1 Llzpc.2 Llzpc.5 Llzpc.4 1 1 1 1 1 1 AGCACGAGGACACTGACATGGACTGAAGGAGTAGAAATTGTGTGACTTTTCCTGTAGTCA ---------------------------------------GTGTGACTTTTCCTGTAGTCA -----------------------------------------GTGACTTCTCCTGCGGTCA ---------------------------------------GTGTGACTTCTCCTGCAGTCA ---------------------------------------GTGTGACTTCTCCTGTAGTCA --------------------------------------AGTGCTATTCTGAATAGAGTCA Llzpc.3 Llzpc.6 Llzpc.1 Llzpc.2 Llzpc.5 Llzpc.4 61 22 20 22 22 23 GGATGGAGCTGTGGA------TCAGGTGGAGTTGTCTATTAGTGGTCCTGATCTATGGAG GGATGGAGCTGTGGA------TCAGGTGGAGTTGTCTATTAGTGGTTCTGATCTATGGAG GGATGGAGCTGTGGA------TCAGGTGGAGTTGTTTATTAGTGGTTCTGATCTATGGAG GGATGGAGCTGTGGA------TCAGGTGGAGTTGTCTATTAGTGGTTCTGATCTATGGAG GGATGGAGCTGTGGA------TCAGGTGGAGTTGCCTATTAGTGGTCCTGATCTATGGAG GAATGGGTTTTTGGAGATTATCCTGGTGGCCTTTGATGGTAGGAGTGATCTTCTGCAGCT Llzpc.3 Llzpc.6 Llzpc.1 Llzpc.2 Llzpc.5 Llzpc.4 115 76 74 76 76 83
CAGGCTTTAGCAGAGCTTTGGTTAGACCTCGGCGCCAGTCAGAC--ACT-TGGTGGAGGA CAGGCTTTAGCAGAGCCTTGGTTAGACCTCGGCGCCAGTCAGAC--ACT-TGGTGGAGGA CAGGCTTTAGCAGTGCTTTGGTTAGACCCCGGCGCCAGTCAGAC--ACT-TGGTGGAGGA CAGGCTTTTGCAGAGCTTTGGTTAGACCCCGGCGCCAGTCAGAC--ACT-TGGTGGAGGA CAGGCTTTAGCAGAGCCTTGGTTAGACCTCGGCGCCAGTCAGAC--ACT-TGGTGGAGGA TGTGCCTTGAGGTTTGTAGATCTAATGTCCTGAGTAGACCACGCCGACAACAGTGGAGTA Llzpc.3 Llzpc.6 Llzpc.1 Llzpc.2 Llzpc.5 Llzpc.4 172 133 131 133 133 143 GTTATCAGCCTGGATGGGGATCTCCTAGAGGACTTGGACAACCTGCATCTGGAGTGGGGT GTTATCAGCCTGGATGGGGATCTCCTAGAGGACTTGGACAACCTGTATCTGGAGTGGGCT GTTATCAGCCTGGATGGGGATCTCCTAGAGGACATGGACAACCTGTATCTGGAGTGGGCT GTTATCAGCCTGGATGGGGATCTCCTAGAGGACTTGGTCAACCTGCATCTGGAGTGGGGT GTTATCAACCTGGATGGGGATCTCCTAGAGGACTTGGACAACCTGTATCTGGAGTAGGCT ACCAGCAACCTGGTTGGGGATCTCC---AAGATCT------TCTGCA--TGGAATGCACA Llzpc.3 Llzpc.6 Llzpc.1 Llzpc.2 Llzpc.5 Llzpc.4 232 193 191 193 193 192 CTCCTAGAGGAAGCTCTTGGTATCCGGCTCAGTCAGTGTCTGGTTTTGGGGCTTTAAGAG CTCCTAGAGGAAGCTCTTGGTATCCGGCTCAGTCAGTGTCTGGTTTTGGGGCTTTAAGAG CTCCTAGAGGAAGCTCTTGGTATCCGGCTCAGTCAGTGTCTGGTTTTGGGGCTTTAAGAG CTCCTAGAGGAAGCTCTTGGTATCCAGCTC---------------------------GAG CTCCTAGAGGAAGCTCTTGGTATCCAGCTC---------------------------GAG GTCC-AGGAGATATGGATGGGGTTCTGTTC------------------------------ Llzpc.3 Llzpc.6 Llzpc.1 Llzpc.2 Llzpc.5 Llzpc.4 292 253 251 226 226 221 GTGCTCAACCTGTGTCCGGATGGGGCT-CCAGGTTTCCCGGAAGAGA---TGATCAGACC GTGCTCAGCCTGTATCTGGATGGGGCT-CCAGGTTTCCCGGAAGAGA---TGATCAGACC GTGCTCAGCCTGTGTCCGGATGGGGCT-CCAGGTTTCCCGGAAGAGA---TGATCAGACC GTGCTCAGCCTGTGTCCGGATGGGGCT-CCAGGTTTCCTGGAAGAGA---TGATCAGACC GTGCTCAGCCTGTGTCCGGATGGGGCT-CCAGGTTTCCTGGAAGAGACGATGATCGGTCC AATCTCTGTCAG-GCTGGGACAACGCTGCTAGATATTCCAGTGGGGC---CGCTAATCTC Llzpc.3 Llzpc.6 Llzpc.1 Llzpc.2 Llzpc.5 Llzpc.4 348 309 307 282 285 277 CGACAGCTTCCTCCATT------CTCTCCTATCAATGTGCAGTGTGGTGAGGACAGGATG CGACAGCTTCCTCCATTTCCATCCTCTCCTATCAGTGTGCAGTGTGGTGAGGACAGGATG CGACAGCTTCCTCCATT------CTCTCCTATCAGTGTGCAGTGTGGTGAGGACAGGATG CGACAGCTTCCTCCATCTCCATCCTCTCCTATCAGTGTGCAGTGTGGTGAGGACAGGATG 
CGACAGCTTCCTCCATCTCCATCCTCTCCTATCAGTGTGCAGTGTGGTGAGGACAGGATG CGACAGATGCCACCCCA---GTGGTCACCAGTCACTGTGCAGTGCAATGAAGACAACATG Llzpc.3 Llzpc.6 Llzpc.1 Llzpc.2 Llzpc.5 Llzpc.4 402 369 361 342 345 334 GTGGTGATGGTGAAGAGAGACTTCTATGGTAATGGTAAGCTGGTGAAGCCCTCAGACCTG GTGGTGATGGTGAAGAGAGACTTCTATGGTAATGGTAAGCTGGTGAAGCCCTCAGACCTG GTGGTGATGGTGAAGAGAGACTTCTATGGTAATGGTAAGCTGGTGAAGCCCTCAGACCTG GTGGTGATGGTGAAGAGAGACTTCTATGGTAATGGTAAGCTGGTGAAGCCCTCAGACCTG GTGGTGATGGTGAAGAGAGACTTCTATGGTAATGGATATTTGGTGAAGCCCTCAGACCTG AGGCTGACTGTGAATAGAGACCTGTTTGGCACTGGGAAATTGGTGAAAGTTTCAGATCTA 55 Llzpc.3 Llzpc.6 Llzpc.1 Llzpc.2 Llzpc.5 Llzpc.4 462 429 421 402 405 394 ACCCTGGGA------TCCTGCAGACCTGGAACACAAACTACTGATCCTAATGTGGTCTTT ACCCTGGGA------TCCTGCAGACCCGGTGTACAGACTACAGATACTACGGTGGTCTTT ACCCTGGGA------TCCTGCAGACCTGGAGCACAAACTACTGATCCTAATGTGGTCTTT ACCCTGGGA------TCCTGCAGACCTGGTGTGCAGACTACAGATACTACAGTGGTCTTT ACTTTAGGC------TCTTGTAGACCTGGACCCCAGAGTTCGGACACCGTGGTGGTCTTT AGCCTGGGACCCCAATCTTGCCCTCCTGGTCCCCAGAATGAGGATGATGTAGTGTACTTT Llzpc.3 Llzpc.6 Llzpc.1 Llzpc.2 Llzpc.5 Llzpc.4 516 483 475 456 459 454 GAAAATGGCCTTCAAGAATGTGGTAGTACCTTGGAGATGACTCAAGACTGGCTCATCTAC GAAAATGGCCTTCAAGAATGTGGGAACATCCTAGAGATGAGACAAGACTTGCTGATCTAC GAAAATGGCCTTCAAGAATGTGGCAGCAACCTGGAGATGACTCGAGACTGGCTCATCTAC GAAAATGGCCTTCAAGAATGTGGAAGCAGCTTAGAGATGACTGCAGACTTTCTTCTGTAT GATAATAACGTCCAGGCATGTGGCAGCACTCTACAGATGACTTCAGACTTCTTGATCTAC CAGATTGGGCTTCAGGATTGTGGAAACCGTGTGCAGATGACGGCTGACTTGTTGACCTAC Llzpc.3 Llzpc.6 Llzpc.1 Llzpc.2 Llzpc.5 Llzpc.4 576 543 535 516 519 514 AAAGTCAACCTGCAGTATACTCCCACCTCCTCCAGCAATGTGCCCATCACCCGGTTCAAC CGCTCTATCCTACAATACACCCCCACTTCCTCCAGGAATGTGCCCATTATCCGGTCCAAC AAGATCAACCTACAGTACAGCCCTACATCCTCCAGCAATGTACCCATCATCAGGTCCAAC GAGACTATTCTGACCTATAGGCCAAC---CCCTGGTAATGTGCCCATTATCAGGACCAAC AGAACAGTATTAAATTACAATCCAAT---TTCCAACAACGCTGTCGTAATAAGGTCAAAT ACCACAACTCTGAACTACAACCCGACTGCCAACAGGAACAGTCCAATCATCCGAACCAAT Llzpc.3 Llzpc.6 Llzpc.1 Llzpc.2 Llzpc.5 Llzpc.4 636 603 595 573 576 574 
[Figure 17 alignment blocks for Llzpc.3, Llzpc.6, Llzpc.1, Llzpc.2, Llzpc.5, and Llzpc.4 not reproduced here.]

Figure 17. Sequence alignment of L. laevis ZPC cDNAs. Full-length ZPC cDNA sequences were aligned by CLUSTALW and imported into BOXSHADE.

[Figure 18 tree images not reproduced. Labels as printed in each panel: (a) L. laevis: Llzpc.3, Llzpc.6, Llzpc.1, Llzpc.5, Llzpc.2, Llzpc.4 (bootstrap values 88, 100, 71); (b) X. laevis: Xlzpc.1a, Xlzpc.1b, Xlzpc.2, Xlzpc.3 (bootstrap value 100); (c) X. borealis: Xbzpc.1d, Xbzpc.1e, Xbzpc.1c, Xbzpc.2, Xbzpc.3 (bootstrap values 72, 81). Scale bar in each panel: 0.05.]

Figure 18. Phylogenetic tree of frog ZPC cDNAs. Neighbor-joining (distance-based) phylogenetic trees were constructed with MEGA from ZPC sequences derived from the ovary of (a) L. laevis, (b) X. laevis, and (c) X. borealis. Branching order is based on the level of sequence identity, with highly identical sequences clustering together. The scale bar represents the number of substitutions per site. Consensus trees are reported from the results of 1000 bootstrap replicates, with the corresponding bootstrap values displayed at the nodes.

Table 2. Pairwise comparison of ZPC cDNA sequences. Each ZPC cDNA was globally aligned to all other ZPC sequences found within the ovaries of the particular frog species: (a) L. laevis, (b) X. laevis, and (c) X. borealis. The percent identity for each pairwise alignment is shown.

a. L. laevis
          Llzpc.1  Llzpc.2  Llzpc.3  Llzpc.4  Llzpc.5  Llzpc.6
Llzpc.1               75       85       65       75       90
Llzpc.2                        71       65       75       76
Llzpc.3                                 62       71       88
Llzpc.4                                          64       64
Llzpc.5                                                   76
Llzpc.6

b. X. laevis
          Xlzpc.1a  Xlzpc.1b  Xlzpc.2  Xlzpc.3
Xlzpc.1a               98        90       65
Xlzpc.1b                         90       65
Xlzpc.2                                   64
Xlzpc.3

c. X. borealis
          Xbzpc.1e  Xbzpc.1d  Xbzpc.1c  Xbzpc.2  Xbzpc.3
Xbzpc.1e               97        96        90       65
Xbzpc.1d                         96        88       65
Xbzpc.1c                                   92       65
Xbzpc.2                                             64
Xbzpc.3

the comparison of Llzpc.4 with all other cDNAs, ranging from 62-65%. Given the high sequence identities among a subset of the ZPC genes (Llzpc.3, Llzpc.6, and Llzpc.1), the regions that could be exploited for qPCR primer design were somewhat limited. Nevertheless, primer sets were designed for each of the L. laevis cDNAs (including GAPDH) using the guiding principles for optimal primer design described above (Table 3).

X. laevis qPCR Primer Design

The four X. laevis ZPC cDNAs that were previously sequenced were globally aligned using the CLUSTALW software program and then imported into the shading program BOXSHADE to reveal sequence differences (Figure 19). The aligned order, gene tree (Figure 18b), and pairwise comparison of identities for these ZPC cDNAs reveal their relationships to each other (i.e. Xlzpc.1a, Xlzpc.1b, Xlzpc.2, and Xlzpc.3). In particular, the pairwise comparison of nucleotide identities shows that the highest level of identity is 98%, between Xlzpc.1a and Xlzpc.1b, and that the lowest level of identity ranges from 64-65% for Xlzpc.3 compared to all others (Table 2b). As with L. laevis, the highly identical ZPC genes (especially Xlzpc.1a and Xlzpc.1b) made primer design challenging. Primer sets were designed for each of the X. laevis cDNAs (including GAPDH) using the guiding principles for optimal parameters as described (Table 3).
However, for clones Xlzpc.1a, Xlzpc.1b, and Xlzpc.2, the amplicon sizes were slightly larger than the optimal 100-150 bp range (187, 165, and 152 bases, respectively) so as to target divergent sequence regions.

Table 3. Summary of ZPC and GAPDH qPCR primers for L. laevis, X. laevis, and X. borealis. Primers used to amplify ZPC and GAPDH genes from L. laevis, X. laevis, and X. borealis are outlined in the table.

[Figure 19 alignment blocks for Xlzpc.1a, Xlzpc.1b, Xlzpc.2, and Xlzpc.3 not reproduced here.]

Figure 19. Sequence alignment of X. laevis ZPC cDNAs. ZP domain regions (middle) and sperm-binding regions (3' end) of the ZPC cDNA sequences were aligned by CLUSTALW and imported into BOXSHADE.

X. borealis qPCR Primer Design

The five X. borealis ZPC cDNAs that were previously sequenced were globally aligned using the CLUSTALW software program and then imported into the shading program BOXSHADE to reveal sequence differences (Figure 20). The aligned order, gene tree (Figure 18c), and pairwise comparison of identities (Table 2c) for these ZPC cDNAs reveal their relationships to each other (Xbzpc.1e, Xbzpc.1d, Xbzpc.1c, Xbzpc.2, and Xbzpc.3). In particular, the pairwise comparison of nucleotide identities shows that the highest level of identity is 97%, between clones Xbzpc.1e and Xbzpc.1d, whereas the lowest level of identity ranges from 64-65% for Xbzpc.3 when compared to all others (Table 2c). Given these constraints, primer sets were designed for each of the X. borealis cDNAs (including GAPDH) using the guiding principles for optimal parameters as described (Table 3). Once again, the amplicon size for several of the clones deviated from the optimal 100-150 bp range so as to target divergent sequence regions (the Xbzpc.1d, Xbzpc.1c, and Xbzpc.2 amplicons were 69, 76, and 90 bases, respectively).

[Figure 20 alignment blocks for Xbzpc.1e, Xbzpc.1d, Xbzpc.1c, Xbzpc.2, and Xbzpc.3 not reproduced here.]

Figure 20. Sequence alignment of X. borealis ZPC cDNAs. ZP domain regions (middle) and sperm-binding regions (3' end) of the ZPC cDNA sequences were aligned by CLUSTALW and imported into BOXSHADE.

Optimization of Primer Annealing Temperatures

Each designed primer set was tested for its optimal annealing temperature using the MyCycler Thermal Cycler System (Bio-Rad), which contains a gradient temperature block. The highest annealing temperature that still generated a substantial amount of product was chosen as the optimum, since this maximizes the specificity of the primer set.

L. laevis Primer Optimization

After performing annealing temperature gradients on all primer sets for L. laevis, the optimal annealing temperature for each primer set ranged from 65.8-69.2°C, with primers to Llzpc.3 being the highest and LlGAPDH primers being the lowest (Figure 21). Interestingly, this empirically determined optimal range was considerably higher than the Tm estimated by the primer design software program (60°C). Specifically, primers to Llzpc.2, Llzpc.5, and Llzpc.6 all annealed at 67.3°C while Llzpc.4 annealed at 69.5°C. As can be seen in Figure 21, an increase in temperature above these optimums yielded minimal to no product. There was slight amplification for Llzpc.2 and Llzpc.6 at the next highest temperature of 69.5°C (lane 6), but the amount was negligible compared to the product yield at 67.3°C. As for Llzpc.1, a narrower gradient was performed for this primer set due to its high sequence identity to other ZPC cDNAs. In this case, it was determined that 65.9°C was the best annealing temperature for specificity while still yielding considerable amounts of product. GAPDH primers appeared optimal at 65.8°C since they yielded significantly more product at this temperature than at 67.3°C (Figure 21, lane 5).
No-template controls (Figure 21, lane 10) were included in all experiments to demonstrate the absence of contamination and to show that the primer sets did not form primer dimers (primers hybridizing to each other and subsequently being amplified) at the optimal annealing temperature. It should be noted that a slight doublet was evident for the Llzpc.5 and Llzpc.1 PCR reactions. Since these PCR reactions were performed with plasmid DNA containing the ZPC cDNAs, it is possible that the primers annealed to alternative sites such as vector sequences. However, this was not viewed as a problem because the template concentrations in these assays (0.025 ng of each template) were higher than the level expected in the cDNA libraries, and because these vector sequences are not found in the cDNA libraries (which utilize small PCR adaptor sequences).

[Figure 21 gel images not reproduced. Lane annealing temperatures (°C) and amplicon sizes as labeled in the figure; lane 1 = ladder (L):]
Gene      Lanes 2-9                                  Lane 10   Amplicon
Llzpc.1   64.7 64.9 65.3 65.9 66.7 67.4 67.8 68.0    65.9      122 bp
Llzpc.2   64.0 64.7 65.8 67.3 69.5 71.2 72.2 73.0    67.3      112 bp
Llzpc.3   68.5 68.7 68.9 69.2 69.7 70.1 70.4 70.5    68.7      129 bp
Llzpc.4   64.0 64.7 65.8 67.3 69.5 71.2 72.2 73.0    67.3      130 bp
Llzpc.5   64.0 64.7 65.8 67.3 69.5 71.2 72.2 73.0    67.3      107 bp
Llzpc.6   64.0 64.7 65.8 67.3 69.5 71.2 72.2 73.0    67.3      121 bp
LlGAPDH   64.0 64.7 65.8 67.3 69.5 71.2 72.2 73.0    67.3      152 bp

Figure 21. L. laevis ZPC and GAPDH qPCR primer optimization. Gradient annealing temperatures are listed under lane numbers; genes are listed down the left column; and amplicon size is listed down the right column. DNA ladder was loaded in lane 1, whereas negative controls (no-template controls) were loaded in lane 10.

X. laevis Primer Optimization

After performing annealing temperature gradients on all primer sets for X. laevis, the optimal annealing temperature for each primer set ranged from 66.0-67.3°C (Figure 22). As observed for L. laevis, the empirically determined optimal range was considerably higher than the Tm estimated by the primer design software program (60°C). Specifically, primers to Xlzpc.1a annealed at 66.0°C while Xlzpc.1b, Xlzpc.2, Xlzpc.3, and XlGAPDH all annealed at 67.3°C. As can be seen in Figure 22, an increase in temperature above these optimums generated minimal to no product. There was no visible amplification for Xlzpc.1b, Xlzpc.2, Xlzpc.3, and XlGAPDH at the next highest temperature, 69.5°C (lane 6). As for Xlzpc.1a, amplification occurred at 67.3°C, but the product was minimal compared to the other primer sets at that temperature. In this case, it was determined that 66.0°C was the best annealing temperature for specificity while still yielding considerable amounts of product. No-template controls (lane 10) demonstrated the absence of contamination and showed that the primer sets did not form primer dimers at the optimal annealing temperature.

[Figure 22 gel images not reproduced. Lane annealing temperatures (°C) and amplicon sizes as labeled in the figure; lane 1 = ladder (L):]
Gene      Lanes 2-9                                  Lane 10   Amplicon
Xlzpc.1a  64.0 64.7 65.8 67.3 69.5 71.2 72.2 73.0    67.3      187 bp
Xlzpc.1b  64.0 64.7 65.8 67.3 69.5 71.2 72.2 73.0    67.3      165 bp
Xlzpc.2   64.0 64.7 65.8 67.3 69.5 71.2 72.2 73.0    67.3      152 bp
Xlzpc.3   64.0 64.7 65.8 67.3 69.5 71.2 72.2 73.0    67.3      132 bp
XlGAPDH   64.0 64.7 65.8 67.3 69.5 71.2 72.2 73.0    67.3      112 bp

Figure 22. X. laevis ZPC and GAPDH qPCR primer optimization. Gradient annealing temperatures are listed under lane numbers; genes are listed down the left column; and amplicon size is listed down the right column. DNA ladder was loaded in lane 1, whereas negative controls (no-template controls) were loaded in lane 10.

X. borealis Primer Optimization

After performing annealing temperature gradients on all primer sets for X. borealis, the optimal annealing temperature for each primer set ranged from 64.2-70.8°C (Figure 23). Consistent with the prior optimization experiments, these annealing temperatures were higher than the Tm estimated by the primer design software program (60°C).
Specifically, primers to Xbzpc.1e annealed at 70.8°C, which was considerably higher than the 60°C target. Xbzpc.1d and Xbzpc.2 both annealed at 67.3°C while Xbzpc.1c and Xbzpc.3 optimally annealed at 68.2°C and 64.2°C, respectively. As can be seen in Figure 23, the use of tighter gradients for X. borealis (because of high sequence identity) yielded minimal product above the optimal temperatures, likely due to inefficient amplification at the elevated temperature. GAPDH primers appeared optimal at 65.8°C.

[Figure 23 gel images not reproduced. Lane annealing temperatures (°C) and amplicon sizes as labeled in the figure; lane 1 = ladder (L):]
Gene      Lanes 2-9                                  Lane 10   Amplicon
Xbzpc.1e  64.0 64.7 65.8 67.3 69.5 71.2 72.2 73.0    67.3      109 bp
Xbzpc.1d  64.0 64.7 65.8 67.3 69.5 71.2 72.2 73.0    67.3      69 bp
Xbzpc.1c  64.7 65.0 65.5 66.3 67.3 68.2 68.7 69.0    66.3      76 bp
Xbzpc.2   65.0 65.4 66.2 67.3 68.9 70.2 71.0 71.4    67.3      90 bp
Xbzpc.3   62.0 62.4 63.1 64.2 65.6 66.8 67.6 68.0    64.2      129 bp
XbGAPDH   64.0 64.7 65.8 67.3 69.5 71.2 72.2 73.0    67.3      114 bp

Figure 23. X. borealis ZPC and GAPDH qPCR primer optimization. Gradient annealing temperatures are listed under lane numbers; genes are listed down the left column; and amplicon size is listed down the right column. DNA ladder was loaded in lane 1, whereas negative controls (no-template controls) were loaded in lane 10.

Mixed Plasmid PCR Controls

Optimal annealing temperatures acquired from the gradient PCR experiments were used in mixed plasmid experiments to demonstrate the specificity of each primer set for the template it was designed to amplify. Although the optimal annealing temperature maximizes the specificity of the primer set for its particular cDNA, it was essential to show that the other ZPC cDNAs could not also be amplified at these temperatures. Thus, the plasmids containing the other ZPC cDNAs were mixed together in an equal ratio (0.025 ng of each template) and used as the starting template for the mixed plasmid experiment. A positive control, consisting of the primer set and the corresponding ZPC cDNA plasmid that it was designed to amplify, was included in the experiment. For all three species, each primer set amplified its specific ZPC template and failed to amplify the other ZPC templates at the optimized annealing temperature (Figure 24). It should be noted that conventional end-point PCR is subject to variation in the amount of product synthesized by the end of the cycles, which explains the variation in staining intensity for the positive plasmid controls (Figure 24, lane 2). The no-template controls (NTC) were consistent with previous experiments and showed no product.

[Figure 24 gel images not reproduced. Lanes: 1 = L, 2 = +, 3 = Mix, 4 = NTC. Primer sets and amplicon sizes as labeled in the figure:]
a. L. laevis: Llzpc.1, 122 bp; Llzpc.2, 112 bp; Llzpc.3, 129 bp; Llzpc.4, 118 bp; Llzpc.5, 107 bp; Llzpc.6, 121 bp
b. X. laevis: Xlzpc.1a, 187 bp; Xlzpc.1b, 165 bp; Xlzpc.2, 152 bp; Xlzpc.3, 132 bp
c. X. borealis: Xbzpc.1e, 109 bp; Xbzpc.1d, 69 bp; Xbzpc.1c, 76 bp; Xbzpc.2, 90 bp; Xbzpc.3, 129 bp

Figure 24. Mixed plasmid control PCR. Primer sets targeting (a) L. laevis, (b) X. laevis, and (c) X. borealis ZPC cDNAs are listed down the left column, and their corresponding amplicon size is listed down the right column. Lane 1 = size ladder; Lane 2 = positive control (primer set and the targeted ZPC template); Lane 3 = mixture of non-targeted ZPC templates; Lane 4 = no-template control.

Primer Efficiency

Although the use of carefully matched annealing temperatures increases the specificity of primer sets for their templates, primers may behave in ways that prevent them from amplifying products in a consistent and expected exponential manner during PCR. Significant deviation from the optimal doubling of product in each cycle would not allow the results to be interpreted in a quantitative manner. Thus, the efficiency of each primer set needs to fall within 90-110% of the expected doubling per cycle to validate the quantitative expression study. To test this, a 10-fold dilution series of the template was performed (specifically over the range of 0.025 ng/µl to 0.000025 ng/µl) and the CT values were compared between dilutions. If the reaction is 100% efficient, the CT values should be separated by 3.322 cycles for each 10-fold dilution, reflecting perfect doubling in each cycle (2^n, where n is the number of cycles; 2^3.322 ≈ 10). It should be noted that these assays were performed by qPCR rather than conventional PCR so that CT values could be determined.

Percent efficiency is determined by plotting the CT values versus the log of the starting template quantity to obtain a linear regression line (Figure 25). The coefficient of determination (R^2) is obtained from the same plot of CT values against the log of the starting template concentration. The R^2 value represents how well the experimental data fit the regression line and is a measure of whether the amplification efficiency is the same across different DNA concentrations. When the data fit the regression line perfectly, the R^2 value equals 1.0. Efficiencies less than 90% usually indicate that template DNA or primers may be forming secondary structures during the PCR reaction and preventing optimal doubling. An efficiency greater than 110% means that the PCR reaction is producing more product than expected, which is usually due to primer dimer products being formed while the target is being amplified.

[Figure 25 plot not reproduced. y-axis: Threshold Cycle (CT); x-axis: Log Starting Quantity (ng); annotation: E = 99.1%, R^2 = 0.996.]

Figure 25. Representative qPCR primer efficiency plot. ZPC and GAPDH primer sets were tested for their ability to efficiently amplify PCR products using the iQ5 Real-Time Detection System (Bio-Rad). The graph displays how efficiency is determined, using primers to L. laevis GAPDH as an example. CTs were plotted against the log concentration (ng) of starting template to determine efficiency (E) and the coefficient of determination (R^2).
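The efficiency and R^2 calculations described above can be illustrated with a minimal sketch. It assumes the commonly used standard-curve relationship E = 10^(-1/slope) - 1, which gives 100% when the slope of CT versus log10(starting quantity) is -3.322; the thesis does not state the exact formula applied by the iQ5 software, so this is an assumption. The function name and the idealized CT values are hypothetical and are not data from this study.

```python
import math


def standard_curve(log10_quantities, ct_values):
    """Least-squares fit of CT against log10(starting quantity).

    Returns (slope, intercept, r_squared, efficiency_percent), where the
    efficiency uses the assumed relationship E = 10^(-1/slope) - 1.
    """
    n = len(log10_quantities)
    mean_x = sum(log10_quantities) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_quantities)
    syy = sum((y - mean_y) ** 2 for y in ct_values)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_quantities, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    efficiency_percent = (10 ** (-1.0 / slope) - 1.0) * 100.0
    return slope, intercept, r_squared, efficiency_percent


# Idealized (hypothetical) 10-fold dilution series over the range used in
# the text; each dilution shifts CT by log2(10) cycles, i.e. perfect doubling.
quantities_ng = (0.025, 0.0025, 0.00025, 0.000025)
log10_qty = [math.log10(q) for q in quantities_ng]
ideal_ct = [20.0 + math.log2(10) * i for i in range(len(quantities_ng))]

slope, intercept, r2, efficiency = standard_curve(log10_qty, ideal_ct)
print(f"slope={slope:.3f}  R^2={r2:.3f}  efficiency={efficiency:.1f}%")
```

With real dilution data, a slope steeper than -3.322 (more negative) gives an efficiency below 100%, and a shallower slope gives an efficiency above 100%.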
Primer dimer products can be detected by performing a melt curve at the end of the qPCR cycle; they appear as an additional melt curve peak with a very low Tm due to their smaller size.

All L. laevis primer sets (ZPC and GAPDH) fell within the acceptable range of 90-110% efficiency (Figure 25; Table 4), with 96.1% being the lowest and 101.5% being the highest. As for X. laevis, all primer sets also fell within the acceptable range (Table 4), with 99.2% being the lowest and 105.3% being the highest (Xlzpc.1b and Xlzpc.3, respectively). Lastly, all X. borealis primer sets fell within the acceptable range (Table 4), with 96.4% being the lowest and 105.7% being the highest (Xbzpc.1e and Xbzpc.2, respectively). These data indicated that neither primer dimers nor secondary structures were interfering with the PCR in any case. Melt curves performed after all reactions were completed showed only a single peak, or PCR product, indicating that the intended region was being amplified with no evidence of primer dimer formation. In addition, the primer efficiency data fit the regression line well for all PCR reactions (R^2 values ranging from 0.974-1.0), indicating that these primers amplify efficiently at each template concentration tested.

Table 4. qPCR primer efficiency summary. ZPC and GAPDH primer sets were tested for their ability to efficiently amplify PCR products using the iQ5 Real-Time Detection System (Bio-Rad). CT values were plotted against the log concentration (ng) of starting template to determine percent efficiency (E) and the coefficient of determination (R^2).

Gene       Percent Efficiency (E)   Coefficient of Determination (R^2)
Llzpc.1     96.1                    0.998
Llzpc.2    100.7                    0.998
Llzpc.3    101.5                    0.998
Llzpc.4     97.0                    0.985
Llzpc.5     98.0                    1.000
Llzpc.6    101.2                    0.996
LlGAPDH     99.1                    0.996

Xlzpc.1a   103.2                    0.974
Xlzpc.1b    99.2                    0.990
Xlzpc.2    100.0                    0.990
Xlzpc.3    105.3                    0.999
XlGAPDH    101.5                    0.995

Xbzpc.1e    96.4                    0.995
Xbzpc.1c   104.7                    0.998
Xbzpc.1d    97.0                    0.990
Xbzpc.2    105.7                    0.988
Xbzpc.3    100.0                    0.995
XbGAPDH     97.9                    0.996

qPCR Assays to Determine Gene Expression

Once primers were deemed specific and efficient, they were used in qPCR assays to determine the expression of ZPC genes with respect to the GAPDH reference gene. Assays were set up by first creating master mixes containing all reaction components, with the goal of minimizing technical variability between trials. Since the optimum primer annealing temperature varied between ZPC cDNAs, each ZPC gene was assayed in separate 96-well plate trials. GAPDH qPCR served as an internal reference control and was included on all ZPC 96-well plate trials in order to normalize ZPC CT values and to control for unequal cDNA loading between different ZPC assays. Within each trial, three separate master mixes were constructed in order to have three independent replicates for each ZPC. The amount of product was determined using the fluorescent dye SYBR Green, which was included in the master mix and is detected when it binds to double-stranded DNA. The CT value was then determined as the cycle number at which the fluorescence was detected above a threshold value (in essence, the baseline).

Upon completion of the qPCR amplification cycle, a melt curve was performed to assess whether or not products of multiple sizes were formed during the assay. The melt curve cycle starts at a low temperature (55.0°C), and the temperature is then slowly increased in 0.5°C increments.
Each amplicon found in a reaction will generate a specific melt curve Tm (larger products generally have a higher Tm). Primer dimers generally have a very low Tm since they are small products (approx. 30-40 bp). The anticipated result for each qPCR is one distinct melt curve peak representing a single amplicon (i.e. ZPC or GAPDH).

Each qPCR trial for a gene was performed in triplicate and thus yielded three CT values, which were averaged to determine the CT value for that particular trial. This was done two additional times to yield three CT values, one for each replicate trial. Since plate-to-plate variability was minimal, the CT values from the three replicates were averaged to determine the CT for that particular gene. This was repeated for each ZPC gene expressed within the ovary to yield an average CT per gene. Plate-to-plate variability was also minimal when measuring GAPDH, so the CT values from each GAPDH replicate were likewise averaged.

The averaged ZPC CT value for each gene was then normalized to the average GAPDH CT value, and relative expression was calculated with the following equation: 2^{Δ(ΔCT)}. First, the ZPC CT value was subtracted from the GAPDH CT value to generate the ΔCT (CT GAPDH - CT ZPC). This was done for each ZPC gene. Once each gene was normalized to GAPDH (ΔCT), the ZPCs were compared to each other to determine relative fold expression. The ΔCT of the lowest expressing gene served as the baseline and was subtracted from the ΔCT of each ZPC to yield the Δ(ΔCT) (ΔCT.ZPC - ΔCT.lowest ZPC). Those values were used in the 2^{Δ(ΔCT)} equation to generate the relative fold expression for each gene. Since the lowest expressed gene serves as the baseline to which all expression data (ΔCTs) are compared, it was designated to have a fold expression of 1.0 (ΔCT.lowest ZPC - ΔCT.lowest ZPC = 0, and 2^0 = 1.0).

L. laevis ZPC qPCR assays

The qPCR expression results for the L. laevis ZPC cDNAs found within one individual's ovary cDNA library revealed unequal expression of the genes (Table 5, Figure 26).
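As a check on the arithmetic of the relative fold expression calculation described above, the following minimal sketch applies it to the final averaged CT values reported in Table 5. The function and variable names are hypothetical. The sketch follows the sign convention given in the text (CT GAPDH minus CT ZPC); the Delta CT column of Table 5 appears to use the opposite sign, which does not change the resulting fold values.

```python
def relative_fold_expression(mean_ct, reference):
    """Relative fold expression of each gene versus the lowest expresser.

    mean_ct   -- dict mapping gene name to its final averaged CT
    reference -- name of the reference gene in mean_ct (here GAPDH)
    """
    ref_ct = mean_ct[reference]
    # Delta CT = CT(reference) - CT(target), as defined in the text.
    delta_ct = {gene: ref_ct - ct
                for gene, ct in mean_ct.items() if gene != reference}
    # The lowest expressing gene has the smallest Delta CT (the baseline).
    baseline = min(delta_ct.values())
    # Fold expression = 2 ** (Delta CT of gene - Delta CT of baseline).
    return {gene: 2 ** (d - baseline) for gene, d in delta_ct.items()}


# Final averaged CT values for L. laevis as reported in Table 5.
table5_mean_ct = {
    "Llzpc.1": 36.91,
    "Llzpc.2": 35.12,
    "Llzpc.3": 35.08,
    "Llzpc.4": 34.55,
    "Llzpc.5": 34.51,
    "Llzpc.6": 37.50,
    "LlGAPDH": 35.15,
}

fold = relative_fold_expression(table5_mean_ct, reference="LlGAPDH")
for gene in sorted(fold):
    print(f"{gene}: {fold[gene]:.2f}")
```

Rounded to two decimal places, the values produced this way agree with the Fold Expression column of Table 5, with Llzpc.6 as the baseline at 1.00.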
Upon conclusion of each qPCR trial, melt curves revealed only one sized product, indicating that CT values corresponded to the targeted amplicons (data not shown). After calculating the ΔCT values for all L. laevis ZPC genes (normalization to GAPDH), Llzpc.6 was found to be the lowest expressed gene, so all ΔCT data were then compared to the Llzpc.6 ΔCT value. The p-value derived from an ANOVA analysis was reported as 0.000 (p < 0.001), indicating that expression differences were highly significant. ZPC expression levels were then categorized into tiers of expression such as high and low based on multiple pairwise comparisons. These categories are useful to qualitatively assess and compare relative ZPC expression profiles across species. Significance was determined using the Bonferroni corrected post hoc test to establish differences in expression level and which genes would be categorized as high or low expressers. The Bonferroni correction is a statistical adjustment for multiple comparisons that decreases the likelihood that significant outcomes occurred by chance. Llzpc.4 and Llzpc.5 showed the highest level of expression at 7.73 and 7.94 fold respectively, followed by Llzpc.2 and Llzpc.3 expressed at 5.21 and 5.35 fold. However, after performing the Bonferroni correction (p-value < 0.0033 to reject the null hypothesis of equality), the expression levels of these genes were not statistically distinct (meaning that 7.94 vs. 5.21 cannot be considered significantly different). The expression levels of Llzpc.1 and Llzpc.6 are also considered to be equal based on a pairwise comparison. However, the expression of Llzpc.1 and Llzpc.6 falls significantly below the level of Llzpc.2, Llzpc.3, Llzpc.4, and Llzpc.5 (meaning the null hypothesis of equality was rejected in this case). Thus, there is evidence for two categories: high producers vs. low producers.

Table 5. L. laevis ZPC and GAPDH CT values. qPCR for each gene was performed in triplicate on a single plate, which yielded three CT values that were averaged (column 2) along with their standard deviation (column 3). Three separate plates were performed for each gene. The CT averages of the three plates were then averaged together (column 4) and their standard deviation calculated (column 5). The final CT averages (column 4) were subsequently used to determine expression level using the 2^{Δ(ΔCT)} equation.

Gene | Plate Avg. CT | Plate St. Dev. | Plate (1-3) Avg. | Plate (1-3) St. Dev. | Delta CT | Delta/Delta CT | Fold Expression
Llzpc.1 | 36.83, 36.99, 36.91 | 1.07, 0.45, 0.12 | 36.91 | 0.08 | 1.76 | 0.59 | 1.51
Llzpc.2 | 35.21, 35.17, 34.98 | 1.07, 1.28, 0.36 | 35.12 | 0.12 | -0.03 | 2.38 | 5.21
Llzpc.3 | 35.36, 34.82, 35.05 | 0.77, 0.68, 0.08 | 35.08 | 0.27 | -0.07 | 2.42 | 5.35
Llzpc.4 | 34.70, 34.71, 34.23 | 0.07, 0.20, 0.13 | 34.55 | 0.27 | -0.60 | 2.95 | 7.73
Llzpc.5 | 34.37, 34.45, 34.72 | 0.31, 1.14, 0.87 | 34.51 | 0.18 | -0.64 | 2.99 | 7.94
Llzpc.6 | 37.11, 37.84, 37.55 | 0.88, 0.10, 0.26 | 37.50 | 0.37 | 2.35 | 0.00 | 1.00
LlGAPDH | 35.16, 35.09, 35.20 | 0.87, 1.28, 1.26 | 35.15 | 0.06 | | |

Figure 26. L. laevis ZPC expression levels using qPCR (bar chart of relative fold expression for Llzpc.1 through Llzpc.6). Averaged CT values for each ZPC gene were normalized to the averaged GAPDH CT and ΔCTs from each gene were compared to the lowest expressing gene (Llzpc.6). Relative fold expression is determined through the 2^{Δ(ΔCT)} equation and is represented as positive expression over the lowest expressed gene (which is designated a fold expression of 1.0).

X. laevis ZPC qPCR assays

X. laevis ZPC genes were also found to be unequally expressed within their ovary as determined by the 2^{Δ(ΔCT)} equation (Table 6, Figure 27). Upon conclusion of each qPCR trial, melt curves revealed only one sized product, indicating that the CT values corresponded to the targeted amplicons (data not shown). The p-value derived from ANOVA analysis was reported as 0.000 (p < 0.001), indicating the expression differences observed are highly significant.
After normalization to GAPDH, Xlzpc.3 was found to be the lowest expressed gene within this individual's ovary, so all GAPDH-normalized ΔCT data were then compared to the ΔCT of Xlzpc.3. Xlzpc.1b showed the highest level of expression at 3.23 fold greater than Xlzpc.3, but was not significantly distinct from Xlzpc.2 (2.69 fold) using the Bonferroni correction (p-value < 0.0083 to reject the null hypothesis). Both Xlzpc.1b and Xlzpc.2 are more highly expressed when compared to Xlzpc.1a (1.17 fold). Xlzpc.1a was expressed 1.17 fold relative to Xlzpc.3, but their expression levels are also considered statistically equal. Similar to L. laevis, X. laevis ZPC genes can be categorized as either high (Xlzpc.1b and Xlzpc.2) or low (Xlzpc.1a and Xlzpc.3) producers.

Table 6. X. laevis ZPC and GAPDH CT values. qPCR for each gene was performed in triplicate on a single plate, which yielded three CT values that were averaged (column 2) along with their standard deviation (column 3). Three separate plates were performed for each gene. The CT averages of the three plates were then averaged together (column 4) and their standard deviation calculated (column 5). The final CT averages (column 4) were subsequently used to determine expression level using the 2^{Δ(ΔCT)} equation.

Gene | Plate Avg. CT | Plate St. Dev. | Plate (1-3) Avg. | Plate (1-3) St. Dev. | Delta CT | Delta/Delta CT | Fold Expression
Xlzpc.1a | 27.17, 27.38, 27.32 | 0.13, 0.29, 0.29 | 27.29 | 0.11 | 1.71 | 0.23 | 1.17
Xlzpc.1b | 25.55, 26.05, 25.90 | 0.10, 0.04, 0.06 | 25.83 | 0.26 | 0.25 | 1.69 | 3.23
Xlzpc.2 | 25.84, 26.18, 26.25 | 0.06, 0.13, 0.29 | 26.09 | 0.22 | 0.51 | 1.43 | 2.69
Xlzpc.3 | 27.67, 27.28, 27.61 | 0.18, 0.12, 0.17 | 27.52 | 0.21 | 1.94 | 0.00 | 1.00
XlGAPDH | 25.50, 25.54, 25.71 | 0.15, 0.13, 0.20 | 25.58 | 0.11 | | |

Figure 27. X. laevis ZPC expression levels using qPCR (bar chart of relative fold expression for Xlzpc.1a, Xlzpc.1b, Xlzpc.2, and Xlzpc.3). Averaged CT values for each ZPC gene were normalized to the averaged GAPDH CT and ΔCTs from each gene were compared to the lowest expressing gene (Xlzpc.3). Relative fold expression is determined through the 2^{Δ(ΔCT)} equation and is represented as positive expression over the lowest expressed gene (which is designated a fold expression of 1.0).

X. borealis ZPC qPCR assays

Comparable to L. laevis and X. laevis, the X. borealis ZPC genes were found to be unequally expressed within the X. borealis ovary as determined by the 2^{Δ(ΔCT)} equation using the average of triplicate trials (Table 7, Figure 28). Upon conclusion of each qPCR trial, melt curves revealed only one sized product, indicating that the CT values corresponded to the targeted amplicons (data not shown). The p-value derived from an ANOVA analysis was reported as 0.000 (p < 0.001), indicating that the expression differences observed are highly significant. After normalization to GAPDH, Xbzpc.1e was found to be the lowest expressed gene within this individual's ovary, so all GAPDH-normalized ΔCT data were then compared to the ΔCT of Xbzpc.1e. This revealed that Xbzpc.1d and Xbzpc.2 had the highest levels of expression at 16.56 and 17.63 fold respectively. Xbzpc.1d and Xbzpc.2 were considered statistically equal based on the Bonferroni correction using a p-value threshold of 0.0050. Xbzpc.1c is expressed at 13.55 fold, making it the third most highly expressed gene; however, pairwise comparisons indicate that its expression is statistically equal to that of Xbzpc.1d and Xbzpc.3 (8.57 fold). Besides being considered statistically equal to Xbzpc.1c, Xbzpc.3 is expressed to a significantly greater degree than Xbzpc.1e, but to a lower degree than Xbzpc.1d and Xbzpc.2. Thus, Xbzpc.3 was classified as moderately expressed due to its intermediary ranking. Based on these statistical pairwise comparisons, X. borealis ZPC gene expression levels are classified as being high (Xbzpc.1d, Xbzpc.1c, Xbzpc.2), moderate (Xbzpc.3) and low (Xbzpc.1e).
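The normalization used throughout these assays can be illustrated with a short calculation. The following is a minimal sketch, not the analysis script used in this study: it takes the plate (1-3) average CT values reported in Table 5 and applies the ΔCT and Δ(ΔCT) definitions given in the text. The function names are illustrative only. The second function assumes that the Bonferroni thresholds quoted above (0.0033, 0.0083, 0.0050) come from dividing an overall alpha of 0.05 by the number of pairwise gene comparisons; that alpha is not stated in the text but is consistent with the reported values. Note that Table 5 lists Delta CT with the opposite sign (CT ZPC - CT GAPDH), which does not change the fold values.

```python
from math import comb

# Plate (1-3) average CT values as reported in Table 5 (L. laevis).
ZPC_CT = {
    "Llzpc.1": 36.91,
    "Llzpc.2": 35.12,
    "Llzpc.3": 35.08,
    "Llzpc.4": 34.55,
    "Llzpc.5": 34.51,
    "Llzpc.6": 37.50,
}
GAPDH_CT = 35.15


def relative_fold_expression(zpc_ct, gapdh_ct):
    """Return 2^(ddCT) for each gene, relative to the lowest expresser.

    dCT  = CT(GAPDH) - CT(ZPC)          (normalization to the reference)
    ddCT = dCT(gene) - dCT(lowest gene)  (comparison to the baseline gene)
    """
    delta_ct = {gene: gapdh_ct - ct for gene, ct in zpc_ct.items()}
    # The lowest expresser has the highest CT, hence the smallest dCT.
    baseline = min(delta_ct.values())
    return {gene: 2 ** (dct - baseline) for gene, dct in delta_ct.items()}


def bonferroni_threshold(n_genes, alpha=0.05):
    """Per-comparison threshold for all pairwise comparisons of n_genes.

    alpha = 0.05 is an assumption; it reproduces the thresholds in the text.
    """
    return alpha / comb(n_genes, 2)


FOLD = relative_fold_expression(ZPC_CT, GAPDH_CT)

if __name__ == "__main__":
    for gene, fold in FOLD.items():
        print(f"{gene}: {fold:.2f} fold")
    for n in (6, 4, 5):  # L. laevis, X. laevis, X. borealis gene counts
        print(f"{n} genes: reject equality if p < {bonferroni_threshold(n):.4f}")
```

Substituting the averaged CT values from Table 6 or Table 7 should give the X. laevis and X. borealis fold values in the same way, apart from small rounding differences.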
Table 7. X. borealis ZPC and GAPDH CT values. qPCR for each gene was performed in triplicate on a single plate, which yielded three CT values that were averaged (column 2) along with their standard deviation (column 3). Three separate plates were performed for each gene. The CT averages of the three plates were then averaged together (column 4) and their standard deviation calculated (column 5). The final CT averages (column 4) were subsequently used to determine expression level using the 2^{Δ(ΔCT)} equation.

Gene | Plate Avg. CT | Plate St. Dev. | Plate (1-3) Avg. | Plate (1-3) St. Dev. | Delta CT | Delta/Delta CT | Fold Expression
Xbzpc.1e | 24.93, 24.83, 24.78 | 0.29, 0.53, 0.46 | 24.85 | 0.08 | 6.10 | 0.00 | 1.00
Xbzpc.1d | 20.83, 20.77, 20.78 | 0.06, 0.13, 0.05 | 20.79 | 0.03 | 2.05 | 4.05 | 16.56
Xbzpc.1c | 20.94, 21.09, 21.24 | 0.07, 0.05, 0.16 | 21.09 | 0.15 | 2.34 | 3.76 | 13.55
Xbzpc.2 | 20.47, 21.26, 20.39 | 0.12, 0.07, 0.08 | 20.71 | 0.48 | 1.96 | 4.14 | 17.63
Xbzpc.3 | 21.85, 21.82, 21.58 | 0.06, 0.05, 0.07 | 21.75 | 0.15 | 3.00 | 3.10 | 8.57
XbGAPDH | 18.77, 18.79, 18.68 | 0.19, 0.16, 0.34 | 18.75 | 0.06 | | |

Figure 28. X. borealis ZPC expression levels using qPCR (bar chart of relative gene expression for Xbzpc.1e, Xbzpc.1d, Xbzpc.1c, Xbzpc.2, and Xbzpc.3). Averaged CT values for each ZPC gene were normalized to the averaged GAPDH CT and ΔCTs from each gene were compared to the lowest expressing gene (Xbzpc.1e). Relative fold expression is determined through the 2^{Δ(ΔCT)} equation and is represented as positive expression over the lowest expressed gene (which is designated a fold expression of 1.0).

ZPC Phylogenetic Analysis

Once expression profiles of ZPC genes were determined by qPCR, the evolutionary relationships between species were assessed through the construction of phylogenetic trees. Gene trees diagram possible evolutionary paths genes have taken and reconstruct relationships such as being orthologous genes (diverged after a speciation event) or paralogous genes (diverged after a duplication event within the same species) during their descent from a common ancestor.
To evaluate the relationships of ZPC genes, maximum likelihood and neighbor-joining phylogenetic trees were generated using the program MEGA and ZPC protein sequences. Maximum likelihood methods use a statistical approach to estimate the probability that a particular evolutionary model and tree would generate the observed sequences. The tree with the highest probability is then returned, revealing how the proteins in question are most likely related. Distance-based or neighbor-joining trees are created by calculating the pairwise similarity of the sequences, which is then converted into distance measurements of amino acid substitutions per site. In both cases, branch length corresponds to the level of divergence between two proteins, with longer branches suggesting greater divergence. The data were resampled 1000 times (1000 bootstrap replicates) for both tree types to generate a consensus tree (the most common branching pattern). The robustness of the branching pattern is indicated by bootstrap values located at the branch node (values greater than 95% indicate high confidence that this is the correct branching pattern). Based upon the levels of expression and statistical significance, ZPC genes were previously qualitatively classified as being high, moderate, or low expressers within each individual's ovary. Those designations were carried over to the phylogenetic trees to hypothesize common functions of ZPC proteins based on relative gene expression and evolutionary relationships.

X. laevis and X. borealis ZPC Evolutionary Relationship

The generation of maximum likelihood (Figure 29) and neighbor-joining (Figure 30) phylogenetic trees suggests an orthologous relationship for a subset of the genes that are expressed to a similar degree. Two of the most highly expressed ZPC genes from the ovaries of X. laevis (Xlzpc.2) and X. borealis (Xbzpc.2) cluster together, suggesting an orthologous relationship.
A similar pattern is also observed between Xlzpc.3 and Xbzpc.3, which cluster together in both tree types, indicating orthology. Additionally, these zpc.3 genes show a high degree of divergence from the remaining ZPCs, as illustrated by the extended branch length connecting this cluster. In contrast to the high expression of the Xlzpc.2 and Xbzpc.2 cluster, neither Xlzpc.3 nor Xbzpc.3 is classified as a highly expressed gene. Orthologous relationships cannot be determined between the remaining ZPC genes from X. laevis (Xlzpc.1a and Xlzpc.1b) and X. borealis (Xbzpc.1e, Xbzpc.1d, and Xbzpc.1c). Each set of genes clusters together, suggesting the similarity is greater within each species than across species. These could either be gene duplication events in each lineage or alternatively allele variants (same gene location but different mutations of the gene). However, it is to be noted that within each cluster there are variable levels of expression. The cluster of Xbzpc.1e, Xbzpc.1d, and Xbzpc.1c possesses both high and low expressing genes, which is similar to the X. laevis cluster where Xlzpc.1a is a low expresser and Xlzpc.1b is expressed to a significantly greater degree.

Figure 29. ZPC maximum likelihood phylogenetic tree (scale bar: 0.1). A consensus tree of L. laevis, X. laevis, and X. borealis ZPC protein sequences (ZP domain + sperm binding regions) was constructed with MEGA version 4. Results of 1000 bootstrap replicates are indicated on each branch. Relative levels of expression determined from qPCR are located next to each gene. Branch length indicates the genetic distance between genes.

Figure 30. ZPC neighbor-joining phylogenetic tree (scale bar: 0.2). A consensus tree of L. laevis, X. laevis, and X. borealis ZPC protein sequences (ZP domain + sperm binding regions) was constructed with MEGA version 4. Results of 1000 bootstrap replicates are indicated on each branch. Relative levels of expression determined from qPCR are located next to each gene. Branch length indicates the genetic distance between genes.

Xenopus and Lepidobatrachus ZPC Evolutionary Relationships

Given the evolutionary distance between Lepidobatrachus and Xenopus (diverged 110-120 mya), L. laevis ZPC genes are most closely related to other L. laevis ZPC genes when compared to X. laevis and X. borealis ZPCs (Figures 29 and 30). However, Llzpc.4 is most closely related to Xenopus ZPC genes, which may suggest that it is a common ancestral gene. Furthermore, Llzpc.4 is expressed relatively highly in the L. laevis ovary. The rest of the ZPC genes have mixtures of high and low expressing genes (4 of the 6 are high expressers in total).

Chapter 4

DISCUSSION

The discovery of multiple ZPC genes present within the genomes of L. laevis, X. laevis, and X. borealis prompted the current research to determine ZPC gene expression profiles and whether or not they are equally expressed in each species so as to further formulate hypotheses as to their function. Relative ZPC expression levels were assessed through qPCR after normalization to the housekeeping gene GAPDH and comparison to each other. This revealed that ZPC genes were differentially expressed in each individual with respect to mRNA levels. This finding was consistent with 2-D gel immunoblotting data from L.
laevis vitelline envelopes showing that the ZPC antiserum recognized approximately 6 different products that varied in their intensity levels (see Figure 5 from Introduction). Although mRNA levels do not always correspond directly to the amount of protein product that is translated, closely related genes such as those of the ZPC gene family are likely to be translated in a similar manner, thus giving credence to the assumption that the qPCR data are indicative of their protein expression levels. Furthermore, when comparing the differences in expression, multiple genes were predominant in the expression profile while others had a significantly reduced expression. These expression profiles may provide insight as to the role played by each translated protein in the fertilization process. As outlined in the introduction, three alternative scenarios were presented as to the potential expression levels: 1) a single ZPC gene is predominantly expressed whereas all others are minimal, 2) all ZPC genes are equally expressed, and 3) several ZPC genes are highly expressed relative to the others. The results for all three species were consistent with scenario 3. The interpretation of these data is necessarily speculative since there are no functional data for these different ZPC gene products to inform it. However, it is likely that ZPC genes that are predominantly expressed have high potential for serving as sperm-binding proteins, whereas the lesser expressed genes serve an alternative role such as a structural function in the envelope, potentially even a noncritical role since there are several other ZP proteins in the envelope that are known to be important structurally (e.g. ZPA and ZPB). This notion of a diminished or alternative role for duplicated genes with reduced expression levels is not new. For example, in the zebrafish, the homeobox-1 gene duplicated to create the homeobox-2 gene (mbx1 and mbx2, respectively).
Mbx1 is highly expressed in the forebrain, midbrain, and cerebellum early in neural development and functions as a transcription factor that gives rise to the mesencephalon leading to the formation of the midbrain. Although the mbx2 gene shows a high level of sequence identity, the reduced expression (expressed mainly in the midbrain) of mbx2 during neural development revealed that its functional role had changed [46]. Evidence from gene knockdown experiments indicates that, of the two duplicates, mbx1 maintains the more critical role in neural development; however, mbx2 is required for the prolonged growth of the retina during embryonic development [57]. Thus, the responsibilities of the mbx2 gene compared to mbx1 are reduced, which may be a similar model to the diminished function of low expressing ZPC genes.

It is very possible that the highly expressed ZPC genes in these frog species compete for sperm-binding activity in the egg envelope. As mentioned in the introduction, sperm competition and sexual conflict may be forces driving the evolution of reproductive genes. Since frog ZPC genes appear to be rapidly evolving in the sperm-binding region, complementary receptors on sperm that can bind to these ZPCs will be selected during fertilization and thus co-evolve. However, this competition is thought to be driven by the female reproductive genes as suggested by the research on mammalian ZPC, sea urchin bindin, and abalone lysin genes [45,33,34,35,36]. Thus, the male reproductive proteins such as sperm binding receptors are in competition for the functional sperm binding ZPCs. Sperm competition is thought to be beneficial since this would cause a decrease in the number of sperm that can actually penetrate and fuse with the egg at one time and help prevent polyspermic fertilization. By keeping sperm one step behind in the evolutionary race to fertilization, females are able to slow the process and minimize the deleterious effects of polyspermy.
The lock and key interactions between ZPCs and sperm receptors would then be subject to relative affinity differences in their binding whereby the highest affinity interactions will trigger the acrosome reaction for subsequent sperm penetration through the envelope. The presence of multiple sperm receptors would enhance the competition and diversity of interactions. In addition, having back-up systems or alternative pathways for sperm-binding could help to ensure that fertilization does occur. For example, alternative pathways do exist in many signal transduction pathways, as has been observed when one gene is knocked out in a pathway and then a second similar pathway (usually involving gene family members) takes over and accomplishes the same goal.

As for the actual data, in L. laevis it was found that the predominantly expressed ZPC genes were Llzpc.4 and Llzpc.5 (7.73 and 7.94 fold higher than the lowest, Llzpc.6), but these could not be distinguished statistically from Llzpc.2 and Llzpc.3, which were mid-level in expression (5.21 and 5.35 fold higher than Llzpc.6); thus they were grouped together as high expressers. This was the case since the more stringent Bonferroni correction was applied when determining significant expression differences between each expressed gene. Since Llzpc.2 and Llzpc.3 were expressed in significantly greater amounts when compared to Llzpc.1 and Llzpc.6 (low expressers), they were classified with Llzpc.4 and Llzpc.5 as having high expression. However, if a slightly more relaxed statistical method had been used, Llzpc.2 and Llzpc.3 would have likely received a moderate or mid-level expression classification. Following the reasoning above, it is likely that these four genes function in sperm-binding whereas the minimally expressed genes (Llzpc.1 and Llzpc.6) are either structural or non-functional neutral components of the envelope. For X.
laevis, Xlzpc.1b and Xlzpc.2 were found to be expressed at relatively high levels as compared to the lower expressing genes Xlzpc.1a and Xlzpc.3; and for X. borealis, Xbzpc.1d, Xbzpc.1c, and Xbzpc.2 were found to be highly expressed as compared to the lower expressing Xbzpc.1e and Xbzpc.3 genes. Thus, it seems that between 50% and 66% of the ZPC genes found in the frog ovary cDNA libraries are expressed at high levels. Future studies will need to be performed to assess whether these high expressing genes actually do bind sperm through binding assays using purified preparations.

As for the comparison of Xenopus ZPC genes, it was mentioned in the introduction that Xenopus experienced a whole genome duplication about 30 mya, creating multiple gene locations in the genomes of both X. laevis and X. borealis; furthermore, these species diverged about 10 mya. Thus, it should be possible to determine the relationships of the Xenopus ZPC genes to each other with respect to designating them as orthologous (separated by speciation) versus paralogous (separated by gene duplication). Based on maximum likelihood and neighbor-joining phylogenetic trees, a subset of Xenopus ZPC genes can be classified as being orthologous. The gene groupings of zpc.2 and zpc.3 from both X. laevis and X. borealis can be classified as being orthologous genes. Furthermore, these orthologous genes show relatively similar levels of expression in each individual's ovary. For instance, Xlzpc.2 and Xbzpc.2 both received the high classification while Xlzpc.3 and Xbzpc.3 were categorized as lower expressers. Since the speculation is that high expressing ZPCs (Xlzpc.2 and Xbzpc.2) function in sperm-binding whereas alternative/loss of function is the likely fate of genes with minimal expression (Xlzpc.3 and Xbzpc.3), these orthologs may share a common function within the vitelline envelope. The remaining ZPC genes expressed in X. laevis (Xlzpc.1a and Xlzpc.1b) and X.
borealis (Xbzpc.1e, Xbzpc.1d, and Xbzpc.1c) appear to be related because they cluster together to form a clade within both phylogenetic trees; however, their orthologous relationships are indistinguishable. The genes within each species cluster together, indicating they are more closely related to each other than to those of the other species. The existence of paralogs may have been a result of independent gene duplications after the split from a common ancestor, which would account for the variable numbers of ZPC genes in the two species. Alternatively, the extra gene copies may be allele variants due to the fact that Xenopus experienced a whole genome duplication event creating tetraploid organisms (4 allele variants possible at a particular orthologous locus). Furthermore, within each cluster there are ZPC genes that received both the high and low classification. Since the high expressers likely play an active role in sperm-binding, they may have duplicated to give rise to genes that have lost function or perform an alternative function. Based on the earlier observation of similar expression in orthologs, it is possible that genes with similar expression levels within these groups have an orthologous relationship; however, this cannot be definitively determined at this point.

Orthology could not be determined for L. laevis ZPC genes, which may be due to the considerable evolutionary distance between Xenopus and Lepidobatrachus (diverged 110-120 mya). However, Llzpc.4 is evolutionarily most similar to Xenopus ZPCs, suggesting that it is the ancestral gene since the remaining gene copies are highly divergent. Numerous independent duplications may have been the fate of the ancestral copy after the split from Xenopus during evolution, explaining the six expressed ZPC genes observed in the ovary of L. laevis today. Alternatively, since these species are distantly related and there is evidence of rapid evolution (particularly in the 3' region) [28], L.
laevis ZPC genes may have accumulated many mutations which obscure their phylogenetic relationships with Xenopus and prohibit orthologous classifications (if they exist).

Interestingly, X. laevis was found to have 4 ZPC genes whereas X. borealis had 5 ZPC genes expressed in their respective single individual ovary cDNA libraries. There are several possible reasons for this disparity. The first possibility is that the fifth gene is actually an allelic form of an orthologous gene in X. borealis within the zpc.1 clade (found only in this species), as posed earlier. Another possibility is that X. borealis had an independent ZPC gene duplication that occurred after the split with X. laevis, providing a paralogous gene unique to its lineage. The ZPC gene family seems to be prone to high levels of gene birth and death, as in the MHC gene family. This notion of independent duplications after an evolutionary split is common and may also explain the existence of the six observed ZPC genes in L. laevis. Alternatively, X. laevis may have initially had this fifth gene that is related to the zpc.1 group, but it could have been deleted from the genome or turned into a pseudogene. Other studies have indicated that such a case has happened in other duplicated genes [50,18]. On the other hand, X. laevis may indeed have a fifth gene that was not detected during the PCR amplification and sequencing of the ZPC cDNAs from its respective library, likely due to being expressed in low amounts. This duplicated paralog may have acquired mutations in its regulatory region (i.e. promoters/enhancers) that diminished its expression, thereby reducing the chances of its inclusion in this study since ZPC genes were identified by random sequencing of PCR amplified products from an ovary cDNA library.

Even though ZPC genes were classified as being high or low expressers, the degree of expression is variable between species.
As explained earlier, the fold expression of each ZPC gene was compared to the lowest expresser in each species using their CT values. It should be noted that the CT value of the Xbzpc.1e transcript within the X. borealis cDNA library was extremely high (meaning very low amounts), and thus the remaining genes were shown to have much more expression when compared to it (i.e. Xbzpc.2 was expressed 17.63 fold more than Xbzpc.1e). When Xbzpc.1e was removed and the data were re-analyzed for expression levels, the profiles for X. laevis and X. borealis ZPC genes were very similar (i.e. two high expressers in the range of being 2-3 fold higher than the two low expressers). However, further studies will have to be performed to determine which one of these explanations is correct.

Although all the ZPC genes examined in this study were assumed to have been expressed from the oocyte itself, it is also possible that some of the expressed ZPCs were derived from the follicle cells that surround and nurture the growing oocyte. Studies in mice and X. laevis have demonstrated that the oocyte does indeed express ZPC genes, but it has not been ruled out that the follicle cells contribute ZPC gene products to the zona pellucida/vitelline envelope. In particular, immunolabeling in mice has shown ZPC protein present within the oocyte's Golgi apparatus, secretory vesicles, and vesicular aggregates near the membrane surface [22]. Transcription factors implicated in affecting ZP glycoprotein expression have been found to be expressed by oocytes and, to a lesser degree, in surrounding follicle cells during folliculogenesis [58,59]. However, there have been no exhaustive studies to determine whether follicle cells could be expressing ZPC genes and secreting them for incorporation into the growing matrix.
This could very much impact the paradigm of how the egg envelope structure is synthesized since the important location for the sperm-binding ZPC glycoprotein is on the most exterior surface of the envelope and follicle cells are located at the outer edge of the egg. This could be examined by in situ hybridization experiments or by separating follicle cells from oocytes to examine their expression by PCR.

Another interesting nuance to consider is that the ovarian cDNA library constructed for each of the anuran species used in this research was generated from oocytes at all stages of development. ZPC expression begins during stage I of oocyte development and continues until the egg reaches maturity just before ovulation (stage 5). The differential ZPC gene expression observed in L. laevis, X. laevis, and X. borealis may be due to the developmental timing of when the ovaries were collected. For example, more mature oocytes may have a completely different profile of ZPC expression than the early stage oocytes. Temporal changes in expression of each gene copy could account for the concentration differences observed for each transcript. However, there is no evidence to indicate that hormonal induction of oogenesis in frogs causes a bias toward one stage of egg development since all stages can be observed in the ovary by inspection. Developmental expression differences could be examined by removing oocytes from ovaries and performing quantitative PCR experiments on cDNA libraries generated from oocyte populations from each of the 5 stages of development.

In summary, the current research supports the hypothesis that ZPC genes discovered in L. laevis, X. laevis, and X. borealis are unequally expressed within their respective ovaries and that orthologs have a similar pattern of expression. Quantitative assessment by qPCR indicated multiple genes were predominantly expressed while remaining copies were lower expressers.
The most likely scenario is that the highly expressing ZPC genes function as sperm-binding molecules whereas the lesser expressed genes have altered their function. This provides a testable hypothesis for further study in which the different ZPC glycoproteins will need to be purified and tested in sperm binding assays. These functional studies will be essential to further clarify the phenomenon of the expression of multiple ZPC genes in different amounts within the ovary of these frog species.

REFERENCES

1. Bhasin, S., DM De Kretser, HWG Baker. Pathophysiology and natural history of male infertility. Journal of Clinical Endocrinology and Metabolism. 79 (1994): 1525-1529.
2. Swanson, Willie J., and Victor D. Vacquier. The rapid evolution of reproductive proteins. Nature Reviews Genetics. 3 (2002): 137-144.
3. Wassarman, Paul M., Luca Jovine, and Eveline S. Litscher. A profile of fertilization in mammals. Nature Cell Biology. 3 (2001): 59-64.
4. Wassarman, Paul M. Mammalian fertilization: molecular aspects of gamete adhesion, exocytosis, and fusion. Cell. 96 (1999): 175-183.
5. Gilbert, Scott F. Developmental Biology, fourth edition. Sinauer Associates, Inc. 1994.
6. Grey, Robert D., Michael J. Bastiani, Dennis J. Webb, Eric R. Schertel. An electrochemical block is required to prevent polyspermy in eggs fertilized by natural mating Xenopus laevis. Developmental Biology. 89 (1982): 475-484.
7. Glahn, David and Richard Nuccitelli. Voltage-clamp study of the activation currents and fast block to polyspermy in the egg of Xenopus laevis. Development, Growth & Differentiation. 45 (2003): 187-197.
8. Wassarman, Paul M. Sperm receptors and fertilization in mammals. The Mount Sinai Journal of Medicine. 69 (2002): 148-155.
9. Hunter, R.H.F. Sperm-egg interactions in the pig: monospermy, extensive polyspermy, and the formation of chromatin aggregates. Journal of Anatomy. 122 (1976): 43-59.
10. Vazquez, Monica H., David M. Phillips, Paul M. Wassarman. Interaction of mouse sperm with purified sperm receptors covalently linked to silica beads. Journal of Cell Science. 92 (1989): 713-722.
11. Bleil, Jeffrey D., Paul M. Wassarman. Mammalian sperm-egg interaction: identification of a glycoprotein in mouse egg zonae pellucidae possessing receptor activity for sperm. Cell. 20 (1980): 873-882.
12. Bleil, Jeffrey D., Paul M. Wassarman. Sperm-egg interactions in the mouse: sequence of events and induction of the acrosome reaction by a zona pellucida glycoprotein. Developmental Biology. 95 (1983): 317-324.
13. Bansal, Pankaj, Kausiki Chakrabarti, Satish K. Gupta. Functional activity of human ZP3 primary sperm receptor resides toward its C-terminus. Biology of Reproduction. 81 (2009): 7-15.
14. Yang, Joy C. and Jerry L. Hedrick. cDNA cloning and sequence analysis of the Xenopus laevis egg envelope glycoprotein gp43. Development, Growth & Differentiation. 39 (1997): 457-467.
15. Vo, Loc H. and Jerry L. Hedrick. Independent and hetero-oligomeric-dependent sperm binding to egg envelope glycoprotein ZPC in Xenopus laevis. Biology of Reproduction. 62 (2000): 766-774.
16. Vo, Loc H., Ten-Yang Yen, Bruce A. Macher, and Jerry L. Hedrick. Identification of the ZPC oligosaccharide ligand involved in sperm binding and the glycan structure of Xenopus laevis vitelline envelope glycoproteins. Biology of Reproduction. 69 (2003): 1822-1830.
17. Liu, Xingjun, Hai Wang, Zhiyuan Gong. Tandem-repeated zebrafish zp3 genes possess oocyte-specific promoters and are insensitive to estrogen induction. Biology of Reproduction. 74 (2006): 1016-1025.
18. Conner, S.J. and D.C. Hughes. Analysis of fish ZP1/ZPB homologous genes: evidence for both genome duplication and species-specific amplification models of evolution. Reproduction. 126 (2003): 347-352.
19. Wassarman, Paul M. Profile of a mammalian sperm receptor. Development. 108 (1990).
20. Jovine, Luca, Huayu Qi, Zev Williams, Eveline S. Litscher, Paul M. Wassarman. A duplicated motif controls assembly of zona pellucida domain proteins. Proceedings of the National Academy of Sciences. 101 (2004): 5922-5927.
21. Williams, Zev and Paul M. Wassarman. Secretion of mouse ZP3, the sperm receptor, requires cleavage of its polypeptide at a consensus furin cleavage site. Biochemistry. 40 (2001): 929-937.
22. El-Mestrah, Majid, Phillip E. Castle, Girum Borossa, Frederick WK Kan. Subcellular distribution of ZP1, ZP2, and ZP3 glycoproteins during folliculogenesis and demonstration of their topographical disposition within the zona matrix of mouse ovarian oocytes. Biology of Reproduction. 66 (2002): 866-876.
23. Sherblom, Anne P., Jean M. Decker, Andrew V. Muchmore. The lectin-like interaction between recombinant tumor necrosis factor and uromodulin. The Journal of Biological Chemistry. 263 (1988): 5418-5424.
24. Jovine, Luca, Huayu Qi, Zev Williams, Eveline Litscher, Paul M. Wassarman. The ZP domain is a conserved module for polymerization of extracellular proteins. Nature Cell Biology. 4 (2002): 457-461.
25. Monne, Magnus, Ling Han, Thomas Schwend, Sofia Burendahl, Luca Jovine. Crystal structure of the ZP-N domain of ZP3 reveals the core fold of animal egg coats. Nature. 456 (2008): 653-659.
26. Florman, Harvey M. and Paul M. Wassarman. O-linked oligosaccharides of mouse egg ZP3 account for its sperm receptor activity. Cell. 41 (1985): 313-324.
27. Vo, Loc H. and Jerry L. Hedrick. Independent and hetero-oligomeric-dependent sperm binding to egg envelope glycoprotein ZPC in Xenopus laevis. Biology of Reproduction. 62 (2000): 766-774.
28. Swanson, Willie J., Ziheng Yang, Mariana F. Wolfner, Charles F. Aquadro. Positive Darwinian selection drives the evolution of several female reproductive proteins in mammals. Proceedings of the National Academy of Sciences. 98 (2001): 2509-2514.
29. Hughes, Austin L. and Masatoshi Nei. Pattern of nucleotide substitution at major histocompatibility complex class I loci reveals overdominant selection.
Nature. 335 (1988): 167-170. 30. Piontkivska, Helen and Masatoshi Nei. Birth-and-death evolution in primate MHC class I genes: divergence time estimates. Molecular Biology and Evolution. 20 (2003): 601-609. 31. Foltz, Kathleen R., Jacqueline S. Partin, William J. Lennarz. Sea urchin egg receptor for sperm: sequence similarity of binding domain and hsp70. Science. 259 (1993): 1421-1425. 32. Vacquier, Victor D. and Gary W. Moy. Isolation of bindin: the protein responsible for adhesion of sperm sea urchin eggs. Proceedings of the National Academy of Sciences. 74 (1977): 2456-2460. 33. Metz, Edward C. and Stephen R. Palumbi. Positive selection and sequence rearrangement generate extensive polymorphism in the gamete recognition protein bindin. Molecular Biology and Evolution. 12 (1996): 397-406. 108 34. Aagaard Jan E., Xianhua Yi, Michael J. MacCoss, Willie J. Swanson. Rapidly evolving zona pellucida domain proteins are a major component of the vitelline envelope of abalone eggs. Proceedings of the National Academy of Sciences. 103 (2006): 17302-17307. 35. Galino, Blanca E., Victor D. Vacquier, Willie J. Swanson. Positive selection in the egg receptor for abalone sperm lysine. Proceedings of the National Academy of Sciences. 100 (2003): 4639-4643. 36. Lee, Youn-Ho, Tatsuya Ota, Victor D. Vacquier. Positive selection is a general phenomeon in the evolution of abalone sperm lysin. Molecular Biology and Evolution. 12 (1995): 231-238. 37. Bakhtiari-Nejad, Faezeh. Are sperm-binding proteins among two closely related frog species rapidly diversifying? California State University, Sacramento. Thesis. (2009). 38. Kanamori, Akira, Kiyoshi Naruse, Hiroshi Mitana, Akihiro Shima, Hiroshi Hori. Genomic organization of the ZP domain containing egg envelope genes in madaka (Oryzias latipes). Gene. 305 (2003): 34-45. 39. Amanze, Diala and Arati Iyengar. The micropyle: a sperm guidance system in teleost fertilization. Development. 109 (1990): 495-500. 40. 
Goudet, Ghylene, Sylvie Mugnier, Isabelle Callebaut, Phillippe Monget. Phylogenetic analysis and identification of pseudogenes reveal a progressive loss of zona pellucida genes during evolution of vertebrates. Biology of Reproduction. 78 (2008): 796-806. 41. Sinowatz, F., S. Kolle, E. Topfer-Peterson. Biosynthesis and expression of zona pellucida glycoprotiens in mammals. Cells Tissues Organs. 168 (2001): 24-35. 42. Nei, Masatoshi, Xun Gu, Tatyana Sitnikova. Evolution by the birth-and-death process in multigene families of vertebrate immune system. Proceedings of the National Academy of Sciences. 94 (1997): 7799-7806. 43. Ota, Tatsuya and Masatoshi Nei. Divergent evolution by the birth-and-death process in the immunoglobulin VH gene family. Molecular Biology and Evolution. 11 (1994): 469-482. 44. Tian, Xin, Geraldine Pascal, Sophie Fouchecourt, Pierre Pontarotti, Philippe Monget. Gene birth, death, divergence: the different scenarios of reproduction-related gene evolution. Biology of Reproduction. 80 (2009): 616-621. 109 45. Gavrilets, Sergey. Rapid evolution of reproductive barriers driven by sexual conflict. Nature. 403 (2000): 886-889. 46. Chang, Lou, Brian Khoo, Loksum Wong, Vincent Tropepe. Genomic sequence and spatiotemporal expression comparison of zebrafish mbx1 and its paralog, mbx2. Development Genes and Evolution. 216 (2006): 647-654. 47. Bisbee, Chester A., Margaret A. Baker, A.C. Wilson. Albumin phylogeny for clawed frogs (Xenopus). Science. 195 (1977): 785-787. 48. Wallace, Donald G., Linda R. Maxson, Allan C. Wilson. Albumin evolution in frogs: a test of the evolutionary clock hypothesis. Proceedings of the National Academy of Sciences 68 (1971): 3127-3129. 49. Evans, Ben J., Darcy B. Kelley, Richard C. Tinsley, Don J. Melnick, David C. Cannatella. A mitochondrial DNA phylogeny of African clawed frogs: phylogeography and implications for polyploidy evolution. Molecular Phylogenetics and Evolution. 33 (2004): 197-213. 50. 
Chain, Frederic JJ, Dora Ilieva, Ben J. Evans. Duplicate gene evolution and expression in the wake of vertebrate allopolyploidization. BMC Evolutionary Biology. 8 (2008). 51. Ruvinsky, Ilya and Linda R. Maxson. Phylogenetic relationship among Bufonoid frogs (Anura: Neobatrachia) inferred from mitochondrial DNA sequences. Molecular Phylogenetics and Evolution. 5 (1996): 533-547. 52. Kubista, Mikael, Jose Manuel Andrade, Martin Bengtsson, Amin Forootan, Jiri Jonak, Kirstina Lind, Radek Sindelka, Robert Sjoback, Bjorn Sjogreen, Linda Strombom, Anders Stahlberg, Neven Zoric. The real-time polymerase chain reaction. Molecular Aspects of Medicine. 27 (2006): 95-125. 53. Sindelka, Radek, Zoltan Ferjentsik, Jiri Jonak. Developmental expression profiles of Xenopus laevis reference genes. Developmental Dynamics. 235 (2006): 754-758. 54. Logan, Julie, Kirsten Edwards, Nick Saunders. Real-time PCR. Caister Academic Press, Norfolk, UK. (2009). 55. Tamura, K., J. Dudley, M. Nei, S. Kumar MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Molecular Biology and Evolution. 24 (2007): 1596-1599. 110 56. Nei, M. and S. Kumar. Molecular Evolution and Phylogenetics. Oxford University Press, New York. (2000). 57. Wong, Loksum, Cameron J Weadick, Claire Kuo, Belinda Chang, Vincent Tropepe. Duplicate dmbx1 genes regulate progenitor cell cycle and differentiation during Zebrafish midbrain and retinal development. BMC Developmental Biology. 10 (2010). 58. Huntriss J., R. Gosden, M. Hinkins, B. Oliver, D. Miller, AJ Rutherford, HM Picton. Isolation, characterization and expression of the human Factor In the Germline alpha (FIGLA) gene in ovarian follicles and oocytes. Molecular Human Reproduction. 8 (2002): 1087-1095. 59. Tormala, Reeta-Marie, Minna Jaakelainen, Jouni Lakkakorpi, Annikki Liakka, Juha S. Tapanainen, Tommi E. Vaskivuo. Zona pellucida components are present in human fetal ovary before follicle formation. Molecular and Cellular Endocrinology. 
289 (2008): 10-15.
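As a rough illustration of the stage-by-stage qPCR comparison proposed in the discussion, the following minimal sketch shows one way transcript levels could be compared across oocyte stages. It assumes a comparative Ct calculation (2^-ΔCt against a reference gene) with ideal doubling per cycle, which may not be the quantification method used in this thesis. The gene labels (`ref`, `ZPC_copy1`, `ZPC_copy2`), the function names, and all Ct values are hypothetical placeholders invented for the example and are not experimental data.

```python
from typing import Dict


def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Return 2^-(Ct_target - Ct_reference).

    Assumes ideal amplification (product doubles every cycle).
    """
    return 2.0 ** -(ct_target - ct_reference)


def stage_profile(
    ct_by_stage: Dict[str, Dict[str, float]], reference_gene: str
) -> Dict[str, Dict[str, float]]:
    """Normalize every target gene to the reference gene within each stage."""
    profile: Dict[str, Dict[str, float]] = {}
    for stage, cts in ct_by_stage.items():
        ref_ct = cts[reference_gene]
        profile[stage] = {
            gene: relative_expression(ct, ref_ct)
            for gene, ct in cts.items()
            if gene != reference_gene
        }
    return profile


# Hypothetical Ct values, invented purely for illustration (not real data).
example_ct = {
    "stage I": {"ref": 20.0, "ZPC_copy1": 22.0, "ZPC_copy2": 25.0},
    "stage V": {"ref": 20.0, "ZPC_copy1": 21.0, "ZPC_copy2": 26.0},
}

profile = stage_profile(example_ct, reference_gene="ref")
for stage, genes in profile.items():
    for gene, value in genes.items():
        print(f"{stage} {gene}: {value:.4f}")
```

In such a comparison, a gene copy whose normalized value changes markedly between stages would point to temporal regulation, whereas a stable ratio between copies would suggest that the pooled-ovary measurements reflect expression at every stage.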